Movement learning in robotics

Matteo Curci
15 min read · Mar 19, 2021


Robotics today draws more and more inspiration from human behaviour. If we want robots to interact with the real world, we have to look at the smartest creature we know: us. Thanks to research across several sciences, we can now try to reproduce, although with limitations, many human abilities in a robot. The results of this research can be seen in developmental robotics, which merges robotics, biology and neuroscience. The best-known example of developmental robotics is iCub [11], a humanoid robot used for research into human cognition and artificial intelligence. The iCub has demonstrated that it can successfully perform human-like tasks: crawling, visual processing, progressive language learning, manipulation and so on. One of the most debated topics in this field is movement learning. Efficient movement planning is already a complex problem for typical low-dimensional industrial robots, which usually have three to six degrees of freedom (DOFs). Optimal planning in 30 to 50 DOF systems like iCub, with uncertain geometric and dynamic models, is much harder, especially if we aim at real-time performance.

In everyday life we have to move in a changing environment. Despite these variations we achieve our behavioural goals easily, thanks to motor learning. Motor learning can be defined as a set of processes that, through practice and experience, produce a relatively permanent change in performance and in the potential for behaviour [12]. Motor learning is also generally understood as the acquisition of new skilled movements. Learning is not directly visible, because the processes that produce the change are internal. If we observe someone learning a new movement, we notice that the target is not reached in the first attempts, which are coarse or wrong. The more complex the movement, the more time is needed to learn it, adding new movement segments along the way. Learning therefore requires repetition. In 1975 Schmidt said that “the number of repeats of the movement to learn represent a basic element in order to form and strengthen the schema of the action. Such executions are necessary to store information on initial conditions, the parameters used in the response, on sensory feedback and the achieved results” [14].

Performance improves with each new execution until a relatively stable pattern forms, through which the movement comes close to the intended one. In humans the learning process begins in newborns, who unconsciously learn to recognize and use their own body and so begin to interact with the environment. We should also not forget that the phenomenon is not purely neurophysiological, since it has important psychological implications too. As Donald Hebb guessed, learning is an “experience-dependent” process [15]: everything we experience can potentially have a significant influence on our neuronal connections and our brain, a phenomenon known as neural plasticity.

Building on the studies of Hebb on neural plasticity and those of Ramon y Cajal, it was discovered that the hippocampus and its plasticity (long-term potentiation, LTP) are the basis of cognitive learning and memory. It turned out that stimulation of the sensory cortex can produce LTP in the motor cortex [17]. This indicates that repeated practice of a movement, which sends input to the sensory cortex, produces LTP in the motor cortex. Insight into the neural basis of learning also came from progress in surgery. In 1957 Scoville and Milner reported a historic finding about the genesis of learning and memory. Scoville operated on an epileptic patient, removing the temporal cortex and the hippocampus. Although the operation was successful, the patient suffered from anterograde amnesia: he could not remember daily experiences, although he maintained the long-term memories he had acquired before the operation [18]. Similar observations were described in other patients and in animals, and it is generally accepted that the hippocampus and related structures are involved in the formation of recent memories.

Various types of motor learning have been studied. Several studies on animals confirmed the important role of the cerebellum and the motor cortex in learning. Sasaki, working with monkeys, proposed the premotor and frontal cortices [19], as well as the cerebellum, as the sites for learning a conditioned response. Thompson and Woody studied the conditioning of the eye-blink reflex in the cat and proposed that the cerebellum and motor cortex are the sites for learning [20].

Another theory of motor learning is linked to motor primitives. According to this theory, the brain may control complex movements through a flexible combination of primitives, where each primitive is an element of computation in the sensorimotor map that transforms desired limb trajectories into motor commands. Theoretical studies have shown that a system’s ability to learn an action depends on the shape of its primitives. It has been shown that humans learn the dynamics of reaching movements through a flexible combination of primitives that have Gaussian-like tuning functions encoding hand velocity [21]. These motor patterns are available from birth and serve survival and exploratory functions, which enable humans to build the sensorimotor maps that we use as the basis for other activities [1]. Some recent studies support this hypothesis and suggest that the nervous system combines primitives according to the task to be accomplished [2]. Evidence for this modular construction of movement comes from recording the EMG signals of a sample of individuals. By applying a decomposition algorithm to these signals, common spatiotemporal regularities have been observed, shared across individuals and across different activities. In Fig. 1, the EMG signals acquired during the experiment are analyzed and a dimensionality reduction algorithm is applied to extract the synergies.

Fig. 1
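To make the idea of extracting synergies more concrete, here is a minimal sketch in Python. It assumes that the dimensionality reduction step is non-negative matrix factorization (NMF) with multiplicative updates, which is one common choice for EMG data but not necessarily the algorithm used in the study behind Fig. 1. The EMG matrix is synthetic, and the names `nmf`, `true_synergies` and `true_activations` are made up for the example.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    cols_b = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols_b] for row in a]


def transpose(a):
    """Swap rows and columns of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*a)]


def frobenius_error(v, w, h):
    """Frobenius norm of the reconstruction error v - w*h."""
    approx = matmul(w, h)
    return sum((x - y) ** 2 for rv, ra in zip(v, approx) for x, y in zip(rv, ra)) ** 0.5


def nmf(v, rank, iterations=200, seed=0, eps=1e-9):
    """Factor v (muscles x time) into w (muscles x rank) and h (rank x time)."""
    rng = random.Random(seed)
    n_rows, n_cols = len(v), len(v[0])
    w = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n_rows)]
    h = [[rng.random() + 0.1 for _ in range(n_cols)] for _ in range(rank)]
    for _ in range(iterations):
        # Update the activations: h <- h * (w^T v) / (w^T w h)
        wt = transpose(w)
        num = matmul(wt, v)
        den = matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(n_cols)]
             for i in range(rank)]
        # Update the synergies: w <- w * (v h^T) / (w h h^T)
        ht = transpose(h)
        num = matmul(v, ht)
        den = matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(rank)]
             for i in range(n_rows)]
    return w, h


# Synthetic "EMG envelopes": 4 muscles driven by 2 hypothetical synergies.
true_synergies = [[1.0, 0.0], [0.8, 0.1], [0.0, 1.0], [0.2, 0.9]]
true_activations = [
    [max(0.0, math.sin(0.3 * t)) for t in range(20)],
    [max(0.0, math.cos(0.3 * t)) for t in range(20)],
]
emg = matmul(true_synergies, true_activations)

synergies, activations = nmf(emg, rank=2)
print("reconstruction error:", frobenius_error(emg, synergies, activations))
```

Each column of `synergies` is a pattern of muscle weights and each row of `activations` says how strongly that pattern is recruited over time. With real recordings, the rectified and filtered EMG envelopes would replace the synthetic matrix.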

So what could be a method to speed up motor learning and reduce the number of trial-and-error attempts?

Imitating motor skills relies on the ability to recognize actions and to transform visual patterns into motor commands. An important prerequisite for imitation is a connection between the sensory systems and the motor systems, so that percepts can be mapped into actions. But are there particular areas in our brain specialized for imitation? Perrett [3] [4] reported that neurons within the superior temporal sulcus (STS) of macaques respond to both the form and the motion of objects. Many cells were also sensitive to movements of specific body parts of an observed human. In the lower part of the STS, similar phenomena were found for actions of the hand [4]. From these results, Perrett concluded that the STS is suited to extracting the attention and goals of others [24]. STS neurons are thus in an ideal position to analyze the movement of others and seem to be a candidate for a first processing step in imitation.

Other areas are involved in imitation as well. Rizzolatti found neurons in area F5 that were specific to the execution of goal-related movements, e.g., reaching or bringing to the body [25]. Recent studies showed that a subset of the neurons located in the rostral part of inferior area F5 (Fig.) of the monkey become active both when the monkey moves and when it observes the experimenter or another monkey performing “an action similar to the one that, when actively performed, triggers [that] neuron”. Neurons with this property are called “mirror neurons”. Mirror neurons are responsible “for matching the neural command for an action with the neural code for the recognition of the same action executed by another primate” [26].

There is increasing evidence that a mirror neuron system also exists in humans. The first evidence was provided by Fadiga and associates, who delivered single-pulse transcranial magnetic stimulation (TMS) to volunteers while they observed an experimenter performing different hand actions [28]. Single-pulse TMS was also delivered during observation of the same objects, observation of an experimenter tracing figures in the air with his arm, and detection of a dimming light. Motor evoked potentials (MEPs) were recorded from extrinsic and intrinsic hand muscles. The results showed that action observation, but not the other conditions, led to an increase of MEPs in the same hand muscles that the observer would use to actually execute the observed action. These TMS data support the notion of a mirror neuron system that matches action execution and action observation. Craighero and associates reported more recent results from a study in which people were asked to prepare to grasp, as fast as possible, a bar oriented either clockwise or counterclockwise after being shown a picture of a right hand [30]. In the first experiment the picture represented a mirror image of the final position of the hand required to grasp the bar. The second experiment included the same stimuli as the first plus two pictures of the hand rotated by 90 degrees, leftward and rightward. In both experiments, participants responded faster when the hand orientation in the picture corresponded to the one reached by the hand at the end of the action when actually executed.

From a psychological point of view, several studies have been carried out, mainly in the cognitive sciences. Jean Piaget was the first to describe “deferred imitation”, the repetition of a behavior some time after it was observed. Piaget noted that this ability appears in children between 18 and 24 months of age. Before that, infants are unable to hold a behavior in memory and recall it later. Children then develop the ability to mentally represent the behavior and repeat it, for example a child mimicking their parents cooking dinner by playing with pots and pans and pretending to cook. Other studies show that animals cannot do this, so we can see imitation as an expression of intelligence. All of this research brings out the importance of imitation in motor learning.

In robotics, the first approach used was symbolic reasoning. During a training phase, different example movements that achieved a given task were generated under manual robot control. Sensor readings, i.e. position and force, were stored during the demonstration together with the positions and orientations of the goal state. In the case of imitation, the goal is divided into sub-goals and the movement into “primitive actions”. An example of a sub-goal is the orientation of an end-effector relative to a goal position. These primitives can be labeled, so that a state-action paradigm is created: when the robot is in a given state, it performs a given action. This high-level representation results in a graph, where each state becomes a node and each action a link between two nodes.
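The state-action graph described above can be sketched in a few lines of Python. This is only an illustration: the state and action labels are hypothetical names for a pick-and-place demonstration, and the breadth-first search in `plan` is an assumed way of reading a sequence of primitive actions out of the graph, not the method of any specific paper cited here.

```python
from collections import deque

# Hypothetical labels for a pick-and-place demonstration. Each state maps
# the actions available in it to the state that the action leads to.
graph = {
    "gripper_open_above_object": {"lower_arm": "gripper_open_at_object"},
    "gripper_open_at_object": {"close_gripper": "object_grasped"},
    "object_grasped": {"lift_arm": "object_lifted"},
    "object_lifted": {"move_to_goal": "object_above_goal"},
    "object_above_goal": {"open_gripper": "object_placed"},
    "object_placed": {},
}


def plan(graph, start, goal):
    """Breadth-first search returning the list of actions from start to goal.

    Returns None when the goal cannot be reached from the start state.
    """
    queue = deque([(start, [])])
    visited = {start}
    while queue:
        state, actions = queue.popleft()
        if state == goal:
            return actions
        for action, next_state in graph.get(state, {}).items():
            if next_state not in visited:
                visited.add(next_state)
                queue.append((next_state, actions + [action]))
    return None


actions = plan(graph, "gripper_open_above_object", "object_placed")
print(actions)
```

Each node is a sub-goal and each link is a primitive action, so the robot only has to look up its current state to know what to do next.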

Learning complex tasks, composed of a combination of individual motions, is the ultimate goal of imitation learning. One approach is to first learn models of all the individual motions, using demonstrations of each action separately [31], and then learn the right combination in a second stage, either by observing a human performing the whole task [32] or through reinforcement learning [33]. An alternative is to watch the human perform the complete task and automatically extract the primitive actions, as in [34]. The interface used to show the robot the information is very important. Some projects record human motion directly, using vision, exoskeletons or other wearable motion sensors [35] [36]. Another type of teaching is kinesthetic teaching, where the robot is physically guided by a human. The main issue with this kind of teaching is that teachers often use more of their own degrees of freedom to move the robot than the number of degrees of freedom they are trying to control.

Several bio-inspired models of movement have been developed in robotics. Among the most widely used are CPG models, which are inspired by central pattern generators. Central pattern generators (CPGs) are neural networks capable of producing coordinated patterns of rhythmic activity without any rhythmic inputs from sensory feedback or from higher control centers [38]. CPGs are important for movement, breathing and rhythm generation. In 1994, Calancie claimed to have witnessed the “first well-defined example of a central rhythm generator for stepping in the adult human”. The subject was a 37-year-old male who had suffered an injury to the cervical spinal cord 17 years earlier. After initial paralysis below the neck, the subject regained some movement of the arms and fingers and limited movement in the lower limbs. After 17 years, the subject found that when lying supine and extending his hips, his lower extremities underwent step-like movements for as long as he remained lying down. “The movements (i) involved alternating flexion and extension of his hips, knees, and ankles; (ii) were smooth and rhythmic; (iii) were forceful enough that the subject soon became uncomfortable due to excessive muscle ’tightness’ and an elevated body temperature; and (iv) could not be stopped by voluntary effort”. After extensive study of the subject, the experimenters concluded that “these data represent the clearest evidence to date that such a [CPG] network does exist in man” [40]. As a consequence, several models based on this idea have been developed.

Salamander CPG model tested with an amphibious salamander-like robot
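To give a feeling for how a CPG model produces rhythm without any rhythmic input, here is a minimal sketch in Python. It assumes a chain of coupled phase oscillators, which is one common way to model a CPG in robotics. The function name `simulate_cpg` and all parameter values are illustrative and are not taken from the salamander model in the figure.

```python
import math


def simulate_cpg(n_oscillators=4, frequency=1.0, coupling=4.0,
                 phase_lag=math.pi / 4, dt=0.01, duration=10.0):
    """Integrate a chain of coupled phase oscillators with forward Euler.

    Each oscillator advances at a constant intrinsic frequency and is
    pulled towards a fixed phase lag with respect to its neighbours.
    Returns the final phases and the output signal of every time step.
    """
    phases = [0.0] * n_oscillators
    outputs = []
    for _ in range(round(duration / dt)):
        rates = []
        for i in range(n_oscillators):
            rate = 2.0 * math.pi * frequency
            if i + 1 < n_oscillators:
                # Coupling from the next oscillator in the chain.
                rate += coupling * math.sin(phases[i + 1] - phases[i] - phase_lag)
            if i > 0:
                # Coupling from the previous oscillator in the chain.
                rate += coupling * math.sin(phases[i - 1] - phases[i] + phase_lag)
            rates.append(rate)
        phases = [p + r * dt for p, r in zip(phases, rates)]
        # The rhythmic output of each oscillator, e.g. a joint set-point.
        outputs.append([math.cos(p) for p in phases])
    return phases, outputs


phases, outputs = simulate_cpg()
lags = [phases[i + 1] - phases[i] for i in range(len(phases) - 1)]
print("phase lags between neighbours:", lags)
```

The only inputs are constant parameters (frequency, coupling strength, desired lag), yet the outputs are coordinated oscillations with a stable phase lag along the chain, the kind of travelling wave used for swimming or crawling gaits.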

Many robotics experiments use CPG models. These experiments work on actions like crawling, flying, walking, running and swimming [44] [47], or on online trajectory generation [45]. Another bio-inspired way to generate trajectories is to use a neural network (NN) model. Many neural network models have been proposed to generate trajectories in real time. In 1994 Li and Ogmen proposed an NN model combining an adaptive sensorimotor mapping model with online visual error correction [48]. Many other models have been developed for trajectory generation, but they are not bio-inspired. A different bio-inspired approach to movement generation is the concept of motor primitives. As we have seen, several studies show that this concept also exists in biology. In the literature we find two main models for representing primitives: force fields (FF) [7] and dynamic movement primitives (DMPs) [50]. A force field is a functional unit of the spinal cord that generates a motor output linked to a synergy (Fig. 2.4).

Spinal cord region with neural circuits for the force fields

This synergy produces a force that brings the end-effector to a given position in space. A force field is a vector field that associates, with every position of the body, the force produced by a synergy. The force field of a complex movement is the sum of different force fields. In Fig. 2.5, fields A and B are the output of stimulating two different spinal sites. The resulting field is obtained by stimulating the two sites simultaneously. It is very similar to the sum field, that is, the vector sum of A and B.

Vectorial sum of force fields
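The vector sum of two force fields is easy to sketch in Python. The example below assumes that each field behaves like a linear spring pulling the end-effector towards its own equilibrium point. That is a simplification chosen for illustration, and the names `convergent_field` and `sum_fields` as well as the numbers are hypothetical.

```python
def convergent_field(equilibrium, stiffness):
    """Return a 2D force field pulling the end-effector towards an equilibrium point."""
    ex, ey = equilibrium

    def field(position):
        x, y = position
        return (-stiffness * (x - ex), -stiffness * (y - ey))

    return field


def sum_fields(*fields):
    """Return the field obtained by adding the forces of all given fields."""

    def field(position):
        forces = [f(position) for f in fields]
        return (sum(fx for fx, _ in forces), sum(fy for _, fy in forces))

    return field


# Two fields standing in for the outputs of two stimulation sites, A and B.
field_a = convergent_field((0.0, 0.0), stiffness=1.0)
field_b = convergent_field((2.0, 1.0), stiffness=3.0)
field_sum = sum_fields(field_a, field_b)

# For spring-like fields the sum is again convergent, and its equilibrium
# is the stiffness-weighted average of the two equilibria.
equilibrium = (
    (1.0 * 0.0 + 3.0 * 2.0) / (1.0 + 3.0),
    (1.0 * 0.0 + 3.0 * 1.0) / (1.0 + 3.0),
)
print("force at the combined equilibrium:", field_sum(equilibrium))
```

Changing the relative strength of the two fields moves the combined equilibrium along the line between the two original ones, which is how a small set of fields can cover many target positions.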

We can see these modules as an alphabet made of units, and these units can be superimposed to generate a movement. Movement planning works in end-effector space, so there is the problem of transforming from end-effector space to muscle space. Another approach turns movement generation into an optimal control problem over the complex movement, without primitives [51]. If we apply dynamic programming techniques to this problem we obtain nonlinear equations whose solutions are hard to find, especially considering that today’s robots have a high number of degrees of freedom. The workaround is to rewrite the equations with simplifications, which means the result is no longer optimal, and this brings us to the DMP model.

DMPs are a model of primitives based on a set of differential equations that encode the movement from a kinematic point of view. They were presented in 2002 by Schaal [52] and then updated in 2013 by Auke Ijspeert [53]. The motivation was to find a way to represent complex motor actions that can be flexibly adjusted without manual parameter tuning. DMPs are a proposed mathematical formalization of these primitives, in which each DMP is a nonlinear dynamical system. The basic idea is to take a dynamical system with well-specified, stable behaviour and add another term that makes it follow some interesting trajectory. There are two kinds of DMPs: discrete and rhythmic. For discrete movements the base system is a point attractor, and for rhythmic movements a limit cycle is used.
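As a rough sketch of the discrete case, the Python code below follows the commonly used formulation described in [53]: a critically damped spring pulls the state towards the goal, a forcing term built from Gaussian basis functions shapes the path, and a decaying phase variable makes the forcing term vanish over time. The gains, the basis width heuristic and the weights are illustrative assumptions. In practice the weights are fitted to a demonstrated trajectory.

```python
import math


def rollout_dmp(y0, goal, weights, tau=1.0, dt=0.001, duration=2.0,
                alpha_z=25.0, alpha_x=8.0):
    """Integrate a one-dimensional discrete DMP with forward Euler.

    y is the position, z the scaled velocity and x the phase variable
    that decays from 1 to 0. Returns the list of positions over time.
    """
    beta_z = alpha_z / 4.0  # critically damped point attractor
    n = len(weights)
    # Basis centres equally spaced in time, mapped through the phase decay.
    centres = [math.exp(-alpha_x * i / max(n - 1, 1)) for i in range(n)]
    # Width heuristic (an assumption): narrower bases where the phase is small.
    widths = [n ** 1.5 / c / alpha_x for c in centres]

    y, z, x = y0, 0.0, 1.0
    trajectory = [y]
    for _ in range(round(duration / dt)):
        psi = [math.exp(-h * (x - c) ** 2) for c, h in zip(centres, widths)]
        total = sum(psi)
        forcing = 0.0
        if total > 1e-12:
            # Weighted basis functions, gated by the phase and scaled by the
            # movement amplitude, so the term fades out as x goes to zero.
            forcing = sum(p * w for p, w in zip(psi, weights)) / total * x * (goal - y0)
        dz = (alpha_z * (beta_z * (goal - y) - z) + forcing) / tau
        dy = z / tau
        dx = -alpha_x * x / tau
        z += dz * dt
        y += dy * dt
        x += dx * dt
        trajectory.append(y)
    return trajectory


# With zero weights the DMP is just the point attractor.
plain = rollout_dmp(y0=0.0, goal=1.0, weights=[0.0] * 5)
# Non-zero (made-up) weights bend the path but keep the same end point.
shaped = rollout_dmp(y0=0.0, goal=1.0, weights=[50.0, -30.0, 20.0, 0.0, 0.0])
print("final positions:", plain[-1], shaped[-1])
```

Both rollouts end at the goal. Only the path in between differs, which is the property that makes DMPs easy to adjust: the goal can be changed without relearning the weights.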

I’ll elaborate on this concept in a future article.

Bibliography

[1] R. W. Paine and J. Tani, “Adaptive motor primitive and sequence
formation in a hierarchical recurrent neural network,” Neural Networks,
vol. 17, 2004.

[2] C. Alessandro, “Muscle synergies in neuroscience and robotics: from
input-space to task-space perspectives,” Front. Comput. Neurosci.,
2013.

[3] D. I. Perrett, P. A. Smith, A. J. Mistlin, A. J. Chitty, A. S. Head,
D. D. Potter, R. Broennimann, A. D. Milner, and M. A. Jeeves, “Visual
analysis of body movements by neurones in the temporal cortex of the
macaque monkey: a preliminary report,” Behavioural Brain Research,
vol. 16, 1985.

[4] D. I. Perrett, M. H. Harries, R. Bevan, S. Thomas, P. J. Benson,
A. J. Mistlin, A. J. Chitty, J. K. Hietanen, and J. E. Ortega, “Frameworks of analysis for the neural representation of animate objects and
actions,” Journal of Experimental Biology, vol. 146, pp. 87–113, 1989.

[7] E. Bizzi and F. A. Mussa-Ivaldi, “Toward a neurobiology of coordinate
transformations,” The Cognitive Neurosciences, 2013.

[11] G. Metta, L. Natale, F. Nori, G. Sandini, D. Vernon, L. Fadiga, C. von
Hofsten, K. Rosander, M. Lopes, J. Santos-Victor, A. Bernardino,
and L. Montesano, “The iCub humanoid robot: An open-systems platform for research in cognitive development,” Neural Networks, 2010.

[12] R. Magill, Motor Learning: Concepts and Applications. McGraw-Hill,
6th ed., 2001.

[14] R. Schmidt, “A schema theory of discrete motor skill learning,” Psychological Review, vol. 82, 1975.

[15] D. Hebb, The organization of behaviour. Wiley, 1949.

[17] T. Sakamoto, K. Arissian, and H. Asanuma, “Functional role of the
sensory cortex in learning motor skills in cats,” Brain Research, vol. 503,
1989.

[18] W. Scoville and B. Milner, “Loss of recent memory after bilateral hippocampal lesions,” Journal of Neurology, Neurosurgery and Psychiatry,
vol. 20, 1957.

[19] K. Sasaki, “Development and change of cortical field potentials during
learning processes of visually initiated hand movements in the monkey,”
Springer Journals, 1982.

[20] R. Thompson, “The neurobiology of learning and memory,” Science,
vol. 233, pp. 941–947, 1986.

[21] K. Thoroughman and R. Shadmehr, “Learning of action through adaptive combination of motor primitives,” Nature, 2000.

[24] D. I. Perrett, M. H. Harries, A. J. Mistlin, J. K. Hietanen,
P. J. Benson, R. Bevan, S. Thomas, M. W. Oram, J. Ortega, and K. Brierley,
“Social signals analyzed at the single cell level: Someone is looking at
me, something touched me, something moved!,” International Journal
of Comparative Psychology, vol. 4, pp. 25–55, 1990.

[25] G. Rizzolatti, R. Camarda, L. Fogassi, M. Gentilucci, G. Luppino, and
M. Matelli, “Functional organization of inferior area 6 in the macaque
monkey. II. Area F5 and the control of distal movements,” Experimental
Brain Research, vol. 71, pp. 491–507, 1988.

[26] G. di Pellegrino, L. Fadiga, L. Fogassi, V. Gallese, and G. Rizzolatti,
“Understanding motor events: a neurophysiological study,” Experimental Brain Research, vol. 91, 1992.

[28] L. Fadiga, L. Fogassi, G. Pavesi, and G. Rizzolatti, “Motor facilitation
during action observation: a magnetic stimulation study,” Journal of
Neurophysiology, vol. 73, pp. 2608–2611, 1995.

[30] L. Craighero, A. Bello, L. Fadiga, and G. Rizzolatti, “Hand action
preparation influences the responses to hand pictures,” Neuropsychologia, vol. 40, pp. 492–502, 2002.

[31] O. Mangin and P.-Y. Oudeyer, “Unsupervised learning of simultaneous
motor primitives through imitation,” IEEE International Conference
on Developmental Learning, 2011.

[32] A. Skoglund, B. Iliev, B. Kadmiry, and R. Palm, “Programming by
demonstration of pick-and-place tasks for industrial manipulators using task primitives,” International Symposium on Computational Intelligence in Robotics and Automation, 2007.

[33] K. Mülling, J. Kober, O. Krömer, and J. Peters, “Learning to select
and generalize striking movements in robot table tennis,” International
Journal of Robotics Research, pp. 280–298, 2013.

[34] D. Kulic, C. Ott, D. Lee, J. Ishikawa, and Y. Nakamura, “Incremental
learning of full body motion primitives and their sequencing through
human motion observation,” The International Journal of Robotics Research, pp. 330–345, 2012.

[35] D. Kulic, W. Takano, and Y. Nakamura, “Incremental learning, clustering and hierarchy formation of whole body motion patterns using
adaptive hidden Markov chains,” The International Journal of Robotics
Research, vol. 27, pp. 761–784, 2008.

[36] A. Ude, C. Atkeson, and M. Riley, “Programming full-body movements
for humanoid robots by observation,” Robotics and Autonomous Systems, pp. 93–108, 2004.

[38] S. L. Hooper, “Central pattern generators,” Encyclopedia of Life Sciences, 1999–2010.

[40] B. Calancie, B. Needham-Shropshire, P. Jacobs, K. Willer, and G. Zych,
“Involuntary stepping after chronic spinal cord injury. Evidence for a
central rhythm generator for locomotion in man,” Brain, vol. 117, 1994.

[44] A. Crespi and A. Ijspeert, “Amphibot ii: An amphibious snake robot
that crawls and swims using a central pattern generator,” Proceedings
of the 9th International Conference on Climbing and Walking Robots,
2006.

[45] A. J. Ijspeert and A. Crespi, “Online trajectory generation in an amphibious snake robot using a lamprey-like central pattern generator
model,” Proceedings of the IEEE International Conference on Robotics
and Automation, 2007.

[47] Y. Fukuoka, H. Kimura, and A. H. Cohen, “Adaptive dynamic walking
of a quadruped robot on irregular terrain based on biological concepts,”
The International Journal of Robotics Research, pp. 187–202, 2003.

[48] L. Li and H. Ogmen, “Visually guided motor control: Adaptive sensorimotor
mapping with on-line visual-error correction,” Proceedings of the World
Congress on Neural Networks, 1994.

[50] S. Schaal and J. Peters, “Control, planning, learning, and imitation
with dynamic movement primitives,” IEEE Int. Conf. on Intelligent
Robots and Systems, 2003.

[51] E. Todorov, Bayesian brain: probabilistic approaches to neural coding.
MIT press, 2006.

[52] S. Schaal, “Dynamic movement primitives — a framework for motor control in humans and humanoid robots,” The International Symposium
on Adaptive Motion of Animals and Machines, 2003.

[53] A. Ijspeert, J. Nakanishi, P. Pastor, H. Hoffmann, and S. Schaal, “Dynamical movement primitives: Learning attractor models for motor behaviors,” Neural Computation, vol. 25, pp. 328–373, 2013.

Matteo Curci

Computer engineer passionate about everything called Coding