Background

Imagination and mental simulation

a. Revisiting the developmental path

In this study, we revisit the developmental path of the infant. Table 1 shows the functions of interest. Five-month-old infants begin to stare at their own hands, which can aid in learning forward and inverse models of the hand. At 6 months, babies touch the faces of people who hold them, which may allow them to integrate visual and tactile information about the face. Infants at this age also turn objects that they hold so that the objects can be seen from various angles; this helps them learn to recognize 3D objects. Infants learn causality and the permanence of objects at around 7 months of age. By 8 months, they learn dynamical models of objects and associations among visual information, auditory information, and motor commands. They start to discover tool use at about 9 months. At 10 months, infants begin to imitate, including actions that they cannot see themselves perform (e.g., pats on the head). By 11 months, infants grasp objects with a precision grip and hand objects to others. Around this age, the basis of action recognition and cooperative action forms. At 12 months, infants begin to play make-believe, which is the origin of mental simulation. After 12 months, they perform complicated imitation while watching others’ actions. This progression shows that infants develop, in a stepwise manner, a body schema of the self, objects to grasp, tools to use, internal simulation, action recognition, and imitation.

Table 1 Infant Development and Learning Targets

b. Object permanence

Between approximately 6 months and 1 year of age, infants acquire the knowledge that objects continue to exist even when they are no longer visible. This concept is called ‘object permanence’ and is related to the development of image generation and motor prediction abilities in the brain. These predictive abilities are considered to develop along with the infant’s acquisition of goal-directed movements (reaching, or reaching to grasp). In addition, predictive control mechanisms may play important roles in the development of nonverbal communication, such as pointing or imitation.

c. Frames of reference

One of the developmental goals of infancy is to establish frames of reference that allow the infant to perceive and understand objects in the external world, and to choose the frame of reference appropriate to the situation. There are two types of frame of reference: egocentric and allocentric. Both are tightly linked to the body schema, and various higher cognitive functions are built on them. In the first year of life, it is most important to establish these two frames of reference and to learn the mutual transformation between them. Based on a number of neuroimaging studies, Inui [B-20] hypothesized that the left parietal lobe is involved in egocentric description, whereas the right parietal lobe is involved in allocentric description. In addition, he suggested that the left parietal lobe projects an external stimulus onto one’s own body, whereas the right parietal lobe projects one’s own body onto an external object or another body. He also presumed that body images are generated by a predictive control system for one’s own body.
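The mutual transformation between the two frames can be illustrated geometrically. The following is a minimal sketch, assuming a 2D world, a body described only by a position and a heading angle, and an egocentric frame whose x-axis points straight ahead of the body. The function names are hypothetical, and the sketch makes no claim about how the parietal lobes implement the transformation.

```python
import math

def allocentric_to_egocentric(point, self_position, self_heading):
    """Express a world-frame (allocentric) point in the body-centered (egocentric) frame.

    Assumption: self_heading is the angle, in radians, between the world x-axis
    and the direction the body is facing.
    """
    dx = point[0] - self_position[0]
    dy = point[1] - self_position[1]
    c, s = math.cos(self_heading), math.sin(self_heading)
    # Translate to the body origin, then rotate by -heading.
    return (c * dx + s * dy, -s * dx + c * dy)

def egocentric_to_allocentric(point, self_position, self_heading):
    """Inverse mapping: rotate by +heading, then translate back to the world origin."""
    c, s = math.cos(self_heading), math.sin(self_heading)
    return (self_position[0] + c * point[0] - s * point[1],
            self_position[1] + s * point[0] + c * point[1])

# A body at (1, 1) facing along the world y-axis sees an object at (1, 3)
# as lying straight ahead at a distance of 2.
ego = allocentric_to_egocentric((1.0, 3.0), (1.0, 1.0), math.pi / 2)
world = egocentric_to_allocentric(ego, (1.0, 1.0), math.pi / 2)
```

Both directions need the body's own position and heading, which is one way to read the statement that the two frames are tightly linked to the body schema.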

d. Necessary conditions and the principles for development of the mind

The human mind is essential for survival and for the conservation of the species. Survival requires the following functions: “predicting the behaviors of others,” “accurately assessing situations,” “predicting future evolution of the present situation,” and “selecting appropriate behaviors.” In reality, however, these capabilities are difficult to achieve because predicting behavior depends on the mental states of other people. Thus, facial expressions and gestures become important cues for assessing situations. To achieve these functions, the following necessary conditions for the mind are proposed:

a) The ability to construct an egocentric model (one’s own actions are inputs to the model).
b) The ability to predict object behaviors and actions based on this model.
c) The ability to perform mental simulations based on this model.

Two original programs that satisfy these necessary conditions are also proposed:

a) Imitation program
b) Prediction program

Both programs are indispensable for the development of the mind because mental functions must be utilized for intraspecific communication.

Significance of imitation and action understanding

As previously mentioned, imitation learning is important for the acquisition of internal models of the external environment. The input to these models takes the form of motor commands. The models can be described by visual information as well as by one’s own motor generation. The following four types of information are important for the formation of the functions of the mind:

a) echolalia: vocalization;
b) empathic imitation: expression;
c) joint attention: visual direction; and
d) second-order imitation, i.e., make-believe play or pantomime: operation and action.

The basis of imitation is the transformation of visual and auditory information into motor command signals. Research involving autistic children has shown that second-order imitation plays a significant role in language acquisition. Autistic children have difficulty with imitation and with understanding gestures and pantomime. Second-order imitation must invoke motor memory independently of the target. Recent neuroscience studies indicate that imitation skills and recognition skills are two sides of the same coin. In other words, the key is whether the forward model can be operated without sensory input.
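The idea of operating a forward model without sensory input can be made concrete with a toy example. The following is a minimal sketch, assuming a one-dimensional hand position, velocity-like motor commands, and a fixed blending gain for sensory feedback. All names and values are hypothetical illustrations, not a model taken from the studies cited here.

```python
def forward_model(state, command, dt=0.1):
    """Predict the next hand position from the current position and a motor command."""
    return state + command * dt

def simulate(commands, start, observations=None, gain=0.5):
    """Run the forward model over a sequence of motor commands.

    With observations, each prediction is corrected toward the sensed position.
    Without observations, the model runs on its own predictions only, which is
    the 'mental simulation' case.
    """
    state = start
    trajectory = []
    for step, command in enumerate(commands):
        predicted = forward_model(state, command)
        if observations is not None:
            # Blend the prediction with sensory feedback.
            state = predicted + gain * (observations[step] - predicted)
        else:
            # No sensory input: the prediction becomes the next state.
            state = predicted
        trajectory.append(state)
    return trajectory

# Imagined movement: same motor commands, no sensory input.
imagined = simulate([1.0, 1.0, 1.0], start=0.0)
# Executed movement: the same commands, corrected by (hypothetical) observations.
executed = simulate([1.0, 1.0, 1.0], start=0.0, observations=[0.2, 0.4, 0.6])
```

In this sketch the only difference between executing and imagining a movement is whether the correction step receives observations; the forward model itself is unchanged.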

Imitation and language

The most important aspect of language processing is syntactic. Grammar acquisition is therefore a significant function in cognitive development. In this study, we hypothesize the following:

1) Cognitive function, including grammar acquisition, develops mainly through imitation.
2) Through imitation, objects are represented in egocentric coordinates.
3) Egocentric representation is a forward model of objects, the input to which is one’s own body movement.
4) Egocentric representation includes prediction of change or movement of objects in time.

The important aspect of imitation assumed above is the transformation of visual information into motor commands (i.e., imitating by watching others’ actions and understanding gestures) and the transformation of auditory information into motor commands (i.e., repetition or rehearsal). Prediction and imitation may therefore be unified as a single function; the “motor sequence prediction hypothesis” states that imitation learning is preceded by prediction of a motor command.
Note that “motor sequence” does not mean the individual muscle command signals generated in BA4 (Brodmann area 4, the primary motor cortex). Rather, it refers to more abstract “action units”, that is, chunk codes of more complex command signals. These units may correspond to either visual or auditory segments.

Outcome

A detailed study of the process of cognitive development, based on the above research, suggests that the following functions, under the principle of subject-object inseparability, play important roles in cognitive development, especially in the development of communication functions:

1) Synesthetic acquisition of body schema by the fetus;
2) Contingency control (including maximization principle) by the newborn;
3) Learning of forward and inverse models of own body movement; and
4) Learning of coordinate transformation.

Decoupled representations are learned after these functions.

The body schema acquired through function 1) forms the foundation for the initial preference for attending to the human face. Attending to stimuli that exhibit high contingency, as in 2), promotes learning of forward and inverse models of one’s own body, as in 3). As we will discuss later, findings from developmental psychology strongly suggest that forward and inverse models of the hands are learned simultaneously. This way of learning causes infants to reach toward objects that are out of reach, and it leads to the function underlying finger pointing. The acquisition of a forward model of movement allows motor prediction, and the prediction function is in turn the basis of object permanence and image generation. Once an internal model of movement is acquired, attending to the movements of non-contingent others or objects becomes possible. Then, learning of 4) enables perception and recognition of an object in egocentric coordinates as well as in object-centered or allocentric coordinates. This type of coordinate transformation requires the movement prediction signals generated by an acquired forward model. After this learning, context-independent symbolizing (i.e., decoupling) could be promoted. Decoupling enables the individual to see something as a different entity or to enjoy make-believe play. In the following sections, we will report the outcomes of our research on these functions.
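The simultaneous learning of forward and inverse models described above can be sketched with a toy example. It assumes a one-dimensional "body" whose hand displacement is a fixed multiple of the motor command, linear models with one weight each, and a simple error-driven (least-mean-squares) update. The plant, the learning rate, and all names are hypothetical and chosen only for illustration.

```python
def plant(command):
    """Hypothetical body: hand displacement is twice the motor command."""
    return 2.0 * command

def learn_models(commands, learning_rate=0.05, epochs=200):
    """Learn a forward and an inverse model from the same self-generated movements."""
    w_forward = 0.0  # forward model: predicted outcome = w_forward * command
    w_inverse = 0.0  # inverse model: command = w_inverse * desired outcome
    for _ in range(epochs):
        for command in commands:
            outcome = plant(command)  # observe the result of one's own movement
            # Forward model: reduce the error in the predicted outcome.
            w_forward += learning_rate * (outcome - w_forward * command) * command
            # Inverse model: reduce the error in recovering the command that was issued.
            w_inverse += learning_rate * (command - w_inverse * outcome) * outcome
    return w_forward, w_inverse

# Each self-generated (command, outcome) pair trains both models at once.
w_forward, w_inverse = learn_models([-1.0, -0.5, 0.5, 1.0])
```

Under these assumptions the forward weight approaches 2.0 and the inverse weight approaches 0.5, so a single observed movement serves as training data for prediction and for control.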