Category Archives: language

Inner speech is close to uttered speech

There has recently been a paper in eLife by Whitford et al., Neurophysiological evidence of efference copies to inner speech (Dec 2017, doi: 10.7554/eLife.28197.001), examining inner speech. They find it very similar to overt speech.

When we speak, a series of motor commands is prepared and executed by the mouth, throat and vocal cords. Copies of these commands, called efference copies, are used to predict what the auditory area will hear. This prediction is called the internal forward model. When incoming sounds match the prediction, the auditory area lowers its response to the speech. This efference copy mechanism applies to other motor commands too, and is why we cannot tickle ourselves. The sensory pattern that will result from an action is predicted so that self-generated sensory input is attenuated compared to input that is not self-generated. In the case of speech, the actual sounds are predicted, and when input arrives at the right time and matches the expected sound, the sound is dampened. This dampening can be measured. The sounds produce a particular brain wave whose amplitude matches the volume of the sound, and it can be seen in EEG traces. It is called N1, indicating that it is the first negative wave produced by the event. This wave has less amplitude for sounds in self-generated speech than for identical sounds that were not self-generated.
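To make the prediction-and-dampening idea concrete, here is a toy sketch in Python. This is my own illustration, not the study's model – the attenuation factor, the timing tolerance and all the names are invented assumptions. An efference copy supplies a predicted sound, and an incoming sound that matches it in both content and timing gets a reduced response, while any unpredicted (external) sound gets a full one.

```python
# Toy sketch of efference-copy attenuation (illustrative only).
# A forward model predicts the expected sound; incoming sounds that
# match the prediction in content and timing are dampened.

def n1_amplitude(incoming, predicted, tolerance_ms=50, attenuation=0.5):
    """Return a simulated response amplitude for an incoming sound.

    incoming / predicted are dicts with 'phoneme', 'time_ms', 'volume';
    all names and numbers here are invented for illustration.
    """
    if predicted is None:
        return incoming["volume"]  # no prediction: full response
    same_content = incoming["phoneme"] == predicted["phoneme"]
    same_timing = abs(incoming["time_ms"] - predicted["time_ms"]) <= tolerance_ms
    if same_content and same_timing:
        return incoming["volume"] * attenuation  # matched prediction: dampened N1
    return incoming["volume"]

external = {"phoneme": "ba", "time_ms": 1000, "volume": 1.0}
self_prediction = {"phoneme": "ba", "time_ms": 1010, "volume": 1.0}

print(n1_amplitude(external, None))             # unpredicted sound: 1.0
print(n1_amplitude(external, self_prediction))  # matched prediction: 0.5
```

The key point the sketch captures is that the dampening is conditional: both the content and the timing must match, exactly as in the "ba" experiment described below.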

In their introduction the authors say “…the central aim of the present study is to explore whether N1-suppression, which has consistently been observed in response to overt speech, also occurs in response to inner speech, which is a purely mental action. Inner speech - also known as covert speech, imagined speech, or verbal thoughts - refers to the silent production of words in one’s mind. Inner speech is one of the most pervasive and ubiquitous of human activities; it has been estimated that most people spend at least a quarter of their lives engaged in inner speech. An influential account of inner speech suggests that it ultimately reflects a special case of overt speech in which the articulator organs (e.g., mouth, tongue, larynx) do not actually move; that is, inner speech is conceptualized as ‘a kind of action’. Support for this idea has been provided by studies showing that inner speech activates similar brain regions to overt speech, including audition and language-related perceptual areas and supplementary motor areas, but does not typically activate primary motor cortex. While previous data suggest that inner and overt speech share neural generators, relatively few neurophysiological studies have explored the extent to which these two processes are functionally equivalent. If inner speech is indeed a special case of overt speech - ‘a kind of action’ - then it would also be expected to have an associated internal forward model.” The researchers show that thinking of a particular sound (such as ba) attenuates the N1 signal of an external sound if the two are the same sound with matching timing. The inner speech efference copy and its internal forward model are produced in inner speech and can dampen an external sound if it matches the internal one.

Here is their abstract. “Efference copies refer to internal duplicates of movement-producing neural signals. Their primary function is to predict, and often suppress, the sensory consequences of willed movements. Efference copies have been almost exclusively investigated in the context of overt movements. The current electrophysiological study employed a novel design to show that inner speech – the silent production of words in one’s mind – is also associated with an efference copy. Participants produced an inner phoneme at a precisely specified time, at which an audible phoneme was concurrently presented. The production of the inner phoneme resulted in electrophysiological suppression, but only if the content of the inner phoneme matched the content of the audible phoneme. These results demonstrate that inner speech – a purely mental action – is associated with an efference copy with detailed auditory properties. These findings suggest that inner speech may ultimately reflect a special type of overt speech.”

This probably explains the nature of ‘hearing voices’. If this mechanism failed and inner speech was not properly predicted, it would appear to be external speech. It would not be ‘owned’ by the individual.

Does it ring true?

I make a point of not commenting on research into medical and psychological conditions. However, I am dyslexic and feel able to comment on research into that specific condition. I recognize that there are probably many types, levels and causes of dyslexia and so my reaction might not be the same as others. But I still automatically judge the research by ‘does it feel like it is right in my case?’

Several theories have fit with my experience of dyslexia. One is the idea that there is a problem with the corpus callosum, the nerves that connect the two hemispheres, in the region where sound processing is done, so that the left and right hemispheres do not properly cooperate on auditory information. This fits with my brother’s cleft palate and more severe dyslexia, and with my high palate. It might explain the lack of consciousness of what I am going to say that often happens to me. (It has only been on rare occasions that I have disagreed with what I have said.) I am left-handed and perhaps am not conscious of what the other hemisphere is preparing to say due to a lack of communication at some point along the corpus callosum.

Another theory points to a fault in the dorsal/ventral streams. This idea is that sensory information leaves the primary sensory areas via two paths called the dorsal and ventral streams, also called the ‘where/how’ and the ‘what’ streams. The dorsal (where) path leads to motor speech areas, is very fast, and not very conscious. The ventral (what) path leads to more cognitive areas where auditory information is converted into semantic information, is slower, and more conscious. These streams interact in some ways – they both map phonemes but in two different maps and those maps need to be consistent with one another. We need to recognize a phoneme and we need to speak a phoneme. Dyslexics appear to have great difficulty consciously recognizing individual phonemes. They also appear to have difficulty with very short phonemes in particular. This appears to have something to do with a lack of communication between the streams.

Reasonable oral skill (as opposed to written) is possible without phonological awareness by dealing with syllables as entities that are not divided into individual phonemes. The vowel in the syllable is modified by the consonants that precede or follow it. So the a in bat is different from the a in cap. It is not necessary to recognize the individual b, t, c or p in order to recognize the two words and produce them in speech, because the short consonants modify the vowel. This also rings true to me – it is how it feels. The inability to consciously recognize things as separate if they are close together, and very poor reflex times, also point to this timing problem with short consonants. It is odd, but I find it hard to explain to people how it is to hear a syllable clearly but not hear its components. It seems such a simple, obvious perception to me, a single indivisible sound.

Neither of these theories explains the symptoms of mixing up left and right, clockwise and counterclockwise, confusing something with its mirror image, and the ‘was’ and ‘saw’ problem. Nor do they explain the slight lag between knowing something was said and hearing what it was.

Theories that have to do with vision or with short-term memory do not seem to apply to me. Although I have to admit that I am not sure what a bad short-term memory would feel like. I certainly have an excellent long-term memory.

Recently a paper with a new theory has appeared. (Perrachione, Del Tufo, Winter, Murtagh, Cyr, Chang, Halverson, Ghosh, Christodoulou, Gabrieli; Dysfunction of Rapid Neural Adaptation in Dyslexia; Neuron 92, 1383–1397, December 2016) They looked at perceptual adaptation in dyslexics and non-dyslexics. Perceptual adaptation is the attenuation of perceptual processing of repetitive stimuli. So, for example, if the same voice says a list of words, there is less activity in parts of the brain than if a different voice delivers each word. The brain has adapted to the voice and that makes processing easier. They measured the adaptation using fMRI, in procedures featuring spoken words, written words, objects and faces, with adult subjects and with children just starting to read. In every case the adaptation was weaker for dyslexics than for controls. Also, the differences were in the areas involved in processing the particular type of stimulus (such as in visual areas for visual stimuli). The amount of adaptation in these areas correlated with the level of reading skill of the dyslexic. The research supports the idea that dysfunction in neural adaptation may be an important aspect of dyslexia.
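The adaptation effect itself is easy to picture with a toy model. This is my own illustration, not the authors' analysis, and the numbers are invented: the response to a repeated stimulus decays, a new stimulus restores it, and a smaller adaptation rate stands in for the weaker adaptation measured in dyslexic readers.

```python
# Toy illustration of perceptual adaptation (numbers invented):
# repeats of the same stimulus attenuate the response; a change
# of stimulus restores it.

def responses(stimuli, adaptation_rate):
    """Response level to each stimulus in sequence; a repeat of the
    previous stimulus is attenuated by a factor of (1 - adaptation_rate)."""
    out = []
    level = 1.0
    prev = None
    for s in stimuli:
        if s == prev:
            level *= (1 - adaptation_rate)   # same voice again: adapt
        else:
            level = 1.0                      # new voice: response recovers
        out.append(round(level, 3))
        prev = s
    return out

same_voice = ["anne"] * 4
print(responses(same_voice, adaptation_rate=0.5))   # strong adaptation (control-like)
print(responses(same_voice, adaptation_rate=0.1))   # weak adaptation (dyslexic-like)
```

With a high adaptation rate the response falls quickly over a run of identical stimuli; with a low rate it barely falls at all, which is the pattern the fMRI study reports for dyslexics across stimulus types.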

Here is part of their conclusion:

“Dyslexia is a specific impairment in developing typical reading abilities. Correspondingly, structural and functional disruptions to the network of brain areas known to support reading are consistently observed in dyslexia. However, these observations confound cause and consequence, especially since reading is a cultural invention that must make use of existing circuitry evolved for other purposes. In this way, differences between brains that exert more subtle influences on non-reading behaviors are likely to be the culprit in a cascade of perceptual and mnemonic challenges that interfere with the development of typical reading abilities. Recent research has begun to elucidate a cluster of behaviorally distinct, but potentially physiologically related, impairments that are evinced by individuals with reading difficulties and observable in their brains. Through this collection of neural signatures—including unstable neural representations, diminished top-down control, susceptibility to noise, and inability to construct robust short-term perceptual representations—we are beginning to see that reading impairments can arise from general dysfunction in the processes supported by rapid neural adaptation.”

Does the theory ring true? It certainly fits with the feeling that the problem is wider than just language. I have to say that I have always found it difficult to mimic other people’s speech and that would fit with a weak adaptation. The theory does not seem impossible to me but it also does not seem to fit closely to how I feel about being dyslexic. I feel a kind of wall between what I hear and written language; I have never felt that I have overcome the wall; but I have felt that I worked around it.

I have to give the paper respect for the convincing data even if it does not seem to be the whole story. The picture may be about some aspect of the dyslexic developmental fault but not actually have much to do with the main symptom, difficulty with phoneme awareness.

The opposite trap

Judith Copithorne image

I vaguely remember as a child that one of the ways to learn new words and get some understanding of their meaning was to learn pairs of words that were opposites. White and black, day and night, left and right, and endless pairs were presented. But in making learning easier for children, this model of how words work makes learning harder for adults.

There are ideas that people insist on seeing as opposites – more of one dictates less of the other. They can be far from opposite but it is difficult for people to abandon this relationship. It seems that a mechanism we have for words is making our understanding of reality more difficult. An example is economy and environment. The notion that what is good for the environment has to be bad for the economy and vice versa is not strictly true because there are actions that are good for both and actions that are bad for both, as well as the actions that favour only one. We do not seem to look for the win-win actions and even distrust people who do try.

Another pair is nurture versus nature, or environment versus genetics. These are simply not opposites; they are not even a little bit so. Almost every feature of our bodies is under the overlapping control of our genetics and our environment. They are interwoven factors. And it is not just our current environment but our environmental history, and also that of our parents and sometimes our grandparents, that is mixed in with our genetics.

In thinking about our thoughts and actions, opposites just keep being used. We are given a picture of our heads as venues for various parts of our minds to engage in wars and wrestling matches. We can start with an old one: mind versus brain, or non-material mental versus material neural dualism. This opposition is almost dead but its ghost walks still. Some people divide themselves at the neck and ask whether the brain controls the body or the body controls the brain – and they appear to actually want a clear-cut answer. There is the opposition we inherited from Freud: a thought process that is conscious and one that is unconscious, presented as two opposed minds (or three in the original theory). This separation is still with us, although it has been made more reasonable in the form of System 1 and System 2 thinking. System 2 uses working memory and is therefore registered in consciousness. It is slow, takes effort, is limited in scope and is sequential. System 1 does not use working memory and therefore does not register in consciousness. It is fast, automatic, can handle many inputs and is not sequential. These are not separate minds but interlocking processes. We use them both all the time and not in opposition. But they are often presented as opposites.

Recently, there has been added a notion that the hemispheres of the brain can act separately and in opposition. This is nonsense – the two hemispheres complement each other and cooperate in their actions. But people seem to love the idea of one dominating the other and so it does not disappear.

It would be easier to think about many things without the tyranny of some aspects of language, like opposites, that we learn as very young children and have to live with for the rest of our lives. The danger is not in naming the two ends of a spectrum, but in naming two states as mutually exclusive: they had better actually be so, or we will have problems. It is fine to label a spectrum from left-handed to right-handed, but if the two were opposites then all the levels of ambidextrous handedness would be a problem. The current problem with the rights of LGBT people would be smaller if the difference between women and men was viewed as a complex of a few spectra rather than a single pair of opposites.

Neuroscience and psychology need to avoid repeatedly falling into opposite-traps. These fields still have too many confusions, errors, things to be discovered, dots to be connected and old baggage to be discarded.

Thanks Judith for the use of your image


Babies show the way

It is January and therefore we see the answers to the Edge Question. This year the question is “What do you consider the most interesting recent (scientific) news? What makes it important?” I have to say that I did not find this year’s crop of short essays as interesting as in previous years – but there were some gems.

For example N.J. Enfield’s ‘Pointing is a Prerequisite for Language’ fits so well with what I think and is expressed so well (here). I have a problem with the idea that language is not primarily about communication but rather is about a way of thinking. I cannot believe that language arose over a short space of time rather than a long evolution (both biological and cultural evolution). And it began as communication not as a proper-ish language. “Infants begin to communicate by pointing at about nine months of age, a year before they can produce even the simplest sentences. Careful experimentation has established that prelinguistic infants can use pointing gestures to ask for things, to help others by pointing things out to them, and to share experiences with others by drawing attention to things that they find interesting and exciting. … With pointing, we do not just look at the same thing, we look at it together. This is a particularly human trick, and it is arguably the thing that ultimately makes social and cultural institutions possible. Being able to point and to comprehend the pointing gestures of others is crucial for the achievement of “shared intentionality,” the ability to build relationships through the sharing of perceptions, beliefs, desires, and goals.”

So this is where to start to understand language – with communication and with gestures, and especially joint attention with another person as in pointing. EB Bolles has a lot of information on this, collected over quite a few years in his blog (here).

Language in the left hemisphere

Here is the posting mentioned in the last post. A recent paper (Harvey M. Sussman; Why the Left Hemisphere Is Dominant for Speech Production: Connecting the Dots; Biolinguistics Vol 9, 2015) deals with the nature of language processing in the left hemisphere and why it is that, in right-handed people with split brains, only the left cortex can talk although both sides can listen. There is a lot of interesting information in this paper (especially for someone like me who is left-handed and dyslexic). He has a number of ‘dots’ and he connects them.

Dot 1 is infant babbling. The first language-like sounds babies make are coos and these have a very vowel-like quality. Soon they babble consonant-vowel combinations in repetitions. By noting the asymmetry of the mouth it can be shown that babbling comes from the left hemisphere, non-babbling noises from both, and smiles from the right hemisphere. A speech sound map is being created by the baby and it is formed at the dorsal pathway’s projection in the frontal left articulatory network.

Dot 2 is the primacy of the syllable. Syllables are the unit of prosodic events. A person’s native language syllable constraints are the origin of the types of errors that happen in second language pronunciation. Also syllables are the units of transfer in language play. Early speech sound networks are organized in syllable units (vowel and associated consonants) in the left hemisphere of right-handers.

Dot 3 is the inability of the right hemisphere to talk in split-brain people. When language tasks are directed at the right hemisphere the stimulus exposure must be longer (greater than 150 msec) than when directed to the left. The right hemisphere can comprehend language but does not evoke a sound image from seen objects and words, although the meaning of the objects and words is understood by that hemisphere. The right hemisphere cannot recognize whether two words rhyme from seeing illustrations of the words. So the left hemisphere (in right-handers) has the only language neural network with sound images. This network serves as the neural source for generating speech; therefore in a split brain only the left side can speak.

Dot 4 deals with the problems of DAS, Developmental Apraxia of Speech. I am going to skip this.

Dot 5 is the understanding of speech errors. The ‘slot-segment’ hypothesis is based on analysis of speech errors. Two thirds of errors are the type where phonemes are substituted, omitted, transposed or added. The picture is of a two-tiered neural ‘map’ with syllable slots serially ordered as one tier, and an independent network of consonant sounds in the other tier. The tiers are connected together. The vowel is the heart of the syllable in the nucleus slot. Forms are built around it with consonants (CV, CVC, CCV etc.). Spoonerisms are restricted to consonants exchanging with consonants and vowels exchanging with vowels; and, exchanges occurring between the same syllable positions – first with first, last with last etc.
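The slot-segment constraints can be illustrated with a small sketch. This is my own toy code, not from the paper, and it handles only simple one-syllable words: consonants exchange only with consonants occupying the same syllable slot – onset with onset – while the vowel nucleus stays in place.

```python
# Toy illustration of the slot-segment constraint on spoonerisms:
# exchange errors swap segments between the same syllable slots,
# so onset consonants trade places and the vowel nucleus stays put.

VOWELS = set("aeiou")

def split_syllable(word):
    """Split a simple CVC-style word into (onset, nucleus, coda)."""
    i = 0
    while i < len(word) and word[i] not in VOWELS:
        i += 1                    # consume onset consonants
    j = i
    while j < len(word) and word[j] in VOWELS:
        j += 1                    # consume the vowel nucleus
    return word[:i], word[i:j], word[j:]

def spoonerize(w1, w2):
    """Exchange the onsets of two words (same slot: first with first)."""
    o1, n1, c1 = split_syllable(w1)
    o2, n2, c2 = split_syllable(w2)
    return o2 + n1 + c1, o1 + n2 + c2

print(spoonerize("bat", "cap"))  # ('cat', 'bap')
```

The constraint the model encodes is exactly the observed pattern: swapping onsets of bat and cap yields the legal error cat/bap, while an error like ‘tba’ (a consonant landing in a vowel slot) never occurs.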

Dot 6 is Hawkins’ model: “the neo-cortex uses stored memories to produce behaviors.” Motor memories are used sequentially and operate in an auto-associative way. Each memory elicits the next in order (think how hard it is to do things backwards). Motor commands would be produced in a serial order, based on syllables - learned articulatory behaviors linked to sound equivalents.

Dot 7 is experiments that showed representations of sounds in human language at the neural level. For example there is a representation of a generic ‘b’ sound, as well as representations of various actual ‘b’s that differ from one another. This is why we can clearly hear a ‘b’ but have difficulty identifying a ‘b’ when the sound pattern is graphed.

Here is the abstract:

Evidence from seemingly disparate areas of speech/language research is reviewed to form a unified theoretical account for why the left hemisphere is specialized for speech production. Research findings from studies investigating hemispheric lateralization of infant babbling, the primacy of the syllable in phonological structure, rhyming performance in split-brain patients, rhyming ability and phonetic categorization in children diagnosed with developmental apraxia of speech, rules governing exchange errors in spoonerisms, organizational principles of neocortical control of learned motor behaviors, and multi-electrode recordings of human neuronal responses to speech sounds are described and common threads highlighted. It is suggested that the emergence, in developmental neurogenesis, of a hard-wired, syllabically-organized, neural substrate representing the phonemic sound elements of one’s language, particularly the vocalic nucleus, is the crucial factor underlying the left hemisphere’s dominance for speech production.

Complexity of conversation

Language is about communication. It can be studied as written sentences, as production of spoken language, or as comprehension of spoken language, but these do not get to the heart of communicating. Language evolved as conversation, each baby learns it in conversation and most of our use of it each day is in conversations. Exchanges, taking turns, is the essence of language. A recent paper by S. Levinson in Trends in Cognitive Sciences, “Turn-taking in Human Communication – Origins and Implications for Language Processing”, looks at the complications of turn-taking.

The world’s languages vary at almost all levels of organization, but there is a striking similarity in exchanges – rapid turns of short phrases or clauses within single sound envelopes. There are few long gaps and little overlapping speech during the changes of speaker. Not only is standard turn-taking universal in human cultures, but it is found in all types of primates and is learned by babies before any language is acquired. It may be the oldest aspect of our language.

But it is paradoxical – the gap between speakers is too short to produce a response to what the last speaker has said. In fact, the gap tends to be close to the minimum reflex time. A conversational speaking turn averages 2 seconds (2000 ms) and the gap between speakers is about 200 ms, but it takes 600 ms to prepare the first word (1500 ms for a short phrase). So it is clear that production and comprehension must go on at the same time in the same areas of the brain, and that comprehension must include a good deal of prediction of how a phrase is going to end. Because comprehension and production have been studied separately, it is not clear how this multitasking, if that is what it is, is accomplished. First, the listener has to figure out what sort of utterance the speaker is making – statement, question, command or whatever. Without this the listener does not know what sort of reply is appropriate. The listener then must predict (guess) the rest of the utterance, decide what the response should be and formulate it. Finally the listener must recognize the signals of when the utterance will end, so that they can begin to talk as soon as it does. There is more to learn about how the brain does this and what effect turn-taking has on the nature of language.
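The arithmetic behind the paradox is worth spelling out. A back-of-envelope sketch (the figures are the ones quoted above; the little function is just my illustration) shows that to answer after the usual 200 ms gap, a listener must start preparing the reply well before the speaker’s turn ends:

```python
# Back-of-envelope turn-taking arithmetic (figures from the article).

TURN_MS = 2000         # average speaking turn
GAP_MS = 200           # average gap between speakers
WORD_PREP_MS = 600     # time to prepare the first word
PHRASE_PREP_MS = 1500  # time to prepare a short phrase

def planning_start(prep_ms, turn_ms=TURN_MS, gap_ms=GAP_MS):
    """How far into the speaker's turn the listener must begin
    preparing a reply in order to respond after the usual gap."""
    return turn_ms + gap_ms - prep_ms

# Replying with a single word means preparation starts 1600 ms into
# the 2000 ms turn; replying with a short phrase means starting at
# 700 ms - in both cases while still listening to the speaker.
print(planning_start(WORD_PREP_MS))    # 1600
print(planning_start(PHRASE_PREP_MS))  # 700
```

In other words, the numbers alone force the conclusion in the text: comprehension, prediction and production must overlap, because the gap is far shorter than the preparation time.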

There are cultural conventions that override turn-taking so that speakers can talk for some time without interruption, and even if they pause from time to time, no one jumps in. Of course, if someone speaks for too long without implicit permission, they will be forcibly interrupted fairly soon, people will drift away or some will start new conversations in sub-groups. That’s communication.

Here is the abstract of Stephen C. Levinson, Turn-taking in Human Communication – Origins and Implications for Language Processing, Trends in Cognitive Sciences, 2015:

Most language usage is interactive, involving rapid turn-taking. The turn-taking system has a number of striking properties: turns are short and responses are remarkably rapid, but turns are of varying length and often of very complex construction such that the underlying cognitive processing is highly compressed. Although neglected in cognitive science, the system has deep implications for language processing and acquisition that are only now becoming clear. Appearing earlier in ontogeny than linguistic competence, it is also found across all the major primate clades. This suggests a possible phylogenetic continuity, which may provide key insights into language evolution.


The bulk of language usage is conversational, involving rapid exchange of turns. New information about the turn-taking system shows that this transition between speakers is generally more than threefold faster than language encoding. To maintain this pace of switching, participants must predict the content and timing of the incoming turn and begin language encoding as soon as possible, even while still processing the incoming turn. This intensive cognitive processing has been largely ignored by the language sciences because psycholinguistics has studied language production and comprehension separately from dialog.

This fast pace holds across languages, and across modalities as in sign language. It is also evident in early infancy in ‘proto-conversation’ before infants control language. Turn-taking or ‘duetting’ has been observed in many other species and is found across all the major clades of the primate order.


Shared attention

Social interaction or communication requires the sharing of attention. If two people are not paying attention to one another then there is no interaction and no communication. Shared attention is essential for a child’s development of social cognition and communication skills. Two types of shared attention have been identified: mutual gaze, when two people face one another and attend to each other’s eyes; and joint attention, when two people look at a third person or object. Joint attention is not the same for both individuals because one initiates it and the other responds.

In a recent paper, researchers studied shared attention (Takahiko Koike et al.; Neural substrates of shared attention as social memory: A hyperscanning functional magnetic resonance imaging study; NeuroImage 125 (2016) 401–412). This cannot be done at an individual level, as it involves social exchange, so the researchers used fMRI hyperscanning. Real-time video recording and projection allowed two individuals in separate scanners to communicate through facial expression and eye movements while they were both being scanned. Previous studies had shown neural synchronization during shared attention, and synchronization of eye blinks. They found that it is the task of establishing joint attention, which requires sharing an attentional temporal window, that creates the blink synchrony. This synchrony is remembered in a pair-specific way in social memory.

Mutual gaze is needed to give mutual attention - and that is needed to initiate joint attention which requires a certain synchrony - and finally that synchronizing results in a specific memory of the pair’s joint attention which allows further synchrony during subsequent mutual gaze without joint attention first.

Here is their abstract: “During a dyadic social interaction, two individuals can share visual attention through gaze, directed to each other (mutual gaze) or to a third person or an object (joint attention). Shared attention is fundamental to dyadic face-to-face interaction, but how attention is shared, retained, and neurally represented in a pair-specific manner has not been well studied. Here, we conducted a two-day hyperscanning functional magnetic resonance imaging study in which pairs of participants performed a real-time mutual gaze task followed by a joint attention task on the first day, and mutual gaze tasks several days later. The joint attention task enhanced eye-blink synchronization, which is believed to be a behavioral index of shared attention. When the same participant pairs underwent mutual gaze without joint attention on the second day, enhanced eye-blink synchronization persisted, and this was positively correlated with inter-individual neural synchronization within the right inferior frontal gyrus. Neural synchronization was also positively correlated with enhanced eye-blink synchronization during the previous joint attention task session. Consistent with the Hebbian association hypothesis, the right inferior frontal gyrus had been activated both by initiating and responding to joint attention. These results indicate that shared attention is represented and retained by pair-specific neural synchronization that cannot be reduced to the individual level.”

The right inferior frontal gyrus (right IFG) region of the brain has been linked in other research with: interfacing between self and other; unconscious incorporation of facial expression in self and others; the release from mutual attention; and neural synchronization during social encounters. The right IFG is active both in initiating and responding to joint attention and in the synchrony during mutual gaze (when it is present). However, it is unlikely to cause blinking directly. “Neural synchronization of the right IFG represents learned shared attention. Considering that shared attention is to be understood as a complementary action due to its social salience, relevance in initiating communication, and joint action, the present finding is consistent with a previous study by Newman-Norlund et al. who showed that the right IFG is more active during complementary as compared to imitative actions.” Communication, communication, communication!

This fits with the theory that words steer joint attention to things present or absent, concrete or abstract in a way that is similar to the eyes steering joint attention on concrete and present things. Language has harnessed the brain’s mechanisms for joint attention if this theory is correct (I think it is).


Two things on language

There are a couple of interesting reports about language.

First, it has been shown that repeating something aloud helps us remember it. But a recent study goes further – we remember even better if we repeat it aloud to someone. The act of communication helps the memory. The paper is: Alexis Lafleur, Victor J. Boucher. The ecology of self-monitoring effects on memory of verbal productions: Does speaking to someone make a difference? Consciousness and Cognition, 2015; 36: 139 DOI:10.1016/j.concog.2015.06.015.

From ScienceDaily (here): Previous studies conducted at Professor Boucher’s Phonetic Sciences Laboratory have shown that when we articulate a sound, we create a sensory and motor reference in our brain, by moving our mouth and feeling our vocal cords vibrate. “The production of one or more sensory aspects allows for more efficient recall of the verbal element. But the added effect of talking to someone shows that in addition to the sensorimotor aspects related to verbal expression, the brain refers to the multisensory information associated with the communication episode,” Boucher explained. “The result is that the information is better retained in memory.”

No one can tell me that language is not about and for communication.

The second item is reported in ScienceDaily (here): Infants cannot perceive the difference between certain sounds when their tongue is restricted with a teether. They have to be able to mimic the sounds in order to distinguish them. The paper is: Alison G. Bruderer, D. Kyle Danielson, Padmapriya Kandhadai, and Janet F. Werker. Sensorimotor influences on speech perception in infancy. PNAS, October 12, 2015. DOI: 10.1073/pnas.1508631112.

From ScienceDaily: …teething toys were placed in the mouths of six-month-old English-learning babies while they listened to speech sounds – two different Hindi “d” sounds that infants at this age can readily distinguish. When the teethers restricted movements of the tip of the tongue, the infants were unable to distinguish between the two “d” sounds. But when their tongues were free to move, the babies were able to make the distinction. Lead author Alison Bruderer, a postdoctoral fellow in the School of Audiology and Speech Sciences at UBC, said the findings call into question previous assumptions about speech and language development. “Until now, research in speech perception development and language acquisition has primarily used the auditory experience as the driving factor,” she said. “Researchers should actually be looking at babies’ oral-motor movements as well.”

They say that parents do not need to worry about using teething toys, but a child should also have time to freely use their tongue for good development.


It is about communication

Some people understand language as a way of thinking and ignore the obvious – language is a way of communicating. A recent study looks at the start of language in very young babies and shows the importance of communication. (Marno, H. et al. Can you see what I am talking about? Human speech triggers referential expectation in four-month-old infants. Sci. Rep. 5, 13594; doi: 10.1038/srep13594 (2015)) The researchers looked at infants’ ability to recognize that a word can refer to an object in the world, but they also show the importance of the infants’ recognition of the act of communication.

The authors review what is known and it is an interesting list. “Human language is a special auditory stimulus for which infants show a unique sensitivity, compared to any other types of auditory stimuli. Various studies found that newborns are not only able to distinguish languages they never heard before based on their rhythmical characteristics, but they can also detect acoustic cues that signal word boundaries, discriminate words based on their patterns of lexical stress and distinguish content words from function words by detecting their different acoustic characteristics. Moreover, they can also recognize words with the same vowels after a 2 min delay. In fact, infants are more sensitive to the statistical and prosodic patterns of language than adults, which provides an explanation of why acquiring a second language is more difficult in adulthood than during infancy. In addition to this unique sensitivity to the characteristics of language, infants also show a particular preference for language, compared to other auditory stimuli. For example, infants at the age of 2-months, and even newborns prefer to listen to speech compared to non-speech stimuli, even if the non-speech stimuli retain many of the spectral and temporal properties of the speech signal. Thus, there is growing evidence that infants are born with a unique interest and sensitivity to process human language. … it might be that infants are receptive towards speech because they also understand that speech can communicate about something. More specifically, they might understand that speech can convey information about the surrounding world and that words can refer to specific entities. Indeed, without this understanding, they would have great difficulty to accept relations between objects and their labels, and thus language acquisition would become impossible.”

The experiments reported in the paper are designed to show whether infants (about 4 months old) understand that words can refer to objects in the world. They do show this, but also show that this depends on the infant recognizing the act of communication. The infant attends to eye-contact and when the face speaks language (not backward language or silent mimed language), the infant then appears to recognize it is being communicated with. Without the eye-contact or without the actual language, the infant does not assume an act of communication. Then the infant can go on to recognize that reference to something is what is being communicated. “… we suggest that during the perception of a direct eye-gaze, infants can recognize the communicative intention, even before they could assess the content of these intentions. Eye-gaze thus is able to establish a communicative context, which can direct the attention of the infant. However, we also suggest that while an infant-directed gaze acts as a communicative cue signaling that the infant was addressed by someone, additional cues are required to elicit the referential expectation of the infant (i.e. to understand that the speaker is talking about something). Following this, we propose that when the infant hears speech (without being able to actually understand the content of speech) and observes a person directly gazing at her/him (like in the Infant-directed gaze condition in our experiment), s/he will understand the communicative intention of the speaker (i.e. that s/he was addressed by the speaker), but s/he will still have to wait for additional referential cues to make an inference that the speaker is actually talking about something. 
This additional cue arrives when the direct eye contact is broken: the very moment when the speaker averts her gaze to a new direction, the infant will infer that some new and relevant information is being presented to her via the speech signals, and, as a consequence, will be ready to seek this information.”

Language is about communication. Children learn language by communicating, for communicating.

Abstract: “Infants’ sensitivity to selectively attend to human speech and to process it in a unique way has been widely reported in the past. However, in order to successfully acquire language, one should also understand that speech is referential, and that words can stand for other entities in the world. While there has been some evidence showing that young infants can make inferences about the communicative intentions of a speaker, whether they would also appreciate the direct relationship between a specific word and its referent, is still unknown. In the present study we tested four-month-old infants to see whether they would expect to find a referent when they hear human speech. Our results showed that compared to other auditory stimuli or to silence, when infants were listening to speech they were more prepared to find some visual referents of the words, as signalled by their faster orienting towards the visual objects. Hence, our study is the first to report evidence that infants at a very young age already understand the referential relationship between auditory words and physical objects, thus showing a precursor in appreciating the symbolic nature of language, even if they do not understand yet the meanings of words.”


First and last syllables

Have you wondered why rhyme and alliteration are so common and pleasing, why they assist memorization? They seem to be taking advantage of the way words are ‘filed’ in the brain.

A ScienceDaily item (here) looks at a paper on how babies hear syllables. (Alissa L. Ferry, Ana Fló, Perrine Brusini, Luigi Cattarossi, Francesco Macagno, Marina Nespor, Jacques Mehler. On the edge of language acquisition: inherent constraints on encoding multisyllabic sequences in the neonate brain. Developmental Science, 2015; DOI: 10.1111/desc.12323).

It is known that our cognitive system recognizes the first and last syllables of words better than the middle syllables. For example, there is the well-known trick of still being able to read print in which the middle letters of words have been rearranged. It has also been noted that the edges of words are often information-rich, especially with grammatical information.

This paper shows that this is a feature of our brains from birth – no need to learn it. “At just two days after birth, babies are already able to process language using processes similar to those of adults. SISSA researchers have demonstrated that they are sensitive to the most important parts of words, the edges, a cognitive mechanism which has been repeatedly observed in older children and adults.” The babies were also sensitive to the very short pause between words as a way to tell when one word ends and another begins.

Here is the abstract: “To understand language, humans must encode information from rapid, sequential streams of syllables – tracking their order and organizing them into words, phrases, and sentences. We used Near-Infrared Spectroscopy (NIRS) to determine whether human neonates are born with the capacity to track the positions of syllables in multisyllabic sequences. After familiarization with a six-syllable sequence, the neonate brain responded to the change (as shown by an increase in oxy-hemoglobin) when the two edge syllables switched positions but not when two middle syllables switched positions (Experiment 1), indicating that they encoded the syllables at the edges of sequences better than those in the middle. Moreover, when a 25ms pause was inserted between the middle syllables as a segmentation cue, neonates’ brains were sensitive to the change (Experiment 2), indicating that subtle cues in speech can signal a boundary, with enhanced encoding of the syllables located at the edges of that boundary. These findings suggest that neonates’ brains can encode information from multisyllabic sequences and that this encoding is constrained. Moreover, subtle segmentation cues in a sequence of syllables provide a mechanism with which to accurately encode positional information from longer sequences. Tracking the order of syllables is necessary to understand language and our results suggest that the foundations for this encoding are present at birth.”