Tag Archives: syntax

My problem with Merge


When linguists talk about language they use the idea of a function called Merge. Chomsky has the theory that without Merge there is no Language. The idea is that two things are merged together and make one composite thing. And it can be done iteratively to make longer and longer strings. Is this the magic key to language?
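For readers who have not met Merge before, the formal idea can be sketched in a few lines of Python. This is only a toy illustration under my own assumptions: the names `merge`, `merge_all` and `depth` are invented for the example, the composite is represented as an unordered pair (a `frozenset`), and nothing here is a claim about how the brain does it.

```python
from functools import reduce

def merge(a, b):
    """Combine two items into one composite item (an unordered pair here)."""
    return frozenset([a, b])

def merge_all(words):
    """Apply merge repeatedly, starting from the last two words and working
    leftwards, so that a whole string ends up as one nested object."""
    return reduce(lambda acc, w: merge(w, acc), reversed(words[:-1]), words[-1])

def depth(item):
    """Count how many layers of merging an item contains."""
    if isinstance(item, frozenset):
        return 1 + max(depth(x) for x in item)
    return 0

phrase = merge("the", "dog")                # one composite made of two words
clause = merge_all(["saw", "the", "dog"])   # "saw" merged with the composite
```

Each call takes two things and returns one thing, and because the result can itself be merged again, the strings can grow without limit. That is all the formal operation says.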

The ancient Greeks had ‘elements’ and everything was a combination of elements. The elements were water, fire, earth and air. That is a pretty good guess: matter in its three states and energy. This system was used to understand the world. It was not until it became clear that matter was atomic and atoms came in certain varieties that our current idea of elements replaced the Greek one. It was not that the Greek elements were illogical or that they could not be used to describe the world. The problem was that there was now a much better way to describe the world. The new way was less intuitive, less simple, less beautiful but it explained more, predicted better and fit well with other new knowledge about the world.

This illustrates my problem with conventional syntax and especially Merge. Syntax is not a disembodied logic system, because we know it is accomplished in the brain by cells and networks of cells. It is a biological thing. So a description of how language is formatted has to fit with our knowledge of how the brain works. It is not our theories of language that dictate how the brain works; it is the way the brain works that dictates how we understand language. Unfortunately, we have only just begun to understand the brain.

Some of the things that we think the brain does fit well with language. The brain uses the idea of causal links: events are understood in terms of cause and effect and even in terms of actor – action – outcome. So it is not surprising that a great many utterances have a form that expresses this sort of relationship: subject – verb or subject – verb – object. We are not surprised that the brain would use the same type of relationship to express an event as it does to create that event from sensory input and store it. Causal events are natural to the brain.

Association, categorization and attribution are natural too. We see a blue flower, but the flower and its color are separate in the brain until they are bound together. Objects are identified and their color is identified and then they are combined. So not only nouns and verbs are natural to the brain’s way of working but so are attributes – adjectives and adverbs for example. Copula forms are another example: they link an entity with another or with an attribute. And so it goes: most things I can think of about language seem natural to the brain (time, place, proper names, interjections etc.).

Even Merge in a funny way is normal to the brain in the character of clumping. The working memory is small and holds 4 to 7 items, we think. But by clumping items together and treating them as one item the memory is able to deal with more items. Clumping is natural to the brain.
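The arithmetic of clumping is easy to show with a small sketch. The capacity of 7 is simply the upper end of the 4 to 7 range above, and the names `clump` and `fits` are made up for this illustration; it is a picture of the bookkeeping, not a model of memory.

```python
CAPACITY = 7  # assumed limit: upper end of the 4-to-7 range mentioned above

def clump(items, size):
    """Group consecutive items into clumps of at most `size` items each."""
    return [tuple(items[i:i + size]) for i in range(0, len(items), size)]

def fits(items, capacity=CAPACITY):
    """True if the number of items is within the assumed capacity."""
    return len(items) <= capacity

digits = list("4165550123")   # ten separate digits: too many to hold
clumped = clump(digits, 3)    # four clumps, each treated as one item
```

Ten loose digits do not fit, but the same digits held as four clumps do, because each clump counts as a single item.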

This picture is like Changizi’s harnessing theory. The things we have created were created by harnessing pre-existing abilities of the brain. The abilities needed no new mutation to be harnessed to a new function. Mutations to make a better fit would come after the abilities were used for the new function; otherwise there would be no selective pressure modifying the ability to fit the new function.

So what is my problem with conventional syntax and especially with Merge? It is not a problem with most of the entities – parts of speech, cases, tenses, word order and the like. It is a problem with the rigidity of thought. Parsing diagrams make me grind my teeth. There is an implication that these trees are the way the brain works. I have yet to encounter any good evidence that those diagrams reflect processes in the brain. The idea that a language is a collection of possible sentences bothers me – why does language have to be confined to sentences? I have read verbatim court records – actual complete and correctly formed sentences appear to be much less common than you would think. It is obvious that utterances are not always (probably not mostly) planned ahead. The mistakes people make often imply that they changed horses in mid-sentence. Most of what we know about our use of language implies that the process is not at all like the diagrams or the approaches of grammarians.

The word ‘Merge’, unlike say ‘modify’, is capitalized. This is apparently because some feel it is the essence of language, the one thing that makes human language unique and the one mutation required for our languages. But if merge is just an ordinary word and pretty much like clumping, which I think it is, then poof goes the magic. My dog can clump and merge things into new wholes – she can organize a group of things into a ritual and recognize that ritual event with a single word or short phrase or indicate it with a small action.

What is unique about humans is not Merge but the extent and sophistication of our communication. We do not need language to think in the way we do; language is built on the way we think. We need language in order to communicate better.


Language, music and echolocation

In the immediately previous posting, the main idea was that linguistic and musical communication shared the same syntactic processing in the brain but not the same semantic meaning processing. How can they share syntax? We need to look at communication and at syntax.


The simplest type of human communication is nonverbal signals: things like posture, facial expression, gestures, tone of voice. They are in effect contagious: if you are sad, I will feel a little sad; if I then cheer up, you may too. The signals are indications of emotional states and we tend to react to another’s emotional state by a sort of mimicry that puts us in sync with them. We can carry on a type of emotional conversation in this way. Music appears to use this emotional communication – it causes emotions in us without any accompanying semantic messages. It appears to cause that contagion with three aspects: the rhythmic rate, the sound envelope and the timbre of the sound. For example, a happy musical message has a fairly fast rhythm, a flat loudness envelope with sharp ends, lots of pitch variation and a simple timbre with few harmonics. Language seems to use the same system for emotion, or at least some emotion. The same rhythm, sound envelope and timbre are used in the delivery of oral language and they carry the same emotional signals. Whether it is music or language, this sound specification cuts right past the semantic and cognitive processes and goes straight to the emotional ones. Language seems to share these emotional signals with music but not the semantic meaning that language contains.


Syntax has a slippery meaning. Its many definitions usually apply to language and it is extended to music as a metaphor. But if we look at the idea in a more basic way we can see how important this is to processing sound. When we get visual information it is two dimensional because the retina is a surface with two dimensions and the maps of the retina on the cortex are also in two dimensions. Perceptual processing adds depth for a third dimension. But sound comes to us with one dimension because the cochlea is essentially a spiral line. It is mapped as a line on the cortex. Perception gives us a direction for the source of the sound and sometimes a feeling of distance. The identification of what is in the visual field (objects, movement etc.) is perceived by a different process than the identification of what is in a sound. As with all the senses, in perception we are trying to model the environment and events in it. Sound is no different: the meaning of sounds is what we can learn from them about what is happening in the world. This is just like vision, which gains meaning from the model of the environment and events in it that it produces. Language and music must be processed by the sound perception system because they come to us as sounds.


One description of syntax is that it deals with trains of sound that are complex, have hierarchical patterns, are abstract, have rigid or probabilistic relationships between entities (or rules). It could be presumed that any domain that involves such trains of sound would be processed, as language and music are, in a syntactical manner. The hierarchy would be established, the abstract patterns and relationships identified. The beauty of the train of sound would be appreciated. The entities resulting from this processing would be available for semantic or other processes. There is no reason to rule out a general syntactical processing system, and there is no reason why the domains of sound that use it need to be similar in the sense that they can be mapped one-to-one. Music need not have an exact equivalent of a sentence.


If we looked for them, we might find other domains that use the same type of analysis – perhaps all trains of sound to a certain extent. How do we know that a sound (with its echoes) is thunder rather than a gunshot or a dynamite explosion? Perhaps the sound is processed into a hierarchy of direct sounds and echoes with particular sorts of patterns. Ah thunder, we think. This idea of echo processing is intriguing – it would seem, like language and music, to have a syntax that is complex, hierarchical, etc. Some animals, and the humans who have learned it, use echolocation. Would this not be a candidate for the syntactical type of pattern identification? We do not postulate a newish and dedicated visual process to explain reading, and likewise we do not need a newish and dedicated sound process to explain syntactical processing of language. We can be using a system that is very old and only mildly tweaked for language, for music, for echoes.


The ingredients of music that appear to form a syntax-like architecture are scales of permissible notes, chords based on those scales, and key structures based on changes of chords used within a piece of music. There are similar hierarchies in rhythm patterns, with different note lengths and emphasis organized into bars and the bars into larger patterns. But when these sorts of regularities are compared to words, phrases, sentences and other hierarchies in language, the match is weak at the detailed level.
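The first two layers of that musical hierarchy, a scale of permissible notes and the chords built on it, can be sketched briefly. This assumes the Western major scale and simple three-note chords (triads) as the example; the names `NOTES`, `major_scale` and `triads` are invented for the illustration, and sharps stand in for all accidentals.

```python
NOTES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]
MAJOR_STEPS = [2, 2, 1, 2, 2, 2]  # semitone steps between successive scale notes

def major_scale(tonic):
    """Return the seven permissible notes of the major scale on `tonic`."""
    idx = NOTES.index(tonic)
    scale = [tonic]
    for step in MAJOR_STEPS:
        idx = (idx + step) % len(NOTES)
        scale.append(NOTES[idx])
    return scale

def triads(scale):
    """Build one chord on each scale note by taking every other note."""
    n = len(scale)
    return [(scale[i], scale[(i + 2) % n], scale[(i + 4) % n]) for i in range(n)]

c_major = major_scale("C")
c_chords = triads(c_major)
```

The chords are built only from notes the scale permits, so one level constrains the next. That nesting is what makes the comparison with syntax tempting, even though nothing in it corresponds to a word or a sentence.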


It would be surprising if language and music shared a functional area of the brain that was not more general in nature given the lack of detailed parallels between the structure of language and music.


So, is there any evidence that echolocation shares any processing with language and music? There is no real evidence that I can find. A recent paper (citation below) by Thaler, Arnott and Goodale appears to rule out the possibility, but on reflection does not. Here is the abstract:


A small number of blind people are adept at echolocating silent objects simply by producing mouth clicks and listening to the returning echoes. Yet the neural architecture underlying this type of aid-free human echolocation has not been investigated. To tackle this question, we recruited echolocation experts, one early- and one late-blind, and measured functional brain activity in each of them while they listened to their own echolocation sounds.


When we compared brain activity for sounds that contained both clicks and the returning echoes with brain activity for control sounds that did not contain the echoes, but were otherwise acoustically matched, we found activity in calcarine cortex in both individuals. Importantly, for the same comparison, we did not observe a difference in activity in auditory cortex. In the early-blind, but not the late-blind participant, we also found that the calcarine activity was greater for echoes reflected from surfaces located in contralateral space. Finally, in both individuals, we found activation in middle temporal and nearby cortical regions when they listened to echoes reflected from moving targets.


These findings suggest that processing of click-echoes recruits brain regions typically devoted to vision rather than audition in both early and late blind echolocation experts.


The actual location is done in the otherwise unused visual cortex (calcarine). This may be a situation like language, where the semantic meaning is extracted in parts of the cortex that are not associated with auditory perception. It seems that echolocation does not require extraordinary auditory perception but it does require a systematic attention to sound. So a fairly normal sense of hearing is able to provide to the visual part of the cortex the input it requires (in a trained blind individual) to do echolocation. That input is likely to include a sophisticated pattern identification of the echoes. “All subjects also show BOLD activity in the lateral sulcus (i.e. Auditory Complex) of the left and right hemispheres and adjacent and inferior to the right medial frontal sulcus. The former likely reflects the auditory nature of the stimuli. The latter most likely reflects the involvement of higher order cognitive and executive control processes during task performance.” This description and the areas in the illustrations could be parts of Broca’s and Wernicke’s areas, the areas that were shown to be active in language and music communication.

Thaler, L., Arnott, S., & Goodale, M. (2011). Neural Correlates of Natural Human Echolocation in Early and Late Blind Echolocation Experts. PLoS ONE, 6(5). DOI: 10.1371/journal.pone.0020162