Category Archives: music

Metaphors and shapes

Judith Copithorne image

Metaphors (including analogies and similes) appear to be basic to thought, and they are very important to language and communication. A large proportion of the dictionary meanings of words are actually old metaphors that have been used so much, and for so long, that the words have lost their figurative roots and become literal in meaning. We simply do not recognize that they were once metaphors. Much of our learning is metaphorical: we understand one complex idea by noticing its similarity to another complex idea that we already understand. For example, electricity is not easy to understand at first, but we have learned a great deal about how water flows by watching it as we grew up, so basic electrical theory is often taught by comparing it to water. By and large, when we examine our knowledge of the world, we find it is rife with metaphors. We can trace many of the ways we think about things and events to ‘grounding’ in the experiences of infants. The way babies establish movement and sensory information is the foundation of enormous trees and pyramids of metaphorical understanding.

But what is a metaphor? We can think of it as a number of entities that are related in some way (in space, in time, in cause and effect, in logic, etc.) to form a structure that we can understand, think of, remember, name, use as a predictive model and treat as a single thing. This structure can be reused without being reinvented. The entities can be relabeled, and so can the relations between them. So if we know that water flowing through a pipe will be limited by a narrower length of pipe, we can envisage an electrical current in a wire being limited by a resistor. Nothing needs to be retained in a metaphor but the abstract structure. This facility for manipulating metaphors is important to thinking, learning and communicating. Is there more? Perhaps.
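The idea of a reusable structure with relabeled entities and relations can be sketched in a few lines of Python. This is only an illustration of the water-to-electricity example, not a model of how the brain does it. The `Structure` class, the `relabel` function, and the extra entities (pump, battery) and relation names are my own assumptions, added to make the example complete.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Structure:
    """A set of named entities plus labelled relations between them."""
    entities: frozenset
    relations: frozenset  # (relation_name, source_entity, target_entity) triples


def relabel(structure, entity_map, relation_map):
    """Return the same abstract structure with entities and relations renamed."""
    return Structure(
        entities=frozenset(entity_map[e] for e in structure.entities),
        relations=frozenset(
            (relation_map[name], entity_map[src], entity_map[dst])
            for name, src, dst in structure.relations
        ),
    )


# Source domain: water in pipes (pump and relation names are illustrative).
water = Structure(
    entities=frozenset({"pump", "pipe", "narrow section", "water flow"}),
    relations=frozenset({
        ("pushes", "pump", "water flow"),
        ("carries", "pipe", "water flow"),
        ("limits", "narrow section", "water flow"),
    }),
)

# The metaphor is nothing more than these two renaming tables.
entity_map = {
    "pump": "battery",
    "pipe": "wire",
    "narrow section": "resistor",
    "water flow": "current",
}
relation_map = {"pushes": "drives", "carries": "conducts", "limits": "limits"}

electricity = relabel(water, entity_map, relation_map)
print(sorted(electricity.relations))
```

Only the labels change between `water` and `electricity`; the number of entities and the pattern of relations between them stay the same, which is the sense in which nothing is retained but the abstract structure.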

A recent paper (Rolf Inge Godøy, Minho Song, Kristian Nymoen, Mari Romarheim Haugen, Alexander Refsum Jensenius; Exploring Sound-Motion Similarity in Musical Experience; Journal of New Music Research, 2016; 1) talks about the use of a type of metaphor across the senses and movement. Here is the abstract:

“People tend to perceive many and also salient similarities between musical sound and body motion in musical experience, as can be seen in countless situations of music performance or listening to music, and as has been documented by a number of studies in the past couple of decades. The so-called motor theory of perception has claimed that these similarity relationships are deeply rooted in human cognitive faculties, and that people perceive and make sense of what they hear by mentally simulating the body motion thought to be involved in the making of sound. In this paper, we survey some basic theories of sound-motion similarity in music, and in particular the motor theory perspective. We also present findings regarding sound-motion similarity in musical performance, in dance, in so-called sound-tracing (the spontaneous body motions people produce in tandem with musical sound), and in sonification, all in view of providing a broad basis for understanding sound-motion similarity in music.”

The part of this paper that I found most interesting was a discussion of abstract ‘shapes’ being shared by various senses and motor actions.

A focus on shapes or objects or gestalts in perception and cognition has particularly concerned so-called morphodynamical theory … morphodynamical theory claims that human perception is a matter of consolidating ephemeral sensory streams (of sound, vision, touch, and so on) into somehow more solid entities in the mind, so that one may recall and virtually re-enact such ephemeral sensations as various kinds of shape images. A focus on shape also facilitates motion similarity judgments and typically encompasses, first of all, motion trajectories (as so-called motion capture data) at various timescales (fast to slow, including quasi-stationary postures) and amplitudes (from large to small, including relative stillness). But shapes can also capture perceptually and affectively highly significant derivatives, such as acceleration and jerk of body motion, in addition.
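The derivatives mentioned in that passage (velocity, acceleration and jerk) can be estimated from a sampled motion trajectory by repeated differencing. Here is a minimal sketch, assuming evenly spaced samples of one coordinate. The 100 Hz sampling rate and the synthetic trajectory are hypothetical, and a real motion capture pipeline would normally smooth the data first, since differencing amplifies noise.

```python
def derivative(samples, dt):
    """Forward finite difference of evenly spaced samples (one element shorter)."""
    return [(b - a) / dt for a, b in zip(samples, samples[1:])]


dt = 0.01  # hypothetical sampling interval: 100 samples per second

# Synthetic trajectory: position = t**2, i.e. a constant acceleration of 2 units/s^2.
position = [(i * dt) ** 2 for i in range(6)]

velocity = derivative(position, dt)          # first derivative of position
acceleration = derivative(velocity, dt)      # second derivative
jerk = derivative(acceleration, dt)          # third derivative

print(velocity)
print(acceleration)
print(jerk)
```

For this smooth synthetic motion the acceleration comes out constant and the jerk essentially zero. An abrupt, percussive gesture would show up as large spikes in the jerk series instead.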

The authors think of sound objects as occurring in the time range of half a second to five seconds. Sonic objects have pitch and timbre envelopes, and rhythmic, melodic and harmonic patterns. In terms of dynamics, sonic objects can be impulsive, with an envelope showing an abrupt onset and then decay; or sustained, with a gradual onset and longer duration; or iterative, with rapidly repeated sound such as a tremolo or drum roll. Sonic objects can have pitch that is stable, variable or just noise. These sonic objects are related to similar motion objects: objects in the same time range that produce music or react to it, for example the motions in playing a piano piece or in dancing. They also have envelopes of velocity and so on. This reminds me of the similar emotions that are triggered by similar envelopes of musical sound and speech, or of the objects that fit with the nonsense words ‘bouba’ and ‘kiki’ by being smooth or sharp. ‘Shape’ is a very good description of the vague but strong and real correspondences between objects from different domains. It is probably the root of being able to use adjectives across domains. For example, we can have soft light, soft velvet, a soft rustle, soft steps, a soft job, and more or less soft anything. Soft describes different things in different domains but, despite the differences, it is a metaphoric connection between domains, so that concrete objects can be made by combining a number of individual sensory/motor objects which share abstract characteristics like soft.
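To make the three dynamic types concrete, here is a rough sketch of what their loudness envelopes might look like as functions of time in seconds. The exponential decay, the linear attack, and every numeric parameter are my own illustrative assumptions; the paper describes the shapes in words and does not give formulas.

```python
import math


def impulsive(t, decay=6.0):
    """Abrupt onset at t = 0 followed by an exponential decay."""
    return math.exp(-decay * t)


def sustained(t, attack=0.3, duration=2.0):
    """Gradual (linear) onset, then a held level until the sound ends."""
    if t < 0 or t > duration:
        return 0.0
    return min(1.0, t / attack)


def iterative(t, rate=12.0, decay=40.0):
    """Rapidly repeated strokes, like a drum roll: each stroke is a short impulse."""
    time_since_stroke = ((t * rate) % 1.0) / rate
    return math.exp(-decay * time_since_stroke)


# Sample each envelope every 50 ms over the first half second.
for name, envelope in [("impulsive", impulsive),
                       ("sustained", sustained),
                       ("iterative", iterative)]:
    print(name, [round(envelope(i * 0.05), 2) for i in range(11)])
```

The same three curves could just as well describe the speed of a hand striking a key, a bow being drawn, or a wrist shaking, which is the point of calling them shapes rather than sounds.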

In several studies of cross-modal features in music, a common element seems to be the association of shape similarity with sound and motion, and we believe shape cognition can be considered a basic amodal element of human cognition, as has been suggested by the aforementioned morphodynamical theory …. But for the implementation of shape cognition, we believe that body motion is necessary, and hence we locate the basis for amodal shape cognition in so-called motor theory. Motor theory is that which can encompass most (or most relevant) modalities by rendering whatever is perceived (features of sound, textures, motion, postures, scenes and so on) as actively traced shape images.

The word ‘shape’, used to describe corresponding characteristics from different domains, is very like the word ‘structure’ in metaphors and may point to the foundation of our cognition mechanisms, including much more than just the commonplace metaphor.


Music’s effects on the brain

A recent paper (see citation below) identified genes that changed their expression as a result of music performance in trained musicians. There was a surprising number of affected genes: 51 had increased and 22 had decreased expression compared to controls, who were also trained musicians but were not involved in making or listening to music for the same time period. It is also impressive that this set of 73 genes has a very broad range of presumed functions and effects in the brain.

Another interesting aspect is the overlap of a number of these genes with some that have been identified in songbirds. This implies that musical, or at least sophisticated, sound perception and production has been conserved from a common ancestor of birds and mammals.

It has been known for some time that musical training has a positive effect on intelligence and outlook – that it assists learning. Musical training changes the structure of the brain. Now scientists are starting to trace the biology of music’s effects. Isn’t it about time that education stopped treating music (and other arts for that matter) as unimportant frills? It should not be the first thing to go when money or teaching time is short.

Here is the Abstract:

“Music performance by professional musicians involves a wide-spectrum of cognitive and multi-sensory motor skills, whose biological basis is unknown. Several neuroscientific studies have demonstrated that the brains of professional musicians and non-musicians differ structurally and functionally and that musical training enhances cognition. However, the molecules and molecular mechanisms involved in music performance remain largely unexplored. Here, we investigated the effect of music performance on the genome-wide peripheral blood transcriptome of professional musicians by analyzing the transcriptional responses after a 2-hr concert performance and after a ‘music-free’ control session. The up-regulated genes were found to affect dopaminergic neurotransmission, motor behavior, neuronal plasticity, and neurocognitive functions including learning and memory. Particularly, candidate genes such as SNCA, FOS and DUSP1 that are involved in song perception and production in songbirds, were identified, suggesting an evolutionary conservation in biological processes related to sound perception/production. Additionally, modulation of genes related to calcium ion homeostasis, iron ion homeostasis, glutathione metabolism, and several neuropsychiatric and neurodegenerative diseases implied that music performance may affect the biological pathways that are otherwise essential for the proper maintenance of neuronal function and survival. For the first time, this study provides evidence for the candidate genes and molecular mechanisms underlying music performance.”

Kanduri, C., Kuusi, T., Ahvenainen, M., Philips, A., Lähdesmäki, H., & Järvelä, I. (2015). The effect of music performance on the transcriptome of professional musicians. Scientific Reports, 5. DOI: 10.1038/srep09506

Chimps appreciate rhythm


Science Daily has an item on musical appreciation in chimpanzees. Previous studies using blues, classical and pop music found that although chimps can distinguish features of music and have preferences, they still preferred silence to the music. So were the chimps able to ‘hear’ the music but not appreciate its beauty? A new paper has different results using non-Western music: West African akan, North Indian raga, and Japanese taiko. Here the chimps liked the African and Indian music but not the Japanese. They seemed to base their appreciation on the rhythm. The Japanese music had very regular, prominent beats, like Western music, while the African and Indian music had varied beats. “The African and Indian music in the experiment had extreme ratios of strong to weak beats, whereas the Japanese music had regular strong beats, which is also typical of Western music.”

It may be that they like a more sophisticated rhythm. Or, as de Waal says, “Chimpanzees may perceive the strong, predictable rhythmic patterns as threatening, as chimpanzee dominance displays commonly incorporate repeated rhythmic sounds such as stomping, clapping and banging objects.”

Here is the abstract for M. Mingle, T. Eppley, M. Campbell, K. Hall, V. Horner, F. de Waal; Chimpanzees Prefer African and Indian Music Over Silence; Journal of Experimental Psychology: Animal Learning and Cognition, 2014:

“All primates have an ability to distinguish between temporal and melodic features of music, but unlike humans, in previous studies, nonhuman primates have not demonstrated a preference for music. However, previous research has not tested the wide range of acoustic parameters present in many different types of world music. The purpose of the present study is to determine the spontaneous preference of common chimpanzees (Pan troglodytes) for 3 acoustically contrasting types of world music: West African akan, North Indian raga, and Japanese taiko. Sixteen chimpanzees housed in 2 groups were exposed to 40 min of music from a speaker placed 1.5 m outside the fence of their outdoor enclosure; the proximity of each subject to the acoustic stimulus was recorded every 2 min. When compared with controls, subjects spent significantly more time in areas where the acoustic stimulus was loudest in African and Indian music conditions. This preference for African and Indian music could indicate homologies in acoustic preferences between nonhuman and human primates.”


Why do we get pleasure from sad music?


Sadness is a negative emotion, and we recognize sadness in some music, yet we often enjoy listening to sad music. We can be positive about a negative emotion. A recent paper by Kawakami and colleagues (citation below) differentiates between some hypotheses to explain this contradiction.

The hypothesis that the response has to do with musical training (i.e. that the pleasure comes from appreciation of, and familiarity with, the art involved) was shown to be false: the experiments found no difference in response between musicians and non-musicians. “Participants’ emotional responses were not associated with musical training. Music that was perceived as tragic evoked fewer sad and more romantic notions in both musicians and non-musicians. Therefore, our hypothesis—when participants listened to sad (i.e., minor-key) music, those with more musical experience (relative to those with less experience) would feel (subjectively experience) more pleasant emotions than they would perceive (objectively hear in the music)—was not supported.”

The key innovation in this experimental setup was that the subjects were not just asked how sad they found the music but were given an extensive quiz. For each of 2 pieces of music, played in both minor and major keys, the subjects rated the experience in terms of 62 words and phrases, rating both their perception of the music’s emotional message and the personal emotion they actually felt. Four factors were extracted from the 62 emotional descriptions: tragic emotion, heightened emotion, romantic emotion, blithe emotion.

As would be expected the tragic emotion was rated higher for the minor key and lower for the major key music for both perceived and felt emotion. Likewise, there is no surprise that the blithe emotion was the opposite, high for the major and low for the minor for both felt and perceived emotion. The heightened emotion was only slightly higher for the sad minor music over the happy major. Romantic emotion was moderately higher for the happy music over the sad. However, there were differences between felt and perceived emotion. These were significant for the minor music: it was felt to be less tragic, more romantic and more blithe than it was perceived. This difference between felt and perceived is not too difficult to imagine. Suppose you are arguing with someone and you make them very angry. You can perceive their anger while your own feelings may be of smug satisfaction. Although emotion can be very contagious, it is not a given that felt emotion will be identical to perceived emotion.

The hypothesis of catharsis would imply a deeply felt sadness to lift depression. But this is not what was seen. The next hypothesis the authors discuss is ‘sweet anticipation’. A listener has certain expectations of what will be heard next and a positive emotion is felt when the prediction is fulfilled. This could contribute to the effect (but not because of musical training).

A third hypothesis is that we have an art-experience-mode in which we have positive emotions from exposure to art. If we believe we are in the presence of ‘art’ that in itself is positive. “When we listen to music, being in a listening situation is obvious to us; therefore, how emotion is evoked would be influenced by our cognitive appraisal of listening to music. For example, a cognitive appraisal of listening to sad music as engagement with art would promote positive emotion, regardless of whether that music evoked feelings of unpleasant sadness, thereby provoking the experience of ambivalent emotions in response to sad music.” Again this could contribute.

Their new and favourite hypothesis is ‘vicarious emotion’. “In sum, we consider emotion experienced in response to music to be qualitatively different from emotion experienced in daily life; some earlier studies also proposed that music may evoke music-specific emotions. The difference between the emotions evoked in daily life and music-induced emotions is the degree of directness attached to emotion-evoking stimuli. Emotion experienced in daily life is direct in nature because the stimuli that evoke the emotion could be threatening. However, music is a safe stimulus with no relationship to actual threat; therefore, emotion experienced through music is not direct in nature. The latter emotion is experienced via an essentially safe activity such as listening to music. We call this type of emotion vicarious emotion. … That is, even if the music evokes a negative emotion, listeners are not faced with any real threat; therefore, the sadness that listeners feel has a pleasant, rather than an unpleasant, quality to it. This suggests that sadness is multifaceted, whereas it has previously been regarded as a solely unpleasant emotion.”

I find the notion of vicarious emotion could also explain why we can be entertained and enjoy frightening plays, books and movies. All sorts of negative emotions are sought as vicarious experiences and enjoyed. Many things we do for leisure and our enjoyment of much of art have a good deal of vicarious emotional content for us to safely enjoy and even learn from.

Kawakami, A., Furukawa, K., & Okanoya, K. (2014). Music evokes vicarious emotions in listeners. Frontiers in Psychology, 5. DOI: 10.3389/fpsyg.2014.00431

Language, music and echolocation

In the immediately previous posting, the main idea was that linguistic and musical communication shared the same syntactic processing in the brain but not the same semantic meaning processing. How can they share syntax? We need to look at communication and at syntax.


The simplest type of human communication is nonverbal signals: things like posture, facial expression, gestures and tone of voice. They are in effect contagious: if you are sad, I will feel a little sad; if I then cheer up, you may too. The signals are indications of emotional states, and we tend to react to another’s emotional state by a sort of mimicry that puts us in sync with them. We can carry on a type of emotional conversation in this way. Music appears to use this emotional communication: it causes emotions in us without any accompanying semantic messages. It appears to cause that contagion through three aspects: the rhythmic rate, the sound envelope and the timbre of the sound. For example, a happy musical message has a fairly fast rhythm, a flat loudness envelope with sharp ends, lots of pitch variation and a simple timbre with few harmonics. Language seems to use the same system for emotion, or at least some emotion. The same rhythm, sound envelope and timbre are used in the delivery of oral language, and they carry the same emotional signals. Whether it is music or language, this sound specification cuts right past the semantic and cognitive processes and goes straight to the emotional ones. Language seems to share these emotional signals with music but not the semantic meaning that language contains.


Syntax has a slippery meaning. Its many definitions usually apply to language, and it is extended to music as a metaphor. But if we look at the idea in a more basic way, we can see how important it is to processing sound. When we get visual information it is two dimensional, because the retina is a surface with two dimensions and the maps of the retina on the cortex are also in two dimensions. Perceptual processing adds depth for a third dimension. But sound comes to us with one dimension, because the cochlea is essentially a spiral line, and it is mapped as a line on the cortex. Perception gives us a direction for the source of the sound and sometimes a feeling of distance. The identification of what is in the visual field (objects, movement etc.) is done by a different process than the identification of what is in a sound. As with all the senses, in perception we are trying to model the environment and events in it. Sound is no different: the meaning of sounds is what we can learn from them about what is happening in the world. This is just like vision, which gains meaning from the model of the environment and events that it produces. Language and music must be processed by the sound perception system because they come to us as sounds.


One description of syntax is that it deals with trains of sound that are complex, have hierarchical patterns, are abstract, have rigid or probabilistic relationships between entities (or rules). It could be presumed that any domain that involves such trains of sound would be processed, as language and music are, in a syntactical manner. The hierarchy would be established, the abstract patterns and relationships identified. The beauty of the train of sound would be appreciated. The entities resulting from this processing would be available for semantic or other processes. There is no reason to rule out a general syntactical processing system, and there is no reason why the domains of sound that use it need to be similar in the sense that they can be mapped one-to-one. Music need not have an exact equivalent of a sentence.


If we looked for them, we might find other domains that use the same type of analysis, perhaps all trains of sound to a certain extent. How do we know that a sound (with its echoes) is thunder rather than a gunshot or a dynamite explosion? Perhaps the sound is processed into a hierarchy of direct sounds and echoes with particular sorts of patterns. Ah, thunder, we think. This idea of echo processing is intriguing: it would seem, like language and music, to have a syntax that is complex, hierarchical, etc. Some animals, and the humans who have learned it, use echolocation. Would this not be a candidate for the syntactical type of pattern identification? We do not postulate a newish and dedicated visual process to explain reading, and likewise we do not need a newish and dedicated sound process to explain syntactical processing of language. We can be using a system that is very old and only mildly tweaked for language, for music, for echoes.


The ingredients of music that appear to be a syntax-like architecture are: there are scales of permissible notes, chords based on those scales and key structures based on changes of chords used within a piece of music. There are similar hierarchies in rhythm patterns with different note lengths and emphasis, organized into bars and the bars into larger patterns. But when these sorts of regularities are compared to words, phrases, sentences and other hierarchies in language, the match is weak at the detailed level.


It would be surprising if language and music shared a functional area of the brain that was not more general in nature given the lack of detailed parallels between the structure of language and music.


So, is there any evidence that echolocation shares any processing with language and music? There is no real evidence that I can find. A recent paper (citation below) by Thaler, Arnott and Goodale appears to rule out the possibility, but on reflection does not. Here is the abstract:


A small number of blind people are adept at echolocating silent objects simply by producing mouth clicks and listening to the returning echoes. Yet the neural architecture underlying this type of aid-free human echolocation has not been investigated. To tackle this question, we recruited echolocation experts, one early- and one late-blind, and measured functional brain activity in each of them while they listened to their own echolocation sounds.


When we compared brain activity for sounds that contained both clicks and the returning echoes with brain activity for control sounds that did not contain the echoes, but were otherwise acoustically matched, we found activity in calcarine cortex in both individuals. Importantly, for the same comparison, we did not observe a difference in activity in auditory cortex. In the early-blind, but not the late-blind participant, we also found that the calcarine activity was greater for echoes reflected from surfaces located in contralateral space. Finally, in both individuals, we found activation in middle temporal and nearby cortical regions when they listened to echoes reflected from moving targets.


These findings suggest that processing of click-echoes recruits brain regions typically devoted to vision rather than audition in both early and late blind echolocation experts.


The actual location is done in the otherwise unused visual cortex (calcarine). This may be a situation like language, where the semantic meaning is extracted in parts of the cortex that are not associated with auditory perception. It seems that echolocation does not require extraordinary auditory perception, but it does require systematic attention to sound. So a fairly normal sense of hearing is able to provide the visual part of the cortex with the input it requires (in a trained blind individual) to do echolocation. That input is likely to include a sophisticated pattern identification of the echoes. “All subjects also show BOLD activity in the lateral sulcus (i.e. Auditory Complex) of the left and right hemispheres and adjacent and inferior to the right medial frontal sulcus. The former likely reflects the auditory nature of the stimuli. The latter most likely reflects the involvement of higher order cognitive and executive control processes during task performance.” This description and the areas in the illustrations could be parts of Broca’s and Wernicke’s areas, the areas that were shown to be active in language and music communication.

Thaler, L., Arnott, S., & Goodale, M. (2011). Neural Correlates of Natural Human Echolocation in Early and Late Blind Echolocation Experts. PLoS ONE, 6 (5). DOI: 10.1371/journal.pone.0020162

The importance of communication

A recent paper (see citation below) has helped to clarify the relationship between linguistic and musical communication. The researchers used a standard type of communication between jazz players, called “trading fours”. The musicians alternate playing four bar phrases, each relating to the previous one, so that the players in effect answer one another. This back and forth is a musical conversation.

The authors used a number of controls that were not musical conversations as contrasts to the “trading fours”: scales, a practiced melody, improvisation without relating to another player. The resulting music was analyzed for “note density, pitch class distribution, pitch class transitions, duration distribution, duration transitions, interval distribution, interval transitions, melodic complexity, and self-organizing maps of key”. This was used to give a numeric value to the melodic complexity and to identify the nature of the conversation in the “trading fours” sessions. The improvisation in the “trading fours” music was more melodically complex and was related in a conversational way.
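A few of the descriptors in that list are simple enough to sketch. The following is a minimal illustration using the usual textbook meanings of note density, pitch class distribution and pitch class transitions. It is not the authors’ analysis code, and the phrase, the onset times and the function names are hypothetical. Pitches are given as MIDI note numbers, where 60 is middle C and pitch class is the note number modulo 12.

```python
from collections import Counter


def note_density(onsets, duration_seconds):
    """Number of notes played per second over the phrase."""
    return len(onsets) / duration_seconds


def pitch_class_distribution(midi_pitches):
    """Share of notes falling on each of the 12 pitch classes (C = 0 ... B = 11)."""
    counts = Counter(pitch % 12 for pitch in midi_pitches)
    total = len(midi_pitches)
    return [counts.get(pc, 0) / total for pc in range(12)]


def pitch_class_transitions(midi_pitches):
    """How often each pitch class is followed directly by each other pitch class."""
    classes = [pitch % 12 for pitch in midi_pitches]
    return Counter(zip(classes, classes[1:]))


# Hypothetical short phrase: C D E C G C, one note every half second.
phrase = [60, 62, 64, 60, 67, 72]
onsets = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5]

distribution = pitch_class_distribution(phrase)
transitions = pitch_class_transitions(phrase)
density = note_density(onsets, duration_seconds=3.0)

print(density)
print(distribution)
print(transitions)
```

Comparing such numbers between one player’s four bars and the other player’s reply is one plausible way to put a figure on how closely the phrases relate, though the measures the authors actually used for melodic complexity are more involved than this.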

One of the players was scanned with fMRI during the sessions. The improvised conversation involved intense activation of two of the language centers (Broca’s and Wernicke’s areas) and also their right hemisphere counterparts. The left side areas “are known to be critical for language production and comprehension as well as processing of musical syntax.” The right side match to Broca’s area is “associated with the detection of task relevant cues such as those involved in the identification of salient harmonic and rhythmic elements.” These two areas appear to perform syntactic processing for both music and speech. Wernicke’s area is involved in harmonic processing and its right homologue is “implicated in auditory short-term memory, consistent with the maintenance of the preceding musical phrases.” These results are similar to those of a study of linguistic conversation and are consistent with the ‘shared syntactic integration resource hypothesis’. In other words, they are consistent with music and language “sharing a common neural network for syntactic operations”.

However, music and language are not semantically similar. In the ‘trading fours’ situation there is a marked deactivation of the angular gyrus, which is related to “semantic processing of auditory and visual linguistic stimuli and the production of written language and written music.” It appears that during communication, language and music resemble one another in form (syntax) but not in meaning (semantics).

This points in a particular direction. There may be no language specific system in the brain but rather a communication specific system. Interesting.

Here is the abstract:

Interactive generative musical performance provides a suitable model for communication because, like natural linguistic discourse, it involves an exchange of ideas that is unpredictable, collaborative, and emergent. Here we show that interactive improvisation between two musicians is characterized by activation of perisylvian language areas linked to processing of syntactic elements in music, including inferior frontal gyrus and posterior superior temporal gyrus, and deactivation of angular gyrus and supramarginal gyrus, brain structures directly implicated in semantic processing of language. These findings support the hypothesis that musical discourse engages language areas of the brain specialized for processing of syntax but in a manner that is not contingent upon semantic processing. Therefore, we argue that neural regions for syntactic processing are not domain-specific for language but instead may be domain-general for communication.

Donnay, G., Rankin, S., Lopez-Gonzalez, M., Jiradejvong, P., & Limb, C. (2014). Neural Substrates of Interactive Musical Improvisation: An fMRI Study of ‘Trading Fours’ in Jazz. PLoS ONE, 9 (2). DOI: 10.1371/journal.pone.0088665