Mixed Multimodal Metaphors In Advertising In English


The goal of this paper is to investigate the phenomenon of multimodal verbal-visual metaphor and to explore the possibility of labelling it 'mixed' (by analogy with mixed verbal monomodal metaphors). Among the main features of multimodal metaphors are characteristics that contribute to the development of so-called cognitive dissonance (a psychological state of discomfort). For the purposes of the current research, case studies related to the most universal issues of human life (contemporary social and commercial advertisements) were chosen. The analysis was carried out with a set of cognitive tools, namely conceptual metaphor analysis, conceptual integration, and image schema techniques, to describe the interaction between metaphorical and metonymic processes that help to enrich the target domain with concepts from several source domains belonging to different modes. As a result, we identified the features that point to the mixed character of multimodal metaphors. Among them are characteristics that enhance cognitive dissonance: the presence of two or more disconnected multimodal source domains; unclear, vague or ambiguous clues for metonymic mappings; elaboration of the blended meaning in the generic space; and, most importantly, so-called contextual density, the ability to pack maximum information into a limited space. Some cognitively revealing and potentially beneficial directions for research into multimodality in metaphors are pointed out.

Keywords: Conceptual metaphor, multimodal metaphor, mixed metaphor, metonymy, metaphtonymy


Metaphor research, as well as general studies of language and cognition, benefits from adopting a multimodal metaphor perspective, as it not only provides a theoretically and descriptively rich research area but also suggests new practical applications of the general theory of metaphor. The main focus of this paper is metaphors in varied modalities. Namely, it focusses on the process of mixing information that comes in through different channels within so-called mixed metaphors.

Lakoff and Johnson's (1980) dictum that the process of metaphorising is mainly a mental activity and only derivatively a verbal one has promoted extensive research into the relation between metaphors' verbal manifestations and their cognitive origins. Later studies have confirmed the assumption that metaphors appearing in verbal guise are only observable manifestations of conceptual metaphors and are not identical to them. Mental structures are thus metaphorically reflected in language, gestures, images, music, and other modalities.

Problem Statement

Among multimodal metaphors one type deserves special attention. It is not only multimodal, combining at least two modes of input (verbal and visual), but also highly challenging to decode. In linguistics, the verbal monomodal version of this type has been labelled 'mixed'. Prior research into this phenomenon points to several basic reasons for mixing metaphors, the main one being the desire to sound more eloquent or humorous. Until recently, most research into metaphors focused on their verbal manifestations. Although the ability of conceptual metaphor to exist outside the linguistic system was recognised, the combination of verbal and non-verbal realisations of conceptual metaphor has only recently become the subject of systematic and comprehensive research. Our suggestion is that multimodal verbal-visual metaphors can be mixed in character, too, but it is as yet unclear what cognitive mechanisms provide for their overall meaning and ensure correct decoding of the message.

Research Questions

Among verbal-visual examples of metaphor, the most prototypical type is the one that appears in contemporary social and commercial advertisements, drawn here from the visual corpora Google Image and VisMet.org. For the purposes of the current research, case studies concerned with issues of human life (driving, smoking, shopping) were chosen. The research questions centre on the meaning-making process in multimodal mixed metaphors: what features in the visual-verbal mode signal the presence of mixed metaphor, how the two modes interact, and what cognitive mechanisms underlie the process of blending the meaning in two modes.

Purpose of the Study

The aim of this paper is to explore whether metaphors with multimodal structure can be considered 'mixed' like their monomodal verbal counterparts, what criteria are crucial for that, and what main functions are performed by those multimodal mixed metaphors.

Research Methods

To describe the process of meaning-making in mixed multimodal metaphors, the usual set of cognitive-linguistic tools has been employed. Conceptual metaphor analysis (Lakoff & Johnson, 1980) coupled with metaphtonymic analysis has been used as the primary method to describe the process of their meaning construction. The conceptual integration theory of Fauconnier (1985) has been used to account for the blending of the source domains, and analysis by image schemas (Kövecses, 2010) to reveal the underlying scaffolding of the mixing process.


Types of metaphors

In current linguistics, mixed metaphor is defined as clusters of metaphors which appear in close contextual adjacency but have different cognitive bases (Kimmel, 2010). Forceville (2016) offers a more emphatic definition of this phenomenon: "two metaphors squeezed into a single grammatical expression" (p. 224), although later he adds that the "conflated scenarios usually appear within phrasal units such as sentences or clauses, but it is not necessarily true, as a mixed metaphor can straddle two sentences". Thus, two parameters seem to stand out. The first is that in mixed metaphor there is a high degree of contextual "clusterisation" of metaphors, their clash and/or their mixture in the limited context of a paragraph, a sentence, or even a word (consider the cases of contamination, for example). The second is that their source domains represent different cognitive realia, which is the case in non-mixed metaphor as well, because the very essence of using metaphor is mixing pieces of reality in one blend. It follows, then, that mixed metaphors should be more cognitively demanding for an addressee (Golubkova & Taymour, 2020; Taymour, 2020). In this paper we will argue that some multimodal metaphors can be mixed too, as they seem to share a feature typical of all mixed metaphors, which can be described as conceptual density, or the ability to pack the maximum amount of information into the minimal amount of space.

Mixed multimodal metaphors

Monomodal metaphors draw on only one modality; Forceville (2016) distinguishes nine such modes: written signs, pictorial signs, spoken signs, sounds, gestures, music, smells, touch, and taste. Multimodal metaphors, meanwhile, predominantly or exclusively combine several modalities, e.g., verbal-visual or verbal-auditory. A multimodal mixed metaphor can be defined as a metaphor where two (or more) source domains represented by different modalities, e.g., verbal / pictorial / auditory / gestural, form a single target domain within a restricted contextual space, e.g., one page or poster of an ad, typically causing cognitive dissonance of varied intensity in a recipient.

Increased interest in multimodality in metaphors can be accounted for by the nature of communication in modern society, which engages various modalities in addition to the verbal one, as well as by the development of tools (multimodal corpora, eye-tracking techniques). One of the questions of great importance in metaphor analysis is "what knowledge and background assumptions must be recruited by its envisaged audience for this audience to be able to interpret the metaphor, and to interpret it in the manner that its sender wants it to be interpreted" (Forceville, 2017, p. 27). While some bodily experiences that constitute the basis for many conceptual metaphors are shared almost universally, there are other dimensions that can influence and change the perception of a metaphor, whether monomodal or multimodal. One such dimension is an individual's cultural background, which could presumably be shaped by one's social position, age, gender, level of education, immediate cultural context, and some other factors. In this study we will argue that some multimodal metaphors can be considered 'mixed' when several disconnected source domains are used to form a single target domain in a somewhat restricted visual contextual space (that of an advert or a meme, for instance).

Verbal-visual multimodal metaphors in social advertising

Nowadays many companies use multimodality in their advertisements, as it is widely recognised that pictures trigger a deeper and faster emotional response in an addressee than words do. Figure 1 features a social advert: its pictorial part comprises two objects, a human brain (i.e. source domain HUMAN) and a gearbox (i.e. source domain VEHICLE).

Figure 1: Social advert Driving (1) (https://images.app.goo.gl/ukxQTNHYDTURxUAHA)

Identification of these domains requires some background knowledge on the part of a recipient. Non-drivers are less likely to recognise the schematic depiction of a transmission (which does not really resemble a regular car gearbox) that looks like a human brain at the same time. The third clue, the caption 'drive responsibly' under the image, helps the recipient to recognise the target domain. The message might roughly be rendered as 'a motorist must use their inherent ability to think in a sensible way while driving a car, as it is a dangerous activity'. The highly creative ad also capitalises on the benefits of double metonymy: the gearbox represents a car and the process of driving, while it also schematically represents the human brain and common sense. Thus, metaphor enhanced by metonymy creates some dissonance, making the addressee ponder over the message of the advertisement, and also contributes to its memorisation.

The same strategy of blending source domains to create a particular cognitive tension and hold the recipient's attention is used in the ad in Figure 2.

Figure 2: Social advert Driving (2) (http://justsomething.co/20-thought-provoking-advertisements-that-will-make-you-look-twice/)

In Figure 2 a key and a keychain are laid out on a surface in the shape of a pistol. A car owner is typically able to understand that it is not just a key but an ignition key. In modern cars, however, a key is used more and more rarely, which can create extra perception difficulties for the younger generation of motorists. As in the previous example, metonymy helps to recognise the metaphorical source scenario: a key – a car – driving a car. A linguistic cue is crucial too, as it compares driving a car to possessing or using weaponry, and the number of casualties in road accidents is comparable to the number of people killed by gunshot. Figuring out the message calls for an additional inference, which results in an extended logical chain: a key – a car – driving a car – pistol – danger – required caution. The metonymic and metaphoric (metaphtonymic) mapping is shown schematically in Figure 3, where thin arrows stand for metonymic mapping and thick arrows stand for metaphoric mapping.

Figure 3: Metaphtonymic map for the social advert Driving (2)

The remaining issue, though, is whether the combination of metaphor and metonymy actually guarantees a successful interpretation of the message. Considering these metaphorical-metonymic relations carefully, we may arrive at the conclusion that the dynamic interplay between the textual and visual modalities in this particular case contributes equally to the construction of the intended message, since the mappings are clear and unambiguous. What seems quite obvious is that metaphtonymy is one of the most frequent conceptual operations in advertising, as metonymy is able to connect objects, products and brands, and metaphor helps to transfer properties from the relevant source domain to the advertised or promoted product (whether commercial or social) (Pérez-Sobrino, 2016). If the metaphoric and metonymic mappings are broken at some point, e.g., a recipient does not realise that the key in the picture is an ignition key or does not pay attention to the verbal component because the caption is typed in a rather small font, the message encoded in the social ad may not be delivered to the addressee. One of the wrong interpretations could be "selling machine guns is dangerous", which shows that all the elements of the multimodal metaphor are crucial for correct understanding.

Verbal-visual multimodal metaphors in shockvertising

Sometimes social advertising uses visual images in the way of shockvertising, a type of advertising that "deliberately, rather than inadvertently, startles and offends its audience by violating norms for social values and personal ideals" (Dahl et al., 2003, p. 268). In some cases, these ads cannot be deemed successful, and sometimes even a linguistic component might not help a recipient to overcome excessive cognitive dissonance, leading to an unsatisfactory interpretation of the advertiser's message. In Figure 4, the pictorial part of the ad is combined with the caption

Figure 4: Social advert by World Wildlife Fund for Nature (https://images.app.goo.gl/UVRYfyhDcuWbjjQk9)

The campaign is trying to warn people against deforestation, but the advertisement itself requires a very high level of cognitive effort to decipher its meaning, particularly the pictorial part. In order to highlight the most important mapping connections, it seems necessary to extract the main features of human behaviour that are normally typical of plants. The need for oxygen and nutrition could be named here, but in this social advertisement none of those features seems to play a part. Here a beheaded torso emerging from the ground represents the whole of humanity, which hurts itself by destroying the forests of the planet. This metonymical mapping would not seem important were it not for the metaphor 'a beheaded person – a cut off tree'. Thus, firstly, the metonymical mapping suggests that the injuries of a human body are the injuries of a tree. Then the expansion equates one person with all humanity, and one tree with all trees on the planet. Thus, to establish the right sequence of metaphorical and metonymic mappings, a recipient may build the following mappings:

  • there is a human body half buried in or growing out of the ground (in a forest);
  • the body is beheaded;
  • the body is not real, but wooden;
  • the human body supposedly metaphorically represents a tree (the position of the 'neck', which is very close to the ground, also suggests that);
  • one body metonymically represents humanity;
  • one cut tree metonymically represents other forests and woods.

These deep and rather controversial metaphorical and metonymic mappings may (and typically do) evoke a high level of cognitive dissonance in a recipient, which does not allow us to consider this visual-verbal metaphor unambiguous and successful (Figure 5).

Figure 5: Metaphtonymic map for the social advert by World Wildlife Fund for Nature

The interaction between metonymic and metaphoric mappings presumably allows a recipient to transfer the features of the human's position (hopelessness, defencelessness, death) to the situation of forests (helpless victims of human actions). This multimodal metaphor can also be considered 'mixed', as it satisfies the requirements mentioned above. We tend to agree that ‘a multimodal approach to metaphor in interaction with metonymy helps to achieve finer-grained analyses that contribute to discard faulty interpretations’ (Pérez-Sobrino, 2016, p. 25). However, it can be argued that only those combinations of metonymic and metaphorical processes in which typical, frequent, and recognisable mappings are available allow for a more productive deduction of the suggested message.


The present paper focusses on so-called mixed multimodal metaphors, a challenging type of metaphor containing two inputs, verbal and visual. Multimodal metaphors are frequently employed in verbal-visual messages, such as social and commercial ads. We claim that their complexity is grounded in the fact that they resemble verbal mixed metaphors, whose meaning is also an amalgam of mappings, metaphors, metonymies, and inferences. Our case study reveals that mixed multimodal metaphors owe their complexity in terms of meaning construction to the nature of their source domains, which are numerous and often multimodal, i.e. belong to two semiotic systems. Apart from the possible ambiguity of the verbal component, the visual component has no definite 'grammar' in the iconic sense of the word and is liable to a variety of (mis)interpretations. Their typical features enhance so-called cognitive dissonance through the presence of two or more disconnected source domains; unclear, vague or ambiguous clues for metonymic mappings; elaboration of the blended meaning in the generic space; and, most importantly, contextual density. Further research is needed to address this issue in a broader and more systematic way and to find out what types of mappings allow a straightforward, effortless interpretation of a message by recipients from different cultural backgrounds. As the scope of this paper is limited to several examples, larger corpora should be used for a deeper analysis of multimodal metaphors in different genres.


  • Dahl, D. W., Frankenberger, K. D., & Manchanda, R. V. (2003). Does It Pay to Shock? Reactions to Shocking and Nonshocking Advertising Content among University Students. Journal of Advertising Research, 43(3), 268-280.

  • Fauconnier, G. (1985). Mental spaces. Cambridge, Mass.: MIT Press.

  • Forceville, C. (2017). Visual and multimodal metaphor in advertising: cultural perspectives. Styles of communication, 9(2), 26-41.

  • Forceville, C. (2016). Mixing in pictorial and multimodal metaphors? Mixing metaphors. John Benjamins (pp. 223-239).

  • Golubkova, E. E., & Taymour, M. P. (2020). Cognitive specificities of mixed multimodal metaphors: the recipe for cooking. Cognitive studies of language, 41, 386-391.

  • Kimmel, M. (2010). Why we mix metaphors (and mix them well). Journal of pragmatics, 42(1), 97-115.

  • Kövecses, Z. (2010). Metaphor. A practical introduction. 2nd ed. Oxford University Press.

  • Lakoff, G., & Johnson, M. (1980). Metaphors we live by. University of Chicago Press.

  • Pérez-Sobrino, P. (2016). Shockvertising: Conceptual interaction patterns as constraints on advertising creativity. Círculo de lingüística aplicada a la comunicación, 62, 257-290. https://revistas.ucm.es

  • Taymour, M. P. (2020). Multi-level analysis of mixed metaphor as a linguo-cognitive phenomenon (based on the material of the English language). Voprosy Kognitivnoy Lingvistiki, 3, 71-76. https://doi.org/10.20916/1812-3228-2020-3-71-76

Copyright information

Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.

About this article

Publication Date

02 December 2021

Edition Number

1st Edition




Linguistics, cognitive linguistics, education technology, linguistic conceptology, translation

Cite this article as:

Golubkova, E., & Taymour, M. (2021). Mixed Multimodal Metaphors In Advertising In English. In O. Kolmakova, O. Boginskaya, & S. Grichin (Eds.), Language and Technology in the Interdisciplinary Paradigm, vol 118. European Proceedings of Social and Behavioural Sciences (pp. 618-625). European Publisher. https://doi.org/10.15405/epsbs.2021.12.76