Multimodal Metaphor In American Political Cartoons


Multimodality is the combination and interdependence of various channels through which information influences a person's consciousness, namely the verbal and the visual. In the present work, metaphor is considered from the point of view of the cognitive approach, according to which it acts as an instrument for cognizing a new fragment of the surrounding reality. When the source domain and the target domain are represented by different modes, the metaphor is called multimodal. The political cartoon, being a genre variety of political discourse, has a pronounced socially critical orientation and is mediated by the mass media. The comic effect is highly significant in the political cartoon, and the key role in understanding it is played by background knowledge, or presuppositions: extralinguistic, political, logical and linguistic; the logical presupposition is mandatory. The material for the study was the verbal-visual representations of the supranational, global concept of "terrorism" in American political cartoons. The study confirms the key role of the visual component in understanding the multimodal metaphor. The most frequent types of correlation between the verbal and visual codes are parallel and complementary correlation.

Keywords: multimodal metaphors, global concept, terrorism, political cartoons, verbal code, visual code


The fundamental division of language and communication, thinking and behavior is artificial. Even so, the main branches of linguistics still give priority to the verbal channel, while other channels are most often treated as auxiliary. Other areas of modern science that also study communication rely on different assessments. In practical psychology, for instance, the following figures are often cited: the visual channel transmits 55% of information, prosody 38%, and the verbal channel accounts for only 7%.

Different information channels

A number of studies have attempted to estimate empirically the relative contribution of the three information channels. In particular, A.A. Kibrik, in his study "Multimodal Linguistics" (2010), presents the results of an experiment that assesses the degree to which discourse is understood on the basis of data coming through separate information channels (verbal, visual, prosodic) and through combinations of several (usually two) channels. On the basis of the empirical data obtained, the author comes to the conclusion that if it is "conditional to assume that the three information channels are independent (which is, of course, a strong coarsening) and normalize the contribution of three isolated channels ... then the contributions of the channels can be estimated as: verbal - 39%, prosodic - 28%, visual - 33%" (Kibrik, 2010).

These data indicate that both the traditional point of view in linguistics and the view common in psychology are equally flawed. They suggest instead that the various information channels are of comparable importance.

Monomodal and multimodal phenomena

It is crucial for this research to distinguish between monomodal and multimodal phenomena. Obviously, "written texts are exceptional in being more or less completely monomodal (discounting elements such as font type, lay-out, and cover design, which some scholars would consider “modes” in their own right). Communication in other media is less often so purely monomodal: static pictures are combined with language; spoken language is accompanied by gestures; animation shorts have music and sound effects. All of these combinations can spawn metaphors. It makes sense to postulate a continuum between monomodal and multimodal metaphors" (Forceville, 2016).

The term "multimodal", as used in this study, is based on "the understanding of modality adopted in psychology, neurophysiology and informatics: modality is a type of external stimulus perceived by one of the senses of man, primarily sight and hearing" (Kibrik, 2010). Thus mode, in contrast to the conventional use of "modality" in linguistic terminology, does not express the speaker's attitude to what is said, but denotes one of the information channels that influence a person's consciousness.

It is important to note that “in non-verbal and multimodal metaphors, the signals that cue metaphorical similarity between two phenomena are different, and bound to differ depending on the mode(s) in which the metaphorical terms are represented” (Qui, 2013). Within the scope of the current study, we investigate the relations between the verbal and visual codes as constituent parts of a multimodal metaphor.

Problem Statement

The visual code has significant potential to influence the recipient of a message, and it is precisely the visual component that is an integral part of political cartoons.

Cartoons as a type of polycode texts

Cartoons, "having a clear socio-critical orientation, attribute the surrounding reality signs that have an estimated meaning" (Biserova & Mishlanova, 2016). The peculiarity of the political cartoon lies in the fact that it is one of the genre varieties of political discourse and is mediated by the mass media, so that its verbal and graphic components form one visual, structural and semantic whole. The caption text in a cartoon is minimal, which strengthens the effect of the visual component. When a cartoon is considered as a variant of a polycode text, it is possible to speak with a rather high degree of certainty about the dominance of the iconic component. It has also been noted that “the inclusion of text is productive too, although only in combination with visuals. It might be worth considering if this is due to the increasing reliance on images in our civilization” (Pérez-Sobrino, 2016).

The comic effect in political cartoons

The comic effect of a cartoon depends directly on the recipient's awareness of the peculiarities of the culture within which the cartoon is created, as well as of events in the world political arena. Background knowledge, or presupposition, is a key factor in understanding the comic effect. There are four types of presupposition: extralinguistic (general scientific knowledge, as well as knowledge in such areas as culture, literature, etc.); political (knowledge of current political processes in the world, political personalities, parties); logical (an idea of the natural relations between events, and the ability to establish a logical connection between the explicit meaning of the work and the implicit meaning in the consciousness of the participants in communication); linguistic (knowledge of linguistic reality and language features).

Cartoons, being an illogical combination of verbal and iconic components, always require a logical presupposition; other presuppositions may be necessary depending on the context of the cartoon, the fragment of reality described and other factors.

Research Questions

The following research questions are addressed in this study:

  • What distinguishes multimodal metaphors from monomodal ones?

  • What are the peculiarities of multimodal metaphors explicated in political cartoons?

  • What are the most frequent types of the relations between verbal and visual components of multimodal metaphor in political cartoons?

Purpose of the Study

The current research aims to identify regularities in the interaction of verbal and visual codes in the multimodal metaphors presented in modern political cartoons.

Research Methods

The methods applied in this research include the general scientific methods of observation and evaluation and a review of the relevant literature. Within the scope of this work, metaphor is viewed through the prism of the cognitive approach, according to which it acts as a tool for apprehending a new fragment of reality.


It is crucial to identify and analyse the domains which form metaphors. When the source domain and the target domain are represented in different modes, the metaphor is deemed to be multimodal.

Metaphor is used to form in the mind, more accurately and fully, an "abstract" concept, whether existing or non-existent. Researchers count metaphor among the fundamental feelings that help us to understand the surrounding reality, and it has been called a means of shaping reality. From the point of view of cognitive linguistics, metaphor is a powerful instrument of cognition, whereby new knowledge is comprehended by comparison with what is already known. The mechanisms of analogy are introduced through the principle of fiction. The metaphor "begins" with this principle, lives by it, and dies if the principle ceases to be recognized in the internal form of the name.

Data collection and analysis

The research material consists of verbal-visual representations of the supranational global concept of terrorism in modern American political cartoons. Metaphorical models are embedded in the conceptual system of the human mind; they are a kind of scheme by which a person thinks. Thus, observation of how metaphors function is recognized as an important source of data on the functioning of the human mind. The metaphor in the political cartoon requires background knowledge: only a combination of the media text with extralinguistic knowledge makes it possible to interpret the metaphor correctly.


Turning to the correlation of verbal and visual codes in political cartoons, one should single out the following variants: the complete coincidence of the contents of the picture and the text ("parallel correlation"); partial overlapping of the verbal text with iconic information ("complementary correlation"); content incoherence of text and images ("interpretive correlation").

It is worth noting that in most cases, when a cartoon is analysed, the visual code reveals additional meanings in the transmitted message.

Complementary correlation

Let us turn to a series of examples of American political cartoons depicting events related to the manifestations of terrorism. In one cartoon, published on July 9, 2017, a pot labelled "Saudi Arabia" and a kettle labelled "Qatar" are shown side by side. Steam rises from the pot and from the spout of the kettle in the form of bombs with lit fuses, and from the pot, which represents Saudi Arabia, rises a cloud with the following text: "You export terrorism".

To understand the ideological and social orientation of this cartoon, one must have background knowledge of current political events. In June 2017, Saudi Arabia, together with several other states, announced the severance of diplomatic relations with Qatar, accusing Qatar of supporting several terrorist groups. The cartoon ridicules such statements, pointing out that Saudi Arabia itself is an even more serious source of terrorism than Qatar (in the image, far more bomb-shaped steam rises from the pot than from the kettle). This example demonstrates complementary correlation, in which iconic information partially overlaps with the verbal component and provides it with additional meanings and details.

Thus, within this multimodal metaphor, the country that supplies terrorism (target domain) is depicted as a cooking utensil, and the manifestations of terrorism and terrorist acts, which can also be regarded as a target domain, are expressed through the image of steam generated during cooking or boiling. The source domain is the process of cooking.

Parallel correlation

Another representative example is a cartoon published on May 16, 2014. The image is presented as a test question with answer options. For the question "What do Islamic terrorists fear most?" two options are offered: A - Smart bombs and B - Smart girls. Each of the answers is accompanied by a picture. Answer A is illustrated with guided bombs in flight; answer B is reinforced by the image of two girls in Muslim dress and headscarves who carry books in their hands and walk in the direction of a sign reading "School".

The cartoon mocks the prevailing priorities in combating terrorism, pointing out that instead of the forceful methods and military operations most often used in the fight against terrorism, it would be more effective to provide education to those population groups that might otherwise be turned into tools of terrorist activity. The cartoon thus emphasizes the idea that an intelligent, educated person is more difficult to manipulate and more difficult to recruit into the ranks of terrorist organizations.

In this example we observe parallel correlation: the accompanying text supports the visual component, and the content of the picture and the text is the same. The main means of fighting terrorists is metaphorically represented as education, in the image of girls going to school. The cartoon calls for a change of strategy and urges that efforts be directed to a completely different social sphere; the images used emphasize the insufficient effectiveness of the traditional methods of military attack.


The research conducted makes it possible to draw some conclusions about how multimodal metaphors function in modern American political cartoons.

Presuppositions, or background knowledge

Political cartoons, as a genre of political discourse, have a distinct socially critical slant and are mediated by the mass media. The comic effect plays an important role in political cartoons, and understanding it requires background knowledge, or presuppositions. The types of presupposition are extralinguistic, political, logical and linguistic, with the logical presupposition being obligatory.

The correlation of verbal and visual codes

The political cartoon, being a variant of a polycode text, consists of the signs of two semiotic systems, namely the verbal and the iconic. The visual component of the multimodal metaphor used in the political cartoon to represent the image of terrorism strengthens the meaning that is implied rather than directly expressed. Most often, the correlation of verbal and visual codes is realized through almost complete coincidence of the content of the text and the image, or through partial overlap of the text with iconic information.


Copyright information

This article is distributed under the terms of the Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

Future Academy