The Problem Of Constructing Linguistic Patterns For Detecting ‘Precedent’ Texts (In Headlines)

Abstract

This article continues the author’s research into the automated indexation of ‘precedent’ texts in newspaper headlines. The previous work described the first three stages of the research: delineating the corpus for analysis and surveying the general recognition of the precedent texts. Because the original text (the precedent pretext) is transformed by the headline’s author, it can be hard to recognize, as the results of stage three showed. The fourth stage, described here, therefore aims at outlining the borders of the morphological/syntactical alternations of the original text that separate recognition from non-recognition (or very rare recognition) of the precedent text disguised in the headline. The research was conducted through a survey. The questionnaire contained headlines with precedent texts of different origin: slogans, proverbs, book titles, etc. Respondents were asked to recognize and comment on the precedent text in the transformed samples, which included both existing newspaper headlines and headlines created by the author according to typical models of precedent text transformation obtained from the analysis of the previously collected corpus. The results show the most productive (in terms of recognition) ways of transforming precedent texts into headlines. These can serve as a basis for constructing linguistic patterns that will later support automatic search and indexation of precedent linguistic phenomena in large collections of media texts. The search and indexation will be based on the general principles and regularities of understanding and recognizing precedent texts.

Keywords: Text, precedent text, headline, morphological and syntactical alternations, model, intertext

Introduction

The present research continues the series of articles (Klochko, 2019) dedicated to the use of so-called ‘precedent’ texts in headlines. ‘Precedent’ texts are significant phenomena of a national culture; they are known to most representatives of that culture (Klimovich, 2014). ‘Precedent’ texts also include prominent proper names, quotes from fiction and popular culture, memes, proverbs, etc. The term ‘precedent text’ was introduced by Karaulov in 1987 (p. 216) and has spread widely in Russian linguistic studies. Though the term is not used in English-language discourse, the phenomenon of ‘precedent’ texts is outlined quite clearly there, because studies of intertextual relations mention both written texts (e.g. fiction) and spoken texts (e.g. sayings) as sources of intertextual borrowings (Stipanović, 2017, p. 14), together with myths of different origin (Golubtsov & Luchinskaya, 2018). This article is devoted to the analysis of, and a further experiment on, the formal transformations of these ‘precedent’ texts in headlines.

Problem Statement

Today research into intertextual incorporations in different types of texts can get a second life as natural language processing becomes more and more prominent (Bolshakova et al., 2017). The idea of computer-based headline generation is topical and interesting (Stepanov, 2019). It is estimated that 30% of headlines contain allusions to myths, literature, the Bible, popular culture, etc. (Nikitina, Lebedinskaya, & Plakhova, 2018). Torre-Cantalapiedra (2018) states that journalistic discourse is formed with extracts from other discourses but gives no criteria for identifying them in texts, which might be of the utmost importance when “teaching” a machine to perceive ‘precedent’ texts and create its own headlines. Our only object of interest in this article is the formal language transformation of the pretext into the headline.

Research Questions

This article is devoted to the questions and problems of stage IV of the overall research on the indexation of ‘precedent’ texts in mass media. The first three stages were described in Klochko (2019). At stage I the corpus of ‘precedent’ texts from Russian quality daily newspapers was formed. At stage II the corpus was divided into groups according to the type of ‘precedent’ text. At stage III a survey was undertaken to verify the general recognition and perception of these texts. Stage IV of the research thus tackles the morphological and syntactical features of precedent texts “to understand what makes this or that precedent phrase recognizable and to which extent it can be altered to remain so” (Klochko, 2019, p. 324).

From the studies previously conducted by the author, the following important data are known:

1) The content of ‘precedent’ texts in headlines. The biggest group is fixed expressions. Titles (fiction, movies, TV series) are represented in headlines and hence recognized. Only few quotations from fictional texts are widely spread.

2) The proportion of ‘precedent’ texts in headlines. Half of the precedent texts (154 of 308) are derived from fixed expressions of different kinds: the Bible, proverbs, Latinisms, idioms, etc. Prose and poetry examples are 5 and 7 times fewer, respectively; songs (Russian rock music, popular Soviet songs) are a bit more frequent than poetry. The remaining groups are quite small and rarely recognized, especially by students: 1990s Russian realia, Soviet anecdotes. Internet memes are quite rare, for they quickly become outdated. Compare the popularity of memes from “Game of Thrones” in 2018 and today with that of the ‘Baby Yoda’ and “The Witcher” song memes in December 2019.

3) The fact that original ‘precedent’ texts are transformed somehow by the author of the headline to make it look and sound catchy. For example, Xie (2018) states that the media prefers the headlines with terse form (p. 1012) and the meaning of a headline is identified as if layers are peeled off (p. 1012). Golovko (2019) states that an intertextual sign in journalistic texts is “any perceivable object that acts as a representative of some other sign system” (p. 70). He also says that “upper and lower limits of intertextual layers are endless” (p. 29).

Corresponding statements are aligned to Kristeva’s (2015) view on the notion of the intertext. Kristeva provides an idea that paves the way to a precise formal research of the intertext. She differentiates between the notions of phenotext and genotext, the former is a written text, while the latter is more a glimpse of a greater signifying structure which is revealed through the phenotext (p. 192). In plainer words, it is the text with all its attributes and features that is primary for the analysis and consequent accentuation of its possible meanings.

It is noted, though, that a phrase or word clearly borrowed from another text or discourse can often lose its properties along with most of its initial meaning. Klimovich (2014) says that there are intertextual elements from the Bible in fiction, which can be identified through etymological and lexicographical criteria (p. 256). Some types of these intertextual elements might lose their connection with the Bible in an individual’s mind, thus becoming clichéd (p. 257). The same phenomenon, affecting not only biblical and Latin sayings but also fiction, was reported in the author’s previous article, where up to 80% of quotations from Griboedov’s “Woe from Wit” or “The Twelve Chairs” by I. Ilf and E. Petrov were taken for folk proverbs and sayings. A more confusing yet vivid instance of this phenomenon is described in the dictionary of popular witticisms Krylatye slova (Witty Words) by Ashukin and Ashukina (1955): a line by the Russian poet I. P. Myatlev (1796–1844), “…how lovely and fresh the roses were…”, quoted by I. S. Turgenev in his prose, came to be attributed to Turgenev rather than Myatlev (Ashukin & Ashukina, 1955, p. 250). Thus, it is quite clear that the reader may be unaware of the origin of a ‘precedent’ text yet perceive it, recognize it, and get a certain satisfaction from the process, “to find some reading material that can temporarily relax their ( the readers ) body and mind when they are free” (Xie, 2018, p. 1013).

The ability of ‘precedent’ texts to be transformed is also worth mentioning. Taking the variability of ‘precedent’ texts as unquestionable, Solomonova (2018) writes that in this case it’s of vital importance to avoid “logical and cognitive ‘arbitrariness’: two intertextual elements should be still linked through synonymy, antonymy, etc. In case such link is not obvious at one level of poetics, it should exist at some other; otherwise it’s no more than a reader’s imagination”. She also remarks on the humorous effect that may appear when wordplay in a well-known quotation substitutes something new for something expected. Xie (2018) also notes that “Intertextual headlines usually transform the form or structure of intextual prototype text moderately …, which make the collocation of language or the context unconventional. Therefore, the mental-set of the readers is broken, which achieve the purpose of humor” (p. 1013). Solomonova (2018) states that one should ground such analysis on the search for definite formal coincidences between texts (p. 114).

The questions and related issues mentioned above have answers. The questions which stay unanswered are related mainly to the third point – the transformations of ‘precedent’ texts:

1) It is still not clear how these ‘precedent’ texts are altered to become catchy headlines. In plainer words, it is unclear what formal operations one should perform to encode the pretext (a saying, a novel or movie title, etc.) into a terse and witty phrase that still keeps recognizable traces of the ‘precedent’ text underlying it.

2) It is difficult to understand what makes the respondent recognize the ‘precedent’ text encoded in the headline: whether it is the syntactic structure or the lexical content that makes the reader guess the ‘precedent’ text.

Purpose of the Study

Thus, the purpose of the study is to find out the principles and regularities of the creation (formation) of “precedent text-based” linguistic patterns through the study of the borders of their recognition via the modulation of morphological/syntactical alternations of the original ‘precedent’ text in the headline.

The principles and regularities found are to be used at stage V in the next article, which will include: a) an experiment in the automated search for ‘precedent’ texts in large collections of media texts, and b) computer-based construction of linguistic patterns for the needs of natural language generation. This has the potential to bring clarity to the general problem of intertextuality.

Research Methods

The whole of stage IV included a number of sub-stages and each of them required appropriate methods.

  • Sub-stage 1 was the selection of material from the corpus of headlines taken from Russian quality newspapers, which comprised 308 headlines from 5 newspapers. This time only one newspaper (“Novaya Gazeta”) was selected in order to focus on particular aspects of the formation of ‘precedent’ texts in headlines. “Novaya Gazeta” was selected for the frequency of headlines with ‘precedent’ texts and the variety of pretexts, which include novels, poetry, politicians’ quotations, memes and more.

  • Sub-stage 2 included the analysis of the formal alternations (performed by the journalist) of ‘precedent’ texts that formed the basis of the headline. The ‘precedent’ text was taken and compared to the text of the headline, formal markers of the transformation being written out and classified. After that the examples were put in a number of groups according to their belonging to a definite type of alternation.

  • At sub-stage 3, after the formation of the groups, all the examples were subjected to additional alternations as follows. Each ‘precedent’ text from the headline was at first altered according to the type of alternation it underwent in the headline but in a slightly different way. Then this text was sequentially altered in each group (if possible). This was done to see if the ‘precedent’ text is flexible enough to be changed in a number of ways and stable enough to stay recognizable. The latter is just an assumption and that’s why the survey was conducted.

  • At the final sub-stage 4 the questionnaire for the survey respondents was developed. It aimed at finding the borders of the morphological/syntactical alternations of the original text that separate recognition from non-recognition (or very rare recognition) of the precedent text. For this purpose 12 examples of headlines with a number of alternations were chosen and put in a table. The examples were chosen on the basis of the type of alternation made and the popularity (likelihood of general recognition) of the ‘precedent’ text.

In column 1 the original headlines with a ‘precedent’ text and the alternations of this text are given. The alternation (or alternations) come first, then the headline; they go from hard to easy. If the respondent recognizes the ‘precedent’ text, he/she should write it in column 2. In column 3 the respondents should also write what made them guess and whether recognition was easy or not, for each alternation and for the headline as well. If the respondent fails to recognize the text at all, then later, after being told the right answer, he/she should write what prevented him/her from recognizing the text (column 4).

Findings

Each sub-stage of the study, which represents stage IV of the whole ‘precedent’ texts research, had its own outcomes and results.

  • At sub-stage 1, 61 more headlines dated August 2019 – December 2019 were added to the existing list of 109 examples of headlines from “Novaya Gazeta” dated October 2018 – December 2018, giving a total of 170 headlines. Only headlines that contained a clear allusion to some ‘precedent’ text were chosen. As a result, a selection of headlines with different ‘precedent’ texts encoded in them as pretexts was gathered.

  • At sub-stage 2 the formal analysis of the alternations of the underlying ‘precedent’ text was conducted. As a result, the following types of alternations were identified:

1) Change of one element. This is a transformation of a single word. It should be distinguished from substitution of one element, for the word is not replaced with another (a synonym, antonym, etc.) but modified. It is therefore inaccurate to say that the word is substituted, though the two types of alternation are very close. The main difference lies in the scale of the alternation: when the element is changed (not substituted), only a phonetic/graphic modification within one word occurs, which makes the phrase sound or look different. For instance, social budget (a bureaucratic expression) becomes asocial budget (Headline). Since all examples are in Russian, it is rather difficult to give an appropriate translation for all of them, but in a number of cases the pair is close to peasant – pheasant in English. These are of course different words, but they look and sound alike; since the ‘precedent’ text is well known to the reader, the changed word will remind him or her of the original phrase. Xie (2018) gives the example of the headline “Ob the Builder” (p. 1012), where Ob (for Obama) is the changed “Bob” from the children’s TV series “Bob the Builder”. This is a pure case of element change, as Ob derives from Bob by losing one letter.

There are 21 examples in this group. 13 of them are of adding (removing) a morpheme or changing a letter, so the headline starts to sound a bit differently from the ‘precedent’ text, thus causing that humor effect.

2) Substitution of the element. This is the biggest group, with 72 examples. Unlike the previous group, here the word is replaced by another, not changed, so the two do not look or sound alike. For example, “Winter is coming” from “Game of Thrones” becomes “War is coming” (Musafirova, 2018) in the headline. The same can be said about the headline “Soap drama” (Bronstein, 2019), which resembles “Soap opera”.

Only 5 of the 72 examples had two elements replaced, and only one of these was easy to recognize: “For Whom the Bell Tolls” became “For Whom the Tambourine Rings”, a headline about a Siberian shaman (Tarasov, 2019). The other examples are quite obscure. For instance, “Humiliated and Insulted”, a novel by F. Dostoevsky, is hardly discernible in the headline “Tired and squalid” (Drobina, 2019); only some resemblance in lexical meaning and peculiarities of Russian spelling can help the reader guess. Similarly, the famous Latin saying “Divide and rule” becomes “Devalue and underpay” (Khachaturov, 2018a). When two elements are substituted, it is likely the syntactic structure that gives the reader a hint.

3) Rephrasing. 24 examples formed this group. Here the ‘precedent’ text is rephrased in such a way that either the whole structure is ruined, and only prominent words stay (15 examples), or the original text is changed grammatically (9 examples).

Thus, the first example of the headline (actually, in the lead) is “There were more than 28 Panfilov’s Men” (Mlechin, 2018b), which recalls the story of the 28 Red Army soldiers who defended Moscow in WWII under the command of General Ivan Panfilov. Here only the figure (28) and the key phrase Panfilov’s Men are needed to recollect the whole story.

The second example, illustrating the grammatical change, is the paraphrase of famous lines from N. Nekrasov’s poem “The Railway”, changing …will bear everything, even the railroad… into Has borne this railroad (Dyakova, 2018).

4) Preservation of the sample (phrase). 33 examples were found. 22 examples present a phrase without any change. For example, “ Requiem for a Dream ” (Racheva & Artemieva, 2018). The remaining 11 examples include the sample of the ‘precedent’ text in a longer phrase or sentence: A billion rubles and a child’s tear (Mursalieva, 2019), where a child’s tear is clearly related to Dostoyevsky’s “The Brothers Karamazov”.

Other groups do not possess any alternation potential and the examples there were formed without a visible pattern.

5) New terms and memes. This group includes 14 examples, which is surprisingly few, considering the popularity of memes in social networks. The explanation lies in the nature of a ‘precedent’ text: it must first become established in speech practice before it is more or less easily recognized and can be transformed. So in this group mainly bare words were used, since there is no pattern yet. They have little chance of becoming ‘precedent’ and will probably soon disappear from the newspapers: friend zone (Martynov, 2018), Columbine (Zhilin, 2018). A meme is a very special thing: first, it is mainly visual (it requires pictures or videos); second, it quickly becomes outdated. Memes are rarely used in headlines (only 4 examples), e.g. Healthy Intelligence (Mlechin, 2018a), from the meme Healthy lungs vs Smoker’s lungs.

6) Adding the element. There were 4 examples of adding a grammatical element to a phrase or a word, i.e.: particles, prepositions, prefixes that changed the original ‘precedent’ text (usually a proverb) into quite a new one in the headline.

7) Prominent name or a special word. 2 examples: using prominent (‘precedent’) personal names Beavis and Butthead effect (Tarasov, 2018) and a neologism соврамши (Mozgovoy, 2019) by M. Bulgakov that makes any phrase or sentence with it become referred to his “The Master and Margarita”.
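The seven group sizes listed above can be tallied to confirm that they account for the whole selection of 170 headlines and to see each group's share. The following is only a bookkeeping sketch in Python: the counts come from the text, while the shortened group labels and the helper name `share` are illustrative assumptions.

```python
# Group sizes as reported in the list above (labels shortened for illustration).
group_sizes = {
    "change of one element": 21,
    "substitution of the element": 72,
    "rephrasing": 24,
    "preservation of the sample": 33,
    "new terms and memes": 14,
    "adding the element": 4,
    "prominent name or special word": 2,
}

total = sum(group_sizes.values())


def share(count, whole):
    """Return count as a percentage of whole, rounded to one decimal place."""
    return round(100 * count / whole, 1)


# Percentage of the selection that falls into each group.
shares = {name: share(n, total) for name, n in group_sizes.items()}

# Print the groups from largest to smallest, followed by the total.
for name, n in sorted(group_sizes.items(), key=lambda kv: -kv[1]):
    print(f"{name:32s} {n:3d}  {shares[name]:5.1f}%")
print(f"{'total':32s} {total:3d}")
```

Listing the shares side by side makes it easy to see how strongly substitution dominates the selection compared with the other types of alternation.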

After the examples were clustered and the groups formed, each of the ‘precedent’ texts retrieved from the headlines was altered in the way described in the Methods.

For Group 1 (Change of one element) no more alternations could be suggested in the same way due to the formal linguistic restrictions. Substitution of the element was applied successfully in the major part of the group (16 examples), except very short sayings. Substitution of two elements was applied in 4 cases, making the text look strange and unrecognizable, though synonyms were used. Practically no text could be rephrased due to the lack of prominent words that help recognize the ‘precedent’ text. Adding the element was also a hard task due to formal linguistic restrictions.

For Group 2 (Substitution of the element), change of one element could be applied in only 3 cases of 72 due to linguistic restrictions. Alternative substitution of one element could be applied in all cases with certain limits: 41 examples had a definite pattern where one element could be substituted. The 31 remaining cases were either too short or required a rhyming word (synonym) for substitution. A phrase from The X-Files, “The Truth Is Out There” (Истина где-то рядом), has a variable: the first word can easily be substituted by another, as in Freedom Is Out There (Polikovsky, 2018). A shorter phrase places stricter requirements on the substituting word. The Russian translation of A. Camus’s L'Homme révolté is Человек бунтующий, with the adjective in post-position, which is untypical of Russian. So in the headline Человек голосующий (revolting is substituted by voting) the alternative word rhymes with the original one and is of the same part of speech (participle). Substitution of two elements makes the phrase a puzzle rather than a headline. “Associate Professor 007” (Shiryaev, 2019), further transformed into Associate Professor 008, becomes hardly referable to James Bond. Because even tiny details of the phrase’s pattern (case, voice, tense, mood, number, etc.) can be of the utmost importance, such a phrase cannot be rephrased without losing its relation to the ‘precedent’ text.
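The idea of a pattern with one variable element can be illustrated with a small Python sketch. It is not the procedure used in the study: it assumes naive whitespace tokenization, and the function names `slot_patterns` and `match_one_slot` are hypothetical. For each word position of the ‘precedent’ text it builds a regular expression in which that word is replaced by an open slot, and then checks whether a headline fits exactly one such pattern.

```python
import re


def slot_patterns(precedent):
    """Build one regex per token position, with that token replaced by a slot."""
    tokens = precedent.split()
    patterns = []
    for i in range(len(tokens)):
        parts = [re.escape(t) for t in tokens]
        parts[i] = r"(\S+)"  # the single variable element
        patterns.append((i, re.compile(r"\s+".join(parts), re.IGNORECASE)))
    return patterns


def match_one_slot(precedent, headline):
    """Return (position, original word, filler) if the headline differs from
    the precedent text in exactly one token; otherwise return None."""
    tokens = precedent.split()
    for i, pattern in slot_patterns(precedent):
        m = pattern.fullmatch(headline.strip())
        # Ignore the trivial case where the slot holds the original word.
        if m and m.group(1).lower() != tokens[i].lower():
            return i, tokens[i], m.group(1)
    return None


# Pairs taken from the examples discussed in this section.
print(match_one_slot("Winter is coming", "War is coming"))
print(match_one_slot("Человек бунтующий", "Человек голосующий"))
print(match_one_slot("Divide and rule", "Devalue and underpay"))
```

A pattern of this kind captures only the position of the substituted word. The further requirements mentioned above (rhyme, same part of speech) would need morphological and phonetic information that this sketch does not model.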

For Group 3 (Rephrasing), as for the previous one, the structure appeared to be the most important. Here it is not the syntactic but the semantic structure, with certain key words. A slightly paraphrased “Seek and ye shall be found” (Khachaturov, 2018b) can be rephrased further as long as the semantic arguments retain their relations. If these relations are violated, the ‘precedent’ text will not be recognized. The headline “How the son was responsible for his father” (Timofeeva, 2018) is related in Russian discourse to J. Stalin’s phrase popularized by the Soviet press: “Сын за отца не отвечает” (The son is not responsible for his father). Thus, once the semantic arguments [close relative] and [to be responsible for] are used in a sentence, any adjuncts can be added without losing the connection to the stated ‘precedent’ text. However, one should be aware of key words from other ‘precedent’ texts so as not to confuse them in the reader’s mind. The semantic arguments [to be responsible for] and [to tame] will definitely refer to “The Little Prince” (You become responsible, forever, for what you have tamed (Little Prince, n.d.) / Ты навсегда в ответе за тех, кого приручил (Malenkij Prinz, n.d.)).

Group 4 (Preservation of the sample) could hardly be altered at all, so syntactically integral did these examples prove to be. The exceptions were the examples where the sample was included in a free phrase, as in “Of Freaks, Men and Goalkeepers” (Malukova, 2019), which recalls the movie “Of Freaks and Men” directed by A. Balabanov. The title of the movie itself refers to J. Steinbeck’s “Of Mice and Men”, though the Russian translation of that title doesn’t directly correspond to the movie’s.

New terms and memes are themselves used very carefully in the media, for they can only conditionally be called ‘precedent’ until they are firmly established in speech.

Prominent name or a special word doesn’t require any structure or surrounding.

Adding the element is a rare group of alternations that is presumably occasional and can’t be called a productive way of transforming the ‘precedent’ text.

27 respondents (Master’s degree students, Perm State National Research University) participated in the survey. Most of the texts were recognized and the reasons given. A number of answers are partly repeated in different questionnaires, so the total number of answers exceeds the number of respondents. Sometimes respondents could not comment on a phrase, so blank fields were also rather frequent.

The main reason given was a key word or a familiar phrase (18 answers). 10 people made remarks on a “similar construction”, “rhyming words” or “close synonyms”. Indirectly related synonyms puzzled the respondents and prevented fast recognition. In total, 14 people mentioned that the phrase is “word-of-mouth”, referring both to proverbs and to simply “known” phrases that “sound alike” the ‘precedent’ text. 5 people noted that it is “just something familiar” that makes them recognize the ‘precedent’ text, and 4 people found phrases unknown to them nonetheless “familiar”. In 15 cases it was the substitution of two words in a headline that prevented respondents from recognizing or understanding a phrase. Some people could guess even from a slight hint, whereas others needed more detail. Preserved samples of a ‘precedent’ text were recognized by all respondents.

Conclusion

The given study was aimed at delineating the borders of the morphological and syntactical changes that could be applied to the ‘precedent’ text used as an intertextual pretext to the headlines.

After the analysis of the selection of headlines from “Novaya Gazeta” and the survey, it was found out that the most productive way of forming a headline from a ‘precedent’ text is the substitution of one element in the phrase. In some cases the ‘precedent’ text requires a synonym or a rhyming word for substitution. Substituting two elements usually makes the ‘precedent’ text unrecognizable. Changing one element is also productive with certain linguistic restrictions. Rephrasing proves to be efficient only in those cases, when the ‘precedent’ text implies prominent words. ‘Precedent’ text incorporated into a sentence is recognized by prominent words or the structure. Other ways of transforming ‘precedent’ texts into headlines seem to be unproductive.
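The distinctions summarized here can be turned into a rough rule-based labeller. The Python sketch below is an illustration under stated assumptions, not the author's method: it tokenizes on whitespace, compares the two texts position by position, and uses character similarity from the standard `difflib` module to separate a changed word from a substituted one. The cut-off `CHANGE_THRESHOLD`, the labels and the function name `classify` are all assumptions introduced for the example.

```python
from difflib import SequenceMatcher

# Assumed cut-off separating a "changed" word from a "substituted" one;
# the study reports no numeric threshold.
CHANGE_THRESHOLD = 0.6


def tokenize(text):
    """Lower-case and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def classify(precedent, headline):
    """Label the alternation that turns the precedent text into the headline."""
    p, h = tokenize(precedent), tokenize(headline)
    if p == h:
        return "preserved"
    # The precedent text appears unchanged inside a longer headline.
    if len(h) > len(p) and any(
        h[i:i + len(p)] == p for i in range(len(h) - len(p) + 1)
    ):
        return "preserved in a longer phrase"
    if len(p) == len(h):
        diffs = [(a, b) for a, b in zip(p, h) if a != b]
        if len(diffs) == 1:
            old, new = diffs[0]
            similarity = SequenceMatcher(None, old, new).ratio()
            if similarity >= CHANGE_THRESHOLD:
                return "change of one element"
            return "substitution of one element"
        if len(diffs) == 2:
            return "substitution of two elements"
    return "rephrasing or other"


# Expected recognizability per label, paraphrasing the conclusions above;
# None means the outcome depends on prominent words being kept.
LIKELY_RECOGNIZED = {
    "preserved": True,
    "preserved in a longer phrase": True,
    "change of one element": True,
    "substitution of one element": True,
    "substitution of two elements": False,
    "rephrasing or other": None,
}

for pair in [
    ("social budget", "asocial budget"),
    ("Winter is coming", "War is coming"),
    ("Divide and rule", "Devalue and underpay"),
]:
    label = classify(*pair)
    print(pair, "->", label, "| likely recognized:", LIKELY_RECOGNIZED[label])
```

Position-by-position comparison is the simplest choice and fails as soon as a word is inserted or the order changes, so such cases fall into the catch-all label. Rhyme and part of speech, which the findings single out for short phrases, are not checked at all.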

Thus, the stated findings can be of practical use for natural language generation in a number of spheres. One can start by scanning the media to search for and then index ‘precedent’ texts through special software, using Wikiquote, the National Language Corpus (or any other resource) as the initial source of ‘precedent’ texts. For this purpose search filters based on the results of this paper may be used. In the course of additional study, such search filters may later outline the parameters of the linguistic patterns to be used for automatic search and indexation of precedent linguistic phenomena in large collections of media texts.
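As a purely illustrative sketch of what such a search filter might look like, the Python code below indexes a small in-memory list of ‘precedent’ texts by their words and retrieves, for a given headline, those texts that differ from it in at most one word. Everything here is an assumption made for the example: the list stands in for a real source such as Wikiquote (no external resource is queried), the function names are hypothetical, and the limit of one substituted word simply mirrors the finding that two substitutions usually make the text unrecognizable.

```python
from collections import defaultdict

PUNCTUATION = ".,!?:;\"'«»“”"


def normalize(text):
    """Lower-case, split on whitespace and strip surrounding punctuation."""
    stripped = (t.strip(PUNCTUATION).lower() for t in text.split())
    return [t for t in stripped if t]


def build_index(precedents):
    """Map every word to the ids of the precedent texts that contain it."""
    index = defaultdict(set)
    for pid, text in enumerate(precedents):
        for token in set(normalize(text)):
            index[token].add(pid)
    return index


def token_distance(p, h):
    """Count differing positions; None when the lengths differ."""
    if len(p) != len(h):
        return None
    return sum(a != b for a, b in zip(p, h))


def search(headline, precedents, index, max_substitutions=1):
    """Return (precedent text, number of substituted words) for each hit."""
    h = normalize(headline)
    candidates = set()
    for token in h:
        candidates |= index.get(token, set())  # texts sharing at least one word
    hits = []
    for pid in sorted(candidates):
        distance = token_distance(normalize(precedents[pid]), h)
        if distance is not None and distance <= max_substitutions:
            hits.append((precedents[pid], distance))
    return sorted(hits, key=lambda hit: hit[1])


# A toy source list built from examples mentioned in this article.
SOURCE = ["Winter is coming", "Soap opera", "Divide and rule", "For Whom the Bell Tolls"]
INDEX = build_index(SOURCE)

for headline in ["War is coming", "Soap drama", "Devalue and underpay"]:
    print(headline, "->", search(headline, SOURCE, INDEX))
```

The word index keeps the comparison limited to texts that share at least one word with the headline, which matters once the source list grows to the size of a quotation dictionary. Inflected Russian forms would additionally require lemmatization, which is left out here.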

References

Copyright information

Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.

About this article

Publication Date

03 August 2020

eBook ISBN

978-1-80296-085-3

Publisher

European Publisher

Volume

86

Edition Number

1st Edition

Pages

1-1623

Subjects

Sociolinguistics, linguistics, semantics, discourse analysis, translation, interpretation

Cite this article as:

Klochko, C. A. (2020). The Problem Of Constructing Linguistic Patterns For Detecting ‘Precedent’ Texts (In Headlines). In N. L. Amiryanovna (Ed.), Word, Utterance, Text: Cognitive, Pragmatic and Cultural Aspects, vol 86. European Proceedings of Social and Behavioural Sciences (pp. 653-662). European Publisher. https://doi.org/10.15405/epsbs.2020.08.77