MEDIA SECURITY PROTECTION: NEW METHODOLOGY FOR AUTHORSHIP EXAMINATION OF THE INTERNET DISCOURSE

This article deals with the forensic authorship examination of Internet discourse texts as a means of ensuring media security. The existing forensic authorship examination methods are either not suitable for identification and diagnostic tasks or need to be updated and improved, since modern communication in the Internet environment has new properties that differ from those of purely oral or written speech. The authors seek to reveal the specific mixed nature of this new type of forensic object (a combination of oral and written speech) and propose new approaches to the identification and diagnostic authorship examination of Internet speech products. In the authors' opinion, the most rational approach to developing a methodology for Internet discourse authorship examination is to combine the methods of identifying a speaker based on oral speech (in terms of linguistic analysis within the framework of the “Dialect” technique) with those based on written speech (methods proposed by Vul and the Ministry of Internal Affairs of Russia). The article serves as a basis for future research in forensic authorship examination of Internet discourse and in forensic speech science in general. The authors conclude that it is necessary to improve or develop not only authorship identification methods but also authorship diagnostic methods, which make it possible to determine the properties of a questioned text and/or of an author's idiostyle, electronic communication skills, etc., which in turn is needed to determine the author's social and demographic characteristics.


Introduction
The wide spread of digital data transmission technologies (digitalization) has led to the formation of a new digital environment in which information circulates in digital form. Most such information products are either texts or polycode materials containing a verbal component. Processes that

Problem Statement
In proceedings on cases related to ensuring media security, it is often necessary not only to study the speech products themselves (their semantics and pragmatics) but also to identify their author (the author and the distributor of a text do not always coincide).
Traditionally, an expert in the relevant specialty (in such cases, a forensic authorship expert) is invited to resolve issues requiring special knowledge. However, forensic expert support for the examination of Internet discourse speech products is difficult to provide, since the existing forensic methods have not been updated for a long time and need theoretical rethinking and revision that takes into account the mixed nature of Internet speech products.
The main hypothesis of the research was that adaptation of authorship examination methodology is necessary for examination of speech products of the Internet discourse.

Research Questions
The research aimed to answer the following questions:
- What approaches to forensic authorship examination of the Internet discourse exist?
- What are the key features of typical objects of forensic authorship examinations?
- What are the promising directions in forensic authorship examination of the Internet discourse and in forensic speech science in general?
To consider these issues, the authors relied on the following fundamentals:
- Forensic authorship examination of the Internet discourse is a compulsory component of media security support.
- The media security system serves to strike a balance between freedom of speech and the right of citizens to a safe information space.
- The activities of forensic authorship experts have to correspond to the activities of state organizations and must comply with the basic principles of information security and information ethics.
https://doi.org/10.15405/epsbs.2021.12.74 Corresponding Author: Vladimir Dmitrievich Nikishin Selection and peer-review under responsibility of the Organizing Committee of the conference

Purpose of the Study
The purpose of the study was to develop theoretical provisions and specific practical recommendations on how to use forensic linguistic (authorship) knowledge to ensure media security in the digital environment.

Research Methods
The research is interdisciplinary and is based on the provisions of forensic expertology and forensics (forensic science), on the one hand, and of forensic speech science (including forensic linguistics) and applied linguistics, on the other.

Findings
Electronic media communication, like any other phenomenon, has both advantages and disadvantages. Its advantages include potentially unlimited possibilities for speech activity and independence from a specific territory (several people in remote parts of the world can work together on a project by means of electronic communication; people with certain health problems can also use this type of communication). Its disadvantages include the relative anonymity of speech products (it is not always possible to identify the author of a speech product). It is worth noting that some of these advantages, such as the potentially unlimited possibilities for speech activity, can also turn into disadvantages (for example, when an anonymous person sends or publishes negative (defamatory) information about another person).
The criteria for the suitability of objects for forensic authorship examination differ from the criteria applied in other types of forensic speech examination (forensic linguistic expertise, forensic phonoscopic expertise). In forensic linguistic examination (the main purpose of which is to establish semantic content), only speech products with signs of textuality are suitable objects (common concept, theme, structure; logical and stylistic unity; grammatical and semantic connectivity of the components). In forensic authorship examination, the main criterion for the suitability of an object is that it expresses the author's individual speech-thinking skills; the object need not be a whole text but may be a speech product whose volume is not below a certain threshold value. If the text is stereotyped, stripped of individuality, or technical, the signs that characterize the author's speech-thinking skills are most likely not expressed, and such objects will therefore not be suitable. At the same time, the presence of signs of textuality in the object of forensic authorship examination is secondary and does not carry the same weight as in forensic linguistic examination.
The properties of the Internet discourse differ from the properties of purely oral or written communication. The wide possibilities for coding information provided by electronic data transmission devices lead to the transformation of natural language and a change in the rules for handling it. The text of Internet communication, expressed in digital form and having a physical medium, shares features with conventionally organized written text, but it also has features of oral communication (this is especially evident in the texts of blogs and social networks).
The speech of the global network shows an abundance of deviations from existing language norms and exists mainly in the format of forums and chats, whose pragmatics is as close as possible to that of colloquial speech and consists in openness, striving for mutual understanding, mutual interest in communication, and emotional release (Proshakova, 2008). Communication on forums and chats is also brought closer to the pragmatics of colloquial speech by emotional connotations, which communicants actively express through graphic non-verbal symbols as well as symbols provided by the programmed capabilities of a particular site (netiquette):
- capital letters / their absence;
- excess / absence of punctuation marks, etc.
The presence of the above signs describes the nature of the author's communicative intentions and serves as a marker of a certain emotional state.
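The graphic markers listed above lend themselves to simple counting. The following is a minimal illustrative sketch, not part of any method cited in this article: the function name `graphic_markers`, the deliberately small emoticon pattern, and the choice of counted features are all assumptions made for the example.

```python
import re

# Hypothetical, deliberately small emoticon pattern; real inventories are far larger.
EMOTICON = re.compile(r"[:;=][-^]?[)(DPp]")
# Two or more consecutive punctuation marks, e.g. "!!!" or "?!".
PUNCT_RUN = re.compile(r"[!?.,]{2,}")


def graphic_markers(message: str) -> dict:
    """Count a few surface markers of emotional colouring in one message."""
    letters = [ch for ch in message if ch.isalpha()]
    upper = [ch for ch in letters if ch.isupper()]
    stripped = message.rstrip()
    return {
        # Share of capital letters among all letters (0.0 for letterless input).
        "uppercase_share": len(upper) / len(letters) if letters else 0.0,
        # Number of runs of repeated punctuation ("excess" punctuation).
        "punctuation_runs": len(PUNCT_RUN.findall(message)),
        # Number of emoticons matched by the simplified pattern above.
        "emoticons": len(EMOTICON.findall(message)),
        # Whether the message closes with a sentence-final mark at all.
        "ends_with_punctuation": bool(stripped) and stripped[-1] in ".!?",
    }


if __name__ == "__main__":
    print(graphic_markers("WHY did you do that?!! :)"))
```

Such counts only describe the surface of a message; interpreting them as markers of communicative intention or emotional state remains the expert's task.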
A key feature of traditional forms of verbal communication (oral and written) is the paired property of spontaneity / preparedness. As a rule, speaking is characterized by spontaneity, while writing is characterized by preparedness. However, this is not true in all cases: oral speech can be prepared (e.g., when speaking in public), and written speech can be spontaneous (e.g., texts of electronic communication). Establishing the degree of preparedness of an oral or written text is important for correctly assessing and interpreting the features in which a questioned speech product and the samples differ during the identification process. The existing methods of identifying the author of a written text of the Internet discourse need to be updated, as spontaneity implies different levels of organization of such texts and different norms for their construction.
Another factor influencing the possibility of differentiating a spontaneous or prepared text is its functional and stylistic affiliation. The features characterizing the spontaneous or prepared generation of a text can have a different degree of stability under the conditions of various functional styles. On the one hand, they can be consciously (or unconsciously) omitted; on the other hand, they can be intensified due to certain components of oral or written discourse, implemented in a certain communication situation.
Thus, the main approaches to improving the methods of author identification for electronic communication products include the study of the features of spontaneous written texts and of the stylistics of Internet discourse products. The digital transformation of the language leads to the emergence of new speech norms. That is why the formal methods developed in the last third of the 20th century (those of Vul (2007), the Ministry of Internal Affairs of Russia (Rubczova et al., 2007), etc.) and used to identify the author of a prepared written Russian-language text are not suitable for the examination of this type of speech product.
Electronic communication is characterized by 'economy' of the author's speech means. Traditional norms of spelling and punctuation are not relevant in the Internet discourse. Nowadays, deviations committed, for example, in messengers are not qualified as errors. In the Internet discourse, deviations of this kind are regarded not as a sign of the author's illiteracy but as an objective way to save time. Whereas compliance with the rules of the literary language was previously considered a sign of high linguistic competence, the main metrics of linguistic competence are now the range of vocabulary, the ability to actively perceive new generative constructions borrowed from other languages that can be  lexical skills and vocabulary volume.
In authorship identification examination, the forensic expert analyzes the questioned object and the comparative samples of written speech separately. The expert then selects the coinciding and differing general and specific features of language skills expressed in the questioned text and in the samples. In practice, in order to obtain a high-quality expert examination, certain
Revealing the main features of text attribution, Vinogradov (1961) noted eleven factors characterizing questioned texts, six of which are objective, but the vast majority are characterized by weak formalizability. One of these factors, according to the scholar, is the linguistic stylistic factor.
Researchers examine quantitative indicators of style.
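As a rough illustration of what quantitative indicators of style can look like, the sketch below computes three common surface measures for a questioned text and a comparison sample and reports how far apart they are. It is an assumption-laden example, not the method of Vul, the Ministry of Internal Affairs of Russia, or any other technique cited here; the function names, the chosen indicators, and the naive tokenization are all hypothetical.

```python
import re

# Naive word tokenizer: any run of letters, digits, or underscores.
WORD = re.compile(r"\w+")


def style_indicators(text: str) -> dict:
    """Compute a few surface-level quantitative indicators of style."""
    words = [w.lower() for w in WORD.findall(text)]
    # Naive sentence split on runs of sentence-final punctuation.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    if not words:
        return {
            "mean_word_length": 0.0,
            "type_token_ratio": 0.0,
            "mean_sentence_length": 0.0,
        }
    return {
        "mean_word_length": sum(len(w) for w in words) / len(words),
        # Distinct words divided by all words: a crude vocabulary-range measure.
        "type_token_ratio": len(set(words)) / len(words),
        "mean_sentence_length": len(words) / len(sentences),
    }


def compare(questioned: str, sample: str) -> dict:
    """Absolute difference of each indicator between two texts."""
    q, s = style_indicators(questioned), style_indicators(sample)
    return {name: abs(q[name] - s[name]) for name in q}


if __name__ == "__main__":
    print(compare("I see. I see you!", "ok see you later, write me when you are home"))
```

On short Internet messages such indicators are unstable, which is one reason purely formal measures need to be supplemented by qualitative analysis.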
All of the above suggests that forensic authorship examination of texts of the Internet discourse must be based not only on formal but also on qualitative research methods (for example, the method of quasi-synonyms). In addition, a promising direction in the development of new attribution methods is the study of the properties of an author's idiostyle.
The term 'idiostyle' overlaps with the term 'idiolect', and the difference between them depends on the views of a particular researcher. In general, it can be summarized as follows: an idiolect is understood as the entire set of texts created by a certain author in the original chronological sequence, whereas an idiostyle is understood as the set of deep text-generating dominants and constants of that author which determined the appearance of these texts in that order.
Idiolect (from the Greek idio, 'own, peculiar, special', and dialect) is a set of formal and stylistic features inherent in the speech of an individual speaker of a particular language. It is a designation of the
In other words, it is one of the possible versions of the linguistic representation of the sense that the author wants to convey. With regard to the forensic authorship examination of the Internet discourse, the author's style covers quasi-synonyms, ordinary synonymous series, etc., since the author's preferences (not only textual ones but also punctuation preferences, as well as the use of emoticons, infographics, graphic highlights, etc.) are of forensic significance for identification and diagnosis. Therefore, a new methodology for the examination of Internet speech products should take the listed features into account and entail a new understanding of identification and diagnostic features in relation to new objects in which oral and written speech are mixed (since there is a written display of the inherently oral process of generating text in an online format). In such cases, forensic authorship experts assess not the linguistic competence of an individual (his/her knowledge of the rules that make up the norm of the literary language) but his/her electronic communication skills and speech habits (including brevity, economy of speech effort, etc.), taking into account the new speech norms inherent in Internet discourse.
One of the important directions of the forensic authorship examination of Internet discourse products is determining the degree of their spontaneity / preparedness. Resolving this diagnostically significant issue allows one, in particular, to conclude that the author of the text had an intention to commit an offense. Based on the results of such an examination, an expert can conclude that the questioned text refers to previously prepared speech (copying a previously known or unknown text; compiling a previously learned written text; written speech under dictation; presentation of a written text; written citation of a written text) or to unprepared / spontaneous speech (drawing up a written text according to a template, filling out a form according to a sample; composing a written text on a predetermined topic according to a plan; drawing up a written text with partial reproduction of someone else's speech, verbatim and non-verbatim quotation of someone else's or his/her own text; composing a text on a topic that is not known in advance, but well-known; composing a text on an unknown and unfamiliar topic; writing answers to questions posed in advance; writing spontaneously generated text in monologue or dialogue forms) (Galyashina, 2003).
However, the solution to this issue is not only of diagnostic forensic significance. As indicated above, electronic communication products (even those presented in written form) also contain signs of oral speech. In this regard, we propose to combine the approaches to identifying a speaker based on oral speech (in terms of linguistic analysis in the framework of the "Dialect" technique for forensic phonoscopic expertise) and on written speech (authorship examination methods proposed by Vul (2007) and the Ministry of Internal Affairs (MIA) of Russia (Rubczova et al., 2007)) for the purpose of forensic authorship examination of Internet discourse products. This will help to update the provisions of these techniques and at the same time to take into account the features of electronic communication that coincide with the features of oral communication.
If identification is impossible for objective reasons (for example, in a case without a suspect), the involvement of a forensic authorship expert still seems desirable, as it is possible to narrow the circle of suspects by conducting a diagnostic examination to establish the author's social and demographic characteristics (gender, age, ethnic / religious affiliation, education, profession, etc.).
The characterization of the author is also an important area of forensic authorship diagnostics. The fundamental possibility and necessity of determining the demographic characteristics (including age and gender) of authors were demonstrated by the research of Koppel et al. (2002), which correctly established the gender of the author in 79.5% of cases for men and 82.6% of cases for women. Johannsen et al. (2015) also established correlations in the features of the written language of men and women, as well as of different age groups. McDonald et al. (2013) also conducted research in this sphere. Hovy (2015) identified the most widespread categories of topics typical of authors of a particular gender (men and women). Therefore, scholars see possibilities for using forensic authorship diagnostics in judicial practice (to determine the true identity of Internet users) (Schler et al., 2006). For example, some individuals attempt to conceal their true identity by creating profiles on social networks with false personal data (last name, first name, age, gender, location) in order to persecute their victims (Peersman et al., 2011).

Conclusion
The existing forensic authorship examination methods are either not suitable for identification and diagnostic tasks or need to be updated and improved, since modern communication carried out in the Internet environment has new properties that differ from those of purely oral or written speech. The methodology of forensic authorship examination needs to be revised to account for the small volume of questioned texts and their online nature.
It is also necessary to develop new forensic methods based on a new approach to the concept of digital communication idiostyle. The most rational way to develop a methodology for Internet discourse authorship examination is, in our opinion, to combine methods of speaker identification based on oral speech (in terms of linguistic analysis in the framework of the "Dialect" technique) and on written speech (methods proposed by Vul (2007) and the Ministry of Internal Affairs of Russia (Rubczova et al., 2007)).
The formal methods of the author's identification intended for written prepared texts are not suitable for examination of Internet online texts. In forensic authorship expertise, signs are considered from two points of view:
- norm-error. Everything that is a mistake refers to signs indicating either the grade of linguistic competence (literacy) or knowledge / lack of the language knowledge (when the language in which the text is generated is not native). The norms of the language are not forensically significant due to their normativity (since their manifestation has a low identification significance);
- norm in terms of functional style. The norm of business speech and the norm of colloquial speech are different. Electronic communication products do not correspond to the norms of the Russian literary language for the most part (the norm for them is non-compliance with the literary norm, economy of speech means).
Moreover, the understanding of a norm (and what is not a norm) in relation to electronic communication products needs to be rethought. Therefore, forensic methods that focus on the qualitative indicators of the questioned text and allow determining the competence of the author's electronic communication skills seem to be suitable. A promising area of research is the development of new mixed (formal-qualitative) methods.
It is necessary to improve or develop not only authorship identification methods but also authorship diagnostic methods. Forensic authorship diagnostics makes it possible to determine the properties of a questioned text and/or of an author's idiostyle, electronic communication skills, etc., which is needed to determine the author's social and demographic characteristics. When identification of an author is impossible, a diagnostic approach with subsequent narrowing of the number of suspects seems to be an effective way for a forensic expert to assist a law enforcement officer in ensuring media security in the digital environment.