Promising ways of raising the prestige of minority languages and the demand for studying them include the creation of electronic lexicographic sources with an English component. Their development is relevant because of an acute shortage of bilingual educational resources for studying these languages, on the one hand, and the dominance of English as one of the compulsory foreign languages in educational institutions of the Russian Federation, on the other. The article presents the experience of creating such resources with the inclusion of Russian, English and Khakass, one of the minority languages of the Indigenous peoples of Southern Siberia. Basic models for translating definitions that differ in the ethnocultural volume of their semantics and background information are considered. Some strategies for presenting ethnocultural lacunae are suggested. The use of computer programs, with their additional opportunities and advantages for such electronic dictionaries, fosters more effective interaction of autochthonous, official and foreign languages in the educational space of the republics of Southern Siberia.

An analysis of current scholarship on bilingual electronic lexicography leads to the conclusion that modern electronic translation-oriented dictionaries are in many ways more productive and convenient than traditional paper dictionaries. The following advantages of various types of electronic and online dictionaries are usually highlighted:

1) The ease of using large volumes of vocabulary with no need to flip through a multi-page paper dictionary. When requested, dictionary entries of special dictionaries are displayed.

2) The ease of downloading most electronic dictionaries via the Internet.

3) An option to see changes and edit the definitions of entry words, although this applies to a greater extent to terminological dictionaries because of the more rapid changes in this subsystem of a language (Marus, 2015, 2019; Nesova & Bobrytskikh, 2018; Toyoda, 2016; Töpel, 2014).

At the same time, an opposite point of view is expressed in Cherepovsky (2018). The author argues that electronic dictionaries triumphed over their paper predecessors owing to a clear superiority in price, volume, speed and ease of search. But these are quantitative rather than qualitative indicators. In addition to the quantitative indicators, the potential benefits of electronic dictionaries were initially attributed to frequent updates, collective authorship and constant feedback from thousands of users. However, not all of these potential opportunities have been realized, and we can hardly say that the content of electronic translation dictionaries has risen to a new level or that the potential of computer lexicography has been fully realized.

Nevertheless, obvious changes in lexicography are erasing the boundaries between different types of language resources – dictionaries, encyclopaedias, terminology banks, lexical databases, writing and translation tools, etc. According to Hartmann (2005), this trend may be called “hybridization”, which is “combinations of one or more types of standard work in one product” (p. 195). The examples given include compromise genres such as Dictionary + Grammar, Dictionary + Thesaurus, and Monolingual-Bilingual Dictionary. It is very possible, argues Marus (2019), that “very soon the electronic dictionary will become an integrated tool or a set of tools in a comprehensive application for a professional user, where it will coexist with other products with language technologies” (p. 2). Another important development that may influence the future of lexicography is Wiki technologies, i.e. technologies for storage, collective development, structuring of text, hypertext, etc. Wiki technology makes it possible to create and modify articles in online dictionaries, a phenomenon often referred to as “collaborative lexicography”. The main advantages of collaborative lexicography are: 1) keeping the vocabulary constantly up to date, with timely inclusion of neologisms; 2) representation of more types of vocabulary, including slang and vernacular, as well as dialect words and terminology; 3) a significant replenishment of the vocabulary, including collocations, phraseological units, neologisms and illustrative examples of word use (Marus, 2019).

Another current trend in electronic lexicography is linked to its role not only as a repository of useful information, but also as an auxiliary tool in the classification and systematization of the surrounding world. The development of electronic lexicography offers a good opportunity to create specialized bilingual dictionaries as sources of information on many aspects of the national worldview encoded in language signs, which can contribute to successful intercultural communication and closer social interaction (Nagel, 2009; Nagel & Koksharova, 2015).

Morphological derivation is considered by Nagel and Temnikova (2017) to be one example of how cultural codes are actually exposed in languages, owing to the semantically concentrated description of reality that it contains. According to Nagel and Temnikova (2017), all three spheres of word formation in Russian (mutation, syntactic and modification) are associated with both thinking and other areas of consciousness, including the emotional component. This combination can be considered a cultural code of evaluative behaviour in Russian linguistic culture. The derivative names of a person that belong to the syncretised zone of Russian word formation are represented as culture-specific concepts requiring a special approach to the analysis of their semantics in the study of Russian as a foreign language and in the practice of translation. Specialized bilingual dictionaries can reveal the specific nature of the national world picture to those who are not native speakers of the languages concerned, and serve as a good basis for comparative linguistic, bilingual and cultural-linguistic research, fulfilling an explanatory function as well.

Problem Statement

In our globalizing times, there is an increased demand for electronic dictionaries in general, with bilingual dictionaries needed even more than monolingual ones. The need is especially urgent for electronic dictionaries that include regional or Indigenous languages and are meant to be used in the educational space of multiethnic states to create better practices with regard to these languages.

In Russian practical lexicography, the creation of bilingual Indigenous-Russian and Russian-Indigenous dictionaries has a long tradition. The purpose of the first type of dictionary was to help those who, while fluent in their native language, did not know Russian well enough for full-fledged communication or for reading literature in Russian. Most of these dictionaries, especially those created in Soviet times, sought to record and present mainly the so-called socially significant vocabulary, emphasizing socio-political terminology and general vocabulary with a small inclusion of colloquial, vernacular and dialect units, and suggesting the most generalized semantic structure of a word (Ozolinya, 2018). Nowadays, the tendency is towards the fullest possible representation of the lexical units of autochthonous languages, including all nominations of traditional artefacts and established ethnic tokens, regardless of the frequency of their use.

Promising ways of boosting the vitality of minority languages and the prestige of studying them include the creation of bilingual electronic lexicographic sources with an English component. Their development is relevant because of an acute shortage of bilingual electronic educational resources for studying these languages, on the one hand, and the dominance of English as one of the compulsory foreign languages in educational institutions of the Russian Federation, on the other.

Tatar lexicography with a foreign-language component is more developed and in demand in the polylingual environment of the Republic of Tatarstan. Safiullina (2018) reveals the main ways of achieving accuracy and adequacy in translating words, phrases and sentences, giving good advice to lexicographers for further work in this field. But most autochthonous languages of the peoples of Russia have not yet found their place in the field of bilingual electronic lexicography.

The Republics of Altai, Tyva and Khakassia are situated in Southern Siberia. They bear the names of the respective Indigenous peoples. The Indigenous languages of the titular nations received the official status of second (after Russian) state languages of these republics in the post-Soviet period, in the early 1990s (Borgoyakova & Guseynova, 2017). To date, these neighbouring republics have the corresponding bilingual national-Russian and Russian-national dictionaries and a number of bilingual phraseological, terminological and other dictionaries. At the same time, the creation of bilingual dictionaries involving both Indigenous and foreign languages, primarily English, has not yet received significant support or distribution.

Research Questions

The introduction of the English component into the bilingual lexicographic practice of the Indigenous languages of Southern Siberia took place in the early 1990s, when a short trilingual (Russian-English-Khakass) phrasebook was created (Borgoyakova & Perkas, 1992). In 2014, the electronic Khakass-English Thesaurus “Kizi/Person” was published, followed in 2019 by the electronic Khakass-Russian-English dictionary (Borgoyakova et al., 2020a, 2020b). The management program of the Khakass-Russian-English electronic dictionary is designed to work with a trilingual database and searches for translations of language units from Khakass, Russian and English, with the language chosen by the user. In editing mode, the program allows the user to add new searchable units and to assign translation links to the other languages.
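The behaviour described above (a trilingual database, search from a language chosen by the user, and an editing mode that adds units and assigns translation links) can be illustrated with a minimal Python sketch. It is not the authors' program: the class and method names (`TrilingualLexicon`, `add_unit`, `assign_translation`, `translate`) and the in-memory storage are assumptions made purely for illustration, and the Russian form is supplied here only as an example.

```python
class TrilingualLexicon:
    """Hypothetical in-memory model of a Khakass-Russian-English database."""

    def __init__(self):
        self._links = []   # each link maps language name -> word form
        self._index = {}   # (language, casefolded word) -> list of link ids

    def add_unit(self, language, word):
        """Editing mode: register a new searchable unit, return its link id."""
        self._links.append({language: word})
        link_id = len(self._links) - 1
        self._index.setdefault((language, word.casefold()), []).append(link_id)
        return link_id

    def assign_translation(self, link_id, language, word):
        """Editing mode: attach a translation in another language to a unit."""
        self._links[link_id][language] = word
        self._index.setdefault((language, word.casefold()), []).append(link_id)

    def translate(self, word, source, target):
        """Search mode: look a word up in `source`, return forms in `target`."""
        ids = self._index.get((source, word.casefold()), [])
        return [self._links[i][target] for i in ids if target in self._links[i]]


lexicon = TrilingualLexicon()
unit = lexicon.add_unit("khakass", "kizi")
lexicon.assign_translation(unit, "english", "person")
lexicon.assign_translation(unit, "russian", "человек")  # illustrative form

print(lexicon.translate("kizi", "khakass", "english"))
print(lexicon.translate("person", "english", "khakass"))
```

Storing one link per meaning, rather than one table per language pair, is a design choice of this sketch: it lets any of the three languages serve as the search language without duplicating entries.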

The experience of creating the first dictionaries of this new type makes it possible to answer many questions about the main difficulties, which are associated primarily with ethnocultural interlanguage lacunae that become visible only when translation is necessary (Bykova, 2003). Word-for-word translation is impossible when it comes to unique artefacts, traditional objects or meanings that have no names in other languages. Besides, the interconnection between direct and indirect or metaphorical meanings in polysemantic words needs attention for proper translation. The place of dialectal differences and grammatical information in the dictionary also has to be discussed and determined.

Purpose of the Study

This article is devoted to the lexicographic experience of creating innovative bilingual resources with Khakass and English components in the context of the development of the theory and practice of lexicography and language education. The purpose of the work is to identify difficulties and show practical ways of solving them when developing dictionary entries in bilingual or trilingual lexicographic sources, taking into account ethnocultural realities and interlanguage gaps.

Research Methods

In this study the comparative method, the method of component analysis of the lexical meaning of words, methods of structural modelling and elimination of interlanguage gaps are used.


The revealed main approaches to the assessment of modern lexicography are contradictory and are associated with: a) an insufficient level of its theoretical understanding, which impedes the development of important lexical strata, including the phenomenon of interlingual lacunarity (Bykova & Pylaeva, 2003; Devkin, 2001); b) the point of view that the lack of theory is not an obstacle to improving the practice of creating dictionaries, because the search for meaning in the text becomes more effective due to the interaction of lexicography, linguistics and engineering linguistics (Schweizer, 1998). The recommendations of professional lexicographers also include the importance of taking into account the needs of a future dictionary user (Atkins & Rundell, 2008).

When creating the above bilingual electronic dictionaries, the authors assumed that the most interested potential users of their product would be students, teachers and researchers for whom both Khakass and English are part of their language repertoire, as well as English-speaking foreign students, interns and scientists interested in learning the Khakass language as a means of direct access to the Khakass ethnocultural heritage. Taking into account the interests of these potential users, vocabulary units from the base layer of the Khakass lexical fund were included in the dictionaries. This vocabulary includes words of all parts of speech, which form the verbal basis of everyday conversation and of the conceptualization and categorization of the worldview.

The search for the most accurate translation requires knowledge of both languages (Khakass and English), or of Russian in addition in the case of a trilingual dictionary, and an understanding of the ethnocultural specifics of the nominations in the languages concerned, since translation crosses the border not only between languages but also between cultures (Bykova, 2003). A complicated translation problem is the difference in the semantic volume of metaphorical nominations. For example, in Khakass culture, a reference to the black colour of hair and eyes carries an additional connotation of ethnic auto-stereotype and self-identification. In Khakass, hara pass “black head(s)” has the figurative meaning “my people”. For hara harah “black-eyed” there is an additional positive metaphorical nomination, nymyrt harah “bird cherry eye”. Therefore, the entry has to give not only the equivalent, “dark-eyed”, but also the literal translation, labelled as such: “bird cherry eye”. Given that the name of the bird cherry is devoid of any constant association with black, an additional clarification about the color of the ripe berries (“ripe”) is important. For its Khakass antonym hastah harah, both the translation “light-eyed” and an additional label giving the literal meaning, “unripened eyes” (lit. “unripe eyes”), are proposed (Borgoyakova, 2015).
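The entry model described in this paragraph (an equivalent, a literal translation marked as such, and an optional clarification) can be sketched as a small data structure. This is only an illustration under stated assumptions: the `Entry` class, its field names and the bracket layout of `render` are hypothetical and are not taken from the published dictionaries; the wording of the clarifying note is likewise illustrative.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    """Hypothetical dictionary entry for a metaphorical nomination."""
    headword: str          # Khakass unit
    equivalent: str        # closest English equivalent
    literal: str = ""      # literal translation, shown with the label "lit."
    note: str = ""         # additional ethnocultural clarification

    def render(self) -> str:
        parts = [f"{self.headword}: {self.equivalent}"]
        if self.literal:
            parts.append(f'(lit. "{self.literal}")')
        if self.note:
            parts.append(f"[{self.note}]")
        return " ".join(parts)


dark = Entry(
    "nymyrt harah",
    "dark-eyed",
    literal="bird cherry eye",
    note="ripe bird cherry berries are black",  # illustrative wording
)
light = Entry("hastah harah", "light-eyed", literal="unripe eyes")

print(dark.render())
print(light.render())
```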

Significant background information which is necessary here includes the spectrum of the polysemantic structure of the adjective hara “black” with not only indication of the color and the corresponding complex of negative connotations, but also positive values associated with the characteristics of a person. In such a semantic field, the meaning of white / light can receive a negative connotation, which appears depending on the context and topic of the discourse.

When translating ethnographic artifacts, an approved model for describing the definition of an object was adopted, containing both the external characteristics of shape, size or color, and its functional purpose. For example, the Khakass word pogo, denoting a popular female breastplate that has been preserved since ancient times, is translated into English through the definition “Woman’s decorative amulet” with a label classifying the object as belonging to traditional culture: (trad.).

The Khakass language has a complex system of kinship terms, with separate words not only for close but also for distant relatives. The system is detailed on the basis of binary oppositions: the male/female line of kinship, and older/younger children and members of the family in general. In English, this system is less detailed and more simplified. Therefore, the Khakass terms nigeci (wife of an older brother) and igeci (sister of a wife) are translated by one common word, “sister-in-law”, and tai uucha (maternal grandmother) and uucha (paternal grandmother) as “grandmother”, with corresponding notes explaining the difference expressed in the broader Khakass nomenclature. A full schematic illustration of the Khakass kinship system, provided by Jeffry Pretes, a graduate of the University of San Francisco, is included in the electronic Khakass-English Thesaurus “Kizi/Person”. Besides, the Thesaurus makes some paradigmatic relations in the vocabulary explicit. They are identified with the help of labels or by adding synonyms, antonyms and word-formation relatives of words with the corresponding translations.
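The many-to-one mapping of Khakass kinship terms onto English words, with notes that preserve the Khakass distinctions, can be sketched as follows. The four terms and their explanations come from the paragraph above; the table layout and the helper names `gloss` and `group_by_english` are assumptions made for illustration, not the structure of the Thesaurus itself.

```python
from collections import defaultdict

# (Khakass term, shared English equivalent, distinguishing note)
KINSHIP_TERMS = [
    ("nigeci", "sister-in-law", "wife of an older brother"),
    ("igeci", "sister-in-law", "sister of a wife"),
    ("tai uucha", "grandmother", "maternal grandmother"),
    ("uucha", "grandmother", "paternal grandmother"),
]


def gloss(khakass_term, terms=KINSHIP_TERMS):
    """Return the English equivalent together with its explanatory note."""
    for term, english, note in terms:
        if term == khakass_term:
            return f"{english} ({note})"
    raise KeyError(khakass_term)


def group_by_english(terms=KINSHIP_TERMS):
    """Show which Khakass terms collapse into one English word."""
    grouped = defaultdict(list)
    for term, english, _note in terms:
        grouped[english].append(term)
    return dict(grouped)


print(gloss("nigeci"))
print(group_by_english())
```

Grouping by the English word makes the lacuna visible: each English key with more than one Khakass term marks a distinction that English leaves unexpressed and that the note has to restore.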

The above examples prove the validity of studies of the phenomenon of linguistic lacunarity, which indicate the difficulty of filling the gap – the process of revealing concepts belonging to cultures that are alien to the recipient. The depth of this filling depends not only on the nature of the gap, but also on the characteristics of the recipient to whom the text is addressed (Bykova, 2003). The most vivid “zero” character of the lacunae for one of the compared languages is manifested in absolute lacunae, by which scientists understand the concepts denoted separately in some languages and not labeled in others or requiring the use of descriptive phrases (V.I. Zhelvis, Yu.A. Sorokin, I.Yu. Markovin) (as cited in Bykova, 2003).

In addition to the main proven strategies for overcoming translation difficulties, a feature of the implemented projects is the use of computer programs that speed up the search for a translation equivalent and provide audio support for the pronunciation of words and visualization of ethnocultural artefacts.

The inclusion of the English component enhances the attractiveness and relevance of the new lexicographic manuals, establishing a more effective interaction of languages in the educational space, including the titular languages of the republics of Southern Siberia.

A brief overview of Khakass Grammar

Due to the complex nature of the Khakass language, the electronic Khakass-English Thesaurus “Kizi/Person” includes general information about its grammar system and dialects. A brief overview of Khakass grammar and dialects is provided for English speakers unfamiliar with the language. For example, the following aspects are given in the thesaurus:

A. Vowel Agreement. Khakass utilizes both hard vowels (А, О, У, Ы) and soft vowels (Е, И, І, Ö, Ÿ, Э). In most instances a word will make use of either hard vowels or soft vowels, and only rarely both. This vowel agreement is easily demonstrated in plural formation. Thus АТ (horse) becomes АТТАР (horses) and ТŸЛГŸ (fox) becomes ТŸЛГŸЛЕР (foxes). Such synharmonic agreement also operates in grammatical cases, verb formations, etc.

B. Pronouns. Khakass employs six pronouns, as detailed in Table 1 below.

Table 1 -

C. Grammatical Cases. There are ten grammatical cases in Khakass, including the nominative. The chart below gives their Khakass names and English equivalents, as well as their functions and associated affixes (see Table 2 below).

Table 2 -

D. Verbs. Khakass verbs may appear opaque and complex to the uninitiated, but they have the capacity to convey a wealth of information beyond the verb’s intrinsic meaning. There are four tenses in Khakass: past, immediate past, present and future. Additionally, a host of affixes are used to denote perfective and imperfective aspect, active and passive voice, the conditional, actions not witnessed by the speaker, and other situations. Affixes are often combined on a single verb, with up to five added to one verb, to communicate complex information concisely. Auxiliary verbs are also common in Khakass.
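The vowel agreement described under item A can be illustrated with a short Python sketch that picks the hard or soft variant of an affix from the last vowel of the stem. It is a simplified illustration, not a full morphological model: the function names are hypothetical, the soft-vowel set is assumed to include Ÿ (because ТŸЛГŸ takes the soft affix ЛЕР in the example), and the choice of the affix-initial consonant (Т vs. Л) is not modelled; the two variants are simply passed in by the caller.

```python
# Hard and soft vowels as listed under item A. Latin lookalike letters are
# included in the soft set because Khakass text is often typed with them
# (an assumption of this sketch).
HARD_VOWELS = set("АОУЫ")
SOFT_VOWELS = set("ЕИІӦӰЭ") | set("IÖŸ")


def harmony_class(word):
    """Return 'hard' or 'soft' according to the last vowel of the word."""
    for char in reversed(word.upper()):
        if char in HARD_VOWELS:
            return "hard"
        if char in SOFT_VOWELS:
            return "soft"
    raise ValueError(f"no known vowel in {word!r}")


def attach_affix(word, hard_variant, soft_variant):
    """Append the affix variant that agrees with the stem's vowels."""
    if harmony_class(word) == "hard":
        return word + hard_variant
    return word + soft_variant


print(attach_affix("АТ", "ТАР", "ТЕР"))      # horse -> horses
print(attach_affix("ТŸЛГŸ", "ЛАР", "ЛЕР"))   # fox -> foxes
```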

Peculiarities of Dialect Terminology and its Representation in the Khakass-English Thesaurus

Dialects are a very significant source not only for studying the history of a language but also for enriching its vocabulary and supporting the native speakers of these dialects. At the same time, dialect vocabulary is often neglected in bilingual dictionaries as not relevant enough, which is a common feature of Indigenous-Russian and Russian-Indigenous dictionaries. Tsyrenov (2013) analysed the representation of dialectisms in Mongolian dictionaries and concluded that their number is quite insignificant in the Mongolian-Russian and Kalmyk-Russian dictionaries, whereas in the three editions of the Buryat-Russian dictionary dialectisms are included as entry words in much larger numbers, ranging from 9% in the first edition to 3.5% in the last. Dialectal words in dictionaries are mostly lexical dialectisms and ethnographisms (Tsyrenov, 2013).

Owing to a long tradition of banning dialects, far fewer people continue to use them in Khakassia.

There are four dialects in the Khakass language: Kachin, Sagai, Kyzyl and Shor. The literary form of the Khakass language, established in the early 1920s, is mainly based on two principal dialects, Sagai and Kachin.

Khakass dialectology as a research field is relatively young, and many problems await new investigations and solutions. During the last four decades, some special research has been done on the dialect base of the Khakass literary language, its use in different domains and districts, and its importance for the development of the modern Khakass language. The objective is to collect additional dialectological data to enrich the modern Khakass literary language (Subrakova, 2011).

The dialectal variety of the vocabulary has become the subject of additional research during the preparation of the Khakass-English thesaurus “Kizi / Person”. In the dialects of the Khakass language a number of terms have distinctive features, so they were marked with differential signs of dialect speech at the phonetic and lexical levels according to the principles of general dialectology.

Features of the terms at the phonetic level

1. The absence of the vowels o, oo, ö, öö is characteristic of the Sagai and Shor dialects. Instead of these vowels, u, uu, ü, üü are used: hol (lit.) – hul (dial.) “hand”, choon (lit.) – chuun (dial.) “large”, söök (lit.) – süük (dial.) “bone”, mojyn (lit.) – mujyn (dial.) “neck”, ökpe (lit.) – ükpe (dial.) “lungs”.

2. In the Sagai dialect, instead of e, ee, both at the beginning of the word and in the initial syllable i, ii appear: em (lit., Kach.) – im (Sag.) “suck breast”.

3. The vowel i at the beginning of the word or in the first syllable corresponds to the vowel e in the Kyzyl and Shor dialects: pizhe (lit.) – pezhe (Kyz.) – pezhe (Shor.) “elder sister”; kilin (lit.) – kelin (Kyz., Shor.) “daughter-in-law”; irin (lit.) – erin (Kyz., Shor.) “lip”; imzhek (lit.) – emzhek (Kyz.) – emzhek (Shor.) “breast”; ir (lit.) – er (Kyz., Shor.) “man”.

4. In the Sagai dialect и is used at the beginning of the word instead of the vowel i: iӌе (lit.) – иӌе (Sag.) “mother”.

5. In the Khakass literary language the actual Khakass sh and zh are absent. The consonants zh and sh occur in the Shor and Kyzyl dialects and in the Old-Ijus dialect of the Kachin dialect. In these dialects, sh replaces the consonant s: the Shor dialect has haryndash, ahshy, ishti, shala, kökshi, shash instead of the literary haryndas “relative”, ahsy “mouth”, isti “belly”, salaa “finger”, köksi “chest”, sas “hair”. In the Kyzyl dialect, sh is used instead of the literary ch. The sound sh can occur inside the word and in affixes, where it also corresponds to ch in the literary language: erepshi, shon, palyhshi, ipchshi instead of the literary irepchi “spouse”, chon “people”, palyhchy “fisherman”, ipchi “woman”.

The presence of the sound zh is a purely internal feature of the Kyzyl dialect. It is used instead of the voiced consonant ch of the literary language (Patachakova, 1995): izhe, pizhe, aalzhy, hyzyzhah, azha, corresponding to the literary izhe “mother”, pizhe “sister”, aalzhy “guest”, hyzyzhah “girl”, azha “senior brother”.

In the Shor dialect, zh is found mainly in the intervocalic position: kizhi, azhah instead of the literary kizi “man”, azah “leg”.

Thus, the above linguistic material makes it possible to distinguish the following correspondences of vowels and consonants between the dialects and the literary language: u, uu (Sag., Shor.) – o, oo (lit.); ü, üü (Sag., Shor.) – ö, öö (lit.); i (Sag.) – e (lit.); e (Shor., Kyz.) – i (lit.); и (Sag.) – i (lit.); sh (Shor.) – s (lit.); zh (Kyz.) – zh (lit.); zh (Shor.) – z (lit.); sh (Kyz.) – ch (lit.).
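The first of these correspondences, the replacement of literary o, oo, ö, öö by u, uu, ü, üü in the Sagai and Shor dialects, is regular enough to be sketched as a character mapping over the transliterated forms. The sketch below covers only this vowel rule; the function name is hypothetical, and the other correspondences, which depend on position in the word or on the dialect, are deliberately left out.

```python
# Literary -> Sagai/Shor vowel correspondences (feature 1 above).
# Long vowels are written as doubled letters, so mapping single
# characters also covers oo -> uu and öö -> üü.
SAGAI_SHOR_VOWELS = str.maketrans({"o": "u", "ö": "ü"})


def to_sagai_shor(literary_form):
    """Derive the expected Sagai/Shor form of a transliterated literary word."""
    return literary_form.translate(SAGAI_SHOR_VOWELS)


for word in ("hol", "choon", "söök", "mojyn"):
    print(word, "->", to_sagai_shor(word))
```

A mapping of this kind could help a dictionary editor generate candidate dialect forms for labelling, though each generated form would still need to be checked against attested dialect data.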

Features of terms at the lexical level

Today Khakass dialects are used less, and many dialect words are going out of use, but they still have the potential to enrich the literary language further, mainly in the field of vocabulary, with all their semantic richness and diversity.

It is possible to classify dialect words by thematic groups, such as names of plants, kinship terms, natural phenomena, etc. Some Khakass kinship terms are also dialectal. For example, the paternal grandfather is ulug agha in the literary language, while the dialects use other nominations: ulaghang, tai agha, taiduk, kir aba, agha.

Lexically parallel words taken from the two main dialects (Sagai and Kachin) have acquired the status of synonyms in the literary language. It is this category of dialectisms that was, as a rule, included in the dictionary entries of the electronic Khakass-English thesaurus. Other dialect vocabulary, where its inclusion is necessary, is accompanied by appropriate labels.


The development of electronic lexicography gives the Indigenous languages of Siberia great new opportunities to be more widely included in interactive specialized bilingual dictionaries as sources of information and knowledge on many aspects of the national worldview encoded in language signs. Increased interest in creating bilingual electronic dictionaries with both Indigenous languages and English as their components reflects current tendencies to support Indigenous languages as a valuable cultural heritage, on the one hand, and the spread of English as a second or foreign language, on the other. This helps to raise the attractiveness of learning and using native languages in the educational space of multiethnic states where English is the dominant or compulsory foreign language at school.

When creating electronic bilingual dictionaries with an Indigenous-language component, it is necessary to take into account the expectations and interests of their potential users and to meet the urgent social demand associated with the need to preserve and develop the live use of these languages. The e-dictionaries we created include vocabulary that forms the basis both of everyday conversation and of the conceptualization of the image of the world. All parts of speech are represented. The search for the most accurate translation required knowledge of both languages and an understanding of their ethnocultural specifics. The difference in the semantic volume of nominations is “erased” in the definition of an object, which gives not only the equivalent but also the literal translation, with appropriate labels or additional clarifications. The use of computer programs in e-dictionaries speeds up the search for an equivalent and provides audio support and visualization of ethnocultural artefacts.

The main expected effect of the implementation of our projects is social in nature and is associated with the optimization of teaching methods for native and foreign languages, as well as increasing the prestige and motivation of studying the native language by the Khakass youth, enhancing the image of Katanov Khakass State University as a leading research centre in the field of Khakass philology and intercultural communication.

The potential features of the developed model of the electronic Khakass-English thesaurus and Khakass-Russian-English dictionary allow intensifying innovative approaches in lexicographic work and making it more contemporary and accurate. The speed of word search in the developed computer model is high, and the accuracy of the translation and pronunciation is ensured by the participation of native speakers of both languages.


The research was conducted with the financial support of RFBR, Project No. 20-012-00426.


