Abstract
The article presents preliminary results of a systematic study of the functioning and semantic peculiarities of general scientific (academic) vocabulary in biomedical scientific discourse, compared with the discourse of the humanities and social sciences. For this study, a corpus of scientific texts on medicine and biology consisting of 5,484,665 running words was specially compiled. The article provides a comparative analysis of the frequency of the academic vocabulary units (the 10 most common verbs, adjectives and nouns) most often used in this type of scientific discourse and of the frequency of the same units in texts of the humanities and social sciences (according to the well-known «Academic Vocabulary List» corpus by D. Gardner and M. Davies). The statistical analysis of the frequency of general scientific vocabulary is supplemented by a study of the frequency and distribution of collocations characteristic of its individual units and by a qualitative analysis of changes in their semantics due to the discourse type. The particular example of the general scientific noun
Keywords: Academic vocabulary, corpus linguistics, corpus, biomedical discourse
Introduction
Before the advent of corpus linguistics, a significant part of academic vocabulary was considered common to all fields of scientific knowledge. This view was expressed in the term "general scientific", which was established for it in the national tradition (the terms "general scientific" and "academic" are used synonymously in this article). The methods of corpus linguistics have made it obvious that disciplinary differences in the functioning of general scientific vocabulary not only exist but are manifested both in the frequency of use of units in different fields of knowledge and in changes in their semantics and syntactic behaviour. As Hyland and Tse (2007) rightly point out, although the same words are used in texts of completely different sciences, “all disciplines adapt words to their own ends, displaying considerable creativity in both shaping words and combining them with others to convey specific, theory-laden meanings associated with disciplinary models and concepts” (p. 240). Accordingly, the interest of corpus linguistics has shifted from the functioning of individual vocabulary units in academic discourse to the frequency and distribution of the collocations inherent in general scientific vocabulary (Ackermann & Chen, 2013; Biber et al., 2004; Hyland, 2008; Hyland, 2012).
Problem Statement
The vocabulary of academic discourse is divided into a) terms, b) words and collocations that are not specifically scientific and occur in any speech style (function words and everyday vocabulary) and c) general scientific vocabulary. Because of its functional and semantic features, general scientific vocabulary is the most difficult for students of non-linguistic faculties of higher educational institutions to master (Polubichenko, 2019). This accounts for the increased attention that linguists have paid to the vocabulary of the language of science in recent years, first of all from the point of view of teaching a foreign language for specific purposes in non-linguistic faculties, translating narrowly disciplinary academic literature, and compiling bilingual dictionaries.
Research Questions
1. Are there differences in the frequency of use and distribution of general scientific vocabulary in different types of scientific discourse (as exemplified by texts of the biomedical sciences, the humanities and the social sciences)? How significant are they?
2. Are there qualitative differences in the combinability and semantics of general scientific vocabulary in the varieties of scientific discourse under consideration?
Purpose of the Study
The purpose of the study is to test the hypothesis that the vocabulary called general scientific in the national linguistic tradition is not common to discourses of different disciplinary orientations. On the contrary, it can show disciplinary specificity in both quantitative (frequency and distribution) and qualitative (collocations and semantics) terms. The article reports the progress and preliminary results of the study.
Research Methods
The study was conducted using corpus linguistics methods. In 2004, the British lexicographer and corpus linguist Kilgarriff and the Czech specialist in computer processing of natural language Rychlý created the corpus query system Sketch Engine (Kilgarriff et al., 2004). Because “for language learning and teaching, smaller corpora can be more useful as they are designed to represent the specific part of the language under investigation” (Mudraya, 2006, p. 236), a corpus of scientific texts on biomedical subjects (hereinafter BIOMED), consisting of 5,484,665 running words, was specially compiled on the basis of Sketch Engine. The material for the corpus comprised 872 scientific articles of different types (research article, review article, clinical investigation article) from narrowly specialized professional journals. The material was selected for textual authenticity: 50% of the material included in the corpus was written by scientists from the UK, 30% by scientists from the USA and 20% by scientists from Australia, Canada and New Zealand. The corpus includes 21 subcorpora, each representing one of the main biomedical specializations (biochemistry, biophysics, biotechnology / bioengineering, botany, cardiology, cell biology, zoology, etc.). The corpus is well balanced: each subcorpus accounts for approximately 4.8% of the total corpus volume and contains between 250 and 270 thousand running words.
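The balance figures reported above can be checked with simple arithmetic. The following is a minimal sketch that uses only the totals given in the text; the `is_balanced` helper and its default bounds are illustrative assumptions, since the individual subcorpus sizes are not listed in the article.

```python
# Figures reported in the text.
TOTAL_WORDS = 5_484_665   # size of the BIOMED corpus
N_SUBCORPORA = 21         # number of subcorpora

# Mean subcorpus size and the share of the corpus one subcorpus takes
# if all subcorpora were exactly equal in size.
mean_size = TOTAL_WORDS / N_SUBCORPORA
mean_share = 100 / N_SUBCORPORA


def is_balanced(sizes, low=250_000, high=270_000):
    """Return True if every subcorpus size lies within [low, high].

    Hypothetical helper: the bounds mirror the 250-270 thousand range
    stated in the text.
    """
    return all(low <= size <= high for size in sizes)


print(f"mean subcorpus size: {mean_size:,.0f} running words")
print(f"mean share of corpus: {mean_share:.1f} %")
```

The mean size falls inside the stated 250 to 270 thousand range, and the mean share rounds to the 4.8% given in the text.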
The second stage of the study was keyword selection using the Keywords and Terms function built into Sketch Engine, which compares the frequency of units in the corpus with their frequency in a reference corpus. For this study, the reference corpus was English Web 2013 (EnTenTen13) (Jakubíček et al., 2013). One of the defining characteristics of academic vocabulary is its high frequency in scientific discourse, which makes the Sketch Engine Keywords and Terms function appropriate. After highly specialized and terminological vocabulary had been excluded and the results had been checked against the academic vocabulary lists of Coxhead (2000) and Gardner and Davies (2014), a list of 258 units of general scientific vocabulary was compiled. These units are attested in the BIOMED corpus and belong to three parts of speech (94 verbs, 121 nouns and 43 adjectives). During lemmatisation, homonymous forms belonging to different parts of speech, which are typical of English, were disambiguated. The list included only academic vocabulary units that occur at least 5 times in each BIOMED subcorpus.
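The two filtering ideas in this stage can be sketched as follows. This is an illustration under stated assumptions, not the authors' procedure: the keyness score shown is the "simple maths" ratio documented for Sketch Engine, but the article does not say which settings were used, and all function names, lemmas and counts below are hypothetical.

```python
def per_million(count, corpus_size):
    """Convert an absolute count to a frequency per million words."""
    return count * 1_000_000 / corpus_size


def keyness(focus_count, focus_size, ref_count, ref_size, n=1.0):
    """'Simple maths' keyness: (fpm_focus + n) / (fpm_ref + n).

    The smoothing constant n=1.0 is an assumed default, not a value
    taken from the article.
    """
    focus_fpm = per_million(focus_count, focus_size)
    ref_fpm = per_million(ref_count, ref_size)
    return (focus_fpm + n) / (ref_fpm + n)


def occurs_in_every_subcorpus(counts_by_subcorpus, min_count=5):
    """Apply the 'at least 5 times in each subcorpus' criterion."""
    return all(count >= min_count for count in counts_by_subcorpus.values())


# Hypothetical counts for two candidate lemmas in three subcorpora.
candidates = {
    "response": {"cardiology": 120, "botany": 35, "zoology": 48},
    "rare_unit": {"cardiology": 60, "botany": 2, "zoology": 9},
}

kept = [
    lemma
    for lemma, counts in candidates.items()
    if occurs_in_every_subcorpus(counts)
]
print(kept)
```

A unit that is frequent overall but nearly absent from one subcorpus (here the invented `rare_unit`) is dropped, which is what keeps the final list general rather than specialization-specific.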
The next stage of the study was to obtain information on the frequency and semantics of these general scientific vocabulary units in scientific discourse of different disciplinary orientations. For this purpose, the Academic Vocabulary List (AVL) corpus of 120,032,441 running words was used. It was created by the American researchers Dee Gardner and Mark Davies in 2013 and includes nine groups of texts by scientific discipline. The frequency of the most common general scientific vocabulary units in the BIOMED corpus was compared with their frequency in three AVL subcorpora that do not overlap with it thematically: Social Science, Humanities, and History (for reasons that remain unclear, the creators of the AVL separated history from the Humanities into an independent subcorpus). Because the compared corpora differ in size (BIOMED: 5,484,665 running words; Social Science: 16,720,729; Humanities: 11,111,225; History: 14,289,007), relative rather than absolute frequency was used for comparison. This quantity is statistically stable and makes the comparison independent of the actual corpus size. Absolute frequency was converted into relative frequency using the statistical probability formula:
$$\mathrm{NF} = \frac{\mathrm{AF}}{\mathrm{CS}} \times 1\,000\,000$$
where NF (normalized frequency) is the relative frequency (measured in instances per million, hereinafter ipm), AF (absolute frequency) is the total number of occurrences in the studied corpus, and CS is the corpus size (measured in running words). NF indicates how many times a token would appear in a corpus of one million running words. It allows frequency data on the use of each token to be compared across corpora of different sizes. In this study, tokens are either individual units of academic vocabulary or collocations containing them.
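The conversion defined by NF, AF and CS can be sketched in a few lines. The corpus sizes are those reported in the text; the absolute frequency used in the example is a hypothetical value chosen only to show the effect of corpus size, not a result from the study.

```python
# Corpus sizes in running words, as reported in the text.
CORPUS_SIZES = {
    "BIOMED": 5_484_665,
    "Social Science": 16_720_729,
    "Humanities": 11_111_225,
    "History": 14_289_007,
}


def normalized_frequency(absolute_frequency, corpus_size):
    """Return NF in instances per million (ipm): AF / CS * 1,000,000."""
    if corpus_size <= 0:
        raise ValueError("corpus_size must be positive")
    return absolute_frequency / corpus_size * 1_000_000


# Hypothetical: the same absolute frequency observed in every corpus.
HYPOTHETICAL_AF = 5_000

for name, size in CORPUS_SIZES.items():
    ipm = normalized_frequency(HYPOTHETICAL_AF, size)
    print(f"{name}: {ipm:.2f} ipm")
```

The same raw count yields a much higher ipm value in the smaller BIOMED corpus than in Social Science, which is why absolute frequencies cannot be compared directly across these corpora.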
For the quantitative analysis of the frequency of general scientific vocabulary in texts from different fields of scientific knowledge, the 10 most common academic verbs, adjectives and nouns in the BIOMED corpus were selected. The absolute frequencies of these units in the History, Social Science and Humanities corpora were then determined one by one, and their relative frequencies (ipm) were calculated. The results are shown in Table
Findings
In Table
The data show that the discourses of the biomedical and social disciplines are the most similar in the frequency of the general scientific words studied. Such general scientific vocabulary units as suggest, indicate, identify, assess, study, result, model, factor, difference, significant and similar occur in texts of these fields of scientific knowledge with approximately the same frequency.
The analysis of how academic vocabulary functions in scientific discourse of different disciplinary orientations is illustrated below with the noun response.
The selected common collocations with the word response for different disciplines are presented in the tables (Tables
The comparison of the frequency of common collocations of the general scientific noun
The frequencies of combinations of the word
The data on the frequency of common collocates-verbs of the general scientific noun
Functional-semantic analysis of the disciplinary use of academic vocabulary
The qualitative analysis, namely the semantic and functional analysis of general scientific words depending on the disciplinary field in which they are used, reveals significant differences, as does the statistical analysis. By studying the most frequent collocations for the noun
Thus, almost all collocates-nouns in BIOMED (except
We consider the most interesting cases where the collocation occurs in all discipline groups in order to determine which meanings of the noun
- So although the positive response to the antiviral does point to potential correct clinical diagnosis it is not possible to confirm this. (BIOMED)
- In a November 2009 interview, Bayanouni stressed that the group's suspension of opposition activity was conditional upon a positive response from the regime. (History)
- When my graduate students reported on their completed projects, the parent/student/teacher surveys indicated an overall positive response, and some participants even recommended that the CPLP be expanded to cover other subjects. (Humanities)
- It is only a positive response to these principles by those with extra resources that will ultimately bring life to Africa, as well as communicating the warm message to the world's poor that the world is not such a cruel place after all. (Social Science)
In the first example,
- These data suggest that while initial responses occur quickly, deep responses are associated with longer time on treatment and continue to develop over time. (BIOMED)
- Did al-Qaeda expect such an overwhelming initial response from the United States? What, after all, did Bin Laden think he was going to accomplish strategically by killing thousands of innocent Americans? (History)
- When he asked me why I liked music, my initial response was, "Because it makes me feel...". My friend interrupted me <…> (Humanities)
- Respondents were encouraged to take their time. Once they had made their initial response to a question, a general probe was used to ensure that respondents tried hard to list everything they knew relevant to that question. (Social Science)
In the first context,
The second example realizes the meaning “The initial decisions and actions taken in reaction to a reported incident”, while the third and fourth realize the meaning “a verbal, written, or electronic answer”. The use of
Most notable is the presence of adjectives in biomedical discourse (
Conclusion
This systematic study examines the functioning of academic vocabulary in biomedical discourse in comparison with the discourses of the humanities and social sciences. It relies on corpus linguistics methods, combining statistical methods with qualitative analysis of language units. The study confirms the hypothesis and demonstrates that general scientific vocabulary, like special (terminological) vocabulary, can mark the disciplinary affiliation of a text both in the frequency and distribution of units and in their semantics and specific collocations.
References
- Ackermann, K., & Chen, Y.-H. (2013). Developing the Academic Collocation List (ACL): A corpus-driven and expert-judged approach. Journal of English for Academic Purposes, 12(4), 235–247. https://doi.org/10.1016/j.jeap.2013.08.002
- Biber, D. E., Conrad, S., & Cortes, V. (2004). If you look at ... Lexical bundles in university teaching and textbooks. Applied Linguistics, 25(3), 371–405. https://doi.org/10.1093/applin/25.3.371
- Coxhead, A. (2000). A New Academic Word List. TESOL Quarterly, 34(2), 213–238. https://doi.org/10.2307/3587951
- Gardner, D., & Davies, M. (2014). A New Academic Vocabulary List. Applied Linguistics, 35(3), 305–327. https://doi.org/10.1093/applin/amt015
- Hyland, K. (2008). As can be seen: Lexical bundles and disciplinary variation. English for Specific Purposes, 27(1), 4–21. https://doi.org/10.1016/j.esp.2007.06.001
- Hyland, K. (2012). Bundles in Academic Discourse. Annual Review of Applied Linguistics, 32, 150–169. https://doi.org/10.1017/S0267190512000037
- Hyland, K., & Tse, P. (2007). Is There an “Academic Vocabulary”? TESOL Quarterly, 41(2), 235–253. https://doi.org/10.2307/40264352
- Jakubíček, M., Kilgarriff, A., Kovář, V., Rychlý, P., & Suchomel, V. (2013). The TenTen corpus family. In 7th International Corpus Linguistics Conference CL (pp. 125–127). https://www.sketchengine.eu/wp-content/uploads/The_TenTen_Corpus_2013.pdf
- Kilgarriff, A., Rychlý, P., Smrž, P., & Tugwell, D. (2004). The Sketch Engine. In EURALEX 2004 Proceedings, Lorient, France. http://kilgarriff.co.uk/Publications/2004-KilgRychlySmrzTugwell-SkEEuralex.rtf
- Mudraya, O. (2006). Engineering English: A lexical frequency instruction model. English for Specific Purposes, 25(2), 235–256. https://doi.org/10.1016/j.esp.2005.05.002
- Polubichenko, L. V. (2019). General scientific vocabulary in scientific discourse: New possibilities for research. The Humanities and Social Studies in the Far East, 16(1), 26–30.
Copyright information
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
About this article
Publication Date
31 October 2020
Article Doi
eBook ISBN
978-1-80296-091-4
Publisher
European Publisher
Volume
92
Edition Number
1st Edition
Pages
1-3929
Subjects
Sociolinguistics, linguistics, semantics, discourse analysis, translation, interpretation
Cite this article as:
Polubichenko, L. V., & Beliaeva, T. R. (2020). Discipline-Conditioned Choice And Use Of General Scientific (Academic) Vocabulary. In D. K. Bataev (Ed.), «Social and Cultural Transformations in the Context of Modern Globalism» Dedicated to the 80th Anniversary of Turkayev Hassan Vakhitovich, vol 92. European Proceedings of Social and Behavioural Sciences (pp. 898-907). European Publisher. https://doi.org/10.15405/epsbs.2020.10.05.120