Applying Machine Learning To Statistical R&D Costs Accounting
Our machine-learning analysis shows that a number of research projects are not identified as such in the Unified Information System in the field of procurement and are instead classified as services. This leads to a statistical underestimation of internal expenditures on research and development (IR&D), which in turn understates the system of performance indicators for the country's scientific and technological development. We propose a method that uses machine learning and natural language processing to search for purchases placed in the Unified Information System that either carry no direct indication of an R&D purchase or are directly marked as a purchase of services, yet have a procurement subject corresponding to R&D. Applying the model increases the recorded volume of R&D-related purchases made through the system fourfold (to 421 billion rubles) and increases the recorded volume of such purchases ordered by business structures elevenfold. To simplify the accounting of IR&D and to support regular monitoring of various sources for unaccounted IR&D, a dedicated algorithm for the search and classification of works needs to be developed and implemented.
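The search described in the abstract can be illustrated with a minimal sketch. The abstract does not name the model that was used, so the multinomial Naive Bayes classifier below is a stand-in chosen for brevity, not the authors' method. The field names (`id`, `declared_type`, `subject`), the labels (`rd`, `service`), and the English toy records are all hypothetical; real procurement notices would be in Russian and would need a far larger labelled sample.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


class NaiveBayes:
    """Multinomial Naive Bayes with Laplace smoothing (illustrative stand-in)."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha
        self.word_counts = defaultdict(Counter)  # label -> token counts
        self.doc_counts = Counter()              # label -> number of documents
        self.vocab = set()

    def fit(self, texts, labels):
        for text, label in zip(texts, labels):
            self.doc_counts[label] += 1
            for tok in tokenize(text):
                self.word_counts[label][tok] += 1
                self.vocab.add(tok)
        return self

    def predict(self, text):
        total_docs = sum(self.doc_counts.values())
        scores = {}
        for label in self.doc_counts:
            # Log prior plus smoothed log likelihood of each known token.
            score = math.log(self.doc_counts[label] / total_docs)
            denom = sum(self.word_counts[label].values()) + self.alpha * len(self.vocab)
            for tok in tokenize(text):
                if tok in self.vocab:
                    score += math.log((self.word_counts[label][tok] + self.alpha) / denom)
            scores[label] = score
        return max(scores, key=scores.get)


def flag_hidden_rd(purchases, model):
    """Return purchases not declared as R&D whose subject text looks like R&D."""
    return [
        p for p in purchases
        if p["declared_type"] != "rd" and model.predict(p["subject"]) == "rd"
    ]


# Hypothetical labelled examples, invented for illustration only.
train_texts = [
    "development of a prototype algorithm for seismic data analysis",
    "research on new composite materials and experimental testing",
    "scientific study and development of a forecasting model",
    "cleaning services for office premises",
    "maintenance and repair of elevator equipment",
    "delivery of office paper and stationery",
]
train_labels = ["rd", "rd", "rd", "service", "service", "service"]
model = NaiveBayes().fit(train_texts, train_labels)

# Hypothetical procurement records.
purchases = [
    {"id": 1, "declared_type": "service",
     "subject": "services for experimental research and development of a prototype model"},
    {"id": 2, "declared_type": "service",
     "subject": "cleaning and maintenance of office premises"},
    {"id": 3, "declared_type": "rd",
     "subject": "research on composite materials"},
]

flagged = flag_hidden_rd(purchases, model)
print([p["id"] for p in flagged])
```

The filter only considers records whose declared type is not R&D, which mirrors the two cases named in the abstract: purchases with no direct R&D indication and purchases declared as services. Summing the contract values of the flagged records would then give the additional volume to be counted toward IR&D.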
This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.