Development of textual analytics tools for analysis of public and specialized sources in the tasks of foresight and system analysis

Authors

  • Volodymyr Savastiyanov Educational and Scientific Complex “Institute for Applied System Analysis” of the National Technical University of Ukraine “Igor Sikorsky Kyiv Polytechnic Institute”, Kyiv, Ukraine https://orcid.org/0000-0002-2052-0420

DOI:

https://doi.org/10.20535/SRIT.2308-8893.2020.4.02

Keywords:

systems analysis, foresight, text mining, NLP, classifiers, ontologies, Open Source, Python, Gensim

Abstract

A combined approach to extracting concepts and constructing classifiers and ontologies with both open-source and proprietary software packages has been developed. Modern approaches, methods, and models for storing large volumes of poorly structured information with open-source software are examined. An ontology was built whose leaves hold a classifier based on Boolean rules, implemented with SAS® Content Categorization software. The ontology itself is built by constructing vectors of related concepts with the Word2Vec model from the open-source Gensim library. A typical algorithm for constructing a classifying ontology has been developed. The results can be used to build ontologies of subject areas, to create classification ontologies, and to annotate text corpora.
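The idea of an ontology whose leaves carry Boolean classification rules can be illustrated with a minimal sketch. It does not reproduce the SAS® Content Categorization rule syntax or the author's actual ontology; the category names, the rule vocabulary, and the helpers `rule`, `tokenize`, `iter_leaves`, and `classify` are all hypothetical and chosen only for illustration.

```python
import re


def rule(all_of=(), any_of=(), none_of=()):
    """Build a Boolean rule: every term in all_of, at least one term
    in any_of (if given), and no term in none_of must be present."""
    all_of, any_of, none_of = set(all_of), set(any_of), set(none_of)

    def matches(tokens):
        return (
            all_of <= tokens
            and (not any_of or bool(any_of & tokens))
            and not (none_of & tokens)
        )

    return matches


def tokenize(text):
    """Lowercase the text and return its set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


# Hypothetical ontology: inner nodes are dicts, leaves are Boolean rules.
ONTOLOGY = {
    "text_analytics": {
        "word_embeddings": rule(all_of=["word2vec"],
                                any_of=["gensim", "embedding", "vectors"]),
        "morphology": rule(any_of=["morphological", "lemmatization"]),
    },
    "infrastructure": {
        "message_queue": rule(any_of=["rabbitmq", "amqp"]),
        "search_index": rule(all_of=["elasticsearch"]),
    },
}


def iter_leaves(node, path=()):
    """Yield (path, rule) pairs for every leaf of the ontology tree."""
    for name, child in node.items():
        if callable(child):
            yield path + (name,), child
        else:
            yield from iter_leaves(child, path + (name,))


def classify(text, ontology):
    """Return the paths of all leaves whose Boolean rule fires on the text."""
    tokens = tokenize(text)
    return ["/".join(path)
            for path, matches in iter_leaves(ontology)
            if matches(tokens)]


if __name__ == "__main__":
    sample = "We trained Word2Vec vectors with Gensim and indexed them in Elasticsearch."
    print(classify(sample, ONTOLOGY))
```

In a workflow like the one the abstract describes, the term lists in each leaf would presumably be seeded from concepts that a trained Word2Vec model ranks as related, rather than written by hand as they are here.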

Author Biography

Volodymyr Savastiyanov, Educational and Scientific Complex “Institute for Applied System Analysis” of the National Technical University of Ukraine “Igor Sikorsky Kyiv Polytechnic Institute”, Kyiv

Volodymyr V. Savastiyanov, a junior researcher at the Educational and Scientific Complex “Institute for Applied System Analysis” of the National Technical University of Ukraine “Igor Sikorsky Kyiv Polytechnic Institute”, Kyiv, Ukraine.

References

I.V. Feskov, “NU “OUA” Basic methods of hybrid warfare in the modern information society”, Current policy issues, vol. 58, pp. 66–76, 2016.

Mikhail Z. Zgurovsky and N.D. Pankratova, System Analysis: Theory and Applications. Springer, 2007.

Articles 164-9, 164-13 Code of Ukraine on Administrative Offenses.

Judge Berzon, “hiQ Labs, Inc. vs. LinkedIn Corporation Opinion”, United States Court of Appeals for the Ninth Circuit, September 9, 2019. Available: http://cdn.ca9.uscourts.gov/datastore/opinions/2019/09/09/17-16783.pdf.

RabbitMQ. Available: https://www.rabbitmq.com.

Elasticsearch. Available: https://www.elastic.co.

DCMI Dublin Core Metadata Initiative. Available: http://dublincore.org.

SIOC Project. Available: http://sioc-project.org.

SKOS Simple Knowledge Organization System. Available: http://www.w3.org/2004/02/skos/.

M. Korobov, “Morphological Analyzer and Generator for Russian and Ukrainian Languages”, Analysis of Images, Social Networks and Texts, pp. 320–332, 2015.

R. Rehurek and P. Sojka, “Software framework for topic modelling with large corpora”, LREC, 2010.

G. Chakraborty, M. Pagolu, and S. Garla, Text Mining and Analysis. Practical Methods, Examples, and Case Studies Using SAS®. SAS Institute Inc., 2013.

Published

2020-12-29

Section

Decision making and control in economic, technical, ecological and social systems