Data mining tools for complex socio-economic processes and systems
DOI:
https://doi.org/10.20535/SRIT.2308-8893.2022.4.06
Keywords:
data mining, complex socio-economic systems, predictive modeling, neural networks, deep learning
Abstract
The paper addresses the discovery of new and potentially useful information in large amounts of data, a task that makes it pressing to develop data mining tools for complex socio-economic processes and systems that are based on the principles of the digital economy and processed using network applications. The stages of data mining for complex socio-economic processes and systems are outlined, and the data mining algorithm is considered. Earlier data mining stages were limited to the model-building process; more powerful computer technology and free access to large amounts of multidimensional data make it possible to extend them. The stages now available for complex socio-economic processes and systems include processes that facilitate data preparation, the evaluation and visualization of models, and deep learning. Data mining tools for complex socio-economic processes and systems are identified in the context of technological progress and the big data paradigm. The data processing cycle is investigated: it consists of a series of steps that starts with the input of raw data and ends with the output of useful information. The knowledge obtained at the data processing stage is the basis for creating models of complex socio-economic processes and systems. Two types of models that can be created in the data mining process, descriptive and predictive, are outlined. Algorithms for estimating and analyzing data for modeling complex socio-economic processes and systems in accordance with a pre-set task are determined. The efficiency of introducing neural networks and deep learning methods into data mining is analyzed. It is determined that these methods allow effective analysis and use of existing large data sets for operational human resources management and strategic planning of complex socio-economic processes and systems.
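The data processing cycle and the two model types named in the abstract can be illustrated with a minimal sketch. It is not the paper's algorithm: the function names, the toy indicator series, and the choice of an ordinary least-squares trend line as the predictive model are all assumptions made for illustration, and only the Python standard library is used.

```python
from statistics import mean


def clean(raw):
    """Preparation step: drop records with a missing value."""
    return [(x, y) for x, y in raw if x is not None and y is not None]


def describe(records):
    """Descriptive model: summarize what the data already shows."""
    ys = [y for _, y in records]
    return {"count": len(ys), "mean": mean(ys), "min": min(ys), "max": max(ys)}


def fit_trend(records):
    """Predictive model (assumed here): least-squares line y = a + b * x."""
    xs = [x for x, _ in records]
    ys = [y for _, y in records]
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in records)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return lambda x: intercept + slope * x


# Hypothetical raw input: (period index, indicator value); None marks a gap.
raw = [(1, 10.0), (2, 12.0), (3, None), (4, 16.0), (5, 18.0)]

records = clean(raw)           # raw data -> prepared data
summary = describe(records)    # descriptive output
predict = fit_trend(records)   # predictive model
forecast = predict(6)          # useful information for the next period

print(summary)
print(forecast)
```

The descriptive step only reports properties of the observed records, while the predictive step extrapolates to a period that is not in the data. In practice the trend line would be replaced by whatever model suits the pre-set task, such as a neural network.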