Big Data analysis via model reduction methods

Authors

  • Stanislav I. Zabielin Educational and Scientific Complex "Institute for Applied System Analysis" of the National Technical University of Ukraine "Igor Sikorsky Kyiv Polytechnic Institute", Kyiv, Ukraine https://orcid.org/0000-0003-2178-7415

DOI:

https://doi.org/10.20535/SRIT.2308-8893.2018.2.04

Keywords:

nonlinear mapping, dimension reduction, big data, modelling, nonlinear dynamic objects, diffusion maps, kernel principal component analysis

Abstract

Data volumes have grown enormously in recent years, and this growth is a key factor in the Big Data scenario. Big Data requires new high-performance processing. This paper reviews the use of data preprocessing methods for data mining in big data. It introduces the definition, attributes, and categorization of data preprocessing approaches in big data. The relationship between big data and data preprocessing is analyzed across all families of methods and advanced data technologies. Research challenges are also discussed, with a focus on improvements in certain families of data preprocessing methods and on applications based on new big data learning paradigms.

Author Biography

Stanislav I. Zabielin, Educational and Scientific Complex "Institute for Applied System Analysis" of the National Technical University of Ukraine "Igor Sikorsky Kyiv Polytechnic Institute", Kyiv

Zabelin Stanislav Igorevich, a Ph.D. student at the Educational and Scientific Complex "Institute for Applied System Analysis" of the National Technical University of Ukraine "Igor Sikorsky Kyiv Polytechnic Institute", Kyiv, Ukraine.

Published

2018-06-20

Section

Progressive information technologies, high-efficiency computer systems