Information technology of data clustering in the time interval of observation
Abstract
Cluster analysis is an important task in data mining. Clustering techniques make it possible to understand the structure of multidimensional data; to simplify further processing by applying a different method of analysis to each cluster; to reduce the original data sample, keeping only the most typical representatives of each group; to detect novel, atypical objects that cannot be assigned to any of the classes; and to formulate or test hypotheses based on the results. This article proposes a new approach to identifying groups of objects that are similar to one another with respect to a set of features that change over time. An information technology for assessing clustering quality and improving clustering stability has been developed. The results of applying the proposed technology to hydrochemical monitoring data for water bodies in an area with a high technogenic load are presented.