Statistical methods of feature engineering for the problem of forest state classification using satellite data
DOI:
https://doi.org/10.20535/SRIT.2308-8893.2024.1.07
Keywords:
Sentinel-2, vegetation indices, Bhattacharyya distance, feature engineering, greedy algorithms, Spearman’s rank correlation coefficient
Abstract
Timely detection of forest diseases is important for preventing them and limiting their spread. Satellite imagery makes large-scale forest monitoring possible, and machine learning models can automate the analysis of these data to detect anomalies that indicate disease. Selecting informative features, however, is key to building an effective model. This work investigates the use of the Bhattacharyya distance and Spearman’s rank correlation coefficient for selecting features from satellite images. A greedy algorithm is applied to form a subset of weakly correlated features. The experiment showed that the selected features improve classification quality compared with using all spectral bands. The proposed approach is effective for selecting informative, weakly correlated features and can be applied to other remote sensing tasks.
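The selection procedure summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on several assumptions that the abstract does not state: each feature is modelled as a univariate Gaussian within each class when computing the Bhattacharyya distance, features are ranked by that distance, and a candidate is accepted only if its absolute Spearman correlation with every already selected feature does not exceed a threshold. The function names, the threshold value of 0.8, and the toy feature values are all hypothetical.

```python
import math
from statistics import mean, pvariance


def bhattacharyya_gaussian(a, b):
    """Bhattacharyya distance between two samples, each modelled as a 1-D Gaussian.

    Assumption: both samples have non-zero variance.
    """
    mu_a, mu_b = mean(a), mean(b)
    var_a, var_b = pvariance(a), pvariance(b)
    mean_term = 0.25 * (mu_a - mu_b) ** 2 / (var_a + var_b)
    var_term = 0.5 * math.log((var_a + var_b) / (2.0 * math.sqrt(var_a * var_b)))
    return mean_term + var_term


def average_ranks(values):
    """Ranks starting at 1; tied values receive the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's rho: Pearson correlation of the rank-transformed samples.

    Assumption: neither sample is constant.
    """
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((p - mx) * (q - my) for p, q in zip(rx, ry))
    sx = math.sqrt(sum((p - mx) ** 2 for p in rx))
    sy = math.sqrt(sum((q - my) ** 2 for q in ry))
    return cov / (sx * sy)


def greedy_select(features, labels, max_abs_rho=0.8):
    """Greedy selection of informative, weakly correlated features.

    features: dict mapping a feature name to a list of per-pixel values.
    labels:   list of class labels, 0 or 1 (two-class case only).
    max_abs_rho: hypothetical correlation threshold, not taken from the paper.
    """
    def separability(name):
        column = features[name]
        class0 = [v for v, c in zip(column, labels) if c == 0]
        class1 = [v for v, c in zip(column, labels) if c == 1]
        return bhattacharyya_gaussian(class0, class1)

    # Visit the most separable features first.
    ranked = sorted(features, key=separability, reverse=True)
    selected = []
    for name in ranked:
        # Accept a feature only if it is weakly correlated with all accepted ones.
        if all(abs(spearman(features[name], features[s])) <= max_abs_rho
               for s in selected):
            selected.append(name)
    return selected


# Toy data (invented for illustration): two classes, three candidate features.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
features = {
    "ndvi": [0.80, 0.82, 0.78, 0.81, 0.40, 0.45, 0.38, 0.42],
    # A rescaled copy of "ndvi": equally informative but fully rank-correlated.
    "ndvi_scaled": [1.60, 1.64, 1.56, 1.62, 0.80, 0.90, 0.76, 0.84],
    "noise": [0.5, 0.1, 0.9, 0.3, 0.2, 0.8, 0.4, 0.6],
}
selected = greedy_select(features, labels)
print(selected)
```

In this toy example only one of the two perfectly rank-correlated features is kept, which is the redundancy-removal effect the greedy step is meant to provide. A correlation filter alone does not exclude weakly informative features such as the noise column, so a practical pipeline would likely also apply a minimum-distance cutoff or a limit on the number of selected features; whether the paper does so is not stated in the abstract.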