Comparison of the effectiveness of machine learning classifiers in the context of voice biometrics
DOI: https://doi.org/10.20535/SRIT.2308-8893.2019.4.08

Keywords: voice biometrics, MFCC, classifier comparison, k-nearest neighbours, machine learning, artificial intelligence

Abstract
The purpose of this work was to compare seven popular classifiers from the Python-based scikit-learn library in terms of the performance of a voice biometrics system. The MFCC (Mel-Frequency Cepstral Coefficients) method was used to compute feature vectors from the voice of the person undergoing verification. The classifiers involved in this study are the following: K-NN (k-nearest neighbours classifier), MLP (multilayer perceptron), SVM (support vector machine), DTC (decision tree classifier), GNB (Gaussian naive Bayes classifier), ABC (AdaBoost classifier), and RFC (random forest classifier). As data, we used voice samples from 40 individuals with an average duration of 9 minutes per person. The performance criteria for the classifiers were dictated by the needs of voice biometrics systems; accordingly, fraud attempts were simulated during authentication. The most effective classifier for voice recognition was K-NN, which, with zero incorrectly admitted persons, provided 3–85% better verification accuracy than the other classifiers.
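The abstract does not spell out the full pipeline, but a minimal sketch of the described setup, using python_speech_features for MFCC extraction and scikit-learn for the seven classifiers, might look as follows. The directory layout (voice_samples/<speaker_id>/*.wav), the averaging of frame-level MFCCs into one vector per recording, and all hyperparameters are illustrative assumptions rather than the authors' settings.

```python
import numpy as np
from pathlib import Path
from scipy.io import wavfile
from python_speech_features import mfcc
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.ensemble import AdaBoostClassifier, RandomForestClassifier

# Hypothetical data layout: voice_samples/<speaker_id>/*.wav
DATA_DIR = Path("voice_samples")


def recording_features(wav_path):
    """Return one MFCC-based feature vector for a single recording.

    Frame-level MFCCs are averaged into a fixed-length vector;
    this is a simplification, not necessarily the authors' choice.
    """
    rate, signal = wavfile.read(wav_path)
    if signal.ndim > 1:            # keep one channel if the file is stereo
        signal = signal[:, 0]
    frames = mfcc(signal, samplerate=rate, numcep=13)
    return frames.mean(axis=0)


# Build the feature matrix X and speaker labels y
X, y = [], []
for speaker_dir in sorted(DATA_DIR.iterdir()):
    if not speaker_dir.is_dir():
        continue
    for wav_path in speaker_dir.glob("*.wav"):
        X.append(recording_features(wav_path))
        y.append(speaker_dir.name)
X, y = np.array(X), np.array(y)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

# The seven classifiers compared in the paper (default-ish settings assumed)
classifiers = {
    "K-NN": KNeighborsClassifier(n_neighbors=5),
    "MLP": MLPClassifier(max_iter=1000),
    "SVM": SVC(),
    "DTC": DecisionTreeClassifier(),
    "GNB": GaussianNB(),
    "ABC": AdaBoostClassifier(),
    "RFC": RandomForestClassifier(),
}

for name, clf in classifiers.items():
    clf.fit(X_train, y_train)
    print(f"{name}: accuracy = {clf.score(X_test, y_test):.3f}")
```

Note that a plain accuracy score is only part of the picture: in the study, behaviour under simulated fraud attempts, in particular the number of incorrectly admitted persons, was the decisive criterion for ranking the classifiers.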
References

Pindrop 2018 voice intelligence report. — Available at: https://www.pindrop.com/2018-voice-intelligence-report/ (accessed: 11.11.2019).
Classifier comparison. — Available at: https://scikit-learn.org/stable/auto_examples/classification/plot_classifier_comparison.html (accessed: 11.11.2019).
Zakharov V. Tendentsiyi vykorystannja v dijal'nosti pravookhoronnykh orhaniv biometrychnykh tekhnolohij, jaki ne vkhodjat' do "tr'okh velykykh biometryk" / V. Zakharov, O. Zachek // Nauk. visn. L'viv. derzh. un-tu vnutrishnikh sprav. Serija jurydychna. — 2015. — № 2. — S. 285–291.
Kumchenko Ju.O. Informatsijna tekhnolohija identyfikatsiyi personalu na osnovi kompleksu biometrychnykh parametriv : dys. … kand. tekhn. nauk: 05.13.06 / Ju.O. Kumchenko. — Herson, 2017. — 129 s.
Mjasishchev O. Holosove keruvannja viddalenymy prystrojamy cherez merezhu internet / O. Mjasishchev, I. Muljar // Zb. nauk. pr. Vijs'k. in-tu Kyyiv. nats. un-tu imeni Tarasa Shevchenka. — 2017. — № 55. — S. 62–71.
Shcherbakov Ye.Ju. Zastosuvannja matematychnykh modelej dlja holosovoyi identyfikatsiyi sub’yektiv u sferi finansovoyi bezpeky / Ye.Ju. Shcherbakov // Nejronechitki tekhnolohiyi modeljuvannja v ekonomitsi. — 2017. — № 6. — S. 158–190.
Shah H.N.M. Biometric Voice Recognition in Security System / H.N.M. Shah, M.Z. Ab Rashid // Indian Journal of Science and Technology. — 2014. — Vol. 7, N 1. — P. 104–112.
An Overview and Analysis of Voice Authentication Methods. — Available at: https://www.semanticscholar.org/paper/An-Overview-and-Analysis-of-Voice-Authentication-Shoup-Talkar/572af444f0382b8e7e156ab36192da95a3b8dec4 (accessed: 11.11.2019).
Dejavu: Audio Fingerprinting and Recognition in Python. — Available at: https://github.com/worldveil/dejavu (accessed: 11.11.2019).
Martinez J. Speaker recognition using Mel frequency Cepstral Coefficients (MFCC) and Vector quantization (VQ) techniques / J. Martinez, H. Perez, E. Escamilla // CONIELECOMP 2012, 22nd International Conference on Electrical Communications and Computers. — 2012. — N 1. — P. 248–251. — DOI: 10.1109/CONIELECOMP.2012.6189918
Kelly A. The Effects of Windowing on the Calculation of MFCCs for Different Types of Speech Sounds / A. Kelly, C. Gobl // Advances in Nonlinear Speech Processing. NOLISP 2011. — Vol. 7015. — 2011.
Welcome to python_speech_features's documentation! — Available at: https://python-speech-features.readthedocs.io/en/latest/ (accessed: 11.11.2019).
Mel frequency cepstral coefficient (MFCC) tutorial. — Available at: http://www.practicalcryptography.com/miscellaneous/machine-learning/guide-mel-frequency-cepstral-coefficients-mfccs/ (accessed: 11.11.2019).
Open Speech and Language Resources. — Available at: http://www.openslr.org/12 (accessed: 11.11.2019).