Evaluation of the detection and correction properties of the reference dictionary of the system for checking and correcting orthography

Authors

DOI:

https://doi.org/10.20535/SRIT.2308-8893.2019.2.05

Keywords:

typing errors, spell checking, spelling dictionary, detecting properties, correcting properties

Abstract

Models for evaluating the properties of the reference orthographic dictionary (ROD) of a spell checking and correction system are considered. The detecting properties of a ROD are characterized by the probability of failing to detect a typical error and the probability of a false error notification. The problem of Pareto-optimizing a ROD with respect to these two probabilities is formulated, a step-by-step algorithm for solving it is proposed, and the results of an experimental evaluation of the algorithm's effectiveness are given. The correcting properties of a ROD are characterized by the probabilities of correct and erroneous correction of typical errors. Models for estimating these probabilities are proposed, and simulation results are given for selected dictionaries. It is shown that a ROD optimized for its detecting properties also has better correcting properties. The results can serve as the basis for a tool for the comparative assessment, selection, and improvement of the potential properties of a specific ROD for a given subject area.
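
The two detecting-property probabilities named in the abstract can be illustrated with a minimal sketch. It is not the authors' model: it assumes that a "typical error" is a single deletion, transposition, substitution, or insertion, that all words and all errors are equally likely, and that a false notification occurs when a correct subject-area word is absent from the dictionary. The function names and the Latin alphabet are hypothetical choices made for illustration.

```python
from itertools import chain

ALPHABET = "abcdefghijklmnopqrstuvwxyz"  # assumption: lowercase Latin alphabet


def single_edit_variants(word, alphabet=ALPHABET):
    """Return every string one typical typing error away from `word`."""
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletions = (a + b[1:] for a, b in splits if b)
    transpositions = (a + b[1] + b[0] + b[2:] for a, b in splits if len(b) > 1)
    substitutions = (a + c + b[1:] for a, b in splits if b for c in alphabet)
    insertions = (a + c + b for a, b in splits for c in alphabet)
    variants = set(chain(deletions, transpositions, substitutions, insertions))
    variants.discard(word)  # an edit that reproduces the word is not an error
    return variants


def undetected_error_rate(dictionary):
    """Share of single-edit errors that turn a dictionary word into another
    dictionary word, so a dictionary lookup cannot detect them."""
    words = set(dictionary)
    total = 0
    undetected = 0
    for word in words:
        for variant in single_edit_variants(word):
            total += 1
            if variant in words:
                undetected += 1
    return undetected / total if total else 0.0


def false_notification_rate(dictionary, domain_words):
    """Share of correct subject-area words that the dictionary would flag
    because they are missing from it (assumed definition)."""
    words = set(dictionary)
    domain = list(domain_words)
    missing = sum(1 for word in domain if word not in words)
    return missing / len(domain) if domain else 0.0
```

Under these assumptions, adding words to the dictionary can only lower the false notification rate, but it may raise the undetected error rate because more erroneous strings coincide with valid entries. A trade-off of this kind is what a Pareto formulation is meant to balance.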

Author Biographies

Valery A. Lytvynov, The Institute of Mathematical Machines and Systems Problems of the National Academy of Science of Ukraine, Kyiv

Valeriy Andronykovych Lytvynov,

Professor, Doctor of Technical Sciences, leading researcher at the Institute of Mathematical Machines and Systems Problems of the National Academy of Science of Ukraine, Kyiv, Ukraine.

Research areas: decision-making systems, user interface, ensuring reliability of information.

Svitlana Ya. Maystrenko, The Institute of Mathematical Machines and Systems Problems of the National Academy of Science of Ukraine, Kyiv

Svitlana Yakivna Maystrenko,

Candidate of Technical Sciences, a senior researcher at the Institute of Mathematical Machines and Systems Problems of the National Academy of Science of Ukraine, Kyiv, Ukraine.

Research areas: information technologies, geoinformation systems, ensuring reliability of information.

Konstantin V. Khurtsylava, The Institute of Mathematical Machines and Systems Problems of the National Academy of Science of Ukraine, Kyiv

Kostyantyn Viktorovych Khurtsylava,

a junior researcher at the Institute of Mathematical Machines and Systems Problems of the National Academy of Science of Ukraine, Kyiv, Ukraine.

Research areas: geoinformation systems, cartographic modeling, mathematical modeling.

Sviatoslav V. Kostenko, The National University of Food Technologies, Kyiv

Svyatoslav Volodymyrovych Kostenko,

a Ph.D. student at the National University of Food Technologies, Kyiv, Ukraine.

Research areas: correction of user errors, user interface.

Published

2019-06-25

Issue

Section

Decision making and control in economic, technical, ecological and social systems