Proceedings of the Southwest State University

Automated Training Algorithms of Dialog Systems

https://doi.org/10.21869/2223-1560-2019-23-3-86-99

Abstract

Purpose of research. The research described in this article was conducted within the Salebot.pro project (https://salebot.pro) and aims to develop a simple and effective implementation of a dialog system.

Methods. The research plan included an analysis of various natural language processing and machine learning methods. Implementations of these methods were taken from popular open-source libraries. The dialog system model was built in two variants: one based on the spaCy framework, and one based on a metric scoring algorithm that uses Levenshtein distance. The two variants were compared by simplicity of implementation and by the cost of training the system and the personnel.
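
As an illustration of the metric the second variant relies on, the following minimal sketch computes Levenshtein distance with the classic Wagner-Fischer dynamic programming table. The function names `levenshtein` and `word_similarity`, and the normalization by the longer word's length, are assumptions made for this example; the abstract does not state how the authors convert a distance into a score.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance between a and b (Wagner-Fischer dynamic programming)."""
    # dist[i][j] = number of edits needed to turn a[:i] into b[:j]
    dist = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i in range(len(a) + 1):
        dist[i][0] = i  # delete all i characters
    for j in range(len(b) + 1):
        dist[0][j] = j  # insert all j characters
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            dist[i][j] = min(
                dist[i - 1][j] + 1,         # deletion
                dist[i][j - 1] + 1,         # insertion
                dist[i - 1][j - 1] + cost,  # substitution or match
            )
    return dist[len(a)][len(b)]


def word_similarity(a: str, b: str) -> float:
    """Map the distance to a 0..1 score (assumed normalization, 1.0 = identical)."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - levenshtein(a, b) / longest
```

For example, `levenshtein("kitten", "sitting")` is 3: two substitutions and one insertion.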

Results. The algorithms described in the article match the most similar words from two texts and compute the average percentage of coincidence. This approach gives acceptable results in languages with free word order, of which Russian is one. The research made it possible to develop an algorithm for automated training of dialog systems in real time without loss of context. On the same basis, an algorithm for training a dialog system on dialog history was developed. The authors propose using the two algorithms together: first train the dialog system on the dialog history when it is created, then train it continuously in real time.
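
The word-matching idea can be sketched as follows. This is an illustrative reading of the description above, not the authors' implementation: the function name `text_similarity`, the whitespace tokenization, the lowercasing, and the length-based normalization are all assumptions. A compact edit-distance helper is included so the block runs on its own.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance, keeping only the previous row of the table."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,                # deletion
                            curr[j - 1] + 1,            # insertion
                            prev[j - 1] + (ca != cb)))  # substitution or match
        prev = curr
    return prev[-1]


def word_similarity(a: str, b: str) -> float:
    """0..1 score per word pair (assumed normalization by the longer word)."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - levenshtein(a, b) / longest


def text_similarity(query: str, reference: str) -> float:
    """Average, over query words, of the best match in the reference (percent)."""
    query_words = query.lower().split()
    reference_words = reference.lower().split()
    if not query_words or not reference_words:
        return 0.0
    # For each query word, keep only its most similar reference word.
    best = [max(word_similarity(q, r) for r in reference_words)
            for q in query_words]
    return 100.0 * sum(best) / len(best)
```

Because each word is matched independently of its position, `text_similarity("price delivery", "delivery price")` evaluates to 100.0, which is consistent with the claim about free word order. The same property also shows why the score carries no semantics: two texts with the same words and opposite meanings score identically.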

Conclusion. The advantages of the developed algorithm are ease of implementation, low cost of the infrastructure needed to train and maintain the model, and simplicity of operation. Because the approach differs from supervised learning, it speeds up training and the entry of new data into the system. A distinctive feature of the developed algorithms is that they ignore text semantics, which makes training automated but not automatic.

About the Authors

D. V. Spirin
Penza State University
Russian Federation
Dmitriy V. Spirin, Post-Graduate Student, Department of Computing Engineering


O. S. Brezhnev
Penza State University
Russian Federation
Oleg S. Brezhnev, Software Engineer, Department of Software and Computer Applications




For citations:


Spirin D.V., Brezhnev O.S. Automated Training Algorithms of Dialog Systems. Proceedings of the Southwest State University. 2019;23(3):86-99. (In Russ.) https://doi.org/10.21869/2223-1560-2019-23-3-86-99


This work is licensed under a Creative Commons Attribution 4.0 License.


ISSN 2223-1560 (Print)
ISSN 2686-6757 (Online)