Proceedings of the Southwest State University

Increased Performance of Transformers Language Models in Information Question and Response Systems

https://doi.org/10.21869/2223-1560-2022-26-2-159-171

Abstract

Purpose of research. The purpose of this work is to increase the performance of question-and-response information systems for Russian. The scientific novelty of the work lies in improving the performance of the RuBERT model, which was trained to find the answer to a question in a text. Since a more efficient language model can process more requests in the same amount of time, the results of this work can be used in various question-and-response information systems for which response speed is important.

Methods. This work uses methods of natural language processing, machine learning, and reduction of the size of artificial neural networks. The language model was configured and trained using the Torch and ONNX Runtime machine learning libraries. The original model and the training dataset were taken from the Hugging Face library.
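
A minimal sketch of this setup, assuming the publicly available DeepPavlov/rubert-base-cased checkpoint and the sberquad dataset on the Hugging Face Hub (the exact checkpoint and dataset revision used by the authors are not specified on this page):

```python
# Sketch: load a RuBERT checkpoint and a Russian QA dataset from the
# Hugging Face Hub and run extractive question answering with PyTorch.
# The checkpoint and dataset identifiers are illustrative assumptions.
from datasets import load_dataset
from transformers import AutoModelForQuestionAnswering, AutoTokenizer, pipeline

MODEL_NAME = "DeepPavlov/rubert-base-cased"              # assumed base checkpoint
dataset = load_dataset("sberquad", split="validation")   # assumed dataset id

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForQuestionAnswering.from_pretrained(MODEL_NAME)

# The pipeline wraps tokenization, the forward pass and answer-span decoding.
qa = pipeline("question-answering", model=model, tokenizer=tokenizer)
sample = dataset[0]
print(qa(question=sample["question"], context=sample["context"]))
```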

Results. As a result of the study, the performance of the RuBERT language model was increased using methods that reduce the size of neural networks, namely knowledge distillation and quantization, as well as by exporting the model to the ONNX format and running it in ONNX Runtime.
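
A hedged sketch of the export and quantization path, using torch.onnx.export together with ONNX Runtime's post-training dynamic quantization; the knowledge-distillation step (training a smaller student network on the outputs of the full model) takes place earlier, at training time, and is not shown. The file names, opset version and checkpoint name are assumptions:

```python
# Sketch: export the QA model to ONNX, apply dynamic int8 quantization,
# and run inference in ONNX Runtime. Paths, opset and checkpoint are assumptions.
import torch
import onnxruntime as ort
from onnxruntime.quantization import QuantType, quantize_dynamic
from transformers import AutoModelForQuestionAnswering, AutoTokenizer

MODEL_NAME = "DeepPavlov/rubert-base-cased"  # assumed checkpoint
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForQuestionAnswering.from_pretrained(MODEL_NAME, return_dict=False).eval()

# Dummy question/context pair used only to trace the graph during export.
enc = tokenizer("Где расположен университет?",
                "Юго-Западный государственный университет расположен в Курске.",
                return_tensors="pt")

torch.onnx.export(
    model,
    (enc["input_ids"], enc["attention_mask"], enc["token_type_ids"]),
    "rubert_qa.onnx",
    input_names=["input_ids", "attention_mask", "token_type_ids"],
    output_names=["start_logits", "end_logits"],
    dynamic_axes={name: {0: "batch", 1: "sequence"} for name in
                  ("input_ids", "attention_mask", "token_type_ids",
                   "start_logits", "end_logits")},
    opset_version=14,
)

# Post-training dynamic quantization: weights are stored as int8,
# which shrinks the file and speeds up CPU inference.
quantize_dynamic("rubert_qa.onnx", "rubert_qa_int8.onnx", weight_type=QuantType.QInt8)

# Inference with the quantized graph in ONNX Runtime.
session = ort.InferenceSession("rubert_qa_int8.onnx", providers=["CPUExecutionProvider"])
start_logits, end_logits = session.run(
    None, {name: tensor.numpy() for name, tensor in enc.items()})
# Predicted start/end token indices of the answer span.
answer_span = (start_logits.argmax(axis=1)[0], end_logits.argmax(axis=1)[0])
```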

Conclusion. The model to which knowledge distillation, quantization and ONNX optimization were applied simultaneously gained a performance increase from 66.57 to 404.46 requests per minute (about 6.1 times), while its size decreased about 13 times (from 676.29 MB to 51.66 MB). The price of this performance gain was a deterioration of EM (from 61.3 to 56.87) and of the F-measure (from 81.66 to 76.97).
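
For reference, a minimal sketch of the kind of metrics quoted above: exact match (EM), token-level F-measure and throughput in requests per minute. The helper names are hypothetical, and the paper's exact evaluation protocol (dataset split, batching, hardware) is not reproduced here:

```python
# Sketch of illustrative metric helpers: EM, token-level F-measure,
# and throughput in requests per minute.
import time
from collections import Counter

def exact_match(prediction: str, reference: str) -> float:
    """1.0 if the predicted answer string matches the reference exactly."""
    return float(prediction.strip().lower() == reference.strip().lower())

def f_measure(prediction: str, reference: str) -> float:
    """Token-level F-measure between predicted and reference answers."""
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()
    common = Counter(pred_tokens) & Counter(ref_tokens)
    overlap = sum(common.values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)

def requests_per_minute(answer_fn, requests, repeats: int = 1) -> float:
    """Time answer_fn over (question, context) pairs and report requests/minute."""
    start = time.perf_counter()
    for _ in range(repeats):
        for question, context in requests:
            answer_fn(question, context)
    elapsed = time.perf_counter() - start
    return 60.0 * repeats * len(requests) / elapsed
```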

About the Authors

D. T. Galeev
Southwest State University
Russian Federation

Denis T. Galeev, Post-Graduate Student

50 Let Oktyabrya str. 94, Kursk 305040



V. S. Panishchev
Southwest State University
Russian Federation

Vladimir S. Panishchev, Cand. of Sci. (Engineering)

50 Let Oktyabrya str. 94, Kursk 305040



D. V. Titov
Southwest State University
Russian Federation

Dmitry V. Titov, Dr. of Sci. (Engineering), Associate Professor

50 Let Oktyabrya str. 94, Kursk 305040




For citation:

Galeev D.T., Panishchev V.S., Titov D.V. Increased Performance of Transformers Language Models in Information Question and Response Systems. Proceedings of the Southwest State University. 2022;26(2):159-171. (In Russ.) https://doi.org/10.21869/2223-1560-2022-26-2-159-171


This work is licensed under a Creative Commons Attribution 4.0 License.


ISSN 2223-1560 (Print)
ISSN 2686-6757 (Online)