Models and a Technique for Determining the Speech Activity of a User of a Socio-Cyberphysical System
https://doi.org/10.21869/2223-1560-2019-23-6-225-240
Abstract
Purpose of research. The article presents model and algorithmic support for determining the speech activity of a user of a socio-cyberphysical system. A topological model of a distributed audio recording subsystem deployed in a confined physical space (a room) is proposed; the model makes it possible to assess the quality of the perceived audio signals for a given distribution of microphones in the room. Based on this model, a technique has been developed for determining the speech activity of a user of a socio-cyberphysical system; it maximizes the quality of the perceived audio signals as the user moves around the room by determining the installation coordinates of the microphones.
Methods. The mathematical tools of graph theory and set theory were used for the most complete analysis and formal description of the distributed audio recording subsystem. To determine the coordinates for placing microphones in a room, a dedicated technique was developed; it involves emitting a speech signal in the room with acoustic equipment and measuring signal levels with a sound level meter at the locations intended for microphone installation.
Results. The dependences of the correlation coefficient between the combined signal and the initial test signal on the distance to the signal source were calculated for different numbers of microphones. These dependences make it possible to determine the minimum number of spaced microphones required for high-quality recording of the user's speech. The results of testing the developed technique in a particular room indicate that the speech activity of a user of a socio-cyberphysical system can be determined with high efficiency.
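The kind of dependence described in the Results can be illustrated with a minimal sketch. It is not the authors' procedure: the channel model (1/distance attenuation plus independent Gaussian noise), the combining rule (sample-wise averaging), and all function names and parameter values are assumptions chosen only to show how a correlation coefficient between a combined signal and a test signal might be tabulated against distance for different numbers of microphones.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def simulate_microphone(test_signal, distance_m, rng, noise_std=0.2):
    """Hypothetical channel: 1/distance attenuation plus independent noise."""
    gain = 1.0 / max(distance_m, 0.1)
    return [gain * s + rng.gauss(0.0, noise_std) for s in test_signal]


def combine(channels):
    """Combine channels by sample-wise averaging (an assumed combining rule)."""
    return [sum(samples) / len(channels) for samples in zip(*channels)]


def correlation_vs_distance(test_signal, distances_m, n_microphones, seed=0):
    """Map each distance to the correlation of the combined and test signals."""
    rng = random.Random(seed)
    result = {}
    for d in distances_m:
        channels = [
            simulate_microphone(test_signal, d, rng)
            for _ in range(n_microphones)
        ]
        result[d] = pearson(combine(channels), test_signal)
    return result


# Synthetic test signal: a sine wave standing in for the emitted speech signal.
test_signal = [math.sin(2 * math.pi * 5 * t / 1000) for t in range(1000)]
for n in (1, 4):
    print(n, correlation_vs_distance(test_signal, [1.0, 2.0, 4.0], n))
```

Under these assumptions the correlation falls as the distance grows and rises as more microphones are averaged, so a target correlation threshold would translate into a minimum microphone count for a given room size.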
Conclusion. Applying the proposed technique for determining the speech activity of a user of a socio-cyberphysical system will improve the recording quality of the audio signal and, as a consequence, its subsequent processing, while taking into account the possible movement of the user.
About the Authors
E. E. Usina
Russian Federation
Elizaveta E. Usina, Junior Researcher, Laboratory of Big Data Technologies of Sociocyberphysical Systems
St. Petersburg
A. R. Shabanova
Russian Federation
Alexandra R. Shabanova, Junior Researcher, Laboratory of Big Data Technologies of Sociocyberphysical Systems
St. Petersburg
I. V. Lebedev
Russian Federation
Igor V. Lebedev, Junior Researcher, Laboratory of Big Data Technologies of Sociocyberphysical Systems
St. Petersburg
For citations:
Usina E.E., Shabanova A.R., Lebedev I.V. Models and a Technique for Determining the Speech Activity of a User of a Socio-Cyberphysical System. Proceedings of the Southwest State University. 2019;23(6):225-240. (In Russ.) https://doi.org/10.21869/2223-1560-2019-23-6-225-240