Proceedings of the Southwest State University

Models and a Technique for Determining the Speech Activity of a User of a Socio-Cyberphysical System

https://doi.org/10.21869/2223-1560-2019-23-6-225-240

Abstract

Purpose of research. The article presents model and algorithmic support for determining the speech activity of a user of a socio-cyberphysical system. A topological model of a distributed audio recording subsystem deployed in confined physical spaces (rooms) is proposed; the model makes it possible to assess the quality of the captured audio signals for a given distribution of microphones in such a room. Based on this model, a technique for determining the speech activity of a user of a socio-cyberphysical system has been developed; it maximizes the quality of the captured audio signals as the user moves around the room by determining the installation coordinates of the microphones.

Methods. The mathematical tools of graph theory and set theory were used for the most complete analysis and formal description of the distributed audio recording subsystem. To determine the coordinates for placing microphones in a single room, a dedicated technique was developed; it involves emitting a speech signal in the room using acoustic equipment and measuring signal levels with a noise meter at the locations intended for microphone installation.

Results. The dependence of the correlation coefficient between the combined signal and the initial test signal on the distance to the signal source was calculated for different numbers of microphones. The obtained dependences make it possible to determine the minimum number of spaced microphones required for high-quality recording of the user’s speech. Testing the developed technique in a particular room indicates that determining the speech activity of a user of a socio-cyberphysical system is feasible and highly efficient.
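The quality measure named in the results, the correlation coefficient between a combined multi-microphone signal and the original test signal, can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the article: the sine wave standing in for the speech test signal, the 1/distance attenuation, the independent Gaussian noise on each channel, the averaging rule used to combine channels, and the function names `pearson`, `test_signal` and `combined_correlation`.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def test_signal(num_samples=2000, cycles=5.0):
    """Hypothetical test signal: a sine wave standing in for speech."""
    return [math.sin(2 * math.pi * cycles * i / num_samples)
            for i in range(num_samples)]


def combined_correlation(num_mics, distance, noise_std=1.0, seed=0):
    """Correlation between the averaged microphone signal and the source.

    Assumptions (not from the article): every microphone is at the same
    distance, amplitude falls off as 1/distance (clamped at 1), each
    channel adds independent Gaussian noise, and channels are combined
    by simple averaging.
    """
    rng = random.Random(seed)
    source = test_signal()
    gain = 1.0 / max(distance, 1.0)
    channels = [[gain * s + rng.gauss(0.0, noise_std) for s in source]
                for _ in range(num_mics)]
    combined = [sum(samples) / num_mics for samples in zip(*channels)]
    return pearson(combined, source)


if __name__ == "__main__":
    # Tabulate correlation against distance for several microphone counts.
    for mics in (1, 2, 4, 8):
        row = [round(combined_correlation(mics, d), 3) for d in (1, 2, 4)]
        print(mics, row)
```

Under these assumptions, averaging more channels suppresses the uncorrelated noise, so the correlation rises with the number of microphones and falls with distance; a minimum microphone count could then be read off as the smallest count that keeps the correlation above a chosen threshold across the distances of interest.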

Conclusion. Applying the proposed technique for determining the speech activity of a user of a socio-cyberphysical system will improve the recording quality of the audio signal and, as a consequence, its subsequent processing, while taking into account the possible movement of the user.

About the Authors

E. E. Usina
St. Petersburg Institute for Informatics and Automation of the Russian Academy of Sciences
Russian Federation

Elizaveta E. Usina, Junior Researcher, Laboratory of Big Data Technologies of Sociocyberphysical Systems

St. Petersburg



A. R. Shabanova
St. Petersburg Institute for Informatics and Automation of the Russian Academy of Sciences
Russian Federation

Alexandra R. Shabanova, Junior Researcher, Laboratory of Big Data Technologies of Sociocyberphysical Systems

St. Petersburg



I. V. Lebedev
St. Petersburg Institute for Informatics and Automation of the Russian Academy of Sciences
Russian Federation

Igor V. Lebedev, Junior Researcher, Laboratory of Big Data Technologies of Sociocyberphysical Systems

St. Petersburg




For citations:


Usina E.E., Shabanova A.R., Lebedev I.V. Models and a Technique for Determining the Speech Activity of a User of a Socio-Cyberphysical System. Proceedings of the Southwest State University. 2019;23(6):225-240. (In Russ.) https://doi.org/10.21869/2223-1560-2019-23-6-225-240


This work is licensed under a Creative Commons Attribution 4.0 License.


ISSN 2223-1560 (Print)
ISSN 2686-6757 (Online)