EmoSonics: Emotion Detection via Voice and Speech Recognition

Authors

  • T. Aditya Sai Srinivas, Department of Artificial Intelligence and Machine Learning, Jayaprakash Narayan College of Engineering, Dharmapur, Telangana, India
  • M. Bharathi, Department of Artificial Intelligence and Machine Learning, Jayaprakash Narayan College of Engineering, Dharmapur, Telangana, India

DOI:

https://doi.org/10.48001/jocsss.2024.121-7

Keywords:

Analytical techniques, Emotional information, Human-computer interaction, Speech signals, Spoken emotion detection

Abstract

Recognizing emotion from speech is a central problem in human-computer interaction. It resembles listening to a speaker's tone and inflection to judge whether they are happy, surprised, or experiencing some other feeling. Researchers draw on a range of techniques to decode these emotional cues, from analyzing speech patterns to using technologies such as fMRI. Emotions are not simple labels; they are complex and nuanced, and interpreting them accurately demands sophisticated methods. Some methods sort emotions into a small set of discrete categories, while others treat them as continuous variables to capture finer gradations. The goal is to improve communication between humans and computers by enabling machines to understand and respond appropriately to emotional states. This pursuit underscores the importance of emotion detection in speech analysis and the need for continually evolving methodologies in human-computer interaction research.

Published

2024-05-17

How to Cite

T. Aditya Sai Srinivas, & M. Bharathi. (2024). EmoSonics: Emotion Detection via Voice and Speech Recognition. Journal of Computer Science and System Software, 1(2), 1–7. https://doi.org/10.48001/jocsss.2024.121-7

Section

Articles