Jarvis-Virtual Voice Assistant
DOI:
https://doi.org/10.48001/jocsvl.2024.1124-33
Keywords:
Artificial Intelligence (AI), Computers, Python library, Speech recognition, Virtual voice assistant
Abstract
Although voice recognition technology has advanced significantly, achieving high accuracy remains difficult, especially across varied environmental settings. This research uses deep learning models to increase the robustness and accuracy of voice recognition systems. The method involved implementing different deep learning architectures, such as recurrent neural networks (RNNs) and convolutional neural networks (CNNs), and training them on a wide range of datasets with different noise levels and accents. The key results show a 15% improvement in recognition accuracy over current systems for the proposed models, which perform especially well in noisy environments and with accented speech. These outcomes demonstrate the effectiveness of deep learning techniques in resolving issues that traditional voice recognition systems encounter. The results have important implications for real-world applications, since they may enable smooth communication in a variety of linguistic and environmental contexts. Context: Although voice recognition technology has come a long way, it still has problems with accuracy, particularly in noisy conditions and across regions with different accents. Even with these advancements, traditional systems struggle to perform consistently and reliably under such conditions. This motivates the investigation of more robust ways to improve the accuracy and reliability of voice recognition.
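The abstract does not say how training data with different noise levels was prepared. One common way to obtain such data is to mix clean speech with noise at a chosen signal-to-noise ratio (SNR). The sketch below illustrates that general idea only and is not the authors' method; the function names `mix_at_snr` and `measured_snr_db`, the synthetic 440 Hz tone, and the Gaussian noise are all hypothetical stand-ins for real audio.

```python
import math
import random


def mix_at_snr(speech, noise, snr_db):
    """Return speech plus scaled noise so the mix has the requested SNR in dB."""
    if len(speech) != len(noise):
        raise ValueError("speech and noise must have the same length")
    speech_power = sum(s * s for s in speech) / len(speech)
    noise_power = sum(n * n for n in noise) / len(noise)
    if speech_power == 0 or noise_power == 0:
        raise ValueError("speech and noise must both be non-silent")
    # SNR_dB = 10 * log10(P_speech / P_scaled_noise); solve for the noise gain.
    target_noise_power = speech_power / (10 ** (snr_db / 10))
    gain = math.sqrt(target_noise_power / noise_power)
    return [s + gain * n for s, n in zip(speech, noise)]


def measured_snr_db(speech, mixed):
    """Recover the SNR of a mix, given the clean speech it was built from."""
    residual = [m - s for s, m in zip(speech, mixed)]
    speech_power = sum(s * s for s in speech) / len(speech)
    residual_power = sum(r * r for r in residual) / len(residual)
    return 10 * math.log10(speech_power / residual_power)


rng = random.Random(0)
# Hypothetical stand-ins for real recordings: one second of a 440 Hz tone
# sampled at 16 kHz, and Gaussian noise of the same length.
speech = [math.sin(2 * math.pi * 440 * t / 16000) for t in range(16000)]
noise = [rng.gauss(0.0, 1.0) for _ in range(16000)]

# Build progressively noisier copies of the same utterance.
for snr_db in (20, 10, 0):
    mixed = mix_at_snr(speech, noise, snr_db)
    print(snr_db, round(measured_snr_db(speech, mixed), 2))
```

Lower SNR values produce noisier copies of the same utterance, so a model can be trained or evaluated across a range of conditions. Real pipelines would load recorded speech and noise rather than the synthetic signals assumed here.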