Building a Bangla Speech – Data & Design Lab

Building a Bangla Speech Emotion Recognition Dataset

Artificial intelligence (AI) systems are increasingly moving toward personalized applications. By understanding user emotions, AI-based applications can provide more engaging and tailored experiences. Emotion recognition is a fundamental challenge in natural language processing, and because languages differ so widely, it is often best addressed through monolingual approaches. Recognizing the importance of emotion understanding in Bangla, we thoroughly evaluated the existing data resources for Bangla emotion recognition.

Our focus has been primarily on audio data, which plays a crucial role in capturing the nuances of emotion in speech. In evaluating the available datasets for Bangla speech emotion recognition, we found that they are unsuitable for developing real-life applications and fall short of what is needed to achieve state-of-the-art results. This inadequacy reveals a significant gap in resources for Bangla speech emotion recognition.

To address this gap, our project aims to develop a comprehensive speech emotion corpus tailored to the Bangla language. Our primary objective is to design a dataset robust and detailed enough to support AI applications that accurately recognize and respond to emotions in Bangla speech. By focusing on real-life applicability, we aim for a dataset that helps create AI systems that are both state-of-the-art and practical for everyday use. We expect this initiative to pave the way for more personalized and emotionally intelligent AI interactions in the Bangla-speaking community.