Wani, Taiba and Gunawan, Teddy Surya and Ahmad Qadri, Syed Asif and Mansor, Hasmah and Kartiwi, Mira and Ismail, Nanang (2020) Speech emotion recognition using convolution neural networks and deep stride convolutional neural networks. In: 6th International Conference on Wireless and Telematics (ICWT) 2020, 3rd-4th September 2020, Bandung, Indonesia.
Abstract
A variety of techniques have been proposed for Speech Emotion Recognition (SER), where the main focus is to identify the salient, discriminative features of speech signals. These features are then classified to recognize the specific emotion of a speaker. Recently, deep learning techniques have emerged as a breakthrough in detecting and classifying emotions from speech. In this paper, we modify a recently developed convolutional neural network architecture, the Deep Stride Convolutional Neural Network (DSCNN), by using fewer convolutional layers to increase computational speed while maintaining accuracy. We also trained a state-of-the-art CNN model and the proposed DSCNN on spectrograms generated from the SAVEE speech emotion dataset. For evaluation, four emotions were considered: angry, happy, neutral, and sad. The results show that the proposed DSCNN architecture, with a prediction accuracy of 87.8%, outperforms the CNN, which achieves 79.4% accuracy.