IIUM Repository

Stride based Convolutional Neural Network for Speech Emotion Recognition

Wani, Taiba Majid and Gunawan, Teddy Surya and Ahmad Qadri, Syed Asif and Mansor, Hasmah and Arifin, Fatchul and Ahmad, Yasser Asrul (2021) Stride based Convolutional Neural Network for Speech Emotion Recognition. In: 2021 IEEE 7th International Conference on Smart Instrumentation, Measurement and Applications (ICSIMA2021), Bandung (Virtual).



Speech Emotion Recognition (SER) recognizes the emotional features of speech signals regardless of semantic content. Deep Learning techniques have proven superior to conventional techniques for emotion recognition due to advantages such as speed, scalability, and versatility. However, since emotions are subjective, there is no universal agreement on how to evaluate or categorize them. The main objective of this paper is to design a suitable Convolutional Neural Network (CNN) model, the Stride-based Convolutional Neural Network (SCNN), that uses fewer convolutional layers and eliminates the pooling layers to increase computational stability. This elimination tends to increase the accuracy and decrease the computational time of the SER system. Instead of pooling layers, deep strides are used for the necessary dimension reduction. SCNN is trained on spectrograms generated from the speech signals of two different databases, Berlin (Emo-DB) and IITKGP-SEHSC. Four emotions (angry, happy, neutral, and sad) are considered for the evaluation process, and validation accuracies of 90.67% and 91.33% are achieved for Emo-DB and IITKGP-SEHSC, respectively. This study provides new benchmarks for both datasets, demonstrating the feasibility and relevance of the presented SER technique.
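The dimension reduction described in the abstract can be illustrated with the standard output-size formula for a sliding window. The sketch below is not the authors' architecture: the 128-bin spectrogram axis, the kernel sizes, and the helper names `conv_output_size` and `strided_stack` are assumptions chosen only for illustration. It shows that a single stride-2 convolution can shrink an axis as much as a stride-1 convolution followed by 2x2 pooling.

```python
def conv_output_size(size, kernel, stride=1, padding=0):
    """Length of one spatial axis after a convolution or pooling window."""
    return (size + 2 * padding - kernel) // stride + 1


def strided_stack(size, layers):
    """Track one axis through a sequence of (kernel, stride) convolutions."""
    sizes = [size]
    for kernel, stride in layers:
        size = conv_output_size(size, kernel, stride)
        sizes.append(size)
    return sizes


# Hypothetical 128-bin spectrogram axis; all layer settings are illustrative.
# Conventional block: 3-wide convolution (stride 1), then 2-wide pooling (stride 2).
pooled = conv_output_size(conv_output_size(128, kernel=3, stride=1), kernel=2, stride=2)

# Stride-based block: one 3-wide convolution with stride 2, no pooling layer.
strided = conv_output_size(128, kernel=3, stride=2)

print(pooled, strided)
print(strided_stack(128, [(3, 2), (3, 2), (3, 2)]))
```

Under these assumed settings both blocks reduce the axis to the same length, but the strided version does so in one layer rather than two, which is the kind of saving the abstract attributes to removing pooling.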

Item Type: Conference or Workshop Item (Invited Papers)
Uncontrolled Keywords: Speech Emotion Recognition (SER), Stride-based Convolutional Neural Networks (CNN), Strides, Spectrograms
Subjects: T Technology > TK Electrical engineering. Electronics. Nuclear engineering > TK7800 Electronics. Computer engineering. Computer hardware. Photoelectronic devices > TK7885 Computer engineering
Kulliyyahs/Centres/Divisions/Institutes: Kulliyyah of Engineering > Department of Electrical and Computer Engineering; Kulliyyah of Engineering
Depositing User: Prof. Dr. Teddy Surya Gunawan
Date Deposited: 14 Sep 2021 11:58
Last Modified: 12 Oct 2021 09:21
URI: http://irep.iium.edu.my/id/eprint/92217
