IIUM Repository

Fundamental Research Grant Scheme (FRGS) - FRGS19-076-0684, Speech Emotion Recognition and Depression Prediction Based on Speech Analysis using Deep Neural Networks

Gunawan, Teddy Surya and Draman, Samsul and Kartiwi, Mira and Borhan, Lihanna and Abdul Malik, Noreha and Abdul Rahman, Farah Diyana and Elsheikh, Elsheikh Mohamed Ahmed and Alghifari, Muhammad Fahreza and Ahmad Qadri, Syed Asif and Ashraf, Arselan and Wani, Taiba Majid (2022) Fundamental Research Grant Scheme (FRGS) - FRGS19-076-0684, Speech Emotion Recognition and Depression Prediction Based on Speech Analysis using Deep Neural Networks. Technical Report. UNSPECIFIED. (Unpublished)


Abstract

Speech signals carry information that computers can use to infer a user's state, enabling tasks such as emotion recognition and depression prediction. Applications range from customer service to depression prevention. In this research, we propose several deep-learning-based methods for detecting emotion and depression, using deep neural network variants such as deep feedforward networks and convolutional networks. The models were trained on well-known databases, including the Berlin Emotion Database and the DAIC-WOZ Depression Dataset. For speech emotion recognition, the algorithm achieves an accuracy of 80.5 percent across four languages: English, German, French, and Italian. For depression detection, the current algorithm achieves 60.1 percent accuracy when tested on the DAIC-WOZ dataset. This research also produced the Sorrow Analysis Dataset, an English depression audio dataset comprising 64 distinct samples from depressed and non-depressed individuals. Further validation using 1-dimensional convolutional networks yielded an average accuracy of 97 percent. Future research could explore other deep learning architectures, other datasets, and implementation on edge computing.
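As a rough illustration of the 1-dimensional convolutional approach mentioned in the abstract, the sketch below runs a single forward pass (convolution, ReLU, global max pooling, logistic output) over a short feature sequence and computes classification accuracy. It is not the architecture used in the report, which this record does not describe. The function names, the single-layer layout, the kernel values, and the 0.5 decision threshold are all assumptions chosen for illustration, and the weights are untrained.

```python
import math


def conv1d(signal, kernel, stride=1):
    """Valid (no padding) 1-D cross-correlation of a sequence with a kernel."""
    k = len(kernel)
    return [
        sum(signal[i + j] * kernel[j] for j in range(k))
        for i in range(0, len(signal) - k + 1, stride)
    ]


def relu(xs):
    """Element-wise rectified linear unit."""
    return [max(0.0, x) for x in xs]


def sigmoid(z):
    """Logistic function mapping a score to the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def predict(signal, kernels, weights, bias):
    """Hypothetical forward pass: conv -> ReLU -> global max pool -> logistic unit."""
    pooled = [max(relu(conv1d(signal, kernel))) for kernel in kernels]
    score = sum(w * p for w, p in zip(weights, pooled)) + bias
    return sigmoid(score)


def accuracy(probabilities, labels, threshold=0.5):
    """Fraction of samples whose thresholded probability matches the 0/1 label."""
    hits = sum((p >= threshold) == bool(y) for p, y in zip(probabilities, labels))
    return hits / len(labels)


if __name__ == "__main__":
    # Illustrative values only: two hand-picked kernels and untrained weights.
    kernels = [[1.0, 0.0, -1.0], [0.5, 0.5, 0.5]]
    weights = [1.0, -1.0]
    frames = [0.1, 0.4, 0.3, 0.9, 0.2]  # stand-in for a per-frame speech feature
    print(predict(frames, kernels, weights, bias=0.0))
```

In practice such a model would be trained with a deep learning framework on labelled recordings; the sketch only shows how a 1-D kernel slides along the time axis and how an accuracy figure like those quoted above is computed from predictions and labels.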

Item Type: Monograph (Technical Report)
Uncontrolled Keywords: Speech emotion recognition, depression prediction, convolutional neural networks (CNN), long short-term memory (LSTM)
Subjects: T Technology > TK Electrical engineering. Electronics. Nuclear engineering > TK7800 Electronics. Computer engineering. Computer hardware. Photoelectronic devices > TK7885 Computer engineering
Kulliyyahs/Centres/Divisions/Institutes: Kulliyyah of Engineering > Department of Electrical and Computer Engineering
Depositing User: Prof. Dr. Teddy Surya Gunawan
Date Deposited: 21 Feb 2022 15:06
Last Modified: 21 Feb 2022 15:06
URI: http://irep.iium.edu.my/id/eprint/96854
