IIUM Repository

On the audio-visual emotion recognition using convolutional neural networks and extreme learning machine

Ashraf, Arselan and Gunawan, Teddy Surya and Arifin, Fatchul and Kartiwi, Mira and Sophian, Ali and Habaebi, Mohamed Hadi (2022) On the audio-visual emotion recognition using convolutional neural networks and extreme learning machine. Indonesian Journal of Electrical Engineering and Informatics (IJEEI), 10 (3). pp. 684-697. E-ISSN 2089-3272

Abstract

Advances in artificial intelligence and machine learning for emotion recognition have been substantial, enabling approaches that were previously inconceivable. Inspired by the promising evolution of human-computer interaction, this paper develops a multimodal emotion recognition system that takes two modalities as input, namely speech and video. In the proposed model, the input video samples undergo image pre-processing to obtain image frames. The audio input is pre-processed and transformed into the frequency domain to obtain a Mel-spectrogram, which is then processed as an image. Convolutional neural networks (CNNs) with different configurations are used for training and feature extraction on both audio and video. The outputs of the two CNNs are fused using two extreme learning machines, and a support vector machine performs the final classification. The model is evaluated on three databases, namely eNTERFACE, RML, and SAVEE. For the eNTERFACE dataset, the accuracy obtained without and with augmentation was 87.2% and 94.91%, respectively. The RML dataset yielded an accuracy of 98.5%, and the SAVEE dataset an accuracy of 97.77%. These results illustrate the effectiveness of the proposed system.
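
As a rough illustration of the fusion step mentioned in the abstract, the sketch below shows how a basic extreme learning machine (ELM) works: the hidden-layer weights are drawn at random and left fixed, and only the output weights are solved in closed form by least squares. This is not the authors' implementation. The class name, the feature dimensions, the hidden-layer size, and the synthetic "audio" and "video" feature vectors are all assumptions made for illustration, and the sketch uses a single ELM with an argmax decision rather than the two ELMs followed by a support vector machine described in the paper.

```python
import numpy as np


class ExtremeLearningMachine:
    """Hypothetical minimal ELM: random fixed hidden layer, least-squares output layer."""

    def __init__(self, n_hidden, seed=0):
        self.n_hidden = n_hidden
        self.rng = np.random.default_rng(seed)

    def _hidden(self, X):
        # Nonlinear random projection of the inputs.
        return np.tanh(X @ self.W + self.b)

    def fit(self, X, Y):
        n_features = X.shape[1]
        # Hidden weights and biases are sampled once and never trained.
        self.W = self.rng.standard_normal((n_features, self.n_hidden))
        self.b = self.rng.standard_normal(self.n_hidden)
        H = self._hidden(X)
        # Output weights: minimum-norm least-squares solution via the pseudoinverse.
        self.beta = np.linalg.pinv(H) @ Y
        return self

    def predict(self, X):
        return self._hidden(X) @ self.beta


# Synthetic stand-ins for CNN feature vectors (assumed shapes, not real data).
rng = np.random.default_rng(42)
n_samples, n_classes = 120, 6
labels = rng.integers(0, n_classes, size=n_samples)
audio_centers = rng.standard_normal((n_classes, 16))
video_centers = rng.standard_normal((n_classes, 16))
audio_feats = audio_centers[labels] + 0.1 * rng.standard_normal((n_samples, 16))
video_feats = video_centers[labels] + 0.1 * rng.standard_normal((n_samples, 16))

# Feature-level fusion by concatenation, then one ELM over the fused vector.
fused = np.hstack([audio_feats, video_feats])
targets = np.eye(n_classes)[labels]  # one-hot class targets

elm = ExtremeLearningMachine(n_hidden=64).fit(fused, targets)
scores = elm.predict(fused)
predicted = scores.argmax(axis=1)
train_accuracy = (predicted == labels).mean()
print(f"training accuracy on synthetic data: {train_accuracy:.2f}")
```

Because only the output weights are fitted, training reduces to one pseudoinverse computation, which is the usual motivation for choosing an ELM over a backpropagation-trained layer. In a setup closer to the one the abstract describes, the ELM outputs would be passed to a separate classifier instead of being thresholded with argmax.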

Item Type: Article (Journal)
Additional Information: International External Research Collaboration with UNY, Indonesia
Uncontrolled Keywords: artificial intelligence; convolutional neural networks; emotion recognition; human-computer interaction; machine learning
Subjects: T Technology > TK Electrical engineering. Electronics. Nuclear engineering > TK7800 Electronics. Computer engineering. Computer hardware. Photoelectronic devices > TK7885 Computer engineering
Kulliyyahs/Centres/Divisions/Institutes: Kulliyyah of Engineering > Department of Electrical and Computer Engineering
Kulliyyah of Information and Communication Technology > Department of Information System
Depositing User: Prof. Dr. Teddy Surya Gunawan
Date Deposited: 03 Oct 2022 11:15
Last Modified: 03 Oct 2022 11:20
URI: http://irep.iium.edu.my/id/eprint/100378
