Alghifari, Muhammad Fahreza and Gunawan, Teddy Surya and Wan Nordin, Mimi Aminah and Kartiwi, Mira and Borhan, Lihanna (2019) On the optimum speech segment length for depression detection. In: 2019 IEEE 6th International Conference on Smart Instrumentation, Measurement and Applications (ICSIMA 2019), 27 - 29 Aug 2019, Kuala Lumpur, Malaysia.
Abstract
Depression is a worldwide problem and, according to the World Health Organization, the largest contributor to global disability. According to a study, around 18336 Malaysians are suffering from depression. Therefore, an automated system that can detect depression from human speech is needed. The main objective of this paper is to investigate the optimum speech segment length that provides fast and accurate depression detection. An artificial neural network was used as a classifier to detect depression using a speech feature, i.e. the averaged Mel-frequency cepstral coefficients (MFCC). The Distress Analysis Interview Corpus Wizard of Oz (DAIC-WOZ) was used to train and test the system, which was evaluated in terms of accuracy and processing time while varying the number of neurons used. The obtained results were further optimized by investigating the ideal segment length for depression detection. Results showed that the proposed system can recognize depression from speech across 3 levels of depression with an accuracy of up to 98.3% when given previous samples of the same speaker for training. Furthermore, the optimum speech segment length was found to be 7 seconds when tested over lengths from 1 to 20 seconds.
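The two preprocessing ideas named in the abstract, cutting speech into fixed-length segments and averaging a frame-level feature over each segment, can be illustrated with a minimal sketch. This is not the authors' implementation. The function names, the choice to drop a trailing partial segment, and the 16 kHz sample rate in the usage lines are all assumptions made for illustration, and the per-frame MFCC values are assumed to come from an external feature extractor.

```python
from statistics import fmean


def split_into_segments(samples, sample_rate, segment_seconds):
    """Cut a sample sequence into consecutive, non-overlapping segments.

    A trailing remainder shorter than one full segment is dropped
    (an assumption of this sketch, not a detail from the abstract).
    """
    segment_len = int(sample_rate * segment_seconds)
    if segment_len <= 0:
        raise ValueError("segment length must be at least one sample")
    n_segments = len(samples) // segment_len
    return [
        samples[i * segment_len:(i + 1) * segment_len]
        for i in range(n_segments)
    ]


def average_over_frames(frame_features):
    """Collapse per-frame feature vectors into one averaged vector.

    frame_features is a list of frames, each a list of coefficients
    (for example, MFCCs produced by an external library).
    """
    if not frame_features:
        raise ValueError("need at least one frame")
    # zip(*...) groups the values of each coefficient across all frames.
    return [fmean(column) for column in zip(*frame_features)]


if __name__ == "__main__":
    sample_rate = 16000  # hypothetical rate, not taken from the paper
    samples = [0.0] * (sample_rate * 20)  # 20 seconds of placeholder audio
    for seconds in (1, 7, 20):
        count = len(split_into_segments(samples, sample_rate, seconds))
        print(f"{seconds} s segments: {count}")
```

Shorter segments yield more training examples per recording but less speech per example, which is the trade-off the segment-length sweep from 1 to 20 seconds explores.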