IIUM Repository

MyDAS corpus: Malay social media texts for detecting depression, anxiety, and stress on Facebook

Ahmad, Zaaba and Mohamed, Azlinah and Conway, Mike and Zakaria, Rozanizam and Ibrahim Teo, Noor Hasimah and Maskat, Ruhaila (2024) MyDAS corpus: Malay social media texts for detecting depression, anxiety, and stress on Facebook. In: 5th International Conference on Artificial Intelligence and Data Sciences (AiDAS 2024), 30th October 2024, Bangkok, Thailand.

PDF - Published Version (1MB)
PDF - Supplemental Material (197kB)

Abstract

The application of Natural Language Processing (NLP) in mental health monitoring has expanded significantly; however, the specific challenges of interpreting Depression, Anxiety, and Stress (DAS) in Malay-language social media texts have not been adequately addressed. This gap underscores the need for NLP solutions that are sensitive to the linguistic and cultural specificities of Malay-speaking populations. This study develops and validates a specialised Malay-language corpus from social media content, targeting DAS. Using a hybrid ground truth strategy that integrates self-reports with expert assessments, the research offers methodological refinements in the analysis of Malay linguistic patterns and in the deployment of machine learning classifiers to identify mental health indicators efficiently. The paper reviews existing methodologies, outlines a novel corpus development strategy, and discusses classifier performance. The Decision Tree classifier achieved the highest F1 score of 0.75, followed by the Support Vector Machine (SVM) with an F1 score of 0.73 and Random Forest with 0.70. Multinomial Naive Bayes (MNB) and K-Nearest Neighbors (KNN) performed worse, with F1 scores of 0.55 and 0.52, respectively. Analyses using bi-gram networks and t-SNE visualisations explore the linguistic indicators of mental health states, and the paper closes with a discussion of the implications for future NLP applications in mental health monitoring.
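
For readers unfamiliar with the metric reported above, the following is a minimal sketch of how an F1 score can be computed from true and predicted labels. It is not the authors' evaluation code: the function names `f1_per_class` and `macro_f1` are hypothetical, the toy labels are invented for illustration, and macro averaging is an assumption, since the abstract does not state which averaging scheme was used.

```python
from collections import Counter


def f1_per_class(y_true, y_pred):
    """Return {label: F1} for each label seen in y_true or y_pred."""
    labels = set(y_true) | set(y_pred)
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1   # correct prediction for this class
        else:
            fp[pred] += 1    # predicted class received a false positive
            fn[truth] += 1   # true class received a false negative
    scores = {}
    for label in labels:
        # F1 = 2*TP / (2*TP + FP + FN), the harmonic mean of precision and recall
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores[label] = 2 * tp[label] / denom if denom else 0.0
    return scores


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores (assumed averaging scheme)."""
    scores = f1_per_class(y_true, y_pred)
    return sum(scores.values()) / len(scores)


# Toy labels, invented for illustration only (not data from the MyDAS corpus).
y_true = ["depression", "anxiety", "stress", "stress", "anxiety", "depression"]
y_pred = ["depression", "stress", "stress", "stress", "anxiety", "anxiety"]

if __name__ == "__main__":
    print(f1_per_class(y_true, y_pred))
    print(round(macro_f1(y_true, y_pred), 2))
```

Because F1 is the harmonic mean of precision and recall, a classifier cannot score well by favouring one at the expense of the other, which is one reason the metric is commonly reported for multi-class text classification.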

Item Type: Proceeding Paper (Other)
Additional Information: 6940/122878
Uncontrolled Keywords: Mental Health Detection, Natural Language Processing, Malay Language, Corpus Development, Social Media, Depression, Anxiety, Stress, Decision Tree, Support Vector Machine
Subjects: R Medicine > RA Public aspects of medicine > RA790 Mental Health. Mental Illness Prevention
Kulliyyahs/Centres/Divisions/Institutes: Kulliyyah of Medicine
Kulliyyah of Medicine > Department of Psychiatry
Depositing User: Dr Rozanizam Zakaria
Date Deposited: 25 Aug 2025 17:13
Last Modified: 25 Aug 2025 17:13
URI: http://irep.iium.edu.my/id/eprint/122878
