IIUM Repository

Performance metrices analysis for diabetes prediction using machine learning algorithm

Said Omar, Saida and Mohamad Zulkufli, Nurul Liyana and Ahmad Puzi, Asmarani and Shah, Asadullah (2025) Performance metrices analysis for diabetes prediction using machine learning algorithm. In: 9th IEEE International Conference on Engineering Technologies and Applied Sciences, ICETAS 2024, 20-22 November 2024, Bahrain.

PDF - Published Version (950kB)
Restricted to Registered users only

PDF - Supplemental Material (159kB)

Abstract

Diabetes is among the chronic diseases for which the number of deaths increases every year. Regardless of age, a person can inherit diabetes genetically from family or develop it because of an unbalanced diet. Prediction techniques such as machine learning should therefore be used to slow the progression of the disease in the body. The main objective of this study is to predict diabetes with machine learning algorithms, using two datasets that have different features. The first dataset has a large number of instances and was divided into four age groups: Pediatrics, Early Adulthood, Middle Age and Geriatric. The second dataset, from Sylhet Diabetes Hospital in Bangladesh, has a small number of instances. To observe how these datasets behave, three machine learning algorithms were applied to each: Naïve Bayes, Random Forest and Decision Tree (J48). The experiments used various performance metrics, including accuracy, precision, recall, F1 score, ROC, Kappa statistic, Mean Absolute Error (MAE), Root Mean Squared Error (RMSE) and Relative Absolute Error (RAE). The experimental results show that on the larger dataset, Naïve Bayes had higher accuracy, 99.78% and 98.78%, in the Pediatrics and Early Adulthood age groups respectively, while Decision Tree had high accuracy, 95.55% and 93.25%, in the Middle Age and Geriatric age groups respectively; in terms of the other performance metrics, however, Naïve Bayes performed well. On the second dataset, Random Forest had the highest accuracy, 97.50%, compared with the other algorithms.
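
The metrics named in the abstract can be illustrated with a minimal sketch. It is not taken from the paper: the function names are hypothetical, the labels are assumed to be binary (1 = diabetic, 0 = non-diabetic), and the error measures follow one common textbook definition in which the predicted probability of the positive class is compared with the 0/1 label. The tool used in the study may compute MAE, RMSE and RAE differently, and ROC is omitted here.

```python
import math


def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall, F1 and Cohen's kappa for binary labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    n = tp + tn + fp + fn

    accuracy = (tp + tn) / n
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)

    # Kappa compares observed agreement (accuracy) with the agreement
    # expected by chance from the row and column totals.
    p_e = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / n ** 2
    kappa = (accuracy - p_e) / (1 - p_e) if p_e != 1 else 0.0

    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1, "kappa": kappa}


def error_metrics(y_true, y_score):
    """MAE, RMSE and RAE of predicted probabilities against 0/1 labels.

    Assumption: RAE is the total absolute error divided by the total
    absolute error of always predicting the mean label.
    """
    n = len(y_true)
    abs_err = [abs(s - t) for t, s in zip(y_true, y_score)]
    mae = sum(abs_err) / n
    rmse = math.sqrt(sum((s - t) ** 2 for t, s in zip(y_true, y_score)) / n)
    mean_true = sum(y_true) / n
    baseline = sum(abs(t - mean_true) for t in y_true)
    rae = sum(abs_err) / baseline if baseline else float("nan")
    return {"mae": mae, "rmse": rmse, "rae": rae}


if __name__ == "__main__":
    # Made-up labels and scores, for illustration only.
    y_true = [1, 1, 1, 0, 0, 0, 0, 1]
    y_pred = [1, 1, 0, 0, 0, 1, 0, 1]
    y_score = [0.9, 0.8, 0.4, 0.1, 0.2, 0.6, 0.3, 0.7]
    print(classification_metrics(y_true, y_pred))
    print(error_metrics(y_true, y_score))
```

Accuracy alone can hide differences between classifiers, which is why kappa and the error measures are reported alongside it: kappa discounts agreement expected by chance, and MAE, RMSE and RAE reflect how confident the wrong predictions were.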

Item Type: Proceeding Paper (Other)
Uncontrolled Keywords: Diabetes, Machine learning algorithms, Decision tree (J48), Random Forest, Naïve Bayes
Subjects: Q Science > QA Mathematics > QA75 Electronic computers. Computer science
T Technology > T Technology (General) > T55.4 Industrial engineering. Management engineering > T58.6 Management information systems
Kulliyyahs/Centres/Divisions/Institutes: Kulliyyah of Information and Communication Technology
Kulliyyah of Information and Communication Technology > Department of Computer Science
Kulliyyah of Information and Communication Technology > Department of Information System
Depositing User: Dr. Nurul Liyana Mohamad Zulkufli
Date Deposited: 30 Dec 2025 15:07
Last Modified: 30 Dec 2025 15:07
Queue Number: 2025-12-Q1132
URI: http://irep.iium.edu.my/id/eprint/126146
