Said Omar, Saida and Mohamad Zulkufli, Nurul Liyana and Ahmad Puzi, Asmarani and Shah, Asadullah (2025) Performance metrices analysis for diabetes prediction using machine learning algorithm. In: 9th IEEE International Conference on Engineering Technologies and Applied Sciences, ICETAS 2024, 20-22 November 2024, Bahrain.
Abstract
Diabetes is among the chronic diseases for which the number of deaths increases every year. Regardless of age, diabetes can either be inherited genetically from the family or develop as a result of an unbalanced diet. Therefore, to slow the progression of the disease in the body, prediction techniques such as machine learning should be used. The main objective of this study is to predict diabetes using machine learning algorithms on two datasets with different features. The first dataset had a large number of instances and was divided into four age groups: Pediatrics, Early Adulthood, Middle Age and Geriatric. The second dataset, with a small number of instances, came from Sylhet Diabetes Hospital in Bangladesh. To see the behaviour of these datasets, three machine learning algorithms were applied to each: Naïve Bayes, Random Forest and Decision Tree (J48). In the experiments, various performance metrics were used, including accuracy, precision, recall, F1 score, ROC, Kappa statistic, Mean Absolute Error (MAE), Root Mean Squared Error (RMSE) and Relative Absolute Error (RAE). The experimental results show that, on the larger dataset, Naïve Bayes had the higher accuracy, 99.78% and 98.78%, in the Pediatrics and Early Adulthood age groups respectively, while in the Middle Age and Geriatric age groups Decision Tree had high accuracy, 95.55% and 93.25%; in terms of the other performance metrics, however, Naïve Bayes performed well. On the second dataset, Random Forest had the highest accuracy, 97.50%, compared to the other algorithms.
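As an illustration of how several of the metrics named in the abstract relate to one another, the following is a minimal sketch in Python. It is not the authors' code: the function name `binary_metrics` and the toy labels are hypothetical, and it assumes hard 0/1 predictions for a binary diabetic / non-diabetic task. Evaluation tools may instead compute MAE, RMSE and RAE from predicted class probabilities, so values obtained this way are not expected to match those reported in the paper. ROC is omitted because it needs ranked scores rather than hard labels.

```python
import math


def binary_metrics(y_true, y_pred):
    """Compute common classification metrics for 0/1 labels (illustrative only)."""
    n = len(y_true)
    pairs = list(zip(y_true, y_pred))

    # Confusion-matrix counts, treating 1 as the positive (diabetic) class.
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    accuracy = (tp + tn) / n
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)

    # Cohen's kappa: observed agreement corrected for chance agreement.
    p_e = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / (n * n)
    kappa = (accuracy - p_e) / (1 - p_e) if p_e != 1 else 0.0

    # Error measures on hard labels (an assumption; see the note above).
    abs_err = [abs(t - p) for t, p in pairs]
    mae = sum(abs_err) / n
    rmse = math.sqrt(sum(e * e for e in abs_err) / n)

    # RAE: total absolute error relative to always predicting the mean label.
    mean_true = sum(y_true) / n
    baseline = sum(abs(t - mean_true) for t in y_true)
    rae = sum(abs_err) / baseline if baseline else float("nan")

    return {
        "accuracy": accuracy, "precision": precision, "recall": recall,
        "f1": f1, "kappa": kappa, "mae": mae, "rmse": rmse, "rae": rae,
    }


# Hypothetical toy labels, not data from either dataset in the study.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 1]
metrics = binary_metrics(y_true, y_pred)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

The sketch shows why accuracy alone can rank classifiers differently from the other metrics: kappa discounts agreement expected by chance, and RAE compares the classifier against a trivial baseline, so two models with similar accuracy can still differ on these measures.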