IIUM Repository

Enhancing data integrity in internet of things-based healthcare applications: a visualization approach for duplicate detection

Md Isa, Siti Noor Basirah and Emran, Nurul Akmar and Harum, Norharyati and Logenthiran, Machap and Nordin, Azlin (2025) Enhancing data integrity in internet of things-based healthcare applications: a visualization approach for duplicate detection. Bulletin of Electrical Engineering and Informatics, 14 (5). pp. 3704-3715. ISSN 2089-3191 E-ISSN 2302-9285

Abstract

This study addresses the critical issue of data duplication in healthcare-related internet of things (IoT) datasets, which can compromise the reliability of analyses and patient outcomes. A Python-based visualization framework using Pandas and Matplotlib was developed to detect and represent duplicate records. The methodology was applied to six cancer-related datasets sourced from Kaggle, ranging from 300 to 55,000 records and encompassing numerical, textual, and categorical data types. The visualization technique provided clear insights into duplication patterns, identifying specific counts such as 7 duplicates in the wearable device dataset, 19 in the thyroid recurrence dataset, and 534 in the synthetic healthcare electronic health record (EHR) dataset. Compared to traditional detection methods, the visualization tool facilitated faster and more intuitive initial data assessment, demonstrating its effectiveness for rapid quality checks in healthcare datasets. However, scalability limitations were observed in larger datasets, where visual clarity declined. These findings highlight the value of visualization as a preliminary data quality assessment tool and suggest future integration with advanced detection algorithms to enhance robustness and scalability.
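
As a rough illustration of the kind of check the abstract describes, the sketch below counts exact duplicate rows with Pandas and draws a simple bar chart with Matplotlib. It is not the authors' framework: the function names `summarize_duplicates` and `plot_duplicates`, the column names, and the toy records are all hypothetical, and it assumes a duplicate means a row identical to an earlier row in every column.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd


def summarize_duplicates(df: pd.DataFrame) -> dict:
    """Count fully identical rows, treating the first occurrence as the original."""
    dup_mask = df.duplicated(keep="first")  # True for every repeat after the first
    n_dup = int(dup_mask.sum())
    return {"total": len(df), "duplicates": n_dup, "unique": len(df) - n_dup}


def plot_duplicates(summary: dict):
    """Draw a two-bar chart of unique versus duplicate record counts."""
    fig, ax = plt.subplots(figsize=(4, 3))
    ax.bar(["unique", "duplicate"], [summary["unique"], summary["duplicates"]])
    ax.set_ylabel("records")
    ax.set_title("Duplicate check")
    fig.tight_layout()
    return fig


# Hypothetical toy records; not taken from the study's datasets.
records = pd.DataFrame(
    {
        "patient_id": [1, 2, 2, 3, 3, 3],
        "heart_rate": [72, 80, 80, 65, 65, 66],
        "device": ["A", "B", "B", "A", "A", "A"],
    }
)

summary = summarize_duplicates(records)
fig = plot_duplicates(summary)
print(summary)  # {'total': 6, 'duplicates': 2, 'unique': 4}
```

Calling `fig.savefig("duplicates.png")` would write the chart to disk. A two-bar summary like this stays readable at any dataset size; plots that mark individual duplicate rows are the kind that would likely lose clarity as record counts grow, which is the scalability limitation the abstract notes.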

Item Type: Article (Journal)
Uncontrolled Keywords: Data duplication, Duplicates detection, Healthcare, Internet of things data, Visualization
Subjects: Q Science > QA Mathematics > QA75 Electronic computers. Computer science
Kulliyyahs/Centres/Divisions/Institutes: Kulliyyah of Information and Communication Technology > Department of Computer Science
Depositing User: Azlin Nordin
Date Deposited: 17 Oct 2025 16:34
Last Modified: 17 Oct 2025 16:34
URI: http://irep.iium.edu.my/id/eprint/123798
