IIUM Repository

Machine learning or morphometric scaling? A systematic review of methodological confounds and the generalizability of sex classification in neuroimaging

Sapuan, Abdul Halim and Jamaludin, Iqbal and Abdul Majid, Zafri Azran and Mohd Tamrin, Mohd Izzuddin and Che Azemin, Mohd Zulfaezal and Turaev, Sherzod (2026) Machine learning or morphometric scaling? A systematic review of methodological confounds and the generalizability of sex classification in neuroimaging. Exploration of Neuroprotective Therapy, 6. pp. 1-16. E-ISSN 2769-6510

PDF - Published Version

Abstract

Background: This systematic review critically evaluates whether machine learning (ML) identifies biologically meaningful sex-related brain architecture or merely exploits methodological artifacts and allometric scaling. While ML models achieve high classification accuracies, it remains unclear if these reflect stable, mechanistically informative dimorphism or are driven by confounds such as total intracranial volume (TIV) and site-specific noise. We examine how imaging modalities, algorithms, and population strata influence both classification outcomes and biological interpretability.

Methods: Following Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines, we searched Web of Science, PubMed, and Scopus through January 2024. Included studies [healthy humans, 3T magnetic resonance imaging (MRI), ML-based sex classification] were assessed for risk of bias, focusing on data leakage, validation strategies, and confound management.

Results: Thirty-five studies (n > 110,000) were included. While reported accuracies reached 98.06% for T1-weighted MRI, 96.0% for diffusion MRI (dMRI), and 94.72% for functional MRI (fMRI), performance was highly dependent on population characterization and age. Deep learning consistently outperformed traditional ML (TML) but showed high sensitivity to methodological artifacts. Notably, studies failing to correct for TIV reported potentially inflated accuracies, suggesting that many models identify physical scale rather than intrinsic neuroanatomical dimorphism.

Item Type: Article (Journal)
Uncontrolled Keywords: sex classification, brain MRI, machine learning, deep learning, neuroimaging, grey matter, functional connectivity, diffusion imaging
Subjects: Q Science > QM Human anatomy
Kulliyyahs/Centres/Divisions/Institutes: Kulliyyah of Allied Health Sciences
Kulliyyah of Allied Health Sciences > Department of Diagnostic Imaging and Radiotherapy
Kulliyyah of Allied Health Sciences > Department of Optometry and Visual Science
Kulliyyah of Information and Communication Technology > Department of Information System
Depositing User: Dr. Mohd Zulfaezal Che Azemin
Date Deposited: 30 Mar 2026 12:50
Last Modified: 31 Mar 2026 16:06
Queue Number: 2026-03-Q2642
URI: http://irep.iium.edu.my/id/eprint/128056
