Seeing Alzheimer’s Disease Earlier Through Multimodal AI Data Fusion

12 Jan 2026

As global ageing accelerates, Alzheimer’s disease (AD) has become one of the most pressing health challenges affecting older adults worldwide. Because early symptoms are often subtle and progression is gradual, identifying at-risk individuals and determining disease stage before clear clinical signs emerge remains difficult. Recent advances in artificial intelligence (AI), however, are enabling researchers to explore new ways of predicting and characterising the disease using information from multiple data sources.

A joint research team from Xi’an Jiaotong-Liverpool University (XJTLU) and the University of Sheffield has published a comprehensive review in the international journal Information Fusion, systematically surveying nearly two decades of research on multimodal prediction for Alzheimer’s disease. The paper reviews commonly used data modalities and biomarkers and compares how different fusion strategies have been applied to early detection, disease staging and progression-related tasks.

Beyond summarising existing studies, the team proposes a new, function-oriented taxonomy for multimodal fusion methods. The framework aims to clarify design choices across both machine learning and deep learning paradigms and to support more informed method selection under realistic data and clinical constraints.

The first author, Yifan Guan, a PhD student at XJTLU’s School of Advanced Technology, explains that multimodal prediction involves analysing health information from multiple perspectives. “For Alzheimer’s disease, this includes neuroimaging data such as MRI and PET scans, biological markers from blood or cerebrospinal fluid, genetic information, and clinical or cognitive assessments,” he says. “Each modality captures only part of the disease process. Integrating them allows a more comprehensive representation, which is critical for improving early detection and predicting disease progression.”
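To make the idea of integrating modalities concrete, the sketch below shows one of the simplest forms of multimodal fusion: concatenating per-subject features from several sources into a single vector and training one classifier on it. It is purely illustrative, written in Python with synthetic data and hypothetical feature dimensions; it is not the pipeline from the review or from any specific study.

```python
# Minimal sketch of early (feature-level) fusion, for illustration only.
# All arrays, feature counts and labels are synthetic and hypothetical.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
n_subjects = 200

# Hypothetical per-subject features from three modalities.
mri_features = rng.normal(size=(n_subjects, 50))     # e.g. regional brain volumes from MRI
csf_biomarkers = rng.normal(size=(n_subjects, 3))    # e.g. amyloid-beta, tau, p-tau levels
cognitive_scores = rng.normal(size=(n_subjects, 5))  # e.g. neuropsychological test scores
labels = rng.integers(0, 2, size=n_subjects)         # 0 = cognitively normal, 1 = AD (synthetic)

# Early fusion: concatenate modality features into one vector per subject,
# then train a single classifier on the combined representation.
fused = np.concatenate([mri_features, csf_biomarkers, cognitive_scores], axis=1)
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
scores = cross_val_score(model, fused, labels, cv=5)
print(f"Cross-validated accuracy on synthetic data: {scores.mean():.2f}")
```

The appeal of this approach is that a single model can learn interactions across modalities; the drawback, as the review discusses in more general terms, is that it assumes every modality is available and comparable for every subject.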

Despite rapid progress, translating multimodal AI research into clinical practice remains challenging. Fusion strategies vary widely across studies, and inconsistent terminology makes it difficult to compare increasingly complex model architectures. In real clinical environments, data are often incomplete, imbalanced and heterogeneous across medical centres, meaning that models performing well on public datasets may not generalise reliably. In addition, limited interpretability in some deep learning models can reduce clinicians’ trust and hinder adoption.

To address these issues, the research team reorganised existing fusion approaches based on their functional roles and design logic. The proposed taxonomy refines traditional early and late fusion distinctions by further categorising machine learning pipelines and deep learning architectures according to how information flows between modules. This provides a more interpretable framework for understanding and comparing modern multimodal designs.
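For contrast with the feature-level example above, the following sketch illustrates late (decision-level) fusion: one model is trained per modality and their predicted probabilities are combined into a final decision. Again, the data are synthetic and the simple averaging rule is only one possible choice; the taxonomy proposed in the paper goes well beyond this basic early/late split.

```python
# Minimal sketch of late (decision-level) fusion, for illustration only.
# Synthetic data; modality names and feature counts are hypothetical.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(1)
n = 200
modalities = {
    "mri": rng.normal(size=(n, 50)),
    "csf": rng.normal(size=(n, 3)),
    "cognitive": rng.normal(size=(n, 5)),
}
labels = rng.integers(0, 2, size=n)
idx_train, idx_test = train_test_split(np.arange(n), test_size=0.3, random_state=0)

# Late fusion: train one model per modality, then combine their predicted
# probabilities (here by simple averaging) into a final decision.
probs = []
for name, X in modalities.items():
    clf = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
    clf.fit(X[idx_train], labels[idx_train])
    probs.append(clf.predict_proba(X[idx_test])[:, 1])

fused_prob = np.mean(probs, axis=0)
predictions = (fused_prob >= 0.5).astype(int)
accuracy = (predictions == labels[idx_test]).mean()
print(f"Late-fusion accuracy on synthetic data: {accuracy:.2f}")
```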

“Our classification emphasises how methods actually work,” says Guan. “It helps unify discussion in the field, reduces the cost of comparing and reproducing studies, and guides researchers towards fusion strategies that are more likely to be robust and deployable.”

The paper also highlights future directions for multimodal AI research. The corresponding author, Dr Jun Qi, Associate Professor in the Department of Computer Science at XJTLU, notes that robustness and interpretability will be central to progress. “Future research will focus on maintaining performance when some data modalities are missing, incorporating medical prior knowledge to improve interpretability, and leveraging emerging technologies such as large language models to better organise complex clinical data,” she says.
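One simple intuition behind robustness to missing modalities is that decision-level designs can fall back on whichever sources a subject actually has. The short sketch below illustrates that idea only; it is a hypothetical example, not a method proposed in the paper, and real systems use far richer strategies such as imputation or modality-aware architectures.

```python
# Minimal sketch: combine per-modality predictions only from the modalities
# that are present for a given subject. Purely illustrative.
import numpy as np

def fuse_available(probabilities):
    """Average predicted AD probabilities over whichever modalities are present."""
    available = [p for p in probabilities.values() if p is not None]
    if not available:
        raise ValueError("No modality predictions available for this subject.")
    return float(np.mean(available))

# Hypothetical per-modality model outputs for one subject; CSF is missing.
subject_probs = {"mri": 0.72, "csf": None, "cognitive": 0.64}
print(f"Fused AD probability without CSF: {fuse_available(subject_probs):.2f}")
```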

The author team includes Yifan Guan, Jingzhou Xu, Dr Wei Wang, Dr Jianjun Chen, and Dr Jun Qi from XJTLU, as well as Professor Po Yang from the University of Sheffield.

By Huatian Jin
