Abstract
Scientific Question
Language impairment has been established as a highly sensitive and specific non-invasive behavioral marker of Alzheimer's disease (AD). Evidence indicates that language deterioration follows a continuous trajectory from mild cognitive impairment (MCI) to AD dementia. However, existing research remains fragmented across cognitive stages, and the path from behavioral description to intelligent assessment remains unclear. This review has four aims: (1) to systematically characterize the evolution of language performance from healthy aging through MCI to AD; (2) to elucidate the underlying impairments in core cognitive systems; (3) to clarify the role and pathways of computational linguistic analysis in quantifying language behaviors, assessing cognitive function, and enabling intelligent AD assessment based on specific linguistic features; and (4) to identify critical research gaps.
Methods
This review employs a narrative synthesis approach. We first trace the longitudinal trajectory of language performance from healthy aging through MCI to AD and relate it to underlying impairments in semantic memory, working memory, and executive function. We then integrate research paradigms in computational linguistic analysis across three dimensions: language task design, feature engineering and modeling strategies, and clinical integration and validation. Finally, we offer a forward-looking perspective on three future directions: data expansion, technological innovation, and theoretical deepening.
Results
Language performance deteriorates progressively across the AD continuum. In MCI, patients exhibit impaired explicit semantic retrieval and subtle syntactic simplification, while semantic knowledge storage remains relatively intact. In AD, patients demonstrate profound semantic network degradation, syntactic structure collapse, and pronounced phonological abnormalities. These behavioral patterns map onto impairments in three core cognitive systems: semantic memory, working memory, and executive function. Building on these behavioral and cognitive characterizations, two algorithmic paradigms demonstrate potential for intelligent screening. Machine learning models based on handcrafted features offer strong interpretability by identifying sensitive linguistic markers linked to specific cognitive domains. Deep learning-based methods capture subtle language patterns and leverage large language models to simulate language deterioration, providing new avenues for understanding how cognitive impairments lead to language deficits. Beyond algorithmic performance, the value of computational linguistic analysis lies in supplying objective behavioral data that can be integrated with traditional clinical assessment systems. This integration first requires dual validation: linguistic features must demonstrate both clinical validity and biological validity, which enhances explanatory depth and translational potential. In practice, computational analysis can be embedded into different stages of clinical work, providing auxiliary signals for physicians during community screening and outpatient consultations.
Conclusions
Future research should prioritize three complementary directions. First, it should advance from static assessment to dynamic monitoring through long-term cohort studies and naturalistic data collection, supported by international collaboration to enrich data sources. Second, it should develop new algorithms that enhance model interpretability and leverage large language models to simulate language deterioration. Third, it should promote multimodal integration that combines linguistic features with other modalities, enabling more accurate prediction and clarifying the relationships among language abnormalities, cognitive impairments, and AD pathology.
Application Implications
For clinical practitioners, this review provides guidance for establishing stage-specific screening protocols that focus on early indicators to support accurate identification. For technology developers, the findings emphasize the need for explainable algorithms and multimodal tools that prioritize clinical trust. For healthcare administrators, promoting pilot applications in community and home-based settings can facilitate the shift from episodic assessment to continuous monitoring.