Nature Medicine | Di Qian’s Team Develops a Large Language Model Framework for Assessing Multidimensional Aging in the General Population

2025-07-28 15:39

Aging is a major risk factor for chronic diseases and mortality, posing significant challenges to global public health. It has become increasingly clear that aging is not a single process but a complex system shaped by many interwoven factors, including genetics, environment, lifestyle, and medical history. Chronological age alone therefore cannot accurately reflect an individual’s true state of aging. Current aging assessment tools, such as the frailty index and aging clocks, capture some aspects of an individual’s aging but are limited in four ways: methodological constraints (supervised models use chronological age as the label, which may reduce their ability to capture actual biological aging), weak associations with adverse health outcomes (each tool captures only specific aspects of aging), limited generalizability (they are difficult to apply widely across diverse populations), and high cost (e.g., epigenetic clocks depend on methylation testing, which is not suitable for large-scale screening).


Recent work suggests that unsupervised models may be better suited than supervised models to capturing aging signals. Large language models, among the most advanced unsupervised models currently available, first acquire broad world knowledge, including medical knowledge, through pre-training on vast cross-domain text. Fine-tuning then activates specialized capabilities in fields such as medicine and aging. These models offer three advantages: (1) they avoid the limitations imposed by reliance on aging labels; (2) they integrate multiple aging-related factors (biological indicators, lifestyle, socioeconomic status, medical history, genetics, etc.), which strengthens their association with adverse outcomes; and (3) they can process data in any format, making them well suited to large-scale, community-level applications. Given these strengths, the knowledge integration, generalization, and multimodal processing capabilities of large language models hold great promise for aging research.


On July 23, 2025, Associate Professor Di Qian from the Vanke School of Public Health at Tsinghua University, Professor Yang Yining from the People’s Hospital of the Xinjiang Uygur Autonomous Region, Assistant Professor Ma Weizhi from the Institute for AI Industry Research at Tsinghua University, and their research team published a research article online in Nature Medicine titled “Large language model-based biological age prediction in large-scale populations.” The study proposes a novel aging assessment framework that uses large language models to extract multidimensional aging-related information from unstructured health data and to predict an individual’s overall and organ-specific biological age.


To evaluate the proposed framework, the study integrated five nationally representative population cohorts, namely the UK Biobank (UKB), the National Health and Nutrition Examination Survey (NHANES) in the United States, the China Health and Retirement Longitudinal Study (CHARLS), the Chinese Longitudinal Healthy Longevity Survey (CLHLS), and the China Family Panel Studies (CFPS), together with the research team’s own real-world population cohort from Northwest China (NCRP), for a total sample size exceeding 10 million. First, the study generated text-based medical reports from routine health indicators and used eight large language models to assess both the overall and the organ-specific aging levels of individuals. The age predicted by the large language models was treated as a more comprehensive proxy for aging. Next, the study validated the associations of the predicted age and the age gap (i.e., the difference between the model-predicted age and chronological age) with various aging-related adverse outcomes. These associations were compared with those obtained from traditional machine learning models and from other classical aging indicators, such as epigenetic age, telomere length, and the frailty index. Finally, the study explored the dynamic aging assessment capabilities of large language models, applied the age gap to various biomedical and clinical downstream tasks, and conducted an interpretability analysis of how the large language models assess aging.
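The first two steps described above, turning routine indicators into a text report and computing the age gap, can be sketched roughly as follows. This is only an illustration under stated assumptions, not the study’s implementation: the `HealthRecord` fields, the report wording, and the function names are hypothetical, the sign convention (predicted minus chronological) is assumed, and the predicted age is a hard-coded placeholder standing in for a large language model’s answer.

```python
from dataclasses import dataclass


@dataclass
class HealthRecord:
    """Hypothetical container for a few routine health-examination indicators."""
    sex: str
    systolic_bp_mmhg: float
    fasting_glucose_mmol_l: float
    bmi: float
    smoker: bool


def textualize(record: HealthRecord) -> str:
    """Render structured indicators as a plain-text report suitable for a prompt."""
    lines = [
        f"Sex: {record.sex}",
        f"Systolic blood pressure: {record.systolic_bp_mmhg:.0f} mmHg",
        f"Fasting glucose: {record.fasting_glucose_mmol_l:.1f} mmol/L",
        f"Body mass index: {record.bmi:.1f} kg/m^2",
        f"Current smoker: {'yes' if record.smoker else 'no'}",
    ]
    return "\n".join(lines)


def age_gap(predicted_age: float, chronological_age: float) -> float:
    """Age gap, assumed here to be predicted age minus chronological age."""
    return predicted_age - chronological_age


# Illustrative values only; none of these numbers come from the study.
record = HealthRecord(
    sex="female",
    systolic_bp_mmhg=138.0,
    fasting_glucose_mmol_l=5.9,
    bmi=27.4,
    smoker=False,
)
report = textualize(record)

# Placeholder for the age a language model would return after reading `report`.
predicted_age = 61.0
gap = age_gap(predicted_age, chronological_age=56.0)
```

Under this assumed convention, a positive gap would indicate that the model judges the person to be older than their chronological age, which is the quantity the study relates to adverse outcomes.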

The study highlights the following key features:

(1) The study employed an unsupervised aging assessment framework based on large language models, offering a novel and comprehensive aging proxy metric. The biological age estimates and age gaps derived from this framework outperformed traditional epigenetic clocks and state-of-the-art machine learning models in predicting a wide range of aging-related phenotypes and adverse health outcomes, including all-cause mortality.

(2) Leveraging the strong generalization capabilities of large language models, the study validated the robustness of its aging assessment framework across diverse populations from different countries and regions, covering more than 10 million individuals.

(3) By harnessing the “real-time learning and memory” capabilities of large language models, the study proposed a dynamic aging assessment framework, a feat difficult to achieve with traditional methods. This framework is capable of processing continuously accumulated longitudinal health information and modeling individual aging trajectories, potentially serving as a future digital twin for individual health.

(4) The study developed a low-cost aging assessment tool based on large language models. With only a routine health examination report, the model can provide a convenient, reliable, and cost-effective multidimensional aging assessment, thereby enhancing the accessibility of expert-level medical resources to large general populations.

(5) Using the proposed method, the study identified multiple novel proteomic biomarkers associated with accelerated aging. It also comprehensively evaluated the associations between age gap and 270 diseases, providing a panoramic view of the relationships between accelerated aging and disease.


In summary, large language models can predict an individual’s overall and organ-specific biological age from routine health examination reports. Through validation across six large population cohorts covering more than 10 million participants, the study confirmed the effectiveness and reliability of the proposed framework. Using large language models to predict biological age and age gap enables more accurate modeling of individual aging and assessment of health risk, providing significant support for large-scale population health management.


Associate Professor Di Qian from the Vanke School of Public Health at Tsinghua University, Professor Yang Yining from the People’s Hospital of the Xinjiang Uygur Autonomous Region, and Assistant Professor Ma Weizhi from the Institute for AI Industry Research at Tsinghua University are the co-corresponding authors of the study. PhD students Li Yanjun and Huang Qi from the Vanke School of Public Health at Tsinghua University, PhD student Jiang Jin from the Wangxuan Institute of Computer Technology at Peking University, and postdoctoral researcher Du Xusheng from the Vanke School of Public Health at Tsinghua University are the co-first authors.