Differentiating Alzheimer’s disease from mild cognitive impairment: a quick screening tool based on machine learning

STRENGTHS AND LIMITATIONS OF THIS STUDY

The tool incorporates activities of daily living, behavioural and psychological symptoms of dementia, and cognitive function to provide a more comprehensive assessment.

Four machine learning models were used to verify the robustness of the new tool.

The cut-off point requires further investigation.

Introduction

The world’s population is ageing at an unprecedented rate. The population aged over 60 is estimated to rise to 2 billion by 2050, with two-thirds living in low-income and middle-income countries.1 A large proportion of older people will experience normal or pathological memory loss.2 Alzheimer’s disease (AD) is a neurodegenerative disorder characterised by progressive cognitive decline, accompanied by behavioural and psychological symptoms of dementia (BPSD), and patients eventually lose the ability to perform activities of daily living (ADL). It deprives people of their independence and is the fifth leading cause of death across the world.3 Although several medications in use or under development have shown clinical or pathological benefits, AD remains incurable. Mild cognitive impairment (MCI) is considered a condition in which cognitive abilities are below age expectations.3 4 The rate of conversion from the amnestic form of MCI to AD is estimated to be 10%–15% per year.5 However, not all people with MCI develop dementia. The strategies for managing AD and MCI differ. Diagnosing AD and MCI is neither universal nor easy: only 20%–50% of those affected have a diagnosis recorded in primary care notes, and this proportion is even lower in low-income countries.4 Guidelines across countries recommend that people with suspected dementia be referred to a memory clinic.6 7 A systematic evaluation including the patient’s and informant’s history, medication review, structured cognitive assessment, blood tests, genetic examination and imaging is the best approach.8 However, prolonged comprehensive evaluations may deter less motivated patients. A simple and convenient tool is needed for initial screening.

The traditional batteries evaluating the core symptoms of AD assess ADL, BPSD and cognition (ABC) separately, which takes a long time, while emerging computerised assessment tools tend to focus on cognition and usually omit the assessment of ADL and BPSD.9 10 A quick, simultaneous assessment of the ABC domains would help general practitioners and social workers roll out screening in the population.

Machine learning (ML) is emerging in clinical neuropsychology as a credible technique to support this process and has been applied to diagnosis in some research on neurodegenerative diseases.11 These methods can classify, generalise, optimise and learn relationships between potential variables, and have been used in MCI and AD research to attempt to determine anatomical biomarkers for MCI and conversion to AD.12–14 Some previous studies used sociodemographic data and clinical health records to predict MCI and AD15 16 and to compare predictors for identifying MCI and dementia in non-patient populations using ML methods.17 Moreover, some studies have tried to build predictive models or optimise neuropsychological assessments by using functional and behavioural cognitive measures18 19 or by assessing MCI conversion to AD.20

To our knowledge, no published articles have revealed a scale-based ML method for differentiating AD from MCI. In the present study, we aimed to provide a simplified scale composed of selected items that cover the three domains of ADL, BPSD and cognition to differentiate AD from MCI in a comprehensive way.

Methods

Patient and public involvement

Patients or the public were not involved in the design, or conduct, or reporting or dissemination plans of our research.

A total of 458 patients newly diagnosed with AD or MCI from the memory clinic from 2014 to 2019 with complete records of the ABC tests were included in this study (online supplemental material 1). AD was diagnosed by the criteria for probable AD updated by the National Institute on Aging-Alzheimer’s Association (NIA/AA) in 2011, and the diagnosis of MCI was also based on NIA/AA diagnostic guidelines.21 22

Instruments and assessment

Eleven scales were applied in this study,23 including the Chinese version of the Mini-Mental State Examination (MMSE), Neuropsychiatric Inventory (NPI), Geriatric Depression Scale (GDS), Boston Naming Test (BNT), Digit Span (DS), Auditory Verbal Learning Test (AVLT), Trail Making Test (TMT) A and B, Alzheimer’s Disease Assessment Scale (ADAS-cog), Clinical Dementia Rating (CDR), Instrumental Activity of Daily Living Scale (IADL) and Physical Self-Maintenance Scale (PSMS), to evaluate ADL, BPSD and cognition. All scales were collected at the time of the patients’ initial visit.

Statistical analysis

Demographic characteristics were summarised with frequency distributions. Independent-samples t-tests and χ2 tests were used to compare groups on continuous and categorical variables, respectively.

Machine learning

XGBoost and SHapley Additive exPlanations (SHAP) were used to identify the importance of each item and, in conjunction with clinical practice, to create the ABC-Scale. XGBoost, Bayes, support vector machines (SVMs) and logistic regression (LR) were then used to verify the validity of the ABC-Scale by the area under the curve (AUC) (online supplemental material 2).

Data processing

The data were processed by K-fold cross-validation: the raw data were divided into K segments (folds), and K rounds of experiments were conducted. In the first round, the first fold was used for testing and the remaining K−1 folds were used for training. This process was repeated, with a different fold held out for testing each time, until the final round, in which the last fold was used for testing and the other K−1 folds were used for training. To limit overfitting to any single split, the receiver operating characteristic (ROC) results from all K rounds were averaged (online supplemental material 3).
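The rotation of test folds described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the study's code: the stratified fold assignment, the midpoint-threshold classifier (`fit_midpoint`, `predict_midpoint`) and the toy data are all hypothetical, and the source does not say whether its folds were stratified or shuffled.

```python
import random

def stratified_folds(y, k, seed=0):
    """Deal the shuffled indices of each class round-robin into k folds."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for label in sorted(set(y)):
        members = [i for i, yi in enumerate(y) if yi == label]
        rng.shuffle(members)
        for position, index in enumerate(members):
            folds[position % k].append(index)
    return folds

def auc(labels, scores):
    """Probability that a positive case outscores a negative one (ties count 0.5)."""
    pos = [s for lab, s in zip(labels, scores) if lab == 1]
    neg = [s for lab, s in zip(labels, scores) if lab == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def cross_validated_auc(X, y, k, fit, predict, seed=0):
    """Hold out each fold once, train on the other k-1 folds, average the AUCs."""
    fold_aucs = []
    for test_idx in stratified_folds(y, k, seed):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(X)) if i not in held_out]
        model = fit([X[i] for i in train_idx], [y[i] for i in train_idx])
        scores = [predict(model, X[i]) for i in test_idx]
        fold_aucs.append(auc([y[i] for i in test_idx], scores))
    return sum(fold_aucs) / k, fold_aucs

# Hypothetical stand-in classifier: a threshold halfway between the class means.
def fit_midpoint(X_train, y_train):
    mean1 = sum(x[0] for x, lab in zip(X_train, y_train) if lab == 1) / y_train.count(1)
    mean0 = sum(x[0] for x, lab in zip(X_train, y_train) if lab == 0) / y_train.count(0)
    return (mean0 + mean1) / 2

def predict_midpoint(threshold, x):
    return x[0] - threshold  # a larger score means "more likely class 1"

# Synthetic, perfectly separable toy data (not study data).
X_demo = [[float(v)] for v in range(10)] + [[float(v)] for v in range(20, 30)]
y_demo = [0] * 10 + [1] * 10
mean_auc, fold_aucs = cross_validated_auc(X_demo, y_demo, 5, fit_midpoint, predict_midpoint)
print(mean_auc, fold_aucs)
```

Stratifying the folds is a design choice made here so that every test fold contains both classes and the per-fold AUC is always defined.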

XGBoost

XGBoost is an open-source framework for gradient boosting24 and is one of the boosting algorithms.25 The idea of boosting is to combine many weak classifiers into a robust classifier. As a boosted tree model, XGBoost uses many tree models to create a robust classifier. The tree model used was the classification and regression tree (CART).26 27 A CART is a binary tree28 that splits on the features repeatedly. In essence, a CART partitions the sample space along the feature dimensions, and finding the optimal partition is an NP-hard problem. Therefore, decision tree models use a heuristic method to solve it. The objective function generated by a typical CART is:

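$$\sum_{x_i \in R_m} \bigl(y_i - f(x_i)\bigr)^2$$

where R_m is the region of the sample space covered by leaf m and f(x_i) is the tree’s prediction for sample x_i.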

Therefore, to solve the optimal segmentation feature j and the optimal segmentation point s, it is converted into solving such an objective function:

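$$\min_{j,\,s}\left[\min_{c_1}\sum_{x_i \in R_1(j,s)} (y_i - c_1)^2 + \min_{c_2}\sum_{x_i \in R_2(j,s)} (y_i - c_2)^2\right]$$

where R_1(j,s) and R_2(j,s) are the two regions produced by splitting feature j at point s, and c_1 and c_2 are the fitted values in each region.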
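The heuristic search for a splitting feature j and splitting point s at a single node can be illustrated as follows. This is a minimal sketch that assumes a squared-error criterion and candidate thresholds at midpoints between observed values; the function names and the toy data are hypothetical, and XGBoost itself uses a gradient-based gain rather than this plain criterion.

```python
def squared_error(values):
    """Sum of squared deviations from the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)

def best_split(X, y):
    """Search every feature j and threshold s for the lowest total squared error."""
    best = None  # (loss, feature index j, threshold s)
    for j in range(len(X[0])):
        values = sorted(set(row[j] for row in X))
        # Candidate thresholds: midpoints between consecutive distinct values.
        for low, high in zip(values, values[1:]):
            s = (low + high) / 2
            left = [yi for row, yi in zip(X, y) if row[j] <= s]
            right = [yi for row, yi in zip(X, y) if row[j] > s]
            loss = squared_error(left) + squared_error(right)
            if best is None or loss < best[0]:
                best = (loss, j, s)
    return best

# Toy data (not study data): feature 0 separates the two classes, feature 1 does not.
X_demo = [[1, 5], [2, 3], [3, 8], [10, 4], [11, 7], [12, 2]]
y_demo = [0, 0, 0, 1, 1, 1]
loss, feature, threshold = best_split(X_demo, y_demo)
print(loss, feature, threshold)
```

Applying the same search recursively to each resulting region grows the tree greedily, which is how the NP-hard global partitioning problem is sidestepped.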

The XGBoost algorithm grows each tree by repeatedly splitting on features and keeps adding trees. Each time a tree is added, it learns a new function to fit the residual of the previous prediction. After training, we obtained a set of k trees. To predict the score of a given sample, the sample’s attributes were matched to a leaf node in each tree. Each leaf node was associated with a score, and the scores from all the trees were summed to give the predicted value for the sample. The XGBoost objective function was defined as:

$$\mathrm{Obj} = \sum_{i=1}^{n} l(y_i, \hat{y}_i) + \sum_{t=1}^{k} \Omega(f_t)$$

where

$$\Omega(f) = \gamma T + \frac{1}{2}\mu \lVert w \rVert^{2}$$

The objective function consisted of two parts: the first measured the difference between the predicted and actual scores, and the second was the regularisation term. The regularisation term also contained two parts: T represented the number of leaf nodes, and w represented the scores of the leaf nodes. γ controlled the number of leaf nodes, and μ kept the leaf scores from becoming too large, which helped prevent overfitting.29
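As a rough illustration of how the two parts of the objective combine, the sketch below sums leaf scores across trees to obtain a prediction and then adds a penalty of γT plus half of μ times the squared leaf scores for each tree. The logistic loss, the one-split trees stored as dictionaries and all numeric values are assumptions made for this example; they are not taken from the study.

```python
import math

def leaf_score(tree, x):
    """Route sample x to the left or right leaf of a one-split tree."""
    return tree["left"] if x[tree["feature"]] <= tree["threshold"] else tree["right"]

def predict_margin(trees, x):
    """Sum the leaf scores that sample x reaches in every tree."""
    return sum(leaf_score(tree, x) for tree in trees)

def log_loss(y, margin):
    """Logistic loss between the label y (0 or 1) and the summed score."""
    p = 1.0 / (1.0 + math.exp(-margin))
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))

def penalty(tree, gamma, mu):
    """gamma * (number of leaves) + 0.5 * mu * (sum of squared leaf scores)."""
    weights = [tree["left"], tree["right"]]
    return gamma * len(weights) + 0.5 * mu * sum(w * w for w in weights)

def objective(trees, X, y, gamma, mu):
    """Training loss over all samples plus the penalty of every tree."""
    loss = sum(log_loss(yi, predict_margin(trees, xi)) for xi, yi in zip(X, y))
    return loss + sum(penalty(tree, gamma, mu) for tree in trees)

# Hypothetical trees and toy data (not study data).
demo_trees = [
    {"feature": 0, "threshold": 0.5, "left": -1.0, "right": 2.0},
    {"feature": 1, "threshold": 1.5, "left": -0.5, "right": 0.5},
]
X_demo = [[0.0, 1.0], [1.0, 2.0], [1.0, 1.0]]
y_demo = [0, 1, 1]
print(objective(demo_trees, X_demo, y_demo, gamma=1.0, mu=0.5))
```

Raising γ makes every extra leaf more costly, while raising μ shrinks the leaf scores, so both settings push the model towards simpler trees.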

In our experiment, we performed regularisation processing and obtained the final input vector for XGBoost. In addition to the input matrix, we also set the hyperparameters of XGBoost, such as the number of iterations and the learning rate. $X$ was a matrix composed of the scores of all questions in each scale. Assuming there were a total of V questions in all scales, then $X = [x_1, x_2, \ldots, x_V]$ after regularisation, and $y_i$ represents whether the i-th sample belongs to AD or MCI.

SHAP analysis

The model generated a predicted value for each sample, and the SHAP value was the contribution assigned to each feature of that sample.30 In other words, the Shapley value of each feature was calculated, and this quantity was used to measure the feature’s contribution to the final output value. Expressed as a formula:

$$g(z) = \Phi_0 + \sum_{j=1}^{M} \Phi_j z_j$$

where g was the explanatory model, M was the number of input features, z_j indicated the presence or absence of the j-th feature (1 or 0), Φ_j was the attributed value (Shapley value) of the j-th feature and Φ_0 was a constant. Since the input of the tree model was structured data, all features were present for sample x, so the formula becomes:

$$g(x) = \Phi_0 + \sum_{j=1}^{M} \Phi_j$$
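The additive property described above can be checked on a small example by computing Shapley values exactly. The sketch below enumerates every feature coalition and replaces absent features with a baseline value. The toy model, the all-zero baseline and the function names are hypothetical. The enumeration is also only practical for a handful of features, so it stands in for, and is not, the faster tree-specific algorithm that SHAP implementations use for models such as XGBoost.

```python
from itertools import combinations
from math import factorial

def shapley_values(model, x, baseline):
    """Exact Shapley values by enumerating every coalition of features."""
    m = len(x)

    def value(coalition):
        # Features outside the coalition take their baseline value.
        z = [x[i] if i in coalition else baseline[i] for i in range(m)]
        return model(z)

    phi = []
    for j in range(m):
        others = [i for i in range(m) if i != j]
        contribution = 0.0
        for size in range(m):
            weight = factorial(size) * factorial(m - size - 1) / factorial(m)
            for subset in combinations(others, size):
                members = set(subset)
                contribution += weight * (value(members | {j}) - value(members))
        phi.append(contribution)
    return value(set()), phi  # constant term and one value per feature

def toy_model(z):
    """Hypothetical model: one additive term and one interaction term."""
    return 2.0 * z[0] + z[1] * z[2]

x_demo = [1.0, 2.0, 3.0]
baseline_demo = [0.0, 0.0, 0.0]
phi_0, phi = shapley_values(toy_model, x_demo, baseline_demo)
print(phi_0, phi)
```

In this toy case the interaction between the second and third features is shared equally between them, and the constant plus the three feature values adds up to the model output for `x_demo`.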

The SHAP method thus offers a reasonable solution for model interpretability and helps people better understand ML models.

Results

Demographics of participants enrolled

A total of 236 patients were diagnosed with AD and 222 with MCI. The sample comprised 180 males and 278 females, with a mean age of 72.99 (SD=9.04) years. The gender distribution differed significantly between the AD and MCI groups (p<0.01). The mean duration of education was 10.30 years (SD=7.08, p=0.46) (table 1).

Table 1

Demographics and summary outcomes of participants enrolled

Items filtered by ML

Overall

The CART regression tree showed the importance of items. Among all the battery items, the ADAS-cog word recognition task showed the highest importance in distinguishing AD from MCI, followed by correct numbers of AVLT delay recall, ADAS-cog orientation, MMSE recall and TMT-B time (figure 1), while 31 items were excluded because their variable importance was 0. Another 14 items were removed after combining the SHAP analysis (figure 2). The 45 removed items are listed in online supplemental material 4. In addition, education was an independent non-ABC factor that played an important role in distinguishing AD from MCI.

Figure 1

Importance of all features in a vertical chart. Thirty-one items are not shown in the chart because they had zero importance. The Arabic numerals in the figure represent the item numbers in the GDS. ADAS, Alzheimer’s Disease Assessment Scale; CDR, Clinical Dementia Rating; GDS, Geriatric Depression Scale; IADL, instrumental activities of daily living; MMSE, Mini-Mental State Examination; NPI, Neuropsychiatric Inventory; TMT, Trail Making Test.

Figure 2

SHAP distribution of all items. The Arabic numerals in the figure represent the item numbers in the GDS. ADAS, Alzheimer’s Disease Assessment Scale; BNT, Boston Naming Test; CDR, Clinical Dementia Rating; GDS, Geriatric Depression Scale; IADL, instrumental activities of daily living; MMSE, Mini-Mental State Examination; NPI, Neuropsychiatric Inventory; SHAP, SHapley Additive exPlanations; TMT, Trail Making Test.

Activities of daily living

After combining CART regression and SHAP distribution, we selected important ADL indicators. The item with the best resolution was the IADL item on the ability to handle finances, followed by responsibility for one’s own medications and housekeeping, whereas none of the PSMS items contributed to differentiating AD from MCI.

Behavioural and psychological symptoms of dementia

For BPSD, the item with the best resolution was the GDS total score, followed by the GDS items ‘Have you dropped many of your activities and interests?’, ‘Do you prefer to avoid social gatherings?’, ‘Is your mind as clear as it used to be?’, ‘Do you feel that your life is empty?’ and ‘Is it hard for you to get started on new projects?’. The NPI showed a limited contribution to differentiating AD from MCI; within it, the NPI total score and irritability were relatively important.

Cognitive function

For cognition, the item with the best resolution was ADAS-cog word recognition, followed by AVLT delay recall, ADAS-cog orientation, MMSE recall, and TMT-B time.

Selected ABC-Scale

Items were selected based on the primary ML results of CART and SHAP and on the feasibility of administering the scale in practice. The selected ABC-Scale covers the three areas of ADL, BPSD and cognitive function, with an estimated completion time of 18 min (table 2). Four models were used to assess the discrimination of the ABC-Scale. Discrimination, measured by AUC, improved in XGBoost (from 0.83 to 0.85), Bayes (from 0.77 to 0.83), SVM (from 0.80 to 0.85) and logistic regression (from 0.84 to 0.86) (figure 3). The Tjur R2 value of the logistic regression was 0.38. A comparison of the accuracy of the ABC-Scale under each ML algorithm is presented in online supplemental material 5.
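For readers unfamiliar with the two summary measures reported here, the sketch below shows one way to compute them from predicted probabilities. It assumes that the Tjur value refers to Tjur's coefficient of discrimination, that is, the mean predicted probability in the positive group minus the mean in the negative group. The labels and probabilities are invented for illustration and are not study data.

```python
def auc(labels, probs):
    """Probability that a positive case receives a higher score than a negative one."""
    pos = [p for lab, p in zip(labels, probs) if lab == 1]
    neg = [p for lab, p in zip(labels, probs) if lab == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def tjur_r2(labels, probs):
    """Mean predicted probability of positives minus that of negatives."""
    pos = [p for lab, p in zip(labels, probs) if lab == 1]
    neg = [p for lab, p in zip(labels, probs) if lab == 0]
    return sum(pos) / len(pos) - sum(neg) / len(neg)

# Invented example: 1 = AD, 0 = MCI, with hypothetical predicted probabilities of AD.
labels_demo = [1, 1, 1, 0, 0, 0]
probs_demo = [0.9, 0.7, 0.4, 0.6, 0.2, 0.1]
print(auc(labels_demo, probs_demo), tjur_r2(labels_demo, probs_demo))
```

AUC depends only on how the cases are ranked, whereas Tjur's coefficient also reflects how far apart the predicted probabilities of the two groups are, so the two measures can differ noticeably for the same model.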

Figure 3

ROC curves for (A) XGBoost, (B) Bayes, (C) SVM and (D) logistic regression. ROC, receiver operating characteristic; SVM, support vector machine.

Table 2

Selected ABC-Scale items

Discussion

In this study, the complete records of scales (including ADL, BPSD and cognition) of 458 patients diagnosed with AD or MCI were analysed by CART regression and SHAP analysis. The most informative items for ADL, BPSD and cognition were the ability to handle finances, the GDS total score and ADAS-cog word recognition, respectively. Taking into account the feasibility of administering the scale in practice, we provide a quick screening ABC-Scale with good discrimination.

Loss of living capacity is the most important symptom of late AD and the greatest burden on caregivers. There is still a lack of evidence on which specific capacity losses are early manifestations that separate AD from MCI. Our results show that a positive response on financial management, responsibility for one’s own medications or housekeeping predicts a high risk of AD. This is not surprising, since previous studies have demonstrated that financial capacity is impaired in the prodromal and mild stages of AD, which is associated with the extent of cortical β-amyloid deposition.31 Responsibility for one’s own medications may be related to the decline in memory and executive function and to loss of insight. In addition, housekeeping ability may be associated with depressive symptoms or physical ability. These symptoms related to daily life not only help clinicians reach a diagnosis but also have high early-warning value for patients in home and nursing environments.

Previous studies have shown that depression is one of the most common BPSD of AD. Quantitative analysis suggests that the prevalence of depressive syndrome in AD is approximately 29%.32 However, depressive syndrome covers a wide range of symptoms, and it can be challenging for clinicians to differentiate the specific symptoms in patients with AD. In addition, cohort studies have illustrated that late-life depression is a risk factor for AD.33 34 It is biologically plausible that depression increases the risk of dementia due to its impact on stress hormones, neuronal growth factors and hippocampal volume.35 However, it remains unclear whether depressive symptoms are an early sign of AD or a factor that predisposes individuals to dementia. Our study highlights that depressive symptoms dominated by anhedonia may be an important BPSD manifestation in differentiating MCI from AD, which is a promising strategy for guiding clinicians towards a differential diagnosis. Our study also shows that social isolation is an important BPSD in differentiating MCI from AD. It may increase the risk of depression and result in cognitive inactivity according to some studies.36 37 It is worth noting that the NPI scale did not perform well in distinguishing AD from MCI and that none of the NPI items was included in the selected ABC-Scale; one explanation is that most BPSD are present in the late stage of AD. Measuring other BPSD in patients with AD nevertheless remains indispensable, since patients need to receive individualised treatment according to their condition and the course of the disease. However, the stages of AD were not included in the demographic data, which is a limitation of this study.

Most cognitive features provide information on memory and executive function. This also confirms the importance of memory in age-related cognitive impairment.38 Our study demonstrates that TMT-B is better than TMT-A at differentiating AD from MCI, suggesting that executive function is more impaired than attention in patients with AD. In addition, ADAS-cog orientation plays a key role instead of MMSE orientation. The main difference between the two is that the ADAS-cog time orientation test requires orientation accurate to the hour, whereas the MMSE time orientation test asks only for the specific day. Some previous studies have suggested that reduced regional cerebral blood flow in the left posterior cingulate cortex is specifically associated with orientation to time, while hypoperfusion in the left superior parietal lobule and bilateral inferior parietal lobule may be AD specific.39 More research is needed to reveal the biological mechanisms underlying temporal disorientation in patients with AD.

Combining ML results with clinical practice, this study provides a simplified scale with good reliability and validity that can be completed quickly for ADLs, BPSD and cognition (estimated completion time of 18 min). Some cognitive items (eg, ADAS-cog word recognition, correct numbers of AVLT delay recall, ADAS-cog orientation, MMSE recall, TMT-B time) were more effective than some ADL and BPSD items at differentiating AD from MCI. However, a simplified scale built only from cognitive items would not be comprehensive, since ADLs, BPSD and cognition are all indispensable, independent dimensions for measuring AD and MCI. All four models were used to assess the discrimination of the ABC-Scale, and all four showed a higher AUC. This means that the ABC-Scale greatly reduces the time needed for evaluation while maintaining good discrimination. The cut-off point of the simplified scale was presented solely from an ML perspective. It should be further validated in large-scale populations to provide a more convincing threshold. ML has also been used to evaluate linguistic features within voice to classify dementia status.40–43 Considering that some tests require subjects to reply by voice, combining voice with a rapid screening scale may be a more attractive strategy.

Strengths and limitations

A strength of this study is that we included multiple scales that cover ADLs,44 BPSD45 and cognition46 to provide a non-invasive, reliable screening tool for differentiating MCI and AD, whereas most of the standard dementia rating scales currently in use are assessed only in terms of a total score or evaluate a single aspect and take a long time. Moreover, four models (XGBoost, Bayes, SVM and logistic regression) verified that our results are generally robust. However, the present study has some limitations and future challenges. First, as this is a primary-phase study, the cut-off point of the simplified scale is presented from an ML point of view, and it should be further validated in a large-scale population or an external database. Second, excluding patients without complete records may have introduced selection bias towards higher-functioning individuals. Third, insufficient demographic data, such as medications, should be considered a limitation of this study.

Conclusion

The quick screen ABC-Scale covers three dimensions of ADL, BPSD, and cognitive function with good efficiency in differentiating AD from MCI.

Relevance for clinical practice

Neuropsychological assessment of cognitive impairment is challenging in clinical practice in developing countries because it is time-consuming, patients are numerous and well-trained clinicians are lacking. The present study used ML methods and a 6-year database to provide a comprehensive quick screen ABC-Scale that covers the three dimensions of ADL, BPSD and cognitive function and differentiates AD from MCI efficiently. This tool provides a new strategy for large-scale screening of patients with AD in community and care settings.

Data availability statement

Data are available upon reasonable request.

Ethics statements

Patient consent for publication

Ethics approval

Before the study began, the protocol was reviewed and approved by the Ethics Committee of The First Affiliated Hospital of Chongqing Medical University (No.2014-15-2). Written informed consent was obtained from all patients before any study procedures were performed.

Acknowledgments

We thank all the patients who participated in this study.
