Identifying Physiological and Cognitive Indicators of Subthreshold Depression and Major Depressive Disorder Progression Risk

Introduction

Major depressive disorder (MDD) is one of the leading causes of disability worldwide, with many cases taking a chronic course.1 A significant proportion of suicides are related to an MDD diagnosis.2 When depressive symptoms go unrecognized until the individual becomes aware of their distress, the result can be prolonged untreated periods, worse outcomes, and an increased risk of chronicity.3 Therefore, the importance of recognizing these symptoms before MDD onset, as well as of early intervention, has been emphasized. MDD is classically viewed as a binary, on-off disorder: a diagnosis requires at least five symptoms, including depressed mood or anhedonia, occurring within the same two-week period, and individuals are classified as clinical or non-clinical according to whether they meet this threshold.4 This binary diagnostic model is challenged by the growing acceptance of a continuous spectrum model, which holds that depressive conditions exist on a graduated scale of severity and type, ranging from mild mood fluctuations to recurrent, persistent, and severe depressive episodes.5,6 Consequently, understanding the minimal and mild end of this depressive spectrum is critical, as it enables the timely detection of predictive symptoms that might otherwise progress undetected to more severe manifestations of MDD.

On the mild side of this spectrum is subthreshold depression (StD), a subclinical spectrum between normal health and MDD that does not meet the diagnostic criteria for MDD.7–9 Failing to meet the diagnostic criteria for MDD refers to having fewer than five symptoms listed in DSM-5, such as depressed mood or loss of interest, or insufficient duration of symptoms. According to a recent scoping review of StD, the lifetime prevalence rate in the general population is approximately 11.4%, and reports indicate that approximately 16–29% of adolescents and young adults meet the criteria.10 StD is associated with a high rate of suicide attempts, and patients are reported to exhibit functional impairments comparable to those of MDD.11 Additionally, patients with StD have an increased risk of comorbid anxiety disorders and other psychiatric disorders.12,13 Studies show that about 15% of individuals with StD, including adolescents, adults, men and women, convert into MDD within a time span of one year after the initial symptoms.9,14–16 It has been recently suggested that a significant proportion of workplace presenteeism due to depression is attributable to StD,17,18 and attention is increasingly being directed toward StD-associated performance impairments.19 However, despite its significant impact on public health and workplace productivity, the biological and physiological predictors of StD remain poorly understood, underscoring an urgent need to identify early markers to prevent progression to MDD and mitigate its medical and societal burden.

MDD diagnosis typically relies on subjective self-reported questionnaires and clinician-led interviews to evaluate mood symptoms;20,21 however, recent research is increasingly focused on developing objective, reliable, and accessible measurement tools to enhance diagnostic precision. For example, meta-analysis results indicate that specific indicators of heart rate variability (HRV) measured using an electrocardiogram are reduced in MDD: high-frequency (HF)-HRV and low-frequency (LF)-HRV in the frequency domain, and the standard deviation of normal-to-normal intervals (SDNN) and the root mean square of successive differences between normal heartbeats (RMSSD) in the time domain.22 Voice analysis has also been extensively studied, and findings such as “the voices of patients with MDD are flat, soft, and exhibit speech delay” align with clinical observations, with reports indicating that speech rate, pitch variability, and the proportion of pauses during speech correlate with the severity of MDD.23,24 Recently, research has progressed in exploring the relationship between MDD and indirect indicators, such as zero-cross rate, Hurst index, and mel-frequency cepstral coefficients (MFCC), in addition to direct indicators, such as speech rate and pitch strength.
These studies show that indicators calculated from the zero-cross rate and Hurst index can distinguish between patients with moderate-to-severe depression and those with mild depression, with an area under the curve (AUC) of 0.70.25 Furthermore, the MFCC can differentiate between patients with depression and healthy individuals, with an AUC of 0.88.26 Findings from other studies on physiological indicators show that higher severity of depressive symptoms is also associated with higher self-reported body temperature during wakefulness and higher peripheral body temperature measured using wearable sensors.27 Leveraging these physiological indicators holds significant promise for predicting depression onset and severity, offering a pathway to earlier and more targeted interventions. However, debates and challenges remain regarding the use of physiological indicators. For example, there is ongoing discussion about whether the LF component of HRV reflects sympathetic or parasympathetic nervous system activity. In addition, results may differ between short-term and long-term measurements, warranting further investigation. With respect to voice features, although depression has often been associated with a monotonous tone, some studies have reported that individuals with mild depression may instead exhibit a more tense or strained voice, leading to inconsistent findings. Elevated body temperature has also been reported in depression, but its specificity remains controversial, as it is strongly influenced by other factors such as stress and insomnia.

In addition to objective physiological markers, numerous studies have investigated neurocognitive markers of MDD through a variety of cognitive tasks. A large-scale UK Biobank study revealed differences in reaction time and performance on the Digit Symbol Substitution Test (DSST) between affected patients and healthy controls.28 A meta-analysis showed that affected patients performed worse than healthy individuals on tasks such as sustained attention, set shifting, and spatial working memory.29 Furthermore, a systematic review indicates that during the acute phase of MDD, severe impairments are observed in processing speed, learning and memory, working memory, attention function, verbal fluency, and executive function, and that some of these deficits are related to the degree of depression severity and persist even during the remission phase.30 In many studies, the differences in facial emotion recognition between patients with MDD and healthy individuals have been examined as a neurocognitive indicator underlying the negative bias observed in MDD. There is a long history of research in this field, with Ekman’s group beginning studies on facial emotion recognition biases in MDD as early as the 1960s.31 Various tasks were used in these studies; nevertheless, two broad categories have been identified: emotional recognition tasks (ERT), which require participants to identify basic emotions such as anger, disgust, fear, happiness, sadness, and surprise, and emotional bias tasks (EBT), which require participants to determine whether a presented expression is happy or sad.
A meta-analysis of 23 studies comparing those diagnosed with MDD and healthy individuals with no history of psychiatric consultation revealed that those with severe MDD had lower accuracy in recognizing happy expressions than did patients with mild MDD and healthy individuals.32 Furthermore, a systematic review investigating vulnerabilities in emotion recognition among adolescents with MDD identified tendencies such as heightened sensitivity to sadness, underestimation of happiness, and overrecognition of anger.33 These neurocognitive markers hold significant potential for identifying early cognitive vulnerabilities in MDD, paving the way for targeted preventive strategies. On the other hand, debates remain regarding the use of neurocognitive markers. Although impairments in attention and executive function are frequently reported, there is ongoing discussion as to whether these reflect trait-like or state-dependent deficits, and the potential influence of medication should also be taken into account. Regarding emotion recognition, findings indicating reduced recognition of joy appear relatively consistent, whereas results for other facial expressions vary substantially across studies, likely due to differences in the stimuli and tasks employed.

StD is influenced by genetic factors, psychological stress, social factors, and physical health status, and its course varies from spontaneous improvement over time to long-term persistence or progression to MDD.5,9,34–36 However, distinguishing these outcomes based solely on depressive symptoms listed in diagnostic criteria remains challenging due to their subjective nature. Thus, despite the growing recognition of StD as a precursor of MDD, little is known about whether multiple objective physiological, voice, and cognitive markers used to distinguish MDD from normal health can also identify StD within the same sample of individuals.6,8 Furthermore, multimodal prospective studies using objective physiological and cognitive markers for MDD, particularly those targeting young individuals, remain limited.

Building on the urgent need to address the limitations of subjective diagnostic tools and the scarcity of research on StD, this study aims to identify objective physiological and cognitive markers that characterize StD and predict its progression to MDD. Although extensive research has elucidated physiological and neurocognitive markers of MDD, such as heart rate variability, voice analysis, and facial emotion recognition deficits, studies exploring these markers in StD remain limited despite its role as a potential precursor to MDD. Given that StD represents the minimal to mild side of the depressive continuum with potential to progress into diagnosable MDD, we proposed two hypotheses. First, early disruptions in interoceptive and cognitive regulation—indexed by reduced heart rate variability, altered vocal prosody, and impaired cognitive processing—would characterize StD and precede the emergence of MDD. Specifically, we predicted that subtle but significant deviations in these physiological, vocal, and cognitive markers would serve as early indicators of risk, distinguishing individuals with StD from healthy controls and predicting later progression to MDD. Second, based on the heterogeneity of StD—namely, differing outcomes between groups with transient versus persistent symptoms—we hypothesized that distinct objective indicators would differentiate these subtypes and predict their trajectories. Prior work among university students identified groups within the StD high-symptom cohort whose symptoms either persisted/worsened or improved.37 Moreover, longitudinal studies in adults have shown that approximately 12% of StD cases transition to MDD,6 and that persistent depressive symptoms are associated with cognitive decline and adverse brain outcomes in both older adults38 and younger cohorts,39 underscoring the consistent clinical significance of symptom persistence across generations. 
Based on these hypotheses, we examined whether markers characteristic of MDD can also be identified in StD, and further classified participants according to symptom course to determine objective indicators specific to each subtype. By comparing these markers with those of healthy controls and assessing the incidence of MDD onset across groups, this study seeks to provide novel insights into StD’s biological and cognitive underpinnings, paving the way for targeted early interventions to prevent MDD progression.

Materials and Methods

Participants and Procedure

Students enrolled at Hiroshima University in April 2021 in their first to third year of study were included. The inclusion criteria were as follows.

1) Age at university enrollment between 18 and 24 years.

2) Beck Depression Inventory-II (BDI-II) score at enrollment was ≥18 or <10.

The exclusion criteria were as follows:

1) In a mental state that makes it difficult to understand the purpose of the study.

2) Having a severe physical illness.

3) Diagnosis of MDD based on the Diagnostic and Statistical Manual of Mental Disorders (DSM-IV) in the past year.

4) A history of manic or hypomanic episodes according to the DSM-IV.

5) Current use of psychotropic medications or undergoing psychotherapy.

The participant recruitment process is illustrated in Figure 1. Hiroshima University conducts health checkups for all new students, during which a mental health survey using the BDI-II is administered. Two groups were created based on BDI-II scores at enrollment (StD group: BDI-II score of ≥18; healthy control group: BDI-II score of <10). The cutoff value for the StD group was set based on a previous study,37 in which the high-risk group (those with a BDI-II score of 18 or higher) had a higher incidence of MDD during the 1-year follow-up period. Next, study participants were randomly selected from each group (StD group: 105 participants; control group: 101 participants). The selected individuals were contacted using the email function of the Hiroshima University Student Information System, and an information session was conducted with the students who responded. Among those who provided informed consent, the semi-structured Mini-International Neuropsychiatric Interview (M.I.N.I.) was administered (StD group: 98 participants; control group: 87 participants). Trained interviewers conducted the interviews. The period between the BDI-II survey conducted during the freshman health checkup and the semi-structured interview varied from approximately two months to two years, depending on the participants’ grade level at the initiation of the study in 2021. The semi-structured interview was conducted to confirm the exclusion criteria. Among these criteria, “severe physical illness” referred to conditions that could potentially interfere with the research procedures, such as difficulty in visiting the laboratory, inability to complete questionnaires independently, or difficulty in wearing or operating the wearable device.
These conditions were also set as exclusion criteria to avoid confounding, as they could directly affect depressive symptoms or physiological indicators; however, no participants in this study met these criteria. Ten participants in the StD group were excluded because they met the diagnostic criteria for MDD. The remaining participants completed the BDI-II at baseline to confirm their current depressive level and to exclude those who exhibited depression at baseline from among participants initially assigned to the healthy control group based on their BDI-II scores at enrollment. Participants who scored ≥10 in the control group were excluded from the study, resulting in the exclusion of four additional control participants. The evaluation was discontinued for three participants who were unable to complete the assessment because of poor health. Ultimately, 87 and 81 participants were included in the StD and control groups, respectively. The StD group was further divided into a “low depression subgroup” (BDI-II <10 at baseline) and a “maintained depression subgroup” (BDI-II ≥10 at baseline), based on their BDI-II scores obtained at baseline. The low depression subgroup consisted of individuals who had depressive symptoms at enrollment but had improved to a healthy level by study onset, whereas the maintained depression subgroup consisted of those whose depressive symptoms did not improve to a healthy level and remained persistent. As a result, 21 participants were included in the low depression subgroup and 66 in the maintained depression subgroup.
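The allocation rules described above can be summarized in a short sketch. It is illustrative only: the function name, argument names, and group labels are hypothetical and were not part of the study procedure. Only the BDI-II thresholds and the M.I.N.I.-based exclusion are taken from the text, and the other exclusion criteria are omitted for brevity.

```python
from typing import Optional

STD_CUTOFF = 18      # BDI-II >= 18 at enrollment: StD group
HEALTHY_CUTOFF = 10  # BDI-II < 10: healthy range


def assign_group(bdi_enrollment: int, bdi_baseline: int,
                 meets_mdd_criteria: bool) -> Optional[str]:
    """Return a (hypothetical) group label, or None if the participant is excluded."""
    if meets_mdd_criteria:
        return None  # met MDD criteria on the M.I.N.I.: excluded
    if bdi_enrollment >= STD_CUTOFF:
        # StD group, split by the baseline BDI-II score
        if bdi_baseline < HEALTHY_CUTOFF:
            return "StD-low"
        return "StD-maintained"
    if bdi_enrollment < HEALTHY_CUTOFF:
        if bdi_baseline >= HEALTHY_CUTOFF:
            return None  # control scoring >= 10 at baseline: excluded
        return "control"
    return None  # enrollment score of 10 to 17: not eligible for either group


print(assign_group(20, 5, False))   # StD at enrollment, improved by baseline
print(assign_group(20, 12, False))  # StD at enrollment, symptoms maintained
```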

Figure 1 Participant flow.

Abbreviations: BDI-II, Beck Depression Inventory-II; MDD, major depressive disorder; StD, subthreshold depression; HC, healthy control.

Measurements were conducted over 3 consecutive days (T0, T1, and T2). The first day was designated as T0, and each participant underwent measurements in the laboratory at T0 and T2. At T1, participants were recorded and completed tasks at home while following their usual routine. At T0, wristband-type activity monitors (Silmee™ W22; TDK Inc., Japan) were distributed, and participants were instructed to wear them. Additionally, instructions on the 3-day testing procedures were provided, and some components of the voice measurement test, pulse measurement, emotional cognitive tasks, and cognitive function tests were conducted. At T1, participants continued their usual daily activities while wearing the wristband-type activity monitor. Measurements using this monitor continued until noon at T2. At T2, participants returned to the laboratory for additional components of the cognitive function tests and completed a questionnaire via a web form. The laboratory was established on the Hiroshima University campus, maintained at a comfortable room temperature, and provided an environment free from noise, vibration, and strong light.

Participants were followed up with the BDI-II every 3 months after the study ended to track changes in depressive symptoms. The follow-up period continued until the participants graduated from the university or until the end of March 2024. If worsening depressive symptoms were observed, the M.I.N.I. was administered, allowing us to identify the participants who developed MDD during the follow-up period.

Ethics Statement

Written informed consent was obtained from all the participants regarding their participation in the study and the collection of anonymous data. This study was conducted according to the principles of the Declaration of Helsinki, and all the research procedures were reviewed and approved by the Hiroshima University Ethics Committee for Epidemiological and Clinical Research (Approval Number: E2018-1513).

Data Collection

Voice Measurement

Stress is known to influence emotions, and an increase in emotions such as sadness can exacerbate depressive symptoms.40 Research has been conducted using sensitivity technology to analyze emotions in a speaker’s voice, evaluating mood disorders based on emotional components.41 The following four indicators, developed in previous studies, were also used in this study. Vivacity was calculated from the joy and sadness components of emotions in speech, relaxation was calculated from the calmness and excitement components, and vitality was calculated from vivacity and relaxation. The calculation methods have been previously described.41 Additionally, Shinohara et al focused on the arousal level of speech in previous studies and hypothesized that the voices of patients with MDD may have lower arousal levels. They developed the following two indicators related to speech arousal: the emotional arousal level voice index (EALVI; calculation method described in a previous study42), which uses the acceleration of sound pressure change, and the arousal level voice index (ALVI; calculation method described in a previous study25), which focuses on the relationship between the Hurst index and the zero-crossing rate of voice waveforms. We also used the number of utterances (utterances) and the time required for each utterance (duration). Participants sat in a relaxed posture on a chair while holding, in either hand, an iPhone SE (Apple Inc., California, USA) equipped with the MIMOSYS smartphone app (PST Co., Ltd., Yokohama, Japan). A lavalier microphone (iRig Mic Lav, IK Multimedia Production Srl, Modena, Italy) was attached to the collar approximately 10 cm below the chin. Participants were instructed to read aloud the standard Japanese sentences displayed on the app. After reading each phrase, they pressed the button on the app to proceed to the next phrase, repeating this procedure until a total of 13 phrases had been recorded.
The voice recordings were obtained using MIMOSYS and analyzed within the same system. The Japanese phrases used were selected for their familiarity, ease of pronunciation without emotion, ability to check consonants, and relevance to symptoms in the DSM-5, and were the same set used in previous studies on MIMOSYS.43

HRV Measurement (2 Min)

Resting pulse waves were measured for 2 min using a fatigue stress meter (MF100; Murata Manufacturing Co., Ltd., Kyoto, Japan) and an app (Fatigue Monitor; Fatigue Science Laboratory Inc., Osaka, Japan) while participants sat comfortably with their eyes closed. Heart rate can be measured electrically, as in 12-lead electrocardiography, or optically, by photoplethysmography (PPG), which is implemented in wearable devices such as wristwatches. Reportedly, PPG can measure heart rate variability with an accuracy comparable to that of electrocardiography.44 Participants sat comfortably on a chair and rested for five minutes before holding the MF100 on their knees. The MF100 is a hybrid measurement device that electrically measures heart rate via sensors in contact with the palmar side of the left thumb and the first joint of the right thumb, and optically measures pulse rate via a sensor in contact with the palmar side of the right thumb. Participants placed their fingers so that the skin lightly touched all three sensors. The tester launched the Fatigue Monitor app on a tablet (iPad; Apple Inc., California, USA) and confirmed that the pulse waveform was displayed correctly. Participants were then instructed to close their eyes and remain relaxed while their resting pulse wave was recorded for two minutes. The following outputs were obtained: mean heart rate (mean HR); low-frequency (LF)-HRV, the LF component of HRV, which reflects both sympathetic and parasympathetic nerve activity; high-frequency (HF)-HRV, the HF component of HRV, which reflects parasympathetic nerve activity; and the balance between sympathetic and parasympathetic nerves (LF/HF).

For details on heart rate variability, refer to the 1996 report of the Task Force of the European Society of Cardiology and the North American Society of Pacing and Electrophysiology.45

Emotional Cognitive Tasks

Two types of emotional cognitive tasks were performed using an application running on a Windows tablet personal computer (Surface Pro 7; Microsoft Corporation, Redmond, Washington, USA). Participants sat in chairs facing desks and performed the tasks by touching a tablet placed upright on the desk with their dominant index finger. The tasks were designed based on previous studies. However, to align with the participants’ living environment, Japanese facial images were obtained from the “ATR Face Expression Database” (https://www.atr-p.com/products/face-db.html) provided by Advanced Telecommunications Research Institute International (ATR)-Promotions Inc. The models used were adult males and females. The contrast and brightness of the facial images were adjusted using Adobe Photoshop CC (Adobe Inc., California, USA), and morphing was performed using the FantaMorph software package (Abrosoft, Beijing, China).

Emotion Recognition Task

The ERT is a task in which participants are presented with six basic emotional expressions (happiness, sadness, anger, disgust, fear, and surprise46) at eight different intensity levels and are asked to select the emotion to which each expression corresponds. Following previous research,47 we created “ambiguous expressions” by synthesizing the six basic expressions in equal proportions. Subsequently, we morphed these ambiguous expressions with each emotional expression at 100% intensity, resulting in 96 expression stimulus images comprising six expressions and eight intensity levels. In each trial, a fixation point was presented on the monitor for 2000 ms, followed immediately by one of the emotion stimulus images displayed for 300 ms and then by a noise screen presented for 300 ms as a visual mask. After this presentation set, six emotion labels were presented as options on the screen, and participants touched the label corresponding to the emotion of the stimulus image presented immediately beforehand. Because the stimulus images were randomly presented twice for each model, the total number of trials was 192. The accuracy rate was calculated as the unbiased hit rate (Hu)48 and used as the outcome measure.

Emotional Bias Task

The EBT involved showing participants images of faces morphed between two emotions in varying proportions and asking them to select which of the two emotions the expression corresponded to. Following previous research,49 we morphed two emotional expressions (happiness and sadness) for each model, creating seven stages of facial images with varying proportions of sadness and happiness. In each trial, a fixation point was presented on the monitor for 2000 ms, followed immediately by a stimulus image at one of the seven stages for 500 ms, and then a noise screen for 500 ms to mask the visual stimulus. Following this presentation set, two emotion labels were presented on the screen as options, and the participants responded by touching the label that matched the emotion of the stimulus image presented immediately beforehand. The most ambiguous expression (50% sadness, 50% happiness) and the expressions immediately before and after it (“40% sadness, 60% happiness” and “60% sadness, 40% happiness”) were each presented eight times, while the expressions “30% sadness, 70% happiness” and “70% sadness, 30% happiness” were each presented six times. The most distinct expressions for each emotion (“sadness 100%” and “happiness 100%”) were presented three times each, resulting in a total of 84 trials. The point of subjective equality (PSE) was calculated and used as the outcome measure.
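The text does not state how the PSE was estimated. One simple possibility, shown here purely as an assumption, is to take the proportion of “sad” responses at each morph level and linearly interpolate the sadness percentage at which that proportion crosses 50%. Fitting a logistic psychometric function is a common alternative. The function name and the response proportions below are hypothetical and are not study data.

```python
def pse_by_interpolation(levels, p_sad):
    """Sadness percentage at which P('sad' response) crosses 0.5.

    levels: sadness percentages in ascending order.
    p_sad:  proportion of 'sad' responses at each level.
    Returns None if the curve never reaches 0.5.
    """
    points = list(zip(levels, p_sad))
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if y0 == 0.5:
            return float(x0)
        if (y0 - 0.5) * (y1 - 0.5) < 0:  # 0.5 lies strictly between y0 and y1
            return x0 + (0.5 - y0) * (x1 - x0) / (y1 - y0)
    if points and points[-1][1] == 0.5:
        return float(points[-1][0])
    return None


# The seven morph stages described in the text; proportions are invented
levels = [0, 30, 40, 50, 60, 70, 100]
p_sad = [0.0, 0.1, 0.25, 0.4, 0.7, 0.9, 1.0]
print(pse_by_interpolation(levels, p_sad))
```

Under this convention, a PSE below 50% would indicate that a face is labeled sad before it physically contains more sadness than happiness.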

Neuropsychological Cognitive Function Tests

The results of a systematic review and meta-analysis of cognitive dysfunction in MDD showed that the patients exhibit significant performance deficits in executive function, memory, and attention compared with those of healthy controls.29 Regarding motor function, reports indicate functional impairments with small to moderate effect sizes.50 Based on these reports, we selected and conducted several neuropsychological cognitive tasks associated with MDD from two existing applications equipped with cognitive function test batteries.

1. Cambridge Neuropsychological Test Automated Battery (CANTAB, Cambridge Cognition, Cambridge, UK).51,52

CANTAB is a test set tool validated by over 30 years of neuroscience research and has been used in over 2500 papers. We assessed reaction time (RTI), rapid visual processing (RVP), intra/extradimensional shift task (IED), and spatial working memory (SWM) using a tablet (iPad, Apple Inc., California, USA) with CANTAB installed. Participants sat in chairs facing desks and performed the task by touching a tablet placed upright on the desk with the index finger of their dominant hand.

The RTI is used to evaluate psychomotor speed by measuring the speed at which participants react to targets (reaction speed) and how they move their fingers to touch the targets (motor speed). Five circles were displayed on the screen, and the participants touched those that turned yellow as quickly as possible. The median, mean, and standard deviation of the motor and reaction speeds were analyzed for the 5-choice RTI.

The RVP test requires participants to detect a sequence of target numbers from a series of numbers that appear randomly at a rate of 100 digits per min; it evaluates sustained attention, signal detection, and impulsivity. The medians, means, and standard deviations were analyzed. A′ is a signal detection indicator that reflects sensitivity to targets, independent of response tendency (expected range 0.000–1.000; poor–good).
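The text does not give a formula for A′. A widely used nonparametric definition computes it from the hit rate H and the false-alarm rate F, and the sketch below assumes that definition. Whether CANTAB computes its A′ output in exactly this way is not confirmed here, and the function name is hypothetical.

```python
def a_prime(hit_rate, false_alarm_rate):
    """Nonparametric sensitivity index A' from hit and false-alarm rates (0 to 1)."""
    h, f = hit_rate, false_alarm_rate
    if h == f:
        return 0.5  # chance-level discrimination
    if h > f:
        return 0.5 + ((h - f) * (1 + h - f)) / (4 * h * (1 - f))
    # Below-chance performance: mirror of the formula above
    return 0.5 - ((f - h) * (1 + f - h)) / (4 * f * (1 - h))


print(a_prime(0.8, 0.2))  # good but imperfect target detection
print(a_prime(1.0, 0.0))  # perfect detection
```

Values near 1 indicate that targets are reliably distinguished from non-targets, whereas 0.5 corresponds to chance.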

Similar to the Wisconsin Card Sorting Test, the IED is used to evaluate cognitive flexibility by assessing rule learning and reversal. In the test, stimuli consisting of filled-in shapes and white lines were displayed on the screen, and the participants used feedback to determine which stimulus was correct and devise the rules. After the six correct answers were received, the rules were changed. The adjusted total number of errors and the number of completed stages were analyzed.

SWM is a self-ordered search task in which tokens hidden in one of several boxes are sought. It is a test that is used to evaluate working memory and strategy use. Several boxes were presented on a screen in this test. The participant touched each box individually to find the tokens hidden inside it. As the token is never hidden in the same box twice in a single trial, the participant must repeatedly search while avoiding the location of the previous token. The number of boxes gradually increased to 12. Between errors, that is, the number of times the participant revisited a box where the token had previously been found, was analyzed.

2. THINC-integrated tool (THINC-it; H. Lundbeck A/S, Denmark).

THINC-it is a test set tool that can detect cognitive impairment in patients with MDD with high reliability. In this study, the Choice Reaction Time task (CRT), n-back task, DSST, and Trail Making Test were administered using a Windows tablet PC (Surface Pro 7; Microsoft Corporation, Redmond, Washington, USA) with THINC-it installed. Participants sat in chairs facing desks and performed the task by touching a tablet placed upright on the desk with the index finger of their dominant hand.

In the CRT, a series of arrows pointing to the left or right appear on the screen. Participants were instructed to select the direction of the arrow as quickly as possible. This test is used to evaluate attention and executive function by analyzing the number of correct responses within the time limit and the average response time.

In the n-back task, a series of symbols moves horizontally across the screen, and the first symbol disappears. The participants were required to recall and respond to missing symbols as quickly and accurately as possible. This test is used to evaluate working memory, executive function, attention, and concentration, with the average number of correct responses and response time within the time limit analyzed.

In the DSST, six symbols and their corresponding numbers are displayed at the bottom of the screen, and the participant is required to match the correct symbol to the number shown in the center of the screen as quickly as possible. This test is used to evaluate executive function, processing speed, attention, and concentration. The number of correct answers, errors, and average response time within the time limit were analyzed.

In the Trail Making Test, letters and numbers are scattered across the screen, and the participant is required to draw lines connecting the letter “a” to the number “1”, then the letter “i” to the number “2”, and so on, in order. The participants were required to connect all points as quickly as possible, ensuring that the letters and numbers were alternated. This test is used to evaluate executive function and analyze the time taken to complete a task.

Questionnaires

Questionnaires included the Japanese version of the BDI-II, validated by Kojima et al53 and the Presenteeism Scale for Students (PSS54). The BDI-II is the most widely used self-report inventory for measuring the severity of MDD. Higher scores (0–63) indicate a greater depression severity. The PSS is used to assess presenteeism among students. It calculates the Work Impairment Score (WIS) based on responses to the question “To what extent have health issues affected your usual academic (work) performance?”

Monitoring Physiological Signals with Wearable Devices

During the experiment, participants wore a commercially available wristband-type activity meter (SilmeeW22, TDK Inc., Japan) on their non-dominant wrist to measure heart rate variability and peripheral body temperature. The sensor was affixed to ensure contact with the skin on the dorsal side of the wrist. Participants wore the device at all times, including during sleep, and removed it only when it would be exposed to water. The measurement period used for the analysis was from 9:00 PM at T0 to noon at T2.

Heart rate variability was measured by detecting the peak intervals of the pulse waves using a PPG sensor built into the device. In the time domain, the SDNN and the RMSSD were analyzed. In the frequency domain, HF, LF, and Very Low Frequency (VLF; providing information on sympathetic nervous system load) were analyzed. Periods during which the device was not in contact with the skin due to belt loosening or other causes were output as a heart rate of 0 and were removed as measurement failures. In addition, periods with a recorded heart rate below 40 beats per minute were excluded as measurement errors. These periods were regarded as non-wear time, and participants whose non-wear time exceeded 20% of the total measurement duration were excluded from the analysis. Missing data were not imputed; all missing values were handled by listwise deletion. The analysis was performed using the Kubios HRV Premium software (Kubios, Kuopio, Finland).
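The exclusion rules and time-domain indices described above can be illustrated with a minimal sketch. It is not the Kubios implementation used in the study; the function names, the sample-based definition of non-wear time, and the example data are assumptions introduced here for illustration only.

```python
import math


def non_wear_fraction(heart_rates, min_valid_bpm=40):
    """Fraction of samples treated as non-wear time.

    A heart rate of 0 (no skin contact) or below 40 bpm (measurement
    error) counts as invalid, following the rules in the text.
    """
    if not heart_rates:
        raise ValueError("no heart rate samples")
    invalid = sum(1 for hr in heart_rates if hr < min_valid_bpm)
    return invalid / len(heart_rates)


def include_participant(heart_rates, max_non_wear=0.20):
    """Keep a participant unless non-wear time exceeds 20% of the recording."""
    return non_wear_fraction(heart_rates) <= max_non_wear


def sdnn(intervals_ms):
    """Sample standard deviation of inter-beat intervals (ms)."""
    n = len(intervals_ms)
    mean = sum(intervals_ms) / n
    return math.sqrt(sum((x - mean) ** 2 for x in intervals_ms) / (n - 1))


def rmssd(intervals_ms):
    """Root mean square of successive inter-beat interval differences (ms)."""
    diffs = [b - a for a, b in zip(intervals_ms, intervals_ms[1:])]
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))


# Hypothetical data, not taken from the study.
example_hr = [0, 35, 60, 62, 58, 61, 63, 59, 60, 62]
example_ibi = [800, 810, 800, 810]
print(include_participant(example_hr), sdnn(example_ibi), rmssd(example_ibi))
```

In this sketch the 20% threshold is applied to the proportion of invalid samples, which matches the text only if samples are evenly spaced in time.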

Peripheral body temperature was determined by measuring the wrist temperature every minute. To exclude extreme outliers, measurements in the upper and lower 5% of the distribution were removed. One-minute periods during which heart rate was not recorded were considered non-wear time, and participants whose non-wear time exceeded 20% of the total measurement duration were excluded from the analysis. Missing data were not imputed; all missing values were handled by listwise deletion.
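As a rough illustration of the outlier rule above, the following sketch removes the lowest and highest 5% of per-minute wrist temperature readings by rank before averaging. The rounding of the cut-off count and the example values are assumptions; the source does not state how ties or fractional counts were handled.

```python
from statistics import fmean


def trim_extremes(values, fraction=0.05):
    """Drop the lowest and highest `fraction` of values by rank.

    Returns the remaining values in ascending order. The number of
    values removed from each tail is rounded down (an assumption).
    """
    if not 0 <= fraction < 0.5:
        raise ValueError("fraction must be in [0, 0.5)")
    ordered = sorted(values)
    k = int(len(ordered) * fraction)
    return ordered[k:len(ordered) - k]


# Hypothetical readings in degrees Celsius, not taken from the study.
readings = [25.0, 31.2, 31.4, 31.5, 31.6, 31.8, 32.0, 32.1, 32.2, 32.3,
            32.4, 32.5, 32.6, 32.7, 32.8, 32.9, 33.0, 33.1, 33.2, 39.0]
trimmed = trim_extremes(readings)
print(len(trimmed), round(fmean(trimmed), 2))
```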

Statistical Analysis

For demographic and clinical characteristics, age and sex distributions were assessed using chi-squared tests. BDI-II scores were compared using a one-way analysis of variance (ANOVA). For variables other than those analyzed in the ERT, multivariate ANOVA (MANOVA) was performed with sex as a covariate, comparing the StD group and healthy controls.

For the ERT, following previous studies, we conducted a two-way ANOVA with emotion (anger, disgust, fear, happiness, sadness, and surprise) and intensity as factors to compare the accuracy of emotion recognition between groups and intensity levels. We compared the overall StD group with the healthy controls, as well as each StD subgroup with the healthy controls. Following findings from previous studies, facial expression intensity was analyzed at six levels (intensities 3–8), excluding low-intensity stimulus images.47
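ERT accuracy is reported in the Results as the unbiased hit rate (HU). Assuming the standard definition by Wagner, HU for an emotion is the squared number of correct identifications divided by the product of the number of stimuli of that emotion and the number of times that response was chosen. The sketch below shows the calculation; the function name and example counts are hypothetical.

```python
def unbiased_hit_rate(hits, n_stimuli, n_responses):
    """Unbiased hit rate for one emotion category (assumed Wagner definition).

    hits:        stimuli of this emotion that were correctly labeled
    n_stimuli:   stimuli of this emotion that were presented
    n_responses: times this emotion label was chosen, across all stimuli
    """
    if n_stimuli <= 0:
        raise ValueError("n_stimuli must be positive")
    if hits > n_stimuli or hits > n_responses:
        raise ValueError("hits cannot exceed stimuli or responses")
    if n_responses == 0:
        return 0.0  # the label was never used, so no correct identifications
    return (hits * hits) / (n_stimuli * n_responses)


# Hypothetical counts: 8 of 10 happy faces labeled "happy",
# with the "happy" label used 16 times in total.
print(unbiased_hit_rate(8, 10, 16))
```

Unlike the raw hit rate (8/10 here), this measure is penalized when a label is overused, which is why it is preferred for comparing recognition accuracy across emotions.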

The above indicators were initially compared between the StD (n = 87) and healthy control (n = 81) groups. Next, focusing on the heterogeneity of StD, the group was divided into two subgroups based on BDI-II scores at the initiation of the study: “BDI-II < 10; low depression group (n = 21)” and “BDI-II ≥ 10; maintained depression group (n = 66)”. For indicators showing significant differences between the StD group and the healthy control group, comparisons between the low depression subgroup and the healthy control group, and between the maintained depression subgroup and the healthy control group, were evaluated using MANOVA/ANOVA between two groups with sex as a covariate. For pairwise group comparisons, multiple comparison adjustments were performed using the Bonferroni method implemented in SPSS.
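The subgroup rule and the Bonferroni adjustment can be sketched as follows. This is an illustrative reconstruction, not the SPSS procedure itself: the function names are hypothetical, and the adjustment shown is the common form in which each p-value is multiplied by the number of comparisons and capped at 1.

```python
def std_subgroup(bdi_ii_score, cutoff=10):
    """Assign an StD participant to a subgroup from the baseline BDI-II score."""
    if not 0 <= bdi_ii_score <= 63:
        raise ValueError("BDI-II total scores range from 0 to 63")
    if bdi_ii_score < cutoff:
        return "low depression"
    return "maintained depression"


def bonferroni_adjust(p_values):
    """Multiply each p-value by the number of comparisons, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Hypothetical inputs, not taken from the study.
print(std_subgroup(9), std_subgroup(10))
print(bonferroni_adjust([0.01, 0.04, 0.5]))
```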

We also performed a survival analysis (Kaplan–Meier method) to compare the proportion of participants who developed MDD during the follow-up period. We compared the results between the subgroups and the healthy control group.
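For readers unfamiliar with the Kaplan–Meier method, the sketch below computes the product-limit survival estimate from follow-up times and event indicators. It covers only the estimator, not the between-group test, which the source does not specify. The function name and the example data are hypothetical.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier survival estimate at each distinct event time.

    times:  follow-up duration for each participant
    events: True if MDD onset was observed, False if censored
    Returns a list of (time, survival_probability) pairs.
    """
    if len(times) != len(events):
        raise ValueError("times and events must have the same length")
    survival = 1.0
    curve = []
    event_times = sorted({t for t, e in zip(times, events) if e})
    for t in event_times:
        # Participants still under observation just before time t.
        at_risk = sum(1 for ti in times if ti >= t)
        # Onsets observed exactly at time t.
        onsets = sum(1 for ti, e in zip(times, events) if e and ti == t)
        survival *= 1 - onsets / at_risk
        curve.append((t, survival))
    return curve


# Hypothetical follow-up data in months, not taken from the study.
months = [2, 3, 3, 5, 8]
onset = [True, False, True, True, False]
for t, s in kaplan_meier(months, onset):
    print(t, round(s, 3))
```

A group with no observed onsets yields an empty curve, meaning the estimated survival stays at 1 throughout follow-up.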

All statistical analyses were performed using the IBM SPSS Statistics version 27 for Windows (SPSS Japan Inc., Tokyo, Japan).

The primary analyses in this study were conducted as two-group comparisons between each StD subtype and the healthy control group to test the hypothesis that objective indicators specific to each subtype can be identified. In addition, supplementary ANOVA/MANOVA analyses were performed among the three groups (healthy control, low depression, and maintained depression), with detailed results provided in Supplementary Tables S3–S5.

Results

Overall Summary

In this study, the StD group showed significant differences compared with the healthy control group in physiological indicators (increased vocal ALVI, decreased LF ratio in continuous heart rate variability, and elevated peripheral body temperature), neurocognitive indicators (greater standard deviation of RTI movement time, lower RVP A’, fewer DSST correct responses, and longer mean reaction time), and emotional–cognitive indicators (reduced accuracy in recognizing happiness and surprise facial expressions). Furthermore, the PSS-WIS, an indicator of presenteeism, was significantly higher in the StD group.

In the subgroup analysis, the maintained depression group showed increased ALVI, decreased LF ratio, and elevated peripheral temperature among physiological indicators; reduced DSST correct responses and prolonged reaction time among neurocognitive indicators; and reduced accuracy in recognizing happiness and surprise expressions among emotion-recognition indicators. In the low depression group, physiological indicators showed elevated peripheral temperature; neurocognitive indicators showed increased standard deviation of RTI movement time and decreased RVP A’; and emotional–cognitive indicators showed decreased accuracy in recognizing high-intensity happiness expressions. PSS scores were elevated in both subgroups. At follow-up, the incidence of MDD was significantly higher in the maintained depression group.

Comparison Between the StD Group and the Healthy Control Group

Table 1 shows the demographic and clinical characteristics of participants in the StD (n = 87) and control (n = 81) groups. Significant differences were observed in the following indicators when both groups were compared: ALVI in Voice Measurement, SD of movement time in CANTAB’s RTI, A’ in RVP, correct number and mean response time in the DSST of THINC-it, WIS in the PSS, LF (%) in continuous heart rate variability measurements, peripheral body temperature, and unbiased hit rate (Hu) for happy and surprised expressions in the ERT (Tables 1 and 2, Figure 2).

Table 1 Participants’ characteristics of healthy control group and subthreshold depression group

Table 2 Physiological characteristics of healthy control group and subthreshold depression group

Figure 2 Unbiased hit rate for each emotion at each intensity level for each group.

Abbreviations: HU, unbiased hit rate; StD, subthreshold depression; HC, healthy control.

Notes: Error bars represent standard error.

ALVI was significantly higher in the StD group than in the healthy control group [F (1, 165) = 7.45, p = 0.007]. The SD of the movement time in the RTI was significantly higher in the StD group than in the healthy control group [F (1, 165) = 5.32, p = 0.022]. A’ in RVP was significantly lower in the StD group than in the control group [F (1, 165) = 5.84, p = 0.017]. The number of correct responses on the DSST was significantly lower in the StD group than in the control group [F (1, 165) = 4.36, p = 0.038], and the mean response time was significantly longer in the StD group than in the control group [F (1, 165) = 5.58, p = 0.019]. The WIS of the PSS was significantly higher in the StD group than in the control group [F (1, 165) = 46.33, p < 0.01] (Table 1).

The LF (%) in the continuous heart rate variability measurements was significantly lower in the StD group than in the healthy control group [F (1, 137) = 5.73, p = 0.018]. The peripheral body temperature at the wrist was significantly higher in the StD group than in the control group [F (1, 137) = 50.92, p < 0.001] (Table 2).

The results of the ERT are shown in a graph of HU values for expression intensities of 3–8 for each emotion (Figure 2). Regarding the main effects by group, no significant differences were found for anger [F(1, 165) = 0.70, p = 0.403], disgust [F(1, 165) = 0.76, p = 0.384], fear [F(1, 165) = 0.47, p = 0.493] and sadness [F(1, 165) = 1.65, p = 0.200]. A main group effect was observed for happiness [F(1, 165) = 4.77, p = 0.030] and surprise [F(1, 165) = 4.40, p = 0.038], with the StD group showing lower accuracy in both cases. No interaction effects between group and intensity were observed for any emotion. Detailed ANOVA results for the ERT comparing the StD and healthy control groups are provided in Supplementary Table S1.

Comparison Between StD Subgroups and the Control Group

The StD group was divided into two subgroups: the “low depression subgroup (BDI-II < 10; n=21)” and the “maintained depression subgroup (BDI-II ≥ 10; n=66)”. The demographic and clinical characteristics of the participants are presented in Table 3. Indicators showing significant differences between the StD (n = 87) and healthy control (n = 81) groups were compared between each subgroup and the healthy control group using two-group ANOVA (Tables 3 and 4, Figure 3).

Table 3 Participants’ characteristics of healthy control group and subgroups of subthreshold depression

Table 4 Physiological characteristics of healthy control group and subgroups of subthreshold depression

Figure 3 Unbiased hit rate for happiness and surprise emotions at each intensity level for subgroups of StD and HC.

Abbreviations: HU, unbiased hit rate; LD, low depression; MD, maintained depression; HC, healthy control.

Notes: Error bars represent standard error.

ALVI was significantly different between the maintained depression and healthy control groups [F (1, 144) = 5.37, p = 0.022]. The SD of the RTI movement time differed significantly between the low depression and healthy control groups [F (1, 99) = 6.07, p = 0.015]. The A’ in RVP significantly differed between the low depression and healthy control groups [F (1, 99) = 7.68, p = 0.007]. The number of correct responses in the DSST [F (1, 144) = 5.91, p = 0.016] and mean response time [F (1, 144) = 7.95, p = 0.005] significantly differed between the maintained depression and healthy control groups. PSS scores also significantly differed between the two subgroups and the healthy control group, with the maintained depression group [F (1, 144) = 35.61, p < 0.001] having a higher score than the low depression group [F (1, 99) = 34.92, p < 0.001] (Table 3).

The LF (%) in continuous heart rate variability measurements showed a significant difference between the maintained depression and control groups [F (1, 120) = 6.27, p = 0.014]. The peripheral body temperature was significantly different from that of the healthy group in both subgroups: the low depression group [F (1, 82) = 19.10, p < 0.001] and the maintained depression group [F (1, 120) = 36.80, p < 0.001] (Table 4).

Regarding the ERT results, a group main effect was observed for happiness and surprise when comparing the StD group (n = 87) and the healthy control group (n = 81). Subgroup comparisons were performed with healthy controls. Happiness expression showed a main effect of group in the maintained depression group [F (1, 144) = 5.28, p = 0.023], with both StD subgroups having lower accuracy than the control group. In the low depression group, happiness expressions at intensities 6, 7, and 8 were recognized with lower accuracy than in the control group: intensity 6 [F (1, 99) = 17.42, p < 0.001], intensity 7 [F (1, 99) = 6.94, p = 0.010], and intensity 8 [F (1, 99) = 5.59, p = 0.020]. Surprised expressions also showed a main effect of group in the maintained depression group, with lower accuracy in this group than in the control group [F (1, 144) = 4.91, p = 0.028] (Figure 3).

Finally, the incidence of MDD in each subgroup was compared to that in the control group. From the end of the measurement period until March 2024, no participants in the healthy control or low depression groups developed MDD, whereas seven participants in the maintained depression group did. Because the number of onset cases was zero in both the healthy control group and the low depression group, a statistical comparison between these two groups could not be performed. Survival analysis comparing the incidence rates of MDD between the healthy control and maintained depression groups revealed a significant difference (p = 0.002) (Figure 4). Detailed ANOVA results for the subgroup comparisons between the healthy control group and each StD subtype are presented in Supplementary Table S2.

Figure 4 Survival analysis comparing the incidence rates of MDD between the healthy control group and the maintained depression group.

Abbreviations: MDD, major depressive disorder; HC, healthy control; MD, maintained depression.

The results of the supplementary three-group ANOVA/MANOVA analyses are presented in Supplementary Tables S3–S5. Although these analyses revealed some differences, such as certain indicators shifting from significance to a trend toward significance, the main conclusion of this study (the primary differences between the healthy control group and the StD subtypes) was largely maintained.

Discussion

In this study, our primary goal was to identify objective physiological and cognitive markers characterizing StD and to explore their potential in predicting its progression into MDD. This approach addresses the critical gap in early identification of individuals at risk of developing MDD, highlighted by the limitations of subjective diagnostic tools. We hypothesized that markers previously associated with MDD—such as abnormalities in HRV, voice features, and neurocognitive performance—would also be present in StD, reflecting a continuum between these conditions, and that specific markers could differentiate StD subtypes based on symptom persistence. Our findings broadly support this hypothesis, revealing significant differences between the StD and healthy control groups across multiple objective indicators, including voice analysis (ALVI), cognitive task performance (RTI, RVP, DSST), heart rate variability (LF%), peripheral body temperature, and emotional recognition accuracy (ERT for happiness and surprise). Furthermore, subgroup analyses demonstrated distinct marker profiles for transient (low-depression) and persistent (maintained-depression) StD, with the latter subgroup showing a significantly higher incidence of MDD onset. These results underscore the potential of these markers for early risk stratification and intervention.

In individuals with StD, sustained attention—as indicated by lower A’ scores in the Rapid Visual Information Processing (RVP) task—was impaired, mirroring deficits observed in MDD and supporting the notion of a shared cognitive continuum. Similarly, performance on the Digit Symbol Substitution Test (DSST) revealed fewer correct responses and longer average response times in the StD group compared to healthy controls, indicating deficits in executive functions, including motor speed, attention, planning, strategizing (as involved in learning symbol-number pairs), and working memory (required to maintain task rules). These findings suggest that specific cognitive impairments characteristic of MDD are also evident in StD, underscoring their potential as early markers of risk.55 In contrast, tasks assessing motor and reaction times (Reaction Time Task, RTI), set-shifting, spatial working memory, N-back, and the Trail Making Test (TMT) showed no abnormalities in StD, despite reported deficits in MDD.55 This highlights that certain cognitive functions remain preserved in StD as a precursor state. Collectively, these results elucidate selective cognitive vulnerabilities in StD, offering critical insights for identifying at-risk individuals and informing targeted interventions to prevent progression to MDD.

Elevated peripheral body temperature, measured at the wrist using a wearable device, was observed in the StD group, consistent with patterns reported in MDD. Several mechanisms may underlie body temperature elevation in depression, including imbalances in sympathetic and parasympathetic activity contributing to impaired heat dissipation due to reduced sweating capacity, disrupted thermoregulation from neuroimmune interactions, chronic inflammation, stress, HPA axis dysregulation, and abnormalities in temperature-sensitive channels.27,56,57 For example, a recent study identified higher peripheral body temperature in individuals with mild depressive symptoms but no clinical diagnosis of MDD compared to those without depressive symptoms. This finding aligns with our results and suggests that thermoregulatory abnormalities may emerge early in the StD-MDD continuum.

Regarding HRV, our study found no significant difference in HF between the StD and control groups; however, LF was significantly reduced in the StD group. Although HF—an indicator of parasympathetic nervous activity—is known to decrease in MDD,22 our findings suggest that this alteration may not yet be pronounced in StD.58 LF, traditionally linked to sympathetic nervous activity, is now understood to reflect a broader balance between autonomic sympathetic and parasympathetic functions or neural responses to postural changes rather than purely sympathetic activity.59,60 Recent studies reported similar LF reductions in StD without significant HF differences, indicating that impaired baroreceptor function—sensors in the carotid sinus and aortic arch that regulate heart rate and vascular tone via autonomic reflexes—may contribute to these changes. This is supported by meta-analyses showing reduced LF in adults with MDD—particularly in younger (30–40 years) and older populations—suggesting a consistent autonomic dysregulation across the depressive spectrum.22,61 Notably, our study detected LF reductions through continuous HRV measurements using wearable devices, but not in 2-min pulse wave assessments conducted before cognitive tasks, possibly due to anticipatory stress from upcoming tasks influencing short-term measurements. These findings emphasize the value of continuous HRV monitoring as a sensitive, objective tool for detecting autonomic changes in StD, offering insights into its potential as an early marker for MDD risk and highlighting the importance of measurement context in psychophysiological research.

Assessment of emotional recognition with the ERT revealed reduced accuracy in identifying happy and surprised expressions in the StD group, suggesting early deficits in processing positive emotions that align with patterns observed in MDD. Previous meta-analyses have reported that individuals with MDD exhibit lower recognition accuracy for all facial expressions compared to healthy controls, with particularly pronounced deficits for happy expressions.32

In line with the present findings, a previous study found that MDD patients receiving a placebo showed significantly poorer recognition of happy and surprised expressions (often categorized as positive or non-negative emotions, in contrast to anger, disgust, fear, and sadness) compared to healthy controls.62–64 Our findings in StD mirror these deficits in positive emotion recognition, supporting the hypothesis that cognitive biases emerge early in the StD-MDD continuum. In contrast, emotional bias assessed with the EBT showed no evidence of a negative bias in StD, unlike in MDD, where patients frequently misinterpret ambiguous expressions as sad, with this bias diminishing as symptoms improve.65 The absence of a detectable negative bias in our StD cohort likely reflects their subthreshold depressive symptoms, highlighting a key distinction between StD and MDD and underscoring the potential of ERT as an objective marker for early identification of at-risk individuals.

The WIS scores on the PSS were significantly higher in the StD group, indicating that presenteeism, a hallmark of workplace impairment, is evident in this precursor state to MDD. Previous studies have established a strong association between MDD severity and presenteeism, with workplace productivity losses linked to depressive symptoms.18 Research targeting student populations further supports this, showing that students with emotional problems, including those exhibiting depressive mood and anxiety symptoms, exhibit significantly higher WIS scores than their peers.54 Our findings extend these observations to StD, demonstrating that presenteeism emerges before the onset of clinical MDD, consistent with prior reports suggesting that workplace impairments originate in StD.17,18 This highlights the functional significance of StD and reinforces the need for early detection of its cognitive and physiological markers to mitigate its societal and economic impacts through targeted interventions.

Notably, some indicators in StD exhibited patterns distinct from those typically observed in MDD, particularly in voice arousal, as measured by the ALVI, which was higher in the StD group compared to healthy controls. In contrast, previous studies have reported lower ALVI in MDD, with more severe depressive symptoms associated with reduced vocal intensity.25 It has been suggested that elevated speech arousal in the early stages of MDD may reflect heightened stress levels, a finding supported by comparisons showing higher voice arousal in MDD patients with milder symptoms.42 Our results align with this, suggesting that StD, as a precursor to MDD, may be characterized by a high-stress state that leads to increased ALVI compared to healthy controls. However, the validity and reliability of ALVI as a marker of vocal arousal in MDD remain limited. Likewise, although EALVI has been proposed as a marker of vocal arousal, it only showed a marginal trend in the present study. Therefore, it may be premature to draw firm conclusions about the clinical significance of these markers, and further studies need to investigate this issue.

Although no studies have specifically reported on the SD of movement time in the RTI in MDD, our findings revealed a significant increase in RTI movement time variability in the StD group, suggesting inattention or inconsistent motor performance as an early cognitive marker. A meta-analysis confirmed impaired attention in MDD, supporting the notion that attentional deficits may emerge in StD.29 However, simple motor speed in the RTI was not impaired in StD in this study, consistent with a meta-analysis indicating that motor speed deficits in MDD have a smaller effect size compared to attention.50 These findings highlight selective cognitive vulnerabilities in StD, distinguishing it from MDD and underscoring the potential of RTI variability as a novel marker for identifying individuals at risk.

The StD group was further divided into transient (low depression, BDI-II < 10) and persistent (maintained depression, BDI-II ≥ 10) subgroups to explore heterogeneity in marker profiles and MDD risk. In the low depression group, increased RTI movement time variability and lower A′ scores in the Rapid Visual Information Processing (RVP) task indicated deficits in sustained attention, suggesting that attentional impairments may predispose individuals to reactive depression rather than directly causing it. Conversely, the maintained depression group exhibited poorer performance on the DSST, reflecting deficit
