Inter-day reliability of heart rate complexity and variability metrics in healthy highly active younger and older adults

Participants

Sixty-six healthy individuals (50 male; 16 female) were recruited to participate in the study. Participants were divided into two age groups: a younger group (YG) aged 18 to 30 years (N = 22; 16 M, 6 F) and an older group (OG) aged 50 to 70 years (N = 44; 34 M, 10 F).

All participants were regular exercisers, having performed above the World Health Organisation guidelines (i.e., 2.5 to 5 h of moderate exercise per week; Bull et al. 2020) for ≥ 2 years. All participants were recruited to be closely matched for physical activity levels and exercise capacity. Participants were required to be non-obese non-smokers, to have no known cardiovascular, neuromuscular, renal, or metabolic conditions (or signs/symptoms of such conditions), and not to be taking medications or dietary supplements that would affect cardiac function. The study was completed with full ethical approval of the University of Kent Research Ethics Committee (Proposal number: 21_2020_21), according to Declaration of Helsinki standards. All participants provided written informed consent prior to testing.

Experimental design

Each participant completed three visits to the laboratory at the same time of day (± 1 h) between the hours of 8 am and 4 pm (AM visits, YG N = 8 and OG N = 21; PM visits, YG N = 14 and OG N = 23). Visit one involved participant screening, laboratory familiarisation, and an incremental exercise test (IET) to determine aerobic fitness. At visits two and three, participants completed a 30-min supine resting RR interval measurement from which the HRV metrics were derived.

Visits were conducted on non-consecutive days (with a minimum gap of 2 full days and a maximum gap of 5 days between visits). Participants were instructed to refrain from any exercise in the day prior to testing and from intense exercise in the two days prior. Participants were instructed to arrive euhydrated and in a post-prandial state, having eaten at least 4 h prior to testing. Participants were told not to consume caffeine within 8 h or alcohol within 24 h of testing.

Preliminary measurements and incremental exercise testing (visit one)

At visit one, prior to exercise testing, all participants provided written informed consent and completed a health questionnaire and the long-form International Physical Activity Questionnaire (Craig et al. 2003). Resting blood pressure, height, body mass and body composition were then measured, after which the participants completed a cycling IET to determine markers of aerobic fitness.

The IET protocol was performed on an electro-magnetically braked ergometer (Excalibur Sport, Lode BV, Groningen, The Netherlands). Participants completed a 10-min warm-up at 50 W, after which the required cycling power output increased by 25 W every minute (i.e., 1 W every 2.4 s) until they reached volitional exhaustion (operationally defined as a cadence of < 60 revolutions/min for > 5 s, despite strong verbal encouragement).

During the IET, respiratory gas exchange data were assessed using online breath-by-breath gas analysis (Metalyzer 3B; CORTEX Biophysik GmbH, Leipzig, Germany). Prior to all testing, the gas analyser was calibrated according to the manufacturer's recommendations using ambient air and known concentrations of oxygen and carbon dioxide. The bidirectional turbine (flow meter) was calibrated with a 3-L calibration syringe.

The participant’s peak oxygen uptake (\(\dot{V}\text{O}_{2\text{peak}}\)) was assessed as the highest oxygen uptake that was attained during a 1-min period in the test. Participants’ gas exchange threshold was determined as the breakpoint in carbon dioxide production and oxygen consumption (i.e., the point at which the carbon dioxide production begins to increase out of proportion to the oxygen consumption). This breakpoint also coincided with the increase in both ventilatory equivalent of oxygen (\(\dot{V}_{E}/\dot{V}\text{O}_{2}\)) and end-tidal pressure of oxygen with no concomitant increase in ventilatory equivalent of carbon dioxide (\(\dot{V}_{E}/\dot{V}\text{CO}_{2}\); Beaver and Wasserman 1986; Pallares et al. 2016). The respiratory compensation point was determined as an increase in both the \(\dot{V}_{E}/\dot{V}\text{O}_{2}\) and \(\dot{V}_{E}/\dot{V}\text{CO}_{2}\) and a decrease in partial pressure of end-tidal carbon dioxide (Whipp et al. 1989; Lucia et al. 1999).

Measurement of RR intervals (visits two and three)

For collection of RR intervals, participants were in a supine resting position in a temperature-controlled room set at 20 °C. The room was kept dark and quiet, and participants were instructed not to speak during the measurement and to breathe freely at their normal resting rate. Before the 30-min RR interval measurement commenced, an initial 20-min supine rest period was carried out to ensure participants were at complete rest and their heart rates were stable.

To collect the RR intervals participants wore a Polar H10 heart rate monitor with a Pro Strap (Polar Electro Oy, Kempele, Finland), which has been shown to provide strong agreement and comparable RR interval signal quality to conventional ECG devices (Gilgen-Ammann et al. 2019; Schaffarczyk et al. 2022). The elastic electrodes of the Pro Strap were moistened, and the strap lengthened to fit around the participant’s chest circumference as described by the manufacturer. The RR intervals were acquired at 1000 Hz via the Elite HRV application (Elite HRV, Asheville, NC, USA) on a mobile device positioned directly next to the participant. The RR intervals were then exported as a text file for processing and analysis offline in MATLAB.

RR interval data pre-processing

All RR interval time series were pre-processed to exclude artifacts and outliers. First, RR intervals shorter than 0.2 s or longer than 2.0 s were removed. Second, RR intervals that differed from the mean of the surrounding 40 RR intervals by more than 20% were excluded.
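The two-pass filter described above can be sketched in Python as follows. This is a minimal illustration and not the authors' MATLAB code: the function name `clean_rr`, the choice to take the "surrounding 40" intervals as 20 on each side (fewer near the ends of the recording), and the exclusion of the tested interval from the local mean are all assumptions.

```python
from statistics import mean


def clean_rr(rr, lo=0.2, hi=2.0, window=40, max_dev=0.20):
    """Two-pass RR-interval filter (values in seconds).

    Pass 1 drops intervals shorter than `lo` or longer than `hi`.
    Pass 2 drops intervals that differ by more than `max_dev` (as a
    fraction) from the mean of the surrounding intervals.
    Assumption: "surrounding 40" is read as window // 2 neighbours on
    each side, with the tested interval itself left out of the mean.
    """
    in_range = [x for x in rr if lo <= x <= hi]
    half = window // 2
    kept = []
    for i, x in enumerate(in_range):
        neighbours = in_range[max(0, i - half):i] + in_range[i + 1:i + 1 + half]
        if not neighbours:  # a single-sample series has nothing to compare with
            kept.append(x)
            continue
        local_mean = mean(neighbours)
        if abs(x - local_mean) <= max_dev * local_mean:
            kept.append(x)
    return kept
```

Calling `clean_rr(rr_seconds)` returns the retained intervals; the number removed is `len(rr_seconds) - len(clean_rr(rr_seconds))`.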

The number of RR interval artifacts and outliers removed from the RR interval time series was as follows. Day 1: YG, 19.6 ± 20.5 RR intervals or 1.12 ± 1.24% (range 0.05 to 4.33%) of total RR intervals; OG, 7.5 ± 10.6 RR intervals or 0.46 ± 0.64% (range 0.00 to 2.65%) of total RR intervals. Day 2: YG, 16.3 ± 15.9 RR intervals or 0.94 ± 0.94% (range 0.00 to 3.03%) of total RR intervals; OG, 6.7 ± 12.1 RR intervals or 0.42 ± 0.76% (range 0.00 to 4.10%) of total RR intervals.

Heart rate complexity: nonlinear metric analysis

Approximate and sample entropy

Approximate entropy (ApEn; Pincus 1991) and sample entropy (SampEn; Richman and Moorman 2000) quantify the conditional probability that a template length of m and m + 1 data points is repeated during the time series within a tolerance of r (set at a % of the time series SD). SampEn differs from ApEn, as it avoids counting self-matches by taking the logarithm after averaging, thus reducing the inherent bias existing within the ApEn calculation.

In the current study, template length was set at m = 2 and tolerance at r = 0.2 of the SD of the RR interval time series for both ApEn and SampEn analysis (Kaplan et al. 1991). ApEn was calculated as shown by Eq. (1) and SampEn by Eq. (2), where N is the number of data points in the time series, m is the length of the template, \(A_i\) is the number of matches of the ith template of length m + 1 data points, and \(B_i\) is the number of matches of the ith template of length m data points:

$$ApEn\left( m, r, N \right) = - \frac{1}{N - m}\sum\limits_{i = 1}^{N - m} \log \frac{A_{i}}{B_{i}}$$

(1)

$$SampEn\left( m, r, N \right) = - \log \left( \frac{\sum\nolimits_{i = 1}^{N - m} A_{i}}{\sum\nolimits_{i = 1}^{N - m} B_{i}} \right)$$

(2)
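As an illustration of Eqs. (1) and (2), the sketch below counts template matches directly with a brute-force O(N²) loop. It is not the authors' implementation: the function names, the use of the Chebyshev (maximum) distance, the sample SD for the tolerance, and the return of infinity when no matches are found are assumptions made here for the example.

```python
import math
from statistics import stdev


def _count_matches(x, length, r, n_templates, include_self):
    """For each of the first n_templates templates of the given length,
    count the templates lying within Chebyshev distance r."""
    counts = []
    for i in range(n_templates):
        c = 0
        for j in range(n_templates):
            if i == j and not include_self:
                continue
            if max(abs(x[i + k] - x[j + k]) for k in range(length)) <= r:
                c += 1
        counts.append(c)
    return counts


def sample_entropy(x, m=2, r_frac=0.2):
    """SampEn as in Eq. (2): self-matches excluded, log taken after summing."""
    r = r_frac * stdev(x)  # assumption: sample SD of the series
    n = len(x) - m
    b = sum(_count_matches(x, m, r, n, include_self=False))
    a = sum(_count_matches(x, m + 1, r, n, include_self=False))
    if a == 0 or b == 0:
        return float("inf")  # undefined when no matches exist
    return -math.log(a / b)


def approximate_entropy(x, m=2, r_frac=0.2):
    """ApEn as in Eq. (1): self-matches included, log taken per template."""
    r = r_frac * stdev(x)
    n = len(x) - m
    b = _count_matches(x, m, r, n, include_self=True)
    a = _count_matches(x, m + 1, r, n, include_self=True)
    return -sum(math.log(ai / bi) for ai, bi in zip(a, b)) / n
```

A perfectly periodic series gives values of zero for both metrics, whereas an irregular series gives larger values. For 30-min recordings the quadratic loop is slow in pure Python, so a vectorised implementation would normally be preferred.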

Detrended fluctuation analysis

The detrended fluctuation analysis (DFA) algorithm was used, as outlined by Peng et al. (1994), to measure the fractal scaling of the RR interval time series. The DFA algorithm allows for the detection of long-range correlations embedded in seemingly non-stationary physiological time series data. The RR interval time series is first integrated, using Eq. (3):

$$y(k) = \sum_{i=1}^{k}\left(RR_{i} - \overline{RR}\right), \quad k = 1, \ldots, N$$

(3)

The integrated time series is then divided into boxes of equal length, n. Within each box of length n, a least-squares line is fitted to the data; \(y_n(k)\) denotes the resulting trend in each box. The integrated time series y(k) is then detrended by subtracting the local trend, \(y_n(k)\), within each box. The root-mean-square fluctuation of the integrated and detrended time series is calculated by Eq. (4):

$$F\left( n \right) = \sqrt{\frac{1}{N}\sum\limits_{k = 1}^{N} \left[ y\left( k \right) - y_{n}\left( k \right) \right]^{2}}$$

(4)

The computation in Eq. (4) is repeated across all box sizes to provide a relationship between F(n), the average fluctuation as a function of box size, and the box size, n, the number of RR interval data points in a box. The slope of the double log plot, log F(n) vs log n, determines the scaling exponent α. DFA α was calculated over box sizes of 4 \(\le\) n \(\le\) 64 data points. DFA α1 was calculated over box sizes of 4 \(\le\) n \(\le\) 16 data points (i.e., scaling exponent calculated over short time scales) and DFA α2 was calculated over box sizes of 16 \(\le\) n \(\le\) 64 data points (i.e., scaling exponent calculated over long time scales), as used previously by Peng et al. (1995).
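The DFA steps of Eqs. (3) and (4) can be sketched in Python as below. This is an illustrative reading only: the function names are hypothetical, every integer box size in the range is used, boxes are non-overlapping, and any trailing samples that do not fill a whole box are discarded (so the fluctuation is averaged over the samples actually used). None of these details is specified in the text.

```python
import math


def _linfit(xs, ys):
    """Ordinary least-squares slope and intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def dfa_alpha(rr, n_min=4, n_max=64):
    """Scaling exponent: slope of log F(n) against log n."""
    mean_rr = sum(rr) / len(rr)
    y, total = [], 0.0
    for v in rr:  # Eq. (3): integrate the mean-centred series
        total += v - mean_rr
        y.append(total)
    log_n, log_f = [], []
    for n in range(n_min, n_max + 1):
        n_boxes = len(y) // n  # assumption: leftover samples are dropped
        ks = list(range(n))
        sq = 0.0
        for b in range(n_boxes):
            seg = y[b * n:(b + 1) * n]
            slope, icpt = _linfit(ks, seg)  # local trend y_n(k)
            sq += sum((s - (slope * k + icpt)) ** 2 for k, s in zip(ks, seg))
        f = math.sqrt(sq / (n_boxes * n))  # Eq. (4)
        log_n.append(math.log(n))
        log_f.append(math.log(f))
    alpha, _ = _linfit(log_n, log_f)
    return alpha
```

Under these assumptions, `dfa_alpha(rr, 4, 16)` and `dfa_alpha(rr, 16, 64)` would correspond to α1 and α2, and `dfa_alpha(rr)` to the overall α.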

The DFA produces a scaling exponent α. An α = 0.5 indicates that the value of one RR interval is completely uncorrelated with any previous values (i.e., unpredictable white noise; indicative of a very rough time series). An α = 1.5 indicates Brown noise and a loss of long-range correlations (i.e., a smooth output with long-term memory). An α of 1.0 (i.e., 1/f or pink noise) is suggestive of a physiological output of high complexity that is statistically self-similar with long-range correlations (Peng et al. 1995). Figure 1A presents an example raw RR interval time series and Fig. 1B presents the integrated time series with the least-squares fit “trend” line plotted for box sizes of 64 data points.

Fig. 1

A Example raw RR interval time series; B the integrated RR interval time series, with the least-squares fit representing the “trend” in each box (red lines) and the vertical lines indicating the box size of n = 64 data points. The RR interval data presented produced a DFA α = 1.04 (DFA α calculated over box sizes of 4 \(\le\) n \(\le\) 64; data were from a younger male participant aged 18 years)

Multiscale entropy

Multiscale entropy (MSE) analysis was performed as outlined by Costa et al. (2002) providing a measure of complexity of time series over multiple scales. The MSE analysis overcomes limitations of SampEn and ApEn which only measure the regularity of time series data on one scale, and therefore do not capture the structural and dynamical behaviour of the time series.

From the one-dimensional discrete time series, \(\{\chi_1, \ldots, \chi_i, \ldots, \chi_N\}\), coarse-grained time series, \(\{y^{(\tau)}\}\), were constructed, determined by the scale factor, τ, according to Eq. (5):

$$y_{j}^{(\tau)} = \frac{1}{\tau}\sum\limits_{i = \left( j - 1 \right)\tau + 1}^{j\tau} \chi_{i}, \quad 1 \le j \le N/\tau$$

(5)

At scale one (τ = 1), the coarse-grained time series is simply the original time series. The length of each coarse-grained time series is equal to the length of the original time series divided by the scale factor, τ. The SampEn for each coarse-grained time series is calculated and plotted against the scale factor, τ, producing an MSE curve. The SampEn of each coarse-grained time series was computed using Eq. (2) with a template length m = 2 and tolerance r = 0.2 of the SD of the RR interval time series. The area under the MSE curve was calculated from scales 1 to 8 using Eq. (6) and is defined as the complexity index (CI-8), with higher CI values indicating greater complexity of the physiological signal.

$$\text{CI-8} = \sum\limits_{i = 1}^{8} SampEn\left( i \right)$$

(6)
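A compact sketch of the coarse-graining in Eq. (5) and the summation in Eq. (6) is given below. It is illustrative only: the function names are hypothetical, the tolerance r is fixed from the SD of the original series and reused at every scale (one reading of the text), samples left over after the last complete window are dropped, and an infinite value is returned where SampEn is undefined.

```python
import math
from statistics import stdev


def coarse_grain(x, tau):
    """Eq. (5): average consecutive, non-overlapping windows of length tau."""
    n = len(x) // tau
    return [sum(x[j * tau:(j + 1) * tau]) / tau for j in range(n)]


def sampen(x, m, r):
    """SampEn with an absolute tolerance r (self-matches excluded).
    Each unordered pair is counted once, which leaves the ratio unchanged."""
    n = len(x) - m

    def total(length):
        c = 0
        for i in range(n):
            for j in range(i + 1, n):
                if max(abs(x[i + k] - x[j + k]) for k in range(length)) <= r:
                    c += 1
        return c

    a, b = total(m + 1), total(m)
    return -math.log(a / b) if a and b else float("inf")


def complexity_index(x, max_scale=8, m=2, r_frac=0.2):
    """Eq. (6): sum of SampEn over scales 1..max_scale, plus the MSE curve."""
    r = r_frac * stdev(x)  # assumption: r taken from the original series
    curve = [sampen(coarse_grain(x, tau), m, r)
             for tau in range(1, max_scale + 1)]
    return sum(curve), curve
```

The returned `curve` holds one SampEn value per scale factor and can be plotted against τ to obtain the MSE curve; the first return value corresponds to CI-8 when `max_scale=8`.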

Poincaré plot SD2

Poincaré plots of RR interval time series were produced by plotting each RR interval as a function of the previous RR interval (Woo et al. 1992). Poincaré plots were then analysed with an ellipse fitting procedure to derive the metrics SD1 (the standard deviation of the points perpendicular to the line of identity) and SD2 (the standard deviation along the line of identity; Brennan et al. 2001). Only SD2 was reported as SD1 is identical to RMSSD (Shaffer and Ginsberg 2017).
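One way to obtain the two Poincaré descriptors is to rotate each point (RR_n, RR_n+1) by 45° and take the standard deviation along each rotated axis. The sketch below follows that approach; the function name and the use of the sample SD are assumptions for illustration, and the ellipse-fitting routine actually used in the study may differ in detail.

```python
import math
from statistics import stdev


def poincare_sd(rr):
    """Return (SD1, SD2) for a sequence of RR intervals.

    Points are (RR_n, RR_n+1). SD1 is the spread perpendicular to the
    line of identity and SD2 the spread along it.
    """
    x, y = rr[:-1], rr[1:]
    root2 = math.sqrt(2)
    across = [(b - a) / root2 for a, b in zip(x, y)]  # perpendicular axis
    along = [(a + b) / root2 for a, b in zip(x, y)]   # line-of-identity axis
    return stdev(across), stdev(along)
```

A strictly alternating series spreads only across the line of identity (SD2 near zero), whereas a steady drift spreads only along it (SD1 near zero).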

Heart rate variability: linear metric analysis

Time-domain metrics

The time-domain measures of heart rate variability quantify the amount of variability present within the RR interval time series.

The root mean square of successive differences between normal RR intervals (RMSSD) was calculated using Eq. (7):

$$\text{RMSSD} = \sqrt{\frac{1}{N - 1}\sum\limits_{i = 1}^{N - 1} \left( RR_{i + 1} - RR_{i} \right)^{2}}$$

(7)

The standard deviation of normal RR intervals (SDNN) was calculated using Eq. (8):

$$\text{SDNN} = \sqrt{\frac{1}{N - 1}\sum\limits_{i = 1}^{N} \left( RR_{i} - \overline{RR} \right)^{2}}$$

(8)

The RMSSD and SDNN metrics were reported in milliseconds and as natural-logarithm-transformed values (LnRMSSD and LnSDNN).
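The two time-domain metrics and their log transforms can be computed in a few lines. The sketch below is illustrative: the function names are hypothetical, and dividing by N − 1 in SDNN (the sample SD) is an assumption, since some implementations divide by N.

```python
import math


def rmssd(rr_ms):
    """Root mean square of successive RR differences (ms)."""
    diffs = [b - a for a, b in zip(rr_ms, rr_ms[1:])]
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))


def sdnn(rr_ms):
    """Standard deviation of RR intervals (ms), with an N - 1 denominator."""
    mean_rr = sum(rr_ms) / len(rr_ms)
    return math.sqrt(sum((v - mean_rr) ** 2 for v in rr_ms) / (len(rr_ms) - 1))


def ln_metrics(rr_ms):
    """Natural-log transforms, i.e. LnRMSSD and LnSDNN."""
    return math.log(rmssd(rr_ms)), math.log(sdnn(rr_ms))
```

For the toy series `[800, 810, 790, 805, 795]` ms, the successive differences are 10, −20, 15 and −10 ms, so RMSSD² = 825 / 4 = 206.25 ms² and SDNN² = 250 / 4 = 62.5 ms².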

Frequency-domain metrics

The frequency-domain measures of heart rate variability provide an estimate of spectral power in frequency bands. The power spectrum was estimated using a parametric autoregressive model, with the absolute power in the low-frequency (LF) band (0.04–0.15 Hz) and high-frequency (HF) band (0.15–0.4 Hz) calculated, along with the LF/HF ratio. The absolute power in the LF and HF bands is reported in ms² and as natural-logarithm-transformed values (Ln).

Statistical analysis

Data are presented as individual values or mean ± SD (unless specified otherwise). Statistical analyses were conducted using IBM SPSS Statistics 29 (IBM, Armonk, New York, USA). Visual inspection of Q-Q plots and Shapiro–Wilk statistics were used to check whether data were normally distributed.

Day-to-day reliability of all heart rate complexity and variability metrics was assessed using a two-way random-effects intraclass correlation coefficient (ICC2,1) for absolute agreement, the standard error of measurement (SEM), the minimal detectable change (MDC) and bias (the mean difference between day 1 and day 2). Upper and lower 95% limits of agreement (LOA) were calculated as the mean of the differences between days ± 1.96 × the standard deviation of the differences. Between-day coefficients of variation (CVs) for all HRV metrics were calculated by dividing the SD of the two days’ measurements by the mean of the two days’ measurements and multiplying by 100. Between-participant CVs for all HRV metrics were calculated by dividing the SD of all participants’ measurements by the mean of all participants’ measurements and multiplying by 100. Paired-samples t-tests were used to assess whether the complexity and variability metrics differed significantly between days for each age group.
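The reliability statistics can be sketched as follows for two sessions. The ICC(2,1) follows the standard two-way ANOVA formulation for absolute agreement, and bias and LOA follow the definitions above. The SEM and MDC formulas are not stated in the text, so the commonly used forms SEM = SD × √(1 − ICC) and MDC = 1.96 × √2 × SEM are assumed here; the sign convention for bias (day 2 minus day 1) and the function names are likewise assumptions. The analyses in the study itself were run in SPSS.

```python
import math
from statistics import mean, stdev


def icc_2_1(day1, day2):
    """Two-way random, absolute-agreement, single-measure ICC for k = 2 days."""
    n, k = len(day1), 2
    rows = list(zip(day1, day2))
    grand = mean(list(day1) + list(day2))
    ss_rows = k * sum((mean(r) - grand) ** 2 for r in rows)
    ss_cols = n * sum((mean(c) - grand) ** 2 for c in (day1, day2))
    ss_total = sum((v - grand) ** 2 for r in rows for v in r)
    ms_rows = ss_rows / (n - 1)                      # between participants
    ms_cols = ss_cols / (k - 1)                      # between days
    ms_err = (ss_total - ss_rows - ss_cols) / ((n - 1) * (k - 1))
    return (ms_rows - ms_err) / (
        ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n)


def reliability(day1, day2):
    diffs = [b - a for a, b in zip(day1, day2)]      # assumption: day 2 - day 1
    bias, sd_diff = mean(diffs), stdev(diffs)
    icc = icc_2_1(day1, day2)
    # Assumed formulas (not given in the text):
    sem = stdev(list(day1) + list(day2)) * math.sqrt(max(0.0, 1 - icc))
    mdc = 1.96 * math.sqrt(2) * sem
    return {
        "icc": icc,
        "bias": bias,
        "loa": (bias - 1.96 * sd_diff, bias + 1.96 * sd_diff),
        "sem": sem,
        "mdc": mdc,
    }
```

As a check on the absolute-agreement behaviour, a constant shift of one unit between days for the scores 1 to 5 lowers the ICC from 1 to 5/6 even though the rank order is unchanged.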

Based on the ICCs, relative reliability was defined as: poor = ICC < 0.5, moderate = ICC ≥ 0.5 to < 0.75, good = ICC ≥ 0.75 to < 0.90 and excellent = ICC ≥ 0.90 (Koo and Li 2016).

Hedges’ g effect sizes and their 95% confidence intervals were calculated to assess differences in HRV metrics between the two age groups (YG vs. OG), and were interpreted as: 0.2 to 0.5 small effect, 0.5 to 0.8 medium effect, ≥ 0.8 large effect (Cohen 1992).
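For reference, Hedges' g is Cohen's d computed with the pooled SD and multiplied by a small-sample correction factor. The sketch below uses the common approximation J = 1 − 3 / (4(n1 + n2) − 9); the function names are hypothetical, the label returned for |g| < 0.2 is not taken from the text, and the confidence interval is omitted because the method used to compute it is not stated.

```python
import math
from statistics import mean, variance


def hedges_g(group_a, group_b):
    """Bias-corrected standardised mean difference (a minus b)."""
    n1, n2 = len(group_a), len(group_b)
    pooled_sd = math.sqrt(
        ((n1 - 1) * variance(group_a) + (n2 - 1) * variance(group_b))
        / (n1 + n2 - 2))
    d = (mean(group_a) - mean(group_b)) / pooled_sd
    correction = 1 - 3 / (4 * (n1 + n2) - 9)  # approximate J factor
    return d * correction


def interpret_g(g):
    """Thresholds as listed in the text, applied to the magnitude of g."""
    size = abs(g)
    if size >= 0.8:
        return "large"
    if size >= 0.5:
        return "medium"
    if size >= 0.2:
        return "small"
    return "below 0.2"  # not labelled in the text
```

For the toy groups `[2, 4, 6]` and `[1, 3, 5]`, d = 0.5 and J = 0.8, giving g = 0.4, which falls in the small range.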

Multiple linear regressions were performed to estimate the effect of participant age, sex and \(\dot{V}\text{O}_{2\text{peak}}\) on all heart rate complexity and variability metrics. Males were set as the baseline reference level; therefore, positive beta coefficients indicate that being female is associated with a higher value.

The significance level was set at P < 0.05 in all cases.
