Measuring navigational and digital health literacy in individuals with long-term conditions: latent trait analyses using Rasch modelling

Abstract

Background:

Mapping health literacy (HL) enables health information and treatment to align with individuals’ preferences and potentially makes healthcare truly person-centered. Without valid and reliable measurement tools, however, we risk misreading what people actually need. This study aimed to evaluate two measurement scales developed to assess navigational and digital HL in a population sample of individuals with long-term conditions (LTCs).

Methods:

This secondary study used cross-sectional data from the large-scale Health Literacy Population Survey 2019–2021. The two scales were evaluated in a national population (≥ 18 years) with self-reported long-term conditions (duration ≥ 6 months). A latent trait analysis using the unidimensional Rasch partial credit model (PCM) for polytomous responses was performed to evaluate overall and item-level fit. The social determinants gender, age, education level, employment status, population density, economic situation, and social status level were included as person factors when testing for differential item functioning (DIF).

Results:

A total of 1,064 participants were included in the analysis. Both the Navigational Health Literacy scale (NHL) (n = 1,063) and the Digital Health Literacy scale (DHI) (n = 926) met the assumption of unidimensionality. Reliability was sufficient for group-level comparisons, although the DHI scale could have been better targeted at respondents’ health literacy levels. Evidence for local independence was partially supported. Data-model fit statistics indicated an acceptable fit for both scales at the overall level (chi-square test) and the item level (infit statistics). However, one item on the NHL scale discriminated poorly and displayed disordered thresholds.

Conclusion:

These findings contribute to the ongoing refinement and validation of the NHL and DHI scales within the LTC population and highlight the need for further improvement to strengthen the conceptual understanding of navigating the health care environment and assessing, understanding, applying and using digital health information.

1 Introduction

Understanding the global disparities in hospital admission for long-term conditions (LTCs) reveals a pressing public health concern (1). In Europe, a staggering 80% of primary health care users over the age of 45 suffer from LTCs, with nearly half grappling with comorbidities (2). Amidst these challenges, health literacy (HL) emerges as a critical factor, influencing individuals’ proficiency to effectively navigate the healthcare landscape and access trustworthy digital health information (3–6). Various HL measurement scales are being validated and applied in different populations. To advance this ongoing initiative, it is essential to validate HL assessment scales across diverse target populations in various contexts (7). This paper aims to contribute to this important discourse by validating HL measures in individuals with LTCs, ultimately fostering better health outcomes for this group. Research has indicated that stronger HL correlates with improved self-management of LTCs (8).

The WHO recognizes personal HL as one of the most important health determinants (9). Personal HL refers to the proficiency to access, understand, appraise, and apply health information to make informed decisions about one’s own health and that of others (9). These skills relate to behaviors and lifestyle choices, preventive activities, self-management of diseases, as well as the use of health information and welfare services (10, 11). The public’s options for informed health decisions depend on a sufficient overview of which health care services to use and on the interaction between individual HL and the complexity and demands of the health services (12). Navigational HL encompasses the ability to manage information in a way that facilitates optimal navigation in the health care services, and to identify the right treatment or support at the right time (12). Digital HL refers to the ability to seek, find, understand, and appraise health information from digital sources, and to apply this knowledge to solve health challenges (13, 14). Moreover, digital HL involves the skills necessary to utilize digital services and welfare technology (15, 16). Optimizing this capacity to efficiently navigate the health services and utilize digital health services may reduce individual consequences of illness and avoid overburdening the health system (3, 17, 18). LTCs often require a high degree of self-management and patient engagement (19). Individuals living with these conditions have therefore emerged as a key demographic in health promotion and healthcare service planning (17, 20–23), particularly in efforts to achieve sustainable goals related to health and wellbeing for all (22, 24). Furthermore, HL is increasingly understood as a dynamic phenomenon that is shaped by the complexity of individuals’ life situations and influenced by a range of contextual factors, including social, cultural, and environmental conditions (10, 25).
Thus, individual HL is acknowledged as a relational concept which encompasses both systemic and individual contributors (26). Because HL is an asset that can be built through action at the individual, service and societal levels (27), a public health approach to HL assessment emphasizes collecting data on both comprehensive and specific HL measures, as well as on sociodemographic variables that capture the broad complexity influencing daily life decisions on health (28). Given the considerable and growing size of the LTC population, increasing HL in LTC populations may contribute significantly to more health-literate populations (26). Validating the scales applied in population surveys is vital in a global context. How the scales operate when applied to different target groups, and how each item is perceived by respondents in those groups, is therefore relevant. Trends in the digitalization of healthcare services and the growing availability of digital health information highlight the importance of incorporating both navigational and digital HL as key assessment outcomes of HL mapping.

A central focus in the field of HL research has been to identify population and patient groups with low HL or at high risk of low HL. This earlier strand of HL research found that individuals with LTCs face greater HL demands but often possess lower HL skills to effectively manage their health compared to the general population (29, 30). Associations between lower HL and lower health knowledge, higher decisional uncertainty and low patient engagement are concerns voiced in several studies (8, 31). This focus on low HL has primarily been addressed for low- to middle-income countries (21). As stressed in HL research, public health assessment of populations handling health and illness in everyday life is a key priority, and including social determinants and the broader societal context of HL in assessment studies is recommended.

Despite research emphasizing that HL is a vital key for health promotion and the treatment of LTCs (4, 8, 10, 22, 32, 33) and essential for self-management skills (17, 21), there is a lack of research on HL within LTC populations (11, 17, 18). Nevertheless, the mapping of HL has utilized various specialized HL scales for targeted assessments, such as Navigational Health Literacy (HLS19-NHL_Norwegian) (NHL) and Digital Health Literacy in the domain of Digital Health Information (HLS19-DHI_Norwegian) (DHI), alongside the comprehensive International Health Literacy Population Survey Questionnaire (HLS19). This ongoing research supports evidence-informed health promotion strategies. The primary goal is to obtain valid and reliable HL data to assess and compare populations across regions and countries, as well as to enable longitudinal monitoring. Assessing HL in populations with LTCs is essential to address demographic and technological changes and to guide health care policy development (10, 34–38). However, there remains a significant need for valid and reliable measurement tools specifically tailored to assess navigational and digital HL across diverse LTC populations, as these are key to the effective management of such conditions. While NHL and DHI measurement scales have been validated in general populations (39), there is a notable research gap concerning their applicability and validity in populations with LTCs or non-communicable diseases (NCDs). Current public HL assessments primarily focus on individual patients in clinical settings and do not sufficiently capture population-level HL mapping within specific contexts or subgroups (27). Crucially, it remains unclear whether population surveys yield valid measurements for individuals with LTCs. Existing HL measurement studies in LTC populations typically focus narrowly on particular diagnoses or clinical subgroups (40), rather than reflecting the full diversity of the broader LTC population (8, 26).
This limitation reveals a significant gap in knowledge and underscores the importance of adopting a population-based perspective that includes a more heterogeneous sample of self-reported LTCs (27, 40, 41).

Moreover, although NHL and DHI measurement scales have been validated in general populations (7, 16) and specific target groups (42), their applicability has not yet been examined among individuals with LTCs. Traditional validation methods, particularly confirmatory factor analysis (CFA) (40), continue to dominate the HL field. However, recent studies highlight the advantages of applying Rasch modelling in achieving specific objectivity (16, 39, 43–46).

CFA models for categorical data allow varying item loadings, and item response theory (IRT) models allow items to differ in discrimination, whereas Rasch models constrain discrimination parameters to be equal across items. This constraint results in independence between person parameters (what is being measured) and item parameters (the measurement tool), a property known as specific objectivity. IRT models, including Rasch models, place items and persons on the same measurement scales and compare the distribution of person proficiency to the distribution of item thresholds.
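
To make the model concrete, the following minimal sketch computes category probabilities under the partial credit model for one person and one four-category item. The function name, the person location and the threshold values are hypothetical illustrations and are not estimates from this study. Because every item shares the same (unit) discrimination, only the differences between the person location and the item thresholds enter the calculation.

```python
import math

def pcm_probabilities(theta, thresholds):
    """Category probabilities under the partial credit model (PCM).

    theta: person proficiency in logits.
    thresholds: item thresholds delta_1..delta_m in logits.
    Returns a list of m + 1 probabilities for scores 0..m.
    """
    # Cumulative sums of (theta - delta_k); score 0 has an empty sum of 0.
    numerators = [0.0]
    for delta in thresholds:
        numerators.append(numerators[-1] + (theta - delta))
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(numerators)
    weights = [math.exp(value - peak) for value in numerators]
    total = sum(weights)
    return [weight / total for weight in weights]

# Hypothetical thresholds for a four-category item (scores 0-3).
probs = pcm_probabilities(theta=0.65, thresholds=[-2.0, 0.0, 2.0])
print([round(p, 3) for p in probs])
```

When the person location equals a threshold, the two adjacent categories are equally probable, which is why thresholds can be read as the points on the scale where the more likely response shifts from one category to the next.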

In heterogeneous samples of individuals with LTCs drawn from the general population, where both proficiency and interpretations of item content may vary, it is important to examine whether HL scales measure invariantly across the general population and a heterogeneous subgroup of individuals with LTCs. In line with these recommendations, this study applies the Rasch model to enhance the measurement of HL in heterogeneous LTC populations.

2 Methods

2.1 Aim

The aim of this study was to explore the psychometric properties of the measurement scales used to assess Navigational Health Literacy (HLS19-NHL_Norwegian) (NHL) and Digital Health Literacy within the dimension of Digital Health Information (HLS19-DHI_Norwegian) (DHI) in a population sample of individuals with long-term conditions (LTCs). The data collected were tested against the Rasch partial credit parametrization (PCM) of the unidimensional Rasch model for polytomous responses (47). More specifically, we examined whether the data generated by the scales were consistent with the following evaluative statements, which we formulated as hypotheses:

H1: The measurement scales collect data that meet the unidimensional Rasch model requirements of unidimensionality and acceptable reliability, and display proper targeting with no violation of local independence.

H2: Each item displays sufficient data-model fit, with ordered response categories.

2.2 Design

This study is a secondary analysis of survey data collected in Norway as part of the international large-scale European Health Literacy Population Survey 2019–2021 (HLS19) (36, 39, 48). Administered by the WHO Action Network on Measuring Population and Organizational Health Literacy (M-POHL) of Europe, HLS19 mapped HL across 17 countries (39, 49, 50). In Norway, the data were collected in 2020 by a national survey agency using computer-assisted telephone interviews (CATI). The response rate, calculated relative to the number of individuals contacted, was 20%; further details regarding sampling and representativeness are reported elsewhere (36, 48). The analysis adheres to the STROBE guidelines for cross-sectional studies (51) as well as the Rasch reporting guideline for rehabilitation research (52).

2.3 Participants

The current study included 1,064 respondents ≥18 years who reported having one or more LTCs (Table 1). This target sub-population was selected from the general population, constituting a heterogeneous sample of diverse LTCs, including musculoskeletal conditions and hypertension, cardiovascular diseases, diabetes mellitus, mental illness, lung disease, and rare diseases. A total of 3,000 participated in the national survey; however, 94 were excluded from our sample of adults (89 reported being under 18, and five had missing data). Additionally, respondents with missing values regarding LTCs were excluded (n = 24). Among the individuals with self-reported LTCs (n = 1,064), n = 1,063 responded to at least one NHL-item and n = 926 responded to at least one DHI item. The latter sample size was lower as those who responded “no” on the routing item about whether they had ever used the internet to search for health information (36), were excluded from the DHI analyses.

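| Sociodemographic variables | LTCs | Mis. | No LTCs | Mis. |
|---|---|---|---|---|
| n | 1,064 | | 1,818 | |
| Gender | | | | |
| Female | 56 | - | 46 | - |
| Male | 44 | - | 54 | - |
| Age cat | | | | |
| ≤ 65 | 74 | | 83 | - |
| > 65 | 26 | | 17 | |
| Education level | | 2 | | - |
| ISCED 0–5 | 50 | | 46 | |
| ISCED 6–8 | 48 | | 52 | |
| Employment status | | | | |
| Employed | 61 | - | 80 | - |
| Unemployed | 39 | | 20 | |
| Population density (urbanity) | | - | | 1 |
| ≥ 50,000 | 40 | | 45 | |
| < 50,000 | 60 | | 54 | |
| Economic deprivation | | 1 | | 1 |
| No | 49 | | 56 | |
| Yes | 50 | | 43 | |
| Social status level | | 4 | | 3 |
| Low (1–4) | 11 | | 7 | |
| High (5–10) | 86 | | 90 | |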

Sample characteristics of sub-sample with self-reported LTCs compared to the sub-sample reporting no LTCs.

Sub-sample with self-reported LTCs (n = 1,064) compared to the sub-sample reporting no LTCs (n = 1,818) in a representative national sample (Norway) of the cross-sectional HLS19 data. All numbers are reported in %, except n. Education level: ISCED 5 ≤ short-cycle tertiary education/ISCED 6 ≥ bachelor’s or equivalent; Employment status: ‘unemployed’ entails unemployed, retired and unable to work; Economic deprivation: ‘Yes’ entails very difficult to pay bills at the end of the month; Social status: operationalized as a ten-level variable in HLS19, split between levels four and five to indicate low or high perceived social status.

2.4 Measurement scales

The culturally adapted versions of the NHL and DHI measurement scales, based on work by M-POHL and further developed by the Norwegian HLS19 national study team, consist of 12 and 8 items, respectively. A four-point self-report rating scale graded 0–3 (very easy, easy, difficult and very difficult) was used for all items (36, 48). NHL is conceptually balanced between organizational- and system-level items. DHI reflects the domain of health information on the Digital Health Literacy scale (36, 48). The item statements and domains are presented in Table 2.

| Item label | Statement (“on a scale from very easy to very difficult: how easy would you say it is to…?”) | Domain |
|---|---|---|
| NHL1 | Understand information about how the health care system is structured and is functioning | System |
| NHL2 | Determine what type of health care you need when you have a health problem | System |
| NHL3 | Determine whether health insurance covers your need for a particular health service | System |
| NHL4 | Understand information about ongoing health care reforms that may impact your health care services | System |
| NHL5 | Find out what rights you have as a patient or user of health services | System |
| NHL6 | Determine what health services to choose if you need one | Organizational |
| NHL7 | Find information about the quality of a particular health service | Organizational |
| NHL8 | Determine whether a particular health service meets your need for health care | Organizational |
| NHL9 | Know how to book an appointment at the primary health service | Organizational |
| NHL10 | Find out how patient and user organizations or similar can help you navigate the health care system | Organizational |
| NHL11 | Find the right contact person for your needs at a health institution | Organizational |
| NHL13 | Find out if a health service requires a deductible | System |
| DHI1 | Use the proper words or search query to find the information you are looking for | Digital |
| DHI2 | Find the exact information you are searching for | Digital |
| DHI3 | Understand the information | Digital |
| DHI4 | Judge whether the information is reliable | Digital |
| DHI5 | Judge whether the information is offered with commercial interests | Digital |
| DHI6 | Visit different websites to check whether they provide similar information about a topic | Digital |
| DHI7 | Judge whether the information is applicable to you | Digital |
| DHI8 | Use the information to help solve a health problem | Digital |

Item labels, statements and domains in the NHL and DHI measurement scales.

NHL12 was removed and replaced by NHL13 as part of the national validation.

2.5 Rasch model estimation

We explored the proposed hypotheses by applying the Rasch partial credit parametrization (PCM) (47) of the unidimensional Rasch model to test the psychometric properties of the navigational and digital HL scales in the LTC population. The analyses were performed using RUMM2030 Plus (53), employing pairwise maximum likelihood estimation for item parameters and subsequently Warm’s weighted likelihood estimation for person parameters (54). Additionally, to further evaluate individual item fit and overall data-model fit, we used the mirt R package with marginal maximum likelihood estimation (55–57).

2.6 Handling missing data

Individuals with missing values for all scale items were not included in the analysis. The number of participants with missing values differed between the NHL and DHI scales. The scales were analyzed separately, and respondents who answered “no” to a routing question about using the internet to search for health information were not asked the DHI items.

2.7 Rasch model application

Using Akaike Information Criterion (AIC) (58), we identified and selected the most appropriate Rasch model for polytomous items. The PCM described the data better than the Rating Scale Model (RSM), as the AIC decreased by 65.4 for the NHL scale and by 48.9 for the DHI scale when the PCM was applied. Therefore, the PCM was retained for all subsequent Rasch model applications.
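
As a minimal sketch of this model comparison, the AIC can be computed from a model's maximized log-likelihood and its number of free parameters, with the lower value indicating the preferred model. The log-likelihoods and parameter counts below are hypothetical placeholders chosen only to illustrate the calculation; they are not the values obtained in this study.

```python
def aic(log_likelihood, n_parameters):
    """Akaike Information Criterion: 2k - 2 ln(L)."""
    return 2 * n_parameters - 2 * log_likelihood

# Hypothetical fitted values (NOT results from this study).
# The RSM shares one threshold structure across items and therefore
# uses fewer parameters than the PCM, which estimates thresholds per item.
aic_rsm = aic(log_likelihood=-12500.0, n_parameters=14)
aic_pcm = aic(log_likelihood=-12445.0, n_parameters=36)

# A positive difference means the PCM has the lower (better) AIC.
aic_decrease = aic_rsm - aic_pcm
print(aic_rsm, aic_pcm, aic_decrease)
```

The penalty term 2k means the PCM is only preferred when its improvement in log-likelihood outweighs the extra threshold parameters it estimates.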

2.7.1 Unidimensionality

To test the assumption of unidimensionality (59), dependent t-tests were used to estimate the proportion of respondents with statistically significantly different proficiency estimates based on identified subscales (target value is < 5%). For the NHL scale, we used the system-level and the organizational-level item subsets as theoretically defined subscales (Table 2), while we used principal component analysis (PCA) of Rasch-model residuals (59) to empirically identify possible subscales for the DHI scale. Dependent t-tests were also used to confirm the two-dimensional structure of the composite scale combining NHL items and DHI items.
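
The logic of this test can be sketched as follows, assuming the commonly used formulation in which each respondent's two subscale estimates are compared with a t-statistic based on their standard errors, and a normal-approximation confidence interval is placed around the proportion of significant tests. Whether the software used in this study applies exactly these expressions is an assumption, and all input values are hypothetical.

```python
import math

def proportion_significant_t_tests(est_a, se_a, est_b, se_b, critical=1.96):
    """Share of persons whose two subscale estimates differ significantly.

    Returns the proportion and the lower bound of its approximate 95% CI.
    """
    significant = 0
    for a, sa, b, sb in zip(est_a, se_a, est_b, se_b):
        # Person-level t-statistic for the difference between estimates.
        t = (a - b) / math.sqrt(sa ** 2 + sb ** 2)
        if abs(t) > critical:
            significant += 1
    n = len(est_a)
    proportion = significant / n
    half_width = 1.96 * math.sqrt(proportion * (1 - proportion) / n)
    return proportion, max(proportion - half_width, 0.0)

# Hypothetical person estimates (logits) from two item subsets.
subset_a = [0.5, 1.2, -0.3, 2.0]
subset_b = [0.4, 1.0, -0.1, -1.5]
errors = [0.6, 0.6, 0.6, 0.6]

proportion, lower_bound = proportion_significant_t_tests(
    subset_a, errors, subset_b, errors
)
print(proportion, lower_bound)
```

Under this formulation, unidimensionality is usually considered supported when the proportion, or the lower bound of its confidence interval, does not exceed 5%.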

2.7.2 Reliability

Scale reliability was estimated by the Person Separation Index (PSI) (60) and the marginal reliability index. Reliability coefficients should preferably be at least 0.65 for use at the group level (61).
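
A minimal sketch of the PSI, assuming its usual definition as the share of the observed variance in person estimates that is not attributable to measurement error. The person estimates and standard errors are hypothetical, and the function name is introduced here for illustration only.

```python
import statistics

def person_separation_index(estimates, standard_errors):
    """PSI = (observed variance - mean error variance) / observed variance."""
    observed_variance = statistics.variance(estimates)
    error_variance = statistics.mean([se ** 2 for se in standard_errors])
    return (observed_variance - error_variance) / observed_variance

# Hypothetical person estimates (logits) and their standard errors.
estimates = [-1.0, 0.0, 1.0, 2.0]
standard_errors = [0.5, 0.5, 0.5, 0.5]

psi = person_separation_index(estimates, standard_errors)
print(round(psi, 2))
```

The index approaches 1 when person estimates are widely spread relative to their standard errors, and falls when the scale cannot distinguish respondents from one another.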

2.7.3 Targeting

Scale targeting was explored by comparing the distribution of person proficiencies (which represent the respondents’ scale scores) to the distribution of item thresholds (indicating the ‘difficulty’ of endorsing the various response categories), where the mean of the latter was constrained to 0. The extent to which the mean person proficiency lies above or below the item mean indicates whether respondents find the scale easier or more difficult than intended, and the mean position indicates how well the distribution of person proficiencies matches the distribution of item thresholds. As the item mean is fixed at 0, a positive mean for person estimates indicates that the items in the scale are easy on average. The standard deviation of the person estimates, in logits, indicates the spread of person measures on each scale (52, 62).

2.7.4 Local independence

Measurement scales may exhibit response-level and trait-level violations, including violations of the assumption of local independence (LID) (63). The assumption of invariance was tested using differential item functioning (DIF). When items display DIF, they function differently across groups, indicating a lack of invariance. DIF can also be interpreted as a trait-level violation, as the test identifies factors associated with the latent trait. An item displays differential item functioning (DIF) if respondents with the same proficiency from different groups differ in their response patterns (e.g., the two group levels males and females for the person factor gender).

The DIF analyses tested whether either the slope of the item characteristic curve (indicating non-uniform DIF) or the locations of the item thresholds (indicating uniform DIF) varied across the levels of the relevant person factor (64). Using analysis of variance of standardized Rasch model residuals (65) at a Bonferroni-adjusted 5% level (66), DIF was tested for gender, age, education level, employment status, population density, economic situation and social status (Table 1). While the target value for chi-square probability equals 0.05 divided by the number of items (e.g., 0.05/12 for NHL), the target value applied in the DIF analysis was further divided by two to account for the two types of DIF tested (non-uniform and uniform) (e.g., 0.05/24 for NHL). Age was dichotomized in line with the age categorization applied in health promotion (67) and the definition of older adults (65 or older/below 65). Education level was split between ISCED (International Standard Classification of Education) levels 5 and 6 (6 = bachelor’s or equivalent) (68). Employment status was recoded to either employed or unemployed, with the unemployed category including those who are unemployed, retired or who reported that they were unable to work due to long-standing illness. Population density was recoded from four categories to a binary classification of above or below 50,000 inhabitants. Economic situation, also known as economic deprivation, was recoded from four categories (very easy, easy, difficult or very difficult to pay bills at the end of the month) into a dichotomous variable classified as ‘Pay bills’ (very easy versus easy/difficult/very difficult). Social status, initially defined as a ten-level variable in HLS19 (36), was split between levels four and five to indicate low or high perceived social status. Person factors were selected to address the recognized need to include sociodemographic variables in health literacy (HL) assessment, given concerns about social gradients in HL development (69).
Dichotomization was performed using optimal cut-off points.
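
The Bonferroni-adjusted target values described above amount to simple arithmetic, sketched here. The NHL values follow the examples given in the text; the DHI values assume that the same rule was applied to its eight items, and the function name is hypothetical.

```python
def bonferroni_alpha(alpha, n_items, tests_per_item=1):
    """Significance level divided by the total number of tests."""
    return alpha / (n_items * tests_per_item)

# NHL: 12 items; DIF analyses test two effects per item
# (uniform and non-uniform DIF).
nhl_fit_alpha = bonferroni_alpha(0.05, n_items=12)
nhl_dif_alpha = bonferroni_alpha(0.05, n_items=12, tests_per_item=2)

# DHI: 8 items, assuming the same adjustment rule.
dhi_fit_alpha = bonferroni_alpha(0.05, n_items=8)
dhi_dif_alpha = bonferroni_alpha(0.05, n_items=8, tests_per_item=2)

print(nhl_fit_alpha, nhl_dif_alpha, dhi_fit_alpha, dhi_dif_alpha)
```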

2.7.5 Data-model fit

Data-model fit was tested by Bonferroni-adjusted chi-square fit statistics at the overall level, by infit values at the item level, and by graphical inspection of item characteristic curves. The full sample sizes were used for all analyses (n = 1,063 for NHL and n = 926 for DHI), but adjusted sample sizes corresponding to 10–30 persons per threshold (62, 70, 71) were applied for DIF analyses and for analyses of data-model fit at the item level and overall scale level, as chi-square fit statistics are sample-dependent (72). A non-significant chi-square test indicates that the observed chi-square value, or a larger one, is probable under the assumption of good model fit. The expected value of the infit mean square (MNSQ) is 1 (55), under the hypothesis that the PCM explains all the variance in the item responses (73). Infit values between 0.7 (strong discrimination) and 1.3 (weak discrimination) are considered sufficient (10, 39), and indicate that there is 30% less or 30% more variation in the data than explained by the PCM, respectively (74). Ordered uncentralized item thresholds were interpreted as evidence for ordered response categories (47). The assumption of local independence was tested by estimating correlations between Rasch-model residuals (Yen’s Q3) for each pair of items (65, 75); violations of this kind are also known as ‘response dependence’ and ‘response violation of local independence’ (63).
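
The two item-level statistics described above can be sketched as follows, assuming the standard definitions: infit MNSQ as the sum of squared residuals divided by the sum of model variances, and Q3 as the Pearson correlation between the residuals of two items. All function names and data are hypothetical, and statistical software may use standardized rather than raw residuals.

```python
import math

def pcm_probabilities(theta, thresholds):
    """PCM probabilities for scores 0..m given thresholds in logits."""
    numerators = [0.0]
    for delta in thresholds:
        numerators.append(numerators[-1] + (theta - delta))
    peak = max(numerators)  # subtract the maximum for numerical stability
    weights = [math.exp(value - peak) for value in numerators]
    total = sum(weights)
    return [weight / total for weight in weights]

def expected_and_variance(theta, thresholds):
    """Model-expected score and its variance for one person and item."""
    probs = pcm_probabilities(theta, thresholds)
    expected = sum(score * p for score, p in enumerate(probs))
    variance = sum((score - expected) ** 2 * p for score, p in enumerate(probs))
    return expected, variance

def infit_mnsq(responses, thetas, thresholds):
    """Information-weighted mean square fit statistic for one item."""
    squared_residuals = 0.0
    model_variance = 0.0
    for observed, theta in zip(responses, thetas):
        expected, variance = expected_and_variance(theta, thresholds)
        squared_residuals += (observed - expected) ** 2
        model_variance += variance
    return squared_residuals / model_variance

def q3(residuals_a, residuals_b):
    """Pearson correlation between the residuals of two items."""
    n = len(residuals_a)
    mean_a = sum(residuals_a) / n
    mean_b = sum(residuals_b) / n
    cov = sum((a - mean_a) * (b - mean_b)
              for a, b in zip(residuals_a, residuals_b))
    ss_a = sum((a - mean_a) ** 2 for a in residuals_a)
    ss_b = sum((b - mean_b) ** 2 for b in residuals_b)
    return cov / math.sqrt(ss_a * ss_b)

# Hypothetical responses (scores 0-3), person estimates and thresholds.
responses = [0, 1, 2, 3, 2]
thetas = [-2.0, -0.5, 0.5, 2.5, 1.0]
thresholds = [-1.5, 0.0, 1.5]
print(round(infit_mnsq(responses, thetas, thresholds), 2))
```

Values of the first statistic near 1 indicate that the observed variation matches what the model predicts, while large positive residual correlations between a pair of items suggest response dependence.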

To further evaluate model adequacy, goodness-of-fit indices (GOFIs) are reported for the PCM, including the standardized root mean squared residual (SRMR) with target value < 0.08 (76), the root mean squared error of approximation (RMSEA) with target value < 0.05 (77), and the comparative fit index (CFI) and the Tucker-Lewis index (TLI) with target value > 0.95 (78, 79). The target values are valid for SEM-based CFA with continuous indicators and should be interpreted with caution when applied to item response theory (IRT) models. In the current analyses of polytomous data, we applied M2* specified by C2, which is conceptually comparable to the chi-square fit index for CFA (56, 57). These estimates are reported as a supplementary analysis for descriptive and comparative purposes.
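
As a rough illustration of how the RMSEA relates to a chi-square-type statistic, the sketch below uses the conventional expression based on the statistic, its degrees of freedom and the sample size. The inputs are the M2-type statistics, degrees of freedom and sample sizes reported in Table 3; that the software computes the index with exactly this expression is an assumption.

```python
import math

def rmsea(chi_square, df, n):
    """Conventional RMSEA from a chi-square-type statistic."""
    return math.sqrt(max(chi_square - df, 0.0) / (df * (n - 1)))

# Statistics, degrees of freedom and sample sizes as reported in Table 3.
rmsea_nhl = rmsea(717.10, 65, 1063)
rmsea_dhi = rmsea(273.17, 27, 926)

# Both values round to 0.10.
print(round(rmsea_nhl, 2), round(rmsea_dhi, 2))
```

Because the statistic is divided by both the degrees of freedom and the sample size, the index is less sensitive to sample size than the chi-square test itself.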

3 Results

The analysis provides partial support for H1, which concerns the evaluative criteria of unidimensionality and reliability, and proper targeting and local independence when applied in the population of individuals with long-term conditions (LTCs). The first part of H1 was largely supported across both scales, with evidence of sufficient unidimensionality and reliability. Local independence (LID) was partly supported, as no substantial response dependency was observed, but compromised by items displaying differential item functioning (DIF). H1 was further weakened in the Digital health literacy scale (DHI), where poor person–item targeting was observed. H2, addressing data-model fit and threshold ordering, indicated item-level issues in the Navigational health literacy scale (NHL).

3.1 Unidimensionality

For individuals with LTCs, both scales sufficiently met the requirement of unidimensionality, with fewer than 5% of dependent t-tests being statistically significant.

3.2 Reliability

We observed sufficient reliability for both scales, with the person separation index (PSI) and Cronbach’s alpha exceeding 0.85.

3.3 Targeting

The NHL scale was better targeted to the LTC population than the DHI scale (Table 3). Graphical representations of the threshold-person distribution are provided in Figures 1 and 2, confirming the slightly skewed distribution of the DHI scale. Both item sets are presented in Table 4 in ascending order of ‘overall item location’, indicating the ordering of HL-related tasks from the ‘easiest’ to the most ‘difficult’.

| Statistics | HLS19-NHL (n = 1,063) | HLS19-DHI (n = 926) |
|---|---|---|
| Unidimensionality, % sign. tests (CI) | 6.42 (0.05) | 5.56 (0.04) |
| Chi-square, χ2 (df, n) p | 284.36 (108, 1063) < 0.00 | 129.99 (72, 926) < 0.00 |
| Chi-square, amended sample, χ2 (df, n) p | 133.41 (108, 487) < 0.05 | 92.76 (72, 623) < 0.05 |
| Mean person location, m (sd) | 0.65 (1.69) | 1.63 (1.97) |
| Min and max item threshold | −3.31; 4.51 | −3.50; 3.83 |
| Person separation index (PSI) | 0.89 | 0.86 |
| α (MIRT) | 0.92 | 0.89 |
| SRMSR (MIRT) | 0.09 | 0.07 |
| RMSEA (MIRT) | 0.10 | 0.10 |
| M2 RMSEA p value [χ2 (df) p] (MIRT) | 717.10 (65) 0 | 273.17 (27) 0 |
| CFI (MIRT) | 0.95 | 0.96 |
| TLI (MIRT) | 0.94 | 0.96 |

Overall scale level results for HLS19-NHL and HLS19-DHI Norwegian versions in the LTC population.

m/sd and thresholds in logits; SRMSR, Standardized Root Mean Square Residual; RMSEA, Root Mean Square Error of Approximation; CFI, Comparative Fit Index; TLI, Tucker-Lewis Index; M2 specified by C2; the sample size for DHI is reduced by the inclusion criterion (“have you ever searched for health information on the internet”); extreme scorers are included in RUMM estimates. Dimensionality between NHL and DHI [% sign. tests (CI) = 23.19 (0.02)].
