Development and validation of a measurement instrument for physical activity-related health literacy (PA-HL): a study protocol

This research is based on a mixed-methods design and will be conducted in Germany from May 2023 to November 2026.

Design of the study

Following the guidelines of Boateng et al. [27], this research consists of three phases (see Fig. 1).

Fig. 1 Study design for a three-phase development process for the measurement instrument on physical activity-related health literacy among German adults, 2023–2026

Phase 1: literature search and item generation

The first step is to define the concept and domains of the measurement instrument, focusing on the areas of health care, disease prevention and health promotion. The HL conceptualisation by Sørensen et al. [15] serves as a conceptual framework. The definitions of the competences for accessing, understanding, appraising, and applying information are based on Bitzer & Sørensen [28]. Using a deductive item generation method, a preliminary item pool is developed based on an in-depth analysis of existing HL instruments and a literature review (in Medline and PubMed on predictors of PA and information-seeking behaviour for PA). Existing measurement instruments are used to delimit the content of the measurement instrument in order to avoid overlaps. As no comparable instruments on PA exist to date, general HL measurement instruments are identified and items are transferred to the topic of PA. In line with the widely accepted conceptual understanding of Sørensen et al. [15], PA-HL is preliminarily defined as: "PA-HL is linked to literacy and entails people's knowledge, motivation and ability to access, understand, appraise and apply relevant health information about physical activity in different forms in order to make everyday decisions to incorporate physical activity for disease management, disease prevention and health promotion in order to maintain or improve quality of life." Exemplary items developed during the initial phase, grounded in the generic HL dimensional framework, are presented in Table 1.

Table 1 Health literacy matrix on exemplary items in the three dimensions and four competences of physical activity-related health literacy

Phase 2: scale development (conversion of individual items into a measurement instrument)

Secondly, a three-stage eDelphi process will be conducted to assess the preliminary item pool and to test content validity, following the Guidance on Conducting and REporting DElphi Studies (CREDES) [29]. An eDelphi is a method that can be used as an evaluation and assessment tool to reach consensus on research and development issues, e.g. on the relevance and comprehensibility of items [30]. The eDelphi encompasses three iterative rounds, preceded by pretesting with independent reviewers who had no prior involvement in item development or the subsequent eDelphi procedure. In the first and last round, an online survey is conducted in which the experts are asked to rate each item with regard to the indicators of content validity: (1) relevance to the definitions of the individual items and presentation of the respective definition and (2) clarity and comprehensibility of the item [31]. A 4-point Likert scale is used to assess relevance (very relevant, relevant, moderately relevant, irrelevant) and comprehensibility (very simple, simple, difficult, very difficult). In the second eDelphi round, an online workshop is held with experts to discuss the structure and particularly heterogeneous results of the first round. Between the eDelphi rounds, the feedback is summarised and sent to the participants in anonymised form [32]. Based on the results of the eDelphi rounds, the item pool is reduced, adapted or expanded.

In addition, in this second phase, cognitive interviews are conducted with potential end-users to assess whether the item pool is appropriate for the objectives of the measurement instrument, to assess face validity [27] and to further develop and test the instrument [33]. Cognitive interviews have proven to be a relevant test procedure in the development and evaluation of measurement instruments [34]. These interviews are used to introduce the new measurement instrument to the target group and to identify potentially problematic items or difficulties in answering the questions [33], as well as the underlying causes, so that the instrument can be adapted [34]. A heterogeneous sample will be interviewed to analyse how adults understand the PA-HL items [33, 35]. For this purpose, the methods of thinking aloud and probing are used [33].

Phase 3: scale evaluation

In the third step, an online survey is conducted to evaluate the novel measurement instrument and assess its psychometric quality. The online survey is conducted after a pretest to check the technical functionality, comprehensibility and logic of the questionnaire. The statistical calculations are described in detail in the statistical analysis section.

Sample, sample size and recruitment strategy

eDelphi

The participants for the eDelphi are contacted by email. For the Delphi method, Niederberger & Renn [30] describe that the number of experts involved depends on the number of items or hypotheses and the expected response rate, but do not give any recommendations for the minimum number of cases required. The study will include a minimum of 12 participants. The participants of the eDelphi study are experts in the fields of HL, PA, physiotherapy or public health. Professional experience and expertise in one or more of the aforementioned research areas serve as inclusion criteria.

Cognitive interviews

For the cognitive interviews, Pohontsch & Meyer [33] recommend between five and 15 interview partners. This means that this study will include at least five adults over the age of 18 years. As the measurement instrument is to be used in different settings (health care, prevention, health promotion), participants from different age groups and with different educational, cultural and religious backgrounds are included. Participants are recruited for the cognitive interviews using the snowball method. Personal contacts are asked about their connections to people of different age groups (e.g. students to pensioners), different occupational fields (e.g. crafts, retail, secretarial work) as well as to unemployed people and people with different cultural backgrounds. These connections are used to establish contact with potential dialogue partners.

Scale evaluation (Pilot study)

The scale will be evaluated in a quantitative cross-sectional pilot study with a convenience sample. Adults aged 18 years and older will be recruited. The estimated sample size is based on Boateng et al. [27] and requires 10 respondents per survey item or a total of 200 to 300 respondents [36, 37]. In this study, the aim is to include at least 200 respondents, as the final number of items has not yet been determined. The recruitment strategy includes various channels such as personal requests to community-based centres, organisations and associations (e.g. local NGOs or cultural centres), social media, institutional partnerships and personal contacts to reach a suitable sample. Access is possible via a link or QR code that is distributed via email and social media and can be printed out and posted in the local organisations or centres.

Variables

In the eDelphi method, sociodemographic characteristics (e.g. gender, age, employment status, professional background, work experience) are collected in addition to the validity indicators. In the cognitive interviews, sociodemographic characteristics (i.e. gender, age, education level, income, place of residence, chronic diseases, migration background) are collected to understand the composition of the sample and to contextualise the results, as are the participants' understanding of the PA-HL construct, their interpretation of the items and any misconceptions.

As comparative characteristics are required to assess construct validity in the scale evaluation phase, further standardised measurement instruments and knowledge-related items are included in the scale evaluation. The measurement instruments under consideration for construct validity assessment are presented in Table 2 (section comparative measurement instruments).

Table 2 Overview of the variables and measurement instruments employed in the pilot testing phase for the physical activity-related health literacy scale evaluation, 2025–2026

Statistical analysis

All statistical analyses are carried out using IBM SPSS Statistics, V.24.0 (IBM), and the statistical software R, V.3.5.2, for quantitative analysis. Descriptive statistics are calculated for the socio-demographic data.

eDelphi

The Item-Level Content Validity Index (I-CVI) and the average Scale-Level Content Validity Index (S-CVI/Ave) are calculated for the eDelphi [46, 48]. The I-CVI measures the agreement between the expert judgements on the relevance of an individual item, whereas the S-CVI/Ave determines the content validity of the entire measurement instrument. An item is included only if its I-CVI is at least 0.78 [48], and an S-CVI/Ave of 0.90 or higher is acceptable [49]. High values therefore indicate that the items or the scale were assessed as valid in terms of content.
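As an illustration of these two indices, the following minimal Python sketch computes the I-CVI per item and the S-CVI/Ave as the mean of the I-CVIs. It rests on assumptions that are not specified in this protocol: the four relevance categories are coded 1 (irrelevant) to 4 (very relevant), the two highest categories count as agreement on relevance, and the item names and ratings are invented for demonstration.

```python
# Illustrative sketch only: coding scheme, item names and ratings are assumptions.
ICVI_CUTOFF = 0.78       # inclusion threshold for a single item
SCVI_AVE_CUTOFF = 0.90   # acceptable value for the whole scale

def item_cvi(ratings, relevant_levels=(3, 4)):
    """Proportion of experts whose rating falls in the 'relevant' categories."""
    if not ratings:
        raise ValueError("at least one expert rating is required")
    return sum(r in relevant_levels for r in ratings) / len(ratings)

def scale_cvi_ave(ratings_by_item):
    """Return the I-CVI of every item and their mean (S-CVI/Ave)."""
    icvis = {item: item_cvi(r) for item, r in ratings_by_item.items()}
    return icvis, sum(icvis.values()) / len(icvis)

# Hypothetical ratings from 12 experts for two items (1 = irrelevant, 4 = very relevant).
ratings = {
    "item_1": [4, 4, 3, 4, 3, 4, 4, 3, 4, 4, 3, 4],
    "item_2": [4, 3, 2, 4, 3, 1, 4, 3, 2, 4, 3, 4],
}

icvis, s_cvi_ave = scale_cvi_ave(ratings)
below_cutoff = [item for item, value in icvis.items() if value < ICVI_CUTOFF]

print(icvis)
print(round(s_cvi_ave, 3), s_cvi_ave >= SCVI_AVE_CUTOFF)
print("Items below the I-CVI cut-off:", below_cutoff)
```

In this invented example, nine of the 12 experts rate the second item as relevant, so it would fall below the cut-off and be a candidate for revision or removal.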

Cognitive interviews

The cognitive interviews are analysed descriptively and content-analytically [50]. For this purpose, all audio files are transcribed using key points [51]. Following Prüfer & Rexroth [35], all statements made by the interviewees are compiled in a document that provides a separate analysis for each item. The statements are assigned to the "Questionnaire Appraisal Coding Scheme" [52] and analysed [35]. Items will be linguistically modified or eliminated based on consensual feedback from the results. Particularly inconsistent results and their implications for the item pool are discussed in a group of five researchers in order to reach a consensus on how to deal with these ambiguous responses.

Scale evaluation (Pilot study)

Psychometric testing is one of the most important steps in the development of a measurement instrument [53, 54].

Floor and ceiling effects are analysed by calculating percentages for the lowest and highest possible scores. The established threshold for determining a floor or ceiling effect is 15% [54]. The content and face validity of the measurement instrument is checked and improved by conducting the eDelphi procedure and cognitive interviews.
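The floor and ceiling check can be sketched in a few lines of Python. The sketch assumes that an effect is flagged when more than 15% of respondents obtain the lowest or highest possible score; the score range and the example scores are hypothetical.

```python
# Illustrative sketch only: score range and data are invented.
def floor_ceiling(scores, min_score, max_score, threshold=0.15):
    """Share of respondents at the lowest and highest possible score."""
    n = len(scores)
    if n == 0:
        raise ValueError("at least one score is required")
    floor = sum(s == min_score for s in scores) / n
    ceiling = sum(s == max_score for s in scores) / n
    return {
        "floor": floor,
        "ceiling": ceiling,
        # Assumption: an effect is present when the share exceeds the threshold.
        "floor_effect": floor > threshold,
        "ceiling_effect": ceiling > threshold,
    }

# Hypothetical sum scores of 20 respondents on a scale ranging from 10 to 40.
scores = [10] * 2 + [25] * 14 + [40] * 4
result = floor_ceiling(scores, min_score=10, max_score=40)
print(result)
```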

Spearman's rank correlation coefficient (rs) is used to analyse construct validity and interpreted as follows: 0 to 0.25 little or no association; 0.25 to 0.50 adequate association; 0.50 to 0.75 moderate to good association and above 0.75 good to excellent association [55].
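A minimal Python sketch of this step is given below. It computes Spearman's coefficient as the Pearson correlation of average ranks and maps the result to the interpretation bands listed above. Two points are assumptions: the bands are applied to the absolute value of the coefficient, and a value lying exactly on a boundary (0.25, 0.50, 0.75) is assigned to the lower band. The example scores are invented.

```python
# Illustrative sketch only: data are invented, boundary handling is an assumption.
def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (sx * sy)

def spearman(x, y):
    """Spearman's rs as the Pearson correlation of the rank-transformed data."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    return pearson(average_ranks(x), average_ranks(y))

def interpret(rs):
    a = abs(rs)
    if a <= 0.25:
        return "little or no association"
    if a <= 0.50:
        return "adequate association"
    if a <= 0.75:
        return "moderate to good association"
    return "good to excellent association"

# Hypothetical scores on the new scale and on a comparison instrument.
new_scale = [1, 2, 3, 4, 5]
comparison = [2, 1, 4, 3, 5]
rs = spearman(new_scale, comparison)
print(round(rs, 2), interpret(rs))
```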

To validate the novel measurement instrument, both principal component analysis (PCA) and confirmatory factor analysis (CFA) are used. A split-sample PCA and CFA approach is used to analyse multidimensionality. For this purpose, the sample is randomly divided into two groups so that one group is used for PCA and the other for CFA [37, 56]. This approach was selected because CFA relies on theoretical foundations, which makes it preferable for confirmatory hypothesis testing, whereas PCA is used to reduce the original items of the planned instrument to a smaller number.
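The random division of the sample can be illustrated with the following Python sketch. The equal split, the fixed seed (used here only to make the split reproducible) and the respondent identifiers are assumptions made for the example.

```python
# Illustrative sketch only: identifiers, seed and the 50/50 split are assumptions.
import random

def split_sample(respondent_ids, seed=2025):
    """Randomly divide respondents into a PCA half and a CFA half."""
    rng = random.Random(seed)       # local generator, leaves global state untouched
    shuffled = list(respondent_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

# Hypothetical sample of 200 respondents.
respondents = [f"R{i:03d}" for i in range(1, 201)]
pca_group, cfa_group = split_sample(respondents)
print(len(pca_group), len(cfa_group))
```

Each respondent ends up in exactly one of the two groups, so the model suggested by the PCA is tested on data that were not used to derive it.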

For PCA, a scree plot is used to determine the suitability of the data set for data reduction and the number of statistically meaningful dimensions [57]. An orthogonal rotation is performed so that the items load as high or low as possible on a factor and show the final and interpretable loading structure [58]. Items that do not load sufficiently on a factor (< 0.30) are excluded.
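The exclusion rule for weakly loading items can be sketched as follows. The sketch assumes that an item is retained if its largest absolute loading on any retained component reaches 0.30; the loading values and item names are hypothetical and do not come from the study.

```python
# Illustrative sketch only: loadings are invented, the use of absolute values is an assumption.
CUTOFF = 0.30

def split_by_loading(loadings, cutoff=CUTOFF):
    """Separate items whose strongest absolute loading reaches the cut-off from the rest."""
    retained, excluded = [], []
    for item, values in loadings.items():
        strongest = max(abs(v) for v in values)
        (retained if strongest >= cutoff else excluded).append(item)
    return retained, excluded

# Hypothetical rotated loadings of three items on two components.
loadings = {
    "item_01": [0.72, 0.10],
    "item_02": [0.25, 0.18],
    "item_03": [0.05, -0.64],
}

retained, excluded = split_by_loading(loadings)
print("retained:", retained)
print("excluded:", excluded)
```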

CFA is a procedure for testing relationships between observable variables and latent variables [59] and is therefore regarded as a central test instrument for measurement models for hypothesised constructs [60]. In this case, the theoretical model to be tested is developed a priori based on the results of the eDelphi procedure and the model proposed by the PCA. A variance–covariance matrix is calculated and used to estimate the model parameters [60]. In addition, the factor loadings are calculated, which represent estimates of the correlations between the item variables and the latent variable. The higher the factor loading, the greater the proportion of the measurement variance that is explained by the factor [37, 56, 59]. The quality of the CFA is assessed at both the construct and the model level. The construct quality is analysed using the following indicators: factor loadings (significant, > 0.5 [61]), factor reliability (> 0.6 [62]) and average variance extracted (AVE > 0.5 [63]). The overall quality of the model is assessed using the following indicators: χ2 (not significant), χ2/df (< 2–3), GFI (> 0.95), CFI (> 0.90), SRMR (< 0.1), RMSEA (< 0.05) and the 95% confidence interval for RMSEA = [0; 0.05] [61].
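To make the construct-level indicators more concrete, the following Python sketch computes factor reliability (composite reliability) and AVE from standardised factor loadings, using the commonly applied formulas in which the error variance of each item is taken as one minus its squared loading. That simplification, as well as the loading values, are assumptions made for illustration.

```python
# Illustrative sketch only: loadings are invented; error variance = 1 - loading^2 assumes
# standardised loadings and uncorrelated measurement errors.
def factor_reliability(loadings):
    """Composite reliability: (sum of loadings)^2 / ((sum of loadings)^2 + sum of error variances)."""
    squared_sum = sum(loadings) ** 2
    error_variance = sum(1 - l ** 2 for l in loadings)
    return squared_sum / (squared_sum + error_variance)

def average_variance_extracted(loadings):
    """AVE: mean of the squared standardised loadings."""
    return sum(l ** 2 for l in loadings) / len(loadings)

# Hypothetical standardised loadings of three items on one latent factor.
loadings = [0.7, 0.8, 0.6]
cr = factor_reliability(loadings)
ave = average_variance_extracted(loadings)

print(round(cr, 3), cr > 0.6)    # compared with the factor reliability threshold
print(round(ave, 3), ave > 0.5)  # compared with the AVE threshold
```

With these invented loadings the factor would meet the reliability threshold but fall just short of the AVE threshold, which shows that the two indicators can lead to different conclusions for the same factor.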

Cronbach's alpha is used as a measure of reliability [64] and a value of 0.65 to 0.80 is considered appropriate [65, 66]. The minimum value for clinical use is α = 0.90 [54].
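Cronbach's alpha can be computed from the item variances and the variance of the total score, as in the following Python sketch. The response matrix is hypothetical (rows are respondents, columns are items), and sample variances are used throughout.

```python
# Illustrative sketch only: the response data are invented.
from statistics import variance

def cronbach_alpha(responses):
    """alpha = k / (k - 1) * (1 - sum of item variances / variance of total scores)."""
    k = len(responses[0])
    if k < 2:
        raise ValueError("at least two items are required")
    items = list(zip(*responses))                  # one tuple of answers per item
    item_variance_sum = sum(variance(item) for item in items)
    total_variance = variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - item_variance_sum / total_variance)

# Hypothetical answers of five respondents to three items.
responses = [
    [3, 4, 3],
    [2, 2, 3],
    [4, 4, 4],
    [1, 2, 1],
    [3, 3, 2],
]

alpha = cronbach_alpha(responses)
print(round(alpha, 3))
```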

Ethics and dissemination

The study was designed according to the principles of the Declaration of Helsinki. The study received ethical approval from the Ethics Committee of Fulda University of Applied Sciences on 21 December 2023, with an amendment for cognitive interviews approved on 21 June 2024 (reference EK231204).
