A 5-day study-a-thon [27], an event aimed at generating knowledge and evidence for the specific aim of this study, was organized on September 5–9, 2022 and attended by 31 scientists from partners of the Innovative Medicines Initiative (IMI) EHDEN consortium. Statistical signal detection was performed in VigiBase [28].
As the consortium operates as a public–private partnership, the scope of the study was limited to suspected drugs with generic manufacturers (excluding vaccines and biologicals) to avoid potential conflicts of interest. Moreover, to prevent bias due to misclassification, only adverse events with pre-specified definitions or phenotypes validated in EHDEN were considered. In general, using phenotypes instead of single diagnostic terms also improves the ability to recognize relevant adverse events and thereby increases the statistical power of any analysis. A total of 16 adverse event phenotypes developed by the Observational Health Data Sciences and Informatics (OHDSI) network and validated by the consortium for its research on COVID-19 vaccines [29, 30] met this criterion at the time of the study.
Signal validation and prioritization considered information from VigiBase, regulatory documents and the scientific literature alongside descriptive analyses of routine health data from participating EHDEN data partners.
2.1 Study-a-thon Execution
Six pharmacovigilance specialists experienced in signal management (four pharmacists; two medical doctors) and four data scientists from the Uppsala Monitoring Centre (UMC) participated in the study. They worked in two signal validation teams, analysing reports in VigiBase and reviewing regulatory information and scientific literature according to UMC's routine signal validation and prioritization process.
Supporting these teams were four epidemiologists with expertise in performing analyses across large database networks using tools and packages developed by the OHDSI community including ATLAS [31]. They translated questions raised by the assessors during signal validation and prioritization into descriptive analyses with scripts to be executed across databases of participating data partners. All analyses were designed centrally in the ATLAS user interface, and JSON specifications were shared with the data partners for execution. Additionally, ad-hoc custom R/SQL scripts were developed on site.
During the study-a-thon, representatives of participating data partners were on call to run the analysis scripts in their respective databases to help answer the questions. They also provided interpretation and context for results based on their expert knowledge of the source data. Relevant findings were returned to the corresponding signal validation team (Fig. 2), and together with insights from VigiBase, regulatory documents and the scientific literature, they informed decisions about which signals to forward for assessment. These decisions were made by the signal validation teams through consensus.
Fig. 2 Overall execution of the study-a-thon. Two signal validation teams (blue) examined the available information from VigiBase and other sources of information, resulting in questions that could potentially be answered with descriptive analyses of routine health data. Epidemiologists well versed in Observational Health Data Sciences and Informatics (OHDSI) analytical tools (red) translated those questions into scripts for execution across routine health databases of the participating European Health Data and Evidence Network (EHDEN) data partners (green)
2.2 Data
2.2.1 VigiBase
Reports in VigiBase are shared by the 155 full member countries in the WHO Programme for International Drug Monitoring (February 2023) [32]. Medicinal products (drugs) are coded using the WHODrug Global dictionary [33] and adverse events are coded using the Medical Dictionary for Regulatory Activities (MedDRA®). The MedDRA® terminology is the international medical terminology developed under the auspices of the International Council for Harmonisation of Technical Requirements for Pharmaceuticals for Human Use (ICH). VigiBase held 32.0 million reports at the data lock point, 4 July 2022 [29]. We excluded reports with only vaccines (ATC = J07) listed as suspected medicinal products and reports identified as suspected duplicates through the vigiMatch algorithm [34], resulting in a data set of 25.6 million reports (encompassing 3.7 million drug–event combinations) on which statistical signal detection was performed (see Fig. 3).
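The two exclusion rules described above can be illustrated with a minimal Python sketch. It is not the code used in the study: the `Report` structure, its field names and the example ATC codes are hypothetical, and the duplicate flag is assumed to have been set beforehand by a de-duplication step such as vigiMatch.

```python
from dataclasses import dataclass


@dataclass
class Report:
    # Hypothetical, simplified representation of an individual case report.
    report_id: str
    suspected_atc_codes: list   # ATC codes of the suspected medicinal products
    suspected_duplicate: bool   # assumed to be set by a prior de-duplication step


def is_vaccine_only(report):
    """True if every suspected product on the report is a vaccine (ATC J07)."""
    codes = report.suspected_atc_codes
    return bool(codes) and all(code.startswith("J07") for code in codes)


def apply_exclusions(reports):
    """Drop vaccine-only reports and suspected duplicates."""
    return [
        r for r in reports
        if not is_vaccine_only(r) and not r.suspected_duplicate
    ]


# Illustrative input only; the identifiers and codes are made up.
example_reports = [
    Report("r1", ["J07BX"], False),           # vaccine only: excluded
    Report("r2", ["J07BX", "N02BE"], False),  # vaccine plus another drug: kept
    Report("r3", ["N02BE"], True),            # suspected duplicate: excluded
    Report("r4", ["C10AA"], False),           # kept
]
kept_ids = [r.report_id for r in apply_exclusions(example_reports)]
```

Note that a report listing a vaccine together with a non-vaccine suspected drug is retained under this reading of the rule, since only reports with vaccines as the sole suspected products were excluded.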
Fig. 3 Flow chart presenting the flow of signals including those studied during the study-a-thon. Numbers of exclusions refer to the number of drug–event combinations, unless specified differently. MedDRA Medical Dictionary for Regulatory Activities, PTs preferred terms
2.2.2 Routinely Collected Health Data from EHDEN
In EHDEN, individual-level data are maintained by data partners across Europe and mapped to a common standard, the Observational Medical Outcomes Partnership Common Data Model (OMOP-CDM). In this model, drugs are coded using RxNorm Extension [35] and outcomes using SNOMED CT, enabling execution of standardized analysis scripts across different databases. All 16 phenotypes used for the study (Supplementary Table 1, see electronic supplementary material [ESM]) were defined based on rule-based algorithms including relevant diagnostic codes [29, 30] (see also [36] for a detailed description of the definitions used). Mapped individual-level data are stored locally at the data partner site and only aggregate statistics are shared within the network. All data partners in EHDEN were invited to the study-a-thon. In total, 10 data partners accepted the invitation; a brief description of their databases is provided in Table 1.
Table 1 EHDEN data partners who participated in the study-a-thon
2.3 Statistical Signal Detection
Statistical signal detection in VigiBase was carried out using vigiRank, a data-driven predictive model for emerging safety signals [37]. vigiRank combines disproportionality analyses (using the information component [IC] as a measure of disproportionate reporting [38]) with predictors related to the completeness, recency, geographic spread and availability of case narratives in a logistic regression model. Since vaccine reports were excluded prior to statistical signal detection, masking of signals by these reports (which make up a large proportion of VigiBase) was prevented. For this study, we selected drug–event combinations with a maximum vigiRank score related to a generic drug and a MedDRA® Preferred Term (PT) mapped to one of the 16 selected phenotypes. Drugs with generic manufacturers were identified using publicly available data [39–41] and drug names were mapped from WHODrug Global to the RxNorm Extension vocabulary of the OMOP-CDM. Drugs with no verbatim match in RxNorm Extension were excluded. Mapping between each of the 16 phenotypes and MedDRA® PTs was based on expert medical knowledge. Phenotypes correspond to multiple MedDRA® PTs and the one-to-many mapping included less specific terms to increase the sensitivity of statistical signal detection (see Supplementary Table 2 in the ESM). During the signal validation process, signals were discarded if they contained a non-specific PT that was the sole term mapped to a phenotype. This resulted in a set of 1175 statistical signals related to 218 generic drugs and 72 MedDRA® PTs (see Fig. 3), which were listed in random order for analysis. During the study-a-thon, 95 statistical signals from this list (covering 65 drugs and 28 MedDRA® PTs) could be subjected to signal validation and prioritization.
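To make the disproportionality component concrete, the sketch below computes the IC in a commonly published shrinkage form, IC = log2((O + 0.5)/(E + 0.5)), where O is the observed number of reports for the drug–event combination and E = N_drug × N_event / N_total is the number expected if drug and event were reported independently. The text above does not spell out the formula, so this form, the function names and the example counts are assumptions for illustration; the other vigiRank predictors and the regression coefficients are not reproduced here.

```python
import math


def expected_count(n_drug, n_event, n_total):
    """Reports expected for the pair if drug and event were reported independently."""
    return n_drug * n_event / n_total


def information_component(observed, n_drug, n_event, n_total, shrinkage=0.5):
    """IC with additive shrinkage (assumed form): log2((O + k) / (E + k))."""
    expected = expected_count(n_drug, n_event, n_total)
    return math.log2((observed + shrinkage) / (expected + shrinkage))


# Hypothetical counts: 100 reports on the drug, 200 on the event,
# 10,000 reports in total, hence 2 expected reports for the pair.
ic_as_expected = information_component(2, 100, 200, 10_000)
ic_disproportionate = information_component(7, 100, 200, 10_000)
```

A positive IC indicates that the combination is reported more often than expected from the overall reporting of the drug and the event, and the shrinkage term pulls the IC towards zero when counts are small.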
2.4 Analysis of Case Series in VigiBase and Review of Regulatory Information and the Scientific Literature
The purpose of signal validation and prioritization is to determine which statistical signals merit assessment. Since we used a more inclusive approach for mapping phenotypes to MedDRA® PTs, we had to determine during signal validation whether there was support for a signal in VigiBase considering more specific PTs mapped to the phenotype (see Supplementary Table 2 in the ESM). Signals were closed upon initial inspection if:
i) at least one of the PTs mapped to a phenotype or the phenotype itself was listed as an already known adverse drug reaction in the European Summaries of Product Characteristics or the US Food and Drug Administration product labels, or had been discussed by the Pharmacovigilance Risk Assessment Committee of the European Medicines Agency or included in Drug Safety Communications or Potential Signals of Serious Risk of the US Food and Drug Administration, and there was no information in the VigiBase case series to suggest new aspects of the association; or
ii) the case series in question lacked clinical coherence or consistency (e.g., invalid case diagnosis, implausible time to onset or unclear clinical picture due to the PT being unspecific and the sole term mapped to the phenotype).
For the other signals, the following aspects were taken into consideration to determine whether a signal was eligible for assessment: the number of reported cases and whether this exceeded the number expected based on how commonly the drug and adverse event were reported overall; the time interval between drug administration and adverse event onset (time to onset); the course of the event when the drug was stopped (so-called dechallenge) and possibly re-administered (so-called rechallenge); the presence of a dose–response relationship; consistency of reporting across geographic regions; consistency of reporting for drugs belonging to the same substance class; existence of a plausible biological mechanism; existence of possible confounding by the underlying disease and/or concomitant treatments; coherence with other findings in published reports and the scientific literature. In addition, we examined descriptive characteristics of the case series and considered the possibility of comparing these against characteristics of other case series in VigiBase to identify relevant key features [42].
2.5 Analyses of Routine Health Data
Analyses of the routine health data to support signal validation and prioritization were descriptive in nature, i.e., focusing on so-called characterizations. The purpose of these analyses was not to assess causal associations directly, but rather to provide contextual information to assist signal assessors in their assessments of possible alternative explanations and potential public health and clinical implications. Analyses were also conducted to determine if follow-up pharmacoepidemiological investigations would be feasible. For signals that could not be assessed in VigiBase, the results of these feasibility analyses could influence the decision to forward a signal for assessment. For example, if a signal was difficult to assess because of suspected confounding by the underlying disease and/or concomitant medication(s), the decision whether to forward the signal for assessment was partly guided by the possibility of assessing it further in a pharmacoepidemiological study.
A cohort design was used for all analyses and the following cohorts were identified as key for supporting signal validation and prioritization:
Drug cohort: A cohort of new users of the drug, indexed on the drug start date. New use was defined as the first time a subject had a drug record in the database after at least 365 days of database observation.
Adverse event cohort: A cohort of subjects with a new diagnosis of the adverse event of interest, indexed on the diagnosis date. Event-free windows (see Supplementary Table 1 in the ESM) were used to distinguish new diagnoses from repeated reporting of the same diagnosis in the database.
Indication cohort: A cohort of subjects with the indication for drug use, indexed on the first diagnosis date of the indication.
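The entry rules for the drug and adverse event cohorts can be sketched as follows. This is a simplified Python illustration rather than the ATLAS cohort definitions used in the study: the function names are hypothetical, and the event-free window is assumed here to be measured from the most recent earlier diagnosis record.

```python
from datetime import date

WASHOUT_DAYS = 365  # prior database observation required for new use


def new_user_index_date(observation_start, drug_record_dates,
                        washout_days=WASHOUT_DAYS):
    """Return the first drug record date if it is preceded by at least
    `washout_days` of database observation, otherwise None."""
    if not drug_record_dates:
        return None
    first_record = min(drug_record_dates)
    if (first_record - observation_start).days >= washout_days:
        return first_record
    return None


def incident_event_dates(diagnosis_dates, event_free_days):
    """Keep a diagnosis as new only if no diagnosis of the same event was
    recorded within the preceding `event_free_days` (assumed rule)."""
    incident = []
    previous = None
    for current in sorted(diagnosis_dates):
        if previous is None or (current - previous).days > event_free_days:
            incident.append(current)
        previous = current
    return incident


# Hypothetical subject observed from 1 January 2020.
index_date = new_user_index_date(
    date(2020, 1, 1), [date(2021, 3, 1), date(2021, 6, 1)]
)
too_early = new_user_index_date(date(2020, 1, 1), [date(2020, 6, 1)])
new_diagnoses = incident_event_dates(
    [date(2021, 1, 1), date(2021, 1, 20), date(2021, 6, 1)], event_free_days=90
)
```

In this example the record of 20 January 2021 is treated as repeated reporting of the diagnosis made on 1 January 2021, whereas the record of 1 June 2021 starts a new event.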
Besides the cohorts defined above, we also considered the possibility of performing additional descriptive analyses including any subject with at least 365 days of database observation. All analyses were performed within each individual database and when interpreting the results, database-specific features were taken into consideration such as setting (hospital vs community-based), database capture of the drug, the adverse event, relevant covariates as well as sample size.
The design of all descriptive analyses was tailored to the questions raised by assessors in the signal validation teams for which analyses of routine health data were considered of potential added value. Consequently, the analysis themes targeted by the routine health data (possible alternative explanations, potential public health and clinical impact and feasibility of follow-up pharmacoepidemiological investigations) could vary per signal.
2.5.1 Assessment of Possible Alternative Explanations
Possible bias and confounding were examined by characterizing new users of the drug (i.e., the drug cohort). To better understand characteristics of drug exposure (i.e., indications for treatment, concomitant treatments and other comorbid diseases) and their sequence leading up to drug initiation, descriptive summary statistics of relevant covariates in the drug cohort were obtained before and at the drug start date. Likewise, occurrence of the adverse event prior to or at drug start was assessed. The look-back window for examining the distribution of relevant covariates and the adverse event before the index date was set to 365 days and where possible further split into shorter time intervals. To evaluate potential confounding, associations between characteristics of drug exposure and the adverse event were explored. This was done by comparing incidence rates of the adverse event in a 365-day risk window (N per 10,000 person-years) between patients with certain characteristics of drug exposure (e.g., indications for treatment, concomitant treatments or other comorbid diseases) and at-risk subjects in the general population captured by the database. Results from this analysis were reviewed overall and in age and sex strata to account for imbalances in demographic characteristics in this comparison as appropriate.
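The incidence rate comparison rests on a simple calculation that can be sketched in Python. The sketch assumes that follow-up starts at the index date and ends at the first of the event, the end of the 365-day risk window or the end of database observation, and that only events strictly after the index date count; these censoring rules and all names are illustrative assumptions rather than the study's implementation.

```python
from datetime import date, timedelta


def follow_up(index_date, observation_end, event_date=None,
              risk_window_days=365):
    """Return (days at risk, event observed) for one subject."""
    window_end = min(index_date + timedelta(days=risk_window_days),
                     observation_end)
    if event_date is not None and index_date < event_date <= window_end:
        return (event_date - index_date).days, True
    return (window_end - index_date).days, False


def incidence_rate_per_10000_py(n_events, person_days, days_per_year=365.25):
    """Incidence rate expressed as events per 10,000 person-years."""
    if person_days <= 0:
        raise ValueError("person_days must be positive")
    person_years = person_days / days_per_year
    return 10_000 * n_events / person_years


# Hypothetical subjects: one censored at the end of the risk window,
# one with an event 100 days after the index date.
censored = follow_up(date(2021, 1, 1), date(2023, 1, 1))
with_event = follow_up(date(2021, 1, 1), date(2023, 1, 1), date(2021, 4, 11))

# Hypothetical aggregate: 5 events over 365,250 person-days (1,000 person-years).
rate = incidence_rate_per_10000_py(5, 365_250)
```

Computing the same rate within age and sex strata, as described above, only requires applying these functions to the subjects of each stratum separately.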
2.5.2 Assessment of Potential Public Health and Clinical Impact
To better understand the potential public health and clinical impact of a signal, estimates of drug usage (i.e., the number of new users of the drug [see also the drug cohort definition in Sect. 2.5] as well as the proportion of patients with a specific indication receiving the drug) were obtained and the incidence rate of the adverse event among new users of the drug (N per 10,000 person-years) was estimated. To estimate the incidence, the allowable gap between successive drug records for defining continuous exposure during follow-up was tailored to the drug under investigation. To better understand the seriousness of the adverse event, hospitalization and death rates were computed in subjects experiencing the adverse event and those with prior drug exposure. The start date of follow-up for these analyses was defined as the diagnosis date (see also the adverse event cohort definition in Sect. 2.5). Routine health data further enabled the identification of potentially vulnerable subgroups by comparing descriptive characteristics of all new users of the drug and those who also experienced the adverse event. This comparison included descriptive summary statistics of relevant covariates before and at the drug start date.
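The role of the allowable gap in defining continuous exposure can be illustrated with a short Python sketch. It merges successive drug records into one exposure period whenever the gap between the end of one record and the start of the next does not exceed the allowable gap. The function name, the record format and the example dates are assumptions made for illustration.

```python
from datetime import date


def build_exposure_periods(records, allowable_gap_days):
    """Merge (start, end) drug records into continuous exposure periods.

    Two records belong to the same period if the next record starts no more
    than `allowable_gap_days` after the current period ends.
    """
    periods = []
    for start, end in sorted(records):
        if periods and (start - periods[-1][1]).days <= allowable_gap_days:
            previous_start, previous_end = periods[-1]
            periods[-1] = (previous_start, max(previous_end, end))
        else:
            periods.append((start, end))
    return periods


# Hypothetical drug records for one subject.
records = [
    (date(2021, 1, 1), date(2021, 1, 30)),
    (date(2021, 2, 10), date(2021, 3, 11)),  # starts 11 days after the first ends
    (date(2021, 6, 1), date(2021, 6, 30)),   # starts 82 days after the second ends
]
periods_gap_30 = build_exposure_periods(records, allowable_gap_days=30)
periods_gap_0 = build_exposure_periods(records, allowable_gap_days=0)
```

With a 30-day allowable gap the first two records form a single exposure period, whereas with no allowable gap each record is its own period, which shows why the gap needs to reflect how the drug is typically prescribed.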
2.5.3 Feasibility of Follow-up Pharmacoepidemiological Investigations
To assess the feasibility of pharmacoepidemiological follow-up investigations, the number of new users of the drug was obtained, either overall or for a specific indication. Similarly, the number of patients with the adverse event was computed, either overall or in a specific period after initiating treatment with the drug. If case counts for a drug–event combination were considered sufficient for further analysis, treatment pathway analyses (using so-called sunburst plots [43, 44]) were performed to display the sequence of common treatments for specific indications. These analyses were used to suggest relevant comparator drugs for follow-up pharmacoepidemiological studies using active comparator designs. Sunburst plots are doughnut-shaped graphs with stacked layers, each representing a different line of treatment. The inner circle represents the first treatment and subsequent treatments are shown in the surrounding outer layers, with each drug represented by its own colour. Treatment pathway analyses were restricted to a set of drugs selected by the clinical experts. All analyses relied on cohorts as defined in Sect. 2.5 and on data available within the participating databases. In addition, we evaluated the feasibility of follow-up pharmacoepidemiological investigations considering data availability in the entire EHDEN network.
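The data underlying a sunburst plot are counts of treatment sequences, which can be sketched in Python as follows. The sketch makes simplifying assumptions that are not taken from the study: each drug appears in a pathway only once, in order of first use, and pathways are truncated after a fixed number of lines. Drug names and dates are placeholders.

```python
from collections import Counter


def treatment_pathway(drug_starts, drugs_of_interest, max_lines=3):
    """Return the ordered tuple of distinct drugs of interest a patient started.

    `drug_starts` is a list of (ISO start date, drug name) pairs.
    """
    pathway = []
    for _, drug in sorted(drug_starts):
        if drug in drugs_of_interest and drug not in pathway:
            pathway.append(drug)
            if len(pathway) == max_lines:
                break
    return tuple(pathway)


def count_pathways(patients, drugs_of_interest, max_lines=3):
    """Count how many patients follow each treatment pathway."""
    counts = Counter()
    for drug_starts in patients:
        pathway = treatment_pathway(drug_starts, drugs_of_interest, max_lines)
        if pathway:  # patients without any drug of interest are skipped
            counts[pathway] += 1
    return counts


# Placeholder data for four hypothetical patients.
patients = [
    [("2021-01-05", "drug_a"), ("2021-03-01", "drug_b"), ("2021-04-01", "drug_a")],
    [("2021-02-01", "drug_a"), ("2021-05-01", "drug_b")],
    [("2021-01-10", "drug_b")],
    [("2021-01-10", "drug_c")],  # not among the drugs of interest
]
pathway_counts = count_pathways(patients, {"drug_a", "drug_b"})
```

In a sunburst plot, the first element of each pathway would determine the segment in the inner circle and later elements the segments in the outer layers, with segment sizes proportional to these counts.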