Learning gain of an ATLS-based interprofessional and multidisciplinary in-situ simulation training of trauma resuscitation

For trauma resuscitation and management of polytrauma patients, failures in interprofessional communication and teamwork in the trauma bay are critical factors that contribute substantially to sentinel events. Simulation-based team training is therefore a plausible approach to practicing both technical and non-technical skills under realistic conditions. This is in line with findings from recent systematic reviews and meta-analyses demonstrating that simulation-based training improves team performance and human factor skills across diverse healthcare settings [7, 18]. In this context, it improves both technical and non-technical skills [19, 20]. Whereas the literature strongly supports the benefits of interprofessional CRM trainings in medical teams regarding communication and coordination [7, 21, 22], there is limited evidence on which CRM dimensions benefit which subgroups. In this study, we evaluated an individually designed simulation-based interdisciplinary trauma team training at a Level I trauma center in Germany based on ATLS® principles. Our aim was to characterize the structure of perceived learning gains across CRM dimensions and to explore subgroup differences by role, profession, and prior experience to inform instructional design. Recent studies have particularly addressed the need for team training in ad-hoc teams and report performance gains even without stable team constellations [2, 23]. In line with Kirkpatrick’s framework, our evaluation corresponds to level 2 (‘Learning’) and relies on retrospective self-assessed knowledge gain [11]. Retrospective self-assessments may be affected by recall bias as well as social desirability or demand characteristics. These influences are well described in the methodological literature on retrospective self-evaluation and may impact the magnitude of self-reported learning gains [24, 25].
While medical education literature suggests a weak correlation between self-assessment and objective performance measures [12], self-reports are pragmatic and minimize testing reactivity [13].

The interpretation of learning gains in educational studies depends on the level of learning being assessed. Beyond factual knowledge or observable performance, learning has been conceptualized as a process of conceptual and metacognitive recalibration, particularly in complex domains such as emergency care. From this perspective, changes in self-assessed competence may reflect meaningful shifts in participants’ internal frames of reference, awareness of task complexity, and understanding of professional requirements rather than mere response artifacts. Such recalibration has been described as an integral component of learning, even when objective performance gains are not immediately measurable. Thus, the subjective learning gains measured in this study should not be interpreted solely as increases in factual knowledge, but as indicators of underlying changes in participants’ conceptual understanding of polytrauma care team competence. Because the course concept relied on voluntary participation, structured summative performance testing was not feasible. The interpretation of the observed learning gains must also consider the high retrospective pre-test scores across several CRM-related dimensions, indicating a potential ceiling effect. Many participants reported substantial baseline familiarity with CRM concepts prior to the training, which inherently limits the headroom for measurable improvement and constrains the magnitude of post-test gains. Accordingly, the absolute learning gains observed were small relative to the overall scale range. While these differences reached statistical significance, their educational relevance should be interpreted with caution. Rather than reflecting substantial competence acquisition, the findings are best understood as modest shifts in self-perceived learning within an already highly trained cohort.

The EFA revealed a three-factor model comprising personal operational competence, team communication, and decision making. Internal consistency was acceptable, and inter-item correlations indicated related yet distinct constructs. While the first factor (personal operational competence) primarily addressed technical aspects within the working environment, the other two factors focused on non-technical team interaction. Within this framework, the largest mean learning gains occurred in items related to operational/technical knowledge, particularly among less experienced participants (‘providers’). In healthcare training, technical and non-technical skills are typically regarded as two distinct concepts, both of which are crucial in managing and preventing critical or adverse events [26, 27]. Non-technical skills are generally seen as cognitive and social skills, whereas technical skills involve the use of medical equipment and drugs, along with specific medical expertise [28].

The factors team communication and decision making showed highly significant gains across participant groups. The finding of significant knowledge gains across all demographic subgroups warrants careful interpretation. While this pattern may reflect a broadly effective training intervention, it may also be partially influenced by the retrospective pretest–posttest design employed. Retrospective self-assessment is known to be susceptible to response-shift bias, whereby participants recalibrate their understanding of the assessed constructs after exposure to the intervention, leading to lower retrospective ratings of baseline knowledge. This effect can result in consistent pre–post differences across subgroups, independent of demographic characteristics. At the same time, such recalibration has been described as a meaningful indicator of learning in complex educational settings, where participants’ conceptual understanding of the domain evolves through training. Consequently, the observed knowledge gains likely represent a combination of true learning effects and changes in participants’ internal frames of reference. The finding that experienced clinicians demonstrated improvements in decision-making underscores that routine clinical exposure alone does not fully address the cognitive and metacognitive components of expertise, which may be specifically targeted through structured simulation and debriefing.

The concept of team communication and decision-making touches upon the utilization of all available personnel resources without restriction, for example, due to existing hierarchies. Previous work in the context of trauma teams supports flatter structures [29, 30], allowing team members to interact and communicate on an equal footing with a high degree of psychological safety [31]. Regarding team communication, no subgroup differences by department, experience, or profession were observed pre-post, indicating that all groups benefit from the training intervention. Our findings confirm that simulation-based training within trauma teams enhances communication and performance irrespective of prior job-related experience, with particularly pronounced gains in personal operational competence (workplace/process) among less experienced participants. Therefore, it is reasonable not only to include members at all levels of experience in the training, but also to design scenarios that explicitly address communication-related issues and reinforce flat hierarchies. Finally, the attitude that situations should be dynamically reevaluated and that repeated team time-outs are necessary (summarized in decision making) yielded high pre- and post-test self-evaluated scores across all subgroups without between-group differences. Given the relatively small subgroup sizes, additional data collection may reveal a positive trend in this aspect with repeated training sessions.

A small but statistically significant difference in overall self-assessed learning gains was observed between female and male participants. This finding requires cautious interpretation. Previous research has consistently shown that self-assessment data may be influenced by gender-related differences in self-evaluation and response-shift effects. In particular, female participants have been reported to rate their baseline competencies more conservatively and to show greater recalibration following educational interventions, which may result in higher measured learning gains in retrospective pretest–posttest designs without necessarily indicating greater objective competence acquisition [12, 13, 14]. The present findings should be interpreted in light of the exploratory analytical approach adopted in this study. Both the factor analysis and the subsequent subgroup comparisons were conducted with the primary aim of describing patterns of perceived learning gains and generating hypotheses for future research, rather than providing confirmatory evidence. Accordingly, the identified factor structure represents a preliminary model that warrants validation in independent samples. Confirmatory testing of the dimensional structure and subgroup effects would require larger cohorts, prospective study designs, and the application of confirmatory factor analysis. Within this context, the current results provide an initial framework to inform instructional design and to guide the development of targeted simulation-based training interventions.

To date, it remains unclear whether self-reported pre-test values change among participants attending multiple training sessions over time.

In summary, the iSRST appears to improve subjective concepts regarding collaboration within the specified team structures. The training proves advantageous for all participating subgroups, regardless of their professional background, expertise, or prior training experience. This is consistent with previous studies showing that both surgeons and anesthetists benefit from simulation-based training [32, 33, 34].

Nonetheless, there are specific limitations that need to be addressed. Firstly, our evaluation was designed to cover the second level of Kirkpatrick’s framework (‘Learning’). However, levels 3 and 4 (corresponding to ‘Did the intervention result in a change of behavior?’ and ‘Did the intervention influence performance?’) still need to be evaluated in detail. Previous studies have reported, for example, longer time spent on trauma patients [35]. For this purpose, long-term data collection and statistical evaluation are warranted. Previous studies have addressed Kirkpatrick’s level 3 by comparing video recordings of pre-intervention and repeated post-intervention simulations [36]. The present analysis was not intended to detect changes in global clinical endpoints (e.g., mortality, morbidity, and length of stay), which correspond to Kirkpatrick’s level 4. Interpretation of these outcomes must incorporate additional contextual information, as they are often the result of a constellation of multiple factors and are less under the team’s direct control.

Secondly, the durability of effects after a single short session remains uncertain; mid- and long-term follow-up is needed. Some authors reported retained improvement in non-technical skills over short timespans of one to two months following one-day training sessions [37]. Comparable findings have also been demonstrated in trauma-focused ATLS®-based trainings [38] and in prehospital emergency simulation settings [39]. Thus, the iSRST requires mid- and long-term evaluation. In addition, repeated training sessions will likely be necessary to sustain improvement regarding the CRM concepts. The optimal frequency and duration require further investigation. Regarding our individual course concept, participants indicated that a frequency of 1–2 trainings per year best suited their expectations of the training. Thirdly, our data relied entirely on subjective self-assessment. Evaluating self-assessed competencies and calculating retrospective learning gains may be limited indicators of actual knowledge gain. While retrospective pretest–posttest designs are limited in their ability to assess true longitudinal retention, they are particularly suited to complex educational interventions in which participants’ baseline self-assessments may change after training [40, 41]. In retrospective pretest–posttest designs, participants rate their current status and, retrospectively, their prior status within the same measurement occasion, thereby using a common frame of reference and reducing bias introduced by changes in understanding of the construct over time. Moreover, recent reviews in medical education highlight the pragmatic value of retrospective and post-only assessment designs when traditional pretest administration is constrained by logistic or construct-clarity issues [42]. A meta-analysis on this topic indicates that while self-assessments are commonly used in the literature for evaluation purposes, they may be imperfect and unreliable indicators of underlying true learning [13].
While self-assessment captures perceived competence, objective approaches such as knowledge testing, simulation-based performance assessment, observational rating scales, and workplace-based assessments are commonly used to evaluate competence beyond subjective measures.

In the present setting, retention assessment would have been methodologically challenging, as participants were frequently re-exposed to interdisciplinary team training or related clinical experiences, sometimes within short intervals. Such repeated exposure would likely confound delayed measurements and limit attribution of retention effects to the intervention itself. Future research on the training concept may address this limitation by incorporating delayed assessments while controlling for interim training exposure or by using objective performance-based retention measures.

Furthermore, there is a lack of consensus in the interpretation of self-assessments, which are sometimes treated as a facet of reactions (analogous to Kirkpatrick’s level 1) and sometimes as an indicator of knowledge levels (analogous to Kirkpatrick’s level 2) [43]. Fourthly, our analysis used a purpose-designed individual questionnaire. Additional training aspects may not have been captured to the full extent. The single questionnaire precludes any conclusion regarding the sustainability of training effects. Finally, this was a single-center study conducted in a specific institutional context, which may limit generalizability. Future multi-center trials with objective outcome measures and longitudinal follow-up are warranted to strengthen the evidence base.
