
Critical Appraisal and Application of Evidence in Emergency Medicine

ACEM Fellowship LO ACEMF-ST-1-TS1-1.1

ACEM Fellowship Learning Objective: Principles of Evidence-Based Practice in the ED


Overview and Rationale

Evidence-based practice in emergency medicine requires the systematic application of critically appraised research to individual patient decisions. Unlike elective settings, the ED demands rapid integration of uncertain, heterogeneous evidence with patient-specific factors, resource constraints, and time pressure. The core skill is not memorising conclusions but understanding the architecture of evidence - its internal validity, external validity, precision, and applicability to the undifferentiated presentations characteristic of emergency practice.

A structured approach to evidence-based medicine follows five sequential steps:

| Step | Description | ED-Specific Challenge |
| --- | --- | --- |
| 1. Translation | Convert clinical uncertainty into an answerable question | Time pressure; gestalt vs. structured reasoning |
| 2. Acquisition | Retrieve best available evidence | Point-of-care access; pre-appraised resources |
| 3. Critical appraisal | Assess validity, results, and applicability | Distinguishing statistical from clinical significance |
| 4. Application | Integrate evidence with clinical context | Undifferentiated patients; exclusion criteria mismatch |
| 5. Evaluation | Assess outcome and refine practice | Audit, M&M, QI cycles |

Structuring a Clinical Question: PICO

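Converting uncertainty into an answerable question is the first and often most neglected step. The PICO framework provides structure:

- P: Patient or population
- I: Intervention (or exposure, or index test)
- C: Comparison
- O: Outcome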

In the ED, foreground questions (specific clinical decisions) must be distinguished from background questions (pathophysiology or mechanism). Most point-of-care appraisal concerns foreground questions.
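
As a minimal sketch, a foreground question can be held as a small record with one field per PICO element. The class name `PicoQuestion`, the `as_question` helper and the renal colic example are hypothetical illustrations, not part of the curriculum or of any trial.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PicoQuestion:
    """One foreground clinical question, split into its four PICO elements."""
    population: str
    intervention: str
    comparison: str
    outcome: str

    def as_question(self) -> str:
        # Assemble the elements into a single answerable sentence.
        return (
            f"In {self.population}, does {self.intervention} "
            f"compared with {self.comparison} affect {self.outcome}?"
        )


# Hypothetical example question, for illustration only.
question = PicoQuestion(
    population="adults with acute renal colic in the ED",
    intervention="IV paracetamol",
    comparison="IV morphine",
    outcome="pain score reduction at 30 minutes",
)
print(question.as_question())
```

Writing the four elements out separately makes it obvious when one is missing, most often the comparison or the outcome.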


Study Design Hierarchy

Understanding the hierarchy of evidence underpins critical appraisal. The hierarchy reflects the degree to which study design controls for bias and confounding.

| Level | Study Design | Key Strength | Key Weakness |
| --- | --- | --- | --- |
| I | Systematic review / meta-analysis of RCTs | Highest statistical power; reduces random error | Heterogeneity; publication bias; garbage-in-garbage-out |
| II | Individual RCT | Controls confounding via randomisation | May not reflect ED population |
| III-1 | Pseudo-RCT (e.g. alternate allocation) | Pragmatic | Allocation bias |
| III-2 | Prospective cohort; case-control; interrupted time series with control | Hypothesis generating | Confounding by indication |
| III-3 | Historical control; single-arm studies; interrupted time series without control | Feasible in rare conditions | Selection bias, temporal confounds |
| IV | Case series; pre/post studies | Rapid, cheap, hypothesis generating | No control group; cannot establish causation |
| Expert opinion / CPP | Consensus-based clinical practice points | Practical guidance where data absent | Susceptible to authority bias |

In emergency medicine, many interventions are studied in heterogeneous populations or under conditions that do not match the ED environment. Level II or III evidence, well-appraised, is often more applicable than a meta-analysis of trials conducted in settings with very different patient populations.


Internal Validity: Assessing Risk of Bias

Internal validity asks: "Did the study measure what it intended to measure, free from systematic error?"

Key Biases in RCTs

| Bias Type | Definition | How to Detect |
| --- | --- | --- |
| Selection bias | Non-comparable groups at baseline | Check allocation concealment, baseline table |
| Performance bias | Differential care beyond intervention | Assess blinding of participants/providers |
| Detection bias | Differential outcome assessment | Blinding of outcome assessors; objective vs. subjective outcomes |
| Attrition bias | Differential dropout affecting results | Intention-to-treat analysis; missing data handling |
| Reporting bias | Selective outcome reporting | Protocol registration; discrepancy between registered and reported outcomes |

Allocation Concealment vs. Randomisation

These are distinct concepts frequently conflated. Randomisation generates unpredictable allocation sequences. Allocation concealment ensures that the person enrolling the patient cannot know the upcoming allocation. Without adequate concealment, selection bias occurs even with genuine randomisation. Sealed opaque envelopes, centralised telephone randomisation, or web-based systems provide adequate concealment.

Blinding

Intention-to-Treat (ITT) vs. Per-Protocol Analysis


Quantitative Measures of Effect

Understanding effect measures is essential for translating statistical findings into clinical decisions.

Dichotomous Outcomes

$$RR = \frac{\text{Event rate in intervention group}}{\text{Event rate in control group}}$$

$$ARR = \text{Control event rate} - \text{Intervention event rate}$$

$$NNT = \frac{1}{ARR}$$

$$OR = \frac{\text{Odds of event in intervention}}{\text{Odds of event in control}}$$
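
The four formulas above can be worked through from the raw counts of a two-arm trial. The sketch below is illustrative only: the function name `dichotomous_effects` and the event counts are invented for the example and do not come from any study.

```python
def dichotomous_effects(events_tx, n_tx, events_ctrl, n_ctrl):
    """Return RR, ARR, NNT and OR from the event counts of a two-arm trial."""
    risk_tx = events_tx / n_tx        # event rate in the intervention group
    risk_ctrl = events_ctrl / n_ctrl  # event rate in the control group

    rr = risk_tx / risk_ctrl
    arr = risk_ctrl - risk_tx
    # NNT is undefined when there is no absolute difference between groups.
    nnt = 1 / arr if arr != 0 else float("inf")

    odds_tx = events_tx / (n_tx - events_tx)
    odds_ctrl = events_ctrl / (n_ctrl - events_ctrl)
    return {"RR": rr, "ARR": arr, "NNT": nnt, "OR": odds_tx / odds_ctrl}


# Invented counts: 10/100 events with the intervention, 20/100 with control.
effects = dichotomous_effects(events_tx=10, n_tx=100, events_ctrl=20, n_ctrl=100)
for name, value in effects.items():
    print(f"{name}: {value:.2f}")
```

With these counts the relative risk is 0.5 while the odds ratio is about 0.44, which shows that the OR sits further from 1 than the RR unless the event is rare. A negative ARR would mean the intervention increases events, in which case the reciprocal is read as a number needed to harm.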

Continuous Outcomes

For continuous outcomes such as pain scores (commonly a 0-10 Numerical Rating Scale), the mean difference (MD) or standardised mean difference (SMD) is reported:

$$SMD = \frac{\mu_1 - \mu_2}{SD_{pooled}}$$

The minimal clinically important difference (MCID) for pain NRS in the ED is approximately 1.3-1.5 points on the 0-10 scale. A statistically significant reduction of 0.5 points falls well below this threshold and is not clinically meaningful.
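
A rough sketch of how a mean difference, the SMD and the MCID relate is given below. It assumes the usual pooled standard deviation for two independent groups; the function names, group means, SDs and sample sizes are invented for illustration, and the MCID of 1.3 is the lower end of the range quoted above.

```python
import math


def pooled_sd(sd1, n1, sd2, n2):
    """Pooled standard deviation of two independent groups."""
    return math.sqrt(((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2))


def continuous_effect(mean1, sd1, n1, mean2, sd2, n2, mcid):
    """Return (mean difference, SMD, whether |MD| reaches the MCID)."""
    md = mean1 - mean2
    smd = md / pooled_sd(sd1, n1, sd2, n2)
    return md, smd, abs(md) >= mcid


# Invented data: control NRS 5.5 vs intervention NRS 5.0, SD 2.5, 200 per arm.
md, smd, meets_mcid = continuous_effect(5.5, 2.5, 200, 5.0, 2.5, 200, mcid=1.3)
print(f"MD = {md:.2f}, SMD = {smd:.2f}, reaches MCID: {meets_mcid}")
```

Here the 0.5-point difference corresponds to an SMD of 0.2 and does not reach the MCID, whatever its p-value.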

Precision: Confidence Intervals

The 95% confidence interval (CI) is the range of effect sizes compatible with the data: if the study were repeated many times, about 95% of the intervals calculated this way would contain the true effect. A wide CI indicates imprecision (usually from a small sample size or few events). A CI that crosses the null value (1 for RR/OR, 0 for MD) indicates a statistically non-significant result at the 5% level, regardless of the point estimate and however narrow the interval.
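
To make the link between sample size and precision concrete, the sketch below computes an approximate 95% CI for a relative risk using the standard log-scale (Wald) method. This is one common approximation, not the only one; the function names and counts are invented for the example.

```python
import math


def rr_confidence_interval(events_tx, n_tx, events_ctrl, n_ctrl, z=1.96):
    """Approximate CI for a relative risk, calculated on the log scale."""
    rr = (events_tx / n_tx) / (events_ctrl / n_ctrl)
    se_log_rr = math.sqrt(
        1 / events_tx - 1 / n_tx + 1 / events_ctrl - 1 / n_ctrl
    )
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper


def crosses_null(lower, upper, null_value=1.0):
    """True when the interval includes the null value (1 for a ratio)."""
    return lower <= null_value <= upper


# Same point estimate (RR 0.5), ten-fold difference in sample size.
small = rr_confidence_interval(10, 100, 20, 100)
large = rr_confidence_interval(100, 1000, 200, 1000)
for label, (rr, lo, hi) in (("small", small), ("large", large)):
    print(f"{label}: RR {rr:.2f} (95% CI {lo:.2f} to {hi:.2f}), "
          f"crosses null: {crosses_null(lo, hi)}")
```

Both invented trials report the same relative risk, but only the larger one has an interval that excludes 1.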

Statistical Significance vs. Clinical Significance

A p-value is the probability of observing a difference at least as large as the one found if there were truly no effect; it does not indicate whether the difference is clinically important. With large sample sizes, tiny, clinically irrelevant differences reach statistical significance. Always interpret p-values alongside effect size and CIs.
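
The effect of sample size on the p-value can be sketched with a simple two-sample z-test (a normal approximation, chosen here only for brevity). The function name `two_sided_p`, the 0.3-point difference and the SD of 2.5 are invented for illustration.

```python
import math


def two_sided_p(mean_diff, sd, n_per_group):
    """Two-sided p-value for a difference in means (normal approximation).

    Assumes two equal-sized groups that share a common, known SD.
    """
    se = sd * math.sqrt(2 / n_per_group)
    z = mean_diff / se
    # erfc(|z| / sqrt(2)) is the two-sided tail area of the standard normal.
    return math.erfc(abs(z) / math.sqrt(2))


# The same 0.3-point pain score difference, at two trial sizes.
p_small_trial = two_sided_p(mean_diff=0.3, sd=2.5, n_per_group=50)
p_large_trial = two_sided_p(mean_diff=0.3, sd=2.5, n_per_group=5000)
print(f"n = 50 per group:   p = {p_small_trial:.3f}")
print(f"n = 5000 per group: p = {p_large_trial:.2e}")
```

The difference is identical in both cases and clinically trivial; only the sample size changes which side of 0.05 the p-value falls on.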


Meta-Analysis: Synthesis and Its Pitfalls

Forest Plots

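A forest plot graphically displays the result of each individual study as a point estimate with its confidence interval, together with the pooled estimate, which is conventionally drawn as a diamond.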

Heterogeneity

Not all variation in study results is random. Heterogeneity describes true differences in effects between studies due to differences in populations, interventions, comparators, or outcomes (PICO variation).

When heterogeneity is high, a narrative or subgroup analysis is more appropriate than a single pooled estimate.
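
Heterogeneity is commonly quantified with Cochran's Q and the $I^2$ statistic, where $I^2 = \max(0, (Q - df)/Q) \times 100\%$. The sketch below assumes fixed-effect inverse-variance weighting and effects expressed on a scale where this is appropriate (for example log relative risks); the function name and the study values are invented for illustration.

```python
def pooled_and_heterogeneity(effects, std_errors):
    """Return (pooled effect, Cochran's Q, I-squared in percent).

    Uses fixed-effect inverse-variance weights: w_i = 1 / SE_i**2.
    """
    weights = [1 / se**2 for se in std_errors]
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    i_squared = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return pooled, q, i_squared


# Invented log relative risks from three studies with equal standard errors.
consistent = pooled_and_heterogeneity([-0.5, -0.5, -0.5], [0.2, 0.2, 0.2])
divergent = pooled_and_heterogeneity([-0.8, -0.1, 0.3], [0.2, 0.2, 0.2])
print(f"consistent: pooled {consistent[0]:.2f}, I2 {consistent[2]:.0f}%")
print(f"divergent:  pooled {divergent[0]:.2f}, I2 {divergent[2]:.0f}%")
```

In the divergent set the pooled value of -0.2 describes none of the three studies well, which is why a single summary estimate is hard to defend when $I^2$ is high.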

Publication Bias

Studies with positive results are more likely to be published, creating a systematic overestimation of treatment effects in meta-analyses.

Funnel plots are used to look for publication bias by plotting each study's effect size on the x-axis against a measure of precision (e.g. sample size or standard error) on the y-axis. In the absence of bias, the plot should form a symmetrical inverted funnel. Asymmetry - typically a cluster of small positive studies without corresponding small negative studies - suggests publication bias.

Funnel plots require a minimum of approximately 10 studies to be interpretable and cannot distinguish publication bias from genuine heterogeneity in small-study effects.


External Validity: Applicability to Your Patient

Even an internally valid, precisely estimated trial effect may not apply to the patient in front of you. Critical questions:

| Question | Relevance |
| --- | --- |
| Do my patients resemble the trial population? | Age, comorbidities, acuity, exclusion criteria |
| Was the intervention delivered as it would be in my ED? | Dose, route, monitoring, staffing |
| Were the outcomes measured ones that matter to my patient? | Patient-centred vs. surrogate outcomes |
| What was the baseline risk in the control group? | High-risk patients derive more absolute benefit |
| Does the trial reflect contemporary practice as a comparator? | Active comparator vs. placebo comparisons |
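
The baseline risk row in the table above can be illustrated numerically. The sketch assumes the relative risk from a trial carries over unchanged to patients at a different baseline risk, which is itself a judgement that needs defending; the function name and the figures are invented.

```python
def absolute_benefit(baseline_risk, relative_risk):
    """Return (ARR, NNT) when a fixed relative risk is applied to a baseline risk."""
    arr = baseline_risk * (1 - relative_risk)
    nnt = 1 / arr if arr != 0 else float("inf")
    return arr, nnt


# The same relative risk (0.8) applied to a high-risk and a low-risk patient.
arr_high, nnt_high = absolute_benefit(baseline_risk=0.20, relative_risk=0.8)
arr_low, nnt_low = absolute_benefit(baseline_risk=0.02, relative_risk=0.8)
print(f"Baseline 20%: ARR {arr_high:.3f}, NNT {nnt_high:.0f}")
print(f"Baseline 2%:  ARR {arr_low:.3f}, NNT {nnt_low:.0f}")
```

A 20% relative reduction sounds identical in both patients, yet the number needed to treat differs ten-fold.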

Emergency medicine trials frequently exclude patients who are haemodynamically unstable, non-English speaking, unconscious, or who have multiple comorbidities - yet these are the patients most commonly encountered in resus. Extrapolation requires explicit clinical reasoning.


Specific ED Considerations in Evidence Appraisal

Surrogate vs. Patient-Centred Outcomes

Surrogate outcomes (e.g. troponin reduction, haemoglobin normalisation, blood pressure targets) may not predict clinically meaningful benefits such as mortality, functional recovery, or quality of life. Fellowship candidates should habitually interrogate whether reported outcomes are endpoints that matter to patients.

Time-Critical Interventions

For time-sensitive interventions (thrombolysis, STEMI reperfusion, sepsis antibiotics), randomised trials may be ethically or logistically constrained. Strong observational data may represent the best available evidence. The strength of mechanistic rationale and biological plausibility becomes more relevant in this context.

Subgroup Analyses

Subgroup analyses are hypothesis-generating unless pre-specified with adequate power. Post-hoc subgroups are prone to false-positive findings by chance. Each additional subgroup analysis increases the risk of a spurious significant finding ($\alpha$ inflation). A test for interaction assesses whether the treatment effect genuinely differs between subgroups; a significant p-value for interaction is a minimum requirement for the credibility of a subgroup claim.
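
The size of the $\alpha$ inflation can be estimated with the family-wise error rate, $1 - (1 - \alpha)^k$ for $k$ tests. The sketch assumes the tests are independent, which real subgroups rarely are, so the figures are a rough guide only; the function name is invented.

```python
def familywise_error_rate(n_tests, alpha=0.05):
    """Chance of at least one false positive across n independent tests."""
    return 1 - (1 - alpha) ** n_tests


for k in (1, 5, 10, 20):
    print(f"{k:>2} subgroup tests: "
          f"{familywise_error_rate(k):.0%} chance of a spurious result")
```

With ten independent subgroup tests and no true effect anywhere, the chance of at least one nominally significant result is about 40%.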

Non-Inferiority Trials

Many ED analgesic and diagnostic trials are framed as non-inferiority: does intervention A perform no worse than comparator B by a clinically acceptable margin (the non-inferiority margin)? Key appraisal points:

- Is the margin clinically justified, not just statistically convenient?
- Paradoxically, poor methodology (diluted treatment effect) biases toward finding non-inferiority - assay sensitivity must be established
- ITT analysis is conservative for superiority but liberal for non-inferiority; both ITT and per-protocol analyses should confirm non-inferiority
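
A minimal sketch of how a non-inferiority conclusion is read from a confidence interval follows. It assumes a dichotomous adverse outcome (such as treatment failure, so higher is worse), a Wald interval for the risk difference, and the convention that non-inferiority is declared when the upper bound of the difference (new minus standard) lies below the margin. The function names, counts and margins are invented for illustration.

```python
import math


def risk_difference_ci(events_new, n_new, events_std, n_std, z=1.96):
    """Risk difference (new minus standard) with an approximate Wald CI."""
    p_new = events_new / n_new
    p_std = events_std / n_std
    diff = p_new - p_std
    se = math.sqrt(p_new * (1 - p_new) / n_new + p_std * (1 - p_std) / n_std)
    return diff, diff - z * se, diff + z * se


def is_non_inferior(upper_bound, margin):
    """Non-inferior when the worst plausible excess risk is below the margin."""
    return upper_bound < margin


# Invented trial: 30/300 failures on the new drug vs 27/300 on the standard.
diff, lower, upper = risk_difference_ci(30, 300, 27, 300)
print(f"Risk difference {diff:.3f} (95% CI {lower:.3f} to {upper:.3f})")
print("Margin 10%:", is_non_inferior(upper, margin=0.10))
print("Margin 5%: ", is_non_inferior(upper, margin=0.05))
```

The same data support non-inferiority at a 10% margin but not at a 5% margin, which is why the clinical justification of the margin is the first thing to appraise.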


Levels of Evidence and Clinical Practice Points

Not all clinical questions have RCT-level evidence. Consensus-based clinical practice points represent recommended best practice derived from expert clinical experience where formal evidence is absent or infeasible. These should not be conflated with evidence-based recommendations and carry a higher risk of authority and confirmation bias. In the ED, many procedural techniques, resuscitation endpoints for rare conditions, and disposition decisions fall into this category.


ACEM Fellowship Implications

Written Paper

OSCE / Viva Application

When presented with a clinical scenario requiring evidence appraisal:

1. Frame the question using PICO
2. Identify the study design and its position in the evidence hierarchy
3. Assess internal validity: randomisation, blinding, ITT, attrition
4. Quantify the effect: absolute not just relative risk; NNT; MCID for continuous outcomes
5. Assess precision: CI width and crossing of null
6. Explicitly address applicability: does this patient resemble the trial population?
7. Integrate evidence with clinical context, patient values, and resource availability

High-yield examiner focus areas:

- Recognising that relative risk measures alone (RRR, OR) inflate apparent benefit when baseline event rates are low
- Understanding that high $I^2$ invalidates a single pooled estimate regardless of a statistically significant diamond
- Knowing that funnel plot asymmetry is a signal, not proof, of publication bias
- Demonstrating that non-inferiority must be interpreted on both ITT and per-protocol analyses
- Contextualising evidence to the ED patient who would have been excluded from the relevant trial

The capacity to critically appraise rather than simply cite evidence is the defining skill differentiating a fellowship-level clinician. In resuscitation and time-critical decisions, this means knowing the limitations and applicability boundaries of evidence as rapidly and fluently as knowing the evidence itself.
