AI Triage in Emergency Departments: What the Real-World Data Shows


Emergency department triage is one of the most-discussed AI applications in healthcare. Vendors promise faster triage, better acuity assessment, and earlier identification of patients at risk. But what does the real-world evidence actually show?

I’ve been tracking ED AI implementations in Australia and internationally for the past two years. Here’s what I’m seeing.

The Promise vs. The Reality

The pitch is compelling: AI analyses patient presentations—vital signs, presenting complaints, arrival mode—and assigns triage categories. Nurses use AI recommendations as decision support. High-risk patients get identified faster. Lives are saved.

The reality is more nuanced.

Where AI is helping:

In implementations I’ve reviewed, AI consistently performs well at detecting patients who present with apparently minor symptoms but have serious underlying conditions. The classic example is the chest pain patient who looks stable but has ECG changes suggesting acute coronary syndrome.

AI is also good at pattern recognition across vital signs. A combination of readings that might not individually trigger concern can indicate deterioration when viewed together. AI catches patterns that busy triage nurses sometimes miss—not because nurses aren’t skilled, but because they’re handling multiple patients simultaneously.
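To make that concrete, here's a minimal sketch of the kind of aggregate scoring this builds on, loosely modelled on NEWS2-style early-warning scores rather than any vendor's actual algorithm. The bands below approximate published NEWS2 thresholds but are simplified for illustration, not clinical use.

```python
# Illustrative only: a NEWS2-style aggregate over vital signs.
# Real products use richer models; these bands are simplified
# for demonstration, not clinical use.

def band_score(value, bands):
    """Return the points for the first band whose range contains value."""
    for low, high, points in bands:
        if low <= value <= high:
            return points
    return 0

def early_warning_score(resp_rate, spo2, heart_rate, sys_bp, temp_c):
    score = 0
    score += band_score(resp_rate, [(0, 8, 3), (9, 11, 1), (12, 20, 0), (21, 24, 2), (25, 60, 3)])
    score += band_score(spo2, [(0, 91, 3), (92, 93, 2), (94, 95, 1), (96, 100, 0)])
    score += band_score(heart_rate, [(0, 40, 3), (41, 50, 1), (51, 90, 0), (91, 110, 1), (111, 130, 2), (131, 250, 3)])
    score += band_score(sys_bp, [(0, 90, 3), (91, 100, 2), (101, 110, 1), (111, 219, 0), (220, 300, 3)])
    score += band_score(temp_c, [(30.0, 35.0, 3), (35.1, 36.0, 1), (36.1, 38.0, 0), (38.1, 39.0, 1), (39.1, 43.0, 2)])
    return score

# Each reading below is only mildly abnormal, yet the combination
# crosses a typical escalation threshold (aggregate score >= 5).
print(early_warning_score(resp_rate=22, spo2=94, heart_rate=105, sys_bp=105, temp_c=38.3))  # -> 6
```

No single reading in that example scores more than 2 points on its own, yet together they cross a typical escalation threshold. That's the pattern-across-vitals effect in miniature.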

Where AI struggles:

Contextual judgment remains challenging. A patient presenting with anxiety symptoms after a traumatic event might be appropriately triaged lower than the algorithm suggests. A patient with intellectual disability might present in ways the training data didn't anticipate.

AI also struggles with “soft” information that experienced triage nurses incorporate—how the patient looks, how they’re behaving, what their family members are saying. This qualitative assessment often matters.

Evidence from Australian Implementations

Several Australian EDs have implemented AI triage systems. The published evidence is limited (implementation has largely outpaced formal evaluation), but here's what's emerged:

Metropolitan public hospitals have seen the clearest benefits. High patient volumes give the AI more opportunities to surface at-risk cases, and the efficiency gains are measurable. One implementation reported a 40% reduction in time-to-clinician for AI-flagged urgent cases.

Regional hospitals show more mixed results. Lower volumes mean fewer opportunities for AI to demonstrate value. The cost-per-case is higher. Some have questioned whether the investment is justified for smaller departments.

Private hospitals are implementing more slowly. Different patient demographics (generally less acute) mean the value proposition is different. AI designed for public ED presentations doesn’t always translate well.

What Implementation Actually Looks Like

I visited an ED six months into AI implementation. Here’s what I observed:

The AI interface is integrated into the triage workstation. When a patient arrives, the nurse enters the presenting complaint; vital signs are captured automatically from monitoring equipment. The AI generates a recommended triage category and flags specific concerns.
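For the software-minded, the decision-support contract is roughly this shape. A minimal sketch, with every field name and flag hypothetical; the department's actual system wasn't shared with me. The categories follow the Australasian Triage Scale (ATS), where 1 is most urgent and 5 least.

```python
# Hypothetical sketch of a triage decision-support contract.
# Field names, flags, and structure are illustrative assumptions,
# not the visited department's actual system.
from dataclasses import dataclass, field

@dataclass
class Presentation:
    complaint: str        # free-text presenting complaint, entered by the nurse
    resp_rate: int        # vitals below auto-captured from monitoring equipment
    spo2: int
    heart_rate: int
    sys_bp: int
    temp_c: float
    arrival_mode: str     # e.g. "ambulance", "walk-in"

@dataclass
class Recommendation:
    category: int                                   # ATS 1 (most urgent) to 5
    flags: list[str] = field(default_factory=list)  # e.g. ["sepsis risk"]
    rationale: str = ""                             # shown as decision support, not a directive
```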

Most of the time, AI agrees with what the nurse would have assessed anyway. This might seem like the AI isn’t adding value, but it serves a purpose: it confirms the nurse’s judgment and creates documentation that appropriate assessment occurred.

The value emerges in edge cases. Patients where the AI disagrees with the nurse's initial impression. Patients where the AI flags a specific concern (sepsis risk, cardiac event risk) that prompts additional assessment.

Nurses I spoke with had mixed reactions. Some found it genuinely helpful—“It’s caught a couple of things I might have missed.” Others saw it as bureaucratic overhead—“I still have to make the same decisions, just with an extra step.”

The department has been iterating on how AI recommendations are presented. Initially, AI recommendations appeared prominently. Nurses felt they couldn’t deviate from AI without extensive documentation of why. The current interface is more subtle—AI is available but doesn’t dominate the screen.

International Evidence Worth Noting

Looking internationally, the US has the most ED AI deployments. Key findings:

Epic Sepsis Model controversy. Epic's AI sepsis prediction tool was widely deployed before independent validation; a widely cited 2021 external study found its discrimination was well below the vendor's claims (an AUC of roughly 0.63 versus a reported 0.76–0.83) while imposing a heavy alert burden. The episode generated significant discussion about how AI is validated in real-world conditions versus controlled studies.

Mixed results on patient outcomes. Controlled studies of ED AI show improvements in process metrics (faster identification, earlier treatment) but limited evidence of reduced mortality. This might be because AI needs longer deployment periods to demonstrate outcome improvements, or because the benefits are smaller than hoped.

Alert fatigue is real. Systems that generate too many alerts end up being ignored. Several implementations have had to recalibrate sensitivity to maintain clinical engagement.
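Recalibrating sensitivity is essentially a threshold sweep: choosing the operating point that keeps enough true cases while cutting alert volume to something staff will actually heed. A toy illustration with fabricated scores and labels:

```python
# Toy threshold sweep: trade alert volume against sensitivity.
# Scores, labels, and prevalence are fabricated for illustration.
import numpy as np

rng = np.random.default_rng(0)
n = 10_000
truly_septic = rng.random(n) < 0.03                # ~3% prevalence
scores = np.where(truly_septic,
                  rng.normal(0.7, 0.15, n),        # sick patients score higher on average
                  rng.normal(0.4, 0.15, n))

for threshold in (0.5, 0.6, 0.7):
    alerts = scores >= threshold
    caught = (alerts & truly_septic).sum()
    sensitivity = caught / truly_septic.sum()
    alerts_per_case = alerts.sum() / max(caught, 1)
    print(f"threshold {threshold}: {alerts.sum():>5} alerts, "
          f"sensitivity {sensitivity:.0%}, "
          f"{alerts_per_case:.0f} alerts per true case caught")
```

The uncomfortable part is that every operating point trades missed cases against fatigue; there's no threshold that avoids both.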

The UK NHS has been more conservative, focusing on pilots and evaluation before widespread deployment. Their approach emphasises building evidence before scaling—something Australia could learn from.

Key Success Factors

From implementations that have worked:

Nursing leadership engagement. ED AI succeeds when nursing leadership champions it. Implementations driven by IT or executive mandate without clinical leadership consistently struggle.

Realistic expectations. AI won’t transform emergency medicine. It’s an incremental improvement in triage accuracy. Organisations that expect transformation are disappointed. Organisations that expect incremental improvement find value.

Integration, not addition. AI that requires separate screens, extra clicks, or additional documentation steps will face resistance. Integration into existing workflows is essential.

Feedback loops. Nurses need to see that their feedback on AI performance matters. Systems that incorporate clinical feedback and improve over time maintain engagement. Systems that feel unchangeable lose it.

Performance transparency. Publishing internal metrics on AI accuracy builds trust. Organisations that are opaque about AI performance create suspicion.
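In practice, publishing internal metrics can be as simple as a periodic report comparing AI-recommended categories against nurses' final decisions. A minimal sketch, assuming a hypothetical log with ai_category and final_category columns (ATS numbering, so a lower number means more urgent):

```python
# Hypothetical monthly transparency report: compare AI-recommended
# triage categories against the nurse's final category.
# Column names are assumptions, not a real system's schema.
import pandas as pd

df = pd.read_csv("triage_log.csv")   # columns: ai_category, final_category

agreement = (df["ai_category"] == df["final_category"]).mean()
# On the ATS a lower number is more urgent, so "over-triage" here means
# the AI recommended a more urgent category than the final decision.
over = (df["ai_category"] < df["final_category"]).mean()
under = (df["ai_category"] > df["final_category"]).mean()

print(f"Agreement with final category: {agreement:.1%}")
print(f"AI over-triaged (more urgent than final): {over:.1%}")
print(f"AI under-triaged (less urgent than final): {under:.1%}")
```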

Cautions and Concerns

I want to be balanced. There are legitimate concerns about ED AI:

Over-reliance risk. If nurses become too dependent on AI, they might lose independent triage judgment. I haven’t seen strong evidence of this yet, but it’s a theoretical concern worth monitoring.

Liability uncertainty. When AI recommends a lower triage category and the patient deteriorates, who’s responsible? Current legal frameworks assign responsibility to the clinician, but this hasn’t been thoroughly tested.

Equity implications. If an AI was trained on data from certain populations, it might perform worse on others. Indigenous Australian patients, patients from diverse cultural backgrounds, and patients with intellectual disabilities might be disadvantaged by AI that doesn't reflect their presentation patterns.
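One concrete mitigation is routine subgroup auditing: compute the same performance metrics separately for each patient group and treat persistent gaps as a red flag. A sketch, assuming hypothetical patient_group and icu_24h (serious-outcome) columns in the triage log:

```python
# Hypothetical subgroup audit: does the AI's sensitivity to serious
# outcomes vary by patient group? "patient_group" and "icu_24h" are
# assumed columns for illustration.
import pandas as pd

df = pd.read_csv("triage_log.csv")

serious = df[df["icu_24h"] == 1]           # patients with a serious outcome
flagged = serious["ai_category"] <= 2      # AI recommended ATS 1 or 2

sensitivity_by_group = flagged.groupby(serious["patient_group"]).mean()
print(sensitivity_by_group.sort_values())  # a markedly lower rate for any group warrants review
```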

Cost-effectiveness questions. Is ED AI a good use of limited healthcare dollars? The evidence for improved patient outcomes is thin. We might be investing in technology that improves processes without improving health.

My Take

ED AI triage is a sensible application of clinical AI. The evidence suggests it helps in some cases and rarely makes things worse. That's a reasonable bar for decision support technology.

But I’d caution against viewing it as essential or transformative. It’s one tool among many. A well-staffed, well-trained triage team without AI will outperform an understaffed team with AI. The fundamentals still matter more than the technology.

If you’re considering ED AI, start small. Pilot with willing clinical partners. Measure carefully. Iterate based on feedback. And don’t expect miracles.


Dr. Rebecca Liu is a health informatics specialist and former Chief Clinical Information Officer. She advises healthcare organisations on clinical AI strategy and implementation.