AI for EMS Agencies: What Works, What Doesn’t, and How to Get It Right

I want to be upfront about something: I am not anti-AI. Not even close. At Handtevy, we use AI in meaningful ways. We use it for smart quizzes, checklist creation, protocol sorting, and a host of back-office tools that help our team build a better product. With over 3,000 EMS and hospital customers in all 50 states, I have seen firsthand how technology can improve care when it is applied correctly. I believe AI is going to transform healthcare, and in many ways it already has.

But there is a line.

When generative AI moves from administrative support into interpreting clinical protocols, algorithms, or order sets to guide real-time patient treatment, the stakes change entirely.

This is not a theoretical risk. The literature is clear. The failure modes are well documented. And the consequences in a high-acuity, time-critical environment are too significant to wave away with optimism about where the technology is headed.


Where AI for EMS Agencies Is Already Making a Difference

Let’s start with what’s working. At Handtevy, AI is already part of what we deliver — both inside our platform and behind the scenes. We have invested in AI tools that support training, education, protocol management, and operational workflows. What these applications share is a common principle: AI assists, and a human validates the output before it reaches a clinician.

The question for EMS leaders evaluating AI right now is not whether to adopt it. The question is where to deploy it and what standard to hold it to.

Real-Time Patient Care Requires a Different Standard

When you are managing a critically ill or injured patient, the tools you rely on need to be consistent, predictable, transparent, and traceable to verified source content. They need to be validated for clinical use and governed by medical leadership.

Generative AI is not built to meet those requirements. It is probabilistic by design. It is optimized for language generation, not clinical reliability. There is a reason we do not ask our calculators to improvise. Real-time decision support tools should behave the same way: structured, deterministic, and medically governed.

That does not mean the technology has no role. It means we need to match the tool to the risk level of the task. AI for EMS agencies deployed in administrative or educational settings carries a fundamentally different risk profile than AI deployed at the bedside during active patient care.

Protocols and Order Sets Are Not Just Text

Whether you work in EMS or in a hospital, you already know this. Protocols, clinical algorithms, and order sets are not simple reference documents. They are complex operational tools built for trained clinicians who understand how they apply within a specific system.

In EMS, a single protocol might include conditional decision pathways, branching logic, exceptions that depend on clinical judgment, medication limitations tied to local formularies and what is actually on the truck, and scope-of-practice boundaries that differ from one agency to the next. In the hospital setting, order sets and clinical algorithms carry similar complexity: institution-specific formularies, credentialing requirements, pharmacy and therapeutics committee decisions, and workflows that vary by unit, by service line, and by patient population. These documents come in every format you can imagine: narrative text, flow diagrams, tables, scanned PDFs, appendices, cross-referenced medication sections. Many organizations revise portions of a document without updating the entire thing.

A trained clinician reads these documents and fills in the gaps with context, experience, and judgment. Generative AI does not do that. It processes text and generates a response based on probability. That distinction matters enormously when evaluating AI for EMS agencies that handle high-acuity calls.

We know this firsthand. At Handtevy, we have done extensive internal testing, asking AI to answer questions directly from clinical protocols. The results have been consistently disappointing. Missed conditional logic. Incomplete answers. Confident responses that left out critical steps. And this was in a controlled environment, not during active patient care. If the AI cannot reliably get it right when we are deliberately testing it, it has no business being in the hands of a clinician managing a real patient.

What the Evidence Actually Shows

I want to ground this in data, not opinion.

A study published in the Journal of Medical Internet Research evaluated four large language model chatbots responding to emergency care questions. The result: dangerous information appeared in 5% to 35% of responses, including advice like starting CPR without checking for a pulse. That is not a minor error. That is a patient safety event waiting to happen.

Research in critical care settings has shown that AI applications frequently lack the situational awareness needed for emergency medicine. Many systems are trained on datasets that may not reflect the population actually being treated. And the “black box” nature of most AI algorithms makes it nearly impossible for clinicians to understand how a recommendation was generated, especially when it conflicts with their own assessment or with local protocols, algorithms, or order sets.

Medication dosing through AI introduces its own set of risks, including data quality issues, bias, and the absence of real-time validation. One review on AI in emergency medicine pharmacy practice emphasized that factual inaccuracies, ethical concerns, and lack of transparency remain significant barriers to clinical adoption. This is a topic I feel strongly about. Handtevy is the only medication dosing app that does not allow the device itself to perform the calculation. Every dose in Handtevy is pre-calculated and vetted by a clinical team before it ever goes live. We made that decision deliberately, because there is inherent risk in having a device calculate a dose on the fly, even with traditional math, let alone with generative AI. The stakes are too high to leave to an algorithm that has not been verified by a human.

These are not edge cases. These are known, documented, reproducible characteristics of generative AI when applied to clinical content.

The Hallucination Problem Is Real

One of the most discussed limitations of generative AI is its tendency to produce confident but incorrect or fabricated responses. In clinical AI literature, these are called hallucinations. The AI does not know it is wrong. It generates an answer that sounds authoritative, and unless someone with the right training catches the error, it gets acted on.

In an administrative workflow, a hallucination is a nuisance. You catch it in review, you correct it, you move on. During active patient care, a hallucination could mean the wrong drug, the wrong dose, or a missed contraindication. For EMS agencies evaluating real-time AI protocol tools, this is the critical distinction to understand. The margin for error is fundamentally different.

How to Evaluate AI for EMS Agencies the Right Way

If you are a medical director, EMS leader, or ED administrator evaluating AI tools right now, here is a practical framework for drawing the right lines.

Low-Risk, High-Value AI Integrations:

  • ✅ Education and training content
  • ✅ Quality improvement analysis
  • ✅ Protocol organization and sorting
  • ✅ Administrative workflow automation
  • ✅ Documentation support

These are areas where AI enhances efficiency and a human reviews the output before it affects care.

Where to Use Caution:

  • 🚫 Any app or tool where AI is the sole decision-maker with no human in the loop
  • 🚫 Applications or software built entirely on generative AI to interpret protocols, algorithms, or order sets
  • 🚫 Tools that deliver real-time clinical guidance without medical governance
  • 🚫 Any application where the output has not been validated by your medical leadership before it reaches a provider in the field or in the ED
  • 🚫 Vendors who cannot tell you exactly where the output comes from or how it was validated

The goal is not to avoid AI. The goal is to deploy it where it adds value without introducing risk that your patients and providers cannot afford.

The Bottom Line

Healthcare innovation has always required balance. The best technologies improve care when they are implemented thoughtfully, validated properly, and aligned with how medicine actually works on the ground. AI for EMS agencies is no different.

AI will continue to play an important and expanding role in emergency medicine. I am genuinely excited about that. But when it comes to interpreting protocols, algorithms, and order sets to guide real-time treatment, decision support tools must be purpose-built, structured, and medically governed. That is not a limitation of innovation. That is the standard patients deserve.


Looking for AI Tools Your EMS Agency Can Actually Trust?

Get the technology. Skip the risk. Handtevy delivers purpose-built clinical decision support trusted by EMS agencies and hospitals in all 50 states.

Schedule a Demo


About the Author

Peter Antevy, MD, FAEMS is a board-certified pediatric emergency medicine physician, EMS medical director, and the Founder and Chief Medical Officer of Handtevy. He has spent his career focused on improving emergency care for both adults and children, with a particular emphasis on medication safety, resuscitation, and closing the gap between evidence-based medicine and what happens in the field and at the bedside.

Dr. Antevy serves as an EMS medical director in South Florida, where he oversees some of the busiest prehospital systems in the country. He is a nationally recognized speaker, researcher, and educator in emergency medicine, and has been instrumental in advancing pediatric readiness across EMS and hospital systems nationwide.

Through Handtevy, Dr. Antevy and his team support close to 3,000 EMS and hospital customers across all 50 states with purpose-built clinical decision support tools designed to reduce medication errors and improve outcomes during the most critical moments in patient care.



References

  1. Yau JY, Saadat S, Hsu E, et al. Accuracy of Prospective Assessments of 4 Large Language Model Chatbot Responses to Patient Questions About Emergency Care. Journal of Medical Internet Research. 2024;26:e60291.
  2. Vearrier L, Derse AR, Basford JB, et al. Artificial Intelligence in Emergency Medicine: Benefits, Risks, and Recommendations. The Journal of Emergency Medicine. 2022;62(4):492-499.
  3. Pinsky MR, Bedoya A, Bihorac A, et al. Use of Artificial Intelligence in Critical Care: Opportunities and Obstacles. Critical Care. 2024;28(1):113.
  4. Iserson KV. Informed Consent for Artificial Intelligence in Emergency Medicine: A Practical Guide. The American Journal of Emergency Medicine. 2024;76:225-230.
  5. Edwards CJ, Erstad BL, Ng V. The Role of Artificial Intelligence in Emergency Medicine Pharmacy Practice. American Journal of Health-System Pharmacy. 2025:zxaf038.
  6. Cheng R, Aggarwal A, Chakraborty A, et al. Implementation Considerations for the Adoption of Artificial Intelligence in the Emergency Department. The American Journal of Emergency Medicine. 2024;82:75-81.
  7. Piliuk K, Tomforde S. Artificial Intelligence in Emergency Medicine: A Systematic Literature Review. International Journal of Medical Informatics. 2023;180:105274.

DISCLAIMER: These links are provided for research purposes and are not affiliated with Handtevy.