Q1 - The most challenging bottleneck we dealt with as we started to scale was the webhook storm created when patients' data was updated in bulk from Electronic Health Records (EHRs). FHIR R5 introduced topic-based Subscriptions as a push mechanism, so a bulk update triggers an immediate, significant surge of thousands of notifications sent to the destination at the same time. Under that volume of concurrent transactions, the ingestion process often failed, and cascading timeouts pushed the EHR system into retry mode, amplifying the load even further.

Q2 - To address this, we created durable message queues and implemented strict consumer-side idempotency. Instead of processing these events synchronously, we simply validate the notifications at the receiving end and put them into a queue. A classic example of this implementation is how we used the FHIR Resource ID and Version ID as a composite idempotency key within our Clinical Decision Support System. Our architecture was designed so that during a network problem, when the EHR may send the same lab-result notification five times, our background worker processes the data only once. Thus, we turned a chaotic burst of data into a predictable, orderly stream of work. As you scale your event-driven healthcare workflows, you must shift your mindset from raw processing velocity to the ability to buffer efficiently. Your architecture should absorb the bursts that will inevitably come from EHRs and guarantee that clinical data delivery never suffers because of a technical spike.
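A minimal sketch of the queue-plus-composite-key pattern described above. This is illustrative, not the actual implementation: an in-memory queue and set stand in for a durable message queue and a persistent idempotency store, and all names are assumptions.

```python
import queue

notification_queue = queue.Queue()   # a durable broker queue in production
processed_keys = set()               # a persistent store in production

def receive_notification(notification: dict) -> None:
    """Webhook endpoint: validate and enqueue, never process inline."""
    if "resource_id" in notification and "version_id" in notification:
        notification_queue.put(notification)

def worker() -> list:
    """Background worker: process each (resource, version) at most once."""
    results = []
    while not notification_queue.empty():
        n = notification_queue.get()
        # Composite idempotency key: FHIR Resource ID + Version ID
        key = (n["resource_id"], n["version_id"])
        if key in processed_keys:
            continue  # duplicate delivery -> safe no-op
        processed_keys.add(key)
        results.append(key)  # stand-in for real business logic
    return results

# The same lab result delivered five times is processed only once.
dup = {"resource_id": "Observation/123", "version_id": "5"}
for _ in range(5):
    receive_notification(dup)
assert worker() == [("Observation/123", "5")]
```

The webhook handler does the minimum needed to acknowledge receipt quickly; all deduplication and business logic happen in the background worker, which is what turns the burst into an orderly stream.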
The hardest scalability pitfall was assuming Subscription notifications behave like exactly-once events. In practice we hit duplicate and out-of-order deliveries during retries, which created "retry storms" and double-processing when downstream systems were slow or briefly offline. The fix that actually held up was strict idempotency backed by a durable queue: we write every incoming notification to a queue, then de-dupe using a stable event key (for us, subscription id plus notification id, or the notification Bundle id) before any business logic runs. Example: an Observation update triggered three webhook deliveries after two timeouts; without the de-dupe key we created three tasks, with it we processed the first and safely no-oped the rest while still acknowledging receipt quickly.
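The Observation example above can be sketched roughly as follows; the key fields match the answer's description, but the handler and store are assumptions (a dict stands in for a durable key-value store).

```python
seen: dict[str, bool] = {}  # durable KV store in production

def event_key(subscription_id: str, notification_id: str) -> str:
    """Stable de-dupe key: subscription id plus notification id."""
    return f"{subscription_id}:{notification_id}"

def handle_delivery(subscription_id: str, notification_id: str) -> str:
    """Acknowledge every delivery, but run business logic at most once."""
    key = event_key(subscription_id, notification_id)
    if seen.get(key):
        return "acknowledged (duplicate, no-op)"
    seen[key] = True
    # ... enqueue the real task here ...
    return "acknowledged (processed)"

# Three deliveries of the same Observation update after two timeouts:
responses = [handle_delivery("sub-1", "notif-42") for _ in range(3)]
```

Note that all three deliveries are acknowledged, so the EHR stops retrying, while only the first one creates a task.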
I worked on FHIR R5 Subscriptions while supporting event-driven finance workflows tied to EHR data at Advanced Professional Accounting Services. The hardest scalability pitfall was duplicate events flooding downstream systems during traffic spikes. We saw posting delays jump from seconds to minutes in peak clinic hours. We fixed it by enforcing strict idempotency keys at the subscription consumer layer. I remember adding a simple hash on resource id and timestamp before writes. Error rates dropped by 62 percent in two weeks and queues stayed stable. We also layered a durable queue to smooth bursts. Boring controls often save the day, even if it feels slow.
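The "simple hash before writes" check described above might look something like this; the function names and store are hypothetical stand-ins (a set in place of a durable store).

```python
import hashlib

written: set[str] = set()  # durable store in production

def idempotency_hash(resource_id: str, last_updated: str) -> str:
    """Hash of resource id plus timestamp, used as the idempotency key."""
    return hashlib.sha256(f"{resource_id}|{last_updated}".encode()).hexdigest()

def write_if_new(resource_id: str, last_updated: str) -> bool:
    """Perform the write only the first time this (id, timestamp) is seen."""
    h = idempotency_hash(resource_id, last_updated)
    if h in written:
        return False  # duplicate event during a spike: skip the write
    written.add(h)
    # ... perform the actual posting write here ...
    return True
```

Hashing id plus timestamp (rather than id alone) lets a genuinely updated resource through while still blocking redelivered copies of the same event.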