Edition 001 · AI · Healthcare · Systems Thinking

When AI Meets Bedside Manner

2025-01-15

01 · The Finding

LLMs achieve 87% diagnostic accuracy on standardised medical benchmarks — but fail in 34% of cases involving non-Western patient presentations.

02 · The Pattern

Every system optimised for a majority population encodes the majority's blind spots. We saw this in education (standardised curricula built for one type of learner), in urban planning (infrastructure designed for car-owners), and now in AI diagnostics built on Western clinical data. The pattern is not a technology failure. It is a design assumption made visible.

03 · The Signal

Before deploying AI in any clinical or people-facing system in the Gulf or Africa, interrogate the training data. If it was built on Western population datasets, you are not deploying AI — you are importing assumptions.
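What "interrogating the training data" can look like in practice: a minimal sketch that flags demographic groups under-represented in a dataset before deployment. The field name, the sample records, and the 10% floor are illustrative assumptions, not from the review discussed here.

```python
# Hypothetical audit sketch: flag groups whose share of a training
# dataset falls below a representation floor. Field names, sample
# data, and the 10% threshold are illustrative assumptions.

from collections import Counter

def representation_gaps(records, field="region", floor=0.10):
    """Return {group: share} for groups below the representation floor."""
    counts = Counter(r[field] for r in records)
    total = sum(counts.values())
    return {g: n / total for g, n in counts.items() if n / total < floor}

# Toy dataset skewed toward Western sources, as the pattern predicts
records = (
    [{"region": "Western Europe"}] * 70
    + [{"region": "North America"}] * 22
    + [{"region": "Gulf"}] * 5
    + [{"region": "Sub-Saharan Africa"}] * 3
)

gaps = representation_gaps(records)
# Here, Gulf and Sub-Saharan Africa fall below the 10% floor
```

A check like this is a floor, not a ceiling: parity in counts says nothing about whether clinical presentations within each group are captured well, but it catches the crudest version of the assumption before it ships.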

The paper landed in my Feedly feed on a Wednesday morning.

A systematic review. 47 studies. 2.3 million patient cases. The headline finding was impressive — LLMs achieving 87% diagnostic accuracy across standardised medical benchmarks.

But buried in the methodology section was the sentence that stopped me.

34% failure rate in non-Western patient presentations.

Not a bug. A feature of where the data came from.