
Starting next year, the Home Office plans to use AI-driven facial age estimation to assess the age of asylum seekers. At the UK border, deciding whether someone is 17 or 19 is a consequential judgment. Get it wrong one way, and a vulnerable child loses legal protections they’re entitled to. But if it’s wrong in the other direction, then an adult enters a system designed for minors.

Is this technology ready for such a high-stakes decision?

Facial age estimation works by feeding a photograph into an AI system that goes through multiple layers of analysis, each picking up increasingly subtle patterns in the image. It is trained on millions of photographs of people whose ages are already known. Over time, the model learns to associate patterns in a face with likely age ranges: skin texture, the depth of lines around the eyes, bone structure and the distribution of soft tissue.

This is different from facial recognition, which identifies who someone is by matching their face against an existing database.

The system does not produce a single definitive answer. It produces a probability distribution, something closer to “most likely between 17 and 21” than “this person is 18.” Research on automation bias in immigration finds that even when algorithmic outputs are advisory, officers under time pressure tend to defer to them rather than question them, so in practice a range becomes a number.
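To see what is lost when a range becomes a number, here is a minimal Python sketch. The probabilities are invented for illustration and do not come from any real system; the variable names are likewise hypothetical.

```python
# Hypothetical output of an age estimator for one photograph:
# each candidate age mapped to the probability the model assigns it.
# These numbers are invented for illustration only.
age_probabilities = {16: 0.10, 17: 0.20, 18: 0.25, 19: 0.20, 20: 0.15, 21: 0.10}

ADULT_AGE = 18

# The single age the model considers most probable.
most_likely_age = max(age_probabilities, key=age_probabilities.get)

# Total probability assigned to ages below the legal threshold.
probability_under_18 = sum(
    p for age, p in age_probabilities.items() if age < ADULT_AGE
)

# Probability-weighted average age.
expected_age = sum(age * p for age, p in age_probabilities.items())

print(f"Most likely age: {most_likely_age}")
print(f"Expected age: {expected_age:.1f}")
print(f"Probability under {ADULT_AGE}: {probability_under_18:.0%}")
```

In this made-up case the headline answer is 18, yet the same distribution puts roughly a 30% chance on the person being a child. Reporting only the single number hides that.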

Under UK law, unaccompanied asylum seekers under 18 are treated as children, which means they are placed in local authority care, given access to education and afforded legal protections that adults are not. The stakes of that single-year boundary are considerable.

How good is the technology?

The National Institute of Standards and Technology (Nist) is the US agency that provides independent global benchmarks for this kind of technology. It has been running ongoing evaluations since 2024, testing algorithms on datasets spanning multiple image types, including border crossing photographs.

Nist’s evaluations measure success using mean absolute error: the average number of years by which a system’s estimate is off. Leading algorithms now achieve a mean absolute error of less than three years across all ages, a figure that would have seemed ambitious not long ago.
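As a rough illustration of how mean absolute error is calculated, the Python sketch below uses a handful of invented ages rather than benchmark data. It also shows how an overall average can sit below the error for one age band.

```python
def mean_absolute_error(true_ages, estimated_ages):
    """Average size of the gap, in years, between true and estimated ages."""
    if not true_ages or len(true_ages) != len(estimated_ages):
        raise ValueError("need two non-empty lists of equal length")
    gaps = [abs(t - e) for t, e in zip(true_ages, estimated_ages)]
    return sum(gaps) / len(gaps)


# Invented example data: not taken from any benchmark.
true_ages = [16, 17, 18, 25, 40]
estimated_ages = [19, 15, 21, 26, 38]

overall_mae = mean_absolute_error(true_ages, estimated_ages)

# Same calculation, restricted to people whose true age is 16 to 18.
near_threshold = [(t, e) for t, e in zip(true_ages, estimated_ages) if 16 <= t <= 18]
near_threshold_mae = mean_absolute_error(
    [t for t, _ in near_threshold],
    [e for _, e in near_threshold],
)

print(f"Overall MAE: {overall_mae:.2f} years")
print(f"MAE for ages 16 to 18: {near_threshold_mae:.2f} years")
```

With these made-up figures the overall error is 2.2 years, while the error for the 16-to-18 group alone is about 2.7 years. A single headline average says nothing about how errors are spread across ages.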

An average error of three years for an unseen photo is technically very good – research using passport-style photographs found that humans estimating the age of an unfamiliar face are typically off by around eight years. But when borderline decisions can shape the course of someone’s life, even the best available tool needs scrutiny.

The Home Office has contracted Cognitec, ranked fourth globally in Nist’s most recent published benchmark, to develop the system via UK firm Akhter Computers. A live trial is planned at a Home Office processing facility in Dover before a wider rollout. The technology will act as one input among several, while officers retain the final decision.

But despite the technology improving, Nist’s own data shows that its accuracy degrades significantly at the boundaries that matter most. At the 16-to-18 threshold (the exact line being drawn at the border) error margins for leading systems are materially higher than the overall average.

A Border Force vessel at the docks in Dover. The age of migrants who arrive by small boat will be assessed by AI age estimation beginning in 2027. (Image: Sean Aidan Calderbank/Shutterstock)

Nist’s data also shows performance is consistently weaker for female faces and varies significantly by geography, meaning algorithms trained predominantly on certain regions perform less accurately on faces from others. Given that the majority of those assessed at the UK border originate from regions underrepresented in those training datasets, this is a concern.

There’s also the training data problem. These models are built predominantly on western, white-majority datasets that skew heavily male, which is a real limitation. The skew exists because the research and commercial infrastructure that built these datasets (universities, tech firms, government ID programmes with accessible archives) was concentrated in North America and Europe. The data reflects who was in the room. Research consistently shows the consequence: lower accuracy for underrepresented ethnic groups. The people most affected by errors in this system are the same people the technology was least designed to serve.

Before this tool carries meaningful weight in age decisions, three things need to be demonstrably true. Accuracy must be validated on the actual population it will assess – not a generalised benchmark dataset, but exhausted, potentially malnourished people photographed in real border conditions.

Demographic performance must be published transparently, broken down by gender and ethnic origin, with clear protocols for when results should be discounted. Finally, the “human in the loop” guarantee – the principle that a trained officer, not the algorithm, makes the final call – must be real and not a rubber stamp.

The Home Office’s own watchdog found staff at the Dover processing centre lacked adequate training in current assessment methods. Getting the human part right matters every bit as much as the technology. The independent inspector acknowledged that without a foolproof test, some decisions will inevitably be wrong, and that this is a cause for particular concern if a child is denied the rights and protections to which they are entitled.

AI age estimation alone will not be that foolproof test. But used carefully, transparently and with accountability, it could be a meaningful part of getting these decisions right, more often.

The Conversation

Oli Buckley receives funding from UKRI.
