The ability of AI to produce paradoxes continues to fascinate me.
One recent survey found that workers lose the equivalent of 51 working days a year to technology friction – yet people who use AI effectively save 40–60 minutes a day.
The same survey found that only 9% of workers trust AI for complex, business-critical decisions, compared with 61% of executives. Coming after the recent Wall Street Journal poll that showed a similar split between senior management and staff, this is starting to look like a pattern.
And, to be honest, I can see both sides.
Why AI Often Looks Better to Executives Than Employees
For senior leaders, GenAI is often genuinely useful. If you want a high-level overview or a summary to help you orientate yourself and set direction, it can be superb.
But for the people doing the detailed work, the output frequently looks good enough only if you don’t look too closely.
Look closer, and the problems of probabilistic generation start to show: missing caveats, vague generalisations, invented facts, sentences that sound solid on a skim but mean nothing.
When details matter, getting to something usable with even the best GenAI tools can take dozens of rounds of amends and refinement. It’s not hard to see why many staff feel the technology is creating as much friction as it removes.
Why Reliable AI Needs More Structure
What’s interesting is that the newest attempts to make these systems more reliable seem to point in exactly the same direction.
Claude Code appears to work so well largely because, as its leaked system prompt shows, it surrounds the model with multiple layers of contextual constraint and instruction.
Gary Marcus has argued for years that something like this – closer to his preferred “neurosymbolic” approach – is the only plausible route to reliable AI.
Meanwhile, Elin N. has proposed an alternative approach she calls “substrate engineering”: tightly controlling the language, context and structure around a model to produce much more consistent results.
In other words, the more reliable these systems become, the less they seem to work like magic and the more they seem to depend on carefully constructed contextual scaffolding.
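To make that less abstract, here is a minimal Python sketch of what such scaffolding can look like. To be clear, this is not the Claude Code system prompt or Elin N.'s method; `call_model` is a hypothetical stand-in for whatever model API you use, and the layers are my own assumptions about the general pattern.

```python
# A minimal sketch of "contextual scaffolding": rather than sending a bare
# question to the model, we wrap it in layers of constraint and check the
# output before accepting it. `call_model` is a hypothetical placeholder.

def call_model(prompt: str) -> str:
    """Hypothetical stand-in: substitute your provider's API call here."""
    raise NotImplementedError

ROLE = "You are a careful assistant. If you are unsure, say so explicitly."
DOMAIN_RULES = "Use only facts from the SOURCES section. Cite every claim."
OUTPUT_FORMAT = "Answer in at most three bullets, each ending with a tag like [S1]."

def scaffolded_query(question: str, sources: list[str], max_retries: int = 3) -> str:
    # Layer the constraints around the actual question.
    numbered = "\n".join(f"[S{i+1}] {s}" for i, s in enumerate(sources))
    prompt = "\n\n".join(
        [ROLE, DOMAIN_RULES, f"SOURCES:\n{numbered}", OUTPUT_FORMAT,
         f"QUESTION: {question}"]
    )

    for _ in range(max_retries):
        answer = call_model(prompt)
        # Cheap symbolic check: every non-empty line must carry a source tag.
        lines = [ln for ln in answer.splitlines() if ln.strip()]
        if lines and all("[S" in ln for ln in lines):
            return answer
        # Feed the failure back as a further constraint and retry.
        prompt += "\n\nYour previous answer omitted source tags. Every bullet must cite a source."
    raise ValueError("Model never produced a fully cited answer.")
```

The specifics matter less than the shape: constraints layered in on the way to the model, a simple check on the way out, and failures fed back as new constraints.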
The Catch-22 at the Heart of AI Adoption
Most workers do not yet have the time, knowledge or support to build that scaffolding for themselves.
Yet without the detailed knowledge of the people actually doing the work, the scaffolding is often not good enough.
Which may help explain why the promised productivity gains have yet to emerge.
Getting the best results from GenAI increasingly seems to require expertise in both the technology itself and the domain you are using it to help with.
The people most sceptical of these tools may therefore also be the people most needed to make them work.