First Principles First
Why the smartest AI strategy isn’t AI-first at all
If your AI were a coworker who made stuff up 30% of the time, you’d fire them.
So why are we deploying systems with that same reliability profile and expecting different results?
Kushal Chakrabarti posed this question in a piece Rand Fishkin shared, and it stopped me cold. He laid out the math behind what most of us have been feeling but couldn’t articulate.
The number that got me: a model can be 97% accurate but only 70% reliable.
That gap — between accuracy and reliability — explains almost everything broken about AI deployment right now.
The Reliability Problem
Kushal’s framing is elegant. Accuracy is hitting the bullseye on average. Reliability is hitting the same spot repeatedly.
The AI industry has been optimizing for accuracy. Leaderboards. Benchmarks. MMLU scores.
But economics demands reliability. And here’s where the math gets brutal.
If your AI is 90% reliable per step, after 10 steps you’re at 35% reliability. After 100 steps? Effectively zero.
Kushal calls this “demo magic, production tragic.” The model looks brilliant in a controlled test. Deploy it into a real workflow with multiple steps, and errors compound exponentially.
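The compounding is simple exponentiation. Here's a minimal sketch you can run yourself (it assumes each step fails independently, which real workflows only approximate):

```python
# Compounding reliability: if each step succeeds independently with
# probability p, an n-step workflow succeeds with probability p**n.
def workflow_reliability(p_step: float, n_steps: int) -> float:
    return p_step ** n_steps

print(workflow_reliability(0.90, 10))   # ~0.349 -> the 35% figure above
print(workflow_reliability(0.90, 100))  # ~2.7e-05 -> effectively zero

# One plausible reading of the 97%-accurate-but-70%-reliable gap:
# 97% per-step accuracy chained across a dozen steps.
print(workflow_reliability(0.97, 12))   # ~0.694
```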
This isn’t a rounding error. This is why AI investment isn’t moving the revenue number.
Where Trust Actually Breaks
In The Trust Stack, I argued that AI adoption fails across four layers:
Trust in the tool — Does this actually work?
Trust in yourself — Can I evaluate this output?
Trust in your team — Will my colleagues use good judgment?
Trust in leadership — Will my manager back my experiments?
Most organizations think they’re stuck on Layer 1. “The AI isn’t reliable enough.” “We’re waiting for the technology to mature.”
Kushal’s math validates that concern with a twist.
The reliability problem is real. A 70% reliable system in a multi-step workflow will fail. That’s arithmetic, not pessimism.
But here’s what I keep seeing: Organizations use Layer 1 as cover for Layers 2, 3, and 4.
They blame the tool when the real problem is that nobody trusts their own judgment to evaluate probabilistic outputs. Or teams won’t ship imperfect work. Or leadership punishes variance instead of rewarding learning velocity.
The reliability crisis and the trust crisis are the same crisis, seen from different angles.
One is technical. One is organizational. Both compound each other.
The Real Collision
We spent 30 years training managers to eliminate variance. Six Sigma. Process optimization. Playbooks. The entire management canon assumes a deterministic world.
LLMs don’t work that way. They’re probabilistic by architecture. You don’t get the answer. You get an answer — probably right, sometimes wrong, never exactly the same twice.
No amount of better tooling fixes a worldview problem.
Video by Sora (I’m the character). How do we trust systems that are inherently un-trustable?
What I Learned the Hard Way
When I led Third Door Media, we shipped AI across the board. Products like SearchBot and MartechBot. Internal tools for sales, marketing, editorial. A full suite built to accelerate how the team worked.
But here’s what I don’t talk about as much: most of it was lightly used. Not fully trusted.
It took months for the team to embrace these tools, and in some cases they never did. I'm still not entirely sure why. Was it output quality? Fear of job loss? Skepticism that AI could match their expertise?
Probably all three. But underneath all of it was trust.
Trust in the tools. Trust in their own judgment to evaluate outputs. Trust that leadership wouldn't punish experimentation. The same four layers I wrote about in The Trust Stack. I watched trust break down in real time, across every function.
In media, this cuts even deeper. Our contributors' bylines are their professional identity. We were early with an AI policy and guidance not because we had it figured out, but because we knew the trust stakes were existential. If writers couldn't trust how AI would be used (or not used), nothing else mattered. I know for a fact that many contributed pieces are deeply influenced by AI, and in some cases written by it. That's not a violation or an issue; it's a fact. We live in a human+machine world.
First Principles First
The pattern holds in every conversation I’m having now. And boy, do I return to the fundamentals over and over again.
Organizations that start with fundamentals (clear outcomes, trusted judgment, short feedback loops) tend to deliver and ship. Organizations that start with “how do we use AI?” tend to stay stuck.
The responsible approach to AI isn’t AI-first. It’s fundamentals-first.
Not because AI doesn’t work. It does. Spectacularly, in fact, in the right contexts.
But because probabilistic tools require deterministic foundations.
When your outputs are uncertain, your judgment has to be certain. When your systems can fail in unpredictable ways, your first principles have to be solid.
What are we trying to accomplish? How will we know if the output is good enough? What’s our fallback when it isn’t?
These questions existed before AI. They’re just more urgent now.
What This Means for Leaders
If you’re a CxO feeling the pressure to “do something with AI,” here’s what I’d offer:
Stop asking “How do we use AI?”
Start asking:
What are we actually trying to accomplish?
Where are our fundamentals solid enough to build on?
Where is our judgment strong enough to evaluate probabilistic outputs?
Where do we trust ourselves, our teams, and our leadership enough to ship imperfect work and iterate?
The math demands this shift. The answer isn't waiting for 99.9% reliable models. It's designing workflows that verify, decompose, and contain failures.
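To see the leverage, here's a toy sketch (my own construction, not Kushal's method, and it assumes a verifier that catches every bad output, which is the genuinely hard part): decompose the work into steps you can check, and retry the steps that fail. Verification changes the base of the exponent, and that's where reliability actually comes from.

```python
# Toy model: decompose a workflow into verifiable steps and retry failures.
# Assumes a perfect verifier and independent attempts -- strong assumptions;
# building trustworthy checks is the real engineering work.
def step_reliability(p_step: float, retries: int) -> float:
    # Probability that at least one of (1 + retries) attempts succeeds.
    return 1 - (1 - p_step) ** (1 + retries)

def workflow_reliability(p_step: float, n_steps: int, retries: int = 0) -> float:
    return step_reliability(p_step, retries) ** n_steps

print(workflow_reliability(0.90, 100))             # ~0.000027: hopeless
print(workflow_reliability(0.90, 100, retries=2))  # ~0.905: workable
```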
That’s not an AI problem. That’s a first principles problem.
AI doesn’t reduce the need for fundamentals. It amplifies it.
What should we trust? Certainly not generative AI, which is probabilistic by design (it's a feature, not a bug!). Trust your fundamentals. Trust your judgment. Trust that you can evaluate outputs and iterate.
That’s where reliability actually lives.
Your playbook is broken. First principles are what comes next.
This piece builds on The Trust Stack and was deeply informed by Kushal Chakrabarti’s excellent The 9s of AI Reliability. If you’re thinking about AI deployment, both are worth your time.

