Artificial Confidence

January 5, 2026

The State of AI, Through the Lens of Everything I've Gotten Wrong

I've been doing this long enough to have a track record. Let's look at it.

I'm Morgan Cross. Started on the mass spec team, ended up here. Long story. I've watched three hype cycles promise to change everything and deliver middle management dashboards. I've made predictions. Some of them were wrong. Some of them were right in ways that didn't help anyone.

Most AI newsletters will tell you what happened this week and what it means. I'm going to try something different. I'll show you how I think—the frameworks, the biases, the places I've been burned. By the end, you'll either trust my lens or you won't. Either way, you'll know what you're getting.

Let's start with the time I called the ceiling too early.


The Plateau That Wasn't

In 2022, I predicted transformers would plateau. The architecture had been around since 2017. The gains were slowing. Surely we were near the top.

I was watching the wrong metric.

The models kept getting better, sure. But that's not what proved me wrong. What I missed was that we hadn't figured out how to use them yet. ChatGPT dropped in November 2022 and hit 100 million users in two months, not because the model was dramatically better than GPT-3.5, but because someone finally built an interface people could understand. The capability was already there; it was the imagination that hadn't caught up yet.

Then the open-source explosion happened. Llama 1, Llama 2, Llama 3, each one closing the gap with proprietary models. If transformers were plateauing, why was everyone suddenly figuring out how to make them better with less compute, less data, less everything? Long context windows. Tool use. Actual agentic capabilities. This went far beyond "just better autocomplete," into qualitatively different use cases.

The ceiling I predicted was real in one sense: raw benchmark gains were slowing. But it was irrelevant, because deployment and tooling were just getting started. I was right about the pattern—the hype cycle is a metronome, we've seen this movie before—but wrong about the timing. In markets, too early is the same as wrong.

Here's where that leaves us: The models are better than I expected. The deployments are worse than the demos promised. The gap between those two things is where most of the interesting problems live.


The Hard Parts Are Still Hard

Now let me tell you about a time I was right, and why it didn't matter.

2021, GitHub Copilot launches. The discourse goes nuclear. "Coding is solved." "Junior devs are obsolete." "10x engineers are now 100x." I watched the demos like everyone else. Then I said something unpopular: "It's autocomplete with better PR. The hard parts are still hard."

I caught heat for that. Luddite. Not getting it. Missing the obvious future.

Here's what actually happened: Four years later, senior engineers spend half their time reviewing AI-generated pull requests from juniors who can't verify the code works. The tools are genuinely useful—I use them—but they moved the bottleneck, they didn't remove it. Architecture, debugging, knowing what to build in the first place: still hard. The easy parts got easier, and that created a new class of grunt work. Someone still has to babysit the confident machines.

The productivity gains went somewhere, just not where the pitch decks said they would. They went to the people who could already tell when code was wrong. Everyone else got faster at producing things they couldn't validate.

I was right about Copilot. Being right didn't help the copywriters who lost their freelance gigs because clients figured ChatGPT was close enough. Being right about productivity tools is cold comfort when you're the one being productivitied.

This is the state of AI coding tools in 2025: genuinely useful, genuinely overhyped, and nobody talks about who's actually capturing the value. The gap between demo and production is where the interesting problems live—and also where careers go to die.


The Pass I Keep Giving

Now for the uncomfortable part: the bias I know I have and haven't fixed.

I give Anthropic more benefit of the doubt than the others. I know it. I do it anyway.

Here's how I see the landscape: OpenAI ships first, explains later, walks it back quietly. The safety team was a PR department before the safety team quit. Google has infinite resources and finite follow-through; they'll kill this product in three years. Meta's open source strategy is a competitive weapon, not a principle—but I'll take the weights. Amazon is good at infrastructure and bad at product; Alexa is proof. Microsoft will win by default, not merit; Copilot is fine.

And Anthropic? They publish their reasoning. They move slower. They seem to actually believe the safety stuff.

"Seem to" is doing a lot of work in that sentence.

Is this pattern recognition or aesthetic preference dressed up as analysis? I genuinely don't know. The industry is opaque enough that even people paying close attention can't fully distinguish substance from performance. The information environment is bad. Everyone's running on vibes and incentive analysis. The honest observers admit it.

I'll keep giving Anthropic the pass until they prove me wrong. Safety announcements are quarterly theater—check back in 90 days. Maybe they'll disappoint me. Maybe the pattern will hold. I'll tell you either way.


What You Get If You Stick Around

I'm not going to predict the future. I'm going to show you how I watch it unfold.

When I'm wrong, you'll know why. When I'm right, you'll understand the reasoning well enough to apply it yourself. That's the value proposition: not hot takes, but a framework. Portable skepticism.

Most issues, I read the press releases so you don't have to. I find the human getting screwed behind the funding announcement. I say what everyone's thinking but won't post because they still need their jobs.

Hit reply if you've got a story. I read everything.

The takes are free. The snark is earned.

— Morgan

