
The Water's Fine

You know the one about the frog? Of course you do. Drop it in hot water and it jumps out. Put it in cool water and slowly turn up the heat, and it never notices the danger until it gets boiled. With a smile on its face. Except it's not true (frogs *do* jump out), and that whole story is nothing if not psychopathic. I don't want to meet the human who came up with it.

The water is very nice right now. In fact, I think I want to turn the thermostat up, just a bit--I'm sure I can handle more!

TL;DR

It's not that the AI is generating obvious nonsense. It's that it's generating plausible-sounding nonsense that encodes misconceptions, and those misconceptions are invisible to anyone without specialized knowledge.

"It ain't what you don't know that gets you into trouble. It's what you know for sure that just ain't so."

— Mark Twain (possibly apocryphal, but widely attributed)

The Subtle Problem

Let's talk about AI failures. By now, we've beaten the hallucination horse half to death. Remember that poor hapless lawyer who submitted ChatGPT's hallucinated case citations to federal court? Cost him $5,000 and his professional reputation when the judge discovered none of the cases existed. We now think these failures are visible, correctable, and increasingly well-documented. Well, they're documented. And not always correctable. And not always visible.

Let's dive a layer deeper. There's a more insidious class of errors that almost nobody is talking about: the baseline assumptions that feel so normal, so consistent with common discourse, that neither the AI nor the human notices them at all.

I caught one recently in conversation with an AI assistant. We were discussing hardware isolation for AI workloads, and the assistant casually suggested that keeping sensitive work separate from personal devices was a very good idea. It mentioned that a daily driver laptop "with medical records and patient data" should be isolated from AI agent experiments. No, I will not name the agent.

The assumption would easily fly by in the moment: of course a healthcare professional might have patient health information on their personal laptop. Unless you *are* a healthcare professional, and the idea raises your hackles so high you yell at the computer. It matched the general vibe of how people who don't deal with medical data daily talk about sensitive data. It "felt" plausible.

It's also utterly and completely wrong.

Healthcare professionals don't keep Protected Health Information on personal devices. HIPAA compliance requires PHI to live in controlled, audited, encrypted systems—EMRs behind VPNs, hospital networks with access controls, systems that log every interaction. A doctor's personal laptop doesn't have patient records on it for the same reason a bank teller doesn't take cash home in their backpack: that's not how this works. That's not how any of this works.

The Amplification Loop

The frog warming begins.

The AI's assumption came from its training data—internet discourse where people casually conflate "sensitive data" with "data that might be on someone's work laptop." Maybe it learned from TV medical dramas where doctors pull up patient files on iPads at home. Maybe it absorbed blog posts from people who don't work in healthcare describing how they imagine it works.

The AI agent generated that answer from weights trained on Facebook and Twitter. Sorry, X (groan). That means the zeitgeist on the Internet (sorry, the "training corpus") is exactly that: most people think that's exactly how doctors work. That they have a "work" folder on their laptop where they keep their notes. And pictures of their patients' warts.

The AI synthesized the sentence from a thousand small signals in its training corpus that all pointed toward this being a normal, utterly unremarkable claim.

The AI states the assumption. The user—lacking domain expertise—accepts it as reasonable. They might even repeat it, cite it, maybe write it down. Then, that interaction becomes the training data for the next generation of models. The assumption gets encoded deeper. The misconception compounds.
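The loop above can be sketched as a toy simulation. To be clear, the numbers here are invented for illustration: the recycling fraction, the amplification factor, and the starting prevalence are assumptions, not measurements. The point is only the shape of the dynamic: if model output re-enters the corpus and the model slightly overproduces the "plausible" claim, the misconception's share of the corpus drifts upward with each generation.

```python
# Toy model (illustration only, not from the article): how a plausible
# misconception can compound when model outputs are recycled as training
# data for the next generation. All parameter values are made up.

def next_prevalence(p, recycled=0.3, amplification=1.1):
    """Advance one training generation.

    p             -- fraction of the corpus encoding the misconception
    recycled      -- fraction of the next corpus that is model output
    amplification -- models echo the statistical norm a bit louder
    """
    model_rate = min(1.0, p * amplification)  # what the model emits
    # Next corpus = surviving human text + recycled model output.
    return (1 - recycled) * p + recycled * model_rate

p = 0.10  # assume 10% of human text casually conflates "sensitive" with "on a laptop"
for gen in range(10):
    p = next_prevalence(p)
    print(f"generation {gen + 1}: {p:.1%}")
```

With these (again, invented) parameters, each generation multiplies the prevalence by about 1.03: no single step looks alarming, which is exactly the frog-in-the-pot quality of the drift.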

This isn't a hallucination. The AI didn't claim a specific doctor keeps specific patient records on a specific device. It made a softer claim, one that matched the statistical distribution of internet discourse. And that's exactly what makes it dangerous: it's almost right. Right enough to slip past most filters.

Why Start Caring Now

We're in a moment where AI is becoming infrastructure. Not a novelty, not an experiment, but infrastructure. The number of zeros we're throwing at building data centers is obscene. Lots of zeros. Yeah, like all new technology, it's cats and porn at first, but more and more, people are using AI assistants to draft emails, research medical conditions, make financial decisions, teach their children, and automate workflows. The outputs aren't always fact-checked, because they feel right: they match existing intuitions, they align with what "everybody knows." Or, much, much worse, they generate no feeling at all. No reason to question.

Except sometimes what everybody knows is wrong.

And when the AI's outputs start shaping what people first think and later believe, and those thoughts and beliefs feed back into the training data for future models, we get epistemological drift. Not dramatic. Not obvious. Just a slow, steady movement away from ground truth toward a kind of consensus reality that sounds right but increasingly diverges from how things actually work.

In my previous article on data exhaust, I wrote about how AI-generated content is polluting the training corpus—how the outputs of one generation of models become the inputs for the next, creating a closed loop that amplifies errors and degrades signal. This assumption problem is the same phenomenon, but subtler. It's not that the AI is generating obvious nonsense. It's that it's generating plausible-sounding nonsense that encodes misconceptions, and those misconceptions are invisible to anyone without specialized knowledge.

The Human Antidote

Sorry, but here come the acronyms.

HITL: Stands for Human-in-the-Loop. It refers to the collaborative process where humans actively participate in training, evaluating, or refining AI systems.

RLHF: Stands for Reinforcement Learning from Human Feedback. It's a method used to fine-tune AI models (like large language models) using human judgments to improve alignment with human preferences.

These work just fine when:

  1. The human has relevant expertise
  2. The human is paying attention
  3. The human bothers to correct the AI
  4. The correction somehow makes it back into the training loop
  5. Most importantly, the human is aware of their own prejudices. Yeah, I haven't met those humans either. Let me know when you run across one.

I'd be happy if we bat a couple of those five. In practice, the assumption goes unchallenged. The user internalizes it. The misconception spreads.

What You Can Should Absolutely, Drop-dead, Must Do

This isn't a call for panic. It's a call for awareness.

The next time you talk to your AI (have you given it a name yet? Did you let it choose its own name? Is it an old dead pet name? I want to know...), whether you're asking it to draft an email, explain a concept, or help you think through a problem, pay attention to the assumptions embedded in its responses. Not just the facts it states, but the framing it uses, the baseline it assumes, the "common knowledge" it takes for granted.

Ask yourself: Does this match reality, or does it match what people say about reality on the internet? Those two things are not the same.

If you have expertise in a domain and you catch the AI making faulty assumptions, correct it. Yeah, it may forget by tomorrow morning, but correct it anyway. Their memory is getting better with every week of progress (I have memory markdown files all over my directories; they're like lice). Not because it will fix that particular model; it won't. And not because you might help it catch framing errors at inference time. But because building the habit of scrutiny matters to you, not to it.

Train yourself to catch framing errors, not it.

If we're going to use these tools as infrastructure, we need to develop the reflexes to catch when they're subtly, plausibly, invisibly wrong.

Still smiling? How's the water now?

Epilogue

This article was written with AI assistance, of course. I eat my own dog food. Claude made six distinct errors during our conversation, including confidently stating my last name was "Hay" (it's Bilca), suggesting I publish Tuesday-Thursday (I publish Sundays), and later misremembering our own conversation. It was like talking to a very confident pot-head, with ironically impeccable grammar.

Next in series →
The Distributed Mind
Why AI alignment requires architectural decentralization. An exploration of concentration risk, model collapse, and the SNN alternative.