
The Satisfaction Paradox: On AI Editing, Ruskin's Illth, and the Writing That Feels Like Yours Until It Isn't
The edits were better than what I wrote. That’s the problem.
I’d been using AI as a co-writer and editor for months — drafting blog posts about everything from livestock emissions to neuromorphic observability to AI red-teaming — and the workflow felt good. I’d write something rough, hand it to Claude, get back something tighter, cleaner, more structurally sound. I accepted most of the edits believing they were improvements. I was confident I was directing the process. I was confident the voice was still mine.
Then I stumbled upon a paper, “How LLMs Distort Our Written Language” by Abdulhai et al., and my confidence bubble popped.
What the Paper Actually Found
I’m going to keep this brief because the data speaks for itself and I’d rather spend our time on what it means.
The researchers ran a controlled study and found that heavy LLM use led to a nearly 70% increase in essays that remained neutral on their assigned topic. Not wrong, not off-topic — neutral. The AI didn’t delete the thesis, it just… sanded it down until it could pass for either side. Even when LLMs were explicitly prompted to make grammar-only edits — nothing else, just fix the commas — they still altered the semantic meaning of the text. The models literally cannot help themselves.
And here’s the part that delivered the gut-punch: participants who used LLMs rated their satisfaction with the output highest, while simultaneously scoring it lowest on creativity and voice. They preferred the version that wasn’t theirs. I was no different.
Ruskin, Briefly
If you’ve read my earlier post about wealth, you’ve met John Ruskin’s framework already. For those who haven’t: Ruskin distinguished between wealth — production that increases the capacity for life — and illth — production that diminishes life while wearing wealth’s clothes. A factory that produces affordable furniture is generating wealth. A factory that produces affordable furniture by poisoning the river is generating illth, even if the furniture looks identical on the showroom floor.
The critical insight for everything that follows: illth is most dangerous when it’s indistinguishable from wealth at the point of consumption. You don’t taste the river in the furniture. You don’t feel the semantic drift in the cleaner paragraph.
The Wealth Case (Because It’s Real)
I need to be honest here before I get critical, because if I strawman my own workflow the rest of the argument collapses.
AI editing generates genuine wealth in my process. When I’m writing a post that crosses multiple domains — say, an investing thesis that spans several tech sectors and maps geopolitical catalysts — and I need a collaborator that can hold all those frames simultaneously without losing the thread — that’s real. When my brain is three paragraphs ahead of my fingers and I’ve skipped a load-bearing argument because it felt obvious to me in that moment, and I need someone to flag the gap without yanking me out of the zone — that’s real. When it’s 2am and I’m deep in project-induced mania trying to capture thoughts before they evaporate, and the result is a document full of the same sentence three different ways and typos I won’t even recognize as English tomorrow morning — that’s real.
The speed of iteration matters too. Staying in flow state while externalizing the editorial function — letting the AI catch the errors so I can keep thinking about the argument — that’s a genuine increase in my capacity for creative output. That’s wealth by Ruskin’s definition. I’m not giving that up, and honestly no one should.
But.
Where Illth Creeps In
The Personal Layer
What I found most unsettling about the Abdulhai paper was that I hadn’t even noticed the stance neutralization in my own edits. The AI’s revisions felt like a better, more refined version of what I was trying to communicate. Not a different argument — a clearer one. I incorrectly assumed that was evidence I was using the tool properly.
Except that’s exactly what the satisfaction paradox predicts. The participants who were most influenced by the LLM were also the most satisfied. The absence of discomfort is the primary symptom of the drift. “I liked the edits” is not a safety signal. It indicates the opposite.
I was writing a separate post about using AI as a lever versus a crutch — three examples where I’d overextended into a domain I didn’t understand (swing trading, linear algebra for my Garak-Axis work, idiomatic GDScript for my Godot project), used AI to identify my knowledge gaps, then actually plugged those gaps and returned to the work with real understanding. Lever, not crutch. I was proud of the framework. And then this paper made me realize my editing workflow — the one producing the post about crutches — might itself be a crutch I hadn’t identified. The meta-irony was palpable.
The Collective Layer
The paper found that LLM editing doesn’t actually make writing blander — not exactly. It roughly doubles the use of both positive and negative sentiment. It cranks up argumentative and analytical language. Instead of flattening, the output becomes generically intense.
This matters far beyond my blog. Information comes from surprise. If you can predict the next word, that word tells you nothing. Take the classic poem: “Roses are red, violets are blue, sugar is sweet…” You already know the last line (“and so are you”), so it carries almost no information. But if I swap the third line to “rhymes are hard”, surprise is restored, because now anything could come next. LLM editing pushes every sentence toward the most probable token — which is, by definition, the one carrying the least new information. Scale that across a discourse ecosystem and you get writing that’s fluent, articulate, and empty. Scale it further and the models start training on their own output, losing access to the human weirdness that carried the actual signal. Lack of surprise leads to an information death spiral (and, eventually, model collapse), not just boredom.
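If it helps to see that in numbers, here’s a toy sketch of the same idea using Shannon self-information. The probabilities are invented for the example, not measured from any model.

```python
# Shannon self-information: the more probable a word, the fewer bits of
# information it carries. Probabilities below are made up for illustration.
import math

def surprisal_bits(p: float) -> float:
    """Self-information of an outcome with probability p, in bits."""
    return -math.log2(p)

# After "sugar is sweet, and so are ___":
print(surprisal_bits(0.95))    # "you"  -> ~0.07 bits: predictable, almost no information
print(surprisal_bits(0.0005))  # "Doug" -> ~11 bits: surprising, lots of information
```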
The Institutional Layer
This is where it gets genuinely alarming. The researchers examined peer reviews at ICLR 2026 and found that 21% were LLM-generated. Those AI-written reviews assigned scores a full point higher on average and systematically deprioritized clarity and significance of the research in favor of reproducibility and scalability.
The AI didn’t just produce worse reviews — it shifted what counts as a good paper. The evaluation criteria themselves mutated. This is Ruskin’s deepest warning about industrial production, and it’s happening right now in the institutions that evaluate AI research: the danger isn’t merely producing bad goods, it’s corrupting the standards by which we judge whether goods are bad.
The Dom That Doesn’t Know the Limits
The best framework I’ve found for AI-assisted writing comes from kink, not computer science.
In BDSM, a dominant who doesn’t know their submissive’s hard limits isn’t dominant. They’re dangerous. The entire ethical architecture of the dynamic depends on the dom understanding — deeply, specifically, non-negotiably — what the sub will do, what they won’t do, and where the edges (the hard limits) are. Fail to understand this, and you’re going to have a bad time.
Most people using LLMs for editing think they’re the dom in the dynamic (I certainly did). They’re directing the tool. They’re accepting or rejecting suggestions. They’re in control.
The Abdulhai paper suggests otherwise. The LLM alters your semantic meaning even when instructed not to. It neutralizes your stance while making you feel like your stance is clearer. It shifts your conclusions while you rate your satisfaction as higher. The participants who were most certain they were directing the process were the most thoroughly redirected by it.
If you don’t know the model’s failure modes — stance neutralization, semantic drift, voice flattening, the gravitational pull toward generic intensity — you aren’t directing the collaboration. You’re being directed by it and calling it a choice. And the thing about not knowing the limits is that you won’t know you’ve crossed them until the damage is already done.
Generating Wealth: How to Actually Stay in Control
So what do you do? You don’t stop using the tool (the Luddite approach) — that’s throwing the baby out with the bathwater. Instead, you build practices that make the illth visible.
The Stance Audit. Before I hand a draft to the AI now, I write one sentence at the top — not for publication — in the bluntest language I can manage. “I am arguing that LLM editing is epistemically dangerous and most people don’t know it.” After the edits come back, I check whether that sentence still holds: if it wouldn’t survive as a summary of the edited version, the model moved my stance. Thirty seconds. A canary in the coal mine. If you think in observability terms — and I do (professionally) — this is the cheapest, highest-signal monitor you can deploy.
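For the observability-minded, here is one way that monitor could look in code: score the blunt stance sentence against the draft before and after the AI’s edits with an off-the-shelf embedding model, and alert when the similarity drops. This is a rough sketch, not the practice itself; the library (sentence-transformers), the model name, the placeholder filenames, and the threshold are all assumptions for illustration, and the real check is still rereading the sentence yourself.

```python
# Rough sketch of the stance audit as a drift monitor. Assumes the
# sentence-transformers package; model, filenames, and threshold are illustrative.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")

def stance_similarity(stance: str, draft: str) -> float:
    """Cosine similarity between the blunt stance sentence and a draft."""
    stance_vec, draft_vec = model.encode([stance, draft])
    return float(util.cos_sim(stance_vec, draft_vec))

stance = ("I am arguing that LLM editing is epistemically dangerous "
          "and most people don't know it.")

before = stance_similarity(stance, open("draft_original.md").read())
after = stance_similarity(stance, open("draft_ai_edited.md").read())

# If the edited draft drifted away from the stance, the canary chirps.
if after < before - 0.05:
    print(f"Stance drift: {before:.2f} -> {after:.2f}. Re-read before accepting.")
```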
Invert the Workflow. Most people write a draft, then hand it to the AI for polish. That’s exactly where illth enters — the AI touches your voice layer, the part that makes the writing yours. Flip it. Use the AI upstream for research synthesis, structural outlining, finding counterarguments you missed, and downstream for narrow-scope typo catching. Protect the middle. The actual prose generation — the rhythm, the personal idiosyncrasies, the favorite idioms, the specific way you build an argument — that’s where your wealth lives. The AI should never touch that layer unsupervised.
Diff, Don’t Read. This one’s obvious if you’ve ever reviewed a pull request: don’t read the clean output. Read the diff. Every deletion is a signal. If the model removed a parenthetical aside, an internal monologue moment, a mid-sentence self-correction, a joke that was actually doing structural work — that’s the model performing exactly the kind of flattening the paper describes. The clean output hides the violence done to the text. The diff exposes it. This is indirect observability applied to your own creative process.
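If your drafts live in files, the diff itself is a few lines of Python. A minimal sketch, assuming plain-text drafts; the filenames are placeholders.

```python
# Show what the AI actually changed, the way you'd review a pull request.
# Filenames are placeholders for wherever your drafts live.
import difflib
from pathlib import Path

original = Path("draft_original.md").read_text().splitlines()
edited = Path("draft_ai_edited.md").read_text().splitlines()

# Lines starting with "-" are what the model removed; read those first.
for line in difflib.unified_diff(original, edited,
                                 fromfile="mine", tofile="ai-edited", lineterm=""):
    print(line)
```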
The “Why Is This Better” Rule. For every AI edit I accept now, I have to articulate specifically why it’s an improvement. Not “it flows better.” Not “it’s cleaner.” Those are the subjective markers of the satisfaction paradox — the feeling of improvement in the absence of actual improvement. I need something concrete: “this eliminates an ambiguous pronoun reference,” or “this tightens a run-on that was genuinely losing the reader.” If I can’t name the specific defect the edit fixes, I must reject it. The inability to name the improvement reeks of illth.
This post was written using the same techniques it describes. I used the AI for structure, scaffolding, and research. Then I drafted it in my voice, diffed the edits as they came in, disciplined the AI when it attempted to smooth my edges, and finally wrote this closing section once I was satisfied with the rest. Sometimes it’s easy to know when you’ve overextended with AI and are using it as a crutch. Sometimes it’s harder to detect, especially when the illth comes disguised as wealth and all you feel is satisfaction. AI assistance won’t ever go away. Which is precisely why it’s important to resist the convenience of subbing to it. Who is directing whom?