The Quiet Problem With Hitting Accept
Why some people get sharper using AI, and others get further from their own work.
There’s a behaviour gap I’ve been noticing in how people use AI, and I haven’t seen anyone write about it properly. Some people use AI and come out of the session sharper, with work they can defend and a clearer sense of what’s strong and weak in it. Other people use the same tool on the same task and end up holding output they can’t really judge. The work might be perfectly fine. They just don’t know whether it is, because they didn’t engage with it deeply enough to tell.
The difference is in how they engage with the tool. The first group push back, edit, sometimes reject what the AI gives them. The second group hit accept and move on. Call that difference posture. It turns out to be one of the harder things to coach, much harder than tool fluency or prompt skill.
I’ve been noticing this for a while but couldn’t tell if it was just me. A study published last month suggests it isn’t.
The study
The research comes from Sarah Baldeo at Middlesex University, who ran a study with 1,923 adults across the US and Canada. Each was given ten simulated work tasks of the kind most knowledge workers actually do: planning under uncertainty, interpreting ambiguous data, articulating the reasoning behind a strategic call. Mainstream AI tools, real work patterns.
Afterwards, 58% of participants agreed that AI “did most of the thinking” to complete the work, especially on the planning and sequencing tasks. The same group reported reduced confidence in their own reasoning, weaker ownership of the resulting ideas, and a willingness to trade depth for speed. Participants who modified, challenged, or rejected AI suggestions instead reported the opposite. More confidence in their judgment. A stronger sense that the work was theirs.
“Generative AI can lead to cognitive decline or cognitive evolution…”
It’s worth saying the study can’t prove that AI is causing this, only that the two things show up together. Baldeo herself is careful about that. But the pattern is hard to argue with. Confidence and authorship don’t track AI usage volume. They track AI posture.
Measure more than seats and hours saved
Look at any AI rollout dashboard right now. Seats activated, messages sent, tokens consumed, hours saved. None of those numbers tell you whether anyone is actually thinking.
A team that hits accept on 95% of AI output produces the same usage data as a team that interrogates every draft and rewrites half of it. The dashboard treats the two teams as identical. Baldeo’s finding suggests they are heading in opposite directions.
The natural follow-up is "fine, let's measure pushback." It doesn't really work. There's no clean signal for "did the human modify the output." Edit distance between the AI draft and the final version is a proxy, but a noisy and gameable one. Time-to-submit is worse. Pushback happens in someone's head a few seconds before it shows up in a tool, if it shows up at all. You can't dashboard it.
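If you're tempted to try the proxy anyway, here's roughly what the thin version looks like. A minimal sketch in Python, assuming you can export matched draft/shipped pairs from your tooling, which most stacks don't make easy. The `acceptance_ratio` name is just illustrative, and the standard library's `difflib` similarity stands in for edit distance. Running it mostly demonstrates the weakness rather than fixing it.

```python
from difflib import SequenceMatcher

def acceptance_ratio(ai_draft: str, shipped: str) -> float:
    """Crude pushback proxy: 1.0 means shipped verbatim, lower means more editing.

    Says nothing about WHY the text changed, which is why it's gameable:
    a genuine challenge to the reasoning and a cosmetic reword can score
    identically.
    """
    return SequenceMatcher(None, ai_draft, shipped).ratio()

# Hypothetical draft/shipped pairs, exported from your own tooling.
pairs = [
    ("Ship the feature in Q3, then hire.", "Ship the feature in Q3, then hire."),
    ("Ship the feature in Q3, then hire.", "Hire first; the Q3 date is fiction."),
]

for draft, final in pairs:
    print(f"{acceptance_ratio(draft, final):.2f}")  # 1.00, then much lower
```

And even a clean version of this misses the point: the rejection that happens in someone's head, before anything is retyped, never reaches the export.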
Which leaves leaders in a slightly awkward spot. The thing that determines whether AI is making your team sharper or duller is the thing you can’t see on a screen.
So what can you actually do
You can design for the behaviour even when you can’t measure it.
The useful question shifts from “are people pushing back on AI” (unanswerable) to “have we built rituals that make pushback feel like the default” (answerable, and quietly actionable).
Three rituals worth building in:
1. A named human reviewer for any AI-assisted output before it ships. Their job is to put their name on the work and answer one question: what did you change? The act of looking is the part that tends to disappear without a forcing function, and a name on a doc is a surprisingly effective one.
2. Peer review focused on reasoning rather than format. Most review processes catch typos and tone, and AI output rarely has those problems. The thing worth catching is whether the logic actually holds, especially on planning and sequencing work, which is where Baldeo's finding is sharpest. Pair people up and ask them to argue with the reasoning rather than polish the prose.
3. "What did you reject this week" as a standing one-to-one question. Once a fortnight is plenty. It costs nothing, and it shifts what people pay attention to during the week, because they know they'll be asked. A teammate who can't answer has usually drifted into accept-mode without noticing, which is a coachable thing rather than a performance issue.
None of these need a budget. They don’t need any tool you don’t already have either, which is part of the appeal. They are cultural moves dressed up as small process changes, which in my experience is roughly the only kind that actually sticks.
What to do tomorrow morning
Pick one workflow on your team that puts out AI-assisted work regularly. Customer emails, planning docs, code reviews, internal updates, whatever has the highest volume. Just one of them.
Walk through that workflow yourself and find the moment where AI output gets approved or sent. In most workflows that moment is invisible. The work just flows through. There’s no point where someone explicitly stops to challenge the output, and that’s the gap.
Add one forcing function at that point. The lightest version is the named-reviewer rule from earlier, where whoever ships the work writes one line about what they changed before it goes out. Run that single change for a month, on that single workflow. Don’t roll it out wider. Don’t write it into policy. See what shifts, and let the people who experienced it tell other teams what changed for them.
You’re not testing whether your team is using AI properly. You’re testing whether the workflow gives them a reason to pause and think, which is a far easier problem to fix.
The real job
The job here is to make active engagement the easy path, not the disciplined one. If people have to remember to push back on AI output, most of them won’t. If the workflow makes pushing back the natural thing to do, most of them will. That’s the whole shift.
Optimise for usage volume and you’ll get usage volume. People hit accept faster, ship more, feel less. The dashboard goes green. The authorship problem stays silent in the background.
Optimise for active validation and the same tools start producing different humans on the other end. Sharper, more confident in their own reasoning, willing to put their name on the output without flinching, and frankly better company in a meeting.
The tools are basically the same for everyone now. Anyone can buy seats and run a rollout. The differentiator from here is what posture your people bring to those tools.
Almost nobody is designing for that yet.
If a colleague is running an AI rollout that’s measuring everything except the thing that matters, send this their way.
If you try one of these on your team this week, I'd genuinely like to hear what happened. Just reply to this email. See you next week.

Faisal
P.S. Know someone else who’d benefit from this? Share this issue with them.
Received this from a friend? Subscribe below.
The Atomic Builder is written by Faisal Shariff, Human Productivity Lead at Tomoro AI. Views are my own.
