AI Writes Better Code Than You. It Still Cannot Do Your Job.

Pawel·Mar 6, 2026·14 min read

Have you noticed something odd? The best engineers you know probably spend most of their time not writing code. They sit in meetings arguing about what to build. They draw boxes on whiteboards and erase them. They say "it depends" more often than they say "let me implement that." And nobody fires them for it. In fact, they get promoted.

Maybe that is a clue.

Boris Cherny, who created Claude Code, recently described his month: "The last month was my first as an engineer when I didn't open an IDE at all. Opus 4.5 wrote around 200 PRs, every single line." Not as a prediction about the future. As a description of his Tuesday. If the person who built one of the most capable coding agents no longer opens an editor, perhaps "experienced software developer" and "person who writes code" are not the same thing. And perhaps they never were.

I have spent 20+ years writing software and the last several hiring people who do. Here is what I have noticed: the parts of the job that are harder to replace are the parts where code is only the visible surface of a much messier system. Understanding that difference is, perhaps, the entire game.

What Does an Experienced Developer Do That Looks Like Nothing?

The decisions nobody documents

In real projects, the problem is rarely "implement X." The problem is "what should X even be, given business constraints, legal risk, legacy architecture, deadlines, politics, users, and future maintenance?" Have you ever noticed how rarely a Jira ticket captures the actual work?

AI can propose options. It cannot own consequences. Notice the framing people use when they describe working with AI agents: "I tell it what to do, and it does it." The telling is still yours. The judgment about what to tell it is still yours. The responsibility when the thing you told it to do turns out to be wrong is, perhaps unsurprisingly, also still yours.

I built HintCraft with this exact dynamic. Multiple times per month, the technically correct decision was the wrong business decision. No model would have made the call I made, because no model understood why the billing system could not be touched until Q3, or why the feature that seemed simple would break a partnership agreement nobody had documented. The codebase does not contain the reason. The reason lives in context.

AI can process what happened. It cannot understand why it happened in the way that someone who was in the room can. Perhaps the most undervalued skill in software engineering is simply having been there long enough to know where the bodies are buried.

The knowledge that lives in scars, not in wikis

Consider what a strong developer actually accumulates over a decade: which subsystem is fragile, which stakeholder always changes requirements the week before launch, which migration looks easy but historically breaks billing, which shortcut will create six months of pain that nobody will trace back to this moment.

None of this is in the codebase. None of it is in the wiki. It lives in the heads of people who have been around long enough to have the scars. METR's research on AI task completion found that current frontier models can only reliably complete tasks that would take a human a few minutes, despite excelling at narrow benchmarks. Your context window lasts a career. The model's context window lasts a conversation. Anthropic explicitly describes cross-session continuity as an open problem. Maybe that gap is smaller than it sounds. Maybe it is not.

The question of accountability

A senior engineer can be made accountable in a way AI cannot. They sign off on architecture, approve production changes, take pager duty, explain incidents to leadership, and know when not to ship. AI can assist with all of this. It does not bear professional, legal, or social responsibility.

In high-stakes systems, "who is accountable if this fails?" has to have a human answer. This is not a technical limitation that a better model will solve. It is a civilizational choice. And it is, perhaps, the least discussed reason that experienced developers are not going anywhere.

The art of deciding what not to build

Simon Willison, one of the most thoughtful voices on AI and development, has a line I keep coming back to: "The one thing you absolutely cannot outsource to the machine is testing that the code actually works." But I would go further. You also cannot outsource the decision about whether the code should exist at all.

Experienced developers add the most value before any code is written. They simplify the problem, reject unnecessary scope, identify hidden dependencies, and redesign the work so implementation becomes almost boring. The less interesting the implementation, the better the framing was. AI is strong once the task is well-scoped. Humans are better at deciding how the task should be scoped in the first place. METR explicitly notes that common coding benchmarks may overestimate real-world capability because they sacrifice realism. Benchmark tasks are clean and self-contained. Have you ever worked on a codebase that was clean and self-contained? Neither have I.

When preparing interview stories, perhaps focus less on what you built and more on the decision that came before it. The moment you scoped the work, rejected an approach, or redefined the problem entirely. That is usually the most interesting part, and interviewers know it.

Getting humans to agree on something

Real software work involves persuasion, negotiation, conflict resolution, mentoring, and the delicate art of getting a PM, a designer, a security lead, an infra engineer, and an executive to converge on a plan that nobody loves but everyone can live with. Calling this a "soft skill" is like calling the foundation of a building "optional architecture." The building does not care what you call it. It falls down without it.

AI can draft arguments. It does not carry reputation, credibility, or the memory of that time you stayed late to help someone's deploy go smoothly. That memory is currency. AI has no wallet.

Taste

This sounds vague. It is not. Good developers develop taste: for APIs, abstractions, naming, error handling, system boundaries, the feeling that something is slightly wrong with a design even before you can articulate why. AI can imitate common patterns, but it averages existing practice rather than exercising judgment about when the average is wrong. Sometimes averaging is fine. Sometimes it produces code that is plausible, passes review, and quietly makes everything worse. Knowing the difference is, itself, taste.
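One way to make "taste" concrete is a pair of functions that both compile, both pass review, and differ only in judgment. This is a hypothetical sketch, not from any real codebase; the names and the config example are invented for illustration.

```python
# Two ways to read a config value. Both "work"; only one was designed.

# The averaged-practice version: swallow the failure, return a default.
# Plausible, common, and it silently hides typos in config keys.
def get_setting(config: dict, key: str):
    try:
        return config[key]
    except KeyError:
        return None  # caller cannot tell "unset" from "misspelled"

# The deliberate version: make the failure mode part of the API.
_MISSING = object()  # sentinel so callers can opt in to a default

def get_setting_strict(config: dict, key: str, default=_MISSING):
    if key in config:
        return config[key]
    if default is _MISSING:
        raise KeyError(f"unknown setting {key!r}; known: {sorted(config)}")
    return default

cfg = {"timeout": 30}
assert get_setting(cfg, "timout") is None          # typo vanishes quietly
assert get_setting_strict(cfg, "retries", default=3) == 3
```

The first function averages existing practice. The second encodes a judgment about which failures should be loud. Nothing in the type signature tells you which is right; that is the part taste decides.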

Catching the machine

Here is a paradox worth thinking about. As AI gets better at writing code, the skill of catching its mistakes becomes more valuable, not less. The worst AI bugs are not the ones that obviously break. They are the ones that look correct, pass tests, and quietly introduce a behavior nobody specified.
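Here is what that failure mode can look like in practice. The example below is hypothetical, invented for illustration: the function reads correctly, passes the obvious test, and still does something nobody specified.

```python
# A function that "looks correct": strip the scheme from a URL.
# str.strip() removes any of the given CHARACTERS from both ends,
# not the prefix string -- a classic plausible-but-wrong pattern.

def strip_scheme(url: str) -> str:
    return url.strip("https://")  # BUG: strips a character set, not a prefix

def strip_scheme_fixed(url: str) -> str:
    # Correct: remove the scheme as a literal prefix
    for prefix in ("https://", "http://"):
        if url.startswith(prefix):
            return url[len(prefix):]
    return url

# The obvious test passes for both versions:
assert strip_scheme("https://example.com") == "example.com"
assert strip_scheme_fixed("https://example.com") == "example.com"

# But the buggy version quietly corrupts other inputs:
assert strip_scheme("https://shop.example.com") == "op.example.com"  # wrong!
assert strip_scheme_fixed("https://shop.example.com") == "shop.example.com"
```

The buggy version would sail through a review and a happy-path test suite. Catching it requires knowing what the author meant, not just what the code does, which is exactly the judgment the paradox is about.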

Willison describes LLMs as "an over-confident pair programming assistant" requiring constant human judgment about correctness, relevance, and architectural decisions. Maybe that is an unflattering description. Maybe it is the most accurate one available.

What You Think Protects You Might Not

Now the part that is less comfortable.

Andrej Karpathy, who cofounded OpenAI and understands this technology better than almost anyone, chose an interesting word for what is happening: "The profession is being dramatically refactored as the bits contributed by the programmer are increasingly sparse." Not eliminated. Refactored. Maybe that word choice matters more than it first appears.

If the person who built Claude Code does not open an IDE, perhaps it is worth asking: what exactly is he doing all day?

The answer: judgment, direction, taste, accountability, problem framing. The things at the top of this article. The things that look like nothing until they are missing.

Here is an incomplete list of things that feel like professional advantages but may not age well: writing CRUD endpoints, remembering framework syntax, basic debugging, writing unit tests for well-defined functions, implementing standard integrations, being the fastest typist in React on your team.

A medieval scribe had beautiful handwriting. The printing press was not impressed. Getting good at something and getting good at something that stays valuable are, it turns out, different skills.

Experience by itself is not a moat. Only useful experience is. A developer with 15 years of repetitive implementation work but weak judgment, weak systems thinking, and weak ownership is, perhaps, more replaceable than any of us want to admit. The question is not how long you have been coding. It is what you learned while you were doing it.

Some people will disagree with this framing. There is a reasonable argument that the handoff to AI is slower and messier than the hype suggests, and that many developers are over-extrapolating from impressive demos. That may be right. The timeline is genuinely uncertain. But the direction is not. Even the skeptics are updating their positions as they actually use the tools on real work.

How to Show This in 60 Minutes

The gap between what you know and what you can communicate in an interview is a preparation problem, not a value problem. Preparation problems have solutions.

When you made the call nobody else would have

Think of a time you made the right decision when the data was incomplete. What the situation was, what the data suggested, why it was not enough, what you decided and why, what happened.

BEFORE

'I have strong decision-making skills and can handle ambiguity well.'

AFTER

'Our data showed we were ready to launch, but I had seen a similar architecture fail under peak load at my previous company. I pushed for two more weeks of stress testing. We found a connection pooling issue that would have caused outages on day one.'

The first is a claim. The second is a story. Have you ever noticed that interviewers cannot evaluate claims? They can only evaluate evidence. Maybe that distinction is worth remembering.

When you knew how the movie ended

Every experienced developer has a moment where they looked at a codebase, a proposal, or a team dynamic and thought: "I know how this ends." Not because they ran an analysis. Because they had watched this exact movie before, at a different company, with a different cast, and they remembered the ending.

That is perhaps the most valuable thing experience gives you. Describe it. The risk nobody flagged. The migration that looked simple but you knew would break billing because it broke billing the last time someone tried it, three jobs ago. AI has a training set. You have a career. These are not the same library.

Ask yourself: "When was the last time I said 'I have seen this before' and turned out to be right?" That moment is your story. It demonstrates something no tool could have produced from the available data alone.

When trust was the actual deliverable

Think of an outcome that only happened because of trust you built over time. Maybe you resolved a conflict because both sides trusted you. Maybe you got buy-in for a difficult change because of credibility you built over years.

No algorithm has ever bought someone coffee to rebuild a working relationship after a difficult quarter. That is not a limitation of current models. It is a limitation of the category. Perhaps some things require a pulse.

BEFORE

'I am a strong collaborator who builds relationships across teams.'

AFTER

'I almost sent the migration plan as a company-wide email. Kate stopped me: "Did you talk to Sales?" I had not. It turned out their VP had real reasons to be angry, reasons I would have missed entirely from the engineering side. After two weeks of actually listening to his team, the plan was different, and technically worse, but he co-signed it. Sometimes the right architecture is the one people will actually use.'

Where Experience Lives Now

Jaana Dogan, a principal engineer at Google, offered what might be the most useful sentence anyone has said about this: "If you are skeptical of coding agents, try it on a domain you are already an expert in." Think about what that implies. AI is most useful when you know enough to direct it and catch its mistakes. It is most dangerous when you do not. Your expertise is not competing with AI. It is the thing that makes AI useful.

Nobody knows what this profession will look like even in the near term. Anyone who claims to is selling something. But right now, the developers who are doing well treat AI the way a good architect treats a construction crew: essential for execution, useless for deciding what to build. You do not hire an architect because they can lay bricks faster. You hire them because they know which walls are load-bearing. Whether that metaphor still works tomorrow is genuinely anyone's guess.

In an interview, this means your stories should not be about what you built. They should be about why you built it that way, what you decided not to build, and what would have gone wrong if someone less experienced had been making the calls.

HintCraft helps you find and articulate the stories where your judgment made the difference.

AI is eroding the value of typing. It is not eroding the value of thinking. If you have spent your career learning to think well about software, the supply of good judgment has not increased, even as the supply of good code has become nearly infinite. That is, perhaps, an interesting market position to be in.

The only catch is that you have 60 minutes to make it visible to a stranger. That is not a talent problem. It is a preparation problem. And preparation problems, unlike judgment, can be solved quickly.

Prepare like it matters

AI-powered interview prep, 1,300+ curated remote companies, and a complete system for your job search.

WRITTEN BY

Pawel

Co-founder, Staff Frontend Engineer

Senior full-stack developer and professional interviewee. 20+ years in tech. Built HintCraft to solve the problem he kept seeing.