I do regularly read the code that Claude outputs. And about 25% of the time the ...

thunky · 2026-03-28T12:12:06 1774699926

> I do regularly read the code that Claude outputs

You probably could have s/Claude/Human/ in your rant and been just as accurate. I don't know how many times I've flagged these issues in code reviews. And that's only assuming the human even bothered to write tests...

What I find is that when I ask AI to write tests it writes too many, and I agree with you that a lot of them are useless. But then I just tell it that, and it agrees with me and cleans it up. Much faster feedback loop and much better final result.

I feel like people that look at a poor result and stop there and conclude it's useless have made up their mind and don't want to see the better results that are right in front of them if they just spend an extra 5 seconds trying.

sarchertech · 2026-03-28T13:13:01 1774703581

How do you know whether the tests it’s spits out are bad if you don’t read the tests.

We’re not dealing AGI here. Tests aren’t strictly necessary for humans. They are for AI. AI requires guardrails to keep from spinning out. That’s essentially the entire premise of the agentic workflow.

thunky · 2026-03-28T13:46:07 1774705567

> How do you know whether the tests it’s spits out are bad if you don’t read the tests.

I do read the tests (quickly, I admit) and so does OP:

Architecture overview sure, and testing yes, but not reading the code directly any more.

Reading that again I may have misunderstood what they meant by "testing yes", though.

sarchertech · 2026-03-28T16:46:08 1774716368

I’m pretty sure they just meant they do testing not that they read the tests and that’s what everyone else who responded interpreted that as well.

You can get Claude to write good tests but based on what I’m seeing at work that’s not what’s happening. They always look plausible even when they’re wrong, so people either don’t read them, skim them very quickly, or read the first few assume the rest work and commit.

I think Claude is great for testing because setting test data and infrastructure is such a boring slog. But it almost always takes a lot of back and forth and careful handholding to get it right.

mlazos · 2026-03-28T23:59:44 1774742384

I read the tests, it also is really really good to have Claude verify that removing the changes in question break the tests. This brings the quality way way up for me.

UncleMeat · 2026-03-28T13:45:59 1774705559

In comparison, I see this issues in fewer than 1% of the changes I review. Because when it happens you can effectively teach people to stop doing it.