More

cruffle_duffle · 2026-03-31T19:30:57 1774985457

It'd dogfooding the entire concept of vibe coding and honestly, that is a good thing. Obviously they care about that stuff, but if your ethos is "always vibe code" then a lot of the fixes to it become model & prompting changes to get the thing to act like a better coder / agent / sysadmin / whatever.

cruffle_duffle · 2026-03-31T05:23:05 1774934585

Thanks!!!

cruffle_duffle · 2026-03-30T03:06:28 1774839988

> How can people be so naive as to run something like Claude anywhere other than in a strictly locked down sandbox that has no access to anything but the single git repo they are working on (and certainly no creds to push code)?

Because it’s insanely useful when you give it access, that’s why. They can do way more tasks than just write code. They can make changes to the system, setup and configure routers and network gear, probe all the iot devices in the network, set up dns, you name it—anything that is text or has a cli is fair game.

The models absolutely make catastrophic fuckups though and that is why we’ll have to both better train the models and put non-annoying safeguards in front of them.

Running them in isolated computers that are fully air gapped, require approval for all reads and writes, and can only operate inside directories named after colors of the rainbow is not a useful suggestion. I want my cake and I want to eat it too. It’s far to useful to give these tools some real access.

It doesn’t make me naive or stupid to hand the keys over to the robot. I know full well what I’m getting myself into and the possible consequences of my actions. And I have been burned but I keep coming back because these tools keep getting better and they keep doing more and more useful things for me. I’m an early adopter for sure…

lambda · 2026-03-30T20:02:10 1774900930

Well, one of the other reasons I suggest running it in a strictly limited container is that you can then run it in yolo mode.

In fact, I use the pi agent, which doesn't have command sandboxing, it's always in yolo mode, I just run it in a container and then I get the benefit of not having to confirm every command, while strictly controlling what I share with it from the beginning of the session.

cruffle_duffle · 2026-03-30T02:56:25 1774839385

There is also the browser I use to get Claude to route around people blocking its webfetch. Both Playwright and chrome-mcp.

gck1 · 2026-03-30T04:44:04 1774845844

Camoufox?

cruffle_duffle · 2026-03-30T02:54:55 1774839295

I bet dollars to doughnuts that 95% of the traffic is from Claude and ChatGPT desktop / mobile and not literal content scraping for training.

crote · 2026-03-30T04:10:46 1774843846

That wouldn't explain the 1000x increase in traffic for extremely obscure content, or seeing it download every single page on a classic web forum.

duttish · 2026-03-30T06:52:19 1774853539

And doing it over, and over, and over and over again. Because sure it didn't change in the last 8 years but maybe it's changed since yesterdays scrape?

cruffle_duffle · 2026-03-29T15:49:37 1774799377

It will mess up eventually. It always does. People need to stop thinking of this is a “security against malicious actor” thing… because thinking in that way blinds you to the actual threat… Claude being helpful and accidentally running a command it shouldn’t. It’s happened to me twice now where it will do something irreversible and also incorrect. It wasn’t a threat actor, it wasn’t a bad guy… it was a very eager, incredibly clever assistant fat fingering something and goofing up. The more power you let them wield, the more chance they’ll do accidents. But without lots of power, they don’t really do much useful…

It’s actually a hard problem. But it really isn’t “security” in the classic sense…

cruffle_duffle · 2026-03-29T01:20:06 1774747206

Hah… I’ve seen Claude happily and very cleverly find ways to escape its sandbox. It’s like some kind of arms race between the model and its designers.

cruffle_duffle · 2026-03-29T00:43:15 1774744995

Dude. I’ve been thinking about this a lot! I think it’s because the traditional way we internalize the costs of what we are building just got take for a ride. We don’t really (or I don’t anyway) fully know what “too much scope” feels like with one of these Claude thingies. So it’s easy to completely both overestimate complexity and underestimate it too. Some times the LLM makes a seemingly daunting refactor be super simple and sometimes something seemingly not complex can take it forever… and there really is, for me, a good “gut sense” of how something will go.

So lately I’ve just decided that I’ll time box things instead of set defined endpoints. And by “endpoint” I really mean “I’m done for the day” and honestly maybe thinking about it… “I’m done with this project”.

I don’t know. But the term “Claude Creep” is absolutely something I can identify with. That thing will take you down a rathole that started with just pulling in some document and ends with you completely repartitioning your file system. lol.

cruffle_duffle · 2026-03-29T00:25:28 1774743928

A lot of getting good mileage out of LLMs is promoting them to behave like they are blind and can only base their outputs on what is in front of them. Maintain an emic stance.

cruffle_duffle · 2026-03-29T00:21:44 1774743704

That is an interesting way of looking at that, thanks for the perspective!

Like, the words fit… why create a second parallel language for describing LLM behavior.

Somebody else said it… the whole “it’s a stochastic parrot” thing is sooooo cliche and boring at this point. It’s like, duh… what is your point?