It's dogfooding the entire concept of vibe coding, and honestly, that is a good thing. Obviously they care about that stuff, but if your ethos is "always vibe code," then a lot of the fixes become model & prompting changes to get the thing to act like a better coder / agent / sysadmin / whatever.
> How can people be so naive as to run something like Claude anywhere other than in a strictly locked down sandbox that has no access to anything but the single git repo they are working on (and certainly no creds to push code)?
Because it’s insanely useful when you give it access, that’s why. These tools can do way more than just write code. They can make changes to the system, set up and configure routers and network gear, probe all the IoT devices on the network, set up DNS, you name it. Anything that is text or has a CLI is fair game.
The models absolutely make catastrophic fuckups, though, and that is why we’ll have to both train the models better and put non-annoying safeguards in front of them.
Running them on isolated computers that are fully air gapped, require approval for all reads and writes, and can only operate inside directories named after colors of the rainbow is not a useful suggestion. I want my cake and I want to eat it too. Giving these tools some real access is far too useful to pass up.
It doesn’t make me naive or stupid to hand the keys over to the robot. I know full well what I’m getting myself into and the possible consequences of my actions. And I have been burned but I keep coming back because these tools keep getting better and they keep doing more and more useful things for me. I’m an early adopter for sure…
Well, one of the other reasons I suggest running it in a strictly limited container is that you can then run it in yolo mode.
In fact, I use the pi agent, which doesn't have command sandboxing; it's always in yolo mode. I just run it in a container, and then I get the benefit of not having to confirm every command while strictly controlling what I share with it from the beginning of the session.
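For what it's worth, a minimal sketch of that kind of container setup might look like the following. The image name, mount path, and resource limits are illustrative assumptions, not the actual setup described above:

```shell
# Hypothetical sketch: run an agent CLI in a container that can only
# see a single repo. "my-agent-image" and the paths are placeholders.
docker run --rm -it \
  -v "$PWD/my-repo:/work" \       # share one repo; no home dir, no creds
  -w /work \
  --cap-drop ALL \                # drop all Linux capabilities
  --memory 4g --pids-limit 256 \  # cap resource usage
  my-agent-image pi               # agent runs in yolo mode inside the sandbox
```

The idea is that the approval step moves from per-command prompts to the container boundary: whatever the agent decides to run, it can only touch what was explicitly mounted in.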
And doing it over, and over, and over, and over again. Because sure, it didn't change in the last 8 years, but maybe it's changed since yesterday's scrape?
It will mess up eventually. It always does. People need to stop thinking of this as a “security against a malicious actor” thing… because thinking that way blinds you to the actual threat: Claude being helpful and accidentally running a command it shouldn’t. It’s happened to me twice now, where it did something irreversible and also incorrect. It wasn’t a threat actor, it wasn’t a bad guy… it was a very eager, incredibly clever assistant fat-fingering something and goofing up. The more power you let them wield, the more chances they have to cause accidents. But without a lot of power, they don’t really do much that’s useful…
It’s actually a hard problem. But it really isn’t “security” in the classic sense…
Dude. I’ve been thinking about this a lot! I think it’s because the traditional way we internalize the costs of what we are building just got taken for a ride. We don’t really (or I don’t, anyway) fully know what “too much scope” feels like with one of these Claude thingies. So it’s easy to both completely overestimate complexity and completely underestimate it. Sometimes the LLM makes a seemingly daunting refactor super simple, and sometimes something seemingly not complex takes it forever… and there really isn’t, for me, a good “gut sense” of how something will go.
So lately I’ve just decided that I’ll time-box things instead of setting defined endpoints. And by “endpoint” I really mean “I’m done for the day,” and honestly, thinking about it, maybe “I’m done with this project.”
I don’t know. But the term “Claude Creep” is absolutely something I can identify with. That thing will take you down a rathole that starts with just pulling in some document and ends with you completely repartitioning your file system. lol.
A lot of getting good mileage out of LLMs is prompting them to behave like they are blind and can only base their outputs on what is in front of them. Maintain an emic stance.