The other labs already censor their models. Everyone is trying to find the sweet spot where performance and ‘alignment’ are both maximized. This seems no different.
They are not unique in this; Apple and Tesla have similar programs. More nuance is warranted here: they are trying to balance the need to enable external research against the need to protect users from arbitrary third parties gaining special capabilities that could be used maliciously.
I understand that, but Anthropic is doing nothing to throw those grassroots researchers a life jacket. This is the beginning of the end for independents: if it continues on this trajectory, then Anthropic gets to decide who lives and who dies. Who says they should be allowed to decide that?
- Coordinated disclosure is ethically sketchy. I know why we do it, and I'm not saying we shouldn't. But it's not great.
- This isn't a single disclosure. This is a new technology that dramatically increases capability. So even if we thought that coordinated disclosure was unambiguously good, I think we'd still need to have a new conversation about Mythos.
So private companies shouldn’t get to determine who they provide services to? Assuming no extremely malicious intent, I’d be fine if they said it was only going to McDonald’s because the founders like Big Macs.
It's not just me against the world; it's users of big tech vs everyone else. Will my browser get these security patches? Or will only Chrome get them, while everyone else gets to be vulnerable because sharing the fixes would endanger users of big tech?
In this case there is almost no distinction. Assuming the model is as powerful as claimed, someone with access to the weights could do immense damage without additional significant R&D.
Yes, I can see this as non-releasable for national security reasons in the geopolitical competition with China: securing our software against threats while having immense infiltration ability against enemy cybersecurity targets... not to mention the ability to implant new, even more subtle vulnerabilities into open software, not generally detectable by current AI, to provide covert action.
It's partly the industry and it's partly the failure of regulation. As Mario Wolczko, my old manager at Sun, says: nothing will change until there are real legal consequences for software vulnerabilities.
That said, I have been arguing for 20+ years that we should have sunsetted unsafe languages and moved away from C/C++. The problem is that every systemsy language that comes along gets seduced by the lure of a big market share and eventually ends up as an application language.
I do hope we make progress with Rust. I might disagree as a language designer and systems person about a number of things, but it's well past time that we stop listening to C++ diehards about how memory safety is coming any day now.
I think society is going to start paying the price for humans being human. As the paper points out there is a lot of good faith, serious software that has vulnerabilities. These aren't projects you would characterize as people being cavalier. It is simply beyond the limits of humans to create vulnerability-free software of high complexity. That's why high reliability software depends on extreme simplicity and strict tools.
100%, poorly architected software is really difficult to make secure. I think this will extend to AI as well: it will just dial up the complexity of the code until bugs and vulnerabilities start creeping in.
At some point, people will have to decide to stop the complexity creep and try to produce minimal software.
For any complex project with 100k+ lines of code, the probability that it has some vulnerabilities is very high. It doesn't fit into LLM context windows and there aren't enough attention heads to attend to every relevant part. On the other hand, for a codebase which is under 1000 lines, you can be much more confident that the LLM didn't miss anything.
Also, the approach of feeding the entire codebase to an LLM in parts isn't going to work reliably because vulnerabilities often involve interactions between different parts of the code. Both parts of the code may look fine if considered independently but together they create a vulnerability.
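As a toy illustration of that point (hypothetical code, not from any real project): here are two Rust functions that would each pass review in isolation. The parser trusts its caller to validate the header, the slicer trusts that the offset was validated upstream, and the missing check lives in neither function. In Rust the composed failure is a panic (a denial of service); the same pattern in C would be an out-of-bounds read.

```rust
// Module A: extract a payload offset from a packet header.
// Looks fine alone -- "callers validate the header".
fn payload_offset(header: &[u8]) -> usize {
    header[0] as usize
}

// Module B: slice the packet at an offset.
// Looks fine alone -- "the offset was validated upstream".
fn payload(packet: &[u8], offset: usize) -> &[u8] {
    &packet[offset..]
}

fn main() {
    // Benign input: header byte says the payload starts at index 2.
    let packet = [2u8, 0xAA, 0xBB, 0xCC];
    assert_eq!(payload(&packet, payload_offset(&packet)), &[0xBB, 0xCC]);

    // Attacker-controlled header byte: offset 200 on a 4-byte packet.
    // Neither function violates its own contract, but the composition
    // panics here (and would read out of bounds in C).
    let evil = [200u8, 0xAA, 0xBB, 0xCC];
    let crashed =
        std::panic::catch_unwind(|| payload(&evil, payload_offset(&evil)).len());
    assert!(crashed.is_err());
}
```

An auditor (human or LLM) shown either function alone, with its stated assumption, has nothing to flag; the bug only exists in the cross-module data flow.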
Good architecture is critical now because you really need to be able to have the entire relevant context inside the LLM context window... When considering the totality of all software, this can only be achieved through an architecture which adheres to high cohesion and loose coupling principles.
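A minimal sketch of what that buys you (hypothetical names, Rust for concreteness): when a consumer depends only on a narrow trait, the "entire relevant context" for auditing that consumer is the trait contract plus the consumer itself, not the implementing module's internals.

```rust
use std::collections::HashMap;

// Narrow interface: the whole contract a consumer depends on.
trait KvStore {
    fn get(&self, key: &str) -> Option<String>;
}

// Cohesive storage module; can be audited on its own.
struct MemStore {
    data: HashMap<String, String>,
}

impl KvStore for MemStore {
    fn get(&self, key: &str) -> Option<String> {
        self.data.get(key).cloned()
    }
}

// Loosely coupled consumer: to review this (human or LLM), you only
// need the trait above in context, not MemStore's implementation.
fn report(store: &impl KvStore, key: &str) -> String {
    store.get(key).unwrap_or_else(|| format!("{key}: missing"))
}

fn main() {
    let mut data = HashMap::new();
    data.insert("build".to_string(), "ok".to_string());
    let store = MemStore { data };
    assert_eq!(report(&store, "build"), "ok");
    assert_eq!(report(&store, "deploy"), "deploy: missing");
}
```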
I'm not even talking about poorly architected software. They are finding vulnerabilities in incredibly well-engineered software. The Linux kernel is complex not because it's poorly written; it's complex because of all the things it needs to do. That makes it beyond the ability of a human to comprehend and reliably work with.
There are different degrees of well-engineered software. It's almost impossible for humans to do a good job with a large codebase. Some software is just too complex for any human or machine to implement correctly.
Humans almost always underestimate the cost of features. I bet we could massively reduce the amount of code and the complexity of the Linux kernel if we abandoned the account system entirely, made it one user with root access, and relied on containers to provide isolated sandboxes.
A lot of features just crept in over long periods of time and weren't re-evaluated as needs changed. I think the approach I'm suggesting would have been horrible 20 years ago but makes more sense now in the era of cloud virtualization. The account system and containerization are basically different implementations solving the same modern problem of environment isolation... Nobody really needs per-file access restrictions anymore... The cloud era is more like "here is Bob's environment, here is Alice's environment," and they can do whatever they want with their own container/sandbox. The account permission system is more of an annoyance than a solution for most use cases.
Everyone just latched onto the existing abstractions and could not fully re-imagine them in the context of changing requirements. LLMs are even worse than people in that sense.
That said, I think supporting a wide range of possible hardware is a real challenge for the kernel, and that part will always require an amount of code proportional to the amount of hardware supported.
> These aren't projects you would characterize as people being cavalier.
I probably would. You mentioned the linux kernel, which I think is a perfect example of software that has had a ridiculous, perhaps worst-in-class attitude towards security.
I don't know the first thing about cybersecurity, but in my experience all these sandbox-break RCEs involve a step of hijacking the control flow.
There were attempts to prevent various flavors of this, but imo, as long as dynamic branches exist in some form, like dlsym(), function pointers, or vtables, we will not be rid of this class of exploit entirely.
The latter is the most concerning, as this kind of dynamic branching is the bread and butter of OOP languages; I'm not even sure you could write a nontrivial C++ program without it. Maybe Rust would be a help here? Could one practically write a large Rust program without any sort of branch to dynamic addresses? Static linking and compile-time polymorphism only?
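To the Rust question: in principle, yes. Rust generics are monomorphized, so trait-bounded calls compile to direct calls with no vtable; `dyn Trait` is the opt-in form that goes through an indirect branch. A minimal sketch (hypothetical types) contrasting the two:

```rust
trait Shape {
    fn area(&self) -> f64;
}

struct Square {
    side: f64,
}

impl Shape for Square {
    fn area(&self) -> f64 {
        self.side * self.side
    }
}

// Static dispatch: monomorphized per concrete type, so the call to
// area() is a direct call with no function-pointer indirection.
fn area_static<S: Shape>(s: &S) -> f64 {
    s.area()
}

// Dynamic dispatch: &dyn Shape is a (data ptr, vtable ptr) pair, and
// the call to area() is an indirect branch through the vtable.
fn area_dyn(s: &dyn Shape) -> f64 {
    s.area()
}

fn main() {
    let sq = Square { side: 3.0 };
    assert_eq!(area_static(&sq), 9.0);
    assert_eq!(area_dyn(&sq), 9.0);
}
```

Whether a large program can avoid `dyn` entirely is another question: monomorphization inflates compile times and code size, and some patterns (heterogeneous collections, plugin loading) still want indirection. But static dispatch is the default in idiomatic Rust, not the exception.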
Steve understood better than anyone that having a finite amount of time to build means you can't please everyone. The vast majority of Apple's customers just do not care about the Keyboard settings UI or the clarity of unusual error messages.
Users do care; they just don't have the words to explain what it is that's frustrating them. It's just a silent "I find myself using this less" sort of thing.
Not for everything, but the excuse of "normies don't give a shit" is a bullshit one.
I wonder how many care that Messages lights up like a Christmas tree on speed on iPadOS, battery life dropped 90%, Calculator requires 32 GB of RAM, offline maps stranded them in the woods, iOS can no longer keep two apps loaded at once, OCR screenshots broke, the Magnifier “flashlight” button no longer fits on the screen, or the AI text suggestions in Notes are simultaneously garbage and undeletable.
Those are just some of the bugs I hit. I’d guess most normal users hit 4-5 problems this upgrade cycle.
For my side gig I need to quickly take multiple pictures (with my iPhone) of subjects that aren’t still or cooperative. This used to work fine. Now the camera just quits with no crash or notice, so I think I’m taking pictures but I’m not. Closing the camera app doesn’t disable or stop the camera; I have to wait or reboot. But hey, I can take really cool photos I can view on the Apple Vision I don’t own.
Now we have strong sandboxing by default and many other platform security advances to mitigate that risk. You can download any software off the Internet on macOS, so why not on iOS?