> The hilarious part, though, is that it's not the AI that's working around the rules. That's the scenario that's been in science fiction, but it's not what's happening. It's the human users making use of our agency to get the AI agents to work around the rules. Despite calling them "agents", current AI agents don't seem to be able to do that particular something. Yet, at least.
Well, yes. Until people are putting the LLMs into actual mechanical robots, "agency" boils down to flipping bits in memory or storage (even if they're ones that humans consider really important, e.g. because they represent a bank ledger) or convincing humans to take action. One can only "work around the rules" to the extent that one can "work".
But even in Asimov's books, at least some of the scenarios involved humans misleading the robots to use them as pawns in a greater scheme.
Many, many years ago I was asked to implement a filter like that for usernames. I said right away that it wasn't going to work well, but I did implement it.
Next internal build, the CEO can't create an account. With his real name.
It worked exactly to spec; I added a debug print and showed everyone the "bad word" it tripped on. The idea was promptly rethought.
Now I'm trying to figure out which word that would be, but yeah.
That reminds me of a bug I fixed. My boss's boss found it, we tried everything, and my boss at the time forced us to deploy just anything and call it fixed. Then someone else saw it half a year later, and I finally figured out the root cause (localStorage vs sessionStorage). My boss acted like he didn't know what I was talking about, but I could hear it in his voice. I didn't press too hard; I just pushed the real fix out. It was basically a "client-side" bug: a gift card balance saved in localStorage never updated, so I changed it to sessionStorage. Not quite the CEO, but the guy below the CIO finding a bug can worry just about anyone.
In my case, the regex would have been for a friend to filter reddit or discord slurs, so not as awful.
Two of my co-workers have the last names Dyck and Cox. I've seen others whose last name is literally Dick. And let's not forget the famous actor Dick Van Dyke who strikes out twice on most filters. I've heard several other names from other ethnicities that were straight up "slurs" by some people's standards. The only thing harder than matching a slur is deciding what words count as slurs.
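For anyone who hasn't hit this failure mode, here is a minimal sketch of the kind of substring filter described upthread, including the "debug print" that shows which word tripped. The blocklist and the names are hypothetical and deliberately tame; the real filter's word list isn't known.

```python
# Hypothetical blocklist, for illustration only.
BLOCKLIST = ["ass", "hell"]


def naive_hits(username: str) -> list[str]:
    """Return every blocklisted word found as a substring (case-insensitive)."""
    lowered = username.lower()
    return [word for word in BLOCKLIST if word in lowered]


def is_allowed(username: str) -> bool:
    """A username passes only if no blocklisted word appears anywhere in it."""
    return not naive_hits(username)


# The "debug print": show which entry each name trips on.
for name in ["Cassandra", "Mitchell", "Alice"]:
    print(name, "->", naive_hits(name) or "ok")
```

Both "Cassandra" and "Mitchell" are rejected even though neither is offensive, which is the filter working exactly to spec. Matching on word boundaries would clear those two, but it would not help with a surname that is itself on the list.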
I think I'm not getting something here. Like, sure, the refused prompt "review the code for security issues" could be interpreted as an attempt to discover weaknesses in a running system to exploit them. But we don't generally assume humans are doing something wrong if they are "reviewing code for security issues", and would commonly see no problem with asking each other to do so.
The problem is that a patch to fix a security issue quite often also shines a spotlight on the issue being fixed. Fixing a part of something like this super complicated Project Zero post might not give much of a clue as to what the issue was or how to exploit it: https://projectzero.google/2021/12/a-deep-dive-into-nso-zero...
But that's the exception. Most fixes to security issues point a finger directly at the issue and make it relatively obvious how to exploit, and it generally doesn't take long to figure out from there what you might get out of it.
This has been a problem for a long time but AIs have made it even worse. It is now cost effective for a well-resourced attacker to simply monitor the patch stream of an important project like the Linux kernel or nginx and pass every single patch through an AI with the question "Is this a vulnerability and if so how would I exploit it?" It has seriously complicated the process of getting fixes to people before the attackers have a chance to exploit the bugs, just as AIs have also been increasing the rate at which serious security issues are found and need to be patched. Previously maintainers could at least sneak a patch in under an innocuous commit message and have a reasonable chance of it being lost in the churn, but now that door is increasingly closed to them as well.
And this is for the case when a security fix lands in the stream of a project and someone externally is watching it with no context. If you also get the complete stream of Mythos finding and fixing the bug it is even easier.
So, yes, any security vulnerability that Mythos will "fix" is also one that it first has to find, and the guardrails are useless if you can just instruct Mythos to "fix" it. And on the flip side, if Mythos won't fix security bugs, and we project that out to all other models matching this behavior, this will create a world in which the good guys can't secure their code but the bad guys, who will one way or another get around the guard rails if by nothing else simply by stealing the model and modifying it to suit their needs, will be able to break this code that we're not being "allowed" to secure. Since fixing vulns is a subset of finding the vulns, there isn't a way to "fix" this. Any model that can fix vulns must, by necessity, be able to find them. And it is the fixing we really need to be spread far and wide to secure the world's code.
>pass every single one through an AI with the question
Unfortunately this will just involve said teams running their patches over AI first before they're put in the main branch. For businesses it will probably be fine, but would get very expensive for open source projects.
The webpage linked is an example of everything I wish people would stop doing in web design.
Fortunately, at the bottom there is a link to the "technical documentation" (https://squeezlabs.github.io/handcrank/) which is vastly improved (aside from being light-mode-only and linked from a dark-mode-only marketing page). It also gives me much more interesting information (specifically: models that can apparently run acceptably on a Pi 5).
Please let me read your content with a scrollbar that works the way scroll bars are supposed to, rather than turning everything into a weird slide show where you don't actually know when the next slide is coming. Please let me just click on buttons that look like links to more information, without JavaScript.
Why can't technical people appreciate that we, the silent majority, love having our scroll hijacked? I can't remember the last time I used a scroll bar to navigate a website, but using it to navigate between choppy JavaScript keyframes fills me with joy.
Scroll animations, post-grid floating voids, bouncy house dampening, hyper rounded... everything. These are the 50s Chevy fins of today.
I've enjoyed working with some great designers over the years, Stanford D-School and even wild-raised. All the good ones intuitively steered clear of trends destined to be era-stamp tropes. They'd say, "I can already hear the ghosts of design-future mocking me: 'That's so early-AI' and 'Yo, the mid-20s called and wants their bento grid back.'"
Thirty years ago, Apple made a translucent green ADB "keypad" which had a small LCD display (perhaps only two lines of text?) – marketed towards academics, it allowed students to learn touch-typing without the distractions of an entire computer.
Once you were happy with your touch-typed document, you then plugged the "keypad" directly into your Mac's ADB (keyboard/mouse) port... and the thing would sit there and manually re-type your composition into the computer's texteditor.
----
Education needs such "reduced tech" to return to teaching. Think of this one as a "more advanced typewriter" – although I own a few of those, too, and they're fantastic for pure composition.
I don't use them, but that is surprising! I would program one of my theoretical phone's physical side buttons to handle PgDn/PgUp [†] – similar to my old Kindle's layout. Do phones still have side volume buttons (e.g.)?
[†] Thanks for the better styling, than my former Page_Up &c
No doubt non-technical people have a different UX experience than tech nerds, but I have seen plenty of "normal" people curse at artsy, fluffy design that made known navigation skills useless, and nobody likes their time wasted.
I agree this type of web design sucks. It's been common for more than a decade - I remember Apple getting criticized for using this on the product page for the old "trash can" Mac Pro in 2013, and it was already widely used back then.
However, it seems pretty clear to me they did this in service of a joke - you have to "crank" your scroll wheel to get to the content, just like you have to crank this device. I think it's funny...
Great prop for a Black Mirror episode about AI use in a post-apocalyptic world. Everywhere you go, all you hear is brrrrr..brrr..brrrr followed by people mumbling.
Totally agree on the atrocious landing page. The technical one is much better, although the power supply circuit, by using a resistive balancer and a linear regulator, wastes some good power for nothing.
yea i can't stand this. im not so boomer i want every webpage to be like. times new roman white background and just using <p></p> and bulleted lists, but idk i cant even put a finger on what im not enjoying here. think it's possibly using scrolling as a way to try and force me to read through stuff. jokes on them, i can't read. not giving me the agency to click around into info that interests me drives me nuts, chances are im just gonna keep scrolling at 1000mph and eye scan until i see what im looking for virtually zero chance im going to sit through the experience of every carefully designed scroll-slide they've tried to present to me here.
Alright, I'll be the boomer and say that's what I want every webpage to be like. If you want to customize it you can bring your own CSS or download someone else's. The modern web is a nightmare of user-hostile time-thieving behavioral manipulation and our brains would be better off without it.
> The swing of the market based on the president's crazy tweets is just insane.
A quick glance at charts shows that VIX is not at all out of line with historical patterns, and asking ChatGPT to crunch some numbers confirms that. The "liberation day" spike was not nearly as bad as in 2008 or for COVID, and in fact not much more than events in 2010 and 2011 that people don't even have names for.
It either means man must adjust (themselves), or must adjust (the machine).
So it's a somewhat arch joke that may not be apparent due to shifts in language usage. (Also, "man" in this context was short for "human" without regard to sex, which we now call gender.)
The parent alludes to the fact that the sentence could conceivably be read as "In man-machine symbiosis, it is man who must adjust [the machine]", i.e. reading "adjust" as transitive rather than as intransitive.
However, I think it's clear that the intended meaning is intransitive.
> how effective frontier models (ChatGPT-5.5 in particular) are at completing certain manual proofs in the Rocq (né Coq) proof assistant. The proofs aren't always pretty, but ChatGPT can often prove something in minutes and 10 - 100 iterations that would take me, a human who has limited but non-zero proof assistant experience but significant domain experience in the lemmas being proven, much much longer.
... How do you know that the proofs are themselves correct?
Today, I write the code. It’s trivial and takes a lot less time than writing the spec, and since I’m using conventional tooling for WCET and stack sizing it’s nice to get those right up front. The LLMs sometimes tweak the code slightly for provability, but this is usually either direct operator replacement (a shift replaced with a multiplication, a bitwise AND with a modulus, etc.) or factoring out a block to a function to tie a contract onto it, both of which I trust my compiler to undo (simple arithmetic operations and inlining, respectively) with zero to minimal impact on the generated binary.
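To make the operator replacement concrete, here is a small sketch of the two rewrites mentioned. It is in Python purely for illustration (the code under discussion is presumably C or similar), the function names and constants are made up, and the equivalences are only claimed for non-negative integers with power-of-two constants.

```python
def scale_shift(x: int) -> int:
    """Original form: multiply by 8 via a left shift."""
    return x << 3


def scale_mul(x: int) -> int:
    """Prover-friendly form: the same scaling as plain multiplication."""
    return x * 8


def wrap_and(i: int) -> int:
    """Original form: wrap an index into a 16-entry buffer with a bit mask."""
    return i & 15


def wrap_mod(i: int) -> int:
    """Prover-friendly form: the same wrap written as a modulus."""
    return i % 16


# Spot-check that each pair agrees on a range of non-negative inputs.
assert all(scale_shift(x) == scale_mul(x) for x in range(1024))
assert all(wrap_and(i) == wrap_mod(i) for i in range(1024))
```

The arithmetic forms tend to be easier for a prover to reason about, while an optimizing compiler will typically turn a multiply or modulus by a power of two back into the shift or mask for unsigned operands.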
> But the ridiculousness of saying New York style pizza is not pizza or that you have to make things the "right way" needs to go.
Suppose people say it; why shouldn't they be entitled to their opinion? How does it harm anyone? People who like New York style pizza are equally free to just disagree, and keep making it.