Hacker News | zahlman's comments

Why shouldn't the author mention models that people might not have to buy a new computer to use?

One doesn’t have to buy a new computer to run Qwen3.6 or Qwen3.5 (35B A3B), given that one can already run Qwen3 30B A3B.

In fact, with a 64GB Mac, you can run pretty much all of the latest Qwen models.

Also, anyone who has been following local LLMs is well aware that quality and performance have become way, way better since Qwen3.5.
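As a rough sanity check on the 64GB claim, here is a back-of-the-envelope sketch. It assumes weight storage is simply parameter count times bits per weight, and it ignores the KV cache, runtime overhead, and the fact that real quantization formats often use more than their nominal bit width. The function name and the chosen bit widths are illustrative, not taken from any particular runtime.

```python
def weight_memory_gb(n_params_billion: float, bits_per_weight: float) -> float:
    """Approximate weight storage in decimal GB.

    Assumption: size = parameters * bits / 8. KV cache, activations and
    runtime overhead are deliberately left out, so real usage is higher.
    """
    return n_params_billion * bits_per_weight / 8


# Parameter counts come from the model names in the comment (30B and 35B);
# the bit widths are hypothetical quantization levels.
estimates = {
    (params, bits): weight_memory_gb(params, bits)
    for params in (30, 35)
    for bits in (4, 8, 16)
}

for (params, bits), gb in sorted(estimates.items()):
    verdict = "fits" if gb < 64 else "does not fit"
    print(f"{params}B at {bits}-bit: ~{gb:.1f} GB of weights ({verdict} in 64 GB)")
```

Under these assumptions, the 4-bit and 8-bit estimates land well under 64 GB, while unquantized 16-bit weights for the 35B model do not.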


> The hilarious part, though, is that it's not the AI that's working around the rules. That's the scenario that's been in science fiction, but it's not what's happening. It's the human users making use of our agency to get the AI agents to work around the rules. Despite calling them "agents", current AI agents don't seem to be able to do that particular something. Yet, at least.

Well, yes. Until people are putting the LLMs into actual mechanical robots, "agency" boils down to flipping bits in memory or storage (even if they're ones that humans consider really important, e.g. because they represent a bank ledger) or convincing humans to take action. One can only "work around the rules" to the extent that one can "work".

But even in Asimov's books, at least some of the scenarios involved humans misleading the robots to use them as pawns in a greater scheme.


Many, many years ago I was asked to implement a filter like that for usernames. I said right away that it wasn't going to work well, but I did implement it.

Next internal build, the CEO can't create an account. With his real name.

It worked exactly to spec; I added a debug print and showed everyone the "bad word" it tripped on. The idea was promptly rethought.

I feel like the AI did you a favour here.


Ah the classic Scunthorpe problem
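For readers who have not run into it, the failure mode can be sketched in a few lines. This is not the filter from the anecdote above: the blocklist, function names, and sample usernames are all hypothetical and deliberately tiny.

```python
import re
from typing import Optional

# Hypothetical two-word blocklist, for illustration only.
BLOCKLIST = ("ass", "dick")


def substring_hit(username: str, blocklist=BLOCKLIST) -> Optional[str]:
    """Naive filter: return the first blocked word found anywhere in the name."""
    lowered = username.lower()
    for word in blocklist:
        if word in lowered:
            return word
    return None


def whole_word_hit(username: str, blocklist=BLOCKLIST) -> Optional[str]:
    """Stricter filter: only match a blocked word standing on its own."""
    for word in blocklist:
        if re.search(rf"\b{re.escape(word)}\b", username, flags=re.IGNORECASE):
            return word
    return None


for name in ("Cassandra", "Dick Van Dyke", "zahlman"):
    print(name, "| substring:", substring_hit(name), "| whole word:", whole_word_hit(name))
```

Word boundaries rescue "Cassandra" but not "Dick Van Dyke", which is the harder half of the problem: some real names are exact matches, so no amount of pattern tuning settles what should count.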

Now I'm trying to figure out which word that would be, but yeah.

That reminds me of a bug that my boss's boss found. We tried everything, and my boss at the time forced us to deploy anything and call it fixed. Half a year later someone else saw it, and I finally figured out the root cause and fixed it (localStorage vs sessionStorage). My boss acted like he didn't know what I was talking about, but I could hear it in his voice. I didn't press too hard; I just pushed the real fix out. It was basically a "client-side" bug: a gift card balance saved in localStorage never updated, so I changed it to sessionStorage. Not quite the CEO, but the guy below the CIO finding a bug can worry just about anyone.

In my case, the regex would have been for a friend to filter reddit or discord slurs, so not as awful.


Two of my co-workers have the last names Dyck and Cox. I've seen others whose last name is literally Dick. And let's not forget the famous actor Dick Van Dyke who strikes out twice on most filters. I've heard several other names from other ethnicities that were straight up "slurs" by some people's standards. The only thing harder than matching a slur is deciding what words count as slurs.

> Now I'm trying to figure out which word that would be

I once had Shi Tao as part of an email username. It tripped filters periodically.


I think I'm not getting something here. Like, sure, the refused prompt "review the code for security issues" could be interpreted as an attempt to discover weaknesses in a running system to exploit them. But we don't generally assume humans are doing something wrong if they are "reviewing code for security issues", and would commonly see no problem with asking each other to do so.

The problem is that a patch to fix a security issue quite often also shines a spotlight on the issue being fixed. Fixing a part of something like this super complicated Project Zero post might not give much of a clue as to what the issue was or how to exploit it: https://projectzero.google/2021/12/a-deep-dive-into-nso-zero...

But that's the exception. Most fixes to security issues point a finger directly at the issue and make it relatively obvious how to exploit, and it generally doesn't take long to figure out from there what you might get out of it.

This has been a problem for a long time but AIs have made it even worse. It is now cost effective for a well-resourced attacker to simply monitor the patch stream of an important project like the Linux kernel or nginx and pass every single one through an AI with the question "Is this a vulnerability and if so how would I exploit it?" It has seriously complicated the process of getting fixes to people before the attackers have a chance to exploit the issues, just as AIs have also been increasing the rate at which serious security issues are found and need to be patched. Previously maintainers could at least sneak a patch in under an innocuous commit message and have a reasonable chance of it being lost in the churn, but now that door is increasingly closed to them as well.

And this is for the case when a security fix lands in the stream of a project and someone externally is watching it with no context. If you also get the complete stream of Mythos finding and fixing the bug it is even easier.

So, yes, any security vulnerability that Mythos will "fix" is also one that it first has to find, and the guardrails are useless if you can just instruct Mythos to "fix" it. And on the flip side, if Mythos won't fix security bugs, and we project that out to all other models matching this behavior, this will create a world in which the good guys can't secure their code. Meanwhile the bad guys, who will one way or another get around the guardrails (if by nothing else, simply by stealing the model and modifying it to suit their needs), will be able to break this code that we're not being "allowed" to secure. Since fixing vulns is a subset of finding the vulns, there isn't a way to "fix" this. Any model that can fix vulns must, by necessity, be able to find them. And it is the fixing we really need to be spread far and wide to secure the world's code.


>pass every single one through an AI with the question

Unfortunately this will just involve said teams running their patches through an AI first before they're put in the main branch. For businesses it will probably be fine, but it would get very expensive for open source projects.


When sama was recruiting a Head of Preparedness back in December this is what it was about. Some of it, anyway.

The webpage linked is an example of everything I wish people would stop doing in web design.

Fortunately, at the bottom there is a link to the "technical documentation" (https://squeezlabs.github.io/handcrank/) which is vastly improved (aside from being light-mode-only and linked from a dark-mode-only marketing page). It also gives me much more interesting information (specifically: models that can apparently run acceptably on a Pi 5).

Please let me read your content with a scrollbar that works the way scroll bars are supposed to, rather than turning everything into a weird slide show where you don't actually know when the next slide is coming. Please let me just click on buttons that look like links to more information, without JavaScript.


Why can't technical people appreciate that we, the silent majority, love having our scroll hijacked? I can't remember the last time I used a scroll bar to navigate a website, but using it to navigate between choppy JavaScript keyframes fills me with joy.

This isn’t scroll hijacking

You can scroll normally, with all your favorite keys, or go super fast to the bottom

It’s just scroll animations. Bad ones, admittedly.


> just scroll animations. Bad ones

Scroll animations, post-grid floating voids, bouncy house dampening, hyper rounded... everything. These are the 50s Chevy fins of today.

I've enjoyed working with some great designers over the years, Stanford D-School and even wild-raised. All the good ones intuitively steered clear of trends destined to be era-stamp tropes. They'd say, "I can already hear the ghosts of design-future mocking me: 'That's so early-AI' and 'Yo, the mid-20s called and wants their bento grid back.'"


>trends destined to be era-stamp tropes

This page was designed for today, for making it to HN, not 'the ghosts of design-future'.


> You can scroll normally

Except you can’t.

I scroll down, and the content of the page doesn’t move as expected.


Just use your page_up/page_down keys, and you can skip all the stupid/excessive scrolling requirements.

Now that iPhone has switched to USB-C, I can plug in my Apple Extended Keyboard directly without needing a dongle. It’s like magic.

The real question is does the power button on the AEK still work on iOS?

You have to also hold down `ctrl` [+power], but yes.

I now have visions of an Apple Extended Extended Keyboard that comes with a crank...

It’s nothing new. In fact, many of the comments on this site were made by keyboards with cranks.

Er… I meant to say cranks with keyboards. Sorry. It was a rough weekend.


Not even an ADB-to-USB dongle?

Thirty years ago, Apple made a translucent green ADB "keypad" which had a small LCD display (perhaps only two lines of text?) – marketed towards academics, it allowed students to learn touch-typing without the distractions of an entire computer.

Once you were happy with your touch-typed document, you then plugged the "keypad" directly into your Mac's ADB (keyboard/mouse) port... and the thing would sit there and manually re-type your composition into the computer's text editor.

----

Education needs such "reduced tech" to return to teaching. Think of this one as a "more advanced typewriter" – although I own a few of those, too, and they're fantastic for pure composition.


how do i press these buttons on my android phone?

Connect Keyboard, Press PgDn.

Or what I actually use for ssh on the road: https://github.com/klausw/hackerskeyboard

Google kicked it from their store because it still supports older Androids but it still works just fine on the latest versions. It's on F-Droid.


Is there really no PgDn on a phone?!?

I don't use them, but that is surprising! I would program one of my theoretical phone's physical side buttons to handle PgDn/PgUp [†] – similar to my old Kindle's layout. Do phones still have side volume buttons (e.g.)?

[†] Thanks for the better styling than my former Page_Up, etc.


If they're not scroll hijacking, then we're just jacking it ourselves. Think about it.

Thanks, web designers <3


"love having our scroll hijacked?"

You are the silent majority?

No doubt non-technical people have a different UX experience than tech nerds, but I have seen plenty of "normal" people curse at artsy, fluffy design that made known navigation skills useless. Nobody likes their time wasted.


Pretty sure your parent comment was being sarcastic. Why else would they write “choppy javascript keyframes fills me with joy”?

Well, I missed that word, but my irony/sarcasm detector has lately been a bit uncalibrated by the current zeitgeist.

The "choppy" JS keyframes help give it a cinematic and authentic feel. /s

I agree this type of web design sucks. It's been common for more than a decade - I remember Apple getting criticized for using this on the product page for the old "trash can" Mac Pro in 2013, and it was already widely used back then.

However, it seems pretty clear to me they did this in service of a joke - you have to "crank" your scroll wheel to get to the content, just like you have to crank this device. I think it's funny...


> you have to "crank" your scroll wheel to get to the content, just like you have to crank this device.

I actually didn't consider that interpretation.

(I also almost deleted the comment, definitely didn't expect it to blow up like this.)


Calling this web design is giving it too much credit. This is just a glorified marketing pamphlet, which is fine for its purpose.

boo wendy boo. i liked it.

Who's Wendy?

It's a quote from a 2008 South Park episode; https://en.wikipedia.org/wiki/Breast_Cancer_Show_Ever

Choosing "form over function" has been the hallmark of bad design since Flash and I don't see that tradition changing anytime soon, even with AI.

Great prop for a Black Mirror episode about AI use in a post-apocalyptic world. Everywhere you go, all you hear is brrrrr..brrr..brrrr followed by people mumbling.

Totally agree on the atrocious landing page. The technical one is much better, although the power supply circuit, by using a resistive balancer and a linear regulator, wastes some good power for nothing.

Twas probably also prompted… to pile irony over...

yea i can't stand this. im not so boomer i want every webpage to be like. times new roman white background and just using <p></p> and bulleted lists, but idk i cant even put a finger on what im not enjoying here. think it's possibly using scrolling as a way to try and force me to read through stuff. jokes on them, i can't read. not giving me the agency to click around into info that interests me drives me nuts, chances are im just gonna keep scrolling at 1000mph and eye scan until i see what im looking for virtually zero chance im going to sit through the experience of every carefully designed scroll-slide they've tried to present to me here.

Alright, I'll be the boomer and say that's what I want every webpage to be like. If you want to customize it you can bring your own CSS or download someone else's. The modern web is a nightmare of user-hostile time-thieving behavioral manipulation and our brains would be better off without it.


This website is satire, right?

P.S. I agree with you 100%


> The swing of the market based on the president's crazy tweets is just insane.

A quick glance at charts shows that VIX is not at all out of line with historical patterns, and asking ChatGPT to crunch some numbers confirms that. The "liberation day" spike was not nearly as bad as in 2008 or for COVID, and in fact not much more than events in 2010 and 2011 that people don't even have names for.


> 99. In man-machine symbiosis, it is man who must adjust: The machines can't.

It seems worth noting here that the English verb "to adjust" is ambitransitive.


Why?

It either means man must adjust (themselves), or must adjust (the machine).

So it's a somewhat arch joke that may not be apparent due to shifts in language usage. (Also, "man" in this context was short for "human" without regard to sex, which we now call gender.)


The parent alludes to the fact that the sentence could conceivably be read as "In man-machine symbiosis, it is man who must adjust [the machine]", i.e. reading "adjust" as transitive rather than as intransitive.

However, I think it's clear that the intended meaning is intransitive.


But don't you still have to logically connect the validity of the proof to desirability of the output?

> how effective frontier models (ChatGPT-5.5 in particular) are at completing certain manual proofs in the Rocq (né Coq) proof assistant. The proofs aren't always pretty, but ChatGPT can often prove something in minutes and 10 - 100 iterations that would take me, a human who has limited but non-zero proof assistant experience but significant domain experience in the lemmas being proven, much much longer.

... How do you know that the proofs are themselves correct?


With the proof checker.

I assume your idea is that if the spec and the proof are verified, the generated code is good enough as well?

Today, I write the code. It’s trivial and takes a lot less time than writing the spec, and since I’m using conventional tooling for WCET and stack sizing it’s nice to get those right up front. The LLMs sometimes tweak the code slightly for provability, but this is usually either direct operator replacement (shift with multiplication, bitwise AND with modulus, etc.) or factoring out a block to a function to tie a contract onto it, both of which I trust my compiler to undo (simple arithmetic operations and inlining, respectively) with zero to minimal impact on the generated binary.
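The operator replacements mentioned here rest on two power-of-two identities, which a small sketch can check exhaustively over a bounded range. It assumes non-negative, arbitrary-precision integers, so it says nothing about fixed-width overflow or signed behavior in the author's actual target language; the function names and ranges are hypothetical.

```python
def shift_as_multiply(x: int, k: int) -> int:
    """Arithmetic form of x << k for a non-negative x."""
    return x * (2 ** k)


def mask_as_modulus(x: int, k: int) -> int:
    """Arithmetic form of x & ((1 << k) - 1) for a non-negative x."""
    return x % (2 ** k)


def rewrites_agree(max_x: int = 256, max_k: int = 8) -> bool:
    """Compare the bitwise and arithmetic forms over a small bounded range."""
    return all(
        (x << k) == shift_as_multiply(x, k)
        and (x & ((1 << k) - 1)) == mask_as_modulus(x, k)
        for x in range(max_x)
        for k in range(max_k)
    )


print(rewrites_agree())
```

A bounded check like this is only a spot test, not a proof; the appeal of the arithmetic forms is presumably that they are easier for a prover to reason about than the bit-level originals.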

Proof checkers fuel the AI hype by outputting "valid" for a hallucinated text. /s

> But the ridiculousness of saying New York style pizza is not pizza or that you have to make things the "right way" needs to go.

Suppose people say it; why shouldn't they be entitled to their opinion? How does it harm anyone? People who like New York style pizza are equally free to just disagree, and keep making it.

