Hacker News | mquander's comments

The linked report seems almost useless -- it doesn't say anything about an error rate or a sample size, so it's a mystery whether 9 out of 20 systems “fabricated information and made suggestions to patients' treatment plans” ten out of ten times, or one out of a thousand times.

If we just postulate that the systems have a high error rate, I wonder why they are being adopted. They seem extremely easy to test, so I don't see why doctors or hospitals or governments should be getting tricked into buying them if they suck.


>If we just postulate that the systems have a high error rate, I wonder why they are being adopted.

From the article: "While 30 percent of a platform’s evaluation score depended solely on whether they had a domestic presence in Ontario, the accuracy of medical notes contributed only 4 percent to the total score."

Accuracy wasn't really part of the scoring, Ontario doesn't care about it.


Scoring systems that function by adding up several parts never make sense. Video game magazines used to do that, but it meant that you could have wretched gameplay, and still get a decent score, from points in other categories like audio, graphics, and cinematics.
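
To make the arithmetic concrete, here is a rough sketch of an additive score. Only the 30 percent and 4 percent weights come from the quoted article; the "other" bucket, the per-category scores, and the `gated_score` alternative with its 0.8 threshold are assumptions made up for illustration, not anything Ontario actually uses.

```python
# Only the 0.30 and 0.04 weights come from the quoted article.
# The remaining 0.66 is lumped into a hypothetical "other" bucket.
WEIGHTS = {"domestic_presence": 0.30, "accuracy": 0.04, "other": 0.66}


def additive_score(parts):
    """Weighted sum of per-category scores, each assumed to lie in [0, 1]."""
    return sum(WEIGHTS[name] * parts[name] for name in WEIGHTS)


def gated_score(parts, gate="accuracy", minimum=0.8):
    """Hypothetical alternative: zero overall if the gate category is too low."""
    if parts[gate] < minimum:
        return 0.0
    return additive_score(parts)


# A made-up vendor that is perfect everywhere except accuracy.
inaccurate_vendor = {"domestic_presence": 1.0, "accuracy": 0.0, "other": 1.0}

print(round(additive_score(inaccurate_vendor), 2))  # 0.96
print(gated_score(inaccurate_vendor))               # 0.0
```

Under the purely additive scheme, a vendor with zero accuracy still lands at 0.96 out of 1, which is the "wretched gameplay, decent score" problem in miniature.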


> Of course the safest (first) option is the correct option from a liability standpoint, which is all a company should operate on since its first responsibility is to protect the company for those that are still there.

Isn't this an unrealistically black-and-white mode of thinking? Humans are complicated and have many values and perceived responsibilities. It's not healthy for them to throw them all out and act as if they only have one responsibility that needs to be maximally upheld at all costs. They should balance their actions thoughtfully.


System security is not a human value. Access key rotation effective immediately is a compliance requirement, and completely orthogonal to human decency, which is delivered through garden leave or severance, not extended system access.


So, never lived in corp land? Healthy isn’t on most corporations' radars except where it causes liability to them.


I haven't, but the parent said that this is what a company "should" do, not just what they do do.


That rule excludes many socially excluded nerds. So if there are more and more spaces like that, the nerds those spaces exclude are probably going to go make new communities and attempt to keep out the people who just excluded them.


Your comment is framed like "giving a student a personal AI datacenter to carry with them" is unrealistic, but in fact it is easy for anyone with access to $1000-$2000 worth of compute to download and operate exactly that for free, with performance perhaps a year behind the state of the art.


> but in fact it is easy for anyone with access to $1000-$2000 worth of compute

Even if we assume that to be true, you severely underestimate how many people that condition excludes.


There are simpler LLMs that run on much cheaper devices and are still helpful for baseline tasks. Of course they are prone to hallucinating once they reach the limits of their world knowledge, but this also changes their effectiveness in an educational context: they can help you polish a paper (much of their reliable knowledge is about language, syntax and style/pragmatics of the input texts), but you still have to plan the writing on your own.


Maybe teaching students to use whatever devices they have to run AI is the way, sure. All I tried to say is that if we're teaching students to think independently, we should teach them independent tools.


It doesn't exclude people who attend high schools and colleges that have a computer lab.


That, however, requires significant investment: either each computer gets a powerful GPU for local inference (which costs a fortune) or the school gets a rack's worth of compute. Most schools, however, struggle even to get their children fed.

Another issue is that it forces kids to stay in school longer to do their homework, which can be a serious problem in rural areas where public transport is limited. Parents are then forced to fetch their kids from school, which may not be compatible with their working hours.


This comment would make more sense if it were before the new wave of prediction markets, which are high-profile gambling products clearly largely made and popularized by true believers who think they are making the world a better place.


Then you don't want to hire someone so insane as to think that gambling is making the world a better place.


> clearly largely made and popularized by true believers who think they are making the world a better place.

Is it clear? To a lot of people they come off as “true believers” in the same way as Kenneth Copeland and all the prosperity gospel hustlers. A lot of people thought Elizabeth Holmes was a true believer too. Easy to believe in something when it’s making you rich. Maybe VCs are just suckers for a bit of charisma.


Thinking that gambling in any way, shape, or form makes the world a better place, or even just leaves it as it is, is utterly delusional, and any contact with (or support for) that person should be avoided like the plague.


I think it's slightly less ridiculous than it sounds, because governments have much more power over their own citizens. As an American I would dramatically prefer the Chinese government to spy on me than the American government, because the Chinese government probably isn't going to do anything about whatever they find out.

(That logic breaks down somewhat in the case of explicitly negotiated surveillance sharing agreements.)


> because the Chinese government probably isn't going to do anything about whatever they find out.

This really depends. If a foreign adversary's surveillance finds you have a particular weakness exploitable for corporate or government espionage, you're cooked.

Domestic governments are at least somewhat accountable to domestic laws, in theory (current failure modes in the US aside).


Exactly, and that danger grows as the ability to do so in increasingly automated and targeted ways increases. That should be very obvious now, looking at the world around us.

Also, failing to consider the legal and rights regime of the attacker is wild to me. Look at what happens to people caught spying for other regimes. Aldrich Ames just died after decades in prison, and that’s one of the most extreme cases — plenty have got away with just a few years. The Soviet assets Ames gave up were all swiftly executed, much like they are in China.

Regimes and rights matter, which is why the democracy / autocracy governance conflict matters so much to the future trajectory of humanity.


Yes, exactly this.

> As an American I would dramatically prefer the Chinese government to spy on me than the American government, because the Chinese government probably isn't going to do anything about whatever they find out.

> spy on me

People forget to replace "me" with "my elected representative" or "my civil service employee" or "my service member" or their loved ones.

I, personally, have nothing significant that a foreign government can leverage against our country but some people are in a more privileged/responsible/susceptible position. It is critical to protect all our data privacy because we don't know from where they will be targeted.

Similarly, for domestic surveillance, we don't know who the next MLK Jr could be or what their position would be. Maybe I am too backward to even support this next MLK Jr but I definitely don't want them to be nipped in the bud.


> I don’t know who operates this agent, and I’m not going to speculate about why they did what they did.


I don't really understand the criticism. The authors aren't claiming to have the strongest chess engine without search. They are just showing that they got a chess engine to a respectable level with their process, which is somewhat different from LC0. They do in fact explain that explicitly:

> Leela Chess Zero’s networks, which are trained with self-play and RL, achieve higher Elo ratings without using explicit search at test time than our transformers, which we trained via supervised learning. However, in contrast to our work, very strong chess performance (at low computational cost) is the explicit goal of this open source project (which they have clearly achieved via domain-specific adaptations). We refer interested readers to [https://arxiv.org/abs/2409.12272] (which was published concurrently to our work) for details on the current state-of-the-art and a comparison against our network.

And I don't think the criticism of their writing is on point either. I don't think they are secretly implying that their engine is better than Stockfish. And it's 100% plausible for human masters to rigorously analyze many positions with engine assistance and correctly establish whether Stockfish's evaluation is right or not.


First of all, the title is misleading: "GM level" to most of us means moves of the quality that a GM makes when playing at classical time control. As of several years ago, LC0 needed around 35 search nodes per move to do that. With LC0's new transformer architecture, that number has probably gotten a lot lower, but not all the way down to 0. Second of all, the article complains about the Google paper not citing some other publication. That's a concrete criticism, though I haven't checked its validity.


That's great, I'm going to use that one in the future.


I recommend Matthew Sadler's Game Changer and The Silicon Road To Chess Improvement.

