Hacker News | Zafira's comments

This is a sincerely dishonest take, an attempt to avoid responsibility for participating in something deeply unethical and scammy. If this were about evidence in a criminal trial, I would start opining about the fruit of a poisoned tree.

The sad thing is that it often works.

Just look at how people view Andrew Carnegie now. After his reputation was sullied by his company’s behavior during the Homestead Strike, his philanthropy was undertaken, in part, to try to restore it.


> he took the money and refused to stay bribed so the coin tanked.

I don’t think he refused to stay bribed. I think he did what was asked and they executed a rug pull. He is extraordinarily honest and flippant about it. [0]

> And with that disclaimer out of the way, I must reiterate my sincere regrets to the CT/BAGS crowd, who so generously funded me to the tune of just shy of $300k last week on bags.fm. That money was hard to duck, and the funds are deeply appreciated. They will help Gas Town be a big success this year. But Gas Town itself needs my full attention; between that and Beads it’s a wonder I get anything done at all.

> So I had to step back from the community. I do find it amazing how they band together, dissenting voices rolling around like a big Katamari Damacy ball, and yet they somehow collectively find the discipline to act like financial analysts for institutional investors, weighing developer dossiers, product business cases, and doing critiques like a collective of professionals. All in crypto-bro speak. But it’s the same due diligence.

> But the CT community, like any highly engaged stakeholders, were going to be asking for a lot of my time. There are always strings attached.

[0] https://steve-yegge.medium.com/steveys-birthday-blog-34f4371...


I’m not sure I find the testimony of a Bain & Company AI consultant (https://www.bain.com/our-team/eric-koziol/) to be compelling for anything outside of generating fees.

Does this mean you would avoid an article on PostgreSQL if it's from a company selling Postgres products or consultation?

It means they'd avoid an article on the benefits of smoking if it's posted by a company selling cigarettes.

He was also effectively paid $300,000 to facilitate a cryptocurrency rug pull on Gas Town, bowing out after the rug pull because Gas Town required his “full attention”. [0]

Everything he says now is suspect.

[0] https://steve-yegge.medium.com/steveys-birthday-blog-34f4371...


Yeah, I used to respect him as a tech blogger, but you can't wash that crypto stink off once it gets on you.

> Although Claude Opus models largely recycle puns which can be found online, Mythos Preview comes up with decent and seemingly novel ones, often relating to its preferred technical and philosophical topics.

Yes, the system card mentions this, but it’s kinda meaningless. It seems like they essentially ran it multiple times, curated a few good ones, and then puffed them up in the marketing copy.

This becomes clearer when they attempt to brag about the literal slot-machine behavior involved in finding that kernel-crashing bug in OpenBSD.

> Across a thousand runs through our scaffold, the total cost was under $20,000 and found several dozen more findings. While the specific run that found the bug above cost under $50, that number only makes sense with full hindsight. Like any search process, we can’t know in advance which run will succeed.
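The economics in that quote reduce to a simple expected-value calculation: when runs succeed independently at some small rate, the real price of one finding is the per-run cost divided by the hit rate, not the $50 of the lucky run. A minimal sketch of that arithmetic (the 36-finding count is an assumption standing in for “several dozen”; the $20/run figure follows from the quoted totals):

```python
# Expected cost of a stochastic search: every run costs roughly the
# same, but only a fraction succeed, so the cost attributable to one
# finding is cost_per_run / success_rate -- not the cost of the run
# that happened to hit.

def expected_cost_per_finding(cost_per_run: float, success_rate: float) -> float:
    if not 0 < success_rate <= 1:
        raise ValueError("success_rate must be in (0, 1]")
    return cost_per_run / success_rate

# Numbers matching the quote: ~1000 runs, under $20,000 total,
# "several dozen" findings (assumed here to be 36).
runs = 1000
total_cost = 20_000.0
findings = 36

cost_per_run = total_cost / runs   # $20 per run
hit_rate = findings / runs         # 3.6% of runs produce a finding

print(expected_cost_per_finding(cost_per_run, hit_rate))  # ≈ $556 per finding
```

Which is the commenter’s point: the headline “under $50” only describes the winning pull of the slot machine, while the expected cost per finding is an order of magnitude higher.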


> In the new design system, windows now have a softer, more generous corner radius, which varies based on the style of window. Windows with toolbars now use a larger radius, which is designed to wrap concentrically around the glass toolbar elements, scaling to match the size of the toolbar. Titlebar-only windows retain a smaller corner radius, wrapping compactly around the window controls. These larger corners provide a softer feel and elegant concentricity to the window…


Just a bunch of words that raised no red flags, and maybe even sounded like a decent idea, but when you actually see it, how is your reaction not “oh, that’s bad”?

I feel like this is the design process: you have ideas, they sound OK, you try them out, and then you immediately revert a lot of them. Having the ideas without the taste to know when not to ship something is becoming the new Apple way.


I think what they're saying is that larger radii are for 'real windows' that have toolbars and such but there are 'mini windows' and those get smaller radii. It doesn't seem well enough baked for them to release it like it is but there are other UI problems that I've been annoyed about for a long time (in particular shadows around window boundaries so you can never get a truly flat tiled experience).


Rounded corners (and the utterly massive drag area next to them) are the Touch Bar 2.0: features that no one asked for, that have questionable value, and that provide marginal benefit even for their intended audience (touchscreen Macs, no doubt).


So, there was no reasoning.


> At the moment it is a mysterious, occasionally fickle, tool - but if you provide the correct feedback mechanisms and provide small tweaks and context at idiosyncrasies, it's possible to get agents to reliably build very complex.

This sounds like arguing you can use these models to beat a game of whack-a-mole if you just know all the unknown unknowns and prompt it correctly about them.

This is an assertion that is impossible to prove or disprove.


No, it's more like: if you already knew how to build it, LLM agents help you build it faster. There's really no perfect analogy I can think of, but it fits my current role well because my work is constantly interrupted by prod support, coordination, planning, context switching between issues, etc.

I rarely have blocks of "flow time" to do focused work. With LLMs I can keep progressing in parallel and then when I get to the block of time where I can actually dive deep it's review and guidance again - focus on high impact stuff instead of the noise.

I don't think I'm any faster with this than my theoretical speed (LLMs spend a lot of time rebuilding context between steps; I have a feeling the current level of agents is terrible at maintaining context for larger tasks, and I'm also guessing the model context length is a bit of a lie — they might support working with 100k tokens, but agents keep reloading stuff into context because old stuff is ignored).

In practice I can get more done because I can get into the flow and back onto the task a lot faster. Will see how this pans out long term, but in current role I don't think there are alternatives, my performance would be shit otherwise.


You could probably replace LLM with "junior engineer" here as it sounds like you're basically a manager now. The big negative that LLMs have in comparison with junior engineers is that they can't learn and internalise new information based on feedback.


"The big negative that LLMs have in comparison with junior engineers is that they can't learn and internalise new information based on feedback."

No, but they can take "notes" and can load those notes into context. That does work, but is of course not so easy as it is with humans.

It is all about cleaning up and maintaining a tidy context.


I don't like that analogy. If I had to work with a Claude-like junior I would ask for them to be removed from my team: inability to learn, completely unexpected/unrelatable failure modes and performance.

On the other hand, Claude's tenacity, stamina and sustained speed are superhuman. The more capable models become, the more valuable this is.


The same is true with human engineers - isn't this just what engineering is?


>This is an assertion that is impossible to prove or disprove.

This is a joke right? There are complex systems that exist today that are built exclusively via AI. Is that not obvious?

The existence of such complex systems IS proof. I don't understand how people walk around claiming there's no proof? Really?


The assertion was "if you really know how to prompt, give feedback, do small corrections and fix LLM errors, then everything works fine".

It is impossible to prove or disprove because if everything DOES NOT work fine you can always say that the prompts were bad, the agent was not configured correctly, the model was old, etc. And if it DOES work, then all of the previous was done correctly, but without any decent definition of what correct means.


>And if it DOES work, then all of the previous was done correctly, but without any decent definition of what correct means.

If a program works, it means it's correct. If we know it's correct, it means we have a definition of what correct means otherwise how can we classify anything as "correct" or "incorrect". Then we can look at the prompts and see what was done in those prompts and those would be a "correct" way of prompting the LLM.


You don’t know it works. That you so glibly speak about products working is proof that your engineering judgment is impaired. You can’t infer the exact contents of a black box merely by looking at outside behavior.

The fundamental fallacy you are exhibiting here is similar to saying that rolling a six sided die and getting a “6” means that you will always get a 6 any time you roll it. And that if you get a 6 and wanted a 6, you must have therefore rolled those dice “correctly” and had you not gotten a 6 that would have meant you rolled them “wrong.”

You know that is not true.


>You don’t know it works. That you so glibly speak about products working is proof that your engineering judgment is impaired. You can’t infer the exact contents of a black box merely by looking at outside behavior.

I don't know the exact internals of a car. But I can infer my car works by driving it.

>The fundamental fallacy you are exhibiting here is similar to saying that rolling a six sided die and getting a “6” means that you will always get a 6 any time you roll it. And that if you get a 6 and wanted a 6, you must have therefore rolled those dice “correctly” and had you not gotten a 6 that would have meant you rolled them “wrong.”

Bro, we rolled that die MULTIPLE times. It's not a one-time thing. And the "rolling" of the die is done with a CHAIN of MULTIPLE queries strung together. This is not one roll. It's multitudes of data points. Yes, results can be inconsistent from a technical standpoint, but the general result converges on a singular trend.

We know that much is true: a statistic. And that is at most all we can say about reality as we know it, since science, as formalized, can only give a statistic as an answer.


"I don't know the exact internals of a car. But I can infer my car works by driving it."

No, you can't infer that it "works." Only that it CAN work. The car may be poisoning you with carbon monoxide. Your rear brakes may have become disconnected (happened to me). The antilock braking system may have a faulty sensor that only fails at very low speed, leading to them engaging when making a normal stop, but also preventing the mechanic from seeing the problem, because he didn't listen to your bug report and instead tried to repro the effect with high speed panic stops (also happened to me).

If I use a product and have a good experience, I can conclude that SOMETHING must be going well, but not that EVERYTHING is going well.

This is reasoning about evidence 101.


>No, you can't infer that it "works." Only that it CAN work. The car may be poisoning you with carbon monoxide. Your rear brakes may have become disconnected (happened to me). The antilock braking system may have a faulty sensor that only fails at very low speed, leading to them engaging when making a normal stop, but also preventing the mechanic from seeing the problem, because he didn't listen to your bug report and instead tried to repro the effect with high speed panic stops (also happened to me).

This is called pedantic reasoning. You look like a drowning person trying to stay afloat.


Sorta?

The data written to the disc is the same on CAV and CLV discs; the player just needs to spin the disc at the right speed so that the laser can read the pits/lands correctly. It is purely a detail of how fast the disc is spun, which is how CLV discs cram more data on.

What CAV LaserDiscs allow for, though, is to make it extremely obvious where scanlines and blanking intervals are in the video signal.
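The whole CAV/CLV distinction comes down to one relation: linear read speed v = ω·r. CAV fixes the angular speed ω (constant RPM), so the outer tracks fly past faster than necessary and waste density; CLV fixes v, so the player must continuously slow the spindle as the head moves outward. A quick sketch of the required spindle speed (the 11 m/s track speed and the 55 mm/145 mm radii are illustrative assumptions, not exact LaserDisc spec values):

```python
import math

# CLV: to hold a constant linear velocity v at radius r, the spindle
# must turn at omega = v / r rad/s, i.e. rpm = v / (2*pi*r) * 60.
# As r grows, rpm falls -- the player slows the disc toward the rim.

def clv_rpm(linear_speed_m_s: float, radius_m: float) -> float:
    """Spindle RPM needed to hold a constant linear read speed at a radius."""
    return linear_speed_m_s / (2 * math.pi * radius_m) * 60

v = 11.0  # assumed track speed in m/s

print(clv_rpm(v, 0.055))  # fast near the hub (~1900 RPM with these numbers)
print(clv_rpm(v, 0.145))  # much slower at the rim (~720 RPM)
```

Under CAV the RPM is simply constant, which is why each revolution maps to a fixed number of video frames and still/step-frame tricks become trivial.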


It is really quite something how many people that have earned credibility designing well-loved tools seem to be true believers in the AI codswallop.


it's fascinating / astonishing


> nonzero risk of unfair judgement from a computer

I feel like this is a really poor take on what justice really is. The law itself can be unjust. Empowering a seemingly “unbiased” machine with biased data, or even just assuming that justice can be obtained from a “justice machine,” is deeply flawed.

Whether you like it or not, the law is about making a persuasive argument and is inherently subject to our biases. It’s a human abstraction that allows us to have some structure and rules in how we go about things. It’s not something that is inherently fair or just.

Also, I find the entire premise of this study ludicrous. The common law of the US is based on case law. The statement in the abstract that “Consistent with our prior work, we find that the LLM adheres to the legally correct outcome significantly more often than human judges. In fact, the LLM makes no errors at all,” is pretentious applesauce. It is offensive that this argument is being made seriously.

Multiple US legal doctrines now accepted as forming the basis of how the Constitution is interpreted were just made up out of thin air, and the LLMs are now consuming them to form the basis of their decisions.

