
icyfox · 2026-06-15T15:56:48 1781539008

1. Factory limits, basically. There's a limit to the number of fabrication lines that can produce RAM. Combine that with the current market incentives to make high bandwidth memory (HBM) over server memory (DRAM): HBM starts as DRAM dies, so it competes with normal DRAM for wafer starts / cleanroom fab capacity.

2. Eventually more plants will come online. Most of the main manufacturers have announced expansions, but these can take O(years) to come online.

srdjanr · 2026-06-16T07:46:00 1781595960

Are more plants coming? I think I heard it won't be many of them, because it's risky.

If the bubble bursts and RAM demand drops, then they'll have big losses. And that's not an impossible scenario over the X years that it takes to build a plant.

icyfox · 2026-05-14T18:46:12 1778784372

Not particularly. I'm not yet convinced people's mouse movements are unique enough to our identity that they're useful as a fingerprint, whereas it's very easy to classify whether something looks bezier or looks human.

Eventually I'm hoping to collect enough data here to train a biased decoding model, so you could input some randomized personality vector (which implicitly encodes slow movement, jerky motion, trackpad, mouse, etc) and have that impact the RNN generation. So in theory there would be infinite combinations from the larger subspace we're sampling from.

icyfox · 2026-05-14T18:44:34 1778784274

Could look into addressing this. What are you trying to achieve?

icyfox · 2026-04-20T21:05:58 1776719158

So much of what Apple has lost over the last 10 years comes down to a lower bar for what counts as good enough.

You see this most obviously in software and marketing - the kinds of decisions where only a few people sign off at the end, and where "good enough" is whatever those few people decide it is. You see it less in hardware and procurement where there's a powerful review cycle and scrutiny at every level of the stack. Work there is more immediately measurable: benchmarks for performance, dollars for cost.

The "vibe" of software, or of a PDF [^1], is much harder to catch that way. There's no benchmark that flags it, and most conventional executives aren't drilling down to that level of detail to see it either.

You want distributed decision-making, of course. But that only works well if it's distributed to people who've cultivated their own taste and who will make good calls under pressure. I'm not sure how much of that gets fixed by leadership change at the top. Taste isn't really something a CEO can decree into a 60,000 person org. But I've only heard good things about Ternus, so I'm optimistic. Fingers crossed for a bright new chapter.

[^1]: https://www.apple.com/promo/pdf/US_FY26_Earth_Day_Promo_Tand...

icyfox · 2026-04-01T05:17:26 1775020646

Digitizing my old tapes was one of the most rewarding side projects that I did over the last year. I managed to get in under the wire (pun intended) of Firewire compatibility on Sequoia and a long daisy-chain of adapters. But it was clear the days of this approach were numbered. I'm optimistic these 3rd party accessories will become more standardized into self-contained cheap boxes where people can easily transfer over their stuff before camcorders degrade.

My pipeline went camera -> dvrescue -> ffmpeg -> clip chunking -> gemini for auto tagging of family members and locations where things were shot.

We now have all our family's footage hosted on a NAS with Jellyfin serving over Tailscale to my parents' MacBooks. I found the clip chunking in particular made the footage a lot more watchable than just importing the two-hour-long tapes, although YMMV.

eisa01 · 2026-04-01T05:40:24 1775022024

I am going to finish such a project soon myself, including some old Video8 tapes! Sounds like you're on macOS. Any reason you didn't use iMovie for the capture itself?

The Video8 tapes have already been digitized via a Digital8 camcorder, but apparently you can get even better quality out of old analog tapes with the vhsdecode project. Let's see if I ever get around to that, but at least it bypasses Firewire entirely: https://github.com/oyvindln/vhs-decode https://www.reddit.com/r/vhsdecode/

icyfox · 2026-04-01T05:46:33 1775022393

Mostly wanted to fully automate the pipeline (auto-rewind tape, scan tape head position, etc.), and iMovie is just using the same AVFoundation APIs under the hood that you can call manually. Took some notes here if helpful: https://pierce.dev/notes/automating-our-home-video-imports

Wish vhsdecode was easier to use in practice! Such a cool idea but a bit too inconvenient to hack your own hardware like this...

Denatonium · 2026-04-01T13:44:08 1775051048

I used dvgrab to ingest my old tapes, and ffmpeg and avisynth/QTGMC to deinterlace and encode files for easy viewing (though I keep the original .dv files).

The biggest issue I ran into was that while the audio and video were properly synced up in the original .dv file (due to it being an interleaved format), when I re-encoded the videos, the audio and video would drift out of sync as the video went on.

I was able to fix the sync issues by using dvgrab to split the original dv file into a bunch of 3 minute chunks. I then wrote a script to extract the audio track from each chunk, pad the end of the audio with milliseconds of silence to the exact length of the video track, combine the padded audio tracks, encode the combined track, and mux the fixed audio track with the encoded video. This worked really well; the silence padding is imperceptible, but the audio and video are still in sync - even after 2 hours.

A final point that needs making is that doing anything with dv files in ffmpeg (even -c:v copy) destroys the SMPTE timecodes embedded in the original file, making it much harder to split by scene.

CountHackulus · 2026-04-01T14:45:19 1775054719

Just because I've dealt with this exact issue in the past, it may have been a 30fps vs 29.97fps issue. For me the audio was a fixed length, but the frame rate was SLIGHTLY too fast. The problem can manifest as either too slow or too fast depending on which side is expecting 30fps vs 29.97fps.
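For a sense of scale, here is a rough back-of-the-envelope sketch of how much offset a 30 vs 29.97 mismatch would produce. It assumes the "29.97" rate is exactly 30000/1001 fps; the function name and the durations are only illustrative, not anything from the pipeline above.

```python
from fractions import Fraction

NTSC_FPS = Fraction(30000, 1001)  # the "29.97" rate, written exactly
NOMINAL_FPS = Fraction(30, 1)


def drift_seconds(duration_s, actual_fps=NTSC_FPS, assumed_fps=NOMINAL_FPS):
    """Offset accumulated when footage shot at actual_fps is timed as assumed_fps."""
    frames = Fraction(duration_s) * actual_fps   # frames actually on the tape
    played_duration = frames / assumed_fps       # how long they last at the wrong rate
    return float(Fraction(duration_s) - played_duration)


print(drift_seconds(3 * 60))       # one 3-minute chunk: about 0.18 s
print(drift_seconds(2 * 60 * 60))  # a full 2-hour tape: about 7.19 s
```

The offset grows linearly (duration / 1001), so a mismatch of this kind is barely noticeable at the start of a tape and obvious by the end.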

Denatonium · 2026-04-01T22:09:33 1775081373

I think it was just clock drift on the camcorder during the initial recording, as I'm pretty sure I tried adjusting the frequency of the audio track to make it the same duration as the video track, and the A/V sync was still wrong.

I'm so glad the audio and video tracks are stored interleaved, as it made my solution possible, and the results I got were great. By splitting the interleaved video into small enough chunks, padding the audio, and cutting it exactly to video length, the padding was practically imperceptible.

The only issue I ran into was that ffmpeg can't cut audio with any real precision. I eventually figured out that I could dump the audio track to a headerless PCM file, calculate the exact byte offsets for my cut points, and cut them with perfect precision using the head and tail commands from GNU coreutils. This was perfect because I was able to use the cat command to combine all of the padded audio chunks into a single raw PCM file, which I then made an AAC encode of with ffmpeg to mux with my original encoded video track.
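The byte-offset idea can be sketched roughly like this in Python. This is not the original script: it assumes headerless 16-bit stereo PCM at 48 kHz (adjust the constants for your tapes), and `byte_offset` and `fit_to_length` are hypothetical helper names.

```python
from fractions import Fraction

# Assumed raw PCM layout; these values are not stated in the comment above.
SAMPLE_RATE = 48_000
CHANNELS = 2
BYTES_PER_SAMPLE = 2  # 16-bit samples
FRAME_BYTES = CHANNELS * BYTES_PER_SAMPLE


def byte_offset(seconds):
    """Byte position of a cut point, snapped to a whole sample frame."""
    return round(seconds * SAMPLE_RATE) * FRAME_BYTES


def fit_to_length(pcm, target_seconds):
    """Trim or zero-pad raw PCM so it lasts exactly target_seconds."""
    target = byte_offset(target_seconds)
    if len(pcm) >= target:
        return pcm[:target]                       # cut, like `head -c`
    return pcm + b"\x00" * (target - len(pcm))    # pad the tail with silence


# A chunk with 5395 video frames at 30000/1001 fps needs this many audio bytes:
video_seconds = Fraction(5395 * 1001, 30000)
print(byte_offset(video_seconds))  # 34562528
```

Joining the fitted chunks is then plain concatenation (`b"".join(chunks)`), which plays the same role as `cat` on the raw files.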

Melatonic · 2026-04-01T17:37:07 1775065027

This is very likely it

Melatonic · 2026-04-01T17:38:22 1775065102

Transcode to another format first that keeps the timecode?

Denatonium · 2026-04-01T21:57:14 1775080634

Ffmpeg's dvvideo implementation is unfortunately just broken and mangles timecodes, even if just doing a stream copy from dvvideo to dvvideo without any re-encoding.

Fortunately, dvgrab does allow you to take the original .dv file and generate a .srt subtitle track with time stamps that you can mux into your encoded files.

ErroneousBosh · 2026-04-01T08:11:28 1775031088

If you are capturing I find dvgrab is pretty good. It's what I've been using for about 25 years now!

In the olden days when I got paid to shoot real video on a VX2000 and edit it for people, I captured using a PCI Firewire card and dvgrab in Slackware, rewrapped with probably mencoder (shading towards ffmpeg when it became more popular, and developed!), dual-booted into Windows 2000 and cut in Premiere 5.0, then went back into Linux to transcode back to DV if I wanted to write it out to DV tape.

These days I shoot on a PD150 or DSR500 (and quite often some HDV cameras), capture via a PCIe Firewire card and dvgrab in Ubuntu, rewrap with ffmpeg, and edit in Resolve, without the dual-booting step.

If you use dvgrab it will split the capture up into separate clips on shot boundaries based on the pause/unpause markers on the tape. I have not found a way to extract good/no good from the stream, but if you're not shooting on a broadcast camera you don't have this anyway. Timecode is preserved though!

When you load it all up in Resolve, one of the options in the Cut page is "Source Tape View" which runs all your clips together by timecode, and lets you view them as though they were a continuous tape of your rushes, which is how we used to do basic assemble editing in the olden days of clunky tape decks and edit controllers with big rows of red 7-segment displays.

Edit your old home videos. You can do that now, and they'll be far more watchable.

polishdude20 · 2026-04-01T08:18:00 1775031480

A few years ago I did a bit more of a crude flow.

Play the footage on a tv in a dark room. Place a 4k camera on a tripod and record the tv with audio into the camera audio port.

Worked perfectly.

Melatonic · 2026-04-01T17:39:42 1775065182

Actually not a terrible way to go from interlaced to progressive footage. Depending on the TV and camera

cromka · 2026-04-01T13:06:47 1775048807

> gemini for auto tagging of family members

With all respect, reading this part made me feel uneasy.

romanhn · 2026-04-01T05:38:37 1775021917

Went through a very similar journey recently as well. In my case using a Macbook was a non-starter, as certain adapters are prohibitively expensive these days, if you can even get your hands on one. Thankfully my son has a desktop Windows PC and Firewire PCI cards are cheap and plentiful, so getting connected that way worked out. Much better than an earlier attempt via RCA cables (simple but digital -> analog -> digital is not the way to go).

My pipeline was camera -> WinDV -> DVdate (to extract exact datetimes into srt subtitles) -> Handbrake (to convert to mp4).

shevy-java · 2026-04-01T11:25:57 1775042757

> Digitizing my old tapes was one of the most rewarding side projects

I also wanted to do that, but then I realised I needed to invest more time and may need some hardware, so one day I simply had enough, went to a commercial shop and had them turn all the old stuff into digital. The cost wasn't that huge either, so considering that I could also save time (doing it myself), I am ok with that investment. Hopefully the future has digital everywhere. Storage to be cheaper too, ideally.

sorenjan · 2026-04-01T12:46:17 1775047577

Can you expand on the Gemini tagging part? What did you do with the tags, import them into Jellyfin after cutting the videos into parts?

vardump · 2026-04-01T08:20:52 1775031652

Is it possible to accomplish tagging with local AI instead of Gemini?

icyfox · 2026-04-01T19:27:10 1775071630

As far as I've seen, local OSS video understanding models just really aren't there yet. I briefly looked at facial recognition models but a good amount of signal was actually in the video's audio instead of the raw video frames. Depends on the accuracy you're looking for at the end of the day.

vardump · 2026-04-02T08:10:52 1775117452

Thanks for the reply. Let's hope local models catch up.

icyfox · 2026-01-22T17:28:07 1769102887

Waymo is such an interesting case study. For most other ~AI deployments you have strong public reaction to the proliferation of slop, non-human failure modes, cost cutting at the expense of quality, etc. But I haven't met a single person who doesn't like the experience of Waymo. They ended up cracking the code on what I suspect people really want:

- consistent car quality

- safety of the drive (conservative driving and potential fear of drivers)

- no randomly chatty driver

All of those feel like a breath of fresh air especially when stacked up against the current state of Uber & Lyft rides. People really just want consistency. I don't actually think you needed AI to get there (I've had occasional rides in black cars that provided the same experience). Waymo was just right time, right place, right price.

autoexec · 2026-01-22T17:40:21 1769103621

> but I haven't met a single person who doesn't like the experience of Waymo.

Just last week a Waymo was driving on train tracks and the rider had to jump out of the car and run because the car stopped while trains came at it. (https://www.youtube.com/watch?v=26KJvL2clTs) I bet that guy'd have something to say about the experience.

1stranger · 2026-01-22T22:26:33 1769120793

Yeah that's obviously not great but that video is nothing like what you described. You made it sound like it drove onto a mainline train track with a train barreling down the tracks that couldn't stop with the guy diving out of the car to avoid getting clobbered. It did not, it got stuck on a tram track. Not quite the same thing.

x86x87 · 2026-01-22T17:38:22 1769103502

not having to talk to the driver and picking my own music are my fav parts. the novelty wears off quick and it becomes normal

nerdsniper · 2026-01-22T18:15:42 1769105742

I've had Waymos in SF take very strange routes. It seemed to really strongly avoid ever using Market St, generally preferring a long right-angle route over the perfect hypotenuse. Sometimes this delayed me very considerably, doubling my ride time compared to the Google Maps estimated time.

That said, I've never felt unsafe or uncomfortable. But I have jumped out halfway through the ride and grabbed an eScooter instead.

icyfox · 2026-01-22T23:22:27 1769124147

Market used to be closed to all cars (2021-2025); only taxis and busses were allowed but that changed recently:

https://www.sfmta.com/blog/creating-better-market-street-car... https://www.planetizen.com/news/2025/08/135849-sfs-market-st...

Wonder if that explains your observed preference. I'd bet Waymos will start utilizing the route again if it aligns with Google's mapping algo.

dmitrygr · 2026-01-22T22:19:35 1769120375

Back when I had to drive/walk in SF, I would also go quite out of my way to avoid market or mission. Especially near 6th. Self-preservation and whatnot...

pjc50 · 2026-01-22T17:31:34 1769103094

There's a lot of complaints about externalities, especially when a power cut stopped all the vehicles in a city recently.

icyfox · 2026-01-22T17:32:53 1769103173

I'm not commenting on the externalities. For that I'd also cite economic impact, job loss, occasional emergency services issues, etc. I'm talking about the experience when you yourself are taking a ride. I haven't met a single person who's said "this sucked - I'm going back to Uber".

seanmcdirmid · 2026-01-22T17:34:33 1769103273

I think parent was talking about how users of the service were very satisfied with it, not about externalities.

holler · 2026-01-22T17:34:10 1769103250

My first and only Waymo ride was super sketch. Car slowed down to ~5mph in a 35mph zone and stayed that way for 5+ minutes as other cars were swerving around us. Felt like it was going to come to a complete stop in the middle of the road. I prefer real humans.

mbesto · 2026-01-22T17:44:44 1769103884

What you're getting at is basically the difference between probabilistic models vs deterministic ones.

spongebobstoes · 2026-01-22T17:47:45 1769104065

waymo is also a probabilistic deep learning system

asdff · 2026-01-22T17:43:13 1769103793

Tried calling it and it left without picking us up.

icyfox · 2026-01-09T21:23:37 1767993817

At the risk of being overly pedantic, toxicologists would typically classify this as venom.

Venom is inert if digested; it's only a problem if it gets in your blood stream. So arrows that were laced with venom and thereby contaminated meat were actually perfectly safe to eat.

Poison is different. If ingested, inhaled, or absorbed it will kill you.

skrebbel · 2026-01-09T23:10:18 1768000218

We Dutch solve this problem by having a single word for "poison", "venom" and "toxin"¹. Everybody still knows what you mean and nobody gets to be pedantic.

¹ and "badly compressed looping animation"

pjmlp · 2026-01-10T15:52:17 1768060337

Same in Portuguese, veneno.

Although there are plenty of other opportunities for pedantry, especially when we take regionalisms, and other Portuguese speaking countries into account.

XCSme · 2026-01-10T00:50:47 1768006247

Is the word "stamppot" ?

usrnm · 2026-01-10T08:13:29 1768032809

Just "food". Any kind of Dutch food fits the description.

skrebbel · 2026-01-10T09:29:46 1768037386

This is true, notably a kroket is both looping and badly compressed.

OptionOfT · 2026-01-10T02:23:33 1768011813

Vergif.

I don't know how you get from 'ver' to badly compressed.

(And I'm a native Flemish speaker, but living in the USA for 8+ years, so I barely, if ever speak it).

tharkun__ · 2026-01-10T03:20:23 1768015223

Remove Ver, add t and you got German: Gift

Vergiftet would be past tense.

Funny that in English gift is a word but entirely different meaning.

Languages are fun, especially in Europe where they're all different but all so related but everyone does not want to admit it.

thaumasiotes · 2026-01-10T13:44:50 1768052690

> Funny that in English gift is a word but entirely different meaning.

In English it maintains its original Germanic meaning derived from the verb give.

The sense of "poison" in German comes from a euphemistic use of "gift". (Literally 'something given' but actually used to calque Greek "dosis", which also literally meant 'something given', but was used to mean 'dose [of medicine]'.)

https://en.wiktionary.org/wiki/Gift#Etymology

Summing up, the reason gift is a word in English with an entirely different meaning from what it has in German is that everyone in Germany forgot what gift meant.

(The reason it's gift and not something more like yift is the Danelaw.)

tharkun__ · 2026-01-10T16:47:33 1768063653

This is one of the reasons I like HN: Random knowledge transfer like this. Appreciated!

Also: in German Dosis is the word for dose.

    Die Dosis macht das Gift

(the dose makes the poison)

animal531 · 2026-01-10T10:54:03 1768042443

It's probably the same; for example, in Afrikaans it's just gif. Vergif is the verb for the action of doing it, and vergiftig is the past tense, for it having happened previously.

birdsongs · 2026-01-10T10:39:27 1768041567

In Norwegian, "gift" is poison. It's also the word for married (de er gift).

pantalaimon · 2026-01-10T14:34:24 1768055664

In German "Mitgift" is what the bride gets from her family when she enters marriage.

stevekemp · 2026-01-10T12:43:09 1768048989

> all so related but everyone does not want to admit it.

I'm laughing in Finnish..

tharkun__ · 2026-01-10T16:41:44 1768063304

Hehe, you found the exception that proves the rule :P

SllX · 2026-01-10T17:12:05 1768065125

And Basque, Maltese, Turkish and Georgian.

Magyar (Hungarian) and Finnish are both Uralic languages along with Estonian and the Sámi languages, but none of these are related to the Indo-European languages common in the other parts of Europe.

And while most of Europe’s extant languages are in the Indo-European language family, there’s still a fair number of differences between Albanian, Germanic, Hellenic, Celtic, Romance and Slavic languages.

tharkun__ · 2026-01-10T19:25:53 1768073153

Oh for sure there are many differences, that comes with them being different languages, countries, ethnicity. You can do this on many levels.

The point was essentially what you're showing here: People focusing on all the differences instead of shared history, languages influencing each other and how we're all not that different in the end.

If you want to, even within what are nowadays countries and what outsiders would say is "one language" and "one ethnicity", you can start focusing on differences and make people dislike each other.

SllX · 2026-01-11T09:12:08 1768122728

That’s fair. I tunneled in through a linguistic lens.

bruce343434 · 2026-01-10T11:54:46 1768046086

In NL, just 'gif' is sufficient

samlinnfer · 2026-01-10T01:43:02 1768009382

Same in Chinese (毒). But it is a better solution just not to give pedants the time of the day.

readthenotes1 · 2026-01-11T04:12:21 1768104741

You can't really, can you?

At the very least, they'd complain about accuracy, if not time zone, or even how we should all be on UTC (do not get one started on the difference between GMT and UTC if you value your... time)

gambiting · 2026-01-10T00:21:16 1768004476

Same in Polish. You'd just call both of these "trucizna".

mbel · 2026-01-10T01:55:19 1768010119

Not really, we have both „jad” (venom) and „trucizna” (poison).

gtech1 · 2026-01-10T03:51:56 1768017116

How does this happen ? The poster above you isn't really Polish ? How can someone that claims to know Polish not know there's two different words ?

gambiting · 2026-01-10T09:26:41 1768037201

Obviously I know "jad" but I don't see any issue with calling venom "trucizna". Natural languages aren't C++ and you don't get compiler errors when you speak - to me, there is no issue calling both venoms and poison trucizna. Polish dictionary doesn't seem to contradict it either:

https://sjp.pwn.pl/slowniki/trucizna.html

The point is, both are correct(afaik) while in English venom and poison are definitely two different things.

mbel · 2026-01-10T09:43:23 1768038203

Nobody would say „trujący wąż” (poisonous snake) or „jadowity grzyb” (venomous mushroom). The distinction is similar to English. There are exceptions and contexts where it can be used interchangeably but arguably the same is true for English.

gambiting · 2026-01-10T11:29:25 1768044565

>>Nobody would say „trujący wąż”

No? That's how I've always said it. "Ta żmija jest trująca" - don't see any issue here. Jadowity grzyb I'll agree.

gtech1 · 2026-01-10T15:05:21 1768057521

This is fascinating, assuming you are both natives of Poland. Is there as much language variance in Poland as in, say, Italy ?

gambiting · 2026-01-10T23:07:18 1768086438

No idea how much variance there is in Italy so not sure how to answer that question.

gtech1 · 2026-01-12T14:49:12 1768229352

Italy, the core remnant of the Roman Empire, has unmatched language diversity, often varies even from town to town. It's a colorful mosaic of micro cultures and customs where people from one region using different words for venom/poison is completely normal, in their local dialect. Everyone speaks standard Italian though.

You've never visited Italy ? They're not that far away and I'm sure you'll love it.

thaumasiotes · 2026-01-10T13:48:37 1768052917

> The point is, both are correct(afaik) while in English venom and poison are definitely two different things.

No, the situation in English matches your description exactly: all of these things are called poison. The word venom is almost never used in natural speech.

Furthermore, if you ask English speakers what the difference between poison and venom is, by far the two most common responses will be "there isn't one" and "I don't know". icyfox is just looking to be annoying.

(Another popular option will probably be "it's called venom when you're talking about snakes", which explains roughly 100% of use of venom in natural speech.)

usrnm · 2026-01-10T08:16:15 1768032975

And in Russian we use "jad" ("яд" in Cyrillic) for both. Although there is the word "отрава", which can be used for poisons, and "яд" is closer to "venom", the difference is almost nonexistent and both are often used interchangeably.

VanshPatel99 · 2026-01-09T21:27:07 1767994027

TIL. I always thought that "If it bite you -> you die = venom" and "If you eat, bite, touch -> you die = poison". But your differentiation makes more sense

zahlman · 2026-01-09T23:02:28 1767999748

That explains the words "venomous" and "poisonous" used of creatures.

It's different for the actual substances. Although it relates: a venomous creature that bites you will release its venom into your bloodstream.

anonym29 · 2026-01-10T03:47:51 1768016871

>a venomous creature that bites you will release its venom into your bloodstream

unless it's a bee, wasp, hornet, scorpion, stingray, jellyfish, man-of-war, platypus, lionfish, stonefish, sea urchin, or catfish, which all have venom instead of poison, but the delivery mechanism of said venom isn't biting

zahlman · 2026-01-10T17:52:03 1768067523

I said "bite" echoing the comment I was replying to. Obviously the same applies, mutatis mutandis, to stinging etc.

hearsathought · 2026-01-10T18:07:44 1768068464

If a venomous snake bites you, you die. If you bite a venomous snake, you live. If a poisonous snake bites you, you live. If you bite a poisonous snake, you die.

Or Hamlet's mother died by drinking poisoned wine. Hamlet died by being stabbed with an envenomed sword.

throwaway5465 · 2026-01-10T08:13:30 1768032810

Not overly pedantic at all as it highlights that by using venom the hunters were able to eat what they shot.

hyrix · 2026-01-09T22:26:13 1767997573

These chemicals are derived from plants where even pedants would classify them as poisons.

The genus name Boophone is from the Greek bous = ox, and phontes= killer of, a clear warning that eating the plant can be fatal to livestock.

cluckindan · 2026-01-10T11:21:26 1768044086

Huh, so telephone is killer of distance and Persephone is killer of… Persians? Grain? Vegetation?

stared · 2026-01-10T11:33:45 1768044825

You're mixing up phōnē (voice) and phonos (slaughter), but the truth about Persephone is actually more metal.

Her name predates Greek contacts with Persians, so the timeline doesn't fit. Instead, it comes from perthein (to destroy) + phonos, making her the "Bringer of Destruction". With a caveat that the etymology of her name is uncertain: https://en.wikipedia.org/wiki/Persephone#Name

I do like "killer of distance" for telephone, though. :)

thaumasiotes · 2026-01-10T16:35:04 1768062904

> Instead, it comes from perthein (to destroy) + phonos, making her the "Bringer of Destruction". With a caveat that the etymology of her name is uncertain:

But... of all the theories listed there, perthein isn't among them.

And if the roots are "destroy" and "death", what would make her the "bringer" of destruction?

icyfox · 2026-01-09T22:54:36 1767999276

Fair point about the source, but the classification usually follows the mode of delivery, not the organism of origin.

Many plant-derived compounds function as venoms once introduced into the bloodstream (arrow coatings, darts, etc.), even if they’re also toxic when ingested. Curare is one example of a plant-based compound - lethal in blood, but largely harmless if eaten.

So while Boophone is absolutely a poison in the ecological sense, using it on arrows still fits the venom/toxin distinction better than a purely ingested poison. Otherwise why would people hunt with this if they got sick the second they ate the meat?

jeltz · 2026-01-10T09:19:32 1768036772

Is it really? We call it poison darts when hunters use poison from the poison dart frog to hunt.

Gud · 2026-01-10T11:58:35 1768046315

Not pedantic, two different things.

Thanks for clarifying.

Retric · 2026-01-10T02:20:06 1768011606

In practice the difference is mostly semantics.

Venom is still almost always poisonous when eaten and poison is harmful when injected. 2-3% as dangerous when eaten vs injected only helps so much.

readthenotes1 · 2026-01-11T04:15:36 1768104936

"mostly semantics"

Semantics: 1 (linguistics) the study of meanings

I am not sure what could be more important.

But perhaps you meant "word choice"?

Retric · 2026-01-11T05:31:28 1768109488

What things are more important than the study of meanings in a linguistic context?

Well semantics only covers an infinitesimal fraction of all meaning. Consider if I inject arsenic into a snakes venom sac is it now a venom? Nothing about your answer changes anything about what’s going on, yet you could still debate the question.

So when you say “what could be more important” I can only say that just about everything is more important.

OptionOfT · 2026-01-10T02:24:25 1768011865

But eating a rattlesnake and dying is a bad way of finding out that you have a stomach ulcer.

jeltz · 2026-01-10T09:16:51 1768036611

I am not a native speaker but I believe you are wrong. It is called poison dart for example. So injected toxins can be both called poisons and venoms.

mrleinad · 2026-01-10T13:10:35 1768050635

In Spanish it's commonly "dardo venenoso" (venomous dart), not "dardo ponzoñoso" (poisonous dart). So it's probably incorrectly used in English.

icyfox · 2025-12-09T16:52:26 1765299146

Exactly half of these HN usernames actually exist. So either there are enough people on HN that follow common conventions for Gemini to guess from a more general distribution, or Gemini has memorized some of the more popular posters. The ones that are missing:

- aphyr_bot - bio_hacker - concerned_grandson - cyborg_sec - dang_fan - edge_compute - founder_jane - glasshole2 - monad_lover - muskwatch - net_hacker - oldtimer99 - persistence_is_key - physics_lover - policy_wonk - pure_coder - qemu_fan - retro_fix - skeptic_ai - stock_watcher

Huge opportunity for someone to become the actual dang fan.

giancarlostoro · 2025-12-09T17:10:53 1765300253

Before the AI stuff, Google had those pop-up quick answers when googling. So I googled something like three years ago, saw the answer, realized it was sourced from HN. Clicked the link, and lo and behold, I had answered my own question. Look mah! I'm on google! So I am not surprised at all that Google crawls HN enough to have it in their LLM.

I did chuckle at the 100% Rust Linux kernel. I like Rust, but that felt like a clever joke by the AI.

dotancohen · 2025-12-09T17:29:02 1765301342

I laughed at the SQLite 4.0 release notes. They're on 3.51.x now. Another major release a decade from now sounds just about right.

ryanisnan · 2025-12-09T17:42:15 1765302135

That one got me as well - some pretty wild stuff about prompting the compiler, starship on the moon, and then there's SQLite 4.0

ikerrin1 · 2025-12-09T17:48:59 1765302539

You can criticize it for many things but it seems to have comedic timing nailed.

ncruces · 2025-12-09T21:47:52 1765316872

The promise is backwards compatibility in the file format and C API until 2050.

https://sqlite.org/lts.html

rtkwe · 2025-12-09T19:53:45 1765310025

I wouldn't be surprised if it went towards the LaTeX model instead, where there's essentially never another major version release. There's only so much functionality you need in a local-only database engine; I bet they're getting close to complete.

dotancohen · 2025-12-09T21:58:48 1765317528

I'd love to see more ALTER TABLE functionality, and maybe MERGE, and definitely better JSON validation. None of that warrants a version bump, though.

You know what I'd really like, that would justify a version bump? CRDT. Automatically syncing local changes to a remote service, so e.g. an Android app could store data locally on SQLite, but the user could also log into a web site on their desktop and all the data is right there. The remote service need not be SQLite - in fact I'd prefer postgres. The service would also have to merge databases from all users into a single database... Or should I actually use postgres for authorisation but open each user's data in a replicated SQLite file? This is such a common issue, I'm surprised there isn't a canonical solution yet.

rtkwe · 2025-12-09T23:10:28 1765321828

I think the unified syncing, while neat, is way beyond what SQLite is really meant for. You'd get into so many niche situations dealing with out-of-sync master and slave 'databases' that it's hard to make an automated solution that covers them effectively, unless you force the schema into a transactional design for everything just to sort out update conflicts. E.g.: your user has the app on two devices, uses one while it doesn't have an internet connection (altering the state), and then uses the app on another device before the original has a chance to sync.
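To make that two-device scenario concrete, here is a toy sketch of a last-writer-wins merge, one of the simplest conflict strategies. It is not something SQLite provides; `Versioned`, `merge`, and the timestamps are hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Versioned:
    value: str
    timestamp: int  # hypothetical clock value recorded when the edit was made
    device: str     # tie-breaker so both sides pick the same winner


def merge(a: dict[str, Versioned], b: dict[str, Versioned]) -> dict[str, Versioned]:
    """Last-writer-wins: for each key keep the edit with the newest timestamp."""
    merged = dict(a)
    for key, theirs in b.items():
        ours = merged.get(key)
        if ours is None or (theirs.timestamp, theirs.device) > (ours.timestamp, ours.device):
            merged[key] = theirs
    return merged


# The phone edits while offline; the laptop edits the same row before the phone syncs.
phone = {"note:1": Versioned("edited offline on phone", 10, "phone")}
laptop = {"note:1": Versioned("edited later on laptop", 12, "laptop")}

assert merge(phone, laptop) == merge(laptop, phone)  # order of syncing doesn't matter
print(merge(phone, laptop)["note:1"].value)          # edited later on laptop
```

Both devices converge on the same state, but the phone's offline edit is silently discarded, which is exactly the kind of niche situation that is hard to paper over automatically.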

dotancohen · 2025-12-10T00:00:50 1765324850

Yes, it's a difficult problem. That's why I'd like it to be wrapped in a nice package away from my application logic.

Even a product that does this behind the scenes, by wrapping SQLite and exposing SQLite's wrapped interface, would be great. I'd pay for that.

Andrex · 2025-12-09T20:35:22 1765312522

If it had been about GIMP I would have laughed harder.

dotancohen · 2025-12-09T22:05:58 1765317958

Be reasonable. It's only looking forward a single decade.

schaum · 2025-12-10T07:35:04 1765352104

Every few years I stumble across the same java or mongodb issue. I google for it, find it on stackoverflow, and figure that it was me who wrote that very answer. Always have a good laugh when it happens.

Usually my memory regarding such things is quite good, but this one I keep forgetting, so much so that I don't remember what the issue is actually about xD

vidarh · 2025-12-09T17:24:21 1765301061

I've run into my own comments or blog posts more often than I care to admit...

james_marks · 2025-12-09T17:31:44 1765301504

Several decades into this, I assume all documentation I write is for my future self.

Beautifully self-serving while being a benefit to others.

Same thing with picking nails up in the road to prevent my/everyone’s flat tire.

QuantumNomad_ · 2025-12-09T17:27:26 1765301246

ziggy42 is both a submitter of a story on the actual front page at the moment, and also in the AI generated future one.

See other comment where OP shared the prompt. They included a current copy of the front page for context. So it’s not so surprising that ziggy42 for example is in the generated page.

And for other usernames that are real but not currently on the home page, the LLM definitely has plenty of occurrences of HN comments and stories in its training data, so it’s not really surprising that it is able to include real usernames of people that post a lot. Their names will be occurring over and over in the training data.

NooneAtAll3 · 2025-12-09T21:17:22 1765315042

one more reason to doubt that it's AI-generated

joaogui1 · 2025-12-09T17:19:24 1765300764

HN has been used to train LLMs for a while now, I think it was in the Pile even

never_inline · 2025-12-09T17:55:14 1765302914

It has also fetched the current page in the background, because the Jepsen post was recently on the front page.

morkalork · 2025-12-09T17:21:52 1765300912

I may die but my quips shall live forever

atrus · 2025-12-09T17:07:44 1765300064

So many underscores for usernames, and yet, other than a newly created account, there was 1 other username with an underscore.

robocat · 2025-12-09T20:02:55 1765310575

In 2032 new HN usernames must use underscores. It was part of the grandfathering process to help with moderating accounts generated after the AI singularity spammed too many new accounts.

WorldPeas · 2025-12-09T17:17:19 1765300639

my hypothesis is they trained it to snake case for lower case and that obsession carried over from programming to other spheres. It can't bring itself to make a lowercaseunseparatedname

computably · 2025-12-09T17:32:28 1765301548

Most LLMs, including Gemini (AFAIK), operate on tokens. lowercaseunseparatedname would be literally impossible for them to generate, unless they went out of their way to enhance the tokenizer. E.g. the LLM would need a special invisible separator token that it could output, and when preprocessing the training data the input would then be tokenized as "lowercase unseparated name" but with those invisible separators.

edit: It looks like it probably is a thing, given it does sometimes output names like that. So the pattern is probably just so rare in the training data that the LLM almost always prefers to use actual separators like underscore.

fooofw · 2025-12-09T18:22:51 1765304571

The tokenization can represent uncommon words with multiple tokens. Inputting your example on https://platform.openai.com/tokenizer (GPT-4o) gives me (tokens separated by "|"):

    lower|case|un|se|parated|name
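
As a rough illustration of how a subword vocabulary can still encode a word it has never seen whole, here is a toy greedy longest-match tokenizer. This is a simplified stand-in, not the BPE algorithm GPT-4o actually uses, and `TOY_VOCAB` is made up so that the split matches the one above.

```python
# Hypothetical vocabulary, chosen only for illustration. Real tokenizers
# learn their vocabulary from data and hold many thousands of entries.
TOY_VOCAB = {"lower", "case", "un", "se", "parated", "name"}


def tokenize(word, vocab=TOY_VOCAB):
    """Split a word into the longest vocabulary pieces, left to right."""
    tokens = []
    i = 0
    while i < len(word):
        # Try the longest candidate starting at position i first.
        for j in range(len(word), i, -1):
            if word[i:j] in vocab:
                tokens.append(word[i:j])
                i = j
                break
        else:
            # No piece matched: fall back to a single character,
            # so any input can still be encoded.
            tokens.append(word[i])
            i += 1
    return tokens


print("|".join(tokenize("lowercaseunseparatedname")))
```

The single-character fallback is why no string is literally impossible to produce: rare words just cost more tokens.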

maxglute · 2025-12-09T21:47:07 1765316827

You can straight up ask Google to look up a reddit or Hacker News user's post history. Some of it is probably just via search, because it's very recent, as in the last few days. Some of the older corpus includes deleted comments, so they must be scraping from reddit archive APIs too, or using that deprecated Google history cache.

never_inline · 2025-12-09T17:56:55 1765303015

This is definitely based on a search or page fetch, because there are these which are all today's topics

- IBM to acquire OpenAI (Rumor) (bloomberg.com)

- Jepsen: NATS 4.2 (Still losing messages?) (jepsen.io)

- AI progress is stalling. Human equivalence was a mirage (garymarcus.com)

tempestn · 2025-12-09T18:28:26 1765304906

The OP mentioned pasting the current frontpage into the prompt.

DANmode · 2025-12-09T19:15:42 1765307742

What % of today’s front page submissions are from users that have existed 5-10 years+?

(Especially in datasets before this year?)

I’d bet half or more - but I’m not checking.

vitorgrs · 2025-12-10T00:15:04 1765325704

It does memorize. But that's not really news... I remember ChatGPT 3.5 or the old 4.0 remembering some users on some subreddits, even naming the top users for each subreddit.

The thing is, most of the models were heavily post-trained to limit this...

skywhopper · 2025-12-09T18:33:42 1765305222

That’s a lot more underscores than the actual distribution (I counted three users with underscores in their usernames among the first five pages of links atm).

hurturue · 2025-12-09T17:27:25 1765301245

either you only notice the xxx_yyy frequent posters or it's quite interesting that so many have this username format

AceJohnny2 · 2025-12-09T20:00:25 1765310425

Aw, I was actually a bit disappointed by how on the nose the usernames were, relative to their postings. Like the "Rust Linux Kernel" by rust_evangelist, "Fixing Lactose Intolerance" by bio_hacker, fixing a 2024 Framework by retro_fix, etc...

dang_fan0 · 2025-12-10T15:59:32 1765382372

I was here first

icyfox · 2025-12-07T04:18:59 1765081139

We talked about this model in some depth on the last Pretrained episode: https://youtu.be/5weFerGhO84?si=Eh_92_9PPKyiTU_h&t=1743

Some interesting takeaways imo:

- Uses existing model backbones for text encoding & semantic tokens (why reinvent the wheel if you don't need to?)

- Trains on a whole lot of synthetic captions of different lengths, ostensibly generated using some existing vision LLM

- Solid text generation support is facilitated by training on all OCR'd text from the ground truth image. This seems to match how Nano Banana Pro got so good as well; I've seen its thinking tokens sketch out exactly what text to say in the image before it renders.

icyfox · 2025-12-05T18:44:39 1764960279

I used Serp via API many moons ago. The most interesting part of the company imo is their legal defense of different plans:

  Production - $150
  15,000 searches / month
  U.S. Legal Shield

ie. "Our U.S. Legal Shield protects your right to crawl and parse public search engine data under the First Amendment. We assume scraping and parsing liability for customers on most recurring plans unless your usage is illegal."

I imagine at least some portion of companies use them just for this liability shield.

ceejayoz · 2025-12-05T19:27:41 1764962861

Sounds a lot like the old guarantee paid SSL certificate providers used to offer; pretty words, but meaningless in practice. (IIRC, no one ever got a payout from any of them.)

"We assume scraping and parsing liabilities for both domestic and foreign companies unless your usage is otherwise illegal" seems like a big loophole in it.

perks_12 · 2025-12-05T22:29:07 1764973747

Couldn't this be laid out as, We assume scraping and parsing liability unless it is ruled as being illegal, in which case your use would be illegal and our liability shield wouldn't help you?

scosman · 2025-12-05T21:12:14 1764969134

> unless your usage is illegal

Like copyright infringement of Google's search results?