> add more RAM and GPU to the next iPhone and it's not a toy anymore
We're not going to get more RAM and GPU in consumer devices.
All of the supply is going into data center build-outs. As the hyperscaler gamble on the future continues, we get left with weaker (or more expensive) devices - not stronger ones.
The market makers make more money if we're left with thin clients. They're also the ones who control supply and the shapes of devices.
We're talking close to five orders of magnitude difference between 0.6t/sec and 35kt/sec.
While there are problems that can be solved at 0.6t/sec - particularly offline, at-the-edge, in-the-field applications - those are currently vastly outnumbered by applications that need far more throughput.
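A quick sanity check on that gap, using only the two throughput figures quoted above:

    # back-of-envelope: how big is the local-vs-datacenter gap really?
    import math

    local_tps = 0.6          # tokens/sec, the local figure quoted above
    datacenter_tps = 35_000  # tokens/sec, the datacenter figure quoted above

    ratio = datacenter_tps / local_tps
    print(f"ratio: {ratio:,.0f}x")                          # ~58,333x
    print(f"orders of magnitude: {math.log10(ratio):.2f}")  # ~4.77

So the gap is roughly 58,000x - enormous, but a bit under five orders of magnitude.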
absolutely, however this doesn’t mean we should abandon local. i can’t remember who, but someone in the ai nuts-and-bolts arena said “smaller local models are where the exciting stuff is happening right now. it’s the area where real fast progression is happening.” and it seems to be true. new big models aren’t making anywhere near the leaps smaller models are.
it’s so important we keep moving forward on running locally, for the same reason it was important to use open standards when building the internet. if we hadn’t, we’d all be connected through aol with 10 hours/month of allowed internet usage, terminal’d in through a sun workstation renting cpu cycles from some mainframe company at like “you’ve got 10,000 cpu cycles left on your monthly plan, please deposit $500 for 5,000 more.”
while all of this is before my time, i’ve heard and read so many horror stories about how people could only connect through dumb terminals to “you wouldn’t believe it, computers then were the size of buildings” machines 1000 miles away, and had to sign up for workload timeslots. make no mistake, this is the future these companies want: they want us to rent everything and own nothing.
Local is enough for most users as long as they're willing to accept a non-realtime response - a real limitation (especially for personal agentic use), but not a very significant one. The hardware isn't that expensive, and a single user's needs aren't going to saturate a state-of-the-art AI datacenter rack or anything like that - not even heavy agentic workloads will.
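To put rough numbers on that - a sketch only, where the 2M-tokens/day and 30t/sec figures are assumptions for illustration, and only the 35kt/sec rack throughput comes from upthread:

    # rough sketch: can one heavy agentic user saturate a datacenter rack?
    daily_tokens = 2_000_000   # ASSUMPTION: a very heavy agentic day
    local_tps = 30             # ASSUMPTION: mid-range local GPU, small model
    rack_tps = 35_000          # the datacenter figure quoted upthread

    hours_local = daily_tokens / local_tps / 3600
    avg_tps = daily_tokens / 86_400  # demand averaged over 24 hours
    print(f"local compute time: ~{hours_local:.1f} h")           # ~18.5 h
    print(f"rack utilization: {100 * avg_tps / rack_tps:.2f}%")  # ~0.07%

Under those assumptions, a single heavy user is overnight-batch territory locally and rounding error on a rack - which is exactly the non-realtime trade-off described above.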
You rent your broadband internet. It's not a foreign concept that we can't own all the infra.
I don't know why we can't just get over the local compute thing and instead build open infra and models in the cloud. That's literally the only way we'll be able to keep pace with hyperscalers.
Local is not going to benefit 99% of use cases. It's a silly toy.
If we build open infra for cloud-based provisioning and inference, we could build a future we still have some ownership in. We'd be able to fine tune large models for lots of purposes. We wouldn't be locked in to major vendors.