Nvidia (NVDA) generates revenue with hardware, but digs moats with software. The...

vidarh · on Jan 29, 2025

The CUDA moat is largely irrelevant for inference. The code needed for inference is small enough that there are e.g. bare-metal CPU only implementations. That isn't what's limiting people from moving fully off Nvidia for inference. And you'll note almost "everyone" in this game are in the process of developing their own chips.

buyucu · on Jan 29, 2025

My company recently switched from A100s to MI300s. I can confidently say that in my line of work, there is no CUDA moat. Onboarding took about month, but afterwards everything was fine.

panabee · on Jan 29, 2025

Alternatives exist, especially for mature and simple models. The point isn't that Nvidia has 100% market share, but rather that they command the most lucrative segment and none of these big spenders have found a way to quit their Nvidia addiction, despite concerted efforts to do so.

For instance, we experimented with AWS Inferentia briefly, but the value prop wasn't sufficient even for ~2022 computer vision models.

The calculus is even worse for SOTA LLMs.

The more you need to eke out performance gains and ship quickly, the more you depend on CUDA and the deeper the moat becomes.

buyucu · on Jan 30, 2025

llm inference is fine on rocm. llama.cpp and vllm both have very good rocm support.

llm training is also mostly fine. I have not encountered any issues yet.

most of the cuda moat comes from people who are repeating what they heard 5-10 years ago.

onlyrealcuzzo · on Jan 29, 2025

> OpenAI, Meta, AWS, AMD, and others have long attempted to eliminate the Nvidia tax, yet failed.

Gemini / Google runs and trains on TPUs.

You have no incentive to infer on AMD if you need to buy a massive Nvidia cluster to train.

boroboro4 · on Jan 30, 2025

Meta trains on Nvidia and infers on AMD. There is incentive if your inference costs are high.

vidarh · on Jan 30, 2025

Meta also has a second generation of their own AI accelerator chips designed.

panabee · on Jan 29, 2025

Google was omitted because they own the hardware and the models, but in retrospect, they represent a proof point nearly as compelling as OpenAI. Thanks for the comment.

Google has leading models operating on leading hardware, backed by sophisticated tech talent who could facilitate migrations, yet Google still cannot leap over the CUDA moat and capture meaningful inference market share.

Yes, training plays a crucial role. This is where companies get shoehorned into the CUDA ecosystem, but if CUDA were not so intertwined with performance and reliability, customers could theoretically switch after training.

onlyrealcuzzo · on Jan 30, 2025

> yet Google still cannot leap over the CUDA moat and capture meaningful inference market share.

It's almost as if being a first-mover is more important than whether or not you use CUDA.

talldayo · on Jan 30, 2025

Both matter quite a bit. The first-mover advantage obviously rewards OEMs in a first-come, first-serve order, but CUDA itself isn't some light switch that OEMs can flick and get working overnight. Everyone would do it if it was easy, and even Google is struggling to find buy-in for their TPU pods and frameworks.

Short-term value has been dependent on how well Nvidia has responded to burgeoning demands. Long-term value is going to be predicated on the number of Nvidia alternatives that exist, and right now the number is still zero.

baq · on Jan 30, 2025

Google has a self inflicted wound in the time to get an api key.

Der_Einzige · on Jan 30, 2025

The fact that this comment is DOWNVOTED despite being literally 1000% true is evidence that HN is full of loonies.

panabee · on Jan 30, 2025

It's unclear why this drew downvotes, but to reiterate, the comment merely highlights historical facts about the CUDA moat and deliberately refrains from assertions about NVDA's long-term prospects or that the CUDA moat is unbreachable.

With mature models and minimal CUDA dependencies, migration can be justified, but this does not describe most of the LLM inference market today nor in the past.