> I love everything about this direction except for the insane inference costs.
If this direction holds true, the ROI works out cheaper.
Instead of employing 4 people (Customer Support, PM, Eng, Marketing), you will have 3-5 agents, and the whole ticket flow might cost you ~$20.
But I hope we won't go this far, because when things fail every customer will be impacted, and there will be no one left who understands the system well enough to fix it.
> We do not need these crazy high release speeds with daily updates all over the stack
I like this, but I understand it's not easily achievable in companies where everyone is trying hard to grab a part of the market, amid AI FOMO and pressure from investors to release AI features.
Yes, use it every day :) And very much a human, AFAIK.
My point is that if you ask "Hey Claude, please write out all common and useful command line arguments into a commands.html file", the LLM that actually does that work might ignore anything that says "dangerous" or gives that indication, because the LLM doesn't think potentially dangerous commands could be "common" and/or "useful". Hope my point makes sense now.
I wonder why that is. It is quick to tell me if something is dangerous and then continues to push back if I speak in favor of something that it considers dangerous.
The author stated they used Claude to compose the document. I believe they were alluding to the idea that Claude's own safety alignment prevented it from documenting the flag because it's called dangerous.
>36,500 killed in 400 cities... Our Editorial Board has now obtained more detailed information provided by the IRGC Intelligence Organization to the Supreme National Security Council.
There are zero verified sources of any mass killings by the Iranian government. In fact all evidence points to Mossad agents committing the mass killings of Iranian government officials as caught on video, including the wrestler that was just executed for killing a police officer with a machete, on video.
> you can, because you can say some of the killed people could be citizens of India, China and/or US, this way you already cover 2.5B people
No, there aren't 2.5 billion people Tehran has the ability to kill. The fact that you're having to be this ridiculous to make your argument should give pause to its validity.
There are limits to knowability. But that doesn't mean everything is always unknowable. (Or, vice versa, that all scepticism is unwarranted.) Choosing to believe that everything is unknowable is politically useful to anyone with power, which is why there is constantly propaganda to its effect. But it's not intellectually honest.
Sure, if you’re Turkmenistan or Afghanistan, the latter of which is being bombed by Pakistan, you’re fine. Also if you’re Azerbaijan, fuck you.
What’s the argument? Like, Oman was trusted by parts of Tehran on diplomatic matters. They still got bombed. Trying to rationalize this is untenable: it was a stupid strategy of throwing its toys out of the pram.
That's right. Hosting military bases of the overlords that impose crippling sanctions that impoverish a nation on false premises is quite far away from a neutral country.
I didn't hear the neighbouring countries complain when Iran was attacked economically/financially and then later militarily.
Can someone shed light on why China still couldn't copy the Nvidia GPUs in some form?
I understand it's complex and there are many parts to it, but which is the most complex part that makes it difficult for China to copy?
Let's say they don't have access to a 3nm process: what if they just use 12nm and create much bigger GPUs with comparable performance and CUDA compatibility? Another option could be fewer tensor units; training would take longer, but they might be able to produce it cheaply.
Copying CPUs isn't really a thing: they are too complex.
If you could steal all the designs at TSMC, and you had exactly the process that TSMC uses, you could definitely make counterfeits. If you didn't have TSMC's specific process, you could adapt the designs (to Intel or Samsung) with serious but not epic effort. If you couldn't make the processes similar (ie, want to fab on SMIC), you are basically back to RTL, and can look forward to the most expensive and time-consuming part of chip design.
This is nothing like copying a trivial, non-complex item like a car. Copying a modern jet engine is starting to get close (for instance, single-crystal blades), but even they are much simpler. I mention the latter because the largest, most resourced countries in the world have tried and are still trying.
Even if you had 'AI tools' guessing at component blocks, you would still need some way to evaluate the result.
And that's assuming NVDA hasn't pulled a Masatoshi Shima-type play on their designs (i.e., complex traps that could require lots of analysis to determine whether they are real or fake).
I'm not sure how much of a speedup even modern tooling/workflows could provide reliably.
Even then, the elephant in the room is that China is working on their own AI accelerators, so while there can be benefit from *studying* the existing designs, I think they don't want to clone regardless.
Oh, absolutely. Straight-up Soviet-style cloning of masks makes no sense for a multitude of reasons. In addition to what you've said, China isn't banned from N7-class Nvidia architectures, so they could just buy those on the open market.
If engines are hard to build, why not build a car 3x the size of a normal one? Well, you can, but due to things like aerodynamics you'll never match the speed or fuel economy of normal cars.
Same with chips: efficiency, speed, etc. all depend on good design and cutting-edge processes. If the main reason your chip isn't faster is the distance between your L1 cache and your core, then a bigger chip on a bigger process node won't make it quicker.
> Can someone shed light on why China still couldn't copy the Nvidia GPUs in some form?
They have alternatives: the Tianhe supercomputer, for example, was originally built with Xeon Phi chips that have since been replaced with their own domestic alternatives.
A big limitation is getting access to fab slots. Nvidia and Apple are very aggressive about buying up capacity from TSMC, etc, and China's own domestic fabs are improving fast but still not a real match, particularly for volume.
But there's a distinct time/value of investment equation with the current AI boom. The jury is at best still out on what that equation is for the goals of capital (it's increasingly looking like there's no moat), but if you're a national government trying to encourage local bleeding edge expertise in new fields like this it's quite a bit more clear.
You can; you just have to use a tiled architecture. And the usable wiring distances in microprocessors are already far shorter than the simple speed-of-light calculation suggests, because it takes time for the gates to make their transitions as well.
With processors it's customary to use the "fan-out of 4" (FO4) metric as a measure of critical paths. It's the notional delay of a gate driving a fan-out of 4, which is the typical case when moving between latches/registers. Microprocessor critical paths are usually on the scale of ~10 FO4.
The largest chip at the moment is Cerebras's wafer-scale accelerator. There the tile is basically at the reticle limit, and they worked with TSMC to develop a method to wire across the gaps between reticles.
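A back-of-envelope sketch of the point about wiring and gate delay. All constants here are my own illustrative assumptions (a 3 GHz clock, ~15 ps per FO4 on an older node), not figures from the thread beyond the ~10 FO4 critical-path scale:

```python
# Compare the speed-of-light bound on signal travel per clock cycle
# with a rough FO4-based logic budget. Illustrative numbers only.

C_MM_PER_S = 3.0e11      # speed of light in mm/s
CLOCK_HZ = 3.0e9         # assumed 3 GHz clock
FO4_DELAY_S = 15e-12     # assumed ~15 ps per FO4 on an older process
FO4_PER_CYCLE = 10       # thread's figure: critical paths ~10 FO4

cycle_s = 1.0 / CLOCK_HZ
light_mm_per_cycle = C_MM_PER_S * cycle_s       # hard upper bound on travel
logic_budget_s = FO4_PER_CYCLE * FO4_DELAY_S    # time consumed by gates alone

print(f"light-speed bound: {light_mm_per_cycle:.0f} mm per cycle")
print(f"gate delay alone: {logic_budget_s * 1e12:.0f} ps "
      f"of a {cycle_s * 1e12:.0f} ps cycle")
```

Even though light could in principle cross ~100 mm per cycle, the gates eat a large fraction of the cycle before any wire delay is counted, which is why signals can't usefully travel anywhere near that far.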
They can copy it. And no, the software moat isn't there if someone chooses the blatant-copy route. They just can't build it at the scale they want yet.
> what if they just use 12nm and create GPUs with much bigger size but comparable performance
Well, physics does sort of work that way, depending on what you mean by performance (in the sense that power is normally part of performance when we're talking about chips).
You could certainly use a larger process and clone chips at an area and power penalty. But area is the main factor in yield, and talking about power is really talking about "what's the highest clock rate you can still cool".
So: a clone would work in physics, but it would be slow, hot, and expensive (low yield). I think issues like propagation delay would be second- or third-order (the whole point of GPUs is to be latency-tolerant, after all).
Not to mention their language server + type checker `ty` is incredible. We moved our extremely large Python codebase over from MyPy and it's an absolute game changer.
It's so fast, in fact, that we just added `ty check` to our pre-commit hooks, where MyPy previously had runtimes of 150+ seconds _and_ a mess of bugs around their caching.
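For anyone wanting to try the same thing, a minimal sketch of wiring `ty check` into pre-commit as a local hook. This assumes `ty` is already installed in your environment; the hook `id` and `name` are my own choices, not an official hook:

```yaml
# .pre-commit-config.yaml (fragment) -- hypothetical local-hook setup
repos:
  - repo: local
    hooks:
      - id: ty-check
        name: ty check
        entry: ty check
        language: system
        types: [python]
        pass_filenames: false
```

`pass_filenames: false` runs the check over the whole project rather than only staged files, which is usually what you want for a whole-program type checker.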