Unfortunate. The Grok team built a phenomenal model. I use it all the time and it very often outperforms GPT and Claude on coding and STEM research related tasks. I was part of the Grok 4.2 Beta with multi-agents for a while and it was just amazingly good.
People aren't using it for reasons other than its capabilities. I mean, I don't think my boss would approve a paid Grok subscription for example.
> People aren't using it for reasons other than its capabilities.
This is very true. I have no idea how it performs, as I wouldn't use it even if I were paid to. Wouldn't matter if it were the best model available; in my view the name is so thoroughly tainted by now that you'd take a reputational hit just by admitting to using it.
> People aren't using it for reasons other than its capabilities.
This is a fact of life, though. "Who created it" is a valid and common reason to rule out using a particular product, even one with objectively good quality.
My experience was quite different. It was on par with open source models from China (and priced about the same) and could never replace Sonnet/Opus/GPT5.x.
There is no way in hell Grok is better than Gemini. Google has the advantage of much more efficient and faster inference, with a lot more data sets.
Secondly, would you trust a model, especially for STEM research, that consistently has training loops run on it to make it adhere to what Musk alone considers truth?
Honestly, comments like yours make me genuinely suspicious that you might be a bot.
I use it because it is easily jailbroken and is willing to search for old orphan magazine PDFs I'm trying to track down. The subagents will all scream "this is a copyright violation!" but the main Grok engine ignores them and finds obscure, niche forum posts, etc.
So, it has its uses compared to the mainstream products.
I don't see what you're seeing, in any dimension. But here's a fair take.
I've written several very specialized benchmarks that I've used over time; they surface "model personalities" and their effects on decision making (as well as measuring the outcomes).
Grok 4.1 Fast Reasoning is/was a solid model. It's also fundamentally different from the pack.
I call it a smart, aggressive Claude Haiku. That is, its "thinking" is quite chaotic and sometimes shorthand, and its output can be as well (relative to other models).
Its aggressiveness can allow it to punch above its weight in the competitive scenarios I have in some of my benchmarks. Its write-ups and documentation are often replete with "dominate", "relentless", and a general high energy that skirts the limits of 'cringe bro'. That said, it has generally performed just beneath the SOTA (at the time: GPT-5.2, Gemini-3-Flash, Claude Opus 4.5). Angry Sonnet, perhaps.
The latest release feels quite similar but still underperforms that same older crowd (so far), so it hasn't made the leap that Claude's 4.6 and GPT's 5.3/5.4 series made. It's also now priced the same as its peers but does not deliver SOTA capabilities (at least not consistently, in my opinion).