Unfortunate. The Grok team built a phenomenal model. I use it all the time and it very often outperforms GPT and Claude on coding and STEM research related tasks. I was part of the Grok 4.2 Beta with multi-agents for a while and it was just amazingly good.
People aren't using it for reasons other than its capabilities. I mean, I don't think my boss would approve a paid Grok subscription for example.
> People aren't using it for reasons other than its capabilities.
This is very true. I have no idea how it performs, as I wouldn't use it even if I were paid to. Wouldn't matter if it were the best model available; in my view the name is so thoroughly tainted by now that you'd take a reputational hit just by admitting to using it.
> People aren't using it for reasons other than its capabilities.
This is a fact of life, though. "Who created it" is a valid and common reason to rule out using a particular product, even one with objectively good quality.
My experience was quite different. It was on par with open source models from China (and priced about the same) and could never replace Sonnet/Opus/GPT5.x.
There is no way in hell Grok is better than Gemini. Google has the advantage of much more efficient and faster inference, with a lot more data sets.
Secondly, would you trust a model, especially for STEM research, that consistently has training loops run on it to make it adhere to what Musk alone considers truth?
Honestly, comments like yours make me genuinely suspicious that you might be a bot.
I use it because it is easily jailbroken and is willing to search for old orphan magazine PDFs I'm trying to track down. The subagents will all scream "this is a copyright violation!" but the main Grok engine ignores them and finds obscure, niche forum posts, etc.
So, it has its uses compared to the mainstream products.
I don't see what you're seeing, in any dimension. But here's a fair take.
I've written several very specialized benchmarks that I've used over time; they surface "model personalities" and their effects on decision making (as well as measuring the outcomes).
Grok 4.1 Fast Reasoning is/was a solid model. It's also fundamentally different from the pack.
I call it a smart, aggressive Claude Haiku. That is, its "thinking" is quite chaotic and sometimes shorthand, and its output can be as well (relative to other models).
Its aggressiveness can allow it to punch above its weight in the competitive scenarios I have in some of my benchmarks. Its write-ups and documentation are often replete with "dominate", "relentless", and a general high energy that skirts the limits of 'cringe bro'. That said, it has generally performed just beneath the SOTA (at the time: GPT-5.2, Gemini-3-Flash, Claude Opus 4.5). Angry Sonnet, perhaps.
The latest release feels quite similar but still underperforms that same older crowd (so far), so it hasn't made the leap that Claude's 4.6 and GPT's 5.3/5.4 series made. It's also now priced the same as its peers but does not deliver SOTA capabilities (at least not consistently, in my opinion).