Agree. Also, because of the way AI writes, it takes SO LONG to read through it (they're trained on blogspam where the page tells you the author's life story, as well as the bloody history of bread, before telling you how to bake it).
That's why in such cases I usually ask another AI to make me a short summary with the main points. I wish the human behind the looong article would choose to publish a short summary directly instead.
LLMs already use mixture-of-experts models; if you ensure the neurons are all glued together then (I think) you train language and reasoning simultaneously.
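For anyone unfamiliar with the routing idea, here's a minimal sketch in Python. Everything is illustrative (sizes, the softmax-over-top-k routing), not taken from any particular model:

    # Minimal mixture-of-experts layer: a learned router scores the
    # experts per token, and only the top-k experts actually run.
    import numpy as np

    rng = np.random.default_rng(0)
    d_model, n_experts, top_k = 16, 4, 2  # toy sizes

    # Each "expert" is just its own weight matrix here.
    experts = [rng.standard_normal((d_model, d_model)) * 0.1 for _ in range(n_experts)]
    router = rng.standard_normal((d_model, n_experts)) * 0.1

    def moe_layer(x):
        """x: (d_model,) activation for a single token."""
        logits = x @ router                    # one score per expert
        top = np.argsort(logits)[-top_k:]      # indices of the top-k experts
        w = np.exp(logits[top])
        w /= w.sum()                           # softmax over the selected experts only
        # Only the chosen experts compute; the rest stay idle for this token,
        # which is where the compute saving comes from.
        return sum(wi * (x @ experts[i]) for wi, i in zip(w, top))

    token = rng.standard_normal(d_model)
    print(moe_layer(token).shape)  # (16,)

The point being: the experts are already "glued together" through the shared router and residual stream, so specialization is soft, not a hard partition into a "language expert" and a "reasoning expert".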
Unlike AI, you aren't able to regurgitate entire programs and patterns you've seen before.
AI's capacity for memorisation is unrivaled. I find it mind-blowing that you can download a tiny ~4 GB model and it will have vastly more general knowledge than an average human (considering that the human is more likely to be wrong if you ask them trivia about e.g. the Spanish Civil War).
But the average human still has actual reasoning capabilities, which is still (I think?) a debated point with AI.
> which is still (I think?) a debated point with AI.
It's not. People misread an Apple study and it became a meme. It lost currency as a meme because it is impossible to use a model in 2026 and come away with the idea it cannot reason, for any reasonable definition of the word reason (pun intended). Most of the debate from there is just people misreading each other and imagining incentive structures at play. (To be clear, I am not claiming they are never stupid, e.g. the car wash dilemma, but I am claiming they're gee-whiz enough at enough things that it's become de facto beyond honest debate.)
> AI's capacity for memorisation is unrivaled,
Much like "it just memorizes training data", "memorization" has a kernel of truth to it. But memorizing does not imply "it has 100% 'learned' Brainfuck (for some definition of learned like guaranteed, 100% reproducible, translatable computation) to the point where writing it is just as easy as writing any other program, and thus, since it hasn't, it cannot reason."
At the end of the day these are just mathematical objects. And while it's not discourse-contributing, the mundane truth is that those matmuls, born from boring curve-fitting at scale, know/memorized/can reason about/can parrot/have adjusted the float32s in such a way that they produce C a lot better than Brainfuck. Much like us. But they're just matmuls curve-fitting at scale.
Reasoning and the "appearance" of reasoning are two different things. Some people intrinsically understand this. And some do not, and those people can never be made to understand it. I think it is one of those things that you either get automatically or don't get at all.
So does a human engaged in rationalization or confabulation just appear to reason? We might be closer to these machines than you think, and I don’t mean that in a positive way.
Not OP, but as an LLM skeptic, I'd absolutely say that humans are natively very poor reasoners.
With effort, support, and resources, we can learn to reason well from first principles - call it reaching "intellectual maturity."
Catch an emotionally-immature human in a mistake or conflicting set of beliefs, and you'll be able to see them do exactly what you describe above: rationalize, deflect, and twist the data to support a more emotionally-comfortable narrative.
That usually holds even for intellectually-mature individuals who have not yet matured emotionally, even though they may reason quite well when the stakes are low.
Humans that have matured both emotionally and intellectually, however, are often able to keep themselves stable and reason well even in difficult circumstances.
The ways LLMs consistently fail spectacularly on out-of-distribution problems (like these esolangs) do seem to suggest they don't really mature intellectually, not the way humans can.
Maybe the Wiggum loop strategy shows otherwise? I'm not sure.
To me, it smells more like brute-forcing through to a result without fully understanding the problem, though.
IMO everyone is missing the point of this thing. It's not an auth system or security boundary, it doesn't provide any security guarantees whatsoever, it doesn't do anything. The entire point is to cover a company's derriere should their agentic security apparatus (or lack thereof) fail to prevent malicious prompt injection etc.
This way, they can avoid being legally blamed for stuff-ups and instead scapegoat some hapless employee :-) using cryptographic evidence that the employee "authorized" whatever action was taken.
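To make that concrete, here's a hypothetical sketch of what such "cryptographic evidence" could amount to: an HMAC over an action record. All names are made up, and note what the signature actually proves: that someone holding the key signed off on the record, not that the action was safe or intended.

    # Hypothetical sketch: "signing" an agent action on behalf of an employee.
    # The HMAC proves the key holder authorized this exact record; it says
    # nothing about whether the action was wise or the result of prompt injection.
    import hmac, hashlib, json, time

    EMPLOYEE_KEY = b"per-employee-secret"  # illustrative; a real system would use asymmetric keys

    def sign_action(employee, action):
        record = {"employee": employee, "action": action, "ts": time.time()}
        payload = json.dumps(record, sort_keys=True).encode()
        record["sig"] = hmac.new(EMPLOYEE_KEY, payload, hashlib.sha256).hexdigest()
        return record

    def verify(record):
        sig = record.pop("sig")
        payload = json.dumps(record, sort_keys=True).encode()
        record["sig"] = sig
        expected = hmac.new(EMPLOYEE_KEY, payload, hashlib.sha256).hexdigest()
        return hmac.compare_digest(sig, expected)

    entry = sign_action("alice", {"tool": "delete_repo", "arg": "prod"})
    print(verify(entry))  # True: "alice authorized this", whatever actually drove the action

Which is exactly the liability-shifting point: the audit trail is cryptographically solid as evidence of who to blame, while providing zero protection against the action itself.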
I've tried both, and I'm still not sure. Claude Code steers more towards a hands-off, vibe coding approach, which I often regret later. With Copilot I'm more involved, which feels less 'magical' and takes me more time, but generally does not end in misery.