> What they’ve proven here is that it can be done.
No, they haven't. These results do not generalize, as the article itself notes:
"Furthermore, early data points suggest that the upcoming ARC-AGI-2 benchmark will still pose a significant challenge to o3, potentially reducing its score to under 30% even at high compute"
Meaning they haven't solved AGI. The task itself also does not represent programming well, and these models do not perform that well on engineering benchmarks.
Just to be clear: is your position that the cost of inference for o3 will not go down over time? That would be the first time that has happened for any of these models.
Even if compute costs drop by 10X a year (which seems like a gross overestimate IMO), you're still looking at 1000X the cost for a 2X annual performance gain. Costs outpacing progress is the very definition of diminishing returns.
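To make that arithmetic concrete, here is a rough sketch. The 1000X cost premium and the 10X annual price drop are the hypothetical figures from the comment above, not measured values, and the function names are made up for illustration.

```python
def cost_after(years, cost_multiplier=1000.0, annual_drop=10.0):
    """Relative cost of the expensive configuration after `years` of price drops.

    cost_multiplier: assumed cost premium today (1000X in the comment above).
    annual_drop: assumed factor by which compute gets cheaper each year.
    """
    return cost_multiplier / (annual_drop ** years)


def years_to_parity(cost_multiplier=1000.0, annual_drop=10.0):
    """Whole years of price drops until the relative cost is 1X or lower."""
    years = 0
    cost = cost_multiplier
    while cost > 1.0:
        cost /= annual_drop
        years += 1
    return years


if __name__ == "__main__":
    for year in range(4):
        print(f"year {year}: {cost_after(year):g}X the baseline cost")
    print("years to parity:", years_to_parity())
```

Under these assumed numbers, the premium is still 100X after one year of a 10X drop, and it takes about three years to reach parity. A slower drop stretches that out: calling `years_to_parity(1000.0, 2.0)` models a 2X annual drop instead.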
From their charts, o3 mini outperforms o1 using less energy. I don’t see the diminishing returns you’re talking about. Improvement outpacing cost. By your logic, perhaps the very definition of progress?
You can also use the full o3 model, consume insane power, and get insane results. Sure, it will probably take longer to drive down those costs.
You’re welcome to bet against them succeeding at that. I won’t be.