Hacker News
Bjorkbat
on Dec 20, 2024
on:
OpenAI O3 breakthrough high score on ARC-AGI-PUB
If I recall correctly, the authors of the benchmark did mention on Twitter that for certain issues, models will submit an answer that technically passes the test but is kind of questionable, so yeah, good point.