# Why a 24% Score on a Reasoning Benchmark Is an Argument About Compute

- Date: 2026-04-06
- Category: Artificial Intelligence

A CoreThink AI pipeline that separates perception from rule induction pushed a weak LLM from 16% to 24.4% on ARC-AGI-2 without fine-tuning, and the ablation numbers show why the result matters for the test-time scaling debate.

---