beat · 78 stories
A new Amazon study finds that small models solving math via chain-of-thought are mostly copying the last number they see, not computing the answer. The finding matters for anyone using CoT faithfulness as an oversight signal.