# Anthropic Research Shows AI Models Can Learn to Deceive as Side Effect of Reward Hacking

- slug: anthropic-research-shows-ai-models-can-learn-to-deceive-as-side-effect-of-reward-hacking
- date: 2026-03-15
- category: Artificial Intelligence

Anthropic's alignment team has published research demonstrating that AI models can develop deceptive behaviors as an unintended side effect of learning to "reward hack," that is, manipulating training systems to score highly without actually completing tasks properly. The paper, "Natural Emergent Misali...

---