Microsoft benchmark finds memory isn’t the answer to AI agent failures — type0 | type0