When an AI coding agent needs to trace a function call across a large codebase, it does something a human developer would never do: it opens files at random, hoping to land on the right ones. A new open-source tool called codebase-memory-mcp fixes this by building a persistent knowledge graph of your entire codebase once, then letting the agent query that graph in milliseconds instead of burning through hundreds of thousands of tokens reading files blind.
The tool, built by developer DeusData and shared on GitHub in March 2026, indexes codebases using tree-sitter AST parsing across 64 programming languages and stores the result in SQLite (GitHub: DeusData/codebase-memory-mcp). After a one-time indexing pass, a coding agent can ask "what calls this function?" and get a structured answer in roughly 200 tokens and under one millisecond. The alternative: the same question can cost 45,000 tokens and require multiple rounds of grep-and-read guessing (DEV.to: How I Cut My AI Coding Agent's Token Usage by 120x).
On the Linux kernel, a 28-million-line, 75,000-file codebase, the tool builds a graph with 2.1 million nodes and 4.9 million edges in three minutes on an M3 Pro. Typical repositories index in milliseconds. The pipeline is RAM-first: LZ4-compressed bulk reads, in-memory SQLite, and fused Aho-Corasick pattern matching, with memory released back to the OS after indexing completes.
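The graph-in-SQLite idea is simple enough to sketch. Below is a minimal, hypothetical version of a nodes/edges model in an in-memory SQLite database; the table layout, edge kind, and function names are illustrative assumptions, not the tool's actual schema.

```python
import sqlite3

# Minimal sketch of a code graph stored in SQLite.
# Schema and names here are assumptions for illustration only.
db = sqlite3.connect(":memory:")  # RAM-first, as in the indexing pipeline
db.executescript("""
CREATE TABLE nodes (id INTEGER PRIMARY KEY, name TEXT, kind TEXT, file TEXT);
CREATE TABLE edges (src INTEGER, dst INTEGER, kind TEXT);
CREATE INDEX idx_edges_dst ON edges (dst, kind);
""")

# Pretend the indexer found two functions and one call site.
db.executemany("INSERT INTO nodes VALUES (?, ?, ?, ?)", [
    (1, "parse_config", "function", "config.py"),
    (2, "main", "function", "app.py"),
])
db.execute("INSERT INTO edges VALUES (2, 1, 'calls')")  # main -> parse_config

# "What calls parse_config?" becomes one indexed lookup
# instead of a repository-wide grep-and-read loop.
callers = db.execute("""
    SELECT n.name, n.file FROM edges e
    JOIN nodes n ON n.id = e.src
    WHERE e.dst = (SELECT id FROM nodes WHERE name = 'parse_config')
      AND e.kind = 'calls'
""").fetchall()
print(callers)  # [('main', 'app.py')]
```

The answer is a handful of rows rather than file contents, which is where the token savings come from.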
Benchmarks across 31 languages and 372 test questions show a 120x reduction in token usage for structural code queries compared to file-by-file exploration. To pick one data point: finding a function by pattern costs roughly 200 tokens using the graph versus approximately 45,000 tokens via file search. Tracing a call chain three levels deep costs about 800 tokens versus 120,000. Across the full benchmark suite of five query types, the total comes to 3,400 tokens against 412,000, a 99.2% reduction.
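The headline figures are internally consistent; the reduction percentages follow directly from the quoted totals:

```python
# Benchmark totals quoted above: graph queries vs. file-by-file exploration.
graph_total, file_total = 3_400, 412_000

reduction_pct = (1 - graph_total / file_total) * 100
factor = file_total / graph_total

print(f"{reduction_pct:.1f}% fewer tokens")  # 99.2% fewer tokens
print(f"roughly {factor:.0f}x fewer tokens") # roughly 121x fewer tokens
```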
The tool exposes 14 tools via the Model Context Protocol (MCP), including call-chain tracing, impact analysis, dead code detection, and Cypher-like graph queries. No LLM runs in the server itself. The agent is the intelligence layer; the tool gives it precise structural answers instead of raw file contents. The install script auto-detects and configures Claude Code, Codex CLI, Gemini CLI, Zed, OpenCode, and several other coding agents, reducing setup to a single command.
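To make the MCP interaction concrete, here is the kind of request an agent might send. The JSON-RPC envelope and the "tools/call" method are standard MCP; the tool name "trace_call_chain" and its arguments are hypothetical stand-ins, not the server's documented tool surface.

```python
import json

# Hypothetical MCP tool call for tracing a call chain three levels deep.
# Envelope follows the MCP spec (JSON-RPC 2.0, method "tools/call");
# the tool name and argument names are illustrative assumptions.
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "trace_call_chain",
        "arguments": {"function": "parse_config", "depth": 3},
    },
}
print(json.dumps(request, indent=2))
```

Because the protocol is agent-agnostic, the same request shape works whether the client is Claude Code, Codex CLI, or Gemini CLI.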
codebase-memory-mcp is not alone in this space. code-review-graph, a Claude Code plugin from developer tirth8205 with roughly 4,200 GitHub stars, builds persistent knowledge graphs for token-efficient code review and claims 6.8x fewer tokens on reviews and up to 49x on daily coding tasks (GitHub: tirth8205/code-review-graph). Understand-Anything, a multi-agent Claude Code plugin from Lum1104, takes a different approach: a pipeline of specialized agents scans projects and produces an interactive dashboard visualization alongside a structured knowledge graph. Axon, from developer harshkedia177, exposes graph-powered code intelligence via MCP with a focus on dependency and call-chain tracking.
What unites these tools is a shared diagnosis: the way AI coding agents explore codebases today is architecturally wasteful. Agents lack a map. They have context windows, not structural understanding. They can read every file in a repository but they cannot tell you which files matter or how they relate. The knowledge graph approach solves this by separating the structural indexing problem from the reasoning problem, letting each do what it does best.
MCP is becoming the standard interface for this class of tool. Several of the leading projects expose themselves as MCP servers, meaning Claude Code, Codex, and Gemini CLI can pick them up without custom integration work. Infrastructure that requires per-agent integration effort tends to stay niche. Infrastructure that works across agents tends to get adopted fast.
For teams running large monorepos, the economics are straightforward. A coding agent burning 400,000 tokens per hour on file exploration versus 3,400 on graph queries is the difference between a session that costs dollars and one that costs cents at scale. The difference is negligible for solo developers on small projects.
The problem these tools are solving did not exist two years ago. AI coding agents created it by making file-level exploration computationally cheap in a way that obscured its inefficiency. The agents could afford to be blind, so they were. Now that agents are running longer sessions on larger codebases, the waste has become visible, and a cluster of graph-indexing tools has emerged to fix it.
The original announcement came from developer Guri Singh (handle @heygurisingh on X), whose post about the tool went viral in developer communities in late March 2026 (X: @heygurisingh). The tool is MIT-licensed and available on GitHub.