code-review-graph: The Tool That Makes AI Code Reviews Read Only the 'Key Code'

Hi, I'm Xiao Hua. I focus on efficient workflows and cutting-edge AI tools, curating open-source tech, practical tips, and tricks to help you save 50% of your time and stay a step ahead. Subscribe for free and join 100,000+ tech professionals sharing upgrade secrets!


When doing AI code reviews, have you ever noticed that every time you ask for a review, the AI tries to read the entire codebase, burning through tokens while the truly relevant code gets drowned out?

It's not your imagination.

Last week, I used Claude Code to review a Flask project with 2,000+ files, to see what a particular API change would affect. Claude read for a full 15 minutes, consumed nearly 200,000 tokens, and finally responded with something like "Based on the analysis, I suggest paying attention to the following files..."

When I opened the details, it listed 47 "potentially related" files.

47. I just wanted to see the impact scope of one API change, and you give me 47 files?

I believe any developer who has done AI code reviews on large projects can relate: Change one line of code, and the AI returns the entire "War and Peace". It's not that AI isn't smart enough; it's that it genuinely doesn't know who calls whom, or who depends on whom in your codebase.

Without a dependency map, mainstream AI coding tools tend to rescan large parts of the codebase for every task. Token consumption is high, and efficiency is low.

The open-source tool I'm introducing today, code-review-graph, is here to solve this problem. Its core philosophy is simple: Build a "map" of your code for the AI, so it only reads what it truly needs.

Tested results: token consumption dropped by a factor of 6.8 in code review scenarios, and by a factor of up to 49 in daily coding tasks.


01 How Does It Work?

code-review-graph adopts a "knowledge graph" approach, with the entire process divided into four steps:

Step 1: Tree-sitter Parsing
Uses Tree-sitter to parse code into an Abstract Syntax Tree (AST), identifying structured information like functions, classes, import relationships, and call chains.
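
To make the parsing step concrete, here is a minimal sketch. It uses Python's built-in ast module as a stand-in for Tree-sitter (my assumption, purely to keep the example dependency-free; the tool itself uses Tree-sitter), and the extract_symbols helper is a hypothetical name, not part of code-review-graph.

```python
import ast

SOURCE = '''
import json
from pathlib import Path

class Loader:
    def load(self, path):
        return json.loads(Path(path).read_text())

def main():
    Loader().load("config.json")
'''

def extract_symbols(source: str) -> dict:
    """Collect defined functions, classes, imported modules, and called names."""
    tree = ast.parse(source)
    found = {"functions": [], "classes": [], "imports": [], "calls": []}
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            found["functions"].append(node.name)
        elif isinstance(node, ast.ClassDef):
            found["classes"].append(node.name)
        elif isinstance(node, ast.Import):
            found["imports"].extend(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom):
            # node.module is None for "from . import x"; record "." in that case.
            found["imports"].append(node.module or ".")
        elif isinstance(node, ast.Call):
            func = node.func
            # Plain calls look like f(); method calls look like obj.f().
            if isinstance(func, ast.Name):
                found["calls"].append(func.id)
            elif isinstance(func, ast.Attribute):
                found["calls"].append(func.attr)
    return {kind: sorted(set(names)) for kind, names in found.items()}

symbols = extract_symbols(SOURCE)
print(symbols)
```

A real parser would also resolve which definition each call points to; this sketch only records names, which is enough to see what kind of raw material the graph is built from.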

Step 2: Building the Code Graph
Stores the parsed results as a graph. Nodes are functions, classes, and modules; edges are the relationships between them (who calls whom, who inherits from whom, who references whom).
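
Here is a rough sketch of what such a graph could look like in SQLite, which the tool uses for local storage. The table layout, column names, and the callers_of helper are my own assumptions for illustration, not the tool's actual schema.

```python
import sqlite3

# In-memory database for the demo; the schema below is hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE nodes (id TEXT PRIMARY KEY, kind TEXT NOT NULL, file TEXT NOT NULL);
CREATE TABLE edges (src TEXT NOT NULL, dst TEXT NOT NULL, kind TEXT NOT NULL,
                    PRIMARY KEY (src, dst, kind));
""")

# Nodes are functions, classes, and modules.
conn.executemany("INSERT INTO nodes VALUES (?, ?, ?)", [
    ("api.get_user", "function", "api.py"),
    ("db.fetch_user", "function", "db.py"),
    ("models.User", "class", "models.py"),
    ("models.Base", "class", "models.py"),
])

# Edges are relationships: who calls, references, or inherits from whom.
conn.executemany("INSERT INTO edges VALUES (?, ?, ?)", [
    ("api.get_user", "db.fetch_user", "calls"),
    ("db.fetch_user", "models.User", "references"),
    ("models.User", "models.Base", "inherits"),
])

def callers_of(node_id: str) -> list[str]:
    """Return the ids of all nodes that call the given node."""
    rows = conn.execute(
        "SELECT src FROM edges WHERE dst = ? AND kind = 'calls' ORDER BY src",
        (node_id,),
    )
    return [row[0] for row in rows]

print(callers_of("db.fetch_user"))
```

Once relationships sit in an edge table like this, "who calls whom" becomes a single indexed lookup instead of a scan over source files.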

Step 3: Tracking the Blast Radius
When a code change occurs, the tool calculates which files, functions, and tests will be affected by this modification—this is the "blast radius."
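
One plausible way to compute a blast radius is a breadth-first walk over reversed dependency edges. The sketch below assumes exactly that; the module names and the blast_radius function are hypothetical, and the real tool may weigh or limit the traversal differently.

```python
from collections import deque

# Hypothetical dependency edges: each key depends on (calls or imports) its values.
DEPENDS_ON = {
    "api.get_user": ["db.fetch_user"],
    "api.list_users": ["db.fetch_user"],
    "db.fetch_user": ["models.User"],
    "tests.test_api": ["api.get_user"],
    "cli.main": ["utils.log"],
}

def blast_radius(changed: str, depends_on: dict[str, list[str]]) -> set[str]:
    """Return everything that directly or transitively depends on `changed`."""
    # Invert the edges so we can walk from a node to its dependents.
    dependents: dict[str, list[str]] = {}
    for src, targets in depends_on.items():
        for dst in targets:
            dependents.setdefault(dst, []).append(src)

    affected: set[str] = set()
    queue = deque([changed])
    while queue:
        current = queue.popleft()
        for dependent in dependents.get(current, []):
            if dependent not in affected:
                affected.add(dependent)
                queue.append(dependent)
    affected.discard(changed)  # a cycle could otherwise add the changed node itself
    return affected

print(sorted(blast_radius("db.fetch_user", DEPENDS_ON)))
```

Changing db.fetch_user reaches both API functions and the test that exercises one of them, while cli.main stays out of the result because nothing connects it to the change.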

Step 4: MCP Protocol Injection
Delivers precise context to the AI coding tool via the Model Context Protocol (MCP). The AI only needs to read these few KB of key information, not the entire codebase.
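
For a feel of what travels over MCP: its messages are JSON-RPC 2.0, and a client invokes a server tool with the tools/call method. The sketch below only builds such a request; the tool name impact_radius and its changed_files argument are hypothetical placeholders, not the tool's documented interface.

```python
import json

def build_tool_call(request_id: int, tool_name: str, arguments: dict) -> str:
    """Serialize an MCP tools/call request as a JSON-RPC 2.0 message."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }
    return json.dumps(message)

# "impact_radius" and "changed_files" are made-up names for illustration only.
request = build_tool_call(1, "impact_radius", {"changed_files": ["db.py"]})
print(request)
```

The point is the size: the AI client sends a small structured question and gets back a small structured answer, instead of pulling whole files into its context.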

[Figure: Architecture workflow]

02 Let the Hard Data Speak

Principles alone aren't enough, so let's look at real test data. The developers ran automated evaluations on 6 real open-source projects (13 commits):

Token Consumption Comparison

Project    Naive Method (tokens)   Using Graph (tokens)   Savings Ratio
Flask      44,751                  4,252                  9.1x
Gin        21,972                  1,153                  16.4x
FastAPI    4,944                   614                    8.1x
Next.js    9,882                   1,249                  8.0x
HTTPX      12,044                  1,728                  6.9x
Average                                                   8.2x

An 8.2x average reduction in tokens means your AI quota lasts longer, or you can accomplish more tasks with the same quota.

Impact Analysis Accuracy

More importantly, this solution doesn't sacrifice accuracy for "streamlining":

  • 100% Recall Rate: In the benchmark, all truly affected files are found, with no omissions.
  • Average F1 Score of 0.54: Since recall is 100%, the gap comes entirely from precision. The tool over-predicts to a certain degree (including some potentially related files), but this is intentional. It's better to read a little more than to miss a critical change.
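
To see how full recall and a mid-range F1 can coexist, here is a small worked example using the standard precision, recall, and F1 definitions. The file names and counts are invented to land near the reported figure; they are not taken from the benchmark.

```python
def precision_recall_f1(predicted: set[str], actual: set[str]) -> tuple[float, float, float]:
    """Compute precision, recall, and F1 for a predicted set of affected files."""
    true_positives = len(predicted & actual)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(actual) if actual else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

# Illustrative data: 3 files are truly affected, the tool flags 8 (all 3 plus 5 extras).
actual = {"db.py", "api.py", "models.py"}
predicted = actual | {"cli.py", "utils.py", "views.py", "auth.py", "admin.py"}

p, r, f1 = precision_recall_f1(predicted, actual)
print(f"precision={p:.3f} recall={r:.1f} f1={f1:.3f}")
# prints: precision=0.375 recall=1.0 f1=0.545
```

In this made-up case nothing is missed, and the price is five extra files to read, which is far cheaper than reading the whole repository.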

Effects on Large Projects

For large monorepo projects, the effect is even more dramatic:

In a very large repository with 27,700+ files, code-review-graph filtered out about 99.9% of the files as irrelevant, so the AI only had to read about 15 truly relevant files.
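
A quick sanity check of that percentage, treating 27,700 as the file count (the article only gives "27,700+", so this is a lower bound) and 15 as the number of files read:

```python
total_files = 27_700  # lower bound; the article says "27,700+"
files_read = 15       # approximate number of files the AI still reads

excluded_share = 1 - files_read / total_files
print(f"{excluded_share:.3%}")
# prints: 99.946%
```

So the 99.9% figure is consistent with the two numbers quoted.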

[Figure: Performance benchmark]

03 What Scenarios Is It Suitable For?

Scenario 1: Code Reviews
When you need to review a PR or check the impact scope of a change, code-review-graph precisely tells you which files and functions will be affected. Not 47 "maybe related," but the truly relevant 5-10 files.

Scenario 2: Large Project Maintenance
Facing a legacy codebase maintained by dozens of people, and need to figure out what depends on a certain module and which features are affected by modifying it? The graph makes it clear at a glance.

Scenario 3: Architecture Refactoring
Want to remove a module that "looks unused"? First, use knowledge gap analysis to see if it's truly isolated before making a decision.

Scenario 4: Code Onboarding
Do new team members need to quickly understand the structure of an unfamiliar codebase? The graph is much more efficient than browsing folders one by one.


04 Incremental Updates: Modify Code, Graph Auto-Refreshes

Many worry that "building the graph is a one-time job, and subsequent maintenance is troublesome." That's not the case.

code-review-graph updates the graph automatically on every git commit or file save:

  1. Hooks detect the changed files.
  2. SHA-256 hash checks confirm which files actually changed and locate their dependents.
  3. Only the changed parts are re-parsed.
  4. The graph database is updated incrementally.

For a 2,900-file project, incremental indexing takes less than 2 seconds. The initial build is slower (about 10 seconds for 500 files), but subsequent updates are all incremental.
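
The hash-check idea behind incremental updates can be sketched with Python's hashlib. This is a minimal illustration under my own assumptions (the changed_files helper, the in-memory hash store, and the *.py filter are all hypothetical); the real tool keeps its state in the graph database and handles many languages.

```python
import hashlib
import tempfile
from pathlib import Path

def file_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of a file's contents."""
    return hashlib.sha256(path.read_bytes()).hexdigest()

def changed_files(root: Path, known: dict[str, str]) -> list[str]:
    """Return files whose hash differs from the stored one, updating `known`."""
    changed = []
    for path in sorted(root.rglob("*.py")):
        key = str(path.relative_to(root))
        digest = file_digest(path)
        if known.get(key) != digest:
            known[key] = digest
            changed.append(key)
    return changed

# Demo in a throwaway directory: index twice, editing one file in between.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "a.py").write_text("x = 1\n")
    (root / "b.py").write_text("y = 2\n")

    hashes: dict[str, str] = {}
    first_pass = changed_files(root, hashes)   # everything is new
    (root / "b.py").write_text("y = 3\n")
    second_pass = changed_files(root, hashes)  # only the edited file

print(first_pass, second_pass)
```

The first pass touches every file, the second only b.py, which is why updates stay fast even when the initial build is slow. The sketch ignores deleted files, which a real indexer would also need to handle.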

[Figure: Incremental updates]

05 How Many Languages and Platforms Are Supported?

Currently supports 23+ programming languages, covering mainstream tech stacks:

Python, TypeScript/TSX, JavaScript, Vue, Svelte, Go, Rust, Java, Scala, C#, Ruby, Kotlin, Swift, PHP, Solidity, C/C++, Dart, R, Perl, Lua, Zig, PowerShell, Julia, and Jupyter/Databricks notebooks.

Supported AI coding platforms are also extensive:

  • Claude Code
  • Cursor
  • Codex
  • Windsurf
  • Zed
  • Continue
  • OpenCode
  • Antigravity
  • Kiro

In other words, most mainstream AI coding tools are covered.

[Figure: Supported platforms]

06 It's Not Just for Code Reviews

Many think this is just a "code review tool," but its capabilities go far beyond that:

Architecture Analysis
Automatically generates code architecture diagrams, highlighting highly coupled areas (Hub nodes) and architectural bottlenecks (Bridge nodes). Especially useful during architectural refactoring.
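
As a rough illustration of the two ideas, the sketch below treats a hub as a node with many connections and a bridge as a node whose removal splits the graph. Those definitions, the sample module graph, and the function names are my assumptions; the tool may use different metrics such as centrality scores.

```python
# Hypothetical undirected module graph; assumed to be connected.
GRAPH = {
    "core": {"auth", "billing", "search", "gateway"},
    "auth": {"core"},
    "billing": {"core"},
    "search": {"core"},
    "gateway": {"core", "plugin_a"},
    "plugin_a": {"gateway", "plugin_b"},
    "plugin_b": {"plugin_a"},
}

def reachable(graph: dict[str, set[str]], start: str, removed: str) -> set[str]:
    """Return all nodes reachable from `start` when `removed` is taken out."""
    seen, stack = {start}, [start]
    while stack:
        for neighbor in graph[stack.pop()]:
            if neighbor != removed and neighbor not in seen:
                seen.add(neighbor)
                stack.append(neighbor)
    return seen

def hubs(graph: dict[str, set[str]], min_degree: int) -> list[str]:
    """Nodes with at least `min_degree` neighbors."""
    return sorted(node for node, nbrs in graph.items() if len(nbrs) >= min_degree)

def bridge_nodes(graph: dict[str, set[str]]) -> list[str]:
    """Nodes whose removal disconnects the remaining graph."""
    result = []
    for node in graph:
        others = [n for n in graph if n != node]
        if others and len(reachable(graph, others[0], node)) < len(others):
            result.append(node)
    return sorted(result)

print(hubs(GRAPH, min_degree=3))
print(bridge_nodes(GRAPH))
```

Here core is the only hub, while core, gateway, and plugin_a are all bridges: take any of them out and part of the system can no longer reach the rest.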

Knowledge Gap Analysis
Discovers "orphaned" code (functions not adequately tested, rarely used modules, areas with weak test coverage).
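
One simple way to approximate this kind of gap analysis is to mark everything reachable from a test and report what is left. The call map, the "tests." naming convention, and the untested helper are assumptions for this sketch only.

```python
# Hypothetical call graph: caller -> callees.
CALLS = {
    "tests.test_login": ["auth.login"],
    "auth.login": ["auth.hash_password"],
    "auth.hash_password": [],
    "reports.legacy_export": ["reports.format_row"],
    "reports.format_row": [],
}

def untested(calls: dict[str, list[str]], test_prefix: str = "tests.") -> list[str]:
    """Return non-test functions that no test reaches, directly or indirectly."""
    covered: set[str] = set()
    stack = [name for name in calls if name.startswith(test_prefix)]
    while stack:
        current = stack.pop()
        for callee in calls.get(current, []):
            if callee not in covered:
                covered.add(callee)
                stack.append(callee)
    return sorted(
        name for name in calls
        if not name.startswith(test_prefix) and name not in covered
    )

print(untested(CALLS))
```

The two reports functions show up because no test path leads to them, which makes them candidates for either new tests or closer inspection before deletion.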

Semantic Search
Supports vector embedding search, allowing you to find code entities by semantics, not just keyword matching.
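
The core of embedding search is ranking by vector similarity. The sketch below uses hand-written 3-dimensional vectors in place of real embeddings (an assumption to keep it self-contained; a real setup would get vectors from an embedding model), with cosine similarity as the ranking metric.

```python
import math

# Toy vectors standing in for real embeddings of code entities.
EMBEDDINGS = {
    "auth.login": [0.9, 0.1, 0.0],
    "auth.reset_password": [0.8, 0.3, 0.1],
    "billing.charge_card": [0.1, 0.9, 0.2],
}

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def search(query_vector: list[float], embeddings: dict[str, list[float]], top_k: int = 2) -> list[str]:
    """Return the names of the top_k entities most similar to the query."""
    ranked = sorted(
        embeddings,
        key=lambda name: cosine(query_vector, embeddings[name]),
        reverse=True,
    )
    return ranked[:top_k]

# Pretend this vector is the embedding of a query like "user sign-in logic".
print(search([1.0, 0.2, 0.0], EMBEDDINGS))
```

Both auth functions rank above the billing one even though the query shares no keyword with their names, which is the whole appeal over plain text matching.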

Refactoring Assistance
Rename previews, framework-aware dead code detection, and refactoring suggestions based on community structure.

Export Capabilities
Can export the graph as GraphML (for Gephi), Neo4j Cypher, an Obsidian knowledge base, or a static SVG image. Flexibly integrates with your existing workflow.
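
To show what a GraphML export boils down to, here is a minimal writer built on Python's standard XML library. The to_graphml function and the sample nodes are hypothetical, and the output omits the node and edge attributes a real export would likely carry.

```python
import xml.etree.ElementTree as ET

GRAPHML_NS = "http://graphml.graphdrawing.org/xmlns"

def to_graphml(nodes: list[str], edges: list[tuple[str, str]]) -> str:
    """Serialize a directed graph as a minimal GraphML document."""
    root = ET.Element("graphml", xmlns=GRAPHML_NS)
    graph = ET.SubElement(root, "graph", id="G", edgedefault="directed")
    for node_id in nodes:
        ET.SubElement(graph, "node", id=node_id)
    for source, target in edges:
        ET.SubElement(graph, "edge", source=source, target=target)
    return ET.tostring(root, encoding="unicode")

graphml = to_graphml(
    nodes=["api.get_user", "db.fetch_user"],
    edges=[("api.get_user", "db.fetch_user")],
)
print(graphml)
```

Saved to a .graphml file, output of this shape is the kind of input graph tools such as Gephi are designed to open.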


07 How to Install and Use?

Prerequisite: Python 3.10+

Installation Commands:

# Install the core package
pip install code-review-graph

# Auto-detect and configure your AI coding platform
code-review-graph install

# Build the code graph
code-review-graph build

The install command handles all configuration: it automatically detects which AI coding tools you have installed, writes the corresponding MCP configuration, and injects graph-aware instructions.

Optional Dependencies (install as needed):

# Quotes keep shells such as zsh from treating the brackets as a glob pattern

# Vector embedding support (for semantic search)
pip install "code-review-graph[embeddings]"

# Community detection support
pip install "code-review-graph[communities]"

# Evaluation benchmarks
pip install "code-review-graph[eval]"

# Wiki generation (requires ollama)
pip install "code-review-graph[wiki]"

# Install everything
pip install "code-review-graph[all]"

After installation, open your project and send your AI assistant this prompt:

Build the code review graph for this project

Then you can start using it.


08 Summary

code-review-graph solves a very real problem: it makes the AI read only the code it should read.

Core value:

  • Token consumption reduced 8.2x on average, and up to 49x.
  • Incremental updates; the graph auto-refreshes after code changes.
  • Supports 23+ languages and mainstream AI coding platforms.
  • Local SQLite storage, no dependency on cloud services.
  • Not just for review, but also for architecture analysis, refactoring assistance, and knowledge gap discovery.

Limitations to be aware of:

  • For small, single-file changes, the graph construction overhead might exceed the benefit.
  • Search quality (MRR) still has room for improvement.
  • Flow detection is currently only reliable in Python projects.

If you often do code reviews or use AI coding tools in large projects, this tool is worth a try.


Complete Project Information


If this article was helpful to you, give it a 'Wow' to show your support. Feel free to discuss any questions in the comments.

If you find it useful, you can also drop a Star on GitHub, which is the best way to support developers.

See you next time!
