Faster Re-Analysis with Incremental Processing
The Problem with Full Re-Analysis
A full code analysis of a large codebase is slow. On a monorepo with 10,000+ files, you might wait 10-15 minutes every time you want to check code health.
This creates a painful workflow: developers avoid running analysis because it's too slow, issues accumulate, and by the time someone runs a full scan, there are hundreds of new problems.
Hash-Based Change Detection
Repotoire solves this with **incremental analysis**. When you run `repotoire ingest`, we compute an MD5 hash of each file and store it in the graph:
```cypher
(:File {path: "src/auth.py", content_hash: "a1b2c3..."})
```
On subsequent runs, we compare hashes to detect what's changed:
- **Modified files**: Hash changed → re-parse
- **New files**: No node exists → parse and add
- **Deleted files**: Node exists but file doesn't → remove from graph
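The three cases above can be sketched in a few lines of Python. This is a minimal illustration, not Repotoire's actual implementation: `file_md5` and `classify_changes` are hypothetical names, and the stored hashes are modeled as a plain dictionary rather than graph nodes.

```python
import hashlib
from pathlib import Path


def file_md5(path: Path) -> str:
    """Return the hex MD5 digest of a file's bytes, read in chunks."""
    digest = hashlib.md5()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def classify_changes(current: dict[str, str], stored: dict[str, str]) -> dict[str, list[str]]:
    """Compare current hashes with stored ones; both map file path -> content hash.

    `stored` stands in for the content_hash values kept on File nodes (assumption).
    """
    return {
        # Path known on both sides, but the hash differs: re-parse.
        "modified": sorted(p for p in current if p in stored and current[p] != stored[p]),
        # Path has no stored entry: parse and add.
        "new": sorted(p for p in current if p not in stored),
        # Stored entry whose file no longer exists: remove from the graph.
        "deleted": sorted(p for p in stored if p not in current),
    }
```

Building `current` would amount to calling `file_md5` on every source file under the repository root. Files whose hash matches the stored value fall into none of the three buckets and are skipped entirely.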
Dependency-Aware Analysis
But just analyzing changed files isn't enough. If you modify `auth.py`, any file that imports it might have issues too. Repotoire traces the dependency graph to find affected files:
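```cypher
// Find all files that depend on changed files (up to 3 hops).
// Assumes IMPORTS edges point from the importing file to the imported file,
// so dependents sit on the source side of the relationship.
MATCH (dependent:File)-[:IMPORTS*1..3]->(changed:File)
WHERE changed.path IN $changedPaths
RETURN DISTINCT dependent.path
```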
This "impact radius" ensures we catch issues introduced by API changes, not just local problems.
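To make the impact radius concrete, here is a rough in-memory equivalent of that traversal in Python. It is only a sketch under stated assumptions: the `imports` dictionary (importer path to the set of paths it imports) and the `affected_files` function are hypothetical stand-ins for the graph database, and import edges are assumed to point from the importing file to the imported one.

```python
from collections import deque


def affected_files(imports: dict[str, set[str]], changed: set[str], max_hops: int = 3) -> set[str]:
    """Return files that import a changed file, directly or transitively, within max_hops."""
    # Invert the import graph: imported file -> files that import it.
    importers: dict[str, set[str]] = {}
    for source, targets in imports.items():
        for target in targets:
            importers.setdefault(target, set()).add(source)

    affected: set[str] = set()
    visited = set(changed)  # changed files are re-parsed anyway, so they are not reported here
    queue = deque((path, 0) for path in changed)
    while queue:
        path, hops = queue.popleft()
        if hops == max_hops:
            continue  # do not walk past the hop limit
        for importer in importers.get(path, ()):
            if importer not in visited:
                visited.add(importer)
                affected.add(importer)
                queue.append((importer, hops + 1))
    return affected
```

With a chain `main.py` → `app.py` → `api.py` → `views.py` → `auth.py` and a change to `auth.py`, this returns `views.py`, `api.py`, and `app.py`. `main.py` is four hops away and falls outside the radius. The hop limit is a trade-off: a larger radius catches more indirect breakage but re-analyzes more files.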
Real Performance
Here's an example of incremental analysis on a ~1,200 file codebase (results vary based on codebase size and number of changes):
| Metric | Full Analysis | Incremental |
|--------|--------------|-------------|
| Files | 1,234 | 29 (2.3%) |
| Time | 5 minutes | 8 seconds |
The speedup depends on how many files you change. When you only modify a few files, incremental analysis processes a small fraction of your codebase.
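As a quick back-of-the-envelope check on the example figures above, the snippet below computes the fraction of files processed and the resulting speedup. The function name `incremental_summary` is made up for illustration, and the inputs are just the numbers from the table.

```python
def incremental_summary(total_files: int, processed_files: int,
                        full_seconds: float, incremental_seconds: float) -> tuple[float, float]:
    """Return (fraction of files processed, speedup factor) for one incremental run."""
    fraction = processed_files / total_files
    speedup = full_seconds / incremental_seconds
    return fraction, speedup


# Figures from the example table: 29 of 1,234 files, 5 minutes vs. 8 seconds.
fraction, speedup = incremental_summary(1234, 29, 5 * 60, 8)
print(f"speedup: {speedup}x")  # 300 s / 8 s = 37.5
```

Note that the speedup need not track the file fraction exactly, since fixed costs such as hashing every file and querying the graph are paid on each run regardless of how little changed.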
Usage
Incremental analysis is enabled by default:
```bash
# First run: full ingestion
repotoire ingest /path/to/repo

# Subsequent runs: incremental
repotoire ingest /path/to/repo  # Automatically detects changes

# Force full re-analysis if needed
repotoire ingest /path/to/repo --force-full
```
Combined with our pre-commit hook integration, you get fast feedback on every commit:
```yaml
# .pre-commit-config.yaml
repos:
  - repo: local
    hooks:
      - id: repotoire-check
        name: Repotoire Code Quality Check
        entry: repotoire-pre-commit
        language: system
        types: [python]
```