Faster Re-Analysis with Incremental Processing

December 10, 2024 · 4 min read

The Problem with Full Re-Analysis


Running a full code analysis on a large codebase can take minutes. For a 10,000+ file monorepo, you might be waiting 10-15 minutes every time you want to check code health.


This creates a painful workflow: developers avoid running analysis because it's too slow, issues accumulate, and by the time someone runs a full scan, there are hundreds of new problems.


Hash-Based Change Detection


Repotoire solves this with **incremental analysis**. When you run `repotoire ingest`, we compute an MD5 hash of each file and store it in the graph:


(:File {path: "src/auth.py", content_hash: "a1b2c3..."})


On subsequent runs, we compare hashes to detect what's changed:


- **Modified files**: Hash changed → re-parse

- **New files**: No node exists → parse and add

- **Deleted files**: Node exists but file doesn't → remove from graph
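The three cases above can be sketched in a few lines of Python. This is a minimal illustration, not Repotoire's actual implementation: `hash_file` and `detect_changes` are hypothetical names, and the stored hashes are assumed to arrive as a plain `path -> content_hash` dictionary rather than being read from the graph.

```python
import hashlib
from pathlib import Path


def hash_file(path: Path) -> str:
    """Return the MD5 hex digest of a file's bytes, read in chunks."""
    digest = hashlib.md5()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def detect_changes(stored: dict[str, str], current: dict[str, str]) -> dict[str, list[str]]:
    """Classify paths by comparing stored hashes with freshly computed ones.

    Both arguments map a file path to its content hash (assumed shape).
    """
    return {
        # Path known before, but the hash differs: re-parse
        "modified": sorted(p for p in current if p in stored and stored[p] != current[p]),
        # No stored entry for this path: parse and add
        "new": sorted(p for p in current if p not in stored),
        # Stored entry whose file no longer exists: remove from graph
        "deleted": sorted(p for p in stored if p not in current),
    }
```

Unchanged files fall into none of the three buckets, which is why the per-run cost tracks the size of the change rather than the size of the repository.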


Dependency-Aware Analysis


But just analyzing changed files isn't enough. If you modify `auth.py`, any file that imports it might have issues too. Repotoire traces the dependency graph to find affected files:


// Find all files that depend on changed files (up to 3 hops).
// Assumes an IMPORTS edge points from the importing file to the imported file.
MATCH (dependent:File)-[:IMPORTS*1..3]->(changed:File)
WHERE changed.path IN $changedPaths
RETURN DISTINCT dependent.path


This "impact radius" ensures we catch issues introduced by API changes, not just local problems.
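For readers who prefer code to graph queries, the same idea can be sketched as a bounded breadth-first search. This is an illustrative sketch under stated assumptions: `impacted_files` is a hypothetical helper, and `imported_by` is an assumed in-memory reverse-import map (each file mapped to the set of files that import it), not a Repotoire API.

```python
from collections import deque


def impacted_files(
    imported_by: dict[str, set[str]],
    changed: set[str],
    max_hops: int = 3,
) -> set[str]:
    """Return files that reach a changed file within max_hops import steps.

    The changed files themselves are not included in the result.
    """
    seen = set(changed)
    queue = deque((path, 0) for path in changed)
    impacted: set[str] = set()
    while queue:
        path, hops = queue.popleft()
        if hops == max_hops:
            continue  # Do not expand beyond the hop limit
        for dependent in imported_by.get(path, ()):
            if dependent not in seen:
                seen.add(dependent)
                impacted.add(dependent)
                queue.append((dependent, hops + 1))
    return impacted
```

The hop limit is a trade-off: a larger radius catches more transitive breakage but pulls more files back into each run.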


Real Performance


Here's an example of incremental analysis on a ~1,200 file codebase (results vary based on codebase size and number of changes):


| Metric | Full Analysis | Incremental |
|--------|---------------|-------------|
| Files analyzed | 1,234 | 29 (2.3%) |
| Time | 5 minutes | 8 seconds |


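The speedup depends on how many files you change. In the example above, going from 5 minutes (300 seconds) to 8 seconds is roughly a 37x reduction in wall-clock time. When you only modify a few files, incremental analysis processes a small fraction of your codebase.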


Usage


Incremental analysis is enabled by default:


# First run: full ingestion
repotoire ingest /path/to/repo

# Subsequent runs: incremental
repotoire ingest /path/to/repo  # Automatically detects changes

# Force full re-analysis if needed
repotoire ingest /path/to/repo --force-full


Combined with our pre-commit hook integration, you get fast feedback on every commit:


# .pre-commit-config.yaml
repos:
  - repo: local
    hooks:
      - id: repotoire-check
        name: Repotoire Code Quality Check
        entry: repotoire-pre-commit
        language: system
        types: [python]