How ContextPacker Works

A different approach to code context retrieval. No embeddings, no vector database, no pre-indexing. Just intelligent file selection that works on any repository instantly.

The Problem We Solve

When an AI coding agent needs to answer a question about a codebase, it faces a fundamental constraint: context windows are limited. A typical repository has thousands of files, but an LLM can only process a few dozen at once.

The traditional solution is embeddings-based RAG (Retrieval-Augmented Generation): convert every file into vectors, store them in a database, and find similar files when a question arrives. This works, but comes with significant infrastructure overhead:

  • You need to index repositories upfront — waiting minutes before first use
  • You need a vector database to store and query embeddings
  • You need to keep indexes in sync when code changes
  • Each repository requires separate storage and compute

For teams building AI coding tools, this infrastructure becomes a burden. What if there were a simpler way?

Our Approach: Structure-Aware Selection

ContextPacker takes a different approach. Instead of embedding every file's content, we leverage something simpler: file paths and code structure contain rich semantic information.

Consider these file paths from a web framework:

src/routing/router.py
src/routing/middleware.py
src/auth/handlers.py
src/auth/jwt.py
tests/test_routing.py

Even without reading the file contents, you can infer a lot. If someone asks "How does routing work?", the answer is likely in src/routing/. This intuition scales surprisingly well.

Adding Symbol Context

File paths alone aren't always enough. That's why we also extract function and class names from each file using lightweight AST parsing:

src/routing/router.py [Router, add_route, match_path, dispatch]
src/routing/middleware.py [Middleware, apply_chain, before_request]
src/auth/handlers.py [login, logout, refresh_token, require_auth]

Now a question like "How does authentication work?" can match login, require_auth, and jwt — even if the question doesn't use those exact words.
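
To make this concrete, here is a minimal sketch of this kind of symbol extraction for Python files, using the standard library's ast module (the function name and exact shape are illustrative, not our internal implementation):

import ast

def extract_symbols(source: str) -> list[str]:
    """Collect function and class names (including methods) from Python source."""
    symbols = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            symbols.append(node.name)
    return symbols

Run over src/routing/router.py, it would surface names like Router and dispatch, which get appended to that file's entry in the tree.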

Intelligent Selection

With this enriched file tree, we use a fast language model to select the files most relevant to the question (a sketch of this step follows the list below). This approach has several key advantages:

  • No pre-indexing required — works on first request
  • Understands code structure — knows src/ matters more than tests/
  • Cross-file reasoning — can find related files even when naming differs
  • Works on any branch or commit — no sync needed
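
As promised above, here is a simplified sketch of the selection step. The prompt wording, model, and output format are illustrative assumptions, not our production internals:

def build_selection_prompt(tree_lines: list[str], question: str, max_files: int = 20) -> str:
    """Ask a fast LLM to pick relevant files from the enriched tree."""
    tree = "\n".join(tree_lines)
    return (
        "You are selecting source files needed to answer a question about a codebase.\n\n"
        f"Question: {question}\n\n"
        "Files (path [symbols]):\n"
        f"{tree}\n\n"
        f"List up to {max_files} paths, one per line, most relevant first."
    )

The model's answer is parsed back into a list of paths, which are then read and packed into the response.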

Who Is This For?

ContextPacker is built as infrastructure for AI coding tools. We're not building the end-user product — we're providing the retrieval layer that powers them.

AI Coding Agents

Agents that need to understand codebases to answer questions, review PRs, or generate code.

IDE Extensions

VS Code, JetBrains, or Neovim plugins that provide AI-powered code assistance.

Code Review Tools

Automated review systems that need context about how code connects across files.

Documentation Generators

Tools that generate or update docs by understanding code structure.

The API

Using ContextPacker is simple. Send a repository URL and a question, get back relevant file contents formatted as Markdown:

POST /v1/packs
{
  "repo_url": "https://github.com/your-org/your-repo",
  "query": "How does authentication work?",
  "max_tokens": 6000
}

Response:
{
  "markdown": "### src/auth/handlers.py\n```python\ndef login(...)...",
  "files": [
    {"path": "src/auth/handlers.py", "tokens": 1823, "reason": "..."},
    {"path": "src/auth/jwt.py", "tokens": 945, "reason": "..."}
  ],
  "stats": {"tokens_packed": 4823, "files_selected": 2, ...}
}

The response fits directly into your LLM's context. No parsing, no transformation — just paste it in.
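
Here is a minimal client sketch in Python. The base URL and Bearer-token header are placeholder assumptions for illustration; check the API reference for the real values:

import requests

# Base URL and auth scheme below are illustrative placeholders.
resp = requests.post(
    "https://api.contextpacker.example/v1/packs",
    headers={"Authorization": "Bearer YOUR_API_KEY"},
    json={
        "repo_url": "https://github.com/your-org/your-repo",
        "query": "How does authentication work?",
        "max_tokens": 6000,
    },
    timeout=60,
)
resp.raise_for_status()
pack = resp.json()

# The markdown field drops straight into an LLM prompt.
prompt = "Answer using this code context:\n\n" + pack["markdown"]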

Private Repositories

We support private GitHub repositories with a simple addition: pass a read-only Personal Access Token in your request. The token is used once to clone, never stored or logged.
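
For example, a private-repo request might look like this; the token field name is an assumption for illustration, not a documented parameter:

POST /v1/packs
{
  "repo_url": "https://github.com/your-org/private-repo",
  "query": "How does billing work?",
  "github_token": "github_pat_XXXX",
  "max_tokens": 6000
}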

Security Model

  • Repositories are cloned to temporary directories
  • Files are read, processed, and immediately deleted
  • Tokens are never persisted — used once per request
  • You can use short-lived tokens (1-hour expiry recommended)

Trade-offs

No approach is perfect. Here's an honest look at when ContextPacker shines and when embeddings might be better:

Scenario                   | ContextPacker  | Embeddings
First request to new repo  | ✓ Instant      | Minutes to index
Many queries, same repo    | ~3s each       | ✓ ~100ms each
Different branches/PRs     | ✓ Just works   | Re-index each
Infrastructure to run      | ✓ None (API)   | Vector DB
Exact string search        | Not optimized  | Not optimized

Bottom line: if you're building something that touches many repositories or branches, or that needs to work without setup, ContextPacker is a good fit. If you have one repository with very high query volume, embeddings with caching might edge ahead on latency.

Getting Started

We're currently in beta with a free tier (50 packs/month). If you're building AI coding tools and want to try it: