Code context API for AI agents.

Embedding-quality retrieval from any repo — no vector DB, no indexing. Just call the API.

Works on first request · Private repos supported · 100 free credits

Built for AI coding agents · IDE extensions · PR review bots · documentation generators

Give an agent the right files and it answers correctly. Give it the wrong files and it hallucinates. Below, we compare three ways to pick those files.

Works with private repos

You control the access

We fetch your files over HTTPS (same as your IDE), read them, and delete them. Nothing is stored. Use a read-only PAT that expires in 1 hour — revoke it after testing.

Read-only PAT works · Set 1-hour expiry · Revoke anytime
curl https://contextpacker.com/v1/packs \
  -H "X-API-Key: $CP_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "repo_url": "https://github.com/your-org/private",
    "query": "How does auth work?",
    "vcs": {
      "provider": "github",
      "token": "'"$GITHUB_PAT"'"
    }
  }'

Benchmarked on real repos

177 questions across 14 repos, including private codebases. Every approach got the same ~6,000-token budget.

98% Hit@10 · Finds the right files
0.92 NDCG · No vector DB needed
8.5/10 Answer Quality · Same as embeddings, zero infra
Approach         Hit@10   NDCG   Answer Quality   Setup
No context       n/a      n/a    4.9/10           None
Embeddings RAG   98%      0.79   8.6/10           Vector DB
ContextPacker    98%      0.92   8.5/10           None

Tested on private repos (zero LLM prior). Same answer quality as embeddings, 0.13 higher NDCG on file ranking (0.92 vs 0.79), zero infrastructure.
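For readers unfamiliar with the metrics: Hit@10 asks whether any relevant file lands in the top 10 results, and NDCG rewards ranking relevant files higher. Here is a self-contained sketch of both, with hypothetical data rather than our benchmark harness:

import math

def hit_at_10(ranked, relevant):
    # Hit@10: at least one relevant file in the top 10 results.
    return any(f in relevant for f in ranked[:10])

def ndcg_at_10(ranked, relevant, k=10):
    # Binary-relevance NDCG: discounted gain over the ideal ordering.
    dcg = sum(
        1 / math.log2(i + 2)
        for i, f in enumerate(ranked[:k])
        if f in relevant
    )
    ideal = sum(1 / math.log2(i + 2) for i in range(min(len(relevant), k)))
    return dcg / ideal if ideal else 0.0

# Hypothetical query: ranked retrieval vs. hand-labeled relevant files.
ranked = ["src/auth/session.py", "src/app.py", "tests/test_auth.py"]
relevant = {"src/auth/session.py", "src/auth/tokens.py"}
print(hit_at_10(ranked, relevant))             # True
print(round(ndcg_at_10(ranked, relevant), 2))  # 0.61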

Why not just use embeddings?

You could. But here's what you'd need to build and maintain:

No pre-indexing

Embeddings need you to index first, then wait. We work on first call — just pass the repo URL.

Embeddings: "Wait 5 min while we index"
Us: Works immediately

No vector DB

Pinecone, Weaviate, Chroma — pick one, pay for it, keep it in sync. Or just call our API.

Embeddings: Manage vector infra
Us: Zero infrastructure

Structural awareness

We read the file tree. We know src/ from tests/, entrypoints from utils.

Embeddings: Text similarity only
Us: Understands repo layout

We're a building block, not a platform. Skip the RAG pipeline — just call the API.
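To make that concrete, here is a rough sketch of the embeddings route using Chroma (one of the stores named above). The naive iter_chunks helper is our own illustration, and real pipelines also need re-indexing on every push, which is elided here:

import pathlib
import chromadb

def iter_chunks(root, size=1500):
    # Naive chunker: fixed-size slices of every source file.
    for path in pathlib.Path(root).rglob("*.py"):
        text = path.read_text(errors="ignore")
        for i in range(0, len(text), size):
            yield f"{path}:{i}", text[i : i + size]

# 1. Stand up a store and index the repo before you can ask anything.
client = chromadb.Client()
collection = client.create_collection("repo")
for chunk_id, text in iter_chunks("./checkout"):   # re-run on every push
    collection.add(ids=[chunk_id], documents=[text])

# 2. Only now can you query, by text similarity alone.
hits = collection.query(query_texts=["How does auth work?"], n_results=8)

Every piece of that (the checkout, the chunker, the collection, the re-index) is infrastructure you own. The alternative is the single POST in the curl example above or agent.py below.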

Under the hood

1. Clone + index on the fly

We shallow-clone your repo, build a file tree with symbol extraction, and delete it after. No persistence.

2. LLM picks the files

A fast model (Gemini Flash) reads the tree plus your query and selects ~8 files. Smarter than cosine similarity.

3. Pack to your token budget

We fit the selected files into your limit. Large files get truncated intelligently, not dropped entirely. A sketch of all three steps follows.
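Concretely, here is a minimal, illustrative sketch of that three-step flow. It is not the production implementation: pack is our own name, gpt-4o-mini stands in for the fast model, and the service's symbol extraction and smarter truncation heuristics are elided.

import json
import pathlib
import subprocess
import tempfile

import openai

def pack(repo_url: str, query: str, max_tokens: int = 6000) -> str:
    budget = max_tokens * 4  # rough heuristic: ~4 characters per token

    with tempfile.TemporaryDirectory() as tmp:
        # 1. Shallow-clone and build the file tree. The checkout vanishes
        #    with the tempdir, so nothing persists.
        subprocess.run(["git", "clone", "--depth=1", repo_url, tmp], check=True)
        root = pathlib.Path(tmp)
        tree = "\n".join(
            str(p.relative_to(root))
            for p in root.rglob("*")
            if p.is_file() and ".git" not in p.parts
        )

        # 2. A fast model picks ~8 files from the tree (assumes the reply
        #    is a bare JSON list of paths).
        resp = openai.chat.completions.create(
            model="gpt-4o-mini",
            messages=[{
                "role": "user",
                "content": f"Repo tree:\n{tree}\n\nQuery: {query}\n"
                           "Reply with a JSON list of up to 8 relevant paths.",
            }],
        )
        picked = json.loads(resp.choices[0].message.content)

        # 3. Greedy packing: keep files in rank order, truncating any that
        #    would overflow the budget instead of dropping them.
        parts, used = [], 0
        for rel in picked:
            path = root / rel
            if not path.is_file():
                continue
            take = path.read_text(errors="ignore")[: budget - used]
            parts.append(f"## {rel}\n{take}")
            used += len(take)
            if used >= budget:
                break
        return "\n\n".join(parts)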

Zero indexing. One HTTP call.

Just call the API with a repo URL and a question. No onboarding, no sync, no CLI.

  • Returns standard Markdown
  • Auto-caching built in
  • Works with any LLM
  • Adjustable token budget
agent.py
import os
import httpx
import openai

query = "Where is session handling?"

# One POST packs the repo into LLM-ready Markdown
resp = httpx.post(
    "https://contextpacker.com/v1/packs",
    headers={"X-API-Key": os.environ["CP_KEY"]},
    json={
        "repo_url": "https://github.com/pallets/flask",
        "query": query,
        "max_tokens": 6000,
    },
)
context = resp.json()["markdown"]

# Feed the packed context to your LLM
answer = openai.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{
        "role": "user",
        "content": f"{context}\n\nQuestion: {query}",
    }],
)

Simple pricing

No subscriptions. Just credits.

Free: 100 credits on signup. No credit card required.
Top up: $9 for 1,000 credits. Credits never expire.

1 credit = 1 API call. Top up anytime.