Code context API for AI agents.
Embedding-quality retrieval from any repo — no vector DB, no indexing.
Just run npx contextpacker <repo> "query" or call the API.
Give an agent the right files, it answers correctly. Give it the wrong files, it hallucinates. Compare 3 ways to pick those files.
You control the access
We fetch your files over HTTPS (same as your IDE), read them, and delete them. Nothing is stored. Use a read-only PAT that expires in 1 hour — revoke it after testing.
curl https://contextpacker.com/v1/packs \
  -H "X-API-Key: $CP_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "repo_url": "https://github.com/your-org/private",
    "query": "How does auth work?",
    "vcs": {
      "provider": "github",
      "token": "'"$GITHUB_PAT"'"
    }
  }'
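The same private-repo call from Python, if that is more convenient. This is a sketch using httpx; the request fields mirror the curl example above, and CP_KEY / GITHUB_PAT are assumed to be environment variables.

import os

import httpx

resp = httpx.post(
    "https://contextpacker.com/v1/packs",
    headers={"X-API-Key": os.environ["CP_KEY"]},
    json={
        "repo_url": "https://github.com/your-org/private",
        "query": "How does auth work?",
        "vcs": {
            "provider": "github",
            # Short-lived read-only PAT, sent per request and never stored
            "token": os.environ["GITHUB_PAT"],
        },
    },
)
resp.raise_for_status()
print(resp.json()["markdown"])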
Benchmarked on real repos
177 questions across 14 repos, including private codebases. Every approach got the same ~6,000-token budget.
| Approach | Hit@10 | NDCG | Answer Quality | Setup |
|---|---|---|---|---|
| No context | — | — | 4.9/10 | None |
| Embeddings RAG | 98% | 0.79 | 8.6/10 | Vector DB |
| ContextPacker | 98% | 0.92 | 8.5/10 | None |
Tested on private repos the model had never seen in training (zero LLM prior). Same answer quality as embeddings, +13 points of NDCG on file ranking (0.92 vs 0.79), and zero infrastructure.
Why not build your own RAG?
You could. Here's what that takes:
No pre-indexing
Embeddings need you to index first, then wait. We work on the first call: just pass the repo URL.
No vector DB
Pinecone, Weaviate, Chroma — pick one, pay for it, keep it in sync. Or just call our API.
Structural awareness
We read the file tree. We know src/ from tests/, and entry points from utils.
We're infrastructure, not a product. No dashboards, no onboarding. Just an API.
Under the hood
Clone + index on-the-fly
We shallow-clone your repo, build a file tree with symbol extraction, and delete the clone afterwards. Nothing is stored.
LLM picks the files
A fast model (Gemini Flash) reads the tree + your query and selects ~8 files. Smarter than cosine similarity.
Pack to your token budget
We fit files into your limit. Large files get truncated intelligently, not dropped entirely.
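If you want a mental model for those three steps, here is a rough, self-contained sketch. It is illustrative only, not the actual implementation: a naive keyword filter stands in for the LLM file selection, and character counts stand in for tokens.

import os
import shutil
import subprocess
import tempfile


def pack_repo(repo_url: str, query: str, max_file_chars: int = 4000) -> str:
    """Simplified sketch of the clone -> select -> pack flow; not the real service."""
    workdir = tempfile.mkdtemp()
    try:
        # 1. Clone + index on the fly: shallow clone, then walk the tree.
        subprocess.run(
            ["git", "clone", "--depth", "1", repo_url, workdir], check=True
        )
        files = [
            os.path.join(root, name)
            for root, _, names in os.walk(workdir)
            if ".git" not in root
            for name in names
        ]

        # 2. File selection: the real service has a fast LLM read the tree +
        #    query and pick ~8 files; a naive keyword filter stands in here.
        terms = query.lower().split()
        selected = [f for f in files if any(t in f.lower() for t in terms)][:8]

        # 3. Pack to a budget: concatenate the picks, truncating large files
        #    instead of dropping them (characters stand in for tokens).
        parts = []
        for path in selected:
            with open(path, errors="ignore") as fh:
                rel = os.path.relpath(path, workdir)
                parts.append(f"## {rel}\n{fh.read()[:max_file_chars]}")
        return "\n\n".join(parts)
    finally:
        # Nothing stored: the clone is deleted as soon as the pack is built.
        shutil.rmtree(workdir, ignore_errors=True)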
Zero indexing. One HTTP call.
Just call the API with a repo URL and a question. No onboarding, no sync, no CLI required.
- ✓ Returns standard Markdown
- ✓ Auto-caching built in
- ✓ Works with any LLM
- ✓ Adjustable token budget
import os

import httpx
import openai  # openai>=1.0, reads OPENAI_API_KEY from the environment

KEY = os.environ["CP_KEY"]
query = "Where is session handling?"

resp = httpx.post(
    "https://contextpacker.com/v1/packs",
    headers={"X-API-Key": KEY},
    json={
        "repo_url": "https://github.com/pallets/flask",
        "query": query,
        "max_tokens": 6000,
    },
)
context = resp.json()["markdown"]

# Feed to your LLM
answer = openai.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{
        "role": "user",
        "content": f"{context}\n\nQuestion: {query}",
    }],
)
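For agent integrations, the two calls above collapse naturally into one helper. This sketch reuses only the endpoint and response fields shown above; the helper name and the CP_KEY / OPENAI_API_KEY environment variables are assumptions.

import os

import httpx
import openai


def ask_repo(repo_url: str, question: str, budget: int = 6000) -> str:
    # Fetch a context pack for the question, then answer with it.
    pack = httpx.post(
        "https://contextpacker.com/v1/packs",
        headers={"X-API-Key": os.environ["CP_KEY"]},
        json={"repo_url": repo_url, "query": question, "max_tokens": budget},
    ).json()["markdown"]
    reply = openai.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": f"{pack}\n\nQuestion: {question}"}],
    )
    return reply.choices[0].message.content


print(ask_repo("https://github.com/pallets/flask", "Where is session handling?"))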
Simple pricing
No subscriptions. Just credits.
1 credit = 1 API call. Top up anytime.