Code context API for AI agents.
Embedding-quality retrieval from any repo — no vector DB, no indexing. Just call the API.
Built for AI coding agents · IDE extensions · PR review bots · documentation generators
Give an agent the right files and it answers correctly. Give it the wrong files and it hallucinates. Compare three ways to pick those files.
You control access
We fetch your files over HTTPS (same as your IDE), read them, and delete them. Nothing is stored. Use a read-only PAT that expires in 1 hour — revoke it after testing.
```bash
curl https://contextpacker.com/v1/packs \
  -H "X-API-Key: $CP_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "repo_url": "https://github.com/your-org/private",
    "query": "How does auth work?",
    "vcs": {
      "provider": "github",
      "token": "'"$GITHUB_PAT"'"
    }
  }'
```
Benchmarked on real repos
177 questions across 14 repos, including private codebases, all under the same ~6,000-token budget.
| Approach | Hit@10 | NDCG | Answer Quality | Setup |
|---|---|---|---|---|
| No context | — | — | 4.9/10 | None |
| Embeddings RAG | 98% | 0.79 | 8.6/10 | Vector DB |
| ContextPacker | 98% | 0.92 | 8.5/10 | None |
Tested on private repos (zero LLM prior). Same answer quality as embeddings, 13 points higher NDCG (0.92 vs. 0.79), and zero infrastructure to run.
Why not just use embeddings?
You could. But here's what you'd need to build and maintain (sketched below):
No pre-indexing
Embeddings need you to index first, then wait. We work on first call — just pass the repo URL.
No vector DB
Pinecone, Weaviate, Chroma — pick one, pay for it, keep it in sync. Or just call our API.
Structural awareness
We read the file tree. We know src/ from tests/, entrypoints from utils.
We're a building block, not a platform. Skip the RAG pipeline — just call the API.
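For a sense of what that means in practice, here's a minimal sketch of the indexing pass a typical embeddings pipeline needs before it can answer its first question. The libraries and names are stand-ins (chromadb as the vector store, OpenAI for embeddings, a local checkout at `./repo`), not a prescription:

```python
# A typical embeddings-RAG indexing pass: chunk, embed, upsert.
# (Illustrative only; chromadb and OpenAI embeddings are stand-ins.)
import pathlib

import chromadb
from openai import OpenAI

client = OpenAI()
collection = chromadb.Client().create_collection("repo")

for i, path in enumerate(pathlib.Path("repo").rglob("*.py")):
    text = path.read_text(errors="ignore")[:8000]  # naive fixed-size chunking
    emb = client.embeddings.create(
        model="text-embedding-3-small", input=text
    ).data[0].embedding
    collection.add(
        ids=[str(i)],
        embeddings=[emb],
        documents=[text],
        metadatas=[{"path": str(path)}],
    )
# ...and this index has to be rebuilt or kept in sync on every commit.
```

None of that exists with ContextPacker: there is no index to build or keep fresh, and the single HTTPS call shown above is the whole integration.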
Under the hood
Clone + index on-the-fly
We shallow-clone your repo, build a file tree with symbol extraction, delete it after. No persistence.
LLM picks the files
A fast model (Gemini Flash) reads the tree + your query and selects ~8 files. Smarter than cosine similarity.
Pack to your token budget
We fit files into your limit. Large files get truncated intelligently, not dropped entirely.
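Conceptually, the flow looks like the sketch below. This is illustrative only, not ContextPacker's actual code: `file_picker` is a stand-in for the LLM selection step, and the 4-characters-per-token packing is a rough approximation.

```python
# Illustrative sketch of the clone → select → pack flow (not ContextPacker's code).
import pathlib
import subprocess
import tempfile

def build_context(repo_url: str, query: str, file_picker, max_tokens: int = 6000) -> str:
    with tempfile.TemporaryDirectory() as tmp:
        # 1. Shallow clone; the checkout vanishes when the `with` block exits.
        subprocess.run(["git", "clone", "--depth", "1", repo_url, tmp], check=True)
        root = pathlib.Path(tmp)
        tree = [str(p.relative_to(root)) for p in root.rglob("*")
                if p.is_file() and ".git" not in p.parts]

        # 2. A fast LLM (stubbed here as `file_picker`) reads the tree + query
        #    and returns the handful of paths most likely to answer it.
        selected = file_picker(tree, query)

        # 3. Pack the picked files into the budget (~4 chars per token),
        #    truncating large files instead of dropping them.
        budget, parts = max_tokens * 4, []
        for path in selected:
            text = (root / path).read_text(errors="ignore")[:budget]
            budget -= len(text)
            parts.append(f"## {path}\n\n{text}")
            if budget <= 0:
                break
        return "\n\n".join(parts)
```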
Zero indexing. One HTTP call.
Just call the API with a repo URL and a question. No onboarding, no sync, no CLI.
- ✓ Returns standard Markdown
- ✓ Auto-caching built in
- ✓ Works with any LLM
- ✓ Adjustable token budget
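For example, in Python (assuming the `httpx` and `openai` packages are installed, and `CP_KEY` and `OPENAI_API_KEY` are set in your environment):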
```python
import os

import httpx
import openai

KEY = os.environ["CP_KEY"]
query = "Where is session handling?"

resp = httpx.post(
    "https://contextpacker.com/v1/packs",
    headers={"X-API-Key": KEY},
    json={
        "repo_url": "https://github.com/pallets/flask",
        "query": query,
        "max_tokens": 6000,
    },
)
context = resp.json()["markdown"]

# Feed to your LLM
answer = openai.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{
        "role": "user",
        "content": f"{context}\n\nQuestion: {query}",
    }],
)
```
Simple pricing
No subscriptions. Just credits.
1 credit = 1 API call. Top up anytime.