Connect any GitHub repository. Ask questions in plain English. Get answers with exact file and line references.
Live Demo • Get Started • Pricing
Developers spend most of their time, by some estimates around 80%, reading and understanding code rather than writing it.
Onboarding to new codebases takes weeks. Finding where specific logic lives means grep-ing through thousands of files. Documentation is always outdated.
RepoDoc indexes your entire codebase into a vector database, then lets you query it conversationally with AI.
- Ask "How does authentication work?" → Get the answer with links to src/lib/auth.ts:45-89
- Ask "Where are API rate limits configured?" → Instantly see the relevant files
- Generate production-ready READMEs and technical docs in one click
No more digging through files. No more outdated wikis. Just ask.
RepoDoc is engineering infrastructure for understanding code, not just an AI chatbot. A few principles shape how it works:
- Retrieval before generation: Relevant code is retrieved first; the LLM answers from that context to reduce hallucination.
- Structured semantic memory: Repo memory stores durable knowledge (concepts, decisions, relationships) with embeddings, instead of relying only on raw chat history.
- Operational observability: Every AI request is recorded (route, model, tokens, retrieval/memory counts, latency, cost, success/failure) so the system is auditable and debuggable.
- Explicit cost awareness: Token usage and estimated cost are tracked per request; optional per-project budget limits and threshold alerts keep cost predictable.
- Deterministic model fallback: A clear strategy for which model is used (e.g. primary vs fallback) so behavior is predictable under rate limits or outages.
- Layered separation: Indexing (ingestion, summarization, embedding), retrieval (vector search, memory search), and reasoning (LLM) are separate; each layer can be understood and evolved independently.
┌─────────────────────────────────────────────────────────────────────────┐
│ │
│ 1. CONNECT 2. INDEX 3. QUERY │
│ ─────────── ───────── ───────── │
│ Paste your Every file gets Ask anything. │
│ GitHub URL summarized, embedded RAG retrieves relevant │
│ and stored in code + LLM generates │
│ PostgreSQL/pgvector answers with citations │
│ │
└─────────────────────────────────────────────────────────────────────────┘
Under the hood:
- Ingestion → LangChain's GithubRepoLoader pulls all files from your repo
- Summarization → Each file is summarized by Gemini to capture its purpose
- Embedding → Summaries are converted to 768-dim vectors using text-embedding-004
- Storage → Vectors stored in PostgreSQL with the pgvector extension for similarity search
- Retrieval → When you ask a question, we embed your query and find the top 5 most similar code chunks
- Generation → Retrieved context + your question → Gemini 2.5 Flash generates a detailed answer
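The similarity score used in retrieval comes from pgvector's cosine distance operator (`<=>`); similarity is 1 minus that distance. A minimal TypeScript sketch of the underlying math and the top-5 ranking step (the in-memory `topK` helper is illustrative; in production the ranking happens inside PostgreSQL):

```typescript
// Cosine similarity between two embedding vectors, equivalent to
// pgvector's "1 - (a <=> b)" for cosine distance.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("dimension mismatch");
  let dot = 0,
    normA = 0,
    normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Rank code chunks against a query embedding and keep the top k,
// mirroring the ORDER BY ... LIMIT 5 in the SQL query.
function topK(
  query: number[],
  chunks: { fileName: string; embedding: number[] }[],
  k = 5,
): { fileName: string; similarity: number }[] {
  return chunks
    .map((c) => ({
      fileName: c.fileName,
      similarity: cosineSimilarity(query, c.embedding),
    }))
    .sort((x, y) => y.similarity - x.similarity)
    .slice(0, k);
}
```

Because cosine similarity ignores vector magnitude, two summaries about the same concept rank close together even when their embeddings differ in scale.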
💬 Conversational Code Search: Chat with your codebase like you'd chat with a senior engineer who knows every line. Ask follow-up questions. Get code snippets with syntax highlighting. See exactly which files informed each answer.
📄 One-Click Documentation: Generate comprehensive technical documentation from your codebase automatically. The AI analyzes your code structure, patterns, and architecture to produce docs that actually reflect your implementation.
📝 README Generation: Get professional README files generated from your code. Includes proper sections for installation, usage, API references, and more, all inferred from your actual implementation.
📊 Repository Analytics: Visualize your codebase at a glance:
- Language distribution with percentages
- File counts and project metrics
- Stars, forks, and activity from GitHub
- Dependency insights
🔗 Shareable Documentation: Generate public links to share your documentation with teammates, contributors, or the world. Each link is tokenized and can be revoked anytime.
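One way such revocable tokenized links can be implemented (the exact scheme is internal to RepoDoc; the type and function names below are a hypothetical sketch):

```typescript
import { randomBytes } from "crypto";

// Hypothetical in-memory shape; RepoDoc persists these as
// DocsShare / ReadmeShare rows.
type ShareToken = { token: string; revoked: boolean };

function createShareToken(): ShareToken {
  // 32 random bytes → 64-char hex token: unguessable public link
  return { token: randomBytes(32).toString("hex"), revoked: false };
}

function isShareValid(t: ShareToken): boolean {
  return !t.revoked;
}

function revoke(t: ShareToken): ShareToken {
  return { ...t, revoked: true };
}
```

Revocation just flips a flag on the stored row, so an already-shared URL stops resolving without the document itself changing.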
🔄 Iterative Refinement: Don't like something in the generated docs? Ask the AI to modify it. Say "Add a troubleshooting section" or "Update the API examples", and the docs evolve through conversation.
🏗️ Architecture View: Explore your codebase as a high-level architecture map. The AI analyzes your repo structure and surfaces modules, dependencies, and entry points so you can understand how the system is organized at a glance.
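Static import extraction is one building block of such an architecture map. A best-effort sketch (regex-based; RepoDoc's actual analysis may differ, and dynamic or runtime imports are invisible to this approach):

```typescript
// Extract module specifiers from ES import and CommonJS require
// statements. Best-effort: dynamically constructed paths are missed.
function extractImports(source: string): string[] {
  const results: string[] = [];
  const patterns = [
    /import\s+(?:[\w*{}\s,]+\s+from\s+)?["']([^"']+)["']/g, // ES imports
    /require\(\s*["']([^"']+)["']\s*\)/g, // CommonJS requires
  ];
  for (const re of patterns) {
    let m: RegExpExecArray | null;
    while ((m = re.exec(source)) !== null) results.push(m[1]);
  }
  return results;
}
```

Running this over every file yields a dependency edge list, which is enough to surface modules and entry points (files nothing else imports).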
📋 Diff Analysis: Paste or upload a diff and get AI-powered analysis: what changed, impact, and suggestions. Supports query, diff, and architecture route types with token and cost tracking.
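Before any AI reasoning, a diff route typically computes basic structure from the unified diff itself. A minimal sketch of that pre-processing (hypothetical helper, not RepoDoc's exact parser):

```typescript
// Summarize a unified diff: files touched, lines added and removed.
function summarizeDiff(diff: string): {
  files: string[];
  added: number;
  removed: number;
} {
  const files: string[] = [];
  let added = 0,
    removed = 0;
  for (const line of diff.split("\n")) {
    if (line.startsWith("+++ b/")) files.push(line.slice(6)); // new-file header
    else if (line.startsWith("+") && !line.startsWith("+++")) added++;
    else if (line.startsWith("-") && !line.startsWith("---")) removed++;
  }
  return { files, added, removed };
}
```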
🧠 Repo Memory: RAG can use stored repo memories (semantic chunks with embeddings) for better context. Memory hit counts and retrieval counts are tracked for observability.
📊 Query Metrics (Observability): AI query observability is stored per request: route type (query / diff / architecture), model used, token counts, retrieval and memory hits, latency, estimated cost, and success/error. Indexed by project and time for analytics.
Cost tracking: Token usage and estimated cost per request; 7-day cost breakdown by route type and 30-day rolling view.
Budget guardrails: Optional per-project budget limits and threshold alerts (warning / limit exceeded).
Health status: Per-project status (healthy / warning / critical) for monitoring.
Cold start detection: First query or long idle gap is flagged so latency spikes are explainable.
Cache metrics: Cache-hit detection when a cached answer is served; visibility into cache effectiveness.
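The budget guardrail reduces to a simple threshold check per project. A sketch of the warning/limit logic (the 80% warning ratio is an illustrative assumption, not the product's actual setting):

```typescript
type BudgetStatus = "ok" | "warning" | "limit_exceeded";

// Compare rolling spend against the project's budget limit.
// warnRatio is the fraction of the limit at which a warning fires.
function budgetStatus(
  spentUsd: number,
  limitUsd: number,
  warnRatio = 0.8,
): BudgetStatus {
  if (spentUsd >= limitUsd) return "limit_exceeded";
  if (spentUsd >= limitUsd * warnRatio) return "warning";
  return "ok";
}
```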
⚙️ Background Indexing: Indexing runs as a serverless job queue (Vercel cron + Postgres leasing). No blocking on project create or regenerate: jobs are queued, processed by a worker, and report progress. Retry and cancel from the UI.
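The leasing pattern above guarantees that only one worker processes a job even when several poll at once. An in-memory sketch of the claim logic (in production this is a single atomic Postgres UPDATE on the IndexingJob row; names here are illustrative):

```typescript
type Job = {
  id: string;
  status: "queued" | "running" | "done";
  leasedUntil: number; // epoch ms; 0 = never leased
};

// Claim the next available job: either queued, or running with an
// expired lease (its worker presumably died). Extends the lease so
// other pollers skip it.
function claimJob(jobs: Job[], now: number, leaseMs = 60_000): Job | undefined {
  const job = jobs.find(
    (j) => j.status === "queued" || (j.status === "running" && j.leasedUntil < now),
  );
  if (!job) return undefined;
  job.status = "running";
  job.leasedUntil = now + leaseMs;
  return job;
}
```

The lease expiry is what makes retry safe: a crashed worker's job becomes claimable again after the lease window instead of being stuck forever.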
Model fallback: Primary model (e.g. Gemini) with deterministic fallback (e.g. OpenRouter) under rate limits or outages so behavior is predictable.
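The fallback rule can be expressed as a small wrapper: try the primary model, and switch to the fallback only for retryable failures such as rate limits, never for ordinary errors. A hedged sketch (the predicate and function names are illustrative, not RepoDoc's exact API):

```typescript
// Deterministic two-tier fallback: the fallback fires only when
// isRetryable classifies the error as transient (429, outage, etc.).
async function withFallback<T>(
  primary: () => Promise<T>,
  fallback: () => Promise<T>,
  isRetryable: (e: unknown) => boolean,
): Promise<T> {
  try {
    return await primary();
  } catch (e) {
    if (isRetryable(e)) return fallback();
    throw e; // non-retryable errors surface unchanged
  }
}
```

Keeping the classification explicit is what makes behavior predictable: a bad prompt fails loudly instead of silently producing a different model's answer.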
| Layer | Technology |
|---|---|
| Framework | Next.js 16 (App Router, React 18) |
| Language | TypeScript 5 |
| Styling | Tailwind CSS 4.1, Radix UI |
| State | Redux Toolkit |
| Database | PostgreSQL + pgvector |
| ORM | Prisma 6 |
| AI/LLM | Google Gemini 2.5 Flash, OpenRouter |
| Embeddings | text-embedding-004 (768 dimensions) |
| Auth | Clerk |
| Payments | Stripe |
| Forms | React Hook Form + Zod |
| Animation | Motion (Framer Motion) |
| Testing | Jest, React Testing Library |
| Deployment | Vercel |
┌──────────────────────────────────────────────────────────────────────────────┐
│ NEXT.JS APP ROUTER │
│ │
│ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ ┌─────────────────────┐ │
│ │ Chat │ │ Dashboard │ │ Docs │ │ README │ │
│ │ Page │ │ Page │ │ Page │ │ Generation │ │
│ └──────┬──────┘ └──────┬──────┘ └──────┬──────┘ └──────────┬──────────┘ │
│ ┌──────┴──────┐ ┌──────┴──────┐ │
│ │ Architecture│ │ Diff │ ← Architecture view, diff analysis │
│ │ Page │ │ Page │ │
│ └──────┬──────┘ └──────┬──────┘ │
│ └────────────────┴───────────────────────────────────────────────────┘
│ │ │
│ ┌────────▼──────────────┐ │
│ │ API Routes │ │
│ │ │ │
│ │ /api/query │ ← RAG Pipeline │
│ │ /api/search │ ← Vector Search │
│ │ /api/analytics │ ← Metrics Aggregation │
│ │ /api/architecture │ ← Architecture extraction│
│ │ /api/analyze-diff │ ← Diff analysis │
│ │ /api/indexing-worker │ ← Cron job (indexing) │
│ └────────┬──────────────┘ │
└────────────────────────────────────┼─────────────────────────────────────────┘
│
┌──────────────────────────┼──────────────────────────┐
│ │ │
▼ ▼ ▼
┌─────────────────────┐ ┌─────────────────────┐ ┌─────────────────────────┐
│ PostgreSQL │ │ GitHub API │ │ AI Services │
│ + pgvector │ │ (Octokit) │ │ │
│ │ │ │ │ • Gemini 2.5 Flash │
│ • Users │ │ • Repo metadata │ │ • text-embedding-004 │
│ • Projects │ │ • File contents │ │ • OpenRouter fallback │
│ • Embeddings │ │ • Languages │ │ │
│ • Docs/READMEs │ │ • Stats │ │ │
│ • Share tokens │ │ │ │ │
│ • RepoMemory │ │ │ │ │
│ • IndexingJob │ │ │ │ │
│ • QueryMetrics │ │ │ │ │
└─────────────────────┘ └─────────────────────┘ └─────────────────────────┘
AI is treated as a production system. Each AI request is recorded with: route type (query / diff / architecture), model used, prompt and completion tokens, retrieval count and memory hit count, latency, estimated cost, success or failure, cold-start detection (first query or long idle gap), and cache-hit detection when a cached answer is served.
Per-project observability (via the observability API and UI) includes: 7-day cost breakdown by route type, 30-day rolling cost and budget tracking, budget threshold alerts (warning / limit exceeded), error-rate monitoring, health status (healthy / warning / critical), and memory quality metrics (e.g. hit rate, average similarity). This is how the system is run and debugged, not a sales pitch.
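The cost estimate attached to each QueryMetrics record is derived from token counts and per-model rates. A sketch of that arithmetic (the prices below are illustrative placeholders in USD per million tokens, not Gemini's actual rates):

```typescript
type Usage = { promptTokens: number; completionTokens: number };
type Pricing = { inputPerM: number; outputPerM: number }; // USD per 1M tokens

// Estimated request cost = prompt tokens at the input rate plus
// completion tokens at the output rate.
function estimateCostUsd(u: Usage, p: Pricing): number {
  return (
    (u.promptTokens / 1e6) * p.inputPerM +
    (u.completionTokens / 1e6) * p.outputPerM
  );
}
```

Summing these per-request estimates by route type over a time window is all the 7-day and 30-day cost views need.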
model User {
id String @id @default(uuid())
emailAddress String @unique
credits Int @default(150)
plan String @default("starter")
projects Project[]
}
model Project {
id String @id @default(uuid())
name String
repoUrl String
userId String
sourceCodeEmbeddings SourceCodeEmbeddings[]
docs Docs?
readme Readme?
}
model SourceCodeEmbeddings {
id String @id @default(uuid())
fileName String
sourceCode String
Summary String
summaryEmbedding Unsupported("vector")?
projectId String
}
model Docs {
id String @id @default(uuid())
content String
projectId String @unique
qnaHistory DocsQna[]
publicShare DocsShare?
}
model Readme {
id String @id @default(uuid())
content String
projectId String @unique
qnaHistory ReadmeQna[]
publicShare ReadmeShare?
}
// Additional models: RepoMemory (RAG memory + embeddings), IndexingJob (queue + status),
// QueryMetrics (per-request observability: routeType, tokens, latency, cost, success).

Embedding similarity has limitations: semantic match is not perfect, and very similar phrasing can rank higher than conceptually relevant but differently worded code. Context window and retrieval depth are limited: we send a bounded number of chunks. There is a latency vs retrieval-depth tradeoff: more chunks improve coverage but increase latency and cost. Repo memory can drift after major refactors (stale facts until re-indexing or new Q&A). Cost vs model quality is a tradeoff (e.g. cheaper vs more capable models). Architecture inference is best-effort (e.g. static import analysis; dynamic or runtime behavior may be missed).
Mitigations in place: top-k retrieval caps, similarity-based ranking, explicit labeling of memory vs code context in prompts, in-memory query cache for repeated questions, and budget guardrails.
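The in-memory query cache in that list can be as small as a TTL map keyed by project plus normalized question. A minimal sketch (class and field names are illustrative):

```typescript
// TTL cache for repeated questions. Entries expire after ttlMs;
// a hit means the LLM call is skipped entirely.
class QueryCache {
  private store = new Map<string, { value: string; expiresAt: number }>();

  constructor(private ttlMs: number) {}

  get(key: string, now = Date.now()): string | undefined {
    const hit = this.store.get(key);
    if (!hit || hit.expiresAt < now) {
      this.store.delete(key); // lazily evict expired entries
      return undefined;
    }
    return hit.value; // cache hit → recorded in query metrics
  }

  set(key: string, value: string, now = Date.now()): void {
    this.store.set(key, { value, expiresAt: now + this.ttlMs });
  }
}
```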
Stale or misleading memory after large refactors. Architecture view missing dynamically loaded or generated imports. Diff analysis is advisory: impact and risk are suggestions, not authoritative. Cold starts after indexing or long idle periods cause higher latency. Model rate limiting can lead to fallback or errors. Heavy indexing of large repos can increase latency or load. Observability (query metrics, error rate, health status, cold-start and cache metrics) is in place to surface these conditions.
Evolution path at higher scale: async embedding pipelines so indexing does not block requests; batched embedding jobs for efficiency; sharded or dedicated vector storage if one database becomes a bottleneck; horizontal scaling of indexing workers; background memory compaction or pruning; more aggressive or distributed caching; model tiering (e.g. cheaper models for simple queries, premium for complex ones). This is an evolution path; not all of it is implemented today.
The core intelligence lives in src/lib/rag.ts:
// 1. Embed the user's question
const queryEmbedding = await getGenerateEmbeddings(query);
// 2. Vector similarity search with pgvector
const results = await prisma.$queryRaw`
SELECT
"fileName",
"sourceCode",
"Summary" AS summary,
1 - ("summaryEmbedding" <=> ${queryEmbedding}::vector) as similarity
FROM "SourceCodeEmbeddings"
WHERE "projectId" = ${projectId}
ORDER BY "summaryEmbedding" <=> ${queryEmbedding}::vector
LIMIT 5
`;
// 3. Build context from retrieved chunks
const codeContext = results
.map(
(code, idx) => `
[Source ${idx + 1}: ${code.fileName}] (Relevance: ${(code.similarity * 100).toFixed(1)}%)
Summary: ${code.summary}
Code: ${code.sourceCode.slice(0, 1000)}
`,
)
.join("\n\n");
// 4. Generate answer with Gemini
const answer = await openrouterChatCompletion({
model: "google/gemini-2.5-flash",
messages: [
{ role: "system", content: systemPrompt + codeContext },
...conversationHistory,
{ role: "user", content: question },
],
temperature: 0.3,
});

- Node.js 20+
- PostgreSQL with pgvector extension
- GitHub account
- Clerk account
- Google AI API key (Gemini)
# Clone the repository
git clone https://github.com/parbhatkapila4/repodoc.git
cd repodoc
# Install dependencies
npm install
# Set up environment variables
cp .env.example .env

# Database
DATABASE_URL="postgresql://user:password@localhost:5432/repodoc"
# Clerk Authentication
NEXT_PUBLIC_CLERK_PUBLISHABLE_KEY=pk_...
CLERK_SECRET_KEY=sk_...
CLERK_WEBHOOK_SECRET=whsec_...
# AI Services
GOOGLE_API_KEY=your_gemini_api_key
OPENROUTER_API_KEY=your_openrouter_key
# GitHub
GITHUB_TOKEN=ghp_...
# Stripe (Optional)
STRIPE_SECRET_KEY=sk_...
STRIPE_WEBHOOK_SECRET=whsec_...
# App
NEXT_PUBLIC_APP_URL=http://localhost:3000

# Generate Prisma client
npm run db:generate
# Run migrations
npm run db:migrate
# (Optional) Open Prisma Studio
npm run db:studio

# Start development server
npm run dev
# Run tests
npm test
# Type check
npm run type-check
# Lint
npm run lint

repodoc/
├── src/
│ ├── app/
│ │ ├── (app)/ # Landing page
│ │ ├── (auth)/ # Sign in/up, user sync
│ │ ├── (protected)/ # Authenticated routes
│ │ │ ├── chat/ # AI chat interface
│ │ │ ├── dashboard/ # Project management
│ │ │ ├── docs/ # Documentation viewer
│ │ │ ├── readme/ # README editor
│ │ │ ├── analytics/ # Platform analytics
│ │ │ ├── search/ # Semantic search
│ │ │ ├── architecture/ # Architecture view
│ │ │ └── diff/ # Diff analysis
│ │ └── api/
│ │ ├── query/ # RAG endpoint
│ │ ├── search/ # Vector search
│ │ ├── analytics/ # Metrics API
│ │ ├── architecture/ # Architecture extraction
│ │ ├── analyze-diff/ # Diff analysis
│ │ ├── indexing-worker/# Cron-triggered indexing job
│ │ ├── create-checkout/# Stripe checkout
│ │ └── webhooks/ # Clerk & Stripe webhooks
│ ├── components/
│ │ ├── ui/ # Radix-based primitives
│ │ └── landing/ # Marketing components
│ ├── lib/
│ │ ├── rag.ts # RAG implementation
│ │ ├── github.ts # GitHub integration
│ │ ├── gemini.ts # AI embeddings & generation
│ │ ├── openrouter.ts # LLM fallback
│ │ ├── prisma.ts # Database client
│ │ ├── actions.ts # Server actions
│ │ ├── actions-indexing.ts # Indexing job actions (status, retry, cancel)
│ │ ├── rate-limiter.ts # API protection
│ │ ├── memory.ts # Repo memory (RAG)
│ │ ├── architecture.ts # Architecture extraction
│ │ ├── diff.ts # Diff analysis
│ │ └── redis.ts # Upstash Redis (locking utilities)
│ └── hooks/ # Custom React hooks
├── prisma/
│ ├── schema.prisma # Database schema
│ └── migrations/ # Migration history
└── __tests__/ # Jest test suites
| Plan | Price | Projects | Features |
|---|---|---|---|
| Starter | $10/mo | 3 | AI chat, README generation, docs generation, basic analytics |
| Professional | $20/mo | 10 | Everything in Starter + public sharing, priority processing, email support |
| Enterprise | $49/mo | Unlimited | Everything in Professional + team features, SLA, custom integrations |
Query your codebase with natural language.
// Request
{
"projectId": "uuid",
"question": "How does authentication work?",
"conversationHistory": [
{ "role": "user", "content": "previous question" },
{ "role": "assistant", "content": "previous answer" }
]
}
// Response
{
"answer": "Authentication in this codebase is handled by...",
"sources": [
{
"fileName": "src/lib/auth.ts",
"similarity": 0.89,
"summary": "Handles user authentication..."
}
]
}

Semantic search across your codebase.
// Request
{
"projectId": "uuid",
"query": "rate limiting",
"limit": 10
}
// Response
{
"results": [
{
"fileName": "src/lib/rate-limiter.ts",
"sourceCode": "...",
"summary": "...",
"similarity": 0.92
}
]
}

- Authentication: All routes protected by Clerk middleware
- Input validation: Zod schemas on all API inputs
- SQL injection: Prevented by Prisma parameterized queries
- Rate limiting: Token bucket algorithm on API endpoints
- XSS protection: Next.js built-in escaping
- CSRF protection: Same-origin verification
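The token bucket named above admits a request only when a token is available, refilling at a fixed rate up to a capacity. A self-contained sketch (parameters are illustrative, not the deployed limits):

```typescript
// Token bucket rate limiter: up to `capacity` burst requests,
// sustained rate of `refillPerSec` requests per second.
class TokenBucket {
  private tokens: number;
  private last: number;

  constructor(
    private capacity: number,
    private refillPerSec: number,
    now = 0,
  ) {
    this.tokens = capacity; // start full
    this.last = now;
  }

  // now is in seconds; returns true if the request is admitted.
  allow(now: number): boolean {
    const elapsed = now - this.last;
    this.tokens = Math.min(this.capacity, this.tokens + elapsed * this.refillPerSec);
    this.last = now;
    if (this.tokens >= 1) {
      this.tokens -= 1;
      return true;
    }
    return false;
  }
}
```

Buckets are typically kept per user or per API key, so one noisy client exhausts only its own budget.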
Contributions are welcome. Please open an issue first to discuss what you'd like to change.
# Fork the repo
git checkout -b feature/your-feature
git commit -m "Add your feature"
git push origin feature/your-feature
# Open a PR

Built by Parbhat Kapila