This guide shows you how to customize Embabel Guide to work with your codebase instead of documentation URLs.
Guide already has everything you need:
- ✅ Full RAG infrastructure with an ingestDirectory(String dir) method
- ✅ Chat functionality with ChatActions
- ✅ Frontend integration (Embabel Hub)
- ✅ Agent framework and LLM integration
You just need to configure it to use your codebase instead of URLs.
git clone https://github.com/embabel/guide.git
cd guide
Modify src/main/java/com/embabel/guide/GuideProperties.java:
Before:
public record GuideProperties(
boolean reloadContentOnStartup,
String defaultPersona,
LlmOptions chatLlm,
String projectsPath,
ContentChunker.Config chunkerConfig,
String referencesFile,
List<String> urls, // Only URLs
String toolPrefix,
Set<String> toolGroups
) {}
After:
public record GuideProperties(
boolean reloadContentOnStartup,
String defaultPersona,
LlmOptions chatLlm,
String projectsPath,
ContentChunker.Config chunkerConfig,
String referencesFile,
List<String> urls, // Keep for docs
String codebasePath, // ADD: Your codebase path
String toolPrefix,
Set<String> toolGroups
) {}
Modify src/main/java/com/embabel/guide/rag/DataManager.java:
In the loadReferences() method, change:
Before:
public void loadReferences() {
int successCount = 0;
int failureCount = 0;
for (var url : guideProperties.urls()) {
try {
logger.info("⏳Loading URL: {}...", url);
ingestPage(url);
logger.info("✅ Loaded URL: {}", url);
successCount++;
} catch (Throwable t) {
logger.error("❌ Failure loading URL {}: {}", url, t.getMessage(), t);
failureCount++;
}
}
logger.info("Loaded {}/{} URLs successfully ({} failed)",
successCount, guideProperties.urls().size(), failureCount);
}
After:
public void loadReferences() {
// If codebase path is configured, use directory ingestion
if (guideProperties.codebasePath() != null && !guideProperties.codebasePath().isEmpty()) {
try {
logger.info("⏳Loading codebase from: {}...", guideProperties.codebasePath());
ingestDirectory(guideProperties.codebasePath());
logger.info("✅ Loaded codebase from: {}", guideProperties.codebasePath());
} catch (Throwable t) {
logger.error("❌ Failure loading codebase from {}: {}",
guideProperties.codebasePath(), t.getMessage(), t);
throw new RuntimeException("Failed to load codebase", t);
}
return; // Skip URL ingestion if using codebase
}
// Otherwise, use existing URL-based ingestion
int successCount = 0;
int failureCount = 0;
for (var url : guideProperties.urls()) {
try {
logger.info("⏳Loading URL: {}...", url);
ingestPage(url);
logger.info("✅ Loaded URL: {}", url);
successCount++;
} catch (Throwable t) {
logger.error("❌ Failure loading URL {}: {}", url, t.getMessage(), t);
failureCount++;
}
}
logger.info("Loaded {}/{} URLs successfully ({} failed)",
successCount, guideProperties.urls().size(), failureCount);
}
Modify src/main/resources/application.yml:
guide:
  reload-content-on-startup: true
  # Your codebase path (absolute or relative)
  codebase-path: /path/to/your/codebase
  # URLs can be empty if using codebase only, or keep for docs
  urls: []
  # Or use both codebase and docs:
  # urls:
  #   - https://docs.example.com
  default-persona: adaptive
  chat-llm:
    model: gpt-4.1-mini
  # ... rest of configuration
# Build Guide
mvn clean install
# Run Guide
mvn spring-boot:run
That's it! Guide will now:
- Index your codebase on startup (if reload-content-on-startup: true)
- Allow users to chat about your codebase
- Use RAG to find relevant code sections
- Provide references to code locations
If you want to support both codebase and URLs:
public void loadReferences() {
// Load codebase if configured
if (guideProperties.codebasePath() != null && !guideProperties.codebasePath().isEmpty()) {
try {
logger.info("⏳Loading codebase from: {}...", guideProperties.codebasePath());
ingestDirectory(guideProperties.codebasePath());
logger.info("✅ Loaded codebase from: {}", guideProperties.codebasePath());
} catch (Throwable t) {
logger.error("❌ Failure loading codebase: {}", t.getMessage(), t);
}
}
// Load URLs if configured
if (guideProperties.urls() != null && !guideProperties.urls().isEmpty()) {
int successCount = 0;
int failureCount = 0;
for (var url : guideProperties.urls()) {
try {
logger.info("⏳Loading URL: {}...", url);
ingestPage(url);
logger.info("✅ Loaded URL: {}", url);
successCount++;
} catch (Throwable t) {
logger.error("❌ Failure loading URL {}: {}", url, t.getMessage(), t);
failureCount++;
}
}
logger.info("Loaded {}/{} URLs successfully ({} failed)",
successCount, guideProperties.urls().size(), failureCount);
}
}
After this customization, you get:
- Codebase Indexing: Your codebase is indexed using Tika for parsing and DrivineStore for vector storage
- Chat Interface: Users can ask questions about your codebase
- RAG Search: Vector search finds relevant code sections
- Frontend: Embabel Hub frontend (or custom frontend) connects automatically
- Agent Framework: Full Embabel agent framework with @Action triggers
- References: Code references with file paths and locations
Once Guide is running with your codebase:
User: "How does authentication work in this codebase?"
Guide (using RAG):
- Searches indexed code for authentication-related code
- Finds relevant classes/functions
- Provides answer with references to specific files
User: "Show me the API endpoints"
Guide:
- Finds REST controllers or API definitions
- Lists endpoints with their implementations
- Provides file paths and line numbers
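The retrieval step in these examples can be pictured as a nearest-neighbour search over embedding vectors. The following is only a toy sketch of that idea, not Guide's implementation: the class ToyVectorSearch, the hand-written two-dimensional vectors, and the file names are all hypothetical, and a real setup would get its vectors from the configured embedding model and store them in the vector store.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

/** Toy illustration of vector search; hypothetical, not part of Guide. */
final class ToyVectorSearch {

    /** Cosine similarity of two equal-length vectors; returns 0 if either is all zeros. */
    static double cosine(double[] a, double[] b) {
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        if (normA == 0 || normB == 0) {
            return 0;
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    /** Returns the names of the k chunks whose vectors are most similar to the query. */
    static List<String> topK(double[] query, Map<String, double[]> chunks, int k) {
        return chunks.entrySet().stream()
                // Negate the score so that the most similar chunk sorts first.
                .sorted(Comparator.comparingDouble(
                        (Map.Entry<String, double[]> e) -> -cosine(query, e.getValue())))
                .limit(k)
                .map(Map.Entry::getKey)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        Map<String, double[]> chunks = Map.of(
                "AuthService.java", new double[] {1.0, 0.0},
                "LoginController.java", new double[] {0.9, 0.1},
                "README.md", new double[] {0.0, 1.0});
        // A query vector that points in the "authentication" direction of this toy space.
        System.out.println(topK(new double[] {1.0, 0.0}, chunks, 2));
    }
}
```

The file names returned by the search are what would end up as references in the answer.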
Local:
- Check path: Ensure codebase-path is an absolute path or a correct relative path
- Check permissions: Ensure Guide can read the directory
- Check logs: Look for parsing errors in logs
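The path and permission checks above can also be run from a small standalone program before starting Guide. This is a minimal sketch using only the JDK; CodebasePathCheck and problemWith are hypothetical names and not part of Guide.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Optional;

/** Hypothetical helper, not part of Guide: reports why a codebase path would be unusable. */
final class CodebasePathCheck {

    /** Returns a problem description, or an empty Optional when the path looks usable. */
    static Optional<String> problemWith(String codebasePath) {
        if (codebasePath == null || codebasePath.isBlank()) {
            return Optional.of("codebase-path is not set");
        }
        // Relative paths resolve against the JVM working directory, which is easy to get wrong.
        Path resolved = Path.of(codebasePath).toAbsolutePath().normalize();
        if (!Files.exists(resolved)) {
            return Optional.of("path does not exist: " + resolved);
        }
        if (!Files.isDirectory(resolved)) {
            return Optional.of("path is not a directory: " + resolved);
        }
        if (!Files.isReadable(resolved)) {
            return Optional.of("directory is not readable: " + resolved);
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        String path = args.length > 0 ? args[0] : ".";
        System.out.println(problemWith(path).orElse("OK"));
    }
}
```

Running it from the same working directory as Guide matters for relative paths, since that is what they are resolved against.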
Docker:
- Verify volume mount: Check that volume mount path matches GUIDE_CODEBASE_PATH
docker exec embabel-guide ls -la <codebase-path>
- Check environment variable: Verify GUIDE_CODEBASE_PATH is set correctly
docker exec embabel-guide env | grep GUIDE_CODEBASE_PATH
- Check container logs: Look for Java exceptions in Guide logs
docker logs embabel-guide
- Spring Boot binding: Spring Boot uses GUIDE_CODEBASE_PATH for guide.codebase-path
- Check property name: Verify @ConfigurationProperties(prefix = "guide") in GuideProperties
- YAML vs Environment: Environment variables override application.yml values
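If the environment variable seems to be ignored, it can help to write out which variable names correspond to a property. Spring Boot's documentation describes the canonical environment form as: replace dots with underscores, remove dashes, convert to uppercase. The underscore form used in this guide keeps each dash as an underscore; this sketch assumes both forms are accepted, so trying the canonical one is a cheap check. EnvVarNames is a hypothetical helper, not a Spring Boot or Guide class.

```java
import java.util.LinkedHashSet;
import java.util.Locale;
import java.util.Set;

/** Hypothetical helper: derives candidate environment variable names for a property. */
final class EnvVarNames {

    /** Documented canonical form: dots become underscores, dashes are removed, then uppercase. */
    static String canonical(String property) {
        return property.replace('.', '_').replace("-", "").toUpperCase(Locale.ROOT);
    }

    /** Form used in this guide: dots and dashes both become underscores, then uppercase. */
    static String underscored(String property) {
        return property.replace('.', '_').replace('-', '_').toUpperCase(Locale.ROOT);
    }

    /** Both candidates, canonical first, without duplicates. */
    static Set<String> candidates(String property) {
        Set<String> names = new LinkedHashSet<>();
        names.add(canonical(property));
        names.add(underscored(property));
        return names;
    }

    public static void main(String[] args) {
        // Prints [GUIDE_CODEBASEPATH, GUIDE_CODEBASE_PATH]
        System.out.println(candidates("guide.codebase-path"));
    }
}
```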
- Check Guide logs: Look for Java exceptions
# Local
tail -f logs/guide.log
# Docker
docker logs embabel-guide
- Verify Neo4j connection: Ensure Neo4j is accessible from Guide
# Docker
docker exec embabel-guide ping neo4j
- Wait for indexing: Large codebases take time to index
- Check Neo4j: Ensure Neo4j is running and Guide can connect
- Check embeddings: Verify embedding model is configured correctly
- Verify ingestion: Check logs to confirm codebase was indexed
- Check port: Default is 1337, ensure it's accessible
- Check CORS: If using external frontend, configure CORS in Guide
- Check WebSocket: Ensure WebSocket/SSE endpoints are accessible
- Verify API endpoint: Check that frontend points to correct Guide URL
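For the port check, a plain TCP connection attempt is enough to tell "nothing is listening" apart from CORS or WebSocket problems. A minimal sketch using only the JDK; PortProbe is a hypothetical helper, and the host and timeout values are assumptions to adjust for your setup.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

/** Hypothetical helper, not part of Guide: checks whether a TCP port accepts connections. */
final class PortProbe {

    /** Returns true if a TCP connection to host:port succeeds within the timeout. */
    static boolean isReachable(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Connection refused, timed out, or host not resolvable.
            return false;
        }
    }

    public static void main(String[] args) {
        String host = args.length > 0 ? args[0] : "localhost";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 1337; // default port named in this guide
        System.out.println(host + ":" + port + " reachable: " + isReachable(host, port, 1000));
    }
}
```

A successful connection only shows that something is listening on the port; CORS and WebSocket/SSE issues still need to be checked in the browser or frontend logs.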
Container can't find codebase:
- Verify volume mount syntax in docker-compose.override.yml
- Check that host path exists before mounting
- Use absolute paths for reliability
Environment variables not applied:
- Check .env file is in same directory as compose.yaml
- Or pass environment variables inline:
GUIDE_CODEBASE_PATH=/codebase docker compose up -d
Container restarts and loses data:
- Neo4j data persists in Docker volumes (default behavior)
- Codebase ingestion runs on startup if GUIDE_RELOAD_CONTENT_ON_STARTUP=true
- To skip re-ingestion, set it to false
For more detailed troubleshooting, see TESTING.md.
The customized Guide also runs in Docker. You can build a custom Docker image and use Docker Compose with volume mounts.
After making the customization changes:
cd guide
docker build -t guide-custom:latest -f Dockerfile .
- Create a docker-compose override file (copy from talk-to-your-repo/docker-compose.override.yml.example):
services:
  guide:
    image: guide-custom:latest # Or use build: context: .
    environment:
      - GUIDE_CODEBASE_PATH=/codebase # Path inside container
      - GUIDE_RELOAD_CONTENT_ON_STARTUP=true
    volumes:
      - ./test-codebase:/codebase:ro # Mount your codebase (read-only)
      - /var/run/docker.sock:/var/run/docker.sock
- Set environment variables (create .env file or pass inline):
OPENAI_API_KEY=sk-your-key-here
GUIDE_CODEBASE_PATH=/codebase
NEO4J_PASSWORD=your-password
- Start services:
cd guide
docker compose --profile java up --build -d
- Verify:
# Check container logs
docker logs embabel-guide
# Verify codebase is mounted
docker exec embabel-guide ls -la /codebase
# Check environment variable
docker exec embabel-guide env | grep GUIDE_CODEBASE_PATH
- Configuration Binding: Spring Boot automatically binds the GUIDE_CODEBASE_PATH environment variable to the guide.codebase-path property
- Volume Mounts: Docker Compose mounts your codebase directory into the container
- No Deep Changes: The customization doesn't require any Docker-specific code changes
See TESTING.md for detailed Docker testing instructions.
- Customize references: Add codebase-specific references to references.yml
- Adjust chunking: Configure content-chunker settings for code
- Add filters: Filter out unwanted files (node_modules, target/, etc.)
- Multiple codebases: Extend to support multiple codebase paths
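The filtering and multiple-codebase ideas above could start from a file walk that skips unwanted directories across several roots. This is a sketch using only the JDK: CodebaseFileFilter, collectFiles, and the default exclusion set are hypothetical, and how the resulting file list would be handed to Guide's ingestion is left open, since this guide only shows ingestDirectory(String dir).

```java
import java.io.IOException;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

/** Hypothetical helper, not part of Guide: collects files while skipping excluded directories. */
final class CodebaseFileFilter {

    /** Assumed default exclusions; adjust to your build tooling. */
    static final Set<String> EXCLUDED_DIRS = Set.of("node_modules", "target", ".git", "build");

    /** Walks every root and returns all regular files outside the excluded directories. */
    static List<Path> collectFiles(List<Path> roots, Set<String> excludedDirs) throws IOException {
        List<Path> files = new ArrayList<>();
        for (Path root : roots) {
            Files.walkFileTree(root, new SimpleFileVisitor<Path>() {
                @Override
                public FileVisitResult preVisitDirectory(Path dir, BasicFileAttributes attrs) {
                    Path name = dir.getFileName();
                    // Never skip the root itself, even if it happens to be named like an exclusion.
                    boolean excluded = !dir.equals(root)
                            && name != null
                            && excludedDirs.contains(name.toString());
                    return excluded ? FileVisitResult.SKIP_SUBTREE : FileVisitResult.CONTINUE;
                }

                @Override
                public FileVisitResult visitFile(Path file, BasicFileAttributes attrs) {
                    if (attrs.isRegularFile()) {
                        files.add(file);
                    }
                    return FileVisitResult.CONTINUE;
                }
            });
        }
        return files;
    }

    public static void main(String[] args) throws IOException {
        List<Path> roots = args.length > 0 ? List.of(Path.of(args[0])) : List.of(Path.of("."));
        collectFiles(roots, EXCLUDED_DIRS).forEach(System.out::println);
    }
}
```

Skipping whole subtrees in preVisitDirectory avoids descending into large dependency folders at all, which matters more for indexing time than filtering individual files afterwards.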
Customizing Guide for codebase ingestion requires minimal changes:
- ✅ Add codebasePath to GuideProperties (~1 line)
- ✅ Modify loadReferences() to use ingestDirectory() (~10 lines)
- ✅ Configure in application.yml (~1 line)
Total changes: ~12 lines of code vs thousands of lines for reimplementation!
Everything else (ChatActions, RAG, frontend, agent framework) works as-is. 🎉