
Feature: AI-powered 'ask' command with flexible LLM provider support #186

@raifdmueller

Description


Problem Statement

Current workflow for LLMs working with documentation:

  1. External LLM needs to navigate documentation
  2. Must load large chunks into context (risk of overflow)
  3. Relevant information may be spread across multiple sections
  4. No intelligent iteration or consolidation
  5. Every query costs API tokens

Result: Inefficient context usage, incomplete answers, high costs

Proposed Solution

Add an ask command that uses an internal LLM to intelligently navigate documentation, iteratively build context, and provide consolidated answers with source references.

Key Innovation: Use existing CLI coding assistants (Claude Code, Aider, etc.) so queries run on the user's flatrate subscription instead of incurring additional API costs.

Architecture

Iterative Context Building

User: dacli ask "How does authentication work?"
         ↓
    1. Search relevant sections
       → security.authentication
       → security.authorization
       → api.endpoints
         ↓
    2. Iterate through sections:
       
       Iteration 1 (security.authentication):
       ┌─────────────────────────────────────┐
       │ Question: How does auth work?       │
       │ Previous findings: [none]           │
       │ Section: security.authentication    │
       │ Content: [JWT tokens, OAuth2...]    │
       └─────────────────────────────────────┘
                    ↓ claude-code
       Findings: "Uses JWT tokens, check authorization flow"
       
       Iteration 2 (security.authorization):
       ┌─────────────────────────────────────┐
       │ Question: How does auth work?       │
       │ Previous findings:                  │
       │   - Uses JWT tokens                 │
       │   - Need authorization details      │
       │ Section: security.authorization     │
       │ Content: [RBAC, permissions...]     │
       └─────────────────────────────────────┘
                    ↓ claude-code
       Findings: "JWT validated, then RBAC checks..."
       
       Iteration 3: Consolidation
       ┌─────────────────────────────────────┐
       │ All findings: [accumulated context] │
       │ Task: Provide final answer with     │
       │       source references             │
       └─────────────────────────────────────┘
                    ↓ claude-code
       Final answer with sources
         ↓
    3. Return consolidated answer

LLM Provider Support

Flexible Configuration via .dacli/config.toml

[llm]
# Provider: "claude-code", "aider", "amazon-q", "bedrock", "api", "custom"
provider = "claude-code"

# Custom command template (for any CLI tool)
# Variables: {prompt}, {prompt_file}, {model}
custom_command = "claude-code --non-interactive"

[llm.options]
max_iterations = 5          # Max sections to check
timeout_seconds = 30        # Per iteration
context_window = 8000       # Tokens per iteration
verbose = false             # Show iteration steps

[llm.api]
# Direct API fallback (if no CLI tool)
api_key = "${ANTHROPIC_API_KEY}"
model = "claude-sonnet-4"
base_url = ""  # Optional proxy

[llm.bedrock]
# AWS Bedrock (enterprise option)
region = "us-east-1"
model_id = "anthropic.claude-3-sonnet-20240229-v1:0"
profile = "default"

[llm.amazon_q]
# Amazon Q Developer
region = "us-east-1"
profile = "default"

[search]
# Search strategy for finding relevant sections
use_embeddings = false      # Requires ADR-007 SQLite index
max_sections = 10           # Max sections to consider
min_relevance = 0.3         # Minimum search score

Supported Providers

Provider        Command                          Cost Model        Use Case
Claude Code     claude-code --non-interactive    Flatrate          Recommended - uses existing subscription
Aider           aider --yes --message            Flatrate/API      Popular coding assistant
Amazon Q        q chat --no-input                AWS subscription  Enterprise, AWS-integrated
Cursor          cursor --stdin                   Subscription      IDE users
Bedrock         boto3 API                        Pay-per-token     Enterprise, compliance needs
Anthropic API   Direct API                       Pay-per-token     Fallback option
Custom          User-defined                     Varies            Any CLI LLM tool

Auto-Detection Priority

  1. Check .dacli/config.toml for configured provider
  2. Detect available CLI tools:
    • claude-code (priority 1)
    • aider (priority 2)
    • cursor (priority 3)
    • q (Amazon Q, priority 4)
  3. Check for API credentials:
    • ANTHROPIC_API_KEY
    • AWS_PROFILE / AWS credentials
  4. Fallback: Error with setup instructions

CLI Interface

Basic Usage

# Ask question about documentation
dacli ask "How does authentication work?"

# Scope to specific section
dacli ask "What are deployment options?" --section architecture

# Specify provider
dacli ask "..." --provider claude-code
dacli ask "..." --provider amazon-q
dacli ask "..." --provider bedrock

# Verbose mode (show iteration steps)
dacli ask "..." --verbose

# Custom LLM command
dacli ask "..." --llm-command "my-ai-tool --query"

# Output format
dacli ask "..." --format json
dacli ask "..." --format markdown

Configuration Management

# Initialize configuration
dacli config init
# → Detects available providers
# → Creates .dacli/config.toml with defaults
# → Prompts for provider selection

# View current config
dacli config show

# Set provider
dacli config set llm.provider claude-code
dacli config set llm.provider bedrock

# Set provider-specific options
dacli config set llm.bedrock.region us-west-2
dacli config set llm.options.max_iterations 10

# Test configuration
dacli config test
# → Runs test query
# → Reports if provider works
# → Shows estimated cost (if applicable)
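
Behind `dacli config set`, the dotted key needs to be mapped onto the nested TOML tables. A hypothetical helper (name and signature are illustrative, not dacli's actual API):

```python
def set_config_value(config: dict, dotted_key: str, value) -> None:
    """Apply `dacli config set llm.bedrock.region us-west-2` semantics
    to an in-memory config dict, creating intermediate tables as needed."""
    *parents, leaf = dotted_key.split(".")
    node = config
    for part in parents:
        node = node.setdefault(part, {})
    node[leaf] = value
```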

MCP Tool Interface

@mcp.tool()
def ask_documentation(
    question: str,
    scope: str | None = None,
    max_iterations: int = 5,
    provider: str | None = None,
) -> dict:
    """
    Ask a question about the documentation using AI reasoning.
    
    Uses configured LLM provider to intelligently search and consolidate
    information from multiple sections.
    
    Args:
        question: The question to answer
        scope: Optional section path to limit search (e.g., "architecture")
        max_iterations: Maximum sections to check (default: 5)
        provider: Override configured provider (e.g., "claude-code", "api")
    
    Returns:
        {
            "answer": "Consolidated answer text",
            "sources": [
                {"path": "security.authentication", "relevance": 0.95},
                {"path": "security.authorization", "relevance": 0.87}
            ],
            "iterations": 3,
            "provider_used": "claude-code",
            "tokens_used": 4200  # If available
        }
    """

Implementation Phases

Phase 1: Basic Infrastructure (MVP)

Goal: Simple ask command with Claude Code integration

  • Add .dacli/config.toml support
  • Implement LLMProvider base class
  • Add ClaudeCodeProvider implementation
  • Basic dacli ask command (single iteration)
  • Configuration commands (config init, config show)
  • Tests for provider detection and execution

Deliverable: dacli ask "question" works with Claude Code

Phase 2: Iterative Context Building

Goal: Smart multi-section reasoning

  • Implement context accumulation logic
  • Add iteration loop with findings consolidation
  • Prompt templates for:
    • Initial search
    • Section evaluation
    • Follow-up suggestions
    • Final consolidation
  • Early termination (when enough info found)
  • Source tracking and citation
  • Verbose mode to show iterations

Deliverable: Intelligent multi-hop reasoning

Phase 3: Multiple Provider Support

Goal: Support major LLM providers

  • AiderProvider implementation
  • AmazonQProvider implementation
  • BedrockProvider implementation (boto3)
  • AnthropicAPIProvider implementation
  • CustomProvider for arbitrary CLI tools
  • Auto-detection with priority fallback
  • Provider-specific configuration validation
  • Cost estimation (for pay-per-token providers)

Deliverable: Works with 5+ different LLM tools

Phase 4: Advanced Features

Goal: Production-ready AI assistant

  • Integration with SQLite embeddings (ADR-007)
  • Semantic search for better section selection
  • Conversation history (follow-up questions)
  • Citation mode with line numbers
  • Structured output formats (JSON, Markdown)
  • MCP tool implementation
  • Performance metrics and logging
  • Rate limiting and timeout handling

Deliverable: Enterprise-ready documentation assistant

Technical Details

Provider Interface

from abc import ABC, abstractmethod


class LLMProvider(ABC):
    """Base class for LLM provider implementations."""
    
    @abstractmethod
    def execute(self, prompt: str, context: str = "") -> str:
        """Execute prompt with accumulated context."""
        pass
    
    @abstractmethod
    def is_available(self) -> bool:
        """Check if provider is available/configured."""
        pass
    
    @property
    @abstractmethod
    def cost_model(self) -> str:
        """Return 'flatrate', 'pay-per-token', or 'free'."""
        pass

Iteration Algorithm

def ask_with_iteration(
    question: str,
    scope: str | None = None,
    max_iterations: int = 5
) -> dict:
    """
    Iteratively search documentation and build answer.
    """
    provider = load_configured_provider()
    findings = ""
    sources = []
    iterations = 0
    
    # 1. Find relevant sections
    candidate_sections = search_relevant_sections(question, scope)
    
    # 2. Iteratively evaluate sections
    for section in candidate_sections[:max_iterations]:
        iterations += 1
        prompt = build_iteration_prompt(
            question=question,
            previous_findings=findings,
            current_section=section
        )
        
        response = provider.execute(prompt, context=findings)
        
        # The response restates and extends the previous findings,
        # so replacing the accumulator keeps the context consolidated
        findings = extract_findings(response)
        sources.append({
            "path": section.path,
            "relevance": calculate_relevance(response)
        })
        
        # Stop early once the findings suffice to answer the question
        if should_terminate(response, findings):
            break
    
    # 3. Final consolidation
    final_answer = provider.execute(
        build_consolidation_prompt(question, findings),
        context=findings
    )
    
    return {
        "answer": final_answer,
        "sources": sources,
        "iterations": iterations,
        "provider_used": provider.name
    }

Prompt Templates

ITERATION_PROMPT = """
Question: {question}

Previous findings:
{previous_findings}

Current section: {section_path}
{section_content}

Task:
1. Does this section contain information relevant to the question?
2. If yes, extract key points
3. Identify what information is still missing
4. Suggest which sections to check next (if any)

Respond in this format:
RELEVANT: Yes/No
KEY_POINTS: [bullet list]
MISSING: [what's still needed]
NEXT_SECTIONS: [suggested sections or "DONE" if complete]
"""

CONSOLIDATION_PROMPT = """
Question: {question}

All findings from documentation:
{accumulated_findings}

Task: Provide a final, consolidated answer that:
1. Directly answers the question
2. Synthesizes information from all sections
3. Includes source references (section paths)
4. Is clear and well-structured

Format:
ANSWER: [your answer]
SOURCES: [list of section paths used]
"""

Use Cases

Use Case 1: External LLM Needs Answer

Scenario: Claude (in Claude.ai) needs to know how authentication works in a project.

Current workflow:

Claude.ai → MCP get_structure → MCP get_section "security" → 
Read 50KB content → Try to answer from context → Maybe miss details

With ask command:

Claude.ai → MCP ask_documentation "How does authentication work?" →
dacli internally:
  - Finds relevant sections (security.auth, api.endpoints, config)
  - Iterates through them with Claude Code (flatrate!)
  - Consolidates findings
  → Returns complete answer with sources
Claude.ai → Uses answer in conversation

Benefit: Better answer, lower cost (flatrate), less context usage

Use Case 2: Developer Onboarding

Scenario: New developer exploring codebase documentation.

$ dacli ask "What's the deployment process?"
Searching documentation...
Checking: deployment.process ✓
Checking: deployment.docker ✓
Checking: ci-cd.pipeline ✓

Answer:
The deployment process involves 3 steps:
1. Build Docker image (deployment.docker)
2. Run tests in CI pipeline (ci-cd.pipeline)
3. Deploy to staging, then production (deployment.process)

For manual deployment, use: docker-compose up

Sources:
- deployment.process (line 45)
- deployment.docker (line 12)
- ci-cd.pipeline (line 78)

Use Case 3: API Documentation Query

Scenario: Finding all authentication-related endpoints.

$ dacli ask "List all endpoints that require authentication" --format json

{
  "answer": "4 endpoints require authentication",
  "details": [
    "/api/users/* - All user management endpoints",
    "/api/admin/* - Admin panel endpoints",
    "/api/profile - User profile endpoint",
    "/api/settings - User settings endpoint"
  ],
  "sources": [
    {"path": "api.authentication", "line": 23},
    {"path": "api.endpoints.users", "line": 45},
    {"path": "api.endpoints.admin", "line": 89}
  ]
}

Cost Analysis

Without dacli ask (External LLM)

  • Load full documentation: ~200k tokens
  • Claude API cost: ~$0.60 per query
  • Multiple queries needed: $2-5 per session
  • Monthly cost (100 queries): $60-500

With dacli ask (Claude Code CLI)

  • Uses existing Claude Pro subscription ($20/month)
  • 5 iterations × 4k tokens = 20k tokens
  • Covered by flatrate
  • Additional cost: $0

With dacli ask (Bedrock)

  • 5 iterations × 4k tokens = 20k tokens
  • Claude Sonnet on Bedrock: ~$0.06
  • Cost per query: $0.06 (90% cheaper than loading full docs)
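
The per-query arithmetic above can be captured in a small estimator, using the example figures from this issue ($0.003 per 1k tokens, 5 iterations of ~4k tokens) as assumptions:

```python
# Example pricing figure used in this issue, not a live rate
PRICE_PER_1K_TOKENS = 0.003


def estimate_cost(iterations: int = 5, tokens_per_iteration: int = 4_000) -> float:
    """Estimate pay-per-token cost for one `dacli ask` query."""
    total_tokens = iterations * tokens_per_iteration
    return total_tokens / 1_000 * PRICE_PER_1K_TOKENS


# 5 iterations x 4k tokens = 20k tokens -> about $0.06 per query
```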

Security Considerations

  • API keys stored in environment variables (not in config file)
  • .dacli/ added to .gitignore automatically
  • Prompts sanitized to mitigate prompt-injection attacks
  • Timeout limits to prevent infinite loops
  • Optional cost estimation before execution (for pay-per-token)

Configuration Example

$ dacli config init

╭─────────────────────────────────────────╮
│  dacli Configuration Setup              │
╰─────────────────────────────────────────╯

Detecting available LLM providers...

✓ Claude Code (claude-code) - FOUND
  Status: Ready to use
  Cost: Flatrate (recommended)

✗ Aider (aider) - NOT FOUND
  Install: pip install aider-chat

✓ Anthropic API - FOUND
  Status: ANTHROPIC_API_KEY detected
  Cost: Pay-per-token ($0.003/1k tokens)

✓ AWS Bedrock - FOUND
  Status: AWS credentials detected
  Cost: Pay-per-token ($0.003/1k tokens)
  
Select default provider:
  [1] claude-code (recommended - uses your flatrate)
  [2] api (Anthropic API)
  [3] bedrock (AWS Bedrock)
  [4] custom (specify command)
> 1

✓ Configuration saved to .dacli/config.toml
✓ Created .dacli/.gitignore

Test configuration? [Y/n] y
Running test query...
✓ Provider works! Response received in 2.3s

Setup complete! Try:
  dacli ask "How does authentication work?"

Documentation Updates

  • Add src/docs/50-user-manual/30-ai-assistant.adoc - Full guide for ask command
  • Update CLI spec (06_cli_specification.adoc) with ask command
  • Update architecture (05_building_block_view.adoc) with LLM integration
  • Add ADR-008 for LLM provider architecture decision

Acceptance Criteria

  • dacli ask "question" works with Claude Code
  • Iterative context building with 3+ sections
  • Support for 3+ different LLM providers (Claude Code, API, Bedrock)
  • Configuration via .dacli/config.toml
  • Auto-detection of available providers
  • Source citations in answers (section paths + lines)
  • Verbose mode shows iteration steps
  • MCP tool implementation
  • Cost estimation for pay-per-token providers
  • Tests for all provider implementations
  • Documentation for setup and usage

Related Issues

Labels

enhancement, feature, ai, llm, search, high-priority

Priority

High - Transforms dacli from navigation tool to intelligent documentation assistant. Leverages existing flatrate subscriptions for zero additional cost.
