feat: Add element content capture (Issue #159)#188

Closed
raifdmueller wants to merge 23 commits into docToolchain:main from raifdmueller:main

@raifdmueller
Collaborator

Summary

Implements Issue #159: Add content capture to elements API/CLI.

Elements now capture their actual content (code, tables, lists, diagrams) and expose it via:

  • MCP tool: get_elements(include_content=True, content_limit=N)
  • CLI: dacli elements --include-content [--content-limit N]
  • API: ElementItem.attributes dict containing element-specific data

Changes

  • Parsers: Extended AsciiDoc and Markdown parsers to capture content for all element types
  • API: Added attributes field to ElementItem model (Issue #159: elements command only returns metadata, not the actual content of elements)
  • CLI: Added --include-content and --content-limit N flags
  • MCP: Added include_content and content_limit parameters to get_elements tool
  • Tests: Added comprehensive test coverage (10 new tests)
  • Docs: Updated API and CLI specifications

Content Capture Strategy

  • Content is stored under the "content" key of the Element.attributes dict during parsing
  • Opt-in via --include-content flag (default: metadata only)
  • Optional truncation via --content-limit N
  • Element-specific attributes:
    • code: language, content
    • table: columns, rows, content (raw table text)
    • list: list_type, content (all items)
    • image: alt, src/target, title
    • diagram: format, name, content (diagram source)
    • admonition: admonition_type, content
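
The opt-in and truncation rules above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the `Element` dataclass and the `export_attributes` helper are hypothetical names, and truncating by character count is an assumption about what `--content-limit N` means.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class Element:
    """Hypothetical stand-in for a parsed element; not dacli's actual model."""
    type: str
    attributes: dict[str, Any] = field(default_factory=dict)


def export_attributes(element, include_content=False, content_limit=None):
    """Return attributes for output, applying the opt-in and truncation rules."""
    attrs = dict(element.attributes)  # copy so the parsed element stays untouched
    if not include_content:
        attrs.pop("content", None)  # default: metadata only
        return attrs
    content = attrs.get("content")
    if content_limit is not None and isinstance(content, str):
        # Assumption: the limit counts characters, not lines or bytes.
        attrs["content"] = content[:content_limit]
    return attrs


code = Element("code", {"language": "python", "content": "print('hello world')"})
print(export_attributes(code))
# {'language': 'python'}
print(export_attributes(code, include_content=True, content_limit=5))
# {'language': 'python', 'content': 'print'}
```

Filtering on a copy at output time keeps the parsed elements complete, so the same index could serve both metadata-only and content-bearing requests.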

Test Results

All tests passing (10 new tests):

  • AsciiDoc content capture (code, table, plantuml, list)
  • Markdown content capture (code, table, list)
  • CLI flags (--include-content, --content-limit)

Fixes #159

🤖 Generated with Claude Code

rdmueller and others added 23 commits January 24, 2026 08:26
- Document main/develop branching model in CLAUDE.md
- Add development workflow section to README.md
- main is now stable/production, develop is for active development

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
docToolchain#169)

* refactor: Consolidate shared parser utilities (Issue docToolchain#165)

- Create parser_utils.py with shared functions:
  - slugify(): Convert text to URL-friendly slug (uses robust Markdown version)
  - collect_all_sections(): Recursively collect sections into flat list
  - find_section_by_path(): Find section by hierarchical path

- Update asciidoc_parser.py:
  - Remove _title_to_slug() function (use slugify from parser_utils)
  - Remove _collect_all_sections() method
  - Remove _find_section_by_path() method

- Update markdown_parser.py:
  - Remove slugify() function (use from parser_utils)
  - Remove _collect_all_sections() method
  - Remove _find_section_by_path() method

- Add 19 tests for parser_utils functions

Benefits:
- DRY principle: Single source of truth for shared logic
- Bug fixes apply to both parsers automatically
- Easier testing of utility functions

All 454 tests passing.
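
For illustration, the three shared helpers could look roughly like this. The function names come from the commit message; the `Section` dataclass, the exact slug rules, and the path format are assumptions, so treat it as a sketch and not the contents of parser_utils.py.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Section:
    """Hypothetical section node; the real model may differ."""
    title: str
    path: str
    children: list["Section"] = field(default_factory=list)


def slugify(text: str) -> str:
    """Lowercase the text, drop punctuation, and join words with hyphens."""
    text = re.sub(r"[^\w\s-]", "", text.lower())
    return re.sub(r"[\s_-]+", "-", text).strip("-")


def collect_all_sections(sections):
    """Recursively flatten a section tree into a list (parents before children)."""
    flat = []
    for section in sections:
        flat.append(section)
        flat.extend(collect_all_sections(section.children))
    return flat


def find_section_by_path(sections, path):
    """Return the section whose hierarchical path matches, or None."""
    for section in collect_all_sections(sections):
        if section.path == path:
            return section
    return None


tree = [Section("Intro", "intro", [Section("Goals", "intro.goals")])]
print(slugify("Getting Started: Quick & Easy!"))  # getting-started-quick-easy
print(find_section_by_path(tree, "intro.goals").title)  # Goals
```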

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* ci: Run tests on develop branch too

Update test workflow to run on both main and develop branches.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* fix: Remove unused pytest import

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

---------

Co-authored-by: Ralf D. Müller <ralf.d.mueller@gmail.com>
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
…Toolchain#170)

1. ParseWarning.type as Enum
   - Add WarningType enum with UNCLOSED_BLOCK, UNCLOSED_TABLE
   - Update ParseWarning to use WarningType instead of str
   - Add Enum serialization to _convert_value()
   - Update CLI and MCP to use .value for JSON output

2. Remove redundant `pass` in Exception classes
   - FileReadError and FileWriteError already have docstrings

3. Move datetime imports to top of file
   - Remove duplicate imports inside metadata() function
   - Import UTC, datetime at module level

4. Extract _compute_hash() helper function
   - DRY: Single function for MD5 hash computation
   - Used in update() and insert() commands

5. Remove stale TODO and update code
   - SourceLocation now has end_line field
   - Use elem.source_location.end_line in API response

All 435 tests passing, no behavior changes.

Co-authored-by: Ralf D. Müller <ralf.d.mueller@gmail.com>
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
…oolchain#155)

- Add recursive: bool = Query(default=False) parameter
- Pass recursive to index.get_elements()
- Completes API parity with CLI/MCP for hierarchical filtering

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
…ve-155

fix: Add recursive parameter to API get_elements endpoint (Issue docToolchain#155)
…lchain#136) (docToolchain#172)

- Add current_list_element variable to track current list
- Initialize end_line when creating list element
- Update end_line for each subsequent list item
- Reset tracking when list ends (non-list line)
- Add tests for list end_line behavior
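
The tracking described above might be sketched like this. Only the `current_list_element` variable name is taken from the commit message; the `ListElement` class, the `find_lists` function, and the simplified list-item regex are hypothetical.

```python
import re
from dataclasses import dataclass

# Simplified assumption: a list item starts with *, -, dots, or "1." plus text.
LIST_ITEM = re.compile(r"^\s*(?:[*-]+|\d+\.|\.+)\s+\S")


@dataclass
class ListElement:
    start_line: int
    end_line: int


def find_lists(lines):
    """Group consecutive list-item lines and record each list's end_line."""
    lists = []
    current_list_element = None  # the list currently being extended, if any
    for line_no, line in enumerate(lines, start=1):
        if LIST_ITEM.match(line):
            if current_list_element is None:
                # First item: end_line is initialized to the start line.
                current_list_element = ListElement(line_no, line_no)
                lists.append(current_list_element)
            else:
                # Each subsequent item pushes end_line forward.
                current_list_element.end_line = line_no
        else:
            # Any non-list line ends the current list.
            current_list_element = None
    return lists


doc = ["Intro", "* a", "* b", "* c", "", "Outro", "- x"]
print(find_lists(doc))
# [ListElement(start_line=2, end_line=4), ListElement(start_line=7, end_line=7)]
```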

Fixes docToolchain#136

Co-authored-by: Ralf D. Müller <ralf.d.mueller@gmail.com>
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
- Add issue handling section to branching strategy
- Explain why issues don't auto-close on develop merge
- Document fixed-in-develop label usage

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
…olchain#163) (docToolchain#173)

- Create services/ package with MetadataService, ValidationService, ContentService
- MetadataService: get_project_metadata(), get_section_metadata()
- ValidationService: validate_structure()
- ContentService: update_section(), compute_hash()
- CLI and MCP now use shared services
- Eliminates ~300 lines of duplicated code
- No behavior changes - all 456 tests pass
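
As a hedged illustration of the shared-service idea, the sketch below routes two front ends through one `compute_hash()`. The class and method names follow the bullet list above, and the MD5 choice follows the earlier `_compute_hash()` commit in this PR; the two wrapper functions and the return shape are hypothetical.

```python
import hashlib


class ContentService:
    """Hypothetical sketch; only the names come from the commit message."""

    @staticmethod
    def compute_hash(content: str) -> str:
        # MD5 serves as a change-detection fingerprint here, not for security.
        return hashlib.md5(content.encode("utf-8")).hexdigest()


def cli_update(content: str) -> dict:
    """Stand-in for the CLI code path."""
    return {"hash": ContentService.compute_hash(content)}


def mcp_update(content: str) -> dict:
    """Stand-in for the MCP code path."""
    return {"hash": ContentService.compute_hash(content)}


print(cli_update("hello"))
# {'hash': '5d41402abc4b2a76b9719d911017c592'}
print(cli_update("hello") == mcp_update("hello"))  # True
```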

Code reduction:
- cli.py: 859 → 696 lines (-19%)
- mcp_app.py: 726 → 530 lines (-27%)

Fixes docToolchain#163

Co-authored-by: Ralf D. Müller <ralf.d.mueller@gmail.com>
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
Document the fork workflow, git user configuration for AI commits,
and authentication setup for working with Claude Code.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Replace dual-branch strategy (main/develop) with simpler fork workflow:
- All development happens on fork (raifdmueller/dacli)
- PRs go directly to upstream/main
- Issues auto-close on merge (no more fixed-in-develop label needed)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
…9327827816

Add Claude Code GitHub Workflow
Move user-specific settings (fork name, git user, auth details)
to ~/.claude/CLAUDE.md to keep project docs generic.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Changes:
- Parser: Capture content for code, tables, lists, diagrams
- API: Add attributes field to ElementItem model
- CLI: Add --include-content and --content-limit flags
- MCP: Add include_content and content_limit parameters
- Tests: Add comprehensive content capture tests (471 passing)
- Docs: Update API and CLI specifications

Details:
- AsciiDoc parser captures content for all block elements
- Markdown parser captures content for code, tables, lists
- Content is opt-in via --include-content flag
- Content can be truncated with --content-limit N
- API always returns attributes when present
- CLI filters attributes based on flags

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Remove accidentally committed test artifacts.
- Fix line too long (E501)
- Fix unsorted imports (I001)
# Conflicts:
#	.github/workflows/claude-code-review.yml
#	.github/workflows/claude.yml
@raifdmueller
Collaborator Author

Closing - will create clean PR with only Issue #159 commits
