Skip to content

Add StageExample.render_content hook for docgen (closes #383)#384

Merged
mmcdermott merged 4 commits intomainfrom
feat/docgen-stage-example-hook
Apr 18, 2026
Merged

Add StageExample.render_content hook for docgen (closes #383)#384
mmcdermott merged 4 commits intomainfrom
feat/docgen-stage-example-hook

Conversation

@mmcdermott
Copy link
Copy Markdown
Owner

Summary

  • Moves per-example rendering out of the free-function _format_example in MEDS_transforms/stages/docgen.py onto StageExample.render_content(example_dir=None), so downstream packages whose stages register with a custom example_class can override rendering without monkey-patching docgen.
  • Adds a symmetric Path branch for want_data (mirrors the existing in_data handling), resolving the immediate AttributeError: 'PosixPath' object has no attribute '_pl_shards' that any subclass using a YAML-spec path as want_data hit.
  • Promotes df_to_markdown and format_dataset to public utilities on docgen so subclass overrides can reuse them.
  • Makes generate_stage_docs tolerate stages whose stage_dir lives outside the root passed for edit-link computation (returns None for edit_path instead of raising).
  • Demonstrates the hook: simple_example_pkg.export_code_summary.JsonOutputStageExample overrides render_content to parse its JSON output spec and render each file as a Markdown table.

Design

See #383 (comment) for the full design proposal.

Test plan

  • uv run pytest src/MEDS_transforms/stages/docgen.py src/MEDS_transforms/stages/examples.py --doctest-modules — doctests on the new method and utilities pass.
  • uv run pytest tests/test_generic_input.py — updated to use render_content.
  • uv run pytest example/simple_example_pkg/tests/test_registered_stages.py — new end-to-end assertion that generate_stage_docs("simple_example_pkg", root=…) renders the subclass-customized JSON table.
  • uv run ruff check / uv run ruff format --check clean.
  • CI green across all suites.
  • Copilot review iterated to no outstanding comments.

🤖 Generated with Claude Code

Move per-example rendering out of the free-function `_format_example` in
`docgen.py` and onto `StageExample.render_content(example_dir=None)`.
Downstream packages whose stages use a custom `example_class` can now
override rendering cleanly without monkey-patching docgen.

Key changes:
- `StageExample.render_content` provides the default Markdown rendering
  (chrome + data sections + CLI invocation). The `want_data` branch now
  mirrors `in_data`'s Path-handling, so a `yaml_to_disk`-spec Path in
  either field renders as a YAML code block instead of crashing on
  `_pl_shards`.
- `df_to_markdown` and `format_dataset` are promoted to public
  utilities on `docgen` so subclass overrides can reuse them.
- `generate_stage_docs` no longer crashes when a stage's `stage_dir`
  lives outside the `root` passed for edit-link computation; it falls
  back to `None` instead.
- `simple_example_pkg.JsonOutputStageExample` overrides
  `render_content` to parse the JSON output spec and render each file
  as a Markdown table — demonstrating the hook for downstream packages.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@codecov
Copy link
Copy Markdown

codecov Bot commented Apr 18, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 98.17%. Comparing base (dda4577) to head (c540467).
⚠️ Report is 5 commits behind head on main.

Additional details and impacted files
@@           Coverage Diff           @@
##             main     #384   +/-   ##
=======================================
  Coverage   98.16%   98.17%           
=======================================
  Files          54       54           
  Lines        2510     2517    +7     
=======================================
+ Hits         2464     2471    +7     
  Misses         46       46           

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR refactors stage documentation generation to use a new StageExample.render_content() hook, enabling downstream packages to override per-example rendering without monkey-patching docgen, and fixes the want_data Path handling asymmetry that previously caused errors for non-MEDS-shaped examples.

Changes:

  • Introduces StageExample.render_content(example_dir=None) as the default per-example Markdown renderer (including symmetric Path handling for want_data).
  • Updates docgen to delegate example rendering to StageExample.render_content, and promotes read_example_readme, df_to_markdown, and format_dataset as public utilities.
  • Adds an end-to-end example/test demonstrating a custom StageExample subclass overriding render_content, plus a safer edit-link computation when stage_dir is outside root.

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 1 comment.

Show a summary per file
File Description
tests/test_generic_input.py Updates docgen-related test to use StageExample.render_content() instead of the removed private formatter.
src/MEDS_transforms/stages/examples.py Adds StageExample.render_content() default implementation (including want_data: Path rendering).
src/MEDS_transforms/stages/docgen.py Delegates per-example rendering to StageExample.render_content, exposes helper utilities, and makes edit-link computation tolerant of out-of-root stages.
example/simple_example_pkg/tests/test_registered_stages.py Adds an end-to-end assertion that docgen respects a custom example-class renderer.
example/simple_example_pkg/src/simple_example_pkg/export_code_summary/export_code_summary.py Demonstrates overriding render_content() to render JSON outputs as Markdown tables.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread src/MEDS_transforms/stages/examples.py Outdated
The CLI invocation hint in `StageExample.render_content` hand-rolled
`stage_cfg.{k}={v}` formatting, which produced invalid Hydra dotlist
syntax for `bool`, `None`, and `list` values and ignored
`do_use_config_yaml` mode.

Delegate to `self.cmd_args`, which internally calls
`dict_to_hydra_kwargs` (correct for all supported types) and returns
`[]` when `do_use_config_yaml` is set, matching how
`StageExample.test_env` actually invokes the stage. Mirrored the same
treatment in the `JsonOutputStageExample` override.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@mmcdermott
Copy link
Copy Markdown
Owner Author

@copilot please take another look

Copy link
Copy Markdown

Copilot AI commented Apr 18, 2026

@copilot please take another look

Reviewed the current state of the PR. All checks pass:

  • All doctests on examples.py and docgen.py pass (11 total), including the two new tests locking the CLI hint behavior for bool/None values and do_use_config_yaml mode.
  • tests/test_generic_input.py (10 tests) passes.
  • example/simple_example_pkg/tests/ (4 tests) passes, including the end-to-end test_docgen_renders_custom_example_class.
  • ruff check is clean.

The CLI hint fix from the previous commit (using self.cmd_argsdict_to_hydra_kwargs) is covered by doctests and works correctly. No outstanding issues found.

Warning

Firewall rules blocked me from connecting to one or more addresses (expand for details)

I tried to connect to the following addresses, but was blocked by firewall rules:

  • astral.sh
    • Triggering command: /usr/bin/curl curl -LsSf REDACTED (dns block)

If you need me to access, download, or install something from one of these locations, you can either:

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 5 out of 5 changed files in this pull request and generated 1 comment.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread src/MEDS_transforms/stages/examples.py Outdated
Tighten `read_example_readme`'s heading-stripping regex from
`^#[^\n]*\n+` to `^#+[^\n]*\n+` so the "any ATX level" intent is
explicit on the face of the pattern (the previous form already stripped
all levels because the `[^\n]*` greedily consumed the remaining `#`s,
but it read as if it only handled level 1). Added a doctest for a
level-3 heading and updated `render_content`'s docstring to say "ATX
heading of any level" to match.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 5 out of 5 changed files in this pull request and generated 2 comments.

Comments suppressed due to low confidence (1)

src/MEDS_transforms/stages/docgen.py:334

  • The root parameter is described as a “repository root” and “two directories above this file”, but the implementation defaults to Path(__file__).resolve().parents[1] (the src/MEDS_transforms package dir). This makes edit_path values like stages/<stage> instead of src/MEDS_transforms/stages/<stage>, which is unlikely to exist at the repo root and can break MkDocs edit links. Please align the docstring and the default (either compute an actual repo root, or describe/return a package-relative edit path intentionally).
        root: Repository root for computing edit links. Defaults to two
            directories above this file. Stages whose ``stage_dir`` lives outside
            ``root`` get a ``None`` ``edit_path`` (no MkDocs edit link) rather
            than raising.

    Returns:
        A list of :class:`StageDoc` objects describing each page.

    Examples:
        >>> import textwrap
        >>> docs = generate_stage_docs("MEDS_transforms")
        >>> print(docs[0].stage_name)
        add_time_derived_measurements
        >>> print(docs[0].path)
        add_time_derived_measurements
        >>> print(textwrap.shorten(docs[0].content, width=50))
        # `add_time_derived_measurements` Adds all [...]

    It can also attempt to make docs for other packages; in this case, it will return None as there are no
    stages in this fake package:

        >>> generate_stage_docs("fake_package")
        []

    When a stage's ``stage_dir`` is outside the ``root`` used for edit links, ``edit_path`` falls
    back to ``None`` instead of raising:

        >>> from pathlib import Path
        >>> docs_no_edit = generate_stage_docs("MEDS_transforms", root=Path("/nonexistent"))
        >>> all(d.edit_path is None for d in docs_no_edit)
        True
    """

    root = Path(root) if root else Path(__file__).resolve().parents[1]


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread src/MEDS_transforms/stages/docgen.py
Comment thread src/MEDS_transforms/stages/docgen.py Outdated
@mmcdermott
Copy link
Copy Markdown
Owner Author

Also addressed the low-confidence docstring observation from the root parameter description: clarified that the default is the MEDS_transforms package directory (docgen.py's parent-of-parent), not the repository root, and that the produced edit_path is package-relative (e.g. stages/<stage_name>). Downstream-package doc builds should pass their own package root.

- `_safe_relative_to` resolves both operands before comparing, so mixed
  absolute/relative path pairs (e.g., relative `root` against absolute
  `stage.stage_dir`) no longer silently drop edit links.
- `read_example_readme`'s ATX-stripping regex now anchors to end-of-string
  too (`^#+[^\n]*(?:\n+|$)`), so heading-only READMEs with no trailing
  newline/body are handled consistently.
- Clarified the `generate_stage_docs(root=...)` docstring: the default is
  the `MEDS_transforms` *package* directory, not the repository root;
  `edit_path` is package-relative.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 5 out of 5 changed files in this pull request and generated no new comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@mmcdermott mmcdermott merged commit 03a8020 into main Apr 18, 2026
13 of 14 checks passed
@mmcdermott mmcdermott deleted the feat/docgen-stage-example-hook branch April 18, 2026 14:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants