Add StageExample.render_content hook for docgen (closes #383)#384
Add StageExample.render_content hook for docgen (closes #383)#384mmcdermott merged 4 commits intomainfrom
Conversation
Move per-example rendering out of the free-function `_format_example` in `docgen.py` and onto `StageExample.render_content(example_dir=None)`. Downstream packages whose stages use a custom `example_class` can now override rendering cleanly without monkey-patching docgen. Key changes: - `StageExample.render_content` provides the default Markdown rendering (chrome + data sections + CLI invocation). The `want_data` branch now mirrors `in_data`'s Path-handling, so a `yaml_to_disk`-spec Path in either field renders as a YAML code block instead of crashing on `_pl_shards`. - `df_to_markdown` and `format_dataset` are promoted to public utilities on `docgen` so subclass overrides can reuse them. - `generate_stage_docs` no longer crashes when a stage's `stage_dir` lives outside the `root` passed for edit-link computation; it falls back to `None` instead. - `simple_example_pkg.JsonOutputStageExample` overrides `render_content` to parse the JSON output spec and render each file as a Markdown table — demonstrating the hook for downstream packages. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #384 +/- ##
=======================================
Coverage 98.16% 98.17%
=======================================
Files 54 54
Lines 2510 2517 +7
=======================================
+ Hits 2464 2471 +7
Misses 46 46 ☔ View full report in Codecov by Sentry. |
There was a problem hiding this comment.
Pull request overview
This PR refactors stage documentation generation to use a new StageExample.render_content() hook, enabling downstream packages to override per-example rendering without monkey-patching docgen, and fixes the want_data Path handling asymmetry that previously caused errors for non-MEDS-shaped examples.
Changes:
- Introduces
StageExample.render_content(example_dir=None)as the default per-example Markdown renderer (including symmetricPathhandling forwant_data). - Updates
docgento delegate example rendering toStageExample.render_content, and promotesread_example_readme,df_to_markdown, andformat_datasetas public utilities. - Adds an end-to-end example/test demonstrating a custom
StageExamplesubclass overridingrender_content, plus a safer edit-link computation whenstage_diris outsideroot.
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 1 comment.
Show a summary per file
| File | Description |
|---|---|
| tests/test_generic_input.py | Updates docgen-related test to use StageExample.render_content() instead of the removed private formatter. |
| src/MEDS_transforms/stages/examples.py | Adds StageExample.render_content() default implementation (including want_data: Path rendering). |
| src/MEDS_transforms/stages/docgen.py | Delegates per-example rendering to StageExample.render_content, exposes helper utilities, and makes edit-link computation tolerant of out-of-root stages. |
| example/simple_example_pkg/tests/test_registered_stages.py | Adds an end-to-end assertion that docgen respects a custom example-class renderer. |
| example/simple_example_pkg/src/simple_example_pkg/export_code_summary/export_code_summary.py | Demonstrates overriding render_content() to render JSON outputs as Markdown tables. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
The CLI invocation hint in `StageExample.render_content` hand-rolled
`stage_cfg.{k}={v}` formatting, which produced invalid Hydra dotlist
syntax for `bool`, `None`, and `list` values and ignored
`do_use_config_yaml` mode.
Delegate to `self.cmd_args`, which internally calls
`dict_to_hydra_kwargs` (correct for all supported types) and returns
`[]` when `do_use_config_yaml` is set, matching how
`StageExample.test_env` actually invokes the stage. Mirrored the same
treatment in the `JsonOutputStageExample` override.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
@copilot please take another look |
Reviewed the current state of the PR. All checks pass:
The CLI hint fix from the previous commit (using Warning Firewall rules blocked me from connecting to one or more addresses (expand for details)I tried to connect to the following addresses, but was blocked by firewall rules:
If you need me to access, download, or install something from one of these locations, you can either:
|
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 5 out of 5 changed files in this pull request and generated 1 comment.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Tighten `read_example_readme`'s heading-stripping regex from `^#[^\n]*\n+` to `^#+[^\n]*\n+` so the "any ATX level" intent is explicit on the face of the pattern (the previous form already stripped all levels because the `[^\n]*` greedily consumed the remaining `#`s, but it read as if it only handled level 1). Added a doctest for a level-3 heading and updated `render_content`'s docstring to say "ATX heading of any level" to match. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 5 out of 5 changed files in this pull request and generated 2 comments.
Comments suppressed due to low confidence (1)
src/MEDS_transforms/stages/docgen.py:334
- The
rootparameter is described as a “repository root” and “two directories above this file”, but the implementation defaults toPath(__file__).resolve().parents[1](thesrc/MEDS_transformspackage dir). This makesedit_pathvalues likestages/<stage>instead ofsrc/MEDS_transforms/stages/<stage>, which is unlikely to exist at the repo root and can break MkDocs edit links. Please align the docstring and the default (either compute an actual repo root, or describe/return a package-relative edit path intentionally).
root: Repository root for computing edit links. Defaults to two
directories above this file. Stages whose ``stage_dir`` lives outside
``root`` get a ``None`` ``edit_path`` (no MkDocs edit link) rather
than raising.
Returns:
A list of :class:`StageDoc` objects describing each page.
Examples:
>>> import textwrap
>>> docs = generate_stage_docs("MEDS_transforms")
>>> print(docs[0].stage_name)
add_time_derived_measurements
>>> print(docs[0].path)
add_time_derived_measurements
>>> print(textwrap.shorten(docs[0].content, width=50))
# `add_time_derived_measurements` Adds all [...]
It can also attempt to make docs for other packages; in this case, it will return None as there are no
stages in this fake package:
>>> generate_stage_docs("fake_package")
[]
When a stage's ``stage_dir`` is outside the ``root`` used for edit links, ``edit_path`` falls
back to ``None`` instead of raising:
>>> from pathlib import Path
>>> docs_no_edit = generate_stage_docs("MEDS_transforms", root=Path("/nonexistent"))
>>> all(d.edit_path is None for d in docs_no_edit)
True
"""
root = Path(root) if root else Path(__file__).resolve().parents[1]
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
|
Also addressed the low-confidence docstring observation from the |
- `_safe_relative_to` resolves both operands before comparing, so mixed absolute/relative path pairs (e.g., relative `root` against absolute `stage.stage_dir`) no longer silently drop edit links. - `read_example_readme`'s ATX-stripping regex now anchors to end-of-string too (`^#+[^\n]*(?:\n+|$)`), so heading-only READMEs with no trailing newline/body are handled consistently. - Clarified the `generate_stage_docs(root=...)` docstring: the default is the `MEDS_transforms` *package* directory, not the repository root; `edit_path` is package-relative. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 5 out of 5 changed files in this pull request and generated no new comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Summary
_format_exampleinMEDS_transforms/stages/docgen.pyontoStageExample.render_content(example_dir=None), so downstream packages whose stages register with a customexample_classcan override rendering without monkey-patching docgen.Pathbranch forwant_data(mirrors the existingin_datahandling), resolving the immediateAttributeError: 'PosixPath' object has no attribute '_pl_shards'that any subclass using a YAML-spec path aswant_datahit.df_to_markdownandformat_datasetto public utilities ondocgenso subclass overrides can reuse them.generate_stage_docstolerate stages whosestage_dirlives outside therootpassed for edit-link computation (returnsNoneforedit_pathinstead of raising).simple_example_pkg.export_code_summary.JsonOutputStageExampleoverridesrender_contentto parse its JSON output spec and render each file as a Markdown table.Design
See #383 (comment) for the full design proposal.
Test plan
uv run pytest src/MEDS_transforms/stages/docgen.py src/MEDS_transforms/stages/examples.py --doctest-modules— doctests on the new method and utilities pass.uv run pytest tests/test_generic_input.py— updated to userender_content.uv run pytest example/simple_example_pkg/tests/test_registered_stages.py— new end-to-end assertion thatgenerate_stage_docs("simple_example_pkg", root=…)renders the subclass-customized JSON table.uv run ruff check/uv run ruff format --checkclean.🤖 Generated with Claude Code