- `rtflite` is a lightweight Python library for composing rich-text-format (RTF) documents used for production-quality tables and figures.
- Runtime code lives under `src/rtflite/` and is packaged via `pyproject.toml` with Python 3.10+ support.
- Automated tests live in `tests/` and cover pagination, encoding, services, and snapshot comparisons of generated RTF strings.
- Core models & configuration: `src/rtflite/core/` exposes reusable constants and configuration structures consumed across services.
- Input schemas: `src/rtflite/input.py` and related modules define Pydantic models (`RTFBody`, `RTFFigure`, `RTFPage`, etc.) that describe incoming table/figure data.
- Composition logic: Modules such as `encode.py`, `pagination/`, `row.py`, `attributes.py`, and `services/` transform structured data into final RTF output. They make heavy use of Pydantic validators, typed helper classes, and string-building utilities.
- Font and color utilities: `fonts/`, `fonts_mapping.py`, `text_convert.py`, `text_conversion/`, and `services/color_service.py` manage font metrics, color lookup, and text normalization.
- Conversion helpers: `convert.py` provides integration with LibreOffice for PDF conversion, while shell and Python scripts in `scripts/` (e.g., `check_rtf.sh`, `verify_ascii.py`) support validation workflows.
- Documentation: `docs/` hosts the Zensical site. Markdown articles may execute Python snippets via `markdown-exec` during site builds.
- Use `uv` for environment management. Run `uv sync` to create/refresh the local virtual environment before developing.
- Keep imports sorted with `isort .` and format code with `ruff format`. Follow existing typing conventions: prefer explicit type hints and Pydantic field validators over dynamic typing.
- Maintain consistency with existing error handling (primarily `ValueError`/`TypeError` for validation issues) and adhere to the RTF command patterns already present in helper modules.
- When touching Markdown files in the repository root, run `sh docs/scripts/sync.sh` to propagate changes into the documentation site.
- Do not automatically commit changes. Always leave changes unstaged for the user to review and commit manually, unless explicitly instructed to commit.
- Execute `pytest` (or targeted subsets) before committing. Snapshot-style tests compare normalized RTF output; update fixtures thoughtfully and document rationale when expectations change.
- If functionality affects documentation examples, rebuild the docs locally with `zensical build --clean` (or preview via `zensical serve`) to confirm rendered outputs.
- Keep dependency metadata (`uv.lock`) in sync when upgrading libraries; use `uv lock --upgrade` followed by `uv sync` if dependency changes are intentional.
- Reuse utility functions in `tests/utils.py` and `tests/utils_snapshot.py` when authoring new tests to ensure consistent normalization of RTF strings.
- Consult `scripts/update_color_table.R` and `scripts/update_unicode_latex.py` when modifying color tables or Unicode handling to avoid drifting from validated data sources.
- Public API exports are centralized in `src/rtflite/__init__.py`; update `__all__` when adding new user-facing classes or helpers.
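
The snapshot comparison mentioned in the testing notes above can be illustrated with a minimal, standard-library sketch. The names `normalize_rtf` and `assert_matches_snapshot` are hypothetical and are not the helpers in `tests/utils.py` or `tests/utils_snapshot.py`; the whitespace rules shown here are an assumption about what "normalized" might mean, so prefer the real utilities when writing tests.

```python
import re


def normalize_rtf(text: str) -> str:
    """Hypothetical normalizer: remove layout-only differences.

    Assumption: line endings, blank lines, and runs of spaces/tabs are not
    significant for the comparison. The project's real helpers may differ.
    """
    text = text.replace("\r\n", "\n")  # unify Windows and Unix line endings
    lines = [re.sub(r"[ \t]+", " ", line).strip() for line in text.split("\n")]
    return "\n".join(line for line in lines if line)  # drop blank lines


def assert_matches_snapshot(actual: str, expected: str) -> None:
    """Raise AssertionError when the normalized strings differ."""
    if normalize_rtf(actual) != normalize_rtf(expected):
        raise AssertionError("RTF output differs from snapshot after normalization")


# The two strings differ only in line endings, blank lines, and spacing.
generated = "{\\rtf1\\ansi\r\n\r\n  \\pard  Hello\\par\r\n}"
snapshot = "{\\rtf1\\ansi\n\\pard Hello\\par\n}"
assert_matches_snapshot(generated, snapshot)
```

Normalizing both sides before comparing keeps fixtures stable across platforms, at the cost of hiding whitespace changes that might matter inside text runs.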
The project uses a modular, strategy-based architecture for pagination and rendering.
- Pagination Logic: Decoupled from the core distributor.
  - Strategies:
    - `DefaultPaginationStrategy`: Standard row-limit-based pagination.
    - `PageByStrategy`: Handles `page_by` grouping logic.
    - `SublineStrategy`: Handles `subline_by` logic.
  - Registry: `StrategyRegistry` resolves strategies based on configuration.
- `UnifiedRTFEncoder`: Handles all document encoding (replacing legacy split strategies).
  - Flow:
    - Paginate: Select strategy -> Split DataFrame into `List[PageContext]`.
    - Process: Apply per-page features (borders, headers, dynamic attributes) via `PageFeatureProcessor`.
    - Render: Use `PageRenderer` to convert each `PageContext` to RTF.
    - Assemble: Concatenate page RTF chunks.
- Page-specific features (top/bottom borders, spanning rows, footnotes) are calculated and finalized on the `PageContext` object before rendering.
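
The Paginate -> Process -> Render -> Assemble flow above can be sketched in plain Python. This is an illustration of the pattern only, not the library's code: the `PageContext` dataclass, the `STRATEGIES` dictionary, and the functions `paginate_by_row_limit`, `process`, `render`, and `encode` are hypothetical stand-ins, and a list of dictionaries replaces the DataFrame to keep the example dependency-free.

```python
from dataclasses import dataclass, field
from typing import Callable

Row = dict[str, object]


@dataclass
class PageContext:
    """Hypothetical per-page state; the real PageContext may differ."""
    page_number: int
    total_pages: int
    rows: list[Row]
    features: dict[str, bool] = field(default_factory=dict)


def paginate_by_row_limit(rows: list[Row], nrow: int) -> list[PageContext]:
    """Paginate: split rows into fixed-size chunks, one PageContext each."""
    chunks = [rows[i:i + nrow] for i in range(0, len(rows), nrow)]
    return [
        PageContext(page_number=n, total_pages=len(chunks), rows=chunk)
        for n, chunk in enumerate(chunks, start=1)
    ]


# Registry: resolve a pagination strategy by name (hypothetical keys).
STRATEGIES: dict[str, Callable[[list[Row], int], list[PageContext]]] = {
    "default": paginate_by_row_limit,
}


def process(page: PageContext) -> PageContext:
    """Process: finalize page-specific features before rendering."""
    page.features["top_border"] = page.page_number == 1
    page.features["bottom_border"] = page.page_number == page.total_pages
    return page


def render(page: PageContext) -> str:
    """Render: convert one page to a simplified RTF chunk."""
    cells = " ".join(str(row["label"]) for row in page.rows)
    chunk = f"\\pard {cells}\\par"
    if page.page_number < page.total_pages:
        chunk += "\\page"  # page break between pages, none after the last
    return chunk


def encode(rows: list[Row], nrow: int, strategy: str = "default") -> str:
    """Assemble: concatenate rendered page chunks into one document string."""
    pages = [process(p) for p in STRATEGIES[strategy](rows, nrow)]
    return "{\\rtf1 " + "".join(render(p) for p in pages) + "}"


rows = [{"label": letter} for letter in "ABCDE"]
document = encode(rows, nrow=2)
```

Because every page-level decision is stored on the context object during the process step, the render step stays a pure function of a single page.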
Pagination logic has been refactored to be metadata-driven, centralizing row height and page break calculations into `PageBreakCalculator`.
- `RowMetadata`: A Pydantic model (in `core.py`) defining the structure of each row's pagination data (row index, data rows, header rows, page assignment, group start flags).
- `PageBreakCalculator.calculate_row_metadata`: The core method that generates a Polars DataFrame of `RowMetadata`. It handles:
  - Content row height calculation (respecting fonts and column widths).
  - Header row estimation.
  - Page assignment logic (respecting `nrow`, `additional_rows_per_page`, `new_page`).
  - Group start detection for `page_by` and `subline_by`.
- `DefaultPaginationStrategy`: Uses `calculate_row_metadata` to determine page breaks based on content height.
- `PageByStrategy`: Uses `calculate_row_metadata` with `page_by` columns to handle grouping and potential page breaks.
- `SublineStrategy`: Uses `calculate_row_metadata` with `subline_by` (passed as `subline_by` arg) and `new_page=True` to enforce page breaks on subline changes.
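
As a rough illustration of the metadata-driven approach described above, the sketch below assigns a page and a group-start flag to each row. It rests on simplifying assumptions: `RowMeta` and `sketch_row_metadata` are hypothetical names, every row is treated as one line high (the real calculator accounts for fonts, column widths, and header rows), and plain lists are used instead of a Polars DataFrame.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RowMeta:
    """Hypothetical, reduced version of the per-row pagination record."""
    row_index: int
    page: int
    is_group_start: bool


def sketch_row_metadata(
    group_keys: list[str], nrow: int, new_page: bool = False
) -> list[RowMeta]:
    """Assign pages given a per-page row limit and optional group breaks.

    Assumption: each row occupies exactly one line on the page.
    """
    if nrow < 1:
        raise ValueError("nrow must be at least 1")

    metadata: list[RowMeta] = []
    page, used = 1, 0
    previous_key = None
    for index, key in enumerate(group_keys):
        is_start = index == 0 or key != previous_key
        page_full = used >= nrow
        forced_break = new_page and is_start and index > 0
        if page_full or forced_break:
            page += 1
            used = 0
        metadata.append(RowMeta(row_index=index, page=page, is_group_start=is_start))
        used += 1
        previous_key = key
    return metadata


keys = ["A", "A", "A", "B", "B"]
by_height = sketch_row_metadata(keys, nrow=2)
by_group = sketch_row_metadata(keys, nrow=2, new_page=True)
```

With `new_page=True`, the change from group `A` to group `B` starts a new page even though the current page still has room, which mirrors the behavior described for `SublineStrategy`.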
- Tests: All 394 tests passed, including regression tests for pagination and grouping.
- Linting: Codebase is clean (`ruff check` passed).
- Types: Type checking passed (`mypy .` passed), with explicit casting used for Polars aggregation results.