refactor: replace Any annotations, use Path for local paths, add missing docstrings#28
refactor: replace Any annotations, use Path for local paths, add missing docstrings#28
Conversation
📝 WalkthroughWalkthroughStandardizes type annotations (replacing Changes
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 3✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Actionable comments posted: 2
🧹 Nitpick comments (1)
divref/divref/main.py (1)
30-35: AddReturns:to complete the Google-style docstring block.Nice improvement adding
Args:. For full compliance, add aReturns:section (None) tosetup_logging’s docstring.Docstring completion diff
def setup_logging(level: str = "INFO") -> None: """ Set up basic logging to print to the console. Args: level: Logging level string (e.g. "INFO", "DEBUG", "WARNING"). + + Returns: + None. """As per coding guidelines, "Use Google-style docstrings with
Args:,Returns:,Yields:, andRaises:blocks."🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@divref/divref/main.py` around lines 30 - 35, The docstring for setup_logging is missing a Returns section; update the Google-style docstring of the setup_logging function to include a "Returns:" block that documents it returns None (e.g., "Returns: None: No return value."), keeping the existing Args: block and overall style consistent.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@divref/divref/alias.py`:
- Line 1: The HailExpression alias currently uses Any which disables type
checking; change the alias declaration (HailExpression: TypeAlias = Any) to
reference the concrete Hail expression type (e.g., HailExpression: TypeAlias =
hl.Expression or hail.expr.Expression), and ensure you import TypeAlias from
typing (or typing_extensions) and import hail as hl (or import
hail.expr.Expression) so the new type name resolves; update the alias definition
in the file (symbol: HailExpression, TypeAlias, Any) accordingly.
In `@divref/divref/main.py`:
- Line 13: The import of extract_gnomad_single_afs is failing because the module
divref.tools.extract_gnomad_single_afs is missing; either add a new module that
defines and exports the function extract_gnomad_single_afs (implement expected
CLI tool behavior) and then keep the registration in _tools, or remove the
extract_gnomad_single_afs import and its registration entry from the _tools list
in main.py until the new tool file is added so mypy/pytest can collect; locate
the symbol extract_gnomad_single_afs and the _tools registration in main.py to
apply the change.
---
Nitpick comments:
In `@divref/divref/main.py`:
- Around line 30-35: The docstring for setup_logging is missing a Returns
section; update the Google-style docstring of the setup_logging function to
include a "Returns:" block that documents it returns None (e.g., "Returns: None:
No return value."), keeping the existing Args: block and overall style
consistent.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 9dd7665b-5fe4-476b-aa7c-f107c6fc8876
📒 Files selected for processing (5)
divref/divref/alias.pydivref/divref/haplotype.pydivref/divref/main.pydivref/divref/tools/compute_haplotypes.pydivref/tests/tools/test_compute_haplotypes.py
Issues #20, #22, #25: - Add HailExpression type alias in alias.py to document Hail DSL expression parameters (instead of bare Any) - Replace Any with hl.Table/HailExpression in haplotype.py and compute_haplotypes.py function signatures - Add docstrings to nested helper functions in haplotype.py (get_chunk_until_next_variant, get_range) and compute_haplotypes.py (agg_haplos, collapse_haplos_across_samples, map_haplo_group, get_haplotype_summary, get_variant, get_gnomad_freq) - Change temp_dir parameter from HailPath to Path in compute_haplotypes (it is always a local filesystem path passed to hl.init) - Expand setup_logging docstring with Args section Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
ccb46c9 to
662a331
Compare
There was a problem hiding this comment.
Actionable comments posted: 5
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
divref/divref/tools/create_fasta_and_index.py (1)
296-305:⚠️ Potential issue | 🔴 CriticalCritical: Missing
reference_genomeargument in call tobuild_haplotype_table.The mypy pipeline error confirms that
reference_genomeis required bybuild_haplotype_table(line 31) but is not passed in this call. This also explains the ARG001 warning—reference_genomeis accepted as a parameter but never used.Proposed fix
ht, pops_legend = build_haplotype_table( haplotypes_table_path=haplotypes_table_path, gnomad_va_file=gnomad_va_file, reference_fasta=reference_fasta, window_size=window_size, frequency_cutoff=frequency_cutoff, merge=merge, version_str=version_str, + reference_genome=reference_genome, tmp_dir=tmp_dir, )🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@divref/divref/tools/create_fasta_and_index.py` around lines 296 - 305, The call to build_haplotype_table in create_fasta_and_index.py is missing the required reference_genome argument; update the call to pass the appropriate reference_genome variable (the same one used elsewhere in this module) so the signature matches build_haplotype_table(reference_genome=...,...). Ensure you add the named parameter reference_genome=reference_genome to the argument list for build_haplotype_table to resolve the mypy ARG001 error and ensure the parameter is used.
🧹 Nitpick comments (1)
divref/divref/tools/gnomad_hail_table_test_data.py (1)
50-50: Inconsistent: Missingstr()conversion for Hail write calls.Other tools in this PR consistently use
str(path)when passing paths to Hail I/O methods. These lines passPathobjects directly, which is inconsistent with the pattern established elsewhere (e.g.,extract_gnomad_afs.pyline 64,extract_sample_metadata.pyline 32).♻️ Proposed fix for consistency
logger.info(f"Writing {va_subset.count()} variants to {out_variant_annotation_table}.") - va_subset.write(out_variant_annotation_table, overwrite=True) + va_subset.write(str(out_variant_annotation_table), overwrite=True) sa = hl.read_table(in_gnomad_hgdp_sample_metadata) sa = sa.select("gnomad_population_inference").select_globals() logger.info(f"Writing {sa.count()} samples to {out_sample_metadata}.") - sa.naive_coalesce(1).write(out_sample_metadata, overwrite=True) + sa.naive_coalesce(1).write(str(out_sample_metadata), overwrite=True)Also applies to: 56-56
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@divref/divref/tools/gnomad_hail_table_test_data.py` at line 50, The write call uses a Path object directly; update the Hail I/O calls to pass string paths by wrapping the Path variables with str()—specifically change the va_subset.write call that takes out_variant_annotation_table to use str(out_variant_annotation_table) (and likewise any other write call at the referenced location, e.g., the second write at line 56) so the Hail write invocations are consistent with other tools.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@divref/divref/tools/compute_haplotype_statistics.py`:
- Around line 6-7: The project is missing the external dependency used by
compute_haplotype_statistics.py: add the fgmetric package (which provides Metric
and MetricWriter used as imports and base classes in
compute_haplotype_statistics.py) to the project's dependency declarations (e.g.,
pyproject.toml [tool.poetry.dependencies] or setup.cfg/requirements.txt as
appropriate for this repo) so the imports "from fgmetric import Metric,
MetricWriter" resolve and CI installs fgmetric before running tests or
packaging.
In `@divref/divref/tools/compute_variation_ratios.py`:
- Line 39: The assertion assert_path_is_readable(vcfs_path) conflicts with the
docstring and hl.import_vcf's glob support: remove or replace it so glob
patterns are allowed; specifically, in compute_variation_ratios.py update the
validation for vcfs_path (instead of calling assert_path_is_readable(vcfs_path))
either remove that check entirely or implement a glob-aware check (e.g., use
glob.glob on vcfs_path and assert that it yields at least one match) before
calling hl.import_vcf to ensure both literal paths and glob patterns work as
intended.
In `@divref/divref/tools/create_fasta_and_index.py`:
- Line 294: hl.init is being passed a Path object (tmp_dir) but expects a string
path; update the call to hl.init(tmp_dir=...) to pass tmp_dir as a str by
converting the Path (e.g., use str(tmp_dir)) so hl.init receives a string.
Ensure you change the hl.init invocation that currently reads
hl.init(tmp_dir=tmp_dir) and leave other logic untouched.
- Around line 55-59: The call to
hl.get_reference(...).add_sequence(reference_fasta) passes a Path object;
convert reference_fasta to a string so Hail's add_sequence receives a file path
string. Update the usage where reference_fasta is passed (the add_sequence call
in this scope) to call str(reference_fasta) or otherwise ensure it's a str; keep
the hl.get_reference(...) and add_sequence(...) calls unchanged except for
converting the argument to a string.
- Around line 3-18: The import block is unsorted: move "from divref import
defaults" into the local/project import group with "from divref.haplotype import
get_haplo_sequence" and "from divref.haplotype import split_haplotypes" (and
then alphabetize within that group) so all divref imports are grouped together;
ensure standard-library, third-party (duckdb, hail, polars, fgpyo), and local
imports are each in their own sections and run isort or follow the existing
project import ordering conventions to pass CI.
---
Outside diff comments:
In `@divref/divref/tools/create_fasta_and_index.py`:
- Around line 296-305: The call to build_haplotype_table in
create_fasta_and_index.py is missing the required reference_genome argument;
update the call to pass the appropriate reference_genome variable (the same one
used elsewhere in this module) so the signature matches
build_haplotype_table(reference_genome=...,...). Ensure you add the named
parameter reference_genome=reference_genome to the argument list for
build_haplotype_table to resolve the mypy ARG001 error and ensure the parameter
is used.
---
Nitpick comments:
In `@divref/divref/tools/gnomad_hail_table_test_data.py`:
- Line 50: The write call uses a Path object directly; update the Hail I/O calls
to pass string paths by wrapping the Path variables with str()—specifically
change the va_subset.write call that takes out_variant_annotation_table to use
str(out_variant_annotation_table) (and likewise any other write call at the
referenced location, e.g., the second write at line 56) so the Hail write
invocations are consistent with other tools.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 3f9c8dbc-caa0-4527-973e-ff1ed98fff88
⛔ Files ignored due to path filters (2)
divref/uv.lockis excluded by!**/*.lockpixi.lockis excluded by!**/*.lock
📒 Files selected for processing (18)
divref/divref/haplotype.pydivref/divref/main.pydivref/divref/tools/compute_haplotype_statistics.pydivref/divref/tools/compute_haplotypes.pydivref/divref/tools/compute_variation_ratios.pydivref/divref/tools/create_fasta_and_index.pydivref/divref/tools/create_gnomad_sites_vcf.pydivref/divref/tools/extract_gnomad_afs.pydivref/divref/tools/extract_sample_metadata.pydivref/divref/tools/gnomad_hail_table_test_data.pydivref/pyproject.tomldivref/tests/test_haplotype.pydivref/tests/tools/test_compute_haplotypes.pydivref/tests/tools/test_create_gnomad_sites_vcf.pydivref/tests/tools/test_extract_gnomad_afs.pydivref/tests/tools/test_extract_sample_metadata.pydivref/tests/tools/test_remap_divref.pydivref/tests/tools/test_rewrite_fasta.py
✅ Files skipped from review due to trivial changes (6)
- divref/divref/main.py
- divref/pyproject.toml
- divref/tests/tools/test_extract_sample_metadata.py
- divref/tests/tools/test_rewrite_fasta.py
- divref/tests/tools/test_remap_divref.py
- divref/tests/test_haplotype.py
🚧 Files skipped from review as they are similar to previous changes (3)
- divref/tests/tools/test_compute_haplotypes.py
- divref/divref/haplotype.py
- divref/divref/tools/compute_haplotypes.py
Summary
Addresses issues #20, #22, and #25 — all closely related code-quality improvements touching the same files.
Anyfor Hail variables #20: AddedHailExpressiontype alias inalias.pyto replace bareAnyin Hail DSL expression parameters. Updatedhaplotype.pyandcompute_haplotypes.pyto useHailExpression/hl.Tableinstead ofAny.Pathinstead of aliasedHailPathfor all input/output files that are not expected to be GCS hosted #22: Changedtemp_dirparameter incompute_haplotypesfromHailPathtoPath; it is always a local filesystem path passed tohl.init(). Updated the corresponding test.haplotype.py(get_chunk_until_next_variant,get_range) andcompute_haplotypes.py(agg_haplos,collapse_haplos_across_samples,map_haplo_group,get_haplotype_summary,get_variant,get_gnomad_freq). Expanded thesetup_loggingdocstring with a missingArgs:section.Test plan
uv run --directory divref poe check-allpasses (mypy and pytest; format/lint failures are pre-existing in the untrackedextract_gnomad_single_afs.pyfile)test_compute_haplotypesstill passes withtemp_dirasPath🤖 Generated with Claude Code
Summary by CodeRabbit
New Features
Refactor
Documentation
Chores