Review and fix CLI help/usage messages #1598

@ayesha-usmani

Summary

Review and fix inconsistencies and bugs in CLI command help/usage messages.

Issues Identified

1. Typo in dataset command description

Location: src/datachain/cli/parser/__init__.py:142

description="Commands for managing datasers",  # Should be "datasets"

2. Missing usage output for datachain dataset (no subcommand)

Running datachain dataset without a subcommand raises an error instead of printing usage/help:

Error: Unexpected command None

Root cause: The dataset subparser doesn't have required=True:

datasets_subparser = datasets_parser.add_subparsers(
    dest="datasets_cmd",
    help="Use `datachain datasets CMD --help` to display command specific help",
)
# Missing: required=True

Compare with the studio and job commands, which correctly enforce required subcommands.
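A minimal, self-contained sketch of the proposed fix, using only the standard argparse module. The parser layout is an assumption for illustration: the top-level dest name "command", the single "ls" subcommand, and the exit_code helper are hypothetical and not taken from the datachain source.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical, trimmed-down stand-in for the real datachain parser.
    parser = argparse.ArgumentParser(prog="datachain")
    subp = parser.add_subparsers(dest="command", required=True)

    datasets_parser = subp.add_parser(
        "dataset", description="Commands for managing datasets"
    )
    datasets_subparser = datasets_parser.add_subparsers(
        dest="datasets_cmd",
        required=True,  # the missing piece: a subcommand must be given
        help="Use `datachain dataset CMD --help` to display command specific help",
    )
    datasets_subparser.add_parser("ls", help="List datasets")  # illustrative only
    return parser


def exit_code(argv: list[str]) -> int:
    """Parse argv and return the exit status argparse would produce."""
    try:
        build_parser().parse_args(argv)
    except SystemExit as exc:  # argparse exits on usage errors
        return exc.code
    return 0
```

With required=True, calling the group without a subcommand makes argparse print the usage line plus an error naming the missing argument, then exit with status 2, so the handler never sees a None command.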

3. Debug flags visible in help output

The following flags are visible to all users but are intended for debugging only:

  • --pdb - Drop into pdb debugger on fatal exception
  • --debug-sql - Show all SQL queries (very verbose output)

Suggestion: Use argparse.SUPPRESS to hide these from help output:

parser.add_argument("--pdb", ..., help=argparse.SUPPRESS)
parser.add_argument("--debug-sql", ..., help=argparse.SUPPRESS)
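To illustrate the suggestion, here is a small sketch showing that flags registered with help=argparse.SUPPRESS still parse normally but disappear from the generated help. The action="store_true" settings and the --verbose flag are assumptions added for the demonstration, not the actual datachain definitions.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="datachain")
    # Debug-only flags: still accepted, but hidden from --help and usage.
    parser.add_argument("--pdb", action="store_true", help=argparse.SUPPRESS)
    parser.add_argument("--debug-sql", action="store_true", help=argparse.SUPPRESS)
    # A regular, user-facing flag for contrast (hypothetical).
    parser.add_argument("-v", "--verbose", action="count", default=0, help="Be verbose")
    return parser


parser = build_parser()
help_text = parser.format_help()
args = parser.parse_args(["--pdb", "--debug-sql"])
```

Developers who know the flags can keep using them; they are simply no longer advertised.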

4. Add spell checker to pre-commit

To prevent future typos, add a spell checker to pre-commit hooks (e.g., codespell).

Tasks

  • Fix "datasers" → "datasets" typo
  • Add required=True to dataset subparser (or print help when no subcommand given)
  • Suppress --pdb and --debug-sql from help output
  • Review all other command help messages for consistency
  • Add spell checker (codespell) to pre-commit configuration
  • Ensure all command groups behave consistently when called without subcommands
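The alternative named in the second task (print help when no subcommand is given) could look roughly like the sketch below. It is an assumption-laden illustration: the group_parser default, the "command" dest, the "ls" subcommand, and the main function are hypothetical names, not the existing datachain handlers.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="datachain")
    subp = parser.add_subparsers(dest="command")

    datasets_parser = subp.add_parser(
        "dataset", description="Commands for managing datasets"
    )
    datasets_subparser = datasets_parser.add_subparsers(dest="datasets_cmd")
    datasets_subparser.add_parser("ls", help="List datasets")  # illustrative only
    # Remember which parser owns this group so its help can be printed later.
    datasets_parser.set_defaults(group_parser=datasets_parser)
    return parser


def main(argv: list[str]) -> int:
    parser = build_parser()
    args = parser.parse_args(argv)
    if args.command is None:
        parser.print_help()
        return 1
    if args.command == "dataset" and args.datasets_cmd is None:
        # No subcommand: show the group's help instead of failing later.
        args.group_parser.print_help()
        return 1
    return 0
```

Compared with required=True, this variant prints the full help text rather than a short usage error, at the cost of an explicit check per command group.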

Files to Review

  • src/datachain/cli/parser/__init__.py - Main parser with typo and dataset subparser
  • src/datachain/cli/parser/studio.py - Studio command parser
  • src/datachain/cli/parser/job.py - Job command parser
  • src/datachain/cli/__init__.py - Command handlers
  • .pre-commit-config.yaml - Pre-commit configuration

Additional Notes

The studio and job commands correctly use required=True for their subparsers, so they show proper usage when called without a subcommand. The dataset command should follow the same pattern.
