
Replace .isdigit() with ASCII-only digit check #2722

Open
danfitz36 wants to merge 1 commit into Kozea:main from danfitz36:fix/ascii-digit-check-2721

@danfitz36

Summary

  • Add is_ascii_digits() utility that only matches ASCII digits (0-9), unlike Python's str.isdigit() which also matches non-ASCII digit characters (e.g. Arabic-Indic ٠١٢٣)
  • Replace all .isdigit() calls in presentational hint parsing (css/__init__.py) and PDF form field handling (pdf/anchors.py) with the new function
  • Affected attributes: cellpadding, hspace, vspace, width, height, maxlength
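The body of `is_ascii_digits()` is not shown in this description, so the following is only a minimal sketch of what such a helper could look like. The function name comes from the summary above; the implementation, the `ASCII_DIGITS` constant, and the decision to reject the empty string (matching `str.isdigit()`) are assumptions.

```python
import string

# Assumption: the set of accepted characters is exactly '0'..'9'.
ASCII_DIGITS = frozenset(string.digits)


def is_ascii_digits(value: str) -> bool:
    """Return True if value is non-empty and made only of ASCII digits 0-9."""
    # bool(value) rejects the empty string, as str.isdigit() does.
    return bool(value) and all(char in ASCII_DIGITS for char in value)
```

Unlike `str.isdigit()`, this rejects Arabic-Indic digits, fullwidth digits, and superscripts such as `²`, which Python classifies as digits.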

Context

Python's str.isdigit() returns True for non-ASCII digit characters, but the HTML spec defines digits as ASCII digits only (0-9). This could cause incorrect unit appending (e.g. treating Arabic-Indic digits as a bare number and appending px).
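To make the failure mode concrete, here is a small illustration. Both functions are hypothetical stand-ins for the hint-parsing logic, not WeasyPrint's actual code; they only show how a `str.isdigit()` guard lets a non-ASCII value through to the `px` suffix while an ASCII-only guard does not.

```python
def add_px_with_isdigit(value: str) -> str:
    """Hypothetical hint parser: append 'px' when str.isdigit() accepts the value."""
    return f"{value}px" if value.isdigit() else value


def add_px_ascii_only(value: str) -> str:
    """Same idea, but only ASCII digits 0-9 count as a bare number."""
    # str.isascii() is True for the empty string, so isdigit() still guards it.
    is_bare_number = value.isascii() and value.isdigit()
    return f"{value}px" if is_bare_number else value


arabic_indic = "\u0660\u0661\u0662\u0663"  # ٠١٢٣

print(add_px_with_isdigit("10"))          # 10px
print(add_px_with_isdigit(arabic_indic))  # ٠١٢٣px (unit appended to a non-ASCII value)
print(add_px_ascii_only("10"))            # 10px
print(add_px_ascii_only(arabic_indic))    # ٠١٢٣ (left untouched)
```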

Note: the two .isdigit() calls in svg/bounding_box.py are intentionally left unchanged — they check individual characters in SVG path data, a different context.

Related to #2636.

Test plan

  • Unit tests for is_ascii_digits: basic values, empty string, non-digits, non-ASCII digits
  • All existing tests pass (3985 passed, 0 regressions)
  • ruff check passes
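The four groups of cases listed in the test plan could be covered by a table-driven check like the one below. The PR's actual tests are not reproduced here, so the case table and the regex-based stand-in for `is_ascii_digits()` are both assumptions made for illustration.

```python
import re

# Stand-in for the PR's helper (assumption). [0-9] is ASCII-only, whereas
# \d in a str pattern would also match other Unicode decimal digits.
_ASCII_DIGITS_RE = re.compile(r"[0-9]+")


def is_ascii_digits(value: str) -> bool:
    return _ASCII_DIGITS_RE.fullmatch(value) is not None


CASES = [
    # basic values
    ("0", True), ("42", True), ("007", True),
    # empty string
    ("", False),
    # non-digits
    ("12px", False), ("-1", False), ("1.5", False), (" 1", False),
    # non-ASCII digits: Arabic-Indic, fullwidth, superscript
    ("\u0660\u0661\u0662\u0663", False), ("\uff11\uff12", False), ("\u00b2", False),
]


def failing_cases():
    """Return the (value, expected) pairs the helper gets wrong."""
    return [(v, expected) for v, expected in CASES if is_ascii_digits(v) is not expected]


assert failing_cases() == []
```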

Fixes #2721

… hints

Python's str.isdigit() returns True for non-ASCII digit characters
(e.g. Arabic-Indic digits), but the HTML spec defines digits as ASCII
digits only (0-9). Replace all .isdigit() calls in presentational hint
parsing and PDF form field handling with an ASCII-only helper.

Fixes Kozea#2721

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@liZe
Member

liZe commented Apr 13, 2026

Thanks for the pull request!

We’ll merge #2721 first and see what we can do for this. I think it may be better to include these changes in a larger fix that handles the min/max values from the HTML specification and checks input values more rigorously.

There’s no need to work more on this PR now, we’ll see after #2721.

Development

Successfully merging this pull request may close these issues.

Replace .isdigit() with ASCII-only digit check in presentational hints