ci: add docs metadata health check #1756

Open
omarespejel wants to merge 3 commits into starknet-io:main from omarespejel:feat/metadata-health-check

@omarespejel
Contributor

Summary

Add a lightweight live metadata health check for representative docs.starknet.io pages.

This complements the existing llms.txt validation and discovery endpoint checks. The goal is to catch deployed-site regressions that affect search, AI agents, and social/link previews without burdening normal docs content edits.

What This Checks

For key docs pages across the homepage, build, protocol, reference, and operations sections, the script checks:

  • HTTP 200 and text/html response
  • non-empty page title
  • canonical URL matches the public docs URL
  • no noindex directive in meta robots or X-Robots-Tag
  • presence of Open Graph/Twitter title metadata
  • presence of JSON-LD structured data
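The HTML-level checks in this list can be sketched with the standard library alone. This is a minimal illustration, not the contents of scripts/check_metadata_health.py: the names `MetadataParser` and `check_html` are hypothetical, and two behaviours are assumptions made for the sketch (a trailing slash is ignored when comparing the canonical URL, and either `og:title` or `twitter:title` is treated as sufficient).

```python
from html.parser import HTMLParser


class MetadataParser(HTMLParser):
    """Collects the head metadata inspected by the checks (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.canonical = None
        self.meta = {}            # lower-cased property/name -> content
        self.has_json_ld = False
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        # Attribute values may be None (e.g. bare attributes), so normalise them.
        a = {k: (v or "") for k, v in attrs}
        if tag == "title":
            self._in_title = True
        elif tag == "link" and "canonical" in a.get("rel", "").lower().split():
            self.canonical = a.get("href")
        elif tag == "meta":
            key = (a.get("property") or a.get("name") or "").lower()
            if key:
                self.meta[key] = a.get("content", "").strip()
        elif tag == "script" and a.get("type", "").lower() == "application/ld+json":
            self.has_json_ld = True

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def check_html(html, expected_url):
    """Return failure messages for one rendered page; an empty list means healthy."""
    parser = MetadataParser()
    parser.feed(html)
    failures = []
    if not parser.title.strip():
        failures.append("empty title")
    # Assumption: trailing slashes are not significant for the comparison.
    if (parser.canonical or "").rstrip("/") != expected_url.rstrip("/"):
        failures.append(f"canonical mismatch: {parser.canonical!r}")
    if "noindex" in parser.meta.get("robots", "").lower():
        failures.append("noindex in meta robots")
    # Assumption: either preview title is enough to pass.
    if not (parser.meta.get("og:title") or parser.meta.get("twitter:title")):
        failures.append("missing og:title/twitter:title")
    if not parser.has_json_ld:
        failures.append("missing JSON-LD")
    return failures
```

Keeping the parsing separate from the network fetch means the same function can be exercised against saved HTML fixtures as well as live responses.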

Meta descriptions are currently reported as non-blocking warnings because several existing docs pages do not emit them yet. This keeps the check useful without turning existing metadata debt into a blocking gate.
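One way to keep warnings non-blocking is to track them separately from failures and derive the exit code from failures only. The sketch below is illustrative; `PageReport`, `exit_code`, and `print_summary` are hypothetical names, not taken from the script in this PR.

```python
import sys
from dataclasses import dataclass, field


@dataclass
class PageReport:
    """Result for one checked page (hypothetical structure)."""
    url: str
    failures: list = field(default_factory=list)   # blocking problems
    warnings: list = field(default_factory=list)   # reported, never blocking


def exit_code(reports):
    """Return 1 if any page has a blocking failure, else 0. Warnings are ignored."""
    return 1 if any(r.failures for r in reports) else 0


def print_summary(reports, out=sys.stdout):
    """Print one line per finding, failures before warnings for each page."""
    for r in reports:
        for msg in r.failures:
            print(f"FAIL {r.url}: {msg}", file=out)
        for msg in r.warnings:
            print(f"WARN {r.url}: {msg}", file=out)
```

With this split, a missing meta description still shows up in the job log, and promoting it to a blocking check later only requires appending it to `failures` instead of `warnings`.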

Workflow Behavior

  • Runs manually via workflow_dispatch
  • Runs weekly on Monday
  • Runs on PRs only when this health-check script or workflow changes
  • Does not run on normal docs content edits

Examples Of Regressions This Catches

  • A key docs page is accidentally marked noindex.
  • A platform/CDN change removes canonical links from rendered docs pages.
  • A representative docs page stops returning HTML or starts returning an error.
  • Structured data or preview title metadata disappears from rendered pages.
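The response-level regressions above (an error status, a non-HTML response, or a noindex header) can be detected before any HTML is parsed. A minimal sketch follows, assuming the caller already has the status code and a plain dict of response headers; `check_response` is a hypothetical name.

```python
def check_response(status, headers):
    """Return failure messages for response-level regressions; empty means healthy."""
    # HTTP header names are case-insensitive, so normalise keys before lookup.
    h = {k.lower(): v for k, v in headers.items()}
    failures = []
    if status != 200:
        failures.append(f"unexpected status {status}")
    # Ignore parameters such as "; charset=utf-8" when comparing the media type.
    content_type = h.get("content-type", "")
    if content_type.split(";")[0].strip().lower() != "text/html":
        failures.append(f"unexpected content type {content_type!r}")
    if "noindex" in h.get("x-robots-tag", "").lower():
        failures.append("noindex in X-Robots-Tag")
    return failures
```

If the pages are fetched with `urllib.request.urlopen`, error statuses are raised as `urllib.error.HTTPError`, so a caller would catch that exception and pass its `code` and `headers` to the function rather than relying on a normal return.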

Validation

Ran locally:

python3 -m py_compile scripts/check_metadata_health.py
python3 scripts/check_metadata_health.py
python3 scripts/check_metadata_health.py --base-url file:///tmp; test $? -eq 1
python3 scripts/check_metadata_health.py --base-url https://docs.starknet.io/foo; test $? -eq 1
python3 scripts/validate_llms_txt.py

This change was staged first in omarespejel/starknet-docs, where AI-review feedback from CodeRabbit and Qodo was addressed before opening this upstream PR.
