Releases: yaniv-golan/proof-engine

v1.18.0

16 Apr 11:00

Added

  • KaTeX math rendering — mathematical notation in proof claims and narratives now renders as typeset math across all site surfaces. Three rendering paths: KaTeX client-side for headings and narrative markdown, pymdownx.arithmatex for markdown pipeline protection, and strip_latex() Unicode conversion for plain-text surfaces (OG tags, JSON-LD, citations, page titles).
  • tools/lib/latex_utils.py: strip_latex() function converts \(...\) LaTeX delimiters to Unicode equivalents (Greek letters, sub/superscripts, operators) for contexts where client-side rendering is unavailable.
  • tools/add-latex-to-claims.py — interactive script for retroactive conversion of math-heavy proof claims to use LaTeX delimiters. Supports dry-run mode, manual editing, and preserves proof.json/proof.py provenance parity. Skips DOI-backed proofs.
  • KaTeX v0.16.45 vendored — self-hosted CSS, JS, auto-render plugin, and 60 font files at site/static/vendor/katex/.
  • pymdownx.arithmatex — integrated into the markdown sanitizer to protect \(...\) and \[...\] delimiters from markdown processing. Configured with inline_syntax: ["round"] and block_syntax: ["square"] to avoid $...$ currency collisions.
  • Math rendering in catalog: renderMathInElement called after card rendering in both catalog.js and catalog-enhance.js.
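
To illustrate the plain-text path, here is a minimal sketch of what a \(...\)-to-Unicode conversion could look like. It is not the code in tools/lib/latex_utils.py: the symbol tables, the helper names, and the regexes are all assumptions chosen to keep the example short.

```python
import re

# Hypothetical, deliberately tiny symbol tables (the project's real mapping is not shown here).
GREEK = {r"\alpha": "α", r"\beta": "β", r"\pi": "π", r"\Lambda": "Λ", r"\Omega": "Ω"}
OPERATORS = {r"\pm": "±", r"\times": "×", r"\leq": "≤", r"\geq": "≥", r"\neq": "≠"}
SUPERSCRIPTS = str.maketrans("0123456789+-n", "⁰¹²³⁴⁵⁶⁷⁸⁹⁺⁻ⁿ")
SUBSCRIPTS = str.maketrans("0123456789+-", "₀₁₂₃₄₅₆₇₈₉₊₋")


def _script(table):
    """Build a re.sub callback that maps a ^{..}/^x or _{..}/_x group via `table`."""
    def repl(match):
        body = match.group(1) if match.group(1) is not None else match.group(2)
        return body.translate(table)
    return repl


def _convert_math(body: str) -> str:
    # Named commands first; the lookahead stops \pi from matching inside a longer name.
    for table in (GREEK, OPERATORS):
        for command, symbol in table.items():
            body = re.sub(re.escape(command) + r"(?![A-Za-z])", symbol, body)
    # Then superscripts and subscripts, braced or single-character.
    body = re.sub(r"\^\{([^{}]*)\}|\^(\w)", _script(SUPERSCRIPTS), body)
    body = re.sub(r"_\{([^{}]*)\}|_(\w)", _script(SUBSCRIPTS), body)
    return body.strip()


def strip_latex(text: str) -> str:
    """Replace each inline \\(...\\) span with a Unicode approximation."""
    return re.sub(r"\\\((.+?)\\\)", lambda m: _convert_math(m.group(1)), text)
```

Text outside the delimiters is left untouched, so a string such as "costs $5" passes through unchanged, which matches the reason given above for avoiding $...$ syntax.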

Changed

  • tools/build-site.py — registers strip_latex as a Jinja2 filter; pre-strips claim_natural in pipeline example data.
  • tools/lib/json_ld.py — applies strip_latex() to claimReviewed field.
  • tools/lib/citation.py — applies strip_latex() to citation claim text.
  • site/templates/proof.html: strip_latex filter on title, OG tags, meta description, and share bar; <h1> left raw for KaTeX client-side rendering.
  • site/templates/landing.html: strip_latex filter on myth-card claims and featured proofs data.
  • CI workflows: pymdown-extensions added to pip install in validate.yml (both jobs) and deploy-site.yml.
  • SKILL.md — added LaTeX delimiter guidance for proof authors.

v1.17.0

15 Apr 22:59

Added

  • proof_format_schema.json — single source of truth for proof markdown section requirements, shared between the proof-engine skill (producer) and site builder (consumer). Defines v1/v2 profiles for proof.md, proof_audit.md, and proof_narrative.md, plus conditional sections and template fallback mappings.

Changed

  • proof_loader.py — section requirements now read from proof_format_schema.json instead of hardcoded constants. Profile selection uses original_format_version to choose v1 or v2 validation rules.
  • narrative_validator.py — required narrative sections now sourced from schema instead of a hardcoded list.
  • proof.html — replaced format_version branching with fallback chains (Quality Checks or Hardening Checklist, Source Data or Extraction Records, audit or proof.md Claim Interpretation).
  • output-specs.md — added schema reference, documented ProofSummaryBuilder as primary emission path, fixed narrative heading casing to title-case.
  • SKILL.md — documented ProofSummaryBuilder in Bundled Scripts table and Key function signatures, updated emit_proof_summary gotcha, fixed narrative heading casing.

Fixed

  • Legacy emit_proof_summary() now defaults format_version to 2 — proofs generated via the legacy path no longer land with missing format_version, which caused the loader to apply v1 section requirements to v2-style proofs.
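
The schema-driven loading and the template fallback chains described in this release can be sketched roughly as follows. The schema layout, the choice of which sections are required, and the function names are assumptions for illustration; only the section names and the original_format_version field come from the notes above.

```python
# Hypothetical schema fragment; the real proof_format_schema.json layout is not shown here.
SCHEMA = {
    "proof.md": {
        "v1": {"required": ["Evidence Summary", "Hardening Checklist"]},
        "v2": {"required": ["Evidence Summary", "Quality Checks"]},
    }
}

# Fallback chains: try the v2 heading first, then the v1 heading.
FALLBACKS = {
    "Quality Checks": ["Quality Checks", "Hardening Checklist"],
    "Source Data": ["Source Data", "Extraction Records"],
}


def select_profile(proof: dict) -> str:
    # Assumption: a proof without original_format_version is validated as v1.
    return "v2" if proof.get("original_format_version", 1) >= 2 else "v1"


def missing_sections(proof: dict, filename: str, found: set) -> list:
    """Return the required sections for this proof's profile that were not found."""
    required = SCHEMA[filename][select_profile(proof)]["required"]
    return [name for name in required if name not in found]


def resolve_section(sections: dict, name: str):
    """Return the first available section body along the fallback chain, or None."""
    for candidate in FALLBACKS.get(name, [name]):
        if candidate in sections:
            return sections[candidate]
    return None
```

With a lookup like this, a template can ask for "Quality Checks" and still render a v1 proof that only has a "Hardening Checklist" section, without branching on the format version.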

v1.16.0

15 Apr 20:42

Added

  • rejection_statement field for disproof proofs — each empirical_facts entry in a disproof must include a rejection_statement field: the verbatim phrase from the quote that explicitly rejects the claim. validate_proof.py warns when the field is absent and raises an issue when it is present but not a substring of the associated quote. Replaces the 25-pattern REJECTION_MARKERS vocabulary scan.
  • is_time_sensitive field in CLAIM_FORMAL — proofs that depend on the current date declare "is_time_sensitive": True in CLAIM_FORMAL. validate_proof.py uses AST to read this field and enforces four behavioral branches (declared+today → pass; declared+no today → issue; today without declaration → warning; hardcoded date without today → issue). Replaces comment-strip + regex keyword scan.
  • verbatim field per empirical_facts entry — authors can declare "verbatim": False when a quote is paraphrased. validate_proof.py checks this field structurally: warns on verbatim: False, raises an issue on verbatim: True with an ellipsis (contradiction), and nudges on ellipsis without any declaration. Replaces ellipsis-only heuristic.
  • subclaim_to_sources map in CLAIM_FORMAL — compound proofs can declare an explicit subclaim_to_sources dict mapping each sub-claim ID to its list of empirical_facts keys. validate_proof.py Path 1 uses this map directly; Path 2 falls back to key-prefix inference for proofs that don't provide it.
  • AST-based Rule 5 check: adversarial_checks is verified via AST list-element count, not vocabulary scanning. Empty list → issue; count ≥ 1 → pass with count; non-list or missing → regex fallback.
  • W3C PROV-JSON export: tools/lib/prov.py generates provenance.json per proof: a W3C PROV-JSON provenance chain mapping each evidence entity, citation-verification activity, and cross-check derivation back to the Proof Engine agent. Included in RO-Crate packages.
  • SARIF 2.1.0 export: tools/lib/sarif.py converts validate_proof.py results to SARIF 2.1.0. Each hardening rule maps to a stable rule ID (PE001–PE010); issues are error level, warnings are warning. Enables integration with GitHub Code Scanning and other SARIF-aware tooling.
  • RO-Crate 1.1 packaging: tools/lib/ro_crate.py generates ro-crate-metadata.json per proof: a standards-compliant research object manifest listing all proof artifacts (proof.py, proof.json, proof.md, proof_audit.md, proof_narrative.md, provenance.json, proof.ipynb) with typed schema.org roles and DOI links.
  • Jupyter Notebook export: tools/build-site.py generates proof.ipynb per proof: a two-cell Jupyter Notebook that installs dependencies and re-runs proof.py in an interactive environment. Included in the RO-Crate manifest as a ComputationalNotebook.
  • Enhanced Schema.org JSON-LD — proof pages now include isBasedOn (links to each cited source URL), mainEntity (the ClaimReview block), and sameAs provenance links. Improves search engine and Linked Data discoverability.
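
The rejection_statement and verbatim checks described above are purely structural, which a short sketch can make concrete. This is not the code in validate_proof.py: the function name, the "quote" key, and the message wording are assumptions, and the "nudge" is modeled here as an ordinary warning.

```python
def check_fact(key: str, fact: dict, is_disproof: bool):
    """Return (issues, warnings) for one empirical_facts entry."""
    issues, warnings = [], []
    quote = fact.get("quote", "")  # assumed key name for the quoted text

    if is_disproof:
        statement = fact.get("rejection_statement")
        if statement is None:
            warnings.append(f"{key}: rejection_statement is absent")
        elif statement not in quote:
            issues.append(f"{key}: rejection_statement is not a substring of the quote")

    has_ellipsis = "..." in quote or "\u2026" in quote
    verbatim = fact.get("verbatim")
    if verbatim is False:
        warnings.append(f"{key}: quote is declared as paraphrased")
    elif verbatim is True and has_ellipsis:
        issues.append(f"{key}: verbatim is True but the quote contains an ellipsis")
    elif verbatim is None and has_ellipsis:
        warnings.append(f"{key}: quote contains an ellipsis; consider declaring verbatim")

    return issues, warnings
```

Every branch is a field-presence test or a substring test, so the outcome does not depend on interpreting the quote's meaning.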

Changed

  • validate_proof.py design principle — all new checks read structured fields declared by the LLM at generation time; validator does mechanical verification only (substring containment, list length, field presence). No semantic inference from free text.
  • Hardening rules documentation — updated validator notes for Rule 3 (is_time_sensitive behavioral branches), Rule 5 (AST non-empty list check), and Rule 8 (rejection_statement enforcement).
  • output-specs.md — added rejection_statement, Verbatim status, and Time sensitivity to Citation Verification Details.
  • template-qualitative.md — added is_time_sensitive comment to CLAIM_FORMAL, verbatim comment to empirical_facts, and expanded disproof variant to show rejection_statement field explicitly.
  • template-date-age.md: is_time_sensitive: True now included in CLAIM_FORMAL.
  • template-compound.md — commented subclaim_to_sources block added to CLAIM_FORMAL.
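
As a rough illustration of the AST-based checks behind Rule 3 and Rule 5, the sketch below reads declared fields from a proof script with Python's ast module instead of scanning text. The function names are hypothetical, detecting "today" as any call to a method named today is an assumption, and so is the outcome for a proof that neither declares time sensitivity nor uses a date.

```python
import ast


def top_level_assignments(source: str) -> dict:
    """Map each simple top-level `NAME = value` statement to its value node."""
    found = {}
    for node in ast.parse(source).body:
        if (isinstance(node, ast.Assign) and len(node.targets) == 1
                and isinstance(node.targets[0], ast.Name)):
            found[node.targets[0].id] = node.value
    return found


def declares_time_sensitive(assignments: dict) -> bool:
    """True if CLAIM_FORMAL is a dict literal containing "is_time_sensitive": True."""
    node = assignments.get("CLAIM_FORMAL")
    if not isinstance(node, ast.Dict):
        return False
    for key, value in zip(node.keys, node.values):
        if isinstance(key, ast.Constant) and key.value == "is_time_sensitive":
            return isinstance(value, ast.Constant) and value.value is True
    return False


def calls_today(source: str) -> bool:
    """Assumption: 'today' means a call such as date.today() anywhere in the script."""
    return any(
        isinstance(n, ast.Call) and isinstance(n.func, ast.Attribute) and n.func.attr == "today"
        for n in ast.walk(ast.parse(source))
    )


def rule3_outcome(declared: bool, uses_today: bool, has_hardcoded_date: bool) -> str:
    if uses_today:
        return "pass" if declared else "warning"
    if declared or has_hardcoded_date:
        return "issue"
    return "pass"  # assumption: nothing time-related was detected


def rule5_outcome(assignments: dict) -> str:
    node = assignments.get("adversarial_checks")
    if not isinstance(node, ast.List):
        return "regex fallback"  # non-list or missing
    return "issue" if not node.elts else f"pass ({len(node.elts)})"
```

Because the checks look at syntax nodes, a commented-out field or a keyword inside a string does not count as a declaration.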

v1.15.0

11 Apr 21:00

Added

  • Proof detail page redesign — restructured proof detail template with verdict qualifier line, jump links (summary · caveats · sources · audit trail), promoted counter-evidence section ("What could challenge this verdict?"), canonical sources table, collapsible downloads, and single generator footer
  • Format version support: format_version field in proof.json enables v1/v2 proof format branching in loader and template. V2 proofs use renamed sections (Quality Checks, Source Data) and move Claim Interpretation to the audit trail
  • _SOURCE_TYPE_DISPLAY_LABELS — capitalized source-type labels for the detail page, separate from the lowercase landing-page labels

Changed

  • Proof detail template — evidence accordion slimmed to 3 sections (Evidence Summary, Proof Logic, Conclusion); audit trail reordered with format-version-aware section list; page title truncated at 50 chars
  • Output specs (v2) — renamed Counter-Evidence Search → "What could challenge this verdict?", Hardening Checklist → Quality Checks, Extraction Records → Source Data; Claim Interpretation moved from proof.md to proof_audit.md
  • Proof loader — v1/v2 required/optional section lists; format_version hoisted to top-level proof dict; Claim Specification made optional for v1 (3 existing proofs lack it)

Fixed

  • Generator footer stripping — regex handles both plain and italic-wrapped (*Generated by...*) footers
  • Dead inline analytics script — removed from template (proof-enhance.js handles it)
  • Jump links spacing — fixed Jinja2 whitespace control for dot separators
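
For the footer fix, a regex that accepts both the plain and the italic-wrapped form might look like the sketch below. The actual pattern used by the site builder is not shown in these notes; the function name and the assumption that the footer is the last non-empty line are illustrative only.

```python
import re

# The optional leading "*" is captured and must be mirrored at the end of the line (\1),
# so "*Generated by ...*" and "Generated by ..." match, but a lone "*" does not.
FOOTER_LINE = re.compile(r"^\s*(\*?)Generated by\b.*\1\s*$")


def strip_generator_footer(markdown: str) -> str:
    """Drop a trailing 'Generated by' line, if present, and trailing whitespace."""
    lines = markdown.rstrip().split("\n")
    if lines and FOOTER_LINE.match(lines[-1]):
        lines.pop()
    return "\n".join(lines).rstrip()
```

Using a backreference keeps the two footer styles in one pattern instead of two separate alternatives.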

v1.12.0 — Agent Guardrails

09 Apr 15:45

Added

  • apply_verdict_qualifier() helper in computations.py — validates base verdict against the 5-value taxonomy and only appends "(with unverified citations)" to the 3 qualifiable verdicts (PROVED, DISPROVED, SUPPORTED). Prevents agents from constructing invalid verdict strings
  • emit_proof_summary() helper in computations.py — validates proof summary keys against the ProofData TypedDict schema before printing, raising ValueError on unknown keys. Prevents agents from inventing schema fields
  • Verdict validity check in validate_proof.py — detects invalid verdict strings and the += antipattern for building verdicts
  • FACT_REGISTRY format check in validate_proof.py — ensures registry entries are dicts (not plain strings) with required keys per fact type
  • claim_natural key check in validate_proof.py — warns when bare "claim" is used instead of the required "claim_natural" key
  • emit_proof_summary adoption check in validate_proof.py — warns when proofs use raw json.dumps instead of the schema-validated helper
  • Type guard in verify_citations.py: build_citation_detail() raises TypeError with actionable message when FACT_REGISTRY entries are strings instead of dicts
  • Key stripping in proof_runner.py — unknown keys are silently stripped from proof JSON during publish, with stderr warning. Last line of defense after generation-time validation
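
The verdict guardrail can be pictured with the following sketch. It is not the helper in computations.py: the function name and signature are assumptions, and because these notes name only the three qualifiable verdicts, the full taxonomy is passed in by the caller rather than guessed.

```python
QUALIFIABLE = {"PROVED", "DISPROVED", "SUPPORTED"}
QUALIFIER = " (with unverified citations)"


def qualify_verdict(base: str, has_unverified_citations: bool, taxonomy: set) -> str:
    """Validate `base` against `taxonomy`, then append the qualifier only where allowed."""
    if base not in taxonomy:
        raise ValueError(f"Invalid verdict {base!r}; expected one of {sorted(taxonomy)}")
    if has_unverified_citations and base in QUALIFIABLE:
        return base + QUALIFIER
    return base
```

A usage example, with two placeholder names standing in for the taxonomy values not listed here:

```python
# "PLACEHOLDER_A" and "PLACEHOLDER_B" are hypothetical stand-ins, not real verdicts.
taxonomy = QUALIFIABLE | {"PLACEHOLDER_A", "PLACEHOLDER_B"}
print(qualify_verdict("PROVED", True, taxonomy))
print(qualify_verdict("PLACEHOLDER_A", True, taxonomy))
```

Routing every verdict through one function removes the need for string concatenation in proof scripts, which is the += antipattern the validator now flags.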

Changed

  • All 6 proof templates refactored to use apply_verdict_qualifier() and emit_proof_summary(), replacing manual verdict construction and raw json.dumps
  • check_json_summary() updated to recognize emit_proof_summary() as a valid summary output method
  • Missing-section errors in proof_loader.py now include the list of found sections for easier debugging
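
The two layers of schema enforcement in this release (raise at generation time, strip at publish time) can be sketched as below. The ProofData fields shown are hypothetical, as are the function names; the real TypedDict and helpers live in the project and are not reproduced here.

```python
import json
import sys
from typing import TypedDict, get_type_hints


class ProofData(TypedDict, total=False):
    # Hypothetical subset of fields, for illustration only.
    claim_natural: str
    verdict: str


ALLOWED_KEYS = set(get_type_hints(ProofData))


def emit_summary(summary: dict) -> str:
    """Generation-time check: refuse to print a summary with unknown keys."""
    unknown = sorted(set(summary) - ALLOWED_KEYS)
    if unknown:
        raise ValueError(f"Unknown proof summary keys: {unknown}")
    line = json.dumps(summary)
    print(line)
    return line


def strip_unknown_keys(proof: dict) -> dict:
    """Publish-time fallback: drop unknown keys and warn on stderr."""
    unknown = sorted(set(proof) - ALLOWED_KEYS)
    if unknown:
        print(f"warning: stripping unknown keys {unknown}", file=sys.stderr)
    return {key: value for key, value in proof.items() if key in ALLOWED_KEYS}
```

Failing loudly at generation time gives the agent a chance to fix the proof, while the publish-time strip only guards against anything that slipped through.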

v1.11.0 — Citation Verification Hardening

09 Apr 15:48

Added

  • Inline LaTeX $...$ stripping in normalize_text() — arXiv abstract pages with raw LaTeX like $\Lambda$CDM and $H_0 = 67.4\pm 0.5$ now normalize correctly. Three-pass regex handles complex LaTeX, single-letter variables, and unadorned multi-letter tokens
  • Scoped Greek-to-ASCII transliteration — Greek letters from LaTeX output (Λ→L, Ω→O, etc.) are transliterated for matching, while non-LaTeX Greek (μm, ρ) is preserved to avoid false positives
  • Math operator spacing collapse — ar5iv MathML rendering produces Ω m = 0.315 ± 0.007 with spaces; new steps 3a/3b collapse Greek-Latin spacing and operator spacing
  • Closest-passage suggestion engine_find_closest_passage() uses Jaccard word-set similarity to show a diagnostic hint when quotes fail verification. Ephemeral (console output only, not persisted to proof.json)
  • GitHub raw README fallback — bare github.com/owner/repo URLs that return a JS-rendered React shell now fall back to raw.githubusercontent.com with multiple README filename candidates. Reports fetch_mode='github_raw'
  • Ellipsis detection in validate_proof.py — AST-based quote extraction warns when quotes contain ... or …, a strong signal of spliced non-adjacent text
  • Real-world demonstration search directive — Step 2 now prompts searching for practical applications of the claimed mechanism (not just benchmarks), after field testing revealed this gap
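
To show how a Jaccard-based hint can work, here is a minimal sketch that slides a window over the page text and keeps the passage whose word set overlaps most with the quote. It is not the project's _find_closest_passage(): the windowing strategy, tokenization, and return shape are assumptions.

```python
import re


def _words(text: str) -> set:
    """Lowercased word set; punctuation is ignored."""
    return set(re.findall(r"\w+", text.lower()))


def jaccard(a: set, b: set) -> float:
    """|A ∩ B| / |A ∪ B|, defined as 0.0 when both sets are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def closest_passage(quote: str, page_text: str):
    """Return (passage, score) for the best-scoring window of quote-length words."""
    quote_words = _words(quote)
    tokens = page_text.split()
    size = max(len(quote.split()), 1)
    best_passage, best_score = "", 0.0
    for start in range(max(len(tokens) - size + 1, 1)):
        passage = " ".join(tokens[start:start + size])
        score = jaccard(quote_words, _words(passage))
        if score > best_score:
            best_passage, best_score = passage, score
    return best_passage, best_score
```

Word-set similarity ignores order and repetition, which is usually enough for a diagnostic hint: it points the author at the passage they probably meant without claiming the quote was verified.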

Changed

  • Verbatim quoting enforcement — SKILL.md, hardening-rules.md, and environment-and-sources.md now explicitly prohibit paraphrased quotes with bad/good examples, a Quote Harvesting gate in Step 2, a pre-flight citation check in Step 3, and a Citation Recovery Loop as Step 5.5
  • PDF citation guidance — rewritten to recommend snapshot workflow using Claude Code's native PDF reading; arXiv section added recommending ar5iv HTML over arxiv.org/abs
  • Self-critique checklist — added verbatim quote verification and PDF snapshot checks

v1.10.0 — Formal Citations & Zenodo DOIs

09 Apr 15:48

chore: bump version to 1.10.0, update changelog and docs

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

v1.9.0 — Claim-Fidelity Hardening

09 Apr 15:48

release: v1.9.0 — claim-fidelity hardening (Rule 8, entailment gaps, …

v1.8.0 — Proof Narrative Layer

07 Apr 05:20

fix: audit accordion scrolls instead of expanding vertically

Root cause: overflow-x:auto on .audit-body.animated.open forces
overflow-y:auto per the CSS spec, creating a scroll container instead of
letting nested <details> expand the panel. In addition, transitionend can
silently fail, which left max-height capped at the initial pixel value;
a setTimeout fallback now covers that case.

Fix: use overflow:visible on the open audit body, move overflow-x:auto
to inner pre/table elements only.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

v1.7.0

06 Apr 15:05

Added

  • Scripts: context-dependent <sup>/<sub> handling — superscripts in running prose are stripped (e.g., footnote markers), but preserved as exponents in mathematical/scientific contexts (e.g., "10²", "m²")
  • Scripts: inline HTML tag stripping without injecting spaces — tags like <span>, <em>, <a> inside quotes no longer break matching
  • Scripts: two-pass matching — first try exact match on cleaned text, then fall back to substring search
  • Scripts: expanded Unicode invisible character normalization — strips zero-width spaces (U+200B), zero-width non-joiners (U+200C), zero-width joiners (U+200D), word joiners (U+2060), left-to-right/right-to-left marks (U+200E/U+200F), soft hyphens (U+00AD), and variation selectors (U+FE00–U+FE0F)
  • Scripts: MathML <math> tag extraction — extracts alttext attribute content and converts LaTeX notation to readable text via new latex_text.py module
  • Scripts: latex_text.py — converts LaTeX math notation (fractions, Greek letters, operators, superscripts/subscripts) to plain text for citation matching
  • Tests: integration tests for all three false-negative classes (superscript/inline-tag, invisible Unicode, MathML alttext)
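
The invisible-character normalization listed above can be sketched with a single character class. The code points are the ones named in the list; the function names and the whitespace handling in the containment check are assumptions, not the project's implementation.

```python
import re

# Zero-width space/non-joiner/joiner, word joiner, LTR/RTL marks, soft hyphen,
# and variation selectors U+FE00 through U+FE0F.
INVISIBLE = re.compile("[\u200b\u200c\u200d\u2060\u200e\u200f\u00ad\ufe00-\ufe0f]")


def strip_invisible(text: str) -> str:
    """Remove invisible characters without inserting any replacement."""
    return INVISIBLE.sub("", text)


def contains_quote(quote: str, page_text: str) -> bool:
    """Check containment after stripping invisibles and collapsing whitespace."""
    def clean(text: str) -> str:
        return " ".join(strip_invisible(text).split())
    return clean(quote) in clean(page_text)
```

Removing these characters outright, rather than replacing them with spaces, matters because they often sit inside a word, where a substituted space would split the word and break the match.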

Fixed

  • 4 site proofs upgraded from "with unverified citations" to clean verdicts: smartphone-screens..., the-assertion-that-no-arab-state..., the-schwarzschild-radius..., current-ai-systems-have-already...