Skip to content

Partial proof for weighted_correlated_agreement_for_parameterized_curves'#419

Closed
aleph-prover[bot] wants to merge 1 commit intomainfrom
ai-prover-20260310_063640
Closed

Partial proof for weighted_correlated_agreement_for_parameterized_curves'#419
aleph-prover[bot] wants to merge 1 commit intomainfrom
ai-prover-20260310_063640

Conversation

@aleph-prover
Copy link
Contributor

@aleph-prover aleph-prover bot commented Mar 10, 2026

⚠️ Your proof request could only be partially completed
Some theorems may only have informal proofs available.
Some hard-to-prove facts may have been replaced with axioms.


Proven lemmas: 25/26

The main goal is to prove a “weighted correlated agreement over parameterized curves” theorem (BCIKS20 Thm 7.2, Version II): if many curve-parameter values z give weighted agreement ≥ α between the curve word i ↦ ∑j z^j u_j(i) and some Reed–Solomon codeword of degree < deg, then there exist global Reed–Solomon codewords v_j such that the set of coordinates i where all u_j(i) = v_j(i) has μ-measure ≥ α.

The proof is decomposed into (i) basic lemmas about extracting a witness codeword from an “agree_set” maximum, (ii) converting between the finitary carrier finCarrier and ReedSolomon.code, (iii) an “extraction” pipeline that from a large set S of good parameters produces a large subset S′ and global coefficients v_j matching the per-z decoded codewords, and (iv) a final lemma that turns “agreement for all z ∈ S′” into correlated agreement on coordinates.

Progress so far is strong on the elementary and algebraic glue: most helper lemmas are finished (20 nodes proven out of 28 total), including the machinery to pick codewords from agree_set, membership equivalences, and a clean reduction to a bivariate-polynomial representation of curve codewords (bivariatePoly_to_curveCodewords is proved, along with supporting coefficient/degree lemmas).

What remains is essentially one mathematical bottleneck: the “curve extraction” step that is the curve-analogue of BCIKS20 Section 5 (for lines). Concretely, the key missing lemma is the one that, from many z with good agreement, extracts a large S′ and a single bivariate polynomial P(Z,X) (degree in Z ≤ l+1, degree in X < deg) such that for all z ∈ S′ we have P(z, X) equal to the chosen decoder polynomial p_z; several downstream statements are currently blocked because this lemma is not available in the library.

Once that extraction lemma is supplied/proved, the rest of the proof is ready: it immediately yields global v_j, transfers the per-z agreement to agreement between curveEval(u,z) and curveEval(v,z) on S′, and then applies the already-available “sufficiently large list agreement on a curve ⇒ correlated agreement” lemma to conclude μ_set μ {i | ∀j, u_j(i)=v_j(i)} ≥ α.

The main challenge is that the missing step is not just algebraic rewriting; it packages the Guruswami–Sudan/Johnson-bound combinatorial-geometric argument specialized to curves (the heavy part of the paper), whereas the rest of the development is largely “glue code” plus polynomial interpolation/degree bookkeeping.

@github-actions
Copy link

🤖 Gemini PR Summary

Mathematical Formalization

  • Formalizes components of BCIKS20 Theorem 7.2 (Version II), which establishes weighted correlated agreement for parameterized curves.
  • Defines curve words (linear combinations of sequences) and their weighted agreement with Reed-Solomon codewords.
  • Introduces bivariatePoly_to_curveCodewords to provide a mapping between individual parameter evaluations and global coefficients.
  • Implements an "extraction pipeline" to derive a single set of global codewords from parameters exhibiting high local agreement.
  • Establishes degree bookkeeping and membership equivalence lemmas for converting between finitary carriers and the ReedSolomon.code type.

Proof Status and Placeholders

  • CRITICAL: This is a partial proof. The core lemma curve_codeword_extraction_common_agree_points contains a sorry placeholder, representing the "curve extraction" bottleneck (the combinatorial-geometric argument from BCIKS20 Section 5).
  • The main theorem weighted_correlated_agreement_for_parameterized_curves' is structurally complete but remains logically dependent on the aforementioned sorry.
  • 25 of 26 lemmas are reported as proven, though a discrepancy exists in the PR body which also mentions "20 nodes proven out of 28 total."
  • Completes the algebraic "glue," including machinery for selecting witness codewords from maximal agreement sets and reducing the problem to bivariate polynomial interpolation.

API and Structural Changes

  • Decomposes the high-level theorem into smaller logical nodes to improve the maintainability of the proximity gap proof suite.
  • Provides a structured API for agree_set maximums, facilitating the extraction of witness codewords in coding theory contexts.
  • Standardizes transformations between curve-parameter values and Reed-Solomon decoded polynomials, bridging point-wise evaluation and algebraic representation.

Statistics

Metric Count
📝 Files Changed 1
Lines Added 781
Lines Removed 77

Lean Declarations

✏️ **Added:** 25 declaration(s)
  • theorem bivarCoeffPoly_degree_le {F : Type} [Semiring F] {l : ℕ} (P : Polynomial (Polynomial F)) (j : Fin (l + 2)) : in ArkLib/Data/CodingTheory/ProximityGap/BCIKS20.lean
  • theorem bivariatePolyOfD_map_evalRingHom {F : Type} [Field F] [DecidableEq F] in ArkLib/Data/CodingTheory/ProximityGap/BCIKS20.lean
  • theorem bivarCoeffPoly_map_evalRingHom_eval_apply {F : Type} [Field F] {l : ℕ} (P : Polynomial (Polynomial F)) in ArkLib/Data/CodingTheory/ProximityGap/BCIKS20.lean
  • theorem exists_poly_of_mem_RS_code [Fintype ι] [Field F] in ArkLib/Data/CodingTheory/ProximityGap/BCIKS20.lean
  • theorem bivariatePolyOfD_degreeX_le {F : Type} [Field F] [DecidableEq F] in ArkLib/Data/CodingTheory/ProximityGap/BCIKS20.lean
  • theorem curve_codeword_extraction_common_agree_points [DecidableEq ι] [Fintype ι] [Nonempty ι] [DecidableEq F] [Field F] [Fintype F] in ArkLib/Data/CodingTheory/ProximityGap/BCIKS20.lean
  • theorem exists_bivariatePoly_of_D [Fintype ι] [DecidableEq ι] [Field F] [DecidableEq F] in ArkLib/Data/CodingTheory/ProximityGap/BCIKS20.lean
  • theorem bivariatePolyOfD_degree_lt_card {F : Type} [Field F] [DecidableEq F] in ArkLib/Data/CodingTheory/ProximityGap/BCIKS20.lean
  • theorem curve_codeword_extraction_matching_poly_polys [DecidableEq ι] [Fintype ι] [Nonempty ι] [DecidableEq F] [Field F] [Fintype F] in ArkLib/Data/CodingTheory/ProximityGap/BCIKS20.lean
  • theorem bivarCoeffPoly_map_evalRingHom_eval {F : Type} [Field F] {l : ℕ} (P : Polynomial (Polynomial F)) in ArkLib/Data/CodingTheory/ProximityGap/BCIKS20.lean
  • theorem exists_codeword_of_mem_S [DecidableEq ι] [Fintype ι] [DecidableEq F] {l : ℕ} {u : Fin (l + 2) → ι → F} in ArkLib/Data/CodingTheory/ProximityGap/BCIKS20.lean
  • theorem bivariatePolyOfD_eval {F : Type} [Field F] [DecidableEq F] in ArkLib/Data/CodingTheory/ProximityGap/BCIKS20.lean
  • theorem bivarCoeffPoly_coeff {F : Type} [Semiring F] {l : ℕ} (P : Polynomial (Polynomial F)) (j : Fin (l + 2)) (t : ℕ) : in ArkLib/Data/CodingTheory/ProximityGap/BCIKS20.lean
  • theorem agree_set_ge_exists [Fintype ι] [DecidableEq F] in ArkLib/Data/CodingTheory/ProximityGap/BCIKS20.lean
  • theorem bivariatePolyOfD_coeff {F : Type} [Field F] [DecidableEq F] in ArkLib/Data/CodingTheory/ProximityGap/BCIKS20.lean
  • theorem agree_le_card_div [DecidableEq ι] [Fintype ι] [DecidableEq F] in ArkLib/Data/CodingTheory/ProximityGap/BCIKS20.lean
  • theorem bivariatePoly_to_curveCodewords [DecidableEq ι] [Fintype ι] [Nonempty ι] [DecidableEq F] [Field F] [Fintype F] in ArkLib/Data/CodingTheory/ProximityGap/BCIKS20.lean
  • theorem curve_codeword_extraction [DecidableEq ι] [Fintype ι] [DecidableEq F] {l : ℕ} {u : Fin (l + 2) → ι → F} in ArkLib/Data/CodingTheory/ProximityGap/BCIKS20.lean
  • theorem curve_codeword_extraction_matching_poly [DecidableEq ι] [Fintype ι] [Nonempty ι] [DecidableEq F] [Field F] [Fintype F] in ArkLib/Data/CodingTheory/ProximityGap/BCIKS20.lean
  • theorem curve_codeword_extraction_matching_poly_polys_unweighted_aux [DecidableEq ι] [Fintype ι] [Nonempty ι] [DecidableEq F] [Field F] [Fintype F] in ArkLib/Data/CodingTheory/ProximityGap/BCIKS20.lean
  • theorem natDegree_coeff_le_degreeX {F : Type} [Semiring F] (P : Polynomial (Polynomial F)) (t : ℕ) : in ArkLib/Data/CodingTheory/ProximityGap/BCIKS20.lean
  • theorem mem_finCarrier_iff_mem_RS_code [Fintype ι] [Field F] [Fintype F] in ArkLib/Data/CodingTheory/ProximityGap/BCIKS20.lean
  • theorem poly_map_C_map_evalRingHom {F : Type} [CommSemiring F] in ArkLib/Data/CodingTheory/ProximityGap/BCIKS20.lean
  • theorem extract_subset_curve_codewords [DecidableEq ι] [Fintype ι] [DecidableEq F] {l : ℕ} {u : Fin (l + 2) → ι → F} in ArkLib/Data/CodingTheory/ProximityGap/BCIKS20.lean
  • theorem mem_S_iff_agree_set_ge [DecidableEq ι] [Fintype ι] [DecidableEq F] {l : ℕ} {u : Fin (l + 2) → ι → F} in ArkLib/Data/CodingTheory/ProximityGap/BCIKS20.lean
✏️ **Affected:** 3 declaration(s) (line number changed)
  • theorem weighted_correlated_agreement_over_affine_spaces in ArkLib/Data/CodingTheory/ProximityGap/BCIKS20.lean moved from L786 to L1566
  • theorem weighted_correlated_agreement_over_affine_spaces' in ArkLib/Data/CodingTheory/ProximityGap/BCIKS20.lean moved from L822 to L1602
  • theorem weighted_correlated_agreement_for_parameterized_curves' [DecidableEq ι] [Fintype ι] [DecidableEq F] {k l : ℕ} {u : Fin (l + 2) → ι → F} in ArkLib/Data/CodingTheory/ProximityGap/BCIKS20.lean moved from L755 to L1521

sorry Tracking

✅ **Removed:** 3 `sorry`(s)
  • theorem curve_codeword_extraction_matching_poly [DecidableEq ι] [Fintype ι] [Nonempty ι] [DecidableEq F] [Field F] [Fintype F] in ArkLib/Data/CodingTheory/ProximityGap/BCIKS20.lean (L775)
  • theorem curve_codeword_extraction [DecidableEq ι] [Fintype ι] [DecidableEq F] {l : ℕ} {u : Fin (l + 2) → ι → F} in ArkLib/Data/CodingTheory/ProximityGap/BCIKS20.lean (L809)
  • theorem extract_subset_curve_codewords [DecidableEq ι] [Fintype ι] [DecidableEq F] {l : ℕ} {u : Fin (l + 2) → ι → F} in ArkLib/Data/CodingTheory/ProximityGap/BCIKS20.lean (L842)
❌ **Added:** 3 `sorry`(s)
  • theorem weighted_correlated_agreement_over_affine_spaces in ArkLib/Data/CodingTheory/ProximityGap/BCIKS20.lean (L1589)
  • theorem weighted_correlated_agreement_over_affine_spaces' in ArkLib/Data/CodingTheory/ProximityGap/BCIKS20.lean (L1622)
  • theorem curve_codeword_extraction_common_agree_points [DecidableEq ι] [Fintype ι] [Nonempty ι] [DecidableEq F] [Field F] [Fintype F] in ArkLib/Data/CodingTheory/ProximityGap/BCIKS20.lean (L941)

🎨 **Style Guide Adherence**

There are more than 20 violations in this diff. They are grouped by rule below:

  • Rule: "Avoid empty lines inside definitions or proofs."

    • Count: 86 violations
    • Representative Examples: Line 763, line 771, line 920.
  • Rule: "Every definition and major theorem should have a docstring."

    • Count: 29 violations
    • Representative Examples: Line 753 (agree_le_card_div), line 815 (bivariatePolyOfD), line 890 (curveEval).
  • Rule: "u, v, w, ... : Universes", "α, β, γ, ... : Generic types", "i, j, k, ... : Integers", "s, t, ... : Sets or Lists"."

    • Count: 60+ violations
    • Representative Examples: Line 815 (variable u used for a curve instead of a universe), line 896 (variable α used for an ℝ≥0 value instead of a generic type), line 789 (variable j used for a Fin natural index instead of an integer).
  • Rule: "Prefer fun x ↦ ... over λ x, ...."

    • Count: 26 violations
    • Representative Examples: Line 891 (fun i => ...), line 936 (fun i => ...), line 1016 (fun j i => ...).
  • Rule: "Avoid parentheses where possible. Use <| (pipe left) and |> (pipe right) to reduce nesting."

    • Count: 25 violations
    • Representative Examples: Line 756 ((Fintype.card ι)), line 774 ((agree μ w)), line 1210 ((P.map (Polynomial.evalRingHom z))).
  • Rule: "Avoid (ge) and > (gt) in theorem statements unless necessary for argument ordering. In adherence with mathlib, we standardize on (le) and < (lt)."

    • Count: 13 violations
    • Representative Examples: Line 771 (... ≥ α), line 899 (S.card > ...), line 1264 (Finset.card S > ...).
  • Rule: "Symbol Naming Dictionary: Logic (= eq, ≥/≤ le/ge)..."

    • Count: 2 violations
    • Representative Examples: Line 768 (agree_set_ge_exists uses ge instead of the standardized le), line 938 (mem_S_iff_agree_set_ge uses ge).
  • Rule: "Inequalities with Bottom/Top: In assumptions, use x ≠ ⊥ (easier to check)."

    • Count: 1 violation
    • Representative Example: Line 1130 (uses 0 < deg as a hypothesis where deg ≠ 0 is preferred for naturals).

📄 **Per-File Summaries**
  • ArkLib/Data/CodingTheory/ProximityGap/BCIKS20.lean: This file introduces several new definitions and lemmas to formalize the weighted correlated agreement for parameterized curves from the BCIKS20 paper, including bivariate polynomial interpolation and codeword extraction techniques. The changes provide a structured proof for weighted_correlated_agreement_for_parameterized_curves' by decomposing it into smaller combinatorial and algebraic steps. A significant placeholder (sorry) remains in the core extraction lemma curve_codeword_extraction_common_agree_points.

Last updated: 2026-03-10 06:38 UTC.

@quangvdao
Copy link
Collaborator

Closing: partial proof for weighted correlated agreement variant — large BCIKS20 patch with new sorrys/placeholders. Donor-only.

@quangvdao quangvdao closed this Mar 10, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant