proofs for lemma distInterleavedCodeToCodeLB, e_leq_dist_over_3 and probOfBadPts #297

Open
FawadHa1der wants to merge 2 commits into Verified-zkEVM:main from FawadHa1der:InterleavedCodes

@FawadHa1der

For probOfBadPts I added some extra assumptions, and e_leq_dist_over_3 has been proved in two versions, one under a weaker assumption and one under a stronger assumption. I can delete the weaker one if required. Created with the help of Codex and Claude.

@github-actions

github-actions bot commented Jan 26, 2026

🤖 Gemini PR Summary

This Pull Request advances the formalization of coding theory results from the AHIV22 paper, specifically focusing on row-span distance lower bounds and Reed-Solomon proximity gap lemmas.

Features

  • Formalized AHIV22 Lemmas: Completed the proofs for Lemmas 4.3, 4.4, and 4.5 from the AHIV22 paper within ArkLib.
  • Proximity Gap Proofs: Added the formal proof for distInterleavedCodeToCodeLB, which establishes the lower bound for interleaved codes.
  • Probability Analysis: Implemented the proof for probOfBadPts, incorporating necessary assumptions to formalize the probability of "bad points" in the proximity gap context.

Refactoring

  • Assumption Refinement:
    • Provided two versions of the proof for e_leq_dist_over_3 (one with weaker assumptions and one with stronger) to allow for flexibility in how the lemma is applied.
    • Updated probOfBadPts with extra assumptions required for formal verification.

Documentation

  • Paper References: The code in ArkLib/Data/CodingTheory/ProximityGap/AHIV22.lean is now mapped directly to the corresponding lemma numbers in the original AHIV22 publication for better traceability.

Analysis of Changes

| Metric | Count |
| --- | --- |
| 📝 Files Changed | 1 |
| Lines Added | 2020 |
| Lines Removed | 32 |

sorry Tracking

✅ **Removed:** 3 `sorry`(s)
  • lemma probOfBadPts {deg : ℕ} {α : ι ↪ F} {e : ℕ} {U_star : WordStack (A in ArkLib/Data/CodingTheory/ProximityGap/AHIV22.lean
  • lemma distInterleavedCodeToCodeLB in ArkLib/Data/CodingTheory/ProximityGap/AHIV22.lean
  • lemma e_leq_dist_over_3_strong in ArkLib/Data/CodingTheory/ProximityGap/AHIV22.lean

🎨 **Style Guide Adherence**

The following code changes violate the ArkLib style guide:

  • Lines 131, 625, 809: lemma distInterleavedCodeToCodeLB, lemma dirClose_of_manyClosePts, lemma probOfBadPts

    • Rule violated: "Theorems and Proofs: snake_case (e.g., add_comm, list_reverse_id)."
  • Lines 335, 411: lemma e_leq_dist_over_3, lemma e_leq_dist_over_3_strong

    • Rule violated: "Note: In adherence with mathlib, we standardize on ≤ (le) and < (lt)." (also violates symbol naming for le).
  • Lines 126, 290, 297, 328, 406, 622, 804: (Block comments /- ... -/ used for docstrings)

    • Rule violated: "Declaration Docstrings: Use /-- ... -/ above definitions."
  • Lines 169, 182, 185, 203, 209, 219, 230, 244, 252, 260, 271, 278, 301, 313, 316, 351, 357, 360, 371, 375, 381, 389, 424, 431, 434, 447, 451, 460, 464, 471, 478, 481, 487, 492, 501, 510, 553, 574, 597, 610, 618, 638, 646, 649, 659, 663, 672, 676, 685, 692, 695, 701, 706, 715, 724, 767, 788, 797, 824, 831, 848, 856, 859, 867, 874, 882, 891, 915, 936, 1007, 1011, 1021: (Empty lines inside proofs)

    • Rule violated: "Empty Lines: Avoid empty lines inside definitions or proofs."
  • Lines 29, 30, 33, 41, 58, 59, 63, 72, 73, 74, 341, 345, 348, 358, 419, 422, 425, 432, 503, 506, 511, 515, 631, 635, 638, 647, 727, 730, 736, 739, 746, 755, 764, 768, 814, 836, 849, 851, 861, 868, 871, 875, 883, 893, 898, 903, 911, 916, 918, 937, 946, 951, 960, 969, 1002, 1012, 1016, 1017, 1018: (Use of => in anonymous functions)

    • Rule violated: "Functions: Prefer fun x ↦ ... over λ x, ..."
  • Lines 29, 33, 34, 35, 41, 54, 131, 310, 338, 414, 628, 812, 1047, 1052, 1058, 1063, 1068, 1073: (Use of u, v, or w as elements of a generic type)

    • Rule violated: "u, v, w, ... : Universes" and "x, y, z, ... : Elements of a generic type"
  • Lines 54, 133, 293, 298, 304, 310, 337, 413, 627, 812, 1047, 1052, 1058, 1063: (Use of e as a natural number)

    • Rule violated: "m, n, k, ... : Natural numbers"
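
As a rough illustration of several of the rules quoted above, here is a minimal Lean sketch, assuming a recent Mathlib. The names `nonzeroCoords` and `card_nonzero_coords_le` are hypothetical and do not come from the PR; the sketch only shows a `/-- ... -/` docstring, a `snake_case` lemma name that spells `≤` as `le`, `fun j ↦ ...` in place of `fun j => ...`, and `x` rather than `u` for an element of a generic type.

```lean
import Mathlib

-- Hypothetical setting: a finite index type and a value type with a zero element.
variable {ι F : Type*} [Fintype ι] [Zero F] [DecidableEq F]

/-- The coordinates at which `x` is nonzero (hypothetical helper, not from the PR). -/
def nonzeroCoords (x : ι → F) : Finset ι :=
  Finset.univ.filter (fun j ↦ x j ≠ 0)

/-- The number of nonzero coordinates is at most the size of the index type. -/
lemma card_nonzero_coords_le (x : ι → F) :
    (nonzeroCoords x).card ≤ Fintype.card ι :=
  Finset.card_le_univ _
```

There are no empty lines inside the definition or the proof, which matches the "Empty Lines" rule as quoted.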

📄 **Per-File Summaries**
  • ArkLib/Data/CodingTheory/ProximityGap/AHIV22.lean: This change formalizes the proofs for the row-span distance lower bound and the Reed-Solomon proximity gap lemmas (4.3, 4.4, and 4.5) from the AHIV22 paper.

Last updated: 2026-02-18 11:22 UTC.

@FawadHa1der
Author

I was trying to implement probOfBadPts, but the AI suggested we prove distInterleavedCodeToCodeLB and e_leq_dist_over_3 first. I didn't mean to duplicate or interfere with work already in progress. My apologies if that inadvertently happened. @erdkocak @DimitriosMitsios I know this is a collaborative effort and I should be sensitive to someone else's efforts.

@alexanderlhicks tagging you so that you can check the assumptions and the solution.

@erdkocak
Contributor

No problem, thanks for the great work. It seems the AI took a different approach from the paper, using the already proven results from DG25. I believe this shows that as the library gets bigger, proving new results will become much easier.

@chung-thai-nguyen
Collaborator

@FawadHa1der Thanks for the good work. At a quick glance, the code looks good and clean, and hF & hdeg are reasonable/implicit assumptions for RS codes (& derived from he). Lemma 4.4 uses OR instead of XOR, but that seems good enough for its use in Lemma 4.5, and it's not used anywhere else. Existing lemmas & definitions are applied fluently, which is a good point, though we still need a deeper review. As a side question, which subscription plans for Codex or Claude did you use for this PR, and did it take you much time to produce it?
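
To make the OR versus XOR remark concrete, here is a small Lean sketch, assuming a recent Mathlib. `P` and `Q` are hypothetical placeholder propositions, not the actual disjuncts of Lemma 4.4. It only illustrates that an exclusive disjunction implies the inclusive one and that the converse fails in general, so a lemma stated with OR concludes less than the XOR form would.

```lean
import Mathlib

-- An exclusive disjunction, written out explicitly, implies the inclusive one.
example (P Q : Prop) (h : (P ∧ ¬Q) ∨ (Q ∧ ¬P)) : P ∨ Q := by
  rcases h with ⟨hP, _⟩ | ⟨hQ, _⟩
  · exact Or.inl hP
  · exact Or.inr hQ

-- The converse fails in general: with P = Q = True both disjuncts hold at once.
example : ¬ (∀ P Q : Prop, P ∨ Q → (P ∧ ¬Q) ∨ (Q ∧ ¬P)) := by
  intro h
  rcases h True True (Or.inl trivial) with ⟨_, hn⟩ | ⟨_, hn⟩
  · exact hn trivial
  · exact hn trivial
```

Whether the weaker OR form suffices depends on how Lemma 4.5 consumes it, which is the point the comment above leaves for deeper review.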

@FawadHa1der
Author

FawadHa1der commented Jan 27, 2026

I have a slightly unusual setup. I had Claude (4.5 opus) and Codex 5.2 running at the same time. Codex has been performing much better, I have to admit; I need to look into how to make Claude better at Lean. It did stop multiple times, so I'm not completely sure, but I think it spent about 3-5 hours. These things run in the background as I get up to speed with my Lean learning.

@chung-thai-nguyen
Collaborator

> I have a slightly unusual setup. I had Claude (4.5 opus) and Codex 5.2 running at the same time. Codex has been performing much better, I have to admit; I need to look into how to make Claude better at Lean. It did stop multiple times, so I'm not completely sure, but I think it spent about 3-5 hours. These things run in the background as I get up to speed with my Lean learning.

That’s impressive tbh and it's a nice showcase of AI capabilities in FV, especially given that this was done alongside you ramping up on Lean in a short duration. Thanks for the insights and the PR.

@DimitriosMitsios
Contributor

@FawadHa1der Did you use the usual monthly subscriptions for those models or pay API charges? Have you tried any of DeepSeek's models as well? Their API cost is lower by more than 10x, but I don't have any experience with their performance on FV. I could give them a try myself and report back!

@FawadHa1der
Author

FawadHa1der commented Jan 30, 2026

@DimitriosMitsios what is FV? Yes, just the monthly subscription. They do get rate limited sometimes, which is annoying. I have not tried DeepSeek yet. I've never heard about its math/Lean 4 capabilities, though.

@chung-thai-nguyen thank you, I have been mindful of trying not to add assumptions to the statements because I suspect they have been designed with re-usability in mind. After an initial implementation by AI I go back and try to reduce or take out the assumptions that were added, which is not always successful, as you might notice in my next PR :)

@DimitriosMitsios
Contributor

@FawadHa1der FV := Formal Verification, it was used in a previous message and I took it for granted, sorry!

@alexanderlhicks alexanderlhicks self-assigned this Feb 7, 2026
@alexanderlhicks
Collaborator

Hello! Thanks for the PR and sorry for the delay in getting to this. I'll be merging the 4.26 PR soon, which might break a few things, and get back to all the other PRs including this one.

`(𝔽ᵐ)ⁿ`. Let `e` be a positive integer such that `e < d/3` and `|𝔽| ≥ e`.
Suppose `d(U⋆, L^⋈m) > e`. Then, there exists `v⋆ ∈ L⋆` such that `d(v⋆, L) > e`, where `L⋆` is the
row-span of `U⋆`. -/
private def vecSupport (u : ι → F) : Finset ι :=
Collaborator

Is there a reason for these to be marked as private? Some of these also look like they might or should already exist in a standard library.


end ProximityToRS

section Tests
Collaborator

I'm not sure whether merging these tests would add much? E.g.

example (u v : ι → F) :
    vecSupport (F := F) (u - v) = Finset.filter (fun j => u j ≠ v j) Finset.univ :=
  vecSupport_sub (F := F) u v

is basically redundant

private def vecSupport (u : ι → F) : Finset ι :=
  Finset.filter (fun j => u j ≠ 0) Finset.univ

private lemma mem_vecSupport {u : ι → F} {j : ι} : j ∈ vecSupport (F := F) u ↔ u j ≠ 0 := by
Collaborator

Aside from them being private, a few of these could do with some cleaning up regarding assumptions, e.g. F does not need to be finite here.
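
One way the suggested cleanup might look is sketched below, assuming a recent Mathlib. The names `vecSupport` and `mem_vecSupport` are taken from the diff, but the typeclass assumptions are a guess at the minimum needed rather than the file's actual `variable` block: a finite index type `ι`, plus a zero element and decidable equality on `F`, with no `Fintype F` or field structure.

```lean
import Mathlib

namespace Sketch

-- Assumed minimal setting: only the index type is finite; `F` just needs `0`
-- and decidable equality so that the filter predicate is decidable.
variable {ι F : Type*} [Fintype ι] [Zero F] [DecidableEq F]

/-- The set of coordinates at which `u` is nonzero. -/
def vecSupport (u : ι → F) : Finset ι :=
  Finset.univ.filter (fun j ↦ u j ≠ 0)

/-- Membership in `vecSupport` unfolds to the coordinate being nonzero. -/
lemma mem_vecSupport {u : ι → F} {j : ι} : j ∈ vecSupport u ↔ u j ≠ 0 := by
  simp [vecSupport]

end Sketch
```

With `F` inferred from the type of `u`, the explicit `(F := F)` arguments seen in the diff should not be needed in this setting. Lemmas about `u - v` would additionally need subtraction on `F`, so they would carry a stronger assumption than the definition itself.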

Collaborator

General comment (also applies to other PRs): please take a look at the linter and style guide adherence section of the PR summary (it's easy enough to feed this to an LLM to clean up). Also, if you are adding significant proofs (i.e. proofs of main results in a file, ...) I think it's fair to add yourself (and any AI used) to the file author list.
