TrustHash

Formal verification of cryptographic hash function security in Lean 4.

What It Does

TrustHash proves that hash functions resist structural attacks by formalizing a two-layer security model:

security_level = min(generic_floor, structural_cost)

Layer 1 (Generic): Birthday bound, Wagner GBP, Joux cascade, sponge floor — attacks that work against ANY hash function.
Layer 2 (Structural): Differential, linear, algebraic, zero-sum, round-skipping — attacks that exploit internal structure.

If structural cost >= generic floor, the hash is optimal (no structural weakness). If structural < generic, the hash has an exploitable weakness.

Why It Matters

Recent results in the literature motivate formal verification of hash security:

The STARK-friendly hash report (Canteaut et al. 2020) found that GMiMC-128-d's zero-sum covers the full permutation, and Poseidon-128-d has a narrow 2^61 distinguisher margin.
"Skipping Class" (Merz & Rodriguez Garcia 2026) showed that collisions can be algebraically cheaper than preimages via invariant subspace attacks.
The Ethereum Foundation bounty (2024-2026) revealed that original Poseidon-2 security estimates were underestimated.

TrustHash formalizes the underlying cost models (differential, algebraic, zero-sum, round-skipping) so that security claims can be machine-checked rather than estimated by hand.

Architecture

Three-column analysis feeding into a unified verdict:

Column 1	Column 2	Column 3
Threat Lattice + Attack Taxonomy	Degree Tracking + S-box Cost Model	Treewidth per Section
Hash Gone Bad 4D (COL x LE x OC x IL)	Skipping Class degree bounds	STARK report (HADES structure)

The full pipeline (v3.1) runs on real hash functions:

S-box table → AutoSboxPipeline → SboxCertifiedParams →
  HashExprF → E-graph saturation (12 rules, recursive rewriting) →
    HashCircuit → constraint graph → tree decomposition →
      nice tree → security DP (with optimality proof) → two-layer verdict

Each stage is verified: E-graph preserves semantics (allRules_realSaturate_sound), DP gives an optimal lower bound (dp_optimal_of_validNTD), certified params flow through with proof (claim3_certified_flow), and multi-round DP composes across rounds (multiRound_composedCost_ge_single).

Built on four verified Lean 4 libraries (used as reference, not dependencies):

LeanHash (~134 theorems): Generic bounds, S-box properties, MDS, differential paths
DynamicTreeProg (~122 theorems): DP over tree decompositions, treeFold
OptiSat (~190 theorems): Equality saturation, ILP extraction
ProofKit (~20 theorems): Generic foldl invariants

Stats

Metric	Value
Modules	224
Theorems	2,683
Sorry	0
Axioms	0
LOC	~41,000
Version	v5.0
Hash instances analyzed	9 (6 classical + 3 AG code)
E-graph rewrite rules	12 crypto + 15 attack (all with soundness proofs)
Attack types modeled	14 (differential, linear, algebraic, boomerang, MITM, rebound, ...)
End-to-end demos	1 (PRESENT 7-stage pipeline) + multi-round DP (3R, 2R)

Directory Structure

TrustHash/
  ├── Attacks/
  │   ├── AG/               5 files — AG quantum bounds: Drinfeld-Vladut, tower params, instances
  │   ├── Composition/
  │   │   ├── EGraph/       4 files — Attack E-graph: core, e-matching, saturation, extraction
  │   │   └── Rules/        4 files — Attack rewrite rules: differential, linear, structural, hybrid
  │   ├── CollisionVariety/ 2 files — Collision variety bounds, Hilbert irreducibility
  │   ├── DivisionProperty/ 4 files — Integral attack model
  │   ├── SlideAttack/      3 files — Slide attack model
  │   ├── RelatedKey/       2 files — Key schedule + related-key attacks
  │   ├── InvariantSubspace/1 file  — Invariant subspace attacks
  │   └── ConstantTime/     1 file  — Side-channel constant-time verification
  ├── Sbox/           16 files — S-box analysis: GF(2^4), DDT/LAT/ANF, certificates, bridge
  ├── Pipeline/       10 files — Security pipeline: verdict, instances, wide trail, end-to-end
  ├── Validation/      7 files — Cross-validation gates: V27, V28, V30, V31, V40, V50, CrossV28
  ├── EGraph/         20 files — E-graph engine: union-find, saturation, extraction, rewriting
  ├── DP/             12 files — Dynamic programming: nice tree DP, compose, optimality
  ├── Math/           12 files — Foundations: Walsh-Hadamard, collision bounds, AG distribution, inverse sieve
  ├── Keccak/          7 files — Keccak/SHA-3: spec, algebraic degree, treewidth, BitVec
  ├── Feistel/         3 files — Feistel: spec, security, concrete bounds
  ├── ILP/             3 files — ILP: spec, check, unified extraction
  ├── Poseidon/        2 files — Poseidon: spec, analysis
  ├── RescuePrime/     2 files — Rescue-Prime: spec, analysis
  ├── SAT/             2 files — SAT: encoding, certificate
  ├── GMiMC/           1 file  — GMiMC: spec
  └── (root)          77 files — Core: security model, cost model, graphs, hash specs, ciphers

Toolchain

Lean 4 v4.16.0 — all proofs use Init/Std only, no Mathlib dependency.

Build

lake build

Version History

Version	Modules	Theorems	What changed
v1.0	37	350	Two-layer model, generic floor, structural bounds, verdict
v1.1	39	362	KnownHashes, AnalysisReport (usability layer)
v1.2	42	399	HashSpec pipeline, DegreeFold, AlgExpr, CryptoRewrites
v2.0	68	772	S-box DDT/LAT, E-graph engine, DP over tree decompositions, end-to-end computation pipeline
v2.1	74	844	Zero-parameter pipeline (`computeFromSbox`), constraint graph from HADES topology, E-graph integration
v2.5	95	1,138	Exponential DP (TreewidthDP), compositional constraint graph, real E-graph rewrite rules, ILP extraction encoding, wide trail theorem, PRESENT 4-bit full end-to-end
v2.6	108	1,352	S-box certificate framework (AES 8-bit), real E-graph optimizer (HashOp + SoundRules), PRESENT S-box from GF(2^4) field ops, wide trail pipeline, E-graph completeness (DAG acyclicity + fuel sufficiency + hypothesis discharge)
v2.7	119	1,500	Math foundations (Walsh-Hadamard, collision bounds, iteration dynamics), ILP real extraction, DPCompose multi-objective, Keccak spec + algebraic degree + constraint graph + treewidth, Feistel spec + security, SAT treewidth oracle
v2.8	132	1,675	3 Directions complete: D1 (Bernstein cost model, tight backward degree, Matsui linear attack), D2 (23 verified rewrite rules — 11 unconditional + 12 conditional), D3 (Feistel concrete bounds via Patarin/CCA). Updated instances, cross-validation, v2.5 legacy cleanup
v3.0	151	1,831	5 reality gaps closed: real E-graph saturation (12 sound rules), DP on 3 real circuits (PRESENT/AES/Keccak), AES 8-bit S-box certified analysis, Keccak sponge security chain, end-to-end pipeline on PRESENT 3-round. Three thesis claims machine-checked.
v3.1	173	2,219	6 future work items implemented: (1) Multi-round circuit DP (PRESENT-3R, AES-2R), (2) Recursive E-graph rewriting with depth reduction, (3) Automated S-box certificate pipeline (table → certified params), (4) Executable cipher implementations (AES round + Keccak χ with FIPS test vectors), (5) Formal DP optimality (dp_optimal_of_validNTD), (6) 3 new ciphers (Poseidon-128, Rescue-Prime-128, GMiMC-128). 6 hash instances total.
v3.1.1	173	2,219	Directory reorganization: 31 files moved to Sbox/ (16), Pipeline/ (10), Validation/ (5). Root reduced from 108→77 files. 45 structural theorems upgraded from `native_decide` to kernel-verified `decide`.
v4.0	190	2,202	Extended security model: 7 new attack types (quantum, integral, slide, related-key, invariant subspace, constant-time). Grover/BHT quantum bounds. `overallSecurityV4 = min(quantumGenericFloor, expandedStructuralCost)`.
v5.0	224	2,683	AG quantum bounds (Feng-Ling-Xing), composable attack engine (AttackEGraph, 15 rewrite rules), distribution + collision variety bounds (inverse sieve, Hilbert). 9 hash instances (6 classical + 3 AG code). SuperHash export interface.

Usage

TrustHash is a proof library, not an interactive tool. You use it by writing Lean 4: define your hash parameters, and Lean computes and verifies the security level at compile time. There are several ways to use it, from simplest to most powerful:

v1.0: Manual parameters (FullAnalysis)

Import TrustHash.FullVerdict and fill in the FullAnalysis structure:

import TrustHash.FullVerdict
open TrustHash.FullVerdict

def poseidon128 : FullAnalysis where
  outputBits     := 256   -- n: hash output length in bits
  diffUniformity := 4     -- δ: differential uniformity of the S-box
  activeSboxes   := 22    -- b: minimum active S-boxes over all trails
  algebraicDeg   := 5     -- d: algebraic degree of the S-box (x^5)
  treewidth      := 2     -- tw: treewidth of the HADES section (t-1 for t=3)
  totalRounds    := 8     -- R: total number of rounds
  skippedRounds  := 0     -- s: rounds the adversary can skip
  h_output := by omega    -- proofs that parameters are well-formed
  h_delta  := by omega
  h_active := by omega
  h_deg    := by omega
  h_tw     := by omega
  h_rounds := by omega
  h_skip   := by omega

Evaluate the security verdict:

#eval genericFloor poseidon128     -- 128 (birthday: 256/2)
#eval differentialCost poseidon128 -- 17592186044416 (4^22)
#eval algebraicCost poseidon128    -- 25 (5^2)
#eval structuralCost poseidon128   -- 25 (min of differential and algebraic)
#eval overallSecurity poseidon128  -- 25 (min of generic and structural)

Interpretation: Poseidon-128 with treewidth 2 has overall security = 25, far below the generic floor of 128. The algebraic attack (degree 5, treewidth 2 = cost 25) is the bottleneck. Increasing treewidth or algebraic degree would improve security.

Prove properties about your hash:

-- Security is always at least 1
example : overallSecurity poseidon128 ≥ 1 :=
  structural_cost_pos poseidon128 -- structural ≥ 1, and overall ≤ structural

-- Overall security can't exceed the birthday bound
example : overallSecurity poseidon128 ≤ genericFloor poseidon128 :=
  overall_le_generic poseidon128

Example: AES-128 (optimal design)

def aes128 : FullAnalysis where
  outputBits     := 128   -- 128-bit output
  diffUniformity := 4     -- AES S-box: δ = 4
  activeSboxes   := 50    -- 50 active S-boxes over 10 rounds
  algebraicDeg   := 7     -- AES S-box: degree 7
  treewidth      := 4     -- AES state: 4×4 matrix, tw = 4
  totalRounds    := 10    -- AES-128: 10 rounds
  skippedRounds  := 0
  h_output := by omega; h_delta := by omega; h_active := by omega
  h_deg := by omega; h_tw := by omega; h_rounds := by omega; h_skip := by omega

#eval genericFloor aes128     -- 64 (birthday: 128/2)
#eval structuralCost aes128   -- 2401 (min(4^50, 7^4) = 2401)
#eval overallSecurity aes128  -- 64 (min(64, 2401) = 64)
-- OPTIMAL: structural cost (2401) exceeds generic floor (64)
-- The birthday bound is the bottleneck, not the cipher's structure.

Example: Detecting a weak design

def weakHash : FullAnalysis where
  outputBits     := 256
  diffUniformity := 4
  activeSboxes   := 3     -- very few active S-boxes
  algebraicDeg   := 3     -- low algebraic degree
  treewidth      := 1     -- minimal treewidth
  totalRounds    := 4
  skippedRounds  := 0
  h_output := by omega; h_delta := by omega; h_active := by omega
  h_deg := by omega; h_tw := by omega; h_rounds := by omega; h_skip := by omega

#eval genericFloor weakHash     -- 128
#eval structuralCost weakHash   -- 3 (min(4^3, 3^1) = min(64, 3) = 3)
#eval overallSecurity weakHash  -- 3
-- WEAKNESS: structural cost (3) is far below generic floor (128).
-- The algebraic attack at treewidth 1 costs only 3^1 = 3.

v1.1: KnownHashes and Reports

v1.1 adds a usability layer with precalculated analyses and human-readable reports.

Analyze a known hash (no need to define parameters):

import TrustHash.KnownHashes
import TrustHash.AnalysisReport
open TrustHash.FullVerdict
open TrustHash.AnalysisReport
open TrustHash.KnownHashes

#eval analyzeReport "AES-128" aes128
-- === AES-128 ===
--   Output bits:       128
--   Generic floor:     64
--   Differential cost: 1267650600228229401496703205376 (δ=4, b=50)
--   Algebraic cost:    2401 (d=7, tw=4)
--   Structural cost:   2401
--   Overall security:  64
--   Verdict:           OPTIMAL
--   Bottleneck:        generic (birthday bound)

Compare all known hashes (sorted by overall security):

#eval compareTable knownHashes
-- Hash                              Generic   Struct   Overall  Verdict
-- ------------------------------------------------------------------------
-- Poseidon-256 (t=9)                    128   390625       128  OPTIMAL
-- SHA-256 (model)                       128   117649       128  OPTIMAL
-- Keccak-256 (model)                    128      243       128  OPTIMAL
-- Rescue-Prime                          128      125       125  WEAKNESS
-- Griffin-256                           128      125       125  WEAKNESS
-- AES-128                                64     2401        64  OPTIMAL
-- Poseidon-128 (t=3)                    128       25        25  WEAKNESS
-- Weak Hash                             128        3         3  WEAKNESS

Available instances: poseidon128_t3, poseidon256_t9, aes128, rescuePrime, griffin256, sha256_model, keccak256_model, weakHash_example.

v1.2: Computational Pipeline (HashSpec)

v1.2 closes the gap between the arithmetic model and real cryptanalytic computation. Instead of manually providing derived parameters (active S-boxes, treewidth), you define only design parameters and the pipeline computes everything.

Define design parameters only:

import TrustHash.Pipeline.PipelineInstances
open TrustHash.HashSpec
open TrustHash.PipelineVerdict

def myHash : HashSpec where
  construction   := .Sponge
  outputBits     := 256
  stateWidth     := 3
  rate           := 128
  capacity       := 128
  sboxDegree     := 5       -- S-box degree
  diffUniformity := 4       -- S-box differential uniformity
  branchNumber   := 3       -- MDS branch number
  fullRounds     := 8       -- R_F full rounds
  partialRounds  := 57      -- R_P partial rounds
  h_output := by omega; h_width := by omega; h_deg := by omega
  h_delta := by omega; h_branch := by omega; h_rounds := by omega

Pipeline computes derived values automatically:

-- Pipeline computes treewidth from round structure
-- tw(full) = max(1, t-1), tw(partial) = 1

-- Pipeline computes active S-boxes from wide trail
-- active = branchNumber * fullRounds + partialRounds

-- Pipeline computes costs and verdict
#eval pipelineVerdictTag myHash    -- "WEAKNESS"
#eval computeVerdict myHash        -- 25

Prove pipeline verdicts (native_decide):

theorem myHash_weakness :
    pipelineVerdictTag myHash = "WEAKNESS" := by native_decide

theorem myHash_verdict_val :
    computeVerdict myHash = 25 := by native_decide

v1.2 also includes:

DegreeFold: Degree tracking catamorphism over HADES round sequences (adapted from DynamicTreeProg.treeFold)
FoldCostBound: Proof that fold-based computation matches pipeline computation
AlgExpr + CryptoRewrites: Sound algebraic rewrite rules for S-box degree simplification (adapted from OptiSat.SoundRewriteRule)
SimplificationVerifier: End-to-end verified simplification (e.g., d^8 * 1^57 = d^8 for Poseidon partial rounds)

v2.0: End-to-End Computation Pipeline

v2.0 replaces manual parameter entry with real cryptanalytic computation. It adds 5 functional components that compute security parameters directly from S-box tables and constraint graph structure:

S-box analysis: Computes DDT, LAT, and algebraic degree FROM the S-box lookup table
Differential trail search: Analytical + exhaustive search for minimum active S-boxes
Treewidth computation: Greedy min-degree elimination on constraint graphs, tree decomposition, nice tree conversion
E-graph engine: Union-find with congruence closure, pattern matching, fuel-based saturation, greedy extraction — with semantic preservation proofs
DP over tree decompositions: niceTreeFold catamorphism with crypto-specific cost functions, connecting differential and algebraic DP to a unified security bound

Full pipeline: HashSpec + NiceNode -> ComputedAnalysis -> verdict:

import TrustHash.Pipeline.ComputeInstances
open TrustHash.ComputePipeline
open TrustHash.ComputeInstances

-- Poseidon-128: HADES structure as nice tree
-- Full analysis computed via DP over tree decomposition
#eval poseidon128Analysis.dpDiffCost      -- 18
#eval poseidon128Analysis.dpAlgCost       -- 22
#eval poseidon128Analysis.dpSecurityCost  -- 18 (min of diff and alg)
#eval poseidon128Analysis.activeSboxes    -- 81
#eval poseidon128Analysis.treewidth       -- 2
#eval computedVerdictTag poseidon128Analysis  -- "WEAKNESS"

All values are verified by native_decide:

theorem poseidon_diff_cost :
    poseidon128Analysis.dpDiffCost = 18 := by native_decide

theorem poseidon_sec_cost :
    poseidon128Analysis.dpSecurityCost = 18 := by native_decide

theorem poseidon_verdict :
    computedVerdictTag poseidon128Analysis = "WEAKNESS" := by native_decide

Compare hash functions with verified ordering:

-- AES has higher DP security than PRESENT (stronger S-box)
theorem aes_stronger_than_present :
    aes128Analysis.dpSecurityCost ≥ presentAnalysis.dpSecurityCost := by
  native_decide

-- Poseidon has higher DP security than PRESENT
theorem poseidon_stronger_than_present :
    poseidon128Analysis.dpSecurityCost ≥ presentAnalysis.dpSecurityCost := by
  native_decide

Pipeline properties (proven generically, not just for concrete instances):

-- Overall security never exceeds the birthday bound
theorem overall_le_generic (spec tree delta degree hd hdeg) :
    (computeAnalysis spec tree delta degree hd hdeg).overallSecurity
    ≤ (computeAnalysis spec tree delta degree hd hdeg).genericFloor

-- DP security cost is always ≥ 1
theorem dp_sec_pos (spec tree delta degree hd hdeg) :
    (computeAnalysis spec tree delta degree hd hdeg).dpSecurityCost ≥ 1

-- OPTIMAL verdict means structural security ≥ generic floor
theorem optimal_means (a : ComputedAnalysis)
    (h : computedVerdictTag a = "OPTIMAL") :
    a.dpSecurityCost ≥ a.genericFloor

-- WEAKNESS verdict means structural security < generic floor
theorem weakness_means (a : ComputedAnalysis)
    (h : computedVerdictTag a = "WEAKNESS") :
    a.dpSecurityCost < a.genericFloor

v2.1: Zero-Parameter Pipeline

v2.1 closes all remaining disconnection gaps. You provide ONLY a HashSpec and an S-box lookup table — the pipeline computes everything: differential uniformity, algebraic degree, constraint graph, tree decomposition, DP costs, and verdict.

Full pipeline from S-box table:

import TrustHash.Pipeline.ComputeInstancesV21
open TrustHash.EndToEndPipeline
open TrustHash.ComputeInstancesV21
open TrustHash.ComputePipeline

-- PRESENT: computed entirely from S-box table (zero manual parameters)
#eval presentV21.sboxDelta         -- 4 (computed from DDT)
#eval presentV21.sboxDegree        -- 3 (computed from ANF)
#eval presentV21.activeSboxes      -- 124
#eval presentV21.treewidth         -- 15
#eval computedVerdictTag presentV21  -- "OPTIMAL"

Graph-computed trees (no hardcoded tree topologies):

-- Poseidon v2.1: tree derived from HADES constraint graph
-- buildConstraintGraph → completeGraph K₃ → fromGraph → NiceNode
#eval poseidon128V21.activeSboxes    -- 81 (same as v2.0)
#eval poseidon128V21.treewidth       -- 2  (same as v2.0)

Cross-validation: v2.1 = v2.0 for all security quantities:

-- v2.1 computed trees produce identical results to v2.0 hardcoded trees
theorem poseidon_active_eq :
    poseidon128V21.activeSboxes = poseidon128Analysis.activeSboxes := by native_decide

theorem poseidon_tw_eq :
    poseidon128V21.treewidth = poseidon128Analysis.treewidth := by native_decide

E-graph integration: symbolic optimization of hash expressions:

import TrustHash.EGraph.PipelineIntegration
open TrustHash.EGraph.PipelineIntegration

-- E-graph optimizes algebraic degree expression via equality saturation
theorem poseidon_optimizes :
    (optimizeAndExtract poseidon128).isSome = true := by native_decide

-- Two-phase: AST rewrite rules (proven sound) + E-graph optimization
theorem poseidon_twophase :
    (optimizeTwoPhase poseidon128).isSome = true := by native_decide

-- AST-level simplification preserves evaluation (general theorem)
theorem twoPhase_ast_sound (spec : HashSpec) (env : Env) :
    eval (applyChain standardRules (hashToExpr spec)) env =
    eval (hashToExpr spec) env

v2.5: Gap Closure — Exponential DP + Real E-Graph

v2.5 closes 6 gaps between the paper's claims and the v2.1 code:

Gap	v2.1 State	v2.5 Fix
1 DP is linear	`childResult + cost` additive	Exponential `TreewidthDP` with `BagAssignment`-based `DPTable`
2 Constraint graph = lookup table	`if fullRounds > 0 then K_t else S_t`	Compositional `hadesConstraintGraph` from per-round primitives
3 E-graph = identity (empty rules)	`optimize expr [] config`	4 real algebraic rewrite rules via `cryptoERewriteRules`
4 No concrete end-to-end instance	Pipeline not tested on S-box data	PRESENT 4-bit proven end-to-end via `native_decide`
5 Hardcoded verdicts	Static verdict records	`computedVerdictV25` / `computedBottleneckV25` branch dynamically
6 Wide trail not formalized	Referenced in docstrings	`wide_trail_two_rounds` theorem (Daemen-Rijmen bound)

v2.5 full pipeline with exponential DP:

import TrustHash.ValidationV25
open TrustHash.EndToEndV25
open TrustHash.ValidationV25

-- PRESENT: full v2.5 pipeline from S-box table
-- Uses exponential DP (tighter bounds than v2.1 linear DP)
theorem present_v25_egraph :
    (computeFromSboxV25 presentHashSpec presentSbox (by native_decide) (by native_decide)).egraphOptimized = true := by
  native_decide

-- v2.5 uses 4 real crypto E-graph rules (not empty/identity)
theorem present_v25_rules :
    (computeFromSboxV25 presentHashSpec presentSbox (by native_decide) (by native_decide)).egraphRulesUsed = 4 := by
  native_decide

-- Exponential DP gives tighter (lower) bounds than linear DP
theorem present_v25_dp_le_v21 :
    (computeFromSboxV25 presentHashSpec presentSbox (by native_decide) (by native_decide)).dpSecurityV25 ≤
    (computeFromSboxV25 presentHashSpec presentSbox (by native_decide) (by native_decide)).dpSecurityV21 := by
  native_decide

Cross-validation: v2.5 agrees with v2.1 on shared metrics:

-- S-box analysis, active S-boxes, treewidth, and generic floor
-- are identical between v2.5 and v2.1
theorem present_v25_v21_active_eq :
    (computeFromSboxV25 ...).activeSboxes = (computeFromSbox ...).activeSboxes := by native_decide

theorem present_v25_v21_generic_eq :
    (computeFromSboxV25 ...).genericFloor = (computeFromSbox ...).genericFloor := by native_decide

v2.5 also includes:

TreewidthDP: Exponential DP over bag assignments with fuel-bounded computation and v2.1 fallback
DPTableLemmas: 43 theorems on DP table operations (leaf, introduce, forget, join)
ILP encoding: TENSAT formulation (C1-C4) for E-graph extraction with solution checking
WideTrailTheorem: Daemen-Rijmen bound on active S-boxes from branch number
Math foundations: Binomial, multi-collision, algebraic immunity, differential theory

Where do the parameters come from?

Level	Interface	What you provide	What is computed
v3.1	`AutoSboxPipeline` / `CipherComparison`	S-box table OR `HashSpec`	Automated pipeline: S-box table → certified params → E-graph → DP (with optimality proof) → multi-round composition → verdict for 6 ciphers
v3.0	`PresentEndToEnd` / `AES128Analysis` / `Keccak256Analysis`	`SboxCertifiedParams` + circuit	Full pipeline: E-graph optimization + circuit → constraint graph → TD → DP → two-layer verdict for 3 real ciphers
v2.8	`EndToEndPipeline` + attack rules	HashSpec + ConcreteSbox	Everything + Bernstein cost, tight backward degree, Matsui linear, Feistel bounds, 23 rewrite rules
v2.1	`EndToEndPipeline`	HashSpec + ConcreteSbox	Everything: delta, degree, graph, tree, DP costs, verdict
v2.0	`ComputePipeline`	HashSpec + NiceNode + delta + degree	DP costs, active S-boxes, treewidth, generic floor, verdict
v1.2	`PipelineVerdict`	HashSpec (construction, state width, S-box, rounds)	Treewidth, active S-boxes, costs, verdict
v1.0	`FullAnalysis`	All derived parameters directly	Costs and verdict only

Module reference

v1.0: Core security model

Module	Use case
`SecurityDefs`	Security level, ideal security, margin definitions
`GenericFloor`	Birthday, GBP, Joux, sponge bounds
`ThreatLattice4D`	Hash Gone Bad 4D threat classification (5,000 scenarios)
`CostModel`	Differential and algebraic cost types
`MDSBranchNumber`	MDS matrix branch number bounds
`MDSMultOrder`	Required multiplicative order for MDS
`ActiveSboxBounds`	Active S-box counting and bounds
`ForwardDegree`	Forward algebraic degree propagation
`BackwardDegree`	Backward algebraic degree propagation
`LinearApproximation`	Linear bias bounds
`TreewidthModel`	HADES full/partial round treewidth
`HADESSectionTW`	HADES section decomposition
`TreeFoldSecurity`	Structural cost via tree fold (min of diff and alg)
`ZeroSumDistinguisher`	Zero-sum subspace dimension bounds
`TwoLayerVerdict`	Two-layer security verdict
`RoundSkipBound`	Round-skipping attack cost
`HashRanking`	Rank hash functions by security level
`ModeAnalysis`	Mode-dependent (sponge) security
`ConfigPartialOrder`	Compare hash configurations (dominance)
`FullVerdict`	Complete two-layer verdict

v1.1: Usability layer

Module	Use case
`AnalysisReport`	Human-readable reports and comparison tables
`KnownHashes`	Precalculated analyses for 8 well-known hashes

v1.2: Computational pipeline (Pipeline/)

Module	Use case
`HashSpec`	Design-parameter-only hash specification
`Pipeline/GenericFloorPipeline`	Generic floor from HashSpec
`Pipeline/WideTrailPipeline`	Active S-boxes and treewidth from HashSpec
`Pipeline/StructuralPipeline`	Structural cost from HashSpec
`Pipeline/PipelineVerdict`	Automatic verdict from HashSpec
`Pipeline/PipelineInstances`	Concrete instances (Poseidon, AES, SHA-256, Keccak, Rescue)
`RoundStructure`	HADES round sequence decomposition
`DegreeFold`	Degree tracking catamorphism over round sequences
`FoldCostBound`	Fold-pipeline equivalence proofs
`AlgExpr`	Algebraic expression AST
`CryptoRewrites`	Sound rewrite rules (pow_combine, pow_pow, distribute, mul_one, pow_one)
`SimplificationVerifier`	End-to-end verified algebraic simplification

v2.0: Shared infrastructure

Module	Use case
`Bitwise`	XOR, popcount, parity, truncation for finite fields
`Graph`	SimpleGraph with adjacency lists, complete/star/path constructors
`FinIter`	maxOver, sumOver, countOver, tabulate2D without Finset

v2.0: S-box analysis (Sbox/)

Module	Use case
`Sbox/SboxTable`	ConcreteSbox type with lookup table, eval, outputDiff
`Sbox/DDTCompute`	Difference distribution table computation, diffUniformityFromTable
`Sbox/LATCompute`	Linear approximation table, Walsh coefficients, nonlinearityFromTable
`Sbox/AlgDegreeCompute`	Mobius transform, ANF, algebraicDegreeFromTable
`Sbox/SboxBridge`	SboxAnalysis bundle bridging S-box metrics to pipeline

v2.0: Graph + treewidth

Module	Use case
`DiffTrail`	Differential trail type, branch constraints, wide trail bound
`WideTrailSearch`	Analytical + exhaustive minimum active S-box search
`TrailBridge`	Trail analysis to pipeline bridge
`EliminationOrder`	Greedy min-degree vertex elimination
`TreeDecomp`	Tree decomposition from elimination ordering
`NiceTreeConvert`	NiceNode inductive, conversion from tree decomposition
`TWBridge`	Treewidth analysis to pipeline bridge

v2.0: E-graph engine

Module	Use case
`EGraph/UnionFind`	Disjoint-set forest with path compression (adapted from OptiSat)
`EGraph/Core`	CryptoOp (Const/Var/Add/Mul/Pow), ENode, EClass, EGraph with add/merge/canonicalize
`EGraph/CoreSpec`	Well-formedness predicates (ChildrenBounded, HashconsValid, SizesAligned) + preservation proofs
`EGraph/EMatch`	CryptoPattern matching, Substitution, rewrite rule application
`EGraph/Saturate`	Fuel-based equality saturation loop with fixpoint detection
`EGraph/SemanticSpec`	NodeEval (Nat semantics), ConsistentValuation, equivalence preserves value
`EGraph/Extraction`	Greedy bottom-up cost extraction from saturated E-graph
`EGraph/PipelineSoundness`	End-to-end: addExpr + optimize pipeline with eval preservation

v2.0: DP over tree decompositions

Module	Use case
`DP/NiceTree`	niceTreeFold (4-constructor catamorphism), niceTreeFold_lower_bound
`DP/CryptoCost`	Additive cost functions, vertex cost parameters, bag cost monotonicity
`DP/DPOperations`	cryptoDP via niceTreeFold, lower bound, positivity, monotonicity
`DP/SecurityDP`	differentialDP, algebraicDP, securityDP = min(diff, alg)
`DP/DPBridge`	DPAnalysis structure, analyzeSecurity, bridge to pipeline

v2.0: Integration (Pipeline/)

Module	Use case
`Pipeline/ComputePipeline`	ComputedAnalysis, computeAnalysis, verdict tag, bottleneck identification
`Pipeline/ComputeInstances`	Poseidon/AES/PRESENT with nice trees, DP costs, cross-hash comparisons

v2.1: Pipeline wiring (Pipeline/)

Module	Use case
`ConstraintGraph`	`buildConstraintGraph`: HADES topology → complete K_t or star S_t
`Pipeline/EndToEndPipeline`	`computeFromSbox`: HashSpec + ConcreteSbox → ComputedAnalysis (zero manual params)
`Pipeline/ComputeInstancesV21`	PRESENT/Poseidon/AES via zero-parameter pipeline, v2.1 vs v2.0 cross-validation

v2.1: E-graph pipeline integration

Module	Use case
`EGraph/HashExpr`	`hashToExpr`: HashSpec → AlgExpr (algebraic degree expression), E-graph registration
`EGraph/PipelineIntegration`	`optimizeAndExtract`, `optimizeTwoPhase` (AST rules + E-graph), depth validation

v2.1: Validation

Module	Use case
`ValidationV21`	Cross-validation v2.1 = v2.0 (active S-boxes, treewidth, generic floor), E-graph end-to-end

v2.5: Mathematical foundations

Module	Use case
`Math/BinomialBasic`	Factorial, choose, falling factorial for multi-collision bounds
`Math/MultiCollision`	k-collision counting and Suzuki bounds
`Math/AlgebraicImmunity`	Annihilator existence, algebraic immunity lower bound
`Math/DifferentialTheory`	Differential uniformity, DDT partition, APN functions
`Math/Foundations`	BoolFn, foldFin infrastructure, Hamming weight
`WideTrailTheorem`	SPN state, sboxWeight, `wide_trail_two_rounds` (Daemen-Rijmen bound)

v2.5: Exponential DP

Module	Use case
`DP/Util/NatOpt`	Nat.min properties for DP table operations
`DP/Util/FoldMin`	List.foldl Nat.min lemmas
`DP/Util/InsertMin`	HashMap-based insert-if-min for DP tables
`DP/Util/DPBound`	forget/join/introduce preserve DP bounds
`DP/TreewidthDP`	BagAssignment, DPTable, dpLeaf/dpIntroduce/dpForget/dpJoin, runDP with fuel
`DP/CryptoCost`	Crypto-specific cost functions for DP

v2.5: Constraint graph + E-graph + ILP

Module	Use case
`EGraph/ILPTypes`	TENSAT ILP formulation types (ILPVar, ILPConstraint, ILPProblem, ILPSolution)
`EGraph/ILPEncode`	E-graph → ILP encoding (C1 root, C2 exactly-one, C3 child deps, C4 acyclicity)
`EGraph/ILPCheck`	ILP solution validation, extraction, ValidExtraction certificate
`EGraph/CryptoRewritesEGraph`	4 E-graph rewrite rules (mulOne, oneMul, powOne, distribute)

v2.5: Pipeline end-to-end (Pipeline/)

Module	Use case
`Pipeline/EndToEndPipeline`	`computeFromSbox`: full pipeline (DP + constraint graph + E-graph)

v2.6: S-box certificates + real E-graph optimizer (Sbox/, Pipeline/)

Module	Use case
`Sbox/DDTCertificate`	Offline DDT certificate structure + deterministic checker
`Sbox/AES8BitSbox`	256-entry AES S-box + DDT certificate
`Sbox/CertifiedSboxBridge`	Certificate → pipeline bridge
`HashOpExpr`	Multi-round algebraic expression generator
`CryptoSoundRules`	10+ sound rewrite rules with formal proofs
`RealOptimizer`	E-graph optimizer with measurable depth reduction
`Sbox/BitVec4Arith`	GF(2^4) field arithmetic (gf16_mul, gf16_inv) — all 9 field axioms kernel-verified via `decide`
`Sbox/PresentSboxGF`	PRESENT S-box from GF(2^4) field operations
`Sbox/GFBridge`	GF-derived delta = table-derived delta
`WideTrailDP`	Wide trail cost function from Daemen-Rijmen
`Pipeline/PipelineWideTrail`	Wide trail integration with DP pipeline
`HashFormalSpec`	Exportable formal specification
`EGraph/CompletenessSpec`	DAG acyclicity, fuel sufficiency, hypothesis discharge
`Validation/ValidationV26`	Cross-validation v2.6, final gate

v2.7: Keccak + Feistel + ILP + SAT

Module	Use case
`Math/WalshHadamard`	Walsh-Hadamard transform, Parseval, spectral analysis
`Math/SecurityExtensions`	Ideal degree, SPN degree extensions
`Math/CollisionBounds`	Collision probability bounds
`Math/IterationDynamics`	Fixed point iteration, cycle structure
`ILP/ILPSpec`	ILP specification types (decidable, no sorry)
`ILP/ILPCheck`	ILP solution validation and extraction
`ILP/ExtractionUnified`	Unified extraction (greedy + ILP + treewidth DP)
`DP/DPCompose`	DP composition over multiple decompositions
`DP/DPMultiObjective`	Multi-objective DP (differential + algebraic)
`SAT/SATEncoding`	SAT encoding for treewidth oracle
`SAT/SATCertificate`	SAT certificate validation
`Keccak/KeccakSpec`	Keccak-f permutation (θ, ρ, π, χ, ι)
`Keccak/KeccakAlgDegree`	Keccak algebraic degree analysis
`Keccak/KeccakConstraintGraph`	Keccak constraint graph
`Keccak/KeccakTreewidth`	Keccak treewidth bounds
`Feistel/FeistelSpec`	Feistel round, inverse, multi-round
`Feistel/FeistelSecurity`	SPRP/CCA bounds, required rounds
`Validation/ValidationV27`	Cross-validation v2.7

v2.8: 3 Directions complete (D1 + D2 + D3)

Module	Use case
`BernsteinCostModel`	Time × price security, parallel invariance, multi-target amortization, TMTO
`TightBackwardDegree`	Boura-Canteaut composition degree, tight Keccak backward degree (1165 vs 1599)
`MatsuiLinearAttack`	Piling-up lemma, linear attack complexity O(1/ε²), MDS linear bound
`Feistel/FeistelConcreteBounds`	Patarin attack table (1–6+ rounds), CCA security, 11-round CCA-secure
`ConditionalRewriteRule`	Extends RewriteRule with decidable side conditions, backward compatibility
`DifferentialAttackRules`	5 conditional rules: δ-propagation, trail fold, zero/one delta, max prob
`LinearAttackRules`	3 conditional rules: bias square, piling fold, const fold
`AlgebraicAttackRules`	4 conditional rules: degree compose, identity, zero, const fold
`ExtendedAttackRules`	Unified chain: 23 rules total (11 unconditional + 12 conditional) + soundness
`UpdatedInstances`	AES, Poseidon, Keccak, DES with Bernstein/backward/Matsui/Feistel bounds
`Validation/CrossValidationV28`	v2.8 vs v2.6: no regressions, tighter bounds where improved
`Validation/ValidationV28`	Final gate: 0 sorry, 0 axioms, D1/D2/D3 summary theorems

v3.0: Real pipeline — E-graph, DP, certified S-boxes, end-to-end

v3.0 closes the 5 reality gaps identified in an honest audit of v2.8:

Gap	v2.8 State	v3.0 Fix
1 E-graph tested only with empty rules	Rules existed but never fired on real expressions	12 sound rewrite rules fire on real hash expressions, saturation reduces depth
2 DP uses hardcoded trees	Nice trees built manually	Circuit → constraint graph → tree decomposition → nice tree → DP chain
3 AES 8-bit S-box never analyzed	DDT certificate existed, no full analysis	Certified S-box params (δ=4, d=7, NL=112) → full ComputedAnalysis (security=62)
4 Keccak/Feistel specs not connected	Specs defined but not linked to analysis	Sponge security (c/2=128) + algebraic degree chain + structural cost → verdict
5 No end-to-end pipeline on real hash	Pipeline stages existed separately	PRESENT 3-round: S-box → expression → E-graph → circuit → TD → DP → verdict

Three thesis claims machine-checked (ThesisDemonstration.lean):

-- Claim 1: E-graph saturation reduces depth while preserving semantics
-- 12 distinct rules fire, all with individual soundness proofs
theorem claim1_depth_reduction :
    present3Round.depth > present3Optimized.depth  -- 4 > 2

theorem claim1_soundness (env : Nat → Nat) :
    present3Optimized.eval env = present3Round.eval env

-- Claim 2: DP over tree decompositions on 3 real circuits
-- Monotone in S-box parameters (stronger S-box → higher security)
theorem claim2_three_circuits :
    presentSecDP ≥ 1 ∧ aesSecDP ≥ 1 ∧ keccakChiAlgDP ≥ 1

theorem claim2_monotone :
    securityDP 4 3 tree ≤ securityDP 8 3 tree ∧  -- stronger delta
    securityDP 4 3 tree ≤ securityDP 4 7 tree     -- higher degree

-- Claim 3: Two-layer verdicts for 3 real hash functions
theorem claim3_three_verdicts :
    presentAnalysis.overallSecurity ≤ presentAnalysis.genericFloor ∧
    aes128Analysis.overallSecurity = 62 ∧       -- differential bottleneck
    keccak256Analysis.overallSecurity = 128      -- sponge capacity binding

End-to-end pipeline on PRESENT 3-round (PresentEndToEnd.lean):

-- 7-stage verified pipeline:
-- Stage 1: Certified S-box → Stage 2: Expression → Stage 3: E-graph →
-- Stage 4: Circuit → Stage 5: Tree decomp → Stage 6: DP → Stage 7: Verdict

theorem present_full_chain :
    presentCertParams.delta = 4 ∧               -- Stage 1: certified
    presentCertParams.degree = 3 ∧
    present3Round.depth = 4 ∧                   -- Stage 2: expression built
    present3Optimized.depth < present3Round.depth ∧  -- Stage 3: E-graph reduces
    present1RoundCircuit.numOps = 5 ∧           -- Stage 4: circuit
    present1TD.width ≤ 3 ∧                      -- Stage 5: treewidth bounded
    presentSecDP ≥ 1 ∧                          -- Stage 6: DP positive
    presentAnalysis.overallSecurity ≤ presentAnalysis.genericFloor  -- Stage 7

Concrete security verdicts for 3 hash functions:

Hash	Generic Floor	Structural Cost	Overall	Verdict
PRESENT 3-round	32	DP-computed	min(32, DP)	Bounded
AES-128	64	62 (diff bottleneck)	62	Near-optimal
SHA-3-256	128 (sponge c/2)	>2^40 (24 rounds)	128	Optimal

v3.0 module reference

Module	Use case
`HashOp`	Extended hash operations: `HashOp` inductive + `HashExprF` with eval, depth, DecidableEq
`HashExpr`	Expression builder from `HashSpec`, round composition
`Sbox/SboxCertifiedParams`	Dependent record `SboxCertifiedParams` with proof obligations (δ, d, NL bounds)
`HashSoundRules`	12 sound rewrite rules: sbox compose, ark identity, const fold, rounds compose, etc.
`EGraph/RealSaturate`	Configurable saturation engine: `RealSatConfig` → `RealSatResult` with progress tracking
`EGraph/RealPipelineDemo`	Saturation demos on 4 PRESENT expressions, fixpoint + depth reduction proofs
`ConstraintGraphCompute`	`HashCircuit` → constraint graph builder for PRESENT/AES/Keccak circuits
`TreeDecompCertificate`	`TDCertificate`: external compute + Lean verify pattern for tree decompositions
`DP/CryptoDPInstance`	First concrete DP on real circuits: PRESENT (δ=4,d=3), AES (δ=4,d=7), Keccak chi (d=2)
`Sbox/AES8BitCertified`	Full AES 8-bit certified analysis chain: DDT cert + params + circuit + bridge
`Keccak/KeccakSecurityAnalysis`	SHA-3-256 two-layer: sponge (128) + structural (>2^40) + cross-cipher comparison
`AES128Analysis`	`ComputedAnalysis` for AES-128: overall=62, differential bottleneck
`Keccak/Keccak256Analysis`	`ComputedAnalysis` for SHA-3-256: overall=128, sponge binding, massive margin
`PresentEndToEnd`	Keystone: 7-stage pipeline on PRESENT 3-round, `present_full_chain` theorem
`ThesisDemonstration`	3 thesis claims: E-graph depth, DP optimality, compositional security
`Validation/ValidationV30`	Final gate: 5 gaps closed, project metrics, v2.8 regression check

v3.1: Automated pipeline — multi-round DP, recursive E-graph, 6 ciphers

v3.1 implements all 6 future work items identified in v3.0:

Item	v3.0 State	v3.1 Implementation
1 Multi-round DP	1-round circuits only	`MultiRoundInstances`: PRESENT-3R (composedCost ≥ singleRoundCost), AES-2R
2 Recursive E-graph	Flat pattern matching	`RecursiveRewrite`/`RecursiveSaturate`: S-box chain depth 5→1, PRESENT depth reduction
3 Auto S-box certs	Manual `SboxCertifiedParams`	`AutoSboxPipeline`: table → DDT/LAT/ANF → certified (δ, d, NL)
4 Executable ciphers	Mathematical specs only	`AESRoundFunction` (SubBytes/ShiftRows/MixColumns/AddRoundKey) + `KeccakBitVec` (χ/θ) with FIPS test vectors
5 DP optimality	Lower bound only	`DPOptimalProof.dp_optimal_of_validNTD`: DP is tight for any `ValidNTD` tree
6 Broader coverage	3 ciphers	6 ciphers: +Poseidon-128, Rescue-Prime-128, GMiMC-128 with `CipherComparison`

Six hash instances with verified security analysis:

Hash	Generic Floor	Overall Security	Verdict
PRESENT 3-round	32	DP-computed	Bounded
AES-128	64	62	Near-optimal
SHA-3-256	128	128	Optimal
Poseidon-128	64	18	Weakness
Rescue-Prime-128	64	18	Weakness
GMiMC-128	64	11	Weakness

DP optimality theorem (DPOptimalProof.lean):

-- For any valid nice tree decomposition, the crypto DP is optimal:
-- it is the MINIMUM over all cost functions respecting the tree structure.
theorem dp_optimal_of_validNTD (c : Nat) (tree : NiceNode)
    (hvalid : ValidNTD tree)
    (f : NiceNode → Nat)
    (hf_leaf : f .leaf ≥ 1)
    (hf_intro : ∀ v child, f (.introduce v child) ≥ f child + c)
    (hf_forget : ∀ v child, f (.forget v child) ≥ f child)
    (hf_join : ∀ l r, f (.join l r) ≥ f l + f r) :
    cryptoDP c tree ≤ f tree

-- Closed-form: DP = leafCount + introCount × cost
theorem dp_eq_formula (c : Nat) (tree : NiceNode) :
    cryptoDP c tree = leafCount tree + introCount tree * c

v3.1 module reference

Module	Use case
`Sbox/LATCertificate`	Offline LAT certificate structure + checker
`Sbox/SboxFullCert`	Combined DDT + LAT + degree certificate
`Sbox/AutoSboxPipeline`	Automated: S-box table → certified params (δ, d, NL) in one step
`MultiRoundGraph`	Multi-round constraint graph composition
`MultiRoundTreeDecomp`	Tree decomposition for composed multi-round graphs
`MultiRoundDP`	`MultiRoundResult`: single-round DP → composed cost with monotonicity proof
`MultiRoundInstances`	PRESENT-3R and AES-2R concrete multi-round analyses
`EGraph/RecursiveRewrite`	`recursiveRewrite`: rewrite at all subterm positions, `exprDepth`
`EGraph/RecursiveSaturate`	`recursiveSaturate`: fuel-bounded recursive saturation loop
`EGraph/RecursiveDemo`	Demos: PRESENT 3-round depth reduction, S-box chain 5→1, soundness proof
`Keccak/KeccakBitVec`	Executable Keccak: `chi5`, `thetaLane` on `BitVec 5` with algebraic proofs
`AESRoundFunction`	Executable AES round: SubBytes, ShiftRows, MixColumns, AddRoundKey, `xtime`
`TestVectors`	FIPS test vector validation for AES and Keccak
`CipherBridge`	Bridge executable implementations to security model
`Poseidon/PoseidonSpec`	Poseidon hash specification: full + partial rounds, MDS, S-box x^α
`Poseidon/PoseidonAnalysis`	Poseidon-128 `ComputedAnalysis`: overall=18, 81 active S-boxes
`RescuePrime/RescuePrimeSpec`	Rescue-Prime spec: forward x^α + inverse x^{1/α} layers
`RescuePrime/RescuePrimeAnalysis`	Rescue-Prime-128 `ComputedAnalysis`: overall=18
`GMiMC/GMiMCSpec`	GMiMC spec: Feistel-like with single S-box per round
`CipherAnalysisV31`	GMiMC-128 analysis: overall=11
`CipherComparison`	Cross-cipher comparison: 6 instances ranked, monotonicity proofs
`DP/DPOptimalSpec`	`ValidNTD` inductive, `DPOptimal`, `DPOptimalityWitness`
`DP/DPOptimalProof`	Main theorem: `dp_optimal_of_validNTD`, `dp_eq_formula`, `securityDP_optimal_of_validNTD`
`DP/DPOptimalBridge`	`OptimalDPAnalysis`, `full_pipeline_story`, `security_cost_formula`
`Validation/ValidationV31`	Final gate: all 6 items closed, v3.0 regression check

v5.0: AG Quantum Bounds + Composable Attack Engine

v5.0 extends TrustHash from a static verifier to a dynamic security analyzer with three new capabilities:

AG quantum bounds (Feng-Ling-Xing): algebraic-geometry codes that surpass naive Grover halving
Composable attack engine: a parallel E-graph that discovers optimal attack combinations
Distribution + collision variety bounds: inverse sieve and Hilbert irreducibility as new security dimensions

Defining a hash (v5.0)

The FullAnalysisV5 structure captures all security-relevant parameters. Lean enforces well-formedness via proof obligations (h_output, h_deg, etc.) — if your parameters are invalid, the code does not compile.

import TrustHash.Attacks.FullVerdictV5
open TrustHash.Attacks.FullVerdictV5

def myHash : FullAnalysisV5 where
  outputBits              := 128    -- hash output length in bits
  diffUniformity          := 4      -- S-box differential uniformity (from DDT)
  activeSboxes            := 50     -- minimum active S-boxes (wide trail)
  algebraicDeg            := 3      -- S-box algebraic degree (from ANF)
  treewidth               := 4      -- treewidth of the round structure
  totalRounds             := 10
  skippedRounds           := 2
  integralDimension       := 32     -- integral distinguisher dimension (0 = not applicable)
  slidePeriod             := 0      -- round function period (0 = no periodicity)
  relatedKeyProb          := 85     -- -log2(related-key differential probability)
  invariantSubDim         := 0      -- invariant subspace dimension (0 = none found)
  keyScheduleNonlin       := 64     -- key schedule nonlinearity measure
  constantTimeSafe        := true   -- implementation is constant-time
  agFieldSize             := 0      -- 0 = classical hash, q = AG code hash over F_q
  distributionBound       := 100    -- inverse sieve security bits
  collisionVarietyBound   := 115    -- collision variety security bits
  bestComposedAttackCost  := 0      -- from attack engine (0 = not computed)
  -- Lean requires proofs that parameters are well-formed:
  h_output := by omega   -- outputBits >= 1
  h_delta  := by omega   -- diffUniformity >= 1
  h_active := by omega   -- activeSboxes >= 1
  h_deg    := by omega   -- algebraicDeg >= 2
  h_tw     := by omega   -- treewidth >= 1
  h_rounds := by omega   -- totalRounds >= 1
  h_skip   := by omega   -- skippedRounds <= totalRounds

Computing the verdict

#eval overallSecurityV5 myHash          -- security level in bits
#eval quantumFloorV5 myHash             -- quantum-aware generic floor
#eval expandedStructuralCostV5 myHash   -- best structural attack cost

The formula that runs internally:

overallSecurityV5 = min(quantumFloorV5, expandedStructuralCostV5)

quantumFloorV5 =
  if agFieldSize > 0 then min(bhtCollisionFloor n, quantumAGBound q 0)
  else quantumGenericFloor n

expandedStructuralCostV5 =
  min(classical_structural,
      min(distributionBound, min(collisionVarietyBound, bestComposedAttackCost)))

Everything is nested min. The adversary picks the cheapest attack. TrustHash reports that minimum.

Proving properties

-- Lean kernel-verifies that the result is exactly 42
theorem myHash_security : overallSecurityV5 myHash = 42 := by native_decide

-- Lean verifies a comparison between two hashes
theorem myHash_beats_gmimc :
    overallSecurityV5 gmimc128V5 ≤ overallSecurityV5 myHash := by native_decide

-- Lean verifies the security never exceeds the quantum floor
theorem myHash_bounded :
    overallSecurityV5 myHash ≤ quantumFloorV5 myHash := v5_le_quantumFloor myHash

If any of these fails, lake build rejects the project. There is no way to bypass verification.

AG code hashes (new in v5.0)

For hashes built over algebraic-geometry codes, set agFieldSize to the base field size. The quantum floor then uses the Drinfeld-Vladut bound instead of naive BHT:

def hyperellipticV5 : FullAnalysisV5 where
  outputBits  := 256
  agFieldSize := 2048   -- F_{2^11}, so DV(2048) = isqrt(2048) - 1 = 44
  -- ...other fields...

#eval quantumFloorV5 hyperellipticV5  -- min(85, 44) = 44
-- AG bound is tighter: 44 < 85 (classical BHT for 256-bit)

v5.0 security results

Hash	Security (bits)	Binding factor	Family
PRESENT-80	26	Quantum floor (n/3)	Classical
AES-128	42	Quantum floor (n/3)	Classical
SHA-3-256	16	Structural (d^tw = 2^4)	Classical
Poseidon-128	25	Structural (5^2)	Classical
Rescue-Prime-128	25	Structural (5^2)	Classical
GMiMC-128	3	Structural (3^1)	Classical
Hyperelliptic-256	44	AG quantum (DV=44)	AG code
Garcia-Stichtenoth-128	6	AG quantum (DV=6)	AG code
Suzuki-128	15	AG quantum (DV=15)	AG code

Where the input parameters come from

TrustHash formalizes the computation of the verdict, not the cryptanalytic work that precedes it. You must supply the parameters:

Parameter	Source
`diffUniformity`	DDT of your S-box (e.g., AES S-box has delta=4)
`activeSboxes`	Wide trail strategy: `branchNumber * fullRounds + partialRounds`
`algebraicDeg`	ANF polynomial degree of your S-box
`treewidth`	Structural analysis of your round's constraint graph
`agFieldSize`	Base field size if your hash uses AG codes (0 otherwise)

What the verification guarantees

The value is not the point — a calculator can compute min(42, 81). What TrustHash provides:

Formula correctness: v5_le_quantumFloor, v5_le_structural, v5_le_v4 are kernel-verified theorems
Parameter validation: invalid parameters (e.g., algebraicDeg := 0) are rejected at compile time
Formal comparisons: poseidon_beats_gmimc is a machine-checked proof, not a paper claim
Composition soundness: each of the 15 attack rewrite rules has an individual soundness proof
Upgrade safety: v5_le_v4 proves that v5.0 can only decrease or maintain v4.0 security levels

Known limitations

Treewidth computation uses greedy min-degree elimination (heuristic upper bound). It gives exact results for HADES-specific graphs (complete K_t, star S_t) but is not guaranteed optimal for arbitrary graphs.
E-graph extraction uses greedy bottom-up cost selection. v2.5 adds ILP encoding (TENSAT formulation) with solution checking, but the solver itself is external.
Security DP values are abstract cost units (not directly bits of security). The mapping from DP cost to concrete bit-security requires domain-specific interpretation.
Exponential DP uses fuel to bound computation. When fuel is exhausted, it falls back to v2.1 linear DP (always terminates, provably correct but less tight).
TreeDecomp validity is checked via boolean predicates (isValid), not connected to a formal specification type.
Executable ciphers (v3.1) implement individual round functions, not full cipher encryption/decryption with key schedules. Test vectors validate round-level operations.
Multi-round DP (v3.1) composes round-level DP results additively. This assumes independent rounds; correlated round attacks are not modeled.

Future work

The following areas represent open directions where TrustHash could be extended:

Mathlib integration

All proofs use Init/Std only (no Mathlib). Several areas would benefit from Mathlib's library:

Walsh-Hadamard transform: Currently modeled over Nat; Mathlib.Analysis.Fourier would give proper ℂ-valued transforms.
Algebraic immunity: Currently uses Nat-based annihilator counting; Mathlib.FieldTheory would formalize GF(2^n).
Collision probability bounds: Currently discrete; Mathlib.Probability would give measure-theoretic foundation.

Full E-graph union-find integration

v3.1 recursive rewriting operates on tree syntax (HashExprF). Connecting to the v2.0 union-find E-graph engine (EGraph.Core) would enable congruence closure and sharing-aware optimization, potentially discovering deeper simplifications.

Full cipher implementations with key schedules

v3.1 implements individual round functions (AES SubBytes/MixColumns, Keccak χ/θ). Full cipher encryption with key expansion would enable end-to-end test vector validation against NIST standards.

Correlated round attacks

Multi-round DP (v3.1) assumes independent rounds. Modeling round-key-dependent correlations (e.g., slide attacks, related-key differentials) would strengthen the security analysis.

Additional cipher families

v3.1 covers SPN (PRESENT, AES, Poseidon, Rescue-Prime), sponge (SHA-3), and generalized Feistel (GMiMC). Natural extensions:

Poseidon2/Poseidon2b: Extended HADES with external matrix multiplication.
BLAKE2/BLAKE3: Compression function analysis via ARX model.
Ascon: Lightweight authenticated encryption (NIST LWC winner).

Probabilistic security bounds

Current bounds are worst-case (minimum active S-boxes, algebraic degree). Probabilistic analysis using differential probability distributions would give tighter expected-case bounds.

References

Canteaut et al. 2020: Report on Security of STARK-friendly Hash Functions v2.0
Merz & Rodriguez Garcia 2026: Skipping Class (round-skipping via invariant subspaces)
Cheval et al. USENIX 2023: Hash Gone Bad (4D threat lattice)
La Scala & Tiwari 2025: MultiSolve (tree decomposition bridge)
Grassi et al. 2025: Poseidon2b (extended security definitions)
Wagner 2002: Generalized Birthday Problem
Dey & Ghosh 2019: Cryptographic S-box properties
Bodlaender 1996: Linear-time algorithms for treewidth (elimination ordering)
Nieuwenhuis & Oliveras 2007: Congruence closure and union-find (E-graph foundations)
Bernstein 2005: Understanding Brute Force (time × price cost model)
Matsui 1993: Linear Cryptanalysis Method for DES Cipher (piling-up lemma)
Boura & Canteaut 2010/2011: Zero-sum distinguishers, composition degree bound
Patarin 2008: Generic Attacks on Feistel Schemes (round complexity table)
Hoang & Rogaway 2010: On Generalized Feistel Networks (CCA bounds)
Daemen & Rijmen 2002: The Design of Rijndael (wide trail strategy, MDS branch number)
Bertoni et al. 2011: The Keccak Reference (sponge construction, capacity security)
Feng, Ling & Xing 2006: AG code quantum bounds (alpha_q(delta) >= A(q) - delta)
Walsh 2013/2020: Inverse sieve for algebraic sets, polynomial method over varieties
Menconi, Paredes & Sasyk 2020: Inverse sieve for global fields
Paredes & Sasyk 2021/2022: Effective Hilbert irreducibility, uniform rational point bounds
Brassard, Hoyer & Tapp 1998: Quantum collision search (BHT bound n/3)

Name		Name	Last commit message	Last commit date
Latest commit History 33 Commits
Tests		Tests
TrustHash		TrustHash
.DS_Store		.DS_Store
.gitignore		.gitignore
ARCHITECTURE.md		ARCHITECTURE.md
BENCHMARKS.md		BENCHMARKS.md
BIBLIOGRAPHIC_SEARCH_REPORT.md		BIBLIOGRAPHIC_SEARCH_REPORT.md
BIBLIOGRAPHY_GAPS_ANALYSIS.md		BIBLIOGRAPHY_GAPS_ANALYSIS.md
CLAUDE.md		CLAUDE.md
CODE_PATTERNS_V4.lean		CODE_PATTERNS_V4.lean
NOVEL_TECHNIQUES_INVENTORY.md		NOVEL_TECHNIQUES_INVENTORY.md
PLAN_V4.0.md		PLAN_V4.0.md
README.md		README.md
REFERENCES.bib		REFERENCES.bib
SEARCH_SUMMARY.md		SEARCH_SUMMARY.md
TESTS_OUTSOURCE.md		TESTS_OUTSOURCE.md
THEOREMS.md		THEOREMS.md
TrustHash.lean		TrustHash.lean
TrustHash_design_spec.md		TrustHash_design_spec.md
TrustHash_design_spec.pdf		TrustHash_design_spec.pdf
TrustHash_design_spec_v0.1.md		TrustHash_design_spec_v0.1.md
dag.2.0.0.json		dag.2.0.0.json
dag.json		dag.json
dag.v2.5.json		dag.v2.5.json
dag.v2.8.json		dag.v2.8.json
dag.v3.1.json		dag.v3.1.json
hash_library_insights.md		hash_library_insights.md
lake-manifest.json		lake-manifest.json
lakefile.lean		lakefile.lean
lean-toolchain		lean-toolchain
plan_v1.2.md		plan_v1.2.md
plan_v2.5.md		plan_v2.5.md
qa_b1_report_2026-02-26.md		qa_b1_report_2026-02-26.md
report_TrustHash_v4_0_1.md		report_TrustHash_v4_0_1.md
theorems_v3.1.json		theorems_v3.1.json
trusthash_context.md		trusthash_context.md
trusthash_insights_v2.6.md		trusthash_insights_v2.6.md
trusthash_insights_v2.8.md		trusthash_insights_v2.8.md
trusthash_insights_v3.1.md		trusthash_insights_v3.1.md
trusthash_insightsv2.1.md		trusthash_insightsv2.1.md
trusthash_new_insights.md		trusthash_new_insights.md
trusthash_new_theorems.md		trusthash_new_theorems.md
trusthash_update_v4.0.md		trusthash_update_v4.0.md
trusthash_v2.7_insights.md		trusthash_v2.7_insights.md
trusthash_v3.0_insights.md		trusthash_v3.0_insights.md
trusthash_v4.0_insights.md		trusthash_v4.0_insights.md

Folders and files

Latest commit

History

Repository files navigation

TrustHash

What It Does

Why It Matters

Architecture

Stats

Directory Structure

Toolchain

Build

Version History

Usage

v1.0: Manual parameters (FullAnalysis)

Example: AES-128 (optimal design)

Example: Detecting a weak design

v1.1: KnownHashes and Reports

v1.2: Computational Pipeline (HashSpec)

v2.0: End-to-End Computation Pipeline

v2.1: Zero-Parameter Pipeline

v2.5: Gap Closure — Exponential DP + Real E-Graph

Where do the parameters come from?

Module reference

v1.0: Core security model

v1.1: Usability layer

v1.2: Computational pipeline (Pipeline/)

v2.0: Shared infrastructure

v2.0: S-box analysis (Sbox/)

v2.0: Graph + treewidth

v2.0: E-graph engine

v2.0: DP over tree decompositions

v2.0: Integration (Pipeline/)

v2.1: Pipeline wiring (Pipeline/)

v2.1: E-graph pipeline integration

v2.1: Validation

v2.5: Mathematical foundations

v2.5: Exponential DP

v2.5: Constraint graph + E-graph + ILP

v2.5: Pipeline end-to-end (Pipeline/)

v2.6: S-box certificates + real E-graph optimizer (Sbox/, Pipeline/)

v2.7: Keccak + Feistel + ILP + SAT

v2.8: 3 Directions complete (D1 + D2 + D3)

v3.0: Real pipeline — E-graph, DP, certified S-boxes, end-to-end

v3.0 module reference

v3.1: Automated pipeline — multi-round DP, recursive E-graph, 6 ciphers

v3.1 module reference

v5.0: AG Quantum Bounds + Composable Attack Engine

Defining a hash (v5.0)

Computing the verdict

Proving properties

AG code hashes (new in v5.0)

v5.0 security results

Where the input parameters come from

What the verification guarantees

Known limitations

Future work

Mathlib integration

Full E-graph union-find integration

Full cipher implementations with key schedules

Correlated round attacks

Additional cipher families

Probabilistic security bounds

References

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages