Skip to content

manuelpuebla/TrustHash

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

33 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

TrustHash

Formal verification of cryptographic hash function security in Lean 4.

What It Does

TrustHash proves that hash functions resist structural attacks by formalizing a two-layer security model:

security_level = min(generic_floor, structural_cost)
  • Layer 1 (Generic): Birthday bound, Wagner GBP, Joux cascade, sponge floor — attacks that work against ANY hash function.
  • Layer 2 (Structural): Differential, linear, algebraic, zero-sum, round-skipping — attacks that exploit internal structure.

If structural cost >= generic floor, the hash is optimal (no structural weakness). If structural < generic, the hash has an exploitable weakness.

Why It Matters

Recent results in the literature motivate formal verification of hash security:

  • The STARK-friendly hash report (Canteaut et al. 2020) found that GMiMC-128-d's zero-sum covers the full permutation, and Poseidon-128-d has a narrow 2^61 distinguisher margin.
  • "Skipping Class" (Merz & Rodriguez Garcia 2026) showed that collisions can be algebraically cheaper than preimages via invariant subspace attacks.
  • The Ethereum Foundation bounty (2024-2026) revealed that original Poseidon-2 security estimates were underestimated.

TrustHash formalizes the underlying cost models (differential, algebraic, zero-sum, round-skipping) so that security claims can be machine-checked rather than estimated by hand.

Architecture

Three-column analysis feeding into a unified verdict:

Column 1 Column 2 Column 3
Threat Lattice + Attack Taxonomy Degree Tracking + S-box Cost Model Treewidth per Section
Hash Gone Bad 4D (COL x LE x OC x IL) Skipping Class degree bounds STARK report (HADES structure)

The full pipeline (v3.1) runs on real hash functions:

S-box table → AutoSboxPipeline → SboxCertifiedParams →
  HashExprF → E-graph saturation (12 rules, recursive rewriting) →
    HashCircuit → constraint graph → tree decomposition →
      nice tree → security DP (with optimality proof) → two-layer verdict

Each stage is verified: E-graph preserves semantics (allRules_realSaturate_sound), DP gives an optimal lower bound (dp_optimal_of_validNTD), certified params flow through with proof (claim3_certified_flow), and multi-round DP composes across rounds (multiRound_composedCost_ge_single).

Built on four verified Lean 4 libraries (used as reference, not dependencies):

  • LeanHash (~134 theorems): Generic bounds, S-box properties, MDS, differential paths
  • DynamicTreeProg (~122 theorems): DP over tree decompositions, treeFold
  • OptiSat (~190 theorems): Equality saturation, ILP extraction
  • ProofKit (~20 theorems): Generic foldl invariants

Stats

Metric Value
Modules 224
Theorems 2,683
Sorry 0
Axioms 0
LOC ~41,000
Version v5.0
Hash instances analyzed 9 (6 classical + 3 AG code)
E-graph rewrite rules 12 crypto + 15 attack (all with soundness proofs)
Attack types modeled 14 (differential, linear, algebraic, boomerang, MITM, rebound, ...)
End-to-end demos 1 (PRESENT 7-stage pipeline) + multi-round DP (3R, 2R)

Directory Structure

TrustHash/
  ├── Attacks/
  │   ├── AG/               5 files — AG quantum bounds: Drinfeld-Vladut, tower params, instances
  │   ├── Composition/
  │   │   ├── EGraph/       4 files — Attack E-graph: core, e-matching, saturation, extraction
  │   │   └── Rules/        4 files — Attack rewrite rules: differential, linear, structural, hybrid
  │   ├── CollisionVariety/ 2 files — Collision variety bounds, Hilbert irreducibility
  │   ├── DivisionProperty/ 4 files — Integral attack model
  │   ├── SlideAttack/      3 files — Slide attack model
  │   ├── RelatedKey/       2 files — Key schedule + related-key attacks
  │   ├── InvariantSubspace/1 file  — Invariant subspace attacks
  │   └── ConstantTime/     1 file  — Side-channel constant-time verification
  ├── Sbox/           16 files — S-box analysis: GF(2^4), DDT/LAT/ANF, certificates, bridge
  ├── Pipeline/       10 files — Security pipeline: verdict, instances, wide trail, end-to-end
  ├── Validation/      7 files — Cross-validation gates: V27, V28, V30, V31, V40, V50, CrossV28
  ├── EGraph/         20 files — E-graph engine: union-find, saturation, extraction, rewriting
  ├── DP/             12 files — Dynamic programming: nice tree DP, compose, optimality
  ├── Math/           12 files — Foundations: Walsh-Hadamard, collision bounds, AG distribution, inverse sieve
  ├── Keccak/          7 files — Keccak/SHA-3: spec, algebraic degree, treewidth, BitVec
  ├── Feistel/         3 files — Feistel: spec, security, concrete bounds
  ├── ILP/             3 files — ILP: spec, check, unified extraction
  ├── Poseidon/        2 files — Poseidon: spec, analysis
  ├── RescuePrime/     2 files — Rescue-Prime: spec, analysis
  ├── SAT/             2 files — SAT: encoding, certificate
  ├── GMiMC/           1 file  — GMiMC: spec
  └── (root)          77 files — Core: security model, cost model, graphs, hash specs, ciphers

Toolchain

Lean 4 v4.16.0 — all proofs use Init/Std only, no Mathlib dependency.

Build

lake build

Version History

Version Modules Theorems What changed
v1.0 37 350 Two-layer model, generic floor, structural bounds, verdict
v1.1 39 362 KnownHashes, AnalysisReport (usability layer)
v1.2 42 399 HashSpec pipeline, DegreeFold, AlgExpr, CryptoRewrites
v2.0 68 772 S-box DDT/LAT, E-graph engine, DP over tree decompositions, end-to-end computation pipeline
v2.1 74 844 Zero-parameter pipeline (computeFromSbox), constraint graph from HADES topology, E-graph integration
v2.5 95 1,138 Exponential DP (TreewidthDP), compositional constraint graph, real E-graph rewrite rules, ILP extraction encoding, wide trail theorem, PRESENT 4-bit full end-to-end
v2.6 108 1,352 S-box certificate framework (AES 8-bit), real E-graph optimizer (HashOp + SoundRules), PRESENT S-box from GF(2^4) field ops, wide trail pipeline, E-graph completeness (DAG acyclicity + fuel sufficiency + hypothesis discharge)
v2.7 119 1,500 Math foundations (Walsh-Hadamard, collision bounds, iteration dynamics), ILP real extraction, DPCompose multi-objective, Keccak spec + algebraic degree + constraint graph + treewidth, Feistel spec + security, SAT treewidth oracle
v2.8 132 1,675 3 Directions complete: D1 (Bernstein cost model, tight backward degree, Matsui linear attack), D2 (23 verified rewrite rules — 11 unconditional + 12 conditional), D3 (Feistel concrete bounds via Patarin/CCA). Updated instances, cross-validation, v2.5 legacy cleanup
v3.0 151 1,831 5 reality gaps closed: real E-graph saturation (12 sound rules), DP on 3 real circuits (PRESENT/AES/Keccak), AES 8-bit S-box certified analysis, Keccak sponge security chain, end-to-end pipeline on PRESENT 3-round. Three thesis claims machine-checked.
v3.1 173 2,219 6 future work items implemented: (1) Multi-round circuit DP (PRESENT-3R, AES-2R), (2) Recursive E-graph rewriting with depth reduction, (3) Automated S-box certificate pipeline (table → certified params), (4) Executable cipher implementations (AES round + Keccak χ with FIPS test vectors), (5) Formal DP optimality (dp_optimal_of_validNTD), (6) 3 new ciphers (Poseidon-128, Rescue-Prime-128, GMiMC-128). 6 hash instances total.
v3.1.1 173 2,219 Directory reorganization: 31 files moved to Sbox/ (16), Pipeline/ (10), Validation/ (5). Root reduced from 108→77 files. 45 structural theorems upgraded from native_decide to kernel-verified decide.
v4.0 190 2,202 Extended security model: 7 new attack types (quantum, integral, slide, related-key, invariant subspace, constant-time). Grover/BHT quantum bounds. overallSecurityV4 = min(quantumGenericFloor, expandedStructuralCost).
v5.0 224 2,683 AG quantum bounds (Feng-Ling-Xing), composable attack engine (AttackEGraph, 15 rewrite rules), distribution + collision variety bounds (inverse sieve, Hilbert). 9 hash instances (6 classical + 3 AG code). SuperHash export interface.

Usage

TrustHash is a proof library, not an interactive tool. You use it by writing Lean 4: define your hash parameters, and Lean computes and verifies the security level at compile time. There are several ways to use it, from simplest to most powerful:

v1.0: Manual parameters (FullAnalysis)

Import TrustHash.FullVerdict and fill in the FullAnalysis structure:

import TrustHash.FullVerdict
open TrustHash.FullVerdict

def poseidon128 : FullAnalysis where
  outputBits     := 256   -- n: hash output length in bits
  diffUniformity := 4     -- δ: differential uniformity of the S-box
  activeSboxes   := 22    -- b: minimum active S-boxes over all trails
  algebraicDeg   := 5     -- d: algebraic degree of the S-box (x^5)
  treewidth      := 2     -- tw: treewidth of the HADES section (t-1 for t=3)
  totalRounds    := 8     -- R: total number of rounds
  skippedRounds  := 0     -- s: rounds the adversary can skip
  h_output := by omega    -- proofs that parameters are well-formed
  h_delta  := by omega
  h_active := by omega
  h_deg    := by omega
  h_tw     := by omega
  h_rounds := by omega
  h_skip   := by omega

Evaluate the security verdict:

#eval genericFloor poseidon128     -- 128 (birthday: 256/2)
#eval differentialCost poseidon128 -- 17592186044416 (4^22)
#eval algebraicCost poseidon128    -- 25 (5^2)
#eval structuralCost poseidon128   -- 25 (min of differential and algebraic)
#eval overallSecurity poseidon128  -- 25 (min of generic and structural)

Interpretation: Poseidon-128 with treewidth 2 has overall security = 25, far below the generic floor of 128. The algebraic attack (degree 5, treewidth 2 = cost 25) is the bottleneck. Increasing treewidth or algebraic degree would improve security.

Prove properties about your hash:

-- Security is always at least 1
example : overallSecurity poseidon128 ≥ 1 :=
  structural_cost_pos poseidon128 -- structural ≥ 1, and overall ≤ structural

-- Overall security can't exceed the birthday bound
example : overallSecurity poseidon128 ≤ genericFloor poseidon128 :=
  overall_le_generic poseidon128

Example: AES-128 (optimal design)

def aes128 : FullAnalysis where
  outputBits     := 128   -- 128-bit output
  diffUniformity := 4     -- AES S-box: δ = 4
  activeSboxes   := 50    -- 50 active S-boxes over 10 rounds
  algebraicDeg   := 7     -- AES S-box: degree 7
  treewidth      := 4     -- AES state: 4×4 matrix, tw = 4
  totalRounds    := 10    -- AES-128: 10 rounds
  skippedRounds  := 0
  h_output := by omega; h_delta := by omega; h_active := by omega
  h_deg := by omega; h_tw := by omega; h_rounds := by omega; h_skip := by omega

#eval genericFloor aes128     -- 64 (birthday: 128/2)
#eval structuralCost aes128   -- 2401 (min(4^50, 7^4) = 2401)
#eval overallSecurity aes128  -- 64 (min(64, 2401) = 64)
-- OPTIMAL: structural cost (2401) exceeds generic floor (64)
-- The birthday bound is the bottleneck, not the cipher's structure.

Example: Detecting a weak design

def weakHash : FullAnalysis where
  outputBits     := 256
  diffUniformity := 4
  activeSboxes   := 3     -- very few active S-boxes
  algebraicDeg   := 3     -- low algebraic degree
  treewidth      := 1     -- minimal treewidth
  totalRounds    := 4
  skippedRounds  := 0
  h_output := by omega; h_delta := by omega; h_active := by omega
  h_deg := by omega; h_tw := by omega; h_rounds := by omega; h_skip := by omega

#eval genericFloor weakHash     -- 128
#eval structuralCost weakHash   -- 3 (min(4^3, 3^1) = min(64, 3) = 3)
#eval overallSecurity weakHash  -- 3
-- WEAKNESS: structural cost (3) is far below generic floor (128).
-- The algebraic attack at treewidth 1 costs only 3^1 = 3.

v1.1: KnownHashes and Reports

v1.1 adds a usability layer with precalculated analyses and human-readable reports.

Analyze a known hash (no need to define parameters):

import TrustHash.KnownHashes
import TrustHash.AnalysisReport
open TrustHash.FullVerdict
open TrustHash.AnalysisReport
open TrustHash.KnownHashes

#eval analyzeReport "AES-128" aes128
-- === AES-128 ===
--   Output bits:       128
--   Generic floor:     64
--   Differential cost: 1267650600228229401496703205376 (δ=4, b=50)
--   Algebraic cost:    2401 (d=7, tw=4)
--   Structural cost:   2401
--   Overall security:  64
--   Verdict:           OPTIMAL
--   Bottleneck:        generic (birthday bound)

Compare all known hashes (sorted by overall security):

#eval compareTable knownHashes
-- Hash                              Generic   Struct   Overall  Verdict
-- ------------------------------------------------------------------------
-- Poseidon-256 (t=9)                    128   390625       128  OPTIMAL
-- SHA-256 (model)                       128   117649       128  OPTIMAL
-- Keccak-256 (model)                    128      243       128  OPTIMAL
-- Rescue-Prime                          128      125       125  WEAKNESS
-- Griffin-256                           128      125       125  WEAKNESS
-- AES-128                                64     2401        64  OPTIMAL
-- Poseidon-128 (t=3)                    128       25        25  WEAKNESS
-- Weak Hash                             128        3         3  WEAKNESS

Available instances: poseidon128_t3, poseidon256_t9, aes128, rescuePrime, griffin256, sha256_model, keccak256_model, weakHash_example.

v1.2: Computational Pipeline (HashSpec)

v1.2 closes the gap between the arithmetic model and real cryptanalytic computation. Instead of manually providing derived parameters (active S-boxes, treewidth), you define only design parameters and the pipeline computes everything.

Define design parameters only:

import TrustHash.Pipeline.PipelineInstances
open TrustHash.HashSpec
open TrustHash.PipelineVerdict

def myHash : HashSpec where
  construction   := .Sponge
  outputBits     := 256
  stateWidth     := 3
  rate           := 128
  capacity       := 128
  sboxDegree     := 5       -- S-box degree
  diffUniformity := 4       -- S-box differential uniformity
  branchNumber   := 3       -- MDS branch number
  fullRounds     := 8       -- R_F full rounds
  partialRounds  := 57      -- R_P partial rounds
  h_output := by omega; h_width := by omega; h_deg := by omega
  h_delta := by omega; h_branch := by omega; h_rounds := by omega

Pipeline computes derived values automatically:

-- Pipeline computes treewidth from round structure
-- tw(full) = max(1, t-1), tw(partial) = 1

-- Pipeline computes active S-boxes from wide trail
-- active = branchNumber * fullRounds + partialRounds

-- Pipeline computes costs and verdict
#eval pipelineVerdictTag myHash    -- "WEAKNESS"
#eval computeVerdict myHash        -- 25

Prove pipeline verdicts (native_decide):

theorem myHash_weakness :
    pipelineVerdictTag myHash = "WEAKNESS" := by native_decide

theorem myHash_verdict_val :
    computeVerdict myHash = 25 := by native_decide

v1.2 also includes:

  • DegreeFold: Degree tracking catamorphism over HADES round sequences (adapted from DynamicTreeProg.treeFold)
  • FoldCostBound: Proof that fold-based computation matches pipeline computation
  • AlgExpr + CryptoRewrites: Sound algebraic rewrite rules for S-box degree simplification (adapted from OptiSat.SoundRewriteRule)
  • SimplificationVerifier: End-to-end verified simplification (e.g., d^8 * 1^57 = d^8 for Poseidon partial rounds)

v2.0: End-to-End Computation Pipeline

v2.0 replaces manual parameter entry with real cryptanalytic computation. It adds 5 functional components that compute security parameters directly from S-box tables and constraint graph structure:

  1. S-box analysis: Computes DDT, LAT, and algebraic degree FROM the S-box lookup table
  2. Differential trail search: Analytical + exhaustive search for minimum active S-boxes
  3. Treewidth computation: Greedy min-degree elimination on constraint graphs, tree decomposition, nice tree conversion
  4. E-graph engine: Union-find with congruence closure, pattern matching, fuel-based saturation, greedy extraction — with semantic preservation proofs
  5. DP over tree decompositions: niceTreeFold catamorphism with crypto-specific cost functions, connecting differential and algebraic DP to a unified security bound

Full pipeline: HashSpec + NiceNode -> ComputedAnalysis -> verdict:

import TrustHash.Pipeline.ComputeInstances
open TrustHash.ComputePipeline
open TrustHash.ComputeInstances

-- Poseidon-128: HADES structure as nice tree
-- Full analysis computed via DP over tree decomposition
#eval poseidon128Analysis.dpDiffCost      -- 18
#eval poseidon128Analysis.dpAlgCost       -- 22
#eval poseidon128Analysis.dpSecurityCost  -- 18 (min of diff and alg)
#eval poseidon128Analysis.activeSboxes    -- 81
#eval poseidon128Analysis.treewidth       -- 2
#eval computedVerdictTag poseidon128Analysis  -- "WEAKNESS"

All values are verified by native_decide:

theorem poseidon_diff_cost :
    poseidon128Analysis.dpDiffCost = 18 := by native_decide

theorem poseidon_sec_cost :
    poseidon128Analysis.dpSecurityCost = 18 := by native_decide

theorem poseidon_verdict :
    computedVerdictTag poseidon128Analysis = "WEAKNESS" := by native_decide

Compare hash functions with verified ordering:

-- AES has higher DP security than PRESENT (stronger S-box)
theorem aes_stronger_than_present :
    aes128Analysis.dpSecurityCost ≥ presentAnalysis.dpSecurityCost := by
  native_decide

-- Poseidon has higher DP security than PRESENT
theorem poseidon_stronger_than_present :
    poseidon128Analysis.dpSecurityCost ≥ presentAnalysis.dpSecurityCost := by
  native_decide

Pipeline properties (proven generically, not just for concrete instances):

-- Overall security never exceeds the birthday bound
theorem overall_le_generic (spec tree delta degree hd hdeg) :
    (computeAnalysis spec tree delta degree hd hdeg).overallSecurity
    ≤ (computeAnalysis spec tree delta degree hd hdeg).genericFloor

-- DP security cost is always ≥ 1
theorem dp_sec_pos (spec tree delta degree hd hdeg) :
    (computeAnalysis spec tree delta degree hd hdeg).dpSecurityCost ≥ 1

-- OPTIMAL verdict means structural security ≥ generic floor
theorem optimal_means (a : ComputedAnalysis)
    (h : computedVerdictTag a = "OPTIMAL") :
    a.dpSecurityCost ≥ a.genericFloor

-- WEAKNESS verdict means structural security < generic floor
theorem weakness_means (a : ComputedAnalysis)
    (h : computedVerdictTag a = "WEAKNESS") :
    a.dpSecurityCost < a.genericFloor

v2.1: Zero-Parameter Pipeline

v2.1 closes all remaining disconnection gaps. You provide ONLY a HashSpec and an S-box lookup table — the pipeline computes everything: differential uniformity, algebraic degree, constraint graph, tree decomposition, DP costs, and verdict.

Full pipeline from S-box table:

import TrustHash.Pipeline.ComputeInstancesV21
open TrustHash.EndToEndPipeline
open TrustHash.ComputeInstancesV21
open TrustHash.ComputePipeline

-- PRESENT: computed entirely from S-box table (zero manual parameters)
#eval presentV21.sboxDelta         -- 4 (computed from DDT)
#eval presentV21.sboxDegree        -- 3 (computed from ANF)
#eval presentV21.activeSboxes      -- 124
#eval presentV21.treewidth         -- 15
#eval computedVerdictTag presentV21  -- "OPTIMAL"

Graph-computed trees (no hardcoded tree topologies):

-- Poseidon v2.1: tree derived from HADES constraint graph
-- buildConstraintGraph → completeGraph K₃ → fromGraph → NiceNode
#eval poseidon128V21.activeSboxes    -- 81 (same as v2.0)
#eval poseidon128V21.treewidth       -- 2  (same as v2.0)

Cross-validation: v2.1 = v2.0 for all security quantities:

-- v2.1 computed trees produce identical results to v2.0 hardcoded trees
theorem poseidon_active_eq :
    poseidon128V21.activeSboxes = poseidon128Analysis.activeSboxes := by native_decide

theorem poseidon_tw_eq :
    poseidon128V21.treewidth = poseidon128Analysis.treewidth := by native_decide

E-graph integration: symbolic optimization of hash expressions:

import TrustHash.EGraph.PipelineIntegration
open TrustHash.EGraph.PipelineIntegration

-- E-graph optimizes algebraic degree expression via equality saturation
theorem poseidon_optimizes :
    (optimizeAndExtract poseidon128).isSome = true := by native_decide

-- Two-phase: AST rewrite rules (proven sound) + E-graph optimization
theorem poseidon_twophase :
    (optimizeTwoPhase poseidon128).isSome = true := by native_decide

-- AST-level simplification preserves evaluation (general theorem)
theorem twoPhase_ast_sound (spec : HashSpec) (env : Env) :
    eval (applyChain standardRules (hashToExpr spec)) env =
    eval (hashToExpr spec) env

v2.5: Gap Closure — Exponential DP + Real E-Graph

v2.5 closes 6 gaps between the paper's claims and the v2.1 code:

Gap v2.1 State v2.5 Fix
1 DP is linear childResult + cost additive Exponential TreewidthDP with BagAssignment-based DPTable
2 Constraint graph = lookup table if fullRounds > 0 then K_t else S_t Compositional hadesConstraintGraph from per-round primitives
3 E-graph = identity (empty rules) optimize expr [] config 4 real algebraic rewrite rules via cryptoERewriteRules
4 No concrete end-to-end instance Pipeline not tested on S-box data PRESENT 4-bit proven end-to-end via native_decide
5 Hardcoded verdicts Static verdict records computedVerdictV25 / computedBottleneckV25 branch dynamically
6 Wide trail not formalized Referenced in docstrings wide_trail_two_rounds theorem (Daemen-Rijmen bound)

v2.5 full pipeline with exponential DP:

import TrustHash.ValidationV25
open TrustHash.EndToEndV25
open TrustHash.ValidationV25

-- PRESENT: full v2.5 pipeline from S-box table
-- Uses exponential DP (tighter bounds than v2.1 linear DP)
theorem present_v25_egraph :
    (computeFromSboxV25 presentHashSpec presentSbox (by native_decide) (by native_decide)).egraphOptimized = true := by
  native_decide

-- v2.5 uses 4 real crypto E-graph rules (not empty/identity)
theorem present_v25_rules :
    (computeFromSboxV25 presentHashSpec presentSbox (by native_decide) (by native_decide)).egraphRulesUsed = 4 := by
  native_decide

-- Exponential DP gives tighter (lower) bounds than linear DP
theorem present_v25_dp_le_v21 :
    (computeFromSboxV25 presentHashSpec presentSbox (by native_decide) (by native_decide)).dpSecurityV25 ≤
    (computeFromSboxV25 presentHashSpec presentSbox (by native_decide) (by native_decide)).dpSecurityV21 := by
  native_decide

Cross-validation: v2.5 agrees with v2.1 on shared metrics:

-- S-box analysis, active S-boxes, treewidth, and generic floor
-- are identical between v2.5 and v2.1
theorem present_v25_v21_active_eq :
    (computeFromSboxV25 ...).activeSboxes = (computeFromSbox ...).activeSboxes := by native_decide

theorem present_v25_v21_generic_eq :
    (computeFromSboxV25 ...).genericFloor = (computeFromSbox ...).genericFloor := by native_decide

v2.5 also includes:

  • TreewidthDP: Exponential DP over bag assignments with fuel-bounded computation and v2.1 fallback
  • DPTableLemmas: 43 theorems on DP table operations (leaf, introduce, forget, join)
  • ILP encoding: TENSAT formulation (C1-C4) for E-graph extraction with solution checking
  • WideTrailTheorem: Daemen-Rijmen bound on active S-boxes from branch number
  • Math foundations: Binomial, multi-collision, algebraic immunity, differential theory

Where do the parameters come from?

Level Interface What you provide What is computed
v3.1 AutoSboxPipeline / CipherComparison S-box table OR HashSpec Automated pipeline: S-box table → certified params → E-graph → DP (with optimality proof) → multi-round composition → verdict for 6 ciphers
v3.0 PresentEndToEnd / AES128Analysis / Keccak256Analysis SboxCertifiedParams + circuit Full pipeline: E-graph optimization + circuit → constraint graph → TD → DP → two-layer verdict for 3 real ciphers
v2.8 EndToEndPipeline + attack rules HashSpec + ConcreteSbox Everything + Bernstein cost, tight backward degree, Matsui linear, Feistel bounds, 23 rewrite rules
v2.1 EndToEndPipeline HashSpec + ConcreteSbox Everything: delta, degree, graph, tree, DP costs, verdict
v2.0 ComputePipeline HashSpec + NiceNode + delta + degree DP costs, active S-boxes, treewidth, generic floor, verdict
v1.2 PipelineVerdict HashSpec (construction, state width, S-box, rounds) Treewidth, active S-boxes, costs, verdict
v1.0 FullAnalysis All derived parameters directly Costs and verdict only

Module reference

v1.0: Core security model

Module Use case
SecurityDefs Security level, ideal security, margin definitions
GenericFloor Birthday, GBP, Joux, sponge bounds
ThreatLattice4D Hash Gone Bad 4D threat classification (5,000 scenarios)
CostModel Differential and algebraic cost types
MDSBranchNumber MDS matrix branch number bounds
MDSMultOrder Required multiplicative order for MDS
ActiveSboxBounds Active S-box counting and bounds
ForwardDegree Forward algebraic degree propagation
BackwardDegree Backward algebraic degree propagation
LinearApproximation Linear bias bounds
TreewidthModel HADES full/partial round treewidth
HADESSectionTW HADES section decomposition
TreeFoldSecurity Structural cost via tree fold (min of diff and alg)
ZeroSumDistinguisher Zero-sum subspace dimension bounds
TwoLayerVerdict Two-layer security verdict
RoundSkipBound Round-skipping attack cost
HashRanking Rank hash functions by security level
ModeAnalysis Mode-dependent (sponge) security
ConfigPartialOrder Compare hash configurations (dominance)
FullVerdict Complete two-layer verdict

v1.1: Usability layer

Module Use case
AnalysisReport Human-readable reports and comparison tables
KnownHashes Precalculated analyses for 8 well-known hashes

v1.2: Computational pipeline (Pipeline/)

Module Use case
HashSpec Design-parameter-only hash specification
Pipeline/GenericFloorPipeline Generic floor from HashSpec
Pipeline/WideTrailPipeline Active S-boxes and treewidth from HashSpec
Pipeline/StructuralPipeline Structural cost from HashSpec
Pipeline/PipelineVerdict Automatic verdict from HashSpec
Pipeline/PipelineInstances Concrete instances (Poseidon, AES, SHA-256, Keccak, Rescue)
RoundStructure HADES round sequence decomposition
DegreeFold Degree tracking catamorphism over round sequences
FoldCostBound Fold-pipeline equivalence proofs
AlgExpr Algebraic expression AST
CryptoRewrites Sound rewrite rules (pow_combine, pow_pow, distribute, mul_one, pow_one)
SimplificationVerifier End-to-end verified algebraic simplification

v2.0: Shared infrastructure

Module Use case
Bitwise XOR, popcount, parity, truncation for finite fields
Graph SimpleGraph with adjacency lists, complete/star/path constructors
FinIter maxOver, sumOver, countOver, tabulate2D without Finset

v2.0: S-box analysis (Sbox/)

Module Use case
Sbox/SboxTable ConcreteSbox type with lookup table, eval, outputDiff
Sbox/DDTCompute Difference distribution table computation, diffUniformityFromTable
Sbox/LATCompute Linear approximation table, Walsh coefficients, nonlinearityFromTable
Sbox/AlgDegreeCompute Mobius transform, ANF, algebraicDegreeFromTable
Sbox/SboxBridge SboxAnalysis bundle bridging S-box metrics to pipeline

v2.0: Graph + treewidth

Module Use case
DiffTrail Differential trail type, branch constraints, wide trail bound
WideTrailSearch Analytical + exhaustive minimum active S-box search
TrailBridge Trail analysis to pipeline bridge
EliminationOrder Greedy min-degree vertex elimination
TreeDecomp Tree decomposition from elimination ordering
NiceTreeConvert NiceNode inductive, conversion from tree decomposition
TWBridge Treewidth analysis to pipeline bridge

v2.0: E-graph engine

Module Use case
EGraph/UnionFind Disjoint-set forest with path compression (adapted from OptiSat)
EGraph/Core CryptoOp (Const/Var/Add/Mul/Pow), ENode, EClass, EGraph with add/merge/canonicalize
EGraph/CoreSpec Well-formedness predicates (ChildrenBounded, HashconsValid, SizesAligned) + preservation proofs
EGraph/EMatch CryptoPattern matching, Substitution, rewrite rule application
EGraph/Saturate Fuel-based equality saturation loop with fixpoint detection
EGraph/SemanticSpec NodeEval (Nat semantics), ConsistentValuation, equivalence preserves value
EGraph/Extraction Greedy bottom-up cost extraction from saturated E-graph
EGraph/PipelineSoundness End-to-end: addExpr + optimize pipeline with eval preservation

v2.0: DP over tree decompositions

Module Use case
DP/NiceTree niceTreeFold (4-constructor catamorphism), niceTreeFold_lower_bound
DP/CryptoCost Additive cost functions, vertex cost parameters, bag cost monotonicity
DP/DPOperations cryptoDP via niceTreeFold, lower bound, positivity, monotonicity
DP/SecurityDP differentialDP, algebraicDP, securityDP = min(diff, alg)
DP/DPBridge DPAnalysis structure, analyzeSecurity, bridge to pipeline

v2.0: Integration (Pipeline/)

Module Use case
Pipeline/ComputePipeline ComputedAnalysis, computeAnalysis, verdict tag, bottleneck identification
Pipeline/ComputeInstances Poseidon/AES/PRESENT with nice trees, DP costs, cross-hash comparisons

v2.1: Pipeline wiring (Pipeline/)

Module Use case
ConstraintGraph buildConstraintGraph: HADES topology → complete K_t or star S_t
Pipeline/EndToEndPipeline computeFromSbox: HashSpec + ConcreteSbox → ComputedAnalysis (zero manual params)
Pipeline/ComputeInstancesV21 PRESENT/Poseidon/AES via zero-parameter pipeline, v2.1 vs v2.0 cross-validation

v2.1: E-graph pipeline integration

Module Use case
EGraph/HashExpr hashToExpr: HashSpec → AlgExpr (algebraic degree expression), E-graph registration
EGraph/PipelineIntegration optimizeAndExtract, optimizeTwoPhase (AST rules + E-graph), depth validation

v2.1: Validation

Module Use case
ValidationV21 Cross-validation v2.1 = v2.0 (active S-boxes, treewidth, generic floor), E-graph end-to-end

v2.5: Mathematical foundations

Module Use case
Math/BinomialBasic Factorial, choose, falling factorial for multi-collision bounds
Math/MultiCollision k-collision counting and Suzuki bounds
Math/AlgebraicImmunity Annihilator existence, algebraic immunity lower bound
Math/DifferentialTheory Differential uniformity, DDT partition, APN functions
Math/Foundations BoolFn, foldFin infrastructure, Hamming weight
WideTrailTheorem SPN state, sboxWeight, wide_trail_two_rounds (Daemen-Rijmen bound)

v2.5: Exponential DP

Module Use case
DP/Util/NatOpt Nat.min properties for DP table operations
DP/Util/FoldMin List.foldl Nat.min lemmas
DP/Util/InsertMin HashMap-based insert-if-min for DP tables
DP/Util/DPBound forget/join/introduce preserve DP bounds
DP/TreewidthDP BagAssignment, DPTable, dpLeaf/dpIntroduce/dpForget/dpJoin, runDP with fuel
DP/CryptoCost Crypto-specific cost functions for DP

v2.5: Constraint graph + E-graph + ILP

Module Use case
EGraph/ILPTypes TENSAT ILP formulation types (ILPVar, ILPConstraint, ILPProblem, ILPSolution)
EGraph/ILPEncode E-graph → ILP encoding (C1 root, C2 exactly-one, C3 child deps, C4 acyclicity)
EGraph/ILPCheck ILP solution validation, extraction, ValidExtraction certificate
EGraph/CryptoRewritesEGraph 4 E-graph rewrite rules (mulOne, oneMul, powOne, distribute)

v2.5: Pipeline end-to-end (Pipeline/)

Module Use case
Pipeline/EndToEndPipeline computeFromSbox: full pipeline (DP + constraint graph + E-graph)

v2.6: S-box certificates + real E-graph optimizer (Sbox/, Pipeline/)

Module Use case
Sbox/DDTCertificate Offline DDT certificate structure + deterministic checker
Sbox/AES8BitSbox 256-entry AES S-box + DDT certificate
Sbox/CertifiedSboxBridge Certificate → pipeline bridge
HashOpExpr Multi-round algebraic expression generator
CryptoSoundRules 10+ sound rewrite rules with formal proofs
RealOptimizer E-graph optimizer with measurable depth reduction
Sbox/BitVec4Arith GF(2^4) field arithmetic (gf16_mul, gf16_inv) — all 9 field axioms kernel-verified via decide
Sbox/PresentSboxGF PRESENT S-box from GF(2^4) field operations
Sbox/GFBridge GF-derived delta = table-derived delta
WideTrailDP Wide trail cost function from Daemen-Rijmen
Pipeline/PipelineWideTrail Wide trail integration with DP pipeline
HashFormalSpec Exportable formal specification
EGraph/CompletenessSpec DAG acyclicity, fuel sufficiency, hypothesis discharge
Validation/ValidationV26 Cross-validation v2.6, final gate

v2.7: Keccak + Feistel + ILP + SAT

Module Use case
Math/WalshHadamard Walsh-Hadamard transform, Parseval, spectral analysis
Math/SecurityExtensions Ideal degree, SPN degree extensions
Math/CollisionBounds Collision probability bounds
Math/IterationDynamics Fixed point iteration, cycle structure
ILP/ILPSpec ILP specification types (decidable, no sorry)
ILP/ILPCheck ILP solution validation and extraction
ILP/ExtractionUnified Unified extraction (greedy + ILP + treewidth DP)
DP/DPCompose DP composition over multiple decompositions
DP/DPMultiObjective Multi-objective DP (differential + algebraic)
SAT/SATEncoding SAT encoding for treewidth oracle
SAT/SATCertificate SAT certificate validation
Keccak/KeccakSpec Keccak-f permutation (θ, ρ, π, χ, ι)
Keccak/KeccakAlgDegree Keccak algebraic degree analysis
Keccak/KeccakConstraintGraph Keccak constraint graph
Keccak/KeccakTreewidth Keccak treewidth bounds
Feistel/FeistelSpec Feistel round, inverse, multi-round
Feistel/FeistelSecurity SPRP/CCA bounds, required rounds
Validation/ValidationV27 Cross-validation v2.7

v2.8: 3 Directions complete (D1 + D2 + D3)

Module Use case
BernsteinCostModel Time × price security, parallel invariance, multi-target amortization, TMTO
TightBackwardDegree Boura-Canteaut composition degree, tight Keccak backward degree (1165 vs 1599)
MatsuiLinearAttack Piling-up lemma, linear attack complexity O(1/ε²), MDS linear bound
Feistel/FeistelConcreteBounds Patarin attack table (1–6+ rounds), CCA security, 11-round CCA-secure
ConditionalRewriteRule Extends RewriteRule with decidable side conditions, backward compatibility
DifferentialAttackRules 5 conditional rules: δ-propagation, trail fold, zero/one delta, max prob
LinearAttackRules 3 conditional rules: bias square, piling fold, const fold
AlgebraicAttackRules 4 conditional rules: degree compose, identity, zero, const fold
ExtendedAttackRules Unified chain: 23 rules total (11 unconditional + 12 conditional) + soundness
UpdatedInstances AES, Poseidon, Keccak, DES with Bernstein/backward/Matsui/Feistel bounds
Validation/CrossValidationV28 v2.8 vs v2.6: no regressions, tighter bounds where improved
Validation/ValidationV28 Final gate: 0 sorry, 0 axioms, D1/D2/D3 summary theorems

v3.0: Real pipeline — E-graph, DP, certified S-boxes, end-to-end

v3.0 closes the 5 reality gaps identified in an honest audit of v2.8:

Gap v2.8 State v3.0 Fix
1 E-graph tested only with empty rules Rules existed but never fired on real expressions 12 sound rewrite rules fire on real hash expressions, saturation reduces depth
2 DP uses hardcoded trees Nice trees built manually Circuit → constraint graph → tree decomposition → nice tree → DP chain
3 AES 8-bit S-box never analyzed DDT certificate existed, no full analysis Certified S-box params (δ=4, d=7, NL=112) → full ComputedAnalysis (security=62)
4 Keccak/Feistel specs not connected Specs defined but not linked to analysis Sponge security (c/2=128) + algebraic degree chain + structural cost → verdict
5 No end-to-end pipeline on real hash Pipeline stages existed separately PRESENT 3-round: S-box → expression → E-graph → circuit → TD → DP → verdict

Three thesis claims machine-checked (ThesisDemonstration.lean):

-- Claim 1: E-graph saturation reduces depth while preserving semantics
-- 12 distinct rules fire, all with individual soundness proofs
theorem claim1_depth_reduction :
    present3Round.depth > present3Optimized.depth  -- 4 > 2

theorem claim1_soundness (env : Nat → Nat) :
    present3Optimized.eval env = present3Round.eval env

-- Claim 2: DP over tree decompositions on 3 real circuits
-- Monotone in S-box parameters (stronger S-box → higher security)
theorem claim2_three_circuits :
    presentSecDP ≥ 1 ∧ aesSecDP ≥ 1 ∧ keccakChiAlgDP ≥ 1

theorem claim2_monotone :
    securityDP 4 3 tree ≤ securityDP 8 3 tree ∧  -- stronger delta
    securityDP 4 3 tree ≤ securityDP 4 7 tree     -- higher degree

-- Claim 3: Two-layer verdicts for 3 real hash functions
theorem claim3_three_verdicts :
    presentAnalysis.overallSecurity ≤ presentAnalysis.genericFloor ∧
    aes128Analysis.overallSecurity = 62-- differential bottleneck
    keccak256Analysis.overallSecurity = 128      -- sponge capacity binding

End-to-end pipeline on PRESENT 3-round (PresentEndToEnd.lean):

-- 7-stage verified pipeline:
-- Stage 1: Certified S-box → Stage 2: Expression → Stage 3: E-graph →
-- Stage 4: Circuit → Stage 5: Tree decomp → Stage 6: DP → Stage 7: Verdict

theorem present_full_chain :
    presentCertParams.delta = 4-- Stage 1: certified
    presentCertParams.degree = 3 ∧
    present3Round.depth = 4-- Stage 2: expression built
    present3Optimized.depth < present3Round.depth ∧  -- Stage 3: E-graph reduces
    present1RoundCircuit.numOps = 5-- Stage 4: circuit
    present1TD.width ≤ 3-- Stage 5: treewidth bounded
    presentSecDP ≥ 1-- Stage 6: DP positive
    presentAnalysis.overallSecurity ≤ presentAnalysis.genericFloor  -- Stage 7

Concrete security verdicts for 3 hash functions:

Hash Generic Floor Structural Cost Overall Verdict
PRESENT 3-round 32 DP-computed min(32, DP) Bounded
AES-128 64 62 (diff bottleneck) 62 Near-optimal
SHA-3-256 128 (sponge c/2) >2^40 (24 rounds) 128 Optimal
v3.0 module reference
Module Use case
HashOp Extended hash operations: HashOp inductive + HashExprF with eval, depth, DecidableEq
HashExpr Expression builder from HashSpec, round composition
Sbox/SboxCertifiedParams Dependent record SboxCertifiedParams with proof obligations (δ, d, NL bounds)
HashSoundRules 12 sound rewrite rules: sbox compose, ark identity, const fold, rounds compose, etc.
EGraph/RealSaturate Configurable saturation engine: RealSatConfigRealSatResult with progress tracking
EGraph/RealPipelineDemo Saturation demos on 4 PRESENT expressions, fixpoint + depth reduction proofs
ConstraintGraphCompute HashCircuit → constraint graph builder for PRESENT/AES/Keccak circuits
TreeDecompCertificate TDCertificate: external compute + Lean verify pattern for tree decompositions
DP/CryptoDPInstance First concrete DP on real circuits: PRESENT (δ=4,d=3), AES (δ=4,d=7), Keccak chi (d=2)
Sbox/AES8BitCertified Full AES 8-bit certified analysis chain: DDT cert + params + circuit + bridge
Keccak/KeccakSecurityAnalysis SHA-3-256 two-layer: sponge (128) + structural (>2^40) + cross-cipher comparison
AES128Analysis ComputedAnalysis for AES-128: overall=62, differential bottleneck
Keccak/Keccak256Analysis ComputedAnalysis for SHA-3-256: overall=128, sponge binding, massive margin
PresentEndToEnd Keystone: 7-stage pipeline on PRESENT 3-round, present_full_chain theorem
ThesisDemonstration 3 thesis claims: E-graph depth, DP optimality, compositional security
Validation/ValidationV30 Final gate: 5 gaps closed, project metrics, v2.8 regression check

v3.1: Automated pipeline — multi-round DP, recursive E-graph, 6 ciphers

v3.1 implements all 6 future work items identified in v3.0:

Item v3.0 State v3.1 Implementation
1 Multi-round DP 1-round circuits only MultiRoundInstances: PRESENT-3R (composedCost ≥ singleRoundCost), AES-2R
2 Recursive E-graph Flat pattern matching RecursiveRewrite/RecursiveSaturate: S-box chain depth 5→1, PRESENT depth reduction
3 Auto S-box certs Manual SboxCertifiedParams AutoSboxPipeline: table → DDT/LAT/ANF → certified (δ, d, NL)
4 Executable ciphers Mathematical specs only AESRoundFunction (SubBytes/ShiftRows/MixColumns/AddRoundKey) + KeccakBitVec (χ/θ) with FIPS test vectors
5 DP optimality Lower bound only DPOptimalProof.dp_optimal_of_validNTD: DP is tight for any ValidNTD tree
6 Broader coverage 3 ciphers 6 ciphers: +Poseidon-128, Rescue-Prime-128, GMiMC-128 with CipherComparison

Six hash instances with verified security analysis:

Hash Generic Floor Overall Security Verdict
PRESENT 3-round 32 DP-computed Bounded
AES-128 64 62 Near-optimal
SHA-3-256 128 128 Optimal
Poseidon-128 64 18 Weakness
Rescue-Prime-128 64 18 Weakness
GMiMC-128 64 11 Weakness

DP optimality theorem (DPOptimalProof.lean):

-- For any valid nice tree decomposition, the crypto DP is optimal:
-- it is the MINIMUM over all cost functions respecting the tree structure.
theorem dp_optimal_of_validNTD (c : Nat) (tree : NiceNode)
    (hvalid : ValidNTD tree)
    (f : NiceNode → Nat)
    (hf_leaf : f .leaf ≥ 1)
    (hf_intro : ∀ v child, f (.introduce v child) ≥ f child + c)
    (hf_forget : ∀ v child, f (.forget v child) ≥ f child)
    (hf_join : ∀ l r, f (.join l r) ≥ f l + f r) :
    cryptoDP c tree ≤ f tree

-- Closed-form: DP = leafCount + introCount × cost
theorem dp_eq_formula (c : Nat) (tree : NiceNode) :
    cryptoDP c tree = leafCount tree + introCount tree * c
v3.1 module reference
Module Use case
Sbox/LATCertificate Offline LAT certificate structure + checker
Sbox/SboxFullCert Combined DDT + LAT + degree certificate
Sbox/AutoSboxPipeline Automated: S-box table → certified params (δ, d, NL) in one step
MultiRoundGraph Multi-round constraint graph composition
MultiRoundTreeDecomp Tree decomposition for composed multi-round graphs
MultiRoundDP MultiRoundResult: single-round DP → composed cost with monotonicity proof
MultiRoundInstances PRESENT-3R and AES-2R concrete multi-round analyses
EGraph/RecursiveRewrite recursiveRewrite: rewrite at all subterm positions, exprDepth
EGraph/RecursiveSaturate recursiveSaturate: fuel-bounded recursive saturation loop
EGraph/RecursiveDemo Demos: PRESENT 3-round depth reduction, S-box chain 5→1, soundness proof
Keccak/KeccakBitVec Executable Keccak: chi5, thetaLane on BitVec 5 with algebraic proofs
AESRoundFunction Executable AES round: SubBytes, ShiftRows, MixColumns, AddRoundKey, xtime
TestVectors FIPS test vector validation for AES and Keccak
CipherBridge Bridge executable implementations to security model
Poseidon/PoseidonSpec Poseidon hash specification: full + partial rounds, MDS, S-box x^α
Poseidon/PoseidonAnalysis Poseidon-128 ComputedAnalysis: overall=18, 81 active S-boxes
RescuePrime/RescuePrimeSpec Rescue-Prime spec: forward x^α + inverse x^{1/α} layers
RescuePrime/RescuePrimeAnalysis Rescue-Prime-128 ComputedAnalysis: overall=18
GMiMC/GMiMCSpec GMiMC spec: Feistel-like with single S-box per round
CipherAnalysisV31 GMiMC-128 analysis: overall=11
CipherComparison Cross-cipher comparison: 6 instances ranked, monotonicity proofs
DP/DPOptimalSpec ValidNTD inductive, DPOptimal, DPOptimalityWitness
DP/DPOptimalProof Main theorem: dp_optimal_of_validNTD, dp_eq_formula, securityDP_optimal_of_validNTD
DP/DPOptimalBridge OptimalDPAnalysis, full_pipeline_story, security_cost_formula
Validation/ValidationV31 Final gate: all 6 items closed, v3.0 regression check

v5.0: AG Quantum Bounds + Composable Attack Engine

v5.0 extends TrustHash from a static verifier to a dynamic security analyzer with three new capabilities:

  1. AG quantum bounds (Feng-Ling-Xing): algebraic-geometry codes that surpass naive Grover halving
  2. Composable attack engine: a parallel E-graph that discovers optimal attack combinations
  3. Distribution + collision variety bounds: inverse sieve and Hilbert irreducibility as new security dimensions

Defining a hash (v5.0)

The FullAnalysisV5 structure captures all security-relevant parameters. Lean enforces well-formedness via proof obligations (h_output, h_deg, etc.) — if your parameters are invalid, the code does not compile.

import TrustHash.Attacks.FullVerdictV5
open TrustHash.Attacks.FullVerdictV5

def myHash : FullAnalysisV5 where
  outputBits              := 128    -- hash output length in bits
  diffUniformity          := 4      -- S-box differential uniformity (from DDT)
  activeSboxes            := 50     -- minimum active S-boxes (wide trail)
  algebraicDeg            := 3      -- S-box algebraic degree (from ANF)
  treewidth               := 4      -- treewidth of the round structure
  totalRounds             := 10
  skippedRounds           := 2
  integralDimension       := 32     -- integral distinguisher dimension (0 = not applicable)
  slidePeriod             := 0      -- round function period (0 = no periodicity)
  relatedKeyProb          := 85     -- -log2(related-key differential probability)
  invariantSubDim         := 0      -- invariant subspace dimension (0 = none found)
  keyScheduleNonlin       := 64     -- key schedule nonlinearity measure
  constantTimeSafe        := true   -- implementation is constant-time
  agFieldSize             := 0      -- 0 = classical hash, q = AG code hash over F_q
  distributionBound       := 100    -- inverse sieve security bits
  collisionVarietyBound   := 115    -- collision variety security bits
  bestComposedAttackCost  := 0      -- from attack engine (0 = not computed)
  -- Lean requires proofs that parameters are well-formed:
  h_output := by omega   -- outputBits >= 1
  h_delta  := by omega   -- diffUniformity >= 1
  h_active := by omega   -- activeSboxes >= 1
  h_deg    := by omega   -- algebraicDeg >= 2
  h_tw     := by omega   -- treewidth >= 1
  h_rounds := by omega   -- totalRounds >= 1
  h_skip   := by omega   -- skippedRounds <= totalRounds

Computing the verdict

#eval overallSecurityV5 myHash          -- security level in bits
#eval quantumFloorV5 myHash             -- quantum-aware generic floor
#eval expandedStructuralCostV5 myHash   -- best structural attack cost

The formula that runs internally:

overallSecurityV5 = min(quantumFloorV5, expandedStructuralCostV5)

quantumFloorV5 =
  if agFieldSize > 0 then min(bhtCollisionFloor n, quantumAGBound q 0)
  else quantumGenericFloor n

expandedStructuralCostV5 =
  min(classical_structural,
      min(distributionBound, min(collisionVarietyBound, bestComposedAttackCost)))

Everything is nested min. The adversary picks the cheapest attack. TrustHash reports that minimum.

Proving properties

-- Lean kernel-verifies that the result is exactly 42
theorem myHash_security : overallSecurityV5 myHash = 42 := by native_decide

-- Lean verifies a comparison between two hashes
theorem myHash_beats_gmimc :
    overallSecurityV5 gmimc128V5 ≤ overallSecurityV5 myHash := by native_decide

-- Lean verifies the security never exceeds the quantum floor
theorem myHash_bounded :
    overallSecurityV5 myHash ≤ quantumFloorV5 myHash := v5_le_quantumFloor myHash

If any of these fails, lake build rejects the project. There is no way to bypass verification.

AG code hashes (new in v5.0)

For hashes built over algebraic-geometry codes, set agFieldSize to the base field size. The quantum floor then uses the Drinfeld-Vladut bound instead of naive BHT:

def hyperellipticV5 : FullAnalysisV5 where
  outputBits  := 256
  agFieldSize := 2048   -- F_{2^11}, so DV(2048) = isqrt(2048) - 1 = 44
  -- ...other fields...

#eval quantumFloorV5 hyperellipticV5  -- min(85, 44) = 44
-- AG bound is tighter: 44 < 85 (classical BHT for 256-bit)

v5.0 security results

Hash Security (bits) Binding factor Family
PRESENT-80 26 Quantum floor (n/3) Classical
AES-128 42 Quantum floor (n/3) Classical
SHA-3-256 16 Structural (d^tw = 2^4) Classical
Poseidon-128 25 Structural (5^2) Classical
Rescue-Prime-128 25 Structural (5^2) Classical
GMiMC-128 3 Structural (3^1) Classical
Hyperelliptic-256 44 AG quantum (DV=44) AG code
Garcia-Stichtenoth-128 6 AG quantum (DV=6) AG code
Suzuki-128 15 AG quantum (DV=15) AG code

Where the input parameters come from

TrustHash formalizes the computation of the verdict, not the cryptanalytic work that precedes it. You must supply the parameters:

Parameter Source
diffUniformity DDT of your S-box (e.g., AES S-box has delta=4)
activeSboxes Wide trail strategy: branchNumber * fullRounds + partialRounds
algebraicDeg ANF polynomial degree of your S-box
treewidth Structural analysis of your round's constraint graph
agFieldSize Base field size if your hash uses AG codes (0 otherwise)

What the verification guarantees

The value is not the point — a calculator can compute min(42, 81). What TrustHash provides:

  • Formula correctness: v5_le_quantumFloor, v5_le_structural, v5_le_v4 are kernel-verified theorems
  • Parameter validation: invalid parameters (e.g., algebraicDeg := 0) are rejected at compile time
  • Formal comparisons: poseidon_beats_gmimc is a machine-checked proof, not a paper claim
  • Composition soundness: each of the 15 attack rewrite rules has an individual soundness proof
  • Upgrade safety: v5_le_v4 proves that v5.0 can only decrease or maintain v4.0 security levels

Known limitations

  • Treewidth computation uses greedy min-degree elimination (heuristic upper bound). It gives exact results for HADES-specific graphs (complete K_t, star S_t) but is not guaranteed optimal for arbitrary graphs.
  • E-graph extraction uses greedy bottom-up cost selection. v2.5 adds ILP encoding (TENSAT formulation) with solution checking, but the solver itself is external.
  • Security DP values are abstract cost units (not directly bits of security). The mapping from DP cost to concrete bit-security requires domain-specific interpretation.
  • Exponential DP uses fuel to bound computation. When fuel is exhausted, it falls back to v2.1 linear DP (always terminates, provably correct but less tight).
  • TreeDecomp validity is checked via boolean predicates (isValid), not connected to a formal specification type.
  • Executable ciphers (v3.1) implement individual round functions, not full cipher encryption/decryption with key schedules. Test vectors validate round-level operations.
  • Multi-round DP (v3.1) composes round-level DP results additively. This assumes independent rounds; correlated round attacks are not modeled.

Future work

The following areas represent open directions where TrustHash could be extended:

Mathlib integration

All proofs use Init/Std only (no Mathlib). Several areas would benefit from Mathlib's library:

  • Walsh-Hadamard transform: Currently modeled over Nat; Mathlib.Analysis.Fourier would give proper -valued transforms.
  • Algebraic immunity: Currently uses Nat-based annihilator counting; Mathlib.FieldTheory would formalize GF(2^n).
  • Collision probability bounds: Currently discrete; Mathlib.Probability would give measure-theoretic foundation.

Full E-graph union-find integration

v3.1 recursive rewriting operates on tree syntax (HashExprF). Connecting to the v2.0 union-find E-graph engine (EGraph.Core) would enable congruence closure and sharing-aware optimization, potentially discovering deeper simplifications.

Full cipher implementations with key schedules

v3.1 implements individual round functions (AES SubBytes/MixColumns, Keccak χ/θ). Full cipher encryption with key expansion would enable end-to-end test vector validation against NIST standards.

Correlated round attacks

Multi-round DP (v3.1) assumes independent rounds. Modeling round-key-dependent correlations (e.g., slide attacks, related-key differentials) would strengthen the security analysis.

Additional cipher families

v3.1 covers SPN (PRESENT, AES, Poseidon, Rescue-Prime), sponge (SHA-3), and generalized Feistel (GMiMC). Natural extensions:

  • Poseidon2/Poseidon2b: Extended HADES with external matrix multiplication.
  • BLAKE2/BLAKE3: Compression function analysis via ARX model.
  • Ascon: Lightweight authenticated encryption (NIST LWC winner).

Probabilistic security bounds

Current bounds are worst-case (minimum active S-boxes, algebraic degree). Probabilistic analysis using differential probability distributions would give tighter expected-case bounds.

References

  • Canteaut et al. 2020: Report on Security of STARK-friendly Hash Functions v2.0
  • Merz & Rodriguez Garcia 2026: Skipping Class (round-skipping via invariant subspaces)
  • Cheval et al. USENIX 2023: Hash Gone Bad (4D threat lattice)
  • La Scala & Tiwari 2025: MultiSolve (tree decomposition bridge)
  • Grassi et al. 2025: Poseidon2b (extended security definitions)
  • Wagner 2002: Generalized Birthday Problem
  • Dey & Ghosh 2019: Cryptographic S-box properties
  • Bodlaender 1996: Linear-time algorithms for treewidth (elimination ordering)
  • Nieuwenhuis & Oliveras 2007: Congruence closure and union-find (E-graph foundations)
  • Bernstein 2005: Understanding Brute Force (time × price cost model)
  • Matsui 1993: Linear Cryptanalysis Method for DES Cipher (piling-up lemma)
  • Boura & Canteaut 2010/2011: Zero-sum distinguishers, composition degree bound
  • Patarin 2008: Generic Attacks on Feistel Schemes (round complexity table)
  • Hoang & Rogaway 2010: On Generalized Feistel Networks (CCA bounds)
  • Daemen & Rijmen 2002: The Design of Rijndael (wide trail strategy, MDS branch number)
  • Bertoni et al. 2011: The Keccak Reference (sponge construction, capacity security)
  • Feng, Ling & Xing 2006: AG code quantum bounds (alpha_q(delta) >= A(q) - delta)
  • Walsh 2013/2020: Inverse sieve for algebraic sets, polynomial method over varieties
  • Menconi, Paredes & Sasyk 2020: Inverse sieve for global fields
  • Paredes & Sasyk 2021/2022: Effective Hilbert irreducibility, uniform rational point bounds
  • Brassard, Hoyer & Tapp 1998: Quantum collision search (BHT bound n/3)

About

Formal verification of cryptographic hash function security — 212 theorems, 0 sorry, Lean 4

Resources

Stars

Watchers

Forks

Packages

 
 
 

Contributors