Skip to content

feat: Support nested loop exits#958

Open
daulet wants to merge 17 commits intoAeneasVerif:mainfrom
daulet:daulet/break_support
Open

feat: Support nested loop exits#958
daulet wants to merge 17 commits intoAeneasVerif:mainfrom
daulet:daulet/break_support

Conversation

@daulet
Copy link
Copy Markdown

@daulet daulet commented Apr 23, 2026

Summary

  • Add support for return inside nested loops and labelled outer break/continue.
  • Thread propagated loop exits through pre-passes, symbolic evaluation, fixed-point handling, pure
    lowering, and backend extraction.
  • Add Lean/Coq/F* primitives and generated fixtures for nested loop-exit output.
  • Remove stale docs claiming nested loops are unsupported.

Testing

  • Added focused Rust fixtures covering scalar nested exits, borrow-sensitive propagated exits,
    recursive loop extraction, multi-depth breaks, outer continues, and backend-specific output
    shapes.
  • Added OCaml unit coverage for symbolic loop-exit shapes, fixed-point entry behavior, loop-exit
    context collection, and pure/backend extraction naming.
  • Regenerated Lean baselines for the new loop-exit encoding and added Coq/F* generated backend
    fixtures to pin emitted syntax.
  • Ran the full extraction/regression suite plus targeted nested-loop and backend-generation tests.

Development

  • The sequence of commits is planned and implemented with a mixture of codex 5.3 and 5.5 on xhigh effort, reviewed with claude opus 4.7 on each commit.

@daulet daulet marked this pull request as ready for review April 23, 2026 18:06
@sonmarcho
Copy link
Copy Markdown
Member

sonmarcho commented Apr 23, 2026

Is this AI generated? This looks AI generated: if that is the case, you should indicate it clearly.

@daulet
Copy link
Copy Markdown
Author

daulet commented Apr 23, 2026

Is this AI generated? This looks AI generated: if that is the case, you should indicate it clearly.

Yes, I've updated the description.

@sonmarcho
Copy link
Copy Markdown
Member

I'm sorry but I don't see myself reviewing an AI generated PR with a diff of 6k lines and barely any documentation: I have no idea how your code works. I'm not opposed to AI assisted code and I welcome improvements to the toolchain, but in the current state I can not reasonably review this. Could you give me detailed explanations about how the PR works, the choices you've made, etc.?

@daulet
Copy link
Copy Markdown
Author

daulet commented Apr 23, 2026

I'm sorry but I don't see myself reviewing an AI generated PR with a diff of 6k lines and barely any documentation: I have no idea how your code works. I'm not opposed to AI assisted code and I welcome improvements to the toolchain, but in the current state I can not reasonably review this. Could you give me detailed explanations about how the PR works, the choices you've made, etc.?

Yes, I figured its too big to accept in one PR, especially for first time contributor, but it also address the existing limitation holistically (if not we should construct a test case where it doesn't work).

One way to see the logic of this change is to critique the plan from first commit. Another is the following detailed explanation (also AI generated but with my edits). For code review I'd recommend to review the implementation files first, skip generated backend files, that large generated diff is mostly from the generated Lean import opening LoopExit.

Also important note: I didn't test this on anything that requires Windows.

What this PR changes

The PR adds support for nested loop exits that leave the current loop:

  • break 'outer
  • continue 'outer
  • return from inside a nested loop
  • multiple propagated break/continue depths reaching the same loop

Previously the pre-pass rejected or rewrote many of these cases. This PR stops treating them as unsupported and instead represents them explicitly in the symbolic and pure ASTs.

Core idea

A normal loop is still translated using:

ControlFlow.cont state
ControlFlow.done result

But if a loop body can exit beyond the current loop, the done payload needs to say which kind of exit happened. The PR introduces:

LoopExit normalBreak propagatedBreak propagatedContinue propagatedReturn

Meaning:

  • normalBreak: a break consumed by the current loop
  • propagatedBreak: a break targeting an enclosing loop
  • propagatedContinue: a continue targeting an enclosing loop
  • propagatedReturn: a function return from inside the loop

Panic/failure is not encoded in LoopExit; it remains the existing Result.fail path.

Example shape:

loop (...) : Result (LoopExit normalBreakTy propagatedBreakTy Unit Unit)

The caller then matches on the LoopExit after the loop and either continues normal execution or propagates the exit outward.

Depth handling

LLBC break/continue depths are relative to the loop where they occur.

Inside the current loop:

  • Break 0 means “normal break from this loop”
  • Break n, where n > 0, becomes PropagatedBreak (n - 1)
  • Continue 0 means “re-enter this loop”, so it contributes to the loop fixed point
  • Continue n, where n > 0, becomes PropagatedContinue (n - 1)
  • Return becomes PropagatedReturn

The decrement is important: once an exit leaves the current loop, one loop boundary has been consumed.

Main implementation path

Recommended review order:

  1. src/PrePasses.ml

    This removes the old “outer break/continue/return unsupported” assertions and makes the loop normalization more conservative. It only rewrites local exits when doing so does not change nested control-flow semantics.

  2. src/interp/InterpLoopsFixedPoint.ml

    This classifies loop-body results into local re-entry, normal break, propagated break, propagated continue, propagated return, or non-loop failure. It also computes and joins the symbolic contexts for each exit kind/depth.

  3. src/interp/InterpLoops.ml

    This threads those propagated exit contexts into symbolic loop synthesis. Each propagated exit gets both:

    • a payload context, used to build the loop result value
    • a branch expression, used after the loop result is matched
  4. src/symbolic/SymbolicAst.ml

    This adds LoopExit and loop_exit_kind to the symbolic AST so the exit kind is preserved until pure lowering.

  5. src/symbolic/SymbolicToPureExpressions.ml

    This lowers the symbolic loop into pure code:

    • computes the LoopExit type
    • wraps loop-body exits with the right constructor
    • emits a post-loop match on normalBreak, propagatedBreak, propagatedContinue, and propagatedReturn
  6. src/pure/Pure.ml, src/pure/PureUtils.ml, src/pure/PureTypeCheck.ml, src/pure/PrintPure.ml

    These add the pure builtin type and helpers for LoopExit.

  7. src/extract/ExtractBase.ml and backend primitives

    These map LoopExit and the auxiliary sum type into Lean, Coq, F*, and HOL4 names.

Why introduce AeneasSum

LoopExit has one field for propagated breaks and one field for propagated continues. If multiple break depths reach the same loop, we need to distinguish them inside that one field.

For example, a loop may receive both:

  • break 'middle
  • break 'outer

Those are both propagated breaks, but with different remaining depths. The PR encodes that payload as a binary sum:

AeneasSum leftPayload rightPayload

This avoids generating a custom ADT per loop shape. Lean/Coq/F* use an Aeneas-defined sum to avoid naming collisions; HOL4 uses its native sum with INL/INR.

Why not keep using only ControlFlow

ControlFlow only distinguishes continue vs done. Once done may mean “normal break”, “break outer”, “continue outer”, or “return”, we need another discriminator. Encoding all of that into tuples would be fragile and proof-hostile; LoopExit makes the generated shape explicit.

Current limitations

This PR does not try to solve every possible loop shape. In particular, fixed-point loops with propagated exits but no normal break are still rejected. Most while loops have a normal break path through the false condition, so this still covers the intended nested-control cases.

Example: an infinite loop where every exit is a return or an outer control-flow jump, and there is no path that simply breaks the current loop.

pub fn no_normal_break(flag: bool) -> u32 {
    loop {
        if flag {
            return 1;
        } else {
            return 2;
        }
    }
}

HOL4 extraction names are wired, but the new HOL4 nested-loop fixture is not regenerated in this PR because the current test setup does not regenerate HOL4 outputs for these cases.

Test coverage

The tests cover the implementation at several levels:

  • focused OCaml regression tests for exit classification, depth decrementing, payload grouping, LoopExit type construction, and backend extraction names
  • Rust translation fixtures for nested break, nested continue, nested return, multi-depth exits, and borrowed/backward-continuation cases
  • Lean generated fixtures for fixed-point and recursive loop translation
  • Coq/F* generated backend fixtures for representative nested-loop exits
  • existing array and nested-loop translation tests were regenerated to catch regressions in normal loop output

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants