Skip to content

test: nightly fuzz test job for bstm#26136

Open
aljo242 wants to merge 14 commits intomainfrom
test/fuzz-bst
Open

test: nightly fuzz test job for bstm#26136
aljo242 wants to merge 14 commits intomainfrom
test/fuzz-bst

Conversation

@aljo242
Copy link
Copy Markdown
Contributor

@aljo242 aljo242 commented Mar 19, 2026

run nightly heavy fuzz and post to slack

@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps bot commented Mar 19, 2026

Greptile Summary

This PR adds a nightly GitHub Actions workflow that runs a 30-minute BlockSTM fuzz test and posts results to Slack, introduces a FuzzBlockSTMAppHashDeterminism differential fuzz test comparing sequential vs BlockSTM execution for app-hash determinism, and adds a safeFeePayer helper to prevent panicking FeePayer() implementations from crashing the pre-estimation path.

  • A full BaseApp harness (newFuzzAppHarness) is unnecessarily created inside the f.Fuzz(...) closure on every iteration solely to extract txConfig, which is constant across all iterations — this significantly reduces fuzzing throughput and should be lifted outside the loop.
  • safeFeePayer silently swallows all panics from FeePayer() without any logging; real bugs from production tx implementations will be invisible unless the crash is reproduced outside the estimation path.
  • The nightly workflow's failure notification fires on both failure and cancelled states (previously noted), and newly discovered corpus entries are not committed back to the repo (previously noted).

Confidence Score: 3/5

  • Safe to merge with minor functional concerns — the wasteful per-iteration app setup will reduce fuzz throughput but won't cause incorrect test results.
  • The core differential testing logic (FinalizeBlock → Commit loop, app-hash and tx-result comparison, post-commit balance checks) is structurally sound. The safeFeePayer recovery pattern is correct and well-tested. The main concern is a performance bug in the fuzz harness that creates a full app per iteration just for txConfig, which will noticeably limit how many cases the 30-minute nightly run can cover. Several workflow-level issues from prior review threads (cancelled-job false alerts, no corpus persistence) remain open.
  • tests/integration/blockstm/blockstm_fuzz_test.go (wasteful per-iteration app init) and .github/workflows/fuzz-nightly.yml (open workflow concerns from prior threads).

Important Files Changed

Filename Overview
.github/workflows/fuzz-nightly.yml New nightly workflow scheduling a 30-minute BlockSTM fuzz run with Slack success/failure notifications. Several pre-existing issues flagged in prior threads (cancelled-job notification, corpus persistence, duplicate concurrency group) remain open.
internal/blockstm/txnrunner.go Added safeFeePayer helper that catches panics from feeTx.FeePayer() so that misbehaving tx implementations can't crash the pre-estimation phase. The approach is sound but silently discards panics without any logging.
internal/blockstm/txnrunner_test.go Adds mockPanicFeeTx and a corresponding decoder, plus a test case asserting that a panicking FeePayer() implementation is handled gracefully. Coverage looks correct.
tests/integration/blockstm/blockstm_fuzz_test.go New differential fuzz test comparing sequential vs BlockSTM execution for app-hash determinism. Core structure (newFuzzAppHarness → fundAccounts → runDifferentialBlockStream → FinalizeBlock+Commit loop) is correct, but a full app harness is unnecessarily spun up on every fuzz iteration purely to obtain txConfig, which can significantly reduce throughput.

Reviews (3): Last reviewed commit: "Merge branch 'main' into test/fuzz-bst" | Re-trigger Greptile

@aljo242
Copy link
Copy Markdown
Contributor Author

aljo242 commented Mar 19, 2026

@greptile re-review, those p2 issues are not real

@codecov
Copy link
Copy Markdown

codecov bot commented Mar 19, 2026

Codecov Report

❌ Patch coverage is 60.00000% with 4 lines in your changes missing coverage. Please review.
✅ Project coverage is 61.47%. Comparing base (6c7bddd) to head (86c27ea).

Files with missing lines Patch % Lines
internal/blockstm/txnrunner.go 60.00% 4 Missing ⚠️
Additional details and impacted files

Impacted file tree graph

@@            Coverage Diff             @@
##             main   #26136      +/-   ##
==========================================
+ Coverage   61.44%   61.47%   +0.03%     
==========================================
  Files         967      951      -16     
  Lines       63380    62455     -925     
==========================================
- Hits        38943    38395     -548     
+ Misses      24437    24060     -377     
Files with missing lines Coverage Δ
internal/blockstm/txnrunner.go 94.28% <60.00%> (+13.03%) ⬆️

... and 22 files with indirect coverage changes

Impacted file tree graph

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@aljo242 aljo242 added the backport/v0.54.x Backport PR's to release/v0.54.x branch label Mar 20, 2026
@aljo242
Copy link
Copy Markdown
Contributor Author

aljo242 commented Mar 24, 2026

@greptile can you re-review and give me a new score?

Comment on lines +177 to +186
func safeFeePayer(feeTx sdk.FeeTx) (addr sdk.AccAddress, ok bool) {
defer func() {
if recover() != nil {
addr = nil
ok = false
}
}()

return sdk.AccAddress(feeTx.FeePayer()), true
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 recover() in production estimation path masks real bugs

safeFeePayer uses a blanket recover() that silently swallows any panic originating from feeTx.FeePayer(), including ones that would indicate genuine programming errors (e.g., a nil-receiver dereference or an index out-of-bounds in a well-formed tx implementation). The pre-estimation path skipping a tx is functionally correct, but swallowing an unexpected panic here means the issue will never surface in logs or metrics — the tx is just processed without an estimate, and the fuzz test will pass because both seq and stm behave identically.

Consider logging the recovered panic value (even at debug level) so that unexpected panics from real tx types are visible:

func safeFeePayer(feeTx sdk.FeeTx) (addr sdk.AccAddress, ok bool) {
    defer func() {
        if r := recover(); r != nil {
            // FeePayer() panicked; skip pre-estimation for this tx.
            // Log at debug level if a logger is available.
            addr = nil
            ok = false
        }
    }()
    return sdk.AccAddress(feeTx.FeePayer()), true
}

aljo242 added 3 commits March 24, 2026 11:49
Emit a log line when FeePayer panics so pre-estimation skips remain visible during fuzz and production debugging.

Made-with: Cursor
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

backport/v0.54.x Backport PR's to release/v0.54.x branch

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant