
[Proposal] Fix breadcrumb race condition#8041

Open
jrodiz wants to merge 1 commit into firebase:main from jrodiz:fix/jrc--8034.Fix.log.breadcrumb.race.condition

Conversation

@jrodiz
Contributor

@jrodiz jrodiz commented Apr 16, 2026

PR: Fix breadcrumb race condition — log() entry dropped before logException() reads it

Summary

Fixes: #8034

  • Firebase.crashlytics.log() breadcrumbs were silently dropped when called from a background thread immediately before recordException().
  • Root cause: a race condition between the common and diskWrite Crashlytics workers caused the non-fatal event to snapshot the log before the breadcrumb was written to disk.
  • Fix: one-line change from common.submit() to common.submitTask() in CrashlyticsCore.log(), which suspends the common worker until the disk write completes.

Root Cause

CrashlyticsCore.log() used a double-dispatch pattern:

// BEFORE (buggy)
crashlyticsWorkers.common.submit(
    () -> {
        crashlyticsWorkers.diskWrite.submit(() -> controller.writeToLog(timestamp, msg));
    });

CrashlyticsWorker.submit(Runnable) chains the runnable onto the common queue and marks the task complete as soon as the runnable returns — not when the inner diskWrite task finishes. So the sequence was:

  1. log("breadcrumb") → adds task C1 to common: "enqueue writeToLog on diskWrite"
  2. logException(ex) → adds task C2 to common: "call writeNonFatalException"
  3. common runs C1: calls diskWrite.submit(writeToLog) → C1 completes immediately
  4. common runs C2: calls writeNonFatalException → calls logFileManager.getLogString()
    — the diskWrite task D1 has been queued but not yet run → log is empty → breadcrumb missing
  5. diskWrite eventually runs D1: writeToLog — but the event was already captured without it

The main-thread workaround happened to work because Handler.post {} batched both calls into a single posted block, accidentally serializing them in a way that masked the race.

Fix

// AFTER (fixed)
crashlyticsWorkers.common.submitTask(
    () -> crashlyticsWorkers.diskWrite.submit(() -> controller.writeToLog(timestamp, msg)));

submitTask(Callable<Task<T>>) sets the common worker's internal tail to the Task returned by diskWrite.submit(...). Subsequent common tasks (like logException) are chained after that Task, so they cannot start until the disk write has completed.
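A minimal sketch of the two queueing semantics, using `CompletableFuture` in place of the Play services `Task` type the real CrashlyticsWorker wraps (class and method names here are illustrative, not the SDK's API):

```java
import java.util.concurrent.*;
import java.util.function.Supplier;

// Serial worker sketch: a single-thread executor plus a chained tail future.
public class SerialWorkerSketch {
    private final ExecutorService executor = Executors.newSingleThreadExecutor();
    private CompletableFuture<Void> tail = CompletableFuture.completedFuture(null);

    // submit(): the tail advances as soon as the runnable returns.
    public synchronized CompletableFuture<Void> submit(Runnable r) {
        tail = tail.thenRunAsync(r, executor);
        return tail;
    }

    // submitTask(): the tail becomes the *returned* future, so later tasks
    // cannot start until that inner future completes.
    public synchronized CompletableFuture<Void> submitTask(Supplier<CompletableFuture<Void>> s) {
        tail = tail.thenComposeAsync(v -> s.get(), executor);
        return tail;
    }

    public void shutdown() { executor.shutdown(); }

    public static void main(String[] args) throws Exception {
        SerialWorkerSketch common = new SerialWorkerSketch();
        SerialWorkerSketch diskWrite = new SerialWorkerSketch();
        StringBuilder log = new StringBuilder();

        // log(): chain the common tail onto the disk write itself.
        common.submitTask(() -> diskWrite.submit(() -> log.append("breadcrumb")));
        // logException(): snapshots the log; guaranteed to run after the write.
        common.submit(() -> System.out.println("log seen: '" + log + "'"))
              .get(5, TimeUnit.SECONDS);

        common.shutdown();
        diskWrite.shutdown();
    }
}
```

Because `thenComposeAsync` makes the common tail resolve only when the diskWrite future resolves, the second common task always observes the breadcrumb.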

Test Plan

New test: CrashlyticsCoreTest#testLog_breadcrumbIsWrittenBeforeLogExceptionReadsIt

  • Calls log("test breadcrumb") immediately followed by logException(exception) — reproducing the exact pattern reported in the issue.
  • Awaits only crashlyticsWorkers.common (not diskWrite separately). With submitTask, awaiting common guarantees the diskWrite has also completed, so the log MUST be on disk.
  • Asserts logFileManager.getLogString() is non-null and contains the breadcrumb.

Without the fix, awaiting common would NOT guarantee that diskWrite finished, making this assertion unreliable (the log might be empty at assertion time).
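The await-only-common guarantee can be illustrated the same way (plain `java.util.concurrent` stand-ins, not the SDK's test helpers): with submitTask-style chaining, the common tail *is* the disk-write future, so it cannot complete while the write is still pending.

```java
import java.util.concurrent.*;

public class AwaitCommonGuarantee {
    public static void main(String[] args) throws Exception {
        ExecutorService common = Executors.newSingleThreadExecutor();
        ExecutorService diskWrite = Executors.newSingleThreadExecutor();
        StringBuilder log = new StringBuilder();
        CountDownLatch slowDisk = new CountDownLatch(1); // keeps the write pending

        // submitTask-style chaining: the common tail composes onto the disk write.
        CompletableFuture<Void> commonTail = CompletableFuture.runAsync(() -> {}, common)
            .thenCompose(v -> CompletableFuture.runAsync(() -> {
                try { slowDisk.await(); } catch (InterruptedException e) { Thread.currentThread().interrupt(); }
                log.append("breadcrumb");
            }, diskWrite));

        // Awaiting only "common" cannot finish before the disk write has.
        System.out.println("done early? " + commonTail.isDone());
        slowDisk.countDown();
        commonTail.get(5, TimeUnit.SECONDS);
        System.out.println("log after awaiting common: '" + log + "'");

        common.shutdown();
        diskWrite.shutdown();
    }
}
```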

Existing tests: All existing tests in :firebase-crashlytics compile and run clean.

Risks / Trade-offs

  • Slightly more serialized hot path: common now suspends until each writeToLog disk write completes before processing the next task. In practice, writeToLog is fast (a small QueueFile append), and the previous behavior (fire-and-forget) was already incorrect per the documented contract ("queuing up on common worker to maintain the order").
  • No deadlock risk: diskWrite never submits back to common, so no cycle.


@jrodiz
Contributor Author

jrodiz commented Apr 16, 2026

/gemini review

Contributor

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request addresses a race condition where breadcrumbs logged immediately before a non-fatal exception might be missing from the crash report. The fix replaces a simple task submission with a task that ensures the common worker waits for the disk write to complete before proceeding. A regression test and a helper method for testing were also added. The review feedback suggests refining the terminology in the code comments to avoid potential confusion with Java's thread suspension mechanisms.

Comment on lines +333 to +334
// submitTask suspends the common worker until the diskWrite task completes, ensuring
// that subsequent tasks on the common worker (e.g. logException) see this log entry.
Contributor


Severity: low

The term 'suspends' might be slightly misleading in a Java context, as it is often associated with Thread.suspend() or Kotlin coroutines. Since this is a serial worker queue, 'waits for' or 'chains' might be more accurate to describe how submitTask prevents subsequent tasks from starting until the returned task completes.

Suggested change
- // submitTask suspends the common worker until the diskWrite task completes, ensuring
- // that subsequent tasks on the common worker (e.g. logException) see this log entry.
+ // submitTask ensures the common worker waits for the diskWrite task to complete, ensuring
+ // that subsequent tasks on the common worker (e.g. logException) see this log entry.

@lehcar09 lehcar09 requested a review from mrober April 16, 2026 15:14
@jrodiz jrodiz force-pushed the fix/jrc--8034.Fix.log.breadcrumb.race.condition branch from fdc0273 to 585f551 Compare April 16, 2026 15:45
mrober
mrober previously approved these changes Apr 16, 2026
Contributor

@mrober mrober left a comment


LGTM after the changelog

Comment thread on firebase-crashlytics/CHANGELOG.md (Outdated)
@@ -1,5 +1,7 @@
# Unreleased

- [fixed] Fixed more strict mode violations
Contributor


Copy/paste error?

Contributor


Maybe something like "Fixed race condition that caused logs from background threads to not be attached to reports in some cases [#8034]"

Contributor Author


Hey yep c/p error

…) reads it

CrashlyticsCore.log() used common.submit() which completed as soon as the
diskWrite task was enqueued, not when it finished. This allowed the subsequent
logException() common task to call logFileManager.getLogString() before
writeToLog() had run on the diskWrite worker, silently dropping the breadcrumb
from the non-fatal report.

Fix: change to common.submitTask() so the common worker suspends until the
diskWrite task resolves before dispatching the next item (e.g. logException).

Adds a regression test that calls log() immediately before logException(),
awaits only the common worker, and asserts the breadcrumb is present on disk.

Fixes firebase#8034
@jrodiz jrodiz force-pushed the fix/jrc--8034.Fix.log.breadcrumb.race.condition branch from 585f551 to 3f046e3 Compare April 16, 2026 19:10