Investigation: Eternal orchestrations with large messages have unbounded blob counts#3307

Draft
Copilot wants to merge 1 commit into dev from copilot/fix-unbounded-blob-counts

Copilot AI commented Jan 5, 2026

Summary

What changed?

Investigation confirmed issue #509 still exists. The fix requires changes in the upstream Azure/durabletask repository, not this extension.

Root cause: when a message exceeds 45 KB, its payload is stored as a blob in the largemessages container. DeleteMessageAsync in TaskHubQueue.cs deletes the queue message but not that associated blob.

Required fix (in Azure/durabletask):

// In src/DurableTask.AzureStorage/Messaging/TaskHubQueue.cs, after queue message deletion:
if (!string.IsNullOrEmpty(message.CompressedBlobName))
{
    await this.messageManager.DeleteBlobAsync(message.CompressedBlobName);
}
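
To make the intended ordering concrete, here is a minimal in-memory sketch of the delete path. It is an illustration only: `InMemoryHub`, its two sets, and the blob name used in `Main` are hypothetical stand-ins, not types or names from DurableTask.AzureStorage. The sketch assumes the queue message is removed first and the blob second, matching the "after queue message deletion" placement above, and that the blob delete is safe to repeat.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical in-memory stand-in for the queue and the largemessages container.
// These are not the real DurableTask.AzureStorage types.
public sealed class InMemoryHub
{
    public HashSet<string> QueueMessageIds { get; } = new HashSet<string>();
    public HashSet<string> LargeMessageBlobs { get; } = new HashSet<string>();

    public Task DeleteMessageAsync(string messageId, string compressedBlobName)
    {
        // Step 1: delete the queue message so it can never be dequeued again.
        QueueMessageIds.Remove(messageId);

        // Step 2: delete the payload blob, if the message had one.
        if (!string.IsNullOrEmpty(compressedBlobName))
        {
            // HashSet.Remove returns false for a missing entry instead of throwing,
            // so repeating the call is harmless.
            LargeMessageBlobs.Remove(compressedBlobName);
        }

        return Task.CompletedTask;
    }
}

public static class DeleteOrderDemo
{
    public static async Task Main()
    {
        var hub = new InMemoryHub();
        hub.QueueMessageIds.Add("msg-1");
        hub.LargeMessageBlobs.Add("instance-1/message-0.json.gz");

        await hub.DeleteMessageAsync("msg-1", "instance-1/message-0.json.gz");
        await hub.DeleteMessageAsync("msg-1", "instance-1/message-0.json.gz"); // repeat is a no-op

        Console.WriteLine(hub.QueueMessageIds.Count);   // 0
        Console.WriteLine(hub.LargeMessageBlobs.Count); // 0
    }
}
```

One reason to prefer this order: if the blob were deleted first and the queue delete then failed, the message could become visible again while pointing at a payload that no longer exists. With the order shown, a failure between the two steps leaves at most one orphaned blob.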

Why is this change needed?

Eternal orchestrations using ContinueAsNew with large messages cause unbounded blob growth. Message blob names include a sequence number that does not reset on ContinueAsNew, so each generation writes new blobs instead of overwriting those from earlier generations.
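
The growth pattern can be sketched with a small in-memory model. Everything here is an assumption made for illustration: `BlobGrowthModel`, the blob name format, and the instance ID `eternal-1` are hypothetical and not taken from DurableTask.AzureStorage. The only behavior modeled is the one described above: the sequence number keeps increasing across generations, so names never repeat.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical in-memory model; none of these types exist in DurableTask.AzureStorage.
public static class BlobGrowthModel
{
    // Assumed name format: the instance ID plus a sequence number that keeps increasing.
    public static string BlobName(string instanceId, long sequenceNumber) =>
        $"{instanceId}/message-{sequenceNumber:X16}.json.gz";

    // Runs `generations` ContinueAsNew cycles, each sending one large message,
    // and returns how many message blobs remain afterwards.
    public static int Simulate(int generations, bool deleteBlobWithMessage)
    {
        var blobs = new HashSet<string>();
        long sequenceNumber = 0; // never reset by ContinueAsNew

        for (int generation = 0; generation < generations; generation++)
        {
            string name = BlobName("eternal-1", sequenceNumber++);
            blobs.Add(name); // the large payload is written to a blob on enqueue

            // The queue message is dequeued, processed, and deleted at this point.
            if (deleteBlobWithMessage)
            {
                blobs.Remove(name); // proposed behavior: delete the blob as well
            }
        }

        return blobs.Count;
    }

    public static void Main()
    {
        Console.WriteLine(Simulate(1000, deleteBlobWithMessage: false)); // 1000
        Console.WriteLine(Simulate(1000, deleteBlobWithMessage: true));  // 0
    }
}
```

Without the blob delete, the count equals the number of generations; with it, the count stays at zero no matter how long the orchestration runs.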

Issues / work items


Project checklist

  • Documentation changes are not required
  • Release notes are not required for the next release
  • Backport is not required
  • All required tests have been added/updated (unit tests, E2E tests)
  • No extra work is required to be leveraged by OutOfProc SDKs
  • No change to the version of the WebJobs.Extensions.DurableTask package
  • No EventIds were added to EventSource logs
  • This change should be added to the v2.x branch
  • Breaking change?

AI-assisted code disclosure (required)

Was an AI tool used? (select one)

  • No
  • Yes, AI helped write parts of this PR (e.g., GitHub Copilot)
  • Yes, an AI agent generated most of this PR

If AI was used:

  • Tool(s): GitHub Copilot Agent
  • AI-assisted areas/files: Investigation and documentation only (no code changes)
  • What you changed after AI output: N/A - investigation PR

AI verification (required if AI was used):

  • I understand the code and can explain it
  • I verified referenced APIs/types exist and are correct
  • I reviewed edge cases/failure paths (timeouts, retries, cancellation, exceptions)
  • I reviewed concurrency/async behavior
  • I checked for unintended breaking or behavior changes

Testing

Automated tests

  • Result: N/A - no code changes in this repository

Manual validation (only if runtime/behavior changed)

  • N/A - investigation only

Notes for reviewers

  • This PR has no code changes because the fix must be made in Azure/durabletask, not here
  • DurableTask.AzureStorage is consumed as a NuGet dependency with no extension points to intercept message deletion
  • Next steps: Submit PR to Azure/durabletask, then update dependency version in Directory.Packages.props (currently v2.7.1)

Original prompt

This section details the original issue you should resolve

<issue_title>Eternal orchestrations with large messages have unbounded blob counts</issue_title>
<issue_description>Eternal orchestrations that use ContinueAsNew and have state stored in large messages will see their blob storage usage grow unbounded over time.

As of this PR, we no longer have random blob names, which means we can overwrite blobs from previous generations, reducing the impact of the problem. However, it does not handle message blobs, which will still continue to grow in an unbounded way since the blob names contain the sequence number, which does not reset. Fixing the message blob problem may be as simple as deleting message blobs at the same time we delete their corresponding queue messages (which we might want to do anyways).</issue_description>

<agent_instructions>check if this issue still exists, if so fix it</agent_instructions>

Comments on the Issue (you are @copilot in this section)

@cgillum: @dsm0880 History purging won't help because it only purges history for instances which have completed or failed. Eternal orchestrations technically don't ever complete or fail.

@cgillum: Our existing purge logic should handle the first case. I need to spend more time studying the code for the second case. Depending on how we name the blobs, we might be able to determine whether some of those blobs are no longer needed - i.e. they belong to an "execution ID" which no longer exists.

Copilot AI changed the title from "[WIP] Fix unbounded blob counts in eternal orchestrations" to "Investigation: Eternal orchestrations with large messages have unbounded blob counts" Jan 5, 2026
Copilot AI requested a review from YunchuWang January 5, 2026 22:55