Fix S3 path lookup hanging on large prefixes by pditommaso · Pull Request #6849 · nextflow-io/nextflow

pditommaso · 2026-02-19T21:00:00Z

Summary

Fix S3ObjectSummaryLookup.lookup() causing unbounded pagination when checking the existence of S3 paths with a large number of objects under the prefix
Replace the pagination loop (maxKeys=250 + marker) with a single listObjects call using maxKeys=2

Problem

S3ObjectSummaryLookup.lookup() is used by S3FileSystemProvider.checkAccess() to verify if an S3 path exists. The method paginated through all objects matching the prefix in batches of 250. On prefixes with millions of objects (e.g. s3://bucket/results accumulated from many pipeline runs), this caused the main thread to hang for minutes parsing massive XML responses from S3.

Observed in production: nf-schema FormatDirectoryPathEvaluator calls Files.exists() on an S3 outdir path during parameter validation. With a prefix containing many objects, the main thread hung for 7+ minutes (180s CPU) stuck in XmlDomParser.parseElement, parsing unbounded listObjects responses.
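For a rough sense of scale, the number of list requests grows linearly with the size of the prefix when no match is found early. The sketch below is only back-of-the-envelope arithmetic, assuming the worst case in which the loop pages through every key; the class and method names are hypothetical and not taken from the Nextflow code base.

```java
// Hypothetical estimate of the worst case described above; not Nextflow code.
final class PaginationCostSketch {

    /** Requests needed to page through objectCount keys at pageSize keys per request. */
    static long requestsNeeded(long objectCount, int pageSize) {
        if (objectCount < 0 || pageSize <= 0) {
            throw new IllegalArgumentException("objectCount must be >= 0 and pageSize > 0");
        }
        // Ceiling division: a partially filled last page still costs one request.
        return (objectCount + pageSize - 1) / pageSize;
    }

    public static void main(String[] args) {
        // Assumed prefix sizes, chosen only for illustration.
        System.out.println(requestsNeeded(1_000_000L, 250));   // 4000
        System.out.println(requestsNeeded(10_000_000L, 250));  // 40000
    }
}
```

Each of those requests returns an XML page that has to be parsed on the calling thread, which is consistent with the time spent in XmlDomParser.parseElement.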

Fix

The matchName() check only needs to find either the exact key or its first child (key/). Since S3 returns objects in lexicographic order, these are guaranteed to appear in the first 1-2 results. Using maxKeys=2 without pagination is sufficient and eliminates the unbounded listing.

Test plan

Verified matchName logic: exact key match or key/ child always appears first in lexicographic S3 listing
Smoke test with large S3 prefix to confirm fast Files.exists() check

The lookup method paginated through all objects under an S3 prefix (maxKeys=250) to check path existence. On prefixes with millions of objects this caused the main thread to hang for minutes parsing massive XML responses.

Observed in production: nf-schema parameter validation calls Files.exists() on an S3 outdir path, which triggers S3ObjectSummaryLookup.lookup. With a large prefix like s3://bucket/results containing many objects from previous runs, the pagination loop iterated indefinitely.

Fix: use maxKeys=2 and remove pagination. The matchName check only needs to find the exact key or its first child (key + "/"), which are guaranteed to appear in the first results due to S3 lexicographic ordering.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Paolo Di Tommaso <paolo.ditommaso@gmail.com>

netlify · 2026-02-19T21:00:05Z

✅ Deploy Preview for nextflow-docs-staging canceled.

Name	Link
🔨 Latest commit	`92ccdc9`
🔍 Latest deploy log	https://app.netlify.com/projects/nextflow-docs-staging/deploys/699779d3de88f7000823666e

pditommaso · 2026-02-19T22:01:21Z

Tests pass; however, execution fails in my own test runs:

Feb-19 21:56:42.056 [TaskFinalizer-8] DEBUG nextflow.processor.TaskProcessor - Handling unexpected condition for
  task: name=NFCORE_SAREK:PREPARE_INTERVALS:TABIX_BGZIPTABIX_INTERVAL_COMBINED (S07604624_Padded_Agilent_SureSelectXT_allexons_V6_UTR); work-dir=s3://nextflow-ci-dev/test-sarek/2a/010b2fe3d028c01abf368bbd3934c9
  error [nextflow.exception.MissingFileException]: Cannot access directory: '/nextflow-ci-dev/test-sarek/2a/010b2fe3d028c01abf368bbd3934c9'

jorgee · 2026-02-20T09:48:56Z

There is something wrong with this approach. I have a folder with similar names, such as the following

Then I wrote a test to check whether folder 'a' exists, and it returns false. It works in master, but not in this branch.

    def 'should check s3 folder exists with similar names' () {
        when:
        def result = nextflow.cloud.aws.util.S3PathFactory.create('s3:///jorgee-eu-west1-test1/test_lexicorder/a')
        then:
        result.exists() == true
        result.isDirectory() == true
    }

I see that lexicographical order is guaranteed for general purpose buckets with the listObjectsV2 call:
https://docs.aws.amazon.com/AmazonS3/latest/API/API_ListObjectsV2.html

Sorting order of returned objects
General purpose bucket - For general purpose buckets, ListObjectsV2 returns objects in lexicographical order based on their key names.

Directory bucket - For directory buckets, ListObjectsV2 does not return objects in lexicographical order.

I do not see this statement for listObjects. The nextMarker description mentions S3 objects being listed in alphabetical order, but I am not sure it always applies.

I am debugging it to see what is wrong and whether listObjectsV2 fixes it.

jorgee · 2026-02-20T12:59:47Z

listObjectsV2 has the same problem. Several characters that sort before '/' can be part of a file name.
I have also tried playing with the delimiter, but with no success.

I see two solutions, but both imply making two calls in the worst case:

Keep the same code; in most cases it will get 'a' or 'a/', and when neither is found, add a second listObjects call with key+'/' to be sure the folder exists. I have implemented it in Fix S3 lookup unbounded pagination with double call #6851
Another alternative is making two HeadObject requests: one with just the key and, if that fails, another with key+'/'. The HeadObjectResponse provides the content length and last modified time, so it is also valid. This requires more modifications in the code, but I think head calls are cheaper.
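The first option (a capped listing on the key, then a second listing on key+'/') can be sketched against an in-memory model. This is a minimal illustration, not the code in #6851: the sorted set stands in for a bucket, and `list`, `matches` and `exists` are hypothetical names rather than Nextflow or AWS SDK methods. The sample keys are made up.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

// Illustrative in-memory model; none of these names come from Nextflow or the AWS SDK.
final class TwoCallLookupSketch {

    /** Stand-in for one capped list request: keys starting with prefix, in sorted order, at most maxKeys. */
    static List<String> list(TreeSet<String> bucket, String prefix, int maxKeys) {
        List<String> page = new ArrayList<>();
        // Keys sharing a prefix are contiguous in sorted order, starting at the prefix itself.
        for (String key : bucket.tailSet(prefix, true)) {
            if (!key.startsWith(prefix) || page.size() >= maxKeys) break;
            page.add(key);
        }
        return page;
    }

    /** True when found is the key itself or an entry below key + "/". */
    static boolean matches(String key, String found) {
        return found.equals(key) || found.startsWith(key + "/");
    }

    static boolean exists(TreeSet<String> bucket, String key, int maxKeys) {
        // First call: capped listing on the bare key.
        for (String found : list(bucket, key, maxKeys)) {
            if (matches(key, found)) return true;
        }
        // Second call: needed when siblings such as "a-1" filled the first page.
        return !list(bucket, key + "/", 1).isEmpty();
    }

    public static void main(String[] args) {
        // Assumed sample keys, chosen so that siblings sort before the folder entries.
        TreeSet<String> bucket = new TreeSet<>(List.of("a-1", "a-2", "a.txt", "a/file"));
        System.out.println(list(bucket, "a", 2));    // [a-1, a-2]
        System.out.println(exists(bucket, "a", 2));  // true
        System.out.println(exists(bucket, "b", 2));  // false
    }
}
```

In this model the second call is only paid when the first page contains no match, so the cost stays at two bounded requests regardless of how many objects share the prefix.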

pditommaso · 2026-02-20T13:15:12Z

I guess a list (capped to 10) should cover most cases, otherwise falling back to the double head. wdyt?

jorgee · 2026-02-20T13:25:55Z

I think we will not need to fall back to the double head. The first call, either the list or the head, will cover the non-directory case. So the fallback just needs to check the directory, and the S3Object for a folder only requires the key ending with '/'.

pditommaso · 2026-02-20T13:35:05Z

I don't get which symbols can come before '/' and still be part of a file name. Can a test be made to capture this case?

jorgee · 2026-02-20T14:50:16Z

In lexicographic order, several characters (such as '-' or '.') sort before '/'. So, if you have the key names 'name/', 'name-1', 'name-2' and 'name.txt', then 'name-1', 'name-2' and 'name.txt' appear before 'name/'. This is why the PR does not always work. We could increase the cap to 10 to have a better chance of getting the folder on the first try, but we always need the fallback. I added a test in #6851 that reproduces this behaviour.
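The ordering can be checked locally with a plain sort. This is a small sketch, assuming ASCII-only keys so that Java's String ordering agrees with the byte order S3 uses for listings; `LexOrderSketch` and `sorted` are hypothetical names used only here.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Illustration only: sorts the key names from the comment above the way a listing would return them.
final class LexOrderSketch {

    /** Returns a sorted copy of the keys using natural String ordering. */
    static List<String> sorted(List<String> keys) {
        List<String> copy = new ArrayList<>(keys);
        Collections.sort(copy);
        return copy;
    }

    public static void main(String[] args) {
        List<String> ordered = sorted(List.of("name/", "name-1", "name-2", "name.txt"));
        System.out.println(ordered);                // [name-1, name-2, name.txt, name/]
        // What a listing capped at two keys would see: no 'name/' entry.
        System.out.println(ordered.subList(0, 2));  // [name-1, name-2]
        // Character codes behind the ordering.
        System.out.println((int) '-' + " " + (int) '.' + " " + (int) '/');  // 45 46 47
    }
}
```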

jorgee · 2026-02-20T15:59:47Z

I have checked the approach with the headObject for a directory, and it is not working because a directory is not an object. So, the option in #6851 is the only way.

pditommaso · 2026-02-20T16:03:05Z

Ok, let's go with that

pditommaso requested a review from jorgee February 19, 2026 21:00

jorgee mentioned this pull request Feb 20, 2026

Fix S3 lookup unbounded pagination with double call #6851

Open

pditommaso closed this Feb 20, 2026

pditommaso wants to merge 1 commit into master from fix/s3-lookup-unbounded-pagination
