DC picks up checkpoint only on 3rd run, instead of 2nd #1629

@dmpetrov

DataChain does not pick up the checkpoint on the 2nd run: rows that were already processed before the failure are processed again.

import os
import datachain as dc

first_run = os.environ.get("FIRST_RUN", "0") == "1"

def score_number(num: int) -> float:
    if first_run:
        if num > 5:
            raise ValueError("test error")
    print("processing", num)
    return float(num**2) / 1000.0

if first_run:
    (
        dc.read_values(num=list(range(1, 10)))
        .save("filtered_nums")
    )

result = (
    # dc.read_values(num=list(range(1, 10)))
    dc.read_dataset("filtered_nums")
    .settings(batch_size=1)          # process one row at a time for fine-grained checkpointing
    .map(score=score_number)
    .save("scored_nums")
)

Output:

$ FIRST_RUN=1 python process_data.py
processing 1
processing 2
processing 3
processing 4
processing 5
Traceback (most recent call last):
   [stack trace]

$ python process_data.py
processing 1  # this should not be here
processing 2  # this should not be here
processing 3  # this should not be here
processing 4  # this should not be here
processing 5  # this should not be here
processing 6
processing 7
processing 8
processing 9

PS: Funny part: it DOES pick up the checkpoint on the 3rd run if the 2nd run also fails.
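
To make the comparison above easier to check mechanically, here is a minimal sketch that diffs the "processing N" lines of two runs. It is not part of DataChain; the helper names `processed_rows` and `reprocessed_rows` are hypothetical. It only looks at stdout copied from the runs, so it shows which rows were recomputed, not what the checkpoint actually stored.

```python
import re

# Matches the "processing <num>" lines printed by score_number in the script above.
_PROCESSING = re.compile(r"^processing (\d+)\b")


def processed_rows(output: str) -> list[int]:
    """Return the row numbers a run reported as processed, in print order."""
    rows = []
    for line in output.splitlines():
        match = _PROCESSING.match(line.strip())
        if match:
            rows.append(int(match.group(1)))
    return rows


def reprocessed_rows(failed_run: str, resumed_run: str) -> list[int]:
    """Rows printed by the failed run that the resumed run printed again."""
    already_done = set(processed_rows(failed_run))
    return [n for n in processed_rows(resumed_run) if n in already_done]


# Output copied from the two runs reported in this issue.
first_run_output = """\
processing 1
processing 2
processing 3
processing 4
processing 5
Traceback (most recent call last):
   [stack trace]
"""

second_run_output = "\n".join(f"processing {n}" for n in range(1, 10))

print(reprocessed_rows(first_run_output, second_run_output))
```

With the outputs from this issue the list is non-empty (rows 1 to 5). If the checkpoint were picked up on the 2nd run, the expectation is that it would be empty.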

Labels: bug (Something isn't working), question (Further information is requested)
