Technical specifics
What: If a simple job fails to start (e.g. the creator failed to talk to Ceph), the job is lost and remains trapped in the NOT_STARTED state forever.
Why: It is not handled
Who: Users are affected
How: We are not handling this.
We could likely handle this with the same failure-queue pattern we use for watched-files: jobs that fail to start are submitted to a failure queue and reprocessed after new jobs, indefinitely.
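
A minimal sketch of what that failure-queue pattern could look like for job starts. Everything here is hypothetical and for illustration only: the `JobStarter` class, the `start_job` callable, and the `STARTED` state name are assumptions, not the existing watched-files code.

```python
from collections import deque
from typing import Callable, Deque, Dict


class JobStarter:
    """Hypothetical scheduler: new jobs go first, failed starts are retried afterwards."""

    def __init__(self, start_job: Callable[[str], None]) -> None:
        # start_job is assumed to raise on failure, e.g. when Ceph is unreachable.
        self.start_job = start_job
        self.new_jobs: Deque[str] = deque()
        self.failed_jobs: Deque[str] = deque()
        self.state: Dict[str, str] = {}

    def submit(self, job_id: str) -> None:
        self.state[job_id] = "NOT_STARTED"
        self.new_jobs.append(job_id)

    def process_once(self) -> None:
        """One pass: drain new jobs, then retry each previously failed job once."""
        # Snapshot the count so jobs that fail during this pass wait for the next one.
        retries = len(self.failed_jobs)
        for _ in range(len(self.new_jobs)):
            self._try_start(self.new_jobs.popleft())
        for _ in range(retries):
            self._try_start(self.failed_jobs.popleft())

    def _try_start(self, job_id: str) -> None:
        try:
            self.start_job(job_id)
        except Exception:
            # Keep the job instead of dropping it; it is retried indefinitely.
            self.failed_jobs.append(job_id)
        else:
            self.state[job_id] = "STARTED"
```

Calling `process_once()` in the service loop would keep retrying a job for as long as it fails, so it never silently disappears while still marked NOT_STARTED. A real implementation would probably also want a delay or backoff between passes, which this sketch leaves out.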