Errors in BAM iterator should be fatal#265
Conversation
There was a problem hiding this comment.
Code Review
This pull request introduces error handling for BAM queries in mosdepth.nim and updates the hts-nim dependency to a specific fork in mosdepth.nimble. The review feedback highlights a critical issue where using quit in the error handler bypasses resource cleanup, potentially leading to corrupted output files. Additionally, it notes the maintainability risk of depending on a personal fork for hts-nim and suggests reverting to the official package once the upstream changes are merged.
mosdepth.nim
Outdated
| except IOError: | ||
| quit getCurrentExceptionMsg() |
There was a problem hiding this comment.
Using quit here will terminate the program immediately, bypassing the cleanup logic at the end of the main function (e.g., closing file handles on lines 817-836). This can lead to corrupted or incomplete output files, especially for compressed formats like BGZF used for .bed.gz files.
A better approach is to let the exception propagate up to a top-level handler. The main function should be structured with a try...finally block to ensure that file handles are always closed, even if an error occurs.
For now, you can simply re-raise the exception. A follow-up change should be made to add proper exception handling and cleanup in main.
except IOError:
raise
References
- Avoid using
quitor other abrupt termination functions in the middle of processing, as it prevents proper resource cleanup (e.g., closing files, releasing locks). Instead, let exceptions propagate to a top-level handler that can perform a graceful shutdown.
There was a problem hiding this comment.
I think this is OK, since a BAM iterator failure essentially guarantees that the output will be useless as well. A wrapper script could catch the nonzero exit code and erase the partial output file(s).
mosdepth.nimble
Outdated
| # Dependencies | ||
|
|
||
| requires "hts >= 0.3.29", "docopt == 0.7.1", "nim >= 1.0.0", "https://github.com/brentp/d4-nim >= 0.0.5" | ||
| requires "https://github.com/julianhess/hts-nim >= 0.3.30", "docopt == 0.7.1", "nim >= 1.0.0", "https://github.com/brentp/d4-nim >= 0.0.5" |
There was a problem hiding this comment.
Pointing the hts-nim dependency to a personal fork is a maintainability risk. While this may be necessary for development before the upstream pull request is merged, this should be reverted to the official hts-nim package from Nimble once the changes are merged upstream to avoid depending on a personal fork long-term.
There was a problem hiding this comment.
This is just a temporary hack to get this to build. If this PR and the corresponding PR to hts-nim get accepted, the resulting updates would be committed to Nimble.
Now that hts-nim PR is merged, we revert the commit that uses my personal fork (f337ba1), and update to the latest Nimble version.
|
LGTM, agree @julianhess |
|
Thanks so much for the speedy review! |
If this iteration fails, it's likely due to a corrupted BAM (or in my case, an unstable cloud filesystem mount). The current approach of simply logging a message to stderr is suboptimal, since this results in a truncated coverage file and a successful exit code, which is difficult to catch in pipelines. To address this, I've modified
Bam.query()inhts-nim(submitted as a separate pull request here) to raise an exception, which is then caught and handled (with a nonzero exit) in mosdepth.