NVIDIA BioNeMo Framework v2.6.2

@trvachov trvachov released this 02 Jul 23:18
· 326 commits to main since this release

Updates & Improvements

  • Fixed several ESM2 model issues:
    1. Fixed the finetuning metric for token classification. #946
    2. Fixed finetuning losses under data and model parallelism. #959
    3. Fixed a checkpoint-loading bug in the inference script. #950
  • Updated base Docker image to nvidia-pytorch 25.04-py3

Known Issues

  • Evo2 generation (i.e. bionemo-evo2/src/bionemo/evo2/run/infer.py) is broken. See issue #890. A workaround exists on branch #949, and we are working to fix this issue for the July release.
  • There is an NCCL communication issue on certain A100 multi-node environments. In our internal testing, we were not able to reproduce it reliably across environments. If you see the following error, please report it in issue #970:
[rank9]: torch.distributed.DistBackendError: NCCL error in: /opt/pytorch/pytorch/torch/csrc/distributed/c10d/ProcessGroupNCCL.cpp:3356, internal error - please report this issue to the NCCL developers, NCCL version 2.26.3
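
To check whether a training log contains this error before reporting it, a small scan like the following may help. This is a minimal sketch, not part of BioNeMo: the function name `find_nccl_errors` and the pattern are hypothetical and assume the log line has the same shape as the one quoted above, with the rank, source line number, and NCCL version allowed to vary.

```python
import re

# Assumed pattern, derived only from the error line quoted in these notes.
# The rank number, source line, and NCCL version are allowed to differ.
NCCL_ERROR = re.compile(
    r"\[rank(?P<rank>\d+)\]: torch\.distributed\.DistBackendError: "
    r"NCCL error in: .*internal error.*NCCL version (?P<version>\d+(?:\.\d+)*)"
)


def find_nccl_errors(log_text):
    """Return a list of (rank, nccl_version) pairs, one per matching log line."""
    hits = []
    for line in log_text.splitlines():
        match = NCCL_ERROR.search(line)
        if match:
            hits.append((int(match.group("rank")), match.group("version")))
    return hits
```

Passing the contents of a job log to `find_nccl_errors` yields the affected ranks and the NCCL version reported on each matching line, which is useful context to include when reporting in issue #970.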

What's Changed

New Contributors

Full Changelog: v2.6.1...v2.6.2