Description
I'm trying to train locally on an RTX PRO 6000 Blackwell (workstation) GPU, but after a random number of steps training stops with a PhysX error:
2026-02-20T16:41:50Z [167,746ms] [Error] [omni.physx.plugin] PhysX error: PhysX Internal CUDA error. Simulation cannot continue! Error code 719!
, FILE /builds/omniverse/physics/physx/source/physx/src/NpScene.cpp, LINE 3000
2026-02-20T16:41:50Z [167,746ms] [Error] [omni.physx.plugin] Cuda context manager error, simulation will be stopped and new cuda context manager will be created.
2026-02-20T16:41:50Z [167,746ms] [Error] [omni.physx.tensors.plugin] CUDA error: unspecified launch failure: ../../../extensions/runtime/source/omni.physx.tensors/plugins/gpu/GpuArticulationView.cpp: 621
2026-02-20T16:41:50Z [167,746ms] [Error] [omni.physx.tensors.plugin] CUDA error: unspecified launch failure: ../../../extensions/runtime/source/omni.physx.tensors/plugins/gpu/CudaKernels.cu: 573
2026-02-20T16:41:50Z [167,746ms] [Error] [omni.physx.tensors.plugin] Failed to fetch DOF velocity attribute
Error executing job with overrides: []
Traceback (most recent call last):
File "/home/gminelli-iit.local/.conda/envs/env_isaaclab/lib/python3.11/site-packages/isaaclab/source/isaaclab_tasks/isaaclab_tasks/utils/hydra.py", line 100, in hydra_main
func(env_cfg, agent_cfg, *args, **kwargs)
File "/home/gminelli-iit.local/IsaacLab/scripts/reinforcement_learning/rsl_rl/train.py", line 217, in main
runner.learn(num_learning_iterations=agent_cfg.max_iterations, init_at_random_ep_len=True)
File "/home/gminelli-iit.local/.conda/envs/env_isaaclab/lib/python3.11/site-packages/rsl_rl/runners/on_policy_runner.py", line 105, in learn
obs, rewards, dones, extras = self.env.step(actions.to(self.env.device))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/gminelli-iit.local/.conda/envs/env_isaaclab/lib/python3.11/site-packages/isaaclab/source/isaaclab_rl/isaaclab_rl/rsl_rl/vecenv_wrapper.py", line 156, in step
obs_dict, rew, terminated, truncated, extras = self.env.step(actions)
^^^^^^^^^^^^^^^^^^^^^^
File "/home/gminelli-iit.local/.conda/envs/env_isaaclab/lib/python3.11/site-packages/gymnasium/wrappers/common.py", line 393, in step
return super().step(action)
^^^^^^^^^^^^^^^^^^^^
File "/home/gminelli-iit.local/.conda/envs/env_isaaclab/lib/python3.11/site-packages/gymnasium/core.py", line 327, in step
return self.env.step(action)
^^^^^^^^^^^^^^^^^^^^^
File "/home/gminelli-iit.local/.conda/envs/env_isaaclab/lib/python3.11/site-packages/isaaclab/source/isaaclab/isaaclab/envs/manager_based_rl_env.py", line 197, in step
self.scene.update(dt=self.physics_dt)
File "/home/gminelli-iit.local/.conda/envs/env_isaaclab/lib/python3.11/site-packages/isaaclab/source/isaaclab/isaaclab/scene/interactive_scene.py", line 487, in update
articulation.update(dt)
File "/home/gminelli-iit.local/.conda/envs/env_isaaclab/lib/python3.11/site-packages/isaaclab/source/isaaclab/isaaclab/assets/articulation/articulation.py", line 267, in update
self._data.update(dt)
File "/home/gminelli-iit.local/.conda/envs/env_isaaclab/lib/python3.11/site-packages/isaaclab/source/isaaclab/isaaclab/assets/articulation/articulation_data.py", line 102, in update
self.joint_acc
File "/home/gminelli-iit.local/.conda/envs/env_isaaclab/lib/python3.11/site-packages/isaaclab/source/isaaclab/isaaclab/assets/articulation/articulation_data.py", line 777, in joint_acc
self._joint_acc.data = (self.joint_vel - self._previous_joint_vel) / time_elapsed
^^^^^^^^^^^^^^
File "/home/gminelli-iit.local/.conda/envs/env_isaaclab/lib/python3.11/site-packages/isaaclab/source/isaaclab/isaaclab/assets/articulation/articulation_data.py", line 767, in joint_vel
self._joint_vel.data = self._root_physx_view.get_dof_velocities()
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/gminelli-iit.local/.conda/envs/env_isaaclab/lib/python3.11/site-packages/isaacsim/extscache/omni.physics.tensors-107.3.26+107.3.3.lx64.r.cp311.u353/omni/physics/tensors/impl/api.py", line 1764, in get_dof_velocities
raise Exception("Failed to get DOF velocities from backend")
Exception: Failed to get DOF velocities from backend
Set the environment variable HYDRA_FULL_ERROR=1 for a complete stack trace.
2026-02-20T16:41:50Z [167,779ms] [Error] [omni.physx.fabric.plugin] CUDA error: unspecified launch failure: ../../../extensions/runtime/source/omni.physx.fabric/plugins/DirectGpuHelper.cpp: 563
2026-02-20T16:41:50Z [167,779ms] [Error] [omni.physx.fabric.plugin] CUDA error: unspecified launch failure: ../../../extensions/runtime/source/omni.physx.fabric/plugins/DirectGpuHelper.cpp: 566
2026-02-20T16:41:50Z [167,779ms] [Error] [omni.physx.fabric.plugin] CUDA error: unspecified launch failure: ../../../extensions/runtime/source/omni.physx.fabric/plugins/DirectGpuHelper.cpp: 569
2026-02-20T16:41:50Z [167,779ms] [Error] [omni.physx.fabric.plugin] CUDA error: unspecified launch failure: ../../../extensions/runtime/source/omni.physx.fabric/plugins/DirectGpuHelper.cpp: 572
2026-02-20T16:41:50Z [167,779ms] [Error] [omni.physx.fabric.plugin] CUDA error: unspecified launch failure: ../../../extensions/runtime/source/omni.physx.fabric/plugins/DirectGpuHelper.cpp: 575
2026-02-20T16:41:50Z [167,779ms] [Error] [omni.physx.fabric.plugin] CUDA error: unspecified launch failure: ../../../extensions/runtime/source/omni.physx.fabric/plugins/DirectGpuHelper.cpp: 578
2026-02-20T16:41:50Z [167,779ms] [Error] [omni.physx.fabric.plugin] CUDA error: unspecified launch failure: ../../../extensions/runtime/source/omni.physx.fabric/plugins/DirectGpuHelper.cpp: 581
2026-02-20T16:41:50Z [167,779ms] [Error] [omni.physx.fabric.plugin] CUDA error: unspecified launch failure: ../../../extensions/runtime/source/omni.physx.fabric/plugins/DirectGpuHelper.cpp: 584
2026-02-20T16:41:50Z [167,779ms] [Error] [omni.physx.fabric.plugin] CUDA error: unspecified launch failure: ../../../extensions/runtime/source/omni.physx.fabric/plugins/DirectGpuHelper.cpp: 587
2026-02-20T16:41:50Z [167,797ms] [Warning] [omni.physx.plugin] USD stage detach not called, holding a loose ptr to a stage!
2026-02-20T16:41:50Z [167,799ms] [Warning] [omni.physx.plugin] PhysX warning: /builds/omniverse/physics/physx/source/gpucommon/src/PxgCudaMemoryAllocator.cpp, FILE /builds/omniverse/physics/physx/source/gpucommon/src/PxgCudaMemoryAllocator.cpp, LINE 167
2026-02-20T16:41:50Z [167,811ms] [Warning] [omni.physx.plugin] PhysX warning: /builds/omniverse/physics/physx/source/gpucommon/src/PxgCudaMemoryAllocator.cpp, FILE /builds/omniverse/physics/physx/source/gpucommon/src/PxgCudaMemoryAllocator.cpp, LINE 68
Steps to reproduce
Following the docs, create an env and run a training example:
conda create -n env_isaaclab python=3.11
conda activate env_isaaclab
pip install --upgrade pip
pip install isaaclab[isaacsim,all]==2.3.2.post1 --extra-index-url https://pypi.nvidia.com
pip install -U torch==2.7.0 torchvision==0.22.0 --index-url https://download.pytorch.org/whl/cu128
python scripts/reinforcement_learning/rsl_rl/train.py --task Isaac-Velocity-Rough-Unitree-A1-v0 --headless --num_envs 4096
With Isaac-Velocity-Rough-Unitree-A1-v0 I get the error at a random point, after 200-1000 training iterations. The same error appears with other tasks (I tried several manipulation and locomotion tasks) and with both the RSL-RL and SB3 training scripts, so it seems to be related to the hardware/driver configuration or library versions rather than to a specific task.
I also tried a configuration with Python 3.11.14, isaacsim 5.1.0.0, and isaaclab 2.3.0 on both Ubuntu 22.04 and 24.04, always ending with the same error.
For GPU drivers, I tried 570.172.08, 570.195.03, 580.95.05, 580.126.16, and 590.48.01 on Ubuntu 22.04, and 570.195.03 on Ubuntu 24.04.
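Because several driver and library combinations are involved, recording the exact versions next to each failing run makes the results easier to compare. The snippet below is a rough sketch of such a snapshot, assuming `nvidia-smi` is on the `PATH`; the function names and the package list are illustrative only, taken from the pip commands above, and are not part of any Isaac Lab tooling.

```python
import json
import shutil
import subprocess
from importlib import metadata


def package_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


def gpu_driver_info():
    """Return one 'name, driver_version' string per GPU, or None if unavailable."""
    if shutil.which("nvidia-smi") is None:
        return None
    try:
        result = subprocess.run(
            ["nvidia-smi", "--query-gpu=name,driver_version", "--format=csv,noheader"],
            capture_output=True,
            text=True,
            check=True,
            timeout=10,
        )
    except (subprocess.SubprocessError, OSError):
        # nvidia-smi exists but failed, e.g. no driver loaded.
        return None
    return [line.strip() for line in result.stdout.splitlines() if line.strip()]


def snapshot(names=("isaaclab", "isaacsim", "torch", "torchvision")):
    """Collect package versions and GPU/driver information in one dict."""
    return {"packages": package_versions(names), "gpus": gpu_driver_info()}


if __name__ == "__main__":
    # Print as JSON so the output can be pasted into a report or diffed between runs.
    print(json.dumps(snapshot(), indent=2))
```

Running this once per environment, before starting `train.py`, gives one JSON record per driver/library combination tested.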
Extra warning logs at the beginning:
2026-02-20T17:07:57Z [57ms] [Warning] [omni.usd_config.extension] Enable omni.materialx.libs extension to use MaterialX
2026-02-20T17:07:57Z [247ms] [Warning] [omni.platforminfo.plugin] failed to open the default display. Can't verify X Server version.
2026-02-20T17:07:57Z [354ms] [Warning] [carb] Acquiring non optional plugin interface which is not listed as dependency: [omni::physx::IPhysxBenchmarks v1.0] (plugin: <default plugin>), by client: omni.physics.physx.plugin. Add it to CARB_PLUGIN_IMPL_DEPS() macro of a client.
2026-02-20T17:07:57Z [359ms] [Warning] [omni.isaac.dynamic_control] omni.isaac.dynamic_control is deprecated as of Isaac Sim 4.5. No action is needed from end-users.
|---------------------------------------------------------------------------------------------|
| Driver Version: 570.195.03 | Graphics API: Vulkan
|=============================================================================================|
| GPU | Name | Active | LDA | GPU Memory | Vendor-ID | LUID |
| | | | | | Device-ID | UUID |
| | | | | | Bus-ID | |
|---------------------------------------------------------------------------------------------|
| 0 | NVIDIA RTX PRO 6000 Blackwell .. | Yes: 0 | | 97887 MB | 10de | 0 |
| | | | | | 2bb1 | 803be3d7.. |
| | | | | | 81 | |
|=============================================================================================|
| OS: 22.04.5 LTS (Jammy Jellyfish) ubuntu, Version: 22.04.5, Kernel: 6.8.0-100-generic
| Processor: AMD Ryzen Threadripper 9960X 24-Cores
| Cores: 24 | Logical Cores: 48
|---------------------------------------------------------------------------------------------|
| Total Memory (MB): 257082 | Free Memory: 245866
| Total Page/Swap (MB): 2047 | Free Page/Swap: 2047
|---------------------------------------------------------------------------------------------|
2026-02-20T17:07:59Z [2,446ms] [Warning] [gpu.foundation.plugin] CPU performance profile is set to powersave. This profile sets the CPU to the lowest frequency reducing performance.
2026-02-20T17:07:59Z [2,456ms] [Warning] [gpu.foundation.plugin] IOMMU is enabled.
2026-02-20T17:07:59Z [2,472ms] [Warning] [omni.kvdb.plugin] Disabling key-value database because another kit process is locking it
[INFO]: Parsing configuration from: isaaclab_tasks.manager_based.locomotion.velocity.config.a1.rough_env_cfg:UnitreeA1RoughEnvCfg
[INFO]: Parsing configuration from: isaaclab_tasks.manager_based.locomotion.velocity.config.a1.agents.rsl_rl_ppo_cfg:UnitreeA1RoughPPORunnerCfg
======================================================================================
[INFO][IsaacLab]: Logging to file: /tmp/isaaclab/logs/isaaclab_2026-02-20_18-08-01.log
======================================================================================
18:08:01 [simulation_context.py] WARNING: The `enable_external_forces_every_iteration` parameter in the PhysxCfg is set to False. If you are experiencing noisy velocities, consider enabling this flag. You may need to slightly increase the number of velocity iterations (setting it to 1 or 2 rather than 0), together with this flag, to improve the accuracy of velocity updates.
[INFO]: Base environment:
Environment device : cuda:0
Environment seed : 42
Physics step-size : 0.005
Rendering step-size : 0.02
Environment step-size : 0.02
[INFO] Generating terrains based on curriculum took : 0.759850 seconds
[INFO]: Time taken for scene creation : 4.051372 seconds
[INFO]: Scene manager: <class InteractiveScene>
Number of environments: 4096
Environment spacing : 2.5
Source prim name : /World/envs/env_0
Global prim paths : ['/World/ground']
Replicate physics : True
[INFO]: Starting the simulation. This may take a few seconds. Please wait...
... (training steps) ...