Documentation: Installing DGL with ROCm 6.4 Using Docker

1. Objective

To set up a GPU-accelerated deep learning environment using:

ROCm 6.4
PyTorch (ROCm build)
DGL (Deep Graph Library)
Docker containerization

This approach avoids dependency conflicts and ensures compatibility with AMD GPUs.

2. Environment Overview

Host System

Linux distribution: Ubuntu 22.04 / 24.04 (adjust as appropriate)
ROCm installed: 6.4.x
GPU: AMD GPU compatible with ROCm
Docker: Installed and verified

Container Base Image

Repository: rocm/dgl
Example tag used: dgl-2.4_rocm6.4_ubuntu22.04_py3.10_pytorch_release_2.4.1

3. Docker Installation Verification

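After installing Docker, the installation was verified with:

sudo docker run hello-world

Successful execution confirmed:

The Docker daemon is running
The Docker client can reach the daemon and pull images from Docker Hub

Because the command runs under sudo, it does not show that the current user can run Docker without elevated privileges. The remaining commands in this document therefore also use sudo.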

4. Pulling the ROCm DGL Image

The official AMD DGL image was pulled from Docker Hub:

sudo docker pull rocm/dgl:dgl-2.4_rocm6.4_ubuntu22.04_py3.10_pytorch_release_2.4.1

This image includes:

ROCm 6.4 runtime
PyTorch 2.4.1 (ROCm build)
DGL 2.4 (preinstalled)
Python 3.10
Ubuntu 22.04 base
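
The component versions listed above are all encoded in the image tag, so they can be read back programmatically, for example to compare the tag's ROCm version with the host's. The following is a minimal sketch: parse_image_tag is a hypothetical helper, and the field layout is inferred only from the single tag used in this document, not from a published naming scheme.

```python
import re

# Assumption: the tag layout is inferred from the one tag shown in this
# document; other rocm/dgl tags may not follow the same pattern.
TAG_PATTERN = re.compile(
    r"dgl-(?P<dgl>[\d.]+)"
    r"_rocm(?P<rocm>[\d.]+)"
    r"_ubuntu(?P<ubuntu>[\d.]+)"
    r"_py(?P<python>[\d.]+)"
    r"_pytorch_release_(?P<pytorch>[\d.]+)"
)


def parse_image_tag(tag: str) -> dict:
    """Split an image tag into its component versions (hypothetical helper)."""
    match = TAG_PATTERN.fullmatch(tag)
    if match is None:
        raise ValueError(f"unrecognized tag layout: {tag!r}")
    return match.groupdict()


if __name__ == "__main__":
    tag = "dgl-2.4_rocm6.4_ubuntu22.04_py3.10_pytorch_release_2.4.1"
    for name, version in parse_image_tag(tag).items():
        print(f"{name}: {version}")
```

Raising on an unrecognized layout, rather than returning partial data, keeps a silently wrong version from being compared against the host.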

5. Running the Container with GPU Access

To enable ROCm GPU access inside Docker, the container was launched with the following command:

sudo docker run -it \
  --cap-add=SYS_PTRACE \
  --security-opt seccomp=unconfined \
  --device=/dev/kfd \
  --device=/dev/dri \
  --group-add video \
  --ipc=host \
  --shm-size 8G \
  rocm/dgl:dgl-2.4_rocm6.4_ubuntu22.04_py3.10_pytorch_release_2.4.1

Explanation of Key Flags

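Flag	Purpose
`--device=/dev/kfd`	Exposes the ROCm compute interface (Kernel Fusion Driver)
`--device=/dev/dri`	Exposes the GPU render nodes (Direct Rendering Infrastructure)
`--group-add video`	Adds the video group to the container process so it can open the GPU device nodes; some hosts also require `--group-add render`
`--ipc=host`	Shares the host IPC namespace, including the host's /dev/shm
`--shm-size 8G`	Sets /dev/shm to 8 GB so PyTorch DataLoader workers do not run out of shared memory; redundant while `--ipc=host` is set, because the container then uses the host's /dev/shm
`--cap-add=SYS_PTRACE`	Allows ptrace, which debuggers and some profilers need; not required for normal training runs
`--security-opt seccomp=unconfined`	Disables Docker's default seccomp syscall filter, which can otherwise block calls used by debugging and profiling tools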
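
Before launching the container, it can help to confirm that the host actually has the device nodes and groups these flags refer to. The following is a minimal sketch, not part of the original setup: host_preflight is a hypothetical helper, and checking the render group is an assumption based on hosts where the /dev/dri render nodes are owned by that group.

```python
import grp
import os


def host_preflight() -> dict:
    """Report whether the device nodes and groups used by the run flags exist.

    Hypothetical helper; run it on the host, not inside the container.
    """
    report = {}
    # Device nodes passed to the container with --device
    for path in ("/dev/kfd", "/dev/dri"):
        report[path] = os.path.exists(path)
    # Groups passed with --group-add (render is an assumption, see lead-in)
    for group in ("video", "render"):
        try:
            grp.getgrnam(group)
            report[f"group:{group}"] = True
        except KeyError:
            report[f"group:{group}"] = False
    return report


if __name__ == "__main__":
    for item, present in host_preflight().items():
        print(f"{item}: {'found' if present else 'MISSING'}")
```

A missing /dev/kfd usually points to the host ROCm driver not being loaded, which no container flag can compensate for.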

6. Verification Inside Container

6.1 Verify ROCm Detection

rocminfo | head

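The first lines confirm that the ROCm kernel driver is loaded. To confirm that the GPU itself is visible, check the full rocminfo output for an agent whose name starts with gfx.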
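
The GPU check can also be scripted. The following is a minimal sketch that scans rocminfo output for agent names beginning with gfx. The gpu_agents helper is hypothetical, and the regular expression assumes agents are listed on lines of the form "Name: gfx...", which may differ between ROCm releases.

```python
import re
import shutil
import subprocess


def gpu_agents(rocminfo_text: str) -> list:
    """Return gfx target names found on 'Name:' lines of rocminfo output.

    Assumption: GPU agents appear as lines like '  Name:   gfx90a'.
    """
    return re.findall(
        r"^\s*Name:\s+(gfx\w+)\s*$", rocminfo_text, flags=re.MULTILINE
    )


if __name__ == "__main__":
    if shutil.which("rocminfo") is None:
        print("rocminfo not found on PATH")
    else:
        result = subprocess.run(["rocminfo"], capture_output=True, text=True)
        agents = gpu_agents(result.stdout)
        print("GPU agents:", agents if agents else "none found")
```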

6.2 Verify PyTorch ROCm Backend

python3 -c "import torch; print(torch.__version__, torch.version.hip)"

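Expected:

A PyTorch version beginning with 2.4.1, matching the image tag
A HIP version string rather than None (None indicates a build without ROCm support)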

Check GPU availability:

python3 -c "import torch; print(torch.cuda.is_available())"

Expected output:

True

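(Note: ROCm builds of PyTorch reuse the torch.cuda namespace, so torch.cuda.is_available() reports AMD GPUs and the device string remains "cuda".)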
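
Availability alone does not show that kernels actually run on the device. The following is a minimal sketch of a smoke test that performs one matrix multiplication on the GPU. The function name torch_gpu_smoke_test and the matrix size are illustrative choices, not part of the original procedure.

```python
def torch_gpu_smoke_test() -> str:
    """Run one matmul on the GPU and report the outcome (illustrative helper)."""
    try:
        import torch
    except ImportError:
        return "torch not installed"

    if not torch.cuda.is_available():
        return "no GPU visible to PyTorch"

    device = torch.device("cuda")  # refers to the AMD GPU on ROCm builds
    a = torch.randn(256, 256, device=device)
    b = torch.randn(256, 256, device=device)
    c = a @ b
    torch.cuda.synchronize()  # wait for the kernel to finish before checking

    if not bool(torch.isfinite(c).all()):
        return "matmul produced non-finite values"
    return f"ok on {torch.cuda.get_device_name(0)}"


if __name__ == "__main__":
    print(torch_gpu_smoke_test())
```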

6.3 Verify DGL Installation

export DGLBACKEND=pytorch

python3 - << 'EOF'
import dgl
print("DGL version:", dgl.__version__)
EOF

Successful import confirms:

DGL is installed
DGL loads with the PyTorch backend

A successful import alone does not exercise the GPU.
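
To go one step beyond the import check, a small graph can be moved to the GPU and passed through a single graph convolution. The following is a minimal sketch under stated assumptions: the four-node graph, the feature sizes, and the function name dgl_gpu_smoke_test are illustrative, and the script falls back to the CPU if no GPU is visible.

```python
import os

# Same setting as the export above; setdefault keeps an existing value.
os.environ.setdefault("DGLBACKEND", "pytorch")


def dgl_gpu_smoke_test() -> str:
    """Run one GraphConv layer on a tiny graph (illustrative helper)."""
    try:
        import torch
        import dgl
        from dgl.nn import GraphConv
    except Exception as exc:  # missing package or failed native library load
        return f"import failed: {exc}"

    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

    # Directed 4-node cycle; self-loops guard against zero in-degree nodes.
    src = torch.tensor([0, 1, 2, 3])
    dst = torch.tensor([1, 2, 3, 0])
    graph = dgl.add_self_loop(dgl.graph((src, dst), num_nodes=4)).to(device)

    features = torch.randn(4, 8, device=device)
    conv = GraphConv(8, 2).to(device)
    out = conv(graph, features)
    return f"ok: output {tuple(out.shape)} on {out.device}"


if __name__ == "__main__":
    print(dgl_gpu_smoke_test())
```

If the reported device is cpu, the graph convolution ran but the GPU path was not exercised, so the PyTorch checks in section 6.2 should be revisited first.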

7. Rationale for Using Docker

Using Docker provides:

Version isolation (no system contamination)
Component versions validated together by AMD
Reproducibility across machines
Simplified dependency management

This avoids:

Pip version conflicts
ROCm wheel mismatch errors
Manual compilation complexity

8. Optional: Running with Project Mount

To use local project files:

sudo docker run -it \
  --device=/dev/kfd \
  --device=/dev/dri \
  --group-add video \
  --ipc=host \
  --shm-size 8G \
  -v "$(pwd)":/workspace \
  rocm/dgl:dgl-2.4_rocm6.4_ubuntu22.04_py3.10_pytorch_release_2.4.1

This mounts the current directory into /workspace inside the container.
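
When the launch is scripted, building the command as an argument list avoids shell quoting problems with project paths that contain spaces. The following is a minimal sketch: docker_run_argv is a hypothetical helper, and it reproduces the flags of the command above rather than adding any new ones.

```python
import os
import shlex

IMAGE = "rocm/dgl:dgl-2.4_rocm6.4_ubuntu22.04_py3.10_pytorch_release_2.4.1"


def docker_run_argv(project_dir: str, mount_point: str = "/workspace") -> list:
    """Build the docker run argument list with a project mount (hypothetical helper)."""
    host_path = os.path.abspath(project_dir)
    return [
        "sudo", "docker", "run", "-it",
        "--device=/dev/kfd",
        "--device=/dev/dri",
        "--group-add", "video",
        "--ipc=host",
        "--shm-size", "8G",
        "-v", f"{host_path}:{mount_point}",
        IMAGE,
    ]


if __name__ == "__main__":
    # Print a copy-pasteable command; shlex.join quotes arguments as needed.
    print(shlex.join(docker_run_argv(".")))
```

The list can be passed directly to subprocess.run, in which case no shell is involved and no quoting is needed at all.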

9. Conclusion

The Docker-based ROCm installation successfully provides:

Stable PyTorch + ROCm 6.4 integration
Working DGL framework
GPU acceleration inside container
Controlled and reproducible environment

This setup is recommended for research workflows requiring ROCm compatibility and graph-based deep learning.
