CUDA rules for Bazel
This repository contains a Starlark implementation of CUDA rules for Bazel.
It provides a set of rules and macros that make it easier to build CUDA code with Bazel.
Add the following to your MODULE.bazel file and replace the placeholders with actual values.
bazel_dep(name = "rules_cc", version = "{rules_cc_version}")
bazel_dep(name = "rules_cuda", version = "0.2.5")
# pick a specific version (this is optional and can be skipped)
archive_override(
    module_name = "rules_cuda",
    integrity = "{SRI value}",  # see https://developer.mozilla.org/en-US/docs/Web/Security/Subresource_Integrity
    url = "https://github.com/bazel-contrib/rules_cuda/archive/{git_commit_hash}.tar.gz",
    strip_prefix = "rules_cuda-{git_commit_hash}",
)
cuda = use_extension("@rules_cuda//cuda:extensions.bzl", "toolchain")
cuda.toolkit(
    name = "cuda",
    toolkit_path = "",
)
use_repo(cuda, "cuda")
rules_cc provides the C++ toolchain dependency for rules_cuda; in Bzlmod, the compatibility repository is handled by rules_cc itself.
Traditional WORKSPACE approach
Add the following to your WORKSPACE file and replace the placeholders with actual values.
load("@bazel_tools//tools/build_defs/repo:http.bzl", "http_archive")
http_archive(
    name = "rules_cc",
    sha256 = "{rules_cc_sha256}",
    strip_prefix = "rules_cc-{rules_cc_version}",
    urls = ["https://github.com/bazelbuild/rules_cc/releases/download/{rules_cc_version}/rules_cc-{rules_cc_version}.tar.gz"],
)
load("@rules_cc//cc:extensions.bzl", "compatibility_proxy_repo")
compatibility_proxy_repo()
http_archive(
    name = "rules_cuda",
    sha256 = "{sha256_to_replace}",
    strip_prefix = "rules_cuda-{git_commit_hash}",
    urls = ["https://github.com/bazel-contrib/rules_cuda/archive/{git_commit_hash}.tar.gz"],
)
load("@rules_cuda//cuda:repositories.bzl", "rules_cuda_dependencies", "rules_cuda_toolchains")
rules_cuda_dependencies()
rules_cuda_toolchains(register_toolchains = True)
rules_cc needs to be available before loading rules_cuda, and compatibility_proxy_repo() must be called to populate the compatibility repository that rules_cc expects.
NOTE: rules_cuda_toolchains implicitly calls register_detected_cuda_toolchains, and the use of
register_detected_cuda_toolchains depends on the auto-detection of installed CUDA toolkits.
For hermetic toolchains, the rules handle toolchain configuration and library downloading automatically. See the cuda.redist_json integration test for a comprehensive example.
For locally installed toolchains, _detect_local_cuda_toolkit and detect_clang determine how they are detected.
Either situation depends on cc toolchain availability, so you must also ensure the cc compiler is properly configured.
On Windows, this means that you will also need to set the environment variable BAZEL_VC properly.
- cuda_library: Compiles CUDA kernel code and creates a static library from it. The resulting targets can be consumed by C/C++ rules.
- cuda_objects: If you don't understand what device link means, you must never use it. This rule produces incomplete object files that can only be consumed by cuda_library. It is created for relocatable device code and device link time optimization source files.
- cuda_binary: A convenience macro for building CUDA-enabled executables. It builds a cc_binary-style target from CUDA sources.
- cuda_test: A convenience macro for CUDA-enabled tests. It behaves like cuda_binary but creates a cc_test-style target that can be run with bazel test.
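A minimal BUILD.bazel sketch of how these rules might fit together is shown below. It assumes the rules are loaded from @rules_cuda//cuda:defs.bzl and that cuda_library accepts an rdc attribute for relocatable device code; every target and file name here is hypothetical, so treat it as an illustration rather than a reference.

```starlark
load("@rules_cc//cc:defs.bzl", "cc_binary")
load("@rules_cuda//cuda:defs.bzl", "cuda_library", "cuda_objects")

# Common case: compile kernels into a static library and link it
# into an ordinary C++ binary. File names are placeholders.
cuda_library(
    name = "kernel",
    srcs = ["kernel.cu"],
    hdrs = ["kernel.h"],
)

cc_binary(
    name = "main",
    srcs = ["main.cpp"],
    deps = [":kernel"],
)

# Advanced case: relocatable device code split across translation units.
# cuda_objects outputs are incomplete until a cuda_library device-links them.
cuda_objects(
    name = "b_objects",
    srcs = ["b.cu"],
    hdrs = ["b.cuh"],
)

cuda_objects(
    name = "a_objects",
    srcs = ["a.cu"],
    hdrs = ["a.cuh"],
    deps = [":b_objects"],
)

# Assumption: rdc = True asks cuda_library to perform the device link.
cuda_library(
    name = "lib_from_objects",
    rdc = True,
    deps = [
        ":a_objects",
        ":b_objects",
    ],
)

cc_binary(
    name = "main_rdc",
    srcs = ["main_rdc.cpp"],
    deps = [":lib_from_objects"],
)
```

Most projects only need the first half; the cuda_objects half is worth reaching for only when device code in one file calls device code defined in another.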
Some flags are defined in cuda/BUILD.bazel. To use them, for example:
bazel build --@rules_cuda//cuda:archs=compute_61:compute_61,sm_61
In .bazelrc file, you can define a shortcut alias for the flag, for example:
# Convenient flag shortcuts.
build --flag_alias=cuda_archs=@rules_cuda//cuda:archs
and then you can use it as follows:
bazel build --cuda_archs=compute_61:compute_61,sm_61
- @rules_cuda//cuda:enable: Enable or disable all rules_cuda related rules. When disabled, the detected CUDA toolchains will also be disabled to avoid potential human error. By default, rules_cuda rules are enabled. See examples/if_cuda for how to support both cuda-enabled and cuda-free builds.
- @rules_cuda//cuda:archs: Select the CUDA archs to support. See the cuda_archs specification DSL grammar.
- @rules_cuda//cuda:compiler: Select the CUDA compiler; available options are nvcc or clang.
- @rules_cuda//cuda:copts: Add copts to all CUDA compile actions.
- @rules_cuda//cuda:host_copts: Add copts to the host compiler.
- @rules_cuda//cuda:runtime: Set the default cudart to link; for example, --@rules_cuda//cuda:runtime=@cuda//:cuda_runtime_static links the static CUDA runtime.
- --features=cuda_device_debug: Sets nvcc flags to enable debug information in device code. Currently ignored for clang, where --compilation_mode=dbg applies to both host and device code.
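To illustrate how the enable flag could be used to support both CUDA-enabled and CUDA-free builds, the sketch below gates a CUDA dependency behind select(). It assumes rules_cuda exposes a config_setting at @rules_cuda//cuda:is_enabled that follows the flag; check cuda/BUILD.bazel and examples/if_cuda for the exact label before relying on it. The CUDA_ENABLED define and all target and file names are hypothetical.

```starlark
load("@rules_cc//cc:defs.bzl", "cc_binary")
load("@rules_cuda//cuda:defs.bzl", "cuda_library")

cuda_library(
    name = "kernel",
    srcs = ["kernel.cu"],
    hdrs = ["kernel.h"],
)

cc_binary(
    name = "main",
    srcs = ["main.cpp"],
    # Assumption: is_enabled is a config_setting tied to the enable flag.
    # When it matches, the host code sees CUDA_ENABLED and links the kernels.
    defines = select({
        "@rules_cuda//cuda:is_enabled": ["CUDA_ENABLED"],
        "//conditions:default": [],
    }),
    # When CUDA is disabled, the kernel library is not pulled in at all.
    deps = select({
        "@rules_cuda//cuda:is_enabled": [":kernel"],
        "//conditions:default": [],
    }),
)
```

With this layout, main.cpp would guard its CUDA calls with the hypothetical CUDA_ENABLED macro, so the same target builds with or without a CUDA toolchain.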
Check out the examples to see if they fit your needs.
See examples for basic usage.
See rules_cuda_examples for extended real-world projects.
Sometimes the following error occurs:
cc1plus: fatal error: /tmp/tmpxft_00000002_00000019-2.cpp: No such file or directory
The problem is caused by nvcc using PIDs to determine temporary file names, and with --spawn_strategy linux-sandbox, which is the default strategy on Linux, the PIDs nvcc sees are all very small numbers (say 2~4) due to sandboxing. linux-sandbox is not hermetic because it mounts root into the sandbox, so /tmp is shared between sandboxes, which causes name conflicts under high parallelism. A similar problem has been reported on the NVIDIA forums.
To avoid it:
- Update to Bazel 7, where --incompatible_sandbox_hermetic_tmp is enabled by default.
- Using --spawn_strategy local should eliminate the problem because it lets nvcc see the true PIDs.
- Using --experimental_use_hermetic_linux_sandbox should eliminate the problem because it avoids sharing /tmp.
- Adding the -objtemp option should reduce the chance of this happening.