JACC.jl: Julia for Accelerators

CPU/GPU portable array/parallel_for/parallel_reduce in Julia for productive science.

JACC.jl leverages the LLVM-based Julia language and ecosystem, in particular JuliaGPU, through optional package extensions. Similar to performance-portability layers such as Kokkos in C++, it gives Julia users easy, vendor-neutral access to CPU/GPU computing from a single high-level source code, written in a language designed for science.

The JACC.jl programming model provides:

  1. Portable array, zeros, ones, fill metaprogramming for the selected vendor backend (CuArray, ROCArray, MtlArray, etc.); see the sketch after this list.

  2. parallel_for and parallel_reduce kernel launching: (i) basic APIs for non-experts, and (ii) low-level control APIs for threads/blocks, synchronization, multi-GPU, and shared memory usage.

  3. Backend selection using Preferences.jl: "threads" (default), "cuda", "amdgpu", "metal", and "oneAPI". The selection is stored in Julia's LocalPreferences.toml, so code stays 100% vendor-agnostic via @init_backend.
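
For example, the portable constructors compose directly with the kernel launchers. The sketch below is a minimal illustration, not official documentation: it assumes JACC.ones and JACC.fill mirror their Base counterparts, and that the kernel form of parallel_reduce sums the per-index contributions by default.

    import JACC
    JACC.@init_backend

    N = 1_000
    x = JACC.ones(Float32, N)    # backend vector of ones
    y = JACC.fill(2.0f0, N)      # backend vector filled with 2.0

    # Dot product: each index i contributes x[i] * y[i]; contributions are summed
    dotprod(i, x, y) = @inbounds x[i] * y[i]
    result = JACC.parallel_reduce(N, dotprod, x, y)
    println(result)  # ≈ 2000.0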

Goals

  1. JACC.jl provides easy access to GPU computing in Julia without having to learn the details of each backend or CPU/GPU parallel programming.

  2. Julia HPC developers can use JACC.jl as a productive meta-programming layer that adds and communicates use-case and testing coverage to the ever-growing JuliaGPU ecosystem.

  3. As a platform to advance research in portable parallel programming, e.g. shared memory, multi-GPU support, and use at science facilities.

Support and Roadmap

Julia provides a tight, interoperable ecosystem for GPU programming. Still, vendor support of some features may vary. The following table summarizes the current support status of JACC.jl features across different backends.

| Feature \ Backend | CPU | CUDA | AMDGPU | Metal | oneAPI |
| --- | --- | --- | --- | --- | --- |
| CI | ✅ | ✅ | ✅ | ✅ | TBD |
| Hardware tested | x86, Arm GH Runners | RTXA4000, GTX1080 | MI100 | M1 | TBD |
| Float64 | ✅ | ✅ | ✅ | ❌ | ✅ (if supported) |
| Multi (GPU) | N/A | ✅ | ✅ | ❌ | ❌ |
| shared | N/A | ✅ | ✅ | ✅ | ✅ |
| @atomic | ✅ | ✅ | ✅ | ✅ | ✅ |
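
As an illustration of the @atomic row, a histogram-style kernel might look like the sketch below. This is a hedged example: it assumes JACC.@atomic follows the usual @atomic update syntax from Base and the JuliaGPU backends, which is needed here because many indices may increment the same bin concurrently.

    import JACC
    JACC.@init_backend

    # Bin values from [0, 1) into nbins counters; concurrent increments need atomics
    function histogram!(i, counts, data, nbins)
        @inbounds begin
            bin = min(nbins, 1 + floor(Int32, data[i] * nbins))
            JACC.@atomic counts[bin] += Int32(1)
        end
    end

    n = 10_000
    data = JACC.array(rand(Float32, n))
    counts = JACC.zeros(Int32, 16)
    JACC.parallel_for(n, histogram!, counts, data, Int32(16))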

Roadmap:

  • JACC v1.0 stable API including @parallel_for/@parallel_reduce macros by early 2026.
  • JACC.BLAS for kernel-level linear algebra routines.
  • Expand JACC's ecosystem, including scientific applications; see JACC-Applications.
  • New functionality impacting scientific users.

Quick start

  1. JACC.jl is a registered Julia package. Install it like any other package:

    $ julia
    julia> import Pkg
    julia> Pkg.add("JACC")
  2. Set a backend (outside of your application code): "cuda", "amdgpu", "metal", "oneAPI", or "threads" (default). This will generate a LocalPreferences.toml file.

    julia> import JACC
    julia> JACC.set_backend("cuda")

    NOTE: This will also add the backend package (CUDA.jl in this case) as a dependency to the current project.
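
    For reference, the generated LocalPreferences.toml contains a small entry along these lines (the section and key names shown here are assumed and may differ across JACC versions):

    [JACC]
    backend = "cuda"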

  3. Import JACC and load the appropriate extension. @init_backend inserts an import statement so that you don't have to reference a specific backend in your code. (It must therefore be called at top-level scope.)

    import JACC
    JACC.@init_backend

    NOTE: If the backend is set to anything other than "threads" and @init_backend has not been called, most JACC functions will fail with an error referencing the missing backend, e.g. for CUDA: get_backend(::Val{:cuda}).
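
    Conceptually, with the "cuda" preference set, the macro behaves like the hand-written equivalent below (a simplified sketch; the actual expansion is handled by JACC and also loads the corresponding package extension):

    import JACC
    import CUDA   # what JACC.@init_backend effectively inserts for backend = "cuda"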

  4. Run a kernel example (see the tests directory). Copy the code below into a jacc-example.jl file:

    import JACC
    JACC.@init_backend
    
    # AXPY kernel: i is the parallel index supplied by parallel_for
    function axpy(i, alpha, x, y)
      @inbounds x[i] += alpha * y[i]
    end
    
    N = 100_000
    alpha = 2.0
    x = JACC.zeros(Float32, N)            # allocated with the backend's array type
    y = JACC.array(fill(Float32(5), N))   # host Array converted to the backend's array type
    JACC.parallel_for(N, axpy, alpha, x, y)
    a = JACC.parallel_reduce(x)           # default reduction: sum of x
    println("Result: ", a)

    and run it:

    • CPU using 4 threads: $ julia -t 4 jacc-example.jl
    • GPU: $ julia jacc-example.jl

    The first run will take a while, since the backend dependency packages are downloaded and precompiled.

    NOTE: JACC.array converts a Base.Array to the array type used by the current backend. If you need to refer to that array type in your code (for example, as a function parameter or in a struct), use JACC.array_type():

    const JACCArray = JACC.array_type()
    function f(a::JACCArray{Float32, 1})
        # ...
    end
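
    The same type works for struct fields. A minimal sketch (the struct and field names are illustrative, not part of JACC):

    # Hypothetical container for backend-resident simulation state,
    # reusing JACCArray from the snippet above
    struct Particles
        pos::JACCArray{Float32, 1}
        vel::JACCArray{Float32, 1}
    end

    p = Particles(JACC.zeros(Float32, 10), JACC.zeros(Float32, 10))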

Resources

Citation

If you find JACC.jl useful, please cite the following paper from the SC24 WACCPD workshop; an open version is available here.

@INPROCEEDINGS{JACC,
  author={Valero-Lara, Pedro and Godoy, William F and Mankad, Het and Teranishi, Keita and Vetter, Jeffrey S and Blaschke, Johannes and Schanen, Michel},
  booktitle={Proceedings of the SC '24 Workshops of The International Conference on High Performance Computing, Network, Storage, and Analysis},
  title={{JACC: Leveraging HPC Meta-Programming and Performance Portability with the Just-in-Time and LLVM-based Julia Language}},
  year={2024},
  doi={10.1109/SCW63240.2024.00245}
}

Sponsors:

JACC.jl is funded by the US Department of Energy Advanced Scientific Computing Research (ASCR) projects:

  • S4PST, part of the Next Generation of Scientific Software Technologies (NGSST) ASCR Program.
  • NGSST sponsors the Consortium for the Advancement of Scientific Software (CASS).
  • ASCR Competitive Portfolios for Advanced Scientific Computing Research, MAGMA/Fairbanks project.

Former sponsors:
