Description
normalization_kernel_wrapper() in zendnnl/src/lowoha_operators/normalization/lowoha_normalization.cpp unconditionally dispatches RMS_NORM and FUSED_ADD_RMS_NORM to rms_norm_avx512() without checking whether the CPU supports AVX-512. This causes SIGILL (Illegal Instruction, exit code 132) on Zen 3 and earlier processors that only support AVX2.
Reproduction
CPU: AMD EPYC 7313 16-Core (Zen 3 / Milan) — AVX2 only, no AVX-512
$ OMP_NUM_THREADS=4 ./build/install/examples/bin/examples
...
Thread 1 "examples" received signal SIGILL, Illegal instruction.
0x... in zendnnl::lowoha::normalization::rms_norm_avx512(...) ()
#0 rms_norm_avx512()
#1 normalization_kernel_wrapper()
#2 normalization_direct()
#3 run_lowoha_rms_norm_fp32_example()
#4 main()
The crash originates from run_lowoha_rms_norm_fp32_example() in the examples binary. Any user code calling normalization_direct() with norm_type_t::RMS_NORM or norm_type_t::FUSED_ADD_RMS_NORM will also crash on non-AVX-512 hardware, since both types take the same dispatch branch.
Root cause
In lowoha_normalization.cpp:37-46:
    if (params.norm_type == norm_type_t::RMS_NORM ||
        params.norm_type == norm_type_t::FUSED_ADD_RMS_NORM) {
      log_info("Using AVX512 kernel for ", norm_type_to_str(params.norm_type));
      status_t status = rms_norm_avx512(input, output, residual, gamma, params);
      // ...
      return status;
    }
There is no get_avx512f_status() check before the AVX-512 path is called. Other operators (e.g., matmul) correctly guard ISA-specific paths: see lowoha_matmul.cpp:351, which checks get_f16_status() before F16 dispatch.
LayerNorm and BatchNorm are unaffected because they take the else branch to normalization_reference_wrapper(), which is a portable reference implementation. RMS_NORM has no fallback path.
Suggested fix
Add the platform check and fall through to the existing reference kernel:
    if (params.norm_type == norm_type_t::RMS_NORM ||
        params.norm_type == norm_type_t::FUSED_ADD_RMS_NORM) {
      if (zendnnl_platform_info().get_avx512f_status()) {
        log_info("Using AVX512 kernel for ", norm_type_to_str(params.norm_type));
        status_t status = rms_norm_avx512(input, output, residual, gamma, params);
        if (status != status_t::success) {
          log_error(norm_type_to_str(params.norm_type), " kernel failed");
        }
        return status;
      }
      log_info("AVX512 not available, using reference kernel for ",
               norm_type_to_str(params.norm_type));
    }
This requires adding #include "common/zendnnl_global.hpp" for zendnnl_platform_info().
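For anyone who wants to try the guard pattern outside the library, here is a minimal standalone sketch. It is not ZenDNN code: cpu_has_avx512f(), select_rms_norm_kernel(), rms_norm_reference(), and rms_kernel_t are hypothetical names made up for illustration, the CPU query uses the GCC/Clang builtin __builtin_cpu_supports() rather than zendnnl_platform_info(), and the scalar kernel assumes the usual RMSNorm definition y[i] = x[i] / sqrt(mean(x^2) + eps) * gamma[i].

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>

// Hypothetical kernel tag, for illustration only (not a ZenDNN type).
enum class rms_kernel_t { avx512, reference };

// Hypothetical stand-in for the library's platform query.
// Uses the GCC/Clang x86 builtin; reports "no AVX-512F" on other toolchains.
inline bool cpu_has_avx512f() {
#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__x86_64__) || defined(__i386__))
  return __builtin_cpu_supports("avx512f") != 0;
#else
  return false;
#endif
}

// The guard itself: choose the ISA-specific kernel only when the CPU
// reports the feature, otherwise choose the portable path.
inline rms_kernel_t select_rms_norm_kernel(bool has_avx512f) {
  return has_avx512f ? rms_kernel_t::avx512 : rms_kernel_t::reference;
}

// Portable scalar RMSNorm, assuming the common definition:
//   y[i] = x[i] / sqrt(mean(x^2) + eps) * gamma[i]
inline void rms_norm_reference(const float* x, const float* gamma, float* y,
                               std::size_t n, float eps) {
  assert(n > 0);
  float sum_sq = 0.0f;
  for (std::size_t i = 0; i < n; ++i) {
    sum_sq += x[i] * x[i];
  }
  const float inv_rms =
      1.0f / std::sqrt(sum_sq / static_cast<float>(n) + eps);
  for (std::size_t i = 0; i < n; ++i) {
    y[i] = x[i] * inv_rms * gamma[i];
  }
}
```

A caller would evaluate select_rms_norm_kernel(cpu_has_avx512f()) once and branch on the result. Taking the capability as a parameter keeps the selection logic testable on any machine, including one without AVX-512.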
Impact
- Severity: Any call to LOWOHA RMS_NORM or FUSED_ADD_RMS_NORM crashes with SIGILL on Zen 3 and earlier
- Workaround: None without code change — the only code path for RMS_NORM is the AVX-512 kernel
- Scope: Affects all users running on non-AVX-512 AMD CPUs (Zen 1/2/3, EPYC 7001/7002/7003 series)