XLlamaCPP now has Vulkan wheels for Windows and Linux. They still include the same CPU optimizations as the standard build. Note that while there is also a Vulkan OSX wheel, it is intended for Intel Macs that lack Metal acceleration.
Some code for matching available VRAM to GGUF layers is available here and may be useful when combined with this GGUF layer reader code.
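As a rough illustration of the idea (not the linked code), the sketch below counts how many leading layers fit into a VRAM budget. The function name `layers_that_fit`, the `reserve` headroom, and the sizes used are all hypothetical; in practice the per-layer sizes would come from a GGUF layer reader and the free VRAM from the GPU driver.

```python
from typing import Sequence


def layers_that_fit(
    layer_sizes: Sequence[int],
    free_vram: int,
    reserve: int = 512 * 1024**2,
) -> int:
    """Count how many leading layers fit in free_vram (all values in bytes).

    `reserve` is an assumed safety margin held back for the KV cache and
    scratch buffers; the right value depends on context size and backend.
    """
    budget = free_vram - reserve
    count = 0
    for size in layer_sizes:
        if size > budget:
            break  # layers are offloaded in order, so stop at the first miss
        budget -= size
        count += 1
    return count


if __name__ == "__main__":
    # Made-up example: 32 layers of 200 MiB each and 4 GiB of free VRAM.
    sizes = [200 * 1024**2] * 32
    print(layers_that_fit(sizes, 4 * 1024**3))
```

The returned count could then serve as a starting value for llama.cpp's `n_gpu_layers` setting, lowered further if allocation still fails at load time.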