Open
Description
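The new small Qwen3.5 models fail to load with DMR (Docker Model Runner). Each of the GGUF builds below is rejected by the bundled llama.cpp with "unknown model architecture: 'qwen35'".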
https://huggingface.co/unsloth/Qwen3.5-0.8B-GGUF
Failed to generate a response: error response: status=500 body=unable to load runner: error waiting for runner to be ready: llama.cpp terminated unexpectedly: llama.cpp failed: llama_model_load: error loading model: error loading model architecture: unknown model architecture: 'qwen35'
llama_model_load_from_file_impl: failed to load model
common_init_from_params: failed to load model '/Users/k33g/.docker/models/bundles/sha256/387dba4409dea9bcf2a7bb865d4d200c1bcc58ffd9c9cac43679d76c468231a2/model/model.gguf'
srv load_model: failed to load model, '/Users/k33g/.docker/models/bundles/sha256/387dba4409dea9bcf2a7bb865d4d200c1bcc58ffd9c9cac43679d76c468231a2/model/model.gguf'
srv operator(): operator(): cleaning up before exit...
main: exiting due to model loading error
quantize.imatrix.chunks_count u32 = 80
llama_model_loader: - type f32: 133 tensors
llama_model_loader: - type q8_0: 36 tensors
llama_model_loader: - type q4_K: 98 tensors
llama_model_loader: - type q5_K: 36 tensors
llama_model_loader: - type q6_K: 17 tensors
print_info: file format = GGUF V3 (latest)
print_info: file type = Q4_K - Medium
print_info: file size = 497.39 MiB (5.55 BPW)
https://huggingface.co/unsloth/Qwen3.5-2B-GGUF
Failed to generate a response: error response: status=500 body=unable to load runner: error waiting for runner to be ready: llama.cpp terminated unexpectedly: llama.cpp failed: llama_model_load: error loading model: error loading model architecture: unknown model architecture: 'qwen35'
llama_model_load_from_file_impl: failed to load model
common_init_from_params: failed to load model '/Users/k33g/.docker/models/bundles/sha256/bf39efff6e5f95df945cd0d1d97c7dc0ab204d032c34c20df7e3d6b9df2812e6/model/model.gguf'
srv load_model: failed to load model, '/Users/k33g/.docker/models/bundles/sha256/bf39efff6e5f95df945cd0d1d97c7dc0ab204d032c34c20df7e3d6b9df2812e6/model/model.gguf'
srv operator(): operator(): cleaning up before exit...
main: exiting due to model loading error
quantize.imatrix.chunks_count u32 = 80
llama_model_loader: - type f32: 133 tensors
llama_model_loader: - type q8_0: 36 tensors
llama_model_loader: - type q4_K: 98 tensors
llama_model_loader: - type q5_K: 36 tensors
llama_model_loader: - type q6_K: 17 tensors
print_info: file format = GGUF V3 (latest)
print_info: file type = Q4_K - Medium
print_info: file size = 1.18 GiB (5.40 BPW)
https://huggingface.co/unsloth/Qwen3.5-4B-GGUF
Failed to generate a response: error response: status=500 body=unable to load runner: error waiting for runner to be ready: llama.cpp terminated unexpectedly: llama.cpp failed: llama_model_load: error loading model: error loading model architecture: unknown model architecture: 'qwen35'
llama_model_load_from_file_impl: failed to load model
common_init_from_params: failed to load model '/Users/k33g/.docker/models/bundles/sha256/54cd4d8a351047d5657aa9aee31d1ffbafed33819d0a2fe7275313d694cce220/model/model.gguf'
srv load_model: failed to load model, '/Users/k33g/.docker/models/bundles/sha256/54cd4d8a351047d5657aa9aee31d1ffbafed33819d0a2fe7275313d694cce220/model/model.gguf'
srv operator(): operator(): cleaning up before exit...
main: exiting due to model loading error
quantize.imatrix.chunks_count u32 = 80
llama_model_loader: - type f32: 177 tensors
llama_model_loader: - type q8_0: 48 tensors
llama_model_loader: - type q4_K: 131 tensors
llama_model_loader: - type q5_K: 48 tensors
llama_model_loader: - type q6_K: 22 tensors
print_info: file format = GGUF V3 (latest)
print_info: file type = Q4_K - Medium
print_info: file size = 2.54 GiB (5.19 BPW)
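For anyone who wants to confirm which architecture string a downloaded file declares before handing it to the runner, here is a minimal sketch that reads the `general.architecture` key from the GGUF metadata header. It assumes the GGUF v2/v3 layout (little-endian, 64-bit counts), which matches the "GGUF V3" reported in the logs above. The function names are hypothetical helpers written for this illustration and are not part of DMR or llama.cpp.

```python
import struct
from typing import BinaryIO, Optional

# GGUF metadata value type ids mapped to struct formats (fixed-size scalars only).
# Assumption: GGUF v2/v3, little-endian.
_SCALARS = {0: "<B", 1: "<b", 2: "<H", 3: "<h", 4: "<I", 5: "<i",
            6: "<f", 7: "<?", 10: "<Q", 11: "<q", 12: "<d"}
_STRING, _ARRAY = 8, 9


def _unpack(f: BinaryIO, fmt: str):
    """Read exactly one value described by a struct format string."""
    size = struct.calcsize(fmt)
    data = f.read(size)
    if len(data) != size:
        raise ValueError("unexpected end of file")
    return struct.unpack(fmt, data)[0]


def _read_string(f: BinaryIO) -> str:
    """GGUF strings are a uint64 byte length followed by UTF-8 bytes."""
    length = _unpack(f, "<Q")
    data = f.read(length)
    if len(data) != length:
        raise ValueError("unexpected end of file")
    return data.decode("utf-8")


def _read_value(f: BinaryIO, vtype: int):
    """Read one metadata value; every value is read so later keys stay aligned."""
    if vtype in _SCALARS:
        return _unpack(f, _SCALARS[vtype])
    if vtype == _STRING:
        return _read_string(f)
    if vtype == _ARRAY:
        item_type = _unpack(f, "<I")
        count = _unpack(f, "<Q")
        return [_read_value(f, item_type) for _ in range(count)]
    raise ValueError(f"unknown GGUF value type: {vtype}")


def read_gguf_architecture(f: BinaryIO) -> Optional[str]:
    """Return the value of 'general.architecture', or None if the key is absent."""
    if f.read(4) != b"GGUF":
        raise ValueError("not a GGUF file")
    _version = _unpack(f, "<I")
    _tensor_count = _unpack(f, "<Q")
    kv_count = _unpack(f, "<Q")
    for _ in range(kv_count):
        key = _read_string(f)
        vtype = _unpack(f, "<I")
        value = _read_value(f, vtype)
        if key == "general.architecture":
            return value
    return None
```

Usage would look like `with open(path, "rb") as f: print(read_gguf_architecture(f))`, where `path` points at one of the `model.gguf` files from the logs. If the sketch holds for these files, it should report the same `qwen35` string that llama.cpp says it does not recognize, which would point at the llama.cpp build shipped with the runner rather than at a corrupted download.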