[bug] Unable to use Qwen3.5 0.8b 2b 4b with DMR #724

@k33g

Description

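The new small Qwen3.5 models (0.8B, 2B, and 4B) fail to load with DMR (Docker Model Runner). Each request returns a 500 error because llama.cpp rejects the model with `unknown model architecture: 'qwen35'`. The model link and the log for each size: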

https://huggingface.co/unsloth/Qwen3.5-0.8B-GGUF

Failed to generate a response: error response: status=500 body=unable to load runner: error waiting for runner to be ready: llama.cpp terminated unexpectedly: llama.cpp failed: llama_model_load: error loading model: error loading model architecture: unknown model architecture: 'qwen35'
llama_model_load_from_file_impl: failed to load model
common_init_from_params: failed to load model '/Users/k33g/.docker/models/bundles/sha256/387dba4409dea9bcf2a7bb865d4d200c1bcc58ffd9c9cac43679d76c468231a2/model/model.gguf'
srv    load_model: failed to load model, '/Users/k33g/.docker/models/bundles/sha256/387dba4409dea9bcf2a7bb865d4d200c1bcc58ffd9c9cac43679d76c468231a2/model/model.gguf'
srv    operator(): operator(): cleaning up before exit...
main: exiting due to model loading error
        quantize.imatrix.chunks_count u32              = 80
llama_model_loader: - type  f32:  133 tensors
llama_model_loader: - type q8_0:   36 tensors
llama_model_loader: - type q4_K:   98 tensors
llama_model_loader: - type q5_K:   36 tensors
llama_model_loader: - type q6_K:   17 tensors
print_info: file format = GGUF V3 (latest)
print_info: file type   = Q4_K - Medium
print_info: file size   = 497.39 MiB (5.55 BPW)

https://huggingface.co/unsloth/Qwen3.5-2B-GGUF

Failed to generate a response: error response: status=500 body=unable to load runner: error waiting for runner to be ready: llama.cpp terminated unexpectedly: llama.cpp failed: llama_model_load: error loading model: error loading model architecture: unknown model architecture: 'qwen35'
llama_model_load_from_file_impl: failed to load model
common_init_from_params: failed to load model '/Users/k33g/.docker/models/bundles/sha256/bf39efff6e5f95df945cd0d1d97c7dc0ab204d032c34c20df7e3d6b9df2812e6/model/model.gguf'
srv    load_model: failed to load model, '/Users/k33g/.docker/models/bundles/sha256/bf39efff6e5f95df945cd0d1d97c7dc0ab204d032c34c20df7e3d6b9df2812e6/model/model.gguf'
srv    operator(): operator(): cleaning up before exit...
main: exiting due to model loading error
          quantize.imatrix.chunks_count u32              = 80
llama_model_loader: - type  f32:  133 tensors
llama_model_loader: - type q8_0:   36 tensors
llama_model_loader: - type q4_K:   98 tensors
llama_model_loader: - type q5_K:   36 tensors
llama_model_loader: - type q6_K:   17 tensors
print_info: file format = GGUF V3 (latest)
print_info: file type   = Q4_K - Medium
print_info: file size   = 1.18 GiB (5.40 BPW)

https://huggingface.co/unsloth/Qwen3.5-4B-GGUF

Failed to generate a response: error response: status=500 body=unable to load runner: error waiting for runner to be ready: llama.cpp terminated unexpectedly: llama.cpp failed: llama_model_load: error loading model: error loading model architecture: unknown model architecture: 'qwen35'
llama_model_load_from_file_impl: failed to load model
common_init_from_params: failed to load model '/Users/k33g/.docker/models/bundles/sha256/54cd4d8a351047d5657aa9aee31d1ffbafed33819d0a2fe7275313d694cce220/model/model.gguf'
srv    load_model: failed to load model, '/Users/k33g/.docker/models/bundles/sha256/54cd4d8a351047d5657aa9aee31d1ffbafed33819d0a2fe7275313d694cce220/model/model.gguf'
srv    operator(): operator(): cleaning up before exit...
main: exiting due to model loading error
          quantize.imatrix.chunks_count u32              = 80
llama_model_loader: - type  f32:  177 tensors
llama_model_loader: - type q8_0:   48 tensors
llama_model_loader: - type q4_K:  131 tensors
llama_model_loader: - type q5_K:   48 tensors
llama_model_loader: - type q6_K:   22 tensors
print_info: file format = GGUF V3 (latest)
print_info: file type   = Q4_K - Medium
print_info: file size   = 2.54 GiB (5.19 BPW)
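
All three logs stop at the same place: llama.cpp reads the GGUF header and rejects the architecture string `qwen35`, which suggests the llama.cpp build bundled with DMR does not know this architecture yet. For triage, here is a minimal sketch that reads the `general.architecture` key straight from a GGUF file without loading the model. It assumes a little-endian GGUF v2 or v3 file laid out as in the public GGUF spec; `read_gguf_architecture` and its helpers are hypothetical names written for this example, not part of llama.cpp or DMR.

```python
import io
import struct
from typing import BinaryIO, Optional

# Sizes in bytes of the fixed-width GGUF metadata value types (type id -> size).
_FIXED_SIZES = {0: 1, 1: 1, 2: 2, 3: 2, 4: 4, 5: 4, 6: 4, 7: 1, 10: 8, 11: 8, 12: 8}
_STRING, _ARRAY = 8, 9


def _read_u32(f: BinaryIO) -> int:
    return struct.unpack("<I", f.read(4))[0]


def _read_u64(f: BinaryIO) -> int:
    return struct.unpack("<Q", f.read(8))[0]


def _read_string(f: BinaryIO) -> str:
    # GGUF strings are a uint64 byte length followed by UTF-8 bytes.
    length = _read_u64(f)
    return f.read(length).decode("utf-8")


def _skip_value(f: BinaryIO, vtype: int) -> None:
    # Advance past one metadata value without decoding it.
    if vtype in _FIXED_SIZES:
        f.seek(_FIXED_SIZES[vtype], io.SEEK_CUR)
    elif vtype == _STRING:
        f.seek(_read_u64(f), io.SEEK_CUR)
    elif vtype == _ARRAY:
        elem_type = _read_u32(f)
        count = _read_u64(f)
        if elem_type in _FIXED_SIZES:
            f.seek(_FIXED_SIZES[elem_type] * count, io.SEEK_CUR)
        else:
            for _ in range(count):
                _skip_value(f, elem_type)
    else:
        raise ValueError(f"unknown GGUF value type: {vtype}")


def read_gguf_architecture(f: BinaryIO) -> Optional[str]:
    """Return the value of `general.architecture`, or None if the key is absent."""
    if f.read(4) != b"GGUF":
        raise ValueError("not a GGUF file (bad magic)")
    version = _read_u32(f)
    if version < 2:
        # Assumption: only v2+ headers (64-bit counts and lengths) are handled.
        raise ValueError(f"unsupported GGUF version: {version}")
    _read_u64(f)  # tensor count, not needed here
    kv_count = _read_u64(f)
    for _ in range(kv_count):
        key = _read_string(f)
        vtype = _read_u32(f)
        if key == "general.architecture" and vtype == _STRING:
            return _read_string(f)
        _skip_value(f, vtype)
    return None
```

To use it, open one of the `model.gguf` files from the logs above in binary mode, for example `with open(path, "rb") as f: print(read_gguf_architecture(f))`, where `path` is the bundle path reported by llama.cpp. If it reports `qwen35`, the file header itself is readable and the failure is on the runtime side, not a corrupted download.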
