
Vision doesn't work with model lmstudio-community/Magistral-Small-2509-MLX-4bit #855

@schnow265

Description


The lmstudio-community Magistral model fails when given image input:

==========
Files: ['https://www.kernel.org/theme/images/logos/tux.png'] 

Prompt: <s>[SYSTEM_PROMPT]First draft your thinking process (inner monologue) until you arrive at a response. Format your response using Markdown, and use LaTeX for any mathematical equations. Write both your thoughts and the response in the same language as the input.

Your thinking process must follow the template below:[THINK]Your thoughts or/and draft, like working through an exercise on scratch paper. Be as casual and as long as you want until you are confident to generate the response. Use the same language as the input.[/THINK]Here, provide a self-contained response.[/SYSTEM_PROMPT][INST][IMG]What is in this image?[/INST]
Traceback (most recent call last):
  File "/Users/schnow265/.local/bin/mlx_vlm.generate", line 10, in <module>
    sys.exit(main())
             ^^^^^^
  File "/Users/schnow265/.local/share/uv/tools/mlx-vlm/lib/python3.12/site-packages/mlx_vlm/generate.py", line 1466, in main
    result = generate(
             ^^^^^^^^^
  File "/Users/schnow265/.local/share/uv/tools/mlx-vlm/lib/python3.12/site-packages/mlx_vlm/generate.py", line 694, in generate
    for response in stream_generate(model, processor, prompt, image, audio, **kwargs):
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/schnow265/.local/share/uv/tools/mlx-vlm/lib/python3.12/site-packages/mlx_vlm/generate.py", line 582, in stream_generate
    for n, (token, logprobs) in enumerate(gen):
                                ^^^^^^^^^^^^^^
  File "/Users/schnow265/.local/share/uv/tools/mlx-vlm/lib/python3.12/site-packages/mlx_vlm/generate.py", line 421, in generate_step
    embedding_output = model.get_input_embeddings(
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/schnow265/.local/share/uv/tools/mlx-vlm/lib/python3.12/site-packages/mlx_vlm/models/mistral3/mistral3.py", line 270, in get_input_embeddings
    final_inputs_embeds = self.merge_input_ids_with_image_features(
                          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/schnow265/.local/share/uv/tools/mlx-vlm/lib/python3.12/site-packages/mlx_vlm/models/mistral3/mistral3.py", line 317, in merge_input_ids_with_image_features
    raise ValueError(
ValueError: Number of image token positions (49) does not match number of image features (16) for batch 0
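For context, the check that raises this error can be sketched in plain Python. This is a hypothetical simplification, not the actual mlx_vlm code: it only illustrates the invariant that merge_input_ids_with_image_features enforces, namely that the number of image placeholder tokens in the tokenized prompt must equal the number of image feature vectors produced by the vision tower. The function name merge_check and all its parameters are made up for illustration.

```python
def merge_check(input_ids, image_token_id, num_image_features, batch=0):
    """Sketch of the invariant behind the ValueError in this issue.

    input_ids: flat list of token ids for one prompt
    image_token_id: the id the tokenizer uses for each [IMG] placeholder
    num_image_features: number of feature vectors the vision tower produced
    """
    # Find every position in the prompt that should receive an image feature.
    positions = [i for i, t in enumerate(input_ids) if t == image_token_id]
    if len(positions) != num_image_features:
        raise ValueError(
            f"Number of image token positions ({len(positions)}) does not "
            f"match number of image features ({num_image_features}) "
            f"for batch {batch}"
        )
    return positions
```

In this issue the tokenizer/processor expanded the image placeholder to 49 token positions, while the vision tower returned only 16 patch features, so the two sides of this check disagree — which typically points at mismatched image-resolution or patch-count configuration between the processor and the quantized model.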

Steps to reproduce:

Run the command:

mlx_vlm.generate --model lmstudio-community/Magistral-Small-2509-MLX-4bit --image https://www.kernel.org/theme/images/logos/tux.png --prompt "What is in this image?"

Environment:

MacBook Pro M5 Pro
24 GB RAM
