Labels: bug
Bug
When using Docling picture description through the OpenAI-compatible API VLM engine against Azure OpenAI GPT-5 chat deployments, Docling sends request parameters that Azure rejects.
Observed issues:
- `max_tokens` is sent, but Azure requires `max_completion_tokens` for this model
- `temperature=0.0` is sent, but this deployment only accepts the default temperature value
This makes picture description fail unless downstream users monkeypatch Docling internals to adjust the request parameters.
Relevant areas appear to be:
- `docling/models/inference_engines/vlm/api_openai_compatible_engine.py`
- `docling/models/stages/picture_description/picture_description_vlm_engine_model.py`
The failure path appears to be:
- picture description input is created with `temperature=0.0` and `max_new_tokens=200`
- the OpenAI-compatible API engine maps `max_new_tokens` to `max_tokens`
- Azure rejects the request
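One way to break this failure path is to translate the payload before it reaches Azure. The helper below is a hypothetical sketch (the function name and dict-level approach are mine, not Docling's): it renames `max_tokens` to `max_completion_tokens` and drops any non-default `temperature`, which are exactly the two parameters Azure rejects here.

```python
def adapt_for_azure_gpt5(payload: dict) -> dict:
    """Hypothetical helper, not part of Docling: rewrite an OpenAI-style
    chat-completions payload so an Azure GPT-5 chat deployment accepts it."""
    out = dict(payload)
    # Azure GPT-5 chat deployments reject 'max_tokens' and expect
    # 'max_completion_tokens' instead.
    if "max_tokens" in out:
        out["max_completion_tokens"] = out.pop("max_tokens")
    # These deployments only accept the default temperature (1), so drop
    # any explicit non-default value such as 0.0.
    if out.get("temperature") not in (None, 1, 1.0):
        out.pop("temperature")
    return out
```

A fix inside Docling would presumably apply an equivalent mapping conditionally, since non-Azure OpenAI-compatible backends still accept `max_tokens` and explicit temperatures.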
Typical Azure errors:
- `Unsupported parameter: 'max_tokens' is not supported with this model. Use 'max_completion_tokens' instead.`
- `Unsupported value: 'temperature' does not support 0.0 with this model. Only the default (1) value is supported.`
Steps to reproduce
- Configure Docling picture description with `PictureDescriptionVlmEngineOptions.from_preset(...)`
- Use `ApiVlmEngineOptions(engine_type=VlmEngineType.API_OPENAI, ...)`
- Point the URL to an Azure OpenAI chat completions endpoint, for example `https://<resource>.openai.azure.com/openai/deployments/<deployment>/chat/completions?api-version=<version>`
- Use a GPT-5 chat deployment that supports image input
- Run document conversion with picture description enabled
- Observe request failure from Azure OpenAI because of unsupported generation parameters
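The steps above imply a request body like the sketch below. This is my reconstruction of what the engine is believed to send, based on the failure path described earlier (`temperature=0.0`, and `max_new_tokens=200` mapped to `max_tokens`); the message content is illustrative, not an actual capture of Docling's request.

```python
import json

# Reconstructed chat-completions body; the two annotated fields are the
# ones Azure GPT-5 chat deployments reject.
payload = {
    "model": "<deployment>",
    "messages": [
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Describe this picture."},
                {"type": "image_url",
                 "image_url": {"url": "data:image/png;base64,..."}},
            ],
        }
    ],
    "temperature": 0.0,  # rejected: only the default (1) is supported
    "max_tokens": 200,   # rejected: Azure expects 'max_completion_tokens'
}
print(json.dumps(payload, indent=2))
```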
Docling version
2.78.0
Python version
3.12