Embeddings requests to /v1/embeddings are mis-handled and fail with chat/completions validation errors #2418

@sbekkerm

Summary

Valid POST /v1/embeddings requests (OpenAI-compatible embeddings API) are treated as chat-completions or completions requests by the extension. Because an embeddings body carries model and input rather than messages or prompt, validation fails and the extension returns “failed to extract request data” or “invalid chat-completions request: chat-completions request must have at least one message”.

The extension does not recognize the embeddings API path or request shape, so it never accepts embeddings as valid.

Root cause

Request body handling — pkg/epp/util/request/body.go

  • Path dispatch: determineAPITypeFromPath (lines 57–73) only handles /v1/conversations, /v1/responses, /v1/chat/completions, and /v1/completions. There is no branch for /v1/embeddings, so the path falls through to the default (completionsAPI).
  • Body parsing: ExtractRequestBody has no case for an embeddings-shaped body (model + input). Requests are therefore interpreted as completions (expecting prompt) or, when the path is normalized elsewhere, as chat-completions (expecting messages), which produces the errors above.
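The path dispatch described above could gain an embeddings branch along the following lines. This is a minimal sketch, not the actual code in body.go: the apiType enum, every constant other than completionsAPI, and the suffix-matching strategy are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// apiType is a hypothetical enum for this sketch; the real type and
// constant names in body.go may differ.
type apiType int

const (
	completionsAPI apiType = iota // default branch, as described in the issue
	chatCompletionsAPI
	conversationsAPI
	responsesAPI
	embeddingsAPI // proposed addition
)

// determineAPITypeFromPath mirrors the dispatch described in the issue and
// adds a branch for /v1/embeddings. Matching on the path suffix is an
// assumption of this sketch.
func determineAPITypeFromPath(path string) apiType {
	// Drop any query string and trailing slash before matching.
	if i := strings.IndexByte(path, '?'); i >= 0 {
		path = path[:i]
	}
	path = strings.TrimSuffix(path, "/")

	switch {
	case strings.HasSuffix(path, "/v1/conversations"):
		return conversationsAPI
	case strings.HasSuffix(path, "/v1/responses"):
		return responsesAPI
	case strings.HasSuffix(path, "/v1/chat/completions"):
		return chatCompletionsAPI
	case strings.HasSuffix(path, "/v1/embeddings"):
		return embeddingsAPI
	default:
		// Unknown paths keep the existing default.
		return completionsAPI
	}
}

func main() {
	for _, p := range []string{"/v1/embeddings", "/v1/chat/completions", "/v1/completions"} {
		fmt.Printf("%s -> %d\n", p, determineAPITypeFromPath(p))
	}
}
```

With a dedicated type in place, the body parser can switch on it instead of guessing between prompt and messages.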

Response handling — pkg/epp/handlers/response.go

  • Usage extraction: extractUsageByAPIType only has explicit handling for the response, conversation, chat.completion, chat.completion.chunk, and text_completion object types. Embeddings responses carry a top-level "object": "list" and a usage object with prompt_tokens and total_tokens (OpenAI's embeddings API omits completion_tokens, though some compatible servers may still send it). There is no explicit embeddings/list case, so usage extraction falls back to the default branch and embeddings are not treated as a first-class API type.
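An explicit case for embeddings responses might look roughly like the sketch below. The function name extractUsageByObject, the usage struct, and the returned kind strings are hypothetical stand-ins; the real extractUsageByAPIType signature is not shown in this issue and may differ. Only the object values named above are handled.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// usage holds the token counters discussed in the issue. completion_tokens
// stays zero when the response omits it.
type usage struct {
	PromptTokens     int `json:"prompt_tokens"`
	CompletionTokens int `json:"completion_tokens"`
	TotalTokens      int `json:"total_tokens"`
}

// extractUsageByObject is a hypothetical stand-in for the handler's usage
// extraction. It classifies the response by its top-level "object" field.
func extractUsageByObject(body []byte) (string, usage, error) {
	var resp struct {
		Object string `json:"object"`
		Usage  *usage `json:"usage"`
	}
	if err := json.Unmarshal(body, &resp); err != nil {
		return "", usage{}, fmt.Errorf("decode response: %w", err)
	}

	var kind string
	switch resp.Object {
	case "chat.completion", "chat.completion.chunk":
		kind = "chat_completions"
	case "text_completion":
		kind = "completions"
	case "list":
		// Embeddings responses use "object": "list". Other list-style
		// endpoints may use the same value, so a real implementation would
		// likely combine this with the request path.
		kind = "embeddings"
	default:
		kind = "default"
	}

	var u usage
	if resp.Usage != nil {
		u = *resp.Usage
	}
	return kind, u, nil
}

func main() {
	body := []byte(`{"object":"list","data":[],"model":"text-embedding-3-small","usage":{"prompt_tokens":8,"total_tokens":8}}`)
	kind, u, err := extractUsageByObject(body)
	fmt.Println(kind, u.PromptTokens, u.TotalTokens, err)
}
```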

Expected behavior

  • Request: POST /v1/embeddings with a body such as:
    { "model": "text-embedding-3-small", "input": "The food was delicious!" }
    (or input as an array of strings) is recognized as an embeddings request and accepted.
  • Response: Embeddings responses are explicitly recognized for usage and metrics instead of relying on the default branch.
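To illustrate the expected request handling, here is a minimal sketch of a parser that accepts input as either a single string or an array of strings. The embeddingsRequest struct, parseEmbeddingsRequest, and the error messages are hypothetical and not taken from the extension; token-array inputs are deliberately left out.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// embeddingsRequest is a hypothetical normalized form: input is always
// stored as a slice, even when the client sent a single string.
type embeddingsRequest struct {
	Model string
	Input []string
}

// parseEmbeddingsRequest validates a model + input body.
func parseEmbeddingsRequest(body []byte) (*embeddingsRequest, error) {
	var raw struct {
		Model string          `json:"model"`
		Input json.RawMessage `json:"input"`
	}
	if err := json.Unmarshal(body, &raw); err != nil {
		return nil, fmt.Errorf("invalid embeddings request: %w", err)
	}
	if raw.Model == "" {
		return nil, errors.New("invalid embeddings request: model is required")
	}
	// A missing field leaves Input empty; an explicit JSON null is rejected too.
	if len(raw.Input) == 0 || string(raw.Input) == "null" {
		return nil, errors.New("invalid embeddings request: input is required")
	}

	// Try the single-string form first, then the array-of-strings form.
	var single string
	if err := json.Unmarshal(raw.Input, &single); err == nil {
		return &embeddingsRequest{Model: raw.Model, Input: []string{single}}, nil
	}
	var many []string
	if err := json.Unmarshal(raw.Input, &many); err != nil || len(many) == 0 {
		return nil, errors.New("invalid embeddings request: input must be a string or a non-empty array of strings")
	}
	return &embeddingsRequest{Model: raw.Model, Input: many}, nil
}

func main() {
	req, err := parseEmbeddingsRequest([]byte(`{"model":"text-embedding-3-small","input":"The food was delicious!"}`))
	fmt.Println(req, err)
}
```

Normalizing both shapes into a slice keeps downstream code from having to branch on the input type again.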

Metadata

Labels: needs-triage
