[Bug]: Bedrock pass-through not working using Application Inference Profile #18761

@Archalbc


What happened?

Hi!

Environment

  • LiteLLM: v1.80.8-stable.1-patch01
  • Deployment: AWS EKS

config.yaml:

model_list:
  - model_name: claude-sonnet-4
    litellm_params:
      model: bedrock/anthropic.claude-sonnet-4-20250514-v1:0
      model_id: arn:aws:bedrock:eu-west-1:123456789:application-inference-profile/abcdefgh1234
      aws_region_name: eu-west-1
      aws_role_name: arn:aws:iam::123456789:role/bedrock-runtime

Bug

Can't use Bedrock Pass-through API with models using Application Inference Profiles.

Client API call example:
curl -X POST https://my-litellm.com/bedrock/model/claude-sonnet-4/invoke ..........truncated

LiteLLM Log:
Exception Client error '403 Forbidden' for url 'https://bedrock-runtime.eu-west-1.amazonaws.com/model/anthropic.claude-sonnet-4-20250514-v1:0/invoke'

Solution

LiteLLM should use the model_id field value containing the Inference Profile ARN instead of the translated model name (i.e. anthropic.claude-sonnet-4-20250514-v1:0) when building the Bedrock invoke URL.

Workaround: set the ARN in the model field as well in config.yaml (see the tricked configuration further down).
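
To make that concrete, here is a rough sketch of the URL-building behavior the fix implies. This is not LiteLLM's actual code; resolve_bedrock_target is a hypothetical helper, and the params dict mirrors the config above:

import urllib.parse

def resolve_bedrock_target(litellm_params: dict) -> str:
    # Hypothetical helper (not LiteLLM's real implementation):
    # prefer the Application Inference Profile ARN from model_id
    # over the translated foundation-model name.
    model_id = litellm_params.get("model_id", "")
    if model_id.startswith("arn:aws:bedrock:"):
        target = model_id
    else:
        # e.g. "bedrock/anthropic.claude-sonnet-4-20250514-v1:0" -> drop the provider prefix
        target = litellm_params["model"].split("/", 1)[-1]
    # ARNs contain ":" and "/", which must be percent-encoded in the URL path
    return urllib.parse.quote(target, safe="")

params = {
    "model": "bedrock/anthropic.claude-sonnet-4-20250514-v1:0",
    "model_id": "arn:aws:bedrock:eu-west-1:123456789:application-inference-profile/abcdefgh1234",
}
url = f"https://bedrock-runtime.eu-west-1.amazonaws.com/model/{resolve_bedrock_target(params)}/invoke"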

Context

With the current LiteLLM chat/completions API, I was not able to upload a "scanned" PDF (one without a text layer) to use Claude's visual understanding feature. I tried file_data, image_url, and many other syntaxes I found across various documentations, but each time the model answered that the document content is empty.
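
For context, one of those attempts looked roughly like this (a sketch of an OpenAI-style /chat/completions content part using file_data; the exact field names varied between attempts):

# Hypothetical reconstruction of one failed /chat/completions attempt
import base64

with open("./scanned.pdf", "rb") as f:
    encoded_pdf = base64.b64encode(f.read()).decode("utf-8")

payload = {
    "model": "claude-sonnet-4",
    "messages": [
        {
            "role": "user",
            "content": [
                {
                    "type": "file",
                    "file": {
                        "filename": "scanned.pdf",
                        "file_data": f"data:application/pdf;base64,{encoded_pdf}",
                    },
                },
                {"type": "text", "text": "Tell me about this document."},
            ],
        }
    ],
}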

So I decided to use the LiteLLM pass-through feature to call the Bedrock API directly, as I was already able to do with the Python boto3 SDK.
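
For comparison, the direct boto3 call that already works looks roughly like this (a sketch; the ARN is the placeholder from the config above):

import base64
import json

import boto3

# Direct Bedrock Invoke call: modelId takes the Application Inference Profile ARN
client = boto3.client("bedrock-runtime", region_name="eu-west-1")

with open("./scanned.pdf", "rb") as f:
    encoded_pdf = base64.b64encode(f.read()).decode("utf-8")

resp = client.invoke_model(
    modelId="arn:aws:bedrock:eu-west-1:123456789:application-inference-profile/abcdefgh1234",
    contentType="application/json",
    accept="application/json",
    body=json.dumps({
        "anthropic_version": "bedrock-2023-05-31",
        "max_tokens": 1024,
        "messages": [{
            "role": "user",
            "content": [
                {"type": "document",
                 "source": {"type": "base64", "media_type": "application/pdf", "data": encoded_pdf}},
                {"type": "text", "text": "Tell me about this document."},
            ],
        }],
    }),
)
print(json.loads(resp["body"].read()))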

This should allow me to call the Bedrock Invoke API like this:

import base64
import requests
import json

# --- Encode PDF ---
with open("./scanned.pdf", "rb") as f:
    encoded_pdf = base64.b64encode(f.read()).decode("utf-8")

# --- Call the LiteLLM Bedrock pass-through endpoint ---
url = "https://my-litellm.com/bedrock/model/claude-sonnet-4/invoke"

headers = {
    "Content-Type": "application/json",
    "Authorization": "Bearer sk-XXXXXXXXX",
}

# --- Anthropic Invoke API payload: the PDF goes in as a base64 document block ---
payload = {
    "anthropic_version": "bedrock-2023-05-31",
    "messages": [
        {
            "role": "user",
            "content": [
                {
                    "type": "document",
                    "source": {
                        "type": "base64",
                        "media_type": "application/pdf",
                        "data": encoded_pdf
                    },
                },
                {
                    "type": "text",
                    "text": "Tell me about this document."
                }
            ]
        }
    ],
    "max_tokens": 1024
}

resp = requests.post(url, headers=headers, json=payload)
resp.raise_for_status()

data = resp.json()
print(json.dumps(data))

Unfortunately, the answer is sad:

Traceback (most recent call last):
  File "/home/toto/dev/tests/litellm.py", line 41, in <module>
    resp.raise_for_status()
  File "/home/toto/dev/tests/venv/lib/python3.12/site-packages/requests/models.py", line 1026, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 403 Client Error: Forbidden for url: https://my-litellm.com/bedrock/model/claude-sonnet-4/invoke

Here is the associated error in LiteLLM log:

10:22:41 - LiteLLM Router:INFO: router.py:3065 - ageneric_api_call_with_fallbacks(model=claude-sonnet-4) Exception Client error '403 Forbidden' for url 'https://bedrock-runtime.eu-west-1.amazonaws.com/model/anthropic.claude-sonnet-4-20250514-v1:0/invoke'
For more information check: https://developer.mozilla.org/en-US/docs/Web/HTTP/Status/403

We access Bedrock with IAM roles that are allowed to use certain models only through Application Inference Profiles.

We can clearly see that LiteLLM is calling Bedrock with the foundation model ID instead of the Inference Profile ARN.

So I tricked the configuration:

model_list:
  - model_name: claude-sonnet-4
    litellm_params:
      # Set the model field to the same ARN as model_id
      model: arn:aws:bedrock:eu-west-1:123456789:application-inference-profile/abcdefgh1234
      model_id: arn:aws:bedrock:eu-west-1:123456789:application-inference-profile/abcdefgh1234
      aws_region_name: eu-west-1
      aws_role_name: arn:aws:iam::123456789:role/bedrock-runtime

And now it works. Here is the LiteLLM log again:

10:29:27 - LiteLLM:DEBUG: litellm_logging.py:1049

POST Request Sent from LiteLLM:
curl -X POST \
https://bedrock-runtime.eu-west-1.amazonaws.com/model/arn:aws:bedrock:eu-west-1:xxxxxxxxxxxxxxx:application-inference-profile%abcdefgh1234/invoke \
-H 'Content-Type: application/json' -H 'X-Amz-Date: 20260107T102927Z' -H 'X-Amz-Security-Token: IQ****==' -H 'Authorization: AW****8b' \
-d '{'anthropic_version': 'bedrock-2023-05-31', 'messages': [{'role': 'user', 'content': [{'type': 'document', 'source': {'type': 'base64', 'media_type': 'application/pdf', 'data': 'JVBERi0xLj..........
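
With that configuration the response body comes back as a standard Anthropic Messages object, roughly shaped like this (a schematic with placeholder values, not real output from this call):

# Schematic shape of a successful Invoke response body
{
    "id": "msg_...",
    "type": "message",
    "role": "assistant",
    "content": [{"type": "text", "text": "This document appears to be ..."}],
    "stop_reason": "end_turn",
    "usage": {"input_tokens": 0, "output_tokens": 0}
}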

What part of LiteLLM is this about?

Proxy

What LiteLLM version are you on?

v1.80.8-stable.1-patch01
