Description
What happened?
Hi!
Environment
- LiteLLM: v1.80.8-stable.1-patch01
- Deployment: AWS EKS
config.yaml:

model_list:
  - model_name: claude-sonnet-4
    litellm_params:
      model: bedrock/anthropic.claude-sonnet-4-20250514-v1:0
      model_id: arn:aws:bedrock:eu-west-1:123456789:application-inference-profile/abcdefgh1234
      aws_region_name: eu-west-1
      aws_role_name: arn:aws:iam::123456789:role/bedrock-runtime
Bug
Can't use Bedrock Pass-through API with models using Application Inference Profiles.
Client API call example:
curl -X POST https://my-litellm.com/bedrock/model/claude-sonnet-4/invoke ..........truncated
LiteLLM Log:
Exception Client error '403 Forbidden' for url 'https://bedrock-runtime.eu-west-1.amazonaws.com/model/anthropic.claude-sonnet-4-20250514-v1:0/invoke'
Solution
LiteLLM should use the model_id field value (which contains the Inference Profile ARN) instead of the translated model name (i.e. anthropic.claude-sonnet-4-20250514-v1:0).
Workaround: set the ARN in the model field as well in config.yaml.
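Conceptually, the fix could be as simple as preferring model_id when resolving the endpoint model. A hypothetical sketch, not LiteLLM's actual internals:

def resolve_bedrock_endpoint_model(litellm_params: dict, translated_model: str) -> str:
    # Hypothetical helper: prefer the explicit model_id (e.g. an Application
    # Inference Profile ARN) over the model name derived from `model`.
    return litellm_params.get("model_id") or translated_model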
Context
With the current LiteLLM chat/completions API, I was not able to upload a "scanned" PDF (image-only, no text layer) to use Claude's "visual understanding" feature. I tried file_data, image_url, and many other syntaxes that I found across multiple documentations, but each time the model answered that the document content was empty.
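For reference, one of the variants I tried through chat/completions, as far as I can reconstruct it (OpenAI-style file input; the endpoint and key are the same placeholders used elsewhere in this report):

import base64

import requests

with open("./scanned.pdf", "rb") as f:
    encoded_pdf = base64.b64encode(f.read()).decode("utf-8")

# OpenAI-compatible chat/completions call through the LiteLLM proxy
resp = requests.post(
    "https://my-litellm.com/v1/chat/completions",
    headers={"Authorization": "Bearer sk-XXXXXXXXX"},
    json={
        "model": "claude-sonnet-4",
        "messages": [
            {
                "role": "user",
                "content": [
                    {
                        "type": "file",
                        "file": {
                            "filename": "scanned.pdf",
                            "file_data": f"data:application/pdf;base64,{encoded_pdf}",
                        },
                    },
                    {"type": "text", "text": "Tell me about this document."},
                ],
            }
        ],
    },
)
print(resp.json())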
So I decided to use the LiteLLM pass-through feature to call the Bedrock API directly, as I was already able to do with the python/boto3 SDK.
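The equivalent direct boto3 call works fine with the profile ARN as modelId. A minimal sketch, using the same placeholder ARN as above:

import json

import boto3

# Direct Bedrock Runtime call: the Application Inference Profile ARN is
# passed as modelId, so the role only needs rights on the profile.
client = boto3.client("bedrock-runtime", region_name="eu-west-1")
resp = client.invoke_model(
    modelId="arn:aws:bedrock:eu-west-1:123456789:application-inference-profile/abcdefgh1234",
    contentType="application/json",
    body=json.dumps(
        {
            "anthropic_version": "bedrock-2023-05-31",
            "max_tokens": 1024,
            "messages": [{"role": "user", "content": "Hello"}],
        }
    ),
)
print(json.loads(resp["body"].read()))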
This should allow me to call the Bedrock Invoke API like this:

import base64
import requests
import json

# --- Encode PDF ---
with open("./scanned.pdf", "rb") as f:
    encoded_pdf = base64.b64encode(f.read()).decode("utf-8")

base64_url = f"data:application/pdf;base64,{encoded_pdf}"

url = "https://my-litellm.com/bedrock/model/claude-sonnet-4/invoke"
headers = {
    "Content-Type": "application/json",
    "Authorization": "Bearer sk-XXXXXXXXX",
}
payload = {
    "anthropic_version": "bedrock-2023-05-31",
    "messages": [
        {
            "role": "user",
            "content": [
                {
                    "type": "document",
                    "source": {
                        "type": "base64",
                        "media_type": "application/pdf",
                        "data": encoded_pdf,
                    },
                },
                {
                    "type": "text",
                    "text": "Tell me about this document.",
                },
            ],
        }
    ],
    "max_tokens": 1024,
}

resp = requests.post(url, headers=headers, json=payload)
resp.raise_for_status()
data = resp.json()
print(json.dumps(data))

Unfortunately, the answer is sad:
Traceback (most recent call last):
  File "/home/toto/dev/tests/litellm.py", line 41, in <module>
    resp.raise_for_status()
  File "/home/toto/dev/tests/venv/lib/python3.12/site-packages/requests/models.py", line 1026, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 403 Client Error: Forbidden for url: https://my-litellm.com/bedrock/model/claude-sonnet-4/invoke

Here is the associated error in the LiteLLM log:
10:22:41 - LiteLLM Router:INFO: router.py:3065 - ageneric_api_call_with_fallbacks(model=claude-sonnet-4) Exception Client error '403 Forbidden' for url 'https://bedrock-runtime.eu-west-1.amazonaws.com/model/anthropic.claude-sonnet-4-20250514-v1:0/invoke'
For more information check: https://developer.mozilla.org/en-US/docs/Web/HTTP/Status/403
We access Bedrock with IAM roles that are only allowed to use certain models through Application Inference Profiles. We can clearly see that LiteLLM is trying to call Bedrock with the translated model ID instead of the Inference Profile ARN, hence the 403.
So I tricked the configuration:
model_list:
  - model_name: claude-sonnet-4
    litellm_params:
      # Set the model field to the same value as model_id
      model: arn:aws:bedrock:eu-west-1:123456789:application-inference-profile/abcdefgh1234
      model_id: arn:aws:bedrock:eu-west-1:123456789:application-inference-profile/abcdefgh1234
      aws_region_name: eu-west-1
      aws_role_name: arn:aws:iam::123456789:role/bedrock-runtime
And now it works. Here is the LiteLLM log again:
10:29:27 - LiteLLM:DEBUG: litellm_logging.py:1049
POST Request Sent from LiteLLM:
curl -X POST \
https://bedrock-runtime.eu-west-1.amazonaws.com/model/arn:aws:bedrock:eu-west-1:xxxxxxxxxxxxxxx:application-inference-profile%abcdefgh1234/invoke \
-H 'Content-Type: application/json' -H 'X-Amz-Date: 20260107T102927Z' -H 'X-Amz-Security-Token: IQ****==' -H 'Authorization: AW****8b' \
-d '{'anthropic_version': 'bedrock-2023-05-31', 'messages': [{'role': 'user', 'content': [{'type': 'document', 'source': {'type': 'base64', 'media_type': 'application/pdf', 'data': 'JVBERi0xLj..........
What part of LiteLLM is this about?
Proxy
What LiteLLM version are you on ?
v1.80.8-stable.1-patch01
Twitter / LinkedIn details
No response