Description
Check for existing issues
- I have searched the existing issues and checked that my issue is not a duplicate.
The Feature
Summary
Add support for a `{"location": "tool_config"}` option in `cache_control_injection_points` to enable automatic injection of Bedrock Converse API `cachePoint` markers into the `toolConfig.tools` array. This would allow users to cache tool definitions alongside system prompts, which is supported by Bedrock but not currently accessible through LiteLLM's auto-injection mechanism.
Motivation
AWS Bedrock's Converse API supports prompt caching on three fields for Claude models: `system`, `messages`, and `tools` (docs). The current `cache_control_injection_points` feature only supports injecting `cache_control` into messages (by role or index). There is no way to inject a `cachePoint` into the `toolConfig` section of the request.
For agentic applications with many tools (e.g., 5+ MCP servers with 20+ tool definitions), the tool definitions represent a significant portion of the input tokens that are static across requests. Being able to cache these alongside the system prompt would provide substantial cost and latency savings.
Current Behavior
```python
response = completion(
    model="bedrock/us.anthropic.claude-3-7-sonnet-20250219-v1:0",
    messages=[...],
    tools=[...],  # 20+ tool definitions
    cache_control_injection_points=[
        {"location": "message", "role": "system"},
        # No way to target tools
    ],
)
```

LiteLLM injects `cache_control` into the system message, but the tool definitions in `toolConfig.tools` are sent without any `cachePoint`, so they are reprocessed on every request.
Proposed Behavior
```python
response = completion(
    model="bedrock/us.anthropic.claude-3-7-sonnet-20250219-v1:0",
    messages=[...],
    tools=[...],
    cache_control_injection_points=[
        {"location": "message", "role": "system"},
        {"location": "tool_config"},  # NEW: inject cachePoint after last tool
    ],
)
```

LiteLLM would append a `cachePoint` entry to the end of the `toolConfig.tools` array in the Bedrock Converse API request:
```json
{
  "toolConfig": {
    "tools": [
      {"toolSpec": {"name": "tool_1", ...}},
      {"toolSpec": {"name": "tool_2", ...}},
      {"cachePoint": {"type": "default"}}
    ]
  }
}
```

This follows the same pattern Bedrock uses for `system` and message cache checkpoints, as documented in the Bedrock prompt caching guide.
Supported Models
Per the Bedrock docs, the following models support `tools` as a cache checkpoint field:
| Model | Tools Caching |
|---|---|
| Claude 3.7 Sonnet | Yes |
| Claude 3.5 Sonnet v2 | Yes |
| Claude 3.5 Haiku | Yes |
| Claude Sonnet 4 | Yes |
| Claude Opus 4 | Yes |
| Amazon Nova models | No (only system and messages) |
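Since support varies by model family, the injection would presumably need a gate. A minimal sketch of such a check, assuming a hypothetical helper name and an illustrative allow-list derived from the table above (LiteLLM's actual implementation would more likely consult its model-info registry):

```python
# Illustrative allow-list of Bedrock base-model ID prefixes that support
# a cachePoint in toolConfig.tools (per the table above).
TOOLS_CACHE_SUPPORTED_PREFIXES = (
    "anthropic.claude-3-7-sonnet",
    "anthropic.claude-3-5-sonnet-20241022",  # 3.5 Sonnet v2
    "anthropic.claude-3-5-haiku",
    "anthropic.claude-sonnet-4",
    "anthropic.claude-opus-4",
)


def supports_tools_cache(model_id: str) -> bool:
    """Return True if the Bedrock model supports tool-definition caching.

    Hypothetical helper: strips the "bedrock/" provider prefix and an
    optional cross-region inference prefix ("us.", "eu.", "apac.") before
    matching against the allow-list.
    """
    base = model_id.split("bedrock/")[-1]
    for region_prefix in ("us.", "eu.", "apac."):
        if base.startswith(region_prefix):
            base = base[len(region_prefix):]
            break
    return base.startswith(TOOLS_CACHE_SUPPORTED_PREFIXES)
```

Amazon Nova models would fall through to `False`, so a `{"location": "tool_config"}` injection point could be silently skipped (or raise) for them.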
Implementation Notes
The injection logic would be similar to the existing message injection in `litellm/litellm_core_utils/prompt_templates/`:
- When `{"location": "tool_config"}` is present in `cache_control_injection_points`
- And the request includes a non-empty `tools` list
- After LiteLLM translates OpenAI-format tools to Bedrock's `toolConfig.tools` format
- Append `{"cachePoint": {"type": "default"}}` as the last entry in the `tools` array
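The steps above could be sketched as a small post-translation helper. This is a minimal sketch with a hypothetical function name, not LiteLLM's actual internal API; it operates on a dict already in Bedrock Converse `toolConfig` shape:

```python
def inject_tool_config_cache_point(tool_config: dict) -> dict:
    """Append a Bedrock cachePoint marker after the last tool definition.

    `tool_config` is assumed to be in Bedrock Converse shape:
    {"tools": [{"toolSpec": {...}}, ...]}.
    """
    tools = tool_config.get("tools")
    if not tools:
        # No tools to cache; leave the request untouched.
        return tool_config
    if any("cachePoint" in entry for entry in tools):
        # Already injected (or user supplied one manually); stay idempotent.
        return tool_config
    tools.append({"cachePoint": {"type": "default"}})
    return tool_config
```

Making the helper idempotent guards against double-injection if both the user's raw request and the injection-point config add a marker.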
For the Anthropic direct API (non-Bedrock), the equivalent would be adding `"cache_control": {"type": "ephemeral"}` to the last tool definition, following Anthropic's API spec.
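The Anthropic-direct variant differs only in shape: instead of appending a separate array entry, the marker goes on the last tool object itself. A sketch with a hypothetical helper name:

```python
def inject_anthropic_tool_cache_control(tools: list) -> list:
    """Mark the last Anthropic-format tool as a cache breakpoint.

    Anthropic's API places cache_control on the tool object itself,
    rather than as a standalone entry like Bedrock's cachePoint.
    """
    if tools:
        tools[-1]["cache_control"] = {"type": "ephemeral"}
    return tools
```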
Use Case
We run an agentic application (OpenAI Agents SDK + LiteLLM + Bedrock) with 5 MCP servers providing 20+ tools. The system prompt is ~4K tokens and tool definitions add another ~3K tokens. Currently we can only cache the system prompt via `cache_control_injection_points`. Being able to cache tools would roughly double the cached prefix size, further reducing per-request cost and latency.
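To make the savings concrete, here is a rough back-of-envelope calculation. The rates are assumptions (Claude 3.7 Sonnet on-demand input at $3/MTok, cache reads billed at ~10% of that, steady-state reads only, ignoring the one-time cache-write premium); only the 4K/3K token figures come from our workload above:

```python
BASE_RATE = 3.00 / 1_000_000  # assumed $/input token (Claude 3.7 Sonnet on-demand)
CACHE_READ_DISCOUNT = 0.1     # assumed: cache reads at ~10% of base input rate

SYSTEM_TOKENS = 4_000  # cacheable today via cache_control_injection_points
TOOL_TOKENS = 3_000    # not cacheable today


def prefix_cost(cached_tokens: int, uncached_tokens: int) -> float:
    """Per-request cost of the static prefix under the assumed rates."""
    return (cached_tokens * BASE_RATE * CACHE_READ_DISCOUNT
            + uncached_tokens * BASE_RATE)


today = prefix_cost(SYSTEM_TOKENS, TOOL_TOKENS)       # tools paid at full rate
proposed = prefix_cost(SYSTEM_TOKENS + TOOL_TOKENS, 0)  # whole prefix cached
```

Under these assumptions the static-prefix cost drops by roughly 4-5x per request once the tool definitions are cached too.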
Related Issues
- [Bug]: Cache control injection points for Anthropic/Bedrock #10226 — Cache control injection points for Anthropic/Bedrock (general improvements)
- [Bug: Anthropic Bedrock Converse]: assistant & tool messages dropping cache points #12695 — Assistant & tool messages dropping cache points
Environment
- LiteLLM version: 1.81.9
- Provider: AWS Bedrock (Converse API)
- Model: `us.anthropic.claude-3-7-sonnet-20250219-v1:0`
Motivation, pitch
I'm building an agentic application using LiteLLM with AWS Bedrock (Claude 3.7 Sonnet) that has 20+ tool definitions from multiple MCP servers. I'm using `cache_control_injection_points` with `{"location": "message", "role": "system"}` to cache the system prompt, which works well.
However, Bedrock's Converse API supports prompt caching on `system`, `messages`, AND `tools` for Claude models (AWS docs). There's currently no way to use `cache_control_injection_points` to inject a `cachePoint` into the `toolConfig.tools` array.
For tool-heavy agentic workloads, the tool definitions are a significant chunk of static input tokens that get reprocessed on every request. Being able to cache them alongside the system prompt would roughly double the cached prefix, further reducing cost and latency.
Related: #10226 (closed — addressed negative index support, not tool caching)
What part of LiteLLM is this about?
SDK (litellm Python package)
LiteLLM is hiring a founding backend engineer, are you interested in joining us and shipping to all our users?
No
Twitter / LinkedIn details
No response