This repository contains materials for the re:Invent 2025 Bedrock Open Weight Model Builder session, focusing on demonstrating the capabilities and advantages of open-weight models on Amazon Bedrock.
Optimize open weight models for low-latency, cost-effective AI apps
Open-weight models deliver exceptional performance while offering customization control. Organizations can process sensitive data locally, deploy models tailored to specific requirements, and scale efficiently at lower latency and cost. However, maximizing these benefits requires strategic decisions—poor choices waste resources and compromise results. This session provides a practical framework for using open-weight models in Amazon Bedrock. Learn to evaluate and select the ideal model for your specific use cases, understand the trade-offs between different models and sizes, and identify deployment patterns that balance cost and latency. We'll demonstrate optimization techniques and architect solutions for real-world workloads, including agentic applications.
- Session ID: AIM311
- Content Level: L300
- Key Messages
  - Low latency across models
  - Cost comparison between different model options
  - Accuracy differences between standard and fine-tuned models
- Anastasia Tzeveleka
- Jeremy Bartosiewicz
- Luca Perrozzi
- Chakra Nagarajan
- Wale Akinfaderin
- Cost per token processed: Economic efficiency of model usage
- Latency: Response time and processing speed, such as time to first token
- Throughput: Number of requests handled per unit time
- Context window size: Maximum input length the model can process
- Integrations: Compatibility with existing systems and workflows
- Ecosystem tools: Supporting libraries, frameworks, and utilities
- Multimodality: Support for text, images, audio, and other data types
- Reasoning ability: Model's capacity for logical thinking and problem-solving
- Accuracy: Correctness of responses and factual information
- Creativity: Ability to generate novel and innovative content
- Language: Quality of language generation and comprehension
- Adaptability: Flexibility to handle diverse tasks and contexts
- Fine-tuning or custom training options: Customization capabilities
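The cost, latency, and throughput criteria above can be made concrete with a small measurement helper. The following is a minimal sketch and is not taken from the session materials: `measure_stream`, `estimate_cost`, and `StreamStats` are hypothetical names, and the per-1,000-token prices are placeholder arguments rather than real Amazon Bedrock pricing. The helper accepts any iterable of text chunks, so it does not depend on a specific SDK; in practice the chunks might come from a streaming model response.

```python
import time
from dataclasses import dataclass
from typing import Callable, Iterable, Optional


@dataclass
class StreamStats:
    time_to_first_token_s: Optional[float]  # None if the stream produced no text
    total_time_s: float                     # wall-clock time for the whole stream
    chunks: int                             # number of non-empty chunks received


def measure_stream(
    chunks: Iterable[str],
    clock: Callable[[], float] = time.perf_counter,
) -> StreamStats:
    """Consume a stream of text chunks and record latency figures.

    `clock` is injectable so the function can be tested deterministically.
    """
    start = clock()
    first: Optional[float] = None
    count = 0
    for chunk in chunks:
        if not chunk:
            continue  # ignore empty keep-alive chunks
        if first is None:
            first = clock() - start  # time to first token
        count += 1
    return StreamStats(first, clock() - start, count)


def estimate_cost(
    input_tokens: int,
    output_tokens: int,
    price_in_per_1k: float,   # placeholder price, not a real rate
    price_out_per_1k: float,  # placeholder price, not a real rate
) -> float:
    """Estimate the cost of one request from token counts and per-1,000-token prices."""
    return (input_tokens / 1000) * price_in_per_1k + (output_tokens / 1000) * price_out_per_1k
```

Running the same prompts through each candidate model and comparing the resulting `StreamStats` and cost estimates side by side gives a rough first view of the latency and cost trade-offs.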
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
pip install -r requirements.txt

Compare APIs and open-weight models (Llama, GPT OSS, Qwen, DeepSeek) to showcase Amazon Bedrock's capabilities.
Files:
Evaluate quality, latency, and accuracy metrics with focus on tool calling and agentic tasks using automated and LLM-as-a-Judge methodology.
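The automated part of a tool-calling evaluation can be illustrated with a simple comparison against expected calls. This is a hedged sketch, not the repository's evaluation code: the `{"name": ..., "arguments": {...}}` shape and the function names `score_tool_call` and `tool_call_accuracy` are assumptions made for illustration, and the LLM-as-a-Judge step is not covered here.

```python
from typing import Any, Dict, List, Tuple

# Assumed shape of a tool call: {"name": str, "arguments": dict}
ToolCall = Dict[str, Any]


def score_tool_call(expected: ToolCall, predicted: ToolCall) -> Dict[str, float]:
    """Score one predicted tool call against the expected one.

    Returns a name-match flag and the fraction of expected arguments
    that the prediction reproduced exactly.
    """
    name_ok = expected["name"] == predicted.get("name")
    exp_args = expected.get("arguments", {})
    # Arguments only count when the right tool was selected.
    pred_args = predicted.get("arguments", {}) if name_ok else {}
    matched = sum(1 for k, v in exp_args.items() if k in pred_args and pred_args[k] == v)
    arg_score = matched / len(exp_args) if exp_args else float(name_ok)
    return {"name_match": float(name_ok), "argument_match": arg_score}


def tool_call_accuracy(pairs: List[Tuple[ToolCall, ToolCall]]) -> float:
    """Fraction of (expected, predicted) pairs with the right tool and all arguments."""
    if not pairs:
        return 0.0
    exact = 0
    for expected, predicted in pairs:
        score = score_tool_call(expected, predicted)
        if score["name_match"] == 1.0 and score["argument_match"] == 1.0:
            exact += 1
    return exact / len(pairs)
```

Exact matching is deliberately strict; free-text arguments usually need a softer comparison, which is where a judge model tends to be more useful.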
Files:
- OpenAI Open Weight Models: Expanded to new regions on Amazon Bedrock
- DeepSeek-V3.1: Now available fully managed in Amazon Bedrock
- Qwen3 Models: Now available fully managed in Amazon Bedrock
- On-demand Deployment: Custom Meta Llama models in Amazon Bedrock