Skip to content

aws-samples/sample-open-weight-models-with-amazon-bedrock

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

8 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

re:Invent 2025 Model Builder session AIM311

Repo Overview

This repository contains materials for the re:Invent 2025 Bedrock Open Weight Model Builder session, focusing on demonstrating the capabilities and advantages of open-weight models on Amazon Bedrock.

Session Title

Optimize open weight models for low-latency, cost-effective AI apps

Session Abstract

Open-weight models deliver exceptional performance while offering customization control. Organizations can process sensitive data locally, deploy models tailored to specific requirements, and scale efficiently at lower latency and cost. However, maximizing these benefits requires strategic decisions—poor choices waste resources and compromise results. This session provides a practical framework for using open-weight models in Amazon Bedrock. Learn to evaluate and select the ideal model for your specific use cases, understand the trade-offs between different models and sizes, and identify deployment patterns that balance cost and latency. We'll demonstrate optimization techniques and architect solutions for real-world workloads, including agentic applications.

Session Details

  • Session ID: AIM311
  • Content Level: L300
  • Key Messages
    • Low latency across models
    • Cost comparison between different model options
    • Accuracy differences between standard and fine-tuned models

Session Speakers

  • Anastasia Tzeveleka
  • Jeremy Bartosiewicz
  • Luca Perrozzi
  • Chakra Nagarajan
  • Wale Akinfaderin

Pillars for LLM Model Evaluation

1. Operational Metrics (coverd by Lab 1 & Lab 2)

  • Cost per token processed: Economic efficiency of model usage
  • Latency: Response time and processing speed, like time to first token
  • Throughput: Number of requests handled per unit time

2. Features & Usability (covered by Lab 1)

  • Context window size: Maximum input length the model can process
  • Integrations: Compatibility with existing systems and workflows
  • Ecosystem tools: Supporting libraries, frameworks, and utilities
  • Multimodality: Support for text, images, audio, and other data types

3. Performance & Quality (covered by Lab 2)

  • Reasoning ability: Model's capacity for logical thinking and problem-solving
  • Accuracy: Correctness of responses and factual information
  • Creativity: Ability to generate novel and innovative content
  • Language: Quality of language generation and comprehension
  • Adaptability: Flexibility to handle diverse tasks and contexts
  • Fine-tuning or custom training options: Customization capabilities

Repo Setup and Flow

Create Virtual Environment

python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

Install Requirements

pip install -r requirements.txt

Lab 1: Model Selection & API Comparison

Compare APIs and open-weight models (Llama, GPT OSS, Qwen, DeepSeek) to showcase Amazon Bedrock's capabilities.

Files:

Lab 2: Performance Evaluation

Evaluate quality, latency, and accuracy metrics with focus on tool calling and agentic tasks using automated and LLM-as-a-Judge methodology.

Files:

Technical Resources

Benchmarking & Evaluation

Recent Announcements (September 2025)

New Model Availability

Deployment Options

About

This repository contains materials for the re:Invent 2025 Bedrock Open Weight Model Builder session AIM311, focusing on demonstrating the capabilities and advantages of open-weight models on Amazon Bedrock.

Resources

License

Code of conduct

Contributing

Security policy

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors