# agent-native-infra

MCP server and agent skills for Yotta Platform — the GPU cloud for AI/ML workloads.

Give any AI agent the ability to provision GPUs, launch pods, deploy models, and manage infrastructure through natural language. Built on the Model Context Protocol (MCP).
| Layer | What it does | Count |
|---|---|---|
| Tools | CRUD operations for VMs, Pods, Serverless endpoints, Volumes, and Registry credentials | 37 |
| Resources | GPU catalog with specs, pricing, and availability | 2 |
| Prompts | Guided workflows for GPU selection, pod launch, and model serving | 3 |
| Skills | Agent skill definitions for Claude Code and compatible agents | 3 |
## Prerequisites

- Node.js >= 18
- A Yotta Platform API key (get one here)
## Installation

### Claude Desktop

Add to your Claude Desktop config (`~/Library/Application Support/Claude/claude_desktop_config.json`):

```json
{
  "mcpServers": {
    "yotta": {
      "command": "npx",
      "args": ["-y", "@yottascale/agent-native-infra"],
      "env": {
        "YOTTA_API_KEY": "your-api-key"
      }
    }
  }
}
```

### Claude Code

```bash
claude mcp add yotta -- npx -y @yottascale/agent-native-infra
```

Set the API key in your environment:
```bash
export YOTTA_API_KEY=your-api-key
```

### Other MCP clients

The server configuration is the same for any MCP client:

```json
{
  "mcpServers": {
    "yotta": {
      "command": "npx",
      "args": ["-y", "@yottascale/agent-native-infra"],
      "env": {
        "YOTTA_API_KEY": "your-api-key"
      }
    }
  }
}
```

### Run from source

```bash
git clone https://github.com/yottalabsai/agent-native-infra
cd agent-native-infra
npm install
YOTTA_API_KEY=your-api-key npx tsx src/index.ts
```

Or point Claude Desktop / Claude Code at the local build:
```json
{
  "mcpServers": {
    "yotta": {
      "command": "npx",
      "args": ["tsx", "/path/to/agent-native-infra/src/index.ts"],
      "env": { "YOTTA_API_KEY": "your-api-key" }
    }
  }
}
```

### Test with the MCP Inspector

```bash
YOTTA_API_KEY=your-api-key npx @modelcontextprotocol/inspector npx -y @yottascale/agent-native-infra
```

## Tools

### Pods

Interactive GPU instances for development, training, and batch jobs.
| Tool | Description |
|---|---|
| `pod_create` | Create a GPU pod with a Docker image, GPU type/count, ports, and env vars |
| `pod_get` | Get pod details by ID |
| `pod_list` | List pods, optionally filtered by region or status |
| `pod_delete` | Delete a pod (irreversible) |
| `pod_pause` | Pause a running pod (stops billing, preserves state) |
| `pod_resume` | Resume a paused pod |
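Each row above is an MCP tool invoked via a standard `tools/call` request. As a sketch, the JSON-RPC message a client would send to invoke `pod_create` might look like this; the field names inside `arguments` (`image`, `gpuType`, `gpuCount`, `ports`, `env`) are illustrative, not the tool's actual input schema:

```typescript
// Hypothetical tools/call payload for pod_create. The argument names are
// placeholders -- consult the tool's input schema for the real fields.
const callPodCreate = {
  jsonrpc: "2.0",
  id: 1,
  method: "tools/call",
  params: {
    name: "pod_create",
    arguments: {
      image: "pytorch/pytorch:2.4.0-cuda12.4-cudnn9-runtime",
      gpuType: "H100",
      gpuCount: 1,
      ports: [8888],
      env: { JUPYTER_TOKEN: "change-me" },
    },
  },
};

console.log(JSON.stringify(callPodCreate, null, 2));
```

In practice an agent framework builds this envelope for you; the agent only supplies the tool name and arguments.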
### Serverless

Elastic (serverless) GPU endpoints for production inference.

| Tool | Description |
|---|---|
| `serverless_create` | Create a serverless endpoint (ALB, QUEUE, or CUSTOM mode) |
| `serverless_get` | Get endpoint details by ID |
| `serverless_list` | List all serverless endpoints, optionally filtered by status |
| `serverless_update` | Update endpoint configuration |
| `serverless_delete` | Delete an endpoint (irreversible) |
| `serverless_stop` | Stop a running endpoint |
| `serverless_start` | Start a stopped endpoint |
| `serverless_scale` | Scale worker count up or down |
| `serverless_list_workers` | List workers for an endpoint |
| `serverless_list_tasks` | List tasks for a QUEUE-mode endpoint |
| `serverless_task_count` | Get task status counts |
| `serverless_submit_task` | Submit a task to a QUEUE-mode endpoint |
| `serverless_get_task` | Get details of a specific task by ID |
| `serverless_worker_logs` | Get logs from a specific worker |
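For QUEUE-mode endpoints, the typical agent loop is: `serverless_submit_task`, then poll `serverless_get_task` until the task reaches a terminal state. A minimal polling helper might look like the sketch below; the status strings are assumptions for illustration, and `getTask` is injected so any MCP client wrapper can plug in:

```typescript
// Poll a QUEUE-mode task until it finishes. Status values are illustrative;
// the real ones come from the API's task object.
type TaskStatus = "PENDING" | "RUNNING" | "SUCCEEDED" | "FAILED";

async function waitForTask(
  getTask: (taskId: string) => Promise<{ status: TaskStatus }>,
  taskId: string,
  intervalMs = 2000,
  timeoutMs = 600_000,
): Promise<TaskStatus> {
  const deadline = Date.now() + timeoutMs;
  for (;;) {
    const { status } = await getTask(taskId);
    if (status === "SUCCEEDED" || status === "FAILED") return status;
    if (Date.now() >= deadline) {
      throw new Error(`task ${taskId} still ${status} after ${timeoutMs} ms`);
    }
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
}
```

Injecting `getTask` keeps the helper independent of any particular MCP client library.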
### VMs

Full GPU virtual machines.

| Tool | Description |
|---|---|
| `vm_create` | Create a GPU VM (on-demand or spot) |
| `vm_get` | Get VM details by ID |
| `vm_list` | List VMs (paginated) |
| `vm_types` | List available VM/GPU types with region availability |
| `vm_rename` | Rename a VM |
| `vm_terminate` | Terminate a VM (irreversible) |
### Volumes

Persistent and object storage for pods and VMs.

| Tool | Description |
|---|---|
| `volume_create` | Create a storage volume (S3, R2, CEPH, VENDOR) |
| `volume_list` | List volumes by storage type (paginated) |
| `volume_get` | Get volume details by ID |
| `volume_delete` | Delete a volume (must be unmounted) |
| `volume_rename` | Rename a volume |
| `volume_resize` | Resize a CEPH or VENDOR volume |
### Registry

Manage credentials for pulling private Docker images.

| Tool | Description |
|---|---|
| `registry_list` | List all registry credentials |
| `registry_get` | Get a credential by ID |
| `registry_create` | Create a new credential |
| `registry_update` | Update a credential |
| `registry_delete` | Delete a credential |
## Resources

| URI | Description |
|---|---|
| `yotta://gpus` | Full GPU catalog (all types with VRAM, pricing, regions) |
| `yotta://gpus/{gpuType}` | Individual GPU type details |
### Supported GPU types

| GPU | VRAM |
|---|---|
| NVIDIA RTX 4090 | 24 GB |
| NVIDIA RTX 5090 | 32 GB |
| NVIDIA A100 | 80 GB |
| NVIDIA H100 | 80 GB |
| NVIDIA H200 | 141 GB |
| NVIDIA B200 | 192 GB |
| NVIDIA B300 | 288 GB |
| NVIDIA RTX PRO 6000 | 96 GB |
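Agents read the catalog through the standard MCP `resources/read` request. A small sketch of building a request from the `yotta://gpus/{gpuType}` template (whether the type segment needs URL-encoding is an assumption):

```typescript
// Fill the yotta://gpus/{gpuType} template and wrap it in a standard MCP
// resources/read request.
function gpuResourceUri(gpuType?: string): string {
  return gpuType ? `yotta://gpus/${encodeURIComponent(gpuType)}` : "yotta://gpus";
}

const readGpuCatalog = {
  jsonrpc: "2.0",
  id: 2,
  method: "resources/read",
  params: { uri: gpuResourceUri() },       // full catalog
};

const readH200 = {
  jsonrpc: "2.0",
  id: 3,
  method: "resources/read",
  params: { uri: gpuResourceUri("H200") }, // single GPU type
};
```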
## Prompts

### gpu-selector

Interactive GPU recommendation based on model size, task type, budget, and quantization. Estimates VRAM requirements and suggests optimal configurations.

```text
Task: fine-tuning | Model: Llama-3-70B | Budget: medium | Quantization: int4
→ Recommends H100 80GB x1 with QLoRA
```
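The back-of-envelope VRAM math behind a recommendation like that can be sketched as follows; the overhead multipliers here are rough assumptions, not the prompt's actual heuristics:

```typescript
// Weights need params * bytes-per-param; inference adds KV-cache/activation
// overhead (~1.2x here), full fine-tuning adds optimizer state and gradients
// (~4x here). Both multipliers are illustrative.
function estimateVramGb(
  paramsBillions: number,
  bitsPerParam: 16 | 8 | 4,
  task: "inference" | "finetune",
): number {
  const weightsGb = (paramsBillions * bitsPerParam) / 8; // 1B params at 8 bits = 1 GB
  const overhead = task === "inference" ? 1.2 : 4.0;
  return Math.round(weightsGb * overhead);
}

// Llama-3-70B at int4: 70 * 4 / 8 = 35 GB of weights, ~42 GB to serve --
// comfortably inside one 80 GB H100.
console.log(estimateVramGb(70, 4, "inference")); // 42
```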
### launch-pod

Configure and launch a GPU pod from preset templates:

- `pytorch` — General deep learning (training, fine-tuning, research)
- `unsloth` — Fast LoRA/QLoRA fine-tuning (2-5x speedup)
- `skyrl` — Reinforcement learning (RLHF, PPO, GRPO)
- `comfyui` — Image generation (Stable Diffusion, SDXL, Flux)
### serve-model

Deploy a model for inference. Supports multiple serving frameworks (vLLM, TGI, Triton) and deployment modes:

| Mode | Description |
|---|---|
| POD | Single GPU instance via `pod_create` — good for dev/testing |
| ALB | HTTP load balancer via `serverless_create` — real-time inference at scale |
| QUEUE | Async job queue — batch/long-running jobs |
| CUSTOM | Raw container — gRPC or custom protocols |
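The table reduces to a simple decision rule, which an agent could encode roughly like this (the use-case labels are illustrative):

```typescript
// Map a rough use case to a serve-model deployment mode, following the
// table above.
type ServeMode = "POD" | "ALB" | "QUEUE" | "CUSTOM";

function pickServeMode(useCase: "dev" | "realtime" | "batch" | "grpc"): ServeMode {
  switch (useCase) {
    case "dev":      return "POD";    // single GPU instance for testing
    case "realtime": return "ALB";    // HTTP load balancer at scale
    case "batch":    return "QUEUE";  // async job queue
    case "grpc":     return "CUSTOM"; // raw container, custom protocols
  }
}
```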
## Skills

The `skills/yotta-agent-skills/SKILL.md` file provides structured knowledge for AI agents, including:
- VRAM estimation heuristics for sizing GPUs to models
- Template-to-image mapping for quick pod launches
- Serving framework selection guidance
- Step-by-step configuration workflows
Compatible with Claude Code and any agent framework that supports skill files.
## Configuration

| Environment Variable | Required | Default | Description |
|---|---|---|---|
| `YOTTA_API_KEY` | Yes | — | Yotta Platform API key |
| `YOTTA_API_BASE_URL` | No | `https://api.yottalabs.ai` | API base URL |
## Development

```bash
npm run dev          # Watch mode with hot reload
npm test             # Run tests
npm run test:watch   # Watch mode tests
npm run lint         # Type check
npm run build        # Compile TypeScript
```

### Project structure

```text
src/
├── index.ts              # Server entry point
├── config.ts             # Environment configuration
├── api/
│   ├── client.ts         # HTTP client for Yotta V2 API
│   └── types.ts          # TypeScript interfaces
├── tools/
│   ├── vms.ts            # VM tools (6)
│   ├── pods.ts           # Pod tools (6)
│   ├── serverless.ts     # Serverless tools (14)
│   ├── volumes.ts        # Volume tools (6)
│   └── registry.ts       # Registry tools (5)
├── resources/
│   ├── index.ts          # GPU catalog resources
│   └── gpus.json         # GPU type definitions
└── prompts/
    ├── gpu-selector.ts   # GPU recommendation prompt
    ├── launch-pod.ts     # Pod launch prompt
    └── serve-model.ts    # Model serving prompt
skills/
└── yotta-agent-skills/
    └── SKILL.md          # Agent skill definitions
```
## License

MIT — see LICENSE.