@yottascale/agent-native-infra

MCP server and agent skills for Yotta Platform — the GPU cloud for AI/ML workloads.

Give any AI agent the ability to provision GPUs, launch pods, deploy models, and manage infrastructure through natural language. Built on the Model Context Protocol (MCP).


What's included

| Layer | What it does | Count |
|---|---|---|
| Tools | CRUD operations for VMs, Pods, Serverless endpoints, Volumes, and Registry credentials | 37 |
| Resources | GPU catalog with specs, pricing, and availability | 2 |
| Prompts | Guided workflows for GPU selection, pod launch, and model serving | 3 |
| Skills | Agent skill definitions for Claude Code and compatible agents | 3 |

Quick start

Prerequisites

  • Node.js (the server runs via npx)
  • A Yotta Platform API key, exposed as YOTTA_API_KEY

Use with Claude Desktop

Add to your Claude Desktop config (~/Library/Application Support/Claude/claude_desktop_config.json):

{
  "mcpServers": {
    "yotta": {
      "command": "npx",
      "args": ["-y", "@yottascale/agent-native-infra"],
      "env": {
        "YOTTA_API_KEY": "your-api-key"
      }
    }
  }
}

Use with Claude Code

claude mcp add yotta -- npx -y @yottascale/agent-native-infra

Set the API key in your environment:

export YOTTA_API_KEY=your-api-key

Use with Cursor, Windsurf, or any MCP-compatible client

{
  "mcpServers": {
    "yotta": {
      "command": "npx",
      "args": ["-y", "@yottascale/agent-native-infra"],
      "env": {
        "YOTTA_API_KEY": "your-api-key"
      }
    }
  }
}

Run locally (from source)

git clone https://github.com/yottalabsai/agent-native-infra
cd agent-native-infra
npm install
YOTTA_API_KEY=your-api-key npx tsx src/index.ts

Or point Claude Desktop / Claude Code at the local build:

{
  "mcpServers": {
    "yotta": {
      "command": "npx",
      "args": ["tsx", "/path/to/agent-native-infra/src/index.ts"],
      "env": { "YOTTA_API_KEY": "your-api-key" }
    }
  }
}

Test with MCP Inspector

YOTTA_API_KEY=your-api-key npx @modelcontextprotocol/inspector npx -y @yottascale/agent-native-infra

Tools

Pods

Interactive GPU instances for development, training, and batch jobs.

| Tool | Description |
|---|---|
| pod_create | Create a GPU pod with a Docker image, GPU type/count, ports, and env vars |
| pod_get | Get pod details by ID |
| pod_list | List pods, optionally filtered by region or status |
| pod_delete | Delete a pod (irreversible) |
| pod_pause | Pause a running pod (stops billing, preserves state) |
| pod_resume | Resume a paused pod |
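
For example, an agent launching a development pod might call pod_create with arguments along these lines. The field names and values here are illustrative assumptions, not the tool's actual input schema:

```json
{
  "name": "finetune-dev",
  "image": "pytorch/pytorch:latest",
  "gpuType": "H100",
  "gpuCount": 1,
  "ports": [8888, 22],
  "env": { "HF_TOKEN": "your-token" }
}
```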

Serverless

Elastic (serverless) GPU endpoints for production inference.

| Tool | Description |
|---|---|
| serverless_create | Create a serverless endpoint (ALB, QUEUE, or CUSTOM mode) |
| serverless_get | Get endpoint details by ID |
| serverless_list | List all serverless endpoints, optionally filtered by status |
| serverless_update | Update endpoint configuration |
| serverless_delete | Delete an endpoint (irreversible) |
| serverless_stop | Stop a running endpoint |
| serverless_start | Start a stopped endpoint |
| serverless_scale | Scale worker count up or down |
| serverless_list_workers | List workers for an endpoint |
| serverless_list_tasks | List tasks for a QUEUE-mode endpoint |
| serverless_task_count | Get task status counts |
| serverless_submit_task | Submit a task to a QUEUE-mode endpoint |
| serverless_get_task | Get details of a specific task by ID |
| serverless_worker_logs | Get logs from a specific worker |
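
As a sketch of the QUEUE-mode flow, an agent could submit work with serverless_submit_task using a payload like the following. The endpoint ID and input fields are hypothetical; the real schema may differ:

```json
{
  "endpointId": "ep-123abc",
  "input": {
    "prompt": "Summarize this document.",
    "max_tokens": 512
  }
}
```

It can then poll serverless_get_task or serverless_task_count to track completion.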

Virtual Machines

Full virtual machines with dedicated GPUs, available on-demand or as spot instances.

| Tool | Description |
|---|---|
| vm_create | Create a GPU VM (on-demand or spot) |
| vm_get | Get VM details by ID |
| vm_list | List VMs (paginated) |
| vm_types | List available VM/GPU types with region availability |
| vm_rename | Rename a VM |
| vm_terminate | Terminate a VM (irreversible) |

Volumes

Persistent and object storage for pods and VMs.

| Tool | Description |
|---|---|
| volume_create | Create a storage volume (S3, R2, CEPH, VENDOR) |
| volume_list | List volumes by storage type (paginated) |
| volume_get | Get volume details by ID |
| volume_delete | Delete a volume (must be unmounted) |
| volume_rename | Rename a volume |
| volume_resize | Resize a CEPH or VENDOR volume |
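
For instance, a persistent dataset volume might be created with volume_create arguments such as these. The name, size field, and units are illustrative assumptions only:

```json
{
  "name": "training-datasets",
  "storageType": "CEPH",
  "sizeGb": 500
}
```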

Container Registry

Manage credentials for pulling private Docker images.

| Tool | Description |
|---|---|
| registry_list | List all registry credentials |
| registry_get | Get a credential by ID |
| registry_create | Create a new credential |
| registry_update | Update a credential |
| registry_delete | Delete a credential |

Resources

| URI | Description |
|---|---|
| yotta://gpus | Full GPU catalog (all types with VRAM, pricing, regions) |
| yotta://gpus/{gpuType} | Individual GPU type details |

Available GPUs

| GPU | VRAM |
|---|---|
| NVIDIA RTX 4090 | 24 GB |
| NVIDIA RTX 5090 | 32 GB |
| NVIDIA A100 | 80 GB |
| NVIDIA H100 | 80 GB |
| NVIDIA H200 | 141 GB |
| NVIDIA B200 | 192 GB |
| NVIDIA B300 | 288 GB |
| NVIDIA RTX PRO 6000 | 96 GB |

Prompts

gpu-selector

Interactive GPU recommendation based on model size, task type, budget, and quantization. Estimates VRAM requirements and suggests optimal configurations.

Task: fine-tuning | Model: Llama-3-70B | Budget: medium | Quantization: int4
→ Recommends H100 80GB x1 with QLoRA

launch-pod

Configure and launch a GPU pod from preset templates:

  • pytorch — General deep learning (training, fine-tuning, research)
  • unsloth — Fast LoRA/QLoRA fine-tuning (2-5x speedup)
  • skyrl — Reinforcement learning (RLHF, PPO, GRPO)
  • comfyui — Image generation (Stable Diffusion, SDXL, Flux)

serve-model

Deploy a model for inference. Supports multiple serving frameworks (vLLM, TGI, Triton) and deployment modes:

| Mode | Description |
|---|---|
| POD | Single GPU instance via pod_create — good for dev/testing |
| ALB | HTTP load balancer via serverless_create — real-time inference at scale |
| QUEUE | Async job queue — batch/long-running jobs |
| CUSTOM | Raw container — gRPC or custom protocols |

Agent Skills

The skills/yotta-agent-skills/SKILL.md file provides structured knowledge for AI agents, including:

  • VRAM estimation heuristics for sizing GPUs to models
  • Template-to-image mapping for quick pod launches
  • Serving framework selection guidance
  • Step-by-step configuration workflows

Compatible with Claude Code and any agent framework that supports skill files.
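
As a flavor of those heuristics, a minimal VRAM estimate can be computed as parameter count × bytes per parameter, plus headroom for KV cache and activations. The sketch below assumes a flat 20% overhead factor; it is not the exact formula from SKILL.md:

```typescript
// Rough inference VRAM heuristic: weight bytes plus ~20% headroom for
// KV cache and activations. A sketch, not the formula from SKILL.md.
const BYTES_PER_PARAM: Record<string, number> = {
  fp16: 2,
  int8: 1,
  int4: 0.5,
};

function estimateVramGb(paramsBillions: number, quant: string): number {
  const weightsGb = paramsBillions * (BYTES_PER_PARAM[quant] ?? 2);
  return weightsGb * 1.2; // 20% headroom (assumed overhead factor)
}

// Llama-3-70B at int4 -> ~42 GB, i.e. fits a single H100 80GB
console.log(estimateVramGb(70, "int4"));
```

This matches the gpu-selector example above: a 70B model at int4 lands comfortably on one H100 80GB.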

Configuration

| Environment Variable | Required | Default | Description |
|---|---|---|---|
| YOTTA_API_KEY | Yes | — | Yotta Platform API key |
| YOTTA_API_BASE_URL | No | https://api.yottalabs.ai | API base URL |

Development

npm run dev          # Watch mode with hot reload
npm test             # Run tests
npm run test:watch   # Watch mode tests
npm run lint         # Type check
npm run build        # Compile TypeScript

Project structure

src/
├── index.ts              # Server entry point
├── config.ts             # Environment configuration
├── api/
│   ├── client.ts         # HTTP client for Yotta V2 API
│   └── types.ts          # TypeScript interfaces
├── tools/
│   ├── vms.ts            # VM tools (6)
│   ├── pods.ts           # Pod tools (6)
│   ├── serverless.ts     # Serverless tools (14)
│   ├── volumes.ts        # Volume tools (6)
│   └── registry.ts       # Registry tools (5)
├── resources/
│   ├── index.ts          # GPU catalog resources
│   └── gpus.json         # GPU type definitions
└── prompts/
    ├── gpu-selector.ts   # GPU recommendation prompt
    ├── launch-pod.ts     # Pod launch prompt
    └── serve-model.ts    # Model serving prompt
skills/
└── yotta-agent-skills/
    └── SKILL.md          # Agent skill definitions

License

MIT — see LICENSE.
