Add olive mcp server #2353

Open
xiaoyu-work wants to merge 7 commits into main from xiaoyu/mcp

Conversation

@xiaoyu-work (Collaborator)

Describe your changes

Add olive mcp server

Checklist before requesting a review

  • Add unit tests for this change.
  • Make sure all tests can pass.
  • Update documents if necessary.
  • Lint and apply fixes to your code by running lintrunner -a
  • Is this a user-facing change? If yes, give a description of this change to be included in the release notes.

(Optional) Issue link

@devang-ml devang-ml requested review from jambayk and shaahji March 18, 2026 05:30
Comment on lines +105 to +110
seen = set()
deduped = []
for pkg in extra_packages:
    if pkg not in seen:
        seen.add(pkg)
        deduped.append(pkg)
Collaborator:

Does the order matter? If not, this could be simplified to the following:

extra_packages = list(set(extra_packages))

Collaborator (Author):

Done!
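If insertion order does matter (pip installs can be sensitive to the order extra packages are listed), a middle ground between the original loop and `list(set(...))` is the `dict.fromkeys` idiom, which dedupes in one line while keeping the first occurrence of each package. A minimal sketch, with an illustrative helper name:

```python
def dedupe_preserve_order(items: list[str]) -> list[str]:
    # dict preserves insertion order (guaranteed since Python 3.7), so this
    # drops later duplicates while keeping the first occurrence of each item
    return list(dict.fromkeys(items))

print(dedupe_preserve_order(["onnx", "torch", "onnx", "datasets", "torch"]))
# ['onnx', 'torch', 'datasets']
```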

"NvTensorRTRTXExecutionProvider",
]

SUPPORTED_PRECISIONS = [
Collaborator:

Use StrEnum?

Collaborator (Author):

Done!

# Constants
# ---------------------------------------------------------------------------

SUPPORTED_PROVIDERS = [
Collaborator:

Use StrEnum?

Comment on lines +22 to +30
CMD_OPTIMIZE = "optimize"
CMD_QUANTIZE = "quantize"
CMD_FINETUNE = "finetune"
CMD_CAPTURE_ONNX_GRAPH = "capture_onnx_graph"
CMD_BENCHMARK = "benchmark"
CMD_DIFFUSION_LORA = "diffusion_lora"
CMD_EXPLORE_PASSES = "explore_passes"
CMD_VALIDATE_CONFIG = "validate_config"
CMD_RUN_CONFIG = "run_config"
Collaborator:

Combine into a named StrEnum?

Collaborator (Author):

Done!
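For reference, the combined enum the reviewer asked for could look like the sketch below. The class and member names are assumptions based on the CMD_* constants above; a `str` mixin is used so it also runs before Python 3.11, where `enum.StrEnum` became available:

```python
from enum import Enum

class Command(str, Enum):  # enum.StrEnum behaves the same on Python 3.11+
    OPTIMIZE = "optimize"
    QUANTIZE = "quantize"
    FINETUNE = "finetune"
    CAPTURE_ONNX_GRAPH = "capture_onnx_graph"
    BENCHMARK = "benchmark"
    DIFFUSION_LORA = "diffusion_lora"
    EXPLORE_PASSES = "explore_passes"
    VALIDATE_CONFIG = "validate_config"
    RUN_CONFIG = "run_config"

# Members compare equal to their plain-string values, so existing call
# sites that pass "optimize" etc. keep working unchanged:
print(Command.OPTIMIZE == "optimize")  # True
```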

Comment on lines +157 to +170
while True:
    try:
        line = await proc.stderr.readline()
    except ValueError:
        # Line exceeded even the 10MB limit — skip it
        continue
    if not line:
        break
    decoded = line.decode("utf-8", errors="replace").rstrip()
    if decoded:
        # Truncate extremely long lines for display (e.g. base64 blobs)
        if len(decoded) > 500:
            decoded = decoded[:500] + "... (truncated)"
        _job_log(job_id, decoded)
Collaborator:

This loop will block indefinitely if nothing is written to stderr. Also, an empty line will break out of the loop, which isn't the intended behavior. Check explicitly for None.

if line is None: break

Collaborator (Author):

asyncio.StreamReader.readline() returns b"" at EOF and b"\n" for empty lines — so if not line: break will correctly exit only on EOF, never on empty lines.
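The author's claim can be checked directly with a standalone StreamReader. This minimal sketch feeds a blank line in the middle of the stream and shows the loop only exits at EOF:

```python
import asyncio

async def demo() -> list[bytes]:
    reader = asyncio.StreamReader()
    reader.feed_data(b"first\n\nlast\n")  # note the empty line in the middle
    reader.feed_eof()
    lines = []
    while True:
        line = await reader.readline()
        if not line:  # b"" only at EOF; an empty line is b"\n", which is truthy
            break
        lines.append(line)
    return lines

print(asyncio.run(demo()))  # [b'first\n', b'\n', b'last\n']
```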

stdout=asyncio.subprocess.PIPE,
stderr=asyncio.subprocess.PIPE,
env=env,
limit=10 * 1024 * 1024, # 10 MB line limit (default 64KB is too small for olive output)
Collaborator:

You could start a worker thread to read proc.stdout and not be limited by the size. That approach would also give the user a live progress update rather than waiting until the process completes.

Collaborator (Author):

stderr is already streamed live via await proc.stderr.readline(); each line is logged immediately as it arrives, providing real-time progress. stdout is read at the end intentionally because it contains the final JSON result. The limit parameter only controls the per-line buffer size.
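A minimal sketch of the pattern being described here: stderr is pumped line by line while stdout is collected whole at the end. The function name is illustrative, not the PR's actual code:

```python
import asyncio
import sys

async def run_and_stream(cmd: list[str]) -> bytes:
    proc = await asyncio.create_subprocess_exec(
        *cmd,
        stdout=asyncio.subprocess.PIPE,
        stderr=asyncio.subprocess.PIPE,
        limit=10 * 1024 * 1024,  # per-line buffer; the default is 64 KiB
    )

    async def pump_stderr() -> None:
        # Streams progress lines as they arrive
        while True:
            line = await proc.stderr.readline()
            if not line:  # b"" only at EOF
                break
            print(line.decode("utf-8", errors="replace").rstrip())

    pump = asyncio.create_task(pump_stderr())
    stdout_data = await proc.stdout.read()  # the final JSON result
    await pump
    await proc.wait()
    return stdout_data

out = asyncio.run(run_and_stream([sys.executable, "-c", "print('{}')"]))
```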

Comment on lines +49 to +62
elif command == CMD_QUANTIZE:
    algorithm = kwargs.get("algorithm", "rtn")
    impl = kwargs.get("implementation", "olive")
    if impl == "bnb":
        extras.add("bnb")
    elif impl == "inc":
        extras.add("inc")
    elif impl == "autogptq" or algorithm == "gptq":
        extra_packages.extend(["auto-gptq", "optimum", "datasets"])
    elif impl == "awq" or algorithm == "awq":
        extra_packages.append("autoawq")
    # Static quantization needs calibration data
    if algorithm != "rtn":
        extra_packages.append("datasets")
Collaborator:

This information is available in olive_config.json. Rather not duplicate it here.

Collaborator (Author):

I'm thinking that in the near future we should split this feature out of the Olive repo, as it is not a part of Olive. Any thoughts?

Collaborator:

Even if this is split out, it's likely to have a dependency on Olive. You could still load the olive_config from the dependency.
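As a sketch of that suggestion, the split-out package could read the config out of the installed dependency via importlib.resources instead of duplicating the mapping. The package and file names here are assumptions, not verified against olive-ai's layout:

```python
import json
from importlib import resources

def load_olive_config() -> dict:
    # Hypothetical: locate olive_config.json inside the installed `olive`
    # package rather than keeping a duplicated copy in the MCP server repo
    cfg = resources.files("olive").joinpath("olive_config.json")
    return json.loads(cfg.read_text(encoding="utf-8"))
```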



@mcp.tool()
async def detect_hardware() -> dict:
Collaborator:

This is all supported by the Python package psutil. Can we just take a dependency on that module rather than duplicating the effort?

Collaborator (Author):

Good call! Updated.

job_log_fn(job_id, f"Reusing cached venv ({key})")

_touch_venv(venv_path)
return python_path
Collaborator:

Might want to add a simple python -m pip list to show the status of the created environment.

Collaborator (Author):

Sure!
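The suggested `pip list` step might look like this minimal sketch; `log_env_packages` and the use of the freeze format are assumptions, not the PR's actual helper:

```python
import subprocess
import sys

def log_env_packages(python_path: str) -> str:
    # Capture `python -m pip list` from the freshly created venv so the job
    # log records exactly which packages were installed
    result = subprocess.run(
        [python_path, "-m", "pip", "list", "--format=freeze"],
        capture_output=True, text=True, check=False,
    )
    return result.stdout

# e.g. against the current interpreter:
print(log_env_packages(sys.executable))
```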

Copilot AI review requested due to automatic review settings March 23, 2026 21:44
Copilot AI (Contributor) left a comment:

Pull request overview

Adds a new Olive MCP (Model Context Protocol) server under mcp/ to expose Olive workflows (optimize/quantize/finetune/benchmark/capture) as MCP tools, running long operations as background jobs in isolated uv-managed environments.

Changes:

  • Introduces the olive_mcp server package (FastMCP server, tools, background job runner, worker process).
  • Implements per-command dependency resolution and cached uv venv creation to avoid ORT variant conflicts.
  • Adds packaging + docs for running the server (pyproject.toml, uv.lock, README, example MCP configs).

Reviewed changes

Copilot reviewed 13 out of 14 changed files in this pull request and generated 5 comments.

Show a summary per file

  • mcp/src/olive_mcp/jobs.py: Background job lifecycle, log streaming, parameter validation, concurrency limiting, error suggestions.
  • mcp/src/olive_mcp/tools.py: MCP tool surface (optimize/quantize/finetune/capture/benchmark/manage outputs) + prompts + job polling.
  • mcp/src/olive_mcp/worker.py: Worker entrypoint executing Olive CLI APIs and serializing workflow output to JSON.
  • mcp/src/olive_mcp/packages.py: Maps commands/options → olive-ai[extras] + extra dependencies for isolated env installs.
  • mcp/src/olive_mcp/venv.py: Cached uv venv creation/reuse + purging of old environments.
  • mcp/src/olive_mcp/server.py: FastMCP server instance + interaction instructions.
  • mcp/src/olive_mcp/constants.py: Shared constants/enums for commands/providers/precisions and dependency mapping.
  • mcp/README.md: Setup and usage documentation for MCP clients.
  • mcp/pyproject.toml: Packaging and olive-mcp script entrypoint.
  • mcp/uv.lock: Locked dependencies for the mcp/ subproject.
  • mcp/.mcp.json.example / mcp/.gitignore / mcp/src/olive_mcp/__init__.py / __main__.py: Project scaffolding and entrypoints.
