
[QAIRT] Implement QAIRT ORT->Genie workflow #2358

Open
qti-kromero wants to merge 15 commits into microsoft:main from CodeLinaro:dev/qti-kromero/ort-genie-workflow


qti-kromero commented Mar 14, 2026

Describe your changes

Implements a complete QAIRT workflow for converting ONNX Runtime models to Genie-compatible format through three new passes:

QairtPreparation: Executes external preparation scripts to quantize and prepare HuggingFace models for QAIRT, with configurable caching and script parameters.

QairtGenAIBuilder: Converts prepared models using QAIRT GenAIBuilder API with support for:

  • CPU and HTP backend targets
  • Device-specific optimizations (VTCM size, HVX threads, extended UDMA)
  • Model configurations (sequence lengths, multi-graph, model splits)

QairtEncapsulation: Wraps QAIRT DLC models in ONNX protobuf format with EPContext nodes, generating genai_config.json for onnxruntime-genai compatibility.

This enables end-to-end optimization of generative AI models for Qualcomm hardware accelerators.

Checklist before requesting a review

  • Add unit tests for this change.
  • Make sure all tests can pass.
  • Update documents if necessary.
  • Lint and apply fixes to your code by running lintrunner -a
  • Is this a user-facing change? If yes, give a description of this change to be included in the release notes.

(Optional) Issue link


@github-advanced-security github-advanced-security bot left a comment


lintrunner found more than 20 potential problems in the proposed changes. Check the Files changed tab for more details.

qti-kromero force-pushed the dev/qti-kromero/ort-genie-workflow branch from 5badb8e to fdaea52 on March 18, 2026 15:41
qti-kromero marked this pull request as ready for review on March 18, 2026 16:57
@qti-kromero

dev testing complete and unit testing added - pending reviewers

@qti-kromero

@jambayk @xiaoyu-work would it be possible to get a reviewer added to this PR?

        return {
            "backend": PassConfigParam(
                type_=str,
                default_value="CPU",

The PR mostly looks good to me, but could you use an enum with the options for backend and log_level, like in

class RotateMode(StrEnumBase):

This gives automatic validation of the allowed values. Thanks!
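
A minimal sketch of what the enum request above could look like. The definition of StrEnumBase is not shown in this thread, so the sketch uses a local stand-in built on the standard library; the class names `QairtBackend` and `QairtLogLevel` and the helper `parse_backend` are hypothetical. The allowed values (CPU, HTP and DEBUG, INFO, WARN, ERROR) are the ones named elsewhere in this PR.

```python
from enum import Enum


class StrEnumStandIn(str, Enum):
    """Local stand-in for StrEnumBase (assumption: a str-valued Enum base)."""

    def __str__(self) -> str:
        return self.value


class QairtBackend(StrEnumStandIn):  # hypothetical name
    CPU = "CPU"
    HTP = "HTP"


class QairtLogLevel(StrEnumStandIn):  # hypothetical name
    DEBUG = "DEBUG"
    INFO = "INFO"
    WARN = "WARN"
    ERROR = "ERROR"


def parse_backend(value: str) -> QairtBackend:
    """Convert a raw config string to the enum; raises ValueError if not allowed."""
    return QairtBackend(value)
```

Because the enum members are also strings, comparisons such as `QairtBackend.HTP == "HTP"` keep working, while an unsupported value like `"GPU"` is rejected at construction time instead of deep inside the pass.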


Copilot AI left a comment


Pull request overview

Adds a new QAIRT pipeline to Olive to support an end-to-end ORT → QAIRT (Genie) workflow, including model preparation, QAIRT GenAIBuilder compilation, and ONNX encapsulation for onnxruntime-genai compatibility.

Changes:

  • Introduces three new QAIRT passes: preparation (external script runner), GenAIBuilder (CPU/HTP backends), and encapsulation (EPContext ONNX wrapper + genai_config.json).
  • Adds QAIRT model handlers and new Framework/ModelFileFormat enums for QAIRT artifacts.
  • Adds a new pytest suite for the QAIRT passes and updates Olive’s pass registry configuration (olive_config.json) with QAIRT entries and extra dependencies.

Reviewed changes

Copilot reviewed 12 out of 12 changed files in this pull request and generated 15 comments.

| File | Description |
| --- | --- |
| olive/passes/qairt/preparation.py | New pass to run external preparation scripts and emit QairtPreparedModelHandler. |
| olive/passes/qairt/gen_ai_builder.py | New pass to compile/build QAIRT artifacts via QAIRT GenAIBuilder for CPU/HTP targets. |
| olive/passes/qairt/encapsulation.py | New pass to export DLC + wrap it into an ONNX EPContext model and generate genai_config.json. |
| olive/passes/qairt/__init__.py | Adds QAIRT passes package. |
| olive/model/handler/qairt.py | Adds QAIRT model handler types (QairtPreparedModelHandler, QairtModelHandler). |
| olive/model/handler/__init__.py | Exposes the new QAIRT model handlers from the handler package. |
| olive/constants.py | Adds QAIRT framework and QAIRT model file formats. |
| olive/olive_config.json | Registers new QAIRT passes and adds QAIRT extra dependency mapping. |
| test/passes/qairt/conftest.py | Adds shared fixtures for mocking QAIRT modules and model handlers. |
| test/passes/qairt/test_preparation.py | Unit tests for QairtPreparation behavior and subprocess streaming. |
| test/passes/qairt/test_gen_ai_builder.py | Unit tests for GenAIBuilder CPU/HTP behavior and validation paths. |
| test/passes/qairt/test_encapsulation.py | Unit tests for encapsulation, DLC discovery, ONNX generation, and genai_config creation. |

        # Can only set target and transformation configurations if the BE is HTP
        if config.backend == qairt.BackendType.HTP.value:
            # Device configs
            gen_ai_builder.set_targets([config.soc_details])

Copilot AI Mar 26, 2026


For HTP, set_targets([config.soc_details]) is called unconditionally, but soc_details defaults to None and validate_config doesn’t enforce it. Passing [None] into the QAIRT API is likely to fail at runtime. Add a validation/guard so HTP requires a non-empty soc_details (or skip set_targets when it’s unset and rely on QAIRT defaults).

Suggested change
-            gen_ai_builder.set_targets([config.soc_details])
+            if config.soc_details:
+                gen_ai_builder.set_targets([config.soc_details])
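
The comment also mentions the alternative of enforcing soc_details during validation rather than skipping the call. A rough sketch of that check, assuming a simplified config object: `BuilderConfig` and `htp_config_is_valid` are hypothetical names used only for illustration and are not the pass's actual config class or validate_config signature.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class BuilderConfig:
    """Hypothetical stand-in for the pass config; only the two relevant fields."""

    backend: str = "CPU"
    soc_details: Optional[str] = None


def htp_config_is_valid(config: BuilderConfig) -> bool:
    """Return False when HTP is selected but soc_details is unset or empty."""
    if config.backend == "HTP" and not config.soc_details:
        return False
    return True
```

Failing early in validation surfaces the problem before any QAIRT call is made; the guard in the suggested change instead falls back to whatever QAIRT does by default.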

Comment on lines +35 to +45
            "log_level": PassConfigParam(
                type_=str,
                default_value=None,
                description="Log level to be used within underlying QAIRT components."
                "Valid values: DEBUG, INFO, WARN, ERROR.",
            ),
            "run_checker": PassConfigParam(
                type_=bool,
                default_value=False,
                description="Runs the onnx checker on the model before it is encapsulated.",
            ),

Copilot AI Mar 26, 2026


The log_level config option is defined but never used in this pass. Either wire it up (e.g., set the same QAIRT_LOG_LEVEL env var used in QairtGenAIBuilder) or remove it to avoid confusing users with a no-op parameter.
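
One way the "wire it up" option could look, as a sketch only. The QAIRT_LOG_LEVEL variable name comes from the comment above; the helper name `apply_qairt_log_level` is hypothetical, and how QairtGenAIBuilder actually sets the variable is not shown in this thread.

```python
import os
from typing import Optional


def apply_qairt_log_level(log_level: Optional[str]) -> None:
    """Export QAIRT_LOG_LEVEL only when the user configured a level.

    Leaving the variable untouched for None keeps any value already set
    in the caller's environment.
    """
    if log_level is not None:
        os.environ["QAIRT_LOG_LEVEL"] = log_level
```

Calling `apply_qairt_log_level(config.log_level)` at the start of the pass would make the parameter take effect; removing the parameter is the simpler alternative if the pass never needs it.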

Comment on lines +257 to +260
        genai_config["model"]["decoder"]["head_size"] = src_config.get("hidden_size", -1) // src_config.get(
            "num_attention_heads", -1
        )
        genai_config["model"]["decoder"]["hidden_size"] = src_config.get("hidden_size", -1)

Copilot AI Mar 26, 2026


head_size is computed as hidden_size // num_attention_heads using default values of -1 when keys are missing. This can silently produce incorrect values (e.g., -1 // -1 == 1) or raise if num_attention_heads is 0/non-int. Mirror the safer logic used in olive/passes/openvino/ov_utils.py:create_genai_config by validating both values are positive ints before dividing; otherwise set head_size to -1.

Suggested change
-        genai_config["model"]["decoder"]["head_size"] = src_config.get("hidden_size", -1) // src_config.get(
-            "num_attention_heads", -1
-        )
-        genai_config["model"]["decoder"]["hidden_size"] = src_config.get("hidden_size", -1)
+        hidden_size = src_config.get("hidden_size", -1)
+        num_attention_heads = src_config.get("num_attention_heads", -1)
+        head_size = -1
+        if isinstance(hidden_size, int) and isinstance(num_attention_heads, int) and hidden_size > 0 and num_attention_heads > 0:
+            head_size = hidden_size // num_attention_heads
+        genai_config["model"]["decoder"]["head_size"] = head_size
+        genai_config["model"]["decoder"]["hidden_size"] = hidden_size
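
To make the pitfall concrete, the same guard can be pulled into a small standalone function and exercised on a few inputs. This is an illustrative sketch: `compute_head_size` is a hypothetical helper name, and the example config values are made up.

```python
from typing import Any, Mapping


def compute_head_size(src_config: Mapping[str, Any]) -> int:
    """Return hidden_size // num_attention_heads, or -1 if either value is unusable."""
    hidden_size = src_config.get("hidden_size", -1)
    num_attention_heads = src_config.get("num_attention_heads", -1)
    if (
        isinstance(hidden_size, int)
        and isinstance(num_attention_heads, int)
        and hidden_size > 0
        and num_attention_heads > 0
    ):
        return hidden_size // num_attention_heads
    return -1


# The unguarded expression with both keys missing evaluates -1 // -1, which is 1.
unguarded_when_missing = -1 // -1
```

With both keys missing the guarded helper returns -1, while the unguarded expression yields 1, which looks like a plausible head size and would go unnoticed.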

Comment on lines +121 to +123
        with subprocess.Popen(
            ["python", str(script_path), "--config", config_file_path],
            cwd=str(script_path.parent),

Copilot AI Mar 26, 2026


The subprocess is invoked via the literal executable name "python", which can pick up a different interpreter than the one running Olive (e.g., in venv/conda). Use sys.executable (or equivalent) so the preparation script runs under the same Python environment and has access to the same installed dependencies.
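
A minimal sketch of the sys.executable approach described above. The helper name `run_with_current_interpreter` is hypothetical, and it uses subprocess.run for brevity where the pass itself uses subprocess.Popen to stream output.

```python
import subprocess
import sys
from typing import Sequence


def run_with_current_interpreter(args: Sequence[str]) -> str:
    """Run Python with the interpreter executing this process and return stdout.

    sys.executable points at the interpreter of the active venv/conda
    environment, so the child sees the same installed packages.
    """
    result = subprocess.run(
        [sys.executable, *args],
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return result.stdout
```

For the preparation pass, the equivalent change would be replacing the literal "python" in the argument list with sys.executable and leaving the rest of the call as it is.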

@@ -0,0 +1,391 @@
# -------------------------------------------------------------------------

Check warning

Code scanning / lintrunner

RUFF/format Warning test

Run lintrunner -a to apply this patch.

def test_preparation_uses_sys_executable_and_env(tmp_path, mock_hf_model, mock_qairt_modules):
    """Test that subprocess uses sys.executable and passes environment."""
    import os

Check warning

Code scanning / lintrunner

PYLINT/W0611 Warning test

Unused import os (unused-import)
See unused-import.

def test_preparation_uses_sys_executable_and_env(tmp_path, mock_hf_model, mock_qairt_modules):
    """Test that subprocess uses sys.executable and passes environment."""
    import os

Check warning

Code scanning / lintrunner

RUFF/F401 Warning test
