
Add MiniMax as first-class LLM provider #253

Open

octo-patch wants to merge 1 commit into om-ai-lab:main from octo-patch:feature/add-minimax-provider

Conversation


@octo-patch octo-patch commented Mar 23, 2026

Summary

Add MiniMax as a first-class LLM provider for OmAgent, following the existing BaseLLM + registry pattern.

Changes

  • minimax_llm.py - MiniMaxLLM class inheriting BaseLLM, registered via @registry.register_llm(). Uses OpenAI-compatible API with:

    • Temperature clamping to [0, 1.0] range
    • Automatic think tag stripping from responses
    • Sync (_call) and async (_acall) support
    • Streaming support
    • Response format validation (text / json_object)
  • Supported Models: MiniMax-M2.7 (1M context), MiniMax-M2.7-highspeed, MiniMax-M2.5 (204K context), MiniMax-M2.5-highspeed

  • minimax.yml - Example YAML config

  • MiniMax.md - Full documentation with setup, config, and usage examples

  • README.md - Added MiniMax to supported providers list

  • tests - 28 unit tests + 3 integration tests (31 total)
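
The temperature clamping and think-tag stripping listed above can be sketched in a minimal standalone form (a rough sketch only — it assumes a regex-based strip; the PR's actual helpers are `_clamp_temperature` and `_strip_think_tags` with their own signatures):

```python
import re

# Assumed pattern: remove <think>...</think> reasoning blocks, including newlines.
_THINK_RE = re.compile(r"<think>.*?</think>", re.DOTALL)

def clamp_temperature(value: float) -> float:
    # Keep temperature inside MiniMax's accepted [0, 1.0] range.
    return max(0.0, min(1.0, value))

def strip_think_tags(text: str) -> str:
    # Drop any think blocks and trim surrounding whitespace.
    return _THINK_RE.sub("", text).strip()

print(clamp_temperature(1.7))  # 1.0
print(strip_think_tags("<think>reasoning</think>Final answer"))  # Final answer
```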

Usage

```yaml
name: MiniMaxLLM
model_id: MiniMax-M2.7
api_key: ${env| MINIMAX_API_KEY}
temperature: 0
```
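
For reference, the same configuration can be expressed on the Python side roughly like this (a hypothetical dict form mirroring the YAML keys; the exact loading API is OmAgent's, not shown here):

```python
import os

# Hypothetical dict mirroring the YAML config above; field names follow the YAML keys.
llm_config = {
    "name": "MiniMaxLLM",
    "model_id": "MiniMax-M2.7",
    "api_key": os.environ.get("MINIMAX_API_KEY", ""),
    "temperature": 0,
}
print(llm_config["model_id"])  # MiniMax-M2.7
```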

Test Plan

  • 28 unit tests pass (mocked API)
  • 3 integration tests pass (real MiniMax API)

Summary by CodeRabbit

Release Notes

  • New Features

    • Added MiniMax as a supported cloud LLM provider with full configuration and integration support
  • Documentation

    • New MiniMax documentation page covering setup, supported models, configurable parameters, and usage examples
    • Updated README to document MiniMax provider alongside existing cloud providers

Add MiniMaxLLM class inheriting BaseLLM with OpenAI-compatible API,
temperature clamping (0-1.0), think-tag stripping, streaming support,
and registry integration. Includes M2.7/M2.5 model definitions,
YAML config example, documentation, and 31 tests (28 unit + 3 integration).

Co-Authored-By: octopus <octopus@github.com>

coderabbitai bot commented Mar 23, 2026

Walkthrough

This pull request adds support for the MiniMax LLM provider to OmAgent. It includes a new provider implementation with request/response handling, configuration examples, comprehensive documentation, and test coverage for both unit and integration scenarios.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| **Documentation & Configuration**<br>README.md, docs/concepts/models/MiniMax.md, examples/step1_simpleVQA/configs/llms/minimax.yml | Documented MiniMax as a supported cloud LLM provider. Added setup instructions for the MINIMAX_API_KEY environment variable, provided YAML and Python configuration examples, listed supported models with context windows, and documented all configurable parameters. |
| **Core Provider Implementation**<br>omagent-core/src/omagent_core/models/llms/minimax_llm.py | Implemented the MiniMaxLLM provider class with temperature clamping, response format validation, OpenAI-compatible request conversion, automatic system prompt generation (with datetime, geolocation, and platform info), streaming support, and post-processing to strip `<think>...</think>` tags from responses. Includes both synchronous and asynchronous invocation paths. |
| **Test Suite**<br>tests/test_minimax_llm.py, tests/test_minimax_llm_integration.py | Added 360 lines of unit tests covering initialization, temperature clamping, response post-processing, format validation, message conversion, and registry integration. Added 95 lines of integration tests verifying actual API calls, response structure, token usage tracking, and async functionality. |
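
The mocked-API pattern behind the unit tests can be illustrated with a small self-contained sketch (hypothetical — it stands in for the PR's actual fixtures by replacing the OpenAI-compatible client with a `MagicMock` so no network call occurs):

```python
from unittest.mock import MagicMock

# Replace the OpenAI-compatible client with a mock; the canned response
# carries a think tag so post-processing behavior can be asserted on.
client = MagicMock()
client.chat.completions.create.return_value.model_dump.return_value = {
    "choices": [{"message": {"role": "assistant", "content": "<think>x</think>hi"}}]
}

res = client.chat.completions.create(model="MiniMax-M2.7", messages=[])
result = res.model_dump()
print(result["choices"][0]["message"]["content"])  # <think>x</think>hi
```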

Sequence Diagram

```mermaid
sequenceDiagram
    participant Client
    participant MiniMaxLLM
    participant OpenAI SDK
    participant MiniMax API

    Client->>MiniMaxLLM: _call(records) / _acall(records)
    MiniMaxLLM->>MiniMaxLLM: _msg2req(records)<br/>Convert Message to OpenAI format
    MiniMaxLLM->>MiniMaxLLM: _generate_default_sys_prompt()<br/>Optionally add system context<br/>(datetime, geolocation, platform)
    MiniMaxLLM->>MiniMaxLLM: Validate & clamp parameters<br/>(temperature, response_format)

    alt Streaming Enabled
        MiniMaxLLM->>OpenAI SDK: chat.completions.create(stream=true)
        OpenAI SDK->>MiniMax API: POST /v1/chat/completions
        MiniMax API-->>OpenAI SDK: Stream chunks
        OpenAI SDK-->>MiniMaxLLM: Iterator
        MiniMaxLLM-->>Client: Return raw stream
    else Non-Streaming
        MiniMaxLLM->>OpenAI SDK: chat.completions.create(stream=false)
        OpenAI SDK->>MiniMax API: POST /v1/chat/completions
        MiniMax API-->>OpenAI SDK: Complete response
        OpenAI SDK-->>MiniMaxLLM: ChatCompletion object
        MiniMaxLLM->>MiniMaxLLM: Strip <think>...</think> tags<br/>from message content
        MiniMaxLLM-->>Client: Return processed response
    end
```
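
The `_generate_default_sys_prompt()` step in the diagram can be sketched roughly as follows (an assumed shape only — an OpenAI-style system message carrying datetime and platform context; the PR's version also includes geolocation via the `geocoder` package):

```python
import platform
from datetime import datetime

def generate_default_sys_prompt() -> dict:
    # Assumed shape of the generated system message: datetime and platform
    # context prepended to the conversation when enabled.
    content = (
        f"Current datetime: {datetime.now().isoformat()}. "
        f"Platform: {platform.system()} {platform.release()}."
    )
    return {"role": "system", "content": content}

msg = generate_default_sys_prompt()
print(msg["role"])  # system
```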

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Poem

🐰 A hop and a skip, MiniMax appears,
With OpenAI robes and prompt frontiers!
System whispers of time and place,
Async and sync in harmonious race.
Think tags stripped clean, responses shine true,
Integration complete, our providers grew! 🌟

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 71.74%, which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title 'Add MiniMax as first-class LLM provider' directly and accurately summarizes the main objective of the PR, which is to integrate MiniMax as a supported LLM provider alongside OpenAI and Azure OpenAI. |




@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 4

🧹 Nitpick comments (3)
omagent-core/src/omagent_core/models/llms/minimax_llm.py (2)

220-222: Consider using spread syntax for list concatenation.

As flagged by Ruff (RUF005), prefer [self._generate_default_sys_prompt(), *messages] for clearer intent.

♻️ Proposed fix
```diff
         if self.use_default_sys_prompt:
-            messages = [self._generate_default_sys_prompt()] + messages
+            messages = [self._generate_default_sys_prompt(), *messages]
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@omagent-core/src/omagent_core/models/llms/minimax_llm.py` around lines 220 -
222, Replace the current conditional list concatenation in the Minimax LLM where
self.use_default_sys_prompt leads to prepending a generated prompt by using
Python's list unpacking for clearer intent: when self.use_default_sys_prompt is
true, construct messages as [self._generate_default_sys_prompt(), *messages]
instead of using +, updating the block around the use_default_sys_prompt check
in minimax_llm.py (look for self.use_default_sys_prompt and
_generate_default_sys_prompt) so messages is reassigned via the spread-style
list.

246-251: geocoder.ip("me") can hang or fail without timeout.

This method makes an external network call to determine IP geolocation. Without a timeout, it can block indefinitely if the network is unavailable or slow, impacting LLM initialization when use_default_sys_prompt=True.

Consider adding a timeout or making this call lazy/async, and handle exceptions gracefully:

♻️ Proposed defensive approach
```diff
     def _get_location(self) -> str:
-        g = geocoder.ip("me")
-        if g.ok:
-            return g.city + "," + g.country
-        else:
+        try:
+            g = geocoder.ip("me", timeout=2)
+            if g.ok and g.city and g.country:
+                return f"{g.city},{g.country}"
+        except Exception:
+            pass
         return "unknown"
```
tests/test_minimax_llm_integration.py (1)

13-21: Consider using a conftest.py or package installation instead of sys.path manipulation.

Direct sys.path manipulation is fragile and can cause import issues. For test organization, consider:

  • Installing the package in editable mode (pip install -e omagent-core)
  • Using a conftest.py with proper path setup
  • Using pytest's pythonpath configuration
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/test_minimax_llm_integration.py` around lines 13 - 21, The test file
tests/test_minimax_llm_integration.py currently manipulates sys.path with
sys.path.insert and os.path.join which is fragile; replace this by removing the
sys.path modification and either install the omagent-core package in editable
mode (pip install -e omagent-core) or create a pytest conftest.py that sets up
imports (e.g., adding the package root to sys.path or using importlib to make
the package importable) or configure pytest's pythonpath so tests import the
package normally; update tests/test_minimax_llm_integration.py to import the
package modules directly once the package is installed or conftest.py is in
place.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@omagent-core/src/omagent_core/models/llms/minimax_llm.py`:
- Around line 149-162: The instance defaults for tools and tool_choice are not
used because the call in the method uses kwargs.get("tools", None) and
kwargs.get("tool_choice", None); update the call to fall back to the instance
attributes (self.tools and self.tool_choice) so the instance-level defaults
declared on the class are respected (i.e., replace the kwargs.get default None
for "tools" and "tool_choice" with self.tools and self.tool_choice when building
the arguments passed to self.client.chat.completions.create).
- Around line 175-203: The async method _acall currently always calls
res.model_dump() and strips tags, which breaks when streaming is requested;
modify _acall to respect kwargs.get("stream", self.stream) (same check as _call)
and if streaming is enabled return the raw streaming response from
self.aclient.chat.completions.create (do not call res.model_dump() or
post-process), otherwise proceed with model_dump(), strip think tags via
_strip_think_tags, and return the parsed result; locate references to _acall,
_call, self.aclient.chat.completions.create, and res.model_dump to implement
this conditional flow.
- Around line 103-117: The dict validation in check_response_format currently
allows empty dicts and doesn't require the "type" key; update
check_response_format so that when self.response_format is a dict you first
assert that "type" is present (raise ValueError if missing or not a str), then
validate that self.response_format["type"] is one of the allowed values ("text",
"json_object"); keep rejecting any other unexpected keys if desired, but the
critical fix is to require and validate the "type" key before accepting the
dict.
- Around line 140-173: The BaseLLM type annotations are wrong: update the return
types of BaseLLM._call and BaseLLM.generate from -> str to -> Dict to match
actual implementations (e.g., minimax_llm.MinimaxLLM._call) and callers that
expect dictionaries; also add/ensure from typing import Dict is present and
adjust any docstrings or type comments referencing str; run static checks to
confirm all LLM subclasses conform to the new Dict return type.

---

Nitpick comments:
In `@omagent-core/src/omagent_core/models/llms/minimax_llm.py`:
- Around line 220-222: Replace the current conditional list concatenation in the
Minimax LLM where self.use_default_sys_prompt leads to prepending a generated
prompt by using Python's list unpacking for clearer intent: when
self.use_default_sys_prompt is true, construct messages as
[self._generate_default_sys_prompt(), *messages] instead of using +, updating
the block around the use_default_sys_prompt check in minimax_llm.py (look for
self.use_default_sys_prompt and _generate_default_sys_prompt) so messages is
reassigned via the spread-style list.

In `@tests/test_minimax_llm_integration.py`:
- Around line 13-21: The test file tests/test_minimax_llm_integration.py
currently manipulates sys.path with sys.path.insert and os.path.join which is
fragile; replace this by removing the sys.path modification and either install
the omagent-core package in editable mode (pip install -e omagent-core) or
create a pytest conftest.py that sets up imports (e.g., adding the package root
to sys.path or using importlib to make the package importable) or configure
pytest's pythonpath so tests import the package normally; update
tests/test_minimax_llm_integration.py to import the package modules directly
once the package is installed or conftest.py is in place.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 0a20f2c8-dd83-46a3-9bb6-f3c0370d45a5

📥 Commits

Reviewing files that changed from the base of the PR and between c131f82 and aa72d0c.

📒 Files selected for processing (6)
  • README.md
  • docs/concepts/models/MiniMax.md
  • examples/step1_simpleVQA/configs/llms/minimax.yml
  • omagent-core/src/omagent_core/models/llms/minimax_llm.py
  • tests/test_minimax_llm.py
  • tests/test_minimax_llm_integration.py

Comment on lines +103 to +117
```python
    def check_response_format(self) -> None:
        if isinstance(self.response_format, str):
            if self.response_format == "text":
                self.response_format = {"type": "text"}
            elif self.response_format == "json_object":
                self.response_format = {"type": "json_object"}
        elif isinstance(self.response_format, dict):
            for key, value in self.response_format.items():
                if key not in ["type"]:
                    raise ValueError(f"Invalid response format key: {key}")
                if key == "type":
                    if value not in ["text", "json_object"]:
                        raise ValueError(f"Invalid response format value: {value}")
        else:
            raise ValueError(f"Invalid response format: {self.response_format}")
```
⚠️ Potential issue | 🟡 Minor

response_format dict validation allows invalid structures.

The validation logic at lines 110-115 iterates over dict keys but only checks if keys are not "type". This means:

  1. An empty dict {} passes validation (no "type" key required)
  2. A dict with only invalid keys like {"foo": "bar"} raises on foo but not on missing type

Consider validating that "type" key exists and has a valid value:

🛠️ Proposed fix
```diff
     def check_response_format(self) -> None:
         if isinstance(self.response_format, str):
             if self.response_format == "text":
                 self.response_format = {"type": "text"}
             elif self.response_format == "json_object":
                 self.response_format = {"type": "json_object"}
+            else:
+                raise ValueError(f"Invalid response format: {self.response_format}")
         elif isinstance(self.response_format, dict):
-            for key, value in self.response_format.items():
-                if key not in ["type"]:
-                    raise ValueError(f"Invalid response format key: {key}")
-                if key == "type":
-                    if value not in ["text", "json_object"]:
-                        raise ValueError(f"Invalid response format value: {value}")
+            if "type" not in self.response_format:
+                raise ValueError("response_format dict must contain 'type' key")
+            if self.response_format["type"] not in ["text", "json_object"]:
+                raise ValueError(f"Invalid response format type: {self.response_format['type']}")
+            if len(self.response_format) > 1:
+                extra_keys = set(self.response_format.keys()) - {"type"}
+                raise ValueError(f"Invalid response format keys: {extra_keys}")
         else:
             raise ValueError(f"Invalid response format: {self.response_format}")
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@omagent-core/src/omagent_core/models/llms/minimax_llm.py` around lines 103 -
117, The dict validation in check_response_format currently allows empty dicts
and doesn't require the "type" key; update check_response_format so that when
self.response_format is a dict you first assert that "type" is present (raise
ValueError if missing or not a str), then validate that
self.response_format["type"] is one of the allowed values ("text",
"json_object"); keep rejecting any other unexpected keys if desired, but the
critical fix is to require and validate the "type" key before accepting the
dict.

Comment on lines +140 to +173
```python
    def _call(self, records: List[Message], **kwargs) -> Dict:
        if self.api_key is None or self.api_key == "":
            raise ValueError("api_key is required")

        messages = self._msg2req(records)
        temperature = self._clamp_temperature(
            kwargs.get("temperature", self.temperature)
        )

        res = self.client.chat.completions.create(
            model=self.model_id,
            messages=messages,
            temperature=temperature,
            max_tokens=kwargs.get("max_tokens", self.max_tokens),
            response_format=kwargs.get("response_format", self.response_format),
            tools=kwargs.get("tools", None),
            tool_choice=kwargs.get("tool_choice", None),
            stream=kwargs.get("stream", self.stream),
            n=kwargs.get("n", self.n),
            top_p=kwargs.get("top_p", self.top_p),
            stop=kwargs.get("stop", self.stop),
            stream_options=kwargs.get("stream_options", self.stream_options),
        )

        if kwargs.get("stream", self.stream):
            return res
        else:
            result = res.model_dump()
            # Strip think tags from response content
            for choice in result.get("choices", []):
                msg = choice.get("message", {})
                if msg.get("content"):
                    msg["content"] = self._strip_think_tags(msg["content"])
            return result
```
⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

```shell
#!/bin/bash
# Check return types of _call in other LLM implementations
rg -n "def _call" --type=py -A 10 omagent-core/src/omagent_core/models/llms/ | head -100
```

Repository: om-ai-lab/OmAgent

Length of output: 6553

🏁 Script executed:

```shell
# Check the actual generate() implementation in base.py
sed -n '44,65p' omagent-core/src/omagent_core/models/llms/base.py
```

Repository: om-ai-lab/OmAgent

Length of output: 943

🏁 Script executed:

```shell
# Also check the full _call signature and any return value transformations
sed -n '40,80p' omagent-core/src/omagent_core/models/llms/base.py
```

Repository: om-ai-lab/OmAgent

Length of output: 1601

🏁 Script executed:

```shell
# Search for calls to generate() to see how the return value is used
rg "\.generate\(" --type=py -B 2 -A 2 omagent-core/src/ | head -60
```

Repository: om-ai-lab/OmAgent

Length of output: 1871

🏁 Script executed:

```shell
# Check if there are tests for minimax that show expected behavior
fd "test.*minimax" --type=py omagent-core/ 2>/dev/null || echo "No test files found with minimax in name"
```

Repository: om-ai-lab/OmAgent

Length of output: 101

🏁 Script executed:

```shell
# Check test files for LLMs to understand expected return type
fd "test.*llm" --type=py omagent-core/ 2>/dev/null | head -5
```

Repository: om-ai-lab/OmAgent

Length of output: 43

🏁 Script executed:

```shell
# Check tool_system manager.py to see how fix_res is used
sed -n '1,400p' omagent-core/src/omagent_core/tool_system/manager.py | grep -A 10 "fix_res = self.llm.generate"
```

Repository: om-ai-lab/OmAgent

Length of output: 433

🏁 Script executed:

```shell
# Verify minimax implementation matches pattern of other providers
# Check what minimax _call actually returns (full return statement)
sed -n '140,173p' omagent-core/src/omagent_core/models/llms/minimax_llm.py
```

Repository: om-ai-lab/OmAgent

Length of output: 1485


Type annotation contract violation: _call declares return type str but actually returns Dict.

The base class BaseLLM._call() declares return type str, and BaseLLM.generate() directly returns this result. However, this implementation (and all other LLM providers) actually return Dict. Real callers like tool_system/manager.py and the base class usage in base.py treat the result as a dictionary (accessing nested keys like ["choices"][0]), confirming the actual contract is Dict, not str.

This type annotation mismatch affects all LLM implementations systematically. Update the base class signatures in base.py (lines 44 and 47) to declare -> Dict instead of -> str.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@omagent-core/src/omagent_core/models/llms/minimax_llm.py` around lines 140 -
173, The BaseLLM type annotations are wrong: update the return types of
BaseLLM._call and BaseLLM.generate from -> str to -> Dict to match actual
implementations (e.g., minimax_llm.MinimaxLLM._call) and callers that expect
dictionaries; also add/ensure from typing import Dict is present and adjust any
docstrings or type comments referencing str; run static checks to confirm all
LLM subclasses conform to the new Dict return type.

Comment on lines +149 to +162
```python
        res = self.client.chat.completions.create(
            model=self.model_id,
            messages=messages,
            temperature=temperature,
            max_tokens=kwargs.get("max_tokens", self.max_tokens),
            response_format=kwargs.get("response_format", self.response_format),
            tools=kwargs.get("tools", None),
            tool_choice=kwargs.get("tool_choice", None),
            stream=kwargs.get("stream", self.stream),
            n=kwargs.get("n", self.n),
            top_p=kwargs.get("top_p", self.top_p),
            stop=kwargs.get("stop", self.stop),
            stream_options=kwargs.get("stream_options", self.stream_options),
        )
```
⚠️ Potential issue | 🟡 Minor

tools and tool_choice instance defaults not passed through.

Lines 155-156 use kwargs.get("tools", None) and kwargs.get("tool_choice", None) instead of falling back to self.tools and self.tool_choice. This means the instance-level defaults (lines 85-91) are never used.

🐛 Proposed fix
```diff
-            tools=kwargs.get("tools", None),
-            tool_choice=kwargs.get("tool_choice", None),
+            tools=kwargs.get("tools", self.tools),
+            tool_choice=kwargs.get("tool_choice", self.tool_choice),
```
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

```diff
         res = self.client.chat.completions.create(
             model=self.model_id,
             messages=messages,
             temperature=temperature,
             max_tokens=kwargs.get("max_tokens", self.max_tokens),
             response_format=kwargs.get("response_format", self.response_format),
-            tools=kwargs.get("tools", None),
-            tool_choice=kwargs.get("tool_choice", None),
+            tools=kwargs.get("tools", self.tools),
+            tool_choice=kwargs.get("tool_choice", self.tool_choice),
             stream=kwargs.get("stream", self.stream),
             n=kwargs.get("n", self.n),
             top_p=kwargs.get("top_p", self.top_p),
             stop=kwargs.get("stop", self.stop),
             stream_options=kwargs.get("stream_options", self.stream_options),
         )
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@omagent-core/src/omagent_core/models/llms/minimax_llm.py` around lines 149 -
162, The instance defaults for tools and tool_choice are not used because the
call in the method uses kwargs.get("tools", None) and kwargs.get("tool_choice",
None); update the call to fall back to the instance attributes (self.tools and
self.tool_choice) so the instance-level defaults declared on the class are
respected (i.e., replace the kwargs.get default None for "tools" and
"tool_choice" with self.tools and self.tool_choice when building the arguments
passed to self.client.chat.completions.create).

Comment on lines +175 to +203
```python
    async def _acall(self, records: List[Message], **kwargs) -> Dict:
        if self.api_key is None or self.api_key == "":
            raise ValueError("api_key is required")

        messages = self._msg2req(records)
        temperature = self._clamp_temperature(
            kwargs.get("temperature", self.temperature)
        )

        res = await self.aclient.chat.completions.create(
            model=self.model_id,
            messages=messages,
            temperature=temperature,
            max_tokens=kwargs.get("max_tokens", self.max_tokens),
            response_format=kwargs.get("response_format", self.response_format),
            tools=kwargs.get("tools", None),
            n=kwargs.get("n", self.n),
            top_p=kwargs.get("top_p", self.top_p),
            stop=kwargs.get("stop", self.stop),
            stream_options=kwargs.get("stream_options", self.stream_options),
        )

        result = res.model_dump()
        # Strip think tags from response content
        for choice in result.get("choices", []):
            msg = choice.get("message", {})
            if msg.get("content"):
                msg["content"] = self._strip_think_tags(msg["content"])
        return result
```
⚠️ Potential issue | 🟠 Major

_acall missing streaming support unlike _call.

_call (lines 164-165) checks kwargs.get("stream", self.stream) and returns the raw stream object when streaming is enabled. However, _acall doesn't have this check and always calls res.model_dump(), which will fail or behave unexpectedly when streaming is requested asynchronously.

🐛 Proposed fix to add streaming support to _acall
```diff
     async def _acall(self, records: List[Message], **kwargs) -> Dict:
         if self.api_key is None or self.api_key == "":
             raise ValueError("api_key is required")
 
         messages = self._msg2req(records)
         temperature = self._clamp_temperature(
             kwargs.get("temperature", self.temperature)
         )
 
         res = await self.aclient.chat.completions.create(
             model=self.model_id,
             messages=messages,
             temperature=temperature,
             max_tokens=kwargs.get("max_tokens", self.max_tokens),
             response_format=kwargs.get("response_format", self.response_format),
             tools=kwargs.get("tools", None),
             n=kwargs.get("n", self.n),
             top_p=kwargs.get("top_p", self.top_p),
             stop=kwargs.get("stop", self.stop),
             stream_options=kwargs.get("stream_options", self.stream_options),
+            stream=kwargs.get("stream", self.stream),
         )
 
+        if kwargs.get("stream", self.stream):
+            return res
+
         result = res.model_dump()
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@omagent-core/src/omagent_core/models/llms/minimax_llm.py` around lines 175 -
203, The async method _acall currently always calls res.model_dump() and strips
tags, which breaks when streaming is requested; modify _acall to respect
kwargs.get("stream", self.stream) (same check as _call) and if streaming is
enabled return the raw streaming response from
self.aclient.chat.completions.create (do not call res.model_dump() or
post-process), otherwise proceed with model_dump(), strip think tags via
_strip_think_tags, and return the parsed result; locate references to _acall,
_call, self.aclient.chat.completions.create, and res.model_dump to implement
this conditional flow.
