refactor(mini-swe-agent) by FatPigeorz · Pull Request #39 · Agentix-Project/Agentix

FatPigeorz · 2026-05-23T18:47:57Z

Summary

Task 5 of the 7-task batch. Expands the mini-swe-agent integration from a 25-line passthrough to a runner that captures the agent's native v2 trajectory, converts it into a structured Trajectory (the same ATIF v1.2 shape harbor uses in MiniSweAgent.populate_context_post_run), and aggregates token usage + cost for downstream eval / RL consumers.

New return shape

result = await client.remote(
    mini_swe.run,
    task=task,
    workdir=workdir,
    agent=agent,
    trajectory_path="/tmp/mini-swe-agent.trajectory.json",
)
# result: MiniSweAgentResult(
#     exit_status=..., submission=..., workdir=...,
#     raw_trajectory={...},                       # mini-swe-agent v2 native
#     trajectory=Trajectory(...),                 # structured ATIF-v1.2
#     usage={"n_input_tokens": ..., "n_output_tokens": ..., "cost_usd": ...},
# )

Module layout

agentix.agents.mini_swe_agent/
├── trajectory.py   Trajectory + Step + ToolCall + Observation + Metrics + FinalMetrics
│                   + from_mini_swe_agent(raw, session_id=...)
│                   + aggregate_usage(raw)
└── runner.py       run(task, workdir, agent, trajectory_path=None, session_id=None)
                    -> MiniSweAgentResult
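
As a rough illustration of what a totals-only helper like `aggregate_usage(raw)` might do, here is a minimal sketch. The output keys (`n_input_tokens`, `n_output_tokens`, `n_cache_tokens`, `cost_usd`) come from this PR. The input layout (a `messages` list whose entries carry a `usage` dict with `prompt_tokens`, `completion_tokens`, `cached_tokens`, and `cost`) is an assumption made for the example and may not match the real mini-swe-agent v2 schema.

```python
from typing import Any


def aggregate_usage_sketch(raw: dict[str, Any]) -> dict[str, Any]:
    """Sum per-message usage into one totals dict.

    ASSUMPTION: each message may carry a `usage` dict with the hypothetical
    keys `prompt_tokens`, `completion_tokens`, `cached_tokens`, and `cost`.
    """
    totals: dict[str, Any] = {
        "n_input_tokens": 0,
        "n_output_tokens": 0,
        "n_cache_tokens": 0,
        "cost_usd": 0.0,
    }
    for msg in raw.get("messages", []):
        usage = msg.get("usage") or {}  # messages without usage contribute nothing
        totals["n_input_tokens"] += usage.get("prompt_tokens", 0)
        totals["n_output_tokens"] += usage.get("completion_tokens", 0)
        totals["n_cache_tokens"] += usage.get("cached_tokens", 0)
        totals["cost_usd"] += usage.get("cost", 0.0)
    return totals
```

Because the helper walks the raw dict only once and builds no `Step` objects, it stays cheap for callers that only need totals.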

The structured trajectory rides back to the host inside client.remote(...)'s pickled return value — no shared filesystem assumed. Tool calls are lifted into structured ToolCalls, tool results attach as observations on the preceding agent step, and cost is apportioned by completion-token share (matching harbor's converter).
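
The two conversion rules described above (tool results attach to the preceding agent step, and cost is split by completion-token share) can be sketched roughly as follows. This is not the PR's code: `SketchStep`, `convert_sketch`, and the flat message layout (`role`, `content`, `completion_tokens`) are hypothetical stand-ins, and the real `from_mini_swe_agent` may differ in its details.

```python
from dataclasses import dataclass, field


@dataclass
class SketchStep:
    """Hypothetical, simplified stand-in for the PR's Step dataclass."""
    source: str                      # "system", "user", or "agent"
    message: str
    completion_tokens: int = 0
    cost_usd: float = 0.0
    observations: list[str] = field(default_factory=list)


def convert_sketch(messages: list[dict], total_cost_usd: float) -> list[SketchStep]:
    steps: list[SketchStep] = []
    for msg in messages:
        role = msg["role"]
        if role == "tool":
            # A tool result becomes an observation on the most recent agent step.
            for step in reversed(steps):
                if step.source == "agent":
                    step.observations.append(msg.get("content") or "")
                    break
            continue
        source = "agent" if role == "assistant" else role
        steps.append(
            SketchStep(source, msg.get("content") or "", msg.get("completion_tokens", 0))
        )

    # Apportion the run's total cost by each step's share of completion tokens.
    total_completion = sum(s.completion_tokens for s in steps)
    if total_completion > 0:
        for s in steps:
            s.cost_usd = total_cost_usd * s.completion_tokens / total_completion
    return steps
```

With two assistant messages reporting 30 and 10 completion tokens and a total cost of 0.40 USD, this sketch assigns them 0.30 and 0.10 USD respectively, so the per-step costs sum back to the total.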

Example update: examples/run-mini-swe-agent/main.py prints mini_usage and mini_trajectory_steps lines alongside the existing mini_exit_status / mini_submission.

Test plan

8 new tests in plugins/agents/mini-swe-agent/tests/test_mini_swe_agent.py cover the structured result shape, trajectory load + parse, inline trajectory passthrough, exception propagation, aggregate-vs-final-metrics consistency, dict / json round trip, and the string-arguments fallback for tool calls.
Full repo suite: 184 passing.
pyright clean.
ruff clean.
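
For readers unfamiliar with the string-arguments case in the test list, the sketch below shows one plausible reading of the fallback: a tool call's `function.arguments` may arrive as a dict, as a JSON-encoded object, or as a bare string, and a bare string is treated as the value of `command`. The helper name `normalize_arguments` and the exact precedence are assumptions for illustration, not the PR's implementation.

```python
import json
from typing import Any


def normalize_arguments(arguments: Any) -> dict[str, Any]:
    """Coerce tool-call arguments into a dict (hypothetical helper)."""
    if isinstance(arguments, dict):
        return arguments
    if isinstance(arguments, str):
        try:
            parsed = json.loads(arguments)
        except json.JSONDecodeError:
            parsed = None
        if isinstance(parsed, dict):
            return parsed
        # ASSUMPTION: anything that is not a JSON object is a raw command string.
        return {"command": arguments}
    return {}
```

Under this reading, `normalize_arguments("ls -la")` yields `{"command": "ls -la"}`, while a JSON object string is decoded as-is.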

Made with Cursor

Expands the mini-swe-agent integration from a 25-line passthrough to a runner that captures the agent's native v2 trajectory, converts it into a structured `Trajectory` (the same ATIF v1.2 shape harbor uses in `MiniSweAgent.populate_context_post_run`), and aggregates token usage + cost for downstream eval / RL consumers.

New modules:

* `agentix.agents.mini_swe_agent.trajectory`: `Trajectory` / `Step` / `ToolCall` / `Observation` / `Metrics` / `FinalMetrics` / `AgentInfo` dataclasses; a `from_mini_swe_agent(raw, session_id=...)` converter that handles system / user / assistant / tool roles, lifts assistant `tool_calls` into structured `ToolCall`s, attaches tool results as observations on the preceding agent step, and apportions cost by completion-token share. Plus a cheap `aggregate_usage(...)` for callers that only need totals (matches harbor's `populate_context_post_run` shape: `n_input_tokens`, `n_output_tokens`, `n_cache_tokens`, `cost_usd`).
* `agentix.agents.mini_swe_agent.runner`: `run(task, workdir, agent, trajectory_path=None, session_id=None)` now returns a `MiniSweAgentResult(exit_status, submission, workdir, raw_trajectory, trajectory, usage)`. When `trajectory_path` points at the file mini-swe-agent's CLI flag `--output` writes, the structured trajectory rides back inside `client.remote(...)`'s pickled return value, so no shared filesystem is assumed.

Example update: `examples/run-mini-swe-agent/main.py` prints `mini_usage` (input/output/cached/cost) and `mini_trajectory_steps` in addition to the existing `mini_exit_status` / `mini_submission` lines, so a user sees the new shape end-to-end.

Tests:

* 8 tests in `plugins/agents/mini-swe-agent/tests/`:
  - structured `MiniSweAgentResult` shape
  - trajectory load + parse from a file path
  - inline trajectory passthrough
  - exception propagation
  - `aggregate_usage` ↔ `final_metrics` consistency
  - `to_dict` strips None / `to_json` round trip
  - string `tool_calls.function.arguments` fallback to `command`

Full repo suite: 184 passing. pyright clean. ruff clean.

Co-authored-by: Cursor <cursoragent@cursor.com>

FatPigeorz changed the title ~~refactor(mini-swe-agent): harbor-style trajectory + post-run metrics~~ refactor(mini-swe-agent) May 23, 2026

FatPigeorz wants to merge 1 commit into master from refactor/mini-swe-agent-harbor
