
feat: MathAgent - LLM-as-Translator + SymPy-as-Solver Architecture #37

Open
cohen-liel wants to merge 1 commit into main from feat/math-agent-sympy-architecture

Conversation

@cohen-liel
Owner

Summary

New architecture where the LLM translates math questions into Python/SymPy code, a Python subprocess executes it, and majority voting on N execution outputs determines the final verified answer.

Architecture

User Question → LLM (Translator) → Python/SymPy Code → Python Subprocess → Computed Answer
                                                                                  ↓
                                                                     Majority Voting on N paths

Core Principle: The LLM never "solves" math; it only translates questions into SymPy code. Python/SymPy does the actual computation, giving verified results.
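The voting step can be sketched as follows. This is a minimal illustration, not the code in `server/mathSolver.ts`: the names `majorityVote` and `VoteResult`, the whitespace normalization, and the first-seen tie-break are all assumptions made for the example.

```typescript
// Hypothetical sketch of majority voting over N execution outputs.
// Names and the tie-breaking rule are assumptions, not taken from the PR.
interface VoteResult {
  answer: string | null; // null when no path produced usable output
  votes: number;         // number of paths that agreed on the winner
  total: number;         // number of paths that produced usable output
}

function majorityVote(outputs: Array<string | null>): VoteResult {
  const counts = new Map<string, number>();
  let total = 0;

  for (const raw of outputs) {
    if (raw === null) continue; // failed or timed-out path
    const key = raw.trim();     // ignore trailing newlines from stdout
    if (key === "") continue;
    counts.set(key, (counts.get(key) ?? 0) + 1);
    total += 1;
  }

  let answer: string | null = null;
  let votes = 0;
  // Map preserves insertion order, so a tie goes to the answer seen first.
  counts.forEach((n, key) => {
    if (n > votes) {
      answer = key;
      votes = n;
    }
  });

  return { answer, votes, total };
}
```

For example, `majorityVote(["4", "4\n", null, "5"])` ignores the failed path and picks `"4"` with 2 of 3 usable votes. Comparing raw strings means `1/2` and `0.5` count as different answers, so a real pipeline would likely want to normalize outputs before voting.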

Key Components

| File | Purpose |
| --- | --- |
| server/sympyExecutor.ts | Python subprocess executor with 15s timeout and sandboxing |
| server/mathSolver.ts | Code generation → execution → majority voting pipeline |
| server/vllmClient.ts | LLM client with vLLM/Forge fallback |
| drizzle/schema.ts | Updated schema with generatedCode, executionOutput, executionStatus |
| client/src/pages/* | Frontend pages showing code + execution output |

New DB Fields

  • generatedCode: Python/SymPy code generated by the LLM
  • executionOutput: Captured stdout from Python execution
  • executionStatus: success | error | timeout
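As a rough illustration of how these three fields might be populated from one subprocess run, here is a small sketch. Only the field names and the three status values come from this PR; `RawRun`, `SolutionPath`, `toSolutionPath`, and the rule that a timeout takes precedence over the exit code are assumptions.

```typescript
// The three status values listed in this PR.
type ExecutionStatus = "success" | "error" | "timeout";

// Field names match the new DB fields; the interface itself is hypothetical.
interface SolutionPath {
  generatedCode: string;
  executionOutput: string;
  executionStatus: ExecutionStatus;
}

// Hypothetical shape of what a subprocess wrapper might report.
interface RawRun {
  stdout: string;
  exitCode: number | null; // null if the process was killed before exiting
  timedOut: boolean;
}

// Assumed rule: a timeout wins over the exit code, and any non-zero
// or missing exit code counts as an error.
function toSolutionPath(code: string, run: RawRun): SolutionPath {
  let status: ExecutionStatus;
  if (run.timedOut) {
    status = "timeout";
  } else if (run.exitCode === 0) {
    status = "success";
  } else {
    status = "error";
  }
  return {
    generatedCode: code,
    executionOutput: run.stdout,
    executionStatus: status,
  };
}
```

Using a string-literal union for the status lets the compiler reject any value other than the three listed above.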

Tests

31 passing vitest tests covering router validation, SymPy executor (real Python execution), extractAnswer utility, and majority voting logic.
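The PR description does not show what the extractAnswer utility does. One plausible approach, assumed here purely for illustration, is to treat the last non-empty line of stdout as the answer; the function name `extractAnswerSketch` is hypothetical and deliberately differs from the real utility.

```typescript
// Hypothetical sketch: take the last non-empty line of captured stdout.
// This is an assumed behavior, not the PR's extractAnswer implementation.
function extractAnswerSketch(stdout: string): string | null {
  const lines = stdout
    .split(/\r?\n/)                   // handle both LF and CRLF line endings
    .map((line) => line.trim())
    .filter((line) => line.length > 0);
  return lines.length > 0 ? lines[lines.length - 1] : null;
}
```

Under this assumption, intermediate prints from the generated code are ignored and only the final printed value is passed on to voting.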

New architecture where LLM translates math questions into Python/SymPy code,
Python subprocess executes it, and majority voting on N execution outputs
determines the final verified answer.

Key components:
- sympyExecutor.ts: Python subprocess with timeout and sandboxing
- mathSolver.ts: code generation → execution → majority voting pipeline
- vllmClient.ts: LLM client with vLLM/Forge fallback
- Updated schema with generatedCode, executionOutput, executionStatus fields
- Frontend pages showing code + execution output
- 31 passing vitest tests