Update: Token Optimizer Audit — "30-50% Token Reduction" Claims

The Metrics Are Fabricated

The irony: while claiming 30-50% token reduction, ruflo actually adds ~15,000-25,000 tokens of noise per session. The tool that promises token savings costs you more tokens than not having it installed.

Full details added to the audit gist.
-
Hey 👋
First off — respect to the author for the ambition behind this project. Building an AI agent orchestration framework is a massive undertaking, and the vision here is genuinely impressive. The README paints a picture of something that could be transformative: 300+ MCP tools, Byzantine fault-tolerant consensus, neural pattern learning, WASM sandboxed agents, hierarchical swarm coordination.
I'm not here to tear that vision down. The marketing is strong, the roadmap is ambitious, and maybe this is where ruflo is headed. But I believe the community deserves transparency about where things stand today.
What We Did
We conducted a deep independent technical audit of ruflo v3.5.51 — hands-on testing of every major tool category, source code analysis, local process inspection, and hooks code review. We spawned 8 research agents across two analysis phases.
What We Found
Out of 300+ MCP tools exposed by ruflo, only about ten do real work:

- `memory_store`/`search` (HNSW), `embeddings_generate`, `terminal_execute`, `session_save`

The headline orchestration tools do not:

- `agent_spawn`, `task_assign`, `neural_train`, `wasm_agent_prompt`, `workflow_execute`

Key Findings
- `agent_spawn` — Creates a `Map` entry `{ status: "idle", taskCount: 0 }`. No subprocess. No LLM call. The status never changes.
- `neural_train` — Reports 93.6% accuracy on 5 samples. Then every prediction returns `"coder"` regardless of input. The accuracy metric appears to be randomized.
- `wasm_agent_prompt` — Echoes your input back. No WASM runtime exists.
- Consensus types (Byzantine, Raft, Queen, Gossip, CRDT) — All selectable at `hive-mind init`; all route to the same JSON file handler. `verifySignature()` unconditionally returns `true`. Raft's `requestVotes()` fires a local `EventEmitter` (comment in source: "For now, emit event for testing"). The consensus type you select is stored as a string label that changes nothing about behavior.
- The disconnected LLM provider — An `AnthropicProvider` class with real HTTP calls to `api.anthropic.com` exists in the codebase. A `ProviderManager` with round-robin routing exists too. But nothing in the agent spawn or task execution path imports or calls these providers. The wire is missing.
- Intelligence layer — Processes 5,706 entries (only ~20 unique; the rest are duplicates) into a 100 MB graph file. Runs PageRank over near-identical nodes (converging to a uniform ~0.02 for all). Injects the same entry 5 times per message. ~15,000 tokens are wasted per session on noise.
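To make the stub pattern concrete, here is a minimal reconstruction of the behavior described above: an in-memory registry whose agents never leave "idle", and a signature check that always passes. Every identifier below (`spawnAgent`, `verifySignature`, the `agents` map) is illustrative; this is a sketch of the reported behavior, not ruflo's actual source.

```javascript
// Sketch of the stub patterns the audit describes. All names are
// illustrative reconstructions, not ruflo's real identifiers.
const agents = new Map();

function spawnAgent(type) {
  const id = `agent-${agents.size + 1}`;
  // Only a bookkeeping entry is created: no subprocess, no LLM call,
  // and nothing ever transitions the status away from "idle".
  agents.set(id, { type, status: "idle", taskCount: 0 });
  return id;
}

// Matches the reported behavior of verifySignature(): it accepts
// anything, which makes "Byzantine fault tolerance" a label, not a check.
function verifySignature(message, signature) {
  return true; // every signature "verifies"
}

const id = spawnAgent("coder");
console.log(agents.get(id)); // { type: 'coder', status: 'idle', taskCount: 0 }
console.log(verifySignature("anything", "not-a-signature")); // true
```

The point: a registry entry plus a hard-coded `true` is indistinguishable from working orchestration in a quick demo, which is exactly why this class of stub survives casual testing.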
What Actually Works
The HNSW memory system is real and useful — 384-dim `all-MiniLM-L6-v2` embeddings, real vector search, SQLite persistence. The embeddings engine works. Session persistence works. These ~10 tools provide genuine value.

Why This Matters
Users installing ruflo see "300+ MCP tools" and reasonably expect them to work. When `agent_spawn` sits idle forever and `neural_train` returns random accuracy, that's not a missing feature — it's a trust problem. New users may spend hours debugging what they assume is their configuration, when the tools simply have no execution backend.

The Ask
I'd love to see:

- Each tool labeled `implemented`, `stub`, or `planned` in the docs

This isn't about criticism — it's about helping users make informed decisions and helping the project earn the trust its ambition deserves.
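For the docs labeling, even a simple status column would go a long way. A sketch (statuses drawn only from the tools examined above; the table format itself is just a suggestion):

```markdown
| MCP tool            | Status      |
|---------------------|-------------|
| memory_store        | implemented |
| embeddings_generate | implemented |
| session_save        | implemented |
| agent_spawn         | stub        |
| neural_train        | stub        |
| wasm_agent_prompt   | stub        |
```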
📄 Full audit with source code evidence, file inventories, token cost analysis, and recommendations:
👉 Complete Audit Report (GitHub Gist)
Audit conducted 2026-04-04 on ruflo v3.5.51 / @claude-flow/cli@latest