WFGY 3.0 · TXT-based tension reasoning engine (community test) #72
Asked by onestardao in Q&A
WFGY 2.0 grew up in the RAG / infra world. Its 16-problem ProblemMap is already used as a failure-mode language by several external projects and lists (LlamaIndex RAG troubleshooting docs, Harvard MIMS Lab’s ToolUniverse, Rankify, QCRI LLM Lab’s multimodal RAG survey, multiple “Awesome X” lists, etc.). In practice, 2.0 became a shared checklist for “what exactly broke in my pipeline”.
WFGY 3.0 tries to push the same language into a general reasoning engine.
Instead of only naming RAG failure modes, 3.0 ships as a single TXT pack wired to 131 S-class questions. You upload the TXT into a strong LLM, type "run", then "go", and from that point on the model enters a dedicated console that treats your question as a point inside this "tension atlas" rather than as a random prompt.
The engine itself is already stable. What is still in flux are the prompts, menu wording, and console UX, and this is where I would most like feedback from people who care about deep reasoning quality.
How to try it (5 minutes)
Upload the TXT pack into a strong LLM, type "run", then "go", and follow the built-in menu to pick a mission. Bring one real high-tension question from your life, research, or system, not a toy problem. If the first run collapses, loops, or feels fake, that is still useful: please note which model you used, what you asked, and where it broke.
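If you prefer driving the test through an API rather than a chat UI, the boot sequence above can be sketched as a message list. This is a minimal, hypothetical sketch: the file name, the use of plain user-role messages, and the idea of sending the pack as the first message are my assumptions, not part of the official WFGY instructions.

```python
# Hypothetical sketch of booting the WFGY 3.0 TXT pack via a chat API.
# The pack is plain text, so we simply send it as the first message,
# then the "run" / "go" activation words, then one real question.
from pathlib import Path


def build_boot_messages(txt_path: str, question: str) -> list[dict]:
    """Assemble the conversation that mirrors the manual steps:
    1. upload the TXT pack, 2. type "run", 3. type "go",
    4. ask one real high-tension question."""
    pack = Path(txt_path).read_text(encoding="utf-8")
    return [
        {"role": "user", "content": pack},      # the full TXT pack
        {"role": "user", "content": "run"},     # boot the console
        {"role": "user", "content": "go"},      # enter GO mode
        {"role": "user", "content": question},  # your real question
    ]
```

You would then pass this list to whatever chat-completion client you use; the point is only that the engine is ordinary text, so the whole flow is reproducible and easy to log when reporting where a run broke.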
What feedback is most useful right now
GO mode
Does the quick "go" flow give you a clear sense of what this engine is trying to do at the effective layer, or does it feel gimmicky or confusing?
Console and missions
Are the menu options and mission descriptions readable enough, or too dense / too long? Is there anything you would never click because the wording is unclear?
Behaviour across models
On your model of choice, do the PROMPT_02 / PROMPT_03 / STORY flows feel too heavy, too light, or about right? Are there points where a small wording change would obviously help the model think better?
Atlas feel (the 131 S-class problems)
When the engine references S-class IDs, does it help you navigate (“this feels like a map”), or does it just add noise? If you try to map the same real-world question more than once, does it land in roughly the same region?
You can send feedback as a GitHub Discussion reply, as a GitHub issue with logs / screenshots, or as an external write-up (I am happy to link back). Honest failure reports are more valuable than polite praise.
License and usage
WFGY 3.0 follows the same MIT license as WFGY 2.0: you are free to use, copy, modify, and redistribute it.
In return, the only real expectation is that you treat this TXT as a serious candidate for a reasoning engine. If you find places where it clearly fails that bar, please show me where.
https://github.com/onestardao/WFGY