Your Kedro pipelines are green, your RAG answers are wrong – here is a 16-problem map I use to debug them #75
onestardao
started this conversation in
General
Hi everyone,
I ran into a pattern that I guess many Kedro users are seeing now:
the pipelines look perfect from Kedro’s point of view, but the RAG / LLM node at the end is still giving wrong or unstable answers.
To make this easier to debug, I wrote a long Medium article that treats this as a failure-diagnostics problem, not a “prompt tuning” problem:
👉 “Your Kedro pipelines are reproducible. Your RAG answers are wrong. Here is a 16-problem map to debug them.”
https://psbigbig.medium.com/your-kedro-pipelines-are-reproducible-ae42f775bfde
A quick summary of what is inside, from a Kedro user’s perspective:
1. The situation
2. A 16-problem failure map + global debug card
3. How it plugs into Kedro without changing your infra
The whole point is to keep Kedro as-is and add a semantic failure language on top. The article describes three levels:
Manual triage on a few pipelines
Structured diagnostics per node
- Add a rag_failure_reports dataset to your Data Catalog (JSON or Parquet).
- Each report records wfgy_problem_no, wfgy_lane, and optionally a ΔS zone (semantic stress band).
A Kedro hook that runs the clinic after LLM nodes
- An after_node_run hook that only fires for nodes tagged llm_node.
- It writes its results to rag_failure_reports.
The article includes a small sketch of such a hook and shows how to keep everything version-controlled inside your repo (for example in a docs/wfgy_rag_clinic/ folder with the debug card image + a system-prompt text file).
4. Instruments under the hood (optional, for people who like theory)
If you read further down, there is an explanation of how the map thinks about semantic stress ΔS, four zones of tension, and a few internal instruments (λ_observe, E_resonance and four repair operators) that give both humans and LLMs a consistent way to talk about “where tension accumulates” in the pipeline. You do not need to implement math to use them; the appendix system prompt lets an LLM approximate all of this from text.
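To make the second and third levels from section 3 a bit more concrete, here is a minimal sketch of what such a hook could look like. It is not the sketch from the article. The class names RagFailureReport and RagClinicHooks, the diagnose callable, the delta_s_zone field name, and the output path are all assumptions made for illustration. Only after_node_run, the llm_node tag, rag_failure_reports, wfgy_problem_no and wfgy_lane come from the post. For simplicity it appends JSON Lines to a file directly instead of going through the Data Catalog.

```python
import json
from dataclasses import asdict, dataclass
from pathlib import Path
from typing import Any, Callable, Dict, Optional

try:
    from kedro.framework.hooks import hook_impl
except ImportError:
    # Fallback so the sketch can be read and run without Kedro installed.
    def hook_impl(func):
        return func


@dataclass
class RagFailureReport:
    """One diagnostic record per LLM node run (hypothetical schema)."""
    node_name: str
    wfgy_problem_no: Optional[int]       # entry on the 16-problem map, None if nothing flagged
    wfgy_lane: Optional[str]
    delta_s_zone: Optional[str] = None   # optional semantic stress band (assumed field name)


# Signature of the hypothetical "clinic" step: (node name, inputs, outputs) -> report.
Diagnose = Callable[[str, Dict[str, Any], Dict[str, Any]], RagFailureReport]


class RagClinicHooks:
    """Runs a diagnose step after every node tagged `llm_node`."""

    def __init__(self, diagnose: Diagnose,
                 report_path: str = "data/08_reporting/rag_failure_reports.jsonl"):
        self._diagnose = diagnose
        self._report_path = Path(report_path)

    @hook_impl
    def after_node_run(self, node, inputs, outputs):
        # Only fire for nodes explicitly tagged as LLM nodes.
        if "llm_node" not in node.tags:
            return
        report = self._diagnose(node.name, inputs, outputs)
        # Append one JSON line per run so earlier reports are kept.
        self._report_path.parent.mkdir(parents=True, exist_ok=True)
        with self._report_path.open("a", encoding="utf-8") as fh:
            fh.write(json.dumps(asdict(report), ensure_ascii=False) + "\n")
```

In a Kedro project this would be registered through the HOOKS tuple in settings.py, with diagnose being whatever function sends the node's inputs and outputs to an LLM together with the debug card system prompt. Writing through a catalog dataset instead of a raw file is a reasonable alternative if you want the reports versioned like any other Kedro output.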
5. Why I am sharing this here
I maintain an open-source project called WFGY that focuses on failure-first debugging for RAG / LLM systems. The 16-problem map started there, then got adapted into several other tools. This article is my attempt to write a Kedro-specific walkthrough, instead of a generic RAG rant.
I would really appreciate feedback from Kedro users.
Again, the full article with the image and the copy-pasteable system prompt is here:
https://psbigbig.medium.com/your-kedro-pipelines-are-reproducible-ae42f775bfde
Thanks for reading, and happy to iterate on this if the Kedro community finds it useful.
WFGY 3.0 · RAG 16 Problem Map — Global Debug Card