Hi, I would like to suggest adding WFGY as a practical resource.
WFGY focuses on diagnosing real system failures. Its ProblemMap checklist covers 16 recurring failure modes in retrieval and generation pipelines. In practice, many of these issues are data-centric, such as contamination, duplication collapse, chunking contract mismatch, and evaluation leakage.
Links:
GitHub repo: https://github.com/onestardao/WFGY