Epic: Part 7 — AI Evaluations
Integration with Vertex AI Evaluation Service to run automated LLM judge evaluations on Glow CI outputs. All LLM calls (RAG retrieval + Gemini synthesis) are logged via Cloud Logging and fed into Vertex AI evals. Business stakeholders define evaluation criteria (hallucination, citation coverage, relevance); the platform runs automated evaluations and surfaces results.
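The logging half of this flow can be sketched locally. This is a minimal stand-in, not the Glow CI implementation: the field names (`model`, `prompt`, `response`, `retrieved_chunks`, `latency_ms`) are illustrative assumptions, and in production these records would be emitted as structured `jsonPayload` entries through the Cloud Logging client rather than stdlib logging.

```python
import json
import logging

# Local sketch of structured LLM-call logging for Glow CI.
# Field names are illustrative, not a confirmed schema; in GCP this
# would go through the Cloud Logging client so Vertex AI evals can
# consume the records downstream.
logger = logging.getLogger("glow_ci.llm_calls")
logging.basicConfig(level=logging.INFO)

def log_llm_call(model: str, prompt: str, response: str,
                 retrieved_chunks: list[str], latency_ms: float) -> dict:
    """Build and emit one structured log record for an LLM call."""
    record = {
        "model": model,
        "prompt": prompt,
        "response": response,
        "retrieved_chunks": retrieved_chunks,
        "latency_ms": latency_ms,
    }
    logger.info(json.dumps(record))
    return record

entry = log_llm_call(
    model="gemini",
    prompt="Summarize the incident report.",
    response="The outage lasted 12 minutes [doc-1].",
    retrieved_chunks=["doc-1: incident timeline"],
    latency_ms=830.0,
)
```

Logging both the retrieved chunks and the synthesized response in one record is what lets a downstream judge check the response against its own retrieval context.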
Key capabilities:
- Cloud Logging — all Glow CI LLM calls instrumented and observable via GCP
- Vertex AI Evaluation Service — LLM call logs feed automated evaluations using Vertex AI's built-in evaluation pipeline
- Stakeholder-defined criteria (hallucination, citation coverage, relevance)
- Continuous automated eval runs with results visible to business teams
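To make one of the stakeholder criteria concrete, here is a hedged local sketch of a citation-coverage scorer: the fraction of response sentences that carry a citation marker. The `[doc-N]` marker format is an assumption for illustration only; in the actual pipeline this scoring is done by Vertex AI's model-based evaluation metrics, not hand-rolled regexes.

```python
import re

def citation_coverage(response: str) -> float:
    """Fraction of sentences containing a [doc-N] citation marker.

    Illustrative stand-in for one eval criterion; the marker format
    is assumed, not taken from the Glow CI spec.
    """
    # Split on sentence-ending punctuation followed by whitespace.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", response.strip()) if s]
    if not sentences:
        return 0.0
    cited = sum(1 for s in sentences if re.search(r"\[doc-\d+\]", s))
    return cited / len(sentences)

# A response where one of two sentences is cited scores 0.5.
score = citation_coverage("A fact [doc-1]. Another claim.")
```

A per-sentence score like this is easy to threshold in CI (e.g. fail a run if coverage drops below a stakeholder-agreed floor), which is the shape of check the continuous eval runs above would surface.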
Stories
| # | Story | Role | Sprint |
| --- | --- | --- | --- |
| 7.2 | Instrument Glow CI with Cloud Logging + connect to Vertex AI Evaluation Service | Engineer | Sprint 5 |
📄 PRD: Part 7 — Glow CI PRD