Explain LLM Version by hadia206 · Pull Request #511 · bodo-ai/PyDough

hadia206 · 2026-04-20T20:21:19Z

Linked ticket

Closes #512

Type of change

Bug fix
New feature
Refactor
Docs / config

What changed and why?

Adds pydough.explain_llm(), a new exploration API that returns a structured description of a PyDough collection expression designed for LLM consumption.

Unlike explain() (human-readable prose for interactive debugging), explain_llm returns a stable, machine-parseable payload, either a JSON dict (format="json") or markdown (format="md"), so LLM judges can validate and self-correct generated PyDough code.

Key design decisions:

available_terms lives under a debug sub-key: scope information is preserved for human debugging but kept out of the main payload so judge prompts aren't distracted by fields irrelevant to correctness.
Implicit scoping is made explicit: PyDough's relationship-navigation scoping (e.g. COUNT(orders) inside customers.CALCULATE(...)) is correct but looks like a missing filter to a naive judge. The structured implicit_scope_note field and per-step notes surface this so evaluators can distinguish "missing filter" from "implicitly scoped via relationship navigation."
Structured error taxonomy: 13 error categories (unrecognized_term, plural_in_calculate, bad_window_per, etc.) with error_type, details, and hint fields, so the LLM gets actionable guidance without parsing the raw exception message.
conditions as structured dicts: Where step conditions expose operator, left, and right as parsed sub-dicts (not just strings), so a judge can inspect operands directly.
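To illustrate the structured conditions described in the last bullet, here is a minimal sketch of how a consumer might read one without string parsing. Only the key names operator, left, and right come from the description; the shape of each operand (the kind and text keys) is an assumption made for illustration.

```python
# Hypothetical condition payload. Only "operator", "left", and "right" are
# taken from the PR description; the operand sub-dict shape is assumed.
condition = {
    "operator": "==",
    "left": {"kind": "Reference", "text": "region.name"},
    "right": {"kind": "Literal", "text": "'ASIA'"},
}


def operand_texts(cond: dict) -> tuple[str, str, str]:
    """Return (left text, operator, right text) from a structured condition."""
    return cond["left"]["text"], cond["operator"], cond["right"]["text"]


left, op, right = operand_texts(condition)
print(f"{left} {op} {right}")
```

A judge working from this form can compare the right-hand operand to an expected literal directly instead of re-tokenizing a condition string.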

Output shape (success):

query_summary (one deterministic sentence)
steps (ordered operations)
schema (source collection, output columns + types, ordering, limit)

On error: always {"error": true, "message": ..., "error_type": ..., "details": ..., "hint": ..., "steps": [], "schema": null}.
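A minimal sketch of how a caller might branch on the two payload shapes above. The payloads here are hand-written stand-ins rather than real explain_llm output, and summarize_payload is a hypothetical helper name.

```python
import json


def summarize_payload(payload: dict) -> str:
    """Branch on the documented "error" flag before touching other keys."""
    if payload.get("error"):
        # Error payloads carry these keys according to the description above.
        return (
            f"[{payload['error_type']}] {payload['message']} "
            f"(hint: {payload['hint']})"
        )
    return f"{payload['query_summary']} ({len(payload['steps'])} steps)"


# Illustrative error payload; the field values are made up for this sketch.
error_payload = json.loads(
    '{"error": true, "message": "Unrecognized term", '
    '"error_type": "unrecognized_term", "details": {}, '
    '"hint": "Check the spelling", "steps": [], "schema": null}'
)
print(summarize_payload(error_payload))
```

Because the error shape always includes steps and schema, a consumer can index those keys unconditionally and only needs the error flag to decide how to interpret them.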

Implementation:

New shared helpers in _common.py: describe_expression, describe_subcollection_arg, generate_step_notes, generate_query_summary, _cond_texts, _collation_entry
New pydough/exploration/explain_llm.py: qualification, step-walking, schema building, error classification, markdown render.

How I tested this?

82 tests in tests/test_explain_llm.py
local testing and CI
Testing with LLM team

Notes for reviewers

hadia206 · 2026-06-12T17:51:41Z

+    """
+    return (
+        isinstance(node.ancestor_context, GlobalContext)
+        and node.ancestor_context.ancestor_context is not None


A plain root table access has GlobalContext as ancestor_context, but that GlobalContext's own ancestor_context is None. CROSS qualification inserts an extra intermediate GlobalContext whose ancestor_context IS set.
That nesting is the only reliable signal that distinguishes CROSS from a normal table access.
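The nesting check can be sketched with stand-in classes. GlobalContext and TableAccess below are simplified placeholders, not the real QDAG classes, and setting the intermediate context's ancestor to the left-hand collection is an assumption; the comment above only says that it is set.

```python
from dataclasses import dataclass


@dataclass
class GlobalContext:
    """Stand-in for the real context class; only the ancestor link is modeled."""

    ancestor_context: object = None


@dataclass
class TableAccess:
    """Stand-in for a collection node that hangs off a context."""

    name: str
    ancestor_context: object = None


def looks_like_cross(node: TableAccess) -> bool:
    """True when the node sits under a GlobalContext that itself has an ancestor."""
    return (
        isinstance(node.ancestor_context, GlobalContext)
        and node.ancestor_context.ancestor_context is not None
    )


# Plain root access: GlobalContext with no ancestor of its own.
plain = TableAccess("nations", GlobalContext())
# CROSS-style access: an intermediate GlobalContext whose ancestor is set.
crossed = TableAccess("regions", GlobalContext(ancestor_context=plain))
print(looks_like_cross(plain), looks_like_cross(crossed))
```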

hadia206 · 2026-06-12T17:55:07Z

+    """
+    text: str = expr.to_string()
+
+    match expr:


ChildReferenceExpression and BackReferenceExpression are both subclasses of Reference, so they must be matched before case Reference(); otherwise they fall through to the wrong branch.

hadia206 · 2026-06-12T18:00:17Z

+                if detail.get("kind") != "Aggregation":
+                    continue
+                for arg in detail.get("args", []):
+                    implicit_note = arg.get("implicit_scope_note")


Two distinct cases:

A non-null implicit_scope_note means the collection IS correctly scoped via relationship navigation. This is the correct PyDough pattern, so emit an informational note, not a warning. E.g. customers.CALCULATE(n=COUNT(orders))

null means the collection may be unscoped relative to a cross-product context; only then warn. E.g. nations.CROSS(regions).CALCULATE(n=COUNT(orders)) is potentially wrong, since orders has no relationship to nations or regions. In this case access_path = [], so implicit_scope_note is null. If the CALCULATE doesn't filter on any of the CROSS-introduced terms, the COUNT aggregates all orders for every row, which is likely a bug.
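The two cases can be sketched as a small decision function. The implicit_scope_note and name keys appear in the diff excerpts; the in_cross_context flag, the severity labels, and the message wording are assumptions for this sketch.

```python
from typing import Optional


def scope_message(arg: dict, in_cross_context: bool) -> Optional[tuple[str, str]]:
    """Return (severity, text) for one aggregation argument, or None."""
    note = arg.get("implicit_scope_note")
    if note is not None:
        # Scoped via relationship navigation: informational only.
        return ("info", note)
    if in_cross_context:
        # No access path and a cross-product context: possibly unscoped.
        return ("warning", f"'{arg['name']}' may aggregate over all rows")
    return None


scoped = {"name": "orders", "implicit_scope_note": "scoped via 'customers' → 'orders'"}
unscoped = {"name": "orders", "implicit_scope_note": None}
print(scope_message(scoped, in_cross_context=False))
print(scope_message(unscoped, in_cross_context=True))
```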

john-sanchez31

Great job Hadia! Please check my comments below before merging. Most are related to docstrings and type hints.

john-sanchez31 · 2026-06-17T22:11:45Z

+* `notes` — list of strings; always present, may be empty
+
+The `schema` section (when `"error"` is `False`) includes:
+* `source_collection` — the root table name, or `null` for graph-level expressions


NIT:

Suggested change

* `source_collection` — the root table name, or `null` for graph-level expressions

* `source_collection` — the root collection name, or `null` for graph-level expressions

john-sanchez31 · 2026-06-17T22:13:29Z

+  "query_summary": "Accesses 'nations', filtered to rows where region.name == 'ASIA', selecting key, name.",
+  "steps": [
+    {
+      "order": 1, "type": "GlobalContext",


Suggested change

"order": 1, "type": "GlobalContext",

"order": 1,

"type": "GlobalContext",

Same applies to all steps

john-sanchez31 · 2026-06-17T22:28:40Z

+            # customers.WHERE(...).orders).  Get that chain via child instead.
+            current = current.child
+        else:
+            nxt = getattr(current, "preceding_context", None)


john-sanchez31 · 2026-06-17T22:38:21Z

+    # Non-empty access_path → row-level scoping via relationship navigation.
+    implicit_scope_note: str | None = None
+    if access_path:
+        path_str = " → ".join(f"'{p}'" for p in access_path)


Suggested change

path_str = " → ".join(f"'{p}'" for p in access_path)

path_str: str = " → ".join(f"'{p}'" for p in access_path)

john-sanchez31 · 2026-06-17T22:40:33Z

+
+    Args:
+        `arg`: the collection arg from an ``ExpressionFunctionCall``.
+        `parent`: the parent ``Calculate`` (or other child operator) that owns


Suggested change

`parent`: the parent ``Calculate`` (or other child operator) that owns

`parent`: the parent ``CALCULATE`` (or other child operator) that owns

john-sanchez31 · 2026-06-18T14:58:48Z

+        return "\n".join(lines)
+
+    # ------------------------------------------------------------------ #
+    # Key Facts — quick-reference block at the top so the judge sees the  #


NIT: I think we shouldn't mention the judge here. Maybe replace this comment with a more general one?

john-sanchez31 · 2026-06-18T15:05:56Z

+    # Key Facts — quick-reference block at the top so the judge sees the  #
+    # most checkable facts before reading any steps.                      #
+    # ------------------------------------------------------------------ #
+    schema = result["schema"]


john-sanchez31 · 2026-06-18T15:08:26Z

+        body = _render_step_body(step)
+        lines.extend(body)
+
+        notes = step.get("notes", [])


john-sanchez31 · 2026-06-18T15:08:41Z

+
+    lines.append(f"- **Source collection:** {f'`{src}`' if src else '_(none)_'}")
+
+    output_cols = schema.get("output_columns", [])


john-sanchez31 · 2026-06-18T15:13:43Z

+            f"Expected a collection, but received an expression: "
+            f"{qualified.to_string()}. Did you mean to use explain_term?"
+        )
+        result = _error_payload(msg)


^ Let's add a pre-declaration near the top: result: dict

juankx-bodo · 2026-06-18T17:33:40Z

+    # ------------------------------------------------------------------ #
+    # 1. Subject                                                           #
+    # ------------------------------------------------------------------ #
+    cross_step = next((s for s in steps if s["type"] == "Cross"), None)


same for table_step & user_step

juankx-bodo · 2026-06-18T17:35:44Z

+                detail = s.get("term_details", {}).get(tname, {})
+                if detail.get("kind") == "Aggregation":
+                    for arg_d in detail.get("args", []):
+                        cname = arg_d.get("name")


juankx-bodo · 2026-06-18T17:36:47Z

+    # understands they filter a different level of the data.
+    top_conds: list[str] = []
+    sub_conds: list[str] = []
+    past_first_sub = False


Suggested change

past_first_sub = False

past_first_sub: bool = False

juankx-bodo · 2026-06-18T17:38:36Z

+                detail = s.get("term_details", {}).get(name, {})
+                if detail.get("kind") != "Aggregation":
+                    continue
+                fn = detail.get("function", "AGG").lower()


same for arg_name

juankx-bodo · 2026-06-18T17:41:07Z

+    # ------------------------------------------------------------------ #
+    topk_step = next((s for s in steps if s["type"] == "TopK"), None)
+    order_step = next((s for s in steps if s["type"] == "OrderBy"), None)
+    sort_step = topk_step or order_step


for topk_step, order_step, sort_step, collation, by_str, suffix & summary

juankx-bodo · 2026-06-18T17:43:01Z

+        PyDoughUnqualifiedException,
+    )
+
+    msg = str(e)


Suggested change

msg = str(e)

msg: str = str(e)

juankx-bodo · 2026-06-18T17:45:29Z

+    if "Did you mean" in msg:
+        details: dict = {}
+        # Extract the wrong term: "Unrecognized term of ...: 'TERM'."
+        term_match = _re.search(r":\s*'([^']+)'", msg)


Suggested change

term_match = _re.search(r":\s*'([^']+)'", msg)

term_match: Match[str] | None = _re.search(r":\s*'([^']+)'", msg)

juankx-bodo · 2026-06-18T17:46:25Z

+        if term_match:
+            details["term"] = term_match.group(1)
+        # Extract suggestions: "Did you mean: a, b, c?"
+        sugg_match = _re.search(r"Did you mean:\s*([^?]+)\?", msg)


Suggested change

sugg_match = _re.search(r"Did you mean:\s*([^?]+)\?", msg)

sugg_match: Match[str] | None = _re.search(r"Did you mean:\s*([^?]+)\?", msg)
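For reference, a self-contained sketch of the two annotated searches from the suggestions above. The sample message, the suggestions key, and the comma splitting are assumptions; only the two regular expressions and the term key come from the diff.

```python
import re
from re import Match


def parse_unrecognized_term(msg: str) -> dict[str, object]:
    """Extract the offending term and any suggestions from an error message."""
    details: dict[str, object] = {}
    # First quoted token that follows a colon, e.g. ": 'TERM'".
    term_match: Match[str] | None = re.search(r":\s*'([^']+)'", msg)
    if term_match:
        details["term"] = term_match.group(1)
    # Everything between "Did you mean:" and the closing question mark.
    sugg_match: Match[str] | None = re.search(r"Did you mean:\s*([^?]+)\?", msg)
    if sugg_match:
        details["suggestions"] = [s.strip() for s in sugg_match.group(1).split(",")]
    return details


# Hypothetical message text, written to fit the patterns above.
sample = "Unrecognized term of TPCH.nations: 'nme'. Did you mean: name, key?"
print(parse_unrecognized_term(sample))
```

re.search returns either a match object or None, which is what the Match[str] | None annotation records.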

juankx-bodo · 2026-06-18T17:49:30Z

+    """
+    if isinstance(e, str):
+        message = e
+        details: dict[str, object]


What would be the value of details in this case?
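For context on this question: in Python, a bare annotation such as details: dict[str, object] declares a type but binds no value. The sketch below uses a hypothetical function (not the PR's code, whose remaining lines are not shown in the excerpt) to illustrate what would happen if such a name were read before any branch assigned it.

```python
def build_details(e: object) -> dict[str, object]:
    """Hypothetical helper that only assigns ``details`` in one branch."""
    if isinstance(e, str):
        details: dict[str, object]  # annotation only: nothing is bound here
    else:
        details = {"exception_type": type(e).__name__}
    return details


try:
    build_details("plain string")
except UnboundLocalError as err:
    print("raised:", type(err).__name__)

print(build_details(ValueError("bad input")))
```

If the string branch in the PR assigns details later in the function, the bare annotation is harmless; an explicit default such as an empty dict would make that intent visible at the declaration.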

knassre-bodo

This entire logic is quite thorough and interesting! I didn't go through every single detail of some of the middle functions, but I have left some comments on places where I think we can potentially iterate a bit further at the macro level.

My biggest wish after reading everything is something that I think would be tricky to conceptually figure out, but if you can do it, it would be amazing for extensibility in the future: could we find a way to move some aspects of this, particularly stuff that is extremely specific to each type of QDAG node, into the QDAG APIs? Those classes use a LOT of ABC logic, with extensive class hierarchies, so perhaps there is a way to make this work by folding different methods/templates into some of the abstract base classes, then having the explain_llm logic case on the object ancestry (e.g. calculate vs where vs topk vs orderby vs singular all inherit from AugmentingChildOperator, so we could have explain_llm case on whether it is an instance of that in order to resolve a lot of common logic, and a lot of other things inherit from ChildAccess).

If you look into this and it seems horrifically impractical, we can disregard for now, but if it is even somewhat viable I would encourage doing it. After all, we'll need to extend this all over again for EXPLODE, and I'd prefer if it was literally impossible to miss adding any implementations because if we did, the ABC would fail due to un-implemented methods.

Some areas that I think are particularly ripe for moving into the ABCs:

describe_expression
most/all of the _build_xxx_step can just be made into a single abstract method that the classes implement
Possibly _render_step_body?
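A minimal sketch of the idea, assuming a hypothetical abstract method named build_llm_step on a hypothetical base class. None of these names exist in the QDAG API; the point is only that a subclass which forgets the method cannot be instantiated.

```python
from abc import ABC, abstractmethod


class ExplainableNode(ABC):
    """Hypothetical base class; not part of the PyDough QDAG API."""

    @abstractmethod
    def build_llm_step(self) -> dict:
        """Return this node's entry for the explain_llm ``steps`` list."""


class WhereNode(ExplainableNode):
    """Stand-in node that implements the abstract method."""

    def __init__(self, condition: str) -> None:
        self.condition = condition

    def build_llm_step(self) -> dict:
        return {"type": "Where", "condition": self.condition}


class ExplodeNode(ExplainableNode):
    """Stand-in for a new node type that forgot to implement the method."""


print(WhereNode("region.name == 'ASIA'").build_llm_step())
try:
    ExplodeNode()
except TypeError as err:
    # Instantiating a class with an unimplemented abstract method fails.
    print("TypeError:", err)
```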

Besides that, I think the overhaul to the testing approach is probably what I would prioritize the most. Actually being able to see the output will help us tell whether we are missing anything serious, or whether any glaring bugs jump out.

knassre-bodo · 2026-06-23T20:17:58Z

+``kind`` tags, and explicit scoping notes so a model can self-correct without
+parsing prose.
+
+Output schema (success)::


Should there be two colons here? I don't know the intended format.

knassre-bodo · 2026-06-23T20:18:08Z

+        }
+    }
+
+Output schema (error)::


knassre-bodo · 2026-06-23T20:21:03Z

+@pytest.fixture
+def tpch_session(get_sample_graph: graph_fetcher) -> PyDoughSession:
+    """A PyDoughSession loaded with the TPCH graph (no DB connection needed)."""
+    graph: GraphMetadata = get_sample_graph("TPCH")
+    session = PyDoughSession()
+    session.metadata = graph
+    return session
+
+
+@pytest.fixture
+def tpch_graph(get_sample_graph: graph_fetcher) -> GraphMetadata:
+    return get_sample_graph("TPCH")


These can be in conftest, and be session-level

knassre-bodo · 2026-06-23T20:22:08Z

+    def impl():
+        return nations.CALCULATE(key, name)


To make these tests easier to create/run, and potentially even parameterize, you could perhaps turn these into strings and have _run use pydough.from_string.

knassre-bodo · 2026-06-23T20:22:38Z

+def _run(
+    impl: Callable[[], UnqualifiedNode],
+    graph: GraphMetadata,
+    session: PyDoughSession,
+) -> dict:
+    """Qualify ``impl`` under ``graph`` and call ``explain_llm``."""
+    node: UnqualifiedNode = pydough.init_pydough_context(graph)(impl)()
+    return cast(dict, pydough.explain_llm(node, session=session))
+
+
+def _step(result: dict, order: int) -> dict:
+    """Return the step with the given 1-based order."""
+    return next(s for s in result["steps"] if s["order"] == order)


Let's name these functions a bit more descriptively, also add arguments/returns to the docstrings.
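One possible shape for the second helper, as a sketch. The name get_step_by_order and the ValueError on a missing step are suggestions rather than anything from the PR; the original raises StopIteration from next() in that case.

```python
def get_step_by_order(result: dict, order: int) -> dict:
    """Return the step with the given 1-based order.

    Args:
        `result`: a successful explain_llm payload containing a ``steps`` list.
        `order`: the 1-based position of the step to fetch.

    Returns:
        The step dict whose ``order`` field equals `order`.

    Raises:
        `ValueError`: if no step has the requested order.
    """
    for step in result["steps"]:
        if step["order"] == order:
            return step
    raise ValueError(f"No step with order {order}")


# Hand-written stand-in for a payload; only "order" and "type" are modeled.
example = {
    "steps": [
        {"order": 1, "type": "GlobalContext"},
        {"order": 2, "type": "TableCollection"},
    ]
}
print(get_step_by_order(example, 2)["type"])
```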

knassre-bodo · 2026-06-23T20:31:08Z

+            f"Expected a collection, but received an expression: "
+            f"{qualified.to_string()}. Did you mean to use explain_term?"
+        )
+        result = _error_payload(msg)


^ Let's add a pre-declaration near the top: result: dict

knassre-bodo · 2026-06-23T20:31:27Z

+    steps = _collect_steps(qualified)
+    schema = _build_schema(qualified)


knassre-bodo · 2026-06-23T20:34:28Z

+            # Detect window functions (RANKING, PERCENTILE, etc.) in the
+            # condition.  Their `per=` partition argument is resolved to SQL
+            # PARTITION BY during compilation and is NOT stored on the
+            # WindowCall QDAG node, so it cannot be shown in the condition
+            # text above.  Alert the judge so it doesn't mis-read a
+            # per-partition rank as a global rank.


Why only for WHERE? Technically, calculate/orderby/topk can also contain window functions inside their expression arguments.
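One way to address this would be a single recursive check that every step type can share. The sketch below uses stand-in expression classes (Expr and WindowCall here are placeholders, not the QDAG classes) and a hypothetical mapping from step type to its expressions.

```python
from dataclasses import dataclass, field


@dataclass
class Expr:
    """Stand-in expression node with a name and child arguments."""

    name: str
    args: list["Expr"] = field(default_factory=list)


@dataclass
class WindowCall(Expr):
    """Stand-in for a window function call such as RANKING."""


def contains_window_call(expr: Expr) -> bool:
    """True if the expression or any nested argument is a WindowCall."""
    if isinstance(expr, WindowCall):
        return True
    return any(contains_window_call(arg) for arg in expr.args)


def steps_needing_window_note(step_exprs: dict[str, list[Expr]]) -> list[str]:
    """Return the step types whose expressions contain a window call."""
    return [
        step_type
        for step_type, exprs in step_exprs.items()
        if any(contains_window_call(e) for e in exprs)
    ]


example = {
    "Where": [Expr("==", [WindowCall("RANKING"), Expr("1")])],
    "Calculate": [WindowCall("PERCENTILE")],
    "OrderBy": [Expr("name")],
}
print(steps_needing_window_note(example))
```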

knassre-bodo · 2026-06-23T20:36:38Z

+    Inside a ``CALCULATE``, aggregation arguments are represented as
+    ``ChildReferenceCollection`` nodes that point to the parent's child list


It's not just aggregations: it can also be singular sub-collections that are referenced to pull data into the current context (e.g. nations.CALCULATE(nation_name=name, region_name=region.name)).

knassre-bodo · 2026-06-23T20:48:58Z

+    Clause order:
+    1. Subject  — ``TableCollection`` / ``Cross`` / ``UserGeneratedCollection``
+    2. Filter   — all ``Where`` step conditions joined with ``" and "``
+    3. Partition — ``PartitionBy`` keys
+    4. Compute  — final ``Calculate`` step (refs + aggregations)
+    5. Limit/Order — ``TopK`` or ``OrderBy``


How does this order interact with far more complex queries that have multiple layers of partitioning / stepping back into the children? I'm struggling to visualize this (may help to do so with my testing suggestions).
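To make the question concrete, here is a deliberately simplified sketch of a fixed clause-order summary. It models only the Subject, Filter, and Compute clauses, and the step field names (name, conditions, terms) are assumptions; the real generate_query_summary may differ.

```python
def build_summary(steps: list[dict]) -> str:
    """Assemble one sentence from steps using a fixed clause order."""
    clauses: list[str] = []

    # 1. Subject: the first table access.
    table = next((s for s in steps if s["type"] == "TableCollection"), None)
    if table is not None:
        clauses.append(f"Accesses '{table['name']}'")

    # 2. Filter: every Where condition, regardless of where it appears.
    conds = [c for s in steps if s["type"] == "Where" for c in s["conditions"]]
    if conds:
        clauses.append("filtered to rows where " + " and ".join(conds))

    # 4. Compute: only the final Calculate step.
    calcs = [s for s in steps if s["type"] == "Calculate"]
    if calcs:
        clauses.append("selecting " + ", ".join(calcs[-1]["terms"]))

    return ", ".join(clauses) + "."


example_steps = [
    {"type": "TableCollection", "name": "nations"},
    {"type": "Where", "conditions": ["region.name == 'ASIA'"]},
    {"type": "Calculate", "terms": ["key", "name"]},
]
print(build_summary(example_steps))
```

In this simplified form, filters applied at different levels are flattened into one clause and earlier Calculate steps are dropped, which is the kind of information loss that multi-layer partitioned queries would expose.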

hadia206 added 26 commits April 20, 2026 13:20

add md version and refine JSON output

3f94bb6

udpate docs

79e60be

udpate docs

14fd93e

update Calculate

aa35092

fix where issue

99c5098

another fix

a171925

fixes

b9a87e0

enhance output

7125324

fix per explain

428695b

update notes

2461c1b

fix where with partition

b2102b1

add per group

832ed6f

attempt on join vs. list of fields

fb98a98

add key facts at the top

2ca0265

split filter in key facts

0cf0f11

regression tests and fix partition

e809a47

partition test

2df8b98

add output to keyfacts

f811f93

add error types and hints and test them

62528dc

add Expected an expression, but received a collection case

c42d416

more errors

61dfce7

Merge branch 'main' of https://github.com/bodo-ai/PyDough into Hadia/explain_llm

67b4b8b

update hint

ccb29d6

Merge branch 'Hadia/explain_llm' of https://github.com/bodo-ai/PyDough into Hadia/explain_llm

0e549c5

Merge branch 'main' of https://github.com/bodo-ai/PyDough into Hadia/explain_llm

a6aaf89

cleanup

597a658

hadia206 commented Jun 12, 2026

[run CI]

946a925

hadia206 marked this pull request as ready for review June 12, 2026 18:05

hadia206 added 2 commits June 12, 2026 11:39

[run CI] cleanup

c6e4ccf

[run CI] comment

772fdb6

hadia206 requested review from a team, john-sanchez31, juankx-bodo and knassre-bodo and removed request for a team June 12, 2026 19:03

john-sanchez31 reviewed Jun 18, 2026

juankx-bodo approved these changes Jun 18, 2026

knassre-bodo reviewed Jun 23, 2026
