Skip to content

Fix contained related places when pagination occurs#6328

Draft
nick-nlb wants to merge 2 commits into
datacommonsorg:masterfrom
nick-nlb:v2-migration-related-places
Draft

Fix contained related places when pagination occurs#6328
nick-nlb wants to merge 2 commits into
datacommonsorg:masterfrom
nick-nlb:v2-migration-related-places

Conversation

@nick-nlb
Copy link
Copy Markdown
Contributor

@nick-nlb nick-nlb commented Jun 1, 2026

Issue

b/515861634

Description

When using Spanner, the related places at the bottom of a the places page could be either incomplete or not shown at all. An example of this is the place page for San Mateo County, where a BT-backed mixer will show a set of places, and a Spanner-backed mixer will not (see screenshots below).

The reason for this was Spanner pagination combined with the way that the Flask backend was making the request.

Flask was requesting all containedInPlace nodes, regardless of the type, and then filtering out to show relevant child place types within Flask. San Mateo, as an example, contained hundreds of non-place contained nodes (EPAReportingFacility, VehicleCrashIncidents) that are pulled out by the query before any relevant places appear. With BT, this worked (albeit with unnecessarily large payloads). With Spanner, the pagination means that no relevant places are retrieved on the first page, and no places are then displayed.

Solution

This PR implements two approaches to solve this problem. The first is to resolve all paginated pages in the raw_property_values call. This involved updating the raw_property_values in a similar way to how property_values had already been updated, in order to allow pagination to be handled.

This in itself does solve the problem, but the multiple Spanner calls to resolve the pagination add in latency (for the sake of retrieving nodes, many of which we do not need).

The second arm of the solution is to update the property values call itself to filter by the place types that we are interested, giving us a payload that only consists of types that we will use. In practice, with the Mixer-level filtering in place, pagination will be invoked less often.

Testing

Local

In master, this page will produce results locally when using a Mixer backed with BT, and no results when using a mixer backed with Spanner.

In this branch, Spanner will give the same results as BT.

San Mateo County

Production

Production can also serve as a working reference of the page (see the "Places in San Mateo County" section at the bottom of the page).

San Mateo County

Screenshot

The following is the section that is missing with Spanner in master (but available otherwise, and now with this PR in Spanner).

image

…lter for places, rather than pulling all containedInPlace nodes (regardless of the type) and filtering them out in Flask. Add pagination to the `raw_property_values` response.
@gemini-code-assist
Copy link
Copy Markdown
Contributor

Summary of Changes

Hello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request addresses an issue where related places were not appearing on place pages when using Spanner-backed mixers. By enabling pagination for property value lookups and implementing server-side filtering for relevant place types, the system now correctly retrieves and displays child places even when large amounts of unrelated data exist in the graph.

Highlights

  • Pagination Support: Updated the raw_property_values function to support pagination by introducing a max_pages parameter and integrating dc.v2node_paginated.
  • Constraint Filtering: Optimized data retrieval by adding type-based constraints to the containedInPlace property fetch, reducing the payload size and avoiding unnecessary node processing.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces paginated node fetching in server/lib/fetch.py when the USE_V2_API feature flag is enabled, and updates get_child_place_types in server/routes/place/utils.py to filter child places by expected types using constraints. Feedback suggests simplifying duplicated string formatting in fetch.py by extracting the expression into a local variable. Additionally, it is recommended to limit max_pages to 1 (or 2) instead of None in utils.py to prevent potential latency and timeout issues when querying places with a large number of child nodes.

Comment thread server/lib/fetch.py
Comment on lines +646 to +651
raw_property_values_response = fetch.raw_property_values(
nodes=[place.dcid],
prop="containedInPlace",
out=False,
constraints=constraints,
max_pages=None)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Since get_child_place_types only needs to identify the types of child places that exist (to populate the categories/tabs), fetching all pages of child places (max_pages=None) is unnecessary and can introduce significant latency or timeouts for places with thousands of child nodes (e.g., countries or large states with many cities/zip codes). Since we are already filtering by the expected child types using the constraints parameter, the first page of results is highly likely to contain all the relevant child place types. Consider limiting max_pages to 1 (or 2) to avoid unnecessary sequential API calls.

Suggested change
raw_property_values_response = fetch.raw_property_values(
nodes=[place.dcid],
prop="containedInPlace",
out=False,
constraints=constraints,
max_pages=None)
raw_property_values_response = fetch.raw_property_values(
nodes=[place.dcid],
prop="containedInPlace",
out=False,
constraints=constraints,
max_pages=1)

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't agree with the principle being stated here, that "highly likely" to contain all the relevant child place types is enough. I would argue that it might be highly likely that the first page will not contain all place types in a place that has so many pages that it overruns the pagination.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As a follow-up to this: while the original assertion that the code-assist reviewer made was incorrect, there is an underlying inefficiency where we pull all the child places in order to get the child place types, collate those place types and then use them to once again pull all the child places.

I've created a new function that returns both the places types (as the original function did) as well as the places, and added a TODO to consider converting other uses of the original function to this in order to get rid of what is a now very similar function pair with repeated code.

…ng function does) but that returns both the types and the return child places, so that we do not have to make another separate call for the same information.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant