Componentize dagster_gcp resources#33354
Open
michalcabir-ui wants to merge 4 commits into dagster-io:master
OwenKephart
reviewed
Feb 3, 2026
python_modules/libraries/dagster-gcp/dagster_gcp/components/bigquery.py
Outdated
OwenKephart
reviewed
Feb 3, 2026
python_modules/libraries/dagster-gcp/dagster_gcp/components/dataproc.py
Outdated
OwenKephart
reviewed
Feb 3, 2026
python_modules/libraries/dagster-gcp/dagster_gcp/components/gcs.py
Outdated
OwenKephart
reviewed
Feb 3, 2026
python_modules/libraries/dagster-gcp/dagster_gcp/components/io_managers.py
Outdated
OwenKephart
reviewed
Feb 3, 2026
python_modules/libraries/dagster-gcp/dagster_gcp_tests/component_tests/test_gcp_components.py
Outdated
OwenKephart
requested changes
Feb 3, 2026
Contributor
OwenKephart
left a comment
This looks quite close!
Just had a few smaller comments, and a couple of small updates to make to the tests.
You can just get rid of the IOManager component. I think it's unclear if/how we'll want IOManagers represented in the components system, so for now we'll ignore them.
Summary & Motivation
This PR implements the foundational infrastructure for dagster-gcp Components, enabling YAML-based configuration for Google Cloud resources in Dagster projects.
I used explicit field duplication rather than dynamic field copying. This ensures strict type safety and clear documentation, and avoids runtime surprises (based on what I did in the AWS PR).
Key Design Decisions:
Explicit Pydantic Models: All components (BigQueryResourceComponent, GCSResourceComponent, etc.) explicitly define their fields using pydantic.Field.
GCS: GCSResourceComponent, GCSFileManagerResourceComponent
IO managers: Left out of this PR per review; may be added later.
Dataproc: DataprocResourceComponent (marked as Beta; strictly enforces required fields such as project_id, region, and cluster_name to match the underlying resource).
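For illustration, the explicit-field approach described above might look roughly like the sketch below. It uses plain pydantic only. The class name `DataprocComponentSketch`, the helper `try_build`, the field descriptions, and the `cluster_config_dict` field name are assumptions made for this example, and the real components in this PR also build on Dagster's component base classes, which are omitted here.

```python
from typing import Any, Dict, Optional

from pydantic import BaseModel, Field, ValidationError


class DataprocComponentSketch(BaseModel):
    """Hypothetical stand-in for a component model; not the PR's actual class."""

    # Required fields are declared explicitly, with no default value.
    project_id: str = Field(..., description="GCP project that owns the cluster.")
    region: str = Field(..., description="Region the cluster runs in.")
    cluster_name: str = Field(..., description="Name of the Dataproc cluster.")
    # Optional nested configuration is accepted as a plain dictionary
    # (field name assumed for this sketch).
    cluster_config_dict: Optional[Dict[str, Any]] = Field(
        default=None, description="Nested cluster configuration."
    )


def try_build(**kwargs: Any) -> Optional[DataprocComponentSketch]:
    """Return a validated model, or None if validation fails."""
    try:
        return DataprocComponentSketch(**kwargs)
    except ValidationError:
        return None
```

Because every field is written out, a missing required value such as `region` is rejected at validation time rather than surfacing later when the resource is used.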
How I Tested These Changes
I verified the implementation using a comprehensive test suite in dagster_gcp_tests/component_tests/test_gcp_components.py:
Sandbox Integration: Validated full YAML-to-Resource lifecycles using the create_defs_folder_sandbox pattern for all implemented components.
Field Synchronization: Implemented automated tests to ensure Component fields remain a superset of the underlying Resource fields. If a Resource adds a field in the future, the test fails, reminding us to update the Component.
Complex Configuration: Verified that DataprocResourceComponent correctly handles complex nested configurations (dictionaries) and required fields.
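The field-synchronization check described above can be sketched as follows. This is a minimal illustration, not the test from this PR: it uses standard-library dataclasses as stand-ins for the Resource and Component models, and the names `ResourceStandIn`, `ComponentStandIn`, `missing_component_fields`, and the component-only `resource_key` field are hypothetical.

```python
from dataclasses import dataclass, fields
from typing import Optional, Set, Type


@dataclass
class ResourceStandIn:
    """Hypothetical stand-in for an underlying resource."""

    project: Optional[str] = None
    location: Optional[str] = None


@dataclass
class ComponentStandIn:
    """Hypothetical stand-in for the matching component."""

    project: Optional[str] = None
    location: Optional[str] = None
    resource_key: Optional[str] = None  # component-only field (assumed)


def field_names(cls: Type) -> Set[str]:
    """Collect the declared field names of a dataclass."""
    return {f.name for f in fields(cls)}


def missing_component_fields(resource_cls: Type, component_cls: Type) -> Set[str]:
    """Return resource fields that the component does not declare."""
    return field_names(resource_cls) - field_names(component_cls)


def test_component_fields_cover_resource_fields() -> None:
    # Fails as soon as the resource gains a field the component lacks.
    missing = missing_component_fields(ResourceStandIn, ComponentStandIn)
    assert not missing, f"Component is missing fields: {sorted(missing)}"
```

The check is one-directional on purpose: the component may carry extra fields of its own, but it must never lag behind the resource.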
Changelog
Added foundational Component infrastructure and registry entry points.
Implemented BigQueryResourceComponent, GCSResourceComponent, and GCSFileManagerResourceComponent.
Implemented DataprocResourceComponent (Beta) with support for cluster config dictionaries.
Docs: Added a Markdown guide for GCP Components in docs/docs/integrations/libraries/gcp/component.md and updated docs/sphinx/sections/integrations/libraries/gcp/dagster-gcp.rst.