
fix(source-wordpress): add lookback window to incremental streams and fix tab character in pages stream #76063

Draft
devin-ai-integration[bot] wants to merge 3 commits into master from devin/1775190438-fix-wordpress-dst-lookback-window


devin-ai-integration bot commented Apr 3, 2026

What

Resolves https://github.com/airbytehq/oncall/issues/11859:

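The WordPress connector's incremental streams (editor_blocks, pages, media, comments) use local-timezone cursor fields (modified / date). These values are non-monotonic during DST fall-back transitions: when the clock repeats an hour, a record modified after the cursor was saved can carry an earlier local timestamp than the cursor and be skipped, which risks data loss.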
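
To make the non-monotonic cursor concrete, here is a minimal sketch. It assumes a hypothetical site on US Eastern time during the 2 November 2025 fall-back; the fixed offsets and the helper `local_wall_clock` are illustrative and are not connector code.

```python
from datetime import datetime, timedelta, timezone

# Assumption for illustration: a site on US Eastern time. On 2025-11-02 the
# offset changes from UTC-4 (EDT) to UTC-5 (EST) at 06:00 UTC.
FALL_BACK_UTC = datetime(2025, 11, 2, 6, 0, tzinfo=timezone.utc)


def local_wall_clock(instant_utc: datetime) -> datetime:
    """Return the naive local timestamp a site in this zone would report."""
    if instant_utc < FALL_BACK_UTC:
        offset = timedelta(hours=-4)
    else:
        offset = timedelta(hours=-5)
    return (instant_utc + offset).replace(tzinfo=None)


first_edit = datetime(2025, 11, 2, 5, 45, tzinfo=timezone.utc)   # before the shift
second_edit = datetime(2025, 11, 2, 6, 15, tzinfo=timezone.utc)  # 30 minutes later

cursor = local_wall_clock(first_edit)         # local 01:45
later_record = local_wall_clock(second_edit)  # local 01:15

# The later edit carries an earlier local timestamp, so a plain
# "modified after cursor" filter would skip it.
skipped_without_lookback = later_record <= cursor
```

In this sketch the second edit happens 30 minutes after the first, yet its local timestamp is 30 minutes earlier than the saved cursor.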

Additionally, the pages stream had an embedded tab character in its field_name: "modified_after\t". This likely caused the WordPress API to not recognize the modified_after parameter, meaning server-side filtering was silently broken for pages.
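
A small sketch of why the trailing tab matters, using only Python's standard URL encoding. The cursor value is hypothetical, and how WordPress treats the unknown parameter is an assumption consistent with the "likely" above, not something verified here.

```python
from urllib.parse import parse_qs, urlencode

cursor_value = "2026-01-01T00:00:00"  # hypothetical cursor value

broken_query = urlencode({"modified_after\t": cursor_value})
fixed_query = urlencode({"modified_after": cursor_value})

# The tab is percent-encoded as %09 and remains part of the parameter name,
# so a server parsing the query sees a name other than "modified_after".
broken_names = list(parse_qs(broken_query))
fixed_names = list(parse_qs(fixed_query))
```

Here `broken_query` begins with `modified_after%09=`, and parsing it back yields the name with the tab still attached.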

How

  1. Added lookback_window: PT1H to the DatetimeBasedCursor on all four incremental streams (editor_blocks, comments, pages, media). This re-fetches 1 hour of data before the cursor on each sync, covering the maximum 1-hour DST shift. Duplicates are handled by destination deduplication.
  2. Removed the tab character from the pages stream's start_time_option.field_name ("modified_after\t" → modified_after).
  3. Bumped version 0.0.48 → 0.0.49.
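
The effect of step 1 can be sketched as follows. This is a simplified model of a lookback window plus dedup, not the CDK's implementation; `request_start`, `dedupe_by_id`, and the sample records are hypothetical.

```python
from datetime import datetime, timedelta

LOOKBACK = timedelta(hours=1)  # stands in for the ISO 8601 duration PT1H


def request_start(saved_cursor: datetime, lookback: timedelta = LOOKBACK) -> datetime:
    """Start of the next request window: the saved cursor minus the lookback."""
    return saved_cursor - lookback


def dedupe_by_id(records: list[dict]) -> list[dict]:
    """Keep the last-seen copy of each id (a stand-in for destination dedup)."""
    latest: dict[int, dict] = {}
    for record in records:
        latest[record["id"]] = record
    return list(latest.values())


saved_cursor = datetime(2025, 11, 2, 1, 45)
start = request_start(saved_cursor)  # one hour before the saved cursor

previous_sync = [{"id": 1, "modified": "2025-11-02T01:45:00"}]
refetched = [
    {"id": 1, "modified": "2025-11-02T01:45:00"},  # duplicate from the lookback
    {"id": 2, "modified": "2025-11-02T01:15:00"},  # edited in the repeated hour
]
merged = dedupe_by_id(previous_sync + refetched)
```

The record with id 2 has a local timestamp before the saved cursor but after `start`, so it is picked up, and the re-fetched copy of id 1 collapses into a single row.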

Declarative-First Evaluation

Used the built-in lookback_window property of DatetimeBasedCursor — a one-line manifest addition per stream. No custom Python components needed. Prior art: source-convertkit, source-mantle, and others use lookback_window: PT1H for the same pattern.

Test Coverage

Created unit_tests/test_manifest.py with 6 tests:

  • Parametrized test verifying lookback_window: PT1H on all 4 incremental streams
  • Regression test for the pages field_name tab character
  • Full-manifest recursive scan ensuring no field_name contains a tab
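
For reviewers without the diff open, a recursive scan of this kind could look roughly like the sketch below. It is not the code in unit_tests/test_manifest.py; the helper names and the trimmed sample manifests are hypothetical, and a real test would load manifest.yaml instead of using inline dicts.

```python
from typing import Any, Iterator


def iter_field_names(node: Any) -> Iterator[str]:
    """Yield every string stored under a "field_name" key, at any depth."""
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "field_name" and isinstance(value, str):
                yield value
            else:
                yield from iter_field_names(value)
    elif isinstance(node, list):
        for item in node:
            yield from iter_field_names(item)


def field_names_with_tabs(manifest: dict) -> list[str]:
    """Return every field_name value that contains a tab character."""
    return [name for name in iter_field_names(manifest) if "\t" in name]


# Hypothetical, heavily trimmed stand-ins for the parsed manifest.
fixed_manifest = {
    "streams": [
        {
            "name": "pages",
            "incremental_sync": {
                "lookback_window": "PT1H",
                "start_time_option": {"field_name": "modified_after"},
            },
        }
    ]
}
broken_manifest = {
    "streams": [
        {
            "name": "pages",
            "incremental_sync": {
                "start_time_option": {"field_name": "modified_after\t"},
            },
        }
    ]
}

fixed_offenders = field_names_with_tabs(fixed_manifest)
broken_offenders = field_names_with_tabs(broken_manifest)
```

Scanning the whole manifest rather than only the pages stream guards against the same typo reappearing in another stream.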

This connector has no integration test secrets (all acceptance tests are bypassed), so manifest-level validation is the appropriate testing approach. No live API testing was performed.

Review guide

  1. airbyte-integrations/connectors/source-wordpress/manifest.yaml — the core fix. Four one-line lookback_window: PT1H additions and the tab removal on pages.
  2. airbyte-integrations/connectors/source-wordpress/unit_tests/test_manifest.py — new tests
  3. airbyte-integrations/connectors/source-wordpress/metadata.yaml — version bump
  4. docs/integrations/sources/wordpress.md — changelog entry

Key things to verify:

  • The tab character is actually removed from the pages stream (field_name: modified_after without quotes, line ~289 in diff)
  • lookback_window: PT1H is present on all 4 incremental streams, not just a subset
  • Fixing the pages field_name means server-side filtering will now actually work — this is a positive behavior change but may alter data volumes for pages syncs (previously the API may have returned unfiltered results)

User Impact

  • Incremental syncs on editor_blocks, pages, media, and comments will now re-fetch 1 extra hour of data per sync, preventing data loss during DST transitions. This may produce some duplicate records, which are handled by destination dedup.
  • The pages stream will now correctly send the modified_after parameter to the WordPress API, enabling proper server-side filtering that was previously broken by the tab character.

Can this PR be safely reverted and rolled back?

  • YES 💚

Link to Devin session: https://app.devin.ai/sessions/f007d5e6b4024e15aca2f6e791014ca1

… fix tab character in pages stream

- Add lookback_window: PT1H to editor_blocks, comments, pages, and media streams
  to prevent data loss during DST fall-back transitions
- Fix tab character bug in pages stream field_name (modified_after\t -> modified_after)
- Bump version from 0.0.48 to 0.0.49

Co-Authored-By: bot_apk <apk@cognition.ai>



github-actions bot commented Apr 3, 2026

source-wordpress Connector Test Results

3 tests, 1 suite, 1 file: 1 passed ✅, 2 skipped 💤, 0 failed ❌, in 3s ⏱️

Results for commit aec4d9e.

♻️ This comment has been updated with latest results.


github-actions bot commented Apr 3, 2026

Deploy preview for airbyte-docs ready!

✅ Preview
https://airbyte-docs-a4yq3r9c2-airbyte-growth.vercel.app

Built with commit aec4d9e.
This pull request is being automatically deployed with vercel-action
