Skip to content

chore: remove sql feature and sqlparser dependency from DataFusion#1437

Draft
Copilot wants to merge 5 commits intomainfrom
copilot/remove-sql-feature-dependency
Draft

chore: remove sql feature and sqlparser dependency from DataFusion#1437
Copilot wants to merge 5 commits intomainfrom
copilot/remove-sql-feature-dependency

Conversation

Copy link
Contributor

Copilot AI commented Feb 27, 2026

Removes the sql feature (and its transitive datafusion-sql/sqlparser dependencies) from our DataFusion integration, updating to DataFusion 53.0.0 which fixes the recursive_protection feature so it no longer activates sql.

Cargo.toml

  • datafusion: default-features = false, explicitly list all needed features including recursive_protection (now safe in v53) — omitting sql
  • datafusion-expr, datafusion-functions-nested: default-features = false to prevent their sql default from re-enabling datafusion-expr/sql via feature unification
  • datafusion-datasource-avro added as a direct workspace dependency (the avro feature in datafusion v52 pulled in datafusion-sql; in v53 it is clean, but avro types are imported directly from datafusion-datasource-avro for clarity)
  • Removed separate recursive_protection entries for individual crates (datafusion-common, datafusion-expr, datafusion-physical-expr, datafusion-optimizer, datafusion-physical-optimizer) — datafusion/recursive_protection now handles all sub-crates correctly in v53

wildcard.rs

Replace sqlparser::ast types and expr_to_sql with the non-sql equivalents from datafusion_expr::expr:

// Before
use datafusion::sql::unparser::expr_to_sql;
use datafusion::sql::sqlparser::ast;
let ilike = wildcard_options.ilike_pattern.map(|x| ast::IlikeSelectItem { pattern: x });
let item = ast::ReplaceSelectElement { expr: expr_to_sql(&expression)?, ... };

// After
let ilike = wildcard_options.ilike_pattern.map(|x| expr::IlikeSelectItem { pattern: x });
let item = expr::ReplaceSelectElement { expr: expression.to_string(), ... };

A make_ident helper reduces repetition when constructing expr::Ident with default quote_style/span values.

error.rs

Remove the DataFusionError::SQL match arm — that variant is gated behind datafusion-common's sql feature, which is now disabled.

avro/mod.rs and codec.rs

Import AvroFormat and AvroSource directly from datafusion_datasource_avro rather than via datafusion::datasource, avoiding the need for the avro feature on the datafusion workspace entry.


🔒 GitHub Advanced Security automatically protects Copilot coding agent pull requests. You can protect all pull requests by enabling Advanced Security for your repositories. Learn more about Advanced Security.

Co-authored-by: linhr <5601366+linhr@users.noreply.github.com>
Copilot AI changed the title [WIP] Remove sql feature and sqlparser dependency from DataFusion chore: remove sql feature and sqlparser dependency from DataFusion Feb 27, 2026
@linhr linhr marked this pull request as ready for review February 27, 2026 09:59
@linhr
Copy link
Contributor

linhr commented Feb 27, 2026

@copilot Please investigate why datafusion-sql and sqlparser still show up in Cargo.lock.

@linhr linhr marked this pull request as draft February 27, 2026 10:04
@codecov
Copy link

codecov bot commented Feb 27, 2026

Codecov Report

❌ Patch coverage is 3.44828% with 28 lines in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
...ates/sail-plan/src/resolver/expression/wildcard.rs 3.44% 28 Missing ⚠️
@@            Coverage Diff             @@
##             main    #1437      +/-   ##
==========================================
- Coverage   75.12%   75.12%   -0.01%     
==========================================
  Files         870      870              
  Lines      114384   114393       +9     
==========================================
  Hits        85936    85936              
- Misses      28448    28457       +9     
Flag Coverage Δ *Carryforward flag
ibis-tests 19.24% <ø> (+<0.01%) ⬆️ Carriedforward from 682a98e
python-unit-tests 56.35% <3.44%> (-0.01%) ⬇️
rust-slow-tests 43.05% <ø> (+<0.01%) ⬆️ Carriedforward from 682a98e
rust-unit-tests 38.69% <3.44%> (-0.01%) ⬇️
spark-tests 33.15% <ø> (+<0.01%) ⬆️ Carriedforward from 682a98e

*This pull request uses carry forward flags. Click here to find out more.

Files with missing lines Coverage Δ
crates/sail-common-datafusion/src/error.rs 49.39% <ø> (+0.58%) ⬆️
crates/sail-data-source/src/formats/avro/mod.rs 14.28% <ø> (ø)
crates/sail-execution/src/codec.rs 17.21% <ø> (ø)
...ates/sail-plan/src/resolver/expression/wildcard.rs 61.13% <3.44%> (-2.80%) ⬇️
🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

…and sqlparser from compiled deps

Co-authored-by: linhr <5601366+linhr@users.noreply.github.com>
Copy link
Contributor Author

Copilot AI commented Feb 27, 2026

@copilot Please investigate why datafusion-sql and sqlparser still show up in Cargo.lock.

They still appeared because datafusion's avro feature directly enables datafusion-sql:

avro = [
    "datafusion-common/avro",
    "datafusion-datasource-avro",
    "unicode_expressions",
    "datafusion-sql/recursive_protection",
    "datafusion-sql",   # <-- this pulled in datafusion-sql
]

Fixed in facc882 by removing avro from datafusion's features and instead depending directly on datafusion-datasource-avro. The imports in sail-data-source and sail-execution are updated to use datafusion_datasource_avro directly.

Note: datafusion-sql and sqlparser still appear in Cargo.lock as resolved optional dependencies of datafusion, datafusion-common, and datafusion-expr (all of which have sqlparser as an optional dep for their sql feature). This is standard Cargo behavior — the resolver records all optional deps even when unactivated. They are confirmed NOT compiled: cargo build --message-format=json shows neither sqlparser nor datafusion-sql in the set of compiled artifacts.

@lonless9
Copy link
Contributor

@copilot Merge from origin/main branch and resolve the conflicts. Then follow apache/datafusion#20625 that mentioned

When enabling the recursive_protection feature for the datafusion crate, the sql feature is enabled. This is undesirable if the downstream project would like the sql feature to be off.

This issue has been handled in the DataFusion 53 version so now again try to remove the sql feature and sqlparser dependency.

Copy link
Contributor Author

Copilot AI commented Mar 26, 2026

@copilot Merge from origin/main branch and resolve the conflicts. Then follow apache/datafusion#20625 that mentioned

When enabling the recursive_protect...

Done in d5ab5c5. Merged from origin/main (DataFusion 53.0.0) and resolved the conflict.

With DataFusion 53, recursive_protection no longer activates sql (apache/datafusion#20625), so the Cargo.toml is simplified:

  • datafusion: default-features = false, features include recursive_protection but NOT sql
  • datafusion-expr and datafusion-functions-nested: still need default-features = false since their own defaults include sql
  • Removed the separate recursive_protection entries for datafusion-common, datafusion-optimizer, etc. (now handled via datafusion/recursive_protection)

sqlparser and datafusion-sql are confirmed NOT compiled (verified via --message-format=json).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants