feat: Allow scalars in horizontal expressions#3435

Open
FBruzzesi wants to merge 8 commits into main from test/sum-horizontal-with-scalar

Conversation

@FBruzzesi
Member

Description

Some tiny extra gymnastics to allow this for pandas, pyarrow and dask. All other backends were solved by some refactoring.

What type of PR is this? (check all applicable)

  • 💾 Refactor
  • ✨ Feature
  • 🐛 Bug Fix
  • 🔧 Optimization
  • 📝 Documentation
  • ✅ Test
  • 🐳 Other

Related issues

Checklist

  • Code follows style guide (ruff)
  • Tests added
  • Documented the changes

@FBruzzesi FBruzzesi added the enhancement New feature or request label Feb 1, 2026
pc.min_element_wise, [s.native for s in series], init_series.native
series = tuple(chain.from_iterable(expr(df) for expr in exprs))
result = reduce(
lambda s1, s2: s1._with_binary(pc.min_element_wise, s2), series
Member Author

Let _with_binary take care of the broadcasting for scalars

Member

@dangotbanned dangotbanned Feb 1, 2026

pc.min_element_wise doesn't need broadcasting btw

Member Author

@FBruzzesi FBruzzesi Feb 1, 2026

Given a scalar input, by the time we get here it's a length-one array, not a scalar anymore, so we get a shape mismatch

Member

@dangotbanned dangotbanned Feb 1, 2026

Oh I see 🤦

I was thinking of this behavior:

(Scalar, Scalar) -> Scalar
(Scalar, Array) -> Array
(Array, Array) -> Array

Which is what broadcasting is, but I forgot the Scalar preservation is only in #2572
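
The three cases above, and the length-one-array failure described earlier in this thread, can be modelled with a small pure-Python sketch. It is a toy model only: plain lists stand in for arrays, any non-list value stands in for a scalar, and `binary_broadcast` and `min_horizontal` are hypothetical names, not pyarrow or narwhals functions.

```python
from functools import reduce


def is_scalar(value):
    # Toy convention for this sketch: anything that is not a list is a scalar.
    return not isinstance(value, list)


def binary_broadcast(op, left, right):
    """Apply `op` element-wise, broadcasting scalars against arrays."""
    if is_scalar(left) and is_scalar(right):
        return op(left, right)  # (Scalar, Scalar) -> Scalar
    if is_scalar(left):
        return [op(left, r) for r in right]  # (Scalar, Array) -> Array
    if is_scalar(right):
        return [op(l, right) for l in left]  # (Array, Scalar) -> Array
    if len(left) != len(right):
        # A scalar already wrapped as a length-one array ends up here.
        raise ValueError(f"shape mismatch: {len(left)} vs {len(right)}")
    return [op(l, r) for l, r in zip(left, right)]  # (Array, Array) -> Array


def min_horizontal(*inputs):
    # Fold the inputs pairwise, as the reduce call in the diff does.
    return reduce(lambda a, b: binary_broadcast(min, a, b), inputs)
```

In this model `min_horizontal([1, 5, 3], 2)` broadcasts the scalar, while `min_horizontal([1, 5, 3], [2])` raises, because the length-one list is treated as an array rather than a scalar.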

Member Author

@FBruzzesi FBruzzesi Feb 1, 2026

0cad6c1 should reduce the overhead of "align"-ing first and then reduce-ing

Member

@FBruzzesi 😍😍😍

Comment on lines 175 to 179
expr_results = [s for _expr in exprs for s in _expr(df)]
series = align_series_full_broadcast(df, *(s.fillna(0) for s in expr_results))
non_na = align_series_full_broadcast(
df, *(1 - s.isna() for s in expr_results)
expr_results = align_series_full_broadcast(
df, *[s for _expr in exprs for s in _expr(df)]
)
series = (s.fillna(0) for s in expr_results)
non_na = (1 - s.isna() for s in expr_results)
Member Author

Broadcast first (once) to ensure we have series (i.e. no more dask_expr.Scalar objects); after that, fillna() and isna() can be applied safely
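
The ordering described here (broadcast once, then fill nulls and count non-nulls) can be illustrated with a pure-Python sketch. It makes several assumptions: lists stand in for series, `None` stands in for a null, the helper names are hypothetical, and a row-wise mean is used as one example of a reduction that needs both the filled values and the non-null counts. Which function the diff belongs to is not stated in this thread.

```python
def broadcast_all(length, *inputs):
    """Turn every scalar input into a full-length column (done once, up front)."""
    return [v if isinstance(v, list) else [v] * length for v in inputs]


def mean_horizontal(length, *inputs):
    columns = broadcast_all(length, *inputs)
    # From here on every input is a column, so per-column null handling is safe.
    filled = [[0 if x is None else x for x in col] for col in columns]
    non_null = [[0 if x is None else 1 for x in col] for col in columns]
    totals = [sum(row) for row in zip(*filled)]
    counts = [sum(row) for row in zip(*non_null)]
    # A row whose inputs are all null has no defined mean.
    return [t / c if c else None for t, c in zip(totals, counts)]
```

Broadcasting first means the null handling never has to special-case a scalar input, which is the point of the reordering in the diff.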

Comment on lines -238 to +240
series = list(chain.from_iterable(expr(df) for expr in exprs))
series = self._series._align_full_broadcast(
*chain.from_iterable(expr(df) for expr in exprs)
)
Member Author

Need to broadcast scalars first

Member Author

@dangotbanned unlike the arrow case, here we need to align the series first because of how we implement the operation afterwards


def sum_horizontal(*exprs: IntoExpr | Iterable[IntoExpr]) -> Expr:
def sum_horizontal(
*exprs: PythonLiteral | IntoExpr | Iterable[PythonLiteral | IntoExpr],
Member Author

I will comment here to exemplify the behavior. Polars' definition of IntoExpr already includes Python literals. They distinguish between:

# Inputs that can convert into a `col` expression
IntoExprColumn: TypeAlias = Union["Expr", "Series", str]
# Inputs that can convert into an expression
IntoExpr: TypeAlias = PythonLiteral | IntoExprColumn | None

I am not sure if it's reasonable for us to eventually align with those definitions and that distinction
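
To make the distinction concrete, here is a minimal sketch of how variadic inputs of this shape could be flattened and classified. Everything in it is hypothetical: `flatten_inputs`, `classify`, `parse_inputs` and the `"col"`/`"lit"` tags are illustrative names, not narwhals or Polars API. It assumes that a `str` is read as a column name (as in the `IntoExprColumn` alias quoted above) and that any other non-iterable value is read as a literal.

```python
from collections.abc import Iterable


def flatten_inputs(*exprs):
    """Flatten one level of nesting; str and bytes count as single inputs."""
    flat = []
    for item in exprs:
        if isinstance(item, Iterable) and not isinstance(item, (str, bytes)):
            flat.extend(item)
        else:
            flat.append(item)
    return flat


def classify(value):
    # Assumption for this sketch: strings name columns, everything else is a literal.
    if isinstance(value, str):
        return ("col", value)
    return ("lit", value)


def parse_inputs(*exprs):
    return [classify(v) for v in flatten_inputs(*exprs)]
```

For example, `parse_inputs("a", 1, ["b", 2.5])` yields two column references and two literals, so a mixed call such as `sum_horizontal("a", 1)` has an unambiguous reading under this assumption.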

Member

I like this in polars too.

I think an issue in narwhals could be the special-casing for pandas non-str column "names".
But I'm not sure how far that support extends, e.g. do we actually support it everywhere?

dangotbanned added a commit that referenced this pull request Feb 2, 2026
See #3435 (comment)

Also
- tweaked some typing
- added docs
- included `coalesce` in the same path
Labels

enhancement New feature or request

Development

Successfully merging this pull request may close these issues.

enh: allow for scalars in sum_horizontal

2 participants