Conversation
Need to see if this also passes downstream ci Related (bdbb1c5)
| def __call__(self) -> Self: | ||
| # NOTE: (Internal doc) | ||
| # But, why? | ||
| # | ||
| # Let's say we have a function that looks like this: | ||
| # | ||
| # >>> import pyarrow as pa | ||
| # >>> from narwhals.dtypes import DType | ||
| # >>> from narwhals.typing import NonNestedDType | ||
| # | ||
| # >>> def convert(dtype: DType | type[NonNestedDType]) -> pa.DataType: ... | ||
| # | ||
| # | ||
| # If we need to resolve the union to a *class*, we can use: | ||
| # | ||
| # >>> always_class = dtype.base_type() | ||
| # | ||
| # But what if instead, we need an *instance*?: | ||
| # | ||
| # >>> always_instance_1 = dtype if isinstance(dtype, DType) else dtype() | ||
| # >>> always_instance_2 = dtype() if isinstance(dtype, type) else dtype | ||
| # | ||
| # By defining `__call__`, we can save an instance check and keep things readable: | ||
| # | ||
| # >>> always_instance = dtype() | ||
| # >>> always_class = dtype.base_type() | ||
| return self |
There was a problem hiding this comment.
I'm intentionally not making this public API, but want there to be a reminder on how it helps 🙂
There was a problem hiding this comment.
100% agree to avoid making this public 👌🏼
DType.__call__DType.__call__
| def _resolve_base_type_dtype(into: IntoDType, /) -> tuple[type[DType], DType]: | ||
| return into.base_type(), into() |
There was a problem hiding this comment.
I suppose this is a TL;DR
We can extract the class and instance from IntoDType - only paying the cost of initialization if we needed to anyway
|
I'm not a huge fan of relying on from __future__ import annotations
from typing import ClassVar
class Instance[T]:
def __get__(self, instance: T | None, owner: type[T]) -> T:
if instance is None:
return owner()
return instance
class Dtype:
instance: ClassVar[Instance[Dtype]] = Instance()
print(Dtype().instance) # <__main__.Dtype object at 0x7512677f5940>
print(Dtype.instance) # <__main__.Dtype object at 0x751267771450> |
|
thanks @camriddell - yeah i think i prefer that too |
|
Thanks for the input @camriddell! 🙂 I'm having trouble writing up something concise. def list_supertype_current(left: List, right: List, version: Version) -> List | None:
if inner := get_supertype(left.inner(), right.inner(), version=version):
return List(inner)
return None
def list_supertype_proposed(left: List, right: List, version: Version) -> List | None:
if inner := get_supertype(left.inner.instance, right.inner.instance, version=version):
return List(inner)
return Nonein relation to ...
I think I understand your concern, but this swings too far in the opposite direction IMO. Let's pretend we don't know what the descriptor protocol is for a moment. left.inner()
left.inner.instanceFrom reading the code, the second looks like an attribute access and is hiding that it could be constructing a If we could only highlight doing work vs not doing work - surely a silent no-op is preferable to the inverse, right? |
|
Let me chime in the discussion as well as I kind of started both:
TheoryIn theory, nested types allow for the inner type to be any Lines 278 to 279 in d7bb050 which means that also classes rather than object are allowed. This aligns with what polars does - I assume to improve ergonomics for users - for example in PracticeIn practice, if our only use case for In other words, the only case in which we can get a class rather than an instance is when a user passes it. In conclusionI mostly wanted to state the obvious, trying to bring the attention to the fact that we are solution-ing/debating for a non-problem as we may not need to do the |
Description
As part of #3386, @FBruzzesi and I identified an issue with simply reusing
IntoDType.__eq__to compareDTypes.When we start nesting
DTypes, there are cases like this where a change from a default makes theSchemaincompatible:Here, different
time_units are allowed and we should just pick the less-precise one:But if the difference is on
time_zones - we should raise:This commit shows how many
isinstancechecks andimports that addingDType.__call__saves.The cost might seem small, but it adds up once you consider
concatneed to do it for:SchemasDTypesDTypes can also be nested 😳Caching will help, but since this problem can scale in multiple directions - it's probably best we minimize anything we can 😄
What type of PR is this? (check all applicable)
Related issues
concat(..., how={"vertical_relaxed", "diagonal_relaxed"})#3386Needed to see if this broke downstream ci, but all unrelated:
darts- unrelatedmarimo- unrelatedscikit-legounrelated