Skip to content

[Variant] Align cast logic for from/to_decimal for variant#9689

Open
klion26 wants to merge 4 commits intoapache:mainfrom
klion26:variant-cast-decimal
Open

[Variant] Align cast logic for from/to_decimal for variant#9689
klion26 wants to merge 4 commits intoapache:mainfrom
klion26:variant-cast-decimal

Conversation

@klion26
Copy link
Copy Markdown
Member

@klion26 klion26 commented Apr 10, 2026

Which issue does this PR close?

What changes are included in this PR?

  • Extract some logic in arrow-cast
  • Reuse the extracted logic in arrow-cast and parquet-variant

Are these changes tested?

Reuse the existing tests in arrow-test

Are there any user-facing changes?

Yes, changed the docs

@github-actions github-actions bot added arrow Changes to the arrow crate parquet-variant parquet-variant* crates labels Apr 10, 2026
Copy link
Copy Markdown
Member Author

@klion26 klion26 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@scovich @sdf-jkl Please help to review this when you're free, thanks

.ok()
.and_then(|x: i32| x.try_into().ok()),
Variant::ShortString(v) => {
parse_string_to_decimal_native::<Decimal32Type>(v.as_str(), 0usize)
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use v.as_str() instead of v because we use match *self, if we change to match self then we need to derefer self in other match arms, seems there is litter benefit gained, so stick to the current match *self and use v.as_str() here.

Copy link
Copy Markdown
Contributor

@scovich scovich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

General approach looks reasonable, but needs some tweaks to avoid regressing performance. Do we have benchmarks we can throw at this to verify?

let v = cast_single_decimal_to_integer::<D, T::Native>(
array.value(i),
div,
scale as _,
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are we casting? Isn't it a trivial i16 -> i16 cast?
(again below)

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To avoid the overlfow cast_single_decimal_to_integer receives i16, the scale of arrow decimal is i8 and VariantDecimal is u8.

} else {
match cast_options.safe {
true => {
let v = cast_single_decimal_to_integer::<D, T::Native>(
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The original code hoisted checks for scale < 0 (mul_checked vs. div_checked) and cast_options.safe (NULL vs. error) outside the loop, producing four simple loop bodies. This was presumably done for performance reasons (minimizing branching inside the loop).

The new code pushes the cast_options.safe check inside a single loop and pushes scale < 0 check all the way down inside cast_single_decimal_to_integer. That triples the number of branches inside the loop body (the null check is per-row and so is always stuck inside the loop). Performance will almost certainly be impacted, possibly significantly.

It would be safer to just preserve the replication (even tho it duplicates logic with the new helper), and rely on the compiler's inlining and "jump threading" optimizations to eliminate that redundancy:

code snippet
if scale < 0 {
    if cast_options.safe {
        for i in 0..array.len() {
            if array.is_null(i) {
                value_builder.append_null();
            } else {
                let v = cast_single_decimal_to_integer::<D, T::Native>(...);
                value_builder.append_option(v.ok());
            }
        }
    } else {
        for i in 0..array.len() {
            if array.is_null(i) {
                value_builder.append_null();
            } else {
                let v = cast_single_decimal_to_integer::<D, T::Native>(...);
                value_builder.append_value(v?);
            }
        }
    }
} else {
    if cast_options.safe {
        for i in 0..array.len() {
            if array.is_null(i) {
                value_builder.append_null();
            } else {
                let v = cast_single_decimal_to_integer::<D, T::Native>(...);
                value_builder.append_option(v.ok());
            }
        }
    } else {
        for i in 0..array.len() {
            if array.is_null(i) {
                value_builder.append_null();
            } else {
                let v = cast_single_decimal_to_integer::<D, T::Native>(...);
                value_builder.append_value(v?);
            }
        }
    }
}

If you wanted to simplify a bit, you could define and use a local macro inside this function:

// Helper macro for emitting nearly the same loop every time, so we can hoist branches out.
// The compiler will specialize the resulting code (inlining and jump threading)
macro_rules! cast_loop {
    (|$v:ident| $body:expr) => {{
        for i in 0..array.len() {
            if array.is_null(i) {
                value_builder.append_null();
            } else {
                let $v = cast_single_decimal_to_integer::<D, T::Native>(...);
                $body
            }
        }
    }};
}

if scale < 0 {
    if cast_options.safe {
        cast_loop!(|v| value_builder.append_option(v.ok()));
    } else {
        cast_loop!(|v| value_builder.append_value(v?));
    }
} else {
    if cast_options.safe {
        cast_loop!(|v| value_builder.append_option(v.ok()));
    } else {
        cast_loop!(|v| value_builder.append_value(v?));
    }
}

Note that the four loop bodies are almost syntactically identical -- differing only in whether they append_option(v.ok()) or append_value(v?) -- but the inlined body of cast_single_decimal_to_integer inside each loop will be specialized based on the scale < 0 check we already performed. Result: stand-alone calls to the helper function are always safe, but we still get maximum performance here.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the detailed explain. fixed.

D: DecimalType,
F: Fn(D::Native) -> f64,
{
f(x) / 10_f64.powi(scale)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fmul is drastically cheaper than fdiv on every architecture I know of. As in, 5-10x higher ops/second.
Any reason we shouldn't switch?

Suggested change
f(x) / 10_f64.powi(scale)
f(x) * 10_f64.powi(-scale)

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed, thanks for the performcne numbers.

}),
Float64 => cast_decimal_to_float::<D, Float64Type, _>(array, |x| {
as_float(x) / 10_f64.powi(*scale as i32)
single_decimal_to_float_lossy::<D, F>(&as_float, x, *scale as _)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we're anyway changing the code, i32::from(*scale) makes clear that this is a lossless conversion

(a bunch more similarly lossless as _ below)

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, i32::from(*scale) is better, fixed.

Variant::Decimal16(d) => Self::cast_decimal_to_num::<Decimal128Type, T, _>(
d.integer(),
d.scale(),
|x: i128| x as f64,
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a bit surprised those type annotations are necessary when the first arg takes D::Native and should thus constrain the third arg's `F: fn(D::Native) -> f64?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Compiler didn't need this, fixed.

Comment on lines +1193 to +1194
.map(|(_, frac)| frac.len())
.unwrap_or(0);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit

Suggested change
.map(|(_, frac)| frac.len())
.unwrap_or(0);
.map_or_else(0, |(_, frac)| frac.len());

or even

Suggested change
.map(|(_, frac)| frac.len())
.unwrap_or(0);
.map_or_default(|(_, frac)| frac.len());

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed, use map_or_else here as map_or_default is a nightly-only experimental API for now

Comment on lines +1198 to +1200
parse_string_to_decimal_native::<D>(input, scale_usize)
.ok()
.and_then(|raw| VD::try_new(raw, scale).ok())
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit

Suggested change
parse_string_to_decimal_native::<D>(input, scale_usize)
.ok()
.and_then(|raw| VD::try_new(raw, scale).ok())
let raw = parse_string_to_decimal_native::<D>(input, scale_usize).ok()?;
VD::try_new(raw, scale).ok()

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed, more readable now

self.as_num()
}

fn convert_string_to_decimal<VD, D>(input: &str) -> Option<VD>
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: If you swap the template order, I callers would be a tad more readable, e.g.:

convert_string_to_decimal::<Decimal32Type, _>

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

Comment on lines +1331 to +1332
.as_num::<i64>()
.map(|x| (x as i128).try_into().ok())
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Out of curiosity, why not just as_num::<i128> directly?
But if you must keep the double cast, at least do i128::from(x) to make clear it's lossless.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is no UInt128 arrow type. as_num needs the type to implement DecimalCastTarget, which needs a corresponding arrow type UInt128.

Changed to i128::from(..)

Copy link
Copy Markdown
Member Author

@klion26 klion26 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@scovich Thanks for the review. I've addressed the comments. For the benchmark, I'll verify locally and come back.

let v = cast_single_decimal_to_integer::<D, T::Native>(
array.value(i),
div,
scale as _,
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To avoid the overlfow cast_single_decimal_to_integer receives i16, the scale of arrow decimal is i8 and VariantDecimal is u8.

} else {
match cast_options.safe {
true => {
let v = cast_single_decimal_to_integer::<D, T::Native>(
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the detailed explain. fixed.

D: DecimalType,
F: Fn(D::Native) -> f64,
{
f(x) / 10_f64.powi(scale)
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed, thanks for the performcne numbers.

}),
Float64 => cast_decimal_to_float::<D, Float64Type, _>(array, |x| {
as_float(x) / 10_f64.powi(*scale as i32)
single_decimal_to_float_lossy::<D, F>(&as_float, x, *scale as _)
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, i32::from(*scale) is better, fixed.

Variant::Decimal16(d) => Self::cast_decimal_to_num::<Decimal128Type, T, _>(
d.integer(),
d.scale(),
|x: i128| x as f64,
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Compiler didn't need this, fixed.

Comment on lines +1193 to +1194
.map(|(_, frac)| frac.len())
.unwrap_or(0);
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed, use map_or_else here as map_or_default is a nightly-only experimental API for now

Comment on lines +1198 to +1200
parse_string_to_decimal_native::<D>(input, scale_usize)
.ok()
.and_then(|raw| VD::try_new(raw, scale).ok())
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed, more readable now

Comment on lines +1331 to +1332
.as_num::<i64>()
.map(|x| (x as i128).try_into().ok())
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is no UInt128 arrow type. as_num needs the type to implement DecimalCastTarget, which needs a corresponding arrow type UInt128.

Changed to i128::from(..)

self.as_num()
}

fn convert_string_to_decimal<VD, D>(input: &str) -> Option<VD>
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

),
Variant::Float(f) => single_float_to_decimal::<O>(f64::from(*f), mul),
Variant::Double(f) => single_float_to_decimal::<O>(*f, mul),
Variant::String(v) if scale > 0 => parse_string_to_decimal_native::<O>(
Copy link
Copy Markdown
Member Author

@klion26 klion26 Apr 13, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems, the logic of parsing a string to decimal in arrow-cast has a bug, it will call it with scale:i8 as uscale, will validate and file an issue to track it.(but it doesn't relate to the current change),

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Variant spec doesn't even allow negative scale in the first place:
image

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh... but this code is converting from variant to arrow decimal, so negative scales are totally legit.

Meanwhile, perhaps a code comment would be good here? It might not be obvious to future source divers why the scale > 0 constraint.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, scale >= 0 should be safe?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have reviewed the code today, arrow-cast has checked the scale in decimal.rs::cast_string_to_decimal(only supports scale >=0 when cast uf8 to decimal), and I need to update the code here from >0 to >=0, will add a comment here also.

{
if scale < 0 {
return Err(ArrowError::InvalidArgumentError(format!(
"Cannot cast string to decimal with negative scale {scale}"
)));
}

@scovich scovich self-requested a review April 13, 2026 13:49
Copy link
Copy Markdown
Contributor

@scovich scovich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mostly nits, but one dangerous floating point order of operations issue that I overlooked while suggesting #9689 (comment). Probably best to back it out (see comment for details).

F: Fn(D::Native) -> f64,
{
f(x) / 10_f64.powi(scale)
f(x) * 10_f64.powi(-scale)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rescuing #9689 (comment) from github oblivion...

I just remembered that floating point a * 10**-b is technically NOT equivalent to a / 10**b. The algebraic operators section of rust docs talks about it:

Algebraic operators of the form a.algebraic_*(b) allow the compiler to optimize floating point operations using all the usual algebraic properties of real numbers – despite the fact that those properties do not hold on floating point numbers. This can give a great performance boost since it may unlock vectorization.

The exact set of optimizations is unspecified but typically allows combining operations, rearranging series of operations based on mathematical properties, converting between division and reciprocal multiplication, and disregarding the sign of zero.

(emphasis mine)

Whether we think any difference matters for this specific case... I don't know. But we should probably defer a change like this to its own PR with appropriate performance evaluation and weighing of trade-offs.

At least there's now a narrow waist for such a future optimization to be made easily.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have the same question before changing the code, but the code on play rust below shows that they are equal, not sure if this is enought.

yes, If this code below is not enough,we should keep it as it in the current pr

    let max: u8 = u8::MAX;
    for i in 0..max {
        let left = 1f64 / 10_f64.powi(<i32 as From::<u8>>::from(i));
        let right = 1f64 * 10_f64.powi(-<i32 as From::<u8>>::from(i));
        if left != right {
            println!("No equal {:?}", i);
        }
    }
    println!("Over")

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pure luck that 1f didn't trigger anything. A more complete example finds plenty of values just in -10..=10:

a = -10, b = -30
(a as f64) / 10_f64.powi(b)   = -1.00000000000000007618e31
(a as f64) * 10_f64.powi(-b) = -9.99999999999999963590e30
a = -10, b = -26
(a as f64) / 10_f64.powi(b)   = -1.00000000000000015073e27
(a as f64) * 10_f64.powi(-b) = -1.00000000000000001329e27
a = -10, b = -23
(a as f64) / 10_f64.powi(b)   = -9.99999999999999849005e23
(a as f64) * 10_f64.powi(-b) = -9.99999999999999983223e23
a = -10, b = -17
(a as f64) / 10_f64.powi(b)   = -9.99999999999999872000e17
(a as f64) * 10_f64.powi(-b) = -1.00000000000000000000e18
a = -10, b = -5
(a as f64) / 10_f64.powi(b)   = -9.99999999999999883585e5
(a as f64) * 10_f64.powi(-b) = -1.00000000000000000000e6
a = -10, b = 6
(a as f64) / 10_f64.powi(b)   = -1.00000000000000008180e-5
(a as f64) * 10_f64.powi(-b) = -9.99999999999999912396e-6
a = -10, b = 11
(a as f64) / 10_f64.powi(b)   = -1.00000000000000003643e-10
(a as f64) * 10_f64.powi(-b) = -9.99999999999999907185e-11
a = -10, b = 15
(a as f64) / 10_f64.powi(b)   = -9.99999999999999998819e-15
(a as f64) * 10_f64.powi(-b) = -1.00000000000000015659e-14
a = -10, b = 17
(a as f64) / 10_f64.powi(b)   = -9.99999999999999979098e-17
(a as f64) * 10_f64.powi(-b) = -1.00000000000000010236e-16
Found 246 examples in all

Variant::Double(f) => single_float_to_decimal::<O>(*f, mul),
Variant::String(v) if scale > 0 => parse_string_to_decimal_native::<O>(
v,
<i8 as TryInto<usize>>::try_into(scale).expect("scale is positive, would never fail"),
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as _ seems appropriate for cases like this?

(sorry if my earlier comments gave the impression that it was outright bad/forbidden)

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will fix it. Thanks for the feedback. I'd very much appreciate your feedback, they let me learn more about the Rust type system and the best practices.

Use from because the compiler can guarantee the function is infallible, but the case here, the compiler can't guarantee (the logic guarantees it's infallible)

Comment on lines +319 to +321
fn arrow_type() -> DataType {
$arrow_type
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Out of curiosity, why a function for this one, instead of a const like the other uses?

Suggested change
fn arrow_type() -> DataType {
$arrow_type
}
const ARROW_TYPE: DataType = $arrow_type;

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed to const, This style is more consistent.

Comment on lines +850 to +852
base.pow_checked(<u32 as From<u8>>::from(scale))
.ok()
.and_then(|div| match T::KIND {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another place where and_then hurts readability

let div = base.pow_checked(<u32 as From<u8>>::from(scale)).ok()?;
match T::KIND {
  ...
}

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

Comment on lines +896 to +900
Variant::Decimal16(d) => {
Self::cast_decimal_to_num::<Decimal128Type, T, _>(d.integer(), d.scale(), |x| {
x as f64
})
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

aside: Very odd choice by fmt... I would have expected

Suggested change
Variant::Decimal16(d) => {
Self::cast_decimal_to_num::<Decimal128Type, T, _>(d.integer(), d.scale(), |x| {
x as f64
})
}
Variant::Decimal16(d) => Self::cast_decimal_to_num::<Decimal128Type, T, _>(
d.integer(),
d.scale(),
|x| x as f64,
)

🤷

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

try to manually change it, fmt changed back to this format.

// find the last '.'
let scale_usize = input
.rsplit_once('.')
.map_or_else(|| 0, |(_, frac)| frac.len());
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
.map_or_else(|| 0, |(_, frac)| frac.len());
.map_or(0, |(_, frac)| frac.len());

(sorry, I mixed up the two names in my earlier suggestion)

I'm actually surprised clippy didn't flag it.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

Comment on lines +1333 to +1336
Variant::Int8(_) | Variant::Int16(_) | Variant::Int32(_) | Variant::Int64(_) => self
.as_num::<i64>()
.map(|x| <i128 as From<i64>>::from(x).try_into().ok())
.unwrap(),
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why unwrap an option in a function that returns `Option?

Suggested change
Variant::Int8(_) | Variant::Int16(_) | Variant::Int32(_) | Variant::Int64(_) => self
.as_num::<i64>()
.map(|x| <i128 as From<i64>>::from(x).try_into().ok())
.unwrap(),
Variant::Int8(_) | Variant::Int16(_) | Variant::Int32(_) | Variant::Int64(_) => {
let x = self.as_num::<i64>()?;
i128::from(x).try_into().ok()
}

Also: the <i128 as From<i64>> is surprising -- does the compiler actually need it for some reason?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed to the suggestion, it looks better

use unwrap here because this is always safe, didn't comment this because I didn't add a comment because I thought it would be too direct to get this.

Yes, the compiler needs it because there are multiple from found

Note: candidate #1 is defined in the trait `From`
Note: candidate #2 is defined in an impl of the trait `num_traits::NumCast` for the type `i128

Copy link
Copy Markdown
Member Author

@klion26 klion26 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@scovich, thanks for the review. I've addressed the comments.

I didn't find time to add the benchmark and run them today, will find time tomorrow.

F: Fn(D::Native) -> f64,
{
f(x) / 10_f64.powi(scale)
f(x) * 10_f64.powi(-scale)
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have the same question before changing the code, but the code on play rust below shows that they are equal, not sure if this is enought.

yes, If this code below is not enough,we should keep it as it in the current pr

    let max: u8 = u8::MAX;
    for i in 0..max {
        let left = 1f64 / 10_f64.powi(<i32 as From::<u8>>::from(i));
        let right = 1f64 * 10_f64.powi(-<i32 as From::<u8>>::from(i));
        if left != right {
            println!("No equal {:?}", i);
        }
    }
    println!("Over")

),
Variant::Float(f) => single_float_to_decimal::<O>(f64::from(*f), mul),
Variant::Double(f) => single_float_to_decimal::<O>(*f, mul),
Variant::String(v) if scale > 0 => parse_string_to_decimal_native::<O>(
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have reviewed the code today, arrow-cast has checked the scale in decimal.rs::cast_string_to_decimal(only supports scale >=0 when cast uf8 to decimal), and I need to update the code here from >0 to >=0, will add a comment here also.

{
if scale < 0 {
return Err(ArrowError::InvalidArgumentError(format!(
"Cannot cast string to decimal with negative scale {scale}"
)));
}

Variant::Double(f) => single_float_to_decimal::<O>(*f, mul),
Variant::String(v) if scale > 0 => parse_string_to_decimal_native::<O>(
v,
<i8 as TryInto<usize>>::try_into(scale).expect("scale is positive, would never fail"),
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will fix it. Thanks for the feedback. I'd very much appreciate your feedback, they let me learn more about the Rust type system and the best practices.

Use from because the compiler can guarantee the function is infallible, but the case here, the compiler can't guarantee (the logic guarantees it's infallible)

Comment on lines +319 to +321
fn arrow_type() -> DataType {
$arrow_type
}
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed to const, This style is more consistent.

Comment on lines +850 to +852
base.pow_checked(<u32 as From<u8>>::from(scale))
.ok()
.and_then(|div| match T::KIND {
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

Comment on lines +896 to +900
Variant::Decimal16(d) => {
Self::cast_decimal_to_num::<Decimal128Type, T, _>(d.integer(), d.scale(), |x| {
x as f64
})
}
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

try to manually change it, fmt changed back to this format.

// find the last '.'
let scale_usize = input
.rsplit_once('.')
.map_or_else(|| 0, |(_, frac)| frac.len());
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

Comment on lines +1333 to +1336
Variant::Int8(_) | Variant::Int16(_) | Variant::Int32(_) | Variant::Int64(_) => self
.as_num::<i64>()
.map(|x| <i128 as From<i64>>::from(x).try_into().ok())
.unwrap(),
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed to the suggestion, it looks better

use unwrap here because this is always safe, didn't comment this because I didn't add a comment because I thought it would be too direct to get this.

Yes, the compiler needs it because there are multiple from found

Note: candidate #1 is defined in the trait `From`
Note: candidate #2 is defined in an impl of the trait `num_traits::NumCast` for the type `i128

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

arrow Changes to the arrow crate parquet-variant parquet-variant* crates

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Align cast logic for from/to_decimal for variant to cast kernel

2 participants