feat: Support int to timestamp casts #3541
The current testing framework uses collect() to fetch data into a list. However, this errors out for timestamps cast from extreme long values.
Steps to reproduce:
spark.sql("SELECT cast(-9223372036854775808 as timestamp)").collect()
You could read the data from a table instead?
scala> val data = Seq(1L, 10324234242L, 234234234234L, -9223372036854775808L).toDF("a")
scala> data.createTempView("t1")
scala> spark.sql("select a, cast(a as timestamp) from t1").show(false)
+--------------------+-----------------------------+
|a                   |a                            |
+--------------------+-----------------------------+
|1                   |1969-12-31 17:00:01          |
|10324234242         |2297-02-28 03:50:42          |
|234234234234        |9392-08-02 03:03:54          |
|-9223372036854775808|-290308-12-21 12:59:09.224192|
+--------------------+-----------------------------+
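The last row is consistent with the seconds-to-microseconds conversion saturating at the smallest representable value rather than wrapping around. Here is a minimal sketch in plain Scala, assuming the cast behaves like java.util.concurrent.TimeUnit.SECONDS.toMicros. That is an assumption about Spark's internals, not something confirmed in this thread, and secondsToMicros is a hypothetical helper name.

```scala
import java.util.concurrent.TimeUnit

object SaturatingCastSketch {
  // Hypothetical helper: models a long (seconds) -> timestamp (microseconds) cast.
  // Assumption: the conversion saturates like TimeUnit does, clamping to
  // Long.MinValue / Long.MaxValue instead of wrapping on overflow.
  def secondsToMicros(seconds: Long): Long = TimeUnit.SECONDS.toMicros(seconds)

  def main(args: Array[String]): Unit = {
    // Same input values as the table above.
    val inputs = Seq(1L, 10324234242L, 234234234234L, Long.MinValue)
    inputs.foreach { s =>
      println(s"$s seconds -> ${secondsToMicros(s)} microseconds")
    }
  }
}
```

Under this assumption, the extreme input ends up as the smallest microsecond value a timestamp can hold, which is why it is a useful edge case for any code that later converts the value to another representation.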
Thank you for the suggestion. The issue is rather in the castTest framework, which contains a collect() call. This collect converts Spark's timestamp (microseconds) to Java's timestamp (which is basically a date), causing an error / overflow. To avoid that, I had to implement assertDataFrameEquals (inspired by spark-testing-base) to make sure we have data and schema parity.
Please let me know if you think there are easier options for this; I would be happy to change the code.
Another option to look at:
scala> spark.sql("select a, cast(cast(a as timestamp) as string) from t1").collect()
res3: Array[org.apache.spark.sql.Row] = Array(
[1,1969-12-31 17:00:01],
[10324234242,2297-02-28 03:50:42],
[234234234234,9392-08-02 03:03:54],
[-9223372036854775808,-290308-12-21 12:59:09.224192]
)
Thank you for the comment, @andygrove. I tried double casting to string, but that didn't work out given the existing limitation of String -> Date. However, casting it back to Long simplified things.
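For readers following along, the cast-back-to-Long approach might look roughly like the sketch below. It is not the code from this PR: the object name, the collectAsLong helper, the ts_seconds alias, and the local session setup are all hypothetical, chosen only to mirror the earlier snippets in this thread.

```scala
import org.apache.spark.sql.{Row, SparkSession}

object CastRoundTripSketch {
  // Hypothetical helper: casts long -> timestamp -> long so that the collected
  // column is a plain long and collect() does not need to materialize a
  // java.sql.Timestamp for the extreme value.
  def collectAsLong(spark: SparkSession): Array[Row] = {
    import spark.implicits._
    val data = Seq(1L, 10324234242L, 234234234234L, Long.MinValue).toDF("a")
    data.createOrReplaceTempView("t1")
    spark
      .sql("select a, cast(cast(a as timestamp) as long) as ts_seconds from t1")
      .collect()
  }

  def main(args: Array[String]): Unit = {
    // Local session for illustration only; a test suite would reuse its own session.
    val spark = SparkSession.builder().master("local[1]").appName("cast-sketch").getOrCreate()
    try collectAsLong(spark).foreach(println)
    finally spark.stop()
  }
}
```

The inner cast still exercises the long to timestamp path under test, while the outer cast keeps the collected values as longs that can be compared directly.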
I believe this is ready for your review @andygrove
Thank you for merging main and for the suggestions on testing long to timestamp casts, @andygrove and @mbutrovich.
Which issue does this PR close?
Closes #.
Rationale for this change
What changes are included in this PR?
How are these changes tested?