Open
Labels: type:community-support
Description
I am using Hudi 0.15.0. In Spark SQL, I ran the simple statements listed below the query output.
When I run select * from hudi_cow_20251229_07, the result is as follows. Why are the rows (1,2,3) and (1,3,6) gone? I am using the insert operation, so no duplicates should be dropped.
spark-sql> select * from hudi_cow_20251229_07;
_hoodie_commit_time _hoodie_commit_seqno _hoodie_record_key _hoodie_partition_path _hoodie_file_name a b c
20251229154740370 20251229154740370_0_0 1 6efdbc56-1ebd-4cec-a1d4-6738aed8352b-0_0-15-21_20251229154740370.parquet 1 4 7
The statements I ran:
set hoodie.spark.sql.insert.into.operation=insert;
set hoodie.datasource.write.insert.drop.duplicates=false;
set hoodie.datasource.write.insert.dup.policy=none;
CREATE TABLE IF NOT EXISTS hudi_cow_20251229_07 (
a INT,
b INT,
c INT
)
USING hudi
tblproperties(
type='cow',
primaryKey='a',
hoodie.datasource.write.precombine.field='c'
);
insert into hudi_cow_20251229_07(a,b,c) values(1,2,3),(1,4,7),(1,3,6);
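For reference, the one surviving row, (1,4,7), is exactly the row that deduplicating on the primaryKey column a and keeping the largest value of the precombine field c would select. The sketch below only illustrates that observation in plain Python. The function name dedupe_by_key and its parameters are hypothetical, and this is not Hudi's actual write path; it only shows that the output looks as if such a deduplication ran despite the insert settings.

```python
# Illustration only: mimics key-based deduplication with a precombine field.
# This is NOT Hudi code; dedupe_by_key is a hypothetical helper.

# (a, b, c) tuples taken from the INSERT statement above.
rows = [(1, 2, 3), (1, 4, 7), (1, 3, 6)]


def dedupe_by_key(rows, key_idx=0, precombine_idx=2):
    """Keep, for each key, the row with the largest precombine value."""
    best = {}
    for row in rows:
        key = row[key_idx]
        # Replace the stored row only if this one has a larger precombine value.
        if key not in best or row[precombine_idx] > best[key][precombine_idx]:
            best[key] = row
    return list(best.values())


survivors = dedupe_by_key(rows)
print(survivors)  # [(1, 4, 7)]
```

With insert and the duplicate-related settings above, I expected all three rows to be kept instead.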