
Conversation

@stuxuhai (Contributor)

This is a follow-up PR. The previous PR was closed after the branch was force-reset to apache:main.

Purpose

This PR fixes a bug where CREATE VIEW IF NOT EXISTS fails with a NoSuchIcebergViewException: Not an iceberg view (wrapped in QueryExecutionException) instead of succeeding silently when a non-Iceberg view (e.g., a Hive view) already exists in the SparkSessionCatalog.

The Problem

When SparkSessionCatalog is configured with:

  spark.sql.extensions=org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions
  spark.sql.catalog.spark_catalog=org.apache.iceberg.spark.SparkSessionCatalog
  spark.sql.catalog.spark_catalog.type=hive

the failure sequence is:

  1. A user executes CREATE VIEW IF NOT EXISTS db.view_name AS ....
  2. db.view_name already exists as a Hive view (or any non-Iceberg table/view).
  3. SparkSessionCatalog.createView currently delegates directly to the underlying Iceberg catalog (asViewCatalog.createView).
  4. The Iceberg catalog (e.g., HiveCatalog) attempts to load the view. Since it is not an Iceberg view, it throws NoSuchIcebergViewException.
  5. Spark expects ViewAlreadyExistsException to handle the IF NOT EXISTS logic. Because it receives a different exception, the query fails entirely.
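
For illustration, the sequence above can be sketched end to end as follows. This is a hypothetical repro, not code from this PR: the database/view names, the local master, and the two-session setup are illustrative, and restarting a session in one JVM may not behave identically everywhere.

import org.apache.spark.sql.SparkSession;

public class CreateViewIfNotExistsRepro {
  public static void main(String[] args) {
    // First, a session without the Iceberg extensions, so the view is created
    // as a plain Hive view rather than an Iceberg view.
    SparkSession plain =
        SparkSession.builder().master("local[2]").enableHiveSupport().getOrCreate();
    plain.sql("CREATE DATABASE IF NOT EXISTS db");
    plain.sql("CREATE VIEW IF NOT EXISTS db.view_name AS SELECT 1 AS id");
    plain.stop();

    // Now a session configured as in this PR description.
    SparkSession spark =
        SparkSession.builder()
            .master("local[2]")
            .config(
                "spark.sql.extensions",
                "org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions")
            .config(
                "spark.sql.catalog.spark_catalog",
                "org.apache.iceberg.spark.SparkSessionCatalog")
            .config("spark.sql.catalog.spark_catalog.type", "hive")
            .enableHiveSupport()
            .getOrCreate();

    // Before this fix: fails with QueryExecutionException wrapping
    // NoSuchIcebergViewException. Expected: a silent no-op.
    spark.sql("CREATE VIEW IF NOT EXISTS db.view_name AS SELECT 2 AS id");
  }
}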

The Fix

Before delegating the creation to the Iceberg catalog, we explicitly check if the identifier already exists in the underlying session catalog (which is the source of truth for the global namespace).

If getSessionCatalog().tableExists(ident) returns true, we immediately throw ViewAlreadyExistsException. This allows Spark's analysis rules to correctly catch the exception and ignore the operation as per IF NOT EXISTS semantics.
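
In code, the guard is roughly the following. This is a sketch of the described check, not the exact patch; the real createView in Spark's ViewCatalog API takes additional parameters.

import org.apache.spark.sql.catalyst.analysis.ViewAlreadyExistsException;
import org.apache.spark.sql.connector.catalog.Identifier;

// Inside SparkSessionCatalog.createView, before delegating to asViewCatalog:
if (getSessionCatalog().tableExists(ident)) {
  // Something (a Hive view, a table, ...) already occupies this identifier in
  // the session catalog. Throw the exception Spark's IF NOT EXISTS analysis
  // rule expects, instead of letting the Iceberg catalog fail with
  // NoSuchIcebergViewException while trying to load a non-Iceberg object.
  throw new ViewAlreadyExistsException(ident);
}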

Verification

  • Added a new unit test in TestSparkSessionCatalog to verify that CREATE VIEW IF NOT EXISTS succeeds when a Hive view exists.
  • Verified that CREATE VIEW (without IF NOT EXISTS) correctly throws AnalysisException (Table or view already exists).
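
The flow of the new test is roughly the following. This is a hedged sketch, assuming the test base's sql(String, Object...) helper plus the setUpCatalog/resetSparkCatalog helpers visible in the diff below; the real test body may differ.

import static org.assertj.core.api.Assertions.assertThatThrownBy;

import org.apache.spark.sql.AnalysisException;
import org.junit.jupiter.api.Test;

@Test
public void testCreateViewIfNotExistsWithExistingHiveView() {
  // With the default session catalog active, this creates a plain Hive view.
  resetSparkCatalog();
  sql("CREATE VIEW db.hive_view AS SELECT 1 AS id");

  // Switch spark_catalog to SparkSessionCatalog (type=hive), as described above.
  setUpCatalog();

  // Must now be a silent no-op instead of throwing NoSuchIcebergViewException.
  sql("CREATE VIEW IF NOT EXISTS db.hive_view AS SELECT 2 AS id");

  // Without IF NOT EXISTS the conflict must still surface.
  assertThatThrownBy(() -> sql("CREATE VIEW db.hive_view AS SELECT 2 AS id"))
      .isInstanceOf(AnalysisException.class)
      .hasMessageContaining("already exists");
}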

@github-actions github-actions bot added the spark label Dec 27, 2025
@stuxuhai (Contributor, Author)

stuxuhai commented Jan 6, 2026

@huaxingao The previous PR was automatically closed due to a force push, so I’ve opened a new one.
Could you please help review it when you have time? Thanks!

import org.junit.jupiter.api.BeforeAll;
import org.junit.jupiter.api.Test;

public class TestSparkSessionCatalogWithExtensions {
Contributor

I would suggest first fixing the issue in one Spark version and backporting later

Contributor

+1 to first fix 4.1 and then back-porting

@nastra requested a review from huaxingao on January 8, 2026 16:04

}
}

public static void setUpCatalog() {
Contributor

nit: private?

spark.conf().set("spark.sql.catalog.spark_catalog.type", "hive");
}

public static void resetSparkCatalog() {
Contributor

nit: private?

protected static TestHiveMetastore metastore = null;
protected static HiveConf hiveConf = null;
protected static SparkSession spark = null;
protected static JavaSparkContext sparkContext = null;
Contributor

is this necessary? If not, can we remove?

spark
.conf()
.set("spark.sql.catalog.spark_catalog", "org.apache.iceberg.spark.SparkSessionCatalog");
spark.conf().set("spark.sql.catalog.spark_catalog.type", "hive");
Contributor

Should we add spark.sessionState().catalogManager().reset() when flipping these configs (either inside the helper methods or immediately after calling them in the tests), similar to how spark/v4.1/spark/src/test/java/org/apache/iceberg/spark/TestSparkSessionCatalog.java does it?
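
For illustration, the suggestion could slot into the helper like this (a sketch over the setUpCatalog shown above, not a definitive change):

public static void setUpCatalog() {
  spark
      .conf()
      .set("spark.sql.catalog.spark_catalog", "org.apache.iceberg.spark.SparkSessionCatalog");
  spark.conf().set("spark.sql.catalog.spark_catalog.type", "hive");
  // Drop cached catalog instances so the next lookup rebuilds spark_catalog
  // from the confs just set, mirroring TestSparkSessionCatalog.
  spark.sessionState().catalogManager().reset();
}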
