Materialized Views in ClickHouse: Two issues causing permanent drift on every plan/apply #3693
Description
Two issues cause materialized views to be DROP+CREATE'd on every atlas schema apply, creating an infinite recreation loop:
- Column definitions omitted from CREATE statement: ClickHouse infers wrong types
- SQL expression parenthesization mismatch: Atlas adds parentheses around IN conditions inside multiIf that ClickHouse strips when storing
Both issues create permanent drift that I cannot resolve with any HCL changes.
Bug 1: Column definitions omitted during CREATE
Steps to Reproduce
- Define a materialized view with explicit column types in HCL that differ from what ClickHouse would infer from the query:
materialized "example_mv" {
schema = schema.logs
to = table.example_target
column "metric_name" {
null = false
type = sql("LowCardinality(String)")
}
column "metric_value" {
null = false
type = Float64
}
as = <<-SQL
SELECT
'some_literal' AS metric_name,
multiIf(condition1, value, condition2, -value, NULL) AS metric_value
FROM logs.source_table
SQL
refresh {
expr = "EVERY 5 MINUTE"
append = true
}
}
- Run atlas schema apply. The view is created
- Check SHOW CREATE TABLE. The column definitions show:
  - metric_name as String (not LowCardinality(String))
  - metric_value as Nullable(Float64) (not Float64)
- Run atlas schema apply again. The provider detects drift and wants to DROP+CREATE again
- This repeats infinitely
The provider generates a CREATE statement without column definitions:
-- What the provider generates (no column block):
CREATE MATERIALIZED VIEW `logs`.`example_mv`
REFRESH EVERY 5 MINUTE APPEND TO
`logs`.`example_target` AS SELECT ...
-- What it should generate (with column block):
CREATE MATERIALIZED VIEW `logs`.`example_mv`
REFRESH EVERY 5 MINUTE APPEND TO
`logs`.`example_target`
(
`metric_name` LowCardinality(String),
`metric_value` Float64
)
AS SELECT ...
Without explicit column definitions, ClickHouse infers types from the SELECT expressions:
- A string literal like 'some_value' AS metric_name → inferred as String, not LowCardinality(String)
- multiIf(..., value, ..., NULL) AS metric_value → inferred as Nullable(Float64) because of the NULL branch, not Float64
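To make the loop concrete, here is a toy model in Python of the plan/apply cycle described above. It is only a sketch under stated assumptions: every name in it is hypothetical, it is not Atlas or provider code, and it assumes that sending the column block would make ClickHouse store the declared types.

```python
# Toy model of the Bug 1 loop. All names are hypothetical illustrations,
# not Atlas or provider internals.

# Column types declared in the HCL from this report.
DESIRED = {"metric_name": "LowCardinality(String)", "metric_value": "Float64"}

# Types ClickHouse infers from the SELECT when no column block is sent,
# as shown by SHOW CREATE TABLE in this report.
INFERRED = {"metric_name": "String", "metric_value": "Nullable(Float64)"}


def create_view(send_columns: bool) -> dict:
    """Return the column types the database ends up storing.

    Assumption: with an explicit column block the declared types are kept.
    """
    return dict(DESIRED) if send_columns else dict(INFERRED)


def diff(desired: dict, actual: dict) -> dict:
    """Map each drifting column to (stored type, declared type)."""
    return {
        name: (actual.get(name), declared)
        for name, declared in desired.items()
        if actual.get(name) != declared
    }


def recreate_rounds(send_columns: bool, limit: int = 5) -> int:
    """Count DROP+CREATE rounds until the plan is empty, capped at limit."""
    actual = create_view(send_columns)  # first apply creates the view
    rounds = 0
    while diff(DESIRED, actual) and rounds < limit:
        actual = create_view(send_columns)  # DROP+CREATE
        rounds += 1
    return rounds
```

In this model, recreate_rounds(False) always runs into the cap because every recreation lands on the same inferred types, while recreate_rounds(True) needs no recreation at all.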
Evidence
Terraform apply log shows the CREATE without column definitions:
CREATE MATERIALIZED VIEW
`logs`.`example_mv`
REFRESH EVERY 5 MINUTE APPEND TO
`logs`.`example_target` AS WITH cte AS (SELECT ...)
But SHOW CREATE TABLE in ClickHouse after creation shows inferred types:
CREATE MATERIALIZED VIEW logs.example_mv
REFRESH EVERY 5 MINUTE APPEND TO logs.example_target
(
`metric_name` String, -- HCL says LowCardinality(String)
`metric_value` Nullable(Float64) -- HCL says Float64
)
Bug 2: SQL expression parenthesization mismatch with multiIf + IN
Steps to Reproduce
- Define a materialized view with multiIf containing IN conditions in the SQL:
as = <<-SQL
SELECT
sum(multiIf(col IN ('a', 'b'), metric_value, col IN ('c', 'd'), -metric_value, NULL)) AS metric_value
FROM logs.source_table
SQL
- Run atlas schema apply
- Atlas generates CREATE with added parentheses around the IN conditions:
sum(multiIf((col IN ('a', 'b')), metric_value, (col IN ('c', 'd')), -metric_value, NULL))
- ClickHouse stores the SQL without those parentheses:
sum(multiIf(col IN ('a', 'b'), metric_value, col IN ('c', 'd'), -metric_value, NULL))
- On next plan, Atlas reads back the DB state (without parens), compares to what it would generate (with parens), sees a difference → wants to DROP+CREATE again
- This repeats infinitely
HCL SQL → Atlas adds parens → CREATE executed → ClickHouse strips parens → SHOW CREATE TABLE → Atlas reads without parens → sees diff → DROP+CREATE
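The two strings above differ only in parentheses that wrap a whole function argument, so a comparison that normalizes both sides first would not report drift. The Python sketch below illustrates that idea. It is not how Atlas compares definitions; the function name is hypothetical, and it assumes balanced parentheses and single-quoted literals without backslash escapes.

```python
def strip_redundant_parens(sql: str) -> str:
    """Drop parentheses that wrap one whole function argument.

    A pair counts as redundant when "(" directly follows "(" or ",",
    the matching ")" is directly followed by "," or ")", and the group
    has no top-level comma (so tuples such as (1, 2) are kept).
    """
    pairs = {}       # index of "(" -> index of its matching ")"
    has_comma = {}   # index of "(" -> True if it holds a top-level comma
    stack = []
    in_string = False
    for i, ch in enumerate(sql):
        if ch == "'":
            in_string = not in_string  # enter or leave a string literal
        elif in_string:
            continue                   # ignore everything inside literals
        elif ch == "(":
            stack.append(i)
            has_comma[i] = False
        elif ch == ")":
            pairs[stack.pop()] = i
        elif ch == "," and stack:
            has_comma[stack[-1]] = True

    drop = set()
    for open_i, close_i in pairs.items():
        before = sql[:open_i].rstrip()
        after = sql[close_i + 1:].lstrip()
        if (before.endswith(("(", ","))
                and after.startswith((",", ")"))
                and not has_comma[open_i]):
            drop.update((open_i, close_i))
    return "".join(ch for i, ch in enumerate(sql) if i not in drop)


# The two forms quoted in this report.
generated = ("sum(multiIf((col IN ('a', 'b')), metric_value, "
             "(col IN ('c', 'd')), -metric_value, NULL))")
stored = ("sum(multiIf(col IN ('a', 'b'), metric_value, "
          "col IN ('c', 'd'), -metric_value, NULL))")

print(generated == stored)                          # raw text comparison
print(strip_redundant_parens(generated) == stored)  # normalized comparison
```

A raw text comparison treats the two forms as different, which matches the drift seen on every plan. After normalization they are identical. A real fix would more likely compare parsed expressions than strings.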
Note
Even after manually matching HCL column types to ClickHouse's inferred types (String, Nullable(Float64)) to eliminate Bug 1, Bug 2 alone still triggers the recreation.
Environment
- ClickHouse Cloud (SharedMergeTree engine)
- Materialized views with REFRESH EVERY ... APPEND
- Column types that differ from ClickHouse's inference (e.g., LowCardinality(String) vs String, Float64 vs Nullable(Float64))
- SQL using multiIf with IN conditions
- Using the provider via Terraform (arigaio/atlas)
Workaround
I may be missing something, but I was not able to find a workaround:
- Changing HCL column types to match ClickHouse's inference would make them inconsistent with the target table definition
- Removing parentheses from HCL doesn't help because Atlas re-adds them when generating the CREATE statement
Thank you in advance!