Skip to content

Materialized Views in Clickhouse: Two issues causing permanent drift on every plan/apply #3693

@Stasia-sv

Description

@Stasia-sv

Description

Two issues cause materialized views to be DROP+CREATE'd on every atlas schema apply, creating an infinite recreation loop:

  1. Column definitions omitted from CREATE statement — ClickHouse infers wrong types
  2. SQL expression parenthesization mismatch — Atlas adds parentheses around IN conditions inside multiIf that ClickHouse strips when storing

Both issues create permanent drift that I cannot resolve with any HLC changes.


Bug 1: Column definitions omitted during CREATE

Steps to Reproduce

  1. Define a materialized view with explicit column types in HCL that differ from what ClickHouse would infer from the query:
materialized "example_mv" {
  schema = schema.logs
  to     = table.example_target

  column "metric_name" {
    null = false
    type = sql("LowCardinality(String)")
  }
  column "metric_value" {
    null = false
    type = Float64
  }

  as = <<-SQL
    SELECT
      'some_literal' AS metric_name,
      multiIf(condition1, value, condition2, -value, NULL) AS metric_value
    FROM logs.source_table
  SQL

  refresh {
    expr   = "EVERY 5 MINUTE"
    append = true
  }
}
  1. Run atlas schema apply — the view is created
  2. Check SHOW CREATE TABLE — the column definitions show:
    • metric_name as String (not LowCardinality(String))
    • metric_value as Nullable(Float64) (not Float64)
  3. Run atlas schema apply again — the provider detects drift and wants to DROP+CREATE again
  4. This repeats infinitely

The provider generates a CREATE statement without column definitions:

-- What the provider generates (no column block):
CREATE MATERIALIZED VIEW `logs`.`example_mv`
REFRESH EVERY 5 MINUTE APPEND TO
`logs`.`example_target` AS SELECT ...
-- What it should generate (with column block):
CREATE MATERIALIZED VIEW `logs`.`example_mv`
REFRESH EVERY 5 MINUTE APPEND TO
`logs`.`example_target`
(
    `metric_name` LowCardinality(String),
    `metric_value` Float64
)
AS SELECT ...

Without explicit column definitions, ClickHouse infers types from the SELECT expressions:

  • A string literal like 'some_value' AS metric_name → inferred as String, not LowCardinality(String)
  • multiIf(..., value, ..., NULL) AS metric_value → inferred as Nullable(Float64) because of the NULL branch, not Float64

Evidence

Terraform apply log shows the CREATE without column definitions:

CREATE MATERIALIZED VIEW
`logs`.`example_mv`
REFRESH EVERY 5 MINUTE APPEND TO
`logs`.`example_target` AS WITH cte AS (SELECT ...)

But SHOW CREATE TABLE in ClickHouse after creation shows inferred types:

CREATE MATERIALIZED VIEW logs.example_mv
REFRESH EVERY 5 MINUTE APPEND TO logs.example_target
(
    `metric_name` String,           -- HCL says LowCardinality(String)
    `metric_value` Nullable(Float64) -- HCL says Float64
)

Bug 2: SQL expression parenthesization mismatch with multiIf + IN

Steps to Reproduce

  1. Define a materialized view with multiIf containing IN conditions in the SQL:
  as = <<-SQL
    SELECT
      sum(multiIf(col IN ('a', 'b'), metric_value, col IN ('c', 'd'), -metric_value, NULL)) AS metric_value
    FROM logs.source_table
  SQL
  1. Run atlas schema apply
  2. Atlas generates CREATE with added parentheses around the IN conditions:
sum(multiIf((col IN ('a', 'b')), metric_value, (col IN ('c', 'd')), -metric_value, NULL))
  1. ClickHouse stores the SQL without those parentheses:
sum(multiIf(col IN ('a', 'b'), metric_value, col IN ('c', 'd'), -metric_value, NULL))
  1. On next plan, Atlas reads back the DB state (without parens), compares to what it would generate (with parens), sees a difference → wants to DROP+CREATE again
  2. This repeats infinitely
HCL SQL → Atlas adds parens → CREATE executed → ClickHouse strips parens → SHOW CREATE TABLE → Atlas reads without parens → sees diff → DROP+CREATE

Note

Even after manually matching HCL column types to ClickHouse's inferred types (String, Nullable(Float64)) to eliminate Bug 1, Bug 2 alone still triggers the recreation.


Environment

  • ClickHouse Cloud (SharedMergeTree engine)
  • Materialized views with REFRESH EVERY ... APPEND
  • Column types that differ from ClickHouse's inference (e.g., LowCardinality(String) vs String, Float64 vs Nullable(Float64))
  • SQL using multiIf with IN conditions
  • Using the provider via Terraform (arigaio/atlas)

Workaround

I can imagine that I'm missing smth but I wasn't able to find the workaround here:

  • Changing HCL column types to match ClickHouse's inference would make them inconsistent with the target table definition
  • Removing parentheses from HCL doesn't help because Atlas re-adds them when generating the CREATE statement

Thank you in advance!

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions