content/en/altinity-kb-setup-and-maintenance/altinity-kb-check-replication-ddl-queue.md
6 additions & 6 deletions
@@ -49,7 +49,7 @@ WHERE
1. If there are no errors and everything just gets slower, check the load (the usual system metrics; see the query sketch below)
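
Besides OS-level monitoring, ClickHouse exposes its own load counters; a minimal sketch using the standard `system.metrics` and `system.asynchronous_metrics` tables (the `LIMIT` is arbitrary):

```sql
-- Point-in-time server metrics (queries in flight, background pool usage, ...)
SELECT metric, value
FROM system.metrics
ORDER BY value DESC
LIMIT 20;

-- Host-level gauges collected periodically by ClickHouse (load average, memory, disk, ...)
SELECT metric, value
FROM system.asynchronous_metrics
ORDER BY metric;
```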
-##Common problems & solutions
+# Common problems & solutions
- If the replication queue does not have any exceptions, only postponed reasons, just let ClickHouse® do its merges/mutations and it will eventually catch up and reduce the number of tasks in `replication_queue`. The number of concurrent merges and fetches can be tuned, but doing so without analysing your workload may leave you in a worse situation. If the delay in the queue keeps growing, action may be needed (see the query sketch below):
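
A minimal sketch for sizing up the queue before tuning anything, using the standard `system.replication_queue` and `system.replicas` columns (the `LIMIT` is arbitrary):

```sql
-- What is being postponed, and why
SELECT type, postpone_reason, count() AS tasks, sum(num_tries) AS tries
FROM system.replication_queue
GROUP BY type, postpone_reason
ORDER BY tasks DESC;

-- Is the replication delay actually growing?
SELECT database, table, queue_size, absolute_delay
FROM system.replicas
ORDER BY absolute_delay DESC
LIMIT 10;
```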
@@ -81,17 +81,17 @@ ORDER BY count() DESC, sum(num_tries) DESC
FORMAT TSVRaw;
```
-###Problem with mutation stuck in the queue:
+## Problem with mutation stuck in the queue:
- This can happen if the mutation has finished but for some reason the task is not removed from the queue. It can be detected by checking the `system.mutations` table to see whether the mutation is done while the task is still in the queue (see the sketch after this list).
- kill the mutation (again)
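
A hedged sketch of that check and the follow-up kill; the database, table, and mutation id are placeholders:

```sql
-- Mutations ClickHouse still considers unfinished
SELECT database, table, mutation_id, command, parts_to_do, is_done, latest_fail_reason
FROM system.mutations
WHERE NOT is_done;

-- If the work is done (or unwanted) but the task lingers, kill the mutation again
KILL MUTATION WHERE database = 'db' AND table = 'table_name' AND mutation_id = 'mutation_123.txt';
```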
-###Replica is not starting because local set of files differs too much
+## Replica is not starting because local set of files differs too much
- First try increasing the thresholds or setting the `force_restore_data` flag and restarting the ClickHouse server/pod: https://clickhouse.com/docs/en/engines/table-engines/mergetree-family/replication#recovery-after-complete-data-loss
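
To see where the safety thresholds currently sit, you can inspect `system.merge_tree_settings`; the setting names below are assumptions to verify against your ClickHouse version:

```sql
-- Thresholds involved in the "local set of files differs too much" safety check
-- (the listed setting names are assumptions; adjust to what your version exposes)
SELECT name, value, description
FROM system.merge_tree_settings
WHERE name IN ('replicated_max_ratio_of_wrong_parts', 'max_suspicious_broken_parts');
```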
-###Replica is in Read-Only MODE
+## Replica is in Read-Only MODE
Sometimes, due to crashes, a ZooKeeper split-brain problem, or other reasons, some tables can end up in read-only mode. This allows SELECTs but not INSERTs, so we need to run the DROP / RESTORE replica procedure (a sketch of the detection and restore statements follows).
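
A quick way to find the affected tables, plus the in-server restore statement that newer versions provide; `db.table_name` is a placeholder and the full DROP / RESTORE procedure is described below:

```sql
-- Tables currently in read-only mode
SELECT database, table, is_readonly, zookeeper_exception
FROM system.replicas
WHERE is_readonly;

-- After the replica's ZooKeeper metadata has been dropped (or lost), newer versions
-- can re-register the replica from its local data with:
SYSTEM RESTORE REPLICA db.table_name;
```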
@@ -111,7 +111,7 @@ SELECT name FROM system.detached_parts WHERE table = 'table_name'; -- check for
Starting from version 23, it's possible to use the syntax [SYSTEM DROP REPLICA \'replica_name\' FROM TABLE db.table](https://clickhouse.com/docs/en/sql-reference/statements/system#drop-replica) instead of the `ZKPATH` variant, but you need to execute the command from a different replica than the one you want to drop, which is sometimes inconvenient. We recommend the above method because it works with any version and is more reliable (a sketch of both variants follows).
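
For reference, a sketch of both variants; the replica name and ZooKeeper path are placeholders:

```sql
-- The ZKPATH variant recommended above
SYSTEM DROP REPLICA 'replica_name' FROM ZKPATH '/clickhouse/tables/01/db/table_name';

-- The 23+ variant; must be executed from a replica other than the one being dropped
SYSTEM DROP REPLICA 'replica_name' FROM TABLE db.table_name;
```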
-###Procedure for many replicas generating DDL:
+## Procedure for many replicas generating DDL:
```sql
SELECT DISTINCT 'DETACH TABLE ' || database || '.' || table || ' ON CLUSTER \'data\';' FROM clusterAllReplicas('data', system.replicas) WHERE active_replicas < total_replicas FORMAT TSVRaw;
@@ -177,7 +177,7 @@ restore_replica() {
restore_replica "$@"
```
-###Stuck DDL tasks in the distributed_ddl_queue
+## Stuck DDL tasks in the distributed_ddl_queue
Sometimes [DDL tasks](/altinity-kb-setup-and-maintenance/altinity-kb-ddlworker/) (the ones that use ON CLUSTER) can get stuck in the `distributed_ddl_queue` because the replicas can get overloaded when many DDLs (thousands of CREATE/DROP/ALTER statements) are executed at the same time. This is very common in heavy ETL jobs. It can be detected by checking the `distributed_ddl_queue` table to see whether there are tasks that are not moving or have been stuck for a long time (a detection query sketch follows).
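
A minimal detection sketch, assuming the standard `system.distributed_ddl_queue` columns; the 10-minute threshold is arbitrary:

```sql
-- ON CLUSTER tasks that are still unfinished long after they were created
SELECT entry, cluster, host, status, query_create_time, query
FROM system.distributed_ddl_queue
WHERE status != 'Finished'
  AND query_create_time < now() - INTERVAL 10 MINUTE
ORDER BY query_create_time
LIMIT 50;
```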