Commit 3ebaea7: headers formatting
Parent: 54c0be6


content/en/altinity-kb-setup-and-maintenance/altinity-kb-check-replication-ddl-queue.md

Lines changed: 6 additions & 6 deletions
@@ -49,7 +49,7 @@ WHERE
1. If there are no errors and everything just gets slower, check the load (the usual system metrics)

-## Common problems & solutions
+# Common problems & solutions

- If the replication queue does not have any exceptions, only postpone reasons, just let ClickHouse® do its merges/mutations and it will eventually catch up and reduce the number of tasks in `replication_queue`. The number of concurrent merges and fetches can be tuned, but doing that without analyzing your workload may leave you in a worse situation. If the delay in the queue keeps growing, action may be needed:
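As a quick orientation for the hunk above, here is a minimal sketch of a queue check; it is not part of the diff, the column names (`num_tries`, `postpone_reason`, `last_exception`) come from `system.replication_queue`, and the article itself carries a fuller query (the one ending in `FORMAT TSVRaw`, visible in the next hunk):

```sql
-- Summarize pending replication tasks per table and task type,
-- with retry counts and a sample postpone reason / exception.
SELECT
    database,
    table,
    type,
    count() AS tasks,
    sum(num_tries) AS tries,
    any(postpone_reason) AS sample_postpone_reason,
    any(last_exception) AS sample_exception
FROM system.replication_queue
GROUP BY database, table, type
ORDER BY tasks DESC;
```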
@@ -81,17 +81,17 @@ ORDER BY count() DESC, sum(num_tries) DESC
FORMAT TSVRaw;
```

-### Problem with mutation stuck in the queue:
+## Problem with mutation stuck in the queue:

- This can happen when the mutation has finished but for some reason the task is not removed from the queue. It can be detected by checking the `system.mutations` table to see whether the mutation is done while the task is still in the queue.

- kill the mutation (again)

-### Replica is not starting because local set of files differs too much
+## Replica is not starting because local set of files differs too much

- First try increasing the thresholds, or set the `force_restore_data` flag and restart clickhouse/pod: https://clickhouse.com/docs/en/engines/table-engines/mergetree-family/replication#recovery-after-complete-data-loss

-### Replica is in Read-Only MODE
+## Replica is in Read-Only MODE

Sometimes, due to crashes, a ZooKeeper split-brain problem or other reasons, some of the tables can be in Read-Only mode. This allows SELECTs but not INSERTs, so we need to do the DROP / RESTORE replica procedure.
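A minimal sketch of the mutation check described in the hunk above (not part of the diff; the database, table and mutation id below are placeholders): `system.mutations` exposes `is_done` and `latest_fail_reason`, and a stuck mutation can be killed again with `KILL MUTATION`:

```sql
-- Look for mutations that never complete and inspect why they fail.
SELECT database, table, mutation_id, command, parts_to_do, latest_fail_reason
FROM system.mutations
WHERE NOT is_done;

-- Kill the offending mutation (placeholder identifiers).
KILL MUTATION WHERE database = 'db' AND table = 'table_name' AND mutation_id = '0000000001';
```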
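For the Read-Only case, a quick way to list the affected tables before running the DROP / RESTORE procedure that the article shows further down (a sketch, assuming the `is_readonly` and `zookeeper_exception` columns of `system.replicas`):

```sql
-- Replicated tables currently in read-only mode on this server.
SELECT database, table, is_readonly, zookeeper_exception
FROM system.replicas
WHERE is_readonly;
```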
@@ -111,7 +111,7 @@ SELECT name FROM system.detached_parts WHERE table = 'table_name'; -- check for
Starting from version 23, it's possible to use the syntax [SYSTEM DROP REPLICA \'replica_name\' FROM TABLE db.table](https://clickhouse.com/docs/en/sql-reference/statements/system#drop-replica) instead of the `ZKPATH` variant, but you need to execute that command from a different replica than the one you want to drop, which is sometimes inconvenient. We recommend using the above method because it works with any version and is more reliable.

-### Procedure for many replicas generating DDL:
+## Procedure for many replicas generating DDL:

```sql
SELECT DISTINCT 'DETACH TABLE ' || database || '.' || table || ' ON CLUSTER \'data\';' FROM clusterAllReplicas('data',system.replicas) WHERE active_replicas < total_replicas FORMAT TSVRaw;
@@ -177,7 +177,7 @@ restore_replica() {
restore_replica "$@"
```

-### Stuck DDL tasks in the distributed_ddl_queue
+## Stuck DDL tasks in the distributed_ddl_queue

Sometimes [DDL tasks](/altinity-kb-setup-and-maintenance/altinity-kb-ddlworker/) (the ones that use ON CLUSTER) can get stuck in the `distributed_ddl_queue` because the replicas can get overloaded when many DDLs (thousands of CREATE/DROP/ALTER) run at the same time. This is very normal in heavy ETL jobs. It can be detected by checking the `distributed_ddl_queue` table and looking for tasks that are not moving or have been stuck for a long time.
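As an illustration of that check (a sketch, not part of the diff; the 10-minute threshold is an arbitrary choice and the column names follow recent versions of `system.distributed_ddl_queue`):

```sql
-- DDL queue entries that are still not finished after 10 minutes, oldest first.
SELECT entry, cluster, host, status, query_create_time, query
FROM system.distributed_ddl_queue
WHERE status != 'Finished'
  AND query_create_time < now() - INTERVAL 10 MINUTE
ORDER BY query_create_time
LIMIT 20;
```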