The feature was added in v0.12.0 and has been enabled by default since v0.13.0 (#157). You can disable it by setting `internal-aggregation = false` to use aggregation in graphite-clickhouse instead.
```toml
[clickhouse]
# ClickHouse-side aggregation
internal-aggregation = true
# Maximum number of points per metric. It should be set to 4096 or less
# for ClickHouse older than 20.8:
# https://github.com/ClickHouse/ClickHouse/commit/d7871f3976e4f066c6efb419db22725f476fd9fa
max-data-points = 1048576
```
The only known frontend that supports passing `maxDataPoints` from requests is carbonapi >= 0.14. The protocol should be set to `carbonapi_v3_pb` for this feature to fully work, see config->backendv2->backends->protocol.
But even without the adjustments mentioned above, `internal-aggregation` improves the whole picture by implementing whisper-like aggregation behavior (see below).
The feature uses the ClickHouse aggregate function combinator `-Resample`, available since version 19.11.
Note: version 0.12 is compatible only with CH 20.1.13.105, 20.3.10.75, 20.4.5.36, 20.5.2.7 or newer, since it uses the `-OrNull` modifier.
Generally, it's a good idea to always run the latest LTS ClickHouse release.
- Upgrade carbonapi to version 0.14.0 or greater
- Upgrade graphite-clickhouse to version 0.12.0 or greater
- Set `backendv2->backends->protocol: carbonapi_v3_pb` in carbonapi only after graphite-clickhouse is upgraded
- Upgrade ClickHouse
- Enable `internal-aggregation` in graphite-clickhouse
```
header:
  xFilesFactor: [0, 1]
  aggregation: {avg,sum,min,max,...}
  retention: 1d:1m,1w:5m,1y:1h
data:
  archive1: 1440 points
  archive2: 2016 points
  archive3: 8760 points
```
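The archive sizes in the `data` section follow directly from the retention string. A small sketch (hypothetical helper names, not part of whisper) to verify the arithmetic:

```python
# Derive the per-archive point counts from a whisper retention string
# like "1d:1m,1w:5m,1y:1h" (age:step pairs).
UNITS = {"s": 1, "m": 60, "h": 3600, "d": 86400, "w": 604800, "y": 31536000}

def to_seconds(value: str) -> int:
    """Convert a whisper duration like '1d' or '5m' to seconds."""
    return int(value[:-1]) * UNITS[value[-1]]

def archive_points(retention: str) -> list:
    """Return the number of stored points for each 'age:step' archive."""
    points = []
    for archive in retention.split(","):
        age, step = archive.split(":")
        points.append(to_seconds(age) // to_seconds(step))
    return points

print(archive_points("1d:1m,1w:5m,1y:1h"))  # [1440, 2016, 8760]
```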
- Retention description:
- Stores point/1m for one day
- Stores point/5m for one week
- Stores point/1h for one year
- Points older than the mentioned ages are overwritten by new incoming points
- Each archive is filled simultaneously
- Aggregation on the fly during writing
- `xFilesFactor` controls if points from archive(N) should be aggregated into archive(N+1)
- Points are selected only from one archive, the one with the most precision:
- from >= now-1d -> archive1
- from >= now-7d -> archive2
- else -> archive3
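The selection rule above can be sketched as a toy model (not whisper's actual code): the most precise archive that still covers the whole requested range wins.

```python
# Toy model of whisper archive selection for retention "1d:1m,1w:5m,1y:1h".
def pick_archive(from_ts: int, now: int) -> str:
    DAY, WEEK = 86400, 7 * 86400
    if from_ts >= now - DAY:
        return "archive1"  # point/1m, covers one day
    if from_ts >= now - WEEK:
        return "archive2"  # point/5m, covers one week
    return "archive3"      # point/1h, covers one year

now = 1_700_000_000
print(pick_archive(now - 3600, now))       # archive1
print(pick_archive(now - 3 * 86400, now))  # archive2
```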
GraphiteMergeTree uses a completely different principle of data storage.
- The retention scheme looks slightly different: `0:60,1d:5m,1w:1h,1y:1d`
- Stores point/minute, if the age of the point is at least 0 sec
- Stores point/5min, if the age of the point is at least one day
- Stores point/1h, if the age of the point is at least one week
- GraphiteMergeTree doesn't drop metrics after some particular age, so after one year points are stored with the minimum possible resolution of point/day
- Retention and aggregation policies are applied only when a point becomes older than X (1d, 1w, 1y)
- There is no such thing as an `archive`: each point is stored only once
- No `xFilesFactor` entity: each point will be aggregated
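A toy model of the rollup rule described above (not GraphiteMergeTree's implementation): every point is stored once, and its storage step depends only on its age.

```python
# Which step applies to a point under the rollup scheme "0:60,1d:5m,1w:1h,1y:1d".
def rollup_step(age_seconds: int) -> int:
    """Return the storage step (seconds) for a point of the given age."""
    # (min age, step) pairs for 0:60, 1d:5m, 1w:1h, 1y:1d
    scheme = [(0, 60), (86400, 300), (604800, 3600), (31536000, 86400)]
    step = scheme[0][1]
    for min_age, s in scheme:
        if age_seconds >= min_age:
            step = s  # the last matching rule wins
    return step

print(rollup_step(3600))        # 60   (younger than one day)
print(rollup_step(2 * 86400))   # 300  (older than one day)
print(rollup_step(2 * 604800))  # 3600 (older than one week)
```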
```sql
SELECT Path, Time, Value, Timestamp
FROM data WHERE ...
```
Logic:
- Select all points
- Aggregate them on the fly to the proper `archive` step
- Pass them further to graphite-web/carbonapi
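The on-the-fly aggregation step can be sketched roughly like this (a simplified model with an assumed average function, not the actual graphite-clickhouse code):

```python
# Group raw (time, value) rows into step-aligned buckets and aggregate
# each bucket, emulating client-side aggregation to the archive step.
from collections import defaultdict

def aggregate(points, step, func=lambda vs: sum(vs) / len(vs)):
    """Aggregate raw points into buckets of `step` seconds."""
    buckets = defaultdict(list)
    for time, value in points:
        buckets[time // step * step].append(value)
    return sorted((t, func(vs)) for t, vs in buckets.items())

raw = [(60, 1.0), (120, 3.0), (180, 5.0), (240, 7.0)]
print(aggregate(raw, 120))  # [(0, 1.0), (120, 4.0), (240, 7.0)]
```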
Problems:
- A huge overhead for Path (the heaviest part)
- Extremely inefficient in terms of network traffic, especially when CH cluster is used
- The CH node `query-initiator` must collect the whole data set (in memory or on disk), and only then are points passed further
```sql
SELECT Path,
    groupArray(Time),
    groupArray(Value),
    groupArray(Timestamp)
FROM data WHERE ... GROUP BY Path
```
- Up to 6 times less network load
- But still selects all points and aggregates them in graphite-clickhouse
Fetching data: September 2020 (#88) (v0.12.0)
```sql
SELECT Path,
    arrayFilter(x->isNotNull(x),
        anyOrNullResample($from, $until, $step)
        (toUInt32(intDiv(Time, $step)*$step), Time)
    ),
    arrayFilter(x->isNotNull(x),
        ${func}OrNullResample($from, $until, $step)
        (Value, Time)
    )
FROM data WHERE ... GROUP BY Path
```
- This solution implements an `archive` analog on the CH side
- Most of the data is aggregated on CH shards and doesn't leave them, so the `query-initiator` consumes much less memory
- When carbonapi with `format=carbonapi_v3_pb` is used, the `/render?maxDataPoints=x` parameter is processed on the CH side too
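What the `-OrNullResample` combination does can be approximated in Python (a sketch of the idea only: fixed buckets over the requested range, `None` for empty ones, then filtered out like `arrayFilter` does):

```python
# Emulate ${func}OrNullResample($from, $until, $step) + arrayFilter:
# aggregate values into fixed step buckets, mark empty buckets as None,
# then drop the None entries.
def resample(values_times, from_ts, until_ts, step, func=max):
    n = (until_ts - from_ts) // step + 1
    buckets = [[] for _ in range(n)]
    for value, time in values_times:
        if from_ts <= time <= until_ts:
            buckets[(time - from_ts) // step].append(value)
    aggregated = [func(b) if b else None for b in buckets]  # ...OrNull part
    return [v for v in aggregated if v is not None]         # arrayFilter part

points = [(1.0, 0), (2.0, 10), (9.0, 70)]
print(resample(points, 0, 90, 30, func=max))  # [2.0, 9.0]
```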
Fetching data: April 2021 (#145)
```sql
WITH anyResample($from, $until, $step)(toUInt32(intDiv(Time, $step)*$step), Time) AS mask
SELECT Path,
    arrayFilter(m->m!=0, mask) AS times,
    arrayFilter((v,m)->m!=0, ${func}Resample($from, $until, $step)(Value, Time), mask) AS values
FROM data WHERE ... GROUP BY Path
```
- The query improved a bit: dropping `-OrNull` improved compatibility with different CH versions
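The mask trick can be approximated the same way (a sketch, not the actual CH semantics): `anyResample` over the bucket-start times yields 0 for empty buckets, and both arrays are filtered by that mask instead of using `-OrNull`.

```python
# Emulate the mask-based query: one resample builds a mask of bucket-start
# times (0 for empty buckets), then times and values are filtered by it.
def resample_with_mask(values_times, from_ts, until_ts, step, func=max):
    n = (until_ts - from_ts) // step + 1
    times = [[] for _ in range(n)]
    values = [[] for _ in range(n)]
    for value, time in values_times:
        if from_ts <= time <= until_ts:
            i = (time - from_ts) // step
            times[i].append(time // step * step)  # intDiv(Time, step) * step
            values[i].append(value)
    mask = [b[0] if b else 0 for b in times]      # anyResample(...) AS mask
    aggregated = (func(b) if b else 0 for b in values)
    return ([t for t in mask if t != 0],                       # times
            [v for v, m in zip(aggregated, mask) if m != 0])   # values

points = [(1.0, 30), (2.0, 40), (9.0, 70)]
print(resample_with_mask(points, 0, 90, 30))  # ([30, 60], [2.0, 9.0])
```

Note that in this sketch 0 doubles as the "empty bucket" marker, so a genuine bucket timestamp of 0 would be filtered out too.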
For small requests the difference is not that big, but for heavy ones the amount of transferred data decreased by up to 100 times:
```
target=${986_metrics_60s_precision}
from=-7d
maxDataPoints=100
```
| method | rows | points | data (binary) | time (s) |
|---|---|---|---|---|
| row/point | 9887027 | 9887027 | 556378258 (530M) | 16.486 |
| groupArray | 986 | 9887027 | 158180388 (150M) | 35.498 |
| -Resample | 986 | 98553 | 1421418 (1M) | 13.181 |
Note: this was measured on localhost; with a slow network, the effect may be even more significant.
The classical pipeline:
- Fetch the data in graphite-web/carbonapi
- Apply all functions from `target`
- Compare the result with the `maxDataPoints` URI parameter and adjust it
Current:
- Fetch pre-aggregated data, aggregated with the proper function, directly from ClickHouse
- Apply all functions to the pre-aggregated data
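The `maxDataPoints` adjustment amounts to widening the step until the series fits the limit. A sketch of the idea (the exact rounding in graphite-clickhouse may differ):

```python
# Widen the base step to the smallest multiple that keeps the number of
# returned points within maxDataPoints.
import math

def adjusted_step(from_ts, until_ts, base_step, max_data_points):
    points = (until_ts - from_ts) // base_step + 1
    if points <= max_data_points:
        return base_step
    return base_step * math.ceil(points / max_data_points)

print(adjusted_step(0, 5940, 60, 100))   # 100 points fit -> 60 stays
print(adjusted_step(0, 86400, 60, 100))  # 1441 points -> 60 * 15 = 900
```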