InfluxDB retention policy and data downsampling

15 Jun, 2019 (Updated: 17 Sep, 2019)

Goal: set up data downsampling scheme on existing collectd data.

This article elaborates the official InfluxDB guide here (downsampling and retention (influxdata.com))
where we already have a database with existing measurements that we want to
keep.

I’d like to downsample my collectd data on ping, memory, load, etc. data from
two machines. I only want high (minute) time resolution for a few days, then
downsample to save space.

Update :

instead of using different field/measurement names (e.g. mean_value in ping_hourly), one can use the same measurement and field name to make querying more easy.
using grafana variables (github.com), one can use dynamic retention policy (RP) use, e.g. use the right RP for the right time range

Explore data ¶

Since we need to manually rename fields to prevent prefixing with ‘mean_’ etc, we check how many fields there are in each measurement, e.g.

SELECT * FROM load GROUP BY * limit 1

will show three fields:

time                           longterm midterm shortterm
----                           -------- ------- ---------
2019-08-27T00:00:46.344999042Z 0.47     0.48    0.5

Define downsampling queries ¶

It helps to first finetune your downsampling queries manually, such that you can check if you’re happy with the result

SELECT mean(value) as value FROM df GROUP BY time(1h),*
SELECT mean(rx) as rx, mean(tx) as tx FROM interface GROUP BY time(1h),*
SELECT mean(value) as value FROM iwinfo GROUP BY time(1h),*
SELECT mean(longterm) as longterm, mean(midterm) as midterm, mean(shortterm) as shortterm FROM load GROUP BY time(1h),*
SELECT mean(value) as value FROM memory GROUP BY time(1h),*
SELECT mean(value) as value FROM ping GROUP BY time(1h),*

Create long retention policies ¶

First create long retention policy to save the existing data to. If we would
create the short one first, the retention policy could prune our data if we’re
not fast enough. Note that this is NOT the default policy.

CREATE RETENTION POLICY "always" ON "collectd" DURATION INF REPLICATION 1

Manually aggregate old data ¶

Now we manually save our old data into the newly created retention policy.
Note that this is done in a different measurement. To retain tags, we use ",*" in the GROUP BY clause. If omitted, this would drop all tags.

SELECT mean(value) as value INTO "always"."df" FROM "df" GROUP BY time(1h),*
SELECT mean(rx) as rx, mean(tx) as tx INTO "always"."interface" FROM "interface" GROUP BY time(1h),*
SELECT mean(value) as value INTO "always"."iwinfo" FROM "iwinfo" GROUP BY time(1h),*
SELECT mean(longterm) as longterm, mean(midterm) as midterm, mean(shortterm) as shortterm  INTO "always"."load" FROM "load" GROUP BY time(1h),*
SELECT mean(value) as value INTO "always"."memory" FROM "memory" GROUP BY time(1h),*
SELECT mean(value) as value INTO "always"."ping" FROM "ping" GROUP BY time(1h),*

Create 5-day short DEFAULT retention policy ¶

Now that we’ve safely stored our aggregated data, we can create the short
and default retention policy, 5 days in this case. This means that influxdb
regularly prunes data older than 5 days from this database.

CREATE RETENTION POLICY "five_days" ON "collectd" DURATION 5d REPLICATION 1 DEFAULT

Set up continuous queries ¶

Now we’re ready to automate the data downsampling using continuous queries. By
default these continuous queries run every 30 minutes, which can be altered in
the configuration file of influxdb.

CREATE CONTINUOUS QUERY cq_60m_df ON collectd BEGIN SELECT mean(value) as value INTO collectd.always.df FROM collectd.five_days.df GROUP BY time(1h), * END
CREATE CONTINUOUS QUERY cq_60m_interface ON collectd BEGIN SELECT mean(rx) as rx, mean(tx) as tx INTO collectd.always.interface FROM collectd.five_days.interface GROUP BY time(1h), * END
CREATE CONTINUOUS QUERY cq_60m_iwinfo ON collectd BEGIN SELECT mean(value) as value INTO collectd.always.iwinfo FROM collectd.five_days.iwinfo GROUP BY time(1h), * END
CREATE CONTINUOUS QUERY cq_60m_load ON collectd BEGIN SELECT mean(longterm) as longterm, mean(midterm) as midterm, mean(shortterm) as shortterm INTO collectd.always.load FROM collectd.five_days.load GROUP BY time(1h), * END
CREATE CONTINUOUS QUERY cq_60m_memory ON collectd BEGIN SELECT mean(value) as value  INTO collectd.always.memory FROM collectd.five_days.memory GROUP BY time(1h), * END
CREATE CONTINUOUS QUERY cq_60m_ping ON collectd BEGIN SELECT mean(value) as value INTO collectd.always.ping FROM collectd.five_days.ping GROUP BY time(1h), * END
CREATE CONTINUOUS QUERY cq_60m_cpufreq ON collectd BEGIN SELECT mean(value) as value  INTO collectd.always.cpufreq FROM collectd.five_days.cpufreq GROUP BY time(1h), * END

Check setup ¶

This should show the above commands on the database:

SHOW CONTINUOUS QUERIES

The below command should show data filling up:

SELECT * FROM "always"."ping" ORDER BY DESC LIMIT 10

2019-06-14T16:00:00Z rpi3b 40.960490196078446 ping          dataix.ru
2019-06-14T16:00:00Z rpi3b root     4249448094.8965516 df_complex used

then wait until the hour is over and check again for updates.

Setting up Grafana ¶

Following instructions found here (github.com), we can tell Grafana to use the correct RP for specific time ranges.

Make this mapping from time range to retention policy:

INSERT INTO always rp_config,idx=1 rp="five_days",start=0i,end=432000000i -9223372036854775806
INSERT INTO always rp_config,idx=2 rp="always",start=432000000i,end=3110400000000i -9223372036854775806

Then create this variable in Grafana:

SELECT rp FROM always.rp_config WHERE ($__to - $__from) > "start" AND ($__to - $__from) < "end"

Now use $rp as prefix for all queries relevant for this RP.

One bug here is that the above query checks for time range , while retention policies are about time passed. E.g. if we view 1 day of data 10 weeks ago, the above query will result in using the high-resolution five_days RP which has no data for 10 weeks ago. No solution yet.

Sources ¶

#Linux #Influxdb #Server