Reduce Home Assistant database size
My Home Assistant database is getting quite large (937M), even when I only have limited history (20 days). I’d like to prune noisy sensors, filter these out from filling up the databse, and thereby get a longer history in Home Assistant. I started with 937M/20d and ended up with TODO/60d
Nomenclature ¶
Home Assistant uses the following nomenclature, which is good to realize when optimizing. See also Optimize Your Home Assistant Database (smarthomescene.com).
- database (home-assistant.io): databse backend where data is stored (e.g. SQLite [default], or PostgreSQL etc.)
- recorder (home-assistant.io): integration that determines which entities are writing to database
- logbook (home-assistant.io) & history (home-assistant.io): two viewers on the data stored in the databse, the former event-oriented, the latter data-oriented.
Inspect database and identify data usage ¶
See this great post (home-assistant.io) as inspiration
Open the database as readonly:
sudo sqlite3 -readonly /var/lib/homeassistant/home-assistant_v2.db
.mode column
.width 30, 10,
Get large tables (ignore ix_*
which are indices)
SELECT SUM(pgsize) bytes, name FROM dbstat GROUP BY name ORDER BY bytes DESC;
Results in (top-5 results, relative size of top-4 non-ix* tables added):
bytes name relative
----------- ------------------------------------ -----------
282152960 states 66%
113123328 ix_states_context_id_bin
87564288 ix_states_metadata_id_last_updated_ts
78295040 ix_states_last_updated_ts
73031680 statistics 17%
60743680 ix_states_old_state_id
56459264 ix_states_attributes_id
43597824 ix_states_event_id
43515904 statistics_short_term 10%
24170496 ix_statistics_statistic_id_start_ts
23629824 events 6%
We get total size via:
SELECT SUM(pgsize) bytes, name FROM dbstat DESC;
resulting in
bytes name
------------------------------ -------------
971333632 sqlite_schema
Which is a bit strange because it’s more than twice that of the top-4 items. Maybe ix_*
tables do count?
Viewing states
usage ¶
SELECT
COUNT(*) AS cnt,
COUNT(*) * 100 / (SELECT COUNT(*) FROM states) AS cnt_pct,
states_meta.entity_id
FROM states
INNER JOIN states_meta ON states.metadata_id=states_meta.metadata_id
GROUP BY states_meta.entity_id
ORDER BY cnt ASC;
cnt cnt_pct entity_id
--------- ------- -------------------------------------------------------------------------
46933 1 sensor.innr_sp_120_425efe01_electrical_measurement_rms_voltage
46994 1 sensor.dishwasher_electricalmeasurementrmsvoltage_2
53249 1 sensor.solaredge_ac_voltage_ab
53249 1 sensor.solaredge_ac_voltage_an
56415 1 sensor.washing_machine_electricalmeasurement
58999 1 sensor.solaredge_ac_frequency
70570 1 sensor.bt7_hw_top_40013
86610 2 climate.climate_system_s1
87444 2 sensor.itho_hru_supply_temp
109377 2 sensor.itho_hru_exhaust_temp
121870 2 sensor.dishwasher_electricalmeasurement_2
131907 3 sensor.bt1_outdoor_temperature_40004
159653 3 sensor.itho_hru_exhaust_fan_actual
161413 3 sensor.itho_hru_supply_fan_actual
203475 4 sensor.innr_sp_120_electricalmeasurement_3
303691 7 automation.publish_solaredge_to_mqtt
1153211 27 sensor.innr_sp_120_425efe01_electrical_measurement
Reduce database size ¶
Exclude data from recorder ¶
I only care about sensor data, history of automation is not really necessary
recorder:
auto_purge: true
purge_keep_days: 60
auto_repack: true
exclude:
entity_globs:
- sensor.esp*uptime
domains:
- automation
Purge noisy sensors ¶
If you have entities where you would like to keep less history than others, you can target their purging with a scheduled automation. More information on this is in the documentation:
alias: Purge noisy (power) sensors
trigger:
- platform: time
at: "04:00:00"
action:
- service: recorder.purge_entities
data:
entity_globs: sensor.esp*uptime
keep_days: 1
target:
entity_id:
- sensor.innr_sp_120_425efe01_electrical_measurement
- sensor.innr_sp_120_electricalmeasurement_3
- sensor.dishwasher_electricalmeasurement_2
- sensor.solaredge_ac_frequency
- sensor.solaredge_ac_voltage_ab
- sensor.solaredge_ac_voltage_an
- sensor.washing_machine_electricalmeasurement
- sensor.washing_machine_electricalmeasurementrmsvoltage
- sensor.innr_sp_120_electricalmeasurementrmsvoltage_3
- sensor.dishwasher_electricalmeasurementrmsvoltage_2
- sensor.innr_sp_120_425efe01_electrical_measurement_rms_voltage
- sensor.innr_sp_120_electricalmeasurementrmsvoltage
- sensor.innr_sp_120_425efe01_electrical_measurement_rms_current
- sensor.itho_hru_exhaust_fan_actual
- sensor.itho_hru_supply_fan_actual
mode: single
Vacuum database ¶
After reducing data, you need to purge/VACUUM
your database. Somehow this didnt happen when I used auto-purge: true
in my recorder settings, so I manually performed the action as described here (home-assistant.io).
- Shutdown Home Assistant
- Ensure
home-assistant_v2.db-shm
andhome-assistant_v2.db-wal
files are gone - Run
sqlite3 /var/lib/homeassistant/home-assistant_v2.db
VACUUM;
- Restart Home Assistant
Other sources ¶
- https://community.home-assistant.io/t/how-to-keep-your-recorder-database-size-under-control/295795 (home-assistant.io)
- https://community.home-assistant.io/t/how-to-shrink-the-database/152243/21 (home-assistant.io)
- https://stackoverflow.com/questions/5987042/how-do-you-efficiently-trim-an-sqlite-database-down-to-a-given-file-size (stackoverflow.com)