Tim's blah blah blah

Reduce Home Assistant database size

My Home Assistant database is getting quite large (937M), even when I only have limited history (20 days). I’d like to prune noisy sensors, filter these out from filling up the databse, and thereby get a longer history in Home Assistant. I started with 937M/20d and ended up with TODO/60d


Home Assistant uses the following nomenclature, which is good to realize when optimizing. See also Optimize Your Home Assistant Database.

Inspect database and identify data usage

See this great post as inspiration

Open the database as readonly:

sudo sqlite3 -readonly /var/lib/homeassistant/home-assistant_v2.db
.mode column
.width 30, 10,

Get large tables (ignore ix_* which are indices)

SELECT SUM(pgsize) bytes, name FROM dbstat GROUP BY name ORDER BY bytes DESC;

Results in (top-5 results, relative size of top-4 non-ix* tables added):

bytes        name                                  relative        
-----------  ------------------------------------  -----------  
282152960    states                                66%                        
113123328    ix_states_context_id_bin                      
87564288     ix_states_metadata_id_last_updated_ts         
78295040     ix_states_last_updated_ts                     
73031680     statistics                            17%        
60743680     ix_states_old_state_id                        
56459264     ix_states_attributes_id                       
43597824     ix_states_event_id                            
43515904     statistics_short_term                 10%        
24170496     ix_statistics_statistic_id_start_ts           
23629824     events                                6%        

We get total size via:

SELECT SUM(pgsize) bytes, name FROM dbstat DESC;

resulting in

bytes                           name         
------------------------------  -------------
971333632                       sqlite_schema

Which is a bit strange because it’s more than twice that of the top-4 items. Maybe ix_* tables do count?

Viewing states usage

  COUNT(*) AS cnt,
  COUNT(*) * 100 / (SELECT COUNT(*) FROM states) AS cnt_pct,
FROM states
INNER JOIN states_meta ON states.metadata_id=states_meta.metadata_id
GROUP BY states_meta.entity_id
cnt       cnt_pct entity_id                                                                
--------- ------- -------------------------------------------------------------------------
46933     1       sensor.innr_sp_120_425efe01_electrical_measurement_rms_voltage
46994     1       sensor.dishwasher_electricalmeasurementrmsvoltage_2
53249     1       sensor.solaredge_ac_voltage_ab
53249     1       sensor.solaredge_ac_voltage_an
56415     1       sensor.washing_machine_electricalmeasurement
58999     1       sensor.solaredge_ac_frequency
70570     1       sensor.bt7_hw_top_40013
86610     2       climate.climate_system_s1
87444     2       sensor.itho_hru_supply_temp
109377    2       sensor.itho_hru_exhaust_temp
121870    2       sensor.dishwasher_electricalmeasurement_2
131907    3       sensor.bt1_outdoor_temperature_40004
159653    3       sensor.itho_hru_exhaust_fan_actual
161413    3       sensor.itho_hru_supply_fan_actual
203475    4       sensor.innr_sp_120_electricalmeasurement_3
303691    7       automation.publish_solaredge_to_mqtt
1153211   27      sensor.innr_sp_120_425efe01_electrical_measurement

Reduce database size

Exclude data from recorder

I only care about sensor data, history of automation is not really necessary

  auto_purge: true
  purge_keep_days: 60
  auto_repack: true
      - sensor.esp*uptime
      - automation

Purge noisy sensors

If you have entities where you would like to keep less history than others, you can target their purging with a scheduled automation. More information on this is in the documentation:

alias: Purge noisy (power) sensors
  - platform: time
    at: "04:00:00"
  - service: recorder.purge_entities
      entity_globs: sensor.esp*uptime
      keep_days: 1
        - sensor.innr_sp_120_425efe01_electrical_measurement
        - sensor.innr_sp_120_electricalmeasurement_3
        - sensor.dishwasher_electricalmeasurement_2
        - sensor.solaredge_ac_frequency
        - sensor.solaredge_ac_voltage_ab
        - sensor.solaredge_ac_voltage_an
        - sensor.washing_machine_electricalmeasurement
        - sensor.washing_machine_electricalmeasurementrmsvoltage
        - sensor.innr_sp_120_electricalmeasurementrmsvoltage_3
        - sensor.dishwasher_electricalmeasurementrmsvoltage_2
        - sensor.innr_sp_120_425efe01_electrical_measurement_rms_voltage
        - sensor.innr_sp_120_electricalmeasurementrmsvoltage
        - sensor.innr_sp_120_425efe01_electrical_measurement_rms_current
        - sensor.itho_hru_exhaust_fan_actual
        - sensor.itho_hru_supply_fan_actual
mode: single

Vacuum database

After reducing data, you need to purge/VACUUM your database. Somehow this didnt happen when I used auto-purge: true in my recorder settings, so I manually performed the action as described here.

  1. Shutdown Home Assistant
  2. Ensure home-assistant_v2.db-shm and home-assistant_v2.db-wal files are gone
  3. Run sqlite3 /var/lib/homeassistant/home-assistant_v2.db
  4. VACUUM;
  5. Restart Home Assistant

Other sources

#home-assistant #sqlite #Linux #server #smarthome