Homelab Proxmox + VyOS + Debian setup from scratch
Here I document my home server config. I'm integrating the router into it by using a second USB Ethernet adapter as the WAN port. Running Proxmox, I can run a Debian installation for my usual stuff and a separate router VM for routing/firewall/adblocking. Since setting up in 2022, my setup has grown considerably.
Goal ¶
I’m looking for a small and energy efficient server with some storage capability.
- Small form factor (<5 liter)
- Low power consumption home server + router configuration
- Nice to have: virtualized router and home server (to satisfy above)
- Server requirements
- A few TB storage for home use (backup, pictures, etc.)
- Stable and secure Linux OS preferred
- Run media server
- Run home assistant
- Router requirements (from home networking setup (vanwerkhoven.org))
- Should support 100 MBit WAN (NAT/firewalling requirement)
- Separate networks for internal, guest, and buggy IoT devices (VLAN-aware ethernet and WiFi)
- Ability to prevent buffer bloat (need decent QoS)
- Gigabit LAN
- Low-power (ideally <10W for full setup)
- LAN-wide adblocking (DNS-based pi-hole or related)
- Home VPN server (IKEv2 or Wireguard, >50Mbps)
Todo ¶
- Set up warning in case thin volumes exceed usage of thin pool (autoextend? source? (sleeplessbeastie.eu))
- fstrim guest OS volumes after rebooting pve host (source (askubuntu.com))
- Move Home Assistant to separate VM – less nesting –> done
- Reconfigure disk sizes, 8GB was a bit too conservative for proxmox
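The thin-pool warning from the first todo item could be sketched as a small filter over lvs output. This is my own sketch, not an existing tool; the helper name and the 80% default threshold are illustrative:

```shell
# thinpool_alert [MAX]: read "name data_percent" pairs on stdin (as produced
# by `lvs --noheadings -o lv_name,data_percent`) and warn above MAX percent.
# Hypothetical helper; hook it into cron/systemd-timer to get actual alerts.
thinpool_alert() {
  awk -v max="${1:-80}" '
    $2 + 0 > max { print "WARNING: " $1 " at " $2 "%"; bad = 1 }
    END { exit bad }'
}

# Example: lvs --noheadings -o lv_name,data_percent | thinpool_alert 80
```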
Hardware ¶
I've settled for using a NUC with an extra 2.5" bay, which suits my needs.
- NUC8i3BEH (intel.com) (with Intel i3-8109U)
- 2TB SSD (Samsung 970 EVO Plus M.2 80mm PCIE)
- 4TB SSD (Samsung 860 2.5" SATA)
- 32GB RAM (2x Micron 16GB DDR4 SODIMM @ 2400 MT/s)
- Zigbee USB dongle (Conbee (phoscon.de))
- USB Ethernet dongle for split WAN/LAN network (ASUS)
- USB to smartmeter cable (FTDI)
- USB to heatmeter cable (FTDI)
- Optional: USB port to power ESP8266 water meter board
- Optional: Bluetooth USB dongle (for more range)
Target services & architecture ¶
I considered the following virtualization software:
- Proxmox or XCP-NG + XO
- Proxmox is easier, uses less power. LVM on multiple disks requires some manual command-line work.
For routing, I considered the following platforms:
- OpenWRT or VyOS or pfSense or OPNsense
- OpenWRT seems difficult to upgrade easily.
- VyOS seems nice, is Linux-based, and I have experience with it, but its rolling release means either upgrading frequently or accepting random stability.
- Don’t know OPNsense/pfSense.
In the end I settled for Proxmox + VyOS:
- Proxmox (8GB storage + 2GB RAM)
- sharing common bulk storage to guests via mount points
- VyOS VM (8GB storage + 2GB RAM)
- dns adblock list
- wireguard
- regular router config
- fq-codel QoS
- Debian Stable LXC ‘unifi’ (8GB thinvol + 2GB RAM)
- unifi-controller installed natively
- Debian Stable LXC ‘proteus’ (256GB thinvol + 24GB RAM)
- First prio
- Nginx (for website & reverse proxy)
- Letsencrypt (for SSL certificates)
- Docker
- Nextcloud (for file sharing)
- bpatrik/pigallery2 (for personal photo sharing)
- Home automation worker scripts (for data generation/collection)
- many
- Influxdb (for data storage)
- Mosquitto (glueing home automation)
- Second prio
- Grafana (for data visualization)
- Plex/Jellyfin/Emby (HTPC)
- Collectd (for data generation/collection)
- smbd (for Time Machine backups)
- Transmission (downloading torrents)
- Home Assistant VM (for monitoring) – first prio
Software distribution ¶
I went with Ubuntu Server 20.04 LTS before, but it had frequent updates (daily, it felt like, and not only security fixes). On top of that, Ubuntu’s ideological direction is a bit off course for my taste, hence for this setup I’m using Debian Stable instead, which incidentally is the same host OS used by Proxmox.
Installation & configuration ¶
Proxmox ¶
Download ISO ¶
Get ISO from proxmox (proxmox.com) and copy to USB disk (proxmox.com) on macOS using hdiutil and dd:
hdiutil convert proxmox-ve_*.iso -format UDRW -o proxmox-ve_*.dmg
diskutil list
diskutil unmountDisk /dev/diskX
sudo dd if=proxmox-ve_*.dmg bs=1m of=/dev/rdiskX
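After dd finishes, it's worth checking that the stick actually holds the image. A sketch, with verify_image being my own helper (on macOS you may need to unmount the disk again before reading it back):

```shell
# verify_image IMG DEV: compare IMG byte-for-byte against the first
# $(size of IMG) bytes read back from DEV (hypothetical helper; the device
# usually is larger than the image, so we only compare the image-sized prefix).
verify_image() {
  img=$1 dev=$2
  size=$(wc -c < "$img")
  if head -c "$size" "$dev" | cmp -s - "$img"; then
    echo "OK: image verified"
  else
    echo "MISMATCH"; return 1
  fi
}

# Example: verify_image proxmox-ve_*.dmg /dev/rdiskX
```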
Installation ¶
LVM settings ¶
Install Proxmox as described on their wiki (proxmox.com), reserving a small part for the OS and most disk space for VMs:
- hdsize full
- swapsize 2GB
- maxroot 20GB
- minfree 0.5GB
ZFS settings ¶
I use ZFS RAID0 on two disks with default settings (ashift=12, compress=on, checksum=on, copies=1, ARC max size=768MB, hdsize=1800GB), using the SSD as primary disk.
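For reference, ashift is the base-2 logarithm of the sector size, so the default ashift=12 corresponds to 4 KiB sectors:

```shell
# ashift=12 -> 2^12 = 4096-byte sectors, matching 4K-native SSDs/HDDs;
# ashift=9 would mean legacy 512-byte sectors.
echo $(( 1 << 12 ))   # prints 4096
```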
Initial networking ¶
In my Proxmox-as-router setup, I use a USB dongle ethernet for WAN connection and the onboard NIC for LAN purposes. I set the LAN-facing NIC as management interface, and leave IP/gateway/DNS as-is during installation, which will be fixed later. My initial /etc/network/interfaces looks like:
cat << 'EOF' > /etc/network/interfaces
auto lo
iface lo inet loopback
auto enx000ec6955446
iface enx000ec6955446 inet dhcp
#iface enx000ec6955446 inet manual
#auto vmbr1
#iface vmbr1 inet manual
# bridge-ports enx000ec6955446
# bridge-stp off
# bridge-fd 0
##WAN
iface enp0s25 inet manual
auto vmbr0
iface vmbr0 inet manual
bridge-ports enp0s25
bridge-stp off
bridge-fd 0
bridge-vlan-aware yes
bridge-vids 2-4094
#LAN
auto vmbr0.10
iface vmbr0.10 inet static
address 172.17.10.6/24
#gateway 172.17.10.1
#Mgmt interface
EOF
service networking restart
Now connect to 172.17.10.6/24 on VLAN 10 to further configure the machine.
Post-installation ¶
Optional: post-install fixes, sourced from Proxmox Helper Scripts (github.io). Never run scripts straight from the internet; read them first.
apt-get update
apt-get dist-upgrade
apt install sudo
# From https://raw.githubusercontent.com/tteck/Proxmox/main/misc/post-pve-install.sh
# Disable Enterprise Repository
sed -i "s/^deb/#deb/g" /etc/apt/sources.list.d/pve-enterprise.list
sed -i "s/^deb/#deb/g" /etc/apt/sources.list.d/ceph.list
# Enable No-Subscription Repository https://pve.proxmox.com/wiki/Package_Repositories#sysadmin_no_subscription_repo
cat << 'EOF' >>/etc/apt/sources.list
# deb http://download.proxmox.com/debian/pve bullseye pve-no-subscription
deb http://download.proxmox.com/debian/pve bookworm pve-no-subscription
EOF
# Optional: disable Subscription Nag
echo "DPkg::Post-Invoke { \"dpkg -V proxmox-widget-toolkit | grep -q '/proxmoxlib\.js$'; if [ \$? -eq 1 ]; then { echo 'Removing subscription nag from UI...'; sed -i '/data.status/{s/\!//;s/active/NoMoreNagging/}' /usr/share/javascript/proxmox-widget-toolkit/proxmoxlib.js; }; fi\"; };" >/etc/apt/apt.conf.d/no-nag-script
apt --reinstall install proxmox-widget-toolkit &>/dev/null
# Upgrade proxmox now
apt-get update
apt-get dist-upgrade
reboot
# remove unused kernels (from https://raw.githubusercontent.com/tteck/Proxmox/main/misc/kernel-clean.sh)
dpkg --list | grep 'kernel-.*-pve' | awk '{print $2}' | grep -v $(uname -r) | sort -V
apt purge proxmox-kernel-6.8.4-2-pve-signed
proxmox-boot-tool refresh
Limit journal file size to 100M:
# set SystemMaxUse=100M
sed -i "s/^.SystemMaxUse=/SystemMaxUse=100M/g" /etc/systemd/journald.conf
grep SystemMaxUse /etc/systemd/journald.conf
service systemd-journald restart
Add regular user with sudo power:
adduser tim
usermod -aG sudo tim
mkdir -p ~tim/.ssh/
touch ~tim/.ssh/authorized_keys
chown -R tim:tim ~tim/.ssh
chmod og-rwx ~tim/.ssh/authorized_keys
cat << 'EOF' >>~tim/.ssh/authorized_keys
ssh-rsa AAAAB...
EOF
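Since sshd's strict modes refuse group/world-writable key files, a quick permissions check avoids lockouts later. This helper is my own sketch, not part of any tool:

```shell
# check_keyperms FILE: succeed only if FILE is accessible by its owner alone
# (hypothetical helper; sshd StrictModes rejects group/world-writable files).
check_keyperms() {
  # GNU stat first, BSD stat as fallback
  perms=$(stat -c '%a' "$1" 2>/dev/null || stat -f '%Lp' "$1")
  case $perms in
    600|400) echo "OK ($perms)" ;;
    *) echo "TOO OPEN ($perms)"; return 1 ;;
  esac
}

# Example: check_keyperms ~tim/.ssh/authorized_keys
```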
Now forbid root login for SSH and forbid password authentication (use public key only):
sed -i "s/^.PermitRootLogin yes/PermitRootLogin no/g" /etc/ssh/sshd_config
grep PermitRootLogin /etc/ssh/sshd_config
sed -i "s/^.PasswordAuthentication yes/PasswordAuthentication no/g" /etc/ssh/sshd_config
grep PasswordAuthentication /etc/ssh/sshd_config
sshd -t
systemctl restart ssh
Optional: add (lower privileged) user to Proxmox VE:
pveum user add tim@pve -firstname "Tim"
pveum passwd tim@pve
pveum acl modify / -user tim@pve -role PVEVMAdmin
Enable colors in shell & vim:
sed -i "s/^# alias /alias /" ~/.bashrc
cat << 'EOF' >> ~/.bashrc
alias grep='grep --color=auto'
alias fgrep='fgrep --color=auto'
alias egrep='egrep --color=auto'
EOF
sudo apt install vim sudo
Set ondemand cpu governor for power saving. Add intel_pstate=disable to boot parameters (2019) (linuxquestions.org):
sed -i 's/^GRUB_CMDLINE_LINUX_DEFAULT="quiet/GRUB_CMDLINE_LINUX_DEFAULT="quiet intel_pstate=disable/' /etc/default/grub
grep GRUB_CMDLINE_LINUX_DEFAULT /etc/default/grub
# vi /etc/default/grub
# GRUB_CMDLINE_LINUX_DEFAULT="quiet intel_pstate=disable"
proxmox-boot-tool refresh && reboot
Set scaling governor to ondemand:
apt install cpufrequtils
echo ondemand | tee /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor
cat /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor
Alternatively try enabling speedstep in BIOS (2018) (stackoverflow.com) (not needed for my setup).
Measure power consumption:
- only network (no HDMI, keyboard, USB) : ~2.5W+-0.3W, spikes from 1.6-3.5W (ondemand governor, chi-by-eye PM231E)
- only network + ASUS USB NIC (w/o cable): ~2.8W+-0.3W (ondemand governor, chi-by-eye PM231E)
- only network + Achate ASIX USB3.0-C NIC (w/o cable): ~6.0W+-0.3W (ondemand governor, chi-by-eye PM231E)
Storage ¶
I have two disks of different speed (SSD & HDD). I want the VMs to run on the fast disk and bulk data on the slow disk. With LVM, I ensure this by first extending the volume group over both disks (because Proxmox was installed on one of the disks, which should be the fast one), then creating two thin pools: the first for VMs, residing on the fast disk; the second spanning the remainder of the fast disk plus the full slow disk. This wastes a bit of space between the two thin pools, but ensures the VMs run on the fast disk.
ZFS is a bit more flexible; see the ZFS section below for how it handles multiple disks.
ZFS ¶
Ars (arstechnica.com) has a good background article on zfs to get you started, this Reddit post (reddit.com) clarifies how Proxmox sets it up by default, and the PVE manual is also useful on ZFS (proxmox.com).
TODO: maybe set vm.swappiness = 10
Initial zfs pool config looks something like this:
zpool status
pool: rpool
state: ONLINE
config:
NAME STATE READ WRITE CKSUM
rpool ONLINE 0 0 0
ata-ST2000LM007-1R8174_WCC0HHLE-part3 ONLINE 0 0 0
ata-Samsung_SSD_850_EVO_M.2_250GB_S33CNX0J508589Y-part3 ONLINE 0 0 0
errors: No known data errors
zpool list -v
NAME SIZE ALLOC FREE CKPOINT EXPANDSZ FRAG CAP DEDUP HEALTH ALTROOT
rpool 2.04T 1.47G 2.04T - - 0% 0% 1.00x ONLINE -
ata-ST2000LM007-1R8174_WCC0HHLE-part3 1.82T 1.05G 1.81T - - 0% 0.05% - ONLINE
ata-Samsung_SSD_850_EVO_M.2_250GB_S33CNX0J508589Y-part3 232G 436M 230G - - 0% 0.18% - ONLINE
zfs list
NAME USED AVAIL REFER MOUNTPOINT
rpool 1.47G 1.97T 104K /rpool
rpool/ROOT 1.47G 1.97T 96K /rpool/ROOT
rpool/ROOT/pve-1 1.47G 1.97T 1.47G /
rpool/data 96K 1.97T 96K /rpool/data
rpool/var-lib-vz 96K 1.97T 96K /var/lib/vz
By default pve creates these entries:
- rpool is the ZFS pool
- rpool/ROOT dataset for root?
- rpool/ROOT/pve-1 dataset for pve
- rpool/data volume block for VM images (freebsd.org)
- rpool/var-lib-vz dataset mounted at /var/lib/vz for backups
You can see this back partially in /etc/pve/storage.cfg:
cat /etc/pve/storage.cfg
dir: local
path /var/lib/vz
content iso,vztmpl,backup
zfspool: local-zfs
pool rpool/data
sparse
content images,rootdir
Now we tweak the initial zfs config:
- Reserve 20GB as minimum for PVE
- 500GB quota (github.io) for backups so it doesn’t choke the rest
- Split data volume into one for VMs (vmdata) and one for bulk (bulkdata)
zfs set reservation=20GB rpool/ROOT/pve-1
zfs set quota=500G rpool/var-lib-vz
zfs rename rpool/data rpool/vmdata
sed -i 's/rpool\/data/rpool\/vmdata/' /etc/pve/storage.cfg
reboot
zfs create -o compress=lz4 rpool/bulkdata
zfs create -o compress=lz4 rpool/bulkdata/bulk
zfs create -o compress=lz4 rpool/backups
zfs create -o compress=lz4 rpool/backups/backupsmba
zfs create -o compress=lz4 rpool/backups/backupsmbp
zfs create -o compress=lz4 rpool/backups/backupstex
The result view looks like:
zfs list
NAME USED AVAIL REFER MOUNTPOINT
rpool 20.0G 1.95T 104K /rpool
rpool/ROOT 20.0G 1.95T 96K /rpool/ROOT
rpool/ROOT/pve-1 1.34G 1.97T 1.34G /
rpool/backups 392K 1.95T 104K /rpool/backups
rpool/backups/backupsmba 96K 1.95T 96K /rpool/backups/backupsmba
rpool/backups/backupsmbp 96K 1.95T 96K /rpool/backups/backupsmbp
rpool/backups/backupstex 96K 1.95T 96K /rpool/backups/backupstex
rpool/bulkdata 192K 1.95T 96K /rpool/bulkdata
rpool/bulkdata/bulk 96K 1.95T 96K /rpool/bulkdata/bulk
rpool/var-lib-vz 96K 500G 96K /var/lib/vz
rpool/vmdata 96K 1.95T 96K /rpool/vmdata
You can see that rpool/var-lib-vz decreased in AVAIL to the 500GB quota, and that the maximum AVAIL for everything except rpool/ROOT/pve-1 has dropped by the 20GB reservation. The nice thing about ZFS is that the PVE root is also part of the main pool, while for LVM this (by default) is not the case, i.e. with LVM you might have created a PVE root partition that is too small (really annoying) or too big (wastes space).
Measure power consumption on backup NUC (NUC5i3RYH, 256GB m.2 SSD + 2TB 2.5" HDD):
- only network: ~8.5W+-0.3W (ondemand governor, chi-by-eye PM231E)
- only network + Achate ASIX USB3.0-C NIC (w/ cable): ~9.5W+-0.3W (ondemand governor, chi-by-eye PM231E)
Initial load: 20:13:35 up 9 min, 1 user, load average: 0.13, 0.22, 0.13
LVM ¶
Expand LVM over two disks ¶
Sources here (serverfault.com), here (kenmoini.com), here (proxmox.com), here (proxmox.com) and here (sleeplessbeastie.eu).
Find disk to extend LVM onto, create full-disk LVM partition, and wipe previous LVM configs
lsblk -o name,size,type,model,serial
cfdisk /dev/sda
pvremove -y -ff /dev/sda*
Find disk by serial and create a new physical volume (PV) for LVM. If the disk contained previous partitions you might get a warning first:
ls /dev/disk/by-id/*WCC0HHLE*
pvcreate /dev/disk/by-id/ata-ST2000LM007-1R8174_WCC0HHLE-part1
pvs -a
Now extend the existing pool onto it
vgextend pve /dev/disk/by-id/ata-ST2000LM007-1R8174_WCC0HHLE-part1
vgs -a
Remove ’local-lvm’ storage via GUI or pvesm to start with a clean sheet.
pvesm remove local-lvm
Now remove the data LVM pool; we recreate a new one later (extending could probably also work):
lvremove pve/data
lvs -a
# tim@pve2:~$ lvs -a
# LV VG Attr LSize Pool Origin Data% Meta% Move Log Cpy%Sync Convert
# root pve -wi-ao---- 20.00g
# swap pve -wi-ao---- 2.00g
Set up LVM thin pools & thin volumes ¶
My setup looks something like:
- LVM (0.024TB)
- Proxmox: 20 GB (root) + 4 GB (swap)
- LVM thin ’thinpool_vms’ (0.5TB)
- VyOS: 8GB (0.008TB)
- Debian: 250GB (0.25TB)
- Other/future/debian growth
- LVM thin ’thinpool_data’ (remainder = 5.5TB for data)
- Bulk data: 3TB
- Backups VMS: 0.25TB
- Backups MBP: 1TB
- Backups MBA: 0.25TB
- Backups remote: 1.25TB
Show current setup
lvs -a
pvdisplay
lvdisplay
Create data setup in LVM. In my case sda is slow, sdb is fast.
# Create thinpool for vms on fast disk. I want 0.5TB. In case the disk is big enough to hold this, run
lvcreate --thin -L 0.5TB pve/thinpool_vms /dev/sdb3
# If the disk is not big enough and you need to expand over the second disk, check the available extents on the fast disk, create thinpool to fill this up
pvdisplay -m /dev/sdb3 | grep "Free PE"
lvcreate --thin -l53730 pve/thinpool_vms /dev/sdb3
lvextend -L 0.5TB /dev/mapper/pve-thinpool_vms
# Create thinpool on remainder of fast disk + slow disk
lvcreate --thin -l 100%FREE pve/thinpool_data
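The -l53730 figure above comes from the free-extent count reported by pvdisplay; as a sanity check you can convert extents back into a size. This assumes the default 4 MiB extent size (check yours with `vgdisplay | grep "PE Size"`):

```shell
# Free physical extents x extent size = usable space.
# 53730 free 4 MiB extents is the figure from my fast disk; yours will differ.
free_pe=53730
pe_mib=4
echo "$(( free_pe * pe_mib / 1024 )) GiB"   # prints "209 GiB"
```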
Ensure data split across disks was successful:
pvdisplay -m /dev/sda1
pvdisplay -m /dev/sdb3
lvdisplay -m /dev/pve/thinpool_vms
lvdisplay -m /dev/pve/thinpool_data
# Create LVs for future use on thinpool_data
lvcreate --thinpool pve/thinpool_data --name lv_bulk --virtualsize 1.0T
lvcreate --thinpool pve/thinpool_data --name lv_backup_vms --virtualsize 0.5T
lvcreate --thinpool pve/thinpool_data --name lv_backup_mbp --virtualsize 1T
lvcreate --thinpool pve/thinpool_data --name lv_backup_mba --virtualsize 0.25T
lvcreate --thinpool pve/thinpool_data --name lv_backup_tex --virtualsize 1.25T
Allow freeing up of unused space on thin volumes (see source (askubuntu.com))
vi /etc/lvm/lvm.conf
# issue_discards = 1
Create filesystems on drive
mkfs.ext4 /dev/mapper/pve-lv_bulk
mkfs.ext4 /dev/mapper/pve-lv_backup_vms
mkfs.ext4 /dev/mapper/pve-lv_backup_mbp
mkfs.ext4 /dev/mapper/pve-lv_backup_mba
mkfs.ext4 /dev/mapper/pve-lv_backup_tex
Mount in Proxmox
mkdir /mnt/bulk
mkdir -p /mnt/backup/{vms,mba,mbp,tex}
mount /dev/mapper/pve-lv_bulk /mnt/bulk/
mount /dev/mapper/pve-lv_backup_vms /mnt/backup/vms
mount /dev/mapper/pve-lv_backup_mbp /mnt/backup/mbp
mount /dev/mapper/pve-lv_backup_mba /mnt/backup/mba
mount /dev/mapper/pve-lv_backup_tex /mnt/backup/tex
chmod og-rx /mnt/backup/{vms,mba,mbp,tex}
chmod og-rx /mnt/bulk/
Ensure automount
cat << 'EOF' >>/etc/fstab
/dev/mapper/pve-lv_bulk /mnt/bulk ext4 defaults 0 2
/dev/mapper/pve-lv_backup_vms /mnt/backup/vms ext4 defaults 0 2
/dev/mapper/pve-lv_backup_mbp /mnt/backup/mbp ext4 defaults 0 2
/dev/mapper/pve-lv_backup_mba /mnt/backup/mba ext4 defaults 0 2
/dev/mapper/pve-lv_backup_tex /mnt/backup/tex ext4 defaults 0 2
EOF
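Before rebooting, a quick sanity check on the new fstab lines can save a trip to the console. fstab_check is my own sketch; `mount -a` remains the real test:

```shell
# fstab_check FILE: flag non-comment lines that do not have exactly six
# fields (hypothetical helper; a stricter check would validate options too).
fstab_check() {
  awk 'NF && $1 !~ /^#/ && NF != 6 { print "bad line: " $0; bad = 1 }
       END { exit bad }' "$1"
}

# Example: fstab_check /etc/fstab && mount -a
```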
Add directory to PVE storage manager
pvesm add dir backup --path /mnt/backup/vms --content vztmpl,iso,backup
Add thin pool to PVE storage manager
pvesm scan lvmthin pve
pvesm add lvmthin thinpool_vms --vgname pve --thinpool thinpool_vms
Push back backups from elsewhere & optionally resize disks/partitions (serverfault.com)
e2fsck -fy /dev/pve/vm-200-disk-0
resize2fs /dev/pve/vm-200-disk-0 300G
lvreduce -L 300G /dev/pve/vm-200-disk-0
# Edit LXC config in /etc/pve/lxc
#rootfs: thinpool_vms:vm-200-disk-0,size=300G
Final setup looks something like:
tim@pve:~$ sudo lvs -a
LV VG Attr LSize Pool Origin Data% Meta% Move Log Cpy%Sync Convert
lv_backup_mba pve Vwi-aotz-- 256.00g thinpool_data 2.96
lv_backup_mbp pve Vwi-aotz-- 1.00t thinpool_data 1.86
lv_backup_tex pve Vwi-aotz-- 1.25t thinpool_data 1.84
lv_backup_vms pve Vwi-aotz-- 256.00g thinpool_data 3.32
lv_bulk pve Vwi-aotz-- 3.00t thinpool_data 1.79
[lvol0_pmspare] pve ewi------- 128.00m
root pve -wi-ao---- 8.00g
swap pve -wi-ao---- 4.00g
thinpool_data pve twi-aotz-- <4.95t 2.25 11.18
[thinpool_data_tdata] pve Twi-ao---- <4.95t
[thinpool_data_tmeta] pve ewi-ao---- 80.00m
thinpool_vms pve twi-aotz-- 512.00g 7.17 12.75
[thinpool_vms_tdata] pve Twi-ao---- 512.00g
[thinpool_vms_tmeta] pve ewi-ao---- 128.00m
vm-100-disk-0 pve Vwi-aotz-- 8.00g thinpool_vms 61.17
vm-200-disk-0 pve Vwi-a-tz-- 300.00g thinpool_vms 4.66
vm-201-disk-0 pve Vwi-aotz-- 300.00g thinpool_vms 5.30
vm-202-disk-0 pve Vwi-a-tz-- 8.00g thinpool_vms 24.24
Optional tweaks & bugfixes ¶
Optional: assign USB dongle interface a nice name (stackexchange.com). N.B. this breaks proxmox recognizing the adapter as network interface in the GUI, disabling some configuration options.
echo 'SUBSYSTEM=="net", ACTION=="add", ATTR{address}=="7c:10:c9:19:47:80", NAME="usb0"' | tee -a /etc/udev/rules.d/70-persistent-net.rules
echo "auto usb0
iface usb0 inet dhcp" | tee -a /etc/network/interfaces
udevadm control --reload-rules && udevadm trigger
Optional: Fix boot delay after bluetooth driver error. Looks like it’s caused by DHCP timeout (askubuntu.com) on unconnected ethernet port, leave as is.
Add intel-ibt-17* bluetooth driver for NUC –> does not work, conflicts with proxmox kernel. Wait until adopted in main PVE kernel.
- Add non-free debian packages in /etc/apt/sources.list or related
- Install firmware:
apt install firmware-iwlwifi
Bugfix: bridge brought up before physical port is up: “error: vmbr1: bridge port enx000ec6955446 does not exist”
- Increase bridge_maxwait to 40s
- Alternative: increase bridge_waitport?
- Try something else
Migrating from previous setup ¶
Plan:
- VyOS –> make backup, then restore, should work out of the box?
- Update networking config on proxmox, check if I can still reach it from WAN side (should allow ssh/http(s))
- Unifi –> make backup, then restore, should work out of the box?
- Debian –> make backup, then restore on new disk
- Check new USB forwarding options: zigbee & smart meter
- Update exceptions on PVE host –> get from lxc/201.conf
- Fix mount points –> remap in lxc.conf
Upgrade guests ¶
- Separate home assistant from rest as separate VM/LXC
VyOS & networking ¶
Build LTS VyOS ¶
From Docker
sudo docker pull vyos/vyos-build:current
git clone -b current --single-branch https://github.com/vyos/vyos-build
sudo docker run --rm -it --privileged -v $(pwd):/vyos -w /vyos vyos/vyos-build:current bash
docker: Error response from daemon: failed to create task for container: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: error during container init: error creating device nodes: mount /dev/usb-FTDI_FT232R_USB_UART_AC2F17KR-if00-port0:/var/lib/docker/overlay2/e90113bf3d2c142cb632e9642757e64e4f45120143f684aaf784b6d9a4b7b3b0/merged/dev/usb-FTDI_FT232R_USB_UART_AC2F17KR-if00-port0 (via /proc/self/fd/7), flags: 0x1000: no such file or directory: unknown.
From source
sudo apt install live-build
git clone -b current --single-branch https://github.com/vyos/vyos-build
sudo make clean
#pip install tomli jinja2 git psutil??
Failed.
Installation as KVM ¶
Get image, install and add serial socket (vyos.io) for using xterm.js support (copy-pasting). Also start on boot.
ls /var/lib/vz/template/iso/
qm create 101 --name vyos --memory 2048 --net0 virtio,bridge=vmbr0 --net1 virtio,bridge=vmbr1 --ide2 media=cdrom,file=local:iso/vyos-rolling-latest.iso --virtio0 thinpool_vms:8
qm set 101 --net1 virtio,bridge=vmbr1
qm set 101 -serial0 socket
qm set 101 --onboot 1
Open terminal via Spice/xterm.js, install image, remove image, and reboot
qm start 101
# log in to guest, run `install image` and follow instructions
qm set 101 --ide2 none
qm reboot 101
Enable QEMU guest agent (proxmox.com) in Proxmox (VyOS has this since 2018). Source (itsfullofstars.de)
qm set 101 --agent 1
qm agent 101 ping
Update VyOS version ¶
You can update the VyOS (vyos.io) version by downloading a new image and installing it:
add system image https://s3-us.vyos.io/rolling/current/vyos-rolling-latest.iso
reboot
Unfortunately the rolling releases aren’t signed, only LTS are. There are mythical early production access images (vyos.io) close to LTS which I wasn’t able to find, however.
Upgrading should convert your config automatically. https://forum.vyos.io/t/upgrade-from-1-4-to-1-5/13323 (vyos.io)
Upgrading from 1.4 to 1.5 ¶
From 1.4-rolling-2023* to 1.5.
- Create manual VM backup in PVE.
- Ensure you can connect to the network manually – find port in switch, connect, set manual address, reach pve.
- Add new image:
add system image https://github.com/vyos/vyos-nightly-build/releases/download/1.5-rolling-202411070006/vyos-1.5-rolling-202411070006-generic-amd64.iso
Trying to fetch ISO file from https://github.com/vyos/vyos-nightly-build/releases/download/1.5-rolling-x/vyos-1.5-rolling-x-generic-amd64.iso...
Downloading...
Redirecting to x
The file is 493.000 MiB.
[#######################################################################################################################################################################] 100%
Download complete.
Done.
Checking for digital signature file...
Downloading...
Redirecting to x
The file is 0.334 KiB.
[#######################################################################################################################################################################] 100%
Download complete.
Checking digital signature...
Signature key id in /var/tmp/install-image.11916/vyos-1.5-rolling-x-generic-amd64.iso.minisig is x
but the key id in the public key is x
Signature check FAILED, trying BACKUP key...
Signature key id in /var/tmp/install-image.11916/vyos-1.5-rolling-202411070006-generic-amd64.iso.minisig is x
but the key id in the public key is x
Digital signature is valid.
Checking SHA256 checksums of files on the ISO image... OK.
Done!
What would you like to name this image? [1.5-rolling-x]:
OK. This image will be named: 1.5-rolling-x
Installing "1.5-rolling-x" image.
Copying new release files...
Would you like to save the current configuration
directory and config file? (Yes/No) [Yes]:
Copying current configuration...
Would you like to save the SSH host keys from your
current configuration? (Yes/No) [Yes]:
Copying SSH keys...
Running post-install script...
Done.
- Save config:
configure; save
- Connect to terminal via pve to watch debug output on reboot
- Reboot
reboot
- Optional: Pray
- Reconnect – does not show new version, grub is not updated?
Configure VyOS ¶
Set global settings
set system host-name vyos
set system domain-name lan.vanwerkhoven.org
Configure eth1 (=vmbr1=WAN) as DHCP client
#TODO replace with dhcp query @ right VLAN id later
set interfaces ethernet eth1 vif 300
set interfaces ethernet eth1 vif 300 description "T-mobile WAN"
set interfaces ethernet eth1 vif 300 address dhcp
Set up networking ¶
Setup VLAN on LAN network, see here (vyos.io)
- Source: https://forum.vyos.io/t/bridge-with-vlans/7459 (vyos.io)
- Source: https://blog.kroy.io/2020/05/04/vyos-from-scratch-edition-1/#Configuring_the_LAN_and_Remote_access (kroy.io)
- Source: https://github.com/lamw/PowerCLI-Example-Scripts/blob/master/Modules/VyOS/vyos.template (github.com)
- Source: https://docs.vyos.io/en/latest/configuration/interfaces/bridge.html#using-vlan-aware-bridge (vyos.io)
- Source: https://engineerworkshop.com/blog/configuring-vlans-on-proxmox-an-introductory-guide/ (engineerworkshop.com)
set interfaces ethernet eth0 description LAN
set interfaces bridge br100 enable-vlan
# set interfaces bridge br100 member interface eth0 allowed-vlan 2-4092
set interfaces bridge br100 member interface eth0 allowed-vlan 10
set interfaces bridge br100 member interface eth0 allowed-vlan 20
set interfaces bridge br100 member interface eth0 allowed-vlan 30
set interfaces bridge br100 member interface eth0 allowed-vlan 40
set interfaces bridge br100 vif 10 address 172.17.10.1/24
set interfaces bridge br100 vif 10 description 'VLAN1-Infra'
set interfaces bridge br100 vif 20 address 172.17.20.1/24
set interfaces bridge br100 vif 20 description 'VLAN20-Trusted'
set interfaces bridge br100 vif 30 address 172.17.30.1/24
set interfaces bridge br100 vif 30 description 'VLAN30-Guest'
set interfaces bridge br100 vif 40 address 172.17.40.1/24
set interfaces bridge br100 vif 40 description 'VLAN40-IoT'
set interfaces bridge br100 stp
Enable SSH only on the management interface, without password auth.
set service ssh port '22'
set service ssh listen-address 172.17.10.1
set service ssh disable-password-authentication
set system login user vyos authentication public-keys tim@neptune type ssh-rsa
set system login user vyos authentication public-keys tim@neptune key AAAA...
Harden SSH, only allow strong ciphers, don’t use md5/sha1, don’t use contested nistp256 (stackexchange.com):
set service ssh ciphers aes128-cbc
set service ssh ciphers aes128-ctr
set service ssh ciphers aes128-gcm@openssh.com
set service ssh ciphers aes192-cbc
set service ssh ciphers aes192-ctr
set service ssh ciphers aes256-cbc
set service ssh ciphers aes256-ctr
set service ssh ciphers aes256-gcm@openssh.com
set service ssh ciphers chacha20-poly1305@openssh.com
set service ssh mac hmac-sha2-256
set service ssh mac hmac-sha2-256-etm@openssh.com
set service ssh mac hmac-sha2-512
set service ssh mac hmac-sha2-512-etm@openssh.com
set service ssh key-exchange curve25519-sha256
set service ssh key-exchange curve25519-sha256@libssh.org
set service ssh key-exchange diffie-hellman-group-exchange-sha256
set service ssh key-exchange diffie-hellman-group14-sha256
set service ssh key-exchange diffie-hellman-group16-sha512
set service ssh key-exchange diffie-hellman-group18-sha512
Set timezone & ntp server from global pool:
set system time-zone Europe/Amsterdam
delete system ntp
set system ntp server 0.nl.pool.ntp.org
set system ntp server 1.nl.pool.ntp.org
Enable DNS from DHCP, also for the local machine. Set cache to 100k entries, 10x up from the dnsmasq default. More local caching should give more speed and more privacy (less public querying).
# TODO: fix when going live to WAN interface
# Specifically use name servers received for the interface that is using DHCP client to get an IP
set service dns forwarding dhcp eth1.300
set service dns forwarding allow-from 172.17.0.0/16
set service dns forwarding domain lan.vanwerkhoven.org server 172.17.10.1
set service dns forwarding listen-address 172.17.10.1
set service dns forwarding listen-address 172.17.20.1
set service dns forwarding listen-address 172.17.30.1
set service dns forwarding listen-address 172.17.40.1
set service dns forwarding listen-address 172.17.50.1
set service dns forwarding cache-size 100000
# set system name-server 172.17.10.1 # use static
# set system name-servers eth1.1 # use from dhcp -- not working?
Configure DHCP server ranges for each VLAN: .100-.254 is dynamic, .1-.99 is reserved for static hosts.
delete service dhcp-server shared-network-name vlan10
set service dhcp-server shared-network-name vlan10 subnet 172.17.10.0/24 range vlan10range start 172.17.10.100
set service dhcp-server shared-network-name vlan10 subnet 172.17.10.0/24 range vlan10range stop 172.17.10.254
set service dhcp-server shared-network-name vlan10 subnet 172.17.10.0/24 default-router 172.17.10.1
set service dhcp-server shared-network-name vlan10 subnet 172.17.10.0/24 domain-name lan.vanwerkhoven.org
set service dhcp-server shared-network-name vlan10 subnet 172.17.10.0/24 name-server 172.17.10.1
delete service dhcp-server shared-network-name vlan20
set service dhcp-server shared-network-name vlan20 authoritative
set service dhcp-server shared-network-name vlan20 subnet 172.17.20.0/24 range vlan20range start 172.17.20.100
set service dhcp-server shared-network-name vlan20 subnet 172.17.20.0/24 range vlan20range stop 172.17.20.254
set service dhcp-server shared-network-name vlan20 subnet 172.17.20.0/24 default-router 172.17.20.1
set service dhcp-server shared-network-name vlan20 subnet 172.17.20.0/24 domain-name lan.vanwerkhoven.org
set service dhcp-server shared-network-name vlan20 subnet 172.17.20.0/24 name-server 172.17.20.1
delete service dhcp-server shared-network-name vlan30
set service dhcp-server shared-network-name vlan30 subnet 172.17.30.0/24 range vlan30range start 172.17.30.100
set service dhcp-server shared-network-name vlan30 subnet 172.17.30.0/24 range vlan30range stop 172.17.30.254
set service dhcp-server shared-network-name vlan30 subnet 172.17.30.0/24 default-router 172.17.30.1
set service dhcp-server shared-network-name vlan30 subnet 172.17.30.0/24 domain-name lan.vanwerkhoven.org
set service dhcp-server shared-network-name vlan30 subnet 172.17.30.0/24 name-server 172.17.30.1
delete service dhcp-server shared-network-name vlan40
set service dhcp-server shared-network-name vlan40 subnet 172.17.40.0/24 range vlan40range start 172.17.40.100
set service dhcp-server shared-network-name vlan40 subnet 172.17.40.0/24 range vlan40range stop 172.17.40.254
set service dhcp-server shared-network-name vlan40 subnet 172.17.40.0/24 default-router 172.17.40.1
set service dhcp-server shared-network-name vlan40 subnet 172.17.40.0/24 domain-name lan.vanwerkhoven.org
set service dhcp-server shared-network-name vlan40 subnet 172.17.40.0/24 name-server 172.17.40.1
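The four near-identical blocks above can also be generated with a small loop on a workstation and pasted into a configure session. A sketch; adjust the VLAN list, subnets, and domain to your setup:

```shell
# Emit the per-VLAN dhcp-server commands for VLANs 10/20/30/40.
# Output is meant to be pasted into a VyOS configure session, not executed here.
for v in 10 20 30 40; do
  net=172.17.$v
  echo "delete service dhcp-server shared-network-name vlan$v"
  echo "set service dhcp-server shared-network-name vlan$v subnet $net.0/24 range vlan${v}range start $net.100"
  echo "set service dhcp-server shared-network-name vlan$v subnet $net.0/24 range vlan${v}range stop $net.254"
  echo "set service dhcp-server shared-network-name vlan$v subnet $net.0/24 default-router $net.1"
  echo "set service dhcp-server shared-network-name vlan$v subnet $net.0/24 domain-name lan.vanwerkhoven.org"
  echo "set service dhcp-server shared-network-name vlan$v subnet $net.0/24 name-server $net.1"
done
```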
Set up masquerading for outbound traffic. Ensure high rule number so it’s processed after firewalling.
#TODO: fix WAN VLAN
set nat source rule 5010 outbound-interface 'eth1.300'
set nat source rule 5010 source address '172.17.0.0/16'
set nat source rule 5010 translation address masquerade
set nat source rule 5010 protocol all
set nat source rule 5010 description 'Masquerade for WAN'
Set up static ips/host names. Still not possible to set the hostname of the router itself (reddit.com).
# Internal facing services
set system static-host-mapping host-name vyos.lan.vanwerkhoven.org inet 172.17.10.1 # not sure if this works, already set to 127.0.0.1
set system static-host-mapping host-name proteus.lan.vanwerkhoven.org inet 172.17.10.2
set system static-host-mapping host-name nextcloud.lan.vanwerkhoven.org inet 172.17.10.2
set system static-host-mapping host-name mqtt.lan.vanwerkhoven.org inet 172.17.10.2
set system static-host-mapping host-name influxdb.lan.vanwerkhoven.org inet 172.17.10.2
set system static-host-mapping host-name pve.lan.vanwerkhoven.org inet 172.17.10.4
set system static-host-mapping host-name unifi.lan.vanwerkhoven.org inet 172.17.10.5
set system static-host-mapping host-name pve2.lan.vanwerkhoven.org inet 172.17.10.6
set system static-host-mapping host-name proteus2.lan.vanwerkhoven.org inet 172.17.10.7
# Split DNS for specific hosts (mostly https & ssh) N.B. Ensure you restart DNS or set dns cache-size to 0 to ensure the cache is cleared!
# Public & internal facing services requiring https - to use letsencrypt we can only have 1 subdomain below vanwerkhoven.org
set system static-host-mapping host-name ssh.vanwerkhoven.org inet 172.17.10.2
set system static-host-mapping host-name home.vanwerkhoven.org inet 172.17.10.2
set system static-host-mapping host-name www.vanwerkhoven.org inet 172.17.10.2
set system static-host-mapping host-name nextcloud.vanwerkhoven.org inet 172.17.10.2
set system static-host-mapping host-name photos.vanwerkhoven.org inet 172.17.10.2
set system static-host-mapping host-name grafana.vanwerkhoven.org inet 172.17.10.2
# set system static-host-mapping host-name homeassistant.vanwerkhoven.org inet 172.17.10.20 --> set below
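To check the split-DNS mappings from a LAN client, query the router's resolver directly; if the public address still comes back, the forwarding cache wasn't cleared (see the note above):

```
# Expect the internal address (172.17.10.2), not the public WAN IP
dig +short nextcloud.vanwerkhoven.org @172.17.10.1
```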
set service dhcp-server shared-network-name vlan10 subnet 172.17.10.0/24 static-mapping gs108e ip-address 172.17.10.3
set service dhcp-server shared-network-name vlan10 subnet 172.17.10.0/24 static-mapping gs108e mac-address 78:D2:94:2F:81:F8
set system static-host-mapping host-name gs108e.lan.vanwerkhoven.org inet 172.17.10.3
set service dhcp-server shared-network-name vlan10 subnet 172.17.10.0/24 static-mapping uap-lr1-floor1 ip-address 172.17.10.10
set service dhcp-server shared-network-name vlan10 subnet 172.17.10.0/24 static-mapping uap-lr1-floor1 mac-address 18:E8:29:93:E1:66
set system static-host-mapping host-name uap-lr1-floor1.lan.vanwerkhoven.org inet 172.17.10.10
set service dhcp-server shared-network-name vlan10 subnet 172.17.10.0/24 static-mapping uap-lr2-floor2 ip-address 172.17.10.11
set service dhcp-server shared-network-name vlan10 subnet 172.17.10.0/24 static-mapping uap-lr2-floor2 mac-address 18:E8:29:E6:00:2E
set system static-host-mapping host-name uap-lr2-floor2.lan.vanwerkhoven.org inet 172.17.10.11
set service dhcp-server shared-network-name vlan10 subnet 172.17.10.0/24 static-mapping uap-lr3-floor3 ip-address 172.17.10.12
set service dhcp-server shared-network-name vlan10 subnet 172.17.10.0/24 static-mapping uap-lr3-floor3 mac-address E0:63:DA:79:A7:93
set system static-host-mapping host-name uap-lr3-floor3.lan.vanwerkhoven.org inet 172.17.10.12
set service dhcp-server shared-network-name vlan10 subnet 172.17.10.0/24 static-mapping homeassistantvm ip-address 172.17.10.20
set service dhcp-server shared-network-name vlan10 subnet 172.17.10.0/24 static-mapping homeassistantvm mac-address 02:85:73:A4:71:88
# Don't use LAN because we might want to reach HA also externally
set system static-host-mapping host-name homeassistant.vanwerkhoven.org inet 172.17.10.20
set service dhcp-server shared-network-name vlan20 subnet 172.17.20.0/24 static-mapping philips-hue ip-address 172.17.20.21
set service dhcp-server shared-network-name vlan20 subnet 172.17.20.0/24 static-mapping philips-hue mac-address 00:17:88:79:93:47
set system static-host-mapping host-name philips-hue.lan.vanwerkhoven.org inet 172.17.20.21
set service dhcp-server shared-network-name vlan20 subnet 172.17.20.0/24 static-mapping appletv-living ip-address 172.17.20.20
set service dhcp-server shared-network-name vlan20 subnet 172.17.20.0/24 static-mapping appletv-living mac-address D0:03:4B:26:85:0C
set system static-host-mapping host-name appletv-living.lan.vanwerkhoven.org inet 172.17.20.20
set service dhcp-server shared-network-name vlan20 subnet 172.17.20.0/24 static-mapping esp-mobile ip-address 172.17.20.30
set service dhcp-server shared-network-name vlan20 subnet 172.17.20.0/24 static-mapping esp-mobile mac-address 84:0d:8e:8f:52:f5
set system static-host-mapping host-name esp-mobile.lan.vanwerkhoven.org inet 172.17.20.30
set service dhcp-server shared-network-name vlan40 subnet 172.17.40.0/24 static-mapping esp8266-iapetus ip-address 172.17.40.30
set service dhcp-server shared-network-name vlan40 subnet 172.17.40.0/24 static-mapping esp8266-iapetus mac-address 84:0d:8e:8f:4e:11
set system static-host-mapping host-name esp8266-iapetus.lan.vanwerkhoven.org inet 172.17.40.30
Optional: change WAN uplink config ¶
e.g. for 4G/LTE backup internet or a provider change
TODO: set up WAN load balancing
- Possible solution: https://forum.vyos.io/t/wan-failover-with-dhcp/13920 (vyos.io)
- Docs: https://docs.vyos.io/en/latest/configuration/loadbalancing/wan.html (vyos.io)
- Solution without firewall https://forum.vyos.io/t/pbr-wan-failover/8364/7 (vyos.io)
- https://forum.vyos.io/t/wan-load-balancing-failover-conntrack-issues/14266/4 (vyos.io)
- Something like this, but not as pseudo-config: https://forum.vyos.io/t/wan-load-balancing-source-destination-group-support/12172 (vyos.io)
- https://forum.vyos.io/t/wan-failover-not-working-for-me-rolling-1-4-versions-any-help-please/8886 (vyos.io)
- https://forum.vyos.io/t/specifying-outbound-interface-on-destination-translation-nat/8192 (vyos.io)
Set interfaces, these can persist
set interfaces ethernet eth1 vif 300 address 'dhcp'
set interfaces ethernet eth1 vif 300 description 'Odido WAN'
set interfaces ethernet eth1 vif 6 description 'KPN WAN'
set interfaces ethernet eth1 address 'dhcp'
set interfaces ethernet eth1 description 'Odido 4G LTE WAN'
Optional: set Experiabox in DMZ
- Check client shows up in client list (2.254)
- Set static IP on DHCP tab (2.254)
- Click on client, select DMZ tab, and enable DMZ for this client (2.254)
PPPoE settings for KPN (source (ctfassets.net)):
- PPPoE via VLAN 6 (802.1q).
- PPPoE authentication PAP with username and password (e.g. internet / internet).
- Maximum packet size (MTU) 1500 bytes (RFC 4638)
- Obtain IPv4 address + DNS servers via PPPoE
- Obtain IPv6 address range + DNS servers (IPv6) via DHCPv6-PD request (in PPPoE). Use one address for the router from this range.
Guides to get KPN PPPoE working
- Non-working: https://community.kpn.com/thuisnetwerk-72/gebruik-een-eigen-router-i-p-v-de-experia-box-458609/index179.html (kpn.com) & https://pastebin.com/S2tWY98e (pastebin.com)
- Non-working: https://gathering.tweakers.net/forum/list_messages/2205126 (tweakers.net)
- Working Vyatta https://gist.github.com/Ruben-E/abb9a4a872a7c4ffff058ae291ef2627 (github.com)
- Working Vyatta https://jelleraaijmakers.nl/2021/11/kpn-internet-and-iptv-with-edgerouter-x (jelleraaijmakers.nl)
- Why you need MSS clamping https://samuel.kadolph.com/2015/02/mtu-and-tcp-mss-when-using-pppoe-2/ (kadolph.com)
# Optional, already arranged via pppoe0
# set interfaces ethernet eth0 vif 6 mtu 1508
set interfaces pppoe pppoe0 description 'KPN WAN'
set interfaces pppoe pppoe0 mtu 1500
set interfaces pppoe pppoe0 authentication username 'internet'
set interfaces pppoe pppoe0 authentication password 'internet'
set interfaces pppoe pppoe0 source-interface 'eth1.6'
# Ensure we get DNS from ISP, i.e. no-peer-dns must be unset
delete interfaces pppoe pppoe0 no-peer-dns
set interfaces ethernet eth1 vif 6 ip adjust-mss clamp-mss-to-pmtu
set interfaces pppoe pppoe0 ip adjust-mss 1352
# set interfaces pppoe pppoe0 ip adjust-mss clamp-mss-to-pmtu
The maximum ICMP payload that gets through unfragmented is 1468 bytes, i.e. a 1496-byte IP packet (1468 bytes payload + 8 bytes ICMP header + 20 bytes IP header):
vyos@vyos:~$ /usr/bin/ping -M do -s 1468 8.8.8.8
PING 8.8.8.8 (8.8.8.8) 1468(1496) bytes of data.
1476 bytes from 8.8.8.8: icmp_seq=1 ttl=120 time=3.73 ms
1476 bytes from 8.8.8.8: icmp_seq=2 ttl=120 time=4.10 ms
^C
--- 8.8.8.8 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1001ms
rtt min/avg/max/mdev = 3.729/3.915/4.101/0.186 ms
vyos@vyos:~$ /usr/bin/ping -M do -s 1469 8.8.8.8
PING 8.8.8.8 (8.8.8.8) 1469(1497) bytes of data.
^C
--- 8.8.8.8 ping statistics ---
8 packets transmitted, 0 received, 100% packet loss, time 7170ms
Debugging: some websites randomly fail to load even though DNS works; Safari/Firefox time out
# Ensure there's no lingering dhcp leases with conflicting router/dns settings
renew dhcp interface eth1
# Disable QoS
delete qos interface eth1
# Delete double interfaces
delete firewall zone WAN interface 'eth1.300'
delete firewall zone WAN interface 'eth1'
delete firewall zone WAN interface 'eth1.6'
set firewall zone WAN interface 'pppoe0'
# Experiment with MTU/MSS
# Source: https://samuel.kadolph.com/2015/02/mtu-and-tcp-mss-when-using-pppoe-2/
# Source: https://gathering.tweakers.net/forum/list_message/76682408#76682408
# Source: https://community.kpn.com/thuisnetwerk-72/wat-zijn-de-juiste-waarde-voor-mtu-en-mss-bij-gebruik-kpn-glasvezel-600779
set interfaces pppoe pppoe0 mtu 1500
# Eth VLAN MTU (+8byte for PPPoE)
set interfaces ethernet eth1 vif 6 mtu 1508
# Eth VLAN MTU (+4byte for VLAN tag)
set interfaces ethernet eth1 mtu 1512
# 40 byte less for IP + TCP header
set interfaces ethernet eth1 vif 6 ip adjust-mss 1460
# Result: network unreachable / no download / partial website
set interfaces pppoe pppoe0 mtu 1500
set interfaces ethernet eth1 vif 6 mtu 1508
set interfaces ethernet eth1 mtu 1512
set interfaces ethernet eth1 vif 6 ip adjust-mss clamp-mss-to-pmtu
# Result: network unreachable / no download / partial website
# set interfaces pppoe pppoe0 mtu 1492
# set interfaces ethernet eth1 mtu 1500
Other config ¶
set interfaces ethernet eth1 mtu '1512'
set interfaces ethernet eth1 vif 6 mtu '1508'
set interfaces pppoe pppoe0 ip adjust-mss 'clamp-mss-to-pmtu'
set interfaces pppoe pppoe0 mtu '1500'
# No fast.com / slashdot
set interfaces ethernet eth1 mtu '1520'
set interfaces ethernet eth1 vif 6 mtu '1500'
set interfaces pppoe pppoe0 ip adjust-mss 'clamp-mss-to-pmtu'
set interfaces pppoe pppoe0 mtu '1492'
# fast.com OK
set interfaces ethernet eth1 mtu '1504'
set interfaces ethernet eth1 vif 6 mtu '1500'
set interfaces pppoe pppoe0 ip adjust-mss 'clamp-mss-to-pmtu'
set interfaces pppoe pppoe0 mtu '1492'
# fast.com OK
set interfaces pppoe pppoe0 ip adjust-mss 'clamp-mss-to-pmtu'
set interfaces pppoe pppoe0 mtu '1500'
# fast.com / DDG NOK
set interfaces pppoe pppoe0 mtu '1500'
# fast.com / DDG NOK
set interfaces pppoe pppoe0 ip adjust-mss 'clamp-mss-to-pmtu'
set interfaces pppoe pppoe0 mtu '1492'
# fast.com / DDG OK --> minimal working setup
set interfaces pppoe pppoe0 mtu '1492'
# fast.com / DDG NOK
set interfaces pppoe pppoe0 ip adjust-mss 'clamp-mss-to-pmtu'
set interfaces pppoe pppoe0 mtu '1500'
# On the Proxmox host: set MTU of enx7c10c9194780 / vmbr1 to 9000
# fast.com / DDG NOK, internet works
set interfaces ethernet eth1 mtu '1512'
set interfaces ethernet eth1 vif 6 mtu '1508'
set interfaces pppoe pppoe0 ip adjust-mss 'clamp-mss-to-pmtu'
set interfaces pppoe pppoe0 mtu '1500'
# On the Proxmox host: set MTU of enx7c10c9194780 / vmbr1 to 9000
# fast.com / DDG NOK, internet works
Continue ¶
TODO Try clamp MSS: https://docs.vyos.io/en/latest/configuration/interfaces/ethernet.html#cfgcmd-set-interfaces-ethernet-interface-ip-adjust-mss-mss-clamp-mss-to-pmtu (vyos.io) MSS value = MTU - 20 (IP header) - 20 (TCP header), resulting in 1452 bytes on a 1492 byte MTU.
set interfaces ethernet <interface> ip adjust-mss <mss | clamp-mss-to-pmtu>
set firewall options interface <interface> adjust-mss 1460
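As a sanity check of that arithmetic (plain POSIX shell, nothing VyOS-specific):

```shell
# MSS = MTU - 20 (IP header) - 20 (TCP header)
mtu=1492
mss=$((mtu - 20 - 20))
echo "$mss"   # → 1452
```

which matches the 1452 bytes the docs quote for a 1492-byte PPPoE MTU.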
Prepare for IPv6 on KPN PPPoE ¶
https://blog.daknob.net/ipv6-first-with-vyos/ (daknob.net)
# set interfaces pppoe pppoe0 ipv6 address autoconf - needed sometimes
set interfaces pppoe pppoe0 dhcpv6-options pd 1 length '56'
set interfaces pppoe pppoe0 dhcpv6-options pd 1 interface br100.10 address 1
set interfaces pppoe pppoe0 dhcpv6-options pd 1 interface br100.10 sla-id 10
set interfaces pppoe pppoe0 dhcpv6-options pd 1 interface br100.20 address 1
set interfaces pppoe pppoe0 dhcpv6-options pd 1 interface br100.20 sla-id 20
set interfaces pppoe pppoe0 dhcpv6-options pd 1 interface br100.30 address 1
set interfaces pppoe pppoe0 dhcpv6-options pd 1 interface br100.30 sla-id 30
set interfaces pppoe pppoe0 dhcpv6-options pd 1 interface br100.40 address 1
set interfaces pppoe pppoe0 dhcpv6-options pd 1 interface br100.40 sla-id 40
# Enable DNS https://blog.daknob.net/ipv6-first-with-vyos/
set service dns forwarding dnssec validate
# set service dns forwarding ignore-hosts-file
set service dns forwarding dns64-prefix 64:ff9b::/96 # what is needed?
set service dns forwarding listen-address fd53::53
# set service dns forwarding listen-address 192.168.1.1
# Distribute IPs to clients
set service router-advert interface br100.10 name-server fd53::53
set service router-advert interface br100.10 prefix ::/64 valid-lifetime '172800'
set service router-advert interface br100.10 link-mtu '1492'
# set service router-advert interface br100.10 nat64prefix 64:ff9b::/96
# set service router-advert interface br100.10 link-mtu '1492'
# set service router-advert interface br100.10 name-server <NAME SERVER> -- how can we use ISP?
# set service router-advert interface br100.10 prefix ::/64 valid-lifetime '172800'
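Once the DHCPv6-PD lease is up, the delegated prefixes and IPv6 routes can be inspected from op mode (command names as in VyOS 1.3; 1.4 may differ):

```
show interfaces
show ipv6 route
```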
Update QoS, DNS & firewall; these can also persist and be set at the same time
set qos interface eth1.300 egress 'WAN_QUEUE'
set qos interface pppoe0 egress 'WAN_QUEUE'
set qos interface eth1 egress 'WAN_QUEUE'
set service dns forwarding dhcp 'eth1.300'
set service dns forwarding dhcp 'pppoe0'
#-- not applicable for pppoe -- why not?
set service dns forwarding dhcp 'eth1'
### Set firewall, these can persist in an interface-group or listed separately
set firewall group interface-group WAN interface eth1.300
set firewall group interface-group WAN interface pppoe0
set firewall group interface-group WAN interface eth1
# set firewall zone WAN interface WAN
set firewall zone WAN interface 'eth1.300'
set firewall zone WAN interface 'pppoe0'
set firewall zone WAN interface 'eth1'
Set NAT, this needs to be adjusted per outbound VLAN
set nat destination rule 100 inbound-interface 'eth1.300'
set nat destination rule 102 inbound-interface 'eth1.300'
set nat destination rule 104 inbound-interface 'eth1.300'
set nat destination rule 106 inbound-interface 'eth1.300'
set nat destination rule 108 inbound-interface 'eth1.300'
set nat source rule 5001 outbound-interface 'eth1.300'
set nat source rule 5010 outbound-interface 'eth1.300'
set nat destination rule 100 inbound-interface 'pppoe0'
set nat destination rule 102 inbound-interface 'pppoe0'
set nat destination rule 104 inbound-interface 'pppoe0'
set nat destination rule 106 inbound-interface 'pppoe0'
set nat destination rule 108 inbound-interface 'pppoe0'
set nat source rule 5001 outbound-interface 'pppoe0'
set nat source rule 5010 outbound-interface 'pppoe0'
set nat destination rule 100 inbound-interface 'eth1'
set nat destination rule 102 inbound-interface 'eth1'
set nat destination rule 104 inbound-interface 'eth1'
set nat destination rule 106 inbound-interface 'eth1'
set nat destination rule 108 inbound-interface 'eth1'
set nat source rule 5001 outbound-interface 'eth1'
set nat source rule 5010 outbound-interface 'eth1'
delete interfaces ethernet eth1 mtu
delete interfaces ethernet eth1 vif 6 mtu
delete interfaces pppoe pppoe0 mtu
delete interfaces ethernet eth1 vif 6 ip adjust-mss
Set up port forwarding ¶
- 10022 to 172.17.10.4:22
- 443 to 172.17.10.2:443
- 80 to 172.17.10.2:80
- 8883 to 172.17.10.2:1883
set nat destination rule 100 description 'Port Forward: SSH to 172.17.10.2'
set nat destination rule 100 destination port '22'
set nat destination rule 100 inbound-interface 'eth1.300'
set nat destination rule 100 protocol 'tcp'
set nat destination rule 100 translation address '172.17.10.2'
set nat destination rule 102 description 'Port Forward: HTTP to 172.17.10.2'
set nat destination rule 102 destination port '80'
set nat destination rule 102 inbound-interface 'eth1.300'
set nat destination rule 102 protocol 'tcp'
set nat destination rule 102 translation address '172.17.10.2'
set nat destination rule 104 description 'Port Forward: HTTPS to 172.17.10.2'
set nat destination rule 104 destination port '443'
set nat destination rule 104 inbound-interface 'eth1.300'
set nat destination rule 104 protocol 'tcp'
set nat destination rule 104 translation address '172.17.10.2'
set nat destination rule 106 description 'Port Forward: MQTT to 172.17.10.2'
set nat destination rule 106 destination port '8883'
set nat destination rule 106 inbound-interface 'eth1.300'
set nat destination rule 106 protocol 'tcp'
set nat destination rule 106 translation address '172.17.10.2'
set nat destination rule 106 translation port '1883'
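A quick check of the forwards, run from a host outside the network (`<WAN IP>` is a placeholder for the current public address):

```
nc -vz -w 3 <WAN IP> 22
nc -vz -w 3 <WAN IP> 80
nc -vz -w 3 <WAN IP> 443
nc -vz -w 3 <WAN IP> 8883
```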
Hairpin NAT (vyos.io) is not implemented (vyos.net) well for dynamic IPs (see here (reddit.com) and here (vyos.io)), so we use split DNS for local resolving instead.
For reference, a hairpin NAT setup looks like:
set nat destination rule 100 description 'Port Forward SSH'
set nat destination rule 100 destination port '22'
set nat destination rule 100 inbound-interface '<WAN interface>' # WAN interface
set nat destination rule 100 protocol 'tcp'
set nat destination rule 100 translation address '<LAN IP>' # LAN IP
set nat destination rule 101 description 'Port Forward: SSH (NAT Reflection: INSIDE)'
set nat destination rule 101 destination port '22'
set nat destination rule 101 destination address '<WAN IP>' # WAN IP --> required but not in official docs
set nat destination rule 101 inbound-interface '<LAN interface>' # LAN interface
set nat destination rule 101 protocol 'tcp'
set nat destination rule 101 translation address '<LAN IP>' # LAN IP
set nat source rule 100 description 'Port Forward: all to <LAN RANGE>/24 (NAT Reflection: INSIDE)'
set nat source rule 100 destination address '<LAN RANGE>/24'
set nat source rule 100 source address '<LAN RANGE>/24'
set nat source rule 100 outbound-interface '<LAN interface>' # LAN interface
set nat source rule 100 protocol 'tcp'
set nat source rule 100 translation address 'masquerade'
Set up firewall ¶
Set up zone-based firewall using the following zones:
- WAN: Internet
- Local: router itself, access to everything (VPN, DNS, DHCP, etc.)
- Infra: trusted VLAN with infrastructure, access to everything (switch, server, pve, access points)
- Trusted: trusted clients, limited access to infra (e.g. home laptops, appletv, phones, ipad, hue)
- Guest: untrusted clients, only access to WAN (e.g. work laptops, work phones, guest phones, thermostat)
- IoT: untrusted clients, only access to server in Infra, no WAN access (esp clients)
Rules:
- FW_ACCEPT: drop invalid, accept rest
- FW_DROP: drop all
- FW_2LOCAL: allow DNS, DHCP, SSH, Wireguard (to router)
- FW_TRUST2INFRA: allow trusted clients to reach: server (SSH, HTTP, HTTPS, Home Assistant (via proxy?), Grafana (via proxy?)), pve (SSH), unifi (web only?)
- FW_IOT2INFRA: allow IOT to reach server (MQTT(S)/Home Assistant API/HTTP(S))
- FW_GUEST2TRUST: allow guest clients to reach: appleTV (all ports, mdns)
- FW_WAN2ALL: allow established & related, drop rest
- FW_WAN2INFRA: allow established & related, allow port forwards, drop rest. For port forwards we only have to specify the port, as the IP is implied by the port forwarding rule set up earlier (e.g. allowing port 80 doesn’t open http on all hosts because the forward only allows to 1 specific host)
- FW_WAN2LOCAL: allow established & related, allow wireguard (maybe IKEv2 later), drop rest.
| from \ to | Local | Infra | Trusted | Guest | IoT | WAN |
|---|---|---|---|---|---|---|
| Local | | FW_ACCEPT | FW_ACCEPT | FW_ACCEPT | FW_ACCEPT | FW_ACCEPT |
| Infra | FW_ACCEPT | | FW_ACCEPT | FW_ACCEPT | FW_ACCEPT | FW_ACCEPT |
| Trust. | FW_2LOCAL | FW_TRUST2INFRA | | FW_ACCEPT | FW_ACCEPT | FW_ACCEPT |
| Guest | FW_2LOCAL | FW_GUEST2INFRA | FW_GUEST2TRUST | | FW_DROP | FW_ACCEPT |
| IoT | FW_2LOCAL | FW_IOT2INFRA | FW_DROP | FW_DROP | | FW_IOT2WAN |
| WAN | FW_WAN2LOCAL | FW_WAN2INFRA | FW_WAN2ALL | FW_WAN2ALL | FW_WAN2ALL | |
FW_ACCEPT
set firewall name FW_ACCEPT default-action accept
set firewall name FW_ACCEPT enable-default-log # temp
set firewall name FW_ACCEPT rule 200 action drop
set firewall name FW_ACCEPT rule 200 description 'drop invalid'
set firewall name FW_ACCEPT rule 200 state invalid enable
set firewall name FW_ACCEPT rule 200 log enable # temp
FW_DROP
set firewall name FW_DROP default-action drop
FW_WAN2ALL
set firewall name FW_WAN2ALL default-action drop
set firewall name FW_WAN2ALL rule 200 action accept
set firewall name FW_WAN2ALL rule 200 description 'accept established/related'
set firewall name FW_WAN2ALL rule 200 state established enable
set firewall name FW_WAN2ALL rule 200 state related enable
set firewall name FW_WAN2ALL rule 210 action accept
set firewall name FW_WAN2ALL rule 210 description 'wireguard maybe also needed'
set firewall name FW_WAN2ALL rule 210 destination port 51820
set firewall name FW_WAN2ALL rule 210 protocol udp
set firewall name FW_WAN2ALL rule 210 state new enable
FW_WAN2LOCAL
set firewall name FW_WAN2LOCAL default-action drop
# Wireguard debugging
set firewall name FW_WAN2LOCAL enable-default-log
set firewall name FW_WAN2LOCAL log enable
set firewall name FW_WAN2LOCAL rule 200 action accept
set firewall name FW_WAN2LOCAL rule 200 description 'accept established/related'
set firewall name FW_WAN2LOCAL rule 200 state established enable
set firewall name FW_WAN2LOCAL rule 200 state related enable
set firewall name FW_WAN2LOCAL rule 200 log enable
set firewall name FW_WAN2LOCAL rule 210 action accept
set firewall name FW_WAN2LOCAL rule 210 description 'wireguard'
set firewall name FW_WAN2LOCAL rule 210 destination port 51820
set firewall name FW_WAN2LOCAL rule 210 protocol udp
set firewall name FW_WAN2LOCAL rule 210 state new enable
set firewall name FW_WAN2LOCAL rule 210 log enable
FW_WAN2INFRA
set firewall name FW_WAN2INFRA default-action drop
set firewall name FW_WAN2INFRA rule 200 action accept
set firewall name FW_WAN2INFRA rule 200 description 'accept established/related'
set firewall name FW_WAN2INFRA rule 200 state established enable
set firewall name FW_WAN2INFRA rule 200 state related enable
set firewall name FW_WAN2INFRA rule 210 action accept
set firewall name FW_WAN2INFRA rule 210 description 'accept port forwards'
set firewall name FW_WAN2INFRA rule 210 log enable
set firewall name FW_WAN2INFRA rule 210 protocol tcp
set firewall name FW_WAN2INFRA rule 210 destination port 22,80,443,1883
set firewall name FW_WAN2INFRA rule 210 destination address 172.17.10.4
set firewall name FW_WAN2INFRA rule 210 state new 'enable'
# set firewall name FW_WAN2INFRA rule 220 action accept
# set firewall name FW_WAN2INFRA rule 220 description 'wireguard - just testing now'
# set firewall name FW_WAN2INFRA rule 220 destination port 51820
# set firewall name FW_WAN2INFRA rule 220 protocol udp
FW_2LOCAL
set firewall name FW_2LOCAL default-action drop
set firewall name FW_2LOCAL rule 200 action accept
set firewall name FW_2LOCAL rule 200 description 'accept established/related'
set firewall name FW_2LOCAL rule 200 log disable
set firewall name FW_2LOCAL rule 200 state established enable
set firewall name FW_2LOCAL rule 200 state related enable
set firewall name FW_2LOCAL rule 210 action accept
set firewall name FW_2LOCAL rule 210 description 'accept dhcp'
set firewall name FW_2LOCAL rule 210 log disable
set firewall name FW_2LOCAL rule 210 protocol udp
set firewall name FW_2LOCAL rule 210 destination port 67-68
set firewall name FW_2LOCAL rule 220 action accept
set firewall name FW_2LOCAL rule 220 description 'accept dns'
set firewall name FW_2LOCAL rule 220 log disable
set firewall name FW_2LOCAL rule 220 protocol udp
set firewall name FW_2LOCAL rule 220 destination port 53
set firewall name FW_2LOCAL rule 230 action accept
set firewall name FW_2LOCAL rule 230 description 'accept ssh'
set firewall name FW_2LOCAL rule 230 log disable
set firewall name FW_2LOCAL rule 230 protocol tcp
set firewall name FW_2LOCAL rule 230 destination port 22
# delete firewall name FW_2LOCAL rule 240
# set firewall name FW_2LOCAL rule 240 action accept
# set firewall name FW_2LOCAL rule 240 description 'accept wireguard (not sure if needed - seems not)'
# set firewall name FW_2LOCAL rule 240 log disable
# set firewall name FW_2LOCAL rule 240 protocol udp
# set firewall name FW_2LOCAL rule 240 destination port 51820
FW_TRUST2INFRA
set firewall name FW_TRUST2INFRA default-action drop
set firewall name FW_TRUST2INFRA rule 200 action accept
set firewall name FW_TRUST2INFRA rule 200 description 'accept established/related'
set firewall name FW_TRUST2INFRA rule 200 log disable
set firewall name FW_TRUST2INFRA rule 200 state established enable
set firewall name FW_TRUST2INFRA rule 200 state related enable
set firewall name FW_TRUST2INFRA rule 210 action accept
set firewall name FW_TRUST2INFRA rule 210 description 'accept mqtt(s)/http(s)/HA/ssh/grafana/jellyfin&emby/plex/iperf/transmission to proteus'
set firewall name FW_TRUST2INFRA rule 210 destination address 172.17.10.2
set firewall name FW_TRUST2INFRA rule 210 protocol tcp
set firewall name FW_TRUST2INFRA rule 210 destination port 8883,1883,80,443,8123,22,3000,8096,32400,32469,7575,9001
set firewall name FW_TRUST2INFRA rule 211 action accept
set firewall name FW_TRUST2INFRA rule 211 description 'accept plex network discovery to proteus -- not working yet '
set firewall name FW_TRUST2INFRA rule 211 destination address 172.17.10.2
set firewall name FW_TRUST2INFRA rule 211 protocol udp
set firewall name FW_TRUST2INFRA rule 211 destination port 1900,5353,32410,32412,32413,32414
set firewall name FW_TRUST2INFRA rule 220 action accept
set firewall name FW_TRUST2INFRA rule 220 description 'accept ssh to pve'
set firewall name FW_TRUST2INFRA rule 220 destination address 172.17.10.4
set firewall name FW_TRUST2INFRA rule 220 protocol tcp
set firewall name FW_TRUST2INFRA rule 220 destination port 22
set firewall name FW_TRUST2INFRA rule 230 action accept
set firewall name FW_TRUST2INFRA rule 230 description 'accept ssh/web to unifi controller'
set firewall name FW_TRUST2INFRA rule 230 destination address 172.17.10.5
set firewall name FW_TRUST2INFRA rule 230 protocol tcp
set firewall name FW_TRUST2INFRA rule 230 destination port 22,443,8443
set firewall name FW_TRUST2INFRA rule 240 action accept
set firewall name FW_TRUST2INFRA rule 240 description 'accept HA to home assistant'
set firewall name FW_TRUST2INFRA rule 240 destination address 172.17.10.20
set firewall name FW_TRUST2INFRA rule 240 protocol tcp
set firewall name FW_TRUST2INFRA rule 240 destination port 8123
FW_GUEST2INFRA
set firewall name FW_GUEST2INFRA default-action drop
set firewall name FW_GUEST2INFRA rule 200 action accept
set firewall name FW_GUEST2INFRA rule 200 description 'accept established/related'
set firewall name FW_GUEST2INFRA rule 200 log disable
set firewall name FW_GUEST2INFRA rule 200 state established enable
set firewall name FW_GUEST2INFRA rule 200 state related enable
set firewall name FW_GUEST2INFRA rule 210 action accept
set firewall name FW_GUEST2INFRA rule 210 description 'accept http(s)/ssh to proteus'
set firewall name FW_GUEST2INFRA rule 210 destination address 172.17.10.2
set firewall name FW_GUEST2INFRA rule 210 protocol tcp
set firewall name FW_GUEST2INFRA rule 210 destination port 22,80,443
set firewall name FW_GUEST2INFRA rule 220 action accept
set firewall name FW_GUEST2INFRA rule 220 description 'accept http(s)/ssh to pve'
set firewall name FW_GUEST2INFRA rule 220 destination address 172.17.10.4
set firewall name FW_GUEST2INFRA rule 220 protocol tcp
set firewall name FW_GUEST2INFRA rule 220 destination port 22,80,443
FW_IOT2INFRA
set firewall name FW_IOT2INFRA default-action drop
set firewall name FW_IOT2INFRA rule 200 action accept
set firewall name FW_IOT2INFRA rule 200 description 'accept established/related'
set firewall name FW_IOT2INFRA rule 200 log disable
set firewall name FW_IOT2INFRA rule 200 state established enable
set firewall name FW_IOT2INFRA rule 200 state related enable
set firewall name FW_IOT2INFRA rule 210 action accept
set firewall name FW_IOT2INFRA rule 210 description 'accept mqtt(s)/HA API to proteus'
set firewall name FW_IOT2INFRA rule 210 destination address 172.17.10.2
set firewall name FW_IOT2INFRA rule 210 protocol tcp
set firewall name FW_IOT2INFRA rule 210 destination port 8883,1883,6053
set firewall name FW_IOT2INFRA rule 220 action accept
set firewall name FW_IOT2INFRA rule 220 description 'accept NibeGW UDP traffic to proteus'
set firewall name FW_IOT2INFRA rule 220 destination address 172.17.10.2
set firewall name FW_IOT2INFRA rule 220 protocol udp
set firewall name FW_IOT2INFRA rule 220 destination port 9999,10000
FW_IOT2WAN
set firewall name FW_IOT2WAN default-action drop
set firewall name FW_IOT2WAN rule 200 action accept
set firewall name FW_IOT2WAN rule 200 description 'accept established/related'
set firewall name FW_IOT2WAN rule 200 log disable
set firewall name FW_IOT2WAN rule 200 state established enable
set firewall name FW_IOT2WAN rule 200 state related enable
set firewall name FW_IOT2WAN rule 210 action accept
set firewall name FW_IOT2WAN rule 210 description 'accept solaredge to phone home'
set firewall name FW_IOT2WAN rule 210 source address 172.17.40.35
set firewall name FW_IOT2WAN rule 210 protocol tcp
FW_GUEST2TRUST
set firewall name FW_GUEST2TRUST default-action drop
set firewall name FW_GUEST2TRUST rule 200 action accept
set firewall name FW_GUEST2TRUST rule 200 description 'accept established/related'
set firewall name FW_GUEST2TRUST rule 200 log disable
set firewall name FW_GUEST2TRUST rule 200 state established enable
set firewall name FW_GUEST2TRUST rule 200 state related enable
set firewall name FW_GUEST2TRUST rule 210 action accept
set firewall name FW_GUEST2TRUST rule 210 description 'accept access to AppleTV'
set firewall name FW_GUEST2TRUST rule 210 destination address 172.17.20.20
set firewall name FW_GUEST2TRUST rule 210 protocol tcp_udp
Apply firewall zones to interfaces
| from \ to | Local | Infra | Trusted | Guest | IoT | WAN |
|---|---|---|---|---|---|---|
| Local | | FW_ACCEPT | FW_ACCEPT | FW_ACCEPT | FW_ACCEPT | FW_ACCEPT |
| Infra | FW_ACCEPT | | FW_ACCEPT | FW_ACCEPT | FW_ACCEPT | FW_ACCEPT |
| Trust. | FW_2LOCAL | FW_TRUST2INFRA | | FW_ACCEPT | FW_ACCEPT | FW_ACCEPT |
| Guest | FW_2LOCAL | FW_DROP | FW_GUEST2TRUST | | FW_DROP | FW_ACCEPT |
| IoT | FW_2LOCAL | FW_IOT2INFRA | FW_IOT2INFRA | FW_DROP | | FW_IOT2WAN |
| WAN | FW_WAN2LOCAL | FW_WAN2INFRA | FW_WAN2ALL | FW_WAN2ALL | FW_WAN2ALL | |
@TODO fix VLAN before live
# Temp disable firewall
delete firewall zone LOCAL
delete firewall zone INFRA
delete firewall zone TRUSTED
delete firewall zone GUEST
delete firewall zone IOT
delete firewall zone WAN
set firewall zone LOCAL local-zone
set firewall zone LOCAL default-action drop
set firewall zone LOCAL from INFRA firewall name FW_ACCEPT
set firewall zone LOCAL from TRUSTED firewall name FW_2LOCAL
set firewall zone LOCAL from GUEST firewall name FW_2LOCAL
set firewall zone LOCAL from IOT firewall name FW_2LOCAL
set firewall zone LOCAL from WAN firewall name FW_WAN2LOCAL
set firewall zone INFRA interface br100.10
set firewall zone INFRA default-action drop
set firewall zone INFRA from LOCAL firewall name FW_ACCEPT
set firewall zone INFRA from TRUSTED firewall name FW_TRUST2INFRA
set firewall zone INFRA from GUEST firewall name FW_GUEST2INFRA
set firewall zone INFRA from IOT firewall name FW_IOT2INFRA
set firewall zone INFRA from WAN firewall name FW_WAN2INFRA
set firewall zone TRUSTED interface br100.20
set firewall zone TRUSTED interface wg0
set firewall zone TRUSTED default-action drop
set firewall zone TRUSTED from LOCAL firewall name FW_ACCEPT
set firewall zone TRUSTED from INFRA firewall name FW_ACCEPT
set firewall zone TRUSTED from GUEST firewall name FW_GUEST2TRUST
set firewall zone TRUSTED from IOT firewall name FW_IOT2INFRA
set firewall zone TRUSTED from WAN firewall name FW_WAN2ALL
set firewall zone GUEST interface br100.30
set firewall zone GUEST default-action drop
set firewall zone GUEST from LOCAL firewall name FW_ACCEPT
set firewall zone GUEST from INFRA firewall name FW_ACCEPT
set firewall zone GUEST from TRUSTED firewall name FW_ACCEPT
set firewall zone GUEST from IOT firewall name FW_DROP
set firewall zone GUEST from WAN firewall name FW_WAN2ALL
set firewall zone IOT interface br100.40
set firewall zone IOT default-action drop
set firewall zone IOT from LOCAL firewall name FW_ACCEPT
set firewall zone IOT from INFRA firewall name FW_ACCEPT
set firewall zone IOT from TRUSTED firewall name FW_ACCEPT
set firewall zone IOT from GUEST firewall name FW_DROP
set firewall zone IOT from WAN firewall name FW_WAN2ALL
set firewall zone WAN interface eth1.300
set firewall zone WAN default-action drop
set firewall zone WAN from LOCAL firewall name FW_ACCEPT
set firewall zone WAN from INFRA firewall name FW_ACCEPT
set firewall zone WAN from TRUSTED firewall name FW_ACCEPT
set firewall zone WAN from GUEST firewall name FW_ACCEPT
set firewall zone WAN from IOT firewall name FW_IOT2WAN
The firewall restarts automatically on commit. If you don’t notice the changes, you probably made a mistake :p
Enable the mDNS reflector so guests can reach the AppleTV (.30 to .20) and Plex can reach the AppleTV (.10 to .20). Also set up igmp-proxy for the Plex media server (maybe; see https://www.reddit.com/r/Ubiquiti/comments/6rdqx8/plex_server_across_multiple_subnets/ (reddit.com))
set service mdns repeater interface br100.10
set service mdns repeater interface br100.20
set service mdns repeater interface br100.30
# Not working yet
# set protocols igmp-proxy interface br100.10 role upstream
# set protocols igmp-proxy interface br100.10 alt-subnet 172.17.0.0/16
# set protocols igmp-proxy interface br100.20 role downstream
# set protocols igmp-proxy interface br100.30 role downstream
# Not working yet
set service broadcast-relay id 1 description 'Plex GDM'
set service broadcast-relay id 1 interface 'br100.10'
set service broadcast-relay id 1 interface 'br100.20'
set service broadcast-relay id 1 port '32410'
set service broadcast-relay id 2 description 'Plex GDM'
set service broadcast-relay id 2 interface 'br100.10'
set service broadcast-relay id 2 interface 'br100.20'
set service broadcast-relay id 2 port '32412'
set service broadcast-relay id 3 description 'Plex GDM'
set service broadcast-relay id 3 interface 'br100.10'
set service broadcast-relay id 3 interface 'br100.20'
set service broadcast-relay id 3 port '32413'
set service broadcast-relay id 4 description 'Plex GDM'
set service broadcast-relay id 4 interface 'br100.10'
set service broadcast-relay id 4 interface 'br100.20'
set service broadcast-relay id 4 port '32414'
set service broadcast-relay id 5 description 'Plex GDM'
set service broadcast-relay id 5 interface 'br100.10'
set service broadcast-relay id 5 interface 'br100.20'
set service broadcast-relay id 5 port '32400'
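The five near-identical stanzas above can also be generated with a small loop. This is a sketch that only prints the commands (paste the output into VyOS configure mode), so the port list is the single thing to maintain:

```shell
# Emit one broadcast-relay stanza per Plex GDM port;
# paste the printed lines into VyOS configure mode.
id=1
for port in 32410 32412 32413 32414 32400; do
  echo "set service broadcast-relay id $id description 'Plex GDM'"
  echo "set service broadcast-relay id $id interface 'br100.10'"
  echo "set service broadcast-relay id $id interface 'br100.20'"
  echo "set service broadcast-relay id $id port '$port'"
  id=$((id + 1))
done
```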
Set up port QoS & MSS ¶
Set up QoS (vyos.io). Partially inspired by this very outdated (2015) source (github.com).
set qos policy shaper WAN_QUEUE bandwidth '100mbit'
set qos policy shaper WAN_QUEUE default bandwidth '95%'
set qos policy shaper WAN_QUEUE default queue-type fq-codel
set qos policy shaper WAN_QUEUE class 10 queue-type fq-codel
set qos policy shaper WAN_QUEUE class 10 bandwidth '10%'
set qos policy shaper WAN_QUEUE class 10 priority '1'
set qos policy shaper WAN_QUEUE class 10 match icmp ip protocol icmp
set qos policy shaper WAN_QUEUE class 10 match dns ip source port 53
# TODO: fix WAN VLAN when deploying
set qos interface eth1.1 egress WAN_QUEUE
delete qos interface eth1.1 ingress WAN_QUEUE
Set MSS-clamping (vyos.io) to ensure optimal link utilization. Diagnose max MTU using ping, then set MSS value = MTU - 20 (IP header) - 20 (TCP header):
set firewall options interface adjust-mss 1460
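To illustrate where a value like 1460 comes from, here is the arithmetic as a shell sketch. The payload is whatever the largest successful ping -M do -s probe was; 1472 is assumed here, i.e. a clean 1500-byte MTU:

```shell
payload=1472              # largest ping -s value that passes with -M do (assumption)
mtu=$((payload + 28))     # payload + 8-byte ICMP header + 20-byte IP header
mss=$((mtu - 40))         # MTU - 20-byte IP header - 20-byte TCP header
echo "MTU=$mtu MSS=$mss"  # MTU=1500 MSS=1460
```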
Configure Ad-blocking ¶
Set up DNS-based ad blocking on VyOS (vanwerkhoven.org).
Configure VPN ¶
Set up Wireguard VPN, see the official VyOS docs (vyos.io) and this example (reddit.com). We use 172.17.50.0/24
to differentiate from LAN subnets, but keep it on the Trusted VLAN (20).
set firewall name WAN-LOCAL rule 200 state new enable
configure
run generate pki wireguard key-pair install interface wg0
# Public key: uGc4JMJ4IJc0aoIY/ITOrFGWjmn+RxnqRQMecOS4uB8=
set interfaces wireguard wg0 address 172.17.10.1/24
set interfaces wireguard wg0 address 172.17.50.1/24
set interfaces wireguard wg0 description "Roadwarrior Wireguard"
set interfaces wireguard wg0 port 51820
# Add first peer with local IP '172.17.50.100/32'
# run generate pki wireguard key-pair
set interfaces wireguard wg0 peer tim public-key 'ZtYsIGNwdEqxB3YY9iUFCnQAOK1IkAsmBhuaZ5cekGI='
set interfaces wireguard wg0 peer tim persistent-keepalive 15
set interfaces wireguard wg0 peer tim allowed-ips 172.17.50.100/32
# run generate pki wireguard preshared-key install interface wg0 peer tim
set interfaces wireguard wg0 peer tim preshared-key 'PSK'
# Second peer
# run generate pki wireguard key-pair
set interfaces wireguard wg0 peer helene public-key 'XjjkRchCR4qY7i12sJVK/Jl6cycgCovY64305PMw3n8='
set interfaces wireguard wg0 peer helene persistent-keepalive 15
set interfaces wireguard wg0 peer helene allowed-ips 172.17.50.101/32
# run generate pki wireguard preshared-key install interface wg0 peer helene
set interfaces wireguard wg0 peer helene preshared-key 'PSK'
# Third peer
# run generate pki wireguard key-pair
set interfaces wireguard wg0 peer macbook public-key 'lEKOXMTHRVRURB5aOBioFiZlZS582YYZknqqp/wXt2s='
set interfaces wireguard wg0 peer macbook persistent-keepalive 15
set interfaces wireguard wg0 peer macbook allowed-ips 172.17.50.102/32
# run generate pki wireguard preshared-key install interface wg0 peer macbook
set interfaces wireguard wg0 peer macbook preshared-key 'PSK'
# Optionally set routing -- only needed for site-to-site LAN setups I think
# set protocols static route 172.17.50.0/24 interface wg0
commit; save; exit
Update firewall to not NAT roadwarrior traffic
set service dns forwarding listen-address 172.17.50.1
set nat source rule 5001 outbound-interface 'eth1.300'
set nat source rule 5001 destination address 172.17.50.0/24
set nat source rule 5001 exclude
set nat source rule 5001 protocol all
set nat source rule 5001 translation address masquerade
set nat source rule 5001 description 'Exclude roadwarrior VPN'
Now generate client configs and install these on your clients. Be sure to manually copy over the client public key to the server!
generate wireguard client-config tim interface wg0 server home.vanwerkhoven.org address 172.17.50.100/24
generate wireguard client-config helene interface wg0 server home.vanwerkhoven.org address 172.17.50.101/24
generate wireguard client-config macbook interface wg0 server home.vanwerkhoven.org address 172.17.50.102/24
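For reference, the resulting client-side config looks roughly like this (keys and PSK redacted; the full-tunnel AllowedIPs is an assumption, narrow it to your LAN subnets if you prefer split tunneling):

```ini
[Interface]
PrivateKey = <client private key>
Address = 172.17.50.100/24
DNS = 172.17.50.1

[Peer]
PublicKey = uGc4JMJ4IJc0aoIY/ITOrFGWjmn+RxnqRQMecOS4uB8=
PresharedKey = <psk>
AllowedIPs = 0.0.0.0/0
Endpoint = home.vanwerkhoven.org:51820
PersistentKeepalive = 15
```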
Debug VyOS Wireguard ¶
Enable debug log for Wireguard (vyos.io):
echo module wireguard +p > /sys/kernel/debug/dynamic_debug/control
Check firewall
sudo nft list tables
sudo nft list table vyos_nat
show log firewall name FW_WAN2INFRA
show firewall name FW_WAN2LOCAL
Test static routing
set protocols static route 172.17.50.0/24 interface wg0
Monitor
monitor traffic interface wg0
Networking ¶
Set up VLAN-aware networking on the management interface (proxmox.com). See also https://homenetworkguy.com/how-to/virtualize-opnsense-on-proxmox-as-your-primary-router/ (homenetworkguy.com). Resulting /etc/network/interfaces config:
auto lo
iface lo inet loopback
auto enx000ec6955446
# iface enx000ec6955446 inet dhcp
iface enx000ec6955446 inet manual
auto vmbr1
iface vmbr1 inet manual
bridge-ports enx000ec6955446
bridge-stp off
bridge-fd 0
#WAN
iface enp0s25 inet manual
auto vmbr0
iface vmbr0 inet manual
bridge-ports enp0s25
bridge-stp off
bridge-fd 0
bridge-vlan-aware yes
bridge-vids 2-4094
#LAN
auto vmbr0.10
iface vmbr0.10 inet static
address 172.17.10.6/24
gateway 172.17.10.1
#Mgmt interface
source /etc/network/interfaces.d/*
Now (re)connect to pve over ethernet at 172.17.10.6 using VLAN 10.
Configure GS108EP switch ¶
I use a GS108EP switch to connect my Access Points over PoE. To ensure these get the right VLAN, we need to configure it as well. Unfortunately the GS108E does not support changing the management VLAN from 1, so we have to use a workaround.
As a reminder:
- The PVID defines the VLAN that untagged frames arriving at the switch port are assigned to (ingress). This is typically the same as the one-and-only untagged VLAN on an 802.1q VLAN port.
- The port membership determines how VLAN tags are applied to traffic leaving the switch (egress).
Since we use our management interface in an untagged fashion, we can use 10 for the VyOS config and the rest of the network, and use 1 for the netgear switch.
Debian ¶
Proxmox supports two guest architectures:
- LXC:
- Pro: light container, possibility for hardware acceleration(?), faster
- Con: more complicated for Docker, less secure/isolation from host(?)
- VM:
- Pro: fully separated/more secure, Docker works out of the box
- Con: low disk speed for random I/O (and maybe others)
I finally went for LXC because of disk speed (arstechnica.com):
fio --name=random-write --ioengine=posixaio --rw=randwrite --bs=4k --numjobs=1 --size=2g --iodepth=1 --runtime=30 --time_based --end_fsync=1
pve host:
WRITE: bw=220MiB/s (231MB/s), 220MiB/s-220MiB/s (231MB/s-231MB/s), io=6690MiB (7015MB), run=30431-30431msec
debian VM guest @ virtio:
WRITE: bw=67.5MiB/s (70.7MB/s), 67.5MiB/s-67.5MiB/s (70.7MB/s-70.7MB/s), io=2048MiB (2147MB), run=30363-30363msec
debian VM guest @ scsi:
WRITE: bw=31.1MiB/s (32.6MB/s), 31.1MiB/s-31.1MiB/s (32.6MB/s-32.6MB/s), io=1493MiB (1566MB), run=48003-48003msec
debian LXC guest @ virtio
WRITE: bw=192MiB/s (202MB/s), 192MiB/s-192MiB/s (202MB/s-202MB/s), io=5876MiB (6161MB), run=30533-30533msec
old proteus:
WRITE: bw=172MiB/s (180MB/s), 172MiB/s-172MiB/s (180MB/s-180MB/s), io=5338MiB (5598MB), run=31078-31078msec
fio --name=random-write --ioengine=posixaio --rw=randwrite --bs=64k --size=128m --numjobs=16 --iodepth=16 --runtime=30 --time_based --end_fsync=1
pve host:
WRITE: bw=2429MiB/s (2547MB/s), 140MiB/s-164MiB/s (147MB/s-172MB/s), io=72.7GiB (78.0GB), run=30127-30641msec
debian VM guest @ virtio:
WRITE: bw=1856MiB/s (1946MB/s), 108MiB/s-123MiB/s (114MB/s-129MB/s), io=55.7GiB (59.8GB), run=30133-30712msec
debian LXC guest @ virtio
WRITE: bw=2045MiB/s (2145MB/s), 117MiB/s-141MiB/s (123MB/s-148MB/s), io=61.9GiB (66.4GB), run=30585-30979msec
old proteus:
WRITE: bw=286MiB/s (300MB/s), 15.8MiB/s-20.8MiB/s (16.6MB/s-21.8MB/s), io=9648MiB (10.1GB), run=30656-33702msec
Install & configure Debian server as LXC (selected) ¶
Get images using the Proxmox VE Appliance Manager (proxmox.com):
sudo pveam update
sudo pveam available
sudo pveam download local debian-11-standard_11.6-1_amd64.tar.zst
sudo pveam download local debian-12-standard_12.7-1_amd64.tar.zst
sudo pveam list local
Check storage to use
pvesm status
Create and configure LXC container (proxmox.com) based on downloaded image. Ensure it’s an unprivileged container to protect our host and router running on it.
sudo pct create 203 local:vztmpl/debian-12-standard_12.7-1_amd64.tar.zst --description "Debian 12 LXC" --hostname proteus2 --rootfs thinpool_vms:256 --unprivileged 1 --cores 4 --memory 16384 --ssh-public-keys /root/.ssh/tim.id_rsa.pub --net0 name=eth0,bridge=vmbr0,firewall=0,gw=172.17.10.1,ip=172.17.10.7/24,tag=10
Now configure networking on Proxmox’ vmbr0 with VLAN ID 10. This means the guest can only send and receive frames on the management VLAN.
# This does not work, cannot create network device on vmbr0.10
# pct set 203 --net0 name=eth0,bridge=vmbr0.10,firewall=0,gw=172.19.10.1,ip=172.19.10.2/24
# Does not work:
# pct set 203 --net0 name=eth0,bridge=vmbr0,firewall=0,gw=172.17.10.1,ip=172.17.10.2/24,trunks=10
# Works:
# pct set 203 --net0 name=eth0,bridge=vmbr0,firewall=0,gw=172.17.10.1,ip=172.17.10.2/24,tag=10
sudo pct set 203 --onboot 1
Optional: only required if host does not have this set up correctly (could be because network was not available at init)
sudo pct set 203 --searchdomain lan.vanwerkhoven.org --nameserver 172.17.10.1
If SSH into guest fails or takes a long time, this can be due to LXC / Apparmor security features (stackoverflow.com) which prevent mount
from executing. To solve, ensure nesting is allowed (ostechnix.com):
sudo pct set 203 --features nesting=1
To enable Docker (jlu5.com) inside the LXC container, we need both nesting & keyctl:
sudo pct set 203 --features nesting=1,keyctl=1
Start & log in, set root password, configure some basics
sudo pct start 203
sudo pct enter 203
passwd
apt install sudo vim
dpkg-reconfigure locales
dpkg-reconfigure tzdata
Add regular user, add to system groups (debian.org), and set ssh key
adduser tim
usermod -aG adm,render,sudo,staff tim
mkdir -p ~tim/.ssh/
touch ~tim/.ssh/authorized_keys
chown -R tim:tim ~tim/.ssh
cp /root/.ssh/authorized_keys ~tim/.ssh/authorized_keys
chmod og-rwx ~tim/.ssh/authorized_keys
cat << 'EOF' >>~tim/.ssh/authorized_keys
ssh-rsa AAAA...
EOF
# Allow non-root to use ping
setcap cap_net_raw+p $(which ping)
Update & upgrade and install automatic updates (linode.com)
sudo apt update
sudo apt upgrade
sudo apt install unattended-upgrades
# Comment 'label=Debian' to not auto-update too much
sudo vi /etc/apt/apt.conf.d/50unattended-upgrades
# Tweak some settings
cat << 'EOF' | sudo tee -a /etc/apt/apt.conf.d/50unattended-upgrades
Unattended-Upgrade::Remove-Unused-Kernel-Packages "true";
Unattended-Upgrade::Remove-New-Unused-Dependencies "true";
Unattended-Upgrade::Remove-Unused-Dependencies "true";
EOF
sudo unattended-upgrades --dry-run --debug
Install Docker (docker.com). Need to use custom apt repo to get latest version which works inside an unprivileged LXC container (as proposed on the docker forums (docker.com)):
sudo apt remove docker docker-engine docker.io containerd runc docker-compose
sudo apt update
sudo apt install \
ca-certificates \
curl \
gnupg \
lsb-release
sudo mkdir -m 0755 -p /etc/apt/keyrings
curl -fsSL https://download.docker.com/linux/debian/gpg | sudo gpg --dearmor -o /etc/apt/keyrings/docker.gpg
echo \
"deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.gpg] https://download.docker.com/linux/debian \
$(lsb_release -cs) stable" | sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
sudo apt update
sudo apt install docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin
sudo docker run hello-world
Non-solutions ¶
I also tried these options that didn’t work for my older Docker version:
And we maybe need to change (stackoverflow.com) boot (my-take-on.tech) parameters (proxmox.com):
sed -i 's/^GRUB_CMDLINE_LINUX_DEFAULT="quiet/GRUB_CMDLINE_LINUX_DEFAULT="quiet systemd.unified_cgroup_hierarchy=0/' /etc/default/grub
This failed.
Docker inside (unprivileged) LXC is not supported, but can be made to work.
- https://forums.docker.com/t/docker-problem-in-unpriviledged-lxc-on-debian-11-2-bullseye/121685 (docker.com)
- https://bobcares.com/blog/proxmox-docker-unprivileged-container/ (bobcares.com)
- https://quibtech.com/p/run-docker-containers-in-proxmox-lxc/ (quibtech.com)
- https://www.youtube.com/watch?v=Fc06qnL0Jgw (youtube.com)
- https://jlu5.com/blog/docker-unprivileged-lxc-2021 (jlu5.com)
Try newer version of docker as proposed on the docker forums (docker.com)
sudo apt-get install docker-compose-plugin docker-compose docker.io
Fails.
Try to install all packages:
sudo apt-get install docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin
Works! So it was missing a package?!
Now try to install the debian original docker (fewer apt repositories is more stability)
sudo apt-get remove docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin
Install & configure Debian server as VM (not used) ¶
# Get ISO from https://cdimage.debian.org/debian-cd/current/amd64/iso-cd/
ls /var/lib/vz/template/iso/
qm create 200 --name proteus --description "Debian VM server" --cores 4 --memory 12288 --net0 virtio,bridge=vmbr0,firewall=0,tag=10 --ide2 media=cdrom,file=local:iso/debian-11.6.0-amd64-netinst.iso --virtio0 thinpool_vms:300
# ipconfig0 did not work? --ipconfig0 gw=172.17.10.1,ip=172.17.10.2/24
qm set 200 -serial0 socket
qm set 200 --onboot 1
Open terminal via Spice/xterm.js, install image, remove image, and reboot
qm start 200
# in guest: install image as usual
qm set 200 --ide2 none
qm reboot 200
Add QEMU guest agent
qm set 200 --agent 1
qm agent 200 ping
Test docker
apt install docker.io docker-compose
sudo docker run hello-world
Works
Enable GPU sharing (cetteup.com) in VM:
# vi /etc/default/grub
# GRUB_CMDLINE_LINUX_DEFAULT="quiet intel_pstate=disable
#intel_iommu=on
#i915.enable_gvt=1"
GRUB_CMDLINE_LINUX_DEFAULT+="intel_iommu=on i915.enable_gvt=1"
proxmox-boot-tool refresh && reboot
# Check for success
cat /proc/cmdline
dmesg | grep -e DMAR -e IOMMU
# Load modules
cat << 'EOF' >> /etc/modules
# Modules required for PCI passthrough
vfio
vfio_iommu_type1
vfio_pci
vfio_virqfd
# Modules required for Intel GVT
kvmgt
exngt
vfio-mdev
EOF
reboot
# Pass through PCI --> Via GUI
# Not sure how this works on CLI, something like: qm set 200 --hostpci0 0000:00:02.0,mdev=i915-GVTg_V5_4
Expose bulk storage to Debian server ¶
I prefer to keep the guest OS disks smallish so I can back them up. However, that leaves no space for bulk data. To solve this, there are three approaches to share storage from host to guest:
- Via Samba on host machine, mount in guest. Pro: always works. Con: more complex setup, increases host attack surface
- Via bind mount points. Pro: works well in LXC. Fast. Con: only LXC (selected)
- Via disk pass-through. Pro: works well in KVM (& LXC?). Fast. Con: cannot write from two guests simultaneously.
Automounting Samba in LXC guest didn’t work for me, giving error “Starting of mnt-bulk.automount not supported.” LXC containers are special, apparently (reddit.com). However I document the steps here for reference.
1. Share data via Samba (not used) ¶
Set up Samba server on Proxmox (digitalocean.com).
Install, disable the unnecessary netbios daemon, and stop samba itself during configuration.
apt install samba
systemctl stop nmbd.service
systemctl disable nmbd.service
systemctl stop smbd.service
# systemctl disable smbd.service
Configure
[global]
server string = pve.vanwerkhoven.org
server role = standalone server
interfaces = lo vmbr0.10
bind interfaces only = yes
disable netbios = yes
smb ports = 445
log file = /var/log/samba/smb.log
max log size = 10000
# log level = 3 passdb:5 auth:5
Add users
adduser --home /mnt/bulk --no-create-home --shell /usr/sbin/nologin --disabled-password --uid 1010 bulkdata
adduser --home /mnt/backup/mbp --no-create-home --shell /usr/sbin/nologin --disabled-password --uid 1011 backupmbp
adduser --home /mnt/backup/mba --no-create-home --shell /usr/sbin/nologin --disabled-password --uid 1012 backupmba
adduser --home /mnt/backup/tex --no-create-home --shell /usr/sbin/nologin --disabled-password --uid 1013 backuptex
chown backupmbp:backupmbp /mnt/backup/mbp
chown backupmba:backupmba /mnt/backup/mba
chown backuptex:backuptex /mnt/backup/tex
chown bulkdata:bulkdata /mnt/bulk
chmod 2770 /mnt/backup/{mba,mbp}
chmod 2770 /mnt/bulk
openssl rand -base64 20
smbpasswd -a backupmbp
smbpasswd -a backupmba
smbpasswd -a backuptex
smbpasswd -e backupmba
smbpasswd -e backupmbp
smbpasswd -e backuptex
Set up shares
[bulk]
path = /mnt/bulk
browseable = yes
read only = no
writable = yes
force create mode = 0660
force directory mode = 2770
valid users = sambarw
[backupmbp]
comment = Time Machine mbp
path = /mnt/backup/mbp
browseable = yes
writeable = yes
create mask = 0600
directory mask = 0700
spotlight = yes
vfs objects = catia fruit streams_xattr
fruit:aapl = yes
fruit:time machine = yes
valid users = backupmbp
[backupmba]
comment = Time Machine MBA
path = /mnt/backup/mba
browseable = yes
writeable = yes
create mask = 0600
directory mask = 0700
spotlight = yes
vfs objects = catia fruit streams_xattr
fruit:aapl = yes
fruit:time machine = yes
valid users = backupmba
Restart Samba
systemctl restart smbd.service
Now mount Samba share automatically (stackexchange.com) on client from pve host:
sudo apt install smbclient cifs-utils
cat << 'EOF' >>/root/.smbcredentials
user=sambarw
password=redacted
EOF
Automount, but ensure mounting doesn’t fail (askubuntu.com) because the network is not up yet (askubuntu.com).
sudo mkdir /mnt/bulk
sudo chown root:users /mnt/bulk/
sudo chmod g+rw /mnt/bulk/
sudo cat << 'EOF' >>/etc/fstab
//pve.lan.vanwerkhoven.org/bulk /mnt/bulk cifs credentials=/root/.smbcredentials,rw,uid=tim,gid=users,auto,x-systemd.automount,_netdev 0 0
EOF
2. Share data via mount points (LXC only) (selected) ¶
In the second approach, we mount something on the host and propagate it to the guest (bayton.org), or create a privileged container (proxmox.com).
Mount points require some care regarding UID/GIDs (e.g. see documented on the proxmox wiki (proxmox.com)), but overall seem an easy method to get storage from host to guest.
What worked for me was adding a mountpoint using pct (thushanfernando.com):
sudo mkdir /mnt/bulk
sudo chown tim:users /mnt/bulk
sudo chmod g+w /mnt/bulk
Make a user on the host (bulkdata:bulkdata) whose UID/GID we’ll propagate to the guest:
adduser --home /mnt/bulk --no-create-home --shell /usr/sbin/nologin --disabled-password --uid 1010 bulkdata
adduser --home /mnt/backup/mbp --no-create-home --shell /usr/sbin/nologin --disabled-password --uid 1011 backupmbp
adduser --home /mnt/backup/mba --no-create-home --shell /usr/sbin/nologin --disabled-password --uid 1012 backupmba
adduser --home /mnt/backup/tex --no-create-home --shell /usr/sbin/nologin --disabled-password --uid 1013 backuptex
usermod -aG bulkdata tim
Set up UID/GID mapping to propagate users 1010–1020 to the same uid on the host (e.g. using this tool (github.com)). N.B. this is only required if you want to write from both the host and guest. If you only write in (multiple) guests, you only need to ensure the user/group writing from the different guests have the same UID/GID.
cat << 'EOF' >>/etc/pve/lxc/201.conf
# uid map: from uid 0 map 1010 uids (in the ct) to the range starting 100000 (on the host), so 0..1010 (ct) → 100000..101010 (host)
lxc.idmap = u 0 100000 1010
lxc.idmap = g 0 100000 1010
# we map 10 uids starting from uid 1010 onto 1010, so 1010 → 1010
lxc.idmap = u 1010 1010 10
lxc.idmap = g 1010 1010 10
# we map the remaining 64516 ids starting from 1020 onto 101020, so 1020..65535 → 101020..165535
lxc.idmap = u 1020 101020 64516
lxc.idmap = g 1020 101020 64516
EOF
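As a sanity check, the three mapped ranges should tile the container's full id space (0..65535) exactly:

```shell
# 1010 ids shifted to 100000+, 10 ids passed through unchanged,
# 64516 ids shifted to 101020+; the counts must sum to 65536
total=$((1010 + 10 + 64516))
echo "$total"    # 65536, i.e. the whole 16-bit id space is covered
```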
Add the following to /etc/subuid and /etc/subgid (there might already be entries in these files, also for root):
cat << 'EOF' >>/etc/subuid
root:1010:10
EOF
cat << 'EOF' >>/etc/subgid
root:1010:10
EOF
Now mount the actual bind point
pct shutdown 201
pct set 201 -mp0 /mnt/bulk,mp=/mnt/bulk
pct start 201
and that’s it. Now we can continue configuring the services.
3. Via disk pass through (KVM only) (not used) ¶
Pass through bulk storage using volume pass through with virtio
(should be faster than SCSI or IDE (proxmox.com)):
qm set 200 -virtio1 /dev/disk/by-id/dm-name-pve-lv_bulk,backup=0,snapshot=0
#qm set 200 -scsi1 /dev/disk/by-id/dm-name-pve-lv_bulk,backup=0,snapshot=0
Proxmox hardening ¶
Tips from Samuel’s Website (samuel.domains) and pveproxy(8) man page (proxmox.com)
Limit server access to specific IPs:
cat << 'EOF' >>/etc/default/pveproxy
# TvW 20230114 added for security reasons
DENY_FROM="all"
ALLOW_FROM="172.17.10.0/24"
POLICY="allow"
# For PVE-Manager >= 6.4 only.
LISTEN_IP="172.17.10.4"
EOF
Disable NFS:
cat << 'EOF' >>/etc/default/nfs-common
# TvW 20230114 disabled for security reasons
NEED_STATD=no
EOF
Install Unifi Network Application (controller) as LXC container ¶
Install Unifi Network Application (Controller) on Debian (the only supported Linux platform) using the Unifi guide (ui.com) and the Alpine guide (alpinelinux.org).
Get images using the Proxmox VE Appliance Manager (proxmox.com):
pveam update
pveam available
pveam download local debian-11-standard_11.6-1_amd64.tar.zst #OR Alpine?
pveam list local
Check storage to use
pvesm status
Create and configure LXC container (proxmox.com) based on downloaded image. Ensure it’s an unprivileged container to protect our host and router running on it. Also configure networking, run on Proxmox’ vmbr0
with VLAN ID 10 in the Management VLAN.
pct create 202 local:vztmpl/debian-11-standard_11.6-1_amd64.tar.zst --description "Debian LXC Unifi Network Application" --hostname unifi --rootfs thinpool_vms:8 --unprivileged 1 --cores 2 --memory 2048 --ssh-public-keys /root/.ssh/tim.id_rsa.pub --net0 name=eth0,bridge=vmbr0,firewall=0,gw=172.17.10.1,ip=172.17.10.5/24,tag=10
pct set 202 --onboot 1
Optional: only required if host does not have this set up correctly (could be because network was not available at init):
pct set 202 --searchdomain lan.vanwerkhoven.org --nameserver 172.17.10.1
Start & log in, set root password, configure some basics
pct start 202
pct enter 202
passwd
apt install sudo vim
dpkg-reconfigure locales
dpkg-reconfigure tzdata
If SSH into guest fails or takes a long time, this can be due to LXC / Apparmor security features (stackoverflow.com) which prevent mount
from executing. To solve, ensure nesting is allowed (ostechnix.com):
pct shutdown 202
pct set 202 --features nesting=1
pct start 202
Hardening sshd is not required: by default, root is only allowed to login with pubkey authentication.
Install required packages to add Unifi apt source, then add new source & related keys
apt-get update && apt-get install ca-certificates apt-transport-https
echo 'deb https://www.ui.com/downloads/unifi/debian stable ubiquiti' | tee /etc/apt/sources.list.d/100-ubnt-unifi.list
wget -O /etc/apt/trusted.gpg.d/unifi-repo.gpg https://dl.ui.com/unifi/unifi-repo.gpg
Unifi (v7.3.83 in my case) has very specific MongoDB requirements:
unifi : Depends: mongodb-server (>= 2.4.10) but it is not installable or
mongodb-10gen (>= 2.4.14) but it is not installable or
mongodb-org-server (>= 2.6.0) but it is not installable
Depends: mongodb-server (< 1:4.0.0) but it is not installable or
mongodb-10gen (< 4.0.0) but it is not installable or
mongodb-org-server (< 4.0.0) but it is not installable
Prep for specific MongoDB version, see this guide (mongodb.com). The MongoDB repo for Stretch (Debian 9) has the newest compatible version (3.6) with a matching pgp key, a bit newer than the 3.4 version as written in the Unifi guide (ui.com). The PGP key for this repo will expire on 2023-12-09, not sure what will happen then.
wget -O /etc/apt/trusted.gpg.d/mongodb-repo.gpg https://pgp.mongodb.com/server-3.6.pub
echo "deb https://repo.mongodb.org/apt/debian stretch/mongodb-org/3.6 main" | tee /etc/apt/sources.list.d/mongodb-org-3.6.list
apt-get update
Install Unifi Network Application from apt, this takes 560 MB of disk space for the package & required dependencies (yeah, for just a controller).
apt-get update && apt-get install unifi
Enable, autostart, and start Unifi service:
systemctl is-enabled unifi
systemctl enable unifi
Update & upgrade and install automatic updates (linode.com)
apt update && apt upgrade
apt install unattended-upgrades
# Comment 'label=Debian' to not auto-update too much
vi /etc/apt/apt.conf.d/50unattended-upgrades
# Tweak some settings
cat << 'EOF' | sudo tee -a /etc/apt/apt.conf.d/50unattended-upgrades
Unattended-Upgrade::Remove-Unused-Kernel-Packages "true";
Unattended-Upgrade::Remove-New-Unused-Dependencies "true";
Unattended-Upgrade::Remove-Unused-Dependencies "true";
EOF
sudo unattended-upgrades --dry-run --debug
Migrate services ¶
Install dependencies; prefer Python packages via apt for a system-wide install (askubuntu.com) and potentially some added security, since we don’t install from the public pip repository.
apt install jq curl python3-netcdf4
Service overview ¶
Now:
- InfluxDB + data (port X) - via apt 1.6 (else we need special apt repo) –> OK DONE
- Port configuration from old proteus
- Set up new influxdb with accounts (generic write user and read user and admin)
- Worker scripts
- All scripts:
1. Unify naming:
<source>2<target>
, e.g.knmi2influxdb
2. Update in-place with credentials, ideally with backwards compatibility
- co2signal –> OK DONE
  * migrate to HA: no, HA has time lag issue
  * normalize, separate secrets, add influxdb login: OK
  * tested: OK
- knmi –> OK DONE
  * migrate to HA: no, keep separate
  * normalize, separate secrets, add influxdb login
  * tested: OK
- mqtt2influxdb –> OK done
  * migrate to HA: not possible, different functionality
  * migrated: OK, tested: OK
- smeter –> OK done –> phase out, use dsmr reader on HASS
  * migrate to HA: yes?
  * migrated: OK, tested: OK
- water_meter_reader –> OK Done
  * migrate to HA –> already working via esphome detector & mqtt push
- epexspot –> OK done
  * migrate to HA: no?
  * migrated: OK, tested: OK
- hue –> phase out, use powercalc on HASS
  * migrate to HA: no, requires too much calculation/processing
  * migrated: OK, tested: NOK
- mkwebdata
  * migrate to HA: no
  * migrated: OK, tested: OK
- multical –> OK done –> phase out
  * migrate to HA: no, custom stuff
- SBFspot –> OK Done –> phase out
  * Check which scripts are being used, archive old ones
  * Read secrets from external file
  * migrated: OK, tested: OK
- evohome –> phase out
  * migrate to HA: yes?
  * needed in future: no
  * migrated: OK, tested: NOK
- Collectd (for data generation/collection) – on proxmox?
- install on proxmox
- migrate configuration
- Nginx + letsencrypt (port 80/443) –> OK DONE
- Docker
- portainer (port 9000) –> not required OK DONE
- Nextcloud (port 9080) –> run on proteus port @ nextcloud.vanwerkhoven.org –> OK DONE
- bpatrik/pigallery2 (for personal photo sharing) (port 3080) –> run on proteus @ photos.vanwerkhoven.org –> OK DONE
- Home Assistant (port 8143) –> run on proteus @ homeassistant.vanwerkhoven.org (VPN) –> OK DONE
- Grafana (port 3000) – via vendor apt repository @ grafana.vanwerkhoven.org –> OK DONE
- lscr.io/linuxserver/unifi-controller (for Unifi AP management) –> separate LXC @ unifi.lan.vanwerkhoven.org –> OK DONE
- Mosquitto (glueing home automation) – on proteus –> OK DONE
Later:
- Transmission (downloading torrents) – on proteus
- Plex/Jellyfin (HTPC) – needs hw accel, requires running in a privileged container
- smbd (for Time Machine backups) – on proteus
Service hardware requirements ¶
- GPU: Jellyfin
- Bluetooth: host server & Home Assistant
- USB smart meter: host server & Home Assistant
- USB heat meter: host server
- USB Conbee Zigbee: Home Assistant
Get HW accel in guest/container: https://www.reddit.com/r/jellyfin/comments/s417qw/hardware_acceleration_inside_proxmox_lxc_not/ (reddit.com)
Prepare old server ¶
- Stop cron jobs from collecting data (not strictly necessary)
Docker ¶
First install docker (also see above)
sudo apt install docker.io docker-compose
Reverse proxy options for containers ¶
Optional: prepare forwarding traffic from WAN to containers using a reverse proxy, following some best practices (reddit.com), e.g. using nginx-proxy (github.com).
- Expose & publish Docker container ports on the host, then reverse proxy to the specific port (e.g. -p 127.0.0.1:8000:8000)
- Pro: already operational, least trust required, fastest solution
- Con: occupies host ports that are never used, potentially exposes services to users with access to host, requires manual acme management
- Use internal Docker network to map reverse proxy (e.g. dynamically using nginx-proxy (github.com))
- Pro: easy solution (well that was never a consideration /s), does not expose ports, automatic acme handling
- Con: requires trust in 3rd party nginx implementation, slower(?) than native nginx, requires exposing Docker socket (traefik.io) granting it root on host.
- Use Traefik (digitalocean.com) to reverse proxy for Docker (bonus: built-in ACME challenge)
- Pro: easy solution, does not expose ports, automatic acme/letsencrypt handling
- Con: slower than nginx, new approach, requires exposing Docker socket (traefik.io) granting it root on host.
I decided to go for option 1: most effort and most overkill (security/speed) for my situation :p
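A minimal sketch of option 1 for a single service (the hostname and upstream port here are placeholders; my real vhosts appear further down): the container publishes its port on loopback only, and a TLS vhost proxies to it.

```nginx
# Sketch: vhost proxying to a container port published on loopback only
# (example.vanwerkhoven.org and port 8000 are placeholders)
server {
    listen 443 ssl http2;
    server_name example.vanwerkhoven.org;

    ssl_certificate /etc/letsencrypt/live/vanwerkhoven.org/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/vanwerkhoven.org/privkey.pem;

    location / {
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        # the container was started with -p 127.0.0.1:8000:8000
        proxy_pass http://127.0.0.1:8000;
    }
}
```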
Portainer to ease container management –> not needed ¶
And optionally install portainer to help manage Docker. Bind to localhost so the service cannot be accessed from outside the machine:
sudo docker volume create portainer_data
sudo docker run -d \
--name portainer \
--restart=always \
-p 127.0.0.1:8000:8000 -p 127.0.0.1:9443:9443 \
-v /var/run/docker.sock:/var/run/docker.sock \
-v portainer_data:/data \
portainer/portainer-ce:latest
sudo docker ps
InfluxDB ¶
Make backup on old system
/usr/bin/influxd backup -portable /home/tim/backup/influx_snapshot.db/$(date +%Y%m%d)-migrate
Use (old) native Debian package for stability, security & least additional apt repositories
apt install influxdb-client influxdb
scp -P 10022 -r tim@172.17.10.107:/home/tim/backup/influx_snapshot.db/20230713-migrate .
influxd restore -portable /home/tim/backup/influx_snapshot.db/20230713-migrate/
Migrate config, reload
scp -P 10022 tim@172.17.10.107:/etc/influxdb/influxdb.conf influxdb.conf-migrate
sudo diff /etc/influxdb/influxdb.conf /etc/influxdb/influxdb.conf-migrate
sudo service influxdb restart
Add users in InfluxDB (influxdata.com)
influx -precision rfc3339
CREATE USER influxadmin WITH PASSWORD 'pwd' WITH ALL PRIVILEGES
CREATE USER influxwrite WITH PASSWORD 'pwd'
GRANT WRITE ON collectd TO influxwrite
GRANT WRITE ON smarthomev3 TO influxwrite
CREATE USER influxread WITH PASSWORD 'pwd'
GRANT READ ON collectd TO influxread
GRANT READ ON smarthomev3 TO influxread
CREATE USER influxreadwrite WITH PASSWORD 'pwd'
GRANT ALL ON collectd TO influxreadwrite
GRANT ALL ON smarthomev3 TO influxreadwrite
Test accounts with curl
chmod o-r ~/.profile
cat << 'EOF' >>~/.profile
export INFLUX_USERNAME=influxadmin
export INFLUX_PASSWORD=pwd
EOF
curl -G http://localhost:8086/query --data-urlencode "q=SHOW DATABASES"
curl -G http://localhost:8086/query -u influxwrite:pwd --data-urlencode "q=SHOW DATABASES"
In case InfluxDB is not running, check that the path to types.db is correct.
Failed to connect to http://localhost:8086: Get "http://localhost:8086/ping": dial tcp [::1]:8086: connect: connection refused
Please check your connection settings and ensure 'influxd' is running.
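One concrete check is whether the typesdb path in the collectd section of influxdb.conf points at an existing file. A small sketch (the config path and the quoted typesdb key are assumptions based on a stock InfluxDB 1.x install):

```shell
# Extract the typesdb path from influxdb.conf and check it exists
conf=/etc/influxdb/influxdb.conf          # adjust if your config lives elsewhere
typesdb=$(awk -F'"' '/typesdb/ {print $2}' "$conf" 2>/dev/null)
echo "typesdb = $typesdb"
test -r "$typesdb" && echo "OK: types.db readable" || echo "MISSING: $typesdb"
```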
Restore retention policies (vanwerkhoven.org) (Archive (archive.org)) – NB this was not necessary when restoring from backup
SHOW RETENTION POLICIES ON collectd
CREATE RETENTION POLICY "always" ON "collectd" DURATION INF REPLICATION 1
CREATE RETENTION POLICY "five_days" ON "collectd" DURATION 5d REPLICATION 1 DEFAULT
# For Grafana viewing - see https://github.com/grafana/grafana/issues/4262#issuecomment-475570324
INSERT INTO always rp_config,idx=1 rp="five_days",start=0i,end=432000000i -9223372036854775806
INSERT INTO always rp_config,idx=2 rp="always",start=432000000i,end=3110400000000i -9223372036854775806
# Restore continuous queries
cq_60m_cpu CREATE CONTINUOUS QUERY cq_60m_cpu ON collectd BEGIN SELECT mean(value) AS value INTO collectd.always.cpu FROM collectd.five_days.cpu GROUP BY time(1h), * END
cq_60m_cpufreq CREATE CONTINUOUS QUERY cq_60m_cpufreq ON collectd BEGIN SELECT mean(value) AS value INTO collectd.always.cpufreq FROM collectd.five_days.cpufreq GROUP BY time(1h), * END
cq_60m_df CREATE CONTINUOUS QUERY cq_60m_df ON collectd BEGIN SELECT mean(value) AS value INTO collectd.always.df FROM collectd.five_days.df GROUP BY time(1h), * END
cq_60m_interface CREATE CONTINUOUS QUERY cq_60m_interface ON collectd BEGIN SELECT mean(rx) AS rx, mean(tx) AS tx INTO collectd.always.interface FROM collectd.five_days.interface GROUP BY time(1h), * END
cq_60m_iwinfo CREATE CONTINUOUS QUERY cq_60m_iwinfo ON collectd BEGIN SELECT mean(value) AS value INTO collectd.always.iwinfo FROM collectd.five_days.iwinfo GROUP BY time(1h), * END
cq_60m_load CREATE CONTINUOUS QUERY cq_60m_load ON collectd BEGIN SELECT mean(longterm) AS longterm, mean(midterm) AS midterm, mean(shortterm) AS shortterm INTO collectd.always.load FROM collectd.five_days.load GROUP BY time(1h), * END
cq_60m_memory CREATE CONTINUOUS QUERY cq_60m_memory ON collectd BEGIN SELECT mean(value) AS value INTO collectd.always.memory FROM collectd.five_days.memory GROUP BY time(1h), * END
cq_60m_ping CREATE CONTINUOUS QUERY cq_60m_ping ON collectd BEGIN SELECT mean(value) AS value INTO collectd.always.ping FROM collectd.five_days.ping GROUP BY time(1h), * END
SHOW CONTINUOUS QUERIES
Home Assistant ¶
VM ¶
Get Home Assistant image (home-assistant.io)
cd /var/lib/vz/template/iso
sudo wget https://github.com/home-assistant/operating-system/releases/download/13.1/haos_ova-13.1.qcow2.xz
sudo xz -d haos_ova-13.1.qcow2.xz
Create the new VM. See this guide (stefandroid.com) for command-line examples
qm create 101 -agent 1 -tablet 0 -localtime 1 -bios ovmf -cpu host -cores 4 -memory 8192 -name haos -net0 virtio,bridge=vmbr0,macaddr=02:85:73:A4:71:88,tag=10 -onboot 1 -ostype l26 -scsihw virtio-scsi-pci
qm importdisk 101 /var/lib/vz/template/iso/haos_ova-13.1.qcow2 thinpool_vms
qm set 101 --scsi0 thinpool_vms:vm-101-disk-0,cache=writethrough
qm set 101 --boot c --bootdisk scsi0
pvesm alloc thinpool_vms 101 vm-101-disk-1 4M
qm set 101 -efidisk0 thinpool_vms:vm-101-disk-1
Explanation copied from the guide (stefandroid.com)
- Create the VM. I’m using 4 cores & 8GB here.
- Import the decompressed qcow2 image as a disk to the local-lvm storage. Change the storage if you store your Proxmox VMs somewhere else. –> I added ,cache=writethrough in case it’s not the default.
- Assign the imported disk from (2) to the VM.
- Set the imported disk from (2) as the boot disk.
- Allocate 4 MiB for the EFI disk.
- Assign the EFI disk to the VM.
The tteck script (githubusercontent.com) does the same, but a bit more opaquely.
Docker ¶
Migrate config from old machine, see https://www.home-assistant.io/installation/linux#install-home-assistant-container (home-assistant.io)
# Create backup on old config (HA Core)
sudo systemctl stop home-assistant@homeassistant.service
sudo tar czvf ~/homeassistant.tar.gz ~homeassistant/.homeassistant
# Move to new machine & right place
scp oldserver:homeassistant.tar.gz newserver:/var/lib/
scp -r -P 10022 tim@172.17.10.107:homeassistant.tar.gz .
cd /var/lib/ && sudo tar xvf ./homeassistant.tar.gz
sudo mv .homeassistant homeassistant
sudo chown root:root homeassistant
sudo chmod og-rwx homeassistant/
Start docker container via docker run
or docker compose
:
# Run new docker, do not use --privileged for safety and easier running in LXC
# https://community.home-assistant.io/t/why-does-the-documentation-say-we-need-priviledged-mode-for-a-docker-install-now/336556/2
sudo docker run -d \
--name homeassistant \
--restart=unless-stopped \
-e TZ=Europe/Brussels \
-v /var/lib/homeassistant:/config \
--network=host \
ghcr.io/home-assistant/home-assistant:stable
cat << 'EOF' >> ~tim/docker/home-assistant-compose.yml
#version: '3'
# https://www.home-assistant.io/installation/linux#docker-compose
# docker compose -f home-assistant-compose.yml up -d # run
# docker compose -f home-assistant-compose.yml pull # update
services:
homeassistant:
container_name: homeassistant
image: "ghcr.io/home-assistant/home-assistant:stable"
volumes:
- /var/lib/homeassistant:/config
- /etc/localtime:/etc/localtime:ro
restart: unless-stopped
network_mode: host
devices:
- /dev/usb-dresden_elektronik_ingenieurtechnik_GmbH_ConBee_II_DE2149131-if00
EOF
sudo docker compose -f home-assistant-compose.yml up -d
Forward Zigbee USB device ¶
I want some USB devices to be accessible in my containers. We could use udev
to set user/owner (stackexchange.com) on the original device, but this might break something on the host that uses those devices. Instead, I use udev
to create symlinks (stackexchange.com) and run a script that copies the device nodes and chowns them for use in my containers.
First we create udev rules that make symlinks I can find programmatically (suffixed with container-link). The same udev rule also runs a script once the USB devices come online.
cat << 'EOF' | sudo tee /etc/udev/rules.d/65-usb-for-containers.rules
SUBSYSTEM=="tty", ENV{ID_SERIAL}=="FTDI_FT232R_USB_UART_AC2F17KR", SYMLINK+="FTDI_FT232R_USB_UART_AC2F17KR-container-link", RUN+="/usr/local/bin/mk_usb-for-containers.sh"
SUBSYSTEM=="tty", ENV{ID_SERIAL}=="FTDI_FT232R_USB_UART_AQ00K6K3", SYMLINK+="FTDI_FT232R_USB_UART_AQ00K6K3-container-link", RUN+="/usr/local/bin/mk_usb-for-containers.sh"
SUBSYSTEM=="tty", ENV{ID_SERIAL}=="dresden_elektronik_ingenieurtechnik_GmbH_ConBee_II_DE2149131", SYMLINK+="dresden_elektronik_ingenieurtechnik_GmbH_ConBee_II_DE2149131-container-link", RUN+="/usr/local/bin/mk_usb-for-containers.sh"
EOF
In the script I copy the device nodes to a separate location. Copying device nodes can be done with tar (stackoverflow.com) or with (stackoverflow.com) cp -R (ibm.com).
cat << 'EOF' | sudo tee /usr/local/bin/mk_usb-for-containers.sh
#!/usr/bin/env bash
sudo rm -f /lxc/201/devices/*container-link && sudo cp -Lrp /dev/*-container-link /lxc/201/devices/ && sudo chown 100000:100020 /lxc/201/devices/*
EOF
sudo chmod 0750 /usr/local/bin/mk_usb-for-containers.sh
Then reload rules to test this is working
sudo udevadm control --reload-rules && sudo service udev restart && sudo udevadm trigger
Then create the lxc mount points based on the new links in my separate location:
lxc.cgroup2.devices.allow: c 188:* rwm
lxc.mount.entry: /lxc/201/devices/FTDI_FT232R_USB_UART_AC2F17KR-container-link dev/usb-FTDI_FT232R_USB_UART_AC2F17KR-if00-port0 none bind,optional,create=file
lxc.mount.entry: /lxc/201/devices/FTDI_FT232R_USB_UART_AQ00K6K3-container-link dev/usb-FTDI_FT232R_USB_UART_AQ00K6K3-if00-port0 none bind,optional,create=file
lxc.cgroup2.devices.allow: c 166:* rwm
lxc.mount.entry: /lxc/201/devices/dresden_elektronik_ingenieurtechnik_GmbH_ConBee_II_DE2149131-container-link dev/usb-dresden_elektronik_ingenieurtechnik_GmbH_ConBee_II_DE2149131-if00 none bind,optional,create=file
Now apply to Home Assistant Docker image
root@pve:/etc/pve/lxc# ls -l /dev/serial/by-id/usb-dresden_elektronik_ingenieurtechnik_GmbH_ConBee_II_DE2149131-if00
# lrwxrwxrwx 1 root root 13 Jul 27 20:50 /dev/serial/by-id/usb-dresden_elektronik_ingenieurtechnik_GmbH_ConBee_II_DE2149131-if00 -> ../../ttyACM0
root@pve:/etc/pve/lxc# ls -l /dev/ttyACM0
#crw-rw---- 1 root dialout 166, 0 Jul 27 20:51 /dev/ttyACM0
mkdir -p /lxc/201/devices
cd /lxc/201/devices/
sudo mknod -m 660 usb-dresden_elektronik_ingenieurtechnik_GmbH_ConBee_II_DE2149131-if00 c 166 0
sudo chown 100000:100020 usb-dresden_elektronik_ingenieurtechnik_GmbH_ConBee_II_DE2149131-if00
ls -al /lxc/201/devices/usb-dresden_elektronik_ingenieurtechnik_GmbH_ConBee_II_DE2149131-if00
cat << 'EOF' | sudo tee -a /etc/pve/lxc/201.conf
lxc.cgroup2.devices.allow: c 166:* rwm
lxc.mount.entry: /lxc/201/devices/usb-dresden_elektronik_ingenieurtechnik_GmbH_ConBee_II_DE2149131-if00 dev/usb-dresden_elektronik_ingenieurtechnik_GmbH_ConBee_II_DE2149131-if00 none bind,optional,create=file
EOF
Error when restarting Docker container – WIP
Error response from daemon: failed to create task for container: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: error during container init: error creating device nodes: mount /dev/usb-dresden_elektronik_ingenieurtechnik_GmbH_ConBee_II_DE2149131-if00:/var/lib/docker/overlay2/963a244fa0d220f872cc0e02714e6045b112c5db6404ce5a47903ec936b2e51e/merged/dev/usb-dresden_elektronik_ingenieurtechnik_GmbH_ConBee_II_DE2149131-if00 (via /proc/self/fd/6), flags: 0x1000: no such file or directory: unknown
Not due to old docker: https://forum.proxmox.com/threads/docker-container-would-not-start-after-upgrading-proxmox-ve-8-1-to-8-2-with-oci-runtime-create-failed.145875/ (proxmox.com)
Check mount points (docker.com)
sudo docker container inspect b52be18b84a5 --format '{{ json .Mounts }}' | jq
[
{
"Type": "bind",
"Source": "/var/lib/homeassistant",
"Destination": "/config",
"Mode": "rw",
"RW": true,
"Propagation": "rprivate"
},
{
"Type": "bind",
"Source": "/etc/localtime",
"Destination": "/etc/localtime",
"Mode": "ro",
"RW": false,
"Propagation": "rprivate"
}
]
Recreate merged/ folder
https://stackoverflow.com/questions/65655199/how-to-fix-docker-container-after-deleting-overlay2-folder (stackoverflow.com) https://stackoverflow.com/questions/70666374/how-to-disable-docker-diff (stackoverflow.com) https://github.com/docker/for-mac/issues/1396 (github.com) – for Mac specifically
docker system prune --all
sudo systemctl stop docker
sudo rm -r /var/lib/docker/overlay2/
sudo systemctl restart docker
DOCKER_BUILDKIT=0 docker compose down && docker compose build --progress plain --no-cache && docker compose up
--> this didn't recreate my images
sudo docker system prune -a
sudo docker compose -f pigallery2-compose.yml pull
sudo docker compose -f pigallery2-compose.yml down
sudo docker system prune
sudo docker compose -f pigallery2-compose.yml up -d
sudo docker compose -f nextcloud-compose.yml pull
sudo docker compose -f nextcloud-compose.yml down
sudo docker system prune
sudo docker compose -f nextcloud-compose.yml up -d
sudo docker compose -f home-assistant-compose.yml pull
sudo docker compose -f home-assistant-compose.yml down
sudo docker system prune
sudo docker compose -f home-assistant-compose.yml up -d
Alternative: retry adding USB using cgroups (stackoverflow.com)
https://stackoverflow.com/questions/24225647/docker-a-way-to-give-access-to-a-host-usb-or-serial-device (stackoverflow.com) https://marc.merlins.org/perso/linux/post_2018-12-20_Accessing-USB-Devices-In-Docker-_ttyUSB0_-dev-bus-usb-_-for-fastboot_-adb_-without-using-privileged.html (merlins.org)
Run zigbee2mqtt - WIP ¶
on Linux ¶
Because USB in docker is not working, running zigbee2mqtt (zigbee2mqtt.io) separately might be more robust.
Check that nodejs is not used anywhere
apt-cache rdepends --installed nodejs
Get the custom apt repo for nodejs. I don’t like running scripts from the internet as root, so I downloaded the install script and decomposed it into separate commands.
# sudo curl -fsSL https://deb.nodesource.com/setup_20.x | sudo -E bash -
sudo apt install -y apt-transport-https ca-certificates curl gnupg
sudo mkdir -p /usr/share/keyrings
sudo rm -f /usr/share/keyrings/nodesource.gpg || true
sudo rm -f /etc/apt/sources.list.d/nodesource.list || true
curl -fsSL https://deb.nodesource.com/gpgkey/nodesource-repo.gpg.key | sudo gpg --dearmor -o /usr/share/keyrings/nodesource.gpg
node_version="20.x"
arch=$(dpkg --print-architecture)
echo "deb [arch=$arch signed-by=/usr/share/keyrings/nodesource.gpg] https://deb.nodesource.com/node_$node_version nodistro main" | sudo tee /etc/apt/sources.list.d/nodesource.list
# N|solid Config
echo "Package: nsolid" | sudo tee /etc/apt/preferences.d/nsolid > /dev/null
echo "Pin: origin deb.nodesource.com" | sudo tee -a /etc/apt/preferences.d/nsolid > /dev/null
echo "Pin-Priority: 600" | sudo tee -a /etc/apt/preferences.d/nsolid > /dev/null
# Nodejs Config
echo "Package: nodejs" | sudo tee /etc/apt/preferences.d/nodejs > /dev/null
echo "Pin: origin deb.nodesource.com" | sudo tee -a /etc/apt/preferences.d/nodejs > /dev/null
echo "Pin-Priority: 600" | sudo tee -a /etc/apt/preferences.d/nodejs > /dev/null
sudo apt update -y
and install nodejs
sudo apt install nodejs git make g++ gcc libsystemd-dev
# g++ is already the newest version (4:10.2.1-1).
# gcc is already the newest version (4:10.2.1-1).
# git is already the newest version (1:2.30.2-1+deb11u2).
# make is already the newest version (4.3-4.1).
# The following NEW packages will be installed:
# libsystemd-dev nodejs
# Verify that the correct nodejs and npm (automatically installed with nodejs)
# version has been installed
node --version # Should output v18.x, v20.x, or v21.x
npm --version # Should output 9.X or 10.X
Create a new user and directory for zigbee2mqtt and make that user the owner. We do need a home dir as npm uses it for dependencies.
sudo adduser --disabled-login zigbee2mqtt
sudo adduser zigbee2mqtt dialout
sudo mkdir /opt/zigbee2mqtt
sudo chown -R zigbee2mqtt: /opt/zigbee2mqtt
Create mosquitto account
passwd=$(openssl rand -base64 24)
sudo mosquitto_passwd -b /etc/mosquitto/passwd zigbee2mqtt "${passwd}"
sudo systemctl restart mosquitto.service
Clone Zigbee2MQTT repository
sudo -u zigbee2mqtt git clone --depth 1 https://github.com/Koenkk/zigbee2mqtt.git /opt/zigbee2mqtt
Install dependencies (as user zigbee2mqtt)
cd /opt/zigbee2mqtt
sudo -u zigbee2mqtt npm ci
If this command fails with an ERR_SOCKET_TIMEOUT error, run npm ci --maxsockets 1 instead.
Build the app
sudo -u zigbee2mqtt npm run build
Copy and open the configuration file
sudo -u zigbee2mqtt cp /opt/zigbee2mqtt/data/configuration.example.yaml /opt/zigbee2mqtt/data/configuration.yaml
sudo -u zigbee2mqtt vim /opt/zigbee2mqtt/data/configuration.yaml
# Home Assistant integration (MQTT discovery)
homeassistant: true
# allow new devices to join, set this to false by default, then temporarily allow via frontend, see https://www.zigbee2mqtt.io/guide/usage/pairing_devices.html#frontend-recommended
permit_join: false
# MQTT settings
mqtt:
# MQTT base topic for zigbee2mqtt MQTT messages
base_topic: zigbee2mqtt
# MQTT server URL
server: 'mqtt://localhost'
# MQTT server authentication, uncomment if required:
user: user
password: password
# Choose your channel carefully, see. https://www.metageek.com/training/resources/zigbee-wifi-coexistence/ and https://www.zigbee2mqtt.io/advanced/zigbee/02_improve_network_range_and_stability.html#reduce-wi-fi-interference-by-changing-the-zigbee-channel
channel: 11
# Note: all options are optional
availability:
active:
# Time after which an active device will be marked as offline in
# minutes (default = 10 minutes)
timeout: 5
passive:
# Time after which a passive device will be marked as offline in
# minutes (default = 1500 minutes aka 25 hours)
timeout: 30
# Serial settings
serial:
# Location of CC2531 USB sniffer
#port: /dev/ttyACM0
port: /dev/usb-dresden_elektronik_ingenieurtechnik_GmbH_ConBee_II_DE2149131-if00
adapter: deconz
advanced:
# Set network_key: GENERATE to let Zigbee2MQTT generate a new random key on the first start. The configuration.yml gets updated with the new key. Changing the network_key requires repairing of all devices
# https://www.zigbee2mqtt.io/guide/configuration/zigbee-network.html#network-config
network_key: GENERATE
frontend: true
Start
cd /opt/zigbee2mqtt
sudo -u zigbee2mqtt npm start
Start as daemon
cat << 'EOF' | sudo tee /etc/systemd/system/zigbee2mqtt.service
[Unit]
Description=zigbee2mqtt
After=network.target
[Service]
Environment=NODE_ENV=production
Type=notify
ExecStart=/usr/bin/node index.js
WorkingDirectory=/opt/zigbee2mqtt
StandardOutput=inherit
# Or use StandardOutput=null if you don't want Zigbee2MQTT messages filling syslog, for more options see systemd.exec(5)
StandardError=inherit
WatchdogSec=10s
Restart=always
RestartSec=10s
User=zigbee2mqtt
[Install]
WantedBy=multi-user.target
EOF
Now configure at http://localhost:8080
Ensure you set the Reporting settings correctly; e.g. some Innr plugs report energy updates only on large changes (~0.5 kWh). Concretely, set Min rep change to ~0.001 for cluster seMetering & attribute currentSummDelivered. See also this github issue (github.com).
On Docker - doesn’t work ¶
Create mosquitto account
passwd=$(openssl rand -base64 24)
sudo mosquitto_passwd -b /etc/mosquitto/passwd zigbee2mqtt "${passwd}"
sudo systemctl restart mosquitto.service
Create data directory, get default config, update mosquitto account
sudo mkdir /var/lib/zigbee2mqtt/
sudo chmod o-rwx /var/lib/zigbee2mqtt/
sudo wget https://raw.githubusercontent.com/Koenkk/zigbee2mqtt/master/data/configuration.yaml -P /var/lib/zigbee2mqtt/data
sudo chmod o-rwx /var/lib/zigbee2mqtt/data/configuration.yaml
Init docker compose config (zigbee2mqtt.io):
cat << 'EOF' | tee ~tim/docker/zigbee2mqtt-compose.yml
# Start: docker compose up -d zigbee2mqtt
# Update: docker compose pull zigbee2mqtt && docker compose up -d zigbee2mqtt
# version: '3.8' # obsolete
services:
zigbee2mqtt:
container_name: zigbee2mqtt
image: koenkk/zigbee2mqtt
restart: unless-stopped
volumes:
- /var/lib/zigbee2mqtt/data:/app/data
- /run/udev:/run/udev:ro
ports:
# Frontend port
- 8080:8080
environment:
- TZ=Europe/Berlin
devices:
# Make sure this matches your adapter location
- /dev/usb-dresden_elektronik_ingenieurtechnik_GmbH_ConBee_II_DE2149131-if00:/dev/ttyACM0
# Optional: run as rootless mode, but has quite some caveats (see https://docs.docker.com/engine/security/rootless/)
# group_add:
# - dialout
# user: 1000:1000
EOF
Start
sudo docker compose -f zigbee2mqtt-compose.yml up -d
Creates error for usb forward
Error response from daemon: failed to create task for container: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: error during container init: error creating device nodes: mount /dev/ttyACM0:/var/lib/docker/overlay2/666a6b67f09494fbe8eef8cd80abf60e8d3d091a94bfacb4cb871d6465f526d1/merged/dev/ttyACM0 (via /proc/self/fd/6), flags: 0x1000: no such file or directory: unknown
Clean up docker container
Reverse proxy via nginx ¶
server {
listen 443 ssl http2;
listen [::]:443 ssl http2;
server_name homeassistant.vanwerkhoven.org;
location / {
# include snippets/nginx-server-proxy-tim.conf;
# TvW 20230222 Default options for server blocks acting as reverse proxy. Should be part of location / { }
# include modules-available/nginx-server-proxy-tim.conf;
proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
proxy_set_header X-Forwarded-Proto $scheme;
proxy_set_header X-Forwarded-Host $server_name;
#proxy_set_header X-Forwarded-Ssl on;
#proxy_set_header Upgrade $http_upgrade;
#proxy_set_header Connection "upgrade";
#client_max_body_size 16G;
proxy_buffering off;
proxy_pass http://127.0.0.1:8123;
proxy_set_header Upgrade $http_upgrade;
proxy_set_header Connection "upgrade";
}
# include snippets/nginx-server-ssl-tim.conf;
# TvW 20230222 Default options for server blocks serving ssl
# Added 20190122 TvW Add HTTPS strict transport security
add_header Strict-Transport-Security "max-age=31536000; includeSubDomains" always;
ssl_certificate /etc/letsencrypt/live/vanwerkhoven.org/fullchain.pem; # managed by Certbot
ssl_certificate_key /etc/letsencrypt/live/vanwerkhoven.org/privkey.pem; # managed by Certbot
include /etc/letsencrypt/options-ssl-nginx.conf; # managed by Certbot
ssl_dhparam /etc/letsencrypt/ssl-dhparams.pem; # managed by Certbot
}
Migrate HA to MariaDB ¶
@TODO Figure out how to set up MariaDB later
Optimize configuration: add mariadb, influxdb, tweak recorder to only store relevant stuff https://smarthomescene.com/guides/optimize-your-home-assistant-database/ (smarthomescene.com) https://community.home-assistant.io/t/migrating-home-assistant-database-from-sqlite-to-mariadb/96895 (home-assistant.io)
# Remove
sudo apt remove mysql-server-8.0 mysql-server mysql-client-8.0 mysql-client-core-8.0 mysql-common
sudo apt-get remove --purge
sudo apt install mariadb-server
# Fix apparmor because of old mysql installation
# https://askubuntu.com/questions/1185710/mariadb-fails-despite-apparmor-profile
# https://stackoverflow.com/questions/40997257/mysql-service-fails-to-start-hangs-up-timeout-ubuntu-mariadb
echo "# TvW 20230127 fix apparmor issue mariadb" | sudo tee -a /etc/apparmor.d/usr.sbin.mysqld
echo "/usr/sbin/mysqld { }" | sudo tee -a /etc/apparmor.d/usr.sbin.mysqld
sudo apparmor_parser -v -R /etc/apparmor.d/usr.sbin.mysqld
sudo systemctl restart mariadb
sudo reboot
# Did not help? Try reboot
#sudo /etc/init.d/apparmor reload
sudo mysql_secure_installation
## create database
mysql -e 'CREATE SCHEMA IF NOT EXISTS `hass_db` DEFAULT CHARACTER SET utf8mb4'
## create user (use a safe password please)
mysql -e "CREATE USER 'hass_user'@'localhost' IDENTIFIED BY 'pwd'"
mysql -e "GRANT ALL PRIVILEGES ON hass_db.* TO 'hass_user'@'localhost'"
mysql -e "GRANT usage ON *.* TO 'hass_user'@'localhost'"
Migrate: method 1, use only SQL
pip install sqlite3-to-mysql
sqlite3mysql -f ./home-assistant_v2.db -d hass_db -u hass_user -p
Migrate: method 2, todo
sqlite3 ~homeassistant/.homeassistant/home-assistant_v2.db .dump > hadump.sql
git clone https://github.com/athlite/sqlite3-to-mysql
recorder:
auto_purge: true
purge_keep_days: 21
auto_repack: true
db_url: mysql://user:password@localhost/homeassistant?unix_socket=/var/run/mysqld/mysqld.sock&charset=utf8mb4
Grafana ¶
We can either use apt or the docker image. I go for apt here so I can more easily re-use my letsencrypt certificate via /etc/grafana/grafana.ini (grafana.com).
Install for Debian (grafana.com)
sudo apt-get install -y apt-transport-https
sudo apt-get install -y software-properties-common wget
sudo wget -q -O /usr/share/keyrings/grafana.key https://apt.grafana.com/gpg.key
Add repo
echo "deb [signed-by=/usr/share/keyrings/grafana.key] https://apt.grafana.com stable main" | sudo tee -a /etc/apt/sources.list.d/grafana.list
Install
sudo apt-get install grafana
Start now & start automatically
sudo systemctl daemon-reload
sudo systemctl start grafana-server
sudo systemctl status grafana-server
sudo systemctl enable grafana-server.service
Enable HTTPS using letsencrypt certificate (grafana.com)
sudo ln -s /etc/letsencrypt/live/vanwerkhoven.org/privkey.pem /etc/grafana/grafana.key
sudo ln -s /etc/letsencrypt/live/vanwerkhoven.org/fullchain.pem /etc/grafana/grafana.crt
# Allow access
sudo groupadd letsencrypt-cert
sudo usermod --append --groups letsencrypt-cert grafana
sudo chgrp -R letsencrypt-cert /etc/letsencrypt/*
sudo chmod -R g+rx /etc/letsencrypt/*
sudo chgrp -R grafana /etc/grafana/grafana.crt /etc/grafana/grafana.key
sudo chmod 400 /etc/grafana/grafana.crt /etc/grafana/grafana.key
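The corresponding grafana.ini settings then look roughly like this (a sketch: section and key names per the Grafana server configuration reference, domain is mine). Restart grafana-server afterwards.

```ini
[server]
protocol = https
http_port = 3000
domain = grafana.vanwerkhoven.org
cert_file = /etc/grafana/grafana.crt
cert_key = /etc/grafana/grafana.key
```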
Migrate configuration
- Install used plugin on new server – later
- Stop Grafana service on source and destination server – OK
- Copy /var/lib/grafana/grafana.db from old to new server – OK
- Check /etc/grafana/grafana.ini - OK
- Reconnect to datasource
scp -P 10022 tim@172.17.10.107:/etc/grafana/grafana.ini /etc/grafana/grafana.ini-migrate
sudo diff /etc/grafana/grafana.ini /etc/grafana/grafana.ini-migrate
Set up notifications for everything https://grafana.com/docs/grafana/latest/alerting/fundamentals/alert-rules/message-templating/ (grafana.com) https://grafana.com/docs/grafana/latest/alerting/manage-notifications/template-notifications/using-go-templating-language/ (grafana.com)
Nextcloud ¶
Install the regular Docker image (github.com) (instead of the all-in-one image with possibly too much junk). Enabling cron requires a separate container (github.com); link to Redis by setting:
cat << 'EOF' | tee ~tim/docker/nextcloud-compose.yml
# https://github.com/nextcloud/docker#running-this-image-with-docker-compose
# docker compose -f nextcloud-compose.yml up -d
volumes:
nextcloud:
db:
services:
db:
image: mariadb:10.11
restart: always
command: --transaction-isolation=READ-COMMITTED --log-bin=binlog --binlog-format=ROW
volumes:
- db:/var/lib/mysql
environment:
- MYSQL_ROOT_PASSWORD=dtgUUVhZcbuGmks4gajHBHcnXX2yXyve
- MYSQL_PASSWORD=LlI8Fcp0vBi7QIgcwQ02vFGtH8I2Wn2B
- MYSQL_DATABASE=nextcloud
- MYSQL_USER=nextcloud
redis:
image: redis:alpine
restart: always
app:
image: nextcloud:30
restart: always
ports:
- 9081:80
depends_on:
- db
- redis
links:
- db
- redis
volumes:
- nextcloud:/var/www/html
environment:
- MYSQL_PASSWORD=LlI8Fcp0vBi7QIgcwQ02vFGtH8I2Wn2B
- MYSQL_DATABASE=nextcloud
- MYSQL_USER=nextcloud
- MYSQL_HOST=db
- APACHE_DISABLE_REWRITE_IP=1
- TRUSTED_PROXIES=172.17.10.0/24
- REDIS_HOST=redis
cron:
image: nextcloud:30
volumes_from:
- app
entrypoint: /cron.sh
EOF
sudo docker compose -f nextcloud-compose.yml up -d
Migrate data, then force Nextcloud to scan for new files (nextcloud.com) (see useful docs on external storage (nextcloud.com)).
rsync -e 'ssh -p 10022' --archive --progress --verbose root@172.17.10.107:/var/snap/nextcloud/common/nextcloud/data/timlow/files/ .
docker exec -u www-data 715a4e700667 php occ files:scan --all
Add https schema for reverse proxy config (nextcloud.com)
vim /var/lib/docker/volumes/docker_nextcloud/_data/config/config.php
# 'overwrite.cli.url' => 'https://nextcloud.vanwerkhoven.org'
# 'overwritehost' => 'nextcloud.vanwerkhoven.org',
# 'overwriteprotocol' => 'https',
# 'overwritewebroot' => '/',
Update: should probably not do this but instead use parameters like specified here (github.com):
environment:
- APACHE_DISABLE_REWRITE_IP=1
- TRUSTED_PROXIES=172.17.10.0/24
Enable file uploads >2M
- In nextcloud php.ini
- In nginx
Optional: make certain folders accessible outside Docker using a bind mount, first as a trial, then permanently on boot via /etc/fstab:
sudo mount -o bind /var/lib/docker/volumes/docker_nextcloud/_data/data/tim/files/alexandra /media/alexandra
cat << 'EOF' | sudo tee -a /etc/fstab
/var/lib/docker/volumes/docker_nextcloud/_data/data/tim/files/alexandra /media/alexandra none bind
EOF
Running OCC commands (github.com) is done as:
docker compose exec --user www-data app php occ
sudo docker compose -f nextcloud-compose.yml exec --user www-data app php occ db:add-missing-indices
sudo docker compose -f nextcloud-compose.yml exec --user www-data app php occ maintenance:repair --include-expensive
Upgrading nextcloud is documented on github (github.com)
- Increment major version by 1 in compose file
- Run
docker compose pull && docker compose up -d
This worked smoothly from 24 –> 25 –> 26 –> 27 –> 28 –> 29 –> 30, and MariaDB from 10.5 to 10.11 (at the final step as suggested by nextcloud itself).
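The version bump itself can be scripted; a sketch, demonstrated on a stand-in file (in practice point it at nextcloud-compose.yml and run the pull/up between steps):

```shell
# Bump the pinned Nextcloud major by one before pulling (demo on a temp file)
compose=$(mktemp)
printf 'image: nextcloud:29\n' > "$compose"    # stand-in for the real compose file
next=30
sed -i -E "s|image: nextcloud:[0-9]+|image: nextcloud:${next}|" "$compose"
grep 'image: nextcloud' "$compose"             # → image: nextcloud:30
# then: sudo docker compose -f nextcloud-compose.yml pull && sudo docker compose -f nextcloud-compose.yml up -d
```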
If you get 502 Bad Gateway
from nginx, nextcloud is likely still starting up.
PiGallery2 ¶
Configure docker compose file (github.com) for PiGallery2 only (github.com), we do the reverse nginx proxy ourselves. Furthermore, bind mount (docker.com) the images directory directly to the source in the Nextcloud Docker volume.
cat << 'EOF' > ~tim/docker/pigallery2-compose.yml
version: '3.2'
# Version 3.2 required for long-syntax volume configuration -- see https://docs.docker.com/compose/compose-file/compose-file-v3/#volumes
# Source: https://github.com/bpatrik/pigallery2/blob/master/docker/README.md
# docker compose -f pigallery2-compose.yml up -d
services:
pigallery2:
image: bpatrik/pigallery2:latest
container_name: pigallery2
environment:
- NODE_ENV=production # set to 'debug' for full debug logging
volumes:
- "/var/lib/pigallery/config:/app/data/config" # CHANGE ME -> OK
- "db-data:/app/data/db"
# - "/media/alexandra:/app/data/images:ro" # CHANGE ME -> OK
- type: bind
source: /var/lib/docker/volumes/docker_nextcloud/_data/data/tim/files/alexandra/
target: /app/data/images
read_only: true
- "/var/lib/pigallery/tmp:/app/data/tmp" # CHANGE ME -> OK
ports:
- 3010:80
restart: always
volumes:
db-data:
EOF
sudo docker compose -f pigallery2-compose.yml pull
sudo docker compose -f pigallery2-compose.yml up -d
Add virtual host, something like below:
server {
listen 443 ssl http2;
listen [::]:443 ssl http2;
server_name photos.vanwerkhoven.org;
location / {
include snippets/nginx-server-proxy-tim.conf;
client_max_body_size 1G;
proxy_pass http://127.0.0.1:3010;
}
include snippets/nginx-server-ssl-tim.conf;
ssl_certificate /etc/letsencrypt/live/vanwerkhoven.org/fullchain.pem; # managed by Certbot
ssl_certificate_key /etc/letsencrypt/live/vanwerkhoven.org/privkey.pem; # managed by Certbot
}
Login, configure settings, create sharing link.
Mosquitto ¶
Install daemon and clients
sudo apt install mosquitto mosquitto-clients
Port configuration, don’t use SSL for now
cat << 'EOF' | sudo tee /etc/mosquitto/conf.d/tim.conf
# TvW 20190818
# From https://www.digitalocean.com/community/questions/how-to-setup-a-mosquitto-mqtt-server-and-receive-data-from-owntracks
connection_messages true
log_timestamp true
# https://www.digitalocean.com/community/tutorials/how-to-install-and-secure-the-mosquitto-mqtt-messaging-broker-on-ubuntu-16-04
# TvW 201908
allow_anonymous false
password_file /etc/mosquitto/passwd
listener 1883
EOF
cat << 'EOF' | sudo tee /etc/mosquitto/conf.d/ssl-tim.conf.off
# Letsencrypt needs different CA https://mosquitto.org/blog/2015/12/using-lets-encrypt-certificates-with-mosquitto/
# Or not?
#cafile /etc/ssl/certs/DST_Root_CA_X3.pem
certfile /etc/letsencrypt/live/vanwerkhoven.org/cert.pem
cafile /etc/letsencrypt/live/vanwerkhoven.org/chain.pem
keyfile /etc/letsencrypt/live/vanwerkhoven.org/privkey.pem
tls_version tlsv1.2
listener 8883
EOF
Port users from old server
sudo touch /etc/mosquitto/passwd
sudo chown mosquitto /etc/mosquitto/passwd
sudo chmod og-rwx /etc/mosquitto/passwd
cat << 'EOF' | sudo tee -a /etc/mosquitto/passwd
user:$6$SALT$7HASH==
EOF
Test-run the config (sudo is important, else you might get Error: Unable to write pid file.)
sudo /usr/sbin/mosquitto -c /etc/mosquitto/mosquitto.conf -v
If running mosquitto >2.0 and using letsencrypt certificates, make sure to copy them properly after deployment (mosquitto.org) using e.g. this script (github.com). I’m not using this as it requires too many moving parts. Instead, consider using a 100-year self-signed cert.
Worker scripts ¶
co2signal2influxdb ¶
Done, no extra actions
SBFspot ¶
Because it’s not possible to tunnel Bluetooth into LXC guests, install this one on the pve host (source (~January 2020) (proxmox.com), translated from German: ‘Bluetooth in Linux works as a network device: when the matching driver for the Bluetooth hardware is loaded, the hci device registers not as USB but as a network adapter. Passing the USB /dev node through is therefore useless, because the communication runs over entirely different interfaces.’). NB: this (github.com) and this (proxmox.com) don’t work for Bluetooth.
Install SBFspot (github.com) from source (no builds for x86):
sudo apt-get -y --no-install-recommends install bluetooth libbluetooth-dev
sudo apt-get install -y libboost-date-time-dev libboost-system-dev libboost-filesystem-dev libboost-regex-dev
sudo apt-get install -y sqlite3 libsqlite3-dev
sudo apt-get install -y g++
sudo apt-get install -y mosquitto-clients
sudo mkdir /var/log/sbfspot.3
sudo chown -R $USER:$USER /var/log/sbfspot.3
sbfspot_version=3.9.7
wget -c https://github.com/SBFspot/SBFspot/archive/V$sbfspot_version.tar.gz
# Slightly tweaked from docs
mkdir sbfspot-$sbfspot_version
tar -xvf V$sbfspot_version.tar.gz -C sbfspot-$sbfspot_version --strip-components 1
cd sbfspot-$sbfspot_version/SBFspot
make -j4 sqlite
sudo make install_sqlite
Port data / configuration from previous setup
scp -r -P 10022 tim@172.17.10.107:/usr/local/bin/sbfspot.3/SBFspot.cfg /usr/local/bin/sbfspot.3/
sudo chown root:tim /usr/local/bin/sbfspot.3/SBFspot.cfg
sudo chmod o-r SBFspot.cfg
sudo chmod g+w SBFspot.cfg
rsync -av -e "ssh -p 10022" tim@172.17.10.107:/var/lib/sbfspot /var/lib/
Check bluetooth devices, select one to use
hcitool dev
# Devices:
# hci1 00:19:0E:07:4E:47 # Belkin (Atech)
# hci0 04:EA:56:87:A6:12 # Intel
Add to cron
# SBFspot, every minute in sync with smeterd. since SBFspot/bluetooth don't
*/1 5-22 * * * /home/tim/workers/SBFspot2influxdb/get_sbfspot_daydata.sh
30 23 * * * /usr/local/bin/sbfspot.3/SBFspot -sp0 -ad7 -am2 -ae2 -finq -q 2>&1 | logger -p user.err; /home/tim/workers/SBFspot2influxdb/sbfspot_month2influxdb.sh
Repair any gaps from SMA history
/home/tim/workers/sbfspot2influxdb/sbfspot_month2influxdb.sh 20230714
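To repair a longer gap, the same script can be looped over a date range with GNU date. A minimal sketch (the range is illustrative); it prints the commands so you can inspect before piping to sh:

```shell
# Print the repair command for each day in an inclusive YYYYMMDD range.
d=20230714
while [ "$d" -le 20230716 ]; do
  echo "/home/tim/workers/sbfspot2influxdb/sbfspot_month2influxdb.sh $d"
  d=$(date -d "$d + 1 day" +%Y%m%d)
done
```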
epexspot2influxdb.py ¶
pip install entsoe-py
heat meter ¶
On host: identify & forward USB port (github.com)
ls -l /dev/serial/by-id/usb-FTDI_FT232R_USB_UART_AQ00K6K3-if00-port0
# lrwxrwxrwx 1 root root 13 Jul 15 22:01 /dev/serial/by-id/usb-FTDI_FT232R_USB_UART_AQ00K6K3-if00-port0 -> ../../ttyUSB1
ls -l /dev/ttyUSB1
# crw-rw---- 1 root dialout 188, 1 Jul 15 22:03 /dev/ttyUSB1
mkdir -p /lxc/201/devices
cd /lxc/201/devices/
sudo mknod -m 660 usb-FTDI_FT232R_USB_UART_AQ00K6K3-if00-port0 c 188 1
sudo chown 100000:100020 usb-FTDI_FT232R_USB_UART_AQ00K6K3-if00-port0
ls -al /lxc/201/devices/usb-FTDI_FT232R_USB_UART_AQ00K6K3-if00-port0
cat << 'EOF' | sudo tee -a /etc/pve/lxc/201.conf
lxc.cgroup2.devices.allow: c 188:* rwm
lxc.mount.entry: /lxc/201/devices/usb-FTDI_FT232R_USB_UART_AQ00K6K3-if00-port0 dev/usb-FTDI_FT232R_USB_UART_AQ00K6K3-if00-port0 none bind,optional,create=file
EOF
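The c 188 1 arguments to mknod are the major and minor device numbers from the ls -l output above (fields 5 and 6). If you want to script this, they can be extracted like so (the sample line is hard-coded for illustration):

```shell
# Extract char-device major/minor from an `ls -l` output line.
line='crw-rw---- 1 root dialout 188, 1 Jul 15 22:03 /dev/ttyUSB1'
read -r major minor <<< "$(awk '{gsub(",", "", $5); print $5, $6}' <<< "$line")"
echo "mknod -m 660 <name> c $major $minor"   # -> mknod -m 660 <name> c 188 1
```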
smeterd ¶
On host: identify & forward USB port (github.com)
ls -l /dev/serial/by-id/usb-FTDI_FT232R_USB_UART_AC2F17KR-if00-port0
# lrwxrwxrwx 1 root root 13 Jul 15 22:01 /dev/serial/by-id/usb-FTDI_FT232R_USB_UART_AC2F17KR-if00-port0 -> ../../ttyUSB0
ls -l /dev/ttyUSB0
# crw-rw---- 1 root dialout 188, 0 Jul 15 22:01 /dev/ttyUSB0
mkdir -p /lxc/201/devices
cd /lxc/201/devices/
sudo mknod -m 660 usb-FTDI_FT232R_USB_UART_AC2F17KR-if00-port0 c 188 0
sudo chown 100000:100020 usb-FTDI_FT232R_USB_UART_AC2F17KR-if00-port0
ls -al /lxc/201/devices/usb-FTDI_FT232R_USB_UART_AC2F17KR-if00-port0
cat << 'EOF' | sudo tee -a /etc/pve/lxc/201.conf
lxc.cgroup2.devices.allow: c 188:* rwm
lxc.mount.entry: /lxc/201/devices/usb-FTDI_FT232R_USB_UART_AC2F17KR-if00-port0 dev/usb-FTDI_FT232R_USB_UART_AC2F17KR-if00-port0 none bind,optional,create=file
EOF
Get script, add user to dialout
sudo apt install python3-pip -y
pip install smeterd
sudo usermod --append --groups dialout tim
Live DNS IP updater ¶
@TODO migrate to new server & set live
Via gandi-live-dns-config.py
sudo install -m 600 -o tim -g tim /dev/null /etc/gandi-live-dns-config.py # equivalent to touch && chmod 600 && chown tim:tim
cat << 'EOF' | sudo tee /etc/gandi-live-dns-config.py
# my config
api_secret='secret API string goes here'
domains={'vanwerkhoven.org':['www','home','nextcloud','photos','alexandramaya']}
ttl='1800' # our IP doesn't change that often, 30min down is ~OK
ifconfig4='http://whatismyip.akamai.com' # returns ipv4
ifconfig6='' # disabled until we get IPv6 right for VPN/firewall/etc.
#ifconfig6='https://ifconfig.co/ip' # returns ipv6
interface='' # set empty because else we get local ipv6
EOF
# Add crontab entry
# TvW 20210927 Disabled because I want some subdomains ipv4-only (home) because
# of VPN. Also, if my IPv6 address changes I need to update router firewalling
# and port forwarding as well. -- Update: run all hostnames as ipv4 only for now
*/5 * * * * python3 /home/tim/workers/gandi-live-dns/src/gandi-live-dns.py >/dev/null 2>&1
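Since the cron entry fires every 5 minutes, it can be worth guarding the updater so the Gandi API is only hit when the detected address is valid and actually changed. A minimal sketch, assuming a cache file path of your choosing (the function names and paths are illustrative, not part of gandi-live-dns):

```shell
# Sketch: validate the fetched WAN IP and skip the update when unchanged.
is_ipv4() {
  [[ "$1" =~ ^([0-9]{1,3}\.){3}[0-9]{1,3}$ ]]
}

update_if_changed() {
  local ip="$1" cache="$2"
  is_ipv4 "$ip" || { echo "invalid"; return 1; }
  if [[ -f "$cache" && "$(cat "$cache")" == "$ip" ]]; then
    echo "unchanged"
  else
    echo "$ip" > "$cache"
    echo "updated"   # here you would invoke gandi-live-dns.py
  fi
}

cache="$(mktemp)"
update_if_changed "192.0.2.10" "$cache"   # updated
update_if_changed "192.0.2.10" "$cache"   # unchanged
```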
Collectd ¶
@TODO
Collect VyOS stats via SNMP ¶
- https://collectd.org/wiki/index.php/Plugin:SNMP (collectd.org)
- https://support.vyos.io/en/kb/articles/snmpv3 (vyos.io)
- https://docs.vyos.io/en/latest/configuration/service/snmp.html (vyos.io)
- https://forum.vyos.io/t/difficulty-monitoring-vyos-through-snmp/4146 (vyos.io)
Set up SNMP on VyOS
Get SNMP browser - e.g. https://www.ireasoning.com/mibbrowser.shtml (ireasoning.com)
Nginx ¶
Here we install and configure nginx. This DigitalOcean guide (digitalocean.com) is a useful reference for nginx configuration.
More sources:
- https://linuxize.com/post/secure-nginx-with-let-s-encrypt-on-debian-10/ (linuxize.com)
- https://community.letsencrypt.org/t/certbot-auto-no-longer-works-on-debian-based-systems/139702/7 (letsencrypt.org)
Base install
sudo apt install nginx
Inspect, clean, and migrate nginx configuration:
sudo install -m 644 -o root -g root /dev/null /etc/nginx/conf.d/nginx-http-tim.conf # equivalent to touch && chmod 644 && chown root:root
cat << 'EOF' | sudo tee /etc/nginx/conf.d/nginx-http-tim.conf
# TvW 20230222 Additional default http block configuration settings, included automatically by default nginx.conf
# TvW 20200604 Disabled don't advertise version
server_tokens off;
# Add log format separate per virtual host so we can use goaccess to view who visits the server
# Parse using
# `goaccess /var/log/nginx/access.log --log-format=VCOMBINED -o report-all.html`
log_format vcombined '$host: $remote_addr - $remote_user [$time_local] "$request" $status $body_bytes_sent "$http_referer" "$http_user_agent"';
access_log /var/log/nginx/access.log vcombined;
# TvW 20230222 expand gzip options - don't remember why, probably speed
gzip_vary on;
gzip_proxied any;
gzip_comp_level 6;
gzip_buffers 16 8k;
gzip_http_version 1.1;
gzip_types text/plain text/css application/json application/javascript text/xml application/xml application/xml+rss text/javascript;
EOF
sudo install -m 644 -o root -g root /dev/null /etc/nginx/snippets/nginx-server-site-tim.conf # equivalent to touch && chmod 644 && chown root:root
cat << 'EOF' | sudo tee /etc/nginx/snippets/nginx-server-site-tim.conf
# TvW 20230222 Default options for server blocks serving files
# include snippets/nginx-server-site-tim.conf;
# Add index.php to the list if you are using PHP
index index.php index.html index.htm index.nginx-debian.html;
location / {
    # First attempt to serve request as file, then
    # as directory, then fall back to displaying a 404.
    try_files $uri $uri/ =404;
    # This is cool because no php is touched for static content.
    # include the "?$args" part so non-default permalinks doesn't break when using query string
    #try_files $uri $uri/ /index.php?$args;
}

# deny access to .htaccess files, if Apache's document root
# concurs with nginx's one
location ~ /\.ht {
    deny all;
}

# Cache control
location ~* \.(?:js|css|png|jpg|jpeg|webp|gif|ico)$ {
    expires 30d;
    add_header Cache-Control "public, no-transform";
}
EOF
sudo install -m 644 -o root -g root /dev/null /etc/nginx/snippets/nginx-server-proxy-tim.conf # equivalent to touch && chmod 644 && chown root:root
cat << 'EOF' | sudo tee /etc/nginx/snippets/nginx-server-proxy-tim.conf
# TvW 20230222 Default options for server blocks acting as reverse proxy. Should be part of location / { }
# include snippets/nginx-server-proxy-tim.conf;
proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
proxy_set_header X-Forwarded-Proto $scheme;
proxy_set_header X-Forwarded-Host $server_name;
#proxy_set_header X-Forwarded-Ssl on;
#proxy_set_header Upgrade $http_upgrade;
#proxy_set_header Connection "upgrade";
EOF
sudo install -m 644 -o root -g root /dev/null /etc/nginx/snippets/nginx-server-ssl-tim.conf # equivalent to touch && chmod 644 && chown root:root
cat << 'EOF' | sudo tee /etc/nginx/snippets/nginx-server-ssl-tim.conf
# TvW 20230222 Default options for server blocks serving ssl
# include snippets/nginx-server-ssl-tim.conf;
# Added 20190122 TvW Add HTTPS strict transport security
add_header Strict-Transport-Security "max-age=31536000; includeSubDomains" always;
# Added 20190121 TvW Logjam attack - see weakdh.org
ssl_dhparam /etc/ssl/private/dhparams_weakdh.org.pem;
EOF
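Note that the ssl_dhparam file referenced above is not created by certbot or nginx; generate it once with openssl (2048 bits is the common minimum nowadays, and takes a few minutes on a NUC), then install it where the snippet expects it:

```shell
# Generate DH parameters once, then install with root-only permissions.
openssl dhparam -out /tmp/dhparams.pem 2048
sudo install -m 600 -o root -g root /tmp/dhparams.pem /etc/ssl/private/dhparams_weakdh.org.pem
```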
Fix logrotate conf to keep logs for a year (instead of 14 days):
cat << 'EOF' | sudo tee /etc/logrotate.d/nginx
/var/log/nginx/*.log {
    # Rotate weekly instead of default daily
    weekly
    missingok
    # Keep 52 instead of 14 files
    rotate 52
    compress
    # Don't delay, compress after first rotation
    # delaycompress
    notifempty
    create 0640 www-data adm
    sharedscripts
    prerotate
        if [ -d /etc/logrotate.d/httpd-prerotate ]; then \
            run-parts /etc/logrotate.d/httpd-prerotate; \
        fi \
    endscript
    postrotate
        invoke-rc.d nginx rotate >/dev/null 2>&1
    endscript
}
EOF
Parse logs (goaccess.io) into visually digestible data using GoAccess:
# --persist/--keep-db-files on all files parsed
# --restore/--load-from-disk on second & subsequent files parsed
mkdir -p /tmp/goaccess-{nextcloud,photos,all}/
# At reboot, run goaccess on all files, then run on latest file every 5min
zgrep --no-filename "^nextcloud.vanwerkhoven.org" /var/log/nginx/access.log* | nice -n 19 goaccess --log-format=VCOMBINED -o /var/www/html/stats/report-nextcloud.html --keep-db-files --db-path /tmp/goaccess-nextcloud/ -
zgrep --no-filename "^nextcloud.vanwerkhoven.org" /var/log/nginx/access.log | nice -n 19 goaccess --log-format=VCOMBINED -o /var/www/html/stats/report-nextcloud.html --load-from-disk --keep-db-files --db-path /tmp/goaccess-nextcloud/ -
zgrep --no-filename "^photos.vanwerkhoven.org" /var/log/nginx/access.log* | nice -n 19 goaccess --log-format=VCOMBINED -o /var/www/html/stats/report-photos.html --keep-db-files --db-path /tmp/goaccess-photos/ -
zgrep --no-filename "^photos.vanwerkhoven.org" /var/log/nginx/access.log | nice -n 19 goaccess --log-format=VCOMBINED -o /var/www/html/stats/report-photos.html --keep-db-files --load-from-disk --db-path /tmp/goaccess-photos/ -
zgrep --no-filename -v '^nextcloud.vanwerkhoven.org\|^photos.vanwerkhoven.org' /var/log/nginx/access.log* | nice -n 19 goaccess --log-format=VCOMBINED -a -o /var/www/html/stats/report-all.html --keep-db-files --db-path /tmp/goaccess-all/ -
zgrep --no-filename -v '^nextcloud.vanwerkhoven.org\|^photos.vanwerkhoven.org' /var/log/nginx/access.log | nice -n 19 goaccess --log-format=VCOMBINED -a -o /var/www/html/stats/report-all.html --keep-db-files --load-from-disk --db-path /tmp/goaccess-all/ -
# Optional in case of problems, use something like below (from: https://goaccess.io/faq#configuration)
# LC_TIME="en_US.UTF-8" bash -c 'goaccess /var/log/nginx/access.log --log-format=VCOMBINED -o report.html'
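The zgrep filters above work because the vcombined format prefixes every record with $host:. A quick sanity check of that per-vhost keying on fabricated log lines (the helper name is illustrative):

```shell
# Count requests per virtual host from vcombined-style log lines.
count_vhosts() {
  awk -F': ' '{print $1}' | sort | uniq -c | sort -rn
}
printf '%s\n' \
  'photos.vanwerkhoven.org: 203.0.113.5 - - [01/Jan/2024:00:00:00 +0000] "GET / HTTP/1.1" 200 512 "-" "curl"' \
  'nextcloud.vanwerkhoven.org: 203.0.113.6 - - [01/Jan/2024:00:00:01 +0000] "GET / HTTP/1.1" 200 512 "-" "curl"' \
  'photos.vanwerkhoven.org: 203.0.113.5 - - [01/Jan/2024:00:00:02 +0000] "GET /a HTTP/1.1" 200 128 "-" "curl"' \
  | count_vhosts
```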
# Dump & inspect existing config
nginx -T
# Migrate config
scp -r oldserver:/etc/nginx/nginx.conf newserver:/etc/nginx/nginx.conf # if you don't have tweaks here, you might want to keep the vanilla configuration in case something's improved
scp -r oldserver:/etc/nginx/conf.d/ newserver:/etc/nginx/conf.d/
scp -r oldserver:/etc/nginx/modules-available/ newserver:/etc/nginx/modules-available/
scp -r oldserver:/etc/nginx/sites-available/ newserver:/etc/nginx/sites-available/
scp -r oldserver:/etc/nginx/sites-enabled/ newserver:/etc/nginx/sites-enabled/
Migrate Certbot. The recommended package manager is snap (eff.org), which has some FOSS issues (letsencrypt.org, linuxmint.com) because parts of it are closed source. Hence we stick with apt for now, which has an older version (1.12.0) that should be fine (I was still using 0.40.0 on my old Ubuntu server).
Alternatives:
- Use snap
- Use another client (letsencrypt.org)
Install the certbot client; this installs both /etc/cron.d/certbot and a systemd timer, which can be seen running systemctl list-timers (see this explanation (letsencrypt.org)).
sudo apt install certbot python3-certbot-dns-gandi
Two options:
- Get new certificate with maybe new account (preferred)
- Migrate certificates
New certificates ¶
sudo apt install certbot python3-certbot-dns-gandi python3-certbot-nginx
sudo install -m 600 -o root -g root /dev/null /etc/letsencrypt/gandi.ini # equivalent to touch && chmod 600 && chown root:root
cat << 'EOF' | sudo tee /etc/letsencrypt/gandi.ini
# live dns v5 api key
certbot_plugin_gandi:dns_api_key=APIKEY
# optional organization id, remove it if not used
certbot_plugin_gandi:dns_sharing_id=SHARINGID
EOF
# Get certificate, use old plugin syntax because debian uses an old certbot client
sudo certbot certonly -a certbot-plugin-gandi:dns --certbot-plugin-gandi:dns-credentials /etc/letsencrypt/gandi.ini -d vanwerkhoven.org -d \*.vanwerkhoven.org --server https://acme-v02.api.letsencrypt.org/directory
# IMPORTANT NOTES:
# - Congratulations! Your certificate and chain have been saved at:
# /etc/letsencrypt/live/<domain>/fullchain.pem
# Your key file has been saved at:
# /etc/letsencrypt/live/<domain>/privkey.pem
# Your certificate will expire on 2023-05-24. To obtain a new or
# tweaked version of this certificate in the future, simply run
# certbot again. To non-interactively renew *all* of your
# certificates, run "certbot renew"
# Optional: Run nginx installer to install to servers, else install manually
sudo certbot run --nginx --certbot-plugin-gandi:dns-credentials /etc/letsencrypt/gandi.ini -d vanwerkhoven.org -d \*.vanwerkhoven.org --server https://acme-v02.api.letsencrypt.org/directory
# Optional: install automatic certificate renewal (also installed by default), either explicitly using the plugin, or implicitly via settings stored in /etc/letsencrypt/renewal/<domain>.org.conf
0 0 * * 0 certbot renew -q --authenticator dns-gandi --dns-gandi-credentials /etc/letsencrypt/gandi.ini --server https://acme-v02.api.letsencrypt.org/directory # explicitly use settings
0 0 * * 0 certbot renew -q # implicitly use settings
Migrate certificates ¶
Transfer settings/certs (serverfault.com), something like:
ssh proteus
sudo scp -r /etc/letsencrypt/* <target>
Didn’t work this out
Deploy Let’s Encrypt certificates ¶
@TODO figure out how to propagate the certificate safely and automatically across services. Push certificate to PVE (proxmox.com):
- Use SSH with unencrypted public key authentication only available to specific user
- Use shared disk mount / mount point, copy new certificates there, poll daily from receiving server
cp fullchain.pem /etc/pve/nodes/pve/pveproxy-ssl.pem
cp private-key.pem /etc/pve/nodes/pve/pveproxy-ssl.key
Jellyfin ¶
sudo apt install jellyfin
GPU acceleration in lxc guest ¶
Set up GPU acceleration (from this reddit post (reddit.com)):
On host, identify hardware device:
apt install vainfo
ls -l /dev/dri
# drwxr-xr-x 2 root root 80 Jul 21 20:08 by-path
# crw-rw---- 1 root video 226, 0 Jul 21 20:08 card0
# crw-rw---- 1 root render 226, 128 Jul 21 20:08 renderD128
Check what group IDs video and render have:
grep "video\|render" /etc/group
# video:x:44:
# render:x:103:
Prepare GID mapping to the guest: allow root to map the group IDs of video and render via /etc/subgid:
cat << 'EOF' | sudo tee -a /etc/subgid
root:44:1
root:103:1
EOF
Now pass the hardware through to the LXC guest, and map the video and render groups. Note I have to merge this with my existing group mapping of the bulkdata user/group.
cat << 'EOF' | sudo tee -a /etc/pve/lxc/201.conf
lxc.cgroup2.devices.allow: c 226:* rwm
lxc.mount.entry: /dev/dri dev/dri none bind,optional,create=dir
lxc.idmap: g 0 100000 44
lxc.idmap: g 44 44 1
lxc.idmap: g 45 100045 60
lxc.idmap: g 105 103 1
lxc.idmap: g 106 100106 904
EOF
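The lxc.idmap entries are easy to get wrong: the guest-side GID ranges must be contiguous with no gaps or overlaps, or the container will refuse to start. A small checker that does the arithmetic over the 'g' lines above (helper name is mine):

```shell
# Verify lxc.idmap group entries cover guest GIDs contiguously.
# Each line is: lxc.idmap: g <guest_start> <host_start> <count>
check_idmap() {
  local expect=0 guest host count
  while read -r _ _ guest host count; do
    [[ "$guest" -eq "$expect" ]] || { echo "gap at guest GID $expect"; return 1; }
    expect=$((guest + count))
  done
  echo "contiguous up to GID $expect"
}

# prints: contiguous up to GID 1010
check_idmap << 'EOF'
lxc.idmap: g 0 100000 44
lxc.idmap: g 44 44 1
lxc.idmap: g 45 100045 60
lxc.idmap: g 105 103 1
lxc.idmap: g 106 100106 904
EOF
```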
After this, reboot the LXC container:
# From host:
sudo pct reboot 201
# From guest:
sudo reboot
Now prepare the guest, using Debian native packages (alternatively, install from Intel apt repo (intel.com))
sudo usermod -aG render,video root
sudo usermod -aG render,video jellyfin
sudo apt install --no-install-recommends libva2 libigdgmm11 mesa-va-drivers intel-media-va-driver-non-free
sudo apt install vainfo
Now vainfo should work:
sudo vainfo
error: can't connect to X server!
libva info: VA-API version 1.10.0
libva info: Trying to open /usr/lib/x86_64-linux-gnu/dri/iHD_drv_video.so
libva info: Found init function __vaDriverInit_1_10
libva info: va_openDriver() returns 0
vainfo: VA-API version: 1.10 (libva 2.10.0)
vainfo: Driver version: Intel iHD driver for Intel(R) Gen Graphics - 21.1.1 ()
vainfo: Supported profile and entrypoints
VAProfileNone : VAEntrypointVideoProc
VAProfileNone : VAEntrypointStats
VAProfileMPEG2Simple : VAEntrypointVLD
VAProfileMPEG2Simple : VAEntrypointEncSlice
VAProfileMPEG2Main : VAEntrypointVLD
VAProfileMPEG2Main : VAEntrypointEncSlice
VAProfileH264Main : VAEntrypointVLD
VAProfileH264Main : VAEntrypointEncSlice
VAProfileH264Main : VAEntrypointFEI
VAProfileH264Main : VAEntrypointEncSliceLP
VAProfileH264High : VAEntrypointVLD
VAProfileH264High : VAEntrypointEncSlice
VAProfileH264High : VAEntrypointFEI
VAProfileH264High : VAEntrypointEncSliceLP
VAProfileVC1Simple : VAEntrypointVLD
VAProfileVC1Main : VAEntrypointVLD
VAProfileVC1Advanced : VAEntrypointVLD
VAProfileJPEGBaseline : VAEntrypointVLD
VAProfileJPEGBaseline : VAEntrypointEncPicture
VAProfileH264ConstrainedBaseline: VAEntrypointVLD
VAProfileH264ConstrainedBaseline: VAEntrypointEncSlice
VAProfileH264ConstrainedBaseline: VAEntrypointFEI
VAProfileH264ConstrainedBaseline: VAEntrypointEncSliceLP
VAProfileVP8Version0_3 : VAEntrypointVLD
VAProfileVP8Version0_3 : VAEntrypointEncSlice
VAProfileHEVCMain : VAEntrypointVLD
VAProfileHEVCMain : VAEntrypointEncSlice
VAProfileHEVCMain : VAEntrypointFEI
VAProfileHEVCMain10 : VAEntrypointVLD
VAProfileHEVCMain10 : VAEntrypointEncSlice
VAProfileVP9Profile0 : VAEntrypointVLD
VAProfileVP9Profile2 : VAEntrypointVLD
Profit!
Plex ¶
Install as Docker image or via apt source (plex.tv) (I chose the apt install because it has fewer dependencies)
echo deb https://downloads.plex.tv/repo/deb public main | sudo tee /etc/apt/sources.list.d/plexmediaserver.list
curl https://downloads.plex.tv/plex-keys/PlexSign.key | sudo apt-key add -
sudo apt install plexmediaserver
Disable local network auth (plex.tv) in advanced settings (plex.tv)
<Preferences allowedNetworks="172.17.0.0/255.255.0.0" />
For first login, log in via localhost using ssh tunnel, e.g.
ssh -L 32400:localhost:32400 proteus
open http://localhost:32400/web
Open: set up https reverse proxy or not?
Collectd ¶
On the proxmox host (to get pure CPU stats). We need to manually add the ping and snmp libs to prevent the error ERROR: dlopen("/usr/lib/collectd/ping.so") failed: liboping.so.0: cannot open shared object file
sudo apt install collectd-core liboping0 libsnmp40
Set up /etc/collectd/collectd.conf:
# Defaults for reference:
# datadir: "/var/lib/collectd/rrd/"
# libdir: "/usr/lib/collectd/"
BaseDir "/var/run/collectd"
Include "/etc/collectd/conf.d"
PIDFile "/var/run/collectd.pid"
PluginDir "/usr/lib/collectd"
TypesDB "/usr/share/collectd/types.db"
Hostname "pve"
Interval 60
LoadPlugin memory
<Plugin memory>
    ValuesPercentage false
    ValuesAbsolute true
</Plugin>

LoadPlugin cpu
<Plugin cpu>
    ValuesPercentage false
    ReportByCpu false
    ReportByState false
</Plugin>

LoadPlugin cpufreq

LoadPlugin ping
<Plugin ping>
    TTL 127
    Interval 60
    AddressFamily ipv4
    Host "dataix.ru"
    Host "linx.net"
    Host "ams-ix.net"
</Plugin>

LoadPlugin network
<Plugin network>
    Server "proteus.lan.vanwerkhoven.org" "25826"
    Forward false
</Plugin>

LoadPlugin load

LoadPlugin snmp
<Plugin snmp>
    <Data "ifmib_if_octets32_table">
        Type "if_octets"
        Table true
        TypeInstanceOID "IF-MIB::ifDescr"
        # TypeInstancePrefix "if" # Optional
        Values "IF-MIB::ifInOctets" "IF-MIB::ifOutOctets"
        InvertMatch true
        Ignore "*eth1.300*"
    </Data>
    # Numerical approach, less readable once passed on to influxdb because interface name is not used
    # <Data "ifmib_if_octets32">
    #     Type "if_octets"
    #     Table false
    #     TypeInstance "iso.3.6.1.2.1.2.2.1.2.9"
    #     Values "iso.3.6.1.2.1.2.2.1.10.9" "iso.3.6.1.2.1.2.2.1.16.9"
    # </Data>
    <Host "vyos">
        Address "172.17.10.1"
        Version 2
        Community "public"
        # Only collect data sets that are defined above; add
        # "ifmib_if_octets32" if you enable the numerical Data block
        Collect "ifmib_if_octets32_table"
        Interval 30
    </Host>
</Plugin>
Enable snmp on VyOS; the data flows from VyOS -> Proxmox -> the InfluxDB server.
set service snmp community public authorization ro
set service snmp community public network 172.17.10.0/24
set service snmp listen-address 172.17.10.1
Test this, then find what you need (I wanted traffic through my WAN interface)
See also:
- https://superuser.com/questions/1461522/what-are-the-steps-to-take-to-get-specific-information-from-snmp (superuser.com)
- https://collectd.org/documentation/manpages/collectd-snmp.5.shtml (collectd.org)
snmpwalk -v 2c -c public 172.17.10.1 | less
# Get MIB on your collectd server:
# MIB search path: /home/tim/.snmp/mibs:/usr/share/snmp/mibs:/usr/share/snmp/mibs/iana:/usr/share/snmp/mibs/ietf
mkdir -p ~/.snmp/
scp -r vyos@172.17.10.1:/usr/share/snmp/mibs ~/.snmp/
mkdir -p /tmp/migrate_mibs/
scp -r vyos@172.17.10.1:/usr/share/snmp/mibs /tmp/migrate_mibs/
sudo mv /tmp/migrate_mibs/mibs/*.txt /usr/share/snmp/mibs/
sudo chown root:root /usr/share/snmp/mibs/*.txt
# IF: iso.3.6.1.2.1.2.2.1.2.9 = STRING: "eth1.300"
# RX: iso.3.6.1.2.1.2.2.1.10.9 = Counter32: 1306354856
# TX: iso.3.6.1.2.1.2.2.1.16.9 = Counter32: 968162683
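One gotcha with these 32-bit counters on a busy WAN interface: they wrap at 2^32, so any rate derived from two samples must take the delta modulo 2^32 (collectd's counter handling does this for you; the sketch below is just the arithmetic, with the function name mine):

```shell
# Bytes/sec from two Counter32 samples taken `secs` seconds apart, wrap-safe.
rate32() {
  local prev=$1 cur=$2 secs=$3
  local delta=$(( (cur - prev + 4294967296) % 4294967296 ))
  echo $(( delta / secs ))
}
rate32 1306354856 1306414856 60   # no wrap: 60000 B over 60 s -> 1000
rate32 4294967000 704 60          # wrapped: delta is still 1000 -> 16
```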
Monitor LVM usage ¶
The LVM plugin for collectd is deprecated, and telegraf is too new for Debian. Instead we hack together an Exec plugin in shell.
Data we need:
sudo lvs -S 'lv_attr =~ ^t'
#  LV            VG  Attr       LSize   Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert
#  thinpool_data pve twi-aotz-- <4.95t              25.52  19.78
#  thinpool_vms  pve twi-aotz-- 512.00g             17.65  16.10
sudo lvs -S 'lv_attr =~ ^V'
#  LV            VG  Attr       LSize   Pool          Origin Data%  Meta%  Move Log Cpy%Sync Convert
#  lv_backup_mba pve Vwi-aotz-- 256.00g thinpool_data        2.96
#  lv_backup_mbp pve Vwi-aotz--   1.00t thinpool_data        1.86
Desired format (only LV name, Data% and Meta%):
thinpool_data 25.52 19.78
thinpool_vms 17.65 16.10
Check lvmreport(7) for how to shape the right names. Get the list of output columns: sudo lvs -O help
sudo lvs -S 'lv_attr =~ ^t|V' -o lv_name,data_percent,metadata_percent --noheading --separator ","
Allow the non-root plugin user to run this via sudoers:
sudo adduser --no-create-home --disabled-login collectd-plugin
cat << 'EOF' | sudo tee /etc/sudoers.d/collectd-plugin
%collectd-plugin ALL= NOPASSWD: /usr/sbin/lvs
EOF
Alternatively you could set capabilities on the lvs binary itself, however this allows all users to run this – see here (stackexchange.com) and here (stackoverflow.com).
Create a collectd Exec plugin (collectd.org) based on the lvs output, inspired by the df (collectd.org) type/instance use. See the plain text protocol (collectd.org) and the collectd-exec(5) man page (collectd.org) for details.
sudo install -m 755 /dev/null /usr/local/bin/collectd-lvm-usage.sh
cat << 'EOF' | sudo tee /usr/local/bin/collectd-lvm-usage.sh
#!/usr/bin/env bash
# /usr/local/bin/collectd-lvm-usage
HOSTNAME="${COLLECTD_HOSTNAME:-localhost}"
INTERVAL="${COLLECTD_INTERVAL:-60}"
while true; do
  while read -r line; do
    # Trim whitespace https://unix.stackexchange.com/questions/102008/how-do-i-trim-leading-and-trailing-whitespace-from-each-line-of-some-output
    line=$(echo "${line}" | awk '{$1=$1};1')
    # Split on commas; keep the IFS change local to the read
    IFS=',' read -r -a linearr <<< "${line}"
    volname=${linearr[0]:-undefined}
    dataused=${linearr[1]:-99}
    echo "PUTVAL \"$HOSTNAME/lvm-${volname}/percent_bytes-used_data\" interval=$INTERVAL N:${dataused}"
    metaused=${linearr[2]}
    if [[ -n ${metaused} ]]; then
      echo "PUTVAL \"$HOSTNAME/lvm-${volname}/percent_bytes-used_meta\" interval=$INTERVAL N:${metaused}"
    fi
  done <<< "$(sudo lvs -S 'lv_attr =~ ^t|V' -o lv_name,data_percent,metadata_percent --noheading --separator ",")"
  sleep "$INTERVAL"
done
EOF
Add this to your collectd config:
LoadPlugin exec
<Plugin exec>
Exec "collectd-plugin:collectd-plugin" "/usr/local/bin/collectd-lvm-usage.sh"
</Plugin>
Transmission ¶
Install transmission
sudo apt install transmission-daemon
sudo usermod -aG bulkdata debian-transmission
Update the config (/etc/transmission-daemon/settings.json on Debian); ensure the daemon is stopped first, since it overwrites the file on exit
sudo systemctl stop transmission-daemon.service
"blockl:st-url": "http://list.iblocklist.com/?list=bt_level1&fileformat=p2p&archiveformat=gz",
"download-dir": "/mnt/bulk/temp",
"incomplete-dir": "/mnt/bulk/temp",
"rpc-authentication-required": false, # optional, else could leave transmission:transmission default
"rpc-whitelist": "127.0.0.1,172.17.20.*",
Add port 9091 to VyOS firewall
set firewall name FW_TRUST2INFRA rule 210 destination port +9091
Crashes and recovery ¶
Unscheduled power off ¶
On pve:
Aug 11 13:14:37 pve systemd-modules-load[438]: Inserted module 'kvmgt'
Aug 11 13:14:37 pve systemd-modules-load[438]: Failed to find module 'exngt'
Aug 11 13:14:37 pve systemd-modules-load[438]: Failed to find module 'vfio-mdev'
[...]
Aug 11 13:14:37 pve systemd-fsck[560]: There are differences between boot sector and its backup.
Aug 11 13:14:37 pve systemd-fsck[560]: This is mostly harmless. Differences: (offset:original/backup)
Aug 11 13:14:37 pve systemd-fsck[560]: 65:01/00
Aug 11 13:14:37 pve systemd-fsck[560]: Not automatically fixing this.
Aug 11 13:14:37 pve systemd-fsck[560]: Dirty bit is set. Fs was not properly unmounted and some data may be corrupt.
Aug 11 13:14:37 pve systemd-fsck[560]: Automatically removing dirty bit.
Aug 11 13:14:37 pve systemd-fsck[560]: *** Filesystem was changed ***
Aug 11 13:14:37 pve systemd-fsck[560]: Writing changes.
Aug 11 13:14:37 pve systemd-fsck[560]: /dev/nvme0n1p2: 5 files, 84/130812 clusters
Changelog ¶
20231011: extended PVE root partition to 10GB because disk was full (8GB was too optimistic)
sudo swapoff -a
free -h
sudo vim /etc/fstab
lsblk
sudo lvreduce --size -2G /dev/mapper/pve-swap
sudo lvextend -l +100%FREE /dev/mapper/pve-root
lsblk -o +PARTTYPE
sudo resize2fs /dev/mapper/pve-root
df -h
# NB: after shrinking the swap LV, re-initialize and re-enable it:
sudo mkswap /dev/mapper/pve-swap
sudo swapon -a
20240809: extended the lv_backup_vms volume because backups were getting too big
sudo lvextend -L +0.25T /dev/mapper/pve-lv_backup_vms
sudo resize2fs /dev/mapper/pve-lv_backup_vms
#Networking #Nextcloud #Nginx #Security #Server #Smarthome #Debian #Vyos #Proxmox #Unix