Tim's blah blah blah

🎼 Hearmony: DIY Children's Music Player built on ESP8266


Inspired by Hörbert (hoerbert.com) and Toniebox (tonies.com), we decided to make our own: Hearmony. So far we have built a prototype, and we are now industrializing it.


Requirements / features

Our target audience is children aged 1-4.

  1. Music
    1. Plays 9 different songs from removable memory card
    2. Music quality slightly better than ‘button books’ (>=32 kbit/s mono MP3)
    3. Volume pre-set, configurable via software
    4. Current song stops (can be interrupted) if new song is started
    5. Stop button (can be same as off button)
  2. Battery
    1. Battery should last 1 (target: 3) months @ 30min/day = 15 (target: 45) hours.
    2. Device should power-off automatically if forgotten
    3. Batteries should be replaceable, ideally 4×AA
  3. Casing
    1. Non-toxic (finish)
    2. Sturdy, drop-safe from 50cm height
    3. Can be carried around by toddlers, weight <1kg (?)
    4. Sturdy buttons that don’t break easily
    5. Should open easily to replace batteries
    6. Have opening for micro-USB charging
    7. Speaker protected from toddler fingers

Related projects

There are numerous related projects. Some use a bare Atmel chip (and typically decode WAV/PCM files); others play MP3, usually with a DFPlayer.

Pure Arduino/Atmel:

DFplayer/mp3 based:

Future version / known bugs

  1. Use a 555 timer instead of the Arduino/Pololu combo: saves 9 EUR - 1 EUR = 8 EUR in parts (see e.g. here (wordpress.com))
  2. Reduce the 1.5 second start-up delay to <0.2s
  3. Don’t use LED pin 13 for buttons.
  4. Rotate the amplifier, find a version with 2.54mm pin pitch
  5. Check which switch to use: small 6x6mm or big 12x12mm

Bill of Materials

This is the current BoM (under development), at a total cost of 42-64€ (excluding shipping).

  1. Casing (5-20€)
    1. Cardboard children’s suitcase (hema.nl) - 5.5€
    2. Wooden suitcase ca. 18x13x5.5cm (marzkreatiek.nl) - 5€
    3. DIY luxury wooden box - 20€?
  2. Electronics (~30€)
    1. MCU/SoC: Pro Mini clone (tinytronics.nl) - 5€
    2. Music player: dfplayer clone YX5200 (tinytronics.nl) - 4€
    3. Amplifier: PAM8403 2×3W amp w/ volume knob (tinytronics.nl) - 3€
    4. Memory: SD Card (tinytronics.nl) - 5€
    5. Speaker: 75mm speaker - 8 ohm 1 Watt (opencircuit.nl) - 3€
    6. Buttons:
      1. Big: Tactile pushbutton 12mm x 12mm (tinytronics.nl) + button caps (tinytronics.nl) - 4€ per 10
      2. Small: Tactile pushbutton 6mm x 6mm (tinytronics.nl) + caps (DIY or Amazon) - 4€ per 10
    7. On/off button & circuit: pushbutton 12mm (tinytronics.nl) + Pololu latch circuit (opencircuit.nl) - 4€
    8. Battery holders: 4×AA battery holder (tinytronics.nl) - 1.5€
  3. Batteries (7-14€):
    1. 4×AA Energizer Recharge Extreme - 2300mAh (nkon.nl)
    2. 4×AA Eneloop Pro 2500mAh (nkon.nl) - 14€
  4. Software (0€): Native Arduino IDE

Bill of Process

  1. Connect hardware
  2. Upload software
  3. Assemble & use

Electronics design

I decided to use a 3.3V Arduino Pro Mini + DFPlayer mini + PAM8403 amp (with volume knob) + Pololu mini push button switch (option 1.1 below). This gives me ~80mA power consumption, good and adjustable volume, and automatic power-off at a reasonable price. A cheaper but more limited option would be to leave out the Arduino (option 1.2), but that was a bit too bare-bones for me. Below I detail how I arrived at this design choice.

| design | format | price | power use | power-off† | memory | control†† |
|---|---|---|---|---|---|---|
| 1.1 DFplayer/Arduino/PAM8302 | mp3 | 17€ | 80 mA | auto | SD | full |
| 1.2 DFplayer/PAM8302 | mp3 | 8€ | 70 mA | manual | SD | simple |
| 2.1 Arduino/PAM8302 | wav* | 15€ | 200 mA?** | auto | SD | full |
| 2.2 Arduino | wav* | 12€ | 50 mA | auto | SD | full |
| 3. DFplayer/ESP8266/PAM8302 | mp3 | 17€ | higher than 1.1 | auto | SD | full |
| 4. Adafruit Audio FX/VLSI | mp3 | 30€ | n/a | auto | onboard | ? |
| 5. Teensy3.2/Cortex M4 | mp3* | 30€ | higher than 2.1 | auto | SD | full |

† Auto uses a Pololu latch circuit, manual requires user interaction
†† simple only plays songs (no start/stop etc.), full can do anything an Arduino can do
* decodes music in software
** could be improved with reduced volume out of PAM8302


Battery life puts the most constraints on the electronics design, so we start with that.

To get 15 hours play time with 4 AA @ 2500 mAh, I can use 166 mA @ 5V maximum. My power budget then looks as follows:
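As a quick check on that ceiling, the maximum average current is simply the cell capacity divided by the desired play time (cells in series keep the same mAh rating). A minimal sketch, assuming the full 2500 mAh is usable and ignoring regulator losses and self-discharge; `max_ma` is a helper name made up for illustration:

```shell
# Capacity per cell in mAh, taken from the text above (assumed fully usable).
capacity_mah=2500

# max_ma HOURS -> largest average draw (whole mA) that still lasts HOURS
max_ma() {
	echo $(( capacity_mah / $1 ))
}

# Minimum requirement (15 h) and target (45 h)
for hours in 15 45; do
	echo "${hours} h -> at most $(max_ma "$hours") mA on average"
done
```

With these assumptions the 15 hour requirement allows about 166 mA, while the 45 hour target would need an average draw of roughly 55 mA.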

Additionally, I want an automatic power-off circuit to minimize idle power consumption to near-zero, and to prevent forgetting to turn off the player.

Furthermore I look at the required logic/user interfaces, which needs to:

Given these requirements, I investigated what options exist.

Design choices

There are a range of chips on the market to play music etc. (see below), with main choices along these axes:

Design alternatives

Given requirements and hardware available, I came up with the following 2 major design alternatives, each with two different flavors.

  1. MP3 DFplayer based (dfrobot.com): Pro: simple, integrated SD module, plays MP3, widely used by others.
    1. Complete setup with Arduino & latch-circuit for power control: Arduino Pro Mini 3.3V/ATmega328P (5€) + YX5200 (5€) + PAM8302 (3€) + latch circuit (4€) = 17€
    2. Simpler flavor with only DFplayer (using AD1/AD2 ports), but has no auto power-off: YX5200 (5€) + PAM8302 (3€) = 8€
  2. WAV TMRpcm-based (github.com). Pro: works on an Arduino only! simple and versatile design
    1. Low-power (3.3V) Arduino with external class-D amplifier: Arduino Pro Mini 3.3V/ATmega328P (5€) + PAM8302 (3€) + SD card module (3€) + latch circuit (4€) = 15€
    2. Normal (5V) Arduino directly driving the speaker from GPIO pin (can be done (github.com)): Arduino Pro Mini 5V/ATmega328P (5€) + SD card module (3€) + latch circuit (4€) = 12€

There are other options that I did not investigate further:

  1. (mp3, easy): WeMos D1 mini/ESP8266 (5€) + YX5200 (5€) + PAM8302 (3€) + latch circuit (4€) = 17€ ==> NOK due to the high power use of the ESP8266 chip: even in modem sleep it draws quite a lot, and we don’t need the processing power anyway.
  2. (mp3, super easy): Adafruit Audio FX/VLSI (30€) = 30€ ==> NOK due to lack of auto power-off and price; the limited memory could also pose a problem (although solvable)
  3. (mp3, medium): Teensy3.2/Cortex M4 (20$) + PAM8302 (3€) + SD card module (3€) + latch circuit (4€) = 30€ ==> NOK due to high price and likely high power use. Would have used software mp3 decoding (adafruit.com).

Power consumption

I tested the power consumption of several designs until I found the right fit. Power consumption was measured on a 5V Nokia phone charger with a UNI-T UT133A (uni-trend.com). I used a Sparkfun 1W/8Ω speaker.

  1. DFplayer
    1. 3.3V Arduino + DFPlayer mini w/ PAM8403 amp (with volume knob): 80mA, good & adjustable volume
    2. DFPlayer mini w/ PAM8403 amp (no volume knob): 80mA-90mA (higher due to different amplifier settings)
    3. 3.3V Arduino + DFPlayer mini w/ PAM8403 amp (no volume knob): 100-200mA, volume very loud
    4. 3.3V Arduino + DFPlayer mini w/ int. 8002A amp: 350mA (no sound), 400mA (playing) volume very loud
    5. 3.3V Arduino + DFPlayer mini w/o amp (using DAC out directly to speaker): 80mA, not audible
  2. TMRpcm
    1. 3V Arduino + TMRpcm + SD w/ PAM8403 amp (no volume knob) @ 16kHz: 200mA, volume quite loud
    2. 5V Arduino + TMRpcm + SD w/o amp @ 16kHz/32kHz WAV: 60mA/80mA, volume a bit soft (no noticeable difference with 3.3V Arduino)
    3. 3.3V Arduino + TMRpcm + SD w/o amp @ 16kHz WAV: 50mA, volume a bit soft, even after increasing the volume in software using ffmpeg’s loudnorm @ I=-7.

I also tested a 3W/4Ω speaker on the DFPlayer mini w/ int. amp which uses 500mA without playing music and gives a ticking noise / reboot loop (likely some power supply/voltage regulator overloaded).
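To put the measured currents into perspective, they can be translated into play time and into days of use at 30 minutes per day. This is a rough sketch only: it assumes the full 2500 mAh is usable, that the draw measured on the 5V charger also holds on batteries, and it ignores regulator losses. `runtime` is a helper name made up for illustration.

```shell
capacity_mah=2500     # per-cell capacity, assumed fully usable
minutes_per_day=30    # daily use from the requirements

# runtime MA -> prints whole hours of play time and days of use for a draw of MA mA
runtime() {
	hours=$(( capacity_mah / $1 ))
	days=$(( hours * 60 / minutes_per_day ))
	echo "$1 mA: ${hours} h, ${days} days"
}

# A few of the measured values from the list above
for ma in 50 80 200 400; do
	runtime "$ma"
done
```

Under these assumptions the 80mA design gives about 31 hours (roughly two months of daily use), whereas the 400mA internal-amplifier variant would last only about 6 hours.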

Battery options

We use AA NiMH cells because good-quality ones are easy to find and swapping them means no downtime while recharging, in spite of what DIYI0T says (diyi0t.com). Also, the discharge curve of NiMH batteries (lygte-info.dk) is quite favorable, as the voltage stays within the input range for a long time.


TMRpcm: https://wokwi.com/arduino/projects/316627468467307072 (wokwi.com)

dfplayer: https://wokwi.com/arduino/projects/318250851963503170 (wokwi.com)

Change volume of MP3 using ffmpeg (superuser.com)

Audio file pre-processing for DFplayer (normalized, 44.1kHz, 128k CBR mp3s)

	# Normalize loudness, drop video, resample to 44.1kHz mono and encode as 128k mp3
	find . -type f -iname "*m4a" | while IFS= read -r file; do echo "$file"; ffmpeg -nostdin -i "${file}" -y -af loudnorm -vn -ar 44100 -ac 1 -b:a 128k "${file}.mp3"; done

	# Same, but also play tracks 0098 and 0099 at 1.5x speed
	ls 00{98,99}*m4a | while IFS= read -r file; do echo "$file"; ffmpeg -nostdin -i "${file}" -y -af loudnorm,atempo=1.5 -vn -ar 44100 -ac 1 -b:a 128k "${file}.mp3"; done

Optiboot bootloader

The default Arduino bootloader delays the boot significantly (~2-3 seconds), which can be improved either by removing the bootloader or by customizing it. Since having a bootloader has some advantages, I chose to install Optiboot (github.com). I’m using the Arduino IDE, from which you can burn Optiboot as follows:

  1. Add Optiboot as board manager (github.com):
    1. Find the desired Optiboot release on the Optiboot Release page.
    2. Use the “Copy link address” feature of your browser to copy the URL of the associated .json file.
    3. Paste this URL into the “Additional Boards Manager URLs” field in the Arduino IDE “Preferences” pane. (Separate it from other URLs that might be present with a comma or click the icon to the right of the field to insert it on a new line.)
    4. After closing the Preferences window, the Tools/Boards/Boards Manager menu should include an entry for that version of Optiboot. Select that entry and click the Install button.
  2. Burn bootloader to board (github.com)
    1. Use AVR programmer to burn, e.g. using USBasp (arduinodev.com), using Arduino IDE
      1. Connect the programmer to GND/RESET/VCC and SCK/MISO/MOSI pins (see here (arduinodev.com), N.B. pin 8 is sometimes NC instead of GND)
      2. Open Arduino IDE
      3. Configure: choose ‘Board -> Optiboot on 28-pins cpus’, ‘Processor -> ATmega328p’, ‘CPU speed -> 8MHz (int)’, ‘Programmer -> USBasp’. (N.B. In Arduino IDE, ‘Port’ will remain empty as USBasp does not present itself as USB device)
      4. Click ‘Burn Bootloader’
      5. After successfully burning the bootloader, the Arduino will keep blinking its LED 3 times. This is Optiboot signalling that it is alive and has no sketch to run.
    2. Cheap & slow (like really slow): Use FTDI TTL chip to burn (see here (github.io) and here (adafruit.com) and here (arduino.cc))
  3. Upload sketches as usual
    1. Use Sketch -> Upload Using Programmer in Arduino IDE to upload using USBasp

It turns out the DFPlayer mini also has a ~1.5 second startup time, so we don’t gain all that much. On top of that, we need a wait loop that checks whether the DFPlayer has started up.

Buttons - native/Keypad.h

Using Chris--A’s Keypad (github.com) library we can make a matrix keypad, which uses fewer GPIOs than one pin per button. It’s very simple (github.com) to use:

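#include <Keypad.h>

const byte ROWS = 4; //four rows
const byte COLS = 3; //three columns
// Key labels as in the library's HelloKeypad example; adapt to your own layout
char keys[ROWS][COLS] = {
	{'1','2','3'},
	{'4','5','6'},
	{'7','8','9'},
	{'*','0','#'}
};
byte rowPins[ROWS] = {5, 4, 3, 2}; //connect to the row pinouts of the keypad
byte colPins[COLS] = {8, 7, 6}; //connect to the column pinouts of the keypad

Keypad keypad = Keypad( makeKeymap(keys), rowPins, colPins, ROWS, COLS );

void setup(){
	Serial.begin(9600);
}

void loop(){
	char key = keypad.getKey(); // returns 0 (NO_KEY) when nothing is pressed
	if (key){
		Serial.println(key);
	}
}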

Alternative - WAV playing w/ TMRpcm

TMRh20’s TMRpcm (github.com) library plays 8bit 30kHz WAV files natively on an Arduino.
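For this route the audio first has to be converted to 8-bit mono PCM WAV. A minimal sketch of such a conversion with ffmpeg, assuming the 16kHz sample rate used in the power tests above and ffmpeg's `pcm_u8` encoder for 8-bit unsigned samples. The `to_wav_cmd` helper is a made-up name, and it only prints the command so it can be inspected before running.

```shell
# Sample rate in Hz; 16000 matches the 16kHz test files (an assumption, adjust as needed).
RATE=16000

# to_wav_cmd FILE -> prints an ffmpeg command that converts FILE to 8-bit mono WAV
to_wav_cmd() {
	in="$1"
	out="${in%.*}.wav"   # same name, .wav extension
	printf 'ffmpeg -nostdin -y -i "%s" -vn -ac 1 -ar %s -c:a pcm_u8 "%s"\n' "$in" "$RATE" "$out"
}

to_wav_cmd "0001.mp3"
```

Piping the output to `sh` would run the conversion. Loudness normalization (`-af loudnorm`) could be added in the same way as in the mp3 commands.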


There are a few options for the casing:

  1. Make your own 😎 (3D printed/laser cut/CNC’ed/whatnot)
  2. Get children’s suitcase, e.g. in NL:
    1. Houten Koffertje ca.18x13x5,5cm (marzkreatiek.nl) at Marzkreatiek
    2. Koffertje 14x20.5x8 cm (hema.nl) at Hema
    3. Koffertje 18x23.5x10 cm (hema.nl) at Hema

Appendix 1: Available COTS hardware

A range of COTS products is available; a brief summary is listed here:

part consumption (mA) charging/power logic decoding amplification
Adafruit Audio FX Sound Board 16MB (adafruit.com) 30mA X X X X
Wemos D1 mini (diyi0t.com) 27mA** X X
ESP8266EX (espressif.com) 15mA** X
ATMega32u4 (amperka.ru) <5mA* X
Arduino Nano (arduino.cc) 19mA X X
Teensy 3.2 (adafruit.com) ?mA X X
Sparkfun Pro Mini 3.3V (sparkfun.com) ~10mA X
ATmega328P (arduino.cc) 1.5mA X
VS1053b (sparkfun.com) 10-37mA? X
YX5200 (hackaday.io) 15mA X
KT403A (elecrow.com) 15mA X
BY8301 (ba3ar.kz) 16mA? X X X
8002A (radioremont.com) 100mA X
PAM8403 (tinytronics.nl) 15mA X
PAM8302 (opencircuit.nl) <15mA? X
Pushbutton Power Switch (pololu.com) <1µA (off) X
SD Card (stackexchange.com)2 (goughlui.com) ~20mA

* at 4MHz
** at modem sleep (i.e. no wifi)

Appendix 2: Music

Below I document how I got music from YouTube using youtube-dl (github.com), with the help of this ostechnix.com (ostechnix.com) guide; see also https://unix.stackexchange.com/questions/230481/how-to-download-portion-of-video-with-youtube-dl-command#282413 (stackexchange.com)


Get formats:

youtube-dl --list-formats cU-_st6Ksac

Get URL and/or title:

youtube-dl --get-url cU-_st6Ksac
youtube-dl -f bestaudio/best --get-url cU-_st6Ksac
youtube-dl -f bestaudio/best --get-url --get-filename -o '%(url)s;%(title)s' cU-_st6Ksac

Get audio only, convert in youtube-dl immediately:

youtube-dl --extract-audio --audio-format mp3 cU-_st6Ksac
youtube-dl --extract-audio --audio-format mp3 --audio-quality 128k cU-_st6Ksac
youtube-dl --extract-audio --audio-format mp3 --audio-quality 128k --postprocessor-args "-af loudnorm -vn -ar 44100 -ac 1" cU-_st6Ksac

Convert to normalized, video-less, 44.1kHz, mono, 128k CBR mp3 (without/with trimming):

ffmpeg -nostdin -i "${file}" -y -af loudnorm -vn -ar 44100 -ac 1 -b:a 128k "${file}.mp3";
ffmpeg -ss 00:01:00 -to 00:02:00 -nostdin -i "${file}" -y -af loudnorm -vn -ar 44100 -ac 1 -b:a 128k "${file}.mp3";

Get volume histogram, silence occurrences, and finally trim silence from start of audio track.

ffmpeg -vn -i "${file}" -filter:a volumedetect -f null -
ffmpeg -vn -i "${file}" -af "silencedetect=n=-40dB:d=1" -f null - 
ffmpeg -vn -i "${file}" -af "silenceremove=start_periods=1:start_duration=0:start_threshold=-40dB" cU-_st6Ksac-trimmed.mp3

Combine everything in one pass (without/with trimming):

ffmpeg -nostdin -i "$(youtube-dl -f bestaudio/best --get-url cU-_st6Ksac)" -af loudnorm -vn -ar 44100 -ac 1 -b:a 128k cU-_st6Ksac.mp3
ffmpeg -nostdin -ss 00:00:00 -to 03:02:00 -i "$(youtube-dl -f bestaudio/best --get-url cU-_st6Ksac)" -af "loudnorm,silenceremove=start_periods=1:start_duration=0:start_threshold=-40dB" -vn -ar 44100 -ac 1 -b:a 128k cU-_st6Ksac.mp3

Batch processing

# 80s/90s children's theme playlist
ARRYT=( "0001;cU-_st6Ksac;00:00:00;00:05:00" "0002;3pZHgZfNhc0;00:00:00;00:10:00" "0003;mhEvV9Q7zfE;00:00:07;00:00:57" )

# Internet meme playlist
ARRYT=( "0001;jQE66WA2s-A;00:00:35;00:01:45" "0002;6WpMlwVwydo;00:00:00;00:10:00" "0003;lVCHi6GJLOE;00:00:00;00:10:00" "0004;QH2-TGUlwu4;00:00:00;00:00:57.5" "0005;dQw4w9WgXcQ;00:00:00;00:01:00" "0006;KmtzQCSh6xk;00:00:00;00:10:00" "0007;EIyixC9NsLI;00:00:00;00:10:00" "0008;dfTPlsIq7d0;00:00:00;00:10:00" )

# Saint Saens: Carnival of the Animals
ARRYT=( "0001;8gjNhJ7l7Mk;00:00:00;00:10:00" "0002;lEd7Ovt4cWE;00:00:00;00:10:00" "0003;RoFY7-2f_lM;00:00:00;00:10:00" "0004;wPHqJTpgo-U;00:00:00;00:10:00" "0005;f1nVDoCnsNk;00:00:00;00:10:00" "0006;8gjNhJ7l7Mk;00:00:00;00:10:00" "0007;-OAQ6rAs9DA;00:00:00;00:10:00" "0008;pyaBeSgyFoY;00:00:00;00:10:00" "0009;NJpqN2oTgR8;00:00:00;00:10:00" "0010;ZFJf3rHd69c;00:00:00;00:10:00" "0011;0y1ntDP07rM;00:00:00;00:10:00" "0012;0TSkIG9lFvY;00:00:00;00:10:00" "0013;cXEy_UfSgCU;00:00:00;00:10:00" "0014;dNbyZFHeuFA;00:00:00;00:10:00" )

# Mix list
ARRYT=( "0001;cU-_st6Ksac;00:00:00;00:05:00" "0002;gAjR4_CbPpQ;00:01:35;00:02:41" "0003;goeOUTRy2es;00:05:50;00:07:13" "0004;3pZHgZfNhc0;00:00:00;00:10:00" "0005;mhEvV9Q7zfE;00:00:07;00:00:57" "0006;QH2-TGUlwu4;00:00:00;00:00:57.5" "0007;lVCHi6GJLOE;00:00:00;00:10:00" "0008;8gjNhJ7l7Mk;00:00:00;00:10:00" "0009;RoFY7-2f_lM;00:00:00;00:10:00")

ARRYT=( "0001;cU-_st6Ksac;00:00:00;00:05:00" )

for thisyt in "${ARRYT[@]}"; do
	# Split this video's request settings. Each item should be formatted as <PREFIX>;<YOUTUBE ID>;<start in HH:MM:SS>;<end in HH:MM:SS>
	IFS=';' read -r songprefix ytid songss songto <<< "${thisyt}"
	# Get the video title using youtube-dl. Use double dashes (--) to support ID strings starting with a dash.
	songfile="$(youtube-dl --get-filename -o '%(title)s' -- "${ytid}")"
	echo ">> Trim ${ytid} (${songfile}) from ${songss} to ${songto}"
	# Need to call youtube-dl again here because otherwise the URL might already have expired, giving a 403 access denied return code
	ffmpeg -nostdin -y -ss "${songss}" -to "${songto}" -i "$(youtube-dl -f bestaudio/best --get-url -- "${ytid}")" -af "loudnorm,silenceremove=start_periods=1:start_duration=0:start_threshold=-40dB" -vn -ar 44100 -ac 1 -b:a 128k "${songprefix}_${songfile}_${ytid}.mp3"
	# Alternative without silence trimming:
	# ffmpeg -nostdin -y -ss "${songss}" -to "${songto}" -i "$(youtube-dl -f bestaudio/best --get-url -- "${ytid}")" -af loudnorm -vn -ar 44100 -ac 1 -b:a 128k "${songprefix}_${songfile}.mp3"
	# Alternative without any trimming:
	# ffmpeg -nostdin -y -i "$(youtube-dl -f bestaudio/best --get-url -- "${ytid}")" -af loudnorm -vn -ar 44100 -ac 1 -b:a 128k "${songprefix}_${songfile}.mp3"
done
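Each playlist entry packs four fields into one string. The following minimal sketch shows how one entry splits apart and how long the resulting clip is, using an entry from the mix list above. `to_seconds` is a helper name made up here, and awk is assumed to be available.

```shell
# One entry in the <PREFIX>;<YOUTUBE ID>;<start>;<end> format
entry="0002;gAjR4_CbPpQ;00:01:35;00:02:41"

# Split on ';' into the four fields (IFS is only changed for this one read)
IFS=';' read -r songprefix ytid songss songto <<EOF
$entry
EOF

# to_seconds HH:MM:SS[.fraction] -> number of seconds
to_seconds() {
	echo "$1" | awk -F: '{ print $1 * 3600 + $2 * 60 + $3 }'
}

# Clip length = end - start
clip=$(awk -v a="$(to_seconds "$songss")" -v b="$(to_seconds "$songto")" 'BEGIN { print b - a }')
echo "${songprefix} ${ytid}: ${clip}s"
```

This can be handy for checking a playlist before starting a long batch download, for example to spot an entry whose end time lies before its start time (the clip length would come out negative).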

#Arduino #Diy