DECtalk ESPress Firmware — Build Process & Architecture
This document describes the build system, source layout, and internal architecture of the DECtalk ESPress firmware.
The firmware build supports both ESP32-S3 and ESP32-C6 targets. The separate HARDWARE.md guide documents the photographed physical perfboard build, which is specifically based on an ESP32-C6 board.
For the DECtalk component build process (dapi source compilation, dictionary cross-compilation, porting notes), see the component BUILD.md.
Table of Contents
- Prerequisites
- Directory Layout
- How the Build Works
- Firmware Architecture
- Build Commands Reference
- Porting Notes
Prerequisites
| Requirement | Version | Notes |
|---|---|---|
| ESP-IDF | v6.0+ (tested with v6.0) | The sdkconfig.defaults header references ESP-IDF 6.0 |
| Python | 3.8+ | Required by ESP-IDF tools |
| Host C compiler | cc or gcc | Used at build time to compile the dictionary compiler that runs on the host |
| CMake | 3.22+ | Bundled with ESP-IDF; the top-level CMakeLists.txt requires 3.22 |
| Ninja | any | Bundled with ESP-IDF |
Installing ESP-IDF
Linux / macOS
# Install system dependencies (Ubuntu/Debian)
sudo apt-get install git wget flex bison gperf python3 python3-pip \
python3-venv cmake ninja-build ccache libffi-dev libssl-dev \
dfu-util libusb-1.0-0
# Clone ESP-IDF
mkdir -p ~/esp && cd ~/esp
git clone --recursive https://github.com/espressif/esp-idf.git
cd esp-idf
git checkout v6.0 # or latest stable release
# Install toolchains (ESP32-S3 + ESP32-C6 targets)
./install.sh esp32s3 esp32c6
# Activate the environment (run in every new shell, or add to .bashrc)
. ~/esp/esp-idf/export.sh
Windows
Use the ESP-IDF Windows Installer, which bundles Git, Python, CMake, Ninja, and the Xtensa/RISC-V toolchains.
Cloning the Repository
The upstream DECtalk source tree is included as a Git submodule at components/dectalk/dectalk. You must initialise it when you clone:
git clone --recursive https://github.com/lllucius/DECtalk_ESPress.git
cd DECtalk_ESPress
If you already cloned without --recursive, pull the submodule manually:
git submodule update --init --recursive
Directory Layout
DECtalk_ESPress/
├── CMakeLists.txt  # Top-level ESP-IDF project file
├── sdkconfig.defaults  # Default Kconfig values (target, flash, PSRAM, TinyUSB…)
├── sdkconfig.defaults.esp32c6  # Target-specific overrides for ESP32-C6
├── sdkconfig.devel  # Optional development overrides (diagnostics, PSRAM, debugging)
├── partitions.csv  # Custom partition table
├── BUILD.md  # ← this file (firmware build & architecture)
├── README.md  # Firmware overview and quick-start
│
├── components/
│   └── dectalk/  # ESP-IDF component wrapping the upstream dapi library
│       ├── CMakeLists.txt  # Compiles all dapi sources + local stubs; builds dictionary
│       ├── Kconfig.projbuild  # menuconfig: language, dict storage, source path (DECtalk menu)
│       ├── README.md  # Component overview, Kconfig settings, dictionary modes
│       ├── BUILD.md  # Component build process, dapi compilation, porting notes
│       ├── project_include.cmake  # Registers custom partition subtypes; manages partition CSV
│       ├── include/
│       │   ├── config.h  # Maps Kconfig DECTALK_DICT_ROOT → DECTALK_INSTALL_PREFIX
│       │   └── sys/
│       │       ├── ipc.h  # Minimal IPC_CREAT / IPC_RMID stubs
│       │       ├── mman.h  # mmap/munmap prototypes + MAP_FAILED constant
│       │       └── shm.h  # shmget/shmat/shmdt/shmctl prototypes
│       └── src/
│           ├── libc_stubs.c  # shmget/shmat/shmdt/shmctl, nanosleep, readlink, dirname
│           └── loaddict_wrappers.c  # __wrap_load_dictionary / __wrap_unload_dictionary
│
├── main/  # Main application component
│   ├── CMakeLists.txt  # Registers main sources; depends on dectalk, driver, pthread…
│   ├── Kconfig.projbuild  # menuconfig: audio, tuning, CDC/JTAG transport, fw commands, diagnostics
│   ├── idf_component.yml  # IDF component manager dep: espressif/esp_tinyusb ≥ 2.0.0
│   ├── dtesp.c  # Entry point (app_main), I2S init, threads, ESPress protocol
│   ├── dtesp.h  # Protocol constants, DLE encode/decode, public API
│   ├── dtesp_audio.c  # I2S output initialisation + TLV320DAC3100 codec init
│   ├── dtesp_audio.h  # Audio subsystem public API
│   ├── dtesp_transport.h  # Transport vtable (dtesp_transport_t) shared by all transports
│   ├── dtesp_jobs.h  # Job queue types (SPEAK_TEXT, ACTION, FLUSH)
│   ├── dtesp_job_pool.c  # Pre-allocated job pool allocator
│   ├── dtesp_job_pool.h  # Job pool public API
│   ├── custom_commands.c  # [:fw …] tokeniser and job-list builder
│   ├── custom_commands.h  # Tokeniser public API and job-list types
│   ├── custom_actions.c  # [:fw …] sub-command handlers (gpio, voice, rate, tone, …)
│   ├── custom_actions.h  # Action handler public API
│   ├── fw_settings.c  # NVS-backed codec settings (volume, profile, autoswitch)
│   ├── fw_settings.h  # Firmware settings public API
│   ├── tlv320.c  # TI TLV320DAC3100 codec driver (Adafruit breakout)
│   ├── tlv320.h  # Codec driver public API
│   ├── volume_knob.c  # Optional ADC-backed volume potentiometer support
│   ├── volume_knob.h  # Volume knob public API / no-op wrappers
│   ├── usb_cdc_transport.c  # ESP32-S3 USB CDC-ACM transport layer (TinyUSB wrapper)
│   ├── usb_cdc_transport.h  # ESP32-S3 transport API
│   ├── jtag_serial_transport.c  # ESP32-C6 USB Serial/JTAG transport layer
│   ├── jtag_serial_transport.h  # ESP32-C6 transport API
│   ├── diag_mem.c  # Optional heap/stack diagnostics task
│   └── diag_mem.h  # Diagnostics API
│
├── host/  # Python host-side tools
│   ├── README.md  # Host tools documentation
│   ├── dtesp_serial.py  # DECtalkESPressSerial class (serial protocol API)
│   └── dtesp_gui_qt.py  # Qt (PySide6/PyQt6) GUI for voice control, status, pause/resume
│
└── tests/  # Host-native unit tests (no ESP-IDF required)
    ├── Makefile  # Build and run: `make -C tests test`
    └── test_custom_commands.c  # Tests for the [:fw …] custom command parser
How the Build Works
1. Project Bootstrapping
CMakeLists.txt is a minimal ESP-IDF project file:
cmake_minimum_required(VERSION 3.22)
include($ENV{IDF_PATH}/tools/cmake/project.cmake)
project(dtesp)
ESP-IDF discovers the components/dectalk/ and main/ components automatically.
2. Component: dectalk (the TTS library)
The DECtalk component handles source resolution, language selection, dictionary cross-compilation, and dapi library compilation. For full details see the component BUILD.md.
3. Component: main (Firmware Application)
The main/ component contains the application logic:
| File | Role |
|---|---|
| dtesp.c | Entry point (app_main), I2S initialisation, thread creation, ESPress protocol loop, speech task, TTS callback |
| dtesp_audio.c | I2S output initialisation and, when selected, TLV320DAC3100 codec configuration |
| dtesp_job_pool.c | Pre-allocated pool allocator for job objects, with heap fallback |
| custom_commands.c | Tokenises incoming text for [:fw …] tokens and builds ordered job lists |
| custom_actions.c | Sub-command handlers for [:fw gpio], [:fw voice], [:fw rate], [:fw tone], codec controls, and TLV320 DSP commands such as bass, treble, eq, drc, spkgain, and mute |
| fw_settings.c | NVS-backed mirror of codec settings (volume, profile, autoswitch); loaded at startup, persisted by [:fw save] |
| tlv320.c | Driver for the TI TLV320DAC3100 stereo DAC / headphone amplifier (Adafruit breakout); compiled only when DTESP_DAC_TLV320 is selected |
| volume_knob.c | Optional ADC-sampled analog volume knob with smoothing, hysteresis, and soft-takeover against firmware volume changes |
| usb_cdc_transport.c | ESP32-S3 TinyUSB CDC-ACM driver: RX stream buffer, DTR-based connection tracking, reconnection detection |
| jtag_serial_transport.c | ESP32-C6 USB Serial/JTAG driver: buffered RX/TX, reconnect detection, RTS-reset suppression |
| diag_mem.c | Optional diagnostic task enabled from idf.py menuconfig that logs stack HWM and heap stats every 10 s |
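The dtesp_job_pool.c row above mentions a pre-allocated pool with heap fallback. The following is a minimal host-side sketch of that general technique, not the firmware's actual code: job_t, JOB_POOL_SIZE, job_alloc(), job_free(), and job_is_pooled() are hypothetical names, and the sketch is not thread-safe.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define JOB_POOL_SIZE 4              /* hypothetical pool depth */

typedef struct {
    int  kind;                       /* hypothetical payload field */
    bool in_use;                     /* occupancy flag, meaningful for pool slots only */
} job_t;

static job_t job_pool[JOB_POOL_SIZE];

/* True when the pointer lies inside the static pool array. */
bool job_is_pooled(const job_t *job)
{
    uintptr_t p     = (uintptr_t)job;
    uintptr_t first = (uintptr_t)&job_pool[0];
    uintptr_t end   = (uintptr_t)&job_pool[JOB_POOL_SIZE];
    return p >= first && p < end;
}

/* Take the first free pool slot; fall back to the heap when the pool is full. */
job_t *job_alloc(void)
{
    for (size_t i = 0; i < JOB_POOL_SIZE; i++) {
        if (!job_pool[i].in_use) {
            job_pool[i].in_use = true;
            return &job_pool[i];
        }
    }
    return calloc(1, sizeof(job_t)); /* may return NULL on out-of-memory */
}

/* Return a job to wherever it came from. */
void job_free(job_t *job)
{
    if (job == NULL) {
        return;
    }
    if (job_is_pooled(job)) {
        job->in_use = false;         /* slot becomes reusable */
    } else {
        free(job);                   /* heap-fallback allocation */
    }
}
```

The address-range check lets job_free() tell pool slots from heap allocations without storing an extra origin tag in every job.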
Dependencies declared in CMakeLists.txt:
- dectalk: the TTS library component
- driver: ESP-IDF I2S driver
- pthread: POSIX threading
- esp_driver_gpio: GPIO driver (used for RGB LED disable and optional GPIO action)
- esp_driver_i2c: I2C driver (used by TLV320DAC3100 codec)
- esp_driver_i2s: I2S driver (audio output)
- esp_timer: High-resolution timer
- esp_driver_ledc: LEDC PWM driver (compiled in only when DTESP_FW_CMD_TONE_ENABLE is set)
- nvs_flash: Non-volatile storage (compiled in only when DTESP_DAC_TLV320 is selected)
External dependency via idf_component.yml:
- espressif/esp_tinyusb ≥ 2.0.0: TinyUSB CDC-ACM for native USB, on ESP32-S3 only
4. Partition Table
partitions.csv defines a custom layout:
| Name | Type | SubType | Size | Purpose |
|---|---|---|---|---|
| nvs | data | nvs | 24 KB | Non-volatile storage |
| phy_init | data | phy | 4 KB | PHY calibration data |
| factory | app | factory | 2 MB | Application firmware |
Additional partitions can be added for dictionary storage:
- A udict data partition (subtype 0x40) when using partition-based dictionary storage; this can be created automatically via CONFIG_DECTALK_AUTOCREATE_PARTITIONS.
- A storage SPIFFS partition when using file-system-based dictionary loading (commented out by default).
The project_include.cmake file registers the custom udict subtype (0x40) with ESP-IDF and manages dynamic partition table extension.
5. sdkconfig.defaults
These are the minimal settings required for the project. ESP-IDF applies them automatically whenever the sdkconfig file is created or recreated (e.g. after idf.py fullclean or idf.py set-target):
| Setting | Value | Rationale |
|---|---|---|
| CONFIG_ESP_DEFAULT_CPU_FREQ_MHZ_240 | y | Maximum CPU clock for synthesis performance |
| CONFIG_ESP_TASK_WDT_EN | n | Task watchdog disabled (speech synthesis is CPU-intensive) |
| CONFIG_IDF_TARGET | esp32s3 | Default target SoC |
| CONFIG_PARTITION_TABLE_CUSTOM | y | Use the project's partitions.csv |
| CONFIG_PTHREAD_TASK_STACK_SIZE_DEFAULT | 8192 | Default pthread stack (8 KB) |
| CONFIG_TINYUSB_CDC_ENABLED | y | Enable TinyUSB CDC-ACM for the ESP32-S3 host protocol |
When building for ESP32-C6, ESP-IDF also loads sdkconfig.defaults.esp32c6 after idf.py set-target esp32c6. That file currently overrides:
| Setting | Value | Rationale |
|---|---|---|
| CONFIG_ESP_DEFAULT_CPU_FREQ_MHZ_160 | y | ESP32-C6 default maximum CPU clock |
| CONFIG_TINYUSB_CDC_ENABLED | n | ESP32-C6 host communications use USB Serial/JTAG instead of TinyUSB CDC |
6. sdkconfig.devel (Development Overrides)
sdkconfig.devel contains additional settings useful during development and debugging. These are not applied automatically — you must explicitly combine them with sdkconfig.defaults (see Combining sdkconfig Files below).
| Setting | Value | Rationale |
|---|---|---|
| CONFIG_COMPILER_STACK_CHECK_MODE_STRONG | y | Strong stack-smashing detection |
| CONFIG_DTESP_ENABLE_DIAG_MEM | y | Enable heap/stack diagnostics task |
| CONFIG_DTESP_LOG_LEVEL_VERBOSE | y | Verbose ESP_LOG output |
| CONFIG_ESPTOOLPY_FLASHSIZE_8MB | y | 8 MB flash for firmware + dictionary |
| CONFIG_ESPTOOLPY_HEADER_FLASHSIZE_UPDATE | y | Auto-update flash size in binary header |
| CONFIG_ESP_SYSTEM_PANIC_PRINT_HALT | y | Print backtrace and halt on panic |
| CONFIG_FREERTOS_USE_TRACE_FACILITY | y | Enable FreeRTOS task trace (for diagnostics) |
| CONFIG_HEAP_ABORT_WHEN_ALLOCATION_FAILS | y | Hard-fail on OOM for easier debugging |
| CONFIG_HEAP_POISONING_COMPREHENSIVE | y | Full heap poisoning for corruption detection |
| CONFIG_SPIRAM | y | Enable PSRAM |
| CONFIG_SPIRAM_MODE_OCTAL | y | Octal SPI PSRAM |
7. Combining sdkconfig Files
ESP-IDF can merge multiple defaults files at configuration time using the -D SDKCONFIG_DEFAULTS CMake variable. This is useful for layering the development overrides on top of the base defaults:
# Create (or recreate) sdkconfig with both base and devel settings:
idf.py -D SDKCONFIG_DEFAULTS="sdkconfig.defaults;sdkconfig.devel" build
Settings in later files override earlier ones, so sdkconfig.devel values take precedence over sdkconfig.defaults.
When do you need to do this? Only when the sdkconfig file needs to be created or recreated, for example after idf.py fullclean, idf.py set-target, or when cloning the project for the first time. Once sdkconfig exists, subsequent idf.py build commands reuse it and you do not need to pass -D SDKCONFIG_DEFAULTS again. You can also make further changes interactively with idf.py menuconfig at any time.
Firmware Architecture
Thread Model
app_main() creates two pthreads and then returns (freeing the default FreeRTOS task):
| Thread | Core | Stack | Role |
|---|---|---|---|
| speech_thread | CPU 1 (configurable) | default (8 KB) | Dequeues text from speech_queue, calls TextToSpeechSpeak() and TextToSpeechSync() |
| main_thread | any | 12 KB (configurable) | Runs the ESPress protocol loop: host-transport reads, DLE state machine, flow control |
CPU pinning is important: the speech synthesis in TextToSpeechSpeak() is compute-intensive and does not yield to the scheduler. Pinning it to CPU 1 keeps CPU 0 free so that the IDLE0 task can keep servicing the Task Watchdog Timer. The watchdog is disabled in the defaults, so this is a defensive measure.
Data Flow
Host (PC) ESP32-S3 / ESP32-C6
───────── ───────────────────
Serial terminal / GUI USB CDC-ACM / USB Serial/JTAG
│ │
│ ASCII text, control chars │
│ DLE command sequences │
├───────────────────────────────►│
│ ▼
│ main_thread (protocol loop)
│ ├── DLE state machine
│ ├── Control char handlers
│ ├── Text accumulation buffer
│ └── XON/XOFF flow control
│ │
│ │ strdup'd text chunks
│ ▼
│ speech_queue (FreeRTOS queue)
│ │
│ ▼
│ speech_thread
│ ├── TextToSpeechSpeak()
│ ├── TextToSpeechSync()
│ └── Flush / drain
│ │
│ │ TTS_MSG_BUFFER callback
│ ▼
│ dtesp_tts_callback()
│ ├── Audio samples → I2S DMA
│ └── Index markers → DLE INDEX
│ │
│ DLE STATUS, INDEX, XON/XOFF │
│◄───────────────────────────────┤
│ │
│ ▼
│ I2S peripheral → DAC → speaker
TTS In-Memory Mode
The firmware uses TextToSpeechOpenInMemory() with three rotating audio buffers (16 KB each, 8192 16-bit samples per buffer). Each buffer also carries up to 8 index-mark slots. When a buffer fills, dtesp_tts_callback() is invoked with TTS_MSG_BUFFER and handles it as follows:
- Any embedded index marks are extracted and sent to the host as DLE INDEX sequences.
- Audio samples are written to the I2S DMA ring buffer via i2s_channel_write().
- If speech is paused (SO received), samples are zeroed before writing.
- The buffer is reset and re-queued with TextToSpeechAddBuffer().
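The buffer arithmetic and the pause behaviour can be illustrated with a small host-side sketch. It is not the firmware's callback: handle_filled_buffer(), audio_sink_fn, and counting_sink() are hypothetical names, the sink stands in for the I2S write, and index-mark extraction and buffer re-queueing are left out.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define AUDIO_BUF_COUNT   3          /* three rotating buffers */
#define AUDIO_BUF_SAMPLES 8192       /* 16-bit samples per buffer */
#define AUDIO_BUF_BYTES   (AUDIO_BUF_SAMPLES * sizeof(int16_t))   /* 8192 * 2 = 16384 bytes = 16 KB */

/* Stand-in for the audio output path (the firmware writes to I2S instead). */
typedef size_t (*audio_sink_fn)(const int16_t *samples, size_t count);

/* Host-side demo sink: records how many non-zero samples it was handed. */
size_t sink_nonzero_seen;

size_t counting_sink(const int16_t *samples, size_t count)
{
    sink_nonzero_seen = 0;
    for (size_t i = 0; i < count; i++) {
        if (samples[i] != 0) {
            sink_nonzero_seen++;
        }
    }
    return count;
}

/* Forward one filled buffer to the sink, substituting silence while paused.
 * Returns the number of samples the sink accepted. */
size_t handle_filled_buffer(int16_t *samples, size_t count,
                            bool paused, audio_sink_fn sink)
{
    if (count > AUDIO_BUF_SAMPLES) {
        count = AUDIO_BUF_SAMPLES;   /* never read past one buffer */
    }
    if (paused) {
        memset(samples, 0, count * sizeof(int16_t));   /* zero before writing */
    }
    return sink(samples, count);
}
```

Writing zeroed samples rather than skipping the write keeps the output stream fed at a steady rate while speech is paused, which is one plausible reason for the zero-then-write order described above.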
Host Serial Transport
The main/ component selects the host transport at build time based on IDF_TARGET:
- ESP32-S3 → usb_cdc_transport.c using TinyUSB CDC-ACM
- ESP32-C6 → jtag_serial_transport.c using the built-in usb_serial_jtag driver
ESP32-S3: USB CDC-ACM
Why TinyUSB CDC instead of the built-in USB Serial/JTAG? The ESP32-S3's onboard USB Serial/JTAG peripheral automatically reboots the chip whenever the host toggles DTR. Because the ESPress protocol relies on DTR transitions to detect host connect/disconnect events, using the built-in Serial/JTAG would cause the device to reboot every time a host application opens the port. TinyUSB's CDC-ACM device avoids this by giving the firmware full control over how DTR line-state changes are handled — no reboot, just a protocol-state reset.
usb_cdc_transport.c wraps the espressif/esp_tinyusb CDC-ACM interface:
- RX path: A TinyUSB callback drains 64-byte USB bulk packets into a FreeRTOS stream buffer (default 4 KB). The protocol loop calls usb_cdc_transport_read(), which blocks on the stream buffer with a configurable timeout.
- TX path: usb_cdc_transport_write() queues data via tinyusb_cdcacm_write_queue() and flushes with a 50 ms timeout. Writes are silently dropped when no host is connected (cdc_connected == false).
- Connection tracking: The line-state callback monitors DTR. When DTR transitions low→high (host opens the port), a reconnection counter is incremented. The protocol loop polls usb_cdc_transport_check_reconnected() each iteration and resets all protocol state on reconnection.
- Initial boot: cdc_had_disconnect starts as true so the first host connection after power-on is treated as a reconnection, triggering XON.
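The connection-tracking and initial-boot rules can be sketched as a small state holder. This is an illustration under assumptions, not the code in usb_cdc_transport.c: link_state_t, link_init(), link_on_dtr(), and link_check_reconnected() are hypothetical names, and the sketch omits the synchronisation that would be needed between a USB callback and the protocol loop.

```c
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    bool     connected;       /* plays the role of cdc_connected */
    bool     had_disconnect;  /* plays the role of cdc_had_disconnect */
    uint32_t reconnects;      /* reconnect events not yet seen by the protocol loop */
} link_state_t;

void link_init(link_state_t *ls)
{
    ls->connected      = false;
    ls->had_disconnect = true;   /* first connect after power-on counts as a reconnect */
    ls->reconnects     = 0;
}

/* Called from the line-state callback with the current DTR level. */
void link_on_dtr(link_state_t *ls, bool dtr)
{
    if (dtr && ls->had_disconnect) {
        ls->reconnects++;            /* low -> high: host opened the port */
        ls->had_disconnect = false;
    } else if (!dtr) {
        ls->had_disconnect = true;   /* high -> low: host closed the port */
    }
    ls->connected = dtr;
}

/* Polled once per protocol-loop iteration; reports and clears pending reconnects. */
bool link_check_reconnected(link_state_t *ls)
{
    if (ls->reconnects == 0) {
        return false;
    }
    ls->reconnects = 0;
    return true;
}
```

Because the counter only advances when a disconnect was seen first, repeated line-state callbacks with DTR still high do not trigger spurious protocol resets.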
ESP32-C6: USB Serial/JTAG
jtag_serial_transport.c uses ESP-IDF's usb_serial_jtag driver:
- RTS reset suppression: During initialisation the firmware sets USB_SERIAL_JTAG.chip_rst.usb_uart_chip_rst_dis = 1, disabling the host-RTS-triggered chip reset so opening the port does not reboot the device.
- RX/TX buffering: ESP-IDF installs application-managed ring buffers sized by CONFIG_DTESP_JTAG_RX_BUF_SIZE and CONFIG_DTESP_JTAG_TX_BUF_SIZE.
- Connection tracking: The protocol layer polls usb_serial_jtag_is_connected() and treats a disconnect→connect transition as a host reconnection, resetting protocol state and sending the initial XON just as the CDC transport does.
Dictionary Loading
Dictionary loading is handled by the DECtalk component. See the component README for dictionary storage modes and the component BUILD.md for the dictionary build pipeline and __wrap_load_dictionary() implementation.
Flow Control
The protocol loop implements two-tier XON/XOFF flow control:
- Text buffer level — XOFF at 2/3 full, XON at 1/3 full.
- Speech queue depth — XOFF at 3/4 full, XON at 1/4 full.
XOFF is sent when either threshold is exceeded (aggressive). XON requires both to be below their respective thresholds (conservative) to prevent rapid oscillation.
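The two-tier hysteresis can be sketched as a single decision function. This is a minimal illustration, not the firmware's implementation: flow_ctl_t and flow_update() are hypothetical names, and treating the thresholds as inclusive is an assumption. XON (0x11) and XOFF (0x13) are the standard ASCII DC1 and DC3 control codes.

```c
#include <stdbool.h>
#include <stddef.h>

#define XON  0x11   /* ASCII DC1 */
#define XOFF 0x13   /* ASCII DC3 */

typedef struct {
    size_t text_cap;    /* text buffer capacity in bytes */
    size_t queue_cap;   /* speech queue capacity in entries */
    bool   xoff_sent;   /* true while the host is held off */
} flow_ctl_t;

/* Returns XOFF or XON when a control byte should be sent to the host, else 0. */
int flow_update(flow_ctl_t *fc, size_t text_used, size_t queue_used)
{
    /* Integer cross-multiplication avoids fractions: used/cap >= 2/3, and so on. */
    bool text_high  = text_used  * 3 >= fc->text_cap  * 2;   /* >= 2/3 full */
    bool queue_high = queue_used * 4 >= fc->queue_cap * 3;   /* >= 3/4 full */
    bool text_low   = text_used  * 3 <= fc->text_cap;        /* <= 1/3 full */
    bool queue_low  = queue_used * 4 <= fc->queue_cap;       /* <= 1/4 full */

    if (!fc->xoff_sent && (text_high || queue_high)) {
        fc->xoff_sent = true;        /* either tier may stop the host */
        return XOFF;
    }
    if (fc->xoff_sent && text_low && queue_low) {
        fc->xoff_sent = false;       /* both tiers must have drained */
        return XON;
    }
    return 0;
}
```

The gap between the high and low thresholds is what prevents oscillation: after XOFF, a level that merely dips back under the high mark does not produce XON.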
Idle Flush
If no new characters arrive for CONFIG_DTESP_TEXT_IDLE_TIMEOUT_MS (default 200 ms), any buffered text is automatically flushed to the speech queue. This handles the case where the host sends text without a trailing CR.
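The idle-flush check reduces to a timestamp comparison. The sketch below assumes a free-running 32-bit millisecond clock; text_accum_t, accum_on_char(), and accum_should_flush() are hypothetical names, and TEXT_IDLE_TIMEOUT_MS stands in for the Kconfig value at its documented default.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TEXT_IDLE_TIMEOUT_MS 200u   /* default of CONFIG_DTESP_TEXT_IDLE_TIMEOUT_MS */

typedef struct {
    size_t   pending;      /* characters buffered but not yet queued for speech */
    uint32_t last_rx_ms;   /* time the most recent character arrived */
} text_accum_t;

/* Record one received character and restart the idle timer. */
void accum_on_char(text_accum_t *a, uint32_t now_ms)
{
    a->pending++;
    a->last_rx_ms = now_ms;
}

/* True when buffered text has sat idle for at least the timeout.
 * Unsigned subtraction stays correct when the millisecond counter wraps. */
bool accum_should_flush(const text_accum_t *a, uint32_t now_ms)
{
    return a->pending > 0 &&
           (uint32_t)(now_ms - a->last_rx_ms) >= TEXT_IDLE_TIMEOUT_MS;
}
```

A protocol loop would call accum_should_flush() on each iteration and, when it returns true, hand the buffered text to the speech queue and reset pending to zero.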
Build Commands Reference
# Full clean build
idf.py fullclean && idf.py build
# Full clean build with development overrides
idf.py fullclean && idf.py -D SDKCONFIG_DEFAULTS="sdkconfig.defaults;sdkconfig.devel" build
# Build only
idf.py build
# Flash (replace /dev/ttyUSB0 with your UART port)
idf.py -p /dev/ttyUSB0 flash
# Flash dictionary partition separately (when using partition mode)
idf.py -p /dev/ttyUSB0 udict-flash
# Monitor console output (UART0)
idf.py -p /dev/ttyUSB0 monitor
# Open menuconfig
idf.py menuconfig
# Set target (only needed once, or after fullclean)
idf.py set-target esp32s3
# Build for ESP32-C6 instead
idf.py set-target esp32c6
idf.py build
Changing Language
idf.py menuconfig
# Navigate to: DECtalk → DECtalk Language
# Select desired language, save, exit
idf.py build
idf.py -p /dev/ttyUSB0 flash
See the component README for supported languages and compile definitions.
Changing Dictionary Storage Mode
idf.py menuconfig
# Navigate to: DECtalk → Dictionary location
# Choose: Embedded in firmware / Dedicated partition / File system
See the component README for detailed descriptions of each storage mode and its sub-options.
Porting Notes
For details on how the upstream dapi library was adapted for ESP32 (compile definitions, header shims, libc stubs, linker wrapping, warning suppression), see the component BUILD.md.