State Architecture

This document describes the target state architecture for Songbird, the tradeoffs behind it, and the migration path from the current system.

Principles

  1. StateStore is the single source of truth for all persistent state. If you close and reopen the DAW, everything in StateStore loads. Nothing else is persisted.
  2. The sync engine is the sole mediator. No component reads or writes StateStore directly. The frontend, audio engine, and collab layer all go through the sync engine.
  3. The audio thread never blocks. It cannot lock a mutex, allocate, or call into StateStore. It receives data via lock-free channels (SPSC ring buffers, atomic swaps).
  4. rtFrame data is ephemeral. Anything sent over the binary rtFrame path (meter levels, transport position, spectrum, CPU stats) is not persisted and does not touch StateStore.

What Lives Where

StateStore (persistent, undo-able)

Everything that should survive a close/reopen cycle:
  • Project metadata (name, BPM, time signature, key, sample rate)
  • Tracks (name, type, color, gain, pan, mute, solo, armed)
  • Clips (MIDI notes, audio file references, positions, gains)
  • Plugin instances and their parameter values
  • Sends, returns, routing configuration
  • Sections, loop region
  • Take lanes
  • Arrangement markers, automation envelopes
  • Clip launcher state
  • UI state that should persist (view mode, panel layout, editor selections)

Ephemeral (not in StateStore)

  • Transport playhead position, play/record state (derived from audio engine)
  • Meter levels, spectrum analysis, stereo width (rtFrame)
  • CPU stats, buffer size
  • Active MIDI notes currently sounding
  • Audio engine graph topology (derived from StateStore on rebuild)
  • Recording buffers (in-flight, committed to StateStore on stop)

Satellite Services (not state, but capabilities)

These manage external resources and don’t belong in StateStore:
  • SessionManager (file save/load/autosave)
  • PresetManager (plugin preset files)
  • MidiDeviceManager (hardware enumeration)
  • TemplateLibrary, LoopBrowser (filesystem)
  • CollabSession (network)

Data Flow

┌──────────────┐     commands      ┌─────────────┐    mutations     ┌────────────┐
│   Frontend   │ ────────────────→ │ Sync Engine │ ───────────────→ │ StateStore │
│  (Zustand)   │ ←──────────────── │             │ ←─────────────── │ (Project)  │
└──────────────┘    broadcasts     │             │  change events   └────────────┘
                                   │             │
┌──────────────┐ commands/snapshots│             │
│ Audio Engine │ ←──────────────── │             │
│ (RT thread)  │ ────────────────→ │             │
└──────────────┘ events (ring buf) └─────────────┘

┌──────────────┐   binary frames
│ Audio Engine │ ─────────────────→  Frontend (rtFrame at ~30Hz)
│ (RT thread)  │                     Bypasses StateStore entirely
└──────────────┘
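
To make the binary rtFrame path concrete, here is a minimal sketch of how one frame might be packed and unpacked. The `RtFrame` struct, its fields, and the little-endian layout are assumptions made for illustration; this document only states that meter levels, transport position, spectrum, and CPU stats travel over a binary path at roughly 30 Hz.

```rust
use std::convert::TryInto;

/// Hypothetical rtFrame payload. The field set and byte layout are invented
/// for illustration and are not taken from the Songbird codebase.
#[derive(Debug, Clone, PartialEq)]
pub struct RtFrame {
    pub playhead_beats: f64,
    pub cpu_load: f32,
    pub meter_peaks: Vec<f32>, // one peak value per track
}

impl RtFrame {
    /// Write the frame into a caller-owned buffer so the buffer's capacity
    /// can be reused from one frame to the next.
    pub fn encode_into(&self, out: &mut Vec<u8>) {
        out.clear();
        out.extend_from_slice(&self.playhead_beats.to_le_bytes());
        out.extend_from_slice(&self.cpu_load.to_le_bytes());
        out.extend_from_slice(&(self.meter_peaks.len() as u16).to_le_bytes());
        for peak in &self.meter_peaks {
            out.extend_from_slice(&peak.to_le_bytes());
        }
    }

    /// Parse a frame; returns None if the input is truncated.
    pub fn decode(bytes: &[u8]) -> Option<RtFrame> {
        let playhead_beats = f64::from_le_bytes(bytes.get(0..8)?.try_into().ok()?);
        let cpu_load = f32::from_le_bytes(bytes.get(8..12)?.try_into().ok()?);
        let count = u16::from_le_bytes(bytes.get(12..14)?.try_into().ok()?) as usize;
        let mut meter_peaks = Vec::with_capacity(count);
        for i in 0..count {
            let start = 14 + 4 * i;
            let raw = bytes.get(start..start + 4)?.try_into().ok()?;
            meter_peaks.push(f32::from_le_bytes(raw));
        }
        Some(RtFrame { playhead_beats, cpu_load, meter_peaks })
    }
}
```

Nothing in this path reads or writes StateStore: the frame is built from engine-side values, sent, displayed, and dropped.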

Audio Engine Sync: The Hybrid Approach

The audio engine needs data from StateStore but cannot access it directly. We use two mechanisms, chosen by the nature of the change:

Snapshots (structural changes)

For operations that change the structure of what the engine processes — adding/removing tracks, adding/moving/deleting clips, changing routing, loading a project — the sync engine builds a derived snapshot and swaps it atomically into the engine. The engine’s state is partitioned into independently swappable slices:
Slice                  | What it contains                                     | Swap mechanism        | When rebuilt
-----------------------|------------------------------------------------------|-----------------------|-----------------------------------------------
TrackClips (per-track) | MIDI notes, audio clip scheduling data for one track | Arc swap in scheduler | Clip edit, note edit, recording commit
Graph topology         | Nodes, connections, plugin chain, send/return wiring | Session swap (full)   | Add/remove track, change routing, load project
Transport config       | BPM, time sig, loop region                           | EngineCommand         | User changes BPM, toggles loop
Mixer params           | Gain, pan, mute, solo per node                       | EngineCommand         | Fader drag, mute toggle

When a mutation touches clips on track 3, only TrackClips for track 3 is rebuilt and swapped. The graph, other tracks, and mixer params are untouched.
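
The per-track independence described above can be sketched with plain `Arc`s. This is an illustration under assumptions: `TrackClips` is reduced to a note list, `SchedulerSnapshot` and `with_rebuilt_track` are hypothetical names, and the actual hand-off to the audio thread (the "Arc swap in scheduler") is not modelled here.

```rust
use std::sync::Arc;

/// Hypothetical, simplified per-track slice. The real slice would also carry
/// audio clip scheduling data.
#[derive(Debug, Clone, PartialEq)]
pub struct TrackClips {
    pub notes: Vec<(f64, u8)>, // (start beat, MIDI pitch)
}

/// What the scheduler would read: one independently replaceable Arc per track.
#[derive(Clone)]
pub struct SchedulerSnapshot {
    pub tracks: Vec<Arc<TrackClips>>,
}

impl SchedulerSnapshot {
    /// Build the next snapshot after an edit to `track_index`.
    /// Only that track's slice is replaced; every other entry is an Arc clone
    /// (a reference-count bump), so untouched tracks share their data.
    pub fn with_rebuilt_track(&self, track_index: usize, rebuilt: TrackClips) -> SchedulerSnapshot {
        let mut tracks = self.tracks.clone();
        tracks[track_index] = Arc::new(rebuilt);
        SchedulerSnapshot { tracks }
    }
}
```

After an edit to track 3, `Arc::ptr_eq` on any other track's entry in the old and new snapshots returns true, which is the property that makes the rebuild cost independent of how many other tracks the project has.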

Commands (continuous parameters)

For high-frequency, single-value changes — fader drags, plugin knob turns, transport seek — the sync engine sends EngineCommand messages through the SPSC ring buffer. These are applied directly by the audio thread on the next process_block(). Commands are ephemeral overrides. The next snapshot rebuild for that slice will include the committed value from StateStore, naturally subsuming any in-flight commands.
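
As a rough sketch of the command path, the ring below packs each command into one `u64` so that it can be built from safe atomics only. The two `EngineCommand` variants, the bit layout, and the `CommandRing` type are assumptions for illustration; the real enum has more variants (the document names MidiNoteOn/Off, for example) and the real ring buffer may be implemented quite differently.

```rust
use std::sync::atomic::{AtomicU64, AtomicUsize, Ordering};

/// Hypothetical subset of the command set.
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum EngineCommand {
    SetGain { node: u16, gain: f32 },
    SetPan { node: u16, pan: f32 },
}

impl EngineCommand {
    // Layout (assumed): bits 48.. = tag, bits 32..48 = node, bits 0..32 = f32 bits.
    fn encode(self) -> u64 {
        let (tag, node, value) = match self {
            EngineCommand::SetGain { node, gain } => (1u64, node, gain),
            EngineCommand::SetPan { node, pan } => (2u64, node, pan),
        };
        (tag << 48) | ((node as u64) << 32) | (value.to_bits() as u64)
    }

    fn decode(word: u64) -> Option<EngineCommand> {
        let node = (word >> 32) as u16;
        let value = f32::from_bits(word as u32);
        match word >> 48 {
            1 => Some(EngineCommand::SetGain { node, gain: value }),
            2 => Some(EngineCommand::SetPan { node, pan: value }),
            _ => None,
        }
    }
}

/// Must stay a power of two so index wrap-around remains consistent.
pub const CAPACITY: usize = 8;

/// Single-producer, single-consumer ring. The SPSC contract is by convention:
/// exactly one thread calls `push` and exactly one thread calls `pop`.
pub struct CommandRing {
    slots: [AtomicU64; CAPACITY],
    head: AtomicUsize, // next slot to read, advanced only by the consumer
    tail: AtomicUsize, // next slot to write, advanced only by the producer
}

impl CommandRing {
    pub fn new() -> Self {
        CommandRing {
            slots: std::array::from_fn(|_| AtomicU64::new(0)),
            head: AtomicUsize::new(0),
            tail: AtomicUsize::new(0),
        }
    }

    /// Producer side (sync engine). Returns false when the ring is full.
    pub fn push(&self, cmd: EngineCommand) -> bool {
        let tail = self.tail.load(Ordering::Relaxed);
        let head = self.head.load(Ordering::Acquire);
        if tail.wrapping_sub(head) == CAPACITY {
            return false;
        }
        self.slots[tail % CAPACITY].store(cmd.encode(), Ordering::Relaxed);
        self.tail.store(tail.wrapping_add(1), Ordering::Release);
        true
    }

    /// Consumer side (audio thread): no locks, no allocation.
    pub fn pop(&self) -> Option<EngineCommand> {
        let head = self.head.load(Ordering::Relaxed);
        let tail = self.tail.load(Ordering::Acquire);
        if head == tail {
            return None;
        }
        let word = self.slots[head % CAPACITY].load(Ordering::Relaxed);
        self.head.store(head.wrapping_add(1), Ordering::Release);
        EngineCommand::decode(word)
    }
}
```

In this sketch the audio thread would drain the ring with repeated `pop()` calls at the top of each `process_block()`, applying each command before rendering.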

Why hybrid, not pure snapshots or pure commands

Pure snapshots would mean rebuilding and swapping on every fader tick (~60Hz). The allocation and rebuild cost is unnecessary for a single-field change, and latency is coarser (whole-buffer granularity vs per-sample).

Pure commands would require a command variant for every field on every struct. Incremental sync is also fragile: forget one command and the engine silently drifts from StateStore. The command surface explodes with project complexity.

Hybrid gives the best of both: commands for the hot path (zero allocation, sub-buffer latency), snapshots for structural changes (impossible to drift, no command explosion).
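
The hybrid split amounts to a routing decision per change. The sketch below encodes the slice table from the Snapshots section as a single function; the `Change` and `SyncPath` enums and the `sync_path` function are hypothetical names, not types from the codebase.

```rust
/// Hypothetical classification of user-visible changes.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Change {
    ClipEdit { track: usize },
    NoteEdit { track: usize },
    RecordingCommit { track: usize },
    AddTrack,
    RemoveTrack,
    RoutingChange,
    LoadProject,
    SetBpm,
    ToggleLoop,
    FaderDrag,
    MuteToggle,
}

/// Which mechanism carries the change to the audio engine.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum SyncPath {
    /// Rebuild and swap one track's TrackClips slice.
    TrackClipsSwap { track: usize },
    /// Rebuild the graph topology and swap the full session.
    SessionSwap,
    /// Push a single EngineCommand through the ring buffer.
    EngineCommand,
}

pub fn sync_path(change: Change) -> SyncPath {
    match change {
        Change::ClipEdit { track }
        | Change::NoteEdit { track }
        | Change::RecordingCommit { track } => SyncPath::TrackClipsSwap { track },
        Change::AddTrack | Change::RemoveTrack | Change::RoutingChange | Change::LoadProject => {
            SyncPath::SessionSwap
        }
        Change::SetBpm | Change::ToggleLoop | Change::FaderDrag | Change::MuteToggle => {
            SyncPath::EngineCommand
        }
    }
}
```

Because the `match` is exhaustive, adding a new `Change` variant fails to compile until it is assigned a path, which is one way to guard against the silent drift that pure command dispatch is prone to.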

MIDI Monitoring vs Recording

MIDI monitoring and recording are deliberately split into two paths:

Monitoring (bypasses sync engine)

MIDI hardware events are injected directly into the audio graph for minimum latency. The MIDI device manager pushes EngineCommand::MidiNoteOn/Off to the ring buffer, and the audio thread processes them on the next callback, so the added latency is at most one buffer period (~11.6ms at 512 samples/44.1kHz, ~5.8ms at 256 samples). The sync engine is not involved: this is a direct hardware-to-engine path.
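
The "next callback" bound is simply buffer length divided by sample rate. A small helper for checking such figures (the function name is an assumption; the arithmetic is standard):

```rust
/// Duration of one audio callback in milliseconds:
/// frames per buffer / sample rate (Hz) * 1000.
pub fn buffer_period_ms(frames: u32, sample_rate_hz: f64) -> f64 {
    f64::from(frames) / sample_rate_hz * 1000.0
}
```

For example, `buffer_period_ms(512, 44_100.0)` is about 11.6 and `buffer_period_ms(256, 44_100.0)` is about 5.8, so a monitoring latency near 5 ms corresponds to a buffer of roughly 256 samples.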

Recording (through sync engine)

The sync engine independently captures incoming MIDI events, timestamps them against the transport position, and accumulates them in a recording buffer. When recording stops:
  1. Accumulated notes are assembled into a clip
  2. Sync engine calls mutate_described("Record MIDI", ...) on StateStore
  3. StateStore captures undo snapshot, marks dirty, triggers git commit
  4. Sync engine rebuilds TrackClips for the recorded track and swaps it into the scheduler
  5. Sync engine broadcasts to frontend
During recording, StateStore is not touched. The recording buffer is ephemeral.
MIDI keyboard ──→ EngineCommand::MidiNoteOn ──→ [ring buffer] ──→ audio thread
                                                                    (monitoring)
              ──→ sync engine recording buffer (accumulates)

                    └── on stop: mutate_described() ──→ StateStore
                                                         ├──→ undo snapshot
                                                         ├──→ git commit
                                                         └──→ frontend broadcast
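
The accumulate-then-commit behaviour can be sketched as follows. `RecordingBuffer`, `MidiEvent`, `Note`, and the choice to close still-held notes at the stop position are all assumptions for illustration; the resulting note list is what step 1 above would wrap into a clip before the `mutate_described("Record MIDI", ...)` call.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum MidiEvent {
    NoteOn { pitch: u8, velocity: u8 },
    NoteOff { pitch: u8 },
}

#[derive(Debug, Clone, PartialEq)]
pub struct Note {
    pub pitch: u8,
    pub velocity: u8,
    pub start_beat: f64,
    pub length_beats: f64,
}

/// Ephemeral buffer owned by the sync engine; StateStore is not touched
/// until `finish` has produced the notes.
#[derive(Default)]
pub struct RecordingBuffer {
    events: Vec<(f64, MidiEvent)>, // (transport position in beats, event)
}

impl RecordingBuffer {
    /// Timestamp an incoming event against the transport position.
    pub fn capture(&mut self, transport_beat: f64, event: MidiEvent) {
        self.events.push((transport_beat, event));
    }

    /// On stop: pair each NoteOn with the next NoteOff of the same pitch.
    /// Notes still held at stop are closed at `stop_beat` (an assumed policy).
    pub fn finish(self, stop_beat: f64) -> Vec<Note> {
        let mut open: Vec<Note> = Vec::new();
        let mut notes: Vec<Note> = Vec::new();
        for (beat, event) in self.events {
            match event {
                MidiEvent::NoteOn { pitch, velocity } => open.push(Note {
                    pitch,
                    velocity,
                    start_beat: beat,
                    length_beats: 0.0,
                }),
                MidiEvent::NoteOff { pitch } => {
                    if let Some(i) = open.iter().position(|n| n.pitch == pitch) {
                        let mut note = open.remove(i);
                        note.length_beats = beat - note.start_beat;
                        notes.push(note);
                    }
                }
            }
        }
        for mut note in open {
            note.length_beats = stop_beat - note.start_beat;
            notes.push(note);
        }
        notes.sort_by(|a, b| a.start_beat.total_cmp(&b.start_beat));
        notes
    }
}
```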

Mutation Path

Current (to be migrated away from)

The StateBackend trait works on serde_json::Value. Every mutation serializes the entire Project to JSON, mutates the JSON, then deserializes it back. This full serialize/deserialize round trip costs ~100–500µs per mutation, regardless of how small the change is.

Target

StateBackend should expose typed mutation methods that operate directly on &mut Project (or typed slices of it). The JSON indirection was a decoupling choice — the sync engine crate (songbird-sync) doesn’t depend on the Project struct. This can be resolved by making the trait generic or adding the dependency. Direct mutation reduces per-operation cost to field writes + diff, eliminating the serde bottleneck on the hot path (fader drags, live note edits).
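
One way the "make the trait generic" option could look is an associated type, so the sync crate only ever names `Self::State`. This is a sketch under assumptions: the method name `mutate`, the `apply_mutation` helper, and the reduced `Project`/`Track` structs are hypothetical, and the real trait would also need the silent/described distinction and change events.

```rust
/// Hypothetical generic form of the backend trait. Code in the sync crate
/// depends only on this trait, not on the concrete Project type.
pub trait StateBackend {
    type State;
    /// Apply `f` directly to the typed state and return its result.
    fn mutate<R>(&mut self, f: impl FnOnce(&mut Self::State) -> R) -> R;
}

// --- would live in the application crate ---
#[derive(Debug, Default)]
pub struct Track {
    pub gain: f32,
    pub muted: bool,
}

#[derive(Debug, Default)]
pub struct Project {
    pub bpm: f32,
    pub tracks: Vec<Track>,
}

#[derive(Default)]
pub struct StateStore {
    project: Project,
    dirty: bool,
}

impl StateStore {
    pub fn project(&self) -> &Project {
        &self.project
    }
    pub fn is_dirty(&self) -> bool {
        self.dirty
    }
}

impl StateBackend for StateStore {
    type State = Project;
    fn mutate<R>(&mut self, f: impl FnOnce(&mut Project) -> R) -> R {
        self.dirty = true; // a plain field write, no JSON round trip
        f(&mut self.project)
    }
}

// --- would live in the sync crate: generic over any backend ---
pub fn apply_mutation<B: StateBackend, R>(
    backend: &mut B,
    f: impl FnOnce(&mut B::State) -> R,
) -> R {
    backend.mutate(f)
}
```

The alternative named above, adding a direct dependency on the Project crate, would remove the associated type at the cost of coupling the two crates.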

Silent vs Described mutations

  • mutate_silent: No undo entry. Used during continuous gestures (fader drag, live note drag). Captures a pre_silent_snapshot on first call for undo baseline.
  • mutate_described: Creates an undo entry. Used on commit (fader release, note drop, recording stop). If preceded by silent mutations, undo reverts to the pre-drag state.
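
The interaction between the two mutation kinds can be sketched with a whole-project clone as the undo baseline. Everything beyond the names `mutate_silent`, `mutate_described`, and `pre_silent_snapshot` is an assumption: the real store may snapshot more cheaply, and the `undo` method and the reduced `Project` are hypothetical.

```rust
#[derive(Clone, Debug, PartialEq)]
pub struct Project {
    pub gains: Vec<f32>,
}

pub struct StateStore {
    project: Project,
    undo_stack: Vec<(String, Project)>, // (description, state to restore)
    pre_silent_snapshot: Option<Project>,
}

impl StateStore {
    pub fn new(project: Project) -> Self {
        StateStore { project, undo_stack: Vec::new(), pre_silent_snapshot: None }
    }

    pub fn project(&self) -> &Project {
        &self.project
    }

    /// No undo entry. The first call of a gesture records the baseline.
    pub fn mutate_silent(&mut self, f: impl FnOnce(&mut Project)) {
        if self.pre_silent_snapshot.is_none() {
            self.pre_silent_snapshot = Some(self.project.clone());
        }
        f(&mut self.project);
    }

    /// Creates one undo entry. If silent mutations preceded it, the entry
    /// restores the pre-gesture state rather than the last intermediate value.
    pub fn mutate_described(&mut self, description: &str, f: impl FnOnce(&mut Project)) {
        let baseline = self
            .pre_silent_snapshot
            .take()
            .unwrap_or_else(|| self.project.clone());
        f(&mut self.project);
        self.undo_stack.push((description.to_string(), baseline));
    }

    /// Revert the most recent described mutation, returning its description.
    pub fn undo(&mut self) -> Option<String> {
        let (description, snapshot) = self.undo_stack.pop()?;
        self.project = snapshot;
        self.pre_silent_snapshot = None;
        Some(description)
    }
}
```

A fader drag from 1.0 through 0.8 and 0.6 (silent) to a release at 0.5 (described) therefore produces a single undo step that returns the gain to 1.0.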

Latency Budget

Estimated per-operation costs in the target architecture:
Operation            | Audio engine latency                                      | StateStore latency
---------------------|-----------------------------------------------------------|------------------------------------------
MIDI note monitoring | ~5ms (next audio buffer)                                  | N/A (not written)
Fader drag tick      | ~100ns (ring buffer push)                                 | ~1–10µs (direct field write)
Edit MIDI note       | ~10–50µs (single-track clip rebuild + swap)               | ~1–10µs (direct field write)
Add/remove track     | ~30–200µs (graph topology rebuild + swap)                 | ~1–10µs (direct field write)
Load project         | ~10–150ms (full session build, plugin init, audio decode) | ~1ms (deserialize from disk)
Recording stop       | ~10–50µs (single-track clip rebuild + swap)               | ~100–500µs (clip commit + undo snapshot)

Migration from Current Architecture

What gets eliminated

  • AppState god object — replaced by StateStore + satellite services + transport infrastructure
  • Mixer struct on AppState — redundant with Project.tracks[i].{gain, pan, muted, solo} in StateStore
  • MixerStoreState / store slices in Rust — the Zustand store shape is a frontend concern; Rust doesn’t need its own definition of the frontend’s mixer state
  • snake_to_camel_json() in build_track_state_payload() — eliminated once serde rename_all = "camelCase" is applied to Project structs (the serde_fix branch)
  • JSON round-trip in dispatch_bridge.rs — replaced by direct typed mutations

What stays

  • Project struct — internal to StateStore, not exposed
  • StateStore — gains additional slices for state currently on AppState (arrangement, automation, clip launcher, etc.)
  • StateBackend trait — the sync engine’s API into StateStore, extended with typed methods
  • SPSC ring buffers — the audio thread communication path
  • SchedulerChannel — extended to support per-track clip swaps
  • SessionChannel — for full project load/swap

Alternatives Considered

Pure command dispatch (status quo)

Every change sends an explicit EngineCommand. Engine maintains its own full copy of state. Rejected because: two copies of everything, easy to drift, command surface explodes with every new field.

Pure Arc-swap snapshots

Every mutation rebuilds a full snapshot and atomically swaps it. Rejected because: unnecessary allocation on hot path (fader drags), rebuild cost scales with project size even for single-field changes.

Triple buffer / seqlock

Lock-free shared memory with fixed pre-allocated buffers. Rejected because: requires flat Copy structs with hard caps — impractical for variable-length nested data (tracks → clips → notes).
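
To illustrate why hard caps are impractical here, the sketch below flattens tracks → clips → notes into `Copy` structs and measures the result. The cap values and struct fields are invented for illustration; only the overall shape (fixed-size arrays at every nesting level) follows from the constraint described above.

```rust
use std::mem::size_of;

// Hypothetical hard caps: a triple buffer or seqlock needs every level of
// the nesting to have a fixed size.
pub const MAX_TRACKS: usize = 64;
pub const MAX_CLIPS_PER_TRACK: usize = 64;
pub const MAX_NOTES_PER_CLIP: usize = 1024;

#[derive(Clone, Copy)]
pub struct FlatNote {
    pub start_beat: f32,
    pub length_beats: f32,
    pub pitch: u8,
    pub velocity: u8,
}

#[derive(Clone, Copy)]
pub struct FlatClip {
    pub notes: [FlatNote; MAX_NOTES_PER_CLIP],
    pub note_count: u32,
}

#[derive(Clone, Copy)]
pub struct FlatTrack {
    pub clips: [FlatClip; MAX_CLIPS_PER_TRACK],
    pub clip_count: u32,
}

#[derive(Clone, Copy)]
pub struct FlatProject {
    pub tracks: [FlatTrack; MAX_TRACKS],
    pub track_count: u32,
}

/// Size of one buffer, independent of how many notes actually exist.
/// Only the type is measured; no value is ever constructed.
pub fn flat_project_bytes() -> usize {
    size_of::<FlatProject>()
}
```

With these caps a single buffer runs to tens of megabytes even for an empty project, a triple buffer needs three of them, and any project exceeding a cap cannot be represented at all.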

Everything in StateStore, including ephemeral state

Transport position, meter levels, etc. persisted in StateStore. Rejected because: transport position changes every ~5ms from the audio engine — writing it to StateStore would be pointless serialization churn with no persistence benefit.