43 KiB

Raw Blame History

DAW Backend Architecture & Implementation Roadmap

Version: 1.0
Date: October 2025
Language: Rust
Audio I/O: cpal

Architecture Overview
Core Components
Metatracks Architecture
Implementation Roadmap
Technical Specifications
Testing Strategy

Architecture Overview

High-Level Design

The DAW follows a multi-threaded, message-passing architecture that separates real-time audio processing from UI and control logic:

┌─────────────┐     Commands      ┌─────────────┐     Commands      ┌─────────────┐
│  UI Thread  │ ←──────────────→  │Control Thread│ ←──────────────→ │ Audio Thread│
└─────────────┘     Events        └─────────────┘    Lock-free     └─────────────┘
                                          │             Queues              │
                                          ↓                                 │
                                   ┌─────────────┐                         │
                                   │Project State│←────────────────────────┘
                                   │(Triple-buf) │      Atomic reads
                                   └─────────────┘

Design Principles

Real-time Safety: Audio thread is lock-free and allocation-free
Hierarchical Composition: Tracks can contain other tracks (metatracks)
Message-Based Communication: All cross-thread communication via lock-free queues
Incremental Complexity: Architecture supports simple flat tracks initially, scales to nested metatracks
Separation of Concerns: Audio processing, state management, and UI are decoupled

Core Components

1. Audio Engine (Real-time Thread)

Responsibilities:

Process audio graph in response to cpal callbacks
Execute commands from control thread
Maintain playback position and transport state
Send events back to control thread

Constraints:

No memory allocations
No blocking operations (mutex, I/O)
No unbounded loops
Pre-allocated buffers only

Key Structures:

struct AudioEngine {
    tracks: Vec<TrackNode>,
    playhead: u64,              // Sample position
    playing: bool,
    sample_rate: u32,
    
    // Communication
    command_rx: rtrb::Consumer<Command>,
    event_tx: rtrb::Producer<AudioEvent>,
    
    // Pre-allocated resources
    mix_buffer: Vec<f32>,
    buffer_pool: BufferPool,
}

enum Command {
    Play,
    Stop,
    Seek(f64),
    SetTempo(f32),
    UpdateTrackVolume(TrackId, f32),
    UpdateTrackMute(TrackId, bool),
    AddEffect(TrackId, EffectType),
    // ... more commands
}

enum AudioEvent {
    PlaybackPosition(f64),
    PeakLevel(TrackId, f32),
    BufferUnderrun,
    // ... more events
}

2. Track Hierarchy

Track Node Types:

enum TrackNode {
    Audio(AudioTrack),
    Midi(MidiTrack),
    Metatrack(Metatrack),
    Bus(BusTrack),
}

struct AudioTrack {
    id: TrackId,
    name: String,
    clips: Vec<Clip>,
    effects: Vec<Box<dyn Effect>>,
    volume: f32,
    pan: f32,
    muted: bool,
    solo: bool,
    parent: Option<TrackId>,
}

struct MidiTrack {
    id: TrackId,
    name: String,
    clips: Vec<MidiClip>,
    instrument: Box<dyn Effect>,  // Virtual instrument
    effects: Vec<Box<dyn Effect>>,
    volume: f32,
    pan: f32,
    muted: bool,
    solo: bool,
    parent: Option<TrackId>,
}

struct Metatrack {
    id: TrackId,
    name: String,
    children: Vec<TrackId>,
    effects: Vec<Box<dyn Effect>>,
    
    // Metatrack-specific features
    time_stretch: f32,      // Speed multiplier (0.5 = half speed)
    pitch_shift: f32,       // Semitones
    offset: f64,            // Time offset in seconds
    
    volume: f32,
    pan: f32,
    muted: bool,
    solo: bool,
    parent: Option<TrackId>,
    
    // UI hints
    collapsed: bool,
    color: Color,
}

struct BusTrack {
    id: TrackId,
    name: String,
    inputs: Vec<TrackId>,   // Which tracks send to this bus
    effects: Vec<Box<dyn Effect>>,
    volume: f32,
    pan: f32,
}

3. Clips and Regions

struct Clip {
    id: ClipId,
    content: ClipContent,
    start_time: f64,        // Position in parent track (seconds)
    duration: f64,          // Clip duration (seconds)
    offset: f64,            // Offset into content (seconds)
    
    // Clip-level processing
    gain: f32,
    fade_in: f64,
    fade_out: f64,
    reversed: bool,
}

enum ClipContent {
    AudioFile {
        pool_index: usize,  // Index into AudioPool
    },
    MidiData {
        events: Vec<MidiEvent>,
    },
    MetatrackReference {
        track_id: TrackId,
    },
}

struct MidiEvent {
    timestamp: u64,         // Sample offset within clip
    status: u8,
    data1: u8,
    data2: u8,
}

4. Audio Pool

Shared audio file storage:

struct AudioPool {
    files: Vec<AudioFile>,
    cache: LruCache<FileId, Vec<f32>>,
}

struct AudioFile {
    id: FileId,
    path: PathBuf,
    data: Vec<f32>,         // Interleaved samples
    channels: u32,
    sample_rate: u32,
    frames: u64,
}

5. Effect System

trait Effect: Send {
    fn process(&mut self, buffer: &mut [f32], channels: usize, sample_rate: u32);
    fn set_parameter(&mut self, id: u32, value: f32);
    fn get_parameter(&self, id: u32) -> f32;
    fn reset(&mut self);
}

// Example implementations
struct GainEffect {
    gain_db: f32,
}

struct SimpleEQ {
    low_gain: f32,
    mid_gain: f32,
    high_gain: f32,
    filters: [BiquadFilter; 3],
}

struct SimpleSynth {
    oscillators: Vec<Oscillator>,
    adsr: AdsrEnvelope,
}

6. Render Context

Carries time and tempo information through the track hierarchy:

struct RenderContext {
    global_position: u64,       // Absolute sample position
    local_position: u64,        // Position within current scope
    sample_rate: u32,
    tempo: f32,
    time_signature: (u32, u32),
    time_stretch: f32,          // Accumulated stretch factor
}

impl Metatrack {
    fn transform_context(&self, ctx: RenderContext) -> RenderContext {
        let offset_samples = (self.offset * ctx.sample_rate as f64) as u64;
        let local_pos = ((ctx.local_position.saturating_sub(offset_samples)) as f64 
            / self.time_stretch as f64) as u64;
        
        RenderContext {
            global_position: ctx.global_position,
            local_position: local_pos,
            sample_rate: ctx.sample_rate,
            tempo: ctx.tempo * self.time_stretch,
            time_signature: ctx.time_signature,
            time_stretch: ctx.time_stretch * self.time_stretch,
        }
    }
}

7. Project State

struct Project {
    tracks: HashMap<TrackId, TrackNode>,
    root_tracks: Vec<TrackId>,
    audio_pool: AudioPool,
    
    // Global settings
    sample_rate: u32,
    tempo: f32,
    time_signature: (u32, u32),
    
    // Metadata
    name: String,
    created: SystemTime,
    modified: SystemTime,
}

impl Project {
    fn get_processing_order(&self) -> Vec<TrackId> {
        // Depth-first traversal for correct rendering order
        let mut order = Vec::new();
        for root_id in &self.root_tracks {
            self.collect_depth_first(*root_id, &mut order);
        }
        order
    }
    
    fn collect_depth_first(&self, id: TrackId, order: &mut Vec<TrackId>) {
        if let Some(TrackNode::Metatrack(meta)) = self.tracks.get(&id) {
            for child_id in &meta.children {
                self.collect_depth_first(*child_id, order);
            }
        }
        order.push(id);
    }
}

8. Buffer Management

struct BufferPool {
    buffers: Vec<Vec<f32>>,
    available: Vec<usize>,
    buffer_size: usize,
}

impl BufferPool {
    fn acquire(&mut self) -> Vec<f32> {
        if let Some(idx) = self.available.pop() {
            let mut buf = std::mem::take(&mut self.buffers[idx]);
            buf.fill(0.0);
            buf
        } else {
            vec![0.0; self.buffer_size]
        }
    }
    
    fn release(&mut self, buffer: Vec<f32>) {
        let idx = self.buffers.len();
        self.buffers.push(buffer);
        self.available.push(idx);
    }
}

Metatracks Architecture

Processing Model

Metatracks use a pre-mix model:

Mix all children into a temporary buffer
Apply metatrack's effects to the mixed buffer
Mix result into parent's output

fn process_metatrack(
    meta: &Metatrack,
    project: &Project,
    output: &mut [f32],
    context: RenderContext,
    buffer_pool: &mut BufferPool,
) {
    // Transform context for children
    let child_context = meta.transform_context(context);
    
    // Acquire scratch buffer
    let mut submix = buffer_pool.acquire();
    submix.resize(output.len(), 0.0);
    
    // Process all children into submix
    for child_id in &meta.children {
        if let Some(child) = project.tracks.get(child_id) {
            process_track_node(
                child,
                project,
                &mut submix,
                child_context,
                buffer_pool
            );
        }
    }
    
    // Apply metatrack's effects
    for effect in &mut meta.effects {
        effect.process(&mut submix, 2, context.sample_rate);
    }
    
    // Mix into output with volume
    for (out, sub) in output.iter_mut().zip(submix.iter()) {
        *out += sub * meta.volume;
    }
    
    // Return buffer to pool
    buffer_pool.release(submix);
}

Time Transformation

Metatracks can manipulate time for all children:

Time Stretch: Speed up or slow down playback
Offset: Shift content in time
Pitch Shift: Transpose content (future feature, requires pitch-preserving time stretch)

Metatrack Operations

enum MetatrackOperation {
    // Creation
    CreateFromSelection(Vec<TrackId>),
    CreateEmpty,
    
    // Hierarchy manipulation
    AddToMetatrack(TrackId, Vec<TrackId>),
    RemoveFromMetatrack(TrackId, Vec<TrackId>),
    MoveToMetatrack { track: TrackId, new_parent: TrackId },
    Ungroup(TrackId),
    Flatten(TrackId),
    
    // Transformation
    SetTimeStretch(TrackId, f32),
    SetOffset(TrackId, f64),
    
    // Rendering
    BounceToAudio(TrackId),
    Freeze(TrackId),
    Unfreeze(TrackId),
}

Nesting Limits

Recommended maximum depth: 10 levels
Reason: Performance and UI complexity
Implementation: Check depth during metatrack creation

Implementation Roadmap

Phase 1: Single Audio File Playback (Week 1-2)

Goal: Play one audio file through speakers

Deliverables:

Basic cpal integration
Load audio file with symphonia
Simple playback loop
Press spacebar to play/pause

Core Implementation:

struct SimpleEngine {
    audio_data: Vec<f32>,
    playhead: usize,
    sample_rate: u32,
    playing: Arc<AtomicBool>,
}

// Main audio callback
fn audio_callback(data: &mut [f32], engine: &mut SimpleEngine) {
    if engine.playing.load(Ordering::Relaxed) {
        let end = (engine.playhead + data.len()).min(engine.audio_data.len());
        let available = end - engine.playhead;
        
        data[..available].copy_from_slice(
            &engine.audio_data[engine.playhead..end]
        );
        
        engine.playhead = end;
    } else {
        data.fill(0.0);
    }
}

Dependencies:

cpal = "0.15"
symphonia = "0.5"

Success Criteria:

Audio plays without clicks or pops
Can start/stop playback
No audio thread panics

Phase 2: Transport Control + UI Communication (Week 2-3)

Goal: Start/stop/seek from a basic UI

Deliverables:

Lock-free command queue
Atomic playhead position
Basic UI (terminal or simple window)
Play/pause/seek controls

Core Implementation:

enum Command {
    Play,
    Stop,
    Seek(f64),
}

struct Engine {
    audio_data: Vec<f32>,
    playhead: Arc<AtomicU64>,
    command_rx: rtrb::Consumer<Command>,
    playing: bool,
    sample_rate: u32,
}

fn audio_callback(data: &mut [f32], engine: &mut Engine) {
    // Process all pending commands
    while let Ok(cmd) = engine.command_rx.pop() {
        match cmd {
            Command::Play => engine.playing = true,
            Command::Stop => {
                engine.playing = false;
                engine.playhead.store(0, Ordering::Relaxed);
            }
            Command::Seek(seconds) => {
                let samples = (seconds * engine.sample_rate as f64) as u64;
                engine.playhead.store(samples, Ordering::Relaxed);
            }
        }
    }
    
    // Render audio...
}

New Dependencies:

rtrb = "0.3" (lock-free ringbuffer)

Success Criteria:

Commands execute within 1 buffer period
No audio glitches during seek
Playhead position updates smoothly

Phase 3: Multiple Audio Tracks (Week 3-4)

Goal: Play multiple audio files simultaneously

Deliverables:

Track data structure
Per-track volume control
Mute/solo functionality
Mix multiple tracks

Core Implementation:

struct Track {
    id: u32,
    audio_data: Vec<f32>,
    volume: f32,
    muted: bool,
    solo: bool,
}

struct Engine {
    tracks: Vec<Track>,
    playhead: Arc<AtomicU64>,
    command_rx: rtrb::Consumer<Command>,
    sample_rate: u32,
    mix_buffer: Vec<f32>,
}

enum Command {
    Play,
    Stop,
    Seek(f64),
    SetTrackVolume(u32, f32),
    SetTrackMute(u32, bool),
    SetTrackSolo(u32, bool),
}

fn audio_callback(data: &mut [f32], engine: &mut Engine) {
    // Process commands...
    
    if engine.playing {
        // Clear mix buffer
        engine.mix_buffer.fill(0.0);
        
        // Check if any track is soloed
        let any_solo = engine.tracks.iter().any(|t| t.solo);
        
        // Mix all active tracks
        for track in &engine.tracks {
            let active = !track.muted && (!any_solo || track.solo);
            if active {
                mix_track(track, &mut engine.mix_buffer, engine.playhead, data.len());
            }
        }
        
        // Copy mix to output
        data.copy_from_slice(&engine.mix_buffer[..data.len()]);
    }
}

fn mix_track(track: &Track, output: &mut [f32], playhead: u64, frames: usize) {
    let start = playhead as usize;
    let end = (start + frames).min(track.audio_data.len());
    
    for (i, sample) in track.audio_data[start..end].iter().enumerate() {
        output[i] += sample * track.volume;
    }
}

Success Criteria:

4+ tracks play simultaneously without distortion
Volume changes are smooth (no clicks)
Solo/mute work correctly
CPU usage remains reasonable

Phase 4: Clips & Timeline (Week 4-5)

Goal: Place audio regions at different positions on timeline

Deliverables:

Clip data structure
Timeline-based playback
Audio pool for shared audio data
Multiple clips per track

Core Implementation:

struct Clip {
    id: u32,
    audio_pool_index: usize,
    start_time: f64,        // Seconds
    duration: f64,
    offset: f64,            // Offset into audio file
    gain: f32,
}

struct Track {
    id: u32,
    clips: Vec<Clip>,
    volume: f32,
    muted: bool,
    solo: bool,
}

struct AudioPool {
    files: Vec<Vec<f32>>,
}

fn render_track(
    track: &Track,
    output: &mut [f32],
    pool: &AudioPool,
    playhead_seconds: f64,
    sample_rate: u32,
    frames: usize,
) {
    for clip in &track.clips {
        let clip_start = clip.start_time;
        let clip_end = clip.start_time + clip.duration;
        
        // Check if clip is active in this time range
        if playhead_seconds < clip_end && 
           playhead_seconds + (frames as f64 / sample_rate as f64) > clip_start {
            
            render_clip(clip, output, pool, playhead_seconds, sample_rate, frames);
        }
    }
}

fn render_clip(
    clip: &Clip,
    output: &mut [f32],
    pool: &AudioPool,
    playhead_seconds: f64,
    sample_rate: u32,
    frames: usize,
) {
    let audio = &pool.files[clip.audio_pool_index];
    
    // Calculate position within clip
    let clip_position = playhead_seconds - clip.start_time + clip.offset;
    let start_sample = (clip_position * sample_rate as f64) as usize;
    
    // Calculate how many samples to copy
    let samples_available = audio.len().saturating_sub(start_sample);
    let samples_to_copy = samples_available.min(output.len());
    
    // Mix into output
    for i in 0..samples_to_copy {
        output[i] += audio[start_sample + i] * clip.gain * clip.volume;
    }
}

Success Criteria:

Clips play at correct timeline positions
Multiple clips per track work correctly
Clips can overlap
Audio pool prevents duplication

UI Requirements:

Basic timeline view
Drag clips to position them
Visual representation of waveforms

Phase 5: Effect Processing (Week 5-6)

Goal: Add gain/pan/simple effects to tracks

Deliverables:

Effect trait
Basic effects (gain, pan, simple EQ)
Per-track effect chain
Effect parameter control

Core Implementation:

trait Effect: Send {
    fn process(&mut self, buffer: &mut [f32], channels: usize, sample_rate: u32);
    fn set_parameter(&mut self, id: u32, value: f32);
    fn get_parameter(&self, id: u32) -> f32;
    fn reset(&mut self);
}

struct GainEffect {
    gain_linear: f32,
}

impl Effect for GainEffect {
    fn process(&mut self, buffer: &mut [f32], _channels: usize, _sample_rate: u32) {
        for sample in buffer.iter_mut() {
            *sample *= self.gain_linear;
        }
    }
    
    fn set_parameter(&mut self, id: u32, value: f32) {
        if id == 0 {  // Gain in dB
            self.gain_linear = 10.0_f32.powf(value / 20.0);
        }
    }
    
    fn get_parameter(&self, id: u32) -> f32 {
        if id == 0 {
            20.0 * self.gain_linear.log10()
        } else {
            0.0
        }
    }
    
    fn reset(&mut self) {}
}

struct Track {
    id: u32,
    clips: Vec<Clip>,
    effects: Vec<Box<dyn Effect>>,
    volume: f32,
    muted: bool,
}

fn render_track(
    track: &mut Track,
    output: &mut [f32],
    // ... other params
) {
    // Render all clips...
    
    // Apply effect chain
    for effect in &mut track.effects {
        effect.process(output, 2, sample_rate);
    }
    
    // Apply track volume
    for sample in output.iter_mut() {
        *sample *= track.volume;
    }
}

Additional Effects to Implement:

struct PanEffect {
    pan: f32,  // -1.0 (left) to 1.0 (right)
}

struct SimpleEQ {
    low_gain: f32,
    mid_gain: f32,
    high_gain: f32,
    low_filter: BiquadFilter,
    high_filter: BiquadFilter,
}

struct BiquadFilter {
    b0: f32, b1: f32, b2: f32,
    a1: f32, a2: f32,
    x1: f32, x2: f32,
    y1: f32, y2: f32,
}

Success Criteria:

Effects process without distortion
Multiple effects can chain
Parameter changes are smooth
No performance degradation

Begin DSP Library:

Basic filters (lowpass, highpass, bandpass)
Utilities (db to linear, frequency to coefficients)

Phase 6: Hierarchical Tracks - Foundation (Week 6-7)

Goal: Introduce track hierarchy (groups) without full metatracks

Deliverables:

TrackNode enum
Group tracks
Recursive rendering
Parent-child relationships

Core Implementation:

enum TrackNode {
    Audio(AudioTrack),
    Group(GroupTrack),
}

struct AudioTrack {
    id: u32,
    clips: Vec<Clip>,
    effects: Vec<Box<dyn Effect>>,
    volume: f32,
    parent: Option<u32>,
}

struct GroupTrack {
    id: u32,
    children: Vec<u32>,
    effects: Vec<Box<dyn Effect>>,
    volume: f32,
}

struct Project {
    tracks: HashMap<u32, TrackNode>,
    root_tracks: Vec<u32>,
    audio_pool: AudioPool,
}

fn render_track_node(
    node_id: u32,
    project: &Project,
    output: &mut [f32],
    context: &RenderContext,
    buffer_pool: &mut BufferPool,
) {
    match &project.tracks[&node_id] {
        TrackNode::Audio(track) => {
            render_audio_track(track, output, &project.audio_pool, context);
        }
        TrackNode::Group(group) => {
            // Get temp buffer from pool
            let mut group_buffer = buffer_pool.acquire();
            group_buffer.resize(output.len(), 0.0);
            
            // Render all children
            for child_id in &group.children {
                render_track_node(
                    *child_id,
                    project,
                    &mut group_buffer,
                    context,
                    buffer_pool
                );
            }
            
            // Apply group effects
            for effect in &mut group.effects {
                effect.process(&mut group_buffer, 2, context.sample_rate);
            }
            
            // Mix into output
            for (out, group) in output.iter_mut().zip(group_buffer.iter()) {
                *out += group * group.volume;
            }
            
            buffer_pool.release(group_buffer);
        }
    }
}

struct RenderContext {
    playhead_seconds: f64,
    sample_rate: u32,
    tempo: f32,
}

Success Criteria:

Can create groups of tracks
Groups can nest (test 3-4 levels)
Effects on groups affect all children
No audio glitches from recursion

Refactoring Required:

Migrate from Vec<Track> to HashMap<TrackId, TrackNode>
Update all track access code
Add parent tracking

Phase 7: MIDI Support (Week 7-8)

Goal: Play MIDI through virtual instruments

Deliverables:

MIDI data structures
MIDI clip rendering
Simple virtual instrument
MIDI track type

Core Implementation:

struct MidiEvent {
    timestamp: u64,     // Sample position
    status: u8,
    data1: u8,          // Note/CC number
    data2: u8,          // Velocity/value
}

struct MidiClip {
    id: u32,
    events: Vec<MidiEvent>,
    start_time: f64,
    duration: f64,
}

struct MidiTrack {
    id: u32,
    clips: Vec<MidiClip>,
    instrument: Box<dyn Effect>,  // Synth as effect
    effects: Vec<Box<dyn Effect>>,
    volume: f32,
    parent: Option<u32>,
}

enum TrackNode {
    Audio(AudioTrack),
    Midi(MidiTrack),
    Group(GroupTrack),
}

// Simple sine wave synth for testing
struct SimpleSynth {
    voices: Vec<SynthVoice>,
    sample_rate: f32,
}

struct SynthVoice {
    active: bool,
    note: u8,
    velocity: u8,
    phase: f32,
    frequency: f32,
}

impl Effect for SimpleSynth {
    fn process(&mut self, buffer: &mut [f32], channels: usize, sample_rate: u32) {
        // Process active voices
        for voice in &mut self.voices {
            if voice.active {
                for frame in buffer.chunks_mut(channels) {
                    let sample = (voice.phase * 2.0 * PI).sin() 
                        * (voice.velocity as f32 / 127.0) * 0.3;
                    
                    for channel in frame.iter_mut() {
                        *channel += sample;
                    }
                    
                    voice.phase += voice.frequency / sample_rate as f32;
                    if voice.phase >= 1.0 {
                        voice.phase -= 1.0;
                    }
                }
            }
        }
    }
    
    // Handle MIDI events via parameters
    fn set_parameter(&mut self, id: u32, value: f32) {
        match id {
            0 => self.note_on(value as u8, 100),  // Note on
            1 => self.note_off(value as u8),       // Note off
            _ => {}
        }
    }
}

fn render_midi_track(
    track: &mut MidiTrack,
    output: &mut [f32],
    context: &RenderContext,
    frames: usize,
) {
    // Collect MIDI events for this render period
    let mut events_to_process = Vec::new();
    
    for clip in &track.clips {
        collect_events_in_range(
            clip,
            context.playhead_seconds,
            frames,
            context.sample_rate,
            &mut events_to_process
        );
    }
    
    // Sort by timestamp
    events_to_process.sort_by_key(|e| e.timestamp);
    
    // Process events through instrument
    for event in events_to_process {
        handle_midi_event(&mut track.instrument, event);
    }
    
    // Generate audio
    track.instrument.process(output, 2, context.sample_rate);
    
    // Apply effect chain
    for effect in &mut track.effects {
        effect.process(output, 2, context.sample_rate);
    }
}

Success Criteria:

Can load and play MIDI files
Notes trigger at correct times
Polyphony works (4+ notes)
Timing is sample-accurate

Dependencies:

midly = "0.5" (for MIDI file parsing)

Phase 8: Full Metatracks (Week 8-9)

Goal: Add time transformation and metatrack-specific features

Deliverables:

Metatrack type with transformations
Time stretch functionality
Offset capability
Transform context propagation

Core Implementation:

struct Metatrack {
    id: u32,
    children: Vec<u32>,
    effects: Vec<Box<dyn Effect>>,
    
    // Transformation parameters
    time_stretch: f32,      // 0.5 = half speed, 2.0 = double speed
    pitch_shift: f32,       // Semitones (future feature)
    offset: f64,            // Time offset in seconds
    
    volume: f32,
    parent: Option<u32>,
}

enum TrackNode {
    Audio(AudioTrack),
    Midi(MidiTrack),
    Metatrack(Metatrack),
    Group(GroupTrack),
}

struct RenderContext {
    global_position: u64,   // Absolute sample position
    local_position: u64,    // Position within current scope
    sample_rate: u32,
    tempo: f32,
    time_signature: (u32, u32),
    time_stretch: f32,      // Accumulated stretch
}

impl Metatrack {
    fn transform_context(&self, ctx: RenderContext) -> RenderContext {
        let offset_samples = (self.offset * ctx.sample_rate as f64) as u64;
        
        let adjusted_position = ctx.local_position.saturating_sub(offset_samples);
        let stretched_position = (adjusted_position as f64 / self.time_stretch as f64) as u64;
        
        RenderContext {
            global_position: ctx.global_position,
            local_position: stretched_position,
            sample_rate: ctx.sample_rate,
            tempo: ctx.tempo * self.time_stretch,
            time_signature: ctx.time_signature,
            time_stretch: ctx.time_stretch * self.time_stretch,
        }
    }
}

fn render_metatrack(
    meta: &Metatrack,
    project: &Project,
    output: &mut [f32],
    context: RenderContext,
    buffer_pool: &mut BufferPool,
) {
    // Transform context for children
    let child_context = meta.transform_context(context);
    
    // Acquire buffer for submix
    let mut submix = buffer_pool.acquire();
    submix.resize(output.len(), 0.0);
    
    // Render all children with transformed context
    for child_id in &meta.children {
        if let Some(child) = project.tracks.get(child_id) {
            render_track_node(
                *child_id,
                project,
                &mut submix,
                child_context,
                buffer_pool
            );
        }
    }
    
    // Apply metatrack effects
    for effect in &mut meta.effects {
        effect.process(&mut submix, 2, context.sample_rate);
    }
    
    // Mix into output
    for (out, sub) in output.iter_mut().zip(submix.iter()) {
        *out += sub * meta.volume;
    }
    
    buffer_pool.release(submix);
}

Metatrack Operations:

impl Project {
    fn create_metatrack_from_selection(&mut self, track_ids: Vec<u32>) -> u32 {
        let metatrack_id = self.next_id();
        
        let metatrack = Metatrack {
            id: metatrack_id,
            children: track_ids.clone(),
            effects: Vec::new(),
            time_stretch: 1.0,
            pitch_shift: 0.0,
            offset: 0.0,
            volume: 1.0,
            parent: None,
        };
        
        // Update parent references
        for track_id in track_ids {
            if let Some(track) = self.tracks.get_mut(&track_id) {
                set_parent(track, Some(metatrack_id));
            }
        }
        
        self.tracks.insert(metatrack_id, TrackNode::Metatrack(metatrack));
        self.root_tracks.push(metatrack_id);
        
        metatrack_id
    }
    
    fn ungroup_metatrack(&mut self, metatrack_id: u32) {
        if let Some(TrackNode::Metatrack(meta)) = self.tracks.get(&metatrack_id) {
            let children = meta.children.clone();
            
            // Remove parent from children
            for child_id in children {
                if let Some(track) = self.tracks.get_mut(&child_id) {
                    set_parent(track, None);
                }
                self.root_tracks.push(child_id);
            }
            
            // Remove metatrack
            self.tracks.remove(&metatrack_id);
            self.root_tracks.retain(|&id| id != metatrack_id);
        }
    }
}

Success Criteria:

Can create metatracks from track selection
Time stretch affects all children
Offset shifts children in time
Can nest metatracks 5+ levels deep
Performance remains acceptable

Phase 9: Polish & Optimization (Week 9-11)

9a. Buffer Pool Optimization (Week 9)

Goal: Eliminate allocations in audio thread

struct BufferPool {
    buffers: Vec<Vec<f32>>,
    available: Vec<usize>,
    buffer_size: usize,
    total_allocations: AtomicUsize,
}

impl BufferPool {
    fn new(count: usize, size: usize) -> Self {
        let mut buffers = Vec::with_capacity(count);
        let mut available = Vec::with_capacity(count);
        
        for i in 0..count {
            buffers.push(vec![0.0; size]);
            available.push(i);
        }
        
        BufferPool {
            buffers,
            available,
            buffer_size: size,
            total_allocations: AtomicUsize::new(0),
        }
    }
    
    fn acquire(&mut self) -> Vec<f32> {
        if let Some(idx) = self.available.pop() {
            let mut buf = std::mem::take(&mut self.buffers[idx]);
            buf.fill(0.0);
            buf
        } else {
            self.total_allocations.fetch_add(1, Ordering::Relaxed);
            vec![0.0; self.buffer_size]
        }
    }
    
    fn release(&mut self, buffer: Vec<f32>) {
        if buffer.len() == self.buffer_size {
            let idx = self.buffers.len();
            self.buffers.push(buffer);
            self.available.push(idx);
        }
    }
}

Success Criteria:

Zero allocations during steady-state playback
Pool size auto-adjusts to actual usage
Metrics show allocation count

9b. Lock-Free State Updates (Week 10)

Goal: Replace Mutex with triple-buffering for project state

struct TripleBuffer<T> {
    buffers: [T; 3],
    write_idx: AtomicUsize,
    read_idx: AtomicUsize,
}

impl<T: Clone> TripleBuffer<T> {
    fn new(initial: T) -> Self {
        TripleBuffer {
            buffers: [initial.clone(), initial.clone(), initial],
            write_idx: AtomicUsize::new(0),
            read_idx: AtomicUsize::new(0),
        }
    }
    
    // Called from control thread
    fn write(&mut self, value: T) {
        let write_idx = self.write_idx.load(Ordering::Acquire);
        let next_idx = (write_idx + 1) % 3;
        
        self.buffers[next_idx] = value;
        self.write_idx.store(next_idx, Ordering::Release);
    }
    
    // Called from audio thread
    fn read(&self) -> &T {
        let read_idx = self.read_idx.load(Ordering::Acquire);
        let write_idx = self.write_idx.load(Ordering::Acquire);
        
        if read_idx != write_idx {
            self.read_idx.store(write_idx, Ordering::Release);
        }
        
        &self.buffers[read_idx]
    }
}

Success Criteria:

No locks in audio thread
State updates propagate within 1-2 buffers
No audio glitches during updates

9c. Disk Streaming (Week 11)

Goal: Stream large audio files that don't fit in RAM

struct StreamingFile {
    id: FileId,
    path: PathBuf,
    channels: u32,
    sample_rate: u32,
    total_frames: u64,
    
    // Streaming state
    buffer_rx: rtrb::Consumer<AudioChunk>,
    current_chunk: Option<AudioChunk>,
    chunk_offset: usize,
}

struct AudioChunk {
    start_frame: u64,
    data: Vec<f32>,
}

// Background streaming thread
fn streaming_thread(
    file_path: PathBuf,
    request_rx: Receiver<StreamRequest>,
    chunk_tx: rtrb::Producer<AudioChunk>,
) {
    let mut file = File::open(file_path).unwrap();
    let mut decoder = /* create decoder */;
    
    loop {
        if let Ok(request) = request_rx.try_recv() {
            // Seek to requested position
            decoder.seek(request.frame);
        }
        
        // Read chunk
        let chunk = decoder.read_frames(CHUNK_SIZE);
        
        // Send to audio thread
        let _ = chunk_tx.push(AudioChunk {
            start_frame: current_frame,
            data: chunk,
        });
        
        current_frame += CHUNK_SIZE;
    }
}

Success Criteria:

Can play files larger than RAM
No dropouts during streaming
Seek works smoothly
Multiple streaming files simultaneously

Phase 10: Advanced Features (Week 11+)

Automation

struct AutomationLane {
    parameter_id: u32,
    points: Vec<AutomationPoint>,
}

struct AutomationPoint {
    time: f64,
    value: f32,
    curve: CurveType,
}

enum CurveType {
    Linear,
    Exponential,
    SCurve,
}

Plugin Hosting (VST/CLAP)

struct PluginHost {
    scanner: PluginScanner,
    instances: HashMap<PluginId, PluginInstance>,
}

struct PluginInstance {
    id: PluginId,
    plugin_type: PluginType,
    handle: *mut c_void,
    parameters: Vec<Parameter>,
}

enum PluginType {
    VST3,
    CLAP,
}

Project Save/Load

#[derive(Serialize, Deserialize)]
struct ProjectFile {
    version: String,
    metadata: ProjectMetadata,
    tracks: Vec<SerializedTrack>,
    audio_files: Vec<AudioFileReference>,
    tempo_map: TempoMap,
}

Undo/Redo System

trait Command {
    fn execute(&mut self, project: &mut Project);
    fn undo(&mut self, project: &mut Project);
}

struct CommandHistory {
    commands: Vec<Box<dyn Command>>,
    position: usize,
}

Technical Specifications

Performance Targets

Latency: < 10ms (varies by buffer size and sample rate)
CPU Usage: < 50% for 32 tracks with effects at 44.1kHz
Track Count: Support 64+ tracks without performance degradation
Nesting Depth: 10 levels of metatrack nesting
Plugin Count: 8+ plugins per track

Buffer Sizes

Audio Callback: 128-512 frames (adjustable)
Streaming Chunk: 8192 frames
Ring Buffer: 8192 samples (UI → Audio commands)

Sample Rates

Supported: 44.1kHz, 48kHz, 88.2kHz, 96kHz
Default: 48kHz
Internal Processing: Always at project sample rate

Memory Budget

Audio Pool Cache: 1GB default, configurable
Buffer Pool: 50 buffers × 4096 samples × 4 bytes = 800KB
Per-Track Overhead: < 1KB

Thread Model

UI Thread: User interaction, visualization
Control Thread: Project state management, file I/O
Audio Thread: Real-time processing (cpal callback)
Streaming Thread(s): Disk I/O for large files

Data Format

Internal Audio: 32-bit float, interleaved
File Support: WAV, FLAC, MP3, OGG via symphonia
MIDI: Standard MIDI File Format
Project Files: JSON or MessagePack

Testing Strategy

Unit Tests

#[cfg(test)]
mod tests {
    use super::*;
    
    #[test]
    fn test_clip_rendering_at_position() {
        let clip = Clip {
            id: 0,
            audio_pool_index: 0,
            start_time: 1.0,  // Starts at 1 second
            duration: 2.0,
            offset: 0.0,
            gain: 1.0,
        };
        
        let pool = AudioPool {
            files: vec![vec![1.0; 96000]],  // 2 seconds at 48kHz
        };
        
        let mut output = vec![0.0; 4800];  // 0.1 seconds
        
        render_clip(&clip, &mut output, &pool, 1.5, 48000, 4800);
        
        assert!(output.iter().any(|&s| s != 0.0));
    }
    
    #[test]
    fn test_metatrack_time_stretch() {
        let mut context = RenderContext {
            global_position: 48000,
            local_position: 48000,
            sample_rate: 48000,
            tempo: 120.0,
            time_signature: (4, 4),
            time_stretch: 1.0,
        };
        
        let metatrack = Metatrack {
            id: 0,
            children: vec![],
            effects: vec![],
            time_stretch: 0.5,  // Half speed
            pitch_shift: 0.0,
            offset: 0.0,
            volume: 1.0,
            parent: None,
        };
        
        let child_context = metatrack.transform_context(context);
        
        assert_eq!(child_context.time_stretch, 0.5);
        assert_eq!(child_context.tempo, 60.0);
    }
    
    #[test]
    fn test_buffer_pool_no_allocations() {
        let mut pool = BufferPool::new(10, 1024);
        
        let buf1 = pool.acquire();
        let buf2 = pool.acquire();
        
        pool.release(buf1);
        pool.release(buf2);
        
        let buf3 = pool.acquire();
        let buf4 = pool.acquire();
        
        // Should reuse buffers, no new allocations
        assert_eq!(pool.total_allocations.load(Ordering::Relaxed), 0);
    }
}

Integration Tests

#[test]
fn test_full_playback_pipeline() {
    // Setup
    let (cmd_tx, cmd_rx) = rtrb::RingBuffer::new(256);
    let (evt_tx, evt_rx) = rtrb::RingBuffer::new(256);
    
    let mut engine = AudioEngine::new(48000, cmd_rx, evt_tx);
    engine.load_audio_file("test.wav");
    
    // Start playback
    cmd_tx.push(Command::Play).unwrap();
    
    // Render some audio
    let mut output = vec![0.0; 4800];
    engine.process(&mut output);
    
    // Verify audio was rendered
    assert!(output.iter().any(|&s| s.abs() > 0.001));
    
    // Check events
    if let Ok(AudioEvent::PlaybackPosition(pos)) = evt_rx.pop() {
        assert!(pos > 0.0);
    }
}

#[test]
fn test_nested_metatrack_rendering() {
    let mut project = Project::new(48000);
    
    // Create structure: Metatrack1 -> Metatrack2 -> AudioTrack
    let audio_id = project.add_audio_track();
    let meta2_id = project.create_metatrack_from_selection(vec![audio_id]);
    let meta1_id = project.create_metatrack_from_selection(vec![meta2_id]);
    
    // Apply transformations
    project.set_metatrack_time_stretch(meta1_id, 0.5);
    project.set_metatrack_time_stretch(meta2_id, 2.0);
    
    // Render
    let mut output = vec![0.0; 4800];
    project.render(&mut output, 0.0);
    
    // Effective stretch should be 0.5 * 2.0 = 1.0 (normal speed)
    assert!(output.iter().any(|&s| s != 0.0));
}

Performance Tests

#[bench]
fn bench_render_32_tracks(b: &mut Bencher) {
    let mut engine = create_engine_with_tracks(32);
    let mut output = vec![0.0; 512 * 2];
    
    b.iter(|| {
        engine.process(&mut output);
    });
}

#[bench]
fn bench_metatrack_nesting_10_levels(b: &mut Bencher) {
    let mut project = create_nested_metatracks(10);
    let mut output = vec![0.0; 512 * 2];
    
    b.iter(|| {
        project.render(&mut output, 0.0);
    });
}

Audio Quality Tests

THD+N: Total Harmonic Distortion + Noise < 0.01%
Frequency Response: Flat ±0.1dB 20Hz-20kHz
Click Detection: No clicks during parameter changes
Timing Accuracy: MIDI events within ±1 sample

Stress Tests

Long Sessions: 8+ hours continuous playback
Many Tracks: 128 tracks, 8 effects each
Deep Nesting: 20 levels of metatrack nesting
Rapid Commands: 1000 commands/second
Large Files: 1GB+ audio files streaming

Recommended Crates

Core Audio

cpal = "0.15" - Audio I/O
symphonia = "0.5" - Audio decoding
rubato = "0.14" - Sample rate conversion

Concurrency

rtrb = "0.3" - Lock-free ring buffers
crossbeam = "0.8" - Additional concurrency tools
parking_lot = "0.12" - Better mutexes (non-realtime)

DSP

realfft = "3.3" - FFT for spectral processing
biquad = "0.4" - IIR filters

Serialization

serde = { version = "1.0", features = ["derive"] }
serde_json = "1.0" or rmp-serde = "1.1" (MessagePack)

File I/O

midly = "0.5" - MIDI file parsing
hound = "3.5" - WAV file writing

Future

vst3-sys or clack - Plugin hosting
egui = "0.24" - Immediate mode GUI (if building UI in Rust)

Project Structure

daw-backend/
├── Cargo.toml
├── src/
│   ├── main.rs
│   ├── lib.rs
│   │
│   ├── audio/
│   │   ├── mod.rs
│   │   ├── engine.rs          # Audio engine, main processing loop
│   │   ├── track.rs           # Track types (Audio, MIDI, Metatrack)
│   │   ├── clip.rs            # Clip management
│   │   ├── pool.rs            # Audio pool
│   │   ├── buffer_pool.rs     # Buffer allocation pool
│   │   └── render.rs          # Rendering functions
│   │
│   ├── project/
│   │   ├── mod.rs
│   │   ├── project.rs         # Project state
│   │   ├── hierarchy.rs       # Track hierarchy management
│   │   ├── operations.rs      # Project operations (add track, etc.)
│   │   └── serialization.rs   # Save/load
│   │
│   ├── effects/
│   │   ├── mod.rs
│   │   ├── trait.rs           # Effect trait
│   │   ├── gain.rs
│   │   ├── pan.rs
│   │   ├── eq.rs
│   │   └── synth.rs           # Simple synth for MIDI
│   │
│   ├── dsp/
│   │   ├── mod.rs
│   │   ├── filters.rs         # Biquad, etc.
│   │   ├── envelope.rs        # ADSR
│   │   ├── oscillator.rs
│   │   └── utils.rs           # DB conversion, etc.
│   │
│   ├── io/
│   │   ├── mod.rs
│   │   ├── audio_file.rs      # Audio file loading
│   │   ├── midi_file.rs       # MIDI file loading
│   │   └── streaming.rs       # Disk streaming
│   │
│   ├── command/
│   │   ├── mod.rs
│   │   ├── types.rs           # Command/Event enums
│   │   └── queue.rs           # Command queue management
│   │
│   └── ui/
│       ├── mod.rs
│       └── bridge.rs          # UI-Audio communication
│
├── tests/
│   ├── integration_tests.rs
│   └── audio_quality_tests.rs
│
└── benches/
    └── performance.rs

Conclusion

This architecture provides:

Incremental Development: Each phase builds on the last without requiring rewrites
Real-time Safety: Lock-free, allocation-free audio thread from the start
Flexibility: Hierarchical tracks support simple projects and complex arrangements
Scalability: Architecture handles 64+ tracks with deep nesting
Extensibility: Effect trait makes plugin hosting straightforward

The roadmap gets you from "hello audio" to a full-featured DAW in 11 weeks, with each phase delivering working, testable functionality.

Next Steps:

Set up Rust project with cpal
Implement Phase 1 (single file playback)
Add comprehensive tests
Profile and optimize as needed
Continue through phases sequentially

Good luck building your DAW!

43 KiB Raw Blame History Unescape Escape

DAW Backend Architecture & Implementation Roadmap

Table of Contents

Architecture Overview

High-Level Design

Design Principles

Core Components

1. Audio Engine (Real-time Thread)

2. Track Hierarchy

3. Clips and Regions

4. Audio Pool

5. Effect System

6. Render Context

7. Project State

8. Buffer Management

Metatracks Architecture

Processing Model

Time Transformation

Metatrack Operations

Nesting Limits

Implementation Roadmap

Phase 1: Single Audio File Playback (Week 1-2)

Phase 2: Transport Control + UI Communication (Week 2-3)

Phase 3: Multiple Audio Tracks (Week 3-4)

Phase 4: Clips & Timeline (Week 4-5)

Phase 5: Effect Processing (Week 5-6)

Phase 6: Hierarchical Tracks - Foundation (Week 6-7)

Phase 7: MIDI Support (Week 7-8)

Phase 8: Full Metatracks (Week 8-9)

Phase 9: Polish & Optimization (Week 9-11)

9a. Buffer Pool Optimization (Week 9)

9b. Lock-Free State Updates (Week 10)

9c. Disk Streaming (Week 11)

Phase 10: Advanced Features (Week 11+)

Automation

Plugin Hosting (VST/CLAP)

Project Save/Load

Undo/Redo System

Technical Specifications

Performance Targets

Buffer Sizes

Sample Rates

Memory Budget

Thread Model

Data Format

Testing Strategy

Unit Tests

Integration Tests

Performance Tests

Audio Quality Tests

Stress Tests

Recommended Crates

Core Audio

Concurrency

DSP

Serialization

File I/O

Future

Project Structure

Conclusion

43 KiB

Raw Blame History