Songbird’s dev script wraps four macOS profilers behind one flag surface (./utils/build-rs --<profiler>). Each cuts the stack at a different point and answers a different question. Pick the cheapest one that can fail in a way you’d care about.
| Profiler | Question it answers | Overhead | Output |
| --- | --- | --- | --- |
| samply | “Which Rust function burns CPU?” | ~1–3% | Firefox Profiler in browser |
| dhat | “Which Rust call allocates the most?” | high (allocator wrap) | dhat-heap.json |
| Instruments Time Profiler | “Same as samply, plus correlate with system events” | low | .trace bundle |
| Instruments Allocations | “Where’s our RSS going, including WebKit / GPU buffers?” | medium | .trace bundle |
| Instruments Leaks | “Are we leaking objects over time?” | medium | .trace bundle |
| Instruments Metal System Trace | “Why is the GPU stalling / dropping frames?” | medium | .trace bundle |
All profilers imply --release — debug builds have wildly different perf characteristics, so profiling them measures the wrong code. rust/Cargo.toml’s [profile.release] keeps line-tables-only debug info and uses packed dSYM so function names show up in every tool. Traces land in ./.profiling/ (gitignored) with timestamped filenames so repeated runs don’t clobber each other.

Picking the right tool

What's the question?
├─ "CPU is hot, where?"
│     → samply (lightest) or Instruments Time Profiler (more detail)

├─ "RSS is huge, where?"
│     → dhat for Rust-only heap; Instruments Allocations for full process
│       (Rust + WebKit + GPU buffers + dylibs)

├─ "Something is leaking over time"
│     → Instruments Leaks

├─ "Frame rate / GPU is bad"
│     → Instruments Metal System Trace

└─ "I just want to feel the release build"
      → --release (no wrapper)

CPU sampling: samply

./utils/build-rs --samply
Wraps the app in samply record. Records a CPU sampling profile and opens it in the Firefox Profiler when the app exits. Low overhead: fine for “feel the app while profiling” sessions, including interactive UI work.

Install once: cargo install samply.

When to reach for it: the question is “which Rust function is hot?” and you don’t need system-level correlation. Faster turnaround than Instruments because there’s no .trace bundle to open.

Heap allocations: dhat

./utils/build-rs --dhat
Builds with the dhat-heap Cargo feature, which swaps the global allocator for dhat::Alloc. Every Rust allocation gets a backtrace. On clean exit, dhat writes dhat-heap.json to the CWD.

Important: quit via Cmd-Q or File → Quit, not Ctrl-C. dhat writes its output in a Drop impl that only fires on a clean shutdown. Ctrl-C kills the process before drop runs and you get no data.

View the result by opening dh_view.html and loading dhat-heap.json.

When to reach for it: “which Rust function allocates the most” or “why does our memory grow over time.” It doesn’t see WebKit / GPU memory; Instruments Allocations is the right tool for a full-process view.

Performance note: dhat slows the program significantly (allocator overhead with per-alloc backtrace). Use it for capture workflows, not for “running normally with profiling on.”
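Because dhat always writes to the current directory, a capture is easy to overwrite on the next run. A minimal sketch of archiving it next to the other traces; stash_dhat is a hypothetical helper, not something build-rs provides, and the timestamp format simply mirrors the one used for .trace files on this page:

```shell
# Hypothetical helper (not part of build-rs): move the latest dhat capture
# from the current directory into .profiling/ under a timestamped name.
stash_dhat() {
  src="dhat-heap.json"
  # An empty or missing file usually means the app was not quit cleanly.
  if [ ! -s "$src" ]; then
    echo "no $src in $(pwd); was the app quit cleanly?" >&2
    return 1
  fi
  mkdir -p .profiling
  dest=".profiling/dhat-heap-$(date +%Y%m%d-%H%M%S).json"
  # Print the new location so it can be pasted into the viewer's file picker.
  mv "$src" "$dest" && echo "$dest"
}
```

Run stash_dhat from the directory you launched the app in, after quitting with Cmd-Q. It returns non-zero and leaves everything untouched when there is no capture to move.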

Instruments — Time Profiler

./utils/build-rs --time-profiler
# or equivalently
./utils/build-rs --instruments "Time Profiler"
Same CPU sampling idea as samply, but recorded via Apple’s xctrace into a .trace bundle that opens in Instruments.app. Strengths over samply: native macOS tooling, correlates with thread state, system calls, and (when combined in a custom template) GPU work. Requires full Xcode (not just Command Line Tools) for xctrace.

Instruments — Allocations

./utils/build-rs --allocations
Tracks every allocation in the process — Rust heap, WebKit, GPU buffers, dylibs, everything. Use this when total RSS is suspicious and you need to see across the language boundary. If the answer turns out to live in Rust, drop back to dhat: lighter to use, more precise call stacks, and viewable in a browser tab.

Instruments — Leaks

./utils/build-rs --leaks
Periodic leak detection: takes heap snapshots at intervals and reports allocations that nothing references anymore. Best for “RSS slowly creeps up” rather than “RSS is high right now.”

Instruments — Metal System Trace

./utils/build-rs --metal-system-trace
Records a structured timeline of every Metal event: API calls, command-buffer encoding on the CPU, GPU execution of those buffers on parallel tracks, shader compilation, stalls. Combined with the Time Profiler track in a custom template, lets you correlate “GPU stall happened because this Rust function blocked the CPU thread.” Songbird’s arrangement view renders through WebGPU → Metal via the WebKit GPU process, so this is the tool for any react_ui/src/lib/gl/ or react_ui/src/components/panels/arrangement/gl/ work that needs more signal than “feels janky.”

Custom Instruments templates

--instruments TPL accepts any template name from xctrace list templates. Names are case- and space-sensitive ("Animation Hitches", not "animation-hitches"). You can also build your own template in Instruments.app: File → New → Blank, drag in the instruments you want (Time Profiler + Allocations + Metal System Trace + Thread State, say), save it, then pass its name to --instruments. One process, one capture, multiple correlated tracks on a unified timeline.
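Since names must match exactly, it can help to check a name against the installed list before recording. A small sketch, assuming xctrace list templates prints one template name per line; has_template is a hypothetical helper, not part of build-rs:

```shell
# Hypothetical helper: read a template list on stdin and succeed only if
# "$1" matches one line exactly (fixed string, whole line, case-sensitive).
has_template() {
  grep -Fxq -- "$1"
}

# Example use on a machine with full Xcode installed:
#   xcrun xctrace list templates | has_template "Animation Hitches" \
#     || echo "unknown template name" >&2
```

The -F and -x flags make grep compare the whole line literally, so "animation-hitches" or "time profiler" are rejected rather than fuzzily accepted.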

Attaching to a running app

The --instruments flag launches the app under xctrace. If Instruments.app refuses to open the resulting trace (“missing template” or similar), record by attaching to an already-running process instead:
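./utils/build-rs --release &
# The build can take a while: wait until the app process exists before attaching.
until pid=$(pgrep -n songbird); do sleep 1; done
mkdir -p .profiling
xctrace record --template "Time Profiler" \
  --attach "$pid" \
  --output ".profiling/profile-$(date +%Y%m%d-%H%M%S).trace"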
This skips the template-resolution step that fails on some Xcode installs and is also the right approach when you want to start a recording mid-session rather than at launch.

Combining profilers

The script enforces one heavyweight profiler at a time — --samply, --dhat, and --instruments (and its shortcuts) are mutually exclusive in the launch command. Forcing two doesn’t give better data:
  • Two CPU samplers interfere with each other’s sampling cadence.
  • Two heap trackers fight over the malloc hook.
  • dhat + anything else dominates the trace because dhat slows allocation 10–100×; the “hot function” you see is the allocator.
When you genuinely want multiple correlated tracks, use a custom Instruments template (above) — that’s one capture process recording several signals natively.
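The exclusivity rule amounts to counting profiler flags before launching anything. A rough sketch of such a check, not the script's actual code: the function names are hypothetical, and the flag list is just the set of profiler flags mentioned on this page.

```shell
# Hypothetical sketch; build-rs's real argument parsing may look different.
# Count how many profiler-selecting flags appear in the argument list.
count_profilers() {
  n=0
  for arg in "$@"; do
    case "$arg" in
      --samply|--dhat|--instruments|--time-profiler|--allocations|--leaks|--metal-system-trace)
        n=$((n + 1)) ;;
    esac
  done
  echo "$n"
}

# Fail when more than one profiler was requested.
check_profilers() {
  if [ "$(count_profilers "$@")" -gt 1 ]; then
    echo "error: pick one profiler at a time" >&2
    return 1
  fi
}
```

Here check_profilers --instruments "Time Profiler" passes, because the template name is an argument rather than a second flag, while check_profilers --samply --dhat is rejected. Flags such as --release and --metal-hud are not counted.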

The Metal HUD

./utils/build-rs --release --metal-hud
Sets MTL_HUD_ENABLED=1 (and __XPC_MTL_HUD_ENABLED=1 for WebKit’s out-of-process GPU service) so macOS draws its live performance HUD in the corner of any window backed by a CAMetalLayer. Near-zero overhead, no recording.

Caveat: the HUD often doesn’t draw for Tauri apps. It attaches to processes that own a CAMetalLayer directly. Tauri renders the WebView through WebKit’s GPU process, which composites back into the main window as an IOSurface-backed layer. The HUD may draw inside the GPU process but stay invisible to you, or skip drawing entirely. Chrome and Electron architecturally own their layer and work fine; we don’t.

If the HUD doesn’t show up after --metal-hud, that’s expected. Fall back to --metal-system-trace: it doesn’t depend on CAMetalLayer ownership and gives strictly more information.

Troubleshooting

“Missing template” when opening a trace. Instruments.app shows this when the .trace bundle references a template that isn’t installed on the machine opening it: the template name was misspelled, or the trace came from a newer Xcode. Pass an exact name from xctrace list templates. The default "Time Profiler" ships with every Xcode and is the safe fallback. If you still get the error, use the attach-mode recipe above; it uses your local Instruments install’s template definitions instead of resolving one out of the bundle.

Profile shows hex addresses instead of function names. This means the release build was stripped of debug info. rust/Cargo.toml’s [profile.release] block already sets debug = "line-tables-only" and split-debuginfo = "packed", so this shouldn’t happen. If it does, double-check you didn’t override the profile in a local Cargo.toml. The “packed” setting matters: Cargo’s default on macOS, “unpacked”, leaves debug info in the per-object .o files, which samply can’t reliably follow. Packed runs dsymutil at link time to consolidate it into a single <bin>.dSYM bundle that every macOS profiling tool finds automatically.

dhat-heap.json is empty or missing. You almost certainly killed the app with Ctrl-C instead of Cmd-Q. The output is written from a Drop impl, which only runs on clean shutdown. Re-run, then quit the app cleanly.

Instruments crashes on open. Rare but it happens; usually a stale Xcode install. Try attach-mode recording instead. It produces a trace bundle from your current Instruments install rather than relying on whatever xctrace baked into the previous bundle.

Output cleanup

Traces accumulate fast. The .profiling/ directory is gitignored, so it won’t bloat the repo, but each Instruments .trace bundle can be hundreds of MB. Wipe them periodically:
rm -rf .profiling/
dhat-heap.json writes to CWD, not .profiling/, since dhat doesn’t know about our directory convention. Delete by hand or find . -name dhat-heap.json -delete.
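When wiping everything is too blunt, an age-based prune keeps recent captures around. A sketch, assuming captures sit directly under .profiling/; prune_traces is a hypothetical helper and the 7-day default is arbitrary:

```shell
# Hypothetical helper: delete files and .trace bundles directly under
# .profiling/ whose modification time is more than N days old (default 7).
prune_traces() {
  days="${1:-7}"
  # Nothing to do if no capture has been recorded yet.
  [ -d .profiling ] || return 0
  # -maxdepth 1 treats each .trace bundle (a directory) as a single unit.
  find .profiling -mindepth 1 -maxdepth 1 -mtime "+$days" -exec rm -rf {} +
}
```

Call prune_traces from the repo root, or prune_traces 30 to keep a month of captures. It works with both the BSD find shipped on macOS and GNU find.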