Linux Audio FAQ
Linux audio questions tend to follow patterns. The same problems surface year after year, across different distributions, different hardware, and different experience levels. This FAQ compiles the questions that actually get asked repeatedly in technical discussion, not the hypothetical ones that documentation writers think people might ask. I cover why identical hardware can behave completely differently across two Linux installs, how to pick sample rate and buffer size without guessing, what the difference is between desktop audio defaults and production settings, which components fail first under real load, and what diagnostic steps to run before concluding your driver is broken. These answers draw on years of mailing list discussion and hands-on debugging. For deeper treatment of specific topics, follow the links into the LAD resource pages and the audio quality guide.
Why Does Linux Audio Work Perfectly on One Machine and Terribly on Another?
This is probably the most common frustration. Someone sets up a recording rig on a ThinkPad and everything works flawlessly at 128 samples. They try the same distribution, same interface, same configuration on a newer Dell, and XRUNs appear within minutes. The hardware is "better" by every specification. Yet the audio is worse.
The answer is almost always one of three things: power management firmware, interrupt routing, or a single misbehaving device driver. Modern laptops and desktops have aggressive power saving features baked into their firmware. C-states allow CPU cores to enter deep sleep modes that save power but take microseconds to wake from. Those microseconds are invisible to a web browser but catastrophic to an audio callback that must complete within a tight deadline. One machine's BIOS might allow you to disable deep C-states. Another's might not expose the option at all.
Interrupt routing matters because the chipset decides how hardware events get delivered to CPU cores. Some systems route USB interrupts through a shared interrupt line with storage controllers, meaning a disk write can delay an audio buffer callback. Others isolate them cleanly. You cannot tell from the spec sheet. You can only tell from /proc/interrupts and from testing.
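A quick way to see what shares an interrupt line with your audio path — a sketch, assuming a USB interface driven through an xhci host controller; the grep pattern is illustrative and should be adjusted for your hardware:

```shell
# Column 1 of /proc/interrupts is the IRQ number; the trailing fields name
# every driver registered on that line. An audio-carrying controller line
# that also lists a storage or network driver is a red flag for timing
# interference.
grep -E 'usb|xhci|snd' /proc/interrupts || echo "no matching IRQ lines found"
```

Re-run it while reproducing the problem: a shared line whose counter jumps in step with your XRUNs is worth investigating.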
The third cause, a misbehaving driver, is subtler. A GPU driver that holds a kernel lock for 2 ms during a display mode change will cause an XRUN on a tight audio setup. A WiFi driver that runs a firmware scan with interrupts disabled will do the same. These are not audio driver bugs. They are unrelated drivers that happen to interfere with the timing guarantees audio work requires. Disabling WiFi and running a basic GPU configuration eliminates two of the most common offenders.
How Do I Choose the Right Sample Rate?
Pick 48 kHz unless you have a specific, articulable reason to use something else. That is the short answer. Here is the longer one.
44.1 kHz is the CD standard and works perfectly well. 48 kHz is the video and broadcast standard and matches what most modern interfaces use as their native clock rate. Running at 96 kHz doubles your CPU load and bus bandwidth for frequency content above 24 kHz that neither your monitors nor your ears can meaningfully reproduce in most monitoring environments. There are legitimate uses for higher rates: certain plugin algorithms produce cleaner results when oversampling from a higher source rate, and archival recording at 96 kHz preserves more flexibility for future processing. But these are deliberate choices for specific workflows, not defaults.
The critical rule is consistency. Every component in your signal chain must agree on the sample rate. If your ALSA device is configured for 48 kHz but your session files are 44.1 kHz, something will resample. That resampling might be transparent, or it might introduce artifacts depending on which component does it and what algorithm it uses. Set the rate at the interface level, confirm it in your sound server configuration, and keep all session files at the same rate. The audio quality guide covers resampling pitfalls in more detail.
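You can confirm what rate ALSA actually negotiated while a stream is running. A minimal sketch — the path assumes card 0, playback device 0, substream 0; adjust the indices for your interface:

```shell
# While a stream is active, ALSA exposes the negotiated hardware parameters
# under /proc/asound. "closed" or a missing file means no stream is open.
f=/proc/asound/card0/pcm0p/sub0/hw_params
if [ -r "$f" ]; then
    grep -E '^(rate|period_size|buffer_size)' "$f"
else
    echo "no readable hw_params at $f (no active stream, or path differs)"
fi
```

If the rate shown here disagrees with your session files, something in the chain is resampling.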
What Buffer Size Should I Use?
Start at 256 samples and work down. I know that sounds conservative. It is intentionally conservative.
The buffer size determines how many samples the system accumulates before processing them as a block. Smaller buffers mean less latency between input and output. They also mean the CPU must complete each processing cycle faster, leaving less margin for anything else on the system to cause a delay. At 48 kHz, 256 samples gives you about 5.3 ms per direction, roughly 10.6 ms round trip. That is perceptible but workable for most performers. At 128 samples, round trip drops to about 5.3 ms, which feels nearly instantaneous. At 64 samples, you are under 3 ms round trip, but your system must be exceptionally well-tuned to sustain it without dropouts.
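The arithmetic behind those numbers is simple enough to script. A small helper, with the round trip approximated as double the one-way figure (input buffer plus output buffer; real interfaces add some converter latency on top):

```shell
# One-way latency in milliseconds = buffer_size / sample_rate * 1000.
latency_ms() {
    awk -v n="$1" -v r="$2" 'BEGIN { printf "%.1f\n", n / r * 1000 }'
}
latency_ms 256 48000   # one way at 256 samples, 48 kHz -> 5.3
latency_ms 128 48000   # -> 2.7
latency_ms 64  48000   # -> 1.3
```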
The mistake most people make is setting 64 samples because a forum post said it was possible, then spending hours chasing intermittent XRUNs that would vanish at 128. Test under real conditions: your actual plugin chain, your actual interface, your actual background services running. If 256 is clean for an hour, try 128. If 128 holds, try 64. If 64 is flaky, go back to 128 and stay there. The perceptual difference between 128 and 64 is negligible for most work. The stability difference can be enormous. See the latency resources page for the full breakdown of buffer mechanics and scheduling.
Why Are Desktop Audio Defaults Wrong for Production?
Desktop audio and production audio have fundamentally different priorities. Desktop audio cares about compatibility: every application should be able to make sound without any configuration, even if three browser tabs, a video call, and a notification system are all playing simultaneously. Production audio cares about timing: one application needs exclusive, deterministic access to the hardware with guaranteed scheduling and zero tolerance for glitches.
PulseAudio, which was the default desktop sound server for years, was designed for the first scenario. It resampled freely, mixed streams from multiple applications, added its own buffering, and prioritized "something comes out of the speakers" over "this arrives with minimal, consistent delay." PipeWire improves on this significantly, but its default configuration is still a desktop configuration. Default buffer sizes tend to be 1024 or 2048 samples. Default scheduling is often non-real-time. Default resampling quality is set for CPU efficiency rather than transparency.
Switching to production settings means changing the buffer size, confirming real-time scheduling is active, setting the sample rate explicitly, and often disabling the session manager's automatic device switching behavior. On distributions like Fedora or Ubuntu Studio that ship audio-oriented configurations, some of this is handled for you. On a stock Debian or Arch install, you configure it yourself. The Arch Linux professional audio wiki page remains one of the most comprehensive distribution-specific references for this process.
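For PipeWire, the rate and buffer settings can be pinned with a drop-in file. A minimal sketch assuming a current PipeWire that reads user overrides from pipewire.conf.d; verify the property names against the documentation for your installed version:

```
# ~/.config/pipewire/pipewire.conf.d/99-pro-audio.conf
context.properties = {
    default.clock.rate        = 48000   # fixed sample rate for the graph
    default.clock.quantum     = 128     # preferred buffer size in samples
    default.clock.min-quantum = 128     # stop the graph shrinking below this
    default.clock.max-quantum = 256     # cap larger requests from desktop apps
}
```

Restart PipeWire after creating the file and confirm the active quantum with your session manager's tooling before trusting it.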
What Breaks First Under Load?
When you push a Linux audio system toward its limits, failures follow a predictable sequence. Understanding that sequence tells you where to look when problems appear.
First to break: scheduling. As CPU load increases, the scheduler has less slack time to accommodate real-time audio threads. The audio callback must complete within one buffer period. If the CPU is 80% loaded with plugin processing and a background task briefly steals a core, the callback misses its deadline. This manifests as an XRUN.
Second to break: memory bandwidth. Audio processing itself is not memory-intensive, but large sample libraries, convolution reverbs, and multiple instances of sampler plugins can saturate memory bandwidth, especially on systems where CPU and GPU share main memory. When memory bandwidth is saturated, audio buffer fills stall. This is less common than scheduling failures but more confusing because top shows plenty of free CPU.
Third to break: the USB bus. USB audio interfaces share bus bandwidth with everything else on the same controller. A USB hard drive running a backup, a webcam streaming, or even an active USB hub with chatty devices can consume enough isochronous bandwidth to crowd out the audio interface's transfers. The interface does not crash. It just starts missing deadlines.
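To see which devices share a controller with your interface, inspect the USB topology. A sketch using lsusb from the usbutils package, which may not be installed by default:

```shell
# lsusb -t prints one tree per host controller. Every device under the same
# root hub as your audio interface competes for that controller's bandwidth.
if command -v lsusb >/dev/null 2>&1; then
    lsusb -t
else
    echo "lsusb not installed (provided by the usbutils package)"
fi
```

If the interface shares a controller with a webcam or external drive, moving it to a port on a different controller is often the entire fix.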
Last to break: the audio interface firmware. This is the rarest failure mode but the hardest to diagnose. Some interfaces have internal firmware bugs that surface only under specific timing conditions. They might work perfectly at 256 samples and fail reproducibly at 128, not because of the host system but because of a firmware timing path that cannot keep up. If you have eliminated every host-side cause and the problem persists at a specific buffer size, the interface firmware is the likely culprit.
What Should I Test Before Blaming Drivers?
"The ALSA driver is broken" is the most common incorrect diagnosis in Linux audio troubleshooting. ALSA drivers are imperfect, but they are also some of the most tested kernel code in existence. Before concluding that the driver is at fault, run through this checklist:
- Is real-time scheduling active? Check with chrt -p on your sound server's process. If it shows SCHED_OTHER, your real-time configuration is not applied. Fix it before blaming anything else.
- Is the CPU governor set to performance? Run cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor. If it says powersave or ondemand, frequency scaling is adding latency jitter.
- Are other USB devices competing for bandwidth? Unplug everything except the audio interface and test. If the problem disappears, the issue is bus contention, not the driver.
- Is the sample rate consistent? Mismatched rates between the ALSA device and the sound server force resampling that adds latency and sometimes introduces glitches that look like driver bugs.
- Are power management C-states causing wake latency? Try booting with processor.max_cstate=1 on the kernel command line. If audio stability improves dramatically, the issue is power management firmware, not the audio driver.
- Is a GPU driver holding locks? Test with a minimal display configuration or a different GPU driver. Proprietary GPU drivers are frequent sources of scheduling interference that manifests as audio problems.
- Does the problem exist at higher buffer sizes? If the interface works cleanly at 512 samples but fails at 64, the driver is not broken. Your system simply cannot meet the tighter timing requirements at that buffer size with your current configuration.
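Several of the host-side checks can be gathered into one pass. A sketch, assuming a PipeWire setup — substitute the process name for jackd or pulseaudio, and note that the sysfs paths vary by hardware and loaded modules:

```shell
#!/bin/sh
# Print the scheduling policy, CPU governor, and C-state limit in one pass.
pid=$(pgrep -x pipewire | head -n1)
if [ -n "$pid" ]; then
    echo "scheduling policy:"
    chrt -p "$pid"
else
    echo "no pipewire process found (adjust the process name for your setup)"
fi
echo "cpu governor:"
cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor 2>/dev/null \
    || echo "  (cpufreq not exposed on this system)"
echo "deepest C-state allowed (intel_idle):"
cat /sys/module/intel_idle/parameters/max_cstate 2>/dev/null \
    || echo "  (intel_idle not loaded on this system)"
```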
Only after all seven checks come back clean should you consider filing a driver bug. When you do, include your kernel version, ALSA version, interface model, exact buffer settings, the output of cat /proc/asound/cards, and the relevant dmesg output. A well-documented bug report gets fixed. A vague complaint about "crackling audio" gets ignored.
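Collecting that information is easy to script. A sketch that writes the standard details to one file — the dmesg grep pattern is a broad starting point, so trim it to your interface's driver before attaching the result to a report:

```shell
# Gather kernel, ALSA, card, and log details for a driver bug report.
report=audio-bug-report.txt
{
    echo "== kernel =="
    uname -r
    echo "== alsa version =="
    cat /proc/asound/version 2>/dev/null
    echo "== sound cards =="
    cat /proc/asound/cards 2>/dev/null
    echo "== recent audio-related dmesg =="
    dmesg | grep -iE 'snd|audio|usb' | tail -n 50
} > "$report" 2>&1
echo "wrote $report"
```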
Can PipeWire Replace JACK for Serious Work?
For most workflows in 2026, yes. PipeWire's JACK compatibility layer handles standard routing, plugin hosting, and session management reliably. The buffer handling, graph latency compensation, and real-time scheduling all work. For typical recording, mixing, and production work, PipeWire with WirePlumber is the practical default on most distributions, and fighting it by running standalone JACK alongside PipeWire creates more problems than it solves.
Where standalone JACK still has an edge is in extreme low-latency scenarios (64 samples and below), certain MIDI clock synchronization cases, and systems where PipeWire's broader scope introduces competing demands. If you are running a dedicated live performance rig with a single JACK application and no desktop audio needs, standalone JACK removes a layer of abstraction. For everyone else, PipeWire is the more maintainable choice.
Where Do I Go From Here?
This FAQ covers the questions that come up most often. For deeper treatment of specific areas:
- The latency resources page covers PREEMPT_RT, scheduler tuning, IRQ handling, and benchmark interpretation in full detail.
- The audio quality guide covers ALSA, JACK, and PipeWire configuration from the quality and stability perspective.
- The LAD hub provides the navigational map to all community resources, subscription guides, event history, and member references.
- The benchmarks section collects tools and traces for measuring real system performance rather than guessing at it.
Most Linux audio problems are configuration problems, not hardware problems and not driver problems. The diagnostic steps above will resolve the majority of issues without a single line of code. When they do not, you have enough information to file a useful report or ask a precise question in the right place. That is how problems actually get solved.