Dither is probably a concept that few in the audio space, besides producers and audio engineers, know much about. With streaming and Hi-Res content replacing the trusty CD, even less attention is paid to this one-time cornerstone of audio quality. But with 24-bit Hi-Res on the rise, it’s worth understanding how dither once again proves that bigger numbers aren’t always better.
In a nutshell, dither is a randomized coding sequence applied to 16-bit audio masters in order to reduce quantization noise. If you read our recent feature on audio bit-depth, you’ll be well up to speed on digital noise. If not, I strongly suggest checking that out first for a primer. Quantization noise is important because it tells us how accurately we can store our audio before we lose details to background noise.
Why do we need dither?
To recap, conversion and digital math can only be done to a certain level of accuracy, defined by the number of bits (digital 1s and 0s) available for storing audio information. Moving from a high bit-depth, such as when mixing, to a lower bit-depth for storage introduces errors called quantization noise. This noise is audible at low bit-depths, and we can quantify it as the signal-to-noise ratio (SNR) of a digital file, expressed in dB.
For example, an 8-bit signal has an SNR of 48dB, 12-bit is 72dB, 16-bit hits 96dB, and 24-bit a whopping 144dB. However, track mixing is often done at a much higher resolution, such as 32 or even 64 bits. This means that data is sliced off, or truncated, when moving on to master at 16 or 24-bit. This produces quantization noise, but the specific type of rounding error leads to noise with a particular trait: harmonic distortion.
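The SNR figures quoted above follow the familiar rule of thumb of roughly 6dB per bit. As a quick sanity check, here is a minimal Python sketch that computes the ratio between full scale and one quantization step, 20·log10(2^N). The function name `quantization_snr_db` is just an illustrative choice, and the formula is the simple approximation that matches the numbers in this article rather than a full measurement model.

```python
import math

def quantization_snr_db(bits: int) -> float:
    """Approximate dynamic range of an N-bit quantizer: 20 * log10(2 ** N)."""
    return 20 * math.log10(2 ** bits)

for bits in (8, 12, 16, 24):
    # Each extra bit doubles the number of steps, adding about 6.02 dB.
    print(f"{bits:>2}-bit: about {quantization_snr_db(bits):.0f} dB")
```

Rounded to the nearest dB, this gives the 48, 72, 96, and 144dB values listed above.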
This is because values that fall part way between two quantization steps always round to predictable values. In a simple rounding example, 0.6 always becomes 1, and 0.4 always becomes 0. Another way to think about this is as a slight clipping of the waveform, which produces odd harmonics. We can see examples of this harmonic distortion and the effect on SNR for different bit-depths with a 250Hz sine wave below.
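To see why predictable rounding turns into distortion rather than random hiss, it helps to check that the rounding error repeats exactly from one cycle of the tone to the next. The sketch below quantizes a 250Hz sine wave to 8 bits and compares the error in two consecutive cycles. The 48kHz sample rate, the 0.9 amplitude, and the helper names are assumptions made for the illustration.

```python
import math

SAMPLE_RATE = 48_000           # assumed sample rate
FREQ = 250                     # test tone from the text
PERIOD = SAMPLE_RATE // FREQ   # 192 samples per cycle

def quantize(x: float, bits: int) -> int:
    """Round x (in the range -1.0..1.0) to the nearest signed integer code."""
    full_scale = 2 ** (bits - 1) - 1
    return math.floor(x * full_scale + 0.5)

def rounding_error(n: int, bits: int = 8, amplitude: float = 0.9) -> float:
    """Quantization error, in steps (LSBs), for sample n of the sine wave."""
    x = amplitude * math.sin(2 * math.pi * FREQ * n / SAMPLE_RATE)
    full_scale = 2 ** (bits - 1) - 1
    return quantize(x, bits) - x * full_scale

first_cycle = [rounding_error(n) for n in range(PERIOD)]
second_cycle = [rounding_error(n + PERIOD) for n in range(PERIOD)]

# If the error were random, the two cycles would differ by up to a full step.
worst_mismatch = max(abs(a - b) for a, b in zip(first_cycle, second_cycle))
print(f"largest cycle-to-cycle difference: {worst_mismatch:.2e} LSB")
```

The two cycles agree to within floating-point precision. An error signal that repeats at the same rate as the tone can only contain energy at multiples of that tone's frequency, which is exactly what harmonic distortion is.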
Note how the odd harmonic distortion ruins what would otherwise be a much lower noise floor. Fortunately, dither allows us to clean up this harmonic distortion into a more pleasing white noise while also improving SNR. Talk about a miracle cure.
How does dither work?
At its most basic, dither is the introduction of random noise to prevent those predictable quantization steps. Rather than 0.6 always rounding to 1, perhaps sometimes it rounds to 0. We could do this by randomly adding or subtracting 0.5 from our value. This way, there are no repeatable rounding patterns and no clipping of the waveform. There’s simply random noise at a much lower amplitude.
This is achieved by generating a random noise signal that toggles the last, or least-significant, bit (LSB) at the target bit-depth. This noise is added to the audio before quantization. So for instance, we randomly toggle the 16th bit for a 16-bit master or the 8th bit for an 8-bit master. Toggling a more significant bit would add more noise to the signal, which might mask important details.
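The 0.6 example can be tried directly. The sketch below is a simplified illustration, not how any particular mastering tool does it: instead of the coin-flip of plus or minus 0.5 described above, it adds uniform random noise spanning one quantization step before rounding, which is a common textbook variant. The function names and the fixed seed are assumptions for the demo.

```python
import math
import random

def round_to_step(value: float) -> int:
    """Plain rounding to the nearest whole quantization step."""
    return math.floor(value + 0.5)

def dithered_round(value: float, rng: random.Random) -> int:
    """Add uniform noise spanning one step (-0.5..+0.5) before rounding."""
    noise = rng.random() - 0.5
    return round_to_step(value + noise)

rng = random.Random(0)   # fixed seed so the run is repeatable
value = 0.6
trials = 100_000

plain = [round_to_step(value) for _ in range(trials)]
dithered = [dithered_round(value, rng) for _ in range(trials)]

print("plain mean:   ", sum(plain) / trials)
print("dithered mean:", sum(dithered) / trials)
```

Without dither, every trial lands on 1 and the fact that the true value was 0.6 is lost. With dither, the result is 1 about 60% of the time and 0 the rest, so the average sits close to 0.6. The detail below one step survives, at the cost of some random noise.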
The graph below demonstrates what happens to noise when we apply this type of dither. Notice that the lowest amount of noise has increased, so we’ve raised the noise floor in that regard. However, the highest peak of harmonic distortion has fallen almost entirely to the noise floor, improving our total SNR in the process and resulting in lower overall noise. Perfect.
In total, our 8-bit signal sees its SNR improve from 48 to about 64dB. With this method of dither, 16-bit improves from 96 to 112dB, and 24-bit from 144 to 160dB. But this is just a very crude method of producing dither. We can actually do even better.
Dither is the introduction of random noise to reduce harmonic quantization distortion.
High-quality dither takes advantage of human hearing characteristics, in particular the loss of sensitivity at higher frequencies, to eke out even better SNR for the frequencies that matter. Using shaped noise, rather than noise spread equally across all frequencies, can improve SNR in the low and mid frequencies, where the human ear is most sensitive, to about 120dB.
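One simple way to shape noise is error feedback: measure the rounding error made on each sample and subtract it from the next one, so the errors cancel out over time and what remains is pushed toward high frequencies. The sketch below is a first-order version under stated assumptions. Input samples are pre-scaled so that 1.0 equals one quantization step, the dither is the difference of two uniform random draws, and the function name is hypothetical. Real mastering tools typically use higher-order filters weighted to match hearing curves, which this does not attempt.

```python
import math
import random

def noise_shaped_quantize(samples_lsb, rng=None):
    """First-order error-feedback quantizer with dither.

    `samples_lsb` are floats scaled so that 1.0 is one quantization step.
    Returns a list of integer codes.
    """
    rng = rng or random.Random()
    out = []
    prev_error = 0.0
    for x in samples_lsb:
        # Subtract the previous sample's error so it cancels over time.
        target = x - prev_error
        # Triangular dither in (-1, +1) steps: difference of two uniform draws.
        dither = rng.random() - rng.random()
        code = math.floor(target + dither + 0.5)
        # Error made on this sample, including the dither, fed to the next one.
        prev_error = code - target
        out.append(code)
    return out

# Usage: a sine wave with an offset that does not sit on a whole step.
rng = random.Random(7)
xs = [100.0 * math.sin(2 * math.pi * n / 192) + 0.25 for n in range(1920)]
ys = noise_shaped_quantize(xs, rng)
drift = abs(sum(ys) - sum(xs))
print(f"total drift between input and output: {drift:.3f} steps")
```

Each output differs from its input by the current error minus the previous one, so the differences telescope. The running total of the output never strays more than about 1.5 steps from the running total of the input, which is another way of saying the added noise has very little low-frequency energy.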
Noise with a triangular probability density function (TPDF) is commonly used in audio dithering and has the desired effect of reducing the first few orders of harmonic distortion. Noise-shaped dither types are available too, some of which are designed to more closely match human hearing curves. Noise shaping is also a very important tool for improving the performance of sigma-delta ADCs and DACs found in modern audio equipment, pushing noise out of audible frequencies.
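For reference, TPDF noise is easy to generate: combining two independent uniform random values gives a triangular distribution. The following is a minimal sketch of converting floating-point samples to 16-bit integers with TPDF dither. The names `tpdf_dither` and `to_int16_dithered` are illustrative, the dither is fixed at plus or minus one step peak, and no noise shaping is applied.

```python
import math
import random

INT16_MIN, INT16_MAX = -32768, 32767

def tpdf_dither(rng: random.Random) -> float:
    """Triangular-PDF noise in (-1, +1) steps: difference of two uniform draws."""
    return rng.random() - rng.random()

def to_int16_dithered(samples, rng=None):
    """Convert floats in -1.0..1.0 to 16-bit integer codes with TPDF dither."""
    rng = rng or random.Random()
    out = []
    for s in samples:
        # Scale to 16-bit steps, then add dither before rounding.
        scaled = s * INT16_MAX + tpdf_dither(rng)
        code = math.floor(scaled + 0.5)
        # Clamp in case dither pushes a full-scale sample out of range.
        out.append(max(INT16_MIN, min(INT16_MAX, code)))
    return out

# Usage: a constant signal only 0.3 of a step high, far below 16-bit resolution.
rng = random.Random(1)
quiet = [0.3 / INT16_MAX] * 100_000
pcm = to_int16_dithered(quiet, rng)
print("values used:", sorted(set(pcm)))
print("average code:", sum(pcm) / len(pcm))
```

Plain rounding would turn this signal into pure digital silence. With TPDF dither, the output flickers between neighboring codes and its average stays close to 0.3 of a step, so the quiet detail is preserved underneath a steady layer of noise.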
Is dither still relevant in the streaming and Hi-Res age?
Dither isn’t applied to 24-bit audio, so is perhaps not as important in audiophile circles as it once was. The reason is simply that 24-bit’s -144dB noise floor falls well below the best possible performance of real-world audio equipment, which maxes out at about 120dB (~20-bits).
However, the effective noise floor of noise-shape dithered 16-bit audio is also approximately -120dB. For this reason, there’s no audible difference between 24-bit Hi-Res and properly dithered 16-bit CD quality audio, especially once you take into consideration listening volumes, the limits of human hearing, and the dynamic range of most musical performances. There’s a reason we say that CD-quality audio is still great.
Dithering is still an important mastering technique in the modern music age. The most popular music streaming services offer lossy compressed, 16-bit music. Before applying this compression, music should be mastered and appropriately dithered for a low noise floor to ensure maximum performance from the compression codec. Even if you’re converting from a high-res library to a lossless format such as 16-bit FLAC, you should ensure that the encoding chain includes decent dithering for maximum audio quality.