On Dynamic Range
The trend in audio system design is toward greater dynamic range, but the trend in music production is toward less.
SINCE ITS INCEPTION, the goal for the audio playback system has been to achieve faithful reproduction of sound. One aspect of this involves dynamic range. The dynamic range of an audio device is the level difference between the smallest and largest signals that can be emitted. In a well-designed product, the lowest level is determined by the thermal noise floor, which is noise generated by the thermal agitation of electrons. The highest level is determined by the design of the internal power supply. The difference between these extremes is usually specified in decibels, with the idea being that bigger is better.
The first telephony devices barely had enough dynamic range to reproduce recognizable speech. As audio components have evolved over the last 100 years, dynamic range has steadily increased. Today, most individual audio products have at least 100 dB of dynamic range, which exceeds that of most rooms or transmission lines they're connected to.
Dynamic range is different from signal-to-noise ratio. The former describes the range of possible output values that a waveform can have. The latter is the level difference between the signal actually present, usually taken as its RMS level, and the thermal noise floor.
The dynamic range of a product is determined by its design. The effective signal-to-noise ratio is determined by how it's used. A product can have very wide dynamic range, yet still be “noisy” because the operator fails to fully utilize it. Imagine plugging a dynamic microphone directly into a power amplifier. The signal would barely rise above the thermal noise floor, regardless of how “quiet” the amplifier is. This is why the signal processing chain requires some strategic gain blocks to properly utilize the dynamic range of each stage. The two blocks that provide the greatest gain are the microphone preamplifier at the system input and the power amplifier at the system output.
The Human Listener
Humans also have a dynamic range. The lowest perceivable sound pressure deviation from ambient atmospheric pressure is about 20 micropascals at mid-range frequencies. This is 0 dB-SPL. Arguably the highest deviation that we can be exposed to without permanent damage is about 200 pascals (140 dB-SPL), making the dynamic range of the human listener about 140 dB. This is a remarkable number and a testimony to the capabilities of the human auditory system.
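The two pressure figures above can be checked with the standard conversion from sound pressure to dB-SPL, 20 log10(p / 20 micropascals). The sketch below is only an illustration; the function name pressure_to_db_spl and the constant P_REF are hypothetical names chosen for this example.

```python
import math

# Reference pressure for 0 dB-SPL: 20 micropascals (from the text above).
P_REF = 20e-6  # pascals

def pressure_to_db_spl(pressure_pa: float) -> float:
    """Convert a sound pressure in pascals to dB-SPL (hypothetical helper)."""
    return 20.0 * math.log10(pressure_pa / P_REF)

threshold_db = pressure_to_db_spl(20e-6)  # threshold of hearing
damage_db = pressure_to_db_spl(200.0)     # approximate damage limit

# Dynamic range of the human listener, in dB.
print(round(damage_db - threshold_db))  # 140
```

Every factor of ten in pressure corresponds to 20 dB, so the seven decades between 20 micropascals and 200 pascals give the 140 dB figure.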
Now, before you think that your audio system's dynamic range must exceed that of the human auditory system, consider that even the quietest listening rooms have ambient noise levels at least 20 dB above the threshold of hearing. At the other end, we become quite uncomfortable at sustained SPL above 100 dB-SPL, with short term peaks exceeding this by 20 dB or so. So, if we knock 20 dB off of each end, the required dynamic range for a high-quality playback system is around 100 dB.
Analog vs. Digital
The dynamic range of a digital product is determined by the length of the digital word used to represent each audio sample. This is the “bit depth” of the signal, and the dynamic range can be approximated by 20 log10(2^N), where N is the number of bits. This makes an 8-bit system about 48 dB. Each additional bit adds 6 dB of dynamic range. The standard for CD audio, 16 bits, has a dynamic range of 96 dB, which is very close to the magic 100 dB that fully exploits the human auditory system in most applications.
Most digital products today use a 24-bit word length, which yields a numerical range of about 144 dB for resolving and producing analog signals. In practice, the noise level of the analog signals on each end constrains this considerably, so the realized dynamic range is closer to 100 dB than to 144 dB.
Even though 16-bit digital audio is adequate for many applications, bit depth can be increased without undue expense or drain on processing resources, so higher bit depths make sense. The positive effect of increasing word length (adding more bits) is a further reduction of residuals such as quantization error. This is an increase of “foot room” rather than “head room” for the audio signal. For many years we waited for digital resolution to equal that of human hearing, so it should be no surprise that it would inevitably surpass it. Current technology appears to be at this threshold.
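The 8-bit and 16-bit figures, and the 6 dB-per-bit rule, can be reproduced with a few lines of arithmetic. This is a minimal sketch that assumes dynamic range is the ratio between full scale and one quantization step, 20 log10(2^N) for an N-bit word. The function name dynamic_range_db is a hypothetical name for this example.

```python
import math

def dynamic_range_db(bits: int) -> float:
    """Approximate dynamic range of an N-bit word: 20*log10(2**N).

    Hypothetical helper; assumes full scale spans 2**N quantization steps.
    """
    return 20.0 * math.log10(2 ** bits)

for bits in (8, 16, 24):
    print(bits, round(dynamic_range_db(bits), 1))
# 8 48.2
# 16 96.3
# 24 144.5
```

Each added bit doubles the number of steps, and a doubling is 20 log10(2), or about 6.02 dB, which is where the 6 dB-per-bit rule comes from.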
So, used properly, the dynamic range available with current digital technology is more than adequate for satisfying the human sense of hearing.
The apparent loudness of an audio signal is related to the root-mean-square value of the audio waveform over time. In modern audio systems, the root-mean-square voltage is the quantity of interest regarding perceived loudness as well as power consumption (and the resultant heat generation/dissipation issues). Short-term signal peaks typically exceed the RMS voltage by 20 dB or more, and the noise floor by 100 dB. The ratio of the signal peaks to the RMS voltage is the crest factor of the waveform. The crest factor is dynamic and a function of time.
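As a rough illustration of the crest factor described above, the sketch below computes the peak-to-RMS ratio of a block of samples in decibels. The helper names rms and crest_factor_db are hypothetical, and a single cycle of a sine wave stands in for real program material, which typically has a much higher crest factor.

```python
import math

def rms(samples):
    """Root-mean-square value of a sequence of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def crest_factor_db(samples):
    """Ratio of the absolute peak to the RMS value, expressed in dB."""
    peak = max(abs(s) for s in samples)
    return 20.0 * math.log10(peak / rms(samples))

# One full cycle of a unit-amplitude sine wave (test signal, an assumption).
n = 1000
sine = [math.sin(2 * math.pi * k / n) for k in range(n)]

print(round(crest_factor_db(sine), 2))  # 3.01
```

Because the crest factor changes with time, a real measurement would apply this calculation to successive short blocks of the signal rather than to the whole recording at once.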
The perceived loudness of an audio signal is increased as the RMS signal level is “squashed” up against the peak limitations of the playback system. This is accomplished by compressing or clipping the signal. Clipping produces harmonic distortion, while well-behaved compression largely avoids it. Both increase the loudness of the signal and the operating temperature of the loudspeaker. Unfortunately, both processes also raise the signal's noise floor relative to its peaks, thereby reducing its dynamic range. While this seems detrimental, it's a common practice.
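The effect can be seen with the simplest possible processor, a hard clipper: gain is applied and anything above the system's peak limit is flattened. The sketch below is an illustration only. The helper names rms and hard_clip, the 6 dB of drive, and the sine-wave test signal are all assumptions for this example, and a real compressor would use a time-varying gain instead of flattening peaks.

```python
import math

def rms(samples):
    """Root-mean-square value of a sequence of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def hard_clip(samples, gain_db, ceiling=1.0):
    """Apply gain in dB, then flatten anything beyond +/- ceiling."""
    gain = 10 ** (gain_db / 20.0)
    return [max(-ceiling, min(ceiling, s * gain)) for s in samples]

# One full cycle of a unit-amplitude sine wave (test signal, an assumption).
n = 1000
sine = [math.sin(2 * math.pi * k / n) for k in range(n)]

# Drive the signal 6 dB into a clipper whose ceiling equals the original peak.
clipped = hard_clip(sine, gain_db=6.0)

peak_after = max(abs(s) for s in clipped)
rms_rise_db = 20.0 * math.log10(rms(clipped) / rms(sine))

print(peak_after)         # 1.0: the peak level is unchanged
print(rms_rise_db > 0.0)  # True: the RMS level has moved up toward the peak
```

The peak stays at the system limit while the RMS level rises, so the crest factor shrinks. Any noise in the original signal receives the same gain, which is how the noise floor ends up closer to the peaks.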
Reducing Dynamic Range
There are several reasons for the high levels of compression used in modern playback systems. The first is quite practical: music is played in public spaces, and public spaces are noisy. If background music systems in restaurants played uncompressed recordings, most of the music would be masked by the noise floor of the room. The same logic applies to music playback in automobiles. These are inherently noisy environments, and music with wide dynamic range may get masked by the environmental noise.
Dynamic range reduction also increases the volume levels that can be achieved from portable audio devices, e.g., iPods and MP3 players. Consider that 10 dB of peak limiting can make a song play back at twice the perceived loudness of uncompressed music. This can be a big selling point for one artist over another, and a negative trend regarding the health of the listener's hearing.
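The "twice the perceived loudness" figure follows from a common psychoacoustic rule of thumb: each 10 dB increase in level is heard as roughly a doubling of loudness. The sketch below assumes that rule holds and uses a hypothetical function name, loudness_ratio. Actual perception varies with level, frequency, and listener.

```python
def loudness_ratio(level_change_db: float) -> float:
    """Approximate perceived loudness ratio for a level change in dB.

    Assumes the rule of thumb that +10 dB sounds about twice as loud.
    """
    return 2.0 ** (level_change_db / 10.0)

print(loudness_ratio(10.0))  # 2.0
print(loudness_ratio(20.0))  # 4.0
```

If 10 dB of peak limiting allows the RMS level to be raised by 10 dB within the same peak limit, this rule puts the limited song at about twice the loudness of the unprocessed one.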
So, while technology has given us playback systems with incredible dynamic range potential, the majority of listening environments can't really make use of it. Highly compressed music has become the norm, and recordings that fully exploit the playback system (i.e., that are uncompressed) are labeled as “audiophile,” and occupy their own special section of high-end retail stores. They come complete with a label that warns of their lower playback level compared to “conventional” recordings.
These issues tend to find their own equilibrium. It's still good design practice to create playback systems with the widest possible dynamic range. This requires good components and careful attention to system gain structure. On the other hand, few recordings and playback environments fully exploit the dynamic range possible in modern systems. The audio practitioner must know how to maximize dynamic range on one hand, and also how to reduce it to be appropriate for a particular application.
Pat Brown is president of Synergetic Audio Concepts (Syn-Aud-Con) Inc. and Electro-Acoustic Testing Company (ETC) Inc. Syn-Aud-Con conducts training seminars in audio and acoustics worldwide for those who operate, install, and design sound reinforcement systems. ETC Inc. performs precision loudspeaker testing for the audio industry. He can be reached at firstname.lastname@example.org.