



Sound Image

Feb 11, 2013 8:34 AM, By Bob McCarthy

The costs and benefits


Figure 2: Horizontal localization with a pair of speakers: b) time offset favors right speaker.

As I said before, vertical localization works as two individual analyzers rather than as a dual-channel comparator (as in the horizontal plane). The distinction is important because it means that relative time is not the driving force. As a result, we have a harder time distinguishing direct sound (which arrives first) from reflected sound in the vertical plane than we do in the horizontal plane. Detecting the comb-filter signature of the vertical plane becomes more challenging once the additional combing created by the summation of direct sound and reflections is added in. This is further complicated by the fact that the reflections arrive from different vertical angles as well, so their vertical-plane identifying signatures conflict with that of the direct sound. In short, horizontal localization holds up better in a reflective environment, while in the vertical plane it becomes harder for us to pinpoint a source direction.

Figure 2: Horizontal localization with a pair of speakers: c) image is centered by a mix of level and time offsets.

So far we have discerned the bearing of the sound source: its horizontal and vertical angle relative to us. Next is range. Is the source close or far? What are the clues we use to discern sound source range? We are not bats, so we can’t ping a sonar pulse and time the return.

Two of the factors in ranging are direct/reverberant ratio and frequency response. Level plays a part if the range is changing (and the source level is constant). But if the source is not moving, we cannot make conclusive range estimates based on level alone. One can easily see that if we brought a source closer by half the distance and reduced its level by half (-6 dB), the level of the direct sound would stay the same, negating level alone as a conclusive range finder.
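The arithmetic behind that last point can be checked in a few lines. This is a minimal Python sketch, assuming free-field inverse-square spreading from a point source; the function name and the 10 m and 5 m distances are made up for illustration.

```python
import math

def direct_level_change_db(old_distance_m, new_distance_m):
    """Change in direct-sound level (dB) when the listener-to-source distance
    goes from old to new, assuming inverse-square (free-field) spreading."""
    return 20.0 * math.log10(old_distance_m / new_distance_m)

# Hypothetical distances: the source moves from 10 m to 5 m (half the distance).
gain_from_moving_closer_db = direct_level_change_db(10.0, 5.0)

# Turning the source down by the same amount cancels the change.
source_trim_db = -gain_from_moving_closer_db
net_change_db = gain_from_moving_closer_db + source_trim_db

print(f"closer by half:  {gain_from_moving_closer_db:+.2f} dB")
print(f"source trim:     {source_trim_db:+.2f} dB")
print(f"net at listener: {net_change_db:+.2f} dB")
```

The direct level at the listener ends up unchanged, which is why level alone cannot tell us how far away a stationary source is.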

Figure 3: Vertical localization: a) head response transfer function.


The presence of reflections, however, provides a strong set of clues. For a given room, the proximity to the source will have a strong effect on the direct/reverberant ratio. If the source moves closer to us, we will detect an increase in the D/R ratio. This clues us in to the decreasing distance, even if the source has been adjusted to maintain a constant level at the listener. A stationary source in a given room will be more challenging to precisely range. A more reverberant room will lead to higher range estimates than a dry room because we associate the higher reverberation levels with larger spaces and longer distances.
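A rough way to see how the D/R ratio tracks distance is the simple diffuse-field model, sketched below in Python. It assumes the direct sound falls 6 dB per doubling of distance while the reverberant level is the same everywhere in the room, so the two are equal at the room's critical distance. Real rooms only approximate this, and the 4 m critical distance is a made-up value.

```python
import math

def direct_to_reverberant_db(distance_m, critical_distance_m):
    """D/R ratio in dB under a diffuse-field assumption: direct sound follows
    the inverse-square law, the reverberant level is constant, and the two
    are equal (0 dB) at the critical distance."""
    return 20.0 * math.log10(critical_distance_m / distance_m)

CRITICAL_DISTANCE_M = 4.0  # hypothetical room; not taken from the article

for distance_m in (2.0, 4.0, 8.0, 16.0):
    ratio_db = direct_to_reverberant_db(distance_m, CRITICAL_DISTANCE_M)
    print(f"{distance_m:5.1f} m -> D/R {ratio_db:+.1f} dB")
```

Turning the source up or down moves the direct and reverberant levels together, so under this model the ratio depends on distance and not on level. A more reverberant room has a shorter critical distance, which lowers the ratio at every seat.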

The frequency response of the source also plays a part. There are two areas for this: the high end and the low end. The high-frequency response is affected by the imperfect frequency response of our transmission medium: air. The extreme high-frequency range is the most lossy over distance, and the degree of loss depends on the weather, mostly the humidity. In any case, the longer the transmission distance, the more lossy our top end gets. Note that this affects the direct sound the same whether you are in an anechoic chamber or an echo chamber. The losses, however, continue even after the sound has been reflected; therefore the later arrivals will have progressively more HF roll-off.

We will also lose HF response if our location is off axis of the source (assuming a directional source). Our blindfolded listener in an anechoic chamber would likely enlarge their range estimate if the high frequencies were filtered down, since this mimics the air-loss effects in the HF range that are factored into our range memory map. If we rotated the speaker in the anechoic chamber so that the HF response began to roll off, the brain could be fooled into extending the range estimate. Think this through from your own experience of walking along a row of seats and moving from on axis of a speaker to the off-axis area along the aisle. You know you are at the same distance from the speaker, and yet you feel farther away when you reach the off-axis area. If we added a side-fill speaker that restored the high frequencies to the outermost seats, we would effectively be restoring those seats to the same sonic range as the middle ones.

Meanwhile, in the low end, we will see lots of constructive addition if the listening space has exotic architectural features such as a floor, walls, or ceiling. As distance between the source and listener rises, the frequency response tilts up in the low end (reflections) and down in the high end (air loss), resulting in a discernible range clue for our hearing system.
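The high-end half of that tilt can be sketched numerically. The Python below assumes air loss grows linearly with path length at each frequency. The per-metre coefficients are illustrative placeholders chosen only to show the trend, not measured data; real values vary with temperature and humidity.

```python
# Illustrative placeholder coefficients in dB per metre (NOT measured data).
# Real air absorption depends on temperature and humidity.
ASSUMED_AIR_LOSS_DB_PER_M = {1000: 0.005, 4000: 0.03, 8000: 0.1}

def air_loss_db(frequency_hz, path_length_m):
    """Air loss in dB over a path, using the assumed coefficients above."""
    return ASSUMED_AIR_LOSS_DB_PER_M[frequency_hz] * path_length_m

def hf_tilt_db(path_length_m, low_hz=1000, high_hz=8000):
    """Extra loss at high_hz relative to low_hz over the same path."""
    return air_loss_db(high_hz, path_length_m) - air_loss_db(low_hz, path_length_m)

# Hypothetical path lengths: a near seat, a far seat, and a late reflection
# that has travelled farther than either direct path.
for path_m in (10.0, 30.0, 60.0):
    print(f"{path_m:5.1f} m path -> 8 kHz is {hf_tilt_db(path_m):.2f} dB "
          f"below 1 kHz")
```

The longer the path, the larger the gap between the top end and the midrange. This holds both for a more distant direct sound and for reflections that arrive after a longer trip.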

Figure 3: Vertical localization: b) direct and reflected sound localization.


Imaging with Speakers

Now that we have established how sound image is perceived in our heads, it should be obvious how to get perfect imaging in your concert hall, theater, or house of worship: turn off the sound system. This would be fine if Canon’s old slogan, “Image is everything,” were true in our field of sound. Sound image is ever-present in the thoughts of sound engineers but way down the list for the folks who hear our work. For them, intelligibility, appropriate level, and natural tone weigh much more heavily on their experience. If folks don’t understand the words, you won’t be hearing about how great the imaging was.

Figure 3: Vertical localization: c) upper and lower speaker localization.


So this puts things in perspective. Once we have satisfied the primary needs of intelligibility and expected level, we can seek to enhance the experience with realistic imaging. Let’s clarify the sonic image goals. First, we want to create a sound image that is closely correlated in angular bearing to the live sound source. Second (and here is the twist), we wish to create a sound image depth that is significantly closer than the live sound source, but not so much closer that it breaks the limits of plausibility and becomes a distraction. Therefore, we can measure our sonic image success (or failure) by the amount of angular and range offset between the perceived sound image and the intended source location. We seek to match the bearing and purposefully distort the range perception so that we bring the audience members sonically closer to the stage. We don’t tend to think of our sound system this way, but sonic image range reduction is the most indispensable part of the sonic image equation. If we are not going to decrease the sonic range to the listeners, then we should pack it up.

The first key factor will be the proportion of natural/amplified sound needed to get us up to the required level and intelligibility (the less amplified sound we need to add, the easier it will be to preserve imaging). The second will be our speaker locations. The closer they are to the natural sound source, the easier it will be to preserve the image. A speaker helmet would be the best location for imaging, but it betrays the fact that image is not everything: neck injuries and gain before feedback must be factored into the equation. In practical terms, we strive to get speakers located near the planes where the natural sound originates. For a typical stage source or podium, we will try to get sources fairly low in the vertical plane. In the horizontal plane, we will be as central as possible. Excellent. We now know the best place for the speaker is center stage, an impossibly impractical location.

Figure 4: Time and level offset effects localization. As time offset rises, the level offset must rise in order to maintain a centered sound image (after Dr. Helmut Haas, 1972).

In practical terms, we have to look at each seat in the room as having a unique relationship to the natural sound source and the speakers that are reinforcing that source with added signal. Closer seats will tend to have large angular differences and small range differences, while distant seats will have the opposite. Fill systems and delay systems add another layer, since they have relationships with both the original source and the main speakers. While one speaker source (combined with the natural source) can minimize angular image distortion in one plane, there are relatively few practical main speaker locations that can help us in both planes for any single location, and even fewer that can do the job for a large part of any hall. Therefore, most sound reinforcement applications that place image preservation as a high priority will need to comprise multiple main systems and a variety of fill and delayed systems. Image control in the closer seats will be a mix of the original source and the speakers. As we get deeper into the room, the stage source will fade away and the game is played out between different speakers.
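The seat-by-seat relationship can be put into numbers with a little plan-view geometry. The following Python sketch is only an illustration. The coordinate convention, the function names, and every position (a center-stage source, a main speaker at the left proscenium, a delay speaker out in the house) are assumptions made up for the example.

```python
import math

def bearing_and_range(listener, target):
    """Return (bearing_deg, range_m) from listener to target in plan view.
    Points are (x, y) in metres: x across the room, y from the stage edge
    into the house. Bearing is measured from the listener's straight-ahead
    view of the stage; positive means toward +x."""
    dx = target[0] - listener[0]
    dy = listener[1] - target[1]  # distance toward the stage
    return math.degrees(math.atan2(dx, dy)), math.hypot(dx, dy)

def image_offsets(listener, source, speaker):
    """Angular (deg) and range (m) offsets between the natural source and
    the speaker covering this seat, as seen from the listener."""
    src_bearing, src_range = bearing_and_range(listener, source)
    spk_bearing, spk_range = bearing_and_range(listener, speaker)
    return abs(src_bearing - spk_bearing), abs(src_range - spk_range)

SOURCE = (0.0, 0.0)           # hypothetical talker at center stage
MAIN_LEFT = (-6.0, 0.0)       # hypothetical main speaker, left proscenium
DELAY_LEFT = (-4.0, 24.0)     # hypothetical delay speaker in the house

near = image_offsets((-4.0, 3.0), SOURCE, MAIN_LEFT)
far = image_offsets((-4.0, 30.0), SOURCE, DELAY_LEFT)
print(f"near seat: {near[0]:.1f} deg apart, {near[1]:.1f} m range difference")
print(f"far seat:  {far[0]:.1f} deg apart, {far[1]:.1f} m range difference")
```

With these made-up positions, the near seat sees the source and its speaker far apart in angle but at similar ranges. The far seat sees them at nearly the same bearing but at very different ranges.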

Every seat in the room has a bearing and range to the original sound source and the reinforcement speakers covering that area. If the horizontal bearings differ greatly, we will have to take steps to prevent a large-scale angular image distortion. As an example, we will consider a center source heard from a seat on the left that is covered by a reinforcement speaker on the left. Option one is to delay the reinforcement speaker and reduce its level so that it arrives at around the same time as the source and at the same (or better yet, a lower) level. This option can provide some inward image movement but may not be workable if high acoustic gain is required (thereby requiring the reinforcement speaker to be louder). Option two is to delay the reinforcement speaker even more so it arrives after the source. This can help “for a limited time only” (up to 7 milliseconds) but has the downside of creating comb filtering and reduced intelligibility when combined with the source. Option three would be to add another reinforcement speaker that is centrally located. This added arrival reinforces the central energy of the source and brings the horizontal image toward center. Bear in mind, however, that the center speaker is high above the stage. Therefore, the gains in horizontal image come with a cost of vertical image distortion. If we are close to the stage, we might get some help from the front fills, which are low and help ground the vertical.

So now we have established an approach to image control, but it comes with some substantial costs. We need multiple speakers and signal processing channels, and the acoustical costs include the potential for intelligibility loss and tonal distortion that result from multiple speakers covering the same area. These are the tradeoffs we face for image control. In Part II, we will detail the means available to move the sound image and put this all to practical use.
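Options one and two above come down to arrival times at the seat. As a minimal Python sketch, the delay that lines the reinforcement speaker up with the natural source follows from the path-length difference. It assumes a speed of sound of 343 m/s (roughly room temperature), and the function name and distances are hypothetical.

```python
SPEED_OF_SOUND_M_PER_S = 343.0  # assumed value, roughly 20 °C

def alignment_delay_ms(source_range_m, speaker_range_m, extra_ms=0.0):
    """Delay (ms) for the reinforcement speaker so that it arrives with the
    natural source at this seat, plus an optional extra offset so that it
    arrives after the source. No delay is needed (0 ms) if the speaker is
    already the farther of the two."""
    path_difference_m = source_range_m - speaker_range_m
    sync_ms = max(0.0, path_difference_m / SPEED_OF_SOUND_M_PER_S * 1000.0)
    return sync_ms + extra_ms

# Hypothetical seat: 5 m from the source, 3 m from the reinforcement speaker.
print(f"option one (synchronized):  {alignment_delay_ms(5.0, 3.0):.2f} ms")
print(f"option two (7 ms later):    {alignment_delay_ms(5.0, 3.0, 7.0):.2f} ms")
```

The result only holds for one seat. Every other seat has its own pair of path lengths, which is part of why a single delay setting cannot fix the image everywhere.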
