
Article: How to set up a 2-channel Sound System


Kvalsvoll


Your -4 dB may very well be my 0 dB, for reasons you have probably already mentioned - differences in calibration method.

 

Perceived loudness has nothing to do with room size; it has to do with sound field properties - first arrival, decay rate, direction of sound, whether the reflected decay energy is diffuse, and the sound field intensity of the first arrival. Of other people who also understand how this works, I remember Grimani.

 

In even larger rooms with some absorption, the sound is still dominated by the direct sound from the speakers, as the measurements in the SMPTE report clearly show.

 

Which SMPTE report?

 

Seat distance is a crucial detail.  Dolby standards require theatrical mixes be monitored at least 10 meters from the speakers.  I believe the mix position in most dub stages is closer to 20 meters back.  But I think that most of the reason for the apparent complexity is simply that we're calibrating against the wrong thing.  Calibrate against first arrival SPL across all frequencies (with appropriate adjustments at the top and bottom) and all this apparent complexity may mostly go away.  I say mostly, because of course, stuff like severe acoustical problems may still be an issue, but these need to be addressed independently.

 

To be fair, I could be wrong about all of this, but this approach has been by far the most successful for me.  Note that while my left and right speakers measure 85 dBC with pink noise, my center comes in closer to 82 dBC.  Subjectively, they sound balanced.  Why does this happen?  Because in the current configuration, I absorb the strong ceiling reflection from the center but not the left and right mains.  All three speakers are dead-on in terms of first arrival SPL, but the energy from that ceiling reflection pushes up the pink noise level on the left and right with no impact on subjective level.
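 

For anyone who wants to sanity-check how much an early reflection can inflate a steady-state pink noise reading without changing the first arrival, the contributions add on an energy basis. Here is a minimal sketch in Python; the 82 dB direct level and the reflection level are assumed numbers purely for illustration, not measurements:

```python
import math

def db_sum(*levels_db):
    """Incoherent (energy) sum of several SPL contributions, in dB."""
    return 10 * math.log10(sum(10 ** (l / 10) for l in levels_db))

direct = 82.0            # first-arrival level, assumed identical for all three speakers
ceiling_bounce = 79.0    # hypothetical strong ceiling reflection present on L/R, absorbed on center

print(f"Center (direct only):       {direct:.1f} dB")
print(f"L/R (direct + reflection):  {db_sum(direct, ceiling_bounce):.1f} dB")
# With these assumed numbers the L/R pink noise reading comes out ~1.8 dB hotter,
# even though the first arrival is identical on all channels.
```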

Link to comment
Share on other sites

I do agree that calibration (for frequency target response corrections) is done against the wrong criteria. Simply EQing according to a measured frequency response will not give consistent and correct results.

 

Just consider the fact that the measurement will be strongly affected by objects near the measurement position, such as seating. Should the seats be removed? What about people sitting in nearby seats? This will affect the measured response in the midrange, and correcting for those errors is wrong.

Link to comment
Share on other sites

Awesome!  I've heard about that study and didn't realize it was made into an SMPTE publication.  I'm still digging through it, but one thing I see right away is that they are using longer window lengths, 10 ms or 1/24th octave FDW at minimum.  The windowed frequency data also gets smoothing with a 1/12th octave moving average filter.

 

I just now applied both of these windowing methods to my current center channel response.  Incidentally, both windowing criteria give remarkably similar results.  In both cases, I'm 2-3 dB hotter (on average) in the upper mid than treble.  I believe this is because of early reflection energy that arrives between 1.2 and 1.9 ms containing energy from 1-5 kHz.  IIRC, this energy is a mix of reflections involving my rear wall absorber face, which is only 12 in away and is not 100% absorbent, and my microfiber sofa back.  I have tried to EQ above 1 kHz to flat using a 10 ms window, and there was definitely too much treble vs. upper mid, subjectively speaking.  Stuff just sounds way more realistic the way I have it now, using a 1/3rd octave FDW, which omits most of that energy.
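 

As a rough check on those arrival times, the delay of an early reflection is simply the extra path length divided by the speed of sound. A small sketch: the 12 in spacing comes from the post above, but treating the surface as directly behind the mic is my simplification:

```python
SPEED_OF_SOUND = 343.0  # m/s at roughly room temperature

def reflection_delay_ms(extra_path_m):
    """Delay of a reflection relative to the direct sound, given the extra path length."""
    return extra_path_m / SPEED_OF_SOUND * 1000.0

# Surface ~12 inches behind the listening position: the sound travels there and back.
extra_path = 2 * 12 * 0.0254   # about 0.61 m
print(f"{reflection_delay_ms(extra_path):.1f} ms")   # ~1.8 ms, within the 1.2-1.9 ms range quoted above
```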

 

I don't think seats should be physically removed for the measurement.  If the seat interferes *too much*, one may wish to consider changing furniture.  My microfiber couch is reflective, but the back is shorter, so the reflection is mild.  I'd be a bit more concerned about a headrest with reflective material.  Even though the early reflection from my sofa adds quite a bit of ripple to my mid response, I notice insignificant change when moving my head around while listening to content other than sine waves.  As for people sitting in nearby seats, I can actually hear the difference with content playing, but it's easy to adapt to.

Link to comment
Share on other sites

Well there you see, I just assumed everybody knew this report.

 

The measurements are presented with different windows, which makes it possible for people who understand acoustics and sound to get a much better picture of what is going on with the sound in the room. Even better if they had just provided .mdat files, but they used something else to do the measurements, so they have no .mdat files, and they would still have to present pictures of scaled plots, because very few readers would know what to do with the .mdat.

 

What can be seen in this report is that they do follow the standards, but that is not sufficient to ensure similar sound at different venues. I think it is fair to say that without a standard, the results would have been a lot worse.

Link to comment
Share on other sites

Awesome!  I've heard about that study and didn't realize it was made into an SMPTE publication.  I'm still digging through it, but one thing I see right away is that they are using longer window lengths, 10 ms or 1/24th octave FDW at minimum.  The windowed frequency data also gets smoothing with a 1/12th octave moving average filter.

 

I just now applied both of these windowing methods to my current center channel response.  Incidentally, both windowing criteria give remarkably similar results.  In both cases, I'm 2-3 dB hotter (on average) in the upper mid than treble.  I believe this is because of early reflection energy that arrives between 1.2 and 1.9 ms containing energy from 1-5 kHz.  IIRC, this energy is a mix of reflections involving my rear wall absorber face, which is only 12 in away and is not 100% absorbent, and my microfiber sofa back.  I have tried to EQ above 1 kHz to flat using a 10 ms window, and there was definitely too much treble vs. upper mid, subjectively speaking.  Stuff just sounds way more realistic the way I have it now, using a 1/3rd octave FDW, which omits most of that energy.

 

I don't think seats should be physically removed for the measurement.  If the seat interferes *too much*, one may wish to consider changing furniture.  My microfiber couch is reflective, but the back is shorter, so the reflection is mild.  I'd be a bit more concerned about a headrest with reflective material.  Even though the early reflection from my sofa adds quite a bit of ripple to my mid response, I notice insignificant change when moving my head around while listening to content other than sine waves.  As for people sitting in nearby seats, I can actually hear the difference with content playing, but it's easy to adapt to.

 

If you use the decay plot, you can see in one picture what is going on with the sound across the time range. Then you can adjust the first arrival (the topmost, first-in-time curve) according to the later decay curves to make the overall decay more flat across the frequency range. I am not saying this is how the response should be; it is just something to try, and then listen to. If some part of the frequency range sounds a bit off balance, this can be one tool to improve things.
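 

In case it helps anyone reproduce this kind of view outside their measurement software, here is a minimal sketch that computes a family of windowed magnitude responses from an impulse response array - the shortest window approximates the first arrival, the longer ones include progressively more of the decay. The window lengths and the half-Hann fade-out are arbitrary choices for illustration, not a claim about how any particular program does it:

```python
import numpy as np

def windowed_responses(ir, fs, windows_ms=(1, 5, 20, 80)):
    """Magnitude responses of progressively longer initial segments of an impulse
    response, each with a half-Hann fade-out to reduce truncation ripple."""
    n_fft = 1 << int(np.ceil(np.log2(int(fs * max(windows_ms) / 1000) + 1)))
    freqs = np.fft.rfftfreq(n_fft, 1.0 / fs)
    curves = {}
    for w in windows_ms:
        n = int(fs * w / 1000)
        seg = ir[:n] * np.hanning(2 * n)[n:]          # keep the start, fade out the tail
        curves[w] = 20 * np.log10(np.abs(np.fft.rfft(seg, n_fft)) + 1e-12)
    return freqs, curves

# Example with a toy impulse response: a direct spike plus a weaker "reflection" at 5 ms.
fs = 48000
ir = np.zeros(fs // 4)
ir[0] = 1.0
ir[int(0.005 * fs)] = 0.4
freqs, curves = windowed_responses(ir, fs)
```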

 

As for the chair - in an auditorium, cinema or concert hall you cannot easily remove the seats to do measurements. The chair will affect the response because it changes the boundary conditions close to the mic. Reflections off the surface matter most for higher frequencies; that is why I have blankets on the sofas in The Moderate Cinema. The blankets are of course high-end, especially chosen for best sound, yet they still reflect too much to get a decent impulse response reading at higher freq.

 

If I removed the sofas (ha ha... as if that is going to happen...) each time I do a measurement, it would measure differently from how I do it now - simply place the mic where the head is in the listening position, blankets in place, and perhaps adjust the boundary conditions with additional damping in the sofa seat.

 

In Room2 it is very easy to remove the chair, so that is what I do if I want to measure high freq. For low freq it does not matter, for mid the response measures flatter - or in any case different - with the chair in place.

 

The next step now for a true audiophile would be to start wondering about how a better or more correct sounding chair can improve the sound. Perhaps go to a furniture store to find some different chairs for evaluation. "Excuse me, I am looking for a chair. Do you have anything with a slightly warmer lower midrange?"

 

Then you realize that the response will also change when you are sitting in the chair - your body affects the sound. Body size, weight, your clothing, how you sit - all of this will cause very measurable differences.

 

A sane person will now have concluded that something is evidently wrong here, that something is missing. And it has to do with the criteria used for evaluation of the sound - the frequency response measurement. There will always be deviations from the ideal flat response when measuring in a room, and seats will have a significant effect on the measured response. But that does not matter much for the perceived sound. There are other parameters of the sound that cause one system to sound different from another.

Link to comment
Share on other sites

I spent quite a few hours digging through the report last night.  There is a lot of interesting and useful data there, even though I very much wish I could get at the raw measurement data.

 

The analytical methods they used make it very difficult if not impossible for me to find the information I'm looking for.  As I said already, the window lengths are too long to isolate the first arrival sound.  At 10 ms, the reflection from the floor has already contaminated the measurements.  Other boundary reflections may also appear within that window.  Decay plots do not reveal this information either.  The "time 0" curve is steady-state frequency response (with smoothing), not first arrival.  The one plot that might be helpful is the cumulative energy plot, but I'm not sure.  I'm rather doubtful that the dB level of the initial rise is an accurate indicator of direct SPL relative to continuous (sine sweep or pink noise measured) SPL, for example.  If it is, then that suggests these systems have a very similar direct-to-reflected sound ratio to my own system and room and probably sound too loud at 85 dBC reference.  I seriously doubt that's the case in reality, given the major discrepancy in listening distance.  But I'll have to do a lot more work to confirm this one way or another.

 

I also disagree that "without standards, the sound would be a lot worse".  Their own comments about listening with the EQ engaged and bypassed seemed to be very inconclusive.  Ironically, music releases, despite their complete lack of standards, offer better tonal balance consistency between different program material, IMO.  At the same time, more recent film releases do seem to be more consistent than older releases.  This leads me to believe that dub stage mixers, being well aware of the mess with the current standards, may simply be avoiding the use of EQ in the soundtrack, knowing that it is likely to only make translation worse.

 

Anyway, I can't really blame them for doing the study the way they did it.  A couple years ago, I would have done it almost exactly like they did, right down to the use of a 10 ms window to try to isolate something psycho-acoustically relevant.  I just happen to know now that these criteria don't work for me in my room without some kind of arbitrary target curve that probably doesn't translate between rooms.  On the plus side, there is still a wealth of useful information there such as RT60 vs. frequency and physical dimensions of some of the rooms, from which I can make a better acoustic model of what's really happening in those places.
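 

For building that kind of rough acoustic model from RT60 and room dimensions, the standard statistical-acoustics estimates of critical distance and direct-to-reverberant ratio are enough for a first pass. A sketch with made-up numbers - the volume, RT60 and directivity below are placeholders, not values taken from the report:

```python
import math

def critical_distance_m(volume_m3, rt60_s, directivity_q=1.0):
    """Distance at which direct and reverberant energy are equal (diffuse-field estimate)."""
    return 0.057 * math.sqrt(directivity_q * volume_m3 / rt60_s)

def direct_to_reverb_db(distance_m, volume_m3, rt60_s, directivity_q=1.0):
    """Direct-to-reverberant ratio at a given listening distance, in dB."""
    dc = critical_distance_m(volume_m3, rt60_s, directivity_q)
    return 20 * math.log10(dc / distance_m)

# Hypothetical dub-stage-ish numbers: 3000 m^3, RT60 of 0.5 s, screen speaker with Q ~ 10.
for dist in (10, 20):
    print(dist, "m:", round(direct_to_reverb_db(dist, 3000, 0.5, 10), 1), "dB")
```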

 

As for the listener's chair, I think you get my point.  If the chair noticeably changes the sound, then an audiophile may wish to replace it.  But if the effect is not audible yet it does change the frequency response plot, then obviously the frequency response plot does not accurately indicate what we hear and should not be used as a criterion for calibration.  A much better criterion is one which only changes when the subjective sound quality also changes.

 

Edit: After calculating the geometry, I figured out that the floor reflections for the close mics in the two dub stages won't arrive until well after 13 ms or so.  However, the article does not appear to contain any plots using the smaller 10 ms window size.  That's a real bummer, because then we could see the direct response of the speakers/baffle system without contamination from reflections.
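 

The geometry behind that kind of estimate is just an image-source calculation: mirror the speaker in the floor and compare path lengths. A sketch with placeholder heights and distances - I do not know the actual rig geometry in those rooms, so the numbers only show the method:

```python
import math

SPEED_OF_SOUND = 343.0  # m/s

def floor_bounce_delay_ms(src_height, mic_height, horizontal_dist):
    """Extra arrival time of the floor reflection vs. the direct sound, via an image source."""
    direct = math.hypot(horizontal_dist, src_height - mic_height)
    bounce = math.hypot(horizontal_dist, src_height + mic_height)   # speaker mirrored in the floor
    return (bounce - direct) / SPEED_OF_SOUND * 1000.0

# Placeholder geometry: screen speaker 5 m up, mic at ear height 1.2 m, 20 m back (main seating).
print(round(floor_bounce_delay_ms(5.0, 1.2, 20.0), 1), "ms")   # ~1.7 ms, well inside a 10 ms window
# Placeholder geometry: same speaker, close mic raised near screen height, 3 m out.
print(round(floor_bounce_delay_ms(5.0, 5.0, 3.0), 1), "ms")    # ~21.7 ms, well outside a 10 ms window
```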

Link to comment
Share on other sites

There is much information in that report, no doubt. It also helps to explain why movies sound like they do - the bass cut-off, the strange tonal balance on the dialog.

 

Now, something different. In the 2-ch article, the first chapter has a picture which shows what we can hear and what we cannot hear. As usual, it is a simplified presentation of how it really works, but for now, it is quite sufficient.

 

We can use this picture to find requirements for maximum distortion and noise levels for transparent sound.

 

I did a practical experiment, and found this relationship for audibility of harmonic distortion.

 

I tested 80 Hz, 440 Hz and 2 kHz, at levels from 60 dB to 80 dB.

 

From already established theory we know there is masking around the fundamental tone, so audibility for the 2nd and 3rd harmonics (2h, 3h) should be reduced. For higher harmonics we expect audibility to approach the hearing threshold down at 0 dB. We should then expect stricter requirements at higher SPL, since the audible distortion level relative to the fundamental becomes lower.

 

And this is exactly what I found.

 

For 2h, around 2% is the limit. Also, for 2h, the phase of the distortion component matters for how it is perceived, and the distortion can actually make the tone appear cleaner than it is without it. Which suggests that added 2h distortion can actually "improve" the sound.

 

For higher harmonics the lowest audible distortion was 0.016%, that is -76 dB (440 Hz at 80 dB, 8th harmonic). Still not very difficult to achieve, but very far from the old "you can't hear less than 1%".
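 

For reference, converting between the percentage figures and dB relative to the fundamental is a one-liner; here is a small sketch that reproduces the two numbers above:

```python
import math

def thd_pct_to_db(pct):
    """Level of a distortion component relative to the fundamental, in dB."""
    return 20 * math.log10(pct / 100.0)

def db_to_thd_pct(db):
    """Inverse conversion: dB relative to the fundamental back to a percentage."""
    return 100.0 * 10 ** (db / 20.0)

print(round(thd_pct_to_db(2.0), 1))     # ~ -34.0 dB, the 2nd-harmonic limit mentioned above
print(round(thd_pct_to_db(0.016), 1))   # ~ -75.9 dB, the lowest audible case (440 Hz, 80 dB, 8th harmonic)
```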

 

The masking effect:

 

post-181-0-74244300-1486316558_thumb.jpg

 

Audibility limits for distortion:

 

post-181-0-14014500-1486316639_thumb.jpg

 

And if we overlay a distortion measurement from a DAC we all know is not good enough - well, it certainly looks good enough here:

 

post-181-0-75771000-1486316728_thumb.jpg

 

Here is an amplifier, I think it is my C15:

 

post-181-0-52272500-1486316863_thumb.jpg

 

Let us find something really bad - a cheap receiver amp; we all know they don't sound good:

 

post-181-0-44351700-1486316953_thumb.jpg

 

 

From this we learn that we should expect all this equipment to be sonically transparent. And that matches the double-blind listening I did in the amplifier test - it was not possible to hear any difference between the original source and the output of the amplifier.

 

This does not mean all amps sound the same - some distort too much, with components well into the audible range, and some are not powerful enough, and then they will distort - very much distortion, very audible distortion.

 

What is important to understand here is that once the criteria for audibly transparent sound are satisfied, there is no gain in having better performance; there is no "only a very small improvement with the uber-expensive amp", because there is simply no difference.

 

I posted this on the hifi-forum recently, but I suspect many stopped reading before learning what I was trying to say, and those who read it all simply don't get it. Perhaps because they really don't want to know - they are happy living in a delusional world - kind of like telling someone their god does not exist.

 

Link to comment
Share on other sites

The link to the SMPTE article did not work for me?

 

Something strange happened with the link - I tried to edit it, then removed it, and suddenly it works again.

 

If it does not work, just copy the text:

 


 

https://www.smpte.org/sites/default/files/SMPTE%20TC-25CSS-B%20CHAIN%20FREQUENCY%20AND%20TEMPORAL%20RESPONSE%20ANALYSIS%20OF%20THEATRES%20AND%20DUBBING%20STAGES%201%20Oct%202014.pdf

Link to comment
Share on other sites

One thing to keep in mind about auditory masking is that the thresholds change according to loudness and frequency. Lower frequency sounds will mask a wider frequency band, and higher sound pressure levels will mask a wider frequency band. Auditory masking is dynamic and non-linear. 

Link to comment
Share on other sites

One thing to keep in mind about auditory masking is that the thresholds change according to loudness and frequency. Lower frequency sounds will mask a wider frequency band, and higher sound pressure levels will mask a wider frequency band. Auditory masking is dynamic and non-linear. 

 

True, and you can actually see it on the distortion picture I made - the masking increases for the louder tones.

 

My picture is only an approximation; it all depends on so many factors, such as differences in hearing capability.

 

The picture does not show wider masking for lower frequencies, but the lowest tone here is 80 Hz; maybe it will look different for lower frequencies.

 

This is very much like a worst-case scenario - single continuous sine waves. For real music content, the audibility limits will be far higher. Which corresponds with the results from the amplifier test - all amplifiers could be verified as different from the original signal when playing test-tone signals, but none could be verified as sounding different with music.

 

One detail that can increase audibility is the fact that the higher harmonics do not necessarily distribute their energy evenly across the timespan of one period of the original tone; this will be the case with clipping or crossover distortion. Then the peak amplitude of the distortion will be louder than the spectral analysis shows. I did some experiments with this and found that if you add a 10 dB safety margin it will always be safe. For a 10th harmonic it is possible to get around 20 dB higher peak level than the spectral analysis shows, but then there is the time aspect - the time is so short that we cannot hear the peak level; we rather hear something more similar to what the spectral analysis shows. Hearing works more like a sound energy detector - we perceive loudness as sound integrated across a time-span.
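 

The peak-versus-spectrum point can be illustrated with a toy calculation: if a harmonic's energy is squeezed into a small fraction of the fundamental's period (as with clipping or crossover glitches) while the RMS level - which is what the spectrum analyzer reports - stays the same, the peak amplitude rises roughly as one over the square root of that fraction. A sketch; the duty-cycle values are arbitrary and only show the trend:

```python
import math

def peak_increase_db(duty_cycle):
    """Extra peak level of a harmonic whose energy is confined to a fraction of the
    period, compared with a continuous sine of the same RMS (spectral) level."""
    return -10 * math.log10(duty_cycle)

for duty in (1.0, 0.25, 0.1, 0.01):
    print(f"energy in {duty:>5.0%} of the period -> peak {peak_increase_db(duty):4.1f} dB above spectral level")
# At 1% duty the peak sits ~20 dB above what the spectrum suggests, which is the
# order of the worst case described above.
```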

 

If I had included even louder tones, the data would be better and show a more complete picture. But listening to that 2 kHz tone at 80 dB is actually quite annoying, and even louder would be unpleasant.

 

The time aspect is also very interesting to get better information about. We know audibility requires the sound to have a minimum time-span; the shorter it is in time, the louder it has to be to be audible.

 

Which should mean that distortion is more difficult to hear for transient signals. And since it is the transients that will be louder than, say, 80 dB when we play loud, it is reasonable to believe that increased distortion at very loud levels is not a problem.

 

If there is too much distortion, it will impact sound quality. But the levels we can accept are much higher than most believe.

 

I have concluded that distortion is not an important parameter for loudspeaker sound character. As long as the speaker is within its comfortable limits, the character is determined by other - linear - faults. The character of the sound is the same when playing loud or soft.

 

When SPL increases, the distortion from a speaker increases a lot. This also happens with the speakers I make, even if they have the SPL capacity to play much louder than necessary for any sane or insane level in a small room. This is easy to see on the RTA when playing sine waves. But it does not have any significant impact on sound. What does have significant impact is when there is not enough capacity and distortion skyrockets to 100-200%. This happens when amplifiers clip or drivers are pushed beyond their limits. For small hifi speakers this limit is easily reached at modest SPL levels, because music signals have a very high peak value relative to the rms value.
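 

The crest-factor point is easy to put numbers on: peak power scales with the square of the peak-to-RMS voltage ratio, so music with a high crest factor eats amplifier headroom very quickly. A sketch with made-up figures - the 3 W average and the crest factors are only illustrative:

```python
def peak_power_w(average_power_w, crest_factor_db):
    """Peak power needed to pass a signal of the given average power without clipping."""
    return average_power_w * 10 ** (crest_factor_db / 10.0)

average_w = 3.0   # hypothetical average power for comfortably loud playback
for crest_db in (3, 12, 20):       # sine wave, typical music, very dynamic material
    print(f"crest factor {crest_db:2d} dB -> {peak_power_w(average_w, crest_db):7.1f} W peak")
# Even a modest average level can demand hundreds of watts on peaks, which is where
# small hifi speakers and amplifiers run out of headroom.
```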

Link to comment
Share on other sites

THD is a consequence of non-linearity acting on sine wave inputs.  With real world signals, most of the same non-linearities will induce non-harmonic distortions that are likely to be much more objectionable.  I guess you could argue that the magnitude of the IM distortion will likely be similar to the magnitude of the THD for the lower frequency stimulus, and that the analysis is looking at psycho-acoustic audibility thresholds as kind of a worst case scenario approach.  So there's that.

 

OTOH, another thing to keep in mind with THD and audibility is that the distortion adds to or subtracts from harmonics that are likely also present in the content.  If the harmonics in the content are slightly below the audible threshold due to masking, then the distortion level needs to be enough to cause them to exceed threshold in order for the sound to substantially change.  So it will always be possible to find situations in which very small amounts of distortion have audible consequences.

 

But let's flip that argument again and note that the relative level of harmonics vs. fundamental also depends a lot on the linear response of the system, and very small changes in the linear response (i.e. 0.25 dB) can potentially have substantial audible consequences, even ignoring non-linear effects.  Consider that if turning up the volume increases some HD by 0.25 dB, then it's possible for such distortion to be audible in some circumstances (linear response slightly below threshold) and not audible in others (linear response already above threshold).

 

I think the more important point regarding distortion is that it is almost always much higher in the speaker than in the electronic components.  Even if distortion in electronics is audible, it's certain to be extremely subtle, and the fact of the matter is that almost every audio system on the planet suffers from flaws that are not subtle.  In my efforts, I'd much rather worry about improving all the stuff that's not subtle and doesn't require golden ears to hear.

 

Actually, one thing that bugs me about the obsession over "the sound of electronics" is that it often takes attention away from details that *do* matter.  For example, consumer pre-pros routinely publish specs like THD (often without qualification of the signal input), but few publish the max output voltage - information that is necessary to achieve proper gain structure and avoid unwanted noise on the one hand, or clipping on the other.  There are also issues of basic implementation competence, like the lack of headroom in Oppo's bass management.
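 

On the gain-structure point, the arithmetic is simple once the max output voltage is actually published. A sketch with hypothetical numbers - a 2 Vrms pre-out and a 29 dB gain amplifier rated 200 W into 8 ohms; none of these are specs of any particular product:

```python
import math

def amp_input_sensitivity_vrms(rated_power_w, load_ohms, gain_db):
    """Input voltage that drives the amplifier to its rated (clipping) output."""
    v_out_max = math.sqrt(rated_power_w * load_ohms)
    return v_out_max / 10 ** (gain_db / 20.0)

preout_max_vrms = 2.0                          # hypothetical pre-pro maximum output
sens = amp_input_sensitivity_vrms(200, 8, 29)  # hypothetical power amp
print(f"Amp clips at {sens:.2f} Vrms in; pre-out can deliver {preout_max_vrms:.2f} Vrms")
print(f"Pre-out drive beyond amp clipping: {20 * math.log10(preout_max_vrms / sens):.1f} dB")
# A positive number means the pre-out can push the amp into clipping before the
# pre-out itself runs out of voltage; a negative number means the amp can never
# reach full power from this source.
```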

Link to comment
Share on other sites

Agreed that distortion from loudspeakers typically isn't a big problem, as long as common sense is used regarding the application of the speaker - i.e. you would not want to try to fill a large room with music from a small bookshelf speaker. Linear distortion - i.e. poor frequency response, many times resulting from the effects of room acoustics on the sound - is a much greater problem than non-linear distortion.

Link to comment
Share on other sites

Target curves and how to measure:

 

The chapter on Calibration - Target curves was updated for the 2-ch article, but it is still a very brief and compressed text which does not explain everything in detail.

 

When trying out different target curves and applying eq to force the measured response to match this target, there are some issues you need to consider.

 

The reason is that the measurement may not be a correct representation of the response actually present, so the corrections applied - whether done manually or by automatic room correction software - will not give the desired response, even if it looks that way from the measurement.

 

At higher freqs the direct sound totally dominates, and what you see on the measurement is a good representation of the actual response - if done right.

 

Any objects very close to the mic - such as the mic stand, the mic attachment clip, or the back of the sofa or chair - will affect this part of the response as well. Use blankets or something similar to absorb these very early reflections, but even then there will be reflections, and those reflections will affect the measurement.

 

For hf, the easiest and best solution is to remove the chair/seating.

 

At mid freqs the boundary conditions around the mic - seating, floor - will affect the response severely. You can see this by comparing measurements with and without a chair.

 

When a person sits in the chair, the response will also change.

 

So, what to do? I have seen people make a dummy listener to place in the chair for measurements. Others suggest removing the chair and other items close to the listening position.

 

The problem will typically be present in the 200 Hz - 2 kHz range, and is worst around 500-1000 Hz.

 

How severely the response is affected also depends on the properties of the sound field. A directional sound field - i.e. one where the direction of the particle velocity does not change with frequency - behaves differently from a more diffuse sound field with lots of reflections.

 

At lower freqs those very close surfaces do not have the same impact on the measurement. It is now the boundary conditions around the listening position and around the speaker that cause deviations.

 

The answer is to not correct for any changes in the response caused by the chair or the listener. You want a system that presents a smooth, flat response in the room, and when the presence of a listener and chair changes this response, that should not be corrected for, because those changes will always be present and can be considered part of the hearing mechanism.

 

The easiest solution is to use a speaker that has been voiced and tuned to give exactly this flat, smooth response, with no eq necessary. Deviations from a smooth curve are caused by the room, and if you can find what is causing them, you can change things with acoustic treatment.

 

If you have a custom system that needs eq simply because there is no reference for how to set it up to give a proper response, there is no simple and quick fix. You can try to use nearfield measurements to get more information, but two problems arise - the measurement will still be affected by boundary reflections, and the response too close to the source will not be representative of how it looks at some distance. Horns especially are tricky here; you cannot use a measurement taken at the mouth of a horn for anything useful - at least get to a distance equal to the largest dimension of the mouth.
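 

If it helps, here is that rule of thumb as a tiny calculation, together with the stricter 2D²/λ far-field criterion for comparison. The 0.5 m mouth and the 10 kHz frequency are assumed example values, not measurements of any specific horn:

```python
SPEED_OF_SOUND = 343.0  # m/s

def far_field_distance_m(largest_dim_m, freq_hz):
    """Classic 2*D^2/lambda far-field criterion at a given frequency."""
    return 2 * largest_dim_m ** 2 / (SPEED_OF_SOUND / freq_hz)

mouth = 0.5   # hypothetical horn mouth, largest dimension in metres
print("largest-dimension rule of thumb:", mouth, "m")
print("2*D^2/lambda at 10 kHz:", round(far_field_distance_m(mouth, 10_000), 1), "m")
# The strict criterion quickly becomes impractical indoors, which is one reason the
# simpler largest-dimension rule above is used in practice.
```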

 

If you need a measurement for automatic room correction, you should remove the chair and any other objects close to the measurement position. You want the room correction to fix reflections around the speaker, which will appear similar across the whole room, but you do not want corrections for a chair that will change its acoustical appearance completely once you sit down in it.

 

At low freqs, in the bass range, the room dominates the response and here eq should be used to reduce those effects. Ideally, you should fix it with acoustic treatment, but usually that is not possible due to the size required. It is still possible to get very reasonable results using a combination of some bass treatment and eq.
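 

As an example of the kind of eq meant here, a single parametric cut on a dominant room mode goes a long way. Below is a sketch using the RBJ audio-EQ-cookbook peaking filter; the 45 Hz centre, -6 dB depth and Q of 4 are hypothetical values one would read off a measurement, not a recommendation:

```python
import math

def peaking_eq_coeffs(fs, f0, gain_db, q):
    """Biquad coefficients (b, a) for a peaking/parametric EQ, RBJ cookbook form."""
    a_lin = 10 ** (gain_db / 40.0)
    w0 = 2 * math.pi * f0 / fs
    alpha = math.sin(w0) / (2 * q)
    b = [1 + alpha * a_lin, -2 * math.cos(w0), 1 - alpha * a_lin]
    a = [1 + alpha / a_lin, -2 * math.cos(w0), 1 - alpha / a_lin]
    return [x / a[0] for x in b], [x / a[0] for x in a]

# Hypothetical correction: cut an assumed 45 Hz room mode by 6 dB with Q = 4 at 48 kHz.
b, a = peaking_eq_coeffs(48000, 45.0, -6.0, 4.0)
print("b =", [round(x, 6) for x in b])
print("a =", [round(x, 6) for x in a])
```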

 

To summarize - be careful with any eq corrections throughout the midrange; usually a good speaker will not need any eq here.

Link to comment
Share on other sites
