Digital Music: Can You Hear Above 16-bit/44.1kHz?

Dave Hamilton's Blog

Imagine I said, "You should watch my special Blu-ray copy of Star Trek because mine is encoded with extra ultraviolet and infrared data and that totally makes it look better!"

At best you'd think, "My TV doesn't display UV or IR so it won't make any difference since I can't see those frequencies anyway." At worst you'd be concerned about getting a sunburn or a heat rash.

Now imagine I said, "You should listen to my special audio version of Miles Davis's Kind of Blue because I re-encoded it from the masters with special ultrasonic data and that totally makes it sound better!"

At best you'd think, "My amplifier filters out inaudible sounds so it won't make any difference." At worst you might even worry what pumping all this extra audio data into your speakers could do to them.

The thing is, this second example isn't hypothetical. What I've described above is happening right now, as companies like Neil Young's Pono are telling people that audio files encoded at a bit depth of 24 bits or a sample rate of 192kHz sound better on playback than the 16-bit, 44.1kHz versions of the same audio (recording is different, and we'll get to that later).

This difference between 16-bit/44.1kHz audio and anything greater than that has been tested (a lot... in double-blind tests) and we have yet to find any human who can reliably notice it. Bit depths greater than 16 bits and sample rates above 44.1kHz simply don't matter as long as the data is converted properly. The industry's ability to do that conversion has improved substantially since those very first CDs were released at the dawn of the digital music era, though most of us at home don't have the skill or setup to convert and downsample properly.

Bit Depth and Sample Rate Explained

A digital audio file's bit depth determines its dynamic range (the difference between the softest portion and the loudest portion). 16 bits gives us about 96 decibels (dB) of range; 24 bits gives us about 144dB. Per the double-blind testing mentioned above, no human can hear that difference.
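Those two figures follow from the usual rule of thumb that each bit of linear PCM adds roughly 6dB of dynamic range. A minimal Python sketch of the arithmetic (the function name is only for illustration, and the formula ignores dither and noise shaping):

```python
import math

def dynamic_range_db(bits):
    """Theoretical dynamic range of linear PCM: 20 * log10(2 ** bits)."""
    return 20 * math.log10(2 ** bits)

for bits in (16, 24):
    print(f"{bits}-bit: {dynamic_range_db(bits):.1f} dB")
```

This prints roughly 96.3 dB for 16 bits and 144.5 dB for 24 bits, which is where the rounded 96dB and 144dB numbers come from.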

The sample rate of an audio file determines the frequency range that can be reproduced. The Nyquist-Shannon sampling theorem tells us that perfect-fidelity reproduction is possible with a sample rate greater than twice the maximum frequency we wish to reproduce. This means a 44.1kHz sample rate can reproduce frequencies up to just under 22.05kHz. Most human children can hear from about 20Hz up to about 20kHz. As we age, the high end of that range drops very quickly. Again, double-blind testing confirms all of this.
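One way to see why the limit sits at half the sample rate: a tone above that limit produces the same samples as a lower-frequency tone, so the two cannot be told apart once sampled. A small Python sketch, using a 30kHz tone chosen purely for illustration:

```python
import math

SAMPLE_RATE = 44_100           # CD sample rate in Hz
NYQUIST = SAMPLE_RATE / 2      # half the sample rate: 22,050 Hz

def sample_cosine(freq_hz, count, sample_rate=SAMPLE_RATE):
    """Return `count` samples of a unit-amplitude cosine at freq_hz."""
    return [math.cos(2 * math.pi * freq_hz * n / sample_rate) for n in range(count)]

ultrasonic = sample_cosine(30_000, 64)               # above the Nyquist limit
alias = sample_cosine(SAMPLE_RATE - 30_000, 64)      # 14,100 Hz, below it

# The two sample sequences agree to within floating-point rounding.
max_diff = max(abs(a - b) for a, b in zip(ultrasonic, alias))
print(NYQUIST, max_diff)
```

This is why converters low-pass filter the signal before sampling: content above 22.05kHz would otherwise fold back down into the audible band.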

Science Prevails Over Emotion

There's a lot of science behind what we're discussing here. Nyquist's theorem has been proven by others many times over (hence the reason that many others' names are often attached to it). If this article instills in you the need to reconfirm all of this on your own (and you'd be in the good company of this author and several recording-industry professionals if it does), the only way to do it is to use a tool that allows you to perform your own double-blind testing (ABXtester, available for free for both iOS and Mac, works great). Without double-blind testing you (and I!) are quite subject to confirmation bias. Our minds are not objective when we have too much information.

There's also a ton of pseudoscience here. Earlier this month at CES I had Bruce Botnick, the producer of The Doors' LA Woman, tell me that after listening to a 24-bit album he feels better than he does after listening to the same album at 16-bit. That's great for him, folks, but doesn't mean much to the rest of us. Also, Bruce is busy working with Neil Young to convince us all to buy their 24-bit Pono player, and I'm sure Bruce feels better after each Pono pre-order, too.
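If you do run your own ABX trials, it helps to ask how likely your score would be from pure guessing. A rough Python sketch using the binomial distribution, assuming each trial is an independent 50/50 guess (the function name and trial counts are illustrative and not taken from any particular ABX tool):

```python
from math import comb

def guess_probability(correct, trials):
    """Chance of getting at least `correct` of `trials` right by coin-flip guessing."""
    favorable = sum(comb(trials, k) for k in range(correct, trials + 1))
    return favorable / 2 ** trials

print(guess_probability(3, 4))     # 0.3125
print(guess_probability(12, 16))   # about 0.038
print(guess_probability(16, 16))   # about 0.000015
```

By this measure, a short run says very little: 3 of 4 correct happens by chance almost a third of the time, while 12 of 16 happens by chance only about 4% of the time.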

The truth is there is the potential to store more data in a 24-bit/192kHz file than in a 16-bit/44.1kHz file. You can see it if you just compare the two: the former will be many times larger than the latter. If you were playing this to a dog, for example, that creature might well be able to tell the difference (because dogs can typically hear higher frequencies than humans, so it makes sense to use a higher sample rate for music targeted towards audiophile dogs). If you're like me and will be playing your music for humans, though, our ears haven't yet evolved to the point where anything above 16-bit/44.1kHz matters upon playback. That's why it was standardized for compact discs at the beginning of the digital audio era, and that's why it still works today.
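To put a number on "many times larger", here is a quick Python sketch of the raw data rates, assuming uncompressed stereo PCM (lossless compression would shrink both files, so real-world sizes will differ):

```python
def pcm_bytes_per_minute(bits, sample_rate, channels=2):
    """Uncompressed PCM size for one minute of audio, in bytes."""
    return bits // 8 * sample_rate * channels * 60

cd = pcm_bytes_per_minute(16, 44_100)
hires = pcm_bytes_per_minute(24, 192_000)

print(f"16-bit/44.1kHz: {cd / 1e6:.1f} MB per minute")
print(f"24-bit/192kHz:  {hires / 1e6:.1f} MB per minute")
print(f"ratio: {hires / cd:.2f}x")
```

Under those assumptions the high-resolution file is about 6.5 times the size of the CD-quality one.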

Music is Emotional, and That's a Good Thing

Music isn't just a listening experience. Music is also very much an emotional experience. If you believe your speakers are better than your friend's speakers, you're going to be happier listening to music at home. There's nothing wrong with that. I might like the band Weather Report (I do); my wife might hate them (she does). It's simply personal preference and we're both right.

Similarly, if you've convinced yourself that 24-bit or 192kHz (or both) sound better you are quite likely to believe you hear a difference when listening to music in that format. There's nothing inherently wrong with doing something solely because it makes you happy to do it. Just don't confuse that for science; and please don't try to convince others with pseudoscience. That's the last thing we, as a people, need.

Next, head to page 2 where we address the recording process, mastering, why Pono is great, iTunes and more.

Comments

wchpitt

Having spent years in high end audio and “hearing” the difference between two different speaker wires…
I look at it this way: listening to digital music is like looking through a screen. No matter what you do the image/light on the other side has to fit/pass thru the squares to get to your eyes. You can’t put back the light that hits the wire that makes up the screen. If you doubled the number of squares but halved the diameter of the wire, you still lose the same amount of light, but each instance of light (photons) loss is less in relation to the adjacent square that the light does pass through. This would mean my brain would not have to work as hard to add back in the missing parts between the adjacent squares.
Analog would be like looking through glass. All there but distorted by the medium. Based on this analogy it would seem that higher sampling rates have merit.
Of course… I could be mistaken.

Dave Hamilton

@wchpitt your theory is based on sound logic (no pun intended) but turns out to be inaccurate. This is where Harry Nyquist’s (and others’) math works. You can perfectly reproduce a continuous signal with samples so long as your sample rate is at least double the maximum frequency you are using.

Paul Goodwin

Hi Dave.

Good article.

In theory, the analog music will be perfectly (as good as our ears can decipher) reproduced using an ideal A/D conversion followed by an ideal D/A conversion, with both at 16 bits and 44.1 KHz. The digital data samples accurately represent the analog amplitudes at the precise points in time. There aren’t as many sampled points in 16 bit 44.1 KHz data as there are in 24 bit 192 KHz data, so reconstructing them in a real world D/A is more complex. If music were purely sinusoidal, the D/A reconstruction process would be much simpler, but it’s not. There are roughly 4 times as many data points at 192 KHz, so reconstructing the true waveform is less challenging.

In the real world, you can buy 16 bit 44.1 KHz converters that are so good in terms of linearity, phase accuracy and jitter, that in truth, nobody would likely hear the difference. But not all converters are created equal. And, in the end, the quality of the analog amplifiers and their power supplies (after the D/A) will heavily influence what you hear. It sounds like your follow-on article will address some of this.

I suppose it’s possible that an inexpensive 24 bit system might not sound as good as a high quality 16 bit system. But it’s also likely that any manufacturer making the 24 bit converters would have high quality stuff in their design, or why even bother with the high end D/A.

In summary though, merely meeting the Nyquist sampling criteria doesn’t guarantee high fidelity. It’s a very complex topic. Hell, I’m an electrical engineer and I struggle with it.

geoduck

If you like vinyl good for you. I’m glad though that you made it clear that the sound of vinyl is not as good as digital.

Personally I find the whole digital/vinyl debate amusing. One side of my family are musicians. They insist that vinyl is “warmer”, whatever that means. They can’t define how but they like it better.

The other side of my family are computer engineers, signal engineers, and electrical engineers. They can show precisely why you get exactly the same perceived sound with a high enough sampling rate. They can show you the wave forms and the math. I believe them, especially with all the double blind studies showing the same thing.

And isn’t everything pretty much recorded digitally anyway? I mean even if someone recorded an album and had it pressed into vinyl today, wouldn’t the recording itself have been done digitally, so the vinyl is essentially just an analog copy of a digital recording?

On the topic of 16 bit v 24 bit recordings, I just beg off both sides. I have 54 year old ears and listen to music in my car. Eight bit would probably sound the same to me.

Paul Goodwin

I pretty much agree with everything here on page 2. My ears are 66 years old, and 3 of those years I was around jet engines, 1/2 of them spent in loud bands, and most all of those years listening to music-turned way up. So I have a hard time telling the difference between a 256 Kbps file and a lossless one or a CD. At times some 256 Kbps files sound a little more compressed than the lossless, AIFF or WAV files.

Remastered files are interesting. The producers sometimes, in trying to make them better, end up making them sound quite a bit different than what the original artist and producer in the recording studio wanted them to sound like. There are a lot of older songs where the mono versions are vastly superior to the remastered stereo versions. iTunes is loaded with multiple versions of the same songs and finding the original recordings is quite a task.

Warmth is another interesting topic. Vinyl has a lot of compression which keeps loud transients down making things warmer. In electronic preamps and power amps, designs can be made to sound warm whether they are tube type or solid state. The active and passive piece parts in the designs make a difference. Tailored frequency response can color the sound and warm it up. What sounds best is a pretty personal thing.

Enlil

I love the convenience and portability of digital. It makes life so easy! However, I cannot argue that when I listen to vinyl, my experience is much more enjoyable. When I listen to mp3’s, at 192 I find they start approaching the limit of my ability to identify differences between 16/44 resolution and the mp3 on consumer equipment. At 256, it is almost indistinguishable.

However, I do not relish sitting down and listening to digital, mp3 even less. Vinyl can transport me in ways that digital never could. When I moved to digital, I later realized I quit listening to music. Why? I didn’t enjoy it. When I started playing LP’s again, I listened to music often, and still do today, and enjoy it tremendously. Now, I will caveat that with the provision that the latest hi-rez bit rates can provide an experience that is almost the same as vinyl for me. The problem at this point is not the tech, it is the software. The number of 24 bit digital downloads available today is a paltry fraction of all the music available. I love music in whatever format I can get it, but I prefer vinyl if I can get it. I have mostly given up on high-resolution digital as simply too ephemeral. I also like something physical. Even with backups I’m paranoid.

So while 192 mp3 or even 16/44 is “good enough” for the mathematicians, I don’t judge music with a calculator, I use my ears. I don’t know how mathematicians can count bits in how Mark Knopfler bends a note, but I definitely bathe in vinyl while digital can leave me still reaching.

I readily accept the math that shows digital is better than analog. However, I find it surprisingly arrogant that the author can pronounce with such authority that we understand everything there is to know about measuring hearing and auditory perception.

I know I will not change the author’s mind, as he seems very confident, indeed has already passed judgment, but after this many years I have learned that there is often more to the story than meets the ear, as it were.

furbies

“and our ability to do that conversion has improved substantially since those very first CDs were released at the dawn of the digital music era”

I fondly remember a friend who was the first person I knew back then to get a CD player, and a first gen (?) release of Dark Side Of The Moon.

The three things that struck me the most were not having to get up and flip the album half way through, the lack of pop & hiss etc from a well played vinyl disc, and the idea that with basic careful handling, a CD wouldn’t wear out with repeated playings.

I’ve still got (hopefully in good and playable condition) a quadraphonic DSOTM, and a couple of years ago I invested in a SACD version of DSOTM.
Unfortunately the SACD/Blu-Ray player I’ve got doesn’t like the SACD albums I’ve tried, but one day…...

macHobs

I definitely cannot hear any sound above 14 kHz. But there is another aspect which I would like to hear your opinion about: localization.

The human brain localizes sound sources partly by detecting time differentials between its two ears. This is particularly true for sources in the horizontal plane in front of your head, and in the frequency range up to about 1 kHz. The brain is able to resolve an angle of about 1.5°, or time differentials above 10 µs.

Now, to encode such small time differentials between two channels you easily need to resolve data with a precision above 20 kHz.

So while it is true that you definitely don’t hear any sound of 20 kHz, the brain may still be able to detect this information indirectly by means of localizing a sound source. The net result may be that such recordings reproduce a sound stage more truthfully.

Do you have any insight into this aspect of sound recording and reproduction?

Paul Goodwin

macHobs. Yes. The phase accuracy of the converters, and the time stability of the sampling rate does have an impact on what you hear. The effect of inferior DACs on what you hear can be similar to the effects of speakers moving forward and backward (toward you and away from you) at high rates, in unison, and even not in unison. As you said, our ears can hear this effect if it’s pronounced enough. In the early days of digital music, the DAC reconstruction filtering and sampling rate stability wasn’t what it typically is today. It’s a form of distortion. Today’s better equipment (some of which is not really very expensive) is very good in reproducing what is in the digital files. One problem is that it’s so good that it doesn’t sound like what we used to hear on vinyl, the songs sound different, and we don’t get the exact experience we had when we first listened to the music on our old systems. Even though our ears may not have the bandwidth they used to, the phase impacts of equipment with cheap DACs can degrade the sound. It can sound “muddy” (confusing) and what the audiophiles call depth and soundstage is affected.

James Campbell

Dave: I have done double blind tests and could tell the difference between 44.1/24 and 96/24 100% of the time with good headphones (I used 24 bits on both so that there would be no audible difference in the volume as a cue). One of the things to keep in mind is that Nyquist’s work was all based on monophonic signals, stereo is a different beast. Our brain uses both phase shifts and volume to identify a point source for any given sound, 44.1 has much larger phase shifts (especially in the harmonics) and also some volume errors on signals that approach the higher end of the spectrum (phase shifts can be up to around 120 degrees, and volume can be off up to 100% at 1/2 sample rate). While on a mono signal these are measurable they aren’t really all that audible. With stereo signals however they can cause significant issues with the apparent position. As I was doing an ABX with 96K and 44.1K what I kept noticing was the stability of instruments in “space” was much better, and therefore easily identifiable. This difference was much less apparent when using speakers as the room’s acoustics come into play, but with headphones I could tell them apart every time. It was subtle, but it was audible.

Paul Goodwin

James Campbell. Yes. A good set of headphones will uncover a lot of sins of playback equipment.

James Campbell

Paul Goodwin. Agreed, but since it was the same equipment, and it was all recording studio quality equipment I’d have to say that what was uncovered were differences in the sample rates, not the equipment. One other point, in Dave’s article that I want to mention. 44.1/16 was not chosen because it was known to be the be all / end all of audio, it was chosen because they needed a way to get the masters from the studios to the pressing plants. In the early 80’s when the format was being developed the only storage that could hold that much data was tape based. Sony/Philips realized they could store the digital audio as a television image and send it on Umatic video cassettes, then convert it back from video to digital on the other end. It wasn’t chosen because it was the best, it was chosen because the data and the necessary error correction fit within the bandwidth of a TV signal.

macHobs

James Campbell.
So you seem to support what I wrote. That in order to reproduce a soundstage more faithfully, and to localize sources more precisely, you need a resolution of data far above 20 kHz, indeed up to 50 kHz to resolve differences in the 10 µs range.

Is there any kind of scientific discussion of this side of the matter? I am asking because here in Germany I had a discussion with some people who emphatically denied that this would play a role at all.

James Campbell

macHobs, yes I’ve seen something about that somewhere, but it was after I did my double blind test. I first proved to myself I could tell them apart, then I forced myself to repeat that test over and over until I could tell HOW (the ABX test proved I was telling them apart, but I had to do that perhaps 100 times until I could tell how.) It was, as I mentioned, all in the spatial, there wasn’t anything missing, but I’d hear goofy things like a cymbal moving slightly left to right (or right to left) as it faded out, or an instrument that sounded like it was taking up space instead of a point source etc. After I realized what I was hearing I spent some time searching. There have been studies that do talk about this, but I didn’t save links, but they were all found initially with google. Most of them came to the same conclusion I did, 96K does make a difference, anything beyond, not so much.

macHobs

James Campbell: Thanks for your answer. I will try Google these days. And what you describe confirms an old saying: you only see/hear what you know. You have to know what to look/hear for to detect it, if it’s subtle.

CCardona

As if I needed another reason to like you, Dave, there’s “identification”. I LOVE Weather Report and my wife hates it! You don’t have twins, or ride motorcycles, or play Bass, Guitar, Mando & Uke, do you? ;)

Boogie Woogie Waltz!

Dave Hamilton

Great discussion, folks. I really appreciate all of you (and all of our TMO readers) for keeping everything so civil. You rock. Kudos… and thank you. :)

I normally wait a bit to reply, but having watched this back-and-forth this morning I figured it was time to chime in, and then I’ll leave you to it for more. :)

@Enlil - This has been about a 6-week process for me, and I started with an open mind to find out if I needed to worry about 24-bit/96kHz reproduction. I talked to some people who are VERY well-respected in this field (several award-winning producers among them; one who was certain he could hear the difference until ABX testing at my request two weeks ago!) in addition to the other research I mentioned. Obviously the sum of that led me to the conclusions I’ve made here, but that’s the benefit of science: if there truly is enough evidence to change my mind, I’ll do it.

@James Campbell - I’ve heard about this… I found it in my research, too, but every study I found where this 96 vs 44.1 spatial difference was noticeable turned out to have flaws in the methodology. Usually that flaw was in the process of converting from 96 (or 192) down to 44.1 (though sometimes there’s issues with using the wrong DAC or, as you mentioned, accidentally upsampling at output). Conversion errors like using the wrong low-pass filter (i.e. setting the wrong anti-alias frequency) or an overloaded system (where clipping/distortion artifacts are generated within it) were the typical examples I recall.

I searched and literally could find no mention of a repeatable ABX test case where this was not eventually found to be the differentiating reason.

Fun stuff!

James Campbell

Dave: Your point is well taken, it could have been my methodology as well, can’t say for sure it wasn’t, although I did use a couple of different applications to down sample from the 96 originals to 44.1 and had the same results with both sets of files.

I also did similar ABX tests with MP3 and AAC compared to 44.1/16, hats off to Apple, I could not reliably tell the AAC’s apart from the CD’s at anything 192K and beyond. The MP3’s yeah, the 320’s were good, anything lower and I could pick those out as well.

I’ve been wanting to take the time to record the same analog source at both 96 and 44.1 rates on the same gear and try again but haven’t found the time, and even at that there is still no guarantee that it isn’t something in the chain, and how it’s optimized in the recorder.

I will say that I can only take headphones for an hour or two with 44.1 files, I wear them all day with the 96’s and they don’t grate on my nerves, that however could definitely be attributed to confirmation bias.

One last thought, the biggest thing Pono brings to the table is the quality of the DAC, filters, and analog amps and components. They really reached for the stars, and got really close, even if you aren’t a believer in high-res, it’s a phenomenal player, albeit with a slightly quirky user interface, but they’ll get that ironed out at some point.

Paul Goodwin

If you think about it, at 44.1KHz, you’ll get 3 samples of a 14.7 KHz sine wave. At 192 KHz, you’ll have 13 points. Now think what it means if the wave isn’t sinusoidal, and you only have 3 data points with which to reconstruct the real waveform. The smoothing algorithms (I would think) would have to be far more complex to rebuild the wave with only 3 points. At 20 KHz, you’ll only have two points. I find it amazing that they can actually do it as well as they do. It’s not at all surprising to me that some people can hear subtle differences. The real test would be to let the artist or producer do the A-B test, and tell you which one sounded more like what the master recordings actually sound like.
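The sample counts in the comment above come from dividing the sample rate by the tone's frequency. A quick Python check of that arithmetic (the helper name is illustrative):

```python
def samples_per_cycle(sample_rate_hz, tone_hz):
    """Average number of samples taken during one cycle of a tone."""
    return sample_rate_hz / tone_hz

for rate, tone in ((44_100, 14_700), (192_000, 14_700), (44_100, 20_000)):
    print(f"{tone} Hz at {rate} Hz: {samples_per_cycle(rate, tone):.3f} samples per cycle")
```

That gives exactly 3 samples per cycle for 14.7 KHz at 44.1 KHz, about 13 at 192 KHz, and a little over 2 for a 20 KHz tone at 44.1 KHz.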

Dave Hamilton

@Paul Goodwin – an excellent point, and one that is often asked. In my conversations with all the audio experts I was repeatedly told that you only need 2 points to accurately recreate the signal and everything else is 100% extra and unnecessary (and, according to some, simply discarded). 1 point isn’t enough because you don’t know which way it goes, but two tells you everything you need to know. Perhaps one of them will comment here.

minphase

“Nyquist’s theory has been independently confirmed many times.”

Sorry, but this simply isn’t true.  The truth is even more in your favour, but the statement in the article is false.  No one needs to confirm the sampling theorem.

A THEOREM is a mathematical proof, a THEORY is something one notices about the world and tests out.  Nyquist’s theorem about perfect reconstruction of a sinusoid (after low-pass filtering) with only two samples is a THEOREM.  It’s provable.  No experiment needs to be done, just examination of the proof. 

The part everyone seems to have trouble with is that descriptions leave out the final low-pass filtering stage of the reconstruction process.  Sure, you could draw an infinite number of sines that fit between two points.  BUT ONLY ONE LINE FITS with the constraint that the signal is bandlimited to Fs/2.  When you run those two points through the lowpass filter, voila, the perfect smooth continuous sine wave comes out, as if by magic.  This is math.  It’s very beautiful and magical, and very very true.

(( but don’t get me started on whether 16bits suffice for RECORDING.  There are practical considerations that make 24-bit recording superior in many applications ))
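The reconstruction step described in the comment above can be tried numerically. The sketch below is a rough Python illustration, not how a real DAC is built: it samples a 20 kHz sine at 44.1 kHz (barely more than two samples per cycle) and rebuilds values between the samples with the Whittaker-Shannon (sinc) interpolation formula, which plays the role of the ideal low-pass filter. The tone, the phase, and the finite window of samples are assumptions made for the demo; because the window is finite, the match is close rather than exact.

```python
import math

FS = 44_100         # sample rate in Hz
TONE = 20_000       # test tone in Hz, just below FS / 2
PHASE = 0.7         # arbitrary phase so samples do not land on zero crossings
HALF_WINDOW = 2000  # samples used on each side of t = 0

def signal(t):
    """The continuous-time sine we pretend to record."""
    return math.sin(2 * math.pi * TONE * t + PHASE)

def sinc(x):
    """Normalized sinc: sin(pi x) / (pi x), with sinc(0) = 1."""
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)

# Samples taken at t = n / FS for n in [-HALF_WINDOW, HALF_WINDOW].
samples = {n: signal(n / FS) for n in range(-HALF_WINDOW, HALF_WINDOW + 1)}

def reconstruct(t):
    """Whittaker-Shannon interpolation over the finite sample window."""
    return sum(x * sinc(t * FS - n) for n, x in samples.items())

# Compare at instants on and between the sample points near the window center.
worst_error = max(
    abs(reconstruct(k / (10 * FS)) - signal(k / (10 * FS))) for k in range(-25, 26)
)
print(f"worst reconstruction error: {worst_error:.5f}")
```

The remaining error comes from truncating the sum to a finite window, and it shrinks as HALF_WINDOW grows. With the full, infinite set of samples of a bandlimited signal, the theorem says the reconstruction is exact.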

Dave Hamilton

@minphase - D’oh! Thanks. Correction to article made. I’m surprised I (or anyone else) didn’t catch that mistake earlier. And by “confirmed many times” I was referring to the fact that others independently proved the same thing, without knowledge of Nyquist’s work. That’s all. Thanks for the comment. MUCH appreciated, sir!

minphase

Yeah, pretty much any theorem, Gauss proved it first, somewhere :)

Forgive me—I hadn’t seen the second page.  You hit exactly the sorts of points about 24 bit recording that I was thinking of.  Having lots of headroom while recording allows you to get great quality even if your recording levels aren’t optimum.

I have to strongly agree with your final point—it’s fun to play vinyl, read liner notes, clean a record, thread a tape reel (not kidding, I do this).  It’s all about the quality of attention.  More attention == better sound, more fun.  One doesn’t have to be a Zen master or mindfulness expert to figure out that the ritual and preparation time put one into a good state for musical pleasure.  I used to argue with audio myth spreaders a lot, but as I get mellower, I understand that they are really trying to say “I have more FUN with vinyl”, etc.      Great article!

lahaina52

@Dave Campbell et.al. 1. You cannot self administer a double blind experiment. 2. An “N” of 1 (one) is not a statistically worthwhile sample size. 3. Why physics is sufficient to settle simple scientific issues but metaphysics seems to be necessary to settle such issues as they relate to audiophiles who have spent thousands of dollars on equipment, spent thousands of hours selling themselves as sound engineers to musicians, etc. etc. etc.—not every situation is covered here—is a mildly interesting psychological discussion in and of itself.

macHobs

Dave Hamilton: That really boils down to mathematics – Fourier analysis and the Shannon-Nyquist sampling theorem.

I am not an expert in this field in any respect, but philosophically speaking I find the whole affair a bit irritating. The human eye is able to distinguish ever increasing resolutions: 4K video, 8K video, high frame rates, and so on. And the human ear, which in many respects is far more sensitive than the eye, cannot go beyond 16/48? This puzzles me, in particular with respect to its proven ability to detect phase differences around 10 µs, which is 1/100,000 of a second. But what do I know?

Paul Boutin

I keep seeing these articles about 44.1k/16bit being the holy grail of sample rates. The Verge and now you. Do you guys even do this for a living?
I’ve been an audio engineer for over 20 years, not because I have super ears but because I love music.
Well let me tell you. There is a difference between different audio settings.
Maybe not so much between the sample rates but more so between the bit rate, 16 vs 24 bit.
I record and mix everything at 48k/24bit. So I live with my songs for days. Then it gets mastered at 44.1/16bit and then wait for it, it gets sent to iTunes for “a special iTunes Mastering”
When I listen to the iTunes version of my song, I want to cry. I really do.
I’m so upset that I feel like I did something wrong and I should go back in the studio and fix it. It does it every time. No more highs, no more lows, instruments popping out that I didn’t know were there.

So please stop propagating this misinformation.
When I blind play both versions to my wife she can always pick the 48k/24bit from the 44.1k/16bit. Once you know what you are listening to, it’s very easy.

Pro Tools also now has 32bit available. Why do I not use it? Well, I don’t want to cry even more about what iTunes will do to my songs.

richardcon

I’ve wondered if bumping the consumer sample rate to 48kHz would make a difference. Since it’s exactly half the frequency, the 96kHz masters can be downsampled perfectly without needing dithering or a really good DAC.

trex67

My two cents: I’ve been hearing from my fellow musicians for years about how vinyl (or analog tape) is superior to digital, because “warmth” blah-blah (I think they mean the low-end analog distortion…)

I grew up with analog. My vinyl collection was around 2,000 albums at one point (long since sold off). I’ve recorded in 64 track studios on 2 inch tape (at exorbitant rates). I’ve self-engineered for over thirty years, my hearing is still above average (granted over 15k is failing fast.) Yet I’ve never been swayed by the argument that analog is superior in any way to digital. True, early CDs sounded pretty harsh and crappy, but that was due to the state of the technology at the time. Now anyone with a decent Mac, a good analog to digital converter (~ $200+), under a grand in software, and a few good mics can make professional sounding recordings pretty much anywhere. The revolution is over. Analog lost.

Yes, while I do record individual tracks at 24-bit/192kHz to get every last bit of that performance, I’ve found that mic’ing techniques are much more crucial to getting the sound I want than bitrate, compression or eq’ing.

analogplanet

Dave, you demonstrate the axiom that a little knowledge can be very dangerous.

I suggest you talk to someone like Meridian’s Bob Stuart who has forgotten more digital theory than you will ever know before you again write about something you know very little about except for some basic “math”.

Regarding double blind testing Bob Ludwig said in the current issue of Tape-op (and he’s probably mastered a few more albums than have you): “I think the higher resolution sounds reveal themselves not in A/B testing, but in long periods of time. Play an entire album in a relaxed atmosphere at 96 kHz/24-bit, then, at the end, listen to it at 44.1 kHz/16-bit, and you’ll get it right away. A/B testing, while the only scientific method we have, does not reveal too much with short-term back-and-forth comparisons due to the anxiety the brain is under doing such a test. The brain becomes very left-brain-technical, rather than right-brain creative and musical.”

On the other hand I was given a double blind test of redbook versus 96/24 and got 3 of 4 IDs correct that were instantaneously identifiable. The fourth was a vocal track that proved more difficult.

I saw today that in a blind test violinists couldn’t distinguish a Stradivarius from a run of the mill violin. So do you really think that result was dispositive? Or might blind testing not be appropriate for every scenario and can a blind test produce stupid results?

I was challenged to an A/B/X blind test by a guy claiming that all amplifiers that measure the same sound the same, which in my experience is a ludicrous assertion. So I took the challenge at an AES in Los Angeles.

For some reason the test included a Crown DC-300 solid state “ear bleeder” and a VTL 300 tube amp (warm and billowy). They neither measure nor sound the same yet in that test AES engineers couldn’t distinguish between them.

Oh, by the way, I got all of the identifications correct. I took the challenge and I passed. Unfortunately because my result fell outside the statistical results I was declared a “lucky coin” and my result was tossed.

What a joke.

As for your crappy turntable sounding bad, well I’m shocked.

Your infantile comment about vinyl (for anyone who cares) speaks volumes…. you’ve clearly never heard good vinyl playback—-maybe you should visit Keith Jarrett and listen….he’s a vinyl enthusiast but what does he know? Or the concert pianist Olga Kern (first female to win Van Cliburn prize).

They must enjoy pops clicks etc. right Dave?

Dave Hamilton

Playing some catch-up here.

@macHobs - I understand your philosophical concern! I want it to be “better,” too, and when I started down this path I was hopeful that it would be possible. Seems we developed the technology to placate our ears before we caught up with our eyes.

@Paul Boutin - Sorry iTunes’ process is mucking with your good work, sir. I’m definitely not an expert on this, but from my research I recall reading that Apple actually wants you to submit at 24-bit so THEY can do the full conversion. Perhaps that’s part of your issue?

@richardcon - Moving to 48kHz for the reasons you mention would be a smart move, for sure. Seems like that would limit the possible math/conversion errors significantly (and it seems like that’s where the biggest issue currently lies).

@analogplanet - Appreciate you taking the time to write. You’re clearly passionate about this. My passions lie more with music itself (I’m a musician, and have yet to hear any recording that sounds like it did live, though I think that’s more about microphones and amplifiers and speakers and room acoustics and all of that, but I digress :) ). Regardless, I think our goals are aligned: find the best possible way to recreate music in the way it was recorded. And as I said in the above comments (I would request that you read them all for some add’l perspective, including reference to the producers I *did* speak with), I came into this with an open mind and still have one. I’d love to learn what it would take to set up an appropriate test environment like the one Bob Ludwig mentions. That could be enlightening. Please feel free to follow-up with me personally on that if it’s too much detail for a comment stream here. Thanks!

leicaman

As a trained classical violinist for 19 years, and concert master of two different orchestras, I probably have a bit more fine tuned sense of hearing than the average.

I keep hearing these arguments, and people throwing around numbers and theories, and saying ludicrous things such as the comment about UV and IR light, and realize the person has missed the point altogether. It’s not that I can hear like a dog, I hear what’s in the ear-compatible spectrum. But just like there are people who are tone deaf so they can’t tell the difference between an opera singer and Slim Whitman who can’t carry a tune in a bucket, anyone who knows what to listen for can hear the difference.

As for the Pono, I have one. And you know what? It makes my iTunes music sound better. It has essentially the same DAC as the top-rated FLAC player which costs $1,300 or more. So it makes everything sound better. MP3s, AACs, WAVs, FLACs, and whatever else is coming.

Not to mention the arguments behind Bob Stuart’s work at Meridian with the new MQA codec that is coming soon to streaming, and to all the big music labels and a bunch of indies. Don’t listen to the arguments that sound like the arguments about climate change, or creationism. People have agendas, and they have doctrines, and they have all sorts of nonsense that will deny anything under the sun. And they love to throw out that “science” word like a cudgel. Excuse me, but I don’t care what you say, if you say “This is such and so because, science!” I stop listening. That is not how science works.

LukeG

Hi Dave,

I can’t let the pervasive misguidance and uneducated blabber continue. Each section on the first page of your blog became more painful to read than the previous.

To start us off, for the purposes of clear communication, I will use the term “high resolution audio” to refer to any audio file higher than 16 bits at a 44.1KHz sample rate.

So, I think “the thing is” that Neil Young or most (albeit not all) proponents of high resolution audio are telling the public that 24/192 does sound better, but NOT by simply upconverting a 16/44.1 file. I think they are saying that 16/44.1 is not the holy grail of audio everyone thought it was and that higher resolution source files sound better.

Also, never use hyperbole if you are trying to prove a point. – “As per all the double-blind testing that’s been done no human can hear that difference.” You, Dave, do not know about “all” the testing, so don’t speak to all of it, cite the ones you know.

And, let’s get something out of the way. Even though the Nyquist theorem results in perfect math and the dynamic range of 96dB (120+dB with dithering) is good enough for just about any recording, the transducers on either side of the number crunching are not perfect and will introduce their own characteristics. (Davis, Patronis (2006). Sound System Engineering)
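As a rough check on the dynamic range figures quoted in this discussion, the sketch below computes the theoretical range of an N-bit linear PCM word as 20·log10(2^N). The function name is made up for illustration, and the formula ignores dither and noise shaping, so treat the result as an idealized figure rather than a measurement.

```python
import math


def dynamic_range_db(bits):
    """Idealized dynamic range of linear PCM with the given word length."""
    # Ratio between full scale and one quantization step, expressed in decibels.
    return 20 * math.log10(2 ** bits)


for bits in (16, 24):
    print(bits, "bits:", round(dynamic_range_db(bits), 1), "dB")
```

This gives roughly 96 dB for 16 bits and roughly 144 dB for 24 bits, or about 6 dB per bit.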

Let’s get something else out of the way - the audible benefits of higher bit depths and sample rates are NOT dynamic range or being able to hear above 20 KHz. They are apparent in dynamic resolution and phase perception respectively. The immediately audible benefits, and they are audible, of higher bit rates are smoother reverb tails and “spaciousness”. The audible disadvantage of a 44.1 KHz sampled file is that the 5th, 6th, and 7th order filters in place to ensure no audio crosses the 22.05 KHz threshold introduce distortion in the form of phase alteration. The winning paper at the 2014 137th AES showed humans can hear this distortion. (Jackson, Stuart (2014). The Audibility of Typical Digital Audio Filters in a High-Fidelity Playback System. Convention Paper 9174)

I am glad you brought up ABXtester. ABXtester and Foobar2000 (I speak to the Windows versions only and hope it is not sacrilege to speak about Windows on this site – this article, after all, is about non platform-specific audio) are misleading pieces of software because they output at the sample rate to which the Windows operating system is set. If your operating system is set to 44.1 and you load a 192 file, the 192 will be downsampled to 44.1. This is built into Windows. According to the author of Foobar’s ABX module, to use it properly, both files MUST be at the same sample rate and the operating system changed to match that sample rate. This means first having a DAC that can output the rate you are comparing, converting the lower rate file to match your higher rate file using software that will not introduce any mathematical errors, loading appropriate drivers compatible with your sound card to the operating system, and finally changing the system sample rate to match the file with the higher rate. To test this out yourself, connect a DAC with a readout (on the unit or in the unit’s software) which displays the unit’s current sample rate (I would love to know your results). I have tested this with a MOTU 828 MKII, SoundDevices USB Pre2, and an RME Babyface (able to output 24, 192) and all stay rock solid at 44.1 no matter what file I play. The best way of truly playing back an audio file is to launch a DAW with ASIO drivers (again Windows) which control the attached DAC (SoundForge, Cubase, etc.). Quicktime, Foobar, ABX, Audacity, Windows Media, and VLC all exhibited this same behavior. As far as the bit depths, Windows (again, sorry) will also automatically convert the audio to 32bit floating point – that’s just what the operating system does.

If you performed an ABX test, and your results helped to guide your thoughts on high resolution audio, you should post your results, the setup, and files - Including, but not limited to: the audio files used, where you acquired them, your DAC, your computer, processor, RAM, operating system, listening device (headphones or speakers), and your listening environment with recorded noise floor (NC rating). If you want to play in the arena of science, then be scientific.

There is evidence, albeit not extensive, that humans can perceive audio above 20 KHz – possibly up to 100 KHz. Here is a study exploring this concept: http://jn.physiology.org/content/83/6/3548. Meyer Sound, in Berkeley, CA is also working on experiments along these same ideas.

So, finally, we can talk about the crux of the problem: misinformation. But, there is something more dangerous than poorly written software and out of phase audio. And that is the audio industry’s reluctance to embrace file transparency. Terms like HD Audio are pervasive; from HD radio (definitely NOT high resolution), HD headphones, HD Pandora (very far from high resolution), and HD audio files. Currently, there is no standard for recognizing high resolution content or systems capable of reproducing high resolution content. Most egregious are some “content providers” simply upsampling the 16/44.1 CD masters and reselling them as “high resolution” files – a horrible, dishonest, and misleading practice. There is also disagreement about what constitutes “high resolution” – e.g. can a 20 bit audio file be considered high resolution?

Let’s start asking where our music came from and how it came to be in a particular format.

As a side note, if you want vinyl that sounds good, check out original pressings made before 1982. Otherwise, the pressing was most likely created from a CD master which gives you the worst of both worlds (even remasters from very reputable labels are repressings from the CD masters). I have yet to hear a new vinyl pressing (or repressing) that sounds better than the CD and an original (pre 1982) vinyl pressing that sounds worse than the CD.

Thanks,

LG

James Campbell

I’m circling back around on this one more time because Dave said or rather quoted something I strongly disagree with. “The sample rate of an audio file determines the frequency range that can be reproduced. Nyquist-Shannon sampling theorem informs us that perfect fidelity reproduction is possible with a sample rate equal to twice the maximum frequency we wish to reproduce.” This statement is well beyond known to be false, check the Wikipedia article again, and notice they even show an example of three different waveforms, all at the same frequency, but different amplitudes and phases that all get the same identical samples, clearly that isn’t right.

I wish I could attach a picture here but apparently I can’t so I instead provided a link to the file on drop box at the end of the post. I’ve got side by side waveforms of 44.1 & 96 (the 44.1 is not downsampled, I used the same high quality analog source (an SACD) and recorded the same song at both rates on the same “professional” digital recording gear) that clearly show errors in both phase and amplitude on the 44.1 files.

Just to make sure it was audible differences and not something outside human hearing I applied a filter to both sets of files to cut them off at 20 Kilohertz so that only audible differences would be shown. Yes the differences are subtle, but if you look closely they are there. For those of you who are skeptics you can see the image on my public dropbox folder at:

https://dl.dropboxusercontent.com/u/15283177/44kvs96k.tiff

Dave Hamilton

@James Campbell thanks for this. I don’t usually use the comments to go tit-for-tat immediately, but since this is such a fundamental part of understanding the math here, I felt I should either respond or remove your comment pending further research. I hate removing comments, so… here we are. ;)

The sampling theorem is just that: a theorem. Proven (in this case, by many). And, in part, the Wikipedia article you asked us to reference says, “The sampling theorem introduces the concept of a sample-rate that is sufficient for perfect fidelity for the class of bandlimited functions; no actual ‘information’ is lost during the sampling process. It expresses the sample-rate in terms of the function’s bandwidth. The theorem also leads to a formula for the mathematically ideal interpolation algorithm.”

This part of the article (indeed many parts) were vetted by people a lot smarter than me (several with Grammys to their names, several with Ph.D at the end of the same).

There’s definitely a discussion to be had about what happens when you chop off higher frequencies (indeed, science isn’t finished yet!) but Nyquist’s work is not something I’ve seen anyone wanting to debate.

Still, you point out a valid issue with how easily errors can be introduced during the sampling process when the sample rate is very close to the frequency you want to reproduce. More headroom definitely makes it easier to be accurate.
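The "three waveforms, identical samples" case raised above can be reproduced with a short sketch. It assumes a cosine sitting exactly at half the sample rate, which is the edge case where the sampling theorem's condition (bandwidth strictly below half the sample rate) is not met. The function name and the chosen phases are illustrative only.

```python
import math


def sample_cosine(freq_ratio, phase, amplitude, count=8):
    """Sample amplitude * cos(2*pi*f*t + phase), with freq_ratio = f / sample_rate."""
    return [amplitude * math.cos(2 * math.pi * freq_ratio * n + phase)
            for n in range(count)]


phases = [0.0, 0.5, 1.0]

# Exactly at half the sample rate: different phases and amplitudes,
# yet every waveform yields the same samples (+1, -1, +1, ...).
at_nyquist = [sample_cosine(0.5, p, 1 / math.cos(p)) for p in phases]

# Slightly below half the sample rate: the same three waveforms
# now produce clearly different sample sequences.
below_nyquist = [sample_cosine(0.4, p, 1 / math.cos(p)) for p in phases]

for row in at_nyquist:
    print([round(v, 6) for v in row])
for row in below_nyquist:
    print([round(v, 3) for v in row])
```

In this sketch the ambiguity appears only when the tone sits exactly on the half-sample-rate boundary; once the tone is below it, the samples distinguish the three waveforms.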

James Campbell

Dave: You might or might not find this interesting, but it talks about what Nyquist didn’t say in regards to sample rates, and also about why it may be better to sample well above the Nyquist rate.

http://www.wescottdesign.com/articles/Sampling/sampling.pdf

James Campbell

OK Dave, I decided to follow your lead and really closely read the Nyquist-Shannon stuff, (and do the math), the issue is that they very specifically require a sinc function (this does both look back and look forward in the bit stream and successive addition of these samples for each data point in the output). The problem is that our modern DACs don’t use this, they use a simple voltage reconstruction, and that’s where I take issue with saying 44100 is enough. At higher frequencies (say anything over about 25% of the sampling frequency or about 11K in the case of 44100) statistically significant phase errors and amplitude errors are introduced. If you do Nyquist’s math that doesn’t happen (well, at least not as bad), but I’m not aware of a single DAC on the market that actually does that.
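To make the two reconstruction methods being contrasted here concrete, the sketch below samples a 15 kHz sine at 44.1 kHz and rebuilds it two ways: with a sample-and-hold ("simple voltage") output and with a truncated sinc sum (Whittaker-Shannon interpolation). The test frequency, the 2000-sample window, and the function names are assumptions chosen for illustration. It says nothing about what any particular DAC does internally, since real converters add filtering that is not modeled here.

```python
import math


def sinc(x):
    """Normalized sinc, sin(pi*x) / (pi*x), with the removable singularity at 0."""
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)


def sinc_reconstruct(samples, position):
    """Whittaker-Shannon interpolation at a fractional sample position."""
    return sum(s * sinc(position - n) for n, s in enumerate(samples))


def hold_reconstruct(samples, position):
    """Sample-and-hold: output the most recent sample value."""
    return samples[int(math.floor(position))]


def max_errors(freq_hz=15000.0, rate_hz=44100.0, count=2000, oversample=8):
    """Worst-case error of each method over 40 sample periods mid-window."""
    step = 2 * math.pi * freq_hz / rate_hz
    samples = [math.sin(step * n) for n in range(count)]
    start = count // 2 - 20
    sinc_err = hold_err = 0.0
    for k in range(40 * oversample):
        position = start + k / oversample
        truth = math.sin(step * position)
        sinc_err = max(sinc_err, abs(sinc_reconstruct(samples, position) - truth))
        hold_err = max(hold_err, abs(hold_reconstruct(samples, position) - truth))
    return sinc_err, hold_err


sinc_err, hold_err = max_errors()
print("max sinc error:", sinc_err)
print("max hold error:", hold_err)
```

Under these assumptions the held output deviates from the original sine by more than the sine's own amplitude between sample instants, while the sinc sum tracks it closely, with a small residual error that comes from truncating the sum to a finite window.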

I’ve added a spreadsheet on my public dropbox (both Excel & Numbers versions), feel free to play around with it, it shows what happens to sine waves when run through a modern AD/DA. There would be a little bit of additional filtering going on, but just look at how ugly it gets when you start raising the frequency and you’ll get a feel for why I don’t think you can accurately recover the original analog signal.

You can see two different AD/DA rates at the same time (and set the bit rate but without the ability to dial the amplitude of the sine wave down it doesn’t mean much, suffice it to say even with quantization dithering you can get a -89 dB signal out of a CD vs about -120 with 24 bits).

The graphs then show the first 100 samples that would be taken with an A/D with those settings. There are some interesting things, peaks and phase shifts start occurring pretty quickly on 44100 (say at 8k), and get worse as the input frequency increases. Second, a signal in the 14K to 15K range will produce significant inter-modulation distortion producing a secondary wave at about 2K.

With a 96K sample rate you have to get right on top of 20K to start seeing significant distortions. (By the way if you look at those samples in my dropbox from the Steely Dan tracks the distortions look very much like the sine wave distortions in the spreadsheets).

So while 44100 might be “good enough” for an untrained ear, I still maintain that with good playback equipment and headphones and a little training for what to listen for you can hear the difference.

As I mentioned way early on in the conversation I did an ABX test and I could tell them apart reliably, I then repeated the test with a friend, he didn’t score well, then I described what to listen for and he hit 80% as well.
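Since several ABX scores are quoted in this thread without trial counts, a quick way to judge them is the chance of doing at least that well by guessing. The sketch below computes that one-sided binomial probability. The function name is hypothetical, and the 10- and 20-trial cases are assumed examples of an "80%" score, not figures from anyone's actual test.

```python
from math import comb


def abx_p_value(correct, trials):
    """Chance of at least `correct` right answers in `trials` coin-flip guesses."""
    favorable = sum(comb(trials, k) for k in range(correct, trials + 1))
    return favorable / 2 ** trials


print(abx_p_value(3, 4))    # 3 of 4 correct: 0.3125
print(abx_p_value(8, 10))   # 80% over an assumed 10 trials: 0.0546875
print(abx_p_value(16, 20))  # 80% over an assumed 20 trials: about 0.0059
```

The same percentage becomes far more convincing as the number of trials grows, which is why the trial count matters as much as the score.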

Dave Hamilton

Thanks for this, James. Good stuff, man. ;)

Responding in-kind, just to keep the trail here. Regarding the post two up from this, that’s an interesting article.

Dangerous reading for the casual reader, though, because this Tim Wescott gets far too close to *appearing* to make the argument for “stair-stepping” in digital sampling, something I think everyone knows is false, including the author (he basically says so). But he describes and diagrams it nonetheless. Was this thing peer-reviewed? If so, I blame his peers for letting him get away with this, too. ;)

Regardless, his final conclusion is right on the money: in order to do sampling properly one must be *well*-aware of the need for — and pitfalls contained within — aliasing and filtering. It’s for this reason that I don’t suggest people ever attempt to do their own “down-conversions” at home… and probably one of the reasons (though clearly not the only one) that Apple prefers files at 24-bit/96kHz for iTunes so they can do their own down-sampling. Most of us simply don’t have the skill, understanding *or* proper equipment to down-sample a 96kHz signal to 44.1kHz without screwing it up.

When it’s done properly, it’s unnoticeable by human ears. But that’s a difficult task to accomplish properly.

TL;DR: If you receive 24-bit/96kHz tracks, keep them that way. Storage is cheap.

Dave Hamilton

And now, as for your tests… again I go back to the body of work that’s been done. I don’t mean to continually punt on this point, but the scientific consensus is that when the sampling/conversion is done properly, there is no audible difference (and the filtering and aliasing is a HUGE part of getting that right… and it’s not easy). Given that you quickly found audible differences and were able to train others to hear the same, I have to assume it’s in the sampling process simply based upon what all the other studies have found.

Again, referring to the point that’s been made time and again by the pros: most of us aren’t well-equipped at home to do the proper conversions. You may be, I don’t know, but your tests seem to refute a whole body of work that no one else who’s studying this even seems to be interested in refuting.

Now what IS interesting is the brain studies that have been done indicating different brain activity when exposed to both audible and ultrasonic sound in combination as opposed to either separately. THAT warrants further study.

Leo Hoarty did a talk on this last year and indicated that fMRI studies are being done to see exactly how this affects (or doesn’t affect) our brains when listening to music.

Science isn’t finished with this one yet, and that’s a good thing. I eagerly await those results.

In the meantime, my advice remains the same: leave tracks at whatever bit depth and sample rate you got, and (if desired) convert straight from that down to your 256kbps AAC; but don’t do any sample-rate conversion in the middle there.

And if you find you enjoy listening to one type of track/medium more than another (lossless, vinyl, 8-track), don’t question it… just enjoy listening. That, after all, is the key here. In the end science will tell us which things are universally noticeable versus what our lives and experiences have taught us to individually prefer.

Bert_B

I have been searching for a discussion like this for a long time.
I see a lot of discussions about the fact that you cannot hear the difference between 16 bit and 24 bit because the dynamic range of 16 bit is more than sufficient. Nobody is talking about the fact that the ear is very sensitive to phase relations and phase problems. In my opinion the difference that people hear (and that I hear) between 16 bit and 24 bit is “localization” as also mentioned by “macHobs”, although he goes for higher sample-rates and I go for higher bit-rates.
I’m over 60 and also cannot hear any sound above 14 kHz, so 44.1kHz samplerate is ok for me, but it must be 24 bit.
The story goes back 40 years, to when I bought my first CD-player and my first classical DDD version of Dvorak “from the new world”. I must say, I was very disappointed. The sound was very clear and without distortion, but I didn’t like the stereo-definition, I couldn’t localize the several instruments in their exact places. It seemed as if every instrument was placed in a little box where the echo hardly could get out.
Now that I recently bought new headphones (Sennheiser HD-650) and a very good audio-card for my computer (Soundblaster Z-series) I discovered the reason why. I downloaded Dvorak “from the new world”, but now in 24 bit, 96 kHz and I must say “this is the way it should sound”. I get a good impression of the placing of the instruments, without them being placed in a box, and this has nothing to do with dynamic range at 24 bit, but only with a more precise stereo-definition.

Dave Hamilton

Great comment, @Bert_B. And thanks for rekindling this discussion!

The big thing you have to watch out for is the difference in mastering. A lot of those first-gen CDs were from the (sometimes digital) masters that were EQ’ed for vinyl. Hearing that raw (i.e., without vinyl adding its own color) can sound pretty flat.

I have no doubt the 24/96 version you’re hearing is remastered for digital consumption and sounds remarkably better than the original CD from decades ago. It should! The question is whether a 16/44(.1) version from that same master (properly converted and dithered) sounds any different.
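For readers wondering what "dithered" means in practice, here is a minimal sketch of reducing floating-point samples to 16-bit integers with triangular (TPDF) dither. It assumes samples in the range -1.0 to 1.0 and a test tone quieter than half of one 16-bit step. The function name, the fixed random seed, and the test tone are all illustrative, and a real mastering chain would typically also involve sample-rate conversion and possibly noise shaping, neither of which is shown.

```python
import math
import random


def quantize_16bit(samples, dither=True, seed=0):
    """Round float samples in [-1.0, 1.0] to 16-bit integers, optionally with TPDF dither."""
    rng = random.Random(seed)
    out = []
    for x in samples:
        scaled = x * 32767.0
        if dither:
            # Difference of two uniform values: triangular noise within +/- 1 step.
            scaled += rng.random() - rng.random()
        out.append(max(-32768, min(32767, round(scaled))))
    return out


# A 1 kHz tone whose peak is only 0.4 of one 16-bit step.
step = 2 * math.pi * 1000 / 44100
quiet = [0.4 / 32767.0 * math.sin(step * n) for n in range(4410)]

plain = quantize_16bit(quiet, dither=False)
dithered = quantize_16bit(quiet, dither=True)

print("non-zero samples without dither:", sum(1 for v in plain if v))
print("non-zero samples with dither:", sum(1 for v in dithered if v))
```

Without dither the quiet tone rounds to silence. With dither it survives as a noisy but correlated signal, which is the sense in which dither trades a little added noise for keeping low-level detail.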

Here’s the thing: I would love to find a repeatable, double-blind test that shows a discernible difference. It would (finally?) support our basic human desire to prove that “more is better.” Thus far, though, none has been presented.
