I got a different impression from the video. What I heard was that our hearing/perception system is very sophisticated, much more sophisticated than most people realize, and these complexities need to be considered in addition to the standard measurements like frequency response and noise level. There are many people who claim that if you cannot see a difference in these standard measurements then there must not be any difference in the sound. His point is simply that this is a very simplistic view of a very complex hearing/perception system. Most of the video was just examples that helped prove that point.
From looking at the video and the paper, it seems to me that the strong messages in the video do not fully match those in the paper it is based on. As far as I understood it, the author of the YT video has introduced a level of interpretation.
In any case, I'd suggest that anyone interested in the video not take it at face value, but take the time to read the paper it is based on as well. I'd also encourage people to read other academic papers on similar topics, because not all researchers are of the same mind.
I don't see myself as competent enough to comment on the details of the paper itself. The scientific community will either accept and build upon, reject, or ignore its conclusions.
What I heard was that our hearing/perception system is very sophisticated, much more sophisticated than most people realize
This part is true. I don't believe anyone who has done any kind of research in audio would try to deny this. But the human auditory system has its limits too, and these are not quite as poorly understood as is sometimes suggested. Not everything is understood, either.
these complexities need to be considered in addition to the standard measurements like frequency response and noise level.
In my experience professionals in the audio field try to do this.
Please also note that we are able to measure much more than just frequency response and noise level. A typical measurement suite will contain quite a few additional tests, depending of course on what kind of device is being tested.
However, in my experience it quickly becomes clear to most involved in auditory perception research that some of the tests have much more weight than others, because people tend to hear certain types of deviations much more easily than others. It would be silly to ignore this.
There are many people who claim that if you cannot see a difference in these standard measurements then there must not be any difference in the sound.
People claim all sorts of things. These claims are sometimes justified, but not always.
Most of the audio research articles I've read look quite different. To be honest, I haven't seen many (any?) serious research articles where some random device is measured and claims are made about how it sounds without listening tests.
In fact, many research articles in audio are very much focused on (controlled) listening tests. A lot of audio research over the last 100 or so years has looked at the mechanics and limits of human auditory perception, what causes audible differences, and what people prefer on average.
Results of such research are sometimes used to establish reasonable metrics/targets for device engineering. It is IMO a reasonable thing to do.
I thought the experiment of removing the first few microseconds of a recording and having the people who played it try to identify the instruments was particularly interesting. It showed that timing down to the microsecond was important. That is far outside the normal 20 Hz to 20 kHz range that many people take as the definitive test of hearing ability. It is certainly a very unexpected result and I would love to know more about that experiment.
Are you aware that removing the first few microseconds of a recording (i.e. the initial transient) also changes the recording's spectrum?
Note that the time-domain waveform and the frequency-domain magnitude+phase response are inextricably and directly linked. If you change one, you also change the other (this follows from the Fourier transform, which maps one representation onto the other). So this may not be as unexpected as it seems at first glance.
But it is absolutely an interesting test, and I'm sure it is valuable in better understanding human perception.
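To illustrate the link between the time and frequency domains mentioned above, here is a minimal Python sketch. It is not a reproduction of the experiment from the video or the paper: the sample rate, tone frequency, decay constant and length of the removed onset are all assumptions picked to keep the example small, and the removed portion is 1 ms rather than a few microseconds. It only shows the principle that silencing the start of a note changes its magnitude spectrum.

```python
import cmath
import math


def dft_magnitudes(signal):
    """Naive O(N^2) discrete Fourier transform; returns |X[k]| for bins 0..N/2."""
    n = len(signal)
    magnitudes = []
    for k in range(n // 2 + 1):
        acc = 0j
        for t, x in enumerate(signal):
            acc += x * cmath.exp(-2j * math.pi * k * t / n)
        magnitudes.append(abs(acc))
    return magnitudes


# All values below are illustrative assumptions, not taken from the paper or video.
SAMPLE_RATE = 8000  # Hz
N = 256             # samples analysed
TONE_HZ = 500       # sits exactly on DFT bin 16 (bin width = 8000 / 256 = 31.25 Hz)
CUT = 8             # samples silenced at the start (1 ms at 8 kHz)

# Decaying tone with an abrupt onset, a crude stand-in for a struck or plucked note.
original = [
    math.exp(-t / 40.0) * math.cos(2 * math.pi * TONE_HZ * t / SAMPLE_RATE)
    for t in range(N)
]

# Same recording with the initial transient replaced by silence.
truncated = [0.0] * CUT + original[CUT:]

spec_original = dft_magnitudes(original)
spec_truncated = dft_magnitudes(truncated)

# Print a few bins side by side to show that the magnitudes differ.
bin_width = SAMPLE_RATE / N
for k in (0, 8, 16, 32, 64):
    print(f"{k * bin_width:7.1f} Hz  original={spec_original[k]:8.3f}  "
          f"truncated={spec_truncated[k]:8.3f}")
```

Because the Fourier transform is linear, the difference between the two spectra is exactly the spectrum of the removed segment. A very short segment is spread over a wide range of frequencies, so cutting it alters the spectrum well away from the tone itself.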
My take from the video is not that all things matter but that many more things matter than the simple test measurements typically reported and relied on by some people. I agree with that conclusion.
People tend to oversimplify, I agree. But unfortunately people also tend to overcomplicate. Being human is filled with paradoxes.
Lastly, it is clear to me that most audio hobbyists are unlikely to be willing to go through the hassle of setting up a truly rigorous listening test just to see which device they prefer. That is IMHO reasonable and completely understandable. People are free to select whichever methodology they like when choosing devices to buy and own.
The part that always surprises me is not this; it is the fact that so many people in this hobby seem to think that completely uncontrolled listening tests are as valid as (or even more valid than) controlled listening tests for comparing audio devices, and are willing to argue this to death, most without ever having truly researched or experienced the alternative.
