USB audio: Asynchronous is the only choice for compromise-free audio
In August, EDN published an article on USB audio - "Select your USB audio MCU with care: Scary stories from the test bench." The article caught our eye as it outlined some views on what designers should consider when selecting a microcontroller for USB audio and, critically, the relative merits of synchronous and asynchronous clocking.
[We'll assume the reader already understands the basics of USB audio, and the role of clocking in ensuring high-quality USB audio streams. However, if not, there's a detailed guide, by Henk Muller of XMOS, on the fundamentals of USB audio.]
The 'synchronous' perspective
The original article suggests that:
"…understanding of audio may have passed the current generation of microcontroller suppliers by, resulting in a generally rather poor standard of audio replay. This may be down to a tendency for pure-play MCU companies to treat audio as just another data interface format… Some of the methods proposed and implemented for generating the audio master clock… have no place in a high-quality audio product"
The article runs through some issues encountered with off-the-shelf USB audio products and vendor reference designs that employ adaptive and synchronous clocking methods: it describes segments of audio failing because sample rate changes were improperly implemented, as well as pitch shifts caused by a coarsely-stepped oscillator toggling between different sample rates (to "average out" at the desired sample rate).
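To make the "averaging" problem concrete, here is a minimal sketch of the arithmetic. It assumes a hypothetical oscillator that can only produce 44,000 Hz or 44,250 Hz when the stream calls for 44,100 Hz; these step values and the function name are our own illustration and are not taken from the article or from any particular device.

```python
import math

def toggle_plan(target_hz, low_hz, high_hz):
    """Return (fraction of time spent at high_hz, pitch error in cents at
    low_hz, pitch error in cents at high_hz) for an oscillator that toggles
    between two steps so that its average rate equals target_hz."""
    if low_hz == high_hz or not (low_hz <= target_hz <= high_hz):
        raise ValueError("target must lie between two distinct steps")
    # Time-weighted average: duty * high + (1 - duty) * low == target
    duty_high = (target_hz - low_hz) / (high_hz - low_hz)

    def cents(actual_hz):
        # 1200 cents per octave; negative means the audio plays flat
        return 1200.0 * math.log2(actual_hz / target_hz)

    return duty_high, cents(low_hz), cents(high_hz)

# Hypothetical step values, chosen only for illustration
duty, flat, sharp = toggle_plan(44_100, 44_000, 44_250)
print(f"time at upper step: {duty:.0%}")
print(f"pitch error at lower step: {flat:+.2f} cents")
print(f"pitch error at upper step: {sharp:+.2f} cents")
```

The long-term average rate is correct, but at any given instant the audio is playing slightly flat or slightly sharp. The coarser the steps, the larger the jump in pitch each time the oscillator toggles.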
The article goes on to say that asynchronous operation is "all the rage" at the high end. However, it asserts that asynchronous operation is not commonly supported by host devices, can cause undue processor load (especially if the host is a mobile device), may require extra component expense, and can cause dropped frames when video is attempting to remain synched with audio. The conclusion is that "asynchronous operation is generally limited to fairly simple audio-only".
Apples and apples
Clearly the examples put forward do indeed represent poor system implementations and result in (in some cases shockingly!) compromised audio performance. However, we would seek to rebalance the article, challenge some of its assertions and assumptions, and in particular disagree with the idea that asynchronous operation is only suitable for a small niche of simple applications.
The article picks holes in several examples of microcontrollers implementing synchronous and adaptive operation. Parts have been tested that, variously:
- Couldn't react appropriately to sample rate changes
- Employed coarsely-stepped synchronous or adaptive clock generators
- Implemented poor quality sample rate conversion, resulting in excessive noise floor spuriae.
This "rogues' gallery" of various faults has then been presented as typical of a homogeneous mass of USB audio microcontroller devices, together with the assertion that these problems can be solved by a particular device choice (or an external clock synthesizer device).
We have no problem with the devices the authors recommend, but they are not solutions for low-end designs. At the low end of USB audio microcontrollers, we would argue that price point drives device selection and, as a result, no one is expecting particularly excellent audio quality. One could also argue that the docks, speakers and other accessories into which these are integrated generally have relatively low-end electronics and speakers, such that many of these problems would be masked anyway.
Where we would also challenge the article is in its discussion of mid- to high-end USB audio microcontroller solutions and the implications thereof - as this is the space in which XMOS has had great success, and has accumulated considerable knowledge and expertise.
One criticism advanced is that asynchronous operation tends to be found in higher-end solutions. We wouldn't take any issue with that, except to say that asynchronous operation need not be limited to high-end solutions only. As the article advocates, the fact is, if you really value audio quality, you need to use asynchronous operation for best results. Indeed, XMOS specifically and deliberately only supports asynchronous USB audio, thereby aiming at, and being perfectly suited for, mid- to higher-end solutions.
The article only briefly mentions audio accessories where a dock is the USB host. But this type of connection in fact remains the predominant mode for phone docking stations; indeed for Android this is the only configuration that officially supports USB audio. One clear benefit of this mode is that the host docking station becomes the master of the audio clock, and as such, virtually all of the clocking problems presented in the article evaporate. So to dismiss the substantial portion of the USB audio market for docking stations where a phone is a USB device as 'nonsensical' is (at best) disingenuous.
As for the implication that a significant number of USB hosts do not support asynchronous operation - this is simply not true. On every significant platform where USB audio is supported (Windows, Mac, iOS, and Android), so is asynchronous operation.
It's hard not to think that these wrong assertions and assumptions simply show up the authors' devotion to the specific mode of USB audio clocking they work on. And who can blame them? We ourselves are all for asynchronous clocking as an answer for high-quality audio.
Advocating a particular solution for applications that choose to use adaptive or synchronous clocking is fine. But to marginalize other clocking modes is misleading. The fact that XMOS devices are at the heart of USB audio products from respected brands such as Arcam, Meridian, Sony and Sennheiser, all of which use asynchronous operation, says a lot about the reality of the concerns raised by the article.
On one thing, however, we do see eye to eye: the article concludes with the advice 'look for a vendor team that clearly knows what it's doing in the audio field.' With that, we couldn't agree more!