USB Device Class Definition for Audio Devices
Release 2.0
May 31, 2006 21
• Musical Instrument: A musical instrument, e.g. piano, guitar, synthesizer, drum machine, etc.
• Pro-Audio: A device not typically used by consumers of audio, e.g. editing equipment, multitrack recording equipment, etc.
• Audio/Video: The audio from a device that also supplies simultaneous video where the expectation is that the audio is tightly coupled to the video, e.g. a camcorder, a DVD player, a television, etc.
• Control Panel: A device that is used to control the flow of audio through a system of audio devices, such as a mixer panel.
• Other: Any device whose primary purpose is sufficiently different from the above descriptions as to be considered a completely different form of device.
The assigned codes can be found in Appendix A.7, “Audio Function Category Codes” of this specification. All other Category codes are unused and reserved by this specification for future use.
3.10 Clock Domains
A Clock Domain is defined as a zone within which all sampling clocks are derived from the same master clock. Therefore, within the same Clock Domain, all sampling clocks are synchronous and their timing relationship is constant. However, the sampling clocks can be at different sampling frequencies. The master clock can be generated in many different ways. An internal crystal could be the master clock, the USB start of frame (SOF) could be used or even an externally supplied clock could serve as a master clock.
In general, multiple different Clock Domains can exist within the same audio function.
3.11 Audio Synchronization Types
Each isochronous audio endpoint used in an AudioStreaming interface belongs to a synchronization type as defined in Section 5 of the USB Specification. The following sections briefly describe the possible synchronization types.
3.11.1 Asynchronous
Asynchronous isochronous audio endpoints produce or consume data at a rate that is locked either to a clock external to the USB or to a free-running internal clock. These endpoints cannot be synchronized to a start of frame (SOF) or to any other clock in the USB domain.
3.11.2 Synchronous
The clock system of synchronous isochronous audio endpoints can be controlled externally through SOF synchronization. Such an endpoint must lock its sample clock to the 1 ms SOF tick. Optionally, a high-speed endpoint could lock its clock to the 125 μs SOF that occurs at the beginning of every microframe to improve accuracy.
3.11.3 Adaptive
Adaptive isochronous audio endpoints are able to source or sink data at any rate within their operating range. This implies that these endpoints must run an internal process that allows them to match their natural data rate to the data rate that is imposed at their interface.
3.12 Inter Channel Synchronization
An important issue when dealing with audio, and 3-D audio in particular, is the phase relationship between
different physical audio channels. Indeed, the virtual spatial position of an audio source is directly related
to and influenced by the phase differences that are applied to the different physical audio channels used to
reproduce the audio source. Therefore, it is imperative that USB audio functions respect the phase
relationship among all related audio channels. However, the responsibility for maintaining the phase
relation is shared among the USB host software, hardware, and all of the audio peripheral devices or functions. To provide a manageable phase model to the host, an audio function is required to report its internal delay for every AudioStreaming interface. This delay is expressed in number of (micro)frames and is due to the fact that the audio function must buffer at least one (micro)frame worth of samples to effectively remove packet jitter within a (micro)frame. Furthermore, some audio functions will introduce extra delay because they need time to correctly interpret and process the audio data streams (for example, compression and decompression). However, it is required that an audio function introduces only an integer number of (micro)frames of delay. In the case of an audio source function, this implies that the audio function must guarantee that the first sample it fully acquires after SOFn (start of (micro)frame n) is the first sample of the packet it sends over USB during (micro)frame (n+δ). δ is the audio function’s internal delay expressed in (micro)frames. The same rule applies for an audio sink function. The first sample in the packet, received over USB during (micro)frame n, must be the first sample that is fully reproduced during (micro)frame (n+δ). By following these rules, phase jitter is limited to ±1 audio sample. It is up to the host software to synchronize the different audio streams by scheduling the correct packets at the correct moment, taking into account the internal delays of all audio functions involved.
3.13 Audio Function Topology
To be able to manipulate the physical properties of an audio function, its functionality must be divided into addressable Entities. Two types of such generic Entities are identified and are called Units and Terminals. In addition, a special type of Entity is defined. These Entities are called Clock Entities and they are used to describe and manipulate the clock signals inside the audio function.
Units provide the basic building blocks to fully describe most audio functions. Audio functions are built by connecting together several of these Units. A Unit has one or more Input Pins and a single Output Pin, where each Pin represents a cluster of logical audio channels inside the audio function (see Section 3.13.1, “Audio Channel Cluster”). Units are wired together by connecting their I/O Pins according to the required topology. Note that it is perfectly legal to connect the Output Pin of an Entity to multiple Input Pins residing on different other Entities, effectively creating a one-to-many connection.
In addition, the concept of a Terminal is introduced. There are two types of Terminals. An Input Terminal (IT) is an Entity that represents a starting point for audio channels inside the audio function. An Output Terminal (OT) represents an ending point for audio channels. From the audio function’s perspective, a USB endpoint is a typical example of an Input or Output Terminal. It either provides data streams to the audio function (IT) or consumes data streams coming from the audio function (OT). Likewise, a Digital to Analog converter, built into the audio function is represented as an Output Terminal in the audio function’s model. Connection to the Terminal is made through its single Input or Output Pin.
Input Pins of a Unit are numbered starting from one up to the total number of Input Pins on the Unit. The Output Pin number is always one. Input Terminals have only one Output Pin and its number is always one. Output Terminals have only one Input Pin and it is always numbered one.
The information, traveling over I/O Pins is not necessarily of a digital nature. It is perfectly possible to use the Unit model to describe fully analog or even hybrid audio functions. The mere fact that I/O Pins are connected together is a guarantee (by construction) that the protocol and format, used over these connections (analog or digital), is compatible on both ends.
Every Unit in the audio function is fully described by its associated Unit descriptor (UD). The Unit descriptor contains all necessary fields to identify and describe the Unit. Likewise, there is a Terminal descriptor (TD) for every Terminal in the audio function. In addition, these descriptors provide all necessary information about the topology of the audio function. They fully describe how Terminals and Units are interconnected.
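As an informal illustration of the Pin numbering rules in this section, the following Python sketch models Entities and their connections. The `Entity` class, its field names, the `connect` method, and the Entity IDs are hypothetical and are not defined by this specification; the sketch only assumes that Input Pins are numbered from one and that every Entity other than an Output Terminal has a single Output Pin numbered one.

```python
from dataclasses import dataclass, field


@dataclass
class Entity:
    """Hypothetical model of an addressable Entity (Unit or Terminal)."""
    entity_id: int
    kind: str            # label chosen for this sketch, e.g. "IT", "OT", "SU", "MU"
    num_input_pins: int  # 0 for an Input Terminal, 1 for an Output Terminal
    # Maps an Input Pin number (starting at one) to the ID of the Entity
    # whose single Output Pin (always number one) drives that Input Pin.
    inputs: dict = field(default_factory=dict)

    def connect(self, input_pin, source):
        """Wire the Output Pin of `source` to one of this Entity's Input Pins."""
        if source.kind == "OT":
            raise ValueError("an Output Terminal has no Output Pin")
        if not 1 <= input_pin <= self.num_input_pins:
            raise ValueError(
                f"Input Pin {input_pin} does not exist on Entity {self.entity_id}"
            )
        self.inputs[input_pin] = source.entity_id


usb_out = Entity(1, "IT", 0)
line_in = Entity(2, "IT", 0)
selector = Entity(3, "SU", 2)
mixer = Entity(4, "MU", 2)

selector.connect(1, usb_out)
selector.connect(2, line_in)
# The same Output Pins also feed the mixer: a one-to-many connection.
mixer.connect(1, usb_out)
mixer.connect(2, line_in)

print(selector.inputs)  # {1: 1, 2: 2}
print(mixer.inputs)     # {1: 1, 2: 2}
```

Storing connections on the receiving Entity, keyed by Input Pin number, mirrors the way a one-to-many connection is expressed: each consumer names its source, and a source needs no list of its consumers.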
This specification describes the following eight different types of standard Units and Terminals that are considered adequate to represent most audio functions available today and in the near future:
• Input Terminal (IT)
• Output Terminal (OT)
• Mixer Unit (MU)
• Selector Unit (SU)
• Feature Unit (FU)
• Sampling Rate Converter Unit
• Effect Unit (EU)
• Processing Unit (PU)
• Extension Unit (XU)
Besides Units and Terminals, the concept of a Clock Entity is introduced. Three types of Clock Entities are defined by this specification:
• Clock Source (CS)
• Clock Selector (CX)
• Clock Multiplier (CM)
A Clock Source provides a certain sampling clock frequency to all or part of the audio function. A Clock Source can represent an internal sampling frequency generator, but it can also represent an external sampling clock signal input to the audio function.
A Clock Source has a single Clock Output Pin that carries the sampling clock signal, represented by the Clock Source. The Clock Output Pin number is always one.
A Clock Selector is used to select between multiple sampling clock signals that might be available inside an audio function. It has multiple Clock Input Pins and a single Clock Output pin. Clock Input Pins are numbered starting from one up to the total number of Clock Input Pins on the Clock Selector. The Clock Output Pin number is always one.
A Clock Multiplier is used to derive a new clock signal with a different frequency from the clock signal at its single Clock Input Pin. It does this by multiplying that clock signal frequency by a numerator P and then dividing it by a denominator Q. The values P and Q are fixed for a given Clock Multiplier. The new clock signal is guaranteed to be synchronous with the input clock signal. A Clock Multiplier has one Input Pin and one Output Pin and their numbers are always one.
By using a combination of Clock Source, Clock Selector, and Clock Multiplier Entities, the most complex clock systems can be represented and exposed to Host software.
Clock Input and Output Pins are fundamentally different from Input and Output Pins defined for Units and Terminals. Clock Pins carry only clock signals and therefore cannot be connected to Unit or Terminal Input and Output Pins. They are only used to express clock circuitry topology.
Each Input and Output Terminal has a single Clock Input Pin that is connected to a Clock Output Pin of a Clock Entity. The clock signal carried by that Clock Output Pin determines at which sampling frequency the hardware represented by the Terminal is operating.
Each Sampling Rate Converter Unit has two Clock Input Pins that are typically connected to the Clock Output Pins of two different Clock Entities. The clock signals carried by those Clock Output Pins determine the sampling frequencies between which the Sampling Rate Converter Unit is converting.
Each Clock Entity is described by a Clock Entity descriptor (CED). The Clock Entity descriptor contains all necessary fields to identify and describe the Clock Entity.
The descriptors are further detailed in Section 4, “Descriptors” of this document.
The ensemble of Unit descriptors, Terminal descriptors and Clock Entity descriptors provide a full description of the audio function to the Host. This information is typically retrieved from the device at enumeration time. By parsing the descriptors, a generic audio driver should be able to fully control the audio function, except for the functionality represented by Extension Units. Those require vendor-specific extensions to the audio class driver.
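The Clock Multiplier relationship described above (output frequency equals input frequency multiplied by P and divided by Q) can be sketched in a few lines of Python. The function name `multiplied_clock_hz` and the P and Q values used below are illustrative assumptions, not values taken from this specification. Exact rational arithmetic is used so that the ratio is not disturbed by floating-point rounding.

```python
from fractions import Fraction


def multiplied_clock_hz(input_hz, p, q):
    """Return the Clock Multiplier output frequency: input_hz * P / Q."""
    if p <= 0 or q <= 0:
        raise ValueError("P and Q must be positive integers")
    return Fraction(input_hz) * Fraction(p, q)


# Hypothetical multiplier with P = 1, Q = 2: halves a 96 kHz clock.
print(multiplied_clock_hz(96_000, 1, 2))      # 48000

# Hypothetical multiplier with P = 147, Q = 160: 48 kHz becomes 44.1 kHz.
print(multiplied_clock_hz(48_000, 147, 160))  # 44100
```

Because P and Q are fixed integers, the output clock keeps a constant rational relationship to the input clock, which is consistent with the statement that both signals are synchronous.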
Important Note: The complete set of audio function descriptors provides only a static initial description of the audio function. During operation, a number of events can happen that force the audio function to change its state. Host software must be notified of these changes to remain ‘in sync’ with the audio function at all times. An extensive interrupt mechanism is in place to report any and all state changes to Host software.
Figure 3-2, “Inside the Audio Function” illustrates the concepts defined above. Using the iconic symbols defined further, it describes a hypothetical audio function that incorporates 16 Entities: three Input Terminals, five Units, three Output Terminals, two Clock Sources, a Clock Selector, and two Clock Multipliers. Each Entity has its unique ID (from 1 to 16) and descriptor that fully describes the functionality of the Entity and also how that particular Entity is connected into the overall topology of the audio function.
Input Terminal 1 (IT 1) could be the representation of a USB OUT endpoint used to stream audio from the Host to the audio device. IT 2 could be the representation of an analog Line-In connector on the audio device whereas IT 3 could be an analog Microphone-In connector on the audio device. Selector Unit 4 (SU 4) selects between the audio coming from the Host and the audio present at the Line In connector. Feature Unit 5 (FU 5) is then used to manipulate the audio (Volume, Bass, Treble …) before it is presented to Output Terminal 9 (OT 9). OT 9 could be the representation of a Headphone Out jack on the audio device.
At the same time, all three input sources (USB OUT, Line In, and Mic In) are connected to a Mixer Unit 6 (MU 6) that effectively mixes the three sources together. The output of the Mixer is then fed into a Processing Unit 7 (PU 7) that could perform some audio processing algorithm(s) on the mix. The result is in turn sent to FU 8 where some final adjustments to the audio (Volume …) are made. FU 8 is connected to OT 10 and OT 11. OT 10 could represent speakers incorporated into the audio device and OT 11 could represent a USB IN endpoint used to send the processed audio to the Host for recording purposes.
Clock Source 12 (CS 12) could represent an internal sampling frequency generator, running at 96 kHz for instance. Clock Source 15 (CS 15) could be the representation of an external master sampling clock input that can be used to synchronize the device to an external source. Clock Selector 13 (CX 13) enables selection between the two available Clock Sources. The output of CX 13 provides a sampling frequency of 96 kHz to IT 1, IT 2, IT 3, OT 10, and OT 11. Clock Multiplier CM 14 further multiplies that clock signal by 0.5, providing a sampling frequency of 48 kHz to OT 9 for driving the headphone. Since all sampling frequencies used inside the audio function are at all times derived from a single master clock (internal or external), all audio streams in the audio function are synchronous.
The descriptors, associated with each Entity clearly indicate to the Host what the exact nature of each Entity is. For instance, the IT 2 descriptor contains a field that indicates to the Host that it represents an external connector on the device, used as an analog Line In. Likewise, the MU 6 descriptor has a field that indicates that its Input Pin 1 is connected to the Output Pin of IT 1, Input Pin 2 is connected to the Output Pin of IT 2, and Input Pin 3 is connected to the Output Pin of IT 3. For further details on descriptor contents, refer to Section 4, “Descriptors” of this document.
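The hypothetical audio function of Figure 3-2 can be written down as plain data to check the connections described in the text. The following Python sketch is only an illustration: the dictionary layout and the function names `reachable_inputs` and `terminal_clock_hz` are assumptions, and the Input Pin order of SU 4 is chosen arbitrarily because the text does not state it. The clock helper assumes that the Clock Selector currently passes the 96 kHz signal of CS 12.

```python
from fractions import Fraction

# Audio connections: Entity ID -> IDs of the Entities feeding its Input Pins.
AUDIO_INPUTS = {
    4: [1, 2],      # SU 4: audio from the Host (IT 1) or Line In (IT 2)
    5: [4],         # FU 5: Volume, Bass, Treble
    6: [1, 2, 3],   # MU 6: Input Pins 1..3 fed by IT 1, IT 2, IT 3
    7: [6],         # PU 7: processing on the mix
    8: [7],         # FU 8: final adjustments
    9: [5],         # OT 9: Headphone Out
    10: [8],        # OT 10: speakers
    11: [8],        # OT 11: USB IN endpoint
}
INPUT_TERMINALS = {1, 2, 3}


def reachable_inputs(entity_id):
    """Return the Input Terminals whose audio can reach the given Entity."""
    if entity_id in INPUT_TERMINALS:
        return {entity_id}
    found = set()
    for source in AUDIO_INPUTS[entity_id]:
        found |= reachable_inputs(source)
    return found


def terminal_clock_hz(terminal_id, selected_hz=96_000):
    """Sampling frequency at a Terminal for a given Clock Selector output."""
    if terminal_id == 9:
        # OT 9 is clocked through CM 14, which multiplies by 0.5 (P/Q = 1/2).
        return Fraction(selected_hz) * Fraction(1, 2)
    return Fraction(selected_hz)


print(sorted(reachable_inputs(9)))    # [1, 2]
print(sorted(reachable_inputs(11)))   # [1, 2, 3]
print(terminal_clock_hz(9))           # 48000
print(terminal_clock_hz(10))          # 96000
```

Walking the connections backwards from an Output Terminal is essentially what a generic driver can do after parsing the descriptors, since each descriptor names the sources of its own Input Pins.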
[Diagram: Input Terminals 1 to 3 (USB OUT, Analog Line IN, Analog Mic IN), Selector Unit 4, Feature Unit 5, Mixer Unit 6, Processing Unit 7, Feature Unit 8, Output Terminals 9 to 11 (Headphone, Speakers, USB IN), Clock Source 12, Clock Selector 13, Clock Multiplier 14 (P/Q), and Clock Source 15, each shown with its descriptor]
Figure 3-2: Inside the Audio Function
Inside an Entity, functionality is further described through Audio Controls. A Control typically provides
access to a specific audio or clock property. Each Control has a set of attributes that can be manipulated or
that present additional information on the behavior of the Control. A Control can have the following
attributes:
• Current setting attribute
• Range attribute triplet consisting of:
    • Minimum setting attribute
    • Maximum setting attribute
    • Resolution attribute
As an example, consider a Volume Control inside a Feature Unit. By issuing the appropriate Get requests,
the Host software can obtain values for the Volume Control’s attributes and, for instance, use them to
correctly display the Control on the screen. Setting the Volume Control’s current attribute allows the Host
software to change the volume setting of the Volume Control.
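To illustrate how Host software might use the Range attribute triplet before setting the Current attribute, the following Python sketch clamps a requested value and snaps it to the nearest step. It assumes that valid settings are exactly the values Minimum + k × Resolution that do not exceed Maximum; this excerpt does not state that rule, so it is an assumption of the sketch. The function name and the numeric values are hypothetical and are not tied to any particular unit.

```python
def nearest_valid_setting(requested, minimum, maximum, resolution):
    """Clamp `requested` to [minimum, maximum] and snap it to the step grid.

    Assumes integer settings of the form minimum + k * resolution.
    """
    if resolution <= 0 or maximum < minimum:
        raise ValueError("invalid Range attribute triplet")
    clamped = max(minimum, min(maximum, requested))
    # Round to the nearest step, then make sure the result stays in range.
    steps = (clamped - minimum + resolution // 2) // resolution
    max_steps = (maximum - minimum) // resolution
    return minimum + min(steps, max_steps) * resolution


# Hypothetical Volume Control reporting Minimum = -60, Maximum = 0, Resolution = 2.
print(nearest_valid_setting(-13, -60, 0, 2))   # -12
print(nearest_valid_setting(5, -60, 0, 2))     # 0
print(nearest_valid_setting(-100, -60, 0, 2))  # -60
```

A user interface could use the same three values to size a slider: its end points come from Minimum and Maximum, and its step size from Resolution.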
Additionally, each Entity in an audio function can have a memory space attribute. This attribute optionally
provides generic access to the internal memory space of the Entity. This could be used to implement
vendor-specific control of an Entity through generically provided access.
3.13.1 Audio Channel Cluster
An audio channel cluster is a grouping of audio channels that carry tightly related synchronous audio
information. Inside the audio function, complete abstraction is made of the actual physical representation