Audio Device Document 1.0(21-25)

USB Device Class Definition for Audio Devices
Release 1.0 March 18, 1998 21

· Resolution attribute

As an example, consider a Volume Control inside a Feature Unit. By issuing the appropriate Get requests, the Host software can obtain values for the Volume Control’s attributes and, for instance, use them to correctly display the Control on the screen. Setting the Volume Control’s current attribute allows the Host software to change the volume setting of the Volume Control.
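The Get/Set interaction described above can be sketched as a toy model. This is only an illustration: the class name, the field names, the dB units, and the clamp-and-snap policy in `set_current` are assumptions made for this example and are not taken from the specification.

```python
from dataclasses import dataclass


@dataclass
class VolumeControl:
    """Toy model of a Control exposing current, minimum, maximum and resolution (in dB)."""
    current: float
    minimum: float
    maximum: float
    resolution: float

    def get(self, attribute: str) -> float:
        # Stand-in for a Get request on a single attribute.
        return getattr(self, attribute)

    def set_current(self, value: float) -> float:
        # Stand-in for a Set request on the current attribute.
        # Assumed policy: clamp to the reported range, then snap to the resolution.
        clamped = max(self.minimum, min(self.maximum, value))
        steps = round((clamped - self.minimum) / self.resolution)
        self.current = self.minimum + steps * self.resolution
        return self.current


def slider_position(ctrl: VolumeControl) -> float:
    """Map the current setting to a 0.0..1.0 on-screen slider position."""
    span = ctrl.get("maximum") - ctrl.get("minimum")
    return (ctrl.get("current") - ctrl.get("minimum")) / span
```

Host software built along these lines would read the range and resolution once to draw the slider, then issue a Set on the current attribute whenever the user moves it.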

Additionally, each Entity (Unit or Terminal) in an audio function can have a memory space attribute. This attribute optionally provides generic access to the internal memory space of the Entity. This could be used to implement vendor-specific control of an Entity through generically provided access.

3.5.1 Input Terminal

The Input Terminal (IT) is used to interface between the audio function’s ‘outside world’ and other Units in the audio function. It serves as a receptacle for audio information flowing into the audio function. Its function is to represent a source of incoming audio data after this data has been properly extracted from the original audio stream into the separate logical channels that are embedded in this stream (the decoding process). The logical channels are grouped into an audio channel cluster and leave the Input Terminal through a single Output Pin.

An Input Terminal can represent inputs to the audio function other than USB OUT endpoints. A Line-In connector on an audio device is an example of such a non-USB input. However, if the audio stream is entering the audio function by means of a USB OUT endpoint, there is a one-to-one relationship between that endpoint and its associated Input Terminal. The class-specific endpoint descriptor contains a field that holds a direct reference to this Input Terminal. The Host needs to use both the endpoint descriptors and the Input Terminal descriptor to get a full understanding of the characteristics and capabilities of the Input Terminal. Stream-related parameters are stored in the endpoint descriptors. Control-related parameters are stored in the Terminal descriptor.

The conversion process from incoming, possibly encoded audio streams to logical audio channels always involves some kind of decoding engine. This specification defines several types of decoding. These decoding types range from rather trivial decoding schemes like converting interleaved stereo 16-bit PCM data into a Left and Right logical channel to very sophisticated schemes like converting an MPEG-2 7.1 encoded audio stream into Left, Left Center, Center, Right Center, Right, Right Surround, Left Surround and Low Frequency Enhancement logical channels. The decoding engine is considered part of the Entity that actually receives the encoded audio data streams (like a USB AudioStreaming interface). The type of decoding is therefore implied in the wFormatTag value, located in the AudioStreaming interface descriptor. Requests specific to the decoding engine must be directed to the AudioStreaming interface. The associated Input Terminal deals with the logical channels after they have been decoded.
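As a rough illustration of the trivial end of that range, the sketch below splits interleaved stereo 16-bit PCM into Left and Right logical channels. The function name is hypothetical, and the byte order (little-endian) and sample order (Left first) are assumptions of this example rather than statements from this section.

```python
import struct


def deinterleave_stereo_pcm16(payload: bytes) -> tuple[list[int], list[int]]:
    """Split interleaved L/R signed 16-bit samples into two logical channels."""
    if len(payload) % 4 != 0:
        raise ValueError("payload must hold whole stereo frames (4 bytes each)")
    # Assumption: samples are little-endian and the Left sample comes first.
    samples = struct.unpack(f"<{len(payload) // 2}h", payload)
    left = list(samples[0::2])
    right = list(samples[1::2])
    return left, right
```

In the model described above, this step belongs to the AudioStreaming interface; the Input Terminal only sees the resulting Left and Right channels.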

The symbol for the Input Terminal is depicted in the following figure:

[Image here]

Figure 3-1: Input Terminal Icon

3.5.2 Output Terminal

The Output Terminal (OT) is used to interface between Units inside the audio function and the ‘outside world’. It serves as an outlet for audio information flowing out of the audio function. Its function is to represent a sink of outgoing audio data before this data is properly packed from the original separate logical channels into the outgoing audio stream (the encoding process). The audio channel cluster enters the Output Terminal through a single Input Pin.

An Output Terminal can represent outputs from the audio function other than USB IN endpoints. A speaker built into an audio device or a Line Out connector is an example of such a non-USB output. However, if the audio stream is leaving the audio function by means of a USB IN endpoint, there is a one-to-one relationship between that endpoint and its associated Output Terminal. The class-specific endpoint descriptor contains a field that holds a direct reference to this Output Terminal. The Host needs to use both the endpoint descriptors and the Output Terminal descriptor to fully understand the characteristics and capabilities of the Output Terminal. Stream-related parameters are stored in the endpoint descriptors. Control-related parameters are stored in the Terminal descriptor.

The conversion process from incoming logical audio channels to possibly encoded audio streams always involves some kind of encoding engine. This specification defines several types of encoding, ranging from rather trivial to very sophisticated schemes. The encoding engine is considered part of the Entity that actually transmits the encoded audio data streams (like a USB AudioStreaming interface). The type of encoding is therefore implied in the wFormatTag value, located in the AudioStreaming interface descriptor. Requests specific to the encoding engine must be directed to the AudioStreaming interface. The associated Output Terminal deals with the logical channels before encoding.

The symbol for the Output Terminal is depicted in the following figure:

[Image here]

Figure 3-2: Output Terminal Icon

3.5.3 Mixer Unit

The Mixer Unit (MU) transforms a number of logical input channels into a number of logical output channels. The input channels are grouped into one or more audio channel clusters. Each cluster enters the Mixer Unit through an Input Pin. The logical output channels are grouped into one audio channel cluster and leave the Mixer Unit through a single Output Pin.

Every input channel can virtually be mixed into all of the output channels. If n is the total number of input channels and m is the number of output channels, then there are n × m mixing Controls in the Mixer Unit. Not all of these Controls have to be physically implemented. Some Controls can have a fixed setting and be non-programmable. The Mixer Unit Descriptor reports which Controls are programmable in the bmControls bitmap field. Using this model, a permanent connection can be implemented by reporting the Control as non-programmable and by returning a Control setting of 0 dB when requested. Likewise, a missing connection can be implemented by reporting the Control as non-programmable and by returning a Control setting of -∞ dB.
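The mixing model can be sketched as a table of dB settings, one per (input, output) pair. This is a minimal illustration with hypothetical names; it does not model the bmControls bitmap, and the dB-to-linear conversion (20·log10 amplitude convention) is an assumption of the example.

```python
import math

OFF = -math.inf  # setting reported for a missing connection


def db_to_gain(db: float) -> float:
    """Convert a dB setting to a linear amplitude gain; -inf dB maps to 0.0."""
    return 0.0 if db == OFF else 10.0 ** (db / 20.0)


def mix(inputs: list[float], settings_db: list[list[float]]) -> list[float]:
    """Mix n input samples into m output samples using an n-row, m-column table."""
    if len(settings_db) != len(inputs):
        raise ValueError("need one row of settings per input channel")
    m = len(settings_db[0])
    return [
        sum(sample * db_to_gain(row[j]) for sample, row in zip(inputs, settings_db))
        for j in range(m)
    ]
```

For example, with three inputs and two outputs, a table such as `[[0.0, OFF], [OFF, 0.0], [-20.0, -20.0]]` passes input 1 to output 1 and input 2 to output 2 as permanent connections, and adds input 3 to both outputs attenuated by 20 dB.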

The symbol for the Mixer Unit can be found in the following figure:

[Image here]

Figure 3-3: Mixer Unit Icon

3.5.4 Selector Unit

The Selector Unit (SU) selects from n audio channel clusters, each containing m logical input channels, and routes them unaltered to the single output audio channel cluster, containing m output channels. It represents a multi-channel source selector, capable of selecting between n m-channel sources. It has n Input Pins and a single Output Pin.
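A minimal sketch of this behavior follows. The function name and the 1-based numbering of Input Pins are assumptions made for the example.

```python
def select_cluster(clusters: list[list[float]], selector: int) -> list[float]:
    """Route one of n m-channel clusters, unaltered, to the single output cluster."""
    if len({len(cluster) for cluster in clusters}) != 1:
        raise ValueError("all input clusters must contain the same number of channels")
    # Assumption: Input Pins are numbered starting at 1.
    if not 1 <= selector <= len(clusters):
        raise ValueError("selector must name an existing Input Pin")
    return list(clusters[selector - 1])
```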

The symbol for the Selector Unit can be found in the following figure:

[Image here]

Figure 3-4: Selector Unit Icon

3.5.5 Feature Unit

The Feature Unit (FU) is essentially a multi-channel processing unit that provides basic manipulation of the incoming logical channels. For each logical channel, the Feature Unit optionally provides audio Controls for the following features:

· Volume
· Mute
· Tone Control (Bass, Mid, Treble)
· Graphic Equalizer
· Automatic Gain Control
· Delay
· Bass Boost
· Loudness

In addition, the Feature Unit optionally provides the above audio Controls but now influencing all channels of the cluster at once. In this way, ‘master’ Controls can be implemented. The master Controls are cascaded after the individual channel Controls. This setup is especially useful in multi-channel systems where the individual channel Controls can be used for channel balancing and the master Controls can be used for overall settings.

The logical channels in the cluster are numbered from one to the total number of channels in the cluster. The ‘master’ channel has channel number zero and is always virtually present.

The Feature Unit Descriptor reports which Controls are present for every channel in the Feature Unit and for the ‘master’ channel. All logical channels in a Feature Unit are fully independent. There exist no cross couplings among channels within the Feature Unit. There are as many logical output channels as there are input channels. These are grouped into one audio channel cluster that enters the Feature Unit through a single Input Pin and leaves the Unit through a single Output Pin.
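The cascade of per-channel Controls followed by master Controls can be sketched for Volume and Mute only. The function name, the dictionary/set representation, and the default of 0 dB for channels without a setting are assumptions of this example.

```python
def apply_feature_unit(
    cluster: list[float],
    volume_db: dict[int, float],
    muted: frozenset[int] = frozenset(),
) -> list[float]:
    """Apply per-channel Volume/Mute, then the master (channel 0) Volume/Mute."""
    # Channel 0 is the master channel; it acts on every channel of the cluster.
    master = 0.0 if 0 in muted else 10.0 ** (volume_db.get(0, 0.0) / 20.0)
    output = []
    # Logical channels are numbered from 1 up to the size of the cluster.
    for number, sample in enumerate(cluster, start=1):
        gain = 0.0 if number in muted else 10.0 ** (volume_db.get(number, 0.0) / 20.0)
        output.append(sample * gain * master)
    return output
```

Because each output sample depends only on the input sample with the same channel number, the sketch has no cross coupling between channels, and the output cluster has exactly as many channels as the input cluster.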

The symbol for the Feature Unit is depicted in the following figure:

[Image here]

Figure 3-5: Feature Unit Icon

3.5.6 Processing Unit

The Processing Unit (PU) represents a functional block inside the audio function that transforms a number of logical input channels, grouped into one or more audio channel clusters, into a number of logical output channels, grouped into one audio channel cluster. Therefore, the Processing Unit can have multiple Input Pins and has a single Output Pin. This specification defines several standard transforms (algorithms) that are considered necessary to support additional audio functionality; these transforms are not covered by the other Unit types but are commonplace enough to be included in this specification so that a generic driver can provide control for them.

Processing Units are encouraged to support at least the Enable Processing Control, allowing the Host software to bypass whatever functionality is incorporated in the Processing Unit.

3.5.6.1 Up/Down-mix Processing Unit

The Up/Down-mix Processing Unit provides facilities to derive m output audio channels from n input audio channels. The algorithms and transforms applied to accomplish this are not defined by this specification and can be proprietary. The input channels are grouped into one input channel cluster that enters the Processing Unit over a single Input Pin. Likewise, all output channels are grouped into one output channel cluster, leaving the Processing Unit over a single Output Pin.

The Up/Down-mix Processing Unit can support multiple modes of operation (besides the bypass mode, controlled by the Enable Processing Control). The available input audio channels are dictated by the Unit or Terminal to which the Up/Down-mix Processing Unit is connected. The Up/Down-mix Processing Unit descriptor reports which up/down-mixing modes the Unit supports through its waModes() array. Each element of the waModes() array indicates which output channels in the output cluster are effectively used in a particular mode. The unused output channels in the output cluster must produce muted output. Mode selection is implemented using the Get/Set Control request.
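The relationship between a mode and the output cluster can be sketched with one bitmask per mode. The names `OUTPUT_CLUSTER`, `WA_MODES`, and `apply_mode` are hypothetical, and the bit-to-channel assignment (bit i corresponds to the i-th channel of the output cluster) is an assumption chosen for illustration, not the descriptor layout defined by the specification.

```python
# Hypothetical six-channel output cluster, listed in bit order (bit 0 first).
OUTPUT_CLUSTER = ["Left", "Right", "Center", "LFE", "Left Surround", "Right Surround"]

# Hypothetical mode table: each element marks the output channels a mode uses.
WA_MODES = [
    0b000011,  # Left, Right
    0b000111,  # Left, Right, Center
    0b111111,  # all six channels
]


def apply_mode(samples: list[float], mode_mask: int) -> list[float]:
    """Pass channels whose bit is set in mode_mask; mute all unused channels."""
    return [
        sample if (mode_mask >> index) & 1 else 0.0
        for index, sample in enumerate(samples)
    ]


def used_channels(mode_mask: int) -> list[str]:
    """List the names of the output channels that a mode effectively uses."""
    return [name for index, name in enumerate(OUTPUT_CLUSTER) if (mode_mask >> index) & 1]
```

The muting in `apply_mode` mirrors the requirement that unused output channels in the output cluster produce muted output.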

As an example, consider the case where an Up/Down-mix Processing Unit is connected to an Input Terminal, producing Dolby™ AC-3 5.1 decoded audio. The input audio channel cluster to the Up/Down-mix Processing Unit therefore contains Left, Right, Center, Left Surround, Right Surround and LFE logical channels.

Suppose the audio function’s hardware is limited to reproducing only dual channel audio. Then the Up/Down-mix Processing Unit could use some (sophisticated) algorithms to down-mix the available spatial audio information into two (‘enriched’) channels so that the maximum spatial effects can be experienced, using only two channels. It is left to the audio function’s discretion to use the appropriate down-mix algorithm depending on the physical nature of the Output Terminal to which the Up/Down-mix Processing Unit is routed. For instance, a different down-mix algorithm is needed whether the ‘enriched’ stereo stream is sent to a pair of speakers or to a headphone set. However, this knowledge already resides within the audio function and deciding which down-mix algorithm to use does not need Host intervention.
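Since the specification leaves the down-mix algorithm to the audio function, the following is only one possible sketch, not a method taken from this document. It folds Center and the Surround channels into Left and Right at roughly -3 dB and discards LFE; the function name and these coefficients are assumptions for illustration.

```python
import math


def downmix_51_to_stereo(
    left: float,
    right: float,
    center: float,
    left_surround: float,
    right_surround: float,
    lfe: float,
) -> tuple[float, float]:
    """Fold one 5.1 sample frame into a stereo frame (illustrative coefficients)."""
    k = math.sqrt(0.5)  # about -3 dB; an assumed, commonly used weighting
    # LFE is intentionally dropped in this sketch.
    out_left = left + k * center + k * left_surround
    out_right = right + k * center + k * right_surround
    return out_left, out_right
```

An audio function could hold several such routines and pick one according to the Output Terminal in use, for example one tuned for speakers and another for headphones, without involving the Host.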

As a second interesting example, suppose the hardware is capable of servicing eight discrete audio channels, for instance, a full-fledged MPEG-2 7.1 system. Now the Up/Down-mix Processing Unit could use certain techniques to derive meaningful content for the extra audio channels (Left of Center, Right of Center) that are present in the output cluster and are missing in the input channel cluster (AC-3 5.1). This is a typical example of an up-mix situation.

The symbol for the Up/Down-mix Processing Unit is depicted in the following figure:

[Image here]

Figure 3-6: Up/Down-mix Processing Unit Icon

3.5.6.2 Dolby Prologic Processing Unit

The Dolby Prologic™ decoding process can be seen as an operator on the Left and Right logical channels of the input cluster of the Unit. It is capable of extracting additional audio data (Center and/or Surround channels) from information that is transparently ‘superimposed’ on the Left and Right audio channels. It therefore differs from a true decoding process as defined for an Input Terminal. It can be applied on a logical audio stream anywhere in the audio function. The Dolby Prologic Processing Unit is a specialized derivative of the Up/Down-mix Processing Unit.

The Dolby Prologic Processing Unit can have the following modes of operation (besides the bypass mode, controlled by the Enable Processing Control):

· Left, Right, Center channel decoding
· Left, Right, Surround channel decoding
· Left, Right, Center, Surround decoding

The Dolby Prologic Processing Unit descriptor reports which modes the Unit supports. Mode selection is then implemented using the Get/Set Control request.

Dolby Prologic Surround Delay Control is considered not to be part of the Dolby Prologic™ Processing Unit and must be handled by a separate Feature Unit.

Dolby Prologic Bass Management is the local responsibility of the audio function and should not be controllable from the Host.

The symbol for the Dolby Prologic Processing Unit can be found in the following picture:

[Image here]

Figure 3-7: Dolby Prologic Processing Unit Icon

3.5.6.3 3D-Stereo Extender Processing Unit

The 3D-Stereo Extender Processing Unit operates on Left and Right channels only. It processes an existing stereo (two-channel) soundtrack to add spaciousness and to make it appear to originate from outside the Left/Right speaker locations. Extended stereo effects can be achieved via various straightforward methods. The algorithms and transforms applied to accomplish this are not defined by this specification and can be proprietary. The effects of the 3D-Stereo Extender Processing Unit can be bypassed at all times through manipulation of the Enable Processing Control. The size of the listening area (the area in which the listener has to be placed with respect to the speakers to hear the effect, also called the sweet spot) can be controlled using the proper Get/Set Control request.

The symbol for the 3D-Stereo Extender Unit is depicted in the following figure:

[Image here]

Figure 3-8: 3D-Stereo Extender Processing Unit Icon

3.5.6.4 Reverberation Processing Unit

The Reverberation Processing Unit is used to add room acoustics effects to the original audio information. These effects can range from small room reverberation effects to simulation of a large concert hall reverberation. A number of parameters can be manipulated to obtain the desired reverberation effects:

· Reverb Type: Room1, Room2, Room3, Hall1, Hall2, Plate, Delay, and Panning Delay.
