This time I want to discuss the basics of DirectX Audio: its architecture, the most frequently used terminology and, of course, the most interesting part, how to use DirectX Audio to play an audio file.
Windows has long offered multimedia features through the Media Control Interface (MCI), the Wave API and the MIDI API. For simple applications, MCI is good enough for playing a WAV or MIDI file.
But if you need to develop a feature-rich sound recording application, such as Cakewalk's products, or a game with impressive sound quality, MCI is not suitable.
Microsoft introduced DirectX Audio for developing high-performance audio applications on the Windows platform. With DirectX Audio, you can do the following things with ease:
DirectX Audio consists of two components, DirectSound and DirectMusic. DirectSound, which was created first, was designed for fast and efficient access to audio hardware and provides a low-level mechanism for accessing it directly. DirectMusic, which came later, provides a low-level programming interface for MIDI devices, called DirectMusic Core, and a high-level programming interface for loading and playing music media, called DirectMusic Performance.
In DirectX 7, DirectMusic and DirectSound were two separate components (see Figure 1). MIDI and style-based music was loaded by DirectMusic Performance and then sent to DirectMusic Core, which handled the MIDI device and synthesizer directly. Sound in wave file format was loaded into memory and prepared by the application, then sent to DirectSound, which handled the wave hardware.
Figure 1. DirectMusic and DirectSound architecture on DirectX 7 or lower.
After DirectMusic was released, application developers preferred to let DirectMusic Performance handle the loading and playback of music files. In DirectX 8, the DirectMusic and DirectSound architectures were unified, so DirectMusic Performance handles loading and playback of both MIDI and wave formats. This unification of DirectMusic and DirectSound is called DirectX Audio (Figure 2). The architecture did not change in DirectX 9.
Figure 2. DirectMusic and DirectSound architecture on DirectX 8 and 9.
In DirectX 8, Microsoft added new functionality to the Performance object, enabling it to load wave files and play wave data stored in a segment. A significant change to the architecture is the introduction of the audiopath concept, which separates DirectMusic Performance from DirectMusic Core. An audiopath is a virtual audio channel that controls the flow of sound data. DirectMusic Performance manages one or more audiopaths.
Performance is the workhorse of DirectX Audio. It is responsible for audio initialization, scheduling segments, creating audiopaths and mapping them to segments, and controlling sound playback.
Loader handles file input/output and loads audio data from memory or resources. It simplifies the programmer's task of loading audio data.
Segment represents any playable audio data. It can be a MIDI file, a wave file or a DirectMusic Segment (the default DirectMusic format, with the SGT file extension). You can create one or more segments in an application. You can even play two or more segments simultaneously.
AudioPath is the DirectMusic object responsible for managing the route taken by a musical instrument or sound effect. Each segment is played through an audiopath. Applications can have one or more audiopaths.
DirectMusic does not provide a helper function to create performance and loader instances. You must use CoCreateInstance() to create them (see Listing 1).
If the code in Listing 1 executes successfully, FPerformance will hold a pointer to an IDirectMusicPerformance8 interface instance, while FLoader holds an IDirectMusicLoader8 instance.
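As a minimal sketch of this step, the function below creates both objects with CoCreateInstance(). It is not the article's original listing: the function name CreateAudioObjects is hypothetical, and I assume a DirectX header unit named DirectMusic that declares the CLSID and IID constants and the interfaces used in this article.

```pascal
uses
  Windows, ActiveX, DirectMusic; // DirectMusic: assumed name of the header unit

// Hypothetical helper: creates the performance and loader COM objects.
// COM must already be initialized on the calling thread.
function CreateAudioObjects(out Performance: IDirectMusicPerformance8;
  out Loader: IDirectMusicLoader8): HResult;
begin
  Result := CoCreateInstance(CLSID_DirectMusicPerformance, nil,
    CLSCTX_INPROC_SERVER, IID_IDirectMusicPerformance8, Performance);
  if Failed(Result) then
    Exit;
  Result := CoCreateInstance(CLSID_DirectMusicLoader, nil,
    CLSCTX_INPROC_SERVER, IID_IDirectMusicLoader8, Loader);
end;
```

Checking the HResult after each call matters here, because a failed first call leaves Performance set to nil.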
Personally, I prefer doing COM initialization in the unit's initialization section, so the COM initialization function is called automatically in any program that uses the unit.
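A minimal sketch of that idea follows. The unit name AudioInit is hypothetical; CoInitialize() and CoUninitialize() come from Delphi's ActiveX unit.

```pascal
unit AudioInit; // hypothetical unit name

interface

implementation

uses
  ActiveX;

initialization
  // Runs once at program startup for every program that uses this unit.
  CoInitialize(nil);

finalization
  // Runs once at program shutdown.
  CoUninitialize;

end.
```

With this approach, all DirectMusic interface references should be released (set to nil) before the finalization section runs.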
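function InitAudio (ppDirectMusic: PIDirectMusic;
ppDirectSound: PIDirectSound;
hWnd: HWND;
dwDefaultPathType, dwPChannelCount, dwFlags: DWORD;
pParams: PDMUS_AudioParams) : HResult; stdcall;
Listing 4 contains an example of how to call InitAudio.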
OK, let us discuss the InitAudio parameters. ppDirectMusic and ppDirectSound are the DirectMusic and DirectSound instances, respectively. If you want the DirectMusic or DirectSound instance to be created automatically, pass nil. hWnd is the window handle used to create the IDirectSound instance. If ppDirectSound contains a valid DirectSound instance, hWnd is ignored. dwDefaultPathType holds the audiopath type, as listed in Table 1. If you don't need a default audiopath, it can be set to zero.
|DMUS_APATH_DYNAMIC_3D||Audiopath for 3D sound|
|DMUS_APATH_SHARED_STEREOPLUSREVERB||Stereo and reverb|
dwPChannelCount is the number of performance channels needed. Each channel has its own volume and balance settings. dwFlags determines which features are needed. Table 2 lists the flags you can use. There are other flags, but they are not implemented yet.
|DMUS_AUDIOF_STREAMING||Support waveform streaming|
pParams holds the address of a DMUS_AUDIOPARAMS data structure. If it is nil, default audio parameters are used. To comply with Delphi naming conventions, the structure is also declared as TDMus_AudioParams. This record has the following fields.
|DMUS_AUDIOPARAMS_FEATURES||dwFeatures holds valid data|
|DMUS_AUDIOPARAMS_VOICES||dwVoices contains valid data|
|DMUS_AUDIOPARAMS_SAMPLERATE||dwSampleRate contains valid data|
|DMUS_AUDIOPARAMS_DEFAULTSYNTH||clsidDefaultSynth is valid. If this flag is not set, Microsoft software synthesizer will be used|
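To make the parameters concrete, here is a hedged sketch of an InitAudio call that lets DirectMusic create its own DirectMusic and DirectSound objects and uses default audio parameters. The function name InitDefaultAudio is hypothetical, the channel count of 64 is only an illustrative choice, and the header unit name DirectMusic is an assumption.

```pascal
uses
  Windows, DirectMusic; // DirectMusic: assumed name of the header unit

// Hypothetical helper: initializes the performance with a default
// stereo-plus-reverb audiopath and default audio parameters.
function InitDefaultAudio(const Performance: IDirectMusicPerformance8;
  Wnd: HWND): HResult;
begin
  Result := Performance.InitAudio(
    nil,                               // ppDirectMusic: create automatically
    nil,                               // ppDirectSound: create automatically
    Wnd,                               // window handle for DirectSound
    DMUS_APATH_SHARED_STEREOPLUSREVERB,// default audiopath type
    64,                                // performance channels (example value)
    DMUS_AUDIOF_ALL,                   // request all available features
    nil);                              // pParams: use default parameters
end;
```

In a VCL form you would typically pass the form's Handle property as Wnd.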
To load a MIDI, WAV or segment (SGT) file, you use the loader object's LoadObjectFromFile() method (Listing 5).
function LoadObjectFromFile(const rguidClassID: TGUID;
const iidInterfaceID: TGUID;
pwzFilePath: PWideChar;
out ppObject): HResult; stdcall;
rguidClassID is the class ID of the object to be loaded. To load an audio file into a segment, we use CLSID_DirectMusicSegment.
iidInterfaceID is the interface identifier. You can set it to IID_IDirectMusicSegment8 to create a segment (IDirectMusicSegment8). In Delphi, you can use the interface name in place of the interface identifier, so replacing IID_IDirectMusicSegment8 with IDirectMusicSegment8 is allowed.
pwzFilePath is the name of the file you want to load. It is a wide string (2 bytes per character). If you use an ANSI string, make sure you convert it to a WideString first, then typecast the result to PWideChar.
ppObject is the variable that will hold the address of the created object instance. In our case, the object is an IDirectMusicSegment8 instance. Listing 6 shows an example of how to load a WAV file into a segment named FSound1.
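The sketch below shows one way to wrap this call, including the string conversion described above. The function name LoadSegmentFromFile is hypothetical and the header unit name DirectMusic is an assumption.

```pascal
uses
  Windows, DirectMusic; // DirectMusic: assumed name of the header unit

// Hypothetical helper: loads a WAV, MIDI or SGT file into a segment.
function LoadSegmentFromFile(const Loader: IDirectMusicLoader8;
  const FileName: string; out Segment: IDirectMusicSegment8): HResult;
var
  WideName: WideString;
begin
  // Keep the converted name in a local variable so the PWideChar
  // stays valid for the duration of the call.
  WideName := FileName;
  Result := Loader.LoadObjectFromFile(CLSID_DirectMusicSegment,
    IID_IDirectMusicSegment8, PWideChar(WideName), Segment);
end;
```

Holding the WideString in a local variable avoids passing a pointer into a temporary string.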
LoadObjectFromFile, as its name suggests, can only load data from a file. So how do you load data from memory? For this case, you need GetObject() (Listing 7).
function GetObject (const pDesc: TDMus_ObjectDesc;
const riid : TGUID;
out ppv) : HResult; stdcall;
pDesc is the description of the object to be loaded. We will discuss the TDMus_ObjectDesc type shortly. riid is the interface identifier. ppv is the variable that will hold the pointer to the object instance.
The TDMus_ObjectDesc data type describes the object to be loaded. It contains the following fields:
|DMUS_OBJ_CATEGORY||Field wszCategory contains valid data|
|DMUS_OBJ_CLASS||guidClass contains valid data|
|DMUS_OBJ_OBJECT||guidObject contains valid data|
|DMUS_OBJ_DATE||ftDate contains valid data|
|DMUS_OBJ_VERSION||vVersion contains valid data|
|DMUS_OBJ_NAME||wszName contains valid data|
|DMUS_OBJ_FILENAME||wszFilename contains valid data|
|DMUS_OBJ_FULLPATH||wszFilename contains valid data, including the full path|
|DMUS_OBJ_MEMORY||llMemLength and pbMemData contain valid data|
|DMUS_OBJ_STREAM||pStream contains valid data|
Listing 8 contains an example of how to load audio data from a stream. I usually initialize the data structure with ZeroMemory() before filling in its fields.
To load data from memory into a segment, you need the address and the size of the buffer. You also need to set the guidClass field to tell DirectMusic to create a segment instance. Therefore, you need at least the DMUS_OBJ_CLASS and DMUS_OBJ_MEMORY flags.
procedure LoadFromStream(Stream: TStream);
if FInternalStream=nil then
The loader caches the data in the buffer. The buffer containing the data cannot be freed until the loader is freed, because the caching mechanism means the loader might need to access the data in the buffer at any time.
If you use DMUS_OBJ_FILENAME or DMUS_OBJ_FULLPATH, GetObject() does the same thing as LoadObjectFromFile().
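As a hedged sketch of loading from memory, the function below fills a TDMus_ObjectDesc for a buffer held in a TMemoryStream. It is not the article's Listing 8: the function name LoadSegmentFromMemory is hypothetical, the header unit name DirectMusic is an assumption, and the caller is assumed to keep the stream alive until the loader is released.

```pascal
uses
  Windows, Classes, DirectMusic; // DirectMusic: assumed name of the header unit

// Hypothetical helper: creates a segment from audio data held in memory.
// Buffer must stay allocated and unchanged until the loader is released.
function LoadSegmentFromMemory(const Loader: IDirectMusicLoader8;
  Buffer: TMemoryStream; out Segment: IDirectMusicSegment8): HResult;
var
  Desc: TDMus_ObjectDesc;
begin
  ZeroMemory(@Desc, SizeOf(Desc));
  Desc.dwSize := SizeOf(Desc);
  // Tell the loader which fields below are valid.
  Desc.dwValidData := DMUS_OBJ_CLASS or DMUS_OBJ_MEMORY;
  Desc.guidClass := CLSID_DirectMusicSegment;
  Desc.llMemLength := Buffer.Size;
  Desc.pbMemData := Buffer.Memory;
  Result := Loader.GetObject(Desc, IID_IDirectMusicSegment8, Segment);
end;
```

A typical caller would own the TMemoryStream as a field of the same object that owns the loader, and free it only after the loader reference is set to nil.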
Before a segment can be played, it needs to be downloaded to the performance or an audiopath. Downloading a segment means copying the instrument data and waveforms used by the segment to the performance. To download a segment, we use the Download() method of the IDirectMusicSegment8 interface (Listing 9).
function Download (pAudioPath: IUnknown) : HResult; stdcall;
The code in Listing 10 downloads the segment named FSound1 to the performance.
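A minimal sketch of such a call is shown below. The function name DownloadToPerformance is hypothetical and the header unit name DirectMusic is an assumption.

```pascal
uses
  Windows, DirectMusic; // DirectMusic: assumed name of the header unit

// Hypothetical helper: downloads the segment's instrument and wave data
// to the performance so the segment can be played.
function DownloadToPerformance(const Segment: IDirectMusicSegment8;
  const Performance: IDirectMusicPerformance8): HResult;
begin
  // The performance is passed as IUnknown; an audiopath could be
  // passed here instead.
  Result := Segment.Download(Performance);
end;
```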
To play the sound stored in a segment, you can use PlaySegment() or PlaySegmentEx() (see Listing 11). The first function is available in both IDirectMusicPerformance and IDirectMusicPerformance8, while the second is available only in the IDirectMusicPerformance8 interface. PlaySegmentEx() is an extension of PlaySegment().
function PlaySegment (pSegment: IDirectMusicSegment;
dwFlags: DWORD;
i64StartTime: Int64;
ppSegmentState: PIDirectMusicSegmentState) : HResult; stdcall;
function PlaySegmentEx (pSource: IUnknown;
pwzSegmentName: PWideChar;
pTransition: IUnknown;
dwFlags: DWORD;
i64StartTime: Int64;
out ppSegmentState: IDirectMusicSegmentState;
pFrom, pAudioPath: IUnknown) : HResult; stdcall;
pSegment and pSource are the segment object to be played. pwzSegmentName is the segment name. It is not used for now, so it must be set to nil. The third parameter of PlaySegmentEx(), pTransition, is a template segment used to compose a transition and can also be nil. dwFlags holds flags that determine how the segment is played. The available flags are listed in Table 5. There are quite a few more, but in my opinion they are not relevant yet because you are still learning the basics.
|DMUS_SEGF_REFTIME||Time is given in REFERENCE_TIME units|
|DMUS_SEGF_QUEUE||Segment is played after the main segment has finished. If you use a secondary segment, the segment is played after the segment in pFrom|
|DMUS_SEGF_CONTROL||Segment is played as control segment|
i64StartTime is the time at which playback starts. ppSegmentState receives an interface that holds the state of the playing segment. pFrom identifies what is stopped when the segment in pSource starts playing; it can be set to nil. pAudioPath is the audiopath used to play the segment. If it is set to nil, the default audiopath is used.
The code in Listing 12 plays the segment named FSound1 from the start of the segment using the default audiopath. The segment state is stored in a variable named state.
The code in Listing 13 produces the same output as Listing 12.
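The two calls can be sketched as follows. The function names are hypothetical and the header unit name DirectMusic is an assumption. I also assume the declarations shown in Listing 11, where PlaySegment() takes a pointer to the segment state variable and PlaySegmentEx() takes it as an out parameter; other header translations may declare these differently.

```pascal
uses
  Windows, DirectMusic; // DirectMusic: assumed name of the header unit

// Hypothetical helper: plays a downloaded segment with PlaySegment().
function PlayWithPlaySegment(const Performance: IDirectMusicPerformance8;
  const Segment: IDirectMusicSegment8;
  out State: IDirectMusicSegmentState): HResult;
begin
  // Flags 0 and start time 0: play immediately as a primary segment.
  Result := Performance.PlaySegment(Segment, 0, 0, @State);
end;

// Hypothetical helper: the equivalent call with PlaySegmentEx().
function PlayWithPlaySegmentEx(const Performance: IDirectMusicPerformance8;
  const Segment: IDirectMusicSegment8;
  out State: IDirectMusicSegmentState): HResult;
begin
  // nil for the name, transition, pFrom and audiopath parameters;
  // a nil audiopath selects the default audiopath.
  Result := Performance.PlaySegmentEx(Segment, nil, nil, 0, 0, State,
    nil, nil);
end;
```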
You can try the DM1.dpr and DM2.dpr demos available on the CD/DVD.
If you try either demo, you will notice that every time you press a button, the sound segment currently playing is stopped. You cannot play two or more segments simultaneously with the technique explained above. It is only suitable for a sound player application where music is played one piece at a time.
In a game, many sounds can play at the same time. In a war game, the sounds of cannons, guns and wounded soldiers must play simultaneously to create the illusion of a real war.
To play many segments at the same time, you need a separate audiopath for each segment. In the DM1.dpr and DM2.dpr demos we used only one audiopath, the default audiopath.
To create an audiopath, you can use the CreateStandardAudioPath() method of the IDirectMusicPerformance8 interface (Listing 14).
function CreateStandardAudioPath (dwType,
dwPChannelCount: DWORD;
fActivate: BOOL;
out ppNewPath: IDirectMusicAudioPath) : HResult; stdcall;
dwType is the audiopath type. See Table 1 for the available values. dwPChannelCount is the number of channels you want. fActivate is set to true to activate the audiopath. If you set it to false, you can activate the audiopath at any time with Activate(), a member of IDirectMusicAudioPath8. ppNewPath will hold the pointer to the audiopath instance. See Listing 15 for example code.
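A hedged sketch of creating an audiopath is shown below. The function name CreateStereoPath is hypothetical, the header unit name DirectMusic is an assumption, and DMUS_APATH_DYNAMIC_STEREO with 64 channels is only an illustrative choice.

```pascal
uses
  Windows, DirectMusic; // DirectMusic: assumed name of the header unit

// Hypothetical helper: creates and activates a standard stereo audiopath.
function CreateStereoPath(const Performance: IDirectMusicPerformance8;
  out Path: IDirectMusicAudioPath): HResult;
begin
  Result := Performance.CreateStandardAudioPath(
    DMUS_APATH_DYNAMIC_STEREO, // audiopath type
    64,                        // number of performance channels
    True,                      // activate immediately
    Path);
end;
```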
The audiopath is then passed to PlaySegmentEx(). In addition, you need to play the segment as a secondary segment, that is, with the DMUS_SEGF_SECONDARY flag (Listing 16). Without this flag, the segment is played as a primary segment. Only one primary segment can play at a time, so a new primary segment replaces the one currently playing.
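Putting the two requirements together, a sketch might look like this. The function name PlayOnPath is hypothetical, the header unit name DirectMusic is an assumption, and I assume PlaySegmentEx() declares the segment state as an out parameter as in Listing 11.

```pascal
uses
  Windows, DirectMusic; // DirectMusic: assumed name of the header unit

// Hypothetical helper: plays a segment as a secondary segment on its
// own audiopath so it can overlap with other sounds.
function PlayOnPath(const Performance: IDirectMusicPerformance8;
  const Segment: IDirectMusicSegment8;
  const Path: IDirectMusicAudioPath): HResult;
var
  State: IDirectMusicSegmentState; // released when the function returns
begin
  // Download the segment's data to the audiopath it will play on.
  Result := Segment.Download(Path);
  if Failed(Result) then
    Exit;
  Result := Performance.PlaySegmentEx(Segment, nil, nil,
    DMUS_SEGF_SECONDARY, 0, State, nil, Path);
end;
```

In a real application you would probably download each segment once at load time rather than before every playback.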
You can try the DM3.dpr demo. If you press the cannon, gun and bomb buttons repeatedly, you will feel as if you are in the middle of a war.
There are times when you need to play background music over and over. To set how many times a segment is repeated, use the SetRepeats() method of the IDirectMusicSegment8 interface. Its parameter is the number of repeats you want. If it is set to 0, the segment is played once without any repeats. If it is set to DMUS_SEG_REPEAT_INFINITE, the segment is played indefinitely until explicitly stopped. Listing 17 shows example calls to SetRepeats().
//repeat segment once
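A minimal sketch of both cases is given below. The procedure name and the parameter names Effect and Music are hypothetical, and the header unit name DirectMusic is an assumption.

```pascal
uses
  Windows, DirectMusic; // DirectMusic: assumed name of the header unit

// Hypothetical helper: configures repeat counts before playback starts.
procedure ConfigureRepeats(const Effect, Music: IDirectMusicSegment8);
begin
  // One repeat: the segment is heard twice in total.
  Effect.SetRepeats(1);
  // Loop the background music until it is explicitly stopped.
  Music.SetRepeats(DMUS_SEG_REPEAT_INFINITE);
end;
```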
To find out whether a segment is currently playing, you can use IsPlaying(), a member of the IDirectMusicPerformance8 interface (Listing 18).
function IsPlaying (pSegment: IDirectMusicSegment;
pSegState: IDirectMusicSegmentState) : HResult; stdcall;
We can check using either a segment or a segment state. If pSegment is nil, pSegState is used; conversely, if pSegState is nil, pSegment is used. The method returns S_OK if the segment is currently playing and S_FALSE otherwise.
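Because the result is an HResult rather than a Boolean, a small wrapper can make the check easier to read. The function name SegmentIsPlaying is hypothetical and the header unit name DirectMusic is an assumption.

```pascal
uses
  Windows, DirectMusic; // DirectMusic: assumed name of the header unit

// Hypothetical helper: returns True only when the segment is playing.
function SegmentIsPlaying(const Performance: IDirectMusicPerformance8;
  const Segment: IDirectMusicSegment8): Boolean;
begin
  // Compare with S_OK explicitly: S_FALSE is also a success code,
  // so Succeeded() would not distinguish the two cases.
  Result := Performance.IsPlaying(Segment, nil) = S_OK;
end;
```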
A segment that is currently playing can be stopped with Stop() or StopEx() (Listing 19). Both methods are members of the performance object. StopEx() is an extension of Stop() and is available in the IDirectMusicPerformance8 interface.
function Stop(pSegment: IDirectMusicSegment;
pSegmentState: IDirectMusicSegmentState;
mtTime: MUSIC_TIME;
dwFlags: DWORD) : HResult; stdcall;
function StopEx(pObjectToStop: IUnknown;
i64StopTime: Int64;
dwFlags: DWORD) : HResult; stdcall;
pSegment is the segment to be stopped. If you set it to nil, all segments currently playing are stopped. pSegmentState is the segment state to be stopped. pObjectToStop is the segment, segment state or audiopath to be stopped. mtTime and i64StopTime are the stop times. If they are set to zero, the segment is stopped immediately. dwFlags determines how the segment is stopped. It can be set to zero or to one of the flags in Table 5. Listing 20 contains sample code showing how to stop a segment.
//stop all segment
//stop segment FSound1
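Both cases can be sketched as follows. The procedure names are hypothetical and the header unit name DirectMusic is an assumption.

```pascal
uses
  Windows, DirectMusic; // DirectMusic: assumed name of the header unit

// Hypothetical helper: stops every segment immediately.
procedure StopAllSegments(const Performance: IDirectMusicPerformance8);
begin
  // nil segment and nil segment state: stop everything; time 0: now.
  Performance.Stop(nil, nil, 0, 0);
end;

// Hypothetical helper: stops a single segment immediately.
procedure StopOneSegment(const Performance: IDirectMusicPerformance8;
  const Segment: IDirectMusicSegment8);
begin
  Performance.StopEx(Segment, 0, 0);
end;
```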
DirectX Audio does not provide a dedicated function to pause playback, but you can achieve a similar effect using StopEx() and PlaySegmentEx().
Before you stop playback of a segment, you need to save the segment's current cursor position. When you want to resume, the saved position becomes the new start position of playback. You change the start position using SetStartPoint(), a member of the IDirectMusicSegment8 interface. After calling SetStartPoint(), you play the segment with PlaySegmentEx() as usual.
When you are done with DirectX Audio, you must call CloseDown(), a member of the performance object. This method takes no parameters.
Before you call CloseDown(), it is recommended that you call Unload() to unload all segments previously downloaded to the performance. Its parameter is the same as for Download(): the audiopath or performance to which the segment was downloaded.
Strictly speaking, calling Unload() is not mandatory when closing the performance, because CloseDown() automatically unloads any segments. Listing 21 contains example code showing how to shut down DirectX Audio.
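A hedged sketch of the shutdown sequence for a single segment is shown below. The procedure name ShutdownAudio is hypothetical and the header unit name DirectMusic is an assumption.

```pascal
uses
  Windows, DirectMusic; // DirectMusic: assumed name of the header unit

// Hypothetical helper: stops playback, unloads the segment, closes the
// performance and releases all interface references.
procedure ShutdownAudio(var Performance: IDirectMusicPerformance8;
  var Loader: IDirectMusicLoader8; var Segment: IDirectMusicSegment8);
begin
  if Performance <> nil then
  begin
    Performance.Stop(nil, nil, 0, 0);  // stop everything immediately
    if Segment <> nil then
      Segment.Unload(Performance);     // mirror of the earlier Download()
    Performance.CloseDown;
  end;
  // Assigning nil releases the COM references.
  Segment := nil;
  Loader := nil;
  Performance := nil;
end;
```

Releasing the references here, before COM is uninitialized, keeps the teardown order predictable.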
The loader uses a caching mechanism to speed up playback. An object created with GetObject() or LoadObjectFromFile() might refer to other objects. A MIDI file might refer to other MIDI files. GetObject() automatically creates the referenced objects and adds them to the cache.
To free those objects completely, there are a few things you must do.
If you don't use objects that reference other objects, you don't need to call ReleaseObjectByUnknown() and CollectGarbage(). For example, if you load a WAV file into a segment, you don't need these steps because a WAV file does not reference external files. Still, doing so is good programming practice, because it ensures that all the resources we used are completely freed.
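A minimal sketch of releasing a segment from the loader's cache is shown below. The procedure name ReleaseSegment is hypothetical, the header unit name DirectMusic is an assumption, and the exact order of steps in the article's own list may differ.

```pascal
uses
  Windows, DirectMusic; // DirectMusic: assumed name of the header unit

// Hypothetical helper: removes the loader's cached reference to the
// segment, releases our own reference, then frees unreferenced objects.
procedure ReleaseSegment(const Loader: IDirectMusicLoader8;
  var Segment: IDirectMusicSegment8);
begin
  if Segment = nil then
    Exit;
  Loader.ReleaseObjectByUnknown(Segment); // drop the cached reference
  Segment := nil;                         // drop our own reference
  Loader.CollectGarbage;                  // free objects no longer in use
end;
```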
You can get the demo source code from the CD/DVD that accompanies the magazine. The DM1.dpr demo shows how to initialize DirectX Audio and play a segment using PlaySegment(). The DM2.dpr demo is almost identical to DM1, but it uses PlaySegmentEx().
The DM3.dpr demo is an extension of DM2.dpr. In this application, we can play many sounds simultaneously. DM4.dpr improves on DM3.dpr by adding MIDI background music that plays over and over until the application shuts down. The demo's user interface is shown in Figure 3.
Figure 3. Application user interface.
Source code of demo application is available for download here
You have reached the end of this article. I hope you now understand the basic techniques for using DirectX Audio in your application. In this article, you have learned about the DirectX Audio architecture, DirectX Audio initialization, loading and playing audio files with DirectX Audio, and how to play many sounds simultaneously.