Update, 11 June 2010.
For those interested, the VorbisPlayer has now been published on Codeplex. Find the source code here.
I have finished the playing part of Slengo. Next are the title screen, the high-score list, making some levels and… adding sound effects. When I started adding the sound effects, I instantly remembered why I use so few effects in my games: the high latency.
It works like this. On the screen you push an ice cube, and half a second later (or longer) you hear the sound effect. The visuals and the audio are so far out of sync that it breaks the game-playing experience.
In Avios, I solved this problem by delaying the graphics. The event of a bomb falling into the sea works on this principle. This was possible because the sounds in Avios are not bound to user events but to in-game events. For sounds bound to user events, delaying the graphics has the drawback that the game appears to react slowly to user input.
The delay has two causes.
1) MP3 files are normally encoded with a gap of silence at the beginning;
2) The MediaElement has an internal buffer which is filled before any sound is played.
Silverlight also has problems with short sounds (less than a second): they make the delay worse, or you don't hear them at all.
Countering the audio gap
I have thought of two ways to deal with the gap that is encoded in audio files. Either find a gapless format or a gapless encoder, or skip the gap when starting the sound.
If you encode MP3 files gapless with WinLame, the audio quality is very bad (I tested this with one short sound).
In Silverlight, it is possible to write your own sound decoder. Larry Olson wrote Managed Media Helpers, which inspired just about everybody else. What I understand from this project is that only the MP3 frame headers are decoded, while the compressed sound data is sent to the MediaElement, making use of the decoder already in Silverlight. If this is true, then this scheme is not sufficient to skip the gap programmatically, because you cannot set a start pointer in the middle of compressed data.
DDtMM wrote a library for seamless audio loops, for MP4-encoded files. Expression Blend Encoder can encode gapless. I had a lot of trouble downloading the library in the past (timeouts), so I was unable to investigate this code. Recently I had more luck, but being too occupied with MoonVorbis (see below), I haven't studied the code very well yet.
Ahura Mazda wrote the Saluse Media Kit, which fully decodes MP3 files and sends the raw audio to the MediaElement. This code is a good option for skipping the gap at the raw-sound level. Unfortunately, I had problems playing short sounds repeatedly, so I looked further.
A project which really drew my attention was MoonVorbis from Atsushi Eno. Casy Wireman wrote on GameDev that Ogg Vorbis is a pretty good format for games. It seems that Ogg Vorbis is encoded without an audio gap. MoonVorbis also decodes to raw sound samples, so if there does turn out to be a gap, we can skip it.
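To make "skipping the gap at raw-sample level" concrete, here is a minimal sketch of the idea. It is written in Python for brevity rather than in Silverlight C#, and it is not code from MoonVorbis or any other library mentioned here: the function name, the `threshold` value and the assumption of interleaved signed 16-bit samples are all mine.

```python
def skip_leading_gap(samples, channels=1, threshold=64):
    """Drop the near-silent samples at the start of decoded PCM audio.

    samples   -- list of signed integers, interleaved per channel (assumed)
    channels  -- number of interleaved channels
    threshold -- amplitudes at or below this count as silence (assumed value)
    """
    for index, value in enumerate(samples):
        if abs(value) > threshold:
            # Round down to a frame boundary so the channels stay aligned.
            start = (index // channels) * channels
            return samples[start:]
    # Nothing above the threshold: the whole sound is treated as gap.
    return samples[:0]


mono = [0, 0, 3, -2, 500, -400, 100]
print(skip_leading_gap(mono))
```

This only works on decoded samples, which is why a decoder that hands over raw audio matters: the same trick cannot be applied to compressed frames.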
Countering the audio buffer
MoonVorbis is not production ready. I had problems with the pitch of my test sound, and problems with repeatedly playing a short sound. I have worked out a solution for these, but there is still latency between the user event and the sound being played.
I have introduced a Play function on the stream. The MediaElement's source is set to the stream, and it starts playing immediately. The stream produces silence as long as the sound is not being played. When the Play function is called on the stream, it starts to send the sound samples. With this approach, the short-sound problem is solved, because the surrounding silence extends the length of your sample.
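The logic of such a stream can be sketched roughly as follows. This is an illustration in Python, not the actual VorbisPlayer code; the class and method names are made up for the example, and the samples are plain integers instead of real audio buffers.

```python
class SilenceUntilPlaySource:
    """Feeds an endless stream: silence by default, the sound after play()."""

    def __init__(self, sound):
        self._sound = list(sound)
        # Start "past the end" so that only silence is produced at first.
        self._position = len(self._sound)

    def play(self):
        # Rewind; the next read() starts delivering the sound samples.
        self._position = 0

    def read(self, count):
        """Return exactly `count` samples, padding with silence (zeros)."""
        end = min(self._position + count, len(self._sound))
        chunk = self._sound[self._position:end]
        self._position = end
        return chunk + [0] * (count - len(chunk))


source = SilenceUntilPlaySource([7, 8, 9])
print(source.read(2))   # the player keeps pulling data while idle
source.play()
print(source.read(4))   # sound samples followed by silence padding
```

Because the player never runs out of data, the stream never ends, and calling play() again simply restarts the same short sound.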
Now to reduce the audio buffer latency. The MediaStreamSource has a property called AudioBufferLength, which is measured in milliseconds. The default setting is 1000 (1 second), which makes the latency very high. If you set this property to its minimum, 15 milliseconds, the latency is much better, but the sound begins to stutter. A setting of 50 ms gives a little more latency, but with good sound quality. I am still investigating whether this is the best we can get.
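To get a feeling for what these settings mean, the buffer length can be converted into the amount of audio that sits queued ahead of a newly started sound. The small Python calculation below assumes 44.1 kHz, 16-bit stereo audio (my assumption, not a measured property of the test sound) and ignores any other delay in the playback pipeline.

```python
def buffered_audio(buffer_ms, sample_rate=44100, channels=2, bytes_per_sample=2):
    """Return (frames, bytes) of audio held by a buffer of buffer_ms milliseconds."""
    frames = sample_rate * buffer_ms // 1000
    return frames, frames * channels * bytes_per_sample


for ms in (1000, 50, 15):
    frames, size = buffered_audio(ms)
    print(ms, "ms ->", frames, "frames,", size, "bytes")
```

Under these assumptions a 50 ms buffer holds 2205 frames, against 44100 frames for the default of 1000 ms, which matches the large difference in perceived latency.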
Solving the short-sound problem is a major improvement. I would still like to reduce the sound latency a little more. If that fails, I still have the option of delaying the graphics, but now by a much smaller amount, which is an improvement in itself. I still need to find out how all this works out in practice when I bring it into the game. Although I have a preference for WMA audio, Ogg Vorbis is a good enough choice for me.