[mad-dev] sample accurate seeking with mad

Rob Leslie rob@mars.org
Tue, 28 Aug 2001 15:03:26 -0700

Andy Lo A Foe wrote:
> I'm writing a player (http://www.alsaplayer.org) and currently use
> mpg123 for mp3 decoding purposes. I'm interested in adopting MAD because
> it is fully reentrant (right?). My question, is there sample accurate
> seeking possible in MAD?

[You'll need to subscribe to mad-dev before you can post another message, but
to answer your question...]

Yes, libmad should be fully reentrant; all decoding state is held in the three
primary structs, mad_stream, mad_frame, and mad_synth.

Sample-accurate seeking is *possible* with MAD, but that doesn't mean MAD goes
out of its way to make it easy. It is essentially up to you to provide all
seeking functionality. You provide MAD with the byte stream buffer to decode.
If the buffer you provide happens to correspond with the part of the stream
you wish to seek to, then voila, you have implemented a seek.

MAD helps you by marking the frame boundaries in the stream after decoding
each frame header (stream.this_frame and stream.next_frame). For seeking
forward, you can decode the frame headers only (mad_header_decode()) and
resume with a full decode once you reach the time or sample position you're
interested in. For seeking backward, you can recall the desired frame's stream
position and resume decoding from there.
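
One way to make "recall the desired frame's stream position" concrete is to
keep a table of frame offsets as you scan headers. The following is only a
minimal sketch of that bookkeeping, not anything libmad provides: the
frame_index struct and both functions are hypothetical names, and the sketch
assumes you compute each frame's file offset yourself (for example from
stream.this_frame relative to the start of your buffer).

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical seek index: file offset of every frame header seen so far. */
struct frame_index {
    long  *offsets;    /* offsets[i] = file offset of frame i */
    size_t count;      /* number of frames recorded */
    size_t capacity;   /* allocated slots in offsets[] */
};

/* Record the offset of the next frame. Returns 0 on success, -1 if
 * memory could not be allocated (the index is left unchanged). */
static int frame_index_add(struct frame_index *idx, long offset)
{
    assert(idx != NULL);
    if (idx->count == idx->capacity) {
        size_t ncap = idx->capacity ? idx->capacity * 2 : 64;
        long *p = realloc(idx->offsets, ncap * sizeof *p);
        if (p == NULL)
            return -1;
        idx->offsets = p;
        idx->capacity = ncap;
    }
    idx->offsets[idx->count++] = offset;
    return 0;
}

/* Return the file offset of frame n, or -1 if that frame has not been
 * scanned yet (the caller then has to keep decoding headers forward). */
static long frame_index_lookup(const struct frame_index *idx, size_t n)
{
    assert(idx != NULL);
    return n < idx->count ? idx->offsets[n] : -1;
}
```

A backward seek then becomes a lookup followed by refilling the buffer from
the returned offset; a forward seek past the end of the table means scanning
more headers and calling frame_index_add() for each one.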

Actually, things are slightly more complicated than that due to various frame
interdependencies. As I explained previously on this list (check the archives
for another seek question), to avoid any unpleasant burps in the audio after a
seek, it is necessary to decode a few frames *before* the frame you wish to
obtain audio from in order to re-sync the decoder; in Layer III, a frame's
audio data can begin in earlier frames (the bit reservoir), so the decoder
needs those frames first. It is also a good idea to perform synthesis on the
frame immediately before the frame you wish to obtain audio from, since the
synthesis filter carries state from one frame into the next, although you can
throw away the output.
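
The arithmetic behind such a seek can be sketched as follows. This is an
illustration under stated assumptions, not libmad code: seek_plan and
plan_seek() are hypothetical names, the stream is assumed to use the same
number of samples per frame throughout, and the number of pre-roll frames is
a parameter you choose yourself ("a few"), not a value taken from MAD.

```c
#include <assert.h>

/* Hypothetical result of planning a sample-accurate seek. */
struct seek_plan {
    unsigned long start_frame;   /* first frame to feed to the decoder */
    unsigned long target_frame;  /* first frame whose audio is kept */
    unsigned int  skip_samples;  /* samples to drop at the start of
                                    target_frame's output */
};

/* Map an absolute sample position to frames. Frames from start_frame up
 * to (but not including) target_frame are decoded only to re-sync the
 * decoder and their output is discarded; the last of them should also be
 * run through synthesis. */
static struct seek_plan plan_seek(unsigned long sample_pos,
                                  unsigned int samples_per_frame,
                                  unsigned long preroll_frames)
{
    struct seek_plan plan;

    assert(samples_per_frame > 0);
    plan.target_frame = sample_pos / samples_per_frame;
    plan.skip_samples = (unsigned int)(sample_pos % samples_per_frame);
    /* Clamp at the first frame when seeking near the start. */
    plan.start_frame = plan.target_frame > preroll_frames
                     ? plan.target_frame - preroll_frames
                     : 0;
    return plan;
}
```

After decoding and synthesizing target_frame, the player would drop the
first skip_samples samples of its PCM output and start playback from there.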

How many samples does each frame produce?

              48000 Hz      24000 Hz       12000 Hz
              44100 Hz      22050 Hz       11025 Hz
              32000 Hz      16000 Hz        8000 Hz
             (MPEG-1)      (MPEG-2)       (MPEG 2.5)

Layer I         384           384            384
Layer II       1152          1152           1152
Layer III      1152           576            576

This number can also be derived as 32 * MAD_NSBSAMPLES(&frame.header) or,
after synthesis, synth.pcm.length.
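
If you need the number before you have a decoded header at hand, the table
can also be written as a small lookup. This is a standalone sketch that does
not use libmad; samples_per_frame() is a hypothetical helper, and it tells
MPEG-1 apart from MPEG-2 and MPEG 2.5 purely by sampling rate, as in the
table above.

```c
#include <assert.h>

/* Samples per frame (per channel) for a given layer (1, 2 or 3) and
 * sampling rate in Hz, following the table above. */
static unsigned int samples_per_frame(int layer, unsigned int sample_rate)
{
    assert(layer >= 1 && layer <= 3);

    if (layer == 1)
        return 384;
    if (layer == 2)
        return 1152;
    /* Layer III: 1152 at the MPEG-1 rates (32000 Hz and up), 576 at the
     * lower MPEG-2 and MPEG 2.5 rates. */
    return sample_rate >= 32000 ? 1152 : 576;
}
```

For example, a 44100 Hz Layer III stream yields 1152 samples per frame,
while a 22050 Hz Layer III stream yields 576.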

To help you calculate time positions, you may find MAD's timer interface
helpful. See:


Hope this helps.

Rob Leslie