[mad-dev] frame decoding questions
Rob Leslie
rob@mars.org
Sat, 25 May 2002 17:18:49 -0700
On Saturday, May 25, 2002, at 12:15 AM, Russell O'Connor wrote:
> Hmmm, basically I'm looking for the length in bytes. Since the bytes per
> sample per channel is 2, then we get 64*NBSAMPLES*NCHANNELS. Or so I was
> thinking.
I see. The number of bytes per sample is of course arbitrary, which is why
I counted samples instead of bytes.
> So since I haven't decoded the last couple of frames, getting an error on
> the first frame I decode in the middle of the stream is to be expected?
Yes, this is not unusual.
> Do you know how many frames back I would at most have to decode before I
> can correctly decode my target frame? Do I need to synth those frames
> too?
I think the worst case would be the largest possible bit reservoir against
the smallest possible frame size.
The largest possible bit reservoir is 511 bytes (MPEG-1 only), and in this
case the smallest Layer III frame size is 96 bytes (32 kbps, 48000 Hz).
The bit reservoir reaches back over previous frames, but does not include
space used by frame headers or Layer III side information. In MPEG-1, the
largest possible frame header + side info size is 38 bytes, leaving at
least 58 bytes per frame for the bit reservoir. Rounding up, 511 / 58 is 9
frames.
The smallest possible frame size for Layer III at all is 24 bytes (8 kbps,
24000 Hz) and the largest bit reservoir in this case is 255 bytes (MPEG-2).
The largest possible frame header + side info size for MPEG-2 is 23 bytes,
which is probably impractical in this case since it would leave only 1 byte
per frame. More likely the stream would be single-channel, and in this case
the header + side info size will be at most 15 bytes, leaving 9 bytes per
frame. Rounding up, 255 / 9 is 29 frames.
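For anyone who wants to check the arithmetic above, here is a rough sketch
in C. The function names are made up for illustration and are not part of
libmad. The frame size formula assumes no padding byte, and the overhead
figures (38 and 15 bytes) are simply the ones quoted above.

```c
/* Layer III frame size in bytes, ignoring the optional padding byte.
 * lsf is nonzero for MPEG-2 (lower sampling frequencies). */
unsigned int layer3_frame_bytes(unsigned int bitrate_bps,
                                unsigned int samplerate_hz,
                                int lsf)
{
    return (lsf ? 72u : 144u) * bitrate_bps / samplerate_hz;
}

/* Number of previous frames needed to hold reservoir_bytes when each
 * frame contributes only (frame_bytes - overhead_bytes) bytes, where
 * overhead is header + CRC + side info. Returns 0 if a frame has no
 * room left for main data at all. */
unsigned int frames_to_cover(unsigned int reservoir_bytes,
                             unsigned int frame_bytes,
                             unsigned int overhead_bytes)
{
    unsigned int payload;

    if (frame_bytes <= overhead_bytes)
        return 0;

    payload = frame_bytes - overhead_bytes;
    return (reservoir_bytes + payload - 1) / payload;  /* round up */
}
```

With these, frames_to_cover(511, layer3_frame_bytes(32000, 48000, 0), 38)
gives the MPEG-1 figure and frames_to_cover(255,
layer3_frame_bytes(8000, 24000, 1), 15) gives the MPEG-2 one.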
These are extreme cases. I think it's actually rare for the bit reservoir
to extend beyond one or two previous frames in the most common
bitrate/samplerate combinations.
In any case, you only need to synth the frame immediately preceding the
target frame.
> So basically to do perfect seeking in MP3, I would have to skip through
> the headers, and keep a pointer back n frames, until I find where I
> want to be. Reinitialize mad_stream, etc. Seek the file to where my back
> pointer is indicating. Decode n frames silently, and then start playing
> at the appropriate place in the nth frame. Whew.
Since frames always have the same playing time for any given layer and
sampling frequency, you don't have to read headers all the way to your
seek point. You can calculate n * the duration of one frame, subtract this
from the time of your seek point, and stop scanning when you reach the
frame containing this time point.
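
The calculation could look roughly like the following. This is only a
sketch: seek_plan and plan_seek are hypothetical names, not libmad API, and
it assumes you pass in the samples per frame yourself (1152 for MPEG-1
Layer III, 576 for MPEG-2 Layer III) along with n, the number of frames you
want to decode ahead of the target.

```c
typedef struct {
    unsigned long start_frame;   /* first frame to decode silently */
    unsigned long target_frame;  /* frame containing the seek point */
    unsigned int  skip_samples;  /* samples to drop within target frame */
} seek_plan;

/* Work out which frame holds the seek point and how far back to start
 * decoding, given a constant number of samples per frame. */
seek_plan plan_seek(unsigned long seek_ms,
                    unsigned int samplerate_hz,
                    unsigned int samples_per_frame,
                    unsigned long n)
{
    seek_plan plan;
    unsigned long long target_sample;

    /* Convert the seek time to an absolute sample position. */
    target_sample = (unsigned long long) seek_ms * samplerate_hz / 1000;

    plan.target_frame = (unsigned long) (target_sample / samples_per_frame);
    plan.skip_samples = (unsigned int) (target_sample % samples_per_frame);

    /* Back up n frames, but not past the start of the stream. */
    plan.start_frame = plan.target_frame > n ? plan.target_frame - n : 0;

    return plan;
}
```

You would then scan headers only until you reach start_frame, decode from
there, and begin output skip_samples into target_frame.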
--
Rob Leslie
rob@mars.org