yuv4mpeg - video stream format used by pipe-based MJPEGtools
DESCRIPTION
Many of the MJPEGtools communicate via pipes and act as filters
(or sources or sinks). The format of video data used in the pipes is
referred to as "YUV4MPEG", or, more precisely, "YUV4MPEG2". (The format
was extended and codified during v1.5.x of the tools.)
The basic structure is a stream header followed by an unlimited number of
frames. Each frame itself consists of a header followed by video data.
The headers are human-readable ASCII text, but the video data is raw
binary, one octet per sample.
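As an illustration of this layout, the following Python sketch writes a stream containing one tiny frame. It is not part of MJPEGtools: the function name write_stream is hypothetical, and the W and H tags and the plane sizes it relies on are the ones described in the GRAMMAR and NOTES ON IMAGE DATA sections.

```python
import io

def write_stream(out, width, height, frames):
    """Write a stream header, then each frame as header plus raw data."""
    # Stream header: magic string, space-separated tagged fields, newline.
    out.write(f"YUV4MPEG2 W{width} H{height}\n".encode("ascii"))
    for data in frames:
        # Frame header: magic string "FRAME" with no tags, then newline.
        out.write(b"FRAME\n")
        out.write(data)

# One 4x2 frame: 8 Y' octets, then 2 Cb and 2 Cr octets, all mid-grey.
frame = bytes([128] * (4 * 2 + 2 + 2))
buf = io.BytesIO()
write_stream(buf, 4, 2, [frame])
stream_bytes = buf.getvalue()
```

Writing to sys.stdout.buffer instead of the in-memory buffer would let such a script act as a source in a pipeline.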
The MJPEGtools distribution has a C library (libmjpegutils) which contains
functions for manipulating YUV4MPEG2 streams. We recommend that you use
this library rather than writing your own code if possible. See the header
file "yuv4mpeg.h" for a description of these functions.
Design Goals:
o  Easy to parse in either C or sh.
o  Extensible; easy to add new parameters while maintaining backwards
   compatibility.
o  Simple upgrade from original "YUV4MPEG" format.
Drawbacks:
o  Frame headers do not have constant size, so streams are not seekable.
GRAMMAR
The precise description of the YUV4MPEG2 stream format is as follows:
STREAM consists of
  - one STREAM-HEADER
  - unlimited number of FRAMEs
STREAM-HEADER consists of
  - magic string "YUV4MPEG2"
  - unlimited number of TAGGED-FIELDs,
    each preceded by a ' ' (single space) separator
  - single '\n' line terminator
FRAME consists of
  - one FRAME-HEADER
  - "length" octets of planar YCbCr 4:2:0 image data, where "length"
    follows from the frame width and height (see NOTES ON IMAGE DATA)
    (If the stream is interlaced, then the two fields per frame are
    interleaved, with proper spatial ordering.)
FRAME-HEADER consists of
  - magic string "FRAME"
  - unlimited number of TAGGED-FIELDs,
    each preceded by a ' ' (single space) separator
  - single '\n' line terminator
TAGGED-FIELD consists of
  - single ASCII character tag
  - VALUE (which does not contain whitespace)
VALUE consists of
  - RATIO, or
  - integer (base 10 ASCII representation), or
  - single ASCII character, or
  - string (multiple ASCII characters)
RATIO consists of
  - numerator (base 10 ASCII integer)
  - ':' (a colon)
  - denominator (base 10 ASCII integer)
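The grammar above is simple enough to parse with plain string splitting. The following Python sketch shows one way to do it; the function names parse_ratio and parse_header_line are hypothetical and are not part of libmjpegutils. It assumes the header line has already been read and stripped of its '\n' terminator, and it treats a doubled space as an error, which is a stricter reading than the grammar strictly requires.

```python
def parse_ratio(value):
    """Split a RATIO such as '30000:1001' into (numerator, denominator)."""
    numerator, sep, denominator = value.partition(":")
    for part in (numerator, denominator):
        # Only base 10 ASCII digits are allowed on either side of the colon.
        if not (part.isascii() and part.isdigit()):
            raise ValueError(f"not a ratio: {value!r}")
    if sep != ":":
        raise ValueError(f"not a ratio: {value!r}")
    return int(numerator), int(denominator)

def parse_header_line(line, magic):
    """Return the TAGGED-FIELDs of a header line as (tag, value) pairs.

    `line` is the header text without its trailing newline; `magic` is
    "YUV4MPEG2" for a stream header or "FRAME" for a frame header.
    """
    parts = line.split(" ")
    if parts[0] != magic:
        raise ValueError(f"expected magic {magic!r}, got {parts[0]!r}")
    fields = []
    for field in parts[1:]:
        if not field:
            raise ValueError("empty tagged field (doubled space?)")
        # The first character is the tag; the rest is the VALUE.
        fields.append((field[0], field[1:]))
    return fields

fields = parse_header_line("YUV4MPEG2 W720 H576 Ip F25:1 Xhello", "YUV4MPEG2")
rate = parse_ratio(dict(fields)["F"])
```

Keeping the fields as an ordered list of pairs, rather than a dictionary, preserves repeated tags in their original order.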
The supported tags for the STREAM-HEADER:
  W - [integer] frame width in pixels, must be > 0 (required)
  H - [integer] frame height in pixels, must be > 0 (required)
  I - [char] interlacing specification:
        p - progressive (none)
        t - top-field-first
        b - bottom-field-first
        ? - unknown
  F - [ratio] frame-rate, 0:0 == unknown
  A - [ratio] sample aspect ratio, 0:0 == unknown
  X - [string] 'metadata' (unparsed, but passed around)
The currently supported tags for the FRAME-HEADER:
  X - [string] 'metadata' (unparsed, but passed around)
Except for those specified as "required", all tags are optional,
and the absence of a tag indicates that the parameter is unknown.
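A reader of the stream header therefore has to check the two required tags and fall back to "unknown" for everything else. The Python sketch below shows one possible way to do that; the function name interpret_stream_fields and the dictionary keys are hypothetical. It takes (tag, value) pairs that have already been split out of the header line, and it assumes that tags it does not recognise should simply be skipped.

```python
def interpret_stream_fields(fields):
    """Turn (tag, value) pairs from a STREAM-HEADER into a dict."""
    info = {
        "width": None,
        "height": None,
        "interlace": "?",         # '?' means unknown
        "frame_rate": (0, 0),     # 0:0 means unknown
        "sample_aspect": (0, 0),  # 0:0 means unknown
        "metadata": [],           # X values, kept verbatim
    }

    def ratio(value):
        # int("") raises ValueError, so a missing colon is rejected too.
        numerator, _, denominator = value.partition(":")
        return int(numerator), int(denominator)

    for tag, value in fields:
        if tag == "W":
            info["width"] = int(value)
        elif tag == "H":
            info["height"] = int(value)
        elif tag == "I":
            if value not in ("p", "t", "b", "?"):
                raise ValueError(f"bad interlacing value: {value!r}")
            info["interlace"] = value
        elif tag == "F":
            info["frame_rate"] = ratio(value)
        elif tag == "A":
            info["sample_aspect"] = ratio(value)
        elif tag == "X":
            info["metadata"].append(value)
        # Assumption: tags this sketch does not know are skipped.

    for key in ("width", "height"):
        if info[key] is None or info[key] <= 0:
            raise ValueError(f"{key} is required and must be > 0")
    return info

info = interpret_stream_fields([("W", "720"), ("H", "576"), ("F", "25:1")])
```

Here info["interlace"] and info["sample_aspect"] keep their "unknown" defaults because the corresponding tags are absent.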
Note that a filter application must faithfully forward all "X" tags from
input pipe to output pipe (unless it uses one of those tags, of course).
The supplied library will do this automatically if the functions
y4m_copy_stream_info() and y4m_copy_frame_info() are used appropriately.
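For a filter written without the library, the forwarding rule can be met by copying the header fields through untouched. The Python sketch below illustrates the idea and is not a substitute for the library functions; the names read_header, passthrough and invert are hypothetical. It works on bytes so that X values are passed on verbatim, and it assumes the frame length given in NOTES ON IMAGE DATA.

```python
import io

def read_header(stream, magic):
    """Read one header line; return its tagged fields, or None at EOF."""
    line = stream.readline()
    if not line:
        return None
    parts = line.rstrip(b"\n").split(b" ")
    if parts[0] != magic:
        raise ValueError(f"expected {magic!r}, got {parts[0]!r}")
    return parts[1:]

def passthrough(src, dst, process=lambda data: data):
    """Copy a stream from src to dst, forwarding every tag unchanged."""
    fields = read_header(src, b"YUV4MPEG2")
    if fields is None:
        raise ValueError("missing stream header")
    # Only W and H are needed here, to know how long each frame is.
    dims = {f[:1]: int(f[1:]) for f in fields if f[:1] in (b"W", b"H")}
    width, height = dims[b"W"], dims[b"H"]
    frame_len = width * height + 2 * (width * height // 4)
    # All fields, including every X tag, go out exactly as they came in.
    dst.write(b" ".join([b"YUV4MPEG2"] + fields) + b"\n")
    while True:
        frame_fields = read_header(src, b"FRAME")
        if frame_fields is None:
            break
        data = src.read(frame_len)
        if len(data) != frame_len:
            raise ValueError("truncated frame")
        dst.write(b" ".join([b"FRAME"] + frame_fields) + b"\n")
        dst.write(process(data))

def invert(data):
    """Example per-frame operation: invert every sample."""
    return bytes(255 - sample for sample in data)

src = io.BytesIO(b"YUV4MPEG2 W2 H2 Xfoo\nFRAME Xbar\n" + bytes(6))
dst = io.BytesIO()
passthrough(src, dst, invert)
result = dst.getvalue()
```

In a real pipeline, src and dst would be sys.stdin.buffer and sys.stdout.buffer rather than in-memory buffers.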
NOTES ON IMAGE DATA
Currently, only planar, 4:2:0-subsampled, CCIR-601 Y'CbCr image data is
supported.
This consists of, one after another:
  - (height X width) octets (8-bit bytes) of Y' samples in row-major order;
  - (height X width / 4) octets of Cb samples;
  - (height X width / 4) octets of Cr samples.
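The plane sizes listed above determine both the number of octets to read per frame and where each plane starts. The following Python sketch works through that arithmetic; the function names plane_sizes and split_planes are hypothetical, and the sketch assumes even width and height, since the handling of odd dimensions is not described here.

```python
def plane_sizes(width, height):
    """Return the octet counts of the Y', Cb and Cr planes, in that order."""
    luma = width * height
    chroma = width * height // 4  # assumes even width and height
    return luma, chroma, chroma

def split_planes(data, width, height):
    """Slice one frame's image data into its Y', Cb and Cr planes."""
    y_len, cb_len, cr_len = plane_sizes(width, height)
    if len(data) != y_len + cb_len + cr_len:
        raise ValueError("image data has the wrong length for this frame size")
    y = data[:y_len]
    cb = data[y_len:y_len + cb_len]
    cr = data[y_len + cb_len:]
    return y, cb, cr

# A 720x576 frame: 414720 Y' octets plus 103680 each of Cb and Cr.
frame_length = sum(plane_sizes(720, 576))  # 622080

# A 4x2 frame with distinct values per plane, to show the slicing.
y, cb, cr = split_planes(bytes([16] * 8 + [100] * 2 + [200] * 2), 4, 2)
```

The total comes to 1.5 octets per pixel, which is the "length" of image data that follows each frame header.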
Sample siting (for the subsampling of the chroma planes) is not specified.
(Not yet, at least.)