Quotation from Doom9:
MPEG4 video is based heavily on H.263 - most of the MPEG4 docs assume a prior knowledge of H.263 so you may be better off looking for it instead.
B-frames are one of 3 different frame types that MPEG4 supports. I-frames (Intra frames) are completely self-contained; that is, they don't reference any other frames. These are commonly called key frames. P-frames (Predicted (?) frames, can't remember for some reason) are frames which reference the frame that came before them for image data. Each 16x16 block (macroblock) of a P-frame can either be encoded independently of anything else (an "intra" block) or be "compensated" from the frame that came before it. By exploiting the similarities often found between consecutive frames, P-frames are significantly smaller than I-frames - the cost is that you must decode every preceding P-frame from the last I-frame in order to decode one.
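To make that decoding cost concrete, here's a rough Python sketch. It isn't taken from any real codec: the function name `frames_needed` and the string-of-letters representation are made up for illustration, and it only handles streams of I- and P-frames.

```python
def frames_needed(frame_types, target):
    """List the frame indices that must be decoded to show frame `target`.

    `frame_types` is a string of 'I' and 'P' in display order (0-based).
    Hypothetical helper for illustration only; no B-frames are handled.
    """
    # Walk backwards until we hit the most recent I-frame (key frame).
    start = target
    while start >= 0 and frame_types[start] != "I":
        start -= 1
    if start < 0:
        raise ValueError("no I-frame at or before the target frame")
    # Every frame from that I-frame up to the target has to be decoded.
    return list(range(start, target + 1))


if __name__ == "__main__":
    print(frames_needed("IPPPIPP", 3))  # [0, 1, 2, 3]
    print(frames_needed("IPPPIPP", 6))  # [4, 5, 6]
```

For the stream "IPPPIPP", showing frame 3 means decoding frames 0 through 3, while frame 6 only needs frames 4 through 6, since the second I-frame restarts the chain.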
B-frames are a different matter, and complicate the encoding/decoding procedure by quite a bit. They are "Bidirectional" frames, meaning that they can reference frames that come both before and after themselves. How can a frame reference a frame that comes after itself, you ask? The encoder reorders the frames so that they are no longer stored one after the other in display order, that's how.
Say you had 4 frames. You wanted the first to be an I-frame (naturally), the next two to be B-frames (as they are usually a quarter of the size of P-frames, to give you some idea of the compression benefits), and the last frame to be a P-frame (as B-frames need something ahead of themselves to be predicted from). The frames would sequentially look like this:
1 2 3 4
I B B P
However, they would be stored in the file as:
1 4 2 3
I P B B
After encoding the I-frame, the encoder skips ahead and grabs the frame that is destined to be the P-frame, and encodes it as if it immediately followed the I-frame. Now, that will increase the size of the P-frame, as more will have changed between the I-frame and itself over the 2 intermediate frames, but we are hoping that the B-frames will make up for this loss of compressibility. Now we have an I-frame and a P-frame compressed. Once this is done, the encoder goes back to the second frame (which was destined to be our first B-frame), and references both the I-frame and the P-frame to find similarities. Once the first B-frame is completed, the encoder again uses the original I-frame and the P-frame to compress the second B-frame. Note that B-frames can't reference other B-frames for finding matches.
As you can see, B-frames make things messy.
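The referencing rules described above can be sketched in Python. This is only an illustration of the simplified model in this post: a P-frame uses the nearest earlier I- or P-frame, a B-frame uses the nearest I- or P-frame on each side, and B-frames are never used as references. `references` is a made-up helper name, not part of any codec.

```python
import bisect


def references(frame_types):
    """Map each 1-based display frame number to the frames it references.

    `frame_types` is a string such as "IBBP" in display order.
    Hypothetical helper; models only the rules described in this post.
    """
    # Only I- and P-frames may serve as references ("anchors").
    anchors = [n for n, t in enumerate(frame_types, start=1) if t in "IP"]
    refs = {}
    for number, ftype in enumerate(frame_types, start=1):
        if ftype == "I":
            refs[number] = []  # self-contained
            continue
        # Index of the first anchor that is not before this frame.
        pos = bisect.bisect_left(anchors, number)
        if pos == 0:
            raise ValueError(f"frame {number} has no earlier I/P-frame")
        previous = anchors[pos - 1]
        if ftype == "P":
            refs[number] = [previous]
        else:  # B-frame: needs an anchor on both sides
            if pos >= len(anchors):
                raise ValueError(f"B-frame {number} has no later I/P-frame")
            refs[number] = [previous, anchors[pos]]
    return refs


if __name__ == "__main__":
    print(references("IBBP"))  # {1: [], 2: [1, 4], 3: [1, 4], 4: [1]}
```

Both B-frames end up pointing at frames 1 and 4, and neither points at the other, which matches the note that B-frames can't reference other B-frames.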
Anyway, here's another diagram:
1 2 3 4
I B B P
1. Codec compresses I-frame
2. Codec jumps ahead and compresses P-frame as if it immediately followed the I-frame. Our bitstream now contains:
1 4
I P
3. Codec grabs the 2nd frame, and references both the I-frame and the P-frame to compress it. Adds compressed B-frame to bitstream:
1 4 2
I P B
4. Codec grabs the 3rd frame, and references both the I-frame and P-frame to compress it. We have finally completed our encoding:
1 4 2 3
I P B B
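The steps above can be sketched as a small Python function. It's a toy model under the assumptions in this post (B-frames are held back until the next I- or P-frame has been written); `coded_order` is a hypothetical name, and a real encoder deals with far more than frame ordering.

```python
def coded_order(frame_types):
    """Reorder display-order frame types (e.g. "IBBP") into storage order.

    Returns a list of (display_number, type) tuples, numbered from 1.
    Hypothetical helper for illustration only.
    """
    bitstream = []
    pending_b = []  # B-frames waiting for their forward reference
    for number, ftype in enumerate(frame_types, start=1):
        if ftype == "B":
            pending_b.append((number, ftype))
        else:
            # Write the I- or P-frame first, then the B-frames that needed it.
            bitstream.append((number, ftype))
            bitstream.extend(pending_b)
            pending_b = []
    if pending_b:
        raise ValueError("trailing B-frames have no later frame to reference")
    return bitstream


if __name__ == "__main__":
    print(coded_order("IBBP"))  # [(1, 'I'), (4, 'P'), (2, 'B'), (3, 'B')]
```

Feeding it "IBBP" gives frames 1, 4, 2, 3, the same storage order as in the diagram.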
Since B-frame encoding requires frames to be fed to the encoder "out of order", there is talk of creating a custom encoding program to generate XviD AVIs containing B-frames. These AVIs will be playable with all the normal tools (Media Player, PowerDivX, etc.), it's just the encoding that's tricky.
Hope that's made things less (more?) confusing.
-h
And this, again by -h:
The I-frame quantization settings restrict the quantizers that may be used for I-frames (key frames). If you set the min and max to 2 and 2, then the keyframes in your movies will be high quality and large in size.
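As a rough illustration of what such a restriction amounts to, here's a Python sketch. It's an assumption about the general idea, not XviD's actual code; `restrict_i_quantizers` and its parameters are invented names.

```python
def restrict_i_quantizers(frame_types, quantizers, q_min=2, q_max=2):
    """Clamp the quantizer of every I-frame into [q_min, q_max].

    `frame_types` is a string like "IPPI"; `quantizers` holds one value
    per frame. Other frame types are left untouched. Hypothetical helper.
    """
    restricted = []
    for ftype, q in zip(frame_types, quantizers):
        if ftype == "I":
            q = min(max(q, q_min), q_max)
        restricted.append(q)
    return restricted


if __name__ == "__main__":
    print(restrict_i_quantizers("IPPI", [5, 6, 7, 1]))  # [2, 6, 7, 2]
```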
Now, because of the way the 2-pass "bit bucket" works, setting the I-frames to artificially low quantizers will result in the P-frames that follow them having extremely high quantizers - such a quick jump will produce ugly output. So, the "Smooth quantizers" checkbox will prevent the quantizer from jumping too far between consecutive frames.
If you're going to mess with the I-frame quantizers, you should check that box.
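To show the idea behind limiting quantizer jumps, here's a minimal Python sketch. It is not XviD's actual "Smooth quantizers" logic, which this post doesn't describe; `smooth_quantizers` and the `max_step` limit are assumptions made up for the example.

```python
def smooth_quantizers(quantizers, max_step=2):
    """Limit how far the quantizer may move from one frame to the next.

    Each value is pulled to within `max_step` of the previous (already
    smoothed) value. Hypothetical sketch, not the codec's real algorithm.
    """
    smoothed = []
    for q in quantizers:
        if smoothed:
            prev = smoothed[-1]
            q = max(prev - max_step, min(prev + max_step, q))
        smoothed.append(q)
    return smoothed


if __name__ == "__main__":
    # An I-frame forced to quantizer 2, followed by P-frames pushed up to 12.
    print(smooth_quantizers([2, 12, 12, 12, 4]))  # [2, 4, 6, 8, 6]
```

Instead of jumping straight from 2 to 12, the quantizer climbs in steps of 2, which is the kind of gradual change the checkbox is meant to encourage.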