Levels are (agreed upon) constraints put on an encoder. They limit the encoder to certain frame sizes and bit rates in order to make sure that a decoder can correctly decode such a bitstream. This means that a decoder that claims to be compatible with level X bitstreams must be able to decode a stream encoded with level X. You can find an overview of all H.264 levels on Wikipedia.
If you do not know what level you need, you should think about your target application. Is it going to be a high definition broadcast or a small video for web? Look at the maximum supported frame dimensions (e.g. 1920×1080 vs 320×240) and frame rates (e.g. 60 Hz vs. 15 Hz) and set the appropriate level.
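To make that lookup concrete, here is a minimal shell sketch. It assumes the per-level limits on macroblocks per frame and macroblocks per second from the H.264 level table, ignores the bit rate limits (so levels that differ only in bit rate, such as 4 and 4.1, are collapsed into one row), and only handles integer frame rates. The function name h264_min_level is made up for this example.

```shell
# Sketch: find the lowest H.264 level whose frame-size and macroblock-rate
# limits cover a given resolution and (integer) frame rate.
# Assumption: bit rate limits are ignored, so levels 1b, 2 and 4.1 are left
# out because they share these two limits with a lower level.
h264_min_level() {
    local width=$1 height=$2 fps=$3
    # A macroblock covers 16x16 pixels; partial macroblocks round up.
    local mbs=$(( ((width + 15) / 16) * ((height + 15) / 16) ))
    local mbps=$(( mbs * fps ))
    local level max_fs max_mbps
    # Columns: level, max macroblocks per frame, max macroblocks per second.
    while read -r level max_fs max_mbps; do
        if (( mbs <= max_fs && mbps <= max_mbps )); then
            echo "$level"
            return 0
        fi
    done <<'EOF'
1   99    1485
1.1 396   3000
1.2 396   6000
1.3 396   11880
2.1 792   19800
2.2 1620  20250
3   1620  40500
3.1 3600  108000
3.2 5120  216000
4   8192  245760
4.2 8704  522240
5   22080 589824
5.1 36864 983040
5.2 36864 2073600
EOF
    echo "no level fits ${width}x${height} at ${fps} fps" >&2
    return 1
}
```

For example, h264_min_level 1920 1080 60 prints 4.2, while h264_min_level 320 240 15 prints 1.2. The result is only a lower bound; a high target bit rate may still push you to a higher level.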
The level itself does not influence the quality or file size. It only enforces a certain upper boundary or gives you a general hint, since logically, a 1080p60 video will be larger than a 320×240 web clip. But generally, you control the quality by setting an average bitrate, or a constant quality level. The level is just secondary here.
The keyframe distance doesn't have anything to do with the above. It is the distance (in pictures) between two I-pictures. There are three types of pictures in video compression:
- I-pictures, which can be decoded without reference to others ("intra-coded").
- P-pictures, which can only be decoded with the information from one or multiple previous P- or I-pictures ("previous" as in display order, P standing for "predicted").
- B-pictures, which can only be decoded with the information from one or multiple previous P- or I-pictures ("previous" as in decoding order, not necessarily display order; B standing for "bidirectional").
For example, a P-frame that follows an I-frame requires that I-frame to be decoded first. A B-frame that sits between a P-frame and a following I-frame requires both of them to be decoded.
The specific implementation of the picture types depends on the codec. The Wikipedia article on Group of Pictures (GOP) also explains that concept from a different perspective: Usually, I-pictures are interleaved with P- and B-pictures, and occur in a fixed interval—the keyframe interval. This is also the GOP length.
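If you want to see the GOP structure of an existing file, a small sketch like the following can help. It assumes you feed it one picture type per line (I, P or B) in display order and treats every I-picture as the start of a new GOP; the function name gop_lengths is made up for this example.

```shell
# Sketch: print the length of each GOP, given one picture type (I, P or B)
# per line on standard input.
# Assumption: every I-picture starts a new GOP.
gop_lengths() {
    awk -F, '
        $1 == "I" && n > 0 { print n; n = 0 }   # a new I-picture closes the previous GOP
        $1 != ""           { n++ }              # count every picture in the current GOP
        END                { if (n > 0) print n }
    '
}
```

One way to produce that input is ffprobe -v error -select_streams v:0 -show_entries frame=pict_type -of csv=p=0 input.mp4 | gop_lengths, where input.mp4 is a placeholder for your own file.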
A video with only I-pictures will provide the best quality at the highest file size. The longer the GOP gets, the smaller the file will be, since P-pictures and B-pictures require fewer bits to encode. Longer GOPs are rarely used for streaming, as a lost frame might degrade the quality of every picture that depends on it, but in broadcasting, a longer GOP is not unusual.
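As a rough illustration of setting the keyframe interval, the sketch below converts an interval in seconds into a GOP length in frames and prints (rather than runs) a matching FFmpeg command. It assumes the libx264 encoder and an integer frame rate; the function name and file names are placeholders.

```shell
# Sketch: turn a keyframe interval in seconds into a GOP length in frames
# and print a matching FFmpeg command line without running it.
# Assumption: libx264 is used as the encoder.
keyframe_interval_cmd() {
    local fps=$1 seconds=$2 input=$3 output=$4
    local gop=$(( fps * seconds ))
    # -g sets the maximum distance between keyframes, in frames.
    printf 'ffmpeg -i %s -c:v libx264 -g %d %s\n' "$input" "$gop" "$output"
}
```

For a 25 fps source and a 2 second interval, this prints a command with -g 50.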
For a very detailed description of what the picture types mean in H.264, you can also read Overview of the H.264/AVC Video Coding Standard by Thomas Wiegand et al. (see chapter IV A).
For CRF-based encodes, pass the following arguments to FFmpeg:
-c:v h264_nvenc -rc:v vbr_hq -cq:v 19 -b:v 2500k -maxrate:v 5000k -profile:v high
Of course, you'll need to adjust the target bit rates and the fixed cq value. 19 is the recommended setting, as it is visually close to lossless while still giving a good trade-off between compression and file size. See this write-up for more on what CRF does.
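Note that the -cq scale runs from 0 to 51 and is logarithmic: lower values give higher quality, and 51 is the absolute worst. Be aware that FFmpeg's NVENC encoders document a value of 0 as "automatic" rather than lossless.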
Quality can be further improved by adding options such as B-frames. Limit this to 3 at most, and note that it requires the H.264 Main profile or above, as Baseline profiles do not support B-frames. To do this, pass -bf {uint} to the video encoder, such that -bf:v 3 would result in the encoder using 3 B-frames.
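The profile constraint on B-frames can be sketched as a small helper. It is only an illustration: the function name bframe_args is made up, and it only knows the baseline, main and high profile names.

```shell
# Sketch: print the B-frame option for a given H.264 profile and B-frame count.
bframe_args() {
    local profile=$1 bframes=$2
    case "$profile" in
        baseline)
            # Baseline has no B-frame support, so emit nothing.
            return 0 ;;
        main|high)
            # Cap at 3, the upper bound suggested above.
            if (( bframes > 3 )); then bframes=3; fi
            printf -- '-bf:v %d\n' "$bframes" ;;
        *)
            echo "unknown profile: $profile" >&2
            return 1 ;;
    esac
}
```

The output can be spliced into an encoder command line next to the matching -profile:v argument.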
The key parts here are the -cq:v 19 and -rc:v vbr_hq arguments, which let you tune the encoder with both a target variable bitrate and a maximum allowable bitrate (-b:v and -maxrate:v) while adhering to a CRF value of 19.
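Put together, a complete invocation might look like the sketch below. It prints the command instead of running it, so you can inspect it first. The function name, the file names and the -c:a copy audio handling are my own assumptions and not part of the arguments above.

```shell
# Sketch: print a complete FFmpeg command around the NVENC arguments above.
# Assumptions: audio is copied unchanged (-c:a copy); the defaults mirror the
# values used in this answer and should be adjusted per source.
nvenc_cq_cmd() {
    local input=$1 output=$2 cq=${3:-19} bitrate=${4:-2500k} maxrate=${5:-5000k}
    printf 'ffmpeg -i %s -c:v h264_nvenc -rc:v vbr_hq -cq:v %s -b:v %s -maxrate:v %s -profile:v high -c:a copy %s\n' \
        "$input" "$cq" "$bitrate" "$maxrate" "$output"
}
```

Calling nvenc_cq_cmd in.mkv out.mp4 23 4000k 8000k would print the same command with a cq value of 23 and higher bit rate bounds.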
And now, a few notes about NVENC and tuning it for high quality encodes:
NVENC, like any other hardware-based encoder, has several limitations, particularly with HEVC. Here are the known ones:
On Pascal:
For HEVC encodes, the following limitations apply:
- CTU sizes above 32 are not supported.
- B-frames in HEVC are also not supported.
- The texture formats supported by the NVENC encoder limit the color spaces that the encoder can work with. For now, we have support for 4:2:0 (8-bit) and 4:4:4 (for 10-bit). Extraneous formats such as 4:2:2 10-bit are not supported. This will affect some workflows where such colorspaces are required.
- Look ahead control is also limited to 32 frames. You may want to look at this editorial for more details.
Turing has all the enhancements available to Pascal, with the addition of B-frame support for HEVC and the ability to use B-frames as a reference. See this answer for an example on this capability.
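The B-frame and look-ahead limits above can be expressed as a small helper that adjusts the requested settings per GPU generation. This is only a sketch: the function name and the architecture labels are made up, and applying the 32-frame look-ahead cap to Turing as well is my assumption.

```shell
# Sketch: print hevc_nvenc arguments adjusted for the limits listed above.
# Assumption: the 32-frame look-ahead cap is applied to every architecture.
hevc_nvenc_args() {
    local arch=$1 bframes=$2 lookahead=$3
    if (( lookahead > 32 )); then lookahead=32; fi
    case "$arch" in
        pascal) bframes=0 ;;   # no HEVC B-frames on Pascal
        turing) ;;             # HEVC B-frames are available
        *) echo "unknown architecture: $arch" >&2; return 1 ;;
    esac
    printf -- '-c:v hevc_nvenc -bf:v %d -rc-lookahead:v %d\n' "$bframes" "$lookahead"
}
```

For example, hevc_nvenc_args pascal 3 40 drops the B-frames and clamps the look-ahead to 32 frames.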
And on Maxwell Gen 2 (GM200x series GPUs):
HEVC encoding lacks features such as lookahead and sample adaptive offset (SAO) loop filtering. The impact here for Maxwell is that motion-heavy scenes with HEVC under constrained bitrates may suffer from artifacting (blockiness). Pascal has somewhat improved on this capability, but depending on the version of the SDK that the video encoder was built with, not all features may be available.
For instance, weighted prediction mode for H.264 encodes on Pascal requires NVENC SDK 8.0x and above, and this encode mode will also disable B-frame support. Likewise, combining the hardware-based scalers from the NVIDIA Performance Primitives (NPP) with NVENC may improve performance in video scaling applications at the cost of scaling artifacts, particularly with upscaled content. This also affects the video encode pipeline, as NPP's scaling functions run on the GPU's CUDA cores, so the impact of the extra load should be analyzed on a case-by-case basis to determine if the performance-quality trade-off is acceptable.
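For reference, a GPU-side scaling pipeline with NPP and NVENC could be assembled roughly as follows. The sketch only prints the command. It assumes an FFmpeg build with the scale_npp filter and CUDA hardware acceleration enabled; the function name and file names are placeholders.

```shell
# Sketch: print an FFmpeg command that decodes, scales with NPP and encodes
# with NVENC, keeping frames on the GPU.
# Assumption: the build includes scale_npp and the cuda hwaccel.
npp_scale_cmd() {
    local input=$1 output=$2 width=$3 height=$4
    printf 'ffmpeg -hwaccel cuda -hwaccel_output_format cuda -i %s -vf scale_npp=%d:%d -c:v h264_nvenc %s\n' \
        "$input" "$width" "$height" "$output"
}
```

Comparing the output of such a command against a software-scaled encode of the same clip is one way to judge whether the artifacts are acceptable.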
Keep this in mind: a hardware-based encoder will always offer somewhat less customization than an equivalent software-based implementation, and as such, your mileage and acceptable output quality will vary.
And for your reference:
With FFmpeg, you can always look up an encoder's settings by running:
ffmpeg -h encoder={encoder-name}
So, for NVENC-based encoders, you can run:
ffmpeg -h encoder=hevc_nvenc
ffmpeg -h encoder=h264_nvenc
You can also see all the NVENC-based encoders and NPP-based scalers (if built as such) by running:
for i in encoders decoders filters; do
    echo "$i:"; ffmpeg -hide_banner -"$i" | grep -Ei "npp|cuvid|nvenc|cuda"
done
Sample output on my testbed:
encoders:
V..... h264_nvenc NVIDIA NVENC H.264 encoder (codec h264)
V..... nvenc NVIDIA NVENC H.264 encoder (codec h264)
V..... nvenc_h264 NVIDIA NVENC H.264 encoder (codec h264)
V..... nvenc_hevc NVIDIA NVENC hevc encoder (codec hevc)
V..... hevc_nvenc NVIDIA NVENC hevc encoder (codec hevc)
decoders:
V..... h263_cuvid Nvidia CUVID H263 decoder (codec h263)
V..... h264_cuvid Nvidia CUVID H264 decoder (codec h264)
V..... hevc_cuvid Nvidia CUVID HEVC decoder (codec hevc)
V..... mjpeg_cuvid Nvidia CUVID MJPEG decoder (codec mjpeg)
V..... mpeg1_cuvid Nvidia CUVID MPEG1VIDEO decoder (codec mpeg1video)
V..... mpeg2_cuvid Nvidia CUVID MPEG2VIDEO decoder (codec mpeg2video)
V..... mpeg4_cuvid Nvidia CUVID MPEG4 decoder (codec mpeg4)
V..... vc1_cuvid Nvidia CUVID VC1 decoder (codec vc1)
V..... vp8_cuvid Nvidia CUVID VP8 decoder (codec vp8)
V..... vp9_cuvid Nvidia CUVID VP9 decoder (codec vp9)
filters:
... hwupload_cuda V->V Upload a system memory frame to a CUDA device.
... scale_npp V->V NVIDIA Performance Primitives video scaling and format conversion
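A listing like this can also drive a simple fallback in a script. The sketch below reads the encoder list on standard input and picks h264_nvenc when it is present. Falling back to libx264 is my assumption, and the function name is made up.

```shell
# Sketch: read "ffmpeg -encoders" output on stdin and choose an H.264 encoder.
# Assumption: libx264 is the software fallback when NVENC is not listed.
pick_h264_encoder() {
    # Field 1 holds the capability flags (e.g. "V....."), field 2 the encoder name.
    if awk '$2 == "h264_nvenc" { found = 1 } END { exit !found }'; then
        echo h264_nvenc
    else
        echo libx264
    fi
}
```

You would typically call it as ffmpeg -hide_banner -encoders | pick_h264_encoder and pass the result to -c:v.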
Best Answer
I'd stick to the HDTV 1080p presets. I don't know why they put in YouTube presets, but if those really affect the quality of your videos, I wouldn't recommend them.
I've uploaded a few 1080p videos to YouTube, and it hasn't changed the resolution to 720p. Although my workflow included exporting a high-quality version from Premiere, and then re-encoding with x264 through FFmpeg, I don't think YouTube will downscale a 1080p video. It might just take time to appear on the website, since YouTube needs to re-encode the videos.
Choose one of those presets and go to Video settings. Choose the matching frame rate for your source material, and select PAL or NTSC (depending on what the camcorder outputs).
Then, choose the High Profile (yes, YouTube supports that).
Now, let's get to the most important part affecting the quality: the bit rate. Premiere Pro assumes a very high bit rate of around 20 Mbit/s for exporting. No wonder your files are becoming this huge. 20 Mbit/s is something you'd use in broadcasting and for archiving the files. You really don't need it when uploading to YouTube, unless you're on an enterprise network connection.
You can reduce the bit rate to around 2-8 Mbit/s, which is still a sane value for 1080p H.264 video (and recommended by YouTube). You can actually see the estimated output file size and change the bit rate according to that, too.
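If you want to sanity-check that estimate yourself, the arithmetic is simple: bit rate times duration, divided by 8 to get bytes. A minimal sketch, assuming a constant video bit rate and ignoring audio and container overhead (the function name is made up):

```shell
# Sketch: estimate the video stream size in megabytes from the bit rate
# (kbit/s) and duration (seconds).
# Assumption: constant bit rate; audio and container overhead are ignored.
estimate_size_mb() {
    local kbps=$1 seconds=$2
    # kilobits -> megabytes: divide by 8 (bits per byte) and by 1000 (kilo -> mega).
    echo $(( kbps * seconds / 8000 ))
}
```

A 10 minute clip at 20 Mbit/s (estimate_size_mb 20000 600) comes out at about 1500 MB, while the same clip at 8 Mbit/s is about 600 MB.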
Personally, I've found the MainConcept encoder bundled with Premiere Pro to be quite slow. There's not much you can do about it, I guess.