Intel i5 Kaby Lake – Hardware-Assisted Transcoding to VP9 and FLAC Using FFmpeg

I have about 30 gigabytes of video (mostly MP4, some MKV and WebM) that I need to transcode to 8-bit VP9 with Free Lossless Audio Codec (FLAC) audio in an MKV container. The inputs use various codecs (AAC audio; H.264, VP8, H.265/HEVC, and probably some other video codecs). On my most powerful system, transcoding low-resolution videos takes twice as long as the video's running time. I use ffmpeg on Linux with the arguments ffmpeg -i input -c:v libvpx-vp9 -lossless 1 -c:a FLAC -preset veryslow output.mkv to transcode videos without hardware assistance. Recently, however, a friend of mine got an Intel i5 Kaby Lake CPU for his PC and has offered to transcode the videos for me. According to Wikipedia and its references, the new Kaby Lake CPUs support hardware decoding of all my input codecs and hardware encoding of 8-bit VP9. So I have two questions:

  1. What ffmpeg arguments can my friend use to transcode the videos to VP9 and audio to FLAC in an MKV container? Do they work with Windows? If not, that is fine as he has a Windows 10-Linux dual-boot.

  2. Is the veryslow preset still necessary to get best compression?

I've tried to find the answer to this question elsewhere but could only find examples for encoding codecs like H264 and JPEG.

Best Answer

UPDATE ON 3 AUGUST 2017: According to a newer answer by user 林正浩, ffmpeg now supports VP9 encoding through VAAPI. I still don't have the hardware required to test this, though, so my answer will be of limited help. I'll leave my original answer on how to encode VP9 in software below.
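For anyone who does have the hardware, a VAAPI encode would probably look something like the sketch below. It is untested, and it assumes an ffmpeg build with VAAPI support, a driver that exposes VP9 encoding, and /dev/dri/renderD128 as the render node. RUNNER, VAAPI_DEVICE and vaapi_vp9 are names made up for this example. The script only prints the command unless RUNNER is set to an empty value.

```shell
#!/bin/sh
# Dry run by default: "echo" prints the command. Run with RUNNER= to execute it.
RUNNER=${RUNNER-echo}

# Assumed render node; the path can differ between systems.
VAAPI_DEVICE=${VAAPI_DEVICE:-/dev/dri/renderD128}

# vaapi_vp9 INPUT OUTPUT
# Decodes in software, uploads NV12 frames to the GPU, encodes VP9 via VAAPI,
# and encodes the audio to FLAC on the CPU.
vaapi_vp9() {
  $RUNNER ffmpeg -vaapi_device "$VAAPI_DEVICE" -i "$1" \
    -vf 'format=nv12,hwupload' -c:v vp9_vaapi -c:a flac "$2"
}

vaapi_vp9 input.mp4 output.mkv
```

Note that this drops -lossless 1, which is a libvpx option. Whether the hardware encoder can produce lossless output is not something I can confirm.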


For some reason, FFmpeg doesn't support VP9 encoding on Intel's QuickSync hardware encoder, even though it supports H.264 and HEVC there. A search through the FFmpeg source code repository shows that the feature isn't merely disabled; it just hasn't been implemented yet. If it does become available at some point in the future, it should be usable in a manner similar to the other QuickSync encoders: a switch like -c:v vp9_qsv instead of -c:v libvpx-vp9 should do the job.

FFmpeg command line usage is the same on all platforms, with the one notable exception I know of being Windows users having to use NUL instead of /dev/null for output during the first pass of a 2-pass encode. But since you're doing 1-pass and lossless this shouldn't affect you.
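To illustrate the one difference, here is a sketch of a 2-pass encode where only the discard target of the first pass changes. The 1M bitrate is an arbitrary placeholder, and null_sink, two_pass and RUNNER are hypothetical names. The script prints the commands rather than running them; on Windows the printed lines would be pasted into cmd.exe.

```shell
#!/bin/sh
# Dry run by default: "echo" prints the command. Run with RUNNER= to execute it.
RUNNER=${RUNNER-echo}

# null_sink PLATFORM: name of the discard target for the first pass.
null_sink() {
  case "$1" in
    windows) echo NUL ;;
    *)       echo /dev/null ;;
  esac
}

# two_pass INPUT OUTPUT PLATFORM
# Pass 1 analyses the video only (-an drops audio, -f null discards output);
# pass 2 writes the real file. The bitrate is a placeholder value.
two_pass() {
  $RUNNER ffmpeg -i "$1" -c:v libvpx-vp9 -b:v 1M -pass 1 -an \
    -f null "$(null_sink "$3")" &&
  $RUNNER ffmpeg -i "$1" -c:v libvpx-vp9 -b:v 1M -pass 2 -c:a flac "$2"
}

two_pass input.mp4 output.mkv linux
two_pass input.mp4 output.mkv windows
```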

If you want to speed up your encodes, the most obvious thing to try is setting an encoding speed value with the -speed switch. Recommended values are numbers from 0 to 4, with 0 being really, really slow (think -preset placebo in x264 but worse) but high quality, and 4 being fast but lower quality. ffmpeg uses -speed 1 by default, which is a good speed-for-quality tradeoff for lossy encoding. However, I just did a quick lossless encoding test with different speed values and noticed a 32% reduction in file size when going from -speed 1 to -speed 0. The encoding time tripled, though, so whether using 0 is worth it is up to you. The file produced by -speed 4 was only 1.1% larger than the one produced by -speed 1, and it was encoded 43% faster. So I'd say that if you're doing lossless and -speed 0 is too slow, you might as well use -speed 4.
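If you want to repeat that comparison on your own material, something like the following sketch would do. It encodes one sample at speeds 0, 1 and 4 into separate files. sample.mp4, the speedN.mkv output names, encode_at_speed and RUNNER are all placeholders of mine. By default it only prints the commands.

```shell
#!/bin/sh
# Dry run by default: "echo" prints the command. Run with RUNNER= to execute it.
RUNNER=${RUNNER-echo}

# encode_at_speed INPUT SPEED
# Lossless VP9 + FLAC, written to one output file per speed value.
encode_at_speed() {
  $RUNNER ffmpeg -i "$1" -c:v libvpx-vp9 -lossless 1 -speed "$2" \
    -c:a flac "speed$2.mkv"
}

for speed in 0 1 4; do
  encode_at_speed sample.mp4 "$speed"
done
```

After a real run (RUNNER= sh script.sh), ls -l speed*.mkv shows the resulting sizes, and prefixing the script with time gives a rough idea of the total duration.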

Another important way to increase encoding performance is turning on multi-threading with the -threads switch; libvpx doesn't automatically use multiple threads, so this must be set manually by the user. You should also set the number of tile columns with the -tile-columns switch. This option makes libvpx divide the video into multiple tiles and encode these tiles in parallel for better multi-threading. You can find recommended numbers of tile columns and threads in the "Tiling and Threading Recommendations" section of Google's VP9 encoding guide. As you can see there, the number of threads used goes up with the number of tiles, which means that, depending on the number of CPU cores available, your processor might not be fully saturated while encoding sub-HD-resolution video. If you mainly encode low-resolution videos, you might want to consider encoding multiple files at the same time.
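The tiling advice can be turned into a small helper. This is my own sketch, not something taken from Google's guide: it assumes that VP9 tiles must be at least 256 pixels wide and that -tile-columns takes the log2 of the column count, so it picks the largest value that still fits the frame width. tile_columns, encode_tiled and RUNNER are made-up names, and the thread count is passed in by hand because the right value depends on your CPU.

```shell
#!/bin/sh
# Dry run by default: "echo" prints the command. Run with RUNNER= to execute it.
RUNNER=${RUNNER-echo}

# tile_columns WIDTH
# Prints the largest log2 tile-column count whose tiles stay at least
# 256 pixels wide (assumed VP9 minimum tile width).
tile_columns() {
  tc_log2=0
  tc_tile=$(( $1 / 2 ))          # tile width if the column count doubled
  while [ "$tc_tile" -ge 256 ]; do
    tc_log2=$(( tc_log2 + 1 ))
    tc_tile=$(( tc_tile / 2 ))
  done
  echo "$tc_log2"
}

# encode_tiled INPUT OUTPUT WIDTH THREADS
encode_tiled() {
  $RUNNER ffmpeg -i "$1" -c:v libvpx-vp9 -lossless 1 \
    -tile-columns "$(tile_columns "$3")" -threads "$4" -c:a flac "$2"
}

# Example values only: a 1280-pixel-wide video encoded with 8 threads.
encode_tiled input.mp4 output.mkv 1280 8
```

For a 320-pixel-wide video the helper returns 0, which is why small videos cannot keep many cores busy through tiling alone.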

However, there is yet another way to speed up VP9 encoding: multi-threading within a single tile column, which can be turned on with -row-mt 1. As of April 4, 2017 (hello, future people), it isn't part of a released version of libvpx but will most likely be in libvpx 1.6.2. If you want to try it out before the next release, you need to compile recent git versions of libvpx and ffmpeg from source. Just follow FFmpeg's compilation guide for your distro of choice, but instead of downloading and extracting a release tarball, do git clone https://chromium.googlesource.com/webm/libvpx.
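Since row-based multi-threading depends on how your ffmpeg and libvpx were built, it may be worth checking for the option before relying on it. The sketch below looks for row-mt in the encoder's help output and only adds the switch when it is listed. row_mt_args, encode_row_mt and RUNNER are hypothetical names, and -threads 4 is just an example value.

```shell
#!/bin/sh
# Dry run by default: "echo" prints the command. Run with RUNNER= to execute it.
RUNNER=${RUNNER-echo}

# row_mt_args
# Prints "-row-mt 1" when the local ffmpeg lists a row-mt option for
# libvpx-vp9, and nothing otherwise (including when ffmpeg is not installed).
row_mt_args() {
  if ffmpeg -hide_banner -h encoder=libvpx-vp9 2>/dev/null | grep -q -e 'row-mt'; then
    echo "-row-mt 1"
  fi
}

# encode_row_mt INPUT OUTPUT
# $(row_mt_args) is left unquoted on purpose so it splits into two words,
# or disappears entirely when the option is unavailable.
encode_row_mt() {
  $RUNNER ffmpeg -i "$1" -c:v libvpx-vp9 -lossless 1 $(row_mt_args) \
    -threads 4 -c:a flac "$2"
}

encode_row_mt input.mp4 output.mkv
```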

As for the veryslow preset, that's only used in x264 and x265. libvpx uses the -speed switch and additionally the -quality best, -quality good, or -quality realtime options to define how much time the encoder is allowed to spend encoding a frame. The default is -quality good because -quality best is so slow it's unusable and -quality realtime is meant to be used for time-critical applications like video calls and livestreaming.
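Put together, the command from the question with -preset veryslow replaced by libvpx's own switches might look like the sketch below. vp9_lossless and RUNNER are names I made up; the defaults of good and 1 mirror the defaults mentioned above, and anything other than best, good or realtime is rejected. It prints the command unless RUNNER is set to an empty value.

```shell
#!/bin/sh
# Dry run by default: "echo" prints the command. Run with RUNNER= to execute it.
RUNNER=${RUNNER-echo}

# vp9_lossless INPUT OUTPUT [QUALITY] [SPEED]
# QUALITY is one of best, good, realtime (default good); SPEED defaults to 1.
vp9_lossless() {
  vl_quality=${3:-good}
  vl_speed=${4:-1}
  case "$vl_quality" in
    best|good|realtime) ;;
    *) echo "unknown quality: $vl_quality" >&2; return 1 ;;
  esac
  $RUNNER ffmpeg -i "$1" -c:v libvpx-vp9 -lossless 1 \
    -quality "$vl_quality" -speed "$vl_speed" -c:a flac "$2"
}

vp9_lossless input.mp4 output.mkv          # defaults: good, speed 1
vp9_lossless input.mp4 output.mkv good 0   # smallest files, slowest encode
```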