I am attempting to create a picture-in-picture video from two mp4 videos.
- There is a time offset on the first input file (0:v).
- The first input (0:v) is being rotated 90 degrees, and the second input (1:v) is scaled to a quarter of its width for the overlay.
- I am using apad to extend the first audio (0:a).
- I am using amerge to merge both audio inputs into a single audio.
Errors
I am seeing similar errors with the same command on two separate machines, one with a GPU (Graphics Processing Unit) and one without. (I'm trying to start using hardware acceleration, hence the new machine with a GPU.) The machines have different configurations; let me know if you would like that info.
Machine without GPU
ffmpeg version 3.0.git Copyright (c) 2000-2016 the FFmpeg developers
Error while decoding stream #1:1: Cannot allocate memory
Machine with GPU
ffmpeg version N-90913-gd176497 Copyright (c) 2000-2018 the FFmpeg developers
Error while filtering: Cannot allocate memory
Failed to inject frame into filter network: Cannot allocate memory
Error while processing the decoded data for stream #1:1
The machine without a GPU does complete the video conversion despite the error. The machine with a GPU fails completely, with the message "Conversion Failed!"
Commands:
ffmpeg -itsoffset 1.801 -i 2327_segment_0_remote_0.mp4 \
-i 2327_segment_0_local_0.mp4 -filter_complex \
" [1:v]scale=iw/4:-1:flags=lanczos[loc0]; \
[0:v]transpose=1[rotate1]; \
[rotate1][loc0]overlay=main_w-overlay_w-10:main_h-overlay_h-10:eof_action=pass[rem0]; \
[0:a]apad[0a]; [0a][1:a]amerge=inputs=2[a]" \
-map "[rem0]" -map "[a]" -ac 2 -vcodec libx264 -ar 44100 -acodec aac \
2327_segment_0.mp4
ffmpeg version 3.0.git Copyright (c) 2000-2016 the FFmpeg developers
built with gcc 5.4.0 (Ubuntu 5.4.0-6ubuntu1~16.04.2) 20160609
configuration: --prefix=/home/daryl/ffmpeg_build --pkg-config-flags=--static --extra-cflags=-I/home/daryl/ffmpeg_build/include --extra-ldflags=-L/home/daryl/ffmpeg_build/lib --bindir=/home/daryl/bin --enable-gpl --enable-libass --enable-libfdk-aac --enable-libfreetype --enable-libmp3lame --enable-libopus --enable-libtheora --enable-libvorbis --enable-libvpx --enable-libx264 --enable-libx265 --enable-nonfree
libavutil 55. 32.100 / 55. 32.100
libavcodec 57. 63.103 / 57. 63.103
libavformat 57. 52.100 / 57. 52.100
libavdevice 57. 0.102 / 57. 0.102
libavfilter 6. 64.100 / 6. 64.100
libswscale 4. 1.100 / 4. 1.100
libswresample 2. 2.100 / 2. 2.100
libpostproc 54. 0.100 / 54. 0.100
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from '2327_segment_0_remote_0.mp4':
Metadata:
major_brand : isom
minor_version : 512
compatible_brands: isomiso2avc1mp41
encoder : Lavf57.52.100
Duration: 00:00:09.12, start: 0.000000, bitrate: 762 kb/s
Stream #0:0(eng): Video: h264 (High) (avc1 / 0x31637661), yuv420p, 640x480 [SAR 1:1 DAR 4:3], 688 kb/s, 16.67 fps, 16.67 tbr, 12800 tbn, 33.33 tbc (default)
Metadata:
handler_name : VideoHandler
Stream #0:1(eng): Audio: aac (LC) (mp4a / 0x6134706D), 48000 Hz, mono, fltp, 69 kb/s (default)
Metadata:
handler_name : SoundHandler
Input #1, mov,mp4,m4a,3gp,3g2,mj2, from '2327_segment_0_local_0.mp4':
Metadata:
major_brand : isom
minor_version : 512
compatible_brands: isomiso2avc1mp41
encoder : Lavf57.52.100
Duration: 00:00:12.24, start: 0.000000, bitrate: 398 kb/s
Stream #1:0(eng): Video: h264 (High) (avc1 / 0x31637661), yuv420p, 640x480 [SAR 1:1 DAR 4:3], 324 kb/s, 16.67 fps, 16.67 tbr, 12800 tbn, 33.33 tbc (default)
Metadata:
handler_name : VideoHandler
Stream #1:1(eng): Audio: aac (LC) (mp4a / 0x6134706D), 48000 Hz, mono, fltp, 69 kb/s (default)
Metadata:
handler_name : SoundHandler
[Parsed_amerge_4 @ 0x283a300] No channel layout for input 1
[Parsed_amerge_4 @ 0x283a300] Input channel layouts overlap: output layout will be determined by the number of distinct input channels
[libx264 @ 0x2849b20] using SAR=1/1
[libx264 @ 0x2849b20] using cpu capabilities: MMX2 SSE2Fast SSSE3 SSE4.2
[libx264 @ 0x2849b20] profile High, level 2.2
[libx264 @ 0x2849b20] 264 - core 148 r2643 5c65704 - H.264/MPEG-4 AVC codec - Copyleft 2003-2015 - http://www.videolan.org/x264.html - options: cabac=1 ref=3 deblock=1:0:0 analyse=0x3:0x113 me=hex subme=7 psy=1 psy_rd=1.00:0.00 mixed_ref=1 me_range=16 chroma_me=1 trellis=1 8x8dct=1 cqm=0 deadzone=21,11 fast_pskip=1 chroma_qp_offset=-2 threads=1 lookahead_threads=1 sliced_threads=0 nr=0 decimate=1 interlaced=0 bluray_compat=0 constrained_intra=0 bframes=3 b_pyramid=2 b_adapt=1 b_bias=0 direct=1 weightb=1 open_gop=0 weightp=2 keyint=250 keyint_min=16 scenecut=40 intra_refresh=0 rc_lookahead=40 rc=crf mbtree=1 crf=23.0 qcomp=0.60 qpmin=0 qpmax=69 qpstep=4 ip_ratio=1.40 aq=1:1.00
Output #0, mp4, to '2327_segment_0.mp4':
Metadata:
major_brand : isom
minor_version : 512
compatible_brands: isomiso2avc1mp41
encoder : Lavf57.52.100
Stream #0:0: Video: h264 (libx264) ([33][0][0][0] / 0x0021), yuv420p, 480x640 [SAR 1:1 DAR 3:4], q=-1--1, 16.67 fps, 12800 tbn, 16.67 tbc (default)
Metadata:
encoder : Lavc57.63.103 libx264
Side data:
cpb: bitrate max/min/avg: 0/0/0 buffer size: 0 vbv_delay: -1
Stream #0:1: Audio: aac (LC) ([64][0][0][0] / 0x0040), 44100 Hz, stereo, fltp, 128 kb/s (default)
Metadata:
encoder : Lavc57.63.103 aac
Stream mapping:
Stream #0:0 (h264) -> transpose
Stream #0:1 (aac) -> apad
Stream #1:0 (h264) -> scale
Stream #1:1 (aac) -> amerge:in1
overlay -> Stream #0:0 (libx264)
amerge -> Stream #0:1 (aac)
Press [q] to stop, [?] for help
Error while decoding stream #1:1: Cannot allocate memory
Last message repeated 17 times
frame= 182 fps= 29 q=27.0 Lsize= 975kB time=00:00:13.55 bitrate= 589.4kbits/s dup=30 drop=0 speed=2.13x
video:786kB audio:182kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 0.772906%
[libx264 @ 0x2849b20] frame I:1 Avg QP:15.27 size: 22292
[libx264 @ 0x2849b20] frame P:127 Avg QP:21.91 size: 5290
[libx264 @ 0x2849b20] frame B:54 Avg QP:22.79 size: 2036
[libx264 @ 0x2849b20] consecutive B-frames: 52.2% 20.9% 11.5% 15.4%
[libx264 @ 0x2849b20] mb I I16..4: 6.0% 58.8% 35.2%
[libx264 @ 0x2849b20] mb P I16..4: 2.6% 9.1% 2.0% P16..4: 38.6% 11.5% 2.9% 0.0% 0.0% skip:33.1%
[libx264 @ 0x2849b20] mb B I16..4: 0.3% 0.9% 0.5% B16..8: 30.3% 6.1% 0.7% direct: 0.8% skip:60.4% L0:51.5% L1:44.3% BI: 4.2%
[libx264 @ 0x2849b20] 8x8 transform intra:65.2% inter:69.2%
[libx264 @ 0x2849b20] coded y,uvDC,uvAC intra: 60.3% 63.2% 6.7% inter: 14.4% 10.5% 0.2%
[libx264 @ 0x2849b20] i16 v,h,dc,p: 21% 23% 18% 37%
[libx264 @ 0x2849b20] i8 v,h,dc,ddl,ddr,vr,hd,vl,hu: 24% 22% 19% 5% 5% 7% 5% 6% 7%
[libx264 @ 0x2849b20] i4 v,h,dc,ddl,ddr,vr,hd,vl,hu: 30% 19% 17% 5% 5% 7% 4% 6% 4%
[libx264 @ 0x2849b20] i8c dc,h,v,p: 52% 17% 26% 5%
[libx264 @ 0x2849b20] Weighted P-Frames: Y:0.0% UV:0.0%
[libx264 @ 0x2849b20] ref P L0: 76.5% 11.4% 8.1% 3.9%
[libx264 @ 0x2849b20] ref B L0: 89.9% 9.6% 0.5%
[libx264 @ 0x2849b20] ref B L1: 96.4% 3.6%
[libx264 @ 0x2849b20] kb/s:589.06
[aac @ 0x2838940] Qavg: 1670.505
ffmpeg -itsoffset 1.801 -i 2327_segment_0_remote_0.mp4 \
-i 2327_segment_0_local_0.mp4 -filter_complex \
" [1:v]scale=iw/4:-1:flags=lanczos[loc0]; \
[0:v]transpose=1[rotate1]; \
[rotate1][loc0]overlay=main_w-overlay_w-10:main_h-overlay_h-10:eof_action=pass[rem0]; \
[0:a]apad[0a]; [0a][1:a]amerge=inputs=2[a]" \
-map "[rem0]" -map "[a]" -ac 2 -vcodec libx264 -ar 44100 -acodec aac \
2327_segment_0.mp4
ffmpeg version N-90913-gd176497 Copyright (c) 2000-2018 the FFmpeg developers
built with gcc 5.4.0 (Ubuntu 5.4.0-6ubuntu1~16.04.9) 20160609
configuration: --as=yasm --bindir=/home/dreedy/bin --cpu=native --extra-cflags=-I/home/dreedy/ffmpeg_build/include --extra-ldflags=-L/home/dreedy/ffmpeg_build/lib --extra-libs=-lpthread --pkg-config-flags=--static --prefix=/home/dreedy/ffmpeg_build --enable-cuda --enable-gpl --enable-ladspa --enable-libass --enable-libfdk-aac --enable-libfreetype --enable-libgsm --enable-libmp3lame --enable-libopus --enable-libsmbclient --enable-libsoxr --enable-libspeex --enable-libssh --enable-libtheora --enable-libvorbis --enable-libvpx --enable-libwavpack --enable-libx264 --enable-libx265 --enable-nonfree --enable-nvenc --enable-opengl --enable-pic --enable-static --enable-vaapi --enable-vdpau --enable-version3 --enable-libxvid --enable-omx --enable-openal --enable-openssl
libavutil 56. 18.100 / 56. 18.100
libavcodec 58. 19.100 / 58. 19.100
libavformat 58. 13.100 / 58. 13.100
libavdevice 58. 4.100 / 58. 4.100
libavfilter 7. 21.100 / 7. 21.100
libswscale 5. 2.100 / 5. 2.100
libswresample 3. 2.100 / 3. 2.100
libpostproc 55. 2.100 / 55. 2.100
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from '2327_segment_0_remote_0.mp4':
Metadata:
major_brand : isom
minor_version : 512
compatible_brands: isomiso2avc1mp41
encoder : Lavf57.52.100
Duration: 00:00:09.12, start: 0.000000, bitrate: 762 kb/s
Stream #0:0(eng): Video: h264 (High) (avc1 / 0x31637661), yuv420p, 640x480 [SAR 1:1 DAR 4:3], 688 kb/s, 16.67 fps, 16.67 tbr, 12800 tbn, 33.33 tbc (default)
Metadata:
handler_name : VideoHandler
Stream #0:1(eng): Audio: aac (LC) (mp4a / 0x6134706D), 48000 Hz, mono, fltp, 69 kb/s (default)
Metadata:
handler_name : SoundHandler
Input #1, mov,mp4,m4a,3gp,3g2,mj2, from '2327_segment_0_local_0.mp4':
Metadata:
major_brand : isom
minor_version : 512
compatible_brands: isomiso2avc1mp41
encoder : Lavf57.52.100
Duration: 00:00:12.24, start: 0.000000, bitrate: 398 kb/s
Stream #1:0(eng): Video: h264 (High) (avc1 / 0x31637661), yuv420p, 640x480 [SAR 1:1 DAR 4:3], 324 kb/s, 16.67 fps, 16.67 tbr, 12800 tbn, 33.33 tbc (default)
Metadata:
handler_name : VideoHandler
Stream #1:1(eng): Audio: aac (LC) (mp4a / 0x6134706D), 48000 Hz, mono, fltp, 69 kb/s (default)
Metadata:
handler_name : SoundHandler
Stream mapping:
Stream #0:0 (h264) -> transpose
Stream #0:1 (aac) -> apad
Stream #1:0 (h264) -> scale
Stream #1:1 (aac) -> amerge:in1
overlay -> Stream #0:0 (libx264)
amerge -> Stream #0:1 (aac)
Press [q] to stop, [?] for help
[Parsed_amerge_4 @ 0x31e9c40] No channel layout for input 1
[Parsed_amerge_4 @ 0x31e9c40] Input channel layouts overlap: output layout will be determined by the number of distinct input channels
[libx264 @ 0x2f2dc80] using SAR=1/1
[libx264 @ 0x2f2dc80] using cpu capabilities: MMX2 SSE2Fast SSSE3 SSE4.2 AVX FMA3 AVX2 LZCNT BMI2
[libx264 @ 0x2f2dc80] profile High, level 2.2
[libx264 @ 0x2f2dc80] 264 - core 148 r2643 5c65704 - H.264/MPEG-4 AVC codec - Copyleft 2003-2015 - http://www.videolan.org/x264.html - options: cabac=1 ref=3 deblock=1:0:0 analyse=0x3:0x113 me=hex subme=7 psy=1 psy_rd=1.00:0.00 mixed_ref=1 me_range=16 chroma_me=1 trellis=1 8x8dct=1 cqm=0 deadzone=21,11 fast_pskip=1 chroma_qp_offset=-2 threads=6 lookahead_threads=1 sliced_threads=0 nr=0 decimate=1 interlaced=0 bluray_compat=0 constrained_intra=0 bframes=3 b_pyramid=2 b_adapt=1 b_bias=0 direct=1 weightb=1 open_gop=0 weightp=2 keyint=250 keyint_min=16 scenecut=40 intra_refresh=0 rc_lookahead=40 rc=crf mbtree=1 crf=23.0 qcomp=0.60 qpmin=0 qpmax=69 qpstep=4 ip_ratio=1.40 aq=1:1.00
Output #0, mp4, to '2327_segment_0.mp4':
Metadata:
major_brand : isom
minor_version : 512
compatible_brands: isomiso2avc1mp41
encoder : Lavf58.13.100
Stream #0:0: Video: h264 (libx264) (avc1 / 0x31637661), yuv420p, 480x640 [SAR 1:1 DAR 3:4], q=-1--1, 16.67 fps, 12800 tbn, 16.67 tbc (default)
Metadata:
encoder : Lavc58.19.100 libx264
Side data:
cpb: bitrate max/min/avg: 0/0/0 buffer size: 0 vbv_delay: -1
Stream #0:1: Audio: aac (LC) (mp4a / 0x6134706D), 44100 Hz, stereo, fltp, 128 kb/s (default)
Metadata:
encoder : Lavc58.19.100 aac
Error while filtering: Cannot allocate memory
Failed to inject frame into filter network: Cannot allocate memory
Error while processing the decoded data for stream #1:1
[aac @ 0x2f2eb80] Qavg: 136.254
[aac @ 0x2f2eb80] 2 frames left in the queue on closing
Threads
Some Google searches suggested specifying the number of threads, but this did not help:
-threads 1
-filter_threads 1
Update
I have found that this command works just fine when a different video is used as the 2nd input file. Below is the ffprobe information for the videos with conversion problems. I originally started with two .webm files saved from a WebRTC stream using RecordRTC. I converted those files to .mp4 as an intermediate step before combining them into a picture in a picture overlay video.
Input #0, matroska,webm, from '2327_segment_0_local_0.webm':
Metadata:
encoder : Chrome
Duration: N/A, start: 0.000000, bitrate: N/A
Stream #0:0(eng): Audio: opus, 48000 Hz, mono, fltp (default)
Stream #0:1(eng): Video: h264 (Baseline), yuv420p, 640x480 [SAR 1:1 DAR 4:3], 16.67 tbr, 1k tbn, 2k tbc (default)
Input #0, matroska,webm, from '2327_segment_0_remote_0.webm':
Metadata:
encoder : Chrome
Duration: N/A, start: 0.000000, bitrate: N/A
Stream #0:0(eng): Audio: opus, 48000 Hz, mono, fltp (default)
Stream #0:1(eng): Video: h264 (Baseline), yuv420p, 640x480 [SAR 1:1 DAR 4:3], 16.67 tbr, 1k tbn, 2k tbc (default)
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from '2327_segment_0_local_0.mp4':
Metadata:
major_brand : isom
minor_version : 512
compatible_brands: isomiso2avc1mp41
encoder : Lavf57.52.100
Duration: 00:00:12.24, start: 0.000000, bitrate: 398 kb/s
Stream #0:0(eng): Video: h264 (High) (avc1 / 0x31637661), yuv420p, 640x480 [SAR 1:1 DAR 4:3], 324 kb/s, 16.67 fps, 16.67 tbr, 12800 tbn, 33.33 tbc (default)
Metadata:
handler_name : VideoHandler
Stream #0:1(eng): Audio: aac (LC) (mp4a / 0x6134706D), 48000 Hz, mono, fltp, 69 kb/s (default)
Metadata:
handler_name : SoundHandler
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from '2327_segment_0_remote_0.mp4':
Metadata:
major_brand : isom
minor_version : 512
compatible_brands: isomiso2avc1mp41
encoder : Lavf57.52.100
Duration: 00:00:09.12, start: 0.000000, bitrate: 762 kb/s
Stream #0:0(eng): Video: h264 (High) (avc1 / 0x31637661), yuv420p, 640x480 [SAR 1:1 DAR 4:3], 688 kb/s, 16.67 fps, 16.67 tbr, 12800 tbn, 33.33 tbc (default)
Metadata:
handler_name : VideoHandler
Stream #0:1(eng): Audio: aac (LC) (mp4a / 0x6134706D), 48000 Hz, mono, fltp, 69 kb/s (default)
Metadata:
handler_name : SoundHandler
After more research, I reduced my example to the bare minimum. I found that if I remove the offset, the video converts. However, the background video then starts too soon and is out of sync. With some testing I found that the command converts with an offset of no more than 1.200 seconds.
The background video is 9.12 seconds long and the overlay video is 12.24 seconds long. The background video should show its first frame until 1.801 seconds have passed, and it should end at 10.921 seconds. The overlay video should play for all 12.24 seconds of the completed picture in a picture video.
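The timeline I expect can be checked with a little arithmetic. This is only a sketch using the offset and durations quoted above; the variable names are illustrative, and awk is used because POSIX shell arithmetic is integer-only.

```shell
#!/bin/sh
# Values taken from the question; variable names are illustrative only.
offset=1.801   # -itsoffset applied to the background (remote) video
bg_dur=9.12    # duration of the background video, in seconds
ov_dur=12.24   # duration of the overlay (local) video, in seconds

# The background video should end at offset + its own duration.
bg_end=$(awk -v o="$offset" -v d="$bg_dur" 'BEGIN { printf "%.3f", o + d }')

# The finished video should last as long as whichever stream ends last.
total=$(awk -v e="$bg_end" -v d="$ov_dur" 'BEGIN { printf "%.3f", (e > d) ? e : d }')

echo "background video ends at ${bg_end}s"   # 10.921
echo "output should last ${total}s"          # 12.240
```

For comparison, the run that did finish on the machine without a GPU reported time=00:00:13.55, which is longer than the 12.24 seconds worked out here.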
Best Answer
When converting the .webm files to .mp4, you are converting the audio from opus to aac. It appears that when you then attempt to create the picture in a picture overlay, the conversion has a problem decoding the aac stream.
I suggest trying to convert the .webm file to .mp4 with a different audio codec. Then try your overlay command again.
Example:
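One way to try this is sketched below. The answer does not name a codec, so the choice of MP3 via libmp3lame is an assumption; it is listed under --enable-libmp3lame in both build configurations shown in the question. The helper name convert_webm, the DRY_RUN switch and the 128k audio bitrate are likewise hypothetical and only there for illustration.

```shell
#!/bin/sh
# Re-encode a .webm recording to .mp4 with MP3 audio instead of AAC.
# convert_webm and DRY_RUN are hypothetical helpers, not part of ffmpeg.
convert_webm() {
    src=$1
    dst=${src%.webm}.mp4
    # Assumption: libmp3lame as the replacement audio codec, 128k bitrate.
    set -- ffmpeg -i "$src" -c:v libx264 -c:a libmp3lame -b:a 128k "$dst"
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"   # only print the command that would be run
    else
        "$@"        # actually run ffmpeg
    fi
}

DRY_RUN=1   # remove this line to run the conversions for real
convert_webm 2327_segment_0_local_0.webm
convert_webm 2327_segment_0_remote_0.webm
```

Because the output names match the existing .mp4 files, ffmpeg will ask before overwriting them. Once both files are re-encoded, the overlay command from the question can be run again unchanged.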