How to stack two speakers of a Zoom video capture with FFmpeg

06 September, 2023
software, video, FFmpeg, howto

Basic use

Filter chain:

  • Remove the black bars with crop=1280:352:0:184, the crop value that the cropdetect filter reports for this recording
  • Split the stream into two streams, one for the left and one for the right speaker
  • Crop each split of the stream to isolate each speaker
  • Stack the left speaker above the right speaker
  • Crop the stacked video to a 9:16 aspect ratio (for this recording, 640x704 becomes 396x704)

ffmpeg -i input.mp4 -filter_complex "[0:v]crop=1280:352:0:184,split=2[left][right];[left]crop=in_w/2:in_h:0:out_h[left2];[right]crop=in_w/2:in_h:in_w/2:out_h[right2];[left2][right2]vstack,crop=ih*(9/16)" -c:a copy -y output.mp4
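
The list above takes the crop value from cropdetect without showing how to obtain it. The following is a minimal sketch of one way to do that, assuming a grep that supports -o (GNU and BSD grep do). The helper name most_common_crop is my own and not part of FFmpeg; it only picks the crop= value that appears most often in the log.

```shell
#!/bin/sh
# Sketch: pick the crop value that cropdetect reports most often.
# most_common_crop is a hypothetical helper name, not an FFmpeg feature.
most_common_crop() {
    # Read an FFmpeg log on stdin, keep only the crop=W:H:X:Y tokens,
    # count identical values and print the most frequent one.
    grep -o 'crop=[0-9]*:[0-9]*:[0-9]*:[0-9]*' \
        | sort | uniq -c | sort -rn | head -n 1 | awk '{print $2}'
}

# Usage (assumes input.mp4 exists; FFmpeg logs to stderr, hence 2>&1):
# ffmpeg -hide_banner -i input.mp4 -vf cropdetect -f null - 2>&1 | most_common_crop
```

Taking the most frequent value rather than the first one guards against odd values that cropdetect may report on the opening frames. It is still worth checking the result on a few frames before encoding the whole video.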

Advanced, with speakers’ names

This is what I used for converting the Zoom video “Isaak Tsalicoglou and Bruno Pešec pull the curtain on innovation” of my discussion with Bruno Pešec into a vertical video with speakers’ names overlaid on each half (top/bottom) of the video. I then used lossless-cut for extracting short snippets for social media.

ffmpeg -i input.mp4 -filter_complex "[0:v]crop=1280:352:0:184,split=2[left][right];[left]crop=in_w/2:in_h:0:out_h,drawtext=text='Bruno Pešec':x=w*4/16:y=(h-th)*0.925:fontsize=20:fontcolor=white:borderw=2:bordercolor=black:fontfile='/home/tisaak/.fonts/Inter Tight/InterTight-Medium.ttf'[left2];[right]crop=in_w/2:in_h:in_w/2:out_h,drawtext=text='Isaak Tsalicoglou':x=w*4/16:y=(h-th)*0.925:fontsize=20:fontcolor=white:borderw=2:bordercolor=black:fontfile='/home/tisaak/.fonts/Inter Tight/InterTight-Medium.ttf'[right2];[left2][right2]vstack,crop=ih*(9/16)" -c:v libx265 -crf 26 -c:a libopus -b:a 32k -vbr on -compression_level 10 -frame_duration 60 -application voip -y output.mkv
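
Since the command above repeats the drawtext options for both speakers, it may be easier to maintain when the filter graph is assembled from variables. The following is a sketch under that assumption; the variable names (CROP, LEFT_NAME, RIGHT_NAME, FONT, STYLE, GRAPH) are illustrative, and the values are the ones used above.

```shell
#!/bin/sh
# Sketch: assemble the filter graph from variables.
# All variable names are illustrative; the values come from the command above.
CROP='crop=1280:352:0:184'   # crop value reported by cropdetect for this recording
LEFT_NAME='Bruno Pešec'
RIGHT_NAME='Isaak Tsalicoglou'
FONT='/home/tisaak/.fonts/Inter Tight/InterTight-Medium.ttf'

# drawtext options shared by both name labels
STYLE="x=w*4/16:y=(h-th)*0.925:fontsize=20:fontcolor=white:borderw=2:bordercolor=black:fontfile='${FONT}'"

# Remove the black bars and split, label each half, then stack and crop to 9:16.
GRAPH="[0:v]${CROP},split=2[left][right];"
GRAPH="${GRAPH}[left]crop=in_w/2:in_h:0:out_h,drawtext=text='${LEFT_NAME}':${STYLE}[left2];"
GRAPH="${GRAPH}[right]crop=in_w/2:in_h:in_w/2:out_h,drawtext=text='${RIGHT_NAME}':${STYLE}[right2];"
GRAPH="${GRAPH}[left2][right2]vstack,crop=ih*(9/16)"

printf '%s\n' "$GRAPH"

# To encode, pass the graph to ffmpeg, for example:
# ffmpeg -i input.mp4 -filter_complex "$GRAPH" -c:v libx265 -crf 26 -c:a libopus -b:a 32k -y output.mkv
```

The script prints the same filter graph as the command above. Names that contain an apostrophe, a colon, or a backslash would need extra escaping for drawtext, which this sketch does not handle.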