audio - FFmpeg x264 Encode Settings

Thursday, 29 August 2019

My new Canon Vixia HF G30 video camera has an option to encode directly to 35 Mbit/s MP4 at 1080p60, and it does a nice job. I primarily use it to record my daughter's high school basketball games.

I need to make highlights from the camera files, and I want to retain as much quality and clarity as possible. Here is the workflow that I have developed with my limited understanding of FFmpeg:

STEP 1.) Join the camera-produced MP4 clips into one MP4 file representing one game. Here is a representative command that I use for this step from a Windows command prompt (DOS box):

ffmpeg -f concat -i "\\mylancomputer\f\Canon\2013.09.21_san_francisco_game2\gameclipsfromcamera.txt" -c copy "\\mylancomputer\f\Canon\2013.09.21_san_francisco_game2\joined_fullgame.mp4"

This joins the clips into a single game video with no transcoding (I believe), which is good: it is quick and introduces no additional compression artifacts.
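The contents of gameclipsfromcamera.txt are not shown above. As a rough sketch, the list file that the concat demuxer reads could be generated like this. The clip names in the usage note are hypothetical, and the quoting rule (single quotes, with an embedded quote written as '\'') follows FFmpeg's documented quoting syntax.

```python
def quote_for_concat(path: str) -> str:
    """Wrap a path in single quotes for a concat demuxer list file."""
    # Inside single quotes everything is literal, so the backslashes of a
    # UNC path need no doubling. An embedded single quote is written '\''.
    return "'" + path.replace("'", "'\\''") + "'"


def build_concat_list(paths) -> str:
    """Return the list-file text: one `file '<path>'` line per clip, in order."""
    return "".join(f"file {quote_for_concat(p)}\n" for p in paths)
```

For example, build_concat_list([r"\\mylancomputer\f\Canon\clip_a.mp4", r"\\mylancomputer\f\Canon\clip_b.mp4"]) yields two `file '...'` lines (clip_a and clip_b are made-up names). Newer FFmpeg builds may reject absolute paths in the list unless `-safe 0` is given before `-i`.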

STEP 2.) Cut highlight clips from the game video. Here is a representative command that I use for this step:

ffmpeg -ss 00:00:06.00 -i "\\mylancomputer\f\Canon\2013.09.21_san_francisco_game2\joined_fullgame.mp4" -acodec copy -vcodec copy  -t 00:00:20.00 "\\mylancomputer\f\Canon\2013.09.21_san_francisco_game2\1.mp4"

This produces the following 20-second clip starting at 6 seconds into the game video:

Highlight Sample (85 MB)
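Since the workflow is batch-driven, the cut command can be assembled programmatically. The sketch below only builds the argument list for the command shown in Step 2; the function name is an assumption for illustration. One caveat, as far as I understand it: with stream copy the cut can only begin on a keyframe, so the clip may not start exactly at the requested time.

```python
def cut_command(src: str, start: str, duration: str, dst: str) -> list:
    """Build the ffmpeg argument list for a stream-copy cut.

    -ss before -i seeks in the input; -t limits the output duration;
    -c copy copies both audio and video without re-encoding.
    """
    return ["ffmpeg", "-ss", start, "-i", src,
            "-t", duration, "-c", "copy", dst]


def cut_commands(src: str, highlights, out_pattern: str = "{n}.mp4") -> list:
    """Build one command per (start, duration) pair, numbering outputs from 1."""
    return [cut_command(src, start, dur, out_pattern.format(n=i))
            for i, (start, dur) in enumerate(highlights, start=1)]
```

Each list can be passed to subprocess.run as-is, which avoids quoting problems with UNC paths.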

STEP 3.) I require a 2-second pause at the beginning of each highlight clip where a text overlay identifies my daughter. To get this I make a bitmap image from the first frame of the clip:

ffmpeg -i "\\mylancomputer\f\Canon\2013.09.21_san_francisco_game2\1.mp4" -ss 00:00:00.00 -f image2 -vframes 1 -deinterlace "\\mylancomputer\f\Canon\2013.09.21_san_francisco_game2\1freezeframe.bmp"

Here is the result.

Then loop the image for 2 seconds, encoding with CRF 0 so that no additional artifacts are introduced:

ffmpeg -loop 1 -r 59.97 -i "\\mylancomputer\f\Canon\2013.09.21_san_francisco_game2\1freezeframe.bmp" -c:v libx264 -crf 0 -pix_fmt yuv420p -t 2 "\\mylancomputer\f\Canon\2013.09.21_san_francisco_game2\1freezeframe.mp4"
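The -r 59.97 value is worth checking against what ffprobe reports for the camera file. The sketch below assumes the camera records at the NTSC rate of 60000/1001 fps (about 59.94), which is an assumption to verify, not something confirmed above. It compares a requested rate with a reported one exactly, using rationals instead of rounded decimals.

```python
from fractions import Fraction

# Assumed camera rate: NTSC "60p" is 60000/1001 fps, roughly 59.94.
NTSC_60P = Fraction(60000, 1001)


def rate_matches(requested: str, reported: str) -> bool:
    """Compare two frame rates exactly.

    Both arguments may be decimals ("59.97") or rationals ("60000/1001"),
    the form in which ffprobe prints r_frame_rate.
    """
    return Fraction(requested) == Fraction(reported)


def frame_duration_ms(rate: str) -> float:
    """Duration of one frame in milliseconds at the given rate."""
    return float(1000 / Fraction(rate))
```

If the rates do differ, note that ffmpeg's -r option also accepts the rational form directly, for example -r 60000/1001.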

Then add the text overlay label identifying Fiona:

ffmpeg -i "\\mylancomputer\f\Canon\2013.09.21_san_francisco_game2\1freezeframe.mp4" -vf drawtext="fontfile=/Windows/Fonts/Corbelb.ttf:text='Fiona':fontsize=40:fontcolor=yellow:x=1321:y=417"  -b:v 35M -minrate 35M -maxrate 35M -bufsize 35M    -profile:v high -level:v 4.2  -refs 2  -pix_fmt yuv420p  -bf 0    -r 59.97 "\\mylancomputer\f\Canon\2013.09.21_san_francisco_game2\1freezeframeannotated.mp4"

Which produces this file: [ ehasamples.excelhero.com/video/1freezeframeannotated.mp4 ]

(SuperUser only allowed me to include 2 live links)

Now we add a silent sound stream to the clip:

ffmpeg -f lavfi -i aevalsrc=0:0:sample_rate=48000 -i "\\mylancomputer\f\Canon\2013.09.21_san_francisco_game2\1freezeframeannotated.mp4" -shortest -c:v copy -c:a aac -b:a 255k -strict -2 "\\mylancomputer\f\Canon\2013.09.21_san_francisco_game2\1freezeframeannotatedwithsilentsound.mp4"

Resulting in this: [ ehasamples.excelhero.com/video/1freezeframeannotatedwithsilentsound.mp4 ]

The result is a 2-second video of my daughter, labeled and frozen in place, with a silent audio stream.

STEP 4.) The final operation is to join the 2-second video to the full clip from the camera. I can easily do this by re-encoding with the concat filter, which produces one output file from the two input files:

ffmpeg -i "\\mylancomputer\f\Canon\2013.09.21_san_francisco_game2\1freezeframeannotatedwithsilentsound.mp4" -i "\\mylancomputer\f\Canon\2013.09.21_san_francisco_game2\1.mp4" -filter_complex "[0:0] [0:1] [1:0] [1:1] concat=n=2:v=1:a=1 [v] [a]" -map "[v]" -map "[a]" -c:v libx264 -r 59.97 "\\mylancomputer\f\Canon\2013.09.21_san_francisco_game2\1done.mp4"

But this is not a great solution. It is slow because the original footage from the camera is re-encoded, which also introduces new compression artifacts. I can add " -crf 0 " to get a lossless re-encode, but this bloats the size of the clip by well over 1000%, in this case bringing our sample clip to over a gigabyte. So this is not a practical solution.
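To put those sizes in perspective, here is a back-of-the-envelope calculation. It uses only the figures quoted above (35 Mbit/s, a 20-second clip, a 22-second result of about a gigabyte) and ignores audio and container overhead, so the numbers are estimates.

```python
def size_mb(bitrate_mbit_per_s: float, seconds: float) -> float:
    """Approximate stream size in megabytes (8 bits per byte, video only)."""
    return bitrate_mbit_per_s * seconds / 8


def implied_bitrate_mbit(size_in_mb: float, seconds: float) -> float:
    """Average bitrate in Mbit/s implied by a file size and duration."""
    return size_in_mb * 8 / seconds


# 20 s at 35 Mbit/s: close to the 85 MB highlight sample.
print(size_mb(35, 20))                           # 87.5
# 22 s at the same rate, i.e. a stream-copied final clip.
print(size_mb(35, 22))                           # 96.25
# A 1000 MB lossless result over 22 s averages roughly 364 Mbit/s.
print(round(implied_bitrate_mbit(1000, 22), 2))  # 363.64
```

In other words, the lossless output runs at roughly ten times the camera's bitrate, which matches the size blow-up described above.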

So I want to use the concat demuxer instead of the filter:

ffmpeg -f concat -i "\\mylancomputer\f\Canon\2013.09.21_san_francisco_game2\clipfiles.txt" -c copy "\\mylancomputer\f\Canon\2013.09.21_san_francisco_game2\1doneBAD.mp4"

With this approach the operation is quick, as there is no re-encoding. However, the resulting file does not play properly. Here it is:

[ ehasamples.excelhero.com/video/1doneBAD.mp4 ]

The 2-second still image is displayed for the entire 22 seconds. Although the video does not play properly, the sound seems to work just fine. So my guess is that the two files (the clip from the camera and the 2-second label clip) differ too much in their encoding parameters. I have tried to make them as close to identical as I can, comparing them with MediaInfo and ffprobe, and the above series of commands is my best attempt, but apparently it is not sufficient.
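Comparing the two files by eye in MediaInfo is error-prone, so a small script can list exactly which stream parameters differ. This is a minimal sketch: the ffprobe flags are standard, but the choice of fields to compare is my assumption about what matters for a stream-copy concat, not a confirmed list.

```python
import json
import subprocess

# Assumed set of fields that should agree between the two clips.
FIELDS = ("codec_name", "profile", "level", "pix_fmt", "width", "height",
          "r_frame_rate", "time_base", "sample_rate", "channels")


def probe_streams(path: str) -> list:
    """Return ffprobe's per-stream metadata for a file as a list of dicts."""
    result = subprocess.run(
        ["ffprobe", "-v", "error", "-show_streams", "-of", "json", path],
        check=True, capture_output=True, text=True)
    return json.loads(result.stdout)["streams"]


def diff_streams(a: dict, b: dict, fields=FIELDS) -> dict:
    """Map each differing field to its (a, b) pair of values."""
    return {f: (a.get(f), b.get(f)) for f in fields if a.get(f) != b.get(f)}
```

Calling diff_streams(probe_streams(camera_clip)[0], probe_streams(label_clip)[0]) returns an empty dict when every compared field matches for the first stream of each file. Any field it reports, such as a different time_base or r_frame_rate, is a candidate explanation for the broken playback.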

So the question I have is simply this: is it possible with FFmpeg to make my labeled clip enough like the encoded footage from the camera for the concat demuxer to succeed? If so, how?

This entire workflow is driven by a batch file and takes very little effort on my part, so I like it and hope to keep using it to quickly edit highlights from games going forward.

Alternatively, is there a better (quicker, no transcoding) workflow using FFmpeg?

Thank you kindly for your assistance.
