Note: rAV1ator isn't compared in this test because it is based on different underlying technologies that would be hard to compare directly to Aviator & Handbrake, especially for very short clips.
This blog post compares Aviator 0.2.0 & Handbrake 1.6.1 using two sources that represent different use cases: the 2160p crowd_run.y4m source from Derf's Test Media, and a short clip of an animated sequence from a popular animated show.
This is a relevant comparison because both Aviator & Handbrake use the SVT-AV1 encoder under the hood, and both can encode videos with 10 bit color (default for Aviator), which should increase the visual quality of the output. Because the two programs are so similar when outputting AV1 video, I went into the test expecting to be unable to tell them apart by their scores. I was mistaken.
Aviator settings:

- Resolution: 3840x2160 for crowd_run, 1920x1080 for animation
- CRF: 25-50 (10-60 for the expanded set)
- Speed: 6 for crowd_run, 4 for animation
- Container: MKV

Handbrake settings:

- Format: Matroska
- Resolution Limit: None (defaults to same as Aviator)
- All Filters Off
- Video Encoder: AV1 10-bit (SVT)
- Framerate: Same as Source
- RF: 25-50 (10-60 for the expanded set)
- Speed: 6 for crowd_run, 4 for animation

These tests were run using the SSIMULACRA2 visual quality metric via ssimulacra2_rs. SSIMULACRA2 is designed to track perceived visual quality far more closely than VMAF, SSIM, PSNR, & other less effective alternatives.
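For anyone scripting the scoring step, here is a minimal sketch of wrapping ssimulacra2_rs from Python. It assumes the tool is on your PATH and accepts a `video <source> <distorted>` invocation; check `ssimulacra2_rs --help` for your build, since the subcommand layout is an assumption here. The helper names and the file names in the usage comment are hypothetical, and the output is returned unparsed because its exact format may differ between versions.

```python
import subprocess


def ssimu2_video_cmd(source, distorted, binary="ssimulacra2_rs"):
    """Build the argument list for comparing an encode against its source."""
    # Assumption: the CLI takes `video <source> <distorted>`.
    # Confirm against `ssimulacra2_rs --help` before relying on this.
    return [binary, "video", str(source), str(distorted)]


def run_ssimu2(source, distorted):
    """Run the comparison and return the tool's raw text output."""
    result = subprocess.run(
        ssimu2_video_cmd(source, distorted),
        capture_output=True,
        text=True,
        check=True,  # raise if the tool exits with an error
    )
    return result.stdout


# Hypothetical usage (file names are placeholders):
# print(run_ssimu2("crowd_run.y4m", "crowd_run_crf30.mkv"))
```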
First, the crowd_run results from CRF 25 through 50.
This is a clear and decisive victory for Aviator by a larger margin than I initially thought possible between the two utilities, considering that they're both using 10 bit SVT-AV1. Aviator's out-of-the-box tuning for visual quality has paid off.
It is important to note that these are incredibly high bitrates. The scene is very complex, and at 2160p50 it needs a lot of bits to reach a watchable level of visual quality.
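If you want to check bitrates on your own encodes, the average bitrate is just file size over duration. A minimal sketch follows; the function name is hypothetical, and the numbers in the usage line are placeholders rather than results from this test. Note that using the file size includes a small amount of container overhead.

```python
def average_bitrate_kbps(size_bytes, frame_count, fps):
    """Average bitrate in kilobits per second for a clip of known length."""
    if frame_count <= 0 or fps <= 0:
        raise ValueError("frame_count and fps must be positive")
    duration_s = frame_count / fps
    # bytes -> bits, divided by seconds, then scaled to kilobits
    return size_bytes * 8 / duration_s / 1000


# Placeholder numbers for illustration only: a 125 MB file
# holding 500 frames at 50 fps (10 seconds of video).
print(average_bitrate_kbps(125_000_000, 500, 50))  # 100000.0
```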
SVT-AV1 defaults to CRF 35 internally, while Aviator defaults to CRF 32 & Handbrake defaults to RF 30. The results above were gathered in increments of 5 from CRF/RF 25 through 50 (25, 30, 35, etc). In order to get the bigger picture, I tested a wider quality range from 10 through 60. This dips into the realm of impracticality a bit, given the obscene bitrates approached at lower CRF/RF values & the relatively low quality image produced at higher ones.
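To replicate the sweep, the set of encodes can be enumerated up front. This is only a sketch: the names are hypothetical, the speeds come from the settings listed earlier, and the step of 5 for the expanded 10 through 60 range is an assumption, since only the 25 through 50 range is stated to use increments of 5.

```python
ENCODERS = ("Aviator", "Handbrake")
SPEED_BY_SOURCE = {"crowd_run": 6, "animation": 4}  # from the settings lists


def crf_sweep(start, stop, step=5):
    """Inclusive list of CRF/RF values, e.g. 25, 30, ..., 50."""
    return list(range(start, stop + 1, step))


def build_jobs(source, crf_values):
    """One job per encoder and CRF/RF value for the given source."""
    speed = SPEED_BY_SOURCE[source]
    return [
        {"encoder": enc, "source": source, "speed": speed, "crf": crf}
        for enc in ENCODERS
        for crf in crf_values
    ]


standard = build_jobs("crowd_run", crf_sweep(25, 50))
# Assumption: the expanded set also uses a step of 5.
expanded = build_jobs("crowd_run", crf_sweep(10, 60))
print(len(standard), len(expanded))  # 12 22
```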
Here are the expanded results, with CRF 10 through 60.
Here, we see the two converge at both the lowest & highest quality levels. While Aviator has a tiny advantage at lower quality, Handbrake looks to take the lead by an almost imperceptible margin at higher quality. For this source, because the bitrate skyrockets below CRF/RF 25 & the quality plummets above CRF/RF 50, I would consider this a win for Aviator in the range I'd consider usable, which encompasses the default quality levels for Aviator, Handbrake, & SVT-AV1's stock behavior.
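Since the same CRF/RF value doesn't produce the same file size in both programs, curves like these are best compared at matched bitrates rather than matched CRF. A minimal sketch of that comparison using linear interpolation between measured points is below. The function name is hypothetical and the data points are made up purely to show the mechanics; they are not results from this test.

```python
from bisect import bisect_left


def score_at_bitrate(points, bitrate_kbps):
    """Linearly interpolate a score from (bitrate_kbps, score) points."""
    pts = sorted(points)
    rates = [rate for rate, _ in pts]
    if not rates[0] <= bitrate_kbps <= rates[-1]:
        raise ValueError("bitrate outside measured range; not extrapolating")
    i = bisect_left(rates, bitrate_kbps)
    if rates[i] == bitrate_kbps:
        return pts[i][1]  # exact measurement available
    (r0, s0), (r1, s1) = pts[i - 1], pts[i]
    return s0 + (s1 - s0) * (bitrate_kbps - r0) / (r1 - r0)


# Made-up curves for illustration only, NOT measured data.
curve_a = [(1000, 60.0), (2000, 70.0)]
curve_b = [(1000, 58.0), (3000, 72.0)]
gap = score_at_bitrate(curve_a, 1500) - score_at_bitrate(curve_b, 1500)
print(gap)  # 3.5
```

Refusing to extrapolate is a deliberate choice here: scores outside the measured bitrate range would be guesses, which matters most at the extremes where the two curves converge.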
For the animation test, CRF 25 through 50 were tested.
This shows less of a performance delta than the more lifelike crowd_run source, and the bitrates land at a more acceptable level that is typical for an animated 1080p24 source. Aviator still has an advantage here, with the gap widening at slightly higher bitrates.
It is clear that Aviator's prioritization of visual quality performance has paid off, even with SSIMULACRA2 being a synthetic benchmark. It appears that Handbrake is only worth using when dipping below CRF/RF 20, but when fine detail preservation at very high bitrate is a priority it may be worth using another codec (which Handbrake gives you the option to do, given its diverse selection of codecs besides AV1). Even then, the quality difference is minute & may vary between sources. Aviator is the undisputed AV1 champ between the two, and appears to win in situations where AV1 is most useful.
While it is hard to benchmark, it is worth mentioning that Aviator supports film grain synthesis ("Grain Synth") while Handbrake does not. This can improve the visual quality of any source with grain present by removing the grain before encoding & reapplying a synthesized version at decode time. This allows the encoder to spend fewer bits compressing grain (which is notoriously difficult to compress) and instead apply it artificially with little to no discernible difference to the viewer. The crowd_run source (or any live action source that hasn't been heavily denoised) has grain present, and while SSIMULACRA2 doesn't totally understand the benefits of grain synthesis, the advantage it offers is clear to even the untrained eye.
If you'd like the encoded clips I used to run this test, I can provide the crowd_run encodes. Otherwise, feel free to replicate the crowd_run segment for yourself, or the animation segment using another animated source of your choosing. Thanks for reading!