Zephyr, MetaShape, Reality Capture: Time and Quality Comparison

When you ask on a forum “Hey, what is the best photogrammetry software to use?” most people will say “Use Reality Capture”, “I use Agisoft software”, or “Go and try Zephyr” (those are paid solutions), or they will name one of many open source projects or less popular commercial programs. The Internet is full of models created with those 3 paid programs, but there are few comparison tests based on the same photo set that show the final results side by side. Maybe one will give a better mesh? Maybe another will produce a better texture? What about calculation times?

Hi, my name is Jurand, and in this article I would like to share the results of my comparison test between the 3 most popular paid photogrammetry solutions I have encountered so far. It is good to review strengths and weaknesses before buying a product, and I hope this article will provide some answers.

Comparison Parameters

Before we start digging into numbers and photos, I would like to tell you a few things. First, I’m not an option tweaker; I like to use products out of the box and only change settings if the results are not good enough. Second, people who have worked with photogrammetry longer (more than a year) strongly suggest comparing results only with perfectly crisp, beautiful photo sets. Maybe, but I believe that most of us who decide to buy one of those programs (Zephyr, MetaShape or Reality Capture) are just beginning our photogrammetry adventure. The photo sets that we usually use come from phones, family-trip-grade cameras, or drone videos recorded on sunny or rainy days, and that’s it – we were there once and it’s not possible to go back and get better images.

In the rest of this article you will see two tests: one with a professional quality photo set of a skull that I took with my work camera (Nikon D610+Tamron 28-75), but quickly and without earlier planning (I received that epoxy skull for just 5 minutes), and a second where I process photos extracted from somebody’s YouTube video (bad recording quality + YouTube compression + photos saved as JPG, which adds extra compression artifacts). All photo sets are available to download – the links are in the Sketchfab model description.

What I tested: basically, how 3DFlow Zephyr Lite, Agisoft MetaShape Standard and Reality Capture work on the same photo sets prepared in 3 ways, and how that photo preparation affects processing times, quality, and final texture.

Ok, so what are you going to see beyond a few models and photos? As I mentioned, a very important thing: processing times (for test 2). Those times come from the same or similar stages in each program that you need to go through to get 3D textured objects from 2D images. I can’t say how long it will take to process a garden gnome set on your computer, but you will see the relative processing times of the 3 programs, and that should give you a rough idea of what to expect when it comes to time vs. quality. My workstation specification is below – no overclocking.

PC configuration

OS: Windows 10
CPU: AMD Threadripper 1950X (16 cores, 32 threads)
GPU: GTX 1080 Ti
RAM: 64GB 2400MHz

Test 1: Skull

Objective:

This one was a quick test to check how masking and permanent removal of picture information (by deleting it) affects final model quality, and, of course, which program provides the best mesh quality.

About object and capture session:

The test object was a fair quality epoxy (I believe) skull. Compared to real skulls (I have held those in my hands many times in the past), its accuracy was about 6/10 – many nice details but far from an anatomically believable copy. The photos were taken quickly when I received the skull for a few minutes while I was in the office. Because the opportunity was unexpected, I decided that a colored chair would be the best “green screen” to use for future masking. I set the camera up on a tripod and used a remote for the shutter.

Aperture: f/11
Shutter speed: 1s
ISO: 100
Number of photos: Whole head 71, bottom part 46, jaw bone 26

First row: raw input image, and the color-corrected image with the background removed. Second row: bottom of the skull and the jawbone.

After a basic photography session from 3 different camera heights, I removed the jaw bone to take photos of the bottom part of the skull and of the jaw bone itself from various angles, to catch parts that were not visible during the normal capture session. I used an X-Rite passport for future color correction – it’s a very useful tool when you have no control over the surrounding light.

Methodology:

The goal of these tests was to check how each of those 3 programs would process (on default medium and high settings) the same photo set with the background information removed completely. First, I fixed colors and removed shadows from the RAW images using RawTherapee. The photos were saved as PNG files. Then, in Affinity Photo, I removed all non-essential things like the background or my hand (in cases where I was holding the skull during rotation). The photos were batch saved as PNG files with an alpha channel.
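I did the masking by hand in Affinity Photo, but the idea behind it can be shown in a few lines. The sketch below is only an illustration, not my actual workflow: the function names, the chair color, the sample pixels and the tolerance are all made-up values. It marks every pixel close to a chosen background color as fully transparent and also wipes its color data, which is what I mean by removing the information completely.

```python
from typing import List, Tuple

RGB = Tuple[int, int, int]
RGBA = Tuple[int, int, int, int]


def color_distance_sq(a: RGB, b: RGB) -> int:
    """Squared Euclidean distance between two RGB colors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mask_background(pixels: List[RGB], key_color: RGB, tolerance: int) -> List[RGBA]:
    """Return RGBA pixels; anything close to key_color becomes fully transparent."""
    limit = tolerance ** 2
    masked: List[RGBA] = []
    for pixel in pixels:
        if color_distance_sq(pixel, key_color) <= limit:
            # Background: delete the color data too, not just the visibility.
            masked.append((0, 0, 0, 0))
        else:
            # Object: keep the color and make the pixel fully opaque.
            masked.append(pixel + (255,))
    return masked


# Hypothetical values: a reddish "chair" color and two sample pixels.
chair = (200, 30, 40)
row = [(205, 28, 45), (230, 225, 210)]
print(mask_background(row, chair, tolerance=30))
```

A simple color threshold like this would struggle with shadows and reflections on a real chair, which is one reason a manual pass in an image editor is still worth the time.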

Photogrammetry software: Medium / High processing means that all steps run at the same level. For medium, for example: photo alignment – medium, dense cloud computing – medium, mesh extraction – medium. No texture extraction at this time.
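To make the "same level for every step" rule concrete, here is a small sketch of it as a settings table. The step names and the function are my own labels for illustration; they are not real option names from Zephyr, MetaShape or Reality Capture, where each program has its own naming for these stages.

```python
# Hypothetical step names; each program calls these stages something different.
STEPS = ("photo_alignment", "dense_cloud", "mesh_extraction")
LEVELS = ("medium", "high")


def uniform_preset(level: str) -> dict:
    """Build a preset where every processing step uses the same quality level."""
    if level not in LEVELS:
        raise ValueError(f"unknown level: {level!r}")
    return {step: level for step in STEPS}


for level in LEVELS:
    print(level, uniform_preset(level))
```

Texture extraction is left out of the steps on purpose, since it was not part of this test.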

All models were combined in ZBrush into one mesh scene and decimated from 12,000k points to 1,299k points to fit Sketchfab’s free file upload limit of 50MB.

Raw vs. decimated mesh

The decimation was necessary only to fit the Sketchfab free account upload size limit. ZBrush can decimate a mesh into unevenly sized triangles, which reduces the polycount in places where information is lacking anyway (for example, flat surfaces) and preserves the maximum level of detail elsewhere.

Flat surface without information has a lower polygon density
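To put the decimation into perspective, the short sketch below works out how much of the mesh survived, using the two point counts given above (12,000k before, 1,299k after). The function name and the output keys are my own.

```python
def decimation_summary(points_before: int, points_after: int) -> dict:
    """Summarize how strongly a mesh was decimated."""
    kept = points_after / points_before
    return {
        "kept_percent": round(kept * 100, 1),
        "removed_percent": round((1 - kept) * 100, 1),
        "reduction_factor": round(points_before / points_after, 1),
    }


# Point counts from the ZBrush decimation described above.
print(decimation_summary(12_000_000, 1_299_000))
```

So only about 10.8% of the points remain, roughly a 9.2x reduction, which is why the uneven triangle sizing matters so much for keeping the fine details.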

Results:

The first important thing to notice is that all programs were able to create a single mesh from 3 data sets (whole skull, bottom part of the skull, and jaw bone). In all cases, all photos aligned on the first attempt.

The second thing worth noting is the mesh quality. You should decide for yourself which model looks best, but please take a look at these mesh parts:

Jaw bone – especially the top part where it connects with the skull
Tooth line – remember that the original model has a closed jaw, but the jaw received a separate session to get some tooth information, so in theory there should be a gap between the upper and lower teeth.
Top front part of the skull – apparently the photo set wasn’t perfect and not enough photo information was provided for the forehead.
Cranial sutures – depth and quality

Conclusion:

Hard removal of information (by deleting it) and replacing it with an alpha channel helps with connecting separate structures. There are visible and important differences between the programs.