In Art Spotlight, we invite Sketchfab artists to talk about one of their designs.
My name is Rich Oglesby, from the UK. I do not consider myself an artist, but I do have a keen interest in art and technology, and I run a Tumblr blog in this field called Prosthetic Knowledge, which covers subjects including computational photography.
Like everyone on the internet over the past few months, I found it easy to come across a #MannequinChallenge video: a one-take shot of people holding a variety of poses, completely still. It dawned on me (and on others on Twitter) that these videos could be ideal source material for creating 3D content through photogrammetry, where software such as Agisoft PhotoScan or RealityCapture constructs a 3D model from a series of photographs of a subject taken at various angles. Yet, as far as I could see, no one was attempting it. I decided to have a go myself, despite having no experience in 3D arts and no budget: only a PC adequate for gaming with an Nvidia card, a lot of patience, and the latest A Tribe Called Quest album. I thought this would be a week-long project, but it turned into four weeks…
First, I needed the right video, and for the purpose of this experiment not many videos worked: most had shakiness and blurring from fast camera turns during the shot, some had bad lighting, and the compression applied to uploaded video diminished the image quality. HD videos, or videos shot at a measured pace, seemed to work well. Instagram videos were more miss than hit, but YouTube was much better.
Next was the right software to use. My benchmark would be this Hillary Clinton video:
I tried various pieces of software and services, with varied results: some handled parts of the video well but not the whole thing, and some just produced a geometric blob. The one piece of software that managed to pull it off was COLMAP by Johannes L. Schönberger, which was put together as part of a computer vision research project and can, amazingly, create models from huge datasets of images from the internet (like this section of Rome created with 75,000 online images). Here is what COLMAP produced:
COLMAP is free and available for PC, Mac and Linux, both as open source code and as pre-built binaries. If you intend to do dense reconstruction, however, you will need an Nvidia graphics card, and generally speaking the more powerful your computer setup, the better.
With the successful results from the Hillary video, I was on the lookout for another video of similar quality, and not long after, this video of the Cleveland Cavaliers at the White House with Michelle Obama appeared. I had a really good hunch that this one could also work:
First you will need to download the video (a quick Google search will tell you how) and split it into individual frames (again, Google if you don't know how); it is best to place these in a folder of their own. When you start a new project in COLMAP, you will be asked to create a project file (an .ini file), which you can name, and to point it at the directory where the images are stored. Once you have this set up, you are ready to start. I highly recommend reading the COLMAP documentation to familiarize yourself with the process.
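If you have the video saved locally, the free command-line tool ffmpeg is one common way to split it into frames. A minimal sketch, assuming the file is called clip.mp4 (the file and folder names here are placeholders):

```shell
# Create a folder of its own for the frames.
mkdir -p frames

# Export every frame as a high-quality JPEG, numbered sequentially.
# -qscale:v 2 keeps JPEG compression artifacts low.
ffmpeg -i clip.mp4 -qscale:v 2 frames/frame_%04d.jpg
```

If the video has far more frames than you need, adding a filter such as `-vf fps=10` before the output pattern will thin them out to a fixed rate.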
Here is an official video demonstrating the process of using COLMAP which I will briefly describe below:
In basic terms, the first step is Feature Extraction, a process which goes through all the images looking for points of interest. For ‘Camera Model’, it is probably best to use SIMPLE_RADIAL, as it is suited to image collections like this (such as internet photos). Also select ‘Single camera for all images’. Below is a video compiled from the image-marking process; note the red dots spread over each image, which are the detected keypoints:
The next step is Feature Matching: all of those red keypoints are now connected to each other from image to image. There are various options. Sequential works out the connections between images in order, which suits frames taken one after another, as in a video; it is usually faster than the other modes but not always successful. Exhaustive is probably best, as it compares every image against every other image when searching for connections, but it takes longer. Depending on how many images you have and how many points have been found, it could take a long time, so be prepared to wait.
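For those who prefer the command line, COLMAP exposes these same two steps as subcommands. A rough sketch, assuming your frames are in a `frames/` folder and using `project.db` as a placeholder database name (exact option names can differ between COLMAP versions, so check the documentation for yours):

```shell
# Feature extraction: detect keypoints in every image,
# using SIMPLE_RADIAL and one shared camera for all frames.
colmap feature_extractor \
    --database_path project.db \
    --image_path frames \
    --ImageReader.camera_model SIMPLE_RADIAL \
    --ImageReader.single_camera 1

# Feature matching, either sequential (fast; suits ordered video frames)...
colmap sequential_matcher --database_path project.db

# ...or exhaustive (slower; compares every pair of images).
colmap exhaustive_matcher --database_path project.db
```

You would normally run only one of the two matchers, not both.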
Once this has completed, you can start a sparse reconstruction, either from the ‘Reconstruction’ drop-down menu or via what looks like a triangular play button. Now… this will also take time (it certainly did on my PC). How long depends mainly on the number of images in the collection and how big each image is. In the case of the Cavaliers video, there were 730 individual images, each 1280 by 720 pixels. If I remember correctly, this one took a couple of hours to produce a sparse model.
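The same sparse reconstruction step can be run from the command line. A sketch, reusing the placeholder `project.db` and `frames/` names and writing results into a new `sparse/` folder (again, option names may vary by COLMAP version):

```shell
# Sparse reconstruction: the CLI equivalent of the GUI's play button.
mkdir -p sparse
colmap mapper \
    --database_path project.db \
    --image_path frames \
    --output_path sparse
```

COLMAP may produce several numbered sub-models inside `sparse/` (0, 1, …) if it cannot register every image into one reconstruction.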
Embedded below is what it came up with:
As you can see, it creates a basic representation of the scene: enough to see forms in three dimensions as rough outlines. You can save your model as well as export the point cloud in various formats; I would recommend the .ply format (the embedded Sketchfab example above is an uploaded .ply file).
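For reference, .ply is a refreshingly simple format: the ASCII variant starts with a small header describing the per-vertex properties, followed by one point per line. This toy script writes a three-point coloured cloud so you can see the layout (the exact property list varies; x/y/z plus RGB is just one common arrangement):

```shell
# Write a minimal ASCII .ply point cloud: 3 points with RGB colour.
cat > points.ply <<'EOF'
ply
format ascii 1.0
element vertex 3
property float x
property float y
property float z
property uchar red
property uchar green
property uchar blue
end_header
0.0 0.0 0.0 255 0 0
1.0 0.5 0.2 0 255 0
0.3 0.9 1.0 0 0 255
EOF

# Count the data lines after the header; it should match 'element vertex 3'.
sed '1,/end_header/d' points.ply | wc -l
```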
With these calculations done, we can now proceed to dense reconstruction. I am not 100% sure whether all versions can produce dense reconstructions, as I have only used the PC version; I may have read somewhere that this requires the CUDA-enabled build, but I cannot confirm that at the time of writing. I also have to stress that some of the steps from here on can take an extremely long time and eat up all the processing power on your PC, so close any other applications, including browsers, as they will be unusable.
First, go to ‘Reconstruction’, then ‘Multi-View Stereo’. A new window will open with five buttons at the top left and a ‘Workspace’ section at the top right: click the ‘Select’ button and a file dialog will let you create or choose a folder to hold all the new files for constructing the dense point cloud. (Important note: make sure you have plenty of free disk space; the next phase produced 22 GB of data with this model.) Once you have done that, select ‘Undistortion’. On my PC this took a couple of hours. Next, click ‘Stereo’. This phase is the longest of them all, and it is probably best to leave your PC running overnight while it processes. Also worth noting: at this point, if you do not have enough memory, COLMAP could crash. One way around this is to go to ‘Options’, find ‘max_image_size’ in the ‘Stereo’ tab, and reduce the number until it works.
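These two GUI steps also have command-line equivalents. A hedged sketch, continuing from the placeholder `frames/` and `sparse/` folders above; subcommand and option names have changed between COLMAP releases (older versions used different names for the stereo step), so treat this as a guide and verify against your version's help output:

```shell
# Undistortion: prepare images and folder layout for dense reconstruction.
mkdir -p dense
colmap image_undistorter \
    --image_path frames \
    --input_path sparse/0 \
    --output_path dense

# Stereo: compute a depth map per image (the long, CUDA-dependent phase).
colmap patch_match_stereo \
    --workspace_path dense
# If COLMAP runs out of memory here, cap the working image size,
# the CLI counterpart of reducing 'max_image_size' in the GUI, e.g.:
#   --PatchMatchStereo.max_image_size 1000
```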
You will notice, when these steps are completed, that the once-grey boxes are now selectable, and you can view the visualized depth calculations for each image. Below is a gif comparing one frame before and after these steps:
Once these steps are done, things get much easier. Select ‘Fusion’, a relatively quick step, and you will then see the dense point cloud reconstruction in the main window. You can then select ‘Mesh’ to finalize the point cloud, and save the model (or export it in another format; again, I recommend .ply).
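The fusion and meshing steps can likewise be scripted. A sketch using the placeholder `dense/` workspace from before (file names are illustrative, and option names may differ by COLMAP version):

```shell
# Fusion: merge the per-image depth maps into one dense point cloud.
colmap stereo_fusion \
    --workspace_path dense \
    --output_path dense/fused.ply

# Meshing: turn the fused point cloud into a surface mesh.
colmap poisson_mesher \
    --input_path dense/fused.ply \
    --output_path dense/meshed.ply
```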
Now… the dense point cloud will be a big file (the .ply file in this example ended up being 2.56 MB), and there will be stray points that can be removed. If you already use 3D software that can edit point clouds, that is probably your best choice. For me, with no experience and no budget, it required a bit of homework. MeshLab was the choice of many online tutorials, but in my experience it just did not work (and I am sure it did not like the exported .ply file). What did prove useful was a free, open source program for processing point cloud data called CloudCompare by Daniel Girardeau-Montaut. With this software I could easily trim the unnecessary points from the model (apologies in advance for the mundane demonstration video below):
You can now save the edited point cloud (again, in .ply format). One thing that might be worth further investigation is that CloudCompare can also turn point clouds into 3D meshes. I have no idea how to do this correctly, but if you are familiar with 3D modelling you may be able to work it out, should you want to.
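CloudCompare also has a command-line mode, which can be handy for batch clean-up. As a rough sketch only (I have not verified this against every release, and flag names and argument order vary between CloudCompare versions, so check its command-line wiki), a statistical outlier removal pass over an exported cloud might look like:

```shell
# Remove stray points with CloudCompare's Statistical Outlier Removal
# filter, then save the result back out as .ply.
# The -SOR arguments are: number of neighbours, sigma multiplier.
CloudCompare -SILENT \
    -O fused.ply \
    -SOR 6 1.0 \
    -C_EXPORT_FMT PLY \
    -SAVE_CLOUDS
```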
Once I had managed to put a few models together, and after spending more time than I anticipated on this project, I wanted to present the idea to the internet, with a little hope that it could inspire people and help them appreciate how photography is moving forward. For some reason I find point clouds very interesting and beautiful, a sort of three-dimensional pointillism, and it could be argued that they deserve to become an artistic genre in some sense. Sketchfab could be a great canvas for framing these works, making them far more accessible than they currently are.
Another interesting idea that came from developing these works relates to the history of photography. In its early days, photography required the subject, or sitter, to stay absolutely still for minutes to capture a single image, and this project became a contemporary version of that thanks to the #MannequinChallenge meme. 3D photography is on its way to consumers through smartphone technology, most notably via the depth sensors in recent phones with Project Tango tech, and as demonstrated by Microsoft’s Creators Update for Windows, coming at the start of next year. Future memories and portraits can be captured in 3D, and with Sketchfab’s VR feature we can walk around them. Sketchfab is providing a taste of the future…