Efficient Large Scale Multi-View Stereo for Ultra High Resolution Image Sets
We present a new approach for large scale multi-view stereo
matching, which is designed to operate on ultra high resolution
image sets and efficiently compute dense 3D point clouds. We show
that, by using a robust descriptor for matching purposes and high
resolution images, we can skip the computationally expensive steps
other algorithms require. As a result, our method has low memory
requirements and low computational complexity while producing 3D
point clouds containing virtually no outliers. This makes it
exceedingly suitable for large scale reconstruction. The core of
our algorithm is the dense matching of image pairs using DAISY
descriptors, implemented so as to eliminate redundancies and
optimize memory access. We use a variety of challenging data sets
to validate and compare our results against other
algorithms. Here, we present some of our results. Note that all
the results shown here are point cloud renderings.
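To make the idea of dense descriptor matching between an image pair concrete, here is a minimal sketch. It is not the implementation described above: the descriptor is a normalized intensity patch standing in for DAISY, the search runs along a single image row as in a rectified stereo pair, and all function names (`patch_descriptor`, `compute_descriptors`, `match_row`, `make_shifted_pair`) are hypothetical. It does illustrate one point from the text: descriptors are computed once per pixel and then reused for every candidate match, which avoids redundant work.

```python
import math
import random


def patch_descriptor(img, x, y, radius=1):
    """Stand-in descriptor: L2-normalized intensities of a small window.

    DAISY instead aggregates histograms of oriented gradients; this toy
    version only keeps the "one vector per pixel" interface.
    """
    h, w = len(img), len(img[0])
    vals = []
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            yy = min(max(y + dy, 0), h - 1)  # clamp at image borders
            xx = min(max(x + dx, 0), w - 1)
            vals.append(float(img[yy][xx]))
    norm = math.sqrt(sum(v * v for v in vals)) or 1.0
    return [v / norm for v in vals]


def compute_descriptors(img, radius=1):
    """Compute every descriptor once so the matching loop can reuse them."""
    return [[patch_descriptor(img, x, y, radius) for x in range(len(img[0]))]
            for y in range(len(img))]


def match_row(desc_left, desc_right, y, max_disp):
    """For each pixel of row y in the left image, pick the disparity d in
    [0, max_disp] whose right-image descriptor at x - d is closest (squared
    L2 distance)."""
    disparities = []
    for x, d_left in enumerate(desc_left[y]):
        best_d, best_cost = 0, float("inf")
        for d in range(0, min(max_disp, x) + 1):
            d_right = desc_right[y][x - d]
            cost = sum((a - b) ** 2 for a, b in zip(d_left, d_right))
            if cost < best_cost:
                best_d, best_cost = d, cost
        disparities.append(best_d)
    return disparities


def make_shifted_pair(width=12, height=5, shift=2, seed=0):
    """Synthetic test pair: the left image is the right image moved by
    `shift` pixels, so the true disparity is known."""
    rng = random.Random(seed)
    right = [[rng.randint(0, 255) for _ in range(width)] for _ in range(height)]
    left = [[row[x - shift] if x >= shift else 0 for x in range(width)]
            for row in right]
    return left, right


left, right = make_shifted_pair()
disparities = match_row(compute_descriptors(left), compute_descriptors(right),
                        y=2, max_disp=4)
print(disparities)
```

On the synthetic pair, the disparities of pixels away from the left and right borders should equal the shift used to build the pair. A real pipeline would add the consistency checks and the calibrated epipolar geometry that this sketch leaves out.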
For some sequences, the results are rendered in two modalities. Blue colored
frames show point clouds shaded with respect to their estimated normals, while
normally colored frames are rendered by looking up the original color of each
point in the input images.
Videos are compressed using the MS MPEG4 video codec and were tested on Linux,
Mac and Windows machines with the Firefox, IE 8.0 and Safari browsers.
There might be slight distortions due to compression. For best viewing,
please download the videos using the 'Save Link As...' option of your browser.
The sequence contains 127 18-Megapixel images of a statue at different
scales. The final point cloud contains 15.3 million points and was computed in
29.5 minutes. Click on the image for the video of the colorized point
cloud.
The data set consists of 31 40-Megapixel images of the EPFL campus taken from a
helicopter, which is the highest resolution we tested our algorithm
on. The campus is fully reconstructed including trees, grass walkways,
parked cars, and train tracks. The only exceptions are some building facades
that were not seen from any viewpoint. The reconstructed point cloud contains
approximately 11.35 million points. Click on the image for the video of the
point cloud.
The sequence contains 1302 21-Megapixel images of a cathedral shot from
ground level. The final point cloud contains 148.2 million points and was
computed in 419 minutes. Click on the image for the point cloud
video.
The data set contains 61 24-Megapixel images of the Lausanne cathedral and its
immediate surroundings shot from an airplane. The final point cloud contains
12.7 million points and was computed in 22.1 minutes. Click on the image
for the video of the colorized point cloud.
The sequence contains 214 18-Megapixel images of a building pillar. The final
point cloud contains 63.1 million points and was computed in 48.9 minutes. Click
on the image for the video of the colorized point cloud.
This is the largest data set we tested our algorithm on. It contains 3504
6-Megapixel images and 980 21-Megapixel images of the downtown area of
Lausanne, seen at different scales. There is much clutter, such as people
and cars, and some images were taken at different times of day. Computation
took 1632 minutes and the final cloud contains 272 million points. This may
seem long, but it amounts to only about 27 hours, a little over one day, on a
single PC rather than a cluster, and without GPU processing. The colors in
the video represent the accuracy estimates, where dark red is uncertain and
yellow is very certain. Click on the image for the video.
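The timings reported above are easier to compare once converted to hours and to points per minute. The short sketch below does that arithmetic using only the figures quoted on this page; the sequence labels and the `Run` class are hypothetical names introduced for illustration, and the EPFL campus data set is left out because no computation time is given for it.

```python
from dataclasses import dataclass


@dataclass
class Run:
    """One reconstruction run, with the numbers quoted in the text."""
    label: str              # descriptive label, not an official sequence name
    images: int
    points_millions: float
    minutes: float

    @property
    def hours(self) -> float:
        return self.minutes / 60.0

    @property
    def points_per_minute(self) -> float:
        return self.points_millions * 1e6 / self.minutes


RUNS = [
    Run("statue", 127, 15.3, 29.5),
    Run("cathedral, ground level", 1302, 148.2, 419.0),
    Run("cathedral, aerial", 61, 12.7, 22.1),
    Run("building pillar", 214, 63.1, 48.9),
    Run("downtown Lausanne", 3504 + 980, 272.0, 1632.0),
]


def summarize(runs):
    """Return one formatted line per run."""
    return [
        f"{r.label}: {r.images} images, {r.hours:.1f} h, "
        f"{r.points_per_minute:,.0f} points/min"
        for r in runs
    ]


for line in summarize(RUNS):
    print(line)
```

For the downtown data set this gives 1632 / 60 = 27.2 hours, consistent with the "about 27 hours" stated above.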
Data
We will be sharing some of the data sets here in the future. Stay tuned.
Software
We will be sharing our software here in the future. Stay tuned.
References
Main Reference
Efficient Large Scale Multi-View Stereo for Ultra High Resolution Image Sets
Engin Tola, Christoph Strecha, Pascal Fua. Machine Vision and Applications, 2011.
Preprint | Online Version
Related References
DAISY: An Efficient Dense Descriptor Applied to Wide
Baseline Stereo
Engin Tola, Vincent Lepetit, Pascal Fua. IEEE Transactions on Pattern Analysis and Machine Intelligence, May 2010.
website | pdf | slides (9.5MB)
@article{Tola10,
author = "E. Tola and V. Lepetit and P. Fua",
title = {{DAISY: An Efficient Dense Descriptor Applied to Wide
Baseline Stereo}},
journal = "IEEE Transactions on Pattern Analysis and Machine Intelligence",
year = 2010,
month = "May",
pages = "815--830",
volume = "32",
number = "5"
}
A Fast Local Descriptor for Dense Matching
Engin Tola, Vincent Lepetit, Pascal Fua. Proceedings of Computer Vision and Pattern Recognition 2008, Alaska, USA, June 2008.
website | pdf | slides (9.5MB)
@inproceedings{Tola08,
author = "E. Tola and V. Lepetit and P. Fua",
title = {{A Fast Local Descriptor for Dense Matching}},
booktitle = "Proceedings of Computer Vision and Pattern Recognition",
year = 2008,
address = "Alaska, USA"
}