Algorithms & Perceptual Analysis for Interactive Free Viewpoint Image-Based Navigation


We present image-based rendering approaches that allow free viewpoint walkthroughs of urban scenes using just a few photographs as input. Commercial applications such as Google Street View and Bing Maps use rudimentary forms of image-based rendering for urban visualization; more sophisticated approaches take a full 3D model of the scene as input. As the quality of the 3D model degrades, rendering artifacts appear that drastically reduce the utility of such applications. In this thesis, we propose image-based approximations to compensate for the lack of accurate 3D geometry. In the first approach, we use discontinuous image warping guided by quasi-dense depth maps, which improves visual quality compared to previous methods that rely on texturing 3D models. This approach requires a small amount of manual intervention to mark occlusion boundaries in the input images. In the second approach, we build upon this by developing a completely automatic solution that can handle more complex scenes. We oversegment the input images into superpixels and warp them independently using sparse depth. We introduce depth synthesis to create approximate depth in poorly reconstructed regions of the image and use it with our image warps to generate high-quality results. We compare our results to many recent algorithms and show that our approach extends very well to free viewpoint navigation.
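To make the idea of warping superpixels with depth concrete, the following is a minimal sketch and not the warp used in the thesis. It assumes a simple pinhole camera, a novel view that differs from the input view only by a translation, and a single depth value shared by all pixels of a superpixel. The function names `reproject` and `warp_superpixel` and all parameters are hypothetical and chosen for illustration only.

```python
def reproject(pixel, depth, focal, center, translation):
    """Move one pixel to a novel view using its depth.

    Assumes a pinhole camera and a novel camera that is only translated
    (no rotation) relative to the input camera.
    """
    u, v = pixel
    cx, cy = center
    # Back-project the pixel to a 3D point in the input camera's frame.
    x = (u - cx) * depth / focal
    y = (v - cy) * depth / focal
    z = depth
    # Express the point in the novel camera's frame.
    tx, ty, tz = translation
    xn, yn, zn = x - tx, y - ty, z - tz
    if zn <= 0:
        raise ValueError("point is behind the novel camera")
    # Project the point onto the novel image plane.
    return (focal * xn / zn + cx, focal * yn / zn + cy)


def warp_superpixel(pixels, depth, focal, center, translation):
    """Warp every pixel of one superpixel with a single shared depth value."""
    return [reproject(p, depth, focal, center, translation) for p in pixels]
```

In this sketch, moving the camera sideways shifts a nearby superpixel further across the image than a distant one. This parallax is what lets independently warped superpixels separate at occlusion boundaries.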

We also perform a perceptual analysis of different image-based rendering artifacts in separate user studies under controlled conditions. We use vision science to investigate the perspective distortions produced when a single image is projected onto planar geometry and viewed from novel viewpoints. We use the experimental data to develop a quantitative framework for predicting the level of perspective distortion as a function of capture and viewing parameters. In another study, we compare artifacts caused by smooth transitions (blending images) with those caused by abrupt transitions (popping) and develop guidelines for selecting the ideal tradeoff under different capture and rendering scenarios. We use guidelines from these studies to motivate the design of the image-based rendering systems described above.

We demonstrate an application of our approach for cognitive therapy. We create the first virtual reality application that uses image-based rendering instead of traditional computer graphics. This drastically reduces the cost of modeling 3D scenes for virtual reality while producing highly realistic walkthroughs.

Overall, we believe our work is a significant step towards free viewpoint image-based rendering built on sound perceptual foundations.

Results and comparison figures are best viewed on a computer screen using the high-resolution version.
