Fast and Robust Methods for Multiple-View Vision

Image and video processing has long been an active research topic, with practical applications in areas such as television and movie production, augmented reality, medical visualization, and communication. Very often, multiple cameras are employed to capture images and videos of a scene from distinct viewpoints. Processing such a large volume of images and videos efficiently and effectively calls for novel multiple-view image and video processing techniques.

The classical problem of multiple-view vision has been studied by many researchers over the past few decades, and numerous solutions have been proposed to tackle it under various assumptions and constraints. Early methods developed in the 1980s and 1990s laid the theoretical foundations for resolving the multiple-view vision problem. Nonetheless, many of these methods lack robustness and work well only in a well-controlled scene (e.g., homogeneous lighting, wide-baseline viewpoints, and texture-rich surfaces).

Recently, a number of researchers have revisited the multiple-view vision problem. Building on the well-developed theory of multiple-view geometry, they adopt robust implementations, such as statistical methods, to produce solutions that work well under general scene settings. Despite their robustness, these methods are often extremely computationally expensive and can require days or even weeks to produce results. Efficient algorithms and implementations are therefore required to make these methods more practical. Techniques developed for real-time image and video processing can be redesigned and adapted for this scenario.

This special issue aims to strike a balance between the efficiency and robustness of methods for multiple-view vision, which helps to bring multiple-view methods from laboratories to general home users. After two rounds of strict review, five distinguished papers were accepted. We summarize those five papers below.

The first paper, entitled "Real-time multi-view recognition of human gestures by distributed image processing," proposes a framework for multiview integration and recognition of human gestures. In this framework, recognition agents run in parallel for different views, and the recognition results are integrated online and in real time. Experimental results demonstrate the effectiveness of the swarm-based method in multiview gesture recognition.
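The idea of per-view agents running in parallel with online integration can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the gesture labels, the function names, the precomputed scores standing in for camera input, and the score-averaging fusion rule. It is not the swarm-based method of the paper, whose details are not given in this editorial.

```python
from concurrent.futures import ThreadPoolExecutor

GESTURES = ("wave", "point", "clap")  # hypothetical label set


def recognize_view(view_scores):
    """Stand-in for one per-view recognition agent.

    A real agent would process camera frames; this one only normalizes
    precomputed scores so that they sum to 1.
    """
    total = sum(view_scores.values())
    return {g: view_scores.get(g, 0.0) / total for g in GESTURES}


def integrate(per_view):
    """Fuse per-view results by averaging scores (an assumed fusion rule)."""
    n = len(per_view)
    fused = {g: sum(r[g] for r in per_view) / n for g in GESTURES}
    best = max(fused, key=fused.get)
    return best, fused


def recognize_multiview(views):
    """Run one agent per view in parallel, then integrate the results."""
    with ThreadPoolExecutor(max_workers=len(views)) as pool:
        per_view = list(pool.map(recognize_view, views))
    return integrate(per_view)


# Three views that disagree: the second view favors "point".
views = [
    {"wave": 6.0, "point": 3.0, "clap": 1.0},
    {"wave": 2.0, "point": 5.0, "clap": 3.0},
    {"wave": 7.0, "point": 2.0, "clap": 1.0},
]
label, fused = recognize_multiview(views)
```

In this toy input, one view is misled but the fused decision follows the two views that agree, which is the basic benefit of integrating several viewpoints.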

The second paper investigates a multiview C-arm fluoroscopic image acquisition system for assisting cardiac ablation procedures. The proposed methodology strikes a balance between the efficiency and robustness of a common multiview vision problem applied to medicine. It integrates fluoroscopic data and electrical data from the radiofrequency (RF) catheters into the same image so as to better guide RF ablation, shorten the duration of the procedure, increase its efficacy, and decrease hospital cost.

The third paper presents a tracking algorithm for augmented reality applications that is robust to object occlusions. Square targets are identified, and pose parameters are computed using a hybrid approach based on a direct method combined with the Kalman filter. To tackle occlusions, the algorithm relies on an optical flow motion estimator to track visible points and maintain the virtual graphics overlay when targets are occluded.
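As a rough illustration of how a Kalman filter can smooth a pose parameter and bridge short gaps in the measurements, the following sketch tracks a single scalar with a constant-velocity model. It is a simplification under stated assumptions: the class name, the noise values `q` and `r`, and the use of `None` to mark an occluded frame are all hypothetical, and the sketch simply coasts on the prediction during a gap, whereas the paper handles occlusions with an optical flow estimator.

```python
class KalmanCV1D:
    """Constant-velocity Kalman filter for one scalar parameter.

    State is (x, v); only x is measured. q and r are assumed noise levels.
    """

    def __init__(self, x0, q=1e-3, r=0.25):
        self.x, self.v = x0, 0.0
        self.p = [[1.0, 0.0], [0.0, 1.0]]  # state covariance
        self.q, self.r = q, r

    def predict(self, dt=1.0):
        # x <- x + v*dt, P <- F P F^T + q*I with F = [[1, dt], [0, 1]]
        p = self.p
        self.x += self.v * dt
        a = p[0][0] + dt * p[1][0]
        b = p[0][1] + dt * p[1][1]
        self.p = [[a + dt * b + self.q, b],
                  [p[1][0] + dt * p[1][1], p[1][1] + self.q]]

    def update(self, z):
        # Standard update with measurement matrix H = [1, 0]
        p = self.p
        s = p[0][0] + self.r               # innovation variance
        k0, k1 = p[0][0] / s, p[1][0] / s  # Kalman gain
        y = z - self.x                     # innovation
        self.x += k0 * y
        self.v += k1 * y
        self.p = [[(1 - k0) * p[0][0], (1 - k0) * p[0][1]],
                  [p[1][0] - k1 * p[0][0], p[1][1] - k1 * p[0][1]]]


def track(measurements, x0=0.0):
    """Filter a sequence; None marks a frame with no usable measurement."""
    kf = KalmanCV1D(x0)
    estimates = []
    for z in measurements:
        kf.predict()
        if z is not None:  # target visible: correct the prediction
            kf.update(z)
        estimates.append(kf.x)
    return estimates


# The parameter grows by 1 per frame; frames 5 and 6 are occluded.
estimates = track([1.0, 2.0, 3.0, 4.0, None, None, 7.0])
```

During the two occluded frames the estimate keeps advancing at the last estimated velocity, so an overlay driven by it would continue to move rather than freeze.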

The fourth paper designs a three-dimensional stereoscopic display system for operating microscopes in biomedical applications. The system consists of a stereoscopic camera unit, an image processing device for stereoscopic video recording, and a stereoscopic display. To reduce eyestrain and viewer fatigue, the authors apply a preexisting stereomicroscope structure and a polarized-light stereoscopic display method that does not reduce the quality of the stereo images.

The last paper proposes an efficient iterative surface evolution method for reconstructing 3D shapes from stereo video. 3D depth estimation and a fast region growing based on an implicit distance function are first employed to extract an initial shape estimate. An explicit surface evolution is then conducted to recover the finer geometric details of the shape. The final result is further improved by several iterations between depth estimation and shape reconstruction.

We would like to thank all of the authors for their contributions. We also want to express our sincere gratitude to all of the reviewers for their time and effort in providing insightful and constructive reviews.

Ling Shao, Hui Zhang, Kenneth K. Y. Wong, Jiebo Luo

Corresponding author

Correspondence to Ling Shao.

Rights and permissions

Open Access. This article is distributed under the terms of the Creative Commons Attribution 2.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

About this article

Cite this article

Shao, L., Zhang, H., Wong, K. et al. Fast and Robust Methods for Multiple-View Vision. J Image Video Proc 2010, 205283 (2010).
