Center for Research in Computer Vision


Volume 25, Issue 8

This issue features the following special issue and original papers.

Special Issue Paper
RGB-D based place representation in topological maps
Hakan Karaoğuz, Özgür Erkent, H. Işıl Bozma

With the recent developments in sensor technology including Microsoft Kinect, it has now become much easier to augment visual data with three-dimensional depth information. In this paper, we propose a new approach to RGB-D based topological place representation, building on bubble space. While bubble space representation is in principle transparent to the type and number of sensory inputs employed, in practice this has only been verified with visual data acquired either via a two degrees of freedom camera head or an omnidirectional camera. The primary contribution of this paper is of a practical nature in this respect. We show that bubble space representation can easily be used to combine RGB and depth data while affording acceptable recognition performance even with limited field of view sensing and simple features.

Special Issue Paper
The ChaLearn gesture dataset (CGD 2011)
Isabelle Guyon, Vassilis Athitsos, Pat Jangyodsuk, Hugo Jair Escalante

This paper describes the data used in the ChaLearn gesture challenges that took place in 2011/2012, whose results were discussed at the CVPR 2012 and ICPR 2012 conferences. The task can be described as: user-dependent, small vocabulary, fixed camera, one-shot-learning. The data include 54,000 hand and arm gestures recorded with an RGB-D Kinect™ camera. The data are organized into batches of 100 gestures pertaining to a small gesture vocabulary of 8–12 gestures, recorded by the same user. Short continuous sequences of 1–5 randomly selected gestures are recorded. We provide man-made annotations (temporal segmentation into individual gestures, alignment of RGB and depth images, and body part location) and a library of functions to preprocess and automatically annotate data. We also provide a subset of batches in which the user’s horizontal position is randomly shifted or scaled. We report on the results of the challenge and distribute sample code to facilitate developing new solutions. The data, data collection software and the gesture vocabularies are downloadable from We set up a forum for researchers working on these data

Dissimilarity criteria and their comparison for quantitative evaluation of image segmentation: application to human retina vessels
Yann Gavet, Mathieu Fernandes, Johan Debayle, Jean-Charles Pinoli

The quantitative evaluation of image segmentation is an important and difficult task that is required for choosing a segmentation method and for the optimal tuning of its parameter values. To perform this quantitative evaluation, dissimilarity criteria are relevant with respect to human visual perception, contrary to metrics, which have been shown to be poorly adapted to visual perception. This article compares eleven dissimilarity criteria. The field of retina vessel image segmentation is taken as an application to support the comparison of five specific image segmentation methods, with regard to their degrees of consistency and discriminancy. The DRIVE and STARE databases of retina images are employed and the manual/visual segmentations are used as a reference and as a control method. The so-called ϵ criterion gives results in agreement with perceptually based criteria for achieving the quantitative comparison.

Multiphase B-spline level set and incremental shape priors with applications to segmentation and tracking of left ventricle in cardiac MR images
Van-Truong Pham, Thi-Thao Tran, Kuo-Kai Shyu, Lian-Yu Lin, Yung-Hung Wang, Men-Tzung Lo

This paper presents a new multiphase active contour model for object segmentation and tracking. The paper introduces an energy functional which incorporates image feature information to drive contours toward desired boundaries, and shape priors to constrain the evolution of the contours with respect to reference shapes. The shape priors in the model are constructed by performing incremental principal component analysis (iPCA) on a set of training shapes and newly available shapes, which are the resulting shapes derived from preceding segmented images. By performing iPCA, the shape priors are updated without repeatedly performing PCA on the entire training set including the existing shapes and the newly available shapes. In addition, by incrementally updating the resulting shape information of consecutive frames, the approach allows shape priors to be encoded even when the database of training shapes is not available. Moreover, in the shape alignment steps, we exploit the shape normalization procedure, which takes into account the affine transformation, to directly calculate pose transformations instead of solving a set of coupled partial differential equations as in gradient descent-based approaches. Furthermore, we represent the level set functions as linear combinations of continuous basis functions expressed on a B-spline basis for fast convergence to the segmentation solution. The model is applied to simultaneously segment and track both the endocardium and epicardium of the left ventricle from cardiac magnetic resonance (MR) images. Experimental results show the desired performance of the proposed model.
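
To illustrate the idea of updating shape statistics without redoing PCA on the full training set, here is a minimal sketch. It is not the authors' iPCA algorithm: as a simpler stand-in, it keeps a running mean and scatter matrix that are updated one shape at a time, and only the small covariance matrix is eigendecomposed when components are needed. The class name `RunningPCA`, the 6-dimensional random "shape vectors", and the 40/10 split are all assumptions made for illustration.

```python
import numpy as np


class RunningPCA:
    """Hypothetical helper: PCA statistics updated one sample at a time."""

    def __init__(self, dim):
        self.n = 0
        self.mean = np.zeros(dim)
        # Running sum of outer products of centered samples (Welford-style).
        self.scatter = np.zeros((dim, dim))

    def update(self, x):
        """Fold one new shape vector into the running mean and scatter."""
        x = np.asarray(x, dtype=float)
        self.n += 1
        delta = x - self.mean            # offset from the old mean
        self.mean += delta / self.n      # new mean
        self.scatter += np.outer(delta, x - self.mean)

    def covariance(self):
        """Unbiased sample covariance of everything seen so far."""
        return self.scatter / max(self.n - 1, 1)

    def components(self, k):
        """Top-k eigenvalues and eigenvectors of the current covariance."""
        eigvals, eigvecs = np.linalg.eigh(self.covariance())
        order = np.argsort(eigvals)[::-1][:k]
        return eigvals[order], eigvecs[:, order]


# Toy data standing in for vectorized shapes (an assumption, not real shapes).
rng = np.random.default_rng(0)
shapes = rng.standard_normal((50, 6))

model = RunningPCA(dim=6)
for shape in shapes[:40]:    # initial training shapes
    model.update(shape)
for shape in shapes[40:]:    # newly available shapes from later frames
    model.update(shape)

# Reference: batch covariance computed over all 50 shapes at once.
batch_cov = np.cov(shapes, rowvar=False)
top_vals, top_vecs = model.components(k=2)
```

In this sketch the running covariance agrees with the batch covariance up to floating-point error, so new shapes can be added without revisiting earlier ones. An iPCA method that updates the eigenbasis directly would also avoid storing the full covariance matrix.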

Scale alignment of 3D point clouds with different scales
Baowei Lin, Toru Tamaki, Fangda Zhao, Bisser Raytchev, Kazufumi Kaneda, Koji Ichii

In this paper, we propose two methods for estimating the scales of point clouds to align them. The first method estimates the scale of each point cloud separately: each point cloud has its own scale that is something like the size of a scene. We call it a keyscale, which is a representative scale and is defined for a given 3D point cloud as the minimum of the cumulative contribution rates of PCA of descriptors over different scales. Our second method directly estimates the ratio of scales (scale ratio) of two point clouds. Instead of finding the minimum, this approach registers the two sets of curves of the cumulative contribution rate of PCA by assuming that those differ only in scale. Experimental results with simulated and real scene point clouds demonstrate that the scale alignment of 3D point clouds can be effectively accomplished by our scale ratio estimation.
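
The keyscale definition above can be sketched in a few lines. This is only an illustration under stated assumptions: the abstract does not say which descriptors are used or how many principal components enter the cumulative contribution rate, so the function names `cumulative_contribution_rate` and `estimate_keyscale`, the parameter `k`, and the toy 2D "descriptors" below are hypothetical.

```python
import numpy as np


def cumulative_contribution_rate(descriptors, k):
    """Fraction of total variance captured by the top-k principal components."""
    X = np.asarray(descriptors, dtype=float)
    X = X - X.mean(axis=0)
    # Squared singular values of the centered data are proportional to the
    # PCA eigenvalues, and are returned in descending order.
    variances = np.linalg.svd(X, compute_uv=False) ** 2
    total = variances.sum()
    if total == 0:
        return 1.0
    return float(variances[:k].sum() / total)


def estimate_keyscale(descriptors_by_scale, k=1):
    """Return the scale with the minimum cumulative contribution rate."""
    rates = {
        scale: cumulative_contribution_rate(desc, k)
        for scale, desc in descriptors_by_scale.items()
    }
    best = min(rates, key=rates.get)
    return best, rates


def toy_descriptors(a, b):
    """Toy 2D descriptor set with variance a^2 : b^2 along the two axes."""
    return np.array([[a, 0.0], [-a, 0.0], [0.0, b], [0.0, -b]])


# Descriptors computed at three candidate scales (toy stand-ins).
descriptors_by_scale = {
    0.5: toy_descriptors(3.0, 1.0),  # rate 9 / (9 + 1) = 0.9
    1.0: toy_descriptors(1.0, 1.0),  # rate 1 / (1 + 1) = 0.5
    2.0: toy_descriptors(1.0, 2.0),  # rate 4 / (1 + 4) = 0.8
}

keyscale, rates = estimate_keyscale(descriptors_by_scale, k=1)
```

With these toy inputs the minimum rate occurs at scale 1.0, where the variance is spread most evenly across components. The second method in the abstract would instead align the whole rate-versus-scale curves of two point clouds, which this sketch does not attempt.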

Inpainting images with curvilinear structures propagation
Hailing Zhou, Lei Wei, Douglas Creighton, Saeid Nahavandi

Inpainting images in which smooth curvilinear structures are interrupted is a challenging problem, because the structures are salient features to which the human vision system is sensitive, and they are not easy to complete in a visually pleasing way, especially when gaps are large. In this paper, we propose an approach to address this problem. A curve with a desired shape is first created to smoothly extend the missing structure from the known to the unknown regions. As the curve partitions the unknown region into separate areas, textures can be filled independently into each area. We then adopt a patch-based texture inpainting method enhanced by a novel similarity measurement of patches. After that, abrupt edges caused by different inpainted colours on the two sides of the curve are smoothed to give a natural colour transition across it. Experimental results demonstrate the effectiveness of the proposed approach.