Center for Research in Computer Vision



Machine Vision and Applications (MVA)

Volume 25, Issue 6


This issue features the following papers.



Automatic plant identification from photographs
B. Yanikoglu, E. Aptoula, C. Tirkaz

We present a plant identification system for automatically identifying the plant in a given image. In addition to common difficulties faced in object recognition, such as light, pose and orientation variations, there are further difficulties particular to this problem, such as changing leaf shapes according to plant age and changes in the overall shape due to leaf composition. Our system uses a rich variety of shape, texture and color features, some being specific to the plant domain. The system achieved the best overall score in the ImageCLEF’12 plant identification campaign in both the automatic and human-assisted categories. We report the results of this system on the publicly available ImageCLEF’12 plant dataset, as well as the effectiveness of individual features. The results show top-1 and top-5 accuracies of 61 and 81 %, respectively, in classifying the 126 different plant species.
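As a rough illustration of the kind of multi-feature pipeline the abstract describes (a minimal sketch with placeholder descriptors and a generic SVM, not the authors' actual features or classifier), shape, texture and color channels could be concatenated and species ranked for top-1/top-5 evaluation as follows:

```python
# Illustrative sketch only: feature concatenation for plant classification.
# The descriptor functions are hypothetical placeholders, not the paper's
# shape/texture/color features.
import numpy as np
from sklearn.svm import SVC
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline

def shape_descriptor(img):    # placeholder, e.g. contour-based shape statistics
    return np.random.rand(32)

def texture_descriptor(img):  # placeholder, e.g. local binary pattern histogram
    return np.random.rand(59)

def color_descriptor(img):    # placeholder, e.g. HSV color histogram
    return np.random.rand(48)

def image_features(img):
    # Early fusion: concatenate all descriptor channels into one vector.
    return np.concatenate([shape_descriptor(img),
                           texture_descriptor(img),
                           color_descriptor(img)])

def train(images, labels):
    X = np.vstack([image_features(im) for im in images])
    clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", probability=True))
    clf.fit(X, labels)
    return clf

def top_k(clf, img, k=5):
    # Rank species by classifier confidence; top-1/top-5 accuracy is then
    # measured against the ground-truth species label.
    probs = clf.predict_proba(image_features(img).reshape(1, -1))[0]
    return clf.classes_[np.argsort(probs)[::-1][:k]]
```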



Painting-91: a large scale database for computational painting categorization
Fahad Shahbaz Khan, Shida Beigpour, Joost van de Weijer, Michael Felsberg

Computer analysis of visual art, especially paintings, is an interesting cross-disciplinary research domain. Most research in the analysis of paintings involves medium- to small-scale datasets with their own specific settings. Interestingly, significant progress has been made in the field of object and scene recognition lately. A key factor in this success is the introduction and availability of benchmark datasets for evaluation. Surprisingly, such a benchmark setup is still missing in the area of computational painting categorization. In this work, we propose a novel large scale dataset of digital paintings. The dataset consists of paintings from 91 different painters. We further show three applications of our dataset, namely artist categorization, style classification and saliency detection. We investigate how local and global features popular in image classification perform for the tasks of artist and style categorization. For both categorization tasks, our experimental results suggest that combining multiple features significantly improves the final performance. We show that state-of-the-art computer vision methods can correctly attribute 50 % of unseen paintings in a large dataset to their painter and correctly identify the artistic style in over 60 % of the cases. Additionally, we explore the task of saliency detection on paintings and show experimental findings using state-of-the-art saliency estimation algorithms.
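To illustrate the reported benefit of combining multiple features (a hedged sketch only; the specific features and fusion scheme used in the paper are not reproduced here), a simple score-level fusion of per-feature classifiers could look like this:

```python
# Illustrative sketch only: score-level (late) fusion of several feature
# channels for artist/style categorization; not the authors' exact method.
import numpy as np
from sklearn.svm import SVC

def train_fusion(feature_sets, labels):
    """feature_sets: list of (n_samples, d_i) arrays, one per feature type."""
    return [SVC(kernel="rbf", probability=True).fit(X, labels)
            for X in feature_sets]

def predict_fusion(classifiers, test_feature_sets, weights=None):
    """Average (optionally weighted) per-channel class probabilities."""
    weights = weights or [1.0] * len(classifiers)
    score = sum(w * clf.predict_proba(X)
                for w, clf, X in zip(weights, classifiers, test_feature_sets))
    return classifiers[0].classes_[np.argmax(score, axis=1)]
```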



Multilayer background modeling under occlusions
Shoaib Azmat, Linda Wills, Scott Wills

A multilayer background modeling technique is presented for video surveillance. Rather than simply classifying all features in a scene as either dynamically moving foreground or long-lasting, stationary background, a temporal model is used to place scene objects in time relative to each other. Foreground objects that become stationary are registered as layers on top of the background layer. In this process of layer formation, the algorithm deals with "fake objects" created by moved background objects, as well as noise created by dynamic background and moving foreground objects. Objects that leave the scene are removed based on occlusion reasoning among layers. The technique allows us to understand and visualize a scene with multiple objects entering, leaving, and occluding each other at different points in time. This scene understanding leads to a richer representation of temporal scene events than traditional foreground/background segmentation. The technique builds on a low-cost background modeling method, which makes it suitable for embedded, real-time platforms.
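The layering idea can be illustrated with a minimal sketch (this is not the paper's algorithm; blob detection, noise handling and occlusion reasoning are assumed to happen elsewhere): foreground blobs that stay stationary long enough are promoted to layers stamped with their creation time, so the occlusion order follows the temporal order.

```python
# Minimal sketch of the layering idea only (not the paper's algorithm):
# foreground blobs that remain stationary for `patience` frames are promoted
# to layers stamped with a creation time.
from dataclasses import dataclass, field

@dataclass
class Layer:
    mask: object          # binary mask of the now-stationary object
    created_at: int       # frame index at which the object was layered

@dataclass
class MultilayerModel:
    patience: int = 50                          # frames an object must stay still
    layers: list = field(default_factory=list)  # ordered: oldest (bottom) first
    _still: dict = field(default_factory=dict)  # blob_id -> consecutive still frames

    def update(self, frame_idx, stationary_blobs, moving_blob_ids):
        """stationary_blobs: {blob_id: mask}; moving blobs reset their counters."""
        for blob_id in moving_blob_ids:
            self._still.pop(blob_id, None)
        for blob_id, mask in stationary_blobs.items():
            self._still[blob_id] = self._still.get(blob_id, 0) + 1
            if self._still[blob_id] == self.patience:
                # Promote the blob to a layer on top of the current stack;
                # newer layers occlude older ones and the background.
                self.layers.append(Layer(mask=mask, created_at=frame_idx))
```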



Automatic optical phase identification of micro-drill bits based on improved ASM and bag of shape segment in PCB production
Guifang Duan, Hongcui Wang, Zhenyu Liu, Jianrong Tan, Yen-Wei Chen

This paper addresses the problem of automatic optical phase identification of micro-drill bits for micro-drilling tool inspection in printed circuit board production. To overcome the limitations of the conventional active shape model (ASM) in shape modeling of micro-drill bits, six key landmarks are defined for the initialization and optimization of the ASM, and a novel method based on projection profiles is proposed for the detection of these key landmarks. In addition, to incorporate local shape features, a bag of shape segment (BoSS) model is developed. Based on the improved ASM and BoSS, a new shape representation of micro-drill bits is proposed for phase identification. Experimental results show that the proposed method outperforms the conventional ASM and improves the phase identification accuracy of micro-drill bits.



Super-resolution: a comprehensive survey
Kamal Nasrollahi, Thomas B. Moeslund

Super-resolution, the process of obtaining one or more high-resolution images from one or more low-resolution observations, has been a very attractive research topic over the last two decades. It has found practical applications in many real-world problems in different fields, from satellite and aerial imaging to medical image processing, facial image analysis, text image analysis, sign and number plate reading, and biometric recognition, to name a few. This has resulted in many research papers, each developing a new super-resolution algorithm for a specific purpose. The current comprehensive survey provides an overview of most of these published works by grouping them in a broad taxonomy. For each of the groups in the taxonomy, the basic concepts of the algorithms are first explained and then the path through which each of these groups has evolved is given in detail, by mentioning the contributions of different authors to the basic concepts of each group. Furthermore, common issues in super-resolution algorithms, such as imaging models and registration algorithms, optimization of the cost functions employed, dealing with color information, improvement factors, assessment of super-resolution algorithms, and the most commonly employed databases, are discussed.
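For reference, the imaging model commonly discussed in the super-resolution literature (written here in its generic form, not quoted from the survey) relates each low-resolution observation to the unknown high-resolution image as follows:

```latex
% Widely used forward model for multi-frame super-resolution (general
% literature form, not quoted from the survey itself):
\begin{equation}
  \mathbf{y}_k = \mathbf{D}\,\mathbf{B}\,\mathbf{M}_k\,\mathbf{x} + \mathbf{n}_k,
  \qquad k = 1,\dots,K,
\end{equation}
% where M_k warps (registers) the high-resolution image x to the k-th frame,
% B models the camera blur (PSF), D downsamples, and n_k is additive noise.
% Reconstruction-based methods then typically estimate x by minimizing
\begin{equation}
  \hat{\mathbf{x}} \;=\; \arg\min_{\mathbf{x}} \sum_{k=1}^{K}
  \bigl\|\mathbf{y}_k - \mathbf{D}\mathbf{B}\mathbf{M}_k\mathbf{x}\bigr\|_2^2
  \;+\; \lambda\, R(\mathbf{x}),
\end{equation}
% with R(x) a regularizer encoding image priors and lambda its weight.
```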



Fast automatic medical image segmentation based on spatial kernel fuzzy c-means on level set method
Siavash Alipour, Jamshid Shanbehzadeh

The fast two-cycle (FTC) model is among the most efficient and fastest level set image segmentation methods, but its performance is highly dependent on appropriate manual initialization. This paper proposes a new algorithm that combines a spatially constrained kernel-based fuzzy c-means (SKFCM) algorithm with an FTC model to overcome this problem. The approach consists of two successive stages. First, the SKFCM performs a rough segmentation to select the initial contour automatically. Then, a fuzzy membership matrix of the region of interest, generated by the SKFCM, is used in the next stage to produce an initial contour. Eventually, the FTC scheme segments the image by a curve evolution based on the level set. Moreover, the fuzzy membership degree from the SKFCM is incorporated into the fidelity term of the Chan–Vese model to improve robustness and accuracy, and it is utilized for the data-dependent speed term of the FTC. A performance evaluation of the proposed algorithm is carried out on synthetic and real images. The experimental results show that the proposed algorithm has advantages in accuracy, computational time and robustness against noise in comparison with the KFCM, the SKFCM, the hybrid model of the KFCM and the FTC, and five different level set methods on medical image segmentation.
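As a hedged illustration of the clustering stage only (plain kernel fuzzy c-means on intensities; the paper's spatial constraint and the coupling to the FTC level set are omitted), the membership and center updates could be sketched as:

```python
# Rough sketch of kernel fuzzy c-means (KFCM) on pixel intensities, used here
# only to show how a fuzzy membership map could seed a level-set contour.
import numpy as np

def kfcm(intensities, n_clusters=2, m=2.0, sigma=0.5, n_iter=50, seed=0):
    """intensities: 1-D array of pixel values scaled to [0, 1]."""
    rng = np.random.default_rng(seed)
    x = intensities.reshape(-1, 1)
    v = rng.choice(intensities, n_clusters).reshape(-1, 1)   # cluster centers
    for _ in range(n_iter):
        # Gaussian kernel between every pixel and every center.
        K = np.exp(-((x - v.T) ** 2) / sigma ** 2)            # (n_pixels, c)
        d = np.clip(1.0 - K, 1e-12, None)                     # kernel-space distance
        u = d ** (-1.0 / (m - 1))
        u /= u.sum(axis=1, keepdims=True)                     # fuzzy memberships
        w = (u ** m) * K
        v = (w * x).sum(axis=0, keepdims=True).T / w.sum(axis=0, keepdims=True).T
    return u, v.ravel()

# The object-class column of `u`, thresholded at 0.5 and reshaped to the image
# grid, could then serve as an automatic initial contour for curve evolution.
```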



3D human pose estimation from image using couple sparse coding
Mohammadreza Zolfaghari, Amin Jourabloo, Samira Ghareh Gozlou, Bahman Pedrood, Mohammad T. Manzuri-Shalmani

Recent studies have demonstrated that high-level semantics in data can be captured using sparse representation. In this paper, we propose an approach to human body pose estimation in static images based on sparse representation. Given a visual input, the objective is to estimate 3D human body pose using feature space information and geometrical information of the pose space. On the assumption that each data point and its neighbors are likely to reside on a locally linear patch of the underlying manifold, our method learns the sparse representation of the new input using both feature and pose space information and then estimates the corresponding 3D pose by a linear combination of the bases of the pose dictionary. Two strategies for dictionary construction are presented: (i) constructing the dictionary by randomly selecting the frames of a sequence and (ii) selecting specific frames of a sequence as dictionary atoms. We analyze the effect of each strategy on the accuracy of pose estimation. Extensive experiments on datasets of various human activities show that our proposed method outperforms state-of-the-art methods.
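A minimal sketch of the underlying idea, estimating the pose as a sparse linear combination of training examples, is given below; the paper's coupling of feature- and pose-space information is simplified away here, and the Lasso coder is an assumption rather than the authors' solver.

```python
# Illustrative sketch only: 3-D pose as a sparse linear combination of
# training atoms; the paper's pose-space coupling term is omitted.
import numpy as np
from sklearn.linear_model import Lasso

def estimate_pose(feature, feature_dict, pose_dict, alpha=0.01):
    """
    feature:      (d,)   descriptor of the test image
    feature_dict: (d, n) columns are training image descriptors
    pose_dict:    (p, n) columns are the corresponding 3-D poses
                         (e.g. stacked joint coordinates)
    """
    # Sparse code w such that feature ~= feature_dict @ w with few nonzeros.
    coder = Lasso(alpha=alpha, max_iter=5000)
    coder.fit(feature_dict, feature)
    w = coder.coef_
    # The same combination of pose atoms gives the pose estimate.
    return pose_dict @ w
```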



Two-stage online inference model for traffic pattern analysis and anomaly detection
Hawook Jeong, Youngjoon Yoo, Kwang Moo Yi, Jin Young Choi

In this paper, we propose a method for modeling trajectory patterns with both regional and velocity observations through a probabilistic topic model. By embedding Gaussian models into the discrete topic model framework, our method uses continuous velocity as well as regional observations, unlike existing approaches. In addition, the proposed framework, combined with a hidden Markov model, can cover the temporal transition of the scene state, which is useful in checking for a violation of the rule that conflicting topics (e.g. two cross-traffic patterns) should not occur at the same time. To achieve online learning despite the complexity of the proposed model, we suggest a novel learning scheme instead of collapsed Gibbs sampling. The proposed two-stage greedy learning scheme is not only efficient at reducing the search space but also accurate, in that the accuracy of online learning is no worse than that of batch learning. To validate the performance of our method, experiments were conducted on various datasets. Experimental results show that our model satisfactorily explains trajectory patterns with respect to scene understanding, anomaly detection, and prediction.



Visual lane analysis and higher-order tasks: a concise review
Bok-Suk Shin, Zezhong Xu, Reinhard Klette

Lane detection, lane tracking, and lane departure warning were among the earliest components of vision-based driver assistance systems. At first (in the 1990s), they were designed and implemented for situations defined by good viewing conditions and clear lane markings on highways. Since then, accuracy for particular situations (also for challenging conditions), robustness for a wide range of scenarios, time efficiency, and integration into higher-order tasks have defined visual lane detection and tracking as a continuing research subject. The paper reviews past and current work in computer vision that aims at real-time lane or road understanding under a comprehensive analysis perspective, with a view to higher-order tasks combined with various lane analysis components, and introduces related work along four independent axes (illustrated in Fig. 2 of the paper). This concise review provides not only summarizing definitions and statements for understanding key ideas in related work, it also presents selected details of potentially applicable methods and shows applications for illustrating progress. The review helps to plan future research that can benefit from the progress made in visual lane analysis. It supports the understanding of newly emerging subjects which combine lane analysis with more complex road or traffic understanding issues. The review should help readers in selecting suitable methods for their own targeted scenario.



Application of machine vision in improving safety and reliability for gear profile measurement
Md. Hazrat Ali, Syuhei Kurokawa, Kensuke Uesugi

This research presents a camera-based measurement system developed to improve the safety and reliability of gear profile measurement. Gear profile measurement is vital in precision engineering, and the application of a camera or vision system is very useful for increasing the safety and reliability of the precision measurement. Automatic control is also necessary to increase the reliability of the measurement system. Normally, gear profiles are measured using a contact-based stylus system. During gear profile measurement, human monitoring is required to avoid accidents, and the operator may face safety hazards, especially to the eyes: the stylus is sharp and thin, and if it collides with the gear teeth there is a high probability that the stylus tip will break and scatter. To save time, the measurement probe may scan the gear shape at a speed of 10 mm/s, in which case safety must be considered carefully. The traditional methods for gear measurement are either time consuming or expensive. This paper presents the successful implementation of a camera system in precision measurement, which saves time and increases the safety and reliability of the measurement while improving measurement performance and production rate. A color-based stylus tracking algorithm is implemented to achieve better reliability of the complete system. Thus, the developed vision-based system enhances the safety and reliability of the precision measurement.
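As an illustrative sketch of color-based tracking with OpenCV (the HSV thresholds and morphology settings below are hypothetical placeholders, not the paper's calibration), the color-marked stylus tip could be localized per frame as follows:

```python
# A minimal sketch of color-based stylus tip tracking with OpenCV.
import cv2
import numpy as np

LOWER_HSV = np.array([35, 80, 80])    # hypothetical lower bound of marker color
UPPER_HSV = np.array([85, 255, 255])  # hypothetical upper bound

def stylus_centroid(frame_bgr):
    """Return the (x, y) centroid of the color-marked stylus tip, or None."""
    hsv = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2HSV)
    mask = cv2.inRange(hsv, LOWER_HSV, UPPER_HSV)
    # Remove small speckles before computing the centroid.
    mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, np.ones((5, 5), np.uint8))
    m = cv2.moments(mask)
    if m["m00"] == 0:
        return None                    # marker not visible: e.g. raise an alarm
    return (m["m10"] / m["m00"], m["m01"] / m["m00"])
```

The returned image coordinates could then be monitored against the known gear region so the traverse can be halted before a collision.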



Improve scene categorization via sub-scene recognition
Shan-shan Zhu, Nelson H. C. Yung

Traditional scene categorization methods tend to generalize the representation of a scene via a holistic approach that calculates a distribution of visual words observed in the image. They disregard spatial information within a scene and are not able to discern categories that share similar sub-scenes but differ in layout, or categories that are ambiguous by nature. To address this issue, we propose to incorporate sub-scene attributes within global descriptions to improve categorization performance, especially in ambiguous cases. This is achieved by encoding sub-scenes with layout prototypes that capture the geometric essence of scenes more accurately and flexibly. The proposed method improves categorization accuracy to 92.26 % on the widely used eight scenes dataset and outperforms all other published methods. It is also observed that the proposed method is more accurate at detecting and evaluating ambiguous images.



Background subtraction by combining Temporal and Spatio-Temporal histograms in the presence of camera movement
Andrea Romanoni, Matteo Matteucci, Domenico G. Sorrenti

Background subtraction is the classical approach to differentiating moving objects in a scene from the static background when the camera is fixed. If the fixed-camera assumption does not hold, a frame registration step precedes the background subtraction. However, this registration step cannot perfectly compensate for camera motion, so errors such as translations of pixels from their true registered positions occur. In this paper, we overcome these errors with a simple but effective background subtraction algorithm that combines Temporal and Spatio-Temporal approaches. The former models the temporal intensity distribution of each individual pixel. The latter classifies foreground and background pixels, taking into account the intensity distribution of each pixel's neighborhood. The experimental results show that our algorithm outperforms state-of-the-art systems in the presence of jitter, in spite of its simplicity.
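A hedged sketch of the overall pipeline, registration followed by a per-pixel temporal model, is given below; the paper's Temporal and Spatio-Temporal histograms are replaced by a simple running-average model, and the ECC-based registration is an assumption rather than the authors' choice.

```python
# Sketch of the generic pipeline only: compensate camera motion, then apply a
# simple per-pixel temporal background model.
import cv2
import numpy as np

def register(prev_gray, curr_gray):
    """Compensate camera motion with an ECC-estimated affine warp."""
    warp = np.eye(2, 3, dtype=np.float32)
    criteria = (cv2.TERM_CRITERIA_EPS | cv2.TERM_CRITERIA_COUNT, 50, 1e-4)
    _, warp = cv2.findTransformECC(prev_gray, curr_gray, warp,
                                   cv2.MOTION_AFFINE, criteria)
    return cv2.warpAffine(curr_gray, warp, curr_gray.shape[::-1],
                          flags=cv2.INTER_LINEAR | cv2.WARP_INVERSE_MAP)

class TemporalModel:
    def __init__(self, first_gray, lr=0.02, thresh=25):
        self.bg = first_gray.astype(np.float32)
        self.lr, self.thresh = lr, thresh

    def apply(self, registered_gray):
        frame = registered_gray.astype(np.float32)
        diff = np.abs(frame - self.bg)
        fg = (diff > self.thresh).astype(np.uint8) * 255
        # Update the background only where the pixel currently looks like background.
        bg_mask = fg == 0
        self.bg[bg_mask] += self.lr * (frame[bg_mask] - self.bg[bg_mask])
        return fg
```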



A traverse inspection system for high precision visual on-loom fabric defect detection
Dorian Schneider, Timm Holtermann, Dorit Merhof

A self-contained inspection system for vision-based on-loom fabric defect detection is presented in this paper. Design and loom integration of a traversing camera sled, a camera vibration damper and a complementary back-light illumination are presented and discussed. Image acquisition strategies and traverse control are described to complete the discussion on hardware and mechanics. The main part of the paper focuses on a novel algorithmic framework for woven fabric defect detection in highly resolved (1,000+ ppi) image data. Within this scope, single yarns are tracked and measured in terms of position, size, and appearance in real time. An inspection prototype has been mounted onto an industrial loom. Extensive on-line and off-line evaluations for various fabric materials gave precise and stable detection results with few false alarms. A brief cost analysis for the prototype system is provided and completes the presentation of the system.



Extrinsic calibration of heterogeneous cameras by line images
Dieu Sang Ly, Cédric Demonceaux, Pascal Vasseur, Claude Pégard

Extrinsic calibration refers to determining the relative pose of cameras. Most approaches for cameras with non-overlapping fields of view (FOV) are based on mirror reflection, object tracking or the rigidity constraint of stereo systems, whereas cameras with overlapping FOV can be calibrated using structure-from-motion solutions. We propose an extrinsic calibration method within a structure-from-motion framework for cameras with overlapping FOV and its extension to cameras with partially non-overlapping FOV. Recently, omnidirectional vision has become a popular topic in computer vision, as an omnidirectional camera can cover a large FOV in one image. Combining the good resolution of perspective cameras with the wide observation angle of omnidirectional cameras has been an attractive trend in multi-camera systems. For this reason, we present an approach that is applicable to heterogeneous types of vision sensors. Moreover, this method utilizes images of lines, as these features possess several advantageous characteristics over point features, especially in urban environments. The calibration consists of a linear estimation of the orientation and position of the cameras and, optionally, bundle adjustment to refine the extrinsic parameters.



Graph-cut based interactive segmentation of 3D materials-science images
Jarrell Waggoner, Youjie Zhou, Jeff Simmons, Marc De Graef, Song Wang

Segmenting materials' images is a laborious and time-consuming process, and automatic image segmentation algorithms usually contain imperfections and errors. Interactive segmentation is a growing topic in the areas of image processing and computer vision, which seeks to find a balance between fully automatic methods and fully manual segmentation processes. By allowing minimal and simple interaction from the user in an otherwise automatic algorithm, interactive segmentation is able to reduce the time taken to segment an image while achieving better segmentation results. Given the specialized structure of materials' images and the level of segmentation quality required, we show an interactive segmentation framework for materials' images that has three key contributions: (1) a multi-labeling approach that can handle a large number of structures while still quickly and conveniently allowing manual addition and removal of segments in real time, (2) multiple extensions to the interactive tools which increase the simplicity of the interaction, and (3) a web interface for using the interactive tools in a client/server architecture. We show a full formulation of each of these contributions and example results from their application.
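As a small illustration of seed-based graph-cut segmentation in general (not the paper's multi-label formulation or its interaction tools), a binary min-cut over a 4-connected pixel grid with user strokes as hard constraints could be sketched as follows; the intensity-based data terms are assumptions for the sake of a self-contained example.

```python
# Illustrative sketch only: binary graph-cut segmentation from user seeds on a
# small grayscale image (intended for small images; networkx min-cut is slow).
import networkx as nx
import numpy as np

def binary_graph_cut(img, obj_seeds, bg_seeds, lam=2.0, sigma=0.1):
    """img: 2-D float array in [0, 1]; seeds: lists of (row, col) pixels."""
    h, w = img.shape
    mu_obj = np.mean([img[p] for p in obj_seeds])   # simple region models
    mu_bg = np.mean([img[p] for p in bg_seeds])
    G = nx.DiGraph()
    S, T = "source", "sink"
    for r in range(h):
        for c in range(w):
            n = (r, c)
            # Data terms: cost of labeling the pixel background (S->n) or object (n->T).
            G.add_edge(S, n, capacity=(img[r, c] - mu_bg) ** 2)
            G.add_edge(n, T, capacity=(img[r, c] - mu_obj) ** 2)
            # Smoothness terms between 4-connected neighbours.
            for dr, dc in ((0, 1), (1, 0)):
                rr, cc = r + dr, c + dc
                if rr < h and cc < w:
                    wgt = lam * np.exp(-((img[r, c] - img[rr, cc]) ** 2)
                                       / (2 * sigma ** 2))
                    G.add_edge(n, (rr, cc), capacity=wgt)
                    G.add_edge((rr, cc), n, capacity=wgt)
    for p in obj_seeds:                 # hard constraints from user strokes
        G[S][p]["capacity"] = np.inf
    for p in bg_seeds:
        G[p][T]["capacity"] = np.inf
    _, (obj_side, _) = nx.minimum_cut(G, S, T)
    mask = np.zeros((h, w), bool)
    for n in obj_side:
        if n != S:
            mask[n] = True
    return mask
```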