Thesis (Ph.D)


Efficient inference and learning for computer vision labelling problems

Abstract

Discrete energy minimization has recently emerged as an indispensable tool for computer vision problems. It enables inference of the maximum a posteriori solutions of Markov and conditional random fields, which can be used to model labelling problems in vision. When formulating such problems in an energy minimization framework, there are three main issues that need to be addressed: (i) how to perform efficient inference to compute the optimal solution; (ii) how to incorporate prior knowledge into the model; and (iii) how to learn the parameter values. This thesis focuses on these aspects and presents novel solutions to address them. As computer vision moves towards the era of large videos and gigapixel images, computational efficiency is becoming increasingly important. We present two novel methods to improve the efficiency of energy minimization algorithms. The first method works by "recycling" results from previous problem instances. The second simplifies the energy minimization problem by "reducing" the number of variables in the energy function. We demonstrate a substantial improvement in the running time of various labelling problems such as interactive image and video segmentation, object recognition, and stereo matching. In the second part of the thesis we explore the use of natural image statistics for the single view reconstruction problem, where the task is to recover a theatre-stage representation (containing planar surfaces and their geometrical relationships to each other) from a single 2D image. To this end, we introduce a class of multi-label higher-order functions to model these statistics based on the distribution of geometrical features of planar surfaces. We also show that this new class of functions can be solved exactly with efficient graph cut methods. The third part of the thesis addresses the problem of learning the parameters of the energy function. Although several methods have been proposed to learn the model parameters from training data, they suffer from various drawbacks, such as limited applicability or noisy estimates due to poor approximations. We present an accurate and efficient learning method, and demonstrate that it is widely applicable.

Authors

Alahari, K

Oxford Brookes departments

Department of Computing and Communication Technologies
Faculty of Technology, Design and Environment

Dates

Year: 2010


© Alahari, K
Published by Oxford Brookes University
All rights reserved. Copyright © and Moral Rights for this thesis are retained by the author and/or other copyright owners. A copy can be downloaded for personal non-commercial research or study, without prior permission or charge. This thesis cannot be reproduced or quoted extensively from without first obtaining permission in writing from the copyright holder(s). The content must not be changed in any way or sold commercially in any format or medium without the formal permission of the copyright holders.

Details

  • Collection: eTheses
  • Version: 1
  • Status: Live