Closed-loop deep vision

Carneiro, G.; Liao, Z.; Chin, T.

Please use this identifier to cite or link to this item: https://hdl.handle.net/2440/82581

Type:	Conference paper
Title:	Closed-loop deep vision
Author:	Carneiro, G.; Liao, Z.; Chin, T.
Citation:	2013 International Conference on Digital Image Computing: Techniques and Applications, DICTA, Hobart, Tasmania, 26-28 November 2013: 8 p.
Publisher:	IEEE
Publisher Place:	USA
Issue Date:	2013
ISBN:	9781479921263
Conference Name:	International Conference on Digital Image Computing: Techniques and Applications (2013 : Hobart, Tasmania)
Editor:	DeSouza, P.; Engelke, U.; Rahman, A.
Statement of Responsibility:	Gustavo Carneiro, Zhibin Liao, Tat-Jun Chin
Abstract:	There has been a resurgence of interest in one of the most fundamental aspects of computer vision: the existence of a feedback mechanism in the inference of a visual classification process. Indeed, this mechanism was present in the first computer vision methodologies, but technical and theoretical issues imposed major roadblocks that forced researchers to seek alternative approaches based on pure feed-forward inference. These open-loop approaches process the input image sequentially with increasingly complex analysis steps, and any mistake made by an intermediate step impairs all subsequent analysis tasks. On the other hand, closed-loop approaches involving feed-forward and feedback mechanisms can fix mistakes made during such intermediate stages. In this paper, we present a new closed-loop inference for computer vision problems based on an iterative analysis using deep belief networks (DBN). Specifically, an image is processed using a feed-forward mechanism that produces a classification result, which is then used to sample an image from the current belief state of the DBN. The difference between the input image and the sampled image is then fed back to the DBN for re-classification, and this process iterates until convergence. We show that our closed-loop vision inference improves the classification results compared to pure feed-forward mechanisms on the MNIST handwritten digit dataset [1] and the Multiple Object Categories dataset [2], which contains shapes of horses, dragonflies, llamas and rhinos.
Rights:	Copyright © 2013 by the Institute of Electrical and Electronics Engineers
DOI:	10.1109/DICTA.2013.6691492
Description (link):	http://www.aprs.org.au/dicta13/
Published version:	http://dx.doi.org/10.1109/dicta.2013.6691492
Appears in Collections:	Aurora harvest 4 Computer Science publications
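
The loop described in the abstract (classify, generate an image from the current belief, feed the difference back, repeat until convergence) can be illustrated with a toy sketch. This is not the authors' DBN implementation. It assumes a fixed set of class templates in place of a trained network, uses the belief-weighted mean of the templates in place of a stochastic sample, and applies a simple additive update of the class scores as the feedback rule. The names `closed_loop_classify`, `generate`, `step`, `tol` and `max_iters` are hypothetical and chosen only for this illustration.

```python
import math


def softmax(logits):
    """Turn raw class scores into a belief vector that sums to 1."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(p * q for p, q in zip(a, b))


def generate(belief, templates):
    """Expected image under the current belief.

    Assumption: a deterministic stand-in for sampling from a DBN.
    """
    n = len(templates[0])
    return [sum(b * t[i] for b, t in zip(belief, templates)) for i in range(n)]


def closed_loop_classify(image, templates, step=0.5, tol=1e-6, max_iters=100):
    # Feed-forward pass: score the input against each class template.
    logits = [dot(t, image) for t in templates]
    belief = softmax(logits)
    iteration = 0
    for iteration in range(1, max_iters + 1):
        # Generate an image from the current belief state.
        recon = generate(belief, templates)
        # Difference between the input and the generated image.
        residual = [x - r for x, r in zip(image, recon)]
        # Feedback (assumed rule): re-score the residual and update the logits.
        logits = [l + step * dot(t, residual) for l, t in zip(logits, templates)]
        new_belief = softmax(logits)
        change = max(abs(a - b) for a, b in zip(new_belief, belief))
        belief = new_belief
        if change < tol:  # stop once the belief no longer moves
            break
    label = max(range(len(belief)), key=belief.__getitem__)
    return label, belief, iteration


# Two hypothetical 4-pixel class templates and a noisy input near the first one.
TEMPLATES = [[1.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 1.0]]
label, belief, n_iters = closed_loop_classify([0.9, 0.8, 0.3, 0.1], TEMPLATES)
```

In this toy setting the feedback step only refines the belief between two fixed templates. In the paper's setting the generative model is a trained DBN, so the generated image and the feedback signal would come from the network itself.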

Adelaide Research & Scholarship