Multilabel image classification with regional latent semantic dependencies

Zhang, J.; Wu, Q.; Shen, C.; Zhang, J.; Lu, J.

Please use this identifier to cite or link to this item: https://hdl.handle.net/2440/119065

Full metadata record

DC Field	Value	Language
dc.contributor.author	Zhang, J.	-
dc.contributor.author	Wu, Q.	-
dc.contributor.author	Shen, C.	-
dc.contributor.author	Zhang, J.	-
dc.contributor.author	Lu, J.	-
dc.date.issued	2018	-
dc.identifier.citation	IEEE Transactions on Multimedia, 2018; 20(10):2801-2813	-
dc.identifier.issn	1520-9210	-
dc.identifier.issn	1941-0077	-
dc.identifier.uri	http://hdl.handle.net/2440/119065	-
dc.description.abstract	Deep convolutional neural networks (CNNs) have demonstrated advanced performance on single-label image classification, and progress has also been made in applying CNN methods to multilabel image classification, which requires annotating objects, attributes, scene categories, etc., in a single shot. Recent state-of-the-art approaches to multilabel image classification exploit the label dependencies in an image at the global level, largely improving the labeling capacity. However, predicting small objects and visual concepts is still challenging due to the limited discrimination of the global visual features. In this paper, we propose a regional latent semantic dependencies model (RLSD) to address this problem. The proposed model includes a fully convolutional localization architecture to localize the regions that may contain multiple highly dependent labels. The localized regions are then sent to recurrent neural networks to characterize the latent semantic dependencies at the regional level. Experimental results on several benchmark datasets show that our proposed model achieves the best performance compared to the state-of-the-art models, especially for predicting small objects occurring in the images. We also set up an upper bound model (RLSD+ft-RPN) using bounding-box coordinates during training, and the experimental results show that our RLSD can approach the upper bound without using the bounding-box annotations, which is more realistic in practice.	-
dc.description.statementofresponsibility	Junjie Zhang, Qi Wu, Chunhua Shen, Jian Zhang and Jianfeng Lu	-
dc.language.iso	en	-
dc.publisher	IEEE	-
dc.rights	© 2018 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.	-
dc.source.uri	http://dx.doi.org/10.1109/tmm.2018.2812605	-
dc.subject	Multilabel image classification; semantic dependence; deep neural network	-
dc.title	Multilabel image classification with regional latent semantic dependencies	-
dc.type	Journal article	-
dc.identifier.doi	10.1109/TMM.2018.2812605	-
dc.relation.grant	CKCY2016082919273553	-
pubs.publication-status	Published	-
dc.identifier.orcid	Wu, Q. [0000-0003-3631-256X]	-
Appears in Collections:	Aurora harvest 4 Computer Science publications

Files in This Item:

There are no files associated with this item.

Adelaide Research & Scholarship