Phrase-based image caption generator with hierarchical LSTM network

Tan, Ying Hua and Chan, Chee Seng (2019) Phrase-based image caption generator with hierarchical LSTM network. Neurocomputing, 333. pp. 86-100. ISSN 0925-2312, DOI https://doi.org/10.1016/j.neucom.2018.12.026.

Full text not available from this repository.
Official URL: https://doi.org/10.1016/j.neucom.2018.12.026

Abstract

Automatic generation of captions to describe the content of an image has attracted considerable research interest recently, and most existing works treat the image caption as purely sequential data. Natural language, however, possesses a temporal hierarchical structure, with complex dependencies between subsequences. In this paper, we propose a phrase-based image captioning model using a hierarchical Long Short-Term Memory (phi-LSTM) architecture to generate image descriptions. In contrast to conventional solutions that generate captions in a purely sequential manner, phi-LSTM decodes an image caption from phrase to sentence. It consists of a phrase decoder that decodes noun phrases of variable length, and an abbreviated sentence decoder that decodes the abbreviated form of the image description. A complete image caption is formed by combining the generated phrases with the sentence during the inference stage. Empirically, our proposed model achieves better or competitive results on the Flickr8k, Flickr30k and MS-COCO datasets in comparison with state-of-the-art models. We also show that our proposed model is able to generate more novel captions (not seen in the training data), which are richer in word content, on all three datasets. © 2018 Elsevier B.V.
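
As a rough illustration of the phrase-to-sentence assembly step mentioned in the abstract, the Python sketch below merges a list of generated noun phrases into an abbreviated sentence. The placeholder token `<NP>`, the function name `merge_caption`, and the example tokens are assumptions made for illustration only; they are not taken from the paper, and the two LSTM decoders themselves are not modeled here.

```python
from typing import List

# Hypothetical placeholder marking where a noun phrase belongs in the
# abbreviated sentence. The paper's actual representation may differ.
PHRASE_SLOT = "<NP>"


def merge_caption(abbreviated: List[str], phrases: List[List[str]]) -> List[str]:
    """Replace each placeholder in the abbreviated sentence with the next phrase."""
    remaining = iter(phrases)
    caption: List[str] = []
    for token in abbreviated:
        if token == PHRASE_SLOT:
            # Consume the next generated phrase in order; a slot with no
            # phrase left contributes no tokens.
            caption.extend(next(remaining, []))
        else:
            caption.append(token)
    return caption


# Toy inputs standing in for the outputs of the two decoders.
abbreviated = ["<NP>", "is", "running", "on", "<NP>"]
phrases = [["a", "brown", "dog"], ["the", "grass"]]
caption = " ".join(merge_caption(abbreviated, phrases))
print(caption)
```

The sketch assumes phrases are emitted in the same order as their slots. How the published model actually aligns phrases with positions in the abbreviated sentence is not specified in this record.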

Item Type: Article
Funders: Postgraduate Research Grant (PPP) PG003-2016A, Frontier Research Grant FG002-17AFR, from University of Malaya
Uncontrolled Keywords: Image captioning; Natural language processing; Long short-term memory; Deep learning
Subjects: Q Science > QA Mathematics > QA75 Electronic computers. Computer science
Divisions: Faculty of Computer Science & Information Technology
Depositing User: Ms. Juhaida Abd Rahim
Date Deposited: 03 Jan 2020 02:38
Last Modified: 03 Jan 2020 02:38
URI: http://eprints.um.edu.my/id/eprint/23288
