Top-down Visual Saliency Guided by Captions

Ramanishka, V; Das, A; Zhang, JM; Saenko, K

Ramanishka, V (reprint author), Boston Univ, Boston, MA 02215 USA.

30TH IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2017), 2017: 3135

Abstract

Neural image/video captioning models can generate accurate descriptions, but their internal process of mapping regions to words is a black box and the...
