Generalized pyramid co-attention with learnable aggregation net for video question answering

Gao, LL; Chen, TM; Li, XP; Zeng, PP; Zhao, L; Li, YF

Gao, LL (corresponding author), Univ Elect Sci & Technol China, Ctr Future Media, Chengdu, Peoples R China.

PATTERN RECOGNITION, 2021; 120 ():

Abstract

Video-based visual question answering (V-VQA) remains challenging at the intersection of vision and language. In this paper, we propose a novel archit......
