Abstract
Video-based visual question answering (V-VQA) remains a challenging task at the intersection of vision and language. In this paper, we propose a novel archit......