END-TO-END MULTIMODAL SPEECH RECOGNITION

Palaskar, S; Sanabria, R; Metze, F

Palaskar, S (reprint author), Carnegie Mellon Univ, Pittsburgh, PA 15213 USA.

2018 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2018: 5774

Abstract

Transcription or sub-titling of open-domain videos is still a challenging domain for Automatic Speech Recognition (ASR) due to the data's challenging ......
