Video Question Answering on Screencast Tutorials

Zhao, Wentian; Kim, Seokhwan; Xu, Ning; Jin, Hailin

Published in

Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence, 2020

DOI: 10.24963/ijcai.2020/148

Proceedings article published in 2020 by Wentian Zhao, Seokhwan Kim, Ning Xu, Hailin Jin

Abstract

This paper presents a new video question answering task on screencast tutorials. We introduce a dataset of question, answer, and context triples drawn from the tutorial videos for a software product. Unlike other video question answering work, all answers in our dataset are grounded in the domain knowledge base. A one-shot recognition algorithm is designed to extract visual cues, which helps improve video question answering performance. We also propose several baseline neural network architectures based on various aspects of the video contexts in the dataset. The experimental results demonstrate that our proposed models significantly improve question answering performance by incorporating multi-modal contexts and domain knowledge.
