Deep Reinforcement Learning for Video Summarization with Semantic Reward

Author	Haoran Sun, Xiaolong Zhu, Conghua Zhou
Abstract	<p>Video summarization aims to improve the efficiency of large-scale video browsing by producing concise summaries. It is widely used in scenarios such as video surveillance, video review, and data annotation. Traditional video summarization techniques filter content along the image-feature or image-semantics dimension. However, such techniques can lose a large amount of potentially useful information, especially for videos with rich text semantics such as interviews and teaching videos, because only information relevant to the image dimension is retained. To solve this problem, this paper treats video summarization as a continuous multi-dimensional decision-making process. Specifically, the summarization model predicts a probability for each frame and its corresponding text, and we design reward methods for each of them. Finally, comprehensive summaries in two dimensions, i.e., images and semantics, are generated. This approach is unsupervised, relying on neither labels nor user interaction, and it also decouples the semantic and image summarization models to provide more usable interfaces for subsequent engineering use.</p>
Year of Publication	2022
Conference Name	2022 IEEE 22nd International Conference on Software Quality, Reliability, and Security Companion (QRS-C)