AEDNVS: A NOVEL VIDEO SUMMARIZATION FRAMEWORK TO ENCODE THE CONTEXTUAL INFORMATION OF VIDEO FRAMES

Authors

  • Y. Femi Priya
  • D. Minoladavids

Keywords:

video summarization; encoder-decoder; attention mechanism; BiLSTM

Abstract

This paper addresses the problem of supervised video summarization by formulating it as a sequence-to-sequence learning problem, where the input is a sequence of original video frames and the output is a keyshot sequence. The key idea is to learn a deep summarization network with an attention mechanism that mimics the way humans select keyshots. A novel video summarization framework, Attentive Encoder-Decoder Networks for Video Summarization (AEDNVS), is proposed, in which the encoder uses a Bidirectional Long Short-Term Memory (BiLSTM) network to encode the contextual information among the input video frames. For the decoder, two attention-based LSTM networks are explored, using additive and multiplicative objective functions, respectively. Extensive experiments are conducted on two video summarization benchmark datasets, SumMe and TVSum. The results show the superiority of the proposed AEDNVS-based approaches over state-of-the-art approaches, with remarkable improvements on both datasets.
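To illustrate the two decoder attention variants mentioned in the abstract, the following is a minimal NumPy sketch of additive (Bahdanau-style) and multiplicative (Luong-style) attention over a sequence of encoder states. The parameter names (Wa, Ua, v, Wm) and the tiny dimensions are illustrative assumptions, not taken from the paper; in the actual framework the encoder states would be BiLSTM outputs over video frame features.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over a 1-D score vector.
    e = np.exp(x - x.max())
    return e / e.sum()

def additive_scores(h, H, Wa, Ua, v):
    # Additive attention: score_i = v^T tanh(Wa h + Ua H_i)
    return np.array([v @ np.tanh(Wa @ h + Ua @ Hi) for Hi in H])

def multiplicative_scores(h, H, Wm):
    # Multiplicative attention: score_i = h^T Wm H_i
    return np.array([h @ Wm @ Hi for Hi in H])

rng = np.random.default_rng(0)
d = 4                              # hidden size (illustrative)
T = 5                              # number of encoder states (one per frame)
h = rng.standard_normal(d)         # current decoder hidden state
H = rng.standard_normal((T, d))    # encoder (BiLSTM) states, stand-ins here
Wa, Ua, Wm = (rng.standard_normal((d, d)) for _ in range(3))
v = rng.standard_normal(d)

alpha_add = softmax(additive_scores(h, H, Wa, Ua, v))
alpha_mul = softmax(multiplicative_scores(h, H, Wm))
context = alpha_add @ H            # attention-weighted sum of encoder states
```

Both variants produce a weight per frame that sums to one; the decoder then consumes the weighted context vector when predicting frame importance for keyshot selection.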

Author Biographies

Y. Femi Priya

Department of ECE, C.S.I. Institute of Technology, Thovalai, Tamil Nadu, India.

D. Minoladavids

Assistant Professor, Department of ECE, Institute of Technology, Thovalai, Tamil Nadu, India.

Published

2021-06-26