Video processing has become a popular research direction in computer vision owing to its many applications, such as video summarization and action recognition. Recently, deep learning-based methods have achieved impressive results in action recognition. However, these methods must process a full video sequence to recognize an action, even though many frames in the sequence are similar and non-essential to recognizing it. These non-essential frames increase the computational cost and can confuse a method during recognition. In contrast, the important frames, called keyframes, are not only helpful in recognizing an action but can also reduce the processing time of each video sequence in classification and in other applications, e.g., summarization. Moreover, current methods in video processing have not yet been demonstrated in an online fashion. Motivated by the above, in this talk I will discuss a new online learnable module for keyframe extraction. This module can select key shots in a video and thus can be applied to video summarization. The extracted keyframes can be used as input to any deep learning-based classification model to recognize an action. I will also discuss a plugin module that takes a semantic word vector as input along with the keyframes, as well as a new train/test strategy for classification models. To the best of our knowledge, this is the first time such an online module and train/test strategy have been proposed. Experimental results on many commonly used datasets in video summarization and action recognition demonstrate the effectiveness of the proposed modules.
Herb Yang’s research interests cover a wide range of topics in computer graphics and computer vision. In computer graphics, his interests include fluid and character animation, environment matting, hardware-accelerated graphics, motion editing, physics-based modelling, texture analysis and synthesis, and static and dynamic image-based modelling and rendering. In computer vision, his interests include light source estimation, motion estimation, human motion analysis, camera calibration, 3D reconstruction, stereo and multiview stereo, underwater imaging, and medical imaging, in particular developing biomarkers for ALS, also known as Lou Gehrig’s disease.
While most of his work can be regarded as traditional computer vision, he has recently started applying deep learning to typical computer vision problems, including image dehazing, image enhancement, super-resolution, and human action recognition. He is a Professor in the Department of Computing Science at the University of Alberta. He has served as an Associate Editor of Pattern Recognition since 1991, and as a reviewer and committee member for many international conferences and government agencies.