IIUM Repository

Modeling sub-event dynamics in first-person action recognition

Mohd Zaki, Hasan Firdaus and Shafait, Faisal and Mian, Ajmal S. (2017) Modeling sub-event dynamics in first-person action recognition. In: 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 21st-26th July 2017, Honolulu, USA.


Abstract

First-person videos have unique characteristics such as heavy egocentric motion, strong preceding events, salient transitional activities and post-event impacts. Action recognition methods designed for third-person videos may not optimally represent actions captured by first-person videos. We propose a method to represent the high-level dynamics of sub-events in first-person videos by dynamically pooling features of sub-intervals of time series using a temporal feature pooling function. The sub-event dynamics are then temporally aligned to make a new series. To keep track of how the sub-event dynamics evolve over time, we recursively employ the Fast Fourier Transform on a pyramidal temporal structure. The Fourier coefficients of the segment define the overall video representation. We perform experiments on two existing benchmark first-person video datasets which have been captured in a controlled environment. Addressing this gap, we introduce a new dataset collected from YouTube which has a larger number of classes and a greater diversity of capture conditions, thereby more closely depicting real-world challenges in first-person video analysis. We compare our method to state-of-the-art first-person and generic video recognition algorithms. Our method consistently outperforms the nearest competitors by 10.3%, 3.3% and 11.7% respectively on the three datasets.
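
The pipeline outlined in the abstract (pool features over sub-intervals, stack the pooled sub-events into a new series, then take Fourier coefficients over a temporal pyramid) can be illustrated with a rough NumPy sketch. This is not the authors' implementation: the function names, the choice of max pooling as the temporal pooling function, the sliding-window parameters, the number of pyramid levels and the number of retained coefficients are all assumptions made for illustration. The sketch also assumes the pooled series has at least as many rows as the finest pyramid level has segments.

```python
import numpy as np


def pool_sub_events(features, window, stride):
    """Pool per-frame features over sliding sub-intervals.

    features: array of shape (T, D), one D-dimensional feature per frame.
    Returns an array of shape (N, D), one pooled vector per sub-interval,
    kept in temporal order so the rows form a new time series.
    Max pooling is an assumed stand-in for the temporal pooling function.
    """
    num_frames = features.shape[0]
    starts = range(0, max(num_frames - window, 0) + 1, stride)
    return np.stack([features[s:s + window].max(axis=0) for s in starts])


def fourier_pyramid(series, levels, n_coeffs):
    """Describe a series by low-frequency FFT magnitudes over a pyramid.

    Level k splits the series into 2**k consecutive segments; each segment
    contributes the magnitudes of its first n_coeffs Fourier coefficients
    per feature dimension. All parts are concatenated into one vector.
    """
    parts = []
    for level in range(levels):
        for segment in np.array_split(series, 2 ** level, axis=0):
            # Zero-pad short segments so at least n_coeffs coefficients exist.
            n = max(segment.shape[0], 2 * n_coeffs)
            spectrum = np.abs(np.fft.rfft(segment, n=n, axis=0))
            parts.append(spectrum[:n_coeffs].ravel())
    return np.concatenate(parts)
```

For example, a 40-frame video with 8-dimensional frame features, pooled with `window=8` and `stride=4`, yields 9 sub-event vectors; passing them to `fourier_pyramid` with `levels=3` and `n_coeffs=2` gives a 112-dimensional descriptor (7 segments, 2 coefficients and 8 dimensions each). Keeping only magnitudes discards phase, which is one possible design choice rather than something stated in the abstract.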

Item Type: Conference or Workshop Item (Plenary Papers)
Additional Information: 8923/64353
Uncontrolled Keywords: First-person action recognition, deep learning
Subjects: T Technology > TK Electrical engineering. Electronics. Nuclear engineering > TK7800 Electronics. Computer engineering. Computer hardware. Photoelectronic devices > TK7885 Computer engineering
Kulliyyahs/Centres/Divisions/Institutes: Kulliyyah of Engineering > Department of Mechatronics Engineering
Depositing User: Dr. Hasan Firdaus Mohd Zaki
Date Deposited: 05 Jul 2018 14:56
Last Modified: 05 Jul 2018 14:56
URI: http://irep.iium.edu.my/id/eprint/64353
