IIUM Repository

Learning a deeply supervised multi-modal RGB-D embedding for semantic scene and object category recognition

Mohd Zaki, Hasan Firdaus and Shafait, Faisal and Mian, Ajmal (2017) Learning a deeply supervised multi-modal RGB-D embedding for semantic scene and object category recognition. Robotics and Autonomous Systems, 92. pp. 41-52. ISSN 0921-8890

Abstract

Recognizing the semantic category of objects and scenes captured using vision-based sensors is a challenging yet essential capability for mobile robots and UAVs to perform high-level tasks such as long-term autonomous navigation. However, extracting discriminative features from multi-modal inputs, such as RGB-D images, in a unified manner is non-trivial given the heterogeneous nature of the modalities. We propose a deep network which seeks to construct a joint and shared multi-modal representation by bilinearly combining the convolutional neural network (CNN) streams of the RGB and depth channels. This technique motivates bilateral transfer learning between the modalities by taking the outer product of each feature extractor output. Furthermore, we devise a technique for multi-scale feature abstraction using deeply supervised branches which are connected to all convolutional layers of the multi-stream CNN. We show that end-to-end learning of the network is feasible even with a limited amount of training data and that the trained network generalizes across different datasets and applications. Experimental evaluations on benchmark RGB-D object and scene categorization datasets show that the proposed technique consistently outperforms state-of-the-art algorithms.
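
As a rough illustration of the bilinear combination mentioned in the abstract, the sketch below takes the outer product of two feature vectors and flattens it into a single joint vector. It is only a minimal NumPy example under stated assumptions, not the authors' network: the function name `bilinear_combine`, the toy vector sizes, and the use of plain 1-D feature vectors in place of real CNN stream outputs are all hypothetical.

```python
import numpy as np


def bilinear_combine(rgb_feat, depth_feat):
    """Combine two 1-D feature vectors via their outer product.

    Hypothetical helper: rgb_feat and depth_feat stand in for the outputs
    of the RGB and depth feature extractors described in the abstract.
    """
    # Outer product: entry (i, j) is rgb_feat[i] * depth_feat[j],
    # so every RGB feature interacts with every depth feature.
    joint = np.outer(rgb_feat, depth_feat)
    # Flatten the (d_rgb, d_depth) matrix into one joint feature vector.
    return joint.reshape(-1)


# Toy inputs (assumed sizes; real CNN features would be much larger).
rgb_feat = np.array([1.0, 2.0])
depth_feat = np.array([3.0, 4.0, 5.0])
joint_feat = bilinear_combine(rgb_feat, depth_feat)
print(joint_feat.shape)  # (6,)
```

The joint vector has length d_rgb × d_depth, which is why such representations grow quickly with the size of the individual stream outputs.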

Item Type: Article (Journal)
Additional Information: 8293/61281
Uncontrolled Keywords: RGB-D image, Visual place recognition, Object categorization, Multi-modal deep learning
Subjects: Q Science > QA Mathematics > QA75 Electronic computers. Computer science
Kulliyyahs/Centres/Divisions/Institutes: Kulliyyah of Engineering > Department of Mechatronics Engineering
Depositing User: Dr. Hasan Firdaus Mohd Zaki
Date Deposited: 12 Jan 2018 11:28
Last Modified: 10 Jul 2018 08:43
URI: http://irep.iium.edu.my/id/eprint/61281
