IIUM Repository

Convolutional hypercube pyramid for accurate RGB-D object category and instance recognition

Mohd Zaki, Hasan Firdaus and Shafait, Faisal and Mian, Ajmal (2016) Convolutional hypercube pyramid for accurate RGB-D object category and instance recognition. In: 2016 IEEE International Conference on Robotics and Automation (ICRA), 16-21 May 2016, Stockholm, Sweden.

Abstract

Deep learning based methods have achieved unprecedented success in solving several computer vision problems involving RGB images. However, this level of success is yet to be seen on RGB-D images owing to two major challenges in this domain: training data deficiency and multi-modality input dissimilarity. We present an RGB-D object recognition framework that addresses these two key challenges by effectively embedding depth and point cloud data into the RGB domain. We employ a convolutional neural network (CNN) pre-trained on RGB data as a feature extractor for both color and depth channels and propose a rich coarse-to-fine feature representation scheme, coined Hypercube Pyramid, that is able to capture discriminatory information at different levels of detail. Finally, we present a novel fusion scheme to combine the Hypercube Pyramid features with the activations of fully connected neurons to construct a compact representation prior to classification. By employing Extreme Learning Machines (ELM) as non-linear classifiers, we show that the proposed method outperforms ten state-of-the-art algorithms for several tasks in terms of recognition accuracy on the benchmark Washington RGB-D and 2D3D object datasets by a large margin (up to 50% reduction in error rate).
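
The abstract names Extreme Learning Machines as the final classifier but gives no implementation details. As a rough illustration only, the following is a minimal sketch of a generic ELM: a hidden layer with fixed random weights followed by output weights solved in closed form with ridge regression. It is not the authors' implementation. The class name ELMClassifier, the parameters n_hidden and reg, the tanh activation, and the synthetic two-class data standing in for fused CNN features are all assumptions made for this example.

```python
import numpy as np


class ELMClassifier:
    """Generic single-hidden-layer ELM (illustrative sketch, not the paper's code)."""

    def __init__(self, n_hidden=200, reg=1e-2, seed=0):
        self.n_hidden = n_hidden  # number of random hidden units (assumed value)
        self.reg = reg            # ridge regularisation strength (assumed value)
        self.rng = np.random.default_rng(seed)

    def _hidden(self, X):
        # Non-linear random projection of the input features.
        return np.tanh(X @ self.W + self.b)

    def fit(self, X, y):
        n_features = X.shape[1]
        self.classes_ = np.unique(y)
        # Input weights and biases are drawn once and never trained.
        self.W = self.rng.standard_normal((n_features, self.n_hidden)) / np.sqrt(n_features)
        self.b = self.rng.standard_normal(self.n_hidden)
        H = self._hidden(X)
        # One-hot encode the labels as regression targets.
        T = (y[:, None] == self.classes_[None, :]).astype(float)
        # Closed-form ridge solution: beta = (H^T H + reg * I)^{-1} H^T T
        A = H.T @ H + self.reg * np.eye(self.n_hidden)
        self.beta = np.linalg.solve(A, H.T @ T)
        return self

    def predict(self, X):
        scores = self._hidden(X) @ self.beta
        return self.classes_[np.argmax(scores, axis=1)]


def make_blobs(n_per_class, n_features, rng):
    """Synthetic two-class data used here in place of real feature vectors."""
    X0 = rng.standard_normal((n_per_class, n_features)) - 2.0
    X1 = rng.standard_normal((n_per_class, n_features)) + 2.0
    X = np.vstack([X0, X1])
    y = np.array([0] * n_per_class + [1] * n_per_class)
    return X, y


rng = np.random.default_rng(42)
X_train, y_train = make_blobs(100, 8, rng)
X_test, y_test = make_blobs(50, 8, rng)

model = ELMClassifier(n_hidden=100).fit(X_train, y_train)
accuracy = float(np.mean(model.predict(X_test) == y_test))
print(f"held-out accuracy: {accuracy:.2f}")
```

Because only the output weights are fitted, and by a single linear solve, training is fast compared with iterative back-propagation. In a pipeline like the one the abstract describes, the rows of X would be the fused feature vectors rather than synthetic points.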

Item Type: Conference or Workshop Item (Plenary Papers)
Additional Information: 8293/60177
Uncontrolled Keywords: category theory;computer vision;convolution;image classification;image colour analysis;image fusion;image representation;learning (artificial intelligence);neural nets;object recognition;CNN;ELM;RGB-D images;RGB-D object category;RGB-D object recognition;classification;coarse-to-fine feature representation;computer vision;convolutional hypercube pyramid;convolutional neural network;deep learning;extreme learning machines;fusion scheme;instance recognition;multimodality input dissimilarity;nonlinear classifiers;point cloud data;training data deficiency;Feature extraction;Hypercubes;Image color analysis;Object recognition;Robots;Three-dimensional displays;Training
Subjects: Q Science > QA Mathematics > QA75 Electronic computers. Computer science
Kulliyyahs/Centres/Divisions/Institutes: Kulliyyah of Engineering > Department of Mechatronics Engineering
Depositing User: Dr. Hasan Firdaus Mohd Zaki
Date Deposited: 06 Aug 2018 16:13
Last Modified: 06 Aug 2018 16:13
URI: http://irep.iium.edu.my/id/eprint/60177
