IIUM Repository

Convolutional hypercube pyramid for accurate RGB-D object category and instance recognition

Mohd Zaki, Hasan Firdaus and Shafait, Faisal and Mian, Ajmal (2016) Convolutional hypercube pyramid for accurate RGB-D object category and instance recognition. In: 2016 IEEE International Conference on Robotics and Automation (ICRA), 16-21 May 2016, Stockholm, Sweden.


Deep-learning-based methods have achieved unprecedented success in solving several computer vision problems involving RGB images. However, this level of success is yet to be seen on RGB-D images owing to two major challenges in this domain: training data deficiency and multi-modality input dissimilarity. We present an RGB-D object recognition framework that addresses these two key challenges by effectively embedding depth and point cloud data into the RGB domain. We employ a convolutional neural network (CNN) pre-trained on RGB data as a feature extractor for both color and depth channels and propose a rich coarse-to-fine feature representation scheme, coined Hypercube Pyramid, that is able to capture discriminatory information at different levels of detail. Finally, we present a novel fusion scheme to combine the Hypercube Pyramid features with the activations of fully connected neurons to construct a compact representation prior to classification. By employing Extreme Learning Machines (ELM) as non-linear classifiers, we show that the proposed method outperforms ten state-of-the-art algorithms for several tasks in terms of recognition accuracy on the benchmark Washington RGB-D and 2D3D object datasets by a large margin (up to 50% reduction in error rate).
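
For readers unfamiliar with Extreme Learning Machines, the classification stage mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the class name `ELMClassifier`, the hidden-layer size, the `tanh` activation, the ridge term `reg`, and the synthetic feature arrays are all assumptions chosen for illustration. In particular, the plain concatenation below is only a placeholder for the paper's fusion scheme, which the abstract does not specify. The sketch shows the general ELM idea: hidden-layer weights are drawn at random and left fixed, and only the output weights are fitted, in closed form, by regularized least squares.

```python
import numpy as np


class ELMClassifier:
    """Illustrative single-hidden-layer ELM (hypothetical helper, not the paper's code)."""

    def __init__(self, n_hidden=64, reg=1e-3, seed=0):
        self.n_hidden = n_hidden  # number of random hidden units (assumed value)
        self.reg = reg            # ridge regularization strength (assumed value)
        self.rng = np.random.default_rng(seed)

    def _hidden(self, X):
        # Random, fixed non-linear projection of the inputs.
        return np.tanh(X @ self.W + self.b)

    def fit(self, X, y):
        X = np.asarray(X, dtype=float)
        y = np.asarray(y)
        self.classes_ = np.unique(y)
        # One-hot targets, one column per class.
        T = (y[:, None] == self.classes_[None, :]).astype(float)
        # Hidden weights are sampled once and never trained.
        self.W = self.rng.standard_normal((X.shape[1], self.n_hidden))
        self.b = self.rng.standard_normal(self.n_hidden)
        H = self._hidden(X)
        # Closed-form ridge solution for the output weights.
        A = H.T @ H + self.reg * np.eye(self.n_hidden)
        self.beta = np.linalg.solve(A, H.T @ T)
        return self

    def predict(self, X):
        H = self._hidden(np.asarray(X, dtype=float))
        return self.classes_[np.argmax(H @ self.beta, axis=1)]


# Synthetic stand-ins for two feature sources (NOT real CNN activations).
rng = np.random.default_rng(1)
labels = np.repeat([0, 1], 100)
pyramid_feats = rng.standard_normal((200, 8))
pyramid_feats[labels == 1] += 3.0           # make the two classes separable
fc_feats = rng.standard_normal((200, 4))

# Placeholder fusion: simple concatenation (an assumption for this sketch).
fused = np.concatenate([pyramid_feats, fc_feats], axis=1)

model = ELMClassifier().fit(fused, labels)
train_acc = float((model.predict(fused) == labels).mean())
print(f"training accuracy on synthetic data: {train_acc:.2f}")
```

Because only the output weights are solved for, fitting reduces to a single linear solve, which is the usual reason ELMs are considered fast to train compared with iteratively optimized networks.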

Item Type: Conference or Workshop Item (Plenary Papers)
Additional Information: 8293/60177
Uncontrolled Keywords: category theory; computer vision; convolution; image classification; image colour analysis; image fusion; image representation; learning (artificial intelligence); neural nets; object recognition; CNN; ELM; RGB-D images; RGB-D object category; RGB-D object recognition; classification; coarse-to-fine feature representation; convolutional hypercube pyramid; convolutional neural network; deep learning; extreme learning machines; fusion scheme; instance recognition; multimodality input dissimilarity; nonlinear classifiers; point cloud data; training data deficiency; Feature extraction; Hypercubes; Image color analysis; Object recognition; Robots; Three-dimensional displays; Training
Subjects: Q Science > QA Mathematics > QA75 Electronic computers. Computer science
Kulliyyahs/Centres/Divisions/Institutes: Kulliyyah of Engineering > Department of Mechatronics Engineering
Depositing User: Dr. Hasan Firdaus Mohd Zaki
Date Deposited: 06 Aug 2018 16:13
Last Modified: 06 Aug 2018 16:13
URI: http://irep.iium.edu.my/id/eprint/60177
