Convolutional hypercube pyramid for accurate RGB-D object category and instance recognition

Mohd Zaki, Hasan Firdaus and Shafait, Faisal and Mian, Ajmal (2016) Convolutional hypercube pyramid for accurate RGB-D object category and instance recognition. In: 2016 IEEE International Conference on Robotics and Automation (ICRA), 16-21 May 2016, Stockholm, Sweden.

Official URL: https://ieeexplore.ieee.org/document/7487310/

Abstract

Deep learning-based methods have achieved unprecedented success in solving several computer vision problems involving RGB images. However, this level of success is yet to be seen on RGB-D images owing to two major challenges in this domain: training data deficiency and multi-modality input dissimilarity. We present an RGB-D object recognition framework that addresses these two key challenges by effectively embedding depth and point cloud data into the RGB domain. We employ a convolutional neural network (CNN) pre-trained on RGB data as a feature extractor for both color and depth channels and propose a rich coarse-to-fine feature representation scheme, coined Hypercube Pyramid, that is able to capture discriminatory information at different levels of detail. Finally, we present a novel fusion scheme to combine the Hypercube Pyramid features with the activations of fully connected neurons to construct a compact representation prior to classification. By employing Extreme Learning Machines (ELM) as non-linear classifiers, we show that the proposed method outperforms ten state-of-the-art algorithms for several tasks in terms of recognition accuracy on the benchmark Washington RGB-D and 2D3D object datasets by a large margin (up to 50% reduction in error rate).
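
For readers unfamiliar with the classifier named in the abstract, the following is a minimal sketch of a generic Extreme Learning Machine: a single hidden layer with fixed random weights, followed by output weights solved in closed form by ridge-regularised least squares. It is not taken from the paper. The class name, the sigmoid activation, the hidden-layer size, and the regularisation constant are all assumptions chosen for illustration, and the input `X` stands in for whatever fused feature vectors a pipeline like the one described would produce.

```python
import numpy as np


class ExtremeLearningMachine:
    """Generic single-hidden-layer ELM (illustrative; not the paper's configuration)."""

    def __init__(self, n_hidden=200, reg=1e-3, seed=0):
        self.n_hidden = n_hidden  # number of random hidden units (assumed value)
        self.reg = reg            # ridge regularisation strength (assumed value)
        self.rng = np.random.default_rng(seed)

    def _hidden(self, X):
        # Sigmoid activations of the fixed random projection.
        return 1.0 / (1.0 + np.exp(-(X @ self.W + self.b)))

    def fit(self, X, y):
        X = np.asarray(X, dtype=float)
        y = np.asarray(y)
        self.classes_ = np.unique(y)
        # One-hot encode the class labels as regression targets.
        T = (y[:, None] == self.classes_[None, :]).astype(float)
        # Hidden weights are drawn once at random and never trained.
        self.W = self.rng.normal(size=(X.shape[1], self.n_hidden))
        self.b = self.rng.normal(size=self.n_hidden)
        H = self._hidden(X)
        # Output weights: solve (H^T H + reg * I) beta = H^T T.
        A = H.T @ H + self.reg * np.eye(self.n_hidden)
        self.beta = np.linalg.solve(A, H.T @ T)
        return self

    def predict(self, X):
        H = self._hidden(np.asarray(X, dtype=float))
        return self.classes_[np.argmax(H @ self.beta, axis=1)]
```

Because only the output weights are fitted, training reduces to a single linear solve, which is the usual reason given for choosing an ELM over an iteratively trained network.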

Item Type:	Conference or Workshop Item (Plenary Papers)
Additional Information:	8293/60177
Uncontrolled Keywords:	category theory;computer vision;convolution;image classification;image colour analysis;image fusion;image representation;learning (artificial intelligence);neural nets;object recognition;CNN;ELM;RGB-D images;RGB-D object category;RGB-D object recognition;classification;coarse-to-fine feature representation;computer vision;convolutional hypercube pyramid;convolutional neural network;deep learning;extreme learning machines;fusion scheme;instance recognition;multimodality input dissimilarity;nonlinear classifiers;point cloud data;training data deficiency;Feature extraction;Hypercubes;Image color analysis;Object recognition;Robots;Three-dimensional displays;Training
Subjects:	Q Science > QA Mathematics > QA75 Electronic computers. Computer science
Kulliyyahs/Centres/Divisions/Institutes:	Kulliyyah of Engineering > Department of Mechatronics Engineering
Depositing User:	Dr. Hasan Firdaus Mohd Zaki
Date Deposited:	06 Aug 2018 16:13
Last Modified:	06 Aug 2018 16:13
URI:	http://irep.iium.edu.my/id/eprint/60177
