Tuesday, September 4, 2012

Paper Reading #2 - KinectFusion: Real-Time 3D Reconstruction and Interaction Using a Moving Depth Camera

Intro:
     Title: KinectFusion: Real-Time 3D Reconstruction and Interaction Using a Moving Depth Camera
     Author Bios
    1. Shahram Izadi 
      • Research focus: the technical end of HCI
      • http://research.microsoft.com/en-us/people/shahrami/
    2. David Kim
      • Research focus: novel input and output technologies for 3D spatial and natural interaction
      • http://di.ncl.ac.uk/blog/author/xdk/
    3. Otmar Hilliges
      • Research focus: the intersection of input sensing technologies, display technologies, computer graphics, and human computer interaction
      • http://research.microsoft.com/en-us/people/otmarh/
    4. David Molyneaux
      • Research focus: human computer interaction, augmented reality, ubiquitous computing, interactive multitouch tabletops, handheld and steerable projected interfaces, mobile smart objects, computer vision, and machine learning
      • eis.comp.lancs.ac.uk/people/david/
    5. Richard Newcombe
      • PhD candidate in cognitive robotics and computer vision
      • http://www.doc.ic.ac.uk/~rnewcomb/
    6. Pushmeet Kohli
      • Research focus: the development of intelligent machines
      • http://research.microsoft.com/en-us/um/people/pkohli/
    7. Jamie Shotton
      • Research focus: computer vision and machine learning
      • http://jamie.shotton.org/work/
    8. Steve Hodges
      • Research focus: rapid prototyping, novel sensors, embedded camera systems, flexible electronics, display technologies, wireless communications, and ubiquitous and mobile devices
      • http://research.microsoft.com/en-us/people/shodges/
    9. Dustin Freeman
      • http://dustinfreeman.org/
      • http://dustinfreeman.org/files/DustinFreemanCV_academic.pdf
    10. Andrew Davison
      • Works in computer vision and robotics
      • http://www.doc.ic.ac.uk/~ajd/
    11. Andrew Fitzgibbon
      • Research focus: computer vision
      • http://research.microsoft.com/en-us/um/people/awf/
Richard Newcombe and Andrew Davison work for Imperial College London, and Dustin Freeman works for the University of Toronto. The rest work for Microsoft Research Cambridge.

Summary:

This paper covers KinectFusion. Using just a standard Kinect camera, the system recreates indoor 3D scenes: as the camera is moved around a scene, its live depth data is fused into a 3D model that is accurate and detailed, all in real time.
Figure 1. (A) A user moving the camera around. (B) The Phong-shaded reconstructed 3D model. (C) Real-time particles simulated on the 3D model, which is textured using the Kinect RGB information. (D) Multi-touch interactions performed by users on the reconstructed scene. (E) Segmentation and tracking of a 3D object. [1]
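To make the "live depth data" part of that summary concrete, here is a minimal CUDA sketch (my own illustration, not the authors' code) of the first thing such a pipeline has to do: back-project each raw Kinect depth pixel into a 3D point using the pinhole camera model. The intrinsic parameters fx, fy, cx, and cy are placeholders, not the real Kinect calibration.

    #include <cuda_runtime.h>

    // Back-project a raw depth map (millimeters) into a vertex map (meters).
    // One thread per pixel; a zero depth value means the sensor had no reading.
    __global__ void depthToVertexMap(const unsigned short* depth, // raw depth, mm
                                     float3* vertices,            // output 3D points
                                     int width, int height,
                                     float fx, float fy,          // focal lengths (px)
                                     float cx, float cy)          // principal point (px)
    {
        int u = blockIdx.x * blockDim.x + threadIdx.x;
        int v = blockIdx.y * blockDim.y + threadIdx.y;
        if (u >= width || v >= height) return;

        int idx = v * width + u;
        float z = depth[idx] * 0.001f;  // millimeters -> meters
        vertices[idx] = (z > 0.0f)
            ? make_float3((u - cx) * z / fx, (v - cy) * z / fy, z)
            : make_float3(0.0f, 0.0f, 0.0f);
    }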
The authors claim that their system and its novel GPU pipeline are unique for several reasons: the system allows real-time reconstruction and tracking, does not rely on any explicit detection step so it can be used in a variety of settings, provides high-quality surfaces with real-world geometry, allows for user interaction, does not require expensive equipment or an augmented room, and can reproduce a whole room.

This system has several main features, some of which can be seen in Figure 1. The first is that it can scan an object as the camera is moved around it; the resulting model can be imported into CAD software or 3D printed. Another is that it can segment an object out of a scene: the system first reconstructs the scene, then lets the user move the object, which it segments out in real time. The system can also be used for augmented reality, since reconstructed 3D models can be added to a simulation that accounts for physics. Users are even able to move around in the scene while the camera tracks them and registers their touch interactions with the reconstructed background (Figure 1D).
Figure 2. The left shows raw data from a Kinect and the right shows the point cloud after reconstruction.

The GPU implementation has four main stages: depth map conversion, camera tracking, volumetric integration, and raycasting. These steps are executed in parallel and are written in the CUDA language. The algorithmic breakdown of each step is given in the paper. [1]
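As a rough illustration of the volumetric integration stage, below is a sketch of a CUDA kernel that fuses one depth frame into a truncated signed distance function (TSDF) voxel grid using a running weighted average. This is my own simplified version, not the authors' code: it assumes an identity camera pose (the real system applies the pose estimated by the camera tracking stage), and all names and parameters are placeholders.

    // Fuse one depth frame into a TSDF voxel grid (identity pose assumed).
    // Each thread sweeps one column of voxels along the z axis.
    __global__ void integrateTSDF(float* tsdf, float* weights,  // voxel grid
                                  const unsigned short* depth,  // raw depth, mm
                                  int dim,                      // voxels per axis
                                  float voxelSize,              // meters per voxel
                                  float truncation,             // truncation band (m)
                                  int width, int height,
                                  float fx, float fy, float cx, float cy)
    {
        int x = blockIdx.x * blockDim.x + threadIdx.x;
        int y = blockIdx.y * blockDim.y + threadIdx.y;
        if (x >= dim || y >= dim) return;

        for (int z = 0; z < dim; ++z) {
            // Voxel center in camera coordinates.
            float3 p = make_float3((x + 0.5f) * voxelSize,
                                   (y + 0.5f) * voxelSize,
                                   (z + 0.5f) * voxelSize);
            // Project the voxel into the current depth image.
            int u = __float2int_rn(p.x / p.z * fx + cx);
            int v = __float2int_rn(p.y / p.z * fy + cy);
            if (u < 0 || u >= width || v < 0 || v >= height) continue;

            float d = depth[v * width + u] * 0.001f;  // mm -> m
            if (d <= 0.0f) continue;                  // no sensor reading

            float sdf = d - p.z;                      // distance along the ray
            if (sdf < -truncation) continue;          // hidden behind the surface
            float val = fminf(1.0f, sdf / truncation);

            // Weighted running average fuses this frame with all earlier ones.
            int idx = (z * dim + y) * dim + x;
            float w = weights[idx];
            tsdf[idx]    = (tsdf[idx] * w + val) / (w + 1.0f);
            weights[idx] = w + 1.0f;
        }
    }

The raycasting stage then marches rays through this grid and finds the zero crossing of the TSDF, which is the reconstructed surface.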

Related work not referenced in the paper:
  1. Efficient Model-based 3D Tracking of Hand Articulations using Kinect
    • http://www.ics.forth.gr/~argyros/mypapers/2011_09_bmvc_kinect_hand_tracking.pdf
    • This paper builds a 3D model as this article does, but focuses on the hand and not on the rest of the scene.
  2. 3D shape scanning with a Kinect 
    • http://dl.acm.org/citation.cfm?id=2037780
    • This one focuses on 3D object scanning using the Kinect, which is just one aspect of this article's system.
  3. Real-time 3D visual SLAM with a hand-held RGB-D camera
    • http://ias.in.tum.de/_media/events/rgbd2011/03_engelhard.pdf
    • It focuses on an RGB-based solution, which this paper avoids.
  4. Human Detection Using Depth Information by Kinect
    • http://www.nattee.net/sites/default/files/Human%20Detection%20Using%20Depth%20Information%20by%20Kinect.pdf
    • This is very similar in that it performs segmentation and tracking using the Kinect, but it focuses on a human and not on a scene.
  5. The Kinect Sensor in Robotics Education
    • http://www.innoc.at/fileadmin/user_upload/_temp_/RiE/Proceedings/69.pdf
    • This one also uses 3D modeling, but focuses on robotics and education.
  6. Accuracy Analysis of Kinect Depth Data
    • https://wiki.rit.edu/download/attachments/52806003/ls2011_submission_40.pdf
    • This is an analysis of the depth data, which is actually used by this paper's system.
  7. RGB-D mapping: Using Kinect-style depth cameras for dense 3D modeling of indoor environment 
    • http://ijr.sagepub.com/content/early/2012/02/10/0278364911434148.abstract?rss=1&patientinform-links=yes&legid=spijr;0278364911434148v1
    • This one again focuses on RGB-D mapping.
  8. Gesture Recognition based on 2D and 3D Feature by using Kinect Device
    • http://onlinepresent.org/proceedings/vol1_2012/26.pdf
    • This article is different because it focuses on users and not scenes, and because it uses an RGB color solution.
  9. Using Kinect for hand tracking and rendering in wearable haptics
    • http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=5945505&tag=1
    • This article focuses on tracking, but not on modeling.
  10. Incremental 3D Body Reconstruction Framework for Robotic Telepresence applications 
    • http://mail.isr.uc.pt/~mrl/admin/upload/752-068.pdf
    • This one focuses on modeling the body, not a scene.
Overall, most of these have a lot in common with this paper's solution, but none covers the wide array of areas that this one does. This paper combines almost everything these other papers cover, which is unique, and it also takes new approaches to reconstructing scenes and segmenting objects.

Evaluation:
Each part of this system was tested separately, so the evaluation was not systematic. The paper does not really go into detail on its testing methods, so I can only assume that the system is still in the 'lab' test phase and not ready for real-world testing. Once that point is reached, I believe open-ended questions that are qualitative and subjective would be the best method, since this is a system that should be judged partly on how much people enjoy using it and how useful they feel it is. It could also be tested quantitatively, for example by measuring how long it takes to reconstruct a scene.
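If reconstruction time were the quantitative metric, CUDA events are one simple way to measure a GPU pipeline pass. This is only a sketch; runReconstructionPass is a hypothetical stand-in for the system's actual pipeline stages.

    #include <cstdio>
    #include <cuda_runtime.h>

    void timeReconstructionPass() {
        cudaEvent_t start, stop;
        cudaEventCreate(&start);
        cudaEventCreate(&stop);

        cudaEventRecord(start);
        // runReconstructionPass();  // hypothetical: one full pipeline iteration
        cudaEventRecord(stop);
        cudaEventSynchronize(stop);

        float ms = 0.0f;
        cudaEventElapsedTime(&ms, start, stop);  // elapsed GPU time in milliseconds
        printf("Reconstruction pass took %.2f ms\n", ms);

        cudaEventDestroy(start);
        cudaEventDestroy(stop);
    }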

Discussion:
I could see this being a very fun addition to video games in the near future. It could also be used in many other situations, such as training exercises for the military or police. This was not a well-written paper, though. It was highly technical and was not written in a way that someone without a background in the area could pick up quickly, which really reduced my enjoyment of the paper.

Reference Information:
[1] KinectFusion: real-time 3D reconstruction and interaction using a moving depth camera: http://dl.acm.org/citation.cfm?id=2047270
[2] All papers listed were found using http://scholar.google.com/
