Tuesday, March 29, 2011

Hacking the Microsoft Kinect

Content provided by Kyle Weaver...my son.
The Microsoft Kinect is a peripheral for the Xbox 360 console, released on November 4, 2010. It is a hands-free controller designed to enhance gaming and overall navigation on the Xbox 360. Since its release, the open source community has put considerable effort into hacking the Kinect and making it a viable controller on many different platforms. Héctor Martín won the Adafruit contest to release an open source driver. Since his breakthrough, many uses have been created, ranging from fun to life-saving. This paper looks into the world of Kinect hacking, exploring the capabilities of the hardware and how it is being put to use beyond the Xbox 360.

1.     Introduction

     “Microsoft will continue to make advances in these types of safeguards and work closely with law enforcement and product safety groups to keep Kinect tamper-resistant,” a company spokesperson said (Blomquist, 2010). This was the initial stance Microsoft took against those trying to hack its device. Microsoft planned to take legal action against anyone using the device beyond its original specifications. The company was quick to change its mind, however (Blankenhorn, 2010). Not long after that first statement, a second one was released stating that using the Kinect outside of the Xbox 360 was unsupported, but not illegal.
     No one is sure of the exact reason Microsoft changed its stance so quickly, but there is some speculation. Possible reasons include Microsoft’s lawyers concluding that there were no grounds for legal action, and the potential long-run financial benefits for Microsoft (Blankenhorn, 2010). The change opens up new markets of potential customers for the Kinect. Previously, gamers were the target market; now the device can be used for educational, medical, and robotic applications.
     This paper examines the technical aspects of the Kinect. What are the capabilities of the hardware, and how does it work? Beyond the hardware, the paper explores the world of hacking to see what these rogue programmers have accomplished. A handful of applications are investigated, along with what each brings to the table.

2.    Background

    The prototype for the Kinect (originally codenamed “Natal”) cost around $30,000. Now one can obtain a Kinect for a retail price of around $150 (Carmody, 2010). Part of the reason this peripheral is so groundbreaking is its low price for the technology it delivers. Before this investment in the Kinect, a motion sensing device of any real quality cost thousands of dollars. The low cost allows people previously cut off from this avenue of technology to explore motion sensing possibilities (Deyle, 2011).
     Adafruit Industries got the ball rolling on the race to create a Kinect driver. On the day of the Kinect’s release, Adafruit announced an “X-Prize” of $1,000 (later raised to $3,000) for the first person who could demonstrate a working Kinect driver that showed video with corresponding depth ("The open kinect," 2010). On November 10, 2010, only six days after the release of the Kinect, Héctor Martín published a driver and demonstrated its use in an OpenGL window ("We have a," 2010).
     Ever since the release of Héctor Martín’s open source driver, the community has exploded. There are now hacks to do almost anything one can think of. There are applications to play Super Mario Kart in an emulator on the PC (Seegoolam, 2010), to create a “set of eyes” for the visually impaired (Quick, 2011), and to help detect autism in young children (Bi, 2011). The possibilities are wide open to anyone with $150 and a computer.

3.     Kinect Hardware

            The Kinect is a bar-shaped USB device that measures 11 inches wide, 3 inches deep, and 3 inches tall (Figure 1). It is officially capable of detecting objects at distances between 4 feet and 11.5 feet. The field of view is 57 degrees horizontal and 43 degrees vertical. In addition, the device can tilt itself vertically by 27 degrees, giving it a full vertical range of 70 degrees (Miller, 2010).
Figure 1. Kinect Hardware
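To make the field-of-view figures above more concrete, the following minimal Python sketch estimates how much of a room the sensor can see at the near and far ends of its official range. It assumes a simple pinhole model, which is an approximation and not something the specifications state; the function and constant names are illustrative only.

```python
import math

# Figures quoted in Section 3 (Miller, 2010).
HORIZONTAL_FOV_DEG = 57.0
VERTICAL_FOV_DEG = 43.0
TILT_DEG = 27.0


def coverage_feet(distance_ft: float, fov_degrees: float) -> float:
    """Width (or height) of the visible area at a given distance.

    Assumes an ideal pinhole camera, so the visible span is
    2 * distance * tan(fov / 2).
    """
    return 2.0 * distance_ft * math.tan(math.radians(fov_degrees) / 2.0)


# Vertical range as the paper computes it: field of view plus tilt.
total_vertical_range_deg = VERTICAL_FOV_DEG + TILT_DEG

if __name__ == "__main__":
    for distance in (4.0, 11.5):
        width = coverage_feet(distance, HORIZONTAL_FOV_DEG)
        height = coverage_feet(distance, VERTICAL_FOV_DEG)
        print(f"at {distance} ft: about {width:.1f} ft wide, {height:.1f} ft tall")
    print(f"vertical range with tilt: {total_vertical_range_deg:.0f} degrees")
```

Under these assumptions the visible area is a little over 4 feet wide at the 4 foot minimum distance, which helps explain why the sensor needs players to stand back.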
3.1 Cameras
3.1.1 RGB Camera
            The Kinect includes a normal RGB camera, similar to a webcam, that captures video at a resolution of 640 x 480 with 32-bit color at 30 frames per second (Miller, 2010). This camera is used to distinguish colors in the field of view, as well as to record images and videos for use in games. The RGB camera is the middle of the three cameras in the image above (Figure 1).
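The camera figures above imply a sizeable amount of data. The short Python sketch below works out the size of one frame and the resulting data rate, assuming every pixel is delivered as a full 32-bit value. That is an assumption for illustration; the format actually sent over USB may be more compact.

```python
# Figures quoted in Section 3.1.1 (Miller, 2010).
WIDTH, HEIGHT = 640, 480
BITS_PER_PIXEL = 32
FRAMES_PER_SECOND = 30


def frame_bytes(width: int, height: int, bits_per_pixel: int) -> int:
    """Size of one uncompressed frame in bytes."""
    return width * height * bits_per_pixel // 8


def stream_bytes_per_second(width: int, height: int,
                            bits_per_pixel: int, fps: int) -> int:
    """Uncompressed data rate of the video stream in bytes per second."""
    return frame_bytes(width, height, bits_per_pixel) * fps


if __name__ == "__main__":
    per_frame = frame_bytes(WIDTH, HEIGHT, BITS_PER_PIXEL)
    per_second = stream_bytes_per_second(
        WIDTH, HEIGHT, BITS_PER_PIXEL, FRAMES_PER_SECOND)
    print(f"{per_frame} bytes per frame, {per_second} bytes per second")
```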
3.1.2 Depth of Field Cameras
            The Kinect also includes two components used to sense depth. The first is less of a “camera” and more of an infrared emitter. It projects a pattern of thousands of infrared dots onto the objects in front of it (Figure 2). The reflected pattern is then captured by the monochrome CMOS camera. Early press coverage described the sensor as measuring the light’s “time of flight”; the PrimeSense design is generally understood to work instead by structured light, inferring depth from how the projected dot pattern shifts with distance. This allows the Kinect to map a 3D image of what is in front of it, accurate to within 1 centimeter in depth and 3 millimeters in width and height (Carmody, 2010). In the above image the IR emitter is the leftmost device and the monochrome CMOS capture camera is the rightmost (Figure 1).
Figure 2. Kinect Infrared Emitter viewed from Infrared Camera
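Once the sensor has a depth value for a pixel, that pixel can be turned into a 3D point. The Python sketch below shows one common way to do this, using a pinhole camera model and the field-of-view figures from Section 3. The 640 x 480 depth image size and the function names are assumptions made for illustration; this is not taken from any Kinect driver.

```python
import math


def focal_length_pixels(image_size_px: int, fov_degrees: float) -> float:
    """Focal length in pixels for an ideal pinhole camera."""
    return (image_size_px / 2.0) / math.tan(math.radians(fov_degrees) / 2.0)


def pixel_to_point(u: float, v: float, depth: float,
                   width: int = 640, height: int = 480,
                   h_fov: float = 57.0, v_fov: float = 43.0):
    """Convert a depth pixel (column u, row v) into an (x, y, z) point.

    The result is in the same unit as `depth`, with x to the right,
    y upward, and z pointing away from the sensor.
    """
    fx = focal_length_pixels(width, h_fov)
    fy = focal_length_pixels(height, v_fov)
    cx, cy = width / 2.0, height / 2.0
    x = (u - cx) * depth / fx
    y = (cy - v) * depth / fy  # image rows grow downward, so flip the sign
    return (x, y, depth)


if __name__ == "__main__":
    # The centre pixel lies straight ahead of the sensor.
    print(pixel_to_point(320, 240, 2.0))
    # A pixel on the right edge lies at half the horizontal field of view.
    print(pixel_to_point(640, 240, 2.0))
```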
3.2 Microphone
            The Kinect also includes an array of microphones that uses wide-field, conic, noise-cancelling audio capture. During setup the microphone array adjusts itself to the echoes and ambient noises of the room, as well as to the noise from the Xbox 360. This allows a very high degree of accuracy in picking up voices even when there is considerable background noise (Carmody, 2010).
3.3 Skeletal Tracking
            The Kinect keeps track of 48 distinct points on the human body (Miller, 2010). By keeping track of these points it creates a skeletal map of the body. There is no limit to the number of subjects that can be “seen” so long as they are in the field of view. However, in a non-hacked state the limit is set to tracking two skeletons because of processor limitations.
4. Kinect Hacking Software
4.1 OpenNI
            OpenNI is a non-profit open source organization that was created to promote the compatibility of Natural Interaction devices. It maintains the OpenNI framework, an open source API for building applications that use Natural Interaction devices. The framework can communicate with both low-level devices and higher-level middleware. The organization’s goal is to create an open standard for Natural Interaction ("PrimeSense™ establishes the," 2010).
4.2 PrimeSense NITE
            PrimeSense, the creators of the Microsoft Kinect hardware, have released an SDK that works on top of the OpenNI modules and the OpenNI infrastructure. By adding Control Management, one can receive a stream of points and route them to the appropriate NITE control. Above this are the Controls themselves, which translate the streams into meaningful actions (PrimeSense, 2010). The application flow can be seen in the following image (Figure 3).
Figure 3. NITE Application Flow
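The layering described above can be illustrated with a small Python sketch. A manager receives points from named streams and routes each point to the controls attached to that stream, and each control turns the points into an action. All class and method names here are hypothetical and exist only to show the flow; they are not the NITE API.

```python
from typing import Callable, Dict, List, Optional, Tuple

Point = Tuple[float, float, float]  # (x, y, z), z = distance from the sensor


class PushControl:
    """Hypothetical control: fires when the hand moves toward the sensor."""

    def __init__(self, threshold: float, on_push: Callable[[], None]) -> None:
        self.threshold = threshold
        self.on_push = on_push
        self._last_z: Optional[float] = None

    def update(self, point: Point) -> None:
        _, _, z = point
        # A push is a drop in distance of at least `threshold` between points.
        if self._last_z is not None and self._last_z - z >= self.threshold:
            self.on_push()
        self._last_z = z


class ControlManager:
    """Hypothetical router from point streams to the controls that use them."""

    def __init__(self) -> None:
        self._routes: Dict[str, List[PushControl]] = {}

    def attach(self, stream_name: str, control: PushControl) -> None:
        self._routes.setdefault(stream_name, []).append(control)

    def dispatch(self, stream_name: str, point: Point) -> None:
        for control in self._routes.get(stream_name, []):
            control.update(point)


if __name__ == "__main__":
    manager = ControlManager()
    manager.attach("hand", PushControl(0.1, lambda: print("push detected")))
    manager.dispatch("hand", (0.0, 0.0, 1.0))
    manager.dispatch("hand", (0.0, 0.0, 0.8))  # hand moved 0.2 closer
```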
4.3  SensorKinect Module
            This is the driver the Kinect uses on the computer. It is currently available for Mac OS X, Ubuntu, and Windows. The SensorKinect module is a modified version of the open source PrimeSensor module, converted to work with the Kinect. Since its release the SensorKinect module has become the most used Kinect driver.
5. Applications
5.1 FAAST (Flexible Action and Articulated Skeleton Toolkit)
            FAAST is a middleware application that integrates full body control into existing applications. It works much like the mapping software for old joysticks and gamepads, where buttons can be mapped to keyboard keys or mouse actions. Using PrimeSense NITE, FAAST is able to track the movements of various joints in the body. To begin, all one has to do is stand in the ‘Psi’ pose until a stick figure appears (Figure 5). It is very easy to configure the actions to elicit the commands one would like. The action binding below (Figure 6) holds down the ‘w’ key whenever the right arm is more than 18 inches in front of the shoulders. The key is then released when the arm returns below the threshold (Suma, Lange, Rizzo, Krum, & Bolas, 2011).
Figure 5. Skeleton Calibration Pose
Figure 6. Skeletal Tracking and Action Bindings
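The behaviour of the action binding described above can be sketched in a few lines of Python. The class below holds a key while a measured distance stays above a threshold and releases it once the distance drops back. The names are hypothetical and the sketch only returns the key events rather than sending them to the operating system; it is not FAAST's own code or configuration syntax.

```python
from typing import List, Tuple


class ForwardReachBinding:
    """Hold a key while the hand is far enough in front of the shoulders."""

    def __init__(self, key: str, threshold_inches: float) -> None:
        self.key = key
        self.threshold_inches = threshold_inches
        self.held = False

    def update(self, hand_forward_inches: float) -> List[Tuple[str, str]]:
        """Process one skeleton frame and return any key events it causes.

        `hand_forward_inches` is how far the hand is in front of the
        shoulders in the current frame.
        """
        events: List[Tuple[str, str]] = []
        beyond = hand_forward_inches > self.threshold_inches
        if beyond and not self.held:
            events.append(("press", self.key))
            self.held = True
        elif not beyond and self.held:
            events.append(("release", self.key))
            self.held = False
        return events


if __name__ == "__main__":
    binding = ForwardReachBinding("w", 18.0)
    for reach in (10.0, 20.0, 25.0, 12.0):
        print(reach, binding.update(reach))
```

Tracking the held state means the key is pressed once when the arm crosses the threshold, not once per frame.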
5.2 Ultra Seven
            Tomoto Washio created the application Ultra Seven using PrimeSense NITE. While at first glance it appears to be a simple application that only overlays a skin on the user’s skeleton, it has a few complex features that show off some of what PrimeSense NITE is capable of. After doing the ‘Psi’ pose the user is transformed into the superhero Ultra Seven (Washio, 2011). This allows one to perform various actions such as shooting lasers out of the arms and head (Figure 7). The most impressive thing about this application is that it detects 3D objects in the scene, which then stop the graphical beams (Figure 8).
Figure 7. Graphical Overlay with Skeleton Detection and Laser
Figure 8. Demonstration of 3D Objects Blocking Graphical Beam
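One plausible way to make a graphical beam stop at real objects is to step the beam through the depth image and halt as soon as the scene is nearer to the sensor than the beam is. The Python sketch below illustrates that idea on a tiny depth map. It is an assumption about how such an effect could work, not the method used in Ultra Seven, and it treats everything behind a measured surface as solid.

```python
from typing import List, Tuple

Position = Tuple[float, float, float]  # (column, row, depth)


def trace_beam(depth_map: List[List[float]], start: Position,
               step: Position, max_steps: int) -> Position:
    """Advance a beam until it leaves the image or reaches a surface.

    Returns the last position the beam could occupy freely.
    """
    col, row, depth = start
    d_col, d_row, d_depth = step
    for _ in range(max_steps):
        next_col, next_row, next_depth = col + d_col, row + d_row, depth + d_depth
        r, c = int(round(next_row)), int(round(next_col))
        if not (0 <= r < len(depth_map) and 0 <= c < len(depth_map[0])):
            break  # the beam has left the image
        if next_depth >= depth_map[r][c]:
            break  # a measured surface is at or in front of the beam
        col, row, depth = next_col, next_row, next_depth
    return (col, row, depth)


if __name__ == "__main__":
    # One image row: a wall at depth 3.0 with an object at depth 1.0.
    scene = [[3.0, 3.0, 3.0, 1.0, 3.0, 3.0]]
    # The beam travels to the right at a constant depth of 1.5.
    print(trace_beam(scene, (0, 0, 1.5), (1, 0, 0), 10))
```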
5.3  Kinect and WPF: Complete Body Tracking & Nui.Vision
            The complete body tracking software is a useful application for keeping track of body part coordinates on the screen (Figure 9). This application uses the OpenNI library, .NET Framework 4.0, and the Windows Presentation Foundation (Pterneas, 2011). Vangos Pterneas decided to clean up the OpenNI library by writing his own wrapper. This wrapper makes it very easy to interact with and track the coordinates of various body parts. After including Nui.Vision (Figure 10), one needs to create a NuiUserTracker object (Figure 11). Set the NuiUserTracker object to be updated (Figure 12), and one then has easy access to the coordinates of a person’s body (Figure 13).
Figure 9. Body Point Tracking using Nui.Vision Wrapper

Figure 10. Using Nui.Vision;
Figure 11. Creating a NuiUserTracker Object
Figure 12. Updating the Class
Figure 13. Accessing the points
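The steps in Figures 10 through 13 follow a familiar pattern: create a tracker, register a handler to be called on each update, and read joint coordinates inside that handler. The Python sketch below illustrates only that pattern. Every name in it is a hypothetical stand-in; it does not reproduce the Nui.Vision API, which is a .NET library.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Joint:
    """Screen coordinates of one tracked body part."""
    x: float
    y: float


Handler = Callable[[Dict[str, Joint]], None]


class UserTracker:
    """Hypothetical tracker that notifies handlers when joints are updated."""

    def __init__(self) -> None:
        self._handlers: List[Handler] = []

    def on_updated(self, handler: Handler) -> None:
        """Register a function to call on every skeleton update."""
        self._handlers.append(handler)

    def publish(self, joints: Dict[str, Joint]) -> None:
        """Deliver one update; a real tracker would call this per frame."""
        for handler in self._handlers:
            handler(joints)


if __name__ == "__main__":
    tracker = UserTracker()
    tracker.on_updated(lambda joints: print("head at", joints["head"]))
    tracker.publish({"head": Joint(320.0, 80.0)})
```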

6.     Conclusions

        Hacking the Kinect has been an exciting prospect for many people. A sensor with a low enough price point has removed the previous barriers to entry. This is more than a niche that will never catch on in mainstream programming. It has already gained a major foothold in amateur robotics. Finally, a sensor that delivers both performance and affordability has been created (Deyle, 2011).
        On the other hand, Kinect “hacking” is most likely going to take an abrupt turn later this year. Microsoft has announced plans to release an official SDK for the Kinect sometime late this spring (Knies, 2011). Most likely this library will be easy to use and well documented. As long as it provides access to all the functionality one could want, it will most likely become the standard for controlling the Kinect on Windows. One would then have easy access to the internal motors and audio, which the current drivers have no control over. The future for the Kinect looks bright, and it could very well usher in a new era of how we interact with computers.

7.     References

Bi, F. (2011, March 14). Minnesota prof. uses xbox kinect for research. Retrieved from http://minnesota.cbslocal.com/2011/03/14/minnesota-prof-uses-xbox-kinect-for-research/
Blankenhorn, D. (2010, November 12). Microsoft surrenders on linux kinect hack. Retrieved from http://www.zdnet.com/blog/open-source/microsoft-surrenders-on-linux-kinect-hack/7769
Blomquist, C. (2010, November 18). Hacking the kinect & how not to do pr. Retrieved from http://techliberation.com/2010/11/18/hacking-the-kinect-how-not-to-do-pr/
Carmody, T. (2010, November 3). How motion detection works in xbox kinect [Web log message]. Retrieved from http://www.wired.com/gadgetlab/2010/11/tonights-release-xbox-kinect-how-does-it-work/
Deyle, T. (2011, January 9). The need for low cost sensors in robotics. Retrieved from http://www.hizook.com/blog/2011/01/09/need-low-cost-sensors-robotics-holiday-edition
Knies, R. (2011, February 21). Academics, enthusiasts to get kinect sdk. Retrieved from http://research.microsoft.com/en-us/news/features/kinectforwindowssdk-022111.aspx
Miller, P. (2010, June 30). Kinect detailed in newly precise tech specs. Retrieved from http://www.engadget.com/2010/06/30/kinect-detailed-in-newly-precise-tech-specs/
PrimeSense. (2010). Prime sensor™ nite 1.3 controls programmer's guide.
PrimeSense™ establishes the openNI™ standard and developers’ initiative to bring the world of natural interaction™ to life. (2010, December 21). Retrieved from http://www.openni.org/news/5-primesense-establishes-the-openni-standard-and-developers-initiative-to-bring-the-world-of-natural-interaction-to-life
Pterneas, V. (2011, March 15). Kinect and wpf: complete body tracking [Web log message]. Retrieved from http://www.studentguru.gr/blogs/vangos/archive/2011/03/15/kinect-and-wpf-complete-body-tracking.aspx
Quick, D. (2011, March 20). Navi project turns kinect into a set of eyes for the visually impaired. Retrieved from http://www.gizmag.com/kinect-as-a-set-of-eyes/18179/
Seegoolam, N. (2010, November 29). Want to hack the kinect microsoft says its okay. Retrieved from http://nerdreactor.com/2010/11/29/want-to-hack-the-kinect-microsoft-says-its-okay/
Suma, E, Lange, B, Rizzo, Skip, Krum, D, & Bolas, M. (2011). Flexible action and articulated skeleton toolkit (FAAST). Unpublished manuscript, Creative Technologies, University of Southern California, Los Angeles, California. Retrieved from http://projects.ict.usc.edu/mxr/faast/
The open kinect project [Web log message]. (2010, November 4). Retrieved from http://www.adafruit.com/blog/2010/11/04/the-open-kinect-project-the-ok-prize-get-1000-bounty-for-kinect-for-xbox-360-open-source-drivers/
Washio, T. (2011). kinect-ultra. Retrieved from http://code.google.com/p/kinect-ultra/
We have a winner [Web log message]. (2010, November 10). Retrieved from http://www.adafruit.com/blog/2010/11/10/we-have-a-winner-open-kinect-drivers-released-winner-will-use-3k-for-more-hacking-plus-an-additional-2k-goes-to-the-eff/