Getting Started with Kinect Development

The Microsoft Kinect add-on has been available for a couple of weeks now, and gamers all over the world can enjoy the "controllerless controller". But from a hacker's perspective, the Kinect offers far more than jumping through hoops or bursting bubbles. Indeed, within hours of the Kinect launch, some resourceful developers had put together an open-source driver for Kinect and were busy building in support for its various features. It is now a fairly simple process for anybody with a bit of programming knowledge to start building on top of Kinect. This guide will take you through setting up Kinect to work with your computer and getting some trivial image analysis running.

While there are now .NET / Windows drivers for Kinect available (supporting the RGB camera, depth camera, accelerometer feedback, motor driving and LED setting), these are not open source. However, there are open-source drivers available through the OpenKinect project on GitHub. At the time of writing these support OS X and Linux, but the project is moving at an extremely fast pace, so this may have changed since this article was published. To run this demo, I will be using a standard installation of Ubuntu 10.04 Lucid Lynx LTS without any major modifications. Other than a Kinect device, you won't need anything else for this tutorial.

Before I start, a note on hardware and warranties. By using Kinect in this way, so far as I am aware, you will not be violating any warranties in place on the device. However, I am not a lawyer, and neither I, the publishers, nor anyone associated with this article or code accepts any responsibility or provides any warranties whatsoever (expressed or implied) for this code or your Kinect device. You proceed here at your own risk!

Installing Required Software

From the base Ubuntu installation you will need a few extra packages to get up and running. These two are fairly common, so you may have them already:

sudo apt-get install git-core cmake

The git-core package installs Git, the version control system that will allow us to check out the code for use in the project. CMake is a utility that the project uses to generate the build files for us.

These are less common and most people will need to install them:

sudo apt-get install libusb-1.0-0-dev freeglut3-dev

Note: the libusb-1.0-0-dev package is more commonly known as "libusb-1.0-dev", but it is named differently in the Ubuntu repository. If you're using another platform or a different version, you may need to try the other name.

Libusb is a library that allows for easy manipulation of USB devices; it makes driver development a lot more lightweight and facilitates easier communication. Freeglut is used in this case for rendering the camera data on the screen for you to view.

If those packages are all installed and up to date, you are ready to go.

Getting The Code

As I have said before, the open source community is working feverishly to improve and expand Kinect support in these drivers. Hence I have created a tag of my code to ensure there is a stable demonstration that will always work.

The master for all of this code is hosted by OpenKinect on GitHub. If you want the latest up-to-date code then please try there. There are also numerous forks working on adding accelerometer, motor, LED and audio support as I write, so if you desire one of these features you may like to investigate those as well.

However, you may use my code to demo the algorithm I have written and see how it works. You need to clone the code from GitHub and switch to the correct branch. The commands below also create a directory in which to store the code.

cd
mkdir kinect
cd kinect
git clone git://github.com/chrisalexander/libfreenect.git .
git checkout depthdemo

This will move you to your home directory, create a kinect directory, clone my repository into it, and switch to the stable branch. Feel free to have a look at the code and see what's going on.

Building The Code

First of all we need to run CMake over the project; then we can simply run make.

cd c
mkdir build
cd build
cmake ..
make

Assuming there are no errors from the cmake and make commands, you should now have a file called glview in the kinect/c/build/examples directory. Currently you have to run it as root, as the permissions are not yet set up properly, although this should change in future versions. Now is the point at which you can connect your Kinect to a USB port on your computer. Wait a few seconds and then execute:

sudo ./examples/glview

This should bring up a window containing two displays. On the left is a colourised depth view of Kinect's field of view. On the right is its RGB (standard colour) equivalent.

Occasionally there are initialisation errors in one or other of the cameras, and the view will not be rendered correctly. If that happens, simply press Ctrl+C to exit and try again.

Demo

You're now ready to try out the depth demo that you have just launched. Run it in a clear area with plenty of space. Stand about 2 metres from the Kinect sensor and hold one hand in front of you: a cross should appear on the point that has been detected as nearest to the sensor. If you wave your hand around, you should see the cross tracking this point around the screen. You should also see the x, y and z position values displayed on the screen. This version of the code also dispatches the coordinates via UDP, so that you can use them in another application if you like.
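If you want to pick up those coordinates in another program, a receiver might look something like the sketch below. It is only a sketch under assumptions: the payload format (ASCII "x,y,z") and the helper names parse_coords, open_udp_listener and receive_coords are hypothetical and are not taken from glview.c, so check the sending code for the real port and message layout before relying on it.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* ASSUMPTION: each datagram is ASCII text of the form "x,y,z".
 * The real format is whatever glview.c sends; adjust to match.
 * Returns 1 if three integers were parsed, 0 otherwise. */
int parse_coords(const char *msg, int *x, int *y, int *z)
{
    return sscanf(msg, "%d,%d,%d", x, y, z) == 3;
}

/* Opens a UDP socket bound to `port` on all interfaces.
 * Returns the socket descriptor, or -1 on failure. */
int open_udp_listener(unsigned short port)
{
    struct sockaddr_in addr;
    int fd = socket(AF_INET, SOCK_DGRAM, 0);
    if (fd < 0)
        return -1;

    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(port);

    if (bind(fd, (struct sockaddr *)&addr, sizeof addr) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Blocks until one datagram arrives, then parses it.
 * Returns 1 on success, 0 on a receive or parse error. */
int receive_coords(int fd, int *x, int *y, int *z)
{
    char buf[128];
    ssize_t n = recvfrom(fd, buf, sizeof buf - 1, 0, NULL, NULL);
    if (n < 0)
        return 0;
    buf[n] = '\0';
    return parse_coords(buf, x, y, z);
}
```

A program would call open_udp_listener once with the port the demo sends to, then call receive_coords in a loop and act on each x, y, z triple as it arrives.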

Your Modifications

For this simple proof-of-concept, I have modified the rendering code provided by the OpenKinect developers to add my algorithm in. You can see this in the glview.c file in the kinect/c/examples directory.

The main function in this file initialises Kinect through the driver call. It also registers the two callback functions: one that processes the RGB camera information and renders it to the screen, and one that processes the depth camera data.
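The callback arrangement described above can be illustrated in isolation. The sketch below does not use the real driver API: the types camera_handlers, depth_handler and rgb_handler, and the function deliver_frame_pair, are hypothetical stand-ins that only show the pattern of registering one function per camera and having the driver invoke them as frames arrive.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical callback signatures: one per camera. */
typedef void (*depth_handler)(const uint16_t *depth, int width, int height);
typedef void (*rgb_handler)(const uint8_t *rgb, int width, int height);

/* Hypothetical registration record holding both callbacks. */
typedef struct {
    depth_handler on_depth;
    rgb_handler on_rgb;
} camera_handlers;

int depth_frames_seen = 0;
int rgb_frames_seen = 0;

/* Example depth callback: a real one would analyse and render the frame. */
void count_depth(const uint16_t *depth, int width, int height)
{
    (void)depth; (void)width; (void)height;
    depth_frames_seen++;
}

/* Example RGB callback: a real one would draw the colour image. */
void count_rgb(const uint8_t *rgb, int width, int height)
{
    (void)rgb; (void)width; (void)height;
    rgb_frames_seen++;
}

/* Stand-in for the driver: hands one depth frame and one RGB frame
 * to whichever callbacks have been registered. */
void deliver_frame_pair(const camera_handlers *h,
                        const uint16_t *depth, const uint8_t *rgb,
                        int width, int height)
{
    if (h->on_depth != NULL)
        h->on_depth(depth, width, height);
    if (h->on_rgb != NULL)
        h->on_rgb(rgb, width, height);
}
```

The point of the pattern is that your own processing lives entirely inside the callbacks, so swapping in a different algorithm means changing one function rather than the capture loop.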

The code runs a very simple nearest-point detection algorithm over the input data from the depth camera. In this version, the driver calls back the depthimg function in that file. The first argument is a pointer to a one-dimensional array of 640x480 elements containing the mapped depth data. By iterating over this array to detect the closest item (requiring at least 50 elements within 10 units of depth of each other), it provides very simple nearest-object detection. The data is then passed to the renderer and shown as a colourised depth map on the screen, thresholded to show various colours.
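One possible reading of that criterion can be sketched as follows. This is not the code from glview.c, which may well differ. It assumes that smaller depth values mean closer objects, that values fit in 11 bits, and that the reported position is the centroid of the supporting pixels; the names find_nearest and nearest_point are made up for illustration.

```c
#include <stddef.h>
#include <stdint.h>

#define DEPTH_W 640
#define DEPTH_H 480
#define DEPTH_LEVELS 2048 /* ASSUMPTION: depth values fit in 11 bits */
#define MIN_SUPPORT 50    /* at least 50 elements... */
#define DEPTH_BAND 10     /* ...within 10 units of depth of each other */

typedef struct { int x, y, z; } nearest_point;

/* Finds the nearest depth that is backed by enough pixels to be an
 * object rather than noise. Returns 1 and fills *out, or 0 if none. */
int find_nearest(const uint16_t *depth, nearest_point *out)
{
    unsigned hist[DEPTH_LEVELS] = {0};
    const size_t n = (size_t)DEPTH_W * DEPTH_H;

    /* Count how many pixels report each depth value. */
    for (size_t i = 0; i < n; i++)
        if (depth[i] < DEPTH_LEVELS)
            hist[depth[i]]++;

    /* Walk from near to far; accept the first depth with enough support. */
    for (int z = 0; z < DEPTH_LEVELS; z++) {
        if (hist[z] == 0)
            continue;

        int hi = z + DEPTH_BAND;
        if (hi > DEPTH_LEVELS - 1)
            hi = DEPTH_LEVELS - 1;

        unsigned support = 0;
        for (int k = z; k <= hi; k++)
            support += hist[k];
        if (support < MIN_SUPPORT)
            continue; /* too few pixels: treat as noise */

        /* Report the centroid of the supporting pixels. */
        unsigned long long sx = 0, sy = 0;
        for (size_t i = 0; i < n; i++) {
            if (depth[i] >= z && depth[i] <= hi) {
                sx += i % DEPTH_W;
                sy += i / DEPTH_W;
            }
        }
        out->x = (int)(sx / support);
        out->y = (int)(sy / support);
        out->z = z;
        return 1;
    }
    return 0;
}
```

Using a histogram keeps the work to two passes over the frame, and the support threshold means a single stray pixel close to the sensor is skipped rather than reported as the nearest object.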

You can modify this part of the code to implement your own algorithm. For example, you may wish to implement an edge detection algorithm (there are very strong signals from the depth information and it should be easy to outline people). It would also be fairly trivial to build a new application from scratch on top of the existing architecture (with the callbacks) if desired.
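As a starting point for the edge detection idea, here is a minimal sketch that marks a pixel as an edge when the depth jumps sharply to its right-hand or lower neighbour. The function name depth_edges and the threshold parameter are my own choices for illustration, and a suitable threshold would need to be found by experiment.

```c
#include <stdint.h>
#include <stdlib.h>

/* Sets edges[i] to 1 where the depth differs from the right-hand or
 * lower neighbour by more than `threshold` units, and to 0 elsewhere.
 * Both arrays hold width * height elements in row-major order.
 * Returns the number of edge pixels found. */
size_t depth_edges(const uint16_t *depth, uint8_t *edges,
                   int width, int height, int threshold)
{
    size_t count = 0;

    for (int y = 0; y < height; y++) {
        for (int x = 0; x < width; x++) {
            size_t i = (size_t)y * width + x;
            int here = depth[i];
            int is_edge = 0;

            /* Compare with the pixel to the right, if there is one. */
            if (x + 1 < width && abs(here - depth[i + 1]) > threshold)
                is_edge = 1;
            /* Compare with the pixel below, if there is one. */
            if (y + 1 < height && abs(here - depth[i + width]) > threshold)
                is_edge = 1;

            edges[i] = (uint8_t)is_edge;
            count += is_edge;
        }
    }
    return count;
}
```

Called from the depth callback with the 640x480 frame, the resulting edges array could be drawn over the depth view to outline a person standing in front of the background.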

Conclusions

This should have got you up and running with a sample application built on top of Kinect, and shown you how you can modify the code to run your own applications. There are already some brilliant visualisations and robotics applications being demonstrated with Kinect, and the future looks bright for Kinect development in the open-source community.

About the author

Chris Alexander United Kingdom

Chris is an engineer, developer and technical writer currently studying for a Masters degree in Applied Robotics at the University of Reading. His interests range from building websites to hacki...
