UK academics today got an inside look at the way that Microsoft’s Xbox Kinect accessory works thanks to a day of presentations from the minds who developed the software.
The development of the Kinect device was split 50/50 between hardware and software: PrimeSense supplied the all-important depth sensor that sits on the front of the device, but it was some ingenious development by Microsoft’s research division in the UK and the US that turned it into the device that we know today on the Xbox as well as on the PC with the brand new Windows SDK.
According to Andrew Fitzgibbon of Microsoft Research in Cambridge, UK, the Kinect project was handed over to his team in September 2008 with an astonishing budget of nearly $1 billion for the entire operation. The Xbox team had started building the skeletal tracking technology, the core software that Xbox games and now Windows PC applications rely on, but had problems scaling it: it needed to run at 30 frames per second on Xbox hardware that dates from 2005.
The MSR team took a machine learning approach to the problem, in some cases using algorithms first invented in the 1960s (such as the decision tree, used to locate probable joint positions), so that the skeleton tracking could run fast enough and be parallelised across the Xbox graphics card. The end result: accuracy of over 70%, running at 5 ms per frame in the production SDKs for Xbox.
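To give a feel for why a decision tree suits this kind of job, here is a minimal sketch in Python. It is not Microsoft's code: the depth-difference test, the thresholds, the labels and every name in it (`Split`, `Leaf`, `classify_pixel`, `TREE`, `DEPTH`) are assumptions made purely for illustration. What it does show is the general idea: each pixel of a depth image is pushed down a tree of cheap comparisons until it reaches a label, and because no pixel depends on any other, the work can be spread across many processors at once.

```python
from dataclasses import dataclass
from typing import List, Tuple, Union

BACKGROUND = 10.0  # assumed depth (metres) for empty space and out-of-frame probes


@dataclass
class Leaf:
    label: str  # hypothetical body-part label


@dataclass
class Split:
    offset: Tuple[int, int]  # (dx, dy) of the neighbouring pixel to probe
    threshold: float         # depth difference (metres) that decides the branch
    near: "Node"             # taken when the difference is below the threshold
    far: "Node"              # taken otherwise


Node = Union[Leaf, Split]


def depth_at(depth: List[List[float]], x: int, y: int) -> float:
    """Read a depth value, treating anything outside the frame as background."""
    if 0 <= y < len(depth) and 0 <= x < len(depth[0]):
        return depth[y][x]
    return BACKGROUND


def classify_pixel(tree: Node, depth: List[List[float]], x: int, y: int) -> str:
    """Walk one pixel down the tree; the result depends only on local depth values."""
    if depth[y][x] >= BACKGROUND:
        return "background"
    node = tree
    while isinstance(node, Split):
        dx, dy = node.offset
        diff = depth_at(depth, x + dx, y + dy) - depth[y][x]
        node = node.near if diff < node.threshold else node.far
    return node.label


def classify_frame(tree: Node, depth: List[List[float]]) -> List[List[str]]:
    """Label every pixel independently (the part a graphics card could parallelise)."""
    return [[classify_pixel(tree, depth, x, y) for x in range(len(row))]
            for y, row in enumerate(depth)]


# Toy tree: if the pixel directly above is at a similar depth, call this pixel
# "torso"; if there is a big jump to the background, call it "head".
TREE = Split(offset=(0, -1), threshold=0.5, near=Leaf("torso"), far=Leaf("head"))

# Toy 3x3 depth image: a one-pixel-wide "person" two metres from the sensor.
DEPTH = [
    [10.0, 2.0, 10.0],
    [10.0, 2.0, 10.0],
    [10.0, 2.0, 10.0],
]

labels = classify_frame(TREE, DEPTH)
```

In this toy frame the top pixel of the figure comes out as "head" and the two beneath it as "torso". A real system would presumably learn far deeper trees from training data rather than hand-write one, and would still need a further step to turn per-pixel labels into joint positions.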
However, the researchers were extremely quiet on how located bodies are mapped to skeletal frames: this area of the research remains a closely guarded secret within Microsoft, and the researchers seem to want to keep it that way.
But it’s not just Kinect’s past that was revealed at the event at Microsoft’s UK base of operations at Thames Valley Park in Reading: some hints were dropped about what we could see from future Kinect software and hardware iterations. Future software might be trained to cope with detecting people sitting at close range, as if the Kinect were placed on the desk in front of them at their computer. There were also hints that tracking would eventually go down to the finger level; at the moment it only goes to the hand level, apart from simple hand closed / hand open data. The SDK may also eventually evolve to include face detection as well as the ability to detect emotions.