Microsoft has announced its plans for “Big Data” at the PASS Summit in Seattle. Big Data is a growing movement among web developers and information junkies to enable the next generation of data storage and retrieval required by the ever-growing volume of digital information.
Among announcements about SQL Server 2012 and more, the most surprising and perhaps the most interesting of the day is that Microsoft has been working on distributions of Apache Hadoop compatible with Windows Server and Windows Azure. Hadoop's development was inspired by technical papers on MapReduce and the Google File System, internal systems developed (and kept proprietary) by Google. Hadoop is written in Java and is used by all kinds of companies to store large volumes of data distributed across servers. It is rumoured that Facebook has the largest Hadoop cluster in the world, topping out at 30 petabytes of storage.
“The new addition of an Apache Hadoop-based distribution for Windows Azure and Windows Server is the next building block, seamlessly connecting all data sizes and types”, writes Microsoft’s Corporate Vice President for the Business Platform Division, Ted Kummert. “Coupled with our new investments in mobile business intelligence, and the expansion of our data ecosystem, we are advancing data management in a whole new way.”
No timeframe has been given for when Azure developers can expect Hadoop to become available; however, the technical team has noted that Hadoop jobs written for the existing versions will be directly compatible with the Azure distribution.