Beautiful Data: The Stories Behind Elegant Data Solutions

Authors
Toby Segaran, Jeff Hammerbacher
ISBN
0596157118
Published
24 Jul 2009
Purchase online
amazon.com

In this insightful book, you'll learn from the best data practitioners in the field just how wide-ranging -- and beautiful -- working with data can be. Join 39 contributors as they explain how they developed simple and elegant solutions on projects ranging from the Mars lander to a Radiohead video. With Beautiful Data, you will:
* Explore the opportunities and challenges involved in working with the vast number of datasets made available by the Web

Customer Reviews

Thomas W. Gonzalez said
While the content of this book is interesting and informative, I am struck by how lousy the print quality is. For a $40+ book you would expect a hardback, or at least a paperback with thick stock pages and color plates that actually look good. It was hard for me to appreciate the content when it felt like each page (or the cover) was going to rip because the stock is so thin and poor in quality. The color plates are washed out and pixelated. I was expecting the same high quality we got with "Beautiful Code". O'Reilly usually does a much better job. That said, if these types of aesthetics don't bother you (although, given a title like "Beautiful Data", I suspect they would), the book itself is an interesting read.

Donald Park said
This book is a well-assembled collection of academic papers and conference presentations on data mining. I found chapter 4, Cloud Storage Design, to be the most interesting in its description of Facebook's extensive use of Hadoop. Chapter 8 introduced me to the concept of WebSockets in HTML5.

Ira Laefsky said
From the title, I might have guessed that this was another pretty coffee table book on Information Visualization--basically, an art book unless you already had the insight and talent to apply its principles to your own work in Data Representation. But I should have expected (and I did receive) much more from O'Reilly's efforts in this domain. While the book is indeed beautiful, it more importantly provides a set of carefully described case studies in all phases of the data capture, processing, analysis, communication and visualization life cycle. Detailed descriptions are given of the motivation and design of data capture, analysis and design systems in fields as diverse as personal energy consumption (and carbon footprint), Mars explorer robotics, high-quality market research, interpretation of U.S. Census statistics, and the visualization of DNA databases.

The case study methodology points out the necessity of designing all phases of the data capture, processing, analysis and representation process around the goals, open questions and constraints of the client organization, or user/consumer of the data whose decisions are being informed. The thinking and design process behind these cases of beautiful data is fully described--this will enable you (or an untalented artist such as myself) to design systems which answer the questions and support the decisions of the individual or organization that needs this data.

--Ira Laefsky

Techie Evan said
This book tells you what's possible now and what's on the horizon when it comes to data representation, collection, management, processing, analysis, sharing, and display. Very little code is provided because each chapter is mostly a conceptual discussion of approaches to tackling various kinds of challenges involving data, the lifeblood of any application. My favorite chapters are: 4, 5, 7 and 20. Below are my short notes for each chapter to give you some idea of the book's contents.

Ch. 1 Seeing Your Life in Data by Nathan Yau
Hoping to better understand their impact on and exposure to the environment, participants in one of Yau's projects download software onto their phones that then uploads GPS data to servers as they go about their daily activities. One of Yau's early challenges was to summarize the data and make it meaningful to the participants: for example, what does it mean to emit 1,000 kilograms of carbon in a week? What he found helpful and not so helpful in data visualization is instructive.

Ch. 2 The Beautiful People: Keeping Users in Mind When Designing Data Collection Methods by Jonathan Follett and Matthew Holm
When there is no explicit profit to be made, how do you convince a person to take the time to answer your survey questions?

Ch. 3 Embedded Image Data Processing on Mars by J.M. Hughes
Like everything else onboard a spacecraft, the computing system is custom built with minimalism and other stringent specifications (e.g., withstanding radiation) in mind. How does one harness limited resources to get the job done?

Ch. 4 Cloud Storage Design in a PNUTShell by Brian Cooper, Raghu Ramakrishnan, and Utkarsh Srivastava
Yahoo! engineers have a very challenging job. Web pages containing potentially complex social data must load and update quickly regardless of where the data may be mastered in servers distributed across the world. Learn why they jettisoned some conventional database concepts in favor of flexible schemas, timeline-consistency-driven data updates, etc.

Ch. 5 Information Platforms and the Rise of the Data Scientist by Jeff Hammerbacher
The author mentions that according to IDC, the digital universe will expand to 1,800 exabytes by 2011 (1 exabyte = 1 billion gigabytes) and the vast majority of that data will not be managed by relational databases. The Facebook Information Platform described in this chapter can manage structured and unstructured data in an integrated manner, and can extract useful information from terabytes of data in seconds. Similar platforms built at Fox Interactive Media and Microsoft are also described briefly.

Ch. 6 The Geographic Beauty of a Photographic Archive by Jason Dykes and Jo Wood
The Geograph British Isles Project aims to collect geographically representative photographs and information for every square kilometer of Great Britain and Ireland. Learn new data visualization techniques!

Ch. 7 Data Finds Data by Jeff Jonas and Lisa Sokol
Technologies similar to those already used in, say, fraud surveillance can be adapted for other more mundane applications.

Ch. 8 Portable Data in Real Time by Jud Valeski
How can companies facilitate the sharing of and access to social data without having to invest in an inordinate amount of infrastructure?

Ch. 9 Surfacing the Deep Web by Alon Halevy and Jayant Madhavan
Web content hidden behind HTML forms is part of the Deep Web, which search engines have not indexed very well, though that may partially change soon.

Ch. 10 Building Radiohead's House of Cards by Aaron Koblin with Valdean Klump
The author helped produce a video for the music group entirely from visualization of data, without the use of cameras or lights. Google Code URLs are given. You gotta see the interesting video!!

Ch. 11 Visualizing Urban Data by Michal Migurski
Learn how to visualize trends in urban crime using maps and data mashups.

Ch. 12 The Design of Sense.us by Jeffrey Heer
The combination of interactive visualization and social interpretation can help an audience more richly explore a data set.

Ch. 13 What Data Doesn't Do by Coco Krumme
Data doesn't stand alone. In real-world decision-making, information is rarely packaged neatly and data isn't free from interpretive biases.

Ch. 14 Natural Language Corpus Data by Peter Norvig
Natural language tasks like word segmentation or spelling correction can be handled using probabilistic models built from processed large data sets.

Ch. 15 Life in Data: The Story of DNA by Matt Wood and Ben Blackburne
The human genome has been well annotated and 40 other species have been sequenced. With each new discovery, however, more questions are raised, and more research data is generated. The need for efficient sequence search, alignment, and assembly tools, as well as safe housing for the millions of genomes, will continue to grow. Learn how scientists are rising to the challenge.

Ch. 16 Beautifying Data in the Real World by Jean-Claude Bradley, et al.
How online publishing of scientific data can be improved upon.

Ch. 17 Superficial Data Analysis: Exploring Millions of Social Stereotypes by Brendan O'Connor and Lukas Biewald
Ch. 18 Bay Area Blues: The Effect of the Housing Crisis by Hadley Wickham, Deborah F. Swayne, and David Poole
Ch. 19 Beautiful Political Data by Andrew Gelman, Jonathan P. Kastellec, and Yair Ghitza
These chapters show you data analyses in action: how to prep data, smooth out the effects of noisy or outlier data, etc.

Ch. 20 Connecting Data by Toby Segaran
We need to break down information silos, but how? The use of Semantic Web and/or Collective Reconciliation techniques is discussed.
