By Paul Miller
October 1, 2012 06:26 AM EDT
The UK newspaper The Guardian has done pioneering work in using data, and in engaging readers to explore that data and share their own insights. The paper’s Simon Rogers and Google’s Kathryn Hurley shared some of the lessons at Strata this morning.
Rough notes follow.
Not going to talk about big projects like the riots, Wikileaks, and MPs’ expenses… Going to talk about the day-to-day process of hacking around with data.
Open data journalism – more than just Google spreadsheets. Much more of a two-way process than simply writing and disseminating stories.
Numbers need context. Journalists need the skills to interpret, probe, and tell a data-backed story.
First rule of what we do – find the key data behind a story and make it public. The Guardian Datablog and Data Store are used to push out data relevant to the main news stories of the week.
Lots of data is available, but it’s locked up in a wide range of data sets. A lot of the Data team’s work involves pulling freely available data together in one place – making it comparable and useful.
Get past raw numbers, and show how they have changed over time. Measures and units and groupings change, so how do you actually compare like with like?
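One possible way to compare like with like, sketched here under stated assumptions: convert raw counts to a rate per 100,000 people, then index each year against a base year. The figures and function names below are invented for illustration and are not taken from the Guardian's data or tooling.

```python
def rate_per_100k(count, population):
    """Express a raw count as a rate per 100,000 people."""
    return count / population * 100_000


def index_to_base(series, base_year):
    """Rescale a {year: value} series so that the base year equals 100."""
    base = series[base_year]
    return {year: value / base * 100 for year, value in series.items()}


# Hypothetical figures, for illustration only.
counts = {2010: 500, 2011: 540, 2012: 600}
population = {2010: 1_000_000, 2011: 1_080_000, 2012: 1_250_000}

# Raw counts rise every year, but so does the population they are drawn from.
rates = {year: rate_per_100k(counts[year], population[year]) for year in counts}
indexed = index_to_base(rates, base_year=2010)

for year in sorted(indexed):
    print(year, counts[year], round(rates[year], 1), round(indexed[year], 1))
```

In this made-up series the raw count grows by 20% between 2010 and 2012 while the per-capita rate falls slightly, which is the kind of gap between raw numbers and context that the notes describe.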
Don’t just rely on the algorithm… Need the knowledge, and the habit of asking questions, to wonder whether a result is too good to be true. Often, it will be wrong.
Olympics… lots of data, but very little was open. IOC sold the data, and refused to allow it to be shared.
Kathryn Hurley at Google… spent the last week working directly with the Guardian team… Learned…
- News drives the stories
- Data journalism moves fast
- Quick and easy tools reign supreme
What does this mean for other businesses?
- Know what matters
- Find the data to back it up (internally, from government, from public data sites, from data markets, etc.)
- Clean the data (a lot! Sometimes just normalisation, sometimes more serious), with a plug for tools like Google Refine
- Sometimes the data you have isn’t enough – find more
- Tell the story – visualisation matters, interactivity helps
- Share the data to support your story – make it available for download, or offer an API
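The "clean the data" step in the list above might look something like the following minimal sketch, which uses only the Python standard library. The sample rows, column names, and helper functions are hypothetical; real data would need rules of its own, and values that cannot be parsed are set aside for a person to check rather than silently dropped.

```python
import csv
import io

# Hypothetical messy input; real data would come from a file or download.
RAW = """Region , Spend (GBP)
 north west ,"1,200"
North West,300
LONDON,"2,500"
london ,n/a
"""


def clean_region(name):
    """Collapse stray whitespace and standardise capitalisation."""
    return " ".join(name.split()).title()


def parse_amount(text):
    """Turn '1,200' into 1200.0; return None when the value is not numeric."""
    try:
        return float(text.replace(",", "").strip())
    except ValueError:
        return None


def clean_rows(raw_text):
    """Return (totals per cleaned region, rows set aside for manual review)."""
    reader = csv.reader(io.StringIO(raw_text))
    next(reader)  # skip the header row
    totals = {}
    skipped = []
    for region, amount in reader:
        value = parse_amount(amount)
        if value is None:
            skipped.append((region, amount))  # keep for a human to look at
            continue
        key = clean_region(region)
        totals[key] = totals.get(key, 0.0) + value
    return totals, skipped


totals, skipped = clean_rows(RAW)
print(totals)
print(skipped)
```

Keeping the skipped rows visible fits the earlier point about not relying on the algorithm alone: the script does the routine normalisation, and anything it cannot interpret is left for a journalist to question.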
Tools need to get easier to use and richer, to let data journalists (and others) get the results they need more quickly, and with less coding.
Published data needs to be more logically formatted… PDFs derived from printed documents are designed for human reading, not for machine processing.
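As a rough illustration of what "logically formatted" could mean in practice, the sketch below writes the same records as CSV (for download) and as JSON (for an API response). The records and field names are hypothetical, and this is only one of many reasonable layouts: one row per observation, one column per variable, and a single header row.

```python
import csv
import io
import json

# Hypothetical cleaned records, for illustration only.
records = [
    {"region": "London", "year": 2012, "spend": 2500.0},
    {"region": "North West", "year": 2012, "spend": 1500.0},
]


def to_csv(rows):
    """Serialise a list of dicts as CSV text with one header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=list(rows[0].keys()), lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def to_json(rows):
    """Serialise the same rows as JSON, for example for an API response."""
    return json.dumps(rows, indent=2)


csv_text = to_csv(records)
json_text = to_json(records)
print(csv_text)
print(json_text)
```

Either output can be read back by a program without any manual retyping, which is the property a PDF derived from a printed document lacks.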
Image of The Guardian’s offices by Flickr user Mark Hillary.

Copyright © 2012 SYS-CON Media, Inc. — All Rights Reserved.
Syndicated stories and blog feeds, all rights reserved by the author.
Paul Miller works at the interface between the worlds of Cloud Computing and the Semantic Web, providing the insights that enable you to exploit the next wave as we approach the World Wide Database. He blogs at www.cloudofdata.com.