The Interface Between the Worlds of Cloud Computing and the Semantic Web

Paul Miller

How Open is ‘Open’?

There has been a recent burst of enthusiasm for making raw data produced by and for Government more ‘open,’ and this must surely be welcomed. Long-running grass-roots efforts such as Tom Steinberg’s mySociety and The Guardian’s Free Our Data campaign continue to innovate, but in an environment that is suddenly more receptive to their ideas. Edge-case adoptions of RDFa and other ‘semantic’ specifications, perhaps, are at last moving beyond being merely the preserve of a few isolated enthusiasts.

Sir Tim Berners-Lee now walks the corridors of power in London and Washington, and elected officials (and even the Opposition parties) at least claim to be listening to his call for ‘Raw Data, Now!’ and his talk of Linked Data, URIs, and the rest. We have come a long way, but we have much further still to go.

‘Open’ and ‘Transparent’ Government is nothing new. It’s been talked about for a very long time, and there has been some progress. Part of the issue, I think, comes down to interpretations of ‘open.’ Just because it’s possible to download some Government data doesn’t necessarily mean it’s practical for most interested parties to do so.

If a national library puts all of its catalogue online for free, but requires you to query it via an obscure industry protocol, is that ‘open’? If it then throttles access so that it would take an inordinately long time to ‘copy’ the catalogue, is that ‘open’?

If a National Statistics agency makes all of its research freely available, and provides access to thousands of opaquely named CSV files by listing them on a web page, is that ‘open’?

If a Government department makes all its research reports available online as Microsoft Word files, is that ‘open’?

A purist might strenuously assert that none of these are ‘open.’ Most, certainly, are far from ideal… but they still serve a real purpose in making the innards of Government more accountable. How good should be good enough in 2009?

Going the other way, does a Health Authority have to make my medical records visible to the world before it can be called ‘open’? It seems almost unthinkable, but extreme viewpoints do have an annoying habit of quickly becoming that absurd.

The current enthusiasm for ‘Open’ is closely associated with Tim Berners-Lee’s talk of Linked Data and the newly pragmatic Semantic Web, and Berners-Lee provided a short note last week on his current views. Contrast Tim’s discussion of the ways in which Government data should be linkable with the Sunlight Foundation’s attack on the US Federal Government’s transparency flagship, Recovery.gov, for not making any real data available in the first place.

If we can’t even get the existing raw data out of Government as often as we’d like, there’s a long way to go before Berners-Lee’s grander vision can be achieved. He recognises this, of course, writing:

“Government data is being put online to increase accountability, contribute valuable information about the world, and to enable government, the country, and the world to function more efficiently. All of these purposes are served by putting the information on the Web as Linked Data. Start with the ‘low-hanging fruit’. Whatever else, the raw data should be made available as soon as possible. Preferably, it should be put up as Linked Data. As a third priority, it should be linked to other sources. As a lower priority, nice user interfaces should be made to it — if interested communities outside government have not already done it.”

(my emphasis)

To get much further, and to make that progress sustainable, there needs to be a very real shift in attitudes at the heart of Government. Openness (of data or anything else) shouldn’t be a tactic to distract from worse news elsewhere, or a short-lived knee-jerk response to the latest embarrassment. Rather, it should be a deep-seated presumption that underpins policy, systems design and more.

Data from Government should, quite simply, be freely and easily available. As a matter of course, and without prevarication. Unless there is a compelling reason to do otherwise.

For all the talk of ‘open,’ that is very far from being true today. The presumption is ‘closed.’ The mindset is (largely) ‘closed.’ ‘Open’ has to be fought for, and ‘Open’ has to be justified. ‘Open’ has to be championed, endlessly, tirelessly, thanklessly.

The exact opposite should be true. Then (and maybe only then?) Berners-Lee and his colleagues can build something wonderful.

