The Interface Between the Worlds of Cloud Computing and the Semantic Web

Paul Miller

Surely the computer should do that?

Computer rendering of the Chicago Spire. This is not the current design as of July 12, 2008. (Photo credit: Wikipedia)

We have become accustomed to the simple yet all-powerful search box. ‘Advanced’ search options and arcane query syntaxes have largely been replaced by the learned behaviour of throwing some words at Google*, ignoring the sponsored links, and (usually) finding what we want somewhere in the first 5-10 proper results. A Google search is certainly impressive (especially to those who really remember how poor some of the earlier search engines were), but it remains far from perfect. Do Google’s limitations create a big enough opportunity for others to grab credible market share?

In recent weeks, I’ve received a flurry of information on partial alternatives to Google’s market-dominating search engine. Most appear useful in their own niche, but I doubt even their creators would be surprised to learn that none tempt me to change my Google-powered default search behaviour.

Far more damaging for their prospects, any hope they had of attracting my occasional use is dashed by the very way that they seem to work. They may excel in certain verticals, or in particular types of search, but most make the unfortunate mistake of expecting me to mould my behaviour to them. The pain of remembering how to concoct effective queries for each of these tools far outweighs the gain of their ‘better’ search results, creating a vicious spiral from which they must surely struggle to escape.

Take London-based Sehrch, for example. Behind a name that’s impossible to pronounce or communicate to others (say “Search for that on sehrch,” and 99.99% of those you tell will end up somewhere other than Sehrch’s own site) lies an interesting attempt to bring structure to web search, with a little help from data sources like Freebase and DBpedia.

Google could probably find you buildings with 150 floors (just the Chicago Spire, according to both my first page of Google hits and the trusty Wolfram Alpha), but might struggle to find those that were higher. Sehrch finds 14, and they appear in a neat list that’s free of the other stuff that cluttered my Google results. I’m confused that the Chicago Spire (Sehrch agrees that it has exactly 150 floors) appears in a search that was quite clearly looking for buildings with more than 150 floors, but the other 13 appear to be valid. I’m not a tall building aficionado, so don’t know how many buildings Sehrch failed to find, but it certainly did better than Google (where the results are a mess, and would require careful reading) or even Wolfram Alpha (which reckons there are two ‘notable’ buildings with more than 150 floors). All of the searches returned a mix of actual buildings, planned buildings, and cancelled buildings.

But — and it’s a big but — both Google and Wolfram Alpha were pretty straightforward to search with normal search behaviours. Sehrch was not.

Wolfram Alpha took a perfectly realistic plain-text search for “buildings with more than 150 floors” and interpreted it to arrive at a query that the system could understand and operate upon. Sehrch, on the other hand, expected me to build a query from the 130,342 object properties and 249,777 object types that it understands. Frankly, if this search hadn’t been one of the examples, I doubt that I’d have formulated (type:building) (floors>150) correctly.
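To make that gap concrete, here is a minimal Python sketch of the kind of translation I would rather the computer did for me: it turns one narrow family of plain-text phrases into the structured form Sehrch expects. The phrase table, the regular expression and the function name are all my own assumptions for illustration; this is not how Wolfram Alpha or Sehrch actually parse queries, and a real system would need far more than pattern matching.

```python
import re

# Hypothetical mapping from everyday phrases to comparison operators.
# This is an illustrative assumption, not Sehrch's or Wolfram Alpha's grammar.
COMPARATORS = {
    "more than": ">",
    "at least": ">=",
    "fewer than": "<",
    "less than": "<",
    "exactly": "=",
}

# Matches phrases shaped like "<things> with <comparator> <number> <property>".
PATTERN = re.compile(
    r"^(?P<type>\w+?)s? with (?P<cmp>"
    + "|".join(re.escape(phrase) for phrase in COMPARATORS)
    + r") (?P<value>\d+) (?P<prop>\w+)$"
)


def to_structured_query(text):
    """Return a '(type:x) (prop>n)' style query, or None if the phrase is not understood."""
    match = PATTERN.match(text.strip().lower())
    if match is None:
        return None
    operator = COMPARATORS[match.group("cmp")]
    return f"(type:{match.group('type')}) ({match.group('prop')}{operator}{match.group('value')})"


print(to_structured_query("buildings with more than 150 floors"))
# prints: (type:building) (floors>150)
```

Anything the sketch does not recognise simply returns None, at which point a sensible tool would fall back to an ordinary keyword search rather than ask me to learn its syntax.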

Extracting meaning and structure from data, and making it available to deliver better search results, is a valid and useful thing to be doing. If you want to know about female teenage pop stars from Sweden, Sehrch can give you six. Both Google and Wolfram Alpha might be able to get there too, but I gave up trying to work out how. Sehrch may be returning ‘better’ results, but it’s too different to use.

Wolfram Alpha understands the power of meaning and structure too, but is getting better at hiding the power behind pretty user-friendly queries.

Even Google, the home of brute force computation across the unstructured mess of the Web, recognises the power of meaning and structure, and is doing something about it.

Enter some maths into a Google search box, and you don’t get a list of web pages containing calculators. You get the answer.

Enter a flight code into a Google search box, and you don’t get a list of airline or airport web pages. You get the time the flight is expected to land.

Enter a stock market code into a Google search box, and you don’t get a list of stock exchanges or companies. You get the share price, and a graph showing how it’s changing.

Type ‘showtimes’ into a Google search box, and you get a list of films showing at cinemas near you.

Google is getting better at structure. The company bought Freebase. The company is one of those behind schema.org. It’s investing in Wikidata. Google knows that structure and meaning matter, and it’s applying itself to baking both into the search experience with which users are already familiar. Google is getting better, but it’s improving by doing more to anticipate the user’s needs, not by forcing the user to adopt arcane query syntax.

I use Google every day. For some searches, it’s really not (yet) the best place to answer my query. In those situations, I’ll turn to some other tool. Am I going to turn to one like Wolfram Alpha which works in a very different way, but hides that behind a box that typically takes the queries I’m used to typing? Or am I going to turn to one like Sehrch, which works in a very different way and expects me to work in a different way, too?

Sadly for Sehrch, until it finds a way to hide search syntax from the casual user, all its clever search capabilities are going to go unused. And it’s not alone. As I mentioned at the start, I’ve received pitches from a load of similar companies recently. All are interesting. All expect me to change too much without offering enough benefit in return. All therefore, ultimately, fall short.

Structure is good. Meaning is powerful. But I want the computer to infer, discover, reason and suggest. The last thing I want is to go back to typing arcane search syntax. And I very much doubt that I’m alone.

Note: Yes, I know that other big-name search engines like Bing exist and are broadly comparable to Google in scope and capability. But, honestly, they’ve never demonstrated a compelling reason for me to switch away from Google either. Feel free to substitute the name of your favourite mainstream search engine everywhere I wrote ‘Google’.

Paul Miller works at the interface between the worlds of Cloud Computing and the Semantic Web, providing the insights that enable you to exploit the next wave as we approach the World Wide Database.

He blogs at www.cloudofdata.com.