The Interface Between the Worlds of Cloud Computing and the Semantic Web

Paul Miller

In a world of niche Clouds, how do you define a useful niche?

There are a couple of interesting posts on the blog of the UK’s FleSSR project, detailing their efforts to work out how feasible it might be to offer a new Cloud service to universities. More on that in a moment.

I don’t think I’ve ever really been convinced by the argument that everything will end up in the data centres of Amazon.

The straightforward provision of commodity Cloud Computing is an important – and growing – area, and one that will continue to expand as interfaces become simpler, FUD is challenged, and prices maintain their relentless march towards the bottom. Everyone has something they could usefully, sensibly, and cost-effectively run in a commodity Cloud such as those offered by Amazon, Rackspace, Flexiant, and others. In this space, basic stability, security and reliability combine with a compelling – and diminishing – pricing proposition to create commodity services targeted squarely at lowest common denominator functionality. Here, market forces may (inevitably?) lead to an eventual reduction in the number of providers. Cost, although not the only consideration, is both important and compelling. Although markets like competition, there may even be a single winner here, one day.

Layered all around the basic, routine, grunt-work computation that these commodity public clouds handle so well, many organisations find themselves having to cope with a wide range of other use cases and data sets. Some require specialist hardware (like the GPUs that Amazon has recently begun selling access to). Some demand particular regulatory and legislative hoops to be jumped through. Some have quirky requirements around latency in data transfer or speed of in-CPU processing. Some have lots of data, and issues with regard to getting the stuff from one location to another with a sensible balance between transfer cost and time.

All of these are certainly capable of being addressed in the Cloud, but the economics and the business rationale begin to shift. For the data owner, cost may no longer be quite so significant a factor. Reliability may matter more, or speed, or the audit trail. For the Cloud provider, these requirements no longer look like the lowest common denominator. It’s not cost-effective to provide these capabilities to everyone and still keep the price low. It becomes more sensible to segment, to divide, and to create bespoke offerings of various kinds. Some of these services require such specific things in terms of network topology, physical building layout, and staff expertise that it may even become counter-productive to have these services in the same building as the commodity Cloud. Here, there’s plenty of room for new entrants, plenty of scope for competition, and plenty of opportunity to differentiate in terms of price, location, support, and a host of other factors. This segment of the Cloud is only just getting started.

In these contexts, we see compelling arguments made for on-premise private clouds, off-premise private clouds, hybrid clouds, community clouds and the rest. Some of the arguments made in favour of private and hybrid certainly are part of the FUD we see in this space, but beneath the noise, the security scares, and the vested interests of SysAdmins and sellers of data centre components, there lies a grain of truth. Not everything is most sensibly run on a cheap VM, rented from Amazon (or Rackspace, or whoever) with your credit card, and physically located half way round the planet.

Unfortunately, it can be difficult to make sensible decisions about which type of cloud works best in each situation, and large swathes of the market are doing everything in their power to add to the confusion.

Having accepted that the basic offering from a public cloud provider is not the solution for my particular requirements, where do I turn next?

Do I listen to the (convincing) pitch from a vendor of ‘community cloud’ solutions for my domain? If I’m in Healthcare, they come with HIPAA and European Data Protection Directive compliance, and all sorts of other accreditations. For dealing with sensitive patient data, this may be just what I need… but does the wily salesman also persuade me to run staff email and the hospital volleyball club website on this over-specified (and expensive) infrastructure?

Do I listen to the (convincing) pitch from a vendor of virtualisation software? If I’ve got a reasonably sized data centre with some life left in it, I may see the value of virtualising all of that expensive hardware, and running current applications in house more efficiently. But instead of gradually reducing my in-house costs, do I continue to add more machines as current ones reach end of life, or as new requirements come along?

Do I listen to the (convincing) pitch from my co-location facility, which happily sells me a ‘private cloud’ that may fail to deliver some of the economies of scale so central to the main Cloud proposition?

Do I listen to the horror stories, stick my head in the sand, and simply keep ordering servers until every single one of my competitors undercuts my costs and I go out of business?

These, and more, are certainly possible. But let’s return to that UK project I mentioned right at the start.

Flexible Services for the Support of Research (FleSSR) is

“a new cloud pilot project looking at utilising hybrid private-public IaaS cloud infrastructure to provide computational and data services to the academic research community. The project is a collaboration between the Oxford e-Research Center, IT Service @ University of Reading, e-Science Centre @ STFC, Eduserv, EoverI, Eucalyptus INC and Canonical Ltd.”

The ten month project is funded by the Joint Information Systems Committee (JISC), an organisation that supports the innovative use of IT across UK universities.

Now, to a degree, the project’s mindset must be influenced by its partners. IT staff at Reading and STFC are incumbents with turf to protect (or new vistas to discover, map, and claim). Eduserv has a new data centre that they’d like to fill with willing clients. It would be easy to be cynical, but knowing some of the people involved, I see no real reason to be. It is perfectly reasonable to suggest that a ‘community’ the size of UK Higher Education would realise value in replicating less (not nothing) at every university campus across the country, and bringing much of that together in some sort of Cloud. That Cloud might use public infrastructure, or it might be served up from an organisation such as Eduserv, which is known to the community, aware of the community’s requirements, quirks and foibles, and (importantly) not-for-profit (and therefore cheaper?).

Personally, I’d always rather presumed that an organisation like Eduserv (or JISC itself) would act on behalf of the community to procure a competitive price on access to the resources of Amazon, Rackspace, or one of the others. I’m not convinced that most UK research computation needs any sort of special treatment that couldn’t be met from Amazon’s Dublin data centre… unless the community itself can somehow beat – and continue to beat – Amazon on price. Somewhat surprisingly, that’s exactly what some calculations in two posts by Eduserv’s Andy Powell suggest could happen. By including network costs and other charges over and above the basic storage cost, Andy finds Amazon, Rackspace and Dropbox to be more expensive than anticipated, and posits that Eduserv (connected to every UK university free of charge via JISC’s high speed JANET service, and constrained in the ways it can generate profit from services sold to universities by its charitable status) might actually work out cheaper.
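To make the shape of that comparison concrete, here is a minimal sketch of a 'total cost' calculation. Every name and number in it is a placeholder of mine, chosen only to illustrate the effect; none of it is taken from Andy's posts or from any provider's real price list.

```python
from dataclasses import dataclass


@dataclass
class StorageOffer:
    """Hypothetical price list. All figures are placeholders, not real tariffs."""
    name: str
    storage_per_gb_month: float   # headline storage price
    egress_per_gb: float          # charge for data transferred out
    fixed_per_month: float = 0.0  # any flat fee (support, minimum spend)


def monthly_cost(offer, stored_gb, egress_gb):
    """Total monthly cost: storage plus outbound transfer plus flat fees."""
    return (offer.storage_per_gb_month * stored_gb
            + offer.egress_per_gb * egress_gb
            + offer.fixed_per_month)


def cheapest(offers, stored_gb, egress_gb):
    """Name of the offer with the lowest total cost for this workload."""
    return min(offers, key=lambda o: monthly_cost(o, stored_gb, egress_gb)).name


# Assumed, illustrative numbers: the commodity offer has the lower headline
# storage price, the community offer has no transfer charge.
commodity = StorageOffer("commodity cloud", 0.10, 0.15)
community = StorageOffer("community cloud", 0.12, 0.0)

if __name__ == "__main__":
    for egress_gb in (0, 500):
        winner = cheapest([commodity, community], 1000, egress_gb)
        print(f"1000 GB stored, {egress_gb} GB out: {winner}")
```

With these made-up numbers, the offer with the cheaper headline price wins only while little data leaves the service. Once transfer charges are counted, the ranking flips, which is the general pattern Andy's posts point towards.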

There’s a lot of work to do in terms of fleshing out the assumptions behind some of Andy’s figures, but the whole industry certainly benefits when people conduct exercises like these out in the open, for all to see. If Andy has made mistakes, the vendors should be quick to jump in and correct them. If his assumptions miss the mark, public debate can redress the balance.

The Cloud is not all about price. But more transparency around the true cost of computing in the Cloud – and in your data centre – means that we can all make more informed decisions.

Thanks for sharing, Andy – and hopefully readers will be willing and able to look over your calculations and share their own views.

Note: this post was conceived and written in the United Kingdom. By reading this post you agree to comply with UK usage, and will henceforth pronounce the word ‘niche’ from the title as ‘neesh,’ not ‘nitch.’ Or maybe not.

Paul Miller works at the interface between the worlds of Cloud Computing and the Semantic Web, providing the insights that enable you to exploit the next wave as we approach the World Wide Database.

He blogs at www.cloudofdata.com.