Cloud Computing, Data and Corporate Trust
November 13, 2008
The Cloud is here and it’s here to stay. The financial and reputational bets currently being made on the future of Clouds mean that, like our surviving banks, they’re just too big to fail. But this is a good thing. It presents some fantastic new opportunities for startups to get their ideas in front of the world quickly and cheaply and then either “scale fast” or “fail fast”.
But what about the bigger, more established and more traditional organisations, those with a standing investment in “on-premises” computing ? Where can they benefit from Cloud computing ? Cloud computing, for all the opportunities it provides, poses some interesting problems around the issue of data.
All of the current cloud platforms (Microsoft Azure, Amazon EC2, Force.com and Google) provide data storage. Interestingly, they have mostly gone for hierarchical entity storage rather than traditional relational tables. It may be that relational data has finally met its match in the Cloud. Hierarchical entity data definitely seems more scalable and easier to federate than relational tables, and we have to consider that the currently favoured technique of data access, RESTful AtomPub making individual entities addressable, may have been a significant factor in that decision. However, as a corporation wanting to scale an application into the cloud rather than on-premises, I would have a number of issues :
- How can I scale my applications in the Cloud if the business data they require is still “on-premises” behind my corporate firewall ?
- How can I take that data and put it out in the cloud where it will scale with, and be available to, my applications without running into huge legal, privacy, governance and compliance issues ?
- Even if I were allowed to put my data in the Cloud, do I trust my Cloud provider with it ?
- How can I make sure that the data in my LOB applications and its availability are protected from the peaks and surges of internet activity ?
- How can I make sure that data in my LOB applications is in sync with data out in the Cloud ?
- Although putting my data in the cloud relieves me of having to worry about where (physically and therefore geographically) it is stored, as a corporation I actually have to care and it has serious legal implications. Nearly every country and jurisdiction has different laws concerning the storage, security and management of data of different kinds. The Cloud is a compliance nightmare.
The truth is that we will have to start thinking about our corporate data differently. Anyone hosting a website of any size or scale probably already uses Content Delivery Networks and internet caches to host “static” content and remove unnecessary load from their servers so we already understand that content can be divided into the cacheable and the dynamic. Broadly speaking most corporations, or at least nearly all retail organisations, have two kinds of data. The first kind we can call private data and this is the kind of data that is unsuitable for cloud storage for a number of reasons and it includes customer data and sales or other transactional data. The other kind we can call public data which, if you think about it, is ideally suited to Cloud storage. It includes inventory, pricing and availability data, i.e. what stuff do I sell, how much is it and how much of it have I got available.
Most retail organisations already spend a lot of time and money trying to publish this data on their web sites and via B2B interfaces. A retail organisation has to service thousands of listing, availability and pricing requests, not to mention B2B feeds and web site scrapers, and while those requests form the bulk of the load on their web sites and back-end systems, it’s the relatively infrequent order requests that actually make all the money.
Surely it makes sense to take all the inventory, pricing and availability data and push it out into the cloud along with most of the eCommerce web application. This data is also perfectly suited to storage as hierarchical entities with attributes.
Department -> Category -> Sub-category -> Item[name, price, units available, etc]
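As a rough illustration of why this shape suits entity storage, here is a minimal in-memory sketch in Python. It is not the API of any of the cloud platforms mentioned above. The names `Item`, `Node`, `put_item` and `get_item`, the slash-separated addresses and the sample product are all assumptions made up for the example. The point is only that each item ends up with its own address, much as an individually addressable entity would.

```python
from dataclasses import dataclass, field


@dataclass
class Item:
    name: str
    price: float
    units_available: int


@dataclass
class Node:
    """A Department, Category or Sub-category: a named container."""
    name: str
    children: dict = field(default_factory=dict)  # child Nodes keyed by name
    items: dict = field(default_factory=dict)     # Items keyed by name


def put_item(root, path, item):
    """Store an item under a path such as 'Garden/Tools/Spades'."""
    node = root
    for part in path.split("/"):
        # Create each level of the hierarchy on first use.
        node = node.children.setdefault(part, Node(part))
    node.items[item.name] = item


def get_item(root, address):
    """Resolve an address such as 'Garden/Tools/Spades/Border Spade'."""
    *path, item_name = address.split("/")
    node = root
    for part in path:
        node = node.children[part]  # KeyError if the level does not exist
    return node.items[item_name]


# Hypothetical sample data.
catalogue = Node("catalogue")
put_item(catalogue, "Garden/Tools/Spades", Item("Border Spade", 24.99, 12))
spade = get_item(catalogue, "Garden/Tools/Spades/Border Spade")
print(spade.name, spade.price, spade.units_available)
```

Nothing here needs a join: a listing, price or availability request is a walk down the tree to one entity.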
Most eCommerce web applications are a hostable workflow, built around a state-machine (the shopping basket) and exposed via web pages. All of this could and should sit out in the cloud right up to the final step of the workflow when the customer can be brought to a secure payment portal where the contents of their basket can undergo a final availability and pricing check before the purchase transaction is committed. Why should I, as a retailer, want to buy, build and host systems capable of supporting thousands of browsing, fickle shoppers many of whom wander off in the middle of shopping, when for a fraction of the cost I can host a system just for interacting with the really valuable ones – the buyers – and relieving them of their customer data and money ? I don’t want to come across all mercenary here but we are all operating in tough times.
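The basket workflow described above could be sketched roughly as follows, again in Python and again under assumptions of my own. The three states, the `Basket` class and the `authoritative` dictionary (standing in for the retailer’s on-premises pricing and stock data) are hypothetical, not taken from any real eCommerce product. The browsing part can live entirely in the cloud, and only `checkout` needs to consult the data behind the firewall.

```python
from enum import Enum


class BasketState(Enum):
    BROWSING = "browsing"
    CHECKOUT = "checkout"
    PAID = "paid"


class Basket:
    def __init__(self):
        self.state = BasketState.BROWSING
        self.lines = {}  # item name -> (quantity, price seen in the cloud copy)

    def add(self, name, quantity, cloud_price):
        if self.state is not BasketState.BROWSING:
            raise ValueError("basket is no longer open for changes")
        self.lines[name] = (quantity, cloud_price)

    def checkout(self, authoritative):
        """Final check against on-premises data: name -> (price, units)."""
        changes = []
        for name, (quantity, seen_price) in self.lines.items():
            price, units = authoritative[name]
            if units < quantity:
                changes.append(f"{name}: only {units} available")
            elif price != seen_price:
                changes.append(f"{name}: price is now {price}")
        if not changes:
            # Only a basket that passes the check may proceed to payment.
            self.state = BasketState.CHECKOUT
        return changes

    def pay(self):
        if self.state is not BasketState.CHECKOUT:
            raise ValueError("basket has not passed the final check")
        self.state = BasketState.PAID


# Hypothetical usage: the cloud copy and the on-premises data agree.
basket = Basket()
basket.add("Border Spade", 2, 24.99)
problems = basket.checkout({"Border Spade": (24.99, 12)})
if not problems:
    basket.pay()
print(basket.state, problems)
```

If `checkout` returns a non-empty list the basket stays in the browsing state, which is the moment the retailer can offer the shopper something else.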
The only issue this leaves is the one of synchronisation. I need to keep availability and prices up to date. This should not be a problem. Using either plain AtomPub or the Cloud’s Service Bus I can replicate purchase events and price change events into my Cloud data fairly easily. Even if the data gets a bit out of date, both retailers and customers already accept this kind of latency in online shopping. We know that since shopping is often a long-running workflow process, items that were in stock when selected may have sold out by the time we reach checkout. We understand that, with limited time offers and price banding, products may have changed in price between being added to the basket and being paid for. If things have changed at the checkout I simply have an opportunity to cross-sell or up-sell that customer on other products, something I already do on my website anyway. Therefore, I would suggest that for the majority synchronisation is not as problematic as one might suspect.
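To make the replication idea a little more concrete, here is a small Python sketch of applying purchase and price change events to the cloud copy. It does not use AtomPub or any Service Bus API. The event fields (`item`, `type`, `seq`, `quantity`, `price`) and the per-item sequence number are assumptions of mine, added so that a late or duplicated event does no harm.

```python
def apply_event(cloud_items, event):
    """Apply one replicated event to the cloud copy of the catalogue.

    Returns True if the event changed the data, False if it was skipped
    because the item has already seen this sequence number or a later one.
    """
    item = cloud_items[event["item"]]
    if event["seq"] <= item.get("seq", 0):
        return False  # duplicate or out-of-date event: ignore it
    if event["type"] == "purchase":
        # Never let the published stock level go below zero.
        item["units_available"] = max(0, item["units_available"] - event["quantity"])
    elif event["type"] == "price_change":
        item["price"] = event["price"]
    else:
        raise ValueError(f"unknown event type: {event['type']}")
    item["seq"] = event["seq"]
    return True


# Hypothetical cloud copy and event stream.
cloud_items = {"Border Spade": {"price": 24.99, "units_available": 12}}
events = [
    {"item": "Border Spade", "type": "purchase", "seq": 1, "quantity": 2},
    {"item": "Border Spade", "type": "purchase", "seq": 1, "quantity": 2},  # redelivered
    {"item": "Border Spade", "type": "price_change", "seq": 2, "price": 19.99},
]
results = [apply_event(cloud_items, e) for e in events]
print(results, cloud_items["Border Spade"])
```

The cloud copy is allowed to lag behind, because the final check at the checkout is what protects the actual transaction.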
This change, particularly in eCommerce, is inevitable. Not only that, but the retailers who make the move and make it soonest will enjoy the same advantages over those who don’t as all of the online retailers enjoy over the traditional store retailers. The Cloud will provide the same cost advantages over on-premises hosting that online retail enjoyed over store retail. If you’re an online retailer today, get ready for the change!