01 July 2009

Cloud Sweet Cloud

What's this cloud thing? Where is it? What does it do? Why should I bother? Why should I care?
I care because I'm a geek, and any new-tech stuff simply gets my unconditional attention, but aside from that I have recently found more reasons to care about it.
I am on this project with a fairly complex project whose main characteristics are:
  • zero client entry points (all web-based)
  • multi-tier architecture
  • inter-layer messaging based on HTTP or HTTPS
  • scalability is achieved by simply adding more modules
  • pretty much any kind of load balancing solution can be implemented at the client entry point
When we implement it, we figure out how many concurrent users we might expect, and size the implementation accordingly, in terms of number of processors needed, amount of memory needed, dedicated bandwidth, etc.

The key to any implementation is, indeed the number of concurrent users, but guess what... in many cases getting the project number of concurrent users is like choosing the numbers to play at the lottery: a wild guess. I've seen the same happening many through my career. It's just the way it goes.

You can call the country or regional manager, ask about the market projections for the first year, then talk with the marketing guys, see if there have been pre-registrations etc, and finally they all agree with a ballpark estimate of 'N' concurrent users. Perfect! What's the confidence level on this? Hmmm... they say pretty good, with a 20% floor and a 50% ceiling. Excellent!

So what do you do? You size for N x 2 concurrent users and proceed with the implementation. Then, of course demand will be way over the projections and your business unit will have to rush new equipment to take care of it, but until it does it will effectively lose business, and inevitably people will start pointing fingers...
Maybe sizing for N x 2 is not enough, so what do you do? Size for N x 3? N x 5? And what if the actual demand end up sticking to the projections or be less? You then have a lot of spare capacity that pulls down the return on investment. So what's the magic multiplier for sizing a projection of N concurrent users?

Of course, there is no such thing. But wait a minute...

Enter the cloud.

There have been a lot of discussion about 'what' actually is 'the cloud'. To me, there is no such thing as 'the cloud'. There are service providers. Some of the solutions are geared towards running web applications, like Google App Engine or Windows Azure. Other solutions are geared towards storage and backup, like Nirvanix. Some other solutions are geared towards data centre resources outsourcing, like Amazon EC2.
Amazon EC2 is the stuff I've been playing with and it's brilliant. The cost structure is somewhat exotic: you pay for CPU/hour, TB/month transfer rates, GB/month storage rates, etc. However, once you get to grips with this model I find it a lot more deterministic than traditional costing. A systems architect will have a good idea about these figures for system usage, so budgeting should not be a serious problem. But budgeting and costing is not why I'm a big fan of this type of cloud solution.

The reason why I like it is that in the cloud there are no such things as spare capacity, forecast errors, starved resources, supplier delays, or last-minute server re-allocation traumatic stress disorder.
Mind you, I'm not saying this s the silver bullet of all scalable implementations. It has its pros and cons, and things to consider are the loss of control over parts of your data centre, the (un)reliability of internet, remote administration costs, staff training, securing the data channels with the cloud, encription, compression, and so on.

Yes, nobody said it doesn't need any homework, and each project is different, but for me and some of my projects this is a godsend. Need more capacity for the Christmas rush online sales? No problem, let me open my ElasticFox plugin and I'll double your capacity in less than an hour, already configured, already load balanced, and fully operational... oh, and with no downtime of course. Then what? Need to downsize after the Christmas rush? No problem, let me open my ElasticFox plugin and remove some spare capacity.

Hold on, that's too simple. Let's try this... I am going to release an updated version of the sytem and planning for a good load test and a stress test to see how far I can push it, so I want a single stack system for the stress test, and a 3-stack load balanced system for the load test, then I want the same again but pointing at a different back-end (one has standard test data, the other has localised data in different languages). No problem, let me open my ElasticFox plugin and... well... you know the story.

M.

No comments: