Monday, April 12, 2010

Use cloud and get rid of your sysadmin.

Following on from my Cloud Computing Myths post.

The principal argument that cloud will get rid of sysadmins runs: "pre-cloud a sysadmin can manage a few hundred machines, in the cloud era with automation a sysadmin can manage tens of thousands of virtual machines". In short, since each system admin will be able to manage two orders of magnitude more virtual machines, we will need fewer of them.

Let's first be clear about what automation means. At the infrastructure layer of the computing stack there is a range of systems, commonly known as orchestration tools, which allow for basic management of a cloud, automatic deployment of virtual infrastructure, configuration management, self-healing, monitoring, auto-scaling and so forth. These tools take advantage of the fact that in the cloud era, infrastructure is code and is created, modified and destroyed through APIs.

Rather than attempting to create specialised infrastructure, the cloud world takes advantage of a bountiful supply of virtual machines provided as standardised components. Hence scaling is achieved not through provision of an ever more powerful machine but deployment of vastly more standardised virtual machines.

Furthermore, the concept of a machine also changes. We're moving away from the idea of a virtual machine image for this or that, to one of a basic machine image plus all the run-time information you require to configure it. The same base image will become a wiki, a web server or part of an n-tier system.
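As a rough illustration of "one base image plus run-time information", here is a minimal Ruby sketch. The role names, package names and ports are hypothetical placeholders of mine, and the functions do not belong to any real tool; the only point is that the image stays constant while the configuration attached at launch decides what the machine becomes.

```ruby
# Hypothetical catalogue of roles; package names and ports are
# placeholders for illustration, not recommendations.
def role_catalogue
  {
    wiki:       { packages: ["wiki-engine"], ports: [80] },
    web_server: { packages: ["nginx"],       ports: [80, 443] },
    app_tier:   { packages: ["app-runtime"], ports: [8080] }
  }
end

# Build a launch description: the image never changes, only the
# run-time configuration attached to it does.
def launch_spec(role, base_image: "xxxxx")
  config = role_catalogue.fetch(role) { raise ArgumentError, "unknown role: #{role}" }
  { image_id: base_image, role: role }.merge(config)
end

wiki = launch_spec(:wiki)
web  = launch_spec(:web_server)
puts wiki[:image_id] == web[:image_id]   # same base image
puts wiki[:packages] != web[:packages]   # different run-time configuration
```

In a real deployment the configuration step would be handed to a configuration management tool rather than a hash, but the shape of the idea is the same.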

All of these capabilities allow for more ephemeral infrastructure, rapidly changing according to need with rapid deployment and destruction. This creates a range of management problems and hence we have the growth of interest in orchestration tools. These tools vary from specifically focused components to more general solutions and include Chef, ControlTier, CohesiveFT, Capistrano, RightScale, Scalr and the list goes on.

A favourite example of mine, simply because it acts as a pointer towards the future, is PoolParty. Using a simple syntax for describing infrastructure deployment, PoolParty synthesises the core concepts of this infrastructure change. For example, deploying a system is no longer a long architectural review and planning process, an RT ticket requesting some new servers with an inevitable wait, then the installation, racking and configuration of those servers followed by change control meetings.

Deploying a system becomes in principle as simple as :-

Pool "my_application" do

  Cloud "my_application_server" do
    Using EC2
    Instances 1..1
    Image_id "xxxxx"
    Autoscale
  end

  Cloud "my_database_server" do
    Using EC2
    Instances 1..1
    Image_id "xxxxx"
  end

end

It is these concepts of infrastructure as code and automation through orchestration tools, combined with a future of computing resources provided as larger components (pre-built racks and containers), which have led many to assume that cloud will remove the roles of many sysadmins. This is a weak assumption.

A historical review of computing resource usage shows it's price elastic. In short, as the cost of providing a unit of compute resource has fallen, demand has increased, leading to today's proliferation of computing.

Now, depending upon who you talk to, the wastage of computing resources in your average data centre runs at 80-90%. Adoption of private clouds should (ignoring the benefits of using commodity hardware) provide a 5x reduction in price per unit. Based upon historical precedents, you could expect this to be much higher in public cloud and lead to a 10-15x increase in consumption as we find the long tail of applications that companies desire becomes ever more feasible.
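One way to sanity-check the 5x figure is to ask what eliminating idle capacity could deliver at best. The sketch below is my own reading rather than a calculation from the post: it assumes cost per useful unit is inversely proportional to utilisation, and that wastage could in the ideal case be driven to zero.

```ruby
# Upper bound on the fall in cost per useful unit of compute if all
# idle capacity were eliminated (an idealised assumption).
# waste_percent is the share of capacity currently unused, e.g. 80.
def max_cost_reduction(waste_percent)
  100.0 / (100 - waste_percent)
end

[80, 90].each do |waste|
  puts "#{waste}% waste -> up to #{max_cost_reduction(waste)}x cheaper per useful unit"
end
```

On those assumptions, 80% wastage caps the gain at 5x and 90% at 10x, so a 5x reduction sits within that range.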

Of course, this ignores transient applications (those with a short life time such as weeks, days or hours), componentisation (e.g. self service and use of infrastructure as a base component), co-evolution effects and the larger economies of scale potentially available on public providers.

Given Moore's law, the current level of wastage, a standard VM to physical server conversion rate, greater efficiencies in public provision, increasing use of commodity hardware and the assumption that expenditure on computing resources will remain flat (any reductions in cost per unit being compensated by an increase in workload), it is entirely feasible that within 5-7 years these effects could lead to a 100x increase in virtual infrastructure (i.e. number of virtual servers compared to current physical servers). It's more than possible that in five years' time every large marketing department will have its own 1,000 node Hadoop cluster for data processing of consumer behaviour.
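To see how effects like these compound under flat spend, here is a small sketch. Every input is an illustrative assumption of mine (a six-year horizon and a two-year doubling period for Moore's law), apart from the 5x wastage figure quoted earlier in the post. It is not the author's calculation, only a way of showing that independent cost reductions multiply.

```ruby
# Growth from a Moore's-law style doubling of compute per unit cost.
def doubling_gain(years, doubling_period_years)
  2.0**(years.to_f / doubling_period_years)
end

# Under flat spend, independent cost reductions multiply together
# into the number of units the same budget can buy.
def compound_growth(factors)
  factors.reduce(1.0) { |total, factor| total * factor }
end

moore       = doubling_gain(6, 2)   # assumed: 6 years, doubling every 2 years
utilisation = 5.0                   # the post's 5x from reduced wastage
combined    = compound_growth([moore, utilisation])
remaining   = 100.0 / combined      # what the other effects must supply for 100x

puts "Moore's law: #{moore}x, combined: #{combined}x, remaining: #{remaining}x"
```

With these inputs the two factors alone give 40x, leaving roughly 2.5x to come from the other effects listed (public provision, commodity hardware and so on) to reach 100x.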

So, we come back to the original argument which is "pre-cloud a sysadmin can manage a few hundred machines, in the cloud era with automation a sysadmin can manage tens of thousands of virtual machines". The problem with this argument is that if cloud develops as expected then each company will be managing two orders of magnitude more virtual machines, which cancels out the two orders of magnitude gain in machines per sysadmin and means there'll be at least as many sysadmins as there are today.
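The arithmetic behind this can be sketched in a few lines. The fleet sizes are borrowed from the 2,000-server company in the P.P.S.; the per-sysadmin ratios (200 and 20,000) are my own stand-ins for "a few hundred" and "tens of thousands".

```ruby
# Sysadmins required for a fleet, given how many machines one
# person can manage. Rounds up, since you cannot hire a fraction.
def sysadmins_needed(machines, machines_per_sysadmin)
  (machines.to_f / machines_per_sysadmin).ceil
end

before = sysadmins_needed(2_000, 200)        # pre-cloud: assumed 200 machines each
after  = sysadmins_needed(200_000, 20_000)   # cloud era: assumed 20,000 VMs each
puts "before: #{before}, after: #{after}"
```

A 100x gain in machines per sysadmin set against a 100x growth in machines leaves headcount where it started.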

Now whilst the model changes when it comes to platform and software as a service (and there are complications here which I'll leave to another day), the assumption that cloud will lead to fewer system administrators is another one of those cloud myths which hasn't been properly thought through.

P.S. The nature of the role of a sysadmin will change and their skillsets will broaden; however, if you're planning to use cloud to reduce their numbers then you might be in for a nasty shock.

P.P.S. Just to clarify, I've been asked by a company which runs 2,000 physical servers whether this means that in 5-7 years they could be running 200,000 virtual servers (some provided by private clouds and most by public clouds, ideally through an exchange or brokers). This is exactly what I mean. You're going to need orchestration tools just to cope and you'll need sysadmins who are skilled in these tools and in managing a much more complex environment.

10 comments:

Graham Chastney said...

We aren't very good at planning for situations where demand outstrips the benefits.
And it's not just an IT problem, look at the situation with washing, where we are using the automation of a washing machine to wash things more often, rather than using the extra time we have for more constructive pursuits.

swardley said...

Not sure I'd agree with the example analogy, the key issues here are elasticity, componentisation and co-evolution effects.

Take for example the evolution of the humble electronic switch from Fleming's valve to today's complex ICs containing millions if not billions of switches. The componentisation & co-evolution effects (i.e. other industries such as hand-held calculators and digital computers) have been very powerful, leading to a rise from 1 switch in 1904 to 1x10^19 produced last year.

However, you're absolutely correct that we tend to ignore and not plan for situations where the benefit actually causes a massive rise in demand.

We also tend to forget that "Technological progress that increases the efficiency with which a resource is used, tends to increase the rate of consumption of that resource."

James Urquhart said...

The coolest thing about administration/operations in the cloud is the move from server-centricity to application-centricity. Thus, the right question isn't servers/sysadmin, but apps/sysadmin.

I would contend that apps/sysadmin will change less drastically than servers/sysadmin, thus you won't be eliminating sysadmin headcount.

However, the role will change pretty drastically, as you note, as some play the role of infrastructure keeper, and others engineer automation at the application level.

You are absolutely right about the growth of use as resources get cheaper, however. This is one of the critical elements of cloud operations that has to be internalized by IT as they plan for cloud adoption.

Kilgore Trout said...

Are we not seeing the same thing as when systems programmers just disappeared and became systems administrators? So, systems administrators are merely transitioning more and more to applications administrators. Likewise, applications programmers transitioned to developers. This is just the evolution where the OS, systems and application software need less and less programming and more and more assembly, like Lego blocks. The work will never go away, it just evolves up the stack.

William V. said...

How do I use my sysadmin to get rid of Cloud?

Vladimir said...

I think this also could be one of the myths, i.e. fewer sysadmins vs number of "servers" (wittingly without any adjective).
I agree with James the role of server/sysadmin will move towards apps config/admin role with the knowledge of application processes.

Ewan said...

Very interesting read, lots to think about.

One thing I'd say though is that if you choose to run (to use your example) a Hadoop cluster supplied as an external service, the 1000 VMs in it really shouldn't need any sysadmins from inside the company, that should all be covered by the SaaS supplier.

So while the amount of computing resource usage may well explode over the next 5 years, orchestration isn't the only opportunity to reduce the workload on sysadmins.

There are also some direct cases where the sysadmin might well be replaced - if you have an internal MS Exchange cluster with 5000 users needing 24x7 support, you're likely to have a couple of people supporting the servers, software, backups, etc, associated with it. Moving that Exchange system to a hosted messaging solution (Either Exchange, Gmail, etc) should give you the same level of support and functionality, but the sysadmins no longer have to look after any of the day to day work.

Of course, whether "hosted Exchange" is actually a cloud service or not depends on how you're defining the cloud :)

Sean said...

I like This Post very much..
--------------
Sean
Outsourcing

iouatp said...

Hi Simon,

I saw your talk on cloud computing at Skills Matter the other day. I was very impressed and it opened my eyes to the technology and how I might use it. UEC sounds like an excellent solution for both private and public cloud. I guess you still need sys admin experts for your private cloud offering tho :)

You seemed familiar to me when I heard your talk and I've since discovered where I'd seen/heard you before. It was a FOWA about 3-4 years ago and you were promoting Zimki for Fotango. I still have my pre-shaved yaks Tee. I also thought your presentation style at Skills Matter was familiar, a bit like the presentation Dick Hardt did when evangelising about OpenID. It works, you did a great presentation.

Smith said...

Hey thanks a lot for sharing such a nice and informative article,
The coolest thing about administration/operations in the cloud is the move from server-centricity to application-centricity. Thus, the right question isn't servers/sysadmin, but apps/sysadmin.

I would contend that apps/sysadmin will change less drastically than servers/sysadmin, thus you won't be eliminating sysadmin headcount.

By the way for more information on security courses and its certification check this link: http://www.eccouncil.org/certification.aspx