Friday, March 18, 2011

Building a private cloud

As we move from a world of infrastructure products to a competitive market of infrastructure services, private clouds have a valid role as part of a hybrid strategy in mitigating transitional risks (such as concerns over data governance, trust, transparency and security of supply).

The key word here is transitional, i.e. this is not a permanent solution. Furthermore, by using a hybrid strategy you're deliberately sacrificing some of the benefits of public provision (economies of scale, focus, etc.) in order to mitigate those risks. It's a trade-off.

However, for this post I'll assume you've decided to make that trade-off. Hence you're going to be using a combination of public and private resources in a hybrid form for the immediate future. So dealing specifically with infrastructure, at what price should you be building a private cloud?

First, you need to recognise that a consequence of a shift to standardised components provided through utility services will be an explosion of innovation in higher order systems - i.e. the competitive landscape is going to become even more competitive. Second, cost efficiency won't result in reduced IT budgets because increased competition, the long tail of unmet demand and co-evolution effects will result in far greater consumption. Lastly, one of the key benefits of cloud is speed and when this is combined with cost efficiency, you'll end up doing more, faster with the same budget.

It's worth noting that there is little correlation between total IT spending and business value but there is a strong correlation between speed and business value. How efficient your provision is in terms of speed is critical here. Now assuming you're not planning something daft like a long-winded requisition process for new virtual machines, the speed of provisioning will be minutes whether public or private. Assuming this is uniform, the distinguishing feature becomes cost efficiency, i.e. how much MORE you can do with the same level of spending than your competitors.

With public provision such as Amazon EC2, a small instance (I'll refer to this as a VM) currently costs $740 per year. Now whilst Amazon doesn't release figures, we believe the cost of provisioning such a VM can be much lower, possibly as low as $220 per VM per year, though from the comments possibly much lower. Hence if your private cloud is being designed to cost $1500 per VM per year, you might be forgiven for thinking that this is only twice Amazon's pricing and represents good value for a hybrid trade-off. What you're probably discounting is that your private environment will have to exist for three to (in the worst case) five years and during that time Amazon's pricing could fall rapidly, possibly to the order of $300 per VM per year, possibly lower.

In such circumstances, a competitor with a pure public play would have up to a 5x cost advantage (depending upon how much of your hybrid strategy was private provision), which is an astronomical difference in a world of increasing and more rapid competition. But that's a key point to note: the impact depends upon the type of industry you exist within and your competitors' actions.
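As a rough illustration of where the "up to 5x" comes from, here's a minimal sketch using the figures quoted above. The function name and the blending rule (the hybrid buys its public portion at the same public price as the competitor) are my own assumptions for illustration, not part of any real model.

```python
def cost_disadvantage(private_cost, public_price, private_share=1.0):
    """Ratio of a hybrid's blended cost per VM to a pure public price.

    private_share is the fraction of VMs run privately (assumption:
    the remainder is bought at the same public price as the competitor).
    """
    blended = private_share * private_cost + (1 - private_share) * public_price
    return blended / public_price


PRIVATE_COST = 1500   # $ per VM per year, private design cost quoted above
PUBLIC_TODAY = 740    # $ per VM per year, EC2 small instance quoted above
PUBLIC_FUTURE = 300   # $ per VM per year, possible future price quoted above

today = cost_disadvantage(PRIVATE_COST, PUBLIC_TODAY)
future = cost_disadvantage(PRIVATE_COST, PUBLIC_FUTURE)
half_private = cost_disadvantage(PRIVATE_COST, PUBLIC_FUTURE, private_share=0.5)

print(f"today: {today:.2f}x, future: {future:.2f}x, 50% private: {half_private:.2f}x")
```

With everything private the ratio moves from roughly 2x today to 5x if public pricing falls to $300; running only half of your VMs privately brings that down to 3x under the same assumptions.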

Assuming you're working in an industry that relies heavily on technology, then in order to give your private cloud a fighting chance, you have to be designing for a target price of between $220 and $540 per VM per year.
These figures (and lower) are achievable if done at reasonable scale and with a relentless focus on commodity provision. For reference purposes only, I've provided a hypothetical example (based upon real data from real examples) of a VM farm. I've used the following assumptions:
  1. Base VM: 1.7 GB RAM, 160 GB HD, CPU equivalent to a passmark of 420
  2. Target Price (VM per year): $510
  3. Max Utilisation (to determine level of over-provisioning required): 70%
  4. No. of VMs: 100,000
  5. Base Unit: Rack
  6. Focus: Commodity (No multiple PSU, Hot Swap)
  7. Depreciation for hardware: 3 yrs
  8. Current High Interest Rates of: 3%
  9. Networks: All internal equipment and network up to external routers but not external bandwidth.
  10. PUE: 1.3
Given these assumptions, figure 1 shows, as a rough guide, how your target price per VM per year breaks down into its various components.
Figure 1 - Cost per VM per year.
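To make the assumptions listed above a little more concrete, here's a minimal sketch of the headline arithmetic. How over-provisioning, interest and PUE feed into the real model isn't spelled out in the post, so the formulas below (capacity as demand divided by max utilisation, a standard annuity for the cost of money, facility power as IT power times PUE) are my assumptions, and the $1000 of hardware and 100 kW of IT load are purely hypothetical inputs.

```python
import math

# Figures taken from the assumptions list
VMS = 100_000              # number of VMs
TARGET_PRICE = 510         # $ per VM per year
MAX_UTILISATION = 0.70     # used to size over-provisioning
DEPRECIATION_YEARS = 3     # hardware depreciation period
INTEREST_RATE = 0.03       # cost of money
PUE = 1.3                  # power usage effectiveness

# Total annual budget implied by the target price
annual_budget = VMS * TARGET_PRICE

# Assumption: capacity is sized so that 100,000 VMs is 70% of what is built
vm_slots = math.ceil(VMS / MAX_UTILISATION)


def annualised_capex(capex, rate, years):
    """Equal annual payment that repays capex over `years` at `rate`
    (standard annuity formula; an assumption about how interest is applied)."""
    return capex * rate / (1 - (1 + rate) ** -years)


def facility_power_kw(it_power_kw, pue):
    """Total facility power for a given IT load, by the definition of PUE."""
    return it_power_kw * pue


print(f"annual budget: ${annual_budget:,}")
print(f"VM slots to build: {vm_slots:,}")
print(f"annual cost of $1000 hardware: ${annualised_capex(1000, INTEREST_RATE, DEPRECIATION_YEARS):.2f}")
print(f"facility power for 100 kW IT load: {facility_power_kw(100, PUE):.0f} kW")
```

In other words, a $510 target across 100,000 VMs gives you $51 million a year to cover everything, and under these assumptions you'd be building capacity for roughly 143,000 VMs to serve them.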

Now obviously there are many ways to skin a cat, and pricing is sensitive to various factors such as scale, density of VMs, power consumption, number of units per rack, location, etc. However, I put this here to start a discussion, as I would like to know: what's your cost per VM in your private cloud?

-- 12 Feb 2015

It's 2015 and companies are still commissioning / building private clouds as new projects! Ouch. There's going to be some very costly white elephants.


Brad Vaughan said...

HW looks really high. A normal environment TCO breakdown looks far more weighted to people/support/software license. I can see the SW license being down if you are developing an app using OSS, but the rest looks imbalanced. With higher utilization of HW in a virtualized space, HW cost per VM should be even lower.

Be interested in your opinion why this is the case.

swardley said...

Hi Brad,

Actually, this example is a target for a specific price point of $510 per VM per year for a specified 100,000 VM environment.

You can bring these figures much lower by, for example, altering the VM density per machine (the above example used a very generous CPU spec).

Obviously this changes power / cooling and the number of racks required for this fixed volume of VMs, i.e. whilst this will reduce the cost of hardware per VM, at the same time it also impacts building, racking, power, networking and people costs.

It's a non-trivial problem with lots of inter-connections, minimum limits etc. However, as a rule of thumb, in a pure play commodity environment at scale, a 40-50% share of cost going on hardware seems to be a good match as you're optimizing for maximum density.
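As a quick illustration of that rule of thumb, here's a minimal sketch of what a 40-50% hardware share means in dollars at the two price points mentioned in this thread. The function name is hypothetical and this is only the percentage applied to the target price, not the underlying model.

```python
def hardware_budget(target_price, low_share=0.40, high_share=0.50):
    """Range of $ per VM per year available for hardware, given the
    40-50% rule of thumb applied to a target price."""
    return target_price * low_share, target_price * high_share


for price in (510, 220):
    low, high = hardware_budget(price)
    print(f"${price} target: roughly ${low:.0f} to ${high:.0f} per VM per year on hardware")
```

At the $510 target that is roughly $204 to $255 per VM per year for hardware, and at $220 roughly $88 to $110, which gives a feel for how hard density has to be pushed at the lower end.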

Yes, you do have to squeeze software costs, increase power efficiency and the environment has to be fully automated ideally on a rack basis to get down to the $220 per VM per year mark.

However, I put this up as a rough guide precisely because I want to initiate such discussions.

I'd be delighted if you could provide an equivalent breakdown for comparison based upon provisioning a standard VM as described in the assumptions.

swardley said...


Also for reference the software cost is actually "what can I afford to spend on software?"

In the above example, by manipulating density I can create more to spend on software however when I'm reducing the overall price per VM (which requires me to increase density, change machine spec) then the software bit gets squeezed further.

It's not a simple question of "use open source", it's a question of what can I afford to spend - not a lot at scale per VM.

Ewan said...

It's an interesting model, and I hope it starts to drive some public discussion about this kind of thing - I'm sure a lot of the large service providers have already done similar internal modelling, but they've obviously decided not to make it too public so far.

Randy Bias said...

Brad Vaughan,

James Hamilton, the datacenter expert for Microsoft and now AWS has characterized 5-year TCO costs for typical datacenters as roughly 55%:

See here:

Brian Gracely said...

Can you clarify a few points from the post?

- Where does the $740/yr (AWS) cost come from and what assumptions are made? Spot/Reserved; Bandwidth; I/O; Storage; Monitoring, etc.

- How do you get from $740 to $220 for a public VM? You can't be assuming that difference is AWS profit, so what timeframe do you see this cost reduction happening in? And you switch wording from "cost per year" to "provision a VM" in those paragraphs, so are you still talking about all-in costs, or just some assumption of "turn it on" cost?

- Why don't you attempt to show any reduction in cost within a Private Cloud over time? You're assuming no learning curve, automation or adoption of improving technology within a Private Cloud?

- It's not clear how you're making the jump from a single VM example to a 100,000 VM farm. In your studies, how many companies have 100,000 VMs (or physical servers) today? I'm trying to understand if you're making a statement about when cost/scale matters, or just using a large reference.

swardley said...


Thanks for the link.

First thing I'll note is that he doesn't include cost of money or people - it makes a difference.

His Power, Distribution and cooling percentages are a close match to mine along with his networking. However, my machine costs are lower than his.

Given this it possibly implies I'm using a higher density of commodity servers in my model (hence machine cost lower but power, network rise as a % of machine cost).

It's an extremely useful reference point.

Thank you.

swardley said...

Hi Brian,

Thanks for the questions. Let me respond to each.

1. The $740 figure is simply a standard EC2 instance run for a year - no spot, reserved pricing, bandwidth, EBS, monitoring or any other feature is considered. Simply raw compute, one year, to give a base unit in order to make a comparison.

2. The lower end of the calculations highlights that it's possible to get to $220 per VM per year. I know of one example where this is beaten.

3. Amazon doesn't provide any figures but I assume that they're making a margin and they are operating effectively. So, you have to hazard a guess where they're going to be.

4. "cost per year" to "provision a VM". In all cases I'm referring to the total cost of provisioning a VM for a year. If you're buying from a supplier, then obviously this will have a margin and other costs added.

5. "Why don't you attempt to show any reduction in cost within a Private Cloud over time?" - it's simply a target price for now and obviously that target price will get lower over time. I'm assuming you're working on a unit of racks with heavy automation in order to get the people cost so low.

6. The calculations use a 100,000 VM farm as a base model because I'm making an assumption of scale. The final graph shows the costs for provisioning for each VM broken down into different sections.
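For anyone wanting to compare these annual figures with hourly pricing, here's a minimal sketch that converts them, assuming a VM runs for every hour of a 365-day year. That always-on assumption and the function name are mine for illustration; the rates are implied by the annual figures, not quoted prices.

```python
HOURS_PER_YEAR = 24 * 365  # 8760, assuming the VM runs all year


def implied_hourly_rate(annual_cost):
    """Hourly rate implied by an annual cost for an always-on VM."""
    return annual_cost / HOURS_PER_YEAR


for label, annual in (("EC2 small", 740), ("example target", 510), ("low end", 220)):
    print(f"{label}: ${annual}/year is about ${implied_hourly_rate(annual):.4f}/hour")
```

So the $740 base unit works out at a little over 8 cents per hour of raw compute, and the $220 mark at about 2.5 cents.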

Obviously all of this depends upon scale, density of VMs, location, admin/server ratios ... long list.

Hope that helps.

bmullan said...

Building a private cloud is everything you've pointed out.
It means dealing with...
Backup Diesel Generator(s)
H/W Maintenance/replacement contract costs
Software Licenses
Software Maintenance contract costs
24x7 operation
Disaster Recovery
Staff to deal with all the above
Benefits of scale such as power costs - can you get a bigger discount than Microsoft, Google or Amazon?
More limited elasticity with private cloud unless a hybrid strategy is built in.
Other questions are:
Can you hire better engineers/technicians than you'd get using a public cloud?
Also, consider costs for cloud mgmt tools, SAN, Compute, Orchestration etc.

Simon May said...

Great article and you're really getting to a good place with the cost model. I think as per bmullan there are additional overheads to think about, but probably under a "business / data centre overheads" category. They will be different for each private cloud provider.

An interesting case study would be to understand the costing model of a private cloud hoster.

Ultimately, for any organisation to do the calcs properly they will need a techy and a management accountant to work together.

Anyone running a private cloud will need a fastidious focus on cost, but those costs may not be as key to them as other factors. In a hybrid model they are likely to value security or data sovereignty over moving a service purely to a public cloud, as you say.

RyanG said...

Thought-provoking analysis of the costs of running your own private cloud.

Another route to consider is leveraging dedicated hosted options from companies like Rackspace and SoftLayer.

By using hosting options such as this (with a sizable enough implementation) you can take advantage of a lot of the economy of scale you get with a full "proper" public cloud offering. However, you still get dedicated hardware and connectivity, things which could mitigate the risks of running on multi tenant public cloud. You'd certainly be able to cut out the cost of the staff to build and maintain the hardware infrastructure.

-Ryan J. Geyer-
Sales Engineer - RightScale