Wednesday, February 25, 2015

What's wrong with my private cloud ...

These days, I tend not to get too involved in cloud (having retired from the industry in 2010 after six years in the field) and my focus is on improving situational awareness and competition  (in my view a much bigger problem) through techniques such as mapping.  I do occasionally stick my oar into the cloud due to some very elementary mistakes that appear. One of those has raised its head again, namely the cost advantages / disadvantages of private cloud.

First, private cloud was always a transitional play which would ultimately head to niche unless a functioning competitive market forms enabling a swing from centralised to decentralised. This functioning market doesn't exist at the infrastructure layer and so the trend to niche for private cloud is likely to accelerate. There's a whole bunch of issues related to the impact of ecosystems (in terms of innovation / customer focus and efficiency rates), performance and agility, effective DevOps, financial instruments (spot market, reserved), capacity planning / use, focus on service operation and potential for sprawl which can work out very negatively for private cloud but these are beyond the scope of this post. I simply want to focus on cost and to make my life easier - the cost of compute.

Here's the problem. Two competitors going head to head in a field, both with over £15Bn in annual revenue and both spending over £150 M p.a. on infrastructure (i.e. compute, hosting etc). Now, this IT component represents a good but not huge chunk of their annual revenue (1%).

One of the competitors was building a private cloud. It was quite proud of the fact that it reckoned it had achieved almost parity with AWS EC2 (taking in this case a comparison to an m3.medium) at a cost of around $800 per year. What made them happy is that the public Amazon cost is around $600 per year and so they weren't far off but they gained all the "advantages" of being a private cloud. This is actually a disaster and was caused by three basic mistakes (ignoring all the other factors listed above which make private cloud unattractive).

Mistake 1 - Externalities.

The first question I asked (as I always do) is what percentage of the cost was power? The response was the usual one that fills me with dread - "That comes from another budget". Ok, when building a private cloud you need to take into consideration all costs from power, building, people, cost of money etc etc. On average the hardware & software component tends to be around 20-25% of the cost. Power tends to be the lion's share. So, they weren't operating at anywhere close to $800 per equivalent per year but instead closer to $3,000 per year. This on the face of it was a 5x differential against the $600 public price but then there's - future pricing.
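As a rough sketch of that adjustment, the Python below scales the quoted figure up to an all-in cost. It assumes the $800 covered only hardware and software and that this component is 20-25% of the total; the function name is purely illustrative.

```python
def all_in_cost(visible_cost, visible_share):
    """Scale a partial cost up to the full cost, given the share it represents."""
    if not 0 < visible_share <= 1:
        raise ValueError("visible_share must be in (0, 1]")
    return visible_cost / visible_share

QUOTED_PRIVATE = 800   # $ per instance-equivalent per year (assumed: hardware & software only)
AWS_PUBLIC = 600       # $ per year, the public price used in the comparison above

low = all_in_cost(QUOTED_PRIVATE, 0.25)    # hardware & software at 25% of total
high = all_in_cost(QUOTED_PRIVATE, 0.20)   # hardware & software at 20% of total

print(f"All-in private cost: ${low:,.0f} to ${high:,.0f} per year")
print(f"Differential vs public: {low / AWS_PUBLIC:.1f}x to {high / AWS_PUBLIC:.1f}x")
```

On these assumptions the all-in figure lands at $3,200 or more, which is in line with the "closer to $3,000" and roughly 5x numbers used here.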

Mistake 2 - Future pricing

Many years ago I calculated how efficiently you could run a large scale cloud and from this guesstimated that AWS EC2 was running at over 80% margin. But how could this be? Isn't Amazon the great low margin, high volume business? The problem is constraint. 

AWS EC2 is growing rapidly and compute happens to be elastic i.e. the more we reduce the price, the more we consume. However, there's a constraint in that it takes time, money and resource to bring large scale data centres online. With such constraints it's relatively easy to reduce the price so much that demand exceeds your ability to supply which is the last thing you want. Hence you have to manage a gentle decline in price. In the case of AWS I guesstimated that they were focused on doubling total capacity each year.  Hence, they'd have to manage the decline in price to ensure demand kept within the bounds of their supply factoring in the natural reduction of underlying costs. There are numerous techniques that can also help i.e. increasing size of default instance etc but we won't get into that.
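One way to picture the constraint is a toy constant-elasticity demand model. This is only a sketch: the elasticity value is an assumption picked for illustration, organic demand growth is ignored, and none of it comes from AWS.

```python
def max_price_ratio(capacity_growth, elasticity):
    """Lowest next-year price (as a fraction of today's) that keeps demand within supply.

    Assumes demand scales as (price ratio) ** -elasticity, so demand growth
    matches capacity growth when price ratio = capacity_growth ** (-1 / elasticity).
    """
    if capacity_growth <= 0 or elasticity <= 0:
        raise ValueError("capacity_growth and elasticity must be positive")
    return capacity_growth ** (-1.0 / elasticity)

# Capacity doubles each year (the guesstimate above); an elasticity of 2 is hypothetical.
ratio = max_price_ratio(capacity_growth=2.0, elasticity=2.0)
print(f"Price can fall to about {ratio:.0%} of today's before demand outstrips supply")
```

In this toy model, cutting the price any further in a single year would push demand past the doubled capacity, hence the managed, gentle decline.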

Though their price is currently $600 per year, I take the view that their costs are likely to be sub $100 which means that a lot of future price cuts are on the way.  The recent 'price wars' from Google seem more about Google trying to find where the price points / constraints of Amazon are rather than a fully fledged race to the bottom. All the competitors have to keep a watchful eye on demand and supply.

Let us assume however, that AWS is less efficient than I think and the best price they could achieve is $150. This suddenly creates a future 20x differential between the real cost of the private environment and the public price. However, it's no big shakes because even if the competitor was using all public cloud (i.e. data centre zero) which is unlikely then it simply means they're spending $7.5M compared to our $150M and whilst there might be $140M of saving this is peanuts to our revenue ($15Bn+) and the business that's at stake. It's not worth the risk.

This couldn't be more wrong.

Mistake 3 - Future Demand

Cloud computing simply represents the evolution of a world of products to a world of commodity and utility services. This process of "industrialisation" has repeated many times before in our past and has numerous known effects from co-evolution of practice (hence DevOps), rapid increases in higher order systems, new forms of data (hence our focus on big data), increases in efficiency, a punctuated equilibrium (exponential change), failure of companies stuck behind inertia barriers etc etc.

There's a couple of things worth noting. There exists a long tail of unmet business demand for IT related projects. Compute resources are price elastic. The provision of commodity (+utility) forms of an activity enable rapid development of often novel higher order systems (utility compute allows a growth in analytics etc) which in turn evolve and over time become industrialised themselves (assuming they're successful). All of this increases demand for the underlying component.

Here's the rub. In 2010 I could buy a million times more compute resource than in the 1980s but does that mean my IT budget for compute has reduced a million fold in size during that time? No. What happened is I did more stuff.

In the same way, Cloud is extremely unlikely to reduce IT expenditure on compute (bar a short blip in certain cases) because we will end up doing more stuff. Why? Because we have competitors and as soon as they start providing new capabilities then we have to co-opt / adapt etc.

So, let us take our 20x differential (which in all likelihood is much higher) and assume we're building our private cloud for 5yrs+ giving time for price differentials to become clear. Our competitor isn't going to reduce their IT infrastructure spending, they are likely to continue spending $150M p.a. but do vastly more stuff. However, in order to maintain parity with this given the differential then we're going to need to be spending closer to $3Bn p.a. In reality, we won't do this but we spend vastly more than we need and we will still just lose ground to competitors - they'll have better capabilities than we do.
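The arithmetic behind those figures can be laid out as a short Python sketch. The inputs are the numbers used in this post; the variable names are just for illustration.

```python
PRIVATE_ALL_IN = 3_000       # $ per instance-equivalent per year, all costs included
PUBLIC_FUTURE = 150          # $ per year, the pessimistic future public price
ANNUAL_SPEND = 150_000_000   # $ p.a. infrastructure spend of each competitor

# How much more each unit of compute costs us than it costs them.
differential = PRIVATE_ALL_IN / PUBLIC_FUTURE

# What today's workload would cost if run entirely on public cloud.
same_work_on_public = ANNUAL_SPEND / differential

# What we would need to spend privately to match a competitor who keeps
# spending ANNUAL_SPEND on public cloud and simply does more with it.
parity_spend = ANNUAL_SPEND * differential

print(f"Differential: {differential:.0f}x")
print(f"Today's workload on public cloud: ${same_work_on_public / 1e6:.1f}M p.a.")
print(f"Private spend to keep parity: ${parity_spend / 1e9:.1f}Bn p.a.")
```

The same 20x shows up twice: as a modest-looking saving if demand stays flat, and as an unaffordable bill once the competitor holds spend constant and does more stuff.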

You must keep in mind that some vendors are going to lose out rather drastically to this shift towards utility services. However, they can limit the damage if you put yourself in a position where you need to buy vastly more equipment (due to the price differential) just to keep up with your competitors. This is not being done for your benefit. When you're losing your lunch, sometimes you can recover by feasting on a few individuals. If that means playing to the inertia of would be meal times (FUD, security, loss of jobs etc) then all is fair in war and business. You must ask yourself whether this is the right course of action or am I being lined up as someone's meal ticket?

Now, I understand that there are big issues with heading towards data centre zero and migrating to public cloud often related to the legacy environments. This is a major source of inertia particularly as architectural practices have to change to cope with volume operations of good enough components (cloud) compared to past product based practices (e.g. the switch from scale up to scale out or N+1 to design for failure etc). We've had plenty of warning on this and there's all sorts of sensible ways for managing this from the "sweat and dump" of legacy to (in very niche cases) limited use of private. 

But our exit cost from this legacy will only grow over time as we add more data. We're likely to see a crunch on cloud skills as demand rockets and this isn't going to get easier. There is no magic "enterprise cloud" which can enable future parity with all the benefits of commodity and volume operations but provided with non-commodity products customised to us in order to make our legacy life easier. The economics just don't stack up. 

By now, you should be well on your way to data centre zero (several companies are likely to reach this in 2015). You should have been "sweating and dumping" those legacy (or what I prefer to call toxic) IT assets over the last four+ years. You should have very limited private capacity for those niche cases which you can't migrate unless you can achieve future pricing parity with EC2 (be honest here, are you including all the costs? What is your cost comparison really?). You should have been talking with regulators to solve any edge cases (where they actually exist). You should be well on the way to public adoption.

If you're not then just hope your competitors return the favour. Be warned, don't just believe what they say but try and investigate. I know one CIO who has spent many years telling conferences why their industry couldn't use public cloud whilst all the time moving their company to public cloud. 

Some companies are going to be sorely bitten by these three mistakes of externalities, future pricing and future demand. Make sure it isn't you.

-- Additional note

Be very wary of the hybrid cloud promise and if you're going down that route make sure it translates to "a lot public with a tiny bit of private". This is a trade off but you don't want to get on the wrong side and over invest in private. There's an awful lot of vendors out there encouraging you to do what is actually not in your best interest by playing to inertia (cost of acquiring new practices, cost of change, existing political capital etc). There's some very questionable "consultant" CIOs / advisors giving advice based upon not a great deal and dubious beliefs.

Try and find someone who knows what they're talking about and has experience of doing this at reasonable scale e.g. Bob Harris, former CTO for Channel 4 or Adrian Cockcroft, former Director of Engineering at Netflix. These people exist, go get some help if you need it and be very cautious about the whole private cloud space.

-- Additional note

Carlo Daffara has noted that in some cases, the TCO of private cloud (for very large, well run environments) can achieve 1/5th of AMZN public pricing. Now, there are numerous ecosystem effects around Amazon along with all sorts of issues regarding variability, security of power, financial instruments (reserved instances etc) and many of those listed above which create a significant disadvantage to private cloud. But in terms of pure pricing (only part of the puzzle) then a 1/5th AMZN public pricing is just on the borderline of making sense for the immediate future. Anything above this and then you're into dangerously risky ground. 

Monday, February 23, 2015

An extra dollop of hubris to go with my hubris ... on AI

First, I find it exceptional hubris to assume that we will create artificial intelligence before it emerges. However, even if we somehow accept that we're masterful enough to beat random accident to the race then the idea that we will be able to control it makes me shudder.

Any mathematical model (and that includes any computer program) is subject to Godel's Theorem of Incompleteness. Basically, we can't provably show a model is true within the confines of the model itself. What that means in plain English is that mistakes / bugs / unforeseen circumstances will happen. There is no way to create a control mechanism which will absolutely ensure that any AI doesn't misbehave any more than there is a way to create a provably secure system. Given enough time, something will go wrong in much the same way that given enough time, any system will be hacked.

The only sensible course of action is isolation. Which is why in security you don't try to make the unbreakable system, you accept it will be broken and minimise the consequences of this to the smallest risk vector possible. This is why systems like Bromium which use micro virtualisation make so much good sense and certainly a lot more than much of the rest of the security industry.

If you want to reduce the risk of AI then you have to reduce its interconnectedness i.e. you have to use isolation. But this is counter to the whole point of the internet and where everything is heading (e.g. IoT, mobile, use of ecosystems) where we attempt to make everything connected.

This whole paradox stems from the fact that industrialisation of one component enables the rapid creation of higher order systems which in turn evolve to more industrialised forms and the cycle repeats. Our entire history is one of creating ever more complex systems which enable us to reduce the entropy around us i.e. make order out of chaos. It's no different with biological systems (which are also driven by competition). See figure 1.

Figure 1 - Evolution and Entropy.

This constant process of change increases our energy consumption (assuming we consumed energy efficiently in the first place), enables us to deal with vastly more complex problems and threats to our existence but at the same time exposes us to ever greater levels of reliance and vulnerability through the underlying components. It's why the underlying components need to be designed with design for failure in mind which also means isolation. So when you build something in the cloud, you rely on vast volumes of good enough components ideally spread across multiple zones and regions and even clouds (unless the cost of switching is too high).

But, that also unfortunately requires greater interconnectedness at higher order systems i.e. our virtual machines may be isolated across multiple zones and regions but our monitoring, configuration and control are integrated and connected across all of these. In much the same way that our redundant array of inexpensive disks (RAID) was controlled by software agents connected across all the disks. A major part of the benefit that industrialisation brings to us comes from this very interconnectedness but that interconnectedness also creates its own risk.

When it comes to artificial intelligence then forget being able to provide a provably verified, valid, secure and controlled mechanism to prevent something wayward happening. You can't. Ask Godel.

The only way to prevent catastrophic consequences in the long term is to isolate as much as possible at this level of "intelligence" assuming we created it in the first place - which I doubt. But the very benefits it creates, which include protection against future threats (diseases, asteroid strike, climate control), come from the interconnectedness that creates this threat.

There is no way around this problem and as I've said before, if we keep connecting up a 100 billion different things then eventually artificial intelligence (of a form we probably won't recognise) will emerge. We can't stop it and despite our best efforts then given enough time we will lose control of it. The only thing we can do is isolate the "intelligence" throughout the system but then who wants to do that? No-one. That's where the benefits are - we want one super smart intelligent network of things communicating with another or how else are we going to create the "paradise" of "any given Tuesday"?

At some point, we need to have that discussion on whether the benefits of interconnectedness outweigh the risks. It isn't a discussion about AI but fundamentally one on the speed of progress. We've already had one warning shot in the financial markets where the pursuit of competition created a complexity of interconnected components that we lost control of. I know some are convinced that we can create a verified, valid, secure and controlled mechanism to prevent any future harm. Even if you don't agree with Godel who says you can't, we've already demonstrated how easy it is for whizz kids to fail.

This discussion on interconnectedness or as Tim O'Reilly would say "What is the machine we are creating?" is really one on our appetite for risk. Personally, I'm all gung ho and let's go for it. However, we need to have that wider discussion and with a lot less hubris than we do today.

On the two forms of disruption

I've posted on this topic several times before but I thought it was worth re-iterating. There is not one but at least two different forms of disruption. Don't confuse them.

To explain the two types, it's best to use a map and then look at the characteristics of both types. In figure 1, I've provided a very simple map from a customer (the user) who has a need for a component activity (A) which in turn needs a component (B) which in turn needs component C and so on. Each component is shown as a node in the map with interfaces as links. The black bars represent inertia to change.

Figure 1 - a Basic Map.
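For readers who think in code, the structure of such a map can be sketched as a small Python data structure. This is a hypothetical representation, not part of any mapping tool: the class, the field names and the stage given to each component are made up for illustration, and the stage labels assume the usual evolution axis.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Assumed labels for the evolution axis, from uncharted to industrialised.
STAGES = ("genesis", "custom built", "product", "commodity")

@dataclass
class Component:
    name: str
    stage: Optional[str] = None        # None for the user, who only anchors the map
    needs: List["Component"] = field(default_factory=list)  # links to what it depends on
    inertia: bool = False              # True where a black bar would be drawn

    def __post_init__(self):
        if self.stage is not None and self.stage not in STAGES:
            raise ValueError(f"unknown stage: {self.stage}")

def chain(component):
    """Follow the first dependency of each node, from the user's need downwards."""
    names = [component.name]
    while component.needs:
        component = component.needs[0]
        names.append(component.name)
    return names

# The simple chain described above; the stages chosen here are illustrative only.
c = Component("C", "product", inertia=True)
b = Component("B", "product", needs=[c], inertia=True)
a = Component("A", "custom built", needs=[b])
user = Component("User", needs=[a])

print(" -> ".join(chain(user)))
```

Each node carries its position on the evolution axis and its links, which is all the rest of this post relies on when it talks about a component moving from one stage to another.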

For more information on :-
1. How to map, see an introduction to mapping.
2. Evolution, see mapping and the evolution axis.

Now, every component is evolving from the uncharted space to becoming more industrialised through the effects of supply and demand competition. Hence activity A evolves from A1 to A2 to A3 etc. Most of the time such evolution is sustaining i.e. it involves incremental improvements in the act e.g. a better phone.

However, there are two forms of non-sustaining and disruptive change that are possible. The first is the product to product substitution, shown as B1 to B2. The second is product to commodity (or utility) substitution, shown as C1 to C2. These two forms of disruption are very different but in both cases we will have inertia to the change (drawn as a black bar). 

I won't go through the various forms of inertia, there are at least 16 different types and you can read about them in this post on Inertia to change. What I want to focus on is the characteristics of these two forms of disruption. To begin with, I'll just simply list the differences in table 1 and explain afterwards.

Table 1 - Characteristics of the two forms of disruption.

Now both forms of disruption threaten past industries. Examples of this would be product to product substitution such as Apple iPhone vs RIM Blackberry or product to utility substitution such as Amazon EC2 vs IBM / HP / Dell server businesses.

However, both forms of disruption are NOT equally predictable. In the case of product to product substitution it is almost impossible to predict what, when and who. There's a set of complex reasons for this (beyond the scope of this post) which mean that this form of disruption depends upon individual actors' actions. Hence Christensen, Gartner, RIM and Nokia didn't get it utterly wrong on the iPhone because they're daft but instead because it's unpredictable and those claiming to predict are suffering from (as Hayek would call it) 'a pretence of knowledge'. Equally, people who got it right were ... just lucky.

In the case of product to commodity substitution (e.g. product based computing such as servers evolving to utility computing) then this is moderately predictable in terms of what and when but not whom. This is because it is dependent upon competition (between all actors) rather than individual actors' actions. This turns out to be predictable through weak signals, which is why we could feel back in 2004 that the change to cloud was going to happen soon. We also anticipated well in advance what the effects would be (e.g. efficiency, agility in building higher order systems, co-evolution of practices, new forms of data etc) as they are caused by common repeatable patterns. However, that said - we didn't know who was going to play the game i.e. which individual actor was going to make the change.

In terms of anticipation, the subject matter of cloud had actually been well explored from Douglas Parkhill's book on the Challenge of the Computer Utility, 1966 and onwards. Everybody in this space (e.g. all the hardware vendors) should have been aware of the oncoming storm. They should have all been able to anticipate and hence have been well prepared. 

I need to emphasise that product to utility substitution gives you ample warning in advance, often ten years or more. However, this assumes you have reasonable levels of situational awareness and it's situational awareness that is the key to defence. If you exist in a firm which has little to none (the majority) then even that which you should anticipate comes as a complete shock. Even if you have the situational awareness necessary to see the oncoming storm then you'll still need to manage the inertia you have to change but since you've got lots of warning then you can prepare.

In the case of product to product substitution, well you can't anticipate. The best you can hope to do is notice the change through horizon scanning and only when that change is well upon you i.e. it's in play. There is no advance warning given with product to product substitution and yes you'll have inertia to the change. The key to defence is to have a highly adaptable culture and even then that's no guarantee.

When it comes to impact, then beyond disruption of past industries the impact of the two forms is different. Product to product substitution tends not to create new practices, it's not associated with new forms of organisation or rapid increases in data and new activities. However, product to commodity substitution is associated with rapid increases in data and new activities along with co-evolution of practice and new forms of organisation.

When it comes to gameplay then with product to product substitution there is little option for positioning and your gameplay choices are limited e.g. acquire, copy or abandon the market (before it kills you). 

With product to commodity (+utility) substitution then because you have advance warning you can position yourself to take advantage of the known effects. There is ample scope for strategic play from being a first mover to building ecosystems to manipulation of the market through open means - assuming that you have good situational awareness and know what you're doing. A classic example of good play is Canonical (with Ubuntu) vs Red Hat and others and how Canonical took the cloud market with relative ease.

These two different forms of disruption are also associated with different economic states - one known as peace, one known as war - however that's outside the scope of this post. More on that can be found in this post on peace, war and wonder.

If you assume there is only one form of disruption then you get into bizarre arguments such as Lepore vs Christensen arguing over predictability or not. Let us be clear, Lepore and Christensen are both right and wrong because one type of disruption is predictable whilst the other type isn't. This is a debate that can never be resolved until you realise there's more than one type.

It's no different with commodification vs commoditisation or the abuse of innovation. It might make it simpler to use single words to cover multiple different things and hide the complexity of competition but if you do this then you'll repeatedly play the wrong games or try to solve the issue using the wrong techniques.

Oh, but am I really saying that we can anticipate one form of disruption, the shift from product to commodity (+utility)? YES! More than this, we can say roughly what is going to happen e.g. from the disruption of past industries stuck behind inertia barriers to rapid increases in new activities built on the recently commoditised component to co-evolution of practice to new forms of data to organisational change.

Actually, we can say an awful lot about this form of change. Hence we already know that as Virtual Reality systems head towards more of a commodity then we'll see an explosion of higher order systems built upon this and fundamental changes to practice (i.e. co-evolution) in related industries. Think devops for architecture but this time physical architecture i.e. from town planning to civil, mechanical and heavy engineering.

But can we really say when? Yes! Well, more accurately we can do a pretty good job of "when" using weak signals. These major points of predictable disruption (which take about 10-15 years to work through) we call 'wars' and I've provided a list of them in figure 2.

Figure 2 - Wars

So we know (actually, we've known for some time) that Big Data is likely to go through a war over the next 10-15 years with industrialisation to more utility services (e.g. MSFT's machine learning, GooG's big table or AMZN's elastic map reduce) and many product vendors suffering from inertia will be taken out during the process. In fact, some of the smarter vendors seem to be gearing up for this hence Pivotal's Open Data Platform play. Others, that are not quite so savvy don't seem to have got the memo. Can we work out who will succeed in the 'war'? Fairly quickly once it has all kicked off, so with Big Data we already know who the likely winners and losers will be.

Other 'wars' such as industrialisation of the IoT space to more commodity components are quite far away. Even with this, we've a good idea of what and when just not who. Still between now and when this 'war' of predictable product to commodity disruption kicks off in the IoT space, there will be lots of unpredictable product to product disruption.

Now when it comes to the old 'chestnut' of 'you should disrupt yourself' then let us be clear. You should absolutely disrupt yourself by being the first mover from product to commodity (+utility). This is a change you can anticipate and position yourself to take advantage of. Timing is important because if you go too early then it's easy to undermine a viable product business when a utility business isn't suitable (i.e. the act isn't widespread and well defined enough).

However, when it comes to product to product substitution then you have no idea - i.e. zero - of what is going to disrupt. Saying 'you should disrupt yourself' is tantamount to saying 'you should own a crystal ball'. It's gibberish. The best you can do is hope you discover the change going on (through horizon scanning), recognise it for a disruptive change (no easy thing) and adapt quickly through a flexible culture against the huge inertia that you'll have. Also, you'll need to do this all within a short period of time. Far from easy.

And that's the point of this post. There are at least two forms of disruption. The way you deal with them is different. Don't confuse them.

-- 13th October 2015

Added a link to mapping post on Canonical.

Also worth noting is this post by Deloitte on Turning Disruptive Trends into Opportunity. Before anyone starts smirking about it, do remember that most of these companies have no means of understanding situational awareness. The post is fine, it covers preparation (for the predictable), scanning (the unpredictable product vs product), understanding patterns (as in basic learning from situational awareness) and confronting bias (though it does a poor job of explaining which). Remember with bias you have to hit those sixteen forms of inertia to change, misunderstandings on the pace of change (e.g. punctuated equilibriums) and various forms of bias (i.e. use aggregated map view for these). The other benefit is that it provides further, completely independent support to the notions of multiple forms of disruption and that some forms can be exploited.

Yes, I know that some might feel that others are catching up, they always do - that's the Red Queen for you. Expect people to get better over time. But also remember that this stuff on disruption is still surprisingly advanced for most companies. The vast majority of companies you'll face still have little to no understanding of ecosystem plays, they're still struggling over organisation & cultural structure by blindly embarking on dual operating / bimodal approaches and they are generally far removed from understanding the full extent of repeatable gameplay. They lack scenario planning notwithstanding that most have no means of developing situational awareness or organisational learning. They're a long way off from the edge and by the time they ever get close then I've a host of other useful stuff to post which will hopefully keep you going.

So keep mapping, keep discovering those patterns, keep on taking out those chancers (oh, and keep on sending me your war stories, I've been enjoying several of these).

Saturday, February 21, 2015

All for the want of a telephone call ...

Many many moons ago I used to run security for a retail company. It was an interesting time, some fairly thorny problems but the biggest issue I faced was the culture that had become established in the company. The cause of the issue was policy mixed with rumour.

Years before I arrived, one employee had managed to clock up a moderate telephone bill due to personal calls. Someone had noticed. They had then gone on to estimate a total cost to the company of people making personal phone calls. A policy was introduced "no personal calls to be made at work".

It didn't take long for the rumour mill to start. Before long a common rumour was that one employee working late one night had called home to say he was going to be late. He was fired the next day. Of course, none of this was true but that didn't matter. 

A culture of fear had been established in the company and as with all such cultures it grew and fed on rumour. It didn't take much for people to leap to a conclusion that security was monitoring all phone calls. The stories became ever more outlandish.

By the time I arrived, the culture of fear around security was well established. No-one wanted to speak to security or point out if something was wrong because of a fear of the consequences. Stories that security would investigate why you were looking at something and who you were talking to were widespread. The common attitude was if something looked amiss, keep your mouth shut.

All of this derived from that earlier treatment of a minor matter that could have been so much better dealt with by simply asking the employee to refrain from making so many personal calls. I've generally found that whilst people make mistakes or do foolish things occasionally, these are exceptions and treating them like adults is the way forward. Creating policies to deal with these exceptions almost always results in treating the general population as though they're not adult and the consequences of this are negative.

Certainly in places with high levels of mistrust then people want to know "their boundaries" but this is a sign of mistrust and a culture that is not healthy rather than anything positive. In such environments it can be extremely difficult to rebuild trust and the policies and rumours that surround them will counter you at every turn.

Why do I mention this? Well, be careful with your policies in an organisation. Straying away from the mantra of "treating people like adults so they behave like adults" by introducing policies to cover the exception is a sign of weak management and will come back to bite you in the long term. A policy should be a last resort and not what you should immediately dive into.

Friday, February 20, 2015

On open source, gameplay and cloud

Back in 2005, Fotango launched the world's first Platform as a Service (known as Zimki) though we called it Framework as a Service in those days (consultants hadn't yet re-invented the term). It provided basic component services (for storage, messaging, templates, billing etc), a development environment using isolated containers, a single language - JavaScript - which was used to write applications both front and back-end plus exposure of the entire environment through APIs.

Zimki was based upon user need and the idea of removing all the unnecessary tasks behind development (known as yak shaving). It grew very rapidly and then it was shut down in its prime as the parent company was advised that this was not the future. Today, Cloud Foundry follows the identical path and with much success.

The lesson of this story was "Don't listen to big name strategy consultancy firms, they know less than you do" - unfortunately this is a lesson which we continuously fail to learn despite my best efforts.

Fotango was a very profitable company at the time and had it continued the path then there was every chance that Canon would have ended up a major player in Cloud Computing. The upside of this story is that I went on to help nudge Canonical into the cloud in 2008, wrote the "Better for less" paper in 2009/2010 which had a minor influence in helping nudge others and without this, I probably would never have got around to teaching other organisations how to map. This has started to turn out to be very useful indeed, especially in getting organisations to think strategically and remove their dependence upon overpriced strategy consultants. You can guess, I don't like strategy consultancy firms and the endless meme copying they encourage.

The most interesting part of Zimki was the play. It was based upon an understanding of the landscape (through mapping) and use of this situational awareness to navigate a future path. 

The key elements of the play were ...

1) Build a highly industrialised platform with component services to remove all 'yak shaving' involved in the activity e.g. in developing an application. We used to have 'Pre Shaved Yak' T-Shirts to help emphasise this point. The underlying elements of the platform were used in many of Fotango services which themselves had millions of users. The use of component services was based upon limitation of choice, an essential ingredient for a platform unless you want to create a sprawl generator. To give you an idea of speed, the platform was so advanced in 2006 that you could build from scratch and release relatively complex systems in a single day. Nothing came close.

2) Expose all elements of the platform through APIs. The entire user interface of Zimki communicated through the same publicly available APIs. This was essential in order to create a testing service which demonstrated that one installation of Zimki was the equivalent of another and hence allowed for portability between multiple installations. It's worth emphasising that there was a vast range of component services all exposed through APIs.

3) Open source the entire platform to enable competitors to provide a Zimki service. A key part of this was to use a trademarked image (the Zimki provider) to distinguish between community efforts and providers complying with the testing service (which ensured switching). The mix of open source, testing service and trademarked image was essential for creating a competitive marketplace without lock-in and avoiding a collective prisoner's dilemma. We knew companies would have inertia to this change and they would attempt to apply product mentality to what was a utility world. The plan was to announce the open sourcing at OSCON in 2007 where I had a keynote to discuss this, but alas that plan was scuppered at the very last moment.

4) Use the open approach to further industrialise the space. Exploit any ecosystem building on top of the services to identify new opportunities and components for industrialisation. A key element of any platform is not just the ecosystem of companies building within the platform but the ecosystem of companies building on top of it by including the APIs in their products or services.

5) Co-opt anything of use. We had guessed in 2005 that someone would make an IaaS play (though back then we called it Hardware as a Service, those meme re-inventing consultants hadn't turned up yet). Turned out it was Amazon in 2006 and so we quickly co-opted it for the underlying infrastructure. With no-one else playing in this platform space, there was nothing else for us to really co-opt. We were first.

6) Build a business based upon a competitive market with operational efficiency rather than lock-in and feature differentiation. There were around 13 different business models we identified, we focused on a few and there was plenty to go around by building upon this idea of a 'small piece of a big pie'.

In the end, despite its growth, Zimki was closed down and the open sourcing stopped. I did however give a talk at OSCON in 2007 covering commoditisation (a rehash of an earlier 2006 talk), the creation of competitive markets and the potential for this. 

A couple of points to note ...
  • The Cloud Foundry model is straight out of the Zimki playbook and almost identical in every respect which is why I'm so delighted by their success. I know lots of the people there and I was in their office in Old Street last week (it's actually opposite the old Fotango offices) and so it was nice to "cross the road" after a decade and see a platform succeeding. The recent ODP (open data platform) announcement also follows the same play. This is all good and I'm a huge supporter.
  • The OpenStack effort made the critical error of not dealing with the tendency of product vendors to create a collective prisoner's dilemma and compounded this by failing to co-opt AWS. They were warned but were hell bent on differentiation for the flimsiest of reasons (future speculation on API copyright). They'll survive in niches but won't claim the crown they should have done by dominating the market.
  • OpenShift has failed to learn the essential lesson of limitation of choice. I'm sure some will adopt and learn the lessons of sprawl once again. I do tell RHT every year they should just adopt Cloud Foundry but c'est la vie.

Anyway, there's more to good gameplay than just open sourcing a code base. Keep an eye on Pivotal, they have good people there who know what they're doing.

Monday, February 16, 2015

On Evolution, Disruption and the Pace of Change

When people talk about disruption they often ignore that there is more than one type. You have the highly unpredictable form of product vs product substitution (e.g. Apple vs RIM) and then you have the highly predictable form such as product to utility substitution (e.g. Cloud). Confusing the two forms leads to very public arguments that disruption isn't predictable (Lepore) versus disruption is predictable (Christensen) - see The New Yorker, The Disruption Machine.

The answer to this is of course that they are both right, both can give examples to support their case and at the same time they're both wrong.  If you mix the two forms of disruption then you can always make a case that disruption appears to be both highly unpredictable and predictable. 

The second common mistake that I see is people confusing diffusion with evolution. The two are not the same and when it comes to 'crossing the chasm' then in the evolution of any single act there are many chasms to cross.

The third common mistake is to confuse commoditisation with commodification. This is such a basic error that it surprises me that it happens in this day and age. Still, it does.

The fourth common mistake is the abuse of the term innovation. This is so widespread that it doesn't surprise me many companies have little to no situational awareness and that magic thinking abounds. If you can't even distinguish this most basic of concepts into the various forms then you have little to no hope of understanding change.

Once you start mapping out environments then you quickly start to discover a vast array of common and repeatable patterns from componentisation to peace / war & wonder to the use of ecosystems. There's a vast array of tactical & strategic gameplay which is possible once you know the basics and contrary to popular belief, even changes in company age and the processes of how companies evolve are not in the realm of the mysterious & arcane. I've used this stuff for a decade to considerable effect from private companies to Governments.

The fifth common mistake is pace.

I recently posted a tongue in cheek pattern of how things change. I like to use a bit of humour to point to something with a bit of truth.

In the same way, I used my enterprise adoption graph to provide a bit of humour pointing to the issue of inertia.

However, whilst most people half heartedly agreed with elements of the pattern of change (just like they agreed with the Enterprise Adoption Curve), there was equally disagreement with the timeline. The majority felt that the timeline was too long.

People often forget that concepts can start a long time ago, e.g. contrary to popular ideas, 3D printing started in 1967. The problem is that we are currently undergoing many points of "war" within IT as an array of activities move from product to utility services. Whilst these "wars" are predictable to a greater degree, they are also highly disruptive to past industries. Multiple overlapping "wars" can give us a feeling that the pace of change is rapid.

I need to be clear that I'm not saying the underlying rate of change is constant. The underlying rate of change does accelerate due to industrialisation (i.e. commoditisation) of the means of communication (see postage stamp, printing press, telephone, internet). BUT it's too easy to view overlapping "wars" as some signal that the pace of change itself is much faster than it is.

It still takes about 30-50 years for something to evolve from genesis to the point of industrialisation, though a speculative case can be made that this accelerated post internet to about 20-30 years with 10-15 years for the 'war' to disrupt the past. We don't have enough data to be sure at the moment hence the importance of weak signals in determining when the 'war' will start.

However, come 2025-2030 with many overlapping wars (from IoT to Sensors as a Service to Immersive to Robotics to 3D printing to Genetics to Currency) then it'll feel way faster than that and way faster than it is today.

Friday, February 13, 2015

Why big data won't improve business strategy for most companies.

Over the years, I've heard a lot of people talk about algorithmic business and how big data will improve business strategy. For most, I strongly suspect it won't. To explain why, I'm going to have to say certain things that many will find uncomfortable. To begin with, I'm going to have to expand on my Chess in Business analogy.

I want you to imagine you live in a world where everyone plays Chess against everyone else. How good you are at Chess really matters, it determines your status and your wealth. Everyone plays everyone through a large computer grid. Ranks of Chess players are created and those that win are celebrated. Competition is rife.

The oddest part of this however is that no-one in this world has actually seen a Chessboard.

When you play Chess in this world, you do so through a control panel and it looks like this ...

Each player takes it in turn to press a piece. Each player is aware of what piece the other player pressed. 

And so the game continues until someone wins or it is a draw. Unbeknownst to either player, there is actually a board and when they press a piece then a piece of that type is randomly selected. That piece is then moved on the board to a randomly selected position from all the legal moves that exist. White's opening move might be to select 'Pawn', one of White's eight pawns would then be moved one or two spaces forward. But all of this is hidden from the players, they don't know any of this exists, they have no concept there is a board ... all they see is the control panel and the sequence of presses.
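To make the hidden mechanism concrete, here is a minimal Python sketch of what happens behind the control panel. It is only an illustration under stated assumptions: the legal moves are handed in as a plain list rather than generated by a chess engine, and the names (resolve_press, opening, press_log) are made up for this example.

```python
import random
from collections import defaultdict

def resolve_press(legal_moves, pressed, rng=random):
    """Turn a button press into a hidden move on the board.

    legal_moves is a list of (piece_type, from_square, to_square) tuples.
    A random piece of the pressed type is chosen, then a random legal move
    for that piece. Returns (from_square, to_square), or None if no piece
    of that type can move. (Assumption: only pieces with a legal move are
    candidates.)
    """
    moves_by_piece = defaultdict(list)
    for piece_type, src, dst in legal_moves:
        if piece_type == pressed:
            moves_by_piece[src].append(dst)
    if not moves_by_piece:
        return None
    src = rng.choice(sorted(moves_by_piece))   # random piece of that type
    dst = rng.choice(moves_by_piece[src])      # random legal destination
    return (src, dst)

# White's opening position: each pawn may move one or two squares forward,
# each knight has two squares available.
opening = [("Pawn", f + "2", f + r) for f in "abcdefgh" for r in "34"]
opening += [("Knight", "b1", "a3"), ("Knight", "b1", "c3"),
            ("Knight", "g1", "f3"), ("Knight", "g1", "h3")]

hidden_move = resolve_press(opening, "Pawn", random.Random(0))
press_log = ["Pawn"]  # all that either player ever gets to see
```

The point of the sketch is the last two lines: the board records a specific move between two squares, while the players only ever record the word "Pawn". Sixteen different pawn moves collapse into one entry in the sequence.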

However, within that sequence people will find 'favourable' patterns e.g. the best players seem to press the Queen as often as possible! People will come up with their own favourite combinations and try to learn the combinations of other winners! 

"When your opponent moves the Knight, you should counter with Queen, Queen, Rook" etc. 

People in this world would write books on secret knowledge such as "The art of the Bishop" or "The ten most popular sequences of Successful Winners". The more you aggregate data from different games, the more patterns will be proclaimed. Some through experience might even gain a vague notion of a landscape. Let us suppose one of those people - by luck, by accident or by whatever means - twigs that a landscape does exist and somehow finds a way to interact with it.

Imagine one day, you play that individual but they can see something quite remarkable ... the board. You start off planning to use your favourite opening combination of Pawn, Pawn, Knight, Bishop but before you know it, you've lost.

Obviously, this was just luck! But every time you play this player, you lose and you lose fast. You keep recording their sequences. They beat you in two moves - Pawn (b) and Queen (b) (known as Fool's Mate) or they beat you in four moves - Pawn (b), Queen (b), Bishop (b), Queen (b) (known as Scholar's Mate) - but you can't replicate their success. Even though you copy their sequences, it never works and you keep on losing.

People will start to seek all different forms of answers to explain what is happening. Maybe it's not just the sequence but the way that the player presses the button? Maybe it's timing? Maybe it's what they had for lunch? Maybe it's their attitude? They seem to be happy (because they're winning). Is happiness the secret to winning? All sorts of correlations and speculations will be jumped upon.

However there is no luck to this or special way of pressing buttons. The player controlling Black simply has far greater situational awareness. The problem for the player controlling White is they have no real understanding of the context of the game they are playing. White's control panel is just a shadow of the landscape and the sequence of moves lacks any positional information. When faced with a player who does understand the environment then no amount of large scale data analysis on combinations of sequences of presses through the control panel is going to help you.

I need to emphasise this point of understanding the landscape and the importance of situational awareness and so we'll now turn our attention to Themistocles. 

The battle of Thermopylae (the tale of the three hundred) and the clash between the Greeks and the mighty army of Xerxes has echoed throughout history as a lesson in the force multiplier effect of landscape. Themistocles devised a strategy whereby the Athenian navy would block the straits of Artemision forcing the Persians along the coastal road into the narrow pass of Thermopylae where a smaller force could be used to hold back the vastly greater Persian numbers. I've provided a map of this below. 

Now, Themistocles had options. He could defend around Athens or defend around Thebes but he chose to block the straits and exploit the landscape. Each of Themistocles' options represents a different WHERE on the map. In the same way that our enlightened Chess player has many WHEREs to move pieces on the board. By understanding the landscape you can make a choice based upon WHY one move is better than another. This is a key point to understand. 

WHY is a relative statement i.e. WHY here over there. However, to answer the question of WHY you need to first understand WHERE and that requires situational awareness.

Now imagine that Themistocles had turned up on the eve of battle and said - "I don't know WHERE we need to defend because I don't understand the landscape but don't worry, I've produced a SWOT diagram"

How confident would you feel?

Now in terms of combat, I'd hope you'd agree that a strategy based upon an understanding of the landscape is going to be vastly superior to a strategy that is derived from a SWOT. The question you need to ask yourself however is ...

... what do we most commonly use in business? SWOT or MAP?

The problem that exists in most businesses is that they have little or no situational awareness. Most executives are even unaware they can gain greater situational awareness. They are usually like the Chess Players using the Control Panel and oblivious to the existence of the board. I say oblivious because these people aren't daft, they just don't know it exists.

But how can we be sure this is happening?

Language and behaviour are often our biggest clues. If you take our two Chess players (White with little or no situational awareness, Black with high levels) then you often see a marked difference in behaviour and language.

Those with high levels of situational awareness often talk about positioning and movement. Their strategy is derived from WHERE and so they can clearly articulate WHY they are making a specific choice. They tend to use visual representation (the board, the map) to articulate encounters both present and past. They learn from those visualisations. They learn to anticipate others' moves and prepare counters. Those around them can clearly articulate the strategy from the map. Those with situational awareness talk about the importance of WHERE - "we need to drive them into the pass", "we need to own this part of the market". They describe how strategy is derived from understanding the context, the environment and exploiting it to your favour. The strategy always adapts to that context.

Those with low levels of situational awareness often talk about the HOW, WHAT and WHEN of action (e.g. the combination of presses). They tend to focus on execution as the key. They poorly articulate the WHY having little or no understanding of potential WHEREs. They have little or no concept of positioning and they often look to 'copy' the moves of others - "Company X gained these benefits from implementing a digital first strategy, we should do the same". Any strategy they have is often gut feel, story telling, alchemy, copying, magic numbers and bereft of any means of visualising the playing field. Those around them often exhibit confusion when asked to describe the strategy in any precision beyond vague platitudes. Those who lack situational awareness often talk about the importance of WHY? They describe how strategy should be derived from your vision (i.e. general aspirations). They often look at you agog when you ask them "Can you show me a map of your competitive environment?"

Back in 2012, I interviewed 160 different high tech companies in Silicon Valley looking at their level of strategic play (specifically their situational awareness) versus their use of open (whether source, hardware, process, APIs or data) as a way of manipulating their environment. These companies were the most competitive of the competitive.

What the study showed was that there was a significant difference in the levels of strategic play and situational awareness between companies. When you add in market cap changes over the last seven years then those with high levels of strategic play had performed vastly better than those with low levels.

Now this was Silicon Valley and a select few. I've conducted various interviews since then and I've come to the conclusion that less than 1% of companies have any mechanism of improving or visualising their environment. The vast majority suffer from little to no situational awareness but then they are competing in a world against others who also lack situational awareness.

This is probably why the majority of strategy documents contain a tyranny of action and why most companies seem to duplicate strategy memes and rely on backward causality. We should be digital first, cloud first, we need a strategy for big data, social media, cloud, insight from data, IoT ... yada yada. It's also undoubtedly why companies get disrupted by predictable changes and are unable to overcome their inertia to the change.

At the same time, I've seen a select group of companies and parts of different Governments use mapping to remarkable effects and discovered others who have equivalent mental models and use this to devastate opponents. Now, don't believe we're anywhere near the pinnacle of situational awareness in business - far better methods, techniques and models will be discovered than those that exist today.

Which brings me back to the title. In the game of Chess above, yes you can use large scale data analytics to discover new patterns in the sequences of presses but this won't help you against the player with better situational awareness. The key is first to understand the board.

Now certainly in many organisations the use of analytics will help you improve your supply chain or understand user behaviour or marketing or loyalty programmes or operational performance or any number of areas in which we have some understanding of the environment. But business strategy itself operates usually in a near vacuum of situational awareness. For the vast majority then I've yet to see any real evidence to suggest that big data is going to improve "business strategy". There are a few and rare exceptions.

For most, it will become hunting for that magic sequence ... Pawn, Pawn ... what? I've lost again?

Thursday, February 12, 2015

Chess in Business

I want you to imagine you're playing a game of chess on a computer against another opponent. But rather than looking at the board, I want you to pretend that you're unaware that a chess board even exists.

When you play Chess, you have a control panel and it looks like this ...

When White moves then one of the pieces on your control panel flashes ...

You might press the piece or choose another. Let's say you select the Bishop, then White sees this on their control panel ...

and so the game continues.

Now obviously pieces are moving on the board but both players are unaware that a board exists. We shall assume some process where a random piece of the type selected is chosen, moved a random number of squares in a random direction according to the rules allowed by chess.

Eventually some player will press a button and they will win! People will take note, possibly even copy their moves. You'll probably get people compiling lists of combination presses, and all sorts of wonderful books and information will be generated to help you press the right button. Of course, all of this will be story telling, anecdotes, gut feel and alchemy.

Now imagine, that one day, you play another individual but unbeknown to you, that player doesn't have a control panel. In fact what they see is something almost magical ... this ...

You play your game through your control panel, using your favourite "special" combinations and whatever Big Name Strategy consultancy says is the latest set of popular presses. No matter what you do, you're going to get your arse kicked. Even your team of overpriced strategy consultants harping "Press the Rook, Press the Rook" won't save you.

But worse than this, the more you play this person then the better they will become and the arse kickings will get even more frequent.

Mapping is like suddenly exposing the chess board in business and playing against companies who view the world through the equivalent of a control panel. This is why it always makes me smile to hear people talking about business strategy as either like chess or going beyond it. Most business strategy is nowhere close to this - it's more story telling, anecdotes, gut feel and alchemy.

Wednesday, February 11, 2015

On Government, platform, purchasing and the commercial world.

Whenever you run a large organisation, the first step should always be to get an idea of what you do, how much you do, what each transaction costs and, if you're a commercial company, what revenue each transaction makes. This is the most super simple basic information that every CEO should have at their fingertips.

In the case of UK Gov, there are 784 known transactions of which 700 have data on volume and 178 have published cost per transaction. These are shown in figures 1 & 2 (source - UK Gov High Volume Services Transactions)

Figure 1 - Volume of transaction

Figure 2 - Cost per Transaction

In the UK Gov they know that the cost of transaction of a memorial grant scheme claim is £1,085 but a driving licence replacement only costs £7.85. Equally they know that we have over 393,000 charity applications, change of details and gift aids but only 19 Human Tissue Authority Licensing Applications.

Any company of significant size will have a number of core transactions though it probably won't be anywhere near the scale and complexity of UK Gov.  But let us take a hypothetical global Media company with 50 or so different core transactions from the Commissioning of TV programmes to Merchandising to a Games Division to Broadband provision to Mobile Telephony to News Content to Radio to ... etc etc.

Now, each of those transactions should represent the provision of a user need. That is after all where value should be created. Now, the purpose of mapping is not just to force a focus on user needs, improve communication, improve strategic play, use of appropriate methods, provide a mechanism of organisational learning and mitigate risks but also to remove duplication and bias.

If you map out multiple different transactions, you will find that common elements exist between them and that bias exists in the way groups treat those activities. (see steps 4 and 5 on mapping and if you haven't read that post, you need to in order for the following to make sense). 

By aggregating multiple maps together, you can create a view of how things are treated (see figure 3) and then determine (by looking at clusters) how things should be treated (see figure 4)

Figure 3 - An Aggregated view

Figure 4 - How things should be treated.

There are two important parts of figure 4 - first is the duplication (i.e. how many instances of this thing exist in our maps) and second is how it should be treated (determined by looking at the clusters of how it currently is treated) which is a necessity in overcoming bias. It's not uncommon to find 18 different references and examples of compute in your maps and have half a dozen groups arguing that somehow their "compute" is special to them and unlike anybody else's.  If you're up for a laugh, try rules engines (a particular favourite of mine) if you want endless arguments by different groups about how their near identical systems are unique.
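As a rough illustration of that aggregation, here is a minimal Python sketch. Everything in it is hypothetical: the map names, the components, the four stage labels and the rule of taking the largest cluster as "how it should be treated" (with ties broken towards the more evolved stage) are assumptions made for this example, not a prescribed method.

```python
from collections import Counter, defaultdict

# Assumed stage labels, ordered from least to most evolved.
STAGES = ["genesis", "custom", "product", "commodity"]

def aggregate(maps):
    """Combine several maps into one view per component.

    maps: {map_name: {component: stage_it_is_currently_treated_as}}
    Returns {component: {"instances", "suggested", "outliers"}}.
    """
    treatments = defaultdict(list)
    for components in maps.values():
        for component, stage in components.items():
            treatments[component].append(stage)

    summary = {}
    for component, stages in treatments.items():
        counts = Counter(stages)
        # Largest cluster wins; on a tie prefer the more evolved stage.
        suggested = max(counts, key=lambda s: (counts[s], STAGES.index(s)))
        summary[component] = {
            "instances": len(stages),  # duplication across maps
            "suggested": suggested,    # how it should be treated
            "outliers": sorted(set(stages) - {suggested}, key=STAGES.index),
        }
    return summary

# Hypothetical maps for three transactions of the Media company.
maps = {
    "tv commissioning": {"compute": "commodity", "rules engine": "custom",
                         "fraud analysis": "product"},
    "merchandising": {"compute": "commodity", "rules engine": "product",
                      "fraud analysis": "product"},
    "broadband": {"compute": "custom", "rules engine": "product"},
}
summary = aggregate(maps)
```

Here summary["compute"] reports three instances, a suggested treatment of commodity and one outlier (the group treating compute as custom built). That outlier is the group to go and have the "why is yours special?" conversation with.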

Now this aggregate map also has other purposes. Let us suppose you want to build a platform by which I mean not some "monolithic does everything" platform (a guaranteed route to quick failure) but a mass of discrete component services exposed through APIs and often associated with a development environment. When you're building those component services, the last thing you want to build is the uncommon and constantly changing component (i.e. those in the uncharted space). What you want to build is the common and fairly commodity like component services which are therefore well understood, stable and suitable for more volume operations.

Now, fortunately, the aggregated view shows you where these are in all your value chains. Sometimes, you'll find the market has already provided a component service for you to use e.g. Amazon EC2 for compute. There may be reasons such as buyer / supplier relationship that you may wish to choose a particular route over another but at least the aggregate map points you to where you should be looking.

In other cases, you'll find no market solution even though a component is common and commodity like and this is your target for building a component service. For example, if Fraud Analysis is a common component of many of your value chains and assuming it's not provided in the market then this is a component service for you to consider. Now, in the case of Gov that component service could be provided by a central group such as GDS or even a department. For example if DWP was particularly effective at Fraud Analysis then there is no reason why it shouldn't offer this as a component service to other departments. Ditto GCHQ and large scale secure data storage.
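The consume-or-build decision described above can be sketched in a few lines of Python. This is a simplification under assumptions: the function name, the threshold for "common" and the sample data are all hypothetical (the 18 instances of compute echoes the earlier example, the other figures are invented for illustration).

```python
def plan_components(components, market_provided, min_instances=2):
    """Split common, commodity-like components into consume vs build.

    components: {name: (instances_across_maps, stage)}
    market_provided: names the market already offers as a service.
    Uncommon or still-changing components are left out entirely, since
    they are not candidates for a platform component service.
    """
    consume, build = [], []
    for name, (instances, stage) in sorted(components.items()):
        if stage != "commodity" or instances < min_instances:
            continue
        if name in market_provided:
            consume.append(name)   # e.g. use an existing utility service
        else:
            build.append(name)     # common, commodity like, no supplier
    return {"consume": consume, "build": build}

# Hypothetical aggregated view.
components = {
    "compute": (18, "commodity"),
    "fraud analysis": (6, "commodity"),
    "recommendation model": (1, "genesis"),
}
plan = plan_components(components, market_provided={"compute"})
```

With this data the plan is to consume compute from the market and to consider building fraud analysis as a component service, while the novel recommendation model stays out of the platform. Real choices would also weigh things like buyer / supplier relationships, which this sketch ignores.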

In fact, Government has a mechanism - known as G-Cloud - where departments could compete with the outside market to provide common components to others. All it requires is some central group such as GDS to identify from all the maps of different value chains what the common components should be and to offer the opportunity to a department to provide it (and if they don't accept then GDS could look to provide it itself).

But let us not digress and go back to our Media company. We've taken an aggregated view, we've challenged bias in the organisation, we've identified duplication and determined where we're going to consume services from other groups and what services we're going to provide (hence building our platform of component services), we've even added some strategic play into the mix using open source (see previous post). We now subdivide this environment into contract components (for reasons of risk mitigation), apply the right methods and build a suitable team structure. Our map and understanding of the environment hopefully took only a few hours and now looks something like this.

Figure 5 - Our Map

Now, a key part of this map is it identifies what should be outsourced, use utility services, use other department (or platform) components, built in a six sigma like fashion with a focus on volume operations VERSUS that which needs to be built in-house, using agile techniques and in a more dynamic environment VERSUS that which should be more off the shelf, lean etc.

Obviously, we're aware everything is evolving between the two extremes due to supply and demand competition hence the necessity for a three party system like Pioneer, Settler and Town Planner (don't get me started on the bimodal gibberish). We're all good.

BUT ... in the same way we need different methods for project management in any large scale project ideally scheduled together with Kanban, then we also need different purchasing tactics (see figure 6).

Figure 6 - Project management and purchasing methods

In other words, as with any large system where one size fits all project methodologies are ineffective, the same is true with purchasing. Any large scale system requires a mix of time and material, outcome based, COTS & fixed contract and unit / utility charging. Each has different strengths and merits as with project management methods. All activities evolve and how you purchase them will change accordingly.

So a couple of points ...

1) Long term contracts for any component are a big effing "no, no!" unless you're talking about a commodity / utility. The act will evolve and so will the method of purchasing.

2) Treating massive systems as a whole and applying a single size method e.g. six sigma everywhere, outsource everything, fixed price contracts everywhere is a big effing "no, NO!" with cream on top. This is almost guaranteed to cause massive cost overruns and failure.

Unfortunately, prior to the coalition, we had a long spell in UK Gov IT where single size methods such as outsource everything, fixed price contracts became the dogma based upon a single idea that if you can get the specification right then it'll all work. 

This idea is unbelievably ludicrous. Take a look at a map of any complex system such as the IT in High Speed Rail (see figure 7).

Figure 7 - Early Map HS2 IT

Whilst some elements of that map are industrialised (suitable for outsourcing to utility suppliers with known specifications), large chunks of that map contain elements that are uncharted i.e. they're novel, uncertain and constantly changing. We DON'T know what we need, the specification will change inevitably, these components are about exploration and no amount of navel gazing will write the perfect specification. If you plaster a single size method across the lot then you'll inevitably incur massive change control costs and overruns precisely because a large chunk of the map is uncertain, it will change!

However, alas, we had this single size idea and not only did the single size approach lead to massive cost overruns and a catalogue of IT disasters, it also had the nefarious effect of reducing internal engineering capability. This had serious consequences for the ability of Government to challenge (something we covered in the 'Better for Less' paper, along with the need for an "intelligent customer" and a Leverage & Commoditise group to ensure challenge).

In UK Gov, they now have that more "intelligent customer" in the guise of GDS and the Digital Leaders Network. We also have that challenge function in GDS spend control. 

I kid you not, but prior to this I saw an example which was as close as makes no difference to ...

1) Department wanted to do something and asked their favourite vendor to do the options analysis.
2) The options came back as Option A) Rubbish, Option B) Rubbish, Option C) Hire us - Brilliant. 
3) The department then asked how much for option C) and haggled a small amount on a figure that was in excess of £100 Million. 

This was for a project on which I couldn't work out how you could spend more than £5-£10 Million. This was not "challenge". This was handing over large fistfuls of cash. That has now fortunately changed. However, you have to be extremely careful to avoid one size fits all methods, otherwise it's easy to fall into the same trap.

So, this brings me to purchasing. Any group involved in this has to know how to use multiple purchasing methods to get the best result. It also has a need which it supplies. That need is "to help the Engineering capability deliver and develop efficiently and effectively". 

This is key. 

Purchasing has to be subservient to Engineering. It has to support it, advise on the best use of contracts and enable it to do the job. If it isn't subservient then you run the danger of a one size fits all policy being applied in future and the same mess highlighted above being created.

Now, those various types of work (whether T&M, outcome based, COTS, fixed price or utility) can often be labelled "services" and there is no reason why you can't have a competitive market of these. In the UK Gov case, G-Cloud seems a suitable framework.

So, why do I mention all of this?

First, there was this discussion I had with @harrym on the Digital Services Framework and UK Gov Purchasing.

Then, there was this rather disturbing post by Chris Chant on everything unacceptable with UK Purchasing.

Now, this disturbs me. The real concern is the accusation that CCS (the Crown Commercial Service) is overstepping the mark, not acting as the support function that it is, and imposing policy. If that is the case, I can see why Chris argues that CCS should be scrapped or at least (in my view) absorbed into GDS.

Oh, why the commercial world bit in the title?

Let's be honest, I lied at the beginning. When I said this is the "most super simple basic information that every CEO should have at their fingertips" then we all know that the vast majority of large companies in the commercial world have no idea what transactions they do and what volume of transactions they have beyond some basic revenue lines.

Without this, it's fairly farcical to assume large corporations have any concept of user needs, mapping, situational awareness or strategic play beyond simply copying others. This sort of discussion is light years ahead of where most major corporates are (there are a few exceptions like Amazon). The problems that UK Gov faces are problems that most corporates could only dream of. 

I only put that line in because some troll often pipes up and says "UK Gov should be more like the commercial market". Yeah, for the vast majority that is clueless and hopeless, waiting to be disrupted whilst pretending they're master chess players without ever looking at a board. No thanks, this is our Government we're talking about.