Tuesday, April 21, 2015

DevOps ... we've been here before, we will be back again.

In this post I want to explore the causes of DevOps and how you can use such knowledge to your advantage in other fields. I'm going to start with a trawl back through history and four snippets from a board pack in early 2007. These snippets describe part of the live operations of Fotango, a London-based software house, in 2006.

Snippet 1


We were running a private infrastructure as a service with extensive configuration management, auto deployment and self healing (design for failure) of systems based upon cfengine. We were using web services throughout to provide discrete component services and had close to continuous development mechanisms. In 2006, we were far from the only ones doing this but it was still an emerging practice. I didn't mention agile development in the board pack ... that was old hat.

Snippet 2


To be clear, we were running a private and a public platform as a service back in 2006. This was quite rare but still more of a very early emerging practice.

Snippet 3


In early 2007, we had switching of applications between multiple installations of platform as a service from our own private infrastructure as a service (Borg) to one we had installed on the newly released EC2. This was close to a novel practice.

Snippet 4


By early 2007 we were working on mechanisms to move applications or data between environments based upon cost of storage, cost of transfer and cost of processing. In some cases it was cheaper to move the data to the application, in other cases the application to the data. We were also playing some fairly advanced strategic games based upon tools like mapping. However, one of my favourite changes (which we barely touch on today) is when you have pricing information down to the function. This can significantly alter development practices i.e. we used to spend time focusing on specific functions because they were costly compared to other functions. You could literally watch the bill racking up in the real time billing system as your code was running and one or two functions always stood out. This always helps concentrate the mind and this was in the realm of novel practice in 2007.
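To make the data-versus-application trade-off concrete, here is a minimal sketch in Python. It is not the Fotango mechanism, only an illustration of comparing the two moves on storage, transfer and processing cost. The `Environment` class, the function names and every price are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Environment:
    """Hypothetical per-environment prices (all figures are illustrative)."""
    name: str
    storage_cost_per_gb: float       # cost to hold 1 GB for the duration of the job
    processing_cost_per_hour: float  # cost of one hour of processing


def placement_cost(env, data_gb, compute_hours, transfer_gb, transfer_cost_per_gb):
    """Total cost of running the job in `env` after moving `transfer_gb` into it."""
    return (
        data_gb * env.storage_cost_per_gb
        + compute_hours * env.processing_cost_per_hour
        + transfer_gb * transfer_cost_per_gb
    )


def cheapest_move(app_env, data_env, data_gb, app_gb, compute_hours, transfer_cost_per_gb):
    """Compare moving the data to the application with moving the application to the data."""
    # Option A: ship the data to where the application already runs.
    move_data = placement_cost(app_env, data_gb, compute_hours, data_gb, transfer_cost_per_gb)
    # Option B: ship the application to where the data already lives.
    move_app = placement_cost(data_env, data_gb, compute_hours, app_gb, transfer_cost_per_gb)
    if move_data <= move_app:
        return "move data to application", move_data
    return "move application to data", move_app


# Illustrative prices only: a cheaper private environment and a dearer public one.
private = Environment("private", storage_cost_per_gb=0.25, processing_cost_per_hour=1.0)
public = Environment("public", storage_cost_per_gb=0.5, processing_cost_per_hour=2.0)

decision, cost = cheapest_move(private, public, data_gb=100, app_gb=2,
                               compute_hours=10, transfer_cost_per_gb=0.5)
```

With a large data set the transfer term dominates and the application moves; shrink `data_gb` far enough and the answer flips, which is the "in some cases ... in other cases" behaviour described above.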

Much of what we talk about regarding DevOps and the changes in practice today is not new. It is simply becoming good practice in our industry. For the majority of these changes, the days of novel and emerging practice have long gone. Many companies are however only just starting their journey and whilst most will get some things right - design for failure, distributed systems, use of good enough components, continuous deployment, compartmentalising systems and chaos engines - many are almost certainly doomed to repeat the same mistakes we made long ago - single size methods (agile everywhere), bimodal and API everything (some things just aren't evolved enough yet). Much of that failing will come from our desire to apply single methods without truly understanding the causes of change ... but we will get to that shortly.

The above is all perfectly normal and so is the timeframe. On average, it can take 20 to 30 years for a novel practice to become defined as a best practice. We're actually a good 10-15 years into our journey (in some cases more), so don't be surprised if it takes another decade for the above to become common best practice. Also, don't be surprised by the clamouring for skills in this area. That's another normal effect as every company wakes up to the potential and jumps on it at roughly the same time. Demand always tends to outstrip supply in these cases because we're lousy at planning for exponential change.

However, this isn't what interests me. What fascinates me are the causes of change (for reasons of strategic gameplay). To explain this, I need to distinguish between two things - the act (what we do) and the practice (how we do stuff). I've covered this before but it's worth reiterating that both activities and practices evolve through a common path (see figures 1 & 2) driven by competition.

Figure 1 - Evolution of an Act


Figure 2 - Evolution of Practice


Now, what's important to remember is the practice is dependent upon but distinct from the act. For this reason practices can co-evolve with activities. To explain, the best architectural practice around servers is based upon the idea of compute as a product (the act). These practices include scale up, N+1 (due to high MTTR - mean time to recovery) and disaster recovery tests. However, best architectural practice around IaaS is based upon the idea of compute as a utility i.e. volume operations of good enough components with a low MTTR. These practices include scale out, design for failure and chaos engines. In general, best practice for a product world is rarely the same as best practice for a utility world.

However, those practices have to come from somewhere and they evolve through the normal path of novel, emerging, good and best practice. To tie this together I've provided an example of how practice evolves with the act in figure 3 using the example of compute. 

Now, normally with a map I use an evolution axis of genesis, custom built, product (+rental) and commodity (+utility). However practices, data and knowledge all evolve through the same pattern of ubiquity and certainty.  So on the evolution axis I could use :-

Activities : Genesis, Custom Built, Product, Commodity.
Practices : Novel, Emerging, Good, Best
Data : Unmodelled, Divergent, Convergent, Modelled
Knowledge : Concept, Hypothesis, Theory, Accepted.
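The four sets of labels above can be treated as one lookup table keyed by what is being mapped. The sketch below is only an illustration; the names `EVOLUTION_LABELS`, `stage_label` and `translate` are my own. Note that `translate` only converts a position on the shared evolution axis from one vocabulary to another. It does not claim that, say, a product is always accompanied by good practice.

```python
# Labels for the four stages of evolution, by type of component.
EVOLUTION_LABELS = {
    "activities": ("Genesis", "Custom Built", "Product", "Commodity"),
    "practices": ("Novel", "Emerging", "Good", "Best"),
    "data": ("Unmodelled", "Divergent", "Convergent", "Modelled"),
    "knowledge": ("Concept", "Hypothesis", "Theory", "Accepted"),
}


def stage_label(kind, stage):
    """Return the label for stage 0-3 of the given kind of component."""
    return EVOLUTION_LABELS[kind][stage]


def translate(label, from_kind, to_kind):
    """Express the same axis position using another kind's vocabulary."""
    stage = EVOLUTION_LABELS[from_kind].index(label)
    return EVOLUTION_LABELS[to_kind][stage]


# A practice drawn at "Emerging" sits where a "Custom Built" activity would sit.
position = translate("Emerging", "practices", "activities")
```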

For simplicity's sake, I always use the axis of activities but the reader should keep in mind that on any map - activities, practice, data and knowledge can be drawn. In this case, also for the reason of simplicity, I've removed the value chain axis.

Figure 3 - Coevolution of practice with the act


From the above, the act of computing infrastructure evolves to a product and new architectural practices for scaling, capacity and testing develop around the concept of a product (i.e. a server). These practices evolve until they become best practice for the product world. As the underlying act now evolves to a more industrialised form, a new set of architectural practices appear. These evolve until they become best practice for that form of the act. This gives the following steps outlined in the above :-

Step 1 - Novel architectural practices evolve around compute as a product
Step 2 - Architectural practices evolve becoming emerging and good practice
Step 3 - Best architectural practices develop around compute as a product
Step 4 - Compute evolves to a utility
Step 5 - Novel architectural practice evolves as compute becomes a commodity and treated as a utility
Step 6 - Architectural practices evolve becoming emerging and good practice
Step 7 - Ultimately these good practices (DevOps) will evolve to become best practice for a utility world.
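The seven steps can be sketched as a toy model in which each form of the act carries its own set of practices, and a new set starts at novel whenever the act evolves. This is a simplification for illustration only; the class and method names are hypothetical and the model ignores timing entirely.

```python
PRACTICE_STAGES = ("Novel", "Emerging", "Good", "Best")


class CoevolvingPractice:
    """Tracks one set of architectural practices per form of the act."""

    def __init__(self, act_form):
        self.act_form = act_form
        # Form of the act -> index into PRACTICE_STAGES; a new practice is Novel.
        self.stages = {act_form: 0}

    def mature(self):
        """Move practice for the current form of the act one stage on, stopping at Best."""
        current = self.stages[self.act_form]
        self.stages[self.act_form] = min(current + 1, len(PRACTICE_STAGES) - 1)

    def act_evolves(self, new_form):
        """The act evolves; practice for the new form starts again at Novel."""
        self.act_form = new_form
        self.stages[new_form] = 0

    def describe(self):
        """Return the practice stage reached for each form of the act."""
        return {form: PRACTICE_STAGES[i] for form, i in self.stages.items()}


compute = CoevolvingPractice("product")  # step 1: novel practice around a product
compute.mature()                         # step 2: emerging ...
compute.mature()                         #         ... then good
compute.mature()                         # step 3: best practice for a product world
compute.act_evolves("utility")           # steps 4-5: act evolves, novel practice appears
compute.mature()                         # step 6: emerging ...
compute.mature()                         #         ... then good
state = compute.describe()
```

At this point the model holds best practice for the product form alongside good practice for the utility form, with step 7 (best practice for a utility world) still to come.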

When we talk about legacy in IT, we're generally talking about applications built with best architectural practice for a product world. When we talk about DevOps, we're generally talking about applications built with good to best architectural practice for a utility world. Both involve "best" practice, it's just the "best" practices are different because the underlying act has evolved.

This process of co-evolution of practice with activity has occurred throughout history whether engineering or finance or IT. When the act that is evolving has a significant impact on many different and diverse value chains then its evolution can cause macro economic effects known as k-waves or ages. With these ages, the new co-evolved practices that emerge tend to be associated with new forms of organisation. Hence in the mechanical age, the American System was born. With the electricity age, we developed Fordism.

Knowing this pattern of change enabled me to run a set of population experiments on companies to confirm the model and identify a new phenotype of an emerging company form (the next generation) back in 2011. The results are shown in table 1.

Table 1 - Next generation vs Traditional organisations


It's precisely because I understood this pattern and how practices evolved that back in Canonical (2008-2009) we knew we had to attack not just the utility compute space but also the emerging practice space (a field which became known as DevOps). It was actually one of my only causes of disagreement with Mark during my time there as I was adamant we should be adopting Chef (a system developed by a friend of mine Jesse Robbins). However, Mark had good reasons to focus elsewhere and at least we could have the discussion.

When it comes to attacking a practice space then natural talent and mindset are key. In the old days of Fotango, I captured a significant proportion of talent in the Perl industry through the creation of a centre of gravity (a post for another day). It was that talent that created not only the systems but discovered the architectural practices required to make it work. Artur Bergman (now the CEO of Fastly) developed many of the systems and subsequently was influential in the Velocity conference (along with Jesse). Those novel practices were starting to evolve in 2008.

In the Canonical days, I employed a lesser known but highly talented individual who was working on the management space of infrastructure - John Willis (Botchagalupe). Again my focus was deliberate, I needed someone to help capture the mindset in that space and John was perfect for the role. I didn't quite get to play the whole centre of gravity game at Canonical and there were always complications but enough was done. John himself has gone on to become another pillar of the DevOps movement.

Now, this pattern of co-evolution of practice and activity repeats throughout history and we have many future examples heading our way in different industries. All the predictable forms of this type of change are caused by the evolution of underlying activities to more industrialised forms. For example, manufacturing should be a very interesting example circa 2025-2035 due to commoditisation of underlying components through 3D printing, printed electronics and hybrid printing enabling new manufacturing practices. It even promises an entirely new form of language - SpimeScript - which is why the Solid conference by O'Reilly is so interesting to me. Any early signs are likely to appear there.

It's worth diving a bit deeper into this whole co-evolution subject. So let us go back in time to when the first compute products were introduced i.e. the IBM 650. Back then, there was no architectural practice for how to deal with scaling, resilience and disaster recovery. These weren't even things in our mindset. There was no book to read, there was no well trodden path and we had to discover these practices. What became obvious later was unknown, undiscovered and uncharted.

Hence people would build systems with these products and discover issues such as capacity planning and failure - we acted, we observed and then we had to respond to what we found. We had to explore what the causes of these problems were and create models and practices to try and cope. As our understanding of this space grew, those practices developed. We built expertise in this space and the tools to manage this. We talked of bottlenecks and throughput and of N+1, of load and of capacity. We started to anticipate the problems before they occurred - running out of storage space became a sign of poor practice. We sensed our environment with a range of tools, analysed for points of failure and we responded before it happened. Books were written and architectural practice became firmly in the space of the good. We then started to automate more - RAID, hot standby, clusters and endless tools to monitor and manage a complex environment of products (compute as services). Our architectural practice became best practice.

But as the underlying act evolved from compute as a product to compute as more of a commodity and ultimately a utility then the entire premise on which our practices were based changed. It wasn't about THE machine, it was about volume operations of good enough. We had to develop new architectural practices. But there was no book, no well trodden path and no expertise to call on. We had to once again use these environments, observe what was happening and respond accordingly. We created novel architectural practices which we refined as we understood more about the space. We learnt about design for failure, distributed systems and chaos engines - we had to discover and develop these. 

As we explored this new field we developed tools and a greater understanding. We started to have an idea of what we were looking for. The practices started to emerge and later develop. Today, we have expert knowledge (the DevOps field), a range of tools and well practiced models. We're even starting to automate many aspects of DevOps itself. 

The point to note is that even though architectural practice developed to the point of being highly automated, best practice and "obvious" in the product world, this was not the end of the story. The underlying act evolved to a more industrialised form and we went through the whole process of discovering architectural practices again.

Now a change of practice (and related governance structures) is one of the sixteen forms of inertia companies have to change. However because of competition dynamics, this change is inevitable (the Red Queen effect). We don't get a choice about this and that gives me an advantage. To explain why, I'll use an example from a company providing workshops.

The Workshop

This example relates to a company that provides workshops and books related to best practice in the environmental field. It's a thriving business which provides expert knowledge and advice (embodied in those workshops and books) about effective use of a specific domain of sensors. I have to be a bit vague here for reasons that will become obvious. The sensors used are quite expensive products but new more commoditised forms are appearing, mainly in Asia. At first glance, this appears to be beneficial because it'll reduce operating costs and is likely to expand the market. However, there is a danger.

To explain the problem, I'm going to use a very simple map on which I've drawn both activity and practice to describe the business (see figure 4)

Figure 4 - The Business



The user need is to gain best practice skill on the use of the sensors and the company provides this through workshops and associated materials such as books based upon best practice. Now the sensors are evolving. This will have a number of effects (see figure 5).

Figure 5 - Impact of the Change


From the above,

Step 1 : the underlying sensor becomes a commodity
Step 2 : this enables a novel practice (based upon commodity sensors) to appear. This practice will evolve to become emerging and then good.
Step 3 : the existing workshop business will become legacy
Step 4 : a workshop business based upon these more evolved practices will develop and it's the future of the market.

This change is not just about reducing operational costs of sensors but instead the whole business of the company will alter. The materials (books, workshops, tools etc) that they have will become legacy. Naturally the company will resist this change as they have a pre-existing business model, past revenues to justify the existing practices and a range of current skills, knowledge and relationships developed in this space. However, it doesn't matter because competition has driven the underlying act to more of a commodity and hence a new set of practices will emerge and evolve and the existing business will become legacy regardless.

Fortunately this hasn't happened yet. Even more fortunately, with a map we can anticipate what is going to happen, we can identify our inertia, we can discuss and plan accordingly. We know those novel practices will develop and we can aim to capture that space by developing talent in that area. We know we can't write those practices down today and we're going to have to experiment, to be involved, to act / sense and respond.

We can prepare for how to deal with the legacy practices, possibly aiming to dispose of part of this business. Just because we know the legacy practice will be disrupted, doesn't mean others will and if we have a going concern then we can maximise capital by flogging off this future legacy to some unsuspecting company or spinning it off in some way. Of course, timing will be critical. We will want to develop our future capability (the workshops, tools, books and expertise) related to the emerging practice, extract as much value from the existing business as possible and then dump the legacy at a time of maximum revenue / profit on the market without the wider industry being aware of the change. If you've got a ticking bomb never underestimate the opportunity to flog it to the market at a high price. Oh, and when it goes off, don't miss out on the opportunity of scavenging the carcass of whatever company took it for other things of value e.g. poaching staff etc.

There's lots we can do here, maybe spread a bit of FUD (fear, uncertainty and doubt) about the emerging practices to compound any inertia that competitors have. We know the change is inevitable but we can use the FUD to slow competitors and also give us an ideal reason (internal conflict) for diversifying the business (i.e. selling off the future "legacy" practice). There's actually a whole range of games we can play here from building a centre of gravity in the new space, disposal of the legacy (known as pig in a poke), to ecosystem plays to misdirection.

This is why situational awareness and understanding the common patterns of economic change is so critical in strategic gameplay. The moves we make (i.e. our direction) based upon an understanding of the map (i.e. position and movement of pieces) will be fundamentally different from not understanding the landscape and thinking solely that commodity sensors will just reduce our operational costs. This is also why maps tend to become highly sensitive within an organisation (which is why I often have to be vague).

When you think of DevOps don't just think about the changes in practice in this one instance. There's a whole set of common economic patterns it is related to and those patterns are applicable to a wide variety of industries and practices. Understanding the causes and the patterns is incredibly useful when competing in other fields.

DevOps isn't the first time that a change of practice has occurred and it won't be the last. These changes can be anticipated well in advance and exploited ruthlessly. That's the real lesson from DevOps and one that almost everyone misses.