Sunday, May 14, 2023

Why the fuss about conversational programming - Part II

In my last post on conversational programming I asked the reader to "get yourself ready for a world of conversational programming" but I wasn't willing to call it for ChatGPT or the existing crop of LLMs. My advice was one of preparation and not nailing any colours to a mast. There is a reason.

I think GPT4 and systems like GitHub Copilot are admirable but probably in the wrong space. I think the breathless horde of consultants prognosticating the replacement of programmers with LLMs have fallen into a trap, not dissimilar to their claims in 2011 of cloud saving you money (it doesn't, you end up doing more stuff - see Jevons Paradox) or cloud needing fewer engineers (see above) or cloud just being for startups.

In this case, I think the problem is to do with the medium itself.

Don't get me wrong, we're on a path to conversational programming as highlighted by Nicholas Negroponte and his paper "Architecture by yourself". As the paper noted, design is the process of a conversation between two or more perspectives in the minds of one or more designers. In today's case, one of the designers is becoming the machine. That conversation is the heart of conversational programming. But what we're missing is the graphical part of that conversation. This is the bit that Yona Friedman explored in "Towards a Scientific Architecture".

To explain, I'm going to use a map of coherent city transport. It was created through a discussion between a group of people (from Turkey to Germany to UK to the US) all involved in the transport industry. I've provided the "map" and the code used to create the map in figure 1.

Figure 1 - Coherent City Transport.


The discussion around the map highlighted that in transport planning we often failed to consider one of the major transportation systems affecting the world - that of virtual transport. Every virtual conference, every Zoom chat involves a number of "virtual" miles travelled which has an impact on other transportation systems. Unfortunately, we often don't recognise this or even count the virtual miles. We've even built digital twins of cities that have ignored virtual as a transport system. It's a bit like ignoring roads.

If the idea of virtual as a transport system seems odd to you then consider the impact of different modes of transport on road congestion. Figure 2 should bring the message home.

Figure 2 - Congestion on roads by mode of transport.

This realisation of the importance of virtual as a transport system itself occurred through discussion as we explored the things on the map, their relationships with each other and the context they existed in. Now take a look at the code in figure 1. It would be difficult to make that realisation in a medium that is dominated by syntax, style and rules. In reality, the code was only ever the means to make the map. The conversation happened around the map.

Of course, both the text and the map are ways of "coding" the problem space of coherent city travel. One is simply code as text where syntax, styles and rules dominate. The other is simply code as a map where things, relationships and context dominate.

With this in mind, think about where we are with conversational programming today. We're mostly trapped in syntax, style and rules around code as text. We're not really having that conversation with the machine but instead giving it instructions - write this piece of code for me, check my code, improve my code - and being amazed at what it gives back.


A fabulous example of this was in The New Stack article on "Developers Put AI Bots to the Test of Writing Code". The conversation often veers towards completeness or accuracy of written code but Villegas hits the nail on the head with a verdict that "AI-generated code lacks context-awareness".

That's not a problem with AI but instead with the medium via which we are having the conversation. It doesn't matter if it's written or spoken, the medium is still the word, it is text. The power of conversational programming will only be truly unleashed if we can escape from the confines of text (where syntax, styles and rules dominate) and into a world of maps (where things, relationships and context matter).

Before you say "but we can create maps with words" - well, that's exactly what we used to do for navigation. Early forms of navigation were based upon epic sagas that people would learn. These were the written and spoken words for navigation. I probably sound no different from the Viking standing there pointing at squiggles on a parchment and a sun-stone to another group of bemused Vikings who had spent years learning epic sagas word for word. They probably thought that Viking was “off his rocker” and this map thing will never take off. But it did and it will again.

Words themselves are a poor medium to transmit quickly the key information needed in any significant journey. This is why we have used maps in military history as methods of communication and learning. The code for the map in figure 1 is a simplified and constrained language provided through Online Wardley Maps. However, the code for that map contains over 1,700 different words (some hidden as digits, others hidden as links). That is pages of text in which it is hard to see the relationships between things. When we're coding our maps, the only thing that matters is the syntax and completeness of the code to produce the map. The real discussion happens over the map.
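To make the "code as text" side of this concrete, here is a minimal Python sketch. It assumes a made-up mini-syntax (a `component` line with two coordinates, and `A->B` for a relationship) that is only loosely in the spirit of text-based mapping tools; it is not the grammar of any real tool, and the component names and coordinates are invented for illustration rather than taken from figure 1. Notice that everything the parser cares about is syntax and completeness. None of the discussion lives here.

```python
import re
from dataclasses import dataclass, field

# Hypothetical map text: names and coordinates are illustrative assumptions.
MAP_TEXT = """
component Commuter [0.95, 0.70]
component Road Transport [0.60, 0.80]
component Virtual Transport [0.60, 0.55]
Commuter->Road Transport
Commuter->Virtual Transport
"""

# Assumed mini-syntax: "component <name> [<y>, <x>]" and "<from>-><to>".
COMPONENT = re.compile(r"^component (.+?) \[([\d.]+), ([\d.]+)\]$")
LINK = re.compile(r"^(.+?)->(.+)$")


@dataclass
class ParsedMap:
    positions: dict = field(default_factory=dict)  # name -> (y, x) coordinates
    links: list = field(default_factory=list)      # (from_name, to_name) pairs


def parse_map(text):
    """Turn map text into things (positions) and relationships (links)."""
    parsed = ParsedMap()
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # blank lines carry no meaning
        if m := COMPONENT.match(line):
            parsed.positions[m.group(1)] = (float(m.group(2)), float(m.group(3)))
        elif m := LINK.match(line):
            parsed.links.append((m.group(1).strip(), m.group(2).strip()))
        else:
            # The text medium is unforgiving: one bad line and there is no map.
            raise ValueError(f"syntax error: {line!r}")
    return parsed


if __name__ == "__main__":
    parsed = parse_map(MAP_TEXT)
    for name, (y, x) in parsed.positions.items():
        print(f"{name}: y={y}, x={x}")
    for src, dst in parsed.links:
        print(f"{src} -> {dst}")
```

Even with three components, it is easier to spot what is missing by looking at a drawn map than by reading these five lines. Scale that up to 1,700 words and the point makes itself.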

That's the world we need to get to, code as maps. At which point we can have conversations over things, relationships and context (the real substance of a discussion) with the machines as a designer. That's when conversational programming will truly explode onto the scene. Today is just the foretaste of what is to happen.

Yes, I understand that LLMs might replace a lot of writing code itself but if you think that programming was just about writing code then you're missing the bigger picture. The world of writing code as text might diminish but the world of programming has yet to reach its golden age.

I strongly believe that the world of serverless which has already become about stitching components (or gluing things) together and thinking about the context will lend itself more naturally to the world of conversational programming. 
I suspect the techniques for conversational programming which were first hinted at by Aleksander Simovic and Slobodan Stojanovic in the AWS 2018 presentation on "Ask Jarvis to Create a Serverless App for Me" will come out of the rapidly developing open source space around multi-modal systems. 


Don't get me wrong, I admire the work of Bard, GPT4 and GitHub's Copilot but as engineers we are still stuck in a model of code as text. Our early entrants might have the foresight to declare "We have no moats" but to make things worse, they will already have inertia created by commercial interests.

So, keep exploring. Just remember, this battle for the future of AI through industrialisation of fundamental components is only at the beginning and we may be in the "Sun Cloud" moment and barking up the wrong tree. That's where I suspect we are and there is much more to come.

For interest, this is not a unique situation in business or IT. This is no different to the current situation with military organisations around the world slowly realising there are at least five landscapes of sovereignty which matter for the defence of the nation - territorial, economic, technological, political and cultural - and most of us only have maps for one of those landscapes.

The other four landscapes are still dominated by text - whether written or spoken. As we learned in the territorial landscape during the American Civil War, topographical intelligence and situational awareness are critical. For that, we need conversations around maps, not better text.

Same with code, same with conversational programming.

ADDITIONAL QUESTIONS

Q. Does that mean all code will be maps?
Of course not. It simply means that much of our programming is likely to head in that direction to enable conversations. There will always be a need for the novel and new, the need for optimisations and the deeper you go the more likely you are to meet code as text i.e. maps might be high level but as you dig into a concept you’ll increasingly find graphs and then text alone. This should not surprise you as a graph normally contains some elements of text and a map contains a graph (of the value chain) and associated text. Each level inherits from what it is built upon.

Q. Why not just graphs?
Graphs are a fine tool but remember we're talking about conversations with context which implies some form of landscape. The difference between a graph and a map is that in a map, space has meaning. Which is why they are good for representing landscapes. See figure 3.

Figure 3 - Graphs vs Maps. In a map, space has meaning.
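If you prefer to see that distinction in code, here is a small Python sketch. The component names and coordinates are made up for illustration, and I'm assuming each position is a pair of values on two meaningful axes (something like visibility and evolution). In a graph, only the connections count, so two drawings with the same edges are the same graph. In a map, moving a component changes what is being said.

```python
# The same things and relationships in both drawings (illustrative names).
EDGES = {
    ("Commuter", "Road Transport"),
    ("Commuter", "Virtual Transport"),
}

# Two layouts with hypothetical (visibility, evolution) coordinates.
# Only the position of "Virtual Transport" differs between them.
LAYOUT_A = {
    "Commuter": (0.95, 0.70),
    "Road Transport": (0.60, 0.80),
    "Virtual Transport": (0.60, 0.55),
}
LAYOUT_B = {
    "Commuter": (0.95, 0.70),
    "Road Transport": (0.60, 0.80),
    "Virtual Transport": (0.60, 0.20),
}


def same_graph(edges_a, edges_b):
    """A graph is defined by its connections; layout is irrelevant."""
    return set(edges_a) == set(edges_b)


def same_map(edges_a, layout_a, edges_b, layout_b):
    """A map also needs every component to sit in the same place."""
    return same_graph(edges_a, edges_b) and layout_a == layout_b


def more_evolved(layout, first, second):
    """A question only a map can answer: compare positions on one axis."""
    return layout[first][1] > layout[second][1]


if __name__ == "__main__":
    print("same graph:", same_graph(EDGES, EDGES))
    print("same map:", same_map(EDGES, LAYOUT_A, EDGES, LAYOUT_B))
    print("road more evolved than virtual (A):",
          more_evolved(LAYOUT_A, "Road Transport", "Virtual Transport"))
```

The `more_evolved` question has no answer in a plain graph because there is nothing to compare. That is the sense in which space has meaning.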



Q. Do you mean your form of maps?
All maps are imperfect representations of a space; they are also models and hence wrong. We use them because they are useful tools of communication but they represent a lossy trade-off between being able to discuss the context and fidelity of detail. It took our geographical brethren thousands of years to get that balance right, we've only been doing this for 18 years. We're more at the Babylonian Clay Tablet stage rather than Ordnance Survey maps. So, no ... I'm hoping we can create better maps than we have. I simply use my maps as an example - think odd looking Viking muttering about scribbles on parchments and sun-stones.

Q. If your maps are wrong and imperfect, why use them?
Our geographical brethren didn't magically create 1:10,000 scale representations of the landscapes with notable features overnight. These are a starting point for conversation.

Q. Where would you look for inspiration of this change?
Two places - the open source world and the gaming world, especially the large and still vibrant community around Skyrim SE where both seem to happen. Hence keep an eye on Reddit and Discord groups. There’s an awful lot of work going into bringing LLMs into Skyrim SE. It’ll be interesting to see what tools they develop.

Q. Is it possible to map this landscape in any meaningful way? It’s a complex adaptive system.
Just because something is a complex adaptive system doesn’t mean everything is unpredictable and that meaningful representations can’t be created. It used to be said that you couldn’t meaningfully graph out an economy until a group in Hungary went and did that using sales tax data (figure 4) and discovered numerous chokepoints in the economy. It’s often a question of finding the right perspective and right data.

Figure 4 - Graph of the Hungarian economy. Source: https://phys.org/news/2022-05-country-entire-economy-predictand-forthe.html

Q. We are moving from “describe the code” to “describe the app”?
That’s a nice idea but it’s slightly more than that. Describing the app still evokes concepts of commands and instructions i.e. build me this thing, I’d like buttons in cornflower blue. The future of conversational programming is more likely to start with the question of “describe your need”. In many cases, conversational programming may never need a single line of code written.