Week : Operating Model: Mission Control for AI
Abstract:
Project management reframed as an operating model for high‑stakes socio‑technical systems: decision rights, escalation, simulation, and incident response. We connect organisational structure to system structure (Conway’s Law, the API mandate) and show how this becomes essential in the agent era.
Operating model, not pilots
The shift is from “ship an AI model” to “operate a socio-technical system”. The book’s repeated lesson is that high-stakes work is not heroic improvisation: it’s roles, handoffs, and practiced escalation. A mission-control mindset turns uncertainty into procedure: thresholds, drills, and feedback loops.
See Lawrence (2024) rockets p. 187-210. See Lawrence (2024) Apollo programme p. 184-7, 197-210. See Lawrence (2024) Mission Control Center p. 192, 195-6. See Lawrence (2024) test pilot p. 163-8, 189, 190, 192-3, 196, 197, 200, 211, 245. See Lawrence (2024) counterfactual simulation p. 215-18. See Lawrence (2024) simulations p. 215. See Lawrence (2024) Watt’s governor p. 122-5, 127, 131, 143, 144, 184, 198, 202-3, 206, 207, 221, 231, 234, 251, 254, 256-7, 263.
Organisation ↔︎ architecture
Institutional Character
Before we start, I’d like to highlight one idea that will be key for contextualisation of everything else. There is a strong interaction between the structure of an organisation and the structure of its software.
This is known as Conway’s law:
Organizations, who design systems, are constrained to produce designs which are copies of the communication structures of these organizations.
Melvin Conway Conway (n.d.)
The API Mandate
The API Mandate was a memo issued by Jeff Bezos in 2002. Internet folklore has the memo making five statements:
- All teams will henceforth expose their data and functionality through service interfaces.
- Teams must communicate with each other through these interfaces.
- There will be no other form of inter-process communication allowed: no direct linking, no direct reads of another team’s data store, no shared-memory model, no back-doors whatsoever. The only communication allowed is via service interface calls over the network.
- It doesn’t matter what technology they use.
- All service interfaces, without exception, must be designed from the ground up to be externalizable. That is to say, the team must plan and design to be able to expose the interface to developers in the outside world. No exceptions.
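To make the mandate concrete, here is a minimal Python sketch of two teams interacting only through a service interface. Everything in it is an assumption for illustration: the names `InventoryService`, `InventoryTeamService` and `can_fulfil` are hypothetical, and the call is made in-process for brevity, whereas the mandate as quoted requires calls over the network.

```python
from dataclasses import dataclass, field
from typing import Protocol


class InventoryService(Protocol):
    """Hypothetical service interface: the only route to inventory data."""

    def stock_level(self, sku: str) -> int: ...


@dataclass
class InventoryTeamService:
    """Owned by one team; the data store stays private to the owner."""

    _store: dict[str, int] = field(default_factory=dict)

    def stock_level(self, sku: str) -> int:
        # Unknown items are reported as out of stock.
        return self._store.get(sku, 0)

    def receive(self, sku: str, quantity: int) -> None:
        self._store[sku] = self._store.get(sku, 0) + quantity


def can_fulfil(inventory: InventoryService, sku: str, quantity: int) -> bool:
    """Ordering team's logic: it depends on the interface, never on the store."""
    return inventory.stock_level(sku) >= quantity


inventory = InventoryTeamService()
inventory.receive("book-001", 3)
print(can_fulfil(inventory, "book-001", 2))
```

The ordering team's function never touches `_store`, so the inventory team can change its storage technology without breaking its callers, which is the point of the fourth statement.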
The mandate marked a shift in the way Amazon viewed software, moving to a model that dominates the way software is built today, so-called “Software-as-a-Service”.
Any organization that designs a system (defined broadly) will produce a design whose structure is a copy of the organization’s communication structure.
Conway (n.d.)
The law is cited in the classic software engineering text, The Mythical Man Month (Brooks, n.d.).
As a result, and in awareness of Conway’s law, the implementation of this mandate also had a dramatic effect on Amazon’s organizational structure.
Because the design that occurs first is almost never the best possible, the prevailing system concept may need to change. Therefore, flexibility of organization is important to effective design.
Conway (n.d.)
Amazon is set up around the notion of the “two pizza team”: teams of 6-10 people that can, in theory, be fed by two (American) pizzas. This structure is tightly interconnected with the software. Each of these teams owns one of these “services”. Amazon is strict about the team that develops the service owning the service in production. This approach is the secret to their scale as a company, and the approach has been adopted by many other large tech companies. The software-as-a-service approach changed the information infrastructure of the company, that is, the routes through which information is shared. This had a knock-on effect on the corporate culture.
Amazon works through an approach I think of as “devolved autonomy”. The culture of the company is widely taught (e.g. Customer Obsession, Ownership, Frugality), a team’s inputs and outputs are strictly defined, but within those parameters, teams have a great deal of autonomy in how they operate. The information infrastructure was devolved, so the autonomy was devolved. The different parts of Amazon are then bound together through shared corporate culture.
Amazon prides itself on agility. I spent three years there and I can confirm things move very quickly. I used to joke that just as a dog year is seven normal years, an Amazon year is four normal years in any other company.
Not all institutions move quickly. My current role is at the University of Cambridge. There are similarities between the way a University operates and the way Amazon operates. In particular Universities exploit devolved autonomy to empower their research leads.
Naturally there are differences too. For example, universities do less work on developing culture. Corporate culture is a critical element in ensuring that despite the devolved autonomy of Amazon, there is a common vision.
Cambridge University is over 800 years old. Agility is not a word that is commonly used to describe its institutional framework. I don’t want to make a claim for whether an agile institution is better or worse; it’s circumstantial. Institutions have characters, like people. The institutional character of the University is that of a steady and reliable friend. The institutional character of Amazon is more mercurial.
Why do I emphasise this? Because when it comes to organisational data science, when it comes to a data-driven culture around our decision making, that culture interplays with the institutional character. If decision making is truly data-driven, then we should expect co-evolution between the information infrastructure and the institutional structures.
A common mistake I’ve seen is to transplant a data culture from one (ostensibly) successful institution to another. Such transplants commonly lead to organisational rejection. The institutional character of the new host will have cultural antibodies that operate against the transplant even if, at some (typically executive) level, the institution is aware of the vital need for integrating the data-driven culture.
A major part of my job at Amazon was dealing with these tensions. As a scientist, initially working across the company, working with my team introduced dependencies and practices that impinged on the devolved autonomy. I face a similar challenge at Cambridge. Our aim is to integrate data-driven methodologies with the classical sciences, humanities and across the academic spectrum. The devolved autonomy inherent in University research provides a similar set of challenges to those I faced at Amazon.
My role before Amazon was at the University of Sheffield. Those were quieter times in terms of societal interest in machine learning and data science, but the Royal Society was already convening a working group on Machine Learning. This was my first interaction with policy advice. I’ve continued that interaction today by working with the AI Council, convening the DELVE group to give pandemic advice, serving on the Advisory Council for the Centre for Science and Policy, and the Advisory Board for the Centre for Data Ethics and Innovation. I’m not an expert on the civil service and government, but I believe many of the themes I’ve outlined above also apply within government. The ideas I’ll talk about today build on the experiences I’ve had at Sheffield, Amazon, and Cambridge, alongside the policy work I’ve been involved in, to suggest what the barriers are to enabling a culture of data-driven policy making.
Mission control: roles, escalation, practice
The purpose of “runbooks” and “drills” is not bureaucracy. It is to make the organisation behave like a control system: detect, escalate, and correct before a local failure becomes systemic.
Escalation is the governance layer for uncertainty. If we cannot name the trigger and the accountable owner, escalation becomes performative — and agents will simply route around it.
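One way to test whether an escalation path is real rather than performative is to try writing it down as data. The sketch below is a minimal illustration under stated assumptions: the `EscalationRule` class, the metric names, the thresholds and the owner roles are all hypothetical, and a simple "value exceeds threshold" check stands in for whatever trigger an organisation actually agrees on.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EscalationRule:
    """A named trigger paired with an accountable owner (illustrative fields)."""

    metric: str       # what is being watched
    threshold: float  # the agreed trigger level
    owner: str        # the role accountable for responding

    def fires(self, value: float) -> bool:
        return value > self.threshold


def escalate(rules: list[EscalationRule], readings: dict[str, float]) -> list[str]:
    """Detect step: return one message per fired rule, naming trigger and owner."""
    messages = []
    for rule in rules:
        value = readings.get(rule.metric)
        if value is not None and rule.fires(value):
            messages.append(
                f"{rule.metric}={value} exceeds {rule.threshold}: "
                f"escalate to {rule.owner}"
            )
    return messages


rules = [
    EscalationRule(metric="error_rate", threshold=0.05, owner="on-call engineer"),
    EscalationRule(metric="refund_total", threshold=1000.0, owner="finance lead"),
]
alerts = escalate(rules, {"error_rate": 0.08, "refund_total": 250.0})
print(alerts)
```

If a rule cannot be filled in because nobody can state the threshold or the owner, that gap is the finding: the escalation exists on paper only.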
A simple governance rhythm you can reuse: WHY → WHAT → HOW → DO → DOCUMENT.
- WHY: principles you won’t trade off.
- WHAT: outcomes that must be true.
- HOW: operating design (roles, escalation, interfaces).
- DO: this week’s tasks.
- DOCUMENT: keep decision breadcrumbs so you can explain and unwind.
This is the difference between “pilot theatre” and an operating model that survives scale.
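The rhythm can be kept as a lightweight record rather than a document nobody reads. The following is a minimal sketch, assuming a hypothetical `DecisionRecord` with one field per step and a hypothetical append-only `DecisionLog`; the example entries are invented for illustration only.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class DecisionRecord:
    """One pass through the rhythm; all field names are illustrative."""

    why: str              # principle that is not traded off
    what: str             # outcome that must be true
    how: str              # operating design: roles, escalation, interfaces
    do: tuple[str, ...]   # this week's tasks
    decided_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )


class DecisionLog:
    """DOCUMENT step: an append-only trail of decision breadcrumbs."""

    def __init__(self) -> None:
        self._records: list[DecisionRecord] = []

    def document(self, record: DecisionRecord) -> None:
        self._records.append(record)

    def unwind_order(self) -> list[DecisionRecord]:
        """Most recent decision first: the order in which to unwind."""
        return list(reversed(self._records))


log = DecisionLog()
log.document(DecisionRecord(
    why="A human stays accountable for refunds",
    what="Every refund above the agreed limit is reviewed",
    how="Agent proposes, finance lead approves",
    do=("Agree the limit", "Run one escalation drill"),
))
log.document(DecisionRecord(
    why="A human stays accountable for refunds",
    what="Review happens within one working day",
    how="On-call rota covers the finance lead",
    do=("Publish the rota",),
))
print([record.what for record in log.unwind_order()])
```

Recording decisions in order means they can be explained later and reversed newest first, which is what "explain and unwind" asks for.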
Thanks!
For more information on these subjects and more you might want to check the following resources.
- company: Trent AI
- book: The Atomic Human
- twitter: @lawrennd
- podcast: The Talking Machines
- newspaper: Guardian Profile Page
- blog: http://inverseprobability.com