Intellectual Debt in the Agent Era

Intellectual debt as an accountability gap

The Sorcerer’s Apprentice

Executive framing

Debt is not a metaphor — it’s an interest rate on change (every modification gets slower and riskier).
Intellectual debt creates control loss: nobody can explain end-to-end behaviour under stress.
With agents, that becomes operational risk: actions happen faster than review and across more surfaces.

Intellectual Debt

Technical Debt

Compare with technical debt.
Highlighted by Sculley et al. (2015).

Separation of Concerns

Intellectual Debt

Technical debt is the inability to maintain your complex software system.
Intellectual debt is the inability to explain your software system.

Adding Data

The Great AI Fallacy

Artificial vs Natural Systems

First rule of a natural system: don’t fail
Artificial systems tend to optimise performance under some criterion
The key difference between the two is that artificial systems are designed whereas natural systems are evolved.

Natural Systems are Evolved

Survival of the fittest

Herbert Spencer, 1864

Natural Systems are Evolved

Non-survival of the non-fit

Mistake we Make

Equate fitness with the objective function.
Assume static environment and known objective.

The Mythical Man-month

Technical Consequence

Classical systems design assumes decomposability.
Data-driven systems interfere with decomposability.

Bits and Atoms

The gap between the game and reality.
The need for extrapolation over interpolation.

Ride Allocation Prediction

Machine Learning Systems Design

Fragility of AI Systems

They are componentwise built from ML Capabilities.
Each capability is independently constructed and verified.
- Pedestrian detection
- Road line detection
Important for verification purposes.

Computer Science Paradigm Shift

Von Neumann Architecture:
- Code and data integrated in memory
Today (Harvard Architecture):
- Code and data separated for security

Computer Science Paradigm Shift

Machine learning:
- Software is data
Machine learning is a high level breach of the code/data separation.
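A minimal sketch of the point above, assuming a toy linear threshold rule (the names `classify`, `model_a` and `model_b` are hypothetical, chosen for illustration). The code is identical in both calls; only the data, here the weights, decides the behaviour.

```python
def classify(weights, bias, features):
    """Fixed code: a linear threshold rule whose behaviour is set by its parameters."""
    score = sum(w * x for w, x in zip(weights, features)) + bias
    return score > 0


features = [1.0, 2.0]

# Two "programs" that share every line of code and differ only in data.
model_a = {"weights": [1.0, 1.0], "bias": -2.0}
model_b = {"weights": [-1.0, -1.0], "bias": 2.0}

decision_a = classify(model_a["weights"], model_a["bias"], features)
decision_b = classify(model_b["weights"], model_b["bias"], features)
```

Reviewing `classify` alone says nothing about which decision is made, which is one way to read the claim that software has become data.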

Peppercorns

A new name for system failures which aren’t bugs.
Difference between finding a fly in your soup vs a peppercorn in your soup.

Experiment, Analyze, Design

A Vision

We don’t know what science we’ll want to do in five years’ time, but we won’t want slower experiments, we won’t want more expensive experiments and we won’t want a narrower selection of experiments.

What do we want?

Faster, cheaper and more diverse experiments.
Better ecosystems for experimentation.
Data oriented architectures.
Data maturity assessments.
Data readiness levels.

Data Oriented Architectures

View data as a first-class citizen.
Prioritise decentralisation.
Openness

Data Oriented Architectures

Historically we’ve been software first
- A necessary but not sufficient condition for data first
Move from
1. service oriented architectures
to
2. data oriented architectures

Data Oriented Principles

Apache Flink

Streams and transformations
A stream is a (potentially never-ending) flow of data records.
A transformation takes one or more streams as input and produces one or more transformed streams as output.
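The two definitions above can be sketched with plain Python generators. This is an illustration of the idea only, not Flink code; `price_stream` and `moving_average` are hypothetical names, and the prices are made up.

```python
import itertools


def price_stream(start=100.0, step=0.5):
    """A stream: a potentially never-ending flow of records (here, dicts)."""
    for i in itertools.count():
        yield {"tick": i, "price": start + step * i}


def moving_average(stream, window=3):
    """A transformation: consumes one stream and yields a transformed stream."""
    buffer = []
    for record in stream:
        buffer.append(record["price"])
        if len(buffer) > window:
            buffer.pop(0)
        yield {"tick": record["tick"], "avg_price": sum(buffer) / len(buffer)}


# Streams are lazy, so we take a finite prefix of the unbounded output.
first_four = list(itertools.islice(moving_average(price_stream()), 4))
```

Because the output of a transformation is itself a stream, transformations compose by nesting calls.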

Join

stream.join(otherStream)
    .where(<KeySelector>)
    .equalTo(<KeySelector>)
    .window(<WindowAssigner>)
    .apply(<JoinFunction>)
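To make the pattern above concrete, here is a plain Python sketch of what a keyed join over tumbling windows computes. It is not the Flink API: `tumbling_window_join`, the ride and location records, and the fixed window size are all assumptions made for illustration, and the inputs are finite lists rather than live streams.

```python
from collections import defaultdict


def tumbling_window_join(left, right, key_left, key_right, window_size, join_fn):
    """Join two finite lists of (timestamp, record) pairs.

    Records are paired when their keys are equal and their timestamps
    fall in the same tumbling window of length `window_size`.
    """
    buckets = defaultdict(list)
    for ts, rec in right:
        buckets[(ts // window_size, key_right(rec))].append(rec)

    joined = []
    for ts, rec in left:
        for other in buckets.get((ts // window_size, key_left(rec)), []):
            joined.append(join_fn(rec, other))
    return joined


rides = [(1, {"driver": "d1", "ride": "r1"}), (12, {"driver": "d1", "ride": "r2"})]
locations = [(3, {"driver": "d1", "zone": "north"}), (4, {"driver": "d2", "zone": "south"})]

result = tumbling_window_join(
    rides,
    locations,
    key_left=lambda r: r["driver"],
    key_right=lambda r: r["driver"],
    window_size=10,
    join_fn=lambda ride, loc: (ride["ride"], loc["zone"]),
)
```

Only ride `r1` is joined: it shares both a key and a window with a location record, while `r2` falls in the next window.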

Milan

A general-purpose stream algebra that encodes relationships between data streams (the Milan Intermediate Language or Milan IL)
A Scala library for building programs in that algebra.
A compiler that takes programs expressed in Milan IL and produces a Flink application that executes the program.

Meta Modelling

Trading System

High frequency share trading.
Stream of prices with millisecond updates.
Trades required on a millisecond timeline.

mlai.write_figure('hypothetical-prices.svg', directory='./data-science/')

Real Price

Future Price

Hypothetical Streams

Real stream — share prices
- derived hypothetical stream — share prices in future.
Hypothetical constrained by
- input constraints.
- decision functional
- computational requirements (latency)
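A minimal sketch of a hypothetical stream derived from a real one, under stated assumptions: the predictor is a naive linear extrapolation, the latency budget is an arbitrary one millisecond, and the name `hypothetical_future_prices` is invented for this example. Each output record keeps the real and hypothetical values apart and reports whether the computation met the budget.

```python
import time


def hypothetical_future_prices(real_prices, horizon=1, latency_budget_s=0.001):
    """Derive a hypothetical stream (predicted future price) from a real stream.

    The predictor is a naive linear extrapolation of the last two prices,
    chosen only because it is cheap enough to meet a tight latency budget.
    """
    previous = None
    for price in real_prices:
        started = time.perf_counter()
        if previous is None:
            prediction = price  # no trend available yet
        else:
            prediction = price + horizon * (price - previous)
        elapsed = time.perf_counter() - started
        yield {
            "real": price,
            "hypothetical": prediction,
            "within_latency_budget": elapsed <= latency_budget_s,
        }
        previous = price


records = list(hypothetical_future_prices([10.0, 11.0, 13.0]))
```

Labelling the derived values as hypothetical in the record itself is the design point: downstream consumers can tell an observation from a model output.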

Hypothetical Advantage

Modelling is now required.
But modelling is declared in the ecosystem.
If it’s manual, warnings can be used
- calibration, fairness, dataset shift
Opens door to Auto AI.
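One way to read "declared in the ecosystem" is that each model is registered along with a description of the data it was fitted on, so that the system can raise warnings on its behalf. The sketch below assumes a deliberately crude dataset-shift check (an input outside the training range); `DeclaredModel` and its fields are hypothetical.

```python
import warnings
from dataclasses import dataclass


@dataclass
class DeclaredModel:
    """A model registered with the ecosystem, together with what it was fitted on."""
    name: str
    train_min: float
    train_max: float

    def check_input(self, x):
        """Warn when an input lies outside the range seen in training."""
        in_range = self.train_min <= x <= self.train_max
        if not in_range:
            warnings.warn(f"{self.name}: input {x} outside training range", UserWarning)
        return in_range


model = DeclaredModel(name="price-predictor", train_min=90.0, train_max=110.0)

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    ok_inside = model.check_input(100.0)
    ok_outside = model.check_input(150.0)
```

Calibration and fairness checks would need richer declarations than a range, but they could hang off the same registration step.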

SafeBoda

With road accidents set to match HIV/AIDS as the highest cause of death in low/middle income countries by 2030, SafeBoda’s aim is to modernise informal transportation and ensure safe access to mobility.

Ride Sharing: Service Oriented

Ride Sharing: Data Oriented

Ride Sharing: Hypothetical

Information Dynamics

Potential for information feedback loops.
Hypothetical streams are instantiated.
The nature of the hypothesis (e.g. price prediction) can affect reality.
Leads to information dynamics, similar to dynamics of governors.
See e.g. Closed Loop Data Science at Glasgow.

Agent-era guardrails (minimum viable control)

Declare models/agents as first-class dependencies (ownership + versioning).
Instrument drift/novelty and route to escalation (“pause when unsure”).
Record an audit trail: inputs, tools used, outputs, and approvals.
Limit blast radius: scopes, sandboxes, rate limits, and kill switches.
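The four guardrails above can be sketched as a thin wrapper around an agent's tool calls. This is an illustrative sketch under stated assumptions, not a reference design: `GuardedAgent`, its fields, the novelty score and the thresholds are all hypothetical, and a real deployment would persist the audit log and compute novelty from an actual drift detector.

```python
import time
from dataclasses import dataclass, field


@dataclass
class GuardedAgent:
    """Wraps tool calls with a kill switch, scope, rate limit, escalation and audit trail."""
    owner: str                      # declared ownership
    version: str                    # declared version
    allowed_tools: set              # scope: tools this agent may call
    max_calls_per_minute: int = 5
    novelty_threshold: float = 0.8
    killed: bool = False            # kill switch
    audit_log: list = field(default_factory=list)
    _call_times: list = field(default_factory=list)

    def act(self, tool, tool_input, novelty, tool_fn, approved_by=None):
        now = time.monotonic()
        # Keep only calls made in the last 60 seconds for the rate limit.
        self._call_times = [t for t in self._call_times if now - t < 60]
        if self.killed:
            status, output = "blocked: kill switch", None
        elif tool not in self.allowed_tools:
            status, output = "blocked: out of scope", None
        elif len(self._call_times) >= self.max_calls_per_minute:
            status, output = "blocked: rate limit", None
        elif novelty > self.novelty_threshold and approved_by is None:
            status, output = "escalated: pause when unsure", None
        else:
            self._call_times.append(now)
            status, output = "executed", tool_fn(tool_input)
        # Every attempt is recorded, whether or not it ran.
        self.audit_log.append({
            "agent_version": self.version, "tool": tool, "input": tool_input,
            "output": output, "status": status, "approved_by": approved_by,
        })
        return status, output


agent = GuardedAgent(owner="payments-team", version="1.2.0", allowed_tools={"lookup"})
s1 = agent.act("lookup", "order-42", novelty=0.1, tool_fn=str.upper)
s2 = agent.act("refund", "order-42", novelty=0.1, tool_fn=str.upper)
s3 = agent.act("lookup", "order-43", novelty=0.95, tool_fn=str.upper)
agent.killed = True
s4 = agent.act("lookup", "order-44", novelty=0.1, tool_fn=str.upper)
```

In this sketch blocked and escalated attempts are logged alongside executed ones, so the audit trail shows what the agent tried to do as well as what it did.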

Thanks!

company: Trent AI
book: The Atomic Human
twitter: @lawrennd
The Atomic Human pages: sorcerer’s apprentice 371-374; intellectual debt 84-85, 349-50, 365, 376; automation 6, 24, 46-7, 77-8, 80-81, 83, 85-87, 363-6, 368-369; decomposition 58, 79; information assembly line 57-8, 79; accountability 352, 363; intelligent accountability 363-4; topography, information 34-9, 43-8, 57, 62, 104, 115-16, 127, 140, 192, 196, 199, 291, 334, 354-5; separation of concerns 84-85, 103, 109, 199, 284, 371; natural vs artificial systems 102-103.
newspaper: Guardian Profile Page
blog posts:

The Open Society and its AI

Natural and Artificial Intelligence

Decision Making and Diversity

Natural vs Artificial Intelligence

References

Brooks, F., n.d. The mythical man-month. Addison-Wesley.

Cabrera, C., Paleyes, A., Thodoroff, P., Lawrence, N.D., 2023. Real-world machine learning systems: A survey from a data-oriented architecture perspective.

Sculley, D., Holt, G., Golovin, D., Davydov, E., Phillips, T., Ebner, D., Chaudhary, V., Young, M., Crespo, J.-F., Dennison, D., 2015. Hidden technical debt in machine learning systems, in: Cortes, C., Lawrence, N.D., Lee, D.D., Sugiyama, M., Garnett, R. (Eds.), Advances in Neural Information Processing Systems 28. Curran Associates, Inc., pp. 2503–2511.