unseriously serious


Towards an Agent-Agnostic Coordination System

We built the systems we use to coordinate productivity around process. That made sense when processes were carried out mostly by a single kind of agent: humans. Every agent arrived at outcomes in pretty much the same way, so process was a reliable enough proxy for capability. If you needed wood cut to specific dimensions, you could select an agent using a process-based signal. In this case, you might determine that someone with two functioning arms and some degree of static strength would be capable of delivering your desired outcome. That same signal was an adequate filter for evaluating all human agents in the ecosystem.

But in a multi-agent world, process can vary. While some tools mimic how humans do things, most operate under fundamentally different constraints.

A job description might require "proficiency in Excel." But the real need is "produce a financial model." The best-suited agent might be a human, a script, or an AI, depending on the context and constraints. Anchoring coordination logic in process obscures the outcome you’re actually trying to arrive at, and limits the variety of agents you might consider.

To build a more relevant coordination system, we must detangle:

Agent = a human or a tool; any instigator of change or translator (compiler) between agents; evaluated based on capability, reliability, and context fit to deliver outcomes. An example of an agent in the compiler role might be an air conditioning unit: it conditions the environment to keep humans cool so that they can more reliably and accurately carry out a process and arrive at the outcome. An example of an agent in the instigator-of-change role might be a human who cuts a piece of wood in half. [1]

[1] It might be useful to break this down further. Maybe distinguishing between synthesising information and changing the state of something in an irreversible way.

Skills (process) = how something is done; independent of context, e.g. sawing wood.

Unit of Output = a chunk of process that is bound to agent capability and environmental constraints; produces a non-substitutable deliverable critical to the integrity of the outcome; often marks handoff or validation point.

Capability (outcome) = the demonstrated ability to deliver an outcome; context dependent e.g. making a chair.

Context/Constraints (environment) = the environment, permissions, and social and technical conditions that shape / limit what an agent can do, i.e. the terrain agents need to navigate. For example, an environment that requires exposure to harmful chemicals may be unsuitable for a human agent. Or an input that contains unstructured data may be unsuitable for an agent that requires structured data. If that agent is critical to delivering the outcome, a compiler agent might need to be inserted into the process chain to structure the data so that the downstream agent can process it. Resource constraints also sit here, as does information about input / output type.

Reliability (agent) = individual agent’s ability to perform its function accurately and consistently e.g. a robotic arm has 99.8% accuracy in welding under optimal conditions (context / constraints).

Reliability (systemic / outcome) = agent reliability × agent fit; how consistently the entire multi-agent chain produces the outcome; depends on how well agents work together. For example, the actual reliability of the weld (as an outcome) drops if the robotic arm receives noisy input, even though the arm itself is consistent.
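
To make the relationship between agent reliability and systemic reliability concrete, here is a minimal sketch in Python. The names (Agent, Step, systemic_reliability) and the multiplicative reliability-times-fit model are illustrative assumptions of mine, not part of any existing taxonomy.

    from dataclasses import dataclass

    @dataclass
    class Agent:
        name: str
        kind: str           # "human" or "tool"
        role: str           # "instigator" (changes state) or "compiler" (translates between agents)
        reliability: float  # 0..1: accuracy under optimal conditions

    @dataclass
    class Step:
        process: str        # the skill being applied, e.g. "weld joint"
        agent: Agent
        fit: float          # 0..1: how well the agent suits this context and its inputs

    def systemic_reliability(chain: list[Step]) -> float:
        # Outcome reliability = product over the chain of (agent reliability x agent fit),
        # so one poorly fitted handoff drags the outcome down even if every agent is reliable.
        result = 1.0
        for step in chain:
            result *= step.agent.reliability * step.fit
        return result

    # The welding example: a 99.8%-reliable arm fed noisy input (lower fit).
    arm = Agent("robotic arm", "tool", "instigator", reliability=0.998)
    weld = Step("weld joint", arm, fit=0.90)
    print(systemic_reliability([weld]))  # ~0.898: the outcome is less reliable than the agent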

When we separate these, we can assign components in the process chain to the most suitable agent. For example, if you want to value a company, Excel might be the best-suited agent to execute the calculation, but couldn’t come up with the formula by itself. Therefore, the most suitable combination of agents to value a company might be:

A human to gather the data - finances might not be publicly available, so a human has to request them from another human, who has the permission to approve their release (other agents might be limited by this context);

A tool to clean the data (other agents might be more prone to error in this part of the process);

A human to create the formula (other agents might be context-limited, e.g. unable to account for implicit variables);

A tool like Excel to execute the calculation (where speed and precision exceed human capability).
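
Reusing the Agent, Step, and systemic_reliability definitions from the sketch above, this chain might be encoded like so; the reliability and fit numbers are invented purely to show how swapping agents changes the reliability of the outcome.

    human_requester = Agent("analyst", "human", "instigator", reliability=0.95)
    cleaning_tool   = Agent("cleaning script", "tool", "instigator", reliability=0.99)
    modeller        = Agent("analyst", "human", "instigator", reliability=0.93)
    excel           = Agent("Excel", "tool", "instigator", reliability=0.999)

    valuation_chain = [
        Step("gather financial data", human_requester, fit=0.90),  # permission-gated context
        Step("clean the data", cleaning_tool, fit=0.97),
        Step("create the formula", modeller, fit=0.95),            # implicit variables
        Step("execute the calculation", excel, fit=0.99),
    ]

    print(systemic_reliability(valuation_chain))
    # Swapping any agent changes fit, and therefore the reliability of the outcome --
    # exactly the comparison a process-anchored description of work cannot express.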

Our existing system for coordinating labour is based on an outdated understanding of what value humans are uniquely positioned to contribute. Because tools are emerging that are capable of outperforming us at things we once labelled as human-only, we are falsely being pitted head-to-head with these innovations.

The system we use to assign agents to tasks is malfunctioning because we’re filling it with agents it wasn’t designed to contain. As a result, it leads us to the false conclusion that humans and tools are interchangeable. That conclusion stems from a logic that misclassifies agents; it doesn’t reflect reality. Our model was once adequate for capturing value, but it’s no longer comprehensive enough to surface useful information in today’s more complex, multi-agent ecosystem.

Now that many processes can be performed by tools, we must reexamine our logic to:

[a] surface previously obscured human value,
[b] more accurately represent how agents contribute to outcomes, and
[c] represent how agents work together.

Tools will replace many humans in process chains. But this only feels threatening because our current systems never measured human value to begin with. They measured human capability in the context of an outdated process-based logic. [2]

[2] Secondary effects like degree inflation stem from this integration failure. When capability signals lag behind shifts in agent utility, systems overcompensate with increasingly inflated filters. Even the structures we inherited from the industrial age (assessment methods, instruction pedagogy, and institutional authority) were designed to evaluate process fluency, not to measure and enhance the value of humans in a multi-agent world.

[…]

Tools should help us, not constrain / dictate our behaviour.

[…]

TAXONOMIES.

Today, we have taxonomies, like O*NET (Occupational Information Network), that aim to standardise the language of skills, abilities, tasks, etcetera. It was developed by the U.S. Department of Labor on top of SOC (Standard Occupational Classification), a system used to classify jobs in the U.S. economy. O*NET is basically a detailed layer for SOC. [3]

[3] There are others, like ESCO (European Skills, Competences, Qualifications and Occupations) and IfATE (Institute for Apprenticeships and Technical Education, now part of Skills England!).

Occupational taxonomies answer the question: what does this person do? They centre the process, not the outcome. But this vertical structure isn’t very helpful in a world where agents accomplish things in different ways. To create more effective productivity flows, we need to shift our descriptions of work to answer the question: what is the intended outcome?

The current structure treats tools as supporting instruments to human processes, not as stand-alone agents. For example, O*NET states that a financial analyst is responsible for analysing financial information and requires the skill of data modelling and fluency with the tool Excel to accomplish this. But a structure that doesn’t consider Excel an agent doesn’t allow us to simulate variations of the process. Nor does it let us explore the scope and evolution of each agent’s role. For instance, it might be useful to consider what changes if a different agent deploys Excel, whether another agent is better suited to deliver the outcome, or how agent suitability shifts as Excel itself evolves and gains new functions.

Deconstructing outcomes, processes, and agents would allow us to work backwards and identify the most suitable agent and process for achieving it. A horizontal structure that describes agent processes--human and tool--using the same language would enable agent-agnostic coordination. It would also allow us to simulate different agent combinations and evaluate how they fit together. 

[…] 

As non-human agents that operate in non-obvious ways gain autonomy, we need to clarify how they operate to better anticipate their usefulness across contexts. Making their inner workings visible and inserting these variables into our coordination logic would create a more transparent, productive, and trustworthy system.

I propose the following: Outcome —> (Unit of Output —>) Process —> Agent —> Context (OPA-C).
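
One possible way to encode this as a data structure is sketched below in Python. The field names are assumptions of mine and aren’t drawn from O*NET, ESCO, or SOC; the optional Unit of Output layer sits between Outcome and Process.

    from dataclasses import dataclass, field

    @dataclass
    class Context:
        constraints: list[str]   # e.g. ["permission required", "unstructured input"]
        error_tolerance: float   # 0 = none, 1 = high
        trust_required: float    # 0 = none, 1 = high

    @dataclass
    class AgentRef:
        name: str
        kind: str                # "human" or "tool"

    @dataclass
    class ProcessStep:
        skill: str               # context-independent, e.g. "sawing wood"
        agent: AgentRef
        context: Context

    @dataclass
    class UnitOfOutput:
        deliverable: str         # non-substitutable; marks a handoff or validation point
        steps: list[ProcessStep]

    @dataclass
    class Outcome:
        description: str         # e.g. "value a company"
        units: list[UnitOfOutput] = field(default_factory=list)

    # A fragment of the valuation example expressed in OPA-C terms.
    valuation = Outcome("value a company", units=[
        UnitOfOutput("clean dataset", steps=[
            ProcessStep("gather financial data", AgentRef("analyst", "human"),
                        Context(["permission required"], error_tolerance=0.2, trust_required=0.8)),
        ]),
    ])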

[…]

Note 1: We should ignore (but not omit) job titles, i.e. SOC Codes, because they aren’t designed to align productivity. Titles like CEO or Manager are context dependent and mean vastly different things depending on the organisation. But we should keep the SOC-Code field in our analysis because that’s how most of the O*NET tables are linked. SOC-Codes can also be useful sanity checks post-analysis.

Note 2: Neither O*NET nor SOC is explicitly influenced by NAICS (the North American Industry Classification System). That’s a good thing, because NAICS is designed to classify businesses for economic statistics and tax reporting, not to describe work. Using a classification system built for that purpose as the underlying logic for describing Agent Functions is unhelpful because it answers the wrong question. [4]

[4] SOC is cross-walked with NAICS and O*NET, meaning you could link O*NET to NAICS if you wanted to. 

This also serves as a useful reminder to consider what a taxonomy was built to satisfy. What implicit assumptions are built into the structure? What logic / system was it designed to satisfy? 

Here are some taxonomies we could use to create this new taxonomy / coordination logic:

APQC (American Productivity and Quality Center) - Process Classification Framework

O*NET (Occupational Information Network) - already crosswalks with ESCO, SOC, and NAICS (via SOC)

SFIA (Skills Framework for the Information Age)

Automation / Autonomy Taxonomies as described in A Taxonomy for Human-Machine Collaboration by Monika Simmler and Ruth Frischknecht

UiPath

[…]

PUTTING IT TOGETHER.

Once we have separate values for outcomes, processes, and agents, and have identified input and output types (i.e. the information needed and produced), mapped context variables, and sanity-checked them, we can begin simulating real process chains. This will show us the different types of organisations in terms of the outcomes they produce. Then we can plug in the most suitable processes, while accounting for each company’s specific context. Patterns, such as processes that are often grouped together, agents that often collaborate, or shared input / output sources, should emerge. These are Agent Functions, which will replace job titles.
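
As a rough illustration of the pattern-finding step, the sketch below counts which processes co-occur in the same chain across organisations. The chains are invented; a real version would also group by agent, context, and input / output sources.

    from collections import Counter
    from itertools import combinations

    # Hypothetical chains: each organisation's outcome expressed as (process, agent) pairs.
    chains = {
        "org_a": [("gather data", "human"), ("clean data", "script"),
                  ("build model", "human"), ("run calculation", "Excel")],
        "org_b": [("gather data", "human"), ("clean data", "script"),
                  ("run calculation", "Excel")],
        "org_c": [("take order", "human"), ("prepare drink", "human")],
    }

    # Count how often two processes appear in the same chain.
    co_occurrence = Counter()
    for chain in chains.values():
        processes = sorted({process for process, _ in chain})
        co_occurrence.update(combinations(processes, 2))

    # Recurring clusters of processes (and the agents that usually perform them)
    # are candidate Agent Functions: the units that could replace job titles.
    for pair, count in co_occurrence.most_common(3):
        print(pair, count)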


CASE FOR COORDINATION 1.

During summertime on the King’s Road, you used to be able to spot the queue for Amarino before you could read the sign. But since they recently adopted a self-checkout system, I haven’t seen their queues spill outside the ice-cream shop. Technically, the new system has made the process more efficient. But something about the experience feels off. Part of the joy of ordering ice-cream is waiting in the queue (what feels like a reasonable guilt-tax for indulging in a sweet treat), chatting with friends about what flavour they’re thinking of getting, asking a human what’s popular, and maybe trying a flavour or two, before all this is pressure-tested when you arrive at the front of the queue and impulsively decide to get a cup that can contain all the flavours you’ve been hesitating between. That process is part of the experience, and I believe it (at least to some degree) shapes the outcome. When it’s replaced by a screen, the process is more efficient but the experience is less alive.

At McDonald’s, the same system feels like it makes more sense because the transaction is the point. McDonald’s = fast food = efficiency. The screen aligns with the purpose of the visit. But at Amarino, something clearly disappears when the agent is replaced - something not visible in our existing coordination logic.

This is the kind of scenario our existing system doesn’t model. In both cases, the same agent swap was made in the same process, but in one of them the human was contributing more than just the logistical element of taking an order.

Amarino = coordination breakdown (automation ignores context)

McDonald’s = coordination enhancement (automation reinforces reliability and efficiency; form / function / context fit)

CASE FOR COORDINATION 2.

I have a friend who orders her coffee on the Starbucks app and picks it up in-store. She has a very specific order (as do most Starbuckers) and trusts the app more than the barista to produce the correct outcome. Ordering through the app also means she avoids waiting in line--her drink is ready when she arrives. But would she trust a machine to make her drink exactly how she likes it? Sure, the outcome might be more consistent. But does the machine have the ability to outperform a human in this domain?

Here’s one way it works: we communicate via the app, and a human produces the outcome. This reduces the error rate at the communication stage. You know how to input your preferences into the app; the person making your drink knows how to read them. You don’t even need to speak the same language--communication is formalised, and the interface acts as a translation layer.

However, Starbucks' naming conventions are non-intuitive. Terms like “Venti” don’t map cleanly to everyday language. The app enforces these constraints, while a human can interpret ambiguous input like “small.” If you’re unfamiliar with the context, getting the outcome you want becomes harder. Then again, what non-Starbucks-goer would have the app anyway? This makes the issue largely irrelevant for existing customers, but not for newcomers.

And that’s the point: the naming friction may be critical to Starbucks’ success. Understanding the context makes people feel like insiders. In this light, the human cashier becomes a context translator--critical for converting new customers and helping them access the information they need to participate. They are not just executing process; they are onboarding people into a semi-closed system.

Humans have access to a broader context base than most purpose-built tools. An agent-agnostic classification system can help identify not just where tools perform reliably, but also where they fall short, highlighting the parts of the process where humans remain essential. Today’s systems can’t capture that. This is what coordination is about.


THINGS TO WATCH OUT FOR.

  • The taxonomies used mostly represent the U.S. and may not be suitable for worldwide application.
  • Context variables: Error Rate (+ tolerance), where 0 = no error and 1 = high likelihood of error; Trust (+ tolerance), where 0 = no trust and 1 = high downstream trust. We calculate Error Rate and Trust across the full chain because they are contextual. Information Structure (how predictable, clean, and structured the input / output is); Connectivity (how reliable the agent’s access is to systems and other agents); Cognitive Complexity (how much interpretation, nuance, or judgement is required, e.g. Level 1 = known inputs + known outputs, deterministic / rule-based, and Level 5 = relies on empathy, context, social nuance, etc.); Time Sensitivity (how quickly the process must be completed to avoid degradation, for the next agent and the overall value of the outcome). A sketch of these variables as a data structure appears after this list.
  • Who is the direct receiver of the value? A colleague? A customer? What will they do with the product of a unit of output? How do these affect flexibility around Error Rate, relative tolerance for complexity, or Trust? Maybe add precision / objectivity.
  • Is culture fit just soft-skills in context? Is it determined by trust, like the conditions / context a human agent might require to be productive? Tolerance for contextual ambiguity? Can non-human agents feel doubt i.e. do they act out if they sense an input / upstream agent is unreliable? What does reliability mean for non-human agents?
  • Until you have reliability (systemic / outcome) of 1, coordination (including automation and autonomy) is not zero sum. p.s. perfect reliability is practically impossible.
  • An agent can be both an instigator of change and a compiler (translator) between agents.
  • Time doesn’t always equal productivity, but it’s easy to measure, so it is often the default way to organise human agents. A more nuanced approach to optimising productivity might be useful. For example, an 8-hour work day might give a human agent more time to rest, making them more productive than they would be during a 10-hour day, even though their "on" hours are fewer; implicit vs explicit productivity.
  • The desired outcome of most companies is probably to increase profit, but it might be useful to consider why that is. Is it greed? A desire for financial stability? Or something more specific, like wanting to own a home? Perhaps error tolerance and trust would vary depending on their why (which can also be framed as constraints / considerations). Do they have a high risk appetite? An increased need for reliable outcomes? Organisations would probably benefit from considering why they actually do what they do, so that stakeholders could benefit in ways that are meaningful to them.
  • Extracting outcomes from tools might reveal outcomes that don’t appear in the human data. This is likely due to our physiological limitations. But we should verify that this is really the case. One way to do this could be by performing a sanity check using the Abilities table. If a tool has an ability that a human cannot mimic, then the tool is the only agent capable of producing the outcome. Though I do wonder to what extent existing solutions produce novel outcomes versus simply being used as instruments to optimise human processes.
  • In taxonomies, for example O*NET, tools are often embedded into processes. To distinguish between agents we need to detangle this.
  • We should consider industry (industry, as per NAICS, is not the correct term, but a way to categorise organisations would be useful). Might consider size. Stakes. Stakeholder type: B2B? B2C? Figure this out. Reason: there is a big difference between a financial transaction at a coffee shop and one at a pharmacy. The cashier at the coffee shop takes your order and hands you your drink. The financial transaction at the pharmacy needs to be verified. They don’t take your order (usually the doctor will do that on your behalf via a prescription), but reliability of the outcome is more important, since an incorrect prescription can be fatal. As a customer, you want to trust that however the transaction is carried out, it has a low error rate. Does the person going into the medicine room know what they’re doing? Or, if that process is being conducted by a non-human agent, has that agent been rigorously tested? Is it reliable? Safe? Trustworthy? More so than the human? Also, in a pharmacy, frequent customers are perhaps more elderly, maybe neurodivergent. Who are the regular customers? How does this affect reliability? Tolerance for ambiguity (indicating that a human agent might be more suitable)?
  • Adoption rate / friction. How easy is it to trust a new way of doing something? For instance, elderly people don’t like using the self-checkout counter. Nor do some people who believe these machines replace humans. Why is this? How can we illustrate how changing processes affect the reliability of an outcome in a way that is transparent and easy to understand?
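
Here is the set of context variables from the bullet above, expressed as a data structure. It’s a sketch with assumed scales; the field names follow the notes rather than any existing schema, and the pharmacy / coffee-shop values are invented.

    from dataclasses import dataclass

    @dataclass
    class ContextVariables:
        error_rate: float             # 0 = no error, 1 = high likelihood of error (full chain)
        error_tolerance: float        # how much error the receiver of the value can absorb
        trust: float                  # 0 = no trust, 1 = high downstream trust (full chain)
        trust_tolerance: float        # how much missing trust the context can absorb
        information_structure: float  # how predictable, clean, and structured input / output is
        connectivity: float           # how reliable the agent's access to systems and agents is
        cognitive_complexity: int     # 1 = deterministic / rule-based ... 5 = empathy, social nuance
        time_sensitivity: float       # how quickly the process must complete to avoid degradation

    # Invented example: a pharmacy transaction versus a coffee-shop transaction.
    pharmacy = ContextVariables(
        error_rate=0.01, error_tolerance=0.0, trust=0.95, trust_tolerance=0.1,
        information_structure=0.8, connectivity=0.9,
        cognitive_complexity=4, time_sensitivity=0.6,
    )
    coffee_shop = ContextVariables(
        error_rate=0.05, error_tolerance=0.6, trust=0.7, trust_tolerance=0.6,
        information_structure=0.9, connectivity=0.9,
        cognitive_complexity=2, time_sensitivity=0.8,
    )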