Security by Design: How the Assembly Line Protocol Keeps AI Agents Safe
Assembly Line Protocol, Part 2

The Assembly Line Protocol solves the hardest problem in production AI: giving agents real access without giving up control. Here's the architecture that makes it work.


There's a reason most organisations haven't let AI agents anywhere near production systems. It's not scepticism about the AI — it's a reasonable fear about what happens when a non-deterministic system has privileged access. An agent that can read your database can misread instructions. An agent that can push to production can push the wrong thing. The downside isn't a slightly wrong answer. It's an irreversible action.

The Assembly Line Protocol (ALP) is our answer to that problem. Not with access-control rules bolted on after the fact — but with an architecture where the security properties fall directly out of the structure.

The core insight: privilege and non-determinism are inversely distributed across the four layers.

The layer with the most access is the most deterministic. The layer that gets to improvise has the least access. This isn't a policy. It's the shape of the system.

The Four Layers

Server — The Task Queue

The Server is where work enters the system. It's a task queue: a list of things that need doing, with a title, a description, and some metadata. That's it.

The Server has zero sensitive access. It doesn't hold credentials, it can't execute anything, and it doesn't care how tasks get completed. You can implement it on top of whatever project management system you already use — Jira, Azure DevOps, GitHub Issues, or a custom service. ALP defines the protocol; you bring the backlog.

Because the Server has no privileged access, it's a safe input zone. Anyone can submit a task. The worst a malformed task can do is confuse the downstream stages — and those stages are designed to handle that.
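A task record at this layer needs surprisingly little. Here is an illustrative Python model of one; the field names are assumptions for the sketch, not the ALP spec's actual schema:

```python
from dataclasses import dataclass, field
from uuid import uuid4


@dataclass
class Task:
    """A minimal ALP-style task record: a title, a description, some metadata.

    Illustrative only -- the protocol just requires that the Server expose
    tasks the Runner can poll and transition. No credentials live here.
    """
    title: str
    description: str
    metadata: dict = field(default_factory=dict)
    id: str = field(default_factory=lambda: str(uuid4()))
    status: str = "queued"  # queued -> running -> done / failed


task = Task(
    title="Upgrade logging dependency",
    description="Bump the logging library to the latest minor version and run tests.",
    metadata={"repo": "example/service", "priority": "low"},
)
```

Whether this lives in Jira, GitHub Issues, or a custom service is invisible to the rest of the pipeline; the Runner only sees the record.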

Runner — Privileged but Deterministic

The Runner is where things actually happen. It has the access: credentials, network, the ability to provision sandboxes, the ability to reach your production systems if you configure it that way. In the reference implementation, that's pks-cli — open source, auditable.

But the Runner never improvises. There is no language model running inside the Runner. It polls the Server for tasks, evaluates transition rules, provisions sandboxes, injects credentials, routes outputs, and moves tasks between stations. Every step is deterministic code. You can read it, reason about it, and predict its behaviour.

This is the key property: the most privileged component in the system is also the most predictable. A human engineer reviewing the Runner's behaviour doesn't need to wonder what it might decide — it follows its rules, every time.
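To make "deterministic code" concrete, here is a hypothetical sketch of what such a control loop could look like. The interfaces (`server`, `sandbox_provider`, `transition_rules`) are illustrative stand-ins, not pks-cli's actual API:

```python
import time


def run_once(server, sandbox_provider, transition_rules):
    """One deterministic pass: poll the Server, evaluate rules, dispatch.

    Every branch is plain conditional logic -- no model call anywhere.
    All names here are assumptions for the sketch.
    """
    for task in server.poll_tasks(status="queued"):
        rule = transition_rules.get(task.kind)
        if rule is None:
            # Malformed or unknown tasks are rejected, never improvised around.
            server.mark(task.id, "rejected", reason="no transition rule")
            continue
        sandbox = sandbox_provider.provision(rule.sandbox_profile)
        sandbox.inject_credentials(rule.credentials)  # privilege stays Runner-side
        server.mark(task.id, "running")
        sandbox.start_operator(task)  # hand off to the Operator inside the sandbox


def run_forever(server, sandbox_provider, transition_rules, interval=5.0):
    while True:  # no LLM anywhere in this loop
        run_once(server, sandbox_provider, transition_rules)
        time.sleep(interval)
```

Given the same queue and the same rules, this loop does the same thing every time; that is what lets a reviewer audit the most privileged component line by line.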

Operator — The Protocol Enforcer

The Operator runs inside the sandbox the Runner provides. It's the layer between the Runner and the Agent, and its job is deterministic protocol enforcement.

The Operator communicates with the Runner over the wire protocol defined by the ALP spec: a typed contract specifying exactly which messages can flow in each direction (task parameters in, status updates and results out). The Operator validates that everything the Agent produces conforms to this contract before it leaves the sandbox.
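That validation step can be sketched in a few lines. The message kinds and field names below are assumptions for illustration; the real ALP contract will differ:

```python
# Allowed outbound message shapes -- an illustrative stand-in for the
# typed contract the ALP spec defines. Field names are assumptions.
OUTBOUND_SCHEMAS = {
    "status_update": {"task_id": str, "state": str},
    "result": {"task_id": str, "artifacts": list, "summary": str},
}


def validate_outbound(message: dict) -> dict:
    """Reject anything the Agent produces that doesn't match the contract."""
    kind = message.get("kind")
    schema = OUTBOUND_SCHEMAS.get(kind)
    if schema is None:
        raise ValueError(f"unknown outbound message kind: {kind!r}")
    for field_name, field_type in schema.items():
        if not isinstance(message.get(field_name), field_type):
            raise ValueError(f"{kind}: field {field_name!r} must be {field_type.__name__}")
    extra = set(message) - set(schema) - {"kind"}
    if extra:
        # Nothing undeclared crosses the sandbox boundary.
        raise ValueError(f"{kind}: unexpected fields {sorted(extra)}")
    return message
```

An Agent that emits a well-formed status update gets it through; anything else, from an unknown message kind to a smuggled extra field, is rejected inside the sandbox.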

Operators are pluggable and third-party implementable. The ALP Operator spec is a public interface, not an internal detail. vibecast is the reference Operator — it knows how to drive Claude Code. A different team could build an Operator that drives Codex, or GitHub Copilot CLI, or any other agent runtime. Each Operator declares a capability spec: which agents it supports and what it can instruct them to do.
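A capability spec might look something like the following sketch. The structure and values are assumptions, not the published Operator spec or vibecast's actual declaration:

```python
# Hypothetical capability declaration for an Operator. The real spec's
# schema may differ; these names are illustrative.
VIBECAST_CAPABILITIES = {
    "operator": "vibecast",
    "agents": ["claude-code"],               # which agent runtimes it can drive
    "instructions": ["run_task", "abort"],   # what it can tell those agents to do
}


def supports(capabilities: dict, agent: str, instruction: str) -> bool:
    """A Runner can check, before dispatch, whether an Operator can drive a given agent."""
    return agent in capabilities["agents"] and instruction in capabilities["instructions"]
```

Because the declaration is explicit, the decision to trust an Operator is a configuration choice you can review, not a runtime surprise.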

This means trust is explicit. You don't just trust "AI agents" in the abstract. You pick a specific Operator, you know what agents it drives, and you know what protocol surface it exposes. If you don't trust an Operator, you don't use it.

Agent — Non-Deterministic but Contained

The Agent — Claude Code, Codex, Copilot CLI, whatever you've wired up — is where the actual intelligence lives. It reads the task, reasons about it, writes code, runs tools, and produces output. It's non-deterministic by design: that's precisely what makes it useful.

But the Agent is the least privileged component in the chain. It operates inside the sandbox. It can only communicate through the Operator's protocol surface. It cannot directly reach the Runner, cannot modify task state, and cannot escalate its own permissions. Any output that doesn't conform to the Operator's expected protocol gets rejected before it leaves the sandbox.

The Agent gets to be creative within a precisely bounded space. That's not a limitation — it's what makes it safe to deploy.


The Inversion

The animation above shows it directly. Outside the sandbox sits the most privileged component in the system — with no language model anywhere near it. Inside the sandbox sits the LLM, with the lowest privilege of any layer.

The boundary between the Runner and the Operator is a security boundary, not just an architectural one. Inside that boundary, the system is sandboxed, protocol-constrained, and monitored. The only thing that crosses it is structured output the Runner explicitly accepts.


Why This Works

Traditional approaches to AI agent security try to add guardrails around an agent that already has access. You get a list of allowed tools, a prompt injection filter, a set of rules the agent is supposed to follow. These are all runtime policies enforced by the same system you're trying to constrain.

ALP takes the opposite approach. Security isn't enforced on the Agent — it's structurally impossible for the Agent to violate, because the architecture doesn't give it the means to do so. The Agent can't reach production credentials because it's in a sandbox. It can't push bad output to the Runner because the Operator validates the protocol. It can't escalate its own permissions because the Operator doesn't have any to escalate.

The threat model is honest: an AI agent will, at some point, do something unexpected. The question is whether your architecture limits the blast radius. ALP's answer is that the unexpected thing happens inside a sandbox, gets validated by a deterministic Operator before it exits, and is routed by a deterministic Runner that has explicit rules about what to do next.


Building on ALP

The protocol is designed to be as adoptable as MCP. The Server is intentionally left to you — implement it on top of whatever you already use for task management. The Runner (pks-cli) is open source: read it, fork it, audit it. The Operator spec is public: build your own if you have a different agent runtime or specific capability requirements.

The goal is a world where running AI agents in production doesn't require blind trust in a black box. You can inspect every layer, understand every boundary, and reason about what happens when something goes wrong.

That's security by design — not security by hope.


The Assembly Line Protocol is developed by agentics.dk. The reference Runner (pks-cli) and Operator (vibecast) are open source.
