Specification

Runner

An ALP Runner is a long-running daemon that registers with an ALP Server, polls for Jobs, delegates execution to a Station Operator, and reports results back to the Server.

The Runner is not the Agent. The Runner is infrastructure. The AI capability lives in the Agent, which is managed by the Operator.

See diagrams/job-execution-swimlane.drawio for the full lifecycle across all four layers (Server → Runner → Operator → Agent), and diagrams/runner-operator-boundary.drawio for a component view of the sandbox boundary.

Responsibilities

Register with the Server and persist the Runner ID and token
Poll the Server for available Jobs (short-poll or SSE)
Claim a Job and spawn a Station Operator
Monitor the Operator until it completes
Report the Job outcome to the Server

Registration

Before polling, a Runner must register with the Server.

Required at registration:

name — human-readable identifier for this Runner instance
labels — capability declarations (e.g. ["linux", "claude-code", "x86_64"])

Stored after registration:

id — Runner ID
token — Bearer token for all subsequent authenticated requests
server, owner, project — the Server endpoint and scope

Registration is persistent — the Runner stores its credentials locally (e.g. ~/.pks-cli/agentics-runners.json) and reuses them across restarts.
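The registration data above can be sketched as two small types plus a lookup helper. This is only an illustration: the on-disk layout of the credentials file is not specified here, so the array-of-records shape, `findStoredRunner`, and `bearerHeader` are hypothetical names chosen for the example.

```typescript
// Fields sent at registration (from the "Required at registration" list).
interface RegistrationRequest {
  name: string;
  labels: string[];
}

// Fields persisted after registration (from the "Stored after registration" list).
interface StoredRunner {
  id: string;
  token: string;
  server: string;
  owner: string;
  project: string;
}

// Hypothetical helper: reuse saved credentials for the same Server and scope, if any.
function findStoredRunner(
  saved: StoredRunner[],
  server: string,
  owner: string,
  project: string,
): StoredRunner | undefined {
  return saved.find(
    (r) => r.server === server && r.owner === owner && r.project === project,
  );
}

// Builds the Authorization header used on authenticated requests.
function bearerHeader(runner: StoredRunner): Record<string, string> {
  return { Authorization: `Bearer ${runner.token}` };
}

// Example registration payload using the labels shown above.
const exampleRequest: RegistrationRequest = {
  name: "build-box-1", // hypothetical Runner name
  labels: ["linux", "claude-code", "x86_64"],
};
```

On startup, a Runner following this sketch would call `findStoredRunner` first and only register again when no matching record exists.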

Polling Loop

The Runner runs a continuous polling loop:

loop:
  response = POST /runners/jobs  (Authorization: Bearer {token})

  if response.status == 204:
    sleep(pollingInterval)   // default: 10 seconds
    continue

  if response.status == 200:
    job = response.body.jobs[0]
    outcome = executeJob(job)
    PATCH /runners/{id}  { jobResult: outcome.jobResult, exitCode: outcome.exitCode, error: outcome.error }

Default polling interval: 10 seconds. Configurable via --polling-interval flag in pks-cli.
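The decision made on each iteration of the loop can be sketched as a pure function, which keeps it easy to test without a Server. The `Job` shape and the `PollAction` type are hypothetical. Treating an empty `jobs` array like a 204 and surfacing any other status as an error are assumptions, since the loop above only defines behavior for 200 and 204.

```typescript
// Hypothetical minimal Job shape; the real payload carries the Agent Definition.
interface Job {
  id: string;
}

type PollAction =
  | { kind: "sleep"; seconds: number }
  | { kind: "execute"; job: Job }
  | { kind: "error"; status: number };

// Maps one poll response to the Runner's next step.
function nextPollAction(
  status: number,
  body: { jobs?: Job[] } | null,
  pollingIntervalSeconds: number = 10, // default from the spec
): PollAction {
  if (status === 204) {
    return { kind: "sleep", seconds: pollingIntervalSeconds };
  }
  if (status === 200) {
    const job = body?.jobs?.[0];
    // Assumption: a 200 with no jobs is handled like a 204.
    return job
      ? { kind: "execute", job }
      : { kind: "sleep", seconds: pollingIntervalSeconds };
  }
  // Assumption: anything else is reported to the caller, which may back off and retry.
  return { kind: "error", status };
}
```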

If the Server supports SSE, the Runner MAY subscribe to GET /runners/events and call the poll endpoint immediately on receiving a job_available event. This cuts dispatch latency from up to one polling interval (10 seconds by default) to near zero.

Operator Spawning

When a Job is received, the Runner spawns a Station Operator process with the full Agent Definition as context. The Operator is given:

The rendered prompt (from the Agent Definition)
Environment variables: ASSEMBLY_LINE_REPO_URL, ASSEMBLY_LINE_REPO_TOKEN, AGENTICS_JOB_ID, AGENTICS_TOKEN, AGENTICS_BASE_URL, AGENTICS_OWNER, AGENTICS_PROJECT_NAME
A working directory for this Job
Any devcontainerFiles from the Agent Definition

The Runner monitors the Operator process. When the Operator exits, the Runner reads the exit code and reports the outcome.
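As a rough sketch, the environment handed to the Operator could be assembled as below. The variable names come from the list above; the `JobContext` type and its field names are hypothetical, since this section does not define where each value comes from.

```typescript
// Hypothetical container for the values the Runner already holds for a Job.
interface JobContext {
  jobId: string;
  repoUrl: string;
  repoToken: string;
  token: string;
  baseUrl: string;
  owner: string;
  projectName: string;
}

// Builds the Operator's environment using the variable names listed above.
function buildOperatorEnv(ctx: JobContext): Record<string, string> {
  return {
    ASSEMBLY_LINE_REPO_URL: ctx.repoUrl,
    ASSEMBLY_LINE_REPO_TOKEN: ctx.repoToken,
    AGENTICS_JOB_ID: ctx.jobId,
    AGENTICS_TOKEN: ctx.token,
    AGENTICS_BASE_URL: ctx.baseUrl,
    AGENTICS_OWNER: ctx.owner,
    AGENTICS_PROJECT_NAME: ctx.projectName,
  };
}
```

The resulting record can be passed as the environment of whatever process-spawning API the Runner uses.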

Job Outcome Schema

When a Job completes, the Runner MUST report the following to the Server:

interface JobOutcome {
  jobResult: 'success' | 'failed';
  exitCode: number;
  error: string | null;
}

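The Runner SHOULD also send intermediate updates with jobResult: "in_progress" (heartbeats) for long-running Jobs to indicate that the Job is still alive. The "in_progress" value appears only in these intermediate updates; the final JobOutcome carries either "success" or "failed".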

How the exit code is determined: The Operator process exits with 0 for success and non-zero for failure. The Runner maps the Operator's exit code to jobResult. The details of how the Operator elicits a structured outcome from the Agent are the Operator's implementation concern — see 09-operator.md.
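The exit-code mapping described above can be sketched as follows. The function name and the fallback error message are assumptions made for the example; the spec only fixes the 0 versus non-zero rule.

```typescript
interface JobOutcome {
  jobResult: "success" | "failed";
  exitCode: number;
  error: string | null;
}

// Maps the Operator's exit code to the outcome reported to the Server.
function outcomeFromExitCode(
  exitCode: number,
  errorText: string | null = null, // e.g. captured stderr, if the Runner keeps it
): JobOutcome {
  if (exitCode === 0) {
    return { jobResult: "success", exitCode, error: null };
  }
  return {
    jobResult: "failed",
    exitCode,
    // Assumption: fall back to a generic message when no error text was captured.
    error: errorText ?? `Operator exited with code ${exitCode}`,
  };
}
```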

Label Matching and Capability Declaration

A Runner's labels are what the Server uses to route Jobs to it. Labels should describe:

| Category | Examples |
| --- | --- |
| OS | `linux`, `macos`, `windows` |
| Agent type | `claude-code`, `gpt4-cli` |
| Hardware | `gpu`, `x86_64`, `arm64` |
| Custom capability | `vibecheck`, `high-memory` |

The Station's labels array lists what is required. The Runner's registered labels must be a superset of the Station's requirements.
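The superset rule can be expressed in a few lines. This sketch assumes labels are compared as exact, case-sensitive strings, which the spec does not state explicitly.

```typescript
// Returns true when the Runner's labels cover every label the Station requires.
function runnerSatisfies(runnerLabels: string[], stationLabels: string[]): boolean {
  const available = new Set(runnerLabels);
  return stationLabels.every((label) => available.has(label));
}
```

For example, a Runner registered with `["linux", "claude-code", "x86_64"]` can serve a Station that requires `["linux", "claude-code"]`, but not one that requires `["linux", "gpu"]`.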

Security Responsibilities

See 11-security.md for the full security model and credential architecture.

The Runner is the privileged host process. Its security responsibilities go beyond infrastructure management:

1. Sandbox Creation and Isolation

The Runner creates the devcontainer in which the Operator and Agent run. The Runner MUST:

Run the container without the --privileged flag
Not mount /var/run/docker.sock unless the Station explicitly requires Docker-in-Docker capability
Not inject real service credentials into the container environment

The security boundary between host and sandbox is the Runner's responsibility to enforce.
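One way to enforce the first two rules is to check the container launch arguments before starting the sandbox. The sketch below is illustrative only: `sandboxViolations` and the `allowDockerSocket` flag are hypothetical names, and the third rule (no real credentials in the environment) cannot be verified by inspecting arguments alone.

```typescript
// Returns a list of rule violations for a proposed `docker run` argument list.
// An empty list means the arguments pass these two checks.
function sandboxViolations(dockerArgs: string[], allowDockerSocket: boolean): string[] {
  const violations: string[] = [];

  // Conservative check: any form of the flag is rejected, including "--privileged=false".
  const privileged = dockerArgs.some(
    (arg) => arg === "--privileged" || arg.startsWith("--privileged="),
  );
  if (privileged) {
    violations.push("--privileged is not allowed");
  }

  // The Docker socket may only be mounted when the Station explicitly requires it.
  const mountsSocket = dockerArgs.some((arg) => arg.includes("/var/run/docker.sock"));
  if (mountsSocket && !allowDockerSocket) {
    violations.push("/var/run/docker.sock requires explicit Docker-in-Docker opt-in");
  }

  return violations;
}
```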

2. Credential Broker

The Runner holds all privileged service credentials (API keys, tokens, PATs). These credentials MUST NOT be passed to the Agent as environment variables. Instead, the Runner exposes a Credential Server over a Unix socket:

Host path: /run/alp/cred.sock
Bind-mounted into the devcontainer
Agents call it to receive short-lived, scoped JIT tokens

The Runner validates every credential request against the Job's workload identity (Job ID + labels) before issuing a token.
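A minimal sketch of that validation step follows. The request shape, the notion of a `scope`, and the policy map (scope to required labels) are all assumptions for illustration; the actual credential protocol is described in 11-security.md.

```typescript
// Workload identity as described above: Job ID plus labels.
interface WorkloadIdentity {
  jobId: string;
  labels: string[];
}

// Hypothetical request received over the credential socket.
interface CredentialRequest {
  jobId: string;
  scope: string; // e.g. "github:read" (illustrative)
}

// Decides whether a token may be issued. Unknown scopes are denied by default.
function authorizeCredentialRequest(
  request: CredentialRequest,
  activeJob: WorkloadIdentity,
  policy: Map<string, string[]>, // hypothetical: scope -> labels the Job must carry
): boolean {
  if (request.jobId !== activeJob.jobId) {
    return false; // request does not belong to the Job running in this sandbox
  }
  const requiredLabels = policy.get(request.scope);
  if (requiredLabels === undefined) {
    return false;
  }
  return requiredLabels.every((label) => activeJob.labels.includes(label));
}
```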

3. Egress Proxy (DMZ)

The Runner SHOULD act as an HTTP/HTTPS proxy (host-gateway:3128) for all outbound container traffic. This allows the Runner to:

Enforce an allow-list of permitted external destinations
Swap JIT tokens for real credentials at the egress boundary
Log all outbound traffic with Job ID for audit
Hold requests for human approval before forwarding to sensitive destinations
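The allow-list check from the first item can be sketched as below. The `*.` wildcard syntax and case-insensitive host comparison are assumptions; the spec does not define the allow-list format.

```typescript
// Returns true if the destination host matches an allow-list entry.
// Entries are either exact hosts ("api.github.com") or, as an assumed
// convention, subdomain wildcards ("*.example.com").
function isEgressAllowed(host: string, allowList: string[]): boolean {
  const target = host.toLowerCase();
  return allowList.some((entry) => {
    const pattern = entry.toLowerCase();
    if (pattern.startsWith("*.")) {
      // "*.example.com" matches "a.example.com" but not "example.com" itself.
      return target.endsWith(pattern.slice(1));
    }
    return target === pattern;
  });
}
```

A proxy built on this would reject or hold any request whose host fails the check, and log the decision together with the Job ID.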

4. Workload Identity

The Runner injects ALP_JOB_ID and ALP_STATION_LABELS into the devcontainer. These are the Agent's workload identity — the credential server uses them to gate access. No static credentials are needed inside the container.

Plugin Acquisition

Stations can declare Claude Code plugins that must be installed before the Agent runs. The Runner is responsible for acquiring these plugins before the Operator starts, using its own network context where marketplace URLs (e.g. localhost:40145) are directly reachable.

Why the Runner acquires plugins (not the Operator)

The Operator runs inside a sandboxed devcontainer. From inside the container, localhost refers to the container itself — not the host where the marketplace is running. Rewriting URLs to host.docker.internal is fragile and environment-specific. The Runner, running on the host, can reach the marketplace directly.

Mechanism: Docker volume

For each Job that declares plugins, the Runner:

Creates a named Docker volume: alp-plugins-{jobId}

Clones each plugin into the volume via a short-lived helper container:

docker run --rm \
  --mount type=volume,source=alp-plugins-{jobId},target=/plugins \
  alpine/git clone --depth=1 {sourceUrl} /plugins/{pluginId}

Mounts the volume into the Operator devcontainer at /run/alp/plugins:

--mount type=volume,source=alp-plugins-{jobId},target=/run/alp/plugins

Sets VIBECAST_EXTRA_PLUGINS in the launch environment to the colon-separated container-side paths:

```
/run/alp/plugins/plugin-a:/run/alp/plugins/plugin-b
```

Removes the volume after the Job completes.

The Operator (vibecast) reads VIBECAST_EXTRA_PLUGINS and passes --plugin-dir to Claude for each path.

Plugin declaration schema

Plugins are declared in the Station trigger's agentDefinition:

interface PluginRef {
  id: string;       // Unique plugin identifier
  name: string;     // Display name
  sourceUrl: string; // Git-cloneable URL (e.g. https://marketplace/plugins/org/name.git)
}
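Putting the acquisition steps and the PluginRef schema together, the values the Runner needs can be derived with a few pure helpers. The helper names are hypothetical, and the sketch only builds argument lists and strings; it does not invoke Docker.

```typescript
interface PluginRef {
  id: string;
  name: string;
  sourceUrl: string;
}

// Container-side mount point for the plugin volume (from the steps above).
const PLUGIN_MOUNT = "/run/alp/plugins";

// Volume name for a Job: alp-plugins-{jobId}.
function pluginVolumeName(jobId: string): string {
  return `alp-plugins-${jobId}`;
}

// Arguments for the short-lived helper container that clones one plugin.
function pluginCloneArgs(jobId: string, plugin: PluginRef): string[] {
  return [
    "run", "--rm",
    "--mount", `type=volume,source=${pluginVolumeName(jobId)},target=/plugins`,
    "alpine/git", "clone", "--depth=1", plugin.sourceUrl, `/plugins/${plugin.id}`,
  ];
}

// Value for VIBECAST_EXTRA_PLUGINS: colon-separated container-side paths.
function extraPluginsEnv(plugins: PluginRef[]): string {
  return plugins.map((p) => `${PLUGIN_MOUNT}/${p.id}`).join(":");
}
```

Passing the clone arguments as an array, rather than as one shell string, avoids quoting problems if a source URL contains unusual characters.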

See diagrams/plugin-acquisition-swimlane.drawio for a visual walkthrough of the acquisition flow, and diagrams/runner-operator-boundary.drawio for a component view of everything that crosses the sandbox wall.

Concurrency

A single Runner instance handles one Job at a time (per pks-cli's current implementation). For concurrency, run multiple Runner instances — each registers with a unique name and the Server dispatches Jobs across them.

Implementing Your Own ALP Runner

A compliant ALP Runner MUST:

Register with POST /runners/register and persist the token
Poll POST /runners/jobs in a loop with Authorization: Bearer {token}
On HTTP 200: execute the Job (via an Operator) and report the result
On HTTP 204: wait for the polling interval, then retry
Report result via PATCH /runners/{id} with the Job Outcome schema

A compliant ALP Runner SHOULD:

Respect idleTimeoutMinutes and maxTimeoutMinutes from the Agent Definition
Send in_progress heartbeats for long-running Jobs
Handle SIGTERM gracefully (complete current Job, then stop)
Support SSE for reduced polling latency when the Server offers it
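For the timeout recommendation, a Runner might periodically evaluate a check like the one below while the Operator runs. This is a sketch under stated assumptions: it treats idleTimeoutMinutes as the maximum time without observed Operator activity and maxTimeoutMinutes as the maximum total run time, which this section does not define. The function and type names are hypothetical.

```typescript
interface TimeoutConfig {
  idleTimeoutMinutes: number;
  maxTimeoutMinutes: number;
}

type TimeoutVerdict = "continue" | "idle_timeout" | "max_timeout";

const MS_PER_MINUTE = 60000;

// Decides whether the Job should keep running, given timestamps in milliseconds.
function checkTimeouts(
  nowMs: number,
  startedAtMs: number,
  lastActivityMs: number,
  config: TimeoutConfig,
): TimeoutVerdict {
  // Total run time takes precedence over idleness.
  if (nowMs - startedAtMs >= config.maxTimeoutMinutes * MS_PER_MINUTE) {
    return "max_timeout";
  }
  if (nowMs - lastActivityMs >= config.idleTimeoutMinutes * MS_PER_MINUTE) {
    return "idle_timeout";
  }
  return "continue";
}
```

On any verdict other than "continue", the Runner would stop the Operator and report a failed outcome.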

Reference Implementation

pks-cli is an open-source C# CLI.

Install:

# TODO: add installation instructions when publicly released

Start a runner:

pks agentics runner start \
  --server https://agentics.dk \
  --owner pksorensen \
  --project my-project \
  --labels linux,claude-code \
  --polling-interval 10