One lake, many tenants: the control plane

2026-06-18T22:36:12Z by Showboat 0.6.1

What makes chimpy-lake a platform, and not five cron jobs in a trench coat.

chimpy-lake pulls data from many sources — the Sierra ILS, OverDrive, curated reading lists, the catalog — refines it, and serves it. Each source is a tenant. The claim of this walkthrough is that they are not five hand-wired scripts that happen to share a disk: each tenant declares itself in one manifest, inherits a uniform lifecycle, and a single operator drives the whole fleet from one CLI.

If you’ve used a few familiar tools, you already know the shape of this:

Pinned legend — the tier ladder, how deep a tenant plugs in:

Tier    What it ships                      What it earns
Tier-2  a validated data contract          a tenant the lake trusts
Tier-3  contract + a [lifecycle] section   a tenant the control plane can operate

Every beat below runs the real chimpy-lake modules over a small frozen fixture with two tenants: circ-trans (Tier-3) and catalog (Tier-2). The commands are real and reproducible (showboat verify re-runs them); the live lake, ~/.config, and real systemd are never touched. We follow one tenant from declared → operable → installed.

[Figure: three stages. DECLARE: Tenant (chimpy-tenant.toml, T2 / T3). OPERATE: Control plane (chimpy-lake CLI: list-tenants, install · run-now). INSTALL: Host (systemd + Podman Quadlets, timers, idempotent reconcile).]
A tenant declares itself in a manifest; the control plane installs it onto the host and runs it — reconciling desired state, idempotently.

1 · A tenant declares itself — the tier ladder

What. A source joins the lake by shipping a chimpy-tenant.toml manifest, not by hand-wired glue. A validator enforces it. Tier-2 is the data contract (name, schema, schedule, telemetry). Tier-3 adds a [lifecycle] section — and that’s what lets the control plane operate the tenant, not just trust its data.

Why it matters. The manifest is a single, checkable definition of “what a valid tenant is,” enforced in CI: a bad manifest fails the build instead of crashing production at 3 a.m. The tier is also explicit: you can see, per tenant, how deeply it participates.

Term to know: data contract; tiered onboarding.

How we do it — a Tier-2 manifest passes the contract but is refused as Tier-3; a Tier-3 manifest carries the lifecycle section:

uv run python docs/demos/control-plane/_driver.py tiers

The tier ladder — a tenant declares itself in chimpy-tenant.toml
================================================================
A tenant is admitted by its MANIFEST, not by hand-wired glue. Two tiers:
  Tier-2  the data contract        — a tenant the lake TRUSTS
  Tier-3  contract + [lifecycle]   — a tenant the control plane can OPERATE

Tier-2  catalog     kind=extract schema=catalog
        contract satisfied; sources=['catalog']
        as Tier-3? REFUSED — lifecycle: required section missing

Tier-3  circ-trans  kind=extract schema=circ_trans
        entrypoint   ['python', 'app.py']
        subcommands  ['run', 'status', 'dry-run', 'migrate']

The contract is enforced in the conformance suite, never at runtime — a bad
manifest fails CI, it never crashes production. Tier is how deep a tenant plugs in.

2 · Every tenant speaks the same verbs — the Lifecycle SDK

What. A Tier-3 tenant wraps its own run() in LifecycleApp. In return, the SDK supplies a uniform verb set (run, status, dry-run, migrate), so the tenant only writes the one thing that’s actually tenant-specific.

Why it matters. The operator (and the control plane) never learns a per-tenant CLI: every Tier-3 tenant answers the same four verbs the same way. The verb set is closed — an unknown verb is refused, so the contract can’t quietly drift.

Term to know: uniform interface; plugin / service contract.

How we do it — a toy tenant supplies only run; the SDK provides the rest, and dry-run flips the tenant into a no-write mode:

uv run python docs/demos/control-plane/_driver.py lifecycle

The Lifecycle SDK — every Tier-3 tenant speaks the same four verbs
==================================================================
A tenant wraps its own run() in LifecycleApp; the SDK supplies the rest, so
the operator never learns a per-tenant CLI:

    from chimpy_lake.lifecycle import LifecycleApp
    LifecycleApp(run=run).main()   # status / dry-run / migrate come for free

verbs the SDK exposes for this tenant:
  run        the tenant's own logic (the one handler it supplied)
  status     SDK default: one-line health from the telemetry hub
  dry-run    SDK default: sets CHPL_DRY_RUN=1, then calls run
  migrate    SDK default: schema/data migration hook

    [tenant run] mode=live — would extract circ-trans for this window
$ app run       -> exit 0
    [tenant run] mode=dry-run — would extract circ-trans for this window
$ app dry-run   -> exit 0   (note mode flips to dry-run)
$ app migrate   -> exit 0   (no migrations registered)
$ app frobnicate-> exit 2   (unknown verb refused — the verb set is closed)
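
The closed verb set can be pictured with a toy dispatcher. This is an illustrative stand-in, not the code in chimpy_lake.lifecycle: the class name ToyLifecycleApp, the placeholder status and migrate bodies, and the choice to restore the environment after a dry run are all assumptions. Only the four verbs, the CHPL_DRY_RUN=1 convention, and exit code 2 for an unknown verb come from the output above.

```python
import os
import sys
from typing import Callable, Optional, Sequence

VERBS = ("run", "status", "dry-run", "migrate")  # the closed verb set


class ToyLifecycleApp:
    """Illustrative stand-in for LifecycleApp (hypothetical implementation)."""

    def __init__(self, run: Callable[[], int]) -> None:
        self._run = run

    def main(self, argv: Optional[Sequence[str]] = None) -> int:
        args = list(sys.argv[1:] if argv is None else argv)
        verb = args[0] if args else ""
        if verb not in VERBS:
            # Anything outside the closed set is refused.
            print(f"unknown verb: {verb!r}", file=sys.stderr)
            return 2
        if verb == "run":
            return self._run()
        if verb == "dry-run":
            # Set the flag, call the tenant's run, then restore the
            # environment (restoring is a choice made for this sketch).
            previous = os.environ.get("CHPL_DRY_RUN")
            os.environ["CHPL_DRY_RUN"] = "1"
            try:
                return self._run()
            finally:
                if previous is None:
                    os.environ.pop("CHPL_DRY_RUN", None)
                else:
                    os.environ["CHPL_DRY_RUN"] = previous
        if verb == "status":
            print("status: ok (placeholder; the SDK reads the telemetry hub)")
            return 0
        print("no migrations registered")  # verb == "migrate"
        return 0


seen_modes: list[str] = []


def run() -> int:
    """The one handler the tenant supplies."""
    mode = "dry-run" if os.environ.get("CHPL_DRY_RUN") == "1" else "live"
    seen_modes.append(mode)
    print(f"[tenant run] mode={mode}")
    return 0


app = ToyLifecycleApp(run=run)
RESULTS = {verb: app.main([verb]) for verb in ("run", "dry-run", "migrate", "frobnicate")}
print(RESULTS)
```

The tenant author writes only run(); everything else, including the refusal of unknown verbs, lives in one place and behaves the same for every tenant.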

3 · The operator’s fleet — the roster

What. An instance (a deployment of the lake on a host) declares the tenants deployed there in a small registry — one tenants/<name>.toml per tenant. chimpy-lake list-tenants reads that registry: the inventory the control plane acts on.

Why it matters. Adding a source to the fleet is a one-file change, not a code change — drop a registry entry and it joins every fleet-wide command. The roster is the seam between “what’s declared” and “what gets operated.”

Term to know: instance registry / inventory.

How we do it — the declared fleet for this instance, each tenant with its tier read straight from its manifest:

uv run python docs/demos/control-plane/_driver.py roster

The roster — `chimpy-lake list-tenants` reads the instance registry
===================================================================
the operator's declared fleet for this instance (org=chpl, scope=dev):

  tenant       kind     tier  schedule
  catalog      extract  T2    *-*-* 03:00:00 America/New_York
  circ-trans   extract  T3    *-*-* 08:35:00 America/New_York

The registry is the inventory the control plane acts on — add a tenant by
dropping one tenants/<name>.toml, and it joins every fleet-wide command below.

4 · Declare the desired state — install

What. chimpy-lake install <tenant> materialises a tenant’s Quadlet (container) + timer units onto the host by symlinking them into the systemd user dirs, then enabling the timers. --dry-run shows the plan without touching anything. It is idempotent (a second run is a no-op) and safe (a link a human re-pointed is flagged as a conflict and refused without --force).

Why it matters. This is the kubectl apply of the lake: you declare the desired state, the CLI reconciles the host to match — and because it’s idempotent, re-running is always safe; because conflicts are refused, it never silently clobbers a hand-edit.

Term to know: desired-state reconciliation; idempotent apply.

How we do it — a fresh host plans the links; a re-run is a no-op; a hand-edited link conflicts and is refused (the live ~/.config and systemd are never touched — the planner is driven against a scratch dir):

uv run python docs/demos/control-plane/_driver.py install

Install — `chimpy-lake install circ-trans --dry-run` plans, idempotently
========================================================================
FRESH  host has nothing linked yet:
  create   4 symlink(s): circ-trans-reconcile.container, circ-trans.container, circ-trans-reconcile.timer, circ-trans.timer
  enable   2 timer(s): circ-trans-reconcile.timer, circ-trans.timer
  conflicts 0

RE-RUN  same command again, host already in the desired state:
  create   0
  already-ok 4 symlink(s) — no-op (idempotent)

CONFLICT  a managed link was re-pointed by hand:
  conflicting 1: circ-trans.container
  apply -> REFUSED: won't overwrite 1 conflicting link without --force

Declare the desired state, the control plane reconciles it onto the host —
safe (conflicts refused), idempotent (re-runs converge). Nothing was applied here.
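
The plan-then-apply shape shown above can be sketched with plain symlinks. This is a minimal illustration under stated assumptions, not the chimpy-lake installer: Plan, plan_links, and apply are hypothetical names, the sketch assumes a POSIX host, it handles only two units, and it leaves out timer enabling entirely. Everything happens in a scratch directory.

```python
import os
import tempfile
from dataclasses import dataclass, field
from pathlib import Path


@dataclass
class Plan:
    create: list[str] = field(default_factory=list)
    already_ok: list[str] = field(default_factory=list)
    conflicts: list[str] = field(default_factory=list)


def plan_links(units: dict[str, Path], target_dir: Path) -> Plan:
    """Compare desired symlinks with what is on disk; modifies nothing."""
    plan = Plan()
    for name, source in units.items():
        link = target_dir / name
        if not link.is_symlink() and not link.exists():
            plan.create.append(name)        # nothing there yet
        elif link.is_symlink() and Path(os.readlink(link)) == source:
            plan.already_ok.append(name)    # already in the desired state
        else:
            plan.conflicts.append(name)     # something else occupies the name
    return plan


def apply(plan: Plan, units: dict[str, Path], target_dir: Path, force: bool = False) -> None:
    """Create missing links; refuse to replace conflicting ones unless forced."""
    if plan.conflicts and not force:
        raise RuntimeError(
            f"won't overwrite {len(plan.conflicts)} conflicting link(s) without force"
        )
    for name in plan.conflicts:
        (target_dir / name).unlink()
    for name in plan.create + plan.conflicts:
        (target_dir / name).symlink_to(units[name])


with tempfile.TemporaryDirectory() as scratch:
    root = Path(scratch)
    src, dst = root / "units", root / "systemd-user"
    src.mkdir()
    dst.mkdir()
    units = {}
    for name in ("circ-trans.container", "circ-trans.timer"):
        (src / name).write_text("")
        units[name] = src / name

    FRESH = plan_links(units, dst)   # everything needs creating
    apply(FRESH, units, dst)
    RERUN = plan_links(units, dst)   # nothing left to do

    # Simulate a human re-pointing a managed link.
    (dst / "circ-trans.container").unlink()
    (dst / "circ-trans.container").symlink_to(root / "elsewhere")
    CONFLICT = plan_links(units, dst)
    try:
        apply(CONFLICT, units, dst)
        REFUSED = False
    except RuntimeError:
        REFUSED = True

print(FRESH, RERUN, CONFLICT, REFUSED, sep="\n")
```

Keeping the planner free of side effects is what makes a dry run cheap: the same plan_links result can be printed, tested, or handed to apply.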

5 · Fire one on demand — run-now

What. Tenants run on a timer, but the operator sometimes needs one now — a backfill, a re-pull after a fix. chimpy-lake run-now <tenant> resolves the tenant’s service unit and starts it via the same code path a timer fire uses, surfacing the tenant’s true exit code. Multi-service tenants (here, a primary + a reconcile) resolve by suffix.

Why it matters. On-demand and scheduled runs share one mechanism — no parallel “manual run” logic to drift out of sync. The operator triggers by tenant name, not by memorising unit names.

Term to know: on-demand trigger; unit resolution.

How we do it — resolve which unit run-now would start, for both the primary and the reconcile service (resolution only — nothing is started here):

uv run python docs/demos/control-plane/_driver.py run-now

Run-now — fire one tenant on demand (atop its timer schedule)
=============================================================
`chimpy-lake run-now circ-trans` resolves which unit it would start:

  primary               -> circ-trans.service
  --service reconcile   -> circ-trans-reconcile.service

Resolution only — the real command then does `systemctl --user start <unit>
--wait` and surfaces the tenant's true exit via ExecMainStatus. Same code path
as a timer fire; here we only resolved the name, nothing was started.
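
Resolution by suffix can be sketched in a few lines. The function names below are hypothetical and the naming rule is inferred from the two examples above, so treat this as a guess at the shape rather than the actual resolver. The sketch only builds argument lists and never invokes systemctl; the second command is one plausible way to read ExecMainStatus, not necessarily the one chimpy-lake uses.

```python
from typing import Optional


def resolve_unit(tenant: str, service: Optional[str] = None) -> str:
    """Map a tenant (and optional secondary service) to a systemd unit name."""
    suffix = f"-{service}" if service else ""
    return f"{tenant}{suffix}.service"


def start_command(unit: str) -> list[str]:
    """Argv for a blocking start, matching the command quoted above."""
    return ["systemctl", "--user", "start", unit, "--wait"]


def exit_status_command(unit: str) -> list[str]:
    """Argv that would print the unit's ExecMainStatus property."""
    return ["systemctl", "--user", "show", unit, "--property=ExecMainStatus", "--value"]


primary = resolve_unit("circ-trans")
reconcile = resolve_unit("circ-trans", "reconcile")
print(primary, reconcile)
print(" ".join(start_command(primary)))
print(" ".join(exit_status_command(primary)))
```

Separating name resolution from execution is what lets a demo or test exercise the resolver without starting anything.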

Proof

Every command above is reproducible — showboat verify re-runs each one and diffs its output. And the on-thesis invariants (the tier ladder, the closed verb set, and the idempotent-and-safe install) ship green:

uv run pytest tests/demos/test_control_plane.py -q 2>&1 | sed -E 's/ in [0-9.]+s//'
....                                                                     [100%]
4 passed

Glossary

Where the control plane goes next — open questions, not commitments

Three directions the platform is shaped for but that are not built, posed as questions rather than claims:

Nothing exotic — and that’s the point. Manifests, a uniform interface, and declarative, idempotent apply are the ordinary platform playbook, implemented for a real library lake and running today. Everything above re-runs on demand.


Rendered from 226199c on 2026-06-18 · showboat verify: reproduces. A living artifact: the version ledger is git.