KiwiStack

№ A · 03 / Network

Two stacks,
two purposes.

Plain WireGuard carries the laptop ↔ customer-slice traffic. Headscale carries the operator admin paths and, on Fleet, the site-to-site mesh between the customer's office router and their datacenter slice. Both stacks ride WireGuard at the data-plane level; they're separate at the control plane and serve different callers.


№ N1 / Two purposes, two stacks

One for the laptop,
one for the mesh.

Treating WireGuard and Headscale as one thing leads to confusion: they share a transport but solve different problems. The laptop tunnel is a static, customer-owned data plane; the Headscale mesh is a dynamic, operator-owned coordination plane that also doubles as the Fleet-tier site-to-site fabric.

Laptop ↔ customer slice: the always-on data plane

№ 01

WireGuard

Each managed laptop dials a persistent WireGuard tunnel into the customer's WG endpoint. Nubus, Fleet, smallstep and the OpenDesk web tier are not on the public internet; the laptop reaches them only through this tunnel. WireGuard here is plain WireGuard (kernel module on Linux, native client on macOS/Windows), not orchestrated by Headscale.


  • Static peer config, rotated only on key compromise
  • Public UDP ingress on the customer's WG endpoint VPS
  • Routes only the customer's internal CIDR, no full-tunnel default route
  • Wakes from sleep, reconnects across network changes
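
The properties listed above can be sketched as a small profile generator. This is a minimal illustration assuming a wg-quick style profile; the function name `render_laptop_profile`, the addresses, the endpoint hostname and the 25-second keepalive are hypothetical and are not taken from KiwiStack's enrollment tooling.

```python
import ipaddress


def render_laptop_profile(private_key: str, laptop_addr: str,
                          endpoint_pubkey: str, endpoint: str,
                          customer_cidr: str, keepalive_s: int = 25) -> str:
    """Render a wg-quick style profile for one managed laptop (illustrative)."""
    net = ipaddress.ip_network(customer_cidr)
    # Split tunnel only: refuse anything that amounts to a default route.
    if net.prefixlen == 0:
        raise ValueError("default route not allowed; route the customer CIDR only")
    lines = [
        "[Interface]",
        f"PrivateKey = {private_key}",
        f"Address = {laptop_addr}",
        "",
        "[Peer]",
        f"PublicKey = {endpoint_pubkey}",
        # Public UDP ingress on the customer's WG endpoint.
        f"Endpoint = {endpoint}",
        # Only the customer's internal CIDR is routed through the tunnel.
        f"AllowedIPs = {net}",
        # Keepalive keeps NAT mappings open so the tunnel recovers after sleep.
        f"PersistentKeepalive = {keepalive_s}",
    ]
    return "\n".join(lines) + "\n"


profile = render_laptop_profile(
    private_key="<laptop-private-key>",      # placeholder, not a real key
    laptop_addr="10.20.0.15/32",             # hypothetical tunnel address
    endpoint_pubkey="<endpoint-public-key>",  # placeholder
    endpoint="wg.customer.example:51820",    # hypothetical endpoint
    customer_cidr="10.20.0.0/16",            # hypothetical internal CIDR
)
print(profile)
```

Rejecting a zero-length prefix at generation time is one way to make the "no full-tunnel default route" rule hard to violate by accident.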

Mesh tailnet: operator admin & Fleet site-to-site

№ 02

Headscale

Headscale is the open-source Tailscale coordination server. We run one per cluster. It coordinates two flows: KiwiStack operator access to every customer's namespaces (one tailnet per operator key) and, on Fleet, the customer's office router joining the same mesh as the datacenter. The data plane that Headscale orchestrates is itself WireGuard, but it is separate from the laptop WG tunnel above.


  • Self-hosted; no Tailscale SaaS dependency
  • ACL split: operator admin nodes vs customer site-to-site nodes
  • Pre-auth keys rotated per onboarding
  • MagicDNS off; we use the customer's existing internal DNS
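
The ACL split above can be illustrated with a Tailscale-style policy, which is the general shape Headscale accepts. This is a sketch under assumptions: the tag names (`tag:operator-admin`, `tag:site-router`, `tag:cluster`) are hypothetical, tag ownership declarations are omitted, and the `can_reach` helper is a simplified local check, not Headscale's evaluator.

```python
import json


def build_policy() -> dict:
    """Two node classes that can each reach the cluster but not each other."""
    return {
        "acls": [
            # Operator admin nodes reach cluster nodes.
            {"action": "accept", "src": ["tag:operator-admin"], "dst": ["tag:cluster:*"]},
            # Customer office routers reach cluster nodes (site-to-site).
            {"action": "accept", "src": ["tag:site-router"], "dst": ["tag:cluster:*"]},
            # Return path from the cluster to the office router.
            {"action": "accept", "src": ["tag:cluster"], "dst": ["tag:site-router:*"]},
        ]
    }


def can_reach(policy: dict, src_tag: str, dst_tag: str) -> bool:
    """Simplified check: is there an accept rule from src_tag to dst_tag?"""
    for rule in policy["acls"]:
        if rule["action"] != "accept" or src_tag not in rule["src"]:
            continue
        # Each dst is "<target>:<ports>"; strip the port part before comparing.
        if any(d.rsplit(":", 1)[0] == dst_tag for d in rule["dst"]):
            return True
    return False


policy = build_policy()
print(json.dumps(policy, indent=2))
print(can_reach(policy, "tag:operator-admin", "tag:cluster"))      # True
print(can_reach(policy, "tag:site-router", "tag:operator-admin"))  # False
```

Because the policy is default-deny, the absence of any rule between the office-router tag and the operator tag is what keeps the two node classes apart.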

№ N2 / Per-tier scope

Core rides public,
Mesh adds the tunnel,
Fleet adds the mesh.

Core has no overlay; tenants share a public ingress secured with TLS. Mesh adds the laptop tunnel plus operator admin via Headscale. Fleet extends the same Headscale tailnet to the customer's office router (so the office LAN is on the same fabric as the cluster) and adds Wi-Fi 802.1X via FreeRADIUS bound to the same smallstep PKI used for device certs.

| Tier | Overlay components | Laptop WG | Site-to-site | Wi-Fi |
|---|---|---|---|---|
| Core | · | · | · | · |
| Mesh | WireGuard endpoint + Headscale (operator) | Yes, managed laptops dial in | · | · |
| Fleet | Mesh + Headscale (site-to-site) + FreeRADIUS | Yes, managed laptops dial in | Yes, office router joins the tailnet | EAP-TLS via FreeRADIUS, certs from smallstep CA |

Fleet customers who don't want their office router on the mesh can opt out. The laptop WG path is unchanged. Site-to-site is offered, not imposed.
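
The per-tier scope, including the Fleet opt-out, can be written down as data. The sketch below is illustrative only: the names `TierScope`, `TIERS` and `scope_for` are hypothetical, and it assumes that opting out of site-to-site leaves Wi-Fi 802.1X untouched, which the text above does not state either way.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class TierScope:
    laptop_wg: bool            # managed laptops dial the WG endpoint
    operator_headscale: bool   # operator admin reach via Headscale
    site_to_site: bool         # office router joins the tailnet
    wifi_8021x: bool           # EAP-TLS via FreeRADIUS


TIERS = {
    "core": TierScope(False, False, False, False),
    "mesh": TierScope(True, True, False, False),
    "fleet": TierScope(True, True, True, True),
}


def scope_for(tier: str, site_to_site_opt_out: bool = False) -> TierScope:
    """Return the overlay scope for a tier, honoring the Fleet opt-out."""
    base = TIERS[tier.lower()]
    if site_to_site_opt_out and base.site_to_site:
        # Assumption: only the site-to-site mesh is dropped; the laptop WG
        # path is unchanged, as stated above.
        return replace(base, site_to_site=False)
    return base


print(scope_for("fleet"))
print(scope_for("fleet", site_to_site_opt_out=True))
```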


№ N3 / Topology, traced

Who reaches what,
and over which fabric.

Three callers, three paths into the customer slice. The laptop WG and the Headscale mesh do not share keys and do not share peers; a compromise of one does not bridge to the other.

                                    ┌───────────────────────────────────────┐
   Customer laptop    ──── WG ─────▶│  Customer slice (k3s for Mesh/Fleet)  │
   (managed, BYOL                   │                                       │
    or OEM-direct)                  │   • Nubus  • OpenDesk  • Fleet        │
                                    │   • smallstep CA   • Vaultwarden      │
   KiwiStack operator ─ Headscale ──│                                       │
   (admin reach)                    │   ingress-nginx (TLS, public)         │
                                    └───────────────────────────────────────┘
                                                  ▲
   Fleet office ─────── Headscale ────────────────┘
   router (LAN/printers/file server                  (site-to-site,
    join the same tailnet)                            Fleet only)

Public traffic terminates TLS at the cluster's ingress-nginx, which is the only path on the Core tier. Mesh and Fleet keep the suite reachable on the public ingress for browser-only sessions but route managed laptops through WG for the always-on tunnel.
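
The separation claim in this section (no shared keys, no shared peers) is the kind of invariant that can be checked mechanically. The following is a minimal sketch, assuming each fabric's inventory is available as a mapping from peer name to public key; the function name and the sample data are hypothetical.

```python
from __future__ import annotations


def shared_between(laptop_wg: dict[str, str],
                   tailnet: dict[str, str]) -> dict[str, set[str]]:
    """Report peer names and public keys that appear in both inventories.

    Both arguments map a peer name to its public key. Empty sets mean the
    two fabrics are disjoint.
    """
    return {
        "peers": set(laptop_wg) & set(tailnet),
        "keys": set(laptop_wg.values()) & set(tailnet.values()),
    }


# Hypothetical inventories for one customer.
laptop_wg = {"lt-042": "pubkey-A", "lt-043": "pubkey-B"}
tailnet = {"operator-1": "pubkey-C", "office-router": "pubkey-D"}

overlap = shared_between(laptop_wg, tailnet)
print(overlap)
```

A check like this could run in CI against the per-customer state so that a reused key is caught before it is deployed.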


№ N4 / Wi-Fi (Fleet)

The same cert
unlocks the Wi-Fi.

Only on Fleet, and only at customer sites we enroll. Mesh customers' offices and home Wi-Fi keep working as they did. The smallstep PKI trust anchor is shared with the device-cert flow, so there's no separate credential to manage.

01

FreeRADIUS

Hosted in the customer's slice. Configured by GitOps from the per-customer state repo. One RADIUS server per tenant, no shared FreeRADIUS across customers.

02

EAP-TLS

Mutual cert auth. The user account is bound to the device cert subject (CN=<host>, OU=<role>, O=<customer>), same shape as the smallstep device cert.

03

smallstep trust

FreeRADIUS trusts the customer's smallstep intermediate. No separate Wi-Fi CA. When a device cert is revoked at smallstep, Wi-Fi access drops at the next OCSP/CRL fetch.

04

AP integration

Any AP that speaks RADIUS: UniFi, MikroTik, Cisco Meraki, OpenWrt-based. We ship a default supplicant profile via Fleet for the laptop side.
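
The subject binding described under EAP-TLS can be sketched as a small check on the presented certificate subject. This is a simplified illustration: the function names are hypothetical, the comma split does not handle escaped characters in a distinguished name, and in a real deployment this matching would be done by FreeRADIUS or a proper X.509 parser.

```python
from __future__ import annotations


def parse_subject(subject: str) -> dict[str, str]:
    """Naively split 'CN=..., OU=..., O=...' into a dict (no escape handling)."""
    fields: dict[str, str] = {}
    for part in subject.split(","):
        key, _, value = part.strip().partition("=")
        fields[key.upper()] = value
    return fields


def wifi_identity(subject: str, expected_customer: str) -> tuple[str, str]:
    """Return (host, role) if the cert subject belongs to this tenant."""
    fields = parse_subject(subject)
    missing = {"CN", "OU", "O"} - fields.keys()
    if missing:
        raise ValueError(f"subject is missing {sorted(missing)}")
    # One RADIUS server per tenant: reject certs issued for another customer.
    if fields["O"] != expected_customer:
        raise PermissionError("certificate belongs to a different customer")
    return fields["CN"], fields["OU"]


# Hypothetical device cert subject in the CN=<host>, OU=<role>, O=<customer> shape.
print(wifi_identity("CN=lt-042, OU=staff, O=acme", expected_customer="acme"))
```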


№ N5 / When something goes wrong

Tunnels break,
independently.

The two-stack design means most network failures have a small, local blast radius. A laptop WG outage and a Headscale outage do not compound each other; they affect different callers and different paths.

№ 01

WG endpoint down

Blast radius

Customer-scoped: only that customer's laptops

What that means

Managed laptops can't reach the suite until the endpoint comes back. Local user data is unaffected (Nextcloud Sync keeps working offline; mail caches as usual). Operators can still reach the cluster via Headscale and restore the endpoint without customer involvement.

№ 02

Headscale down

Blast radius

Operator-only by default; Fleet site-to-site cut

What that means

Operator admin paths are interrupted. On Fleet, office ↔ datacenter mesh is cut for the duration. Customer end-users on managed laptops are unaffected, since they reach the suite via the laptop WG tunnel, which doesn't depend on Headscale.

№ 03

smallstep CA down

Blast radius

Cross-cutting: affects laptop certs, Wi-Fi, mTLS

What that means

Existing certs keep working until they expire (24h–90d depending on subject). Nothing new can enroll. Wi-Fi 802.1X stays up on Fleet as long as device certs are valid. Recovery: bring the smallstep workload back; certs re-issue on the next renewal cycle.
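
How much slack a smallstep outage leaves depends on how far each cert is from expiry. The sketch below works through two cases at the ends of the stated 24h–90d range; the function name, the dates and the assumption that no renewal happens while the CA is down are illustrative, and which subjects get which lifetime is not specified here.

```python
from datetime import datetime, timedelta, timezone


def survives_outage(not_after: datetime, outage_start: datetime,
                    outage_length: timedelta) -> bool:
    """True if the cert is still valid when the CA comes back.

    Assumes no renewal is possible while the CA is down.
    """
    return not_after >= outage_start + outage_length


outage_start = datetime(2025, 1, 10, 12, 0, tzinfo=timezone.utc)

# 24h cert issued 20 hours before the outage: 4 hours of validity left.
short_not_after = outage_start - timedelta(hours=20) + timedelta(hours=24)
# 90d cert issued 10 days before the outage: 80 days of validity left.
long_not_after = outage_start - timedelta(days=10) + timedelta(days=90)

for label, not_after in [("24h cert", short_not_after), ("90d cert", long_not_after)]:
    ok = survives_outage(not_after, outage_start, timedelta(hours=6))
    print(label, "survives a 6h outage:", ok)
# 24h cert survives a 6h outage: False
# 90d cert survives a 6h outage: True
```

Short-lived certs are the ones that set the real recovery deadline, so the CA's restore time matters most for whichever subjects sit at the 24h end.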


Where the WireGuard config comes from

The laptop WG profile is generated at enrollment and bound to the device cert. The full enrollment trace (YubiKey attestation, smallstep cert issuance, Fleet handshake, OS-side WG bring-up) lives on /architecture/devices.