Gary Wu

Self-Hosted to Serverless: Migrating a WebSocket Relay

A stateful WebSocket relay is the hardest category of server to migrate to serverless — it requires persistent connections, fan-out routing, and shared in-memory state. Durable Objects are the first primitive that solves this cleanly at the edge. This is the architecture, using clipboard sync as the concrete case study.


The Problem Class

Most self-hosted servers are easy to migrate to serverless. REST APIs, static assets, auth endpoints — these are stateless by construction. You lift them out, wrap them in a Worker, and they run.

WebSocket relays are different. A relay server must do three things that serverless platforms historically could not:

  1. Hold persistent connections — a device connects and stays connected, sometimes for hours
  2. Maintain shared state — the server must know which devices belong to the same user so it can route between them
  3. Fan out messages — when device A sends a clipboard update, every other device belonging to that user must receive it in real time

These three requirements are why self-hosted relay servers exist. They are also exactly what Durable Objects were built for.
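To make the three requirements concrete, here is a minimal sketch of the state a self-hosted relay keeps in its own process. `InMemoryRelay` and `Connection` are hypothetical names chosen for illustration; this is not ClipCascade's actual code, only the shape of the problem.

```typescript
// Minimal stand-in for a WebSocket connection; only send() is needed here.
interface Connection {
  send(message: string): void;
}

// Hypothetical in-memory relay: the state a long-lived server process holds.
class InMemoryRelay {
  // Requirement 2: shared state mapping each user to their live connections.
  private byUser = new Map<string, Set<Connection>>();

  // Requirement 1: a connection is registered once and stays until it leaves.
  join(userId: string, conn: Connection): void {
    let conns = this.byUser.get(userId);
    if (!conns) {
      conns = new Set<Connection>();
      this.byUser.set(userId, conns);
    }
    conns.add(conn);
  }

  leave(userId: string, conn: Connection): void {
    const conns = this.byUser.get(userId);
    if (!conns) return;
    conns.delete(conn);
    if (conns.size === 0) this.byUser.delete(userId);
  }

  // Requirement 3: fan out to every other device of the same user.
  // Returns the number of devices that received the message.
  publish(userId: string, sender: Connection, message: string): number {
    const conns = this.byUser.get(userId);
    if (!conns) return 0;
    let delivered = 0;
    conns.forEach((conn) => {
      if (conn !== sender) {
        conn.send(message);
        delivered++;
      }
    });
    return delivered;
  }
}
```

Everything in `byUser` lives only as long as the process does, which is exactly what a stateless function invocation cannot offer.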


ClipCascade as the Case Study

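ClipCascade is an open-source clipboard sync utility. It supports two modes: P2S (peer-to-server), in which the server relays clipboard updates between devices, and P2P (peer-to-peer), in which devices connect directly over WebRTC and use the server only for signaling.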

The server is a Spring Boot / Java 21 application, deployed via Docker. It exposes:

| Endpoint | Protocol | Role |
|---|---|---|
| /clipsocket | WebSocket | P2S clipboard relay |
| /p2psignaling | WebSocket | P2P WebRTC signaling |
| /login, /logout, /signup | HTTP | Auth |
| /captcha | HTTP | Bot protection |
| /health, /ping | HTTP | Monitoring |

The interesting problem is /clipsocket. Everything else is a standard REST API with a user database behind it. The relay is the architectural challenge.


What Has to Change

What the JVM Does That Workers Cannot

Spring Boot provides several things that require replacement:

Long-lived process state — Spring maintains in-memory structures (WebSocket session registries, user-to-connection maps) that persist across requests. Workers are stateless; each invocation is isolated. You need somewhere to put the connection registry.

Blocking I/O model: Spring's WebSocket handling runs on the servlet container's thread pool, and every open session stays resident in the JVM for its whole lifetime. Workers run on a non-blocking event loop, and Durable Objects can hibernate idle WebSocket connections. This is an improvement, not a limitation.

File-based user database — ClipCascade persists user credentials and config to a mounted /database directory. Workers have no filesystem. You need D1.

The JVM itself — Workers run V8. The entire server must be rewritten in TypeScript.

The Rewrite Surface

| Component | Current | Cloudflare Replacement | Difficulty |
|---|---|---|---|
| WebSocket relay | Spring WebSocket | Durable Objects | Hard |
| P2P signaling | Spring WebSocket | Durable Objects | Medium |
| User auth / sessions | Spring Security | Worker + D1 + KV | Medium |
| User database | File-based H2/embedded | D1 (SQLite) | Easy |
| Session tokens | In-memory / cookie | KV | Easy |
| Large payloads (images, files) | Buffered in memory | R2 | Medium |
| Static assets / web dashboard | Served by Spring | Workers static assets | Easy |
| Captcha | Custom implementation | Cloudflare Turnstile | Easy |

The Durable Object Architecture

Each user’s relay session is a Durable Object. The DO holds open WebSocket connections for all of that user’s devices and fans out messages between them.

Device A (laptop)  ──WebSocket──┐
Device B (phone)   ──WebSocket──┤──► UserRelayDO (per user) ──► fans out to all connected devices
Device C (desktop) ──WebSocket──┘

The UserRelayDO

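import { DurableObject } from 'cloudflare:workers';

export class UserRelayDO extends DurableObject {
  async fetch(request: Request): Promise<Response> {
    if (request.headers.get('Upgrade') !== 'websocket') {
      return new Response('Expected WebSocket upgrade', { status: 426 });
    }

    const { 0: client, 1: server } = new WebSocketPair();
    // acceptWebSocket() (rather than server.accept()) opts in to hibernation
    this.ctx.acceptWebSocket(server);
    // The hibernation API has no open handler, and in-memory fields are lost
    // on eviction, so per-device metadata is attached to the socket itself
    server.serializeAttachment({ deviceId: crypto.randomUUID() });
    return new Response(null, { status: 101, webSocket: client });
  }

  async webSocketMessage(ws: WebSocket, message: string | ArrayBuffer) {
    // Fan out to all other connected devices for this user
    for (const peer of this.ctx.getWebSockets()) {
      if (peer !== ws && peer.readyState === WebSocket.READY_STATE_OPEN) {
        peer.send(message);
      }
    }
  }

  async webSocketClose(ws: WebSocket, code: number, reason: string) {
    // Closed sockets drop out of getWebSockets(); just complete the handshake
    ws.close(1000, 'Relay closing');
  }
}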

The Worker entry point validates auth and routes to the correct DO:

export default {
  async fetch(request: Request, env: Env): Promise<Response> {
    const url = new URL(request.url);

    if (url.pathname === '/clipsocket') {
      const user = await validateSession(request, env);
      if (!user) return new Response('Unauthorized', { status: 401 });

      // Each user gets their own DO, named by user ID
      const id = env.USER_RELAY.idFromName(user.id);
      const relay = env.USER_RELAY.get(id);
      return relay.fetch(request);
    }

    // ... other routes

    // Anything unmatched must still produce a Response
    return new Response('Not found', { status: 404 });
  }
};

WebSocket Hibernation

Durable Objects support WebSocket hibernation: when no messages are flowing, the DO instance is evicted from memory but the WebSocket connections are preserved. When a message arrives, the DO is instantiated again to handle it. This is critical for economics — clipboard sync is bursty. Most of the time nothing is happening, and you should not pay for idle compute.

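Without hibernation, a DO with 10 connected devices stays resident in memory, and accrues duration charges, for as long as any connection is open, even when no clipboard events occur. With hibernation, you pay only for the brief periods when messages actually flow.

Enable it by using this.ctx.acceptWebSocket(server) instead of server.accept(). That single API difference is what activates hibernation. The tradeoff is that in-memory fields are reset whenever the object is evicted, so anything that must outlive an idle period belongs in a socket attachment or in Durable Object storage.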
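Because a hibernated DO is evicted from memory, per-device metadata has to travel with the socket rather than sit in a class field. The sketch below shows that idea against a small local interface that mirrors the `serializeAttachment` / `deserializeAttachment` methods on hibernatable WebSockets. `HibernatableSocket`, `tagSocket`, `deviceOf`, and `fanOut` are illustrative names, and in a real DO the socket list would come from `this.ctx.getWebSockets()`.

```typescript
// The subset of the hibernatable WebSocket surface this sketch relies on.
interface HibernatableSocket {
  send(message: string): void;
  serializeAttachment(value: unknown): void;
  deserializeAttachment(): unknown;
}

interface DeviceInfo {
  deviceId: string;
}

// Called once, right after the socket is accepted.
function tagSocket(ws: HibernatableSocket, deviceId: string): void {
  const info: DeviceInfo = { deviceId };
  ws.serializeAttachment(info);
}

// Reads the metadata back; works even if the object was evicted in between,
// because the attachment is stored with the connection, not in a class field.
function deviceOf(ws: HibernatableSocket): string | null {
  const info = ws.deserializeAttachment() as Partial<DeviceInfo> | null | undefined;
  return info && typeof info.deviceId === "string" ? info.deviceId : null;
}

// Sends to every socket except the sender and reports who received it.
function fanOut(
  sender: HibernatableSocket,
  sockets: HibernatableSocket[],
  message: string,
): string[] {
  const recipients: string[] = [];
  for (const peer of sockets) {
    if (peer === sender) continue;
    peer.send(message);
    recipients.push(deviceOf(peer) ?? "unknown");
  }
  return recipients;
}
```

The design choice is to treat the socket list as the source of truth and keep no parallel registry, so there is nothing to rebuild when the object wakes up.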

P2P Signaling DO

The P2P signaling endpoint (/p2psignaling) is simpler — it only needs to pass WebRTC offer/answer/ICE candidate messages between two specific devices attempting a direct connection. Same pattern, smaller scope:

export class P2PSignalingDO extends DurableObject {
  // Keyed by session ID shared out-of-band between two devices
  // Both devices connect, messages are relayed point-to-point
  // DO can be deleted once the WebRTC connection is established
}
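The comments in the stub above describe a two-party, point-to-point relay. A minimal sketch of that logic, independent of the Durable Object wrapper, might look like the following. `SignalingRoom` and `Peer` are hypothetical names, and the room treats offer, answer, and ICE candidate messages as opaque strings rather than parsing them.

```typescript
// Minimal stand-in for a connected device; only send() is needed here.
interface Peer {
  send(message: string): void;
}

// Hypothetical two-party room: one instance per shared session ID.
class SignalingRoom {
  private peers: Peer[] = [];

  // Admits at most two distinct peers. Returns false if the room is full
  // or the peer has already joined.
  join(peer: Peer): boolean {
    if (this.peers.length >= 2 || this.peers.indexOf(peer) !== -1) return false;
    this.peers.push(peer);
    return true;
  }

  // Forwards a signaling message to the other peer, unmodified.
  // Returns false if the sender is not in the room or is alone in it.
  relay(from: Peer, message: string): boolean {
    if (this.peers.indexOf(from) === -1) return false;
    const other = this.peers.find((p) => p !== from);
    if (!other) return false;
    other.send(message);
    return true;
  }

  leave(peer: Peer): void {
    this.peers = this.peers.filter((p) => p !== peer);
  }

  // Once both peers have left, the room (and its DO) has nothing to keep.
  get isEmpty(): boolean {
    return this.peers.length === 0;
  }
}
```

Inside a DO, `join` would be called when a socket is accepted, `relay` from the message handler, and `leave` from the close handler.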

Storage Layer

D1 for Users

CREATE TABLE users (
  id TEXT PRIMARY KEY,
  username TEXT UNIQUE NOT NULL,
  password_hash TEXT NOT NULL,
  encryption_salt TEXT NOT NULL,
  hash_iterations INTEGER NOT NULL DEFAULT 600000,
  created_at INTEGER NOT NULL,
  role TEXT NOT NULL DEFAULT 'user'
);

CREATE TABLE sessions (
  token TEXT PRIMARY KEY,
  user_id TEXT NOT NULL REFERENCES users(id),
  expires_at INTEGER NOT NULL
);

KV for Active Sessions

Session tokens go in KV with a TTL. D1 handles the durable user record; KV handles the hot path for request auth — every WebSocket upgrade and API call needs to validate a token, and KV reads are faster and cheaper than D1 queries for this use case.
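As a rough illustration of that hot path, the sketch below resolves a token to a user with a single KV read. It makes several assumptions that the source does not state: that clients send the token as an `Authorization: Bearer` header, that keys are stored under a `session:` prefix, and that the stored value is the user ID. `SessionStore` is a local interface covering only the `get` call, and `lookupSession` is a hypothetical helper name.

```typescript
// The one KV method this sketch needs: get(key) resolves to a string or null.
interface SessionStore {
  get(key: string): Promise<string | null>;
}

interface SessionUser {
  id: string;
}

// Pulls a bearer token out of an Authorization header value.
function bearerToken(authorization: string | null): string | null {
  if (!authorization) return null;
  const match = /^Bearer\s+(\S+)$/i.exec(authorization.trim());
  return match?.[1] ?? null;
}

// One read on the hot path. Expired sessions need no explicit check here
// because the entry is assumed to have been written with a TTL, after
// which the key simply stops resolving.
async function lookupSession(
  authorization: string | null,
  sessions: SessionStore,
): Promise<SessionUser | null> {
  const token = bearerToken(authorization);
  if (!token) return null;
  const userId = await sessions.get(`session:${token}`);
  return userId ? { id: userId } : null;
}
```

On login, the matching write would be a KV `put` with an `expirationTtl` option so that the key disappears on its own when the session ends.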

R2 for Large Payloads

The default P2S message limit is 1 MiB. For images and files above that limit, the client can upload to R2 directly (presigned URL) and send only a reference token through the relay. Devices download the payload from R2 using the token. This also avoids passing large binary blobs through the DO.
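One way to express the inline-versus-reference decision is a small tagged envelope. This is a sketch under assumptions: the `RelayEnvelope` shape, the field names, and `makeEnvelope` are invented for illustration and are not part of ClipCascade's wire format. The upload to R2 is assumed to have happened already, with `uploadedKey` being the object key it produced.

```typescript
const INLINE_LIMIT_BYTES = 1024 * 1024; // the 1 MiB limit mentioned above

// What actually travels through the relay.
type RelayEnvelope =
  | { kind: "inline"; data: string }
  | { kind: "ref"; objectKey: string; sizeBytes: number };

// Small payloads go through the relay as-is; large ones are replaced by a
// reference to an object the sender has already uploaded.
function makeEnvelope(
  data: string,
  sizeBytes: number,
  uploadedKey: string | null,
  limitBytes: number = INLINE_LIMIT_BYTES,
): RelayEnvelope {
  if (sizeBytes <= limitBytes) return { kind: "inline", data };
  if (uploadedKey === null) {
    throw new Error("payload exceeds the inline limit and must be uploaded first");
  }
  return { kind: "ref", objectKey: uploadedKey, sizeBytes };
}
```

A receiving device would switch on `kind`: use `data` directly for `inline`, or fetch `objectKey` from storage for `ref`.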


Tailscale + Cloudflare: Different Layers, No Conflict

The common assumption is that choosing Cloudflare means choosing public internet. It does not.

Tailscale operates at the network layer — it gives your devices a private mesh (WireGuard-based) with stable hostnames and no public exposure. Cloudflare Workers operate at the application layer — they are HTTPS endpoints on the public internet.

These layers compose without conflict:

Device (Tailscale node) ──[normal internet egress]──► Cloudflare edge ──► Worker ──► Durable Object

Tailscale devices have unrestricted outbound internet access. They connect to the CF Worker URL the same way they connect to any HTTPS endpoint.

If you want network-level access restriction — so only Tailscale users can reach the Worker — you have options:

| Approach | How | Tradeoff |
|---|---|---|
| App auth only | Worker validates session token | No network restriction, auth is the only gate |
| Cloudflare Access | Put Worker behind CF Zero Trust, require identity | Adds a login step, free up to 50 users |
| Exit node + IP allowlist | Route all Tailscale traffic through one exit node, allowlist that IP in Worker | Simple but funnels all traffic through one machine |

For personal or small-team use, app auth is sufficient. ClipCascade already requires a username and password. If the URL is not published, the attack surface is minimal.
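For the exit node option in the table above, the check inside the Worker can be very small. In a Worker the client address is available in the `CF-Connecting-IP` request header; the sketch below takes that value as a plain string so it can be read in isolation. `isAllowedClient` is a hypothetical helper, and the address used in the usage note is from a documentation-only range.

```typescript
// Returns true only when the caller's address is on the allowlist.
// clientIp would typically be request.headers.get("CF-Connecting-IP").
function isAllowedClient(
  clientIp: string | null,
  allowlist: ReadonlyArray<string>,
): boolean {
  if (clientIp === null) return false;
  return allowlist.indexOf(clientIp.trim()) !== -1;
}
```

With an allowlist of `["203.0.113.7"]` (standing in for the exit node's public address), any request from another address would be rejected before session validation runs.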

The combination that makes operational sense: CF handles the relay server globally with no infrastructure to maintain; Tailscale handles device-level access policy if you need a second layer.


The Business Reality

The infrastructure question is the easy one. The hard question is whether anyone will pay.

The Pain Is Real but Narrow

Cross-device clipboard sync is genuinely unsolved across ecosystems. Within a single ecosystem, the platform vendor already solves it for free.

The gap is mixed-device users: Windows laptop + iPhone, Mac + Android, Linux + anything. This is a real daily frustration. It is also a smaller market than it looks.

The Self-Hoster Paradox

The people who want this most are the people already running ClipCascade. They chose it precisely to avoid paying a cloud service and to keep clipboard data private. They have made an active decision not to trust a hosted service. They will not become paying customers. They are your loudest GitHub users and your worst conversion prospects.

This is not a ClipCascade-specific problem. It is the defining tension of any open-source self-hosted tool: the audience most engaged with your project is systematically self-selected against paying you.

Pushbullet Already Tried This

Pushbullet launched in 2013 with clipboard sync as a flagship feature. By 2016 it had 18 million registered users and had raised $4.8 million. It introduced paid tiers, struggled to convert, and today operates as a zombie product — alive but no longer actively developed. The team moved on.

Pushbullet failed not because clipboard sync is a bad idea but because:

  1. Apple and Google progressively closed the gap within their ecosystems
  2. The users who cared most about cross-ecosystem sync tended to be technical and resistant to paywalls
  3. It was hard to justify a recurring subscription for a utility that mostly runs invisibly in the background

Unit Economics Are Excellent

The infrastructure cost of a hosted clipboard relay on Cloudflare is nearly zero at personal scale. Clipboard messages are tiny (sub-1 KB for text), infrequent (most users copy something a handful of times per minute at peak), and the WebSocket connections spend most of their time hibernating.

For 1,000 active users:

| Cost item | Estimate |
|---|---|
| Worker requests | < $1/month |
| Durable Object compute (hibernation) | < $5/month |
| D1 reads/writes | < $1/month |
| KV operations | < $1/month |
| R2 (image/file payloads) | < $5/month |
| Total | < $15/month |

At $5/user/month, 1,000 users is $5,000 MRR against $15 in infrastructure. Margins are exceptional — if you can acquire and retain customers.
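The arithmetic behind that claim, worked through with the figures above. This is infrastructure cost only; it says nothing about acquisition, support, or payment processing, and `unitEconomics` is just an illustrative helper.

```typescript
// Infrastructure-only unit economics for a flat per-user monthly price.
function unitEconomics(users: number, pricePerUserUsd: number, infraCostUsd: number) {
  const revenueUsd = users * pricePerUserUsd;
  return {
    revenueUsd,
    grossProfitUsd: revenueUsd - infraCostUsd,
    grossMargin: (revenueUsd - infraCostUsd) / revenueUsd,
    infraCostPerUserUsd: infraCostUsd / users,
    // Paying users needed before revenue covers the infrastructure bill.
    breakEvenUsers: Math.ceil(infraCostUsd / pricePerUserUsd),
  };
}

// 1,000 users at $5/month against the $15/month ceiling from the table:
// revenue $5,000, gross profit $4,985, gross margin 99.7%,
// infrastructure cost $0.015 per user, break-even at 3 paying users.
const example = unitEconomics(1000, 5, 15);
```

The break-even figure is the telling one: the infrastructure bill is covered by the third paying customer, so every remaining risk sits on the acquisition side.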

Where It Could Work

Open-core hosted: Self-hosted is free (existing project). Hosted version targets users who want it but don’t want to run a server. The self-hosted version is marketing for the hosted version, not competition. This is the Plausible/Umami model.

B2B, not B2C: A company with 50 employees on mixed OS — Windows laptops, Macs, Android phones, iPhones — has a real clipboard workflow problem that IT will pay to solve. The compliance story is compelling: E2EE, no clipboard data leaving your infrastructure (CF account), audit logs via Workers Analytics. IT buyers make purchase decisions; individual engineers do not. Price at $5–10/seat/month, sell to the IT buyer, and the individual user’s willingness-to-pay is irrelevant.

Bundled feature: Clipboard sync is one feature in a broader “cross-platform productivity” or “power user toolkit” product. Standalone subscriptions for clipboard sync are a hard sell. Clipboard sync as one of ten features in a $12/month product is easier.

The Honest Assessment

| Question | Answer |
|---|---|
| Real pain point? | Yes, specifically for mixed-ecosystem users |
| Will consumers pay? | Unlikely at scale; Pushbullet proved this |
| Will businesses pay? | More likely, with the right framing |
| Unit economics? | Excellent; CF infrastructure costs are negligible |
| Biggest risk? | Distribution and conversion, not infrastructure |
| Recommended path? | Open-core or B2B; not consumer-direct SaaS |

Migration Effort

For a developer familiar with Cloudflare Workers:

| Work | Estimate |
|---|---|
| D1 schema + user auth (login, signup, sessions) | 1–2 days |
| REST endpoints (health, captcha, admin) | 0.5 days |
| UserRelayDO with WebSocket hibernation | 2–3 days |
| P2P signaling DO | 1 day |
| R2 integration for large payloads | 0.5 days |
| Cloudflare Turnstile for captcha | 0.5 days |
| Client compatibility testing | Variable; clients speak standard WebSocket and only need the new URL |

Total: 5.5–7.5 days for the server (call it 5–8), before client compatibility testing. The clients (Windows, Mac, Linux, Android) require no code changes; they connect via standard HTTP and WebSocket, with a new base URL in their config.


What This Pattern Generalizes To

ClipCascade’s server architecture is a direct instance of a general pattern: stateful pub/sub relay with per-entity isolation. The same Durable Object design applies to any system with that shape.

The migration pattern is identical in each case: identify the stateful boundary, map it to a DO, use hibernation for idle connections, fan out messages within the DO, route from the Worker using a stable name derived from the entity identifier.
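The routing step of that recipe can be sketched generically. The interfaces below mirror the `idFromName` and `get` calls used in the Worker entry point earlier; `EntityNamespace`, `entityName`, and `routeToEntity` are hypothetical names, and the `kind:id` naming scheme is an assumption (the ClipCascade example names the DO by user ID alone).

```typescript
// The two namespace calls the routing step depends on.
interface EntityStub<Req, Res> {
  fetch(request: Req): Res;
}

interface EntityNamespace<Id, Req, Res> {
  idFromName(name: string): Id;
  get(id: Id): EntityStub<Req, Res>;
}

// A stable name derived from the entity identifier. The same inputs always
// yield the same name, so every request for an entity lands on one object.
function entityName(kind: string, entityId: string): string {
  return `${kind}:${entityId}`;
}

// Identify the stateful boundary, map it to an object, forward the request.
function routeToEntity<Id, Req, Res>(
  ns: EntityNamespace<Id, Req, Res>,
  kind: string,
  entityId: string,
  request: Req,
): Res {
  const id = ns.idFromName(entityName(kind, entityId));
  return ns.get(id).fetch(request);
}
```

Prefixing the name with a kind is optional; it only matters when one namespace hosts more than one type of entity.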

The hard part is never the routing. It is recognizing that the DO is the right abstraction before you spend three months building a Redis-backed WebSocket cluster that delivers the same capability at ten times the operational cost.

