# Push Worker + Sync Simplification

## Context

After the engagement Worker landed (collaborative live-solving over WebSockets, 2026-05), it became natural to use the same Cloudflare runtime for push notifications. Pings were unreliable for time-critical user-facing events, and the receiver-side heuristics that had grown around them (`SessionMonitor.presentBegins`, the 3-minute quiescence-window scheduler) were brittle. This pass moves the user-facing event notifications onto a second Cloudflare Worker that signs APNs JWTs and pushes alerts directly, while folding the per-cell state for check / reveal / resign into the Moves stream so it propagates collaboratively.

## What landed (2026-05-27)

## App Attest migration (2026-06)

The push worker no longer relies on a build-baked `PUSH_BEARER` for normal
clients. Updated apps enroll a per-install App Attest key through
`GET /attest/challenge` + `POST /attest/register`, then sign every
`/register`, `/register DELETE`, and `/publish` request with that key.

The worker stores only the attested public key and assertion counter keyed by
`deviceID` + `keyID`. This preserves the no-Crossmate-account model: CloudKit
still owns collaboration identity, and the worker only proves that a request
came from an attested Crossmate install. Existing games keep working because
their CloudKit `pushAddress` values are unchanged; updated clients simply
re-register the same addresses with App Attest request auth.

Deploy requirements:

- `APP_TEAM_ID`, `APP_BUNDLE_ID`, and `APP_ATTEST_ENVIRONMENT` are configured
  in `wrangler.push.toml`.
- `APP_ATTEST_ROOT_CERT_PEM` must be set as a Worker secret or dashboard
  variable.
- `ALLOW_LEGACY_PUSH_BEARER = "1"` may be set temporarily for rollback, but
  normal deployment should leave it unset.

### Worker

- New `Workers/push-worker.js` + `Workers/wrangler.push.toml`.
- Single Durable Object class `PushRegistry`, sharded by `idFromName("registry")` (one global instance — Crossmate's scale doesn't need per-author sharding).
- Three push endpoints, all App-Attest-auth gated:
  - `POST /register` — `{deviceID, token, environment, addresses}` upsert. Idempotent.
  - `DELETE /register` — `{deviceID, addresses}` — unregisters dropped address bindings.
  - `POST /publish` — `{kind, addressees, gameID, fromAuthorID, title, alertBody}` — fans out to every addressee's registered devices.
- APNs JWT (ES256) signed in-Worker via Web Crypto. JWT cached in DO instance memory and refreshed every ~40 min (APNs' rate-limit floor is ~20 min; the 1-hour ceiling is the upper bound).
- Per-token `environment` field picks sandbox vs production endpoint; iOS reports `"sandbox"` for Debug builds and `"production"` for TestFlight/App Store.
- Deployed at the push Worker's `*.workers.dev` URL with `preview_urls = false`.

### Secrets

Loaded via `wrangler secret put --config wrangler.push.toml`:

- `APNS_KEY` — the `.p8` (piped via `< AuthKey_*.p8` stdin redirect; not pasted, because the interactive prompt eats newlines).
- `APNS_KEY_ID`, `APNS_TEAM_ID` — both 10-char.
- `APP_ATTEST_ROOT_CERT_PEM` — public Apple App Attest root certificate used
  by the worker to verify the attestation chain.

APNs key is topic-restricted to the app's bundle ID, both environments enabled. Apple caps at 2 APNs keys per Team, so a single key serving both endpoints leaves headroom.

### iOS plumbing

- `Crossmate/Services/PushClient.swift` — `@MainActor` class. Reads
  `CrossmatePushBaseURL` from `Bundle.main` and signs worker requests via
  `PushRequestAuthenticator`.
- Idempotent: `updateAPNsToken(_:)` on every `didRegisterForRemoteNotificationsWithDeviceToken` callback, `updateAuthorID(_:)` on every `identity.refresh(...)` site. Worker side dedupes unchanged triples.
- Account switch: explicit `DELETE /register` for the previous `(authorID, deviceID)` before re-registering.
- Failures go to `syncMonitor.note(...)` for diagnostics; never user-surfaced.

### Push kinds (sender-formatted strings, no `loc-key` — Crossmate is English-only)

| Kind | Trigger | Body |
|------|---------|------|
| `win` | `sendCompletionPings(resigned: false)` after the local player completes a puzzle | "Alice solved the puzzle 'X'" |
| `resign` | `sendCompletionPings(resigned: true)` after the local player gives up | "Alice resigned the puzzle 'X'." |
| `play` | `PuzzleDisplayView.updateActiveNotificationPuzzleID` when scenePhase becomes `.active` | "Alice is solving the puzzle 'X'" |
| `pause` | Same handler when scenePhase leaves `.active`, plus the selection-clear block at puzzle close | "Alice added 12 letters and cleared 3 letters in the puzzle 'X'" |

Pause uses `LocalSessionTracker` (`Crossmate/Services/LocalSessionTracker.swift`) — a small `@MainActor` counter wired into `store.onLocalCellEdit` that resets on every `begin(gameID:)` and drains on `consume(gameID:)`. The "skip if both counts are zero" guard inside `publishSessionEndPush` handles dedup between scenePhase-background and puzzle-close naturally.

### Pings that stay

- `.friend` — friendship handshake.
- `.invite` — re-invite to a game (uses `addressee`; surfaces in the Invited section).
- `.hail` — engagement room bootstrap.
- `.join` — the *invite-accept* one (one-shot, when a collaborator first accepts a share). Distinct from the new session-start APN, which is the `play` kind on the push side specifically to avoid this naming collision.

`PingScope` and `PingScopePayload` are gone entirely (only `.check`/`.reveal` used them).

### Check/reveal/resign in Moves

- New `Crossmate/Models/CheckResult.swift` — `enum CheckResult: Sendable, Equatable, Codable { case right, wrong }`.
  Used as `CheckResult?`; `nil` is the canonical "unchecked" value (no `.none` case).
- `CellMark.pen` / `.pencil` now carry `checked: CheckResult?` instead of `checkedWrong: Bool`.
- `Game.checkCells` now stamps `.right` on correct entries (previously erased them to `.none`). Wrong entries continue to stamp `.wrong` — pen-vs-pencil style is preserved across the check.
- Wire format and Core Data keep two parallel `checkedRight` / `checkedWrong` booleans (translation lives in `GameMutator.encodeMark` / `GameStore.decodeMark`). `MovesCodec.Payload.Entry` gets a custom `init(from:)` that defaults `checkedRight` to false so records written by older clients still decode cleanly.
- Resign reveals propagate via the existing Moves reveal mechanism — peer grids fill in cooperatively. No new `Game.resignedBy` field; a resigned game and a reveal-all-puzzle game look identical at cold start (accepted trade-off — the moment-of-resign signal lives in the APN).

### SessionMonitor cleanup

Slimmed to what the on-open catch-up banner actually needs:

- Kept: `ingest(_ deltas:)` (bucket accumulation), `consumeOnOpen(gameID:)`, `cancel(gameID:authorID:)`, `bodyText(playerName:puzzleTitle:added:cleared:)`.
- Removed: `presentBegins`, `scheduleEnd`, the 3-minute quiescence timer, `Bucket.scheduledFor`, `SessionNotificationScheduling` protocol + `RecordingNotificationScheduler` test recorder, `bodyText(for: Session)`, the end-/begin-identifier helpers, the `notificationCenter` and `notificationAuthorization` constructor params.

The on-open banner still says "Alice added 12 letters; Bob added 5 letters" — that path is independent of the new APNs and remains useful as a catch-up when the user opens an unread puzzle.

## Design decisions

- **Sender-formatted strings, not `loc-key`/`loc-args`.** Crossmate has no `Localizable.strings` / `.lproj` yet.
  Setting up the localization scaffolding just to feed receiver-side formatting that would produce the same English text is more work than it's worth for a single-locale app. Switch to `loc-key` when a second language ships.
- **One Durable Object, not per-author sharding.** Single DO instance can comfortably hold thousands of token registrations; `/publish` resolves targets in one `state.storage.list({prefix: ...})` per addressee.
- **JWT caching is mandatory.** APNs rejects rapidly-rotated provider tokens with `TooManyProviderTokenUpdates`. Cache in DO instance memory (sticky region) — no storage RTTs.
- **`CellMark` as enum, parallel-Bool storage.** The conceptual model is the optional `CheckResult` enum; the on-disk encoding is two booleans. Translation at the boundary keeps the API clean without requiring a Core Data migration model.
- **No `Game.resignedBy` distinguisher.** Cold-state conflation of resign vs reveal-all-puzzle is acceptable; the moment-of-resign signal lives in the APN body.
- **No `lastAction` field on Moves; no Actions record.** Replay is a future feature that will design its own log format with the consuming UX in mind. Designing the granularity now without the consumer would be a guess.
- **Push registration is fire-and-forget.** Worker is idempotent, so the next launch's APNs callback retries naturally. Failures log to the diagnostics monitor but don't surface to the user.
- **`preview_urls = false`.** Each preview URL carries the same secret bindings; gratuitous attack surface for no benefit.

## Badge and sibling-device read horizons (2026-06)

Badge state is now split between durable-ish local truth and best-effort push
transport:

- The Notification Service Extension and the app share an App Group
  `BadgeState` ledger.
  It is a per-game horizon map, not a set: `unreadAt`
  advances when a badge-worthy APN is delivered, `seenAt` advances when the
  user opens/clears the game, and a game counts only when `unreadAt > seenAt`.
  This lets stale APNs lose to a newer open without needing CloudKit to arrive
  first.
- The app icon badge is computed as `BadgeState.unreadGameIDs()` unioned with
  Core Data's unread-other-moves query. Core Data unread games are also seeded
  back into the ledger as `unreadAt` horizons, each stamped with that game's
  newest unseen other-author move time (`latestOtherMoveAt`). The NSE can't
  reach Core Data, so without this seed a push landing on a suspended app would
  re-stamp the badge from the ledger alone and drop any game whose unread state
  arrived purely via CloudKit sync. The seed is safe precisely because the
  ledger is a horizon map and not a set: re-seeding a game the user has since
  opened is a no-op, because its newer `seenAt` still wins. (The old set-based
  ledger couldn't express that, which is why the write-back was dropped when
  the horizon map first landed — and why it is safe to reinstate now.)
- Ledger entries are maintained by lifecycle event, not reconciled by absence
  (a ledger-only game is ambiguous: it could be push-ahead-of-sync, which must
  count, or a deleted game, which must not). So `BadgeState.forget(gameID:)` is
  called on the two hard-removal hooks — local `onGameDeleted` and sync-driven
  `onGameRemoved` — and `BadgeState.reset()` on the diagnostics data wipe. This
  matters because a deleted game has nothing left to open, so a stale
  `unreadAt > seenAt` entry could never be cleared by a seen horizon and would
  badge forever.
- The ledger key is `badge.ledger.v2`.
  The bump from `v1` is a deliberate
  one-shot discard: the `v1` migration off the pre-horizon set stamped
  `unreadAt = now` for every legacy entry, fabricating fresh-unread phantoms
  that no read state could defeat (a future-dated `unreadAt` beats any past
  `seenAt`). `loadLedger` now drops the legacy stores without fabricating, and
  rebuilds from Core Data via the seed above.
- Each account has an account-scoped push address, published as the existing
  `Decision` record `decision-account-pushAddress` with `kind = account` and
  the address in the payload. This avoids adding CloudKit schema fields while
  giving all devices on the same iCloud account a shared silent-push route.
- When a device joins a shared game, it sends `accountJoined` to the account
  address. Sibling devices ignore self-sends, fetch/sync, mint their own
  per-game push address if needed, and register with the Worker so they can
  receive that game's future pushes.
- When a device clears/sees a game, it sends `accountSeen` with
  `senderDeviceID` and `readAt`. Sibling devices ignore self-sends, apply the
  supplied `readAt` to their local read cursor and badge ledger, withdraw any
  matching delivered notifications, and refresh the app badge. Inbound
  `accountSeen`, CloudKit `Player.readAt` updates and ping deletions are
  non-echoing so the account push path remains one-hop rather than a feedback
  loop.
- If the game is actively visible, the seen push uses the same future
  read-lease horizon as `Player.readAt`. That prevents an account-seen message
  from racing after an active-lease update and accidentally collapsing
  presence on another device.

The push server is still not durable truth for badges. It only transports
same-account hints quickly. If a silent push is dropped, the next app launch or
CloudKit fetch recomputes from local Core Data plus whatever the NSE delivered
on that device.
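
The horizon rule in this section (a game counts only when `unreadAt > seenAt`, and re-seeding an already-opened game is a no-op) can be illustrated with a minimal sketch. `HorizonLedger` and its method names are hypothetical stand-ins: the real `BadgeState` storage and signatures are not shown in this document, and only `forget(gameID:)` and `unreadGameIDs()` borrow names from it.

```swift
import Foundation

/// Hypothetical stand-in for the App Group ledger; the real `BadgeState`
/// storage and method signatures are not shown in this document.
struct HorizonLedger {
    struct Horizons {
        var unreadAt: Date?
        var seenAt: Date?
    }

    private(set) var entries: [String: Horizons] = [:]

    /// Badge-worthy APN delivered, or Core Data seed: advance `unreadAt`.
    /// Horizons only move forward, so replays and re-seeds are harmless.
    mutating func markUnread(gameID: String, at date: Date) {
        var horizons = entries[gameID] ?? Horizons()
        horizons.unreadAt = max(horizons.unreadAt ?? date, date)
        entries[gameID] = horizons
    }

    /// User opened or cleared the game (locally or via `accountSeen`).
    mutating func markSeen(gameID: String, at date: Date) {
        var horizons = entries[gameID] ?? Horizons()
        horizons.seenAt = max(horizons.seenAt ?? date, date)
        entries[gameID] = horizons
    }

    /// Hard removal: a deleted game has nothing left to open.
    mutating func forget(gameID: String) {
        entries[gameID] = nil
    }

    /// A game counts only when `unreadAt > seenAt`.
    func unreadGameIDs() -> Set<String> {
        let unread = entries.filter { entry in
            guard let unreadAt = entry.value.unreadAt else { return false }
            guard let seenAt = entry.value.seenAt else { return true }
            return unreadAt > seenAt
        }
        return Set(unread.keys)
    }
}

var ledger = HorizonLedger()
let delivered = Date(timeIntervalSince1970: 1_000)
ledger.markUnread(gameID: "game-1", at: delivered)                        // push arrives
ledger.markSeen(gameID: "game-1", at: delivered.addingTimeInterval(60))   // user opens the game
ledger.markUnread(gameID: "game-1", at: delivered.addingTimeInterval(30)) // stale re-seed
print(ledger.unreadGameIDs().isEmpty) // the newer `seenAt` still wins
```

A set-based ledger has no way to tell the stale re-seed from a fresh unread, which is the ambiguity the horizon map removes.
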

## Open items / gotchas

- **Schema deployment is not needed.** Moves records ship the cells as an opaque binary blob; the new `checkedRight` boolean rides inside that blob. No CloudKit dashboard field to add (unlike, say, `Player.readAt`). The `CellEntity.checkedRight` Core Data attribute auto-migrates locally on first launch (Boolean default `NO`).
- **No visual indicator UI yet.** The cell state is now propagating; rendering it on the grid is separate UX work. Plan was tri-state: subtle dot for `.right`, distinct marker for `.wrong`, cleared when the cell's letter changes.
- **The bearer in the binary was the only abuse gate before the App Attest migration.** That matched the stated comfort level (users aren't adversarial). It now matters only while `ALLOW_LEGACY_PUSH_BEARER` is set: if the binary is ever reverse-engineered the bearer leaks; rotate via `wrangler secret put` if that happens.
- **Background-time publish.** Backgrounding while in a puzzle fires `publishSessionEndPush` from a `Task` that runs as the app suspends. iOS gives ~5–10 seconds; in practice a small POST to a Cloudflare Worker completes well within that window. If we see drops in the field, switch to a `BGTaskScheduler` background task or a background `URLSession`.
- **`onPlayerEvent` removed from `PlayerSession`** — the 6 fire sites that used to enqueue check/reveal pings are gone. Anyone re-introducing peer-visible per-action notifications would need to wire a new path; the existing one no longer exists.
- **Mixed-version peers.** `MovesCodec.Payload.Entry.init(from:)` defaults `checkedRight` to false, so a TestFlight build that's behind on the schema can still decode records from the Simulator. Older builds drop the new flag when re-encoding (they just don't see it), so a check-correct stamped on the new build that round-trips through an older device's re-write would lose its `.right` state. Acceptable transient: the next check on the new device re-stamps.
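
The mixed-version decoding described above can be sketched roughly as follows. The `Entry` shape here is hypothetical (the real `MovesCodec.Payload.Entry` is not shown in this document), and JSON is used purely for readability, since the document only says the wire format is an opaque blob. The sketch shows `decodeIfPresent` supplying the `false` default, plus the boundary translation from the two stored booleans to `CheckResult?`.

```swift
import Foundation

/// Mirrors the enum described above, minus `Codable`, which this sketch does not need.
enum CheckResult: Sendable, Equatable { case right, wrong }

/// Hypothetical entry shape; the real `MovesCodec.Payload.Entry` is not shown in this document.
struct Entry: Codable {
    var letter: String
    var checkedRight: Bool
    var checkedWrong: Bool

    enum CodingKeys: String, CodingKey {
        case letter, checkedRight, checkedWrong
    }

    init(from decoder: Decoder) throws {
        let container = try decoder.container(keyedBy: CodingKeys.self)
        letter = try container.decode(String.self, forKey: .letter)
        checkedWrong = try container.decode(Bool.self, forKey: .checkedWrong)
        // Older clients never wrote this key, so absence means "not checked right".
        checkedRight = try container.decodeIfPresent(Bool.self, forKey: .checkedRight) ?? false
    }

    /// Boundary translation from the two stored booleans to the optional enum.
    var checked: CheckResult? {
        if checkedRight { return .right }
        if checkedWrong { return .wrong }
        return nil
    }
}

// JSON is an assumption for readability, not the actual Moves wire format.
let legacyRecord = Data(#"{"letter":"A","checkedWrong":false}"#.utf8)
let currentRecord = Data(#"{"letter":"B","checkedRight":true,"checkedWrong":false}"#.utf8)

let legacy = try JSONDecoder().decode(Entry.self, from: legacyRecord)
let current = try JSONDecoder().decode(Entry.self, from: currentRecord)
print(legacy.checked == nil, current.checked == .right)
```

The reverse direction is where the transient loss comes from: an older build that has no `checkedRight` field re-encodes without it, so the flag disappears until the next check re-stamps it.
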

## File-level inventory

Added:

- `Workers/push-worker.js`, `Workers/wrangler.push.toml`
- `Crossmate/Services/PushClient.swift`
- `Crossmate/Services/LocalSessionTracker.swift`
- `Crossmate/Models/CheckResult.swift`

Modified:

- `project.yml`, `Crossmate/Info.plist`, `Crossmate/Crossmate.entitlements` — push base URL and App Attest entitlement/configuration
- `Crossmate/Services/PushRequestAuthenticator.swift` — per-install App Attest enrollment and request signing
- `Crossmate/Models/CellMark.swift` — enum-based `checked: CheckResult?`
- `Crossmate/Models/Game.swift` — `setLetter` / `checkCells` updated for new mark API
- `Crossmate/Models/CrossmateModel.xcdatamodeld` — `CellEntity.checkedRight` attribute
- `Crossmate/Models/PlayerSession.swift` — `onPlayerEvent` + 6 fire sites removed
- `Crossmate/Sync/Moves.swift` — `TimestampedCell` / `GridCell` / `RealtimeCellEdit` / `Payload.Entry` gain `checkedRight`
- `Crossmate/Sync/Presence.swift` — `PingKind` trimmed, `PingScope` / `PingScopePayload` removed, `Ping.scope` removed
- `Crossmate/Sync/SyncEngine.swift` — `enqueuePing` loses `scope`; `PingPayload` loses `scope`
- `Crossmate/Sync/RecordSerializer.swift` — `pingRecord(scope:)` removed
- `Crossmate/Sync/RecordBuilder.swift` — drop `payload.scope` passthrough
- `Crossmate/Sync/SessionMonitor.swift` — slimmed to bucket accumulation + `consumeOnOpen`
- `Crossmate/Sync/MovesUpdater.swift` — `enqueue(checkedRight:)`, `Pending.checkedRight`
- `Crossmate/Sync/GridStateMerger.swift` — propagate `checkedRight`
- `Crossmate/Persistence/GameMutator.swift` — `encodeMark` returns the triple; cell-edit path passes `checkedRight`
- `Crossmate/Persistence/GameStore.swift` — `decodeMark` takes the triple; `applyCellCache` / `applyRealtimeCellEdit` updated
- `Crossmate/CrossmateApp.swift` — `AppDelegate.onAPNsToken` callback; `updateActiveNotificationPuzzleID` fires
  the play/pause pushes; selection-clear block fires the pause push; `session.onPlayerEvent` callback removed
- `Crossmate/Services/AppServices.swift` — `PushClient` init wired; `publishCompletionPush` (win + resign), `publishSessionStartPush`, `publishSessionEndPush`, `recipientsAndTitle`; `sendPlayerEventPings` removed; `bodyText(for: Ping)` trimmed; `presentBegins` call site removed

Tests touched: `RecordSerializerMovesTests`, `PendingEditFlagTests`, `GridStateMergerTests`, `MovesUpdaterTests`, `GameStoreUnreadMovesTests`, `Sync/MovesInboundTests`, `Sync/AuthorDeltaTests`, `Sync/EngagementCoordinatorTests`, `Sync/PendingChangeReapTests`, `Sync/SessionMonitorTests`, `GameMutatorTests`, `XDAcceptTests`, `PuzzleNotificationTextTests`, `RecordSerializerTests`.

All unit tests pass. No CloudKit Dashboard changes required.