Zapcopy QA v2 Video — Decisions
2026-04-22 · Paused for user decisions

v2 Video Pipeline — Decision Hub

The v2 video OCR pipeline shipped end-to-end today. All 10 BBBB0003/BBBB0004 runs succeeded; push-send is wired at the Lambda layer; and the output is dramatically cleaner. Before we flip v2 to the default and unblock Workstream C, there are four decisions to make. Each section below frames one decision and gives you the evidence.

Clean sweep
10 / 10
Batch 2026-04-22T03-23Z · zero fence leaks
Output trim
−32%
v1 avg 7 kB → v2 avg 5 kB · envelope stripped
Push pipeline
✅ Validated
Options A + C green · B needs real device
Workstream C
Queued
Stuck-video failsafes · blocked on your go-ahead
Decision 1

Ship v2 as the default video pipeline?

Why flip the default

  • Zero fence leaks across 10 runs (v1 had 1/10 in the 2026-04-20 sweep — the "prime numbers" bleed-through).
  • Deterministic structure via Gemini responseSchema. v1 non-deterministically wraps the polish output in a ```json envelope with escape-heavy strings.
  • ~15% faster average elapsed time (the v2 aggregator fans out frame OCR across 15 workers; the v1 orchestrator serializes frames).
  • Clean S3 namespace (v2/video-ocr/) — delete routing already isolates it.
  • Push wired at aggregator — both v1 and v2 invoke flash-copy-push-send after polish success (B1 from plan).
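The fan-out mentioned above can be sketched in a few lines. This is a hedged illustration, not the aggregator's actual code: `ocr_frame` is a hypothetical stand-in for the per-frame OCR call, and only the worker count (15) comes from the bullet above.

```python
from concurrent.futures import ThreadPoolExecutor

WORKERS = 15  # matches the v2 aggregator's fan-out width

def ocr_frame(frame_id: int) -> str:
    """Hypothetical stand-in for the per-frame OCR call."""
    return f"text-for-frame-{frame_id}"

def ocr_all_frames(frame_ids: list[int]) -> list[str]:
    # Fan out frame OCR across WORKERS threads; map() preserves input
    # order, so the stitched result is deterministic regardless of
    # which frame finishes first.
    with ThreadPoolExecutor(max_workers=WORKERS) as pool:
        return list(pool.map(ocr_frame, frame_ids))
```

The ordering guarantee of `map()` is the point: v1's serial orchestrator got ordering for free; the fan-out keeps it while running frames concurrently.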

Risks / open items

  • A3 rawResponse not yet captured on Photo. v2 polish emits rawResponse; VideoOCRService parses it but the Photo model field videoOCRPolishedResultRawResponse isn't populated yet. Invariant dashboards still work via aggregator JSON, but the iOS-local debug menu can't show it.
  • Only BBBB0003 + BBBB0004 tested. Two seeds × 5 iters. We should run a wider sweep (BBBB0001/0002 too) before flipping the default for all users.
  • Workstream C not done. Stuck-video failsafe, cancel/retry UI, launch watchdog — all still on v1 spinner-until-dead behavior.

Options

A · Flip default now

Set OCRLambdaEndpoints.useV2 = true for debug + prod. Fastest. Accepts A3 gap (fix in a follow-up).

B · Wider sweep first ★

Run BBBB0001–0004 × 3 iters × v1/v2 paired before flipping. Buys coverage; adds ~30 min.
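The Option B run matrix is small enough to enumerate up front. A minimal sketch of the paired-sweep driver, assuming the actual run invocation is handled elsewhere (nothing here names a real script):

```python
from itertools import product

SEEDS = ["BBBB0001", "BBBB0002", "BBBB0003", "BBBB0004"]
PIPELINES = ["v1", "v2"]
ITERS = 3

def sweep_matrix() -> list[tuple[str, int, str]]:
    # Paired runs: every (seed, iteration) hits both pipelines, so the
    # v1/v2 diff in Decision 2 stays apples-to-apples per seed.
    return [(seed, i, pipe)
            for seed, i, pipe in product(SEEDS, range(1, ITERS + 1), PIPELINES)]

runs = sweep_matrix()  # 4 seeds x 3 iters x 2 pipelines = 24 runs
```

24 runs at roughly the cadence of today's 10-run batch is where the "~30 min" estimate comes from.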

C · Block on A3 + C1

Fix rawResponse capture + launch watchdog first, ship v2 as a single hardened release. Safest; adds ~2 days.

★ Recommendation: Option B. Two seeds are not enough to justify flipping the default, and the sweep is cheap. A3 is cosmetic (debug-only); fix it in the next commit after the flip.

Decision 2

Is BBBB0003's −32.4% word count real content loss?

You asked: "the extra characters in V1 should be duplicates, otherwise v2 is missing information." This section puts the actual characters side by side so you can judge for yourself whether v1's extra bytes are duplicates, envelope, or OCR commentary (discardable) or real content (a regression).

v1_0 vs v2_0 stats
[character/word-count cards were populated client-side and did not survive this export]

Where v1's extra characters actually come from

Color-coded view of v1_0.txt, split into three buckets:

  • Envelope (```json, "stitched_response":, "additional_notes":)
  • Escape sequences (\n, \", \\)
  • Actual code content

Envelope + escapes account for the bulk of v1's extra characters; subtracting them, the real content delta between v1 and v2 is close to zero. The rest was bloat.

Word-level diff (v1_0 → v2_0)

Green = added in v2, pink = removed from v1, gray = unchanged.
[interactive diff pane not captured in this export]
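The envelope/escape accounting can be reproduced offline with a short script. A minimal sketch, assuming v1 output shaped like the envelope described above (a ```json fence wrapping a JSON object with stitched_response and additional_notes keys); the sample string is illustrative, not a real run's output:

```python
import json
import re

def real_content_chars(v1_text: str) -> int:
    """Strip the Markdown fence and JSON envelope from a v1 polish
    output, returning the character count of the stitched content only."""
    # Drop the leading ```json and trailing ``` fence markers if present.
    body = re.sub(r"^```json\s*|\s*```$", "", v1_text.strip())
    payload = json.loads(body)
    # Only stitched_response carries user-visible text; additional_notes
    # is Gemini's meta-commentary about its own edits.
    return len(payload.get("stitched_response", ""))

v1_sample = '```json\n{"stitched_response": "let x = 1", "additional_notes": "I inferred..."}\n```'
```

Running this over each v1_N.txt and comparing against `len(v2_N.txt)` gives the "real content delta" number directly.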

Your call

A · Confirmed clean ★

The −32% is envelope + escapes + OCR commentary. v2 output is what we wanted all along. Move on.

B · Want deeper sweep

Re-run this diff on every v1/v2 seed (BBBB0001–0004). I'll produce a matrix.

C · Found content loss

Point at specific removed lines you think should have been kept; I'll investigate the polish prompt.

★ Recommendation: Option A. Scrolling the diff, the only "lost" content is the additional_notes section where Gemini explains its own edits ("I inferred try-catch structure…") — meta-commentary, not user-visible text. v2's output is the cleanest version of the source code.

Decision 3

Is push delivery actually working?

Three validation tracks. A + C are complete. B requires a physical iPhone (we'll never get real APNs in the simulator).

Option A

Server-side (CloudWatch)

What it proves: the aggregator → push-send → SNS chain is firing.

How we checked: grepped CloudWatch logs for both Lambdas across the 10 runs of 2026-04-22T03-23Z.

qrbeam-aggregator-v2-json:
  📬 push-send invoked × 10
  user=84d8f458-…c669ca7

flash-copy-push-send:
  📤 Push send request × 10
  ✅ Push notification sent × 10
  (SNS MessageId returned)

Verdict: server chain green.
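The grep above can be made repeatable for the next sweep. A minimal sketch that counts the chain markers in exported CloudWatch lines (however you pull them down); the marker strings are taken verbatim from the snippets above, everything else is an assumption:

```python
EXPECTED_RUNS = 10

MARKERS = {
    "aggregator_invoked": "push-send invoked",
    "send_requested":     "Push send request",
    "sns_published":      "Push notification sent",
}

def count_markers(log_lines: list[str]) -> dict[str, int]:
    # For a green chain, every marker should appear exactly once per run.
    return {name: sum(marker in line for line in log_lines)
            for name, marker in MARKERS.items()}
```

If any count diverges from `EXPECTED_RUNS`, a run dropped out somewhere between the aggregator and the SNS publish.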

Option C

Simulator fixture (iOS handler)

What it proves: the iOS-side push handler chain works — every hop logs its marker.

How we ran it:

./scripts/sim_push.sh video_complete
./scripts/sim_push.sh general

debug_log.txt captured chain:

📬 PUSH: foreground delivery,
   type=video_ocr_complete,
   videoId=TEST_VIDEO_123
🎬 Video OCR complete for: TEST_VIDEO_123
📬 PUSH: handleVideoOCRComplete
   posted NSNotification
📱 Received VideoOCRCompleteRemote
🔄 BACKGROUND: fetching S3

Verdict: iOS handler green.
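For context on what the fixture script injects: a push payload along these lines would produce the handler chain logged above. This is a hypothetical reconstruction — the `type` and `videoId` field names are inferred from the debug-log markers, not read from sim_push.sh:

```python
def make_fixture_payload(kind: str, video_id: str = "TEST_VIDEO_123") -> dict:
    """Hypothetical sketch of the payload sim_push.sh injects into the
    simulator. The aps block drives the banner; the custom keys drive
    the app-side handler chain."""
    payload = {
        "aps": {"alert": {"title": "Video OCR Complete"}, "sound": "default"},
        "type": "video_ocr_complete" if kind == "video_complete" else "general",
    }
    if kind == "video_complete":
        payload["videoId"] = video_id
    return payload
```

The `video_complete` variant is what exercises handleVideoOCRComplete; the `general` variant should stop at the foreground-delivery marker.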

Option B

Real device (true end-to-end)

What it proves: APNs actually delivers to a real iPhone. This is the only track that exercises the full prod path.

Why sim can't do it: iOS simulators don't register APNs tokens. Every server-side push to the sim's cognito-id is a no-op.

Your steps (next section):

5 steps, ~10 min. See "How to run Option B yourself" below.

Verdict: pending you.

How to run Option B yourself (real device)

  1. Install the Debug build on your iPhone. Open qr_reader_v1.xcodeproj, plug in your iPhone via USB, select it in the device dropdown, scheme qr_reader_v1, ⌘R. This is required because APNs entitlements are on the signed Debug build.
  2. Grant notification permission. On first launch, tap "Allow" in the system prompt. If you previously denied, go to Settings → Flash Copy Dev → Notifications and re-enable.
  3. Verify token registration. In the AWS console, watch CloudWatch logs for /aws/lambda/qrbeam-push-register. You should see within ~3s of app launch:
    Registered token for user=<your-cognito-id>
      endpoint=arn:aws:sns:us-east-1:…:endpoint/APNS/FlashCopy-APNS/…
  4. Fire the admin test push. In the app, open Settings → Admin (scroll to bottom of debug options) → tap "Test Video OCR Push". A push should appear on the device within 2–3s. Verify the foreground banner shows "Video OCR Complete" and debug_log.txt logs 📬 PUSH: foreground delivery.
  5. Full end-to-end. Trigger a real video OCR (BBBB0003/0004 QR scan). Background the app before the aggregator finishes (~30s). Wait for the push. You should see on the device:
    • Push banner appears (while backgrounded).
    • Tapping it opens the app to the Photo detail.
    • debug_log.txt shows the 4-marker chain: 📬 PUSH: tap → 🎬 Video OCR complete → 📱 Received VideoOCRCompleteRemote → 🔄 BACKGROUND: Photo updated.

Gotcha: the first device you install on won't yet have a token registered for the user_id the aggregator targets. The aggregator invokes push-send with whichever user_id the iOS app threaded into the upload request, so make sure you're signed into the same Cognito account on both the simulator and the device; otherwise device pushes fire for a different user than the sim runs did.
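Step 5's pass/fail can be checked mechanically instead of eyeballing debug_log.txt. A minimal sketch that verifies the 4-marker chain appears in order; the marker prefixes are taken from the log lines quoted earlier in this section:

```python
CHAIN = [
    "📬 PUSH:",
    "🎬 Video OCR complete",
    "📱 Received VideoOCRCompleteRemote",
    "🔄 BACKGROUND:",
]

def chain_complete(log_text: str) -> bool:
    # Each marker must appear after the previous one; an out-of-order
    # or missing marker means a hop in the handler chain was skipped.
    pos = 0
    for marker in CHAIN:
        idx = log_text.find(marker, pos)
        if idx < 0:
            return False
        pos = idx + len(marker)
    return True
```

Run it over the tail of debug_log.txt after the end-to-end test; a False with a visible banner would point at the app-side chain rather than APNs.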

Decision

A + C are sufficient to say "the code works." Only Option B validates "the user's phone actually vibrates." Options:

A · Trust A+C, ship ★

Both green; APNs path is not exercised but the SNS publish succeeded. Defer B to a later beta.

B · Do Option B now

Takes ~10 min. Eliminates the one remaining uncertainty before flip.

C · Defer push entirely

Keep polling as primary UX, treat push as nice-to-have. Less urgent than Workstream C.

★ Recommendation: Option A for today, plan B next time you're at your desk with a device. Push is a polling substitute, not a blocker.

Decision 4

Start Workstream C (stuck-video failsafes)?

What it covers

  • C1 launch watchdog — detect isBackgroundProcessing = true photos older than 30 min at launch, resolve or mark failed.
  • C2 cancel + retry UI — surface the existing stopS3Polling() and retryAggregationOnly() behind a "Taking longer than expected?" disclosure after 2 min spinner.
  • C3 polling exhaustion surface — replace the silent 5-min give-up with a user-visible error + retry.
  • C4 foreground poll-resume — re-arm polling via scenePhase observer when the app comes back to foreground mid-poll.
  • C5 polish-text sanitize — run OCRSanitizer on polished video output (mirror of photo path).
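The C1 watchdog logic is simple enough to pin down now. A minimal sketch of the launch-time pass, with the real implementation living in Swift against the Photo model — the dict field names here (`startedAt`, `status`) are assumptions; only `isBackgroundProcessing` and the 30-minute threshold come from the C1 bullet:

```python
from datetime import datetime, timedelta

STUCK_AFTER = timedelta(minutes=30)

def resolve_stuck_photos(photos: list[dict], now: datetime) -> list[dict]:
    """C1 sketch: at launch, any photo still flagged isBackgroundProcessing
    that started more than 30 minutes ago is marked failed, so the UI can
    offer retry instead of spinning forever."""
    stuck = []
    for photo in photos:
        if photo["isBackgroundProcessing"] and now - photo["startedAt"] > STUCK_AFTER:
            photo["isBackgroundProcessing"] = False
            photo["status"] = "failed"
            stuck.append(photo)
    return stuck
```

C3 is the same idea applied at polling exhaustion, and C4 re-runs this check whenever scenePhase returns to active.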

Options

A · Start all 5 now ★

Independent of the v1/v2 split. Land as a single series of commits; ship with the v2 flip in Decision 1.

B · C1 + C3 + C4 only

The "server-independent" core — watchdog, exhaustion surface, scene-phase resume. Defer UI (C2) and sanitize (C5).

C · Pause, beta test v2

Ship v2, collect 1 week of user reports, then decide which C items are actually hit in real usage.

★ Recommendation: Option A. The user-reported "stuck forever" bug is the whole reason we're here — shipping v2 without C is like shipping a faster car with no brakes.
