F1 + F2 — Lambdas now self-report and accept runtime overrides
The two foundation steps of Workstream F are
done. Every OCR Lambda now stamps its own identity on every response, and accepts a
modelId + promptId on the request body. Nothing is wired into iOS or
remote config yet — that's F3+F4. But the plumbing the rest of F rides on is in place.
What we just shipped
75/75 tests passing · 4 Lambdas updated · no behavior change yet. Same defaults, new contract.
Without a modelId or promptId on the request, every Lambda still resolves to
Gemini 2.5 Flash with its existing normal/strict prompt — production behavior is identical.
What changed is that the Lambda now tells you what it ran (provider, model, prompt
ID, prompt SHA, lambda version) on every response, and you can tell it what to run
via two new optional body fields.
Before vs after — the request shape
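A minimal sketch of the contract change. The payload field name is illustrative (the real body shape isn't shown here); modelId and promptId are the two real new optional fields, and the model/prompt ID strings are assumed spellings:

```python
# Request body before F1+F2 — "imageBase64" is a hypothetical payload field.
before = {
    "imageBase64": "<...>",
}

# Same request after F1+F2 — both new fields are optional overrides.
after = {
    "imageBase64": "<...>",
    "modelId": "gemini-2.5-pro",        # optional; omitted → Gemini 2.5 Flash
    "promptId": "photo-ocr-strict-v1",  # optional; omitted → existing default prompt
}

# Omitting both new fields reproduces the old request exactly,
# which is why production behavior is unchanged.
defaulted = {k: v for k, v in after.items() if k not in ("modelId", "promptId")}
assert defaulted == before
```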
Where this sits in Workstream F
F1+F2 are the foundation. Every later step assumes a Lambda that can be asked to run a specific model and prompt, and that tells you what it actually ran. Without that, the iOS picker has nothing to talk to and the dashboard has nothing to trust.
F1+F2 are the only steps that touch the Lambda hot path. F3 onward are config + UI + dashboards riding on top.
Per-Lambda — what changed
moderation_ocr_v2
photo + collection
Two prompt IDs registered: photo-ocr-normal-v1 + photo-ocr-strict-v1.
Mode toggle still works (back-compat); explicit promptId wins when set.
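The precedence rule above could look roughly like this — a sketch, not the actual handler code; the function name and the mode field are illustrative:

```python
def resolve_prompt_id(body: dict) -> str:
    """Pick a prompt ID: an explicit promptId wins; otherwise fall back to
    the legacy mode toggle; otherwise the normal default.
    (Illustrative sketch — field and function names are assumptions.)"""
    if body.get("promptId"):
        return body["promptId"]
    if body.get("mode") == "strict":  # legacy toggle, kept for back-compat
        return "photo-ocr-strict-v1"
    return "photo-ocr-normal-v1"

assert resolve_prompt_id({}) == "photo-ocr-normal-v1"
assert resolve_prompt_id({"mode": "strict"}) == "photo-ocr-strict-v1"
# Explicit promptId beats the legacy toggle.
assert resolve_prompt_id(
    {"mode": "strict", "promptId": "photo-ocr-normal-v1"}
) == "photo-ocr-normal-v1"
```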
frame_ocr_v2
per-frame video
One prompt ID for now (frame-ocr-normal-v1) but the registry pattern is in
place. Identity stamped on both the HTTP response and the per-frame result doc in S3.
video_polish_v2
stitch frames
Two prompt IDs: video-polish-normal-v1 + video-polish-strict-v1.
model param was already plumbed (deviation: kept legacy field for back-compat).
video_aggregator_v2
orchestrator
Pure pass-through: accepts frameModelId / framePromptId /
polishModelId / polishPromptId and forwards them on to the per-stage
calls. A new routing block in the 200 envelope echoes what it actually ran.
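A sketch of the pass-through shape, assuming the field names above; the default ID strings and the aggregate function itself are illustrative, not the real orchestrator code:

```python
def aggregate(body: dict) -> dict:
    """Resolve per-stage routing from the request body, forward it to the
    per-stage calls, and echo what was resolved in a `routing` block.
    (Sketch only — defaults and structure are assumptions.)"""
    frame_routing = {
        "modelId": body.get("frameModelId", "gemini-2.5-flash"),
        "promptId": body.get("framePromptId", "frame-ocr-normal-v1"),
    }
    polish_routing = {
        "modelId": body.get("polishModelId", "gemini-2.5-flash"),
        "promptId": body.get("polishPromptId", "video-polish-normal-v1"),
    }
    # ... invoke frame_ocr_v2 / video_polish_v2 with these values ...
    return {
        "statusCode": 200,
        "routing": {"frame": frame_routing, "polish": polish_routing},
    }
```

Echoing the resolved routing (rather than the requested routing) is what lets the dashboard trust the response without knowing the Lambda's defaults.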
Deviation from original plan
documented
The plan called for prompts to live in a prompts.yaml file, loaded at Lambda cold
start. AWS Lambda's Python runtime doesn't ship PyYAML, so adopting it would force a
dependency into every Lambda zip.
What we did instead: kept prompts inline as Python constants and built a
PROMPTS = { id → text } + PROMPT_SHAS registry computed at module import.
The intent (named lookup by ID, SHA-stamped, swappable) is preserved — and the dashboard
now reads promptSha256 off the Lambda response (the F1 work), so the prompt-text-readability
argument for YAML is no longer load-bearing.
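The registry pattern described above can be sketched as follows — the prompt texts and constant names are placeholders, but the id → text mapping and import-time SHA computation match what's described:

```python
import hashlib

# Prompts stay inline as Python constants (placeholder text here).
PHOTO_OCR_NORMAL_V1 = "Transcribe all text visible in the photo ..."
PHOTO_OCR_STRICT_V1 = "Transcribe only text you are certain about ..."

# id -> text registry, named lookup by ID.
PROMPTS = {
    "photo-ocr-normal-v1": PHOTO_OCR_NORMAL_V1,
    "photo-ocr-strict-v1": PHOTO_OCR_STRICT_V1,
}

# SHAs computed once at module import (i.e. at Lambda cold start),
# so every response can stamp the exact prompt text it ran.
PROMPT_SHAS = {
    prompt_id: hashlib.sha256(text.encode("utf-8")).hexdigest()
    for prompt_id, text in PROMPTS.items()
}
```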
How we verified
Every Lambda now reports a -rev2 lambdaVersion. New per-Lambda test cases:
default request resolves to Flash + normal; explicit Pro reaches Gemini correctly;
unknown promptId → 400; unsupported modelId → 400; response carries the identity fields.
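The two 400 cases can be sketched as a guard that runs before any model call — a hypothetical helper, not the real handler code:

```python
def validate_routing(body: dict, prompts: dict, models: set):
    """Return a 400 error dict for unknown routing IDs, else None.
    (Illustrative sketch — names and error shape are assumptions.)"""
    prompt_id = body.get("promptId")
    if prompt_id is not None and prompt_id not in prompts:
        return {"statusCode": 400, "error": f"unknown promptId: {prompt_id}"}
    model_id = body.get("modelId")
    if model_id is not None and model_id not in models:
        return {"statusCode": 400, "error": f"unsupported modelId: {model_id}"}
    return None  # both fields absent or valid — proceed with defaults/overrides
```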
Five prompt IDs registered: photo-ocr-normal-v1, photo-ocr-strict-v1,
frame-ocr-normal-v1, video-polish-normal-v1,
video-polish-strict-v1. Each SHA-stamped at module import.
What this directly unlocks
- F3 (next): Upload ocr-routing.json to S3. Now that Lambdas accept a per-request modelId/promptId, a static config file is a viable hot-swap mechanism.
- F4 (then): iOS OCRRoutingConfig actor + 4 Settings pickers (photo / collection / frame / polish). The app resolves the routing locally and stamps the request body — iOS now uses the new fields.
- F5+F6: Sweep harness gains --model / --prompt-id flags. Dashboard ingest trusts the Lambda-emitted promptSha256 over app_defaults.yaml; SHA drift surfaces as a yellow pill on flow pages.
- F7 (final): Paired Flash-vs-Pro polish sweep. 18 video runs (3 seeds × 3 iters × 2 configs). First real test of "does Pro on polish-only earn its 10× cost?"