Multi-photo collection OCR
A PhotoCollection of N children fans out: for each child, the full single-photo pipeline runs in parallel, coordinated by a DispatchGroup. After all children complete, the iOS app stitches the results locally via OCRFormattingManager; no additional server-side Lambda is involved.
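The fan-out/stitch step can be sketched as follows. This is a conceptual Python sketch, not the app's actual Swift code: a thread pool plays the role of the DispatchGroup, and `run_single_photo_pipeline` and `stitch` are hypothetical stand-ins for the per-photo pipeline and OCRFormattingManager.

```python
from concurrent.futures import ThreadPoolExecutor

def ocr_photo_collection(children, run_single_photo_pipeline, stitch):
    """Run the full single-photo pipeline for each child in parallel,
    then stitch the per-child results locally (no extra server Lambda).

    `run_single_photo_pipeline` and `stitch` are hypothetical stand-ins
    for the app's per-photo pipeline and OCRFormattingManager.
    """
    with ThreadPoolExecutor(max_workers=max(len(children), 1)) as pool:
        # map() preserves child order, so stitching sees results in
        # the same order as the photos appear in the collection.
        results = list(pool.map(run_single_photo_pipeline, children))
    return stitch(results)
```

Note that `pool.map` keeps results in collection order even though the pipelines finish out of order, which mirrors what the stitching step needs.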
Cloud pipeline
Click any node for its URL, handler file, IAM notes, and iOS call site. Every AWS service touched at each step is rendered (Lambda function URLs, S3 buckets, Rekognition, DynamoDB, SNS → APNS), plus the external Gemini API.
Prompts used
photo_ocr
gemini-2.5-flash qr_reader_v1/EXTENDED_LAMBDA_OCR.py:108 · temp 0 sha256 48496a3017a2…
Can you please OCR this image? Please OCR and do not modify the content and try and generate the OCR result with the same exact formatting as the input image. Please focus in ensuring the OCR process flawlessly retains the source's formatting. I aim to go line-by-line, capturing every detail, including special characters, comments, and those crucial line breaks, indentations, and case differences, thus guaranteeing the output mirrors the original. However, please remove any items from an editor or parts of the IDE/word processor that are shown in any potential screenshot to as just show only the content instead. (For instance removing the list of windows open/ line numbers, file name etc.)
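The `sha256 48496a3017a2…` shown in the prompt record above is a fingerprint of the exact prompt text, used to tell prompt revisions apart. A minimal sketch of how such a fingerprint could be computed; the function name and the 12-character truncation are assumptions, not the actual registry code:

```python
import hashlib

def prompt_fingerprint(prompt: str, length: int = 12) -> str:
    """Return a short hex prefix of the SHA-256 of the prompt text.

    The truncation length is an assumption chosen to match the
    abbreviated hash displayed in the prompt record above.
    """
    return hashlib.sha256(prompt.encode("utf-8")).hexdigest()[:length]
```

Because the hash covers the raw bytes of the prompt, even a whitespace edit produces a new fingerprint, which is why the prompt text above should be treated as immutable.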
Model mix
last 16 runs · 6 post-F1
gemini-2.5-flash 13 · 81%
gemini-2.5-pro 3 · 19%
For post-F1 runs, model attribution comes from the Lambda's self-reported modelId; older runs fall back to runSettings.geminiModel from app_defaults.yaml.
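The attribution rule above is a simple fallback chain, sketched below. The record shapes and the helper name are assumptions; only the field names modelId and runSettings.geminiModel come from the text above:

```python
def attribute_model(run: dict, app_defaults: dict) -> str:
    """Resolve which Gemini model a run used.

    Post-F1 runs carry a self-reported `modelId` in the run record;
    older runs fall back to `runSettings.geminiModel` as loaded from
    app_defaults.yaml. Record shapes here are illustrative assumptions.
    """
    model = run.get("modelId")  # present only on post-F1 runs
    if model:
        return model
    return app_defaults["runSettings"]["geminiModel"]
```

Usage: for a post-F1 run record `{"modelId": "gemini-2.5-pro"}` the self-report wins; for an older record without modelId, the yaml default is returned.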
Recent runs for this flow