Prompt library

Cached view of the Gemini prompts used by the OCR pipeline. The source of truth is the Python Lambda file; this YAML is refreshed whenever we notice drift. Each run stamps the SHA256 of the prompt it executed; if a run's SHA doesn't match the current YAML entry, the run-detail page flags it.
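The SHA check described above can be sketched roughly as follows. This is a minimal illustration, not the pipeline's code: the function names prompt_sha256 and run_is_flagged are hypothetical, and it assumes the digest is taken over the prompt's UTF-8 bytes and compared as a full 64-character hex string.

```python
import hashlib


def prompt_sha256(prompt: str) -> str:
    """Return the hex SHA256 digest of a prompt's UTF-8 bytes (encoding is an assumption)."""
    return hashlib.sha256(prompt.encode("utf-8")).hexdigest()


def run_is_flagged(run_sha: str, current_prompt: str) -> bool:
    """True when the SHA stamped on a run differs from the current entry's SHA."""
    return run_sha.lower() != prompt_sha256(current_prompt)
```

Under these assumptions, any edit to the prompt text, even a single trailing space, changes the digest and would cause older runs to be flagged.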

YAML last touched 2026-04-18

photo_ocr

single photo

qr_reader_v1/EXTENDED_LAMBDA_OCR.py:108

Model

gemini-2.5-flash

T=0 · topK=1 · topP=1

Words

112

Chars

699

SHA256

48496a3017a2…

Used by

215 runs

View prompt body

Can you please OCR this image? Please OCR and do not modify the content and try and generate the OCR result with the same exact formatting as the input image. Please focus in ensuring the OCR process flawlessly retains the source's formatting. I aim to go line-by-line, capturing every detail, including special characters, comments, and those crucial line breaks, indentations, and case differences, thus guaranteeing the output mirrors the original. However, please remove any items from an editor or parts of the IDE/word processor that are shown in any potential screenshot to as just show only the content instead. (For instance removing the list of windows open/ line numbers, file name etc.)

Called by extended_lambda_ocr · Rekognition moderation (80 % threshold) → if approved, Gemini OCR → writes ocr/<uuid>.txt.
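The moderation gate in this flow might look something like the sketch below. It assumes the label list has the shape of the ModerationLabels entries returned by Rekognition's DetectModerationLabels (dicts with "Name" and "Confidence"), and it assumes "approved" means no label reaches the 80 % threshold; the function name is_approved is hypothetical and the real Lambda may apply the threshold differently.

```python
MODERATION_THRESHOLD = 80.0  # percent, taken from the "80 % threshold" note


def is_approved(moderation_labels: list[dict], threshold: float = MODERATION_THRESHOLD) -> bool:
    """Approve the image only if no moderation label meets the threshold.

    Assumption: a label at or above the threshold blocks the image; the
    actual Lambda's rule is not shown in this page.
    """
    return not any(
        float(label.get("Confidence", 0.0)) >= threshold
        for label in moderation_labels
    )
```

Only when this check passes would the image go on to Gemini OCR and the ocr/<uuid>.txt write.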

frame_ocr

video

qr_reader_v1/frame_ocr_lambda.py:31

Model

gemini-2.5-flash

T=0 · topK=1 · topP=1

Words

Chars

SHA256

66326cc5be6b…

Used by

215 runs

View prompt body

Perform OCR on this image and return plain text only. Do not describe the image.

Called by frame_ocr_lambda · OCR a single frame with Gemini; persists result JSON into <videoId>/results/.

video_polish

video structured JSON

qr_reader_v1/video_ocr_polishing_lambda_rest.py:102

Model

gemini-2.5-flash

Words

142

Chars

898

SHA256

—

Used by

sha n/a

View prompt body

Please stitch together the OCRs that are taken from individual screenshots/frames from a video. There should be overlapping lines which can be used as a marker for when one frame ends and another begins. Please do not alter the content in any way besides stitching the content together and please add correct indentation. Do not alter or add any content.

Your final output must be a single, valid JSON object and nothing else. The JSON object must conform to the following structure:
{
  "stitched_response": "(string) This key must hold the final, fully reconstructed and cleaned document text.",
  "additional_notes": "(string) Use this field to briefly describe your process. Mention any significant noise you filtered out (e.g., 'Removed text from a Save As dialog box') or any ambiguities you encountered during the reconstruction."
}

Here is the raw OCR text from video frames:

{raw_text}

The polish prompt is a Python f-string: {raw_text} is substituted with the aggregator's combined.txt. The SHA256 for this prompt is computed at ingest time from the template before interpolation (with the {raw_text} placeholder left in place), so the hash stays stable across runs.
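To illustrate why hashing the template rather than the filled-in prompt keeps the SHA stable, here is a small sketch. It assumes the template is available as a plain string with a literal {raw_text} placeholder; TEMPLATE is a shortened stand-in, and template_sha256 and render are hypothetical names.

```python
import hashlib

PLACEHOLDER = "{raw_text}"

# Shortened stand-in for the real polish template (hypothetical text).
TEMPLATE = (
    "Please stitch together the OCRs.\n"
    '{"stitched_response": "(string)"}\n'
    "Here is the raw OCR text from video frames:\n\n"
    + PLACEHOLDER
)


def template_sha256(template: str) -> str:
    """Hash the template text itself, placeholder included."""
    return hashlib.sha256(template.encode("utf-8")).hexdigest()


def render(template: str, raw_text: str) -> str:
    """Fill in the placeholder.

    str.replace is used instead of str.format because the literal JSON
    braces in the template would make str.format raise a KeyError.
    """
    return template.replace(PLACEHOLDER, raw_text)
```

The digest of TEMPLATE is the same for every run, while the digest of each rendered prompt changes with the frame text it contains.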

Called by video_ocr_polishing_lambda · Polishes combined.txt via Gemini with a structured-output JSON schema; writes final.txt.
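A response in the structure the prompt asks for could be checked before final.txt is written along these lines. This is a sketch under assumptions: parse_polish_response is a hypothetical helper, and the real Lambda may rely on Gemini's structured-output schema instead of validating by hand.

```python
import json


def parse_polish_response(raw: str) -> tuple[str, str]:
    """Parse the model's JSON reply and return (stitched_response, additional_notes).

    Raises ValueError if the reply is not a JSON object or if either key
    is missing or not a string (json.JSONDecodeError is a ValueError subclass).
    """
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    stitched = data.get("stitched_response")
    notes = data.get("additional_notes")
    if not isinstance(stitched, str) or not isinstance(notes, str):
        raise ValueError("both keys must be present and hold strings")
    return stitched, notes
```

In this sketch only the stitched_response string would be written to final.txt; additional_notes is returned separately so it can be logged or shown alongside the run.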