THE PIPELINE
From microphone to spoken feedback in under 5 seconds.
1. Audio capture (browser-side)
Recorded with the MediaRecorder API at 16 kHz mono: WebM on Chrome/Firefox, MP4 on Safari/iOS. Audio is chunked and uploaded as multipart/form-data so the proxy doesn't choke on large files.
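As a rough illustration of the container choice described above, here is a minimal sketch of picking a recording format per browser. The candidate list, the function name `pickRecorderMimeType`, and the one-second timeslice in the comments are assumptions for illustration, not the production values.

```typescript
// Candidate containers in preference order (assumed list):
// WebM for Chrome/Firefox, MP4 as the Safari/iOS fallback.
const CANDIDATE_MIME_TYPES: string[] = [
  "audio/webm;codecs=opus",
  "audio/webm",
  "audio/mp4",
];

// Returns the first candidate the browser reports as supported, or
// undefined so the caller can fall back to the browser's default.
// The support check is injected so the logic can run outside a browser.
function pickRecorderMimeType(
  isSupported: (mimeType: string) => boolean,
): string | undefined {
  return CANDIDATE_MIME_TYPES.find((mimeType) => isSupported(mimeType));
}

// In a browser this could be wired up roughly as follows:
//   const mimeType = pickRecorderMimeType((t) => MediaRecorder.isTypeSupported(t));
//   const stream = await navigator.mediaDevices.getUserMedia({
//     audio: { channelCount: 1, sampleRate: 16000 }, // browsers may treat these as hints
//   });
//   const recorder = new MediaRecorder(stream, mimeType ? { mimeType } : undefined);
//   recorder.start(1000); // timeslice in ms: emit a chunk on `dataavailable` every second
```

Injecting the support check keeps the selection logic testable without a microphone or a real MediaRecorder.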
2. Transcription (OpenAI Whisper)
The whisper-1 model transcribes your speech. We pre-warm the connection so the round trip stays under 1.5 s for short clips.
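For readers curious what a transcription request looks like, here is a minimal sketch against OpenAI's audio transcription endpoint. It is assumed to run server-side behind the proxy so the API key never reaches the browser. The helper names and the `apiKey` parameter are hypothetical, and connection pre-warming is not shown.

```typescript
const TRANSCRIBE_URL = "https://api.openai.com/v1/audio/transcriptions";

// Builds the multipart/form-data body: the audio file plus the model name.
function buildTranscriptionForm(audio: Blob, filename: string): FormData {
  const form = new FormData();
  form.append("file", audio, filename);
  form.append("model", "whisper-1");
  return form;
}

// Sends the clip and returns the transcript text.
// `apiKey` is a hypothetical parameter; load it from server-side config.
async function transcribe(
  audio: Blob,
  filename: string,
  apiKey: string,
): Promise<string> {
  const response = await fetch(TRANSCRIBE_URL, {
    method: "POST",
    headers: { Authorization: `Bearer ${apiKey}` },
    // fetch sets the multipart boundary header itself for a FormData body.
    body: buildTranscriptionForm(audio, filename),
  });
  if (!response.ok) {
    throw new Error(`Transcription failed: HTTP ${response.status}`);
  }
  const data = (await response.json()) as { text: string };
  return data.text;
}
```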
3. Two-lane grading (GPT-4o)
Lane A (fast) returns Angie's spoken reply in under 3 s. Lane B (background, 8–9 s) runs the full Cambridge rubric, then patches your band scores and corrections into the chat. You never see a loading spinner.
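The two-lane idea can be sketched as two requests started together, where only the fast one is awaited before replying. This is an illustrative sketch, not the production code: the types, `gradeTwoLane`, and the callback names are all assumptions, and the lanes are injected rather than tied to a specific model call.

```typescript
type FastReply = { reply: string };
type FullGrade = { bands: Record<string, number>; corrections: string[] };

interface Lanes {
  fast: (transcript: string) => Promise<FastReply>; // Lane A: short conversational reply
  full: (transcript: string) => Promise<FullGrade>; // Lane B: full rubric grading
}

// Starts both lanes at once, resolves as soon as Lane A answers, and lets
// Lane B deliver its result later through `onPatch`.
async function gradeTwoLane(
  transcript: string,
  lanes: Lanes,
  onPatch: (grade: FullGrade) => void,
  onPatchError: (err: unknown) => void = () => {},
): Promise<{ reply: string; done: Promise<void> }> {
  // Kick off both requests before awaiting so their latencies overlap.
  const fastPromise = lanes.fast(transcript);
  const done = lanes.full(transcript).then(onPatch).catch(onPatchError);

  const { reply } = await fastPromise;
  // `done` settles once the background patch has been applied or has failed.
  return { reply, done };
}
```

Because the caller gets the reply without waiting on `done`, the chat can show the reply immediately and update band scores whenever `onPatch` fires.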
4. Audit & feedback loop
Each week, 1% of graded responses are sampled and re-graded by Cambridge ESOL-trained linguists. We re-tune the system prompt monthly to keep AI bands within ±0.25 of human bands.
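One way to picture the audit check is as a sampling step plus a tolerance test. The sketch below assumes "aligned" means the mean absolute gap between AI and human bands on the audited sample. The actual alignment metric is not stated here, so treat the metric and all names as hypothetical.

```typescript
interface AuditPair {
  aiBand: number;    // band assigned by the AI grader
  humanBand: number; // band assigned by the human re-grader
}

// Keeps each item independently with probability `rate` (1% by default).
// The random source is injectable so the sampling can be made deterministic.
function sampleForAudit<T>(
  items: T[],
  rate: number = 0.01,
  random: () => number = Math.random,
): T[] {
  return items.filter(() => random() < rate);
}

// Mean absolute difference between AI and human bands (assumed metric).
function meanAbsoluteGap(pairs: AuditPair[]): number {
  if (pairs.length === 0) {
    throw new Error("No audited pairs to compare");
  }
  const total = pairs.reduce(
    (sum, pair) => sum + Math.abs(pair.aiBand - pair.humanBand),
    0,
  );
  return total / pairs.length;
}

// True when the sample stays inside the tolerance (0.25 bands by default).
function withinTolerance(pairs: AuditPair[], tolerance: number = 0.25): boolean {
  return meanAbsoluteGap(pairs) <= tolerance;
}
```

A failing `withinTolerance` on a weekly sample would be the signal that the prompt needs re-tuning.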