Benchmark · February 16, 2026 · 10 min read
How We Measure Dictation Latency
A reproducible method for evaluating end-of-dictation completion speed across dictation tools.
Quick answer
We measure from end of speech to visible final text output using the same 20-second phrase across repeated trials.
Speed claims in voice products are easy to make and hard to trust unless the timing boundary is explicit.
This post explains exactly how Almond measures dictation latency so anyone can reproduce the process with their own stack.
The metric we optimize
Our primary metric is end-of-dictation to visible final text. That means the timer starts when speech ends and stops when the final transcript is fully usable in the target app.
We use this boundary because it maps to human experience: users feel the delay after they stop speaking, not the model's token throughput while they are still talking.
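Expressed as code, the boundary is just the difference between two timestamps. A minimal sketch follows; the function and argument names are illustrative, not part of our tooling.

```python
def dictation_latency(end_of_speech_ts: float, final_text_visible_ts: float) -> float:
    """End-of-dictation latency: seconds from the moment speech ends to the
    moment the final transcript is visible and editable in the target app."""
    return final_text_visible_ts - end_of_speech_ts
```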
What we keep constant
To make comparisons fair, we control the test environment:
- Same spoken phrase length: 20 seconds.
- Same microphone and speaking cadence.
- Same hardware and macOS version.
- Same target app and insertion context.
- Repeated trials, then median reporting.
Any variable that changes between runs can distort conclusions, so we keep the setup simple and repeatable.
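One way to keep yourself honest is to log the constants alongside every trial so any drift shows up in the data. The sketch below is illustrative; the field names are assumptions, not an internal schema.

```python
# Conditions held constant across every run; field names are illustrative.
RUN_CONDITIONS = {
    "phrase_length_seconds": 20,
    "microphone": "same external microphone",
    "speaking_cadence": "natural, consistent pace",
    "hardware": "same Mac and macOS version",
    "target_app": "same app and insertion context",
    "trials_per_tool": 10,
    "reported_statistics": ["median", "spread"],
}
```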
Trial protocol
- Start recording in the target dictation tool.
- Speak the standardized 20-second phrase.
- Stop speaking and mark that timestamp.
- Stop timing when final text is visible and editable.
- Repeat across multiple runs and record median plus spread.
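A manual version of one trial can be timed from the terminal. This is a sketch that assumes you mark both events with a keypress; it is not the instrumented harness we run internally, and keypress reaction time adds a little noise, but that noise affects every tool equally.

```python
import time

def run_trial() -> float:
    """Time one dictation trial by hand.

    Press Enter the instant you stop speaking, then press Enter again once
    the final text is visible and editable in the target app.
    """
    input("Recording started? Speak the 20-second phrase, then press Enter the moment you stop speaking: ")
    end_of_speech = time.monotonic()
    input("Press Enter when the final text is visible and editable: ")
    final_text_visible = time.monotonic()
    return final_text_visible - end_of_speech
```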
Why median and variance both matter
Median tells you typical speed. Variance tells you predictability.
A tool with a decent median but high variance still feels unreliable in daily writing. That is why we look at both central tendency and consistency.
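Given a list of per-trial latencies, both numbers fall out of the standard library statistics module. A minimal sketch, with an illustrative function name:

```python
import statistics

def summarize(latencies: list[float]) -> dict[str, float]:
    """Median captures typical speed; IQR and stdev capture predictability."""
    q1, _, q3 = statistics.quantiles(latencies, n=4)  # quartile cut points
    return {
        "median": statistics.median(latencies),
        "iqr": q3 - q1,
        "stdev": statistics.stdev(latencies),
    }
```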
Common benchmark mistakes
- Stopping the timer at partial or streaming text instead of final usable output.
- Running single-shot tests that ignore random spikes.
- Comparing across different workflows or app contexts.
- Ignoring correction overhead after insertion.
Each mistake can make a slow real workflow appear fast on paper.
How to replicate this yourself
You can follow the same method with your own phrase, app mix, and dictation tools. We recommend running at least 10 trials per tool and reporting both the median and a rough measure of spread.
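If you log each trial to a CSV, a short script can produce the per-tool summary. This is a sketch under an assumed file layout; the column names (tool, latency_seconds) are not a format we publish, just an example.

```python
import csv
import statistics
import sys

def report(path: str) -> None:
    """Read trial results from a CSV with columns: tool, latency_seconds.
    Print the median and interquartile range per tool."""
    by_tool: dict[str, list[float]] = {}
    with open(path, newline="") as f:
        for row in csv.DictReader(f):
            by_tool.setdefault(row["tool"], []).append(float(row["latency_seconds"]))

    for tool, latencies in by_tool.items():
        q1, _, q3 = statistics.quantiles(latencies, n=4)
        print(f"{tool}: median {statistics.median(latencies):.2f}s, "
              f"IQR {q3 - q1:.2f}s ({len(latencies)} trials)")

if __name__ == "__main__":
    report(sys.argv[1])
```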
If you want the complete reference, review our full methodology page and benchmark summary at dictation-speed-benchmark.
Bottom line
Measurement should make claims easier to verify, not harder. Clear boundaries create better product comparisons and better product decisions.
Related reading
- Engineering: Building Deterministic On-Device Dictation. Engineering principles that improve post-speech consistency and reduce tail latency spikes on Mac.
- Product: Introducing Almond. Why Almond exists and why deterministic on-device dictation changes writing speed on Mac.
- Benchmark: Offline Dictation vs Cloud Latency. A practical breakdown of why local dictation often feels faster and more reliable after speech ends.
Published February 16, 2026 · Updated February 16, 2026