Prediction methods
How accurate are race time predictors? An honest look at the research
Every running app gives you a finish time. Behind that number are formulas with real limits, and a confidence range your predictor probably is not telling you about.
Type a recent 10K time into any running app and you get back a confident-looking number. Your projected marathon finish. Your projected half marathon finish. Your sub-three capability. The number is usually rendered to the nearest second. It is rarely rendered with a confidence range.
Two things are worth knowing before you trust any single prediction. First, what formulas are actually doing the work behind the scenes. Second, where they break. Every prediction model has assumptions that do not always hold for your specific training and physiology.
How does the Riegel formula work?
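Most distance equivalence predictors are built on, or compared against, an equation published by mechanical engineer Pete Riegel in American Scientist (1981). His formula relates a known race time to a target race time at a different distance:
T2 = T1 × (D2 / D1)^k
Here T1 is the known time over distance D1, T2 is the predicted time over distance D2, and k is the exponent that sets how quickly pace slows as distance grows.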
Riegel found that an exponent of k = 1.06 fitted world record data across distances from sprints to ultras for both men and women. The formula assumes that pace slows at a roughly predictable rate as distance increases. The result is accurate enough to remain useful and simple enough to compute on the back of a napkin (Riegel, 1981).
Riegel himself was clear that the model has limits. The fixed exponent treats endurance as if it scales identically across athletes, which is not generally true. Runners whose training is biased toward short distances tend to get longer distance predictions that are too fast (the model expects more endurance than they have built). Runners whose training is biased toward long distances tend to get shorter distance predictions that are too fast (the model expects more speed than they have built).
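To make the role of the exponent concrete, here is a minimal Python sketch of the Riegel calculation. The function names are ours, and the alternative exponents (1.08 and 1.10) are illustrative values chosen to show sensitivity, not figures from Riegel's paper.

```python
def riegel_predict(known_time_s: float, known_dist_km: float,
                   target_dist_km: float, k: float = 1.06) -> float:
    """Predict a time in seconds at target_dist_km from one known race."""
    return known_time_s * (target_dist_km / known_dist_km) ** k


def fmt_hms(seconds: float) -> str:
    """Format a duration in seconds as h:mm:ss."""
    total = round(seconds)
    hours, rem = divmod(total, 3600)
    minutes, secs = divmod(rem, 60)
    return f"{hours}:{minutes:02d}:{secs:02d}"


ten_k_s = 50 * 60  # a 50:00 10K, in seconds

# Same input race, three different exponents (the last two are hypothetical).
for k in (1.06, 1.08, 1.10):
    marathon_s = riegel_predict(ten_k_s, 10.0, 42.195, k)
    print(f"k = {k:.2f} -> marathon {fmt_hms(marathon_s)}")
```

With the standard exponent, a 50:00 10K maps to a marathon of roughly 3:50. Nudging the exponent to 1.10 moves the same runner to roughly 4:04, which is why a single fixed exponent cannot suit every athlete.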
What is VDOT and how is it different?
Jack Daniels, an exercise physiologist whose career has shaped much of modern distance running coaching, developed a different framework based on physiological measures of running economy and oxygen uptake. His VDOT system (a function of VO2max and running economy) is presented in Daniels' Running Formula (Daniels, 2014) as a set of equivalence tables. If you ran X for distance D1, you should be capable of Y for distance D2, with corresponding training paces for easy, threshold, interval, and repetition work.
VDOT is more physiologically grounded than Riegel. It incorporates the actual oxygen uptake demand of running at different paces. For trained runners with consistent training across distances, VDOT predictions broadly track observed performance. For runners whose physiology diverges from the table assumptions (very heavy or very light, limited training history, unusual running economy), the predictions drift.
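For readers who want to see the shape of the calculation behind the tables, the sketch below uses the regression equations commonly attributed to Daniels and Gilbert. The coefficients are the widely circulated ones and are an assumption here: this article does not state them, so check them against the book before relying on them.

```python
import math


def oxygen_cost(velocity_m_per_min: float) -> float:
    """Estimated oxygen cost (ml/kg/min) of running at the given speed.

    Coefficients are the commonly quoted Daniels-Gilbert values (assumed).
    """
    v = velocity_m_per_min
    return -4.60 + 0.182258 * v + 0.000104 * v * v


def fraction_of_max(duration_min: float) -> float:
    """Estimated fraction of VO2max sustainable for a race of this duration."""
    t = duration_min
    return (0.8
            + 0.1894393 * math.exp(-0.012778 * t)
            + 0.2989558 * math.exp(-0.1932605 * t))


def vdot(distance_m: float, duration_min: float) -> float:
    """Estimate VDOT from a single race result."""
    velocity = distance_m / duration_min  # metres per minute
    return oxygen_cost(velocity) / fraction_of_max(duration_min)


# A 50:00 10K comes out at a VDOT of about 40.
print(round(vdot(10_000, 50.0), 1))
```

Unlike Riegel's single exponent, this approach separates the oxygen cost of a pace from how long that effort can be sustained, which is what "more physiologically grounded" means in practice.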
Where is the research heading?
More recent work has tried to incorporate training history into predictions, on the basis that a 10K time alone does not tell you whether someone has the endurance base to translate that fitness to a marathon.
Tanda (2011) proposed a model that used average weekly training distance and average training pace as inputs, and tested it against marathon times in trained runners. The training-aware approach reduced prediction error compared to distance equivalence formulas alone. Vickers and Vertosick (2016), using a large dataset of recreational endurance runners, found that incorporating training history meaningfully improved prediction accuracy over Riegel-only models. The gain was largest for the marathon, where the gap between underlying speed and demonstrated endurance is largest.
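As an illustration of a training-aware model, here is a sketch of the equation usually quoted from Tanda (2011), which predicts marathon pace from average weekly distance and average training pace over the weeks before the race. The coefficients are reproduced from memory of how the equation is commonly cited and should be treated as an assumption to verify against the paper. The two example runners are hypothetical.

```python
import math

MARATHON_KM = 42.195


def tanda_marathon_pace(weekly_km: float, training_pace_s_per_km: float) -> float:
    """Predicted marathon pace in seconds per km.

    weekly_km: average weekly training distance (km)
    training_pace_s_per_km: average training pace (seconds per km)
    Coefficients as commonly quoted from Tanda (2011); assumed, not verified here.
    """
    return (17.1
            + 140.0 * math.exp(-0.0053 * weekly_km)
            + 0.55 * training_pace_s_per_km)


def tanda_marathon_time(weekly_km: float, training_pace_s_per_km: float) -> float:
    """Predicted marathon time in seconds."""
    return tanda_marathon_pace(weekly_km, training_pace_s_per_km) * MARATHON_KM


# Two hypothetical runners with the same training pace (5:30/km)
# but different weekly volume.
for weekly_km in (30, 80):
    minutes = tanda_marathon_time(weekly_km, 330) / 60
    print(f"{weekly_km} km/week -> about {minutes:.0f} minutes")
```

The point of the sketch is the shape of the model: volume enters through a decaying exponential, so the predicted marathon gets faster as weekly distance rises, with diminishing returns.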
The direction of travel in the literature is consistent. Simple distance equivalence formulas are useful starting points, but the most accurate predictions combine multiple models with the runner's actual training context.
How accurate is a good prediction in practice?
For a trained runner with consistent recent racing and reasonable training data, a well calibrated prediction lands within roughly 2 to 5 percent of actual finish time most of the time. A four hour marathon prediction is typically correct within 5 to 12 minutes under good conditions and competent execution. Predictions are most accurate over distances close to a recent race performance, and least accurate when extrapolating from a 5K to a marathon with no training data in between.
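The conversion from a percentage error to minutes is simple arithmetic, sketched below. The function name and default values are ours; the 2 and 5 percent figures are the ones quoted above.

```python
def error_band_minutes(predicted_min: float,
                       low_pct: float = 2.0,
                       high_pct: float = 5.0) -> tuple[float, float]:
    """Convert a percentage error range into minutes for a predicted time."""
    return predicted_min * low_pct / 100, predicted_min * high_pct / 100


# A four hour marathon prediction is 240 minutes.
low, high = error_band_minutes(240)
print(f"Typical error: {low:.1f} to {high:.1f} minutes either side")
```

For a 240 minute prediction this gives 4.8 to 12 minutes, which is where the "5 to 12 minutes" figure comes from.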
Three things to know before trusting any single prediction:
A single short race input is the weakest signal.
A 5K predicting a marathon, with no training data, extrapolates from a predominantly aerobic but fast effort to a predominantly aerobic and long effort. The further apart the input and target distances, the larger the uncertainty.
Training context matters more than the model.
A 50:00 10K runner doing 30 km per week will not run the same marathon as a 50:00 10K runner doing 80 km per week. Any predictor that ignores training data is leaving accuracy on the table.
Race day conditions are unmodelled.
Weather, course profile, fueling, sleep, and pacing decisions add variability the prediction cannot capture. A “3:42” prediction assumes competent execution. Going out 10 seconds per kilometre too fast in the first 10K is enough to break it.
What does PaceBrain actually do?
We combine six prediction models including VDOT (Daniels), Riegel, Cameron, and training load adjusted variants, weighted by how well they fit your specific data. We then publish a confidence range, not a single number. A prediction of 3:42:18 with a confidence range of 3:38 to 3:47 is more useful than 3:42:18 alone, because it tells you what to plan for and what is plausible on the day.
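To show the general idea of weighting models and reporting a range, here is a deliberately simple sketch. It is not PaceBrain's implementation: the inverse-error weighting, the function name, and every number in the example are hypothetical, and the range is just the weighted spread of the models rather than a calibrated confidence interval.

```python
import math


def weighted_prediction(predictions_s: list[float],
                        fit_errors_s: list[float]) -> tuple[float, float, float]:
    """Combine model predictions, weighting each by how well it fit past races.

    predictions_s: one predicted finish time (seconds) per model
    fit_errors_s:  each model's typical error on the runner's past races (seconds)
    Returns (central estimate, low end, high end) in seconds.
    """
    # Smaller past error means larger weight; floor the error to avoid dividing by zero.
    weights = [1.0 / max(err, 1.0) for err in fit_errors_s]
    total = sum(weights)
    mean = sum(w * p for w, p in zip(weights, predictions_s)) / total
    variance = sum(w * (p - mean) ** 2 for w, p in zip(weights, predictions_s)) / total
    spread = math.sqrt(variance)
    return mean, mean - spread, mean + spread


# Three hypothetical models and their hypothetical past errors.
central, low, high = weighted_prediction([13300, 13400, 13500], [60, 30, 60])
print(f"central {central / 60:.1f} min, range {low / 60:.1f} to {high / 60:.1f} min")
```

A production system would need a range calibrated against real race outcomes. The sketch only shows why a weighted set of models naturally produces a band rather than a single figure.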
If the predictor you are using gives you a single number with no confidence interval, ask it for one. Or trust it less.
Frequently asked questions
How accurate is the Riegel formula?
The Riegel formula is reasonably accurate for trained runners predicting between distances that are close together (for example a 10K to a half marathon). For runners with limited long distance training, it tends to predict marathon times that are too fast, because the fixed exponent assumes endurance scales evenly with speed. Most prediction errors at the marathon distance trace back to that assumption.
Can a 5K time predict my marathon?
It can, but the uncertainty is large. A 5K is a predominantly aerobic effort lasting roughly 15 to 30 minutes for most trained runners. A marathon is a predominantly aerobic effort lasting two and a half to five hours. Extrapolating across that gap without training data assumes the runner has built endurance proportional to their speed, which often is not the case. Predictions improve substantially when the model includes weekly mileage, longest recent run, and at least one race over a longer distance.
What is VDOT?
VDOT is a measure developed by Jack Daniels that combines VO2max and running economy into a single value. The VDOT tables in Daniels' Running Formula map a recent race performance to equivalent paces for other race distances, plus training paces for easy runs, tempo runs, intervals, and repetitions. It is more physiologically grounded than the Riegel formula and is widely used by coaches.
Why do running apps disagree on my marathon time?
Different apps use different prediction models. Some use Riegel only. Some use VDOT tables. Some adjust based on training load. The wider the spread between predictions, the more uncertainty there is in the underlying data. If three models give you answers within five minutes of each other, you can trust the central estimate. If they disagree by twenty minutes, the prediction is mostly noise.
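If you have predictions from several apps, the rule of thumb above can be written as a quick check. The function names and the "moderate" middle label are ours; the five and twenty minute thresholds are the ones given in the answer above.

```python
def model_spread_minutes(predictions_min: list[float]) -> float:
    """Gap in minutes between the fastest and slowest prediction."""
    return max(predictions_min) - min(predictions_min)


def describe_agreement(predictions_min: list[float],
                       tight_min: float = 5.0,
                       wide_min: float = 20.0) -> str:
    """Label how far a set of predictions agree, using the thresholds above."""
    spread = model_spread_minutes(predictions_min)
    if spread <= tight_min:
        return "models agree: trust the central estimate"
    if spread >= wide_min:
        return "models disagree: treat the prediction as mostly noise"
    return "moderate disagreement: plan for a wider range"


# Three hypothetical marathon predictions, in minutes.
print(describe_agreement([222.0, 224.5, 226.0]))
print(describe_agreement([215.0, 228.0, 239.0]))
```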
How do I get a more accurate prediction?
Provide more data. The strongest predictions use multiple recent race times across different distances, plus training history (weekly volume, long run length, consistency). Predictions over distances close to your recent racing are generally more accurate than extrapolations to distances you have not raced.
Related reading: Why most runners train too fast on easy days.
References
- Daniels, J. (2014) Daniels' Running Formula. 3rd edn. Champaign, IL: Human Kinetics.
- Riegel, P.S. (1981) ‘Athletic records and human endurance’, American Scientist, 69(3), pp. 285–290.
- Tanda, G. (2011) ‘Prediction of marathon performance time on the basis of training indices’, Journal of Human Sport and Exercise, 6(3), pp. 511–520.
- Vickers, A.J. and Vertosick, E.A. (2016) ‘An empirical study of race times in recreational endurance runners’, BMC Sports Science, Medicine and Rehabilitation, 8(1), article 26.
All citations point to peer reviewed primary sources or, in the case of Daniels (2014), a foundational textbook by the originating researcher.
Get a prediction with the confidence range built in.
PaceBrain runs six models on your data and publishes a range, not a number. Free, no signup.