Topol’s VO₂max Critique Makes the Same Error It Condemns
Table of Contents
- Introduction — Where Topol Is Right
- METs Are an Oxygen Consumption Estimate
- The MET Estimate Contains Hidden Assumptions About Human Movement
- “Free and Accessible” Does Not Mean Accurate
- VO₂ Measurement Is Not a Single Number. It Is a Physiologic Dataset.
- Does Raising VO₂ by Any Means Improve Longevity?
- Conclusion — What the Evidence Actually Supports
- References
Introduction — Where Topol Is Right
The Wearable “VO₂ Max” Problem
Eric Topol’s critique of wearable VO₂ max metrics begins on solid ground. He correctly points out that the number that appears on a smartwatch screen labeled “VO₂ max” is not a direct physiological measurement. It is an estimate generated by an algorithm. The algorithm takes inputs such as heart-rate response, movement speed or acceleration, and demographic variables, then compares those signals to patterns observed in datasets where laboratory VO₂ max was measured. From those correlations it produces a predicted value. The correlation with actual measured oxygen uptake is rough at best, even on a population level, and worse for individuals.
Population models are excellent at describing the center of a distribution but much weaker at capturing the quirks of any given individual. A person’s biomechanics, cardiovascular response patterns, breathing mechanics, medication use, and training background all influence how their body responds to exercise. A model trained on population averages necessarily compresses that complexity into a simplified statistical relationship. When a wearable device reports a VO₂ max value, what it's really reporting is what the model believes the number is likely to be for someone whose signals resemble yours (ignoring the noisy input quality). That can be reasonably accurate for some individuals and substantially wrong for others.
Topol’s concern is therefore well placed. When a device outputs a single value and (deceptively, in my opinion) labels it with the name of a physiological quantity, people naturally assume that the quantity has been measured. In reality it has been inferred, and once the number becomes a focal health metric, something people believe they must optimize in order to extend their lives, the gap between inference and measurement becomes more consequential.
There's also a deeper conceptual issue beneath the technical one. VO₂ max, in the laboratory sense, is defined by gas exchange measurement. During cardiopulmonary exercise testing (CPET), the subject breathes through a metabolic analyzer that directly measures oxygen uptake and carbon dioxide production while workload increases to maximal effort. That process captures the integrated performance of the lungs, heart, circulation, and skeletal muscle metabolism. The smartwatch version of VO₂ max bypasses that physiology entirely and attempts to infer the result indirectly from external signals.
For these reasons, Topol’s skepticism toward consumer “VO₂ max” numbers is justified. In my opinion, they are not really VO₂ max numbers at all, just a deceptively named guess. They are estimates built from population modeling, dependent on imperfect inputs, and incapable of accounting for many individual variables that influence exercise physiology. And, as Topol argues, presenting them as precise physiological measurements can mislead users about both their health status and the meaning of the number itself.
But once we accept that wearable VO₂ max estimates are model-based approximations rather than measurements, Topol’s preferred alternative must be held to the same standard.
In fact, METs instantiate the exact same kind of inferential issues. Yes, treadmill-derived METs are a better-anchored estimate than wrist-based VO₂ predictions, but they are still just a modeled estimate. That is a difference in degree, not a difference in kind. If Topol is going to condemn wearables for substituting modeled inference for direct measurement, he cannot then smuggle METs into the argument as though they belong to a different epistemic category.
Though Topol conveniently skips explaining this, the very definition of a “MET” is explicitly anchored to VO₂.
METs Are an Oxygen Consumption Estimate
The central problem with Topol’s argument appears immediately once the definition of a MET is stated clearly. A MET is a unit derived from oxygen consumption: one metabolic equivalent is defined as 3.5 mL of oxygen per kilogram of body weight per minute, historically approximating resting metabolic rate (Ainsworth et al., 2000).
Everything built on the concept of METs is therefore explicitly anchored to oxygen uptake. A workload of 10 METs corresponds to an oxygen consumption of roughly 35 mL/kg/min. The connection between METs and VO₂ is baked into the very definition of the term.
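The arithmetic behind that definition can be made concrete. What follows is a minimal sketch in Python; the function names are my own, chosen for illustration, and the only constant is the 3.5 mL/kg/min convention stated above.

```python
# Conventional definition: 1 MET = 3.5 mL of oxygen per kg of body weight per minute.
ML_O2_PER_KG_MIN_PER_MET = 3.5


def mets_to_vo2(mets: float) -> float:
    """Convert a MET value to relative oxygen consumption (mL/kg/min)."""
    return mets * ML_O2_PER_KG_MIN_PER_MET


def vo2_to_mets(vo2_ml_kg_min: float) -> float:
    """Convert relative oxygen consumption (mL/kg/min) to METs."""
    return vo2_ml_kg_min / ML_O2_PER_KG_MIN_PER_MET


if __name__ == "__main__":
    print(mets_to_vo2(10))  # 35.0
    print(vo2_to_mets(35))  # 10.0
```

The conversion is a fixed rescaling in both directions, which is the sense in which a MET value is an oxygen-consumption figure in different units.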
For a physician writing publicly and authoritatively about why readers should prefer METs to VO₂, omitting the definitional relationship between METs and oxygen consumption is an inexcusable failure of clarity.
When exercise treadmill tests report “peak METs,” they are not measuring some magically different physiological phenomenon. They are estimating maximal oxygen consumption from mechanical workload.
The treadmill records speed and grade, which are then inserted into metabolic equations that estimate the oxygen cost of locomotion. The resulting oxygen consumption estimate is then divided by 3.5 to express the value in METs (ACSM Guidelines for Exercise Testing and Prescription; Wasserman et al., Principles of Exercise Testing and Interpretation).
In other words, the pipeline looks like this: mechanical workload → estimated oxygen consumption → METs
METs are simply an oxygen-consumption (VO₂) estimate expressed in different units.
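To show how little physiology enters that pipeline, here is a hedged sketch of it in Python. The coefficients are the walking and running equations as they are commonly published in ACSM materials; they are not quoted anywhere in this article, so treat them, the 26.8 m/min-per-mph conversion, and the function names as my assumptions rather than as the exact procedure used in any study cited here.

```python
MPH_TO_M_PER_MIN = 26.8          # approximate conversion (assumption)
ML_O2_PER_KG_MIN_PER_MET = 3.5   # conventional MET definition


def estimated_vo2(speed_m_per_min: float, grade: float, running: bool) -> float:
    """Estimate oxygen cost (mL/kg/min) from treadmill speed and fractional grade.

    Assumed coefficients: walking uses 0.1 (horizontal) and 1.8 (vertical),
    running uses 0.2 and 0.9. The 3.5 term is the assumed resting component.
    """
    horizontal, vertical = (0.2, 0.9) if running else (0.1, 1.8)
    resting = 3.5
    return horizontal * speed_m_per_min + vertical * speed_m_per_min * grade + resting


def estimated_mets(speed_mph: float, grade_percent: float, running: bool) -> float:
    """Mechanical workload -> estimated oxygen consumption -> METs."""
    speed = speed_mph * MPH_TO_M_PER_MIN
    vo2 = estimated_vo2(speed, grade_percent / 100.0, running)
    return vo2 / ML_O2_PER_KG_MIN_PER_MET


if __name__ == "__main__":
    # Walking at 3.4 mph up a 14% grade.
    print(round(estimated_mets(3.4, 14, running=False), 1))  # 10.2
    # Running at 6 mph on level ground.
    print(round(estimated_mets(6.0, 0, running=True), 1))    # 10.2
```

Note that nothing about the person on the treadmill appears among the inputs. Speed and grade go in, and a number labeled with a physiological unit comes out.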
This fact is well understood within the exercise physiology literature. It's stated explicitly in the very research Topol cites. The systematic review he relies on notes that “VO₂max measured by cardiopulmonary exercise testing is the gold standard measure of cardiorespiratory fitness, while estimated CRF, expressed in METs, is used when direct gas exchange measurement is not feasible” (Singh et al., 2024, emphasis mine).
Similarly, the FRIEND registry analysis explains that cardiorespiratory fitness can be directly measured as maximal oxygen consumption (VO₂max) from CPET or estimated from exercise workload during treadmill testing (Kaminsky et al., 2015, emphasis mine).
The implication is straightforward. These papers treat MET-based exercise capacity as a practical estimation method, not as a superior physiological metric. Yet Topol cites this very literature to argue that METs should replace VO₂max as the primary metric of interest. That position contradicts the framing used by the authors of the studies themselves.
The epidemiologic studies he references follow the same pattern. These studies don't present METs as a competing physiological construct. They use METs because their datasets contain treadmill workloads rather than gas-exchange measurements, and treadmill workloads have historically been easier to collect at scale.
Once that point is understood, the logic of Topol’s recommendation collapses. If METs are an estimation method for oxygen consumption, then claiming that METs should replace VO₂max is equivalent to claiming that an estimate of oxygen consumption is preferable to measuring oxygen consumption directly.
And yes, METs are more practical to measure at scale, but practicality isn't the same thing as scientific superiority. A convenient approximation can be useful in population research without becoming the preferred physiological measurement for individuals.
And for individuals, METs, as an estimate, inherit all the modeling assumptions used to convert workload into oxygen consumption. They assume average locomotion efficiency, accurate treadmill calibration, correct incline measurement, and standardized resting metabolic rate. Each assumption introduces potential error.
The literature acknowledges these limitations directly. Byrne and colleagues demonstrated that the conventional MET definition of 3.5 mL/kg/min does not accurately represent resting metabolic rate for many individuals, meaning that MET-based estimates can frequently misrepresent true oxygen consumption on an individual level (Byrne et al., 2005).
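A small sketch shows why the resting assumption matters for an individual. The resting value of 2.8 mL/kg/min below is a hypothetical number chosen purely for illustration, not a figure taken from Byrne et al., and the function names are my own.

```python
CONVENTIONAL_RESTING_VO2 = 3.5  # mL/kg/min, the standard MET definition


def conventional_mets(vo2_ml_kg_min: float) -> float:
    """METs as usually reported: oxygen consumption divided by the fixed 3.5."""
    return vo2_ml_kg_min / CONVENTIONAL_RESTING_VO2


def personal_multiple_of_rest(vo2_ml_kg_min: float, measured_resting_vo2: float) -> float:
    """How many times this person's own resting rate the same workload represents."""
    return vo2_ml_kg_min / measured_resting_vo2


if __name__ == "__main__":
    exercise_vo2 = 35.0          # mL/kg/min during exercise
    hypothetical_resting = 2.8   # illustrative individual resting value (assumption)
    print(conventional_mets(exercise_vo2))
    print(personal_multiple_of_rest(exercise_vo2, hypothetical_resting))
```

Under these illustrative numbers, the same oxygen consumption is reported as 10 METs by convention but corresponds to roughly 12.5 times this person's actual resting rate. The label "multiples of rest" only holds for someone whose resting rate happens to match the constant.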
So Topol’s own critique of wearable “VO₂” numbers also applies, by the same principles, to MET-based estimates. This is the central error in his article. He correctly objects when wearable companies take indirect inputs, pass them through population-based models, and present the result under the authoritative label of VO₂ max. But treadmill-derived METs do the same kind of thing. They also take indirect inputs, pass them through standardized equations and population assumptions, and infer oxygen consumption rather than measuring it. Treadmill MET estimates are generally closer to the physiology because workload is a stronger anchor than wrist-sensor pattern recognition. But they are still estimates.
Both are modeling approaches. Neither directly measures oxygen consumption. Topol is therefore attacking modeled inference in one paragraph and privileging modeled inference in the next. He isn't exposing two different epistemic categories. He is treating two members of the same category as though one of them were exempt from the criticism he has already made.
Once this is recognized, the conceptual hierarchy becomes obvious. Direct VO₂ measurement via cardiopulmonary exercise testing sits closest to the physiology. Treadmill-derived MET values are one layer removed, estimating oxygen consumption from workload. Wearable VO₂ estimates are further removed, predicting oxygen consumption from statistical relationships between sensor signals and metabolic data.
But treating METs as superior to VO₂max actually reverses the direction of measurement fidelity. It elevates the approximation over the quantity being approximated.
The MET Estimate Contains Hidden Assumptions About Human Movement
Even if one accepts METs as a practical way to approximate aerobic capacity in large datasets, the estimation process introduces assumptions that move the measurement further away from physiology and closer to performance on a specific mechanical task.
When peak METs are derived from treadmill testing, the estimate assumes a predictable relationship between external workload and oxygen consumption. The ACSM metabolic equations treat the oxygen cost of locomotion as a function of speed and grade. They assume that if a person is running at a given velocity on a given incline, the metabolic demand can be estimated with reasonable accuracy. That assumption is broadly valid at the population level, which is why treadmill protocols have been widely used in epidemiologic research.
But the assumption breaks down once individual biomechanics enter the picture.
Running is not a purely metabolic task. It's also a biomechanical skill. The metabolic energy cost of locomotion varies substantially between individuals depending on stride mechanics, tendon stiffness, elastic recoil, neuromuscular coordination, and training background. These factors determine how efficiently metabolic energy is converted into forward motion. Two individuals with identical cardiovascular and metabolic capacity can produce very different treadmill workloads if one of them is a more mechanically efficient runner.
Running economy, the oxygen cost required to sustain a given running speed, varies widely, even among trained athletes. Small differences in technique, stiffness of the Achilles tendon, stride frequency, or ground contact mechanics can produce meaningful differences in metabolic cost for the same running speed. The ACSM equations used to estimate METs cannot account for these individual variations. They assume an average locomotion efficiency.
This means that treadmill-derived METs are partially measuring running skill, not just cardiorespiratory and metabolic physiology.
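The effect can be sketched numerically. In the toy model below, two runners have the same true VO₂max, but one spends 15% more oxygen per unit of speed than the standard equation assumes. The `economy_factor` parameter, the 15% figure, and the level-ground running coefficient of 0.2 are all illustrative assumptions of mine, not values drawn from the cited literature.

```python
RESTING_VO2 = 3.5       # mL/kg/min, assumed resting component
HORIZONTAL_COST = 0.2   # assumed running coefficient, mL/kg/min per m/min


def peak_speed(vo2max: float, economy_factor: float) -> float:
    """Level-ground speed (m/min) at which this runner's oxygen cost reaches vo2max.

    economy_factor > 1 means the runner spends more oxygen per unit of speed
    than the population equation assumes (hypothetical parameter).
    """
    return (vo2max - RESTING_VO2) / (HORIZONTAL_COST * economy_factor)


def treadmill_estimated_vo2(speed_m_per_min: float) -> float:
    """What the treadmill equation infers from speed alone, assuming average economy."""
    return HORIZONTAL_COST * speed_m_per_min + RESTING_VO2


if __name__ == "__main__":
    true_vo2max = 50.0
    for label, factor in [("average economy", 1.0), ("poor economy", 1.15)]:
        estimate = treadmill_estimated_vo2(peak_speed(true_vo2max, factor))
        print(label, round(estimate, 1), "mL/kg/min,", round(estimate / 3.5, 1), "METs")
```

In this sketch the less economical runner is credited with roughly 44 mL/kg/min instead of 50, a gap of nearly two METs, even though the two cardiorespiratory systems are identical by construction.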
For some individuals, this distinction matters a great deal. My own experience illustrates the problem. During periods when my measured VO₂max has been in the mid-50s mL/kg/min range (well within what exercise physiology would classify as good aerobic capacity), I have still performed extremely poorly on running tasks relative to peers. In CrossFit programming where workouts required running fixed distances, I would routinely fall far behind the group. Yet when the same workout allowed rowing or cycling equivalents, my performance relative to others improved by absurd amounts. While it can be argued that the “equivalents” were not truly equivalent, the degree of improvement had more to do with the fact that my skill at rowing was higher than my skill at running.
In other words, my aerobic system was capable of producing substantial oxygen consumption, but my running mechanics, tendon recoil, and movement efficiency were poor. As a result, much of my metabolic output was not translated into locomotion, but wasted in inefficiencies.
A treadmill protocol that estimates aerobic capacity from running workload would therefore classify me as less fit than my oxygen consumption would suggest. The treadmill test embeds a biomechanical skill filter between physiology and the final number reported. Direct gas-exchange measurement removes that filter, because it measures oxygen uptake itself rather than inferring it from performance on a specific movement skill.
The epidemiologic literature rarely addresses this distinction directly because large cohort studies cannot measure detailed biomechanics. Instead, they rely on the statistical strength of large numbers. Across thousands of participants, variation in locomotion efficiency tends to average out. The population-level association between estimated fitness and mortality remains visible despite the noise introduced by individual biomechanics. Population-level robustness, however, does not imply individual-level accuracy.
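A toy simulation makes the population-versus-individual point concrete. Every number in it is hypothetical: the true VO₂max distribution, the 4 mL/kg/min spread of estimation error, and the assumption that the error is unbiased and Gaussian are illustrative choices of mine, not values from any cited study.

```python
import random


def simulate(n_people: int = 10_000, error_sd: float = 4.0, seed: int = 0):
    """Return (true, estimated) VO2max lists for a simulated cohort.

    Each estimate is the true value plus unbiased noise that stands in for
    individual variation in locomotion efficiency (hypothetical model).
    """
    rng = random.Random(seed)
    true_values = [rng.gauss(40.0, 8.0) for _ in range(n_people)]
    estimates = [t + rng.gauss(0.0, error_sd) for t in true_values]
    return true_values, estimates


true_values, estimates = simulate()
errors = [e - t for t, e in zip(true_values, estimates)]

# Population level: the average error is close to zero.
mean_error = sum(errors) / len(errors)

# Individual level: share of people misjudged by more than one MET (3.5 mL/kg/min).
share_off_by_one_met = sum(1 for err in errors if abs(err) > 3.5) / len(errors)

if __name__ == "__main__":
    print("mean error (mL/kg/min):", round(mean_error, 2))
    print("share off by more than 1 MET:", round(share_off_by_one_met, 2))
```

Under these assumptions the cohort average is nearly unbiased while a large minority of individuals are off by more than a full MET. That is the pattern described above: an estimate can be adequate for a cohort and still be a poor description of any one person in it.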
When METs on a treadmill are interpreted for a single person, the value reflects a mixture of physiological capacity, technical running skill, breathing mechanics, and neuromuscular coordination. Direct gas-exchange measurement removes much of that ambiguity because it measures oxygen uptake itself rather than inferring it from the external work performed during a specific movement pattern. This distinction becomes important when one asks which metric is closer to the underlying physiology that plausibly influences longevity.
If the biological pathways linking fitness to mortality operate through cardiovascular output, mitochondrial function, metabolic regulation, vascular health, and autonomic balance, then oxygen consumption represents a direct expression of those integrated systems. A treadmill workload estimate, by contrast, sits one step further removed because it measures the mechanical output produced by those systems filtered through the skill of running.
That doesn't invalidate treadmill-derived METs as a useful population metric. It simply means that they should be understood for what they are: an indirect estimate of aerobic physiology mediated by the mechanics of locomotion.
Once that distinction is acknowledged, the claim that METs should replace measured VO₂max becomes difficult to justify scientifically.
And, as noted earlier, the authors of the papers that Topol cites understand exactly these issues. Cardiorespiratory fitness is formally defined in exercise physiology as maximal oxygen consumption measured during cardiopulmonary exercise testing. Gas exchange analysis during graded exercise directly quantifies oxygen uptake and carbon dioxide production, providing the canonical measure of aerobic capacity (Wasserman et al., Principles of Exercise Testing and Interpretation), which can then be estimated using treadmill workload in METs (Kaminsky et al., 2015).
The recent systematic review synthesizing studies that used both measured and estimated cardiorespiratory fitness adopts the same framework, as do the epidemiological outcome studies. Direct measurement of VO₂ is described as the gold standard, while estimated CRF, often expressed in METs, is used when laboratory gas exchange testing isn't feasible in large cohorts (Singh et al., 2024; Mandsager et al., 2018; Kokkinos et al., 2022).
Importantly, none of these papers claim that MET-based estimates are superior to direct oxygen measurement. In fact, they explicitly insist that METs are an estimate used for convenience, because it's easier to implement a research study in a large population using treadmill data rather than measuring oxygen data.
But “more convenient” for researchers doesn’t mean “better for individuals.”
“Free and Accessible” Does Not Mean Accurate
One of Topol’s central claims is that METs should be favored because they are “free, simple, and universally available.” That framing sounds persuasive until one examines how MET values are actually generated and what conditions are required for them to be meaningful.
In the research studies he cites, METs are not produced casually. They come from standardized exercise treadmill testing conducted in clinical environments. Yes, the researchers get to skip acquiring, calibrating, and using all the oxygen-uptake equipment, but that doesn’t mean they just roll on over to the local Planet Fitness and do a survey of treadmill workouts.
In the research lab, the treadmill speed is calibrated, the incline is verified, and the testing protocol follows established staging procedures. The supervising staff ensure that the participant doesn't hold the handrails, prematurely terminate the test, or otherwise distort the workload. These conditions matter because the metabolic equations used to estimate oxygen consumption assume accurate measurements of speed and grade and assume that the participant is performing the locomotion task normally.
Outside that controlled environment, those assumptions frequently break down.
Home and gym treadmills are not laboratory instruments. Their speed calibration drifts over time. Incline mechanisms are often imprecise. Users commonly hold the handrails, which significantly reduces the metabolic cost of the task. Even small deviations from the expected locomotion pattern can alter the relationship between mechanical workload and oxygen consumption.
The claim that METs are therefore “free and accessible” also deserves closer scrutiny. A valid MET measurement in the studies cited by Topol requires a clinical exercise treadmill test, which involves specialized equipment, trained supervision, and a structured protocol. That isn't meaningfully different from the practical requirements of a cardiopulmonary exercise test except that the latter includes gas exchange measurement.
In other words, the comparison being drawn isn't between an expensive laboratory test and a universally accessible metric. It's between a laboratory measurement of oxygen consumption and a laboratory estimate of oxygen consumption derived from treadmill workload.
Once that distinction is made explicit, the accessibility argument loses much of its force. Whatever little force might remain is further diminished by the advancement of consumer-level gas-exchange measurement technology, such as the Calibre Biometric device. Direct VO₂ measurement has historically been inaccessible outside specialized laboratories and large research settings. But that may not remain true for long. If democratization of direct measurement is nigh, then Topol’s argument from convenience collapses entirely.
In any case, future considerations aside, the immediate question is whether the proxy (METs) should be treated as scientifically preferable to the actual measurement when the measurement is available. Nothing in the evidence cited by Topol demonstrates that such an inversion is justified, and on its face the inversion seems rather absurd.
VO₂ Measurement Is Not a Single Number. It Is a Physiologic Dataset.
Up to this point the discussion has focused on measurement fidelity: direct oxygen consumption versus estimates derived from workload or population models. There is another distinction that rarely appears in the public discussion of VO₂max but becomes highly relevant once the goal shifts from population risk stratification to individual decision-making.
True VO₂max testing doesn’t just produce a single final number.
When VO₂ is measured directly during graded exercise, the metabolic cart records breath-by-breath oxygen uptake and carbon dioxide production across the entire test. From those data the test yields a series of physiologic signals that describe how the cardiovascular, respiratory, and metabolic systems respond to increasing workload. VO₂max is only one output from that dataset.
The test also provides interpretive inferences on thresholds between different metabolic systems (creatine/free ATP, anaerobic glycolysis, glycogen, lactic acid shuttle, aerobic glucose, and free fatty acids). CPET also produces the oxygen pulse trajectory, which approximates stroke volume behavior during exercise. It measures ventilatory efficiency, typically expressed as the VE/VCO₂ slope, which reflects how effectively ventilation eliminates carbon dioxide. The respiratory exchange ratio provides evidence that the subject has reached true maximal effort rather than terminating early.
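Several of these quantities are simple functions of the recorded gas-exchange data, which a short sketch can show. The stage values below are invented for illustration (they are not from any real test), the helper names are mine, and the peak RER of 1.10 is a commonly cited maximal-effort criterion that I am treating as an assumption here.

```python
# Hypothetical stage averages: (heart rate bpm, VO2 L/min, VCO2 L/min, VE L/min)
STAGES = [
    (100, 1.0, 0.85, 26.25),
    (130, 1.8, 1.60, 45.0),
    (160, 2.6, 2.60, 70.0),
    (180, 3.2, 3.60, 95.0),
]


def oxygen_pulse(vo2_l_min: float, heart_rate_bpm: float) -> float:
    """Oxygen uptake per heartbeat, in mL per beat."""
    return vo2_l_min * 1000.0 / heart_rate_bpm


def respiratory_exchange_ratio(vco2_l_min: float, vo2_l_min: float) -> float:
    """RER = VCO2 / VO2."""
    return vco2_l_min / vo2_l_min


def least_squares_slope(xs, ys) -> float:
    """Ordinary least-squares slope of ys regressed on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


# Ventilatory efficiency: slope of ventilation (VE) against CO2 output (VCO2).
ve_vco2_slope = least_squares_slope([s[2] for s in STAGES], [s[3] for s in STAGES])

peak_hr, peak_vo2, peak_vco2, _ = STAGES[-1]
peak_o2_pulse = oxygen_pulse(peak_vo2, peak_hr)
peak_rer = respiratory_exchange_ratio(peak_vco2, peak_vo2)
reached_max_effort = peak_rer >= 1.10  # assumed criterion

if __name__ == "__main__":
    print("VE/VCO2 slope:", round(ve_vco2_slope, 1))
    print("peak oxygen pulse (mL/beat):", round(peak_o2_pulse, 1))
    print("peak RER:", round(peak_rer, 3), "max effort:", reached_max_effort)
```

Every one of these outputs requires the breath-level VO₂, VCO₂, and ventilation traces as inputs. A treadmill speed and grade cannot supply any of them.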
These thresholds provide practical anchors for training planning and interventions, but none of this information exists in a treadmill-derived MET value.
A peak MET estimate collapses the entire physiological response to exercise into a single external performance outcome: the highest workload achieved on the treadmill. The number doesn't indicate whether the limitation occurred in the cardiovascular system, the lungs, peripheral muscle metabolism, or simply because the subject stopped due to fatigue or discomfort. It provides no information about metabolic thresholds or ventilatory efficiency and no direct insight into the internal physiology that produced the observed workload.
This difference matters once the conversation moves from “Does fitness predict mortality?” to “How should an individual train?”
The observational studies cited earlier demonstrate that individuals with higher cardiorespiratory fitness tend to have lower mortality risk. They don't prescribe how that fitness should be improved, whereas direct VO₂ measurement allows a more structured interpretation of where an individual currently sits physiologically and how different training intensities might influence that system.
This does not mean that a CPET-guided training program has been proven to extend life. That claim has not been tested. But it does mean that the measurement captures far more of the relevant biology than a workload estimate alone.
For population epidemiology, that may not matter much. A coarse estimate of aerobic capacity is sufficient to detect the broad relationship between fitness and mortality across large cohorts. But for individuals trying to understand their physiology and improve it deliberately, the difference between a single number and a physiologic dataset is substantial.
Does Raising VO₂ by Any Means Improve Longevity?
There's a central claim that no one disputes: individuals with higher cardiorespiratory fitness tend to live longer. Across large observational cohorts, whether fitness is measured directly through oxygen consumption or estimated through treadmill workload, higher fitness levels are consistently associated with lower rates of all-cause mortality and cardiovascular events (Kodama et al., 2009; Mandsager et al., 2018; Kokkinos et al., 2022). Topol’s own citation of a meta-analysis diagram shows that the same result is noted in both MET-derived datasets and direct-measurement datasets.
But what the literature doesn't establish is something much more nuanced. Contrary to Peter Attia’s assertion, it remains an open question whether every method that raises fitness produces an improvement in long-term health outcomes.
The observational studies linking cardiorespiratory fitness to mortality are not intervention trials. They measure baseline fitness and follow participants over time. When researchers report that a one-MET increase in fitness is associated with lower mortality risk, the increase isn't being produced in an individual, but across individuals. Genetics, exercise habits, occupation, body composition, and even breathing mechanics all contribute to the observed variation in cardiorespiratory capacity between individuals. This distinction becomes critical once the conversation turns to optimization.
Improving VO₂max through targeted training is not a single intervention. Different training strategies can raise measured VO₂ through different physiological adaptations. High-intensity interval training can increase maximal oxygen uptake partly by improving cardiac output and mitochondrial density. Moderate continuous training can enhance oxidative capacity and metabolic efficiency. In each case the VO₂ number may rise, but the underlying adaptations are not identical, and the epidemiologic literature doesn't distinguish between these pathways vis-à-vis mortality risk.
That omission matters insofar as Topol’s article is, in substantial part, a reaction against the way Peter Attia and others have elevated VO₂max in public discourse. But rejecting Topol’s MET argument does not require accepting Attia’s framing. Attia’s error is different.
The observational literature supports the claim that higher cardiorespiratory fitness is associated with lower mortality. It doesn't provide comparably strong mortality evidence for every specific intervention designed to optimize VO₂max. General exercise recommendations have broad outcome support. Tailored VO₂max-increasing strategies (especially non-exercise based strategies) don't have that same level of mortality evidence behind them. Conflating those two claims goes further than the evidence allows.
The literature also doesn't support the Attia-subcultural obsession with maximizing the number. Like most physiologic predictors, cardiorespiratory fitness is observed to yield diminishing returns. The most clinically meaningful risk reduction is plausibly concentrated in moving people out of the lowest fitness strata into a moderate range (using exercise).
Beyond that, additional increases in VO₂ or MET capacity deliver progressively smaller mortality benefits, even if they deliver large performance benefits. A large portion of longevity benefit could be captured by achieving guideline-level exercise volume and intensity that improves basic stroke volume, blood pressure, and mitochondrial function, while the fine-grained, protocol-specific tailoring that maximizes VO₂ becomes largely a performance or vanity project rather than a longevity project.
Genetics and upbringing add another layer of uncertainty that the observational category comparisons cannot solve. Baseline fitness is partly heritable, and may be influenced by infant or childhood environment (birth altitude, for instance). Traits affecting oxygen transport, ventilatory control, erythropoiesis, muscle fiber composition, and movement economy all influence measured VO₂ and treadmill-derived MET capacity. If those genetic factors also influence mortality risk through pathways that are not fully mediated by exercise behavior, then part of the observed association between fitness and mortality may not be a causal effect of training. In that scenario, being in a higher VO₂ category because of genetics isn't equivalent to being in that category because one pushed the number upward through an intervention. The association remains real, but the causal interpretation becomes more fragile.
Now, this may also not be the case. I’m not claiming one way or the other; I’m pointing out a common inferential error. Treating observational differences between individuals as if they were automatically achievable benefits of an intervention in each of those individuals is a category error that ignores the many factors that contribute to a metric.
This raises a scientifically important possibility. It's conceivable that some methods of raising measured VO₂ primarily improve test performance without producing the same long-term physiological changes that drive reductions in cardiovascular risk or mortality. Acute ergogenic aids such as caffeine, bicarbonate loading, NMN, or other performance enhancers can temporarily improve exercise output and therefore increase estimated METs or even measured VO₂ during a test session, but that in no way implies that such interventions would extend lifespan. The same logic may apply, in more subtle ways, to certain training approaches that disproportionately affect the specific performance characteristics of the testing modality.
None of this undermines the core finding that fitness matters. The association between higher cardiorespiratory capacity and lower mortality is one of the most consistent relationships observed in exercise epidemiology. But the literature is less clear on what sub-types of cardiovascular exercise create the greatest improvement in longevity.
This is where the debate about METs versus VO₂max begins to look misplaced. The critical uncertainty isn't which metric best represents aerobic capacity. The critical uncertainty is which interventions actually translate into meaningful improvements in long-term health outcomes, and how much of the observational gradient is modifiable versus an index of underlying traits.
Conclusion — What the Evidence Actually Supports
Topol is correct to criticize the cultural fixation on wearable “VO₂ max” numbers. These values are algorithmic predictions generated from population models rather than direct measurements of oxygen consumption. They rely on imperfect sensor inputs and statistical inference. Treating them as precise physiological measurements invites unwarranted confidence.
But he then commits the same kind of error he is criticizing.
His complaint about wearable VO₂ estimates is that they are indirect, model-based approximations presented with more authority than they deserve. That complaint is right. His mistake is failing to recognize that treadmill-derived METs are also indirect, model-based approximations of oxygen consumption.
The very papers Topol cites repeatedly describe direct VO₂ measurement as the reference standard and MET-based fitness estimates as a practical approximation. The parallel is therefore exact in kind and different only in degree.
Wearable VO₂ estimates are further removed from physiology. Treadmill MET estimates are closer. But both are still estimates, and only direct CPET measures oxygen uptake itself. Topol’s argument fails because he correctly objects to one proxy for being a proxy, then turns around and privileges another proxy while writing as though it were something fundamentally different.
The argument that METs should replace VO₂max therefore rests on a conceptual inversion. It elevates the approximation above the quantity being approximated.
None of this diminishes the usefulness of MET-based exercise capacity for epidemiologic research. The proxy performs well enough to capture the population-level relationship between fitness and mortality. That is precisely why it has been widely used in large observational studies. But success as a proxy does not transform an estimate into a superior physiological measurement.
The more consequential scientific uncertainty lies elsewhere. The existing literature demonstrates that higher cardiorespiratory fitness predicts lower mortality risk. It does not establish that every strategy capable of increasing measured fitness (by any method) will also improve outcomes.
That distinction is important because Topol’s article is responding, at least in part, to Peter Attia’s style of VO₂max advocacy. My critique of Topol is not a defense of Attia. Attia is too loose in moving from observational evidence to broad “by any means” intervention rhetoric. The evidence base for general exercise is much stronger than the evidence base for specific VO₂max-optimizing interventions as mortality tools.
The important scientific questions remain open. Which physiological adaptations are most responsible for the protective effects associated with higher fitness (stroke volume, mitochondrial function or density, etc.)? Which training strategies most reliably produce those adaptations? And how much of the observed fitness gradient in mortality is modifiable through intervention, versus reflecting underlying inherited biological variation?
Those questions will not be answered by arguing that METs should replace VO₂max. They will be answered by carefully designed studies that examine how different interventions change the physiology of cardiorespiratory fitness and how those changes translate into long-term health outcomes.
Until then, the most defensible position is straightforward. Wearable estimates should not be confused with measurement. Proxies should not be mistaken for the quantities they approximate. And strong observational correlations should not be treated as proof that optimizing a number, by any means available, will necessarily extend life.
To understand more about how Attia-style inference creates category errors of false optimization, you can download my paper on “The Illusion of Optimization” at: https://riverrockmedical.com/optimization.
References
- Eric Topol. “The Flawed VO₂ Max Craze.” Ground Truths (Substack). Feb 23, 2026.
https://erictopol.substack.com/p/the-flawed-v02-max-craze
- Mandsager K, Harb S, Cremer P, Phelan D, Nissen SE, Jaber W. Association of Cardiorespiratory Fitness With Long-term Mortality Among Adults Undergoing Exercise Treadmill Testing. JAMA Network Open. 2018;1(6):e183605.
https://jamanetwork.com/journals/jamanetworkopen/fullarticle/2707428
- Kodama S, Saito K, Tanaka S, et al. Cardiorespiratory Fitness as a Quantitative Predictor of All-Cause Mortality and Cardiovascular Events in Healthy Men and Women: A Meta-analysis. JAMA. 2009;301(19):2024–2035.
https://jamanetwork.com/journals/jama/fullarticle/1108396
- Kokkinos P, Faselis C, Myers J, et al. Exercise Capacity and Mortality in Black and White Men. Journal of the American College of Cardiology. 2022.
https://pubmed.ncbi.nlm.nih.gov/35926933/
- Kaminsky LA, Arena R, Myers J. Reference Standards for Cardiorespiratory Fitness Measured With Cardiopulmonary Exercise Testing: Data From the Fitness Registry and the Importance of Exercise National Database (FRIEND). Mayo Clinic Proceedings. 2015.
https://pmc.ncbi.nlm.nih.gov/articles/PMC4919021/
- Singh A, et al. Objectively Measured Cardiorespiratory Fitness and Estimated CRF in Relation to Cardiovascular and All-Cause Mortality: Systematic Review and Meta-analysis. 2024.
https://rodin.uca.es/bitstream/handle/10498/36353/OA_2025_0099.pdf
- Kim Y, et al. Cardiorespiratory Fitness, Grip Strength, and All-Cause Mortality in UK Biobank Participants. British Journal of Sports Medicine.
https://bjsm.bmj.com/content/52/6/356
- Byrne NM, Hills AP, Hunter GR, Weinsier RL, Schutz Y. Metabolic Equivalent: One Size Does Not Fit All. Journal of Applied Physiology. 2005;99(3):1112–1119.
https://journals.physiology.org/doi/full/10.1152/japplphysiol.00023.2004
- Apple Watch / Garmin / Fitbit wearable VO₂max validation studies (example):
Shcherbina A, Mattsson CM, Waggott D, et al. Accuracy in Wrist-Worn, Sensor-Based Measurements of Heart Rate and Energy Expenditure in a Diverse Cohort. Journal of Personalized Medicine.
https://pmc.ncbi.nlm.nih.gov/articles/PMC6732081/
- Bent B, et al. Investigating Sources of Inaccuracy in Wearable Optical Heart Rate Sensors. NPJ Digital Medicine.
https://www.nature.com/articles/s41746-020-0226-6
- Studies evaluating smartwatch VO₂max estimation accuracy (Apple Watch / Garmin / Fitbit):
https://pmc.ncbi.nlm.nih.gov/articles/PMC11325102/
- Kuopio Ischemic Heart Disease Risk Factor Study (direct VO₂max and mortality):
Laukkanen JA, et al. Cardiorespiratory Fitness and the Risk of Sudden Cardiac Death. JAMA.
https://jamanetwork.com/journals/jama/fullarticle/193003
- General definition of MET = 3.5 mL O₂/kg/min:
Ainsworth BE, et al. Compendium of Physical Activities: Classification of Energy Costs of Human Physical Activities. Medicine & Science in Sports & Exercise.
https://pubmed.ncbi.nlm.nih.gov/10993420/
- American College of Sports Medicine metabolic equations for treadmill walking/running (basis of MET estimation):
ACSM. ACSM’s Guidelines for Exercise Testing and Prescription.
https://www.acsm.org/read-research/books/acsm-guidelines-exercise-testing-prescription
- General CPET / VO₂max methodology reference:
Wasserman K, Hansen JE, Sue DY, Stringer WW, Whipp BJ. Principles of Exercise Testing and Interpretation. Lippincott Williams & Wilkins.
- Wearable VO₂max mean absolute percentage error studies (7–16% range):
https://pmc.ncbi.nlm.nih.gov/articles/PMC11325102/
- RealClearScience reprint of Topol article:
https://www.realclearscience.com/2026/02/24/the_flawed_vo2_max_craze_1166604.html