Submaximal Walking Tests: A Review of Clinical Use
ABSTRACT
Though graded exercise testing is the gold standard for assessing cardiorespiratory fitness, submaximal exercise testing is also useful for assessing cardiorespiratory status and functional capacity when maximal testing is not feasible. Submaximal walking tests are advantageous because they carry less risk, cost less, and require less time and equipment, and because walking is a familiar activity that is easy to do in most environments. A number of submaximal walking tests exist for both overground and treadmill walking. Regression equations to predict V̇o2max values based on walking time, distance, and other variables that influence exercise tolerance have been developed for some submaximal tests, including the Rockport Fitness Walking Test and the Single-Stage Treadmill Walk Test. The 6-Minute Walk Test is a common test used in clinical populations to predict prognosis and assess change in functional capacity after intervention. Determining which submaximal walking test to use depends on purpose and setting, subject characteristics, equipment availability, space, and time. This review will provide clinicians with an overview of submaximal walking test protocols and provide reference equations and minimal clinically important difference values to interpret results.
INTRODUCTION
Assessing cardiorespiratory fitness and functional capacity in clinical and fitness settings is done for diagnostic and prognostic guidance or to determine the impact of interventions. Maximal graded exercise testing is the gold standard for assessing cardiorespiratory fitness, though it requires costly equipment, time, and skilled personnel to administer and interpret. It also may be contraindicated or unsafe for persons with cardiac or other medical conditions (1), may not provide accurate results in individuals unaccustomed to high intensity exercise (2), and can be influenced by individual motivation (3,4).
Submaximal walking tests can assess cardiorespiratory fitness and/or functional capacity when maximal exercise testing is not feasible. These tests carry less risk, cost less, do not require equipment, and are easy to administer to individuals or groups, and their results are not influenced by motivation for maximal exercise (5). Walking may be an ideal test mode because it is a natural activity for most people. For some submaximal tests, regression equations have been developed to predict V̇o2max using variables including age, sex, heart rate (HR), body weight, and walk time or distance (6). Others, such as the 6-Minute Walk Test (6MWT), estimate functional capacity based on distance walked (6MWD), have reference values, can predict health outcomes in clinical populations (7,8), and have established minimal clinically important difference (MCID) values to determine if meaningful health-related changes have occurred with time or treatment (9–11).
There are no current literature reviews to guide clinicians in determining the most appropriate submaximal walking test based on setting, participant characteristics, time, space, financial constraints, available equipment, and purpose of testing (e.g., diagnostic, to estimate V̇o2max, or functional capacity), or that provide aggregate data on validity or interpretation of test results. Therefore, the purpose of this literature review (12) is to provide clinicians with an overview of common submaximal walking test protocols, review their application in healthy populations and in clinical populations commonly treated by clinical exercise physiologists, and provide reference equations (RES) and MCID values to interpret results. Table 1 provides an overview of protocols for common submaximal walking tests.

Review Process
A reference librarian designed the search strategy in consultation with the primary author to (a) determine which submaximal walking tests were most cited in published literature, and (b) identify the clinical populations for which these submaximal tests were being investigated. See Supplemental Material for full search details. PubMed and MEDLINE (Ebsco) were searched January 2021 to July 2021 to select relevant submaximal tests. Because the search strategy targeted originating articles, searches were not limited by publication date. Three submaximal tests with validity and reliability data in both healthy and clinical populations were identified: the 6MWT, the Rockport Fitness Walking Test (RFWT), and the Single-Stage Treadmill Walk Test (SSTW), which is often called Ebbeling's test. Targeted searching for each test yielded a list of relevant studies. Originating articles for each submaximal test reviewed and studies investigating psychometric properties of those tests were included. Searches were supplemented with lists of citing works from originating articles and reference scanning of review articles to confirm the comprehensiveness of search results.
Assessment of V̇o2max Predictive Test Accuracy
Submaximal walking tests use linear regression equations, built from predictor variables such as age, HR, and walk time, to estimate V̇o2max values. Though no specific criteria exist to define an accurate predictive test equation, several statistical values help interpret the accuracy of a prediction. The Pearson correlation coefficient (r) shows the strength of the association between observed and predicted values, though it provides no detail on the accuracy of the measure. A correlation between 0.70 and 1.0 demonstrates a strong relationship. The R² measure, called the goodness of fit measure, is the proportion of variance in the dependent variable (here, measured V̇o2max) that is explained by the linear model (14). For example, an R² of 0.60 indicates that the model explains 60% of the variance in measured V̇o2max; however, it does not indicate the correctness of the regression model. The standard error of the estimate (SEE), total error, and analysis of the residuals (residual = measured V̇o2max − predicted V̇o2max) evaluate the accuracy of prediction. SEE measures the spread of the residuals around the regression line. A smaller SEE means better predictive validity, though what constitutes a good SEE is arbitrary and influenced by the level of acceptable error in the physiological measure. The residuals should be normally distributed with a mean close to zero. Plotting residuals against predicted V̇o2max values should show random dispersion if there is no bias in the prediction equation (14). Total error, which considers the systematic difference between measured and predicted V̇o2max values, is larger than SEE when there is systematic error in the prediction equation (15). Bland-Altman plots, which graph the mean of the predicted and measured values against their difference, provide a visual check for systematic bias in the measurement (16).
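The statistics described above can be sketched in a few lines of Python. This is a minimal illustration, not a method taken from any of the cited studies: the function name and the sample data are hypothetical, SEE is computed with one common cross-validation convention (SD of measured values × √(1 − r²)), total error is computed as the root mean squared residual, and the Bland-Altman summary uses the usual mean difference ± 1.96 SD limits of agreement. Individual studies may use other conventions.

```python
import math
from statistics import mean, stdev


def prediction_accuracy(measured, predicted):
    """Summarize how well predicted VO2max values match measured values.

    Both arguments are equal-length lists in mL/kg/min.
    """
    n = len(measured)
    # Residual as defined in the text: measured minus predicted.
    residuals = [m - p for m, p in zip(measured, predicted)]

    # Pearson correlation between measured and predicted values.
    mean_m, mean_p = mean(measured), mean(predicted)
    s_mp = sum((m - mean_m) * (p - mean_p) for m, p in zip(measured, predicted))
    s_mm = sum((m - mean_m) ** 2 for m in measured)
    s_pp = sum((p - mean_p) ** 2 for p in predicted)
    r = s_mp / math.sqrt(s_mm * s_pp)

    # SEE (assumed convention): SD of measured values scaled by sqrt(1 - r^2).
    see = stdev(measured) * math.sqrt(max(0.0, 1.0 - r * r))

    # Total error (assumed convention): root mean squared residual,
    # which also captures any constant offset between the two measures.
    total_error = math.sqrt(sum(e * e for e in residuals) / n)

    # Bland-Altman summary: mean difference (bias) and 95% limits of agreement.
    bias = mean(residuals)
    sd_diff = stdev(residuals)
    return {
        "r": r,
        "r_squared": r * r,
        "see": see,
        "total_error": total_error,
        "mean_residual": bias,
        "limits_of_agreement": (bias - 1.96 * sd_diff, bias + 1.96 * sd_diff),
    }


# Hypothetical data: the equation overpredicts every participant by 2 mL/kg/min.
measured = [30.0, 35.0, 40.0, 45.0, 50.0]
predicted = [m + 2.0 for m in measured]
print(prediction_accuracy(measured, predicted))
```

In this hypothetical case the correlation is perfect and SEE is zero even though every prediction is off by 2 mL·kg−1·min−1; only total error and the mean residual expose the offset, which is why correlation alone says little about accuracy.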
Prediction equations should be developed on a diverse group to enhance generalizability and be cross-validated on a subgroup to determine accuracy. Equations should also be applied only to populations with characteristics similar to those of the development population. Test-retest reliability is important, especially if the test is used to measure changes over time. It is determined using the intraclass correlation coefficient (ICC), with good reliability indicated by values closer to 1.0 and poor reliability indicated by values less than 0.5. Not all studies provide comprehensive results to allow a full understanding of application. Table 2 presents prediction equations and validation data for the tests described below.
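As a rough illustration of how a test-retest ICC might be computed, the sketch below implements one specific form, the two-way random-effects, absolute-agreement, single-measurement ICC, often labeled ICC(2,1). The choice of this form is an assumption; the studies reviewed here do not all report which ICC model they used, and other forms give different values. The function name and the sample data are hypothetical.

```python
def icc_2_1(scores):
    """ICC(2,1): two-way random effects, absolute agreement, single measure.

    scores: list of rows, one row per participant, each row holding the
    same number of repeated trials (for test-retest, two trials).
    """
    n = len(scores)       # participants
    k = len(scores[0])    # trials per participant
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    # Two-way ANOVA sums of squares.
    ss_rows = k * sum((rm - grand) ** 2 for rm in row_means)
    ss_cols = n * sum((cm - grand) ** 2 for cm in col_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)                # between participants
    ms_cols = ss_cols / (k - 1)                # between trials
    ms_error = ss_error / ((n - 1) * (k - 1))  # residual

    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


# Hypothetical test-retest VO2max estimates (mL/kg/min) for five participants.
trials = [[38.2, 39.0], [45.1, 44.3], [29.8, 31.0], [51.5, 50.9], [41.0, 42.2]]
icc = icc_2_1(trials)
print(round(icc, 2), "poor" if icc < 0.5 else "not poor")
```

The only interpretive threshold applied here is the one stated above (values below 0.5 indicate poor reliability).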


ROCKPORT FITNESS WALKING TEST
Overview
Originally developed by Kline et al. (6), this test requires participants to walk overground as fast as possible while maintaining a consistent pace for 1 mile on a measured, flat 1-mile surface. The RFWT was developed on 174 healthy individuals ages 30 to 69 and cross-validated on a similar group of 169 participants. Both sex-specific and generalized regression equations were developed to estimate V̇o2max using age, body weight, time to complete walk, and HR as independent variables (6). Negligible differences in SEE between sexes were found, and authors concluded the use of a sex-specific equation was not warranted. For the development group, the generalized equation for estimating V̇o2max in mL·kg−1·min−1 reported r = 0.88 and SEE = 5.0 mL·kg−1·min−1. Cross-validation by decade of age of the generalized equation yielded correlations ranging from 0.74 to 0.90 (SEE = 2.4–5.2 mL·kg−1·min−1). There was no analysis of residuals reported. The RFWT demonstrates good test-retest reliability with ICCs for V̇o2max estimate in mL·kg−1·min−1 ranging between 0.73 and 0.97 (17,22,23).
Greenhalgh and colleagues (23) validated the RFWT generalized equation on college students and found it accurately predicted V̇o2max not only when using walk time from the 1-mile walk (r = 0.84, SEE = 4.03 mL·kg−1·min−1, mean residual = −0.36 mL·kg−1·min−1), but also when using quarter-mile time to estimate 1-mile walk time (quarter-mile time × 4) (r = 0.81, SEE = 4.83 mL·kg−1·min−1, mean residual = 1.59 mL·kg−1·min−1). This may be beneficial if testing time is limited.
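The quarter-mile shortcut described above amounts to multiplying the quarter-mile time by 4 and entering the result into the generalized equation. The sketch below assumes the commonly cited form of the Kline et al. generalized equation (body weight in pounds, time in decimal minutes, sex coded 1 for male and 0 for female); these coefficients are not reproduced from this review and should be checked against Table 2 and the original article before use. The function names and the example participant are hypothetical.

```python
def mile_time_from_quarter(quarter_mile_time_min):
    """Estimate 1-mile walk time as quarter-mile time x 4 (minutes)."""
    return quarter_mile_time_min * 4


def rfwt_vo2max(weight_lb, age_yr, is_male, mile_time_min, hr_bpm):
    """Estimate VO2max (mL/kg/min) from the RFWT generalized equation.

    Coefficients are the commonly cited form of the Kline et al. equation
    (an assumption here; verify against the original source).
    """
    sex = 1 if is_male else 0
    return (
        132.853
        - 0.0769 * weight_lb
        - 0.3877 * age_yr
        + 6.315 * sex
        - 3.2649 * mile_time_min
        - 0.1565 * hr_bpm
    )


# Hypothetical participant: 170 lb, 40-year-old male, HR 140 bpm,
# quarter-mile time of 3.75 min (estimated mile time of 15 min).
mile_time = mile_time_from_quarter(3.75)
print(round(rfwt_vo2max(170, 40, True, mile_time, 140), 1))
```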
Dolgener et al. (20) and George et al. (21), conversely, reported that the RFWT generalized equation systematically overestimated V̇o2max in untrained college-aged students. In each of these studies, measured mean V̇o2max values were lower than those of the population used by Kline et al. (6) to develop the equation, which may explain the overestimation.
Dolgener et al. (20) developed new equations for the RFWT in a homogeneous group of college students that yielded reasonable results for the generalized equation predicting absolute (r = 0.84, SEE = 0.40 L·min−1) and relative (r = 0.58, SEE = 2.44 mL·kg−1·min−1) V̇o2max values in a cross-validation sample. George et al. (21) validated the new Dolgener generalized equation (20) in a group of similar college-aged participants and found it valid when using either 1-mile walk time or quarter-mile walk time adjusted to estimate 1-mile walk time. The Dolgener equation (20) had poor accuracy in predicting V̇o2max in high school students with a mean age 4 years younger than that of the equation development population (18).
Fenstermaker et al. (17) reported good reliability and validity of the RFWT generalized equation from Kline et al. (6) in a small sample of females >65 years. More recently, Weiglein et al. (22) found the RFWT accurately predicted V̇o2max in male United States Air Force members, with a correlation of 0.81 and a mean residual of 1.1 mL·kg−1·min−1, close to 0, indicating it was an accurate predictor in this homogeneous sample. The generalized equation by Kline et al. (6) overpredicted V̇o2max by 19% in a sample of adults with developmental delay (19). Physiological differences in HR response in this population were speculated to contribute to the inaccuracy (19).
Treadmill RFWT
The RFWT underestimated V̇o2max when healthy adults were tested using a nonmotorized curved treadmill (24). Similarly, Pober et al. (25) reported the RFWT underpredicted V̇o2max values when 304 moderately fit middle-aged male and female participants were tested on a motorized treadmill walking at a self-selected pace maintained for 1 mile. Pober et al. (25) developed a regression equation, and cross-validation on a subset of the sample showed good accuracy (r = 0.87, SEE = 4.7 mL·kg−1·min−1, mean residual value of 0.96 mL·kg−1·min−1). These results demonstrate that prediction equations should be specific to the mode of activity (e.g., overground vs. treadmill walking). The equation from Pober et al. (25) may be beneficial in fitness settings where aerobic conditioning and testing are often performed on treadmills. This equation has not been validated in clinical populations and is not appropriate for use in them.
RFWT Clinical Bottom Line
The RFWT is a simple, inexpensive test ideal for use in healthy individuals and easy to administer to groups. The original prediction equation is reasonably accurate in predicting V̇o2max values when applied to populations and settings similar to those of its development (healthy adults, ages 30 to 69, overground). Equations for testing on treadmills (25) or in younger individuals (20) are available and are more appropriate in those cases. A limitation of the RFWT is that, to date, none of the developed equations have been validated in clinical populations; therefore, the test should be used only with healthy individuals.
SUBMAXIMAL SINGLE-STAGE TREADMILL WALK TEST (EBBELING TEST)
Overview
Ebbeling et al. (13) developed a submaximal treadmill walking test in which participants walked at a “brisk but comfortable” self-selected pace of 2.0, 3.0, 4.0, or 4.5 mph, to achieve a HR of 50% to 70% of age-based maximum, for three 4-minute stages at 0%, 5%, then 10% grade. This test was developed with 117 healthy males and females ages 20 to 59 years. Regression equations, developed for each stage, showed good fit (R² = 0.83–0.94, SEE = 4.72–5.25 mL·kg−1·min−1), and predicted vs. measured V̇o2max had high correlation (r = 0.93–0.96) in a cross-validation sample. The recommended final equation (Table 2) was based on data from only stage 2 for simplicity, shortening the test to a 4-minute warm-up followed by the 4-minute walk at 5% grade. Mitros et al. (27) found fair correlation with measured V̇o2max using the SSTW protocol and equation (30) with middle-aged women, though the mean difference between estimated and measured peak values was 6.7 mL·kg−1·min−1, with bias toward overestimation. The narrower age range and fitness level of participants compared to the original development group likely contributed to the differences seen.
Waddoups (30) tested the SSTW equation at the low (50%) and high (70%) ranges of recommended maximal HR in a slightly younger overall sample of 22 participants. They found that the equation underestimated V̇o2max values at low HR range and overestimated it at higher HR range by about the same amount (3.5 mL·kg−1·min−1). If using SSTW test to assess change, it is recommended that the same age-based HR percentage be used for each test with varying treadmill speed as needed to minimize this error.
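The two calculations involved in the SSTW can be sketched as follows: the target HR window (50% to 70% of age-based maximum) and the V̇o2max estimate from the 5% grade stage. Both parts rest on assumptions that are not stated in this review: age-based maximal HR is taken as 220 − age, and the coefficients are the commonly cited form of the Ebbeling et al. equation (speed in mph, sex coded 1 for male and 0 for female), which should be verified against Table 2 and the original article. The function names and the example participant are hypothetical.

```python
def target_hr_range(age_yr, low=0.50, high=0.70):
    """Return the (low, high) target HR in bpm for the SSTW.

    Assumes age-based maximal HR = 220 - age.
    """
    hr_max = 220 - age_yr
    return low * hr_max, high * hr_max


def sstw_vo2max(speed_mph, hr_bpm, age_yr, is_male):
    """Estimate VO2max (mL/kg/min) from the 4-minute stage at 5% grade.

    Coefficients are the commonly cited form of the Ebbeling et al.
    equation (an assumption here; verify against the original source).
    """
    sex = 1 if is_male else 0
    return (
        15.1
        + 21.8 * speed_mph
        - 0.327 * hr_bpm
        - 0.263 * speed_mph * age_yr
        + 0.00504 * hr_bpm * age_yr
        + 5.98 * sex
    )


# Hypothetical participant: 40-year-old female walking 3.0 mph, HR 120 bpm.
low_hr, high_hr = target_hr_range(40)
print(low_hr <= 120 <= high_hr)  # is the test HR inside the target window?
print(round(sstw_vo2max(3.0, 120, 40, False), 1))
```

Recording the percentage of age-based maximal HR reached at each visit makes it easier to hold that percentage constant across repeat tests, as recommended above.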
Nemeth et al. (26) developed a V̇o2max prediction equation with 86 overweight children ages 11 to 14 years using SSTW protocol (Table 2). This equation was cross-validated on a similar group of 27 children and was accurate predicting V̇o2max (r = 0.85, SEE = 271 mL·min−1, median deviation from observed values 6.8%). However, there was large individual variability making it more appropriate for estimating mean group values due to large error margin with individual application.
Francis et al. (28) compared Nemeth's equation to the original Ebbeling equation in adolescents with Type 1 diabetes mellitus. Both equations underpredicted V̇o2max values, with Ebbeling's equation error being larger and systematically underpredicting to a greater extent in unfit females. Similarly, Risum et al. (29) looked at the validity and reliability of using the SSTW protocol in 58 children ages 10 to 16 years with juvenile idiopathic arthritis. Criterion validity was acceptable at a group level (ICC = 0.71), but not at an individual level (ICC = 0.55) with no systematic bias. These findings are not surprising considering Ebbeling's equation was not developed on, nor validated in, adolescents.
Limited reliability data exist for the SSTW test. Mitros et al. (27) and Risum et al. (29) reported acceptable test-retest reliability for V̇o2max estimation, with ICCs of 0.95 and 0.91, respectively, and Risum et al. (29) reported an interrater reliability coefficient of 0.96 in children with juvenile idiopathic arthritis.
SSTW Clinical Bottom Line
Treadmill tests are useful when space is limited, and treadmill walking is a familiar form of walking for many. The SSTW test and equation originally developed by Ebbeling et al. (13) are useful only in populations similar to the development population: healthy males and females ages 20 to 59 years. The test can be used to assess changes in fitness after intervention if the same relative maximal HR percentage is used both before and after training (30). A new equation developed using the SSTW in overweight children is valid, though reliability and sensitivity to change in fitness have not been studied (26). Evidence for the validity of this test in clinical populations is limited, and further study is warranted.
TESTS OF FUNCTIONAL CAPACITY
6MWT Overview
Originally developed by Guyatt et al. (5), standardized by the American Thoracic Society in 2002 (7), and updated in 2014 (9), the 6MWT is a submaximal walking test commonly used to measure and detect change in functional capacity in clinical populations. Developed to assess functional capacity in individuals with cardiopulmonary diseases, it consists of walking as “far as possible” for 6 minutes on a straight, 30-m, flat, hard surface between 2 cones. Standardized instructions and feedback at 1-minute intervals are given, and participants can stop and rest as needed while the timer continues to run (8). Testers should give only the minimum assistance required during the walk and, if supervision is needed, should walk behind patients to avoid influencing pace. The main outcome measure of the 6MWT is distance walked, typically measured in meters. The test is safe and feasible in clinical populations (9,32–34). Because of large variability, performing the 6MWT on children younger than 5 years is not recommended (35). Absolute contraindications for the 6MWT include unstable angina and myocardial infarction within a month of testing (7). Though considered a submaximal test, the 6MWT elicits a maximal exercise response in some clinical populations with severe disease (e.g., chronic obstructive pulmonary disease) (36–38), and these individuals should be assessed for contraindications to maximal exercise testing (1).
The 6MWT is popular because it is easy to administer, walking is familiar to most people, and abundant data have been published for many clinical populations (9–11,33,39–41). The 6MWD demonstrates strong correlation with V̇o2max in healthy adults and in several clinical conditions (33,40,42,43). Details on its use as a predictor of morbidity, mortality, and prognosis are abundant (9,33,39,44–46), as are details on validity and reliability (8,11,34,41,47–49). It is beyond the scope of this review to cover these; instead, we focus on clinical administration and interpretation.
Testing Methodology
The 6MWT has high reproducibility when the standardized protocol is followed (7). Modifying verbal feedback impacts 6MWD (5,48,49), but eliminating verbal feedback did not alter 6MWD in a group of adults with chronic obstructive pulmonary disease (52). Small modifications in the standard feedback did not alter 6MWD but resulted in a small, statistically significant difference in rating of perceived exertion (53). Walking in a continuous path that is circular, oval, or square increases distance compared to a straight 30 m path where turning is required (54,55). Shortening the walkway from 30 m, typically to 20 or 10 m, results in a shorter 6MWD (46,56–60). RES exist to adjust test results when space is limited and the 6MWT is done on a 10 m or 20 m walkway (46,61,62). Direction of turning (dominant or nondominant direction) does not seem to influence 6MWD (59). Performing the test outside vs. inside seems to produce similar results (63).
Studies have shown that performing the 6MWT on a treadmill compared to overground results in shorter distances walked (64–67), thus 6MWD from treadmill testing cannot be interchanged with overground results. Treadmill 6MWTs have shown good test-retest reliability in healthy adults, patients after cardiac surgery, and for persons in cardiac rehabilitation (66–68). Treadmill walking may be useful when space is limited, for patients who must be isolated, or if close monitoring is needed.
Learning Effect
There is a practice or learning effect with the 6MWT. A large systematic review noted a mean 23-m learning effect between the first and second 6MWT in individuals with chronic respiratory disease (9). Hernandes et al. (69) similarly found a mean learning effect of 27 m in patients with chronic obstructive pulmonary disease, with 82% of the 1514 participants walking farther on their second 6MWT. Others have found that learning affects 6MWD across 1 to 3 tests in both healthy individuals and those with health conditions (40,55,70–72). The magnitude of the learning effect varies with the type and severity of clinical pathology, with those walking the shortest distances or with the greatest impairments displaying less of a practice effect (34,54,69–71). This practice effect was not found in healthy children ages 6 to 12 (49) or in adults post stroke (73). The impact of practice seen in many studies appears to last for 2 to 3 months (74,75). When using the 6MWT to evaluate changes over time, performing 2 tests and taking the better of the 2 is recommended to address the learning effect (8,72), though in children <12 or in those with severe impairments who walk short distances, 2 tests may not be warranted (41,51).
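The two-test recommendation above is simple to apply in practice. The sketch below is illustrative only; the function names and the example distances are hypothetical.

```python
def learning_effect_m(first_6mwd_m, second_6mwd_m):
    """Change in distance from the first to the second test, in meters."""
    return second_6mwd_m - first_6mwd_m


def baseline_6mwd_m(trial_distances_m):
    """Use the longest trial as the baseline when 2 tests are performed."""
    return max(trial_distances_m)


# Hypothetical patient: 402 m on the first test, 425 m on the second.
trials = [402, 425]
print(learning_effect_m(trials[0], trials[1]))
print(baseline_6mwd_m(trials))
```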
6MWD Reference Values and Reference Equations
Numerous prediction equations, often called RES, exist to predict 6MWD based on variables that impact functional gait, including weight, height, age, sex, and leg length (35,61,76–86). These allow clinicians to determine if an individual's or group's 6MWD falls within expected norms or reference values. The abundance of equation options makes it difficult for clinicians to determine the ideal RES to use. Multiple studies have assessed the efficacy of existing RES in different populations (80,81,87). A recent systematic review by Mylius et al. (35) found a wide range in recommended within-age-group reference values across 22 studies in healthy children. Alameri et al. reported that the most commonly used RES developed on healthy adults overestimated walk distance in adults in Saudi Arabia (81). Cultural, methodological, and ethnic differences likely contributed to the variation seen. It is recommended that country-based RES be used or developed across healthy and clinical populations (8). Further, existing reference values and RES should be used only on populations with demographics similar to those on which they were developed. RES developed on healthy participants should not be applied to clinical populations (14). A substantial body of literature on RES, reference values, and the demographics of their developmental groups is available (35,77,87–90).
Equations to predict V̇o2max from 6MWD are available for healthy adults and children (42,61,68,76,77,90,91) and for a number of clinical conditions (43,92–94). Interestingly, the most commonly used equations (77) were developed before standardization of the current 6MWT protocol and thus should be applied with caution. In clinical populations, many of these equations show large predictive error, more so in clinical conditions where systems other than the cardiovascular system may influence gait. This limits the use of these equations for individual point estimates of V̇o2max (43,92). Further study to develop and cross-validate equations on larger samples is needed.
Interpretation of Change
A common metric to determine meaningful change from intervention is the MCID. The MCID is “the smallest difference in score in the domain of interest which patients perceive as beneficial and which would mandate, in the absence of troublesome side effects and excessive cost, a change in patient management” (95). This differs from minimal detectable change, which is the amount of change required to exceed measurement error and does not always reflect clinical relevance. Many methods for calculating MCID exist, including distribution-based methods, which use the statistical variance of the measure, and anchor-based methods, in which change in 6MWD is linked to another clinical criterion (the anchor) that marks change (14). No consensus on the best way to determine MCID exists (96). MCID values have been published for the 6MWT for many clinical populations so clinicians can assess effectiveness of interventions or, in some cases, measure deterioration with disease progression (48,97). Table 3 provides 6MWT MCID values for common clinical populations published since 2002, when the 6MWT protocol was standardized. Pooled systematic review data are listed where noted instead of individual studies.
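To make the distinction between minimal detectable change (MDC) and MCID concrete, the sketch below computes an MDC with one widely used distribution-based convention (standard error of measurement = SD × √(1 − ICC); MDC95 = 1.96 × √2 × SEM) and then compares an observed change in 6MWD against both thresholds. These formulas are an assumption and are not taken from this review; the function names and all numeric inputs are hypothetical and do not come from Table 3.

```python
import math


def sem(sd_baseline, icc):
    """Standard error of measurement from baseline SD and test-retest ICC."""
    return sd_baseline * math.sqrt(1.0 - icc)


def mdc95(sem_value):
    """Minimal detectable change at 95% confidence (assumed convention)."""
    return 1.96 * math.sqrt(2.0) * sem_value


def interpret_change(change_m, mdc_m, mcid_m):
    """Classify the magnitude of a change in 6MWD (meters)."""
    magnitude = abs(change_m)  # applies to improvement or deterioration
    if magnitude < mdc_m:
        return "within measurement error"
    if magnitude < mcid_m:
        return "detectable but below MCID"
    return "detectable and at or above MCID"


# Hypothetical inputs: baseline SD 50 m, ICC 0.91, MCID 30 m, change of +45 m.
mdc = mdc95(sem(50.0, 0.91))
print(round(mdc, 1), interpret_change(45, mdc, 30.0))
```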

Limitations of 6MWT
The 6MWT does not provide diagnostic detail regarding functional limitations. Because it is self-paced, motivation may influence results, though complying with the standardized instructions limits this factor (49). It demonstrates a ceiling effect in those with milder disease status, and using a more sensitive marker for cardiovascular change may be warranted in those cases (111–113).
6MWT Clinical Bottom Line
The 6MWT is a useful clinical tool to assess functional capacity across the lifespan in both healthy and clinical populations. It requires the use of many body systems, though, and does not differentiate contributions of each system to functional status. It is safe, standardized, and both valid and reliable when performed according to protocol.
CONCLUSIONS
Submaximal walking tests are practical for use in a variety of settings as valid indicators of functional capacity and V̇o2max. Both the RFWT and the SSTW have established equations to predict V̇o2max for different age ranges. Both were designed for use in healthy adults and have limited application to clinical populations. The 6MWT has established reference values, MCID values, and thresholds for prognosis for healthy and numerous clinical populations, which makes it useful to clinicians seeking to interpret results and determine the impact of interventions.