St Joseph's Hospital Cardiac Rehabilitation & Secondary Prevention Program, London, Ontario, Canada; Lawson Health Research Institute, London, Ontario, Canada; Schulich School of Medicine and Dentistry, Western University, London, Ontario, Canada
St Joseph's Hospital Cardiac Rehabilitation & Secondary Prevention Program, London, Ontario, Canada; Lawson Health Research Institute, London, Ontario, Canada
Background
Mediated by outcomes such as improved exercise capacity, cardiac rehabilitation (CR) reduces morbidity and mortality. For accuracy, an individual CR patient's change must be measured reliably, an issue not typically considered in practice. Drawing from psychometric theory, we calculated reliable change indices (RCIs) to measure individual CR patients’ true clinical change in exercise capacity, anxiety, and depression, apart from change due to error and test practice/exposure.
Methods
Indirectly calculated exercise capacity (peak metabolic equivalents [METs]) and psychological symptoms were each measured twice, 1 week apart, by administering treadmill tests or the Hospital Anxiety and Depression Scale (HADS) to separate samples of 35 (mean age: 59.0 years; 6 women) and 96 (mean age: 64.4 years; 32 women) CR patients, respectively. Using test-retest reliability and mean difference scores from these samples to estimate error and practice/exposure effects, we calculated RCIs for a separate cohort (n = 2066; mean age: 62.0 years; 533 women) who completed 6-month CR, and compared change distributions (worsened/unchanged/improved) based on critical RCIs, mean and percent changes, cut-off scores, and standard deviations.
Results
Practice/exposure effects were nonsignificant, except the mean HADS anxiety score decreased significantly (P ≤ 0.013; d = 0.17, small effect). Test-retest reliabilities were high (METs r = 0.934; HADS anxiety score r = 0.912; HADS depression score r = 0.90; P < 0.001). Among 2066 CR patients, RCI distributions differed (P < 0.001) from those of most other change criteria.
Conclusions
Change ascertainment depends on criterion choice. A Canadian Cardiovascular Society CR quality indicator of increase by 0.5 MET may be too small to assess individuals’ functional capacity change. RCIs offer a pragmatic approach to benchmarking reliable change frequency, and pending further validation, could be used for feedback to individual patients.
Summary
Background
Through outcomes such as improved exercise capacity, cardiac rehabilitation (CR) reduces morbidity and mortality. To be accurate, the change of an individual CR patient must be measured reliably, an aspect that is generally not considered in practice. Drawing on psychometric theory, we determined reliable change indices in order to measure the true individual clinical change of CR patients in exercise capacity, anxiety, and depression, independent of error and of test practice/exposure.
Methodology
Indirectly calculated exercise capacity (peak metabolic equivalents [METs]) and psychological symptoms were measured twice, 1 week apart, by means of treadmill tests or the Hospital Anxiety and Depression Scale (HADS), administered respectively to samples of 35 CR patients (mean age: 59.0 years; 6 women) and 96 CR patients (mean age: 64.4 years; 32 women). Using test-retest reliability and the mean differences in scores from these samples to approximate error and practice/exposure effects, we determined reliable change indices for a separate cohort (n = 2066; mean age: 62.0 years; 533 women) that completed 6-month CR, and compared the distribution of changes (worsened/unchanged/improved) based on critical reliable change indices, mean and percent changes, cut-off scores, and standard deviations.
Results
Practice/exposure effects were not significant, except for a decrease in the mean HADS anxiety score (P ≤ 0.013; d = 0.17, small effect). Test-retest reliability was high (METs, r = 0.934; HADS anxiety score, r = 0.912; HADS depression score, r = 0.90; P < 0.001). Among the 2066 CR patients, the distribution of reliable change indices differed (P < 0.001) from that of most other change criteria.
Conclusions
The ascertainment of change depends on the choice of criterion. An increase of 0.5 MET, per the CR quality indicator of the Canadian Cardiovascular Society, may be too small to assess individuals' change in functional capacity. Reliable change indices offer a pragmatic approach for benchmarking the frequency of reliable change and, pending further validation, could be used as feedback provided to individual patients.
Cardiac rehabilitation (CR) improves morbidity and mortality among cardiovascular patients,
Depression as a risk factor for poor prognosis among patients with acute coronary syndrome: systematic review and recommendations: a scientific statement from the American Heart Association.
Behavioural and psychosocial issues in cardiovascular disease. In: Stone JA, Arthur H, Suskin N. Canadian Guidelines for Cardiac Rehabilitation and Cardiovascular Disease Prevention: Translating Knowledge into Action. 3rd ed. Winnipeg, Manitoba: Canadian Association of Cardiac Rehabilitation; 2009.
Additional effects of psychological interventions on subjective and objective outcomes compared with exercise-based cardiac rehabilitation alone in patients with cardiovascular disease: a systematic review and meta-analysis.
Comparisons of means, for example, of a CR cohort at discharge vs entry, might elucidate programmatic outcomes while revealing little about individual participants. Clinically, therefore, consistent with personalizing health interventions, it is important to determine whether individual CR patients improve meaningfully in exercise capacity and psychological symptoms.
Suppose an individual patient's exercise capacity increased from 6.5 to 7.5 metabolic equivalents (METs; 1 MET = 3.5 mL O2/kg body weight/min), that is, by 1.0 MET or 15%, from CR entry to discharge. This increase matches a mean change measured directly with cardiopulmonary exercise testing, associated with fewer clinical events. It is also double the 0.5-MET pre-post CR exercise capacity quality indicator of the Canadian Cardiovascular Society (CCS-QI).
A critical question is whether such an individual's performance reflects truly improved exercise capacity, over and above test practice and measurement error.
because they may reflect measurement error or prior test exposure/practice effects, in addition to clinically relevant change, such as that from treatment, recovery, or deterioration.
We address the question of individual CR participants’ change in exercise capacity and psychological symptoms through an innovative application of reliable change methodology from psychometric theory. Reliable change indices (RCIs) can account for error and practice in difference scores, potentially facilitating measurement of true clinical change of individual CR patients, offering a direct clinical application.
Measuring Exercise Capacity, Anxiety, and Depression
Reliability (which estimates error) of exercise capacity measured with stress-testing is moderate to high, assuming consistent test-termination criteria
than functional capacity (FC) measures dependent on treadmill time, determined from peak speed and grade. Although FC can improve due to merely test practice,
The Hospital Anxiety and Depression Scale depression subscale, but not the Beck Depression Inventory-Fast Scale, identifies patients with acute coronary syndrome at elevated risk of 1-year mortality.
In the Ontario Cardiac Rehabilitation Pilot Project (OCRPP), the mean HADS depression score and HADS anxiety score each improved significantly (n = 1554; P < 0.0001), by 1 point over 6 months (Cardiac Care Network [CCN] 2002).
In the current set of 3 studies, we calculated RCIs for CR entry-to-discharge FC changes in peak METs, and HADS anxiety and depression scores. Specific objectives were to (i) estimate stability and test practice/exposure effects for FC, and depression and anxiety measures, for RCI calculations; (ii) illustrate clinical use of RCIs for individuals; (iii) in a large clinical cohort, compare RCI distributions to those of other change criteria based upon mean and percent changes, cut-off scores, and SDs, including the 0.5-MET CCS-QI.
We used 1-week retest intervals to estimate practice/exposure effects relatively free of treatment-related, maturational, or other systematic influences, hypothesizing that distributions of RCIs would differ from those of other change criteria, thereby providing unique information in determining change in individual patients.
Methods
This research was approved by the Health Sciences Research Ethics Board, Western University, London, Ontario. Separate samples for studies 1 and 2 were recruited from participants in our Cardiac Rehabilitation & Secondary Prevention Program, which accepts referrals of cardiovascular patients according to standard recommendations.
AACVPR/ACC/AHA 2007 performance measures on cardiac rehabilitation for referral to and delivery of cardiac rehabilitation/secondary prevention services endorsed by the American College of Chest Physicians, American College of Sports Medicine, American Physical Therapy Association, Canadian Association of Cardiac Rehabilitation, European Association for Cardiovascular Prevention and Rehabilitation, Inter-American Heart Foundation, National Association of Clinical Nurse Specialists, Preventive Cardiovascular Nurses Association, and the Society of Thoracic Surgeons.
The inclusion criteria were as follows: (i) referral to CR within 1 year of an acute coronary syndrome event, percutaneous coronary intervention, coronary artery bypass grafting, stable angina, or heart valve repair/replacement; (ii) age ≥19 years; (iii) ability to participate in stress-testing and exercise; (iv) written, informed consent to use data in research; (v) residence location within 1 hour of our CR program; (vi) no medical contraindications to CR participation. For studies 1 and 2, the 1-week interval was chosen to minimize spontaneous change of exercise capacity or anxiety/depression, which could impact practice/exposure effects and test–retest reliability estimates.
Study 3, in which the RCI formulae were applied retrospectively, used a sample with inclusion criteria similar to those in studies 1 and 2. Additionally, subjects (i) had consecutively entered our CR program from March 24, 2003 to February 23, 2012; (ii) had both entry and discharge measurements available for stress-testing, HADS anxiety score, or HADS depression score (at least 1 of the 3); (iii) did not have recurrent CR intakes; and (iv) participated in neither study 1 nor study 2.
Study 1: exercise
For time 1, we used results from usual-care stress-testing: at CR entry, patients routinely undergo a peak symptom-limited exercise test using a modified Bruce protocol. Peak METs were derived algorithmically from maximum speed and grade attained.
Patients completing an entry stress test were asked to undergo an identical test 1 week later (time 2). This interval was chosen to minimize spontaneous change in exercise capacity. Testing was physician-supervised, occurred before exercise program entry to avoid treatment effects, and was scheduled at similar times of day, at least 2 hours after eating, with room environment and equipment conforming to the American Heart Association Guidelines for Exercise Testing Laboratories.
Guidelines for clinical exercise testing laboratories. A statement for healthcare professionals from the Committee on Exercise and Cardiac Rehabilitation, American Heart Association.
We analyzed peak METs. At our centre, CR stress-testing is supervised by physician CR specialists, using usual American Heart Association/American College of Sports Medicine stopping rules.
To limit observer effects, physicians were asked not to review the first test results prior to the second test. We strove to maintain consistency of supervising physicians between tests, although we did not require that for technicians. Detailed, standardized instructions for administration of a modified Bruce protocol were provided for each stress test. This protocol had 2 preliminary 3-minute stages (stage 1: 1.7 mph, 0% grade; stage 2: 1.7 mph, 5% grade) added prior to the usual stage 1 of the Bruce protocol; thus, stage 3 was the usual stage 1 of the Bruce protocol.
Participants were administered the Borg Rating of Perceived Exertion (RPE) Scale (range in integers: 6 = no exertion at all, to 20 = maximal exertion).
A usual-care goal was to have patients exercise to peak exertion by achieving or exceeding a Borg RPE of 17 (very hard), with light handrail holding permitted for safety. Location, use of treadmill, and medication regimen were permitted to vary between tests, as a function of scheduling or clinical contingencies.
Study 2: psychometrics
All subjects were involved in CR exercise programming, conducted by CR program kinesiologists at a local YMCA. To minimize potential effects of early post-event adjustment, program habituation, and imminent discharge, we planned to test patients within the middle third of 6-month CR. Participants completed the HADS (time 1), repeated at the same time of day 1 week later (time 2), well within the minimum recommended 2-week interval.
Participants had typically completed the HADS previously at CR entry.
Sample sizes
For both studies 1 and 2, we aimed to recruit at least 30 men and 30 women, assuming the confidence interval around a reliability score of 0.60 would not include 0.
Study 3: CR clinical cohort
We categorized the CR clinical cohort of 2066 patients according to whether each individual had worsened, not changed, or improved from CR entry to discharge. For FC, the change criteria in either direction were: ≥0.5 MET, the absolute value of which corresponded with the CCS-QI
For studies 1 and 2, given the relatively small sample sizes, we employed robust methods (bootstrapped, 2000 samples) to obviate concerns about potential deviations from normality. In study 1, to assess consistency and termination of the modified Bruce protocol stress-test administration, we compared time 1 and time 2 with respect to mean peak heart rate and RPE using dependent-measures t-tests, and calculated Pearson test-retest reliability. For METs and HADS scores, we expressed central tendency as means (SD), and mean changes at time 2 (M2) vs time 1 (M1) as effect size, Cohen's d = (M2 – M1)/SD1; we used dependent-measures t-tests to compare means at time 2 vs time 1; and we estimated test-retest stability with Pearson product-moment correlation coefficients.
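As a quick check of the effect-size definition above, the following minimal sketch recomputes Cohen's d from the whole-sample means and time-1 SDs published in Tables 2 and 3. The function name `cohens_d` and the dictionary layout are illustrative choices, not part of the original analysis, and the bootstrapped estimates used in the paper are not reproduced here.

```python
def cohens_d(mean_time1, mean_time2, sd_time1):
    """Retest change scaled by the time-1 SD: d = (M2 - M1) / SD1."""
    return (mean_time2 - mean_time1) / sd_time1


# Whole-sample values as published: (M1, M2, SD1).
samples = {
    "peak METs": (9.78, 10.13, 3.01),
    "HADS anxiety": (5.91, 5.26, 3.80),
    "HADS depression": (3.59, 3.53, 3.34),
}

effect_sizes = {
    name: cohens_d(m1, m2, sd1) for name, (m1, m2, sd1) in samples.items()
}

for name, d in effect_sizes.items():
    # The sign shows direction; the text reports magnitudes only.
    print(f"{name}: d = {d:+.2f}")
```

The magnitudes round to 0.12, 0.17, and 0.02, matching the effect sizes reported in the Results.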
RCIs
From cardiology, based on data from 14 male patients with exercise-induced angina and ST-segment depression, who were exercise stress-tested once on each of 2 consecutive days, Sullivan et al. recommended multiplying the standard error of the difference by 1.96 to determine the minimum statistically reliable change (P < 0.05, 2-tailed) in an individual patient. Jacobson and Truax calculated a reliable change index (RCI) to determine whether psychometric difference scores of individuals undergoing psychotherapy were statistically reliable, noting that the standard error of the difference estimates the probability of a change score occurring by chance. To correct for practice over repeated neuropsychological test administrations, Chelune et al.
derived from the mean change observed in a “control” sample unexposed to treatment.
To the extent that a “control” sample has retest intervals and characteristics matching those of a patient or cohort receiving an intervention, the approach of Chelune et al. permits estimation of treatment-specific effects, apart from error and test practice, but also distinct from spontaneous recovery or deterioration, which are sources of true change. However, a treatment-free, no-CR “control” condition of cardiovascular patients with a retest interval comparable to the duration of usual-care CR might be affected by non-CR factors, including spontaneous recovery or deterioration, or by CR-independent medication changes or exercise, which would be challenging to exclude, and potentially of interest, as they contribute to “true” clinical change. In any case, withholding CR from eligible patients to form a control condition would be unethical.
We reasoned that clinicians would be interested primarily in gauging individual CR patients’ overall true improvements in exercise capacity and psychological well-being, whether caused specifically by CR, already known to be efficacious, or by other true-change factors such as spontaneous recovery or deterioration, CR-independent exercise, or medications. Therefore, we modified the approach of Chelune et al. by using a 1-week retest interval (in contrast to intervals corresponding with actual CR programming duration) with CR-referred patients, to minimize treatment-related, maturational, or other systematic influences, while correcting for measurement error and test practice/exposure.
The RCI of Chelune et al.
As applied here, X2 and X1 represent individuals’ test scores in the large clinical cohort at CR program discharge and entry, respectively (retest intervals: approximately 7.3-9.5 months). From the study 1 (exercise) or study 2 (psychometrics) research samples, r12 is the test-retest stability estimate (to estimate error; retest interval = 1 week); M1 and M2 are the mean test and retest scores (to estimate mean test exposure/practice), respectively; SD1 is the time-1 standard deviation. The Pearson product-moment correlation coefficient (r) was considered the most appropriate correlation coefficient to use to estimate stability, for 2 reasons: (i) r was used in the development of RCIs,
would result in 2 corrections for systematic differences such as practice effects, potentially resulting in a mathematical overcorrection.
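For reference, the variables defined above fit the practice-adjusted RCI in its commonly cited form, written out below. This is a sketch assembled from those variable definitions rather than a formula quoted from the source; the intermediate symbols SEM (standard error of measurement) and SE_diff (standard error of the difference) are labels assumed here.

```latex
\begin{align}
\mathrm{SEM} &= SD_1 \sqrt{1 - r_{12}} \\
SE_{\mathrm{diff}} &= \sqrt{2\,\mathrm{SEM}^{2}} \\
\mathrm{RCI} &= \frac{(X_2 - X_1) - (M_2 - M_1)}{SE_{\mathrm{diff}}}
\end{align}
```

With the published study 1 values (SD1 = 3.01, r12 = 0.934, M2 – M1 = 0.35), SE_diff is approximately 1.09 METs, so an RCI of 1.96 corresponds to a gain of about 2.5 METs, in line with the threshold reported in the discussion.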
For CR clinical cohort individuals (study 3), we calculated RCIs using (X2 – X1) for (i) indirectly measured peak MET change, corrected using variables (i.e., M2, M1, SD1, r12) from study 1; and (ii) HADS anxiety score and HADS depression score changes, corrected with study 2 variables. We calculated RCI variables based on whole samples for both study 1 and study 2; where sample size permitted (n ≥30, study 2), we developed sex-specific RCI variables. Whole-sample, male-specific, and female-specific variables were subsequently applied to the CR clinical cohort and each of its male and female subsamples, respectively, for whom entry and discharge data were available. As an RCI is a z-score, reliable change for an individual was defined as an RCI ≤–1.96 or ≥1.96 (P < 0.05; 2-tailed).
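The per-patient calculation can be sketched as follows, assuming the commonly cited practice-adjusted form RCI = ((X2 – X1) – (M2 – M1)) / SE_diff with SE_diff = SD1 × sqrt(2 × (1 – r12)). The function names, the patient labels, and the entry/discharge values are hypothetical; the correction variables are the rounded whole-sample figures from Table 2, so results may differ slightly from values computed with unrounded inputs.

```python
import math


def rci(x1, x2, m1, m2, sd1, r12):
    """Practice-adjusted reliable change index for one patient (assumed form)."""
    sem = sd1 * math.sqrt(1.0 - r12)       # standard error of measurement
    se_diff = math.sqrt(2.0 * sem ** 2)    # standard error of the difference
    return ((x2 - x1) - (m2 - m1)) / se_diff


def classify(rci_value, critical=1.96):
    """Three-way label in raw score units (2-tailed, P < 0.05 by default)."""
    if rci_value >= critical:
        return "reliable increase"
    if rci_value <= -critical:
        return "reliable decrease"
    return "unchanged"


# Study 1 whole-sample correction variables for peak METs (Table 2).
STUDY1_METS = dict(m1=9.78, m2=10.13, sd1=3.01, r12=0.934)

# Hypothetical patients: (entry, discharge) peak METs.
patients = {"Y": (6.5, 9.5), "Z": (6.5, 7.5)}

results = {}
for label, (entry, discharge) in patients.items():
    value = rci(entry, discharge, **STUDY1_METS)
    results[label] = (round(value, 2), classify(value))
    print(label, results[label])
```

For METs a reliable increase is an improvement; for HADS scores, where lower is better, a reliable decrease would be the improvement, so the label should be mapped to worsened/improved per measure.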
For exercise and psychometric outcomes of the CR clinical cohort, we used analysis of variance with time (program discharge vs entry) and sex as within- and between-subjects factors, respectively, and t-tests (2-tailed, bootstrapped, 2000 samples) for dependent measures to compare time points within sexes. For each HADS subscale, we set 2-tailed significance at α = 0.025 to control type 1 error. Distributions of CR cohort patients categorized as worsened, unchanged, or improved were compared with nonparametric statistics, using a Bonferroni correction to the alpha level for the number of comparisons. Analyses were performed using SPSS v.24 or v.26 (IBM, Armonk, NY).
Results
Study 1: exercise
A total of 43 subjects consented to participate in study 1. Two stress tests were completed by each of 35 participants (mean age 59.0 years, SD = 8.3, range = 47.9-81.4 years; 6 women = 17.1%; ethnicity Caucasian, n = 28 [80.0%], 5 missing; modal education level = university/college completed, n = 12 [35.3%], 1 missing; see Table 1, clinical characteristics).
Table 1. Clinical characteristics of samples at cardiac rehabilitation entry

| Characteristic | Study 1: exercise stress-testing (n = 35) | Study 2: psychometrics (n = 96) | Study 3: cardiac rehabilitation clinical cohort (n = 2066) |
|---|---|---|---|

Values are n/total n (%; in repeated measures sample for risk factors).
Ascertained from measurement (body mass index) or enquiry at cardiac rehabilitation entry (other variables ascertained from chart/referral documentation).
† Sedentary was defined as < 30 min/d moderate physical activity 3 d/wk (Stone JA. Canadian Guidelines for Cardiac Rehabilitation and Cardiovascular Disease Prevention. Winnipeg: Canadian Association of Cardiac Rehabilitation; 1999).
Between test 1 and test 2, there was 100% consistency with respect to supervising physicians (n = 34; 1 missing). As shown in Table 2, mean peak heart rate and RPE did not differ significantly. For each, there was a significant correlation between test 1 and test 2 (P < 0.001; for RPE, n = 28, 7 missing).
Table 2. Study 1: exercise stress-testing variables (n = 35)

| Variable | Time 1, mean (SD) | Time 2, mean (SD) | Mean difference | P, difference (2-tailed) | Percent difference | r12 | P, r |
|---|---|---|---|---|---|---|---|
| Days from referral | 77.2 (91.2) | 79.2 (91.2) | 7.0 | n/a | n/a | n/a | n/a |
| Peak HR (bpm) | 131.66 (25.63) | 132.43 (26.51) | 0.77 | 0.62 | 0.59 | 0.940 | < 0.001 |
| Peak RPE (n = 28) | 17.36 (1.99) | 17.68 (1.76) | 0.32 | 0.10 | n/a | 0.858 | < 0.001 |
| Peak METs | 9.78 (3.01) | 10.13 (3.15) | 0.35 | 0.09 | 3.60 | 0.934 | < 0.001 |

bpm, beats per minute; HR, heart rate; METs, metabolic equivalents; n/a, not applicable; RPE, rate of perceived exertion; SD, standard deviation.
Mean peak METs did not change significantly (Table 2), showed a small effect size (d = 0.12), and had significant test-retest reliability (P < 0.001).
Study 2: psychometrics
A total of 114 subjects consented to study 2 participation; 100 completed 2 administrations of at least one psychometric subscale; none participated in study 1. Of these, retest intervals in days (number of subjects) were 2 (2), 7 (88), 8 (2), 9 (4), 14 (3), and 37 (1). Post hoc, we excluded the 4 subjects who met/exceeded the minimum 14-day retest interval specified by the test developer.
Characteristics of the remaining n = 96 (Table 1) were as follows: mean age = 64.4 (9.7) years, range = 35.9-84.9 years; 32 women (33.3%); 93 Caucasians (97%); modal education level = university/college completed, n = 23 (24.0%). The mean (SD) inter-test interval was 7.0 (0.85) days. The mean (SD) interval from CR referral to test 1 was 25.9 (6.8) weeks, and from test 2 to CR discharge was 12.3 (4.6) weeks. This sample included one subject who had been formally discharged from CR 9 weeks before his test 1.
The mean HADS anxiety (HADS-A) score (Table 3) decreased significantly overall and within sexes (P ≤ 0.013), with a small whole-sample effect size (d = 0.17). The mean HADS depression (HADS-D) score showed no significant changes (Table 3), with a trivial whole-sample effect size (d = 0.02). Test-retest reliability was significant for all measures (P < 0.001).
Table 3. Study 2: psychometric variables

| Sample | Variable | Time 1, mean (SD) | Time 2, mean (SD) | Mean change | P, change (2-tailed) | r12 | P, r |
|---|---|---|---|---|---|---|---|
| Whole sample | HADS anxiety score (n = 93) | 5.91 (3.80) | 5.26 (3.81) | –0.66 | < 0.001 | 0.912 | < 0.001 |
| Whole sample | HADS depression score (n = 93) | 3.59 (3.34) | 3.53 (3.43) | –0.065 | 0.679 | 0.900 | < 0.001 |
| Women | HADS anxiety score (n = 32) | 6.28 (3.84) | 5.41 (4.03) | –0.88 | 0.003 | 0.922 | < 0.001 |
| Women | HADS depression score (n = 31) | 2.90 (2.66) | 3.10 (3.22) | –0.19 | 0.508 | 0.876 | < 0.001 |
| Men | HADS anxiety score (n = 61) | 5.72 (3.80) | 5.18 (3.73) | –0.54 | 0.013 | 0.908 | < 0.001 |
| Men | HADS depression score (n = 62) | 3.94 (3.61) | 3.74 (3.54) | –0.194 | 0.311 | 0.913 | < 0.001 |

HADS, Hospital Anxiety and Depression Scale; SD, standard deviation.
Study 3: CR clinical cohort
The clinical cohort (Table 1) comprised 2066 patients (mean age = 62.0 years, SD = 10.7, range = 23.0-89.7; 533 women = 25.8%; 1941 Caucasian = 94.0%, 2 missing; modal education level = university/college completed, n = 564 or 27.5%, 18 missing). Mean (SD) intervals (weeks), based on available data, were: referral to CR entry (n = 2058), 12.8 (6.8); CR entry to discharge (n = 1969), 32.7 (5.5); stress-testing retest (n = 2024), 31.4 (6.3); psychometric retest (n = 1956), 40.6 (7.4).
For the whole cohort and each sex (Table 4), all variables changed significantly from CR entry to discharge (P < 0.001). The sex main effect was significant for all variables (P ≤ 0.002), with men showing higher mean exercise capacity, and lower mean anxiety and depression scores. The interaction of time with sex was significant for METs (P < 0.001) but not for the HADS anxiety score (P = 0.122) or the HADS depression score (P = 0.041; critical α = 0.025).
Table 4. Study 3: exercise stress-testing and psychometric outcomes among the cardiac rehabilitation cohort (n = 2066)

| Measure | Group | n | Entry, mean (SD) | Discharge, mean (SD) |
|---|---|---|---|---|
| Exercise: METs | Overall | 2028 | 7.56 (3.30) | 9.29 (3.62) |
| Exercise: METs | Men | 1508 | 8.15 (3.28) | 9.97 (3.56) |
| Exercise: METs | Women | 520 | 5.83 (2.59) | 7.31 (3.01) |
| HADS anxiety score | Overall | 1722 | 5.94 (3.88) | 4.99 (3.48) |
| HADS anxiety score | Men | 1272 | 5.58 (3.76) | 4.67 (3.38) |
| HADS anxiety score | Women | 450 | 6.98 (4.04) | 5.88 (3.59) |
| HADS depression score | Overall | 1722 | 3.91 (3.30) | 2.82 (2.84) |
| HADS depression score | Men | 1272 | 3.75 (3.21) | 2.72 (2.83) |
| HADS depression score | Women | 450 | 4.36 (3.49) | 3.10 (2.86) |

HADS, Hospital Anxiety and Depression Scale; METs, metabolic equivalents; SD, standard deviation.
Figures 1 (exercise) and 2 (psychometrics) display frequency distributions of clinical cohort patients who worsened, did not change, or improved from CR program entry to discharge, according to the aforementioned change criteria. Most change criteria produced distributions differing significantly (P < 0.001) from each other, and in particular, from those of the RCIs. Among all comparisons, only the HADS anxiety “meet/cross 7 points” distribution did not differ significantly from the 1.0 SD (P = 0.18) and RCI (P = 0.019) distributions. HADS anxiety score (P = 0.58) and HADS depression score (P = 0.30) RCI distributions (not shown) did not differ significantly between sexes.
Figure 1. Percentages of patients in cardiac rehabilitation cohort (n = 2066) with worsened, unchanged, or improved functional capacity, by change criterion. METs, metabolic equivalents; RCIs, reliable change indices; SD, standard deviation.
Figure 2. Percentages of patients in cardiac rehabilitation cohort (n = 1722) with worsened, unchanged, or improved Hospital Anxiety and Depression Scale (HADS) anxiety or depression scores, by change criterion. RCIs, reliable change indices; SD, standard deviation.
Discussion
We propose a method for clinical application, correcting for measurement error and mean test practice/exposure effects, to determine whether an individual patient has manifested true change from CR entry to discharge. Our exploration of different approaches demonstrates that ascertainment of change depends strongly upon the change criteria applied to individual patients. Present findings suggest the 0.5-MET CCS-QI may be too small or liberal to ascertain true change over and above that from error and test practice in individuals, when using “indirect” FC measurement.
Study 1: exercise
Stress-test protocols were administered very consistently between time 1 and time 2, including consistency of supervising physicians, very similar levels of mean peak heart rates and Borg scale ratings, the latter corresponding with “very hard” RPEs, and high FC test-retest reliability. We achieved uniformity of 7-day retest intervals, and both measurements occurred before exercise program entry. Consequently, any test-retest FC differences were not due to CR exercise training, and were unlikely to have been caused by spontaneous change in fitness, but instead probably reflected test practice. The FC improvement was not statistically significant, although this was likely due to type II error, given the combination of small effect size and small sample size. Nonetheless, it is noteworthy that the estimated practice effect of 0.35 METs equates to 70% of the 0.5-MET CCS-QI, assuming that the METs were measured indirectly as FC.
Peak METs demonstrated very high stability, providing a strong basis to calculate RCIs. For practical illustration, applying study 1 variables to equation 1 and requiring RCI_METs ≤ –1.96 or ≥ 1.96 (2-tailed) or RCI_METs ≥ 1.64 (1-tailed) for significance, an individual patient would have to change by at least ±2.50 METs, or gain 2.15 METs, respectively, to show true FC change, over and above that from error and test practice. Thus, hypothetically, individual Y, who gained 3.0 METs from CR entry to discharge, improved over and above the change from error and practice. However, for individual Z, whose performance increased by 1.0 MET, twice the CCS-QI of 0.5 METs,
; tests 1 and 2 occurred well after referral and before CR-discharge, respectively. Therefore, the significant improvements obtained in mean anxiety scores were unlikely to reflect treatment, true spontaneous change, acute adjustment, or anticipation of discharge. This finding suggests the importance of considering test exposure even in cases in which responses do not depend on psychomotor performance. Scores from the HADS demonstrated high to very high stabilities, providing a strong basis to calculate RCIs.
Applying study 2 variables to equation 2 and requiring RCI_HADS ≤ –1.96 or ≥ 1.96 (2-tailed), or RCI_HADS ≥ 1.64 (1-tailed) for significance, an individual patient would have to change by at least ±2.42 or 1.91 HADS anxiety points, respectively, to show true change in psychometrically measured anxiety symptoms, over and above that from error and test exposure. Corresponding values for the HADS depression score are ±2.80 or 2.32 points, respectively. Recently, Lemay et al. reported a minimal clinically important difference (MCID) of 1.7 points for each HADS subscale.
As 1.7 < 1.91 points, this MCID is within the potential measurement error and test exposure effects we estimated for the HADS, and it therefore may be too liberal for application with individual patients.
Study 3: CR cohort
The samples for studies 1 and 2 were recruited from the same CR population as the large clinical cohort. Consequently, variables determined from the study samples were generalizable to the large clinical cohort. All samples were referred according to standard criteria and thus were broadly similar to CR populations elsewhere, being composed substantially of patients with coronary heart disease with or without acute coronary syndrome, large proportions of whom had undergone revascularization procedures.
Mean initial FC in study 1 and the CR cohort suggested relatively high exercise capacity for CR patients. However, indirect measurement with light handrail-holding, as conducted in our centre, probably overestimates direct peak VO2 by approximately 2 METs.
real-world CR outcomes study, as determined by direct peak VO2 measurement. The higher baseline level in study 1 could represent a selection bias in favour of healthier volunteers. In general, we believe that the external validity of our study is reasonable.
This cohort, overall and by sex, showed robustly significant mean improvements in exercise capacity, anxiety, and depression, which we used to illustrate a clinical problem: given statistically significant group changes and no control condition, how can a clinician infer what constitutes true change for individuals? Comparisons of the change distributions show that answering this question depends crucially on the criterion applied.
With respect to exercise, the criterion least sensitive to change (highest percentage of unchanged patients) was meeting or crossing the 7-MET mark. Although this level may be functionally important, we recommend against using it as a simple benchmark of individual change.
The criterion giving the most favourable ascertainment of change in exercise capacity (highest percentage improved) was ±0.5 MET, corresponding with the CCS-QI.
Notably, this QI does not consider measurement error and test practice effects. Furthermore, it (intentionally) does not specify the method of measurement, whether through gas-exchange techniques, or indirect calculation from peak treadmill speed and grade, which is more subject to practice
These issues are crucial: the mean practice effect in study 1, although nonsignificant, would alone account for 70% of the 0.5-MET QI (about 0.35 METs) if it were confirmed in a larger sample with more statistical power. Although the CCS-QI may meet criteria for an MCID, an FC increase of 0.5 METs by an individual would fall within the range of measurement error and practice effect, as illustrated above; in other words, that change would not be reliable. Therefore, we recommend against interpreting a gain of 0.5 METs, measured indirectly, as meeting the CCS-QI for individuals. We also recommend measuring exercise capacity (peak VO2) directly, with cardiopulmonary exercise testing.
When direct cardiopulmonary measurement of peak VO2 is not feasible, we suggest application of the critical RCI scores (±2.50 METs, 2-tailed; or 2.15 METs, 1-tailed) derived from RCI analyses reported here to ascertain reliable change by individuals.
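As a minimal sketch of how the 2-tailed critical value quoted above might be applied to one patient's indirectly measured peak METs: the function name, the symmetric treatment of the threshold on raw change, and the inclusive boundary are assumptions made for illustration, and the example scores are hypothetical.

```python
# 2-tailed critical raw change for indirectly calculated peak METs, as quoted in the text.
CRITICAL_METS_TWO_TAILED = 2.50


def classify_met_change(baseline_mets: float, discharge_mets: float,
                        critical: float = CRITICAL_METS_TWO_TAILED) -> str:
    """Label a raw MET change as 'worsened', 'unchanged', or 'improved'.

    Assumption: the threshold is applied symmetrically to raw change and a
    change exactly equal to the threshold counts as reliable.
    """
    change = discharge_mets - baseline_mets
    if change >= critical:
        return "improved"
    if change <= -critical:
        return "worsened"
    return "unchanged"


if __name__ == "__main__":
    # Hypothetical patients (not study data).
    print(classify_met_change(6.0, 6.5))  # unchanged: a 0.5-MET gain is below 2.50
    print(classify_met_change(6.0, 8.5))  # improved: a 2.5-MET gain meets the threshold
```

Under this rule a 0.5-MET gain, which would satisfy the CCS-QI, is labelled unchanged for an individual.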
We also suggest a review to determine whether the CCS-QI
produced the most favourable ascertainment of improvement. Comparison with RCI distributions and critical RCI values suggests that using a 1-point change criterion may overestimate the frequency of reliable improvement for individuals.
There are depression-related CCS-QIs for programmatic processes
These factors, and scientific challenges to defining clinically meaningful change, could complicate any depression-outcome QI. RCIs, which are standard (“z”) scores, could enable meaningful comparisons with respect to depression outcomes, across psychometric instruments and CR programs. We propose that any future outcome QI should at minimum be capable of measuring individual-level reliable change.
Practical implications of our findings include benchmarking for program evaluation or to inform quality improvement, as, for example, with respect to frequencies of participants who either show reliable decrements, do not change, or reliably improve over CR, with respect to key outcomes including exercise capacity and emotional functioning. In light of the potential impact on individual CR participants, direct feedback to them might be considered, after further validation of RCIs against clinical criteria. These RCIs can be used clinically, similar to how they are applied in other disciplines (for example, psychology) to help determine if a patient seen for at least 2 testing sessions has improved, declined, or stayed the same on a standardized measure. This determination is impossible to make by eyeballing raw score changes. In contrast, RCIs allow a clinician to determine if any changes in raw scores are beyond expected fluctuations in test scores based on measurement error and practice effects.
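For program evaluation, the frequencies described above could be tallied along the following lines. This is a sketch under stated assumptions: each participant's RCI is taken as already computed, the function names are hypothetical, the 2-tailed cut-off of 1.96 is applied, and a flag handles scales such as the HADS on which lower scores are better. The input values are invented.

```python
from collections import Counter
from typing import Dict, Iterable


def classify_rci(rci_value: float, z_crit: float = 1.96,
                 higher_is_better: bool = True) -> str:
    """Label one RCI as 'worsened', 'unchanged', or 'improved'."""
    if not higher_is_better:
        # For symptom scales, a reliable decrease is an improvement.
        rci_value = -rci_value
    if rci_value >= z_crit:
        return "improved"
    if rci_value <= -z_crit:
        return "worsened"
    return "unchanged"


def change_distribution(rci_values: Iterable[float], z_crit: float = 1.96,
                        higher_is_better: bool = True) -> Dict[str, float]:
    """Percentage of participants in each change category."""
    labels = [classify_rci(v, z_crit, higher_is_better) for v in rci_values]
    if not labels:
        raise ValueError("at least one RCI value is required")
    counts = Counter(labels)
    return {k: 100.0 * counts[k] / len(labels)
            for k in ("worsened", "unchanged", "improved")}


if __name__ == "__main__":
    # Hypothetical RCIs for four participants on a higher-is-better measure.
    print(change_distribution([2.5, 0.1, -2.0, 1.0]))
    # {'worsened': 25.0, 'unchanged': 50.0, 'improved': 25.0}
```

Because RCIs are standard scores, the same tally can be applied to exercise capacity and to psychometric measures, with only the direction flag changed.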
Strengths and limitations
Strengths of our research include the following: short retest intervals, which minimize potential contributions of “true” treatment or maturational change to our practice/exposure estimates; high stability estimates; and use of RCIs, which permits correction of individual change scores for measurement error and test practice/exposure.
The principal limitation is that we have not validated RCIs against clinical criteria such as future cardiovascular event risk. Using 1-week retest intervals may not accurately estimate error or practice/exposure over CR. If treadmill practice effects decayed, 1-week retest intervals might overcorrect for practice over longer intervals. Conversely, treadmill use in CR might preserve or enhance practice, independently of true exercise-capacity change. Short retest intervals might then underestimate practice in CR. Further studies should test the validity of generalizing practice estimates derived from short retest intervals to 6-month CR.
Further challenges to our estimation of error and practice effects are that we did not control for medication changes between time 1 and time 2, which could have occurred for clinical reasons after the first stress test; or for retest changes in technician, location, or treadmill. However, changes to medications would have been difficult to control, as they reflect normal clinical practice. Also, all testing was done at London Health Sciences Centre locations, reflecting commonality of equipment and maintenance. The high levels of inter-test consistency with respect to peak heart rate and RPE, and test-retest reliability of peak METs, suggest that these issues were not influential.
This method of assessing practice effects assumes uniform benefit from practice/exposure, but baseline fitness could interact with test repetition. This possibility could be explored in future work, by comparison of RCIs as calculated here
Our work was also limited by small sample sizes, which preclude sex-specific exercise estimates, and by low ethnic diversity, which limits its generalizability.
Conclusions
Although requiring further validation, reliable change methodology offers a promising and pragmatic approach to measuring true change in individuals in CR. RCIs can provide complementary information to external criteria of clinically important change.
Ascertaining change from CR entry to discharge strongly depends upon the specific criteria applied, as RCI distributions differed significantly from those of other change criteria we tested. Our data suggest that improvement by 0.5 MET in exercise capacity, a CCS-QI, is too liberal a criterion, when indirect measurement is used, to ascertain true change in individual patients over and above error and practice effects. We recommend use of direct, cardiopulmonary measurement of peak VO2 to ascertain exercise capacity, when feasible. Although the 0.5-MET CCS-QI may meet criteria for an MCID, we recommend a review of its reliability and validity when it is applied to individuals, particularly in situations of indirect measurement. In any such review, we recommend separate consideration of direct VO2 measurement vs indirect algorithmic estimation of exercise capacity. We propose that RCI methodology may have utility in development of future outcome-related psychometric QIs. The RCI critical values reported here may be applied to frequencies of individual CR participants for program evaluation or quality assurance purposes.
Acknowledgements
We thank Karen Unsworth, MSc, and the CR Program clinical team for facilitating data collection.
Funding Sources
This research was funded by grant IRF-018-06 to PLP, MEO, NS, and Karen Unsworth from the Lawson Health Research Institute's (LHRI) Internal Research Fund. PLP receives salary support from LHRI and the Division of Cardiology, Department of Medicine, Western University. NS is supported by the Program of Experimental Medicine of the Department of Medicine, Western University. The funding sources had no role in the study design, data collection, analysis or interpretation, writing of the article, or the decision to submit this article for publication.
Disclosures
The authors have no conflicts of interest to disclose.
References
Anderson L, Thompson R, Oldridge N, et al. Exercise-based cardiac rehabilitation for coronary heart disease.
Depression as a risk factor for poor prognosis among patients with acute coronary syndrome: systematic review and recommendations: a scientific statement from the American Heart Association.
Behavioural and psychosocial issues in cardiovascular disease. in: Stone JA, Arthur H, Suskin N. Canadian Guidelines for Cardiac Rehabilitation and Cardiovascular Disease Prevention: Translating Knowledge into Action. 3rd ed. Canadian Association of Cardiac Rehabilitation, Winnipeg, Manitoba; 2009.
Additional effects of psychological interventions on subjective and objective outcomes compared with exercise-based cardiac rehabilitation alone in patients with cardiovascular disease: a systematic review and meta-analysis.
The Hospital Anxiety and Depression Scale depression subscale, but not the Beck Depression Inventory-Fast Scale, identifies patients with acute coronary syndrome at elevated risk of 1-year mortality.
AACVPR/ACC/AHA 2007 performance measures on cardiac rehabilitation for referral to and delivery of cardiac rehabilitation/secondary prevention services endorsed by the American College of Chest Physicians, American College of Sports Medicine, American Physical Therapy Association, Canadian Association of Cardiac Rehabilitation, European Association for Cardiovascular Prevention and Rehabilitation, Inter-American Heart Foundation, National Association of Clinical Nurse Specialists, Preventive Cardiovascular Nurses Association, and the Society of Thoracic Surgeons.
Guidelines for clinical exercise testing laboratories. A statement for healthcare professionals from the Committee on Exercise and Cardiac Rehabilitation, American Heart Association.
Reliable change and minimum clinically important difference (MCID) of the Repeatable Battery for the Assessment of Neuropsychology Status (RBANS) in a heterogeneous dementia sample: support for reliable change methods but not the MCID.
Stone JA. Canadian Guidelines for Cardiac Rehabilitation and Cardiovascular Disease Prevention. Canadian Association of Cardiac Rehabilitation, Winnipeg; 1999.