Address correspondence to: Ying Shi, PhD, Division of Geriatrics, University of California, San Francisco, 4150 Clement Street 151R, San Francisco, CA 94121, USA.
Division of Geriatrics, Department of Medicine, University of California, San Francisco, California, USA; San Francisco Veterans Affairs Health Care System, San Francisco, California, USA
San Francisco Veterans Affairs Health Care System, San Francisco, California, USA; Department of Psychiatry, University of California, San Francisco, California, USA; Department of Epidemiology & Biostatistics, University of California, San Francisco, California, USA
Division of Geriatrics, Department of Medicine, University of California, San Francisco, California, USA; Department of Epidemiology & Biostatistics, University of California, San Francisco, California, USA
The validated 82-item Advance Care Planning (ACP) Engagement Survey measures a broad range of ACP behaviors but is long.
Objectives
Determine whether shorter survey versions (55-item, 34-item, 15-item, 9-item, and 4-item versions) can detect similar change in response to two well-validated ACP interventions and provide practical effect size information.
Methods
We assessed ACP engagement for 986 English- and Spanish-speaking adults in a randomized trial comparing a PREPARE arm with an advance directive-only arm. The survey was administered at baseline, one week, three months, six months, and 12 months. We calculated mean change scores from baseline to follow-up time points by study arm, intraclass correlation coefficients of change scores between the 82-item survey and shorter versions, and within-group and between-group effect sizes of the mean change scores.
Results
Shorter survey versions were able to detect within-group and between-group changes at all time points. Within-group intraclass correlations between the 82-item and shorter versions were high (0.78–0.97), and the size of between-group differences was comparable across all survey versions. Twelve-month within-group effect sizes ranged narrowly from 0.76 to 1.05 for different survey versions in the PREPARE arm and from 0.44 to 0.64 in the advance directive-only arm. Between-group effect sizes ranged narrowly from 0.24 to 0.30 for different survey versions. Results were similar when stratified by English and Spanish speakers.
Conclusion
Shorter versions of the ACP Engagement Survey were able to detect within-group and between-group changes comparable with the 82-item version and can be useful for efficiently and effectively measuring ACP engagement in research and clinical settings.
Advance care planning (ACP) has garnered increasing attention from health systems and researchers because it has been shown to improve patients' satisfaction with medical care and increase agreement between patients' wishes and the care they receive.
A comparative, retrospective, observational study of the prevalence, availability, and specificity of advance care plans in a county that implemented an advance care planning microsystem.
Historically, most studies have focused solely on the completion of an advance directive (AD) to measure successful ACP. However, several studies have shown that ACP is a complex process that occurs over time and involves multiple discrete behaviors.
Promoting advance care planning as health behavior change: development of scales to assess decisional balance, medical and religious beliefs, and processes of change.
The ACP Engagement Survey is based on Social Cognitive and Behavior Change theories and focuses on four behavior change constructs (i.e., knowledge, contemplation, self-efficacy, and readiness) within four ACP domains (i.e., surrogate decision makers, values and quality of life, flexibility in surrogate decision making, and asking doctors questions). Although validated and shown to detect change in response to ACP interventions, the 82-item version of the survey takes 50 minutes to administer, reducing its utility. Brief, feasible, and validated surveys that can effectively measure the ACP process and can detect change in response to ACP interventions are needed for research and clinical programs.
In a prior study, we conducted item reduction and validated five progressively shorter versions of the ACP Engagement Survey, including a 55-item, 34-item, 15-item, 9-item, and 4-item version.
However, that prior study used blinded trial data with a small sample size and only assessed pre-to-post changes during a one-week follow-up period.
The present study builds on that prior work by using the larger, complete trial cohort of English- and Spanish-speaking older adults from a published randomized controlled trial designed to compare two well-validated interventions.
Follow-up time points now include one week, three months, six months, and 12 months, and we calculate both within-group and between-group differences by study arm. This study also provides practical effect size information for the use of brief, literacy-appropriate, culturally vetted English and Spanish measures covering a range of ACP behaviors. We evaluate whether change scores in response to an ACP intervention for progressively shorter versions of the survey, including a four-item version, are highly correlated with those of the original 82-item version.
Methods
Data Sources and Participants
Study participants included 986 English- and Spanish-speaking patients enrolled in a randomized trial at the San Francisco General Hospital from February 2014 to November 2017. These participants were randomly assigned to one of two intervention groups: an easy-to-read AD written at the fifth-grade reading level (AD-only arm) or the PREPARE Web site (PREPAREforYourCare.org) plus the AD (PREPARE arm). PREPARE is an interactive online ACP program that uses video stories to help people identify their wishes for medical care and models how to discuss those wishes with others. The trial compared the efficacy of PREPARE plus the easy-to-read AD vs. the AD alone to engage participants in the ACP process. The study was approved by the University of California, San Francisco Institutional Review Board, and the trial has been published.
The validated patient-reported ACP Engagement 82-item Survey includes 57 items concerning behavior change processes (i.e., knowledge, contemplation, self-efficacy, and readiness) measured on an average five-point Likert scale and 25 ACP action items such as discussing and documenting ACP wishes using yes or no response options. The survey scores were unweighted on a one- to five-point scale with higher scores reflecting greater ACP engagement (response options: 1—not at all, 2—a little, 3—somewhat, 4—fairly, and 5—extremely for knowledge, self-efficacy, and readiness subscales, and 1—never, 2—once or twice, 3—a few times, 4—several times, and 5—a lot for the contemplation subscale). A detailed table including the questions and response options of the original version and shorter versions (i.e., 55-, 34-, 15-, 9-, and 4-item versions) has been published.
The 25 ACP action items were removed in all shorter versions because of redundancy as yes/no actions can also be calculated from the readiness questions, which assessed readiness to discuss/document with surrogate decision makers, discuss/document wishes for medical care, discuss/document flexibility for the surrogate, and ask doctors questions. This resulted in all five shorter survey versions measured on an average five-point Likert scale for the ACP engagement score. The behavior change process questions concerning contemplation and questions concerning flexibility in surrogate decision making and asking doctors questions were the questions most often deleted from shorter versions. To be able to compare the average five-point Likert scores of the shorter versions with the original version, the overall average ACP engagement score for the 82-item survey was created by averaging the five-point Likert scales for the process measures and also for the action measures by assigning a value of five to response options of yes and a value of 0 to response options of no.
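The scoring rule described above can be sketched in a few lines of Python. This is only an illustration, not the authors' scoring code: it assumes the overall score is a simple unweighted mean across all answered items, and the function name and example responses are hypothetical.

```python
from statistics import mean

YES_VALUE = 5  # value the text assigns to a "yes" action response
NO_VALUE = 0   # value the text assigns to a "no" action response


def overall_engagement_score(process_items, action_items):
    """Average 1-5 process ratings together with recoded yes/no action items.

    Assumption: a simple unweighted mean over all answered items.
    """
    for rating in process_items:
        if not 1 <= rating <= 5:
            raise ValueError(f"process rating out of range: {rating}")
    # Recode each yes/no action onto the same numeric scale as the ratings.
    action_values = [YES_VALUE if done else NO_VALUE for done in action_items]
    return mean(list(process_items) + action_values)


# Hypothetical respondent: three process ratings and two action items.
example_score = overall_engagement_score([3, 4, 5], [True, False])
print(round(example_score, 2))  # (3 + 4 + 5 + 5 + 0) / 5 = 3.4
```

Because the shorter versions drop the action items, the same function with an empty `action_items` list would reduce to the plain mean of the five-point ratings.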
For the randomized trial, we administered the full 82-item ACP Engagement Survey at baseline, one week, three months, six months, and 12 months. We also assessed self-reported participant characteristics at baseline, including age, gender, race/ethnicity, education, health literacy, finances, and health status.
Prior ACP documentation before the baseline interview was obtained using a composite of any prior legal forms and documented discussions about ACP within the past five years by chart review.
We first compared baseline characteristics of the AD-only and PREPARE arms overall and stratified by English and Spanish speakers using unpaired t-tests for continuous variables and chi-squared tests for categorical variables. In addition, we compared baseline characteristics of English and Spanish speakers overall. We then assessed the ability of each survey version to detect change in average ACP Behavior Change Survey scores in response to ACP interventions. We used a mixed-effects repeated-measures model for the average ACP engagement score with fixed effects for time (i.e., baseline, one week, etc.), intervention group (PREPARE vs. AD-only arms), and the group-by-time interaction, with time as a categorical variable to allow for nonlinearity of responses over time. All analyses were adjusted for the randomization block variable of health literacy and for prior ACP documentation and were clustered by physician.
We then calculated mean change scores from baseline to each of the four follow-up time points (i.e., one week, and three, six, and 12 months) by study arm and measured within-group effect sizes (Cohen's d) of the mean change scores. Using the original 82-item survey as the reference, we computed the intraclass correlation coefficients of the change in survey scores at each time point for progressively shorter versions. Finally, we evaluated between-group effect sizes and mean change score differences between the PREPARE and AD-only arms for each shorter survey version at each follow-up time point and compared them with the original 82-item version using t-tests. All analyses were also stratified by English- and Spanish-speaking participants. We used SAS 9.4 statistical software (SAS Institute, Inc, Cary, NC) and Stata 15.1 (StataCorp LLC, College Station, TX); all tests of statistical significance were two sided, and we applied Bonferroni adjustment for multiple between-group comparisons.
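To make these quantities concrete, the following Python sketch shows one common way to compute a within-group Cohen's d from change scores, a between-group Cohen's d with a pooled standard deviation, and Bonferroni-adjusted P-values. The article does not state which exact formulas were used, so these definitions are assumptions, the adjusted model-based estimates in the tables will not be reproduced exactly, and the function names and data are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def within_group_d(baseline, follow_up):
    """Cohen's d for paired change: mean change divided by SD of change."""
    changes = [after - before for before, after in zip(baseline, follow_up)]
    return mean(changes) / stdev(changes)


def between_group_d(changes_a, changes_b):
    """Cohen's d for two arms: difference in mean change over pooled SD."""
    n_a, n_b = len(changes_a), len(changes_b)
    pooled_var = (
        (n_a - 1) * stdev(changes_a) ** 2 + (n_b - 1) * stdev(changes_b) ** 2
    ) / (n_a + n_b - 2)
    return (mean(changes_a) - mean(changes_b)) / sqrt(pooled_var)


def bonferroni(p_values):
    """Multiply each P-value by the number of comparisons, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Hypothetical scores for four participants in one arm.
baseline = [2.0, 3.0, 2.5, 3.5]
twelve_months = [3.0, 3.5, 3.5, 4.0]
print(round(within_group_d(baseline, twelve_months), 2))

# Hypothetical change scores for two arms.
print(round(between_group_d([1.0, 0.5, 1.0, 0.5], [0.5, 0.0, 0.5, 0.0]), 2))
print(bonferroni([0.01, 0.04, 0.5]))
```

The sketch ignores covariate adjustment and clustering by physician, both of which the analysis described above accounts for.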
Results
Among 986 enrolled participants, 505 were randomized to the AD-only arm and 481 to the PREPARE arm. The mean age of participants was 63.3 (6.4) years, 603 (61.2%) were women, 634 (64.3%) were nonwhite, 445 (45.1%) were Spanish speakers, 387 (39.3%) had limited health literacy, 504 (51.1%) reported fair-to-poor health status, and 269 (27.3%) had prior ACP documentation (Table 1). Participant characteristics did not differ between study arms overall or among English or Spanish speakers. For the overall cohort including both arms, Spanish-speaking participants were more likely than English-speaking participants to be women, to have less education, to have limited health literacy, and to report worse self-rated health (P < 0.05; Table 1).
Average ACP Behavior Change Survey scores increased more over time in the PREPARE vs. the AD-only arms (intervention group-by-time interaction, P < 0.001) among all survey versions (Fig. 1), demonstrating that all surveys can detect change in response to the ACP interventions. The results were similar when stratified by English and Spanish speakers (Appendix I and II).
Fig. 1. Progressively shorter advance care planning engagement survey versions are able to detect change at one week, three months, six months, and 12 months by study arm.ᵃ,ᵇ
ᵃ English and Spanish speakers had similar results as shown in Appendix I and II.
ᵇ P-values in the plots reflect overall intervention group-by-time interactions.
Within-group effect size estimates were larger in the PREPARE arm than in the AD-only arm at each follow-up time point for all survey versions (Table 2). Intraclass correlation coefficients of the mean change scores between the original 82-item survey and progressively shorter versions were medium to high at all follow-up time points for both study arms (range over the four time points 0.78–0.97 for PREPARE and 0.76–0.98 for AD-only, all P < 0.001; Table 2). The 55- and 34-item versions had slightly lower mean change scores compared with the 82-item version, whereas the 15- and 9-item versions were variable, and the four-item version had slightly higher values. For example, in the PREPARE arm, the within-group mean change scores with standard deviation (SD) at 12 months compared with baseline were 0.82 (0.9) for the 82-item survey, 0.73 (0.9) for the 55-item survey, 0.68 (0.9) for the 34-item survey, 0.70 (1.0) for the 15-item survey, 0.75 (1.0) for the 9-item survey, and 0.91 (1.3) for the 4-item survey (Table 2). However, these differences between the 82-item version and shorter surveys were all small and never exceeded 0.23 on the SD scale for the PREPARE arm and 0.12 on the SD scale for the AD-only arm across all time points (Table 2). Within-group results were similar for English- and Spanish-speaking participants (Appendix III and IV).
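As an illustration of how agreement between change scores from two survey versions could be quantified, the sketch below computes a one-way random-effects intraclass correlation, ICC(1,1), for paired measurements. The article does not specify which ICC form was used, so this choice is an assumption, and the function name and data are hypothetical.

```python
from statistics import mean


def icc_oneway(scores_a, scores_b):
    """One-way random-effects ICC(1,1) for two measurements per participant.

    scores_a and scores_b hold each participant's change score from two
    survey versions (for example, the 82-item and a shorter version).
    """
    n = len(scores_a)
    if n != len(scores_b) or n < 2:
        raise ValueError("need two equal-length lists with at least 2 entries")
    k = 2  # measurements per participant
    pairs = list(zip(scores_a, scores_b))
    subject_means = [mean(pair) for pair in pairs]
    grand_mean = mean(subject_means)
    # Between-participant and within-participant mean squares.
    ms_between = k * sum((m - grand_mean) ** 2 for m in subject_means) / (n - 1)
    ms_within = sum(
        (x - m) ** 2 for pair, m in zip(pairs, subject_means) for x in pair
    ) / (n * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)


# Hypothetical change scores from a long and a short survey version.
long_version = [1, 2, 3, 4]
short_version = [2, 1, 4, 3]
print(round(icc_oneway(long_version, short_version), 3))
```

Values near 1 indicate that the shorter version ranks and scales participants' change almost identically to the reference version. The function raises a `ZeroDivisionError` if every score is identical, because the ICC is undefined without any variance.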
Table 2. Within-Group Effect Sizes and Correlation of Mean Change Scores Over Time Using Progressively Shorter ACP Engagement Survey Versions
Between-group effect size estimates for PREPARE vs. AD-only arms were very similar for all versions of the survey at all follow-up time points (Table 3; between-group effect size estimate ranges over the four time points: 82-item [0.24–0.31]; 55-item [0.21–0.26]; 34-item [0.21–0.24]; 15-item [0.20–0.25]; 9-item [0.20–0.24]; and 4-item [0.23–0.29]). As observed for within-group estimates, the 55-item and 34-item versions had slightly lower between-group mean change differences compared with the 82-item version, whereas results were mixed with the 15- and 9-item versions, and the four-item version had slightly higher between-group differences. For example, for PREPARE vs. AD-only arms, the between-group differences of mean change scores with SD at 12 months compared with baseline were 0.30 (0.9) for the 82-item survey, 0.25 (0.9) for the 55-item survey, 0.25 (0.9) for the 34-item survey, 0.29 (1.0) for the 15-item survey, 0.32 (1.0) for the 9-item survey, and 0.40 (1.2) for the 4-item survey (Table 3). However, these differences of mean change scores between the 82-item version and shorter surveys were all small and never exceeded 0.17 on the SD scale across all time points. Results for between-group comparisons of mean change score differences were similar for English- and Spanish-speaking participants (Appendix V and VI).
Table 3. Between-Group Effect Sizes and Differences of Mean Change Scores Over Time Using Progressively Shorter ACP Engagement Survey Versionsᵃ

| Survey version | 1 week: effect size | 1 week: mean change difference (SD)ᵇ | 3 months: effect size | 3 months: mean change difference (SD)ᵇ | 6 months: effect size | 6 months: mean change difference (SD)ᵇ | 12 months: effect size | 12 months: mean change difference (SD)ᵇ |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 82-Item | 0.24 | 0.28 (0.7) | 0.29 | 0.27 (0.8) | 0.31 | 0.29 (0.8) | 0.30 | 0.30 (0.9) |
| 55-Item | 0.21 | 0.27 (0.6) | 0.24 | 0.23 (0.7) | 0.26 | 0.26 (0.8) | 0.24 | 0.25 (0.9) |
| 34-Item | 0.21 | 0.26 (0.7) | 0.22 | 0.21 (0.8) | 0.23 | 0.23 (0.8) | 0.24 | 0.25 (0.9) |
| 15-Item | 0.25 | 0.33 (0.8) | 0.20 | 0.22 (0.8) | 0.24 | 0.26 (0.9) | 0.25 | 0.29 (1.0) |
| 9-Item | 0.24 | 0.36 (0.9) | 0.20 | 0.26 (0.9) | 0.22 | 0.28 (0.9) | 0.24 | 0.32 (1.0) |
| 4-Item | 0.25 | 0.40 (1.0) | 0.23 | 0.31 (1.1) | 0.26 | 0.35 (1.1) | 0.29 | 0.40 (1.2) |

ACP = advance care planning; AD = advance directive; SD = standard deviation.
ᵃ English and Spanish speakers had similar results as shown in Appendix V and VI.
ᵇ t-tests for comparing differences of mean change between progressively shorter survey versions and the original 82-item version had nonsignificant P-values with Bonferroni adjustment for multiple comparisons at a significance level of 0.05, indicating no obvious differences among survey versions.
Using randomized clinical trial data, with four follow-up time points among a large cohort of older adults, we demonstrated that all survey versions were able to detect change in a broad range of ACP behaviors over time in response to ACP interventions. The surveys worked well among both English- and Spanish-speaking participants, although Spanish speakers had higher rates of limited health literacy and were more likely to have less than a high school education. Having several psychometrically sound shortened versions of the ACP Engagement Survey provides flexibility for research and quality improvement initiatives when choosing surveys to measure the effectiveness of ACP programs.
We found that the original 82-item version of the ACP Engagement Survey and five progressively shorter versions (i.e., 55-item, 34-item, 15-item, 9-item, and 4-item) can reliably detect both within-group and between-group differences for ACP interventions over all time points (i.e., one week, three months, six months, and 12 months). Both within-group and between-group effect sizes tended to be higher using the full 82-item survey, suggesting this version may be most appropriate when maximum power is required, for example, for small studies. However, the shorter versions of the survey were all able to detect both within-group and between-group changes, suggesting that they are acceptable alternatives in most clinical and research settings.
This study allowed us to quantify a clinically meaningful change in ACP Engagement Survey scores based on effect sizes using standard thresholds.
Small effect sizes (0.20–0.49) were associated with mean change scores of approximately 0.2–0.3 points. Moderate effect sizes (0.50–0.79) were associated with mean change scores of approximately 0.4–0.5 points. Large effect sizes (≥0.80) were associated with mean change scores of ≥0.6 points. Therefore, the smallest clinically meaningful change in response to an ACP intervention would be approximately 0.2 points, which is evidence that patients are moving along the behavior change pathway, from precontemplation, to contemplation, to preparation, to action. Larger changes of 0.6 or greater likely reflect ACP actions that are further along the behavior change pathway. For example, in a prior validation study of the survey in 559 respondents in two countries, a score change of 1.0 was associated with having completed a prior AD.
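The thresholds above can be expressed as a small lookup. The following is a sketch only: the cut points are the ones quoted in the text, while the function name and the "below small" label for values under 0.20 are hypothetical additions.

```python
def classify_effect_size(d):
    """Label an effect size using the thresholds quoted in the text."""
    size = abs(d)  # direction of change does not affect the label
    if size >= 0.80:
        return "large"
    if size >= 0.50:
        return "moderate"
    if size >= 0.20:
        return "small"
    return "below small"  # hypothetical label; the text gives no name


# Effect sizes taken from the ranges reported in the abstract.
for value in (0.24, 0.30, 0.64, 1.05):
    print(value, classify_effect_size(value))
```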
This study also provided detailed within-group and between-group effect size information for each version of the survey at multiple follow-up time points compared with baseline for the overall cohort as well as for English and Spanish speakers. These results are important because they will allow ACP researchers to calculate power and estimate sample sizes for future clinical trials. Choice of the survey version may be based on the length of the survey desired to reduce response burden, the ACP information important to the research question (as the Behavior Change process questions concerning contemplation, flexibility in surrogate decision making, and asking doctors questions were the most often deleted questions from shorter versions), and the follow-up time proposed (i.e., one week, three months, six months, or 12 months).
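As a rough illustration of how a between-group effect size could feed a sample size estimate, the sketch below uses the standard normal-approximation formula for a two-arm comparison of means, n per arm = 2(z₁₋α/₂ + z₁₋β)² / d². This is a textbook approximation rather than a method described in the article; it ignores clustering, attrition, and repeated measures, and the function name and defaults are hypothetical.

```python
from math import ceil
from statistics import NormalDist


def n_per_arm(effect_size, alpha=0.05, power=0.80):
    """Approximate participants needed per arm for a two-sided test."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_beta = NormalDist().inv_cdf(power)           # quantile for target power
    return ceil(2 * (z_alpha + z_beta) ** 2 / effect_size ** 2)


# Between-group effect sizes at the ends of the range in the abstract.
print(n_per_arm(0.30))  # 175 per arm
print(n_per_arm(0.24))  # 273 per arm
```

The comparison shows why the choice of survey version matters for planning: a modest drop in the expected effect size noticeably increases the number of participants required.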
The strengths of this study include the rigorous and systematic validation of all survey versions, assessment of the survey's ability to detect change over time in response to interventions in English and Spanish speakers and use of published trial data. This study does have some limitations. Generalizability may be limited because the validation only took place in one San Francisco health delivery system, with a predominance of older adults. However, the primary care sample was racially and ethnically diverse. Although inclusion criteria required chronic or serious illness for the trial, we do not know whether the results would be similar among patients from specialty clinics or patients who speak a language other than English or Spanish. Future studies will also need to assess the ability of shorter survey versions to detect change in response to different ACP interventions in varying patient populations and whether the survey can be used to help tailor ACP discussions based on readiness and behaviors that have not yet been completed in the clinical setting.
In conclusion, progressively shorter versions of the ACP Engagement Survey, including a four-item version, are psychometrically sound and able to efficiently and effectively measure change in ACP behaviors in response to ACP interventions. The choice of which survey version to use will depend on overall data collection burden, available resources, and the desire to look at survey subscales or specific survey domains.
Disclosures and Acknowledgments
Although this study was unfunded, the original Veterans Affairs trial was funded by the U.S. Department of Veterans Affairs. Dr. Sudore is also supported in part by a National Institutes of Health, National Institute on Aging K24AG054415 award. The funding sources had no role in the design, conduct, or analysis of this study or in the decision to submit the article for publication. The authors report no conflicts of interest related to the work described in this article. The corresponding author, Dr. Shi, had full access to all the data in the study and takes responsibility for the integrity of the data and accuracy of the data analysis.
Appendix
Appendix I. English-speaking participants: advance care planning engagement scores at baseline, one week, three months, six months, and 12 months for progressively shorter survey versions by study arm.ᵃ
ᵃ P-values in the plots reflect overall intervention group-by-time interactions.
Appendix II. Spanish-speaking participants: advance care planning engagement scores at baseline, one week, three months, six months, and 12 months for progressively shorter survey versions by study arm.ᵃ
ᵃ P-values in the plots reflect overall intervention group-by-time interactions.
Appendix III. English-Speaking Participants: Within-Group Effect Sizes and Correlation of Mean Change Scores Over Time Using Progressively Shorter ACP Engagement Survey Versions
Appendix IV. Spanish-Speaking Participants: Within-Group Effect Sizes and Correlation of Mean Change Scores Over Time Using Progressively Shorter ACP Engagement Survey Versions
Appendix V. English-Speaking Participants: Between-Group Effect Sizes and Differences of Mean Change Scores Over Time Using Progressively Shorter ACP Engagement Survey Versions

| Survey version | 1 week: effect size | 1 week: mean change difference (SD)ᵃ | 3 months: effect size | 3 months: mean change difference (SD)ᵃ | 6 months: effect size | 6 months: mean change difference (SD)ᵃ | 12 months: effect size | 12 months: mean change difference (SD)ᵃ |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 82-Item | 0.13 | 0.23 (0.7) | 0.20 | 0.22 (0.7) | 0.26 | 0.26 (0.8) | 0.23 | 0.25 (0.8) |
| 55-Item | 0.12 | 0.20 (0.6) | 0.13 | 0.16 (0.7) | 0.22 | 0.21 (0.8) | 0.15 | 0.17 (0.8) |
| 34-Item | 0.11 | 0.18 (0.7) | 0.12 | 0.12 (0.7) | 0.19 | 0.16 (0.8) | 0.18 | 0.16 (0.8) |
| 15-Item | 0.18 | 0.28 (0.8) | 0.12 | 0.17 (0.8) | 0.20 | 0.23 (0.9) | 0.19 | 0.22 (0.9) |
| 9-Item | 0.14 | 0.28 (0.8) | 0.12 | 0.20 (0.8) | 0.18 | 0.25 (0.9) | 0.16 | 0.22 (1.0) |
| 4-Item | 0.15 | 0.33 (1.0) | 0.16 | 0.26 (1.0) | 0.23 | 0.36 (1.1) | 0.18 | 0.29 (1.2) |

ACP = advance care planning; AD = advance directive; SD = standard deviation.
ᵃ t-tests for comparing differences of mean change between progressively shorter survey versions and the original 82-item version had nonsignificant P-values with Bonferroni adjustment for multiple comparisons at a significance level of 0.05, indicating no obvious differences among survey versions.
Appendix VI. Spanish-Speaking Participants: Between-Group Effect Sizes and Differences of Mean Change Scores Over Time Using Progressively Shorter ACP Engagement Survey Versions

| Survey version | 1 week: effect size | 1 week: mean change difference (SD)ᵃ | 3 months: effect size | 3 months: mean change difference (SD)ᵃ | 6 months: effect size | 6 months: mean change difference (SD)ᵃ | 12 months: effect size | 12 months: mean change difference (SD)ᵃ |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 82-Item | 0.37 | 0.36 (0.7) | 0.40 | 0.33 (0.8) | 0.37 | 0.32 (0.9) | 0.37 | 0.37 (1.0) |
| 55-Item | 0.35 | 0.36 (0.7) | 0.36 | 0.33 (0.8) | 0.33 | 0.32 (0.8) | 0.36 | 0.35 (1.0) |
| 34-Item | 0.34 | 0.36 (0.7) | 0.30 | 0.32 (0.8) | 0.32 | 0.30 (0.9) | 0.35 | 0.36 (1.0) |
| 15-Item | 0.39 | 0.39 (0.8) | 0.31 | 0.30 (0.9) | 0.29 | 0.30 (0.9) | 0.38 | 0.37 (1.0) |
| 9-Item | 0.39 | 0.46 (0.9) | 0.31 | 0.34 (0.9) | 0.29 | 0.31 (1.0) | 0.38 | 0.43 (1.1) |
| 4-Item | 0.40 | 0.48 (1.1) | 0.34 | 0.38 (1.1) | 0.32 | 0.34 (1.1) | 0.47 | 0.52 (1.2) |

ACP = advance care planning; AD = advance directive; SD = standard deviation.
ᵃ t-tests for comparing differences of mean change between progressively shorter survey versions and the original 82-item version had nonsignificant P-values with Bonferroni adjustment for multiple comparisons at a significance level of 0.05, indicating no obvious differences among survey versions.