Volume 39, Issue 2, Pages 250-258, February 2010
Evaluating Correlation and Interrater Reliability for Four Performance Scales in the Palliative Care Setting
Abstract
Performance scales are used by clinicians to objectively represent a patient's level of function and have been shown to be important predictors of response to therapy and survival. Four different scales are commonly used in the palliative care setting, two of which were specifically developed to more accurately represent this population. It remains unclear which scale is best suited for this setting. The objectives of this study were to determine the correlations among the four scales and concurrently compare interrater reliability for each. Patients were each assessed at the same point in time by three different health care professionals, and all four scales were used to rate each patient. Spearman correlation coefficient values and both weighted and unweighted kappa values were calculated to determine correlation and interrater reliability. The results confirmed highly significant linear correlation among and between all four scales. Whether using a reliability measure that incorporates the concept of “partial credit” for “near misses” or a measure reflecting exact rater agreement, no one scale emerged as having a significantly higher likelihood of agreement among raters. We propose that what may be more important than clinical experience or rater profession is the level of training an individual health care professional rater receives on the administration of any particular performance scale. In addition, given that low levels of exact rater agreement could have substantial clinical implications for patients, we suggest that this parameter be considered in the design of future comparative studies.
Key Words: Performance scales, interrater reliability, correlation, palliative care
Introduction
Widely used in palliative care settings, performance scales (PS) are key to the development of individualized treatment plans for patients/families, efficacy assessment of such plans, and the tracking of disease progression. In addition, it has been consistently demonstrated that, with accurate ratings, PS can strongly inform the prediction of patient survival.1, 2, 3, 4, 5, 6, 7
Two widely used reliable and valid PS are the Karnofsky Performance Status (KPS) and the Eastern Cooperative Oncology Group Performance Status (ECOG). KPS was originally developed to facilitate the objective assessment of performance in patients with cancer.8 Consisting of fewer categories, ECOG is an abridged version of KPS, developed to simplify performance assessment.9 The specific parameters determining an overall rating for both scales include patient's level of ambulation, level of activity, and ability to perform self-care. Previously identified limitations of both KPS and ECOG are of particular relevance for patients in the palliative care setting. Both scales have been criticized for poor sensitivity at lower ends, suggesting possible inaccuracies for patients with advanced disease.10, 11, 12
The Palliative Performance Scale (PPS) and the Australia-modified KPS (AKPS) are both KPS adaptations and were each developed with the intent to improve the evaluation of patients with lower performance status.10, 13 PPS ratings are assigned through an evaluation of five objective parameters: ambulation, evidence of disease, self-care, oral intake, and level of consciousness.13 AKPS differs from the original KPS for the ratings of 40 and below.10 The corresponding descriptions for these ratings on AKPS are intended to clarify for raters a level of function that can be observed and quantified, that is, the AKPS rating of 40 is used to describe a patient who “is in bed more than 50% of the time” vs. the KPS rating of 40 used to describe a patient who is “disabled and requiring special care and assistance.”10
Given the key role PS should play in overall patient care, many centers have mandated PS integration into their clinical assessment and documentation processes. Successful clinical practice or institutional integration of any PS is highly dependent on strong evidence of clinical utility, accuracy, ease of use, and interrater reliability.5 As previously mentioned, both KPS and ECOG have been found to be valid and reliable tools.4, 14 Several studies have addressed validation of PPS,15, 16, 17 but reliability has only recently been investigated.18, 19 Although more formal validation and evaluation of reliability is required, the face validity has been established for AKPS.10
Despite their widespread use in the palliative care setting, little direct evidence exists to guide clinicians as to which PS is best suited for this population. To our knowledge, no previous work has addressed the linear correlations, both across the four PS themselves and between different raters for each scale. In addition, interrater reliability has not been concurrently evaluated. Designs of previous reliability studies for PS have incorporated various statistical measures to determine the likelihood different raters will agree in the same general direction; however, exact agreement is rarely addressed. Given both the interprofessional nature of the field of palliative care and the fact that different centers will assign the task of PS assessment to different health care professionals (HCPs), concurrent evaluation of correlation and interrater reliability of all four scales would provide a foundation of necessary evidence for clinicians. The two main objectives and corresponding hypotheses for this study were:
Sample and Methods
Between March 2007 and May 2007, all patients referred for palliative care assessment at Sunnybrook Health Sciences Centre and the Odette Cancer Centre in Toronto, Ontario, were eligible for inclusion in this prospective quantitative study. Patients were accrued from three different palliative care delivery sites: an outpatient palliative care clinic within an ambulatory regional cancer center, an inpatient oncology unit within a tertiary academic acute care institution (patients referred for palliative care consultation), and an inpatient palliative care unit (patients admitted primarily for end-of-life care).
Ethics approval to conduct the research was obtained from the Sunnybrook Health Sciences Centre Research Ethics Board. For this study, only the rater assessment was captured, requiring neither additional patient contact nor questioning beyond standard care. Consent was, therefore, not required, and there were no exclusion criteria. Patient demographic and disease information was collected from patient charts and information gathered during consultation.
All patients were individually rated using all four scales by a palliative care research assistant (RA), a primary oncology/palliative care nurse (RN), and a specialist palliative care physician (MD). For each patient, assessments from all three raters were made within 24–48 hours of each other. A single individual served as the RA for all three sites. Both the MDs and RNs varied between and within sites. Site of palliative care delivery was not accounted for in the analysis, as relative homogeneity was assumed among all three sites.
Power Analysis
Sample size calculation was conducted by Power Analysis and Sample Size, version 2005 for Windows (NCSS, Kaysville, UT). The unweighted kappa was assumed to be increased by 40% based on the kappa of 0.30 for KPS among all raters. Using two-sided binomial hypothesis test with a target significance level of 0.05 (Type I error), 127 patients were required to detect a difference between the null hypothesis that the proportion was 0.30 and the alternative hypothesis that the proportion was 0.42 (0.30 × 140%). The actual power was 81%, and the actual significance was 0.0416. Patients could be withdrawn for any reason; therefore, a dropout rate of 5% was considered. A total of 134 (127/0.95) patients were required for the study.
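The arithmetic behind these figures can be retraced with a short sketch. The exact-test helper below is an illustration only: it assumes an equal-tailed rejection region (at most 0.025 in each tail), and the function and variable names are hypothetical. The commercial software cited above may construct the region differently, so the computed size and power need not match the reported 0.0416 and 81% exactly.

```python
from math import ceil, comb


def binom_pmf(k, n, p):
    """Probability of exactly k successes in n trials with success probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def exact_two_sided_binomial(n, p0, p1, alpha=0.05):
    """Size and power of an equal-tailed exact binomial test (assumed construction)."""
    pmf0 = [binom_pmf(k, n, p0) for k in range(n + 1)]
    pmf1 = [binom_pmf(k, n, p1) for k in range(n + 1)]

    # Lower critical value: largest count whose lower tail under p0 is <= alpha/2.
    lower, tail = -1, 0.0
    for k in range(n + 1):
        if tail + pmf0[k] > alpha / 2:
            break
        tail += pmf0[k]
        lower = k

    # Upper critical value: smallest count whose upper tail under p0 is <= alpha/2.
    upper, tail = n + 1, 0.0
    for k in range(n, -1, -1):
        if tail + pmf0[k] > alpha / 2:
            break
        tail += pmf0[k]
        upper = k

    reject = [k for k in range(n + 1) if k <= lower or k >= upper]
    size = sum(pmf0[k] for k in reject)   # actual Type I error
    power = sum(pmf1[k] for k in reject)  # probability of rejecting when p = p1
    return size, power


# Alternative proportion: baseline kappa of 0.30 increased by 40%.
alternative_proportion = round(0.30 * 1.40, 2)

# Inflate the required 127 patients for an anticipated 5% dropout rate.
enrolled = ceil(127 / 0.95)

size, power = exact_two_sided_binomial(127, 0.30, alternative_proportion)
print(alternative_proportion, enrolled)
print(f"size={size:.4f}, power={power:.2f}")
```

The first printed line reproduces the 0.42 alternative and the 134-patient target stated in the text; the second line is a cross-check under the stated assumption, not a replication of the original software output.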
Statistics
Inferential and descriptive statistics were measured using Statistical Analysis Software (version 9.1; SAS Institute, Cary, NC) for Windows. Results were expressed as the median (range) for quantitative variables and as proportions for categorical findings.
Correlation
To establish correlation among different raters for each individual scale, Spearman correlation coefficients were calculated using an alpha of 0.05 for all rater pairings (MD/RA, RN/RA, and RN/MD). To establish the correlations between the scales themselves, Spearman correlation coefficients were calculated among each uniprofessional group of raters (MDs, RNs, and RAs) for all possible scale pairings. The range in values is from −1.0 (indicating perfect inverse correlation) to 1.0 (indicating perfect correlation).
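As an illustration of the coefficient described here, the following minimal sketch computes a Spearman correlation as the Pearson correlation of average ranks, which handles tied ratings. The study itself used SAS; the function names and the five example ratings are hypothetical and are not study data.

```python
from math import sqrt


def average_ranks(values):
    """1-based ranks; tied values share the mean of the ranks they occupy."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation applied to the average ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    return cov / sqrt(vx * vy)


# Hypothetical ratings of five patients by a single rater on three scales.
kps = [20, 40, 50, 70, 90]
akps = [20, 30, 50, 60, 90]
ecog = [4, 3, 2, 1, 0]  # ECOG runs in the opposite direction to KPS

print(spearman(kps, akps))  # same ordering of patients
print(spearman(kps, ecog))  # reversed ordering of patients
```

In this toy example the KPS/AKPS pairing yields a coefficient of 1.0 and the KPS/ECOG pairing yields −1.0, which mirrors why negative coefficients are expected for ECOG against the other three scales.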
Interrater Reliability
Given individual patients were assessed by multiple raters, kappa analyses were chosen over intraclass correlation coefficients to examine reliability.20 Both weighted and unweighted values were calculated. Unweighted or simple kappa values represent the measure of agreement beyond chance.21 When calculating unweighted kappa values, zero weight is given to all disagreement, regardless of discrepancy size, thus focusing on agreement that is exact. Weighted kappa calculations in addition assign “partial” credit for agreement that is “near.”22 In theory, one would expect weighted kappa values to be higher than those unweighted when assessing agreement between raters using a tool with multiple rating categories or levels. For both weighted and unweighted kappa statistics, a value of 0.2 or less represents poor agreement, 0.21–0.4 indicates fair agreement, 0.41–0.6 indicates moderate agreement, 0.61–0.8 indicates good agreement, and 0.81–1.0 signifies very good agreement.23 If agreement was found to be poor (kappa value: 0.2 or less), subsequent analysis using the Mann-Whitney U nonparametric test identified group tendencies to rate more positively or negatively. The percentage of exact agreement was also documented for all rater pairings. Comparative results were considered significant at the 5% critical level (two-sided P < 0.05).
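The distinction between the two kappa statistics can be sketched as follows. The source does not state which weighting scheme was applied, so the linear weights used here are an assumption; the function names, the 10-level category list, and the six paired ratings are hypothetical and are not study data.

```python
def cohen_kappa(rater_1, rater_2, categories, weighted=False):
    """Cohen's kappa for two raters; linear partial-credit weights if weighted=True."""
    k = len(categories)
    index = {c: i for i, c in enumerate(categories)}
    n = len(rater_1)

    # Observed joint proportions and the marginal proportions of each rater.
    observed = [[0.0] * k for _ in range(k)]
    for a, b in zip(rater_1, rater_2):
        observed[index[a]][index[b]] += 1 / n
    row = [sum(observed[i]) for i in range(k)]
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]

    def weight(i, j):
        if weighted:
            return 1 - abs(i - j) / (k - 1)  # assumed linear weighting
        return 1.0 if i == j else 0.0        # exact agreement only

    p_obs = sum(weight(i, j) * observed[i][j] for i in range(k) for j in range(k))
    p_exp = sum(weight(i, j) * row[i] * col[j] for i in range(k) for j in range(k))
    # Undefined (division by zero) if chance agreement is already perfect.
    return (p_obs - p_exp) / (1 - p_exp)


def strength_of_agreement(kappa):
    """Descriptive bands as given in the text above."""
    if kappa <= 0.2:
        return "poor"
    if kappa <= 0.4:
        return "fair"
    if kappa <= 0.6:
        return "moderate"
    if kappa <= 0.8:
        return "good"
    return "very good"


levels = list(range(10, 101, 10))   # hypothetical 10-level scale, 10 to 100
rater_a = [30, 40, 50, 60, 70, 80]  # hypothetical ratings
rater_b = [30, 50, 50, 70, 70, 90]  # three exact matches, three one-level misses

unweighted = cohen_kappa(rater_a, rater_b, levels)
weighted = cohen_kappa(rater_a, rater_b, levels, weighted=True)
exact = 100 * sum(a == b for a, b in zip(rater_a, rater_b)) / len(rater_a)

print(round(unweighted, 2), strength_of_agreement(unweighted))
print(round(weighted, 2), strength_of_agreement(weighted))
print(exact)
```

With these hypothetical ratings, half of the pairs agree exactly and the remainder differ by one level, so the weighted value is noticeably higher than the unweighted value, as the paragraph above anticipates.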
Results
Descriptive information for the 134 patients is summarized in Table 1. Gender was evenly distributed, with 52% males and an overall average age of 67 years (range 21–102 years). Of those with a cancer diagnosis (93% of total), 78% had either metastatic or locally advanced disease. The most common malignancies included those of lung (15%), breast (10.5%), and colon (7%). Median ratings of performance status differed substantially based on site (Table 2).
Table 1. Descriptive Information of Sample (n = 134)
| Gender | |
| 70 (52%) | |
| 64 (48%) | |
| Age | |
| 67 | |
| 21–102 | |
| Palliative care delivery site | |
| 59 (44%) | |
| 40 (30%) | |
| 35 (26%) | |
| Diagnosis—malignant (most common); total | |
| 20 | |
| 15 | |
| 7 | |
| 8 | |
| 7 | |
| Diagnosis—nonmalignant; total | |
| 3 | |
| 2 | |
| 4 | |
a Diagnoses include pneumonia, CVA, thrombocytopenia, and respiratory failure (one patient each). CVA = cerebral vascular accident.
Table 2. Median PS Rating (and Range) by Site
| Site | ECOG | KPS | AKPS | PPS |
|---|---|---|---|---|
| Outpatient | 1 (0–4) | 70 (10–100) | 70 (20–100) | 80 (20–100) |
| Inpatient | 3 (1–4) | 50 (10–90) | 40 (10–80) | 50 (10–80) |
| Palliative care unit | 4 (1–4) | 30 (10–60) | 30 (10–60) | 30 (10–60) |
For each rater group, all PPS, KPS, and AKPS pairings were found to be highly significantly correlated (MD ratings in Table 3). Given its inverse configuration, appropriately negative but equally highly significant correlations were found for ECOG and each of the other three scales. For each individual PS, ratings for all rater pairings were found to be highly significantly correlated (Table 4).
Table 3. Correlations (r) Among the Four PSa for MD Raters
| Scale | KPS | ECOG | PPS | AKPS |
|---|---|---|---|---|
| KPS | 1 | −0.9139 | 0.9628 | 0.9836 |
| ECOG | −0.9139 | 1 | −0.9097 | −0.9262 |
| PPS | 0.9628 | −0.9097 | 1 | 0.9662 |
| AKPS | 0.9836 | −0.9262 | 0.9662 | 1 |
aSpearman correlation coefficients, P |
Table 4. Correlations (r) Between Different Raters for Each of the Four PSa
| Rater pairing | KPS | ECOG | PPS | AKPS |
|---|---|---|---|---|
| RA/MD | 0.8702 | 0.8251 | 0.8895 | 0.8932 |
| RA/RN | 0.8656 | 0.8344 | 0.8704 | 0.8273 |
| RN/MD | 0.8490 | 0.7921 | 0.89616 | 0.8582 |
aSpearman correlation coefficients, P |
Using weighted kappa values, interrater reliability for each of the PS, namely, KPS, PPS, AKPS, and ECOG, was almost universally found to be “good” (Table 5, Table 6). Using unweighted kappa values, interrater reliability was found to be “poor” between the RN/MD pairings for both KPS and AKPS. Using nonparametric testing, further analysis of these data indicated the tendency for RN ratings to be significantly higher than those of MDs for both KPS and AKPS. The range of exact agreement between raters was found to be from 38% to 61% (Table 5, Table 6) but less than 40% for the RN/RA and RN/MD pairings for the KPS alone.
Table 5. Interrater Reliability,a Strength of Agreement, and Percent Exact Agreement for KPS and ECOG
| HCP Dyad | KPS: Weighted Kappa | KPS: Strength of Agreement | KPS: Unweighted Kappa | KPS: Strength of Agreement | KPS: % Exact Agreement | ECOG: Weighted Kappa | ECOG: Strength of Agreement | ECOG: Unweighted Kappa | ECOG: Strength of Agreement | ECOG: % Exact Agreement |
|---|---|---|---|---|---|---|---|---|---|---|
| RA/MD | 0.71 | Good | 0.38 | Fair | 50 | 0.65 | Good | 0.46 | Moderate | 59 |
| RA/RN | 0.62 | Good | 0.27 | Fair | 38 | 0.68 | Good | 0.48 | Moderate | 61 |
| RN/MD | 0.58 | Moderate | 0.19 | Poor | 38 | 0.61 | Good | 0.38 | Fair | 53 |
a Kappa: weighted and unweighted.
Table 6. Interrater Reliability,a Strength of Agreement, and Percent Exact Agreement for PPS and AKPS
| HCP Dyad | PPS: Weighted Kappa | PPS: Strength of Agreement | PPS: Unweighted Kappa | PPS: Strength of Agreement | PPS: % Exact Agreement | AKPS: Weighted Kappa | AKPS: Strength of Agreement | AKPS: Unweighted Kappa | AKPS: Strength of Agreement | AKPS: % Exact Agreement |
|---|---|---|---|---|---|---|---|---|---|---|
| RA/MD | 0.72 | Good | 0.39 | Fair | 50 | 0.71 | Good | 0.37 | Fair | 50 |
| RA/RN | 0.64 | Good | 0.23 | Fair | 47 | 0.63 | Good | 0.24 | Fair | 43 |
| RN/MD | 0.63 | Good | 0.23 | Fair | 47 | 0.61 | Good | 0.18 | Poor | 50 |
a Kappa: weighted and unweighted.
Discussion
Accurately rating a patient's performance level is of critical importance, as in many health care settings, more and more clinical decisions are determined solely by a specific PS rating. Clinician investigators are often required to define a specific level of performance that would determine a patient's eligibility for participation in a clinical trial. In addition, a specific PS rating is often included in the criteria for participation in research studies in general. In many settings, health care resources are allocated based solely on a specific PS rating. An example of this is found within the algorithm used by many home care agencies to determine the amount of nursing and personal support hours for which a patient may qualify. Factoring in the prognostic utility of PS, strong support exists for the integration of PS use into the routine care of patients.
Given the possible clinical implications of an inaccurate PS assessment, this study was designed to evaluate interrater reliability and confirm correlation both within and between each of the four commonly used PS in the palliative care setting. These two psychometric properties contribute substantially to the statistical and clinical evidence HCPs appropriately require to support and guide individual, group practice, and/or institutional PS integration.
As originally hypothesized, our study confirms that, among three different rater groups, all four PS are highly significantly correlated. This information is meaningful, as it confirms for the PS a general level of association, that is, raters used each scale according to its original intended use. Clarifying further, our findings confirm that when a rater assigns a particular rating for a patient using one of the four PS, it is likely that subsequently using one of the other three PS, the rater would assign for the same patient a rating in the same general direction. In addition, our study confirms that when using any of the four PS, two different raters will assign for the same patient a rating in the same general direction. A key point to emphasize is, as the earlier description suggests, high correlation levels do not automatically confirm specific agreement between raters.
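The point that high correlation does not imply exact agreement can be made concrete with a small sketch. The ratings below are hypothetical, constructed so that one rater scores every patient one level higher than the other; the helper names are illustrative, and the rank formula used assumes no tied ratings.

```python
def ranks_no_ties(values):
    """Ranks 1..n; assumes all values are distinct (no tie handling)."""
    ordered = sorted(values)
    return [ordered.index(v) + 1 for v in values]


def spearman_no_ties(x, y):
    """Spearman coefficient via 1 - 6*sum(d^2) / (n*(n^2 - 1)); valid without ties."""
    n = len(x)
    d_squared = sum(
        (a - b) ** 2 for a, b in zip(ranks_no_ties(x), ranks_no_ties(y))
    )
    return 1 - 6 * d_squared / (n * (n * n - 1))


def percent_exact_agreement(x, y):
    """Percentage of patients given an identical rating by both raters."""
    return 100 * sum(a == b for a, b in zip(x, y)) / len(x)


# Hypothetical ratings: the second rater is consistently 10 points higher.
md_ratings = [20, 30, 40, 50, 60, 70]
rn_ratings = [30, 40, 50, 60, 70, 80]

rho = spearman_no_ties(md_ratings, rn_ratings)
exact = percent_exact_agreement(md_ratings, rn_ratings)
print(rho, exact)  # prints "1.0 0.0"
```

Both raters order the patients identically, so the correlation is perfect, yet they never assign the same rating to any patient.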
Interrater variability has been proposed to increase with a higher number of rater options (i.e., 10 categorical options for KPS, AKPS, and PPS, and four for ECOG).12 For our study, then, it was hypothesized ECOG would have the highest level of agreement among raters. Consistent with several other comparative reports involving PS,1, 24, 25, 26, 27 our results did not fully confirm this hypothesis. However, it is noteworthy that, of the four, ECOG was the only PS with greater than 50% exact agreement for all three rater pairings.
Reviewing the PS literature, a wide range of terminology and statistical measures have been used to represent the likelihood of agreement between raters. An example is found in the original KPS reliability studies. Using Pearson correlation coefficient values, Yates et al. first reported “moderately high” levels of KPS interrater reliability for patients with advanced cancer.27 Also choosing Pearson correlation coefficient values to evaluate agreement (reported as “very high”), Schag et al. included kappa statistics in their study design, the resulting values indicating “fair to moderate” agreement.28 The authors concluded that KPS had overall “very good” interrater reliability. Choosing intraclass correlation coefficients and subsequent Cronbach's coefficient alpha values, Mor et al. also reported “very high” KPS interrater reliability.2
To examine the reliability of ECOG, Sorenson et al. chose weighted kappa statistics and demonstrated a “moderate” level of agreement between raters.14 Interrater reliability of PPS has only recently been evaluated. Using a web-based case scenario design, Ho et al. examined agreement among administrators and senior clinicians of palliative care institutions.18 Intraclass correlation coefficients and weighted kappa values indicated a “good” level of agreement, which led the authors to confirm the reliability of PPS. The only study examining agreement for PPS in the clinical setting was recently reported by Campos et al.19 Spearman rank correlation coefficients and Cronbach coefficient alpha values were calculated, leading the authors to conclude “excellent” correlation agreement in ratings between an RA, a radiation therapist, and a radiation oncologist.19
Few studies have concurrently compared interrater reliability of different PS using the same study population. With two clinical oncologists as raters, Roila et al. used weighted kappa statistics and reported interrater reliability of both KPS and ECOG to be “very high.”26 Taylor et al. examined interrater reliability of KPS and ECOG using raters from different professions (clinical oncologist, medicine resident, and nurse), and Spearman rank correlation coefficients were chosen to evaluate agreement.4 The authors found each PS to have “good” interrater reliability despite slightly higher values for ECOG. As mentioned previously, this led to the suggestion that “either scale could be used with good inter-rater reliability but the simpler format of the ECOG would minimize potential differences in ratings.”6
Examining general principles of kappa statistics, Landis and Koch suggested weighted kappa values of 0.61 or greater are considered to represent “substantial” agreement; however, a minimum acceptable level has not been universally agreed upon.29 Applying Landis and Koch's assertion to the current study, interrater reliability would be considered substantial for all four PS evaluated. It has been further suggested that incorporating unweighted kappa statistics in a study's design would be “inappropriate,” as size of rating discrepancy is considered a key factor when evaluating agreement.26 This suggestion was made by authors evaluating KPS interrater reliability, who went on to report that exact agreement between ratings of different physicians occurred 61.7% of the time, a value not too dissimilar from that reported in our study (Table 5, Table 6). Attempting to clarify increasingly difficult-to-interpret statistical measures, epidemiologists have demonstrated that weighted kappa values can be very high despite a very low percentage of exact agreement.30 This has led many to advocate that weighted kappa statistics be regarded primarily as a measure of association rather than agreement.30, 31 Similar principles led to the same suggestion when interpreting Pearson correlation coefficient values.32 Given the clinical implications of an inaccurately assessed performance level and the lack of universal agreement on both ideal statistical measures and the strength of agreement considered adequate, we would propose both unweighted kappa values and exact agreement be included as parameters based on which the reliability of a PS could be judged. 
With that in mind for the present study, interrater reliability of KPS and its use in the palliative care setting presents a concern, as a review of our data demonstrates that this PS had the only two rater pairings with less than 40% exact agreement and, using unweighted kappa values, had one of only two rater pairings demonstrating “poor” agreement.
Historically, most studies evaluating interrater reliability have used a homogeneous or uniprofessional group of raters. More recently, study designs have been multiprofessional in nature, using combinations of physicians, nurses, social workers, radiation therapists, and RAs as the raters for the studied population.1, 10, 24, 25, 26, 27, 33 In our study, one consistent RA was used, contrasting with the groups of RNs and MDs who provided ratings. Based on previous work involving KPS, ECOG, and AKPS, we assumed adequate agreement within each HCP group of raters.1, 10, 24, 33 One limitation of our study is that this assumption was not tested through the assessment of intraprofessional agreement within the MD and RN groups. In addition, regardless of clinical role, a lack of rater knowledge on the use of the PS could greatly influence the accuracy of a rating. One previous PS agreement study used non-HCP raters who received a two-hour training session on the use of the KPS. The strength of agreement demonstrated in this study was found to be “excellent” (weighted kappa = 0.97).14 For our study, the RA received informal training on the use of PS, but it was assumed that this was not necessary for the RN and MD groups. To minimize bias, future studies would ideally investigate interrater reliability by including only one HCP from each profession and ensuring adequate training for each rater on the use of PS.
Several other limitations exist with this study. Despite the intended setting of use for PPS and AKPS (i.e., palliative care setting regardless of diagnosis), only 7% of our study population had a noncancer diagnosis. Future studies should be designed to ensure adequate representation of patients with greater diversity in diagnosis. One final limitation is that sample size prevented separate analysis of outpatients and acute care and palliative care unit inpatients.
Conclusions
Because of wide variation in the terminology and statistical measures used to represent the likelihood of PS rater agreement, cumulative evidence from previous work examining PS interrater reliability is difficult to summarize and to draw conclusions from. Paired with a lack of consensus on the strength of agreement considered adequate, limited evidence exists to guide clinicians regarding the most appropriate PS for the palliative care setting. In our study, linear correlation between and among raters for all four PS was confirmed. Interrater reliability was evaluated, and no one PS emerged as having a higher likelihood of agreement among raters from different professions. In the current clinical climate, the use of a PS found to have low levels of exact rater agreement could have substantial clinical implications for patients. We propose that, in addition to correlation coefficients, both weighted and unweighted kappa values as well as exact agreement be included as parameters based on which the reliability of a PS could be judged. In addition, what may be more important than clinical experience or rater profession is the level of training an individual HCP rater receives on the administration of any particular PS. To maximize clinical efficiencies and ensure accuracy of assessments, it is likely to be of great benefit to have multiple HCPs knowledgeable in the use of PS. Adequate training should be included in the design of future comparative studies. Although all patients in this study were receiving palliative care services, to further clarify which PS is best suited for patients in the palliative care setting, follow-up studies should include both the stratification of data into groups representing high, middle, and low patient performance, and rater perception on ease of PS use.
References
- Karnofsky and ECOG performance status scoring in lung cancer: a prospective, longitudinal study of 536 patients from a single institution. Eur J Cancer. 2002;32A:1135–1148
- The Karnofsky Performance Status Scale. An examination of its reliability and validity in a research setting. Cancer. 1984;53:2002–2007
- Improved accuracy of physicians' survival prediction for terminally ill cancer patients using the Palliative Prognostic Index. Palliat Med. 2001;15:419–424
- Observer error in grading performance status in cancer patients. Support Care Cancer. 1999;7:332–335
- Performance status assessment among oncology patients: a review. Cancer Treat Res. 1986;70:1423–1429
- Is the Palliative Performance Scale a useful predictor of mortality in a heterogeneous hospice population? J Palliat Med. 2005;8:503–509
- Prognostication in hospice care: can the Palliative Performance Scale help? J Palliat Med. 2005;8:492–501
- The use of nitrogen mustards in the palliative treatment of cancer. Cancer. 1948;1:634–656
- Appraisal of methods for the study of chemotherapy of cancer in man: comparative therapeutic trial of nitrogen mustard and triethylene thiophosphoramide. J Chron Dis. 1960;11:7–33
- The Australia-modified Karnofsky Performance Status (AKPS) scale: a revised scale for contemporary palliative care clinical practice. BMC Palliat Care. 2005;4:7
- The Edmonton Functional Assessment Tool: preliminary development and evaluation for use in palliative care. J Pain Symptom Manage. 1997;13:10–19
- Can Karnofsky Performance Status be transformed to the Eastern Cooperative Oncology Group scoring scale and vice versa? Eur J Cancer. 1992;8:1328–1330
- Palliative Performance Scale (PPS): a new tool. J Palliat Care. 1996;12:5–11
- Performance status assessment in cancer patients. An inter-observer variability study. Brit J Cancer. 1993;67:773–775
- Validation of the Palliative Performance Scale for inpatients admitted to a palliative care unit in Sydney, Australia. J Pain Symptom Manage. 2002;23:455–457
- Survival prediction of terminally ill cancer patients by clinical symptoms: development of a simple indicator. Jpn J Clin Oncol. 1998;29:156–159
- Validation of the Palliative Performance Scale from a survival perspective. J Pain Symptom Manage. 1999;18:2–3
- A reliability and validity study of the Palliative Performance Scale. BMC Palliat Care. 2008;7:10
- The Palliative Performance Scale: examining its inter-rater reliability in an outpatient palliative radiation oncology clinic. Support Care Cancer. 2009;17:685–690
- Measuring nominal scale agreement among many raters. Psychol Bull. 1971;76:378–382
- A coefficient of agreement for nominal scales. Educ Psychol Meas. 1960;20:37–46
- A new procedure for assessing reliability of scoring EEG sleep recordings. Am J EEG Tech. 1971;11:101–109
- Statistical methods for assessing observer variability in clinical measures. BMJ. 1992;304:1491–1494
- Performance status assessment in cancer patients. Cancer. 1990;65:1864–1866
- The correlation among patients and health care professionals in assessing functional status using the Karnofsky and Eastern Cooperative Oncology Group Performance Status scales. Support Cancer Ther. 2004;2:1–5
- Intra- and interobserver variability in cancer patients' performance status assessed according to Karnofsky and ECOG scales. Ann Oncol. 1991;2:437–439
- Evaluation of patients with advanced cancer using the Karnofsky Performance Status. Cancer. 1980;45:2220–2224
- Karnofsky Performance Status revisited: reliability, validity and guidelines. J Clin Oncol. 1994;2:187–193
- The measurement of observer agreement for categorical data. Biometrics. 1977;33:159–174
- The analysis of ordinal agreement data: beyond weighted kappa. J Clin Epidemiol. 1993;46:1055–1062
- 2 × 2 kappa coefficients: measures of agreement or association. Biometrics. 1989;45:269–287
- Measurement in medicine: the analysis of method comparison studies. Statistician. 1983;32:307–317
- Performance status assessment in home hospice patients using a modified form of the Karnofsky Performance Status scale. J Palliat Med. 2000;3:301–311
PII: S0885-3924(09)01125-7
doi:10.1016/j.jpainsymman.2009.06.013
© 2010 U.S. Cancer Pain Relief Committee. Published by Elsevier Inc. All rights reserved.
