Validity, Reliability, and Diagnostic Accuracy of the Respiratory Distress Observation Scale for Assessment of Dyspnea in Adult Palliative Care Patients
Address correspondence to: Dr. Qingyuan Zhuang, MBBS, MMED(FM), Department of Supportive and Palliative Care, National Cancer Centre Singapore, 11 Hospital Drive, Singapore 169610.
The prevalence and severity of dyspnea increase at the end of life. Many of these patients have difficulty in reporting their symptoms. Accurate surrogate measures are needed for appropriate assessment and treatment. The Respiratory Distress Observation Scale (RDOS) is proposed as a possible scale although more external validation is needed. We set out to validate the RDOS in the context of palliative care patients near the end of life.
Measures
We prospectively studied 122 palliative care patients in a tertiary hospital in Singapore. Prior RDOS training was done using a standardized instructional video. Dyspnea was assessed by RDOS, Dyspnea Numerical Rating Scale, and Dyspnea Categorical Scale. Pain was assessed by Pain Numerical Rating Scale. We measured RDOS inter-rater reliability, convergent validity, and divergent validity. We used area under receiver operating characteristics curve (AUC) analysis to examine the discriminant properties of RDOS using dyspnea self-report as benchmark.
Results
RDOS had good inter-rater reliability with an intraclass correlation of 0.947 (95% CI 0.919–0.976). It showed moderate-to-strong correlation with Dyspnea Numerical Rating Scale (r = 0.702) and Dyspnea Categorical Scale (r = 0.677) and negligible correlation to Pain Numerical Rating Scale (r = 0.080). It showed good discriminant properties of identifying patients with moderate and severe dyspnea with an AUC of 0.874 (95% CI 0.812–0.936). RDOS ≥ 4 predicted patients with moderate and severe dyspnea with a sensitivity of 76.6%, specificity of 86.2%, positive predictive value of 86.0%, and negative predictive value of 76.9%.
Conclusions
The RDOS shows promise and clinical utility as an observational dyspnea assessment tool. Further studies in uncommunicative patients are needed to determine clinical usefulness and generalizability of results.
Proxy assessments of symptom severity are carried out by health care teams in clinical practice. However, accuracy of such subjective assessments may be poor. Studies in advanced cancer and intensive care populations show proxy assessments by physicians and nurses to be significantly lower than, and poorly correlated with, patient self-report. As the assessments of symptom severity influence treatment decisions, there is a need for a better method to assess dyspnea in nonverbal patients.
The Respiratory Distress Observation Scale (RDOS) is an eight-item ordinal scale designed to measure the presence and intensity of respiratory distress in adults (Fig. 1). This scale was developed from a biobehavioral framework by Dr. Margaret L. Campbell.
It is the first and, thus far, the only dyspnea assessment tool intended for assessing the presence and intensity of respiratory distress in nonverbal patients.
Fig. 1. The Respiratory Distress Observation Scale (RDOS) was developed by Dr. Margaret L. Campbell as an observational tool to measure respiratory distress in patients. The above scale is reproduced in full with written permission given by the original scale developer.
Psychometric properties have been previously examined by the original scale developer. The internal consistency (α) across studies ranged from 0.64 to 0.84. Inter-rater reliability was perfect, but this was conducted on only five patients (r = 1.0).
Convergent validity was established through comparison of RDOS against dyspnea self-report (r = 0.38–0.74) in various populations. Although strong correlation (r = 0.74) was found in a combined population of healthy volunteers, patients with postoperative pain, and dyspneic chronic obstructive pulmonary disease (COPD) patients, within-group correlation was poor in dyspneic COPD patients (r = 0.38).
Area under receiver operating characteristic curve (AUC) analysis determined RDOS scores of 0–2 to suggest little or no respiratory distress, whereas a score of ≥3 signified moderate-to-severe distress (AUC 0.795).
Further intensity cutpoints for none, mild, moderate, and severe respiratory distress have been suggested, benchmarked against surrogate ratings by nurse practitioners.
Following development of the original scale, there remains a paucity of independent studies validating the RDOS, none substantiating the cutoff point of RDOS ≥ 3, and none examining inter-rater reliability with an adequate sample size.
We thus set out to evaluate its reliability and validity within our palliative care setting and examine its diagnostic properties in identifying patients with moderate-to-severe dyspnea.
First, we hypothesize that an appropriate RDOS score cutoff will discriminate between none/mild dyspnea and moderate/severe dyspnea with an AUC of at least 0.75. This would provide clinical relevance and utility in identifying patients requiring further symptom management.
Second, we hypothesize that RDOS will have strong reliability by demonstrating concurrent inter-rater reliability. We also hypothesize that RDOS will have moderate convergent and divergent validity against self-reported dyspnea and pain, respectively.
Materials and Methods
Study Setting
The study was conducted prospectively in Singapore General Hospital (SGH), a 1597-bed tertiary hospital with a dedicated palliative care team that provides palliative consultation services.
Study Population
Although the intended population for RDOS is cognitively impaired patients unable to report respiratory distress, it was necessary to use cognitively intact patients to establish preliminary psychometric data. This allowed us to use symptom self-report as a gold-standard benchmark.
Ethics Approval
Approval to conduct the study was granted by the SingHealth Institutional Review Board (CIRB Ref No: 2017/3126) with waiver of written informed consent as the study presented minimal risk of harm to participants.
Participants
We screened all consecutive inpatients referred to our palliative consultation service from 25th June 2018 till 1st September 2018. All patients who met our inclusion/exclusion criteria were approached for consent and recruited into the study once verbal consent was obtained.
We included patients with age ≥ 21 years, at risk of dyspnea (but not referred for management of dyspnea) with one of these diagnoses: lung or pleural malignancies, end-stage renal failure, heart failure, and COPD. We also included patients referred specifically for management of dyspnea. Inclusion criteria were purposefully designed to obtain a sample of patients with a spread of dyspnea severity from none to severe.
We excluded patients who were unable to provide consent or refused to participate. RDOS is not valid if the patient is quadriplegic or has bulbar amyotrophic lateral sclerosis, and thus these patients were excluded.
We established, a priori, that a sample size of 113 was needed to detect a difference of 0.15 between the null hypothesis of AUC = 0.60 and the alternative hypothesis of AUC = 0.75 with 80% power at 5% two-sided type 1 error. An AUC of 0.75 would establish the RDOS to have clinically acceptable discriminating properties. An assumption was made that the proportion of moderate and severe dyspnea would be 40% of the total included patients based on a retrospective audit of prior palliative care referrals over two months.
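The text does not state which formula or software produced the sample size of 113. As a rough plausibility check only, the sketch below assumes the Hanley-McNeil approximation for the standard error of an AUC and a normal-approximation power formula; both are assumptions and not necessarily the authors' method. The function names are hypothetical. Under these assumptions, 113 patients with 40% moderate or severe dyspnea gives a power of roughly 0.8 for detecting AUC = 0.75 against AUC = 0.60.

```python
from math import sqrt
from statistics import NormalDist


def hanley_mcneil_se(auc: float, n_pos: int, n_neg: int) -> float:
    """Standard error of an AUC under the Hanley-McNeil approximation (assumed method)."""
    q1 = auc / (2 - auc)
    q2 = 2 * auc**2 / (1 + auc)
    variance = (
        auc * (1 - auc)
        + (n_pos - 1) * (q1 - auc**2)
        + (n_neg - 1) * (q2 - auc**2)
    ) / (n_pos * n_neg)
    return sqrt(variance)


def auc_power(auc0: float, auc1: float, n_total: int,
              prevalence: float, alpha: float = 0.05) -> float:
    """Approximate power of a two-sided test of AUC = auc0 when the true AUC is auc1."""
    n_pos = round(n_total * prevalence)   # patients with moderate/severe dyspnea
    n_neg = n_total - n_pos               # patients with none/mild dyspnea
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    se0 = hanley_mcneil_se(auc0, n_pos, n_neg)  # SE under the null hypothesis
    se1 = hanley_mcneil_se(auc1, n_pos, n_neg)  # SE under the alternative
    z = (abs(auc1 - auc0) - z_alpha * se0) / se1
    return NormalDist().cdf(z)


# Values taken from the text: AUC 0.60 vs 0.75, n = 113, 40% prevalence, alpha = 0.05.
power = auc_power(auc0=0.60, auc1=0.75, n_total=113, prevalence=0.40)
print(f"Approximate power: {power:.3f}")
```

Other variance estimators for the AUC would give somewhat different sample sizes, so agreement here only indicates that the stated figure is of a plausible magnitude.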
Measures
Patient demographics included age, gender, race, primary diagnosis, and Palliative Performance Scale, version 2, (PPS).
The scale to be evaluated is RDOS (Fig. 1), an 8-item ordinal scale to measure the presence and intensity of respiratory distress. Each parameter is scored from 0 to 2 points, and the points are summed. Scale scores range from 0 signifying no distress to 16 signifying the most severe distress.
The reference standard for dyspnea measurement is self-report. We use two single-item verbal report scales to assess current dyspnea severity. The Dyspnea Numeric Rating Scale (Dyspnea-NRS) is a widely used and valid scale to measure dyspnea severity. It is scored from 0 to 10, labeled with verbal anchors (e.g., “nothing at all” to “maximal”).
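The scoring rule described above (eight items, each scored 0 to 2, summed to a total of 0 to 16) can be sketched as follows. This is a minimal illustration only: the items are referred to by position because the item definitions appear in Fig. 1, which is not reproduced in this text, and the function name and example ratings are hypothetical.

```python
from typing import Sequence

N_ITEMS = 8          # the scale has eight items
MAX_ITEM_SCORE = 2   # each item is scored 0, 1, or 2


def rdos_total(item_scores: Sequence[int]) -> int:
    """Sum eight item scores into a total between 0 and 16, validating the input."""
    if len(item_scores) != N_ITEMS:
        raise ValueError(f"expected {N_ITEMS} item scores, got {len(item_scores)}")
    for score in item_scores:
        if score not in range(MAX_ITEM_SCORE + 1):
            raise ValueError(f"item score {score!r} is outside 0..{MAX_ITEM_SCORE}")
    return sum(item_scores)


# Hypothetical ratings for one patient, listed in item order.
example_scores = [1, 2, 0, 0, 1, 0, 0, 1]
total = rdos_total(example_scores)
print(total)  # 5
```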
The Dyspnea Categorical Verbal Descriptor Scale (Dyspnea-Cat) is a four-level categorical scale to describe severity of dyspnea (none, mild, moderate, and severe). It has been shown to have strong correlation to the Dyspnea-NRS.
Practical dyspnea assessment: relationship between the 0-10 Numerical rating scale and the four-level categorical verbal Descriptor scale of dyspnea intensity.
The reference standard for pain measurement is self-report. We use the Pain Numeric Rating Scale (Pain-NRS), which is widely used and validated to measure pain severity. It is scored from 0 to 10, labeled with verbal anchors “nothing at all” to “maximal.”
Two raters were trained in the use of RDOS with the aid of an instructional video obtained from the original scale developer. These 2 raters were a palliative physician (Rater 1) and an advanced practitioner nurse (Rater 2) with 10 years of clinical experience. This was followed by practice scoring of RDOS on real patients until a score difference of ≤1 was obtained. We decided, a priori, that a score difference of ≤1 would be deemed clinically acceptable.
Inpatient referrals to the palliative care team were screened daily by Rater 1, and patients who met the inclusion criteria were identified. After obtaining verbal consent, the rater(s) would first proceed to administer the RDOS. This would be followed by obtaining patient self-report on dyspnea and pain.
To assess inter-rater reliability, 50 patients were recruited via convenience sampling. Raters 1 and 2 would conduct simultaneous blinded scoring of the RDOS. The sample size of 50 was decided a priori, giving a 95% confidence interval width of 0.2 for intraclass correlation (ICC).
Statistical Analyses
Inter-rater reliability was examined with the Analysis of Variance (ANOVA) estimator of ICC and limits of agreement by the Bland-Altman plot. Convergent and divergent validity was ascertained by using Spearman's rank-order correlations to examine association of RDOS with Dyspnea-NRS, Dyspnea-Cat, and Pain-NRS.
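For readers unfamiliar with Spearman's rank-order correlation, the following is a minimal pure-Python sketch: each variable is converted to ranks (tied values share the mean of their ranks) and the Pearson correlation of the ranks is taken. The analyses in this study were run in SPSS; the function names and the six paired scores below are hypothetical and are not study data.

```python
from math import sqrt
from typing import Sequence


def average_ranks(values: Sequence[float]) -> list[float]:
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions i..j are 0-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman's rho: the Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical paired scores for six patients (not study data).
rdos_scores = [0, 2, 3, 5, 7, 4]
dyspnea_nrs = [0, 1, 4, 6, 9, 3]
rho = spearman(rdos_scores, dyspnea_nrs)
print(f"Spearman rho: {rho:.3f}")
```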
Kruskal-Wallis Test was used to assess difference between RDOS scores of patients across categories of dyspnea severity (none, mild, moderate, and severe). We used box and whisker plots to illustrate RDOS median and interquartile range against dyspnea self-report ratings. AUC analysis was used to illustrate the diagnostic ability of the RDOS in identifying patients with moderate and severe dyspnea. We calculated the sensitivity, specificity, positive predictive value, and negative predictive value for various RDOS cutoff points and used the Youden Index to detect the optimal RDOS score cutoff for our sample. This was compared with the original cutoff of RDOS ≥ 3 proposed by Dr. Campbell.
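The cutoff selection described above can be illustrated with a short sketch. It computes the AUC as the probability that a patient with moderate or severe dyspnea has a higher score than a patient with none or mild dyspnea (ties counted as one half), then scans candidate cutoffs for the rule "score ≥ cutoff" and keeps the one with the largest Youden Index, J = sensitivity + specificity − 1. The function names and the two small score lists are hypothetical and are not study data; the study itself used SPSS.

```python
from typing import Sequence


def auc_mann_whitney(pos: Sequence[int], neg: Sequence[int]) -> float:
    """AUC as the proportion of (positive, negative) pairs ranked correctly; ties count half."""
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def best_cutoff_by_youden(pos: Sequence[int], neg: Sequence[int]):
    """Return (cutoff, J, sensitivity, specificity) maximising J for the rule score >= cutoff."""
    best = None
    for cutoff in sorted(set(pos) | set(neg)):
        sensitivity = sum(s >= cutoff for s in pos) / len(pos)
        specificity = sum(s < cutoff for s in neg) / len(neg)
        j = sensitivity + specificity - 1
        if best is None or j > best[1]:
            best = (cutoff, j, sensitivity, specificity)
    return best


# Hypothetical scores (not study data).
moderate_severe = [3, 4, 5, 5, 6, 8]   # self-reported moderate or severe dyspnea
none_mild = [0, 1, 2, 2, 3, 3]         # self-reported none or mild dyspnea

auc = auc_mann_whitney(moderate_severe, none_mild)
cutoff, j, sens, spec = best_cutoff_by_youden(moderate_severe, none_mild)
print(f"AUC = {auc:.3f}; best cutoff >= {cutoff} (J = {j:.3f})")
```

The Youden Index weights sensitivity and specificity equally; a setting that is more concerned about overtreatment or undertreatment might reasonably choose a different criterion.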
Statistical analyses were carried out using IBM SPSS Statistics, version 25, software.
Results
We screened 472 referrals, and a total of 132 inpatients who fulfilled the inclusion/exclusion criteria were approached during the study interval. In all, 10 patients refused participation because of fatigue (n = 4) or gave no reason (n = 6). Patient characteristics are summarized in Table 1.
Table 1. Baseline Characteristics of Study Participants
Of the 122 participants, mean age was 67.9 (standard deviation = 12.9) years, 43.4% were female and more than 80% were of Chinese ethnicity. Eighty-seven (71.3%) participants had a primary cancer diagnosis, whereas the rest had noncancer diagnoses. Seventy-one (58.1%) participants were chair/bed bound with a PPS of ≤50, and 28 (23%) were completely bedfast requiring total care with a PPS of ≤30.
Median Dyspnea-NRS score was 5 with a range from 0 to 10. Twenty-two (18.0%) patients had no dyspnea, 36 (29.5%) had mild dyspnea, 45 (36.9%) had moderate dyspnea, and 19 (15.6%) had severe dyspnea. Very few of these patients (19.7%) had concurrent pain symptoms with a median pain score of 0 (0–10).
RDOS scores ranged from 0 to 10 with a median score of 3. There was high positive correlation between RDOS and Dyspnea-NRS (r = 0.702) and moderate positive correlation between RDOS and Dyspnea-Cat (r = 0.677). There was negligible correlation between RDOS and Pain-NRS (r = 0.08; all P < 0.01).
Figure 2 shows the distribution of RDOS scores across dyspnea severity categories. It shows a trend of increasing median RDOS scores as dyspnea severity increases. Fifty-eight (47.5%) participants had none or mild dyspnea, and they had a median RDOS score of 2 with a range from 0 to 5. Sixty-four (52.5%) participants had moderate or severe dyspnea, and they had a median RDOS score of 5 with a range from 1 to 10. Median RDOS scores across patient groups with none, mild, moderate, and severe dyspnea were significantly different using Kruskal-Wallis Test (P < 0.05).
Fig. 2. Box and whisker plot of median RDOS scores against dyspnea categorical score.
Figure 3 shows receiver operating characteristic curve analysis. The RDOS had clinically significant ability to discriminate patients with moderate-to-severe dyspnea from none to mild dyspnea with an AUC of 0.874 (0.812–0.936). Table 2 shows diagnostic values of varying RDOS cutoff points. We use Youden's Index to propose the optimal cutoff point for our population.
Our proposed cutoff value of RDOS ≥ 4 has the highest Youden Index of 0.628, with a sensitivity of 76.6%, specificity of 86.2%, positive predictive value of 86.0%, and negative predictive value of 76.9% in identifying patients who have moderate-to-severe dyspnea.
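The reported diagnostic values at RDOS ≥ 4 can be checked arithmetically. The 2 × 2 counts below are not reported in the text; they are back-calculated here as the only whole-number counts consistent with the group sizes (64 with moderate or severe dyspnea, 58 with none or mild) and the reported percentages, and should be read as an assumption.

```python
# Back-calculated counts at the cutoff RDOS >= 4 (an assumption, not reported data).
tp, fn = 49, 15   # moderate/severe dyspnea: RDOS >= 4, RDOS < 4  (49 + 15 = 64)
fp, tn = 8, 50    # none/mild dyspnea:       RDOS >= 4, RDOS < 4  (8 + 50 = 58)

sensitivity = tp / (tp + fn)             # 49/64 = 0.766
specificity = tn / (tn + fp)             # 50/58 = 0.862
ppv = tp / (tp + fp)                     # 49/57 = 0.860
npv = tn / (tn + fn)                     # 50/65 = 0.769
youden = sensitivity + specificity - 1   # 0.628

print(f"sensitivity={sensitivity:.1%} specificity={specificity:.1%} "
      f"PPV={ppv:.1%} NPV={npv:.1%} Youden={youden:.3f}")
```

Note that the predictive values depend on the proportion of patients with moderate or severe dyspnea in the sample (52.5% here) and would differ in a population with a different prevalence.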
Fig. 3. Results of the receiver operating characteristic curve analysis.
There was strong inter-rater reliability of the RDOS between two trained raters (a nurse and a doctor) with an ICC of 0.947 (95% CI 0.919 to 0.976). Figure 4 shows the limits of agreement using the Bland-Altman Plot. There was a mean RDOS score difference of 0 (standard deviation = 0.74), and 47 (94%) ratings had an absolute score difference of ≤1, which was clinically acceptable.
Standardized, well-developed, and clinically useful tools for dyspnea assessment are needed for nonverbal palliative care patients. The RDOS appears promising, and we seek to further explore its statistical and clinical validity within our population. The results of our study demonstrate moderate-to-strong convergent validity, divergent validity, acceptable diagnostic properties, and strong inter-rater reliability.
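The numerical limits of agreement are not stated in the text. Assuming the conventional Bland-Altman definition (mean difference ± 1.96 standard deviations of the differences), they can be derived from the reported mean difference and standard deviation as sketched below; the function name is hypothetical.

```python
def limits_of_agreement(mean_diff: float, sd_diff: float, z: float = 1.96):
    """Conventional 95% Bland-Altman limits: mean difference +/- z * SD of the differences."""
    return mean_diff - z * sd_diff, mean_diff + z * sd_diff


# Reported values: mean difference 0, standard deviation 0.74.
lower, upper = limits_of_agreement(mean_diff=0.0, sd_diff=0.74)
print(f"Limits of agreement: {lower:.2f} to {upper:.2f}")
```

Under this assumption, the limits are about ±1.45 RDOS points, which can be compared with the prespecified acceptable difference of ≤1.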
First, the RDOS demonstrates a moderate-to-strong relationship with dyspnea self-report, establishing convergent validity with standard dyspnea severity scales. We also show no relationship of RDOS with pain self-report, which establishes divergent validity. Our findings comport with Dr. Campbell's initial findings, further establishing generalizability.
Second, we showed significant difference in RDOS scores among patients with varying dyspnea severities. The discriminant capability of this tool is clinically significant (as determined a priori), and our proposed new cutoff of ≥4 provided the best sensitivity and specificity tradeoffs. This is in contrast to the previous proposed cutoff of ≥3, which we felt did not demonstrate sufficiently high specificity (62.1%) within our population.
The increased number of false positives due to a lower cutoff score may result in overtreatment of these vulnerable patients who are cognitively impaired. Further external validation of RDOS cutoff is needed.
Third, our study shows that the RDOS can be administered reliably after standardized training is given. The larger sample size provides better precision for the estimation of inter-rater reliability. A prior unpublished study using this tool in Singapore had shown significant inter-rater variability, and results were hypothesized to be due to lack of standardized training. The original scale developer had shown perfect inter-rater reliability but only on a small sample size of five patients.
Our study had a few limitations. We were unable to completely blind Rater 1 to patients' clinical information as Rater 1 was involved in patient screening. This could possibly bias RDOS scoring. We tried to reduce bias by ensuring only information relevant for trial screening was accessed. Rater 2 was completely blinded to patients' prior clinical information. Given that we achieved good score agreement between Raters 1 and 2 without unidirectional bias during disagreements (Fig. 4), we suggest that the impact of such a bias may be limited.
Raters had no knowledge of prior dyspnea scores as dyspnea self-report was collected only after RDOS scores were finalized. However, this could bias the dyspnea self-report scoring. There should be no prior knowledge of RDOS scores in an ideal situation. We tried to reduce bias by training our raters to follow a standard script during scale administration.
RDOS was designed to measure respiratory distress in nonverbal patients, which is a common situation near the end of life. However, we needed to use cognitively intact patients so that we could benchmark against gold standard (symptom self-report). This limits generalization of our results although a majority of our patients were possibly near the end of life as they had low PPS scores.
Ninety-six (78.7%) participants had a PPS of ≤60, whereas 28 (23.0%) had a PPS of ≤30, which gave a predicted median survival of 43 days and 19 days, respectively.
An important aspect of the RDOS not measured in this study would be its responsiveness and sensitivity to change. Assessment of its ability to detect clinically relevant changes in dyspnea symptoms over time provides additional valuable insight into the discrimination properties of a clinical tool.
We have plans for future studies in nonverbal palliative care patients to further substantiate clinical validity and utility.
Conclusions
The RDOS shows strong concurrent inter-rater reliability, convergent validity, and divergent validity, suggesting its reliability and validity as a potential observational tool in dyspnea assessment for palliative care patients. More importantly, it shows clinical utility by demonstrating good discriminant properties in detecting patients with moderate and severe dyspnea. We propose a new RDOS cutoff of ≥4, which provides a sensitivity of 76.6% and specificity of 86.2%. Further studies in uncommunicative palliative care patients are needed to determine clinical usefulness and generalizability of results.
Disclosures and Acknowledgments
The authors thank Dr. Margaret L. Campbell for her written permission to use the RDOS and its training video. The authors also thank APN Stella Goh Seow Lin for her invaluable contributions.