Background

Lung cancer screening with low-dose computed tomography (LDCT) is currently recommended by several international associations [1, 2]. A meta-analysis of randomised controlled trials (RCTs) demonstrated a reduction in lung cancer-related mortality with LDCT compared to control groups in high-risk smoking populations [3]. However, whilst earlier diagnosis is associated with improved outcomes, screening also has unintended harms.

Of particular relevance to this paper is the limited characterisation of the potential for psychosocial consequences of lung cancer screening. There have been four lung cancer screening RCTs to date which have evaluated psychosocial consequences: the National Lung Screening Trial (NLST [4]), the Dutch-Belgian Nederlands-Leuvens Longkanker Screenings Onderzoek (NELSON [5]), the United Kingdom Lung Cancer Screening trial (UKLS [6]), and the Danish Lung Cancer Screening Trial (DLCST [7]). Only two studies (UKLS and DLCST) included the whole cohort in their psychosocial evaluation, and only one study (DLCST) used a condition-specific questionnaire. The DLCST reported more negative psychosocial consequences (affecting behaviour, dejection, and negative impact on sleep) in both the control and LDCT groups over 5 years of annual screening using the condition-specific Consequences Of Screening Lung Cancer (COS-LC) questionnaire [8]. The DLCST performed a nested matched cohort study and observed that those who received false positive (FP) results had more negative psychosocial consequences compared with the control group and participants with true negatives in the short term [9]. FP results occur when a screen result is positive or indeterminate for cancer in a person who does not have cancer [10]. The rate of FPs from baseline LDCT in lung cancer screening in a meta-analysis of RCTs was 21% [3]. No significant long-term psychosocial consequences were noted in the DLCST; however, the DLCST reported that those with FPs had increased healthcare use in the years after their screening result [11, 12]. The UKLS administered the Hospital Anxiety and Depression Scale (HADS), Revised 6-item Cancer Worry Scale (CWS-R), and Satisfaction with Decision Scale to their whole cohort [13]. HADS scores were within the normal range for both groups; however, the control group reported lower Satisfaction with Decision to participate scores than the intervention group. Participants who were referred to multidisciplinary meetings in the screening arm experienced more short-term lung cancer distress, but there was no evidence of long-term consequences.

In the NLST, only 16 of the 23 sites invited participants in the LDCT screening arm to complete the State-Trait Anxiety Inventory (STAI) and Short Form 36-item questionnaire (SF-36) [14]. There was likely no difference between the groups (participants with true positive scans, scans with significant incidental findings, and negative LDCTs) in health-related quality of life (HRQoL) and anxiety measures [3]. The NELSON study included a random sample of 733 participants from each trial arm (LDCT screening and control) [15]. They used the Short Form 12-item questionnaire (SF-12), STAI, and Impact of Event Scale (IES). Participants with intermediate LDCT results had elevated cancer-specific distress two months after their result, with no differences in measures between the groups at 2 years.

Another study, the Pan-Canadian Early Detection of Lung Cancer Study, assessed HRQoL using the SF-12 questionnaire, the EuroQol questionnaire, and the STAI [16]. LDCT screening was reported to have no overall impact on HRQoL; however, a proportion of participants were noted to have increased anxiety levels (number needed to harm = 7) which persisted at 12 months.

A potential limitation of almost all studies to date is the use of generic questionnaires which assess HRQoL without the context of the underlying condition [17]. These questionnaires have not had their psychometric performance evaluated in the target population and risk not capturing what is relevant [18]. Some studies have attempted to solve this issue by using multiple generic HRQoL questionnaires; however, this can fail to completely address the question and can introduce redundancy through repetition. The COnsensus-based Standards for the selection of health Measurement INstruments (COSMIN) risk of bias checklist was developed to evaluate the quality of Patient-Reported Outcome Measures (PROMs) in systematic reviews [19]. A systematic review assessing the quality of PROMs in the evaluation of psychosocial consequences in colorectal cancer reported that 90% of PROMs lacked content validity according to the COSMIN checklist [20].

The aims of this study were to validate an Australian-English version of the COS-LC and to describe the short-term psychosocial impacts of lung cancer screening among an Australian high-risk cohort participating in the International Lung Screen Trial (ILST; NCT02871856).

Methods

Questionnaire translation and content validity assessment

The COS-LC questionnaire consists of four core themes (anxiety, behaviour, dejection, and negative impact on sleep) and twelve lung cancer screening-specific themes (focus on symptoms, stigmatisation, introvert, harms of smoking, self-blame, lung cancer, calm, social relations, existential values, impulsivity, empathy, and regretful still smoking). The questionnaire is divided into two parts: part 1 can be used at any timepoint (before, during, and after screening), whereas part 2 incorporates longer-term consequences and is administered after the screening result/diagnosis. In part 1, higher scores indicate poorer outcomes; however, part 2 measures the absolute change in either direction.

The Danish COS-LC was translated to Australian-English in Denmark by a three-member bilingual panel. A lay panel of five participants aged 55 to 80 years in Melbourne, Australia assessed the initial translation’s functionality and ease of understanding as part of a double panel translation process [21]. Participants of the lay panel were volunteers who were invited to this process via flyers in a local Australian centre’s outpatient clinic. The lay panel was balanced for participant sex, with at least one participant in each decade of age from 55 to 80 years. Participants were selected based on availability to attend the group session, with health professionals excluded from the process. The group interview lasted approximately 2.5 h.

Study design and participants

The ILST is a prospective cohort study with over 2000 Australian participants enrolled. Participants are men and women aged 55 to 80 years who are current or former smokers with either a PLCOm2012 6-year risk of lung cancer ≥ 1.51% or a ≥ 30 pack-year history of smoking. Participants undergo baseline LDCT and a 2-year LDCT. The full ILST protocol has been published [22]. LDCT results were given a category (CAT) based on likelihood of malignancy, which then dictated the nodule management as per the protocol (Fig. 1 [22]) [23]. CATs are also defined in Fig. 1.
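As a minimal sketch, the entry criteria summarised above can be written as a simple predicate. The function and parameter names are hypothetical, the PLCOm2012 risk is assumed to have been calculated elsewhere and supplied as a proportion, and the definitive criteria remain those of the published protocol [22].

```python
def is_eligible(age, ever_smoker, plco_m2012_risk, pack_years):
    """Return True if a person meets the entry criteria summarised in the text.

    Hypothetical helper for illustration only.
    plco_m2012_risk is the 6-year lung cancer risk as a proportion
    (0.0151 corresponds to 1.51%), assumed to be computed elsewhere.
    ever_smoker is True for current or former smokers.
    """
    in_age_range = 55 <= age <= 80
    # Either risk criterion is sufficient.
    high_risk = plco_m2012_risk >= 0.0151 or pack_years >= 30
    return bool(in_age_range and ever_smoker and high_risk)
```

For example, a 60-year-old former smoker with a 20 pack-year history and a PLCOm2012 risk of 2% would qualify on the risk criterion alone.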

Fig. 1 Initial LDCT (T0) ILST lung nodule management protocol with COS-LC timepoints

Australian participants, across five centres in four states, were invited to participate in the quality-of-life study. The COS-LC questionnaires were administered as described in Fig. 1 via paper questionnaires or via electronic surveys. Each site collected their responses prospectively.

Data analysis

For confounder adjustment, age, sex, smoking status, pack-years, education level, work and PLCOm2012 scores were collected. Baseline differences between risk CATs were assessed using a non-parametric Kruskal-Wallis Monte Carlo test for continuously valued data and a Pearson chi-squared Monte Carlo test for categorical variables.
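To illustrate the idea behind a Monte Carlo Kruskal-Wallis test, the sketch below computes the H statistic and approximates its p-value by randomly reallocating the pooled observations to groups. It is not the SAS procedure used in this study; the function names, the number of draws, and the seed are assumptions made for illustration. The tie correction is omitted because it depends only on the pooled values, is therefore identical in every reallocation, and does not change the Monte Carlo p-value.

```python
import random


def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic using mid-ranks for ties (no tie correction)."""
    pooled = sorted(value for group in groups for value in group)
    n_total = len(pooled)
    # Map each distinct value to the mean of the 1-based ranks it occupies.
    rank_of = {}
    i = 0
    while i < n_total:
        j = i
        while j < n_total and pooled[j] == pooled[i]:
            j += 1
        rank_of[pooled[i]] = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        i = j
    rank_term = sum(
        sum(rank_of[v] for v in group) ** 2 / len(group) for group in groups
    )
    return 12.0 / (n_total * (n_total + 1)) * rank_term - 3 * (n_total + 1)


def monte_carlo_p_value(groups, n_draws=2000, seed=0):
    """Approximate the p-value by shuffling group membership n_draws times."""
    rng = random.Random(seed)
    observed = kruskal_wallis_h(groups)
    pooled = [value for group in groups for value in group]
    sizes = [len(group) for group in groups]
    hits = 0
    for _ in range(n_draws):
        rng.shuffle(pooled)
        shuffled, start = [], 0
        for size in sizes:
            shuffled.append(pooled[start:start + size])
            start += size
        # Small tolerance guards against floating-point rounding.
        if kruskal_wallis_h(shuffled) >= observed - 1e-12:
            hits += 1
    # Add-one estimator keeps the p-value strictly positive.
    return (hits + 1) / (n_draws + 1)
```

In this sketch the groups would correspond to the risk CATs and the values to a continuous covariate such as age or pack-years.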

COS-LC validation was performed using a random sample of 200 Australian ILST participants, in accordance with COSMIN Risk of Bias checklist for PROMs [19]. Construct validity was assessed in Rasch item response theory models. Aspects of construct validity tested were: unidimensionality, local response dependency and differential item functioning with respect to the above listed covariates. Reliability was assessed using classical test theory (Cronbach’s alpha). The Benjamini-Hochberg procedure was used to adjust for multiple testing.
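The reliability and multiple-testing steps can be illustrated with a short sketch. It assumes complete item responses arranged as one list per item, and the function names are hypothetical; it shows the standard formulas only and is not the software used for the analyses reported here.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """Cronbach's alpha for a scale.

    item_scores: one list per item, aligned so that position r in every
    list belongs to the same respondent (assumes no missing responses).
    """
    k = len(item_scores)
    totals = [sum(responses) for responses in zip(*item_scores)]
    item_variance_sum = sum(variance(item) for item in item_scores)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted
```

An adjusted p-value below the chosen threshold (for example 0.05) corresponds to rejection under the Benjamini-Hochberg procedure at that false discovery rate.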

The COS-LC scales’ scores from the baseline pre-screening assessment and the assessment 1-month after result letter/diagnosis of the Australian ILST cohort were analysed in linear regression models. In these models the mean differences between CATs were estimated, both unadjusted and adjusted for sex, age, smoking status and pack-years, education, work status and PLCO, parameterised such that estimates at 1-month follow-up were interpreted as mean differences beyond differences at baseline. Potential bias because of differential attrition between CATs was dealt with by weighting the available observations by the inverse of the probability of this observation being present; the latter was estimated from logistic regression models using the available covariates. Inference was corrected for both the repeated measurements and the weighting using the method of generalized estimating equations.
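The weighting step can be sketched as follows. This is a deliberate simplification: the probability of an observation being present is taken as the response proportion within each CAT, which is what a logistic regression with CAT as its only covariate would give, whereas the analysis described above used the available covariates. The generalized estimating equations step is not shown, and all names are hypothetical.

```python
def inverse_probability_weights(categories, observed):
    """Weight each observed record by 1 / P(observed | category).

    categories: category label per participant (for example "CAT1").
    observed:   True if the follow-up questionnaire is present.
    Missing records receive weight 0.0. Simplified illustration only.
    """
    n_by_cat, seen_by_cat = {}, {}
    for cat, seen in zip(categories, observed):
        n_by_cat[cat] = n_by_cat.get(cat, 0) + 1
        seen_by_cat[cat] = seen_by_cat.get(cat, 0) + (1 if seen else 0)
    weights = []
    for cat, seen in zip(categories, observed):
        if seen:
            # 1 / (seen / n) = n / seen
            weights.append(n_by_cat[cat] / seen_by_cat[cat])
        else:
            weights.append(0.0)
    return weights


def weighted_mean(values, weights):
    """Mean of values weighted by the supplied weights."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)
```

With these weights, each category contributes in proportion to its enrolled size rather than its responding size, so a category with more attrition is not under-represented in the pooled estimate.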

Analyses were performed with SAS v9.4, except for the analyses in the Rasch models, which were performed in DIGRAM.

This study was funded by the National Health and Medical Research Council of Australia. Funding sources had no role in study design.

Results

Field testing

Minor linguistic alterations were made to 22 of the part 1 items and 1 of the part 2 following field testing. All items were found to be relevant by the participants and content validity of the COS-LC was established via field testing amongst ILST screening participants. No new items or scales were added to the COS-LC Australian-English version. Table 1 summarises the themes and items of the COS-LC. The complete part 1 and 2 questionnaires are presented in the supplementary materials (Questionnaires 1 and 2).

Table 1 Themes and items of the COS-LC

Baseline characteristics

Data were available at the time of analysis for 1129 of 1130 participants from three of the five Australian ILST sites (two in Melbourne and one in Brisbane). One Melbourne participant did not complete the COS-LC questionnaires due to a language barrier. Only 11 participants (< 1%) identified as First Nations. The sample represented 54% (1129/2099) of Australian ILST participants. Centres were metropolitan hospitals; however, some participants attended from rural settings. The median cohort age was 63 years (IQR 59-69 years), with a lower median age in CAT1 compared to other groups. There was an overrepresentation of male participants (80%) in this study, with variability in distribution between categories. PLCOm2012 score was lowest in participants with CAT1 LDCT results. There were otherwise no significant differences in covariates between groups; demographics are presented in Table 2.

Table 2 Demographics

Psychometric analyses

Results of the psychometric analyses are presented in Table 3. The four core scales (anxiety, behaviour, dejection and sleep) all fit the Rasch model. There was some differential item functioning (DIF) with work status and negative impact on sleep, education level and focus on symptoms, smoking and harms of smoking, self-blame and empathy. The DIF disappeared when scales were modified to exclude the corresponding item.

Table 3 Conditional likelihood ratio (CLR) fit statistics and Cronbach’s alpha for the 16 multi-item domains of the Consequences Of Screening Lung Cancer (COS-LC) questionnaire

Baseline and 1-month responses

Mean scores (adjusted for age, sex, smoking status, pack-years, education, employment, PLCOm2012 score) of part 1 and part 2 are presented in Tables 4 and 5 respectively. Unadjusted mean scores are presented in the supplementary materials (Tables S1 and S2). There was no significant difference in any of the themes (anxiety, behavioural, sense of dejection, sleep, focus on symptoms, stigmatisation, introvert, harms of smoking, self-blame, keeping busy, and interest in sex) using the unmodified scales across categories and time.

Table 4 Adjusted mean scores and mean difference in scores from baseline and after T0 (baseline CT) results for COS-LC Part 1
Table 5 Adjusted mean scores from after screening results for COS-LC Part 2

There was a significant increase in scores on the modified sleep scale in the adjusted analysis for those in CAT2 and CAT5 following their T0 results. Part 2 did not demonstrate any significant difference between the categories in any theme except two. There was a significant difference in the unadjusted analysis for the relaxed/calm theme, with CAT3 having the highest mean score; however, there was no significant difference in the adjusted analysis. The adjusted analysis for empathy also demonstrated a significant difference between categories, with CAT2 having a lower mean score and CAT3, 4 and 5 having higher scores compared to CAT1.

Discussion

The Australian-English COS-LC questionnaire demonstrated high content and construct validity in an Australian lung cancer screening cohort. All core themes demonstrated excellent psychometric measurement properties, although four of the twelve lung cancer screening-specific themes demonstrated minor violations from the Rasch model. These violations were improved with modification of the items in each affected theme (sleep, focus on airway symptoms, self-blame, and empathy) as detailed in Table 3.

The early results from the Australian ILST demonstrate no major differences in HRQoL based on baseline CT results or over time from pre-screening to one month after initial LDCT results. Results of both the modified and unmodified scales were included for reference. Of note, in the DLCST, participants with FP results were most adversely affected from a psychosocial perspective [9]. There have been studies demonstrating negative psychosocial impacts of FP results in other cancer screening programs. In breast cancer screening, more negative psychosocial consequences were noted in women who received FP results, with one study reporting persistent psychosocial consequences 12 to 14 years after screening in women with FP mammograms [10, 24]. In colorectal cancer screening programs, one Danish study reported short-term and long-term psychosocial consequences of receiving a FP or diagnosis of polyps compared to a negative screening result using a condition-specific questionnaire [25]. There was no evidence of negative impacts from invitation to a colorectal cancer screening program [26].

The DLCST reported a significant increase in negative consequences in behaviour, dejection and sleep comparing round 1 with round 2 annual LDCT in the intervention and control groups [8]. However, this increase declined towards baseline in rounds 4 and 5 of annual LDCT on the behaviour and dejection scales. There was no similar trend in our Australian ILST cohort, though our reported follow-up period was much shorter at approximately four weeks. There was a trend to poorer sleep in CAT5 participants after T0 results. It should be noted that there were differences between the Australian ILST cohort and DLCST baseline characteristics beyond country, with a higher proportion of female, current smoker, and working participants in the DLCST [8]. The UKLS did report short-term results 2 weeks after the LDCT result, although they did not use a condition-specific HRQoL measure, and as such the scope of psychosocial consequences assessed was limited [13]. They reported only an increase in anxiety in those referred to a multidisciplinary meeting. By comparison, our CAT4 and 5 groups, who would have had further investigation including possible multidisciplinary meeting discussion, did not have a significant increase on the anxiety scale, with confidence intervals that crossed 0.

Limitations of our early analysis of the Australian ILST cohort include higher CAT1 participant numbers compared to the other categories. The small numbers in CAT4 and 5 groups may have resulted in underestimation of differences. Additionally, the one-month timepoint was very early in the screening pathway of these participants, and consequently likely does not capture the total impact of the participant’s screening journey. There were also very few First Nations participants in our cohort which may limit extrapolation of the COS-LC to this population.

We specified this version as Australian-English as the translation of COS-LC was finalised in Australia. However, there were no significant colloquialisms incorporated and this version of the COS-LC is likely adaptable to other English-speaking settings.

Lung cancer screening is an evolving field with multiple ongoing studies evaluating implementation and efficacy in different populations. All systematic reviews to date which have evaluated psychosocial impacts of lung cancer screening have concluded that the available evidence is limited in the context of the number of studies, study design, and generic outcome measures [3, 27, 28, 29]. A key part of future research will be better characterising the psychosocial impact of screening on participants and designing consumer information and healthcare provider training that can ameliorate any potential negative consequences. Some psychosocial impacts of screening may in fact be positive and beneficial to screening uptake and overall wellbeing. The COS-LC has been evaluated in two different international contexts and demonstrated high performance in measuring psychosocial consequences in both. While our version of the COS-LC was validated in Australia, it can be further tested and adapted to other settings. In future studies, condition-specific questionnaires, such as the COS-LC, should be used to enable adequate measurement of psychosocial impacts and more robust comparisons between different countries and participants to help inform the overall approach to screening.

Conclusions

The COS-LC questionnaire has been validated in Australian-English in an Australian lung cancer screening cohort, demonstrating high content validity and adequate psychometric measurement properties. The early results from the Australian ILST cohort found minimal psychosocial impacts in the short term using the COS-LC, a condition-specific questionnaire; however, longer-term outcomes for the whole Australian ILST cohort are awaited.