1 Introduction

Prior to the COVID-19 pandemic, the global aviation industry was experiencing sustained growth, whilst concurrently suffering from a significant shortage of qualified, experienced pilots [1]. At least part of this shortfall is believed to be due to insufficient training capacity to meet demand, and the prohibitive costs of flight training [1].

Once the industry has recovered from the effects of COVID-19, it is predicted that an additional 260,000 to 350,000 pilots will be needed over the next decade [1, 2]. Based on a 20-year fleet forecast for commercial aircraft, aircraft utilisation, attrition rates, and regional crewing differences, this equates to a requirement of up to 763,000 new pilots by 2039 [3].

It takes approximately 24 months to train a multi-engine, instrument-rated commercial pilot [4]. However, as many airlines now require pilots to have a university degree [5] in addition to their licences, ratings and hour requirements, it can now take up to 48 months before a newly qualified pilot graduates from flight training. Even then, newly qualified pilots, who have only the minimum hours required to hold a licence, may find it difficult to gain employment without first gaining experience.

Pilots train in both real aircraft and using simulation. Traditional flight simulators are typically expensive (US$10K to US$10M) and require specialist simulator staff and flight instructors to assist in their operation [6, 7]. However, extended reality (xR) [including virtual reality (VR) and mixed reality (MR)] is increasingly of interest to the aviation industry for its potential in flight training [8]. xR-based simulators generally do not require specialists to operate and offer student pilots the opportunity to complete at least part of their training in a self-paced, learner-centric, collaborative environment, where autonomy and individualised training are encouraged [9].

By implementing xR technology in the form of virtual reality flight simulators (VRFSs), time and cost reductions could be made by substituting actual aircraft flying hours with relatively affordable simulator training [10, 11]. VRFSs may also have the ability to reduce the scale of operations, specifically in areas such as airport and airways charges, maintenance provision and the quantity of fuel (and other carbon-based materials) consumed, thus simultaneously lowering the costs and the negative environmental impact of aviation training [11,12,13,14]. These latter potential advantages would also bring operators in line with environmental initiatives such as ICAO’s Carbon Offsetting and Reduction Scheme for International Aviation (CORSIA) [15].

Better use of training time allocations, and fewer delays or cancellations due to weather or the unavailability of aircraft for maintenance, could reduce the time required to train pilots. A reduction in airframe hours and fuel usage, less maintenance and a marked reduction in the associated ancillary logistic costs might also lead to additional financial savings. A recent cost–benefit analysis [11] found that the US Air Force (USAF) alone could save tens of billions of US dollars in fixed and variable operating costs over the next decade by implementing xR flight training in the form of the Pilot Training Next (PTN) initiative. It is suggested, then, that in the civil sector potential operational savings such as these could be passed on to the consumer in the form of reduced training fees. This in turn could increase the attractiveness of a career in aviation to a broader base of potential student pilots, for whom the cost of flight training is at present prohibitively high [16, 17].

In the current scoping review (ScR), we seek to systematically collate the current evidence on whether xR technology has the potential to be used as an adjunct to the flight training methods already employed by civil flight training organisations (FTOs) worldwide. Our research question (RQ), therefore, is ‘What is the nature of the evidence to support the use of extended reality (xR) flight simulators in traditional flight training methods?’ The objectives of the current ScR were to: (1) identify all relevant literature to assist in answering the RQ, and (2) establish the current body of research and what further research might be required.

This review may help researchers interested in the use of xR in aviation training to understand the current state of the evidence, and the methods that have provided it so far, and to decide the direction in which future research might proceed.

2 Method

Systematic searches of the academic databases Web of Science, Scopus and Google Scholar were performed via the bibliographic platform EndNote to identify journal articles, conference proceedings, reports and unpublished grey literature relating to the use of xR in aviation; only sources published in English were considered. The following Boolean search terms were used: (1) “flight sim*” AND “virtual reality” OR “mixed reality” OR “extended reality”; (2) “flight training” AND “virtual reality” OR “mixed reality” OR “extended reality”; (3) “pilot training” AND “virtual reality” OR “mixed reality” OR “extended reality” and (4) “aviation training” AND “virtual reality” OR “mixed reality” OR “extended reality.”
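The four search strings share one pattern: a topic term combined with the same three xR terms. The text does not state how AND and OR were grouped, so the minimal sketch below assumes the OR-ed xR terms were intended as a single parenthesised group. The names `TOPIC_TERMS`, `XR_TERMS` and `build_query` are illustrative and do not come from the review.

```python
# Illustrative sketch only: the review lists the terms, not the grouping.
# Assumption: the three xR terms form one OR group combined with each topic.
TOPIC_TERMS = [
    '"flight sim*"',
    '"flight training"',
    '"pilot training"',
    '"aviation training"',
]
XR_TERMS = ['"virtual reality"', '"mixed reality"', '"extended reality"']


def build_query(topic: str, xr_terms: list[str]) -> str:
    """Combine one topic term with the OR-ed xR terms, parenthesised explicitly."""
    return f"{topic} AND ({' OR '.join(xr_terms)})"


QUERIES = [build_query(topic, XR_TERMS) for topic in TOPIC_TERMS]

for number, query in enumerate(QUERIES, start=1):
    print(f"({number}) {query}")
```

Making the grouping explicit matters because databases can differ in how they resolve an unparenthesised mix of AND and OR, which would change the set of records returned.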

Applying the Joanna Briggs Institute (JBI) population, concept, context (PCC) protocol [18], the inclusion criteria were that the study population should be pilots of an aircraft (qualified or in training), that the aircraft type may be fixed-wing, rotary-wing or UAV, that the study should employ xR (VR or MR) flight simulation technology, and that the research should relate to flight training.
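The inclusion criteria above can be read as a conjunction of four conditions. The sketch below expresses that reading as a simple predicate; the field names and category labels are hypothetical and are not the coding scheme used by the authors.

```python
from dataclasses import dataclass

# Hypothetical category labels, chosen only to mirror the criteria in the text.
PILOT_POPULATIONS = {"qualified pilot", "pilot in training"}
AIRCRAFT_TYPES = {"fixed-wing", "rotary-wing", "UAV"}
XR_TECHNOLOGIES = {"VR", "MR"}


@dataclass
class Study:
    population: str
    aircraft_type: str
    technology: str
    relates_to_flight_training: bool


def meets_inclusion_criteria(study: Study) -> bool:
    """Return True only if all four PCC-derived conditions hold."""
    return (
        study.population in PILOT_POPULATIONS
        and study.aircraft_type in AIRCRAFT_TYPES
        and study.technology in XR_TECHNOLOGIES
        and study.relates_to_flight_training
    )


# Two made-up records, used purely to exercise the predicate.
included = Study("pilot in training", "fixed-wing", "VR", True)
excluded = Study("pilot in training", "fixed-wing", "desktop simulator", True)
print(meets_inclusion_criteria(included), meets_inclusion_criteria(excluded))
```

In practice, screening decisions of this kind were made by the two reviewers reading each source, so the predicate is only a compact restatement of the criteria, not a description of an automated process.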

Two preliminary screening steps were taken. For Google Scholar search returns only, due to the number of results envisaged to be returned, and as recommended in previous studies [19,20,21], only the first 150 results were considered for inclusion. Following this, a title-only screening was carried out on all results returned from all sources, to identify returns that were obviously not relevant.

One author (GR) completed the initial database searches, and both authors decided which of the initial search returns satisfied the inclusion/exclusion criteria. The initial review of all studies was undertaken first by GR and then by AG. Any disagreements were to be settled by discussion, or, where agreement could not be reached, a third party independent of the study was to be asked to arbitrate.

Following the full-text review, a data extraction table was populated according to suggestions made in the JBI Reviewer’s Manual [18, 22]. This included data relating to author(s), year of publication, title, publication source, country of origin, aims/purpose, population, method, concept, conclusions and key findings (see Tables 1, 2).

Table 1 Data extraction table—publication details
Table 2 Data extraction table—process and findings

3 Results

Initial searches returned a total of n = 871 potential studies of interest. After the removal of n = 104 duplicates, screening by title led to a further n = 584 studies being omitted as they were not relevant. The abstracts of the remaining n = 183 were screened, after which a further n = 121 studies were excluded as being not relevant, leaving n = 62 studies identified as of potential or probable interest. Following the full-text screening, a further n = 37 studies were discounted for reasons such as the study not being relevant, the full text not being accessible, or the study not being available in English. At this point, two further studies were identified and added following a review of the references, meaning n = 27 studies were subjected to the final in-depth full-text screening. During this final stage, a further n = 9 studies were excluded because they were not relevant, did not specify participant numbers, were found to be duplicate studies published under a different title, or were not empirical. The n = 18 remaining studies were selected for inclusion in the scoping review. No additional studies were acquired as a result of the personal or public requests made to authors, ResearchGate or LinkedIn.
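The screening counts reported above can be checked with a short running tally. The figures in this sketch are taken directly from the text; the stage labels are paraphrased and the function name is illustrative.

```python
# Counts come from the Results text; positive values add studies,
# negative values remove them. Stage labels are paraphrased.
STAGES = [
    ("initial search returns", +871),
    ("duplicates removed", -104),
    ("excluded at title screening", -584),
    ("excluded at abstract screening", -121),
    ("excluded at full-text screening", -37),
    ("added from reference lists", +2),
    ("excluded at final in-depth screening", -9),
]


def running_totals(stages):
    """Return (label, studies remaining) after each screening stage."""
    total = 0
    totals = []
    for label, change in stages:
        total += change
        totals.append((label, total))
    return totals


TOTALS = running_totals(STAGES)
for label, remaining in TOTALS:
    print(f"{label:<40} {remaining:>4}")
```

The intermediate totals reproduce the figures given in the text (183 after title screening, 62 after abstract screening, 27 entering the final stage) and end at the 18 included studies.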

Figure 1 illustrates the stages of the literature search and details the number of studies screened, included and excluded at each stage.

Fig. 1 Literature search stages: detailing numbers of sources of evidence screened and included in the review, with reasons for exclusions at each stage

4 Characteristics of individual studies

Analysis of the characteristics of individual studies was undertaken by both authors, with any disagreements resolved by discussion. Conference proceedings accounted for n = 8 (44%) of the included studies, with only n = 2 (11%) published as peer-reviewed journal articles. All n = 18 of the studies were published since 2018. Of the included studies, n = 11 (61%) of the lead authors were based in the USA, n = 3 (17%) in Canada, n = 1 (5.5%) in Belgium, n = 1 (5.5%) in Germany, n = 1 (5.5%) in Poland and n = 1 (5.5%) in Sweden. In total, n = 13 (72%) of the studies used quantitative methods: n = 7 (54%) quasi-experimental, n = 3 (23%) correlational, n = 2 (15%) descriptive and n = 1 (8%) experimental. Ten of the studies used qualitative methods to gather results, with n = 7 (70%) using questionnaires, n = 2 (20%) observations and n = 1 (10%) interviews.

The majority of the included studies, n = 11 (61%), focussed on civilian flight training, with the remaining n = 7 (39%) focussing on military flight training, although interestingly, n = 1,099 (76%) participants were military and n = 339 (24%) were civilian. Of the n = 1,438 participants in the included studies, n = 205 (14%) were qualified pilots, who held a cross-section of licences from PPL to ATPL (or the military equivalent), n = 1,048 (73%) were student pilots undergoing formal flight training and n = 185 (13%) were unqualified non-pilot participants under instruction.
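As a quick consistency check on the participant figures, the percentages can be recomputed from the reported counts. The counts below are those given in the text; the dictionary keys and the helper name `shares` are illustrative.

```python
# Participant counts as reported in the text.
PARTICIPANTS_BY_SECTOR = {"military": 1099, "civilian": 339}
PARTICIPANTS_BY_STATUS = {
    "qualified pilots": 205,
    "student pilots": 1048,
    "non-pilots under instruction": 185,
}


def shares(counts):
    """Return each category's share of the total, as a whole-number percentage."""
    total = sum(counts.values())
    return {label: round(100 * n / total) for label, n in counts.items()}


SECTOR_SHARES = shares(PARTICIPANTS_BY_SECTOR)
STATUS_SHARES = shares(PARTICIPANTS_BY_STATUS)

print(sum(PARTICIPANTS_BY_SECTOR.values()), SECTOR_SHARES)
print(sum(PARTICIPANTS_BY_STATUS.values()), STATUS_SHARES)
```

Both breakdowns sum to the same 1,438 participants, and the rounded shares agree with the percentages quoted in the paragraph.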

A summary of the conclusions and key findings of the included studies (see Table 2) showed that n = 2 studies (one descriptive, one observational) explicitly reported that xR could offer some degree of cost savings in pilot training and n = 4 (two observational and two descriptive) stated that pilot training time could potentially be reduced. Thirteen studies (four quasi-experimental, three qualitative questionnaires, two observational, two descriptive, one correlational and one experimental) reported positive learning experiences, and n = 5 (two correlational, one observational, one quasi-experimental and one descriptive) stated that xR was potentially as good as legacy methods. Although xR could potentially reduce the actual number of flying hours required, it should be remembered that there will be a finite limit to this number, as flying a real aircraft is, presently, still a regulatory requirement to gain a pilot licence. In the future, aviation governing authorities may amend this and reduce the number of flight hours required.

Some form of negative training experience (e.g. functionality issues or simulator sickness) was reported in n = 2 quasi-experimental studies, whilst n = 3 quasi-experimental studies did not support the idea that xR flight training was better than the alternative legacy option. Distraction, stress and high cognitive load whilst in xR were reported in n = 3 studies (two quasi-experimental and one qualitative questionnaire), although this might not necessarily be considered a negative outcome, with distraction mitigation and desensitisation being options to counter this.

5 Discussion

The scoping review identified 18 studies of interest published between 2018 and 2021 that met all inclusion criteria. Of these, the majority had been presented at conferences and subsequently published as conference proceedings, rather than as journal articles. This may have been an artefact of the speed at which technology is advancing in this area (i.e. quickly rendering technology and study data obsolete), making the delays sometimes associated with publishing in academic journals seem less attractive [37, 38]. However, it might also be that some of the studies lacked academic rigour and may not stand up to the stringent peer-review process of high-level journal publication [37]. Another possibility is that the authors are continuing their work in this area, including the incorporation of conference feedback, and have submitted, or intend to submit, the finished study to an academic journal for publication in the future [39]. Nevertheless, it is unlikely that these papers will have been subjected to the level of scrutiny normally associated with journal publication, and for this reason their findings should be treated with caution. That said, the predominance of conference proceedings is itself an important finding of the current scoping review.

Most of the included studies were conducted by researchers based in the United States, with 7 of the 18 studies, and most of the participants, originating from the US military. This implies that the US military is currently investing significant time and resources into researching the potential benefits of xR for pilot training and is also reporting optimistic findings through some of its initiatives (e.g. Pilot Training Next). A potential explanation for this is that few, if any, civilian aviation organisations (e.g. flight schools) have the resources and financial capability to match the US military; in contrast, therefore, existing research from these sources appears to be limited in scale and number. In addition, few studies from the civil sector focussed solely on flight training, instead seeming to concentrate on nuanced areas of training such as learning, human–machine interaction and other human factors.

The majority of both the quantitative and qualitative evidence included in this review supports the potential of xR to be used as an instructional aid in the training of pilots who are currently using traditional flight training methods. The studies suggest that participants who undergo xR flight training perform at least as well as those trained using traditional means [11, 24, 25]. There is also evidence to suggest that xR flight training, in concert with traditional flight training, could decrease training time [11, 23, 29, 31], reduce costs [11, 23] and lower the environmental impact of flight training overall [11, 13].

Some studies noted that xR flight training has some noteworthy limitations. VR, in particular, can result in breaks in presence (BIP) [40, 41] (i.e. feeling disconnected) between the participant and the physical world they occupy (e.g. a VR flight simulator with physical cockpit controls). This BIP can result in restrictions in operation, particularly when the pilot is required to engage with particular aspects of the physical cockpit environment/equipment (e.g. buttons, dials and switches) [12, 36, 41]. MR, which combines images of the physical environment and the virtual environment (VE), has the potential to overcome this limitation as it develops [25, 42]. It is also reasonably common to feel the effects of simulator sickness or cybersickness [43] during xR exposure, particularly if a participant is susceptible to motion sickness or new to xR [12, 36]; symptoms include nausea, dizziness, disorientation and headaches, and are related to classical motion sickness. This can often, but not always, be overcome with continued exposure over time, which may lead to the participant becoming desensitised [43]. It is worth mentioning that conventional simulators as well as real aircraft are also known to cause motion sickness [10].

6 Limitations

At least three potential limitations were identified during this scoping review. First, during the search and study identification phase, it is possible that not all of the studies relevant to answering the research question were successfully identified. Second, it is also possible that studies exist that have not, for various reasons (e.g. publication bias), been made publicly available. Third, due to financial constraints, only studies published in English were considered, which prevented the potential inclusion of six of the identified studies.

7 Suggestions for future research

As is evident from the paucity of eligible studies focussing on the use of xR technology and/or VRFSs in flight training (particularly in the civil sector), there is scope for more research to be undertaken. A greater amount of data, and a better level of understanding of a topic area such as this, will assist those responsible for deciding what future iterations of flight training will entail. Specifically, to assist in achieving this goal, it is suggested that more research should be undertaken in areas such as BIP/disconnect, where the potential of MR to reduce these effects could be further investigated. A comparative study involving civil FTOs could be conducted to assess differences between student pilot groups, some of whom used VRFSs in their training. In concert with a comparative study, a cost–benefit analysis could also be undertaken to compare the costs of differing flight training approaches (i.e. traditional versus VRFS) to gather data that might support the suggestions of cost-saving made in other studies.

8 Conclusion of scoping review

The purpose of this scoping review was to (1) answer the research question, (2) provide an overview of the topic and (3) determine the value of conducting a full systematic review.

Despite the aviation industry’s stated aims of limiting its environmental impact through initiatives such as CORSIA, reducing training time and cost, and increasing pilot throughput, only 18 studies that satisfied the inclusion criteria were identified. This number is low when considering the technological potential, level of innovation and the investment currently being seen in xR technology. From the limited number of studies available, however, there was some evidence to suggest that xR flight simulators could successfully be used in support of traditional flight training.

Overall, this scoping review determined that there is a lack of empirical, peer-reviewed studies in the area of xR flight training. Considering this finding, further investigations into whether xR can satisfactorily deliver an enhancement to traditional flight training methods are required. This is something that could be addressed by academic institutes, particularly those with close ties to commercial aviation. To redress the imbalance that currently exists with military studies, research of an empirical nature involving civil FTOs should be undertaken as a priority, so that the results can be better applied to airline and civil aviation. It is suggested that a longitudinal study involving both instructors and students at a large FTO would in part fulfil this requirement and potentially provide data that could also apply to other commercial FTOs.

It is concluded that, at this moment, not enough high-quality empirical evidence exists to warrant conducting a full systematic review.