Introduction

3D digital data have been rapidly incorporated in archaeological and anthropological fields, and they are providing new and unlimited access to fragile and valuable remains. For instance, 3D digitisations can be used for facial reconstructions such as that of Robert the Bruce (Wilkinson et al. 2017), for the creation of interactive virtual displays such as the Gebelein Man at the British Museum (Ynnerman et al. 2016), or can be 3D printed to reveal otherwise hidden trauma, as with the Jericho skull (Hirst 2017). Additionally, 3D digitisations allow for more advanced statistical analysis of biological shapes, and the most commonly employed method in anthropology and archaeology is 3D geometric morphometrics (GMM), an analytical method of quantifying and comparing 3D objects. GMM methods have been advancing over the last decade, for example through semi-landmarks (Gunz and Mitteroecker 2013) and finite element analysis, which predicts how a shape may respond to external forces (O’Higgins et al. 2011). These technological and statistical advancements have expanded this field, to the point where the 3D digitisation of archaeological material, including human remains, is a frequent occurrence, with many universities owning multiple 3D scanners as well as photogrammetry equipment which can be used to create 3D digital data. The proliferation of digitisation methods and equipment has resulted in the exponential growth of GMM analysis in human remains research. GMM studies of human remains include the analysis of biological and cultural variation, evolution, and adaptation (Archer and Braun 2010; Buchanan and O’Brien 2014; Cardillo 2010; Lawing and Polly 2010; Perez 2007). For instance, by utilising GMM analysis researchers have developed a more advanced understanding of sexual dimorphism and age-related morphological alterations in adult human skeletons (Bigoni et al. 2010; Franklin 2010; Gonzalez et al. 2009; Janin 2017; Viðarsdóttir et al. 2002). The empirical data that GMM provides increase the precision and robustness with which biological profiles of skeletal remains can be made.

Despite this, 3D digital data and GMM are still considered by some researchers as a relatively recent or new field (Adams et al. 2004; Marcus and Corti 1996; Rohlf 1999, 2002; Waddell 1996). London air raids in 1940–1941 severely damaged the Natural History Museum and Royal College of Surgeons, and many specimens were damaged or destroyed (Fforde 1992; Natural History Museum 2015). Even in everyday data collection, there is always the risk of damage; repeated handling, even where all measures of care are taken, results in an accumulation of damage to the material over time (Fletcher et al. 2014; Palmer 2015). For instance, a study conducted by Bowron (2001, 2003) analysed taphonomic damage from six skeletal collections and determined that the two factors which most significantly affected the preservation of human remains in a collection were handling and packaging. Online stores and published skeletal recording sheets have been promoted, as they can be used by multiple researchers, and limit the handling of remains and subsequent damage to material (Fletcher et al. 2014; Gröning et al. 2005; Palmer 2015; Pelfer and Pelfer 2003). In addition, to further preserve skeletal material it would be beneficial, where possible, for researchers to utilise existing 3D scans rather than subjecting the remains to repeated scanning and/or handling (Wilson 2014–16). It would therefore be valuable for details of the scans and scanning process to be compiled between the original researcher and the curating institution, and stored in a format that makes these data available for future research, while acknowledging the contribution of the original collector.

In the last 2 years, the ownership and use of digital data have been discussed (Decker and Ford 2017; Márques-Grant and Errickson 2017; Niven and Richards 2017). However, more research is needed to determine the potential of sharing these digital collections. Furthermore, researchers may frequently collect data from archaeological and anthropological collections internationally, and as such it is argued that international agreements regarding the storage of 3D digital data are required.

Special Considerations: Repatriation and Culturally Sensitive Remains

The ascribed cultural and religious views of human remains and digital or physical replicas of human remains vary significantly. As such it is important to consider potential variation in ethical issues of collecting, storing, sharing, and displaying 3D digital data, as there may be cases where it is not ethical to keep digital copies of human remains. A common argument for scanning human remains is to allow repatriation requests to be honoured, while still keeping a digital copy of these data to be utilised for further research (Mathys et al. 2013; Rowe et al. 2002; Schurmans et al. 2002). Nevertheless, no research could be found in which the organisations and communities actively making repatriation requests were consulted with regard to their views on retaining 3D scans of human remains after repatriation. For instance, a study conducted by Henson (2015) evaluated the benefit of different 3D scanning technologies to preserve osteological data before repatriation. For their study, human remains from Clover and Fort Ancient in West Virginia, USA, were scanned using different methods prior to their repatriation. The paper failed to specify if the work was conducted in collaboration with NAGPRA (NAGPRA 1990; National Park Service 2018) or if ethical approval was given or sought. While it may well be the case that the author received approval from NAGPRA and/or the descendant communities of this site, it would have been pertinent to detail this agreement and how these 3D digital data were to be used or stored in the future. Instead, this study focused on demonstrating that it was possible to create a high-resolution model of NAGPRA-protected Native American remains in the time period available for archaeological research prior to repatriation, and ignored the arguably more pertinent question of whether these digital data should be created and stored.

Previous instances where archaeological material has been digitised and printed without approval of the group requesting repatriation have resulted in controversy. In another case, the 3D scanning and printing of seven columns from the Old Summer Palace in Beijing by artist Oliver Laric has caused controversy. The columns are currently held in Norway, although it has been agreed that they will be repatriated to China. As such, the production of these 3D prints has been argued as an attempt to steal cultural heritage material (Mendoza 2014). These case studies demonstrate the disparity between the differing views of researchers and the groups advocating for the repatriation of archaeological material. Given these controversies alongside the current dearth of ethical consideration of digitising repatriated or culturally sensitive remains, promoting the scanning of such material without the discussion of approval or detailing the future use of these digital data may lead to actions in the future that could damage the integrity of this discipline.

Despite the paucity of research, there are some localised protocols currently in existence. The National Museums of Liverpool recognise the cultural sensitivity of some items, including photographs and other depictions of human remains, and that some forms of analysis, such as photography and X-rays, may not be appropriate due to issues of cultural significance (National Museums Liverpool 2017). Similarly, the Museums Galleries of Scotland suggest that, after repatriation requests have been accepted, decisions regarding the treatment of the remains, such as photography, are the responsibility of the group or individuals requesting the repatriation (Museums Galleries Scotland 2017). It is clear that the potential repatriation of digital data is a concern among researchers, with Weiss (2001) noting the potential negative effect of digital data or casts coming under repatriation acts, due to the catastrophic loss of data.

Additionally, among some cultures distinctions are not made between original sacred or culturally affiliated objects, and replicas or even photographs (Brown and Nicholas 2012; Isaac 2015). Furthermore, in cases of repatriation of cultural items, digital scans have been used instead of the original item (Hess et al. 2009; Resta et al. 2001), indicating a stronger relationship or at least a blurred line between digital data and the physical material. It is therefore argued that further research is necessary to determine the ethical responsibilities when storing or collecting digital copies of human remains or other culturally sensitive items. This is a discussion that needs to happen between archaeological institutes and the communities and organisations who are requesting the repatriation of human remains. It is suggested that the relevant communities/organisations should be consulted prior to the scanning and/or sharing of culturally sensitive material. Keeping digital versions of human remains without the knowledge of these organisations is arguably unethical, and unhelpful for future collaboration between human remains researchers and indigenous communities.

Standardisation of Methods

As previously stated, one of the key advantages of 3D digitisation is that once a skeletal element has been scanned, the digitisation can be used by numerous researchers in a variety of studies. While there are important ethical and legal considerations for this practice, as discussed, the digitisation of skeletal remains opens the possibility of a worldwide dataset, promoting a more holistic/global approach to digital data which will maximise the availability of resources, preserve the original material, and foster greater collaboration among researchers (see White et al., this issue). There are already a number of different institutions that are compiling extensive 3D scanned data collections of human and animal remains (e.g. Smithsonian 2016). However, in many cases within human remains research, the value of pre-existing databases of published papers has been limited by the lack of standardisation in data collection and the data format; as such, standardising these digital databases may increase their potential for future research and global collaborations. Two components of method standardisation shall be discussed in this paper: digitisation methods and landmark placement for GMM analysis in terms of error assessment.

Data Collection Methods

There are several methods which produce 3D models such as photogrammetry, structured light scanning, laser scanning, computed tomography (CT) scanning, and magnetic resonance imaging (MRI). While these methods all produce a 3D digitisation, the quality and resolution of the data varies across the different technologies and methodologies. Although several papers have discussed the variation in accuracy and resolution for different scanning technologies (see White et al., this issue), this has not yet led to the introduction of standardised digitisation methods (Boehler et al. 2003).

There are many factors that influence the quality of the 3D data. The Next Engine Desktop laser scanner, for instance, has several different settings which relate to the object distance and size and influence the quality of the scan, and may influence the reliability of consistently placing GMM landmarks (Kuzminsky and Gardiner 2012; Slizewski et al. 2010; Zaimovic-Uzunovic and Lemes 2010). The practical implication of differences in scanning accuracy will also depend on the purpose of the research and nature of the material involved. For instance, Villa et al. (2017) compared the accuracy of 3D models generated from three different laser scanners and software, finding no significant variation in the topography of the bone surface between different scanners, although each scanner introduced random error which influenced curvature values. While the choice of digitiser may be dictated by its availability in an institution or by research grants provided, the information regarding the technology and process involved in creating a digitisation, if provided, can also act to increase the potential of reuse for these collections. Therefore, it is recommended that, when publishing digital collections, researchers include details of the digitisation material and method, making the process easily reproducible and enabling valuable comparisons between digital data.

In addition, researchers will frequently record other data from the human remains with 3D scanning, including age, sex, and stature estimation, and the presence of pathology. If these data are not available in addition to 3D scans, then the benefit to both researchers and curators is reduced. Therefore, when possible, individual researchers and curators should consider the future value of their scans in a wider context, not only to themselves but also to the greater scientific community. By creating a standard for 3D data collection, regarding the quality and completeness of the scan, as well as the secondary data obtained for each specimen, the value of this method as a reproducible multi-user dataset can be conserved.
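
As an illustration of the kind of record discussed above, the following is a minimal sketch of a metadata entry that could accompany a scan. It is not an existing standard: the class name `ScanRecord`, every field name, and the helper functions are hypothetical, chosen only to show digitisation details and secondary data stored together in a shareable format.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Optional


@dataclass
class ScanRecord:
    """Hypothetical metadata record stored alongside one 3D scan."""
    specimen_id: str
    skeletal_element: str
    digitisation_method: str              # e.g. "laser scanning", "photogrammetry", "CT"
    device: str
    device_settings: dict = field(default_factory=dict)
    software: Optional[str] = None
    # Secondary data recorded from the remains at the time of scanning.
    age_estimate: Optional[str] = None
    sex_estimate: Optional[str] = None
    stature_cm: Optional[float] = None
    pathology: list = field(default_factory=list)
    # Provenance, so that the original collector can be acknowledged.
    collected_by: Optional[str] = None
    curating_institution: Optional[str] = None


def to_json(record: ScanRecord) -> str:
    """Serialise the record to JSON so it can be archived with the scan file."""
    return json.dumps(asdict(record), indent=2, sort_keys=True)


def missing_fields(record: ScanRecord) -> list:
    """List fields left empty, as a simple completeness check before sharing."""
    empty_values = (None, "", [], {})
    return [name for name, value in asdict(record).items() if value in empty_values]


# Illustrative values only; they do not describe a real specimen.
example = ScanRecord(
    specimen_id="EXAMPLE-001",
    skeletal_element="cranium",
    digitisation_method="laser scanning",
    device="desktop laser scanner",
    device_settings={"mode": "macro", "divisions": 8},
    sex_estimate="indeterminate",
)
print(to_json(example))
print("Fields still to complete:", missing_fields(example))
```

A completeness check of this kind is one simple way for a curating institution to see at a glance which secondary data are absent from a shared scan.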

Placing Landmarks

Many different methods can be used to analyse 3D digital data. The method most frequently employed in human remains studies is geometric morphometrics (GMM), which typically involves the use of landmarks and/or semi-landmarks. GMM studies use these points to record the morphology of anatomical features in order to quantify biological shapes, and the 3D coordinate data can easily be exported for analysis. While this is a very valuable method that can be easily reproduced, there are some potential issues concerning the standardisation of the landmarks utilised and the descriptions of these landmarks. When determining the landmarks for a study, several factors need to be considered, including the preservation of the sample and the biological questions being asked. However, even among studies which are asking the same biological question of the same skeletal element, there is considerable variation in the number and position of landmarks which are being utilised (Bigoni et al. 2010; Kimmerle et al. 2008). The lack of standardisation in the landmarks employed by GMM studies has created issues when attempting to make comparisons between studies.

GMM is no longer a fledgling field within archaeology and now is the time to discuss ways to standardise analysis. There are many examples in archaeological methods where standardisation of the field has occurred late in its development, which has significantly limited the value of earlier studies (Florian 1990; Musonda 1990; Oonk et al. 2009; Pajas and Olivam 2009). For instance, methods for recording dental caries have varied significantly between studies, limiting the ability to compare results across studies (Cox and Mays 2000; Hillson 2001; Whittaker and Molleson 1996).

This is not to say that the validity of landmarks should be compromised in the aim of standardisation; instead, it is suggested here that efforts should be made to standardise the points used when designing the GMM methodology. Not only would the standardisation of landmarks increase the possibility for direct comparison of results between studies, but researchers could also publish their raw landmark coordinate data online as a dataset, allowing other researchers to directly incorporate these data into their own analysis. It is suggested that such raw coordinate data would not be subject to the same ethical and legal considerations as 3D scans, as these data are considered sufficiently different from the raw material and instead are more similar to measurements or the scoring of biological features, such as those used in sex estimation.

Assessing Error

There is, however, a lack of standardised methods for assessing observer error in GMM research, which limits the ability to compare methods and studies (Fields et al. 1995). This error may be introduced during the digitisation process, or in the extraction of coordinate data such as landmarks for analysis. When considering the error introduced during digitisation, some research indicates that methods such as photogrammetry (Weinberg 2006), as well as 3D digitisers and laser scanning (Sholts et al. 2011), introduce non-significant levels of observer error, although these are potentially higher than those found in traditional methods, due to the nature of the process (Hildebolt and Vannier 1988). There are also indications that observer error may be affected by experience with the equipment (Sholts et al. 2011), although this would also be expected with more traditional methods.

Measurement error is inevitable regardless of the method used, due to human error as well as issues associated with the measuring equipment (Barker et al. 1994; Choi et al. 2002). Assessment of observer error is difficult in 3D GMM, as it requires the direct comparison of 3D data that exist in different coordinate systems (Richtsmeier et al. 2002; von Cramon-Taubadel et al. 2007). Typically, data in the form of 3D landmarks are registered using processes such as generalised Procrustes analysis (GPA) (Gower 1975), although this does not make variability due to observer error directly quantifiable (Richtsmeier et al. 2002). Despite this, several GMM studies have analysed error after processing landmark data with GPA and principal component analysis (PCA) (for example: Franklin et al. 2006, 2007; Kranioti et al. 2009; Lockwood et al. 2002; Terhune et al. 2007). For instance, a study conducted by Franklin et al. (2006) assessed intra-observer error by placing landmarks six times, aligning the configurations through GPA, and performing PCA. It was determined, based on the clustering of repeat configurations for principal components 1–5, that intra-observer error was “unlikely to have unduly influenced the results” (Franklin et al. 2006, p. 16). A similar analysis was conducted by Kranioti et al. (2009), where intra-observer error was judged to be acceptably low because the distance between repeats was smaller than the distance between individual data points. However, GPA registration results in error being spread across the configuration, a phenomenon referred to as the Pinocchio effect; as such, these methods are not suitable for assessing error (Chapman 1990; von Cramon-Taubadel et al. 2007; Zelditch et al. 2012).
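
The Pinocchio effect described above can be made concrete with a small numerical sketch. The example below is a simplification under stated assumptions: it uses a pairwise ordinary Procrustes superimposition of two configurations rather than a full GPA over a sample, and the landmark coordinates and the function names `centre_and_scale` and `superimpose` are hypothetical. A placement error is simulated at one landmark only, yet after superimposition the other landmarks also show residuals.

```python
import numpy as np


def centre_and_scale(config):
    """Centre a (k, 3) landmark array on its centroid and scale to unit centroid size."""
    centred = config - config.mean(axis=0)
    return centred / np.linalg.norm(centred)


def superimpose(reference, target):
    """Ordinary Procrustes fit of `target` onto `reference` (both (k, 3) arrays)."""
    ref = centre_and_scale(np.asarray(reference, dtype=float))
    tgt = centre_and_scale(np.asarray(target, dtype=float))
    # Optimal rotation from the SVD of the cross-covariance matrix.
    u, _, vt = np.linalg.svd(tgt.T @ ref)
    # Flip the last axis if needed so the result is a rotation, not a reflection.
    sign = 1.0 if np.linalg.det(u @ vt) >= 0 else -1.0
    rotation = u @ np.diag([1.0, 1.0, sign]) @ vt
    return ref, tgt @ rotation


# Hypothetical configuration of six landmarks (arbitrary units).
original = np.array([
    [0.0, 0.0, 0.0], [10.0, 0.0, 0.0], [0.0, 10.0, 0.0],
    [0.0, 0.0, 10.0], [10.0, 10.0, 0.0], [10.0, 0.0, 10.0],
])
repeat = original.copy()
repeat[0] += [2.0, 2.0, 2.0]   # simulated placement error at landmark 0 only

ref, fitted = superimpose(original, repeat)
residuals = np.linalg.norm(ref - fitted, axis=1)
for index, value in enumerate(residuals):
    print(f"landmark {index}: residual {value:.4f}")
```

Because centring, scaling, and rotation are all computed from the whole configuration, the error at the single displaced landmark shifts the fitted positions of the remaining landmarks, which is why per-landmark residuals after superimposition cannot be read as per-landmark observer error.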

Researchers have developed a wide range of methods to quantify and assess levels of observer error. These include plotting the results of PCA to visually assess how tightly clustered repeat data points are (Dryden and Mardia 1998; O’Higgins and Jones 1998), the comparison of intra-individual distances to inter-individual distances (Lockwood et al. 2002), analyses of variance (ANOVA) (Freidline et al. 2015; Nicholson and Harvati 2006; Ross and Williams 2008), calculation of intra-class correlation coefficients (Fourie et al. 2011; Weinberg 2006), and technical error of measurement values (Weinberg 2006). Finally, at present there is no standard of acceptable observer error (von Cramon-Taubadel et al. 2007). Different thresholds have been suggested, ranging from 0.5 mm or less (Guyomarc’h et al. 2012) to 1 mm or less (Weinberg 2006); however, as error must be considered in terms of relative significance of effect on the results, it is unlikely that an acceptable standardised error threshold will be established. Further discussion as to how to determine acceptable error is still sorely needed, even if this may not be applicable to all studies.
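
Two of the simpler measures mentioned above can be sketched in a few lines. The example assumes two repeat trials recorded on the same scan, so that both sets of coordinates share one coordinate system, and it uses the common two-trial form of the technical error of measurement, TEM = sqrt(sum(d^2) / 2n). The function names, the example values, and the use of the 1 mm figure as a default threshold are illustrative assumptions, not a recommended standard.

```python
import math


def technical_error_of_measurement(first, second):
    """TEM for two repeated series of a measurement: sqrt(sum(d^2) / (2 * n))."""
    if len(first) != len(second) or not first:
        raise ValueError("need two non-empty series of equal length")
    squared = sum((a - b) ** 2 for a, b in zip(first, second))
    return math.sqrt(squared / (2 * len(first)))


def landmark_repeat_distances(trial_a, trial_b):
    """Euclidean distance between repeat placements of each landmark (input units)."""
    if len(trial_a) != len(trial_b):
        raise ValueError("trials must contain the same landmarks")
    return [math.dist(p, q) for p, q in zip(trial_a, trial_b)]


def landmarks_over_threshold(distances, threshold_mm=1.0):
    """Indices of landmarks whose repeat distance exceeds the chosen threshold."""
    return [i for i, d in enumerate(distances) if d > threshold_mm]


# Hypothetical repeat placements of three landmarks on one scan (mm).
trial_1 = [(0.0, 0.0, 0.0), (12.0, 4.0, 1.0), (30.0, 2.0, 5.0)]
trial_2 = [(0.3, 0.4, 0.0), (12.0, 4.0, 1.0), (30.0, 2.0, 7.0)]

distances = landmark_repeat_distances(trial_1, trial_2)
print("Repeat distances (mm):", distances)
print("Over 1 mm:", landmarks_over_threshold(distances, threshold_mm=1.0))
print("Over 0.5 mm:", landmarks_over_threshold(distances, threshold_mm=0.5))
```

Keeping the threshold as an explicit parameter reflects the point made above: the same repeat distances can pass or fail depending on which of the proposed limits a study adopts.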

Reconstruction

One of the greatest problems facing archaeology is limited preservation, which confines sample sizes and research capabilities (Benazzi and Senck 2011). This is particularly the case with GMM, because the landmarks utilised must be present on the entire sample (Bookstein 1991). It is important therefore to be able to reconstruct missing data to allow for morphological analysis or other forms of analysis such as facial reconstruction for victim identification in forensic investigations (Benazzi and Senck 2011; Benazzi et al. 2009; Krogman and Iscan 1986; Ponce De León and Zollikofer 1999; Wilkinson and Neave 2003). As such the ability to digitally reconstruct damaged, distorted, or fragmented objects is of great value in 3D research.

Despite the numerous benefits of digital (or virtual) reconstructions, these methods obviously introduce error. Furthermore, the accuracy of these reconstructions has been found to vary significantly between reconstruction methods and the nature of the study material (Benazzi et al. 2009; Hirst 2016). This is illustrated in a study conducted by Arbour and Brown (2014) which compared four reconstruction methods among five different specimens shown in Figure 1. To develop a standardised method for assessing error in GMM studies, it is important to examine the variety of methods that have been used in previous GMM studies, and assess the validity of these methods. However, the ability to compare reconstruction methods is arguably further hindered by the lack of standardisation in how studies present error. By creating standardised error assessment methods, researchers will be able to more accurately compare available literature on reconstruction methods. Until then, the reliability and validity of the results of this field remain somewhat questionable.

Figure 1. The accuracy of reconstruction methods; the circle radius indicates the mean error in reconstruction (Arbour and Brown 2014)
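
One way to report reconstruction error in a comparable form is a knock-out test: a landmark that is actually present is hidden, estimated, and the estimate compared with the true position. The sketch below is illustrative only. It assumes all specimens are already in a common coordinate system (for example after superimposition), and it uses mean substitution purely as a placeholder estimator; the function names and coordinates are hypothetical and are not drawn from the studies cited above.

```python
import math


def mean_substitution(specimens, target_index, landmark_index):
    """Estimate one landmark as the mean of that landmark in all other specimens."""
    others = [s[landmark_index] for i, s in enumerate(specimens) if i != target_index]
    count = len(others)
    return tuple(sum(point[axis] for point in others) / count for axis in range(3))


def knockout_error(specimens, landmark_index, estimator=mean_substitution):
    """Mean Euclidean error when the landmark is hidden and estimated for each specimen in turn."""
    errors = []
    for index, specimen in enumerate(specimens):
        estimate = estimator(specimens, index, landmark_index)
        errors.append(math.dist(estimate, specimen[landmark_index]))
    return sum(errors) / len(errors)


# Hypothetical sample: three specimens, two landmarks each, in shared coordinates.
sample = [
    [(0.0, 0.0, 0.0), (5.0, 5.0, 5.0)],
    [(2.0, 0.0, 0.0), (5.0, 6.0, 5.0)],
    [(4.0, 0.0, 0.0), (5.0, 7.0, 5.0)],
]
print("Mean knock-out error, landmark 0:", knockout_error(sample, 0))
print("Mean knock-out error, landmark 1:", knockout_error(sample, 1))
```

Passing the estimator as an argument means different reconstruction methods could be scored with the same error measure, which is the kind of like-for-like comparison that a standardised reporting approach would allow.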

Recommendations

After reviewing the above issues of ethics, digitisation methods, and error assessment, the following recommendations can be made for future researchers conducting 3D digitisation of human remains:

  1. Clear international ethical guidelines are required: these should describe uses in practical terms and avoid vague language such as ownership; these should also account for the potential of 3D digital data to be changed and altered.

  2. Organisations and communities involved in the repatriation of human remains should be included in future discussions regarding the digitisation of these remains, to develop guidelines. These guidelines should reflect the potential cultural variation between countries and communities regarding ethical treatment of human remains.

  3. Future studies which digitise culturally sensitive remains or remains in anticipation of repatriation should require ethical approval from the organisation/community to whom the remains will be repatriated. This approval and a discussion of the future use of these digital data should be clearly stated in any publications.

  4. Further research is needed in order to determine how to maximise the future potential of 3D digital collections.

  5. A review of previously published papers is needed in order to investigate the variation in landmarks used in GMM studies and to create a standardised landmark system which will maximise the potential of coordinate data, and allow direct comparisons between studies.

  6. A standardised method of assessing error needs to be created, and decisions made on the threshold for what may be considered as a reasonable amount of error for digitisation methods, the placement of landmarks or other coordinate data, and the reconstruction or estimation of missing data.

Conclusion

This paper has discussed the current lack of standardisation regarding the ethics and ownership of 3D data, as well as 3D data collection and analysis, with regard specifically to the analysis of human remains. It is hoped that this will lead to further discussions resulting in a standardised approach to 3D data collection, use, and ownership, and assessment of error in 3D data analysis within archaeological research. Future studies that explicitly evaluate the current methods for assessing error in GMM results are required to determine the best approach for future research. Before a standardised approach can be suggested, it is necessary to understand how different institutions and cultures view 3D scans of human remains, which we suggest should be achieved through collaboration between institutions, researchers, and relevant individuals. These discussions need to be started, while the field is still developing, in order to avoid the problems that have already hindered archaeological research in the past due to the difficulty in comparisons across studies and to prevent results from losing their value due to a lack of standardisation. We therefore argue that cross-disciplinary research, involving anthropologists, archaeologists, bioethicists, and legal scholars, is needed to consider these ethical questions and to develop suitable guidelines of proper practice.