1 Introduction

Facing ever new challenges from the diversification of customer needs, increasingly complex manufacturing processes, and the rising demand for product quality and efficiency, manufacturing firms must respond with shorter product design and manufacturing cycles, faster product iteration, higher production efficiency, and more flexible production methods. To cope with these challenges, Smart Manufacturing Systems have been proposed. A Smart Factory, or Smart Manufacturing System, is fully connected via information networks [1] and operates without human labor [2] by generating, transferring, receiving and processing the data necessary to conduct all tasks required for producing all kinds of goods [3]. Hence, effective communication between the physical and digital world is crucial for modern factories [4] and operations in general [12]. The commissioning on site, hence, becomes less stressful, less time-consuming and less error-prone. As corrections on site are more expensive, financial savings can be increased [13].

While errors in real machines comprise mechanical faults and external perturbations, this paper focuses on their models. A virtual model of a machine can be assumed to be “error-free” if it behaves like a perfect machine, without any disturbances, randomness or further influences. Such error-free models can still be imperfect when compared to their real counterparts, so that error refers to the model’s imperfection when encountering real machine errors. To avoid confusion regarding the term error and for the sake of consistency, this paper uses this nomenclature throughout.

Given the imperfection of machinery applied in manufacturing, machine failure behavior can be analyzed by pattern recognition on a production system level [14] but could be mitigated with error inclusion in virtual commissioning (VC). More errors in the control system could be detected and corrected in advance with increasingly accurate virtual models, which would also help to realize truly automated testing [15]. In order to verify the control software’s reactions and find any observable defects in test automation [16], these errors should be identified and simulated in the virtual model during VC. While narrowing the reality gap may not always yield immediate results for PLC software quality, it is of particular importance for enabling truly automated testing. Additionally, modern control techniques where digital twins are equipped with foresight [10] are increasingly transferred to machine and system control, requiring adequate virtual models with minimal reality gaps [5].

1.1 Research question

As an early investigation in addressing the challenges of error modeling in VC, this paper aims at optimizing the virtual model by embedding the machines’ errors that are ignored in existing, error-free VC. However, these errors exist in real production during operation. So in order for the dynamic performance of the virtual model to match the machines under real conditions, the models need to be improved.

Fig. 1 Illustration of the reality gap between the virtual model and the real machine

As shown in Fig. 1, the real machine can be considered to consist of the current error-free virtual model and a reality gap, i.e. unknown errors. Due to this reality gap, the control system validated by error-free virtual models could be overwhelmed when malfunctions appear in real systems. Hence, including an accurate error model to narrow the reality gap can provide mitigation. However, these error models are neither known nor applied nowadays.

Through literature review and expert interviews, a thorough overview of VC model errors is given. These errors are classified by factors such as their modeling suitability, leading to a VC model error directory. Subsequently, the error model shall be integrated into the error-free virtual model, narrowing this reality gap. Even with a seemingly benign part such as an inductive sensor, it can be shown that potentially significant differences between the reality and corresponding model can arise by ignoring errors.

2 Literature review

Commissioning is a critical part of the engineering process [7]. According to the VDI guideline 4499, VC encompasses the development-accompanying testing of individual components and sub-functions of an automation system using simulation methods and models that are tailored to the respective task [17, 18].

2.1 Definitions and delimitation

While error modeling is established in other fields of research, it is not at all common in the field of VC. The focus of this work lies on VC, i.e. the testing of the control systems. To show how digital approaches can help, Fig. 2 illustrates how virtual engineering and commissioning accelerate the traditional approach. The term VC should furthermore be clearly separated from other approaches that deal with accelerating the engineering process, such as concurrent engineering. Concurrent engineering describes the general idea of simultaneous product and production system engineering. VC, in contrast, is focused on production and not on the products and their effects on a production system [19].

Fig. 2 Smart manufacturing systems engineering process with virtual commissioning (VC) and virtual engineering (VE)

Compared to traditional approaches, the engineering process can be accelerated significantly through VC, as illustrated in Fig. 2. Up to 70% of software errors are corrected by connecting the PLC to virtual models via hardware-in-the-loop (HiL) or software-in-the-loop (SiL) [20] with respect to real-time restrictions [13]. Subsequently, the ramp-up time is drastically reduced and a higher quality is guaranteed [21].

2.2 General literature review

In order to model virtual prototypes that are closer to the real machine, Yang et al. propose that physical attributes such as mass, acceleration, friction, and spring force should be added to error-free models to decrease the reality gap [22]. Barth et al. mention that VC makes it possible to test major incidents such as cable breaks or sensor failures but do not provide specific solutions [17]. Süß et al. test the movement performance of a robot and check interference among different mechanical structures with an accurate error-free virtual model, also without considering errors in VC [23]. Schneider et al. verify the system with regard to application scenarios as well as unexpected behavior in case of failure [24]. Nonetheless, they define safe states and test conditions only for failures over a specified risk priority [24]. According to Kufner’s definition of modeling depth for VC, error modeling is at the deepest level, modeling depth 5 (physical condition depiction), and has not yet received much attention [25]. This indicates that error modeling in VC is still a new topic to be investigated. Given the examined research, a scientific demand for error modeling in VC is visible.

Errors exist not only in hardware but also in software [17]. During the control program configuration, five software errors are common, as mentioned in the VDI guideline 4499 [18]: (1) errors in logical control code, (2) process errors in the control code, (3) alarms, (4) communication errors and (5) malfunction of elements in the configuration. While the first four types are manageable, the last error type, malfunctions, cannot be automatically eliminated via software in the control system, as simulated components in VC are error-free by default. Thus, the control system does not consider malfunctions in the VC phase. In conclusion, no matter how accurate the simulation is, there is a substantial difference between simulation and reality, the so-called reality gap [26]. Five error types are summarized from the reviewed literature, as shown in Fig. 3. Errors can be distinguished based on the source of deviation between reality and virtual model: (1) inherent errors that depend on external effects (e.g. temperature-induced sensor effects), (2) code errors that stem from faulty coding, (3) installation errors (e.g. the connections of two sensors are swapped), (4) poor compatibility (e.g. PLC, sensor and actuator incompatibilities) and (5) physical errors (e.g. of individual elements).

Fig. 3 The five error types leading to the reality gap

This paper aims to enhance the dynamic characteristics of virtual elements with error models and, thus, narrow the reality gap by focusing on the first error type. If malfunctions are not simulated in a virtual model, control systems cannot react correctly.

3 Solution concept

Based on the aim to derive error-incorporating models for VC, the following solution concept is derived: Firstly, the challenges and idea in resolving the underlying problem, i.e. the different types of reality gaps, are outlined. Secondly, a process model is presented as a solution concept in order to bridge the reality gap between a virtual model and the corresponding real machine.

3.1 Bridging the reality gap

Fig. 4 The outline of VC with error model to bridge the reality gap

Errors have to be identified thoroughly before integration into virtual models. The idea is illustrated schematically in Fig. 4. The upper part of Fig. 4 shows the process of current VC, where the validated control system obtained after VC includes corrected errors and unknown errors (upper part and yellow). The lower part is a schematic diagram of error modeling in VC. After the error model is embedded into the error-free virtual model, more control system errors (depicted in dark green) are discovered and corrected in the validated control system. Thereby, the performance of VC is improved with an error-embedded virtual model, and the engineering process can be further shortened. Figure 4 indicates the implementation of error modeling as a step-wise process. At first, the object of interest is to be determined and then the related errors have to be identified. Next, errors must be expressed as a corresponding model in software. Finally, the error model is integrated with the error-free virtual model. Therefore, this paper uses a process model that satisfies the following four requirements.

First, physical elements presentable in virtual models have to be gathered. In this paper, the investigated physical objects shall be elements that may produce the target errors, i.e. inherent errors of elements and environmental effects on elements, cf. [22], as outlined in Fig. 3.

Afterwards, a knowledge base of errors can improve modeling efficiency and accuracy. Thus, the potential errors of elements applied to the production line have to be identified from different sources, including theoretical literature, practical experience and data analysis. As a result, a list of all possible and relevant errors is obtained. By analyzing and clustering the errors, they are presented in this work to be used in any given digital twin. The method to narrow the reality gap in this paper is implemented in a virtual environment. Thus, these physical faults need to be transferred from reality to software.

This has to be done pragmatically, meaning that the errors have to be implemented in a digital twin in such a way that they distort the simulation behavior in the same way the real errors would distort the actual production system. The error models have to be included in existing behavior libraries so that they can be used seamlessly. Hence, a requirement for errors is that they are programmable.

The last step aims at creating error models that narrow the gap between real and virtual machines. A model library based on a universal error description could ensure information consistency and, thus, improve the quality and efficiency of VC.

3.2 Process model

Based on the derived requirements, a process model to bridge the reality gap is proposed, including three steps, as outlined in Fig. 5.

  • Step 1 The primary task is to determine the scope of physical elements. For these elements, it is crucial to explore potential errors through a literature review, semi-structured interviews with experienced engineers, and real data analysis.

  • Step 2 Before errors from the knowledge base are imported into a virtual model, a feasibility analysis is required because not all identifiable errors are related to VC. The following steps focus on VC-related errors, which are subdivided further into irregular and mathematically representable errors, each requiring a different form of modeling.

  • Step 3 The next step is to simulate errors and malfunctions embedded into the error-free virtual model to obtain a new virtual model in line with the actual system behavior.
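The three steps can be sketched as a small data structure, purely for illustration. The class names, fields and example entries below are assumptions and are not taken from the error directory of this paper; in particular, the classification of each example entry is hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class ErrorClass(Enum):
    IRREGULAR = "irregular"
    MATHEMATICAL = "mathematically representable"
    NOT_VC_RELATED = "not VC-related"  # dropped by the feasibility analysis


@dataclass(frozen=True)
class ErrorEntry:
    element: str       # physical element the error belongs to
    description: str   # short verbal description of the error
    source: str        # "literature", "interview" or "data"
    error_class: ErrorClass


def feasible_errors(knowledge_base):
    """Step 2: keep only the entries that are related to VC."""
    return [e for e in knowledge_base
            if e.error_class is not ErrorClass.NOT_VC_RELATED]


# Step 1: hypothetical knowledge base entries (illustrative only).
knowledge_base = [
    ErrorEntry("inductive sensor", "temperature-dependent output",
               "literature", ErrorClass.MATHEMATICAL),
    ErrorEntry("camera", "contaminated lens",
               "interview", ErrorClass.IRREGULAR),
    ErrorEntry("pneumatic lifter", "signal delay",
               "data", ErrorClass.MATHEMATICAL),
    ErrorEntry("housing", "cosmetic scratch",
               "interview", ErrorClass.NOT_VC_RELATED),
]

# Step 3 would attach a simulation model to each remaining entry.
vc_errors = feasible_errors(knowledge_base)
```

Keeping the source of each entry makes it possible to update the knowledge base per source when new literature, interviews or production data become available.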

Fig. 5 The process model to bridge the reality gap

4 Methodology

Based on the proposed three-step process model, this section introduces the implementation to ultimately narrow the reality gap. The comprehensive error directory is acquired by analyzing existing literature and real data and by conducting expert interviews. This knowledge base can then be used to implement errors and, thus, narrow the reality gap step by step.

4.1 Step 1: Discovering the reality gap

In order to provide a thorough overview, inherent errors of elements and environmental effects on elements are analyzed from three perspectives: literature, semi-structured interviews with production engineers and real production data. Doing so, a knowledge base can be established, which serves as an indispensable basis for subsequent work. Note that the reviewed sources for the knowledge base form the core of the regarded knowledge, which in real-world applications should be updated regularly.

4.1.1 Error analysis from literature

By analyzing the technical characteristics of the elements described in literature, potential errors are identified. In spite of modern approaches to facilitate the extraction of such information from literature, such as Natural Language Processing (NLP) [27], this paper focuses on presenting errors and their suitability to be modeled in VC, which is not discussed in the analyzed literature and, thus, cannot be extracted.

The element classes themselves are mutually exclusive, based on the interaction within the production system. However, many errors can occur in several of the element classes. The classes are detective, dynamic and transmission elements.

Detective elements are used to measure parameters and monitor the material flow in a production line. Measurement is a trade-off process, as elements are always disturbed by internal or external factors. In general, some factors are taken into account in the measurement results, but others are ignored. A holistic overview of the different factors affecting detective elements is shown in Table 1.

Dynamic elements convert other forms of energy into mechanical energy. Mechanical energy is put into the production system continuously. A stable dynamic element ensures ideal workpiece motion characteristics. Consequently, the errors of elements (see Table 2) cannot be ignored while simulating.

Transmission elements, on the other hand, transport the workpiece in a production line, e.g. conveyor belts or robots. They transmit not only mechanical energy but also signals. Errors of transmission elements are listed in Table 3.

Table 1 Error sources of detective elements
Table 2 Error sources of dynamic elements
Table 3 Error sources of transmission elements

While many errors are verbally described in the literature, as outlined in Tables 1, 2 and 3, the scope of the existing work is limited to the description of individual errors. Furthermore, a comprehensive knowledge base of errors is not envisioned and a clear connection to error modeling in VC and the mathematical representation is not presented, as the description of individual elements is favored.

4.1.2 Error analysis from interviews

This subsection deals with errors gathered from practical experience. The failures that often occur on the production line are summarized in detail. The previously defined requirement, a knowledge base for error modeling, is met through a combination of theoretical and practical analyses. To this end, five industry experts, each having at least five years of experience and a background in the fields of industrial automation, control engineering, and VC, were questioned in semi-structured interviews. Their summary of actual errors of real production lines is of considerable significance to the study of the reality gap between virtual models and real machines. It should be noted that experts can be biased in their answers. The reader is referred to [58] for an investigation of said bias. Furthermore, despite their rich experience, they most likely cannot have experienced each and every error that can arise. Despite those potential drawbacks, the answers from the expert interviews are still valuable and cannot be replaced by other means of data acquisition. The most common errors identified are listed in Table 4.

Table 4 Errors reported by industrial experts

4.1.3 Error analysis based on real data

The real data stems from a running production line for battery cells. The recording of the research objects spans 22 min and 7 s, totaling 13,270 samples (binary values) for a single variable, which corresponds to one sample every 0.1 s. The following two types of errors were found in the real machine, although they do not harm the machine. If they were simulated in VC, the virtual model could more realistically reflect the real machine and its behavior.

Signal delay Signal delay is found as a typical error in real data. As shown in Fig. 6, the control signal of a pneumatically actuated lifter is taken as an example. Because they are very common, pneumatic cylinders are well suited as a basic example of error modeling with transferable results. The signals of said lifter are recorded and plotted to investigate how the actuator behaves and how the PLC receives the signals.

Fig. 6 Signal delay between commands and movement

There are 50 sample points, equaling 5 s. During this period, the lifter control signal changes and an error occurs. The lifter is controlled by a pneumatic system, moving in the positive and negative direction of the z-axis. In the first experiment, the lifter is initially at the bottom, as shown by the blue line. A command “lift up” is given by the PLC to the pneumatic system at \(n=3\). However, the lifter does not leave the bottom position until 1.2 s later, at \(n=15\). In the second experiment, a command “lift down” is sent by the PLC to the lifter at the top at \(n=7\), as shown by the red line. Likewise, it takes about 1.3 s until leaving the top position is detected. Signal delay is thus present in both experiments. Several factors may be responsible for this type of error. The first is control signal delay: starting at the PLC, the signal needs to pass through the fieldbus and different control elements, such as a switch, I/O ports, etc., to reach the actuator, which delays the signal transmission. Furthermore, sensor delays can occur, as sensors need a reaction time to be activated. At the third sampling point, the lifter might already leave the bottom, but the sensor does not detect this due to its delay. Additionally, a sensor’s detection range can influence the behavior. As long as the object is located within the detection range, the sensor output does not change. In this case, the lifter is still in the sensing range after starting to ascend, so the output of the bottom sensor does not change. Last but not least, actuator delay can play a role: a pneumatic system drives the lifter and, like sensors, the actuator can add delay as a result of less than perfect air flow and inertia.
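As a minimal sketch, the delay of the first experiment can be measured from two binary signals and then reproduced in a virtual model as a pure dead time. The traces below are synthetic and only mirror the sample indices given in the text; the function names are assumptions. The measured value lumps control signal, sensor and actuator delay together, since the recorded signals alone do not separate them.

```python
SAMPLES_PER_SECOND = 10  # 50 samples correspond to 5 s


def first_change(signal, start=0):
    """Index of the first sample after `start` that differs from signal[start]."""
    for n in range(start + 1, len(signal)):
        if signal[n] != signal[start]:
            return n
    return None


def delay_in_samples(command, sensor):
    """Samples between the command edge and the first sensor change after it."""
    n_cmd = first_change(command)
    n_sensor = first_change(sensor, start=n_cmd)
    return n_sensor - n_cmd


def delayed(signal, delay_samples):
    """Dead-time model: the output repeats the input `delay_samples` later."""
    if delay_samples == 0:
        return list(signal)
    return [signal[0]] * delay_samples + list(signal[:-delay_samples])


# Synthetic traces: "lift up" at n=3, bottom sensor released at n=15.
command = [0] * 3 + [1] * 47
bottom_sensor = [1] * 15 + [0] * 35

# Assumed error-free model: the sensor reacts in the same sample as the command.
ideal_sensor = [1] * 3 + [0] * 47

delay = delay_in_samples(command, bottom_sensor)
delay_s = delay / SAMPLES_PER_SECOND
model_output = delayed(ideal_sensor, delay)
```

With these traces, the dead-time model applied to the error-free sensor signal reproduces the recorded sensor signal sample by sample.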

In addition, another type of delay can be found in the real data. The time required for the machine to finish each step is 4.1 s more than the required time in the virtual model. This type of delay increases to 41 s after a process cycle (all 10 steps are finished). However, it should be pointed out that this “delay” is not an error and does not lead to the reality gap since it is caused by an unequal execution time between the virtual model and real machine. This could be eliminated by slowing down the execution speed of each step in the virtual model and incorporating the above mentioned errors.

Contradictory state In some cases, element states are contradictory. As shown in Fig. 7, the curve shows the state of stopper 1, an actuator in the production line.

Fig. 7 The contradictory state of stopper 1

The sample consists of 15 sampling points with a total duration of 1.5 s. The light blue curve indicates that stopper 1 is open, and the dark blue curve signifies that stopper 1 is closed. Between the 7th and 10th point, the value of both curves is 0. This means the state of stopper 1 is neither open nor closed, which is inconsistent and impossible in reality. Such contradictory states exist not only in stoppers but also in photoelectric sensors. Although this phenomenon exists in many elements, the production line is still working, and the control system has no noticeable problem. One possible reason is that the control system cannot respond to the change of an element state in such a short time. However, the sample data does not contain many errors, which illustrates another important issue: even if errors are sought in real data, they are not guaranteed to show up in any given data set. Nevertheless, when they appear in reality, they can easily lead to disruptions or even breakdowns. Hence, it is worthwhile to identify, store and model such errors in order to optimize the quality of VC.
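Such contradictory states can be located automatically in recorded data. The following sketch assumes two binary signals of equal length, one for “open” and one for “closed”; the trace is synthetic and only mirrors the 7th to 10th sampling point mentioned above, and which state precedes the gap is an assumption.

```python
def contradictory_samples(is_open, is_closed):
    """Return the indices where the two binary state signals contradict each other.

    A stopper is either open or closed, so exactly one signal should be 1.
    Samples where both are 0 or both are 1 are reported.
    """
    return [n for n, (o, c) in enumerate(zip(is_open, is_closed)) if o == c]


# Synthetic 15-sample trace (1.5 s at 0.1 s per sample).
is_open = [1] * 6 + [0] * 9
is_closed = [0] * 10 + [1] * 5

gap = contradictory_samples(is_open, is_closed)  # zero-based indices 6 to 9
```

The zero-based indices 6 to 9 correspond to the 7th to 10th sampling point. The same check can be run over every pair of complementary signals in a recording to collect candidates for the knowledge base.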

4.2 Step 2: Analyze feasibility

A clearer understanding of potential errors is obtained after the classification. Before these errors are simulated in software and integrated into error-free virtual models, prerequisites should be met, including correlation with VC and programmability. Thus, the feasibility of simulating errors and malfunctions is discussed. While this work found a number of irrelevant errors, they are not presented here and the focus remains on those that are relevant to simulation and VC.

Irregular errors are accidental and passive errors, usually caused by complex environmental factors, such as vibration, contamination, electromagnetism, etc. Irregular errors cannot be expressed through mathematical models. However, their appearance is likely to cause a malfunction in practice, even with control software validated in VC, so the corresponding warning messages should be set during simulation. For example, if a camera cannot recognize the target due to contamination, it might output a false signal. In this scenario, a warning message should appear on the control system, prompting the operator to eliminate the fault. Keeping in mind that the focus of VC is the PLC software, those irregular errors cannot be handled by software and as such are not deemed relevant for error modeling in VC.

Mathematically representable errors are errors whose changes and consequences are predictable using mathematical models. These can be summarized from published literature, such as the effect of temperature on sensor output accuracy. Thus, these errors can be mathematically modeled to determine patterns of change which can be used to describe them in software.

Feasibility analysis shows that not all errors are relevant for VC and thus could and should not be modeled. The errors found in Sect. 4.1 can be divided into two broad categories, but this paper focuses only on those that are relevant to VC models. This enables the reader to subsequently implement those errors in their models. This leaves two main categories: irregular errors and those that can be modeled mathematically. The first category includes contamination, pollution and belt slip as well as vibrations from neighboring stations or even factory traffic.

4.3 Step 3: Error simulation

Errors from the knowledge base are summarized and categorized based on their characteristics. This section introduces how to simulate feasible errors in VC, corresponding to the third step of the process model in Fig. 5.

As previously mentioned, irregular errors cannot be accurately described by mathematical models. Because of irregular errors, abnormal outputs of certain elements can prevent the control system from functioning correctly. Current VC does not take these errors into account, so the control system assumes that the output of elements is as predetermined and may be overwhelmed by problems such as camera failures due to contamination. Thus, although the pattern of change of an irregular error cannot be described precisely, the influence of the error can be simulated in VC. By observing this influence, approximated models can be generated. This would allow the programmer to evaluate the PLC’s behavior in such a scenario. The modeling process of irregular errors is shown in the top half of Fig. 8.
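A minimal sketch of simulating only the influence of an irregular error, using the contaminated camera as the example: the error-free virtual element is wrapped, and its output is overridden while a fault is active. All class and function names are hypothetical, the control logic is a toy stand-in for PLC code, and the fault schedule is fixed rather than random so that a test run stays reproducible.

```python
class VirtualCamera:
    """Error-free virtual element: always reports the true part presence."""

    def detect(self, part_present):
        return part_present


class ContaminatedCamera:
    """Wraps the error-free camera and overrides its output while a fault is active."""

    def __init__(self, camera, fault_cycles):
        self.camera = camera
        self.fault_cycles = set(fault_cycles)  # cycles with an assumed dirty lens

    def detect(self, part_present, cycle):
        if cycle in self.fault_cycles:
            return False  # influence of the error: the target is not recognized
        return self.camera.detect(part_present)


def run(camera, cycles):
    """Toy control logic: raise a warning when an expected part is not seen."""
    warnings = []
    for cycle in range(cycles):
        if not camera.detect(True, cycle):
            warnings.append(f"cycle {cycle}: part not recognized, check camera")
    return warnings


warnings = run(ContaminatedCamera(VirtualCamera(), fault_cycles=[2, 3]), cycles=5)
```

The wrapper leaves the error-free model untouched, so the same virtual element can be used with and without the injected fault.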

Fig. 8 The simulation process of VC relevant errors

Another type of error is the mathematically representable error. These errors often occur in the form of changes that can be predicted by mathematical models or logical control loops. Taking an inductive sensor as an example, the output value in operation may differ from the output value of the error-free virtual model due to fluctuating room temperature. In this case, the fluctuating temperature can be seen as an error. The virtual sensor is not precise enough, so it does not exactly match the dynamic performance of the inductive sensor in operation. The most effective way to resolve this error is to establish an extra mathematical model for room temperature in software. In the next step, this model is integrated into the error-free virtual model, narrowing the reality gap. The bottom half of Fig. 8 illustrates the simulation process of mathematically representable errors.
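The inductive sensor example can be sketched as follows. The linear drift of the switching distance with temperature and all numeric values are assumptions chosen for illustration; they are not taken from this paper or from a sensor data sheet, and a real model would have to be fitted to the element in question.

```python
NOMINAL_RANGE_MM = 4.0  # assumed nominal switching distance
REF_TEMP_C = 20.0       # assumed reference temperature
DRIFT_PER_K = 0.002     # assumed relative drift per kelvin (illustrative only)


def error_free_sensor(distance_mm):
    """Error-free virtual sensor: fixed switching distance."""
    return distance_mm <= NOMINAL_RANGE_MM


def switching_distance(temp_c):
    """Extra mathematical model: switching distance as a function of temperature."""
    return NOMINAL_RANGE_MM * (1.0 + DRIFT_PER_K * (temp_c - REF_TEMP_C))


def sensor_with_temperature(distance_mm, temp_c):
    """Error-embedded virtual sensor: same logic, temperature-dependent threshold."""
    return distance_mm <= switching_distance(temp_c)


# A target at 3.95 mm is detected by the error-free model, but under the
# assumed drift it is missed when the room temperature drops to 10 °C.
ideal = error_free_sensor(3.95)
warm = sensor_with_temperature(3.95, 20.0)
cold = sensor_with_temperature(3.95, 10.0)
```

Even under this simple assumption, the binary output seen by the PLC differs between the error-free and the error-embedded model for targets close to the switching distance.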

5 Results

This work presents a comprehensive overview of the current state of error simulation, the current deficits and potential benefits. As a result, it allows the reader to enrich their models by understanding what can and should be simulated in what manner. This knowledge base provides a first step towards narrowing the reality gap. The proposed solution concept is investigated for feasibility according to the presented process model. Extending the error-free model requires a knowledge base to map relevant errors and their mathematical representation to be included into the PLC for VC, which is regularly extended whenever a novel error is introduced. The main contribution, however, lies in a consistent approach to error modeling in VC. The resulting knowledge base enables digital twins to be enhanced, which in turn can further improve VC. Especially in the context of automated testing, the systematic simulation of relevant errors enables simulation experts to improve the PLC software’s robustness towards a plethora of possible behavioral deviations, which would then speed up the real commissioning.

6 Conclusion and outlook

This paper presents a process model as a solution concept to bridge the reality gap in VC. This section discusses the entire approach, including the innovation of this approach over current VC techniques, and its limitations that can be further optimized.

6.1 Innovation

Firstly, compared to current VC techniques using error-free virtual models, the proposed solution can avoid loss of profits and delay of a project, thereby speeding up the production cycle. Thus, the urgency of including error modeling in virtual models is exemplified. Secondly, the process model proposed in this paper is highly versatile and scalable. Industrial applications are plentiful: the testing can become more realistic, thus yielding a higher software quality on the construction site. Automated testing can be established, where the errors are automatically simulated and evaluated. At the very least, a better understanding of what is possible, what can and should be simulated is provided.

The knowledge base established in the first step is changeable and extendable according to the VC application domain. Common types of errors are distinct in different industries [17]. Therefore, the database’s sources should be matched with its application domain. For the remanufacturing industry, which suffers from varying product states and a high potential for failure [59], a small reduction in errors can yield great potential for increasing process robustness. In addition, depending on the project’s accuracy requirements, the number and types of errors considered in the process model can be extended easily. In other words, the process model can be applied to projects with different specifications and sizes. This paper, thus, provides the first step towards error modeling in VC, to get from an idealized model to one that incorporates errors from the real-world surroundings of production environments. This could also open the door to projects, such as the one discussed in [5], to include the vast field of digital twins for deep and reinforcement learning into production systems [60] and knowledge-aware, knowledge-graph-based virtual models [61].

6.2 Limitations

The dynamic performance of error-free virtual models is optimized through error modeling. However, the error-embedded virtual model’s stability is not fully verified yet. In order to achieve a full evaluation of the inclusion of error modeling in VC, a real machine with a PLC and an experienced programmer is required. As this research provides the first step towards error modeling in VC, its main focus lies on presenting basic innovations for error modeling in VC. Consequently, a robust control system after VC cannot yet be guaranteed in real operation. Thus, the robustness of control systems validated by error-embedded virtual models should be the focus of future research. Against the background of the shown results, this approach of error modeling in VC is feasible. However, in comparison with state-of-the-art techniques in which control systems are validated by an error-free virtual model, it is not fully understood how much benefit a control system verified by an error-embedded virtual model could bring to VC. While achieving a high level of realism per se is not conceivable, the required granularity of error inclusion is not investigated in detail and, hence, is worth discussing.

6.3 Outlook

This work constitutes the first step from investigating the reality gap to deriving a knowledge base. This can then be used to implement errors in previously error-free VC models. The next step is to model the errors in the respective VC tool. These models must then be saved and organized into a database, from where they can be applied to real-life production models. Future research on error modeling in VC should focus on incorporating actual errors from actual production systems, i.e. on up-to-the-minute virtual models. In these cases, real data on the operation of relevant equipment must be collected and analyzed, and the error database updated accordingly. All programmable errors have to be implemented in existing libraries for them to be available in production systems. Meanwhile, further prototypical implementation is necessary, and the established error-embedded virtual models should subsequently be tested with the PLC to verify whether the control system validated by this model can remain stable. Last but not least, the versatility and compatibility of error-embedded virtual models are priorities for future developments. The error models should be as compatible as possible with different software to reduce the effort of VC. In future research, the control system validated by VC could be more adaptable to real production conditions, and test automation could be implemented in a more genuine manner. Against the background of narrowing the reality gap, error modeling in VC, as introduced in this paper, can serve as a game-changing enabler for digitization and Industry 4.0.