
1 Introduction

The domain of causal structure learning deals with finding cause-and-effect relationships in inspected environments and aims to recover the true causal graph. These machine learning methods can help a machine to understand its environment independently, to exploit its causal dependencies, and to assess the consequences of actions that have already occurred or will occur in the future. Discovered causal graphs are used for causally motivated root cause analysis [5] and for causal effect estimation in smart manufacturing. During their development, such methods are commonly applied to well-known environments.

A subdomain of causal structure learning tries to uncover causal relations by performing experiments and gaining structure knowledge from the interventional effects. However, established procedures such as the Rubin causal model [9], the basic theory for randomized controlled trials [3], inspect the effect in only one affected variable. Gaining structure knowledge from interventions that affect multiple variables is currently underexplored, since the information they provide about the causal graph is highly ambiguous: each causal relation between the intervened variable and an affected variable could be direct, or it could be indirect via any other affected variable.

In this work, we investigated how to use such ambiguous information for causal structure learning by defining so-called path constraints. In a novel low-intrusive intervention, we injected a minor wavelet into the intervened variable, tried to rediscover the wavelet in the other variables to identify the affected ones, and thus derived these path constraints. A path constraint contains only the information that two variables are directly or indirectly connected in a specified direction. We demonstrate their usefulness by applying them to the results of the state-of-the-art PCMCI+ causal structure learning algorithm, thereby improving its discovered causal graphs.

To exercise these methods, we chose a simulation of a running combustion engine as the testing environment. The chosen combustion engine was validated on a test stand and has a manageable, well-known number of causal relationships. Its simulation allows the injection of wavelets that propagate naturally through the system.

The paper is structured as follows. In Sect. 2, we will present related work. In Sect. 3, we shed some light on causal graphs and existing approaches for causal structure learning. In Sect. 4, we introduce our fundamentals for the novel interventional technique. In Sect. 5, the wavelet injections are demonstrated on a combustion engine simulation step by step. In Sect. 6, we draw the conclusion.

2 Related Work

Related Work in Soft Interventions

In the domain of soft interventions, probabilistic interventions are commonly used to gain structure information about two investigated variables. Eberhardt et al. [2] investigated how many soft interventions on variable pairs are required to gain knowledge of the true full causal graph. Kühnert et al. [6] used the Max-Min Parents and Children method to learn the graph skeleton, oriented some edges using conditional independence tests, and then applied soft interventions by ‘pushing’ probability distributions in a certain direction to finalize the discovered graph by inspecting variable pairs.

Related Work in Using Propagation Properties

A related characteristic of the propagation properties of causal relations has been used before by Hoyer et al. [4]. According to their research, the affected variable may be represented as a function of the causing variable plus some random and independent additive noise in the actual direction from cause to effect, but not in the opposite direction. By testing each direction, they were able to orient the relation between two variables with high confidence.

Related Work in Multivariate Discovery

Some methods specialize in the discovery of multiple causal relations to uncover the causal graph, such as PCMCI+ [10] and Multivariate Transfer Entropy [7]. They use only observational data for their discoveries and perform no interventions.

3 Fundamentals

3.1 Causal Graphs

Causal graphs consist of a set of nodes \(\mathcal {V}\) representing variables and a set of edges \(\mathcal {E}\) representing causal relations. If a directed edge points from \(A \in \mathcal {V}\) to \(B \in \mathcal {V}\), then variable B is caused by A. A path from A to B is a chain of arbitrary edges connecting A and B that contains at least one edge. A directed path from A to B is a path whose edges are all directed from A towards B. A direct causal relation between two variables corresponds to a path length of one, whereas an indirect causal relation corresponds to a path length greater than one.
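The distinction between direct and indirect relations can be illustrated with a small sketch. The toy graph and the function names `directed_path_length` and `relation` are hypothetical and serve only to illustrate the definitions above.

```python
from collections import deque

# Hypothetical toy graph: each edge is a (cause, effect) pair.
EDGES = {("A", "B"), ("B", "C"), ("D", "C")}


def directed_path_length(edges, source, target):
    """Length of the shortest directed path from source to target, or None."""
    children = {}
    for cause, effect in edges:
        children.setdefault(cause, set()).add(effect)
    queue = deque([(source, 0)])
    visited = {source}
    while queue:
        node, dist = queue.popleft()
        for child in children.get(node, ()):
            if child == target:
                return dist + 1
            if child not in visited:
                visited.add(child)
                queue.append((child, dist + 1))
    return None


def relation(edges, a, b):
    """Classify the causal relation from a to b as direct, indirect or none."""
    length = directed_path_length(edges, a, b)
    if length is None:
        return "none"
    return "direct" if length == 1 else "indirect"
```

In this toy graph, `relation(EDGES, "A", "B")` yields a direct relation, `relation(EDGES, "A", "C")` an indirect one, and there is no directed path from A to D.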

3.2 Causal Structure Learning

The goal of causal structure learning is to gain knowledge of the true causal graph of the inspected environment. Such knowledge may be gained by applying observational and interventional methods.

Observational Causal Structure Learning

Observational causal structure learning methods try to gain knowledge of the true causal graph from recorded observational data. One of the most powerful methods is PCMCI+ [10]. It works in two phases. In the first phase, a set of potential parents is estimated within a specified time window. In the second phase, a Momentary Conditional Independence (MCI) test is applied to the variables from the parent set and from the inspected time window. In contrast to prior algorithms, PCMCI+ does not assume that causal edges are oriented in either one direction or the other. Thus, it allows finding bidirected and contemporaneous causal relations, as they are often present in engineering applications.

Interventional Causal Structure Learning

Using interventions for causal structure learning is one of the oldest and most popular approaches in science. Still, in some application scenarios their use may be costly, unethical or simply not feasible. A variable A is assumed to be a cause of another variable B if an intervention on A also affects the associated variable B [12]. Eberhardt et al. [2] divided the existing approaches into two major categories called structural interventions and soft interventions. Structural interventions cut off all causal influences on the variable under intervention and fully determine its value (e.g. treatment with a drug or with no drug). Soft interventions (also called parametric interventions) only make minor changes; a common choice is an intervention on the probability distribution of a variable [2, 6]. They do not disturb the original causal structure, so influences of other variables on the intervened variable are not ruled out.
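The difference between the two categories can be made concrete with a minimal simulation. The linear model Z → A → B, its coefficients and the function name `simulate` are assumptions chosen purely for illustration; they are not taken from the cited works.

```python
import numpy as np


def simulate(n, mode, seed=0):
    """Simulate a hypothetical chain Z -> A -> B under an intervention on A.

    mode = "none":        no intervention
    mode = "structural":  A is set by the experimenter, Z no longer affects A
    mode = "soft":        the distribution of A is shifted, Z still affects A
    """
    rng = np.random.default_rng(seed)
    z = rng.normal(size=n)
    a = 0.8 * z + 0.1 * rng.normal(size=n)
    if mode == "structural":
        # Randomized assignment (e.g. drug or no drug), independent of Z.
        a = rng.integers(0, 2, size=n).astype(float)
    elif mode == "soft":
        # Small shift of the mean of A; the influence of Z is preserved.
        a = a + 0.5
    b = 0.5 * a + 0.1 * rng.normal(size=n)
    return z, a, b
```

Under the soft intervention the correlation between Z and A stays close to that of the undisturbed system, while under the structural intervention it vanishes up to sampling noise. In both cases the change in A is passed on to B.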

4 Wavelet-Based Soft Interventions

The new intervention method does not perturb the original causal relations and can therefore be considered a soft intervention in the sense of Sect. 3. The particularity of these interventions is that a wavelet is added to the intervened variable and then searched for in the other variables to gain causal information.

When injecting a wavelet into a variable A, the wavelet is added to the timeseries of that variable. We assume that the wavelet spreads in the direction of the causal relations in the graph. If we find the wavelet in only one other variable, we assume a direct causal relation to be present. If we find it in several other variables, including B, we may not assume this. Instead, we gain knowledge about an existing path \(C(A,B)\) between the variables with length \(|C(A,B)|\ge 1\), since the wavelet must have traveled in some way from A to B. These specific pieces of path information are the so-called path constraints.
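A minimal sketch of the injection step, assuming a sampled Mexican hat wavelet and an additive injection into a NumPy array. The function names `mexican_hat`, `inject` and `path_constraints`, as well as the `width` and `amplitude` parameters, are hypothetical; the source does not specify how the wavelets were generated or scaled.

```python
import numpy as np


def mexican_hat(length, width):
    """Sampled Mexican hat wavelet (unnormalized), centered in the window."""
    t = np.arange(length) - (length - 1) / 2.0
    x = t / width
    return (1.0 - x**2) * np.exp(-0.5 * x**2)


def inject(series, wavelet, start, amplitude=1.0):
    """Return a copy of series with the scaled wavelet added at index start."""
    series = np.asarray(series, dtype=float)
    if start < 0 or start + len(wavelet) > len(series):
        raise ValueError("wavelet does not fit into the series at this position")
    injected = series.copy()
    injected[start:start + len(wavelet)] += amplitude * wavelet
    return injected


def path_constraints(intervened, affected):
    """One (cause, effect) path constraint per variable in which the wavelet was found."""
    return {(intervened, variable) for variable in affected}
```

For instance, `path_constraints("A", ["B", "C"])` states only that some directed path leads from A to B and from A to C; it does not say whether either path consists of a single edge.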

As soft interventions allow other causal influences on the intervened variable, the wavelet must be distinguishable from these influences. Thus, we decided to use uniquely shaped wavelets to increase the chance of their successful reidentification. Note that, in general, we do not draw conclusions from variables in which the injected wavelet could not be found, as the wavelet may have been lost due to a loss in amplitude, deformation or other reasons.

For wavelet recovery, we normalized the measured values, since the different scales of the variables would otherwise make a comparison difficult. We then applied the fast pattern matching algorithm called Mueen’s ultra-fast Algorithm for Similarity Search (MASS) [13] to each measured variable. It matches the desired pattern against every subsequence of the inspected timeseries and calculates the z-normalized Euclidean distance for each. Together, these distances form an overall distance profile. If the minimal distance is below a chosen threshold, we assume the wavelet to be located at the corresponding position. Otherwise, we assume the wavelet to be absent from the observed variable and gain no path constraint.
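The recovery step can be sketched with a naive distance profile. This is not MASS itself: MASS computes the same z-normalized Euclidean distances far more efficiently using the fast Fourier transform, whereas the loop below trades speed for readability. The function names and the handling of constant windows are assumptions made for this illustration.

```python
import numpy as np


def znorm(x):
    """Z-normalize a sequence; constant sequences map to all zeros (an assumption)."""
    x = np.asarray(x, dtype=float)
    std = x.std()
    if std == 0:
        return np.zeros_like(x)
    return (x - x.mean()) / std


def distance_profile(query, series):
    """Z-normalized Euclidean distance between query and every subsequence of series."""
    query = znorm(query)
    series = np.asarray(series, dtype=float)
    m = len(query)
    n = len(series) - m + 1
    profile = np.empty(n)
    for i in range(n):
        profile[i] = np.linalg.norm(query - znorm(series[i:i + m]))
    return profile


def find_wavelet(query, series, threshold):
    """Start index of the best match, or None if no distance is below threshold."""
    profile = distance_profile(query, series)
    position = int(np.argmin(profile))
    return position if profile[position] < threshold else None
```

Because both the pattern and each subsequence are z-normalized, a wavelet that arrives scaled or shifted by a constant offset still yields a small distance, whereas a deformed wavelet does not. The threshold therefore controls how much deformation is tolerated.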

5 Applying Wavelet Injections

In this section, we demonstrate how we applied the wavelet injections to a combustion engine dataset. As the experimental setup, we first applied PCMCI+ to the simulation timeseries without any wavelet injections present. Then, we added three different wavelets to a root variable and tried to rediscover them in the other variables to gain path constraints. Finally, we compared the graphs found by PCMCI+ alone and by PCMCI+ combined with path constraints against the actual causal graph using the area under the receiver operating characteristic curve (ROC AUC). For this purpose, the true causal graph was created in advance using expert knowledge.
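The exact way in which the ROC AUC between two graphs was computed is not stated here. A minimal sketch under one possible assumption: every ordered pair of distinct variables is treated as a binary classification case, with label 1 if the true graph contains that directed edge and score 1 if the discovered graph contains it. The function names and the toy graphs in the usage note are hypothetical.

```python
def roc_auc(labels, scores):
    """Rank-based ROC AUC: probability that a positive outscores a negative, ties count half."""
    positives = [s for l, s in zip(labels, scores) if l == 1]
    negatives = [s for l, s in zip(labels, scores) if l == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    wins = sum((p > n) + 0.5 * (p == n) for p in positives for n in negatives)
    return wins / (len(positives) * len(negatives))


def graph_roc_auc(nodes, true_edges, found_edges):
    """ROC AUC of a discovered directed graph against the true directed graph."""
    pairs = [(u, v) for u in nodes for v in nodes if u != v]
    labels = [1 if pair in true_edges else 0 for pair in pairs]
    scores = [1 if pair in found_edges else 0 for pair in pairs]
    return roc_auc(labels, scores)
```

With binary scores this value equals the mean of the true positive rate and the true negative rate. For a true chain A → B → C → D and a discovered graph that reverses only the last edge, the result is 7/9, since two of three true edges are found and one of nine absent edges is wrongly reported.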

Simulation Setup

As a testing environment, we used a running combustion engine simulation [1, 8, 11]. For evaluation, we constructed the true causal graph shown in Fig. 2. Here, we give a brief explanation of the causal relations: The angle of the throttle plate determines how much air can enter the motor cylinder. The air intake over time adds up to the aircharge in the cylinder before combustion. After combustion, the torque of the engine and the overall engine speed increase depending on the aircharge. The increase in engine speed also depends on the load carried by the engine.

We injected the wavelets shown in Fig. 1 by actuating only the throttle angle and inspected all other measured timeseries for traces of the injected wavelet. According to the true causal graph in Fig. 2, the wavelets should be found in the air intake, aircharge, torque and engine speed variables, as they depend directly or indirectly on the angle variable. Only the load variable should be free of any wavelet, as it is independent of the angle variable.

Step 1: Performing PCMCI+

First, the PCMCI+ algorithm described in Sect. 3.2 was applied to the running combustion engine dataset. We performed six runs by combining the alpha values 0.05 and 0.01 with the maximum lag parameters of 5, 10 and 15 data entries. One of the resulting graphs is depicted in Fig. 2a as an example. On average, the graphs found by PCMCI+ achieved a ROC AUC of 73% with respect to the true causal graph shown in Fig. 2c. As shown, the method was not able to orient several edges.

Step 2: Performing Wavelet Injection and Recovery

We decided to use three very distinct and well-defined wavelets: a Daubechies 4 wavelet, a Mexican hat wavelet and a Haar wavelet. They are depicted in Fig. 1. We chose these wavelets because they contain amplitudes in both the positive and the negative value range and have a distinct shape.

Fig. 1 The wavelets that were injected into the angle variable, plotted as amplitude versus time in milliseconds: (a) a Daubechies 4 wavelet centered at around (3500, 1.1); (b) a Mexican hat wavelet centered at around (5000, 0.85); (c) a Haar wavelet between (0, 0) and (4000, 0)

Fig. 2 The causal graphs over the variables angle, air intake, aircharge, torque, load, and engine speed: (a) after the application of PCMCI+ (aircharge and load connect to engine speed); (b) after the application of the path constraints to the PCMCI+ result (aircharge leads to torque with an arrow); (c) the actual causal graph constructed by experts (torque and load lead to engine speed)

Fig. 3 The wavelets as they were discovered in the exemplary aircharge (a, c, e) and load (b, d, f) variables, plotted versus time for the Daubechies 4, Mexican hat, and Haar wavelets. Each panel shows the variable after wavelet injection and without wavelet injection. In the aircharge panels, the line after wavelet injection fluctuates within the recovered wavelet position region; in the load panels, the two step-shaped lines overlap. The area where the lines diverge indicates the presence of a wavelet and should be highlighted green as a mark of successful recovery. If the lines do not diverge, no wavelet is present and nothing should be discovered

As an implementation of the pattern matching algorithm, we used the Python package stumpy. It found all wavelets in all variables that depend on the influenced angle variable. Each wavelet was suitable for injection and discovery, as each was found at its actual position in all variables depending on the angle variable. Figure 3 presents an excerpt from our results for the aircharge and the load variable for each of the three wavelets. Plotted are the measured variables with and without wavelet injection, so any divergence between the plotted lines must be caused by the wavelet. The colored area marks where the wavelet was rediscovered by the pattern matching algorithm. It is colored green when the wavelet is found at its actual position. This was the case for all wavelets in the aircharge variable.

In the load variable, no wavelet was found. This is the expected result, since the load variable does not depend on the angle variable.

Step 3: Combining PCMCI+ and Path Constraints

From the previous step, we retrieved four path constraints, as the wavelet injection into the angle variable affected the air intake, aircharge, torque, and engine speed variables. To improve the results of PCMCI+, we oriented the unoriented edges in the found causal graph according to the path constraints. For each path constraint, we assumed a directed path to be present from the causing to the affected variable. If only one path between the two variables was present in the PCMCI+ result, we oriented its undirected edges in accordance with the directed path. We applied this procedure to all six PCMCI+ results. An example is shown in Fig. 2b. Due to the path constraints, the average ROC AUC of the six graphs increased to 83%.
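The orientation rule can be sketched as follows. The graph representation (a set of directed `(cause, effect)` pairs plus a set of undirected two-element `frozenset` edges) and the function names are hypothetical. The sketch further assumes that a candidate path may follow directed edges only in their direction and undirected edges in either direction; the text above does not settle this detail.

```python
def simple_paths(directed, undirected, source, target):
    """All simple paths from source to target (directed edges forward, undirected either way)."""
    neighbors = {}
    for u, v in directed:
        neighbors.setdefault(u, set()).add(v)
    for edge in undirected:
        u, v = tuple(edge)
        neighbors.setdefault(u, set()).add(v)
        neighbors.setdefault(v, set()).add(u)
    paths = []

    def walk(node, path):
        if node == target:
            paths.append(path)
            return
        for nxt in sorted(neighbors.get(node, ())):
            if nxt not in path:
                walk(nxt, path + [nxt])

    walk(source, [source])
    return paths


def apply_path_constraints(directed, undirected, constraints):
    """Orient undirected edges along the path of each constraint if that path is unique."""
    directed = set(directed)
    undirected = {frozenset(edge) for edge in undirected}
    for cause, effect in constraints:
        paths = simple_paths(directed, undirected, cause, effect)
        if len(paths) != 1:
            continue  # no path or several candidate paths: leave the graph unchanged
        path = paths[0]
        for u, v in zip(path, path[1:]):
            edge = frozenset((u, v))
            if edge in undirected:
                undirected.remove(edge)
                directed.add((u, v))
    return directed, undirected
```

Constraints are processed in the given order, and each orientation can reduce the number of candidate paths for later constraints, so the order may influence the result. When several paths remain between the two variables, the constraint stays ambiguous and no edge is oriented.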

6 Summary and Conclusion

We investigated the idea of retrieving causal graph information from soft interventions that affect multiple variables and thus cannot deliver distinct structure information for causal graph construction. For this purpose, we introduced the definition of path constraints. We demonstrated how this information helps to improve the results of the well-established PCMCI+ algorithm. The procedure was demonstrated on a running combustion engine simulation. The obtained path constraints increased the average ROC AUC of the PCMCI+ results from 73% to 83%. Thus, we deem path constraints helpful in improving causal structure learning results. In future work, we will investigate more complex application scenarios. Additionally, we want to explore the use of temporal information to gain additional structure information.