1 Introduction

In this paper, we study the Tracking Paths problem, which involves finding a minimum weight set of vertices in a vertex-weighted simple graph G that can track moving objects in a network along the way from a source s to a target t. A set of vertices T is a tracking set if for any (simple) \(s\)-\(t\) path P in G the sequence \(\overrightarrow{T_P}\) of the subset of vertices from T that appear in P, in their order along P, uniquely identifies the path P. That is, T is a tracking set if \(\overrightarrow{T_{P_1}}\ne \overrightarrow{T_{P_2}}\) holds for any two distinct \(s\)-\(t\) paths \(P_1\) and \(P_2\). Formally, the problem is the following.

figure a
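To make the definition concrete, the following brute-force sketch checks whether a given vertex set is a tracking set. It enumerates all simple \(s\)-\(t\) paths, so it runs in exponential time and is only meant for tiny graphs. The function names and the adjacency-dictionary representation are assumptions made for illustration and do not come from the paper.

```python
def simple_paths(adj, s, t):
    """Yield every simple s-t path as a list of vertices (exponential time)."""
    stack = [(s, [s])]
    while stack:
        v, path = stack.pop()
        if v == t:
            yield path
            continue
        for u in adj[v]:
            if u not in path:
                stack.append((u, path + [u]))


def is_tracking_set(adj, s, t, trackers):
    """Return True if the tracker sequences of all simple s-t paths are distinct."""
    seen = set()
    for path in simple_paths(adj, s, t):
        # Sequence of trackers in the order in which they appear along the path.
        sequence = tuple(v for v in path if v in trackers)
        if sequence in seen:
            return False
        seen.add(sequence)
    return True


# Hypothetical example: a 4-cycle s-a-t-b-s with two s-t paths.
adj = {"s": ["a", "b"], "a": ["s", "t"], "b": ["s", "t"], "t": ["a", "b"]}
print(is_tracking_set(adj, "s", "t", set()))   # both paths have the empty sequence
print(is_tracking_set(adj, "s", "t", {"a"}))   # sequences ("a",) and () differ
```

In the example, a single tracker on one of the two parallel paths already distinguishes them, whereas the empty set does not.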

In the current age of information, social media networks play an important role in information exchange and dissemination. However, due to the unregulated nature of this exchange, the spread of rumors and fake news poses serious challenges to the authenticity of information [10, 32]. Identifying and studying patterns of rumor spreading in social media is difficult due to the huge amounts of data in constant movement in large networks [36]. Tracing the sequence of channels (people, agents, ...) through which rumors spread can make it easier to contain the spread of such unwanted messages [27, 37]. A basic approach would require tracing the complete route traversed by each message in the network. Here an optimum tracking set can serve as a resource-efficient solution for tracing the spread of rumors and dissolving them. Furthermore, Tracking Paths finds applications in tracking traffic movement in transport networks and tracing object movement in wireless sensor networks [1, 30].

The graph theoretic version of the problem was introduced by Banik et al. [6], wherein the authors studied the unweighted (i.e., \(w(v) = 1\) for all \(v \in V(G)\)) shortest path variant of the problem, namely Tracking Shortest Paths (i.e., the set T is required to uniquely identify each of the shortest \(s\)-\(t\) paths). They showed that this problem is NP-hard, and that it is even NP-hard to approximate within a factor of 1.1557. They also showed that Tracking Shortest Paths admits a 2-approximation algorithm for planar graphs. Later, the parameterized complexity of Tracking Paths was studied in [4], where the problem was also proven to be NP-hard.

To the best of our knowledge, Eppstein et al. [17] were the first to study approximation algorithms for the unweighted Tracking Paths. They gave a 4-approximation algorithm when the input graph is planar. Recently this result was extended by Goodrich et al. [20] to a \((1+ \varepsilon )\)-approximation algorithm for H-minor free graphs and an \(\mathcal {O}(\log \mathrm {OPT})\)-approximation algorithm for general (unweighted) graphs. They also gave an \(\mathcal {O}(\log n)\)-approximation algorithm for Tracking Paths. The existence of a constant factor approximation algorithm was posed as an open problem by Eppstein et al. [17]. In this paper we answer this affirmatively.

Theorem 1

There is a 66-approximation algorithm for Tracking Paths in weighted graphs and a 4-approximation algorithm if the input is unweighted.

There exists an interesting connection between Feedback Vertex Set (FVS) and Tracking Paths. Before we discuss this in more detail, we introduce FVS and its fault tolerant variant. Formally, for a given vertex-weighted graph \(G=(V,E)\), FVS requires finding a minimum weight set of vertices \(S\subseteq V\), referred to as a feedback vertex set (fvs), such that the graph induced by the vertex set \(V\setminus S\) does not have any cycles. FVS is a classical NP-hard problem [25] that has been thoroughly studied in graph theory. An r-fault tolerant feedback vertex set (\(r\)-ftfvs) is a set of vertices that intersects each cycle in the graph in at least \(r+1\) vertices; finding a minimum weight \(r\)-ftfvs is the \(r\)-Fault Tolerant Feedback Vertex Set problem [31]. Note that if a graph has a cycle of length less than or equal to r, then it cannot have an \(r\)-ftfvs.

Relation Between FVS and Tracking Paths. For a graph G with source s and destination t, if each vertex and edge participates in an \(s\)-\(t\) path, then we refer to G as a preprocessed graph. It is known that in a preprocessed graph, a tracking set is also a feedback vertex set [4]. Thus, the weight of a minimum feedback vertex set serves as a lower bound for the weight of a tracking set in preprocessed graphs. This lower bound has proven to be helpful in the analysis of Tracking Paths. However, approximating Tracking Paths has been challenging since the size of a tracking set can be arbitrarily larger than that of a minimum fvs.

Further, it is known [11, 17] that in a graph G, if a set of vertices T contains at least three vertices from each cycle in G, then T is a tracking set for G. Thus, a 2-fault tolerant feedback vertex set is also a tracking set. In this paper, we borrow inspiration from this concept to derive a polynomial time algorithm to compute an approximate tracking set. In particular, we start with finding an fvs for the input graph G, and then identify cycles that need more vertices as trackers additional to the ones selected as feedback vertices.

Observe that a feedback vertex set is indeed a 0-fault tolerant fvs. Misra [31] gave a 3-approximation algorithm for the problem of finding a 1-fault tolerant fvs in unweighted graphs and a 34-approximation algorithm for weighted graphs. In this paper, we give an approximation algorithm for finding an r-fault tolerant feedback vertex set, where r is a constant. We do this by using the Multicut in Forests problem (see Sect. 2) as an auxiliary problem. Misra [31] pointed out that the complexity of \(r\)-Fault Tolerant Feedback Vertex Set is not known for \(r \ge 2\) and asked for an approximation algorithm.

Theorem 2

There is an \(\mathcal {O}(r^2)\)-approximation algorithm for \(r\)-Fault Tolerant Feedback Vertex Set in weighted graphs, where r is a constant.

It is worth mentioning that our approach relies on explicit enumeration of certain cycles in the input graph G. This can be done in polynomial time if r is a constant (see Observation 9). Thus, it remains open how to approximate \(r\)-Fault Tolerant Feedback Vertex Set if r depends on the size of the input.

Motivation for (Fault Tolerant) FVS. The FVS problem is motivated by applications in deadlock recovery [18, 34], Bayesian inference [7], VLSI design [24], and other areas. Fault tolerant solutions are crucial to real world applications that are prone to failure of nodes in a network or entities in a system [33]. In the case of FVS, the failure corresponds to not being able to eliminate the node from the network.

Related Work. There has been a lot of heuristic based work on the problem of tracking moving objects in a network [29, 35, 38]. Parameterized complexity of Tracking Shortest Paths and Tracking Paths was studied in [4, 5, 8, 12, 13, 17]. Feedback Vertex Set is known to admit a 2-approximation algorithm which is tight under UGC [3, 14]. The best known parameterized algorithm for FVS runs in \(2.7^k \cdot n^{\mathcal {O}(1)}\) time, where k is the size of the solution [28]. It is worth noting that Misra [31] uses Multicut in Forests as a subroutine as well. The edge version is known to admit an LP formulation whose matrix is totally unimodular [19] if the family of paths is non-crossing; thus, it is solvable in polynomial time. A related problem is the d-Hurdle Multiway Cut for which a factor 2-approximation algorithm is known (and it is again tight under UGC) [15].

Preliminaries and Notations. We refer to [16] for the standard graph theory terminology. All paths that we consider in this work are simple paths. For a graph G, we use V(G) to denote its vertex set. For a set \(S \subseteq V(G)\) we use \(G\setminus S\) to denote the graph that results from the deletion of the vertex set S and the edges incident to vertices of S. For a weight function \(w :V(G) \rightarrow \mathbb {N}\), let w(S) for \(S \subseteq V(G)\) denote the sum of the weights of the respective elements, i.e., \(\sum _{v\in S}w(v)\). An unweighted version of the problem is obtained by assigning all vertices the same weight. In fact, we can also omit the weights in this case. We write vectors in boldface and their entries in normal font, i.e., \(x_3\) is the third entry of a vector \(\boldsymbol{x}\). If we apply minimum to two vectors, then it is applied entry-wise. We write \(\boldsymbol{1}^n\) for the vector of n ones; we omit the superscript if the dimension is clear from the context.

2 Vertex Multicut in Forests

In this section, we gather polynomial time (approximation) algorithms for solving Vertex Multicut in Forests. The algorithms are used in the subsequent sections to derive the approximation algorithms for r-Fault Tolerant FVS and Tracking Paths.

figure b

In this work, we consider the Unrestricted version of Multicut in Forests, i.e., a solution set is allowed to contain vertices from the terminal pairs. We will consider a set of paths \(\mathcal {P}\) instead of a set of terminal pairs. It is not hard to see that these two versions are equivalent on forests: if the terminals in a pair belong to different trees, we can discard the pair since there does not exist any path between them; otherwise there exists a unique path between the two terminals, which the solution S should intersect.

While MCF is NP-hard [9], the unweighted version is polynomial time solvable [9, 23]. However, recently the following strong version of constant factor approximability of the problem was shown (in fact, the result holds for all chordal graphs, not just for forests).

Consider the natural ILP formulation of the problem, where we introduce a binary variable \(x_v \in \{0,1\}\) for each vertex \(v \in V\), describing whether the corresponding vertex should or should not be taken into a solution. At least one vertex must be taken from each path from \(\mathcal {P}\) to construct a feasible solution. Consider the following LP relaxation of the natural ILP.

figure c
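The display of the relaxation is not reproduced in this text. Based on the description above (one variable per vertex and at least one unit of total value on every path of \(\mathcal {P}\)), (LP\(_{\text {MCF}}\)) presumably reads as follows; the exact layout is an assumption.

$$\begin{aligned} \text {minimize} \quad&\sum _{v \in V} w(v) \cdot x_v \\ \text {subject to} \quad&\sum _{v \in V(P)} x_v \ge 1 \quad \text {for all } P \in \mathcal {P}, \\&0 \le x_v \le 1 \quad \text {for all } v \in V. \end{aligned}$$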

Proposition 3

(Agrawal et al. [2, Lemma 5.1]). For a given instance of Multicut in Forests one can find a solution S such that \(w(S) \le 32\mathrm {OPT}\) in polynomial time, where \(\mathrm {OPT}\) is the objective value of an optimal solution to the corresponding (LP\(_{\text {MCF}}\)).

We also need the following result of similar nature for unweighted forests, strengthening the polynomial time solvability of the problem.

Lemma 4

If all the weights are equal, then there is an integral optimal solution for (LP\(_{\text {MCF}}\)). Furthermore, such a solution can be found in polynomial time.

Proof

It is enough to show the lemma for each tree T of F separately.

Root the tree T in an arbitrary vertex r. Among all optimal solutions to the LP, consider the solution \(\boldsymbol{x}\) that minimizes \(\sum _{v \in V(T)} x_v \cdot \text {dist}(v, r)\).

Assume for the sake of contradiction, that \(\boldsymbol{x}\) is not integral. Let u be a vertex with \(0< x_u < 1\) and the maximum distance from the root r. Note that, in particular, for all descendants v of u it holds that \(x_v \in \{ 0,1 \}\). Let us first assume that \(u \ne r\) and let p be the parent of u. We define \(\boldsymbol{\widehat{x}}\) as

$$ \widehat{x}_v = {\left\{ \begin{array}{ll} 0 &{}\text {if } v = u, \\ \min \{x_p+x_u,1\} &{}\text {if } v = p, \\ x_v &{} \text {otherwise.} \end{array}\right. } $$

We claim that \(\boldsymbol{\widehat{x}}\) is a solution to (LP\(_{\text {MCF}}\)). Obviously, \(0\le \widehat{x}_v \le 1\) for every \(v\in V\). Let \(P \in \mathcal {P}\). If \(u \notin V(P)\), then \(\sum _{v \in V(P)} \widehat{x}_v \ge \sum _{v \in V(P)} x_v \ge 1\). If \(u \in V(P)\), then either the path P is fully contained in the subtree of T rooted at u, or \(p \in V(P)\). In the first case, we have \(\sum _{v \in V(P)} \widehat{x}_v \ge \big (\sum _{v \in V(P)} x_v\big ) - x_u\ge 1-x_u > 0\). Since all the summands on the left hand side are integral by the choice of u, it follows that \(\sum _{v \in V(P)} \widehat{x}_v \ge 1\). In the latter case, if \(\widehat{x}_p=1\), then \(\sum _{v \in V(P)} \widehat{x}_v \ge 1\) and otherwise \(\sum _{v \in V(P)} \widehat{x}_v =\big (\sum _{v \in V(P)} x_v\big ) +x_u - x_u=\sum _{v \in V(P)} x_v\ge 1\).

Hence, \(\boldsymbol{\widehat{x}}\) is a solution to (LP\(_{\text {MCF}}\)). Furthermore, we have that \(\sum _{v \in V} w(v) \cdot \widehat{x}_v \le \sum _{v \in V} {w(v) \cdot x_v}\) (since all the weights are equal) and \( \sum _{v \in V} x_v \cdot \text {dist}(v, r) > \sum _{v \in V} \widehat{x}_v \cdot \text {dist}(v, r) \) which is a contradiction. The case \(u=r\) can be proved using a similar argument for \(\boldsymbol{\widehat{x}}\) such that \(\widehat{x}_u = 0\) and \(\widehat{x}_v=x_v\) if \(v \ne u\).

The second part of the lemma follows from the polynomial time algorithm for MCF [9], since any optimal solution to the instance of MCF represents an optimal solution for the corresponding (LP\(_{\text {MCF}}\)) by the first part of the lemma.

   \(\square \)
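The "push the value towards the root" idea in the proof above matches a standard greedy for the unweighted, unrestricted problem: process the paths in order of decreasing depth of their topmost vertex and pick that topmost vertex whenever the path is not yet hit. The sketch below illustrates this greedy; it is not claimed to be the algorithm of [9], and the function name and input format (adjacency dictionary, paths as vertex lists) are assumptions.

```python
from collections import deque


def greedy_multicut_forest(adj, paths):
    """Greedy vertex multicut for an unweighted forest.

    adj   -- adjacency dictionary of a forest
    paths -- list of paths, each given as a list of vertices of the forest
    """
    # Root every tree at the first of its vertices we encounter and record depths.
    depth = {}
    for root in adj:
        if root in depth:
            continue
        depth[root] = 0
        queue = deque([root])
        while queue:
            v = queue.popleft()
            for u in adj[v]:
                if u not in depth:
                    depth[u] = depth[v] + 1
                    queue.append(u)

    def top(path):
        # The vertex of the path closest to the root of its tree.
        return min(path, key=lambda v: depth[v])

    chosen = set()
    # Deepest topmost vertex first; pick the topmost vertex of every unhit path.
    for path in sorted(paths, key=lambda p: depth[top(p)], reverse=True):
        if not chosen.intersection(path):
            chosen.add(top(path))
    return chosen


# Hypothetical example: the path 1-2-3-4-5 rooted at vertex 1.
adj = {1: [2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}
print(greedy_multicut_forest(adj, [[1, 2, 3], [3, 4, 5], [2, 3, 4]]))
```

In the example all three paths share vertex 3, which is the topmost vertex of the deepest path, so the greedy selects only that vertex.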

For the rest of the paper we use \(\mu \) to denote the best approximation ratio achievable for MCF relative to the optimum value of (LP\(_{\text {MCF}}\)). That is, \(\mu \le 32\) for weighted instances and \(\mu =1\) for unweighted instances.

3 Approximate r-Fault Tolerant Feedback Vertex Set

In this section, we give an algorithm for computing an approximate r-fault tolerant feedback vertex set for undirected weighted graphs, for any fixed integer \(r\ge 2\). Recall that an r-fault tolerant feedback vertex set is a set of vertices that contains at least \(r+1\) vertices from each cycle in the graph. A polynomial time algorithm for computing a constant factor approximate 1-fault tolerant feedback vertex set was given by Misra [31]. The factor can be easily observed to be \(2+\mu \), where \(\mu \) is the best possible approximation ratio for MCF. We start in Sect. 3.1 by giving an algorithm for finding a 2-fault tolerant feedback vertex set. Later, in Sect. 3.2, we show how this technique can be generalized to give an r-fault tolerant feedback vertex set for any \(r\ge 3\).

3.1 Two-Fault Tolerant FVS

figure d

Let \(G=(V,E)\) be the input graph. First, we compute a \((2+\mu )\)-approximate 1-fault tolerant feedback vertex set S for G using the algorithm of Misra [31]. Note that S contains at least two vertices from each cycle in G. Our goal is to compute a vertex set that contains at least three vertices from each cycle in G. Hence, for each cycle C in G for which \(|V(C)\cap S|=2\), we need to pick at least one more vertex from C into our solution. We first identify which pairs of vertices from S are involved in such cycles. This can be done in polynomial time, by considering each pair of vertices \(\{a,b\}\subseteq S\) and checking the graph \(G_{ab}= G\setminus (S\setminus \{a,b\})\) for cycles. If no such \(G_{ab}\) contains a cycle, then we return S as a 2-fault tolerant feedback vertex set. Otherwise, there exists at least one cycle in G such that S contains exactly two vertices from it. Observe that even though S does not contain three vertices from each cycle in G, S might contain at least three vertices from some cycles in G. We shall ignore such cycles.
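The check described above can be sketched as follows, written for subsets of arbitrary size k so that k = 2 gives the pairs \(\{a,b\}\). The helper names, the edge-list representation, and the use of union-find for cycle detection are choices made for this illustration only.

```python
from itertools import combinations


def has_cycle(vertices, edges):
    """Union-find cycle test on the subgraph induced by `vertices`.

    `edges` must list every undirected edge of the simple graph exactly once.
    """
    parent = {v: v for v in vertices}

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    for u, v in edges:
        if u in parent and v in parent:
            ru, rv = find(u), find(v)
            if ru == rv:
                return True  # u and v were already connected
            parent[ru] = rv
    return False


def deficient_subsets(vertices, edges, S, k):
    """Size-k subsets X of S such that G minus (S minus X) contains a cycle."""
    rest = set(vertices) - set(S)
    return [set(X) for X in combinations(sorted(S), k)
            if has_cycle(rest | set(X), edges)]


# Hypothetical example: the 4-cycle 1-2-3-4-1 with S = {1, 3}.
vertices = [1, 2, 3, 4]
edges = [(1, 2), (2, 3), (3, 4), (4, 1)]
print(deficient_subsets(vertices, edges, {1, 3}, 2))
```

If the returned list is empty, S already contains more than k vertices of every cycle; otherwise each returned subset witnesses a cycle that needs an additional vertex.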

If a cycle C in G intersects with S at vertices a and b, then there exist two vertex-disjoint paths \(P_1\) and \(P_2\) between a and b in \(G_{ab}\), such that \(V(P_1)\cup V(P_2)=V(C)\). In order to find a 2-fault tolerant fvs that extends S, we need to ensure that at least one vertex from \((V(P_1)\cup V(P_2))\setminus \{a,b\}\) is included in the solution. Observe that paths \(P_1\) and \(P_2\) are a pair of paths between the vertices a and b in the graph \(G\setminus (S\setminus \{a,b\})\), which implies that the subpaths between the neighbors of a and b are paths in the graph \(G\setminus S\), which is a forest. As each pair of paths is uniquely determined by the neighbors of a and b on \(P_1\) and \(P_2\), there are at most \(\mathcal {O}(n^4)\) such pairs in total and all of them can be found in \(\mathcal {O}(n^5)\) time. We create a family \(\mathcal {P}\) of all such pairs of vertex disjoint paths between each pair of vertices in S. More precisely, for each pair \(P_1',P_2'\) of such paths of length at least 2 we obtain \(P_1\) and \(P_2\) by removing a and b from \(P_1'\) and \(P_2'\), respectively. Then we add the pair \(\{P_1, P_2 \}\) to \(\mathcal {P}\). If a and b are adjacent, then for each a-b-path \(P'_1\) of length at least 2 in \(G\setminus (S\setminus \{a,b\})\), we add to \(\mathcal {P}\) the pair \(\{P_1, P_1\}\), where \(P_1\) is obtained from \(P'_1\) by removing a and b.

Next, we use the following linear program to identify which path among each pair should be selected, from which a vertex would be picked in order to be included in the solution.

figure e
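The display of this linear program is not reproduced in this text. Judging from the constraint used in the proof of Observation 5 and from the variables \(x_v\) for \(v \in V \setminus S\), (LP\(_{\text {pairs}}\)) presumably reads as follows; the exact layout is an assumption.

$$\begin{aligned} \text {minimize} \quad&\sum _{v \in V \setminus S} w(v) \cdot x_v \\ \text {subject to} \quad&\sum _{v \in V(P_1) \cup V(P_2)} x_v \ge 1 \quad \text {for all } \{P_1,P_2\} \in \mathcal {P}, \\&0 \le x_v \le 1 \quad \text {for all } v \in V \setminus S. \end{aligned}$$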

We solve the above linear program in polynomial time using the Ellipsoid method [21] (see also [22, Chapter 3]). Let \(\boldsymbol{x}^*\) be an optimal solution for the above LP and \(\mathrm {OPT}_{\boldsymbol{x}^*}\) be its value.

Observation 5

Let \(S^*\) be a 2-fault tolerant fvs in G (not necessarily extending S) and let \(\mathrm {OPT}_{\boldsymbol{x}^*}\) be the optimum value of (LP\(_{\text {pairs}}\)). Then, \(\mathrm {OPT}_{\boldsymbol{x}^*} \le w(S^*)\).

Proof

We claim that the vector \(\boldsymbol{\widehat{x}}\) defined for \(v \in V \setminus S\) as

$$ \widehat{x}_v= {\left\{ \begin{array}{ll} 1 &{}\text {if } v \in S^* \\ 0 &{} \text {otherwise} \end{array}\right. } $$

constitutes a solution to (LP\(_{\text {pairs}}\)). In order to see this let \(\{ P_1, P_2 \} \in \mathcal {P}\) be a pair of paths and let C be the cycle formed by these paths together with some \(a,b \in S\). Since \(S^*\) is a 2-fault tolerant fvs, we have \(|S^* \cap V(C)| \ge 3\) and thus \(|S^* \cap (V(C) \setminus S)|= \sum _{v \in V(P_1) \cup V(P_2)} \widehat{x}_v \ge 1\) as needed. Since the objective value of \(\boldsymbol{\widehat{x}}\) is \(w(S^* \setminus S) \le w(S^*)\), we get \(\mathrm {OPT}_{\boldsymbol{x}^*} \le w(S^*)\).

   \(\square \)

Next, we create a set of paths \(\mathcal {P}_{\boldsymbol{x}^*}\). For each path \(P \in \bigcup _{\{P_1,P_2\} \in \mathcal {P}}\{P_1,P_2\}\), we include P in \(\mathcal {P}_{\boldsymbol{x}^*}\) if \(\sum _{v \in V(P)} x^*_v \ge \frac{1}{2}\) holds. Through this process, we are selecting the paths from which we will include at least one vertex in our solution.

Finally, we create an instance \((G \setminus S, w|_{(V\setminus S)},\mathcal {P}_{\boldsymbol{x}^*})\) of Multicut in Forests. Consider the corresponding LP relaxation.

figure f

Lemma 6

Let \(\boldsymbol{x}^*\) be an optimal solution of (LP\(_{\text {pairs}}\)) and let \({\mathrm {OPT}}_{\boldsymbol{x}^*}\) be its objective value. Then, \(\boldsymbol{y} = \min \{\boldsymbol{1},2\boldsymbol{x}^*\}\) is a solution to (LP\(_{\text {paths}}\)). In particular, \(\mathrm {OPT}_{\boldsymbol{y}^*} \le 2 \cdot \mathrm {OPT}_{\boldsymbol{x}^*}\) holds, where \(\mathrm {OPT}_{\boldsymbol{y}^*}\) is the value of an optimal solution to (LP\(_{\text {paths}}\)).

Proof

Recall that we have \(\sum _{v \in V(P)} x^*_v \ge \frac{1}{2}\) for every path \(P \in \mathcal {P}_{\boldsymbol{x}^*}\), by the definition of \(\mathcal {P}_{\boldsymbol{x}^*}\). Thus, we have \(\sum _{v \in V(P)} y_v = \sum _{v \in V(P)} \min \{ 1, 2 \cdot x^*_v\} \ge \min \{ 1, \sum _{v \in V(P)} 2 \cdot x^*_v\} \ge 1\) for all \(P\in \mathcal {P}_{\boldsymbol{x}^*}\) and clearly \(0 \le y_v \le 1\) for all \(v \in V \setminus S\). We conclude that \(\boldsymbol{y}\) is a solution to (LP\(_{\text {paths}}\)).    \(\square \)
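The selection of \(\mathcal {P}_{\boldsymbol{x}^*}\) and the scaling of Lemma 6 can be illustrated by the following sketch. The function name, the representation of paths as tuples of vertices, and the parameter `factor` (2 in this section, r in Sect. 3.2) are assumptions made for illustration; exact fractions are used to avoid rounding issues at the threshold.

```python
from fractions import Fraction


def select_and_scale(x, groups, factor=2):
    """Pick the paths with x-value at least 1/factor and scale x by `factor`.

    x      -- dictionary mapping each vertex to its fractional LP value
    groups -- list of groups (pairs or tuples) of paths; a path is a tuple of vertices
    """
    threshold = Fraction(1, factor)
    selected = []
    for group in groups:
        for path in group:
            if sum(x[v] for v in path) >= threshold and path not in selected:
                selected.append(path)
    # Entry-wise minimum of the all-ones vector and factor * x.
    y = {v: min(1, factor * value) for v, value in x.items()}
    return selected, y


# Hypothetical fractional solution with one pair of paths.
x = {"u": Fraction(1, 2), "v": Fraction(1, 4), "w": Fraction(1, 4)}
groups = [(("u",), ("v", "w"))]
selected, y = select_and_scale(x, groups)
for path in selected:
    print(path, sum(y[v] for v in path))
```

Every selected path receives total y-value at least 1, while the total value of y is at most twice that of x, which is the content of Lemma 6.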

We now show how to combine the \((2+\mu )\)-approximate 1-fault tolerant fvs with an approximate solution for Multicut in Forests to obtain a \((2+3\mu )\)-approximate 2-fault tolerant fvs.

Lemma 7

Let S be an \(\alpha \)-approximate 1-fault tolerant fvs and let \(\boldsymbol{y}\) be an integral solution to (LP\(_{\text {paths}}\)) of weight at most \(\mu \) times the weight of an optimal solution. Then, \(S'=S \cup \{ v \in V \setminus S \mid y_v = 1 \}\) is an \((\alpha +2\mu )\)-approximate 2-fault tolerant fvs.

Proof

Let \(S^*\) be an optimal 2-fault tolerant fvs. We know that \(w(S) \le \alpha \cdot w(S^*)\). By Observation 5 there is a solution \(\boldsymbol{x}\) to (LP\(_{\text {pairs}}\)) with \(\sum _{v \in V \setminus S} w(v)\cdot x_v \le w(S^*)\). Thus, by Lemma 6 we have \(\sum _{v \in V \setminus S} w(v)\cdot y_v \le 2 \mu \cdot w(S^*)\). In total we get

$$\begin{aligned} w(S')&=w\left( S \cup \{ v \in V \setminus S \mid y_v = 1 \} \right) \\&= w(S) + \sum _{v \in V \setminus S} w(v)\cdot y_v \\&\le \alpha \cdot w(S^*) + 2 \mu \cdot w(S^*)\\&= (\alpha +2\mu ) w(S^*) \,. \end{aligned}$$

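It is not hard to see that \(S'\) is a 2-fault tolerant fvs. Indeed, if S contains at least three vertices in a cycle of the input graph, so does \(S'\). Thus, we can focus on a cycle C with \(V(C) \cap S = \{ a,b \}\). Let \(\{P_1,P_2\} \in \mathcal {P}\) be the pair of paths obtained from C. The constraint of (LP\(_{\text {pairs}}\)) for this pair gives \(\sum _{v \in V(P_1) \cup V(P_2)} x^*_v \ge 1\), so at least one of the two paths, say \(P_1\), satisfies \(\sum _{v \in V(P_1)} x^*_v \ge \frac{1}{2}\) and hence belongs to \(\mathcal {P}_{\boldsymbol{x}^*}\). Since \(\boldsymbol{y}\) is an integral solution to (LP\(_{\text {paths}}\)), it holds that \(y_v = 1\) for some \(v \in V(P_1) \subseteq V(C) \setminus \{ a,b \}\). Therefore, we have \(\left| V(C) \cap S' \right| \ge 3\).    \(\square \)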

Corollary 8

There is a 5-approximation algorithm for unweighted 2-Fault Tolerant FVS and a 98-approximation algorithm for weighted 2-Fault Tolerant FVS.

Proof

We begin with the \((2+\mu )\)-approximation algorithm for 1-Fault Tolerant FVS by Misra [31]. In polynomial time we construct (LP\(_{\text {pairs}}\)) and obtain an optimal solution \(\boldsymbol{x}^*\) for it. Based on that we construct \(\mathcal {P}_{\boldsymbol{x}^*}\) and (LP\(_{\text {paths}}\)) in polynomial time. By Proposition 3 or Lemma 4 one can find in polynomial time an integral solution to (LP\(_{\text {paths}}\)) of weight at most \(\mu \) times the weight of an optimal solution. By Lemma 7 this solution combined with the initial 1-fault tolerant fvs gives a \((2+3\mu )\)-approximate 2-fault tolerant fvs. The algorithm works in polynomial time as it uses polynomial-time routines.    \(\square \)

3.2 Higher Fault Tolerant FVS

Now we explain the procedure to scale up the algorithm from Sect. 3.1 to compute an r-fault tolerant fvs for \(r\ge 3\).

figure g

For the rest of the section we assume that \(r\ge 3\) is a fixed constant. We follow a recursive process to compute an r-fault tolerant fvs. We start with an approximate solution S for \((r-1)\)-Fault Tolerant FVS. Note that S contains at least r vertices from each cycle in G. Similar to the algorithm in Sect. 3.1, here we identify every group of r vertices that are involved in a cycle C in G, such that \(|V(C) \cap S| = r\). Such cycles can be found by checking whether the graph \(G_X=G \setminus (S \setminus X)\), for \(X \subseteq S\) such that \(|X|=r\), contains a cycle. If no such \(G_X\) contains a cycle, then S is an r-fault tolerant fvs, and we return it as a solution. Otherwise, we focus on cycles which contain exactly r vertices from S.

We create a family \(\mathcal {P}\) of path sets in the following way. For each cycle C that contains exactly r vertices of S, labeled in cyclic order along C as vertices \(\{v_1,v_2,\dots ,v_r\}\), we consider the r paths, say \(P'_1,P'_2,\dots ,P'_r\), where \(P'_i\) starts in \(v_i\) and ends in \(v_{i+1}\) (modulo r) and \(P'_i\) is a subpath of C. We remove paths with only two vertices, and we shorten the rest by removing their end vertices, leaving us with paths \(P_1,\dots ,P_s\), where \(s \le r\) (as some paths may have been removed). If the set of paths \(\{P_1,\dots ,P_s\}\) is non-empty, then we add it to the family \(\mathcal {P}\). If the set \(\{P_1,\dots ,P_s\}\) is empty, then we have found a cycle of length r and we report that there is no r-fault tolerant feedback vertex set for G as we cannot choose at least \(r+1\) vertices on a cycle of length r.

Observation 9

Construction of \(\mathcal {P}\) can be done in \(n^{\mathcal {O}(r)}\) time.

Proof

There are no more than \(\left( {\begin{array}{c}n\\ r\end{array}}\right) \) subsets of S of order r. Each such subset X can be a part of many different cycles. To find all such cycles we take each of its r! orderings \(v_1,\dots ,v_r\) and we fix a predecessor and a successor (taken out of the respective vertex neighborhood) of each \(v_i\). This constitutes at most \(n^{2r}\) different possibilities for each ordering of X. There is at most one path from the successor of \(v_i\) to the predecessor of \(v_{i+1}\) (modulo r) in \((G\setminus S)\) as it is a forest. As the paths are now fixed, we just need to check whether they form a cycle by checking that the paths are vertex disjoint, which can be done in polynomial time. Altogether, we have \(\left( {\begin{array}{c}n\\ r\end{array}}\right) r! n^{2r+\mathcal {O}(r)} = n^{\mathcal {O}(r)}\) time.    \(\square \)

Once the family \(\mathcal {P}\) is computed, we solve the following linear program using the Ellipsoid method in polynomial time. Let \(\boldsymbol{x}^*\) be its optimal solution and \(\mathrm {OPT}_{x^*}\) its value.

figure h

Similarly to Observation 5 we get the following.

Observation 10

Let \(S^*\) be an r-fault tolerant fvs in G and let \({\text {OPT}}_{x^*}\) be the optimum value of (LP\(_{s\text {-tuples}}\)). Then, \(\mathrm {OPT}_{x^*} \le w(S^*)\).

Next, we create a set of paths \(\mathcal {P}_{\boldsymbol{x}^*}\). For each path \(P \in \bigcup _{\{P_1,\dots ,P_s\} \in \mathcal {P}}\{P_1,\dots ,P_s\}\), we include P in \(\mathcal {P}_{\boldsymbol{x}^*}\) if \(\sum _{v \in V(P )} x^*_v \ge \frac{1}{r}\). As in Sect. 3.1, we create an instance \((G \setminus S, w|_{(V\setminus S)},\mathcal {P}_{\boldsymbol{x}^*})\) of Multicut in Forests and consider its LP relaxation (LP\(_{{\text {paths}}}\)).

Lemma 11

Let S be an \((r-1)\)-fault tolerant fvs. Let \(\boldsymbol{x}^*\) be an optimal solution of (LP\(_{s\text {-tuples}}\)) and let \(\mathrm {OPT}_{\boldsymbol{x}^*}\) be its objective value. Then, \(\boldsymbol{y} = \min \{\boldsymbol{1},r \cdot \boldsymbol{x}^*\}\) is a solution to (LP\(_{{\text {paths}}}\)) for \(\mathcal {P}_{\boldsymbol{x}^*}\). In particular, \(\mathrm {OPT}\le r \cdot \mathrm {OPT}_{\boldsymbol{x}^*}\) holds, where \(\mathrm {OPT}\) is the value of an optimal solution to (LP\(_{{\text {paths}}}\)).

Proof

Recall that we have \(\sum _{v \in V(P)} x^*_v \ge \frac{1}{r}\) for each path \(P \in \mathcal {P}_{\boldsymbol{x}^*}\). Thus, we have \(\sum _{v \in V(P)} y_v = \sum _{v \in V(P)} \min \{ 1, r \cdot x^*_v\} \ge \min \{ 1, \sum _{v \in V(P)} r \cdot x^*_v\} \ge 1\) for all \(P\in \mathcal {P}_{\boldsymbol{x}^*}\) and clearly \(0 \le y_v \le 1\) for all \(v \in V \setminus S\). We conclude that \(\boldsymbol{y}\) is a solution to (LP\(_{{\text {paths}}}\)).    \(\square \)

Lemma 12

Let r be a constant. Let S be an \(\alpha \)-approximate \((r-1)\)-fault tolerant fvs and let \(\boldsymbol{y}\) be an integral solution to (LP\(_{\text {MCF}}\)) induced by \(\mathcal {P}_{\boldsymbol{x}^*}\) with weight at most \(\mu \) times the optimal. Then, \(S' = S\,\cup \, \{ v \in V \setminus S \mid y_v = 1 \}\) is an \((\alpha +\mu \cdot r)\)-approximate r-fault tolerant fvs.

Proof

Let \(S^*\) be an optimal r-fault tolerant fvs. We know that \(w(S) \le \alpha \cdot w(S^*)\). By Observation 10 there is a solution \(\boldsymbol{x}\) to (LP\(_{s\text {-tuples}}\)) with \(\sum _{v \in V \setminus S} w(v) \cdot x_v \le w(S^*)\). Thus, by Lemma 11 we have \(\sum _{v \in V \setminus S} w(v) \cdot y_v \le \mu \cdot r \cdot w(S^*)\). In total we get

$$\begin{aligned} w(S')&=w\left( S \cup \{ v \in V \setminus S \mid y_v = 1 \} \right) \\&= w(S) + \sum _{v \in V \setminus S} w(v) \cdot y_v\\&\le \alpha \cdot w(S^*) + \mu \cdot r \cdot w(S^*)\\&= (\alpha +\mu r) w(S^*) \,. \end{aligned}$$

Let us prove that \(S'\) is r-fault tolerant. If S contains at least \(r+1\) vertices in a cycle of the input graph, so does \(S'\). Thus, we can focus on a cycle C with \(V(C) \cap S = \{v_1,v_2,\dots ,v_r\}\). Let \(\{P_1,\dots ,P_s\} \in \mathcal {P}\) be the set of paths obtained from C. The constraint of (LP\(_{s\text {-tuples}}\)) for this set implies that at least one of its at most r paths, say \(P_i\), satisfies \(\sum _{v \in V(P_i)} x^*_v \ge \frac{1}{r}\) and hence belongs to \(\mathcal {P}_{\boldsymbol{x}^*}\). Since \(\boldsymbol{y}\) is an integral solution to (LP\(_{\text {paths}}\)), it holds that \(y_v = 1\) for some \(v \in V(P_i) \subseteq V(C) \setminus \{v_1,v_2,\dots ,v_r\}\). Therefore, we have \(\left| V(C) \cap S' \right| \ge r+1\) for every such C. Hence, \(S'\) is an r-fault tolerant fvs.    \(\square \)

Theorem 13

(precise version of Theorem 2). Let r be a constant. There is a \(\left( 2 + \sum _{i = 1}^r i \cdot \mu \right) \)-approximation algorithm for \(r\)-Fault Tolerant Feedback Vertex Set.

Proof

The 1-Fault Tolerant FVS problem has a \((2+\mu )\)-approximation algorithm due to Misra [31]. By Lemma 12 we add \(\mu \cdot r\) to the approximation factor when devising an r-fault tolerant fvs from an \((r-1)\)-fault tolerant fvs. It follows that for an arbitrary r we have a \((2+\mu \cdot \sum _{i=1}^r i)\)-approximation algorithm for finding an r-fault tolerant fvs.

The algorithm by Misra [31] gives the \((2+\mu )\)-approximation for 1-Fault Tolerant FVS in polynomial time. We showed how to devise an s-fault tolerant fvs from an \((s-1)\)-fault tolerant fvs. We use \(r-1\) such steps to incrementally increase the fault tolerance of the fvs. In the step of devising an s-fault tolerant fvs we use \(n^{\mathcal {O}(s)}\) time to find \(\mathcal {P}\) as seen in Observation 9, then we construct and solve (LP\(_{s\text {-tuples}}\)) in polynomial time using the Ellipsoid method and then construct (LP\(_{\text {MCF}}\)) and solve it using either Proposition 3 or Lemma 4 in polynomial time. As r is a constant we conclude that our \((2+\mu \cdot \sum _{i=1}^r i)\)-approximation algorithm for r-Fault Tolerant FVS has polynomial time complexity.    \(\square \)
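As a worked check of the factor in Theorem 13, the sum can be evaluated in closed form, which also shows the quadratic dependence on r stated in Theorem 2:

$$ 2+\mu \cdot \sum _{i=1}^{r} i = 2+\mu \cdot \frac{r(r+1)}{2}. $$

For \(r=2\) this gives \(2+3\mu \), i.e., 5 for unweighted graphs (\(\mu =1\)) and at most \(2+3\cdot 32 = 98\) for weighted graphs (\(\mu \le 32\)), in agreement with Corollary 8. For \(r=3\) the same formula gives \(2+6\mu \), i.e., 8 and at most 194, respectively.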

4 Approximate Tracking Set

In this section, we give a constant factor approximation algorithm for Tracking Paths.

Let \(G=(V,E)\) be the input graph and s and t the source and the target. We start by applying the following reduction rule on G. This can clearly be done in polynomial time.

Reduction Rule 1

(Banik et al. [4]). If there exists a vertex or an edge that does not participate in any \(s\)-\(t\) path, then delete it.

We use the term reduced graph to denote a graph that has been preprocessed using Reduction Rule 1. For the sake of simplicity, after the application of the reduction rule, we continue to refer to the reduced graph as G.

Next, we describe the local source-destination pair (local s-t pair), a concept that has proved crucial for developing efficient algorithms for Tracking Paths [4, 11, 13, 17]. For a subgraph \(G'\subseteq G\), and vertices \(a,b\in V(G')\), we say that ab is a local s-t pair for \(G'\) if

  1. there exists a path in G from s to a, say \(P_{sa}\),

  2. there exists a path in G from b to t, say \(P_{bt}\),

  3. \(V(P_{sa})\cap V(P_{bt})=\emptyset \), and

  4. \(V(P_{sa})\cap V(G')=\{a\}\) and \(V(P_{bt})\cap V(G')=\{b\}\).

Note that a subgraph can have more than one local source-destination pair. It can be verified in \(\mathcal {O}(n^2)\) time whether a pair of vertices \(a,b\in V(G')\) form a local source-destination pair for \(G'\) by checking if there exist disjoint paths from s to a and b to t in the graph \(G\setminus (V(G')\setminus \{a,b\})\), using the disjoint path algorithm from [26].

Observation 14

Let G be a graph and let \(G'\) be a subgraph of G. We can verify in polynomial time whether \(a,b \in V(G')\) is a local s-t pair for \(G'\).

We recall the following lemma from previous work.

Lemma 15

([11, Lemma 2]). In a graph G, if \(T \subseteq V(G)\) is not a tracking set for G, then there exist two \(s\)-\(t\) paths with the same sequence of trackers, and they form a cycle C in G, such that C has a local source a and a local destination b, and \(T \cap (V(C) \setminus \{ a,b \}) = \emptyset \).

Eppstein et al. [17] mentioned that a 2-fault tolerant feedback vertex set is always a tracking set. Here we use a variation of this idea to compute an approximate tracking set. Specifically, we start with a 2-approximate feedback vertex set and then identify the cycles that contain only one or two feedback vertices. We check if these cycles need more vertices as trackers and we use (LP\(_{\text {pairs}}\)) and (LP\(_{\text {MCF}}\)) explained in the previous section to add them.

Now we present the algorithm for computing a \(2(1+\mu )\)-approximate tracking set in polynomial time. We start by computing a 2-approximate feedback vertex set S on the reduced graph G using the algorithm by Bafna et al. [3]. We first check whether S is a tracking set for G by using the tracking set verification algorithm given in [4]. If it is a tracking set, we return S as the solution, otherwise we proceed further.
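To make the verification step concrete, the following sketch checks the definition of a tracking set directly: it enumerates all simple \(s\)-\(t\) paths and tests whether two of them share the same tracker sequence. This is an exponential-time illustration for small instances only and is not the verification algorithm of [4]; the function names and the adjacency-dictionary input are assumptions of this example.

```python
def all_st_paths(adj, s, t, path=None):
    """Recursively yield every simple s-t path as a list of vertices."""
    path = [s] if path is None else path
    if path[-1] == t:
        yield list(path)
        return
    for nxt in adj[path[-1]]:
        if nxt not in path:
            path.append(nxt)
            yield from all_st_paths(adj, s, t, path)
            path.pop()


def is_tracking_set(adj, s, t, trackers):
    """Return True iff every simple s-t path has a distinct tracker sequence."""
    trackers = set(trackers)
    seen = set()
    for path in all_st_paths(adj, s, t):
        # Sequence of trackers in their order along the path.
        sequence = tuple(v for v in path if v in trackers)
        if sequence in seen:
            return False
        seen.add(sequence)
    return True
```

On a graph consisting of a single cycle a, u, b, v with s attached to a and t attached to b, placing a tracker on u alone distinguishes the two \(s\)-\(t\) paths, whereas placing trackers only on a and b does not.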

If S is not a tracking set, we will find vertices on which to place additional trackers in the following way. First we identify cycles C such that \(|V(C)\,\cap \, S|=1\). Each such cycle C can be obtained by taking a vertex \(a \in S\) together with a path between a pair of its neighbors in \(G\setminus S\). For each vertex \(b \in V(C) \setminus \{a\}\) we check whether ab (or ba) is a local \(s\)-\(t\) pair for C. If this is the case, then we distinguish two cases. If a and b are adjacent on C, then let \(P_1\) be the path \(C \setminus \{a,b\}\). We add to \(\mathcal {P}\) the pair \(\{P_1,P_1\}\). If a and b are non-adjacent on C, then let \(P'_1\) and \(P'_2\) be the two paths between a and b forming the cycle C. We obtain \(P_1\) and \(P_2\) by removing a and b from \(P_1'\) and \(P_2'\), respectively. Then we add the pair \(\{P_1, P_2 \}\) to \(\mathcal {P}\).
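The enumeration of cycles with exactly one feedback vertex can be sketched as follows. Since \(G\setminus S\) is a forest, any two neighbors of a in the same tree are joined by a unique path, which together with a closes a cycle. The helper names (`forest_path`, `cycles_with_one_fvs_vertex`, `pair_for`) and the adjacency-dictionary input are hypothetical, and the sketch omits the local \(s\)-\(t\) pair test that decides whether a pair is actually added to \(\mathcal {P}\).

```python
from itertools import combinations


def forest_path(adj, banned, u, v):
    """Return the unique u-v path in the forest G minus banned, or None."""
    parent = {u: None}
    stack = [u]
    while stack:
        x = stack.pop()
        if x == v:
            break
        for y in adj[x]:
            if y not in banned and y not in parent:
                parent[y] = x
                stack.append(y)
    if v not in parent:
        return None
    path = []
    x = v
    while x is not None:
        path.append(x)
        x = parent[x]
    return path[::-1]


def cycles_with_one_fvs_vertex(adj, S):
    """Yield (a, path): a in S and the forest path between two neighbours of a."""
    S = set(S)
    for a in S:
        nbrs = [x for x in adj[a] if x not in S]
        for u, v in combinations(nbrs, 2):
            path = forest_path(adj, S, u, v)
            if path is not None:
                yield a, path


def pair_for(path, b):
    """Split the cycle a + path at b into the pair (P_1, P_2) described above."""
    i = path.index(b)
    left, right = path[:i], path[i + 1:]
    if not left:   # b is adjacent to a: the pair is {P_1, P_1}
        return right, right
    if not right:  # b is adjacent to a on the other side
        return left, left
    return left, right
```

For a cycle a, u, w, v with \(S=\{a\}\), choosing \(b=w\) yields the pair of single-vertex paths (u) and (v), while choosing \(b=u\) yields the pair \(\{P_1,P_1\}\) with \(P_1 = (w, v)\).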

If a cycle C in G intersects with S in vertices a and b, then there exist two vertex-disjoint paths \(P'_1\) and \(P'_2\) between a and b, such that \(V(P'_1)\cup V(P'_2)=V(C)\). Hence, each such cycle C is uniquely determined by the neighbors of a and b on \(P'_1\) and \(P'_2\). If, furthermore, ab (or ba) is a local \(s\)-\(t\) pair for C, then we add some pair to \(\mathcal {P}\). In particular, if both \(P_1',P_2'\) are of length at least 2 we obtain \(P_1\) and \(P_2\) by removing a and b from \(P_1'\) and \(P_2'\), respectively. Then we add the pair \(\{P_1, P_2 \}\) to \(\mathcal {P}\). If one of the paths, say \(P'_2\), is of length 1 (i.e., a and b are adjacent on C), we add to \(\mathcal {P}\) the pair \(\{P_1, P_1\}\), where \(P_1\) is obtained from \(P'_1\) by removing a and b.

Similarly to Observation 9, there are at most \(n^2\) candidate cycles containing a single vertex of S, and for each of them at most n candidates for b. There are at most \(n^4\) candidate cycles containing two vertices of S. For each candidate, we check in \(\mathcal {O}(n^2)\) time whether ab (or ba) is a local \(s\)-\(t\) pair. Hence, \(\mathcal {P}\) can be obtained in \(\mathcal {O}(n^6)\) time.

Now we use (LP\(_{\text {pairs}}\)) with \(\mathcal {P}\) to identify the paths on which we want to place at least one additional tracker. Let \(\boldsymbol{x^*}\) be an optimal solution of (LP\(_{\text {pairs}}\)), which can be obtained in polynomial time using the Ellipsoid method. We construct \(\mathcal {P}_{x^*}\) as the set of all paths P that belong to some pair in \(\mathcal {P}\) and satisfy \(\sum _{v \in V(P)} x^*_v \ge \frac{1}{2}\).
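The thresholding step can be sketched in a few lines. Each pair constraint of (LP\(_{\text {pairs}}\)) forces the total value on \(V(P_i)\cup V(P_j)\) to be at least 1, so at least one path of every pair carries value at least \(\frac{1}{2}\) and is selected. The function name, the representation of paths as tuples, and the tolerance `eps` (added to absorb floating-point error of an LP solver) are assumptions of this illustration.

```python
def threshold_paths(pairs, x_star, threshold=0.5, eps=1e-9):
    """Collect every path from the pairs whose fractional weight is >= threshold.

    pairs  -- iterable of (P_i, P_j), each path a tuple of vertices
    x_star -- dict mapping a vertex to its value in the fractional LP solution
    """
    selected = []
    for pair in pairs:
        for path in pair:
            weight = sum(x_star.get(v, 0.0) for v in path)
            # Keep each qualifying path once, even if it occurs in several pairs.
            if weight >= threshold - eps and path not in selected:
                selected.append(path)
    return selected
```

The selected paths then serve as the demand paths for the multicut instance on the forest \(G \setminus S\).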

We first show the following observation.

Observation 16

Let G be a reduced graph and \(T^*\) be a tracking set for G. Let \(\mathrm {OPT}_{x^*}\) be the optimum value of (LP\(_{\text {pairs}}\)). Then \(\mathrm {OPT}_{x^*} \le w(T^*)\).

Proof

Let \(\boldsymbol{\widehat{x}}\) be a vector such that for all \(v \in V(G)\setminus S\)

$$ \widehat{x}_v= {\left\{ \begin{array}{ll} 1 &{}\text {if } v \in T^* \\ 0 &{} \text {otherwise.} \end{array}\right. } $$

We show that \(\boldsymbol{\widehat{x}}\) is a feasible solution to (LP\(_{\text {pairs}}\)). Suppose, for contradiction, that \(\sum _{v \in V(P_i)\,\cup \, V(P_j)} \widehat{x}_v < 1\) for some \({\{P_i, P_j\} \in \mathcal {P}}\). Since \(\boldsymbol{\widehat{x}}\) is a 0/1 vector, this means that \(T^* \cap (V(P_i) \,\cup \, V(P_j)) = \emptyset \). Let a and b be the vertices such that \(G[V(P_i) \cup V(P_j) \cup \{a, b\}]\) contains a cycle C for which ab is a local \(s\)-\(t\) pair. Such a and b must exist because of the way \(\mathcal {P}\) was constructed.

Since ab is a local \(s\)-\(t\) pair, there exist two distinct \(s\)-\(t\) paths \(\widehat{P}_1\) and \(\widehat{P}_2\) such that \(\overrightarrow{T^*_{\widehat{P}_1}}=\overrightarrow{T^*_{\widehat{P}_2}}\): both reach a from s and reach t from b, but one reaches b from a via \(P_i\) and the other via \(P_j\) (or via the edge ab when \(P_i = P_j\)). This contradicts the assumption that \(T^*\) is a tracking set. Hence \(\boldsymbol{\widehat{x}}\) is feasible, and since its objective value is at most \(w(T^*)\), we obtain \(\mathrm {OPT}_{x^*} \le w(T^*)\).    \(\square \)

To decide which additional vertices to include in the solution, we compute the minimum multicut in forests on \((G \setminus S,w|_{(V\setminus S)}, \mathcal {P}_{x^*})\) using (LP\(_{\text {MCF}}\)). Let \(\boldsymbol{y^*}\) be an integral solution of (LP\(_{\text {MCF}}\)) of weight at most \(\mu \) times the optimal one. By Proposition 3 or Lemma 4 such a solution can be obtained in polynomial time. Let X be the set of vertices v such that \(y^*_v = 1\). Then we claim that \(S \cup X\) is a \(2(1+\mu )\)-approximate solution to Tracking Paths.

Lemma 17

Let S be a 2-approximate fvs in G and let \(\boldsymbol{y}\) be a solution to (LP\(_{\text {MCF}}\)) induced by \(\mathcal {P}_{\boldsymbol{x}^*}\) with weight at most \(\mu \) times the optimal. The set \(T = S \cup \left\{ v \in V \setminus S \mid y_v = 1 \right\} \) is a \(2(1+\mu )\)-approximate solution to Tracking Paths on G.

Proof

Let \(X = \left\{ v \in V \setminus S \mid y_v = 1 \right\} \). First we show that \(T=S \cup X\) is a tracking set. Suppose there exists a pair of distinct \(s\)-\(t\) paths \(P_1, P_2\) in G that are not distinguished by T, i.e., \(\overrightarrow{T_{P_1}}=\overrightarrow{T_{P_2}}\). Then, by Lemma 15, there exists a cycle C such that \(V(C) \subseteq V(P_1) \cup V(P_2)\) and a local \(s\)-\(t\) pair \(a,b\in V(C)\) such that \(T\cap (V(C)\setminus \{a,b\})=\emptyset \). Since \(S \subseteq T\) and S is a feedback vertex set, C contains at least one vertex of S, and every vertex of \(V(C) \cap S\) is either a or b. Hence C is one of the cycles enumerated by the algorithm when constructing \(\mathcal {P}\), and therefore there must be a \(v \in (V(C) \setminus \{a,b\}) \cap X\), contradicting the choice of C.

Now we show that w(T) is at most \(2(1+\mu )\) times the weight of a minimum tracking set in G. Let \(t^*\) be the weight of a minimum tracking set in G and \(f^*\) be the weight of a minimum feedback vertex set in G. Furthermore, let \(\mathrm {OPT}_{x^*}\) be the objective value of an optimal solution to (LP\(_{\text {pairs}}\)) and \(\mathrm {OPT}_{y^*}\) be the objective value of an optimal solution to (LP\(_{\text {MCF}}\)).

We claim that \(w(S) \le 2 f^* \le 2 t^*\). The first inequality follows from the fact that S is a 2-approximate feedback vertex set for G. The second inequality follows from the fact that every tracking set is also a feedback vertex set [4]. It also holds that \(w(X) \le \mu \cdot \mathrm {OPT}_{y^*} \le 2\mu \cdot \mathrm {OPT}_{x^*} \le 2\mu \cdot t^*\). Here the first inequality follows from the choice of \(\boldsymbol{y}\), the second one from Lemma 6, and the third one from Observation 16.

Together this gives us \(w(T) = w(S) + w(X) \le 2 t^* + 2 \mu \cdot t^* = 2(1+\mu ) \cdot t^*\).    \(\square \)

The result is summed up as follows.

Theorem 18

(precise version of Theorem 1). There exists a \(2(1+\mu )\)-approximation algorithm for Tracking Paths; in particular, there is a 4-approximation algorithm for unweighted graphs and a 66-approximation algorithm for weighted graphs.
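As a quick sanity check of the two constants, one can substitute into the ratio \(2(1+\mu )\). The values \(\mu = 1\) and \(\mu = 32\) used below are obtained here simply by inverting the stated ratios; we assume they correspond to the multicut guarantees referred to via Proposition 3 and Lemma 4.

$$ 2(1+\mu )\big |_{\mu = 1} = 2\cdot 2 = 4, \qquad 2(1+\mu )\big |_{\mu = 32} = 2\cdot 33 = 66. $$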