Adaptivity Gaps for the Stochastic Boolean Function Evaluation Problem

Conference paper

Approximation and Online Algorithms (WAOA 2022)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13538)

Abstract

We consider the Stochastic Boolean Function Evaluation (SBFE) problem where the task is to efficiently evaluate a known Boolean function f on an unknown bit string x of length n. We determine f(x) by sequentially testing the variables of x, each of which is associated with a cost of testing and an independent probability of being true. If a strategy for solving the problem is adaptive in the sense that its next test can depend on the outcomes of previous tests, it has lower expected cost but may take up to exponential space to store. In contrast, a non-adaptive strategy may have higher expected cost but can be stored in linear space and benefit from parallel resources. The adaptivity gap, the ratio between the expected cost of the optimal non-adaptive and adaptive strategies, is a measure of the benefit of adaptivity. We present lower bounds on the adaptivity gap for the SBFE problem for popular classes of Boolean functions, including read-once DNF formulas, read-once formulas, and general DNFs. Our bounds range from \(\varOmega (\log n)\) to \(\varOmega (n/\log n)\), contrasting with recent O(1) gaps shown for symmetric functions and linear threshold functions.

Notes

  1.

    The term series-parallel circuits (systems) refers to a set of parallel circuits that are connected in series (see, e.g., [11, 31]). Viewed as graphs, they correspond to the subset of two-terminal series-parallel graphs whose st-connectivity functions correspond to read-once CNF formulas. We note that Kowshik used the term “series-parallel graph” in a non-standard way to refer only to this subset; Fu et al., in citing Kowshik, used the term the same way [13, 27].

  2.

    Some definitions of a read-once formula allow negations in the internal nodes of the formula. By De Morgan’s laws, these negations can be “pushed” into the leaves of the formula, resulting in a formula whose internal nodes are \(\vee \) and \(\wedge \), such that each variable \(x_i\) appears in at most one leaf.

  3.

    Notice that W is similar to the function \(y e^y\) inverted by the Lambert W function, after changing the base of the logarithm and substituting \(y=\log (w^{1-\epsilon })\) [6]; see the sketch below.
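
As a hedged aside on Footnote 3: the balance \(\ell \cdot 2^\ell = n\) between term length and term count used in the proof of Theorem 5 below can be solved in closed form with the Lambert W function. The following sketch is our own illustration (it assumes SciPy's `scipy.special.lambertw`, which the paper does not use):

```python
# Sketch: solve l * 2^l = n for l via the Lambert W function.
# With v = l*ln(2), the equation becomes v*e^v = n*ln(2), so v = W(n*ln(2)).
import math
from scipy.special import lambertw

n = 2**20
v = lambertw(n * math.log(2)).real
l = v / math.log(2)
print(l)  # ~16.0; indeed 16 * 2**16 = 2**20
```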

References

  1. Agarwal, A., Assadi, S., Khanna, S.: Stochastic submodular cover with limited adaptivity. In: Chan, T.M. (ed.) Proceedings of the Thirtieth Annual ACM-SIAM Symposium on Discrete Algorithms, SODA, pp. 323–342. SIAM (2019)

  2. Asadpour, A., Nazerzadeh, H., Saberi, A.: Stochastic submodular maximization. In: Papadimitriou, C., Zhang, S. (eds.) WINE 2008. LNCS, vol. 5385, pp. 477–489. Springer, Heidelberg (2008). https://doi.org/10.1007/978-3-540-92185-1_53

  3. Boros, E., Ünlüyurt, T.: Sequential testing of series-parallel systems of small depth. In: Laguna, M., Velarde, J.L.G. (eds.) Computing Tools for Modeling, Optimization and Simulation: Interfaces in Computer Science and Operations Research, pp. 39–73. Springer, Boston (2000). https://doi.org/10.1007/978-1-4615-4567-5_3

  4. Boucheron, S., Lugosi, G., Massart, P.: Concentration Inequalities: A Nonasymptotic Theory of Independence. Oxford University Press, Oxford (2013)

  5. Bradac, D., Singla, S., Zuzic, G.: (Near) optimal adaptivity gaps for stochastic multi-value probing. In: Achlioptas, D., Végh, L.A. (eds.) Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques, APPROX/RANDOM 2019, 20–22 September 2019, Massachusetts Institute of Technology, Cambridge, MA, USA. LIPIcs, vol. 145, pp. 49:1–49:21. Schloss Dagstuhl - Leibniz-Zentrum für Informatik (2019)

  6. Bronstein, M., Corless, R.M., Davenport, J.H., Jeffrey, D.J.: Algebraic properties of the Lambert W function from a result of Rosenlicht and of Liouville. Integral Transform. Spec. Funct. 19(10), 709–712 (2008)

  7. Dean, B.C., Goemans, M.X., Vondrák, J.: Approximating the stochastic knapsack problem: the benefit of adaptivity. In: Proceedings of 45th Symposium on Foundations of Computer Science (FOCS 2004), 17–19 October 2004, Rome, Italy, pp. 208–217. IEEE Computer Society (2004)

  8. Dean, B.C., Goemans, M.X., Vondrák, J.: Adaptivity and approximation for stochastic packing problems. In: Proceedings of the Sixteenth Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2005, Vancouver, British Columbia, Canada, 23–25 January 2005, pp. 395–404. SIAM (2005)

  9. Dean, B.C., Goemans, M.X., Vondrák, J.: Approximating the stochastic knapsack problem: the benefit of adaptivity. Math. Oper. Res. 33(4), 945–964 (2008)

  10. Deshpande, A., Hellerstein, L., Kletenik, D.: Approximation algorithms for stochastic submodular set cover with applications to boolean function evaluation and min-knapsack. ACM Trans. Algorithms 12(3), 42:1–42:28 (2016)

  11. El-Neweihi, E., Proschan, F., Sethuraman, J.: Optimal allocation of components in parallel-series and series-parallel systems. J. Appl. Probab. 23(3), 770–777 (1986)

  12. Eppstein, D.: Parallel recognition of series-parallel graphs. Inf. Comput. 98(1), 41–55 (1992)

  13. Fu, L., Fu, X., Xu, Z., Peng, Q., Wang, X., Lu, S.: Determining source-destination connectivity in uncertain networks: modeling and solutions. IEEE/ACM Trans. Netw. 25(6), 3237–3252 (2017)

  14. Ghuge, R., Gupta, A., Nagarajan, V.: Non-adaptive stochastic score classification and explainable halfspace evaluation. CoRR abs/2111.05687 (2021)

  15. Ghuge, R., Gupta, A., Nagarajan, V.: The power of adaptivity for stochastic submodular cover. In: Proceedings of the 38th International Conference on Machine Learning, ICML (2021)

  16. Gkenosis, D., Grammel, N., Hellerstein, L., Kletenik, D.: The stochastic score classification problem. In: 26th Annual European Symposium on Algorithms, ESA 2018, 20–22 August 2018, Helsinki, Finland, pp. 36:1–36:14 (2018)

  17. Goemans, M., Vondrák, J.: Stochastic covering and adaptivity. In: Correa, J.R., Hevia, A., Kiwi, M. (eds.) LATIN 2006. LNCS, vol. 3887, pp. 532–543. Springer, Heidelberg (2006). https://doi.org/10.1007/11682462_50

  18. Greiner, R., Hayward, R., Jankowska, M., Molloy, M.: Finding optimal satisficing strategies for and-or trees. Artif. Intell. 170(1), 19–58 (2006)

  19. Gupta, A., Nagarajan, V., Singla, S.: Algorithms and adaptivity gaps for stochastic probing. In: Krauthgamer, R. (ed.) Proceedings of the Twenty-Seventh Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2016, Arlington, VA, USA, 10–12 January 2016, pp. 1731–1747. SIAM (2016)

  20. Gupta, A., Nagarajan, V., Singla, S.: Adaptivity gaps for stochastic probing: submodular and XOS functions. In: Klein, P.N. (ed.) Proceedings of the Twenty-Eighth Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2017, Barcelona, Spain, Hotel Porta Fira, 16–19 January 2017, pp. 1688–1702. SIAM (2017)

  21. Happach, F., Hellerstein, L., Lidbetter, T.: A general framework for approximating min sum ordering problems. INFORMS J. Comput. 34(3), 1437–1452 (2022). https://doi.org/10.1287/ijoc.2021.1124

  22. Harris, T.E.: The Theory of Branching Processes, vol. 6. Springer, Berlin (1963)

  23. Harvey, N.J., Patrascu, M., Wen, Y., Yekhanin, S., Chan, V.W.: Non-adaptive fault diagnosis for all-optical networks via combinatorial group testing on graphs. In: IEEE INFOCOM 2007–26th IEEE International Conference on Computer Communications, pp. 697–705. IEEE (2007)

  24. Hellerstein, L., Kletenik, D., Lin, P.: Discrete stochastic submodular maximization: adaptive vs. non-adaptive vs. offline. In: Proceedings of the 9th International Conference on Algorithms and Complexity (CIAC) (2015)

  25. Kaplan, H., Kushilevitz, E., Mansour, Y.: Learning with attribute costs. In: Gabow, H.N., Fagin, R. (eds.) Proceedings of the 37th Annual ACM Symposium on Theory of Computing, Baltimore, MD, USA, 22–24 May 2005, pp. 356–365. ACM (2005)

  26. Kaplan, H., Kushilevitz, E., Mansour, Y.: Learning with attribute costs. In: Proceedings of the 37th Annual ACM Symposium on Theory of Computing, (STOC), pp. 356–365 (2005)

  27. Kowshik, H.J.: Information aggregation in sensor networks. University of Illinois at Urbana-Champaign (2011)

  28. Liva, G., Paolini, E., Chiani, M.: Optimum detection of defective elements in non-adaptive group testing. In: 2021 55th Annual Conference on Information Sciences and Systems (CISS), pp. 1–6. IEEE (2021)

  29. O’Donnell, R.: Analysis of Boolean Functions. Cambridge University Press, Cambridge (2014)

  30. Ünlüyurt, T.: Sequential testing of complex systems: a review. Discret. Appl. Math. 142(1–3), 189–205 (2004)

  31. Wikipedia: Series and parallel circuits – Wikipedia, the free encyclopedia (2022). Accessed 8 Feb 2022

Author information

Correspondence to R. Teal Witter.

A Additional Proofs

Proof

(Proof of Theorem 1). Suppose f is a read-once DNF formula. We will prove that for unit costs and the uniform distribution, there is a non-adaptive strategy S such that \(\text {cost}(f,S) \le O(\log n) \cdot \textsf{OPT}_{\mathcal {A}}(f)\).

Let m be the number of terms in f. Because each variable \(x_i\) appears in at most one term, we have that \(m \le n\). As a warm-up, we begin by proving adaptivity gaps for two special cases of f.

Case 1: All Terms have at Most \(2\log n\) Variables. Under the uniform distribution and with unit costs, the \(p_i\) are all equal, and the \(c_i\) are all equal. Thus in this case, the optimal adaptive strategy described previously tests terms in increasing order of length. The adaptive strategy skips in the sense that if it finds a variable in a term that is false, it moves to the next term without testing the remaining variables in the term. Suppose we eliminate skipping from the optimal adaptive strategy, making the strategy non-adaptive. Since all terms have at most \(2\log n\) variables, this increases the testing cost for any fixed x by a factor of at most \(2 \log n\), leading to an adaptivity gap of at most \(2 \log n\).
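
To make the comparison concrete, the following Monte Carlo sketch (our own illustration, not code from the paper; the instance and function name are hypothetical) evaluates a read-once DNF whose terms all have \(2\log_2 n\) variables, once with skipping and once without, under unit costs and the uniform distribution:

```python
# Compare the adaptive strategy (skips the rest of a term after a false
# variable) with the non-adaptive strategy obtained by removing skipping.
import math
import random

def average_costs(term_lengths, trials=100_000, seed=0):
    rng = random.Random(seed)
    total_a = total_na = 0
    for _ in range(trials):
        cost_a = cost_na = 0
        for length in sorted(term_lengths):   # shorter terms first
            bits = [rng.random() < 0.5 for _ in range(length)]
            if all(bits):                     # term true => f(x) = 1, stop
                cost_a += length
                cost_na += length
                break
            cost_a += bits.index(False) + 1   # adaptive skips after a false
            cost_na += length                 # non-adaptive tests the whole term
        total_a += cost_a
        total_na += cost_na
    return total_a / trials, total_na / trials

n = 256
k = int(2 * math.log2(n))                     # term length 2*log2(n) = 16
a, na = average_costs([k] * (n // k))
print(f"adaptive ~ {a:.1f}, non-adaptive ~ {na:.1f}, "
      f"ratio {na / a:.2f} <= 2*log2(n) = {k}")
```

(The non-adaptive count slightly overcharges the final term when \(f(x)=0\), which only loosens the illustrated ratio.)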

Case 2: All Terms have More than \(2\log n\) Variables. Consider the following non-adaptive strategy that operates in two phases. In Phase 1, the strategy tests a fixed subset of \(2 \log n\) variables from each term, where the terms are taken in increasing length order. In Phase 2, it tests the remaining untested variables in a fixed arbitrary order. Since each term has more than \(2\log n\) variables, the value of f can only be determined in Phase 1 if a false variable is found in each term during that phase.

Say that an assignment x is bad if the value of f cannot be determined in Phase 1, meaning that a false variable is not found in every term during the phase. The probability that a random x satisfies all the tested \(2 \log n\) variables of a particular term is \(1/n^2\). Then, by the union bound, the probability that x is bad is at most \(m/n^2 \le n/n^2 = 1/n\).

Now let us focus on the good (not bad) assignments x. For each good x, our strategy must find a false variable in each term of f, which requires at least one test per term for any adaptive or non-adaptive strategy. The cost incurred by our non-adaptive strategy on a good x is at most \(2m\log n\), since the strategy certifies that \(f(x)=0\) by the end of Phase 1. Therefore, the expected cost incurred by our non-adaptive strategy S is

$$\begin{aligned} \text{cost}(f, S)&\le \Pr (x \text{ good}) \cdot \mathbb {E}[\text{cost}(f,x,S) \mid x \text{ good}] \\&\quad +\Pr (x \text{ bad}) \cdot \mathbb {E}[\text{cost}(f,x,S) \mid x \text{ bad}] \\&\le 1 \cdot 2m\log n + \frac{1}{n} \cdot n \le 3m \log n \end{aligned}$$

using the fact that \(\mathbb {E}[\text {cost}(f,x,S) | x \text{ bad}] \le n\), since there are only n tests, with unit costs.
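
The two-phase accounting can be checked numerically. The sketch below is our own (names are hypothetical); it charges the full Phase 1 block of \(2\log_2 n\) tests per term and all n unit-cost tests on bad assignments, exactly as in the bound above, and estimates both \(\Pr(x \text{ bad})\) and the expected cost:

```python
# Estimate Pr(bad) and E[cost] for the two-phase non-adaptive strategy.
import math
import random

def two_phase(n, term_len, trials=200_000, seed=1):
    k = int(2 * math.log2(n))       # variables tested per term in Phase 1
    m = n // term_len               # number of disjoint terms, each > k long
    rng = random.Random(seed)
    bad = 0
    total = 0.0
    for _ in range(trials):
        # x is bad iff some term has all k tested variables true
        if any(all(rng.random() < 0.5 for _ in range(k)) for _ in range(m)):
            bad += 1
            total += n              # Phase 2: charge all n unit-cost tests
        else:
            total += m * k          # f certified false within Phase 1
    return bad / trials, total / trials, m, k

n = 1024
pr_bad, avg, m, k = two_phase(n, term_len=3 * int(math.log2(n)))
print(f"Pr(bad) ~ {pr_bad:.1e} <= m/n^2 = {m / n**2:.1e}")
print(f"E[cost] ~ {avg:.0f} <= 3*m*log2(n) = {3 * m * math.log2(n):.0f}")
```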

The expected cost of any strategy, including the optimal adaptive strategy, is at least

$$\begin{aligned} \textsf{OPT}_\mathcal {A}(f)&\ge \min _{S \in \mathcal {A}} \Pr (f(x)=0) \cdot \mathbb {E}[\text{cost}(f,x,S) \mid f(x)=0] \ge \Pr (f(x)=0) \cdot m \\&= (1-\Pr (f(x)=1)) \cdot m \ge (1-\Pr (x \text{ bad})) \cdot m \ge \left( 1- \frac{1}{n} \right) \cdot m \ge \frac{m}{2} \end{aligned}$$

for \(n \ge 2\). It follows that the adaptivity gap is at most \(6 \log n\).

Case 3: Everything Else. We now generalize the ideas in the above two cases. Let f be a read-once DNF that does not fall into Case 1 or Case 2. We can break this DNF into two smaller DNFs, \(f = f_1 \vee f_2\) where \(f_1\) contains the terms of f of length at most \(2\log n\) and \(f_2\) contains the terms of f of length greater than \(2 \log n\).

Let S be the non-adaptive strategy that first applies the strategy in Case 1 to \(f_1\) and then, if \(f_1(x)=0\), the strategy in Case 2 to \(f_2\). Since S cannot stop testing until it determines the value of f, in the case that \(f_1(x)=0\), it will test all variables in \(f_1\) and then proceed to test the variables of \(f_2\).

Let \(S^*\) be the optimal adaptive strategy for evaluating read-once DNFs, described above. We know \(S^*\) will test terms in non-decreasing order of length since all tests are equivalent. So, like S, \(S^*\) tests \(f_1\) first and then, if \(f_1(x)=0\), it continues to \(f_2\). It follows that we can write the expected cost of S on f as

$$\begin{aligned} \mathbb {E}[\text {cost}(f,x,S)] = \mathbb {E}[\text {cost}(f_1,x,S_1)] + \Pr (f_1(x)=0) \cdot \mathbb {E}[\text {cost}(f_2, x, S_2)|f_1(x)=0] \end{aligned}$$

where \(S_1\) is the first stage of S, where \(f_1\) is evaluated, and \(S_2\) is the second stage of S, where \(f_2\) is evaluated. Notice that, by the independence of variables, \(\mathbb {E}[\text {cost}(f_2, x, S_2)|f_1(x)=0] = \mathbb {E}[\text {cost}(f_2, x, S_2)]\). We can similarly write the expected cost of \(S^*\) on f. Then the adaptivity gap is

$$\begin{aligned} \frac{\textsf{OPT}_\mathcal {N}(f)}{\textsf{OPT}_\mathcal {A}(f)} \le \frac{\mathbb {E}[\text {cost}(f_1,x,S_1)] + \Pr (f_1(x)=0) \cdot \mathbb {E}[\text {cost}(f_2, x, S_2)]}{\mathbb {E}[\text {cost}(f_1,x,S^*_1)] + \Pr (f_1(x)=0) \cdot \mathbb {E}[\text {cost}(f_2, x, S^*_2)]} \end{aligned}$$
(6)

where \(S^*_1\) is \(S^*\) applied to \(f_1\) and \(S^*_2\) is \(S^*\) beginning from the point when it starts evaluating \(f_2\).

Using the observation that \((a+b)/(c+d) \le \max \{a/c, b/d\}\) for positive real numbers \(a, b, c, d\), we know that

$$\begin{aligned} (6) \le \max \left\{ \frac{\mathbb {E}[\text {cost}(f_1,x,S_1)]}{\mathbb {E}[\text {cost}(f_1,x,S^*_1)]}, \frac{\mathbb {E}[\text {cost}(f_2,x,S_2)]}{\mathbb {E}[\text {cost}(f_2,x,S^*_2)]} \right\} = O(\log n) \end{aligned}$$

where the upper bound follows from the analysis of Cases 1 and 2.
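
As a quick spot check of the mediant-style inequality invoked above (our own snippet, not part of the paper):

```python
# (a+b)/(c+d) <= max(a/c, b/d) for positive reals: random spot checks.
import random

rng = random.Random(3)
for _ in range(10_000):
    a, b, c, d = (rng.uniform(0.01, 10.0) for _ in range(4))
    assert (a + b) / (c + d) <= max(a / c, b / d) + 1e-12
print("mediant inequality held on all samples")
```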

Proof

(Proof of Theorem 2). Suppose f is a read-once DNF formula. For unit costs and the uniform distribution, we will show that \(\textsf{OPT}_\mathcal {N}(f) \ge \varOmega (\log n) \cdot \textsf{OPT}_\mathcal {A}(f)\).

For ease of notation, assume \(\sqrt{n}\) is an integer. Consider a read-once DNF f with \(\sqrt{n}\) terms where each term has \(\sqrt{n}\) variables. By examining the number of tests made in each term, we can bound the optimal adaptive cost as

$$\begin{aligned} \textsf{OPT}_\mathcal {A}(f) \le \sqrt{n} \sum _{i=1}^{\sqrt{n}} \frac{i}{2^i} \le \sqrt{n} \sum _{i=1}^{\infty } \frac{i}{2^i} = 2 \sqrt{n}. \end{aligned}$$

The key observation is that, within a term, the adaptive strategy may query the variables in any order, since each variable is equivalent to the others. Then the probability that the strategy queries exactly \(i \le \sqrt{n}\) variables is \(1/2^i\).
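
As a numeric sanity check of this bound on \(\textsf{OPT}_\mathcal{A}(f)\) (our own snippet), the truncated series is already essentially at its limit of 2:

```python
# sum_{i=1}^{sqrt(n)} i/2^i approaches sum_{i>=1} i/2^i = 2 from below.
import math

n = 10_000
s = math.isqrt(n)
truncated = sum(i / 2**i for i in range(1, s + 1))
print(truncated, "<= 2")                     # ~2.0 (from below)
print("OPT_A <=", s * truncated, "<=", 2 * s)
```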

Next, we will lower bound the expected cost of the optimal non-adaptive strategy

$$\begin{aligned} \textsf{OPT}_\mathcal {N}(f)&= \min _{S \in \mathcal {N}} \mathbb {E}_{x \sim \{0,1\}^n}[\text {cost}(f,x,S)] \\&\ge \min _{S \in \mathcal {N}} \Pr (f(x)=0) \mathbb {E}[\text {cost}(f,x,S)|f(x)=0] \end{aligned}$$

where \(x \sim \{0,1\}^n\) indicates x is drawn from the uniform distribution. First, we know \(\Pr (f(x)=0) \ge .5\). To see this, consider a random input \(x \sim \{0,1\}^n\). The probability that a particular term is true is \(1/2^{\sqrt{n}}\) so the probability that all terms are false (i.e., \(f(x)=0\)) is

$$\begin{aligned} \left( 1-\frac{1}{2^{\sqrt{n}}} \right) ^{\sqrt{n}} = \left( \left( 1-\frac{1}{2^{\sqrt{n}}} \right) ^{2^{\sqrt{n}}} \right) ^{\sqrt{n}/2^{\sqrt{n}}} \ge \left( \frac{1}{2e}\right) ^{\sqrt{n}/2^{\sqrt{n}}} \ge .5 \end{aligned}$$

where the first inequality follows from the loose lower bound that \((1-1/x)^x \ge 1/(2e)\) when \(x \ge 2\) and the second inequality follows when \(n \ge 8\). Second, we know

$$\begin{aligned}&\mathbb {E}[\text{cost}(f,x,S) \mid f(x)=0] \\&\ge \Pr (\text{one term needs } \varOmega (\log n) \text{ tests} \mid f(x)=0) \cdot \frac{\log _4 n}{2} \cdot \frac{\sqrt{n}}{2} \end{aligned}$$

where we used the symmetry of the terms to conclude that if any term needs \(\varOmega (\log n)\) tests to evaluate it then any non-adaptive strategy will have to spend \(\varOmega (\log n)\) on half the terms in expectation.

All that remains is to lower bound the probability one term requires \(\varOmega (\log n)\) tests given \(f(x)=0\). Observe that this probability is

$$\begin{aligned}&1-(1-\Pr (\text{a particular term needs } \varOmega (\log n) \text{ tests} \mid f(x)=0))^{\sqrt{n}} \\&\ge 1-\left( 1-\frac{1}{\sqrt{n}}\right) ^{\sqrt{n}} \ge 1- \frac{1}{e} \ge .63 \end{aligned}$$

where we will now show the first inequality. We can write the probability that a particular term needs \(\log _4(n)/2\) tests given \(f(x)=0\) as

$$\begin{aligned} \Pr&\left( x_1=1|f(x)=0 \right) \cdots \Pr \left( x_{\log _4(n)/2}=1|f(x)=0, x_1=\cdots =x_{\log _4(n)/2-1}=1 \right) \\&= \frac{2^{\sqrt{n}-1}-1}{2^{\sqrt{n}}-1} \cdots \frac{2^{\sqrt{n}-1-\log _4(n)/2}-1}{2^{\sqrt{n}-\log _4(n)/2}-1} \ge \left( \frac{2^{\sqrt{n}-1-\log _4(n)/2}-1}{2^{\sqrt{n}-\log _4(n)/2}-1}\right) ^{\log _4(n)/2} \\&\ge \left( \frac{1}{4} \right) ^{\log _4(n)/2} = \frac{1}{\sqrt{n}}. \end{aligned}$$

For the first equality, we use the observation that conditioning on \(f(x)=0\) eliminates the possibility that every variable is true, so the probability of observing a true variable is slightly smaller. For the first inequality, notice that \((2^{i-1}-1)/(2^{i}-1)\) is monotone increasing in i. For the second, observe that \(i \ge \sqrt{n} - \log _4(n)/2\) for our purposes and so \((2^{i-1}-1)/(2^{i}-1) \ge 1/4\) when \(n \ge 16\).
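
The telescoping product can be evaluated directly. This sketch (our own, following the display above with \(\log_4(n)/2\) factors) confirms it stays above \(1/\sqrt{n}\):

```python
# Probability that the first log4(n)/2 variables of a term are all true,
# conditioned on the term being false, computed as the telescoping product.
import math

def conditional_product(n):
    s = math.isqrt(n)                        # term length sqrt(n)
    t = round(math.log2(n) / 4)              # log4(n)/2 tests
    prod = 1.0
    for j in range(t):
        i = s - j                            # untested variables remaining
        prod *= (2**(i - 1) - 1) / (2**i - 1)
    return prod

for n in (16, 256, 4096):
    print(n, conditional_product(n), ">=", 1 / math.sqrt(n))
```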

Proof

(Proof of Theorem 4). Suppose f is a read-once DNF. For unit costs and arbitrary probabilities, we prove \(\textsf{OPT}_\mathcal {N}(f) \ge \varOmega (\sqrt{n}) \cdot \textsf{OPT}_\mathcal {A}(f)\).

Consider the read-once DNF with \(m=2\sqrt{n}\) identical terms where each term has \(\ell =\sqrt{n}/2\) variables. In each term, let one variable have \(1/\ell \) probability of being true and the remaining variables have a \((\ell /m)^{1/(\ell -1)}\) probability of being true. Within a term, the optimal adaptive strategy will test the variable with the lowest probability of being true first. Using this observation, we can write

$$\begin{aligned} \textsf{OPT}_\mathcal {A}(f) \le \left[ \Pr (x_1=0) \cdot 1 + \Pr (x_1=1) \cdot \ell \right] \cdot m \\ \le \left[ (1-1/\ell ) \cdot 1 + (1/\ell ) \cdot \ell \right] \cdot m \le 4 \sqrt{n} \end{aligned}$$

where \(x_1\) is the first variable tested in each term. The first inequality follows by charging the optimal adaptive strategy for all \(\ell \) tests in the term if the first one is true. The second inequality follows since the variable with probability \(1/\ell \) of being true is tested first for \(n \ge 18\) (i.e., \(1/\ell < (\ell /m)^{1/(\ell -1)}\) for such n).
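
A short simulation (our own sketch of this construction; names are hypothetical) reproduces the adaptive upper bound, charging all \(\ell\) tests whenever the first test comes up true, as in the proof:

```python
# Monte Carlo estimate of the adaptive cost on the Theorem 4 construction.
import math
import random

def adaptive_cost(n, trials=50_000, seed=2):
    l = math.isqrt(n) // 2                  # term length sqrt(n)/2
    m = 2 * math.isqrt(n)                   # 2*sqrt(n) terms
    q = (l / m) ** (1 / (l - 1))            # probability for the other variables
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        cost = 0
        for _ in range(m):
            if rng.random() < 1 / l:        # low-probability variable is true
                cost += l                   # charge all l tests in the term
                if all(rng.random() < q for _ in range(l - 1)):
                    break                   # term true => f(x) = 1, stop
            else:
                cost += 1                   # first test falsifies the term
        total += cost
    return total / trials

n = 10_000
print(f"E[cost] ~ {adaptive_cost(n):.1f} <= 4*sqrt(n) = {4 * math.isqrt(n)}")
```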

In order to lower bound the cost of the optimal non-adaptive strategy, we will argue that there is a constant probability of an event where the non-adaptive strategy has to test \(\varOmega (n)\) variables. In particular,

$$\begin{aligned} \textsf{OPT}_\mathcal {N}(f) \ge&\min _{S \in \mathcal {N}} \Pr (\text{exactly one term is true}) \\ \cdot&\ \mathbb {E}[\text{cost}(f,x,S) \mid \text{exactly one term is true}]. \end{aligned}$$

By the symmetry of the terms, observe that

$$\mathbb {E}[\text{cost}(f,x,S) \mid \text{exactly one term is true}] \ge \sqrt{n}/2 \cdot \sqrt{n} = n/2.$$

That is, the optimal non-adaptive strategy has to search blindly for the single true term among all \(2\sqrt{n}\) terms, making \(\sqrt{n}/2\) tests each for half the terms in expectation.

All that remains is to show there is a constant probability exactly one term is true. The probability a particular term is true is \((1/\ell )((\ell /m)^{1/(\ell -1)})^{(\ell -1)} = 1/m\). Since all variables are independent, the probability that exactly one of the m terms is true is

$$\begin{aligned}&m \cdot \Pr (\text{a term is true}) \cdot \Pr (\text{a term is false})^{m-1} \\ =&\ m \cdot \frac{1}{m} \cdot \left( 1-\frac{1}{m} \right) ^{m-1} \ge \left( \frac{1}{2e} \right) ^{(m-1)/m} \ge \frac{1}{2e}. \end{aligned}$$

It follows that \(\textsf{OPT}_\mathcal {N}(f) \ge \frac{1}{2e} \cdot \frac{n}{2} = \varOmega (n)\) so the adaptivity gap is \(\varOmega (\sqrt{n})\).
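
The inequality \((1-1/m)^{m-1} \ge 1/(2e)\) used in the last step is easy to spot-check (our own snippet):

```python
# (1 - 1/m)^(m-1) tends to 1/e ~ 0.368 and stays above 1/(2e) ~ 0.184.
import math

for m in (2, 4, 16, 200, 10_000):
    print(m, (1 - 1 / m) ** (m - 1), ">=", 1 / (2 * math.e))
```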

Proof

(Proof of Theorem 5). Suppose f is a read-once formula. For arbitrary costs and the uniform distribution, we show that \(\textsf{OPT}_\mathcal {N}(f) \ge \varOmega (n^{1-\epsilon }/\log n) \cdot \textsf{OPT}_\mathcal {A}(f)\).

Define \(W(w) := w^{1-\epsilon } \log _2(w^{1-\epsilon })\) for positive real numbers w (see Footnote 3). We will choose \(n_\epsilon \) in terms of the function W so that \(W(n) < n\) for \(n \ge n_\epsilon \). First, consider the first and second derivatives of W:

$$\begin{aligned} W'(w)&= \frac{1-\epsilon }{w^\epsilon } \left( \log _2(w^{1-\epsilon }) + \frac{1}{\log 2} \right) \\ W''(w)&= \frac{1-\epsilon }{w^{1+\epsilon }} \left[ -\epsilon \left( \log _2(w^{1-\epsilon }) + \frac{1}{\log 2}\right) + \frac{1-\epsilon }{\log 2} \right] . \end{aligned}$$

For fixed \(\epsilon > 0\), observe that as w goes to infinity, \(W(w) < w\), \(W'(w) < 1\), and \(W''(w) < 0\). Therefore there is some point \(n_\epsilon \) so that for all \(n \ge n_\epsilon \), the slope of W is decreasing, the slope of W is less than the slope of n, and W(n) is less than n. Equivalently, \(n \ge W(n) = n^{1-\epsilon }\log _2(n^{1-\epsilon })\). We will use this inequality when lower bounding the asymptotic behavior of the adaptivity gap.

For \(n \ge n_\epsilon \) we construct the n-variable read-once DNF formula f as follows. First, let \(r_n\) be a real number such that \(n = n^{1-r_n}\log _2(n^{1-r_n})\). We know that \(r_n\) exists for all \(n \ge 4\) by continuity since \(n^{1-0} \log _2(n^{1-0}) \ge n \ge n^{1-1} \log _2(n^{1-1})\). Let f be the read-once DNF formula with m terms of length \(\ell \), where \(\ell = \log _2(n^{1-r_n})\) and \(m=2^{\ell }\). Thus the total number of variables in f is \(m\ell = n^{1-r_n}\log _2(n^{1-r_n})=n\) as desired. We assume for simplicity that \(\ell \) is an integer. The bound holds by a similar proof without this assumption.
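
The exponent \(r_n\) has no closed form in elementary functions, but it is easy to find numerically. The bisection below (our own sketch, exploiting that \(n^{1-r}\log_2(n^{1-r})\) decreases in r on [0, 1]) recovers the parameters for \(n = 2^{20}\):

```python
# Solve n = n^(1-r) * log2(n^(1-r)) for r by bisection.
import math

def solve_r(n, iters=60):
    g = lambda r: n ** (1 - r) * (1 - r) * math.log2(n)  # = n^(1-r)*log2(n^(1-r))
    lo, hi = 0.0, 1.0                # g(lo) >= n >= g(hi)
    for _ in range(iters):
        mid = (lo + hi) / 2
        lo, hi = (mid, hi) if g(mid) >= n else (lo, mid)
    return (lo + hi) / 2

n = 2**20
r = solve_r(n)
l = (1 - r) * math.log2(n)           # term length, integral for this n
print(round(r, 4), round(l, 4), round(2**l * l))  # 0.2, 16.0, m*l = n = 1048576
```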

To obtain our lower bound on evaluating this formula, we consider expected evaluation cost with respect to the uniform distribution and the following cost assignment: in each term, choose an arbitrary ordering of the variables and set the cost of testing the ith variable in the term to be \(2^{i-1}\).

Consider a particular term. Recall the optimal adaptive strategy for evaluating a read-once DNF formula presented at the start of Sect. 2. Within a term, this optimal strategy tests the variables in non-decreasing cost order, since each variable has the same probability of being true. Since it performs tests within a term until finding a false variable or certifying the term is true, we can upper bound the expected cost of this optimal adaptive strategy in evaluating f as follows:

$$\begin{aligned} \textsf{OPT}_\mathcal {A}(f) \le m \cdot \left[ \frac{1}{2}\cdot (1) + \frac{1}{4}\cdot (1+2) + \ldots + \frac{1}{2^\ell }\cdot (1+\ldots +2^{\ell -1}) \right] \le m \cdot \ell . \end{aligned}$$
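
Continuing the \(n = 2^{20}\) example from the previous sketch (ours), the bracketed sum is indeed below \(\ell\), since each summand \((1/2^i)(2^i-1) = 1 - 2^{-i}\) is below 1:

```python
# Check m * sum_i (1/2^i)(1 + ... + 2^(i-1)) <= m * l for l = 16, m = 2^16.
l, m = 16, 2**16
inner = sum((1 / 2**i) * (2**i - 1) for i in range(1, l + 1))
print(m * inner, "<=", m * l)
```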

In contrast, the optimal non-adaptive strategy does not have the advantage of stopping tests in a term when it finds a false variable. We will lower bound the expected cost of the optimal non-adaptive strategy in the case that exactly one term is true. By symmetry, any non-adaptive strategy will have to randomly search for the true term and so pay \(2^{\ell }\) for half the terms in expectation.

All that remains is to show there is a constant probability exactly one term is true. The probability that a particular term is true is \(1/2^{\ell }\) and so the probability that exactly one term is true is

$$\begin{aligned} m \cdot \frac{1}{2^\ell } \cdot \left( 1-\frac{1}{2^\ell }\right) ^{m-1} \ge \frac{m}{2^\ell } \cdot \left( \frac{1}{2e} \right) ^{(m-1)/2^\ell } \ge \frac{1}{2e} \end{aligned}$$

where the last inequality follows since \(m=2^\ell \). Then the expected cost \(\textsf{OPT}_\mathcal {N}(f)\) of the optimal non-adaptive strategy is at least

$$\begin{aligned} \Pr (\text{ exactly } \text{ one } \text{ term } \text{ is } \text{ true}) \cdot 2^\ell \cdot \frac{m}{2} = \varOmega (m \cdot 2^\ell ) = \varOmega (m \cdot n^{1-r_n}) \ge \varOmega (m \cdot n^{1-\epsilon }) \end{aligned}$$

where we used that \(2^\ell = n^{1-r_n}\) and \(n^{1-r_n} \log _2(n^{1-r_n}) = n \ge n^{1-\epsilon } \log _2(n^{1-\epsilon })\) since \(n \ge n_\epsilon \). It follows that the adaptivity gap is \(\varOmega (n^{1-\epsilon }/\log n)\).

Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Hellerstein, L., Kletenik, D., Liu, N., Witter, R.T. (2022). Adaptivity Gaps for the Stochastic Boolean Function Evaluation Problem. In: Chalermsook, P., Laekhanukit, B. (eds) Approximation and Online Algorithms. WAOA 2022. Lecture Notes in Computer Science, vol 13538. Springer, Cham. https://doi.org/10.1007/978-3-031-18367-6_10

  • DOI: https://doi.org/10.1007/978-3-031-18367-6_10

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-18366-9

  • Online ISBN: 978-3-031-18367-6
