Continuum Armed Bandit Problem of Few Variables in High Dimensions

Tyagi, Hemant; Gärtner, Bernd

doi:10.1007/978-3-319-08001-7_10


Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 8447)

Included in the following conference series:

International Workshop on Approximation and Online Algorithms


Abstract

We consider the stochastic and adversarial settings of continuum armed bandits where the arms are indexed by $[0,1]^d$. The reward functions $r:[0,1]^d \to \mathbb{R}$ are assumed to intrinsically depend on at most $k$ coordinate variables, implying $r(x_1,\dots,x_d) = g(x_{i_1},\dots,x_{i_k})$ for distinct and unknown $i_1,\dots,i_k \in \{1,\dots,d\}$ and some locally Hölder continuous $g:[0,1]^k \to \mathbb{R}$ with exponent $\alpha \in (0,1]$. Firstly, assuming $(i_1,\dots,i_k)$ to be fixed across time, we propose a simple modification of the CAB1 algorithm where we construct the discrete set of sampling points to obtain a bound of $O(n^{\frac{\alpha+k}{2\alpha+k}} (\log n)^{\frac{\alpha}{2\alpha+k}} C(k,d))$ on the regret, with $C(k,d)$ depending at most polynomially in $k$ and sub-logarithmically in $d$. The construction is based on creating partitions of $\{1,\dots,d\}$ into $k$ disjoint subsets and is probabilistic, hence our result holds with high probability. Secondly, we extend our results to also handle the more general case where $(i_1,\dots,i_k)$ can change over time and derive regret bounds for the same.
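The partition-based construction of sampling points can be illustrated with a minimal sketch. It is not the chapter's algorithm: the function names, the uniform random assignment of coordinates to blocks, the grid resolution `m`, and the number of partitions drawn are all assumptions made for illustration. The idea shown is that if some partition places the k active coordinates into k distinct blocks, then setting all coordinates of a block to a common grid value yields points whose restriction to the active coordinates covers the full k-dimensional grid.

```python
import itertools
import random


def random_partition(d, k, rng):
    """Assign each coordinate 0..d-1 independently and uniformly to one of k blocks.

    Returns a list `labels` with labels[i] in {0, ..., k-1} (hypothetical encoding).
    """
    return [rng.randrange(k) for _ in range(d)]


def separates(labels, active):
    """True if the active coordinates fall into pairwise distinct blocks."""
    blocks = [labels[i] for i in active]
    return len(set(blocks)) == len(blocks)


def sampling_points(partitions, k, m):
    """Build a discrete arm set in [0,1]^d.

    For each partition and each grid point in {0, 1/m, ..., 1}^k, every
    coordinate in block j receives the j-th grid value.
    """
    grid = [j / m for j in range(m + 1)]
    points = set()
    for labels in partitions:
        for values in itertools.product(grid, repeat=k):
            points.add(tuple(values[b] for b in labels))
    return sorted(points)


if __name__ == "__main__":
    rng = random.Random(0)          # fixed seed, only for reproducibility
    d, k, m = 6, 2, 2               # illustrative sizes, not from the chapter
    partitions = [random_partition(d, k, rng) for _ in range(8)]
    active = (1, 4)                 # the unknown coordinates (i_1, i_2) in this toy run
    hit = any(separates(p, active) for p in partitions)
    arms = sampling_points(partitions, k, m)
    print(f"{len(arms)} arms in dimension {d}; active pair separated: {hit}")
```

A finite-armed bandit strategy would then be run over the returned arms; that step is omitted here. How many partitions are needed so that every possible set of k active coordinates is separated with high probability is what drives the C(k,d) factor in the bound, and the value 8 above is only a placeholder.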



Author information

Authors and Affiliations

Institute of Theoretical Computer Science, ETH Zürich (ETHZ), CH-8092, Zürich, Switzerland
Hemant Tyagi & Bernd Gärtner


Editor information

Editors and Affiliations

Computer Technology Institute and Press “Diophantus” & Department of Computer Engineering and Informatics, University of Patras, 26504, Rio, Greece
Christos Kaklamanis
University of Pittsburgh, 15260, Pittsburgh, PA, USA
Kirk Pruhs


Cite this paper

Tyagi, H., Gärtner, B. (2014). Continuum Armed Bandit Problem of Few Variables in High Dimensions. In: Kaklamanis, C., Pruhs, K. (eds) Approximation and Online Algorithms. WAOA 2013. Lecture Notes in Computer Science, vol 8447. Springer, Cham. https://doi.org/10.1007/978-3-319-08001-7_10


DOI: https://doi.org/10.1007/978-3-319-08001-7_10
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-08000-0
Online ISBN: 978-3-319-08001-7
eBook Packages: Computer Science, Computer Science (R0)
