Deep reinforcement learning in computer vision: a comprehensive survey

  • Published in: Artificial Intelligence Review

Abstract

Deep reinforcement learning augments the reinforcement learning framework with the powerful representations of deep neural networks. Recent work has demonstrated remarkable successes of deep reinforcement learning in various domains, including finance, medicine, healthcare, video games, robotics, and computer vision. In this work, we provide a detailed review of recent and state-of-the-art research advances in deep reinforcement learning for computer vision. We start by reviewing the theories of deep learning, reinforcement learning, and deep reinforcement learning. We then propose a categorization of deep reinforcement learning methodologies and discuss their advantages and limitations. In particular, we divide deep reinforcement learning methods into seven main categories according to their applications in computer vision: (i) landmark localization; (ii) object detection; (iii) object tracking; (iv) registration on both 2D images and 3D volumetric image data; (v) image segmentation; (vi) video analysis; and (vii) other applications. Each of these categories is further analyzed in terms of reinforcement learning techniques, network design, and performance. Moreover, we provide a comprehensive analysis of the existing publicly available datasets and examine source code availability. Finally, we present some open issues and discuss future research directions for deep reinforcement learning in computer vision.

References

  • Aaron W, Alan F, Prasad T (2014) Using trajectory data to improve bayesian optimization for reinforcement learning. J Mach Learn Res 15(8):253–282

    MathSciNet  MATH  Google Scholar 

  • Abeel P, Ng AY (2004) Apprenticeship learning via inverse reinforcement learning. In Proceedings of the twenty-first international conference on machine learning, pp 1–8. Association for Computing Machinery

  • Adam C, Pieter A, Andrew YN (2009) Apprenticeship learning for helicopter control. Commun ACM 52(7):97–105

    Article  Google Scholar 

  • Agogino AK, Tumer K (2004) Unifying temporal and structural credit assignment problems. In Proceedings of the third international joint conference on autonomous agents and multiagent systems–vol 2, pp 980–987. IEEE Computer Society

  • Al WA, Yun ID (2019) Partial policy-based reinforcement learning for anatomical landmark localization in 3d medical images. IEEE Trans Med Image

  • Al WA, Yun Io, Lee KJ (2019) Reinforcement learning-based automatic diagnosis of acute appendicitis in abdominal ct. ar**v preprint ar**v:1909.00617

  • Alaniz S (2018) Deep reinforcement learning with model learning and monte carlo tree search in minecraft. In Conference on reinforcement learning and decision making

  • Amir A, Ozan O, Yuanwei L, Loic LF, Benjamin H, Ghislain V, Konstantinos K, Athanasios V, Ben G, Bernhard K et al (2019) Evaluating reinforcement learning agents for anatomical landmark detection. Med Image Anal 53:156–164

    Article  Google Scholar 

  • Amir Shahroudy, Jun Liu, Tian-Tsong Ng, and Gang Wang. Ntu rgb+ d: A large scale dataset for 3d human activity analysis. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 1010–1019, 2016

  • Andersson O, Heintz F, Doherty P (2015) Model-based reinforcement learning in continuous environments using real-time constrained optimization. In AAAI

  • Arulkumaran K, Deisenroth MP, Brundage M, Bharath AA (2017) A brief survey of deep reinforcement learning. ar**v preprint ar**v:1708.05866

  • Avinash Ramakanth S, Venkatesh Babu R (2014) Seamseg: Video object segmentation using patch seams. In Proceedings of the IEEE conference on computer vision and pattern recognition, pp 376–383

  • Ayle M, Tekli J, El-Zini J, El-Asmar B, Awad M (2020) Bar-a reinforcement learning agent for bounding-box automated refinement

  • Babaeizadeh M, Frosio I, Tyree S, Clemons J, Kautz J (2016) GA3C: gpu-based A3C for deep reinforcement learning. arxiv:CoRR:abs/1611.06256

  • Babenko B, Yang M-H, Belongie S (2009) Visual tracking with online multiple instance learning. In 2009 IEEE conference on computer vision and pattern recognition, pp 983–990. IEEE

  • Bae S-H, Yoon K-J (2014) Robust online multi-object tracking based on tracklet confidence and online discriminative appearance learning. In Proceedings of the IEEE conference on computer vision and pattern recognition, pp 1218–1225

  • Bagnell J (2012) Learning decision: Robustness, uncertainty, and approximation. 04

  • Bagnell JA, Schneider JG (2001) Autonomous helicopter control using reinforcement learning policy search methods. In Proceedings 2001 ICRA. IEEE international conference on robotics and automation (Cat. No.01CH37164), vol 2, pp 1615–1620

  • Barron JT (2019) A general and adaptive robust loss function. In Proceedings of the IEEE conference on computer vision and pattern recognition, pp 4331–4339

  • Bellver M, Giró-i Nieto X, Marqués F, Torres J (2016) Hierarchical object detection with deep reinforcement learning. ar**v preprint ar**v:1611.03718

  • Bergmann P, Fauser M, Sattlegger D, Steger C (2019) Mvtec ad a comprehensive real-world dataset for unsupervised anomaly detection. In 2019 IEEE/CVF conference on computer vision and pattern recognition (CVPR), pp 9584–9592

  • Bertinetto L, Valmadre J, Henriques JF, Vedaldi A, Torr Philip HS (2016) Fully-convolutional siamese networks for object tracking. In European conference on computer vision, pp 850–865. Springer

  • Black MJ, Yacoob Y (1995) Tracking and recognizing rigid and non-rigid facial motions using local parametric models of image motion. In Proceedings of IEEE international conference on computer vision, pp 374–381. IEEE

  • Bloch N, Madabhushi A, Huisman H, Freymann J, Kirby J, Grauer M, Enquobahrie A, Jaffe C, Clarke L, Farahani K (2013) challenge: automated segmentation of prostate structures. Cancer Imag Arch 370:2015

    Google Scholar 

  • Boedecker J, Springenberg JT, Wlfing J, Riedmiller M (2014) Approximate real-time optimal control based on sparse gaussian process models. In 2014 IEEE symposium on adaptive dynamic programming and reinforcement learning (ADPRL), pp 1–8

  • Brazil G, Liu X (2019) M3d-rpn: Monocular 3d region proposal network for object detection. In Proceedings of the IEEE international conference on computer vision, Seoul, South Korea,

  • Bredell G, Tanner C, Konukoglu E (2018) Iterative interaction training for segmentation editing networks. In International workshop on machine learning in medical imaging, pp 363–370. Springer

  • Buetti-Dinh A, Galli V, Bellenberg S, Ilie O, Herold M, Christel S, Boretska M, Pivkin Igor V, Wilmes P, Sand W, Vera M, Dopson M (2019) Deep neural networks outperform human experts capacity in characterizing bioleaching bacterial biofilm composition. Biotechnol Rep 22:e00321

    Article  Google Scholar 

  • Busoniu L, Babuska R, De Schutter B (2008) A comprehensive survey of multiagent reinforcement learning. IEEE Trans Syst Man Cybern C 38(2):156–172

    Article  Google Scholar 

  • Caelles S, Maninis K-K, Pont-Tuset J, Leal-Taixé L, Cremers D, Van Gool L (2017) One-shot video object segmentation. In Proceedings of the IEEE conference on computer vision and pattern recognition, pp 221–230

  • Caicedo JC, Lazebnik S (2015) Active object localization with deep reinforcement learning. In Proceedings of the IEEE international conference on computer vision, pp 2488–2496

  • Carrera D, Manganini F, Boracchi G, Lanzarone E (2017) Defect detection in sem images of nanofibrous materials. IEEE Trans Ind Inf 13(2):551–561

    Article  Google Scholar 

  • Carsten R, Vladimir K, Andrew B (2004) ‘grabcut’ interactive foreground extraction using iterated graph cuts. ACM Trans Graph 23(3):309–314

    Article  Google Scholar 

  • Chen B, Wang D, Li P, Wang S, Lu H (2018) Real-time’actor-critic’tracking. In Proceedings of the European conference on computer vision (ECCV), pp 318–334

  • Cheng J, Tsai Y-H, Wang S, Yang M-H (2017) Segflow: Joint learning for video object segmentation and optical flow. In Proceedings of the IEEE international conference on computer vision, pp 686–695

  • Cher B, Pyry H, Vincenzo DP, Claudia C, Anthony BA (2017) Detection of axonal synapses in 3d two-photon images. PLoS ONE 12(9):e0183309

    Article  Google Scholar 

  • Cho K, Van Merriënboer B, Gulcehre C, Bahdanau D, Bougares F, Schwenk H, Bengio Y (2014) Learning phrase representations using rnn encoder-decoder for statistical machine translation. ar**v preprint ar**v:1406.1078

  • Cho K, van Merrienboer B, Gülçehre Ç, Bougares F, Schwenk H, Bengio Y (2014) Learning phrase representations using RNN encoder-decoder for statistical machine translation. arxiv:CoRR:abs/1406.1078

  • Choi J, ** Chang H, Yun S, Fischer T, Demiris Y, Young Choi J (2017) Attentional correlation filter network for adaptive visual tracking. In Proceedings of the IEEE conference on computer vision and pattern recognition, pp 4807–4816

  • Choi W (2015) Near-online multi-target tracking with aggregated local flow descriptor. In Proceedings of the IEEE international conference on computer vision, pp 3029–3037

  • Chorowski J, Bahdanau D, Serdyuk D, Cho KH, Bengio Y (2015) Attention-based models for speech recognition. arxiv:CoRR:abs/1506.07503

  • Chu Q, Ouyang W, Li H, Wang X, Liu B, Yu N (2017) Online multi-object tracking using cnn-based single object tracker with spatial-temporal attention mechanism. In Proceedings of the IEEE international conference on computer vision, pp 4836–4845

  • Chu W-H, Kitani KM (2020) Neural batch sampling with reinforcement learning for semi-supervised anomaly detection. In European conference on computer vision, pp 751–766

  • Chu W-S, Song Y, Jaimes A (2015) Video co-summarization: video summarization by visual co-occurrence. In Proceedings of the IEEE conference on computer vision and pattern recognition, pp 3584–3592

  • Clavera I, Rothfuss J, Schulman J, Fujita Y, Asfour T, Abbeel P (2018) Model-based reinforcement learning via meta-policy optimization. arxiv:CoRR:abs/1809.05214

  • Comaniciu D, Ramesh V, Meer P (2000) Real-time tracking of non-rigid objects using mean shift. In Proceedings IEEE conference on computer vision and pattern recognition. CVPR 2000 (Cat. No. PR00662), vol 2, pp 142–149. IEEE

  • Concetto S, Simone P, Daniela G (2016) Gamifying video object segmentation. IEEE Trans Pattern Anal Mach Intell 39(10):1942–1958

    Google Scholar 

  • Cordts M, Omran M, Ramos S, Rehfeld T, Enzweiler M, Benenson R, Franke U, Roth S, Schiele B (2016) The cityscapes dataset for semantic urban scene understanding. In Proceedings of the IEEE conference on computer vision and pattern recognition, pp 3213–3223

  • Coulom R (2006) Efficient selectivity and backup operators in monte-carlo tree search. In Proceedings of the 5th international conference on computers and games, pp 72–83

  • Coumans E, Bai Y (2016) Pybullet, a python module for physics simulation for games, robotics and machine learning

  • Craig Jordan V (1990) Long-term adjuvant tamoxifen therapy for breast cancer. Breast Cancer Res Treat 15(3):125–136

    Article  Google Scholar 

  • Criminisi A, Shotton J, Robertson D, Konukoglu E (2010) Regression forests for efficient anatomy detection and localization in ct studies. In International MICCAI workshop on medical computer vision, pp 106–117. Springer

  • Dai T, Dubois M, Arulkumaran K, Campbell J, Bass C, Billot B, Uslu F, De Paola V, Clopath C, Bharath AA (2019) Deep reinforcement learning for subpixel neural tracking. In International conference on medical imaging with deep learning, pp 130–150

  • Danelljan Martin, Bhat Goutam, Shahbaz Khan Fahad, Felsberg Michael (2017) Eco: efficient convolution operators for tracking. In Proceedings of the IEEE conference on computer vision and pattern recognition, pp 6638–6646

  • Danelljan M, Hager G, Shahbaz Khan F, Felsberg M (2015) Learning spatially regularized correlation filters for visual tracking. In Proceedings of the IEEE international conference on computer vision, pp 4310–4318

  • Darryl MC, Andrew M, Adnan T, Dominic K, Stuart C (2014) Fully automatic lesion segmentation in breast mri using mean-shift and graph-cuts on a region adjacency graph. J Magn Reson Imaging 39(4):795–804

    Article  Google Scholar 

  • David S, Guy L, Heess N, Wierstra D, Riedmiller M (2014) Deterministic policy gradient algorithms, Thomas Degris

  • Deisenroth MP, Englert P, Peters J, Fox D (2014) Multi-task policy search for robotics. In 2014 IEEE international conference on robotics and automation (ICRA), pp 3876–3881

  • Deng J, Dong W, Socher R, Li L-J, Li K, Fei-Fei L (2009) Imagenet: a large-scale hierarchical image database. In 2009 IEEE conference on computer vision and pattern recognition, pp 248–255. IEEE

  • Deng J, Guo J, Xue N, Zafeiriou S (2019) Arcface: Additive angular margin loss for deep face recognition. In Proceedings of the IEEE conference on computer vision and pattern recognition, pp 4690–4699

  • Denzler J, Paulus DWR (1994) Active motion detection and object tracking. In Proceedings of 1st international conference on image processing, vol 3, pp 635–639. IEEE

  • Depraetere B, Liu M, Pinte G, Grondman I, Babuka R (2014) Comparison of model-free and model-based methods for time optimal hit control of a badminton robot. Mechatronics 24(8):1021–1030

    Article  Google Scholar 

  • De Asis K, Hernandez-Garcia JF, Holland GZ, Sutton RS (2018) Multi-step reinforcement learning: a unifying algorithm. In Thirty-Second AAAI conference on artificial intelligence

  • DiPietro R, Lea C, Malpani A, Ahmidi N, Vedula SS, Lee GI, Lee MR, Hager GD (2016) Recognizing surgical activities with recurrent neural networks. In International conference on medical image computing and computer-assisted intervention, pp 551–558. Springer

  • Dollár P, Wojek C, Schiele B, Perona P (2009) Pedestrian detection: a benchmark. In 2009 IEEE conference on computer vision and pattern recognition, pp 304–311. IEEE

  • Dominik N, Saša G, Matthias J, Nassir N, Joachim H, Razvan I (2014) Probabilistic sparse matching for robust 3d/3d fusion in minimally invasive surgery. IEEE Trans Med Imaging 34(1):49–60

    Google Scholar 

  • Don M, Anup B (1994) Motion tracking with an active camera. IEEE Trans Pattern Anal Mach Intell 16(5):449–459

    Article  Google Scholar 

  • Donahue J, Hendricks LA, Guadarrama S, Rohrbach M, Venugopalan S, Saenko K, Darrell T (2014) Long-term recurrent convolutional networks for visual recognition and description. arxiv:CoRR:abs/1411.4389

  • Dosovitskiy A, Fischer P, Ilg E, Hausser P, Hazirbas C, Golkov V, Van Der Smagt P, Cremers D, Brox T (2015) Flownet: Learning optical flow with convolutional networks. In Proceedings of the IEEE international conference on computer vision, pp 2758–2766

  • Dosovitskiy A, Ros G, Codevilla F, Lopez A, Koltun V (2017) Carla: an open urban driving simulator. ar**v preprint ar**v:1711.03938

  • Du Y, Wang W, Wang L (2015) Hierarchical recurrent neural network for skeleton based action recognition. In Proceedings of the IEEE conference on computer vision and pattern recognition, pp 1110–1118

  • Dulac-Arnold G, Levine N, Mankowitz DJ, Li J, Paduraru C, Gowal S, Hester T (2020) An empirical investigation of the challenges of real-world reinforcement learning

  • Dunnhofer M, Martinel N, Luca Foresti G, Micheloni C (2019) Visual tracking by means of deep reinforcement learning and an expert demonstrator. In Proceedings of the IEEE international conference on computer vision workshops

  • Duong CN, Quach KG, Jalata I, Le N, Luu K (2019) Mobiface: a lightweight deep learning face recognition on mobile devices. In 2019 IEEE 10th international conference on biometrics theory, applications and systems (BTAS), pp 1–6. IEEE

  • Duong CN, Quach KG, Luu K, Hoang LT, Savvides M, Bui TD (2019) Learning from longitudinal face demonstration–where tractable deep modeling meets inverse reinforcement learning. 127(6–7)

  • Eddy I, Nikolaus M, Tonmoy S, Margret K, Alexey D, Thomas B (2017) Flownet 2.0: Evolution of optical flow estimation with deep networks. In Proceedings of the IEEE conference on computer vision and pattern recognition, pp 2462–2470

  • El-Fakdi A, Carreras M (2008) Policy gradient based reinforcement learning for real autonomous underwater cable tracking. In 2008 IEEE/RSJ international conference on intelligent robots and systems, pp 3635–3640

  • Elhamifar E, Sapiro G, Vidal R (2012) See all by looking at a few: Sparse modeling for finding representative objects. In 2012 IEEE conference on computer vision and pattern recognition, pp 1600–1607. IEEE

  • Erhan D, Szegedy C, Toshev A, Anguelov D (2014) Scalable object detection using deep neural networks. In Proceedings of the IEEE conference on computer vision and pattern recognition, pp 2147–2154

  • Everingham M Van Gool L, Williams CKI, Winn J, Zisserman A (2007) The pascal visual object classes challenge 2007 (voc2007) results

  • Everingham M, Winn J (2011) The pascal visual object classes challenge 2012 (voc2012) development kit. Pattern analysis, statistical modelling and computational learning, Tech. Rep, 8, 2011

  • Fan H, Lin L, Yang F, Chu P, Deng G, Yu S, Bai H, Xu Y, Liao C, Ling H (2019) Lasot: a high-quality benchmark for large-scale single object tracking. In Proceedings of the IEEE conference on computer vision and pattern recognition, pp 5374–5383

  • Fan H, Ling H (2017) Parallel tracking and verifying: a framework for real-time and high accuracy visual tracking. In Proceedings of the IEEE international conference on computer vision, pp 5486–5494

  • Felix H, Antoine B, Sumit C, Jason W (2015) The goldilocks principle: Reading children’s books with explicit memory representations. arxiv:CoRR:abs/1511.02301

  • Finn C, Tan XY, Duan Y, Darrell T, Levine S, Abbeel P (2016) Deep spatial autoencoders for visuomotor learning. In: Kragic D, Bicchi A, De Luca A (eds) 2016 IEEE international conference on robotics and automation, ICRA 2016. Stockholm, Sweden, pp 512–519

    Google Scholar 

  • Florin-Cristian G, Bogdan G, Yefeng Z, Sasa G, Andreas M, Joachim H, Dorin C (2017) Multi-scale deep reinforcement learning for real-time 3d-landmark detection in ct scans. IEEE Trans Pattern Anal Mach Intell 41(1):176–189

    Google Scholar 

  • FlorinC G, Edward K, Bogdan G, Vivek S, Yefeng Z, Joachim H, Dorin C (2016) Marginal space deep learning: efficient architecture for volumetric image parsing. IEEE Trans Med Imaging 35(5):1217–1228

    Article  Google Scholar 

  • Fontes DASE, Brandão LAP, da Antonio L Jr, de Albuquerque Araújo A, (2011) Vsumm: a mechanism designed to produce static video summaries and a novel evaluation method. Pattern Ecogn Lett 32(1):56–68

  • François-Lavet V, Henderson P, Islam R, Bellemare MG, Pineau J (2018) An introduction to deep reinforcement learning. ar**v preprint ar**v:1811.12560

  • Ganin Y, Kulkarni T, Babuschkin I, Eslami SM, Vinyals O (2018) Synthesizing programs for images using reinforced adversarial learning. ar**v preprint ar**v:1804.01118

  • Gao H, Zhuang L, Van Der Maaten L, Weinberger KQ (2017) Densely connected convolutional networks. In Proceedings of the IEEE conference on computer vision and pattern recognition, pp 4700–4708

  • Gao M, Yu R, Li A, Morariu VI, Davis LS (2018) Dynamic zoom-in network for fast object detection in large images. In Proceedings of the IEEE conference on computer vision and pattern recognition, pp 6926–6935

  • Gao Y, Vedula SS, Reiley CE, Ahmidi N, Varadarajan B, Lin HC, Tao L, Zappella L, Béjar B, Yuh DD, et al (2014) Jhu-isi gesture and skill assessment working set (jigsaws): a surgical activity dataset for human motion modeling. In Miccai workshop: M2cai, vol 3, pp 3, 2014

  • Gauriau R, Cuingnet R, Lesage D, Bloch I (2014) Multi-organ localization combining global-to-local regression and confidence maps. In International conference on medical image computing and computer-assisted intervention, pp 337–344. Springer

  • Geiger A, Lenz P, Urtasun R (2012) Are we ready for autonomous driving? the kitti vision benchmark suite. In 2012 IEEE conference on computer vision and pattern recognition, pp 3354–3361

  • Giles M (2017) Mit technology review. Google researchers have reportedly achieved ‘quantum supremacy’. http://www.technologyreview.com/f, 614416

  • Girshick R (2015) Fast r-cnn. In Proceedings of the IEEE international conference on computer vision, pp 1440–1448

  • Girshick R, Donahue J, Darrell T, Malik J (2014) Rich feature hierarchies for accurate object detection and semantic segmentation. In Proceedings of the IEEE conference on computer vision and pattern recognition, pp 580–587

  • Gkioxari G, Girshick R, Malik J (2015) Contextual action recognition with r* cnn. In Proceedings of the IEEE international conference on computer vision, pp 1080–1088

  • Gl M, Chen J, Barron JT, Hasinoff Samuel W, Durand F (2017) Deep bilateral learning for real-time image enhancement. ACM Trans Graph 36(4):1–12

    Google Scholar 

  • Goel V, Weng J, Poupart P (2018) Unsupervised video object segmentation for deep reinforcement learning. In Advances in neural information processing systems, pp 5683–5694

  • Gonzalez-Garcia A, Vezhnevets A, Ferrari V (2015) An active search strategy for efficient object class detection. In Proceedings of the IEEE conference on computer vision and pattern recognition, pp 3022–3031

  • Goodfellow I, Pouget-Abadie J, Mirza M, Xu B, Warde-Farley D, Ozair S, Courville A, Bengio Y (2014) Generative adversarial nets. In Advances in neural information processing systems, pp 2672–2680

  • Graves A, Mohamed A, Hinton GE (2013) Speech recognition with deep recurrent neural networks. arxiv:CoRR:abs/1303.5778

  • Gubern-Mérida A, Martí R, Melendez J, Hauth JL, Mann RM, Karssemeijer N, Platel B (2015) Automated localization of breast cancer in dce-mri. Med Image Anal 20(1):265–274

    Article  Google Scholar 

  • Gulrajani I, Ahmed F, Arjovsky M, Dumoulin V (2017) and Aaron C Courville. Improved training of wasserstein gans. In Advances in neural information processing systems, pp 5767–5777

  • Guo M, Lu J, Zhou J (2018) Dual-agent deep reinforcement learning for deformable face tracking. In Proceedings of the European conference on computer vision (ECCV), pp 768–783

  • Gupta A, Mendonca R, Liu YX, Abbeel P, Levine S (2018) Meta-reinforcement learning of structured exploration strategies. In Advances in neural information processing systems, pp 5302–5311

  • Gupta S, Arbelaez P, Malik J (2013) Perceptual organization and recognition of indoor scenes from rgb-d images. In Proceedings of the IEEE conference on computer vision and pattern recognition, pp 564–571

  • Gupta S, Girshick R, Arbeláez P, Malik J (2014) Learning rich features from rgb-d images for object detection and segmentation. In European conference on computer vision, pp 345–360. Springer

  • Gygli M, Grabner H, Riemenschneider H, Van Gool L (2014) Creating summaries from user videos. In European conference on computer vision, pp 505–520. Springer

  • Hamid Rezatofighi S, Milan A, Zhang Z, Shi Q, Dick A, Reid I (2015) Joint probabilistic data association revisited. In Proceedings of the IEEE international conference on computer vision, pp 3047–3055

  • Han J, Yang L, Zhang D, Chang X, Liang X (2018) Reinforcement cutting-agent learning for video object segmentation. In Proceedings of the IEEE conference on computer vision and pattern recognition, pp 9080–9089

  • Hang X, Hanchuan P (2013) App2: automatic tracing of 3d neuron morphology based on hierarchical pruning of a gray-weighted image distance-tree. Bioinformatics 29(11):1448–1454

    Article  Google Scholar 

  • Haralick RM, Shapiro LG (1985) Image segmentation techniques. Computer Vision, Graphics, and Image Processing 29(1):100–132

    Article  Google Scholar 

  • Hare S, Golodetz S, Saffari A, Vineet V, Cheng M-M, Hicks SL, Torr PHS (2015) Struck: structured output tracking with kernels. IEEE Trans Pattern Anal Mach Intell 38(10):2096–2109

    Article  Google Scholar 

  • Hariharan B, Arbeláez P, Girshick R, Malik J (2014) Simultaneous detection and segmentation. In European conference on computer vision, pp 297–312. Springer

  • Hariharan B, Arbeláez P, Girshick R, Malik J (2015) Hypercolumns for object segmentation and fine-grained localization. In Proceedings of the IEEE conference on computer vision and pattern recognition, pp 447–456

  • Haroon I, Imran S, Cody S, Mubarak S (2013) Multi-source multi-scale counting in extremely dense crowd images. In Proceedings of the IEEE conference on computer vision and pattern recognition, pp 2547–2554

  • Haroon I, Muhmmad T, Kishan A, Dong Z, Somaya A-M, Nasir R, Mubarak S (2018) Composition loss for counting, density map estimation and localization in dense crowds. In Proceedings of the European conference on computer vision (ECCV), pp 532–546

  • Hase H, Azampour MF, Tirindelli M, Paschali M, Simson W, Fatemizadeh E, Navab N (2020) Ultrasound-guided robotic navigation with deep reinforcement learning. ar**v preprint ar**v:2003.13321

  • Hasselt HV (2010) Double q-learning. In Advances in neural information processing systems, pp 2613–2621

  • Hausknecht MJ, Stone P (2015) Deep recurrent q-learning for partially observable mdps. arxiv:CoRR:abs/1507.06527

  • He K, Gkioxari G, Dollár P, Girshick R (2017) Mask r-cnn. In Proceedings of the IEEE international conference on computer vision, pp 2961–2969

  • He K, Zhang X, Ren S, Sun J (2016) Deep residual learning for image recognition. In Proceedings of the IEEE conference on computer vision and pattern recognition, pp 770–778

  • Henriques JF, Caseiro R, Martins P, Batista J (2014) High-speed tracking with kernelized correlation filters. IEEE Trans Pattern Anal Mach Intell 37(3):583–596

    Article  Google Scholar 

  • Hernandez-Leal P, Kaisers M, Baarslag T, de Cote EM (2017) A survey of learning in multiagent environments: Dealing with non-stationarity. arxiv:CoRR:abs/1707.09183

  • Le Hoang NT, Duong CN, Han L, Luu K, Quach KG, Savvides M (2018) Deep contextual recurrent residual networks for scene labeling. Pattern Recogn 80:32–41

    Article  Google Scholar 

  • Le Hoang NT, Quach KG, Luu K, Duong CN, Savvides M (2018) Reformulating level sets as deep recurrent neural network approach to semantic segmentation. IEEE Trans Image Process 27(5):2393–2407

    Article  MathSciNet  Google Scholar 

  • Hoiem D, Efros AA, Hebert M (2007) Recovering surface layout from an image. Int J Comput Vision 75(1):151–172

    Article  MATH  Google Scholar 

  • Holliday JB, Le Ngan TH (2020) Follow then forage exploration: improving asynchronous advantage actor critic. In International conference on soft computing, artificial intelligence and applications (SAI 2020), pp 107–118

  • Hu J, Shen L, Sun G (2018) Squeeze-and-excitation networks. In 2018 IEEE/CVF Conference on computer vision and pattern recognition, pp 7132–7141

  • Hu JF, Zheng WS, Lai J, Zhang J (2015) Jointly learning heterogeneous features for rgb-d activity recognition. In Proceedings of the IEEE conference on computer vision and pattern recognition, pp 5344–5352

  • Huang L, Zhao X, Huang K (2019) Got-10k: a large high-diversity benchmark for generic object tracking in the wild. IEEE Trans Pattern Anal Mach Intell

  • Humpire-Mamani GE, Setio Arnaud AA, van Ginneken B, Jacobs C (2018) Efficient organ localization using multi-label convolutional neural networks in thorax-abdomen ct scans. Phys Med Biol 63(8):085003

  • Ibanez L, Schroeder W, Ng L, Cates J (2005) The itk software guide: updated for itk version 2:4

  • Ioffe S, Szegedy C (2015) Batch normalization: accelerating deep network training by reducing internal covariate shift. ar**v preprint ar**v:1502.03167

  • Jack CR Jr, Bernstein MA, Fox NC, Thompson P, Alexander G, Harvey D, Borowski B, Britson PJ, Whitwell JL, Ward C et al (2008) The alzheimerâ??s disease neuroimaging initiative (adni): Mri methods. J Magn Reson Imag 27(4):685–691

    Article  Google Scholar 

  • Jaderberg M, Simonyan K, Zisserman A et al (2015) Spatial transformer networks. Advances in neural information processing systems 2017–2025

  • Jaderberg M, Vedaldi A, Zisserman A (2014) Deep features for text spotting. In European conference on computer vision, pp 512–528. Springe

  • Jain A, Powers A, Johnson HJ (2020) Robust automatic multiple landmark detection. In 2020 IEEE 17th international symposium on biomedical imaging (ISBI), pp 1178–1182. IEEE

  • Jain SD, Grauman K (2014) Supervoxel-consistent foreground propagation in video. In European conference on computer vision, pp 656–671. Springer

  • Jain SD, **ong B, Grauman K (2017) Fusionseg: Learning to combine motion and appearance for fully automatic segmentation of generic objects in videos. In 2017 IEEE conference on computer vision and pattern recognition (CVPR), pp 2117–2126. IEEE

  • Jampani V, Gadde R, Gehler PV (2017) Video propagation networks. In Proceedings of the IEEE conference on computer vision and pattern recognition, pp 451–461

  • Jan P, Stefan S (2008) Reinforcement learning of motor skills with policy gradients. Neural Netw 21(4):682–697

    Article  MathSciNet  Google Scholar 

  • Jang WD, Kim C-S (2017) Online video object segmentation via convolutional trident network. In Proceedings of the IEEE conference on computer vision and pattern recognition, pp 5849–5858

  • Jens Kober J, Bagnell A, Peters J (2013) Reinforcement learning in robotics: a survey. Int J Robot Res 32(11):1238–1274

    Article  Google Scholar 

  • Jia Z, Yang L, Szepesvari C, Wang M (2020) Model-based reinforcement learning with value-targeted regression. In Proceedings of the 2nd conference on learning for dynamics and control, volume 120 of proceedings of machine learning research, pp 666–686, The Cloud

  • Jialue F, Wei X, Ying W, Yihong G (2010) Human tracking using convolutional neural networks. IEEE Trans Neural Netw 21(10):1610–1623

    Article  Google Scholar 

  • Jiang M, Deng C, Pan Z, Wang L, Sun X (2018) Multiobject tracking in videos based on lstm and deep reinforcement learning. Complexity

  • Jie Z, Liang X, Feng J, ** X, Lu W, Yan S (2016) Tree-structured reinforcement learning for sequential object localization. In Advances in neural information processing systems, pp 127–135

  • **won A, Sungzoon C (2015) Variational autoencoder based anomaly detection using reconstruction probability. Spec Lect IE 2(1):1–18

    Google Scholar 

  • Joseph Redmon, Santosh Divvala, Ross Girshick, and Ali Farhadi. You only look once: Unified, real-time object detection. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 779–788, 2016

  • Jun Koh Y, Kim C-S (2017) Primary object segmentation in videos based on region augmentation and reduction. In Proceedings of the IEEE conference on computer vision and pattern recognition, pp 3442–3450

  • Justin G, Reza EM (2015) Concurrent markov decision processes for robot team learning. Eng Appl Artif Intell 39:223–234

  • Jégou S, Drozdzal M, Vazquez D, Romero A, Bengio Y (2017) The one hundred layers tiramisu: Fully convolutional densenets for semantic segmentation. In Proceedings of the IEEE conference on computer vision and pattern recognition workshops, pp 11–19

  • Kalchbrenner N, Blunsom P (2013) Recurrent continuous translation models. Association for Computational Linguistics

  • Kempka M, Wydmuch M, Runc G, Toczek J, Jaśkowski W (2016) Vizdoom: A doom-based ai research platform for visual reinforcement learning. In 2016 IEEE conference on computational intelligence and games (CIG), pp 1–8. IEEE

  • Keni B, Rainer S (2008) Evaluating multiple object tracking performance: the clear mot metrics. EURASIP J Image Video Process 2008:1–10

  • Kim KK, Cho SH, Kim HJ, Lee JY (2005) Detecting and tracking moving object using an active camera. In The 7th international conference on advanced communication technology, 2005, ICACT 2005, vol 2, pp 817–820. IEEE

  • Kirwan D (2010) Nhs fetal anomaly screening programme. National Standards and Guidance for England 18

  • Klein S, Staring M, Murphy K, Viergever MA, Pluim JPW (2009) Elastix: a toolbox for intensity-based medical image registration. IEEE Trans Med Imaging 29(1):196–205

  • Konda VR, Tsitsiklis JN (2000) Actor-critic algorithms. In Advances in neural information processing systems, pp 1008–1014

  • Krebs J, Mansi T, Delingette H, Zhang L, Ghesu FC, Miao S, Maier AK, Ayache N, Rui L, Ali K (2017) Robust non-rigid registration through agent-based action learning. In International conference on medical image computing and computer-assisted intervention, pp 344–352. Springer

  • Kristan M, Leonardis A, Matas J, Felsberg M, Pflugfelder R, Cehovin Zajc L, Vojir T, Bhat G, Lukezic A, Eldesokey A, et al (2018) The sixth visual object tracking vot2018 challenge results. In Proceedings of the European conference on computer vision (ECCV)

  • Kristan M, Matas J, Leonardis A, Felsberg M, Cehovin L, Fernandez G, Vojir T, Hager G, Nebehay G, Pflugfelder R (2015) The visual object tracking vot2015 challenge results. In Proceedings of the IEEE international conference on computer vision workshops, pp 1–23

  • Krizhevsky A, Sutskever I, Hinton GE (2017) Imagenet classification with deep convolutional neural networks. Commun ACM 60(6):84–90

  • Krizhevsky A, Sutskever I, Hinton GE (2012) Imagenet classification with deep convolutional neural networks. In Advances in neural information processing systems, pp 1097–1105

  • Krähenbühl P, Koltun V (2011) Efficient inference in fully connected crfs with gaussian edge potentials. In Advances in neural information processing systems, pp 109–117

  • Kupcsik A, Deisenroth MP, Peters J, Loh AP, Vadakkepat P, Neumann G (2017) Model-based contextual policy search for data-efficient generalization of robot skills. Artif Intell 247:415–439

  • Kupcsik A, Deisenroth M, Peters J, Neumann G (2013) Data-efficient generalization of robot skills with contextual policy search. In AAAI

  • Kurutach T, Clavera I, Duan Y, Tamar A, Abbeel P (2018) Model-ensemble trust-region policy optimization

  • Le N, Le T, Yamazaki K, Bui TD, Luu K, Savvides M (2020) Offset curves loss for imbalanced problem in medical segmentation. arXiv preprint arXiv:2012.02463

  • Le N, Yamazaki K, Truong D, Quach KG, Savvides M (2020) A multi-task contextual atrous residual network for brain tumor detection & segmentation. arXiv preprint arXiv:2012.02073

  • LeCun Y (1998) The MNIST database of handwritten digits. http://yann.lecun.com/exdb/mnist

  • LeCun Y, Bottou L, Orr GB, Müller K-R (1998) Efficient backprop. In Neural networks: Tricks of the trade, pp 9–50. Springer

  • LeCun Y, Touretzky D, Hinton G, Sejnowski T (1988) A theoretical framework for back-propagation. In Proceedings of the 1988 connectionist models summer school, pp 21–28. CMU, Pittsburgh, Pa: Morgan Kaufmann

  • Lea C, Flynn MD, Vidal R, Reiter A, Hager GD (2017) Temporal convolutional networks for action segmentation and detection. In proceedings of the IEEE conference on computer vision and pattern recognition, pp 156–165

  • Lea C, Reiter A, Vidal R, Hager GD (2016) Segmental spatiotemporal cnns for fine-grained action segmentation. In European conference on computer vision, pp 36–52. Springer

  • Lea C, Vidal R, Hager GD (2016) Learning convolutional action primitives for fine-grained action recognition. In 2016 IEEE international conference on robotics and automation (ICRA), pp 1642–1649. IEEE

  • Lea C, Vidal R, Reiter A, Hager GD (2016) Temporal convolutional networks: a unified approach to action segmentation. In European conference on computer vision, pp 47–54. Springer

  • Leal-Taixé L, Canton-Ferrer C, Schindler K (2016) Learning by tracking: Siamese cnn for robust target association. In Proceedings of the IEEE conference on computer vision and pattern recognition workshops, pp 33–40

  • Leal-Taixé L, Fenzi M, Kuznetsova A, Rosenhahn B, Savarese S (2014) Learning an image-based motion context for multiple people tracking. In Proceedings of the IEEE conference on computer vision and pattern recognition, pp 3542–3549

  • Leal-Taixé L, Milan A, Reid I, Roth S, Schindler K (2015) Motchallenge 2015: towards a benchmark for multi-target tracking. arXiv preprint arXiv:1504.01942

  • Lee JW, Park J, Jangmin O, Lee J, Hong E (2007) A multiagent approach to q-learning for daily stock trading. Trans Syst Man Cyber Part A 37(6):864–877

  • Lee H, Kim HE, Nam H (2019) Srm: a style-based recalibration module for convolutional neural networks. pp 1854–1862

  • Leibo JZ, Zambaldi VF, Lanctot M, Marecki J, Graepel T (2017) Multi-agent reinforcement learning in sequential social dilemmas. arXiv preprint arXiv:1702.03037

  • Leo B (1996) Bagging predictors. Mach Learn 24(2):123–140

  • Leo G (2006) Random walks for image segmentation. IEEE Trans Pattern Anal Mach Intell 28(11):1768–1783

  • Levine S, Koltun V (2014) Learning complex neural network policies with trajectory optimization. In Proceedings of the 31st international conference on machine learning, pp 829–837

  • Li B, Yan J, Wu W, Zhu Z, Xiaolin H (2018) High performance visual tracking with siamese region proposal network. In Proceedings of the IEEE conference on computer vision and pattern recognition, pp 8971–8980

  • Li B, Ouyang W, Sheng L, Zeng X, Wang X (2019) GS3D: an efficient 3d object detection framework for autonomous driving. In IEEE conference on computer vision and pattern recognition, CVPR 2019, Long Beach, CA, USA, June 16-20, 2019, pp 1019–1028. Computer vision foundation/IEEE

  • Li C, Zhong Q, Xie D, Pu S (2018) Co-occurrence feature learning from skeleton data for action recognition and detection with hierarchical aggregation. arXiv preprint arXiv:1804.06055

  • Li D, Chen Q (2020) Deep reinforced attention learning for quality-aware visual recognition. In European conference on computer vision, pp 493–509

  • Li G, Yu Y (2015) Visual saliency based on multiscale deep features. arXiv preprint arXiv:1503.08663

  • Li J, Luong MT, Jurafsky D (2015) A hierarchical neural autoencoder for paragraphs and documents. arXiv preprint arXiv:1506.01057

  • Li K, Rath M, Burdick JW (2018) Inverse reinforcement learning via function approximation for clinical motion analysis. In 2018 IEEE international conference on robotics and automation (ICRA), pp 610–617

  • Li L, Chu W, Langford J, Schapire RE (2010) A contextual-bandit approach to personalized news article recommendation. In Proceedings of the 19th international conference on World wide web, pp 661–670

  • Li Y, Merialdo B (2010) Multi-video summarization based on video-mmr. In 11th International workshop on image analysis for multimedia interactive services WIAMIS 10, pp 1–4. IEEE

  • Li Y, Alansary A, Cerrolaza JJ, Khanal B, Sinclair M, Matthew J, Gupta C, Knight C, Kainz B, Rueckert D (2018) Fast multiple landmark localisation using a patch-based iterative network. In International conference on medical image computing and computer-assisted intervention, pp 563–571. Springer

  • Liang-Chieh C, George P, Iasonas K, Kevin M, Alan LY (2017) Deeplab: Semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected CRFs. IEEE Trans Pattern Anal Mach Intell 40(4):834–848

  • Liao R, Miao S, de Tournemire P, Grbic S, Kamen A, Mansi T, Comaniciu D (2017) An artificial agent for robust image registration. In Thirty-First AAAI conference on artificial intelligence

  • Liao X, Li W, Xu Q, Wang X, Jin B, Zhang X, Wang Y, Zhang Y (2020) Iteratively-refined interactive 3d medical image segmentation with multi-agent reinforcement learning. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 9394–9402

  • Lillicrap TP, Hunt JJ, Pritzel A, Heess N, Erez T, Tassa Y, Silver D, Wierstra D (2015) Continuous control with deep reinforcement learning. arXiv preprint arXiv:1509.02971

  • Lin T-Y, Goyal P, Girshick R, He K, Dollár P (2017) Focal loss for dense object detection. In Proceedings of the IEEE international conference on computer vision, pp 2980–2988

  • Lindeberg T (2013) Scale-space theory in computer vision, volume 256. Springer Science & Business Media

  • Litjens G, Toth R, van de Ven W, Hoeks C, Kerkstra S, van Ginneken B, Vincent G, Guillard G, Birbeck N, Zhang J et al (2014) Evaluation of prostate segmentation algorithms for MRI: the promise12 challenge. Med Image Anal 18(2):359–373

  • Liu D, Jiang T (2018) Deep reinforcement learning for surgical gesture segmentation and classification. In International conference on medical image computing and computer-assisted intervention, pp 247–255. Springer

  • Liu H, Socher R, Xiong C (2019) Taming MAML: efficient unbiased meta-reinforcement learning. In International conference on machine learning, pp 4061–4071

  • Liu L, Hao L, Zou H, Xiong H, Cao Z, Shen C (2020) Weighing counts: sequential crowd counting by reinforcement learning

  • Liu L, Wu C, Lu J, Xie L, Zhou J, Tian Q (2020) Reinforced axial refinement network for monocular 3d object detection. In European conference on computer vision ECCV, pp 540–556

  • Liu L, Wang H, Li G, Ouyang W, Lin L (2018) Crowd counting using deep recurrent spatial-aware network. arXiv preprint arXiv:1807.00601

  • Liu T, Meng Q, Vlontzos A, Tan J, Rueckert D, Kainz B (2020) Ultrasound video summarization using deep reinforcement learning. arXiv preprint arXiv:2005.09531

  • Liu W, Salzmann M, Fua P (2019) Context-aware crowd counting. In Proceedings of the IEEE conference on computer vision and pattern recognition, pp 5099–5108

  • Liu Z, Luo P, Wang X, Tang X (2015) Deep learning face attributes in the wild. In Proceedings of the IEEE international conference on computer vision, pp 3730–3738

  • Long J, Shelhamer E, Darrell T (2015) Fully convolutional networks for semantic segmentation. In Proceedings of the IEEE conference on computer vision and pattern recognition, pp 3431–3440

  • Lorenzi M, Ayache N, Frisoni GB, Pennec X (2013) Alzheimer's Disease Neuroimaging Initiative (ADNI), Lcc-demons: a robust and accurate symmetric diffeomorphic registration algorithm. Neuroimage 81:470–483

  • Lotfi T, Tang L, Andrews S, Hamarneh G (2013) Improving probabilistic image registration via reinforcement learning and uncertainty evaluation. In International workshop on machine learning in medical imaging, pp 187–194. Springer

  • Lowe DG (2004) Distinctive image features from scale-invariant keypoints. Int J Comput Vis 60(2):91–110

  • Luo W, Sun P, Zhong F, Liu W, Zhang T, Wang Y (2017) End-to-end active object tracking via reinforcement learning. arXiv preprint arXiv:1705.10561

  • Luong T, Sutskever I, Le QV, Vinyals O, Zaremba W (2014) Addressing the rare word problem in neural machine translation. arXiv preprint arXiv:1410.8206

  • Luu K, Zhu C, Bhagavatula C, Ngan Le TH, Savvides M (2016) A deep learning approach to joint face detection and segmentation. In Advances in face detection and facial image analysis, pp 1–12. Springer

  • Ma C, Huang JB, Yang X, Yang M-H (2015) Hierarchical convolutional features for visual tracking. In Proceedings of the IEEE international conference on computer vision, pp 3074–3082

  • Ma K, Wang J, Singh V, Tamersoy B, Chang YJ, Wimmer A, Chen T (2017) Multimodal image registration with deep context reinforcement learning. In International conference on medical image computing and computer-assisted intervention, pp 240–248. Springer

  • Mahasseni B, Lam M, Todorovic S (2017) Unsupervised video summarization with adversarial lstm networks. In Proceedings of the IEEE conference on computer vision and pattern recognition, pp 202–211

  • Maicas G, Carneiro G, Bradley AP, Nascimento JC, Reid I (2017) Deep reinforcement learning for active breast lesion detection from dce-mri. In International conference on medical image computing and computer-assisted intervention, pp 665–673. Springer

  • Mao J, Xu W, Yang Y, Wang J, Yuille AL (2014) Deep captioning with multimodal recurrent neural networks (m-rnn). arXiv preprint arXiv:1412.6632

  • Martinez-Marin T, Duckett T (2005) Fast reinforcement learning for vision-guided mobile robots. In Proceedings of the 2005 IEEE international conference on robotics and automation, pp 4170–4175

  • de Marvao A, Dawes TJW, Shi W, Minas C, Keenan NG, Diamond T, Durighel G, Montana G, Rueckert D, Cook SA et al (2014) Population-based studies of myocardial hypertrophy: high resolution cardiovascular magnetic resonance atlases improve statistical power. J Cardiovasc Magn Reson 16(1):16

  • Massimiliano P, Angelo C (2017) Head pose estimation in the wild using convolutional neural networks and adaptive gradient methods. Pattern Recogn 71:132–143

  • Matas J, James S, Davison AJ (2018) Sim-to-real reinforcement learning for deformable object manipulation. arXiv preprint arXiv:1806.07851

  • Mathe S, Pirinen A, Sminchisescu C (2016) Reinforcement learning for visual object detection. In Proceedings of the IEEE conference on computer vision and pattern recognition, pp 2894–2902

  • Matsopoulos GK, Mouravliansky NA, Delibasis KK, Nikita KS (1999) Automatic retinal image registration scheme using global optimization techniques. IEEE Trans Inf Technol Biomed 3(1):47–60

  • Matteo H, Joseph M, Hado Van H, Tom S, Georg O, Will D, Dan H, Bilal P, Mohammad A, David S (2017) Rainbow: combining improvements in deep reinforcement learning. arXiv preprint arXiv:1710.02298

  • Menze BH, Jakab A, Bauer S, Kalpathy-Cramer J, Farahani K, Kirby J, Burren Y, Porz N, Slotboom J, Wiest R et al (2014) The multimodal brain tumor image segmentation benchmark (brats). IEEE Trans Med Imag 34(10):1993–2024

  • Miao S, Wang ZJ, Liao R (2016) A cnn regression approach for real-time 2d/3d registration. IEEE Trans Med Imag 35(5):1352–1363

  • Miao S, Liao R, Pfister M, Zhang L, Ordy V (2013) System and method for 3-d/3-d registration between non-contrast-enhanced cbct and contrast-enhanced ct for abdominal aortic aneurysm stenting. In International conference on medical image computing and computer-assisted intervention, pp 380–387. Springer

  • Michael FJ, West Jay B (2001) The distribution of target registration error in rigid-body point-based registration. IEEE Trans Med Imaging 20(9):917–927

  • Mikolov T, Kombrink S, Burget L, Cernocký J, Khudanpur S (2011) Extensions of recurrent neural network language model. In ICASSP, pp 5528–5531

  • Milan A, Leal-Taixé L, Reid I, Roth S, Schindler K (2016) Mot16: a benchmark for multi-object tracking. arXiv preprint arXiv:1603.00831

  • Milan A, Leal-Taixé L, Schindler K, Reid I (2015) Joint tracking and segmentation of multiple targets. In Proceedings of the IEEE conference on computer vision and pattern recognition, pp 5397–5406

  • Milan A, Rezatofighi SH, Dick A, Reid I, Schindler K (2017) Online multi-target tracking using recurrent neural networks. In Thirty-First AAAI conference on artificial intelligence

  • Minaee S, Abdolrashidi A, Su H, Bennamoun M, Zhang D (2019) Biometric recognition using deep learning: a survey. arXiv preprint arXiv:1912.00271

  • Ming-Ming C, Mitra Niloy J, Xiaolei H, Torr Philip HS, Shi-Min H (2014) Global contrast based salient region detection. IEEE Trans Pattern Anal Mach Intell 37(3):569–582

  • Mingxin J, Tao H, Zhigeng P, Haiyan W, Yinjie J, Chao D (2019) Multi-agent deep reinforcement learning for multi-object tracker. IEEE Access 7:32400–32407

  • Mnih V, Kavukcuoglu K, Silver D, Rusu AA, Veness J, Bellemare MG, Graves A, Riedmiller M, Fidjeland AK, Ostrovski G et al (2015) Human-level control through deep reinforcement learning. Nature 518(7540):529–533

  • Mnih V, Badia AP, Mirza M, Graves A, Lillicrap T, Harley T, Silver D, Kavukcuoglu K (2016) Asynchronous methods for deep reinforcement learning. In Proceedings of the 33rd international conference on machine learning, pp 1928–1937

  • Mordatch I, Mishra N, Eppner C, Abbeel P (2016) Combining model-based policy search with online model learning for control of physical humanoids. In 2016 IEEE international conference on robotics and automation (ICRA), pp 242–248

  • Morimoto J, Zeglin G, Atkeson CG (2003) Minimax differential dynamic programming: application to a biped walking robot. In Proceedings 2003 IEEE/RSJ international conference on intelligent robots and systems (IROS 2003) (Cat. No.03CH37453), vol 2, pp 1927–1932

  • Morimoto J, Atkeson CG (2009) Nonparametric representation of an approximated poincaré map for learning biped locomotion. In Autonomous robots, pp 131–144

  • Mousavian A, Anguelov D, Flynn J, Košecká J (2017) 3D bounding box estimation using deep learning and geometry. In 2017 IEEE conference on computer vision and pattern recognition (CVPR), pp 5632–5640

  • Mueller M, Smith N, Ghanem B (2016) A benchmark and simulator for UAV tracking. In European conference on computer vision, pp 445–461. Springer

  • Märki N, Perazzi F, Wang O, Sorkine-Hornung A (2016) Bilateral space video segmentation. In Proceedings of the IEEE conference on computer vision and pattern recognition, pp 743–751

  • Nagabandi A, Clavera I, Liu S, Fearing RS, Abbeel P, Levine S, Finn C (2018) Learning to adapt in dynamic, real-world environments through meta-reinforcement learning. arXiv preprint arXiv:1803.11347

  • Nair A, McGrew B, Andrychowicz M, Zaremba W, Abbeel P (2018) Overcoming exploration in reinforcement learning with demonstrations. In 2018 IEEE international conference on robotics and automation (ICRA), pp 6292–6299

  • Nam H, Han B (2016) Learning multi-domain convolutional neural networks for visual tracking. In Proceedings of the IEEE conference on computer vision and pattern recognition, pp 4293–4302

  • Narges A, Lingling T, Shahin S, Yixin G, Colin L, Bejar HB, Luca Z, Sanjeev K, René V, Hager Gregory D (2017) A dataset and benchmarks for segmentation and recognition of gestures in robotic surgery. IEEE Trans Biomed Eng 64(9):2025–2041

  • Nassif AB, Shahin I, Attili I, Azzeh M, Shaalan K (2019) Speech recognition using deep neural networks: a systematic review. IEEE Access 7:19143–19165

  • Navarro F, Sekuboyina A, Waldmannstetter D, Peeken JC, Combs SE, Menze BH (2020) Deep reinforcement learning for organ localization in ct. arXiv preprint arXiv:2005.04974

  • Neil B, Nicholas HA, Darcie Thomas E (2008) Minimal-bracketing sets for high-dynamic-range image capture. IEEE Trans Image Process 17(10):1864–1875

  • Netzer Y, Wang T, Coates A, Bissacco A, Wu B, Ng AY (2011) Reading digits in natural images with unsupervised feature learning

  • Ng AY, Russell SJ (2000) Algorithms for inverse reinforcement learning. In Proceedings of the seventeenth international conference on machine learning, ICML ’00, pp 663–670, San Francisco, CA, USA, 2000. Morgan Kaufmann Publishers Inc

  • Nguyen TT, Li Z, Silander T, Leong T-Y (2013) Online feature selection for model-based reinforcement learning. In Proceedings of the 30th international conference on machine learning, vol 28, pp I-498–I-506

  • Nhan Duong C, Quach KG, Luu K, Le N, Savvides M (2017) Temporal non-volume preserving approach to facial age-progression and age-invariant face recognition. In Proceedings of the IEEE international conference on computer vision, pp 3735–3743

  • Nicolas S, Kenji D (2003) Meta-learning in reinforcement learning. Neural Netw 16(1):5–9

  • Niedzwiedz C, Elhanany I, Liu Z, Livingston S (2008) A consolidated actor-critic model with function approximation for high-dimensional pomdps. In AAAI 2008 workshop for advancement in POMDP solvers

  • Ning Y, He S, Zhiyong W, Xing C, Zhang L-J (2019) A review of deep learning based speech synthesis. Appl Sci 9(19)

  • Noam B, Tuomas S (2019) Superhuman ai for multiplayer poker. Science 365(6456):885–890

  • Okuma K, Taleghani A, De Freitas N, Little JJ, Lowe DG (2004) A boosted particle filter: multitarget detection and tracking. In European conference on computer vision, pp 28–39. Springer

  • Olga R, Jia D, Hao S, Jonathan K, Sanjeev S, Sean M, Zhiheng H, Andrej K, Aditya K, Michael B et al (2015) Imagenet large scale visual recognition challenge. Int J Comput Vision 115(3):211–252

  • Orlando JI, Fu H, Breda JB, van Keer K, Bathula DR, Diaz-Pinto A, Fang R, Heng P-A, Kim J, Lee JH, et al (2020) Refuge challenge: a unified framework for evaluating automated methods for glaucoma assessment from fundus photographs. Med Image Anal 59:101570

  • Osa T, Pajarinen J, Neumann G, Bagnell JA, Abbeel P, Peters J (2018)

  • Papazoglou A, Ferrari V (2013) Fast object segmentation in unconstrained video. In Proceedings of the IEEE international conference on computer vision, pp 1777–1784

  • Paschalidis IC, Li K, Moazzez Estanjini R (2009) An actor-critic method using least squares temporal difference learning. In Proceedings of the 48th IEEE conference on decision and control (CDC) held jointly with 2009 28th Chinese Control Conference, pp 2564–2569

  • Peixia L, Dong W, Lijun W, Huchuan L (2018) Deep visual tracking: Review and experimental comparison. Pattern Recogn 76:323–338

  • Peng H, Ruan Z, Long F, Simpson JH, Myers EW (2010) V3d enables real-time 3d visualization and quantitative analysis of large-scale biological image data sets. Nat Biotechnol 28(4):348–353

  • Pengpeng L, Erik B, Haibin L (2015) Encoding color information for visual tracking: Algorithms and benchmark. IEEE Trans Image Process 24(12):5630–5644

  • Pengyu Z, Dong W, Lu H (2020) Multi-modal visual tracking: review and experimental comparison

  • Perazzi F, Khoreva A, Benenson R, Schiele B, Sorkine-Hornung A (2017) Learning video object segmentation from static images. In Proceedings of the IEEE conference on computer vision and pattern recognition, pp 2663–2672

  • Perazzi F, Pont-Tuset J, McWilliams B, Van Gool L, Gross M, Sorkine-Hornung A (2016) A benchmark dataset and evaluation methodology for video object segmentation. In Proceedings of the IEEE conference on computer vision and pattern recognition, pp 724–732

  • Philippe T, Michael U (2000) Optimization of mutual information for multiresolution image registration. IEEE Trans Image Process 9(12):2083–2099

  • Pieter A, Adam C, Andrew YN (2010) Autonomous helicopter aerobatics through apprenticeship learning. Int J Robot Res 29(13):1608–1639

  • Pirinen A, Sminchisescu C (2018) Deep reinforcement learning of region proposal networks for object detection. In Proceedings of the IEEE conference on computer vision and pattern recognition, pp 6945–6954

  • Pirsiavash H, Ramanan D, Fowlkes CC (2011) Globally-optimal greedy algorithms for tracking a variable number of objects. In CVPR 2011, pp 1201–1208. IEEE

  • Plaat A, Kosters W, Preuss M (2020) Deep model-based reinforcement learning for high-dimensional problems: a survey

  • Potapov D, Douze M, Harchaoui Z, Schmid C (2014) Category-specific video summarization. In European conference on computer vision, pp 540–555. Springer

  • Pourreza-Shahri R, Kehtarnavaz N (2015) Exposure bracketing via automatic exposure selection. In 2015 IEEE international conference on image processing (ICIP), pp 320–323. IEEE

  • Prest A, Leistner C, Civera J, Schmid C, Ferrari V (2012) Learning object class detectors from weakly annotated video. In 2012 IEEE Conference on computer vision and pattern recognition, pp 3282–3289. IEEE

  • Qi Y, Zhang S, Qin L, Yao H, Huang Q, Lim J, Yang M-H (2016) Hedged deep tracking. In Proceedings of the IEEE conference on computer vision and pattern recognition, pp 4303–4311

  • Rakelly K, Zhou A, Finn C, Levine S, Quillen D (2019) Efficient off-policy meta-reinforcement learning via probabilistic context variables. In International conference on machine learning, pp 5331–5340

  • Panda R, Das A, Wu Z, Ernst J, Roy-Chowdhury AK (2017) Weakly supervised summarization of web videos. In Proceedings of the IEEE international conference on computer vision, pp 3657–3666

  • Redmon J, Farhadi A (2018) Yolov3: An incremental improvement. arXiv preprint arXiv:1804.02767

  • Ren L, Lu J, Wang Z, Tian Q, Zhou J (2018) Collaborative deep reinforcement learning for multi-object tracking. In Proceedings of the European conference on computer vision (ECCV), pp 586–602

  • Ren L, Yuan X, Lu J, Yang M, Zhou J (2018) Deep reinforcement learning with iterative shift for visual tracking. In Proceedings of the European conference on computer vision (ECCV), pp 684–700

  • Ren S, He K, Girshick R, Sun J (2015) Faster r-cnn: Towards real-time object detection with region proposal networks. In Advances in neural information processing systems, pp 91–99

  • Reza M, Kosecka J, et al (2016) Reinforcement learning for semantic segmentation in indoor scenes. arXiv preprint arXiv:1606.01178

  • Richard A, Gall J (2016) Temporal action detection using a statistical language model. In Proceedings of the IEEE conference on computer vision and pattern recognition, pp 3131–3140

  • Rochan M, Ye L, Wang Y (2018) Video summarization using fully convolutional sequence networks. In Proceedings of the European conference on computer vision (ECCV), pp 347–363

  • Ronneberger O, Fischer P, Brox T (2015) U-net: convolutional networks for biomedical image segmentation. In International conference on medical image computing and computer-assisted intervention, pp 234–241. Springer

  • Ros G, Koltun V, Codevilla F, Lopez A (2019) The carla autonomous driving challenge

  • Rotman D (2013) Meet the man with a cheap and easy plan to stop global warming. MIT Technology Review. http://www.technologyreview.com/featuredstory/511016/a-cheap-and-easy-plan-to-stop-globalwarming

  • Rouet J-M, Jacq J-J, Roux C (2000) Genetic algorithms for a robust 3-d mr-ct registration. IEEE Trans Inf Technol Biomed 4(2):126–136

  • Rumelhart DE (1998) The architecture of mind: a connectionist approach. Mind Read pp 207–238

  • Runarsson TP, Lucas SM (2012) Imitating play from game trajectories: Temporal difference learning versus preference learning. In 2012 IEEE conference on computational intelligence and games (CIG), pp 79–82

  • Sadeghian A, Alahi A, Savarese S (2017) Tracking the untrackable: Learning to track multiple cues with long-term dependencies. In Proceedings of the IEEE international conference on computer vision, pp 300–311

  • Sahba F (2016) Deep reinforcement learning for object segmentation in video sequences. In 2016 International conference on computational science and computational intelligence (CSCI), pp 857–860. IEEE

  • Sahba F, Tizhoosh HR, Salama MMA (2006) A reinforcement learning framework for medical image segmentation. In The 2006 IEEE international joint conference on neural network proceedings, pp 511–517. IEEE

  • Sahba F, Tizhoosh HR, Salama MMA (2007) Application of opposition-based reinforcement learning in image segmentation. In 2007 IEEE symposium on computational intelligence in image and signal processing, pp 246–251. IEEE

  • Schulman J, Levine S, Abbeel P, Jordan M, Moritz P (2015) Trust region policy optimization. In International conference on machine learning, pp 1889–1897

  • Schulman J, Wolski F, Dhariwal P, Radford A, Klimov O (2017) Proximal policy optimization algorithms. arXiv preprint arXiv:1707.06347

  • Sefati S, Cowan NJ, Vidal R (2015) Learning shared, discriminative dictionaries for surgical gesture segmentation and classification. In MICCAI workshop: M2CAI, vol 4

  • Sepp H, Jürgen S (1997) Long short-term memory. Neural Comput 9(8):1735–1780

  • Seung-Hwan B, Kuk-Jin Y (2017) Confidence-based data association and discriminative deep appearance learning for robust online multi-object tracking. IEEE Trans Pattern Anal Mach Intell 40(3):595–610

  • Shafiee MJ, Chwyl B, Li F, Wong A (2017) Fast yolo: a fast you only look once system for real-time embedded object detection in video. arXiv preprint arXiv:1709.05943

  • Shaker MR, Yue S, Duckett T (2009) Vision-based reinforcement learning using approximate policy iteration. In 2009 international conference on advanced robotics, pp 1–6

  • Shalabh B, Sutton Richard S, Mohammad G, Mark L (2009) Natural actor–critic algorithms. Automatica 45(11):2471–2482

  • Shalev-Shwartz S, Shammah S, Shashua A (2016) Safe, multi-agent, reinforcement learning for autonomous driving. arXiv preprint arXiv:1610.03295

  • Shen J, Zafeiriou S, Chrysos GG, Kossaifi J, Tzimiropoulos G, Pantic M (2015) The first facial landmark tracking in-the-wild challenge: Benchmark and results. In Proceedings of the IEEE international conference on computer vision workshops, pp 50–58

  • Shi Y, Cui L, Qi Z, Meng F, Chen Z (2016) Automatic road crack detection using random structured forests. IEEE Trans Intell Transp Syst 17(12):3434–3445

  • Silberman N, Hoiem D, Kohli P, Fergus R (2012) Indoor segmentation and support inference from rgbd images. In European conference on computer vision, pp 746–760. Springer

  • Simonyan K, Zisserman A (2014) Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556

  • Sindagi VA, Patel VM (2019) Multi-level bottom-top and top-bottom feature fusion for crowd counting. In Proceedings of the IEEE international conference on computer vision, pp 1002–1012

  • Singh SP, Wang L, Gupta S, Goli H, Padmanabhan P, Gulyás B (2020) 3d deep learning on medical images: a review

  • Song G, Myeong H, Lee KM (2018) Seednet: automatic seed generation with deep reinforcement learning for robust interactive segmentation. In Proceedings of the IEEE conference on computer vision and pattern recognition, pp 1760–1768

  • Song Y, Vallmitjana J, Stent A, Jaimes A (2015) Tvsum: summarizing web videos using titles. In Proceedings of the IEEE conference on computer vision and pattern recognition, pp 5179–5187

  • Song Y, Ma C, Gong L, Zhang J, Lau RWH, Yang M-H (2017) Crest: Convolutional residual learning for visual tracking. In Proceedings of the IEEE international conference on computer vision, pp 2555–2564

  • Stadie BC, Abbeel P, Sutskever I (2017) Third-person imitation learning. arxiv:CoRR:abs/1703.01703

  • Subramanian J, Mahajan A (2019) Reinforcement learning in stationary mean-field games, pp 251–259. International foundation for autonomous agents and multiagent systems

  • Sun S, Hu J, Yao M, Hu J, Yang X, Song Q, Wu X (2018) Robust multimodal image registration using deep recurrent reinforcement learning. In Asian conference on computer vision, pp 511–526. Springer

  • Sundararajan K, Woodard DL (2018) Deep learning for biometrics: a survey. 51:3

  • Sutton RS, Barto AG (2018) Reinforcement learning: an introduction. MIT Press

  • Sutton RS, McAllester D, Singh S, Mansour Y (1999) Policy gradient methods for reinforcement learning with function approximation. In Proceedings of the 12th international conference on neural information processing systems, NIPS’99, pp 1057–1063

  • Sutton RS, McAllester DA, Singh SP, Mansour Y (2000) Policy gradient methods for reinforcement learning with function approximation. In Advances in neural information processing systems vol 12, pp 1057–1063

  • Szegedy C, Ioffe S, Vanhoucke V, Alemi AA (2017) Inception-v4, inception-resnet and the impact of residual connections on learning. In Thirty-first AAAI conference on artificial intelligence

  • Szegedy C, Liu W, Jia Y, Sermanet P, Reed S, Anguelov D, Erhan D, Vanhoucke V, Rabinovich A (2015) Going deeper with convolutions. In Proceedings of the IEEE conference on computer vision and pattern recognition, pp 1–9

  • Szegedy C, Toshev A, Erhan D (2013) Deep neural networks for object detection. In Advances in neural information processing systems, pp 2553–2561

  • Sæmundsson S, Hofmann K, Deisenroth MP (2018) Meta reinforcement learning with latent variable gaussian processes. arXiv preprint arXiv:1803.07551

  • Tang Y, Tian Y, Lu J, Li P, Zhou J (2018) Deep progressive reinforcement learning for skeleton-based action recognition. In Proceedings of the IEEE conference on computer vision and pattern recognition, pp 5323–5332

  • Tao R, Gavves E, Smeulders AWM (2016) Siamese instance search for tracking. In Proceedings of the IEEE conference on computer vision and pattern recognition, pp 1420–1429

  • Tian Z, Si X, Zheng Y, Chen Z, Li X (2020) Multi-step medical image segmentation based on reinforcement learning. J Ambient Intell Human Comput

  • Tianyang X, Zhen-Hua F, Xiao-Jun W, Josef K (2019) Learning adaptive discriminative correlation filters via temporal consistency preserving spatial feature selection for robust visual object tracking. IEEE Trans Image Process 28(11):5596–5609

  • Todd H, Michael Q, Peter S (2011) A real-time model-based reinforcement learning architecture for robot control. arxiv:CoRR:abs/1105.1749

  • Toro OJ, Müller H, Krenn M, Gruenberg K, Taha AA, Winterstein M, Eggel I, Foncubierta-Rodríguez A, Goksel O, Jakab A et al (2016) Cloud-based evaluation of anatomical structure segmentation and landmark detection algorithms: Visceral anatomy benchmarks. IEEE Trans Med Imaging 35(11):2459–2475

  • Toromanoff M, Wirbel E, Moutarde F (2020) End-to-end model-free reinforcement learning for urban driving using implicit affordances. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 7153–716

  • Toshev A, Szegedy C (2014) Deeppose: Human pose estimation via deep neural networks. In Proceedings of the IEEE conference on computer vision and pattern recognition, pp 1653–1660

  • Tsai Y-H, Yang M-H, Black MJ (2016) Video segmentation via object flow. In Proceedings of the IEEE conference on computer vision and pattern recognition, pp 3899–3908

  • Tsurumine Y, Cui Y, Yamazaki K, Matsubara K (2019) Generative adversarial imitation learning with deep p-network for robotic cloth manipulation. In 2019 IEEE-RAS 19th international conference on humanoid robots (humanoids), pp 274–280

  • Uijlings JRR, Van De Sande KEA, Gevers T, Smeulders AWM (2013) Selective search for object recognition. Int J Comput Vis 104(2):154–171

  • Uzkent B, Yeh C, Ermon S (2020) Efficient object detection in large images using deep reinforcement learning. In The IEEE winter conference on applications of computer vision, pp 1824–1833

  • Valmadre J, Bertinetto L, Henriques J, Vedaldi A, Torr PHS (2017) End-to-end representation learning for correlation filter based tracking. In Proceedings of the IEEE conference on computer vision and pattern recognition, pp 2805–2813

  • Van Hasselt H, Guez A, Silver D (2016) Deep reinforcement learning with double q-learning. In Thirtieth AAAI conference on artificial intelligence

  • Van Hove L (2001) Optimal denominations for coins and bank notes: in defense of the principle of least effort. J Money Credit Bank pp 1015–1021

  • Vecchio G, Palazzo S, Giordano D, Rundo F, Spampinato C (2020) Mask-rl: Multiagent video object segmentation framework through reinforcement learning. IEEE Trans Neural Netw Learn Syst

  • Vijayanarasimhan S, Ricco S, Schmid C, Sukthankar R, Fragkiadaki K (2017) Sfm-net: learning of structure and motion from video. arXiv preprint arXiv:1704.07804

  • Vinyals O, Babuschkin I, Chung J, Mathieu M, Jaderberg M, Czarnecki W, Dudzik A, Huang A, Georgiev P, Powell R, Ewalds T, Horgan D, Kroiss M, Danihelka I, Agapiou J, Oh J, Dalibard V, Choi D, Sifre L, Sulsky Y, Vezhnevets S, Molloy J, Cai T, Budden D, Paine T, Gulcehre C, Wang Z, Pfaff T, Pohlen T, Yogatama D, Cohen J, McKinney K, Smith O, Schaul T, Lillicrap T, Apps C, Kavukcuoglu K, Hassabis D, Silver D (2019) AlphaStar: mastering the real-time strategy game StarCraft II. https://deepmind.com/blog/alphastar-mastering-real-time-strategy-game-starcraft-ii/

  • Vlontzos A, Alansary A, Kamnitsas K, Rueckert D, Kainz B (2019) Multiple landmark detection using multi-agent reinforcement learning. In International conference on medical image computing and computer-assisted intervention, pp 262–270

  • Wang Z, Bovik AC, Sheikh HR, Simoncelli EP (2004) Image quality assessment: from error visibility to structural similarity. IEEE Trans Image Process 13(4):600–612

  • Wang G, Zuluaga MA, Li W, Pratt R, Patel PA, Aertsen M, Doel T, David AL, Deprest J, Ourselin S, et al (2018) Deepigeos: a deep interactive geodesic framework for medical image segmentation. IEEE Trans Pattern Anal Mach Intell 41(7):1559–1572

  • Wang H, Wang Y, Zhou Z, Ji X, Gong D, Zhou J, Li Z, Liu W (2018) Cosface: Large margin cosine loss for deep face recognition. In Proceedings of the IEEE conference on computer vision and pattern recognition, pp 5265–5274

  • Wang JX, Kurth-Nelson Z, Tirumala D, Soyer H, Leibo JZ, Munos R, Blundell C, Kumaran D, Botvinick M (2016) Learning to reinforcement learn. arxiv:CoRR:abs/1611.05763

  • Wang L, Lu H, Ruan X, Yang M-H (2015) Deep networks for saliency detection via local estimation and global search. In 2015 IEEE conference on computer vision and pattern recognition (CVPR), pp 3183–3192. IEEE

  • Wang M, Deng W (2020) Mitigating bias in face recognition using skewness-aware reinforcement learning. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 9322–9331

  • Wang M, Deng W, Hu J, Tao X, Huang Y (2019) Racial faces in the wild: Reducing racial bias by information maximization adaptation network. In Proceedings of the IEEE international conference on computer vision, pp 692–702

  • Wang N, Yeung D-Y (2013) Learning a deep compact image representation for visual tracking. In Advances in neural information processing systems, pp 809–817

  • Wang T, Bao X, Clavera I, Hoang J, Wen Y, Langlois E, Zhang S, Zhang G, Abbeel P, Ba J (2019) Benchmarking model-based reinforcement learning. arxiv:CoRR:abs/1907.02057

  • Wang Y, Dong M, Shen J, Wu Y, Cheng S, Pantic M (2020) Dynamic face video segmentation via reinforcement learning. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 6959–6969

  • Wang Z, Zhang J, Lin M, Wang J, Luo P, Ren J (2020) Learning a reinforced agent for flexible exposure bracketing selection. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 1820–1828

  • Wang Z, Schaul T, Hessel M, Van Hasselt H, Lanctot M, De Freitas N (2015) Dueling network architectures for deep reinforcement learning. arXiv preprint arXiv:1511.06581

  • Weiming H, Xi L, Wenhan L, Xiaoqin Z, Stephen M, Zhongfei Z (2012) Single and multiple object tracking using log-euclidean riemannian subspace and block-division appearance model. IEEE Trans Pattern Anal Mach Intell 34(12):2420–2440

  • Wickelgren WA (1973) The long and the short of memory. Psychol Bull 80(6):425

  • Williams RJ (1992) Simple statistical gradient-following algorithms for connectionist reinforcement learning. Mach Learn 8(3–4):229–256

  • Wirth C, Fürnkranz J (2015) On learning from game annotations. IEEE Trans Comput Intell AI Games 7(3):304–316

  • Wohlhart P, Lepetit V (2015) Learning descriptors for object recognition and 3d pose estimation. In Proceedings of the IEEE conference on computer vision and pattern recognition, pp 3109–3118

  • Woo S, Park J, Lee J-Y, Kweon IS (2018) Cbam: Convolutional block attention module. In European conference on computer vision, pp 3–19

  • Wu Y, Lim J, Yang M-H (2013) Online object tracking: a benchmark. In Proceedings of the IEEE conference on computer vision and pattern recognition, pp 2411–2418

  • Xia L, Chen C-C, Aggarwal JK (2012) View invariant human action recognition using histograms of 3d joints. In 2012 IEEE computer society conference on computer vision and pattern recognition workshops, pp 20–27. IEEE

  • Xiahai Z, Juan S (2016) Multi-scale patch and multi-modality atlases for whole heart segmentation of mri. Med Image Anal 31:77–87

  • Xiang S, Li H (2017) On the effects of batch and weight normalization in generative adversarial networks. arXiv preprint arXiv:1704.03971

  • Xiang Y, Alahi A, Savarese S (2015) Learning to track: online multi-object tracking by decision making. In Proceedings of the IEEE international conference on computer vision, pp 4705–4713

  • Xiao F, Lee YJ (2016) Track and segment: an iterative unsupervised approach for video object proposals. In Proceedings of the IEEE conference on computer vision and pattern recognition, pp 933–942

  • Xie Q, Luong M-T, Hovy E, Le QV (2020) Self-training with noisy student improves imagenet classification. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 10687–10698

  • Xiong H, Lu H, Liu C, Liu L, Cao Z, Shen C (2019) From open set to closed set: Counting objects by spatial divide-and-conquer. In Proceedings of the IEEE international conference on computer vision, pp 8362–8371

  • Xu H, Su F (2015) Robust seed localization and growing with deep convolutional features for scene text detection. In Proceedings of the 5th ACM on international conference on multimedia retrieval, pp 387–394. ACM

  • Xu N, Price B, Cohen S, Yang J, Huang TS (2016) Deep interactive object selection. In Proceedings of the IEEE conference on computer vision and pattern recognition, pp 373–381

  • Xu Y-S, Fu T-J, Yang H-K, Lee C-Y (2018) Dynamic video segmentation network. In Proceedings of the IEEE conference on computer vision and pattern recognition, pp 6556–6565

  • Xuanang X, Fugen Z, Bo L, Dongshan F, Xiangzhi B (2019) Efficient multiple organ localization in ct image using 3d region proposal network. IEEE Trans Med Imaging 38(8):1885–1898

  • Yamakazi K, Viet-Khoa Vo-Ho AS, Le NTH, Tran T (2021) Agent-environment network for temporal action proposal generation. In International conference on acoustics, speech and signal processing

  • Yamazaki K, Rathour VS, Le T (2021) Invertible residual network with regularization for effective medical image segmentation. arXiv preprint arXiv:2103.09042

  • Yan W, Lei Z, Lituan W, Zizhou W (2018) Multitask learning for object localization with deep reinforcement learning. IEEE Trans Cogn Deve Syst 11(4):573–580

  • Yan Z, Yuan Y, Zuo W, Tan X, Wang Y, Wen S, Ding E (2019) Perspective-guided convolution networks for crowd counting. In Proceedings of the IEEE international conference on computer vision, pp 952–961

  • Yang Z, Huang L, Chen Y, Wei Z, Ahn S, Zelinsky G, Samaras D, Hoai M (2020) Predicting goal-directed human attention using inverse reinforcement learning. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition (CVPR)

  • Yi W, Jongwoo L, Ming-Hsuan Y (2015) Object tracking benchmark. IEEE Trans Pattern Anal Mach Intell 37(9):1834–1848

  • Yong KD, Moongu J (2014) Data fusion of radar and image measurements for multi-object tracking via kalman filtering. Inf Sci 278:641–652

  • Yoon JH, Lee CR, Yang MH, Yoon KJ (2016) Online multi-object tracking via structural constraint event aggregation. In Proceedings of the IEEE conference on computer vision and pattern recognition, pp 1392–1400

  • Yoshihisa T, Yunduan C, Eiji U, Takamitsu M (2019) Deep reinforcement learning with smooth policy update: application to robotic cloth manipulation. Robot Auton Syst 112:72–83

  • Yoshua B, Patrice S, Paolo F (1994) Learning long-term dependencies with gradient descent is difficult. IEEE Trans Neural Netw 5(2):157–166

  • You C, Lu J, Filev D, Tsiotras P (2019) Advanced planning for autonomous vehicles using reinforcement learning and deep inverse reinforcement learning. Robot Auton Syst 114:1–18

  • Yu C, Liu J, Nemati S (2019) Reinforcement learning in healthcare: a survey. arXiv preprint arXiv:1908.08796

  • Yu T, Quillen D, He Z, Julian R, Hausman K, Finn C, Levine S (2020) Meta-world: a benchmark and evaluation for multi-task and meta reinforcement learning. In Conference on robot learning, pp 1094–1100

  • Yun S, Choi J, Yoo Y, Yun K, Choi JY (2017) Action-decision networks for visual tracking with deep reinforcement learning. In Proceedings of the IEEE conference on computer vision and pattern recognition, pp 2711–2720

  • Yunliang C, Said O, Manas S, Mark L, Shuo L (2015) Multi-modality vertebra recognition in arbitrary views using 3D deformable hierarchical model. IEEE Trans Med Imaging 34(8):1676–1693

  • Yushi C, ** J (2015) Spectral-spatial classification of hyperspectral data based on deep belief network. IEEE J Select Top Appl Earth Observ Remote Sens 8(6):2381–2392

  • Zdenek K, Krystian M, Jiri M (2011) Tracking-learning-detection. IEEE Trans Pattern Anal Mach Intell 34(7):1409–1422

  • Zengyi Q, Jinglu W, Yan L (2019) Monogrnet: a geometric reasoning network for monocular 3d object localization. Proc AAAI Confer Artific Intell 33(01):8851–8858

  • Zha D, Lai K-H, Zhou K, Hu X (2019) Experience replay optimization. arXiv preprint arXiv:1906.08387

  • Zhang J, Li W, Ogunbona PO, Wang P, Tang C (2016) Rgb-d-based action recognition datasets: a survey. Pattern Recogn 60:86–105

  • Zhang-Wei H, Chen Yu-M, Shih-Yang S, Tzu-Yun S, Yi-Hsiang C, Hsuan-Kung Y, Brian Hsi-Lin H, Chih-Chieh T, Yueh-Chuan C, Tsu-Ching H, et al. Virtual-to-real: learning to control in visual semantic segmentation. arXiv preprint arXiv:1802.00285

  • Zhang D, Maei H, Wang X, Wang Y-F (2017) Deep reinforcement learning for visual object tracking in videos. arXiv preprint arXiv:1701.08936

  • Zhang D, Yang L, Meng D, Xu D, Han J (2017) Spftn: a self-paced fine-tuning network for segmenting objects in weakly labelled videos. In Proceedings of the IEEE conference on computer vision and pattern recognition, pp 4429–4437

  • Zhang K, Chao W-L, Sha F, Grauman K (2016) Video summarization with long short-term memory. In European conference on computer vision, pp 766–782. Springer

  • Zhang Y, Zhou D, Chen S, Gao S, Ma Y (2016) Single-image crowd counting via multi-column convolutional neural network. In Proceedings of the IEEE conference on computer vision and pattern recognition, pp 589–597

  • Zhao H, Qi X, Shen X, Shi J, Jia J (2018) Icnet for real-time semantic segmentation on high-resolution images. In Proceedings of the European conference on computer vision (ECCV), pp 405–420

  • Zhao H, Shi J, Qi X, Wang X, Jia J (2017) Pyramid scene parsing network. In Proceedings of the IEEE conference on computer vision and pattern recognition, pp 2881–2890

  • Zheng Y, Liu D, Georgescu B, Nguyen H, Comaniciu D (2015) 3d deep learning for efficient and robust landmark detection in volumetric data. In International conference on medical image computing and computer-assisted intervention, pp 565–572. Springer

  • Zhewei H, Wen H, Shuchang Z (2019) Learning to paint with model-based deep reinforcement learning. In Proceedings of the IEEE international conference on computer vision, pp 8709–8718

  • Zhiheng H, Wei X, Kai Y (2015) Bidirectional lstm-crf models for sequence tagging. arXiv preprint arXiv:1508.01991

  • Zhiwu H, Chengde W, Thomas P, Van Gool L (2017) Deep learning on lie groups for skeleton-based action recognition. In Proceedings of the IEEE conference on computer vision and pattern recognition, pp 6099–6108

  • Zhong-Qiu Z, Shou-Tao X, Dian L, Wei-Dong T, Zhi-Da J (2019) A review of image set classification. Neurocomputing 335:251–260

  • Zhou B, Zhao H, Puig X, Fidler S, Barriuso A, Torralba A (2017) Scene parsing through ade20k dataset. In Proceedings of the IEEE conference on computer vision and pattern recognition, pp 633–641

  • Zhou K, Qiao Y, Xiang T (2018) Deep reinforcement learning for unsupervised video summarization with diversity-representativeness reward. In Thirty-Second AAAI conference on artificial intelligence

  • Zhou K, Xiang T, Cavallaro A (2018) Video summarisation by classification with deep reinforcement learning. arXiv preprint arXiv:1807.03089

  • Zhu X, Xiong Y, Dai J, Yuan L, Wei Y (2017) Deep feature flow for video recognition. In Proceedings of the IEEE conference on computer vision and pattern recognition, pp 2349–2358

  • Zou WY, Wang X, Sun X, Lin Y (2014) Generic object detection with dense neural patterns and regionlets. arXiv preprint arXiv:1404.4316

  • van Beek P (2018) Improved image selection for stack-based hdr imaging. arXiv preprint arXiv:1806.07420

  • van Hasselt H, Guez A, Silver D (2015) Deep reinforcement learning with double q-learning. arXiv e-prints, arXiv:1509.06461

Acknowledgements

This material is based upon work supported by the National Science Foundation under Award No OIA-1946391.

Author information

Corresponding author

Correspondence to Ngan Le.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Le, N., Rathour, V.S., Yamazaki, K. et al. Deep reinforcement learning in computer vision: a comprehensive survey. Artif Intell Rev 55, 2733–2819 (2022). https://doi.org/10.1007/s10462-021-10061-9

  • DOI: https://doi.org/10.1007/s10462-021-10061-9
