IVIST: Interactive VIdeo Search Tool in VBS 2020

Park, Sungjune; Song, Jaeyub; Park, Minho; Ro, Yong Man

doi:10.1007/978-3-030-37734-2_74

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 11962)

Included in the following conference series:

International Conference on Multimedia Modeling

Abstract

This paper presents a new video retrieval tool, the Interactive VIdeo Search Tool (IVIST), which participates in the 2020 Video Browser Showdown (VBS). IVIST is equipped with high-performing functionalities such as object detection, dominant-color finding, scene-text recognition, and text-image retrieval, each constructed with deep neural networks. With these functionalities, IVIST performs well at finding the videos a user is looking for. Furthermore, its user-friendly interface makes IVIST easy to use even for novice users. Although IVIST was developed to participate in VBS, we hope that it will be applied as a practical video retrieval tool in the future, dealing with actual video data on the Internet.
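
To make the idea of combining such functionalities concrete, the following is a minimal sketch of a dominant-color finder and a filter that keeps keyframes matching both an object label and a color. It is an illustration under stated assumptions, not the method used in IVIST: the abstract does not describe how dominant colors are computed, so coarse RGB histogram binning is swapped in here, and the names `dominant_color`, `search`, and the layout of `index` are hypothetical.

```python
from collections import Counter


def dominant_color(pixels, bins_per_channel=4):
    """Return the centre of the most populated coarse RGB bin.

    Assumption: pixels is a non-empty list of (r, g, b) tuples in 0..255.
    Histogram binning is an illustrative choice, not the paper's method.
    """
    if not pixels:
        raise ValueError("pixels must be non-empty")
    step = 256 // bins_per_channel
    counts = Counter((r // step, g // step, b // step) for r, g, b in pixels)
    (rb, gb, bb), _ = counts.most_common(1)[0]
    half = step // 2
    return (rb * step + half, gb * step + half, bb * step + half)


def search(index, wanted_object, wanted_color, bins_per_channel=4):
    """Return ids of keyframes that contain the object and share the color bin.

    Assumption: index maps a keyframe id to a dict with an "objects" set
    (detector labels) and a "color" tuple (precomputed dominant color).
    """
    step = 256 // bins_per_channel
    target_bin = tuple(c // step for c in wanted_color)
    hits = []
    for frame_id, entry in index.items():
        frame_bin = tuple(c // step for c in entry["color"])
        if wanted_object in entry["objects"] and frame_bin == target_bin:
            hits.append(frame_id)
    return hits


# Hypothetical keyframe: mostly red pixels with some blue ones.
red_frame = [(250, 10, 10)] * 6 + [(10, 10, 250)] * 3
index = {
    "v1_f3": {"objects": {"car"}, "color": dominant_color(red_frame)},
    "v2_f9": {"objects": {"car", "dog"}, "color": (32, 32, 224)},
}
print(search(index, "car", (255, 0, 0)))
```

In this sketch the per-keyframe attributes are computed once, offline, so that an interactive query reduces to a cheap filter over the index; whether IVIST is organised this way is not stated in the abstract.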

S. Park and J. Song contributed equally.

Author information

Authors and Affiliations

Image and Video Systems Lab, School of Electrical Engineering, KAIST, Daejeon, South Korea
Sungjune Park, Jaeyub Song, Minho Park & Yong Man Ro

Corresponding authors

Correspondence to Sungjune Park or Jaeyub Song.

Editor information

Editors and Affiliations

Korea Advanced Institute of Science and Technology, Daejeon, Korea (Republic of)
Yong Man Ro
National Chiao Tung University, Hsinchu, Taiwan
Wen-Huang Cheng
Korea Advanced Institute of Science and Technology, Daejeon, Korea (Republic of)
Junmo Kim
National Cheng Kung University, Tainan City, Taiwan
Wei-Ta Chu
Tsinghua University, Beijing, China
Peng Cui
Korea Advanced Institute of Science and Technology, Daejeon, Korea (Republic of)
Jung-Woo Choi
National Tsing Hua University, Hsinchu, Taiwan
Min-Chun Hu
Ghent University, Ghent, Belgium
Wesley De Neve

About this paper

Cite this paper

Park, S., Song, J., Park, M., Ro, Y.M. (2020). IVIST: Interactive VIdeo Search Tool in VBS 2020. In: Ro, Y., et al. MultiMedia Modeling. MMM 2020. Lecture Notes in Computer Science, vol 11962. Springer, Cham. https://doi.org/10.1007/978-3-030-37734-2_74

DOI: https://doi.org/10.1007/978-3-030-37734-2_74
Published: 24 December 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-37733-5
Online ISBN: 978-3-030-37734-2
eBook Packages: Computer Science, Computer Science (R0)
