IVIST: Interactive VIdeo Search Tool in VBS 2020

  • Conference paper
MultiMedia Modeling (MMM 2020)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 11962)

Abstract

This paper presents the Interactive VIdeo Search Tool (IVIST), a new video retrieval tool that participates in the 2020 Video Browser Showdown (VBS). IVIST is equipped with high-performing functionalities such as object detection, dominant-color finding, scene-text recognition, and text-image retrieval. These functionalities are built with various deep neural networks. With them, IVIST performs well at finding the videos that users are looking for. In addition, its user-friendly interface makes IVIST easy to use, even for novice users. Although IVIST was developed to participate in VBS, we hope that it will be applied in the future as a practical video retrieval tool for real video data on the Internet.
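The abstract does not say how dominant colors are computed or queried. Purely as an illustration of what dominant-color filtering could look like, the sketch below maps each pixel of a keyframe to the nearest entry of a small named palette and keeps the frames whose most frequent color matches the query. The palette, the function names (`nearest_color`, `dominant_color`, `filter_frames`), and the frame representation are all assumptions made for this example, not details of IVIST.

```python
from collections import Counter

# Hypothetical palette: the abstract does not list the colors IVIST uses.
PALETTE = {
    "black": (0, 0, 0),
    "white": (255, 255, 255),
    "red": (255, 0, 0),
    "green": (0, 255, 0),
    "blue": (0, 0, 255),
    "yellow": (255, 255, 0),
}


def nearest_color(pixel):
    """Return the palette name closest to an (r, g, b) pixel (squared Euclidean distance)."""
    return min(
        PALETTE,
        key=lambda name: sum((p - c) ** 2 for p, c in zip(pixel, PALETTE[name])),
    )


def dominant_color(pixels):
    """Return the most frequent palette color among a sequence of (r, g, b) pixels."""
    counts = Counter(nearest_color(p) for p in pixels)
    if not counts:
        raise ValueError("no pixels given")
    return counts.most_common(1)[0][0]


def filter_frames(frames, color):
    """Keep the ids of frames whose dominant color equals the queried color name.

    `frames` maps a frame id to its pixel list (an assumed, simplified layout).
    """
    return [fid for fid, pixels in frames.items() if dominant_color(pixels) == color]


if __name__ == "__main__":
    frames = {
        "shot_a": [(250, 10, 10), (240, 20, 5), (10, 10, 250)],
        "shot_b": [(5, 5, 5)],
    }
    print(filter_frames(frames, "red"))
```

A real system would more likely cluster colors or work in a perceptual color space; the fixed palette here only keeps the example short.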

S. Park and J. Song contributed equally.

Corresponding authors

Correspondence to Sungjune Park or Jaeyub Song.

Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Cite this paper

Park, S., Song, J., Park, M., Ro, Y.M. (2020). IVIST: Interactive VIdeo Search Tool in VBS 2020. In: Ro, Y., et al. MultiMedia Modeling. MMM 2020. Lecture Notes in Computer Science, vol 11962. Springer, Cham. https://doi.org/10.1007/978-3-030-37734-2_74

  • DOI: https://doi.org/10.1007/978-3-030-37734-2_74

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-37733-5

  • Online ISBN: 978-3-030-37734-2

  • eBook Packages: Computer Science, Computer Science (R0)
