Dewarping Document Image by Displacement Flow Estimation with Fully Convolutional Network

  • Conference paper
Document Analysis Systems (DAS 2020)

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 12116)

Abstract

As camera-based documents are increasingly used, rectifying distorted document images becomes necessary to improve recognition performance. In this paper, we propose a novel framework that both rectifies distorted document images and finely removes the background by estimating pixel-wise displacements with a fully convolutional network (FCN). The document image is rectified by transforming it according to the estimated pixel displacements. The FCN is trained by regressing the displacements of synthesized distorted documents, and, to control the smoothness of the displacements, we propose a Local Smooth Constraint (LSC) as a regularization term. Our approach is easy to implement and consumes moderate computing resources. Experiments show that our approach can dewarp document images effectively under various geometric distortions and achieves state-of-the-art performance in terms of local details and overall effect.
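The idea of rectifying an image from pixel-wise displacements can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the `(dy, dx)` layout of `flow`, the boolean foreground `mask`, and the nearest-neighbor forward mapping are all assumptions made for illustration, and `local_smoothness_penalty` is a generic neighbor-difference regularizer standing in for the LSC, whose exact form the abstract does not give.

```python
import numpy as np

def rectify_by_displacement(image, flow, mask):
    """Move each foreground pixel by its displacement (nearest-neighbor forward mapping).

    image: (H, W) array; flow: (H, W, 2) array of (dy, dx); mask: (H, W) bool.
    All three layouts are assumptions for this sketch.
    """
    h, w = image.shape[:2]
    out = np.zeros_like(image)              # background pixels stay zero
    ys, xs = np.nonzero(mask)               # coordinates of foreground pixels
    # Target position = source position + predicted displacement.
    ty = np.rint(ys + flow[ys, xs, 0]).astype(int)
    tx = np.rint(xs + flow[ys, xs, 1]).astype(int)
    inside = (ty >= 0) & (ty < h) & (tx >= 0) & (tx < w)
    out[ty[inside], tx[inside]] = image[ys[inside], xs[inside]]
    return out

def local_smoothness_penalty(flow):
    """Generic stand-in regularizer: mean absolute difference between
    each displacement and its vertical and horizontal neighbors."""
    vertical = np.abs(flow[1:, :, :] - flow[:-1, :, :]).mean()
    horizontal = np.abs(flow[:, 1:, :] - flow[:, :-1, :]).mean()
    return vertical + horizontal

# Toy example: shift every pixel one column to the right.
image = np.arange(16, dtype=np.uint8).reshape(4, 4)
flow = np.zeros((4, 4, 2))
flow[..., 1] = 1.0
mask = np.ones((4, 4), dtype=bool)
rectified = rectify_by_displacement(image, flow, mask)
```

In the toy example the flow is constant, so the penalty is zero; pixels mapped outside the image are discarded. A practical system would likely interpolate the scattered target positions rather than round them, since rounding can leave holes in the output.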

Acknowledgements

This work has been supported by the National Natural Science Foundation of China (NSFC) under Grants 61733007, 61573355 and 61721004.

Author information

Corresponding author

Correspondence to Cheng-Lin Liu.
