Abstract
As camera-captured documents become increasingly common, rectifying distorted document images is needed to improve recognition performance. In this paper, we propose a novel framework that both rectifies distorted document images and finely removes the background by estimating pixel-wise displacements with a fully convolutional network (FCN). The document image is rectified by transforming it according to the pixel displacements. The FCN is trained by regressing the displacements of synthesized distorted documents, and to control the smoothness of the displacements we propose a Local Smooth Constraint (LSC) as a regularization term. Our approach is easy to implement and consumes moderate computing resources. Experiments show that our approach can dewarp document images effectively under various geometric distortions and achieves state-of-the-art performance in terms of both local details and overall effect.
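To make the idea of rectification by pixel-wise displacement concrete, the following is a minimal sketch, not the authors' implementation. It assumes the network has already produced a per-pixel displacement `flow` (as `(dy, dx)` pairs) and a foreground `mask`; the function names `rectify` and `local_smoothness`, the nearest-pixel rounding, and the neighbour-difference penalty are all illustrative assumptions. In particular, the abstract does not define the exact form of the Local Smooth Constraint, so the penalty below is only one plausible way to measure local smoothness. Plain Python lists are used to keep the sketch dependency-free.

```python
def rectify(image, flow, mask, fill=255):
    """Move each foreground pixel by its (dy, dx) displacement.

    image, mask: H x W nested lists; flow: H x W list of (dy, dx) pairs.
    Background pixels (mask false) are dropped, which removes the background.
    Target positions are rounded to the nearest pixel (an assumption).
    """
    h, w = len(image), len(image[0])
    out = [[fill] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if not mask[y][x]:
                continue
            dy, dx = flow[y][x]
            ty, tx = round(y + dy), round(x + dx)
            if 0 <= ty < h and 0 <= tx < w:
                out[ty][tx] = image[y][x]
    return out


def local_smoothness(flow, mask):
    """Mean absolute displacement difference between adjacent foreground pixels.

    Hypothetical stand-in for a local smoothness regularizer: it compares
    each pixel with its right and lower neighbour only.
    """
    h, w = len(flow), len(flow[0])
    total, count = 0.0, 0
    for y in range(h):
        for x in range(w):
            if not mask[y][x]:
                continue
            for ny, nx in ((y, x + 1), (y + 1, x)):
                if ny < h and nx < w and mask[ny][nx]:
                    total += abs(flow[y][x][0] - flow[ny][nx][0])
                    total += abs(flow[y][x][1] - flow[ny][nx][1])
                    count += 1
    return total / count if count else 0.0


if __name__ == "__main__":
    image = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
    mask = [[True] * 3 for _ in range(3)]
    flow = [[(0, 1)] * 3 for _ in range(3)]  # shift every pixel one column right
    print(rectify(image, flow, mask))
    print(local_smoothness(flow, mask))
```

Moving pixels forward in this way can leave holes in the output where no source pixel lands; a practical pipeline would fill them by interpolating over the scattered target positions rather than leaving the fill value.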
Acknowledgements
This work has been supported by the National Natural Science Foundation of China (NSFC) under Grants 61733007, 61573355 and 61721004.