EMVS: Event-Based Multi-View Stereo—3D Reconstruction with an Event Camera in Real-Time

Abstract

Event cameras are bio-inspired vision sensors that output pixel-level brightness changes instead of standard intensity frames. They offer significant advantages over standard cameras, namely a very high dynamic range, no motion blur, and a latency in the order of microseconds. However, because the output is composed of a sequence of asynchronous events rather than actual intensity images, traditional vision algorithms cannot be applied, so that a paradigm shift is needed. We introduce the problem of event-based multi-view stereo (EMVS) for event cameras and propose a solution to it. Unlike traditional MVS methods, which address the problem of estimating dense 3D structure from a set of known viewpoints, EMVS estimates semi-dense 3D structure from an event camera with known trajectory. Our EMVS solution elegantly exploits two inherent properties of an event camera: (1) its ability to respond to scene edges—which naturally provide semi-dense geometric information without any pre-processing operation—and (2) the fact that it provides continuous measurements as the sensor moves. Despite its simplicity (it can be implemented in a few lines of code), our algorithm is able to produce accurate, semi-dense depth maps, without requiring any explicit data association or intensity estimation. We successfully validate our method on both synthetic and real data. Our method is computationally very efficient and runs in real-time on a CPU.
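
To make the geometry concrete, the sketch below illustrates the back-projection idea behind EMVS with a minimal NumPy example: every event casts votes along its viewing ray into a discretized depth volume (a disparity space image) defined in a reference view, and semi-dense structure is then read off as local maxima of the vote count. This is an illustrative sketch under assumed conventions (world-to-camera poses, a pinhole intrinsic matrix K, a dictionary of per-event poses); the function and variable names are placeholders, not the authors' implementation.

```python
import numpy as np

def emvs_vote_sketch(events, poses, K, ref_pose, depths, img_size):
    """Accumulate per-event votes along back-projected rays into a DSI (illustrative).

    events   : iterable of (t, x, y) event tuples (pixel coordinates)
    poses    : dict t -> (R, tvec), world-to-camera pose of the event camera at time t
    K        : 3x3 pinhole intrinsic matrix
    ref_pose : (R_ref, t_ref), world-to-camera pose of the reference view
    depths   : 1D array of candidate depths along the reference optical axis
    img_size : (width, height) of the sensor, e.g. (240, 180) for the DAVIS
    """
    Kinv = np.linalg.inv(K)
    R_ref, t_ref = ref_pose
    W, H = img_size
    dsi = np.zeros((len(depths), H, W))            # vote volume: (depth, row, col)

    for t, x, y in events:
        R, tvec = poses[t]
        C_w = -R.T @ tvec                          # event-camera center in world frame
        d_w = R.T @ (Kinv @ np.array([x, y, 1.0])) # viewing-ray direction in world frame
        denom = (R_ref @ d_w)[2]
        if abs(denom) < 1e-12:                     # ray parallel to the reference depth planes
            continue
        z0 = (R_ref @ C_w + t_ref)[2]              # depth of the camera center in the ref frame
        for i, d in enumerate(depths):
            s = (d - z0) / denom                   # point on the ray at reference depth d
            X_ref = R_ref @ (C_w + s * d_w) + t_ref
            u, v, w = K @ X_ref
            if w <= 0:
                continue
            u, v = int(round(u / w)), int(round(v / w))
            if 0 <= u < W and 0 <= v < H:
                dsi[i, v, u] += 1                  # one vote per (event, depth plane)

    # Semi-dense structure: local maxima of the DSI along depth at well-supported pixels.
    return dsi
```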

Notes

  1. The DAVIS comprises both a frame camera and an event sensor (DVS) in the same pixel array of size \(240~\times ~180\). The frames may be used to simplify intrinsic camera calibration by applying standard algorithms (Zhang 2000). Otherwise, tailored event-based algorithms, such as that of Mueggler et al. (2014), may be applied.
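
As an illustration of the frame-based route mentioned in this note, the snippet below calibrates the 240 × 180 DAVIS frame sensor from checkerboard images using OpenCV's implementation of Zhang (2000). The board size and file paths are placeholders; this is a generic calibration sketch, not code from the paper.

```python
import glob
import cv2
import numpy as np

board = (9, 6)                                      # inner-corner grid of the checkerboard (placeholder)
objp = np.zeros((board[0] * board[1], 3), np.float32)
objp[:, :2] = np.mgrid[0:board[0], 0:board[1]].T.reshape(-1, 2)

obj_points, img_points = [], []
for fname in sorted(glob.glob("davis_frames/*.png")):   # DAVIS grayscale frames (placeholder path)
    img = cv2.imread(fname, cv2.IMREAD_GRAYSCALE)
    if img is None:
        continue
    found, corners = cv2.findChessboardCorners(img, board)
    if found:
        obj_points.append(objp)
        img_points.append(corners)

rms, K, dist, rvecs, tvecs = cv2.calibrateCamera(
    obj_points, img_points, (240, 180), None, None)
print("RMS reprojection error:", rms)
print("intrinsic matrix K:\n", K)
```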

References

  • Bardow, P., Davison, A. J., & Leutenegger, S. (2016). Simultaneous optical flow and intensity estimation from an event camera. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. https://doi.org/10.1109/CVPR.2016.102.

  • Benosman, R., Clercq, C., Lagorce, X., Ieng, S.-H., & Bartolozzi, C. (2014). Event-based visual flow. IEEE Transactions on Neural Networks and Learning Systems, 25(2), 407–417. https://doi.org/10.1109/TNNLS.2013.2273537.

  • Benosman, R., Ieng, S.-H., Clercq, C., Bartolozzi, C., & Srinivasan, M. (2012). Asynchronous frameless event-based optical flow. Neural Networks, 27, 32–37. https://doi.org/10.1016/j.neunet.2011.11.001.

  • Brandli, C., Muller, L., & Delbruck, T. (2014a). Real-time, high-speed video decompression using a frame- and event-based DAVIS sensor. In International Symposium on Circuits and Systems (ISCAS) (pp. 686–689). https://doi.org/10.1109/ISCAS.2014.6865228.

  • Brandli, C., Berner, R., Yang, M., Liu, S.-C., & Delbruck, T. (2014b). A 240 \(\times \) 180 130 dB 3 \(\mu \)s latency global shutter spatiotemporal vision sensor. IEEE Journal of Solid-State Circuits, 49(10), 2333–2341. https://doi.org/10.1109/JSSC.2014.2342715.

  • Camunas-Mesa, L. A., Serrano-Gotarredona, T., Ieng, S. H., Benosman, R. B., & Linares-Barranco, B. (2014). On the use of orientation filters for 3D reconstruction in event-driven stereo vision. Frontiers in Neuroscience, 8, 48. https://doi.org/10.3389/fnins.2014.00048.

  • Censi, A., & Scaramuzza, D. (2014). Low-latency event-based visual odometry. In IEEE International Conference on Robotics and Automation (ICRA).

  • Collins, R. T. (1996). A space-sweep approach to true multi-image matching. In Proceedings of IEEE International Conference on Computer Vision and Pattern Recognition (pp. 358–363). https://doi.org/10.1109/CVPR.1996.517097.

  • Cook, M., Gugelmann, L., Jug, F., Krautz, C., & Steger, A. (2011). Interacting maps for fast visual interpretation. In International Joint Conference Neural Networks (IJCNN) (pp. 770–776). https://doi.org/10.1109/IJCNN.2011.6033299.

  • Delbruck, T. (2016). Neuromorphic vision sensing and processing. In European Solid-State Device Research Conference (ESSDERC) (pp. 7–14). https://doi.org/10.1109/ESSDERC.2016.7599576.

  • Delbruck, T., & Lichtsteiner, P. (2007). Fast sensory motor control based on event-based hybrid neuromorphic-procedural system. In IEEE International Symposium on Circuits and Systems (ISCAS) (pp. 845–848). https://doi.org/10.1109/ISCAS.2007.378038.

  • Delbruck, T., & Lang, M. (2013). Robotic goalie with 3 ms reaction time at 4% CPU load using event-based dynamic vision sensor. Frontiers in Neuroscience. https://doi.org/10.3389/fnins.2013.00223.

  • Drazen, D., Lichtsteiner, P., Hafliger, P., Delbruck, T., & Jensen, A. (2011). Toward real-time particle tracking using an event-based dynamic vision sensor. Experiments in Fluids, 51(5), 1465–1469. https://doi.org/10.1007/s00348-011-1207-y.

  • Engel, J., Schöps, T., & Cremers, D. (2014). LSD-SLAM: Large-scale direct monocular SLAM. In European Conference on Computer Vision (ECCV). https://doi.org/10.1007/978-3-319-10605-2_54.

  • Engel, J., Koltun, V., & Cremers, D. (2017). Direct sparse odometry. IEEE Transactions on Pattern Analysis and Machine Intelligence, PP(99), 1. https://doi.org/10.1109/TPAMI.2017.2658577.

  • Forster, C., Pizzoli, M., & Scaramuzza, D. (2014). SVO: Fast semi-direct monocular visual odometry. In IEEE International Conference on Robotics and Automation (ICRA) (pp. 15–22). https://doi.org/10.1109/ICRA.2014.6906584.

  • Gallego, G., Lund, E. A., Mueggler, E., Rebecq, H., Delbruck, T., & Scaramuzza, D. (2017). Event-based, 6-DOF camera tracking from photometric depth maps. IEEE Transactions on Pattern Analysis and Machine Intelligence. https://doi.org/10.1109/TPAMI.2017.2769655.

  • Hartley, R., & Zisserman, A. (2003). Multiple view geometry in computer vision (2nd ed.). Cambridge: Cambridge University Press.

  • Kim, H., Handa, A., Benosman, R., Ieng, S.-H., & Davison, A. J. (2014). Simultaneous mosaicing and tracking with an event camera. In British Machine Vision Conference (BMVC). https://doi.org/10.5244/C.28.26.

  • Kim, H., Leutenegger, S., & Davison, A. J. (2016). Real-time 3D reconstruction and 6-DoF tracking with an event camera. In European Conference on Computer Vision (ECCV) (pp. 349–364). https://doi.org/10.1007/978-3-319-46466-4_21.

  • Kogler, J., Humenberger, M., & Sulzbachner, C. (2011a). Event-based stereo matching approaches for frameless address event stereo data. In International Symposium on Advances in Visual Computing (ISVC) (pp. 674–685). https://doi.org/10.1007/978-3-642-24028-7_62.

  • Kogler, J., Sulzbachner, C., Humenberger, M., & Eibensteiner, F. (2011b). Address-event based stereo vision with bio-inspired silicon retina imagers. In Advances in Theory and Applications of Stereo Vision (pp. 165–188). InTech. https://doi.org/10.5772/12941.

  • Kueng, B., Mueggler, E., Gallego, G., & Scaramuzza, D. (2016). Low-latency visual odometry using event-based feature tracks. In IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) (pp. 16–23). Daejeon, Korea. https://doi.org/10.1109/IROS.2016.7758089.

  • Lagorce, X., Orchard, G., Galluppi, F., Shi, B. E., & Benosman, R. (2016). HOTS: A hierarchy of event-based time-surfaces for pattern recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence. https://doi.org/10.1109/TPAMI.2016.2574707.

  • Lee, J., Delbruck, T., Park, P. K. J., Pfeiffer, M., Shin, C.-W., Ryu, H., & Kang, B. C. (2012). Live demonstration: Gesture-based remote control using stereo pair of dynamic vision sensors. In IEEE International Symposium on Circuits and Systems (ISCAS). https://doi.org/10.1109/ISCAS.2012.6272144.

  • Lee, J. H., Delbruck, T., Pfeiffer, M., Park, P. K. J., Shin, C.-W., Ryu, H., et al. (2014). Real-time gesture interface based on event-driven processing from stereo silicon retinas. IEEE Transactions on Neural Networks and Learning Systems, 25(12), 2250–2263. https://doi.org/10.1109/TNNLS.2014.2308551.

  • Lichtsteiner, P., Posch, C., & Delbruck, T. (2008). A 128 \(\times \) 128 120 dB 15 \(\mu \)s latency asynchronous temporal contrast vision sensor. IEEE Journal of Solid-State Circuits, 43(2), 566–576. https://doi.org/10.1109/JSSC.2007.914337.

  • Litzenberger, M., Belbachir, A. N., Donath, N., Gritsch, G., Garn, H., Kohn, B., Posch, C., & Schraml, S. (2006). Estimation of vehicle speed based on asynchronous data from a silicon retina optical sensor. In IEEE Intelligent Transportation Systems Conference (pp. 653–658). https://doi.org/10.1109/ITSC.2006.1706816.

  • Matsuda, N., Cossairt, O., & Gupta, M. (2015). MC3D: Motion contrast 3D scanning. In IEEE International Conference on Computational Photography (ICCP) (pp. 1–10). https://doi.org/10.1109/ICCPHOT.2015.7168370.

  • Mueggler, E., Huber, B., & Scaramuzza, D. (2014). Event-based, 6-DOF pose tracking for high-speed maneuvers. In IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) (pp. 2761–2768). https://doi.org/10.1109/IROS.2014.6942940.

  • Mueggler, E., Rebecq, H., Gallego, G., Delbruck, T., & Scaramuzza, D. (2017). The event-camera dataset and simulator: Event-based data for pose estimation, visual odometry, and SLAM. International Journal of Robotics Research, 36, 142–149. https://doi.org/10.1177/0278364917691115.

  • Orchard, G., Meyer, C., Etienne-Cummings, R., Posch, C., Thakor, N., & Benosman, R. (2015). HFirst: A temporal approach to object recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence, 37(10), 2028–2040. https://doi.org/10.1109/TPAMI.2015.2392947.

  • Piatkowska, E., Belbachir, A. N., & Gelautz, M. (2013). Asynchronous stereo vision for event-driven dynamic stereo sensor using an adaptive cooperative approach. In International Conference on Computer Vision Workshops (ICCVW) (pp. 45–50). https://doi.org/10.1109/ICCVW.2013.13.

  • Piatkowska, E., Belbachir, A. N., Schraml, S., & Gelautz, M. (2012). Spatiotemporal multiple persons tracking using dynamic vision sensor. In IEEE International Conference on Computer Vision and Pattern Recognition Workshop (pp. 35–40). https://doi.org/10.1109/CVPRW.2012.6238892.

  • Pizzoli, M., Forster, C., & Scaramuzza, D. (2014). REMODE: Probabilistic, monocular dense reconstruction in real time. In IEEE International Conference on Robotics and Automation (ICRA) (pp. 2609–2616). https://doi.org/10.1109/ICRA.2014.6907233.

  • Rebecq, H., Gallego, G., & Scaramuzza, D. (2016). EMVS: Event-based multi-view stereo. In British Machine Vision Conference (BMVC). https://doi.org/10.5244/C.30.63.

  • Rebecq, H., Horstschäfer, T., Gallego, G., & Scaramuzza, D. (2017). EVO: A geometric approach to event-based 6-DOF parallel tracking and mapping in real-time. IEEE Robotics and Automation Letters, 2, 593–600. https://doi.org/10.1109/LRA.2016.2645143.

  • Reinbacher, C., Graber, G., & Pock, T. (2016). Real-time intensity-image reconstruction for event cameras using manifold regularisation. In British Machine Vision Conference (BMVC). https://doi.org/10.5244/C.30.9.

  • Rogister, P., Benosman, R., Ieng, S.-H., Lichtsteiner, P., & Delbruck, T. (2012). Asynchronous event-based binocular stereo matching. IEEE Transactions on Neural Networks and Learning Systems, 23(2), 347–353. https://doi.org/10.1109/TNNLS.2011.2180025.

  • Rueckauer, B., & Delbruck, T. (2016). Evaluation of event-based algorithms for optical flow with ground-truth from inertial measurement sensor. Frontiers in Neuroscience. https://doi.org/10.3389/fnins.2016.00176.

  • Rusu, R. B., & Cousins, S. (2011). 3D is here: Point cloud library (PCL). In IEEE International Conference on Robotics and Automation (ICRA). Shanghai, China. https://doi.org/10.1109/ICRA.2011.5980567.

  • Schraml, S., Belbachir, A. N., & Bischof, H. (2015). Event-driven stereo matching for real-time 3D panoramic vision. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 466–474). https://doi.org/10.1109/CVPR.2015.7298644.

  • Schraml, S., Belbachir, A. N., Milosevic, N., & Schön, P. (2010). Dynamic stereo vision system for real-time tracking. In IEEE International Symposium on Circuits and Systems (ISCAS) (pp. 1409–1412). https://doi.org/10.1109/ISCAS.2010.5537289.

  • Seitz, S. M., Curless, B., Diebel, J., Scharstein, D., & Szeliski, R. (2006). A comparison and evaluation of multi-view stereo reconstruction algorithms. In Proceedings of the IEEE International Conference on Computer Vision and Pattern Recognition. https://doi.org/10.1109/CVPR.2006.19.

  • Szeliski, R. (2010). Computer vision: Algorithms and applications. Texts in computer science. London: Springer.

  • Szeliski, R., & Golland, P. (1999). Stereo matching with transparency and matting. International Journal of Computer Vision, 32(1), 45–61. https://doi.org/10.1023/A:1008192912624.

  • Vogiatzis, G., & Hernández, C. (2011). Video-based, real-time multi view stereo. Image and Vision Computing, 29(7), 434–441. https://doi.org/10.1016/j.imavis.2011.01.006.

  • Weikersdorfer, D., & Conradt, J. (2012). Event-based particle filtering for robot self-localization. In IEEE International Conference on Robotics and Biomimetics (ROBIO) (pp. 866–870). https://doi.org/10.1109/ROBIO.2012.6491077.

  • Weikersdorfer, D., Hoffmann, R., & Conradt, J. (2013). Simultaneous localization and mapping for event-based vision systems. In International Conference on Computer Vision Systems (ICVS) (pp. 133–142). https://doi.org/10.1007/978-3-642-39402-7_14.

  • Wiesmann, G., Schraml, S., Litzenberger, M., Belbachir, A. N., Hofstatter, M., & Bartolozzi, C. (2012). Event-driven embodied system for feature extraction and object recognition in robotic applications. In IEEE International Conference on Computer Vision and Pattern Recognition Workshop (pp. 76–82). https://doi.org/10.1109/CVPRW.2012.6238898.

  • Wolberg, G. (1990). Digital image warping. California: Wiley-IEEE Computer Society Press.

  • Zhang, Z. (2000). A flexible new technique for camera calibration. IEEE Transactions on Pattern Analysis and Machine Intelligence, 22(11), 1330–1334. https://doi.org/10.1109/34.888718.

Acknowledgements

This research was funded by the DARPA FLA Program, the National Center of Competence in Research (NCCR) Robotics through the Swiss National Science Foundation and the SNSF-ERC Starting Grant.

Author information

Corresponding author

Correspondence to Henri Rebecq.

Additional information

Communicated by Edwin Hancock, Richard Wilson, Will Smith, Adrian Bors, Nick Pears.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (mp4 23797 KB)

A Relation of Area Elements due to a 2D Homography

This section provides a useful result on how a 2D transformation given by a homography affects the area element.

Fig. 18 Result 1. A homography \(\mathtt {H}\) maps points to points and lines to lines. Area elements are transformed according to \(dA'=|\mathtt {J}|dA\), where \(\mathtt {J}\) is the Jacobian of the homography \(\mathtt {H}\)

Result 1

(Jacobian of a Homography) Let \(\mathtt {H}\) be a 2D homography transforming points \(\mathbf {x}\doteq (x,y,1)^{\top }\) to points \(\mathbf {x}'\doteq (x',y',1)^{\top }\) in homogeneous coordinates: \(\mathbf {x}'\sim \mathtt {H}\mathbf {x}\), where \(\sim \) means equality up to a non-zero scale factor. The determinant of the Jacobian of the transformation \((x,y){\mathop {\mapsto }\limits ^{\mathtt {H}}}(x',y')\) (in Euclidean coordinates),

$$\begin{aligned} \mathtt {J}\doteq \frac{\partial (x',y')}{\partial (x,y)} = \left( \begin{array}{cc} \frac{\partial x'}{\partial x} &{} \quad \frac{\partial x'}{\partial y}\\ \frac{\partial y'}{\partial x} &{} \quad \frac{\partial y'}{\partial y} \end{array}\right) \end{aligned}$$
(16)

is

$$\begin{aligned} \det (\mathtt {J}) = \frac{\det (\mathtt {H})}{(\mathbf {e}_{3}^{\top }\mathtt {H}\mathbf {x})^{3}}, \end{aligned}$$
(17)

where \(\mathbf {e}_{3}=(0,0,1)^{\top }\) is the 3-rd vector of the canonical basis in \(\mathbb {R}^{3}\).

The determinant of the Jacobian (17) provides the relation between the area elements in (xy) and in \((x',y')\) according to the geometric transformation given by the homography \(\mathtt {H}\),

$$\begin{aligned} dA'\doteq dx'dy' = \det (\mathtt {J}) \,dxdy = \det (\mathtt {J}) \,dA, \end{aligned}$$
(18)

as illustrated in Fig. 18.

Proof

Let \(\mathtt {H}=(h_{ij})\) be the homogeneous matrix of the homography, and let \(\mathbf {h}^\top _{3}\doteq \mathbf {e}_{3}^{\top }\mathtt {H}\) be its third row. Writing out explicitly the transformed variables

$$\begin{aligned} x'=\frac{h_{11}x+h_{12}y+h_{13}}{h_{31}x+h_{32}y+h_{33}},\quad y'=\frac{h_{21}x+h_{22}y+h_{23}}{h_{31}x+h_{32}y+h_{33}}, \end{aligned}$$
(19)

we may compute the four elements of the Jacobian matrix (16):

$$\begin{aligned} \mathtt {J}= \frac{1}{\mathbf {h}_{3}^{\top }\mathbf {x}}\left( \begin{array}{cc} h_{11}-x'h_{31} &{} \quad h_{12}-x'h_{32}\\ h_{21}-y'h_{31} &{} \quad h_{22}-y'h_{32} \end{array}\right) \end{aligned}$$
(20)

Next, we compute the determinant of this matrix. Noting that \((h_{11}-x'h_{31})(h_{22}-y'h_{32})-(h_{12}-x'h_{32})(h_{21}-y'h_{31}) = \mathbf {x}'\cdot ((\mathtt {H}\mathbf {e}_{1})\times (\mathtt {H}\mathbf {e}_{2}))\) is a mixed product in terms of the first two columns of \(\mathtt {H}\), with \(\mathbf {e}_1=(1,0,0)^\top \) and \(\mathbf {e}_2=(0,1,0)^\top \), gives

$$\begin{aligned} \det \left( \mathtt {J}\right) =\frac{1}{(\mathbf {h}_{3}^{\top }\mathbf {x})^{2}}\,\mathbf {x}'\cdot \left( (\mathtt {H}\mathbf {e}_{1})\times (\mathtt {H}\mathbf {e}_{2})\right) . \end{aligned}$$
(21)

Substituting \(\mathbf {x}'= \mathtt {H}\mathbf {x}/ (\mathbf {h}_3^\top \mathbf {x})\) in the mixed product \(\mathbf {x}'\cdot ((\mathtt {H}\mathbf {e}_{1})\times (\mathtt {H}\mathbf {e}_{2})) = \det \left( \mathbf {x}',\mathtt {H}\mathbf {e}_{1},\mathtt {H}\mathbf {e}_{2}\right) \) and using the properties of the determinant, \(\det \left( \mathtt {H}\mathbf {x},\mathtt {H}\mathbf {e}_{1},\mathtt {H}\mathbf {e}_{2}\right) = \det (\mathtt {H})\det \left( \mathbf {x},\mathbf {e}_{1},\mathbf {e}_{2}\right) = \det (\mathtt {H})\), gives the desired result (17):

$$\begin{aligned} \det \left( \mathtt {J}\right) {\mathop {=}\limits ^{(21)}}\frac{\det \left( \mathtt {H}\mathbf {x},\mathtt {H}\mathbf {e}_{1},\mathtt {H}\mathbf {e}_{2}\right) }{(\mathbf {h}_{3}^{\top }\mathbf {x})^{3}} =\frac{\det (\mathtt {H})}{(\mathbf {e}_{3}^{\top }\mathtt {H}\mathbf {x})^{3}}. \end{aligned}$$
(22)

\(\square \)
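
Result 1 is easy to sanity-check numerically. The sketch below (illustrative, not part of the paper) compares the closed form (17) against a finite-difference Jacobian of the point map induced by an arbitrary non-singular homography; the chosen point and random seed are placeholders.

```python
import numpy as np

# Finite-difference check of Eq. (17): det(J) = det(H) / (e3^T H x)^3.
rng = np.random.default_rng(0)
H = rng.normal(size=(3, 3))                        # an arbitrary homography

def warp(p):
    """Apply H to the Euclidean point p = (x, y)."""
    q = H @ np.array([p[0], p[1], 1.0])
    return q[:2] / q[2]

x = np.array([0.3, -0.7])
eps = 1e-6
J = np.column_stack([(warp(x + eps * e) - warp(x - eps * e)) / (2 * eps)
                     for e in np.eye(2)])          # numerical Jacobian of (x,y) -> (x',y')
closed_form = np.linalg.det(H) / (H[2] @ np.array([x[0], x[1], 1.0])) ** 3
print(np.linalg.det(J), closed_form)               # the two numbers should agree
```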

A.1 Planar Homography

Next, we particularize the previous general Result 1 to the case of a planar homography induced by a plane in space.

Let us consider (1) two finite cameras (i.e., whose optical centers are not at infinity) with projection matrices given by \(\mathtt {P}=(\mathtt {I}|\mathbf {0})\) and \(\mathtt {P}'=(\mathtt {R}|\mathbf {t})\) in calibrated coordinates, and (2) a plane not passing through the optical centers of the cameras, with homogeneous coordinates \(\varvec{\pi }=(a,b,c,d)^\top =(\mathbf {n}^\top ,d)^\top \), where \(\mathbf {n}\) is the unit normal to the plane. The optical centers of \(\mathtt {P}\) and \(\mathtt {P}'\) are \(\mathbf {0}\) and \(\mathbf {C}=-\mathtt {R}^\top \mathbf {t}\), respectively. The planar homography from the image plane of \(\mathtt {P}\) to the image plane of \(\mathtt {P}'\) via the plane \(\varvec{\pi }\), such that \(\mathbf {x}' \sim \mathtt {H}\mathbf {x}\), is

$$\begin{aligned} \mathtt {H}_{\varvec{\pi }}(\mathtt {P},\mathtt {P}') \sim \mathtt {R}-\frac{1}{d}\mathbf {t}\mathbf {n}^{\top }=\mathtt {R}\left( \mathtt {I}+\frac{1}{d}\mathbf {C}\mathbf {n}^{\top }\right) , \end{aligned}$$
(23)

where \(\mathtt {I}\) is the identity matrix.

The planar homography from \(\mathtt {P}'\) to \(\mathtt {P}\) via the plane \(\varvec{\pi }\) is given by the inverse of (23):

$$\begin{aligned} \mathtt {H}_{\varvec{\pi }}(\mathtt {P}',\mathtt {P}) = \mathtt {H}^{-1}_{\varvec{\pi }}(\mathtt {P},\mathtt {P}') \sim \left( \mathtt {I}-\frac{1}{d+\mathbf {n}^{\top }\mathbf {C}}\mathbf {C}\mathbf {n}^{\top }\right) \mathtt {R}^{\top }. \end{aligned}$$
(24)
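
The following sketch (illustrative; the rotation, translation, plane normal and offset are randomly generated placeholders) builds the planar homography of (23) in calibrated coordinates and verifies numerically that the expression in (24) is indeed its inverse up to scale.

```python
import numpy as np

# Numeric check that Eq. (24) inverts the planar homography of Eq. (23), up to scale.
rng = np.random.default_rng(1)
Q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
R = Q if np.linalg.det(Q) > 0 else -Q              # a proper rotation for P' = (R | t)
t = rng.normal(size=3)                             # translation of camera P'
n = rng.normal(size=3); n /= np.linalg.norm(n)     # unit normal of the plane pi
C = -R.T @ t                                       # optical center of P'
d = 1.0 + abs(n @ C)                               # plane offset chosen so pi misses both centers

H_fwd = R - np.outer(t, n) / d                             # Eq. (23): image of P -> image of P'
H_bwd = (np.eye(3) - np.outer(C, n) / (d + n @ C)) @ R.T   # Eq. (24): image of P' -> image of P

prod = H_bwd @ H_fwd
print(prod / prod[0, 0])                           # should print (a scaled) identity matrix
```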

Fig. 19 Result 2. Relation of area elements induced by a planar homography: \(dA'=|\mathtt {J}|dA\), where \(\mathtt {J}\) is the Jacobian of the planar homography, and \(Z, Z'\) are the depths of the scene point \(\mathbf {X}\) with respect to the two cameras, respectively

Result 2

(Jacobian of a Planar Homography) For a planar homography (23), Result 1 becomes

$$\begin{aligned} \det \left( \mathtt {J}\right) =\left( \frac{Z}{Z'}\right) ^{3}\left( 1+\frac{\mathbf {C}\cdot \mathbf {n}}{d}\right) , \end{aligned}$$
(25)

where Z and \(Z'\) are the depths of the point \(\mathbf {X}\in \varvec{\pi }\), projecting on \(\mathbf {x}\) and \(\mathbf {x}'\), with respect to cameras \(\mathtt {P}\) and \(\mathtt {P}'\), respectively. This is illustrated in Fig. 19.

Proof

Let us compute the numerator and denominator of (17). Applying \(\det (\mathtt {R})=1=\det (\mathtt {I})\) and the matrix determinant lemma to (23) gives

$$\begin{aligned} \det (\mathtt {H}) =\det (\mathtt {R})\det \left( \mathtt {I}+\frac{1}{d}\mathbf {C}\mathbf {n}^{\top }\right) =1+\frac{1}{d}\mathbf {n}^{\top }\mathbf {C}. \end{aligned}$$
(26)

A point \(\mathbf {X}\doteq (X,Y,Z)^{\top }\) lies on the plane \(\varvec{\pi }\) if it satisfies

$$\begin{aligned} \mathbf {n}^{\top }\mathbf {X}+d = 0. \end{aligned}$$
(27)

The point \(\mathbf {X}\) expressed in the frame of \(\mathtt {P}'\) becomes

$$\begin{aligned} (X',Y',Z')^{\top }\doteq \mathbf {X}'=\mathtt {R}\mathbf {X}+\mathbf {t}=\mathtt {R}(\mathbf {X}-\mathbf {C}). \end{aligned}$$
(28)

Since \(\mathbf {x}Z=(x,y,1)^\top Z = \mathbf {X}\) and

$$\begin{aligned} \mathtt {H}\mathbf {X}{\mathop {=}\limits ^{(23)}} \mathtt {R}\left( \mathbf {X}+\frac{\mathbf {X}\cdot \mathbf {n}}{d}\mathbf {C}\right) {\mathop {=}\limits ^{(27)}}\mathtt {R}(\mathbf {X}-\mathbf {C}){\mathop {=}\limits ^{(28)}}\mathbf {X}', \end{aligned}$$
(29)

the denominator of (17) is given in terms of \(\mathbf {e}_{3}^{\top }\mathtt {H}\mathbf {x}{\mathop {=}\limits ^{(29)}} \mathbf {e}_{3}^{\top }\mathbf {X}'/Z\). Substituting this result and (26) in (17) gives (25). \(\square \)
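
As with Result 1, identity (25) can be checked numerically (again an illustrative sketch with placeholder values, not part of the paper): pick a point on \(\varvec{\pi }\), evaluate \(\det (\mathtt {J})\) from (17) using the representative \(\mathtt {H}\) of (23), and compare with \((Z/Z')^{3}(1+\mathbf {C}\cdot \mathbf {n}/d)\).

```python
import numpy as np

# Numeric check of Eq. (25) against Eq. (17) for the planar homography of Eq. (23).
rng = np.random.default_rng(2)
Q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
R = Q if np.linalg.det(Q) > 0 else -Q              # rotation of camera P' = (R | t)
t = rng.normal(size=3)
n = rng.normal(size=3); n /= np.linalg.norm(n)     # unit plane normal
C = -R.T @ t                                       # optical center of P'
d = 1.0 + abs(n @ C)                               # plane offset; pi misses both optical centers
H = R - np.outer(t, n) / d                         # the representative of Eq. (23)

X = rng.normal(size=3)
X -= (n @ X + d) * n                               # project onto the plane: n.X + d = 0
x = np.append(X[:2] / X[2], 1.0)                   # its homogeneous image point in camera P
Z = X[2]                                           # depth in camera P
Zp = (R @ X + t)[2]                                # depth in camera P'

det_J_eq17 = np.linalg.det(H) / (H[2] @ x) ** 3    # Eq. (17)
det_J_eq25 = (Z / Zp) ** 3 * (1.0 + (C @ n) / d)   # Eq. (25)
print(det_J_eq17, det_J_eq25)                      # the two values should agree
```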

Cite this article

Rebecq, H., Gallego, G., Mueggler, E. et al. EMVS: Event-Based Multi-View Stereo—3D Reconstruction with an Event Camera in Real-Time. Int J Comput Vis 126, 1394–1414 (2018). https://doi.org/10.1007/s11263-017-1050-6
