Introduction

The development of high-density memory and computing has been a major focus in the field of spintronics1,2,3,4,5,6,7,8,9,10. Ferromagnets with perpendicular magnetic anisotropy (PMA) are considered ideal material platforms for realizing spintronic devices11,12,13,14, owing to their scalability, their compatibility with CMOS technology, and the ability to manipulate their magnetization through charge-spin conversion mechanisms15,16 such as the spin Hall effect (SHE)17,18,19,20,21,22,23 and the interfacial Edelstein effect24,25. Coupling the charge-spin conversion with the magnetization in ferromagnetic systems with PMA would provide a unique knob for reconfiguring the charge-spin conversion, introducing an additional pathway for developing novel spintronic devices. Recently, it has been proposed that such magnetization-determined charge-spin conversion can be realized through the time-reversal-odd (T-odd) SHE26,27, in which switching the magnetization reverses the spin Hall current. However, magnetic materials hosting such a T-odd SHE are still rare and limited to antiferromagnetic systems, i.e., the non-collinear bulk antiferromagnet Mn3Sn18,28 and the collinear bulk antiferromagnet RuO220,27,29. The T-odd SHE in ferromagnetic systems remains unexplored so far.

Van der Waals (vdW) ferromagnetic heterostructures composed of vdW materials with PMA and strong spin-orbit coupling (SOC) offer an unprecedented opportunity for exploring magnetization-related spin-charge conversion mechanisms. Free of the lattice-mismatch and interface-disorder issues found in traditional ferromagnetic heterostructures30, vdW interfaces are atomically sharp and free of dangling bonds and structural defects, providing more freedom and better control in symmetry engineering31,32,33. Notably, such symmetry engineering at a vdW heterointerface plays a vital role in generating unconventional charge-spin conversion mechanisms, such as the non-orthogonal SHE for realizing field-free current-induced perpendicular magnetization reversal34. Moreover, strong interaction between the two constituent materials, despite the existence of a vdW gap region, can lead to a coherent distribution of active electronic states across the vdW interface. In this way, vdW ferromagnetic heterostructures offer ideal platforms to explore fascinating physics not available in traditional bulk materials and are of paramount importance to the development of novel spin-based memory and computing technologies35,36.

In this work, we present the discovery of a time-reversal-odd (T-odd) charge-spin conversion over the vdW interface of a Fe3GeTe2/MoTe2 heterostructure and demonstrate its promising application in in-memory computing devices. We observe that a charge current induces a pure spin current that is locked to the perpendicular magnetization of Fe3GeTe2. Our findings show a high efficiency of charge-spin conversion with a large spin Hall angle of 0.67. This giant interfacial magnetic spin Hall effect (interfacial-MSHE) arises from the interplay of symmetry breaking and band geometric properties at the interface of the Fe3GeTe2/MoTe2 vdW heterostructure, as verified by our experiments and theoretical calculations. Based on this interfacial spin-charge conversion, we propose a spintronic device called the memtransformer and demonstrate its effectiveness in multiply-accumulate operations and an artificial neural network. Our work paves the way for exploring a wealth of symmetry-breaking-related physics and new computing technologies.

Results

SHE in Fe3GeTe2/MoTe2 vdW heterostructure

We created a vdW interface by utilizing vdW layered-structure materials Fe3GeTe2 (FGT) and MoTe2. The fabrication process is described in further detail in the Methods section. FGT is a ferromagnetic metal with strong perpendicular magnetic anisotropy37 and exhibits a Curie temperature as high as 220 K in bulk form38,39,40,41,42. MoTe2 as a nonmagnetic metal is known for its high charge-spin conversion efficiency43,44 and robust spin diffusion over several micrometers44,45, making it suitable for efficient detection of spin current through inverse spin Hall effect (ISHE). Figure 1a schematically shows the multi-terminal vdW heterostructure device that we fabricated, which was used to conduct nonlocal electrical measurements to explore the charge-spin interconversion effect. Optical image of a typical device is shown in Supplementary Fig. 3.

Fig. 1: Nonlocal transport measurement configuration and characterization of the SHE.

a Schematic of the FGT/MoTe2 device for transport measurements. The external magnetic field is swept along the z axis. The numbers represent the terminal positions for the measurements. b The corresponding inverse spin Hall signal as a function of out-of-plane magnetic field. The voltage jump is denoted as \(\Delta {V}_{{ISHE}}\). c The corresponding spin Hall signal as a function of out-of-plane magnetic field.

To investigate the spin diffusion in the MoTe2 used in the vdW heterostructure, we perform an experiment based on a non-local measurement scheme46 in which a charge current is applied between terminal 1 on the FGT flake and terminal 5 on the MoTe2 flake. In this case, the charge current flows from FGT into MoTe2, and the resulting spin polarization is injected into the MoTe2 and diffuses to the MoTe2 Hall bar (terminals 3 and 4). We monitor the spin-diffusion-induced charge imbalance through the voltage signal between terminals 3 and 4, as shown in Fig. 1b. The measurement shows a sharp jump in the voltage signal (denoted as \(\Delta {V}_{{ISHE}}\)) as the out-of-plane magnetization of the FGT is reversed by the applied perpendicular magnetic field. Our results reveal a large non-local inverse spin Hall signal of 1.6 Ω, which is one order of magnitude larger than that measured in local ferromagnetic/heavy metal nanostructures47. Such a large inverse spin Hall signal indicates that MoTe2 has a long spin diffusion length, since the channel length between the heterostructure region and the MoTe2 Hall cross (2 \(\mu m\)) is three orders of magnitude larger than the spin diffusion length found in heavy metals17,48,49. The long spin diffusion length of MoTe2 is further verified by our length-dependent transport measurements (Supplementary Note 1). Nevertheless, a comprehensive understanding of such a long spin diffusion length in MoTe2 with strong spin-orbit coupling requires more theoretical and experimental effort in the future. As shown in Fig. 1c, we applied a charge current between terminals 3 and 4 on MoTe2 to generate a pure spin current along the y direction, resulting in a spin accumulation at the FGT/MoTe2 interface that can be probed by the setup shown in the inset of Fig. 1c. The pronounced magnetic hysteresis loops of the spin Hall and inverse spin Hall signals demonstrate the great potential of using MoTe2 as a highly efficient nonlocal spin detector for exploring the charge-spin conversion at the FGT/MoTe2 heterostructure region.

We then drive a charge current I12 between terminals 1 and 2 on the FGT strip and use the nonlocal spin detector to investigate the corresponding charge-spin conversion at the heterostructure region. The nonlocal voltage signal is monitored between Hall terminals 3 and 4 on the MoTe2 flake as we vary the magnetic field, with the results shown in Fig. 2a. The measured nonlocal voltage signal displays a rectangular hysteresis loop in response to the applied magnetic field. At a fixed M direction (and B = 0), the polarity of this hysteresis loop reverses when flipping the direction of the driving current I12 (Fig. 2b). It is noteworthy that the sharp jumps in the voltage signal occur at the switching field of FGT, demonstrating the T-odd nature of the spin current \(({J}_{y}^{{S}_{z}})\) induced by the injected charge current along the FGT strip. This implies that the direction of the resulting spin current is flipped when the magnetization of FGT is reversed. This T-odd signal is clear evidence for the MSHE.

Fig. 2: Nonlocal transport measurements of the MSHE.

a The measurement of MSHE with the measurement configuration shown in the inset. The voltage jump in this \({V}_{{{{{{\rm{NL}}}}}}}\) versus B hysteresis loop is denoted as \(\triangle V\). b The measurement of MSHE with negative bias current. c The measurement of inverse MSHE in the same device. d \(\triangle V\) as a function of bias current I. The line is a linear fit to the experimental data.

Our device setup is capable of detecting the inverse effect of MSHE. By applying a charge current between terminals 3 and 4 on the MoTe2 flake, depicted in the inset of Fig. 2c, a spin current (\({J}_{y}^{{S}_{z}}\)) is generated through the spin Hall effect in MoTe2. The resulting spin current diffuses from right to left side, and then induces a charge current via the inverse MSHE at the heterostructure region, leading to a T-odd voltage signal between terminals 1 and 2 on the FGT flake, as shown in Fig. 2c. The measured voltage signals exhibit a rectangular hysteresis loop with comparable signal magnitude and opposite switching polarity compared to that shown in Fig. 2a, which is consistent with the Onsager reciprocity relation50.

The observed spin-charge conversion at the heterostructure region cannot be attributed to conventional charge-spin conversion mechanisms. The conventional spin Hall effect (whether from FGT, from MoTe2 or from the junction) is T-even and hence cannot lead to this T-odd signal. We also carry out a careful analysis showing that the Rashba-Edelstein effect contributes negligibly to the observed T-odd signal (Supplementary Note 2). In addition, the magnitude ∆V of this T-odd signal exhibits a linear dependence on the driving current \({I}_{12}\) (Fig. 2d), which rules out thermoelectric contributions in our measurements, as they would result in a quadratic dependence of the voltage signal on the bias current.

Additionally, we carry out a control experiment by replacing MoTe2 with graphite in the device. As shown in the inset of Supplementary Fig. 4, when a charge current is applied along the FGT strip, no signal jump is observed in the graphite nonlocal detector, even though a background signal is still present. This lack of a signal jump can be attributed to the negligible spin-orbit coupling in graphite, which only allows for the detection of charge diffusion rather than spin diffusion in the nonlocal channel. This confirms that the T-odd signal observed in our previous measurements cannot be explained by charge diffusion in the MoTe2 channel or by the anomalous Hall effect and must result from a spin-related phenomenon.

The origin of the interfacial-MSHE

Our symmetry analysis and calculations reveal that the MSHE in the FGT/MoTe2 heterostructure is a result of the interplay between symmetry breaking and band geometric properties at the vdW interface. In our measurement configuration, the ISHE in the MoTe2 is sensitive only to the spin current \({J}_{y}^{{S}_{z}}\) with spin component out of the transport plane, i.e. spin-z polarized43. This indicates that the T-odd spin Hall conductivity (SHC) \({\sigma }_{{yx}}^{{S}_{z}}\) dominates the MSHE in our device. Based on a rigorous symmetry analysis51,52,53, the bulk FGT used in our work is forbidden from having a nonzero \({\sigma }_{{yx}}^{{S}_{z}}\) due to multiple magnetic symmetries (e.g. C2yT and MxT), as shown in Fig. 3a. However, the relevant symmetry constraints are all broken by stacking the symmetry-mismatched heterointerface formed by FGT and MoTe2, in which the mirror planes of FGT and MoTe2 are misaligned (see more details in Supplementary Note 3). Therefore, this interfacial symmetry breaking allows a nonzero T-odd \({\sigma }_{{yx}}^{{S}_{z}}\), making the observation of the interfacial-MSHE possible in the vdW heterostructure as depicted in Fig. 3b.

Fig. 3: The mechanism for interfacial-MSHE.

a Illustration of crystal structure of bulk FGT with symmetry generator C2yT and MxT, which prohibit T-odd SHC \({\sigma }_{{yx}}^{{S}_{z}}\). b Illustration that the T-odd SHC \({\sigma }_{{yx}}^{{S}_{z}}\) is allowed by the symmetry-breaking at the interface of FGT/MoTe2 heterostructure. c Calculated k-resolved spin current dipole of \({\sigma }_{{yx}}^{{S}_{z}}\) in FGT/MoTe2 heterostructure. The color code represents the magnitude of the spin current dipole. Different from that in bulk FGT, the distribution of k-resolved spin current dipole \({D}_{{yx}}^{{S}_{z}}\) in FGT/MoTe2 heterostructure results in non-zero integration of spin current dipole and non-zero magnetic spin Hall conductivity. d Calculated temperature dependence of T-odd SHC \({\sigma }_{{yx}}^{{S}_{z}}\), which is in good agreement with our experimental results. The calculated τ in our system is 0.04 ps–0.33 ps. The error bars are determined by the noise level of the measured nonlocal voltage signal. e Schematics of interfacial-MSHE in a ferromagnetic metal/nonmagnetic metal bilayer. M is the magnetization of ferromagnet. \({J}^{S}\) is the spin current over the interface. \({J}^{C}\) is the applied charge current. The gray box represents the nonmagnetic layer and the orange box represents the magnetic layer with perpendicular magnetization in +z direction. f Schematics of the interfacial-MSHE in the same bilayer with the magnetization direction of the magnetic layer reversed. As a time-reversal odd quantity, the direction of the spin current induced by interfacial-MSHE reverses compared to that in e.

Theoretically, the T-odd SHC can be described by the equation (the derivation is presented in the Methods section)

$${\sigma }_{{ab}}^{{S}_{c}}=\frac{e\tau }{{{\hslash }}}{D}_{{ab}}^{{S}_{c}}=\frac{e\tau }{{{\hslash }}}\mathop{\sum }\limits_{n}\int \frac{{d}^{d}k}{{\left(2\pi \right)}^{d}}{f}_{0}{\partial }_{{k}_{b}}{j}_{a,n}^{{S}_{c}}\left({{{{{\boldsymbol{k}}}}}}\right),$$
(1)

where \(\tau\) is the relaxation time, \({j}_{a,{n}}^{{S}_{c}}\left({{{{{\boldsymbol{k}}}}}}\right)=\left\langle {u}_{n}\left({{{{{\boldsymbol{k}}}}}}\right)\left|\frac{1}{2}\left\{{\hat{v}}_{a},{\hat{s}}_{c}\right\}\right|{u}_{n}\left({{{{{\boldsymbol{k}}}}}}\right)\right\rangle\) is the average spin current density for band eigenstate \(|{u}_{n}\left({{{{{\boldsymbol{k}}}}}}\right)\rangle\), \({\hat{v}}_{a}\,{{{{{\rm{and}}}}}}\,{\hat{s}}_{c}\) are the velocity and spin operators, \(d\) is the dimension of the system, and \({f}_{0}\) is the Fermi distribution function. \({D}_{{ab}}^{{S}_{c}}={\sum }_{n}\int \frac{{d}^{d}k}{{\left(2\pi \right)}^{d}}{f}_{0}{\partial }_{{k}_{b}}{j}_{a,{n}}^{{S}_{c}}\left({{{{{\boldsymbol{k}}}}}}\right)\) can be regarded as a spin current dipole in momentum space, similar to the Berry curvature dipole54,55,56. As \({j}_{a,n}^{{S}_{c}}({{{{{\boldsymbol{k}}}}}})\) is an even function in k-space under the time-reversal symmetry, the resulting spin current dipole \({D}_{{ab}}^{{S}_{c}}\) is an odd function. This behavior is consistent with the ferromagnetic systems and the sign of spin current dipole flips with the magnetic order parameter (see more details in Methods). It is important to note that the spin current dipole is dependent solely on the band structure and only exists in magnetic systems, disappearing in nonmagnetic systems that maintain time-reversal symmetry. This makes the spin current dipole an intrinsic property of each ferromagnetic material.
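The parity argument above can be illustrated numerically. The following is a minimal sketch of a one-dimensional toy model, not the first-principles calculation used in this work: the band dispersion `-cos(k) + m*sin(k)`, the spin-current density `cos(k)`, and the parameter `m` (standing in for the magnetic order parameter) are all hypothetical choices made only to show that the integrated dipole vanishes when time-reversal symmetry holds (`m = 0`) and flips sign with `m`.

```python
import numpy as np

def spin_current_dipole(m, n_k=2000, mu=0.0, kT=0.05):
    """Toy 1D estimate of D = integral dk/(2*pi) f0(eps(k)) d/dk j(k)."""
    # Midpoint grid over one Brillouin zone, symmetric about k = 0.
    k = -np.pi + 2.0 * np.pi * (np.arange(n_k) + 0.5) / n_k
    # Hypothetical band: the m*sin(k) term is odd in k and mimics
    # broken time-reversal symmetry; m = 0 restores eps(k) = eps(-k).
    energy = -np.cos(k) + m * np.sin(k)
    # Equilibrium Fermi distribution f0.
    f0 = 1.0 / (np.exp((energy - mu) / kT) + 1.0)
    # Hypothetical spin-current density j(k) = cos(k), even in k,
    # so its k-derivative is odd in k.
    dj_dk = -np.sin(k)
    # The mean over a uniform grid approximates the integral dk/(2*pi).
    return np.mean(f0 * dj_dk)

d_zero = spin_current_dipole(0.0)    # time-reversal symmetric case
d_up = spin_current_dipole(0.5)      # "magnetization up"
d_down = spin_current_dipole(-0.5)   # "magnetization down"
print(d_zero, d_up, d_down)
```

In this toy model the dipole is zero for `m = 0` because an even occupation multiplies an odd integrand, and reversing `m` mirrors the occupied states in k and therefore reverses the sign of the result.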

To evaluate the spin current dipole and the spin Hall conductivity, we use the above equation and conduct first-principles calculations. Our calculation results, shown in Fig. 3c, reveal a nonzero spin Hall conductivity \({\sigma }_{{yx}}^{{S}_{z}}\) at the interface of the FGT/MoTe2 vdW heterostructure, while it vanishes in bulk FGT (see Supplementary Fig. 5). This confirms our earlier symmetry analysis. The k-resolved spin current dipole \({D}_{{yx}}^{{S}_{z}}\) in the FGT/MoTe2 bilayer system reveals that the spin properties of materials can be manipulated by tailoring the distribution of \({J}_{y}^{{S}_{z}}\) in momentum space. Additionally, we calculate the temperature dependence of \({\sigma }_{{yx}}^{{S}_{z}}\) in the FGT/MoTe2 vdW heterostructure, as shown in Fig. 3d. Our calculations show a monotonic decrease in the spin Hall conductivity with increasing temperature. To validate these results, we also measure the non-local voltage at different temperatures (as shown in Supplementary Fig. 6) using the configuration shown in the inset of Fig. 2a and extract the temperature dependence of the spin Hall conductivity at the interface of the FGT/MoTe2 heterostructure by using the following equation

$${\sigma }_{{SH}}^{{interface}}=\frac{1}{2}B\triangle {R}_{{NL}}\frac{1}{{\rho }_{{{MoTe}}_{2}}^{2}}\frac{1}{{\rho }_{{FGT}}}\frac{1}{{\sigma }_{{SH}}},$$
(2)

where \({\rho }_{{{MoTe}}_{2}}\) and \({\rho }_{{FGT}}\) are the resistivities of MoTe2 and FGT, \({\sigma }_{{SH}}\) is the intrinsic spin Hall conductivity of MoTe2, \(\triangle {R}_{{NL}}\) (\(=\Delta V/I\)) is the resistance jump of the hysteresis loops shown in Supplementary Fig. 6, and B is an interface-related constant. Our calculation result (solid line) is in good agreement with the experimental data (symbols), which suggests that the unique interface of the vdW heterostructure is responsible for the observed MSHE. In addition, by comparing \(\triangle {R}_{{ISHE}}\) (\(=\Delta {V}_{{ISHE}}/I\)) and the MSHE-related \(\triangle {R}_{{NL}}\) measured in the same device, we estimate that the spin Hall angle is 0.67 in our device (see Methods), indicating a highly efficient charge-spin conversion mechanism. We note that the observed magnetic spin Hall signal is highly reproducible and independent of the sample thickness (see Supplementary Note 4 and Supplementary Fig. 7), supporting its intrinsic origin from the spin current dipole of the electronic band structure at the symmetry-mismatched heterointerface.

Based on the above analysis, the interfacial-MSHE observed in Fig. 2a can be explained as follows. At the interface of the FGT/MoTe2 vdW heterostructure, both time-reversal and spatial-inversion symmetries are broken. This symmetry breaking tailors the band geometry of FGT/MoTe2, causing a redistribution of spin current in k-space and resulting in a non-zero T-odd transverse spin current. This spin current is generated from the injected charge current in the FGT and flows over the interface. Since the spin current is coherently generated over the interface, it does not need to cross the interface and suffer scattering there, which leads to a large MSHE response. Importantly, this spin current is locked to the magnetization of the FGT, as illustrated in Figs. 3e and 3f. As a result, the direction of the generated spin current can be reversed by the applied perpendicular magnetic field, leading to the rectangular hysteresis loop of the voltage signal seen in Fig. 2a.

Neuromorphic computing based on nonvolatile charge-spin conversion

We propose a proof-of-concept spintronic device, referred to as a “memtransformer”, that utilizes the interfacial-MSHE at the junction of the FGT/MoTe2 vdW heterostructure and the inverse spin Hall effect in the MoTe2. The memtransformer, shown in Fig. 4a, uses a voltage signal as both input (\({V}^{{in}}\)) and output (\({V}^{{out}}\)). When the magnetization points upwards, the input voltage signal generates a spin current through the MSHE over the vdW interface. The resulting spin current, with its upward polarization, carries the memory information associated with the magnetization and transforms the input voltage into a positive voltage signal, which is detected by the inverse spin Hall effect in the MoTe2. In this way, the spin current acts as a linear, nonvolatile transformer that converts the input voltage signal into the output voltage signal. This linear, nonvolatile transformation between the input and output electrical signals enables the memtransformer to perform multiplication. If the magnetization is switched downwards, the resulting spin current with downward polarization generates a negative output voltage signal, which is also suitable for multiplication. Therefore, the memtransformer can be used as a building block for in-memory computing by incorporating it into electrical circuits.

Fig. 4: Demonstration of multiply accumulate operation using memtransformer and BCNN for handwritten digit recognition.

a Schematics of memtransformer based on MSHE. b \({V}^{{out}}\) as a function of \({V}^{{in}}\) with positive and negative weights in single memtransformer device. c The measured values \({V}_{{measured}}^{{out}}\) as a function of their calculated values \({V}_{{ex}{pected}}^{{out}}(={w}_{1}{V}_{1}^{{in}}+{w}_{2}{V}_{2}^{{in}})\) in array comprised by two memtransformer devices. The weights are denoted as (\({{{{{{\rm{w}}}}}}}_{1}^{+}\),\(\,{{{{{{\rm{w}}}}}}}_{2}^{+}\)), (\({{{{{{\rm{w}}}}}}}_{1}^{+}\),\(\,{{{{{{\rm{w}}}}}}}_{2}^{-}\)), (\({{{{{{\rm{w}}}}}}}_{1}^{-}\),\(\,{{{{{{\rm{w}}}}}}}_{2}^{+}\)), (\({{{{{{\rm{w}}}}}}}_{1}^{-}\),\(\,{{{{{{\rm{w}}}}}}}_{2}^{-}\)). The inset is schematic of two-memtransformer-based circuits. d Structure of five-layer binary neural network used for MNIST image recognition, with two convolutional layers, one pooling layer and two fully connected layers. e The corresponding memtransformer-based arrays of the convolutional kernels in (d). f Ten thousand MNIST handwritten-digit (0,1,…,9) images are classified with a 96.7% accuracy, which is comparable to recognition accuracy of state-of-the-art binary neural network. Recognition accuracy of the ten different output digits is classified.

We next conduct the corresponding electrical measurements to validate the operating mechanism of the memtransformer. Figure 4b displays the output voltage (\({V}^{{{{{{\rm{out}}}}}}}\)) as a function of the input voltage (\({V}^{{{{{{\rm{in}}}}}}}\)) in a single memtransformer for both up (square) and down (triangular) magnetization directions. The results show that the output voltage exhibits a strong linear relationship with the input voltage for both magnetization directions, demonstrating that the memtransformer can perform multiplication, where \({V}^{{out}}=w\cdot {V}^{{in}}\) and the weight \(w\) is represented by the magnetization (i.e., up magnetization as \({w}^{+}\) and down magnetization as \({w}^{-}\)). As shown in the inset of Fig. 4c, cascading two memtransformer devices makes it possible to realize multiply-accumulate (MAC) operations, which are essential in in-memory computing and artificial intelligence. To demonstrate this application, we used an array of two memtransformer devices to perform vector-vector multiplication, with the corresponding results shown in Fig. 4c. An input voltage \({{{{{{\rm{V}}}}}}}_{i}^{{in}}\) (i = 1,2) is applied to each device and the accumulated output voltage, defined as \({{{{{{\rm{V}}}}}}}^{{out}}={w}_{1}\cdot {V}_{1}^{{in}}+{w}_{2}\cdot {V}_{2}^{{in}}\), is measured, where the weights, i.e., (\({{{{{{\rm{w}}}}}}}_{1}^{+}\),\(\,{{{{{{\rm{w}}}}}}}_{2}^{+}\)), (\({{{{{{\rm{w}}}}}}}_{1}^{+}\),\(\,{{{{{{\rm{w}}}}}}}_{2}^{-}\)), (\({{{{{{\rm{w}}}}}}}_{1}^{-}\),\(\,{{{{{{\rm{w}}}}}}}_{2}^{+}\)), (\({{{{{{\rm{w}}}}}}}_{1}^{-}\),\(\,{{{{{{\rm{w}}}}}}}_{2}^{-}\)), can be changed by switching the magnetization. The experimental result \({{{{{{\rm{V}}}}}}}_{{measured}}^{{out}}\) is in excellent agreement with the expected result \({{{{{{\rm{V}}}}}}}_{{expected}}^{{out}}\), which is calculated based on the slope shown in Fig. 4b, indicating that MAC operations can be successfully implemented with the memtransformer array.
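The multiply-accumulate behavior described above can be summarized in a short behavioral model. This is a sketch under stated assumptions rather than a device simulation: the class name `Memtransformer`, the `gain` value standing in for the slope of Fig. 4b, and the assumption that the up and down states have equal and opposite weights are all hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Memtransformer:
    """Behavioral model: V_out = w * V_in, with sign(w) set by magnetization."""
    gain: float                      # hypothetical |w|, stands in for the Fig. 4b slope
    magnetization_up: bool = True    # nonvolatile state storing the weight sign

    def weight(self) -> float:
        # Up magnetization -> w+, down magnetization -> w- (assumed equal magnitude).
        return self.gain if self.magnetization_up else -self.gain

    def switch(self) -> None:
        # Reversing the magnetization reverses the sign of the weight.
        self.magnetization_up = not self.magnetization_up

    def output(self, v_in: float) -> float:
        # Linear transformation of the input voltage.
        return self.weight() * v_in

def mac(devices, inputs):
    """Accumulate the outputs of several devices: sum_i w_i * V_i^in."""
    return sum(dev.output(v) for dev, v in zip(devices, inputs))

dev1 = Memtransformer(gain=1e-3)
dev2 = Memtransformer(gain=1e-3)
out_pp = mac([dev1, dev2], [1.0, 2.0])   # weights (w1+, w2+)
dev2.switch()
out_pm = mac([dev1, dev2], [1.0, 2.0])   # weights (w1+, w2-)
print(out_pp, out_pm)
```

The four weight combinations of the two-device array correspond to the four possible settings of the two `magnetization_up` flags.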

We showcase the potential of the memtransformer in in-memory computing by building a binary convolutional neural network (BCNN), in which the binary weights (i.e., −1 and +1) are implemented with memtransformers. Figure 4d presents a schematic of the BCNN, which consists of two convolutional layers, one pooling layer and two fully connected layers. In the array, the drain terminals of the memtransformer devices are connected to the source lines clamped at 0 V, as depicted in Fig. 4e. We use the BCNN to evaluate the average recognition accuracy on Modified National Institute of Standards and Technology (MNIST) handwritten-digit images and achieve an accuracy of 96.7% (Fig. 4f), which is comparable to the recognition accuracy of state-of-the-art binary neural networks. It is worth noting that the two convolutional layers in our proposed BCNN are connected without an activation function, which suggests that the proposed network can be more energy efficient than traditional binary convolutional networks that require activation functions.
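To make the role of the binary weights concrete, the following sketch shows a single binary convolution in which every kernel element is restricted to −1 or +1, so that each element could in principle be stored as one magnetization state. The function names, the kernel size and the input are hypothetical, and this is not the network or training procedure used in this work.

```python
import numpy as np

def binarize(weights):
    """Map real-valued weights to +1 / -1 (one magnetization state each)."""
    return np.where(np.asarray(weights) >= 0, 1, -1)

def binary_conv2d(image, kernel):
    """Valid-mode sliding-window MAC (cross-correlation, as is usual in CNNs)."""
    kh, kw = kernel.shape
    h, w = image.shape
    out = np.empty((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            # Each output pixel is one multiply-accumulate over the patch.
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

image = np.arange(16, dtype=float).reshape(4, 4)       # hypothetical input
kernel = binarize([[0.3, -0.7], [-0.1, 0.9]])          # -> [[1, -1], [-1, 1]]
feature_map = binary_conv2d(image, kernel)
print(feature_map)
```

Because no activation function is applied between the two convolutional layers in the network described above, the output of one such operation would be passed directly to the next.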

Discussion

In conclusion, we have uncovered an interfacial T-odd spin Hall effect in a ferromagnet/metal vdW heterostructure, which has not been identified in conventional ferromagnet/metal bilayer systems. The T-odd nature of this SHE allows the charge-spin conversion to be tuned by reversing the magnetization, providing a distinct control mechanism for designing spintronic devices with new functionalities. As a proof of concept, we propose a memtransformer spintronic device and demonstrate its potential for use in binary convolutional neural networks. These results not only broaden the field of charge-spin conversion research, but also offer a path towards energy-efficient neuromorphic computing.

Methods

Nanofabrication

We cleaved FGT and MoTe2 bulk crystals onto 285 nm SiO2/Si substrates in a glovebox filled with an inert atmosphere and selected exfoliated flakes of suitable thickness (5–10 nm thick MoTe2 and 25–35 nm thick FGT), identified by optical contrast. The exfoliated MoTe2 flakes were then transferred onto FGT flakes in an inert atmosphere to avoid degradation. E-beam lithography was used to define the electrode pattern, and the electrodes (5 nm Ti/45 nm Au) were deposited by standard electron-beam evaporation. The MoTe2 flakes used in the devices were then shaped into a Hall bar geometry by standard electron-beam lithography and dry etching in an inductively coupled plasma system. The accurate thicknesses of the MoTe2 and FGT flakes were characterized with a Bruker Multimode atomic force microscope after the transport measurements were finished. We fabricated the FGT/MoTe2 devices at room temperature using MoTe2 in the monoclinic phase (1T′ phase). During cooling, MoTe2 undergoes a structural transition at 250 K from the room-temperature monoclinic phase to the orthorhombic phase (Td phase).

Transport measurements

All electrical measurements were performed in an Oxford cryostat with magnetic fields of up to 8 T and a base temperature of about 1.6 K. The magnetic field was applied along the out-of-plane direction of the devices. Electrical measurements were performed with lock-in amplifiers (Stanford SR830) at a low frequency (17.7 Hz) and a Keithley 2636B.

Calculation details

The electronic structure calculations were carried out in the framework of density functional theory as implemented in the Vienna ab initio simulation package57,58 with the projector augmented wave method59 and the Perdew-Burke-Ernzerhof exchange-correlation functional60. The calculated electronic structure of bulk FGT is shown in Supplementary Fig. 8. To ensure convergence of the results, the spin–orbit coupling was included self-consistently in the electronic structure calculations with a kinetic energy cutoff of 700 eV. Since all devices were measured at low temperature, we considered a two-dimensional FGT/MoTe2 bilayer consisting of two layers of FGT and one layer of Td-MoTe2. A vacuum layer of 15 Å was used in the slab model of the FGT/MoTe2 bilayer. Monkhorst-Pack k-meshes of 12 × 12 × 2 and 9 × 5 × 1 were used for bulk FGT and the FGT/MoTe2 bilayer, respectively. The s, p orbitals of Te and Ge atoms and the s, p, d orbitals of Fe and Mo atoms were used to construct Wannier functions61. Based on the real-space Hamiltonian in the basis of maximally localized Wannier functions, we evaluated the spin current dipole and the spin conductivity of the MSHE within linear response theory using the Kubo formula, as implemented in the Wannier-Linear-Response code26. We also calculated the spin conductivity in FGT/MoTe2 heterostructures with varied thicknesses of FGT (Supplementary Fig. 7). The values of the T-odd spin Hall conductivity in these systems are independent of the sample thickness, further supporting that the magnetic spin Hall effect observed in the FGT/MoTe2 heterostructure is an interfacial effect.

Calculation of spin Hall angle

The inverse spin Hall signal \(\triangle {R}_{{ISHE}}\) which is measured with configuration shown in the inset of Fig. 1b can be phenomenologically expressed as62:

$$\triangle {R}_{{ISHE}}=2{\theta }_{{SHE}}\frac{{\rho }_{{{MoTe}}_{2}}}{{t}_{{{MoTe}}_{2}}}P{e}^{-(L/{L}_{{sf}})}$$
(M1)

where \({\theta }_{{SHE}}\) is the spin Hall angle of MoTe2, and \({\rho }_{{{MoTe}}_{2}}\) (81.85 \(\Omega \cdot {nm}\)) and \({t}_{{{MoTe}}_{2}}\) (8 nm) are the resistivity and thickness of the MoTe2 flake, respectively. P is the effective current spin polarization, L is the channel length between the heterostructure region and the MoTe2 Hall cross, and \({L}_{{sf}}\) is the spin diffusion length of MoTe2. In the calculation of P, we use the experimental values of the spin Hall angle \({\theta }_{{SHE}}\) = 0.32 (obtained from measurements of the spin Hall effect in MoTe2 in ref. 44) and the spin diffusion length \({L}_{{sf}}\) = 1.6 \(\mu {{{{{\rm{m}}}}}}\) (see Supplementary Note 1), and thus obtain P ≈ 0.853.
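As a numerical check of Eq. (M1), the short sketch below solves it for P. It assumes that the inputs are the non-local inverse spin Hall signal of 1.6 Ω and the 2 μm channel length quoted in the Results; the text does not state explicitly which values were used, but with these inputs the quoted P is reproduced. The function name is hypothetical.

```python
import math

def spin_polarization(delta_r_ishe, theta_she, rho, thickness, length, l_sf):
    """Solve Eq. (M1) for P.

    delta_r_ishe : inverse spin Hall resistance jump (ohm)
    theta_she    : spin Hall angle of MoTe2 (dimensionless)
    rho          : MoTe2 resistivity (ohm * nm)
    thickness    : MoTe2 thickness (nm)
    length, l_sf : channel length and spin diffusion length (same unit)
    """
    prefactor = 2.0 * theta_she * (rho / thickness) * math.exp(-length / l_sf)
    return delta_r_ishe / prefactor

# Assumed inputs: 1.6 ohm signal and 2 um channel from the Results section.
P = spin_polarization(delta_r_ishe=1.6, theta_she=0.32,
                      rho=81.85, thickness=8.0,
                      length=2.0, l_sf=1.6)
print(round(P, 3))
```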

Considering the measurement configuration of MSHE, in which a charge current applied along FGT strip gives rise to a spin current flowing along MoTe2 channel, the non-local voltage jump measured at MoTe2 Hall cross due to the combined MSHE and ISHE can be denoted as63:

$${\triangle V}_{{NL}}={2\rho }_{{{MoTe}}_{2}}{\theta }_{{SHE}}{j}_{s}w$$
(M2)

where w is the width of MoTe2 channel. \({j}_{s}\) is the spin current generated by MSHE and can be described as:

$${j}_{s}={j}_{c}{\alpha }_{{MSH}}{e}^{-\left(L/{L}_{{sf}}\right)}$$
(M3)

One can thus obtain the non-local resistance jump measured at MoTe2 Hall cross:

$$\triangle {R}_{{NL}}=\frac{\triangle {V}_{{NL}}}{I}=2A{\rho }_{{{MoTe}}_{2}}{\alpha }_{{MSH}}{\theta }_{{SHE}}{e}^{-(L/{L}_{{sf}})}$$
(M4)

where A is geometric factor and \({\alpha }_{{MSH}}\) is the magnetic spin Hall angle.

By dividing Eq. (M4) by Eq. (M1), we derive the magnetic spin Hall angle:

$${\alpha }_{{MSH}}=\frac{P}{A\cdot {t}_{{MoT}{e}_{2}}}\cdot \frac{\triangle {R}_{{NL}}}{\triangle {R}_{{ISHE}}}$$
(M5)
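The division can be written out term by term. As a sketch, assuming that both signals are measured in the same device, so that \({\rho }_{{{MoTe}}_{2}}\), \({\theta }_{{SHE}}\) and the decay factor \({e}^{-(L/{L}_{{sf}})}\) are common to Eqs. (M1) and (M4):

$$\frac{\triangle {R}_{{NL}}}{\triangle {R}_{{ISHE}}}=\frac{2A{\rho }_{{{MoTe}}_{2}}{\alpha }_{{MSH}}{\theta }_{{SHE}}{e}^{-(L/{L}_{{sf}})}}{2{\theta }_{{SHE}}\frac{{\rho }_{{{MoTe}}_{2}}}{{t}_{{{MoTe}}_{2}}}P{e}^{-(L/{L}_{{sf}})}}=\frac{A\cdot {t}_{{{MoTe}}_{2}}\cdot {\alpha }_{{MSH}}}{P}$$

Solving this ratio for \({\alpha }_{{MSH}}\) gives Eq. (M5).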

Finally, we obtain a magnetic spin Hall angle \({\alpha }_{{MSH}}\) of about 0.67.

Analysis of temperature dependence of time-odd spin Hall conductivity originated from the heterostructure region

Measurements of \(\triangle {R}_{{NL}}\) with the configuration shown in the inset of Fig. 2a at different temperatures can be used to derive the temperature dependence of the spin Hall conductivity originating from the heterostructure region. Starting from Eq. (M4) for the measured non-local resistance jump, the spin Hall conductivity of the heterostructure region can be expressed as:

$${\sigma }_{{SH}}^{{in}{terface}}=\frac{1}{2}B\triangle {R}_{{NL}}\frac{1}{{\rho }_{{{MoTe}}_{2}}^{2}}\frac{1}{{\rho }_{{FGT}}}\frac{1}{{\sigma }_{{SHE}}}$$
(M6)

where B is an interface-related constant with given geometric parameters and \({\sigma }_{{SHE}}\) is the intrinsic spin Hall conductivity of MoTe2 where \({\sigma }_{{SHE}}\) = 176 \((\hslash /e){\Omega }^{-1}{{cm}}^{-1}\) (obtained from ref. 43). \({\rho }_{{FGT}}\) is the resistivity of FGT. Considering the measured temperature dependence of \({\rho }_{{FGT}}\) and \({\rho }_{{{MoTe}}_{2}}\), the temperature dependence of T-odd spin Hall conductivity can be derived.
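Eq. (M6) can be evaluated directly once the measured quantities are available at each temperature. The sketch below is illustrative only: the function names are hypothetical, B is left as a free parameter because its value is not given here, and the arrays in the example are placeholders rather than measured data. Normalizing to the first temperature point removes B and \({\sigma }_{{SHE}}\), so that only the relative temperature dependence remains.

```python
import numpy as np

def interface_shc(delta_r_nl, rho_mote2, rho_fgt, sigma_she=176.0, b=1.0):
    """Evaluate Eq. (M6) element-wise; units must be chosen consistently."""
    delta_r_nl = np.asarray(delta_r_nl, dtype=float)
    rho_mote2 = np.asarray(rho_mote2, dtype=float)
    rho_fgt = np.asarray(rho_fgt, dtype=float)
    return 0.5 * b * delta_r_nl / (rho_mote2**2 * rho_fgt * sigma_she)

def normalized_shc(delta_r_nl, rho_mote2, rho_fgt, **kwargs):
    """Relative temperature dependence; the constants B and sigma_she cancel."""
    values = interface_shc(delta_r_nl, rho_mote2, rho_fgt, **kwargs)
    return values / values[0]

# Placeholder inputs (NOT measured data), one entry per temperature point.
rel = normalized_shc(delta_r_nl=[4.0, 2.0],
                     rho_mote2=[1.0, 1.0],
                     rho_fgt=[1.0, 1.0])
print(rel)
```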

Theory of time-reversal odd spin Hall effect

The spin current generated at the linear order of an applied electric field E is given by

$${j}_{a}^{c}={\sigma }_{{ab}}^{c}{E}_{b},$$
(M7)

where the Einstein summation convention is adopted for the Cartesian indices, and \({j}_{a}^{c}\) is the current of the spin component \(c\) flowing along the \(a\) direction.

The spin current density is given by the integral of the spin current carried by each electron \({j}_{n}^{c}({{{{{\boldsymbol{k}}}}}})\) weighted by the distribution function \({f}_{n}({{{{{\boldsymbol{k}}}}}})\):

$${j}^{c}=\mathop{\sum}\limits_{n}\int \frac{{d}^{2}k}{{(2\pi )}^{2}}{f}_{n}\left({{{{{\boldsymbol{k}}}}}}\right){j}_{n}^{c}\left({{{{{\boldsymbol{k}}}}}}\right)$$
(M8)

Here \(n\) and \(\hbar {\boldsymbol{k}}\) are the band index and crystal momentum, respectively, \({j}_{n}^{c}\left({\boldsymbol{k}}\right)=\left\langle {u}_{n}({\boldsymbol{k}}) | \frac{1}{2}\left\{\hat{v},{\hat{s}}^{c}\right\} | {u}_{n}({\boldsymbol{k}})\right\rangle\) and \(\left[d{\boldsymbol{k}}\right]\) is shorthand for \({\sum }_{n}{d}^{2}{\boldsymbol{k}}/{(2\pi )}^{2}\). In the relaxation-time approximation, the deviation from the equilibrium Fermi distribution \({f}_{0}({\varepsilon }_{n})\) has a dipole structure \((e \, < \, 0)\):

$${f}_{n}-{f}_{0}=-\frac{e}{{\hbar}}\tau {{{{{\boldsymbol{E}}}}}}\cdot {\partial }_{{{{{{\boldsymbol{k}}}}}}}{f}_{0}$$
(M9)
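As an intermediate step, sketched here under the assumption that the integrand is periodic over the Brillouin zone so that the boundary term vanishes, inserting Eq. (M9) into Eq. (M8) and integrating by parts moves the k-derivative from \({f}_{0}\) onto the spin current. Here \(\delta {j}_{a}^{c}\) (a symbol introduced only for this step) denotes the field-induced part of the spin current:

$$\delta {j}_{a}^{c}=-\frac{e}{{\hbar}}\tau {E}_{b}\mathop{\sum}\limits_{n}\int \frac{{d}^{2}{\boldsymbol{k}}}{{(2\pi )}^{2}}\left({\partial }_{{k}_{b}}{f}_{0}\right){j}_{a,n}^{c}\left({\boldsymbol{k}}\right)=\frac{e}{{\hbar}}\tau {E}_{b}\mathop{\sum}\limits_{n}\int \frac{{d}^{2}{\boldsymbol{k}}}{{(2\pi )}^{2}}{f}_{0}\,{\partial }_{{k}_{b}}{j}_{a,n}^{c}\left({\boldsymbol{k}}\right)$$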

The time-reversal-odd spin conductivity is linear in the relaxation time τ. The spin conductivity tensor is thus given by

$${\sigma }_{{ab}}^{c}=\frac{e}{{\hbar}}\tau \mathop{\sum}\limits_{n}\int \frac{{d}^{2}{{{{{\boldsymbol{k}}}}}}}{{(2\pi )}^{2}}{f}_{0}{\partial }_{{{{{{{\boldsymbol{k}}}}}}}_{b}}{j}_{a,n}^{c}\left({{{{{\boldsymbol{k}}}}}}\right)$$
(M10)

where we can introduce

$$\begin{aligned}{D}_{{ba},n}^{c}\left({\boldsymbol{k}}\right) \, &\equiv \, {\partial }_{{k}_{b}}{j}_{a,n}^{c}\left({\boldsymbol{k}}\right)=\left\langle {u}_{n} | \frac{1}{2}\left\{{\partial }_{{k}_{b}}{\hat{v}}_{a},{\hat{s}}^{c}\right\} | {u}_{n}\right\rangle \\ &\quad +{\hbar}\mathop{\sum}\limits_{{n}_{1}\ne n}{\mathrm{Re}}\,\frac{\left\langle {u}_{n} | \left\{{\hat{v}}_{a},{\hat{s}}^{c}\right\} | {u}_{{n}_{1}}\right\rangle \left\langle {u}_{{n}_{1}} | {\hat{v}}_{b} | {u}_{n}\right\rangle }{{\varepsilon }_{n}-{\varepsilon }_{{n}_{1}}}\end{aligned}$$
(M11)

We can plot the k-space distribution of

$${D}_{{ba}}^{c}\left({{{{{\boldsymbol{k}}}}}}\right)\equiv \mathop{\sum}\limits_{n}{f}_{0}{D}_{{ba},n}^{c}\left({{{{{\boldsymbol{k}}}}}}\right)=\mathop{\sum}\limits_{n}{f}_{0}{\partial }_{{k}_{b}}{j}_{a,n}^{c}\left({{{{{\boldsymbol{k}}}}}}\right)$$
(M12)

Application of the memtransformer array to neuromorphic computing

To show the potential of the memtransformer array for large-scale neuromorphic computing, we simulate image recognition with a convolutional neural network on the MNIST dataset. The MNIST handwritten digit database is composed of 60,000 training images and 10,000 testing images, each with a resolution of 28 × 28 pixels and belonging to one of ten digit classes from zero to nine. The neural network has five layers, comprising convolution layers, a pooling layer and fully connected layers. The network is trained for 2000 steps with a batch size of 128, using the Adam optimizer (a stochastic gradient-based method) and a cross-entropy loss function. We compare the performance of the neural network for two device models: a resistive neural network and a memtransformer neural network. The weights of the resistive neural network can take any real value, while those of the memtransformer neural network are restricted to binary values. The recognition accuracy of the memtransformer neural network reaches 96.7%, which is comparable to that of the resistive neural network, as shown in Fig. 4f. It should be noted that the current-summation mode of an analog memristor array makes low power consumption and large-scale integration challenging because of the finite resistance of the crossbar wires. Moreover, a neuromorphic computing network generally consists of multiple crossbar arrays, which requires digital-to-analog (analog-to-digital) conversion when large amounts of data go into (out of) the crossbar arrays, and this also incurs significant costs in area and energy. As such, the cascadable memtransformer array, which uses the voltage-summation mode, may offer a promising approach to energy-efficient and large-scale neuromorphic computing.
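The difference between the two weight models can be illustrated with a toy fully connected layer. This is a sketch under stated assumptions, not the simulation code used for Fig. 4f: the binary levels are assumed to be +1 and -1 obtained from the sign of each real weight, and the helper names and numerical values are hypothetical.

```python
def binarize(weights):
    """Map each real weight to +1.0 or -1.0 by its sign (zero maps to +1.0).

    Assumption: the binary levels are +/-1; the text only states that the
    memtransformer weights are restricted to binary values.
    """
    return [[1.0 if w >= 0 else -1.0 for w in row] for row in weights]


def dense_forward(weights, inputs):
    """Weighted sum for each output neuron: out[i] = sum_j W[i][j] * x[j]."""
    return [sum(w * x for w, x in zip(row, inputs)) for row in weights]


def argmax(values):
    """Index of the largest entry, used here as the predicted class."""
    return max(range(len(values)), key=values.__getitem__)


# Hypothetical real-valued weights (resistive model) and one input vector.
real_weights = [[0.8, -0.3, 0.1], [-0.5, 0.9, -0.2]]
binary_weights = binarize(real_weights)  # memtransformer-like model
x = [1.0, 0.0, 1.0]

real_prediction = argmax(dense_forward(real_weights, x))
binary_prediction = argmax(dense_forward(binary_weights, x))
```

For this input both weight models select the same output neuron, which illustrates how a binary-weight network can reach an accuracy comparable to that of its real-valued counterpart even though the individual output values differ.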