Introduction

The ability of extremely intense and brief femtosecond X-ray free-electron laser (XFEL) pulses to outrun radiation damage avoids the need to freeze (and thus immobilize) biological samples to minimize damage, as required in conventional protein crystallography¹ or cryogenic electron microscopy (cryo-EM). For single particles, this enables the study of protein dynamics under near-physiological conditions at room temperature. The principle of outrunning damage by collecting diffraction data before the onset of the damaging photoelectron cascade was first established experimentally at the Free-Electron Laser in Hamburg (FLASH) facility in 2006² and is now routine in serial femtosecond crystallography³,⁴,⁵. Since the first aerosol single-particle imaging experiments at FLASH⁶, the method of flash X-ray imaging has been applied to image living cells⁷, cell organelles⁸, and viruses⁹,¹⁰, in particular the giant Mimivirus, both in two-dimensional (2D) projections¹¹ and in full 3D¹². Despite continual improvements in reconstruction algorithms, the number of reconstructed resolution elements across the sample remains at about a dozen voxels¹³,¹⁴,¹⁵. The main reasons for this limitation are the large dynamic range spanned by the diffracted intensities, which goes beyond the technical limits of current detector technology, the weakness of the diffraction signal, and the shot-to-shot variations in imaging conditions due to the lateral distance between the sample and the X-ray focus (the impact parameter), background scattering, and detector response. Averaging over a very large number of single-particle snapshots is required to obtain sufficient information in the high-resolution regions of diffraction space. This is necessary even for strongly scattering samples. Until now, this has been hampered by low hit probabilities and by the relatively low 120 Hz pulse repetition rate of the XFEL facilities previously available.

The European XFEL (EuXFEL) introduces an era of high-intensity, high repetition-rate, and high data-rate XFELs by taking advantage of a superconducting linear accelerator¹⁶. The high repetition rate poses new challenges for sample injectors and X-ray detectors. Whenever the XFEL pulse hits a sample, it rapidly transforms it into a plasma. To fully exploit the high repetition rate, this plasma must not interfere with the delivery of the next particle, thereby ensuring that different pulses correspond to independent measurements from undamaged, intact objects. For serial crystallography at the EuXFEL, this has recently been shown to be possible¹⁷,¹⁸,¹⁹.

The first single-particle experiments at the EuXFEL were performed in December 2017 using the Single Particles, Clusters, and Biomolecules & Serial Femtosecond Crystallography (SPB/SFX) instrument²⁰ with microfocus optics. The main goal of the experiment was to demonstrate single-particle imaging at the high intrabunch repetition rate of the EuXFEL with the Adaptive Gain Integrated Pixel Detector (AGIPD)²¹.

In this article, we present the results of this experiment. We start by characterizing the background inherent to the instrument, which is a critical parameter for determining the maximum achievable resolution, as well as the signal-to-noise ratio (SNR) of the recorded patterns, the instrumental stability, and the incident photon flux. We then size the particles corresponding to the patterns recorded while injecting viruses into the beam, confirming that a substantial fraction of the patterns correspond to the expected particle size. Finally, we search for any correlation or dependence among diffraction patterns obtained from the same pulse train. Overall, we show that single-particle imaging experiments can be performed at the megahertz intrabunch repetition rate of the EuXFEL.

Results

Overview of data collection

The experiment (EuXFEL proposal 2013) was performed over five 12-h shifts in December 2017. The X-ray beam, with a photon energy of 9.2 keV, was focused to a spot of 15 × 15 μm². Data were recorded at 300 pulses per second, with an intra-train repetition rate of 1.1 MHz, during 376 experimental runs. Each run contained 30,000 pulses, corresponding to 1000 bunch trains of 30 pulses each. In total, 11,255,800 frames were recorded with the MHz camera AGIPD, out of which 557,675 patterns were identified as hits, that is, diffraction patterns from the target samples. The overall statistics of the measured data are summarized in Table 1.

Table 1 The summary of measurements broken down by samples and shifts.

A heavy-metal salt solution was used to align the beam and the injector. When a salt solution is aerosolized and focused by a gas dynamic virtual nozzle (GDVN) (see “Methods”), it forms a single-file stream of droplets. Water quickly evaporates from the droplets in a vacuum, resulting in amorphous salt spheres. In aerosol imaging experiments, a salt solution is convenient for detecting the X-ray beam since each droplet gives rise to a salt particle, thus leading to a high hit rate. This contrasts with colloidal particles dispersed in a volatile medium, where many droplets may not contain particles or form any upon injection, leading to a low hit rate.

Diffraction from these spheres was simulated to determine the effect of experimental parameters such as the incident fluence, particle size, and alignment on the diffraction patterns. A scattering model for spherical particles²² was fitted to the diffraction patterns for the iridium(III) chloride (IrCl3) samples (see Fig. 1a–c) captured in the third and fourth shifts, as described in “Methods”. We assumed that the density of amorphous IrCl3 particles formed in vacuum was close to its solid-state density of 5.3 g/cm³. We also assumed that, on average, each IrCl3 molecule is hydrated by three water molecules, resulting in a molar mass of 352.6 g/mol and a scattering factor of 149.6 electrons. We further assumed that radiation damage from the X-rays had a negligible effect on the low-angle scattering that we fitted. Particle sizes and incident beam fluences were obtained as described in “Methods” and are shown in Fig. 2a–d.
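The assumptions above fix the electron density that enters the sphere scattering model. As a rough illustration only, the following sketch converts the quoted mass density, molar mass, and scattering factor into a number density of scattering electrons. The function name and the choice of units (electrons per nm³) are ours, not the authors'.

```python
AVOGADRO = 6.02214076e23  # formula units per mole

def electron_density_per_nm3(density_g_cm3, molar_mass_g_mol, electrons_per_unit):
    """Number density of scattering electrons, in electrons per nm^3."""
    # Formula units per cm^3 from the mass density and the molar mass.
    units_per_cm3 = density_g_cm3 / molar_mass_g_mol * AVOGADRO
    # 1 cm^3 = 1e21 nm^3.
    return units_per_cm3 * electrons_per_unit / 1e21

# Values quoted in the text for hydrated IrCl3.
n = electron_density_per_nm3(5.3, 352.6, 149.6)
print(f"{n:.0f} electrons per nm^3")
```

With the quoted values this gives roughly 1.35 × 10³ electrons per nm³.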

Fig. 1: Examples of scattering patterns from IrCl3 and Mimivirus.

Scattering from IrCl3 spheres of a 145 nm, b 301 nm, and c 465 nm diameter, respectively. d Scattering from Mimivirus. The edge resolution of the patterns shown is 36.8 nm.

Fig. 2: Distributions of the reconstructed parameters of scattering from spherical particles formed by IrCl3.

Distribution of incident photon fluences over particle diameters, shown as a 2D histogram, in the third (a) and fourth (b) shifts. Histograms of the fitted particle diameters in the third (c) and fourth (d) shifts.

The 2D distributions show that the particle sizes range from 80 to 800 nm in diameter (Fig. 2c, d) and that the incident photon fluence has an upper limit that is independent of particle size (see Fig. 2a, b, green dashed line). This limit is the fluence at the focus of the beam (Im), where it reaches its maximum. The lack of events in the upper-right corner of the distribution results from the small number of large particles in the measured set. Thus, we can only approximately estimate the upper limit of the fluence at about 2.8 × 10⁹ photons/μm² during the third shift, and about 1.3 × 10⁹ photons/μm² during the fourth shift.

The lower fluence limit (Fig. 2a, b, red dashed line) depends on the particle size and corresponds to the sensitivity limit (Is) below which it was impossible to fit a spherical scattering model. The slope of the lower bound is −3 on the log-log scale, matching the scaling of the signal for a given particle volume:

$$I_s = I_m\left( {R_0^3/R^3} \right).$$

The line showing the limit of sensitivity crosses the line for the upper limit of the fluence Im at a particle size R0. This value indicates the theoretical size limit of particles that can be distinguished for a given sample and set-up. It was 52 and 73 nm in the third and fourth shifts, respectively.
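To make the sensitivity relation concrete, the sketch below evaluates Is = Im (R0/R)³ for a few particle sizes, using the third-shift values quoted in the text. The function and variable names are illustrative. Because only the ratio R0/R enters, it does not matter whether sizes are given as radii or diameters, as long as both use the same convention.

```python
def sensitivity_fluence(size_nm, max_fluence, limit_size_nm):
    """Lowest fluence at which a sphere of the given size can still be fitted."""
    return max_fluence * (limit_size_nm / size_nm) ** 3

# Third-shift values from the text: Im of about 2.8e9 photons/um^2, R0 = 52 nm.
I_M = 2.8e9
R_0 = 52.0

for size in (52.0, 104.0, 520.0):
    print(size, sensitivity_fluence(size, I_M, R_0))
```

Doubling the particle size lowers the required fluence by a factor of eight, which is the slope of −3 seen on the log-log plot.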

Background characterization

The instrument background was measured from 4000 images collected in the third shift with an average pulse energy of 1.135 mJ, as measured by the X-ray gas monitor detector²³, and from 120,000 images collected in the fourth shift with an average pulse energy of 1.477 mJ.

In addition to the instrument background, we measured the injection background, which includes any contributions from the gas used for sample delivery, by using the frames classified as nonhits (see “Methods”). We calculated the average injection background for each shift, except for the third shift when the detector was moved. As a result, we calculated two separate background profiles.

The injection background, shown as a function of \(S = \frac{2}{\lambda }{\rm{sin}}\, \theta\) (with θ half the scattering angle) in Fig. 3a, b, was averaged over 569,274 and 471,072 patterns with an average pulse energy of 1.276 and 1.539 mJ, respectively. The injection background barely exceeds the instrument background at low diffraction angles. The median background for all pixels of the detector was about 4 × 10⁻⁴ photons per pixel in both shifts.

Fig. 3: Average background, in photons per detector pixel.

a, b Radially averaged background for the third and fourth shifts, respectively. The orange line is the instrument background and the blue line is the injection background. Note that the scale is linear below 10⁻³ photons per pixel.

The background fades rapidly, falling to 10⁻³ photons per pixel for S > 0.02 nm⁻¹. The value of 10⁻³ photons per pixel is the limit of the statistical accuracy of the background estimation, given the calibration of the AGIPD available in this experiment (see “Methods”). At higher S, only stochastic fluctuations are observed.

Variations in the position of diffraction pattern centers

The position of the diffraction pattern center varies from pulse to pulse since each particle intersects the X-ray beam at a random point relative to the beam axis²⁴. At these different interaction points, the beam has different phase shift values, which define the shift of the zero wavevector of the diffraction. The 2D histograms of the reconstructed centers of diffraction patterns scattered from spherical IrCl3 particles are shown in Fig. 4a, b. The diffraction pattern centers are given as horizontal (γh) and vertical (γv) angles of the beam deviation from the mean beam direction when measured from the interaction point.

Fig. 4: Reconstructed positions of diffraction pattern centers.

2D histograms of the distribution of the centers of diffraction patterns for the third (a) and fourth (b) shifts. The squares shown by black dashed lines indicate the edges of the detector pixel containing the center of the distribution. γh and γv are the horizontal and vertical deviations from the mean beam direction when measured from the interaction point.

The distribution during the third shift had an interquartile range (IQR) of 18 μrad along the horizontal axis and 20 μrad along the vertical axis. Overall, 90% of the diffraction pattern centers lie within ranges of 50 and 59 μrad in the horizontal and vertical directions, respectively. During the fourth shift, the corresponding values of the IQR were 18 and 22 μrad, and the corresponding ranges for 90% of the centers were 47 and 55 μrad. The fraction of centers inside the central pixel (see Fig. 4a, b, square shown in black dashed lines) is 91% and 94% for the third and fourth shifts, respectively.

Signal versus background

The assembled and cropped diffraction pattern from a single hit of an IrCl3 particle is shown in Fig. 5a. The particle has an estimated diameter of 439 nm, which is close to the size of Mimivirus. The estimated incident photon fluence was 6.8 × 10⁸ photons/μm².

Fig. 5: Comparison of signal with background for a single diffraction pattern.

a Single strong diffraction pattern of an IrCl3 sphere of 439 nm in diameter, cropped to an edge resolution of 12.7 nm. b Comparison between the radially averaged scattering of the IrCl3 sphere (orange), fitted model (blue), and radially averaged background with injection (green). Note that the scale is linear below 10⁻² photons per pixel. The red dashed lines (18.4 nm resolution) mark the angle at which the modeled scattering is stronger than the noise in a single frame; the purple dashed lines (12.7 nm resolution) mark the angle where the modeled scattering exceeds an average background; detector edge resolution is 6.5 nm.

The measured pattern agrees with the spherical model at small diffraction angles (see Fig. 5b). At scattering vectors above 0.054 nm⁻¹ (red dashed line), the noise in one frame exceeds the amplitude of the spherical model, and fringes are not distinguishable, although the background, when averaged across a large number of frames, is still an order of magnitude lower than the expected signal. The model approaches the injection background level at scattering vectors above 0.079 nm⁻¹ (purple dashed line).

The radial average of the scattering intensities above the background (Fig. 6), when averaged across the different samples, also shows the signal disappearing around 0.08 nm⁻¹.

Fig. 6: Average signal of all the hits by sample.

Radial average of the signal minus the background for all the hits of IrCl3, Mimivirus, and Melbourne virus.

Filtering virus images by the particle size

Scattering from Mimivirus particles was recorded in 154 runs, which produced a total of four million frames. A pixel in which the signal exceeded one photon was considered to have detected photons and is hereafter called a lit pixel. Frames in which the number of lit pixels was more than three standard deviations above the mean were classified as hits, and the rest as misses. This resulted in a set of 44,905 hit diffraction patterns, which were further processed.

The next step was to identify diffraction patterns produced by a single Mimivirus particle. In this work, we were only interested in single hit diffraction patterns as they can be immediately used to reconstruct the 3D Fourier space volume of the sample. To identify single hit diffraction patterns, we estimated the size of injected particles. A continuous wavelet transform (CWT)-based procedure was used, as described in the “Methods.” The distribution of images by the diameter of the particle is presented in Fig. 7a.

Fig. 7: Histogram of particle size distribution.

a Size distribution of all the particle diameters, b images with a particle diameter between 400 and 600 nm, and c images from the highlighted area in b with the recalculated diameters. Dashed blue line in b and c is the Gaussian fit. The highlighted region is mean ± 1 standard deviation.

The particle diameter distribution (Fig. 7) is bimodal, with a maximum at the lower end of the detection range, which likely corresponds to aggregates of impurities²⁵, and another one at around 500 nm, which coincides with the diameter of Mimivirus particles measured by cryo-EM

\(\sigma _i/\left( {\mathop {\sum}\nolimits_i {\sigma _i^2} } \right)^{1/2}\), where the summation is carried out over cell–pixels in the group.

Pixels were marked as bad if their noise (σ) or baseline (μ0) values lay outside a 3.5-standard-deviation interval of the distribution of the corresponding values over the detector panel, as were pixels whose gain (μ1 − μ0) lay outside a 4-standard-deviation interval.

Hit/nonhit image classification

We used a lit pixel counter⁸ to split frames into two classes: nonhits were frames with background scattering, and hits were frames with scattering from a sample.

In each frame, we calculated the number of lit pixels that record a signal of more than 45 analog-to-digital units above the baseline (~0.7 of the one-photon signal). For each run, the histogram of lit pixel counts was fitted with a Gaussian function. The value equal to 2.5 standard deviations above the mean of the fitted Gaussian was set as a threshold for the hits in this particular run. Frames with the number of lit pixels below the threshold were classified as nonhits. If we had a true Gaussian distribution of lit pixels in the set of frames only with background scattering, then we would expect about 150 (~0.5%) false positive hits per run using this value of the threshold.
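The classification above can be sketched in a few lines. This is a minimal illustration, not the authors' code: frames and baselines are assumed to be flat sequences of ADU values, the function names are ours, and the Gaussian fit to the histogram is replaced by a simpler stand-in (the median and the median absolute deviation scaled by 1.4826), which estimates the center and width of the background peak without being pulled up by the hits.

```python
import statistics

def count_lit_pixels(frame_adu, baseline_adu, adu_threshold=45.0):
    """Count pixels whose signal exceeds the baseline by more than adu_threshold."""
    return sum(1 for v, b in zip(frame_adu, baseline_adu) if v - b > adu_threshold)

def hit_threshold(lit_counts, n_sigma=2.5):
    """Threshold on the lit-pixel count: center + n_sigma * width of the background peak.

    Assumption: median and scaled MAD stand in for the Gaussian fit
    described in the text.
    """
    center = statistics.median(lit_counts)
    mad = statistics.median([abs(c - center) for c in lit_counts])
    sigma = 1.4826 * mad  # MAD-to-sigma factor for a normal distribution
    return center + n_sigma * sigma

def classify_run(lit_counts, n_sigma=2.5):
    """Return one boolean per frame: True for a hit, False for a nonhit."""
    threshold = hit_threshold(lit_counts, n_sigma)
    return [count > threshold for count in lit_counts]

# Toy run: ten background-like frames followed by two bright frames.
lit_counts = [10, 12, 11, 9, 10, 11, 10, 12, 9, 10, 200, 350]
hits = classify_run(lit_counts)
print(hits)
```

In this toy run only the last two frames exceed the threshold. A real run would use one lit-pixel count per recorded frame and a threshold computed separately for each run, as described above.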

Model of scattering from spheres

The scattered intensity from a sphere of diameter R, placed in the beam with incident photon fluence I0 at the scattering vector S is given by

$$I\left( {S,R,I^0} \right) = I^0\left( {r_{\mathrm{e}}\frac{{\pi R^3}}{6}n} \right)^2{\it{\Delta \Phi }}\left[ {3\frac{{j_1\left( {\pi SR} \right)}}{{\pi SR}}} \right]^2,$$

where n is the density of electrons, re is the classical electron radius, ΔΦ is the solid angle and j1 is the spherical Bessel function of the first kind.

The length of the scattering vector Si related to the i-th pixel with coordinates (xi, yi) on the detector at the distance L from the scattering point is

$$S_i = \frac{2}{\lambda }{\mathrm{sin}}\, \theta _i = \frac{{\sqrt {2 - 2c_i} }}{\lambda },\;c_i = {\mathrm{cos}}\, 2\theta _i = \frac{L}{{\sqrt {L^2 + r_i^2} }},\;r_i = \sqrt {\left( {x_i - x} \right)^2\, +\, \left( {y_i - y} \right)^2} ,$$

where x, y are the coordinates of the diffraction pattern center, λ is the wavelength, 2θi is the angle between the beam direction and the direction to the pixel i.

The solid angle of the i-th pixel is

$${\it{\Delta \Phi }}_i = \frac{A}{{L^2}}c_i^3,$$

where A is the area of a pixel.
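The three expressions above (scattered intensity, scattering vector length, and pixel solid angle) can be combined into a small forward model. The sketch below is illustrative only: all names are ours, every length is assumed to be in nm (so the fluence is in photons/nm² and the electron density in electrons/nm³), and the form factor 3 j1(x)/x is written out explicitly with a series expansion near x = 0.

```python
import math

R_E_NM = 2.8179403e-6  # classical electron radius, in nm

def sphere_form_factor(x):
    """3 j1(x) / x for a uniform sphere, normalized to 1 at x = 0."""
    if abs(x) < 1e-3:
        return 1.0 - x * x / 10.0  # leading terms of the series expansion
    return 3.0 * (math.sin(x) - x * math.cos(x)) / x**3

def pixel_geometry(x_i, y_i, x_c, y_c, distance, wavelength, pixel_area):
    """Return (S_i, solid angle) for a pixel at (x_i, y_i) with center (x_c, y_c)."""
    r = math.hypot(x_i - x_c, y_i - y_c)
    c = distance / math.sqrt(distance**2 + r**2)  # cos(2 theta)
    s = math.sqrt(2.0 - 2.0 * c) / wavelength     # (2 / lambda) sin(theta)
    solid_angle = pixel_area / distance**2 * c**3
    return s, solid_angle

def sphere_intensity(s, diameter, fluence, electron_density, solid_angle):
    """Expected photon count in a pixel for a sphere of the given diameter."""
    amplitude = R_E_NM * math.pi * diameter**3 / 6.0 * electron_density
    form = sphere_form_factor(math.pi * s * diameter)
    return fluence * amplitude**2 * solid_angle * form**2

# Hypothetical geometry: the detector distance is NOT taken from the text.
s, omega = pixel_geometry(2.0e6, 0.0, 0.0, 0.0, distance=5.0e9,
                          wavelength=0.1348, pixel_area=(2.0e5) ** 2)
print(sphere_intensity(s, diameter=439.0, fluence=680.0,
                       electron_density=1354.0, solid_angle=omega))
```

Keeping all lengths in a single unit avoids conversion factors inside the model; the example call converts the quoted fluence of 6.8 × 10⁸ photons/μm² to 680 photons/nm².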

The measured diffraction νi at pixel i is a result of the combination of Poisson and Gaussian statistics

$$v_i = P\left( {I_i + b_i} \right) + N\left( {0,\sigma _i^2} \right),$$

where σi is the instrumental error at pixel i, estimated by processing the dark run, and bi is the averaged background scattering.

One diffraction pattern consists of N pixels with successfully measured diffraction

$$X = \left\{ {x_i,y_i,v_i,\sigma _i,b_i} \right\},\;i = 1 \ldots N.$$

Fitting the sphere scattering model to experimental patterns

The following procedure was used for model-based interpretation of the experimental diffraction pattern X. First, we found a rough estimate of the center (x, y) of the diffraction pattern, averaged over several of the strongest patterns, using the Hough transform³⁷,³⁸. Then we made a rough estimate of the diameter R of the particle and the incident photon fluence I0 by a least-squares fit of the scattering from the spherical model to the measured radially averaged diffraction intensity. We then selected the interpretable images according to the χ² value of the fit. Finally, all parameters (x, y, R, I0) were refined using maximum likelihood given the measured intensities (νi). In contrast to the initial rough estimate of R and I0, here we also refine the center of the diffraction pattern.

Refinement of parameters with likelihood maximization

Here, we approximate the Poisson distribution with the Normal distribution. Then the likelihood may be written as

$${\cal{L}}\left( {\theta |X} \right) = \mathop {\prod}\limits_{i = 1}^N {\frac{1}{{\sqrt {2\pi \left( {I_i + \sigma _i^2} \right)} }}{\mathrm{exp}}\left( { - \frac{{\left( {I_i + b_i - v_i} \right)^2}}{{2\left( {I_i + \sigma _i^2} \right)}}} \right)} .$$

Taking the negative logarithm and normalizing by N gives

$$l\left( {\theta |X} \right) = - \frac{1}{N}{\mathrm{log}}\, {\cal{L}}\left( {\theta |X} \right) = \frac{{{\mathrm{log}}\left( {2\pi } \right)}}{2} + \frac{1}{{2N}}\mathop {\sum}\limits_{i = 1}^N {\mathrm{log}}\left( {I_i + \sigma _i^2} \right) + \frac{1}{{2N}}\mathop {\sum}\limits_{i = 1}^N {\frac{{\left( {I_i + b_i - v_i} \right)^2}}{{\left( {I_i + \sigma _i^2} \right)}}}.$$

The optimal parameters correspond to the minimum of l

$$\theta = \left( {R,I^0,x,y} \right) = \arg\,\min l\left( {\theta |X} \right).$$

The goodness of fit was estimated as

$$\chi ^2 = \frac{1}{N}\mathop {\sum}\limits_{i = 1}^N {\frac{{\left( {I_i - v_i} \right)^2}}{{I_i + \sigma _i^2}}} .$$

The fitting was regarded as successful if the first- and second-order optimality conditions were met and the goodness of fit (χ²) was less than a predefined tolerance

$$\left| {\left| {\frac{{\partial l}}{{\partial \theta }}} \right|} \right|\, <\, \varepsilon ,\;H = \frac{{\partial ^2l}}{{\partial \theta \partial \theta^{\prime} }}\;{\mathrm{is}}\;{\mathrm{positive}}\;{\mathrm{definite}},\;\chi ^2\, <\, \zeta ,$$

where ε and ζ are predefined tolerances. We used ε = 10⁻⁶ and ζ = 1.1.
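The objective and the goodness of fit defined in this section translate directly into code. The sketch below is a plain transcription of the two formulas as they are written in the text, with illustrative names; the per-pixel model intensities Ii are assumed to come from a separate forward model, and the minimization itself (any general-purpose optimizer over R, I0, x, y) is not shown.

```python
import math

def neg_log_likelihood(model, background, measured, sigma):
    """Normalized negative log-likelihood l(theta | X) for one pattern.

    model, background, measured, sigma are equal-length sequences holding
    I_i, b_i, v_i and sigma_i for the N usable pixels.
    """
    n = len(measured)
    total = 0.0
    for i_mod, b, v, s in zip(model, background, measured, sigma):
        variance = i_mod + s * s  # Poisson variance approximated as in the text
        total += math.log(2.0 * math.pi * variance) + (i_mod + b - v) ** 2 / variance
    return total / (2.0 * n)

def chi_square(model, measured, sigma):
    """Goodness of fit, following the formula as written in the text."""
    n = len(measured)
    return sum((i_mod - v) ** 2 / (i_mod + s * s)
               for i_mod, v, s in zip(model, measured, sigma)) / n

# Toy pattern of three pixels.
model = [4.0, 1.0, 0.25]
background = [0.0, 0.0, 0.0]
measured = [5.0, 1.0, 0.0]
sigma = [0.5, 0.5, 0.5]
print(neg_log_likelihood(model, background, measured, sigma))
print(chi_square(model, measured, sigma))
```

A fit would be accepted under the criteria above only if the optimizer reports a small gradient norm, the Hessian at the solution is positive definite, and `chi_square` falls below the chosen tolerance.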

Fast determination of particle size by the CWT

To estimate the size of the scattering particle for each diffraction pattern we used the spherical particle model. A centered diffraction pattern is converted to its radial average which is then compared to the diffraction pattern of a uniform sphere. To account for an unknown background signal present in experimental data, the experimental and theoretical spherical diffraction functions were only compared at the positions of their maxima.

To find peaks in the noisy experimental radial average, we used a CWT-based peak detection algorithm³⁹. As our wavelet, we used the second peak of the spherical form factor, scaled and translated, which produced better results than the commonly used Ricker wavelet.

To estimate the diameter of the particle, we used three passes of this CWT procedure. The first pass was tuned to identify images for which the diameter was too small (<300 nm); these images were discarded. The second pass was used to estimate the diameter of larger particles with a diameter between 300 and 800 nm. In both cases, we estimated the diameter using the average distance between neighboring maxima, relying on the fact that for a spherical form factor this distance is very close to π/r.
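The spacing rule used in the first two passes can be illustrated as follows. The sketch assumes that the spacing π/r refers to the momentum transfer q = 2πS, so that in terms of S the maxima are separated by about 1/D, where D = 2r is the diameter. The function name is ours, and the peak positions are taken as given (in the text they come from the CWT peak finder).

```python
import math

def diameter_from_maxima(s_maxima):
    """Estimate the sphere diameter (nm) from successive maxima positions S (nm^-1)."""
    spacings = [b - a for a, b in zip(s_maxima, s_maxima[1:])]
    mean_spacing = sum(spacings) / len(spacings)
    return 1.0 / mean_spacing  # spacing in S is approximately 1 / D

# Synthetic check: secondary maxima of the sphere form factor lie at the zeros
# of the spherical Bessel function j2, here in units of x = pi * S * D.
J2_ZEROS = (5.763459, 9.095011, 12.322941)
true_diameter = 500.0
s_maxima = [x / (math.pi * true_diameter) for x in J2_ZEROS]
estimate = diameter_from_maxima(s_maxima)
print(estimate)
```

For the first few maxima the spacing is slightly larger than its asymptotic value, so this estimate comes out a few percent below the true diameter, which is consistent with treating it as an approximate value to be refined afterwards.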

The third pass was used to refine the initially determined approximate value of the particle diameter. We refined the particle size by a least-squares fit of the positions of the first three peaks in the spherical scattering function to the model

$$\frac{{X_i}}{r} + c,$$

where Xi is the position of the i-th order maximum of the spherical form factor for a 1 nm radius, and c is an arbitrary constant shift introduced to account for imprecise determination of the center of the diffraction pattern and for the fact that experimental particles are not perfectly spherical. In this way, in addition to the particle diameter, we obtain two more values: the shift of the beam center and the mean square error of the fit. Both values are used to estimate the reliability of the obtained parameters.
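Because the model Xi/r + c is linear in 1/r and c, the refinement can be written as an ordinary straight-line fit. The sketch below is one possible reading, not the authors' implementation: it assumes peak positions are expressed as q = 2πS in nm⁻¹, and it takes the reference positions Xi to be the first three secondary maxima of the unit-radius form factor (zeros of the spherical Bessel function j2). Both choices are assumptions on our part.

```python
# Assumed reference maxima X_i for a sphere of 1 nm radius (zeros of j2), in nm^-1.
UNIT_SPHERE_MAXIMA = (5.763459, 9.095011, 12.322941)

def refine_radius(measured_q, reference=UNIT_SPHERE_MAXIMA):
    """Least-squares fit of measured peak positions to X_i / r + c.

    Returns (radius in nm, shift c in nm^-1, mean square error of the fit).
    """
    n = len(measured_q)
    ref = reference[:n]
    mean_x = sum(ref) / n
    mean_p = sum(measured_q) / n
    sxx = sum((x - mean_x) ** 2 for x in ref)
    sxp = sum((x - mean_x) * (p - mean_p) for x, p in zip(ref, measured_q))
    slope = sxp / sxx                  # slope of the line equals 1 / r
    shift = mean_p - slope * mean_x    # intercept of the line equals c
    residuals = [p - (slope * x + shift) for x, p in zip(ref, measured_q)]
    mse = sum(e * e for e in residuals) / n
    return 1.0 / slope, shift, mse

# Synthetic peaks for a 250 nm radius with a small constant shift.
measured = [x / 250.0 + 0.001 for x in UNIT_SPHERE_MAXIMA]
radius, shift, mse = refine_radius(measured)
print(radius, shift, mse)
```

A large fitted shift or a large mean square error would flag a pattern whose center or shape departs from the spherical model, matching the reliability check described above.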