1 Introduction

With the development of new medical imaging techniques and constant refinement of existing ones, the associated storage requirements have been reported to grow exponentially each year [1]. This explains why medical image compression is an active area of research.

Our work belongs to the family of compression algorithms that are based on partial differential equations (PDEs). The general idea behind this approach is to store a sparse subset of the image information and to reconstruct the remaining image via PDE-based inpainting [2].

PDE-based compression has a long tradition for the lossy compression of natural images [2, 3] and videos [4,5,6]. The benefit of PDE-based approaches relative to transform-based codecs like JPEG [7] and JPEG2000 [8] has often been most pronounced at high compression rates [3]. Even though this strategy for lossy compression has also been transferred to three-dimensional images [9], in medical imaging, lossless compression is often preferred to ensure that all diagnostically relevant details are preserved. In some cases, it is even legally forbidden to apply lossy compression for medical image archival [10, 11].

We recently introduced a PDE-based codec for 3D medical images that stores the residuals between the original image and an intermediate PDE-based reconstruction to ensure that the final reconstruction is lossless, and we demonstrated that this strategy led to competitive compression rates [12]. In our current work, we extend this idea for the specific use case of image datasets from diffusion MRI.

Diffusion MRI (dMRI) [13, 14] is a variant of magnetic resonance imaging in which diffusion sensitizing gradients are introduced into the measurement sequence. If the hydrogen nuclei that generate the MR signal undergo a net displacement along the gradient direction during the measurement, the signal is attenuated. Assuming that these displacements result from (self-) diffusion, comparing diffusion-weighted to non-weighted measurements permits computation of an apparent diffusion coefficient.

Taking measurements with different gradient directions captures the directional dependence of the diffusivity. It results from interactions between water and tissue microstructure and therefore carries information about structures that are much smaller than the MR image resolution. Important applications of dMRI include the detection of microstructural changes that are related to aging or disease, and the reconstruction of major white matter tracts, which is referred to as fiber tracking or tractography [15].

The large number of repeated measurements in diffusion MRI leads to large amounts of data. In practice, resulting image datasets are often compressed using GZIP [16]. In our previous work [12], we demonstrated that, compared to this, PDE-based lossless compression can further reduce the memory requirement of individual dMRI volumes by more than 25%. However, applying our codec to each 3D volume independently does not exploit the fact that measurements for nearby gradient directions are usually similar. Moreover, it is relatively time consuming.

In our current work, we address both of these limitations by combining the previous idea of lossless compression via image-space inpainting with a novel approach of PDE-based inpainting in q-space, which is the space spanned by diffusion sensitizing gradient directions and magnitudes. We find that predictions from linear diffusion in q-space can be made with low computational effort and are strong enough to further improve compression rates.

The remainder of our work is organized as follows: Sect. 2 provides the required background and discusses prior work on 4D image compression. Section 3 introduces the components of our proposed codec. Section 4 demonstrates that the resulting compression rates exceed those of several baselines and investigates the effects of specific design choices. Section 5 concludes with a brief discussion.

2 Background and Related Work

We will now introduce the main ideas behind diffusion PDE-based image inpainting and compression (Sect. 2.1), clarify the foundations of diffusion MRI and q-space (Sect. 2.2), and briefly review the literature on 4D medical image compression (Sect. 2.3).

2.1 Diffusion PDE-Based Inpainting and Compression

Inspired by their use for modeling physical phenomena, partial differential equations (PDEs) have a long tradition for solving problems in image processing. In particular, the PDE describing heat diffusion has provided a framework for image smoothing and inpainting [17,18,19,20,21,22,23].

The heat equation relates the temporal change of a temperature, \(\partial _t u\), to the divergence of its spatial gradient \(\nabla u\),

$$\begin{aligned} {\partial _t u} = \textrm{div}(D \cdot \nabla u) \; , \end{aligned}$$
(1)

where D is the thermal diffusivity of the medium. In a homogeneous and isotropic medium, the diffusivity D is a constant scalar. In a non-homogeneous isotropic medium, D would still be a scalar, but depend on the spatial location. In an anisotropic medium, heat dissipates more rapidly in some directions than in others. In that case, D is a symmetric positive definite matrix that is referred to as a diffusion tensor.

In image processing, the gray value at a certain location is interpreted as a temperature u, and Eq. (1) is coupled with suitable boundary conditions. For image smoothing,

$$\begin{aligned} \begin{aligned} {\partial _t u}&= \textrm{div}(D \cdot \nabla u), \quad \Omega \times (0,\infty )\ , \\ {\partial _n u}&= 0, \qquad \qquad \qquad \partial \Omega \times (0,\infty ) \ , \end{aligned} \end{aligned}$$
(2)

where \(\Omega \) is the image domain and n is the normal vector to its boundary \(\partial \Omega \). The original image \(f:\Omega \rightarrow \mathbb {R}\) is used to specify an initial condition \(u=f\) at \(t=0\). For increasing diffusion time t, u will correspond to an increasingly smoothed version of the image.

In image inpainting, values are known at a subset of pixel locations, and unknown values should be filled in. For this, a Dirichlet boundary condition is introduced, which fixes the values at a subset K of pixel locations [2, 20]:

$$\begin{aligned} \begin{aligned} {\partial _t u}&= \textrm{div}(D \cdot \nabla u), \quad \Omega \backslash K \times (0,\infty )\ , \\ {\partial _n u}&= 0, \qquad \qquad \qquad \partial \Omega \times (0,\infty ) \ , \\ u&= f, \qquad \qquad \qquad K\times [0,\infty ) \ . \end{aligned} \end{aligned}$$
(3)

A steady state is then computed at which \(\partial _t u\approx 0\). The ability of PDEs to reconstruct plausible images even from a very sparse subset of pixels has made them useful for image compression [2,3,4].
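As a concrete illustration, the following minimal Python sketch (assuming numpy; the image f and the boolean array mask are hypothetical inputs) approximates the steady state of Eq. (3) for \(D=1\) by Jacobi-style averaging, re-imposing the known values after every sweep:

    import numpy as np

    def lh_inpaint(f, mask, iters=5000):
        # f: 2D image; mask: True where pixel values are known.
        u = np.where(mask, f, f[mask].mean())  # initialize unknown pixels
        for _ in range(iters):
            # Average the four face neighbors; 'edge' padding realizes the
            # homogeneous Neumann condition on the image boundary.
            p = np.pad(u, 1, mode='edge')
            u = 0.25 * (p[:-2, 1:-1] + p[2:, 1:-1] + p[1:-1, :-2] + p[1:-1, 2:])
            u[mask] = f[mask]  # Dirichlet condition on the mask K
        return u

Anisotropic variants discussed below replace the plain neighbor average with a discretization of \(\textrm{div}(D \cdot \nabla u)\).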

Different choices of diffusivity D introduce considerable flexibility with respect to shaping the final result. Fixing \(D=1\) turns Eq. (3) into second-order linear homogeneous (LH) diffusion

$$\begin{aligned} {\partial _t u} = L u, \quad \Omega \backslash K \times (0,\infty ) \end{aligned}$$
(4)

with \(Lu=\Delta u\), where \(\Delta \) denotes the Laplace operator, and the steady state satisfies the Laplace equation \(\Delta u = 0\). Even though the resulting reconstructions suffer from singularities [2] and can often be improved by the more complex models discussed below, they have been used to design compression codecs for cartoon-like images [24], flow fields [25], and depth maps [26,27,28]. Its simple linear nature and fast convergence to the steady-state also make LH diffusion an attractive choice for real-time video compression [5, 6].

Compared to LH diffusion, decreasing the diffusivity as a function of image gradient magnitude permits a better preservation of salient edges [18, 29]. This is referred to as nonlinear diffusion, since the results are no longer linear in the original image f. Rather than just decreasing the overall diffusivity close to edges, modeling D as an anisotropic diffusion tensor permits smoothing along edges, while maintaining or even increasing the contrast perpendicular to them. One widely used model is referred to as edge-enhancing diffusion (EED) [30].

All PDEs that have been discussed up to this point are of second order. Fourth- and higher-order extensions have also been studied, both for smoothing [31,32,33,34,35] and for inpainting [36, 37]. In the simplest case, setting \(Lu=-\Delta ^2 u\) in Eq. (4) leads to the biharmonic (BH) equation. In two and three dimensions, it does not suffer from the singularities that are present in the results of LH diffusion [2, 38], while preserving a simple linear nature. For this reason, BH has been considered for the design of compression codecs [38,39,40,41]. However, it no longer satisfies a min-max principle [33] and it increases running time and sensitivity to quantization error.

Our own previous work [37] proposed an anisotropic fourth-order PDE in which a fourth-order diffusion tensor is constructed from the image gradient in a similar way as in second-order EED. We thus refer to it as fourth-order edge-enhancing diffusion (FOEED). It was shown to result in more accurate inpainting results than second-order EED, and higher PDE-based compression rates, in several examples [12].

Our current work is concerned with compressing data from diffusion MRI, which is similar to hyperspectral images in that it contains a large number of values (channels) at each location [40]. However, the channels in hyperspectral images have a one-dimensional order, while our channels correspond to positions on a spherical shell in q-space, which we will now introduce.

2.2 Diffusion MRI

The signal in magnetic resonance imaging is generated by the hydrogen atoms within water molecules. Their heat motion is referred to as self-diffusion, since it takes place despite a zero concentration gradient. The extent to which this motion is restricted in a cellular environment correlates with microstructural parameters such as cellular density or integrity. Moreover, in the white matter of the human brain, which contains the tracts that connect different brain regions, self-diffusion can occur more freely in the local direction of those tracts than perpendicular to it [42]. Therefore, measuring the apparent diffusion coefficient in different directions provides relevant information about small-scale structures that are below the image resolution of in vivo MRI.

This motivates the use of diffusion MRI. It goes back to the idea of measuring diffusion by introducing a pair of diffusion sensitizing magnetic field gradients into a nuclear magnetic resonance sequence [43]. Integrating it with spatially resolved magnetic resonance imaging permits diffusion measurements at a voxel level [13]. Repeating the measurements with differently oriented gradients reveals a biologically relevant directional dependence in various tissue types, including muscle and the white matter of the brain [44].

Several key parameters of the diffusion sensitization can be summarized in the gradient wave vector

$$\begin{aligned} \textbf{q}=\frac{1}{2\pi }\gamma \delta \textbf{g}, \end{aligned}$$
(5)

where \(\gamma \) is the gyromagnetic ratio of hydrogen nuclei in water, \(\delta \) is the duration of the diffusion sensitizing gradients, and \(\textbf{g}\) corresponds to their direction and strength. The normalized MR echo magnitude \(|E(\textbf{q},\tau )|\) additionally depends on the time \(\tau \) between the pair of gradient pulses. It is computed as the ratio between the corresponding diffusion-weighted measurement and an unweighted measurement with \(\textbf{q}=\textbf{0}\). It is antipodally symmetric, \(|E(-\textbf{q},\tau )|=|E(\textbf{q},\tau )|\).

The relevance of this q-space formalism derives from a Fourier relationship between \(|E(\textbf{q},\tau )|\) and the ensemble average diffusion propagator \(\bar{P}(\textbf{R},\tau )\), which specifies the probability of a molecular displacement \(\textbf{R}\) within a fixed diffusion time [45]. An alternative parameterization of the diffusion gradients is in terms of their direction and a factor \(b=4\pi ^2\Vert \textbf{q}\Vert ^2\left( \tau -\delta /3\right) \), which also accounts for the fact that the diffusion weighting increases with the effective diffusion time \(\left( \tau -\delta /3\right) \).

Fig. 1 Illustration of three diffusion sensitizing gradient directions on a shell in q-space, with equal \(b=700\) (top left). The three diffusion-weighted images have been measured with different gradient directions, as indicated at the bottom left of each image. Comparing them reveals the directional dependence of the dMRI signal.

Due to practical constraints on the overall duration of dMRI measurements, the sampling of q-space is usually limited to one or several reference measurements with \(\textbf{q}=\textbf{0}\), as well as one or a few shells with constant \(\Vert \textbf{q}\Vert \), and thus constant b. This is illustrated in Fig. 1. Such setups focus on the directional dependence of the signal, and typically strive for a uniform distribution of gradient directions on these shells [46]. Our codec assumes dMRI data with such a “shelled” structure, an assumption that is shared by well-established algorithms in the field [47].

2.3 4D Medical Image Compression

Many medical imaging modalities, including magnetic resonance imaging, computed tomography, and ultrasound, can be used to image volumes repeatedly, in order to capture time-dependent phenomena such as organ motion, perfusion, or blood oxygenation. Considerable work has been done on lossless and lossy compression of the resulting 4D (3D plus time) image data. Much of it has borrowed from video coding and has often involved motion compensation [48, 49], which is combined with wavelet transforms [48, 50,51,52,53] or hierarchical vector quantization [54] for compression.

Almost all of these works have compared their compression rates to codecs from the JPEG family. We will also compare our codec to JPEG-LS, lossless JPEG, and JPEG2000. Additionally, we compare compression rates against GZIP [16] which, in conjunction with the Neuroimaging Informatics Technology Initiative (NIfTI) file format, is currently the most widely used way of compressing diffusion MRI data in practice. To keep this comparison fair, we also use Huffman coding or Deflate within our own codec, as opposed to more sophisticated entropy coders that might further improve compression rates [55, 56].

Even though the volumetric images in diffusion MRI are also taken sequentially, their temporal order is less relevant than the q-space structure that was described above: Measuring with the same diffusion sensitizing gradients, but in a different order, should yield equivalent information, even though it permutes the temporal order. To the best of our knowledge, no codec has been proposed so far that exploits this very specific structure. There has been extensive work on compressed sensing for diffusion MRI (see [57, 58] and references therein), but with a focus on reducing measurement time, rather than efficient storage of the measured data.

Recent work has demonstrated the potential of deep learning for lossless compression of 3D medical images [59]. Extending this specifically for diffusion MRI is an interesting future direction. However, our PDE-based approach has the advantage of not requiring any training data. Since medical data are a particularly sensitive type of personal data, obtaining diverse large-scale datasets can be difficult, and the potential of model attacks that could cause data leakage is concerning [60, 61].

3 Proposed Lossless Codec

Traditional PDE-based image compression [2, 3, 12, 37] performs inpainting in image space, which relies on piecewise smoothness of the image. A key contribution of our current work is to additionally exploit the smoothness in q-space. As can be seen in Fig. 1, dMRI signals that are measured with similar gradient directions are correlated.

Our codec uses a spatial PDE for the first few volumes, which is described in more detail in Sect. 3.1. Once sufficiently many samples are available so that a q-space PDE, described in Sect. 3.2, produces stronger compression than the spatial PDE, we switch to it.

The q-space PDE assumes that all volumes are in correct spatial alignment, which might be violated in practice due to subject motion. For this reason, our codec includes a mechanism for motion compensation, described in Sect. 3.3. Our overall compressed file format is specified in Sect. 3.4.

3.1 Lossless 3D Spatial Codec

The initial few volumes are compressed with an image space PDE-based codec that follows our recent conference paper [12]. To make our current work self-contained, we briefly summarize the most relevant points, focusing on the forward, i.e., encoding direction. The decoding process just mirrors the respective steps. The codec is composed of three main parts: data sparsification (initial mask selection), prediction (iterative reconstruction), and residual coding.

Initial Mask Selection As an initial mask, our codec simply stores voxel intensities on a sparse regular grid. More specifically, for a given 3D input image of size \(n_1 \times n_2 \times n_3\), the initial mask is chosen as a hexahedral grid consisting of voxels \((4i_1, 4i_2, 4i_3)\), where \(i_j \in \{0, 1, \ldots , \lfloor \frac{(n_j-1)}{4} \rfloor \}\), \(j \in \{1,2,3\}\).
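In terms of array indexing, extracting this grid is trivial; a minimal sketch, assuming numpy and a hypothetical volume img:

    import numpy as np

    def initial_mask_values(img):
        # Keep voxels (4*i1, 4*i2, 4*i3): roughly 1/64 of the volume.
        return img[::4, ::4, ::4]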

Most lossy PDE-based codecs select a mask adaptively [2, 3], which better preserves important image features such as edges and corners [24]. However, this introduces the need to store the locations of the selected pixels, which can be avoided by the use of fixed grids [27, 62]. In the context of lossless compression, we achieved higher compression rates by combining the latter strategy with iterative reconstruction.

Iterative Reconstruction Making PDE-based compression lossless requires coding the differences between the original image and the PDE-based reconstruction. This is beneficial in terms of compression rates to the extent that those residuals are more compressible than the original image. In general, residuals become more compressible the more accurate the reconstruction is. Therefore, the overall compression rate can be increased by iteratively coding the residuals of some voxels, and refining the predictions for the remaining ones based on them.

Our previous work [12] explored different iterative schemes. The variant that is used here codes the residuals in all remaining face-connected neighbors of the current mask voxels, i.e., up to six voxels per mask voxel. Those neighbors become part of the mask for the next iteration, and the process continues until all voxels have been coded.
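The face-connected growth of the mask corresponds to a 6-neighborhood dilation; a minimal sketch, assuming scipy and a boolean mask array:

    from scipy import ndimage

    def grow_mask(mask):
        # 6-connectivity structuring element: face-connected neighbors only.
        structure = ndimage.generate_binary_structure(3, 1)
        return ndimage.binary_dilation(mask, structure=structure)

In each iteration, residuals are coded at the newly added voxels, grow_mask(mask) & ~mask, which then join the mask until the whole volume is covered.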

Among the PDEs that have been explored for inpainting, we currently consider the two that worked best in [12], i.e., traditional edge-enhancing diffusion (EED) [20] and our recent fourth-order generalization (FOEED) [37].

Residual Coding Residuals are computed in modular arithmetic, so that they can be represented as unsigned integers. The final compression of the initial mask and the residuals is either done via a Huffman entropy encoder or the Deflate algorithm, depending on which gives the smaller output file size.
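A sketch of both steps for 16-bit data, assuming numpy and zlib for Deflate; huffman_encode stands in for any plain Huffman coder and is not part of either library:

    import numpy as np
    import zlib

    def modular_residuals(original, prediction, bits=16):
        # Differences wrap around 2**bits, so they fit into unsigned integers;
        # the decoder inverts this exactly, keeping the scheme lossless.
        m = 1 << bits
        diff = original.astype(np.int64) - prediction.astype(np.int64)
        return (diff % m).astype(np.uint16)

    def entropy_code(residuals, huffman_encode):
        # Store whichever of the two streams is smaller (signaled in the header).
        raw = residuals.tobytes()
        return min(zlib.compress(raw, 9), huffman_encode(raw), key=len)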

In cases where medical images contain a substantial amount of empty space, e.g., a background region with exactly zero image intensity, our previous work [12] found that coding it separately using run length encoding (RLE) can provide an additional benefit. Unfortunately, in dMRI, the background is perturbed by measurement noise, which renders this approach ineffective. Therefore, our current work does not include any dedicated empty space coding.

Fig. 2 An example of q-space inpainting to predict a diffusion-weighted volume in gradient direction \(\textbf{g}_5\) (red double arrow) based on four known volumes, illustrated as filled volumes in directions \(\textbf{g}_i\) (black arrows) (Color figure online).

3.2 PDE-Based q-Space Inpainting

The general idea of q-space inpainting is illustrated in Fig. 2: Once a certain number of diffusion-weighted images with different gradient directions are known, we can use them to predict images that correspond to a new direction. This happens at the voxel level, so that the prediction at a given location is entirely determined by values at the same location in the known images.

This can be understood as "flipping" the setup from Sect. 3.1, where the mask consisted of pixel locations, and the inpainting was repeated with an identical mask for each channel. Instead, the mask now specifies the known channels, and inpainting is repeated for each voxel in the volume.

3.2.1 Compressing Diffusion-Weighted Images

Since we assume that diffusion-weighted measurements are on spherical shells in q-space (Sect. 2.2), we inpaint with second-order linear homogeneous (LH) diffusion

$$\begin{aligned} \partial _t u=\Delta u \end{aligned}$$
(6)

or fourth-order biharmonic (BH) smoothing

$$\begin{aligned} \partial _t u=-\Delta ^2 u \end{aligned}$$
(7)

on the sphere, where \(\Delta \) is the Laplace-Beltrami operator.

Fig. 3 The q-space sampling of the dMRI data used in our experiments (left), and the resulting triangulation that is used within the finite element method.

Given that our samples do not form a regular grid, we numerically solve these equations using finite element methods (FEMs) [63, 64]. For this, we first construct a 3D Delaunay tessellation from the set of all gradient vectors \(\textbf{g}_i\) and their antipodal counterparts \(-\textbf{g}_i\), and then extract a triangular surface mesh from it. Figure 3 shows an example of the given vectors (left), and the resulting triangular mesh (right).
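For points on a sphere, the surface of the Delaunay tessellation coincides with the convex hull, so the mesh construction reduces to a few lines; a minimal sketch, assuming scipy and unit-norm gradient directions g of shape (n, 3):

    import numpy as np
    from scipy.spatial import ConvexHull

    def sphere_mesh(g):
        # Antipodal completion: each gradient direction yields two vertices.
        verts = np.vstack([g, -g])
        hull = ConvexHull(verts)        # triangulates the spherical point set
        return verts, hull.simplices    # vertex array and triangle indices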

Similar to PDE-based inpainting in the image domain, we fix the known values by imposing Dirichlet boundary conditions at the vertices corresponding to the previously coded diffusion-weighted images, again accounting for antipodal symmetry so that each known image determines the values of two vertices. Once a steady state has been reached, the values at locations corresponding to diffusion-weighted images that are yet to be coded can serve as predictions. As before, we compute residuals with respect to those predictions in modular arithmetic and apply Huffman coding or Deflate to them.

We found that once a sufficient number of diffusion-weighted images are available as a basis for q-space inpainting, its residuals become more compressible than those from iterative image space inpainting. Our codec adaptively determines a suitable point for switching from spatial to q-space predictions: after the first diffusion-weighted volume, it compares the compressed size of each subsequent volume under the spatial codec (Sect. 3.1) with its size under q-space inpainting, and switches to the latter on the first volume for which it is beneficial. To limit computational effort, the spatial codec is no longer tried for subsequent volumes.

3.2.2 Accelerated Computation

Since q-space inpainting happens at a voxel level, it should be repeated for each voxel of the 3D image. However, the computational cost of running the FEM solver for each voxel separately is extremely high. Fortunately, linearity of the PDEs and the fact that the Dirichlet boundary conditions are imposed on the same vertices for each voxel permit a significant speedup.

Formally, we can consider one time step of numerically solving Eqs. (6) or (7), at time t, as applying a discrete linear differential operator \(\textbf{D}\), which is determined by the vertices and connectivity of our triangular mesh, to a discrete input \(\textbf{u}^{(t)}\in \mathbb {R}^c\),

$$\begin{aligned} \textbf{u}^{(t+1)}=\textbf{D} \left[ \textbf{u}^{(t)}\right] , \end{aligned}$$
(8)

where c is the number of channels (q-space samples per voxel). The boundary conditions ensure that \(u_{k_j}^{(t+1)}=u_{k_j}^{(t)}\) at positions \(k_j\) that correspond to the fixed (previously coded) channels.

The inpainting result is obtained as the fixed point \(\textbf{u}^{(\textrm{FP})}\) as \(t\rightarrow \infty \). It can be approximated by iterating \(\textbf{D}\) a sufficient number of times, resulting in an operator \(\textbf{D}_\textrm{FP}\) that directly maps

$$\begin{aligned} \textbf{u}^{(\textrm{FP})}=\textbf{D}_\textrm{FP} \left[ \textbf{u}^{(0)}\right] . \end{aligned}$$
(9)

\(\textbf{D}_\textrm{FP}\) is still linear, and we observe that its kernel is the subspace corresponding to the unknown q-space samples, so that their initialization in \(\textbf{u}^{(0)}\) does not influence the steady state [4]. Therefore, we can rewrite Eq. (9) as

$$\begin{aligned} \begin{aligned} \textbf{D}_\textrm{FP} \left[ \textbf{u}^{(0)}\right] =\,&u^{(0)}_{k_1} \textbf{D}_\textrm{FP} [\textbf{e}_{k_1}] + u^{(0)}_{k_2} \textbf{D}_\textrm{FP} [\textbf{e}_{k_2}]\\ {}&+ \ldots + u^{(0)}_{k_n} \textbf{D}_\textrm{FP} [\textbf{e}_{k_n}], \end{aligned} \end{aligned}$$
(10)

where \(\textbf{e}_{k_j}\) are the indicator vectors of the known samples \(k_j\).

In other words, by computing \(\textbf{w}_{k_j}=\textbf{D}_\textrm{FP} [\textbf{e}_{k_j}]\), we can obtain weight vectors that specify how the known values are combined to predict the unknown ones. They are analogous to the “inpainting echoes” that have been computed in previous work [65] for the purpose of optimizing tonal data. Omitting the irrelevant initialization of the unknown values from the input \(\textbf{u}^{(0)}\), and the known values from the output \(\textbf{u}^{(\textrm{FP})}\) yields a weight matrix \(\textbf{W}\) of shape \(m\times n\) for n known and m unknown values.

We compute the coefficients of \(\textbf{W}\) by running the FEM n times. In the jth run, we set the value corresponding to \(k_j\) to one, and all remaining values to zero. After numerically solving the Laplace or biharmonic PDE, the values at the m unique vertices that correspond to unknown DWIs yield the jth column of \(\textbf{W}\).

Finally, \(\textbf{W}\) allows us to make efficient predictions in each voxel, by simply multiplying it to a vector that contains the intensities in that voxel from the previously coded diffusion gradients.
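The following sketch summarizes this precomputation and its per-voxel use, assuming numpy and a hypothetical function solve_steady_state(values) that runs the FEM inpainting for one vector of vertex values; known and unknown are index arrays into the mesh vertices, and the antipodal pairing of vertices is omitted for brevity:

    import numpy as np

    def build_weight_matrix(solve_steady_state, n_vertices, known, unknown):
        # One FEM run per known vertex: an indicator input yields column j of W.
        W = np.empty((len(unknown), len(known)))
        for j, k in enumerate(known):
            e = np.zeros(n_vertices)
            e[k] = 1.0
            u_fp = solve_steady_state(e)   # steady state for indicator input
            W[:, j] = u_fp[unknown]
        return W

    # Per voxel, prediction then reduces to a matrix-vector product:
    # predicted = W @ values_at_known_vertices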

3.2.3 Implementation Details and Running Times

We numerically solve Eqs. (6) and (7) via the open-source FEM solver package FEniCSx [64]. For implementation details, we refer to its tutorials [66]. Applying this solver to each voxel of a \(104 \times 104 \times 72\) volume takes close to two and four hours, respectively, for LH and BH PDEs on a single 3.3 GHz CPU core.

The acceleration from the previous section reduces this to only 1.6 s and 2.4 s per volume, respectively. This includes the time for building a Delaunay tessellation, which is computed with SciPy [67], and extracting a surface mesh from it using the BoundaryMesh method from FEniCSx.

3.2.4 Compressing \(b=0\) Images

Fig. 4 Top (A): Compression of the first \(b=0\) volume using the recently proposed lossless 3D codec [12]. Bottom (B): Compression of the remaining \(b=0\) volumes using residuals in modular arithmetic. Residuals are taken with respect to the first \(b=0\) volume, after motion correction.

Our general approach simplifies for unweighted volumes with \(b=0\). Again, the first of them is compressed using the spatial codec. If multiple \(b=0\) images were acquired to increase the signal-to-noise ratio, our codec compresses the remaining ones by taking the respective residuals with respect to the first \(b=0\) volume, as illustrated in Fig. 4.

3.3 Motion Correction

Subject motion commonly occurs during the lengthy dMRI acquisitions and is typically accounted for by applying motion correction based on image registration [68]. We also include this step in our codec, since inpainting in q-space requires a correct spatial alignment of all 3D volumes so that predictions are based on information from the same location within the brain.

Fig. 5 Boundary effects in volume alignment. Top: Original DWI volume. Center and Bottom: Motion corrected with zero padding and nearest neighbor extrapolation, respectively.

We implement motion correction as follows:

1. We perform affine registration of each volume to the same \(b=0\) volume, which is used as a common reference. This yields a transformation matrix \(\textbf{T}_{X \rightarrow b_0}\) which aligns DWI volume X to the \(b=0\) reference.

2. When predicting a DWI volume P, we transform all known volumes X via the affine transformation \(\textbf{T}_{P \rightarrow b_0}^{-1} \circ \textbf{T}_{X \rightarrow b_0}\), which can be computed from the transformations in Step 1.

3. In addition to resampling each known volume X, we re-orient its diffusion gradient direction \(\textbf{g}_X\) according to the rotational part of the transformation in Step 2. Omitting this step would lead to incorrect relative orientations of diffusion gradient directions [69], which could again reduce the accuracy of q-space inpainting.

Transforming images via a common reference allows us to align them without having to perform image registration on all pairs of volumes. This saves considerable computational effort. Combining the two transformations and applying them in a single step also reduces computational effort, and simultaneously reduces image blurring compared to a two-stage implementation that would involve interpolating twice.
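A sketch of this composition in homogeneous voxel coordinates, assuming numpy and scipy; T_X_to_b0 and T_P_to_b0 are the 4x4 affine matrices from Step 1, and world-to-voxel scaling is ignored for brevity:

    import numpy as np
    from scipy.linalg import polar
    from scipy.ndimage import affine_transform

    def align_known_volume(vol_X, T_X_to_b0, T_P_to_b0, g_X):
        # Single combined transform X -> P avoids interpolating twice.
        T = np.linalg.inv(T_P_to_b0) @ T_X_to_b0
        # affine_transform maps output to input coordinates, hence the inverse;
        # mode='nearest' realizes nearest neighbor extrapolation at the boundary.
        Tinv = np.linalg.inv(T)
        vol = affine_transform(vol_X, Tinv[:3, :3], offset=Tinv[:3, 3],
                               mode='nearest')
        # Re-orient the gradient with the rotational part of T, obtained via
        # polar decomposition of its linear part.
        R, _ = polar(T[:3, :3])
        g = R @ g_X
        return vol, g / np.linalg.norm(g)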

In addition to the computational expense, motion correction incurs the cost of having to store the affine matrices \(\textbf{T}_{X \rightarrow b_0}\) along with the compressed data. Experimental results in Sect. 4.4 will demonstrate that this storage cost is outweighed by the increase in compression rate when q-space inpainting properly accounts for motion.

Subject movement correction and B-matrix reorientation are done using the freely available FSL tools [68] and the DIPY imaging library [70], respectively. A practically relevant implementation detail concerns boundary effects. As illustrated in Fig. 5, missing information can enter the field of view when applying image transformations. We found that q-space inpainting near the boundary of the domain works more reliably if we resolve these cases with nearest neighbor extrapolation, rather than with zero padding.

Fig. 6 Example images from our two dMRI datasets, without deliberate head motion (left) and with strong motion artifacts (right). In each case, six corresponding sagittal slices from different diffusion weighted images (DWIs) are shown. Note that subject motion leads to spatial misalignments between DWIs, but also to artifacts within individual images.

3.4 Compressed File Format

In our current implementation, the relevant data are spread over multiple files whose sizes are added when computing compression rates.

The volumes that are compressed with the 3D lossless codec (Sect. 3.1) are stored with the same header as in [12]. Stated briefly, it contains the original minimum and maximum voxel values (4 bytes), the sizes of the compressed data streams for the zero-voxel binary mask and the mask intensities (8 bytes), the diffusivity contrast parameter (4 bytes), the type of PDE (2 bits), the dilation mode (1 bit), and the types of encoding for mask intensities and residuals (2 bits).

For each volume that is compressed with q-space inpainting, the header contains the original minimum and maximum voxel values (4 bytes), the type of PDE (2 bits), the type of encoding for the residuals (1 bit), and the volume number in the original order (2 bytes).
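For concreteness, the per-volume header for q-space-coded volumes could be packed as follows; a minimal sketch assuming Python's struct module, with field names of our own choosing, and with the sub-byte flags folded into one byte, whereas the actual format stores them with bit-level precision:

    import struct

    def pack_qspace_header(min_val, max_val, pde_type, coder_type, volume_no):
        # min/max voxel values (2 x 2 bytes), one byte carrying the PDE type
        # (2 bits) and the residual coder type (1 bit), and the volume number
        # in the original order (2 bytes).
        flags = (pde_type & 0b11) | ((coder_type & 0b1) << 2)
        return struct.pack('<HHBH', min_val, max_val, flags, volume_no)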

Mask and residual values themselves are stored after compression with pure Huffman coding or Deflate, depending on what gave a smaller file size.

In addition, we store the NIfTI header (348 bytes) as well as files containing b values and gradient vectors in their original ASCII formats. For simplicity, affine transformations for motion correction are also kept in the ASCII format generated by FSL FLIRT [68].

4 Results and Discussion

4.1 Data

We evaluate our codec on two dMRI datasets that were made publicly available by Koch et al. [71] and are specifically suited to investigate the impact of subject motion compensation. Both datasets have been collected from the same subject (male, 34 years) in the same scanner, a 3T MAGNETOM Skyra (Siemens Healthcare, Erlangen, Germany), with an identical measurement sequence. For the first scan, the subject received the usual instruction of staying as still as possible during the acquisition. For the second scan, the subject was asked to move his head, to deliberately introduce motion artifacts.

From these datasets, we use the five non-diffusion weighted (\(b=0\)) MRI scans each, as well as 30 diffusion weighted images (\(b=700\,\textrm{s}/\textrm{mm}^2\), diffusion gradient duration \(\delta =334\,\textrm{ms}\), spacing \(\tau =445\,\textrm{ms}\)). Each image consists of \(104 \times 104 \times 72\) voxels with a resolution of \(2\times 2\times 2\,\textrm{mm}^3\). The data, and the effects of subject motion, are illustrated in Fig. 6.

4.2 DTI Baseline

We compare the signal predictions from our q-space PDE to a simple baseline, which is derived from the diffusion tensor imaging (DTI) model. DTI is widely used in practice, due to its relative simplicity and modesty in terms of scanner hardware and measurement time.

It rests on the assumption that the diffusion propagator \(\bar{P}(\textbf{R},\tau )\) is a zero-mean Gaussian whose covariance matrix is proportional to the diffusion tensor \(\textbf{D}\), a symmetric positive definite \(3\times 3\) matrix that characterizes the local diffusion [14]. The signal model in DTI relates the diffusion-weighted signal \(S(\hat{\textbf{g}},b)\) for a given b-value and gradient vector direction \(\hat{\textbf{g}}=\textbf{g}/\Vert \textbf{g}\Vert \) to the unweighted signal \(S_0\) according to

$$\begin{aligned} S(\hat{\textbf{g}},b)=S_0\textrm{e}^{-b\hat{\textbf{g}}^T\textbf{D}\hat{\textbf{g}}}. \end{aligned}$$
(11)
Table 1 Compressed file sizes from separate PDE-based compression of each 3D scan (baseline), from different variants of our proposed lossless codec, as well as from GZIP and lossless codecs from the JPEG family

Fitting \(\textbf{D}\) requires at least one reference MR image \(S_0\), plus diffusion-weighted images in at least six different directions, which are usually taken with a fixed nonzero b-value. Equation (11) can then be used to predict the diffusion-weighted signal in any desired direction. In our experiments, we compare our PDE-based to DTI-based predictions that account for the same set of known measurements.
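Per voxel, Eq. (11) can be fitted by log-linear least squares; a minimal sketch, assuming numpy, with b-values bvals of shape (n,), unit gradient directions gdirs of shape (n, 3), measured signals S, and reference signal S0:

    import numpy as np

    def fit_dti(S, S0, bvals, gdirs):
        # log(S/S0) = -b * g^T D g is linear in the six unique tensor entries.
        gx, gy, gz = gdirs[:, 0], gdirs[:, 1], gdirs[:, 2]
        B = -bvals[:, None] * np.column_stack(
            [gx*gx, gy*gy, gz*gz, 2*gx*gy, 2*gx*gz, 2*gy*gz])
        d, *_ = np.linalg.lstsq(B, np.log(S / S0), rcond=None)
        return np.array([[d[0], d[3], d[4]],
                         [d[3], d[1], d[5]],
                         [d[4], d[5], d[2]]])

    def predict_dti(D, S0, b, g):
        # Predict the signal for a new direction g via Eq. (11).
        return S0 * np.exp(-b * (g @ D @ g))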

4.3 Comparing Lossless Codecs for Diffusion MRI

A comparison of file sizes that can be achieved on our two test datasets with different lossless codecs is provided in Table 1. As a baseline, the first two rows show results from coding each 3D volume independently with our recently proposed PDE-based codec [12], using second-order (R-IEED-1) and fourth-order anisotropic diffusion (R-IFOEED-1). Additional savings of other codecs with respect to R-IEED-1 are given in percent.

The second block in Table 1 shows results from several variants of our proposed new codec, which adaptively combines inpainting in q-space and image space. Highest compression rates were achieved when combining linear homogeneous (LH) diffusion in q-space with R-IFOEED-1 in image space, closely followed by R-IEED-1. Biharmonic (BH) diffusion in q-space also produced useful, but slightly weaker results.

Both q-space diffusion approaches achieved better compression than predictions from DTI (Sect. 4.2). This could be due to the fact that the quadratic model of diffusivities in Eq. (11) is known to be an oversimplification in many parts of the brain [72], and the PDE-based approaches provide more flexibility.

DTI requires independent coding of at least seven 3D volumes, which is why we fixed this split for the DTI baseline in our experiments. PDE-based imputation makes it possible to switch to q-space inpainting earlier, and our adaptive selection does so after four volumes in the low-motion data, and after five volumes in the data with strong motion.

Switching to q-space inpainting also speeds up our codec. Our implementation of R-IEED-1 and R-IFOEED-1 requires approximately 478 s and 6185 s, respectively, for one volume on a single 3.3 GHz CPU core. Even though it would be possible to further optimize this, exploiting linearity in qLH and qBH, as described in Sect. 3.2, significantly lowers the intrinsic computational complexity, so that even a straightforward implementation only requires 1.64 s and 2.4 s per volume, respectively.

Table 2 Compressed file sizes when omitting motion compensation, and the relative benefit from motion correction

It can be seen in Fig. 6 that subject motion during different phases of the acquisition leads to different types of artifacts. Results in Table 1 include the motion correction described in Sect. 3.3, which compensates spatial misalignments of different scans. However, motion can also lead to signal dropouts or to distortions within scans, which our current codec does not explicitly account for. This explains why q-space inpainting is less effective on the second as compared to the first scan. However, even on this challenging dataset that exhibits unusually strong artifacts, q-space inpainting still provides a benefit compared to all other alternatives.

Finally, Table 1 shows results from several other lossless codecs for comparison. GZIP is most widely used in practice, but the resulting files are more than 35% larger than those from our proposed codec. Among the lossless codecs from the JPEG family, JPEG2000 is the only one that outperforms R-IEED-1 for per-volume compression, and only by a small margin. Our new hybrid methods that combine image space and q-space inpainting always performed best.

4.4 Benefit from Motion Correction

Table 2 investigates the benefit of motion correction (Sect. 3.3). It shows the file sizes that result when motion correction is removed from our codec, and indicates in percent the benefit of the corresponding results with motion correction (Table 1).

Even on the first scan, in which the subject tried to keep his head still, compensating for small involuntary movements yields a slight benefit. The effect is largest when imputing via qBH and DTI. This might be explained by the fact that qLH satisfies the min-max principle, which makes it more robust against inaccuracies in its inputs, and provides another argument in its favor.

When strong head motion is present (second scan), restoring a correct voxel alignment via motion correction becomes essential for q-space inpainting. Without it, the switch to q-space imputation happens much later, and the overall file size is larger than when coding each volume independently. This is explained by the fact that our codec always applies difference coding to the \(b=0\) images and that this becomes detrimental when those images are strongly misaligned.

4.5 Effect of Re-ordering DWIs

Since q-space imputation relies on the previously (de)coded diffusion weighted images, its accuracy depends on the order in which we process the gradient directions.

Fig. 7 Given a set of previously coded DWIs, the closest strategy (left) selects the volume whose gradient vector has the smallest angular distance from the known ones, to maximize expected prediction accuracy. The furthest strategy (right) maximizes the angular distance, aiming for a more uniform coverage of the sphere for subsequent steps. The sketch shows the directions selected in the first three steps as black double arrows, the fourth direction as a red dot.

Two opposing greedy strategies are illustrated in Fig. 7: Always selecting the closest gradient direction, i.e., the one with the smallest angular distance from the already known ones, can be expected to result in the most accurate prediction, in the same spirit as our spatial codec (Sect. 3.1), which iteratively grows a mask of known pixels around an initial set of seed points.

Table 3 Compressed file size for scan 1 (without strong motion) when ordering the diffusion-weighted images differently

On the other hand, the spatial codec starts with a seed mask that covers the full domain sparsely, but uniformly. Achieving something similar motivates selecting the gradient direction that is furthest from any of the known ones. Even though this strategy can be expected to lead to lower accuracy, and therefore to less compressible residuals in the first few iterations, later iterations might benefit from the more uniform coverage of the overall (spherical) domain.
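A sketch of the furthest-first ordering, assuming numpy, with unit gradient directions dirs of shape (n, 3); the angular distance respects antipodal symmetry, and replacing max by min yields the closest strategy:

    import numpy as np

    def furthest_first_order(dirs, first=0):
        # Angular distance on the projective sphere (antipodal pairs coincide).
        dist = np.arccos(np.clip(np.abs(dirs @ dirs.T), 0.0, 1.0))
        order = [first]
        remaining = set(range(len(dirs))) - {first}
        while remaining:
            # Pick the direction whose nearest coded direction is furthest away.
            nxt = max(remaining, key=lambda i: min(dist[i, j] for j in order))
            order.append(nxt)
            remaining.remove(nxt)
        return order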

Table 3 presents the effect of these two selection strategies on final file sizes. The results are from the first scan, without strong motion. Overall, greedily selecting the furthest gradient vector gives slightly smaller overall file sizes. Therefore, this is the strategy that we followed in all other experiments.

5 Conclusion

In this work, we introduced a PDE-based lossless image compression codec that explicitly exploits both the spatial and the q-space structure in diffusion MRI. To our knowledge, it is the first codec that has been tailored to this type of data. We demonstrated a clear improvement over PDE-based codecs that treat each volume separately, and over other established baselines including GZIP and spatial codecs from the JPEG family.

We evaluated several variants of our codec, and found that q-space predictions with linear homogeneous diffusion permitted the highest compression rates among them. With our proposed method for accelerated computation, it could also be applied at a very reasonable computational cost. We further demonstrated the importance of including motion correction, and proposed an efficient implementation that is based on affine image transformations via a common reference. Finally, we found that the order of coding the diffusion-weighted volumes had a relatively minor effect, but that a greedy strategy that strives to cover the sphere as uniformly as possible provides a small benefit.

In the future, one might attempt to replace the switching between image space and q-space inpainting with a PDE that jointly operates on the product space. However, this is likely to substantially increase the computational effort, and introduces the issue of properly balancing image space and q-space diffusion. Similarly, employing nonlinear PDEs for q-space predictions might further increase compression rates, but is likely to cause a high computational cost.