by John Melonakos, Georgia Tech
Disclaimer: Note that these thoughts are the result of conversations I have had with friends and colleagues in this line of work over the past several years, so they are by no means entirely original with myself. Also, some of the following comments may be entirely or partially incorrect, and I welcome any comments and critiques by email at <jmelonak at ece.gatech.edu>.
This page contains a hodge podge of thoughts on Diffusion-Weighted MRI (DW-MRI) processing and algorithm development. From a bird's eye view, the most general question for a DW-MRI data processor is: What clinically relevant information can be leveraged from DW-MRI data and used for scientific and clinical studies? Furthermore, what algorithms, programming constructs, clinical guidance, and training are needed to make this happen? The following are some quick thoughts that may be considered when attempting to answer these questions, in no particular order:
DW-MRI vs DT-MRI
A Preliminary Note: I should mention that I have taken the position that for the purposes of an algorithms person, it is the DW-MRI data that is handed to us by the acquisition folks, so it is with the DW-MRI data that image processing begins. A more adventurous algorithms person may attempt to look back at reconstruction algorithms which are tied to the physics of MR in the hope of parlaying even more information from the scanner to the end result... (I'm not that adventurous!).
In answering the questions posed above, it is common for people to assume from the onset that we are using terminology related to tensors (i.e. DT-MRI, DTI). The raw data coming out of the scanner is referred to as DW-MRI or DWI. The difference between the "T" and the "W" is that the "T" has undergone a preprocessing step in which tensors are computed at each voxel that are a function of the original "W" data. Hence, the data resulting from this preprocessing step is referred to at DT-MRI. There are advantages and disadvantages to using this preprocessing tensor construction step. The point here is that when we think about trying to find ways to go from the scanner data to clinical answers, we are talking about going from DW-MRI data to clinical answers. In other words, it is okay to think outside the box and to consider ways in which ellipsoidal constraints arising from the tensor model may be meaningfully relaxed.
A Spaghetti Fancy
In this line of research, the most common figures one can expect to see in the key Neuroimage, Science, Nature, etc. publications are pretty pictures of a bunch of open curves (or fibers) sprawling through the brain. These images are so common that some people have come to presuppose that this is the kind of output one should expect from algorithms which process DW-MRI or DT-MRI data. Of course, these images are the result of streamline techniques, which is one important branch of DW-MRI research. But if you consider recent 2006-2007 publications on DW-MRI and DT-MRI algorithms, you quickly realize that this research trend has recently experienced a shift toward other optimal path / volumetric segmentation type approaches, which do not produce these kinds of sprawling fiber figures. So, in current and future publications, one can expect to see a greater variety of visualizations which reflect the other branches of research emerging in DW-MRI processing.
Synchronizing Algorithm Terminology
An important step in approaching the questions above is the development of a set of common terms which we can use to describe our work. The following is a list of commonly used and confused terms in this line of research with corresponding definitions. Note that some things (such as fractional anisotropy (FA), b-values, etc) have straightforward definitions and need no further attempt at solidifying. These definitions are certainly not perfect and could benefit greatly from community input:
- DW-MRI, DWI - Diffusion-Weighted Magnetic Resonance Imagery - the raw data coming out of the scanner - small values correspond to strong diffusion.
- DT-MRI, DTI - Diffusion-Tensor Magnetic Resonance Imagery - the 6-independent component data (which constitute the unique elements of a symmetric 3x3 tensor) resulting from the construction of tensors from DW-MRI data - high eigenvalues correspond to strong diffusion.
- fibers, tracts - open curves representing possible diffusion paths through the brain - while the paths themselves certainly do not correspond to any micro-architecture, they provide possible paths through which particles may diffuse through the volume - no optimality criterion are associated with the terms fibers or tracts.
- streamlines - these are a particular subset of fibers and tracts which are produced via streamline-based algorithms - streamline-based algorithms employ a local decision-making methodology, propagating paths until a particular termination criterion is reached - no optimality criterion are associated with the term streamlines.
- fiber bundles - volumetric bundles of axons which are large enough to be resolved with the typical 1-3mm resolution of MRI scanners - whether or not this term permits structures with branchings is still an open question.
- fiber clusters - anatomically equivalent to fiber bundles - algorithmically different from fiber bundles because clusters refer to the fact that fiber clusters are formed by analyzing the group behavior of many streamlines.
- optimal fiber, optimal tract, optimal path - the best open curve connecting two ROIs, as determined via an optimization of an energy functional or probability measure - note that the underlying metric which is optimized may be different in the various algorithms, but the essential concept is the same wherein the optimal path connecting two ROIs is found according to the particular choice of metric - note that these do not necessarily constitute geodesics, as the particular energy functional need not have an associated geometric interpretation.
- geodesic fiber, geodesic tract, geodesic path - a subset of optimal fibers, optimal tracts, optimal paths for which the optimal path may be thought of as the geodesic of a particular geometric manifold (such as a Finsler, Riemannian, or Euclidean manifold).
- anchor tract - a subset of optimal fibers, optimal tracts, optimal paths for which the optimal path is leveraged to initialize (i.e. anchor) a fiber bundle segmentation algorithm.
- fiber tractography - algorithms which take raw DW-MRI or DT-MRI data as an input and produce fibers, streamlines, or an optimal tract as output - note that this term was used first to describe streamline techniques, but has since been applied to optimization techniques - therefore, one must be careful when using this term to describe algorithms since two very different classes of algorithms have borrowed the term.
- fiber bundle segmentation - algorithms which take raw DW-MRI or DT-MRI data as an input and produce volumetric segmentations of fiber bundles as an output.
- connectivity maps - these are maps which indicate the connectivity of a particular region with the rest of the domain (typically the entire brain) - these connectivity maps come in a variety of flavors, as probability maps, as arrival time maps resulting from energy minimization techniques, etc. - note that it is possible for algorithms to generate connectivity maps which indicate the degree of connectivity of two ROIs, without ever explicitly segmenting a single fiber or fiber bundle.
- Finsler metric - any algorithm which attempts to find optimal connections between two ROIs must be based upon a Finsler metric - in other words, the conditions of the Finsler metric are the minimally necessary conditions upon which any metric must be built to guarantee that an optimum may be reached - these necessary conditions include: 1) a regularity condition, 2) a positive homogeneity condition in the directional component, and 3) a strong convexity condition.
- Riemannian metric - a subset of Finsler metrics derived by adding in an additional quadratic constraint - intuitively, this additional constraint is arises from the imposition of the ellipsoidal diffusion model.
- Hamilton-Jacobi-Bellman equation - the subset of Hamilton-Jacobi equations which are considered for DW-MRI functionals (i.e. for any Finsler functional, which includes all Riemannian functionals) - this subset of Hamilton-Jacobi equations reflect the fact that we are minimizing functionals which depend upon position and direction.
Validation Soap Box
When we consider assessing the utility of a particular tool in serving the needs of our clinical customers, we should stop and consider the factors upon which our assessment is based and biased. This is particular important when attempting to validate results in an environment absent of ground truth. In the absence of ground truth, tools are judged according to: visual appeal, reproducibility, user-friendliness, and (if you're shooting for the stars and want something for which there is ground truth) accuracy in diagnosis.
Broadly speaking, each tool consists of four components which contribute toward the utility of the tool: the algorithm, the engineering, the incorporation of clinical knowledge, and the training. It is common to interchange the terms tool and algorithm when assessing the utility of a tool. However, especially in an environment where results are primarily judged by the visual appeal and user-friendliness, this is misleading. The fear here is that streamline-based algorithms which have been around the longest and have enjoyed a great deal of engineering input, clinical guidance, and community training, will somehow be inordinately attended to because the judgment criterion used for these tools is based on these non-algorithmic components. It is useful to ask the question, "If algorithm A had received as much engineering, clinical input, and training as algorithm B, how would this change things?"
Algorithms designed to answer the primary DW-MRI questions listed above come in a variety of flavors. There are many possible paths to go from raw DW-MRI data to clinical answers. In fact, most algorithms do not even attempt to go all the way from the DW-MRI data to a clinical comparison of different populations. Instead, algorithms focus on particular subsets of the various possible pipelines to get from DW-MRI data to clinical answers. In attempting to find method in the madness, it is useful to consider the key distinguishing factors by which algorithms may be categorized. The following is an attempt at listing a few of categories of distinguishing algorithm characteristics (please refer to the Synchronizing Algorithm Terminology section for explanations of the terms):
- Is an explicit segmentation provided?
- If so, does the segmentation output fibers, fiber bundles, or fiber clusters?
- If so, does the segmentation capture anatomically feasible structure (within the realm or realistic resolution), such as main fiber bundle trunks? Or does the segmentation attempt to paint a picture of micro-structure, such as finely branching fibers or fibers which fan out in a particular region, recognizing the fact that such micro-structure is certainly not exact due to the lack of resolution? Or does the segmentation provide a little of both worlds?
- If so, does the segmentation output provide avenues for statistical analysis of white matter structural characteristics, such as volume, shape, etc?
- If so, does the segmentation correspond to something optimal (i.e. can you say anything special about the segmentation other than the fact that it looks good)?
- If so, does the segmentation provide a parameterization which permits the computation of statistics as a function of position along the fibers or fiber bundles?
- Is a connectivity map provided?
- If so, does the connectivity map explain fiber connectivity or fiber bundle connectivity (i.e. does it tell how well a volumetric fiber bundle connects two ROIs or does it tell the best connectivity available among all possible infinitesimally small paths connecting the two ROIs).
- If so, which metrics are optimized in the computation of the connectivity map and is the algorithm flexible to determine connectivity under a variety of metrics?
Note that generally-speaking, the goal of DW-MRI segmentation and connectivity maps are to produce statistics by which two clinical populations may be compared and contrasted. Therefore, it is important to consider which
- What statistics are provided for comparison of clinical populations?
- The characteristics have two characteristics: the statistic itself, and the aggregation used to determine the statistics.
- Example statistics include: fractional anisotropy, relative anisotropy, the optimal connectivity measure resulting from the algorithms, structural bundle characteristics (such as size, shape, etc.
- Example aggregation methods include: point-wise statistical comparisons without parameterization, point-wise statistical comparisons with parameterization, tract-wise statistical comparisons, case-wise statistical comparisons, etc.
- The characteristics have two characteristics: the statistic itself, and the aggregation used to determine the statistics.
Beware of the Segmentation-Driven Statistical Comparison Paradox
A common temptation among DW-MRI researchers is to use the statistical measures (such as FA) with which they wish to compare between two populations as an input in their tractography or fiber bundle segmentation algorithms. The paradox here is the following: If you use a statistic to give you a segmentation that you like, then when you are comparing two populations based on this statistic you are unable to determine whether any statistical significance arising from the study is due to the fact that the populations are truly statistically distinct or due to the fact that the segmentations you achieved in the first place were biased towards favoring that statistic. Therefore, it is highly important that the statistics used in the population comparisons be independent of the statistics used to achieve the tractography or segmentation results.
Beware of Mr. Legendre
Another common temptation of DW-MRI algorithm designers is to ignore the fact that the directional dependence of the data critically impacts the class of numerical algorithms which may be used to find optimizations for these energy functionals. It is particular troublesome when techniques designed for directionally isotropic data (or data which does not depend upon direction) are applied directly (or with hacked modifications) to directional datasets. The key here is the Legendre Transform, which can be related to the conversion of particle-based schemes to front-based schemes. Front-based schemes are desirable for their efficiency and accuracy in computing solutions to energy functionals.
For directionally isotropic data, the Legendre Transform giving the front speed is simply related to the inverse of the cost of moving a particle. However, in directionally dependent schemes, the Legendre Transform giving the front speed must take into account the directionality of the underlying particle movement cost structure. In other words, the orientation of the front matters. Several front-propagation numerical algorithms have been proposed to solve directionally dependent energy functionals. These numerical algorithms reflect the added complexity which arises from the more complicated Legendre Transform. The most popular methods are: 1) a sweeping approach which tests all possible directions for the front update (Fast Sweeping) and 2) a modification of Fast Marching which accounts for the directionality and keeps a critical sorted heap to ensure proper front evolution (Upwind Ordered Methods).
Hamilton-Jacobi vs Hamilton-Jacobi-Bellman vs Eikonal Equations
These are the partial-differential equations (PDEs) which are typically solved when attempting to find optimal paths in a domain, such as optimal tracts in the brain.
Let U(p): R^n->R be a scalar field. Now let the DW-MRI data coming out of the scanner be the gradient of U(p), denoted grad(U(p)), which is equivalent to the directional information. General Hamilton-Jacobi equations are a given as a function of U(p), grad(U(p)), and p as follows: H(U(p),grad(U(p)),p)=0. However, for functionals built upon DW-MRI data, we are not given the scalar field U(p). Hence, we end up solving a subset of Hamilton-Jacobi equations, which does not depend upon U(p). This subset is termed Hamilton-Jacobi-Bellman equations, given by: H(grad(U(p)),p)=0. So any energy functional attempting to find optimal paths in DW-MRI data is effectually solving a PDE of the Hamilton-Jacobi-Bellman class.
Note that in the absence of directional information, you are left with the Eikonal equation, which is a further subset of Hamilton-Jacobi equations. A common temptation of algorithm designers is to attempt to solve the Eikonal equation for DW-MRI problems, because the Eikonal equation is so popular in other directionally isotropic cases. However, when considering solutions to energy functionals based upon directionally dependent DW-MRI data, it is inappropriate to attempt to solve the directionally isotropic Eikonal equation.
Numeric Solutions to Hamilton-Jacobi Equations
First of all, note that Hamilton-Jacobi-Bellman and Eikonal Equations are also Hamilton-Jacobi (HJ) equations. The following contains generic ideas which may be applied to numeric solutions of any class of HJ equations.
Assume that one wishes to find the numerical solution to an HJ equation that yields an optimal open curve connecting two ROIs in a domain. Numerically, there are two standard ways in which these solutions are computed:
- Gradient descent along the solution of a PDE, taking either one of the ROIs as the seed and the other as the target region (see for instance the Live Wire algorithm by Mortensen, et al.).
- Summation of the solutions of two PDEs, where the only difference between the two PDEs is the seed domain -- one PDE for each ROI (see for instance the work by Yezzi and Prince on computing Tissue Thickness in IEEE TMI, Oct 2003).
These two approaches give equivalent results (assuming certain minimal requirements such as symmetry in the underlying metric, etc). Therefore, the choice of approach typically depends upon computational complexity of the added PDE solver versus the gradient descent step.