18

Nonrigid Image Registration Using Groupwise Methods

Many medical tasks involve the analysis of a deformable object or a deformable structure within an object; examples include the study of variation, classification, atlas look- up, anomaly detection, and segmentation (the partitioning of a digital image into two or more regions). Tasks like these are often performed by specialists (e.g., radiologists) who rely on binocular vision and human judgment to convert raw data into a clinically useful form. This is often a subjective, error- prone and time- consuming process. Recent advances in machine vision make it possible to automate some of the image analysis and interpretation tasks. Automatically establishing dense correspondences between samples of a deformable object (Figure 18.1), otherwise known as nonrigid registration in the literature, has attracted significant interest in the research area of medical image analysis.1

In this chapter, we will focus on the nonrigid registration as it is one of the most fundamental problems in computer vision and also has importance in a wide range of clinical applications. Given a set of samples of a deformable object, it is only when correspondences between the samples are established that any form of meaningful analysis of the whole set can be performed.

First, we will consider an important special case of the registration problem – establishing correspondences between elements of a set of images (e.g., photographs, radiographs, depth maps measured by range scanners, computed tomography slices, or magnetic resonance imaging volumetric data).

Applications of image registration methods are vast, so it is unlikely that generic methods can be developed that are fit for purpose for a wide range of applications.

Rigid versus nonrigid transformations

By convention, image registration methods are often classified according to their transformation model into “rigid” (translation, rotation, and scaling) and “nonrigid” (complex deformations). Rigid methods aim to find the optimal affine transformations of the images, with respect to a common coordinate frame, that bring the images into correspondence. In other words, these methods assume only a global (entire image) deformation model. Nonrigid methods operate under a less restrictive assumption, and the images are allowed to undergo more general and complex deformations, including local (subsections of the image) warping. This is often the case with medical imaging due to the complex mechanical deformations within and between subjects.

Nonrigid methods are further subclassified according to their transformation model. The most common type of nonrigid transformation models are piecewise linear, spline- based and fluid-based.^{2–5} Transformations between images need not be spatial. For example, Learned – Miller considered a problem of groupwise registration (in his paper termed “congealing”) but applied this to the optimization of spatial and also nonspatial modes of deformation (e.g., brightness).6

Groupwise versus pairwise

There is an important distinction between registering a *pair* and a (large) *set* of images. One might naï vely assume that, given an algorithm to establish correspondence between a *pair* of images, correspondences between *multiple* images could be found by the repeated application of such a pairwise algorithm. An example of such a naï ve approach would be to select one of the images from the set as a reference and then repeatedly register all other images to this reference. However, this approach frequently used in the past may be *fundamentally* flawed because, at any given time, not all the available information is used, only the information from a pair of images. A survey of pairwise registration methods is found elsewhere.^{7}

Repeated application of a pairwise registration algorithm will inevitably be biased to the choice of reference image, which leads to errors in the final alignment.^{8} In addition, an unfortunate choice of reference, for example if it has missing features or is not representative of the entire set, will degrade the performance of the algorithm further. To combat such issues, recently developed groupwise approaches that consider and bring an entire group of images into registration simultaneously endeavor to use all the available information from all images in the set.^{2,3,8} Here, the information from the *entire* dataset is being utilized at each stage, rather than that from only a pair.

Only from multiple images can the features that are reliable enough to be registered be discriminated from those features that are unimportant (structures that are only transient, or noise) and should be ignored. Early experiments with the groupwise approaches were concerned with finding dense correspondence across a set of unlabeled images and have been shown to be superior compared with pairwise methods.^{3} Despite established superiority, it was not until recently that the practical implementations of groupwise methods were considered. The reason for this is the extra cost (computational effort) with which the superior quality of the groupwise registration comes.

Image registration may be broken down into three subproblems:^{9}

- A mechanism for representing and manipulating dense correspondence between images. This is the model of transformations that the images undergo.
- An objective function,
*F*, measuring the quality of registration given the dense correspondences between samples (i.e., a particular instance of the transformations), with a minimum at a point corresponding to the desired registration. - A global minimization algorithm that optimizes
*F*over the space of all possible parameters describing correspondence between images.

Global minimization of the objective function, *F*, whose arguments are correspondences between all images and whose value measures the quality of registration, solves the problem. However, unlike with pairwise methods, when correspondences between multiple images are considered, the dimensionality of the space in which the search for the optimal solution is performed grows very rapidly with the number of samples in the set when the groupwise approach is employed. In practical problems, very high dimensionality of the search space presents a significant obstacle to finding the optimal solution.^{10} Suppose *N* images are to be registered, and the correspondences between images are controlled by *K* degrees of freedom per image, yielding ^{2} *NK* degrees of freedom in total; even for a modest dataset (hundreds of images) and a modest resolution of deformations (tens of degrees of freedom per image), the dimensionality of the space in which the solution is to be found is measured in thousands.

The problem of efficient optimization of the very high- dimensional objective function in the context of groupwise image registration has not been explored extensively in the literature. Most traditional optimization algorithms used in previous studies cannot deal with an optimization problem of such magnitude and tend to converge to local minima. ^{2,3} Some stochastic algorithms (e.g., simulated annealing and genetic algorithms), which attempt to avoid local minima, have impractical computation times even for small datasets. The main contribution of this chapter is to describe an efficient optimization framework for many – dimensional groupwise objective functions for nonrigid image registration that can quickly and reliably find the best minima.

Our approach alleviates the “curse of dimensionality” by offering:

- a novel solution that implicitly reduces the dimensionality of the search space as the search progresses by incrementally learning optimal deformations;
- a novel application of stochastic optimization algorithms that do not significantly degrade in performance as the dimensionality grows.

In addition, due to the efficient formulation, it is amenable for graphic processing unit (GPU) implementation – apart from the control logic, all steps can be performed on a GPU.

Due to the robustness of our method, we are also able to perform *interperson* groupwise registration: we take a corpus of individual face images and can successfully register them.^{11} This is the first time that the automatic nonrigid registration of data possessing such variety has been reported.

Problem statement: groupwise registration

The input to the registration algorithm is a (possibly unordered) set of *N* images {*I _{i}*,

*i*= 1 …

*N*} of different examples of a deformable object or a deformable structure in an object. We seek to automatically (without user intervention) derive a dense spatial correspondence between the examples. Deformation fields (one for each image) define spatial correspondences between the images, by specifying where each pixel on the underlying object structure is located on that image.

Groupwise registration algorithm

The algorithm is briefly outlined below (full details having been published elsewhere).^{12} Groupwise registration can be considered as an optimization problem. Prior to explaining the optimization regime, the deformation mechanism and the objective function to minimize will be defined first.

Deformation mechanism

The most common ways to represent deformation fields that transform one image into another are fields defined by a sparse set of control points that control either a set of splines,^{5} the nodes of a triangulated mesh,^{3} or compositions of simple warps,^{2} or dense fields representing the displacement of each pixel.^{4} The proposed approach uses both dense deformation fields (to store accumulated deformations) and fields controlled by sparse sets of points (when incrementally improving accumulated deformations). In the latter case, for computational efficiency, we piecewise linearly interpolate the deformation inside the triangles of a mesh obtained by triangulating the set of control points, given the deformations at vertices (control points).^{3} Although this representation has been previously criticized for being insufficiently smooth,^{2} and has limited flexibility and spatial resolution, ^{13} this is no longer a problem with the proposed method, which benefits from its computational efficiency without any sacrifices.

Let D denote a dense deformation field; we store a deformation field in a vector – valued matrix (deformation map, of the same size as the images) whose elements contain the pixel displacements. Thus, given an image *I* and a dense deformation map D, a warped image *I*′ is obtained thus: *I*′(x) = *I*(x + D(x)),∀x, which we will abbreviate to *I*′(x) = *W*(*I*, D); this is illustrated in Figures 18.2, and 18.12, below. Given two (ordered) sets of control points *C*_{1} and *C*_{2} placed correspondingly in images *I*_{1} and *I*_{2}, a dense deformation map from *I*_{1} to *I*_{2} can be obtained by interpolation as described above; we will abbreviate it thusly: D = *L*(*C*_{1}, *C*_{2}).

Incrementally learning optimal deformation

Let there exist a dense set of control points, *C _{D}*, and a sparse subset of

*C*,

_{D}*C*∈

_{S}*C*. Suppose, for ease of explanation, that they are vertices of a triangular mesh describing the deformation of an image (in Figure 18.3a,

_{D}*C*is all points, and

_{D}*C*is points on the solid lines). Let

_{S}*C*

_{D}_{−}

*denote the set of points in*

_{S}*C*not in

_{D}*C*(in Figure 18.3 a, solely on dashed lines). If we could express the optimal position of points in

_{S}*C*

_{D}_{−}

_{ S}*as a function of optimal position of points in*

*C*, this would obviously yield a dimensionality- reducing reparameterization of the deformation; by means of this, we would be able to control more complex deformations with the same number of control points (increased resolution) or control the deformations with a smaller number of control points keeping the same resolution (dimensionality reduction).

_{S}