Revision as of 19:43, 5 January 2008

ITK Registration Optimization

Summary

Goals

There are two components to this research:

Identify registration algorithms that are suitable for non-rigid registration problems that are endemic to NA-MIC
Develop implementations of those algorithms that take advantage of multi-core and multi-processor hardware

Steps involved

Modify ITK's registration framework to support oriented images
Modify ITK's registration framework to be thread safe
Develop multi-threaded versions of select registration modules
Make everything backward compatible with ITK's existing registration methods and framework
Deliver in ITK

Target date for these deliverables: Jan 1, 2008

Planned follow-on work

Implement helper-classes in ITK that combine common and effective registration components
- e.g., b-spline deformable registration using the LBFGSB optimizer and the Mattes MI metric
Deliver b-spline deformable registration using the LBFGSB optimizer and the Mattes MI metric as a multi-threaded Slicer module

Target date for the follow-on work: Jan 15, 2008

Status and News

Have developed multi-threaded registration metrics in ITK
- Led to the discovery that ITK's registration framework was not thread safe.
- Making ITK's registration framework thread safe is conceptually a bug fix for ITK.
- The incomplete implementation of oriented images in ITK has greatly extended the time and effort needed for this project.
  - Fixing this must be done in a manner that maintains ITK's backward compatibility.
  - This is a major effort involving approximately 50,000 lines of new code and over 400 new tests in ITK.
  - We have chosen to spend the time to integrate with ITK because it will serve the broader community, it will benefit from the support of the broader community, it will avoid having to incorporate another SVN checkout into Slicer's build process, and it will keep us from having to maintain and monitor separate dashboards for this effort.
Weekly tcons, Monday, 10am
- Luis Ibanez, Matt Turek, Stephen Aylward
Active proposal to the ITK community:
- http://www.itk.org/Wiki/Proposals:Oriented_Image_Registration
Project plan
- BWH Neuroimaging Analysis Center (NAC), 2007-2008: Grid Enabled ITK
Insight Journal (IJ) article on oriented images and registration in ITK
- http://www.insight-journal.org/dspace/bitstream/1926/1293/2/Brooks_Arbel_FastOrientedImage_V1.pdf
- The solution presented by the authors is closely related to the changes being made in ITK

Publications

Aylward, Stephen; Jomier, Julien; Barre, Sebastien; Davis, Brad; Ibanez, Luis, "Optimizing ITK’s Registration Methods for Multi-processor, Shared-Memory Systems." MICCAI Open Source and Open Data Workshop, 2007.

Quick Links

Algorithmic Requirements and Use Cases

Requirements
1. relatively robust, with few parameters to tweak
2. runs on grey scale images
3. has already been published
4. relatively fast (ideally a few minutes for volume-to-volume registration)
5. not patented
6. can be implemented in ITK and parallelized

Use-cases
1. Intersubject mapping
  - Example data set (Kilian)
2. fMRI to hi-res brain morphology mapping
  - Example data set (Steve Pieper)
3. DTI: components of the diffusion tensor
  - Example data (Sylvain)

Hardware Platform Requirements and Use Cases

Requirements
1. Shared memory
2. Single and multi-core machines
3. Single and multi-processor machines
4. AMD and Intel - Windows, Linux, and SunOS

Use-cases
1. Intel Core2Duo
2. Intel quad-core Xeon processors, Visual Studio 8, Windows Vista (Kitware: redwall)
3. 6 CPU Sun, Solaris 8 (SPL: vision)
4. 12 CPU Sun, Solaris 8 (SPL: forest and ocean)
5. 16 core Opteron (SPL: john, ringo, paul, george)
6. 16 core, Sun Fire, AMD Opteron (UNC: Styner)

Data

Now distributed with CVS

Workplan

Establish testing and reporting infrastructure

Identify timing tools
1. Cross platform and multi-threaded
2. Timing and profiling
Develop performance dashboard for collecting results
1. Each test will report time and accuracy to a central server
2. The performance of a test, over time, for a given platform can be viewed on one page
3. The performance of a set of tests, at one point in time, for all platforms can be viewed on one page
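The two dashboard views described above can be sketched as queries over a flat list of test reports. This is only an illustration under stated assumptions: the `PerformanceRecord` fields and the functions `HistoryForTest` and `SnapshotAt` are hypothetical names, not part of ITK or of the actual dashboard, and dates are assumed to be ISO strings (YYYY-MM-DD) so that they sort chronologically as text.

```cpp
#include <algorithm>
#include <iterator>
#include <string>
#include <vector>

// Hypothetical report sent by one test run to the central server.
struct PerformanceRecord
{
  std::string test;      // test name
  std::string platform;  // machine or platform name
  std::string date;      // ISO date, e.g. "2008-01-05" (assumption)
  double      seconds;   // measured run time
  double      accuracy;  // test-specific accuracy measure
};

// View 1: one test on one platform, ordered over time.
std::vector<PerformanceRecord> HistoryForTest(const std::vector<PerformanceRecord>& all,
                                              const std::string& test,
                                              const std::string& platform)
{
  std::vector<PerformanceRecord> out;
  std::copy_if(all.begin(), all.end(), std::back_inserter(out),
               [&](const PerformanceRecord& r) { return r.test == test && r.platform == platform; });
  std::sort(out.begin(), out.end(),
            [](const PerformanceRecord& a, const PerformanceRecord& b) { return a.date < b.date; });
  return out;
}

// View 2: all tests on all platforms at one point in time,
// ordered by test name and then by platform.
std::vector<PerformanceRecord> SnapshotAt(const std::vector<PerformanceRecord>& all,
                                          const std::string& date)
{
  std::vector<PerformanceRecord> out;
  std::copy_if(all.begin(), all.end(), std::back_inserter(out),
               [&](const PerformanceRecord& r) { return r.date == date; });
  std::sort(out.begin(), out.end(), [](const PerformanceRecord& a, const PerformanceRecord& b) {
    return a.test != b.test ? a.test < b.test : a.platform < b.platform;
  });
  return out;
}
```

Keeping each report as a self-contained row means both views are simple filters over the same data, so no per-view storage is needed.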

Develop tests

Develop modular tests
Develop complete registration solutions for use cases

ITK Optimization

Target bottlenecks
- Multi-thread metric calculation
  - Initial target is MattesMutualInformationImageToImageMetric
- Optimize code
  - Trade some memory and some initialization time for faster per-iteration operation
  - Call multi-threaded functions when possible
Integrate metrics with transforms and interpolators for tailored performance

Example Results: MattesMutualInformationImageToImageMetric

Example of Optimizations Employed

GetValue
- Added multi-threading to the GetValue function
  - The samples are partitioned across threads, which distributes the computation of the transforms and interpolations.
  - Added pre-computation of the FixedImageMarginalPDF for the samples to reduce the need for the thread mutex lock
    - This required the concept of an AdjustedFixedImageMarginalPDF, which is updated when a fixed image voxel does not map into the moving image and therefore is not valid for the current computation. Because the update happens only when samples are missed, the mutex lock that guards this cross-thread data structure is needed less often.
  - Each thread now has its own copy of the jointPDF. After the threads complete, the jointPDFs from each thread are summed. This eliminates the mutex from the main loop over samples.
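The pattern described above (partitioned samples, one private joint histogram per thread, a mutex only for missed samples) can be sketched in standalone C++. This is not the ITK implementation: `Sample`, `HistogramResult`, and `ComputeJointHistogram` are hypothetical names, bin indices are assumed to be precomputed, and plain counts stand in for the Parzen-windowed PDFs that the Mattes metric uses.

```cpp
#include <algorithm>
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical sample: a negative movingBin marks a fixed-image sample
// that does not map into the moving image.
struct Sample
{
  int fixedBin;
  int movingBin;
};

struct HistogramResult
{
  std::vector<double> joint;                  // numBins x numBins, row = fixed bin
  std::vector<double> adjustedFixedMarginal;  // fixed marginal minus missed samples
};

HistogramResult ComputeJointHistogram(const std::vector<Sample>& samples,
                                      int numBins, unsigned int numThreads)
{
  numThreads = std::max(1u, numThreads);
  const std::size_t histSize = static_cast<std::size_t>(numBins) * numBins;

  // Pre-compute the fixed marginal over all samples (done once, single-threaded).
  HistogramResult result;
  result.adjustedFixedMarginal.assign(numBins, 0.0);
  for (const Sample& s : samples) result.adjustedFixedMarginal[s.fixedBin] += 1.0;

  // One private joint histogram per thread: no lock in the common path.
  std::vector<std::vector<double>> perThread(numThreads, std::vector<double>(histSize, 0.0));
  std::mutex marginalMutex;  // taken only when a sample is missed

  const std::size_t chunk = (samples.size() + numThreads - 1) / numThreads;
  std::vector<std::thread> workers;
  for (unsigned int t = 0; t < numThreads; ++t)
  {
    const std::size_t begin = std::min(samples.size(), t * chunk);
    const std::size_t end = std::min(samples.size(), begin + chunk);
    workers.emplace_back([&, t, begin, end]() {
      std::vector<double>& local = perThread[t];
      for (std::size_t i = begin; i < end; ++i)
      {
        const Sample& s = samples[i];
        if (s.movingBin < 0)
        {
          // Rare path: remove the missed sample from the shared marginal.
          std::lock_guard<std::mutex> lock(marginalMutex);
          result.adjustedFixedMarginal[s.fixedBin] -= 1.0;
          continue;
        }
        local[static_cast<std::size_t>(s.fixedBin) * numBins + s.movingBin] += 1.0;
      }
    });
  }
  for (std::thread& w : workers) w.join();

  // Sum the per-thread histograms after all threads have completed.
  result.joint.assign(histSize, 0.0);
  for (const std::vector<double>& local : perThread)
    for (std::size_t k = 0; k < histSize; ++k) result.joint[k] += local[k];
  return result;
}
```

A useful consistency check for this sketch is that each row of the summed joint histogram adds up to the corresponding entry of the adjusted fixed marginal.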

Results

On a dual-core system, computation time is reduced by about 30% when using a linear transform and linear interpolation, and by about 45% when using a b-spline transform and b-spline interpolation.

Performance Testing Results

GetValue Test at Identity Parameters

   // Print out a line with the test information
   std::cout << "GetValue2,";
   std::cout << metric->GetNameOfClass() << "," << interpolator->GetNameOfClass();
   std::cout << "," << transform->GetNameOfClass(); 

   // Make a time probe
   itk::TimeProbe timeProbe;

   // Run at the identity transform parameters.
   unsigned int numIters = 100;
   timeProbe.Start();
   for (unsigned int iter = 0; iter < numIters; iter++)
     {
     value = metric->GetValue( identityParameters );
     }
   timeProbe.Stop();

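   // Print out the number of samples
   std::cout << "," << metric->GetNumberOfPixelsCounted();
   // Print out the mean time per GetValue call. The probe was started and
   // stopped once, so its mean is the total time for all iterations.
   std::cout << "," << timeProbe.GetMeanTime()/numIters << std::endl;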

Preliminary Results

Mattes GetValue Results
- http://www.na-mic.org/Wiki/images/f/fe/MattesGetValue.pdf

Events

Related Pages

Performance Measurement

LTProf - simple profiler for Windows - Shareware
Intel's VTune for Linux ($)
TAU
Threadmon: Thread usage/blockage
TotalView ($)
PerfSuite (POSIX Threads)
GProf work-around for multi-threaded apps
References on multi-threaded profiling and code optimization
