ITK Registration Optimization

From NAMIC Wiki
Revision as of 20:45, 1 April 2007 by Aylward

Goals

There are two components to this research:

  1. Identify registration algorithms that are suitable for the non-rigid registration problems endemic to NA-MIC.
  2. Develop implementations of those algorithms that take advantage of multi-core and multi-processor hardware.

Algorithmic Requirements and Use Cases

  • Requirements
    1. relatively robust, with few parameters to tweak
    2. runs on grey scale images
    3. has already been published
    4. relatively fast (ideally a few minutes for a volume-to-volume registration)
    5. not patented
    6. can be implemented in ITK and parallelized.

Hardware Platform Requirements and Use Cases

  • Requirements
    1. Shared memory
    2. Single and multi-core machines
    3. Single and multi-processor machines
    4. AMD and Intel - Windows, Linux, and SunOS
  • Use-cases
    1. Intel Core2Duo
    2. Intel quad-core Xeon processors (?)
    3. 6 CPU Sun, Solaris 8 (SPL: vision)
    4. 12 CPU Sun, Solaris 8 (SPL: forest and ocean)
    5. 16 core Opteron (SPL: john, ringo, paul, george)
    6. 16 core, Sun Fire, AMD Opteron (UNC: Styner)

Data

Workplan

Establish testing and reporting infrastructure

  1. Identify timing tools
    1. Cross platform and multi-threaded
    2. Timing and profiling
    • Status
      1. Instrumenting modular tests
        • Extending ITK's cross-platform high precision timer
        • Adding thread affinity to ensure valid timings
        • Adding method for increasing process priority
      2. Profiling complete registration solutions for use cases
        • Using Cachegrind on single and multi-core Linux systems
  2. Develop performance dashboard for collecting results
    1. Each test will report time and accuracy to a central server
    2. The performance of a test, over time, for a given platform can be viewed on one page
    3. The performance of a set of tests, at one point in time, for all platforms can be viewed on one page
    • Status
      1. BatchMake database communication code being isolated
      2. Performance dashboard web pages being designed
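
A minimal sketch of the kind of timing wrapper a modular test could use. It assumes std::chrono::steady_clock as a stand-in for ITK's high precision timer; the names TestResult and TimeTest are hypothetical, and keeping the fastest of several runs is an illustrative choice, not something the workplan specifies. Thread affinity, process priority, and the upload to the dashboard are platform-specific and are left out.

```cpp
#include <cassert>
#include <chrono>
#include <functional>

// Hypothetical record of the two values a test would report.
struct TestResult {
  double seconds;  // wall-clock time of the fastest run
  double error;    // 0 = no error, 1 = 100% error
};

// Runs `test` `repeats` times and keeps the fastest run, which reduces the
// influence of scheduler noise. `test` returns its own error measure.
inline TestResult TimeTest(const std::function<double()>& test, int repeats = 3) {
  assert(repeats > 0);
  using Clock = std::chrono::steady_clock;  // monotonic, unaffected by clock changes
  TestResult best{0.0, 0.0};
  for (int i = 0; i < repeats; ++i) {
    const auto start = Clock::now();
    const double error = test();
    const std::chrono::duration<double> elapsed = Clock::now() - start;
    if (i == 0 || elapsed.count() < best.seconds) {
      best = {elapsed.count(), error};
    }
  }
  return best;
}
```

A test would wrap its workload in a callable, pass it to TimeTest, and send the returned seconds and error fields to the central server.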

Develop tests

  1. Develop modular tests
    • Status
      1. Developed itkCheckerboardImageSource so no IO required
      2. Developing tests as listed in the "Modular Tests" section below
  2. Develop C-style tests
    1. Tests should represent the non-ITK way of doing image analysis
      • Use standard C/C++ arrays and pointers to access blocks of memory as images
  3. Develop complete registration solutions for use cases
    • Status
      1. Centralized data and provide easy access
      2. Identified relevant registration algorithms
        • rigid, affine, bspline, multi-level bspline, and Demons
        • normalized mutual information, mean squared difference, and cross correlation
      3. Developing traditional ITK-style implementations
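
To illustrate what a C-style test might look like, the sketch below fills a raw buffer with a synthetic checkerboard (so no IO is needed) and computes a mean squared difference through plain pointers. The function names MakeCheckerboard and MeanSquaredDifference are hypothetical and do not mirror itkCheckerboardImageSource or any ITK metric; the 0/1 intensities and the cell-size parameter are assumptions made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Fills a dim^3 buffer with a 3-D checkerboard of 0/1 values whose cells are
// `cell` voxels wide. Stand-in for a synthetic image source (no file IO).
inline std::vector<float> MakeCheckerboard(std::size_t dim, std::size_t cell) {
  assert(dim > 0 && cell > 0);
  std::vector<float> image(dim * dim * dim);
  float* out = image.data();
  for (std::size_t z = 0; z < dim; ++z)
    for (std::size_t y = 0; y < dim; ++y)
      for (std::size_t x = 0; x < dim; ++x)
        *out++ = static_cast<float>((x / cell + y / cell + z / cell) % 2);
  return image;
}

// Mean squared difference between two buffers of n voxels, via raw pointers.
inline double MeanSquaredDifference(const float* a, const float* b, std::size_t n) {
  assert(n > 0);
  double sum = 0.0;
  for (const float* end = a + n; a != end; ++a, ++b) {
    const double d = static_cast<double>(*a) - static_cast<double>(*b);
    sum += d * d;
  }
  return sum / static_cast<double>(n);
}
```

Timing this loop against the equivalent ITK-style implementation on the same buffers would give a baseline for how much overhead the ITK abstractions add.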

Compute performance on target platforms

  • Ongoing

Optimize bottlenecks

  • Target bottlenecks
    • Use random, sub-sampling iterator in mean squared difference and cross correlation
    • Multi-thread metric calculation
    • Integrate metrics with transforms and interpolators for tailored performance
    1. MattesMutualInformationImageToImageMetric
Time in self  Time in subfuncs  Function
        0.00             60.54  __tmainCRTStartup
        0.00             34.04  main
        0.00             21.39  itk::CheckerBoardImageSource<itk::Image<float,3> >::GenerateData
       16.52             16.57  floor ?
        0.00             13.55  itk::OptMattesMutualInformationImageToImageMetric<itk::Image<float,3>,itk::Image<float,3> >::GetDerivative
        0.00             12.95  itk::ImageSource<itk::Image<float,3> >::ThreaderCallback
        0.00             12.95  itk::MattesMutualInformationImageToImageMetric<itk::Image<float,3>,itk::Image<float,3> >::GetDerivative
        0.30             11.45  itk::CentralDifferenceImageFunction<itk::Image<float,3>,double>::Evaluate
        8.71              8.73  itk::CentralDifferenceImageFunction<itk::Image<float,3>,double>::EvaluateAtIndex
        2.70              8.43  itk::OptMattesMutualInformationImageToImageMetric<itk::Image<float,3>,itk::Image<float,3> >::GetValueAndDerivative
        7.51              7.53  itk::BSplineKernelFunction<3>::Evaluate
        3.30              7.53  itk::MattesMutualInformationImageToImageMetric<itk::Image<float,3>,itk::Image<float,3> >::GetValueAndDerivative
        0.00              6.63  endthreadex ?
        6.61              6.63  itk::StatisticsImageFilter<itk::Image<float,3> >::ThreadedGenerateData
        3.30              4.82  itk::CheckerBoardSpatialFunction<double,3,itk::Point<double,3> >::Evaluate
        4.50              4.52  itk::OptMattesMutualInformationImageToImageMetric<itk::Image<float,3>,itk::Image<float,3> >::ComputePDFDerivatives
        3.90              3.92  itk::NearestNeighborInterpolateImageFunction<itk::Image<float,3>,double>::EvaluateAtContinuousIndex
        3.60              3.61  _ftol2_pentium4
        3.60              3.61  itk::BSplineKernelFunction<2>::Evaluate [1]
        1.80              3.01  itk::OptMattesMutualInformationImageToImageMetric<itk::Image<float,3>,itk::Image<float,3> >::GetValue
        0.90              2.41  itk::MattesMutualInformationImageToImageMetric<itk::Image<float,3>,itk::Image<float,3> >::GetValue
        2.40              2.41  itk::ProgressReporter::CompletedPixel
        2.40              2.41  itk::ShiftScaleImageFilter<itk::Image<float,3>,itk::Image<float,3> >::ThreadedGenerateData
        2.10              2.11  itk::ImageFunction<itk::Image<float,3>,double,double>::IsInsideBuffer
        1.80              1.81  itk::BSplineDerivativeKernelFunction<3>::Evaluate
        1.20              1.81  itk::ImageFunction<itk::Image<float,3>,itk::CovariantVector<double,3>,double>::ConvertContinuousIndexToNearestIndex
        1.50              1.51  itk::BSplineKernelFunction<2>::Evaluate
        0.00              1.51  thunk@40316b ?
        1.20              1.20  itk::InterpolateImageFunction<itk::Image<float,3>,double>::Evaluate
        0.90              1.20  itk::MattesMutualInformationImageToImageMetric<itk::Image<float,3>,itk::Image<float,3> >::Initialize
        0.60              1.20  itk::OptMattesMutualInformationImageToImageMetric<itk::Image<float,3>,itk::Image<float,3> >::Initialize
        0.90              0.90  itk::ImageBase<3>::GetSpacing
        0.90              0.90  itk::ImageFunction<itk::Image<float,3>,double,double>::ConvertContinuousIndexToNearestIndex
        0.90              0.90  itk::MattesMutualInformationImageToImageMetric<itk::Image<float,3>,itk::Image<float,3> >::ComputePDFDerivatives
        0.90              0.90  itk::MattesMutualInformationImageToImageMetric<itk::Image<float,3>,itk::Image<float,3> >::TransformPoint
        0.90              0.90  itk::OptMattesMutualInformationImageToImageMetric<itk::Image<float,3>,itk::Image<float,3> >::TransformPoint
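
One of the target bottlenecks listed above, random sub-sampling in the mean squared difference metric, can be sketched in plain C++ as follows. This is an illustration under stated assumptions, not the planned ITK iterator: RandomSampleIndices and SampledMeanSquaredDifference are hypothetical names, sampling is uniform with replacement, and the images are assumed to be already resampled onto the same grid so that a voxel index addresses corresponding points in both buffers.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// Draws `numSamples` voxel indices uniformly at random (with replacement)
// from [0, numVoxels). A fixed seed keeps repeated timing runs comparable.
inline std::vector<std::size_t> RandomSampleIndices(std::size_t numVoxels,
                                                    std::size_t numSamples,
                                                    unsigned seed) {
  assert(numVoxels > 0);
  std::mt19937 rng(seed);
  std::uniform_int_distribution<std::size_t> pick(0, numVoxels - 1);
  std::vector<std::size_t> indices(numSamples);
  for (std::size_t& index : indices) index = pick(rng);
  return indices;
}

// Mean squared difference evaluated only at the sampled voxels, so the cost
// scales with the number of samples rather than with the image size.
inline double SampledMeanSquaredDifference(const float* fixed,
                                           const float* moving,
                                           const std::vector<std::size_t>& samples) {
  assert(!samples.empty());
  double sum = 0.0;
  for (std::size_t index : samples) {
    const double d = static_cast<double>(fixed[index]) -
                     static_cast<double>(moving[index]);
    sum += d * d;
  }
  return sum / static_cast<double>(samples.size());
}
```

The sampled value is an estimate of the full-image metric; the number of samples trades accuracy against run time, which is the kind of trade-off the time and error values reported by each test are meant to expose.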


Modular tests

All tests send two values to the performance dashboard:

  • the time required
  • a measure of the error (0 = no error; 1 = 100% error)

Tests being developed and their parameter spaces

  1. LinearInterpTest <numThreads> <dimSize> <factor> [<outputImage>]
    • NumThreads = 1, 2, 4, and #OfCoresIf>4
    • DimSize = 100, 200 (i.e., 100^3 and 200^3 images)
    • Factor = 2, 3 (i.e., producing up to 600^3 images)
    • = 16 tests (approx time on Core2Duo for these tests = 1 minute)
  2. BSplineInterpTest <numThreads> <dimSize> <factor> <bSplineOrder> [<outputImage>]
    • NumThreads = 1, 2, 4, and #OfCoresIf>4 (for every platform)
    • DimSize = 100, 200 (i.e., 100^3 and 200^3 images)
    • Factor = 2, 3 (i.e., producing up to 600^3 images)
    • bSplineOrder = 3
    • = 16 tests (approx time on Core2Duo for these tests = 10 minutes)
  3. SincInterpTest <numThreads> <dimSize> <factor> [<outputImage>]
    • Uses the Welch window function
    • NumThreads = 1, 2, 4, and #OfCoresIf>4 (for every platform)
    • DimSize = 100, 200 (i.e., 100^3 and 200^3 images)
    • Factor = 2, 3 (i.e., producing up to 600^3 images)
    • = 16 tests (approx time on Core2Duo for these tests = 30 minutes)
  4. BSplineTransformLinearInterpTest <numThreads> <dimSize> <numNodesPerDim> <bSplineOrder> [<outputImage>]
    • 3 nodes are also added outside of the image for interpolation
  5. MeanReciprocalSquaredDifferenceMetricTest
  6. MeanSquaresMetricTest
  7. NormalizedCorrelationMetricTest
  8. GradientDifferenceMetricTest
  9. MattesMutualInformationMetricTest
  10. MutualInformationMetricTest
  11. NormalizedMutualInformationMetricTest
  12. MutualInformationHistogramMetricTest
  13. NormalizedMutualInformationHistogramMetricTest
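
The parameter spaces listed for the interpolation tests can be enumerated mechanically. The sketch below assumes one reading of "#OfCoresIf>4": thread counts 1, 2, and 4 are always used, and the core count is added only when the machine has more than 4 cores. InterpTestParams, ThreadCounts, and EnumerateInterpTests are hypothetical names; the core count is passed in explicitly (it could come from std::thread::hardware_concurrency()).

```cpp
#include <cassert>
#include <vector>

// One point in the parameter space of an interpolation test (hypothetical).
struct InterpTestParams {
  unsigned numThreads;
  unsigned dimSize;  // input image is dimSize^3 voxels
  unsigned factor;   // output image is (dimSize * factor)^3 voxels
};

// Thread counts 1, 2, 4, plus the core count when there are more than 4 cores.
inline std::vector<unsigned> ThreadCounts(unsigned cores) {
  std::vector<unsigned> counts = {1, 2, 4};
  if (cores > 4) counts.push_back(cores);
  return counts;
}

// Cartesian product of thread counts, image sizes, and upsampling factors.
inline std::vector<InterpTestParams> EnumerateInterpTests(unsigned cores) {
  std::vector<InterpTestParams> runs;
  for (unsigned threads : ThreadCounts(cores))
    for (unsigned dim : {100u, 200u})
      for (unsigned factor : {2u, 3u})
        runs.push_back({threads, dim, factor});
  return runs;
}
```

Under this reading, a machine with more than 4 cores yields 4 x 2 x 2 = 16 runs per test, matching the counts above, while a machine with 4 or fewer cores yields 12.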

Notes

  • MattesMutualInformationMetric defaults to a BSpline interpolator; the tests above override this and use nearest-neighbor interpolation instead

Related Pages

Performance Measurement