# Engineering:Project:Feature Analysis Framework

From NAMIC Wiki

# Project: Object population description and general feature analysis framework

## Description

This project has two interrelated goals.

- A novel framework for shape analysis in ITK using feature base classes, population descriptions, and pluggable statistical tests. The remainder of this document mainly deals with this part of the project.
- Description of object populations for statistical studies. This includes the development of a set of tools for easy editing and setup of the description representation.

### Notes

- This allows the design of statistical shape analysis filters within the ITK pipeline architecture.
- The data object in this pipeline is a statistical study that encompasses a set of populations. Each population has a set of subjects, and each subject has a variety of variables (age, gender, weight ...) and of possible objects (hippocampi, caudates) measured in many ways (attributed and non-attributed surfaces, difference fields, images ...).
- The possible measurements (or statistical feature sets) cover a wide range. Each is itself a DataObject.
- There is no single ITK factory covering the range of the features/objects
- The processing of the studyObjects is unique in ITK, as each study is merely a container of information about labeling of objects into populations, as well as information about the availability of object types. Running a statistical filter can attach computed information at the study, population or the object level.
- Right now, all features need to be linear, i.e. in Euclidean space. In case of non-Euclidean features (such as radii, magnitudes, angles) the features can first be linearized.
- The design of the objects, features, and shape analysis filters should leave room for more advanced nonlinear analysis in the future. This includes frameworks where the shape representations are manifolds or where the only structure needed is a distance metric between shapes.
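The container hierarchy described in these notes can be sketched as follows. This is only an illustration in Python, not the ITK design: the class names `Subject`, `Population`, `Study`, the `results` slots, the `available` helper, and the file paths are all hypothetical. It shows a study that holds references to data (here plain path strings) rather than the data itself, with a place to attach computed information at the subject, population, and study level.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class Subject:
    subject_id: str
    # Per-subject variables such as age, gender, weight.
    variables: Dict[str, Any] = field(default_factory=dict)
    # object name -> measurement identifier -> reference to the data
    # (a path string here; the study never loads the data itself).
    objects: Dict[str, Dict[str, str]] = field(default_factory=dict)
    # Information attached by filters at the subject/object level.
    results: Dict[str, Any] = field(default_factory=dict)


@dataclass
class Population:
    name: str
    subjects: List[Subject] = field(default_factory=list)
    # Information attached by filters at the population level.
    results: Dict[str, Any] = field(default_factory=dict)


@dataclass
class Study:
    populations: Dict[str, Population] = field(default_factory=dict)
    # Information attached by filters at the study level.
    results: Dict[str, Any] = field(default_factory=dict)

    def available(self, object_name: str, measurement_id: str) -> bool:
        """True if every subject references this object/measurement."""
        return all(
            measurement_id in subject.objects.get(object_name, {})
            for population in self.populations.values()
            for subject in population.subjects
        )


# Hypothetical study with two populations; paths are placeholders.
study = Study(populations={
    "A": Population("A", [
        Subject("a01", {"age": 34},
                {"Hippocampus": {"ObjectRigidAlignSurfRef": "a01/hippo_rigid.vtk"}}),
    ]),
    "B": Population("B", [
        Subject("b01", {"age": 41},
                {"Hippocampus": {"ObjectRigidAlignSurfRef": "b01/hippo_rigid.vtk"}}),
    ]),
})
```

A filter could call `study.available("Hippocampus", "ObjectRigidAlignSurfRef")` before running, which is one way to check the availability of object types that the study records.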

### General Structure

- Statistical Filters are subclasses of ProcessObject
  - input is a studyObject with populations and subjects
  - output is the same studyObject modified by adding information regarding the results of the filter
  - the old studyObject still exists, but does not contain the additional information
- Statistical Filter developers need to adhere to a set of given preconditions. These are not enforceable, as the studyObject is only a container of references to the data and does not contain the data itself

- Statistical Studies are subclasses of DataObject
- Some Filters might only work on a single population without the need for a full study. This can be achieved through a population selection mechanism
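The input/output contract above can be sketched as a small base class. This is an illustration in Python rather than ITK code, and every name in it (`StatisticalFilter`, `set_input`, `select_populations`, `update`, `generate_data`, `SubjectCountStat`) is hypothetical. The study is modeled as a plain dictionary of references, so copying it is cheap; the filter returns a modified copy and leaves the input study without the additional information.

```python
import copy


class StatisticalFilter:
    """Hypothetical base class: study in, annotated copy of the study out."""

    def __init__(self):
        self._study = None
        self._selection = None  # None means "use all populations"

    def set_input(self, study):
        self._study = study

    def select_populations(self, names):
        # Population selection for filters that need only part of a study.
        self._selection = list(names)

    def update(self):
        if self._study is None:
            raise RuntimeError("no input study set")
        # Work on a copy so the old study stays free of the new results.
        output = copy.deepcopy(self._study)
        if self._selection is None:
            names = list(output["populations"])
        else:
            names = self._selection
        missing = [n for n in names if n not in output["populations"]]
        if missing:
            raise KeyError(f"unknown populations: {missing}")
        self.generate_data(output, names)
        return output

    def generate_data(self, study, population_names):
        raise NotImplementedError


class SubjectCountStat(StatisticalFilter):
    """Toy filter: attaches the subject count at the population level."""

    def generate_data(self, study, population_names):
        for name in population_names:
            population = study["populations"][name]
            population["results"]["subject_count"] = len(population["subjects"])


study = {
    "populations": {
        "A": {"subjects": ["a1", "a2", "a3"], "results": {}},
        "B": {"subjects": ["b1", "b2"], "results": {}},
    },
    "results": {},
}

count_filter = SubjectCountStat()
count_filter.set_input(study)
count_filter.select_populations(["A"])  # single-population use
result = count_filter.update()
```

Note that the preconditions mentioned above (for example, that the referenced data exists and has the expected type) are not checked here; in this sketch only the population names are validated.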

### Detailed discussion pages

- Discussion of statistical studies specs
- Discussion of Statistical Objects specs
- Discussion of Statistical Filters specs

### Examples

Simple example: Difference of means field

Computation of the difference vector field from the mean surface of population A to the mean surface of population B:

- Filter DiffMeanStat: first computes the mean surface of each population, then the Euclidean difference between the surfaces
- Input to DiffMeanStat: study with 2 populations, plus information identifying populations A and B (the filter is not symmetric in A and B)
- Input to DiffMeanStat: ID of object to compute over (each subject potentially has many different objects), e.g. identifier of a rigidly aligned surface: Hippocampus, ObjectRigidAlignSurfRef
- Output of DiffMeanStat: study with mean surfaces for each population stored at the population level, and the surface of population A attributed with the difference vector at each surface vertex at the study level
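A minimal sketch of the DiffMeanStat computation, in Python rather than ITK. It assumes that all surfaces have the same number of vertices in point-to-point correspondence, and, to stay self-contained, it stores vertex lists directly in the study instead of references. The function names, dictionary layout, and result keys are hypothetical.

```python
import copy


def mean_surface(surfaces):
    """Per-vertex average of surfaces given as lists of (x, y, z) tuples."""
    n_vertices = len(surfaces[0])
    if any(len(s) != n_vertices for s in surfaces):
        raise ValueError("surfaces must have corresponding vertices")
    n = len(surfaces)
    return [
        tuple(sum(s[v][k] for s in surfaces) / n for k in range(3))
        for v in range(n_vertices)
    ]


def diff_mean_stat(study, pop_a, pop_b, object_id):
    """Return a copy of the study annotated with mean surfaces and B - A vectors."""
    out = copy.deepcopy(study)
    means = {}
    for name in (pop_a, pop_b):
        population = out["populations"][name]
        surfaces = [subj["objects"][object_id] for subj in population["subjects"]]
        means[name] = mean_surface(surfaces)
        population["results"]["mean_surface"] = means[name]  # population level
    # Vector from the mean of A to the mean of B at each vertex (not symmetric).
    vectors = [
        tuple(b - a for a, b in zip(va, vb))
        for va, vb in zip(means[pop_a], means[pop_b])
    ]
    out["results"]["diff_mean"] = {"surface": means[pop_a], "vectors": vectors}
    return out


OBJECT_ID = ("Hippocampus", "ObjectRigidAlignSurfRef")
study = {
    "populations": {
        "A": {"results": {}, "subjects": [
            {"objects": {OBJECT_ID: [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]}},
            {"objects": {OBJECT_ID: [(0.0, 2.0, 0.0), (1.0, 2.0, 0.0)]}},
        ]},
        "B": {"results": {}, "subjects": [
            {"objects": {OBJECT_ID: [(1.0, 1.0, 1.0), (2.0, 1.0, 1.0)]}},
            {"objects": {OBJECT_ID: [(1.0, 3.0, 1.0), (2.0, 3.0, 1.0)]}},
        ]},
    },
    "results": {},
}

result = diff_mean_stat(study, "A", "B", OBJECT_ID)
```

Swapping the two population arguments flips the sign of every difference vector, which is why the filter needs to be told which population is A and which is B.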

More complex example: Significance map computation

Computation of Hotelling T^2 metric based statistical non-parametric significance surface map of 2 populations:

- Filter HotellingT2SnPMStat: computes significance maps based on permutation tests and non-parametric statistics of the Hotelling T^2 metric
- Input to HotellingT2SnPMStat: study with 2 populations
- Input to HotellingT2SnPMStat: ID of object to compute over, e.g. identifier of a rigidly aligned, scaling normalized surface: Caudate, ObjectScaleAlignSurfRef
- Output of HotellingT2SnPMStat: study with mean surfaces for each population stored at the population level, and separate statistical surface maps for T^2 metric, significance p-value (raw and corrected for multiple comparisons), effect-size, and r^2 metric at the study level
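The core of such a test at a single surface vertex can be sketched as below. This is a Python illustration under stated assumptions, not the HotellingT2SnPMStat implementation: it uses the standard two-sample Hotelling T^2 statistic with a pooled covariance matrix, estimates a raw p-value by randomly permuting group labels, and omits the correction for multiple comparisons, effect size, and r^2. The function names and the sample data are hypothetical; a surface map would repeat this at every vertex.

```python
import random


def mean_vec(rows):
    n = len(rows)
    return [sum(r[k] for r in rows) / n for k in range(len(rows[0]))]


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for k in range(c, n + 1):
                m[r][k] -= f * m[c][k]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(m[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def hotelling_t2(a, b):
    """Two-sample Hotelling T^2 with pooled covariance (assumed form)."""
    na, nb = len(a), len(b)
    ma, mb = mean_vec(a), mean_vec(b)
    d = len(ma)
    pooled = [[0.0] * d for _ in range(d)]
    for rows, mu in ((a, ma), (b, mb)):
        for r in rows:
            for i in range(d):
                for j in range(d):
                    pooled[i][j] += (r[i] - mu[i]) * (r[j] - mu[j])
    for i in range(d):
        for j in range(d):
            pooled[i][j] /= na + nb - 2
    diff = [x - y for x, y in zip(ma, mb)]
    w = solve(pooled, diff)  # pooled^{-1} * diff without forming the inverse
    return (na * nb) / (na + nb) * sum(x * y for x, y in zip(diff, w))


def permutation_p_value(a, b, n_perm=999, seed=0):
    """Raw p-value: share of label permutations with T^2 >= observed."""
    rng = random.Random(seed)
    observed = hotelling_t2(a, b)
    combined = list(a) + list(b)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(combined)
        if hotelling_t2(combined[:len(a)], combined[len(a):]) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


# Hypothetical 2D features of one vertex for two clearly separated groups.
group_a = [(0.0, 0.1), (0.2, -0.1), (-0.1, 0.3), (0.3, 0.2), (-0.2, -0.2)]
group_b = [(5.0, 5.1), (5.2, 4.8), (4.9, 5.3), (5.3, 5.2), (4.7, 4.9)]
p_raw = permutation_p_value(group_a, group_b)
```

The sketch assumes the pooled covariance is non-singular for every permutation, which requires enough subjects relative to the feature dimension.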

Other complex example: Classification task

Classification of a study using, e.g., Support Vector Machines (SVM), based on a classifier trained on a second study:

- Filter SVNClassificationStat: performs the SVM classification
- Input: training study (remains unchanged) and testing study (gets classified)
- Input: ID of object to compute over: Cingulate, ObjectOrigDTIFiberRef
- Output: classified study with information at study level: classification rates (when ground truth is provided); at population level: (optionally) N new populations for the N classification classes; at subject level: the classification of each subject
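The input/output behavior of such a classification filter can be sketched as follows. To keep the example short and dependency-free, a nearest-mean classifier stands in for the Support Vector Machine, so this is not an SVM and not the SVNClassificationStat implementation. The function names, dictionary layout, result keys, `truth` field, and feature vectors are hypothetical.

```python
import copy
import math


def centroid(vectors):
    n = len(vectors)
    return tuple(sum(v[k] for v in vectors) / n for k in range(len(vectors[0])))


def classify_study(training, testing, object_id):
    """Return a classified copy of `testing`; both inputs stay unchanged."""
    # "Training": one centroid per training population (stand-in for an SVM).
    centroids = {
        name: centroid([s["objects"][object_id] for s in pop["subjects"]])
        for name, pop in training["populations"].items()
    }
    out = copy.deepcopy(testing)
    by_class = {name: [] for name in centroids}
    correct = total = 0
    for pop in out["populations"].values():
        for subj in pop["subjects"]:
            x = subj["objects"][object_id]
            label = min(centroids, key=lambda c: math.dist(x, centroids[c]))
            subj["results"]["class"] = label  # subject level
            by_class[label].append(subj["id"])
            if "truth" in subj:  # ground truth is optional
                total += 1
                correct += label == subj["truth"]
    # Population level: one new population per classification class.
    for name, ids in by_class.items():
        out["populations"]["classified_" + name] = {"subjects": ids, "results": {}}
    # Study level: classification rate, only when ground truth was provided.
    if total:
        out["results"]["classification_rate"] = correct / total
    return out


OBJECT_ID = ("Cingulate", "ObjectOrigDTIFiberRef")


def subject(sid, features, truth=None):
    s = {"id": sid, "objects": {OBJECT_ID: features}, "results": {}}
    if truth is not None:
        s["truth"] = truth
    return s


training = {"results": {}, "populations": {
    "control": {"results": {}, "subjects": [subject("c1", (0.0, 0.0)), subject("c2", (0.0, 1.0))]},
    "patient": {"results": {}, "subjects": [subject("p1", (5.0, 5.0)), subject("p2", (5.0, 6.0))]},
}}
testing = {"results": {}, "populations": {
    "unlabeled": {"results": {}, "subjects": [
        subject("t1", (0.2, 0.4), "control"),
        subject("t2", (4.8, 5.9), "patient"),
        subject("t3", (1.0, 1.0), "patient"),
    ]},
}}

classified = classify_study(training, testing, OBJECT_ID)
```

Replacing the centroid step with a trained SVM would change only the labeling rule; where the results are attached at the subject, population, and study level would stay the same.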

## Members

- Martin Styner (UNC-CH)
- Tom Fletcher (Utah)
- Jim Miller (GE Research)

## Keywords

Shape analysis, Statistics