Difference between revisions of "AHM 2006:ProjectsSlicerDataModel"

From NAMIC Wiki

Revision as of 21:17, 5 November 2008

Project

Designing a Data Centric Model for Slicer 3.

See also the Feature set description (http://www.slicer.org/slicerWiki/index.php/Slicer3:Data_Model#General_References_on_XML).

Open Questions from Programmer's Week Discussions

  • Syntax of the factory for itk (is the extra layer needed?) - Jim Miller
    • Keeping the VTK and ITK factory syntax parallel
  • How can developers add new data types to mrml? - Lauren O'Donnell
    • Current slicer supports the idea of modules having their own data types
    • Implementation is difficult and not well documented.
  • Who will be doing what? (Alex, Xiaodong, Mathieu to allocate time and effort)

Goals

Design and Implement a prototype of a Data Model server for Slicer 3

  1. It should represent a scene graph
  2. It should compute and return Transforms between objects in the scene graph.
  3. It should be suitable for Image Guided Surgery. This may require it to be compatible with Real-Time OS
  4. It should return datasets (image data)
  5. It should return surface models (vtkPolydata?)
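
As an illustration of goals 1 and 2, the following is a minimal sketch of a scene graph that stores one transform per node and computes the transform between any two nodes. It assumes rigid transforms only, and every name in it (Mat4, SceneGraph, RelativeTransform, and so on) is hypothetical rather than part of Slicer, VTK, or MRML.

```cpp
#include <array>
#include <cassert>
#include <map>
#include <string>

// Hypothetical sketch; none of these names are Slicer or MRML API.
using Mat4 = std::array<double, 16>;  // row-major 4x4 homogeneous matrix

inline Mat4 Identity() {
  return {1, 0, 0, 0,  0, 1, 0, 0,  0, 0, 1, 0,  0, 0, 0, 1};
}

inline Mat4 Translation(double x, double y, double z) {
  Mat4 m = Identity();
  m[3] = x; m[7] = y; m[11] = z;
  return m;
}

inline Mat4 Multiply(const Mat4& a, const Mat4& b) {
  Mat4 r{};  // zero-initialized
  for (int i = 0; i < 4; ++i)
    for (int j = 0; j < 4; ++j)
      for (int k = 0; k < 4; ++k)
        r[i * 4 + j] += a[i * 4 + k] * b[k * 4 + j];
  return r;
}

// Inverse of a rigid transform (rotation plus translation, no scaling).
inline Mat4 RigidInverse(const Mat4& m) {
  Mat4 r = Identity();
  for (int i = 0; i < 3; ++i)
    for (int j = 0; j < 3; ++j)
      r[i * 4 + j] = m[j * 4 + i];  // transpose the rotation block
  for (int i = 0; i < 3; ++i)
    r[i * 4 + 3] = -(r[i * 4 + 0] * m[3] + r[i * 4 + 1] * m[7] + r[i * 4 + 2] * m[11]);
  return r;
}

class SceneGraph {
 public:
  // toParent maps node coordinates into the parent's coordinates.
  // An empty parent name marks a top-level node.
  void AddNode(const std::string& name, const std::string& parent, const Mat4& toParent) {
    assert(parent.empty() || nodes_.count(parent) == 1);
    nodes_[name] = Node{parent, toParent};
  }

  // Concatenates transforms from the node up to the root.
  Mat4 ToWorld(const std::string& name) const {
    Mat4 m = Identity();
    std::string current = name;
    while (!current.empty()) {
      const Node& n = nodes_.at(current);
      m = Multiply(n.toParent, m);
      current = n.parent;
    }
    return m;
  }

  // Transform taking coordinates of node "from" into coordinates of node "to".
  Mat4 RelativeTransform(const std::string& from, const std::string& to) const {
    return Multiply(RigidInverse(ToWorld(to)), ToWorld(from));
  }

 private:
  struct Node {
    std::string parent;
    Mat4 toParent;
  };
  std::map<std::string, Node> nodes_;
};
```

For example, with a "tool" node under a "tracker" node and a separate "image" node, RelativeTransform("tool", "image") gives the matrix that maps tool coordinates into image coordinates.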

Requirements

  1. It must work as a service
  2. It must be accessible from Batch programs as well as GUI programs
  3. It must be computationally efficient
  4. It must be multi-platform
  5. It must be memory efficient

Use Cases

Some of these use cases were taken from the Slicer requirements for IGS applications

Slicer 3 and IGSTK integration (Nobuhiko Hata, Luis Ibanez, Patrick Cheng)

The Data Model will act as a service that offers to clients the options of

  • Storing data along with tags (MetaData Identifiers)
  • Retrieving data using the Identifiers
  • Modifying data in place
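
The three service operations above can be sketched as a small tag-keyed store. This is only an illustration under the assumption that data is an opaque byte buffer and that tags are strings; TaggedStore and its methods are hypothetical names, not an existing Slicer or IGSTK interface.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical sketch of a tag-keyed data service.
class TaggedStore {
 public:
  using Buffer = std::vector<std::uint8_t>;

  // Stores data under a tag, replacing any previous entry with that tag.
  void Store(const std::string& tag, Buffer data) {
    assert(!tag.empty());
    entries_[tag] = std::make_shared<Buffer>(std::move(data));
  }

  // Returns a shared handle to the stored buffer, or nullptr if the tag is unknown.
  // Writing through the handle modifies the stored data in place (no copy is made).
  std::shared_ptr<Buffer> Retrieve(const std::string& tag) const {
    auto it = entries_.find(tag);
    if (it == entries_.end()) {
      return nullptr;
    }
    return it->second;
  }

 private:
  std::map<std::string, std::shared_ptr<Buffer>> entries_;
};
```

Returning a shared handle instead of a copy is one way to get in-place modification; a real-time client would likely need additional locking or double buffering, which this sketch leaves out.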


Current Use Cases

The basic Data Model in Slicer supports instances such as:

  • Volumes
  • Scalar Types
  • Label Maps (segmentation result)
  • Reference to Lookup Table
  • Models
  • Named Field Data (scalars, vectors, labels) at points and cells (FreeSurferReaders)
  • Color, Clipping State, Visibility, Scalar Visibility, LookupTable
  • Transforms*
  • Fiducials, Fiducial Lists
  • Name, Label#, Diffuse/Ambient/Specular.

The Data Model API in Slicer allows adding, deleting, reading, and modifying medical image data types (Volumes, Models, Transforms, Fiducials, etc).


Use Cases to Add

In addition to the Data Model provided by Slicer, we will develop additional instances required uniquely for RFA.

  • State information
  • Transformation matrix for CT-to-patient registration in the tracker’s coordinate system
  • Predicted error from the CT-to-patient registration
  • Locations of tracker attached to the RFA applicator and US transducer
  • Transformation matrix from calibration of tracker to the US image coordinate system.
  • Magnitude and gain of the US imager in the last state of the imaging.
  • Location of fiducial markers

Strawman "Hello MRML" programs

No client server, just read an xml file

int main ()
{
  vtkMrmlTree *mrml = vtkMrmlTree::New();
  mrml->Connect("file://data.xml");
  mrml->PrintSelf();
  mrml->Delete();
}

Connect to a server, modify, commit

int main ()
{
  vtkMrmlTree *mrml = vtkMrmlTree::New();
  mrml->Connect("mrml://mrml.na-mic.org/data");
  vtkMrmlTransformNode *trans = vtkMrmlTransformNode::New();
  mrml->AddNode(trans);
  mrml->Commit();
  trans->Delete();
  mrml->Delete();
}

Open mrml file, run a vtk filter, save new file. This example uses separate mrml and vtkmrml libraries.

 #include "mrml.h"
 #include "vtkmrml.h"
 
 int main ()
 {
 
   // get mrml tree
   mrml::Tree *mrml = mrml::Tree::New();
   mrml->Connect("file://data.xml");
 
   // get input image in vtk format
   mrml::VolumeNode *volNode = mrml->GetNthVolume(0);
 
   vtkmrml::VolumeData *inData = vtkmrml::VolumeData::New();
 
   inData->SetSourceNode(volNode);
   vtkImageData *imgData = inData->GetImageData(); // converts data from internal format to vtk
 
   // vtk pipeline
   vtkImageGaussianSmooth *igs = vtkImageGaussianSmooth::New();
   igs->SetInput(imgData);
   igs->GetOutput()->Update();
 
   // put output volume in a new mrml volume node
   mrml::VolumeNode *volNodeOut = mrml::VolumeNode::New();
 
   vtkmrml::VolumeData *outData = vtkmrml::VolumeData::New();
 
   outData->SetTargetNode(volNodeOut);
   outData->SetSourceImage(igs->GetOutput());
   outData->Update();   // converts data from vtkImage into internal format
 
   // add node to the mrml tree
   mrml->AddNode(volNodeOut);
 
   // save new file
   mrml->Save("file://data1.xml");
 
   igs->Delete();
 
   mrml->Delete(); // Do we need this? vtk style or smartPointers?
   inData->Delete(); // Do we need this? vtk style or smartPointers?
   outData->Delete(); // Do we need this? vtk style or smartPointers?
   volNodeOut->Delete(); // Do we need this? vtk style or smartPointers?
 }
 

Connect to a server, run an ITK filter, commit

int main ()
{
  vtkMrmlTree *mrml = vtkMrmlTree::New();
  mrml->Connect("mrml://mrml.na-mic.org/data");
  vtkMrmlVolumeNode *vol = mrml->GetNthVolume(0);
  typedef itk::Image<float,3> ImageType;
  typedef itk::NormalizeImageFilter<ImageType,ImageType> ImageFilterType;
  ImageFilterType::Pointer norm = ImageFilterType::New();
  norm->SetInput(vol->GetITKDataF());
  norm->GetOutput()->Update();
  // put the normalized image in a new volume node
  vtkMrmlVolumeNode *outVol = vtkMrmlVolumeNode::New();
  outVol->SetITKDataF(norm->GetOutput());
  mrml->AddNode(outVol);
  mrml->Commit();
  outVol->Delete();
  mrml->Delete();
}

ITK Style

using namespaces and ITK idiom


int main ()
{
  mrml::Tree::Pointer tree = mrml::Tree::New();
  tree->Connect("mrml://mrml.na-mic.org/data");

  typedef itk::Image<float,3> ImageType;
  // itkmrml knows about both itk and mrml
  typedef itkmrml::VolumeData<ImageType> VolumeDataFactoryType;

  VolumeDataFactoryType::Pointer factory = VolumeDataFactoryType::New();
  factory->SetSource(tree->GetNthVolume(0));
  if ( !factory->CanTranslate() ) return 1;

  typedef itk::NormalizeImageFilter<ImageType,ImageType> ImageFilterType;
  ImageFilterType::Pointer norm = ImageFilterType::New();
  norm->SetInput(factory->GetImage());
  norm->GetOutput()->Update(); // this pulls mrml data into itk::Image

  VolumeDataFactoryType::Pointer outfactory = VolumeDataFactoryType::New();
  mrml::VolumeNode::Pointer outvol = mrml::VolumeNode::New();
  outfactory->SetImage(norm->GetOutput());
  outfactory->SetTarget(outvol);
  outfactory->Update(); // this pushes itk::Image data into mrml
  tree->AddNode(outvol);
  tree->Commit();
  return 0;
}

DataModel API

This is an initial draft of interactions with the DataModel. Most of the entries were taken from the vtkMrmlTree class.

  • dm->Connect("filename");
  • dm->Connect("URL");
  • dm->Commit();
  • dm->Close();
  • dm->InsertNode( node, "parent name", "node name");
  • dm->GetNode("node name");
  • dm->HasNode("node name");
  • dm->GetNextNode(); ?? // shouldn't we rather have iterators ?
  • dm->GetNthItem(); // useful for blind IO...?
  • dm->Gets by Class():
    • GetVolume()
    • GetTransform()
    • GetMatrix() ?? are these matrices representing Transforms ?
    • GetColor()
  • dm->ComputeTransforms();
  • dm->ComputeRelativeTransform("node1 name","node2 name");
  • dm->DeleteNode( node );
  • dm->DeleteNode( "node name" );
  • dm->Delete(); // Let's use vtkSmartPointers and avoid the need for Delete()...

"Node name" stands for any type of identification; it may be implemented as an integer Id or as a string.
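
To make the draft more concrete, here is a minimal in-memory sketch of the node-management part of the list, including the iterator alternative to GetNextNode() raised above. It assumes string node names, and NodeRecord and DataModelSketch are hypothetical names that do not correspond to vtkMrmlTree or any existing class.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical sketch; not the vtkMrmlTree API.
struct NodeRecord {
  std::string parent;  // empty for top-level nodes
  std::string type;    // e.g. "Volume", "Transform"
};

class DataModelSketch {
 public:
  using Container = std::map<std::string, NodeRecord>;
  using const_iterator = Container::const_iterator;

  // Returns false if the name is empty or already taken, or the parent is unknown.
  bool InsertNode(const NodeRecord& node, const std::string& name) {
    if (name.empty() || HasNode(name)) return false;
    if (!node.parent.empty() && !HasNode(node.parent)) return false;
    nodes_.emplace(name, node);
    return true;
  }

  bool HasNode(const std::string& name) const { return nodes_.count(name) != 0; }

  // Returns nullptr when no node has the given name.
  const NodeRecord* GetNode(const std::string& name) const {
    auto it = nodes_.find(name);
    return it == nodes_.end() ? nullptr : &it->second;
  }

  // Removes only the named node; children are left in place in this sketch.
  bool DeleteNode(const std::string& name) { return nodes_.erase(name) != 0; }

  // Standard iteration replaces a stateful GetNextNode().
  const_iterator begin() const { return nodes_.begin(); }
  const_iterator end() const { return nodes_.end(); }

 private:
  Container nodes_;
};
```

With begin() and end() in place, clients can write a range-based for loop over the nodes, and several traversals can run at once without sharing a cursor inside the data model.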

XML versus SQL

Our analysis seems to indicate that SQL and XML are possible solutions for the storage of the data on disk. We intend to implement an API that will talk to the storage implementation and hide it from the Slicer applications. In other words, Slicer developers and Slicer users should not need to know whether there is an XML file or a SQL database underneath.
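
One way to hide the storage choice is an abstract backend interface, sketched below. The class names (StorageBackend, InMemoryBackend, DataModelFacade) are hypothetical, and the in-memory backend merely stands in for an XML-file or SQL implementation that would provide the same two methods.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Hypothetical sketch of a storage abstraction; all names are illustrative.
class StorageBackend {
 public:
  virtual ~StorageBackend() = default;
  virtual void Put(const std::string& id, const std::string& payload) = 0;
  // Returns false if the identifier is unknown; otherwise fills *payload.
  virtual bool Get(const std::string& id, std::string* payload) const = 0;
};

// Stand-in for an XML-backed or SQL-backed implementation.
class InMemoryBackend : public StorageBackend {
 public:
  void Put(const std::string& id, const std::string& payload) override {
    rows_[id] = payload;
  }
  bool Get(const std::string& id, std::string* payload) const override {
    auto it = rows_.find(id);
    if (it == rows_.end()) return false;
    *payload = it->second;
    return true;
  }

 private:
  std::map<std::string, std::string> rows_;
};

// Application code sees only this class, never the backend type.
class DataModelFacade {
 public:
  explicit DataModelFacade(std::unique_ptr<StorageBackend> backend)
      : backend_(std::move(backend)) {
    assert(backend_ != nullptr);
  }
  void InsertNode(const std::string& name, const std::string& serialized) {
    backend_->Put(name, serialized);
  }
  bool HasNode(const std::string& name) const {
    std::string unused;
    return backend_->Get(name, &unused);
  }

 private:
  std::unique_ptr<StorageBackend> backend_;
};
```

Swapping XML for SQL would then mean passing a different StorageBackend to the constructor, with no change to code that calls InsertNode or HasNode.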


The following table summarizes the advantages and disadvantages of using XML versus SQL. There is also the option of combining both, if we find that each one alone does not provide all the features that we want for Slicer applications.

Feature                           | XML                                 | SQL
Get element by an identifier      | natural but need to be hierarchical | natural
Insert element with an identifier | natural                             | natural
Hierarchy navigation              | natural                             | must be implemented with auxiliary table
Resistant to power-down           | No                                  | Yes
Support for large datasets        | Yes                                 | Yes
Speed for access                  | to be measured                      | to be measured

SQL Options

The implementation could use one database engine on all platforms, or it could define a common API that wraps a different local library on each platform. For example, it could use MS-SQL on Windows and MySQL on Unix, wrapping both in a common C++ API customized for the types of objects that Slicer would manage.

Matrix of current options

Option     | MS-Windows | Linux | Cygwin | Macintosh | Sun | SGI | License                                          | Installation Burden
MS-SQL     | yes        | no    | no     | no        | no  | no  | EULA?                                            | Medium (only Windows)
PostgreSQL | yes        | yes   | yes    | yes       | yes | yes | BSD (see details)                                | Medium (requires root or home user build)
MySQL      | yes        | yes   | yes    | yes       | yes | yes | GPL / Commercial (see details) (see some issues) | Medium (requires root or home user build)
CORBA*     | yes        | yes   | yes    | yes       | yes | yes | ?                                                | High (requires root and network setup)
SQLite     | yes        | yes   | yes    | yes       | yes | yes | Public Domain (see)                              | Low (built into the application)
MetaKit    | yes        | yes   | yes    | yes       | yes | yes | X/MIT Style (see)                                | Low (built into the application)

* CORBA would actually require a specific package to be tested per platform.

Current Option

SQLite

Features include: (from)

  • Transactions are atomic, consistent, isolated, and durable (ACID) even after system crashes and power failures.
  • Zero-configuration - no setup or administration needed.
  • Implements most of SQL92. (Features not supported)
  • A complete database is stored in a single disk file.
  • Database files can be freely shared between machines with different byte orders.
  • Supports databases up to 2 terabytes (2^41 bytes) in size.
  • Sizes of strings and BLOBs limited only by available memory.
  • Small code footprint: less than 250KiB fully configured or less than 150KiB with optional features omitted.
  • Faster than popular client/server database engines for most common operations.
  • Simple, easy to use API.
  • TCL bindings included. Bindings for many other languages available separately.
  • Well-commented source code with over 95% test coverage.
  • Self-contained: no external dependencies.
  • Sources are in the public domain. Use for any purpose.

Second Option

PostgreSQL DataBase

Online Tutorial

Features

  • Allows connections via unix domain sockets and TCP/IP connections
  • Has bindings for PHP, C, Python, Perl, Tcl
  • Size Limitations (taken from)
    • Maximum size for a database? unlimited (32 TB databases exist)
    • Maximum size for a table? 32 TB
    • Maximum size for a row? 1.6TB
    • Maximum size for a field? 1 GB (This is what we will map to one Image. If it becomes a limit we could store the image in Slices per field)
    • Maximum number of rows in a table? unlimited
    • Maximum number of columns in a table? 250-1600 depending on column types
    • Maximum number of indexes on a table? unlimited
  • Object Oriented Database: Fields can be customized object data structures.
    • Supports Inheritance: one table can inherit properties from another one (details).
  • Database server can be a remote machine or the local one
    • This will naturally support a client/server approach such as the one in ParaView
    • Client applications can be very diverse in nature. A client could be:
      • a text-oriented tool
      • a graphical application
      • a web server that accesses the database to display web pages
      • a specialized database maintenance tool
  • The PostgreSQL server can handle multiple concurrent connections from clients.
    • For that purpose it starts ("forks") a new process for each connection. From that point on, the client and the new server process communicate without intervention by the original postmaster process.
  • Supported platforms (see)
  • Native support for using SSL connections to encrypt client/server communications for increased security. This requires that OpenSSL is installed on both client and server systems and that support in PostgreSQL is enabled at build time
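
The note above about mapping one image to one field, and falling back to storing slices per field, can be checked with a little arithmetic. The sketch below assumes the 1 GB field limit means 2^30 bytes and that a field holds whole slices only; VolumeShape and the helper functions are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>

// Assumption: the 1 GB per-field limit quoted above is taken as 2^30 bytes.
constexpr std::uint64_t kMaxFieldBytes = 1ULL << 30;

// Hypothetical description of a volume's size.
struct VolumeShape {
  std::uint64_t columns;
  std::uint64_t rows;
  std::uint64_t slices;
  std::uint64_t bytesPerVoxel;
};

inline std::uint64_t SliceBytes(const VolumeShape& v) {
  return v.columns * v.rows * v.bytesPerVoxel;
}

inline std::uint64_t VolumeBytes(const VolumeShape& v) {
  return SliceBytes(v) * v.slices;
}

inline bool FitsInOneField(const VolumeShape& v) {
  return VolumeBytes(v) <= kMaxFieldBytes;
}

// Number of fields needed when each field stores as many whole slices as fit.
inline std::uint64_t FieldsNeeded(const VolumeShape& v) {
  const std::uint64_t sliceBytes = SliceBytes(v);
  assert(sliceBytes > 0 && sliceBytes <= kMaxFieldBytes);
  const std::uint64_t slicesPerField = kMaxFieldBytes / sliceBytes;
  return (v.slices + slicesPerField - 1) / slicesPerField;  // ceiling division
}
```

Under these assumptions a 512 x 512 x 300 volume of 2-byte voxels (about 150 MB) fits in a single field, while a 1024 x 1024 x 2000 volume of 4-byte voxels would need to be split across 8 fields of 256 slices each.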

Third Option

MetaKit DataBase

http://www.equi4.com/mkoverview.html

Features

  • Use your data on any platform. Both the code and datafiles are portable. All byte-ordering managed by the library.
  • Complex datastructures in one file. Store multiple nested data structures, to create document-centric applications.
  • Restructure datafiles, instantly. It restructures files on-the-fly, while open.
  • Serialize all data for transport. Complementing commit/rollback of changes, data can also be serialized.
  • Recover from system-failures. The use of Stable Storage ensures that files cannot be corrupted by crashes.
  • Load on-demand, quick startup. Files are opened without reading data. Memory-mapped files if O/S supports it.
  • Behaves like containers. The API mimics container classes. Quickly get sizes and iterate over rows.
  • Wide range of operators built-in. Sorting, relational join / group by, set operations, permutations, hashing.
  • 1-32 bits per int (or 64), variable-sized data. The largest int defines storage format. String/binary data is stored as var-sized.
  • Create fully self-contained applications. Can be linked shared or statically, for hassle-free deployment of components.
  • Tiny code (125 Kb as Win32 DLL). The library is extremely small, unused functions are stripped off in static links.
  • Simple API, just 6 core classes. Only a small interface is exposed. One header file lists all the classes you need.
  • Also use from Python and Tcl. These language bindings are coded to take advantage of the respective idioms.

API Issues: Strawman Answers

Bold: updates after tcon

  • MRML Tree:
    • Is MrmlTree a true hierarchy or a list of nodes (as currently)?
    • Should it be a real scene tree? Right now it's an in-memory image of the XML file, which combines scene hierarchy and data persistence. No.
    • If it's xml file image, do we use DOM, XPath for internal representation of xml file? No, use existing MRML hierarchy.
    • Do we use SQL database to persist MRML tree and data? Do we use database to provide remote access to MRML trees and data in the client/server mode. No. Implementation of the internals of the data model will be hidden behind the API
  • MRML nodes:
    • How is the data accessed from Mrml Node, can we make it independent from vtk/itk types like vtkImageData and itk::Image<>?
    • Metadata and vtk data should be separated to avoid redundancy. What metadata is stored in the new MrmlVolume, MrmlModel, etc. Can we use delegation from vtkImageData SetSpacing() etc. methods to avoid duplication? Explicit synchronize metadata methods between MRML node and vtk data. The metadata in the MRML nodes is the definitive version -- any platform-specific metadata is filled out by the factory that generates the structure
    • What subset of general vtk vtkImageData and vtkPolyData is supported, multicomponent, tensors etc. Do we create special MRML nodes for tensors. Allow full vtk API for creating and manipulating vtkDataSet and vtkFieldData. Define a specific set of functionality to be represented by MRML -- do not rely on vtk
    • How are transformations represented? Do we use the new Coordinate System Manager? Yes. Do we use MrmlGroup Node instead of MrmlTransform, with the new coordinate system manager defining transformations? Yes. Need a way to serialize coordinate systems.
    • Support ITK Volumes in API? Only 3D vtk volumes. Define the volume types that MRML needs to support for NA-MIC needs (with ability to extend) ITK and VTK factories will be responsible for mapping them to the specific system.
  • Coordinate Systems:
    • How are Slicer internal coordinate systems (RAS, LPS, ijk, vtk) represented by the new coordinate system manager? Do we store RAStoIJK as part of MRML node metadata? Each RAStoRAS Transform is a MRML node. Coordinate System Manager has pointers to those transforms and internal RAStoIJK transforms of volumes and models.
    • How are non-linear transforms represented? Do we support displacement fields, BSplines? Between which coordinate systems do they transform? How are vectors, normals and tensors treated? All non-linear transforms are RAStoRAS. Need new vtk Transforms similar to ITK transforms. vtk transforms can be implemented with vtkITK wrappers
  • Execution model:
    • Need C++ classes for Application, Modules, Viewers. Move tcl global arrays to vtk collections. Move all visualization and application state and logic into its own classes. Need C++ API for update loop, observers.
  • Client/Server:
    • Do we support the entire mrml API between client and server? Initially we support only ITK style ImageIO. Later full API?
    • Do we use CORBA? Need to have stream based serialization for both Mrml nodes and Mrml data
    • Do we use SQL database? Database has to support client/server mode for simultaneous read-write operations.
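
For the coordinate system questions above, the following sketch shows the two simplest conversions involved: LPS to RAS, which only flips the sign of the first two axes, and an IJK to RAS mapping for the special case of an axis-aligned volume. A full RAStoIJK transform would be a general 4x4 matrix including rotation; the AxisAlignedGeometry type and function names here are hypothetical.

```cpp
#include <array>
#include <cassert>

// Hypothetical sketch; not the Coordinate System Manager API.
using Point3 = std::array<double, 3>;

// LPS (left, posterior, superior) and RAS (right, anterior, superior)
// differ only in the sign of the first two axes.
inline Point3 LpsToRas(const Point3& p) {
  return {-p[0], -p[1], p[2]};
}

// Simplified volume geometry: voxel axes aligned with RAS axes, no rotation.
struct AxisAlignedGeometry {
  Point3 originRas;  // RAS position of voxel (0, 0, 0)
  Point3 spacing;    // voxel size along i, j, k

  Point3 IjkToRas(const Point3& ijk) const {
    return {originRas[0] + spacing[0] * ijk[0],
            originRas[1] + spacing[1] * ijk[1],
            originRas[2] + spacing[2] * ijk[2]};
  }

  Point3 RasToIjk(const Point3& ras) const {
    assert(spacing[0] != 0.0 && spacing[1] != 0.0 && spacing[2] != 0.0);
    return {(ras[0] - originRas[0]) / spacing[0],
            (ras[1] - originRas[1]) / spacing[1],
            (ras[2] - originRas[2]) / spacing[2]};
  }
};
```

Storing the geometry with the volume node, as suggested in the answers above, would let a coordinate system manager compose this per-volume mapping with the RAStoRAS transforms held in separate nodes.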

Path-based MRML3 proposal

Here's a proposal for a path-based MRML3 implementation, using ideas from the Coordinate Space Manager.

Current Status

  1. First C++ draft is in the sandbox

Current Tools

  1. Python prototype by Mike of the path-based XML layer:
    1. parses XML elements and overlays semantics of "path" and "ref" tags.
    2. remote resource cache implemented
    3. mechanism for handling namespaced elements and attributes implemented
    4. reading files implemented
    5. writing 70% implemented (remaining issue: renaming resources while maintaining links)
    6. renaming resources 50% implemented (what happens when you want to move a remote resource like a URL into a local file?)
  2. C++ prototype by Luis

Test Data

N/A.

Team Members

  • Mike Halle - BWH
  • Alex Yarmarkovich - Isomics
  • Xiaodong Tao - GE
  • Luis Ibanez - Kitware
  • Steve Pieper - Isomics

Slides

File:2006 AHM Programming Half Week SlicerDataModel.ppt