Difference between revisions of "DataRepository"

From NAMIC Wiki

Revision as of 16:01, 20 July 2007

Overview

NA-MIC Driving Biological Projects investigators have made several datasets available to NA-MIC developers. This page provides information on how to obtain access to these datasets.

BIRN Controlled Infrastructure

User Accounts to Access NAMIC Data

FMRI and DTI datasets are currently available to NAMIC participants via the BIRN Controlled Infrastructure. In order to access this data, please use the following steps to create a BIRN account with NAMIC permissions:

  • Go to the portal and follow the directions to set up a member account. (Your site may not be listed; if so, please state that you are a NA-MIC participant in the interest section.)
  • Before you download data using the account you have created, please confirm with your PI that you have received adequate human subject training. If your site does not provide this training, please complete the Human Subject Online Training as an affiliate of Brigham and Women's Hospital and send a copy of your certificate to your PI for their records.

Brockton VA/Harvard Structural and DTI Images

Data Access

To download this data, click on the following links:

  • Ensure you are logged into the portal
  • Download Structural MRI
  • Download DTI data
    • The header for each DWI file is obsolete and incorrect; download Harvard_VA_nhdr.tgz and overwrite the DWI nrrd headers.

Data Description

Data Contact

If you have any questions about the data, please contact Sylvain Bouix at Harvard.

Dartmouth Structural, Functional, T2 and DTI Images

Data Access

To download this data, click on the following links:

Data Description

  • 6 Healthy Controls Data
    • All data were acquired on a 1.5T GE MRI Scanner using the Horizon LX software platform.
    • SPGR: Scan Parameters and Data Structure
    • T2: Scan Parameters and Data Structure
    • fMRI (ER-RECOG task): Scan Parameters and Data Structure
      • NOTE: The task onsets included with the data are WRONG. Corrected onsets have been uploaded here.
    • DTI: Scan Parameters and Data Structure
    • Hippocampus Traces: Hippocampal traces were made on an SPGR resampled to 1.015625 mm3 isotropic voxels and realigned along the long axis of the hippocampus. This reoriented SPGR volume is included with the traces. Traces were drawn using the Brains program.
    • Hippocampal ROIs: Binary hippocampal ROIs have been uploaded in the same directory as the traces. These ROIs are in Analyze format. There are four ROIs for each subject (left hippo, left hippo smoothed 3 mm FWHM, right hippo, right hippo smoothed 3 mm FWHM). The smoothed ROIs have 'sROI' in the filename.
  • MRI Data for 15 SZ
    • Data consists of DICOM slices for SPGR and trace files for various Frontal Lobe Subdivisions
      • SPGR: Scan Parameters and Data Structure
      • Frontal Traces: Traces were made on SPGR volume using the Brains program. Go here for a listing of the structures available and the filenames associated with them.
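The hippocampal ROI naming rule described above (smoothed ROIs carry 'sROI' in the filename) can be used to separate the two kinds of files after download. The following is a minimal sketch, not part of the dataset tooling: the filenames are hypothetical, and the conversion from FWHM to a standard deviation assumes the smoothing kernel was Gaussian, which the description does not state.

```python
import math


def split_rois(filenames):
    """Return (unsmoothed, smoothed) lists based on the 'sROI' marker."""
    smoothed = sorted(name for name in filenames if "sROI" in name)
    unsmoothed = sorted(name for name in filenames if "sROI" not in name)
    return unsmoothed, smoothed


def fwhm_to_sigma(fwhm_mm):
    """Convert a Gaussian FWHM to its standard deviation (same units)."""
    return fwhm_mm / (2.0 * math.sqrt(2.0 * math.log(2.0)))


# Hypothetical filenames for one subject; the real names may differ.
names = [
    "subj01_left_hippo_ROI.img",
    "subj01_left_hippo_sROI.img",
    "subj01_right_hippo_ROI.img",
    "subj01_right_hippo_sROI.img",
]

unsmoothed, smoothed = split_rois(names)
print("unsmoothed:", unsmoothed)
print("smoothed:", smoothed)
print("sigma for 3 mm FWHM (mm):", round(fwhm_to_sigma(3.0), 3))
```

Under the Gaussian assumption, a 3 mm FWHM corresponds to a standard deviation of roughly 1.27 mm.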

Data Contact

  • For additional information on this data, please contact John D. West. Contact information can be found on the members list in the NA-MIC section of BIRN.

UCI Structural, Functional and PET Images, Genetic and Clinical Data

Data Access

To download this data:

Data Description

Data Contact

  • For additional information on this data, please contact Jessica Turner at UCI.

Search for NAMIC Data

Users search for data available on the BIRN data grid through metadata applied to the data itself. Each data object and collection can be assigned any number of metadata tags. Currently the metadata tags consist of items such as Modality (DTI, T1, T2, etc.) and Institution.

To see an example of metadata on a data file, take a look at the following UCI file: First
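To illustrate the idea of searching by metadata tags, here is a small in-memory sketch. It does not use the BIRN data grid or any SRB interface; the catalog paths, institution names, and the function find_by_tags are hypothetical and exist only to show how tag-based filtering behaves.

```python
def find_by_tags(catalog, **wanted):
    """Return the paths whose tags match every requested key/value pair."""
    return [
        path
        for path, tags in catalog.items()
        if all(tags.get(key) == value for key, value in wanted.items())
    ]


# Hypothetical catalog: each data object carries any number of tags.
CATALOG = {
    "/hypothetical/site_a/subj01_dti.tgz": {"Modality": "DTI", "Institution": "SiteA"},
    "/hypothetical/site_a/subj01_t1.tgz": {"Modality": "T1", "Institution": "SiteA"},
    "/hypothetical/site_b/subj07_dti.tgz": {"Modality": "DTI", "Institution": "SiteB"},
}

print(find_by_tags(CATALOG, Modality="DTI"))
print(find_by_tags(CATALOG, Modality="DTI", Institution="SiteB"))
```

Adding more tag constraints narrows the result, which mirrors how combining Modality and Institution narrows a search on the data grid.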

Replication of Data

Currently, the NAMIC data sets are stored in BIRN at:

/home/Projects/NAMIC__0003/Files (physically located at NCMIR)
/home/Projects/Study5000__0005/Files (physically located at UCI)

For redundancy in case a BIRN rack goes offline, we have replicated the data to the BWH rack in preparation for the Programmer's Week.

This can be seen when using 'Sls -a':

 [jgerk@ncmir-gpop scratch]$ Sls -a UCI07110773.tgz
 jgerk       0 z-ucsd-ncmir-nas1       451315092 2005-02-18-14.00 % /home/Projects/Study5000__0005/Files/Archive/UCI07110773.tgz
 jgerk       1 z-harvard-bwh-nas1      451315092 2005-06-24-17.02 % /home/Projects/Study5000__0005/Files/Archive/UCI07110773.tgz

During regular use it is not important that the file resides in two places, but it should be understood that removing the file will result in removing both copies.
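The replica listing above can also be checked programmatically, for example to confirm that each file has a copy on more than one resource. The sketch below parses text shaped like the sample output; the column layout (owner, replica number, resource, size, timestamp, '%', path) is inferred from that sample alone and may not hold for other SRB versions or for paths containing spaces.

```python
from collections import defaultdict
from typing import NamedTuple


class Replica(NamedTuple):
    owner: str
    replica_number: int
    resource: str
    size_bytes: int
    timestamp: str
    path: str


def parse_sls_a(output):
    """Parse lines shaped like the `Sls -a` sample; skip anything else."""
    replicas = []
    for line in output.splitlines():
        fields = line.split()
        if len(fields) != 7 or fields[5] != "%":
            continue  # shell prompt, blank line, or unexpected layout
        owner, number, resource, size, timestamp, _, path = fields
        replicas.append(
            Replica(owner, int(number), resource, int(size), timestamp, path)
        )
    return replicas


def resources_by_path(replicas):
    """Map each file path to the resources that hold a copy of it."""
    grouped = defaultdict(list)
    for rep in replicas:
        grouped[rep.path].append(rep.resource)
    return dict(grouped)


SAMPLE = """\
[jgerk@ncmir-gpop scratch]$ Sls -a UCI07110773.tgz
jgerk       0 z-ucsd-ncmir-nas1       451315092 2005-02-18-14.00 % /home/Projects/Study5000__0005/Files/Archive/UCI07110773.tgz
jgerk       1 z-harvard-bwh-nas1      451315092 2005-06-24-17.02 % /home/Projects/Study5000__0005/Files/Archive/UCI07110773.tgz
"""

replicas = parse_sls_a(SAMPLE)
for path, resources in resources_by_path(replicas).items():
    print(path, "->", ", ".join(resources))
```

In practice the text would come from running Sls -a yourself and capturing its output; the parser only reads that text and never contacts SRB.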

As more data is added to NAMIC, the following command can be run periodically to update the BWH resource (provided you have the correct permissions).

 Sbkupsrb -r -S z-harvard-bwh-nas1 Files

Notes

Instructions on creating Data Repository Pages are here.


NAMIC Domain Creation

Please note that users who previously belonged to the external domain have been added to the namic domain. If you use the SRB S-commands, you need to change your .MdasEnv file if your default directory was previously /home/<your_user_name>.external or if you want it to be /home/<your_user_name>.namic. Please update your .MdasEnv file to point to your new default directory.
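As a rough illustration of the .MdasEnv change, the sketch below rewrites every occurrence of the old home directory in the file's text. The sample contents and the user name jdoe are hypothetical, and the key names shown are an assumption about a typical SRB client configuration; compare them with your own file before relying on this.

```python
def migrate_home(env_text, user):
    """Replace /home/<user>.external with /home/<user>.namic in the text."""
    old_home = f"/home/{user}.external"
    new_home = f"/home/{user}.namic"
    return env_text.replace(old_home, new_home)


# Hypothetical .MdasEnv contents (key names are assumed, not confirmed).
SAMPLE_ENV = (
    "mdasCollectionName '/home/jdoe.external'\n"
    "mdasCollectionHome '/home/jdoe.external'\n"
    "srbUser 'jdoe'\n"
)

migrated = migrate_home(SAMPLE_ENV, "jdoe")
print(migrated)
```

The function only touches the directory paths. Keep a backup of the original file, and review any other entries by hand.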

MIDAS

MIDAS is a public repository of freely available data. New contributions from NAMIC collaborators are welcome.

MIDAS is a system for collecting, processing, and distributing massive collections of data. It is particularly well suited for large images and their meta-data.

One of the outstanding features of MIDAS is its ease of integration into existing products and processes. Using MIDAS, remote collaborators and terabytes of data can be seamlessly integrated into the workflow of individual research projects as well as large publication efforts. The key to MIDAS' ease of integration is the diversity of data, processing, and access formats that it supports. For example, MIDAS provides the following services:

  • Indexing: MIDAS automatically indexes text in any of over 20 different file types, including XML, Microsoft Word(TM), and Adobe PDF(TM) files. It also automatically indexes header information from over 20 different image types, including MetaImage and DICOM.
  • Searching: MIDAS supports the OAI data harvesting standard, so that public data is readily discovered by Google and other search engines. Private data is readily searched using a local search engine.
  • Storing: MIDAS provides batch upload methods and supports arbitrary file types and meta-data fields so that it can manage all of your raw and processed data as well as your companion reports and summary statistics.
  • Processing: Local, server-side, and distributed processing can be automatically initiated on upload of data to MIDAS or initiated using thin or thick clients. Server-side and distributed processing is provided using the BatchMake scripting language.
  • Accessing: You can browse large images on MIDAS without downloading the data; you can download MIDAS's data over the web or directly into your legacy software using SSHFS. SSHFS presents the hierarchical organization of the data on MIDAS as a file system that can be mounted on Windows, Linux, and Macintosh platforms.

MIDAS has been used to host publications for e-journals, to host and process data for multi-center research projects, and to offer public access to data in compliance with NIH and NSF data sharing policies.
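The Searching service above mentions the OAI data harvesting standard. Assuming this refers to OAI-PMH, a harvester talks to a repository through plain HTTP requests that carry a verb parameter. The sketch below only builds such request URLs; the base URL is a placeholder, not the actual MIDAS endpoint, which is not given on this page.

```python
from urllib.parse import urlencode


def oai_request_url(base_url, verb, **params):
    """Build an OAI-PMH request URL for the given verb and arguments."""
    query = urlencode({"verb": verb, **params})
    return f"{base_url}?{query}"


# Placeholder endpoint; replace with the repository's real OAI base URL.
BASE_URL = "http://midas.example.org/oai"

identify_url = oai_request_url(BASE_URL, "Identify")
records_url = oai_request_url(BASE_URL, "ListRecords", metadataPrefix="oai_dc")
print(identify_url)
print(records_url)
```

Identify and ListRecords are standard OAI-PMH verbs, and oai_dc is the Dublin Core metadata prefix that every OAI-PMH repository is required to support.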

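Slicer Tutorial Data

  • http://www.insight-journal.org/midas/view_community.php?communityid=17

Slicer Brain and Abdomen Data

  • http://www.insight-journal.org/midas/view_collection.php?collectionid=34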