Difference between revisions of "Slicer3:Large scale experiment control brainstorming"

From NAMIC Wiki
== Goal ==
<big>'''Note:''' We are migrating this content to the slicer.org domain - <font color="orange">The newer page is [https://www.slicer.org/wiki/Slicer3:Large_scale_experiment_control_brainstorming#BatchMake_and_Experiment_Control  here]</font></big>
 
 
To provide Slicer3 with a mechanism for submitting, monitoring, and summarizing large scale experiments that utilize Slicer3 modules, particularly the Command Line Modules.  This page summarizes our thoughts, requirements, and experiments to date, mostly accomplished during the [[Slicer3:MiniRetreat March 29, 30, 2007 | March 2007 Slicer3 MiniRetreat]].
 
 
 
There are two introductory use cases that we wish to support:
 
 
 
# Slicer3 is used interactively to select a set of parameters for an algorithm or workflow on a single dataset.  Then, these parameters are applied to N datasets non-interactively.
# Slicer3 is used interactively to select a subset of parameters for an algorithm or workflow on a single dataset.  Then, the remaining parameter space is searched non-interactively. Various parameter space evaluation techniques could be employed, the simplest of which is to sample the space of (param1, param2, param3).
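
As a rough illustration of the two use cases, the sketch below builds the list of runs for each one. It is only a sketch: the parameter names, candidate values, and dataset file names are hypothetical, and a full Cartesian grid is just one possible way to sample the space.

```python
from itertools import product

# Hypothetical parameter names and candidate values (not taken from any Slicer3 module).
parameter_space = {
    "param1": [0.1, 0.5, 1.0],
    "param2": [1, 2],
    "param3": ["linear", "cubic"],
}


def sample_grid(space):
    """Use case 2: yield one parameter dict per point of the full Cartesian grid."""
    names = list(space)
    for values in product(*(space[name] for name in names)):
        yield dict(zip(names, values))


def apply_to_datasets(fixed_params, datasets):
    """Use case 1: pair one interactively chosen parameter set with each of N datasets."""
    return [{"dataset": dataset, **fixed_params} for dataset in datasets]


grid = list(sample_grid(parameter_space))
runs = apply_to_datasets(
    {"param1": 0.5, "param2": 2, "param3": "linear"},
    ["subject01.nrrd", "subject02.nrrd"],  # hypothetical dataset names
)
```

Each entry of <code>grid</code> or <code>runs</code> corresponds to one non-interactive job; how those jobs are submitted is left to the experiment control mechanism.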
 
 
 
Note that with the above two use cases we are only addressing large scale experiment control from the standpoint of what it means to Slicer3.  We are '''not''' trying to solve the general case of large scale experiment control.
 
 
 
== Assumptions and restrictions ==
 
 
 
# Computing configuration.
#: We shall support a variety of computing infrastructures, including
#:# single computer systems,
#:# clusters,
#:# grids (optional).
# Access to compute nodes.
#: We shall have no direct access to the compute nodes.  All '''job submissions''' shall go to some sort of submit node.  An exception may be made when operating on a single computer system.
# Staged data.
#: The compute nodes shall mount a filesystem, external to the node, on which the data is staged.  We are not providing Slicer3 with mechanisms to stage data; we assume that all data is staged outside of Slicer3.
# Staged programs.
#: The compute nodes shall have access to the Slicer3 processing modules.  As with the data, the processing modules are staged outside of the Slicer3 environment.
# Experiment scheduling.
#: A given experiment shall result in one or more processing jobs being submitted to the computing resources.
# Job submission.
#: Submitting a job to the computing infrastructure shall return a job submission token so that the job can be
#:# monitored for status (scheduled, running, completed),
#:# terminated.
# Experiment control.
#: We shall be able to monitor an experiment to see its status.
#: We shall be able to interrupt an experiment.  This may involve removing jobs from the queue and terminating jobs in progress.
#: We shall be able to resume an experiment without re-running the entire experiment.  Previously terminated jobs will be resubmitted.  Previously completed jobs will not be rerun.
#: We shall be able to rerun an experiment, overwriting previous results.
# Job execution robustness.
#: Jobs that terminate unsuccessfully shall be automatically resubmitted to the computing environment at the experiment designer's request.  Jobs may be resubmitted zero times, K times, or until successful.
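
The job submission, experiment control, and robustness requirements above can be sketched as a small bookkeeping class. This is a minimal illustration and not a proposed Slicer3 design: the class and method names are hypothetical, the <code>failed</code> status is an added assumption, and <code>run_job</code> is a synchronous stand-in for submitting to a real submit node.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Optional


class JobStatus(Enum):
    SCHEDULED = "scheduled"
    RUNNING = "running"
    COMPLETED = "completed"
    TERMINATED = "terminated"
    FAILED = "failed"  # assumption: distinguishes an unsuccessful exit from a user termination


@dataclass
class Job:
    name: str
    status: JobStatus = JobStatus.SCHEDULED
    attempts: int = 0
    token: Optional[str] = None  # stand-in for the job submission token


class Experiment:
    """Hypothetical experiment record; max_resubmits=None means 'until successful'."""

    def __init__(self, job_names, run_job: Callable[[str], bool],
                 max_resubmits: Optional[int] = 0):
        self.jobs = {name: Job(name) for name in job_names}
        self.run_job = run_job
        self.max_resubmits = max_resubmits

    def _submit(self, job):
        job.attempts += 1
        job.token = f"{job.name}#{job.attempts}"
        job.status = JobStatus.RUNNING
        succeeded = self.run_job(job.name)
        job.status = JobStatus.COMPLETED if succeeded else JobStatus.FAILED

    def _run_with_retries(self, job):
        self._submit(job)
        resubmits = 0
        while job.status is JobStatus.FAILED and (
                self.max_resubmits is None or resubmits < self.max_resubmits):
            resubmits += 1
            self._submit(job)

    def resume(self):
        """Submit every job that has not completed; completed jobs are not rerun."""
        for job in self.jobs.values():
            if job.status is not JobStatus.COMPLETED:
                self._run_with_retries(job)

    def rerun(self):
        """Reset all jobs and run the whole experiment again."""
        for job in self.jobs.values():
            job.status = JobStatus.SCHEDULED
            job.attempts = 0
        self.resume()

    def status(self):
        return {name: job.status.value for name, job in self.jobs.items()}
```

A real implementation would submit asynchronously and poll the token for status; the sketch only shows which state has to be kept so that an experiment can be resumed without repeating completed jobs.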
 
 
 
== Components ==
 
 
 
(Steve, fill this section in from the whiteboard.)
 
 
 
== Thought experiments ==
 
 
 
Below are a few thought experiments that explore how candidate tools could meet the needs described above.
 
 
 
=== Makefiles + the loopy launcher ===
 
 
 
=== BatchMake ===
 
 
 
BatchMake allows large scale experiments to be designed using a scripting language similar to CMake scripts.  BatchMake provides a number of looping constructs that can be used to design experiments and parameter searches:
* foreach
* sequence
* randomize
* fornfold

Here is a BatchMake script to search the parameter space of a median filter:
 
<pre>
SetApp(median @'Median Filter')
SetAppOption(median.inputVolume 'c:/projects/I2/Insight/Testing/Data/Input/cthead1.png')

Set(kernels '1,1,1' '2,2,1' '3,3,1' '4,4,1' '5,5,1')
Set(outVolumePrefix 'c:/projects/Temp/Slicer3/median')

foreach(kernel ${kernels})
  RegEx(kernelText ${kernel} ',' REPLACE '_')
  SetAppOption(median.outputVolume ${outVolumePrefix}${kernelText}.png)
  SetAppOption(median.neighborhood ${kernel})

  Run(output ${median})

endforeach(kernel)
</pre>
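
For readers unfamiliar with BatchMake, the loop above can be mirrored in Python to show which command lines it would roughly correspond to. This is only a sketch: the executable name is an assumption (the real path depends on the installation), and the argument order assumes the module takes a <code>--neighborhood</code> flag followed by the input and output volumes as positional arguments.

```python
# Values copied from the BatchMake script above.
input_volume = "c:/projects/I2/Insight/Testing/Data/Input/cthead1.png"
out_volume_prefix = "c:/projects/Temp/Slicer3/median"
kernels = ["1,1,1", "2,2,1", "3,3,1", "4,4,1", "5,5,1"]

# Assumption: a bare executable name; the real path depends on the installation.
EXECUTABLE = "MedianImageFilter"


def build_commands(kernels, input_volume, out_volume_prefix):
    """Return one argument list per kernel size, mirroring the foreach loop."""
    commands = []
    for kernel in kernels:
        # Same effect as RegEx(kernelText ${kernel} ',' REPLACE '_')
        kernel_text = kernel.replace(",", "_")
        output_volume = f"{out_volume_prefix}{kernel_text}.png"
        commands.append(
            [EXECUTABLE, "--neighborhood", kernel, input_volume, output_volume]
        )
    return commands


commands = build_commands(kernels, input_volume, out_volume_prefix)
for command in commands:
    print(" ".join(command))
```

Each argument list could be handed to a local process launcher or wrapped in a submission script for a cluster; the sketch stops at building the commands.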
 
We have extended the [[Slicer3:Execution Model Documentation | ModuleDescription library]] in Slicer3 to generate a BatchMake XML Application Wrapper from a ModuleDescription object.  This allows Slicer3 Command Line Modules to be loaded into BatchMake and used as BatchMake application objects in BatchMake scripts.  This code has yet to be integrated into Slicer3 permanently because there are a number of design decisions outstanding.  Here is the ModuleDescription XML file that Slicer uses for the median filter:
 
<pre>
<?xml version="1.0" encoding="utf-8"?>
<executable>
  <category>
  Filtering.Denoising
  </category>
  <title>
  Median Filter
  </title>
  <description>
The MedianImageFilter is commonly used as a robust approach for
noise reduction. This filter is particularly efficient against
"salt-and-pepper" noise. In other words, it is robust to the presence
of gray-level outliers. MedianImageFilter computes the value of each output
pixel as the statistical median of the neighborhood of values around the
corresponding input pixel.
  </description>
  <version>0.1.0.$Revision: 2085 $(alpha)</version>
  <documentation-url></documentation-url>
  <license></license>
  <contributor>Bill Lorensen</contributor>
  <acknowledgements>This command module was derived from Insight/Examples/Filtering/MedianImageFilter (copyright) Insight Software Consortium</acknowledgements>
  <parameters>
    <label>Median Filter Parameters</label>
    <description>Parameters for the median filter</description>
    <integer-vector>
      <name>neighborhood</name>
      <longflag>--neighborhood</longflag>
      <description>The size of the neighborhood in each dimension</description>
      <label>Neighborhood Size</label>
      <default>1,1,1</default>
    </integer-vector>
  </parameters>
  <parameters>
    <label>IO</label>
    <description>Input/output parameters</description>
    <image>
      <name>inputVolume</name>
      <label>Input Volume</label>
      <channel>input</channel>
      <index>0</index>
      <description>Input volume to be filtered</description>
    </image>
    <image>
      <name>outputVolume</name>
      <label>Output Volume</label>
      <channel>output</channel>
      <index>1</index>
      <description>Output filtered</description>
    </image>
  </parameters>
</executable>
</pre>
 
And here is the resulting BatchMake XML Application Wrapper:
 
<pre>
<?xml version="1.0" encoding="utf-8"?>
<BatchMakeApplicationWrapper>
  <BatchMakeApplicationWrapperVersion>1.0</BatchMakeApplicationWrapperVersion>
  <Module>
    <Name>Median Filter</Name>
    <Version>0.1.0.$Revision: 2085 $(alpha)</Version>
    <Path>c:/projects/Slicer3-clean-net2005/bin/RelWithDebInfo/../../lib/Slicer3/Plugins/RelWithDebInfo/MedianImageFilter.exe</Path>
    <Parameters>
      <Param>
        <Type>1</Type>
        <Name>neighborhood.flag</Name>
        <Value>--neighborhood</Value>
        <Parent>0</Parent>
        <External>0</External>
        <Optional>1</Optional>
      </Param>
      <Param>
        <Type>4</Type>
        <Name>neighborhood</Name>
        <Value>1,1,1</Value>
        <Parent>1</Parent>
        <External>0</External>
        <Optional>0</Optional>
      </Param>
      <Param>
        <Type>0</Type>
        <Name>inputVolume</Name>
        <Value></Value>
        <Parent>0</Parent>
        <External>1</External>
        <Optional>0</Optional>
      </Param>
      <Param>
        <Type>0</Type>
        <Name>outputVolume</Name>
        <Value></Value>
        <Parent>0</Parent>
        <External>2</External>
        <Optional>0</Optional>
      </Param>
    </Parameters>
  </Module>
</BatchMakeApplicationWrapper>
</pre>
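
The mapping between the two XML files can be approximated in a few lines of Python. This is an illustrative sketch, not the ModuleDescription library code: the numeric codes for <code>Type</code>, <code>Parent</code>, and <code>External</code> are inferred only from the single example above (flag = 1, flagged value = 4, file = 0; input = 1, output = 2; a value's parent is the 1-based position of its flag), and other parameter types would likely need different codes. The module description is abbreviated to the elements the sketch reads.

```python
import xml.etree.ElementTree as ET

# Abbreviated copy of the median filter ModuleDescription shown above.
MODULE_XML = """<executable>
  <title>Median Filter</title>
  <parameters>
    <label>Median Filter Parameters</label>
    <integer-vector>
      <name>neighborhood</name>
      <longflag>--neighborhood</longflag>
      <default>1,1,1</default>
    </integer-vector>
  </parameters>
  <parameters>
    <label>IO</label>
    <image>
      <name>inputVolume</name>
      <channel>input</channel>
      <index>0</index>
    </image>
    <image>
      <name>outputVolume</name>
      <channel>output</channel>
      <index>1</index>
    </image>
  </parameters>
</executable>"""

# Assumption: codes inferred from the one wrapper example, not from BatchMake documentation.
TYPE_FILE, TYPE_FLAG, TYPE_VALUE = "0", "1", "4"
EXTERNAL = {None: "0", "input": "1", "output": "2"}


def wrapper_params(module_xml):
    """Translate ModuleDescription parameters into wrapper-style Param dicts."""
    root = ET.fromstring(module_xml)
    params = []
    for group in root.findall("parameters"):
        for element in group:
            name = element.findtext("name")
            if name is None:
                continue  # skip <label> and <description> of the group itself
            flag = element.findtext("longflag")
            if flag is not None:
                # A flagged parameter becomes a flag entry plus a child value entry.
                params.append(dict(Type=TYPE_FLAG, Name=name + ".flag", Value=flag,
                                   Parent="0", External="0", Optional="1"))
                parent = str(len(params))  # 1-based position of the flag entry
                params.append(dict(Type=TYPE_VALUE, Name=name,
                                   Value=element.findtext("default", ""),
                                   Parent=parent, External="0", Optional="0"))
            else:
                # An unflagged parameter is treated as a positional file argument.
                params.append(dict(Type=TYPE_FILE, Name=name, Value="", Parent="0",
                                   External=EXTERNAL[element.findtext("channel")],
                                   Optional="0"))
    return params


def to_wrapper_xml(title, params):
    """Serialize the Param dicts in the same element layout as the wrapper above."""
    wrapper = ET.Element("BatchMakeApplicationWrapper")
    module = ET.SubElement(wrapper, "Module")
    ET.SubElement(module, "Name").text = title
    param_list = ET.SubElement(module, "Parameters")
    for param in params:
        node = ET.SubElement(param_list, "Param")
        for key, value in param.items():
            ET.SubElement(node, key).text = value
    return ET.tostring(wrapper, encoding="unicode")


params = wrapper_params(MODULE_XML)
wrapper_xml = to_wrapper_xml("Median Filter", params)
```

The sketch omits the <code>Version</code> and <code>Path</code> elements, which come from the module's build and installation rather than from its parameter description.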
 
 
 
==== BatchMake and the Computing Infrastructure ====
 
 
 
* What is needed to make BatchMake submit to a cluster?
 
* To a grid?
 
 
 
==== BatchMake and Job Control ====
 
 
 
# Can BatchMake terminate a job?
 
# Can BatchMake resubmit a job until it completes successfully?
 
 
 
==== BatchMake and Experiment Control ====
 
 
 
Can BatchMake interrupt, continue, and rerun an experiment?
 

Latest revision as of 18:05, 10 July 2017
