2. Installation

  1. Distributed Pipeline Server Installation Utility
    1. Requirements
    2. Warning
    3. Downloading
    4. GUI
      1. Start the Installer
      2. Select Components
      3. Install Grid Engine
      4. Install Pipeline
      5. Install Neuro Imaging Tools
      6. Install Bioinformatics Tools
      7. Finish Install
      8. Start the Server
    5. Command Line Installation
    6. Troubleshoot
  2. Conventional Installation (without DPS utility)
    1. Requirements
    2. Downloading
    3. Starting the server

2.1 Distributed Pipeline Server Installation Utility

The Distributed Pipeline Server Installer is a GUI installer that allows you to install and configure three types of resources: backend grid management resources (Grid Engine), the Pipeline server itself, and a number of computational imaging and informatics software tools. After successfully running the installer, you will have a running Pipeline server with Grid Engine managing jobs on your machine(s), the imaging and informatics software tools installed, and a set of predefined workflows and modules in your server library.

2.1.1 Requirements

The requirements for the Pipeline server installation can be found on the Distributed Pipeline Server Installer page.

Warning: If any of the requirements are not met, there may be unexpected behavior in the installer (e.g. hanging or crashing). If you have any questions, please contact pipeline@loni.usc.edu.

A complete installation (including grid engine, the Pipeline server, and all software tools) can take several hours. However, this is mostly because some of the tools take a long time to download (e.g. FSL can take up to 6 hours, depending on your internet speed). If you skip the tools or have already downloaded the ones that require manual download, the total installation time is less than 30 minutes.

2.1.2 Warning

When you run the DPS installation utility to install the Pipeline server, the underlying scripts will edit the firewall rules to open up the Pipeline port for connections from clients. Be forewarned that these changes can cause unexpected results on your system. We recommend backing up your iptables before starting the installation. In the future, this automatic configuration step will be made more robust.
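A minimal way to take that backup on a typical Linux system, as a sketch (run as root; the backup path is just an example):

```shell
# Save the current firewall rules to a file before running the installer.
# Restore them later, if needed, with: iptables-restore < /root/iptables.backup
if command -v iptables-save >/dev/null 2>&1; then
  iptables-save > /root/iptables.backup 2>/dev/null \
    && note="rules saved to /root/iptables.backup" \
    || note="iptables-save failed (are you root?)"
else
  note="iptables-save not found on this system"
fi
echo "$note"
```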

2.1.3 Downloading

Download the installer from the Pipeline website, under Downloads > Distributed Pipeline Server Installer.

2.1.4 GUI

The graphical interface of the DPS utility simplifies the installation experience for the user by hiding unessential details and only asking the user for minimal configuration preferences. The steps are documented below and are accompanied by screenshots.

2.1.4.1 Start the Installer

To start the installer, open a terminal, change directories to the directory where the installer file is located, and type

su root
tar -zxvf pipelineServerInstaller.tar.gz
cd pipelineServerInstaller
./launchInstaller.sh

2.1.4.2 Select Components

After reading and agreeing to the license, you will be asked for an installation location and what components you want to install:

You can select any* or all of the components; the installer will guide you through all the steps needed for the installation.

* For example, if you have already installed SGE before launching this installer, then deselect the Oracle Grid Engine component. Likewise, if you only want to install the latest tools, you can select the Neuro Imaging Tools component and uncheck the rest.

The installer will verify the Shared File System Location you provide. It must be on NFS if the server is set to use a grid. The shared file system is used by the Pipeline server to store intermediate workflow files and as the installation location for Grid Engine and the tools.
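One quick way to see what filesystem a candidate directory lives on, assuming GNU `stat` (the path below is a placeholder for your own Shared File System Location):

```shell
# /nfs/pipeline is a placeholder; substitute your Shared File System Location.
dir=/nfs/pipeline
# GNU stat reports the filesystem type name, e.g. "nfs" for an NFS mount.
fstype=$(stat -f -c %T "$dir" 2>/dev/null || echo "unknown (path not found?)")
echo "$dir: $fstype"
```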

2.1.4.3 Install Grid Engine

In this section you can configure the Grid Engine installation. You can specify an installation location, a cluster name (which uniquely identifies a specific Grid Engine cluster), a spool directory (for spooling data), and execution hosts (the hosts that execute jobs). You can leave the installation location, cluster name, and spool directory at their defaults, but you must provide a list of hostnames. These must be fully qualified domain names; values like “host1”, “localhost”, or “127.0.0.1” are not allowed.
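You can check what your machine reports as its name with `hostname`; a rough sketch (output varies by system, and the example FQDN is illustrative):

```shell
# Print the machine's fully qualified domain name; SGE needs something
# like "node1.example.org", not "localhost" or a bare short name.
fqdn=$(hostname -f 2>/dev/null || hostname)
case "$fqdn" in
  localhost|127.*) echo "'$fqdn' will not work for a Grid Engine install" ;;
  *.*)             echo "FQDN looks usable: $fqdn" ;;
  *)               echo "'$fqdn' is not fully qualified (no domain part)" ;;
esac
```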

2.1.4.4 Install Pipeline

In this section you can configure the Pipeline server. You can specify an installation directory, the Pipeline server address and port, and the user that will run the Pipeline server process. The username must already exist, and you have the option to have the sudoers file modified to accommodate privilege escalation.

User authentication lets you specify the authentication mechanism for the Pipeline server. If you already have NIS configured (there are plenty of online help resources, e.g. configure NIS server and client), it’s recommended to select the NIS option. Otherwise, you can select the SSH Based option, which runs an ssh command to test the provided credentials. You can also choose No Authentication to let anybody connect to your server. This option should only be used for testing, and on a server with limited internal network access.

If the modify sudoers file option is selected, the installer will modify the operating system’s sudoers file so that the Pipeline server user can sudo as any user except root and the optional list of users provided. For example, if you have a user that can sudo as root, then this user should be listed as an exception, so that the Pipeline user will not be able to gain root access.
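As an illustration only (the installer writes its own entry; the usernames here are hypothetical), the resulting rule has this general shape:

```
# Hypothetical sudoers entry: let the "pipeline" user run commands as any
# user EXCEPT root and the listed exception "clusteradmin"
pipeline ALL=(ALL, !root, !clusteradmin) ALL
```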

Install Pipeline with SGE already installed

If you already have SGE installed and the SGE_ROOT variable is defined on your system, you can skip SGE installation by unchecking the Oracle Grid Engine checkbox from step 3 (General Configuration). The Pipeline configuration window will now have an additional checkbox to “Enable Grid Submission” which needs to be selected if you want to use Pipeline with your pre-installed SGE.

Upon checking the “Enable Grid Submission” checkbox, you will need to select a grid plugin. In order to communicate with SGE, Pipeline uses Grid Plugins. LONI provides two plugins for SGE: the JGDI Plugin and the DRMAA Plugin. If you are using SGE, we highly recommend the JGDI Plugin, as it supports more Pipeline features and is more reliable. Choose the DRMAA Plugin if you have another DRMAA-supported grid manager installed and want to integrate Pipeline with it.

The last step is to choose the submission queue. The installer will list all of your available queues and you have to pick one for Pipeline. If you don’t have a special queue already set up for Pipeline then you can use the default queue of SGE (all.q). If you do not have any queues defined in SGE, you will have to create one yourself.

Installing Pipeline without SGE

If you don’t have SGE installed, and you uncheck the Oracle Grid Engine checkbox from step 3 (General Configuration), the installer will install Pipeline without Grid Engine. All jobs submitted to the Pipeline server will run locally on the server, so be careful with the number of jobs you submit: a high number of concurrent jobs will negatively affect the server’s performance. Please see Maximum number of threads for active jobs if you want to set limits on the number of jobs running in parallel.

2.1.4.5 Install Neuro Imaging Tools

In this section you can select which imaging software tools and server library files to install.

There are two components that can be selected for each NeuroImaging tool:
     • the tool itself (binaries, executables, and scripts)
     • the modules/workflows (.pipe files) associated with that tool.

You may select either or both options for any tool, but please note that you can only install workflows for tools that are already installed or that you have selected to install. If you select the workflows for a tool but not the tool itself, and the tool cannot be found in the default installation directory (shared file system path + “tools”), then you will be prompted to provide the location where that tool is installed (second image). If you find yourself here by mistake, click Back and modify your selection.

If the installation type for a tool is “Automatic”, it will be installed automatically without the need for user input. Some tools are marked as “Semi-Automatic” (e.g. FSL and FreeSurfer), which means that they require you to manually download the installer files for that tool from the developer’s website. This is because of the licensing restrictions imposed on the software. For these tools, after clicking ‘Next’ you will be shown a window containing instructions on which website to visit, which files to download, and any other requirements for that tool.

When you satisfy all the requirements for a tool, it will begin installing in the background immediately. A green check mark will appear next to that tool in the drop-down menu, indicating that you have provided the necessary information and can move on to the next tool. You may preemptively cancel the installation of a tool by clicking the ‘Don’t install’ button at the bottom of the window. When all tools are either installing or cancelled, this window will close automatically.

Install the tools without installing Pipeline or SGE

If, at a later time, you want to install updated versions of some tools, you can install them without installing the Pipeline server and/or SGE. Simply check only the NeuroImaging Tools component in the general configuration section of the installer and click Next; the installer will go directly to the tools installation step, skipping the Pipeline and SGE installation steps.

Please note that NeuroImaging tools can only be installed if you have also selected to install the Pipeline Server or already have the server installed. If you select these tools without selecting to install the server, and the preferences.xml file cannot be found in its default location, a browse button will appear so that you can provide the location of your preferences.xml file. If you don’t have a preferences file, you have not installed the server yet, and you should select it during the installation process.

2.1.4.6 Install Bioinformatics Tools

The process for installing Bioinformatics Tools is the same as for NeuroImaging Tools (outlined in the previous step), except there are currently no “Semi-Automatic” tools in this section. Note that this is the final step before the Pipeline installation utility takes over and starts to download/install files, so only hit ‘Install’ if you are sure that all of your previous settings are correct.

Install the tools without installing Pipeline or SGE

Just as with NeuroImaging tools, Bioinformatics tools can only be installed if you have also selected to install the Pipeline Server or already have the server installed. If you select these tools without selecting to install the server, and the preferences.xml file cannot be found in its default location, a browse button will appear so that you can provide the location of your preferences.xml file. If you also selected NeuroImaging tools and already indicated the path to your preferences file in the NI Tools Configuration panel, you will not see this button.

2.1.4.7 Finish Install

After the installation has successfully completed, you will be shown a summary screen. Clicking the Finish button with “Start the LONI Pipeline Server” checked will exit the installer and launch the Pipeline server. You can also check the “Start Client to validate the installation” option to launch the client and test run a workflow.

Additionally, you may want to configure advanced server preferences by clicking on “Configure the server with advanced options…”. This will automatically open the server configuration tool, where you can edit the details of your server.

If you have any questions, please contact pipeline@loni.usc.edu.

2.1.4.8 Start the Server

If you checked the “Start the LONI Pipeline Server” option on the summary page of the installation, the Pipeline server process will be started. To check the logs of the Pipeline server, go to the Pipeline server’s directory (/usr/pipeline by default), specified in the Install Pipeline step. You will find files called outputStream.log and errorStream.log, which store output and error stream information. You can verify if the server started successfully by checking the contents of the outputStream.log file. It should look something like this:

[ 1/6 ] Connecting to Persistence Database..............DONE [117ms]
[ 2/6 ] Starting server on port 8001....................DONE [1152ms]
[ 3/6 ] Loading server library..........................DONE [31ms]
[ 4/6 ] Loading server packages info....................DONE [7ms]
[ 5/6 ] Checking to resume backlogged workflows.........DONE [0ms]
[ 6/6 ] Checking to resume active workflows.............DONE [0ms]
[ SUCCESS ] Server started.
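A quick way to check for the success line from a shell, assuming the default install directory from the Install Pipeline step:

```shell
# Look for the success marker in the server's output log.
log=/usr/pipeline/outputStream.log   # default location; adjust if you changed it
if grep -q "Server started" "$log" 2>/dev/null; then
  status="server started successfully"
else
  status="no success line yet; check $log and errorStream.log"
fi
echo "$status"
```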

You can stop and start the Pipeline server by calling (root access required):

/etc/init.d/pipeline stop
/etc/init.d/pipeline start

The Pipeline server and persistence database will be started/stopped in order, and the pipeline user will run these processes.

If you don’t have root access, you can stop and start the Pipeline server as the pipeline user. It will be equivalent to the init.d method above. To stop and start the Pipeline server, go to the Pipeline server’s directory and type

./killServer.sh
./launchServer.sh

Always check that the server has started successfully by viewing the outputStream.log file. If it shows an error related to the persistence database, you can stop and start the persistence database process by typing:

./db/stopDB.sh
./db/startDB.sh

After the persistence database has been restarted, restart the Pipeline server as noted above.

2.1.5 Command Line Installation

An alternative to using the GUI to install the Pipeline server is an automated method that relies on a configuration file. All of the fields that are entered via the GUI are represented within a hierarchical XML file. A default configuration file is included in the distribution directory (dist/install_files) of the installer, which you can download here. After you set up your configuration file, you can run the installation in automatic mode by typing the following into your shell:

tar -zxvf pipelineServerInstaller.tar.gz
cd pipelineServerInstaller
./launchInstaller.sh -auto dist/install_files/DefaultInstallationPreferencesFile.xml

A complete template for the XML file can be found here. If you use this template as a starting point, note that it has a lot of placeholders and is not set up to run “as is”, so you would have to make many modifications. For reference, each of the tags is documented below:

  • DistributedPipelineServerInstaller: root tag, contains all other tags
  • SharedFileSystemPath: path to a directory that is shared (via NFS) between the host running the Pipeline server and qmaster, admin, and execution hosts of SGE
  • JDKLocation: only include this tag if you don’t already have Oracle JDK running on the host where you’re installing the Pipeline server; the value should be the path to the JDK RPM, which you can install from the Oracle page
  • PipelineServer: use attribute enabled="true" to indicate that you would like to install the Pipeline server; the children of this element will specify information about the server installation
    • InstallLocation: specifies location where Pipeline server is to be installed
    • Hostname: specifies the hostname of the host where Pipeline server is being installed
    • Port: specifies port on which the Pipeline server will be accepting connections from clients
    • Username: specifies user that will be running the Pipeline server
    • TempDir: specifies a directory where Pipeline modules will write intermediate files
    • ScratchDir: specifies a scratch directory where sample workflows will write their outputs; this value then becomes available to users through the pre-defined ${tempdir} variable, documented here
    • GridSubmission: use attribute enabled="true" to indicate that you would like the Pipeline to submit jobs via grid engine to execution hosts; otherwise, the jobs will be run locally on the host running the Pipeline server
      • GridPlugin: options are JGDI or DRMAA
      • GridSubmissionQueue: the SGE queue where Pipeline should submit its jobs
    • UsePrivilegeEscalation: options are true or false; privilege escalation is documented here
    • DBInstallLocation: path to a directory where you would like to install the Pipeline database; if it doesn’t exist, it will be created by the installer
    • StartPipelineOnSystemStartup: set the value to true if you would like to configure the system to start the Pipeline server on startup; false otherwise
    • AuthenticationModule: options are SSH, NIS, and NoAuth; these are documented here
    • ModifySudoers: use attribute enabled="true" to indicate that you want to add the Pipeline user to the sudoers list
      • SuperUsers: comma-separated list of users that you don’t want the Pipeline server to sudo as (default: root)
    • MemoryAllocation: specify the amount of memory you would like to allocate to the Pipeline server/database, in megabytes
  • PreferencesPath: if the Pipeline server is not being installed (i.e., the PipelineServer element is missing or has attribute enabled="false"), then the user must specify the path to the Pipeline server preferences file (by default, the path is /usr/pipeline/preferences.xml); if the Pipeline server is being installed, you can omit this element.
  • SGE: use attribute enabled="true" to indicate that you would like to install Son of Grid Engine; the tags that follow will describe some of the preferences for the installation; you can find documentation on SGE here
    • SGERoot: path to directory where you would like to install SGE (default: /usr/local/sge)
    • SGECluster: name of cluster that you would like to install (default: cluster)
    • SubmitHosts: specify hostnames of machines which will be configured to handle job submission and control; you can do this using one hostname per Host element, as children of the SubmitHosts element
    • ExecHosts: specify hostnames of machines which will be execution hosts; use same format as for SubmitHosts
    • AdminHosts: specify hostnames of machines that will be used for SGE administration purposes; use same format as for SubmitHosts
    • AdminUsername: user that will serve as SGE administrator
    • SpoolDir: path to a directory that will be used for spooling during installation
    • Queue: use attribute configure="true" to indicate that you would like to configure a queue at the end of SGE installation; this is documented here
      • Name: the name of the new queue that you would like to configure
      • Hosts: the hosts that you would like to add to the queue
      • Slots: the slots that you would like to add to the queue (the difference between hosts and slots is documented here)
  • Tools: use the attribute enabled="true" to indicate that you would like to install some tools; also use the path attribute to specify the directory where you would like to install the tools (note that this should be in an NFS-shared directory)
    • NeuroImagingTools: use the attribute enabled="true" to indicate that you would like to install one or more NeuroImaging tools; true/false values for the all_executables and all_serverlibs tags indicate that you want to install the executables and/or .pipe files for all NeuroImaging tools, regardless of what values each tool is set to.
      • Available neuroimaging tools: AFNI, AIR, BrainSuite, FSL, FreeSurfer, LONI, MINC, ITK, DTK, GAMMA, and SPM; for each of these, the executables="true" attribute is used to activate the tool installation and the serverlib="true" attribute is used to activate the .pipe files for that tool; note that FSL, FreeSurfer, and DTK require that the user specify a sub element, namely ArchivePath, whose value is the path to the archive file, downloaded manually from the software website.
    • BioinformaticsTools: same attributes as NeuroImagingTools tag
      • Available bioinformatics tools: EMBOSS, Picard, MSA, BATWING, BayesAss, Formatomatic, GENEPOP, Migrate, GWASS, MrFAST, Bowtie, SamTools, PLINK, MAQ, miBLAST; again, the enabled attributes can be used to indicate activation or deactivation of installation for each of these elements
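Putting a few of the tags above together, a pared-down preferences file might look like the sketch below. The hostnames, paths, and queue name are placeholders; see the full template for all options:

```xml
<DistributedPipelineServerInstaller>
  <SharedFileSystemPath>/nfs/pipeline</SharedFileSystemPath>
  <PipelineServer enabled="true">
    <InstallLocation>/usr/pipeline</InstallLocation>
    <Hostname>server.example.org</Hostname>
    <Port>8001</Port>
    <Username>pipeline</Username>
    <GridSubmission enabled="true">
      <GridPlugin>JGDI</GridPlugin>
      <GridSubmissionQueue>all.q</GridSubmissionQueue>
    </GridSubmission>
    <AuthenticationModule>SSH</AuthenticationModule>
  </PipelineServer>
  <SGE enabled="true">
    <SGERoot>/usr/local/sge</SGERoot>
    <SGECluster>cluster</SGECluster>
    <ExecHosts>
      <Host>node1.example.org</Host>
    </ExecHosts>
  </SGE>
</DistributedPipelineServerInstaller>
```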

2.1.6 Troubleshoot

The following is a list of common problems and explanation:

– The provided directory seems not to be a network file shared (NFS) directory.
The installer will verify the Shared File System location given. It must be on NFS if the server is set to use a grid. The shared file system is used by the Pipeline server to store intermediate workflow files and as the installation location for Grid Engine and the NeuroImaging and Bioinformatics Tools.

– For a Grid Engine installation, the local hostname cannot be “localhost” and/or the IP address cannot be of the form 127.0.*.*
You must provide fully qualified domain names as hostnames (such as “host1.example.org”); “localhost” and “127.0.0.1” are not allowed.

– Cannot enable Grid submission as SGE doesn’t have any queue.
If you do not have any queues defined in SGE, you have to create one yourself, then recheck the “Enable Grid submission” checkbox and select the queue.

– Why can’t I connect to the server?
If you have the Pipeline server running but your client can’t connect to it (it shows a “Server not found” message), check your firewall settings and make sure the Pipeline port (8001 by default) is open.
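From a client machine, a rough reachability check can be done with bash’s /dev/tcp device (a sketch; the hostname below is a placeholder for your server):

```shell
# Try to open a TCP connection to the Pipeline port from a client machine.
# server.example.org is a placeholder for your Pipeline server's hostname.
host=server.example.org
port=8001
if (exec 3<>"/dev/tcp/$host/$port") 2>/dev/null; then
  reach="port $port on $host is reachable"
else
  reach="cannot reach $host:$port (firewall, DNS, or server down)"
fi
echo "$reach"
```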

– Why is my first workflow taking so long?
When you have SGE installed and you submit jobs for the first time, it may take a long time for the jobs to start running. This is because SGE initially sees the compute nodes as heavily loaded; as time passes, the load information is updated more accurately.

2.2 Conventional Installation (without DPS utility)

If you’d like to install the Pipeline server by hand, here are some instructions on how to get started. Note that if you choose this route, you’ll have to carry out quite a bit of configuration on your own. This is only recommended if you’ve done it before or have a thorough understanding of the inner workings of the Pipeline server. Otherwise, use the DPS utility.

2.2.1 Requirements

The Pipeline server can run on any system supported by JRE 1.6 or higher, so the first thing to do is head over to the official Java website and download the latest JRE/JDK. If you run the server on Windows, you will not be able to use privilege escalation (you might not even need/want it). Also, the Failover feature is only supported on Unix/Linux systems. All other features are available on all platforms.

The amount of memory required varies based on the load you expect on the server, but for a reference point, as of summer 2010, the main Pipeline server running at LONI has been set to accept a maximum load of 620 jobs, and its memory footprint hovers between 50 and 300 MB depending on the load and garbage collection scheme.

2.2.2 Downloading

Head over to the Pipeline download page and download the latest version of the program for Linux/Unix. The server and the client are both in the same jar file, so you only need to change the Main entry point when starting up the server. Extract the contents of the download to the location where you want to install the server.

2.2.3 Starting the server

Now let’s start the server for the first time. Get to a prompt, switch to the directory where you copied Pipeline.jar and the lib directory, and type:

$ java -classpath Pipeline.jar server.Main

Assuming you have java in your path, you should see the following message in your terminal window:

[ 1/6 ] Connecting to Persistence Database..............DONE [61ms]
[ 2/6 ] Starting server on port 8001....................DONE [747ms]
[ 3/6 ] Loading server library..........................DONE [336ms]
[ 4/6 ] Loading server packages info....................DONE [2ms]
[ 5/6 ] Checking to resume backlogged workflows.........DONE [46ms]
[ 6/6 ] Checking to resume active workflows.............DONE [0ms]
[ SUCCESS ] Server started.

That’s not enough for a fully functional server yet, but we’re a step closer, so go ahead and break out of the process by hitting Ctrl-C, and then let’s begin the configuration process.


 


  1. Introduction
  2. Installation
    1. Requirements
    2. Downloading
    3. Setup and launching
  3. Interface overview
    1. Connection manager
  4. Building a workflow
    1. Dragging in modules
    2. Connecting modules
    3. Setting parameter values
    4. Processing multiple inputs
    5. Enable/Disable parameters
    6. Saving a workflow
  5. Execution
    1. Executing a workflow
    2. Viewing output

1. Introduction

This Quick Start Guide to the LONI Pipeline covers the fundamentals of building a Pipeline. For a more detailed description of Pipeline features, please see the User Guide.

2. Installation

2.1 Requirements

The only requirement of the Pipeline client is an installation of JRE 1.6 or higher, which can be downloaded from Oracle. In terms of memory consumption, it’s unlikely that you’ll need to worry about having sufficient RAM to run the Pipeline.
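On Linux, your system may have java installed by default, but it may not be Oracle’s build; Oracle’s JRE identifies itself with a “Java HotSpot(TM)” line in the output of java -version. A quick check (a sketch; output varies by system):

```shell
# Report which java is on the PATH and what "java -version" says about it.
if command -v java >/dev/null 2>&1; then
  info="$(command -v java): $(java -version 2>&1 | head -n 1)"
else
  info="no java binary on PATH; install a JRE (1.6 or higher)"
fi
echo "$info"
```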

2.2 Downloading

To get the latest version of the LONI Pipeline, go to the Pipeline web site and click on the download link in the navbar at the top. A LONI account is required to download LONI software; you can fill out an application here.

2.3 Setup and launching

OS X: To install the program, double click the disk image file you downloaded, and drag the LONI Pipeline application into the Applications folder. Once the program is done copying you can unmount (eject) the disk image and throw it in the trash. To start the Pipeline, just go to your Applications folder and double-click on the LONI Pipeline application.

Windows: To install on Windows, double-click the installer and follow the on-screen instruction. Once it finishes installing, you can throw away the installer and launch the program by going to the Start menu->Programs->LONI Pipeline and start the program.

Linux/Unix: Extract the contents of the file to a location on disk, and execute the PipelineGUI script. Make sure you have the java binary in your path.

3. Interface overview


3.1 Connecting to Pipeline servers

If you need to connect to different Pipeline servers, go to the ‘Window’ menu and click on ‘Connections…’. Alternatively, you can click on the disconnected circles at the bottom right of the window, and in the popup menu click on ‘Connections…’.


In here you can add a connection to any Pipeline server that you want to access. If you don’t know of any servers you can add the LONI Pipeline server (cranium.loni.usc.edu) but you will need to apply for a LONI cranium account to actually connect to it. Please note this account is different from the general LONI account. Once you’ve entered the connection, go ahead and click ‘Connect’ then close the dialog. After 30 seconds or so you’ll notice that your server library has been populated with tools from the server.

4. Building a workflow

Open a new workflow by going to File->New.

4.1 Dragging in modules

Go to the server library at the left and expand the desired package. Click on a module and drag it into the workflow canvas that you just opened. Repeat this step for all other modules that you need.


4.2 Connecting modules

Each module in a workflow can have some inputs and outputs. The inputs are on the top, and the outputs on the bottom. Connect the modules by clicking on the output parameter of a module and then dragging the mouse pointer to the following module’s input parameter.

Module connection

When you attempt to make a connection, the Pipeline does some initial checking to make sure the connection is valid. For example, it won't let you connect a file type parameter to a number type parameter, or an output to another output.

4.3 Setting parameter values

Now, specify values for the input parameters of each module which do NOT have a connection to a previous module. Double click on the input parameter and select the input value, making sure to choose an input that correctly matches the parameter type (File, Directory, String, Number or Enumerated). Also, File parameters can require a specific file type, so make sure to check this too if necessary.

Once you’ve set the inputs, you’ll want to specify a destination for the output of the final module. Double-click on the output parameter and specify the path where you want the output(s) to be written to.


Note that you can mix data that is located on your computer and the computer that the server resides on, and the Pipeline will take care of moving data back and forth for you. For example, the input to a module could be located on your local drive, but you could set the output to be written to some location on the Pipeline server or vice versa.

4.4 Processing multiple inputs

One of the strengths of the LONI Pipeline is its ability to simplify the processing of multiple pieces of data using the same workflow you built to process a single input. To do this, create a Data Source and use it to feed a list of inputs into the first module. Right-click on any blank space in the workflow canvas and select 'Add Data Source'. In the dialog that opens, enter some information about the data source, and then click on the 'Data' tab. From here, you can click on 'Add files' at the bottom of the dialog to add multiple files to the list, or you can type in the path to a file manually. Note that at the bottom there is a server option in case you want the data source to represent data on another computer.


4.5 Enable/Disable parameters

Most modules have 2-3 required parameters and several more optional parameters. If you want to exercise any of those additional options, double-click on the module and you'll see a list of all the required and optional parameters for that module. For each additional option you want to use, click on the box to the left of its name to enable it; to disable it, click the box again. Note that you cannot disable parameters that are required.

4.6 Saving Workflows

In order to save a workflow, go to File->Save.

5. Execution

5.1 Executing a workflow

Once you’ve completed your workflow, you can execute the workflow by simply clicking on the ‘Play’ button at the bottom of the workflow area. If the program needs a connection to a server, it will prompt you for a username and password. If you’ve already stored a username and password to the server in your list of connections, then it will automatically connect for you.

Once all necessary connections have been made, the workflow will begin to execute.


5.2 Viewing output

As the modules continue executing, you can view the output and error streams of any completed module. Bring up the log viewer by going to Window->Log Viewer or, more easily, by right-clicking on the module you want to inspect and clicking 'Show Output Logs.' This brings up the log viewer focused on the module you clicked.

LONI Pipeline Hands-On Training Day, UCLA

When: Monday, May 21, 2012, 9AM – 3PM
Where: LONI DIVE Theater, NRB #225, UCLA (Map)

Are you interested in learning more about LONI Pipeline? Are you a programming expert wanting to know more about integrating your tools with Pipeline? Are you in charge of a compute cluster and think that Pipeline might be right for you and your lab? Then you should attend one or more of these hands-on Pipeline training sessions! Please bring your own laptop.

9am-11am – Pipeline Workflow Basics: We'll teach you how to get started creating your own Pipeline workflows and set you on your way to becoming a Pipeline expert in no time. This session will focus specifically on the hands-on construction of Pipeline workflows from the ground up. Ideal for students and trainees with limited programming experience but a need to analyze large data sets using existing processing tools. Bring your PC/Mac laptop. Pipeline software will be provided. Topics covered include parts of the user guide and handbook.

11am-1pm – Pipeline Module Library Construction: Got a set of Linux command-line C, Java, Python, or Bash programs that you would like to see as Pipeline modules? We'll help you create a personal library of Pipeline modules and workflows from your own tools, making them shareable with members of your lab and others! Have your laptop handy with access to your software routines and we'll get you started. Topics covered include parts of the user guide and handbook.

1pm-3pm – Pipeline Server Installation, User, and Job Management: Are you a lab or center IT manager looking for a grid-enabled compute environment for your users? Want user and job management tools to help maximize cluster utility? Are you a lab PI with a new cluster and want to put it to good use? Then this session is for you. We'll describe all of the steps for deploying Pipeline on your own compute cluster and how to manage users, track job submissions, and more. Topics covered include the server guide.

Registration

The registration is closed.

The event flier can be downloaded here (pdf format, 4 MB).

The LONI Pipeline Web Start (PWS) allows users to start the LONI Pipeline application directly from the web browser and run it locally without any installation. It has all the features and functionality of the downloadable stand-alone Pipeline application and allows anonymous guest access or user-authentication to connect to remote Pipeline servers.


You can explore and run predefined Pipeline resources (data, executable modules, workflows and services) on the LONI Pipeline Training server. You can interactively view a summary of a workflow, download the .pipe XML file, load and launch the workflow using Pipeline Web Start.


March 16-17, 2012 in Cincinnati, OH. This workshop will provide hands-on training on design of heterogeneous data analysis protocols, sharing of data and pipeline workflows, integration of imaging, demographic and meta-data, Pipeline server installation, utilization of Grid resources using the distributed Pipeline computational infrastructure, and much more. For event details and registration, please click here.

March 16-17, 2012

Cincinnati Children’s Hospital Medical Center (CCHMC)
Presenters: Ivo D. Dinov, John D Van Horn, Petros Petrosyan

The LONI Pipeline environment is a free distributed workflow application that enables users to quickly create computational processing protocols, execute these as graphical workflows, monitor the state of their analyses, and broadly disseminate the detailed provenance of data and processing protocols. This workshop will provide hands-on training on remote Pipeline server installation, client-server interfaces, utilization of Grid resources using the distributed Pipeline computational infrastructure, design of new and modification of existing heterogeneous data analysis protocols, sharing of data and pipeline workflows, and integration of imaging, demographic, and meta-data. The organizers will provide a detailed training handbook and supplementary electronic media with all of the necessary software and test data.

Who may be interested in this event?
Neuroimaging researchers, computational scientists, bioinformaticians.

Click here to view the program agenda >>

Registration

The registration is closed.

Contact Information

Pipeline Training Coordinator
Ivo D. Dinov
Laboratory of Neuro Imaging,
Department of Neurology,
UCLA School of Medicine
635 Charles Young Drive South, Suite 225
Los Angeles, CA 90095-7334
Phone: 310-206-2101
Fax: 310-206-5518
Click here to submit an inquiry >>

Event Location
Cincinnati Children’s Hospital
Medical Education Research Center (MERC) Building, Classroom 2005 [building map]
620 Oak Street
Cincinnati, Ohio 45206
Click here for Maps, Parking, and Directions to CCHMC >>

Hotel Information

We have 30 rooms blocked at a special rate of $99 per night at the Kingsgate Marriott Conference Center.

Address: 151 Goodman Drive, Cincinnati, OH 45219
Contact: Please call Marriott Reservations at 1 (800) 453-0309 or (513) 487-3800 on or before Thursday, February 23, 2012 (the "Cut-Off Date") to make room reservations. Please identify yourself as part of the LONI Pipeline Training group. Check-in is on Thursday, March 15, 2012, and check-out is on Saturday, March 17, 2012.

Event Flier

The event flier can be downloaded here (pdf format, 683 KB).

Pipeline Web Start and Server Library

Pipeline Web Start and server library for the training event

Evaluation Survey

Click here for the Pipeline survey.

Setting Up Pipeline Web Start

You can set up a web page hosting links that launch Pipeline Web Start and submit workflows as randomly generated anonymous users, such as this one. There are three components you need to set up: a Pipeline server running on your machine, a set of Pipeline workflows you would like to host for others to use, and a PHP script that generates customized Java Web Start files (JNLP files). Follow these steps:

  1. A Pipeline server configured* and running. [server installation guide]
    *You need to add/modify a few elements in your server's preferences.xml file: <GuestsEnabled>true</GuestsEnabled>
    This enables guest login on the server.
    <UsePrivilegeEscalation>true</UsePrivilegeEscalation>
    <MappedGuestUser>joe</MappedGuestUser>
    These keep privilege escalation for regular users and map guest users to the username joe on the system. This is the recommended configuration, as it provides more security. For a trusted/testing environment, you may set UsePrivilegeEscalation to false and omit MappedGuestUser, i.e.:
    <UsePrivilegeEscalation>false</UsePrivilegeEscalation>
    This makes all submitted jobs run as the user that launched the server process.
  2. Prepare all of the workflows you want to distribute and run. Put them on your web server.
  3. Download this PHP script (.zip file 2KB) and put it on your web server. If your server is running the latest version, the script is ready to go. If your server is running an old version that is incompatible with the latest client, you need to specify an earlier client version that's compatible with the server. Follow the instructions at the top of the script to modify the $codebase variable.
  4. You are done. Create a web page that includes a list of links to these workflows in this format:

    Where [URL_TO_JNLP_WRITER.PHP] is the URL of the PHP file in step 3, [URL_TO_WORKFLOW] is the URL of the workflow (.pipe file) you prepared in step 2, and [SERVER_NAME] is the Pipeline server name you set up in step 1. For example:
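Putting the server-side settings from step 1 together, the relevant fragment of preferences.xml might look like the following sketch (the placement of these elements within the file is an assumption, and joe is a placeholder for a low-privilege system account on your server):

```xml
<!-- Fragment of the Pipeline server's preferences.xml (placement assumed). -->
<!-- Recommended setup: guests enabled, privilege escalation kept for regular
     users, and guest users mapped to a low-privilege system account. -->
<GuestsEnabled>true</GuestsEnabled>
<UsePrivilegeEscalation>true</UsePrivilegeEscalation>
<MappedGuestUser>joe</MappedGuestUser>
```

For a trusted or testing environment, set UsePrivilegeEscalation to false and omit MappedGuestUser, as described in step 1.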

Additional Features

When the <GuestsEnabled> tag in the server’s preferences.xml is set to true, the server allows system-generated guest users to submit workflows. Guest usernames are generated randomly at the initial launch of the PWS client and remembered on the client so that previously submitted workflows can be retrieved.

If you want to disable the random generation of guests on PWS clients (for example, when the server doesn’t allow guests because <GuestsEnabled> is set to false), you can modify the PHP script so that the generated JNLP file contains the following:
<property name="pipeline.guestsEnabled" value="false"/>

If you want a given user (e.g. john) to run workflows automatically, you can add the following line above the line “<argument>-execute</argument>”
<argument>-username=john</argument>
When the user opens the JNLP file, Pipeline will pop up a credentials dialog asking for john’s password; once it is entered and verified, the workflow will start automatically.
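As a hedged sketch of where these lines sit, the argument list of the generated JNLP file would then contain the username argument immediately above -execute (the surrounding application-desc element is an assumption about the JNLP structure, not taken from the PHP script, and john is a placeholder username):

```xml
<!-- Sketch of the generated JNLP's argument list; the enclosing element
     and its attributes are assumptions. -->
<application-desc>
  <argument>-username=john</argument>
  <argument>-execute</argument>
</application-desc>
```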