Glossary
Introduction
This is a listing of commonly used Pipeline terms and their definitions.
Terms
Annotation – Notes that you can add to a workflow to remind yourself of pertinent information.
Cache – A directory in which the application creates intermediate output files, streams, and log files.
Command String – The exact command that was submitted by the Pipeline to the underlying operating system for execution.
Connection Manager – The dialog box which holds connections that you have created to Pipeline servers.
Data Sink – A special module that collects one or more output values and can be used as the output destination of one or more modules.
Data Source – A special module that contains one or more input values and can be used as the input source for one or more modules.
DRMAA – Distributed Resource Management Application API. A library, or specification, that allows applications to submit, control, and monitor jobs on one or more DRM systems.
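As a purely illustrative sketch (not part of the Pipeline itself), the snippet below submits and waits on a single job through DRMAA using the Python drmaa bindings. The executable path and its arguments are hypothetical; the Pipeline performs the equivalent submission internally when it is connected to a DRM system.

    # Illustrative DRMAA job submission with the Python "drmaa" bindings.
    # The command and arguments below are hypothetical examples.
    import drmaa

    with drmaa.Session() as session:
        jt = session.createJobTemplate()
        jt.remoteCommand = "/usr/local/bin/align_images"  # hypothetical executable
        jt.args = ["--input", "subject01.nii", "--output", "subject01_aligned.nii"]

        job_id = session.runJob(jt)
        print("Submitted job", job_id)

        # Block until the DRM system reports that the job has finished.
        info = session.wait(job_id, drmaa.Session.TIMEOUT_WAIT_FOREVER)
        print("Job", info.jobId, "exited with status", info.exitStatus)

        session.deleteJobTemplate(jt)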
Executable – A file whose contents are meant to be interpreted as a program by a computer.
Execution Dialog – A dialog that shows important messages printed by a workflow at different times during its execution.
IDA – An acronym that stands for Image Data Archive. Along with being one of the protocols that the LONI Pipeline supports, IDA offers the following benefits:
- De-identification – Addresses government regulations for the protection of human subject privacy
- Data Transmission – Data is transmitted over the internet using the Hypertext Transfer Protocol with SSL encryption (HTTPS)
- Storage – Data is archived on a fault-tolerant storage area network (SAN), providing near 24/7 availability
Module – The smallest unit of a Pipeline workflow. Specifically, it is a chunk of XML that describes an executable and its inputs and outputs. It can be created by a user and placed directly into a workflow, or a user can drag and drop predefined modules from the library of any server to which they are connected.
Module Definition – The collection of information, including the executable's author, name, package, version, description, and parameter names, that must be specified for each executable to create a module that can be used in the Pipeline. Once a module definition has been created for an executable, it can be saved in the library and reused by other users.
Module Group – A collection of modules. The Pipeline can abstract a Module Group to be represented as a single module in a workflow.
Output Log – A collection of messages printed to the application's output stream.
Package – A suite of interrelated module definitions.
Parameter – An input or output to a module.
Personal Library – A place to save workflows and modules for easy access within the Pipeline.
Pipeline – An environment to develop workflows for data processing, independent of data location, program location, and platform.
Validation – Occurs automatically when the user requests that a workflow be executed. Validation entails:
- Verifying the existence of inputs, outputs and executables
- Cycle detection
- File cardinality checks
- File type checking
Depending on workflow content, connection status, and other variables, validation may be more or less complex. Refer to the Pipeline documentation on validation for more information; a brief illustration of the cycle-detection check follows below.
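As an illustration of the cycle-detection step only (the module names and graph structure are hypothetical, and this is not the Pipeline's own implementation), the sketch below runs a depth-first search over a workflow represented as adjacency lists:

    # Illustrative cycle detection over a hypothetical module graph.
    # Each key maps a module to the modules that consume its outputs.
    def has_cycle(graph):
        WHITE, GRAY, BLACK = 0, 1, 2  # unvisited, in progress, finished
        color = {node: WHITE for node in graph}

        def visit(node):
            color[node] = GRAY
            for succ in graph.get(node, ()):
                if color.get(succ, WHITE) == GRAY:  # back edge found -> cycle
                    return True
                if color.get(succ, WHITE) == WHITE and visit(succ):
                    return True
            color[node] = BLACK
            return False

        return any(color[node] == WHITE and visit(node) for node in graph)

    # An acyclic workflow: a data source feeds two modules, which feed a data sink.
    workflow = {
        "DataSource": ["Align", "Smooth"],
        "Align": ["DataSink"],
        "Smooth": ["DataSink"],
        "DataSink": [],
    }
    print(has_cycle(workflow))  # False: the workflow passes this check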
Workflow – A set of connected modules that performs analysis or simply processes input data.