$base: "https://w3id.org/cwl/cwl#"
|
|
|
|
$namespaces:
|
|
sld: "https://w3id.org/cwl/salad#"
|
|
cwl: "https://w3id.org/cwl/cwl#"
|
|
|
|
$graph:
|
|
|
|
- name: "WorkflowDoc"
|
|
type: documentation
|
|
doc:
|
|
- |
|
|
# Common Workflow Language (CWL) Workflow Description, v1.0
|
|
|
|
This version:
|
|
* https://w3id.org/cwl/v1.0/
|
|
|
|
Current version:
|
|
* https://w3id.org/cwl/
|
|
- "\n\n"
|
|
- {$include: contrib.md}
|
|
- "\n\n"
|
|
- |
|
|
# Abstract
|
|
|
|
One way to define a workflow is: an analysis task represented by a
|
|
directed graph describing a sequence of operations that transform an
|
|
input data set to output. This specification defines the Common Workflow
|
|
Language (CWL) Workflow description, a vendor-neutral standard for
|
|
representing workflows intended to be portable across a variety of
|
|
computing platforms.
|
|
|
|
- {$include: intro.md}
|
|
|
|
- |
|
|
|
|
## Introduction to v1.0
|
|
|
|
This specification represents the first full release from the CWL group.
|
|
Since draft-3, this release introduces the following changes and additions:
|
|
|
|
* The `inputs` and `outputs` fields have been renamed `in` and `out`.
|
|
* Syntax simplifications: denoted by the `map<>` syntax. Example: `in`
|
|
contains a list of items, each with an id. Now one can specify
|
|
a mapping of that identifier to the corresponding
|
|
`InputParameter`.
|
|
```
in:
  - id: one
    type: string
    doc: First input parameter
  - id: two
    type: int
    doc: Second input parameter
```
can be
```
in:
  one:
    type: string
    doc: First input parameter
  two:
    type: int
    doc: Second input parameter
```
|
|
* The common field `description` has been renamed to `doc`.
|
|
|
|
## Purpose
|
|
|
|
The Common Workflow Language Workflow Description expresses
|
|
workflows for data-intensive science, such as Bioinformatics, Chemistry,
|
|
Physics, and Astronomy. This specification is intended to define a data
|
|
and execution model for Workflows that can be implemented on top of a
|
|
variety of computing platforms, ranging from an individual workstation to
|
|
cluster, grid, cloud, and high performance computing systems.
|
|
|
|
- {$include: concepts.md}
|
|
|
|
- name: ExpressionToolOutputParameter
|
|
type: record
|
|
extends: OutputParameter
|
|
fields:
|
|
- name: type
|
|
type:
|
|
- "null"
|
|
- "#CWLType"
|
|
- "#OutputRecordSchema"
|
|
- "#OutputEnumSchema"
|
|
- "#OutputArraySchema"
|
|
- string
|
|
- type: array
|
|
items:
|
|
- "#CWLType"
|
|
- "#OutputRecordSchema"
|
|
- "#OutputEnumSchema"
|
|
- "#OutputArraySchema"
|
|
- string
|
|
jsonldPredicate:
|
|
"_id": "sld:type"
|
|
"_type": "@vocab"
|
|
refScope: 2
|
|
typeDSL: True
|
|
doc: |
|
|
Specify valid types of data that may be assigned to this parameter.
|
|
|
|
- type: record
|
|
name: ExpressionTool
|
|
extends: Process
|
|
specialize:
|
|
- specializeFrom: "#OutputParameter"
|
|
specializeTo: "#ExpressionToolOutputParameter"
|
|
documentRoot: true
|
|
doc: |
|
|
Execute an expression as a Workflow step.
|
|
fields:
|
|
- name: "class"
|
|
jsonldPredicate:
|
|
"_id": "@type"
|
|
"_type": "@vocab"
|
|
type: string
|
|
- name: expression
|
|
type: [string, Expression]
|
|
doc: |
|
|
The expression to execute. The expression must return a JSON object which
|
|
matches the output parameters of the ExpressionTool.
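
As an illustration, the following is a minimal sketch of an ExpressionTool whose expression returns an object with one key per declared output. The parameter names `values` and `count` are hypothetical, and the sketch assumes the implementation supports `InlineJavascriptRequirement`.

```yaml
cwlVersion: v1.0
class: ExpressionTool
requirements:
  InlineJavascriptRequirement: {}
inputs:
  values: int[]        # hypothetical input name
outputs:
  count: int           # hypothetical output name
# The returned object has one key for each output parameter.
expression: "${ return {'count': inputs.values.length}; }"
```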
|
|
|
|
- name: LinkMergeMethod
|
|
type: enum
|
|
docParent: "#WorkflowStepInput"
|
|
doc: The input link merge method, described in [WorkflowStepInput](#WorkflowStepInput).
|
|
symbols:
|
|
- merge_nested
|
|
- merge_flattened
|
|
|
|
|
|
- name: WorkflowOutputParameter
|
|
type: record
|
|
extends: OutputParameter
|
|
docParent: "#Workflow"
|
|
doc: |
|
|
Describe an output parameter of a workflow. The parameter must be
|
|
connected to one or more parameters defined in the workflow that will
|
|
provide the value of the output parameter.
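
A minimal sketch of a workflow output connected to a step output may make this concrete. The names `final_report`, `summarize` and `report` are hypothetical, and the `step/output` reference form is assumed from common v1.0 usage.

```yaml
outputs:
  final_report:                      # hypothetical workflow output
    type: File
    # Takes its value from output `report` of step `summarize`.
    outputSource: summarize/report
```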
|
|
fields:
|
|
- name: outputSource
|
|
doc: |
|
|
Specifies one or more workflow parameters that supply the value of
|
|
the output parameter.
|
|
jsonldPredicate:
|
|
"_id": "cwl:outputSource"
|
|
"_type": "@id"
|
|
refScope: 0
|
|
type:
|
|
- string?
|
|
- string[]?
|
|
- name: linkMerge
|
|
type: ["null", "#LinkMergeMethod"]
|
|
jsonldPredicate: "cwl:linkMerge"
|
|
doc: |
|
|
The method to use to merge multiple sources into a single array.
|
|
If not specified, the default method is "merge_nested".
|
|
- name: type
|
|
type:
|
|
- "null"
|
|
- "#CWLType"
|
|
- "#OutputRecordSchema"
|
|
- "#OutputEnumSchema"
|
|
- "#OutputArraySchema"
|
|
- string
|
|
- type: array
|
|
items:
|
|
- "#CWLType"
|
|
- "#OutputRecordSchema"
|
|
- "#OutputEnumSchema"
|
|
- "#OutputArraySchema"
|
|
- string
|
|
jsonldPredicate:
|
|
"_id": "sld:type"
|
|
"_type": "@vocab"
|
|
refScope: 2
|
|
typeDSL: True
|
|
doc: |
|
|
Specify valid types of data that may be assigned to this parameter.
|
|
|
|
|
|
- name: Sink
|
|
type: record
|
|
abstract: true
|
|
fields:
|
|
- name: source
|
|
doc: |
|
|
Specifies one or more workflow parameters that will provide input to
|
|
the underlying step parameter.
|
|
jsonldPredicate:
|
|
"_id": "cwl:source"
|
|
"_type": "@id"
|
|
refScope: 2
|
|
type:
|
|
- string?
|
|
- string[]?
|
|
- name: linkMerge
|
|
type: LinkMergeMethod?
|
|
jsonldPredicate: "cwl:linkMerge"
|
|
doc: |
|
|
The method to use to merge multiple inbound links into a single array.
|
|
If not specified, the default method is "merge_nested".
|
|
|
|
|
|
- type: record
|
|
name: WorkflowStepInput
|
|
extends: Sink
|
|
docParent: "#WorkflowStep"
|
|
doc: |
|
|
The input of a workflow step connects an upstream parameter (from the
|
|
workflow inputs, or the outputs of other workflow steps) with the input
|
|
parameters of the underlying step.
|
|
|
|
## Input object
|
|
|
|
A WorkflowStepInput object must contain an `id` field in the form
|
|
`#fieldname` or `#stepname.fieldname`. When the `id` field contains a
|
|
period `.` the field name consists of the characters following the final
|
|
period. This defines a field of the workflow step input object with the
|
|
value of the `source` parameter(s).
|
|
|
|
## Merging
|
|
|
|
To merge multiple inbound data links,
|
|
[MultipleInputFeatureRequirement](#MultipleInputFeatureRequirement) must be specified
|
|
in the workflow or workflow step requirements.
|
|
|
|
If the sink parameter is an array, or named in a [workflow
|
|
scatter](#WorkflowStep) operation, there may be multiple inbound data links
|
|
listed in the `source` field. The values from the input links are merged
|
|
depending on the method specified in the `linkMerge` field. If not
|
|
specified, the default method is "merge_nested".
|
|
|
|
* **merge_nested**
|
|
|
|
The input must be an array consisting of exactly one entry for each
|
|
input link. If "merge_nested" is specified with a single link, the value
|
|
from the link must be wrapped in a single-item list.
|
|
|
|
* **merge_flattened**
|
|
|
|
1. The source and sink parameters must be compatible types, or the source
|
|
type must be compatible with a single element from the "items" type of
|
|
the destination array parameter.
|
|
2. Source parameters which are arrays are concatenated.
|
|
Source parameters which are single element types are appended as
|
|
single elements.
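
The two merge methods can be compared with a small sketch. The step name `combine`, the tool file `combine.cwl` and its input `items` are hypothetical; the example values in the comments are assumptions chosen for illustration.

```yaml
requirements:
  MultipleInputFeatureRequirement: {}
inputs:
  first: string[]      # e.g. ["a", "b"]
  second: string[]     # e.g. ["c"]
steps:
  combine:
    run: combine.cwl   # hypothetical tool with an array input `items`
    in:
      items:
        source: [first, second]
        linkMerge: merge_flattened
        # merge_flattened yields ["a", "b", "c"]        (sink type string[])
        # merge_nested would yield [["a", "b"], ["c"]]  (sink type string[][])
    out: []
```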
|
|
|
|
fields:
|
|
- name: id
|
|
type: string
|
|
jsonldPredicate: "@id"
|
|
doc: "A unique identifier for this workflow input parameter."
|
|
- name: default
|
|
type: ["null", Any]
|
|
doc: |
|
|
The default value for this parameter if there is no `source`
|
|
field.
|
|
jsonldPredicate: "cwl:default"
|
|
- name: valueFrom
|
|
type:
|
|
- "null"
|
|
- "string"
|
|
- "#Expression"
|
|
jsonldPredicate: "cwl:valueFrom"
|
|
doc: |
|
|
To use valueFrom, [StepInputExpressionRequirement](#StepInputExpressionRequirement) must
|
|
be specified in the workflow or workflow step requirements.
|
|
|
|
If `valueFrom` is a constant string value, use this as the value for
|
|
this input parameter.
|
|
|
|
If `valueFrom` is a parameter reference or expression, it must be
|
|
evaluated to yield the actual value to be assigned to the input field.
|
|
|
|
The value of `self` in the parameter reference or expression must be
|
|
the value of the parameter(s) specified in the `source` field, or
|
|
null if there is no `source` field.
|
|
|
|
The value of `inputs` in the parameter reference or expression must be
|
|
the input object to the workflow step after assigning the `source`
|
|
values and then scattering. The order of evaluating `valueFrom` among
|
|
step input parameters is undefined and the result of evaluating
|
|
`valueFrom` on a parameter must not be visible to evaluation of
|
|
`valueFrom` on other parameters.
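
As a sketch of how `source` and `valueFrom` interact, the step below builds a string from a workflow input. The names `greet`, `echo.cwl`, `message` and `name` are hypothetical.

```yaml
requirements:
  StepInputExpressionRequirement: {}
inputs:
  name: string
steps:
  greet:
    run: echo.cwl      # hypothetical tool with a string input `message`
    in:
      message:
        source: name
        # `self` is the value delivered by `source`; with name "world"
        # the step receives "Hello, world".
        valueFrom: "Hello, $(self)"
    out: []
```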
|
|
|
|
|
|
- type: record
|
|
name: WorkflowStepOutput
|
|
docParent: "#WorkflowStep"
|
|
doc: |
|
|
Associate an output parameter of the underlying process with a workflow
|
|
parameter. The workflow parameter (given in the `id` field) may be used
|
|
as a `source` to connect with input parameters of other workflow steps, or
|
|
with an output parameter of the process.
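
The sketch below shows the two ways a step's `out` field can be written, and how a later step refers to an output. All step, file and parameter names are hypothetical.

```yaml
steps:
  align:
    run: align.cwl               # hypothetical
    in:
      reads: reads
    out: [aligned]               # shorthand: a list of output ids
  index:
    run: index.cwl               # hypothetical
    in:
      alignment: align/aligned   # uses the output of `align` as a source
    out:
      - id: indexed              # explicit WorkflowStepOutput object
```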
|
|
fields:
|
|
- name: id
|
|
type: string
|
|
jsonldPredicate: "@id"
|
|
doc: |
|
|
A unique identifier for this workflow output parameter. This is the
|
|
identifier to use in the `source` field of `WorkflowStepInput` to
|
|
connect the output value to downstream parameters.
|
|
|
|
|
|
- name: ScatterMethod
|
|
type: enum
|
|
docParent: "#WorkflowStep"
|
|
doc: The scatter method, as described in [workflow step scatter](#WorkflowStep).
|
|
symbols:
|
|
- dotproduct
|
|
- nested_crossproduct
|
|
- flat_crossproduct
|
|
|
|
|
|
- name: WorkflowStep
|
|
type: record
|
|
docParent: "#Workflow"
|
|
doc: |
|
|
A workflow step is an executable element of a workflow. It specifies the
|
|
underlying process implementation (such as `CommandLineTool` or another
|
|
`Workflow`) in the `run` field and connects the input and output parameters
|
|
of the underlying process to workflow parameters.
|
|
|
|
# Scatter/gather
|
|
|
|
To use scatter/gather,
|
|
[ScatterFeatureRequirement](#ScatterFeatureRequirement) must be specified
|
|
in the workflow or workflow step requirements.
|
|
|
|
A "scatter" operation specifies that the associated workflow step or
|
|
subworkflow should execute separately over a list of input elements. Each
|
|
job making up a scatter operation is independent and may be executed
|
|
concurrently.
|
|
|
|
The `scatter` field specifies one or more input parameters which will be
|
|
scattered. An input parameter may be listed more than once. The declared
|
|
type of each input parameter is implicitly wrapped in an array for each
|
|
time it appears in the `scatter` field. As a result, upstream parameters
|
|
which are connected to scattered parameters may be arrays.
|
|
|
|
All output parameter types are also implicitly wrapped in arrays. Each job
|
|
in the scatter results in an entry in the output array.
|
|
|
|
If `scatter` declares more than one input parameter, `scatterMethod`
|
|
describes how to decompose the input into a discrete set of jobs.
|
|
|
|
* **dotproduct** specifies that the input arrays are aligned and one
|
|
element is taken from each array to construct each job. It is an error
|
|
if the input arrays are not all the same length.
|
|
|
|
* **nested_crossproduct** specifies the Cartesian product of the inputs,
|
|
producing a job for every combination of the scattered inputs. The
|
|
output must be nested arrays for each level of scattering, in the
|
|
order that the input arrays are listed in the `scatter` field.
|
|
|
|
* **flat_crossproduct** specifies the Cartesian product of the inputs,
|
|
producing a job for every combination of the scattered inputs. The
|
|
output arrays must be flattened to a single level, but otherwise listed in the
|
|
order that the input arrays are listed in the `scatter` field.
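
The following sketch scatters a step over two inputs. The tool `process.cwl`, its inputs `sample` and `lane`, and the example values are hypothetical; the comments describe how each `scatterMethod` would decompose the same inputs.

```yaml
requirements:
  ScatterFeatureRequirement: {}
inputs:
  samples: string[]    # e.g. ["s1", "s2"]
  lanes: int[]         # e.g. [1, 2]
steps:
  process:
    run: process.cwl   # hypothetical tool with inputs `sample: string`, `lane: int`
    scatter: [sample, lane]
    scatterMethod: dotproduct
    in:
      sample: samples
      lane: lanes
    out: [result]
# dotproduct:          2 jobs, (s1, 1) and (s2, 2); `result` has 2 entries
# nested_crossproduct: 4 jobs; `result` is [[(s1,1), (s1,2)], [(s2,1), (s2,2)]]
# flat_crossproduct:   4 jobs; `result` is [(s1,1), (s1,2), (s2,1), (s2,2)]
```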
|
|
|
|
# Subworkflows
|
|
|
|
To specify a nested workflow as part of a workflow step,
|
|
[SubworkflowFeatureRequirement](#SubworkflowFeatureRequirement) must be
|
|
specified in the workflow or workflow step requirements.
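
A minimal sketch of a step that runs a nested workflow follows. The file name `preprocess-workflow.cwl` and the parameter names are hypothetical; the referenced file is assumed to contain a `class: Workflow` document.

```yaml
requirements:
  SubworkflowFeatureRequirement: {}
steps:
  preprocess:
    run: preprocess-workflow.cwl   # hypothetical nested workflow
    in:
      raw: raw_data
    out: [cleaned]
```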
|
|
|
|
fields:
|
|
- name: id
|
|
type: string
|
|
jsonldPredicate: "@id"
|
|
doc: "The unique identifier for this workflow step."
|
|
- name: in
|
|
type: WorkflowStepInput[]
|
|
jsonldPredicate:
|
|
_id: "cwl:in"
|
|
mapSubject: id
|
|
mapPredicate: source
|
|
doc: |
|
|
Defines the input parameters of the workflow step. The process is ready to
|
|
run when all required input parameters are associated with concrete
|
|
values. Input parameters include a schema for each parameter which is
|
|
used to validate the input object. It may also be used to build a user
|
|
interface for constructing the input object.
|
|
- name: out
|
|
type:
|
|
- type: array
|
|
items: [string, WorkflowStepOutput]
|
|
jsonldPredicate:
|
|
_id: "cwl:out"
|
|
_type: "@id"
|
|
identity: true
|
|
doc: |
|
|
Defines the parameters representing the output of the process. May be
|
|
used to generate and/or validate the output object.
|
|
- name: requirements
|
|
type: ProcessRequirement[]?
|
|
jsonldPredicate:
|
|
_id: "cwl:requirements"
|
|
mapSubject: class
|
|
doc: |
|
|
Declares requirements that apply to either the runtime environment or the
|
|
workflow engine that must be met in order to execute this workflow step. If
|
|
an implementation cannot satisfy all requirements, or a requirement is
|
|
listed which is not recognized by the implementation, it is a fatal
|
|
error and the implementation must not attempt to run the process,
|
|
unless overridden at user option.
|
|
- name: hints
|
|
type: Any[]?
|
|
jsonldPredicate:
|
|
_id: "cwl:hints"
|
|
noLinkCheck: true
|
|
mapSubject: class
|
|
doc: |
|
|
Declares hints applying to either the runtime environment or the
|
|
workflow engine that may be helpful in executing this workflow step. It is
|
|
not an error if an implementation cannot satisfy all hints; however,
|
|
the implementation may report a warning.
|
|
- name: label
|
|
type: string?
|
|
jsonldPredicate: "rdfs:label"
|
|
doc: "A short, human-readable label of this process object."
|
|
- name: doc
|
|
type: string?
|
|
jsonldPredicate: "rdfs:comment"
|
|
doc: "A long, human-readable description of this process object."
|
|
- name: run
|
|
type: [string, Process]
|
|
jsonldPredicate:
|
|
"_id": "cwl:run"
|
|
"_type": "@id"
|
|
doc: |
|
|
Specifies the process to run.
|
|
- name: scatter
|
|
type:
|
|
- string?
|
|
- string[]?
|
|
jsonldPredicate:
|
|
"_id": "cwl:scatter"
|
|
"_type": "@id"
|
|
"_container": "@list"
|
|
refScope: 0
|
|
- name: scatterMethod
|
|
doc: |
|
|
Required if `scatter` is an array of more than one element.
|
|
type: ScatterMethod?
|
|
jsonldPredicate:
|
|
"_id": "cwl:scatterMethod"
|
|
"_type": "@vocab"
|
|
|
|
|
|
- name: Workflow
|
|
type: record
|
|
extends: "#Process"
|
|
documentRoot: true
|
|
specialize:
|
|
- specializeFrom: "#OutputParameter"
|
|
specializeTo: "#WorkflowOutputParameter"
|
|
doc: |
|
|
A workflow describes a set of **steps** and the **dependencies** between
|
|
those steps. When a step produces output that will be consumed by a
|
|
second step, the first step is a dependency of the second step.
|
|
|
|
When there is a dependency, the workflow engine must execute the preceding
|
|
step and wait for it to successfully produce output before executing the
|
|
dependent step. If two steps are defined in the workflow graph that
|
|
are not directly or indirectly dependent, these steps are **independent**,
|
|
and may execute in any order or execute concurrently. A workflow is
|
|
complete when all steps have been executed.
|
|
|
|
Dependencies between parameters are expressed using the `source` field on
|
|
[workflow step input parameters](#WorkflowStepInput) and [workflow output
|
|
parameters](#WorkflowOutputParameter).
|
|
|
|
The `source` field expresses the dependency of one parameter on another
|
|
such that when a value is associated with the parameter specified by
|
|
`source`, that value is propagated to the destination parameter. When all
|
|
data links inbound to a given step are fulfilled, the step is ready to
|
|
execute.
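
To illustrate, here is a sketch of a two-step workflow in which the second step depends on the first. The tool files and parameter names are hypothetical.

```yaml
cwlVersion: v1.0
class: Workflow
inputs:
  archive: File
outputs:
  line_count:
    type: File
    outputSource: count/output
steps:
  extract:
    run: extract.cwl            # hypothetical
    in:
      tarball: archive          # ready as soon as the workflow input is set
    out: [extracted]
  count:
    run: count.cwl              # hypothetical
    in:
      file: extract/extracted   # makes `extract` a dependency of `count`
    out: [output]
```

Here `count` cannot start until `extract` has produced `extracted`, while any step that consumed only `archive` would be independent of both.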
|
|
|
|
## Workflow success and failure
|
|
|
|
A completed step must result in one of `success`, `temporaryFailure` or
|
|
`permanentFailure` states. An implementation may choose to retry a step
|
|
execution which resulted in `temporaryFailure`. An implementation may
|
|
choose to either continue running other steps of a workflow, or terminate
|
|
immediately upon `permanentFailure`.
|
|
|
|
* If any step of a workflow execution results in `permanentFailure`, then
|
|
the workflow status is `permanentFailure`.
|
|
|
|
* If one or more steps result in `temporaryFailure` and all other steps
|
|
complete with `success` or are not executed, then the workflow status is
|
|
`temporaryFailure`.
|
|
|
|
* If all workflow steps are executed and complete with `success`, then the
|
|
workflow status is `success`.
|
|
|
|
# Extensions
|
|
|
|
[ScatterFeatureRequirement](#ScatterFeatureRequirement) and
|
|
[SubworkflowFeatureRequirement](#SubworkflowFeatureRequirement) are
|
|
available as standard [extensions](#Extensions_and_Metadata) to core
|
|
workflow semantics.
|
|
|
|
fields:
|
|
- name: "class"
|
|
jsonldPredicate:
|
|
"_id": "@type"
|
|
"_type": "@vocab"
|
|
type: string
|
|
- name: steps
|
|
doc: |
|
|
The individual steps that make up the workflow. Each step is executed when all of its
|
|
input data links are fulfilled. An implementation may choose to execute
|
|
the steps in a different order than listed and/or execute steps
|
|
concurrently, provided that dependencies between steps are met.
|
|
type:
|
|
- type: array
|
|
items: "#WorkflowStep"
|
|
jsonldPredicate:
|
|
mapSubject: id
|
|
|
|
|
|
- type: record
|
|
name: SubworkflowFeatureRequirement
|
|
extends: ProcessRequirement
|
|
doc: |
|
|
Indicates that the workflow platform must support nested workflows in
|
|
the `run` field of [WorkflowStep](#WorkflowStep).
|
|
fields:
|
|
- name: "class"
|
|
type: "string"
|
|
doc: "Always 'SubworkflowFeatureRequirement'"
|
|
jsonldPredicate:
|
|
"_id": "@type"
|
|
"_type": "@vocab"
|
|
|
|
- name: ScatterFeatureRequirement
|
|
type: record
|
|
extends: ProcessRequirement
|
|
doc: |
|
|
Indicates that the workflow platform must support the `scatter` and
|
|
`scatterMethod` fields of [WorkflowStep](#WorkflowStep).
|
|
fields:
|
|
- name: "class"
|
|
type: "string"
|
|
doc: "Always 'ScatterFeatureRequirement'"
|
|
jsonldPredicate:
|
|
"_id": "@type"
|
|
"_type": "@vocab"
|
|
|
|
- name: MultipleInputFeatureRequirement
|
|
type: record
|
|
extends: ProcessRequirement
|
|
doc: |
|
|
Indicates that the workflow platform must support multiple inbound data links
|
|
listed in the `source` field of [WorkflowStepInput](#WorkflowStepInput).
|
|
fields:
|
|
- name: "class"
|
|
type: "string"
|
|
doc: "Always 'MultipleInputFeatureRequirement'"
|
|
jsonldPredicate:
|
|
"_id": "@type"
|
|
"_type": "@vocab"
|
|
|
|
- type: record
|
|
name: StepInputExpressionRequirement
|
|
extends: ProcessRequirement
|
|
doc: |
|
|
Indicates that the workflow platform must support the `valueFrom` field
|
|
of [WorkflowStepInput](#WorkflowStepInput).
|
|
fields:
|
|
- name: "class"
|
|
type: "string"
|
|
doc: "Always 'StepInputExpressionRequirement'"
|
|
jsonldPredicate:
|
|
"_id": "@type"
|
|
"_type": "@vocab"
|