Cirrus Process Payload

A Cirrus Process Payload is a JSON object containing the input metadata along with parameters for processing that metadata and associated data. A Cirrus payload is used to start a single execution of a workflow.

| Field Name | Type | Description |
| --- | --- | --- |
| type | string | Type of the GeoJSON Object. If set, it must be set to FeatureCollection. |
| features | [Item] | An array of STAC-like input items |
| process | [Process Definition] | An array of process definitions |

At a high level, the payload looks like a GeoJSON FeatureCollection with an additional process field.

{
    "type": "FeatureCollection",
    "features": [
        {
            <stac-item-1>
        },
        {
            <stac-item-2>
        }
    ],
    "process": {
        <process-definition>
    }
}
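As a rough illustration of the skeleton above, the following Python sketch assembles a payload as a plain dictionary and serializes it to JSON. The helper name build_payload and the workflow name example-workflow are hypothetical, and the sketch follows the skeleton in treating process as a single object.

```python
import json


def build_payload(items, workflow, task_params=None):
    """Assemble a payload dict shaped like the skeleton above (hypothetical helper)."""
    return {
        "type": "FeatureCollection",
        "features": list(items),
        "process": {
            "workflow": workflow,
            # One entry per task name; empty when no task needs parameters.
            "tasks": task_params or {},
        },
    }


payload = build_payload(
    items=[{"id": "tiles-15-V-WG-2022-3-21-0"}],
    workflow="example-workflow",  # hypothetical workflow name
)
print(json.dumps(payload, indent=4))
```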

A workflow task may take 1 or more items as input, and could output 1 or more items. One common use-case is to take in a single input item and generate a single output item.

Item

Input to Cirrus is commonly one or more STAC Items, which allows workflow tasks to be interoperable across different data sources. For example, an NDVI task could create an NDVI asset and add it to the item regardless of whether it is a Landsat or Sentinel STAC Item.

However, sometimes workflows are also responsible for creating proper STAC Items, mapping metadata in different formats into STAC. Thus, as far as Cirrus is concerned, the only required field on each item in the payload is an id field. The id is used to track the individual execution of a workflow.

| Field Name | Type | Description |
| --- | --- | --- |
| id | string | REQUIRED An ID for this item |

Each workflow task will have its own requirements on the items, and may perform additional validation. A task that finds a payload invalid should raise an InvalidInput exception, which will mark the execution in the state database as INVALID.

In the following example, the input item is a partial STAC Item, using STAC fields but missing most required fields (e.g., geometry, datetime). In this case, the item provides URLs to the original Sentinel-2 metadata, which will be converted to a STAC Item during one of the workflow tasks.
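The minimal checks described so far can be sketched in Python as follows. This is only an illustration: the InvalidInput class defined here is a local stand-in for the exception named above, validate_payload is a hypothetical helper, and the requirement that features be non-empty is an assumption based on tasks taking one or more items.

```python
class InvalidInput(Exception):
    """Local stand-in for the InvalidInput exception named in the text."""


def validate_payload(payload):
    """Check the few payload-level rules stated in this document."""
    # "type" is optional, but if set it must be FeatureCollection.
    if "type" in payload and payload["type"] != "FeatureCollection":
        raise InvalidInput("type, if set, must be 'FeatureCollection'")

    # Assumption: at least one input item is expected.
    features = payload.get("features")
    if not isinstance(features, list) or not features:
        raise InvalidInput("features must be a non-empty array of items")

    # The only field required on every item is a string id.
    for index, item in enumerate(features):
        if not isinstance(item, dict) or not isinstance(item.get("id"), str):
            raise InvalidInput(f"features[{index}] is missing a string 'id'")
    return payload
```

A real task would add its own checks on top of these, for example requiring specific assets on each item.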

{
    "features": [
        {
            "id": "tiles-15-V-WG-2022-3-21-0",
            "assets": {
                "tileInfo": {
                    "href": "https://roda.sentinel-hub.com/sentinel-s2-l2a/tiles/15/V/WG/2022/3/21/0/tileInfo.json"
                },
                "productInfo": {
                    "href": "https://roda.sentinel-hub.com/sentinel-s2-l2a/tiles/15/V/WG/2022/3/21/0/productInfo.json"
                },
                "metadata": {
                    "href": "https://roda.sentinel-hub.com/sentinel-s2-l2a/tiles/15/V/WG/2022/3/21/0/metadata.xml"
                }
            }
        }
    ]
}

While partial STAC Items make sense as input to workflows that create STAC metadata, the final output of a Cirrus workflow should always contain an array of actual STAC Items.

Process Definition

| Field Name | Type | Description |
| --- | --- | --- |
| description | string | An optional description of the process |
| workflow | string | REQUIRED Name of the workflow to run |
| input_collections | string | An identifier representing the set of collections the input items belong to |
| upload_options | Output Options | Parameters affecting the upload of item assets |
| tasks | Map<string, Map<str, object>> | A dictionary of task names (keys), each containing a dictionary of parameters for that task |

input_collections

The input_collections field is a way to explicitly group together input items across executions of workflows. It is optional; if not provided, input_collections is derived from all the collections the input items belong to. For instance, if a payload contains a single item that belongs to the collection sat-a-l1, then input_collections is sat-a-l1.

If the payload contains multiple items spanning more than one collection, then input_collections is a ‘/’-separated string of the sorted list of collections. For instance, if the items are in collections sat-c-l1 and sat-a-l1, then input_collections would be sat-a-l1/sat-c-l1.
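The derivation rule can be sketched in a few lines of Python. The function name derive_input_collections is hypothetical, and the sketch assumes each item carries its collection in a top-level collection field, as STAC Items do.

```python
def derive_input_collections(items, explicit=None):
    """Return the explicit value if given, else the sorted, '/'-joined collections."""
    if explicit:
        return explicit
    # A set removes duplicates when several items share one collection.
    collections = {item["collection"] for item in items if item.get("collection")}
    return "/".join(sorted(collections))


items = [
    {"id": "item-1", "collection": "sat-c-l1"},
    {"id": "item-2", "collection": "sat-a-l1"},
]
print(derive_input_collections(items))  # sat-a-l1/sat-c-l1
```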

tasks

The tasks field is a dictionary with an optional key for each task. If present, it contains a dictionary of parameters for the task. The documentation for each task will supply the list of available parameters.
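A task might read its own parameters along these lines. This is a sketch only: get_task_parameters, the task name ndvi, and its nodata parameter are hypothetical names chosen for illustration.

```python
def get_task_parameters(process, task_name):
    """Return the parameter dict for one task, or an empty dict if none was given."""
    # Both the tasks field and the per-task key are optional.
    return process.get("tasks", {}).get(task_name, {})


process = {
    "workflow": "example-workflow",  # hypothetical workflow name
    "tasks": {"ndvi": {"nodata": 0}},  # hypothetical task and parameter
}
print(get_task_parameters(process, "ndvi"))
print(get_task_parameters(process, "publish"))
```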

Output Options

The output options object is a dictionary of parameters used to control the publishing of metadata and the uploading of data assets. Any task that uploads data should use the OutputOptions to control where and how that data is uploaded. See the cirrus-lib function transfer.upload_item_assets.

| Field Name | Type | Description |
| --- | --- | --- |
| path_template | string | REQUIRED A string template for specifying the location of uploaded assets |
| collections | Map<str, str> | REQUIRED A mapping of output collection name to a regex pattern used on Item IDs |
| public_assets | [str] | A list of asset keys that should be marked as public when uploaded |
| headers | Map<str, str> | A set of key, value headers to send when uploading data to s3 |
| s3_urls | bool | Controls whether the final published URLs should be s3 (s3://<bucket>/<key>) or https URLs. |

path_template

The path_template string is a way to control the output location of uploaded assets from a STAC Item using metadata from the Item itself. The template can contain fixed strings along with variables used for substitution. The following variables can be used in the template.

| Variable | Type | Description |
| --- | --- | --- |
| id | string | The id of the Item |
| collection | string | The name of the Item’s Collection |
| date | string | The date portion of the Item’s datetime property of the form YYYY-MM-DD |
| year | string | The year portion of the Item’s datetime property |
| month | string | The month portion of the Item’s datetime property |
| day | string | The day portion of the Item’s datetime property |
| <property> | varies | Any Item property (e.g., mgrs:utm_zone) |

As an example, a path_template of /data/${collection}/${id}/ will upload all assets to the default Cirrus data bucket, under a path based on the Item’s collection and ID.

If a complete s3 URL is provided instead (e.g., s3://my-bucket/data/${collection}/${id}/) then the data will be uploaded to the provided bucket.
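The substitution can be illustrated with a short Python sketch. It is not the cirrus-lib implementation: render_path_template is a hypothetical helper, the item is assumed to hold its collection in a top-level collection field and its datetime under properties, and rendering year, month, and day without zero padding is an assumption.

```python
import re
from datetime import datetime

# Matches ${name}; the name may contain ':' as in mgrs:utm_zone.
_VARIABLE = re.compile(r"\$\{([^}]+)\}")


def render_path_template(template, item):
    """Fill ${...} variables in template from the item (hypothetical helper)."""
    properties = item.get("properties", {})

    def lookup(match):
        name = match.group(1)
        if name == "id":
            return str(item["id"])
        if name == "collection":
            return str(item["collection"])
        if name in ("date", "year", "month", "day"):
            # Replace a trailing "Z" so older Python versions can parse it.
            moment = datetime.fromisoformat(properties["datetime"].replace("Z", "+00:00"))
            if name == "date":
                return moment.strftime("%Y-%m-%d")
            # Assumption: year, month, and day are rendered without zero padding.
            return str(getattr(moment, name))
        # Anything else is looked up as an Item property; a missing one raises KeyError.
        return str(properties[name])

    return _VARIABLE.sub(lookup, template)


item = {
    "id": "sa-l1-20200107",
    "collection": "sat-a-l1",
    "properties": {"datetime": "2020-01-07T00:00:00Z", "mgrs:utm_zone": 15},
}
print(render_path_template("/data/${collection}/${id}/", item))
```

The same function handles a full s3 URL such as s3://my-bucket/data/${collection}/${id}/, since only the ${...} parts are substituted.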

collections

The collections dictionary is a way to control which collection output STAC Items are ultimately assigned to. Each dictionary key is the name of a collection, and its value is a regular expression that is matched against the ID of each STAC Item that will be published. The first matching collection will be used.

"upload_options": {
    "collections": {
        "sat-a-l1": "sa.*",
        "sat-b-l1": "sb.*"
    }
}

With this example, an Item with an ID of “sa-l1-20200107” will be put in the sat-a-l1 collection, and an Item with an ID of “sb-l1-19731212” will be put in the sat-b-l1 collection.

If collections is supplied, each Item is assigned a new collection before it is published. If it is not provided, Items keep their existing collections.
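The first-match rule can be sketched in Python as shown here. The function assign_collection is hypothetical; using re.match (which anchors at the start of the ID) and leaving unmatched Items unchanged are both assumptions rather than documented behavior.

```python
import re


def assign_collection(item, collections):
    """Return a copy of item assigned to the first collection whose pattern matches its ID."""
    # Dictionaries preserve insertion order, so "first" follows the payload order.
    for name, pattern in collections.items():
        if re.match(pattern, item["id"]):
            return {**item, "collection": name}
    # Assumption: an Item matching no pattern keeps its current collection.
    return item


collections = {"sat-a-l1": "sa.*", "sat-b-l1": "sb.*"}
for item_id in ("sa-l1-20200107", "sb-l1-19731212"):
    print(item_id, "->", assign_collection({"id": item_id}, collections)["collection"])
```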