Cirrus Payload

A Cirrus Payload is a JSON object containing the input metadata along with a parameters for processing that metadata and associated data. A Cirrus payload is used to start a single execution of a workflow.

Cirrus Payload
Field Name	Type	Description
type	string	Type of the GeoJSON Object. If set, it must be set to FeatureCollection.
features	[Item]	An array of STAC-like input items
process	[Process Definition]	An array of process definitions

At a high-level, the payload looks like a GeoJSON FeatureCollection with an additional process field.

{
    "type": "FeatureCollection",
    "features": [
        {
            <stac-item-1>
        },
        {
            <stac-item-2>
        }
    ],
    "process": {
        <process-definition>
    }
}

A workflow task may take 1 or more items as input, and could output 1 or more items. One common use-case is to take in a single input item and generate a single output item.

Item

An input item to Cirrus is commonly 1 or more STAC Items, allowing workflow tasks to be interoperable across different data sources. For example an NDVI task could create an NDVI asset, adding it to the item regardless of if it was a Landsat or Sentinel STAC Item.

However, sometimes workflows are also responsible for creating proper STAC Items, mapping metadata in different formats into STAC. Thus, as far as Cirrus is concerned, the only required field in the payload is an id field. The id is used to track the individual execution of a workflow.

Item
Field Name	Type	Description
id	string	REQUIRED An ID for this item

Each workflow task will have it’s own requirements on the items, and may perform additional validation. Payloads that fail validation of a workflow task should throw an InvalidInput exception, which will mark the execution in the state database as INVALID.

In the following example the input item is a partial STAC Item, using STAC fields but missing most required fields (e.g., geometry, datetime). In this case, this item provides URLs to the original Sentinel-2 metadata which will be converted to a STAC Item during during one of the workflow tasks.

{
    "features": [
        {
            "id": "tiles-15-V-WG-2022-3-21-0",
            "assets": {
                "tileInfo": {
                    "href": "https://roda.sentinel-hub.com/sentinel-s2-l2a/tiles/15/V/WG/2022/3/21/0/tileInfo.json"
                },
                "productInfo": {
                    "href": "https://roda.sentinel-hub.com/sentinel-s2-l2a/tiles/15/V/WG/2022/3/21/0/productInfo.json"
                },
                "metadata": {
                    "href": "https://roda.sentinel-hub.com/sentinel-s2-l2a/tiles/15/V/WG/2022/3/21/0/metadata.xml"
                }
            }
        }
    ]
}

While partial STAC Items make sense as input to workflows that create STAC metadata, the final output of a Cirrus workflow should always contain an array of actual STAC Items.

Process Definition

Process Definition
Field Name	Type	Description
description	string	An optional description of the process
workflow	string	REQUIRED Name of the workflow to run
input_collections	string	An identifier representing the set of collections the input items belong to
upload_options	Output Options	Parameters affecting the upload of item assets
tasks	Map<string, Map<str, object>>	A dictionary of task names (keys), each containing a dictionary of parameters for that task

input_collections

The input_collections field is a way to explicitly group together input items across executions of workflows. It is optional, and if not provided input_collections is derived from all the collections the input items belong to. For instance, if a payload contains a single item, and it belongs in the collection sat-a-l1, then input_collections is sat-a-l1.

If the payload contains multiple items spanning more than 1 collection, then input_collections is a ‘/’ separated string of the sorted list of collections. For instance, if the items are in collections sat-c-l1 and sat-a-l1 then input_collections would be sat-a-l1/sat-c-l1

tasks

The tasks field is a dictionary with an optional key for each task. If present, it contains a dictionary of parameters for the task. The documentation for each task will supply the list of available parameters.

Output Options

The output options object is a dictionary of parameters to used to control the publishing of the metadata and uploading data assets. Any task that uploads data should use the OutputOptions to control where and how that data is uploaded. See the cirrus-lib function transfer.upload_item_assets

Output Options
Field Name	Type	Description
path_template	string	REQUIRED A string template for specifying the location of uploaded assets
collections	Map<str, str>	REQUIRED A mapping of output collection name to a regex pattern used on Item IDs
public_assets	[str]	A list of asset keys that should be marked as public when uploaded
headers	Map<str, str>	A set of key, value headers to send when uploading data to s3
s3_urls	bool	Controls if the final published URLs should be an s3 (s3://<bucket>/<key>) or https URL.

path_template

The path_template string is a way to control the output location of uploaded assets from a STAC Item using metadata from the Item itself. The template can contain fixed strings along with variables used for substitution. The following variables can be used in the template.

Output Options
Field Name	Type	Description
id	string	The id of the Item
collection	string	The name of the Item’s Collection
date	string	The date portion of the Item’s datetime property of the form YYYY-MM-DD
year	string	The year portion of the Item’s datetime property
month	string	The month portion of the Item’s datetime property
day	string	The day portion of the Item’s datetime property
<property>	varies	Any Item property (e.g., mgrs:utm_zone)

As an example, a path_template of /data/${collection}/${id}/ will upload all assets to a path based on the Item’s collection and ID to the default Cirrus data bucket.

If a complete s3 URL is provided instead (e.g., s3://my-bucket/data/${collection}/${id}/) then the data will be uploaded to the provided bucket.

collections

The collections dictionary is a way to control what collection output STAC Items are ultimately assigned to. Each dictionary key is the name of a collection, and it’s value is a regex expression that is used to match against each STAC Item ID that will be published. The first matching collection will be used.

"upload_options": {
    "collections": {
        "sat-a-l1": "sa.*"
        "sat-b-l1": "sb.*"
    }
}

With this example an Item with an ID of “sa-l1-20200107” will be put in the sat-a-l1 collection, and Item “sb-l1-19731212” would be put in the sat-b-l1 collection.

If collections is supplied, each Item is assigned a new collection before it is published. If not provided the collections will remain as they were.