Payload Bucket
The Cirrus payload bucket is an S3 bucket used to store workflow payloads that are either too large to pass inline through AWS service limits or that need to be persisted for state tracking, debugging, and retrieval after a workflow has finished executing.
Together with the StateDB, the payload bucket forms the record of a Cirrus deployment’s work: where the StateDB tracks that an execution happened and what state it’s in, the payload bucket stores the actual JSON payload content associated with each execution.
Configuration
Cirrus does not manage the payload bucket itself. The bucket is supplied
to Cirrus code through the CIRRUS_PAYLOAD_BUCKET environment
variable, and any S3 bucket that the running code has permission to read
from and write to can serve as the payload bucket. A bucket provisioned
alongside Cirrus (for example, via the reference CloudFormation in this
repository) is one common way to satisfy this, but it is not a
requirement.
All Cirrus code that reads or writes payload objects goes through the
cirrus.lib.payload_bucket.PayloadBucket class, which resolves the
bucket name from CIRRUS_PAYLOAD_BUCKET at construction time. If the
variable is not set when a PayloadBucket is created,
UndefinedPayloadBucketError is raised.
The root prefix under which all payload objects are organized can be
customized via the optional CIRRUS_PAYLOAD_ROOT_PREFIX environment
variable. When not set, it defaults to cirrus. Because Cirrus writes
all of its objects under this single top-level prefix (see Key
Organization), deployments that share the payload bucket with other,
unrelated content can use a distinct namespace and avoid key
collisions.
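As a minimal sketch of how this configuration is resolved, the following mirrors the documented behavior of `PayloadBucket` construction. It is an illustrative re-implementation, not the real `cirrus.lib.payload_bucket.PayloadBucket` class; only the environment variable names, the `cirrus` default, and the `UndefinedPayloadBucketError` behavior come from the docs.

```python
import os


class UndefinedPayloadBucketError(Exception):
    """Raised when CIRRUS_PAYLOAD_BUCKET is not set."""


class PayloadBucketSketch:
    """Hypothetical stand-in for cirrus.lib.payload_bucket.PayloadBucket."""

    def __init__(self) -> None:
        bucket = os.environ.get("CIRRUS_PAYLOAD_BUCKET")
        if not bucket:
            # The real class raises UndefinedPayloadBucketError here too.
            raise UndefinedPayloadBucketError("CIRRUS_PAYLOAD_BUCKET must be set")
        self.bucket = bucket
        # The root prefix is optional and defaults to "cirrus".
        self.root_prefix = os.environ.get("CIRRUS_PAYLOAD_ROOT_PREFIX", "cirrus")
        # Prefix attributes derived from the root prefix (see Key Organization).
        self.tmp_prefix = f"{self.root_prefix}/tmp"
        self.executions_prefix = f"{self.root_prefix}/executions"
```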
Key Organization
All objects written by Cirrus live under a configurable top-level
prefix (default cirrus/, controlled by
CIRRUS_PAYLOAD_ROOT_PREFIX). Within that, keys are split into two
distinct namespaces based on how long the data is expected to live:
- <root_prefix>/tmp/ — ephemeral storage for transient payloads that the system does not need to retain once processing has moved on. Objects under this prefix are expected to be cleaned up by a bucket lifecycle rule (see Lifecycle and Retention).
- <root_prefix>/executions/ — persistent storage for per-execution input and output payloads that Cirrus uses to link workflow runs to their payload content.
The full layout is as follows (using the default cirrus prefix):
s3://<payload-bucket>/
└── cirrus/ # configurable via CIRRUS_PAYLOAD_ROOT_PREFIX
├── tmp/ # ephemeral
│ ├── oversized/
│ │ └── <uuid>.json # overflow for oversized payloads
│ ├── batch/
│ │ └── <payload_id>/
│ │ └── <uuid>.json # payloads handed to batch tasks
│ └── invalid/
│ └── <uuid>.json # payloads that failed validation
└── executions/ # persistent
└── <payload_id>/
└── <execution_id>/
├── input.json # payload as received
└── output.json # payload after workflow completed
The prefix values are derived from the configured root prefix and are
exposed as instance attributes on PayloadBucket. They are the
single source of truth for where each type of object is written.
Note
Because a Cirrus payload_id has the form
<collections>/workflow-<workflow_name>/<item_ids> and contains
forward slashes, objects under <root_prefix>/executions/ and
<root_prefix>/tmp/batch/ fan out into multiple levels of S3
prefixes. This is intentional: it allows listing all executions for
a given collection, workflow, or item grouping with ordinary S3
prefix queries.
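The fan-out described above can be made concrete with a small sketch. The payload_id form is from this page; the helper and the example IDs below are hypothetical, not Cirrus code.

```python
def execution_key(root_prefix: str, payload_id: str,
                  execution_id: str, name: str) -> str:
    """Build an execution object key; payload_id contains slashes."""
    return f"{root_prefix}/executions/{payload_id}/{execution_id}/{name}"


# A payload_id of the form <collections>/workflow-<workflow_name>/<item_ids>
# fans out into nested prefixes when embedded in a key:
payload_id = "landsat-c2/workflow-publish/LC08_item"
key = execution_key("cirrus", payload_id, "exec-0001", "input.json")
# key == "cirrus/executions/landsat-c2/workflow-publish/LC08_item/exec-0001/input.json"
#
# An ordinary S3 prefix query on "cirrus/executions/landsat-c2/" would
# therefore match every execution for that collection, across all
# workflows and item groupings.
```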
Execution Payloads
When a workflow execution is claimed by the process lambda, the
input payload is written to:
<root_prefix>/executions/<payload_id>/<execution_id>/input.json
On successful completion, the update-state lambda writes the output
payload to the sibling key:
<root_prefix>/executions/<payload_id>/<execution_id>/output.json
The <execution_id> component is the Step Functions execution name,
which is deterministically derived from the payload ID and the current
list of executions in the StateDB record (see
PayloadManagers.gen_execution_arn). Because a single payload_id
may be re-run — for example on FAILED or ABORTED states, or with
replace=True — multiple <execution_id> children may exist under
the same payload_id prefix, one per attempt.
This layout is what makes the payload bucket and the StateDB addressable
together: given a StateDB record, its payload_id plus the relevant
entry from its executions list uniquely identify the corresponding
input.json and output.json in the bucket. The management CLI’s
get-input-payload and get-output-payload commands rely on
exactly this correspondence.
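The correspondence can be sketched as follows. The `payload_id` and `executions` field names follow this page's description of a StateDB record; the record shape and the helper itself are otherwise assumptions for illustration.

```python
def payload_urls(bucket: str, root_prefix: str,
                 record: dict, attempt: int = -1) -> tuple[str, str]:
    """Reconstruct input/output payload URLs from a StateDB-style record.

    By default uses the last entry in the record's executions list,
    i.e. the most recent attempt for this payload_id.
    """
    execution_id = record["executions"][attempt]
    base = (f"s3://{bucket}/{root_prefix}/executions/"
            f"{record['payload_id']}/{execution_id}")
    return f"{base}/input.json", f"{base}/output.json"
```

Because the layout is deterministic, no bucket listing is needed: the record alone identifies both objects, which is the same correspondence the CLI's get-input-payload and get-output-payload commands rely on.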
Oversized Payloads
Cirrus passes payloads between lambdas and Step Functions as JSON, but
AWS EventBridge embeds the Step Functions input and output as escaped
strings inside its own event envelope. To stay safely below the
EventBridge event size limit, PayloadManager enforces a conservative
MAX_PAYLOAD_LENGTH (120 KB) on the JSON-escaped payload length.
When a payload exceeds that limit, its contents are uploaded to:
<root_prefix>/tmp/oversized/<uuid>.json
and replaced in-flight by a small reference object of the form
{"url": "s3://.../<root_prefix>/tmp/oversized/<uuid>.json"}.
Downstream code in cirrus.lib.utils.payload_from_s3 transparently
re-hydrates these references when the full payload is needed again.
Because oversized payloads are only used as an overflow mechanism
during a single execution, they live under <root_prefix>/tmp/ and
are expected to be cleaned up by the bucket’s lifecycle rule.
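The overflow round trip can be sketched as below, with a plain dict standing in for S3. The 120 KB limit, the tmp/oversized/ key shape, and the `{"url": ...}` reference form come from this page; the function names and the dict-backed store are illustrative, not the real `PayloadManager` or `cirrus.lib.utils.payload_from_s3`.

```python
import json
import uuid

MAX_PAYLOAD_LENGTH = 120 * 1024  # conservative limit on JSON-escaped length


def maybe_offload(payload: dict, store: dict,
                  bucket: str, root_prefix: str) -> dict:
    """Park an oversized payload under tmp/oversized/ and return a reference.

    Payloads under the limit pass through unchanged.
    """
    body = json.dumps(payload)
    if len(body) <= MAX_PAYLOAD_LENGTH:
        return payload
    key = f"{root_prefix}/tmp/oversized/{uuid.uuid4()}.json"
    store[key] = body  # stand-in for an S3 put_object
    return {"url": f"s3://{bucket}/{key}"}


def rehydrate(payload: dict, store: dict) -> dict:
    """Transparently resolve a {"url": ...} reference back to the payload."""
    if set(payload) == {"url"}:
        key = payload["url"].split("/", 3)[3]  # strip "s3://<bucket>/"
        return json.loads(store[key])
    return payload
```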
Batch Payloads
Tasks that are dispatched to AWS Batch cannot receive their payload
inline, so PayloadBucket.upload_batch_payload writes the payload to:
<root_prefix>/tmp/batch/<payload_id>/<uuid>.json
The Batch task is then invoked with a reference to this key. This path is also ephemeral and cleaned up by the lifecycle rule.
Note
The key format <payload_id>/<uuid>.json (rather than simply
<uuid>/input.json) is retained for backwards compatibility with
earlier versions of cirrus-lib.
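A hypothetical helper mirroring the documented batch key layout; the key shape is from this page, but this is not the real `PayloadBucket.upload_batch_payload`.

```python
import uuid


def batch_payload_key(root_prefix: str, payload_id: str) -> str:
    """Build a tmp/batch key: <root_prefix>/tmp/batch/<payload_id>/<uuid>.json."""
    return f"{root_prefix}/tmp/batch/{payload_id}/{uuid.uuid4()}.json"
```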
Invalid Payloads
When a payload fails validation in a way that Cirrus wants to preserve for later inspection, it is uploaded to:
<root_prefix>/tmp/invalid/<uuid>.json
This is intended as short-lived diagnostic storage, not a permanent
archive of invalid inputs, which is why it lives under
<root_prefix>/tmp/. If long-term retention of invalid payloads is
required for a particular deployment, the recommended approach is to
ship them out to a separate bucket or archive from the handling code
rather than to alter the lifecycle of <root_prefix>/tmp/.
Lifecycle and Retention
Cirrus assumes a simple retention model for the payload bucket:
- Everything under <root_prefix>/tmp/ is ephemeral and should be removed by a bucket lifecycle rule. The reference CloudFormation in this repository configures a 10-day expiration on the <root_prefix>/tmp/ prefix (derived from the configured PayloadRootPrefix), which is a reasonable default that gives oversized and batch payloads enough lifetime to survive retries, backfills, and manual reprocessing.
- Everything under <root_prefix>/executions/ is persistent and is not expired by any Cirrus-managed rule. These objects are the primary mechanism for reconstructing the history of a workflow run after the fact; losing them will break the get-input-payload and get-output-payload CLI commands and make post-hoc debugging significantly harder.
Operators bringing their own payload bucket should configure a
lifecycle rule on <root_prefix>/tmp/ that matches this expectation.
A 10-day expiration is a reasonable starting point; shortening it risks
removing objects that are still in use by in-flight retries, and
broadening the rule's scope beyond <root_prefix>/tmp/ into other
prefixes will silently delete execution history.
Operators who need different retention semantics for execution payloads
— for example, cost control on very high-throughput deployments — can
add their own lifecycle rules targeting the
<root_prefix>/executions/ prefix, keeping in mind that any payload
whose execution record is still referenced from the StateDB will become
unreachable through the CLI once expired.
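A lifecycle configuration matching the retention model above might look like the following. The rule body uses the standard S3 LifecycleConfiguration structure (as accepted by boto3's `put_bucket_lifecycle_configuration`); the rule ID is made up, and actually applying it to a live bucket is left to the operator.

```python
def tmp_lifecycle_config(root_prefix: str = "cirrus") -> dict:
    """Build an S3 lifecycle configuration expiring only <root_prefix>/tmp/."""
    return {
        "Rules": [
            {
                "ID": "expire-cirrus-tmp",  # illustrative name
                "Status": "Enabled",
                # Scope the rule to tmp/ only; executions/ must never expire.
                "Filter": {"Prefix": f"{root_prefix}/tmp/"},
                "Expiration": {"Days": 10},
            }
        ]
    }
```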
Access Patterns
The payload bucket is accessed in a small number of well-defined places:
- process lambda — uploads execution input payloads via PayloadBucket.upload_input_payload, and uploads oversized payloads via PayloadBucket.upload_oversize_payload before invoking Step Functions.
- update-state lambda — uploads execution output payloads via PayloadBucket.upload_output_payload on successful completion.
- Task code — reads oversized payloads via cirrus.lib.utils.payload_from_s3 and, for batch tasks, fetches payload content from the batch prefix.
- Management CLI — reads execution input and output payloads via PayloadBucket.get_input_payload_url and PayloadBucket.get_output_payload_url, which reconstruct the deterministic S3 URL from a payload_id and execution_id pair looked up in the StateDB.
No Cirrus code writes directly to the bucket without going through
PayloadBucket, and the prefix attributes are not intended to be
re-implemented elsewhere. Downstream code that needs to locate a
payload object by key should import the helpers from
cirrus.lib.payload_bucket rather than constructing prefixes by
hand, so that any future reorganization of the layout remains a
single-file change.