Lambda-based components

Components that use Lambda (feeders, functions, and some tasks) share a common set of required files and definition.yml format.

Required files

In addition to the definition.yml and README.md files required by all components, Lambda-based components also require a lambda_function.py Python file implementing a lambda_handler function, which serves as the Lambda execution entry point. Like all components, these files are contained within a directory named for the component within its component type’s directory.

For example, if we have a Lambda task component named reproject, we would end up with a directory structure that looks like this:

<project_dir>/
    tasks/
        reproject/
            definition.yml
            README.md
            lambda_function.py

The contents of a Lambda-based component’s directory–minus the definition.yml file–will be packaged into a Lambda deployment zip file and uploaded to AWS on project deploy. Any additional files added by the user will also be included in the Lambda zip.

Definition file

The definition.yml contains a Lambda component’s configuration. The format is similar to that used by the Serverless Framework, which underlies cirrus’s deployment mechanism, but is subtly different.

Here is an example definition.yml file for a Lambda component:

description: A sample lambda description string
environment:
  COMPONENT_LEVEL_VAR: some value
  OVERRIDDEN_VAR: another_value
enabled: true
lambda:
  enabled: true
  memorySize: 1024
  timeout: 60
  runtime: python3.7
  environment:
    LAMBDA_LEVEL_VAR: 13
    OVERRIDDEN_VAR: new_value
  layers:
    - arn:aws:lambda:${self:provider.region}:552188055668:layer:geolambda:2
  pythonRequirements:
    include:
      - rasterio==1.2.8
      - rio-cogeo~=1.1.10
  iamRoleStatements:
    - Effect: "Allow"
      Action:
        - "s3:PutObject"
      Resource:
        - !Join
          - ''
          - - 'arn:aws:s3:::'
            - ${self:provider.environment.CIRRUS_DATA_BUCKET}
            - '*'
    - Effect: "Allow"
      Action:
        - "s3:ListBucket"
        - "s3:GetObject"
        - "s3:GetBucketLocation"
      Resource: "*"
    - Effect: "Allow"
      Action: secretsmanager:GetSecretValue
      Resource:
        - arn:aws:secretsmanager:#{AWS::Region}:#{AWS::AccountId}:secret:cirrus-creds-*

Generally speaking, the lambda key’s value supports the same configuration parameters as Serverless Functions, with a few key differences. Consult that Serverless documentation for a full list of supported parameters, along with the following list of Cirrus-specific properties/behaviors.

Description

The top-level description value is used for the component’s description within Cirrus, as is also added to the lambda configuration during the cirrus build configuration compiliation process. So there’s no need to specify it twice.

Enabled state

Components can be disabled within Cirrus, which will exclude them from the compiled configuration. All components support a top-level enabled parameter to completely enable/disable the component. Lambda-based components also support an enabled parameter under the lambda key, which will enable/disable just the Lambda portion of the component.

For Lambda-only components these enabled controls function more or less identically. For components that support additional modes of operation (such as tasks, which also support Batch), the specific lambda.enabled parameter can be more useful.

Environment variables

Lambda-based components support top-level environment variable specifications, which they inherit, along with any environment variables defined globally in the cirrus.yml file under the provider.environment key. In the case of conflicts, inheritence will perfer a value in the Lambda environment varaibles over one from the task, and one from the task varaibles over that from the globals.

If, along with the example definition.yml above, we had a cirrus.yml defining:

provider:
  environment:
    GLOBAL_LEVEL_VAR: global_value
    OVERRIDDEN_VAR: first_value

we would end up with the following environment variables/values defined for the Lambda function:

GLOBAL_LEVEL_VAR: global_value
COMPONENT_LEVEL_VAR: some value
LAMBDA_LEVEL_VAR: 13
OVERRIDDEN_VAR: new_value

Generally, we recommend using the top-level environment variables for all variable definitions whenever possible. Global variables in the cirrus.yml are useful for values shared amongst most or all Lambda or Batch components, allowing a single place for updates. Values used by only one or a handful of components, however, are best specified in those respective component definitions.

We recommend using the top-level variable specification over the lambda level for consistency, as that is also preferred for tasks that use Batch (both to allow sharing the environment values between Batch and Lambda, where required, and because the Batch environment specification uses a different and more verbose format).

If ever in doubt about the final environment variables/values (or the values of any other parameters) used in a Lambda definition, the cirrus cli provides a show command that runs the full configuration interpolation to generate the “complete” definition as it appears in the compiled configuration generated by the build command. Run it like this:

❯ cirrus show task <TaskName>

IAM permissions

Lambda’s each get a unique role created via the serverless-iam-roles-per-function plugin. While this plugin supports the specification of global permissions in the cirrus.yml file under provider.iamRoleStatments or provider.iam.role.statements (depending on serverless version), using global permissions is highly discourgaed.

Instead, each function should have a specific set of IAM permissions listed in its definition.yml, limited to most restrictive set possible. The default set of permissions, as shown in the example, may or may not be that set, depending on the functionality of the Lambda components. Let’s break each of those default permissions down to see what they do.

- Effect: "Allow"
  Action:
    - "s3:PutObject"
  Resource:
    - !Join
      - ''
      - - 'arn:aws:s3:::'
        - ${self:provider.environment.CIRRUS_DATA_BUCKET}
        - '*'

This first action allows the Lambda to add/update an object in a the bucket referenced by the S3 bucket ARN provided via the global environment variable CIRRUS_DATA_BUCKET. This permission is useful for all tasks that need to write assets/items to the data bucket.

- Effect: "Allow"
  Action:
    - "s3:ListBucket"
    - "s3:GetObject"
    - "s3:GetBucketLocation"
  Resource: "*"

This next action allow Cirrus components to retrieve data from any S3 bucket that allows access. Task and other components that need to access assets or other files across a potentially unknown set of S3 buckets should get this permsission.

- Effect: "Allow"
  Action: secretsmanager:GetSecretValue
  Resource:
    - arn:aws:secretsmanager:#{AWS::Region}:#{AWS::AccountId}:secret:cirrus-creds-*

Some buckets require credentials for access (such as those using KMS encryption). The underlying cirrus-lib utility functions for accessing S3 objects implicitly supports accessing all secrets named like cirrus-creds-<bucket_name> to get and use credentials as requred for accessing such buckets. An IAM statement like this one allows this Cirrus component access to any such secrets as needed.

Python dependencies

Cirrus uses a Serverless plugin serverless-python-requirements to bundle any necessary Python dependencies into the Lambda deployment zip file when packaging. Unlike the stock plugin, however, Cirrus does not use a requirements.txt file for dependency specifiction. Instead, Cirrus supports a list of all requirements under lambda.pythonRequirements.include.

The items in that list support the normal requirements.txt file format and all version specification operators/options.

Global requirements supported by all Lambda components are supported via configuration in the cirrus.yml file under custom.pythonRequirements.include, but using this mechanism is highly discouraged in favor of explicitly listing pinned requirements for every Lambda component, as required.

Note that a dependency specification for cirrus-lib is injected into every Lambda. Cirrus does this by updating each Lambda component’s requirements list with the cirrus-lib requirements. cirrus-lib itself is copied into each Lambda deployment zip from the version installed to the current Python environment.

Different module/handler names

Serverless function definitions require the specification of a handler property to set the Lambda entry point module and function. Indeed, if lambda.handler is set, that value will be used to set the Lambda entry point.

However, Cirrus does not require this parameter to be specified, and will instead default it to lambda_function.lambda_handler, in line with AWS convention and the expected handler file name.