27 Aug 2020

Yohan Beschi
Developer, Cloud Architect and DevOps Advocate

Easy Infrastructure as Code with Troposphere

Choosing the perfect tool for Infrastructure as Code (IaC) is an impossible task - literally - this tool does not exist. AWS CloudFormation, AWS SAM, AWS CDK, Terraform, Serverless, etc. are all flawed in some ways.

AWS CloudFormation gives us full control over the AWS resources we want to create, but is too verbose, lacks flexibility, variables and loops, and there is no easy way to deploy templates (even if Ansible can help).

Terraform is similar to AWS CloudFormation in principle. But Terraform stores the state of the managed infrastructure and configuration in a state file (named “terraform.tfstate” by default), which must be part of the IaC and shared between all team members, for example in a git repository or an S3 Bucket. This state file, coupled with the fact that Terraform has been in beta for years and has breaking changes between versions, makes it too much of a risk to be used in production by a team.

AWS SAM is only an extension of AWS CloudFormation, and while it offers a way to easily package and deploy Lambdas or Lambda Layers with their dependencies, it is only intended for serverless resources like Lambdas, API Gateway or DynamoDB. For everything else you will have to use CloudFormation templates.

Serverless, like AWS SAM, offers an abstraction for serverless resources but supports a lot more of them. And like AWS SAM, if we cannot create resources directly with Serverless we can use CloudFormation templates, and we are back to our initial issues.

AWS CDK is the latest AWS IaC product (the first stable version was released in July 2019). Among all the IaC tools, AWS CDK is an oddball as it can be used in two ways:

we can have full control over the resources we want to create using a programming language (JavaScript, TypeScript, Python, Java and C#). In other words, every resource you can create using AWS CloudFormation can also be created with the CDK using classes starting with Cfn
we can have a lot of abstraction, and not only for serverless resources (e.g. we can create a VPC, with public and private subnets, an Internet Gateway, NAT Gateways, etc. with a few lines of code)

AWS CDK is pretty close to perfect, unfortunately:

to be able to use the abstraction layer we need to learn how to do it (which classes and parameters)
learning AWS CloudFormation (the syntax, how to use the service in the AWS console, etc.) is still mandatory
before deploying the AWS CloudFormation stacks, we must generate (synthesize in the CDK parlance) the AWS CloudFormation templates in order to check that we have exactly the AWS resources we want to create, nothing less and nothing more
the lack of control for stacks deployment can make simple tasks very complicated
the almost nonexistent help for serverless (compared to AWS SAM or Serverless) requires some extra coding

This quick overview of the main IaC tools currently used with AWS seems dire, but it is far from the truth. These tools are not that bad: we’ve been using them for years and they get things done. The question is whether we can find a way to make our DevOps life easier, in four ways:

faster infrastructures development
easier serverless applications deployment
less code/script/templates to write
minimal tool set required

Troposphere (Python), SparkleFormation (Ruby), etc. are tools based on the principle that using a programming language to generate CloudFormation templates offers flexibility, less copy/paste, less typing, variables, simple conditions, loops, etc., everything that we don’t have with AWS CloudFormation.
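
The principle can be sketched with the Python standard library alone, before any dedicated tool comes into play. The snippet below is only an illustration, not Troposphere code: the helper name build_template, the resource names and the CIDR blocks are assumptions made for the example. It builds a template as a plain dict and uses a loop to declare one subnet per Availability Zone.

```python
import json


def build_template(vpc_cidr, subnets):
    """Build a CloudFormation template as a plain dict.

    `subnets` maps an Availability Zone to the CIDR block of its subnet.
    (Hypothetical helper, for illustration only.)
    """
    resources = {
        'VPC': {
            'Type': 'AWS::EC2::VPC',
            'Properties': {'CidrBlock': vpc_cidr},
        }
    }
    # One loop replaces as many copy/pasted blocks as there are subnets
    for index, (az, cidr) in enumerate(sorted(subnets.items()), start=1):
        resources[f'SubnetAz{index}'] = {
            'Type': 'AWS::EC2::Subnet',
            'Properties': {
                'AvailabilityZone': az,
                'CidrBlock': cidr,
                'VpcId': {'Ref': 'VPC'},
            },
        }
    return {'AWSTemplateFormatVersion': '2010-09-09', 'Resources': resources}


template = build_template('10.0.0.0/24', {
    'eu-west-1a': '10.0.0.0/25',
    'eu-west-1b': '10.0.0.128/25',
})
print(json.dumps(template, indent=2, sort_keys=True))
```

Adding a third Availability Zone is then a one-line change in the input dict rather than a new copy of the subnet block.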

Troposphere is an open source Python library which has been around for almost a decade. Unfortunately, it is for Python only and it is not backed by a big company, which means it does not get all the praise and attention it deserves, and the documentation is very poor. As often with open source software, we need to dig into the source code in order to use all it has to offer.

This article aims to show what Troposphere can do and how we can extend it to have the unicorn IaC tool for AWS. And of course, you should have some experience with AWS CloudFormation in order to understand what Troposphere has to offer and the full extent of this article as we won’t spend much time on how a CloudFormation template works.

All the source code presented in this article is available in a Github repository.

Simple Troposphere example

Troposphere does only one thing, generates AWS CloudFormation templates from Python code, but does it well.

In order to use Troposphere we first need to initialize a Python environment with the library:

export PIPENV_VENV_IN_PROJECT=enabled
pipenv --python 3.8
pipenv install --dev troposphere

Let’s create a VPC to see how Troposphere works (getting_started/first_example.py):

from troposphere import Template
from troposphere import ec2

# Create a new AWS CloudFormation template
t = Template()

# Create a VPC (AWS::EC2::VPC)
r_vpc = ec2.VPC('VPC')
r_vpc.CidrBlock = '10.0.0.0/24'

# Add the VPC object to the template
t.add_resource(r_vpc)

We can then print the resulting template in JSON or YAML.

print(t.to_json())
print(t.to_yaml())

The execution of the script python first_example.py prints the following JSON:

{
    "Resources": {
        "VPC": {
            "Properties": {
                "CidrBlock": "10.0.0.0/24"
            },
            "Type": "AWS::EC2::VPC"
        }
    }
}

And YAML:

Resources:
  VPC:
    Properties:
      CidrBlock: 10.0.0.0/24
    Type: AWS::EC2::VPC

Internally, Troposphere uses the json library from the Python Standard Library and relies on the cfn-flip library to generate the template in YAML.

Moreover, we could have written the previous bit of code as shown below, creating the VPC object in the method add_resource and passing all the parameters to the VPC constructor:

r_vpc = t.add_resource(ec2.VPC(
  'VPC',
  CidrBlock='10.0.0.0/24'
))

The principle is the same for every AWS resource that can be created with CloudFormation. Based on the resource type (e.g. AWS::EC2::VPC or more generically AWS::service::resource), we first import the service from troposphere import ec2 and instantiate the resource object (ec2.VPC('VPC')) with at least the name of the resource in parameter.

AWS CloudFormation Templates elements

Below is the anatomy of any AWS CloudFormation Template:

AWSTemplateFormatVersion: "version date"

Description:
  String

Metadata:
  template metadata

Parameters:
  set of parameters

Mappings:
  set of mappings

Conditions:
  set of conditions

Transform:
  set of transforms

Resources:
  set of resources

Outputs:
  set of outputs

Every element that we can use while writing an AWS CloudFormation template can be used with Troposphere (getting_started/template_anatomy.py):

from troposphere import Template

t = Template()
t.set_version()
t.set_description()
t.set_metadata()
t.add_parameter()
t.add_mapping()
t.add_condition()
t.set_transform()
t.add_resource()
t.add_output()

Template() can take two parameters: Description and Metadata (or we can use the methods set_description() and set_metadata()).

set_version() can be called without a parameter and the version will be “2010-09-09”.

set_description() takes a string - set_description('Template generated by Troposphere').

set_metadata() takes a dict to which we can add anything we want:

t.set_metadata({
  'Comments': 'Initial Draft',
  'LastUpdated': 'Jan 1st 2015',
  'UpdatedBy': 'First Last',
  'Version': 'V1.0',
  'Instances' : {
    'Description' : 'Information about the instances'
  },
  'Databases' : {
    'Description' : 'Information about the databases'
  }
})

add_parameter() takes an object of type Parameter which is imported from __init__.py (from troposphere import Parameter):

t.add_parameter(Parameter(
  "Environment",
  Description="Environment into which the EC2 instance will be deployed",
  Type="String"
))

add_mapping() takes the name of the mapping and the definition of the mapping as a dict:

t.add_mapping('RegionMap', {
  'us-east-1': {'AMI': 'ami-7f418316'},
  'us-west-1': {'AMI': 'ami-951945d0'},
  'us-west-2': {'AMI': 'ami-16fd7026'},
  'eu-west-1': {'AMI': 'ami-24506250'},
  'sa-east-1': {'AMI': 'ami-3e3be423'},
  'ap-southeast-1': {'AMI': 'ami-74dda626'},
  'ap-northeast-1': {'AMI': 'ami-dcfa4edd'}
})

t.set_transform() takes the name of the Transform (AWS::Serverless-2016-10-31, AWS::SecretsManager-2020-07-23, etc.).

add_condition() takes the name of the condition and the definition of the condition:

t.add_condition('IsProduction', Equals(Ref('Environment'), 'true'))

In this example, we see the use of intrinsic functions (Ref and Fn::Equals). We will get back to them.

add_resource() takes an object of type AWSObject. For example:

t.add_resource(ec2.VPC(
  'VPC',
  CidrBlock='10.0.0.0/24'
))

And finally, add_output() takes an object of type Output which is imported from __init__.py (from troposphere import Output):

r_vpc = t.add_resource(ec2.VPC(
  'VPC',
  Condition=c_is_prod,
  CidrBlock='10.0.0.0/24'
))

t.add_output(Output(
  'VPCId',
  Value=Ref(r_vpc),
  Description='VPC Id',
  Export=Export(Sub('${AWS::StackName}-' + r_vpc.title))
))

To export a value we need to use an object of type Export, once again defined in __init__.py (from troposphere import Export).

In this example, we can see that the name of a resource can be retrieved using the instance attribute title (r_vpc.title) to avoid copying and pasting strings or defining constants.

Furthermore, it is worth mentioning that:

all add_xxx() methods can take lists:

t.add_output(
  [
    Output('VpcId', Value=Ref(r_vpc)),
    Output('SubnetAz1Id', Value=Ref(r_subnet_az1))
  ]
)

all keys (of the same level) in a template are alphabetically sorted (which helps when doing diffs):

AWSTemplateFormatVersion: '2010-09-09'
Conditions:
  IsProduction: !Equals
    - !Ref 'Environment'
    - 'true'
Description: Template generated by Troposphere
Mappings:
  RegionMap:
    ap-northeast-1:
      AMI: ami-dcfa4edd
    ap-southeast-1:
      AMI: ami-74dda626
    eu-west-1:
      AMI: ami-24506250
    sa-east-1:
      AMI: ami-3e3be423
    us-east-1:
      AMI: ami-7f418316
    us-west-1:
      AMI: ami-951945d0
    us-west-2:
      AMI: ami-16fd7026
Metadata:
  Comments: Initial Draft
  Databases:
    Description: Information about the databases
  Instances:
    Description: Information about the instances
  LastUpdated: Jan 1st 2015
  UpdatedBy: First Last
  Version: V1.0
Outputs:
  VPCId:
    Description: VPC Id
    Export:
      Name: !Sub '${AWS::StackName}-VPC'
    Value: !Ref 'VPC'
Parameters:
  Environment:
    Description: Environment into which the EC2 instance will be deployed
    Type: String
  KeyName:
    Description: Name of an existing EC2 KeyPair to enable SSH access to the instance
    Type: String
Resources:
  VPC:
    Condition: IsProduction
    Properties:
      CidrBlock: 10.0.0.0/24
    Type: AWS::EC2::VPC
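
Why sorted keys help with diffs can be shown with the standard json module alone. This is a sketch of the idea and says nothing about Troposphere internals beyond what is stated in this article; the two dicts are made up for the example.

```python
import json

# Two dicts with the same content, built in a different order
first = {'Resources': {'VPC': {'Type': 'AWS::EC2::VPC'}}, 'Description': 'demo'}
second = {'Description': 'demo', 'Resources': {'VPC': {'Type': 'AWS::EC2::VPC'}}}

# Without sorting, the serialized text follows the insertion order
unsorted_equal = json.dumps(first) == json.dumps(second)

# With sort_keys=True, both serialize to exactly the same text
sorted_equal = json.dumps(first, sort_keys=True) == json.dumps(second, sort_keys=True)

print(unsorted_equal, sorted_equal)  # False True
```

With a stable key order, reordering statements in the Python script does not produce noise in the diff of the generated template.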

Constants

To make our lives easier, Troposphere has a constants module to avoid using strings:

Here are a few examples:

from troposphere.constants import (
  # Regions
  EU_WEST_1,
  # AZs
  EU_WEST_1A,
  # EC2 instance types
  T3_NANO,
  # DB instance types
  DB_R5_XLARGE,
  # CloudFormation parameter types
  STRING,
  # CloudFront Hosted Zone ID
  CLOUDFRONT_HOSTEDZONEID
)

Unfortunately SSM parameter types (i.e. AWS::SSM::Parameter::XXX) are not defined in this module.

Pseudo Parameters

Troposphere supports all CloudFormation Pseudo Parameters (getting_started/pseudo_parameters.py).

from troposphere import (
  AccountId,
  NotificationARNs,
  NoValue,
  Output,
  Partition,
  Region,
  StackId,
  StackName,
  Template,
  URLSuffix
)

t = Template()

t.add_output(
  [
    Output(
      'AccountId',
      Value=AccountId
    ),
    Output(
      'NotificationARNs',
      Value=NotificationARNs
    ),
    Output(
      'NoValue',
      Value=NoValue
    ),
    Output(
      'Partition',
      Value=Partition
    ),
    Output(
      'Region',
      Value=Region
    ),
    Output(
      'StackId',
      Value=StackId
    ),
    Output(
      'StackName',
      Value=StackName
    ),
    Output(
      'URLSuffix',
      Value=URLSuffix
    )
  ]
)

Tagging resources

Tagging resources is essential. Adding Tags with Troposphere is quite easy (getting_started/tags.py).

from troposphere import Tags

r_vpc = t.add_resource(ec2.VPC(
  'VPC',
  CidrBlock='10.0.0.0/16',
  Tags=Tags(
    Application='demo-app',
    Name='s-eu1-demo-vpc'
  )
))

Conditions

We have already seen how to add conditions to a template. Now let’s see how to use them with resources (getting_started/condition.py).

c_is_prod = t.add_condition('IsProduction', Equals(Ref('Environment'), 'true'))

t.add_resource(ec2.VPC(
  'VPC',
  Condition=c_is_prod,
  CidrBlock='10.0.0.0/24'
))

AWS Resources attributes

AWS CloudFormation has several Resource attributes that can be used to configure the creation, update and deletion of a resource.

Let’s see some of them.

DependsOn

The attribute DependsOn is used to control the resources creation order, usually when there is no reference between them (getting_started/resources_attributes/depends_on.py).

r_route_igw = t.add_resource(ec2.Route(
  'RouteInternetGateway',
  DependsOn=r_gateway_attachment,
  GatewayId=Ref(r_internet_gateway),
  DestinationCidrBlock='0.0.0.0/0',
  RouteTableId=Ref(r_public_route_table),
))

DeletionPolicy

The attribute DeletionPolicy is used to control what CloudFormation needs to do when we delete a resource (getting_started/resources_attributes/deletion_policy.py).

t.add_resource(s3.Bucket(
    'S3Bucket',
    DeletionPolicy='Retain'
))

Intrinsic functions

We have already seen a few examples. Troposphere supports all intrinsic functions.

In Troposphere, intrinsic functions are defined as classes and can be imported the same way:

from troposphere import (
  Base64,
  Cidr,
  FindInMap,
  GetAtt,
  GetAZs,
  ImportValue,
  Join,
  Ref,
  Select,
  Split,
  Sub,
  # Conditions functions
  And,
  Equals,
  If,
  Not,
  Or
)

Let’s see how we can use them.

Ref

The Ref intrinsic function is used to reference a Parameter or a Resource (getting_started/intrinsic_functions/ref.py).

from troposphere import Ref

[...]

p_vpc_cidr = t.add_parameter(Parameter('VpcCidr', Type='String'))

r_vpc = t.add_resource(ec2.VPC(
  'VPC',
  CidrBlock=Ref(p_vpc_cidr)
))

GetAtt

The GetAtt intrinsic function is used to retrieve returned values of a Resource (getting_started/intrinsic_functions/get_att.py).

from troposphere import GetAtt

[...]

s3bucket = t.add_resource(s3.Bucket(
  'S3Bucket',
  AccessControl=s3.PublicRead,
  WebsiteConfiguration=s3.WebsiteConfiguration(
    IndexDocument='index.html',
    ErrorDocument='error.html'
  )
))

t.add_output(
  Output(
    'WebsiteURL',
    Value=GetAtt(s3bucket, 'WebsiteURL'),
    Description='URL for website hosted on S3'
  )
)

ImportValue

The ImportValue intrinsic function is used to retrieve exported Output values from another stack (getting_started/intrinsic_functions/import_value.py).

from troposphere import ImportValue

[...]

t.add_resource(
  ec2.Instance(
    'Ec2Instance',
    ImageId='ami-12345678',
    InstanceType='t3.nano',
    KeyName='mykey',
    SubnetId=ImportValue('demo-subnet-az1')
  )
)

Sub

The Sub intrinsic function is comparable to Python string interpolation. Instead of f'hello {name}' we use Sub('hello ${name}'), where the variables are Resource and Parameter names and each value is resolved as it would be with !Ref MyResource (getting_started/intrinsic_functions/sub.py).

from troposphere import Sub

[...]

p_application = t.add_parameter(Parameter('Application', Type='String'))

r_vpc = t.add_resource(ec2.VPC(
  'VPC',
  CidrBlock='10.0.0.0/24',
  Tags=Tags(Name=Sub(f'${{{p_application.title}}}-vpc'))
))

r_subnet_az1 = t.add_resource(ec2.Subnet(
  'SubnetAz1',
  CidrBlock='10.0.0.0/25',
  VpcId=Ref(r_vpc),
  AvailabilityZone='eu-west-1a',
  Tags=Tags(Name=Sub('${var}-vpc', { 'var': Ref(p_application) } ))
))

We can note the use of the triple braces (f'${{{p_application.title}}}-vpc') to have the resolution of p_application.title on one hand and the generation of ${Application}-vpc on the other.
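
The brace escaping can be checked in plain Python, without Troposphere. In this small sketch the variable parameter_title is an assumption standing in for p_application.title.

```python
parameter_title = 'Application'  # stands in for p_application.title

# '{{' and '}}' are escaped braces, '{parameter_title}' is interpolated
sub_expression = f'${{{parameter_title}}}-vpc'

# The same string built by concatenation, which some may find easier to read
concatenated = '${' + parameter_title + '}-vpc'

print(sub_expression)  # ${Application}-vpc
```

Both forms produce the string that CloudFormation will later resolve; which one to use is a matter of taste.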

Join

The Join intrinsic function is used to join elements from a list with a defined string delimiter into a string (getting_started/intrinsic_functions/join.py).

from troposphere import Join

[...]

t.add_resource(ssm.Parameter(
  'SsmPublicSubnetIds',
  Name='SsmPublicSubnetsIdsKey',
  Type='String',
  Value=Join(',', [
    Ref(r_public_subnet_az1),
    Ref(r_public_subnet_az2)
  ])
))

Split

The Split intrinsic function is used to split a string with a specific string delimiter into a list (getting_started/intrinsic_functions/split.py).

from troposphere import Split

[...]

p_vpc_cidr = t.add_parameter(
  Parameter('VpcCidr',
            Type='String',
            Default='10.0.0.0/24')
)
p_subnets_cidr = t.add_parameter(
  Parameter('Subnets',
            # A plain String: Split turns it into a list
            # (a CommaDelimitedList parameter is already a list)
            Type='String',
            Default='10.0.0.0/25,10.0.0.128/25')
)

r_vpc = t.add_resource(ec2.VPC(
    'VPC',
    CidrBlock=Ref(p_vpc_cidr)
))

t.add_resource(
  [
    ec2.Subnet(
      'SubnetAz1',
      CidrBlock=Select(0, Split(',', Ref(p_subnets_cidr))),
      VpcId=Ref(r_vpc),
      AvailabilityZone='eu-west-1a'
    ),
    ec2.Subnet(
      'SubnetAz2',
      CidrBlock=Select(1, Split(',', Ref(p_subnets_cidr))),
      VpcId=Ref(r_vpc),
      AvailabilityZone='eu-west-1b'
    )
  ]
)

Select

The Select intrinsic function is used to retrieve an element inside a list at a defined position (index) (getting_started/intrinsic_functions/select.py).

from troposphere import Select

[...]

p_vpc_cidr = t.add_parameter(
  Parameter('VpcCidr',
            Type='String',
            Default='10.0.0.0/24'))
p_subnets_cidr = t.add_parameter(
  Parameter('Subnets',
            Type='CommaDelimitedList',
             Default='10.0.0.0/25,10.0.0.128/25'))

r_vpc = t.add_resource(ec2.VPC(
  'VPC',
  CidrBlock=Ref(p_vpc_cidr)
))

t.add_resource(
  [
    ec2.Subnet(
      'SubnetAz1',
      CidrBlock=Select(0, Ref(p_subnets_cidr)),
      VpcId=Ref(r_vpc),
      AvailabilityZone='eu-west-1a'
    ),
    ec2.Subnet(
      'SubnetAz2',
      CidrBlock=Select(1, Ref(p_subnets_cidr)),
      VpcId=Ref(r_vpc),
      AvailabilityZone='eu-west-1b'
    )
  ]
)
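
For readers more at ease with Python than with CloudFormation, the semantics of Join, Split and Select can be mimicked with plain string and list operations. This is only an analogy: CloudFormation resolves the real functions at deployment time, and the subnet identifiers below are made up.

```python
subnet_ids = ['subnet-aaaa', 'subnet-bbbb']  # made-up identifiers

# Join(',', [...]) behaves like str.join
joined = ','.join(subnet_ids)

# Split(',', ...) behaves like str.split
parts = joined.split(',')

# Select(1, ...) behaves like zero-based list indexing
second = parts[1]

print(joined)  # subnet-aaaa,subnet-bbbb
print(second)  # subnet-bbbb
```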

GetAZs

The GetAZs intrinsic function is used to retrieve the list of Availability Zones available in a specific region (usually the one where the CloudFormation stack is deployed) (getting_started/intrinsic_functions/get_azs.py).

from troposphere import GetAZs

[...]

t.add_resource(ec2.Subnet(
    'PublicSubnetAz1',
    CidrBlock='10.0.0.0/25',
    VpcId=Ref(r_vpc),
    AvailabilityZone=Select(0, GetAZs())
))

FindInMap

The FindInMap intrinsic function is used to retrieve an element from a Mapping (getting_started/intrinsic_functions/find_in_map.py).

from troposphere import FindInMap

[...]

p_environment = t.add_parameter(Parameter('Environment', Type='String'))

t.add_mapping('EnvironmentMap', {
    'production': {'InstanceType': 't3.micro'},
    'development': {'InstanceType': 'm4.xlarge'}
})

t.add_resource(ec2.Instance(
    'Ec2Instance',
    ImageId='ami-7f418316',
    InstanceType=FindInMap('EnvironmentMap', Ref(p_environment), 'InstanceType')
))

The first parameter is the name of the mapping, the second the TopLevelKey and the third the SecondLevelKey.

Unfortunately, the method add_mapping() does not return anything and therefore we have to copy the string “EnvironmentMap”.
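
One possible workaround, sketched below, is a tiny wrapper that registers the mapping and returns its name. The function add_named_mapping is a hypothetical helper, not part of Troposphere, and FakeTemplate is only a stand-in exposing the same add_mapping(name, mapping) call so that the sketch runs on its own.

```python
def add_named_mapping(template, name, mapping):
    """Register a mapping on the template and return its name for reuse."""
    template.add_mapping(name, mapping)
    return name


class FakeTemplate:
    """Stand-in for a template object (assumption, for the demo only)."""

    def __init__(self):
        self.mappings = {}

    def add_mapping(self, name, mapping):
        self.mappings[name] = mapping


t = FakeTemplate()
m_environment = add_named_mapping(t, 'EnvironmentMap', {
    'production': {'InstanceType': 't3.micro'},
    'development': {'InstanceType': 'm4.xlarge'},
})
print(m_environment)  # EnvironmentMap
```

The returned name (m_environment here) can then be passed as the first argument of FindInMap instead of repeating the string.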

Cidr

The Cidr intrinsic function is used to split a CIDR block into a list of sub-CIDR blocks (getting_started/intrinsic_functions/cidr.py).

from troposphere import Cidr

[...]

r_vpc = t.add_resource(ec2.VPC(
  'VPC',
  CidrBlock='10.0.0.0/24'
))

r_subnet_az1 = t.add_resource(ec2.Subnet(
  'SubnetAz1',
  CidrBlock=Select(0, Cidr(GetAtt(r_vpc, 'CidrBlock'), 2, 7)),
  VpcId=Ref(r_vpc),
  AvailabilityZone='eu-west-1a'
))

The first parameter is the CIDR block to split, the second the number of CIDR blocks to generate and the third the number of subnet bits for each CIDR block (e.g. for /25 it will be 32 - 25 = 7).

In the example above we ask for 2 /25 blocks, which will be 10.0.0.0/25 and 10.0.0.128/25.
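
The arithmetic can be verified locally with the ipaddress module from the standard library. The function below is a plain-Python approximation of what Fn::Cidr returns for IPv4 blocks, written for this article as an assumption-laden sketch; it is not how CloudFormation implements it.

```python
import ipaddress


def cidr(ip_block, count, cidr_bits):
    """Approximate Fn::Cidr: return `count` sub-blocks of `ip_block`."""
    network = ipaddress.ip_network(ip_block)
    # The subnet bits are counted from the end of the address: 32 - 7 = 25
    new_prefix = network.max_prefixlen - cidr_bits
    subnets = network.subnets(new_prefix=new_prefix)
    # zip stops after `count` items, even if more sub-blocks are available
    return [str(subnet) for _, subnet in zip(range(count), subnets)]


blocks = cidr('10.0.0.0/24', 2, 7)
print(blocks)  # ['10.0.0.0/25', '10.0.0.128/25']
```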

Base64

The Base64 intrinsic function is used to encode the input string into Base64. It is usually used to define user data.

[...]

  Instance:
    Type: AWS::EC2::Instance
    Properties:
      [...]
      UserData:
        Fn::Base64: !Sub |
          #!/bin/bash
          echo "Hello"

The first solution is to use Base64 with Join, which requires quite an ugly Python script (getting_started/intrinsic_functions/base64_ex1.py).

from troposphere import Base64

[...]

t.add_resource(ec2.Instance(
  'Ec2Instance',
  ImageId='ami-16fd7026',
  InstanceType='t3.nano',
  KeyName='mykey',
  UserData=Base64(Join('', [
    '#!/bin/bash\n',
    'echo "Hello"\n'
  ])),
))

The second one is to use the helper function userdata.from_file, which takes a script and splits every line to generate the appropriate Base64/Join structure (getting_started/intrinsic_functions/base64_ex2.py).

from troposphere.helpers import userdata

[...]

t.add_resource(ec2.Instance(
  'Ec2Instance',
  ImageId='ami-16fd7026',
  InstanceType='t3.nano',
  KeyName='mykey',
  UserData=userdata.from_file('userdata.sh'),
))

And the third one is to use Base64 with Sub, which will only work for YAML templates (getting_started/intrinsic_functions/base64_ex3.py).

[...]

userdata = """#!/bin/bash
echo "Hello"
"""

t = Template()
t.add_resource(ec2.Instance(
  'Ec2Instance',
  ImageId='ami-16fd7026',
  InstanceType='t3.nano',
  KeyName='mykey',
  UserData=Base64(Sub(userdata))
))

print(t.to_yaml(clean_up=True))

This time, we had to add the parameter clean_up=True to the method to_yaml(), otherwise \n characters will be printed in the generated template.

Using Sub we can add variables to be interpolated (${MyResource}) and redefine variables as well (getting_started/intrinsic_functions/base64_ex4.py):

[...]

p_environment = t.add_parameter(Parameter('Environment', Type='String'))

userdata = """#!/bin/bash
echo "Hello from ${MyVar}"
"""

t.add_resource(ec2.Instance(
  'Ec2Instance',
  ImageId='ami-16fd7026',
  InstanceType='t3.nano',
  KeyName='mykey',
  UserData=Base64(Sub(
    userdata,
    MyVar=Ref(p_environment)
  ))
))

print(t.to_yaml(clean_up=True))

Of course, you could still load the user data from a file (getting_started/intrinsic_functions/base64_ex5.py):

[...]

p_environment = t.add_parameter(Parameter('Environment', Type='String'))

with open("userdata.sh","r") as f:
  t_userdata = f.read()

t.add_resource(ec2.Instance(
  'Ec2Instance',
  ImageId='ami-16fd7026',
  InstanceType='t3.nano',
  KeyName='mykey',
  UserData=Base64(Sub(
    t_userdata,
    MyVar=Ref(p_environment)
  ))
))

print(t.to_yaml(clean_up=True))
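
To get a feel for what the instance eventually receives, the substitution and the encoding can be reproduced locally with the standard library. This is a rough approximation: string.Template also accepts the $name form, which Fn::Sub does not, and the value 'production' is an assumption standing in for the resolved Environment parameter.

```python
import base64
from string import Template as StringTemplate

userdata = """#!/bin/bash
echo "Hello from ${MyVar}"
"""

# Rough equivalent of Fn::Sub with MyVar resolved to 'production'
rendered = StringTemplate(userdata).substitute(MyVar='production')

# Rough equivalent of Fn::Base64
encoded = base64.b64encode(rendered.encode('utf-8')).decode('ascii')

print(rendered)
print(encoded)
```

Decoding the printed value with base64.b64decode gives back the rendered script, which is a quick way to inspect user data copied from a deployed template.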

Conditions intrinsic functions

Conditions can be created with the condition intrinsic functions Equals, Not, And and Or. The If intrinsic function acts much like a ternary operator (if condition then value_1 else value_2).

Let’s see a few examples (getting_started/intrinsic_functions/conditions.py):

from troposphere import Template, Parameter, Ref, Condition, Equals, And, Or, Not, If
from troposphere import ec2

t = Template()

t.add_parameter(
  [
    Parameter(
      'One',
      Type='String',
    ),
    Parameter(
      'Two',
      Type='String',
    ),
    Parameter(
      'Three',
      Type='String',
    ),
    Parameter(
      'Four',
      Type='String',
    ),
    Parameter(
      'SshKeyName',
      Type='String',
    )
  ]
)

t.add_condition('OneEqualsFoo',
  Equals(
    Ref('One'),
    'Foo'
  )
)

t.add_condition('NotOneEqualsFoo',
  Not(
    Condition('OneEqualsFoo')
  )
)

t.add_condition('BarEqualsTwo',
  Equals(
    'Bar',
    Ref('Two')
  )
)

t.add_condition('ThreeEqualsFour',
  Equals(
    Ref('Three'),
    Ref('Four')
  )
)

t.add_condition('OneEqualsFooOrBarEqualsTwo',
  Or(
    Condition('OneEqualsFoo'),
    Condition('BarEqualsTwo')
  )
)

t.add_condition('OneEqualsFooAndNotBarEqualsTwo',
  And(
    Condition('OneEqualsFoo'),
    Not(Condition('BarEqualsTwo'))
  )
)

t.add_condition('OneEqualsFooAndBarEqualsTwoAndThreeEqualsPft',
  And(
    Condition('OneEqualsFoo'),
    Condition('BarEqualsTwo'),
    Equals(Ref('Three'), 'Pft')
  )
)

t.add_condition('OneIsQuzAndThreeEqualsFour',
  And(
    Equals(Ref('One'), 'Quz'),
    Condition('ThreeEqualsFour')
  )
)

t.add_condition('LaunchInstance',
  And(
    Condition('OneEqualsFoo'),
    Condition('NotOneEqualsFoo'),
    Condition('BarEqualsTwo'),
    Condition('OneEqualsFooAndNotBarEqualsTwo'),
    Condition('OneIsQuzAndThreeEqualsFour')
  )
)

t.add_condition('LaunchWithGusto',
  And(
    Condition('LaunchInstance'),
    Equals(Ref('One'), 'Gusto')
  )
)

t.add_resource(
  ec2.Instance(
    'Ec2Instance',
    Condition='LaunchInstance',
    ImageId=If('OneEqualsFoo', 'ami-12345678', 'ami-87654321'),
    InstanceType='t1.micro',
    KeyName=Ref('SshKeyName')
  )
)
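
The logic of these conditions can be followed in plain Python. The sketch below evaluates a few of them for given parameter values; the function name evaluate and the sample values are assumptions made for the example, and CloudFormation of course performs the real evaluation at deployment time.

```python
def evaluate(params):
    """Evaluate some of the conditions above for the given parameter values."""
    c = {}
    c['OneEqualsFoo'] = params['One'] == 'Foo'                  # Equals
    c['NotOneEqualsFoo'] = not c['OneEqualsFoo']                # Not
    c['BarEqualsTwo'] = 'Bar' == params['Two']                  # Equals
    c['ThreeEqualsFour'] = params['Three'] == params['Four']    # Equals
    c['OneEqualsFooOrBarEqualsTwo'] = (
        c['OneEqualsFoo'] or c['BarEqualsTwo']                  # Or
    )
    c['OneEqualsFooAndNotBarEqualsTwo'] = (
        c['OneEqualsFoo'] and not c['BarEqualsTwo']             # And + Not
    )
    return c


conditions = evaluate({'One': 'Foo', 'Two': 'Baz', 'Three': 'x', 'Four': 'x'})

# If(condition, value_if_true, value_if_false) reads like a ternary
image_id = 'ami-12345678' if conditions['OneEqualsFoo'] else 'ami-87654321'

print(conditions['OneEqualsFooAndNotBarEqualsTwo'])  # True
print(image_id)  # ami-12345678
```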

Policy Documents

Troposphere does not offer any type checking feature to define Policy Documents.

To create a Policy Document we have to resort to a simple dict (getting_started/policies_ex1.py).

[...]

t.add_resource(Role(
  'Role1',
  AssumeRolePolicyDocument={
    'Statement': [
      {
        'Principal': {
          'Service': [
            'ec2.amazonaws.com'
          ]
        },
        'Effect': 'Allow',
        'Action': [
          'sts:AssumeRole'
        ]
      }
    ]
  }
))
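
Since the document is a plain dict, regular Python functions can remove the repetition. The helper assume_role_policy below is hypothetical (it is not part of Troposphere); it returns a trust policy for one or more services and adds the Version element, which the example above leaves out.

```python
import json


def assume_role_policy(*services):
    """Return a trust policy allowing the given services to assume a role."""
    return {
        'Version': '2012-10-17',  # current IAM policy language version
        'Statement': [
            {
                'Effect': 'Allow',
                'Principal': {'Service': list(services)},
                'Action': ['sts:AssumeRole'],
            }
        ],
    }


policy = assume_role_policy('ec2.amazonaws.com')
print(json.dumps(policy, indent=2))
```

The returned dict can be passed as is to AssumeRolePolicyDocument, exactly like the literal dict shown above.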

But Troposphere can be used with the awacs library for easier creation of AWS Access Policy Language JSON (getting_started/policies_ex2.py).

from awacs.aws import Allow, Statement, Principal, Policy
from awacs.sts import AssumeRole

[...]

t.add_resource(Role(
  'Role2',
  AssumeRolePolicyDocument=Policy(
    Statement=[
      Statement(
        Effect=Allow,
        Action=[AssumeRole],
        Principal=Principal('Service', ['ec2.amazonaws.com'])
      )
    ]
  )
))

As awacs is not the topic of this article and would require another long one, we won’t spend more time on it. Let’s just say that its documentation is even worse than Troposphere’s and the examples are sparse. The only way to learn how to use awacs is to look at the source code, which fortunately is quite easy to understand.

Generating files

Until now we have printed all CloudFormation templates to the console, but it is usually not what we want. To store the result in a file we only have to use regular Python code (getting_started/generate_template.py):

from troposphere import Template
from troposphere import ec2

t = Template()
t.add_resource(ec2.VPC(
  'VPC',
  CidrBlock='10.0.0.0/24'
))

with open('sample.cfn.yml', 'w') as f:
  f.write(t.to_yaml())

with open('sample.cfn.json', 'w') as f:
  f.write(t.to_json())
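
When templates are written to a dedicated output directory, that directory may not exist yet. The sketch below uses pathlib to create it on demand; write_template is a hypothetical helper, and a literal string replaces the result of t.to_yaml() so that the snippet runs on its own.

```python
import tempfile
from pathlib import Path


def write_template(content, path):
    """Write a rendered template, creating parent directories if needed."""
    target = Path(path)
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(content, encoding='utf-8')
    return target


# In a real script, content would be t.to_yaml() or t.to_json()
content = 'Resources:\n  VPC:\n    Type: AWS::EC2::VPC\n'

# A temporary directory keeps the demo from touching the project tree
with tempfile.TemporaryDirectory() as tmp:
    written = write_template(content, Path(tmp) / '.cfn' / 'sample.cfn.yml')
    print(written.name)  # sample.cfn.yml
    print(written.read_text(encoding='utf-8'))
```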

AWS CloudFormation templates to Troposphere

Along with the library, Troposphere provides a script to transform CloudFormation templates (in JSON only) into Python/Troposphere code.

In our pipenv, we simply call cfn2py <template_name>.json to print the Python code to the console.

It is a great start if you already have a lot of CloudFormation templates and want to migrate them to Troposphere.

If you have CloudFormation templates in YAML, you can edit the script cfn2py.py and even create a Pull Request for the open issue #1366.

Unsupported Resources

As Troposphere is not actively maintained, some new AWS resources or new properties may be missing. In this case, the fastest solution is to make the modification on our side while waiting for a new release.

If we only need to add a new property, the easiest is to copy/paste the class in our project and add the new property.

Troposphere has two base classes:

AWSObject, inherited by each AWS resource class (e.g. cloudfront.Distribution, ec2.Instance or s3.Bucket), and
AWSProperty, inherited by each complex type (object) defined inside a resource

For example:

class Distribution(AWSObject):
  resource_type = "AWS::CloudFront::Distribution"

  props = {
    'DistributionConfig': (DistributionConfig, True),
    'Tags': ((Tags, list), False),
  }
 
class DistributionConfig(AWSProperty):
  props = {
    'Aliases': (list, False),
    'CacheBehaviors': ([CacheBehavior], False),
    'Comment': (basestring, False),
    'CustomErrorResponses': ([CustomErrorResponse], False),
    'DefaultCacheBehavior': (DefaultCacheBehavior, True),
    'DefaultRootObject': (basestring, False),
    'Enabled': (boolean, True),
    'HttpVersion': (basestring, False),
    'IPV6Enabled': (boolean, False),
    'Logging': (Logging, False),
    'Origins': ([Origin], True),
    'OriginGroups': (OriginGroups, False),
    'PriceClass': (priceclass_type, False),
    'Restrictions': (Restrictions, False),
    'ViewerCertificate': (ViewerCertificate, False),
    'WebACLId': (basestring, False),
  }

AWSObject has two class attributes:

resource_type (e.g. AWS::CloudFront::Distribution) and
props

AWSProperty has only one class attribute: props.

The props attribute is a dict where the key is the name of the CloudFormation property and the value is a tuple of two elements:

the type of the property (a class inheriting AWSProperty) or a validator (a function defined in the module validators) which validates the data provided.
a boolean indicating if the property is required (True) or optional (False)
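
To illustrate how such a props declaration can drive validation, here is a deliberately simplified model. It is not Troposphere’s implementation: the classes MiniProperty and Logging, the validate method and the use of str instead of basestring are all assumptions made for the sketch.

```python
class MiniProperty:
    """Simplified stand-in for a class declaring `props`."""

    props = {}

    def __init__(self, **kwargs):
        for name, value in kwargs.items():
            if name not in self.props:
                raise AttributeError(
                    f'{type(self).__name__} has no property {name}')
            expected_type, _required = self.props[name]
            # First tuple element: the expected type of the property
            if not isinstance(value, expected_type):
                raise TypeError(f'{name} must be of type {expected_type}')
        self.properties = kwargs

    def validate(self):
        # Second tuple element: True if the property is required
        for name, (_type, required) in self.props.items():
            if required and name not in self.properties:
                raise ValueError(f'{name} is required')


class Logging(MiniProperty):
    props = {
        'Bucket': (str, True),
        'Prefix': (str, False),
    }


logging_config = Logging(Bucket='my-logs.s3.amazonaws.com', Prefix='cdn/')
logging_config.validate()
print(logging_config.properties)
```

Adding a missing property to a copied class then comes down to adding one entry to its props dict.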

If we need a completely new AWS service, things are a little bit more complicated. Depending on the service, writing everything by hand can be time consuming and error prone. Fortunately, AWS provides a CloudFormation resource specification (the definition of each CloudFormation resource in a single JSON file) and Troposphere a script (gen.py) to generate the classes from this JSON file.

Here are the steps to generate Troposphere classes from the CloudFormation resource specification:

clone the repository (the gen.py script is not present when Troposphere is installed as a library)
download the JSON file (usually the one for Ohio is fine)
edit the file to keep only the resources you want to generate
execute python gen.py CloudFormationResourceSpecification.json

And here we go!
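
The manual editing step can be scripted. The sketch below assumes the specification keeps its definitions under the top-level keys ResourceTypes and PropertyTypes, with names prefixed by the service (e.g. AWS::CloudFront::Distribution); the function filter_specification is hypothetical, and whether gen.py needs the shared Tag property type has not been verified here, so it is simply kept when present.

```python
import json


def filter_specification(spec, prefixes):
    """Keep only the resource and property types matching the prefixes."""

    def keep(name):
        # 'Tag' is the shared tag type; it is kept as a precaution
        return name == 'Tag' or any(name.startswith(p) for p in prefixes)

    filtered = dict(spec)
    for section in ('ResourceTypes', 'PropertyTypes'):
        filtered[section] = {
            name: definition
            for name, definition in spec.get(section, {}).items()
            if keep(name)
        }
    return filtered


# Tiny made-up specification, so the sketch runs without downloading anything
spec = {
    'ResourceSpecificationVersion': '0.0.0',
    'ResourceTypes': {
        'AWS::CloudFront::Distribution': {},
        'AWS::EC2::VPC': {},
    },
    'PropertyTypes': {
        'AWS::CloudFront::Distribution.Logging': {},
        'AWS::EC2::Instance.Volume': {},
        'Tag': {},
    },
}

filtered = filter_specification(spec, ['AWS::CloudFront::'])
print(json.dumps(filtered, indent=2, sort_keys=True))
```

With the real file, the input would be loaded with json.load and the result written back with json.dump before running gen.py on it.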

Deploying generated AWS CloudFormation templates

Troposphere provides the script cfn, which uses the unmaintained boto library (not to be confused with boto3) to deploy CloudFormation templates. Needless to say, we are better off using an alternative like Ansible (see CloudFormation with Ansible) or even creating our own deploy/delete feature to avoid using an extra tool.

We can start with something really simple (deploy/myapp.py):

from troposphere import Template, Parameter, Tags, Sub
import troposphere.ec2 as ec2
from troposphere.constants import STRING

import cfn

t = Template()
t.set_version()

p_account_code = t.add_parameter(Parameter('AccountCode', Type=STRING))
p_region_code = t.add_parameter(Parameter('RegionCode', Type=STRING))
p_application = t.add_parameter(Parameter('Application', Type=STRING))

t.add_resource(ec2.VPC(
  'VPC',
  CidrBlock='10.0.0.0/16',
  Tags=Tags(
    Application='demo-app',
    Name=Sub('${AccountCode}-${RegionCode}-${Application}-vpc')
  )
))

if __name__ == "__main__":
  cfn.generate(t, '.cfn/sample.yml', cfn.TemplateFormat.YAML)

  cfn.deploy('l-ue2-mydemostack', '.cfn/sample.yml',
             template_parameters={
               'AccountCode': 'l',
               'RegionCode': 'ue2',
               'Application': 'demo'
             },
             tags={
               'Name': 'l-ue2-mydemostack',
               'Application': 'demo'
             },
             profile='spikeseed-labs', region='us-east-2'
            )

In a single file we define our template and in the main we call a generate() function and a deploy() function to which we can pass template parameters and tags, additionally to the stack name, template path, profile and region to use to deploy the stack.

The generate() function only generates the CloudFormation template.

The deploy() function handles the stack creation and update.
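
The cfn module used above is not reproduced in this article. As a rough idea of what a create-or-update function can look like, here is a sketch written against the call names of the boto3 CloudFormation client (describe_stacks, create_stack, update_stack, get_waiter). Everything else is an assumption: the function names, the detection of a missing stack through the error message, and the FakeClient stand-in that lets the sketch run without AWS credentials.

```python
def to_cfn_parameters(parameters):
    """Convert {'Key': 'value'} into the list format expected by the API."""
    return [{'ParameterKey': k, 'ParameterValue': v}
            for k, v in parameters.items()]


def to_cfn_tags(tags):
    return [{'Key': k, 'Value': v} for k, v in tags.items()]


def stack_exists(client, stack_name):
    try:
        client.describe_stacks(StackName=stack_name)
        return True
    except Exception as error:  # with boto3, a botocore ClientError
        if 'does not exist' in str(error):
            return False
        raise


def deploy(client, stack_name, template_body, template_parameters=None, tags=None):
    """Create the stack if it is missing, update it otherwise."""
    arguments = {
        'StackName': stack_name,
        'TemplateBody': template_body,
        'Parameters': to_cfn_parameters(template_parameters or {}),
        'Tags': to_cfn_tags(tags or {}),
    }
    if stack_exists(client, stack_name):
        client.update_stack(**arguments)
        waiter_name = 'stack_update_complete'
    else:
        client.create_stack(**arguments)
        waiter_name = 'stack_create_complete'
    client.get_waiter(waiter_name).wait(StackName=stack_name)
    return waiter_name


class FakeWaiter:
    def __init__(self, calls, name):
        self.calls, self.name = calls, name

    def wait(self, StackName):
        self.calls.append(self.name)


class FakeClient:
    """In-memory stand-in for the CloudFormation client (demo only)."""

    def __init__(self):
        self.stacks, self.calls = {}, []

    def describe_stacks(self, StackName):
        if StackName not in self.stacks:
            raise RuntimeError(f'Stack with id {StackName} does not exist')
        return {'Stacks': [self.stacks[StackName]]}

    def create_stack(self, **kwargs):
        self.stacks[kwargs['StackName']] = kwargs
        self.calls.append('create_stack')

    def update_stack(self, **kwargs):
        self.stacks[kwargs['StackName']] = kwargs
        self.calls.append('update_stack')

    def get_waiter(self, name):
        return FakeWaiter(self.calls, name)


client = FakeClient()
first = deploy(client, 'l-ue2-mydemostack', 'Resources: {}',
               {'Application': 'demo'}, {'Name': 'l-ue2-mydemostack'})
second = deploy(client, 'l-ue2-mydemostack', 'Resources: {}')
print(first, second)
```

With real credentials, the client would come from boto3.Session(profile_name=..., region_name=...).client('cloudformation'). A production version would also need to handle the error CloudFormation raises when an update contains no changes, and to pass Capabilities when the template creates IAM resources.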

From there we can build something more complex/useful like:

using script parameters to be able to deploy specific stacks
using SSM Parameter store to store and retrieve elements from other stacks
packaging AWS Lambdas and Layers
having a mono-repository smart deployment - considering what has changed and what needs to be deployed to not deploy everything every time
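
The first item of this list can be sketched with argparse from the standard library. The stack names, template paths, option names and default region below are assumptions chosen for the example; the parsed values would then be handed to the deployment code.

```python
import argparse

# Hypothetical registry: stack name -> generated template path
STACKS = {
    'network': '.cfn/network.yml',
    'application': '.cfn/application.yml',
}


def parse_arguments(argv=None):
    parser = argparse.ArgumentParser(description='Deploy selected stacks')
    parser.add_argument('stacks', nargs='+', choices=sorted(STACKS),
                        help='names of the stacks to deploy')
    parser.add_argument('--profile', default=None,
                        help='AWS profile to use')
    parser.add_argument('--region', default='us-east-2',
                        help='AWS region to deploy to')
    parser.add_argument('--dry-run', action='store_true',
                        help='only generate the templates')
    return parser.parse_args(argv)


# Passing a list instead of relying on sys.argv keeps the demo reproducible
args = parse_arguments(['network', '--region', 'eu-west-1'])
for name in args.stacks:
    print(name, STACKS[name], args.region, args.dry_run)
```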

Conclusion

Troposphere is comparable to AWS CloudFormation templates, but in Python - offering the features of a programming language. At this point you may wonder why would you use Troposphere instead of the AWS CDK, an AWS product, actively developed, well documented, which can be used with multiple programming languages and can do exactly what Troposphere can do and even more. And to be honest, the reasons are mostly philosophical.

The AWS CDK is written in TypeScript - and even if you can use another programming language to write your code, you still need to install Node.js - and is composed of hundreds of small modules. On the other side, Troposphere is a single, extra small, very easy to use Python library. And in this article, you’ve learned everything you need to know to use it and even how to extend it as much as you want.

In summary, you will choose Troposphere if:

your project is mainly in Python
you don’t want an extra dependency on Node.js
you want to write templates using the same keywords as CloudFormation and without having to use extra constructs to only generate CloudFormation templates
you want to create your own “do it all” tool extending Troposphere’s capabilities.