I’ve spent the last few months trying to bend HashiCorp’s Terraform to my will. It is a great tool for DevOps and cloud architecture and a practical route to ‘infrastructure as code’. With Terraform you write out the infrastructure you want, in my case for AWS, and it builds a dependency graph from your files to make sure you have what you need, then systematically calls the AWS API to create the required parts in the required order.

There are a couple of ‘gotchas’ to it, though.

  • As of this writing it does not have much in the way of throttling, because AWS does not respond to API calls with any throttling-related information. This means you can try to create too much infrastructure at once and AWS will start dropping your requests. So far the only solution I have found is to have your scripts slow down your network speed, which I do by running the Linux tool tc before and after Terraform (remember to restore networking to its original condition when done!).

  • Because Terraform does not create a CloudFormation file and instead makes each API call individually, it must track your infrastructure in a state file. If you lose this state file, you have to delete or update resources manually. On the plus side, because it does things individually it is not bound to CloudFormation’s 200 object limit, and it provides an easy mechanism for backing up your state file.
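
To make the tc trick from the first gotcha concrete, here is a minimal sketch of a wrapper that shapes traffic before a run and always restores it afterwards. It is not my exact script. The interface name eth0, the 1mbit rate, the burst and latency values, and the helper names (shape_cmd, restore_cmd, throttled) are all assumptions for illustration, and tc needs root to run.

```python
import subprocess
from contextlib import contextmanager


def shape_cmd(dev, rate):
    # Attach a token bucket filter (tbf) as the root qdisc to cap egress bandwidth.
    # The burst and latency values are illustrative assumptions, not tuned numbers.
    return ["tc", "qdisc", "add", "dev", dev, "root", "tbf",
            "rate", rate, "burst", "32kbit", "latency", "400ms"]


def restore_cmd(dev):
    # Deleting the root qdisc puts the interface back on its default queueing.
    return ["tc", "qdisc", "del", "dev", dev, "root"]


@contextmanager
def throttled(dev="eth0", rate="1mbit", run=subprocess.check_call):
    """Slow the interface down for the duration of the with-block.

    `run` is injectable so the wrapper can be exercised without root.
    """
    run(shape_cmd(dev, rate))
    try:
        yield
    finally:
        # Runs even if Terraform fails, so networking is always restored.
        run(restore_cmd(dev))


# Hypothetical usage (requires root and a real interface):
#
#     with throttled("eth0", "1mbit"):
#         subprocess.check_call(["terraform", "apply"])
```

The try/finally is the important part: if the apply blows up halfway through, the interface still goes back to normal.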

One design and implementation problem we ran into was that our environment had a ‘many to many’ relationship between workstation VPCs, which data scientists connected to, and lab VPCs, which hosted our data science tools and data. For security reasons we needed to limit which workstations could reach which labs, and there is no easy way to do this in Terraform. After thinking on it for a bit I told the team I had an idea, but it was a bit ‘terrifying’. The idea was to use Puppet as a templating engine for generating our Terraform files, which gave me the ability to use embedded Ruby to dynamically calculate security groups, NACLs, and networking routes. As a joke I named it ‘terrafying’, but frankly, in the end, it worked quite well, and I later discovered that other organizations with the same problem had solved it in similar ways.
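
To give a feel for the generation step, here is a small sketch of the same idea in Python rather than the Puppet and embedded Ruby we actually used. It expands an access map into one aws_security_group_rule per allowed workstation-to-lab pair and emits Terraform's JSON syntax (a .tf.json file). The VPC names, CIDR blocks, port, and function names are all hypothetical, and it assumes a matching aws_security_group resource for each lab is defined elsewhere.

```python
import json

# Hypothetical access map: which workstation VPCs may reach which lab VPCs.
ACCESS = {
    "ws_alpha": ["lab_nlp", "lab_vision"],
    "ws_beta": ["lab_vision"],
}

# Hypothetical CIDR block for each workstation VPC.
WORKSTATION_CIDRS = {
    "ws_alpha": "10.1.0.0/16",
    "ws_beta": "10.2.0.0/16",
}


def ingress_rules(access, cidrs, port=443):
    """Build one ingress rule per allowed (workstation, lab) pair."""
    rules = {}
    for ws in sorted(access):
        for lab in sorted(access[ws]):
            rules[f"{ws}_to_{lab}"] = {
                "type": "ingress",
                "protocol": "tcp",
                "from_port": port,
                "to_port": port,
                "cidr_blocks": [cidrs[ws]],
                # Assumes aws_security_group.<lab> is declared in another file.
                "security_group_id": "${aws_security_group.%s.id}" % lab,
            }
    return {"resource": {"aws_security_group_rule": rules}}


def render(access, cidrs):
    """Serialize the rules as Terraform JSON, ready to write to a .tf.json file."""
    return json.dumps(ingress_rules(access, cidrs), indent=2, sort_keys=True)


# Hypothetical usage:
#
#     with open("lab_access.tf.json", "w") as fh:
#         fh.write(render(ACCESS, WORKSTATION_CIDRS))
```

Pairs that are not in the map simply never get a rule, which is the whole point: the many-to-many logic lives in ordinary code, and Terraform only ever sees the flat list of resources.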