Deploying and Destroying Infrastructure with AWS CDK and Route 53 Hosted Zones
When I decided to try out AWS CDK, I set one goal: reproducibly deploy and destroy infrastructure from a single repository using AWS CDK and GitHub Actions. Not every environment needs this workflow, but I made it a requirement for testing.
Spinning infrastructure up and down easily is useful for rapid development and testing. Prod-like but ephemeral environments provide a convenient way to expose and troubleshoot bugs that may not appear in local development environments. The key with these environments is convenience, and a core part of that convenience is being able to destroy the infrastructure as easily as you deploy it.
This was my first time working with GitHub Actions, AWS CDK, and TypeScript, but the most interesting challenges in meeting my requirement came from AWS CloudFormation.
Like many scary stories, this one starts with DNS.
To accomplish my requirement, I wanted a DNS stack to create a hosted zone for my domain and a certificate stack to create an SSL/TLS certificate in AWS Certificate Manager.
In AWS CDK, a stack allows you to define and deploy resources together. Think Terraform modules. Design, organize, and reuse your CDK code with stacks.
I prototyped different infrastructure deployments locally with VPC, RDS, and ECS Fargate stacks. Again, this was my first time using AWS CDK, and at this point, I was impressed. AWS CDK Constructs moved the abstraction of infrastructure creation to a higher level. This made it easier to hold a mental model of what each stack was doing, and it was less fatiguing than YAML or even a declarative configuration language such as HCL. Plus, I was working in a general-purpose programming language, TypeScript, and could potentially take that experience to non-DevOps projects.
The AWS Cloud Development Kit Library contains constructs for specific AWS resources, and these constructs come in three levels of abstraction over those resources: L1, L2, and L3.
The L3 Constructs, in particular, which AWS calls patterns, reduced creating an AWS Fargate service, its Application Load Balancer, and various supporting resources to a single constructor call:
export class ECSStack extends Stack {
  // ...
    const fargateService = new ApplicationLoadBalancedFargateService(
      this,
      "fargateService",
      {
        cluster,
        domainName: subDomain,
        domainZone: props.hostedZone,
        recordType: ApplicationLoadBalancedServiceRecordType.ALIAS,
        desiredCount: 2,
        taskImageOptions: {
          image: ContainerImage.fromRegistry(
            containerImage + `${process.env.RELEASE || "latest"}`,
          ),
          containerPort: containerPort,
          enableLogging: true,
          secrets: {
            DSN: ecsSecret.fromSecretsManager(creds),
          },
        },
        publicLoadBalancer: true,
        assignPublicIp: true,
        serviceName: "fargateService",
        certificate: props.cert,
      },
    );
  // ...
}
It’s also possible to create your own constructs, so a custom L3 construct could enforce best practices for resource creation within a team or company.
To write these initial stacks, I really only needed the API Reference; repositories in AWS organizations such as AWS Samples, CDK Labs at AWS, and Amazon Web Services - Labs; the CDK repo; and the Amazon ECS Workshop demos.
If my testing requirements had ended with deploying VPC, RDS, and ECS Fargate stacks, I would have walked away thinking AWS CDK had me covered. But when I added the DNS and certificate stacks, deployment stalled at certificate creation, and attempting to destroy the infrastructure produced the error:
The specified hosted zone contains non-required resource record sets and so cannot be deleted.
Initially, I seemed to have two issues: certificate creation and pesky record sets in the hosted zone.
I focused on what was stalling the deployment first. After a few dig commands, head scratches, and searches, I came across a discussion about name servers and Route 53. If I’m going to test by deploying and destroying infrastructure, the name servers in the hosted zone and at the Route 53 domain registrar need to match, but Route 53 allocates a new set of four name servers every time a new hosted zone is created, which is what my DNS stack was doing. Certificate creation had stalled because of this mismatch: with the registrar still pointing at the old name servers, the DNS validation record in the new hosted zone could not be resolved. So what I thought were two separate service issues were both Route 53 issues.
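The check I was doing by hand with dig boils down to comparing two lists of name servers. Here is a minimal sketch of that comparison, assuming you already have both lists as strings; the function names are my own for illustration and are not part of any AWS SDK:

```typescript
// Hypothetical helper: lowercases a host name and drops the trailing dot
// so "NS-1.awsdns-01.org." and "ns-1.awsdns-01.org" compare as equal.
const normalizeHost = (host: string): string =>
  host.trim().toLowerCase().replace(/\.$/, "");

// Returns true when both lists hold the same name servers, in any order.
function nameServersMatch(hostedZoneNs: string[], registrarNs: string[]): boolean {
  const zone = hostedZoneNs.map(normalizeHost).sort();
  const registrar = registrarNs.map(normalizeHost).sort();
  return (
    zone.length === registrar.length &&
    zone.every((ns, i) => ns === registrar[i])
  );
}
```

If this returns false after a fresh deploy, the registrar is still pointing at the name servers of a hosted zone that no longer exists.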
After verifying in the AWS Console that this was what was blocking deployment and suspecting this had to be a problem others faced, I searched the repositories mentioned above for any use case involving Route 53 hosted zones and updating name servers, and I found an example of a custom resource that does exactly this.
AWS CDK code transpiles or “synthesizes” to CloudFormation templates, and CloudFormation does the provisioning work. CloudFormation stacks run basic create, update, and delete operations. A CloudFormation custom resource allows us to create our own customized provisioning logic, the most basic of which is an API call, such as UpdateDomainNameservers, used in the custom resource above.
In AWS CDK, a custom resource executes using a special Lambda function that must handle these create, update, and delete operations. As we can see in this custom resource, the Create and Update operations make the UpdateDomainNameservers API call, and Delete simply returns, as this custom resource’s purpose is to create or update a resource:
export async function handler(event: any): Promise<any> {
  const { domain, nameServers } = event.ResourceProperties;
  switch (event.RequestType) {
    case 'Create':
      await updateRegisteredNameServers(domain, nameServers);
      return { PhysicalResourceId: nameServers };
    case 'Update':
      // Only call the registrar when the name servers actually changed.
      if (event.PhysicalResourceId !== nameServers) await updateRegisteredNameServers(domain, nameServers);
      return { PhysicalResourceId: nameServers };
    default:
      // Delete: nothing to undo at the registrar.
      return;
  }
}
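To convince myself of which request types actually reach the registrar, it helps to pull the decision out of the handler into a pure function that can be checked without deploying anything. This is only a sketch: the event interface below is a trimmed-down assumption covering the fields this handler reads, and shouldUpdateRegistrar is a hypothetical name.

```typescript
// Assumed minimal shape of the custom resource event used by the handler above.
interface NameServerEvent {
  RequestType: "Create" | "Update" | "Delete";
  PhysicalResourceId?: string;
  ResourceProperties: { domain: string; nameServers: string };
}

// Mirrors the switch in the handler: Create always updates, Update only
// when the name servers differ from the stored physical ID, Delete never.
function shouldUpdateRegistrar(event: NameServerEvent): boolean {
  switch (event.RequestType) {
    case "Create":
      return true;
    case "Update":
      return event.PhysicalResourceId !== event.ResourceProperties.nameServers;
    default:
      return false;
  }
}
```

Because the physical resource ID is the joined name server string, an Update with unchanged name servers becomes a no-op.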
Now that I’ve added a file for this custom resource, I’ll update my DNS stack to use it:
export class DnsStack extends Stack {
  // ...
    this.updateRegDomain(rootDomain, this.hostedZone);
  }

  /** A Custom Resource to update the Domain registrar with Hosted Zone name servers */
  updateRegDomain(domain: string, hostedZone: HostedZone) {
    const provider = new custom_resources.Provider(this, "Provider", {
      onEventHandler: new aws_lambda_nodejs.NodejsFunction(
        this,
        "UpdateRegDomain",
        {
          initialPolicy: [
            new PolicyStatement({
              actions: ["route53domains:UpdateDomainNameservers"],
              resources: ["*"],
            }),
          ],
        },
      ),
    });
    new CustomResource(this, "CustomResource", {
      serviceToken: provider.serviceToken,
      properties: {
        domain,
        nameServers: Fn.join(",", hostedZone.hostedZoneNameServers!),
      },
    });
  }
}
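Note that Fn.join hands the Lambda a single comma-separated string, while the UpdateDomainNameservers API expects a DomainName plus a list of Nameservers objects with a Name field. The body of updateRegisteredNameServers isn't shown here, so the following is only a sketch of the conversion it would need; toUpdateParams and the two interfaces are names I made up for illustration:

```typescript
// Local stand-ins for the request shape of UpdateDomainNameservers.
interface Nameserver {
  Name: string;
}
interface UpdateNameserversParams {
  DomainName: string;
  Nameservers: Nameserver[];
}

// Splits the comma-joined string from the custom resource properties
// into the list shape the Route 53 Domains API call takes.
function toUpdateParams(domain: string, joinedNameServers: string): UpdateNameserversParams {
  const Nameservers = joinedNameServers
    .split(",")
    .map((ns) => ns.trim())
    .filter((ns) => ns.length > 0)
    .map((Name) => ({ Name }));
  return { DomainName: domain, Nameservers };
}
```

The resulting object could then be passed to the SDK client for the Route 53 Domains service, which is what the route53domains:UpdateDomainNameservers permission above is for.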
With this addition to my AWS CDK code, I’m now able to test deployments the way I set out to, but there’s still one more problem: the error output mentioned above. Something is blocking the deletion of the certificate stack. Is it DNS again? Actually, it’s CloudFormation.
In the AWS Certificate Manager documentation, I see that ACM creates a CNAME record for certificate validation through DNS, and this record is not deleted with a stack delete operation. This is known behavior that the CloudFormation team may or may not change, judging by the open ticket on the CloudFormation coverage roadmap. There have also been tickets created for it in the AWS CDK repo.
This time, however, I’m facing this problem with custom resources in my toolkit, so inspired by an additional custom resource in the CDK repo, I created a custom resource that handles the delete operation by removing the certificate’s validation record and simply returns on create and update operations, as my custom resource’s purpose is to delete a resource:
import { Route53 } from "aws-sdk";

const route53 = new Route53({ region: "us-east-1" });
const certRecordType: string = "CNAME";
const certRecordTTL: number = 300;
const changeAction: string = "DELETE";

/**
 * AWS Certificate Manager creates a CNAME resource record set for certificate validation through DNS.
 * Currently, ACM cannot delete this record, which blocks the deletion of the stack's hosted zone:
 * https://github.com/aws-cloudformation/cloudformation-coverage-roadmap/issues/837.
 * This function deletes the CNAME record created by ACM.
 */
const deleteCertificateRecord = (
  hostedZoneId: string,
  name: string,
  value: string,
) =>
  route53
    .changeResourceRecordSets({
      HostedZoneId: hostedZoneId,
      ChangeBatch: {
        Changes: [
          {
            Action: changeAction,
            ResourceRecordSet: {
              Name: name,
              Type: certRecordType,
              TTL: certRecordTTL,
              ResourceRecords: [{ Value: value }],
            },
          },
        ],
      },
    })
    .promise();

const listResourceRecordSets = (hostedZoneId: string) =>
  route53.listResourceRecordSets({ HostedZoneId: hostedZoneId }).promise();
export async function handler(event: any): Promise<any> {
  const { hostedZoneId } = event.ResourceProperties;
  // Create and Update are no-ops: this resource only acts on Delete.
  if (event.RequestType !== "Delete") {
    return;
  }
  const recordSetsList = await listResourceRecordSets(hostedZoneId);
  // A zone with only its default NS and SOA records has nothing extra to remove.
  if (recordSetsList.ResourceRecordSets.length < 3) {
    return;
  }
  const certRecord = recordSetsList.ResourceRecordSets.find(
    (r) => r.Type === certRecordType,
  );
  const certRecordName = certRecord?.Name;
  const value = certRecord?.ResourceRecords?.find(Boolean)?.Value;
  // Guard against a missing CNAME rather than sending undefined values to Route 53.
  if (!certRecordName || !value) {
    return;
  }
  await deleteCertificateRecord(hostedZoneId, certRecordName, value);
}
I’ll call this custom resource from a new Certificate Delete stack:
export class CertStackDelete extends Stack {
  // ...
    this.deleteCertRecord(rootDomain, props.hostedZone);
  }

  deleteCertRecord(domain: string, hostedZone: HostedZone) {
    const provider = new custom_resources.Provider(this, "Provider", {
      onEventHandler: new aws_lambda_nodejs.NodejsFunction(
        this,
        "DeleteCertRecord",
        {
          initialPolicy: [
            new PolicyStatement({
              actions: [
                "route53:changeResourceRecordSets",
                "route53:listResourceRecordSets",
              ],
              resources: ["*"],
            }),
          ],
        },
      ),
    });
    new CustomResource(this, "CustomResource", {
      serviceToken: provider.serviceToken,
      properties: {
        hostedZoneId: hostedZone.hostedZoneId,
      },
    });
  }
}
Granted, my first attempt at a custom resource is a little hacky since it relies on knowing that ACM only creates one CNAME record. It’s not perfect, but it does the job, and that’s good.
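If I wanted to make it less hacky, one option would be to select records by what they point at instead of grabbing the first CNAME. The sketch below assumes that ACM validation CNAMEs target a host under acm-validations.aws, which matches what I saw in my hosted zone; the RecordSet interface and function names are hypothetical and only cover the fields used here:

```typescript
// Trimmed-down record shape; only the fields this sketch reads.
interface RecordSet {
  Name: string;
  Type: string;
  TTL?: number;
  ResourceRecords?: { Value: string }[];
}

// Assumption: ACM validation targets sit under this domain.
const ACM_VALIDATION_SUFFIX = ".acm-validations.aws";

// True when a CNAME value points at an ACM validation host,
// ignoring case and an optional trailing dot.
const isAcmValidationTarget = (value: string): boolean => {
  const host = value.toLowerCase().replace(/\.$/, "");
  return host.slice(-ACM_VALIDATION_SUFFIX.length) === ACM_VALIDATION_SUFFIX;
};

// Returns every CNAME record set that looks like an ACM validation record.
function findAcmValidationRecords(records: RecordSet[]): RecordSet[] {
  return records.filter(
    (r) =>
      r.Type === "CNAME" &&
      (r.ResourceRecords || []).some((rr) => isAcmValidationTarget(rr.Value)),
  );
}
```

Each match carries its own TTL and value, so the delete call could reuse those instead of a hard-coded TTL, and unrelated CNAME records in the zone would be left alone.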
My entire stack deploys and destroys consistently now.
The goal I set for my first experiment with AWS CDK took me on a journey that was far more interesting than I could have imagined when I started out. AWS CDK L3 patterns made creating major portions of my AWS infrastructure a breeze, and the missing parts of AWS CDK did not mean my goal was unreachable: custom resources made it possible in the end.
On the one hand, creating custom resources is extra work, but on the other hand, I did gain additional experience using TypeScript. As a DevOps engineer, I like gaining as much experience with a general-purpose programming language as I can.
There are many things to consider when choosing tools for Infrastructure as Code. I hope this article will help others who are considering using AWS CDK.
The stack in my repository linked above deploys infrastructure that’s pretty expensive to run for any significant amount of time. I mostly chose services I had never worked with before to test with, such as Fargate and Aurora, but convenience and speed of the deploy and destroy runs were also a factor.
That said, keep these expenses in mind if you decide to test using the code from my repository with your own AWS account. Interestingly, many aspects of Route 53 and DNS, such as hosted zones, are never part of the free tier in AWS. 🤔