ACME certificate issuers have drastically lowered the barriers to having browser-trusted SSL certificates on HTTPS sites and services. However, there are still challenges in managing ACME-issued certs on internal-only HTTPS servers. I built the lecertvend tool to separate certificate issuance/renewal from emplacement, and I used HashiCorp Vault to tie the two workflows together. The result is an easy and centralized process for acquiring and renewing certificates for both internal-only and external web servers.
Most of the tooling I have found for ACME certificate management either uses the local filesystem for certificate storage (Certbot, acme.sh) or is a proxy server that expects to be doing SSL termination (Caddy, Traefik). Certificates issued with acme.sh and Certbot can be copied into Vault, but those tools still rely on the certificates existing on a filesystem in order to manage renewal.
The solution I wanted for ACME certificate management needed to have the following features:
I was not able to find an existing solution that used Vault as a primary storage target. CertMagic has a modular storage interface but that is only for certs and keys, and I wanted all necessary data for certificate issuance and challenge solving in Vault. I did find some things in CertMagic to be helpful examples, so I am glad I dug into that option a little ways.
Ultimately I spent a couple days hammering out a custom tool, which is lecertvend.
GitHub project here: https://github.com/arcandspark/lecertvend
The lecertvend tool is a CLI program that meets the feature list above. It gets used in two main places:
When an application build/deploy pipeline runs, the Project’s CI job will have a Vault token with a policy assigned based on the GitLab Group that the Project belongs to. GitLab acquires that Vault token by presenting a JWT that indicates the group ID, and Vault returns a token with permissions based on that group’s Vault policy. That policy allows access to a secrets path where that GitLab group’s ACME-issued certs are stored, along with a Cloudflare token that allows DNS updates to domains appropriate for that group.
In short, when a CI job runs, it will have access to issue and renew certs for a particular GitLab group, but not others. This is how multi-tenancy support is achieved.
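To make the token exchange concrete, here is a minimal Python sketch of the login request a CI job could send to Vault's JWT auth method. It is only an illustration, not how GitLab or lecertvend does it internally. The Vault address, the role name `teamname`, and the auth mount `jwt` are assumptions, and the request is built but never sent.

```python
import json
import urllib.request


def build_jwt_login_request(vault_addr: str, role: str, jwt: str,
                            auth_mount: str = "jwt") -> urllib.request.Request:
    """Build (but do not send) a login request for Vault's JWT auth method.

    auth_mount is assumed to be the default "jwt"; a real deployment may
    mount the auth method elsewhere.
    """
    url = f"{vault_addr.rstrip('/')}/v1/auth/{auth_mount}/login"
    body = json.dumps({"role": role, "jwt": jwt}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/json"},
    )


def extract_client_token(login_response: dict) -> str:
    """Pull the Vault token out of a decoded login response body."""
    return login_response["auth"]["client_token"]


# Hypothetical values, for illustration only.
req = build_jwt_login_request(
    "https://vault.example.internal:8200/", "teamname", "header.payload.sig"
)
```

The token Vault returns carries only the policies bound to that role, which is what keeps one group's CI jobs out of another group's secrets path.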
From a developer experience perspective, this means lecertvend can be used as a one-liner in a CI job to ensure that a certificate will be issued or already exists for their project:
variables:
  PROJECT_SLUG: myapp
.....
infra:
  stage: infra
  script:
    - |
      ... terraform apply, other infra scripting ...
      lecertvend -vend -mount secret -prefix lecertvend/teamname/teamdomain.com -secret ${PROJECT_SLUG} -names ${PROJECT_SLUG}
.....
And that certificate can be referenced in a Nomad Job that gets deployed by that pipeline:
job "myapp" {
  type = "service"
  .....
  group "service" {
    count = 1
    .....
    task "myapp-service" {
      driver = "docker"
      .....
      vault {
        policies = ["nomad-job-teamname"]
        env = true
      }
      template {
        change_mode = "restart"
        error_on_missing_key = true
        uid = 0
        gid = 0
        perms = "600"
        destination = "secrets/cert.pem"
        data = <<-EOF
          {{with secret "secret/data/lecertvend/teamname/teamdomain.com/${var.project_slug}"}}{{.Data.data.cert}}{{end}}
        EOF
      }
      template {
        change_mode = "restart"
        error_on_missing_key = true
        uid = 0
        gid = 0
        perms = "600"
        destination = "secrets/key.pem"
        data = <<-EOF
          {{with secret "secret/data/lecertvend/teamname/teamdomain.com/${var.project_slug}"}}{{.Data.data.key}}{{end}}
        EOF
      }
    }
  }
}
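The path in the Nomad templates lines up with the `-mount`, `-prefix`, and `-secret` flags from the CI job. A small Python sketch of that mapping, assuming the secret lives in a KV version 2 mount (which the `.Data.data` access in the templates suggests); the helper name `kv2_read_path` is hypothetical and is not part of lecertvend:

```python
def kv2_read_path(mount: str, prefix: str, secret: str) -> str:
    """Return the KV v2 read path for a secret.

    KV v2 inserts a "data" segment between the mount and the secret path,
    which is why the templates read from secret/data/... rather than
    secret/... directly.
    """
    parts = [mount.strip("/"), "data", prefix.strip("/"), secret.strip("/")]
    return "/".join(parts)


# Values taken from the CI job example, with PROJECT_SLUG=myapp.
path = kv2_read_path("secret", "lecertvend/teamname/teamdomain.com", "myapp")
print(path)
```

With the values from the CI example this yields the same path the templates use once `${var.project_slug}` is filled in.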
Each GitLab Group also has a certificate renewal project, with a simple pipeline to call lecertvend daily to renew any certificates that are nearing expiration:
stages:
  - renew
renew:
  stage: renew
  script:
    - lecertvend -renew -mindays 28 -mount secret -prefix lecertvend/teamname
The output of which looks like this:
$ lecertvend -renew -mindays 28 -mount secret -prefix lecertvend/teamname
Renewing certs in prefix lecertvend/teamname if less than 28 days validity remain.
lecertvend/teamname does not end in zone, looking for zones within...
ignoring secret lecertvend in non-zone prefix lecertvend/teamname
Renewing certs in prefix lecertvend/teamname/teamdomain.com if less than 28 days validity remain.
cert in secret lecertvend/teamname/teamdomain.com/pots has 42 days of validity remaining, taking no action.
cert in secret lecertvend/teamname/teamdomain.com/desk has 69 days of validity remaining, taking no action.
cert in secret lecertvend/teamname/teamdomain.com/door has 63 days of validity remaining, taking no action.
cert in secret lecertvend/teamname/teamdomain.com/comb has 60 days of validity remaining, taking no action.
cert in secret lecertvend/teamname/teamdomain.com/table has 63 days of validity remaining, taking no action.
cert in secret lecertvend/teamname/teamdomain.com/chair has 76 days of validity remaining, taking no action.
cert in secret lecertvend/teamname/teamdomain.com/music has 76 days of validity remaining, taking no action.
cert in secret lecertvend/teamname/teamdomain.com/keys has 45 days of validity remaining, taking no action.
cert in secret lecertvend/teamname/teamdomain.com/www has 76 days of validity remaining, taking no action.
renewals started, waiting for completion...
renewals complete.
Cleaning up project directory and file based variables 00:00
Job succeeded
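The decision shown in that output comes down to comparing the certificate's remaining validity against the `-mindays` threshold. A minimal Python sketch of that comparison follows. It is an illustration of the idea, not lecertvend's actual code; the function names are hypothetical, and the certificate's expiry time is assumed to have already been parsed into a timezone-aware datetime.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def days_remaining(not_after: datetime, now: datetime) -> int:
    """Whole days of validity left (negative once the cert has expired)."""
    return (not_after - now).days


def needs_renewal(not_after: datetime, min_days: int,
                  now: Optional[datetime] = None) -> bool:
    """True when fewer than min_days whole days of validity remain."""
    if now is None:
        now = datetime.now(timezone.utc)
    return days_remaining(not_after, now) < min_days


# Example: a cert with 42 days left is skipped at a 28-day threshold.
now = datetime(2024, 1, 1, tzinfo=timezone.utc)
print(needs_renewal(now + timedelta(days=42), 28, now))
```

Running the check daily with a 28-day threshold leaves plenty of retries before a certificate actually expires.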
The following is an overall visualization of how lecertvend is used to issue and renew certificates in my environment: