As I announced in my last post, I want to write today about the termination of AWS Spot Instances and how I set up a Termination-Spotter Service.
If the price for a spot instance rises above the limit that you are willing to pay for it, you will lose this instance. However, you will not lose it all of a sudden: AWS gives you a two-minute warning before termination. This warning comes in the form of a metadata endpoint at http://169.254.169.254/latest/meta-data/spot/termination-time. The endpoint only becomes available once your instance has been marked for termination; until then, a request to it returns a 404. AWS recommends that interested applications poll for the termination notice at five-second intervals.
Quick and dirty, the following lines will do the trick:
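#!/usr/bin/env bash
URL="http://169.254.169.254/latest/meta-data/spot/termination-time"
while true
do
    # Ask only for the HTTP status code. 200 means the termination notice is there,
    # 404 means there is no notice yet, and 000 means curl could not connect at all.
    status=$(curl -s -o /dev/null -w '%{http_code}' "$URL")
    if [ "$status" = "200" ]
    then
        # run something that deals with the termination
        break
    else
        sleep 5
    fi
done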
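The endpoint does more than just exist: its body is the planned termination time in UTC, for example 2015-01-05T18:02:00Z. As a rough sketch, and assuming GNU date is available on the instance, the remaining time could be worked out like this. The helper name seconds_until is made up for illustration and is not part of the script above.

```shell
#!/usr/bin/env bash
# Sketch only: seconds_until is a hypothetical helper, not part of the original script.
# Assumes GNU date (for the -d option) and an ISO 8601 UTC timestamp such as
# 2015-01-05T18:02:00Z, which is what the termination-time endpoint returns.

# Usage: seconds_until <termination-time> [now-as-epoch-seconds]
# Prints the number of seconds between "now" and the termination time.
seconds_until() {
    local termination_epoch now_epoch
    # Convert the timestamp to epoch seconds; fail if date cannot parse it.
    termination_epoch=$(date -u -d "$1" +%s) || return 1
    # Use the optional second argument as "now", otherwise the current time.
    now_epoch=${2:-$(date -u +%s)}
    echo $(( termination_epoch - now_epoch ))
}
```

Inside the loop this could be called as seconds_until "$(curl -s http://169.254.169.254/latest/meta-data/spot/termination-time)" to get a feeling for how much cleanup still fits into the remaining time.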
You may want to write an init file (an Upstart job in this case) to start the script easily:
description "Termination Spotter"
author "Sebastian Herzberg"
start on runlevel [2345]
pre-start script
    echo "[`date`] Termination Spotter starting" >> /var/log/termination-spotter.log
end script
# Run in the foreground (no trailing &) so that Upstart can keep track of the process
exec /bin/sh /var/opt/termination-spotter.sh > /dev/null
All our ECS instances are registered with several load balancers, one for each service container that runs on the instance. When the termination notice comes up, the instance needs to be pulled out of every ELB and deregistered from the ECS cluster. I am an Ansible fan, so I wrote a playbook that is triggered by the above script. It uses the local ECS metadata API, available at http://localhost:51678/v1/metadata, to look up the cluster name and the container instance ARN:
---
- name: Deregister from any ELB and Cluster
  # =========================================
  hosts: localhost
  gather_facts: yes
  connection: local
  tasks:
    - name: Gather EC2 facts
      ec2_facts:
    - name: Deregistering target instance from ELBs
      ec2_elb:
        instance_id: "{{ ansible_ec2_instance_id }}"
        region: "eu-west-1"
        state: "absent"
        wait: no
    - name: Get the cluster name of the current instance
      shell: "curl -m 10 -s http://localhost:51678/v1/metadata | jq -r '. | .Cluster' | awk -F/ '{print $NF}'"
      register: clustername
    - name: Get the instance ARN
      shell: "curl -m 10 -s http://localhost:51678/v1/metadata | jq -r '. | .ContainerInstanceArn'"
      register: instance_arn
    - name: Deregister instance from cluster
      shell: "aws ecs deregister-container-instance --cluster {{ clustername.stdout }} --region eu-west-1 --container-instance {{ instance_arn.stdout }} --force"
First, the playbook deregisters the instance from all the ELBs it is registered with. Then it looks up which ECS cluster the instance belongs to and fetches the ARN of the instance in order to deregister it from the cluster. I skipped one step here, in which the playbook fires a Slack notification about the upcoming termination. After that, we just wait for the instance to die.
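The cluster lookup pipes the metadata through awk -F/ '{print $NF}', which keeps only the text after the last slash. That way the task works whether the Cluster field holds a plain name or a full ARN. Here is a small sketch of that behaviour; the helper name last_segment and the sample values are made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch only: last_segment is a hypothetical helper and the sample values are invented.

# Prints everything after the last "/" of its argument,
# or the argument unchanged if it contains no "/".
last_segment() {
    printf '%s\n' "$1" | awk -F/ '{print $NF}'
}

last_segment "arn:aws:ecs:eu-west-1:123456789012:cluster/my-cluster"   # prints my-cluster
last_segment "my-cluster"                                              # prints my-cluster
```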
That's it for today. Thank you for reading and see you soon.