As part of making sure the servers I write are compatible with Docker and Kubernetes, I wanted all my HTTP servers to shut down gracefully so as not to drop any transactions. This is especially useful when scaling down in Kubernetes or during rolling deploys of new pods, since it gives your containers the best chance of finishing their in-flight requests.
The standard shutdown procedure for a Docker container is to send the container a SIGTERM to indicate a desire to shut down. If the container doesn’t exit within a grace period (10 seconds by default for docker stop), it is issued a SIGKILL. As such, we need to hook the SIGTERM and make sure we finish all our in-flight communications before exiting.
My solution for this is as per the following gist:
There are a couple of important things to point out here. Firstly, you will notice the use of manners, a Go package that wraps the standard HTTP server to allow for graceful shutdown.
You will also notice the all-important hooks for both SIGTERM and Ctrl+C (SIGINT), which make quick tests and operation outside of Docker a lot easier.
This approach ends up working nicely. You can boot a number of instances of this server and, for the most part, scale up and down without seeing any dropped requests. However, I have noticed that when I scale down to a single instance I do see some errors for a very brief period. I’m not sure whether this is due to it being a single instance or just fewer instances than the number of test nodes I was running, but I’m wondering if there is some issue around etcd timing its changes across the cluster as requests come in.
Overall the approach works for the use case I have in mind. As this is a cluster system, I don’t expect to run single instances of services that get scaled up and down rapidly, but I’m hoping to investigate further to find out where the issue is coming from.