What Should I Learn About Next?

As you are no doubt aware, our learning never ends. The more you know, the more you realize you don’t know. C’est la vie.

To help with this, this section gives a brief summary of several valuable areas; it’s then up to you to do a bit of digging to learn more and think about how to apply them to your application.

Some areas will be more appealing and useful than others in your situation, so dig deeper into the areas you need, and feel free to leave what you don’t. Enjoy!

Limiting resources

As you run more containers in your cluster, you may find the need to constrain the CPU and memory given to certain containers. Both Swarm and Kubernetes provide a way to specify limits on the resources a container is allowed to use. Here’s a Swarm example that limits each container of a service to a maximum of 50 MB of memory and one tenth of a single CPU core.

services:
  some-service:
    deploy:
      resources:
        limits:
          cpus: "0.1"
          memory: 50M
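
For comparison, here’s a rough Kubernetes equivalent. This is only a sketch of the resources stanza you’d place in a container spec inside a Pod or Deployment; the surrounding manifest is omitted:

resources:
  limits:
    cpu: "100m"      # one tenth of a CPU core
    memory: "50Mi"   # roughly 50 MB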

Autoscaling

What’s better than running a command to scale up a service? Running no commands to scale up a service. This is known as autoscaling. It involves monitoring key usage and load metrics for containers and detecting when they get close to becoming overloaded. At that point, new containers for the service are launched to meet the additional demand. As the load dies down, this too is detected, and the service is scaled back down, freeing up resources.

Unfortunately, Swarm does not provide built-in autoscaling, although it is possible to engineer it yourself by having each node in the cluster export metrics (using something like cAdvisor) to a central metrics service, such as Prometheus. If that sounds like too much work, you could consider one of the various open-source solutions, such as Orbiter. Your final option is to switch to Kubernetes for your container orchestration. Although more complex, it is more fully featured and has autoscaling built in.
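
To give a flavor of the Kubernetes side, here’s a minimal sketch of a HorizontalPodAutoscaler using the autoscaling/v2 API. It assumes a Deployment named some-service already exists; the name and thresholds are just examples:

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: some-service
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: some-service     # the workload to scale (assumed to exist)
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70   # scale up when average CPU exceeds 70%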

Zero-downtime, blue-green deploys

The pinnacle of a good continuous deployment pipeline is being able to deploy updates to your app in a seamless, safe way, with no downtime or impact on users. Typically, this is achieved with blue-green deploys, where a second version of the application is started and traffic is (usually gradually) cut over to the new version. The previous version of the app is kept around (at least for a while) in case a problem emerges and you need to roll back.

Docker Swarm provides some capabilities for performing these types of rolling updates (a sketch of the relevant configuration follows below). While this can be useful, unfortunately, Swarm currently does not support session affinity, also known as sticky sessions: the guarantee that, once an updated version of your app has been deployed, new sessions are handled by the latest version while existing user sessions continue to be serviced by the old version. This matters because the old version may be incompatible with your updated version of the app in some way, particularly if routes or database schemas have changed, so routing an in-flight session to a different version can break it.
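
As an illustration of those rolling-update capabilities, here’s a sketch of the update_config options you can set in a Compose-format deploy file; the values here are just examples:

deploy:
  replicas: 4
  update_config:
    parallelism: 1         # update one container at a time
    delay: 10s             # pause between update batches
    order: start-first     # start the new container before stopping the old one
    failure_action: rollback   # revert to the previous version if an update fails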

You can still achieve zero-downtime deploys with Swarm, but it will involve some extra work, typically requiring you to run a reverse proxy in front of the app that does provide session affinity. Zero-downtime deploys can also be achieved with Kubernetes, but this similarly involves some work.

Security

It is beyond the scope of this course to teach you the ins and outs of securing your cloud-based infrastructure, but if you are building a production environment, this will be a key area to get right. Unfortunately, there is no one-size-fits-all approach, particularly as things can vary greatly between cloud platforms.

A good starting place is Docker’s own docs on the subject. Make sure you work through the various pages under “Security” in the menu; there is no “Next” button at the bottom of each page to lead you through them. Key topics are: using only trusted images, scanning images for vulnerabilities, not running containers as root, and locking down firewalls to the bare minimum of ports required, plus more involved ways to lock down your Docker installation.
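
A couple of these hardening measures can be expressed directly in a Compose file. This is only a sketch (the service name is hypothetical) and is no substitute for the broader measures above:

services:
  some-service:
    user: "1000:1000"   # run as a non-root user instead of root
    read_only: true     # mount the container's root filesystem read-only
    cap_drop:
      - ALL             # drop all Linux capabilities the app doesn't need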

More advanced architectural possibilities

So far we have used Swarm’s built-in load-balancing capabilities to distribute incoming requests to different containers backing a given service. However, as you get more experienced, you may want to use more sophisticated setups with things like HAProxy or NGINX to do your own proxying and load balancing.

Not only is this possible, but you can run the HAProxy or NGINX instances in containers themselves, building your own images with your config files. You can also use Docker’s network primitives to create different network configurations. This allows you to wall off containers from each other and control which containers can communicate with one another, as the sketch below illustrates.
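
For example, here’s a minimal Compose sketch (the service names and app image are hypothetical) that uses two user-defined networks to keep the database unreachable from the proxy:

services:
  proxy:
    image: nginx
    networks:
      - frontend
  app:
    image: example/my-app   # hypothetical application image
    networks:
      - frontend
      - backend
  db:
    image: postgres
    networks:
      - backend             # reachable from app, but not from proxy

networks:
  frontend:
  backend: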

Secret management

In this course, we followed many of the twelve-factor app principles. For example, we externalized our app config, making it available as environment variables. However, it turns out that environment variables are not particularly secure. They are visible to the entire process (and to any child processes it spawns), are easily leaked into logs and error reports, and violate the principle of least privilege. Docker offers a more secure, built-in option called Docker secrets.

Docker secrets are added to a swarm with the docker secret create command (having first targeted a swarm manager). Alternatively, you can specify secrets in your deploy file (in Compose format).

Secrets are encrypted in the internal data structures Swarm uses to store them (encryption at rest), as well as on their entire journey to the containers that need them (encryption in transit). They are made available to a container via an in-memory filesystem mounted at /run/secrets/<secret_name>. Only containers explicitly given access to a secret are able to access it.
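
Putting that together, here’s a minimal Compose-format sketch. It assumes a secret named db_password has already been created on a manager node with docker secret create:

services:
  app:
    secrets:
      - db_password   # appears in the container at /run/secrets/db_password

secrets:
  db_password:
    external: true    # created beforehand, e.g. with `docker secret create db_password -`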

There is even a built-in mechanism for rotating secrets, which makes it more likely you will do the right thing and rotate your secrets frequently.

Restarting on failure

By default, when the process running inside a container terminates, the container is stopped. Sometimes, this behavior is exactly what we want. For example, our database-migrator service is supposed to do its job of migrating the database and then exit.

However, what about our Rails app containers running in production? If something goes wrong that causes the app to crash (a memory leak, for example), it would be nice if the containers themselves could be resilient and recover from failures gracefully. Who wants to be woken in the middle of the night to fix issues?

Docker allows you to define a restart policy that determines what happens when a container terminates. By setting its condition to on-failure:

deploy:
  restart_policy:
    condition: on-failure

Swarm will now automatically restart our Rails app if it crashes.
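
If you want finer control, the restart policy accepts a few more settings; here’s a sketch with example values:

deploy:
  restart_policy:
    condition: on-failure
    delay: 5s         # wait five seconds between restart attempts
    max_attempts: 3   # give up after three failed restarts
    window: 120s      # how long a container must stay up for a restart to count as successful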
