I’ve been on and off when it comes to Docker, but lately I’ve started to embrace it. In comparison to virtual machines, containers are a lot easier to maintain and are more lightweight. While working with containers is great, their true power isn’t visible until you start clustering them. There are a few clustering and orchestration options, the most popular being Kubernetes and Docker Swarm.
In this guide we’re going to see how to create a simple Docker Swarm cluster on several server nodes that consist of both manager nodes and worker nodes.
If you’re unfamiliar with Docker, you’re probably still going to be unfamiliar with it after having read this guide. This is because we’re going to focus on clustering with Docker Swarm. If you want to learn about using Docker, check out a previous tutorial I wrote on the topic.
Now for a quick breakdown on what we’re going to get ourselves into with Docker Swarm. To keep things simple we’re going to run a cluster of NGINX applications that can be accessed by requesting any of the server nodes in the cluster.
It is important to understand what Docker Swarm actually does. When we create a cluster, we won’t be sharing computing resources between our nodes. Instead, think of a Docker Swarm as a single Docker host that spans multiple server nodes. When deploying containers, which are referred to as services in Swarm, it feels as if we’re working with a single node. Docker Swarm decides where our services end up so we don’t have to worry about it. If one server node goes down, the Swarm reschedules its containers so the service remains accessible through another node.
For simplicity, we’re going to create our Swarm locally using VirtualBox. Each node, or virtual machine, will have Docker Engine running. We’ll be configuring each node using our local Docker for Mac or Docker for Windows client. While you could probably use Docker Toolbox, I don’t have much familiarity with it.
As mentioned previously, we are going to create several local nodes using the virtualbox driver for Docker. Assuming VirtualBox and either Docker for Mac or Docker for Windows are installed, execute the following:
docker-machine create --driver virtualbox manager1
The above command will provision a new host with VirtualBox and name it manager1. You’ll want to do the same for however many nodes you wish to have. In our case, let’s do the following:
docker-machine create --driver virtualbox manager2
docker-machine create --driver virtualbox worker1
docker-machine create --driver virtualbox worker2
docker-machine create --driver virtualbox worker3
This should leave us with five server nodes total, two of which will eventually be managers and three workers. As of right now all the nodes are created equal and don’t act as anything special.
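If typing four nearly identical commands feels tedious, a small shell loop can generate them for you. The sketch below is a dry run: it only prints each `docker-machine create` command, so you can review them first. Remove the `echo` to actually provision the VMs (which requires VirtualBox and docker-machine installed locally).

```shell
# Dry run: print the provisioning command for each remaining node.
# Remove the leading `echo` to actually create the VirtualBox VMs.
for node in manager2 worker1 worker2 worker3; do
  echo docker-machine create --driver virtualbox "$node"
done
```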
You can verify that all your nodes are running by executing the following command:

docker-machine ls

The above command will list all configured machines, whether they are local with VirtualBox or remote on AWS or similar. Take note of the IP address of one of your future manager nodes. We’ll need that IP address soon.
Now we need to take each of those five machines and create a Docker Swarm. To do this we’ll need to SSH into each of them and run a few commands.
Starting with the manager node whose IP address you noted, execute the following:
docker-machine ssh manager1
You can choose to use the machine name or IP address in the above command. I find the machine name to be a bit easier to remember.
Once connected, we can initialize the Swarm. This can be done by executing the following:
docker swarm init --advertise-addr 192.168.99.100
In the above command you’ll need to use the IP address of manager1, the node we’re currently connected to. After running the command, you should end up with a token for adding worker nodes. No need to write it down, as it can be obtained again later.
Instead of adding a worker node, we’re going to add another manager node from our list of already created nodes. Before we do that we need to figure out the token for adding manager nodes. Execute the following while connected to manager1:
docker swarm join-token manager
The result should be similar to the information provided when initializing the Swarm. Just remember that this is for adding managers, not workers.
Copy the result, disconnect, and connect to the second manager node:
docker-machine ssh manager2
After connecting to the other manager machine, we need to execute the command that we had just copied. It will look something like this:
docker swarm join \
    --token SWMTKN-1-5v1l5wgkknqjeny4ggn4g4jaepdgqsvfvy1l6xmao8vy6hqoja-700yx4qjthew6ji6kbjz56rk1 \
    192.168.99.100:2377
Upon success, you’ll be notified that the node has joined as a manager. Now we need to connect the worker nodes to Swarm. Execute the following from either of the two manager nodes:
docker swarm join-token worker
Copy the result, which should be different from the one provided by the manager request that we ran previously. With the result in hand, disconnect from the manager and connect to one of the worker nodes:
docker-machine ssh worker1
Once connected, paste and execute the command that you had copied, which should look like this:
docker swarm join \
    --token SWMTKN-1-5v1l5wgkknqjeny4ggn4g4jaepdgqsvfvy1l6xmao8vy6hqoja-9adub9fsn0ns11w1em04ih92i \
    192.168.99.103:2377
Upon success, you should be notified that you joined as a worker node. Repeat the same command on each of the remaining worker nodes.
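Joining each worker by hand works fine, but it gets repetitive. As a hypothetical shortcut, `docker swarm join-token` accepts a `-q` flag that prints just the token, so the whole thing can be scripted from your local machine. The sketch below is a dry run that only echoes the commands it would run; remove the leading `echo`, and substitute your real manager IP and token, to execute them. The placeholder token value is not a real token.

```shell
# Dry run: fetch the worker token once, then join each worker node.
# Remove the leading `echo` to actually run the joins over SSH.
MANAGER_IP="192.168.99.100"               # IP of manager1, from `docker-machine ls`
TOKEN="SWMTKN-placeholder-worker-token"   # real value: docker-machine ssh manager1 docker swarm join-token -q worker
for node in worker1 worker2 worker3; do
  echo docker-machine ssh "$node" \
    docker swarm join --token "$TOKEN" "$MANAGER_IP:2377"
done
```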
At this point you should have a Docker Swarm without any services. You can verify the nodes of the Swarm by executing the following from one of the manager nodes:
docker node ls
All the nodes should show as Ready and Active, with your manager1 node listed as Leader and the manager2 node as Reachable. If anything should happen to the leader, one of the other manager nodes will be promoted.
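For reference, the output of docker node ls looks roughly like the following (IDs shortened here, and your hostnames and column spacing may differ slightly by Docker version):

```
ID            HOSTNAME   STATUS   AVAILABILITY   MANAGER STATUS
abcd1234 *    manager1   Ready    Active         Leader
efgh5678      manager2   Ready    Active         Reachable
ijkl9012      worker1    Ready    Active
mnop3456      worker2    Ready    Active
qrst7890      worker3    Ready    Active
```

The asterisk marks the node you are currently connected to.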
Like I said earlier, a Docker Swarm is treated like one large Docker host. You’ll use it similarly to how you would any other Docker host. Emphasis probably needs to be placed on the similar part.
If you aren’t already connected, connect to one of the manager nodes and execute the following command to deploy NGINX:
docker service create --replicas 5 -p 80:80 --name webserver nginx
The above command will create a service named webserver, based on the NGINX image, with five replicas. After running it, you can check that everything was created by executing the following:
docker service ls
Don’t worry if all five replicas haven’t started by the time you first list the services. It can take a bit of time, so just run the command a few more times.
Want to see how the NGINX containers were distributed? Execute the following command:
docker service ps webserver
We created five nodes and five replicas, so NGINX will now exist on each of the nodes in the Docker Swarm. Had we created fewer replicas, Docker Swarm would have determined which nodes received the service. We can also constrain which nodes receive which services, but that is a story for another day.
Try hitting any of the node IP addresses in your web browser. For example, in my case, navigating to http://192.168.99.100 will show me the default NGINX page. Navigating to any of the node IP addresses will show the same.
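You can check this from the terminal as well. The sketch below is a dry run that prints one curl probe per node (assuming the five machine names used above); each printed command, when run, should return the same NGINX welcome page. Remove the `echo` and the escaping backslash to run the probes directly.

```shell
# Dry run: print a curl command probing port 80 on each node's IP.
# Each printed command can be pasted into the terminal to fetch the NGINX page.
for name in manager1 manager2 worker1 worker2 worker3; do
  echo "curl -s http://\$(docker-machine ip $name)/"
done
```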
Time for the real magic.
We’re going to scale down our replicas of the NGINX service. Execute the following to go from five replicas to two replicas:
docker service scale webserver=2
At this point there are fewer replicas than there are server nodes, which means some nodes will not be running the NGINX service. However, try navigating to one of those nodes in your web browser. You should still be able to access NGINX, because Swarm’s routing mesh forwards the request to a node that is running the service.
You can scale your services up or down using the same command that was used above.
We all know server updates and upgrades have to happen. Using containers doesn’t eliminate this necessity, it only makes things a bit easier. When we need to perform maintenance on a server node, whether that be a manager node or a worker node, we need to bring it down properly. This is what is known as draining the node in Docker terms.
To drain a node, execute the following from a manager node:
docker node update --availability drain worker2
The above command will change the node’s availability from Active to Drain. It also does something else: it reschedules any tasks that the drained node was running onto the other nodes.
You can see which node is temporarily taking its place by executing the ps command that we saw earlier:
docker service ps webserver
If you wish to stop draining the worker node, change the availability by executing the following:
docker node update --availability active worker2
Per the Docker documentation, a node that was added back to the Swarm will be used as new tasks are scheduled, so a rebalance isn’t strictly necessary. However, if you wish to force a rebalance, you can do so with the following:
docker service update webserver --force
So what happens if you wish to remove a service from your Swarm? It is no big deal. You can execute the following to make it happen:
docker service rm webserver
And just like that it is gone from your entire cluster.
There is a ton of cool stuff you can do with a cluster, or Swarm, of Docker containers. It makes creating high availability applications and microservices incredibly easy, something that would have been very hard to do a few years ago. Like I had mentioned, Docker Swarm isn’t the only solution to this. There are others such as Kubernetes.
Something to think about, and something I’ll explore in the next guide on the subject, is load balancing the Docker Swarm. In a production environment you probably don’t want to hammer one of your nodes with all the traffic. Instead you’ll probably want to put them all behind a load balancer.