Create a Production-Ready Cluster with Multi-Host Networking via Docker Swarm, TLS and etcd: A Complete Step-by-Step Tutorial

One of the newer features of Docker is the multi-host network. Using this feature, we can now create a network for all of our containers, even when they are distributed over several hosts. Within such a network, containers can communicate via hostnames, and they can be added and removed at any time without breaking anything. These are big advantages over using IP addresses or linking containers. In this tutorial we combine this feature with Docker Swarm, etcd and TLS to manage complex container infrastructures in a more comfortable and secure way.

TL;DR

  • The multi-host network feature of Docker allows communication between containers which are distributed over multiple hosts.
  • In combination with Docker Swarm and etcd, one gets a powerful tool for managing complex container infrastructures.
  • The communication between all parties is encrypted with TLS.

Background knowledge

When we worked on an internal project here at Fusonic, we decided to use Docker for a complex but flexible infrastructure. It should not make any difference on which host a container is running, or whether a container crashes and gets restarted on a different host.

Because of these requirements, using IP addresses was not an option. Linking containers also has multiple drawbacks. One is that links have to be created when a container is started, which means containers have to be started in a strict order. And when one container crashes or has to be restarted (because of an update, for example), all links break and all other containers linking to it have to be restarted as well.

For this reason, and probably various others, Docker released the networking feature in November 2015. Using this feature (which in turn uses libnetwork and libkv), it is possible to create a single network for containers spread across multiple hosts.

What’s the benefit of doing it this way?

  • Containers within the network can communicate not only via IP addresses but also via hostnames.
  • There are no static links or dependencies between the containers themselves, which means containers can be started on any host and in any order. As soon as a container has successfully started, it is reachable by its hostname. A container not being available won't break other containers (applications / services within containers may have to handle this case).
  • New containers can be added and existing ones removed at any time without any trouble.
  • No container within a network is accessible from the outside by default, but all services are accessible from the inside. This means internal services (e.g. a database) don't need any port mapping. Only services that have to be available from outside the network (e.g. a web server or proxy) need a port mapping (see the sketch after this list).
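To illustrate the last point, here is a minimal sketch (the image names and the network name mynet are just placeholders): the database joins the network without publishing any port, while the web server additionally publishes port 80 to the outside world.

# internal service: reachable inside the network as "db", no ports published
docker run -d --name db -h db --net=mynet postgres

# external service: additionally reachable on port 80 of the host
docker run -d --name web -h web --net=mynet -p 80:80 nginx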

Hands on

Enough talk, let's take a look at how a multi-host network is created. In addition, we will add a Docker Swarm and communicate via TLS to make things more secure. For this we need two hosts with the Docker daemon installed on them.

Prerequisites

Before starting, make sure that you have at least the following versions of Docker and Docker Swarm installed:

~ docker --version
Docker version 1.10.3, build 20f81dd
~ docker run swarm --version
swarm version 1.1.3 (7e9c6bd)

TLS

First we need some certificates for secure communication between all components in our infrastructure. We are going to create one CA certificate which will sign all the other certificates. Only components providing a certificate signed by this CA will be allowed to communicate with each other, and the traffic itself will of course be encrypted as well. Creating the certificates can be done with something like this:

# CA
openssl genrsa -aes256 -out ca.key 4096
openssl req -new -x509 -days 7300 -utf8 -key ca.key -sha256 -out ca.cert

# Client
openssl genrsa -out cert.key 4096
openssl req -subj "/C=[...]/ST=[...]/L=[...]/O=[...]/CN=[IPADDRESS]" -utf8 -sha256 -new -key cert.key -out cert.csr
echo subjectAltName = IP:[IPADDRESS] > extfile.cnf
openssl x509 -req -days 3650 -sha256 -in cert.csr -CA ca.cert -CAkey ca.key -CAcreateserial -out [IPADDRESS].cert -extfile extfile.cnf

In the end you should have a CA certificate and key, plus one certificate and key for each Docker host (the key can be the same for all certificates). You can find more background information on securing Docker and Swarm in the official Docker documentation.
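To sanity-check that a host certificate was really signed by the CA before distributing it, you can use openssl verify:

openssl verify -CAfile ca.cert [IPADDRESS].cert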

etcd

To create multi-host networks, a key-value store is needed. Libkv currently supports Consul, etcd and ZooKeeper; in this example we are going to use etcd. It's important that the ports 4789 and 7946 as well as the ports used by the key-value store are open. In the case of etcd, the ports are configurable, and we will stick to the defaults 2379, 2380 and 4001. Because etcd itself can also run as a cluster, some basic information is needed at startup. For this we will use the cluster discovery service by CoreOS: accessing https://discovery.etcd.io/new?size=1 creates a discovery ID for an etcd cluster with one master node.
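Requesting a new discovery ID from the command line looks like this (size is the number of initial cluster members):

curl -w "\n" 'https://discovery.etcd.io/new?size=1'
# returns https://discovery.etcd.io/[DISCOVERY_ID]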

We chose to run etcd from within a container and thereby also get the advantages of Docker Swarm for it. The image is quickly created, or you can use an existing one from Docker Hub. The important part is that etcd has access to the created certificates and is started with the following parameters:

/etcd-v2.2.5-linux-amd64/etcd \
--name [NAME] \
--advertise-client-urls https://[IPADDRESS]:2379,https://[HOST_IPADDRESS]:4001 \
--listen-client-urls https://0.0.0.0:2379,https://0.0.0.0:4001 \
--initial-advertise-peer-urls https://[HOST_IPADDRESS]:2380 \
--listen-peer-urls https://0.0.0.0:2380 \
--initial-cluster-token [CLUSTER_TOKEN] \
--discovery https://discovery.etcd.io/[DISCOVERY_ID] \
--client-cert-auth \
--cert-file=[PATH_TO_CERT] \
--key-file=[PATH_TO_CERT_KEY] \
--trusted-ca-file=[PATH_TO_CA_CERT] \
--peer-client-cert-auth \
--peer-cert-file=[PATH_TO_CERT] \
--peer-key-file=[PATH_TO_CERT_KEY] \
--peer-trusted-ca-file=[PATH_TO_CA_CERT]

IPADDRESS: e.g. 172.168.1.1 → the IP address inside Docker
HOST_IPADDRESS: e.g. 192.168.0.1 → the IP address of the host system

Here we will configure how …

  • the clients (e.g. swarm node, swarm master) will communicate with the etcd node (ports 2379 and 4001)
  • the etcd nodes will communicate with each other (port 2380)
  • the etcd master nodes will find each other (use the generated discovery ID here)

In addition, we define that the communication has to be done via TLS and where the certificates can be found. More details on each parameter can be found in the etcd documentation.

After a successful startup and registration you will find your etcd node at https://discovery.etcd.io/[DISCOVERY_ID]. At this point we have a running etcd node and can communicate with it when using the right certificates.
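A quick sanity check is to query the cluster health with etcdctl, using the same certificates (a sketch, assuming the etcdctl binary shipped with the same etcd 2.2.x release):

etcdctl --endpoints https://[HOST_IPADDRESS]:2379 \
--ca-file [PATH_TO_CA_CERT] \
--cert-file [PATH_TO_CERT] \
--key-file [PATH_TO_CERT_KEY] \
cluster-health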

Docker daemon

Nearly done … In this step we tell the Docker daemon where it can find the etcd node and that it should communicate with it (and also with the swarm master) using TLS. For this we adjust the systemd service configuration of Docker (at /etc/systemd/system/ or /etc/systemd/system/docker.service.d/ on Ubuntu 15.10 and higher), include a new file for some additional configuration and change the port to 2376. Afterwards we have to reload systemd and restart the Docker service; a sketch of these commands follows after the two configuration files below. This has to be done on all Docker daemon hosts that should be available as swarm nodes. On the host with etcd we have to start the etcd container again, as it was stopped when we restarted the Docker daemon. The order of starting swarm nodes and swarm master does not really matter, but etcd should be reachable before you start setting up your swarm.

The docker.conf file:

[Service]
ExecStart=
ExecStart=/usr/bin/docker daemon -H tcp://0.0.0.0:2376 -H unix:///var/run/docker.sock --dns 8.8.8.8 --dns 8.8.4.4 --config-file=/[PATH]/docker.conf.json

The referenced configuration file:

{
  "tlsverify": true,
  "tlscacert": "/[PATH]/ca.cert",
  "tlscert": "/[PATH]/[HOST_IP].cert",
  "tlskey": "/[PATH]/cert.key",
  "cluster-advertise": "[HOST_IP]:0",
  "cluster-store": "etcd://[ETCD_IP]:4001",
  "cluster-store-opts": {
    "kv.cacertfile": "/[PATH]/ca.cert",
    "kv.certfile": "/[PATH]/[HOST_IP].cert",
    "kv.keyfile": "/[PATH]/cert.key"
  }
}
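After placing both files, the new configuration is applied on each host roughly like this (the container name etcd is just an assumption; use whatever name your etcd container actually has):

systemctl daemon-reload
systemctl restart docker

# only on the etcd host: bring the stopped etcd container back up
docker start etcd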

At this point we have restarted the Docker daemon with the new configuration and also the etcd node. The Docker daemon should be able to connect to etcd, and you should see a message like the following in the status report of the Docker daemon on all hosts:

level=info msg="2016/04/05 09:48:12 [INFO] serf: EventMemberJoin: mzangerle 192.168.10.27\n"

The bootstrapping process on the etcd master host

Because of the new configuration, the Docker daemon on each host will try to connect to the etcd node on every (re)start. Obviously, the daemon on the master host can't reach the etcd container while it is itself still starting, so it takes a few seconds until the daemon proceeds without a connection to etcd. You will see some error messages in the status report of the Docker daemon, but these should vanish as soon as the etcd container is started and the daemon can finally establish a connection. This is a bootstrapping issue and won't bother us anywhere else.

Swarm

At this point we will start a swarm master on our primary host and a swarm node on all hosts. Again we will reuse our generated certificates and the etcd node.

Start swarm master:

docker run --name master -p 3376:3376 \
-v /[PATH_TO_CERTS]/certs:/certs:ro \
swarm --debug \
manage etcd://[ETCD_IP]:4001 \
--tlsverify \
--tlscacert=/certs/ca.cert \
--tlscert=/certs/[HOST_IP].cert \
--tlskey=/certs/cert.key \
--discovery-opt kv.cacertfile=/certs/ca.cert \
--discovery-opt kv.certfile=/certs/[HOST_IP].cert \
--discovery-opt kv.keyfile=/certs/cert.key \
--advertise=[HOST_IP]:3376 \
-H tcp://0.0.0.0:3376
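Since the master was started with --debug, its log output is a quick way to verify that it can reach etcd and begins discovering nodes:

docker logs -f master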

Start swarm node:

docker run --name [NAME] \
-v /[PATH_TO_CERTS]/certs:/certs:ro \
swarm join \
--advertise=[HOST_IP]:2376 \
--discovery-opt kv.cacertfile=/certs/ca.cert \
--discovery-opt kv.certfile=/certs/[HOST_IP].cert \
--discovery-opt kv.keyfile=/certs/cert.key \
etcd://[ETCD_IP]:4001
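To check whether the nodes have registered themselves in etcd, the swarm image also provides a list command. The following is only a sketch; it assumes that list accepts the same --discovery-opt flags as join and manage in this version:

docker run --rm \
-v /[PATH_TO_CERTS]/certs:/certs:ro \
swarm list \
--discovery-opt kv.cacertfile=/certs/ca.cert \
--discovery-opt kv.certfile=/certs/[HOST_IP].cert \
--discovery-opt kv.keyfile=/certs/cert.key \
etcd://[ETCD_IP]:4001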

Now we can connect to the swarm master (via TLS only) and see all our swarm nodes when executing the info command:

docker --tlsverify \
--tlscacert=... \
--tlscert=... \
--tlskey=... \
-H=[SWARM_MASTER_HOST_IP]:3376 info

Nodes: 2
atroy: 192.168.10.14:2376
└ Status: Healthy
└ Containers: 10
...
└ Error: (none)
└ UpdatedAt: 2016-04-05T08:09:43Z
mzangerle: 192.168.10.27:2376
└ Status: Healthy
└ Containers: 3
...

Multi-host network

Finally we can create our multi-host network using the swarm …

docker --tlsverify \
--tlscacert=/[PATH_TO_CERTS]/ca.cert \
--tlscert=/[PATH_TO_CERTS]/[HOST_IP].cert \
--tlskey=/[PATH_TO_CERTS]/cert.key \
-H=[SWARM_MASTER_HOST_IP]:3376 \
network create --driver=overlay --subnet=10.0.9.0/24 [NETWORK_NAME]

… and start new containers with the --net parameter to add them to the network …

docker -H=[SWARM_MASTER_HOST_IP]:3376 run -d -h [HOSTNAME] --net=[NETWORK_NAME] [IMAGE_NAME]

… or add them later with following command:

docker -H tcp://[SWARM_MASTER_HOST_IP]:3376 network connect [NETWORK_NAME] [CONTAINER_NAME]
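Removing a container from the network again works analogously with network disconnect:

docker -H tcp://[SWARM_MASTER_HOST_IP]:3376 network disconnect [NETWORK_NAME] [CONTAINER_NAME]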

When inspecting the newly created network, we should see the added containers:

docker --tls… -H=[SWARM_MASTER_HOST_IP]:3376 network inspect [NETWORK_NAME]
[
  {
    "Name": "mynet",
    ...,
    "Containers": {
      "70b8d082a6ec91551f2d881fd372333ce0262236327a3b3ba4c12bdc8949de46": {
        "Name": "proxy2",
        ...
      },
      "dd68706e8f0ad371296eed0f895216671f414de6e73c61125b4e323e7bdf1b4c": {
        "Name": "proxy1",
        ...
      }
    },
    ...
  }
]
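Besides network inspect, the network should also show up on every host; listing it through the swarm master works with the usual TLS flags:

docker --tls… -H=[SWARM_MASTER_HOST_IP]:3376 network ls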

When we now connect to a container on one host, we should be able to ping a container on the other host by using its hostname.

docker --tls... -H=192.168.10.27:3376 exec -it proxy1 /bin/bash
root@proxy1:/# ping proxy2
PING proxy2 (10.0.9.3) 56(84) bytes of data.
64 bytes from proxy2.mynet (10.0.9.3): icmp_seq=1 ttl=64 time=0.743 ms
64 bytes from proxy2.mynet (10.0.9.3): icmp_seq=2 ttl=64 time=0.614 ms
^C
--- proxy2 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1001ms
rtt min/avg/max/mdev = 0.614/0.678/0.743/0.069 ms

In production

To use this setup in production, some manual work is still needed, but a lot of it can be moved into scripts, such as the startup of the Docker daemon, the plain swarm nodes or the interaction with the cluster discovery service. The bootstrapping process for the etcd nodes and the swarm master remains manual for now, but this has to be done only once anyway.

We could also add swarm master replication and/or multiple etcd master nodes to get a more robust infrastructure.
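Swarm supports the former with the --replication flag of the manage command. A minimal sketch of a second, replicated master ([HOST2_IP] is a placeholder for the second host, reusing the same certificates and discovery as above):

docker run --name master2 -p 3376:3376 \
-v /[PATH_TO_CERTS]/certs:/certs:ro \
swarm manage etcd://[ETCD_IP]:4001 \
--replication \
--advertise=[HOST2_IP]:3376 \
--tlsverify \
--tlscacert=/certs/ca.cert \
--tlscert=/certs/[HOST2_IP].cert \
--tlskey=/certs/cert.key \
--discovery-opt kv.cacertfile=/certs/ca.cert \
--discovery-opt kv.certfile=/certs/[HOST2_IP].cert \
--discovery-opt kv.keyfile=/certs/cert.key \
-H tcp://0.0.0.0:3376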

Result

At this point we have an etcd cluster with one node, which is used by the Docker daemons as well as by the swarm nodes and master. The communication is secured by certificates, and containers can communicate via hostnames within the new overlay network. Containers can also be added and removed without introducing dependencies or breaking other containers.

In case you have any questions, ideas for improvements or any other suggestions just leave us a comment.
