background
our application so far includes the following three services:
simulator
pythonAPI
redisJQ
docker swarm has the following three kinds of networks, besides the default bridge network on each host:
overlay network: services in the same overlay network can communicate with each other.
routing mesh: a request to a published service port can be served by any of the running nodes, so it further acts as a load balancer.
host network: the container uses the host's network stack directly.
usually, multi-container apps can be deployed with docker-compose.yml; check docker compose for more details.
DNS service discovery
the following is an example from overlay networking and service discovery:
my test env includes 2 nodes, with the host IPs below. when running docker services, each service gets a corresponding virtual IP, which is dynamically assigned.
hostname | virtualIP | hostIP |
---|---|---|
node1 | 10.0.0.4 | xx.20.181.132 |
node2 | 10.0.0.2 | xx.20.180.212 |
a common issue when first trying to use the overlay network in swarm (e.g. pinging the other service doesn't work) — check the /etc/resolv.conf file:
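on an Ubuntu host where NetworkManager runs dnsmasq, the file typically contains only the local resolver (a typical content, not necessarily the exact file from this setup):

```
nameserver 127.0.0.1
```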
- add a file
/etc/NetworkManager/dnsmasq.d/docker-bridge.conf
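with one line, telling dnsmasq to also listen on the docker bridge address:

```
listen-address=172.17.0.1
```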
so basically, the default DNS setting only listens to DNS requests from 127.0.0.1 (i.e., your computer). adding listen-address=172.17.0.1 tells it to also listen on the docker bridge. very importantly, the Docker DNS server is used only for user-created networks, so we need to create a new docker network; if we use the default ingress overlay network, the dns setup above still doesn't work.
another solution is using the host network, as mentioned in using host DNS in docker container with Ubuntu
test virtualIP network
- create a new docker network
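a sketch of the command (the name lgsvl-net matches the network inspected later; the driver and --attachable flag are a typical setup, not necessarily the exact original command):

```sh
# create a user-defined overlay network; --attachable also lets
# standalone containers (docker run) join it, not only swarm services
docker network create -d overlay --attachable lgsvl-net
```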
why do we need a new network here? because the Docker DNS server (172.17.0.1) is used only for user-created networks
- start the service with the created network:
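for example (service and image names here are illustrative):

```sh
# attach the service to the user-created network at creation time
docker service create --name simulator --network lgsvl-net <simulator-image>
```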
check the vip in the IPv4Address line:
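one way to filter for it (a sketch, assuming the network name above):

```sh
# each attached container/service shows its vip in the IPv4Address field
docker network inspect lgsvl-net | grep IPv4Address
```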
- go to the running container
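e.g. (get the container id from docker ps first):

```sh
# open an interactive shell inside the running container
docker exec -it <container-id> /bin/bash
```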
- now ping service-name directly
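from inside the container, the service name resolves through Docker's internal DNS (the service name simulator is illustrative):

```sh
ping simulator
```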
- inspect service
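a sketch, assuming the service name above; the VirtualIPs section lists the vip assigned on each attached network:

```sh
docker service inspect --format '{{json .Endpoint.VirtualIPs}}' simulator
```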
- ping host IP from container vip
as long as we add both the host dns and the docker0 dns to the dns option in /etc/docker/daemon.json, the container vip can ping the host IP.
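for example, /etc/docker/daemon.json would then look like this (the host dns IP is a placeholder for your network's actual resolver):

```
{
  "dns": ["<host-dns-ip>", "172.17.0.1"]
}
```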
assign ENV variable from script
- get services vip
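a sketch of capturing the vip into a shell variable (service name is illustrative; .Addr is returned in CIDR form, e.g. 10.0.0.4/24):

```sh
VIP=$(docker service inspect \
  --format '{{range .Endpoint.VirtualIPs}}{{.Addr}}{{end}}' simulator)
echo "$VIP"
```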
- create docker service with runtime env
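for example (image name is illustrative; the IP matches the lg container IP found below):

```sh
# pass the target IP to the worker at service-creation time
docker service create --name redispythonapi \
  --env SIMULATOR_HOST=172.17.0.3 <pythonapi-image>
```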
- check the docker-IP of lg:
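one way to print it (a sketch, assuming the container is named lg):

```sh
# print the IP of the lg container on each network it is attached to
docker inspect -f '{{range .NetworkSettings.Networks}}{{.IPAddress}}{{end}}' lg
```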
- update SIMULATOR_HOST for redispythonapi
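for a running service this can be done with --env-add, e.g.:

```sh
# inject/refresh the env variable on the running service
docker service update --env-add SIMULATOR_HOST=172.17.0.3 redispythonapi
```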
here we can check that the lg container's IP is 172.17.0.3 and redispythonapi's IP is 172.17.0.4, then update start_redis_worker.sh with SIMULATOR_HOST=172.17.0.3
- get the container IP
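from inside a container, a quick way to print its own address(es):

```sh
hostname -i
```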
assign a special IP to service in swarm
docker network create supports --subnet, which provides only an ip-addressing function; namely, we can use a custom-defined virtual IP for our services. a sample:
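a sketch (the subnet base address is illustrative; the /28 mask matches the discussion of available addresses):

```sh
# /28 leaves 2^(32-28) - 2 = 14 usable addresses in the subnet
docker network create -d overlay --subnet=10.0.5.0/28 lgsvl-net
```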
understand the subnet mask: a CIDR address includes the network IP and the subnet mask. we choose 28 here, which gives 2^(32-28) - 2 = 14 available IP addresses in the subnet. but in a swarm env, subnet IPs are consumed faster as the number of nodes or service replicas increases.
for example, with 2 nodes and 2 replicas of a service, 5 subnet IPs are occupied rather than 2.
run docker network inspect lgsvl-net
on both nodes:
- on node1 gives:
|
|
- on node2 gives:
|
|
clearly 5 IP addresses are occupied, and the IP for each internal service is randomly picked; there is no guarantee a service will always get the first available IP.
docker service with --ip
only docker run --ip works; there is no similar --ip option in docker service create. but many cases require this feature (see how to publish a service port to a specific IP address): when publishing a port using --publish, the port is published to 0.0.0.0 instead of a specific interface's assigned IP, and there is no way to assign a fixed IP to a service in swarm.
there are a few discussions: moby/#26696 (add more options to `service create`), a possible solution, and Static/Reserved IP addresses for swarm services.
these mostly come down to the issue that "the ip address is not known in advance, since a docker service launched in swarm mode may end up on multiple docker servers". static IPs are arguably not applicable to a docker swarm setup: if one decides to go with a docker swarm service, one has to accept that the service will run on multiple hosts with different ip addresses, i.e. trying to attach a service / service instance to a specific IP address somewhat contradicts the docker swarm service concept.
docker service create does have the options --host host:ip-address and --hostname, and similarly docker service update supports --host-add and --host-rm.
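a sketch of those options (the host:ip mapping values and names are illustrative):

```sh
# add a static hostname -> IP mapping to the service's /etc/hosts
docker service create --name pythonapi --host "simulator:10.0.5.3" <image>
# or on an existing service
docker service update --host-add "simulator:10.0.5.3" pythonapi
```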
dnsrr mode, namely DNS round-robin mode: when querying Docker's internal DNS server for the service name, it returns the IP address of every container running the service.
vip mode: the query returns a single virtual IP, and the swarm load balancer routes each request to one of the running containers.
When you submit a DNS query for a service name to the Swarm DNS service, it will return one, or all, the IP addresses of the related containers, depending on the endpoint-mode.
dnsrr vs vip: Swarm defaults to using a virtual ip (endpoint-mode vip), so each service gets its own IP address and the swarm load balancer assigns requests as it sees fit. to prevent a service from having a virtual IP address, you can run docker service update your_app --endpoint-mode dnsrr, which lets a DNS query against the service name discover each task/container's IP for the given service.
in our case, we want to assign a special IP to the service in swarm. why? because our app uses websocket server/client communication, which is IP-address based; we can't assign a service name for the WS server/client.
check another issue: dockerize a websocket server
global mode to run swarm
when deploying a service in global mode, each node runs exactly one replica of the service. the benefit of global mode is that we can always find the node IP, no matter whether the IP address is in the host network or a user-defined overlay network/subnetwork.
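deploying in global mode might look like this (service and image names are illustrative):

```sh
# --mode global runs one task per node instead of a fixed replica count
docker service create --mode global --name lgsvl <lgsvl-image>
```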
get service’s IP in global mode
|
|
both 8080 and 8181 are listening after the lgsvl service starts. on the lgsvl side, we can modify it to listen on all addresses on port 8181. then the following python script finds the node's IP:
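a sketch of such a script (the function names are mine; it probes each candidate local address to see whether something accepts TCP connections on the simulator port):

```python
import socket

def port_listening(ip, port, timeout=0.5):
    """return True if something accepts TCP connections on ip:port."""
    try:
        with socket.create_connection((ip, port), timeout=timeout):
            return True
    except OSError:
        return False

def find_simulator_host(port=8181):
    """probe this node's addresses; return the first one where `port` is listening."""
    candidates = ["127.0.0.1"]
    try:
        # every IPv4 address this node's hostname resolves to
        candidates += [info[4][0] for info in
                       socket.getaddrinfo(socket.gethostname(), None, socket.AF_INET)]
    except socket.gaierror:
        pass
    for ip in candidates:
        if port_listening(ip, port):
            return ip
    return None
```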
in this way, there is no need to pre-define the SIMULATOR_HOST env variable up front; the pythonAPI only needs to find out its own IP and detect at runtime whether 8181 is listening.
container vs service
difference between service and container:
docker run is used to create a standalone container; docker service is run in a distributed env. when creating a service, you specify which container image to use and which commands to execute inside the running containers.
There is only one command (no matter whether the format is CMD, ENTRYPOINT, or command in docker-compose) that docker will run to start your container, and when that command exits, the container exits. in swarm service mode, with the default restart option (any), the container runs, exits, and restarts again with a different containerID. check dockerfile, docker-compose and swarm mode lifecycle for details.
docker container restart policy:
docker official doc: start containers automatically
no: simply doesn't restart under any circumstance.
on-failure: restart if the exit code indicates an error. the user can specify a maximum number of times Docker will automatically restart the container; the container will not restart when the app exits with a successful exit code.
unless-stopped: only stops when Docker is stopped. most of the time this policy works exactly like always, with one exception: if a container was stopped and the server reboots or the Docker service restarts, the container won't restart itself; if the container was running before the reboot, it will be restarted once the system is back up.
always: tells Docker to restart the container under any circumstance, including after a reboot. no other policy restarts the container on system reboot.
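as a sketch, the policy is set per container with docker run, and the swarm-level counterpart is --restart-condition (image and service names here are illustrative):

```sh
# per-container restart policy
docker run -d --restart unless-stopped redis
# per-service restart condition in swarm (any | on-failure | none)
docker service update --restart-condition on-failure redisJQ
```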
similar restart policies can be found in:
keep redisJQ alive in python script
with the default setup, the redis server keeps restarting and running, which makes the pythonapi service report: redis.exceptions.ConnectionError: Error 111 connecting to xx.xxx.xxx:6379. Connection refused.
so we can keep redisJQ alive at the python script level with a simple while loop.
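a minimal sketch of that loop (the connect callable and names are mine; in the real script the except clause would catch redis.exceptions.ConnectionError around something like redis.Redis(host=..., port=6379).ping()):

```python
import time

def wait_for_redis(connect, retries=30, delay=1.0):
    """retry `connect` until it succeeds instead of crashing on
    'Error 111 ... Connection refused' while redis is restarting."""
    last_err = None
    for _ in range(retries):
        try:
            return connect()
        except ConnectionError as err:  # redis.exceptions.ConnectionError in the real script
            last_err = err
            time.sleep(delay)
    raise last_err
```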
for test purposes, we also set the pythonAPI restart policy to none, so the service won't automatically rerun even with an empty jobQueue. the final test script can be run as follows:
|
|
use python variable in os.system
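for example, a runtime value can be interpolated into the shell command with an f-string (the command itself is illustrative):

```python
import os

simulator_host = "172.17.0.3"  # value computed at runtime

# build the shell command from the python variable, then run it
cmd = f"SIMULATOR_HOST={simulator_host} echo starting worker"
status = os.system(cmd)  # returns the shell's exit status; 0 means success
```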
proxy in docker swarm
Routing external traffic into the cluster, load balancing across replicas, and DNS service discovery are a few capabilities that require finesse. but a proxy can neither assign a special IP to a particular service nor expose the service at a fixed IP, so in our case it is not helpful.