I. Networks Docker creates by default

After installation, Docker automatically creates three networks: bridge, none, and host:

[root@k8s-node1 ~]# docker network ls
NETWORK ID          NAME                DRIVER              SCOPE
2a0cc0d0e1ee        bridge              bridge              local
7e135216136d        host                host                local
72f0eaf254a8        none                null                local

The --net option of docker run accepts five kinds of values (Kubernetes mainly uses the bridge and container modes).

II. Bridge mode for Docker containers

Containers use bridge mode by default:

[root@k8s-master ~]# docker inspect python-3.5.2 |jq '.[] | .HostConfig.NetworkMode'
"default"

[root@k8s-master ~]# docker network inspect bridge |jq '.[] | .Containers'
{
    "d2679307a4298d49653b9981f3a5a315d5b505827f49262b54efad39a40bfe19": {
        "Name": "python-3.5.2",
        "EndpointID": "99b0ae766b3b7e0e12557bb0f1433223cf7a62f0cc807fdccf4bc36976ad5b26",
        "MacAddress": "02:42:0a:14:3b:03",
        "IPv4Address": "10.20.59.3/24",
        "IPv6Address": ""
    }
}

[root@k8s-master ~]# ifconfig docker0
docker0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 10.20.59.1  netmask 255.255.255.0  broadcast 0.0.0.0
        inet6 fe80::42:f8ff:fe1a:34dd  prefixlen 64  scopeid 0x20<link>
        ether 02:42:f8:1a:34:dd  txqueuelen 0  (Ethernet)
        RX packets 1433  bytes 103881 (101.4 KiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 1742  bytes 2924861 (2.7 MiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

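In bridge mode, Docker creates a veth pair (by design, whatever one end of a veth pair receives is immediately passed to the other end). One end is attached to the docker0 bridge and the other end is placed in the container's network namespace: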

[root@k8s-master ~]# ip link
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT qlen 1
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP mode DEFAULT qlen 1000
    link/ether 52:54:00:5c:2a:8d brd ff:ff:ff:ff:ff:ff
3: flannel0: <POINTOPOINT,MULTICAST,NOARP,UP,LOWER_UP> mtu 1472 qdisc pfifo_fast state UNKNOWN mode DEFAULT qlen 500
    link/none
4: docker0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT
    link/ether 02:42:f8:1a:34:dd brd ff:ff:ff:ff:ff:ff
8: veth6719931@if7: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master docker0 state UP mode DEFAULT
    link/ether c6:b5:c1:b2:81:65 brd ff:ff:ff:ff:ff:ff link-netnsid 1

[root@k8s-master ~]# brctl show docker0
bridge name     bridge id               STP enabled     interfaces
docker0         8000.0242f81a34dd       no              veth6719931

[root@k8s-master ~]# ip netns list-id
nsid 0
nsid 1

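The host now has one additional veth device, veth6719931. It is attached to the docker0 bridge and has interface index 8; its other end is in the network namespace with id 1, where the peer has interface index 7.

At this point python-3.5.2 is the only container using bridge mode. Look at the veth device from inside that container: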

[root@k8s-master ~]# docker exec -it python-3.5.2 ip link
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
7: eth0@if8: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default
    link/ether 02:42:0a:14:3b:03 brd ff:ff:ff:ff:ff:ff

[root@k8s-master ~]# docker exec -it python-3.5.2 ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
7: eth0@if8: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default
    link/ether 02:42:0a:14:3b:03 brd ff:ff:ff:ff:ff:ff
    inet 10.20.59.3/24 scope global eth0
       valid_lft forever preferred_lft forever
    inet6 fe80::42:aff:fe14:3b03/64 scope link
       valid_lft forever preferred_lft forever

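Docker renames the veth device inside the container to eth0, so it looks like a NIC but is really just a veth device. Container python-3.5.2 does indeed have a veth device with interface index 7 and IP 10.20.59.3/24, and its peer is the veth device with index 8. (The biggest difference between a Linux bridge and a physical layer-2 bridge is that a Linux bridge can be assigned an IP address, as docker0 is here with 10.20.59.1.)

Now look at the host's routing table: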
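The pairing described above can be checked mechanically. The following is a minimal Python sketch, not part of the original walkthrough: it parses the `name@ifN` notation from the `ip link` lines shown earlier and confirms that each end names the other's interface index. The helper names `parse_veth_peers` and `is_pair` are made up for illustration.

```python
import re

# Matches lines such as "8: veth6719931@if7: <BROADCAST,...>" from `ip link`.
LINK_RE = re.compile(r"^(\d+):\s+([^@:\s]+)@if(\d+):")


def parse_veth_peers(ip_link_output):
    """Return {ifname: (own_index, peer_index)} for interfaces that report a peer."""
    peers = {}
    for line in ip_link_output.splitlines():
        match = LINK_RE.match(line.strip())
        if match:
            index, name, peer_index = match.groups()
            peers[name] = (int(index), int(peer_index))
    return peers


def is_pair(a, b):
    """Two ends form a pair when each one's peer index equals the other's own index."""
    return a[0] == b[1] and a[1] == b[0]


# Lines copied from the host and container transcripts above.
host_output = """\
4: docker0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT
8: veth6719931@if7: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master docker0 state UP mode DEFAULT
"""
container_output = """\
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1
7: eth0@if8: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default
"""

host = parse_veth_peers(host_output)
container = parse_veth_peers(container_output)
print(is_pair(host["veth6719931"], container["eth0"]))
```

Interface indices are only unique within one network namespace, so matching indices are a strong hint rather than proof when many namespaces exist on the host.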

[root@k8s-master ~]# ip route list
default via 192.168.128.1 dev eth0
10.20.0.0/16 dev flannel0  proto kernel  scope link  src 10.20.100.0
10.20.59.0/24 dev docker0  proto kernel  scope link  src 10.20.59.1
169.254.0.0/16 dev eth0  scope link  metric 1002
192.168.128.0/20 dev eth0  proto kernel  scope link  src 192.168.138.224

[root@k8s-master ~]# netstat -rn
Kernel IP routing table
Destination     Gateway         Genmask         Flags   MSS Window  irtt Iface
0.0.0.0         192.168.128.1   0.0.0.0         UG        0 0          0 eth0
10.20.0.0       0.0.0.0         255.255.0.0     U         0 0          0 flannel0
10.20.59.0      0.0.0.0         255.255.255.0   U         0 0          0 docker0
169.254.0.0     0.0.0.0         255.255.0.0     U         0 0          0 eth0
192.168.128.0   0.0.0.0         255.255.240.0   U         0 0          0 eth0

[root@k8s-master ~]# docker exec -it python-3.5.2 ip route list
default via 10.20.59.1 dev eth0
10.20.59.0/24 dev eth0  proto kernel  scope link  src 10.20.59.3

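Packets destined for 10.20.59.0/24 are sent straight to the docker0 bridge; if the destination is 10.20.59.3, the bridge delivers them to the container's veth device eth0. This is how traffic from outside reaches the container.

Likewise, according to the container's routing table, packets bound for the outside go out through eth0 (src 10.20.59.3) to the container's default gateway, docker0 (10.20.59.1).

Next, look at the host's iptables rules.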
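The route selection described here follows longest-prefix matching. The sketch below is an illustration in Python using the standard `ipaddress` module, with the routes copied from the `ip route list` output above. The `lookup` function is a hypothetical helper; it ignores route metrics and is not how the kernel implements its lookup.

```python
import ipaddress

# (destination network, gateway or None when directly connected, device),
# copied from the host's `ip route list` output.
HOST_ROUTES = [
    ("0.0.0.0/0", "192.168.128.1", "eth0"),
    ("10.20.0.0/16", None, "flannel0"),
    ("10.20.59.0/24", None, "docker0"),
    ("169.254.0.0/16", None, "eth0"),
    ("192.168.128.0/20", None, "eth0"),
]

# Copied from the container's `ip route list` output.
CONTAINER_ROUTES = [
    ("0.0.0.0/0", "10.20.59.1", "eth0"),
    ("10.20.59.0/24", None, "eth0"),
]


def lookup(routes, destination):
    """Return (gateway, device) of the most specific route containing destination."""
    dest = ipaddress.ip_address(destination)
    best = None
    for network, gateway, device in routes:
        net = ipaddress.ip_network(network)
        # Keep the matching route with the longest prefix.
        if dest in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, gateway, device)
    if best is None:
        raise LookupError("no route to " + destination)
    return best[1], best[2]


print(lookup(HOST_ROUTES, "10.20.59.3"))         # the container's address
print(lookup(CONTAINER_ROUTES, "180.76.3.151"))  # an external address
```

On the host, 10.20.59.3 matches both 10.20.0.0/16 (flannel0) and 10.20.59.0/24 (docker0); the /24 wins because it is more specific.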

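The NAT table (network address translation; it rewrites source and destination IP addresses and mainly concerns the network behind the Linux host rather than the host itself):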

-A POSTROUTING -s 10.20.59.0/24 ! -o docker0 -j MASQUERADE

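This rule matches packets whose source address is in 10.20.59.0/24 (that is, packets generated by Docker containers) and that are not going out through the docker0 bridge. It applies dynamic source address translation (MASQUERADE), replacing the container's IP address with the IP address of the host's outgoing NIC.

The host has a NIC eth0 with IP address 192.168.138.224/20 and gateway 192.168.128.1. Suppose a container on this host with IP 10.20.59.3/24 pings Baidu (180.76.3.151). The IP packet first goes from the container to its default gateway docker0 (10.20.59.1); once it reaches docker0 it has reached the host. The host then consults its routing table and finds that the packet should leave through eth0 toward the host's gateway 192.168.128.1, so the packet is forwarded to eth0 (this requires ip_forward to be enabled on the host). Just before the packet leaves through eth0, the iptables rule above takes effect and performs SNAT, rewriting the source address to eth0's address (192.168.138.224). To the outside world the packet therefore appears to come from 192.168.138.224, and the Docker container is not visible externally.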
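To make the source rewrite concrete, here is a small Python model of the single POSTROUTING rule quoted above. It is only a sketch: the `Packet` class and `postrouting` function are invented for illustration, and real MASQUERADE also rewrites ports and tracks connections, which this model leaves out.

```python
import ipaddress
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Packet:
    src: str
    dst: str
    out_iface: str  # interface chosen by the routing decision


CONTAINER_NET = ipaddress.ip_network("10.20.59.0/24")

# Address of each outgoing interface on the host, taken from the text.
IFACE_ADDRS = {"eth0": "192.168.138.224", "docker0": "10.20.59.1"}


def postrouting(packet, iface_addrs):
    """Model of: -A POSTROUTING -s 10.20.59.0/24 ! -o docker0 -j MASQUERADE"""
    from_container = ipaddress.ip_address(packet.src) in CONTAINER_NET
    if from_container and packet.out_iface != "docker0":
        # MASQUERADE uses the address of the outgoing interface as the new source.
        return replace(packet, src=iface_addrs[packet.out_iface])
    return packet


ping = Packet(src="10.20.59.3", dst="180.76.3.151", out_iface="eth0")
print(postrouting(ping, IFACE_ADDRS).src)

# Container-to-container traffic stays on docker0 and is left untouched.
local = Packet(src="10.20.59.3", dst="10.20.59.2", out_iface="docker0")
print(postrouting(local, IFACE_ADDRS).src)
```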

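The FILTER table (packet filtering, mainly concerned with packets entering the host) contains the following rules in its FORWARD chain, which handles packets routed through the host: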

-A FORWARD -i docker0 ! -o docker0 -j ACCEPT

-A FORWARD -i docker0 -o docker0 -j ACCEPT

-A FORWARD -o docker0 -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT

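1. Packets arriving from the docker0 bridge that are forwarded out through any interface other than docker0 are accepted;

2. Packets arriving from docker0 may also be forwarded back to docker0 itself, which allows the containers attached to the docker0 bridge to reach one another;

3. Packets forwarded to the docker0 bridge that belong to an already established (or related) connection are accepted unconditionally, and the kernel delivers them back to the container over the original connection.

If a container uses port mapping: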
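The three FORWARD rules can be read as a simple decision function. The Python sketch below is an illustration only: the `Forwarded` class and `forward_verdict` function are hypothetical names, and a packet that matches none of the three rules simply falls through to whatever rules or chain policy follow, which the rule listing above does not show.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Forwarded:
    in_iface: str
    out_iface: str
    ct_state: str  # conntrack state: "NEW", "ESTABLISHED" or "RELATED"


def forward_verdict(pkt):
    """Return the number of the first matching rule (1, 2 or 3), or None."""
    # 1: -A FORWARD -i docker0 ! -o docker0 -j ACCEPT
    if pkt.in_iface == "docker0" and pkt.out_iface != "docker0":
        return 1
    # 2: -A FORWARD -i docker0 -o docker0 -j ACCEPT
    if pkt.in_iface == "docker0" and pkt.out_iface == "docker0":
        return 2
    # 3: -A FORWARD -o docker0 -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT
    if pkt.out_iface == "docker0" and pkt.ct_state in ("RELATED", "ESTABLISHED"):
        return 3
    # No match here: later rules or the chain policy decide.
    return None


print(forward_verdict(Forwarded("docker0", "eth0", "NEW")))          # container to outside
print(forward_verdict(Forwarded("docker0", "docker0", "NEW")))       # container to container
print(forward_verdict(Forwarded("eth0", "docker0", "ESTABLISHED")))  # reply to a container
print(forward_verdict(Forwarded("eth0", "docker0", "NEW")))          # unsolicited inbound
```

Because all three rules end in ACCEPT, the order in which they are checked does not change the outcome for a given packet.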

[root@k8s-master ~]# docker run -dit --restart always --name mysql-5.7.16 -e MYSQL_ROOT_PASSWORD=R0otMe -p 13306:3306 mysql:5.7.16
4d9110e05ed1b8755cf1ecc3f3cca0380b6aed2c43f95656b42ecdc7e7db2a2f

This creates a container named mysql-5.7.16 and maps port 13306 on the host to port 3306 in the container.

The NAT table gains two rules used for port forwarding:

-A POSTROUTING -s 10.20.59.2/32 -d 10.20.59.2/32 -p tcp -m tcp --dport 3306 -j MASQUERADE

-A DOCKER ! -i docker0 -p tcp -m tcp --dport 13306 -j DNAT --to-destination 10.20.59.2:3306

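First rule: TCP packets with source address 10.20.59.2/32, destination address 10.20.59.2/32, and destination port 3306 (that is, traffic the container sends to its own mapped port) are masqueraded (MASQUERADE);

Second rule: TCP packets with destination port 13306 that do not arrive on the docker0 bridge are redirected to 10.20.59.2:3306 using DNAT (DNAT rewrites the destination address).

The FILTER table gains one rule: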

-A DOCKER -d 10.20.59.2/32 ! -i docker0 -o docker0 -p tcp -m tcp --dport 3306 -j ACCEPT

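TCP packets with destination address 10.20.59.2/32 and destination port 3306 that arrive on an interface other than docker0 and leave through docker0 are accepted as well.

III. Container mode for Docker containers

The "other container" network mode is a rather special network mode in Docker. It is called "other container mode" because a container in this mode uses another container's network environment. It is "special" because its network isolation lies between that of bridge mode and host mode: a container that shares another container's network environment has no network isolation from that container, yet both remain isolated from the host and from all other containers.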
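The port mapping path can be followed end to end with a small model: the DNAT rule rewrites the destination, and the filter rule then accepts the rewritten packet. This Python sketch is illustrative only; the `Tcp` class and both function names are made up, and the masquerade rule for the container talking to itself is not modeled.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Tcp:
    in_iface: str
    dst: str
    dport: int


def docker_nat(pkt):
    """Model of: -A DOCKER ! -i docker0 -p tcp -m tcp --dport 13306
    -j DNAT --to-destination 10.20.59.2:3306"""
    if pkt.in_iface != "docker0" and pkt.dport == 13306:
        return replace(pkt, dst="10.20.59.2", dport=3306)
    return pkt


def docker_filter_accepts(pkt, out_iface):
    """Model of: -A DOCKER -d 10.20.59.2/32 ! -i docker0 -o docker0
    -p tcp -m tcp --dport 3306 -j ACCEPT"""
    return (
        pkt.dst == "10.20.59.2"
        and pkt.in_iface != "docker0"
        and out_iface == "docker0"
        and pkt.dport == 3306
    )


# A client outside the host connects to the published port on the host's address.
incoming = Tcp(in_iface="eth0", dst="192.168.138.224", dport=13306)
translated = docker_nat(incoming)
print(translated.dst, translated.dport)
# After DNAT the routing table sends 10.20.59.2 out through docker0.
print(docker_filter_accepts(translated, out_iface="docker0"))
```

Note that the filter rule matches the translated port 3306, not the published port 13306, because DNAT is applied before the packet is filtered.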

Create two containers, the first in bridge mode and the second in container mode:

[root@k8s-master ~]# docker run -dit --name netinfra-container --net bridge --restart always -p 23306:3306 debian:jessie
33372053c4c2d70eee5f02784c0d73537e8374cf4fc15c73523d9d8901fd7204

[root@k8s-master ~]# docker run -dit --name mysql-share-ns --net container:netinfra-container --restart always -e MYSQL_ROOT_PASSWORD=R0otMe mysql:5.7.16
9d54fc2883a32e266cb298297964312407014b9ee58e8c2a8b5504aa1af37b1b

[root@k8s-master ~]# docker ps
CONTAINER ID        IMAGE               COMMAND                  CREATED             STATUS              PORTS                     NAMES
9d54fc2883a3        mysql:5.7.16        "docker-entrypoint.sh"   4 seconds ago       Up 3 seconds                                  mysql-share-ns
33372053c4c2        debian:jessie       "/bin/bash"              19 seconds ago      Up 18 seconds       0.0.0.0:23306->3306/tcp   netinfra-container
4d9110e05ed1        mysql:5.7.16        "docker-entrypoint.sh"   About an hour ago   Up About an hour    0.0.0.0:13306->3306/tcp   mysql-5.7.16
d2679307a429        python:3.5.2        "python3"                5 hours ago         Up 5 hours                                    python-3.5.2

List the containers that use bridge mode:

[root@k8s-master ~]# docker network inspect bridge |jq '.[] | .Containers'
{
    "33372053c4c2d70eee5f02784c0d73537e8374cf4fc15c73523d9d8901fd7204": {
        "Name": "netinfra-container",
        "EndpointID": "32ece9b94a6fd6f2fde19839292581551aed26560c81e10d08c664ba20f252cf",
        "MacAddress": "02:42:0a:14:3b:04",
        "IPv4Address": "10.20.59.4/24",
        "IPv6Address": ""
    },
    "4d9110e05ed1b8755cf1ecc3f3cca0380b6aed2c43f95656b42ecdc7e7db2a2f": {
        "Name": "mysql-5.7.16",
        "EndpointID": "244eb7413d06bafbed06f8611640667bef321e9a4cafed171defd69e9cdd71e0",
        "MacAddress": "02:42:0a:14:3b:02",
        "IPv4Address": "10.20.59.2/24",
        "IPv6Address": ""
    },
    "d2679307a4298d49653b9981f3a5a315d5b505827f49262b54efad39a40bfe19": {
        "Name": "python-3.5.2",
        "EndpointID": "99b0ae766b3b7e0e12557bb0f1433223cf7a62f0cc807fdccf4bc36976ad5b26",
        "MacAddress": "02:42:0a:14:3b:03",
        "IPv4Address": "10.20.59.3/24",
        "IPv6Address": ""
    }
}

Container netinfra-container has IP 10.20.59.4/24 and uses bridge mode. Check the network modes of netinfra-container and mysql-share-ns:

[root@k8s-master ~]# docker inspect netinfra-container mysql-share-ns |jq '.[] | {ContainerID: .Id, ContainerName: .Name, NetworkMode: .HostConfig.NetworkMode}'
{
    "ContainerID": "33372053c4c2d70eee5f02784c0d73537e8374cf4fc15c73523d9d8901fd7204",
    "ContainerName": "/netinfra-container",
    "NetworkMode": "bridge"
}
{
    "ContainerID": "9d54fc2883a32e266cb298297964312407014b9ee58e8c2a8b5504aa1af37b1b",
    "ContainerName": "/mysql-share-ns",
    "NetworkMode": "container:netinfra-container"
}

Note the output: netinfra-container uses the bridge network mode, while mysql-share-ns uses the container:netinfra-container network mode.
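The NetworkMode values can be used to group containers by the network namespace they share. The Python sketch below is a hedged illustration: the `netns_owner` and `group_by_netns` helpers are hypothetical, the mode strings are the ones shown in the `docker inspect` output in this article, and only a single level of `container:<name>` reference is handled.

```python
def netns_owner(container, network_modes):
    """Return the container whose network namespace `container` uses."""
    mode = network_modes[container]
    prefix = "container:"
    if mode.startswith(prefix):
        # container:<name> means "join the namespace of <name>".
        return mode[len(prefix):]
    return container


def group_by_netns(network_modes):
    """Map each namespace owner to the containers that share its namespace."""
    groups = {}
    for name in network_modes:
        groups.setdefault(netns_owner(name, network_modes), []).append(name)
    return groups


# NetworkMode values as reported by `docker inspect` in this article.
modes = {
    "netinfra-container": "bridge",
    "mysql-share-ns": "container:netinfra-container",
    "python-3.5.2": "default",
}
print(group_by_netns(modes))
```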

[root@k8s-master ~]# docker inspect netinfra-container mysql-share-ns |jq '.[] | {ContainerName: .Name, NetworkMode: .HostConfig.NetworkMode, NetworkSettings: .NetworkSettings}'

{
    "ContainerName": "/netinfra-container",
    "NetworkMode": "bridge",
    "NetworkSettings": {
        "Bridge": "",
        "SandboxID": "2623339d087303632e15bad76b862ce96cc98199c54b8d8722161caf28c506c3",
        "HairpinMode": false,
        "LinkLocalIPv6Address": "",
        "LinkLocalIPv6PrefixLen": 0,
        "Ports": {
            "3306/tcp": [
                {
                    "HostIp": "0.0.0.0",
                    "HostPort": "23306"
                }
            ]
        },
        "SandboxKey": "/var/run/docker/netns/2623339d0873",
        "SecondaryIPAddresses": null,
        "SecondaryIPv6Addresses": null,
        "EndpointID": "32ece9b94a6fd6f2fde19839292581551aed26560c81e10d08c664ba20f252cf",
        "Gateway": "10.20.59.1",
        "GlobalIPv6Address": "",
        "GlobalIPv6PrefixLen": 0,
        "IPAddress": "10.20.59.4",
        "IPPrefixLen": 24,
        "IPv6Gateway": "",
        "MacAddress": "02:42:0a:14:3b:04",
        "Networks": {
            "bridge": {
                "IPAMConfig": null,
                "Links": null,
                "Aliases": null,
                "NetworkID": "d052a2494c7e0aa6b4dd69bc9da5258011aca5331fb1462fd734732aa7bb91ba",
                "EndpointID": "32ece9b94a6fd6f2fde19839292581551aed26560c81e10d08c664ba20f252cf",
                "Gateway": "10.20.59.1",
                "IPAddress": "10.20.59.4",
                "IPPrefixLen": 24,
                "IPv6Gateway": "",
                "GlobalIPv6Address": "",
                "GlobalIPv6PrefixLen": 0,
                "MacAddress": "02:42:0a:14:3b:04"
            }
        }
    }
}
{
    "ContainerName": "/mysql-share-ns",
    "NetworkMode": "container:netinfra-container",
    "NetworkSettings": {
        "Bridge": "",
        "SandboxID": "",
        "HairpinMode": false,
        "LinkLocalIPv6Address": "",
        "LinkLocalIPv6PrefixLen": 0,
        "Ports": null,
        "SandboxKey": "",
        "SecondaryIPAddresses": null,
        "SecondaryIPv6Addresses": null,
        "EndpointID": "",
        "Gateway": "",
        "GlobalIPv6Address": "",
        "GlobalIPv6PrefixLen": 0,
        "IPAddress": "",
        "IPPrefixLen": 0,
        "IPv6Gateway": "",
        "MacAddress": "",
        "Networks": null
    }
}

Check the network state in each of the two containers:

[root@k8s-master ~]# docker exec -it netinfra-container ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
25: eth0@if26: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default
    link/ether 02:42:0a:14:3b:04 brd ff:ff:ff:ff:ff:ff
    inet 10.20.59.4/24 scope global eth0
       valid_lft forever preferred_lft forever
    inet6 fe80::42:aff:fe14:3b04/64 scope link
       valid_lft forever preferred_lft forever

[root@k8s-master ~]# docker exec -it mysql-share-ns ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
25: eth0@if26: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default
    link/ether 02:42:0a:14:3b:04 brd ff:ff:ff:ff:ff:ff
    inet 10.20.59.4/24 scope global eth0
       valid_lft forever preferred_lft forever
    inet6 fe80::42:aff:fe14:3b04/64 scope link
       valid_lft forever preferred_lft forever

The two containers see exactly the same network stack.

Connect to MySQL from inside the mysql-share-ns container:

[root@k8s-master ~]# docker exec -it mysql-share-ns mysql -uroot -pR0otMe
mysql: [Warning] Using a password on the command line interface can be insecure.
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 2
Server version: 5.7.16 MySQL Community Server (GPL)
Copyright (c) 2000, 2016, Oracle and/or its affiliates. All rights reserved.
Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
mysql>

From another container, mysql-5.7.16, there are two ways to reach the MySQL service in mysql-share-ns:

[root@k8s-master ~]# docker exec -it mysql-5.7.16 mysql -h 10.20.59.4 -P 3306 -uroot -pR0otMe
......
mysql>

[root@k8s-master ~]# docker exec -it mysql-5.7.16 mysql -h 192.168.138.224 -P 23306 -uroot -pR0otMe
......
mysql>

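From the host, or from other containers on the host, the MySQL service in mysql-share-ns can be reached directly at the IP of the container group (netinfra-container and mysql-share-ns), 10.20.59.4 (this is the idea behind the ClusterIP mode of a k8s service).

From other containers, or from any machine that can reach the host (192.168.138.224), the service can be reached at the host's IP plus the port exposed on the host (this is the idea behind the NodePort mode of a k8s service).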
