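On CentOS 7.6 I installed a k8s cluster from the binary packages in the yum repository. After creating a NodePort service, the service could not be reached from outside the cluster. This post records how the problem was tracked down and fixed.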
Environment
CentOS 7.6
k8s v1.5.2
dashboard 1.6.0
master: 10.10.10.14
node1: 10.10.10.15
node2: 10.10.10.16
Machine outside the cluster: 10.10.10.1
Problem analysis
[vagrant@localhost ~]$ kubectl get pod -n kube-system -o wide
NAME READY STATUS RESTARTS AGE IP NODE
kubernetes-dashboard-1801235744-v6r9t 1/1 Running 2 3d 172.16.91.2 node2
[vagrant@localhost ~]$ kubectl get svc -n kube-system
NAME CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kubernetes-dashboard 10.254.106.243 <nodes> 80:30389/TCP 3d
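The problem: from the machine outside the cluster (10.10.10.1), the dashboard cannot be reached at https://10.10.10.16:30389, and other services behave the same way. Access from inside the cluster, or through the cluster IP, works, which rules out a problem with the cluster itself. k8s services are implemented with iptables rules, so my first suspicion was iptables.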
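Locating the problem
To find the step at which iptables mishandles the packet, I added TRACE logging. The following command, run on node2, adds a TRACE rule to the PREROUTING chain of the raw table for TCP packets with destination port 30389. The trace is written to /var/log/messages.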
[vagrant@localhost ~]$ sudo iptables -t raw -A PREROUTING -p tcp --dport 30389 -j TRACE
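A TRACE rule logs every matching packet at each rule it traverses, so it is worth deleting once the capture is done. The sketch below wraps the rule from the command above in a small helper. The names `trace_nodeport` and `IPTABLES` are hypothetical and not part of the original setup.

```shell
# Hypothetical helper: add (-A) or delete (-D) the TRACE rule for one TCP port.
# IPTABLES can be overridden, e.g. IPTABLES=echo to preview the arguments.
IPTABLES="${IPTABLES:-sudo iptables}"

trace_nodeport() {
  action="$1"   # -A to start tracing, -D to stop
  port="$2"     # destination port, e.g. the NodePort 30389
  # $IPTABLES is left unquoted on purpose so "sudo iptables" splits into two words.
  $IPTABLES -t raw "$action" PREROUTING -p tcp --dport "$port" -j TRACE
}
```

Under these assumptions, `trace_nodeport -A 30389` is equivalent to the command above, and `trace_nodeport -D 30389` removes the rule again.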
A close look at the log shows that the packet gets as far as the FORWARD chain of the filter table, and nothing follows after that.
[vagrant@localhost ~]$ sudo cat /var/log/messages
Sep 10 07:24:50 localhost kernel: TRACE: raw:PREROUTING:policy:2 IN=eth1 OUT= MAC=08:00:27:43:31:95:0a:00:27:00:00:0c:08:00 SRC=10.10.10.1 DST=10.10.10.16 LEN=52 TOS=0x00 PREC=0x00 TTL=128 ID=43914 DF PROTO=TCP SPT=58026 DPT=30389 SEQ=73790228 ACK=0 WINDOW=64240 RES=0x00 SYN URGP=0 OPT (020405B40103030801010402)
Sep 10 07:24:50 localhost kernel: TRACE: nat:PREROUTING:rule:1 IN=eth1 OUT= MAC=08:00:27:43:31:95:0a:00:27:00:00:0c:08:00 SRC=10.10.10.1 DST=10.10.10.16 LEN=52 TOS=0x00 PREC=0x00 TTL=128 ID=43914 DF PROTO=TCP SPT=58026 DPT=30389 SEQ=73790228 ACK=0 WINDOW=64240 RES=0x00 SYN URGP=0 OPT (020405B40103030801010402)
Sep 10 07:24:50 localhost kernel: TRACE: nat:KUBE-SERVICES:rule:4 IN=eth1 OUT= MAC=08:00:27:43:31:95:0a:00:27:00:00:0c:08:00 SRC=10.10.10.1 DST=10.10.10.16 LEN=52 TOS=0x00 PREC=0x00 TTL=128 ID=43914 DF PROTO=TCP SPT=58026 DPT=30389 SEQ=73790228 ACK=0 WINDOW=64240 RES=0x00 SYN URGP=0 OPT (020405B40103030801010402)
Sep 10 07:24:50 localhost kernel: TRACE: nat:KUBE-NODEPORTS:rule:3 IN=eth1 OUT= MAC=08:00:27:43:31:95:0a:00:27:00:00:0c:08:00 SRC=10.10.10.1 DST=10.10.10.16 LEN=52 TOS=0x00 PREC=0x00 TTL=128 ID=43914 DF PROTO=TCP SPT=58026 DPT=30389 SEQ=73790228 ACK=0 WINDOW=64240 RES=0x00 SYN URGP=0 OPT (020405B40103030801010402)
Sep 10 07:24:50 localhost kernel: TRACE: nat:KUBE-MARK-MASQ:rule:1 IN=eth1 OUT= MAC=08:00:27:43:31:95:0a:00:27:00:00:0c:08:00 SRC=10.10.10.1 DST=10.10.10.16 LEN=52 TOS=0x00 PREC=0x00 TTL=128 ID=43914 DF PROTO=TCP SPT=58026 DPT=30389 SEQ=73790228 ACK=0 WINDOW=64240 RES=0x00 SYN URGP=0 OPT (020405B40103030801010402)
Sep 10 07:24:50 localhost kernel: TRACE: nat:KUBE-MARK-MASQ:return:2 IN=eth1 OUT= MAC=08:00:27:43:31:95:0a:00:27:00:00:0c:08:00 SRC=10.10.10.1 DST=10.10.10.16 LEN=52 TOS=0x00 PREC=0x00 TTL=128 ID=43914 DF PROTO=TCP SPT=58026 DPT=30389 SEQ=73790228 ACK=0 WINDOW=64240 RES=0x00 SYN URGP=0 OPT (020405B40103030801010402) MARK=0x4000
Sep 10 07:24:50 localhost kernel: TRACE: nat:KUBE-NODEPORTS:rule:4 IN=eth1 OUT= MAC=08:00:27:43:31:95:0a:00:27:00:00:0c:08:00 SRC=10.10.10.1 DST=10.10.10.16 LEN=52 TOS=0x00 PREC=0x00 TTL=128 ID=43914 DF PROTO=TCP SPT=58026 DPT=30389 SEQ=73790228 ACK=0 WINDOW=64240 RES=0x00 SYN URGP=0 OPT (020405B40103030801010402) MARK=0x4000
Sep 10 07:24:50 localhost kernel: TRACE: nat:KUBE-SVC-XGLOHA7QRQ3V22RZ:rule:1 IN=eth1 OUT= MAC=08:00:27:43:31:95:0a:00:27:00:00:0c:08:00 SRC=10.10.10.1 DST=10.10.10.16 LEN=52 TOS=0x00 PREC=0x00 TTL=128 ID=43914 DF PROTO=TCP SPT=58026 DPT=30389 SEQ=73790228 ACK=0 WINDOW=64240 RES=0x00 SYN URGP=0 OPT (020405B40103030801010402) MARK=0x4000
Sep 10 07:24:50 localhost kernel: TRACE: nat:KUBE-SEP-7MDAIXCESNWFCQ4R:rule:2 IN=eth1 OUT= MAC=08:00:27:43:31:95:0a:00:27:00:00:0c:08:00 SRC=10.10.10.1 DST=10.10.10.16 LEN=52 TOS=0x00 PREC=0x00 TTL=128 ID=43914 DF PROTO=TCP SPT=58026 DPT=30389 SEQ=73790228 ACK=0 WINDOW=64240 RES=0x00 SYN URGP=0 OPT (020405B40103030801010402) MARK=0x4000
Sep 10 07:24:50 localhost kernel: TRACE: filter:FORWARD:rule:1 IN=eth1 OUT=docker0 MAC=08:00:27:43:31:95:0a:00:27:00:00:0c:08:00 SRC=10.10.10.1 DST=172.16.91.2 LEN=52 TOS=0x00 PREC=0x00 TTL=127 ID=43914 DF PROTO=TCP SPT=58026 DPT=9090 SEQ=73790228 ACK=0 WINDOW=64240 RES=0x00 SYN URGP=0 OPT (020405B40103030801010402) MARK=0x4000
Sep 10 07:24:50 localhost kernel: TRACE: filter:DOCKER-ISOLATION:return:1 IN=eth1 OUT=docker0 MAC=08:00:27:43:31:95:0a:00:27:00:00:0c:08:00 SRC=10.10.10.1 DST=172.16.91.2 LEN=52 TOS=0x00 PREC=0x00 TTL=127 ID=43914 DF PROTO=TCP SPT=58026 DPT=9090 SEQ=73790228 ACK=0 WINDOW=64240 RES=0x00 SYN URGP=0 OPT (020405B40103030801010402) MARK=0x4000
Sep 10 07:24:50 localhost kernel: TRACE: filter:FORWARD:rule:2 IN=eth1 OUT=docker0 MAC=08:00:27:43:31:95:0a:00:27:00:00:0c:08:00 SRC=10.10.10.1 DST=172.16.91.2 LEN=52 TOS=0x00 PREC=0x00 TTL=127 ID=43914 DF PROTO=TCP SPT=58026 DPT=9090 SEQ=73790228 ACK=0 WINDOW=64240 RES=0x00 SYN URGP=0 OPT (020405B40103030801010402) MARK=0x4000
Sep 10 07:24:50 localhost kernel: TRACE: filter:DOCKER:return:1 IN=eth1 OUT=docker0 MAC=08:00:27:43:31:95:0a:00:27:00:00:0c:08:00 SRC=10.10.10.1 DST=172.16.91.2 LEN=52 TOS=0x00 PREC=0x00 TTL=127 ID=43914 DF PROTO=TCP SPT=58026 DPT=9090 SEQ=73790228 ACK=0 WINDOW=64240 RES=0x00 SYN URGP=0 OPT (020405B40103030801010402) MARK=0x4000
Sep 10 07:24:50 localhost kernel: TRACE: filter:FORWARD:policy:6 IN=eth1 OUT=docker0 MAC=08:00:27:43:31:95:0a:00:27:00:00:0c:08:00 SRC=10.10.10.1 DST=172.16.91.2 LEN=52 TOS=0x00 PREC=0x00 TTL=127 ID=43914 DF PROTO=TCP SPT=58026 DPT=9090 SEQ=73790228 ACK=0 WINDOW=64240 RES=0x00 SYN URGP=0 OPT (020405B40103030801010402) MARK=0x4000
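The raw trace lines are long, and only a few fields matter for following the packet. As a rough sketch, the filter below reduces each TRACE line to its table:chain:type:rulenum field plus the current destination address and port. The function name `trace_path` is hypothetical.

```shell
# Hypothetical helper: summarize kernel TRACE lines from files or stdin.
# Prints "<table:chain:type:rulenum> <DST>:<DPT>" for each TRACE line.
trace_path() {
  awk '/TRACE:/ {
    hop = dst = dpt = ""
    for (i = 1; i <= NF; i++) {
      if ($i == "TRACE:") hop = $(i + 1)               # field right after "TRACE:"
      else if ($i ~ /^DST=/) dst = substr($i, 5)       # strip the "DST=" prefix
      else if ($i ~ /^DPT=/) dpt = substr($i, 5)       # strip the "DPT=" prefix
    }
    print hop, dst ":" dpt
  }' "$@"
}
```

It could be used as `sudo cat /var/log/messages | trace_path`. Applied to the log above, the destination changes from 10.10.10.16:30389 to 172.16.91.2:9090 after the KUBE-SEP rule, which is the DNAT to the dashboard pod, and the final hop is filter:FORWARD:policy:6.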
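Listing the FORWARD chain shows that its policy is DROP, and that is where the problem lies. The last trace entry, filter:FORWARD:policy:6, means that no rule in the chain matched the packet, so the chain policy was applied and the packet was dropped.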
[vagrant@localhost ~]$ sudo iptables -nL FORWARD
Chain FORWARD (policy DROP)
target prot opt source destination
DOCKER-ISOLATION all -- 0.0.0.0/0 0.0.0.0/0
DOCKER all -- 0.0.0.0/0 0.0.0.0/0
ACCEPT all -- 0.0.0.0/0 0.0.0.0/0 ctstate RELATED,ESTABLISHED
ACCEPT all -- 0.0.0.0/0 0.0.0.0/0
ACCEPT all -- 0.0.0.0/0 0.0.0.0/0
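In the listing above, the last two ACCEPT rules show no visible condition, yet the trace shows the packet falling through to the policy. Without `-v`, `iptables -L` hides the in/out interface columns, so these rules presumably carry interface matches that the listing does not show. `iptables -S FORWARD` prints the chain in rule-specification form, including any `-i`/`-o` matches, with the policy on a `-P` line. A minimal sketch that extracts the policy from that output (the name `forward_policy` is hypothetical):

```shell
# Hypothetical helper: read `iptables -S FORWARD` output on stdin
# and print the chain's default policy (ACCEPT or DROP).
forward_policy() {
  awk '$1 == "-P" && $2 == "FORWARD" { print $3 }'
}
```

Usage would be `sudo iptables -S FORWARD | forward_policy`, which on node2 in the state shown above should print DROP.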
Solution
Change the policy of the FORWARD chain to ACCEPT.
[vagrant@localhost ~]$ sudo iptables -P FORWARD ACCEPT
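A policy set with `iptables -P` exists only in the running kernel's rule set, so it does not survive a reboot. The sketch below is an idempotent version of the fix that could be rerun whenever needed, for example from a boot-time script (how it is scheduled is left open here). The names `ensure_forward_accept` and `IPTABLES` are hypothetical.

```shell
# Hypothetical helper: set the FORWARD policy to ACCEPT only if it is not already.
# IPTABLES can be overridden for testing; the default needs sudo rights.
IPTABLES="${IPTABLES:-sudo iptables}"

ensure_forward_accept() {
  # `iptables -S FORWARD` prints the policy as a line like "-P FORWARD DROP".
  current=$($IPTABLES -S FORWARD | awk '$1 == "-P" { print $3 }')
  if [ "$current" != "ACCEPT" ]; then
    $IPTABLES -P FORWARD ACCEPT
  fi
}
```

Calling `ensure_forward_accept` on each node has the same effect as the command above when the policy is DROP and does nothing when it is already ACCEPT.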