How Flannel Works


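Notice: this article is reposted from https://my.oschina.net/jxcdwangtao/blog/1624486 to share the information more widely, for study and discussion only. If it infringes any rights, please contact me and I will remove it promptly.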

Author: xidianwangtao@gmail.com

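Overview

Our TaaS platform has recently run into many network problems. It turns out that the "contiv + ovs + vlan" solution is not suited to a large-scale, high-concurrency scenario like TaaS: the pitfalls never end, although it works fine for DevOps scenarios. With time pressing, our only option was to build a small cluster on the simple, stable "Flannel + host-gw" solution as an emergency fallback. I took the opportunity to see what Flannel looks like today, having avoided it for the past two years because its poor performance was so widely criticized. Half a month of stability and stress testing over the Spring Festival showed that it is indeed very stable. Of course, calico (bgp) remains our main network solution going forward.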

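Flannel supports several backends, but the backend cannot be changed at runtime. The officially recommended backends are:

  • VXLAN: performance overhead of roughly 20~30%;
  • host-gw: performance overhead of roughly 10%; requires direct Layer 2 connectivity between hosts, so it only suits small clusters;
  • UDP: recommended for debugging only, because its performance is terrible; if the NIC supports UDP offload, so that encapsulation and decapsulation are done by the NIC itself, performance is still very good.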
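
To make the Layer 2 requirement of host-gw concrete, here is a minimal sketch of the routing table that backend conceptually maintains: one route per remote node's Pod subnet, with the remote node's own IP as the next hop. The node IPs and subnets are made up for illustration, and flanneld programs such routes itself rather than printing `ip route` commands; the commands are only a readable way to show the result.

```python
import ipaddress

def host_gw_routes(local_ip, nodes):
    """Return (pod_subnet, next_hop) pairs one node needs under host-gw.

    nodes maps each node's host IP to its Pod subnet (hypothetical values).
    """
    routes = []
    for host_ip, pod_subnet in sorted(nodes.items()):
        if host_ip == local_ip:
            continue  # the local Pod subnet is reached through the local bridge
        routes.append((str(ipaddress.ip_network(pod_subnet)), host_ip))
    return routes

# Hypothetical three-node cluster.
nodes = {
    "192.168.1.10": "10.244.26.0/24",
    "192.168.1.11": "10.244.27.0/24",
    "192.168.1.12": "10.244.28.0/24",
}

for subnet, gateway in host_gw_routes("192.168.1.10", nodes):
    print(f"ip route add {subnet} via {gateway}")
```

Because the next hop is the remote host's IP, that host has to be directly reachable on the same Layer 2 segment, and every node carries one route per peer, which is why this backend fits small clusters.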

Experimental backends, not recommended for production:

  • AliVPC
  • Alloc
  • AWS VPC
  • GCE
  • IPIP
  • IPSec

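Flannel Configuration

Flannel's official configuration reference is at https://github.com/coreos/flannel/blob/master/Documentation/configuration.md, but note that the options documented there are neither up to date nor complete.

Configuring via the Command Line

The command-line options of Flannel v0.10.0, the latest release at the time of writing, are: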

Usage: /opt/bin/flanneld [OPTION]...
  -etcd-cafile string
        SSL Certificate Authority file used to secure etcd communication
  -etcd-certfile string
        SSL certification file used to secure etcd communication
  -etcd-endpoints string
        a comma-delimited list of etcd endpoints (default "http://127.0.0.1:4001,http://127.0.0.1:2379")
  -etcd-keyfile string
        SSL key file used to secure etcd communication
  -etcd-password string
        password for BasicAuth to etcd
  -etcd-prefix string
        etcd prefix (default "/coreos.com/network")
  -etcd-username string
        username for BasicAuth to etcd
  -healthz-ip string
        the IP address for healthz server to listen (default "0.0.0.0")
  -healthz-port int
        the port for healthz server to listen(0 to disable)
  -iface value
        interface to use (IP or name) for inter-host communication. Can be specified multiple times to check each option in order. Returns the first match found.
  -iface-regex value
        regex expression to match the first interface to use (IP or name) for inter-host communication. Can be specified multiple times to check each regex in order. Returns the first match found. Regexes are checked after specific interfaces specified by the iface option have already been checked.
  -ip-masq
        setup IP masquerade rule for traffic destined outside of overlay network
  -kube-api-url string
        Kubernetes API server URL. Does not need to be specified if flannel is running in a pod.
  -kube-subnet-mgr
        contact the Kubernetes API for subnet assignment instead of etcd.
  -kubeconfig-file string
        kubeconfig file location. Does not need to be specified if flannel is running in a pod.
  -log_backtrace_at value
        when logging hits line file:N, emit a stack trace
  -public-ip string
        IP accessible by other nodes for inter-host communication
  -subnet-file string
        filename where env variables (subnet, MTU, ... ) will be written to (default "/run/flannel/subnet.env")
  -subnet-lease-renew-margin int
        subnet lease renewal margin, in minutes, ranging from 1 to 1439 (default 60)
  -v value
        log level for V logs
  -version
        print version and exit
  -vmodule value
        comma-separated list of pattern=N settings for file-filtered logging

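A few notes:

  • We set -kube-subnet-mgr so that Flannel reads its configuration from the corresponding ConfigMap through the Kubernetes APIServer. We did not set -kubeconfig-file or -kube-api-url either, because we deploy Flannel as Pods through a DaemonSet, so Flannel authenticates to the Kubernetes APIServer with its ServiceAccount.

  • The alternative is to read the Flannel configuration directly from etcd, which requires setting the flags that start with -etcd.

  • -subnet-file defaults to /run/flannel/subnet.env and usually does not need to be changed. Flannel writes the local host's subnet information into this file as environment variables, and this file is where the subnet information is actually read from. For example:

    FLANNEL_NETWORK=10.244.0.0/16
    FLANNEL_SUBNET=10.244.26.1/24
    FLANNEL_MTU=1500
    FLANNEL_IPMASQ=true

  • -subnet-lease-renew-margin is how long before the etcd lease expires the lease may be renewed automatically; the default is 1h (60 minutes). Because the lease TTL is 24h, the margin naturally cannot reach 24h, hence the allowed range of [1, 1439] minutes.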
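
As an illustration of what the subnet.env file carries, the following sketch parses the sample values quoted above and derives the node's bridge address and Pod subnet with Python's standard `ipaddress` module. The function name `parse_subnet_env` is made up for this example and is not part of Flannel.

```python
import ipaddress

# Sample content, copied from the subnet.env example in this article.
SAMPLE = """\
FLANNEL_NETWORK=10.244.0.0/16
FLANNEL_SUBNET=10.244.26.1/24
FLANNEL_MTU=1500
FLANNEL_IPMASQ=true
"""

def parse_subnet_env(text):
    """Parse KEY=VALUE lines into a dict, skipping blanks and comments."""
    env = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        env[key] = value
    return env

env = parse_subnet_env(SAMPLE)
network = ipaddress.ip_network(env["FLANNEL_NETWORK"])   # cluster-wide Pod network
iface = ipaddress.ip_interface(env["FLANNEL_SUBNET"])    # this node's gateway address and subnet

print("gateway address:", iface.ip)
print("node Pod subnet:", iface.network)
print("inside cluster network:", iface.network.subnet_of(network))
```

Note that FLANNEL_SUBNET is written as an address with a prefix length (10.244.26.1/24), so the node's subnet is obtained by masking it, which `ip_interface(...).network` does here.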

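Configuring via Environment Variables

Every command-line option above can also be set through an environment variable: strip the leading dashes, convert the name to uppercase, replace the remaining dashes with underscores, and add the FLANNELD_ prefix.

For example, --etcd-endpoints=http://10.0.0.2:2379 corresponds to the environment variable FLANNELD_ETCD_ENDPOINTS=http://10.0.0.2:2379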
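
The naming rule can be written down in a few lines. This is only a sketch of the convention (the helper name `flag_to_env` is invented here), not code taken from flanneld:

```python
def flag_to_env(flag):
    """Map a flanneld command-line flag to its environment variable name."""
    name = flag.lstrip("-")  # drop the leading "-" or "--"
    return "FLANNELD_" + name.upper().replace("-", "_")

print(flag_to_env("--etcd-endpoints"))
print(flag_to_env("-subnet-lease-renew-margin"))
```

With this rule, the DaemonSet could equally set FLANNELD_KUBE_SUBNET_MGR in the container's env instead of passing --kube-subnet-mgr on the command line.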

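Deploying Flannel

Deploying Flannel as a Kubernetes DaemonSet is the uncontroversial choice. The matching ClusterRole, ClusterRoleBinding, ServiceAccount and ConfigMap are created along with it. The complete YAML manifest is as follows: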

---
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1beta1
metadata:
  name: flannel
rules:
  - apiGroups:
      - ""
    resources:
      - pods
    verbs:
      - get
  - apiGroups:
      - ""
    resources:
      - nodes
    verbs:
      - list
      - watch
  - apiGroups:
      - ""
    resources:
      - nodes/status
    verbs:
      - patch
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1beta1
metadata:
  name: flannel
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: flannel
subjects:
- kind: ServiceAccount
  name: flannel
  namespace: kube-system
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: flannel
  namespace: kube-system
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: kube-flannel-cfg
  namespace: kube-system
  labels:
    tier: node
    k8s-app: flannel
data:
  cni-conf.json: |
    {
      "name": "cbr0",
      "plugins": [
        {
         "type": "flannel",
         "delegate": {
           "hairpinMode": true,
           "isDefaultGateway": true
         }
        }
      ]
    }
  net-conf.json: |
    {
      "Network": "10.244.0.0/16",
      "Backend": {
        "Type": "host-gw"
      }
    }
---
apiVersion: extensions/v1beta1
kind: DaemonSet
metadata:
  name: kube-flannel
  namespace: kube-system
  labels:
    tier: node
    k8s-app: flannel
spec:
  template:
    metadata:
      labels:
        tier: node
        k8s-app: flannel
    spec:
      imagePullSecrets:
      - name: harborsecret
      serviceAccountName: flannel
      containers:
      - name: kube-flannel
        image: registry.vivo.xyz:4443/coreos/flannel:v0.10.0-amd64
        command: [ "/opt/bin/flanneld", "--ip-masq", "--kube-subnet-mgr"]
        securityContext:
          privileged: true
        env:
        - name: POD_NAME
          valueFrom:
            fieldRef:
              fieldPath: metadata.name
        - name: POD_NAMESPACE
          valueFrom:
            fieldRef:
              fieldPath: metadata.namespace
        - name: POD_IP
          valueFrom:
            fieldRef:
              fieldPath: status.podIP
        volumeMounts:
        - name: run
          mountPath: /run
        - name: cni
          mountPath: /etc/cni/net.d
        - name: flannel-cfg
          mountPath: /etc/kube-flannel/
      - name: install-cni
        image: registry.vivo.xyz:4443/coreos/flannel-cni:v0.3.0
        command: ["/install-cni.sh"]
        #command: ["sleep","10000"]
        env:
        # The CNI network config to install on each node.
        - name: CNI_NETWORK_CONFIG
          valueFrom:
            configMapKeyRef:
              name: kube-flannel-cfg
              key: cni-conf.json
        volumeMounts:
        #- name: cni
        #  mountPath: /etc/cni/net.d
        - name: cni
          mountPath: /host/etc/cni/net.d
        - name: host-cni-bin
          mountPath: /host/opt/cni/bin/
      hostNetwork: true
      tolerations:
      - key: node-role.kubernetes.io/master
        operator: Exists
        effect: NoSchedule
      volumes:
        - name: run
          hostPath:
            path: /run
        #- name: cni
        #  hostPath:
        #    path: /etc/kubernetes/cni/net.d
        - name: cni
          hostPath:
            path: /etc/cni/net.d
        - name: flannel-cfg
          configMap:
            name: kube-flannel-cfg
        - name: host-cni-bin
          hostPath:
            path: /etc/cni/net.d
  updateStrategy:
    rollingUpdate:
      maxUnavailable: 1
    type: RollingUpdate
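
To get a feel for the net-conf.json in the ConfigMap, here is a small sketch that loads it and counts how many per-node subnets the 10.244.0.0/16 network can hold. The /24 per-node size is an assumption taken from the FLANNEL_SUBNET value shown elsewhere in this article; with --kube-subnet-mgr the actual per-node subnet is, as far as I understand, whatever Pod CIDR Kubernetes assigned to the node.

```python
import ipaddress
import json

# net-conf.json as defined in the ConfigMap above.
NET_CONF = """
{
  "Network": "10.244.0.0/16",
  "Backend": {
    "Type": "host-gw"
  }
}
"""

NODE_PREFIX_LEN = 24  # assumption: one /24 per node, as in FLANNEL_SUBNET=10.244.26.1/24

conf = json.loads(NET_CONF)
network = ipaddress.ip_network(conf["Network"])
node_subnets = list(network.subnets(new_prefix=NODE_PREFIX_LEN))

print("backend:", conf["Backend"]["Type"])
print("node subnets available:", len(node_subnets))
print("example node subnet:", node_subnets[26])
```

Under that assumption the network tops out at 256 nodes, each with room for 254 Pod addresses, which is comfortable for the small fallback cluster described here.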

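How It Works

A few things are easy to confuse. When we say Flannel (coreos/flannel), we really mean flanneld. Everyone knows that Kubernetes integrates network plugins through the CNI standard, yet when you read the Flannel (coreos/flannel) code you will not find an implementation of the CNI interface. If you have worked with other CNI plugins, you know there is a separate binary that kubelet invokes and that in turn calls the backend network plugin. For Flannel (coreos/flannel), which binary is that, and where is its git repo?

That binary is /etc/cni/net.d/flannel on the host, and its source lives at https://github.com/containernetworking/plugins. Most annoyingly, it is simply named flannel. Why not name it something like flannelk8s, the way contiv netplugin has contivk8s?

The Flannel Pod above also contains a container named install-cni; its script lives at https://github.com/coreos/flannel-cni.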

The kube-flannel Container

The kube-flannel container runs our protagonist, flanneld. The directories and files to pay attention to inside this container are:

  • /etc/kube-flannel/cni-conf.json
  • /etc/kube-flannel/net-conf.json
  • /run/flannel/subnet.env
  • /opt/bin/flanneld

Here is what they contain in my environment:

/run/flannel # ls /etc/kube-flannel/
cni-conf.json  net-conf.json
/run/flannel # cat /etc/kube-flannel/cni-conf.json
{
  "name": "cbr0",
  "plugins": [
    {
     "type": "flannel",
     "delegate": {
       "hairpinMode": true,
       "isDefaultGateway": true
     }
    }
  ]
}
/run/flannel # cat /etc/kube-flannel/net-conf.json
{
  "Network": "10.244.0.0/16",
  "Backend": {
    "Type": "host-gw"
  }
}

/run/flannel # cat  /run/flannel/subnet.env
FLANNEL_NETWORK=10.244.0.0/16
FLANNEL_SUBNET=10.244.26.1/24
FLANNEL_MTU=1500
FLANNEL_IPMASQ=true

/run/flannel # ls /opt/bin/
flanneld           mk-docker-opts.sh
/run/flannel # cat /opt/bin/mk-docker-opts.sh
#!/bin/sh

usage() {
    echo "$0 [-f FLANNEL-ENV-FILE] [-d DOCKER-ENV-FILE] [-i] [-c] [-m] [-k COMBINED-KEY]

Generate Docker daemon options based on flannel env file
OPTIONS:
    -f  Path to flannel env file. Defaults to /run/flannel/subnet.env
    -d  Path to Docker env file to write to. Defaults to /run/docker_opts.env
    -i  Output each Docker option as individual var. e.g. DOCKER_OPT_MTU=1500
    -c  Output combined Docker options into DOCKER_OPTS var
    -k  Set the combined options key to this value (default DOCKER_OPTS=)
    -m  Do not output --ip-masq (useful for older Docker version)
" >&2

    exit 1
}

flannel_env="/run/flannel/subnet.env"
docker_env="/run/docker_opts.env"
combined_opts_key="DOCKER_OPTS"
indiv_opts=false
combined_opts=false
ipmasq=true

while getopts "f:d:icmk:?h" opt; do
    case $opt in
        f)
            flannel_env=$OPTARG
            ;;
        d)
            docker_env=$OPTARG
            ;;
        i)
            indiv_opts=true
            ;;
        c)
            combined_opts=true
            ;;
        m)
            ipmasq=false
            ;;
        k)
            combined_opts_key=$OPTARG
            ;;
        [\?h])
            usage
            ;;
    esac
done

if [ $indiv_opts = false ] && [ $combined_opts = false ]; then
    indiv_opts=true
    combined_opts=true
fi

if [ -f "$flannel_env" ]; then
    . $flannel_env
fi

if [ -n "$FLANNEL_SUBNET" ]; then
    DOCKER_OPT_BIP="--bip=$FLANNEL_SUBNET"
fi

if [ -n "$FLANNEL_MTU" ]; then
    DOCKER_OPT_MTU="--mtu=$FLANNEL_MTU"
fi

if [ -n "$FLANNEL_IPMASQ" ] && [ $ipmasq = true ] ; then
    if [ "$FLANNEL_IPMASQ" = true ] ; then
        DOCKER_OPT_IPMASQ="--ip-masq=false"
    elif [ "$FLANNEL_IPMASQ" = false ] ; then
        DOCKER_OPT_IPMASQ="--ip-masq=true"
    else
        echo "Invalid value of FLANNEL_IPMASQ: $FLANNEL_IPMASQ" >&2
        exit 1
    fi
fi

eval docker_opts="\$${combined_opts_key}"

if [ "$docker_opts" ]; then
    docker_opts="$docker_opts ";
fi

echo -n "" >$docker_env

for opt in $(set | grep "DOCKER_OPT_"); do

    OPT_NAME=$(echo $opt | awk -F "=" '{print $1;}');
    OPT_VALUE=$(eval echo "\$$OPT_NAME");

    if [ "$indiv_opts" = true ]; then
        echo "$OPT_NAME=\"$OPT_VALUE\"" >>$docker_env;
    fi

    docker_opts="$docker_opts $OPT_VALUE";

done

if [ "$combined_opts" = true ]; then
    echo "${combined_opts_key}=\"${docker_opts}\"" >>$docker_env
fi
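
The core of mk-docker-opts.sh is a small mapping from subnet.env values to Docker daemon options. The sketch below restates just that mapping in Python to make it easier to follow; it is a paraphrase of the script shown above, not a replacement for it, and the function name `docker_opts` is invented for this example.

```python
def docker_opts(env, ipmasq=True):
    """Map flannel env values to Docker daemon options, as mk-docker-opts.sh does.

    env is a dict of the variables found in subnet.env; ipmasq=False mirrors
    the script's -m switch (do not emit --ip-masq at all).
    """
    opts = {}
    if env.get("FLANNEL_SUBNET"):
        opts["DOCKER_OPT_BIP"] = "--bip=" + env["FLANNEL_SUBNET"]
    if env.get("FLANNEL_MTU"):
        opts["DOCKER_OPT_MTU"] = "--mtu=" + env["FLANNEL_MTU"]

    masq = env.get("FLANNEL_IPMASQ")
    if masq and ipmasq:
        if masq not in ("true", "false"):
            raise ValueError(f"Invalid value of FLANNEL_IPMASQ: {masq}")
        # The value is inverted, exactly as in the script: presumably so that
        # only one of flannel and Docker installs the masquerade rule.
        opts["DOCKER_OPT_IPMASQ"] = "--ip-masq=" + ("false" if masq == "true" else "true")
    return opts

env = {
    "FLANNEL_NETWORK": "10.244.0.0/16",
    "FLANNEL_SUBNET": "10.244.26.1/24",
    "FLANNEL_MTU": "1500",
    "FLANNEL_IPMASQ": "true",
}
for name, value in docker_opts(env).items():
    print(f'{name}="{value}"')
```

In the CNI-based deployment described in this article the script is not part of the data path; it only matters when Docker's own bridge is configured from flannel's subnet file.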

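The install-cni Container

As its name suggests, the install-cni container installs the CNI plugins: it copies flannel and the other binaries from the image into /etc/cni/net.d on the host. Note that this directory must match the CNI settings of kubelet; if you have not changed kubelet's defaults, kubelet is configured with this CNI directory by default as well. The directories and files to pay attention to inside the install-cni container are: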

  • /host/etc/cni/net.d/
  • /host/opt/cni/bin/
  • /host/etc/cni/net.d/10-flannel.conflist

Here is what they contain in my environment:

/host/etc/cni/net.d # pwd
/host/etc/cni/net.d
/host/etc/cni/net.d # ls
10-flannel.conflist  dhcp                 ipvlan               noop                 tuning
bridge               flannel              loopback             portmap              vlan
cnitool              host-local           macvlan              ptp

/host/etc/cni/net.d # cd /host/opt/cni/bin/
/host/opt/cni/bin # ls
10-flannel.conflist  dhcp                 ipvlan               noop                 tuning
bridge               flannel              loopback             portmap              vlan
cnitool              host-local           macvlan              ptp

/opt/cni/bin # ls
bridge      dhcp        host-local  loopback    noop        ptp         vlan
cnitool     flannel     ipvlan      macvlan     portmap     tuning

/opt/cni/bin # cat /host/etc/cni/net.d/10-flannel.conflist
{
  "name": "cbr0",
  "plugins": [
    {
     "type": "flannel",
     "delegate": {
       "hairpinMode": true,
       "isDefaultGateway": true
     }
    }
  ]
}

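Flannel Architecture Diagram

A diagram should make all of this clear. Note that the colored parts are the information tied to Volumes and deserve particular attention.

The flow for setting up a container's network is: kubelet -> flannel -> flanneld. If Pods are created concurrently on a host, you will see several flannel processes in the background, but they normally exit within a few seconds, whereas flanneld is a long-running daemon.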
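
To show how the pieces connect, here is a rough sketch of what the short-lived flannel CNI binary does with the two files discussed earlier. As I understand it from the plugin's README in containernetworking/plugins, it combines the values flanneld wrote to subnet.env with the "delegate" section of the conflist and hands the result to another plugin (bridge by default). The field defaults below are my reading of that documentation and have not been checked against the exact plugin version used in this article; `build_delegate` is a name invented for this example.

```python
import ipaddress
import json

def build_delegate(network_name, plugin_conf, env):
    """Approximate the config the flannel CNI plugin passes to its delegate."""
    delegate = dict(plugin_conf.get("delegate", {}))
    delegate.setdefault("type", "bridge")          # assumed default delegate plugin
    delegate["name"] = network_name
    # Assumed: the delegate masquerades only if flanneld does not already do so.
    delegate.setdefault("ipMasq", env["FLANNEL_IPMASQ"] != "true")
    delegate.setdefault("mtu", int(env["FLANNEL_MTU"]))
    if delegate["type"] == "bridge":
        delegate.setdefault("isGateway", True)
    node_subnet = ipaddress.ip_interface(env["FLANNEL_SUBNET"]).network
    delegate["ipam"] = {
        "type": "host-local",                       # per-node address allocation
        "subnet": str(node_subnet),
        "routes": [{"dst": env["FLANNEL_NETWORK"]}],
    }
    return delegate

# Values taken from 10-flannel.conflist and subnet.env as shown in this article.
plugin_conf = {"type": "flannel", "delegate": {"hairpinMode": True, "isDefaultGateway": True}}
env = {
    "FLANNEL_NETWORK": "10.244.0.0/16",
    "FLANNEL_SUBNET": "10.244.26.1/24",
    "FLANNEL_MTU": "1500",
    "FLANNEL_IPMASQ": "true",
}
print(json.dumps(build_delegate("cbr0", plugin_conf, env), indent=2))
```

Seen this way, the flannel binary talks to flanneld only indirectly, through the subnet file, which would explain why it can come and go within seconds while flanneld stays resident.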

[Figure: Flannel architecture diagram]

Published on 2018-02-27 10:31