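This post deploys a vanilla Kubernetes v1.24.0 cluster on CentOS 7, using containerd as the container runtime and the stable release (1.11.4) of Cilium as the CNI. The cluster is mainly for my own learning and testing and resources are limited, so high availability is out of scope for now.
In addition, because Cilium provides a complete replacement for kube-proxy, the cluster is deployed in Cilium's kube-proxy-free mode.
I have previously written several posts on Kubernetes fundamentals and cluster setup; take a look if you need the background.
1. Preparation
1.1 Cluster information
All machines are virtual machines with 8 CPU cores, 8 GB of memory, and a 100 GB disk.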
| IP | Hostname |
|---|---|
| 10.31.18.1 | tiny-kubeproxy-free-master-18-1.k8s.tcinternal |
| 10.31.18.11 | tiny-kubeproxy-free-worker-18-11.k8s.tcinternal |
| 10.31.18.12 | tiny-kubeproxy-free-worker-18-12.k8s.tcinternal |
| 10.18.64.0/18 | podSubnet |
| 10.18.0.0/18 | serviceSubnet |
1.2 Check the MAC address and product_uuid
Every node in a Kubernetes cluster must have a unique MAC address and product_uuid, so check both before initializing the cluster.
# Check the MAC address
ip link
ifconfig -a
# Check the product_uuid
sudo cat /sys/class/dmi/id/product_uuid
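Comparing these values by eye across several nodes is error-prone. The sketch below is one possible way to automate the comparison; it assumes the values have already been collected into lines of the form "<hostname> <value>" (the input format and the function name find_duplicates are choices made for this example, not part of any tool).

```shell
# find_duplicates: read "<hostname> <value>" lines on stdin and print every
# value reported by more than one host, followed by the hosts that share it.
# Example of collecting the input (assumes SSH access to each node):
#   for h in node1 node2; do
#       echo "$h $(ssh "$h" cat /sys/class/dmi/id/product_uuid)"
#   done | find_duplicates
find_duplicates() {
    awk '
        { hosts[$2] = (hosts[$2] == "" ? $1 : hosts[$2] "," $1); count[$2]++ }
        END { for (v in count) if (count[v] > 1) print v, hosts[v] }
    '
}
```

Empty output means every collected value is unique.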
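1.3 Configure passwordless SSH login (optional)
If the cluster nodes have more than one network interface, make sure each node can reach the others over the correct interface.
# As root, generate one shared key and allow passwordless login with it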
su root
ssh-keygen
cd /root/.ssh/
cat id_rsa.pub >> authorized_keys
chmod 600 authorized_keys
cat >> ~/.ssh/config <<EOF
Host tiny-kubeproxy-free-master-18-1.k8s.tcinternal
HostName 10.31.18.1
User root
Port 22
IdentityFile ~/.ssh/id_rsa
Host tiny-kubeproxy-free-worker-18-11.k8s.tcinternal
HostName 10.31.18.11
User root
Port 22
IdentityFile ~/.ssh/id_rsa
Host tiny-kubeproxy-free-worker-18-12.k8s.tcinternal
HostName 10.31.18.12
User root
Port 22
IdentityFile ~/.ssh/id_rsa
EOF
1.4 Update the hosts file
cat >> /etc/hosts <<EOF
10.31.18.1 tiny-kubeproxy-free-master-18-1.k8s.tcinternal
10.31.18.11 tiny-kubeproxy-free-worker-18-11.k8s.tcinternal
10.31.18.12 tiny-kubeproxy-free-worker-18-12.k8s.tcinternal
EOF
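The heredoc above appends the same three lines every time it is run. If the step may be repeated, an idempotent variant is useful; the following is a minimal sketch, where add_host_entry is a hypothetical helper name and the file path is passed in so it can be tried on a copy first.

```shell
# add_host_entry FILE IP NAME
# Append "IP NAME" to FILE only if NAME is not already listed as a hostname
# there, so the step can be re-run without piling up duplicate lines.
add_host_entry() (
    file=$1 ip=$2 name=$3
    # Look for NAME as an exact hostname field (field 2 onward) in FILE.
    if ! awk -v n="$name" \
        '{ for (i = 2; i <= NF; i++) if ($i == n) found = 1 } END { exit !found }' \
        "$file" 2>/dev/null
    then
        printf '%s %s\n' "$ip" "$name" >> "$file"
    fi
)
```

For example, `add_host_entry /etc/hosts 10.31.18.1 tiny-kubeproxy-free-master-18-1.k8s.tcinternal` adds the master entry once and does nothing on later runs.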
1.5 Disable swap
# Turn swap off immediately
swapoff -a
# Comment out the swap entry in fstab so it is not mounted again at boot
sed -i '/swap / s/^\(.*\)$/#\1/g' /etc/fstab
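The sed expression above prefixes every line containing "swap " with a "#", including lines that are already commented, so running it twice stacks the markers. A sketch of a more conservative variant follows; it assumes the standard fstab layout in which the third field is the filesystem type, and comment_out_swap is a name invented for this example.

```shell
# comment_out_swap FILE
# Comment out every active fstab entry whose third field (filesystem type)
# is "swap". Lines that are already commented are left untouched, so the
# function is safe to run more than once.
comment_out_swap() (
    file=$1
    tmp=$(mktemp) || exit 1
    awk '!/^[ \t]*#/ && $3 == "swap" { print "#" $0; next } { print }' \
        "$file" > "$tmp" && cat "$tmp" > "$file"
    status=$?
    rm -f "$tmp"
    exit "$status"
)
```

It can be tried on a copy of /etc/fstab before being pointed at the real file.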
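1.6 Configure time synchronization
Either ntp or chrony can be used here, whichever you prefer. For the upstream time server, Alibaba Cloud's ntp1.aliyun.com or the National Time Service Center's ntp.ntsc.ac.cn are both reasonable choices.
Using ntp
# Install ntpdate with yum
yum install ntpdate -y
# Sync the time against the National Time Service Center server
ntpdate ntp.ntsc.ac.cn
# Finally, check the time
hwclock
Using chrony
# Install chrony with yum
yum install chrony -y
# Enable chronyd at boot, start it, and check its status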
systemctl enable chronyd.service
systemctl start chronyd.service
systemctl status chronyd.service
# A custom time server can also be configured
vim /etc/chrony.conf
# Before the change
$ grep server /etc/chrony.conf
# Use public servers from the pool.ntp.org project.
server 0.centos.pool.ntp.org iburst
server 1.centos.pool.ntp.org iburst
server 2.centos.pool.ntp.org iburst
server 3.centos.pool.ntp.org iburst
# After the change
$ grep server /etc/chrony.conf
# Use public servers from the pool.ntp.org project.
server ntp.ntsc.ac.cn iburst
# Restart the service so the configuration takes effect
systemctl restart chronyd.service
# Check the status of chrony's NTP sources
chronyc sourcestats -v
chronyc sources -v
1.7 Disable SELinux
# Switch SELinux to permissive mode for the current boot
setenforce 0
# Disable it permanently in /etc/selinux/config (takes effect after a reboot)
sed -i 's/^SELINUX=enforcing$/SELINUX=disabled/' /etc/selinux/config
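1.8 Configure the firewall
Communication between cluster nodes and service exposure use a large number of ports, so for convenience we simply turn the firewall off.
# On CentOS 7, stop the default firewalld service and disable it at boot
systemctl stop firewalld.service
systemctl disable firewalld.service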
1.9 Configure netfilter parameters
The kernel needs to load br_netfilter, and iptables needs to see bridged IPv4 and IPv6 traffic, so that containers in the cluster can communicate normally.
cat <<EOF | sudo tee /etc/modules-load.d/k8s.conf
br_netfilter
EOF
cat <<EOF | sudo tee /etc/sysctl.d/k8s.conf
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
EOF
sudo sysctl --system
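1.10 Disable IPv6 (not recommended)
Unlike the other CNIs deployed in earlier posts, many Cilium services listen dual-stack by default (when operated through cilium-cli), so it is better to leave the system's IPv6 support enabled, even if no usable IPv6 route exists.
The cluster also works without IPv6; some cilium-cli commands that open a port-forward will simply report errors.
# Add the IPv6 disable parameter to the kernel command line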
grubby --update-kernel=ALL --args=ipv6.disable=1
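1.11 Configure IPVS (optional)
IPVS is a component designed specifically for load balancing. The IPVS mode of kube-proxy improves scalability by reducing its reliance on iptables: instead of hooking PREROUTING in the iptables input chain, it creates a dummy interface called kube-ipvs0. As the number of load-balancing rules in the cluster grows, IPVS forwards traffic more efficiently than iptables.
If Cilium fully replaces kube-proxy, neither IPVS nor iptables is actually used, so in principle this step can be skipped.
Because Cilium requires a kernel upgrade (see 1.12), the kernel used here is newer than 4.19.
Note that from kernel 4.19 onward the nf_conntrack module replaces the former nf_conntrack_ipv4 module (Notes: use nf_conntrack instead of nf_conntrack_ipv4 for Linux kernel 4.19 and later).
# Make sure ipset and ipvsadm are installed before using IPVS mode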
sudo yum install ipset ipvsadm -y
# Load the IPVS-related modules manually
modprobe -- ip_vs
modprobe -- ip_vs_rr
modprobe -- ip_vs_wrr
modprobe -- ip_vs_sh
modprobe -- nf_conntrack
# Load the IPVS-related modules automatically at boot
cat <<EOF | sudo tee /etc/modules-load.d/ipvs.conf
ip_vs
ip_vs_rr
ip_vs_wrr
ip_vs_sh
nf_conntrack
EOF
sudo sysctl --system
# Ideally reboot once to confirm that the modules are loaded
$ lsmod | grep -e ip_vs -e nf_conntrack
nf_conntrack_netlink 49152 0
nfnetlink 20480 2 nf_conntrack_netlink
ip_vs_sh 16384 0
ip_vs_wrr 16384 0
ip_vs_rr 16384 0
ip_vs 159744 6 ip_vs_rr,ip_vs_sh,ip_vs_wrr
nf_conntrack 159744 5 xt_conntrack,nf_nat,nf_conntrack_netlink,xt_MASQUERADE,ip_vs
nf_defrag_ipv4 16384 1 nf_conntrack
nf_defrag_ipv6 24576 2 nf_conntrack,ip_vs
libcrc32c 16384 4 nf_conntrack,nf_nat,xfs,ip_vs
$ cut -f1 -d " " /proc/modules | grep -e ip_vs -e nf_conntrack
nf_conntrack_netlink
ip_vs_sh
ip_vs_wrr
ip_vs_rr
ip_vs
nf_conntrack
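1.12 Configure the Linux kernel (required for Cilium)
What sets Cilium apart from other CNI plugins is that it is built on eBPF, which places fairly high demands on the Linux kernel version. The complete requirements are listed in the official documentation; here we focus on two parts, the kernel version and the kernel configuration options.
Linux kernel version
Cilium's documentation gives a summary of the system requirements. Because we deploy inside a Kubernetes cluster (using containers), only the Linux kernel and etcd versions matter. From earlier deployments we know that Kubernetes 1.23.6 uses etcd 3.5.x by default, so the kernel version is the part that needs attention.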
| Requirement | Minimum Version | In cilium container |
|---|---|---|
| Linux kernel | >= 4.9.17 | no |
| Key-Value store (etcd) | >= 3.1.0 | no |
| clang+LLVM | >= 10.0 | yes |
| iproute2 | >= 5.9.0 | yes |
This requirement is only needed if you run cilium-agent natively. If you are using the Cilium container image cilium/cilium, clang+LLVM is included in the container image.
iproute2 is only needed if you run cilium-agent directly on the host machine. iproute2 is included in the cilium/cilium container image.
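The stock CentOS 7 kernel, 3.10.x, clearly does not meet these requirements, but before upgrading let us look at a few other requirements.
Cilium also publishes a list of the minimum kernel version that each advanced feature needs: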
| Cilium Feature | Minimum Kernel Version |
|---|---|
| IPv4 fragment handling | >= 4.10 |
| Restrictions on unique prefix lengths for CIDR policy rules | >= 4.11 |
| IPsec Transparent Encryption in tunneling mode | >= 4.19 |
| WireGuard Transparent Encryption | >= 5.6 |
| Host-Reachable Services | >= 4.19.57, >= 5.1.16, >= 5.2 |
| Kubernetes Without kube-proxy | >= 4.19.57, >= 5.1.16, >= 5.2 |
| Bandwidth Manager | >= 5.1 |
| Local Redirect Policy (beta) | >= 4.19.57, >= 5.1.16, >= 5.2 |
| Full support for Session Affinity | >= 5.7 |
| BPF-based proxy redirection | >= 5.7 |
| BPF-based host routing | >= 5.10 |
| Socket-level LB bypass in pod netns | >= 5.7 |
| Egress Gateway (beta) | >= 5.2 |
| VXLAN Tunnel Endpoint (VTEP) Integration | >= 5.2 |
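Satisfying every requirement above takes kernel 5.10 or later. In the spirit of learning, testing, and living dangerously, since we are upgrading anyway we may as well go to a newer version. The elrepo repository can be used to upgrade to a recent kernel directly.
# List the kernel versions available in the elrepo repository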
$ yum --disablerepo="*" --enablerepo="elrepo-kernel" list available
Loaded plugins: fastestmirror
Loading mirror speeds from cached hostfile
Available Packages
elrepo-release.noarch 7.0-5.el7.elrepo elrepo-kernel
kernel-lt.x86_64 5.4.192-1.el7.elrepo elrepo-kernel
kernel-lt-devel.x86_64 5.4.192-1.el7.elrepo elrepo-kernel
kernel-lt-doc.noarch 5.4.192-1.el7.elrepo elrepo-kernel
kernel-lt-headers.x86_64 5.4.192-1.el7.elrepo elrepo-kernel
kernel-lt-tools.x86_64 5.4.192-1.el7.elrepo elrepo-kernel
kernel-lt-tools-libs.x86_64 5.4.192-1.el7.elrepo elrepo-kernel
kernel-lt-tools-libs-devel.x86_64 5.4.192-1.el7.elrepo elrepo-kernel
kernel-ml.x86_64 5.17.6-1.el7.elrepo elrepo-kernel
kernel-ml-devel.x86_64 5.17.6-1.el7.elrepo elrepo-kernel
kernel-ml-doc.noarch 5.17.6-1.el7.elrepo elrepo-kernel
kernel-ml-headers.x86_64 5.17.6-1.el7.elrepo elrepo-kernel
kernel-ml-tools.x86_64 5.17.6-1.el7.elrepo elrepo-kernel
kernel-ml-tools-libs.x86_64 5.17.6-1.el7.elrepo elrepo-kernel
kernel-ml-tools-libs-devel.x86_64 5.17.6-1.el7.elrepo elrepo-kernel
perf.x86_64 5.17.6-1.el7.elrepo elrepo-kernel
python-perf.x86_64 5.17.6-1.el7.elrepo elrepo-kernel
# The ml kernel fits our needs, so install it with yum
sudo yum --enablerepo=elrepo-kernel install kernel-ml -y
# Use grubby to list the kernels installed on the system
sudo grubby --info=ALL
# Make the newly installed 5.17.6 kernel the default; index 0 must match the entry shown by the previous command
sudo grubby --set-default-index=0
# Confirm that the default kernel has changed
sudo grubby --default-kernel
# Reboot into the new kernel
init 6
# After the reboot, confirm that the kernel is now 5.17.6
uname -a
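The minimum versions in the feature table can also be checked programmatically rather than by reading `uname` output. The function below is a rough sketch of that comparison; kernel_at_least is a made-up name, and it relies on `sort -V` (version sort), which GNU coreutils provides. It compares only the numeric part before the first "-", and expects the minimum to be written with no more components than the release.

```shell
# kernel_at_least RELEASE MINIMUM
# Succeed (exit status 0) when the numeric part of RELEASE, for example the
# output of `uname -r`, is greater than or equal to MINIMUM.
kernel_at_least() (
    release=${1%%-*}   # "5.17.6-1.el7.elrepo.x86_64" -> "5.17.6"
    minimum=$2
    # Version-sort the two values; MINIMUM must come out first (or be equal).
    lowest=$(printf '%s\n%s\n' "$minimum" "$release" | sort -V | head -n 1)
    [ "$lowest" = "$minimum" ]
)
```

For instance, `kernel_at_least "$(uname -r)" 5.10 && echo "BPF-based host routing is possible"` checks the highest requirement listed in the table.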
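Linux kernel configuration options
First, look at the build options of the kernel currently in use. Each option takes one of three values: y, n, or m.
- y: yes, Build directly into the kernel. The feature is compiled into the kernel and enabled by default.
- n: no, Leave entirely out of the kernel. The feature is not compiled into the kernel and is unavailable.
- m: module, Build as a module, to be loaded if needed. The feature is compiled as a module and enabled on demand.
# Show the build configuration of the running kernel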
cat /boot/config-$(uname -r)
Cilium lists the kernel options that each feature requires as follows:
In order for the eBPF feature to be enabled properly, the following kernel configuration options must be enabled. This is typically the case with distribution kernels. When an option can be built as a module or statically linked, either choice is valid.
For now we only look at the Base Requirements:
CONFIG_BPF=y
CONFIG_BPF_SYSCALL=y
CONFIG_NET_CLS_BPF=y
CONFIG_BPF_JIT=y
CONFIG_NET_CLS_ACT=y
CONFIG_NET_SCH_INGRESS=y
CONFIG_CRYPTO_SHA1=y
CONFIG_CRYPTO_USER_API_HASH=y
CONFIG_CGROUPS=y
CONFIG_CGROUP_BPF=y
Comparing against the 5.17.6-1.el7.elrepo.x86_64 kernel we are using shows that two options are built as modules (m):
$ egrep "^CONFIG_BPF=|^CONFIG_BPF_SYSCALL=|^CONFIG_NET_CLS_BPF=|^CONFIG_BPF_JIT=|^CONFIG_NET_CLS_ACT=|^CONFIG_NET_SCH_INGRESS=|^CONFIG_CRYPTO_SHA1=|^CONFIG_CRYPTO_USER_API_HASH=|^CONFIG_CGROUPS=|^CONFIG_CGROUP_BPF=" /boot/config-5.17.6-1.el7.elrepo.x86_64
CONFIG_BPF=y
CONFIG_BPF_SYSCALL=y
CONFIG_BPF_JIT=y
CONFIG_CGROUPS=y
CONFIG_CGROUP_BPF=y
CONFIG_NET_SCH_INGRESS=m
CONFIG_NET_CLS_BPF=m
CONFIG_NET_CLS_ACT=y
CONFIG_CRYPTO_SHA1=y
CONFIG_CRYPTO_USER_API_HASH=y
Both of these modules can be found under /usr/lib/modules/$(uname -r):
$ realpath ./kernel/net/sched/sch_ingress.ko
/usr/lib/modules/5.17.6-1.el7.elrepo.x86_64/kernel/net/sched/sch_ingress.ko
$ realpath ./kernel/net/sched/cls_bpf.ko
/usr/lib/modules/5.17.6-1.el7.elrepo.x86_64/kernel/net/sched/cls_bpf.ko
Having confirmed that the kernel modules exist, we simply load them:
# Load them with modprobe
$ modprobe cls_bpf
$ modprobe sch_ingress
$ lsmod | egrep "cls_bpf|sch_ingress"
sch_ingress 16384 0
cls_bpf 24576 0
# Load the modules Cilium needs automatically at boot
cat <<EOF | sudo tee /etc/modules-load.d/cilium-base-requirements.conf
cls_bpf
sch_ingress
EOF
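The long egrep pattern used above gets unwieldy as more options are added. As an alternative, the sketch below loops over a list of option names and reports each one; check_kernel_options is a hypothetical helper, and the config file path is passed as an argument (for example /boot/config-$(uname -r)).

```shell
# check_kernel_options CONFIG_FILE OPTION...
# Print "<OPTION>=<value>" for each option, where value is y, m, or
# "missing" when the option is absent or not set. The exit status is
# non-zero if any option is missing.
check_kernel_options() (
    config=$1
    shift
    status=0
    for opt in "$@"; do
        value=$(sed -n "s/^${opt}=//p" "$config" | head -n 1)
        case $value in
            y|m) echo "${opt}=${value}" ;;
            *)   echo "${opt}=missing"; status=1 ;;
        esac
    done
    exit "$status"
)
```

Options reported as m are the ones that need a modprobe and an entry under /etc/modules-load.d/, as done above for cls_bpf and sch_ingress.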
The kernel options needed by Cilium's other advanced features are handled the same way and are not covered here.
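2. Install the container runtime
2.1 Install containerd
See the official documentation for details. The newly released 1.24 removed dockershim, so for Kubernetes 1.24 and later the choice of container runtime needs attention. We are installing 1.24, the latest version, so we can no longer keep using Docker and switch to containerd instead.
Adjust the Linux kernel parameters
# Write the configuration files first so that the settings persist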
cat <<EOF | sudo tee /etc/modules-load.d/containerd.conf
overlay
br_netfilter
EOF
sudo modprobe overlay
sudo modprobe br_netfilter
# Setup required sysctl params, these persist across reboots.
cat <<EOF | sudo tee /etc/sysctl.d/99-kubernetes-cri.conf
net.bridge.bridge-nf-call-iptables = 1
net.ipv4.ip_forward = 1
net.bridge.bridge-nf-call-ip6tables = 1
EOF
# Apply sysctl params without reboot
sudo sysctl --system
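Install containerd
On CentOS 7 the convenient route is to install from an existing yum repository; containerd can be installed from Docker's official yum repository.
# Add Docker's official yum repository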
sudo yum install -y yum-utils device-mapper-persistent-data lvm2
sudo yum-config-manager --add-repo https://download.docker.com/linux/centos/docker-ce.repo
# List the containerd.io versions available in the repository
yum list containerd.io --showduplicates | sort -r
# Install the latest containerd.io
yum install containerd.io -y
# Start containerd
sudo systemctl start containerd
# Finally, enable it at boot
sudo systemctl enable --now containerd
About CRI
According to the containerd project, Kubernetes users do not need to install cri-containerd, and those archives will be removed in containerd 2.0.
FAQ: For Kubernetes, do I need to download cri-containerd-(cni-)<VERSION>-<OS>-<ARCH>.tar.gz too?
Answer: No.
As the Kubernetes CRI feature has been already included in containerd-<VERSION>-<OS>-<ARCH>.tar.gz, you do not need to download the cri-containerd-.... archives to use CRI.
The cri-containerd-... archives are deprecated, do not work on old Linux distributions, and will be removed in containerd 2.0.
Install cni-plugins
Installing from the yum repository brings in runc but not the CNI plugins, so those still have to be installed separately.
The containerd.io package contains runc too, but does not contain CNI plugins.
Find the release that matches the system architecture on GitHub (amd64 here) and extract it.
# Download the cni-plugins-<OS>-<ARCH>-<VERSION>.tgz archive from https://github.com/containernetworking/plugins/releases , verify its sha256sum, and extract it under /opt/cni/bin:
# Download the archive and its sha512 file, then verify the checksum
$ wget https://github.com/containernetworking/plugins/releases/download/v1.1.1/cni-plugins-linux-amd64-v1.1.1.tgz
$ wget https://github.com/containernetworking/plugins/releases/download/v1.1.1/cni-plugins-linux-amd64-v1.1.1.tgz.sha512
$ sha512sum -c cni-plugins-linux-amd64-v1.1.1.tgz.sha512
# Create the directory and extract the archive
$ mkdir -p /opt/cni/bin
$ tar Cxzvf /opt/cni/bin cni-plugins-linux-amd64-v1.1.1.tgz
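2.2 Configure cgroup drivers
CentOS 7 uses systemd as its init system and process manager. The init process creates and uses a root control group (cgroup) and acts as the cgroup manager. systemd is tightly integrated with cgroups and allocates a cgroup to every systemd unit. The container runtime and the kubelet can also be configured to use cgroupfs, but using cgroupfs alongside systemd means there are two different cgroup managers, and a system running both tends to become unstable. It is therefore better to configure the container runtime and the kubelet to use systemd as the cgroup driver. For containerd, this means setting the SystemdCgroup option in /etc/containerd/config.toml.
See the official Kubernetes documentation: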
https://kubernetes.io/docs/setup/production-environment/container-runtimes/#containerd-systemd
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc]
...
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc.options]
SystemdCgroup = true
Next, configure containerd's cgroup driver.
# The default configuration shows that systemd is not enabled
$ containerd config default | grep SystemdCgroup
SystemdCgroup = false
# The configuration file shipped with the yum package is very short
$ cat /etc/containerd/config.toml | egrep -v "^#|^$"
disabled_plugins = ["cri"]
# Replace it with a full default configuration written to config.toml
$ mv /etc/containerd/config.toml /etc/containerd/config.toml.origin
$ containerd config default > /etc/containerd/config.toml
# Set SystemdCgroup to true and restart
$ sed -i 's/SystemdCgroup = false/SystemdCgroup = true/g' /etc/containerd/config.toml
$ systemctl restart containerd
# The containerd status output now shows a CNI-related error
# This is because cni-plugins is installed but no CNI plugin has been deployed for Kubernetes yet
# It is expected at this stage
$ systemctl status containerd -l
May 12 09:57:31 tiny-kubeproxy-free-master-18-1.k8s.tcinternal containerd[5758]: time="2022-05-12T09:57:31.100285056+08:00" level=error msg="failed to load cni during init, please check CRI plugin status before setting up network for pods" error="cni config load failed: no network config found in /etc/cni/net.d: cni plugin not initialized: failed to load cni config"
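Since the sed substitution above silently does nothing if the pattern is not found, it may be worth confirming the result before moving on. A small sketch of such a check follows; systemd_cgroup_enabled is a name chosen for this example, and it only inspects the file rather than the running daemon.

```shell
# systemd_cgroup_enabled CONFIG_FILE
# Succeed when CONFIG_FILE contains an uncommented "SystemdCgroup = true"
# line, with any amount of leading indentation.
systemd_cgroup_enabled() {
    grep -Eq '^[[:space:]]*SystemdCgroup[[:space:]]*=[[:space:]]*true' "$1"
}
```

Running `systemd_cgroup_enabled /etc/containerd/config.toml || echo "SystemdCgroup is not enabled"` after the edit flags a configuration that still uses cgroupfs.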
2.3 The kubelet cgroup driver
The Kubernetes documentation describes in detail how to set the kubelet's cgroup driver. Note in particular that starting with v1.22, if the kubelet's cgroup driver is not set explicitly, kubeadm defaults it to systemd.
Note: In v1.22, if the user is not setting the cgroupDriver field under KubeletConfiguration, kubeadm will default it to systemd.
A simple way to set the kubelet's cgroup driver is to add the cgroupDriver field to kubeadm-config.yaml:
# kubeadm-config.yaml
kind: ClusterConfiguration
apiVersion: kubeadm.k8s.io/v1beta3
kubernetesVersion: v1.21.0
---
kind: KubeletConfiguration
apiVersion: kubelet.config.k8s.io/v1beta1
cgroupDriver: systemd
After initialization, the cluster's kubeadm-config can be inspected through its ConfigMap. The sample output below comes from a v1.23.6 cluster.
$ kubectl describe configmaps kubeadm-config -n kube-system
Name: kubeadm-config
Namespace: kube-system
Labels: <none>
Annotations: <none>
Data
====
ClusterConfiguration:
----
apiServer:
extraArgs:
authorization-mode: Node,RBAC
timeoutForControlPlane: 4m0s
apiVersion: kubeadm.k8s.io/v1beta3
certificatesDir: /etc/kubernetes/pki
clusterName: kubernetes
controllerManager: {}
dns: {}
etcd:
local:
dataDir: /var/lib/etcd
imageRepository: registry.aliyuncs.com/google_containers
kind: ClusterConfiguration
kubernetesVersion: v1.23.6
networking:
dnsDomain: cali-cluster.tclocal
serviceSubnet: 10.88.0.0/18
scheduler: {}
BinaryData
====
Events: <none>
Since the version we are installing is newer than 1.22.0 and already uses systemd, no further configuration is needed.
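3. Install kubeadm, kubelet, and kubectl
See the official documentation for reference.
The three tools are kubeadm, kubelet, and kubectl. Their roles are:
- kubeadm: the command that bootstraps the cluster.
- kubelet: runs on every node in the cluster and starts Pods and containers.
- kubectl: the command-line tool for talking to the cluster.
Note that:
- kubeadm does not manage kubelet or kubectl for us, and the same holds the other way round; the three are independent and none of them manages the others;
- the kubelet version must be less than or equal to the API server version, otherwise compatibility problems are likely;
- kubectl does not need to be installed on every node, or on a cluster node at all; it can be installed on your local machine and, together with a kubeconfig file, used to manage the cluster remotely.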
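The version constraint between kubelet and the API server can be checked before installing or upgrading packages. The sketch below compares only the major.minor part, on the assumption that patch releases within the same minor version are interchangeable for this purpose; kubelet_not_newer is a hypothetical name and `sort -V` from GNU coreutils is assumed.

```shell
# kubelet_not_newer KUBELET_VERSION APISERVER_VERSION
# Succeed when the kubelet's major.minor version is less than or equal to
# the API server's. Versions may be written as "v1.24.0" or "1.24.0".
kubelet_not_newer() (
    kubelet=$(printf '%s' "${1#v}" | cut -d. -f1,2)
    apiserver=$(printf '%s' "${2#v}" | cut -d. -f1,2)
    # After a version sort, the API server version must be the highest.
    highest=$(printf '%s\n%s\n' "$kubelet" "$apiserver" | sort -V | tail -n 1)
    [ "$highest" = "$apiserver" ]
)
```

For example, `kubelet_not_newer v1.23.6 v1.24.0` succeeds, while `kubelet_not_newer v1.25.0 v1.24.0` fails.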
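Installation on CentOS 7 is simple: use the official yum repository. The official instructions also adjust the SELinux mode at this point, but SELinux was already disabled earlier, so that step is skipped here.
# Add Google's official yum repository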
cat <<EOF | sudo tee /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=https://packages.cloud.google.com/yum/repos/kubernetes-el7-\$basearch
enabled=1
gpgcheck=1
repo_gpgcheck=1
gpgkey=https://packages.cloud.google.com/yum/doc/yum-key.gpg https://packages.cloud.google.com/yum/doc/rpm-package-key.gpg
exclude=kubelet kubeadm kubectl
EOF
# If Google's repository is unreachable, the Alibaba Cloud mirror in China is an alternative
cat <<EOF > /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64/
enabled=1
gpgcheck=1
repo_gpgcheck=1
gpgkey=https://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg https://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
EOF
# Then install the three packages
sudo yum install -y kubelet kubeadm kubectl --disableexcludes=kubernetes
# If a poor network causes GPG verification to fail so that the repository cannot be read, consider turning off repo_gpgcheck for it
sed -i 's/repo_gpgcheck=1/repo_gpgcheck=0/g' /etc/yum.repos.d/kubernetes.repo
# Or skip the GPG check at install time
sudo yum install -y kubelet kubeadm kubectl --nogpgcheck --disableexcludes=kubernetes
# To install a specific version, list the available versions first
sudo yum list --nogpgcheck kubelet kubeadm kubectl --showduplicates --disableexcludes=kubernetes
# After installation, enable kubelet at boot
sudo systemctl enable --now kubelet
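4. Initialize the cluster
4.1 Write the configuration file
Once the three steps above have been carried out on every node, we can start creating the cluster. Because this deployment is not highly available, initialization is done directly on the intended master node.
# First list the main image versions with kubeadm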
$ kubeadm config images list
k8s.gcr.io/kube-apiserver:v1.24.0
k8s.gcr.io/kube-controller-manager:v1.24.0
k8s.gcr.io/kube-scheduler:v1.24.0
k8s.gcr.io/kube-proxy:v1.24.0
k8s.gcr.io/pause:3.7
k8s.gcr.io/etcd:3.5.3-0
k8s.gcr.io/coredns/coredns:v1.8.6
# For easier editing and management, export the init defaults to a configuration file
$ kubeadm config print init-defaults > kubeadm-kubeproxy-free.conf
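- Because the k8s.gcr.io registry is usually unreachable from networks in mainland China, change imageRepository in the configuration file to the Alibaba Cloud mirror.
- kubernetesVersion specifies the Kubernetes version to install.
- localAPIEndpoint must be set to the master node's IP and port; this becomes the apiserver address of the initialized cluster.
- criSocket defaults to containerd as of 1.24.0.
- podSubnet, serviceSubnet, and dnsDomain can keep their defaults; here I changed all three to suit my own needs.
- name under nodeRegistration is set to the hostname of the master node.
- A configuration block is added to use IPVS; see the official documentation for details.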
apiVersion: kubeadm.k8s.io/v1beta3
bootstrapTokens:
- groups:
- system:bootstrappers:kubeadm:default-node-token
token: abcdef.0123456789abcdef
ttl: 24h0m0s
usages:
- signing
- authentication
kind: InitConfiguration
localAPIEndpoint:
advertiseAddress: 10.31.18.1
bindPort: 6443
nodeRegistration:
criSocket: unix:///var/run/containerd/containerd.sock
imagePullPolicy: IfNotPresent
name: tiny-kubeproxy-free-master-18-1.k8s.tcinternal
taints: null
---
apiServer:
timeoutForControlPlane: 4m0s
apiVersion: kubeadm.k8s.io/v1beta3
certificatesDir: /etc/kubernetes/pki
clusterName: kubernetes
controllerManager: {}
dns: {}
etcd:
local:
dataDir: /var/lib/etcd
imageRepository: registry.aliyuncs.com/google_containers
kind: ClusterConfiguration
kubernetesVersion: 1.24.0
networking:
dnsDomain: free-cluster.tclocal
serviceSubnet: 10.18.0.0/18
podSubnet: 10.18.64.0/18
scheduler: {}
---
apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
mode: ipvs
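When podSubnet and serviceSubnet are changed by hand, it is easy to pick ranges that overlap each other or some other network. The following sketch checks two IPv4 CIDRs for overlap using plain shell arithmetic; the function names are invented for this example, and 64-bit shell arithmetic (as in bash or dash on a 64-bit system) is assumed.

```shell
# ip_to_int A.B.C.D -> prints the address as an unsigned 32-bit integer.
ip_to_int() (
    IFS=.
    set -- $1
    echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
)

# cidr_range ADDR/PREFIX -> prints "<first> <last>" as integers.
cidr_range() (
    base=$(ip_to_int "${1%/*}")
    prefix=${1#*/}
    size=$(( 1 << (32 - $prefix) ))
    first=$(( base / size * size ))     # align to the network boundary
    echo "$first $(( first + size - 1 ))"
)

# cidrs_overlap CIDR1 CIDR2 -> exit status 0 if the two ranges intersect.
cidrs_overlap() (
    set -- $(cidr_range "$1") $(cidr_range "$2")
    [ "$1" -le "$4" ] && [ "$3" -le "$2" ]
)
```

With the values from the configuration above, `cidrs_overlap 10.18.64.0/18 10.18.0.0/18` fails (no overlap), whereas `cidrs_overlap 10.0.0.0/8 10.18.64.0/18` succeeds, which shows why a 10.0.0.0/8 pod range would collide with these subnets.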
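4.2 Initialize the cluster
Listing the images for this configuration file now shows that they come from the Alibaba Cloud mirror.
Following Cilium's official guide, the flag --skip-phases=addon/kube-proxy can be added at initialization to skip installing kube-proxy.
# List the image versions to confirm that the configuration file takes effect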
$ kubeadm config images list --config kubeadm-kubeproxy-free.conf
registry.aliyuncs.com/google_containers/kube-apiserver:v1.24.0
registry.aliyuncs.com/google_containers/kube-controller-manager:v1.24.0
registry.aliyuncs.com/google_containers/kube-scheduler:v1.24.0
registry.aliyuncs.com/google_containers/kube-proxy:v1.24.0
registry.aliyuncs.com/google_containers/pause:3.7
registry.aliyuncs.com/google_containers/etcd:3.5.3-0
registry.aliyuncs.com/google_containers/coredns:v1.8.6
# Once everything looks right, pull the images
$ kubeadm config images pull --config kubeadm-kubeproxy-free.conf
[config/images] Pulled registry.aliyuncs.com/google_containers/kube-apiserver:v1.24.0
[config/images] Pulled registry.aliyuncs.com/google_containers/kube-controller-manager:v1.24.0
[config/images] Pulled registry.aliyuncs.com/google_containers/kube-scheduler:v1.24.0
[config/images] Pulled registry.aliyuncs.com/google_containers/kube-proxy:v1.24.0
[config/images] Pulled registry.aliyuncs.com/google_containers/pause:3.7
[config/images] Pulled registry.aliyuncs.com/google_containers/etcd:3.5.3-0
[config/images] Pulled registry.aliyuncs.com/google_containers/coredns:v1.8.6
# Initialize the cluster, adding the flag that skips the kube-proxy installation
$ kubeadm init --config kubeadm-kubeproxy-free.conf --skip-phases=addon/kube-proxy
[init] Using Kubernetes version: v1.24.0
[preflight] Running pre-flight checks
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
...(output omitted)...
When the following output appears, the cluster has been initialized successfully.
Your Kubernetes control-plane has initialized successfully!
To start using your cluster, you need to run the following as a regular user:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
Alternatively, if you are the root user, you can run:
export KUBECONFIG=/etc/kubernetes/admin.conf
You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
https://kubernetes.io/docs/concepts/cluster-administration/addons/
Then you can join any number of worker nodes by running the following on each as root:
kubeadm join 10.31.18.1:6443 --token abcdef.0123456789abcdef \
--discovery-token-ca-cert-hash sha256:7772f5461bdf4dc399618dc226e2d718d35f14b079e904cd68a5b148eaefcbdd
4.3 Configure kubeconfig
Right after initialization we cannot yet query the cluster; kubeconfig has to be set up before kubectl can connect to the apiserver and read cluster information.
# For a non-root user
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
# For root, simply export the environment variable
export KUBECONFIG=/etc/kubernetes/admin.conf
# Enable kubectl bash completion
echo "source <(kubectl completion bash)" >> ~/.bashrc
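As mentioned earlier, kubectl does not have to be installed inside the cluster. Any machine that can reach the apiserver can have kubectl installed and kubeconfig configured as above, and can then manage the cluster from the command line.
With that in place, the usual commands show the cluster information.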
$ kubectl cluster-info
Kubernetes control plane is running at https://10.31.18.1:6443
CoreDNS is running at https://10.31.18.1:6443/api/v1/namespaces/kube-system/services/kube-dns:dns/proxy
To further debug and diagnose cluster problems, use 'kubectl cluster-info dump'.
$ kubectl get nodes -o wide
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
tiny-kubeproxy-free-master-18-1.k8s.tcinternal NotReady control-plane 2m46s v1.24.0 10.31.18.1 <none> CentOS Linux 7 (Core) 5.17.6-1.el7.elrepo.x86_64 containerd://1.6.4
$ kubectl get pods -A -o wide
NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
kube-system coredns-74586cf9b6-shpt4 0/1 Pending 0 2m42s <none> <none> <none> <none>
kube-system coredns-74586cf9b6-wgvgm 0/1 Pending 0 2m42s <none> <none> <none> <none>
kube-system etcd-tiny-kubeproxy-free-master-18-1.k8s.tcinternal 1/1 Running 0 2m56s 10.31.18.1 tiny-kubeproxy-free-master-18-1.k8s.tcinternal <none> <none>
kube-system kube-apiserver-tiny-kubeproxy-free-master-18-1.k8s.tcinternal 1/1 Running 0 2m57s 10.31.18.1 tiny-kubeproxy-free-master-18-1.k8s.tcinternal <none> <none>
kube-system kube-controller-manager-tiny-kubeproxy-free-master-18-1.k8s.tcinternal 1/1 Running 0 2m55s 10.31.18.1 tiny-kubeproxy-free-master-18-1.k8s.tcinternal <none> <none>
kube-system kube-scheduler-tiny-kubeproxy-free-master-18-1.k8s.tcinternal 1/1 Running 0 2m55s 10.31.18.1 tiny-kubeproxy-free-master-18-1.k8s.tcinternal <none> <none>
# Listing the DaemonSets at this point shows that there is no kube-proxy
$ kubectl get ds -A
No resources found
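4.4 Add the worker nodes
The remaining two nodes now need to join as workers to run workloads. Running the command printed at the end of a successful initialization on each of them is enough to join the cluster.
Because kube-proxy was not enabled when kubeadm initialized the master node, a warning appears while joining, but it does not prevent the node from being added.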
$ kubeadm join 10.31.18.1:6443 --token abcdef.0123456789abcdef \
> --discovery-token-ca-cert-hash sha256:7772f5461bdf4dc399618dc226e2d718d35f14b079e904cd68a5b148eaefcbdd
[preflight] Running pre-flight checks
[preflight] Reading configuration from the cluster...
[preflight] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -o yaml'
W0512 10:34:36.673112 7960 configset.go:78] Warning: No kubeproxy.config.k8s.io/v1alpha1 config is loaded. Continuing without it: configmaps "kube-proxy" is forbidden: User "system:bootstrap:abcdef" cannot get resource "configmaps" in API group "" in the namespace "kube-system"
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Starting the kubelet
[kubelet-start] Waiting for the kubelet to perform the TLS Bootstrap...
This node has joined the cluster:
* Certificate signing request was sent to apiserver and a response was received.
* The Kubelet was informed of the new secure connection details.
Run 'kubectl get nodes' on the control-plane to see this node join the cluster.
If the output of the successful initialization was not saved, that is not a problem: kubeadm can list existing tokens or create new ones.
# List the existing tokens
$ kubeadm token list
TOKEN TTL EXPIRES USAGES DESCRIPTION EXTRA GROUPS
abcdef.0123456789abcdef 23h 2022-05-13T02:28:58Z authentication,signing <none> system:bootstrappers:kubeadm:default-node-token
# If the token has expired, create a new one
$ kubeadm token create
ri4jzg.wkn47l10cjvefep5
$ kubeadm token list
TOKEN TTL EXPIRES USAGES DESCRIPTION EXTRA GROUPS
abcdef.0123456789abcdef 23h 2022-05-13T02:28:58Z authentication,signing <none> system:bootstrappers:kubeadm:default-node-token
ri4jzg.wkn47l10cjvefep5 23h 2022-05-13T02:40:15Z authentication,signing <none> system:bootstrappers:kubeadm:default-node-token
# If the --discovery-token-ca-cert-hash value is lost, it can be recomputed on the master node with openssl
$ openssl x509 -pubkey -in /etc/kubernetes/pki/ca.crt | openssl rsa -pubin -outform der 2>/dev/null | openssl dgst -sha256 -hex | sed 's/^.* //'
7772f5461bdf4dc399618dc226e2d718d35f14b079e904cd68a5b148eaefcbdd
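Once a token and the CA certificate hash are known, the join command can be reassembled from them. The helper below is only an illustration of how the pieces fit together (build_join_command is not a kubeadm feature); on the control-plane node, `kubeadm token create --print-join-command` prints a ready-made join command with a fresh token and is usually the simpler route.

```shell
# build_join_command ENDPOINT TOKEN CA_CERT_HASH
# Print the kubeadm join command for a worker node. CA_CERT_HASH is the bare
# hex digest; the "sha256:" prefix is added here.
build_join_command() {
    printf 'kubeadm join %s --token %s --discovery-token-ca-cert-hash sha256:%s\n' \
        "$1" "$2" "$3"
}
```

Called with 10.31.18.1:6443, a token from `kubeadm token list`, and the digest printed by the openssl pipeline, it reproduces the command shown at the end of the initialization output.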
After joining, the cluster shows two more nodes, but every node is still NotReady. The next step is to deploy the CNI.
$ kubectl get nodes
NAME STATUS ROLES AGE VERSION
tiny-kubeproxy-free-master-18-1.k8s.tcinternal NotReady control-plane 11m v1.24.0
tiny-kubeproxy-free-worker-18-11.k8s.tcinternal NotReady <none> 5m57s v1.24.0
tiny-kubeproxy-free-worker-18-12.k8s.tcinternal NotReady <none> 65s v1.24.0
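When scripting the deployment, it helps to extract the nodes that are not yet Ready instead of reading the table. The function below is a small sketch that filters the default `kubectl get nodes` output; not_ready_nodes is a made-up name, and any status other than exactly "Ready" (including "Ready,SchedulingDisabled") is reported.

```shell
# not_ready_nodes: read `kubectl get nodes` output on stdin and print the
# name of every node whose STATUS column is not exactly "Ready".
not_ready_nodes() {
    awk 'NR > 1 && $2 != "Ready" { print $1 }'
}
```

A typical use is `kubectl get nodes | not_ready_nodes`, which prints nothing once the CNI is running and all nodes have become Ready.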
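5. Install the CNI
5.1 Deploy Helm 3
Cilium is deployed with Helm 3, so Helm 3 has to be installed first.
Installing Helm 3 is very simple: download the binary that matches your system from GitHub, extract it, and put it in a directory on the PATH.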
$ wget https://get.helm.sh/helm-v3.8.2-linux-amd64.tar.gz
$ tar -zxvf helm-v3.8.2-linux-amd64.tar.gz
$ cp -rp linux-amd64/helm /usr/local/bin/
$ helm version
version.BuildInfo{Version:"v3.8.2", GitCommit:"6e3701edea09e5d55a8ca2aae03a68917630e91b", GitTreeState:"clean", GoVersion:"go1.17.5"}
5.2 Deploy Cilium
See the official documentation for the complete deployment guide. First, add the Helm repo.
$ helm repo add cilium https://helm.cilium.io/
"cilium" has been added to your repositories
$ helm repo list
NAME URL
cilium https://helm.cilium.io/
Following the official documentation, the IP and port of the cluster's API server must be specified:
helm install cilium ./cilium \
--namespace kube-system \
--set kubeProxyReplacement=strict \
--set k8sServiceHost=REPLACE_WITH_API_SERVER_IP \
--set k8sServicePort=REPLACE_WITH_API_SERVER_PORT
However, Cilium's default pod CIDR is 10.0.0.0/8, which may well conflict with networks inside our environment. The best approach is to specify the pod CIDR at install time; see the official article on configuring it.
helm install cilium cilium/cilium --version 1.11.4 \
--namespace kube-system \
--set kubeProxyReplacement=strict \
--set k8sServiceHost=REPLACE_WITH_API_SERVER_IP \
--set k8sServicePort=REPLACE_WITH_API_SERVER_PORT \
--set ipam.operator.clusterPoolIPv4PodCIDRList=<IPv4CIDR> \
--set ipam.operator.clusterPoolIPv4MaskSize=<IPv4MaskSize>
This gives our final install parameters:
helm install cilium cilium/cilium --version 1.11.4 \
--namespace kube-system \
--set kubeProxyReplacement=strict \
--set k8sServiceHost=10.31.18.1 \
--set k8sServicePort=6443 \
--set ipam.operator.clusterPoolIPv4PodCIDRList=10.18.64.0/18 \
--set ipam.operator.clusterPoolIPv4MaskSize=24
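The two ipam.operator values together determine how many nodes the pool can serve and how many pod addresses each node gets. The arithmetic is sketched below; the function names are illustrative only, and the "minus two" follows the 254 addresses that Cilium reports per /24 in its status output.

```shell
# node_pool_count CLUSTER_PREFIX NODE_MASK
# Number of per-node pod CIDRs of size /NODE_MASK that fit into a cluster
# pool of size /CLUSTER_PREFIX.
node_pool_count() {
    echo $(( 1 << ($2 - $1) ))
}

# pod_ips_per_node NODE_MASK
# Addresses in one per-node CIDR, minus the network and broadcast addresses.
pod_ips_per_node() {
    echo $(( (1 << (32 - $1)) - 2 ))
}
```

For a 10.18.64.0/18 pool split into /24 blocks, `node_pool_count 18 24` prints 64 and `pod_ips_per_node 24` prints 254, so this layout supports up to 64 nodes with 254 pod addresses each.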
Then run the installation:
$ helm install cilium cilium/cilium --version 1.11.4 --namespace kube-system --set kubeProxyReplacement=strict --set k8sServiceHost=10.31.18.1 --set k8sServicePort=6443 --set ipam.operator.clusterPoolIPv4PodCIDRList=10.18.64.0/18 --set ipam.operator.clusterPoolIPv4MaskSize=24
W0512 11:03:06.636996 8753 warnings.go:70] spec.template.spec.affinity.nodeAffinity.requiredDuringSchedulingIgnoredDuringExecution.nodeSelectorTerms[1].matchExpressions[0].key: beta.kubernetes.io/os is deprecated since v1.14; use "kubernetes.io/os" instead
W0512 11:03:06.637058 8753 warnings.go:70] spec.template.metadata.annotations[scheduler.alpha.kubernetes.io/critical-pod]: non-functional in v1.16+; use the "priorityClassName" field instead
NAME: cilium
LAST DEPLOYED: Thu May 12 11:03:04 2022
NAMESPACE: kube-system
STATUS: deployed
REVISION: 1
TEST SUITE: None
NOTES:
You have successfully installed Cilium with Hubble.
Your release version is 1.11.4.
For any further help, visit https://docs.cilium.io/en/v1.11/gettinghelp
Now check the cluster's DaemonSets and Deployments again:
# The Cilium components are now running normally
$ kubectl get ds -A
NAMESPACE NAME DESIRED CURRENT READY UP-TO-DATE AVAILABLE NODE SELECTOR AGE
kube-system cilium 3 3 3 3 3 <none> 4m57s
$ kubectl get deploy -A
NAMESPACE NAME READY UP-TO-DATE AVAILABLE AGE
kube-system cilium-operator 2/2 2 2 5m4s
kube-system coredns 2/2 2 2 39m
Listing all pods again shows them all Running, with pod IPs taken from the range we specified at install time, so the install parameters took effect.
# All pods are Running and IPs are allocated as expected
$ kubectl get pods -A -o wide
NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
kube-system cilium-97fn7 1/1 Running 0 7m14s 10.31.18.11 tiny-kubeproxy-free-worker-18-11.k8s.tcinternal <none> <none>
kube-system cilium-k2gxc 1/1 Running 0 7m14s 10.31.18.12 tiny-kubeproxy-free-worker-18-12.k8s.tcinternal <none> <none>
kube-system cilium-operator-86884f4747-c2ps5 1/1 Running 0 7m14s 10.31.18.12 tiny-kubeproxy-free-worker-18-12.k8s.tcinternal <none> <none>
kube-system cilium-operator-86884f4747-zrm4m 1/1 Running 0 7m14s 10.31.18.11 tiny-kubeproxy-free-worker-18-11.k8s.tcinternal <none> <none>
kube-system cilium-t69js 1/1 Running 0 7m14s 10.31.18.1 tiny-kubeproxy-free-master-18-1.k8s.tcinternal <none> <none>
kube-system coredns-74586cf9b6-shpt4 1/1 Running 0 41m 10.18.65.64 tiny-kubeproxy-free-master-18-1.k8s.tcinternal <none> <none>
kube-system coredns-74586cf9b6-wgvgm 1/1 Running 0 41m 10.18.65.237 tiny-kubeproxy-free-master-18-1.k8s.tcinternal <none> <none>
kube-system etcd-tiny-kubeproxy-free-master-18-1.k8s.tcinternal 1/1 Running 0 41m 10.31.18.1 tiny-kubeproxy-free-master-18-1.k8s.tcinternal <none> <none>
kube-system kube-apiserver-tiny-kubeproxy-free-master-18-1.k8s.tcinternal 1/1 Running 0 41m 10.31.18.1 tiny-kubeproxy-free-master-18-1.k8s.tcinternal <none> <none>
kube-system kube-controller-manager-tiny-kubeproxy-free-master-18-1.k8s.tcinternal 1/1 Running 0 41m 10.31.18.1 tiny-kubeproxy-free-master-18-1.k8s.tcinternal <none> <none>
kube-system kube-scheduler-tiny-kubeproxy-free-master-18-1.k8s.tcinternal 1/1 Running 0 41m 10.31.18.1 tiny-kubeproxy-free-master-18-1.k8s.tcinternal <none> <none>
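To check the address ranges mechanically rather than by eye, here is a minimal sketch. It assumes the podSubnet 10.18.64.0/18 from the cluster information table. The helper names ip_in_cidr and outside_cidr are hypothetical and are not part of kubectl or cilium.

```shell
#!/bin/sh
# ip_in_cidr IP CIDR: return 0 if the IPv4 address IP lies inside CIDR, else 1.
ip_in_cidr() {
    awk -v ip="$1" -v cidr="$2" 'BEGIN {
        split(cidr, c, "/")
        n = split(ip, a, "."); m = split(c[1], b, ".")
        if (n != 4 || m != 4) exit 1
        ipn  = ((a[1] * 256 + a[2]) * 256 + a[3]) * 256 + a[4]
        netn = ((b[1] * 256 + b[2]) * 256 + b[3]) * 256 + b[4]
        size = 2 ^ (32 - c[2])   # number of addresses in the block
        # Both addresses are in the same block if they round down
        # to the same multiple of the block size.
        if (int(ipn / size) == int(netn / size)) exit 0
        exit 1
    }' </dev/null
}

# outside_cidr CIDR: read one IP per line on stdin, print those outside CIDR.
outside_cidr() {
    while read -r ip; do
        [ -n "$ip" ] || continue
        ip_in_cidr "$ip" "$1" || echo "$ip"
    done
}

# Example against a live cluster (not run here):
# kubectl get pods -A -o custom-columns=IP:.status.podIP --no-headers \
#     | outside_cidr 10.18.64.0/18
```

Pods that use the host network (cilium, etcd, kube-apiserver and so on) report the node IP, so they are expected to show up in the "outside" list.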
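Now run cilium status inside one of the cilium pods to check the agent:
# The --verbose flag prints detailed status information
# Replace cilium-97fn7 with the name of any cilium pod in your cluster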
$ kubectl exec -it -n kube-system cilium-97fn7 -- cilium status --verbose
Defaulted container "cilium-agent" out of: cilium-agent, mount-cgroup (init), clean-cilium-state (init)
KVStore: Ok Disabled
Kubernetes: Ok 1.24 (v1.24.0) [linux/amd64]
Kubernetes APIs: ["cilium/v2::CiliumClusterwideNetworkPolicy", "cilium/v2::CiliumEndpoint", "cilium/v2::CiliumNetworkPolicy", "cilium/v2::CiliumNode", "core/v1::Namespace", "core/v1::Node", "core/v1::Pods", "core/v1::Service", "discovery/v1::EndpointSlice", "networking.k8s.io/v1::NetworkPolicy"]
KubeProxyReplacement: Strict [eth0 10.31.18.11 (Direct Routing)]
Host firewall: Disabled
Cilium: Ok 1.11.4 (v1.11.4-9d25463)
NodeMonitor: Listening for events on 8 CPUs with 64x4096 of shared memory
Cilium health daemon: Ok
IPAM: IPv4: 2/254 allocated from 10.18.66.0/24,
Allocated addresses:
10.18.66.223 (health)
10.18.66.232 (router)
BandwidthManager: Disabled
Host Routing: Legacy
Masquerading: IPTables [IPv4: Enabled, IPv6: Disabled]
Clock Source for BPF: ktime
Controller Status: 21/21 healthy
Name Last success Last error Count Message
bpf-map-sync-cilium_ipcache 3s ago 8m59s ago 0 no error
cilium-health-ep 41s ago never 0 no error
dns-garbage-collector-job 59s ago never 0 no error
endpoint-2503-regeneration-recovery never never 0 no error
endpoint-82-regeneration-recovery never never 0 no error
endpoint-gc 3m59s ago never 0 no error
ipcache-inject-labels 8m49s ago 8m53s ago 0 no error
k8s-heartbeat 29s ago never 0 no error
mark-k8s-node-as-available 8m41s ago never 0 no error
metricsmap-bpf-prom-sync 4s ago never 0 no error
resolve-identity-2503 3m41s ago never 0 no error
resolve-identity-82 3m42s ago never 0 no error
sync-endpoints-and-host-ips 42s ago never 0 no error
sync-lb-maps-with-k8s-services 8m42s ago never 0 no error
sync-node-with-ciliumnode (tiny-kubeproxy-free-worker-18-11.k8s.tcinternal) 8m53s ago 8m55s ago 0 no error
sync-policymap-2503 33s ago never 0 no error
sync-policymap-82 30s ago never 0 no error
sync-to-k8s-ciliumendpoint (2503) 11s ago never 0 no error
sync-to-k8s-ciliumendpoint (82) 2s ago never 0 no error
template-dir-watcher never never 0 no error
update-k8s-node-annotations 8m53s ago never 0 no error
Proxy Status: OK, ip 10.18.66.232, 0 redirects active on ports 10000-20000
Hubble: Ok Current/Max Flows: 422/4095 (10.31%), Flows/s: 0.75 Metrics: Disabled
KubeProxyReplacement Details:
Status: Strict
Socket LB Protocols: TCP, UDP
Devices: eth0 10.31.18.11 (Direct Routing)
Mode: SNAT
Backend Selection: Random
Session Affinity: Enabled
Graceful Termination: Enabled
XDP Acceleration: Disabled
Services:
- ClusterIP: Enabled
- NodePort: Enabled (Range: 30000-32767)
- LoadBalancer: Enabled
- externalIPs: Enabled
- HostPort: Enabled
BPF Maps: dynamic sizing: on (ratio: 0.002500)
Name Size
Non-TCP connection tracking 65536
TCP connection tracking 131072
Endpoint policy 65535
Events 8
IP cache 512000
IP masquerading agent 16384
IPv4 fragmentation 8192
IPv4 service 65536
IPv6 service 65536
IPv4 service backend 65536
IPv6 service backend 65536
IPv4 service reverse NAT 65536
IPv6 service reverse NAT 65536
Metrics 1024
NAT 131072
Neighbor table 131072
Global policy 16384
Per endpoint policy 65536
Session affinity 65536
Signal 8
Sockmap 65535
Sock reverse NAT 65536
Tunnel 65536
Encryption: Disabled
Cluster health: 3/3 reachable (2022-05-12T03:12:22Z)
Name IP Node Endpoints
tiny-kubeproxy-free-worker-18-11.k8s.tcinternal (localhost) 10.31.18.11 reachable reachable
tiny-kubeproxy-free-master-18-1.k8s.tcinternal 10.31.18.1 reachable reachable
tiny-kubeproxy-free-worker-18-12.k8s.tcinternal 10.31.18.12 reachable reachable
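Since this cluster runs without kube-proxy, the line worth checking first in this output is KubeProxyReplacement. The sketch below extracts that mode from the status text. kpr_mode is a hypothetical helper name, and it assumes the line format shown in the output above.

```shell
#!/bin/sh
# kpr_mode: read `cilium status` output on stdin and print the
# kube-proxy replacement mode (for example "Strict").
kpr_mode() {
    # The trailing colon in the pattern skips the later
    # "KubeProxyReplacement Details:" heading.
    awk '/^KubeProxyReplacement:/ { print $2; exit }'
}

# Example against a live cluster (not run here):
# kubectl exec -n kube-system ds/cilium -- cilium status | kpr_mode
```

A value of Strict here matches the output above, where ClusterIP, NodePort, LoadBalancer, externalIPs and HostPort are all listed as Enabled.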
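At this point the Cilium deployment is essentially complete: the cluster's CNI is healthy and other workloads can run normally.
5.3 Deploying Hubble
Another major strength of Cilium is observability, which is considerably better than what other CNIs offer. To take advantage of it, we need to install Hubble in the k8s cluster. Hubble also provides a UI that makes network traffic inside the cluster easier to observe, so we install hubble-ui here as well.
Installing Hubble with helm3
We continue with the helm3 release from above. Because Cilium is already installed, we use helm upgrade to enable Hubble, and pass --reuse-values to keep the parameters from the previous installation.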
helm upgrade cilium cilium/cilium --version 1.11.4 \
--namespace kube-system \
--reuse-values \
--set hubble.relay.enabled=true \
--set hubble.ui.enabled=true
Then run the upgrade:
$ helm upgrade cilium cilium/cilium --version 1.11.4 \
> --namespace kube-system \
> --reuse-values \
> --set hubble.relay.enabled=true \
> --set hubble.ui.enabled=true
Release "cilium" has been upgraded. Happy Helming!
NAME: cilium
LAST DEPLOYED: Thu May 12 11:34:43 2022
NAMESPACE: kube-system
STATUS: deployed
REVISION: 2
TEST SUITE: None
NOTES:
You have successfully installed Cilium with Hubble Relay and Hubble UI.
Your release version is 1.11.4.
For any further help, visit https://docs.cilium.io/en/v1.11/gettinghelp
Then check the cluster state. The corresponding pods, deployments and services are all working normally:
$ kubectl get pod -A | grep hubble
kube-system hubble-relay-cdf4c8cdd-wgdqg 1/1 Running 0 66s
kube-system hubble-ui-86856f9f6c-vw8lt 3/3 Running 0 66s
$ kubectl get deploy -A | grep hubble
kube-system hubble-relay 1/1 1 1 74s
kube-system hubble-ui 1/1 1 1 74s
$ kubectl get svc -A | grep hubble
kube-system hubble-relay ClusterIP 10.18.58.2 <none> 80/TCP 82s
kube-system hubble-ui ClusterIP 10.18.22.156 <none> 80/TCP 82s
Installing Hubble with cilium-cli
Installing Hubble with cilium-cli is also straightforward:
# First install the cilium-cli tool
# cilium-cli is distributed as a single binary executable
$ curl -L --remote-name-all https://github.com/cilium/cilium-cli/releases/latest/download/cilium-linux-amd64.tar.gz{,.sha256sum}
$ sha256sum --check cilium-linux-amd64.tar.gz.sha256sum
cilium-linux-amd64.tar.gz: OK
$ sudo tar xzvfC cilium-linux-amd64.tar.gz /usr/local/bin
cilium
# Then enable Hubble
$ cilium hubble enable
# Then enable hubble-ui
$ cilium hubble enable --ui
# Finally check the cilium status
$ cilium status
    /¯¯\
 /¯¯\__/¯¯\    Cilium:         OK
 \__/¯¯\__/    Operator:       OK
 /¯¯\__/¯¯\    Hubble:         OK
 \__/¯¯\__/    ClusterMesh:    disabled
    \__/
Deployment cilium-operator Desired: 2, Ready: 2/2, Available: 2/2
Deployment hubble-relay Desired: 1, Ready: 1/1, Available: 1/1
Deployment hubble-ui Desired: 1, Ready: 1/1, Available: 1/1
DaemonSet cilium Desired: 3, Ready: 3/3, Available: 3/3
Containers: cilium Running: 3
cilium-operator Running: 2
hubble-relay Running: 1
hubble-ui Running: 1
Cluster Pods: 4/4 managed by Cilium
Image versions hubble-relay quay.io/cilium/hubble-relay:v1.11.4@sha256:460d50bd0c6bcdfa3c62b0488541c102a4079f5def07d2649ff67bc24fd0dd3f: 1
hubble-ui quay.io/cilium/hubble-ui:v0.8.5@sha256:4eaca1ec1741043cfba6066a165b3bf251590cf4ac66371c4f63fbed2224ebb4: 1
hubble-ui quay.io/cilium/hubble-ui-backend:v0.8.5@sha256:2bce50cf6c32719d072706f7ceccad654bfa907b2745a496da99610776fe31ed: 1
hubble-ui docker.io/envoyproxy/envoy:v1.18.4@sha256:e5c2bb2870d0e59ce917a5100311813b4ede96ce4eb0c6bfa879e3fbe3e83935: 1
cilium quay.io/cilium/cilium:v1.11.4@sha256:d9d4c7759175db31aa32eaa68274bb9355d468fbc87e23123c80052e3ed63116: 3
cilium-operator quay.io/cilium/operator-generic:v1.11.4@sha256:bf75ad0dc47691a3a519b8ab148ed3a792ffa2f1e309e6efa955f30a40e95adc: 2
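Installing the Hubble client
Like Cilium, Hubble also provides a command-line client for us to work with.
# First install the Hubble client; the procedure is almost identical to installing cilium-cli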
$ export HUBBLE_VERSION=$(curl -s https://raw.githubusercontent.com/cilium/hubble/master/stable.txt)
$ curl -L --remote-name-all https://github.com/cilium/hubble/releases/download/$HUBBLE_VERSION/hubble-linux-amd64.tar.gz{,.sha256sum}
$ sha256sum --check hubble-linux-amd64.tar.gz.sha256sum
$ sudo tar xzvfC hubble-linux-amd64.tar.gz /usr/local/bin
$ rm hubble-linux-amd64.tar.gz{,.sha256sum}
Next we need to expose the Hubble API. Here we use kubectl port-forward to forward port 80 of the hubble-relay service to local port 4245:
# Listen on IPv4 only
$ kubectl port-forward -n kube-system svc/hubble-relay --address 0.0.0.0 4245:80 &
# Listen on both IPv6 and IPv4
$ kubectl port-forward -n kube-system svc/hubble-relay --address 0.0.0.0 --address :: 4245:80 &
If Hubble was installed with cilium-cli, the API port can also be exposed with the cilium command. Note that this command listens on both IPv6 and IPv4 by default, so it reports an error if the host does not support IPv6.
$ cilium hubble port-forward&
Once the API port is exposed, we can check that the Hubble client works:
$ hubble status
Handling connection for 4245
Healthcheck (via localhost:4245): Ok
Current/Max Flows: 10,903/12,285 (88.75%)
Flows/s: 5.98
Connected Nodes: 3/3
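The maximum of 12,285 flows is consistent with three connected nodes that each keep a buffer of 4,095 flows, which is the per-node value shown in the earlier cilium status output (3 × 4095 = 12285). As a small sketch, the usage figure can be recomputed from the hubble status text. flow_usage is a hypothetical helper name, and it assumes the "Current/Max Flows" line format shown above.

```shell
#!/bin/sh
# flow_usage: read `hubble status` output on stdin and print
# "<current> <max> <percent>" for the flow buffer.
flow_usage() {
    awk -F'[:/(]' '/Current\/Max Flows/ {
        cur = $3; max = $4
        # Drop spaces and thousands separators before doing arithmetic
        gsub(/[ ,]/, "", cur); gsub(/[ ,]/, "", max)
        printf "%d %d %.2f\n", cur, max, cur * 100 / max
        exit
    }'
}

# Example against a live cluster (not run here):
# hubble status | flow_usage
```

For the output above this prints 10903 12285 88.75, which agrees with the percentage reported by hubble itself.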
Exposing hubble-ui
The official documentation uses the cilium tool to forward hubble-ui to port 12000 on the host:
# Forward port 80 of the hubble-ui service to port 12000 on the host
$ cilium hubble ui&
[2] 5809
ℹ️ Opening "http://localhost:12000" in your browser...
This is equivalent to the following commands:
# Listen on both IPv6 and IPv4
# kubectl port-forward -n kube-system svc/hubble-ui --address 0.0.0.0 --address :: 12000:80
# Listen on IPv4 only
# kubectl port-forward -n kube-system svc/hubble-ui --address 0.0.0.0 12000:80
Here we expose hubble-ui through a NodePort instead. First look at the current configuration of the hubble-ui service:
$ kubectl get svc -n kube-system hubble-ui -o yaml
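... (output omitted) ...
  - name: http
    port: 80
    protocol: TCP
    targetPort: 8081
  selector:
    k8s-app: hubble-ui
  sessionAffinity: None
  type: ClusterIP
... (output omitted) ...
We change the default ClusterIP type to NodePort and set nodePort: 30081 (for example with kubectl edit svc -n kube-system hubble-ui). Afterwards the service looks like this: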
$ kubectl get svc -n kube-system hubble-ui -o yaml
... (output omitted) ...
  ports:
  - name: http
    nodePort: 30081
    port: 80
    protocol: TCP
    targetPort: 8081
  selector:
    k8s-app: hubble-ui
  sessionAffinity: None
  type: NodePort
... (output omitted) ...
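The same change can also be made without an interactive editor. The sketch below builds a patch for kubectl patch, relying on the default strategic merge patch, which merges Service ports by their port number. The function names nodeport_patch and expose_hubble_ui are hypothetical, and the values 80 and 30081 are the ones used in this article.

```shell
#!/bin/sh
# nodeport_patch PORT NODEPORT: print a patch that switches a Service
# to type NodePort and pins the nodePort of the entry with the given port.
nodeport_patch() {
    printf '{"spec":{"type":"NodePort","ports":[{"port":%d,"nodePort":%d}]}}\n' "$1" "$2"
}

# expose_hubble_ui: apply the patch to the hubble-ui service.
# Defined only; call it manually on a machine with cluster access.
expose_hubble_ui() {
    kubectl patch svc hubble-ui -n kube-system -p "$(nodeport_patch 80 30081)"
}
```

Printing the patch with nodeport_patch 80 30081 before applying it makes it easy to review what will be sent to the API server.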
Compare the service before and after the change:
# Before: ClusterIP
$ kubectl get svc -A | grep hubble-ui
kube-system hubble-ui ClusterIP 10.18.22.156 <none> 80/TCP 82s
# After: NodePort
$ kubectl get svc -A | grep hubble-ui
kube-system hubble-ui NodePort 10.18.22.156 <none> 80:30081/TCP 47m
Now we can open http://10.31.18.1:30081 in a browser to see the Hubble UI.
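6. Deploying a test workload
With the cluster deployed, we deploy nginx in the k8s cluster to test that everything works. We first create a namespace named nginx-quic, then create a deployment named nginx-quic-deployment in that namespace to run the pods, and finally create a service to expose them. For ease of testing, we expose the port through a NodePort for now.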
$ cat nginx-quic.yaml
apiVersion: v1
kind: Namespace
metadata:
  name: nginx-quic
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-quic-deployment
  namespace: nginx-quic
spec:
  selector:
    matchLabels:
      app: nginx-quic
  replicas: 4
  template:
    metadata:
      labels:
        app: nginx-quic
    spec:
      containers:
      - name: nginx-quic
        image: tinychen777/nginx-quic:latest
        imagePullPolicy: IfNotPresent
        ports:
        - containerPort: 80
---
apiVersion: v1
kind: Service
metadata:
  name: nginx-quic-service
  namespace: nginx-quic
spec:
  externalTrafficPolicy: Cluster
  selector:
    app: nginx-quic
  ports:
  - protocol: TCP
    port: 8080 # match for service access port
    targetPort: 80 # match for pod access port
    nodePort: 30088 # match for external access port
  type: NodePort
Apply the manifest, then check the status of each resource:
# Apply the manifest
$ kubectl apply -f nginx-quic.yaml
namespace/nginx-quic created
deployment.apps/nginx-quic-deployment created
service/nginx-quic-service created
# Check the deployment
$ kubectl get deployment -o wide -n nginx-quic
NAME READY UP-TO-DATE AVAILABLE AGE CONTAINERS IMAGES SELECTOR
nginx-quic-deployment 4/4 4 4 2m49s nginx-quic tinychen777/nginx-quic:latest app=nginx-quic
# Check the service
$ kubectl get service -o wide -n nginx-quic
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE SELECTOR
nginx-quic-service NodePort 10.18.54.119 <none> 8080:30088/TCP 3m app=nginx-quic
# Check the pods
$ kubectl get pods -o wide -n nginx-quic
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
nginx-quic-deployment-5d9b4fbb47-4gc6g 1/1 Running 0 3m10s 10.18.66.66 tiny-kubeproxy-free-worker-18-11.k8s.tcinternal <none> <none>
nginx-quic-deployment-5d9b4fbb47-4j5p6 1/1 Running 0 3m10s 10.18.64.254 tiny-kubeproxy-free-worker-18-12.k8s.tcinternal <none> <none>
nginx-quic-deployment-5d9b4fbb47-8gg9j 1/1 Running 0 3m10s 10.18.66.231 tiny-kubeproxy-free-worker-18-11.k8s.tcinternal <none> <none>
nginx-quic-deployment-5d9b4fbb47-9bv2t 1/1 Running 0 3m10s 10.18.64.5 tiny-kubeproxy-free-worker-18-12.k8s.tcinternal <none> <none>
# Check the IPVS rules
# Because we use Cilium's kube-proxy-free mode, there are no IPVS rules in the Linux network stack
$ ipvsadm -ln
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
-> RemoteAddress:Port Forward Weight ActiveConn InActConn
# Check the service list inside cilium
$ kubectl exec -it -n kube-system cilium-97fn7 -- cilium service list
Defaulted container "cilium-agent" out of: cilium-agent, mount-cgroup (init), clean-cilium-state (init)
ID Frontend Service Type Backend
1 10.18.0.1:443 ClusterIP 1 => 10.31.18.1:6443
2 10.18.0.10:9153 ClusterIP 1 => 10.18.65.237:9153
2 => 10.18.65.64:9153
3 10.18.0.10:53 ClusterIP 1 => 10.18.65.237:53
2 => 10.18.65.64:53
4 10.18.22.156:80 ClusterIP 1 => 10.18.64.53:8081
5 10.18.58.2:80 ClusterIP 1 => 10.18.66.189:4245
6 10.31.18.11:30081 NodePort 1 => 10.18.64.53:8081
7 0.0.0.0:30081 NodePort 1 => 10.18.64.53:8081
8 10.18.54.119:8080 ClusterIP 1 => 10.18.64.254:80
2 => 10.18.66.66:80
3 => 10.18.64.5:80
4 => 10.18.66.231:80
9 10.31.18.11:30088 NodePort 1 => 10.18.64.254:80
2 => 10.18.66.66:80
3 => 10.18.64.5:80
4 => 10.18.66.231:80
10 0.0.0.0:30088 NodePort 1 => 10.18.64.254:80
2 => 10.18.66.66:80
3 => 10.18.64.5:80
4 => 10.18.66.231:80
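Finally we run the tests. By default, this nginx-quic image returns the client IP and port of the request as seen from inside the nginx container.
# First test from inside the cluster
# Access a pod directly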
$ curl 10.18.64.254:80
10.18.65.204:60032
# Access the service's ClusterIP; the request is forwarded to a pod
$ curl 10.18.54.119:8080
10.18.65.204:38774
# Access the NodePort; the request is forwarded to a pod without going through the ClusterIP
# The IP returned depends on whether the backend pod that receives the request runs on the current k8s node
$ curl 10.31.18.1:30088
10.18.65.204:51254
$ curl 10.31.18.11:30088
10.18.65.204:38784
$ curl 10.31.18.12:30088
10.18.65.204:60048
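# Next test from outside the cluster
# Access the NodePort on each of the three nodes; the request is forwarded to a pod without going through the ClusterIP
# The IP returned depends on whether the backend pod that receives the request runs on the current k8s node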
$ curl 10.31.18.1:30088
10.18.65.204:43586
$ curl 10.31.18.11:30088
10.18.66.232:63415
$ curl 10.31.18.11:30088
10.31.100.100:12192
$ curl 10.31.18.12:30088
10.18.64.152:40782
$ curl 10.31.18.12:30088
10.31.100.100:12178
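In the external tests above, 10.18.66.232 is the router address of worker-18-11 from the earlier cilium status output, while 10.31.100.100 is presumably the address of the external client itself. To see how often each source address appears, repeated requests can be tallied. The following is a minimal sketch: tally_sources is a hypothetical helper name, and it assumes each response is a single ip:port line as returned by this nginx-quic image.

```shell
#!/bin/sh
# tally_sources: read "ip:port" lines on stdin and print "<ip> <count>"
# for every distinct source IP, sorted by IP string.
tally_sources() {
    # NF skips blank lines; the part before the first colon is the IP
    awk -F: 'NF { count[$1]++ } END { for (ip in count) print ip, count[ip] }' \
        | LC_ALL=C sort
}

# Example against a live cluster (not run here):
# for i in $(seq 20); do curl -s 10.31.18.11:30088; echo; done | tally_sources
```

With the service in SNAT mode and externalTrafficPolicy: Cluster as configured above, a mix of node-side addresses and the client address in the tally is consistent with requests landing on both local and remote backends.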