Kubernetes (K8S) installation commands

Installation guide notes

Container runtime installation

Only one of the following runtimes needs to be installed.

Runtime                             Unix domain socket
containerd                          unix:///var/run/containerd/containerd.sock
CRI-O                               unix:///var/run/crio/crio.sock
Docker Engine (via cri-dockerd)     unix:///var/run/cri-dockerd.sock
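
To confirm that the chosen runtime is up and its socket is reachable, crictl can be pointed at the endpoint (containerd shown here; substitute the socket from the table above):

crictl --runtime-endpoint unix:///var/run/containerd/containerd.sock version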

Install the CNI plugins

# Build the reference CNI plugins and install them into the default plugin directory
git clone https://github.com/containernetworking/plugins
cd plugins
git checkout v1.1.1
./build_linux.sh
sudo mkdir -p /opt/cni/bin
sudo cp bin/* /opt/cni/bin/

Install containerd

https://github.com/containerd/containerd/blob/main/docs/getting-started.md
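
A minimal sketch of a binary install following that guide; the version numbers below are only examples and should be replaced with current releases:

# containerd itself
wget https://github.com/containerd/containerd/releases/download/v1.6.8/containerd-1.6.8-linux-amd64.tar.gz
sudo tar Cxzvf /usr/local containerd-1.6.8-linux-amd64.tar.gz

# runc, which containerd depends on
wget https://github.com/opencontainers/runc/releases/download/v1.1.4/runc.amd64
sudo install -m 755 runc.amd64 /usr/local/sbin/runc

# systemd unit, then start the service
sudo curl -o /etc/systemd/system/containerd.service https://raw.githubusercontent.com/containerd/containerd/main/containerd.service
sudo systemctl daemon-reload && sudo systemctl enable --now containerd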

Install Docker and cri-dockerd

First install docker-ce

yum install docker-ce
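
If the docker-ce repository has not been configured yet, it may need to be added first (the standard Docker CE repo for CentOS is assumed here), and the service enabled after installation:

yum install -y yum-utils
yum-config-manager --add-repo https://download.docker.com/linux/centos/docker-ce.repo
systemctl enable --now docker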

Then add cri-dockerd on top of it

# a proxy may be required to reach GitHub
curl -OL https://github.com/Mirantis/cri-dockerd/releases/download/v0.2.5/cri-dockerd-0.2.5-3.el7.x86_64.rpm
rpm -iv cri-dockerd-0.2.5-3.el7.x86_64.rpm

For CRI-O, edit the configuration /etc/crio/crio.conf.rpmsave and replace the pause image:

#pause_image = "k8s.gcr.io/pause:3.2"
pause_image = "registry.aliyuncs.com/google_containers/pause:3.2"

Edit /usr/lib/systemd/system/cri-docker.service:

ExecStart=/usr/bin/cri-dockerd --pod-infra-container-image=registry.aliyuncs.com/google_containers/pause:3.7 --container-runtime-endpoint fd://
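
After editing the unit file, reload systemd and restart the service (the rpm should also ship a cri-docker.socket unit; enable it if it is not active):

systemctl daemon-reload
systemctl enable --now cri-docker.socket
systemctl restart cri-docker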

kubeadm init

It is best to pin the Kubernetes version explicitly; otherwise kubeadm defaults to the latest version.

kubeadm init  --kubernetes-version=v1.25.0 \
  --pod-network-cidr=10.244.0.0/16 \
  --apiserver-advertise-address=172.17.102.164 \
  --cri-socket unix:///run/cri-dockerd.sock \
  --image-repository registry.aliyuncs.com/google_containers \
  -v5
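
After a successful init, kubeadm prints the join command for worker nodes. A typical follow-up is to configure kubectl for the current user and install a pod network whose CIDR matches the 10.244.0.0/16 above, e.g. flannel (manifest URL assumed from the flannel-io project):

mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
kubectl apply -f https://raw.githubusercontent.com/flannel-io/flannel/master/Documentation/kube-flannel.yml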

Node join

kubeadm join docker.hogwarts.ceshiren.com:6443 \
  --token 5frn49.2a90v8z0b84hna90 \
  --discovery-token-ca-cert-hash sha256:f89f5116ebd530e5d7248ce4e26f455ec5dafe37faf3e4914e4add416ed5417f \
  --cri-socket  unix:///var/run/cri-dockerd.sock -v5
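
If the original join command has been lost, a fresh one can be printed on the control-plane node:

kubeadm token create --print-join-command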

Key configuration files

  • /usr/lib/systemd/system/kubelet.service
  • /etc/kubernetes/kubelet.conf
  • /var/lib/kubelet/
  • ~/.kube/config

Troubleshooting commands

journalctl -ex
journalctl -u kubelet
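
A few more commands that are often useful (the crictl endpoint shown assumes containerd; swap in the cri-dockerd socket if that is the runtime):

systemctl status kubelet
journalctl -u kubelet -f
crictl --runtime-endpoint unix:///var/run/containerd/containerd.sock ps -a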

K8S Aliyun image mirror

Helper commands: pre-pull the required images (optional)

# Map each image from `kubeadm config images list` to its mirror under
# registry.aliyuncs.com/google_containers, pull it, then re-tag it with
# the original registry.k8s.io / k8s.gcr.io name.
kubeadm config images list |
  awk -F'/' '{OFS=FS;
    $1="";
    printf $0 " ";
    if (NF==3) $2="google_containers"; else $2="google_containers/"$2;
    print $0}' |
  while read origin aliyun; do
    docker pull registry.aliyuncs.com$aliyun
    docker tag registry.aliyuncs.com$aliyun registry.k8s.io$origin
    docker tag registry.aliyuncs.com$aliyun k8s.gcr.io$origin
  done
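
Alternatively, kubeadm can pre-pull directly from the mirror, matching the --image-repository flag used in the init command above:

kubeadm config images pull --kubernetes-version v1.25.0 --image-repository registry.aliyuncs.com/google_containers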

Common errors

open /run/flannel/subnet.env: no such file or directory

Solution: create the file /run/flannel/subnet.env

cat <<EOF  > /run/flannel/subnet.env
FLANNEL_NETWORK=10.244.0.0/16
FLANNEL_SUBNET=10.244.0.1/24
FLANNEL_MTU=1450
FLANNEL_IPMASQ=true
EOF

error connection is unauthorized

Solution: edit /etc/cni/net.d/100-crio-bridge.conf

        "ranges": [
            [{ "subnet": "10.85.0.0/16" }],
            [{ "subnet": "10.244.0.0/16" }],
            [{ "subnet": "1100:200::/24" }]
        ]

certificate signed by unknown authority

Unable to connect to the server: x509: certificate signed by unknown authority (possibly because of "crypto/rsa: verification error" while trying to verify candidate authority certificate "kubernetes")

Solution

Re-sync the kubeconfig from /etc/kubernetes/admin.conf:

  mkdir -p $HOME/.kube
  sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
  sudo chown $(id -u):$(id -g) $HOME/.kube/config

kube-flannel-ds CrashLoopBackOff

kube-flannel kube-flannel-ds-c25cq 0/1 CrashLoopBackOff

The IP range must be consistent in both of these places:

/etc/kubernetes/manifests/kube-controller-manager.yaml

--cluster-cidr=10.244.0.0/16

kube-flannel.yml

  net-conf.json: |
    {
      "Network": "10.244.0.0/16",
      "Backend": {
        "Type": "vxlan"
      }
    }
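
A quick way to compare the two values (assuming kube-flannel.yml is in the current directory):

grep cluster-cidr /etc/kubernetes/manifests/kube-controller-manager.yaml
grep -A 3 '"Network"' kube-flannel.yml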

invalid bearer token, square/go-jose: error in cryptographic primitive

Unable to authenticate the request due to an error" err="[invalid bearer token, square/go-jose: error in cryptographic primitive

This is related to the kubelet's requestheader settings. It does not affect normal use.

node not found

Sep 30 04:03:32 docker.hogwarts.ceshiren.com kubelet[369967]: E0930 04:03:32.339797  369967 kuberuntime_sandbox.go:71] "Failed to create sandbox for pod" err="rpc error: code = Unknown desc = failed to get sandbox image \"k8s.gcr.io/pause:3.6\": failed to pull image \"k8s.gcr.io/pause:3.6\": failed to pull and unpack image \"k8s.gcr.io/pause:3.6\": failed to resolve reference \"k8s.gcr.io/pause:3.6\": failed to do request: Head \"https://k8s.gcr.io/v2/pause/manifests/3.6\": dial tcp 142.250.157.82:443: i/o timeout" pod="kube-system/etcd-docker.hogwarts.ceshiren.com"
Sep 30 04:03:32 docker.hogwarts.ceshiren.com kubelet[369967]: E0930 04:03:32.401679  369967 kubelet.go:2448] "Error getting node" err="node \"docker.hogwarts.ceshiren.com\" not found"
Sep 30 04:03:32 docker.hogwarts.ceshiren.com kubelet[369967]: E0930 04:03:32.502557  369967 kubelet.go:2448] "Error getting node" err="node \"docker.hogwarts.ceshiren.com\" not found"
Sep 30 04:03:32 docker.hogwarts.ceshiren.com kubelet[369967]: E0930 04:03:32.603514  369967 kubelet.go:2448] "Error getting node" err="node \"docker.hogwarts.ceshiren.com\" not found"
Sep 30 04:03:32 docker.hogwarts.ceshiren.com kubelet[369967]: E0930 04:03:32.704359  369967 kubelet.go:2448] "Error getting node" err="node \"docker.hogwarts.ceshiren.com\" not found"

The root cause is the first error above: failed to pull image "k8s.gcr.io/pause:3.6".

If you are using containerd

Solution 1

# pull through a proxy
ctr -n k8s.io i pull k8s.gcr.io/pause:3.6

Solution 2

ctr -n k8s.io image pull registry.cn-hangzhou.aliyuncs.com/google_containers/pause:3.6
ctr -n k8s.io image tag registry.cn-hangzhou.aliyuncs.com/google_containers/pause:3.6 k8s.gcr.io/pause:3.6
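
To confirm the image is now visible in the namespace containerd uses for Kubernetes:

ctr -n k8s.io images ls | grep pause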

If other containers are affected

Use a proxy or add the corresponding parameters.

The kubelet is unhealthy due to a misconfiguration of the node in some way (required cgroups disabled)

failed to get sandbox image \"registry.k8s.io/pause:3.6\": failed to pull image

Solution

ctr -n k8s.io image pull registry.cn-hangzhou.aliyuncs.com/google_containers/pause:3.6
ctr -n k8s.io image tag registry.cn-hangzhou.aliyuncs.com/google_containers/pause:3.6 k8s.gcr.io/pause:3.6
ctr -n k8s.io image tag registry.cn-hangzhou.aliyuncs.com/google_containers/pause:3.6 registry.k8s.io/pause:3.6

The corresponding kubeadm init output:

Unfortunately, an error has occurred:
	timed out waiting for the condition

This error is likely caused by:
	- The kubelet is not running
	- The kubelet is unhealthy due to a misconfiguration of the node in some way (required cgroups disabled)

If you are on a systemd-powered system, you can try to troubleshoot the error with the following commands:
	- 'systemctl status kubelet'
	- 'journalctl -xeu kubelet'

Additionally, a control plane component may have crashed or exited when started by the container runtime.
To troubleshoot, list all containers using your preferred container runtimes CLI.
Here is one example how you may list all running Kubernetes containers by using crictl:
	- 'crictl --runtime-endpoint unix:///var/run/containerd/containerd.sock ps -a | grep kube | grep -v pause'
	Once you have found the failing container, you can inspect its logs with:
	- 'crictl --runtime-endpoint unix:///var/run/containerd/containerd.sock logs CONTAINERID'
couldn't initialize a Kubernetes cluster

A separate preflight failure can occur when the CRI plugin is disabled in containerd:

[preflight] Some fatal errors occurred:
	[ERROR CRI]: container runtime is not running: output: E1109 16:47:59.292506   11631 remote_runtime.go:948] "Status from runtime service failed" err="rpc error: code = Unimplemented desc = unknown service runtime.v1alpha2.RuntimeService

Solution: comment out (or remove) the line that disables the CRI plugin.

vim /etc/containerd/config.toml

#disabled_plugins = ["cri"]

Restart containerd

systemctl restart containerd
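
A quick check that the CRI endpoint responds again:

crictl --runtime-endpoint unix:///var/run/containerd/containerd.sock info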

"Container runtime network not ready" networkReady="NetworkReady=false reason:NetworkPluginNotReady message:Network plugin returns error: cni plugin not initialized"

Solution: write a default CNI config for containerd (this is the example configuration from the Kubernetes docs linked below):

cat << EOF | tee /etc/cni/net.d/10-containerd-net.conflist
{
 "cniVersion": "1.0.0",
 "name": "containerd-net",
 "plugins": [
   {
     "type": "bridge",
     "bridge": "cni0",
     "isGateway": true,
     "ipMasq": true,
     "promiscMode": true,
     "ipam": {
       "type": "host-local",
       "ranges": [
         [{
           "subnet": "10.88.0.0/16"
         }],
         [{
           "subnet": "2001:db8:4860::/64"
         }]
       ],
       "routes": [
         { "dst": "0.0.0.0/0" },
         { "dst": "::/0" }
       ]
     }
   },
   {
     "type": "portmap",
     "capabilities": {"portMappings": true}
   }
 ]
}
EOF
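
Restarting containerd is a simple way to make sure the new CNI configuration is picked up; afterwards the node should eventually report Ready:

systemctl restart containerd
kubectl get nodes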

https://github.com/kubernetes/website/blob/dev-1.24/content/en/docs/tasks/administer-cluster/migrating-from-dockershim/troubleshooting-cni-plugin-related-errors.md#an-example-containerd-configuration-file

failed to setup network for sandbox