Contents
- Symptom
- Troubleshooting
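Conclusion first: when redis in the kubesphere-system namespace goes down and restarts, registry-proxy decides whether its image should be re-pulled through the proxy. The image was originally pulled from Alibaba Cloud and is not available on the proxy, so the request times out and the restart fails.
Solution: uninstall registry-proxy, or change the registry-proxy configuration.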
Symptom
Clicking Log In on the KubeSphere console does not redirect.
Troubleshooting
- Check the state of the kubesphere-system namespace
kubectl get all -n kubesphere-system
redis is not running.
- Check the logs of ks-console, the KubeSphere console module, to confirm what is causing the problem
kubectl logs <ks-console> -n kubesphere-system
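Conclusion: KubeSphere login fails because redis is not running.
- Find out why redis failed to start
Because the redis Pod was never created, there are no Pod logs to read; the cause can only be located through events. List the events in the kubesphere-system namespace: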
kubectl get events -n kubesphere-system
m27s Warning FailedCreate replicaset/redis-57f4b4584b Error creating: Internal error occurred: failed calling webhook "registry-proxy.registry-proxy.svc": failed to call webhook: Post "https://registry-proxy.registry-proxy.svc:443/mutate?timeout=3s": dial tcp 10.20.3.39:443: connect: connection refused
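When the event list is long, a small filter can pull out the name of the failing webhook. The following is a minimal sketch: `extract_webhook` is a hypothetical helper name, and the `sed` pattern assumes the message wording shown in the event above.

```shell
#!/bin/sh
# extract_webhook: hypothetical helper, not part of kubectl.
# Reads event text on stdin and prints each distinct webhook name
# that appears in a 'failed calling webhook "<name>"' message.
extract_webhook() {
  sed -n 's/.*failed calling webhook "\([^"]*\)".*/\1/p' | sort -u
}

# Usage against a live cluster (requires kubectl access):
#   kubectl get events -n kubesphere-system --field-selector type=Warning | extract_webhook
```

The `--field-selector type=Warning` option narrows the listing to warning events before the filter runs.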
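The failure comes from the call to the registry-proxy webhook: the API server could not reach it (connection refused), so the ReplicaSet could not create the Pod. Uninstall registry-proxy, delete all resources in the registry-proxy namespace, and see whether redis recovers.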
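Before uninstalling, it may be worth confirming that the webhook is registered and that its backend is down. The sketch below wraps two standard `kubectl get` calls; `check_webhook` is a hypothetical helper name, and the default namespace `registry-proxy` is taken from the service address in the event.

```shell
#!/bin/sh
# check_webhook: hypothetical helper that lists the registered mutating
# webhooks and the workloads expected to serve the registry-proxy webhook.
check_webhook() {
  ns="${1:-registry-proxy}"   # namespace of the webhook backend
  # Cluster-scoped list of mutating admission webhooks.
  kubectl get mutatingwebhookconfigurations
  # Pods, Service and Endpoints behind the webhook address.
  kubectl get pods,svc,endpoints -n "$ns"
}

# Usage (requires kubectl access):
#   check_webhook
```

If the Service has no ready endpoints, a "connection refused" error like the one in the event is the expected result.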
- Uninstall registry-proxy
Uninstall reference: https://ketches.cn/registry-proxy/
- Scale the redis replicas down and back up to restart it
kubectl scale deployment redis --replicas=0 -n kubesphere-system
kubectl scale deployment redis --replicas=1 -n kubesphere-system
- Check the deployment status
kubectl get deployments -n kubesphere-system
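As an alternative to scaling to 0 and back to 1, the restart and the check can be combined with `kubectl rollout`. This is a sketch under assumptions, not the procedure used in this post: `restart_and_wait` is a hypothetical helper name and the 120-second timeout is an arbitrary choice.

```shell
#!/bin/sh
# restart_and_wait: hypothetical helper that restarts a Deployment and
# blocks until the rollout finishes or the timeout expires.
restart_and_wait() {
  deploy="$1"   # Deployment name, e.g. redis
  ns="$2"       # namespace, e.g. kubesphere-system
  kubectl rollout restart deployment "$deploy" -n "$ns" &&
    kubectl rollout status deployment "$deploy" -n "$ns" --timeout=120s
}

# Usage (requires kubectl access):
#   restart_and_wait redis kubesphere-system
```

A non-zero exit status from the function means the rollout did not complete in time, which is a cue to look at the events again.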
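- Login succeeds again
- registry-proxy is still quite useful, so the problem above can also be solved by changing its configuration instead of uninstalling it
Configuration reference: https://ketches.cn/registry-proxy/
Modify the default configuration: add kubesphere-system to excludeNamespaces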
apiVersion: v1
kind: ConfigMap
metadata:
  name: registry-proxy-config
  namespace: registry-proxy
data:
  config.yaml: |
    enabled: true
    proxies:
      docker.io: docker.ketches.cn
      registry.k8s.io: k8s.ketches.cn
      quay.io: quay.ketches.cn
      ghcr.io: ghcr.ketches.cn
      gcr.io: gcr.ketches.cn
      k8s.gcr.io: k8s-gcr.ketches.cn
      docker.cloudsmith.io: cloudsmith.ketches.cn
    excludeNamespaces:
      - kube-system
      - kube-public
      - kube-node-lease
      - registry-proxy
      - kubesphere-system
    includeNamespaces:
      - "*"
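After editing the ConfigMap, a quick check can confirm that the namespace really ended up in the exclusion list. The following is a minimal sketch: `is_excluded` is a hypothetical helper name, and it assumes the list entries are written unquoted in block style, as in the configuration above.

```shell
#!/bin/sh
# is_excluded: hypothetical helper. Reads config.yaml text on stdin and
# exits 0 if the given namespace is listed under excludeNamespaces.
is_excluded() {
  awk -v ns="$1" '
    /^[[:space:]]*excludeNamespaces:/ { in_list = 1; next }
    in_list && /^[[:space:]]*-/ {
      # Strip the leading "- " and trailing spaces, then compare.
      sub(/^[[:space:]]*-[[:space:]]*/, "")
      sub(/[[:space:]]*$/, "")
      if ($0 == ns) found = 1
      next
    }
    in_list { in_list = 0 }   # any other key ends the list
    END { exit found ? 0 : 1 }
  '
}

# Usage (requires kubectl access):
#   kubectl get configmap registry-proxy-config -n registry-proxy \
#     -o jsonpath='{.data.config\.yaml}' | is_excluded kubesphere-system
```

The check only inspects the stored configuration; whether registry-proxy picks up the change without a restart is not covered here.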