NAME STATUS ROLES AGE VERSION
kube-worker-1 NotReady <none> 1h v1.23.3
kubernetes-node-bols Ready <none> 1h v1.23.3
kubernetes-node-st6x Ready <none> 1h v1.23.3
kubernetes-node-unaj Ready <none> 1h v1.23.3
kubectl describe node kube-worker-1
Name: kube-worker-1
Roles: <none>
Labels: beta.kubernetes.io/arch=amd64
beta.kubernetes.io/os=linux
kubernetes.io/arch=amd64
kubernetes.io/hostname=kube-worker-1
kubernetes.io/os=linux
Annotations: kubeadm.alpha.kubernetes.io/cri-socket: /run/containerd/containerd.sock
node.alpha.kubernetes.io/ttl: 0
volumes.kubernetes.io/controller-managed-attach-detach: true
CreationTimestamp: Thu, 17 Feb 2022 16:46:30 -0500
Taints: node.kubernetes.io/unreachable:NoExecute
node.kubernetes.io/unreachable:NoSchedule
Unschedulable: false
Lease:
HolderIdentity: kube-worker-1
AcquireTime: <unset>
RenewTime: Thu, 17 Feb 2022 17:13:09 -0500
Conditions:
Type Status LastHeartbeatTime LastTransitionTime Reason Message
---- ------ ----------------- ------------------ ------ -------
NetworkUnavailable False Thu, 17 Feb 2022 17:09:13 -0500 Thu, 17 Feb 2022 17:09:13 -0500 WeaveIsUp Weave pod has set this
MemoryPressure Unknown Thu, 17 Feb 2022 17:12:40 -0500 Thu, 17 Feb 2022 17:13:52 -0500 NodeStatusUnknown Kubelet stopped posting node status.
DiskPressure Unknown Thu, 17 Feb 2022 17:12:40 -0500 Thu, 17 Feb 2022 17:13:52 -0500 NodeStatusUnknown Kubelet stopped posting node status.
PIDPressure Unknown Thu, 17 Feb 2022 17:12:40 -0500 Thu, 17 Feb 2022 17:13:52 -0500 NodeStatusUnknown Kubelet stopped posting node status.
Ready Unknown Thu, 17 Feb 2022 17:12:40 -0500 Thu, 17 Feb 2022 17:13:52 -0500 NodeStatusUnknown Kubelet stopped posting node status.
Addresses:
InternalIP: 192.168.0.113
Hostname: kube-worker-1
Capacity:
cpu: 2
ephemeral-storage: 15372232Ki
hugepages-2Mi: 0
memory: 2025188Ki
pods: 110
Allocatable:
cpu: 2
ephemeral-storage: 14167048988
hugepages-2Mi: 0
memory: 1922788Ki
pods: 110
System Info:
Machine ID: 9384e2927f544209b5d7b67474bbf92b
System UUID: aa829ca9-73d7-064d-9019-df07404ad448
Boot ID: 5a295a03-aaca-4340-af20-1327fa5dab5c
Kernel Version: 5.13.0-28-generic
OS Image: Ubuntu 21.10
Operating System: linux
Architecture: amd64
Container Runtime Version: containerd://1.5.9
Kubelet Version: v1.23.3
Kube-Proxy Version: v1.23.3
Non-terminated Pods: (4 in total)
Namespace Name CPU Requests CPU Limits Memory Requests Memory Limits Age
--------- ---- ------------ ---------- --------------- ------------- ---
default nginx-deployment-67d4bdd6f5-cx2nz 500m (25%) 500m (25%) 128Mi (6%) 128Mi (6%) 23m
default nginx-deployment-67d4bdd6f5-w6kd7 500m (25%) 500m (25%) 128Mi (6%) 128Mi (6%) 23m
kube-system kube-proxy-dnxbz 0 (0%) 0 (0%) 0 (0%) 0 (0%) 28m
kube-system weave-net-gjxxp 100m (5%) 0 (0%) 200Mi (10%) 0 (0%) 28m
Allocated resources:
(Total limits may be over 100 percent, i.e., overcommitted.)
Resource Requests Limits
-------- -------- ------
cpu 1100m (55%) 1 (50%)
memory 456Mi (24%) 256Mi (13%)
ephemeral-storage 0 (0%) 0 (0%)
hugepages-2Mi 0 (0%) 0 (0%)
Events:
...
kubectl get node kube-worker-1 -o yaml
apiVersion:v1kind:Nodemetadata:annotations:kubeadm.alpha.kubernetes.io/cri-socket:/run/containerd/containerd.socknode.alpha.kubernetes.io/ttl:"0"volumes.kubernetes.io/controller-managed-attach-detach:"true"creationTimestamp:"2022-02-17T21:46:30Z"labels:beta.kubernetes.io/arch:amd64beta.kubernetes.io/os:linuxkubernetes.io/arch:amd64kubernetes.io/hostname:kube-worker-1kubernetes.io/os:linuxname:kube-worker-1resourceVersion:"4026"uid:98efe7cb-2978-4a0b-842a-1a7bf12c05f8spec:{}status:addresses:- address:192.168.0.113type:InternalIP- address:kube-worker-1type:Hostnameallocatable:cpu:"2"ephemeral-storage:"14167048988"hugepages-2Mi:"0"memory:1922788Kipods:"110"capacity:cpu:"2"ephemeral-storage:15372232Kihugepages-2Mi:"0"memory:2025188Kipods:"110"conditions:- lastHeartbeatTime:"2022-02-17T22:20:32Z"lastTransitionTime:"2022-02-17T22:20:32Z"message:Weave pod has set thisreason:WeaveIsUpstatus:"False"type:NetworkUnavailable- lastHeartbeatTime:"2022-02-17T22:20:15Z"lastTransitionTime:"2022-02-17T22:13:25Z"message:kubelet has sufficient memory availablereason:KubeletHasSufficientMemorystatus:"False"type:MemoryPressure- lastHeartbeatTime:"2022-02-17T22:20:15Z"lastTransitionTime:"2022-02-17T22:13:25Z"message:kubelet has no disk pressurereason:KubeletHasNoDiskPressurestatus:"False"type:DiskPressure- lastHeartbeatTime:"2022-02-17T22:20:15Z"lastTransitionTime:"2022-02-17T22:13:25Z"message:kubelet has sufficient PID availablereason:KubeletHasSufficientPIDstatus:"False"type:PIDPressure- lastHeartbeatTime:"2022-02-17T22:20:15Z"lastTransitionTime:"2022-02-17T22:15:15Z"message:kubelet is posting ready status. AppArmor enabledreason:KubeletReadystatus:"True"type:ReadydaemonEndpoints:kubeletEndpoint:Port:10250nodeInfo:architecture:amd64bootID:22333234-7a6b-44d4-9ce1-67e31dc7e369containerRuntimeVersion:containerd://1.5.9kernelVersion:5.13.0-28-generickubeProxyVersion:v1.23.3kubeletVersion:v1.23.3machineID:9384e2927f544209b5d7b67474bbf92boperatingSystem:linuxosImage:Ubuntu 21.10systemUUID:aa829ca9-73d7-064d-9019-df07404ad448
Kubernetesでは、コンテナのCPU使用率やメモリ使用率といったリソース使用量のメトリクスが、メトリクスAPIを通じて提供されています。これらのメトリクスは、ユーザーがkubectl topコマンドで直接アクセスするか、クラスター内のコントローラー(例えばHorizontal Pod Autoscaler)が判断するためにアクセスすることができます。
apiVersion:apps/v1kind:DaemonSetmetadata:name:node-problem-detector-v0.1namespace:kube-systemlabels:k8s-app:node-problem-detectorversion:v0.1kubernetes.io/cluster-service:"true"spec:selector:matchLabels:k8s-app:node-problem-detector version:v0.1kubernetes.io/cluster-service:"true"template:metadata:labels:k8s-app:node-problem-detectorversion:v0.1kubernetes.io/cluster-service:"true"spec:hostNetwork:truecontainers:- name:node-problem-detectorimage:registry.k8s.io/node-problem-detector:v0.1securityContext:privileged:trueresources:limits:cpu:"200m"memory:"100Mi"requests:cpu:"20m"memory:"20Mi"volumeMounts:- name:logmountPath:/logreadOnly:true- name:config# Overwrite the config/ directory with ConfigMap volumemountPath:/configreadOnly:truevolumes:- name:loghostPath:path:/var/log/- name:config# Define ConfigMap volumeconfigMap:name:node-problem-detector-config
新しい設定ファイルでNode Problem Detectorを再作成します。
# If you have a node-problem-detector running, delete before recreatingkubectl delete -f https://k8s.io/examples/debug/node-problem-detector.yaml
kubectl apply -f https://k8s.io/examples/debug/node-problem-detector-configmap.yaml
備考: この方法は kubectl で起動された Node Problem Detector にのみ適用されます。
POD ID CREATED STATE NAME NAMESPACE ATTEMPT
926f1b5a1d33a About a minute ago Ready sh-84d7dcf559-4r2gq default 0
4dccb216c4adb About a minute ago Ready nginx-65899c769f-wv2gp default 0
a86316e96fa89 17 hours ago Ready kube-proxy-gblk4 kube-system 0
919630b8f81f1 17 hours ago Ready nvidia-device-plugin-zgbbv kube-system 0
Podを名前でリストアップします:
crictl pods --name nginx-65899c769f-wv2gp
出力はこのようになります:
POD ID CREATED STATE NAME NAMESPACE ATTEMPT
4dccb216c4adb 2 minutes ago Ready nginx-65899c769f-wv2gp default 0
Podをラベルでリストアップします:
crictl pods --label run=nginx
出力はこのようになります:
POD ID CREATED STATE NAME NAMESPACE ATTEMPT
4dccb216c4adb 2 minutes ago Ready nginx-65899c769f-wv2gp default 0
CONTAINER ID IMAGE CREATED STATE NAME ATTEMPT
1f73f2d81bf98 busybox@sha256:141c253bc4c3fd0a201d32dc1f493bcf3fff003b6df416dea4f41046e0f37d47 7 minutes ago Running sh 1
9c5951df22c78 busybox@sha256:141c253bc4c3fd0a201d32dc1f493bcf3fff003b6df416dea4f41046e0f37d47 8 minutes ago Exited sh 0
87d3992f84f74 nginx@sha256:d0a8828cccb73397acb0073bf34f4d7d8aa315263f1e7806bf8c55d8ac139d5f 8 minutes ago Running nginx 0
1941fb4da154f k8s-gcrio.azureedge.net/hyperkube-amd64@sha256:00d814b1f7763f4ab5be80c58e98140dfc69df107f253d7fdd714b30a714260a 18 hours ago Running kube-proxy 0
ランニングコンテナをリストアップします:
crictl ps
出力はこのようになります:
CONTAINER ID IMAGE CREATED STATE NAME ATTEMPT
1f73f2d81bf98 busybox@sha256:141c253bc4c3fd0a201d32dc1f493bcf3fff003b6df416dea4f41046e0f37d47 6 minutes ago Running sh 1
87d3992f84f74 nginx@sha256:d0a8828cccb73397acb0073bf34f4d7d8aa315263f1e7806bf8c55d8ac139d5f 7 minutes ago Running nginx 0
1941fb4da154f k8s-gcrio.azureedge.net/hyperkube-amd64@sha256:00d814b1f7763f4ab5be80c58e98140dfc69df107f253d7fdd714b30a714260a 17 hours ago Running kube-proxy 0
apiVersion:audit.k8s.io/v1# This is required.kind:Policy# Don't generate audit events for all requests in RequestReceived stage.omitStages:- "RequestReceived"rules:# Log pod changes at RequestResponse level- level:RequestResponseresources:- group:""# Resource "pods" doesn't match requests to any subresource of pods,# which is consistent with the RBAC policy.resources:["pods"]# Log "pods/log", "pods/status" at Metadata level- level:Metadataresources:- group:""resources:["pods/log","pods/status"]# Don't log requests to a configmap called "controller-leader"- level:Noneresources:- group:""resources:["configmaps"]resourceNames:["controller-leader"]# Don't log watch requests by the "system:kube-proxy" on endpoints or services- level:Noneusers:["system:kube-proxy"]verbs:["watch"]resources:- group:""# core API groupresources:["endpoints","services"]# Don't log authenticated requests to certain non-resource URL paths.- level:NoneuserGroups:["system:authenticated"]nonResourceURLs:- "/api*"# Wildcard matching.- "/version"# Log the request body of configmap changes in kube-system.- level:Requestresources:- group:""# core API groupresources:["configmaps"]# This rule only applies to resources in the "kube-system" namespace.# The empty string "" can be used to select non-namespaced resources.namespaces:["kube-system"]# Log configmap and secret changes in all other namespaces at the Metadata level.- level:Metadataresources:- group:""# core API groupresources:["secrets","configmaps"]# Log all other resources in core and extensions at the Request level.- level:Requestresources:- group:""# core API group- group:"extensions"# Version of group should NOT be included.# A catch-all rule to log all other requests at the Metadata level.- level:Metadata# Long-running requests like watches that fall under this rule will not# generate an audit event in RequestReceived.omitStages:- "RequestReceived"