I am using Terraform to create infrastructure on AWS. Among other services, we create an AWS EKS cluster using the terraform-aws-modules/eks/aws module. The cluster is primarily used to spin up short-lived containers that handle asynchronous job execution. Once a task is completed, the container releases its resources and terminates.
What I have noticed is that the dead containers stay on the EKS cluster forever. Over time, too many dead containers accumulate and consume storage. I came across a few blogs that mention that Kubernetes has a garbage collection process, but none describes how to configure it through Terraform or specifically for AWS EKS.
Hence I am looking for a way to specify a garbage collection policy for dead containers on AWS EKS. If this is not achievable via Terraform, I am OK with using kubectl against AWS EKS.
CodePudding user response:
Removal of terminated pods is controlled by the kube-controller-manager flag --terminated-pod-gc-threshold. This flag is currently not configurable on EKS, because the control plane is managed by AWS. You can upvote here for the feature. Alternatively, you can run a simple CronJob with a kubectl image to delete terminated pods (e.g. kubectl delete pods --all-namespaces --field-selector=status.phase==Succeeded).
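As a rough sketch of what such a CronJob could run, the script below wraps the kubectl command from this answer in a small function. Cleaning up the Failed phase as well, the function name cleanup_terminated_pods, and the DRY_RUN switch are assumptions added for illustration; they are not part of the original answer.

```shell
#!/bin/sh
# Sketch: delete pods that have finished running, in every namespace.
# DRY_RUN is a hypothetical safety switch: by default the commands are only
# printed, and DRY_RUN=0 actually runs kubectl.
DRY_RUN="${DRY_RUN:-1}"

cleanup_terminated_pods() {
  # Succeeded comes from the answer above; Failed is an added assumption.
  for phase in Succeeded Failed; do
    if [ "$DRY_RUN" = "1" ]; then
      echo "kubectl delete pods --all-namespaces --field-selector=status.phase==${phase}"
    else
      kubectl delete pods --all-namespaces --field-selector="status.phase==${phase}"
    fi
  done
}

cleanup_terminated_pods
```

Run with DRY_RUN=0 from a container whose service account is allowed to list and delete pods cluster-wide. Drop Failed from the loop if failed pods should be kept for debugging.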
CodePudding user response:
These two kubelet flags make the node clean up unused container images based on disk usage: cleanup starts once usage exceeds the high threshold and continues until it drops to the low threshold. https://kubernetes.io/docs/concepts/architecture/garbage-collection/#container-image-lifecycle
--image-gc-high-threshold="85"
--image-gc-low-threshold="80"
But you also probably want to set --maximum-dead-containers 1 so that running the same image multiple times doesn't leave dead containers around.
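One possible way to apply these flags on EKS worker nodes is through the node bootstrap step. The sketch below only assembles and prints the command line. It assumes the EKS-optimized Amazon Linux 2 AMI, whose /etc/eks/bootstrap.sh accepts --kubelet-extra-args; the cluster name my-cluster and the helper bootstrap_command are hypothetical. Note that --maximum-dead-containers is marked deprecated in recent kubelet versions, so check that your node version still honors it.

```shell
#!/bin/sh
# Sketch: build the bootstrap command that passes GC flags to the kubelet.
# The flag values are the ones suggested in the answer above.
KUBELET_EXTRA_ARGS="--image-gc-high-threshold=85 --image-gc-low-threshold=80 --maximum-dead-containers=1"

# bootstrap_command is a hypothetical helper; it prints the command that
# node user data would run on an EKS-optimized Amazon Linux 2 AMI (assumption).
bootstrap_command() {
  cluster_name="$1"
  printf "/etc/eks/bootstrap.sh %s --kubelet-extra-args '%s'\n" "$cluster_name" "$KUBELET_EXTRA_ARGS"
}

bootstrap_command "my-cluster"  # "my-cluster" is a placeholder name
```

With terraform-aws-modules/eks/aws, the quoted --kubelet-extra-args string is typically supplied through the node group's bootstrap arguments input (the exact input name depends on the module version, so check the documentation for the version you use).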
