We are running EKS in Ireland and our nodes are going unhealthy regularly.
It is not possible to SSH to the host, pods are not reachable. We have experienced this with t2.xlarge, t2.small and t3.medium instances.
We could ssh to another node in the cluster and ping the NotReady node, but are not able to ssh it either.
Graphs show the memory goes high at about the same time that the journalctl logs below. The EBS IO also goes high. The exact time is hard to pinpoint. I added logs with interesting 'failures' around the time that we think the node disappeared.
We are using the cluster for running tests, so pods are getting created and destroyed often.
We have not done anything described in #51 for log rotation.
Cluster Information:
CNI: Latest daemonset with image: 602401143452.dkr.ecr.us-west-2.amazonaws.com/amazon-k8s-cni:1.2.1
Region: eu-west-1
LOGS
** Node AMI**
AMI ID amazon-eks-node-v24 (ami-0c7a4976cb6fafd3a)
** File system **
$ df -h
Filesystem Size Used Avail Use% Mounted on
devtmpfs 1,9G 0 1,9G 0% /dev
tmpfs 1,9G 0 1,9G 0% /dev/shm
tmpfs 1,9G 2,3M 1,9G 1% /run
tmpfs 1,9G 0 1,9G 0% /sys/fs/cgroup
/dev/nvme0n1p1 64G 40G 25G 62% /
tmpfs 389M 0 389M 0% /run/user/1000
** kubectl describe node**
Name: ip-<secret>.eu-west-1.compute.internal
Roles: <none>
Labels: beta.kubernetes.io/arch=amd64
beta.kubernetes.io/instance-type=t3.medium
beta.kubernetes.io/os=linux
failure-domain.beta.kubernetes.io/region=eu-west-1
failure-domain.beta.kubernetes.io/zone=eu-west-1b
kubernetes.io/hostname=ip-<secret>.eu-west-1.compute.internal
Annotations: node.alpha.kubernetes.io/ttl: 0
volumes.kubernetes.io/controller-managed-attach-detach: true
CreationTimestamp: Tue, 30 Oct 2018 11:25:48 +0100
Taints: <none>
Unschedulable: false
Conditions:
Type Status LastHeartbeatTime LastTransitionTime Reason Message
---- ------ ----------------- ------------------ ------ -------
OutOfDisk Unknown Wed, 31 Oct 2018 10:56:53 +0100 Wed, 31 Oct 2018 10:57:35 +0100 NodeStatusUnknown Kubelet stopped posting node status.
MemoryPressure Unknown Wed, 31 Oct 2018 10:56:53 +0100 Wed, 31 Oct 2018 10:57:35 +0100 NodeStatusUnknown Kubelet stopped posting node status.
DiskPressure Unknown Wed, 31 Oct 2018 10:56:53 +0100 Wed, 31 Oct 2018 10:57:35 +0100 NodeStatusUnknown Kubelet stopped posting node status.
PIDPressure False Wed, 31 Oct 2018 10:56:53 +0100 Tue, 30 Oct 2018 11:25:46 +0100 KubeletHasSufficientPID kubelet has sufficient PID available
Ready Unknown Wed, 31 Oct 2018 10:56:53 +0100 Wed, 31 Oct 2018 10:57:35 +0100 NodeStatusUnknown Kubelet stopped posting node status.
Addresses:
InternalIP: <secret>
Hostname: ip-<secret>..eu-west-1.compute.internal
Capacity:
cpu: 2
ephemeral-storage: 67096556Ki
hugepages-1Gi: 0
hugepages-2Mi: 0
memory: 3980344Ki
pods: 17
Allocatable:
cpu: 2
ephemeral-storage: 61836185908
hugepages-1Gi: 0
hugepages-2Mi: 0
memory: 3877944Ki
pods: 17
System Info:
Machine ID: asdf
System UUID: asdf
Boot ID: asdf
Kernel Version: 4.14.62-70.117.amzn2.x86_64
OS Image: Amazon Linux 2
Operating System: linux
Architecture: amd64
Container Runtime Version: docker://17.6.2
Kubelet Version: v1.10.3
Kube-Proxy Version: v1.10.3
ProviderID: aws:///eu-west-1b/i-<secret>
Non-terminated Pods: (14 in total)
Namespace Name CPU Requests CPU Limits Memory Requests Memory Limits
--------- ---- ------------ ---------- --------------- -------------
kube-system aws-node-hshhg 10m (0%) 0 (0%) 0 (0%) 0 (0%)
kube-system kube-proxy-fkrb8 100m (5%) 0 (0%) 0 (0%) 0 (0%)
monitoring datadog-datadog-bk5bd 200m (10%) 200m (10%) 256Mi (6%) 256Mi (6%)
monitoring prometheus-node-exporter-4z2dg 0 (0%) 0 (0%) 0 (0%) 0 (0%)
t1 0 (0%) 0 (0%) 0 (0%) 0 (0%)
t2 0 (0%) 0 (0%) 0 (0%) 0 (0%)
t3 0 (0%) 0 (0%) 0 (0%) 0 (0%)
t4 0 (0%) 0 (0%) 0 (0%) 0 (0%)
t5 250m (12%) 250m (12%) 500Mi (13%) 500Mi (13%)
t6 0 (0%) 0 (0%) 0 (0%) 0 (0%)
t7 250m (12%) 250m (12%) 500Mi (13%) 500Mi (13%)
t8 100m (5%) 0 (0%) 256Mi (6%) 0 (0%)
t9 250m (12%) 250m (12%) 500Mi (13%) 500Mi (13%)
t10 0 (0%) 0 (0%) 0 (0%) 0 (0%)
Allocated resources:
(Total limits may be over 100 percent, i.e., overcommitted.)
Resource Requests Limits
-------- -------- ------
cpu 1160m (57%) 950m (47%)
memory 2012Mi (53%) 1756Mi (46%)
Events: <none>
journalctl logs around the time
okt 31 10:01:29 ip-<secret>.eu-west-1.compute.internal kernel: aws-k8s-agent: page allocation stalls for 10404ms, order:0, mode:0x14201ca(GFP_HIGHUSER_MOVABLE|__GFP_COLD), nodemask=(null)
okt 31 10:01:30 ip-<secret>.eu-west-1.compute.internal kernel: aws-k8s-agent cpuset=1ef2c300b3981b045f3f2fcab050f674afead7e7c828362ec2d40ef82bf02441 mems_allowed=0
okt 31 10:01:31 ip-<secret>.eu-west-1.compute.internal kernel: CPU: 1 PID: 6267 Comm: aws-k8s-agent Not tainted 4.14.62-70.117.amzn2.x86_64 #1
okt 31 10:01:34 ip-<secret>.eu-west-1.compute.internal kernel: Hardware name: Amazon EC2 t3.medium/, BIOS 1.0 10/16/2017
okt 31 10:01:36 ip-<secret>.eu-west-1.compute.internal kernel: Call Trace:
okt 31 10:01:38 ip-<secret>.eu-west-1.compute.internal kernel: dump_stack+0x5c/0x82
okt 31 10:01:39 ip-<secret>.eu-west-1.compute.internal kernel: warn_alloc+0x114/0x1c0
okt 31 10:01:41 ip-<secret>.eu-west-1.compute.internal kernel: __alloc_pages_slowpath+0x831/0xe00
okt 31 10:01:42 ip-<secret>.eu-west-1.compute.internal kernel: ? get_page_from_freelist+0x371/0xba0
okt 31 10:01:45 ip-<secret>.eu-west-1.compute.internal kernel: __alloc_pages_nodemask+0x227/0x250
okt 31 10:01:46 ip-<secret>.eu-west-1.compute.internal kernel: filemap_fault+0x204/0x5f0
okt 31 10:01:47 ip-<secret>.eu-west-1.compute.internal kernel: __xfs_filemap_fault.constprop.8+0x49/0x120 [xfs]
okt 31 10:01:50 ip-<secret>.eu-west-1.compute.internal kernel: __do_fault+0x20/0x60
okt 31 10:01:52 ip-<secret>.eu-west-1.compute.internal kernel: handle_pte_fault+0x945/0xeb0
okt 31 10:01:55 ip-<secret>.eu-west-1.compute.internal kernel: __handle_mm_fault+0x431/0x540
okt 31 10:01:57 ip-<secret>.eu-west-1.compute.internal kernel: handle_mm_fault+0xaa/0x1e0
okt 31 10:02:00 ip-<secret>.eu-west-1.compute.internal kernel: __do_page_fault+0x23e/0x4c0
okt 31 10:02:02 ip-<secret>.eu-west-1.compute.internal kernel: ? async_page_fault+0x2f/0x50
okt 31 10:02:07 ip-<secret>.eu-west-1.compute.internal kernel: async_page_fault+0x45/0x50
okt 31 10:02:09 ip-<secret>.eu-west-1.compute.internal kernel: RIP: 0001:0x1f
okt 31 10:02:12 ip-<secret>.eu-west-1.compute.internal kernel: RSP: 0000:000000c420170f58 EFLAGS: 4d32dce245d7
okt 31 10:02:15 ip-<secret>.eu-west-1.compute.internal kernel: Mem-Info:
okt 31 10:02:16 ip-<secret>.eu-west-1.compute.internal kernel: active_anon:895836 inactive_anon:8314 isolated_anon:0
active_file:413 inactive_file:596 isolated_file:0
unevictable:0 dirty:1 writeback:0 unstable:0
slab_reclaimable:17241 slab_unreclaimable:26888
mapped:22510 shmem:28069 pagetables:7173 bounce:0
free:21650 free_pcp:12 free_cma:0
okt 31 10:02:17 ip-<secret>.eu-west-1.compute.internal kernel: Node 0 active_anon:3583344kB inactive_anon:33256kB active_file:1652kB inactive_file:2384kB unevictable:0kB isolated(anon):0kB isolated(file):0kB mapped:90040kB dirty:4kB writeback:0kB shmem:112276kB shmem_thp: 0kB shmem_pmdmapped: 0kB anon_thp: 16384kB writeback_tmp:0kB unstable:0kB all_unreclaimable? no
okt 31 10:02:19 ip-<secret>.eu-west-1.compute.internal kernel: Node 0 DMA free:15620kB min:268kB low:332kB high:396kB active_anon:288kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB present:15992kB managed:15908kB mlocked:0kB kernel_stack:0kB pagetables:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
okt 31 10:02:22 ip-<secret>.eu-west-1.compute.internal kernel: lowmem_reserve[]: 0 2951 3849 3849
okt 31 10:02:24 ip-<secret>.eu-west-1.compute.internal kernel: Node 0 DMA32 free:54972kB min:51600kB low:64500kB high:77400kB active_anon:2799324kB inactive_anon:31696kB active_file:556kB inactive_file:816kB unevictable:0kB writepending:0kB present:3129320kB managed:3044324kB mlocked:0kB kernel_stack:7968kB pagetables:19844kB bounce:0kB free_pcp:148kB local_pcp:0kB free_cma:0kB
okt 31 10:02:27 ip-<secret>.eu-west-1.compute.internal kernel: lowmem_reserve[]: 0 0 898 898
okt 31 10:02:30 ip-<secret>.eu-west-1.compute.internal kernel: Node 0 Normal free:15408kB min:15708kB low:19632kB high:23556kB active_anon:783732kB inactive_anon:1560kB active_file:884kB inactive_file:1392kB unevictable:0kB writepending:4kB present:987136kB managed:920112kB mlocked:0kB kernel_stack:4304kB pagetables:8848kB bounce:0kB free_pcp:420kB local_pcp:0kB free_cma:0kB
okt 31 10:02:32 ip-<secret>.eu-west-1.compute.internal kernel: lowmem_reserve[]: 0 0 0 0
okt 31 10:02:34 ip-<secret>.eu-west-1.compute.internal kernel: Node 0 DMA: 1*4kB (U) 0*8kB 0*16kB 2*32kB (UM) 3*64kB (UM) 2*128kB (UM) 1*256kB (U) 1*512kB (M) 2*1024kB (UM) 0*2048kB 3*4096kB (ME) = 15620kB
okt 31 10:02:36 ip-<secret>.eu-west-1.compute.internal kernel: Node 0 DMA32: 1659*4kB (UME) 1496*8kB (UME) 1181*16kB (UME) 446*32kB (UME) 54*64kB (UME) 1*128kB (E) 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 55356kB
okt 31 10:02:37 ip-<secret>.eu-west-1.compute.internal kernel: Node 0 Normal: 334*4kB (UMEH) 351*8kB (UMEH) 431*16kB (UMEH) 93*32kB (UMEH) 4*64kB (H) 2*128kB (H) 4*256kB (H) 0*512kB 0*1024kB 0*2048kB 0*4096kB = 15552kB
okt 31 10:02:38 ip-<secret>.eu-west-1.compute.internal kernel: Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
okt 31 10:02:38 ip-<secret>.eu-west-1.compute.internal kernel: Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
okt 31 10:02:40 ip-<secret>.eu-west-1.compute.internal kernel: 28769 total pagecache pages
okt 31 10:02:40 ip-<secret>.eu-west-1.compute.internal kernel: 0 pages in swap cache
okt 31 10:02:42 ip-<secret>.eu-west-1.compute.internal kernel: Swap cache stats: add 0, delete 0, find 0/0
okt 31 10:02:43 ip-<secret>.eu-west-1.compute.internal kernel: Free swap = 0kB
okt 31 10:02:49 ip-<secret>.eu-west-1.compute.internal systemd-journal[26209]: Permanent journal is using 392.0M (max allowed 4.0G, trying to leave 4.0G free of 38.6G available โ current limit 4.0G).
okt 31 10:02:49 ip-<secret>.eu-west-1.compute.internal kernel: Total swap = 0kB
okt 31 10:02:49 ip-<secret>.eu-west-1.compute.internal kernel: 1033112 pages RAM
okt 31 10:02:50 ip-<secret>.eu-west-1.compute.internal kernel: 0 pages HighMem/MovableOnly
okt 31 10:02:50 ip-<secret>.eu-west-1.compute.internal kernel: 38026 pages reserved
okt 31 10:02:50 ip-<secret>.eu-west-1.compute.internal kernel: process-agent: page allocation stalls for 10580ms, order:0, mode:0x14201ca(GFP_HIGHUSER_MOVABLE|__GFP_COLD), nodemask=(null)
okt 31 10:02:50 ip-<secret>.eu-west-1.compute.internal kernel: process-agent cpuset=67b33ad9edc4663ce3e97ac968df4726a9beeff073706349383b1e9eabd93125 mems_allowed=0
okt 31 10:02:50 ip-<secret>.eu-west-1.compute.internal kernel: CPU: 1 PID: 7452 Comm: process-agent Not tainted 4.14.62-70.117.amzn2.x86_64 #1
okt 31 10:02:50 ip-<secret>.eu-west-1.compute.internal kernel: Hardware name: Amazon EC2 t3.medium/, BIOS 1.0 10/16/2017
okt 31 10:02:50 ip-<secret>.eu-west-1.compute.internal kernel: Call Trace:
okt 31 10:02:50 ip-<secret>.eu-west-1.compute.internal kernel: dump_stack+0x5c/0x82
okt 31 10:02:50 ip-<secret>.eu-west-1.compute.internal kernel: warn_alloc+0x114/0x1c0
okt 31 10:02:50 ip-<secret>.eu-west-1.compute.internal kernel: __alloc_pages_slowpath+0x831/0xe00
okt 31 10:02:50 ip-<secret>.eu-west-1.compute.internal kernel: ? get_page_from_freelist+0x371/0xba0
okt 31 10:02:50 ip-<secret>.eu-west-1.compute.internal kernel: __alloc_pages_nodemask+0x227/0x250
okt 31 10:02:50 ip-<secret>.eu-west-1.compute.internal kernel: filemap_fault+0x204/0x5f0
okt 31 10:02:50 ip-<secret>.eu-west-1.compute.internal kernel: __xfs_filemap_fault.constprop.8+0x49/0x120 [xfs]
okt 31 10:02:50 ip-<secret>.eu-west-1.compute.internal kernel: __do_fault+0x20/0x60
okt 31 10:02:51 ip-<secret>.eu-west-1.compute.internal kernel: handle_pte_fault+0x945/0xeb0
okt 31 10:02:51 ip-<secret>.eu-west-1.compute.internal kernel: ? __switch_to_asm+0x34/0x70
okt 31 10:02:51 ip-<secret>.eu-west-1.compute.internal kernel: ? __switch_to_asm+0x40/0x70
okt 31 10:02:51 ip-<secret>.eu-west-1.compute.internal kernel: __handle_mm_fault+0x431/0x540
okt 31 10:02:51 ip-<secret>.eu-west-1.compute.internal kernel: handle_mm_fault+0xaa/0x1e0
okt 31 10:02:51 ip-<secret>.eu-west-1.compute.internal kernel: __do_page_fault+0x23e/0x4c0
okt 31 10:02:51 ip-<secret>.eu-west-1.compute.internal kernel: ? async_page_fault+0x2f/0x50
okt 31 10:02:51 ip-<secret>.eu-west-1.compute.internal kernel: async_page_fault+0x45/0x50
okt 31 10:02:51 ip-<secret>.eu-west-1.compute.internal kernel: RIP: 00f1:0x11
okt 31 10:02:51 ip-<secret>.eu-west-1.compute.internal kernel: RSP: 0234:00007f768c4e5d38 EFLAGS: 00000000
okt 31 10:02:51 ip-<secret>.eu-west-1.compute.internal kernel: Mem-Info:
okt 31 10:02:51 ip-<secret>.eu-west-1.compute.internal kernel: active_anon:895956 inactive_anon:8314 isolated_anon:0
active_file:15 inactive_file:24 isolated_file:0
unevictable:0 dirty:0 writeback:0 unstable:0
slab_reclaimable:17237 slab_unreclaimable:26796
mapped:21743 shmem:28069 pagetables:7196 bounce:0
free:21560 free_pcp:682 free_cma:0
plugin logs
2018-10-31T10:50:15Z [INFO] Starting CNI Plugin v1.2.1 ...
2018-10-31T10:50:15Z [INFO] Received CNI del request: ContainerID(56904923f2dfb96db21ddfb6d39f2429d641141f78511d07823bd315feaf4302) Netns() IfName(eth0) Args(IgnoreUnknown=1;K8S_POD_NAMESPACE=monitoring;K8S_POD_NAME=datadog-datadog-bk5bd;K8S_POD_INFRA_CONTAINER_ID=56904923f2dfb96db21ddfb6d39f2429d641141f78511d07823bd315feaf4302) Path(/opt/aws-cni/bin:/opt/cni/bin) argsStdinData({"cniVersion":"","name":"aws-cni","type":"aws-cni","vethPrefix":"eni"})
2018-10-31T10:50:15Z [ERROR] Error received from DelNetwork grpc call for pod datadog-datadog-bk5bd namespace monitoring container 56904923f2dfb96db21ddfb6d39f2429d641141f78511d07823bd315feaf4302: rpc error: code = Unavailable desc = all SubConns are in TransientFailure, latest connection error: connection error: desc = "transport: Error while dialing dial tcp 127.0.0.1:50051: connect: connection refused"
ipamd.log
2018-10-31T10:05:43Z [DEBUG] Found ip addresses [10.0.1.72 10.0.1.208 10.0.1.36 10.0.1.86 10.0.1.219 10.0.1.63] on eni 02:af:21:3c:f9:4e
2018-10-31T10:05:44Z [DEBUG] Found eni mac address : 02:b3:1a:eb:c3:5e
2018-10-31T10:05:52Z [DEBUG] Using device number 0 for primary eni: eni-0f37efb5e4ebecf09
2018-10-31T10:05:52Z [DEBUG] Found eni: eni-0f37efb5e4ebecf09, mac 02:b3:1a:eb:c3:5e, device 0
2018-10-31T10:05:55Z [DEBUG] Found cidr 10.0.1.0/24 for eni 02:b3:1a:eb:c3:5e
2018-10-31T10:05:59Z [DEBUG] Found ip addresses [10.0.1.143 10.0.1.96 10.0.1.65 10.0.1.209 10.0.1.134 10.0.1.8] on eni 02:b3:1a:eb:c3:5e
2018-10-31T10:05:59Z [DEBUG] Reconcile existing ENI eni-0ce38d7ac411b07ab IP pool
2018-10-31T10:05:59Z [DEBUG] Reconcile and skip primary IP 10.0.1.117 on eni eni-0ce38d7ac411b07ab
2018-10-31T10:06:00Z [DEBUG] Adding ENI(eni-0ce38d7ac411b07ab)'s IPv4 address 10.0.1.53 to datastore
2018-10-31T10:06:00Z [DEBUG] IP Address Pool stats: total: 15, assigned: 12
2018-10-31T10:06:00Z [DEBUG] Reconciled IP 10.0.1.53 on eni eni-0ce38d7ac411b07ab
2018-10-31T10:06:00Z [DEBUG] Adding ENI(eni-0ce38d7ac411b07ab)'s IPv4 address 10.0.1.102 to datastore
2018-10-31T10:06:00Z [DEBUG] IP Address Pool stats: total: 15, assigned: 12
2018-10-31T10:06:00Z [DEBUG] Reconciled IP 10.0.1.102 on eni eni-0ce38d7ac411b07ab
2018-10-31T10:06:00Z [DEBUG] Adding ENI(eni-0ce38d7ac411b07ab)'s IPv4 address 10.0.1.120 to datastore
2018-10-31T10:06:00Z [DEBUG] IP Address Pool stats: total: 15, assigned: 12
2018-10-31T10:06:00Z [DEBUG] Reconciled IP 10.0.1.120 on eni eni-0ce38d7ac411b07ab
2018-10-31T10:06:00Z [DEBUG] Adding ENI(eni-0ce38d7ac411b07ab)'s IPv4 address 10.0.1.42 to datastore
2018-10-31T10:06:00Z [DEBUG] IP Address Pool stats: total: 15, assigned: 12
2018-10-31T10:06:00Z [DEBUG] Reconciled IP 10.0.1.42 on eni eni-0ce38d7ac411b07ab
2018-10-31T10:06:00Z [DEBUG] Adding ENI(eni-0ce38d7ac411b07ab)'s IPv4 address 10.0.1.59 to datastore
2018-10-31T10:06:00Z [DEBUG] IP Address Pool stats: total: 15, assigned: 12
2018-10-31T10:06:00Z [DEBUG] Reconciled IP 10.0.1.59 on eni eni-0ce38d7ac411b07ab
2018-10-31T10:06:00Z [DEBUG] Reconcile existing ENI eni-0f1db76fd54b2e3f5 IP pool
2018-10-31T10:06:00Z [DEBUG] Reconcile and skip primary IP 10.0.1.72 on eni eni-0f1db76fd54b2e3f5
2018-10-31T10:06:00Z [DEBUG] Adding ENI(eni-0f1db76fd54b2e3f5)'s IPv4 address 10.0.1.208 to datastore
2018-10-31T10:06:00Z [DEBUG] IP Address Pool stats: total: 15, assigned: 12
2018-10-31T10:06:00Z [DEBUG] Reconciled IP 10.0.1.208 on eni eni-0f1db76fd54b2e3f5
2018-10-31T10:06:00Z [DEBUG] Adding ENI(eni-0f1db76fd54b2e3f5)'s IPv4 address 10.0.1.36 to datastore
2018-10-31T10:06:00Z [DEBUG] IP Address Pool stats: total: 15, assigned: 12
2018-10-31T10:06:00Z [DEBUG] Reconciled IP 10.0.1.36 on eni eni-0f1db76fd54b2e3f5
2018-10-31T10:06:00Z [DEBUG] Adding ENI(eni-0f1db76fd54b2e3f5)'s IPv4 address 10.0.1.86 to datastore
2018-10-31T10:06:00Z [DEBUG] IP Address Pool stats: total: 15, assigned: 12
2018-10-31T10:06:00Z [DEBUG] Reconciled IP 10.0.1.86 on eni eni-0f1db76fd54b2e3f5
2018-10-31T10:06:00Z [DEBUG] Adding ENI(eni-0f1db76fd54b2e3f5)'s IPv4 address 10.0.1.219 to datastore
2018-10-31T10:06:00Z [DEBUG] IP Address Pool stats: total: 15, assigned: 12
2018-10-31T10:06:00Z [DEBUG] Reconciled IP 10.0.1.219 on eni eni-0f1db76fd54b2e3f5
2018-10-31T10:06:00Z [DEBUG] Adding ENI(eni-0f1db76fd54b2e3f5)'s IPv4 address 10.0.1.63 to datastore
2018-10-31T10:06:00Z [DEBUG] IP Address Pool stats: total: 15, assigned: 12
2018-10-31T10:06:00Z [DEBUG] Reconciled IP 10.0.1.63 on eni eni-0f1db76fd54b2e3f5
2018-10-31T10:06:00Z [DEBUG] Reconcile existing ENI eni-0f37efb5e4ebecf09 IP pool
2018-10-31T10:06:00Z [DEBUG] Reconcile and skip primary IP 10.0.1.143 on eni eni-0f37efb5e4ebecf09
2018-10-31T10:06:00Z [DEBUG] Adding ENI(eni-0f37efb5e4ebecf09)'s IPv4 address 10.0.1.96 to datastore
2018-10-31T10:06:00Z [DEBUG] IP Address Pool stats: total: 15, assigned: 12
2018-10-31T10:06:00Z [DEBUG] Reconciled IP 10.0.1.96 on eni eni-0f37efb5e4ebecf09
2018-10-31T10:06:00Z [DEBUG] Adding ENI(eni-0f37efb5e4ebecf09)'s IPv4 address 10.0.1.65 to datastore
2018-10-31T10:06:00Z [DEBUG] IP Address Pool stats: total: 15, assigned: 12
2018-10-31T10:06:00Z [DEBUG] Reconciled IP 10.0.1.65 on eni eni-0f37efb5e4ebecf09
2018-10-31T10:06:00Z [DEBUG] Adding ENI(eni-0f37efb5e4ebecf09)'s IPv4 address 10.0.1.209 to datastore
2018-10-31T10:06:00Z [DEBUG] IP Address Pool stats: total: 15, assigned: 12
2018-10-31T10:06:00Z [DEBUG] Reconciled IP 10.0.1.209 on eni eni-0f37efb5e4ebecf09
2018-10-31T10:06:00Z [DEBUG] Adding ENI(eni-0f37efb5e4ebecf09)'s IPv4 address 10.0.1.134 to datastore
2018-10-31T10:06:00Z [DEBUG] IP Address Pool stats: total: 15, assigned: 12
2018-10-31T10:06:00Z [DEBUG] Reconciled IP 10.0.1.134 on eni eni-0f37efb5e4ebecf09
2018-10-31T10:06:00Z [DEBUG] Adding ENI(eni-0f37efb5e4ebecf09)'s IPv4 address 10.0.1.8 to datastore
2018-10-31T10:06:00Z [DEBUG] IP Address Pool stats: total: 15, assigned: 12
2018-10-31T10:06:00Z [DEBUG] Reconciled IP 10.0.1.8 on eni eni-0f37efb5e4ebecf09
2018-10-31T10:06:00Z [DEBUG] Successfully Reconciled ENI/IP pool
2018-10-31T10:06:06Z [DEBUG] IP pool stats: total = 15, used = 12, c.currentMaxAddrsPerENI = 5, c.maxAddrsPerENI = 5
2018-10-31T10:06:06Z [DEBUG] Start increasing IP Pool size
2018-10-31T10:06:06Z [DEBUG] Skipping increase IPPOOL due to max ENI already attached to the instance : 3
2018-10-31T10:06:11Z [DEBUG] IP pool stats: total = 15, used = 12, c.currentMaxAddrsPerENI = 5, c.maxAddrsPerENI = 5
2018-10-31T10:06:12Z [DEBUG] Start increasing IP Pool size
2018-10-31T10:06:13Z [DEBUG] Skipping increase IPPOOL due to max ENI already attached to the instance : 3
2018-10-31T10:06:17Z [INFO] Pods deleted on my node: t1
2018-10-31T10:06:18Z [DEBUG] IP pool stats: total = 15, used = 12, c.currentMaxAddrsPerENI = 5, c.maxAddrsPerENI = 5
2018-10-31T10:06:19Z [DEBUG] Start increasing IP Pool size
2018-10-31T10:06:19Z [DEBUG] Skipping increase IPPOOL due to max ENI already attached to the instance : 3
2018-10-31T10:06:25Z [DEBUG] IP pool stats: total = 15, used = 12, c.currentMaxAddrsPerENI = 5, c.maxAddrsPerENI = 5
2018-10-31T10:06:25Z [DEBUG] Start increasing IP Pool size
2018-10-31T10:06:26Z [DEBUG] Skipping increase IPPOOL due to max ENI already attached to the instance : 3
2018-10-31T10:06:33Z [DEBUG] IP pool stats: total = 15, used = 12, c.currentMaxAddrsPerENI = 5, c.maxAddrsPerENI = 5
2018-10-31T10:06:34Z [DEBUG] Start increasing IP Pool size
2018-10-31T10:06:35Z [DEBUG] Skipping increase IPPOOL due to max ENI already attached to the instance : 3
2018-10-31T10:06:35Z [DEBUG] Reconciling ENI/IP pool info...
2018-10-31T10:07:19Z [INFO] Pods deleted on my node: t2
2018-10-31T10:09:29Z [INFO] Pods deleted on my node: t3
2018-10-31T10:14:30Z [ERROR] Failed to retrieve interfaces data from instance metadata RequestError: send request failed
caused by: Get http://169.254.169.254/latest/meta-data/network/interfaces/macs/: net/http: request canceled (Client.Timeout exceeded while awaiting headers)
2018-10-31T10:14:37Z [ERROR] ip pool reconcile: Failed to get attached eni infoget attached enis: failed to retrieve interfaces data: RequestError: send request failed
caused by: Get http://169.254.169.254/latest/meta-data/network/interfaces/macs/: net/http: request canceled (Client.Timeout exceeded while awaiting headers)
2018-10-31T10:14:47Z [DEBUG] IP pool stats: total = 15, used = 12, c.currentMaxAddrsPerENI = 5, c.maxAddrsPerENI = 5
2018-10-31T10:14:49Z [DEBUG] Start increasing IP Pool size
2018-10-31T10:14:51Z [DEBUG] Skipping increase IPPOOL due to max ENI already attached to the instance : 3
2018-10-31T10:14:52Z [DEBUG] Reconciling ENI/IP pool info...
2018-10-31T10:14:59Z [INFO] Pods deleted on my node: t4
2018-10-31T10:15:43Z [DEBUG] Total number of interfaces found: 3
2018-10-31T10:15:44Z [DEBUG] Found eni mac address : 02:63:a9:60:fc:42