Comments (11)
Hey @brandoconnor - I am 99% positive it's not related to the module either but the severe lack of documentation/support for EKS right now is disappointing and made me resort to posting here. I also find it really hard to believe that I'm the only one who has run into this with EKS. Would appreciate anything at this point.
Edit: Maybe you or someone here can shed some light on where the 172.20.x.x addresses are coming from? The issue seems to be rooted there. Even if userdata.sh used 10.100.x.x, there would still be a problem. Is this a cluster/pod CIDR auto-assigned by AWS? Given that the pods/ENIs pull from the same subnet as the nodes, shouldn't the ClusterIPs as well?
from terraform-aws-eks.
@brandoconnor I have some things I've run into that I will hopefully have time to PR in the next few days - a lot of my customizations have either been custom AMIs or running ansible after the fact and haven't really taken a step back yet. Also, at this point it looks like my docker customizations did not help so 🤷
So as of right now, thank you to all the contributors for at least making creating a cluster painless!
Hey @hobbsh. Thanks for the detailed issue 🤘
I haven't tried this myself, nor do I suspect the module here is the root cause. I would bet this is beyond the scope of any AWS docs as well.
@ozbillwang and @max-rocket-internet - have either of you done DNS within your EKS cluster yet?
First of all, probably nothing to do with this module. Second, if kube-dns is not working, then almost nothing will work.
I tried a test using busybox and get inconsistent results:
$ kubectl run --rm -i --tty --image=busybox temp --restart=Never -- sh
If you don't see a command prompt, try pressing enter.
/ # nslookup kubernetes.default.svc.cluster.local
Server: 172.20.0.10
Address: 172.20.0.10:53
*** Can't find kubernetes.default.svc.cluster.local: No answer
/ #
/ # nslookup kubernetes.default.svc.cluster.local
Server: 172.20.0.10
Address: 172.20.0.10:53
Non-authoritative answer:
Name: kubernetes.default.svc.cluster.local
Address: 172.20.0.1
^C
/ # nslookup ingress1-nginx-ingress-default-backend
Server: 172.20.0.10
Address: 172.20.0.10:53
** server can't find ingress1-nginx-ingress-default-backend: NXDOMAIN
^C
/ # nslookup ingress1-nginx-ingress-default-backend
Server: 172.20.0.10
Address: 172.20.0.10:53
** server can't find ingress1-nginx-ingress-default-backend: NXDOMAIN
^C
/ # nslookup ingress1-nginx-ingress-defa^C
/ # nslookup ingress1-nginx-ingress-default-backend.default.svc.cluster.local
Server: 172.20.0.10
Address: 172.20.0.10:53
Name: ingress1-nginx-ingress-default-backend.default.svc.cluster.local
Address: 172.20.126.10
See it resolves sometimes but not others? Strange. I don't know why.
Anyway, with an ubuntu image, it works fine:
$ kubectl run --rm -i --tty --image=ubuntu temp --restart=Never -- bash
If you don't see a command prompt, try pressing enter.
root@temp:/# getent hosts kubernetes
172.20.0.1 kubernetes.default.svc.cluster.local
root@temp:/# getent hosts ingress1-nginx-ingress-default-backend
172.20.126.10 ingress1-nginx-ingress-default-backend.default.svc.cluster.local
root@temp:/#
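To put a number on "resolves sometimes but not others", a small retry loop can help. This is only a sketch: `count_lookups` is a hypothetical helper, not something shipped in either image, and it assumes the lookup command exits non-zero on failure, which is worth checking for the image in use.

```shell
#!/bin/sh
# Run a lookup command repeatedly and report how many attempts succeed.
# Usage: count_lookups <attempts> <command> [args...]
# Assumption: the command exits non-zero when the lookup fails.
count_lookups() {
  attempts=$1
  shift
  ok=0
  i=0
  while [ "$i" -lt "$attempts" ]; do
    # Discard the command's output; only its exit status is counted.
    if "$@" >/dev/null 2>&1; then
      ok=$((ok + 1))
    fi
    i=$((i + 1))
  done
  echo "$ok/$attempts succeeded"
}
```

Inside the test pod this could be called as `count_lookups 20 getent hosts kubernetes`, or with `nslookup` in place of `getent hosts` to compare the two resolvers side by side.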
@max-rocket-internet First of all thanks for replying and sorry to taint this repo with an unrelated thread - I appreciate the support! This all started with my grafana pod (prometheus-operator) not being able to find the datasource via the prometheus cluster hostname.
I woke up this morning and of course now it's working. However something is still up with busybox. Now it's even more frustrating that there's not a concrete answer for this...
$ kubectl exec -it busybox -- nslookup kubernetes.default
Server: 172.20.0.10
Address: 172.20.0.10:53
** server can't find kubernetes.default: NXDOMAIN
*** Can't find kubernetes.default: No answer
root@kube-prometheus-grafana-7c44bfbb84-5ghc4:/# nslookup kube-prometheus.monitoring
Server: 172.20.0.10
Address: 172.20.0.10#53
Name: kube-prometheus.monitoring.svc.cluster.local
Address: 172.20.49.204
Edit:
I take that back about nothing being changed - I had changed the docker.service file on all the nodes to use the cluster DNS - rebuilding again to verify:
[Unit]
Description=Docker Application Container Engine
Documentation=http://docs.docker.com
After=network.target docker.socket
Wants=docker.socket
[Service]
Type=notify
Environment=GOTRACEBACK=crash
ExecReload=/bin/kill -s HUP $MAINPID
Delegate=yes
KillMode=process
ExecStart=/usr/bin/dockerd \
--dns 172.20.0.10 \
--dns-search default.svc.cluster.local --dns-search svc.cluster.local --dns-search staging.thinklumo.com \
--dns-opt ndots:3 --dns-opt timeout:2 --dns-opt attempts:2
TasksMax=infinity
LimitNOFILE=1048576
LimitNPROC=1048576
LimitCORE=infinity
TimeoutStartSec=1min
# restart the docker process if it exits prematurely
Restart=on-failure
StartLimitBurst=3
StartLimitInterval=60s
[Install]
WantedBy=multi-user.target
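For anyone wondering what the `--dns-search` and `ndots` options change, here is a rough sketch of the order in which a glibc-style resolver tries names, following the rules described in resolv.conf(5). `search_candidates` is a hypothetical helper written for illustration. Other resolvers, such as the one built into busybox `nslookup`, may not follow the same rules, which could be part of why the two images behave differently.

```shell
#!/bin/sh
# Print the candidate names a resolv.conf-style resolver would try, in order.
# Usage: search_candidates <name> <ndots> <search-domain>...
search_candidates() {
  name=$1
  ndots=$2
  shift 2

  # A trailing dot marks the name as fully qualified: no search list is applied.
  case $name in
    *.) echo "$name"; return 0 ;;
  esac

  # Count the dots in the name.
  only_dots=$(printf '%s' "$name" | tr -cd '.')
  dots=${#only_dots}

  # With at least ndots dots, the name is tried as-is before the search list.
  if [ "$dots" -ge "$ndots" ]; then
    echo "$name."
  fi
  for domain in "$@"; do
    echo "$name.$domain."
  done
  # With fewer than ndots dots, the name is tried as-is only after the search list.
  if [ "$dots" -lt "$ndots" ]; then
    echo "$name."
  fi
}
```

Calling `search_candidates kube-prometheus.monitoring 3 default.svc.cluster.local svc.cluster.local staging.thinklumo.com` lists the three search-domain expansions first and the bare name last. The second candidate, `kube-prometheus.monitoring.svc.cluster.local.`, is the name that resolved in the grafana pod above.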
Sure enough, it works now. Can't explain it, but I'll get off your lawn now. Thanks again.
Thanks for the assist @max-rocket-internet. I think we all lament the need to get support through channels like this... All I ask is we make the most of the situation and share here what was done to resolve the problems when they're resolved, which you've done in spades @hobbsh. Thanks for the wrap up!
@hobbsh - Can the module do anything to lessen the pain/confusion? Should we look to do more in userdata to alter the configurations as you did?
Just a follow-up - I think I was hitting this issue: moby/libnetwork#2187.
FWIW, the DNS service at 172.20.0.10 stems from this file: /etc/eks/bootstrap.sh
.
.
.
INTERNAL_IP=$(curl -s http://169.254.169.254/latest/meta-data/local-ipv4)
INSTANCE_TYPE=$(curl -s http://169.254.169.254/latest/meta-data/instance-type)
DNS_CLUSTER_IP=10.100.0.10
if [[ $INTERNAL_IP == 10.* ]] ; then
DNS_CLUSTER_IP=172.20.0.10;
fi
.
.
.
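The selection logic in that excerpt can be tried out without a node. The sketch below mirrors it as a function that takes the IP as an argument instead of reading it from the instance metadata endpoint. `dns_cluster_ip` is a hypothetical name, and the prefix match is rewritten as a POSIX `case` so it runs under plain `sh`. The apparent intent, which is an assumption on my part, is to pick a service range that does not overlap a 10.x VPC.

```shell
#!/bin/sh
# Mirror of the DNS_CLUSTER_IP selection quoted from /etc/eks/bootstrap.sh.
# Usage: dns_cluster_ip <node-internal-ip>
dns_cluster_ip() {
  case $1 in
    # Node addresses starting with "10." get the 172.20.0.10 DNS service IP.
    10.*) echo 172.20.0.10 ;;
    # Every other node address gets the default, 10.100.0.10.
    *)    echo 10.100.0.10 ;;
  esac
}
```

For a node at 10.0.1.5 this prints 172.20.0.10, which matches the server address in the nslookup output earlier in the thread. A node at 192.168.1.5 would get 10.100.0.10.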