Comments (36)

lmickh avatar lmickh commented on May 17, 2024 1

Latest one worked. The short names are listed in azure.json properly and all hosts were able to create routes.

from acs-engine.

colemickens avatar colemickens commented on May 17, 2024 1

Heh, yes, the last one of course: "kubernetes is successfully installed, but no routes are created."

Does the apiserver actually start running, or no? That will help me guess as to why routes aren't being created.

jpoon avatar jpoon commented on May 17, 2024

cc @colemickens. I can provide private keys to help debug, but it should be fairly easy to get a repro

colemickens avatar colemickens commented on May 17, 2024

Problem is here: https://github.com/kubernetes/kubernetes/blob/master/pkg/cloudprovider/providers/azure/azure_routes.go#L90

colemickens avatar colemickens commented on May 17, 2024

We go straight to the route table that is listed in the config file, but we also check the subnet to see if it's properly configured.

Options:

  1. Skip the subnet check
  2. Take a special vnetResourceGroup that can override. (Can you reference a vnet in a different sub? If so, we also need `vnetSubscriptionId`.)
  3. Start versioning the config with a nested struct, or two-pass decode, and then start using full identifiers everywhere. This means more util functions to rip the identifiers apart since the SDK APIs address resources by individual, separate strings of the inner identifiers.

I think #1 might possibly be the right thing to do, depending on if we can support multiple subnets of machines with same route table. If we can, then subnetName is sort of meaningless. We really just care about configuring the appropriate route table in the Routes implementation. At some point we have to expect the user set things up correctly.
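To make option 3 concrete, here is a minimal sketch of the kind of util function it would need: one that rips a full identifier apart into the separate strings the SDK calls expect. `parseResourceID` and the `resourceID` struct are hypothetical names for illustration, not part of acs-engine, Kubernetes, or the Azure SDK, and the sample ID is made up.

```go
package main

import (
	"fmt"
	"strings"
)

// resourceID holds the pieces of a full Azure resource identifier
// as separate strings. Hypothetical type, for illustration only.
type resourceID struct {
	SubscriptionID string
	ResourceGroup  string
	Provider       string
	// Segments maps each resource type to its name, e.g.
	// "virtualNetworks" -> "corp-vnet", "subnets" -> "k8s-subnet".
	Segments map[string]string
}

// parseResourceID splits an identifier of the form
// /subscriptions/<sub>/resourceGroups/<rg>/providers/<ns>/<type>/<name>[/<type>/<name>...]
// Hypothetical helper; it does no validation beyond the overall layout.
func parseResourceID(id string) (resourceID, error) {
	parts := strings.Split(strings.Trim(id, "/"), "/")
	if len(parts) < 8 || len(parts)%2 != 0 {
		return resourceID{}, fmt.Errorf("malformed resource ID: %q", id)
	}
	if !strings.EqualFold(parts[0], "subscriptions") ||
		!strings.EqualFold(parts[2], "resourceGroups") ||
		!strings.EqualFold(parts[4], "providers") {
		return resourceID{}, fmt.Errorf("unexpected resource ID layout: %q", id)
	}
	r := resourceID{
		SubscriptionID: parts[1],
		ResourceGroup:  parts[3],
		Provider:       parts[5],
		Segments:       map[string]string{},
	}
	// Everything after the provider namespace is a list of type/name pairs.
	for i := 6; i+1 < len(parts); i += 2 {
		r.Segments[parts[i]] = parts[i+1]
	}
	return r, nil
}

func main() {
	// Made-up identifier for a subnet that lives in a separate "net-rg" resource group.
	id := "/subscriptions/0000/resourceGroups/net-rg/providers/Microsoft.Network/virtualNetworks/corp-vnet/subnets/k8s-subnet"
	r, err := parseResourceID(id)
	if err != nil {
		panic(err)
	}
	fmt.Println(r.ResourceGroup, r.Segments["virtualNetworks"], r.Segments["subnets"])
}
```

With something like this, the vnet's resource group comes from the identifier itself, so the code would not have to assume it matches the resource group in the config file.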

CC: @brendandburns for any thoughts.

colemickens avatar colemickens commented on May 17, 2024

This raises another question, though: where does the route table live? We need to find out whether the route table for the subnet in the existing vnet can live in either resource group, only one of them, etc. If it can live in the existing vnet's resource group, then we need to support the full identifier string for the routeTable field anyway...

mogthesprog avatar mogthesprog commented on May 17, 2024

Just came back here to report the same issue (finally got around to looking at it from #99, sorry for the delay). My two cents: full resource IDs feel like the Azure-idiomatic way of declaring resources, which makes me wonder whether your point 3 makes more sense here.

jpoon avatar jpoon commented on May 17, 2024

Thanks @colemickens. Would it not be easiest to modify the azure.json that the ACS-engine generates and puts on the master node? (ie. the workaround that I mentioned above)

I think this is what you meant by option 3. As deploying Kubernetes under a custom VNET never worked, there would be no need to start versioning this config......yet.

colemickens avatar colemickens commented on May 17, 2024

@jpoon How would that help anything? The problem is that the code assumes that all resources are in the same resource group specified in the config file.

jpoon avatar jpoon commented on May 17, 2024

In our case, everything is under the same resource group. Are there situations where people would deploy a VNET under a separate resource group?

colemickens avatar colemickens commented on May 17, 2024

I had assumed as much, but I don't actually know that as a solid fact. Is that something you would have data on?

@rgardler or @sauryadas for customers who want to deploy clusters (particularly Kubernetes) into existing vnets... are they generally putting the cluster into the existing resource group, or are the existing vnet and new cluster typically in different resource groups?

jpoon avatar jpoon commented on May 17, 2024

As custom vnets don't work at all, would it be reasonable to do a quick fix to support things in the same RG?

colemickens avatar colemickens commented on May 17, 2024

Due to the fact that the apimodel takes vnetSubnetID as the full identifier string (which I agree with), it means that for this "quick fix" we need two template functions - one to extract the vnet name, and another to extract the subnet name.

So that in the template we can write {{ subnetNameFromId .VnetSubnetID }} (very pseudocode-y).

PRs are very welcome, I'm not going to be able to get to this for a while.

colemickens avatar colemickens commented on May 17, 2024

I have a branch here that might fix this issue for deployments into a single RG. I don't have an easy way of testing it. Is anyone here willing to give it a shot? https://github.com/colemickens/acs-engine/tree/colemickens-pr-fix-custom-vnet

lmickh avatar lmickh commented on May 17, 2024

@colemickens I tested the branch with a config very similar to the custom vnet example. New single RG, new vnet, and so on. Basically just changed the vnet and RG names. The result after applying the templates was the same as before. Both vnetName and subnetName are fully qualified in /etc/kubernetes/azure.json. Not sure why that is the case.

colemickens avatar colemickens commented on May 17, 2024

I think I fixed it. Could I get you to pull, rebuild and try again? Thanks so much.

lmickh avatar lmickh commented on May 17, 2024

No dice. Error on the deployment:

At least one resource deployment operation failed. Please list deployment operations for details. Please see https://aka.ms/arm-debug for usage details. {
  "error": {
    "code": "InvalidTemplate",
    "message": "Unable to process template language expressions for resource '/subscriptions/<snip>/am23-kube01/providers/Microsoft.Compute/virtualMachines/k8s-agentpool2-14283094-0/extensions/cse0' at line '1' and column '66850'. 'The template variable 'subnetName' is not found. Please see https://aka.ms/arm-template/#variables for usage details.'"
  }
}

colemickens avatar colemickens commented on May 17, 2024

Okay, I pushed another iteration up, let me know if you try it (thanks for guinea-pigging it, I really appreciate it).

lmickh avatar lmickh commented on May 17, 2024

Hmm, something went wrong again. Both the masters and agents look like this now:

    "location": "eastus2",
    "subnetName": "subnets",
    "securityGroupName": "k8s-master-14283094-nsg",
    "vnetName": "virtualNetworks",

colemickens avatar colemickens commented on May 17, 2024

@lmickh ha, off by one. Just pushed another one if you want to try.
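The thread does not show the actual bug in the branch, but the values reported above ("subnets" and "virtualNetworks") are consistent with indexing into the split ID one position too early. The sketch below shows that effect on a made-up ID; `nameAt` is a hypothetical helper and the indices are only a guess at the shape of the mistake.

```go
package main

import (
	"fmt"
	"strings"
)

// nameAt splits a resource ID on "/" and returns the element at idx,
// or "" when idx is out of range. Hypothetical helper for illustration.
func nameAt(id string, idx int) string {
	parts := strings.Split(id, "/")
	if idx < 0 || idx >= len(parts) {
		return ""
	}
	return parts[idx]
}

func main() {
	id := "/subscriptions/0000/resourceGroups/net-rg/providers/Microsoft.Network/virtualNetworks/corp-vnet/subnets/k8s-subnet"

	// The leading "/" makes element 0 an empty string, so every name sits
	// one index later than a quick count of the visible segments suggests.
	fmt.Println("one too early:", nameAt(id, 7), nameAt(id, 9))
	fmt.Println("intended:     ", nameAt(id, 8), nameAt(id, 10))
}
```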

lmickh avatar lmickh commented on May 17, 2024

@colemickens I'm not seeing a new commit on that branch.

colemickens avatar colemickens commented on May 17, 2024

Sorry @lmickh apparently I commit --amended last night, but forgot to push. I've just pushed the change up now...

colemickens avatar colemickens commented on May 17, 2024

I'd missed your reply, @lmickh. Thanks very much for dogfooding for me and confirming.

anhowe avatar anhowe commented on May 17, 2024

Cole is fixing in #172

MoTAUser avatar MoTAUser commented on May 17, 2024

@colemickens, we still have the issue from @jpoon's initial post. We've spent several days trying to deploy an ACS cluster with k8s in a custom VNET. Even with the fix in #172, it is still not working for us.
Is there anything we have to consider, or is this issue still in progress?

colemickens avatar colemickens commented on May 17, 2024

@MoTAUser Can you elaborate? I've had other people report it's working.

ACS does not support custom vnet, only ACS-Engine does...

Hupka avatar Hupka commented on May 17, 2024

Hi @colemickens,
I am working on the same project as @MoTAUser and I try to give you a little bit more information. We are fairly new to this topic so please point out when we give you insufficient information. We are really eager to get this to work.

  • one month ago we figured out that ACS from Azure Marketplace won't work for us, that is why we only work with ACS engine since then.
  • We have a custom vnet set up within the subscription to meet our corporation's requirements. We dedicated a subnet to the ACS deployment. We temporarily modified the routing table to get kubernetes + docker successfully installed on the deployed machines to avoid proxy issues.
  • we have set up a service principal that is a contributor to our subscription
  • we deploy acs+k8s cluster in same resource group as custom VNET
  • deployment runs successfully, agents+masters are tied to the same subnet
  • kubernetes is successfully installed, but no routes are created.

Do you see anything suspicious so far?

Regards,
Adrian

Hupka avatar Hupka commented on May 17, 2024

Hey, thanks for your reply. Unfortunately we are sitting here in Germany and aren't at work anymore and can't access the azure resource. We are going to dig into this again tomorrow morning. Anything else we should look out for? Do you want to have logs of any kind?

colemickens avatar colemickens commented on May 17, 2024

The full logs of kube-controller-manager (kubectl logs --namespace=kube-system kube-controller-manager-<whatever>) and kubelet (journalctl -u kubelet) from the master will probably come in handy.

MoTAUser avatar MoTAUser commented on May 17, 2024

It seems apiserver is running normally.
Here are the log files including our acs-engine manifest (kubernetesvnet.json), and the azure deployment jsons (azuredeploy.json, azuredeploy.parameter.json).

colemickens-[kubernetes] unable to create cluster with custom vnet #120 logs.zip

colemickens avatar colemickens commented on May 17, 2024

This is now fixed by the merging of #172.

colemickens avatar colemickens commented on May 17, 2024

@MoTAUser Please file a new issue if you're still having issues after updating ACS-Engine, rebuilding and redeploying. Thanks.

rrajadevops avatar rrajadevops commented on May 17, 2024

@jpoon I created a custom ACS cluster today. Could you please share the steps for connecting to my k8s cluster with kubectl?
Regards,
Raja

rrajadevops avatar rrajadevops commented on May 17, 2024

@jpoon @colemickens @mogthesprog @lmickh can anyone please share the steps to expose the pods in our custom ACS cluster through a LoadBalancer service?

When I try to create the LB service, I get the error below.

Events:
  Type     Reason                      Age              From                Message
  ----     ------                      ----             ----                -------
  Normal   EnsuringLoadBalancer        1s (x6 over 2m)  service-controller  Ensuring load balancer
  Warning  CreatingLoadBalancerFailed  1s (x6 over 2m)  service-controller  Error creating load balancer (will retry): Failed to ensure load balancer for service ge-dashboard-test/pythonservice: azure.BearerAuthorizer#WithAuthorization: Failed to refresh the Token for request to https://management.azure.com/subscriptions/e9b70815-40be-4610-b24c-**********/resourceGroups/ge-dashboard-test/providers/Microsoft.Network/loadBalancers/ge-dashboard-test-internal?api-version=2017-03-01: StatusCode=0 -- Original Error: adal: Refresh request failed. Status Code = '400'

Regards,
Raja

jpoon avatar jpoon commented on May 17, 2024

Most likely a bad service principal: https://docs.microsoft.com/en-us/azure/aks/kubernetes-service-principal
