
garutilorenzo / k3s-oci-cluster


Deploy a Kubernetes cluster for free, using k3s and Oracle always free resources

Home Page: https://garutilorenzo.github.io/deploy-kubernetes-for-free-oracle-cloud

License: GNU General Public License v3.0

HCL 55.91% Shell 44.09%
kubernetes k3s k3s-cluster kubernetes-cluster oracle oci terraform terraform-module automation iac

k3s-oci-cluster's People

Contributors

djelibeybi, emilienmottet, garutilorenzo, kainlite, mannp, noesamaille

k3s-oci-cluster's Issues

Trying to provision a new cluster but only getting timeouts recently

Hi

I have successfully spun up clusters before, but as of now, I can only get timeouts.

It was working after the latest commits (argocd), but I cannot get it working now.

Is it the same for others at the moment, or does it work fine for you?

  install_nginx_ingress    = false
  install_certmanager      = false
  install_argocd           = false

Thanks.

module.k3s_cluster.oci_core_instance_pool.k3s_servers: Still creating... [59m30s elapsed]
module.k3s_cluster.oci_core_instance_pool.k3s_servers: Still creating... [59m40s elapsed]
module.k3s_cluster.oci_core_instance_pool.k3s_servers: Still creating... [59m50s elapsed]
module.k3s_cluster.oci_core_instance_pool.k3s_servers: Still creating... [1h0m0s elapsed]

╷
│ Error: Operation Timeout 
│ Provider version: 4.95.0, released on 2022-09-28.  
│ Service: Core Instance Pool 
│ Error Message: timeout while waiting for state to become 'STOPPED, RUNNING' (last state: 'PROVISIONING', timeout: 1h0m0s), you may need to increase the Terraform Operation timeouts for your resource to continue polling for longer 
│ Suggestion: Try increasing the timeout by referring to this document: https://docs.oracle.com/en-us/iaas/Content/API/SDKDocs/terraformtroubleshooting.htm#common_issues__timeoutwhilewaiting
│ 
│ 
│   with module.k3s_cluster.oci_core_instance_pool.k3s_servers,
│   on ../k3s-servers.tf line 1, in resource "oci_core_instance_pool" "k3s_servers":
│    1: resource "oci_core_instance_pool" "k3s_servers" {
│ 
╵

Not sure if it is of any consequence, but on destroying the failed cluster, I get:


│ Error: 409-Conflict, Invalid State Transition of NLB lifeCycle state from Updating to Updating
│ Suggestion: The resource is in a conflicted state. Please retry again or contact support for help with service: Network Load Balancer Listener
│ Documentation: https://registry.terraform.io/providers/oracle/oci/latest/docs/resources/network_load_balancer_listener 
│ API Reference: https://docs.oracle.com/iaas/api/#/en/networkloadbalancer/20200501/Listener/DeleteListener 
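
The provider's own suggestion for the 409 conflict is to retry. A minimal sketch of a retry wrapper follows, assuming the NLB eventually leaves the Updating state by itself; the function name `retry`, the attempt count, and the delay are illustrative and not part of the module.

```shell
#!/usr/bin/env bash
# retry MAX_ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Re-runs COMMAND until it succeeds or MAX_ATTEMPTS is reached.
# (Hypothetical helper, not shipped with k3s-oci-cluster.)
retry() {
  local max_attempts=$1 delay=$2
  shift 2
  local attempt=1
  until "$@"; do
    if [ "$attempt" -ge "$max_attempts" ]; then
      echo "giving up after $attempt attempts: $*" >&2
      return 1
    fi
    echo "attempt $attempt failed; retrying in ${delay}s" >&2
    sleep "$delay"
    attempt=$((attempt + 1))
  done
}

# Example (not run here): wait a minute between destroy attempts.
# retry 5 60 terraform destroy -auto-approve
```

This only helps with transient conflicts; it does not address the instance pool that stays in PROVISIONING.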

No resources found in argocd namespace after successful deploy

Hi,

What a great guide and easy to follow terraform! I got things up and going quite quickly.

However, when I tried to do "kubectl get pods -n argocd" from the root user of a server node, I got "No resources found in argocd namespace". Doing a get pods -A also showed no argocd pods.

When I searched in the output of "terraform apply" I didn't see any reference to installing ArgoCD. I also checked the documentation and the default is to install ArgoCD.

I imagine I'm missing something simple?

Error when trying to access server via SSH

I followed all the steps and Terraform ran fine. I got all the IPs in the output, but when I try to access the server via SSH I get a timeout. Port 22 is not open. Can you help, please?
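
A timeout (as opposed to "connection refused") usually means packets are being dropped before they reach sshd, for example by a security list or firewall rule. The sketch below is one way to confirm that from the client side using bash's built-in `/dev/tcp`; the function name `port_open` and the `SERVER_PUBLIC_IP` variable are assumptions made for illustration.

```shell
#!/usr/bin/env bash
# port_open HOST PORT [TIMEOUT_SECONDS]
# Succeeds if a TCP connection can be opened within the timeout.
# (Hypothetical helper; requires bash and coreutils' timeout.)
port_open() {
  local host=$1 port=$2 wait=${3:-5}
  timeout "$wait" bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$host" "$port" 2>/dev/null
}

# Example (not run here):
# if port_open "$SERVER_PUBLIC_IP" 22; then
#   echo "port 22 reachable"
# else
#   echo "port 22 blocked, filtered, or closed"
# fi
```

If the check fails from your machine but succeeds from another host inside the VCN, the ingress rules for your source address are the likely place to look.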

Invalid Parameter

Error: 400-InvalidParameter
│ Provider version: 4.64.0, released on 2022-02-16. This provider is 3 Update(s) behind to current.
│ Service: Core Instance Configuration
│ Error Message: Shape VM.Standard.A1.Flex is incompatible with image ocid1.image.oc1.eu-zurich-1.aaaaaaaag2uyozo7266bmg26j5ixvi42jhaujso2pddpsigtib6vfnqy5f6q
│ OPC request ID: 2aa8ad97f19d28e205df905f29fcd66a/7BCE3221152E1FE2E2DFB24B7AB7A7C6/11E759335E0A3EED3E431088EB821EA9
│ Suggestion: Please Update the parameter(s) in the Terraform config as per error message Shape VM.Standard.A1.Flex is incompatible with image ocid1.image.oc1.eu-zurich-1.aaaaaaaag2uyozo7266bmg26j5ixvi42jhaujso2pddpsigtib6vfnqy5f6q
│
│
│   with module.k3s_cluster.oci_core_instance_configuration.k3s_server_template,
│   on ../template.tf line 1, in resource "oci_core_instance_configuration" "k3s_server_template":
│    1: resource "oci_core_instance_configuration" "k3s_server_template" {
│
╵
╷
│ Error: 400-InvalidParameter
│ Provider version: 4.64.0, released on 2022-02-16. This provider is 3 Update(s) behind to current.
│ Service: Core Instance Configuration
│ Error Message: Shape VM.Standard.A1.Flex is incompatible with image ocid1.image.oc1.eu-zurich-1.aaaaaaaag2uyozo7266bmg26j5ixvi42jhaujso2pddpsigtib6vfnqy5f6q
│ OPC request ID: a9108b4d8a189f52d95bf5cc5e585c0d/61DF7E379327254DC8C4B4118B6B0849/410A6A61C58310F78074DBB275924380
│ Suggestion: Please Update the parameter(s) in the Terraform config as per error message Shape VM.Standard.A1.Flex is incompatible with image ocid1.image.oc1.eu-zurich-1.aaaaaaaag2uyozo7266bmg26j5ixvi42jhaujso2pddpsigtib6vfnqy5f6q
│
│

This happens when I run terraform apply.

Error: did not find a proper configuration for private key

Hi @garutilorenzo,
First of all, thanks for the module!

I'm experiencing some problems when I run terraform plan:

│ Error: can not create client, bad configuration: did not find a proper configuration for private key
│ 
│   with provider["registry.terraform.io/oracle/oci"],
│   on provider.tf line 10, in provider "oci":
│   10: provider "oci" {
│ 

Do you have any tips on this?
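
This error is raised by the OCI provider when it cannot load the API signing key, which it takes from `private_key_path` or `private_key`. A minimal sketch of a pre-flight check on the key file follows; the function name `check_private_key` and the example path are hypothetical.

```shell
#!/usr/bin/env bash
# check_private_key PATH
# Returns 0 if PATH is readable and contains a PEM private key header,
# 1 if it is not readable, 2 if the header is missing.
# (Hypothetical helper, not part of the module.)
check_private_key() {
  local key_path=$1
  if [ ! -r "$key_path" ]; then
    echo "not readable: $key_path" >&2
    return 1
  fi
  if ! grep -q -- '-----BEGIN .*PRIVATE KEY-----' "$key_path"; then
    echo "no PEM private key header in: $key_path" >&2
    return 2
  fi
}

# Example (not run here); the path is an assumption:
# check_private_key "$HOME/.oci/oci_api_key.pem"
```

A common cause is pointing the variable at the public key or at a path that does not exist on the machine running Terraform.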

Modifications needed to install Cilium

What kind of modifications are needed to successfully install Cilium on this setup?

Per documentation we need to set INSTALL_K3S_EXEC='--flannel-backend=none --disable-network-policy'.
But this seems to break the node connectivity, so I'm looking for some pointers on how communication should be configured/re-established.

Can the script be re-run after doing some changes?

Hello, thank you for this excellent repo.
I want to connect some Raspberry Pi 4 boards that I have to this cluster. After running into some problems, I found it could be easier to install Netmaker, which runs WireGuard. That works: I now have a VPN-like network between all the OCI machines, all the RPis, a VPS, my local machine, etc. on 10.20.30.0/24.
The thing is that I now want to re-run the k3s installation script on the server and on the 3 worker nodes, this time with --flannel-iface=nm-netmaker, nm-netmaker being my WireGuard interface.
So is it possible to re-run the installation with the --flannel-iface= parameter, or can some flags be changed or added while the cluster is running?

I tried adding a new /etc/systemd/system/k3s.service.d/network.conf with

[Service]
ExecStart=
ExecStart=/usr/local/bin/k3s agent --node-ip 10.222.0.106 --flannel-iface=nm-netmaker

I also tried editing /etc/systemd/system/k3s.service and /etc/rancher/k3s/k3s.yaml, but the node did not switch to the new flannel interface. Or maybe I did something wrong.
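
Two details may matter here: /etc/rancher/k3s/k3s.yaml is the kubeconfig, while k3s reads extra options from /etc/rancher/k3s/config.yaml, and nodes installed as agents normally run under the k3s-agent service rather than k3s. The sketch below writes the two flags from the drop-in above into a config file; the function name `write_k3s_config` is hypothetical, and whether flannel re-binds cleanly on an existing node is not confirmed here.

```shell
#!/usr/bin/env bash
# write_k3s_config DIR IFACE NODE_IP
# Writes DIR/config.yaml with the flannel interface and node IP.
# Note: this overwrites any existing config.yaml in DIR.
# (Hypothetical helper, not part of the module.)
write_k3s_config() {
  local dir=$1 iface=$2 node_ip=$3
  mkdir -p "$dir"
  cat > "$dir/config.yaml" <<EOF
flannel-iface: "$iface"
node-ip: "$node_ip"
EOF
}

# Example (not run here), as root on an agent node:
# write_k3s_config /etc/rancher/k3s nm-netmaker 10.222.0.106
# systemctl restart k3s-agent   # use "k3s" on server nodes
```

Keys in config.yaml are the flag names without the leading dashes, so the same approach should cover other flags as well.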

Not working from last week

Hi, I have an issue with your repo: since last week it just doesn't work. After the errors below, it keeps creating workers over and over, even if I delete them manually. There are only 2 workers and the extra node, but no server node. I cleaned up my whole account manually and upgraded the modules to the newest version, and it still does not work. Do you have any idea what happened?

│ Error: 409-IdcsConversionError, Post request failed{"schemas":["urn:ietf:params:scim:api:messages:2.0:Error","urn:ietf:params:scim:api:oracle:idcs:extension:messages:Error"],"detail":"DynamicResourceGroup with the same displayName already exists.","status":"409","urn:ietf:params:scim:api:oracle:idcs:extension:messages:Error":{"messageId":"error.common.exception.duplicateResource"}}
│ Suggestion: The resource is in a conflicted state. Please retry again or contact support for help with service: Identity Dynamic Group
│ Documentation: https://registry.terraform.io/providers/oracle/oci/latest/docs/resources/identity_dynamic_group
│ API Reference: https://docs.oracle.com/iaas/api/#/en/identity/20160918/DynamicGroup/CreateDynamicGroup
│ Request Target: POST https://identity.eu-paris-1.oci.oraclecloud.com/20160918/dynamicGroups
│ Provider version: 4.103.0, released on 2023-01-11. This provider is 38 Update(s) behind to current.
│ Service: Identity Dynamic Group
│ Operation Name: CreateDynamicGroup
│ OPC request ID: 94a8b919f099a2f8abbb146ccb145389/ABCC013EBC9238D1CCF8EC767E4CB3CB/BE84F60F504E63FFDDB68CD28FA4E005


│ with module.k3s_cluster.oci_identity_dynamic_group.compute_dynamic_group,
│ on ../iam.tf line 1, in resource "oci_identity_dynamic_group" "compute_dynamic_group":
│ 1: resource "oci_identity_dynamic_group" "compute_dynamic_group" {

Any ideas? I tried terraform destroy many times and it said there was nothing to destroy, and it still does not work. I cleaned up everything, including the network and the LB, and I always get the same problem.
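
The error says a dynamic group with the same display name already exists while Terraform's state has no record of it. One option is to delete the leftover group in the console; another is to import it into the state. The sketch below wraps `terraform import` for the resource address shown in the error, assuming the group's OCID has been looked up first. The function name `import_dynamic_group` is hypothetical, and whether import behaves the same in tenancies that use identity domains has not been verified here.

```shell
#!/usr/bin/env bash
# import_dynamic_group OCID
# Imports an existing dynamic group into the module's state.
# Rejects values that do not look like a dynamic group OCID.
# (Hypothetical helper, not part of the module.)
import_dynamic_group() {
  local ocid=$1
  case "$ocid" in
    ocid1.dynamicgroup.*) ;;
    *) echo "expected a dynamic group OCID, got: $ocid" >&2; return 1 ;;
  esac
  terraform import \
    'module.k3s_cluster.oci_identity_dynamic_group.compute_dynamic_group' "$ocid"
}

# Example (not run here); replace the placeholder with the real OCID:
# import_dynamic_group "$DYNAMIC_GROUP_OCID"
```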

403 Forbidden access

Hi Lorenzo,

I've been following your steps and tested the mariadb/nginx/wordpress deployments. However, every time I attempt to access the wordpress service through the public LB IP, I get HTTP error 403.

Steps to reproduce the issue:

  1. git clone https://github.com/garutilorenzo/k3s-oci-cluster.git
  2. add terraform.tfvars file under 'example' dir
  3. run:
    $ tfinit && tfplan -out tfplan && tf apply -auto-approve "tfplan"
  4. exec to one of the server nodes
  5. run "kubectl apply" command on all deployments:
$ sudo kubectl apply -f https://raw.githubusercontent.com/garutilorenzo/k3s-oci-cluster/master/deployments/mariadb/all-resources.yml
$ sudo kubectl apply -f https://raw.githubusercontent.com/garutilorenzo/k3s-oci-cluster/master/deployments/nginx/all-resources.yml
$ sudo kubectl apply -f https://raw.githubusercontent.com/garutilorenzo/k3s-oci-cluster/master/deployments/wordpress/all-resources.yml

$ sudo kubectl get pods
NAME                         READY   STATUS    RESTARTS   AGE
mariadb-fff96ccbd-9zvvd      1/1     Running   0          29m
nginx-86756dfb74-pzxbs       1/1     Running   0          29m
wordpress-7df665c44b-5tpng   1/1     Running   0          29m
  6. access the public LB IP:
$ curl http://$PUBLIC_LB_IP/wp-admin/install.php -v

*   Trying $PUBLIC_LB_IP:80...
* Connected to $PUBLIC_LB_IP ($PUBLIC_LB_IP) port 80 (#0)

> GET /wp-admin/install.php HTTP/1.1
> Host: $PUBLIC_LB_IP
> User-Agent: curl/7.81.0
> Accept: */*
>
* Mark bundle as not supporting multiuse
< HTTP/1.1 403 Forbidden
< Date: Sat, 27 Aug 2022 00:22:53 GMT
< Content-Type: text/html; charset=UTF-8
< Transfer-Encoding: chunked
< Connection: keep-alive
< Vary: Accept-Encoding
< X-Powered-By: PHP/7.4.30
< Expires: Wed, 11 Jan 1984 05:00:00 GMT
< Cache-Control: no-cache, must-revalidate, max-age=0
< X-Redirect-By: WordPress
< Location: http://$PUBLIC_LB_IP/wp-admin/install.php
<

* Connection #0 to host $PUBLIC_LB_IP left intact

Additional info:
variable "expose_kubeapi" in vars.tf file was set to True.
Manually added access for both 80/443 ports to my public ip CIDR in the Default security list.

Am I missing something?

I would like to also thank you for this great project and the amazing work you did with it!

2 AMD Workers

Can 2 AMD Worker Nodes be added to the cluster using the free AMD micro VMs?

Thanks

Thanks for creating :) a couple of queries ...

You mentioned in the docs that after the 30 days, paid items will be closed down and stopped.

I wondered if all the resources you have used are free, so would it not remain free after the 30 days :)

Also, I wondered on any direction to allow for letsencrypt certs for the external LB, rather than creating one ourselves :)

Thanks again.

Alternative operating system support?

First, thanks for creating such a useful Terraform module. It works great.

Are you interested in supporting a choice of operating system for the instances? I've created a downstream fork that lets you choose either Ubuntu (which remains the default) or Oracle Linux, and also lets users change the shape and resource allocation of the instances.

I'm happy to submit the feature set as a PR if you want. If not, I'll probably remove the Ubuntu stuff and just maintain an Oracle Linux based fork downstream.

3 masters for ha

@garutilorenzo first of all, good work! Why not use 3 masters for real HA? Rancher suggests this, and behind the LB you can have real HA. Am I wrong? PS: greetings from a fellow countryman :-)

404-NotAuthorizedOrNotFound

Hi @garutilorenzo.

Thanks for your work on this repo.

  1. I am trying to deploy the cluster but I get the following message, 2 times for the 2 backends,
>  Error: 404-NotAuthorizedOrNotFound, Unknown resource Entity of type Backend with key ocid1.instance.oc1.eu-frankfurt-1.xxxxxxnl6dbt2q2ffa.6443 not found
Suggestion: Either the resource has been deleted or service Network Load Balancer Backend need policy to access this resource. Policy reference: https://docs.oracle.com/en-us/iaas/Content/Identity/Reference/policyreference.htm
Provider version: 4.75.0, released on 2022-05-11.
│ Service: Network Load Balancer Backend
│ Operation Name: GetBackend
 with module.k3s_cluster.oci_network_load_balancer_backend.k3s_kube_api_backend[1],
│   on ../k3slb.tf line 36, in resource "oci_network_load_balancer_backend" "k3s_kube_api_backend":
│   36: resource "oci_network_load_balancer_backend" "k3s_kube_api_backend" {

Strangely enough I can see in the web console that the network load balancer is created and has 1 listener and 2 backends.

  2. I don't know if it's relevant, but when I ran terraform init the first time I got this message:
>Initializing the backend...

Initializing provider plugins...
- Reusing previous version of hashicorp/template from the dependency lock file
- Reusing previous version of oracle/oci from the dependency lock file
- Reusing previous version of hashicorp/oci from the dependency lock file
- Using previously-installed hashicorp/template v2.2.0
- Using previously-installed oracle/oci v4.74.0
- Using previously-installed hashicorp/oci v4.75.0

╷
│ Warning: Additional provider information from registry
│
│ The remote registry returned warnings for registry.terraform.io/hashicorp/oci:
│ - For users on Terraform 0.13 or greater, this provider has moved to oracle/oci. Please update your source in
│ required_providers.
╵

Terraform has been successfully initialized!

You may now begin working with Terraform. Try running "terraform plan" to see
any changes that are required for your infrastructure. All Terraform commands
should now work.

If you ever set or change modules or backend configuration for Terraform,
rerun this command to reinitialize your working directory. If you forget, other
commands will detect it and remind you to do so if necessary.

[Terraform on WSL2 Ubuntu 20.04]
Terraform v1.1.9
on linux_amd64

  • provider registry.terraform.io/hashicorp/oci v4.75.0
  • provider registry.terraform.io/hashicorp/template v2.2.0

Thanks in advance for any pointers that you may have.

Use Traefik 2 instead of nginx as ingress

I wondered whether setting install_nginx_ingress = false would still install Traefik 1 as the default ingress.

If so, is there a way to turn this off, so I can install Traefik 2?

Thanks.

Edit: Looking at the cluster, it looks like Traefik 2.6 is installed, but not in its own namespace.

Would there be any chance of having an option for, say, nginx ingress (Yes/No/None), with None applying --disable traefik without installing the nginx ingress?

%{ if install_nginx_ingress } 
disable_traefik="--disable traefik"
%{ endif }

That way, we can apply our own traefik 2 configs?
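
The Yes/No/None idea can be sketched as a small mapping from an ingress mode to k3s server flags. This is only an illustration of the proposal: the function name `k3s_ingress_flags` and the mode names are hypothetical and do not exist in the module.

```shell
#!/usr/bin/env bash
# k3s_ingress_flags MODE
# MODE is one of: traefik (keep the bundled Traefik),
# nginx (disable Traefik, nginx ingress installed separately),
# none (disable Traefik, install nothing).
# Prints the extra k3s server flags for that mode.
# (Hypothetical helper illustrating the request above.)
k3s_ingress_flags() {
  case "$1" in
    traefik) echo "" ;;
    nginx|none) echo "--disable traefik" ;;
    *) echo "unknown ingress mode: $1" >&2; return 1 ;;
  esac
}

# Example (not run here):
# disable_traefik=$(k3s_ingress_flags none)
```

With a mode like `none`, the bundled Traefik is skipped and a custom Traefik 2 deployment can be applied afterwards.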

Access service from outside using Nginx

Hello, I'm trying to access a service by URL. The ingress already exists but points to the internal LB IP.

Can you please help me and tell how can I access my service from outside?

PS: I saw the nginx config for WP and tried to use it for my service, but without success.

How to access Kubernetes API (port 6443) from the outside?

Ciao Lorenzo

First of all, thanks for the repo!
After fiddling with all the different parameters Terraform and OCI need, I was able to start a K3S cluster.
However, I would like to access the cluster from my PC with a management GUI like Lens or through my local kubectl.
Since I'm a bit of a network noob: how would I set up access from the outside to port 6443 on the servers?

  • add backend sets to the public LB (how?)
  • if so, create a route to the internal LB port 6443 or round robin to the 2 servers?

Thanks for any pointers!
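
Another report in this tracker mentions an `expose_kubeapi` variable in vars.tf, which may be the intended switch for this. On the client side, the kubeconfig that k3s writes on a server node points at https://127.0.0.1:6443, so a copy used from a PC needs its server address changed. A minimal sketch follows, assuming port 6443 is reachable through the public LB and that the API certificate covers that address (neither is confirmed here); the function name `make_remote_kubeconfig` is hypothetical.

```shell
#!/usr/bin/env bash
# make_remote_kubeconfig SRC DEST HOST
# Copies a k3s kubeconfig and points its server URL at HOST:6443.
# (Hypothetical helper, not part of the module.)
make_remote_kubeconfig() {
  local src=$1 dest=$2 host=$3
  sed "s#https://127\.0\.0\.1:6443#https://${host}:6443#" "$src" > "$dest"
  chmod 600 "$dest"
}

# Example (not run here), after copying k3s.yaml from a server node:
# make_remote_kubeconfig ./k3s.yaml ~/.kube/k3s-oci.yaml "$PUBLIC_LB_IP"
# KUBECONFIG=~/.kube/k3s-oci.yaml kubectl get nodes
```

If kubectl then reports a certificate error for the LB address, the server certificate likely lacks that address as a SAN.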

Invalid availability domain for servers & workers

Hi!
I seem to have problems when running terraform apply. When it tries to create the instance pools it fails both on the server one and the worker one.

I checked on the Oracle Cloud dashboard and if I try to create an AMD compute instance on any of my 3 availability domains it doesn't let me, I think they are full or something (?)

Here are the logs that fail. I'm on region "uk-london-1" with only Always Free resources enabled.
I hope I'm not leaking anything important. Thanks in advance.

╷
│ Error: 400-InvalidParameter, Invalid availability domain found in placement configuration, ocid1.availabilitydomain.oc1..aaaaaaaac6kgqnl5eberztbdm2ufhpxohk7cuelajmafpqa3m6rpri4tc2yq 
│ Suggestion: Please update the parameter(s) in the Terraform config as per error message Invalid availability domain found in placement configuration, ocid1.availabilitydomain.oc1..aaaaaaaac6kgqnl5eberztbdm2ufhpxohk7cuelajmafpqa3m6rpri4tc2yq
│ Documentation: https://registry.terraform.io/providers/oracle/oci/latest/docs/resources/core_instance_pool 
│ API Reference: https://docs.oracle.com/iaas/api/#/en/iaas/20160918/InstancePool/CreateInstancePool 
│ Request Target: POST https://iaas.uk-london-1.oraclecloud.com/20160918/instancePools 
│ Provider version: 4.83.0, released on 2022-07-05.  
│ Service: Core Instance Pool 
│ Operation Name: CreateInstancePool 
│ OPC request ID: 4fc4e338c31e815a7263de918a8a3d56/8CF034C65BAD9CF31734ADAE390066E0/5A105DFF42DA705E8F29525778883A57 
│ 
│ 
│   with module.k3s_cluster.oci_core_instance_pool.k3s_servers,
│   on ../k3s-servers.tf line 1, in resource "oci_core_instance_pool" "k3s_servers":
│    1: resource "oci_core_instance_pool" "k3s_servers" {
│ 
╵
╷
│ Error: 400-InvalidParameter, Invalid availability domain found in placement configuration, ocid1.availabilitydomain.oc1..aaaaaaaac6kgqnl5eberztbdm2ufhpxohk7cuelajmafpqa3m6rpri4tc2yq 
│ Suggestion: Please update the parameter(s) in the Terraform config as per error message Invalid availability domain found in placement configuration, ocid1.availabilitydomain.oc1..aaaaaaaac6kgqnl5eberztbdm2ufhpxohk7cuelajmafpqa3m6rpri4tc2yq
│ Documentation: https://registry.terraform.io/providers/oracle/oci/latest/docs/resources/core_instance_pool 
│ API Reference: https://docs.oracle.com/iaas/api/#/en/iaas/20160918/InstancePool/CreateInstancePool 
│ Request Target: POST https://iaas.uk-london-1.oraclecloud.com/20160918/instancePools 
│ Provider version: 4.83.0, released on 2022-07-05.  
│ Service: Core Instance Pool 
│ Operation Name: CreateInstancePool 
│ OPC request ID: 5a425ff03abc57b202ab03fd050754cc/6E03C941B2802124FDD2022B822EBB7B/F4D90D0F8F304FB098696D3B6072D9AF 
│ 
│ 
│   with module.k3s_cluster.oci_core_instance_pool.k3s_workers,
│   on ../k3s-workers.tf line 1, in resource "oci_core_instance_pool" "k3s_workers":
│    1: resource "oci_core_instance_pool" "k3s_workers" {
│ 
╵
