kurokobo / awx-on-k3s

An example implementation of AWX on single node K3s using AWX Operator, with easy-to-use simplified configuration with ownership of data and passwords.

License: MIT License

awx-on-k3s's Introduction

📚 AWX on Single Node K3s

An example implementation of AWX on single node K3s using AWX Operator, with easy-to-use simplified configuration with ownership of data and passwords.

  • Accessible over HTTPS from remote host
  • All data will be stored under /data
  • Fixed (configurable) passwords for AWX and PostgreSQL
  • Fixed (configurable) versions of AWX

To view the guide for a specific version of AWX Operator, switch from the main branch to the desired tag.

Environment

  • Tested on:
    • CentOS Stream 9 (Minimal)
    • K3s v1.29.5+k3s1
  • Products that will be deployed:
    • AWX Operator 2.19.0
    • AWX 24.6.0
    • PostgreSQL 15

References

Requirements

  • Computing resources
    • 2 CPUs minimum.
      • Both AMD64 (x86_64) with x86-64-v2 support, and ARM64 (aarch64) are supported.
    • 4 GiB RAM minimum.
    • It's recommended to add more CPUs and RAM (such as 4 CPUs and 8 GiB RAM or more) to avoid performance and job scheduling issues.
    • By default, the files in this repository are configured to ignore the resource requirements specified by AWX Operator.
  • Storage resources
    • At least 10 GiB for /var/lib/rancher and 10 GiB for /data are safe for a fresh install.
      • /var/lib/rancher will be created and consumed by K3s and related data like container images and overlayfs.
      • /data will be created in this guide and used to store AWX-related databases and files.
    • Both will grow over time, and actual consumption depends heavily on your environment and use case, so monitor usage and add capacity as required.
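
A quick way to compare a host against the minimums above is sketched below. The function names (meets_minimum, cpu_count, mem_gib, free_gib) are hypothetical helpers and not part of this repository; the sketch assumes a Linux host with nproc, awk, and df available.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for comparing a host against the minimums listed above.

# Succeeds when an integer value ($1) is at least the required minimum ($2).
meets_minimum() {
  [ "$1" -ge "$2" ]
}

# Number of CPUs available to the current process.
cpu_count() {
  nproc
}

# Total memory in whole GiB, rounded down, read from /proc/meminfo (Linux only).
mem_gib() {
  awk '/^MemTotal:/ { print int($2 / 1024 / 1024) }' /proc/meminfo
}

# Free space in whole GiB on the filesystem that holds the given path.
free_gib() {
  df -Pk "$1" | awk 'NR == 2 { print int($4 / 1024 / 1024) }'
}
```

For example, meets_minimum "$(cpu_count)" 2 || echo "fewer than 2 CPUs". MemTotal is usually a little below the installed amount, so a 4 GiB host may report 3 here; treat the memory figure as a rough indicator. Because /var/lib/rancher and /data may not exist before installation, free_gib can be pointed at the parent mount point instead.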

Deployment Instruction

✅ Prepare CentOS Stream 9 host

Disable firewalld and nm-cloud-setup if enabled. This is recommended by K3s.

# Disable firewalld
sudo systemctl disable firewalld --now

# Disable nm-cloud-setup if exists and enabled
sudo systemctl disable nm-cloud-setup.service nm-cloud-setup.timer
sudo reboot

Install the required packages to deploy AWX Operator and AWX.

sudo dnf install -y git curl

✅ Install K3s

Install a specific version of K3s with --write-kubeconfig-mode 644 to make the config file (/etc/rancher/k3s/k3s.yaml) readable by non-root users.

curl -sfL https://get.k3s.io | INSTALL_K3S_VERSION=v1.29.5+k3s1 sh -s - --write-kubeconfig-mode 644
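
To confirm that the kubeconfig ended up readable by non-root users, the file mode can be inspected. This is a minimal sketch, assuming GNU stat as shipped on CentOS Stream; file_mode and world_readable are hypothetical helper names.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for checking the kubeconfig permissions.

# Prints the permission bits of a file in octal, e.g. 644 (GNU stat).
file_mode() {
  stat -c '%a' "$1"
}

# Succeeds when the "others" digit of the mode has the read bit set (4, 5, 6 or 7).
world_readable() {
  case "$(file_mode "$1")" in
    *[4567]) return 0 ;;
    *) return 1 ;;
  esac
}
```

For example, world_readable /etc/rancher/k3s/k3s.yaml && kubectl get nodes runs kubectl as a non-root user only when the file is readable.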

✅ Install AWX Operator

Clone this repository and change directory.

If you want to use files suitable for a specific version of AWX Operator, refer to the tags in this repository and specify the desired tag in git checkout. For AWX Operator 0.13.0 or earlier in particular, refer to Tips: Deploy older version of AWX Operator.

cd ~
git clone https://github.com/kurokobo/awx-on-k3s.git
cd awx-on-k3s
git checkout 2.19.0

Then invoke kubectl apply -k operator to deploy AWX Operator.

kubectl apply -k operator

The AWX Operator will be deployed to the namespace awx.

$ kubectl -n awx get all
NAME                                                   READY   STATUS    RESTARTS   AGE
pod/awx-operator-controller-manager-68d787cfbd-kjfg7   2/2     Running   0          16s

NAME                                                      TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)    AGE
service/awx-operator-controller-manager-metrics-service   ClusterIP   10.43.150.245   <none>        8443/TCP   16s

NAME                                              READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/awx-operator-controller-manager   1/1     1            1           16s

NAME                                                         DESIRED   CURRENT   READY   AGE
replicaset.apps/awx-operator-controller-manager-68d787cfbd   1         1         1       16s

✅ Prepare required files to deploy AWX

Generate a self-signed certificate. Note that an IP address can't be specified. If you want to use a certificate from a public ACME CA such as Let's Encrypt or ZeroSSL instead of a self-signed certificate, follow the guide Use SSL Certificate from Public ACME CA first and come back to this step when done.

AWX_HOST="awx.example.com"
openssl req -x509 -nodes -days 3650 -newkey rsa:2048 -out ./base/tls.crt -keyout ./base/tls.key -subj "/CN=${AWX_HOST}/O=${AWX_HOST}" -addext "subjectAltName = DNS:${AWX_HOST}"
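
After generating the certificate, it can be worth confirming that the host name really appears as a subjectAltName. The helper below is a sketch under the assumption that openssl is installed; cert_has_dns_san is a hypothetical name, and the check is a plain substring match on the text dump rather than a strict parser.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds when the certificate file ($1) lists the
# host name ($2) as a DNS entry in its text dump (substring match only).
cert_has_dns_san() {
  openssl x509 -in "$1" -noout -text | grep -qF "DNS:$2"
}
```

For example, cert_has_dns_san ./base/tls.crt "${AWX_HOST}" || echo "SAN missing" reports a certificate that was generated without the expected host name.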

Modify hostname in base/awx.yaml.

...
spec:
  ...
  ingress_type: ingress
  ingress_hosts:
    - hostname: awx.example.com   👈👈👈
      tls_secret: awx-secret-tls
...

Modify the two password entries in base/kustomization.yaml. Note that the password under awx-postgres-configuration should not contain single or double quotes (', ") or backslashes (\) to avoid any issues during deployment, backup or restoration.

...
  - name: awx-postgres-configuration
    type: Opaque
    literals:
      - host=awx-postgres-15
      - port=5432
      - database=awx
      - username=awx
      - password=Ansible123!   👈👈👈
      - type=managed

  - name: awx-admin-password
    type: Opaque
    literals:
      - password=Ansible123!   👈👈👈
...
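
The character restriction on the PostgreSQL password can be checked before editing the file. The function below is an illustrative sketch; postgres_password_is_safe is a hypothetical name, and it only tests for the three characters named above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds when the password ($1) contains no single quote,
# double quote, or backslash.
postgres_password_is_safe() {
  case "$1" in
    *\'*|*\"*|*\\*) return 1 ;;
    *) return 0 ;;
  esac
}
```

For example, postgres_password_is_safe 'Ansible123!' succeeds, while a password containing a backslash or a quote is rejected.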

Prepare directories for Persistent Volumes defined in base/pv.yaml. These directories will be used to store your databases and project files. Note that the sizes of the PVs and PVCs are specified in some of the files in this repository, but since their backends are hostPath, these values act only as labels and impose no actual capacity limit.

sudo mkdir -p /data/postgres-15
sudo mkdir -p /data/projects
sudo chown 1000:0 /data/projects
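
The ownership set above can be verified with a small check. This is a sketch assuming GNU stat; owner_of and dir_owned_by are hypothetical helper names.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for verifying directory ownership.

# Prints the numeric owner and group of a path as UID:GID (GNU stat).
owner_of() {
  stat -c '%u:%g' "$1"
}

# Succeeds when the path ($1) is a directory owned by the expected UID:GID ($2).
dir_owned_by() {
  [ -d "$1" ] && [ "$(owner_of "$1")" = "$2" ]
}
```

For example, dir_owned_by /data/projects 1000:0 || echo "unexpected owner: $(owner_of /data/projects)".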

✅ Deploy AWX

Deploy AWX. This takes a few minutes to complete.

kubectl apply -k base

To monitor the progress of the deployment, check the logs of deployments/awx-operator-controller-manager:

kubectl -n awx logs -f deployments/awx-operator-controller-manager

If the deployment completes successfully, the logs end with:

$ kubectl -n awx logs -f deployments/awx-operator-controller-manager
...
----- Ansible Task Status Event StdOut (awx.ansible.com/v1beta1, Kind=AWX, awx/awx) -----
PLAY RECAP *********************************************************************
localhost                  : ok=90   changed=0    unreachable=0    failed=0    skipped=82   rescued=0    ignored=1
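
Rather than reading the recap by eye, the last recap line can be checked for failures. The function below is a sketch that assumes the recap format shown above; recap_is_clean is a hypothetical name.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads operator log text on stdin and succeeds when the
# last PLAY RECAP host line reports unreachable=0 and failed=0.
recap_is_clean() {
  local line
  # Keep only the most recent recap host line, if any.
  line=$(grep -E 'ok=[0-9]+ .*failed=[0-9]+' | tail -n 1)
  [ -n "$line" ] || return 1
  case "$line" in
    *"unreachable=0 "*"failed=0 "*) return 0 ;;
    *) return 1 ;;
  esac
}
```

For example, kubectl -n awx logs deployments/awx-operator-controller-manager | recap_is_clean && echo "deployment finished cleanly". Only the last recap is considered, so run the check after the log output has settled.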

The required objects should now have been deployed next to AWX Operator in the awx namespace.

$ kubectl -n awx get awx,all,ingress,secrets
NAME                      AGE
awx.awx.ansible.com/awx   6m48s

NAME                                                  READY   STATUS      RESTARTS   AGE
pod/awx-operator-controller-manager-59b86c6fb-4zz9r   2/2     Running     0          7m22s
pod/awx-postgres-15-0                                 1/1     Running     0          6m33s
pod/awx-web-549f7fdbc5-htpl9                          3/3     Running     0          6m5s
pod/awx-migration-24.6.0-kglht                        0/1     Completed   0          4m36s
pod/awx-task-7d4fcdd449-mqkp2                         4/4     Running     0          6m4s

NAME                                                      TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)    AGE
service/awx-operator-controller-manager-metrics-service   ClusterIP   10.43.58.194    <none>        8443/TCP   7m33s
service/awx-postgres-15                                   ClusterIP   None            <none>        5432/TCP   6m33s
service/awx-service                                       ClusterIP   10.43.180.226   <none>        80/TCP     6m7s

NAME                                              READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/awx-operator-controller-manager   1/1     1            1           7m33s
deployment.apps/awx-web                           1/1     1            1           6m5s
deployment.apps/awx-task                          1/1     1            1           6m4s

NAME                                                        DESIRED   CURRENT   READY   AGE
replicaset.apps/awx-operator-controller-manager-59b86c6fb   1         1         1       7m22s
replicaset.apps/awx-web-549f7fdbc5                          1         1         1       6m5s
replicaset.apps/awx-task-7d4fcdd449                         1         1         1       6m4s

NAME                               READY   AGE
statefulset.apps/awx-postgres-15   1/1     6m33s

NAME                             COMPLETIONS   DURATION   AGE
job.batch/awx-migration-24.6.0   1/1           2m4s       4m36s

NAME                                    CLASS     HOSTS             ADDRESS         PORTS     AGE
ingress.networking.k8s.io/awx-ingress   traefik   awx.example.com   192.168.0.221   80, 443   6m6s

NAME                                  TYPE                DATA   AGE
secret/redhat-operators-pull-secret   Opaque              1      7m33s
secret/awx-admin-password             Opaque              1      6m48s
secret/awx-postgres-configuration     Opaque              6      6m48s
secret/awx-secret-tls                 kubernetes.io/tls   2      6m48s
secret/awx-app-credentials            Opaque              3      6m9s
secret/awx-secret-key                 Opaque              1      6m41s
secret/awx-broadcast-websocket        Opaque              1      6m38s
secret/awx-receptor-ca                kubernetes.io/tls   2      6m14s
secret/awx-receptor-work-signing      Opaque              2      6m12s

Now your AWX is available at https://awx.example.com/ or the hostname you specified.

Note that you have to access AWX via the hostname that you specified in base/awx.yaml, not by IP address, since this guide uses Ingress. Configure DNS, or the hosts file on the client where the browser is running, so that the hostname resolves to this host.
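
On a Linux or macOS client without DNS, a hosts entry is one way to make the hostname resolve. The sketch below adds the entry only when the hostname is not already listed; add_hosts_entry is a hypothetical helper, and the IP address in the example is the ingress address from the sample output above, which will differ in your environment.

```shell
#!/usr/bin/env bash
# Hypothetical helper: appends "IP HOSTNAME" to a hosts file ($3, default
# /etc/hosts) unless the hostname already appears on a non-comment line.
add_hosts_entry() {
  local ip="$1" host="$2" file="${3:-/etc/hosts}"
  if awk -v h="$host" '
      !/^[[:space:]]*#/ { for (i = 2; i <= NF; i++) if ($i == h) found = 1 }
      END { exit !found }' "$file" 2>/dev/null; then
    return 0
  fi
  printf '%s %s\n' "$ip" "$host" >> "$file"
}
```

For example, add_hosts_entry 192.168.0.221 awx.example.com, run as root because /etc/hosts is not writable by regular users. Passing a third argument lets the function be tried against a copy of the file first.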

At this point, AWX can be accessed via HTTP as well as HTTPS. If you want to force users to use HTTPS, see Tips: Enable HTTP Strict Transport Security (HSTS).

Back up and Restore AWX using AWX Operator

AWX Operator 0.10.0 or later can back up and restore AWX in an easy way.

Refer to Back up AWX using AWX Operator and Restore AWX using AWX Operator for details.

Additional Guides

awx-on-k3s's Issues

Update guide

Can you please create a guide for safely updating to a newer version of AWX Operator?

Where do files created during playbook go?

Environment

k3s --version
k3s version v1.22.7+k3s1 (8432d7f2)
go version go1.16.10

  • OS: CentOS X.Y, RHEL X.Y, Ubuntu X.Y, Debian X.Y, ...
    CentOS Linux release 8.5.2111

  • Kubernetes/K3s: X.Y.Z
    kubectl version
    Client Version: version.Info{Major:"1", Minor:"22", GitVersion:"v1.22.7+k3s1", GitCommit:"8432d7f239676dfe8f748c0c2a3fabf8cf40a826", GitTreeState:"clean", BuildDate:"2022-02-24T23:03:47Z", GoVersion:"go1.16.10", Compiler:"gc", Platform:"linux/amd64"}
    Server Version: version.Info{Major:"1", Minor:"22", GitVersion:"v1.22.7+k3s1", GitCommit:"8432d7f239676dfe8f748c0c2a3fabf8cf40a826", GitTreeState:"clean", BuildDate:"2022-02-24T23:03:47Z", GoVersion:"go1.16.10", Compiler:"gc", Platform:"linux/amd64"}

  • AWX Operator: 0.20.0

Description

I can't find files that should be created in /data/projects/_???

The playbook:


- hosts: localhost
  gather_facts: false
  tasks:
    - name: Create a zip archive of compress.txt
      community.general.archive:
        path: compress.txt
        format: zip
    - name: complete message
      debug:
        msg: "take 42 : file zipped!"

The template run:

EXEC /bin/sh -c 'rm -f -r /home/runner/.ansible/tmp/ansible-tmp-1649343859.1696134-25-67367807368605/ > /dev/null 2>&1 && sleep 0'
changed: [localhost] => {
"archived": [
"compress.txt"
],
"arcroot": "",
"changed": true,
"dest": "compress.txt.zip",
"dest_state": "archive",
"expanded_exclude_paths": [],
"expanded_paths": [
"compress.txt"
],
"gid": 0,
"group": "root",
"invocation": {
"module_args": {
"attributes": null,
"dest": null,
"exclude_path": [],
"exclusion_patterns": null,
"force_archive": false,
"format": "zip",
"group": null,
"mode": null,
"owner": null,
"path": [
"compress.txt"
],
...
TASK [complete message] ********************************************************
task path: /runner/project/awx_compress.yml:9

This runs and says it is Successful.

Step to Reproduce

I source the project and then run the template.
I just can't find where compress.txt.zip is going if at all.
If I run this using ansible-playbook compress.yml on any system, it works and produces the file.

Template info

  • Name: compress
  • Job Type: run
  • Organization: Default
  • Inventory: Demo Inventory
  • Project: compress
  • Execution Environment: Control Plane Execution Environment
  • Playbook: awx_compress.yml
  • Forks: 0
  • Verbosity: 4 (Connection Debug)
  • Timeout: 0
  • Show Changes: Off
  • Job Slicing: 1
  • Created: 4/6/2022, 5:32:35 PM by admin
  • Last Modified: 4/7/2022, 10:51:03 AM by admin
  • Credentials: SSH: Demo Credential
  • Variables: (none shown)

Project info

  • Last Job Status: Successful
  • Name: compress
  • Organization: Default
  • Source Control Type: Git
  • Source Control Revision: 800c8ae
  • Source Control URL: http://192.168.99.201/root/compress_playbook.git
  • Source Control Credential: Scm: local_git_user
  • Cache Timeout: 0 Seconds
  • Default Execution Environment: AWX EE (latest)
  • Project Base Path: /var/lib/awx/projects
  • Playbook Directory: _8__compress
  • Created: 4/6/2022, 5:30:36 PM by admin
  • Last Modified: 4/7/2022, 11:04:00 AM by admin

Include documentation for external PostgreSQL database

Thank you for this awesome guide. I was able to get AWX working with minimal difficulties.

Would it be possible to get an alternate set of documentation that shows how to use an external (unmanaged) PostgreSQL database?

Image pull docker.io/library/postgres:12 failure - where to configure image pull secret / credentials

Environment

  • OS: RHEL 8.5
  • Kubernetes/K3s: v1.22.6+k3s1 (3228d9cb)
  • AWX Operator: 0.16.1

Description

On a fresh install the postgres container is in 'ErrImagePull' state.

# kubectl -n awx get pod
NAME                                               READY   STATUS         RESTARTS   AGE
awx-operator-controller-manager-6d959bd7dd-9g786   2/2     Running        0          2m58s
awx-postgres-0                                     0/1     ErrImagePull   0          46s

Looking further it seems it is unable to pull the image from docker.io

# kubectl -n awx describe pod awx-postgres-0
  Warning  Failed            20s (x2 over 50s)  kubelet            Error: ImagePullBackOff
  Normal   Pulling           6s (x3 over 54s)   kubelet            Pulling image "postgres:12"
  Warning  Failed            3s (x3 over 50s)   kubelet            Failed to pull image "postgres:12": rpc error: code = Unknown desc = failed to pull and unpack image "docker.io/library/postgres:12": failed to copy: httpReadSeeker: failed open: unexpected status code https://registry-1.docker.io/v2/library/postgres/manifests/sha256:ed97ef00029e0df606e9d8c9fba68b1ef5d023dbacc84178b441312282178123: 429 Too Many Requests - Server message: toomanyrequests: You have reached your pull rate limit. You may increase the limit by authenticating and upgrading: https://www.docker.com/increase-rate-limit

I think I understand the error - I have a login for docker hub, but I have not specified it anywhere so my requests are being denied.

This is very much identical to Image pull failing on K3S deployment using kurokobo/awx-on-k3s

I see references to image_pull_secret in the AWX Operator guide, but I'm unsure how to use them in conjunction with this guide.

What are the options for specifying credentials to docker hub?

Step to Reproduce

Follow the guide on a fresh installation, but do not provide any docker hub credentials or run 'docker login'.

Restore seems to fail

Environment

  • OS: Centos 8.3
  • Kubernetes/K3s: 1.21.3
  • AWX Operator: 0.13.0

Description

Hi,

Did a test backup which was executed as expected. The backup data exists. However, restore was not successful.

Not sure which part to pick out from the log but here is an example.

TASK [Create management pod from templated deployment config] ******************************** 
fatal: [localhost]: FAILED! => {"changed": false, "error": 422, "msg": "Failed to create object: b'{\"kind\":\"Status\",\"apiVersion\":\"v1\",\"metadata\":{},\"status\":\"Failure\",\"message\":\"Pod \\\\\"awxrestore-2021-08-30-db-management\\\\\" is invalid: [spec.volumes[0].persistentVolumeClaim.claimName: Required value, spec.containers[0].volumeMounts[0].name: Not found: \\\\\"awxrestore-2021-08-30-backup\\\\\"]\",\"reason\":\"Invalid\",\"details\":{\"name\":\"awxrestore-2021-08-30-db-management\",\"kind\":\"Pod\",\"causes\":[{\"reason\":\"FieldValueRequired\",\"message\":\"Required value\",\"field\":\"spec.volumes[0].persistentVolumeClaim.claimName\"},{\"reason\":\"FieldValueNotFound\",\"message\":\"Not found: \\\\\"awxrestore-2021-08-30-backup\\\\\"\",\"field\":\"spec.containers[0].volumeMounts[0].name\"}]},\"code\":422}\\n'", "reason": "Unprocessable Entity", "status": 422}

Did I miss something?

Thanks

Changing default ports

Hello,

Thank you for this, really appreciate it! Where can I change the port number? Say I want to run on port 8443 rather than the default port 80?

Thank you!

Bad Gateway upon fresh deployment

Hi there,

On my AWX on K3S deployment, I get a "Bad Gateway" same as described here in issue #10. Can you please give me some pointer how may I go about troubleshooting this?

Environment

  • OS: Open SUSE 15.3
  • k3s version v1.22.6+k3s1 (3228d9cb)
  • go version go1.16.10
  • AWX Operator version 0.14.0

Issue Description

I'm able to start all of the pods:

awx-nathan:~ # kubectl -n awx get all,ingress,awx
NAME                                                   READY   STATUS    RESTARTS         AGE
pod/awx-postgres-0                                     1/1     Running   0                141m
pod/awx-operator-controller-manager-68d787cfbd-2jlsf   2/2     Running   0                144m
pod/awx-b85cd74b6-l4zn4                                4/4     Running   20 (6m49s ago)   140m

NAME                                                      TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)    AGE
service/awx-operator-controller-manager-metrics-service   ClusterIP   10.43.178.253   <none>        8443/TCP   149m
service/awx-postgres                                      ClusterIP   None            <none>        5432/TCP   141m
service/awx-service                                       ClusterIP   10.43.137.81    <none>        80/TCP     140m

NAME                                              READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/awx-operator-controller-manager   1/1     1            1           149m
deployment.apps/awx                               1/1     1            1           140m

NAME                                                         DESIRED   CURRENT   READY   AGE
replicaset.apps/awx-operator-controller-manager-68d787cfbd   1         1         1       149m
replicaset.apps/awx-b85cd74b6                                1         1         1       140m

NAME                            READY   AGE
statefulset.apps/awx-postgres   1/1     141m

NAME                                    CLASS    HOSTS            ADDRESS         PORTS     AGE
ingress.networking.k8s.io/awx-ingress   <none>   awx-nathan.com   10.188.28.216   80, 443   140m

NAME                      AGE
awx.awx.ansible.com/awx   141m

But when I try to access the AWX GUI, I get a "Bad Gateway" error.

Any pointer would be really appreciated! Thanks.

Getting a "404 Page Not Found" After Seemingly Successful Deployment

Environment

  • OS: CentOS8 Stream
  • Kubernetes/K3s: 1.22.7
  • AWX Operator: 0.18.0

Description

After deployment, navigating to https://[ip] serves a blank page with the only text being "404 page Not Found".

Running kubectl describe on ingress

$ kubectl -n awx describe ingress
Name:             awx-ingress
Namespace:        awx
Address:          192.168.0.105
Default backend:  default-http-backend:80 (<error: endpoints "default-http-backend" not found>)
TLS:
  awx-secret-tls terminates [hostname]
Rules:
  Host                  Path  Backends
  ----                  ----  --------
  [hostname]
                        /   awx-service:80 (10.42.0.11:8052)
Annotations:            <none>
Events:                 <none>

Output of all AWX related pods, etc.

$ kubectl -n awx get awx,all,ingress,secrets
NAME                      AGE
awx.awx.ansible.com/awx   92m

NAME                                                   READY   STATUS    RESTARTS   AGE
pod/awx-operator-controller-manager-5ddf49cc4f-2nwdx   2/2     Running   0          96m
pod/awx-postgres-0                                     1/1     Running   0          91m
pod/awx-6c645b554-kgv2d                                4/4     Running   0          90m

NAME                                                      TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)    AGE
service/awx-operator-controller-manager-metrics-service   ClusterIP   10.43.98.31     <none>        8443/TCP   96m
service/awx-postgres                                      ClusterIP   None            <none>        5432/TCP   91m
service/awx-service                                       ClusterIP   10.43.231.213   <none>        80/TCP     90m

NAME                                              READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/awx-operator-controller-manager   1/1     1            1           96m
deployment.apps/awx                               1/1     1            1           90m

NAME                                                         DESIRED   CURRENT   READY   AGE
replicaset.apps/awx-operator-controller-manager-5ddf49cc4f   1         1         1       96m
replicaset.apps/awx-6c645b554                                1         1         1       90m

NAME                            READY   AGE
statefulset.apps/awx-postgres   1/1     91m

NAME                                    CLASS    HOSTS                  ADDRESS         PORTS     AGE
ingress.networking.k8s.io/awx-ingress   <none>   awx.spartandev.local   192.168.0.105   80, 443   90m

NAME                                                 TYPE                                  DATA   AGE
secret/default-token-fjwqp                           kubernetes.io/service-account-token   3      96m
secret/awx-operator-controller-manager-token-9jb8t   kubernetes.io/service-account-token   3      96m
secret/awx-admin-password                            Opaque                                1      92m
secret/awx-postgres-configuration                    Opaque                                6      92m
secret/awx-secret-tls                                kubernetes.io/tls                     2      92m
secret/awx-app-credentials                           Opaque                                3      90m
secret/awx-token-l2cjc                               kubernetes.io/service-account-token   3      90m
secret/awx-secret-key                                Opaque                                1      92m
secret/awx-broadcast-websocket                       Opaque                                1      91m

AWX Upgrade results in container manager error

Environment

k3s version v1.21.5+k3s2 (724ef700)
go version go1.16.8
  • OS: CentOS 8
  • AWX Operator: 0.17.0

Description

Followed your guide and upgraded from 0.14.0 to 0.17.0. First attempt failed due to resources, but worked after deleting the old deployment.

An error occurs when I execute the backup and the following command:
kubectl -n awx logs deployments/awx-operator-controller-manager -c manager --since=60m --tail=4

error: container manager is not valid for pod awx-operator-controller-manager-775b5cfc56-46htw

NAME                                               READY   STATUS    RESTARTS   AGE
awx-postgres-0                                     1/1     Running   3          121d
awx-operator-controller-manager-775b5cfc56-46htw   2/2     Running   0          22h
awx-7f55c57c85-7d6q8                               4/4     Running   0          22h
[root@awx backup]# kubectl -n awx describe pod awx-operator-controller-manager-775b5cfc56-46htw
Name:         awx-operator-controller-manager-775b5cfc56-46htw
Namespace:    awx
Priority:     0
Node:         awx.domain.local/10.6.104.4
Start Time:   Thu, 17 Feb 2022 12:17:54 +0100
Labels:       control-plane=controller-manager
              pod-template-hash=775b5cfc56
Annotations:  <none>
Status:       Running
IP:           10.42.0.129
IPs:
  IP:           10.42.0.129
Controlled By:  ReplicaSet/awx-operator-controller-manager-775b5cfc56
Containers:
  kube-rbac-proxy:
    Container ID:  containerd://2d90e3940e25a32c2d0da5436ddd817b8b3cc3cb2a7a1c612bf3284d1f85cedd
    Image:         gcr.io/kubebuilder/kube-rbac-proxy:v0.8.0
    Image ID:      gcr.io/kubebuilder/kube-rbac-proxy@sha256:db06cc4c084dd0253134f156dddaaf53ef1c3fb3cc809e5d81711baa4029ea4c
    Port:          8443/TCP
    Host Port:     0/TCP
    Args:
      --secure-listen-address=0.0.0.0:8443
      --upstream=http://127.0.0.1:8080/
      --logtostderr=true
      --v=10
    State:          Running
      Started:      Thu, 17 Feb 2022 12:17:55 +0100
    Ready:          True
    Restart Count:  0
    Environment:    <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-frw4h (ro)
  awx-manager:
    Container ID:  containerd://408d16164ed4bb4b63f5f8a5e619b73d569a057f7674a93807cf0918e4151c9f
    Image:         quay.io/ansible/awx-operator:0.17.0
    Image ID:      quay.io/ansible/awx-operator@sha256:2ffa0449b9ee0961df3e4794c5da5bcea2a0f7677df2ddad63e07652fd11ef54
    Port:          <none>
    Host Port:     <none>
    Args:
      --health-probe-bind-address=:6789
      --metrics-bind-address=127.0.0.1:8080
      --leader-elect
      --leader-election-id=awx-operator
    State:          Running
      Started:      Thu, 17 Feb 2022 12:17:56 +0100
    Ready:          True
    Restart Count:  0
    Liveness:       http-get http://:6789/healthz delay=15s timeout=1s period=20s #success=1 #failure=3
    Readiness:      http-get http://:6789/readyz delay=5s timeout=1s period=10s #success=1 #failure=3
    Environment:
      ANSIBLE_GATHERING:   explicit
      ANSIBLE_DEBUG_LOGS:  false
      WATCH_NAMESPACE:     awx (v1:metadata.namespace)
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-frw4h (ro)
Conditions:
  Type              Status
  Initialized       True 
  Ready             True 
  ContainersReady   True 
  PodScheduled      True 
Volumes:
  kube-api-access-frw4h:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
QoS Class:                   BestEffort
Node-Selectors:              <none>
Tolerations:                 node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                             node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:                      <none>

Any ideas? Thx in advance...

Callback provisioning not working

I have AWX up and running. And have HTTP_X_FORWARDED_FOR set. But callback provisioning isn't working. I get {"msg":"No matching host could be found!"} Which leads me to believe that somehow the HTTP headers are getting changed too much so AWX doesn't recognize the host. I'm pretty inexperienced with Kubernetes. Is there something that needs to be done with K3s to make the forwarding work right?

Deployment update fails

Hi all,
I've got a little problem when trying to update with the instructions.

Backup of 19.3 ok.

Prepare required files

git clone https://github.com/ansible/awx-operator.git >> OK
cd awx-operator >> OK
git checkout 0.14.0 >> OK

Deploy AWX Operator

export NAMESPACE=awx >> ok
make deploy >> KO

make deploy

/bin/sh: line 4: tar: command not found
curl: (23) Failed writing body (1349 != 1378)
make: *** [Makefile:95: kustomize] Error 127

Any idea?

Thank you!

/data/projects empty

I get a 404 error.
I checked, and the static files are not in the projects folder.
After the deployment, the /data/projects folder is empty.
The postgres folder does contain the database.
To check whether the folder permissions were the problem, I set them to 777.
After redeploying, the mode went back to 755 but the owner is 0:1000 (instead of 1000:0), so something is changing it.
I don't understand why there are no files in projects.
Any idea?
(Sorry for my English)

Originally posted by @Randy29800 in #3 (comment)

404 page not found

Environment

k3s version: v1.22.5+k3s1 (405bf79d)

  • OS: Rocky Linux 8.5
  • AWX Operator: 0.15.0

Description

The installation and configuration of AWX go smoothly. There are no error messages.
But when I try to access AWX via my IP address (http://10.10.0.9), I get the error message "404 page not found". The same happens for access via HTTPS.
What am I missing?

SSL Certificate error

Hi,

Followed your guide and AWX installs without errors according to the install logs. Using a VM in Azure with no ports open to the internet.

Certificate error: CA Root certificate not trusted, issuer is TRAEFIK DEFAULT CERT

The service principal is DNS Zone Contributor in the Azure DNS zone.

Any specific logs one should look at?

Cannot deploy when task_extra_env is specified

Hi again, and thanks for keeping this great project updated!

I am hitting an issue today so I am opening up a report here as I cannot manage to use task_extra_env to set AWX_CLEANUP_PATHS to False (this is related to https://groups.google.com/g/awx-project/c/XhY-uDSxDIo/m/CnhjRQG5AAAJ).

I have double-checked, but let me know if I overlooked something. I am filing here since I deploy via your repo, but this may be an issue with the upstream operator.

Environment

  • K3S: k3s version v1.19.15+k3s1 (e698d6d8)
  • OS: Ubuntu 20.04.3 LTS
  • AWX Operator: 0.14.0

Description

When task_extra_env is set to the following, the operator cannot deploy via the updated CRD:

 task_extra_env: |
    - name: AWX_CLEANUP_PATHS
      value: false

The syntax I used is in line with what is advised at https://github.com/ansible/awx-operator#exporting-environment-variables-to-containers:

  spec:
    task_extra_env: |
      - name: MYCUSTOMVAR
        value: foo
    web_extra_env: |
      - name: MYCUSTOMVAR
        value: foo
    ee_extra_env: |
      - name: MYCUSTOMVAR
        value: foo

Yet the deployment fails:

TASK [installer : Apply deployment resources] **********************************\r\ntask path: /opt/ansible/roles/installer/tasks/resources_configuration.yml:35\nfatal: [localhost]: FAILED! => {\"changed\": false, \"error\": 400, \"msg\": \"Failed to apply object: b'{\\\"kind\\\":\\\"Status\\\",\\\"apiVersion\\\":\\\"v1\\\",\\\"metadata\\\":{},\\\"status\\\":\\\"Failure\\\",\\\"message\\\":\\\"Deployment in version \\\\\\\\\\\"v1\\\\\\\\\\\" cannot be handled as a Deployment: v1.Deployment.Spec: v1.DeploymentSpec.Template: v1.PodTemplateSpec.Spec: v1.PodSpec.Containers: []v1.Container: v1.Container.Env: []v1.EnvVar: v1.EnvVar.Value: ReadString: expects \\\\\\\\\\\" or n, but found f, error found in #10 byte of ...|,\\\\\\\\\\\"value\\\\\\\\\\\":false}],\\\\\\\\\\\"im|..., bigger context ...|amespace\\\\\\\\\\\"}}},{\\\\\\\\\\\"name\\\\\\\\\\\":\\\\\\\\\\\"AWX_CLEANUP_PATHS\\\\\\\\\\\",\\\\\\\\\\\"value\\\\\\\\\\\":false}],

I can indeed see that the CRD looks fishy. Note the bizarre whitespace placement: the indentation was lost. Full output is attached:

$ kubectl describe awx
[...]
  Replicas:                         1
  route_tls_termination_mechanism:  Edge
  task_extra_env:                   - name: AWX_CLEANUP_PATHS
  value: false

  task_privileged:  false
Status:
  Conditions:
    Last Transition Time:  2021-12-07T14:03:53Z

I suspect this is because the templating at https://github.com/ansible/awx-operator/blob/devel/roles/installer/templates/deployment.yaml.j2#L266 might be too simple and might not take the more complex data structure into account, but I might be wrong.

Did you successfully manage to override those environment variables this way?
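
For reference, the error above says that v1.EnvVar.Value expects a string but found an unquoted f, which is the start of the boolean false. Kubernetes environment variable values must be strings, so one possible workaround is to quote the value. This is a minimal sketch under that assumption and has not been verified against this operator version:

```yaml
spec:
  task_extra_env: |
    - name: AWX_CLEANUP_PATHS
      # Quoted so that YAML yields the string "false" rather than a boolean
      # (assumption: the operator passes this value through unchanged).
      value: "false"
```

Whether AWX itself then interprets the string as intended is a separate question that this sketch does not answer.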

Step to Reproduce

  1. Deploy AWX as per your instructions, but before applying, replace base/awx.yaml with the attached one.
  2. Run kubectl apply -k base/ as usual.
  3. Notice how the deployment fails at the operator level with the above-mentioned error.

Logs + full base/ directory: extra_env_error.zip

AWX and community module installation

Environment

  • OS: CentOS 8
  • Kubernetes/K3s: 1.21.3
  • AWX Operator: 0.13.0

Description

Hi,

I am having trouble figuring out how to install community module requirements, and noticed that your code contains builder/requirements.yml and builder/requirements.txt.

Can these be used to customize which modules are present at runtime?

I can't execute community.vmware modules, since how to do this in AWX isn't described anywhere.

Step to Reproduce

- name: Make sure requirements are met to run vmware modules
  become: true
  ansible.builtin.pip:
    name: pyVmomi
    state: present

- name: Export virtual machine facts
  community.vmware.vmware_guest_info:
    hostname: "{{ hostname }}"
    username: '{{ domain_user }}'
    password: "{{ domain_password }}"
    datacenter: "{{ datacenter }}"
    validate_certs: no
    schema: vsphere
    name: VM01
  register: virtualmachine_facts

Both tasks result in:

fatal: [127.0.0.1]: FAILED! => {
    "ansible_facts": {
        "discovered_interpreter_python": "/usr/libexec/platform-python"
    },
    "changed": false,
    "module_stderr": "/bin/sh: sudo: command not found\n",
    "module_stdout": "",
    "msg": "MODULE FAILURE\nSee stdout/stderr for the exact error",
    "rc": 127
}

How to specify an additional DNS Alias beside hostname

Hi,

we have two servers where we deploy our AWX instance. They are called ansible01.example.com and ansible02.example.com. We create a DNS alias and issue the certificates with an additional SAN called ansible.example.com.

In the base/awx.yml the hostname has to match the hostname of the URL that is later used to contact AWX. Otherwise you will get a 404 error.

...
spec:
  ...
  ingress_type: ingress
  ingress_tls_secret: awx-secret-tls
  hostname: awx.example.com     👈👈👈
...

Is there a way to specify an additional hostname? I know that traefik itself can handle this in the docker environment, but I'm not sure if this is possible in this context.

Thanks and best regards

Jens

Clarification on mode for /etc/rancher/k3s/k3s.yaml / --write-kubeconfig-mode 644 in install guide

Hello! Your project has been very useful for us in creating running AWX environments.

I have an environment where we are considering locking down kubectl interactions to the root user, so I am contemplating changing the kubeconfig mode to 600/640. Is there any risk in doing so? I do not understand the architecture well enough to know why this kubeconfig file is made readable to non-root users in the guide.

Thank you for your consideration.

Getting Failed Issue

I followed this exactly, but I am getting an error. Is there any way to debug this?

The VM has 2 cores and 8 GB RAM. I am not sure what's wrong. I have tried three times now.

PLAY RECAP *********************************************************************\r\nlocalhost                  : ok=32   changed=0    unreachable=0    failed=1    skipped=25   rescued=0    ignored=0   \r\n\n","job":"5444334632873030104","name":"awx","namespace":"awx","error":"exit status 2"}
{"level":"error","ts":1634329553.7360277,"logger":"controller-runtime.manager.controller.awx-controller","msg":"Reconciler error","name":"awx","namespace":"awx","error":"event runner on failed","stacktrace":"sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:253\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:214"}

Here's the status:

[root@ansible awx-on-k3s]# kubectl -n awx get awx,all,ingress,secrets
NAME                      AGE
awx.awx.ansible.com/awx   57m

NAME                                                   READY   STATUS    RESTARTS   AGE
pod/awx-operator-controller-manager-68d787cfbd-tmrnv   2/2     Running   0          62m
pod/awx-84d5c45999-6pc4t                               0/4     Pending   0          57m
pod/awx-postgres-0                                     1/1     Running   0          57m

NAME                                                      TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)    AGE
service/awx-operator-controller-manager-metrics-service   ClusterIP   10.43.178.126   <none>        8443/TCP   62m
service/awx-postgres                                      ClusterIP   None            <none>        5432/TCP   57m
service/awx-service                                       ClusterIP   10.43.117.74    <none>        80/TCP     57m

NAME                                              READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/awx-operator-controller-manager   1/1     1            1           62m
deployment.apps/awx                               0/1     1            0           57m

NAME                                                         DESIRED   CURRENT   READY   AGE
replicaset.apps/awx-operator-controller-manager-68d787cfbd   1         1         1       62m
replicaset.apps/awx-84d5c45999                               1         1         0       57m

NAME                            READY   AGE
statefulset.apps/awx-postgres   1/1     57m

NAME                                    CLASS    HOSTS                    ADDRESS          PORTS     AGE
ingress.networking.k8s.io/awx-ingress   <none>   awx.mydomain.net   199.XXX.XXX.21   80, 443   57m

NAME                                                 TYPE                                  DATA   AGE
secret/default-token-t9j72                           kubernetes.io/service-account-token   3      62m
secret/awx-operator-controller-manager-token-lgwjk   kubernetes.io/service-account-token   3      62m
secret/awx-admin-password                            Opaque                                1      57m
secret/awx-postgres-configuration                    Opaque                                6      57m
secret/awx-secret-tls                                kubernetes.io/tls                     2      57m
secret/awx-secret-key                                Opaque                                1      57m
secret/awx-broadcast-websocket                       Opaque                                1      57m
secret/awx-app-credentials                           Opaque                                3      57m
secret/awx-token-fjbtc                               kubernetes.io/service-account-token   3      57m

Not deploying

Environment

  • OS: Ubuntu 20.04
  • Kubernetes/K3s: v1.21.7+k3s1 (ac705709)
  • AWX Operator: 0.15.0

Description

The deployment is stuck in an endless loop. It worked once a few days ago, but not anymore.

Step to Reproduce

Deploy via README.md (with the ACME issuer).

Logs

...has timed out progressing.", "reason": "ProgressDeadlineExceeded", "status": "False", "type": "Progressing"}], "observedGeneration": 1, "replicas": 1, "unavailableReplicas": 1, "updatedReplicas": 1}}}\n\r\nPLAY RECAP *********************************************************************\r\nlocalhost : ok=41 changed=0 unreachable=0 failed=1 skipped=30 rescued=0 ignored=0 \r\n\n","job":"3116155310435672272","name":"awx","namespace":"awx","error":"exit status 2"}
{"level":"error","ts":1640068891.7860224,"logger":"controller-runtime.manager.controller.awx-controller","msg":"Reconciler error","name":"awx","namespace":"awx","error":"event runner on failed","stacktrace":"sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:253\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:214"}

Backup to Azure Storage

Hi,

Not a bug, but I was wondering whether the backup/restore procedure would work with Azure Storage. If so, could you add an example at some point in the future?

BR

Job error on container log rotation

Hi, thanks for this great guide!

When running jobs with a large number of hosts, we found that we would trigger ansible/awx#10366, which causes the job in AWX to exit with the message "error" and no summary, even though the EE pod would still finish the playbook in the background.

As described in the issue linked above, the error occurs when container logs are rotated by the kubelet, which causes the Kubernetes log stream that AWX uses to fail.

I can confirm that increasing the maximum container log size, as described in this comment (ansible/awx#10366 (comment)), works around the issue until the root cause is fixed.
We used the following command to update/install K3s with the increased maximum:

curl -sfL https://get.k3s.io | K3S_KUBECONFIG_MODE="644" INSTALL_K3S_EXEC="--kubelet-arg container-log-max-size=150Mi" sh -

Maybe it's worth increasing the maximum even further than 150Mi (from the 10Mi default) for bigger environments.
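
As an alternative to passing the flag on the install command line, K3s can also read its options from a configuration file. The following is a minimal sketch, assuming the default file location /etc/rancher/k3s/config.yaml and that K3s is restarted afterwards so the kubelet picks up the change. The 150Mi value is simply the one used above.

```yaml
# /etc/rancher/k3s/config.yaml (assumed default location)
# Equivalent of K3S_KUBECONFIG_MODE="644"
write-kubeconfig-mode: "0644"
# Equivalent of --kubelet-arg container-log-max-size=150Mi
kubelet-arg:
  - "container-log-max-size=150Mi"
```

Keeping the setting in a file means it does not have to be repeated on the command line each time the install script is re-run for an upgrade.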

Suggestion: Documentation of Storage Requirements

Environment

  • OS: RHEL 8.5
  • Kubernetes/K3s: v1.22.6+k3s1 (3228d9cb)
  • AWX Operator: 0.16.1

Description

It may be helpful for new users to this project to be aware of the storage requirements - perhaps in the "Prepare CentOS 8 host" section - as I ran into a couple of problems when I first started using this repository.

/var/lib/rancher

I ran out of room in /var as it is on a dedicated filesystem. I've since created a /var/lib/rancher filesystem which on my base AWX install consumes around 5.6GB of space. I'm unsure if this is going to grow and by how much - is it possible to make a reference to this directory/filesystem and perhaps some initial sizing suggestions?

File permissions

Our build has a restrictive umask, so my /data directories were different from yours.

drwxr-x---. 3 root root 18 Feb 16 09:07 /data/postgres
drwxr-x---. 2 root root  6 Feb 16 09:07 /data/postgres/data
drwxr-x---. 2 1000 root  6 Feb 16 09:07 /data/projects

However, I followed the steps https://github.com/kurokobo/awx-on-k3s/blob/main/tips/troubleshooting.md#the-pod-for-postgresql-is-in-crashloopbackoff-state-and-shows-permission-denied-log which resolved the issue. I now perform the following on a fresh install:

sudo chmod 755 /data/postgres /data/postgres/data

Is it worth adding that to the setup docs as it seems like a firm requirement?

/data

I created a /data filesystem for my environment - again, unsure if that should be documented. However, I did see this in the base configuration:

  postgres_storage_class: awx-postgres-volume
  postgres_storage_requirements:
    requests:
      storage: 2Gi

If the database grows beyond this size, is it simply a case of updating that file and running kubectl apply -k base?

Step to Reproduce

Follow the deployment guide on a freshly installed Operating System.

Cannot copy file from awx to remote host

Environment

  • OS: RHEL
  • Kubernetes/K3s: k3s version v1.23.6+k3s1 (418c3fa8)
  • AWX Operator: 0.21.0

Description

I want to copy a file from AWX to a remote host, but I don't understand how it works. It shows me the error below.
Could not find or access 'copyfile.txt'\nSearched in:\n\t/runner/project/files/copyfile.txt\n\t/runner/project/copyfile.txt\n\t/runner/project/files/copyfile.txt\n\t/runner/project/copyfile.txt on the Ansible Controller.\nIf you are using a module and expect the file to exist on the remote, see the remote_src option",

I don't know where /runner/project is. I tried going into awx-ee and creating copyfile.txt there, but it did not work.

ERROR
{
"msg": "Could not find or access 'copyfile.txt'\nSearched in:\n\t/runner/project/files/copyfile.txt\n\t/runner/project/copyfile.txt\n\t/runner/project/files/copyfile.txt\n\t/runner/project/copyfile.txt on the Ansible Controller.\nIf you are using a module and expect the file to exist on the remote, see the remote_src option",
"exception": "Traceback (most recent call last):\n File "/usr/local/lib/python3.8/site-packages/ansible/plugins/action/copy.py", line 466, in run\n source = self._find_needle('files', source)\n File "/usr/local/lib/python3.8/site-packages/ansible/plugins/action/__init__.py", line 1364, in _find_needle\n return self._loader.path_dwim_relative_stack(path_stack, dirname, needle)\n File "/usr/local/lib/python3.8/site-packages/ansible/parsing/dataloader.py", line 341, in path_dwim_relative_stack\n raise AnsibleFileNotFound(file_name=source, paths=[to_native(p) for p in search])\nansible.errors.AnsibleFileNotFound: Could not find or access 'copyfile.txt'\nSearched in:\n\t/runner/project/files/copyfile.txt\n\t/runner/project/copyfile.txt\n\t/runner/project/files/copyfile.txt\n\t/runner/project/copyfile.txt on the Ansible Controller.\nIf you are using a module and expect the file to exist on the remote, see the remote_src option\n",
"invocation": {
"src": "copyfile.txt",
"dest": "/opt/",
"module_args": {
"src": "copyfile.txt",
"dest": "/opt/"
}
},
"_ansible_no_log": false,
"changed": false
}
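
The search paths in the error are all under /runner/project, which suggests that the copy module looks for the source file inside the project checked out for the job, not in a location created by hand in a running container. A minimal sketch, assuming copyfile.txt is committed to the project repository in a files/ directory next to the playbook (the play name and host pattern are hypothetical):

```yaml
- name: Copy a file shipped with the project   # hypothetical play
  hosts: all                                   # hypothetical host pattern
  tasks:
    - name: Copy copyfile.txt to the remote host
      ansible.builtin.copy:
        # Assumed to resolve to files/copyfile.txt inside the project,
        # matching the /runner/project/files/ path listed in the error.
        src: copyfile.txt
        dest: /opt/
```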

feature suggestion: ee from private ECR registry

Hi, it would be very useful to pull the execution environment images from the AWS Elastic Container Registry. Is it possible to implement such a feature?

Thank you for everything, this project is awesome! :)

Question regarding SSL

Hi,

I've manually created the DNS record in Azure, and ACME can usually create a certificate without Azure resources being defined. Why is the Issuer part needed? The concept is not clear to me. A public IP reachable on port 80/HTTP should be enough. Thanks in advance.

TLS vulnerabilities, modify traefik/ingress configuration

Environment

  • OS: Ubuntu 20.04
  • Kubernetes/K3s: k3s version v1.21.7+k3s1 (ac705709)
  • AWX Operator: 0.18.0

Description

testssl reports a few vulnerable ciphers and SSL/TLS configuration issues for the AWX instance:

SSLv2 not offered (OK)
SSLv3 not offered (OK)
TLS 1 offered (deprecated)
TLS 1.1 offered (deprecated)
TLS 1.2 offered (OK)
TLS 1.3 offered (OK): final
Triple DES Ciphers / IDEA offered
Obsolete CBC ciphers (AES, ARIA etc.) offered
Has server cipher order? no (NOT ok)

Step to Reproduce

Deploy awx-operator and awx-on-k3s with cert-manager.

How can this be solved? Thank you :)

Can I add an existing PostgreSQL database dump file to the deployment?

Hello!

I like your straightforward approach to deploying AWX on Kubernetes. We currently have a setup where we deployed AWX 19 on Docker, and therefore have an existing environment. Would it be possible to load a PostgreSQL database dump file instead of starting with a fresh install?

Thanks in advance and best regards

Jens

"make deploy" doesn't work

Environment

  • OS: CentOS X.Y, RHEL X.Y, Ubuntu X.Y, Debian X.Y, ...
  • Kubernetes/K3s: X.Y.Z
  • AWX Operator: X.Y.Z

Description

Blah blah blah ...

Step to Reproduce

  1. Deploy ...
  2. Invoke kubectl ...
  3. ...
  4. ...

Bastion host on k3s

Environment

  • OS: Ubuntu 20.04
  • Kubernetes/K3s: k3s v1.22.7+k3s1
  • AWX Operator: 0.20

Description

I would like to use a bastion host for all hosts in my inventories.

This is not an issue as such, but do you have a way to implement a bastion host on K3s for all hosts in the inventories I use in AWX?

CrashLoopBackOff on awx-postgres-0 pod

Hi,

I am experiencing CrashLoopBackOff on the awx-postgres-0 pod. The GUI is not available either (getting 404). Have you experienced this issue, too?
I am running on Ubuntu 20.04.2 LTS.

Thank you for your help and for your great project!

Petr

Changing password postgres and AWX

First of all, thank you for your amazing job.
Maybe I'm missing something but I recently messed up my awx.
I changed my /etc/hostname and I think it broke some things. I had already made some changes inside my AWX and didn't want to lose all my work, so I deleted everything except my data, which is in the /data/ directory.

Since I wanted to put it in production, I changed the passwords in base/kustomization.yaml, and now I think the database cannot be accessed:
2022-02-03T15:32:58.888085423+01:00 stderr F 2022-02-03 14:32:58.887 UTC [1486] FATAL: password authentication failed for user "awx"
2022-02-03T15:32:58.888112407+01:00 stderr F 2022-02-03 14:32:58.887 UTC [1486] DETAIL: Connection matched pg_hba.conf line 99: "host all all all scram-sha-256"
I don't know where I can tell it that I've changed the password. Is it possible to do this without recreating the database? I also set a new password for awx-admin-password, but as far as I understand, that one is only used to log in to the web interface.

CERT & Admin password - NOT WORKING

Environment

  • OS: CentOS X.Y, RHEL X.Y, Ubuntu X.Y, Debian X.Y, ...
  • Kubernetes/K3s: 1.23.6
  • AWX Operator: 0.21.0

Description

I modified the AWX password in awx.yaml and deployed with my own custom certificates.
However, after deployment SSL is not enabled, and the modified password does not work either.
Decoding the Kubernetes secret with base64, I can see that the password was updated to the specified one.

When I changed the admin password through awx-manage (inside the pod), it worked.

Could you please retest whether both SSL and the custom password work?

Enterprise SSL Certificate

Environment

  • OS: RockyLinux 8.6
  • Kubernetes/K3s: v1.23.6+k3s1 (418c3fa8)
  • AWX Operator: 0.22.0

Description

I deployed with a self-signed SSL certificate generated using:
AWX_HOST="awx.example.com"
openssl req -x509 -nodes -days 3650 -newkey rsa:2048 -out ./base/tls.crt -keyout ./base/tls.key -subj "/CN=${AWX_HOST}/O=${AWX_HOST}" -addext "subjectAltName = DNS:${AWX_HOST}"
However, I would like to switch to an SSL certificate signed by an enterprise CA. Do I just need to generate a new certificate and key with my enterprise CA using the same file names, place them in the same location, and then run kubectl apply -k base to apply the new certificate and private key?

Deployment failed - Unable to retrieve pull secret, the image pull may not succeed - redhat-operators-pull-secret

Environment

  • OS: RHEL 8.6
  • Kubernetes/K3s: v1.23.6+k3s1
  • AWX Operator: 0.21.0

Description

Attempted to spin up AWX 21 on a brand new server using this repository with the latest operator. The install failed with:

May 13 13:47:19 XXX k3s[27942]: I0513 13:47:19.715551 27942 kubelet_pods.go:891] "Unable to retrieve pull secret, the image pull may not succeed." pod="awx/awx-operator-controller-manager-675865446d-9nh27" secret="" err="secret \"redhat-operators-pull-secret\" not found"

Looking in ansible/awx-operator@859384e I can see a reference to redhat-operators-pull-secret which was made two weeks ago but I don't recall having to configure this parameter on previous releases.

Checking the version mapping table, I'm pretty sure I was able to use 0.20.1 of the operator to deploy AWX 21.0.0

AWX Operator   AWX
0.21.0         21.0.0
0.20.2         21.0.0
0.20.1         21.0.0

I may rebuild and test again with version 0.20.1 just to rule out a local misconfiguration.

Step to Reproduce

  1. New server install on RHEL 8.6, follow repo instructions and deploy with AWX Operator 0.21.0
  2. kubectl apply -k base
  3. Deployment fails with above message

[Feature] Setting up Let's Encrypt

Rather than setting up a self-signed certificate, it would be better to set up Let's Encrypt.
Since a domain/sub-domain is already set during configuration, there is little reason to create a self-signed certificate.

SSL Question

Newer versions contain SSL via ACME. Can this folder be copied to older versions like 0.14.0 and still work?

subject in RoleBinding missing namespace

The RoleBinding created in the rbac/sa.yaml file seems to be missing the namespace in the subjects section. Without adding that I got authorization failures.
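
To illustrate the point, here is a minimal sketch of a RoleBinding whose ServiceAccount subject carries an explicit namespace. All object names below are hypothetical placeholders and are not taken from rbac/sa.yaml; the awx namespace is only assumed from the rest of this guide.

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: example-rolebinding        # hypothetical name
  namespace: awx                   # assumed namespace
subjects:
  - kind: ServiceAccount
    name: example-serviceaccount   # hypothetical name
    namespace: awx                 # the field reported as missing
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: example-role               # hypothetical name
```

ServiceAccounts are namespaced objects, so the subject entry needs the namespace to identify which account the binding applies to.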

Restore fails

Environment

  • OS: CentOS 8
  • Kubernetes/K3s: v1.21.5+k3s2
  • AWX Operator: 0.14.0

Description

Attempting to restore awx from another 0.14.0 awx installation. The folder was copied.

Restore fails with message /data/backup/tower-openshift-backup-2021-10-19 does not exist, see the backupDirectory status on your AWXBackup for the correct backup_dir.

Step to Reproduce

---
apiVersion: awx.ansible.com/v1beta1
kind: AWXRestore
metadata:
  name: awxrestore-2021-10-20
  namespace: awx
spec:
  # Parameters to restore from existing files on PVC (without AWXBackup object)
  backup_pvc_namespace: awx
  backup_pvc: awx-backup-claim
  backup_dir: /data/backup/tower-openshift-backup-2021-10-19

The backup exists in /data/backup/tower-openshift-backup-2021-10-19

First attempt resulted in the message that awx-backup-claim did not exist. So I executed kubectl apply -f backup/awxbackup.yml in order to create awx-backup-claim.

For some reason, AWX can't see the folder /data/backup/tower-openshift-backup-2021-10-19, which puzzles me.

BR

Nothing is up on port 443 (or 80) ? - Why ?

root@u500-cube-server:~/awx-on-k3s# kubectl get svc --all-namespaces
NAMESPACE     NAME                   TYPE           CLUSTER-IP      EXTERNAL-IP     PORT(S)                      AGE
default       kubernetes             ClusterIP      10.43.0.1       <none>          443/TCP                      29m
kube-system   kube-dns               ClusterIP      10.43.0.10      <none>          53/UDP,53/TCP,9153/TCP       29m
kube-system   metrics-server         ClusterIP      10.43.55.221    <none>          443/TCP                      29m
kube-system   traefik                LoadBalancer   10.43.185.225   192.168.5.104   80:31716/TCP,443:31604/TCP   28m
default       awx-operator-metrics   ClusterIP      10.43.102.194   <none>          8383/TCP,8686/TCP            28m
awx           awx-postgres           ClusterIP      None            <none>          5432/TCP                     25m
awx           awx-service            ClusterIP      10.43.251.54    <none>          80/TCP                       25m

root@u500-cube-server:~/awx-on-k3s# kubectl -n awx get awx,all,ingress,secrets
NAME                      AGE
awx.awx.ansible.com/awx   29m

NAME                      READY   STATUS    RESTARTS   AGE
pod/awx-postgres-0        1/1     Running   0          29m
pod/awx-59ff55b5b-2czpx   4/4     Running   2          28m

NAME                   TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)    AGE
service/awx-postgres   ClusterIP   None           <none>        5432/TCP   29m
service/awx-service    ClusterIP   10.43.251.54   <none>        80/TCP     28m

NAME                  READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/awx   1/1     1            1           28m

NAME                            DESIRED   CURRENT   READY   AGE
replicaset.apps/awx-59ff55b5b   1         1         1       28m

NAME                            READY   AGE
statefulset.apps/awx-postgres   1/1     29m

NAME                                    CLASS    HOSTS              ADDRESS         PORTS     AGE
ingress.networking.k8s.io/awx-ingress   <none>   awx.tunninet.com   192.168.5.104   80, 443   28m

NAME                                TYPE                                  DATA   AGE
secret/awx-admin-password           Opaque                                1      29m
secret/default-token-fwskh          kubernetes.io/service-account-token   3      29m
secret/awx-postgres-configuration   Opaque                                6      29m
secret/awx-secret-tls               kubernetes.io/tls                     2      29m
secret/awx-app-credentials          Opaque                                3      28m
secret/awx-token-mdtqh              kubernetes.io/service-account-token   3      28m
secret/awx-secret-key               Opaque                                1      29m
secret/awx-broadcast-websocket      Opaque                                1      29m

Passwords with some special characters cause issues

Environment

  • OS: Red Hat Enterprise Linux release 8.4 (Ootpa)
  • Kubernetes/K3s: v1.22.7+k3s1
  • AWX Operator: 0.17.0

Description

Special characters such as & in the password for awx in awx-postgres-configuration cause backups using AWXBackup to fail. The backup container would start, create the backup directory, and set permissions, but as soon as it attempted to run pg_dump it would error out and the container would be terminated. Removing the & from the password fixed it. The awx-operator documentation says that if the password has special characters, it should be quoted.

Step to Reproduce

Perform a new install, set the passwords in the kustomization.yaml file to contain an &, then configure and run the backup operator per https://github.com/kurokobo/awx-on-k3s/tree/main/backup

Image Pull error

Getting stuck with status ImagePullBackOff on the awx-xxxxxxxxxxxxxx pod. I have attempted this install twice on a clean CentOS 8 Stream server. Server is 4 CPU, 12 GB Ram, 80 GB storage. I have been following the instructions to the T, so not sure what is going wrong.

Attached is a sample of the output from "kubectl -n awx logs -f deployments/awx-operator-controller-manager -c" if it helps.

awx-operator-controller-manager.txt

AWX Restore question

Hi,

I upgraded 14.0.0 to 21.0.0 and sure enough there is a bug... again. Downgrading resulted in an unknown AWX error, so now I have to restore for the first time.

I've deleted everything and have the backup folder here: /data/restore/tower-openshift-backup-2022-05-05-02:01:07

No backup deployment present. Just the files.

Next step: kubectl apply -k restore

Now I am unsure how restore/awxrestore.yaml should look. Is this correct? Should a base awx be installed first before restoring the pvc?

---
apiVersion: awx.ansible.com/v1beta1
kind: AWXRestore
metadata:
  name: awxrestore-2022-05-05
  namespace: awx
spec:
  deployment_name: awx

  # Parameters to restore from AWXBackup object
  #backup_pvc_namespace: awx
  #backup_name: awxbackup-2021-06-06

  # Parameters to restore from existing files on PVC (without AWXBackup object)
  backup_pvc_namespace: awx
  backup_pvc: awx-backup-claim
  backup_dir: /backups/tower-openshift-backup-2022-05-05-02:01:07

"Something went wrong" error when trying to edit Jobs Settings

Environment

  • OS: CentOS Linux release 8.5.2111
  • Kubernetes/K3s: k3s version v1.22.7+k3s1 (8432d7f2), go version go1.16.10
  • AWX Operator: 20.4

Description

A "Something went wrong" error appears when trying to edit Jobs Settings.

Step to Reproduce

Go to Settings in AWX and try to edit the Jobs settings.

Adding tips to set proxy settings

Hello,

I wanted to add an extra tips page on configuring custom proxy settings for the awx-web, awx-task and awx-ee containers, but I do not have permission to push my branch.
Would you mind adding me to the list of contributors? I'm in the early stages of contributing to projects on GitHub, but I'm willing to learn.

Thanks and best regards
Jens

Deployment fails because of missing awx-projects-volume storage class

Environment

  • OS: Ubuntu 20.04
  • Kubernetes/K3s: k3s version v1.19.15+k3s1 (e698d6d8)
  • AWX Operator: 0.14.0

Description

First, a BIG thank you for this useful and well documented project! This makes deployments of AWX a lot easier when using K3S 👍

I am opening this issue (feel free to redirect me if this is not the correct channel) because my AWX deployment fails following your instructions:
manager_logs.txt

plicas\": 1, \"updatedReplicas\": 1}}}\n\r\nPLAY RECAP *********************************************************************\r\nlocalhost                  : ok=32   changed=0    unreachable=0    failed=1    skipped=25   rescued=0    ignored=0   \r\n\n","job":"5713703289679536467","name":"awx","namespace":"awx-vincent","error":"exit status 2"}
{"level":"error","ts":1634033628.8462682,"logger":"controller-runtime.manager.controller.awx-controller","msg":"Reconciler error","name":"awx","namespace":"awx-vincent","error":"event runner on failed","stacktrace":"sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:253\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:214"}
{"level":"info","ts":1634034630.7230046,"logger":"logging_event_handler","msg":"[playbook task start]","name":"awx","namespace":"awx-vincent","gvk":"awx.ansible.com/v1beta1, Kind=AWX","event_type":"playbook_on_task_start","job":"7898136769315806288","EventData.Name":"installer : Patching labels to AWX kind"}

The AWX pods cannot be started:

((0.14.0))$ k get pods
NAME                                               READY   STATUS    RESTARTS   AGE
awx-operator-controller-manager-5486747db4-9xb69   2/2     Running   0          114m
awx-postgres-0                                     1/1     Running   0          93m
awx-84d5c45999-qxgp9                               0/4     Pending   0          93m

Upon inspection, this is because of a persistent volume claim which is not bound:

$ k describe pods awx-84d5c45999-qxgp9
[...]
  awx-projects:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  awx-projects-claim
    ReadOnly:   false
  awx-token-57kdv:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  awx-token-57kdv
    Optional:    false
QoS Class:       Burstable
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                 node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason            Age   From               Message
  ----     ------            ----  ----               -------
  Warning  FailedScheduling  94m   default-scheduler  0/1 nodes are available: 1 pod has unbound immediate PersistentVolumeClaims.

Indeed:

((0.14.0))$ k get persistentvolumeclaims 
NAME                      STATUS    VOLUME                CAPACITY   ACCESS MODES   STORAGECLASS          AGE
awx-projects-claim        Pending                                                   awx-projects-volume   95m
postgres-awx-postgres-0   Bound     awx-postgres-volume   2Gi        RWO            awx-postgres-volume   95m

The failure appears because of:

((0.14.0))$ k describe persistentvolumeclaims awx-projects-claim
Name:          awx-projects-claim
Namespace:     awx-vincent
StorageClass:  awx-projects-volume
Status:        Pending
Volume:        
Labels:        <none>
Annotations:   <none>
Finalizers:    [kubernetes.io/pvc-protection]
Capacity:      
Access Modes:  
VolumeMode:    Filesystem
Used By:       awx-84d5c45999-qxgp9
Events:
  Type     Reason              Age                  From                         Message
  ----     ------              ----                 ----                         -------
  Warning  ProvisioningFailed  37s (x382 over 95m)  persistentvolume-controller  storageclass.storage.k8s.io "awx-projects-volume" not found

The thing is, I cannot find any definition for this storage class and the cluster does not provide it otherwise:

((0.14.0))$ k get storageclasses.storage.k8s.io 
NAME                   PROVISIONER             RECLAIMPOLICY   VOLUMEBINDINGMODE      ALLOWVOLUMEEXPANSION   AGE
local-path (default)   rancher.io/local-path   Delete          WaitForFirstConsumer   false                  7d23h

Two more important notes I think:

  1. I am deploying in my own namespace, which is not awx, rather awx-vincent. I had to change some values under base/ to match for that (files attached: base.zip)
    I can see that one of the claims appears to still be under the awx/ namespace. cf awx/awx-projects-claim in the following output, although I don't know what could be causing the problem (aren't persistent volumes namespace-agnostic?). The describe command above showed it is in the correct namespace too.
((0.14.0))$ k get persistentvolume
NAME                  CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS     CLAIM                                 STORAGECLASS          REASON   AGE
awx-projects-volume   2Gi        RWO            Retain           Released   awx/awx-projects-claim                awx-projects-volume            110m
awx-postgres-volume   2Gi        RWO            Retain           Bound      awx-vincent/postgres-awx-postgres-0   awx-postgres-volume            110m
  2. The problem does not occur for awx-postgres-volume, which seems to use the same notation/storage class and is resolved/bound.

Could you let me know what I could be doing wrong here? Thanks for your help !
