
awx workflow_job_templates launch --wait command fails with ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response')) (awx, open issue)

akshat87 commented on June 2, 2024
awx workflow_job_templates launch --wait command fails with ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response'))

from awx.

Comments (4)

fosterseth commented on June 2, 2024

Do you see the jobs finish in the UI? How long do these workflows run, and how long did the CLI command wait before returning the RemoteDisconnected error?
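One way to answer the timing questions without depending on a single long-lived --wait connection is to poll the job status with short requests and record how long it takes. The following is only a sketch under assumptions: `wait_for_job` and `fetch_status` are hypothetical names, and `fetch_status` stands in for whatever short call you use to read the `status` field of the workflow job (for example one GET against /api/v2/workflow_jobs/<id>/). It is not how the awx CLI implements --wait.

```python
import time
from typing import Callable, Tuple

# Terminal values of the `status` field on an AWX job.
TERMINAL = {"successful", "failed", "error", "canceled"}


def wait_for_job(fetch_status: Callable[[], str],
                 interval: float = 5.0,
                 max_wait: float = 3600.0,
                 sleep: Callable[[float], None] = time.sleep) -> Tuple[str, float]:
    """Poll fetch_status() until a terminal state or until max_wait seconds pass.

    fetch_status is a hypothetical, caller-supplied function that performs one
    short request and returns the job's status string.
    Returns (last_status_seen, elapsed_seconds).
    """
    start = time.monotonic()
    status = "unknown"
    while True:
        try:
            status = fetch_status()
        except OSError:
            # Covers dropped connections such as http.client.RemoteDisconnected
            # and requests.exceptions.ConnectionError; treated as transient here.
            pass
        elapsed = time.monotonic() - start
        if status in TERMINAL or elapsed >= max_wait:
            return status, elapsed
        sleep(interval)
```

Because each poll is a separate short request, a proxy or ingress that closes idle connections should only cost one retry rather than the whole wait. The returned elapsed time also gives a rough answer to how long the workflow actually ran.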


XakV commented on June 2, 2024

I've seen something similar in AWX 23.3.1. The template involves an ansible.builtin.uri call to VMware orchestrator, followed by an ansible.builtin.wait_for_connection. The job log pauses here:

Using module file /usr/local/lib/python3.11/site-packages/ansible/modules/ping.py
Pipelining is enabled.
<host.fqdn> ESTABLISH SSH CONNECTION FOR USER: $ANSIBLE_REMOTE_USER
<host.fqdn> SSH: EXEC ssh -vvv -o ServerAliveInterval=30 -o ControlMaster=auto -o ControlPersist=60 -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -o StrictHostKeyChecking=no -o KbdInteractiveAuthentication=no -o PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey -o PasswordAuthentication=no -o 'User="$ANSIBLE_REMOTE_USER"' -o ConnectTimeout=120 -o 'ControlPath="/tmp/ansible-root-%h"' host.fqdn '/bin/sh -c '"'"'/usr/bin/python && sleep 0'"'"''
wait_for_connection: attempting ping module test
sending connection check: [b'ssh', b'-vvv', b'-o', b'ServerAliveInterval=30', b'-o', b'ControlMaster=auto', b'-o', b'ControlPersist=60', b'-o', b'StrictHostKeyChecking=no', b'-o', b'UserKnownHostsFile=/dev/null', b'-o', b'StrictHostKeyChecking=no', b'-o', b'KbdInteractiveAuthentication=no', b'-o', b'PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey', b'-o', b'PasswordAuthentication=no', b'-o', b'User="$ANSIBLE_REMOTE_USER"', b'-o', b'ConnectTimeout=120', b'-o', b'ControlPath="/tmp/ansible-root-%h"', b'-O', b'check', b'host.fqdn']

While the job log in the WebUI hangs here, the awx-task-runner-blah-blah repeatedly (about every second or so) attempts to make a connection.

You can watch the connection attempts by opening a shell in the task-runner container, identifying the parent ansible process/thread ID, and then inferring the PID/TID from the active children. Essentially, run ls -l /proc and, if you suspect the child PID is in the range of 200 to 399, run while true; do cat /proc/[2,3]*/cmdline; done.
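If Python is available in the container, the same check can be sketched as below. Note that the shell glob /proc/[2,3]* matches every entry starting with 2 or 3 (and a literal comma), not only 200 to 399, so this version filters on the numeric PID instead. The function name `list_cmdlines` and its parameters are made up for illustration.

```python
from pathlib import Path
from typing import Dict


def list_cmdlines(lo: int = 200, hi: int = 399, proc: str = "/proc") -> Dict[int, str]:
    """Return {pid: command line} for PIDs in [lo, hi] that have a cmdline."""
    found: Dict[int, str] = {}
    for entry in Path(proc).iterdir():
        if not entry.name.isdigit():
            continue  # skip non-process entries such as "self" or "meminfo"
        pid = int(entry.name)
        if not lo <= pid <= hi:
            continue
        try:
            raw = (entry / "cmdline").read_bytes()
        except OSError:
            continue  # the process exited between listing and reading
        if raw:
            # Arguments in /proc/<pid>/cmdline are separated by NUL bytes.
            found[pid] = raw.replace(b"\0", b" ").decode(errors="replace").strip()
    return found
```

Calling list_cmdlines() in a loop with a short sleep and printing the result should show the repeated ssh connection checks as they are spawned, assuming the child PIDs really fall in the chosen range.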

I can provide additional info if needed. The AWX install lives on a Rancher cluster running k8s v1.24.17 on RHEL 7 hosts. Ingress is nginx, networking is Canal, and PVCs are provided by Portworx.


XakV commented on June 2, 2024

Adding relevant bits of our ansible.cfg

[defaults]

home = .ansible
roles_path    = roles
playbook_dir = playbooks
transport = smart
collections_path = .ansible/collections:/usr/share/ansible/collections:.venv/lib/python3.11/site-packages/ansible_collections/
remote_user = $ANSIBLE_REMOTE_USER
remote_tmp     = /tmp/$USER/.ansible
gather_subset = all
interpreter_python = auto
host_key_checking = False
timeout = 120
verbosity = 1
module_name = shell
ansible_managed = Ansible managed: {file} modified on %Y-%m-%d %H:%M by root on {host}
system_warnings = True
deprecation_warnings = True
command_warnings = False
callbacks_enabled = ansible.posix.profile_tasks
stdout_callback = yaml
display_skipped_hosts = False
retry_files_enabled = False
var_compression_level = 9
jinja2_extensions = jinja2.ext.do

[callback_profile_tasks]
task_output_limit = 5

[inventory]
enable_plugins=ansible.builtin.constructed, host_list, script, auto, yaml, ini, toml

[privilege_escalation]
become_ask_pass=False
become_method=sudo
become_flags="-iS"

[ssh_connection]

ssh_args = -o ServerAliveInterval=30 -o ControlMaster=auto -o ControlPersist=60 -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null
control_path = /tmp/ansible-root-%%h
pipelining = True
transfer_method = smart

[persistent_connection]

connect_timeout = 30
connect_retries = 30
connect_interval = 1

Note that I'm substituting $ANSIBLE_REMOTE_USER for the actual user name.


XakV commented on June 2, 2024

Found the error that ended the task above.

{"log":"2024-04-26 15:10:21,352 INFO [c3b7da2d511940cd9f42ad53edf60a96] awx.main.scheduler Workflow job 29241 failed due to reason: No error handling path for workflow job node(s) [(4838,error)]. Workflow job node(s) missing unified job template and error handling path [].\n","stream":"stderr","time":"2024-04-26T15:10:21.353827271Z"}
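For anyone searching their own container logs for the same failure, here is a small sketch that unwraps a line like the one above and pulls out the workflow job ID and the reason. It assumes the log/stream/time JSON wrapper shown here and the exact "Workflow job N failed due to reason:" wording; `parse_workflow_failure` is a made-up helper name.

```python
import json
import re

# The log line quoted above, kept as a raw string so the JSON "\n" escape survives.
LINE = r'''{"log":"2024-04-26 15:10:21,352 INFO [c3b7da2d511940cd9f42ad53edf60a96] awx.main.scheduler Workflow job 29241 failed due to reason: No error handling path for workflow job node(s) [(4838,error)]. Workflow job node(s) missing unified job template and error handling path [].\n","stream":"stderr","time":"2024-04-26T15:10:21.353827271Z"}'''

FAILURE_RE = re.compile(r"Workflow job (\d+) failed due to reason: (.*)")


def parse_workflow_failure(raw_line):
    """Return job ID, reason and timestamp from one JSON log line, or None."""
    record = json.loads(raw_line)
    message = record["log"].strip()
    match = FAILURE_RE.search(message)
    if match is None:
        return None
    return {
        "job_id": int(match.group(1)),
        "reason": match.group(2),
        "time": record["time"],
    }


info = parse_workflow_failure(LINE)
print(info["job_id"], info["time"])
print(info["reason"])
```

Lines that are valid JSON but do not contain the failure message return None, so the function can be applied to a whole log file line by line.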

