
broker's People

Contributors

akhil-jha, bherrin3, colehiggins2, damoore044, dependabot[bot], dosas, griffin-sullivan, ichimonji10, jacobcallahan, jameerpathan111, latran, lpramuk, mshriver, ogajduse, omkarkhatavkar, peterdragun, rmynar, rplevka, rujutashinde, swadeley, tpapaioa, tstrych, yanpliu

broker's Issues

[RFE] Implement `broker --version`

Issue

When using broker, it can be hard to figure out which settings.yaml file it is using. The situation is similar to Ansible, where the ansible.cfg file can be specified through an environment variable or located in the current working directory, the user's home directory, or /etc/ansible. Because there are multiple ways to specify the file, Ansible makes the choice obvious through ansible --version, or by printing the file location to stdout during every playbook execution (if run with -v for verbose). It also tells the user which Python executable Ansible will use, since multiple Python versions may be installed in various locations on a given system.

Requested solution

Please implement broker -v/--version so that users can see which settings.yaml, Python executable, and potentially which inventory file broker is using, as well as the path of the broker executable itself.

This information can be useful when debugging problems such as broker failing to read the correct settings or not finding the inventory file.

Example of ansible --version

 ansible --version
ansible 2.9.7
  config file = /etc/ansible/ansible.cfg
  configured module search path = ['/home/kkulkarni/.ansible/plugins/modules', '/usr/share/ansible/plugins/modules']
  ansible python module location = /usr/lib/python3.8/site-packages/ansible
  executable location = /usr/bin/ansible
  python version = 3.8.3 (default, May 15 2020, 00:00:00) [GCC 10.1.1 20200507 (Red Hat 10.1.1-1)]

Alternative solution

Currently the readme does not specify that broker needs to be executed from the same directory where the settings and inventory files are. We should either update the readme to reflect that, or allow the user to set the paths to these files explicitly, so that broker can be invoked from any location on the system as long as it is configured to look at the correct settings file. This would be similar to git config --global, which sets a global configuration that git uses no matter where it is called from.
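
A minimal sketch of what such a report could gather, using only the standard library. The environment variable name BROKER_SETTINGS, the file name, and the lookup order (environment variable, current directory, home directory) are assumptions made for illustration, not broker's actual behavior.

```python
import os
import sys
from pathlib import Path

# Hypothetical names: broker may use a different variable and file name.
SETTINGS_ENV_VAR = "BROKER_SETTINGS"
SETTINGS_FILE_NAME = "broker_settings.yaml"


def find_settings_file(env=None, cwd=None, home=None):
    """Return the first settings file found, or None.

    Assumed lookup order: environment variable, current directory, home directory.
    """
    env = os.environ if env is None else env
    candidates = []
    if env.get(SETTINGS_ENV_VAR):
        candidates.append(Path(env[SETTINGS_ENV_VAR]))
    candidates.append(Path(cwd or Path.cwd()) / SETTINGS_FILE_NAME)
    candidates.append(Path(home or Path.home()) / SETTINGS_FILE_NAME)
    for candidate in candidates:
        if candidate.is_file():
            return candidate
    return None


def version_report(version, settings_file):
    """Build the lines a --version style command could print."""
    return [
        f"broker {version}",
        f"  settings file = {settings_file}",
        f"  executable location = {sys.argv[0]}",
        f"  python executable = {sys.executable}",
        f"  python version = {sys.version.split()[0]}",
    ]
```

Printing version_report(version, find_settings_file()) from a --version handler would give output in the spirit of the ansible --version example above.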

`broker execute` leads to `UnboundLocalError`

(venv) ➜  broker git:(thp) ✗ broker execute
Traceback (most recent call last):
  File "/home/jhenner/work/sat/broker/venv/bin/broker", line 33, in <module>
    sys.exit(load_entry_point('broker', 'console_scripts', 'broker')())
  File "/home/jhenner/work/sat/broker/venv/lib64/python3.9/site-packages/click/core.py", line 829, in __call__
    return self.main(*args, **kwargs)
  File "/home/jhenner/work/sat/broker/venv/lib64/python3.9/site-packages/click/core.py", line 782, in main
    rv = self.invoke(ctx)
  File "/home/jhenner/work/sat/broker/venv/lib64/python3.9/site-packages/click/core.py", line 1259, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/jhenner/work/sat/broker/venv/lib64/python3.9/site-packages/click/core.py", line 1066, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/jhenner/work/sat/broker/venv/lib64/python3.9/site-packages/click/core.py", line 610, in invoke
    return callback(*args, **kwargs)
  File "/home/jhenner/work/sat/broker/venv/lib64/python3.9/site-packages/click/decorators.py", line 21, in new_func
    return f(get_current_context(), *args, **kwargs)
  File "/home/jhenner/work/sat/broker/broker/commands.py", line 302, in execute
    result = broker_inst.execute()
  File "/home/jhenner/work/sat/broker/broker/broker.py", line 135, in execute
    logger.info(f"Using provider {provider.__name__} for execution")
UnboundLocalError: local variable 'provider' referenced before assignment

This is probably because self._provider_actions.items() is empty. The pattern used there:

        for action, arg in self._provider_actions.items():
            provider, method = PROVIDER_ACTIONS[action]
        logger.info(f"Using provider {provider.__name__} for execution")
        return self._act(provider, method)

Is quite suspicious. Is it really meant to use the last action defined in _provider_actions? Why should we iterate over all of them?

I see the pattern repeated many times in broker.py. This violates the DRY principle, and I think it can be done better, but I don't yet understand the design.
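
One possible shape for a shared helper, as a sketch only. PROVIDER_ACTIONS here is a stand-in dictionary (the real one maps to provider classes), BrokerError is a hypothetical exception, and the helper returns the first matching action, whereas the quoted loop ends up with the last one. Which of the two is intended is exactly the open design question.

```python
# Stand-in for broker's PROVIDER_ACTIONS mapping; the real one maps to provider classes.
PROVIDER_ACTIONS = {"workflow": ("AnsibleTower", "execute")}


class BrokerError(Exception):
    """Hypothetical error type used for this illustration."""


def pick_provider(provider_actions):
    """Return (provider, method) for the first known action, or raise a clear error."""
    for action in provider_actions:
        if action in PROVIDER_ACTIONS:
            return PROVIDER_ACTIONS[action]
    # Reached when the mapping is empty or holds no known action,
    # which is the case that currently ends in UnboundLocalError.
    raise BrokerError(
        f"No provider action found in {sorted(provider_actions)}; "
        f"expected one of {sorted(PROVIDER_ACTIONS)}"
    )
```

Each place that repeats the loop could then call pick_provider(self._provider_actions) and get the same error handling.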

'Could not determine an appropriate provider' when

With this in broker_settings.yaml, which has no "nonexistent" nick, one gets an error message that is not very helpful.

nicks:
    rhel7:
        workflow: "deploy-base-rhel"
        rhel_version: "7.9"
        notes: "Requested by broker"

Please improve the error message.

Here is the output of ipdb's where command at my breakpoint:

(venv39) ➜  broker git:(logging) ✗ broker --log-level debug checkout --nick nonexistent
Log level changed to [debug]
[D 210812 13:44:29 commands:19] Executing func=<function checkout at 0x7fbf0487e670>
[D 210812 13:44:29 broker:86] Broker instantiated with kwargs={'nick': 'nonexistent'}
[D 210812 13:44:29 broker:90] kwargs after nick resolution kwargs={}
[D 210812 13:44:29 broker:133] Doing _checkout(): self._provider_actions={}
> /home/jhenner/projects/broker/broker/broker.py(137)_checkout()
-> raise self.BrokerError("Could not determine an appropriate provider")
(Pdb) w
  /home/jhenner/projects/robottelo/venv39/bin/broker(33)<module>()
-> sys.exit(load_entry_point('broker', 'console_scripts', 'broker')())
  /home/jhenner/projects/broker/broker/commands.py(32)__call__()
-> return self.main(*args, **kwargs)
  /home/jhenner/projects/robottelo/venv39/lib64/python3.9/site-packages/click/core.py(1062)main()
-> rv = self.invoke(ctx)
  /home/jhenner/projects/robottelo/venv39/lib64/python3.9/site-packages/click/core.py(1668)invoke()
-> return _process_result(sub_ctx.command.invoke(sub_ctx))
  /home/jhenner/projects/robottelo/venv39/lib64/python3.9/site-packages/click/core.py(1404)invoke()
-> return ctx.invoke(self.callback, **ctx.params)
  /home/jhenner/projects/robottelo/venv39/lib64/python3.9/site-packages/click/core.py(763)invoke()
-> return __callback(*args, **kwargs)
  /home/jhenner/projects/broker/broker/commands.py(20)wrapper()
-> retval = func(*wargs, **wkwargs)
  /home/jhenner/projects/robottelo/venv39/lib64/python3.9/site-packages/click/decorators.py(26)new_func()
-> return f(get_current_context(), *args, **kwargs)
  /home/jhenner/projects/broker/broker/commands.py(182)checkout()
-> broker_inst.checkout()
  /home/jhenner/projects/broker/broker/broker.py(156)checkout()
-> hosts = self._checkout()
  /home/jhenner/projects/broker/broker/broker.py(57)mp_split()
-> return self.func(instance, *args, **kwargs)
> /home/jhenner/projects/broker/broker/broker.py(137)_checkout()
-> raise self.BrokerError("Could not determine an appropriate provider")
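
A sketch of a more helpful failure, assuming nick resolution can see the nicks mapping from the settings file. resolve_nick and BrokerError are hypothetical names for this illustration, not broker's actual functions.

```python
class BrokerError(Exception):
    """Hypothetical error type for this illustration."""


def resolve_nick(nick, nicks):
    """Return the arguments stored under `nick`, or fail with a descriptive message."""
    try:
        return dict(nicks[nick])
    except KeyError:
        known = ", ".join(sorted(nicks)) or "<none>"
        raise BrokerError(
            f"Unknown nick {nick!r}; nicks defined in the settings file: {known}"
        ) from None


# The nicks section from the settings shown above.
settings_nicks = {
    "rhel7": {
        "workflow": "deploy-base-rhel",
        "rhel_version": "7.9",
        "notes": "Requested by broker",
    }
}
```

With this, resolve_nick("nonexistent", settings_nicks) fails at the point where the nick is looked up and names both the missing nick and the available ones, instead of failing later with an empty set of provider actions.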

Failed to log in to a guest on a non-default port

Hey all,
The virt-who tests need to log in to the kubevirt guest on a non-default port (such as 32365), but the login fails with this error:

tests/foreman/virtwho/ui/test_kubevirt.py:67: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
robottelo/virtwho_utils.py:269: in deploy_configure_by_command
    register_system(get_system(hypervisor_type), org=org)
robottelo/virtwho_utils.py:99: in register_system
    runcmd('subscription-manager unregister', system)
robottelo/virtwho_utils.py:86: in runcmd
    result = ssh.command(cmd, **system, timeout=timeout, output_format=output_format)
robottelo/ssh.py:57: in command
    result = client.execute(cmd, timeout=timeout)
../3.8_rot/lib/python3.8/site-packages/broker/hosts.py:84: in execute
    res = self.session.run(command, timeout=timeout)
../3.8_rot/lib/python3.8/site-packages/broker/hosts.py:35: in session
    self.connect()
../3.8_rot/lib/python3.8/site-packages/broker/hosts.py:57: in connect
    self._session = Session(
../3.8_rot/lib/python3.8/site-packages/broker/session.py:45: in __init__
    self.session.userauth_password(user, kwargs["password"])
ssh2/session.pyx:321: in ssh2.session.Session.userauth_password
    ???
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

>   ???
E   ssh2.exceptions.AuthenticationError

From an initial analysis, the session at https://github.com/SatelliteQE/broker/blob/master/broker/session.py#L41 seems to default to port 22 for the guest.
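
A sketch of the kind of change that would help, assuming the session opens a plain TCP socket before the SSH handshake. open_ssh_socket is a hypothetical helper, not broker's API; the point is only that the port becomes a parameter with 22 as its default.

```python
import socket


def open_ssh_socket(hostname, port=22, timeout=10.0):
    """Open the TCP connection an SSH session would run over.

    `port` defaults to 22 but can be overridden by the caller.
    """
    return socket.create_connection((hostname, int(port)), timeout=timeout)
```

A caller that knows the guest listens on 32365 would pass port=32365, and the host or session constructor would need to forward that value.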

Attribute error during the checkout

Reproducer

$ broker --log-level debug checkout --workflow deploy-sat-jenkins --template satellite-6.10 --host_type satellite
Log level changed to [debug]
[D 210726 13:59:25 broker:86] Broker instantiated with kwargs={'workflow': 'deploy-sat-jenkins', 'template': 'satellite-6.10', 'host_type': 'satellite'}
[I 210726 13:59:25 broker:130] Using provider AnsibleTower to checkout
[D 210726 13:59:25 ansible_tower:77] AnsibleTower instantiated with kwargs={'workflow': 'deploy-sat-jenkins', 'template': 'satellite-6.10', 'host_type': 'satellite'}
[I 210726 13:59:26 ansible_tower:102] Using token authentication
[I 210726 13:59:27 ansible_tower:423] No inventory specified, Ansible Tower will use a default.
[D 210726 13:59:27 ansible_tower:424] Launching workflow: https://infra-ansible-tower-01.ourdomain.com/api/v2/workflow_job_templates/56/
    payload={'extra_vars': "{'workflow': 'deploy-sat-jenkins', 'template': 'satellite-6.10', 'host_type': 'satellite'}"}
[I 210726 13:59:28 ansible_tower:433] Waiting for job: 
    API: https://infra-ansible-tower-01.ourdomain.com/api/v2/workflow_jobs/825035/
    UI: https://infra-ansible-tower-01.ourdomain.com/#/workflows/825035
[D 210726 14:21:52 broker:111] {
        "id": 825035,
        "type": "workflow_job",
        "url": "/api/v2/workflow_jobs/825035/",
        "related": {
            "created_by": "/api/v2/users/13/",
            "modified_by": "/api/v2/users/13/",
            "unified_job_template": "/api/v2/workflow_job_templates/56/",
            "workflow_job_template": "/api/v2/workflow_job_templates/56/",
            "notifications": "/api/v2/workflow_jobs/825035/notifications/",
            "workflow_nodes": "/api/v2/workflow_jobs/825035/workflow_nodes/",
            "labels": "/api/v2/workflow_jobs/825035/labels/",
            "activity_stream": "/api/v2/workflow_jobs/825035/activity_stream/",
            "relaunch": "/api/v2/workflow_jobs/825035/relaunch/",
            "cancel": "/api/v2/workflow_jobs/825035/cancel/"
        },
        "summary_fields": {
            "organization": {
                "id": 2,
                "name": "Satellite",
                "description": ""
            },
            "inventory": {
                "id": 50,
                "name": "satlab-rhv-02-inventory",
                "description": "RHV host(s) for QE contributor and CI pipeline use, maintained by SatQE SatLab team. QE has full control of network for PXE and local network creation.",
                "has_active_failures": true,
                "total_hosts": 180,
                "hosts_with_active_failures": 7,
                "total_groups": 10,
                "has_inventory_sources": true,
                "total_inventory_sources": 2,
                "inventory_sources_with_failures": 0,
                "organization_id": 2,
                "kind": ""
            },
            "workflow_job_template": {
                "id": 56,
                "name": "deploy-sat-jenkins",
                "description": "Workflow to deploy sat-jenkins with separated postdeploy"
            },
            "unified_job_template": {
                "id": 56,
                "name": "deploy-sat-jenkins",
                "description": "Workflow to deploy sat-jenkins with separated postdeploy",
                "unified_job_type": "workflow_job"
            },
            "created_by": {
                "id": 13,
                "username": "ogajduse",
                "first_name": "Ondrej",
                "last_name": "Gajdusek"
            },
            "modified_by": {
                "id": 13,
                "username": "ogajduse",
                "first_name": "Ondrej",
                "last_name": "Gajdusek"
            },
            "user_capabilities": {
                "delete": true,
                "start": true
            },
            "labels": {
                "count": 0,
                "results": []
            }
        },
        "created": "2021-07-26T11:59:27.760765Z",
        "modified": "2021-07-26T11:59:28.268990Z",
        "name": "deploy-sat-jenkins",
        "description": "Workflow to deploy sat-jenkins with separated postdeploy",
        "unified_job_template": 56,
        "launch_type": "manual",
        "status": "successful",
        "failed": false,
        "started": "2021-07-26T11:59:28.264982Z",
        "finished": "2021-07-26T12:21:52.082022Z",
        "canceled_on": null,
        "elapsed": 1343.817,
        "job_args": "",
        "job_cwd": "",
        "job_env": {},
        "job_explanation": "",
        "result_traceback": "",
        "workflow_job_template": 56,
        "extra_vars": "{\"host_type\": \"satellite\", \"deploy_scenario\": \"sat-jenkins\", \"deploy_snap_version\": \"\", \"deploy_sat_version\": \"\", \"deploy_template_name\": \"\", \"deploy_rhel_version\": \"7\", \"workflow\": \"deploy-sat-jenkins\", \"template\": \"satellite-6.10\"}",
        "allow_simultaneous": true,
        "job_template": null,
        "is_sliced_job": false,
        "inventory": 50,
        "limit": null,
        "scm_branch": null,
        "webhook_service": "",
        "webhook_credential": null,
        "webhook_guid": ""
    }
[D 210726 14:21:52 ansible_tower:184] Attempting to merge: deploy-sat-jenkins
[D 210726 14:21:53 ansible_tower:206] {
        "id": 184129,
        "type": "workflow_job_node",
        "url": "/api/v2/workflow_job_nodes/184129/",
        "related": {
            "credentials": "/api/v2/workflow_job_nodes/184129/credentials/",
            "success_nodes": "/api/v2/workflow_job_nodes/184129/success_nodes/",
            "failure_nodes": "/api/v2/workflow_job_nodes/184129/failure_nodes/",
            "always_nodes": "/api/v2/workflow_job_nodes/184129/always_nodes/",
            "unified_job_template": "/api/v2/job_templates/21/",
            "job": "/api/v2/jobs/825107/",
            "workflow_job": "/api/v2/workflow_jobs/825035/"
        },
        "summary_fields": {
            "job": {
                "id": 825107,
                "name": "satlab-tower-deploy-set-stats-for-jenkins-automation-rhvm-02-wf",
                "description": "Set Stats output for Jenkins and Automation to use at the end of VM Deployment",
                "status": "successful",
                "failed": false,
                "elapsed": 11.148,
                "type": "job"
            },
            "workflow_job": {
                "id": 825035,
                "name": "deploy-sat-jenkins",
                "description": "Workflow to deploy sat-jenkins with separated postdeploy"
            },
            "unified_job_template": {
                "id": 21,
                "name": "satlab-tower-deploy-set-stats-for-jenkins-automation-rhvm-02-wf",
                "description": "Set Stats output for Jenkins and Automation to use at the end of VM Deployment",
                "unified_job_type": "job"
            }
        },
        "created": "2021-07-26T11:59:27.820927Z",
        "modified": "2021-07-26T12:21:26.065916Z",
        "extra_data": {},
        "inventory": null,
        "scm_branch": null,
        "job_type": null,
        "job_tags": null,
        "skip_tags": null,
        "limit": null,
        "diff_mode": null,
        "verbosity": null,
        "job": 825107,
        "workflow_job": 825035,
        "unified_job_template": 21,
        "success_nodes": [],
        "failure_nodes": [],
        "always_nodes": [],
        "all_parents_must_converge": false,
        "do_not_run": false,
        "identifier": "3e7293dd-ce45-4132-94f7-6cd2c08576d1"
    }
[D 210726 14:21:53 ansible_tower:184] Attempting to merge: satlab-tower-deploy-set-stats-for-jenkins-automation-rhvm-02-wf
[D 210726 14:21:53 ansible_tower:188] Found artifacts: {'deploy_snap_version': '10.0', 'name': 'ogajduse-sat-jenkins-6.10.0-10.0-4d17b8f4', 'host_type': 'satellite', 'os_distribution_version': '7.9', 'fqdn': 'dhcp-3-43.ourdomain.com', 'deploy_rhel_version': '7.9', 'template': 'satellite-6.10', 'tower_inventory': 'satlab-rhv-02-inventory', 'deploy_sat_version': '6.10.0', 'os_distribution': 'RedHat', 'reported_devices': {'nics': ['lo', 'eth0']}}
[D 210726 14:21:53 ansible_tower:371] {'deploy_snap_version': '10.0', 'name': 'ogajduse-sat-jenkins-6.10.0-10.0-4d17b8f4', 'host_type': 'satellite', 'os_distribution_version': '7.9', 'fqdn': 'dhcp-3-43.ourdomain.com', 'deploy_rhel_version': '7.9', 'template': 'satellite-6.10', 'tower_inventory': 'satlab-rhv-02-inventory', 'deploy_sat_version': '6.10.0', 'os_distribution': 'RedHat', 'reported_devices_nics': ['lo', 'eth0']}
[D 210726 14:21:53 ansible_tower:382] hostname: dhcp-3-43.ourdomain.com, name: ogajduse-sat-jenkins-6.10.0-10.0-4d17b8f4, host type: host
[D 210726 14:21:53 broker:135] host=<broker.hosts.Host object at 0x7fa5e8cd46d0>
[I 210726 14:21:53 broker:138] Host: dhcp-3-43.ourdomain.com
[I 210726 14:21:53 broker:130] Using provider AnsibleTower to checkout
[D 210726 14:21:53 ansible_tower:77] AnsibleTower instantiated with kwargs={'workflow': 'deploy-sat-jenkins', 'template': 'satellite-6.10', 'host_type': 'satellite'}
[I 210726 14:21:53 ansible_tower:102] Using token authentication
[E 210726 14:21:54 exceptions:10] getattr(): attribute name must be string
    Traceback (most recent call last):
      File "/home/ogajduse/.virtualenvs/broker/lib/python3.9/site-packages/broker/commands.py", line 18, in __call__
        return self.main(*args, **kwargs)
      File "/home/ogajduse/.virtualenvs/broker/lib/python3.9/site-packages/click/core.py", line 782, in main
        rv = self.invoke(ctx)
      File "/home/ogajduse/.virtualenvs/broker/lib/python3.9/site-packages/click/core.py", line 1259, in invoke
        return _process_result(sub_ctx.command.invoke(sub_ctx))
      File "/home/ogajduse/.virtualenvs/broker/lib/python3.9/site-packages/click/core.py", line 1066, in invoke
        return ctx.invoke(self.callback, **ctx.params)
      File "/home/ogajduse/.virtualenvs/broker/lib/python3.9/site-packages/click/core.py", line 610, in invoke
        return callback(*args, **kwargs)
      File "/home/ogajduse/.virtualenvs/broker/lib/python3.9/site-packages/click/decorators.py", line 21, in new_func
        return f(get_current_context(), *args, **kwargs)
      File "/home/ogajduse/.virtualenvs/broker/lib/python3.9/site-packages/broker/commands.py", line 168, in checkout
        broker_inst.checkout()
      File "/home/ogajduse/.virtualenvs/broker/lib/python3.9/site-packages/broker/broker.py", line 146, in checkout
        hosts = self._checkout()
      File "/home/ogajduse/.virtualenvs/broker/lib/python3.9/site-packages/broker/broker.py", line 57, in mp_split
        return self.func(instance, *args, **kwargs)
      File "/home/ogajduse/.virtualenvs/broker/lib/python3.9/site-packages/broker/broker.py", line 132, in _checkout
        host = self._act(provider, method, checkout=True)
      File "/home/ogajduse/.virtualenvs/broker/lib/python3.9/site-packages/broker/broker.py", line 110, in _act
        result = getattr(provider_inst, method)(**self._kwargs)
    TypeError: getattr(): attribute name must be string
[E 210726 14:21:54 exceptions:12] BrokerError: getattr(): attribute name must be string

Identify why test_mp_checkout is failing and how to fix it

tests/test_broker.py test_mp_checkout is currently failing since the multiprocessing queue hangs on get.
This works perfectly fine when using the AnsibleTower provider, but this test uses the TestProvider.

Alternative solutions that were tried but failed:

  • Use starmap to get around a queue
  • Introduce a time delay in TestProvider's test_action method
  • Use a multiprocessing manager queue
  • Use a multiprocessing manager list
  • Make the Host class instances pickleable to be compatible with above

BUG: `sftp_write` fails if the destination file does not exist in the container

Problem statement

The sftp_write method that we use in robottelo behaves differently depending on the broker provider: Container or AAP.

https://github.com/SatelliteQE/robottelo/blob/eca242110a5c9b4040d62d04a7094d00cebf5aa7/robottelo/hosts.py#L686-L697

Traceback

../../lib64/python3.8/site-packages/docker/api/client.py:268: in _raise_for_status
    response.raise_for_status()
../../lib64/python3.8/site-packages/requests/models.py:1021: in raise_for_status
    raise HTTPError(http_error_msg, response=self)
E   requests.exceptions.HTTPError: 404 Client Error: Not Found for url: http+docker://ssh/v1.41/containers/9e2407883fed4af87a97ab80579150153a3ef2a7fbdcd0558618ac6b3bb3762e/archive?path=%2Fetc%2Frhsm%2Ffacts%2Flocations.facts

The above exception was the direct cause of the following exception:
tests/foreman/ui/test_dashboard.py:257: in test_positive_user_access_with_host_filter
    repos_collection.setup_virtual_machine(
robottelo/host_helpers/repository_mixins.py:725: in setup_virtual_machine
    vm.contenthost_setup(
robottelo/hosts.py:1007: in contenthost_setup
    self.set_facts({'locations.facts': {'foreman_location': str(location_title)}})
robottelo/host_helpers/contenthost_mixins.py:145: in set_facts
    self.put(tf.name, f'/etc/rhsm/facts/{filename}')
robottelo/hosts.py:697: in put
    self.session.sftp_write(source=local_path, destination=remote_path)
../../lib64/python3.8/site-packages/broker/session.py:234: in sftp_write
    self._cont_inst._cont_inst.put_archive(str(destination), tar.read_bytes())
../../lib64/python3.8/site-packages/docker/models/containers.py:334: in put_archive
    return self.client.api.put_archive(self.id, path, data)
../../lib64/python3.8/site-packages/docker/utils/decorators.py:19: in wrapped
    return f(self, resource_id, *args, **kwargs)
../../lib64/python3.8/site-packages/docker/api/container.py:976: in put_archive
    self._raise_for_status(res)
../../lib64/python3.8/site-packages/docker/api/client.py:270: in _raise_for_status
    raise create_api_error_from_http_exception(e) from e
../../lib64/python3.8/site-packages/docker/errors.py:39: in create_api_error_from_http_exception
    raise cls(e, response=response, explanation=explanation) from e
E   docker.errors.NotFound: 404 Client Error for http+docker://ssh/v1.41/containers/9e2407883fed4af87a97ab80579150153a3ef2a7fbdcd0558618ac6b3bb3762e/archive?path=%2Fetc%2Frhsm%2Ffacts%2Flocations.facts: Not Found ("Could not find the file /etc/rhsm/facts/locations.facts in container 9e2407883fed4af87a97ab80579150153a3ef2a7fbdcd0558618ac6b3bb3762e")

If the file does not exist on the host, the above exception is raised.

Proposed solution

Broker should be smart and create the file in the container if it does not exist, as it does on a VM (AnsibleTower provider).

Notes

@peterdragun and I played around with it a bit and found that if the destination file exists, it is overwritten by the source file.
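
One possible direction, sketched under the assumption that the container API's put_archive extracts the archive into an existing directory (the Docker SDK documents its path argument that way). build_put_archive_args is a hypothetical helper: it names the archive member after the destination file and targets the destination's parent directory rather than the file itself.

```python
import io
import tarfile
from pathlib import Path, PurePosixPath


def build_put_archive_args(source, destination):
    """Return (directory, tar_bytes) for uploading `source` as `destination`.

    The archive holds a single member named after the destination file, so
    extracting it into the destination's parent directory creates the file
    if it is missing and overwrites it if it exists.
    """
    dest = PurePosixPath(destination)
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w") as tar:
        tar.add(str(Path(source)), arcname=dest.name)
    return str(dest.parent), buffer.getvalue()
```

The two values would then be passed to put_archive in place of the file path. The parent directory still has to exist in the container; creating it is not covered by this sketch.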

Race condition danger when using multiple instances of broker

In PR #53 I am introducing a fix for inventory handling when checkout is done in concurrent mode, but as @mshriver just pointed out to me, even with that PR merged there is still a danger when running broker in the background.

There is no mechanism for exclusive access when writing the inventory file in update_inventory, so there is a time interval in which a race condition can occur. We are dealing with the classic multiple readers/writers concurrency problem here. https://en.wikipedia.org/wiki/Readers%E2%80%93writer_lock
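
A minimal sketch of exclusive access using an advisory lock from the standard library. It assumes a POSIX system (fcntl), a sidecar lock file next to the inventory, and JSON instead of YAML to stay dependency-free; locked_inventory and add_host are hypothetical names, not broker functions.

```python
import fcntl
import json
from contextlib import contextmanager
from pathlib import Path


@contextmanager
def locked_inventory(path):
    """Hold an exclusive lock on a sidecar lock file while the inventory is in use."""
    lock_path = Path(f"{path}.lock")
    with open(lock_path, "w") as lock_file:
        fcntl.flock(lock_file, fcntl.LOCK_EX)  # blocks until other holders release
        try:
            yield Path(path)
        finally:
            fcntl.flock(lock_file, fcntl.LOCK_UN)


def add_host(path, host):
    """Read-modify-write the inventory under the lock (JSON used for brevity)."""
    with locked_inventory(path) as inventory:
        hosts = json.loads(inventory.read_text()) if inventory.exists() else []
        hosts.append(host)
        inventory.write_text(json.dumps(hosts))
    return hosts
```

The important property is that the read and the write happen inside the same locked region, so two broker processes cannot both read the old inventory and then overwrite each other's update. The lock is advisory, so it only helps if every writer goes through it.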

Ansible Automation Platform API path changes

Problem

With the addition of Ansible Automation Platform, the API URLs for workflows and job templates change.

Other changes are more than likely part of the update, but these are the most relevant to primary functionality.

Proposed Solution

Add a new provider, AnsibleController.

This will allow for distinct changes with Ansible Automation Platform while still retaining support for legacy Ansible Tower.

This will also allow a seamless move from one provider to the other (or use of both at the same time) with simple changes to broker_settings.yml, without requiring a hard switchover.

Examples and new APIs can be provided.

Traceback when checking out multiple VMs with `connect=True`

Checkout fails when _count > 1 and connect=True:

>>> from broker.broker import VMBroker
>>> vmb = VMBroker(nick='rhel7', _count=2)
>>> vms = vmb.checkout()
>>> vms
[<broker.hosts.Host object at 0x7fd4d8605ac0>, <broker.hosts.Host object at 0x7fd4d8605fa0>]
>>> vmb.checkin()

>>> vms = vmb.checkout(connect=True)
[...]
concurrent.futures.process._RemoteTraceback: 
"""
Traceback (most recent call last):
  File "/usr/lib64/python3.9/concurrent/futures/process.py", line 208, in _sendback_result
    result_queue.put(_ResultItem(work_id, result=result,
  File "/usr/lib64/python3.9/multiprocessing/queues.py", line 372, in put
    obj = _ForkingPickler.dumps(obj)
  File "/usr/lib64/python3.9/multiprocessing/reduction.py", line 51, in dumps
    cls(buf, protocol).dump(obj)
  File "stringsource", line 2, in ssh2.session.Session.__reduce_cython__
TypeError: no default __reduce__ due to non-trivial __cinit__
"""

This also happens when using the broker instance in a with block, because VMBroker.__enter__() calls checkout() with connect=True.

When _count > 1, VMBroker._checkout() runs the checkout tasks in parallel, using concurrent.futures. That module uses pickle to serialize the host instances and pass them between the worker and parent process. When connect=True, then _checkout() also establishes an ssh connection to each host and stores a reference to the un-pickle-able ssh connection instance in host.session. The traceback is the result of trying to pickle that attribute.

Proposed patch is to move the calls to host.connect() outside of _checkout(), which should only perform the actions that need to be run in parallel, and into the calling method checkout():

$ git --no-pager diff --staged
diff --git a/broker/broker.py b/broker/broker.py
index 8b09943..2da3ed2 100644
--- a/broker/broker.py
+++ b/broker/broker.py
@@ -100,11 +100,9 @@ class VMBroker:
             return result
 
     @mp_decorator
-    def _checkout(self, connect=False):
+    def _checkout(self):
         """checkout one or more VMs
 
-        :param connect: Boolean whether to establish host ssh connection
-
         :return: List of Host objects
         """
         hosts = []
@@ -112,10 +110,8 @@ class VMBroker:
             provider, method = PROVIDER_ACTIONS[action]
             logger.info(f"Using provider {provider.__name__} to checkout")
             host = self._act(provider, method, checkout=True)
-            logger.debug(f"host={host} connect={connect}")
+            logger.debug(f"host={host}")
             if host:
-                if connect:
-                    host.connect()
                 hosts.append(host)
                 logger.info(f"{host.__class__.__name__}: {host.hostname}")
         return hosts
@@ -127,7 +123,10 @@ class VMBroker:
 
         :return: Host obj or list of Host objects
         """
-        hosts = self._checkout(connect=connect)
+        hosts = self._checkout()
+        if connect:
+            for host in hosts:
+                host.connect()
         self._hosts.extend(hosts)
         helpers.update_inventory([host.to_dict() for host in hosts])
         return hosts if not len(hosts) == 1 else hosts[0]
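
The underlying constraint can be reproduced without ssh2, as a small sketch. A threading.Lock stands in for the session only because it, too, cannot be pickled; the host object and helper names are made up for the illustration.

```python
import pickle
import threading
from types import SimpleNamespace


def connect(host):
    """Attach a stand-in session that, like the ssh2 session, cannot be pickled."""
    host.session = threading.Lock()


def can_pickle(obj):
    """Return True if `obj` survives pickling, as required to cross process boundaries."""
    try:
        pickle.dumps(obj)
        return True
    except (TypeError, pickle.PicklingError):
        return False


host = SimpleNamespace(hostname="host1.example.com", session=None)
before_connect = can_pickle(host)  # plain data only
connect(host)
after_connect = can_pickle(host)   # the session attribute now blocks pickling
```

This is why connecting in the parent process, after the worker results have been collected, avoids the traceback.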

Add a possibility to have multiple providers of the same type

Having a possibility to choose from multiple providers of the same type, defined in the broker_settings.yaml, would be handy.

An initial idea for a possible broker_settings.yaml design containing multiple AnsibleTower providers:

inventory_file: "inventory.yaml"
default_provider: at-instance1
AnsibleTower:
  - provider_nick: at-instance1 
    base_url: https://my-at-instance1.com/
    username: admin
    password: awesomepassword1
  - provider_nick: at-instance2 
    base_url: https://my-at-instance2.com/
    username: admin
    password: awesomepassword2
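
A sketch of how an instance could be chosen from such a layout once the YAML has been loaded into a dictionary. select_instance is a hypothetical helper, and the fallback to default_provider is an assumption about how the proposed key would be used.

```python
# The proposed settings, as they might look after loading (credentials omitted).
settings = {
    "default_provider": "at-instance1",
    "AnsibleTower": [
        {"provider_nick": "at-instance1", "base_url": "https://my-at-instance1.com/"},
        {"provider_nick": "at-instance2", "base_url": "https://my-at-instance2.com/"},
    ],
}


def select_instance(settings, provider_type, nick=None):
    """Return the settings of one provider instance, falling back to the default nick."""
    wanted = nick or settings.get("default_provider")
    for instance in settings.get(provider_type, []):
        if instance.get("provider_nick") == wanted:
            return instance
    raise KeyError(f"No {provider_type} instance with provider_nick {wanted!r}")
```

Calling select_instance(settings, "AnsibleTower", "at-instance2") returns the second entry, and omitting the nick returns the default one.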

[RFE] Implement VM snapshots handling

In several cases I have found it useful to snapshot the VM state and restore it later (for example, when a setup takes a while to create and you need to run a test repeatedly from a certain point). This is not currently possible through the VM portal.

Is it possible (and worth it) to implement this functionality into broker?

Broker does not provide a way to pass non-trivial variables as an argument

I would like broker to accept JSON as an argument.

Let's assume the following JSON.

[{"baseurl": "http://my.mirror/rpm", "name": "base"}, {"baseurl": "http://my.mirror/rpm","name": "optional"}]

I was not able to pass this data structure as an argument.

Users might modify their broker_settings.yaml and create their own nick for this purpose. However, modifying broker_settings.yaml is not a desired workflow within CI.

I can imagine broker either accepting the following:

broker checkout --workflow my-workflow --myarg '[{"baseurl": "http://my.mirror/rpm", "name": "base"}, {"baseurl": "http://my.mirror/rpm","name": "optional"}]' --my_second_arg foo

or provide an option to pass a file with the JSON data structure as an argument like so:

myvars.json:

{"myarg":[{"baseurl": "http://my.mirror/rpm", "name": "base"}, {"baseurl": "http://my.mirror/rpm","name": "optional"}], "my_second_arg": "foo"}

Command using the variable file:

broker checkout --workflow my-workflow --vars-file myvars.json
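
Both variants could be handled along these lines, as a sketch. parse_value and collect_vars are hypothetical helpers, and letting CLI arguments override values from the file is an assumption about precedence.

```python
import json
from pathlib import Path


def parse_value(raw):
    """Decode a CLI value as JSON when possible, otherwise keep the raw string."""
    try:
        return json.loads(raw)
    except json.JSONDecodeError:
        return raw


def collect_vars(cli_args, vars_file=None):
    """Merge variables from an optional JSON file with CLI arguments (CLI wins)."""
    merged = {}
    if vars_file:
        merged.update(json.loads(Path(vars_file).read_text()))
    merged.update({key: parse_value(value) for key, value in cli_args.items()})
    return merged
```

One caveat of decoding every value as JSON: strings such as 7.9 or true would become a number and a boolean, so a real implementation might restrict decoding to values that start with [ or {.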

Initial VM Provider list

What providers would we want to support in an initial list of VM provisioning services?

This issue is about assembling a handful of services that we could strive to include, initially. If I'm misunderstanding project goals, feel free to rehash, but in my mind, this would include things like being able to grab a VM with relevant info arbitrarily from things like:

  • AWS
  • Google Cloud
  • Azure
  • Tower/Satellite capabilities here? I know that's somewhat implemented, but what other products would be used?

What items would we want to support first and foremost? What are the most commonly useful ones in this case when dealing with different providers?

Reconsider the name?

I'd like to suggest that some alternate names for this project be considered...

Reason 1: "broker" is a very generic name, it really does not advertise what the project is.
Reason 2: There are already lots of projects with nearly or exactly the same name, including a "broker" package on PyPI so we won't be able to publish this.

Initial suggestions to consider:

  • machine-factory
  • vmfactory
  • boxfactory

Limit AnsibleTower job templates to those a user can execute

Ansible Tower admins can limit what users are able to execute. However, this limitation doesn't necessarily mean that a user can't view job templates they aren't able to execute.
The resolution to this issue should be Broker filtering job templates to only those that a user is able to execute.

                "user_capabilities": {
                    "edit": true,
                    "delete": true,
                    "start": true,
                    "schedule": true,
                    "copy": true
                },
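
A rough sketch of the filtering idea, assuming each job template is available as a dict and that the user_capabilities block sits under summary_fields, as it usually does in AWX API responses. The function name executable_templates is made up for this example.

```python
def executable_templates(templates):
    """Return only the templates whose user_capabilities allow 'start'."""
    allowed = []
    for tpl in templates:
        # Assumption: capabilities live under summary_fields.user_capabilities.
        caps = tpl.get("summary_fields", {}).get("user_capabilities", {})
        if caps.get("start"):
            allowed.append(tpl)
    return allowed
```

Templates that lack the block entirely are treated as not executable, which errs on the side of hiding them.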

Improve logging in Broker

When the debug logging option is selected, broker doesn't output much more information than when the info level is selected. It should output more lines about what it does, like:

[I 210803 15:23:32 commands:230] Pulling local inventory
[I 210803 15:23:33 commands:231] Making GET request [XXX] on [awx...redhat.com] ...
[I 210803 15:23:35 commands:231] Response for request [XXX]:
some content
of what it got
from the Tower
server

BUG: `AssertionError: can only join a child process` when trying to run the process in the background

$ python -V
Python 3.10.2

$ broker --version
Version: 0.1.35
Broker Directory: /home/ogajduse/repos/broker
Settings File: /home/ogajduse/repos/broker/broker_settings.yaml
Inventory File: /home/ogajduse/repos/broker/inventory.yaml
Log File: /home/ogajduse/repos/broker/logs/broker.log
$ broker extend ogajduse-sat-jenkins-6.11.0-18.0-rhel8.5-118a8f2b ogajduse-sat-jenkins-6.11.0-18.5-rhel8.5-3b881d52 --new-expire-time '2026-12-31 23:59'
[INFO 220513 13:18:05] Using token authentication
[INFO 220513 13:18:06] Using token authentication
[INFO 220513 13:18:07] Extending host None
[INFO 220513 13:18:07] Extending host None
[INFO 220513 13:18:07] Using token authentication
[INFO 220513 13:18:07] Using token authentication
[INFO 220513 13:18:10] Waiting for job: 
    API: https://some-aap.com/api/v2/workflow_jobs/445559/
    UI: https://some-aap.com/#/jobs/workflow/445559/output
[INFO 220513 13:18:10] Waiting for job: 
    API: https://some-aap.com/api/v2/workflow_jobs/445560/
    UI: https://some-aap.com/#/jobs/workflow/445560/output
^C
Ending Broker while running won't end processes being monitored.
Would you like to switch Broker to run in the background?
[y/n]: 
Ending Broker while running won't end processes being monitored.
Would you like to switch Broker to run in the background?
[y/n]: 
Ending Broker while running won't end processes being monitored.
Would you like to switch Broker to run in the background?
[y/n]: y
[INFO 220513 13:18:34] Running broker in the background with pid: 19954

Aborted!
Exception ignored in atexit callback: <function _exit_function at 0x7fd03406a440>
Traceback (most recent call last):
  File "/usr/lib64/python3.10/multiprocessing/util.py", line 357, in _exit_function
    p.join()
  File "/usr/lib64/python3.10/multiprocessing/process.py", line 147, in join
    assert self._parent_pid == os.getpid(), 'can only join a child process'
AssertionError: can only join a child process

"IndexError: pop from empty list" when checking out deploy-sat-lite workflow

$ broker checkout --workflow deploy-sat-lite --template satellite-6.8-latest
[INFO 200618 12:59:52] Using provider AnsibleTower to checkout
Traceback (most recent call last):
  File "/home/mzalewsk/.virtualenvs/broker/bin/broker", line 11, in <module>
    load_entry_point('broker', 'console_scripts', 'broker')()
  File "/home/mzalewsk/.virtualenvs/broker/lib64/python3.8/site-packages/click/core.py", line 829, in __call__
    return self.main(*args, **kwargs)
  File "/home/mzalewsk/.virtualenvs/broker/lib64/python3.8/site-packages/click/core.py", line 782, in main
    rv = self.invoke(ctx)
  File "/home/mzalewsk/.virtualenvs/broker/lib64/python3.8/site-packages/click/core.py", line 1259, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/mzalewsk/.virtualenvs/broker/lib64/python3.8/site-packages/click/core.py", line 1066, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/mzalewsk/.virtualenvs/broker/lib64/python3.8/site-packages/click/core.py", line 610, in invoke
    return callback(*args, **kwargs)
  File "/home/mzalewsk/.virtualenvs/broker/lib64/python3.8/site-packages/click/decorators.py", line 21, in new_func
    return f(get_current_context(), *args, **kwargs)
  File "/home/mzalewsk/sources/broker/broker/commands.py", line 56, in checkout
    broker_inst.checkout()
  File "/home/mzalewsk/sources/broker/broker/broker.py", line 55, in checkout
    host = self._act(provider, method, checkout=True)
  File "/home/mzalewsk/sources/broker/broker/broker.py", line 39, in _act
    return provider_inst.construct_host(
  File "/home/mzalewsk/sources/broker/broker/providers/ansible_tower.py", line 91, in construct_host
    job_attrs = self._merge_artifacts(
  File "/home/mzalewsk/sources/broker/broker/providers/ansible_tower.py", line 75, in _merge_artifacts
    child_obj = self.v2.jobs.get(id=child_id).results.pop()
IndexError: pop from empty list

That's Python 3.8.3, I'm on commit 59f5e42 (which is master as of time of this writing).

In Tower I can see the workflow completed successfully ($TOWER/#/workflows/3941), and in RHVM I can see a new machine under my name. It seems to do the job just fine, so I don't know what its problem is.

[BUG] AttributeError: 'ProviderError' object has no attribute 'hostname'

Tower has a failed WF, for which #145 was fixed and merged, but now we see the following error with broker:

self = <broker.broker.VMBroker object at 0x7fcdca8e0370>
12:10:09  
12:10:09      @mp_decorator
12:10:09      def _checkout(self):
12:10:09          """checkout one or more VMs
12:10:09      
12:10:09          :return: List of Host objects
12:10:09          """
12:10:09          hosts = []
12:10:09          if not self._provider_actions:
12:10:09              raise self.BrokerError("Could not determine an appropriate provider")
12:10:09          for action in self._provider_actions.keys():
12:10:09              provider, method = PROVIDER_ACTIONS[action]
12:10:09              logger.info(f"Using provider {provider.__name__} to checkout")
12:10:09              try:
12:10:09                  host = self._act(provider, method, checkout=True)
12:10:09                  logger.debug(f"host={host}")
12:10:09              except exceptions.ProviderError as err:
12:10:09                  host = err
12:10:09              if host:
12:10:09                  hosts.append(host)
12:10:09  >               logger.info(f"{host.__class__.__name__}: {host.hostname}")
12:10:09  E               AttributeError: 'ProviderError' object has no attribute 'hostname'
12:10:09  
12:10:09  ../../lib64/python3.8/site-packages/broker/broker.py:139: AttributeError

Please look into it!
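
One way the log line could avoid this failure is to stop assuming that the appended object is always a Host. The sketch below uses a stand-in ProviderError class and a hypothetical helper named describe_result; it is not the fix that was applied in Broker.

```python
class ProviderError(Exception):
    """Stand-in for broker.exceptions.ProviderError (illustration only)."""


def describe_result(result):
    """Build a log line without assuming the result has a hostname."""
    if isinstance(result, Exception):
        # Errors are reported by their message instead of a hostname.
        return f"{result.__class__.__name__}: {result}"
    return f"{result.__class__.__name__}: {getattr(result, 'hostname', None)}"
```

The checkout loop could then log describe_result(host) for both successful and failed checkouts.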

new broker 0.2.10 fails to checkout some instances

When you specify multiple instances there is a chance broker is going to fail with:
BrokerError: 'NoneType' object has no attribute 'instances'

The higher the count, the higher the probability of failure. A count of 3 is normally enough to reproduce the issue.

$ broker checkout --workflow deploy-base-rhel --deploy_rhel_version 8 --count 3
[INFO 230116 11:57:46] Using provider AnsibleTower to checkout
[INFO 230116 11:57:46] Using provider AnsibleTower to checkout
[INFO 230116 11:57:46] Using provider AnsibleTower to checkout
[INFO 230116 11:57:46] Using token authentication
[INFO 230116 11:57:50] Waiting for job: 
    API: https://infra-ansible-tower-01/api/v2/workflow_jobs/373805/
    UI: https://infra-ansible-tower-01/#/jobs/workflow/373805/output
[INFO 230116 12:15:34] Host: dhcp-3-186.vms
[ERROR 230116 12:15:34] BrokerError: 'NoneType' object has no attribute 'instances'

broker 0.2.9 is free of this bug

Unauthorized (401) - During checkin dynaconf envars are overridden by empty values from settings

Reproducer:

$ oc rsh node-32d8cc64-8a41-48af-85a2-e0470bc5c396-2l1j0-s9gcd
Defaulted container "builder" out of: builder, jnlp
(app-root) sh-5.2$ broker checkout --workflow deploy-base-rhel --deploy_rhel_version 8
broker checkin --all
[INFO 230123 08:57:37] Using provider AnsibleTower to checkout
[INFO 230123 08:57:38] Using username and password authentication
[INFO 230123 08:57:40] Waiting for job: 
    API: https://infra-ansible-tower-01/api/v2/workflow_jobs/401332/
    UI: https://infra-ansible-tower-01/#/jobs/workflow/401332/output
[INFO 230123 09:01:29] Host: dhcp-3-39.vms
[INFO 230123 09:01:30] Using username and password authentication
[INFO 230123 09:01:31] Checking in dhcp-3-39.vms
[WARNING 230123 09:01:31] Encountered exception during checkin: Unauthorized (401) received - {'detail': 'Authentication credentials were not provided. To establish a login session, visit /api/login/.'}
[ERROR 230123 09:01:31] BrokerError: Unauthorized (401) received - {'detail': 'Authentication credentials were not provided. To establish a login session, visit /api/login/.'}

The problem is not necessarily coming from broker. It may be caused by the way we build our containers.

Support for custom VM provisioning templates

It would be cool if a user could add a custom way of checking out a vm type. They would define it in a dictionary of sorts and put it in, and then they could arbitrarily add their providers that way.

ssh timeouts regardless of desired timeout specified

    def run(self, command, timeout=0):
        """run a command on the host and return the results"""
        self.session.set_timeout(translate_timeout(timeout))
        channel = self.session.open_session()
        channel.execute(
            command,
        )
        channel.wait_eof()
        channel.close()
        channel.wait_closed()
        results = self._read(channel)
        return results

When I run an ssh command that takes a long time, I set the timeout to e.g. 2000. execute is called and the command starts running. The script continues while the command is still running and calls wait_eof, which times out after 1 second because the command has not finished. My specified timeout isn't actually used, so any command that doesn't finish immediately causes this to fail.

The command I was trying to run is for i in {1..60}; do ping -c1 <IP> && exit 0; sleep 20; done; exit 1.

Support for passing inventory to use on Ansible Tower workflow execution

As mentioned in issue #83, there is now a requirement to pass along the inventory to use on a workflow launch on Ansible Tower. The requirements are:

  • add an option to specify an inventory to use on a workflow launch. Please note that the inventory name is not just another extra variable, but AT-specific information that has to be passed using the inventory parameter in the AT launch API call. A very simple proof of concept is available here: rdrazny@fc9803b
  • it would be useful if broker would be able to save the information about inventory used for deploy locally, and then automatically use this information for extend and remove workflows
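
To make the first requirement concrete, here is a small sketch of a launch payload that keeps the inventory separate from the extra variables. The helper name build_launch_payload is hypothetical, and the sketch assumes the launch call expects an inventory ID in a top-level inventory field; resolving an inventory name to that ID is left out.

```python
def build_launch_payload(extra_vars, inventory_id=None):
    """Build a workflow launch body with inventory outside of extra_vars."""
    payload = {"extra_vars": dict(extra_vars)}
    if inventory_id is not None:
        # Assumption: inventory is its own launch field, not an extra variable.
        payload["inventory"] = inventory_id
    return payload
```

Storing the same inventory value alongside the host in the local inventory would let extend and remove reuse it later.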

Checkin of multiple VMs can fail due to authentication timing

When the checkin --all or checkin 0 1 2 3 4 5 syntax is used, all of the authentication sessions are created before any workflows are called.

This causes a situation where workflow execution can fail due to the authentication session timing out, resulting in a 401 from Tower and runtime exception.

Two thoughts for this behavior:

  1. Auth session should be created when used, not well before a workflow is actually executed.
  2. checkin --all and checkin 0 1 2 syntax should automatically run in parallel, instead of sequentially.

  File "/home/setup/repos/broker/broker/commands.py", line 135, in checkin
    broker_inst.checkin()
  File "/home/setup/repos/broker/broker/broker.py", line 147, in checkin
    self.checkin(_host)
  File "/home/setup/repos/broker/broker/broker.py", line 151, in checkin
    host.release()
  File "/home/setup/repos/broker/broker/providers/ansible_tower.py", line 65, in _host_release
    self.release(caller_host.name)
  File "/home/setup/repos/broker/broker/providers/ansible_tower.py", line 264, in release
    return self.exec_workflow(workflow=RELEASE_WORKFLOW, source_vm=name)
  File "/home/setup/repos/broker/broker/providers/ansible_tower.py", line 208, in exec_workflow
    wfjts = self.v2.workflow_job_templates.get(name=workflow).results
  File "/home/setup/repos/broker/.broker/lib64/python3.7/site-packages/awxkit/api/pages/page.py", line 396, in get
    return self._create().get(**params)
  File "/home/setup/repos/broker/.broker/lib64/python3.7/site-packages/awxkit/api/pages/page.py", line 275, in get
    page = self.page_identity(r)
  File "/home/setup/repos/broker/.broker/lib64/python3.7/site-packages/awxkit/api/pages/page.py", line 210, in page_identity
    raise exception(exc_str, data)
awxkit.exceptions.Unauthorized: Unauthorized (401) received - {'detail': 'Authentication credentials were not provided. To establish a login session, visit /api/login/.'}
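
The first thought, creating the auth session only when it is used, could look roughly like the sketch below. LazySession and the connect callable are hypothetical names; the callable stands in for whatever performs the actual login.

```python
class LazySession:
    """Create the authenticated session on first use, not at construction."""

    def __init__(self, connect):
        self._connect = connect  # hypothetical callable that performs the login
        self._session = None

    @property
    def session(self):
        # The login happens here, right before the session is needed.
        if self._session is None:
            self._session = self._connect()
        return self._session
```

With this shape, building host objects for every VM up front would not start any session clocks until each checkin actually runs.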

`new_expire_time` value is interpreted as integer

If I use the default value for new_expire_time, which is "+172800", it is interpreted as the number 172800. The problem is that Tower expects either a relative time to be added, a timestamp in the future, or an exact time.
Reproducer:
Use the latest version of broker with no new_expire_time variable set in the config, or with it set in the config with a leading +, then run:
broker extend --all
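
A possible workaround, sketched under the assumption that a bare integer should always be read as a relative offset: restore the leading + before the value is sent to Tower. The function name normalize_expire_time is made up for this example.

```python
def normalize_expire_time(value):
    """Return new_expire_time as a string, restoring a '+' lost to int parsing."""
    if isinstance(value, int):
        # Assumption: an integer here was originally written as '+<seconds>'.
        return f"+{value}"
    return str(value)
```

Exact times such as '2026-12-31 23:59' pass through unchanged because they never parse as integers.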

[RFE]: checkin multiple VMs

Right now I can check in VMs one by one or use --all, but in practice I often want to return several but not all VMs. A parameter accepting, say, a comma-separated list of VM IDs would be helpful for these situations:

broker checkin --some 0,2,4
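
Parsing such a value is straightforward. The sketch below shows one way to do it; the --some option and the helper name parse_host_ids are only the proposal from this request, not an existing Broker interface.

```python
def parse_host_ids(raw):
    """Parse '0,2,4' into [0, 2, 4], skipping blanks and duplicates."""
    ids = []
    for part in raw.split(","):
        part = part.strip()
        if not part:
            continue
        idx = int(part)  # raises ValueError on non-numeric input
        if idx not in ids:
            ids.append(idx)
    return ids
```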

[RFE] Add a function to transfer directory/files from one host to another at specific location

I know that there is a remote_copy function in Broker to move a file between hosts, but it only allows the user to copy one file to the exact same location on another host.
My suggestion is to add a function that lets users transfer files or directories recursively from the source host to a user-defined location on the target host.
As no function with these features exists yet, I am forced to use something like this in my tests,
e.g.:
source_host.execute('sshpass -p "password" scp -o StrictHostKeyChecking=no -r /var/backup/satellite-backup target.hostname.com:/backup/')

Implement smarter auth source identification for AnsibleTower

@rplevka had a great suggestion in #177 for a better way to determine which auth sources an AAP instance supports.

An unauthenticated request to the <aap_server>/api/ resource returns the following info:

{
    "description": "AWX REST API",
    "current_version": "/api/v2/",
    "available_versions": {
        "v2": "/api/v2/"
    },
    "oauth2": "/api/o/",
    "custom_logo": "",
    "custom_login_info": "",
    "login_redirect_override": ""
}
Wouldn't it be better if broker fetched the custom_login_info value and displayed it as a warning if set?
This way, we could keep the login specifics to the individual provider instances instead of hardcoding stuff.

We should evaluate this and see if we can use this instead of the current version implemented in that PR.
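
As a starting point for that evaluation, the suggestion could be sketched like this. The function works on the already-decoded /api/ response body, so the HTTP request itself is left out; the name warn_on_custom_login_info and the logger name are assumptions.

```python
import logging

logger = logging.getLogger("broker-sketch")


def warn_on_custom_login_info(api_root):
    """Log custom_login_info as a warning when set; return the text found."""
    info = (api_root.get("custom_login_info") or "").strip()
    if info:
        logger.warning("Login notice from server: %s", info)
    return info
```

An empty or missing value produces no warning, so instances that do not set the field behave as before.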

Properly handle return code after error

Broker should properly handle RC after hitting errors.
Deploy WFs are not in any way special in this.

I won't go grepping broker output for ERROR. That's the purpose of return codes.

[INFO 221104 15:39:26] Using provider AnsibleTower to checkout
[INFO 221104 15:39:26] Using username and password authentication
[INFO 221104 15:39:28] Waiting for job: 
    API: https://infra-ansible-tower-01/api/v2/workflow_jobs/142076/
    UI: https://infra-ansible-tower-01/#/jobs/workflow/142076/output
[ERROR 221104 15:40:00] ProviderError: AnsibleTower encountered the following error: {'reason': 'Unable to determine failure cause for deploy-satellite-upgrade ar /api/v2/workflow_jobs/142076/'}
[Pipeline] echo
Broker RC for checkout: 0

Actual Result:
RC = 0

Expected Result:
RC != 0

Add the ability to get a list of AnsibleTower inventories

With the incoming requirement to have broker accept and pass along inventory information, it would be nice to let users be able to figure out what queries are available.

Desired usage:
Get a list of inventories your account has access to.

broker providers AnsibleTower --inventories

Optional usage:
Get some relevant information about an inventory.

broker providers AnsibleTower --inventory test-inventory

`BrokerError: getattr(): attribute name must be string` during checkout

Reproducer:

$ broker checkout --workflow deploy-base-rhel --template tpl-cmp-RHEL-8.5.0-20210816.n.0
[INFO 210909 10:24:01] Using provider AnsibleTower to checkout
[INFO 210909 10:24:01] Using token authentication
[INFO 210909 10:24:03] No inventory specified, Ansible Tower will use a default.
[INFO 210909 10:24:04] Waiting for job: 
    API: https://infra-ansible-tower-01.infra.ourdomain.com/api/v2/workflow_jobs/1074415/
    UI: https://infra-ansible-tower-01.infra.ourdomain.com/#/workflows/1074415
[INFO 210909 10:28:46] Host: dhcp-3-245.vms.ourdomain.com
[INFO 210909 10:28:46] Using provider AnsibleTower to checkout
[INFO 210909 10:28:47] Using token authentication
[ERROR 210909 10:28:48] BrokerError: getattr(): attribute name must be string

Broker is reaching out to Tower after the deploy-base-rhel WF was successful and then fails with BrokerError.
This reproducer should work with

Broker version:

$ broker --version
Version: 0.1.22

My broker config:

# Broker settings
debug: True
inventory_file: "inventory.yaml"
# Host Settings
host_username: "root"
host_password: ""

# Provider settings
AnsibleTower:
  instances:
    - rhv:
        base_url: "https://infra-ansible-tower-01.infra.ourdomain.com/"
        # username: "ogajduse"
        # password: ""
        token: mytoken
        default: True
    - osp:
        base_url: "https://infra-ansible-tower-01.infra.ourdomain.com/"
        # username: "ogajduse"
        # password: ""
        token: EknRl0veErGgQjS8CA83KkoJyD1r1n
        inventory: "satlab-osp-01-inventory"
    - testing:
        base_url: "https://dhcp-2-80.vms.ourdomain.com/"
        inventory: "satlab-rhv-02-inventory"
        username: ogajduse
        password: ""
        #default: True
  release_workflow: "remove-vm"
  extend_workflow: "extend-vm"
  workflow_timeout: 3600
  #results_limit: 50

RFE: Include extend-vm time in 'broker inventory --details' output

It would be nice to have the expiration date, or at a minimum the epoch time, in the 'inventory --details' output. It would be convenient to see my Satellite's expiration date there, so I wouldn't have to extend it again in a few days just because I forgot when it expires.

RFE: Implement machine-processable structured output

The current implementation prints log messages and, if -o raw is passed, logs the JSON returned from AT.
Since broker is about to be used heavily in pipelines, it would be beneficial to get all the broker-provided data in a structured object (preferably JSON).

current output:

$ broker --log-level debug execute --workflow list-templates -o raw
Log level changed to [debug]
[I 210406 16:52:04 broker:146] Using provider AnsibleTower for execution
[D 210406 16:52:04 ansible_tower:42] AnsibleTower instantiated with kwargs={'workflow': 'list-templates'}
[I 210406 16:52:04 ansible_tower:82] Using username and password authentication
[D 210406 16:52:06 ansible_tower:280] Launching workflow: https://<redacted>/api/v2/workflow_job_templates/71/
[I 210406 16:52:07 ansible_tower:286] Waiting for job: 
    API: https://<redacted>/api/v2/workflow_jobs/356534/
    UI: https://<redacted>/#/workflows/356534

[D 210406 16:52:50 broker:94] {
        "id": 356534,
        "type": "workflow_job",
        "url": "/api/v2/workflow_jobs/356534/",
        ....
}

required output (e.g.):

{
  "provider": "AnsibleTower",
  "auth_type": "password",
  "return_code": 2,
  "workflow": {
    "workflow_api": "https://<redacted>/api/v2/workflow_jobs/356534/",
    "workflow_UI": "https://<redacted>/#/workflows/356534",
    "output": {# workflow json output}
  }
}

This way, we're able to access the return code from the json object attribute as well as easily get the workflow URL (if any) instead of tricky parsing using sed, etc.
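
A minimal sketch of how such an object could be assembled and printed, using the field names from the example in this request. The function name build_result is hypothetical, and nothing here reflects an output format Broker already has.

```python
import json


def build_result(provider, auth_type, return_code,
                 api_url=None, ui_url=None, output=None):
    """Serialize one broker run as a single JSON document."""
    result = {
        "provider": provider,
        "auth_type": auth_type,
        "return_code": return_code,
        "workflow": {
            "workflow_api": api_url,
            "workflow_UI": ui_url,
            "output": output or {},
        },
    }
    return json.dumps(result, indent=2)
```

If log messages went to stderr and only this document to stdout, a pipeline could feed stdout directly into a JSON parser.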

Properly handle ProviderError during multi checkout

Traceback when trying to check out two hosts using VMBroker (when the user exceeds the SLA limit of 6):

==================================== ERRORS ====================================
________ ERROR at setup of test_positive_inventory_generate_upload_cli _________
concurrent.futures.process._RemoteTraceback:
'''
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/concurrent/futures/process.py", line 368, in _queue_management_worker
    result_item = result_reader.recv()
  File "/usr/local/lib/python3.8/multiprocessing/connection.py", line 251, in recv
    return _ForkingPickler.loads(buf.getbuffer())
TypeError: __init__() missing 2 required positional arguments: 'provider' and 'message'
'''
 
The above exception was the direct cause of the following exception:
 
    @pytest.fixture(scope='module')
    def content_hosts():
        """A module-level fixture that provides two content hosts object based on the rhel7 nick"""
>       with VMBroker(nick='rhel7', host_classes={'host': ContentHost}, _count=2) as hosts:
 
pytest_fixtures/rh_cloud.py:32:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
../env_robottelo_38/lib/python3.8/site-packages/broker/broker.py:291: in __enter__
    raise err
../env_robottelo_38/lib/python3.8/site-packages/broker/broker.py:284: in __enter__
    hosts = self.checkout(connect=True)
../env_robottelo_38/lib/python3.8/site-packages/broker/broker.py:136: in checkout
    hosts = self._checkout()
../env_robottelo_38/lib/python3.8/site-packages/broker/broker.py:67: in mp_split
    results.extend(f.result())
/usr/local/lib/python3.8/concurrent/futures/_base.py:432: in result
    return self.__get_result()
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
 
self = <Future at 0x7f64fec1d0d0 state=finished raised BrokenProcessPool>
 
    def __get_result(self):
        if self._exception:
>           raise self._exception
E           concurrent.futures.process.BrokenProcessPool: A process in the process pool was terminated abruptly while the future was running or pending.
 
/usr/local/lib/python3.8/concurrent/futures/_base.py:388: BrokenProcessPool
=============================== warnings summary ===============================
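
The TypeError at the top of the traceback is typical of an exception class that cannot be rebuilt when it is unpickled on its way back from a worker process. Python recreates the exception by calling the class with the stored args, so those args must match the __init__ signature. The two classes below are stand-ins written for this illustration, not Broker's actual ProviderError.

```python
import pickle


class FragileProviderError(Exception):
    """Stores only a formatted message, so unpickling cannot rebuild it."""

    def __init__(self, provider, message):
        self.provider = provider
        self.message = message
        super().__init__(f"{provider} encountered the following error: {message}")


class PicklableProviderError(Exception):
    """Passes the original arguments to Exception so they survive pickling."""

    def __init__(self, provider, message):
        self.provider = provider
        self.message = message
        super().__init__(provider, message)  # args now match the signature

    def __str__(self):
        return f"{self.provider} encountered the following error: {self.message}"


def roundtrip(err):
    """Mimic what a process pool does when returning an exception."""
    return pickle.loads(pickle.dumps(err))
```

Calling roundtrip on the fragile class raises a TypeError about missing positional arguments, while the picklable variant comes back with provider and message intact.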

`broker providers` doesn't let one select an AT instance

I'm unable to select an AT instance when examining provider inventories:

$ broker providers AnsibleTower --inventories
[INFO 210806 16:34:19] Querying provider AnsibleTower
[INFO 210806 16:34:19] Using token authentication
[INFO 210806 16:34:20] Available Inventories:
    satlab-iqe-01-openstack
    satlab-rhv-02-inventory
$ broker providers AnsibleTower::mr --inventories
Usage: broker providers [OPTIONS] COMMAND [ARGS]...
Try 'broker providers --help' for help.

Error: No such command 'AnsibleTower::mr'.

This behavior is limiting because different AT instances can, in fact, have different inventories. This example comes from a case where the AT instance spun up for the duration of a merge request (AnsibleTower::mr) has a different inventory (satlab-iqe-01-inventory) than the default production AT instance.
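
For reference, splitting the Provider::instance form used above into its two parts is simple. The helper below is a hypothetical sketch and does not describe how the providers command is implemented.

```python
def split_provider_instance(spec):
    """Split 'AnsibleTower::mr' into ('AnsibleTower', 'mr')."""
    provider, sep, instance = spec.partition("::")
    # No separator or an empty instance name means "use the default instance".
    return provider, (instance if sep and instance else None)
```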

AttributeError is raised when loading host with more complex data in `_broker_args` `from_inventory`

Have the following inventory.yaml:

- _broker_args:
    cdn_rhn_password: supersecret
    cdn_rhn_username: thisismyaccount
    cdn_rhsm_pool_id: some-cool-pool-id
    deploy_rhel_version: 8.6.0
    fqdn: dhcp-2-206.oh.no.fqdn.has.leaked
    host_type: host
    name:
    - ogajduse-is-cool-RHEL-8.6.0-20220420.3
    os_distribution: RedHat
    os_distribution_version: '8.6'
    reported_devices:
      nics:
      - lo
      - eth0
    rhel_compose_id: RHEL-8.6.0-20220420.3
    rhel_compose_repositories:
    - baseurl: http://foo.bar/repo1/os
      description: repo1
      file: os_repo.repo
      name: repo1
    - baseurl: http://foo.bar/repo2/os
      description: repo2
      file: os_repo.repo
      name: repo2
    sat_ansible_version: 2.9
    sat_xy_version: '6.11'
    template: RHEL-8.6.0-20220420.3
    tower_inventory: satlab-what-02-inventory
    workflow: deploy-satellite-rhel-cmp-template
  _broker_provider: AnsibleTower
  hostname: dhcp-2-206.oh.no.fqdn.has.leaked
  name: ogajduse-is-cool-RHEL-8.6.0-20220420.3
  type: host

And run:

from broker import Broker
Broker().from_inventory(filter='something')

or

ipython -c 'from broker import Broker; Broker().from_inventory(filter="something")'

The following traceback is generated:

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
Input In [1], in <module>
----> 1 from broker import Broker; Broker().from_inventory(filter="something")

File ~/repos/broker/broker/broker.py:345, in Broker.from_inventory(self, filter)
    340 def from_inventory(self, filter=None):
    341     """Reconstruct one or more hosts from the local inventory
    342 
    343     :param filter: A broker-spec filter string
    344     """
--> 345     inv_hosts = helpers.load_inventory(filter=filter)
    346     return [self.reconstruct_host(inv_host) for inv_host in inv_hosts]

File ~/repos/broker/broker/helpers.py:207, in load_inventory(filter)
    203 inventory_file = settings.BROKER_DIRECTORY.joinpath(
    204     settings.settings.INVENTORY_FILE
    205 )
    206 inv_data = load_file(inventory_file, warn=False)
--> 207 return inv_data if not filter else inventory_filter(inv_data, filter)

File ~/repos/broker/broker/helpers.py:115, in inventory_filter(inventory, raw_filter)
    113 for host in inventory:
    114     flattened_host = flatten_dict(host, separator=".")
--> 115     eval_list = [
    116         eval(rf.test.format(haystack=flattened_host[rf.haystack], needle=rf.needle))
    117         for rf in resolved_filter
    118         if rf.haystack in flattened_host
    119     ]
    120     if eval_list and all(eval_list):
    121         matching.append(host)

File ~/repos/broker/broker/helpers.py:118, in <listcomp>(.0)
    113 for host in inventory:
    114     flattened_host = flatten_dict(host, separator=".")
    115     eval_list = [
    116         eval(rf.test.format(haystack=flattened_host[rf.haystack], needle=rf.needle))
    117         for rf in resolved_filter
--> 118         if rf.haystack in flattened_host
    119     ]
    120     if eval_list and all(eval_list):
    121         matching.append(host)

AttributeError: 'NoneType' object has no attribute 'haystack'

Additional info

$ broker --version
Version: 0.2.6

After machines are started up in RHVM, broker inventory --sync AnsibleTower --details didn't sync them properly.

Initial state:
5 sat machines :)
1 up 4 down in satlab, and broker showing them properly

These commands work correctly (only the result of the first one is pasted below):
broker inventory --sync AnsibleTower
broker inventory --sync AnsibleTower --details

[INFO 200804 12:27:30] Adding new hosts: tstrych-satellite-6.8-latest-1595494906
[INFO 200804 12:27:30] Removing old hosts: dhcp-3-234.vms.sat.rdu2.redhat.com
[INFO 200804 12:27:30] Pulling local inventory
[INFO 200804 12:27:30] 0: dhcp-3-182.vms.sat.rdu2.redhat.com
[INFO 200804 12:27:30] 1: tstrych-satellite-6.8-latest-1596013441
[INFO 200804 12:27:30] 2: tstrych-satellite-6.8-latest-1596013720
[INFO 200804 12:27:30] 3: tstrych-satellite-6.8-latest-1596014889
[INFO 200804 12:27:30] 4: tstrych-satellite-6.8-latest-1595494906

As machines 1-4 were down, I started them directly in satlab.
After that (a few minutes passed between my actions),

the command broker inventory --sync AnsibleTower --details was run, but as you can see, only the machine that was already running is shown.

[INFO 200804 13:26:55] Pulling remote inventory from AnsibleTower
[INFO 200804 13:27:00] Adding new hosts: dhcp-3-234.vms.sat.rdu2.redhat.com, dhcp-3-72.vms.sat.rdu2.redhat.com, dhcp-3-34.vms.sat.rdu2.redhat.com, dhcp-3-66.vms.sat.rdu2.redhat.com
[INFO 200804 13:27:00] Removing old hosts: tstrych-satellite-6.8-latest-1596013441, tstrych-satellite-6.8-latest-1596013720, tstrych-satellite-6.8-latest-1596014889, tstrych-satellite-6.8-latest-1595494906
[INFO 200804 13:27:00] Pulling local inventory
[INFO 200804 13:27:00] 0: dhcp-3-182.vms.sat.rdu2.redhat.com, Details: _broker_args:
      new_expire_time: '+604800'
      target_vm: tstrych-satellite-6.8-latest-1595838167
    _broker_provider: AnsibleTower
    hostname: dhcp-3-182.vms.sat.rdu2.redhat.com
    name: tstrych-satellite-6.8-latest-1595838167
    type: host

Right after that I ran broker inventory --sync AnsibleTower and all VMs were fetched successfully.

[INFO 200804 13:27:30] Pulling remote inventory from AnsibleTower
[INFO 200804 13:27:35] Adding new hosts: dhcp-3-234.vms.sat.rdu2.redhat.com, dhcp-3-72.vms.sat.rdu2.redhat.com, dhcp-3-34.vms.sat.rdu2.redhat.com, dhcp-3-66.vms.sat.rdu2.redhat.com
[INFO 200804 13:27:35] Pulling local inventory
[INFO 200804 13:27:35] 0: dhcp-3-182.vms.sat.rdu2.redhat.com
[INFO 200804 13:27:35] 1: dhcp-3-234.vms.sat.rdu2.redhat.com
[INFO 200804 13:27:35] 2: dhcp-3-72.vms.sat.rdu2.redhat.com
[INFO 200804 13:27:35] 3: dhcp-3-34.vms.sat.rdu2.redhat.com
[INFO 200804 13:27:35] 4: dhcp-3-66.vms.sat.rdu2.redhat.com

I need to investigate further, as I don't know whether only the first sync fails to work correctly or whether the problem is in the --details option. I just wanted to raise the problem here :)

For AnsibleTower provider, _merge_artifacts needs to be changed to `last`

Problem

In pipelines, if a WF is changed or permissioned incorrectly, the assumed artifacts can be incorrect, causing broker remove calls to fail because of inaccurate information.

Solution

Move to a _merge_artifacts implementation for last.

Concerns

This is going to be an integration test effort with AT WFs to ensure that, for the desired broker actions, the correct information is returned in last.

Initial best testing solution seems to be baselining the initial returns to Jenkins based standards and adapting WFs to broker running in last.
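
To illustrate the difference between the two behaviors, here is a sketch under one plausible reading of last: take the artifacts of the most recently finished child job that has any, instead of merging the artifacts of every child. The function name, the strategy argument, and the shape of the child records are assumptions.

```python
def collect_artifacts(children, strategy="last"):
    """children: dicts with 'finished' (sortable) and 'artifacts' (dict)."""
    ordered = sorted(children, key=lambda c: c["finished"])
    if strategy == "last":
        # Use only the newest child that actually produced artifacts.
        for child in reversed(ordered):
            if child.get("artifacts"):
                return dict(child["artifacts"])
        return {}
    if strategy == "merge":
        merged = {}
        for child in ordered:
            merged.update(child.get("artifacts") or {})
        return merged
    raise ValueError(f"unknown strategy: {strategy}")
```

With merge, a stale key from an early job can leak into the result; with last, the final job has to publish everything broker needs, which is why the workflows would have to be adapted.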

RFE: Implement proper exit codes when using broker from CLI

It would be useful especially for jenkins pipelines if we could simply assert on the broker command results by looking at the exit codes.

e.g. for broker exec --workflow, I'd like to see a non-zero return code in case the executed workflow fails. More granularity might be eventually introduced (to distinguish between the failure happening in the workflow itself, or broker unable to actually execute it for some reason [non-existing workflow, permission issues,...])

right now:

$ broker execute --workflow nonexistent
[INFO 210310 11:17:37] Using provider AnsibleTower for execution
[INFO 210310 11:17:37] Using username and password authentication
[ERROR 210310 11:17:39] Workflow not found by name: nonexistent
[INFO 210310 11:17:39] None
$ echo $?
0
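
A sketch of the requested behavior, including the extra granularity mentioned above. The exit code values, the two exception classes, and the run_cli wrapper are all hypothetical; they only show how distinct failures could map to distinct return codes.

```python
import sys

EXIT_OK = 0
EXIT_WORKFLOW_FAILED = 1  # the workflow ran but failed
EXIT_BROKER_ERROR = 2     # broker could not run it (missing workflow, permissions)


class WorkflowFailed(Exception):
    """Hypothetical: raised when the executed workflow reports failure."""


class BrokerSetupError(Exception):
    """Hypothetical: raised when broker cannot launch the workflow at all."""


def run_cli(action):
    """Run a CLI action and translate failures into exit codes."""
    try:
        action()
    except WorkflowFailed as err:
        print(f"[ERROR] {err}", file=sys.stderr)
        return EXIT_WORKFLOW_FAILED
    except BrokerSetupError as err:
        print(f"[ERROR] {err}", file=sys.stderr)
        return EXIT_BROKER_ERROR
    return EXIT_OK
```

An entry point would then end with sys.exit(run_cli(main)), so that echo $? reflects the outcome.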
