
Platform Upload Processor II

The Platform Upload Processor II (PUPTOO) receives payloads for the advisor service via the message queue, extracts facts from each payload, and forwards the information to the inventory service.

Details

The PUPTOO service is a component of the Insights Platform that validates the payloads uploaded to Red Hat by the insights-client. The service is engaged after an upload has been received and stored in cloud storage.

PUPTOO retrieves each payload via the URL in the message, processes it through insights-core to extract facts and guarantee the integrity of the archive, and sends the extracted information to the inventory service.

The service runs in Openshift Dedicated.

How it Works

UML

The PUP service workflow is as follows:

  • Receive a message from the platform.upload.advisor topic in the MQ
  • PUPTOO downloads the archive from the URL specified in the message
  • Insights Core is engaged to open the tar file and extract the facts as determined by the get_canonical_facts method in insights-core
  • During extraction it also runs a custom system-profile ruleset to extract more information about the system
  • PUPTOO sends the result to inventory via the message queue
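The steps above can be sketched as a single message handler. This is illustrative only: the function and helper names (`handle_upload_message`, `download_archive`, `extract_facts`) are hypothetical, and the real service wires these steps to a Kafka client and insights-core rather than the injected callables used here to keep the sketch self-contained.

```python
def handle_upload_message(msg, download_archive, extract_facts):
    """Process one message from the platform.upload.advisor topic.

    `download_archive` and `extract_facts` are passed in so the sketch
    runs standalone; in the real service they would wrap an HTTP fetch
    and an insights-core extraction respectively.
    """
    archive = download_archive(msg["url"])          # step 2: fetch by URL
    facts, system_profile = extract_facts(archive)  # steps 3-4: insights-core
    # step 5: shape the result for the inventory service
    return {
        "data": {"facts": [{"facts": facts,
                            "namespace": "insights-client",
                            "system-profile": system_profile}]},
        "platform_metadata": msg.get("metadata", {}),
        "operation": "add_host",
    }
```

Passing fakes for the two callables makes the shaping step easy to exercise without a broker or a real archive.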

The Compliance Situation

Puptoo also functions as a forwarder for compliance uploads placed on the platform.upload.compliance topic. This is needed because compliance depends on an inventory ID that became unavailable when ingress stopped retrieving it from inventory on its own. No processing is performed on compliance uploads; puptoo simply forwards them to inventory along with canonical facts.

JSON

The JSON expected by the PUP service from the upload service looks like this:

{"account": "123456",
 "principal": "654321",
 "request_id": "23oikjonsdv329",
 "size": 234,
 "service": "advisor",
 "category": "some_category",
 "b64_identity": "<some big base64 string>",
 "metadata": {"some_key": "some_value"},
 "url": "http://some.bucket.somewhere/1234"}
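A quick sanity check on an incoming message can catch malformed uploads early. The set of required keys below is an assumption inferred from the example message above, not an authoritative schema.

```python
# Assumed required keys, inferred from the example upload message.
REQUIRED_KEYS = {"account", "request_id", "service", "url", "b64_identity"}

def missing_upload_fields(msg):
    """Return a sorted list of required keys absent from an upload message."""
    return sorted(REQUIRED_KEYS - msg.keys())
```

An empty return value means the message carries every assumed-required field.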

The message sent to the inventory service will include the facts extracted from the archive:

{"data": {"facts": [{"facts": {"insights_id": "a756b571-e227-46a5-8bcc-3a567b7edfb1",
                               "machine_id": null,
                               "bios_uuid": null,
                               "subscription_manager_id": null,
                               "ip_addresses": [],
                               "mac_addresses": [],
                               "fqdn": "Z0JTXJ7YSG.test"},
                     "namespace": "insights-client",
                     "system-profile": {"foo": "bar"}}]},
 "platform_metadata": {"key": "value"},
 "operation": "add_host"}

The above facts are managed by the insights-core project and may be added or removed over time. This README should be updated to reflect those changes.

Fields:

  • account: The account number used to upload. Will be modified to account_number when posting to inventory
  • principal: The upload org_id. Will be modified to org_id when posting to inventory
  • request_id: The ID of the individual uploaded archive
  • size: Size of the payload in bytes
  • service: The service name as provided by the MIME type.
  • url: The url from which the archive will be downloaded
  • facts: An array containing facts for each host
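The account and principal renames described in the field list can be sketched as a simple key mapping. The function name and the idea of doing this as a standalone dict transform are illustrative; only the two renames themselves come from the text above.

```python
# The two renames documented above: account -> account_number,
# principal -> org_id. Everything else passes through unchanged.
FIELD_RENAMES = {"account": "account_number", "principal": "org_id"}

def rename_for_inventory(msg):
    """Apply the upload-to-inventory field renames to a message dict."""
    return {FIELD_RENAMES.get(key, key): value for key, value in msg.items()}
```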

If the fact extraction fails, the archive will be considered "bad." A message will be sent back to the upload service so the file can be moved to the rejected bucket.

Failure example:

    {"validation": "failure", "request_id": "23oikjonsdv329"}
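Building that validation result is a one-liner; a hypothetical helper (the name `validation_message` is not from the source) makes the success/failure split explicit:

```python
def validation_message(request_id, valid):
    """Build the validation result sent back to the upload service."""
    return {"validation": "success" if valid else "failure",
            "request_id": request_id}
```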

Running

The default environment variables should be acceptable for testing.
PUPTOO expects a Kafka message queue to be available for connection.

Prerequisites

Python

Create a virtual environment and install the requirements. Once complete, you can start the app by running puptoo

python -m venv path/to/venv
source path/to/venv/bin/activate
pip3 install .

Running Locally

Activate your virtual environment and run the validator

source path/to/venv/bin/activate
puptoo

Running with Docker Compose

Two docker-compose files are provided in this repo for standing up a local dev environment. The docker-compose.yml file stands up puptoo, kafka, and minio for isolated testing. The full-stack.yml file stands up the ingress, kafka, puptoo, minio, and inventory components so that the first stages of the platform pipeline can be tested end to end.

Stand Up Isolated Puptoo

cd dev && sudo docker-compose up

Stand Up Full stack

cd dev && sudo docker-compose -f full-stack.yml up 

NOTE: The full stack expects you to have an ingress and inventory image available. See those projects for steps for building the images needed. It's also typical for puptoo to fail to start if it can't initially connect to kafka. If this happens, simply run sudo docker-compose -f full-stack.yml up -d pup to have it attempt another startup.
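The startup failure described in the note is a common ordering problem: puptoo comes up before kafka is ready to accept connections. A generic retry-with-backoff sketch (all names hypothetical; the real service's startup behavior may differ):

```python
import time

def connect_with_retry(connect, attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call `connect` until it succeeds, doubling the delay after each
    failure; re-raise the last ConnectionError if all attempts fail."""
    for attempt in range(attempts):
        try:
            return connect()
        except ConnectionError:
            if attempt == attempts - 1:
                raise
            sleep(base_delay * (2 ** attempt))
```

Injecting `sleep` keeps the backoff schedule testable without real waiting.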

Bonfire

Deploying to an ephemeral environment with bonfire is the preferred way to test puptoo. See the bonfire documentation for more information.

File Processing

The best way to test is by standing up this server and incorporating it with the upload-service. The insights-upload repo has a docker-compose that will get you most of the way there. Other details regarding posting your archive to the service can be found in that readme.

This test assumes you have an inventory service available and ready to use. Visit the insights-host-inventory repo for those instructions.

Testing System Profile

Occasionally, an archive may be rejected by puptoo for a reason that is unclear. You may also see an archive that doesn't seem to work properly or gathers the wrong information. To test this locally, you can use the insights-run tool to process the system profile of that archive easily.

   poetry run insights-run -p src.puptoo ~/path/to/archive

This will print the system_profile so you can analyze it for issues.

Deployment

The PUPTOO service master branch has a webhook that notifies App-interface to build a new image within Jenkins. The image build will then trigger a redeployment of the service in ingress-stage. In order to push to production, an app-interface PR should be created with the git ref for the image that should exist within ingress-prod on the Production cluster.

Contributing

All outstanding issues or feature requests should be filed as Issues on this GitHub repo or within JIRA. PRs should be submitted against the master branch for any features or changes.

Any new system-profile items should include a test file inside dev/test-archives/core-base. This file should emulate the file that would be found inside a real archive, and the test itself should be written and provided in the tests directory.

Running unit tests

    ACG_CONFIG=./cdappconfig.json pytest

Versioning

New functionality that may affect other services should increment the version by 1. Minor features and bugfixes can increment it by 0.0.1.
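One reading of that policy, assuming an x.y.z version string (the function and its interpretation of "increment by 1" as a major bump are assumptions, not project tooling):

```python
def bump_version(version, breaking=False):
    """Bump an x.y.z version: +1 on the first component (resetting the
    rest) for changes that affect other services, +1 on the last
    component for minor features and bugfixes."""
    major, minor, patch = (int(part) for part in version.split("."))
    if breaking:
        return f"{major + 1}.0.0"
    return f"{major}.{minor}.{patch + 1}"
```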

Authors

  • Stephen Adams - Initial Work - SteveHNH
