Performance Copilot (PCP) Plugin

This plugin runs PCP pmlogger, collects data until cancelled, and then generates a structured output of the results.

Note

This plugin runs indefinitely until explicitly cancelled. When used as a stand-alone plugin outside of an Arcaflow workflow, the data collection can be stopped with Ctrl-c. When used in an Arcaflow workflow, the stop_if option should be used to send the cancel signal to the plugin based on the status of another plugin.

Workflow example snippet with stop_if:

steps:
  pcp:
    plugin: quay.io/arcalot/arcaflow-plugin-pcp:latest
    step: run-pcp
    input: !expr $.input.pcp_params
    stop_if: !expr $.steps.sysbench.outputs
  sysbench:
    plugin: quay.io/arcalot/arcaflow-plugin-sysbench:latest
    step: sysbenchcpu
    input: !expr $.input.sysbench_params

Using the plugin

Get a maintained build from quay.io/arcalot/arcaflow-plugin-pcp, or build the container locally:

podman build . -t arcaflow-plugin-pcp

Run with the provided example input:

podman run -i --rm arcaflow-plugin-pcp -s run-pcp -f - < configs/pcp_example.yaml
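
The contents of configs/pcp_example.yaml are not reproduced here. As a minimal sketch, assuming only the input keys listed in the run-pcp schema further down (pmlogger_interval, timeout, flatten), the same invocation could be driven from Python as follows. The helper names and the chosen values are illustrative, and the sketch assumes the plugin's YAML loader accepts a JSON document (JSON being a subset of YAML for a flat mapping like this).

```python
import json
import subprocess

IMAGE = "arcaflow-plugin-pcp"  # local tag from the `podman build` command above


def build_run_pcp_input(interval_s=1.0, timeout_s=30, flatten=False):
    """Return a run-pcp input mapping (hypothetical helper; keys from the schema)."""
    return {
        "pmlogger_interval": interval_s,
        "timeout": timeout_s,
        "flatten": flatten,
    }


def run_pcp_command(image=IMAGE):
    """Return the podman argument list that reads the plugin input from stdin."""
    return ["podman", "run", "-i", "--rm", image, "-s", "run-pcp", "-f", "-"]


def run_pcp(params):
    """Run the container, feeding the input on stdin (requires podman)."""
    payload = json.dumps(params)  # assumption: JSON is accepted as YAML input
    return subprocess.run(
        run_pcp_command(), input=payload, text=True, capture_output=True, check=False
    )
```

Setting timeout gives a stand-alone run a bounded duration, so the collection does not have to be stopped with Ctrl-c.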

Note

This plugin is designed to be used as a container image built with the provided Dockerfile. Running the Python code directly on a target system will likely prove problematic.

Post-Processing Only

The post-process plugin step can be run independently to convert an existing PCP archive file into the same structured format that the main run-pcp plugin step provides. In order to make the archive file available to the plugin when it is run in a container, you will need to bind-mount the archive file path to the container and specify the archive_path in the context of the container namespace. For example, to post-process an archive in /example/local_pcp_archive/pmlogger-out use the following command:

podman run -i --rm -v /example/local_pcp_archive/:/pcp_archive/:Z \
arcaflow-plugin-pcp -s post-process -f - <<< "archive_path: /pcp_archive/pmlogger-out"
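
The mapping between the host path and the container path is the part that is easy to get wrong. The following Python sketch derives both the bind-mount argument and the archive_path value from a host archive path. The function name and the /pcp_archive mount point are illustrative choices taken from the example above, not requirements of the plugin.

```python
from pathlib import PurePosixPath

CONTAINER_DIR = "/pcp_archive"  # mount point used in the example above


def post_process_command(host_archive, image="arcaflow-plugin-pcp"):
    """Return (argv, stdin_text) for post-processing a host PCP archive.

    host_archive is the archive path without a file extension,
    e.g. /example/local_pcp_archive/pmlogger-out.
    """
    archive = PurePosixPath(host_archive)
    # Bind-mount the archive's directory; :Z relabels it for SELinux hosts.
    volume = f"{archive.parent}/:{CONTAINER_DIR}/:Z"
    argv = ["podman", "run", "-i", "--rm", "-v", volume,
            image, "-s", "post-process", "-f", "-"]
    # archive_path must be expressed in the container's namespace.
    stdin_text = f"archive_path: {CONTAINER_DIR}/{archive.name}\n"
    return argv, stdin_text
```

The returned stdin_text plays the role of the here-string in the shell command; it can be passed as the input argument of subprocess.run together with argv.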

Container Privileged Mode

Note that some metrics collected by PCP are at the host system level even when running in a non-privileged container, but many metrics are namespace-scoped. In order to collect all metrics at the host level, you will need to run the containerized plugin in privileged mode with host networking.
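
As a rough sketch, the difference between the two modes comes down to two extra podman flags. The helper below is hypothetical; it assumes the standard podman options --privileged and --network=host are what is meant by privileged mode with host networking.

```python
def podman_run_args(image="arcaflow-plugin-pcp", host_level=False):
    """Return podman arguments for the run-pcp step.

    host_level=True adds privileged mode and host networking.
    """
    argv = ["podman", "run", "-i", "--rm"]
    if host_level:
        # Assumption: standard podman flags for privileged mode and host networking.
        argv += ["--privileged", "--network=host"]
    argv += [image, "-s", "run-pcp", "-f", "-"]
    return argv
```

Privileged containers have broad access to the host, so this mode is best reserved for systems where host-level metrics are actually required.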

Power User Configurations

Warning

Please exercise caution in using the configuration options noted here. There is no input validation for the custom configurations, and malformed entries will lead to a plugin failure, possibly late in the run.

pmlogger config files

Under normal operation, the plugin generates its own default pmlogger configuration file using the pmlogconf command. In most circumstances this is adequate, and the user does not need to supply any configuration input.

For experienced users of pmlogger who would like to specify a custom configuration, the input schema for the plugin does allow a complete pmlogger config file to be included in the input as a multi-line string to the pmlogger_conf key.
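
To illustrate what passing a config as a multi-line string can look like, here is a minimal Python sketch that embeds a small pmlogger config in a run-pcp input document. The config content, the metric names, and the helper name are assumptions chosen for illustration; consult the pmlogger documentation for the authoritative syntax.

```python
import json

# Assumption: an illustrative pmlogger config that logs two kernel metrics every second.
PMLOGGER_CONF = """\
log mandatory on every 1 second {
    kernel.all.load
    kernel.all.cpu.user
}
"""


def input_with_pmlogger_conf(conf_text, timeout_s=30):
    """Return a run-pcp input mapping carrying a complete pmlogger config."""
    if not conf_text.strip():
        raise ValueError("pmlogger_conf must not be empty")
    return {"pmlogger_conf": conf_text, "timeout": timeout_s}


if __name__ == "__main__":
    # Assumption: JSON output is accepted as YAML when piped to the plugin with `-f -`.
    print(json.dumps(input_with_pmlogger_conf(PMLOGGER_CONF), indent=2))
```

Because the plugin does not validate this input, it is worth testing a custom config with a short timeout before using it in a long run.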

pcp2json/pmrep config files

The container will deploy with a standard set of pmrep configuration files internally under /etc/pcp/pmrep/. An experienced user may also pass a complete pmrep config file as a multi-line string to the pmrep_conf key.

pcp2json/pmrep metrics

Any metrics or metricsets defined in the pmrep config file(s) can be referenced in the standard pmrep format, with which the pcp2json command is compatible. The desired metrics/sets are optionally passed as a string to the pmlogger_metrics key in the input. If nothing is provided to this key in the input, the default value documented below will be used.
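
The pmlogger_metrics value is a single space-separated string in which metricsets carry a leading colon, as in the documented default. A small Python sketch, with hypothetical helper and variable names, shows one way to assemble such a string from separate lists:

```python
# Metricset names taken from the documented default value of pmlogger_metrics.
DEFAULT_METRICSETS = ["vmstat", "sar", "sar-B", "sar-w", "sar-b", "sar-H", "sar-r"]


def metrics_string(metricsets=(), metrics=()):
    """Join pmrep metricset names (":name") and plain metric names with spaces."""
    parts = [f":{name.lstrip(':')}" for name in metricsets]
    parts += list(metrics)
    return " ".join(parts)
```

For example, metrics_string(["vmstat"], ["kernel.all.load"]) produces a value that requests one metricset plus one individual metric, assuming both are available to pmrep.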

Autogenerated Input/Output Documentation by Arcaflow-Docsgen Below

Post-Process PCP Archive (post-process)

Processes an existing PCP archive into a machine-readable format

Input

Type:scope
Root object:PostProcessParams
Properties
archive_path (string)
Name:archive file path
Description:The file system path to the PCP archive file. The path should include the name of the archive without a file extension.
Required:No
Default (JSON encoded):
"."
Type:string
Must match pattern:((?:[^/]*/)*)[^/]
flatten (bool)
Name:flatten JSON structure
Description:Processes the metrics first into a two-dimensional format via the pcp2csv converter, and then converts the CSV to JSON, effectively flattening the data structure. This is useful when indexing metrics to a service like Elasticsearch.
Required:No
Default (JSON encoded):
false
Type:bool
generate_csv (bool)
Name:generate CSV output
Description:Generates the data payload also in CSV format. This output goes to the debug_logs, or to stderr if the --debug flag is used.
Required:No
Default (JSON encoded):
false
Type:bool
pmlogger_interval (float)
Name:pmlogger logging interval
Description:The logging interval in seconds (float) used by pmlogger for data collection
Required:No
Default (JSON encoded):
1.0
Type:float
Units:nanoseconds
pmlogger_metrics (string)
Name:pmlogger metrics to report
Description:The pmrep-compatible metrics values to report as a space-separated string. NOTE: Input not validated by the plugin -- Any errors are likely to be produced at the end of the plugin run and may result in workflow failures.
Required:No
Default (JSON encoded):
":vmstat :sar :sar-B :sar-w :sar-b :sar-H :sar-r"
Type:string
pmrep_conf_path (string)
Name:pmrep config file path
Description:The file system path to the pmrep config file.
Required:No
Default (JSON encoded):
"/etc/pcp/pmrep"
Type:string
Must match pattern:((?:[^/]*/)*)[^/]
Objects
PostProcessParams (object)
Type:object
Properties
archive_path (string)
Name:archive file path
Description:The file system path to the PCP archive file. The path should include the name of the archive without a file extension.
Required:No
Default (JSON encoded):
"."
Type:string
Must match pattern:((?:[^/]*/)*)[^/]
flatten (bool)
Name:flatten JSON structure
Description:Processes the metrics first into a two-dimensional format via the pcp2csv converter, and then converts the CSV to JSON, effectively flattening the data structure. This is useful when indexing metrics to a service like Elasticsearch.
Required:No
Default (JSON encoded):
false
Type:bool
generate_csv (bool)
Name:generate CSV output
Description:Generates the data payload also in CSV format. This output goes to the debug_logs, or to stderr if the --debug flag is used.
Required:No
Default (JSON encoded):
false
Type:bool
pmlogger_interval (float)
Name:pmlogger logging interval
Description:The logging interval in seconds (float) used by pmlogger for data collection
Required:No
Default (JSON encoded):
1.0
Type:float
Units:nanoseconds
pmlogger_metrics (string)
Name:pmlogger metrics to report
Description:The pmrep-compatible metrics values to report as a space-separated string. NOTE: Input not validated by the plugin -- Any errors are likely to be produced at the end of the plugin run and may result in workflow failures.
Required:No
Default (JSON encoded):
":vmstat :sar :sar-B :sar-w :sar-b :sar-H :sar-r"
Type:string
pmrep_conf_path (string)
Name:pmrep config file path
Description:The file system path to the pmrep config file.
Required:No
Default (JSON encoded):
"/etc/pcp/pmrep"
Type:string
Must match pattern:((?:[^/]*/)*)[^/]
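
The flatten option described above turns two-dimensional CSV data into a list of flat JSON records. The following Python sketch shows that general CSV-to-JSON technique on a hypothetical sample; it is not the plugin's own implementation, and the column names are invented for illustration.

```python
import csv
import io
import json

# Hypothetical CSV in the spirit of pcp2csv output; real column names will differ.
SAMPLE_CSV = """\
Time,kernel.all.load-1 minute,mem.util.free
2024-01-01 00:00:00,0.25,1024
2024-01-01 00:00:01,0.30,1000
"""


def csv_to_flat_records(csv_text):
    """Convert CSV text into a list of flat dicts, one per sampling interval."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [dict(row) for row in reader]


if __name__ == "__main__":
    print(json.dumps(csv_to_flat_records(SAMPLE_CSV), indent=2))
```

Each record has a single level of keys, which is the shape that document indexes such as Elasticsearch handle most easily. Note that csv.DictReader leaves every value as a string, so numeric conversion would be a separate step.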

Outputs

error

Type:scope
Root object:Error
Properties
error (string)
Required:Yes
Type:string
Objects
Error (object)
Type:object
Properties
error (string)
Required:Yes
Type:string

success

Type:scope
Root object:PerfOutput
Properties
pcp_output (list[any])
Name:PCP output list
Description:List of performance data in intervals from PCP
Required:Yes
Type:list[any]
List items
Type:any
Objects
PerfOutput (object)
Type:object
Properties
pcp_output (list[any])
Name:PCP output list
Description:List of performance data in intervals from PCP
Required:Yes
Type:list[any]
List items
Type:any

Run PCP (run-pcp)

Runs the PCP data collection and then processes the results into a machine-readable format

Input

Type:scope
Root object:PcpInputParams
Properties
flatten (bool)
Name:flatten JSON structure
Description:Processes the metrics first into a two-dimensional format via the pcp2csv converter, and then converts the CSV to JSON, effectively flattening the data structure. This is useful when indexing metrics to a service like Elasticsearch.
Required:No
Default (JSON encoded):
false
Type:bool
generate_csv (bool)
Name:generate CSV output
Description:Generates the data payload also in CSV format. This output goes to the debug_logs, or to stderr if the --debug flag is used.
Required:No
Default (JSON encoded):
false
Type:bool
pmlogger_conf (string)
Name:pmlogger configuration file
Description:Complete configuration file content for pmlogger as a multi-line string. If no config file is provided, a default one will be generated. NOTE: Input not validated by the plugin -- Any errors are likely to be produced at the end of the plugin run and may result in workflow failures.
Required:No
Type:string
pmlogger_interval (float)
Name:pmlogger logging interval
Description:The logging interval in seconds (float) used by pmlogger for data collection
Required:No
Default (JSON encoded):
1.0
Type:float
Units:nanoseconds
pmlogger_metrics (string)
Name:pmlogger metrics to report
Description:The pmrep-compatible metrics values to report as a space-separated string. NOTE: Input not validated by the plugin -- Any errors are likely to be produced at the end of the plugin run and may result in workflow failures.
Required:No
Default (JSON encoded):
":vmstat :sar :sar-B :sar-w :sar-b :sar-H :sar-r"
Type:string
pmrep_conf (string)
Name:pmrep configuration file
Description:Complete configuration file content for pmrep as a multi-line string. If no config file is provided, a default one will be used. This configuration is used internally for `pcp2json` and `pcp2csv`. NOTE: Input not validated by the plugin -- Any errors are likely to be produced at the end of the plugin run and may result in workflow failures.
Required:No
Type:string
pmrep_conf_path (string)
Name:pmrep config file path
Description:The file system path to the pmrep config file.
Required:No
Default (JSON encoded):
"/etc/pcp/pmrep"
Type:string
Must match pattern:((?:[^/]*/)*)[^/]
timeout (int)
Name:pmlogger timeout seconds
Description:Timeout in seconds after which to cancel the pmlogger collection
Required:No
Type:int
Objects
PcpInputParams (object)
Type:object
Properties
flatten (bool)
Name:flatten JSON structure
Description:Processes the metrics first into a two-dimensional format via the pcp2csv converter, and then converts the CSV to JSON, effectively flattening the data structure. This is useful when indexing metrics to a service like Elasticsearch.
Required:No
Default (JSON encoded):
false
Type:bool
generate_csv (bool)
Name:generate CSV output
Description:Generates the data payload also in CSV format. This output goes to the debug_logs, or to stderr if the --debug flag is used.
Required:No
Default (JSON encoded):
false
Type:bool
pmlogger_conf (string)
Name:pmlogger configuration file
Description:Complete configuration file content for pmlogger as a multi-line string. If no config file is provided, a default one will be generated. NOTE: Input not validated by the plugin -- Any errors are likely to be produced at the end of the plugin run and may result in workflow failures.
Required:No
Type:string
pmlogger_interval (float)
Name:pmlogger logging interval
Description:The logging interval in seconds (float) used by pmlogger for data collection
Required:No
Default (JSON encoded):
1.0
Type:float
Units:nanoseconds
pmlogger_metrics (string)
Name:pmlogger metrics to report
Description:The pmrep-compatible metrics values to report as a space-separated string. NOTE: Input not validated by the plugin -- Any errors are likely to be produced at the end of the plugin run and may result in workflow failures.
Required:No
Default (JSON encoded):
":vmstat :sar :sar-B :sar-w :sar-b :sar-H :sar-r"
Type:string
pmrep_conf (string)
Name:pmrep configuration file
Description:Complete configuration file content for pmrep as a multi-line string. If no config file is provided, a default one will be used. This configuration is used internally for `pcp2json` and `pcp2csv`. NOTE: Input not validated by the plugin -- Any errors are likely to be produced at the end of the plugin run and may result in workflow failures.
Required:No
Type:string
pmrep_conf_path (string)
Name:pmrep config file path
Description:The file system path to the pmrep config file.
Required:No
Default (JSON encoded):
"/etc/pcp/pmrep"
Type:string
Must match pattern:((?:[^/]*/)*)[^/]
timeout (int)
Name:pmlogger timeout seconds
Description:Timeout in seconds after which to cancel the pmlogger collection
Required:No
Type:int

Outputs

error

Type:scope
Root object:Error
Properties
error (string)
Required:Yes
Type:string
Objects
Error (object)
Type:object
Properties
error (string)
Required:Yes
Type:string

success

Type:scope
Root object:PerfOutput
Properties
pcp_output (list[any])
Name:PCP output list
Description:List of performance data in intervals from PCP
Required:Yes
Type:list[any]
List items
Type:any
Objects
PerfOutput (object)
Type:object
Properties
pcp_output (list[any])
Name:PCP output list
Description:List of performance data in intervals from PCP
Required:Yes
Type:list[any]
List items
Type:any
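
Both steps share the same two outputs: success carries a pcp_output list, and error carries an error string. A consumer might handle them along these lines. This is a minimal sketch; the function name is hypothetical, and how the output ID and data are obtained (from a workflow or from the plugin's stand-alone output) is left to the caller.

```python
def handle_output(output_id, output_data):
    """Return the list of interval records, or raise on the error output."""
    if output_id == "error":
        # The Error object has a single required string field named "error".
        raise RuntimeError(output_data["error"])
    if output_id != "success":
        raise ValueError(f"unexpected output id: {output_id}")
    records = output_data["pcp_output"]
    if not isinstance(records, list):
        raise TypeError("pcp_output should be a list")
    return records
```

Since the list items are typed as any, code that reads individual records should not assume a fixed structure, particularly when switching the flatten option on or off.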
