cryostatio / cryostat

Secure JDK Flight Recorder management for containerized JVMs

Home Page: https://cryostat.io

License: Other

java jdk missioncontrol flightrecorder containers jmx monitoring observability kubernetes openshift

INTRODUCTION


A container-native JVM application which acts as a bridge to other containerized JVMs and exposes a secure API for producing, analyzing, and retrieving JDK Flight Recorder data from your cloud workloads.

SEE ALSO

  • cryostat.io : upstream documentation website with user guides, tutorials, blog posts, and other user-facing content. Start here if what you've read so far sounds interesting and you want to know more as a user, rather than as a developer.

  • cryostat-core : the core library providing a convenience wrapper and headless stubs for using JFR via JDK Mission Control internals.

  • cryostat-operator : an OpenShift Operator deploying Cryostat in your OpenShift cluster as well as exposing the Cryostat API as Kubernetes Custom Resources.

  • cryostat-web : the React frontend included as a submodule in Cryostat and built into Cryostat's (non-headless mode) OCI images.

  • JDK Mission Control : the original JDK Mission Control desktop application, the complement to JFR. Some parts of JMC are borrowed and re-used to form the basis of Cryostat. JMC is still a recommended tool for more full-featured analysis of JFR files beyond what Cryostat currently implements.

CONTRIBUTING

We welcome and appreciate any contributions from our community. Please visit our guide on how you can take part in improving Cryostat.

See contribution guide →

REQUIREMENTS

Build Requirements:

  • Git
  • JDK17+
  • Maven 3+
  • Podman 2.0+
  • qemu-user-static to build container images for other architectures

Run Requirements:

  • Kubernetes/OpenShift/Minishift, Podman/Docker, or other container platform
  • systemctl --user enable --now podman.socket to enable the user podman.socket for podman discovery

BUILD

Setup Dependencies

  • Clone and install cryostat-core via its instructions
  • Initialize submodules via: git submodule init && git submodule update

Build project locally

  • ./mvnw compile

Build and run project locally in development hot-reload mode

  • sh devserver.sh - this will start the Vert.x backend in hot-reload mode, so any modifications to files in src/ will cause a re-compilation and re-deploy. This is only intended for use during development. The web-client assets will not be built and will not be included in the application classpath. To set up the web-client frontend for hot-reload development, see cryostat-web Development Server.

Build and push to local podman image registry

  • ./mvnw package
  • Run ./mvnw -Dheadless=true clean package to exclude web-client assets. The clean phase should always be specified here, or else previously-generated client assets will still be included in the built image.
  • For other OCI builders, use the imageBuilder Maven property. For example, to use docker, run: ./mvnw -DimageBuilder=$(which docker) clean verify

TEST

Unit tests

  • ./mvnw test

Integration tests and analysis tools

  • ./mvnw verify

Skipping tests

  • -DskipUTs=true to skip unit tests
  • -DskipITs=true to skip integration tests
  • -DskipTests=true to skip all tests

Running integration tests without rebuild

  • bash repeated-integration-tests.bash.
  • To run selected integration tests without rebuilding, append the name(s) of your itest class(es) as an argument to repeated-integration-tests.bash, e.g. bash repeated-integration-tests.bash AutoRulesIT,RecordingWorkflowIT. Note that modifying a test file does not require a rebuild.

RUN

Run on Kubernetes/OpenShift

  • See cryostat-operator (listed under SEE ALSO above) for deploying Cryostat in a Kubernetes or OpenShift cluster.

Run on local podman*

  • run.sh

Note: If you get a 'No plugin found' error, it is because Maven has not downloaded all the necessary plugins. To resolve this error, manually run ./mvnw help:evaluate to prompt Maven to download the missing help plugin.

Run on local podman with Grafana, jfr-datasource and demo application*

  • smoketest.sh

To run on local podman, cgroups v2 should be enabled. This allows resource configuration for any rootless containers running on podman. To ensure podman works with cgroups v2, follow these instructions.

Note: If your podman runtime is set to runc v1.0.0-rc91 or later, it is not necessary to change it to crun as recommended in the instructions, since this version of runc supports cgroups v2. The article refers to an older version of runc.

CONFIGURATION

Cryostat can be configured via the following environment variables:

Configuration for cryostat

  • CRYOSTAT_WEB_HOST: the hostname used by the cryostat web server. Defaults to reverse-DNS resolving the host machine's hostname.
  • CRYOSTAT_WEB_PORT: the internal port used by the cryostat web server. Defaults to 8181.
  • CRYOSTAT_EXT_WEB_PORT: the external port used by the cryostat web server. Defaults to be equal to CRYOSTAT_WEB_PORT.
  • CRYOSTAT_CORS_ORIGIN: the origin for CORS to load a different cryostat-web instance. Defaults to the empty string, which disables CORS.
  • CRYOSTAT_MAX_WS_CONNECTIONS: the maximum number of websocket client connections allowed (minimum 1, maximum Integer.MAX_VALUE, default Integer.MAX_VALUE)
  • CRYOSTAT_AUTH_MANAGER: the authentication/authorization manager used for validating user accesses. See the USER AUTHENTICATION / AUTHORIZATION section for more details. Set to the fully-qualified class name of the auth manager implementation to use, ex. io.cryostat.net.BasicAuthManager. Defaults to an AuthManager corresponding to the selected deployment platform, whether explicit or automatic (see below).
  • CRYOSTAT_PLATFORM: the platform clients used for performing platform-specific actions, such as listing available target JVMs. If CRYOSTAT_AUTH_MANAGER is not specified then a default auth manager will also be selected corresponding to the highest priority platform, whether those platforms are specified by the user or automatically detected. Set to the fully-qualified names of the platform detection strategy implementations to use, ex. io.cryostat.platform.internal.KubeApiPlatformStrategy,io.cryostat.platform.internal.PodmanPlatformStrategy.
  • CRYOSTAT_ENABLE_JDP_BROADCAST: enable the Cryostat JVM to broadcast itself via JDP (Java Discovery Protocol). Defaults to true.
  • CRYOSTAT_JDP_ADDRESS: the JDP multicast address to send discovery packets. Defaults to 224.0.23.178.
  • CRYOSTAT_JDP_PORT: the JDP multicast port to send discovery packets. Defaults to 7095.
  • CRYOSTAT_CONFIG_PATH: the filesystem path for the configuration directory. Defaults to /opt/cryostat.d/conf.d.
  • CRYOSTAT_DISABLE_BUILTIN_DISCOVERY: set to true to disable built-in target discovery mechanisms (see CRYOSTAT_PLATFORM). Custom Target "discovery" remains available, but discovery via JDP, Kubernetes API, or Podman API is disabled and ignored. This will still allow platform detection to automatically select an AuthManager. This is intended for use when Cryostat Discovery Plugins are the only desired mechanism for locating target applications. See #936 and cryostat-agent. Defaults to false.
  • CRYOSTAT_K8S_NAMESPACES: set to a comma-separated list of Namespaces that Cryostat should query to discover target JVM applications with its built-in discovery mechanism.
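As a sketch, a few of these variables might be set when launching the Cryostat container; the image reference and values below are illustrative, not defaults:

    $ podman run \
        -e CRYOSTAT_WEB_PORT=8181 \
        -e CRYOSTAT_EXT_WEB_PORT=443 \
        -e CRYOSTAT_MAX_WS_CONNECTIONS=100 \
        -e CRYOSTAT_K8S_NAMESPACES=default,monitoring \
        quay.io/cryostat/cryostat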

Configuration for Automated Analysis Reports

  • CRYOSTAT_REPORT_GENERATION_MAX_HEAP: the maximum heap size used by the container subprocess which forks to perform automated rules analysis report generation. The default is 200, representing a 200MiB maximum heap size. Too small of a heap size will lead to report generation failing due to Out-Of-Memory errors. Too large of a heap size may lead to the subprocess being forcibly killed and the parent process failing to detect the reason for the failure, leading to inaccurate failure error messages and API responses.

Configuration for JMX Connections and Cache

  • CRYOSTAT_JMX_CONNECTION_TIMEOUT_SECONDS: the maximum wait time for a JMX connection to open and a single operation to complete. This is only used for specific internally-fired operations that are expected to execute very quickly after the connection opens. Default 3, minimum 1.
  • CRYOSTAT_TARGET_MAX_CONCURRENT_CONNECTIONS: the maximum number of concurrent JMX connections open. When this many connections are open, any requests requiring further connections will block until a previous connection closes. Defaults to -1, which indicates an unlimited number of connections.
  • CRYOSTAT_TARGET_CACHE_TTL: the time to live (in seconds) for cached JMX connections. Defaults to 10, minimum 1. Any values less than 1 will be overridden with 1.

Configuration for Logging

  • CRYOSTAT_JUL_CONFIG : the java.util.logging.config.file configuration file for logging via SLF4J. Some of Cryostat's dependencies also use java.util.logging for their logging. Cryostat disables some of these by default, because they generate unnecessary logs. However, they can be re-enabled by overriding the default configuration file and setting the disabled loggers to the desired level.

Configuration for Event Templates

  • CRYOSTAT_TEMPLATE_PATH: the storage path for Cryostat event templates

Configuration for Archiving

  • CRYOSTAT_ARCHIVE_PATH: the storage path for archived recordings
  • CRYOSTAT_PUSH_MAX_FILES: the maximum number of archived recordings stored in a FIFO manner per target JVM when pushing JFR files using the RecordingsFromIdPostHandler. Mainly used with the cryostat-agent as a global default configuration for the maximum number of archived JFR recordings to keep on disk per-agent-attached-target, which can be overridden by the agent itself. Defaults to Integer.MAX_VALUE, minimum 1. Any values less than 1 will be overridden with 1.

Configuration for database

  • CRYOSTAT_JDBC_DRIVER: driver to use for communicating with the database. Defaults to org.h2.Driver. org.postgresql.Driver is also supported.
  • CRYOSTAT_JDBC_URL: URL for connecting to the database. Defaults to jdbc:h2:mem:cryostat;INIT=create domain if not exists jsonb as other for an h2 in-memory database. Also supported: jdbc:h2:file:/opt/cryostat.d/conf.d/h2;INIT=create domain if not exists jsonb as other, or a PostgreSQL URL such as jdbc:postgresql://cryostat:5432/cryostat.
  • CRYOSTAT_JDBC_USERNAME: username for JDBC connection.
  • CRYOSTAT_JDBC_PASSWORD: password for JDBC connection.
  • CRYOSTAT_JMX_CREDENTIALS_DB_PASSWORD: encryption password for stored JMX credentials.
  • CRYOSTAT_HIBERNATE_DIALECT: Defaults to org.hibernate.dialect.H2Dialect. Also supported: org.hibernate.dialect.PostgreSQL95Dialect.
  • CRYOSTAT_HBM2DDL: Control Hibernate schema DDL. Defaults to create.
  • CRYOSTAT_LOG_DB_QUERIES: Enable verbose logging of database queries. Defaults to false.
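For example, a hypothetical PostgreSQL-backed configuration might combine these variables as follows (the host, database name, and credentials are placeholders):

    export CRYOSTAT_JDBC_DRIVER=org.postgresql.Driver
    export CRYOSTAT_JDBC_URL=jdbc:postgresql://cryostat:5432/cryostat
    export CRYOSTAT_JDBC_USERNAME=cryostat
    export CRYOSTAT_JDBC_PASSWORD=changeme
    export CRYOSTAT_HIBERNATE_DIALECT=org.hibernate.dialect.PostgreSQL95Dialect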

MONITORING APPLICATIONS

In order for cryostat to monitor JVM application targets, the targets must have RJMX enabled or have the Cryostat Agent installed and configured. cryostat has several strategies for automatic discovery of potential targets.

The first target discovery mechanism uses the OpenShift/Kubernetes API to list service endpoints and expose all discovered services as potential targets. This is runtime dynamic, allowing cryostat to discover new services which come online after cryostat, or to detect when known services disappear later. This requires the cryostat pod to have authorization to list services within its own namespace. By default this will look for Endpoints objects with ports named jfr-jmx or numbered 9091. This behaviour can be overridden using the environment variables CRYOSTAT_DISCOVERY_K8S_PORT_NAMES and CRYOSTAT_DISCOVERY_K8S_PORT_NUMBERS respectively. Both of these accept comma-separated lists as values. Any observed Endpoints object with a name in the given list or a number in the given list will be taken as a connectable target application. To set the names list to the empty list use -. To set the numbers list to the empty list use 0.
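For instance, to widen the criteria so that ports named either jfr-jmx or my-jmx, or numbered 9091 or 9999, are treated as connectable (the extra name and number here are hypothetical):

    export CRYOSTAT_DISCOVERY_K8S_PORT_NAMES=jfr-jmx,my-jmx    # '-' would set an empty names list
    export CRYOSTAT_DISCOVERY_K8S_PORT_NUMBERS=9091,9999       # '0' would set an empty numbers list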

The second discovery mechanism is JDP (Java Discovery Protocol). This relies on target JVMs being configured with the JVM flags to enable JDP and requires the targets to be reachable and in the same subnet as cryostat. JDP can be enabled by passing the flag "-Dcom.sun.management.jmxremote.autodiscovery=true" when starting target JVMs; for more configuration options, see this document. Once the targets are properly configured, cryostat will automatically discover their JMX service URLs, which include the RJMX port number for each specific target.
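As a sketch, a target JVM might be launched for JDP discovery like this (the non-JDP flags mirror the RJMX examples later in this section, and my-app.jar is a placeholder):

    java -Dcom.sun.management.jmxremote.autodiscovery=true \
         -Dcom.sun.management.jmxremote.port=9091 \
         -Dcom.sun.management.jmxremote.ssl=false \
         -Dcom.sun.management.jmxremote.authenticate=false \
         -jar my-app.jar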

The third discovery mechanism is the Podman API. If the Podman API socket is available at its default filesystem location then Cryostat will query the libpod/containers endpoint to determine what target applications may be available. Containers must have the Podman label io.cryostat.connectUrl applied, and the value should be the remote JMX or Cryostat Agent HTTP connection URL that Cryostat can use to communicate with the target.
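A hypothetical workload launch carrying that label might look like the following (the connect URL and image name are placeholders):

    $ podman run \
        --label io.cryostat.connectUrl='service:jmx:rmi:///jndi/rmi://myapp:9091/jmxrmi' \
        my-app-image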

To enable RJMX on port 9091, the following JVM flag should be passed at target startup:

    '-Dcom.sun.management.jmxremote.port=9091'

The port number 9091 is arbitrary and may be configured to suit individual deployments, so long as the port property above matches the desired port number and the deployment network configuration allows connections on the configured port.

Additionally, the following flags are recommended to enable JMX authentication and connection encryption:

-Dcom.sun.management.jmxremote.authenticate=true # enable JMX authentication
-Dcom.sun.management.jmxremote.password.file=/app/resources/jmxremote.password # define users for JMX auth
-Dcom.sun.management.jmxremote.access.file=/app/resources/jmxremote.access # set permissions for JMX users
-Dcom.sun.management.jmxremote.ssl=true # enable JMX SSL
-Dcom.sun.management.jmxremote.registry.ssl=true # enable JMX registry SSL
-Djavax.net.ssl.keyStore=/app/resources/keystore # set your SSL keystore
-Djavax.net.ssl.keyStorePassword=somePassword # set your SSL keystore password

JMX Connectors

Cryostat supports end-user target applications using other JMX connectors than RMI (for example, WildFly remote+http) using "client library" configuration. The path pointed to by the environment variable CRYOSTAT_CLIENTLIB_PATH is appended to Cryostat's classpath. This path should be a directory within a volume mounted to the Cryostat container and containing library JARs (ex. jboss-client.jar) in a flat structure.

In the particular case of WildFly remote+http, you might do something like the following to add this capability:

$ podman cp wildfly:/opt/jboss/wildfly/bin/client/jboss-client.jar clientlib/
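The clientlib directory could then be mounted into the Cryostat container and advertised via the environment variable; the container-side path below is illustrative:

    $ podman run \
        -v "$(pwd)/clientlib:/clientlib:z" \
        -e CRYOSTAT_CLIENTLIB_PATH=/clientlib \
        quay.io/cryostat/cryostat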

EVENT TEMPLATES

JDK Flight Recorder has event templates, which are preset definitions of a set of events and, for each event, a set of options and option values. A given JVM is likely to have some built-in templates ready for use out of the box, but Cryostat also hosts its own small catalog of templates within its own storage. This catalog is stored at the path specified by the environment variable CRYOSTAT_TEMPLATE_PATH. Templates can be uploaded to Cryostat and then used to create recordings.

ARCHIVING RECORDINGS

cryostat supports a concept of "archiving" recordings. This means taking the contents of a recording at a point in time and saving those contents to a file local to the cryostat process (as opposed to "active" recordings, which exist within the memory of the target JVM and continue to grow over time). The default directory used is /flightrecordings, but the environment variable CRYOSTAT_ARCHIVE_PATH can be used to specify a different path. To enable cryostat archive support, ensure that the directory specified by CRYOSTAT_ARCHIVE_PATH (or /flightrecordings if not set) exists and has appropriate permissions. cryostat will detect the path and enable the related functionality. run.sh has an example of a tmpfs volume being mounted at the default path to enable the archive functionality, as sketched below.
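For example, mirroring what run.sh does, the archive directory could be supplied as a tmpfs mount at the default path (image reference illustrative):

    $ podman run \
        --mount type=tmpfs,destination=/flightrecordings \
        quay.io/cryostat/cryostat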

SECURING COMMUNICATION CHANNELS

To specify the SSL certificate for HTTPS/WSS and JMX, one can set KEYSTORE_PATH to point to a .jks, .pfx or .p12 certificate file and KEYSTORE_PASS to the plaintext password to such a keystore. Alternatively, one can set KEY_PATH to a PEM encoded key file and CERT_PATH to a PEM encoded certificate file.
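For example (paths and password are placeholders):

    export KEYSTORE_PATH=/certs/cryostat-keystore.p12
    export KEYSTORE_PASS=changeit

    # or, alternatively, a PEM-encoded key/certificate pair:
    export KEY_PATH=/certs/cryostat-key.pem
    export CERT_PATH=/certs/cryostat-cert.pem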

In the absence of these environment variables, cryostat will look for a certificate at the following locations, in order:

  • $HOME/cryostat-keystore.jks (used together with KEYSTORE_PASS)
  • $HOME/cryostat-keystore.pfx (used together with KEYSTORE_PASS)
  • $HOME/cryostat-keystore.p12 (used together with KEYSTORE_PASS)
  • $HOME/cryostat-key.pem and $HOME/cryostat-cert.pem

If no certificate can be found, cryostat will autogenerate a self-signed certificate and use it to secure HTTPS/WSS and JMX connections.

If HTTPS/WSS (SSL) and JMX auth credentials must be disabled then the environment variables CRYOSTAT_DISABLE_SSL=true and/or CRYOSTAT_DISABLE_JMX_AUTH=true can be set.

In case cryostat is deployed behind an SSL proxy, set the environment variable CRYOSTAT_SSL_PROXIED to a non-empty value. This informs cryostat that the URLs it reports pointing back to itself should use the secure variants of protocols, even though it itself does not encrypt the traffic. This is only required if Cryostat's own SSL is disabled as above.

If the certificate used for SSL-enabled Grafana/jfr-datasource connections is self-signed or otherwise untrusted, set the environment variable CRYOSTAT_ALLOW_UNTRUSTED_SSL to permit uploads of recordings.

Target JVMs with SSL enabled on JMX connections are also supported. In order to allow Cryostat to establish a connection, the target's certificate must be copied into Cryostat's /truststore directory before Cryostat's startup. If Cryostat attempts to connect to an SSL-enabled target and no matching trusted certificate is found then the connection attempt will fail.
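For instance, the target certificates might be provided as a volume mounted at that directory (host path and image reference are illustrative):

    $ podman run \
        -v "$(pwd)/trusted-certs:/truststore:z" \
        quay.io/cryostat/cryostat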

USER AUTHENTICATION / AUTHORIZATION

Cryostat has multiple authz manager implementations for handling user authentication and authorization against various platforms and mechanisms. This can be controlled using an environment variable (see the RUN section above), or automatically using platform detection.

In all scenarios, the presence of an auth manager (other than NoopAuthManager) causes Cryostat to expect a token or credentials via an Authorization header on all potentially sensitive requests, ex. recording creations and downloads, report generations.

The OpenShiftPlatformClient.OpenShiftAuthManager uses token authentication. These tokens are passed through to the OpenShift API for authz and this result determines whether Cryostat accepts the request.

The BasicAuthManager uses basic credential authentication configured with a standard Java properties file at $CRYOSTAT_CONFIG_PATH/cryostat-users.properties. The credentials stored in the Java properties file are the user name and a SHA-256 sum hex of the user's password. The property file contents should look like:

user1=abc123
user2=def987

Where abc123 and def987 are substituted with the SHA-256 sum hexes of the desired user passwords. These can be obtained with, for example, echo -n PASS | sha256sum | cut -d' ' -f1.
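Putting this together, a hypothetical setup for a user named user1 with password pass1:

    HASH=$(echo -n pass1 | sha256sum | cut -d' ' -f1)
    echo "user1=${HASH}" >> "$CRYOSTAT_CONFIG_PATH/cryostat-users.properties"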

Token-based auth managers expect an HTTP Authorization: Bearer TOKEN header and a Sec-WebSocket-Protocol: base64url.bearer.authorization.cryostat.${base64(TOKEN)} WebSocket SubProtocol header. The token is never stored in any form, only kept in-memory long enough to process the external token validation.

Basic credentials-based auth managers expect an HTTP Authorization: Basic ${base64(user:pass)} header and a Sec-WebSocket-Protocol: basic.authorization.cryostat.${base64(user:pass)} WebSocket SubProtocol header.
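As an illustration, the basic-auth header values might be constructed like this (credentials and host are placeholders, and the request path is just an illustrative endpoint; see the API section below):

    CREDS=$(echo -n 'user1:pass1' | base64)
    curl -H "Authorization: Basic ${CREDS}" https://cryostat.example.com:8181/api/v1/targets
    # matching WebSocket subprotocol header value:
    #   basic.authorization.cryostat.${CREDS}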

If no appropriate auth manager is configured or can be automatically determined then the fallback is the NoopAuthManager, which does no external validation calls and simply accepts any provided token or credentials.

INCOMING JMX CONNECTION AUTHENTICATION

JMX connections into cryostat are secured using the default username "cryostat" and a randomly generated password. The environment variables CRYOSTAT_RJMX_USER and CRYOSTAT_RJMX_PASS can be used to override the default username and specify a password.
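For example, to set explicit credentials (values are placeholders):

    export CRYOSTAT_RJMX_USER=admin
    export CRYOSTAT_RJMX_PASS=s3cr3t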

API

Cryostat exposes an HTTP API that provides the backing for its web interface, but is also intended as an automation or extension point for external clients. For details about this API see HTTP_API.md, GRAPHQL.md, and DISCOVERY_PLUGINS.md.


ISSUES

Update documentation with requirements for containers

The README or some other documentation should describe the requirements for Java applications in containers to be visible to this project, with sample commands for Docker/Podman or even specific to OpenShift, e.g. the RJMX exposure.

Connections should require some form of authentication

An authentication manager (or rather, managers for each of TUI, TCP, and WS connections) should be implemented which blocks user interaction until successfully authenticated. Ideally this should also have hooks for connection to some external authz mechanism (SSO) so that users can reuse their existing infrastructure.

JFR files should be re-uploadable into storage

Users should be able to (re-)upload existing JFR files from their local machine into ContainerJFR's storage (archive).

  1. In case their ContainerJFR instance goes away and storage is lost (ex. OpenShift Operator is deleted and restored)

  2. To allow ContainerJFR to be used as a desktop application for analysis, but with the Flight Recorder options set by JVM flags and dumped to local file directly by the JVM or another external tool (ex. jcmd)

  3. To allow migration between different local environments, for example a user using ContainerJFR as a desktop application with a workstation and a laptop

  4. For simply restoring accidentally deleted recordings from ContainerJFR archives

ContainerJFR should do some validation that the received file is a valid JFR file, that the specified name for the file is acceptable (i.e. compatible with file-relating commands), etc. The file name should also be automatically updated if the specified name conflicts with a pre-existing file name, so that the upload doesn't overwrite previous data.

HTML Reports generation (`GET /reports/foo`) can cause OOM

The HTML report generation API exposed by the JMC core libraries does not seem to offer any sort of streaming interface, so the entire recording must be loaded out of the target JVM and into memory, and then the HTML report built in-memory along with all required data structures. All of this intermediate data can add up to quite a large size, and when running container-jfr in a resource-constrained environment ex. a typical Kubernetes or Openshift scenario, can easily trigger OOMs and crash the process.

Ideally there would be a pipelined streaming API for remote JVM Flight Recording -> container-jfr -> JMC HTML Report API -> HTML report stream, so that container-jfr can then act as an intermediate pipe between the end user's client web browser and the target JVM's Flight Recorder, reducing the amount of memory required for the report building operation. This at the very least depends upon the target JVM supporting streaming JFR events (https://openjdk.java.net/jeps/349).

As an interim measure, if a reliable way can be found to measure the size of the recording within the target JVM without first loading it into container-jfr memory, then some heuristic can be applied and the decision to produce a report cancelled early if the expected resource consumption is too high. Ex. if generating the HTML report typically requires 2x as much memory as the recording size itself, and container-jfr has less than 3x (report plus recording) recording size of memory available to it, then the report should not be attempted.

Users should be able to create custom event templates

See #141

A user should be able to create a custom event template (ex. by uploading an XML .jfc file), which would subsequently be included in the set of event templates that can be applied to a recording. This should be stored in ContainerJFR's own local storage, and available as a template for any and all target JVMs. The template may not then be applicable to every target in case it specifies events/options not known to the target, but that's up to the users to determine compatibility when they're uploading custom templates.

Unit test failures

Some unit tests are failing since #57 , and I overlooked this before merging the PR. @tabjy , please correct them.

Integration tests sometimes hang

When running mvn clean verify, or something like it, the build sometimes gets stuck and seems to hang on a pre-integration-test step. The issue appears to be with the exec plugin when attempting to start the container using podman. This uses a synchronous exec plugin configuration but gives the --detach flag to podman, so the podman process should be exiting after the container has been spun up. It seems that sometimes either podman doesn't exit, or somehow the exec plugin doesn't see that it has. When this happens, podman ps -a can be used to observe that the container does in fact exist and is running. However, podman kill container-jfr-itest typically complains about a "device or resource busy", although it still seems to succeed in killing and removing the container anyway.

Rules Report should allow linking to pertinent Grafana visualizations

The HTML Reports should be modified such that rule result descriptions contain links to Grafana views - if a Grafana dashboard and datasource are configured - so that the end user can quickly and easily jump from the container-jfr overview into a more detailed analysis from the automated report, without resorting to downloading the flight recording and analysing the recording using traditional desktop JMC.

A minimal/"headless" container image should be provided

A minimal container-jfr image, without the -web client assets or any other additional resources beyond what is needed for connecting to container-jfr and starting/stopping/retrieving a recording, should be made available.

Intermittent unit test failure

[ERROR] Failures: 
[ERROR]   MessagingServerTest.shouldHandleRemovedConnections:160 
Expected: "another message"
     but: was "hello world"

This test seems to intermittently fail. Possibly related to Maven parallel execution configuration, since I don't recall ever seeing this failure before with the Gradle build setup.

Implement a "basic" auth manager

Currently, the only auth manager implementations are "noop" (permissive) and "OpenShift" (delegates to the OpenShift API). There should be an auth manager implementation that instead allows some configuration of basic key-value pairs for usernames and password hashes. Ideally this should allow for more than one such user account to be configured. This could be via some environment variables following some predefined naming convention, or by some config file. Combined with #102 , this would allow a scenario like basic auth in OpenShift, and enabling auth for the desktop app or simple Docker deployment scenarios.

ConcurrentModificationException on WS message handling

I attempted to request a recording via the container-jfr-operator CR API and found this in the container-jfr logs:

[INFO] (10.128.0.1:56640): GET /clienturl 200 0ms
[INFO] (10.128.0.1:56504): CMD {"command":"ping","args":[]}
[INFO] (10.128.0.1:56766): GET /clienturl 200 0ms
[INFO] (10.128.0.1:56504): CMD {"command":"ping","args":[]}
[INFO] (10.128.0.1:56894): GET /clienturl 200 1ms
[INFO] (10.128.0.1:56504): CMD {"command":"ping","args":[]}
[INFO] (10.128.0.1:57020): GET /clienturl 200 1ms
[INFO] (10.128.0.1:56504): CMD {"command":"ping","args":[]}
[INFO] (10.128.0.1:57392): GET /clienturl 200 0ms
[INFO] (10.128.0.1:56504): CMD {"command":"ping","args":[]}
[INFO] (10.128.0.1:57856): GET /clienturl 200 0ms
[INFO] (10.128.0.1:56504): CMD {"command":"ping","args":[]}
[INFO] (10.128.0.1:58252): GET /clienturl 200 0ms
[INFO] (10.128.0.1:56504): CMD {"command":"ping","args":[]}
[INFO] (10.128.0.1:58378): GET /clienturl 200 0ms
[INFO] (10.128.0.1:56504): CMD {"command":"ping","args":[]}
[INFO] (10.128.0.1:58504): GET /clienturl 200 1ms
[INFO] (10.128.0.1:56504): CMD {"command":"ping","args":[]}
[INFO] (10.128.1.99:35886): GET /clienturl 200 0ms
[INFO] Connected remote client 10.128.0.1:58582
[INFO] (10.128.0.1:58582): CMD {"command":"is-connected","args":null}

[INFO] (10.128.0.1:58582): CMD {"command":"connect","args":["172.30.181.199:9091"]}

[INFO] No active connection

[INFO] (10.128.0.1:58582): CMD {"command":"dump","args":["apimade","30","ALL"]}

[INFO] (10.128.0.1:56504): CMD {"command":"list-event-types","args":[]}
[INFO] (10.128.0.1:56504): CMD {"command":"list","args":[]}
[INFO] (10.128.0.1:58582): CMD {"command":"list","args":null}

[INFO] (10.128.0.1:56504): CMD {"command":"list","args":[]}
[INFO] (10.128.0.1:58582): CMD {"command":"disconnect","args":null}

[INFO] (10.128.1.99:35886): GET /clienturl 200 0ms
[INFO] Disconnected remote client 10.128.0.1:58582
[INFO] Connected remote client 10.128.0.1:58644
[INFO] (10.128.0.1:58644): CMD {"command":"is-connected","args":null}

[INFO] (10.128.0.1:58644): CMD {"command":"disconnect","args":null}

[INFO] (10.128.0.1:58644): CMD {"command":"connect","args":["172.30.181.199:9091"]}

[INFO] No active connection

[INFO] (10.128.0.1:58644): CMD {"command":"dump","args":["apimade","30","ALL"]}

[INFO] (10.128.0.1:56504): CMD {"command":"list-event-types","args":[]}
[INFO] (10.128.0.1:56504): CMD {"command":"list","args":[]}
[INFO] (10.128.0.1:56504): CMD {"command":"list-event-types","args":[]}
[INFO] (10.128.0.1:56504): CMD {"command":"list","args":[]}
[INFO] Disconnected remote client 10.128.0.1:58644
[INFO] (10.128.0.1:58684): GET /clienturl 200 0ms
[INFO] (10.128.1.99:35886): GET /clienturl 200 2ms
[INFO] Connected remote client 10.128.0.1:58690
[INFO] (10.128.0.1:58690): CMD {"command":"is-connected","args":null}

[INFO] (10.128.0.1:56504): CMD {"command":"list","args":[]}
[INFO] (10.128.0.1:58690): CMD {"command":"disconnect","args":null}

[INFO] Disconnected remote client 10.128.0.1:58690
[INFO] (10.128.1.99:35886): GET /clienturl 200 3ms
[INFO] Connected remote client 10.128.0.1:58722
[INFO] (10.128.0.1:58722): CMD {"command":"is-connected","args":null}

[INFO] (10.128.0.1:56504): CMD {"command":"ping","args":[]}
[INFO] (10.128.0.1:58722): CMD {"command":"connect","args":["172.30.181.199:9091"]}

[INFO] No active connection

[INFO] (10.128.0.1:58722): CMD {"command":"dump","args":["apimade","30","ALL"]}

[INFO] (10.128.0.1:56504): CMD {"command":"list-event-types","args":[]}
[INFO] (10.128.0.1:56504): CMD {"command":"list","args":[]}
[INFO] (10.128.0.1:58722): CMD {"command":"list","args":null}

[INFO] Disconnected remote client 10.128.0.1:58722
[INFO] (10.128.1.99:35886): GET /clienturl 200 0ms
[INFO] Connected remote client 10.128.0.1:58772
[INFO] (10.128.0.1:58772): CMD {"command":"is-connected","args":null}

[INFO] (10.128.0.1:56504): CMD {"command":"list","args":[]}
[INFO] (10.128.0.1:58772): CMD {"command":"disconnect","args":null}

[INFO] Disconnected remote client 10.128.0.1:58772
[INFO] (10.128.1.99:35886): GET /clienturl 200 1ms
[INFO] Connected remote client 10.128.0.1:58802
[INFO] (10.128.0.1:58802): CMD {"command":"is-connected","args":null}

[INFO] (10.128.0.1:58802): CMD {"command":"connect","args":["172.30.181.199:9091"]}

[INFO] No active connection

[INFO] (10.128.0.1:56504): CMD {"command":"list-event-types","args":[]}
[INFO] (10.128.0.1:56504): CMD {"command":"list","args":[]}
[INFO] (10.128.0.1:58802): CMD {"command":"dump","args":["apimade","30","ALL"]}

[INFO] Disconnected remote client 10.128.0.1:58802
[INFO] (10.128.1.99:35886): GET /clienturl 200 0ms
[INFO] Connected remote client 10.128.0.1:58848
[INFO] (10.128.0.1:58848): CMD {"command":"is-connected","args":null}

[INFO] (10.128.0.1:56504): CMD {"command":"list","args":[]}
[INFO] (10.128.0.1:58848): CMD {"command":"disconnect","args":null}

[INFO] Disconnected remote client 10.128.0.1:58848
[INFO] (10.128.1.99:35886): GET /clienturl 200 0ms
[INFO] Connected remote client 10.128.0.1:58872
[INFO] (10.128.0.1:58872): CMD {"command":"is-connected","args":null}

[INFO] (10.128.0.1:58872): CMD {"command":"connect","args":["172.30.181.199:9091"]}

[INFO] No active connection

[INFO] (10.128.0.1:56504): CMD {"command":"list-event-types","args":[]}
[INFO] (10.128.0.1:56504): CMD {"command":"list","args":[]}
[INFO] (10.128.0.1:58872): CMD {"command":"dump","args":["apimade","30","ALL"]}

[INFO] Disconnected remote client 10.128.0.1:58872
Exception in thread "main" java.util.ConcurrentModificationException
	at java.base/java.util.ArrayList.forEach(ArrayList.java:1542)
	at com.redhat.rhjmc.containerjfr.tui.ws.MessagingServer.flush(MessagingServer.java:115)
	at com.redhat.rhjmc.containerjfr.tui.ws.WsCommandExecutor.flush(WsCommandExecutor.java:119)
	at com.redhat.rhjmc.containerjfr.tui.ws.WsCommandExecutor.run(WsCommandExecutor.java:85)
	at com.redhat.rhjmc.containerjfr.ContainerJfr.main(ContainerJfr.java:69)
[INFO] (10.128.1.99:35886): GET /clienturl 200 0ms

The requested recording was observed to be created in the -web UI, but container-jfr seemed to be broken after this exception was thrown. For example, analysis summaries for the recording could be retrieved, but targets could not be scanned (infinite spinner).

WebSocket auth token should be expected as a websocket subprotocol, not query param

In order to avoid the auth token being entered into logs etc., it should be supplied by the client on WebSocket connection request via a websocket subprotocol (ex. https://docs.openshift.com/container-platform/4.1/authentication/understanding-authentication.html : Sent as a websocket subprotocol header in the form base64url.bearer.authorization.k8s.io.<base64url-encoded-token> for websocket requests) rather than the current query parameter.

Uncaught NPE when sent an empty websocket message with auth

Using websocat to connect to ws://localhost:8181/command, if I press enter, inputting an empty string (I assume), there is an uncaught exception that makes the WebSocket handling die, and no further action can be taken without restarting the server.

Exception in thread "main" java.lang.NullPointerException
        at com.redhat.rhjmc.containerjfr.tui.ws.WsCommandExecutor.run(WsCommandExecutor.java:50)
        at com.redhat.rhjmc.containerjfr.ContainerJfr.main(ContainerJfr.java:69)

If incorrect input like "hi" is provided, the server continues after responding over the websocket with error details. That should also occur for an empty message.

Recording summary reports show as plain text

The Content-Type response header for the /reports/<recording> endpoint has an incorrect value: text/plain.


This problem is reproducible by running kube-setup.sh against a fresh Openshift installation. Looking at the script content, it's fetching the latest of quay.io/rh-jmc-team/container-jfr, which is currently v0.4.3

Interestingly, it appears the response content type is explicitly set to NanoHTTPD.MIME_HTML since v0.3.2.

It's possible that Openshift's reverse proxy somehow overwrote this header field.

$ curl --head -H "Accept: text/html" http://container-jfr-container-jfr.10.15.17.89.nip.io/reports/test
HTTP/1.1 200 OK 
Content-Type: text/plain; charset=UTF-8
Date: Wed, 2 Oct 2019 19:33:32 GMT
Access-Control-Allow-Origin: *
Content-Length: 122387
Set-Cookie: 98c81825faae9a2f4c5de3ce34beb5a5=160b9b5980b10c3e9837d7d18ffece6b; path=/; HttpOnly
Cache-control: private

Maven build

We should consider using Maven to build rather than Gradle.

Things to consider:

  • -web needs to be built by invoking npm
  • there needs to be some configuration flags or similar to allow building standard and minimal (no web)
  • it would be nice to have tasks for building container images, but not strictly required (can be done by an external build script, Makefile, etc.)

ALL org.openjdk.jmc classes should be refactored into -core and abstracted

Despite the existence of container-jfr-core, container-jfr itself still contains some direct dependencies on upstream JMC, mostly in the way of service interfaces, recording option definitions, recording descriptors, and exception types. These should be abstracted away behind -core, so that the coupling to upstream JMC is restricted to -core as a thin wrapper, allowing the possibility to remove the dependency on upstream JMC in the future with minimal additional refactoring work.

When saving files into archive, filenames should be automatically updated to avoid clobbering old data

Example scenario:

User has ongoing in-memory recording named foo from target app.svc.local. Saving this recording to archive produces a file on disk named "app-svc-local_foo.jfr". Currently, saving the recording again will overwrite "app-svc-local_foo.jfr". This should not be the case.

  1. A timestamp should be attached to saved filenames, ex. "app-svc-local_foo_20191105T090738.jfr"

  2. If two recordings happen to have the same filename, whether from rapid repeated saving of the same active recording or re-upload or any other reason, then some differentiating suffix should be applied to the filename such as an incrementing count of the number of copies ("local_foo_20191105T090738_1.jfr")

rh-jmc-team#43 broke unit tests

PR rh-jmc-team#43 modified error strings in implementations, but not in tests.

com.redhat.rhjmc.containerjfr.commands.internal.ConnectCommandTest > shouldNotValidateNull() FAILED
    org.mockito.exceptions.verification.ArgumentsAreDifferent at ConnectCommandTest.java:84

com.redhat.rhjmc.containerjfr.commands.internal.ConnectCommandTest > shouldNotValidateEmptyString() FAILED
    org.mockito.exceptions.verification.ArgumentsAreDifferent at ConnectCommandTest.java:90

com.redhat.rhjmc.containerjfr.commands.internal.ConnectCommandTest > shouldExpectOneArg() FAILED
    org.mockito.exceptions.verification.ArgumentsAreDifferent at ConnectCommandTest.java:67

com.redhat.rhjmc.containerjfr.commands.internal.ConnectCommandTest > shouldNotValidateInvalidIdentifiers(String)[1] FAILED
    org.mockito.exceptions.verification.ArgumentsAreDifferent at ConnectCommandTest.java:78

com.redhat.rhjmc.containerjfr.commands.internal.ConnectCommandTest > shouldNotValidateInvalidIdentifiers(String)[2] FAILED
    org.mockito.exceptions.verification.ArgumentsAreDifferent at ConnectCommandTest.java:78

com.redhat.rhjmc.containerjfr.commands.internal.ConnectCommandTest > shouldNotValidateInvalidIdentifiers(String)[3] FAILED
    org.mockito.exceptions.verification.ArgumentsAreDifferent at ConnectCommandTest.java:78

417 tests completed, 6 failed
testsuite com.redhat.rhjmc.containerjfr.commands.internal.SaveRecordingCommandTest:
  testcase com.redhat.rhjmc.containerjfr.commands.internal.ConnectCommandTest > shouldNotValidateNull(): Argument(s) are different! Wanted:
    cw.println(
        "Expected one argument: host name/URL"
    );
    -> at com.redhat.rhjmc.containerjfr.core.tui.ClientWriter.println(ClientWriter.java:11)

> Task :test FAILED

This is really trivial though.

Messages/responses should be uniquely identified

See discussion in #114 . Currently, messages do not explicitly contain any identifying information about the sending client (though the server can of course tell which socket connection received a message), and all responses to such messages are multicast out to all clients, with no indication for the clients whether they are receiving a broadcast or a direct response to one of their messages. Due to the overall design and architecture it does not make sense at this time to move to only using unicast responses, so including some kind of client/message identifier on client messages which is included in the corresponding broadcast response will allow clients to selectively react to responses.

Network-accessing classes need to be testable

Currently there are several classes which create TCP sockets or HTTP client connections (some examples: RecordingExporter, UploadSavedRecordingsCommand, SocketClientReaderWriter, SocketInteractiveShellExecutor, JMCConnection), and which are partly or entirely missing unit tests since the networking codepaths cannot be exercised in unit testing without actually producing network connections. We need some enhanced mocking utilities, or a test harness to intercept connections, in order to properly test and verify these classes.

container-jfr seems to hang when connecting to itself while running in Minishift

As documented here: cryostatio/cryostat-operator#24

When running in Minishift (or perhaps more generally, OpenShift 3.x), requesting container-jfr to connect to itself as a target results in the connection hanging on the container-jfr side, and becoming unresponsive to connected websocket clients or their connections being closed. This doesn't seem to occur with CodeReady Containers/OpenShift 4.x, nor does it occur when running container-jfr with the Docker daemon locally or running it as a JVM process directly on a host machine.

Whatever causes the connection to hang to begin with should be addressed, if possible. A timeout should also be implemented in case the connection does hang, so that the container-jfr instance can remain alive and responsive (perhaps this also implies liveness/readiness probes should be implemented).

Verify minimal set of JVM flags for remote JMX connection

The documentation and all demos/examples currently state this list of JVM flags as required for container-jfr to be able to connect and manage targets:

    '-Dcom.sun.management.jmxremote.rmi.port=9091',
    '-Dcom.sun.management.jmxremote=true',
    '-Dcom.sun.management.jmxremote.port=9091',
    '-Dcom.sun.management.jmxremote.ssl=false',
    '-Dcom.sun.management.jmxremote.authenticate=false',
    '-Dcom.sun.management.jmxremote.local.only=false',
    '-Djava.rmi.server.hostname=$TARGET_HOSTNAME'

This set of flags should be examined to see if any are unnecessary or redundant.

WebSocket ConcurrentModificationException

Exception in thread "main" java.util.ConcurrentModificationException
	at java.base/java.util.ArrayList.forEach(ArrayList.java:1542)
	at com.redhat.rhjmc.containerjfr.tui.ws.MessagingServer.flush(MessagingServer.java:131)
	at com.redhat.rhjmc.containerjfr.tui.ws.WsCommandExecutor.flush(WsCommandExecutor.java:113)
	at com.redhat.rhjmc.containerjfr.tui.ws.WsCommandExecutor.run(WsCommandExecutor.java:79)
	at com.redhat.rhjmc.containerjfr.ContainerJfr.main(ContainerJfr.java:69)

AuthManagers should be configurable

Auth management should be somehow configurable (ex. by env var), and platform detection used to determine the auth manager to use only if no configuration exists (or is invalid?). Currently the only working scenario this enables is for noop auth in an OpenShift deployment, but for future auth managers it will make more sense to allow this flexibility.

Investigate Vert.x webserver recording streaming performance

See previous discussion during review of #57 .

The current implementation for streaming of in-memory recordings relies on using an intermediate buffer in container-jfr, which reads data out of the target JVM into a buffer, then asks the Vert.x webserver to copy the contents of that buffer into its internal write queue. Once all of the data has thus been copied out of the remote JVM and into the response write queue, the response begins to be flushed out the network stack. There should be some better way to stream the data through container-jfr and the Vert.x webserver so that less time is spent building up the intermediate copy of the flight recording, perhaps using a Vert.x Pump.

OpenShift platform not detected

I'm not sure what caused this to happen, but I noticed that there was no access control for my deployed Container JFR (v0.13.0) on CodeReady Containers. Looking in the logs I saw this:

$ oc logs containerjfr-58f689b7b4-skrjl -c containerjfr
[INFO] Logger level: INFO
[INFO] container-jfr started. args: []
SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
SLF4J: Defaulting to no-operation (NOP) logger implementation
SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.
[WARN] No available SSL certificates. Fallback to plain HTTP.
[INFO] Selecting platform default AuthManager
[INFO] io.fabric8.kubernetes.client.KubernetesClientException: Failure executing: GET at: https://172.30.0.1/apis/route.openshift.io/v1/namespaces/default/routes. Message: service unavailable
.
	at io.fabric8.kubernetes.client.dsl.base.OperationSupport.requestFailure(OperationSupport.java:510)
	at io.fabric8.kubernetes.client.dsl.base.OperationSupport.assertResponseCode(OperationSupport.java:449)
	at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleResponse(OperationSupport.java:413)
	at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleResponse(OperationSupport.java:372)
	at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleResponse(OperationSupport.java:354)
	at io.fabric8.kubernetes.client.dsl.base.BaseOperation.listRequestHelper(BaseOperation.java:153)
	at io.fabric8.kubernetes.client.dsl.base.BaseOperation.list(BaseOperation.java:620)
	at io.fabric8.kubernetes.client.dsl.base.BaseOperation.list(BaseOperation.java:69)
	at com.redhat.rhjmc.containerjfr.platform.internal.OpenShiftPlatformStrategy.isAvailable(OpenShiftPlatformStrategy.java:53)
	at java.base/java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:176)
	at java.base/java.util.stream.SortedOps$SizedRefSortingSink.end(SortedOps.java:361)
	at java.base/java.util.stream.AbstractPipeline.copyIntoWithCancel(AbstractPipeline.java:503)
	at java.base/java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:488)
	at java.base/java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:474)
	at java.base/java.util.stream.FindOps$FindOp.evaluateSequential(FindOps.java:150)
	at java.base/java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
	at java.base/java.util.stream.ReferencePipeline.findFirst(ReferencePipeline.java:543)
	at com.redhat.rhjmc.containerjfr.platform.PlatformModule.providePlatformClient(PlatformModule.java:24)
	at com.redhat.rhjmc.containerjfr.platform.PlatformModule_ProvidePlatformClientFactory.proxyProvidePlatformClient(PlatformModule_ProvidePlatformClientFactory.java:35)
	at com.redhat.rhjmc.containerjfr.platform.PlatformModule_ProvidePlatformClientFactory.get(PlatformModule_ProvidePlatformClientFactory.java:24)
	at com.redhat.rhjmc.containerjfr.platform.PlatformModule_ProvidePlatformClientFactory.get(PlatformModule_ProvidePlatformClientFactory.java:10)
	at dagger.internal.DoubleCheck.get(DoubleCheck.java:47)
	at com.redhat.rhjmc.containerjfr.net.web.WebModule_ProvideAuthManagerFactory.get(WebModule_ProvideAuthManagerFactory.java:23)
	at com.redhat.rhjmc.containerjfr.net.web.WebModule_ProvideAuthManagerFactory.get(WebModule_ProvideAuthManagerFactory.java:10)
	at dagger.internal.DoubleCheck.get(DoubleCheck.java:47)
	at com.redhat.rhjmc.containerjfr.net.NetworkModule.provideAuthManager(NetworkModule.java:132)
	at com.redhat.rhjmc.containerjfr.net.NetworkModule_ProvideAuthManagerFactory.proxyProvideAuthManager(NetworkModule_ProvideAuthManagerFactory.java:69)
	at com.redhat.rhjmc.containerjfr.net.NetworkModule_ProvideAuthManagerFactory.get(NetworkModule_ProvideAuthManagerFactory.java:44)
	at com.redhat.rhjmc.containerjfr.net.NetworkModule_ProvideAuthManagerFactory.get(NetworkModule_ProvideAuthManagerFactory.java:14)
	at dagger.internal.DoubleCheck.get(DoubleCheck.java:47)
	at com.redhat.rhjmc.containerjfr.net.web.WebModule_ProvideWebServerFactory.get(WebModule_ProvideWebServerFactory.java:70)
	at com.redhat.rhjmc.containerjfr.net.web.WebModule_ProvideWebServerFactory.get(WebModule_ProvideWebServerFactory.java:17)
	at dagger.internal.DoubleCheck.get(DoubleCheck.java:47)
	at com.redhat.rhjmc.containerjfr.DaggerContainerJfr_Client.webServer(DaggerContainerJfr_Client.java:660)
	at com.redhat.rhjmc.containerjfr.ContainerJfr.main(ContainerJfr.java:67)

[INFO] Selected KubeApi Platform Strategy
[INFO] low memory pressure streaming disabled for web server
[INFO] HTTPS service running on https://containerjfr-default.apps-crc.testing:443

This was an instance of CRC that was stopped previously and restarted without redeploying anything, FWIW.
