openshift / svt

License: Apache License 2.0

Shell 59.21%, Python 36.22%, DIGITAL Command Language 0.13%, 1C Enterprise 0.22%, Go 0.28%, Awk 0.65%, Makefile 0.03%, Groovy 0.11%, Dockerfile 1.96%, Jinja 1.20%

svt's Introduction

OpenShift, Kubernetes and Docker: Performance, Scalability and Capacity Planning Research by Red Hat

OpenShift v3 Scaling, Performance and Capacity Planning Whitepaper

This repository details the approach, process and procedures used by engineering teams at Red Hat to analyze and improve the performance and scalability of integrated platform and infrastructure stacks. It shares results, best practices and reference architectures for the Kubernetes- and Docker-based OpenShift v3 Platform-as-a-Service, as well as the Red Hat Atomic technologies.

Unsurprisingly, performance analysis and tuning in the container and container-orchestration space has tremendous overlap with previous-generation approaches to distributed computing. Performance still boils down to identifying and resolving bottlenecks, maintaining data and compute locality, and applying scale-out design best practices hard-won over decades of grid and high-performance computing research.

Further tests quantify application performance when running in a container hosted by OpenShift, as well as measure reliability over time, searching for things like memory leaks.

IMPORTANT

While the tests in this repository are used by the Red Hat team to measure performance, these are not supported in any OpenShift environments, and Red Hat support services cannot provide assistance with any problems.

How this repository is organized:

The hierarchy of this repository is as follows:

.
├── application_performance: JMeter-based performance testing of applications hosted on OpenShift.
├── applications_scalability: Performance and scalability testing of the OpenShift web UI.
├── conformance: Wrappers to run a subset of e2e/conformance tests in an SVT environment (work in progress).
├── docs: Documentation that can help with SVT testing.
├── image_provisioner: Ansible playbooks for building AMI and qcow2 images with OpenShift RPMs and Docker images baked in.
├── networking: Performance tests for the OpenShift SDN and kube-proxy.
├── openshift_performance: Performance tests for container build parallelism, projects and persistent storage (EBS, Ceph, Gluster and NFS).
├── openshift_scalability: Home of the infamous "cluster-loader"; details in openshift_scalability/README.md.
└── reliability: Tests that run over long periods of time (weeks), cycling object quantities up and down.

Dockerfiles and Dependencies

Certain tests use the Quickstarts from the openshift/origin repository. Ensure that they are available in your OpenShift project environment before using any of the tests:

https://github.com/openshift/origin/tree/master/examples/quickstarts

Also ensure that the requisite image streams are available in your OpenShift project environment:

https://github.com/openshift/origin/tree/master/examples/image-streams

Also, for disconnected installations, the following Dockerfiles should be located somewhere on the target installation machines:

./reliability/Dockerfile
./openshift_scalability/content/centos-stress/Dockerfile
./openshift_scalability/content/logtest/Dockerfile
./dockerfiles/Dockerfile
./storage/fio/Dockerfile
./networking/synthetic/stac-s2i-builder-image/Dockerfile
./networking/synthetic/uperf/Dockerfile

Feedback and Issues

Feedback, issues and pull requests are happily accepted; feel free to submit them.

svt's People

Contributors

akrzos, anpingli, chadcrum, chaitanyaenr, ekuric, hongkailiu, hroyrh, jeremyeder, jhadvig, jmencak, liqcui, mbruzek, mffiedler, mrsiano, ofthecurerh, paigerube14, qiliredhat, rflorenc, rh-ematysek, rpattath, schituku, shahsahil264, shrivaibavi, sjug, skordas, smalleni, svetsa-rh, vikaschoudhary16, vikaslaad, wabouhamad


svt's Issues

Evaluate current test run parameters

Now that there are two releases' worth of data, experience and lessons learned using cluster-loader for vertical and horizontal testing, we should update or modify the tests as appropriate. Possible areas:

  • project composition
  • new scale targets from online, if available
  • tradeoffs between test execution time and success - don't overwhelm the cluster.
  • template contents
  • other?

cc: @timothysc @jeremyeder

Cleaning policy

Copied from old repo

Currently the cluster loader tries to do step-wise cleaning of the cluster, but I would argue that this is not a realistic use case. Users may make changes independently, but these would often happen in an uncontrolled, random fashion. On the other hand, operators are likely to do bulk cleaning operations; a case in point would be removal after the 30-day evaluation period.

The purpose of this issue is to hash out the concrete use cases so we can have the cluster-loader reflect them appropriately.

Kitchen Sink: Mark all JSON template triggers in quickstarts as automatic: true

The automatic: false flag in the deployment triggers of the quickstart templates was not honored until 3.4, so when the cluster loader created apps as part of the kitchen sink tests it still rolled out the deployments marked as false. Now that the flag is fixed, those apps are no longer deployed automatically, and the services have to be rolled out manually via "oc rollout latest" after the cluster loader script finishes.
Change all flags to automatic: true so that deployments happen automatically and no manual intervention is needed.

https://github.com/openshift/origin/blob/master/UPGRADE.md#origin-14x--ose-34x
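
A minimal sketch of the change as a throwaway script; the quickstart path and the trigger layout (spec.triggers[].imageChangeParams.automatic) are assumptions based on the repository layout and standard DeploymentConfig templates, not a committed implementation:

import glob
import json

# Flip every deployment trigger that is explicitly marked automatic: false.
# The glob path below is an assumption about where the quickstart templates live.
for path in glob.glob('openshift_scalability/content/quickstarts/*.json'):
    with open(path) as f:
        template = json.load(f)
    changed = False
    for obj in template.get('objects', []):
        if obj.get('kind') != 'DeploymentConfig':
            continue
        for trigger in obj.get('spec', {}).get('triggers', []):
            params = trigger.get('imageChangeParams')
            if params and params.get('automatic') is False:
                params['automatic'] = True
                changed = True
    if changed:
        with open(path, 'w') as f:
            json.dump(template, f, indent=2)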

aws-cli wrapper

For billing purposes, we need to enforce some uniformity around creating AWS instances. To do this, I propose wrapping the AWS CLI utility with our own script that we'll carry in svt.

A typical invocation is:

$ aws ec2 run-instances --image-id $IMAGE_ID --count $COUNT --instance-type $INSTANCE_TYPE --key-name $SSH_KEY_NAME --subnet-id $SUBNET --security-group-ids $SGID

The script capabilities should allow for customization of the following (a rough sketch of such a wrapper appears after the lists below):

  • Image ID, defaulting to our latest AMI; we will update the script in svt whenever the AMI is updated. This way everyone always uses the latest unless they choose otherwise.
  • Count, instance count is variable of course. No default, fail if not specified.
  • Instance Type, no default, fail if not specified.
  • key-name should be hard-coded
  • subnet should be hard-coded
  • security group should be hard-coded
  • disk image size, I think we want to offer this
  • storage type, gp2/io1

Instance tags:

  • group, for billing purposes we'd like to track which group created the instance (right now this would be qe or perf). No default, fail if not specified
  • username, the aws account username who created the instance

Few other thoughts:

  • "hard-coded" could mean it comes from a config file, too I guess...up to you.
  • If anyone ever needs to customize things the script doesn't provide, they can always use the CLI tool manually.
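
A rough sketch of what such a wrapper could look like; the defaults, tag keys, device name and the AMI/key/subnet/security-group placeholders below are illustrative assumptions, not decisions:

import argparse
import getpass
import subprocess

DEFAULT_IMAGE_ID = 'ami-REPLACEME'   # placeholder; bump whenever a new AMI is published
KEY_NAME = 'svt-key'                 # placeholders for the "hard-coded" values (or read them from a config file)
SUBNET_ID = 'subnet-REPLACEME'
SECURITY_GROUP_ID = 'sg-REPLACEME'

parser = argparse.ArgumentParser(description='Thin wrapper around "aws ec2 run-instances"')
parser.add_argument('--image-id', default=DEFAULT_IMAGE_ID)
parser.add_argument('--count', required=True)
parser.add_argument('--instance-type', required=True)
parser.add_argument('--group', required=True, choices=['qe', 'perf'])
parser.add_argument('--volume-size', default='35')
parser.add_argument('--volume-type', default='gp2', choices=['gp2', 'io1'])
args = parser.parse_args()

# Tag the instance with the billing group and the local username of whoever ran the script.
# --tag-specifications needs a reasonably recent aws CLI; older versions would call create-tags afterwards.
tags = 'ResourceType=instance,Tags=[{Key=group,Value=%s},{Key=username,Value=%s}]' % (
    args.group, getpass.getuser())

subprocess.check_call([
    'aws', 'ec2', 'run-instances',
    '--image-id', args.image_id,
    '--count', args.count,
    '--instance-type', args.instance_type,
    '--key-name', KEY_NAME,
    '--subnet-id', SUBNET_ID,
    '--security-group-ids', SECURITY_GROUP_ID,
    '--tag-specifications', tags,
    '--block-device-mappings',
    'DeviceName=/dev/sda1,Ebs={VolumeSize=%s,VolumeType=%s}' % (args.volume_size, args.volume_type),
])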

Anything I missed?

/cc @sjug

customer feedback on network test

Comments from customer using the svt/networking tests:

I also tried to run Jeremy's network benchmark (svt), but we faced some difficulties:

  • uperf is automatically packaged inside Docker images by the Ansible playbook, but the Docker images are built on the nodes themselves, and that failed because, on our R-Box setup, the nodes don't have access to the Internet. So I had to prepare the Docker image elsewhere, push it to our own Docker hub, and comment out the Docker image creation in the Ansible playbook.

  • The tests require "pbench [to be] installed and configured on all hosts". This means installing a new repo, which was made difficult because our nodes don't have access to the Internet and the repositories to add are not the "standard" rhel-7 ones that we are already mirroring on our Artifactory (which is the only source of RPMs accessible from our nodes).

  • The "Patch pbench-uperf" task (here: https://github.com/openshift/svt/blob/master/networking/synthetic/pod-ip-test-setup.yaml#L93) was failing. I looked at the file to be patched and couldn't find anything similar to the hunks of the patch, so I tried to just skip that task. When the "Run pbench-uperf for TCP benchmarks" task was finally launched, it never ended. I looked on the master where pbench-uperf was launched and it appeared that it tried to contact IPs on the overlay network. The command that was launched was "python network-test.py podIP --master --node --pods 8" in order to measure between two pods, but it seems that even this scenario requires the master to be able to reach the overlay network. This was unfortunately not the case on our current R-Box setup, where the masters have not yet been made (non-schedulable) nodes.

@vikaschoudhary16 @sjug

Add a window-scaling functionality to cluster loader.

While the current stepsize/pause/delay options in the tuningsets sections of cluster-loader templates can be useful to rate-limit and mitigate issues like:

Error syncing pod, skipping: failed to "StartContainer" for "POD"  with RunContainerError: "runContainer: operation timeout: context deadline exceeded"

I've found that simply keeping only a certain number of pods outside of the Running state mitigates this issue automatically, without having to estimate tuningset values to avoid it.

Therefore I propose to add a "queue-depth" field/functionality in the tuningsets section for cluster-loader templates, which would block the creation of new pods if the queue depth is greater than a given number.
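
A minimal sketch of the idea, polling pod status with oc; the function names, namespace handling and poll interval are illustrative, not the proposed cluster-loader implementation:

import json
import subprocess
import time

def pods_not_running(namespace):
    # Count pods in the namespace that have not reached the Running (or Succeeded) phase yet.
    out = subprocess.check_output(['oc', 'get', 'pods', '-n', namespace, '-o', 'json'])
    pods = json.loads(out)['items']
    return sum(1 for p in pods if p['status'].get('phase') not in ('Running', 'Succeeded'))

def wait_for_queue(namespace, queue_depth, poll_interval=5):
    # Block before creating the next pod while more than queue_depth pods are still pending/starting.
    while pods_not_running(namespace) > queue_depth:
        time.sleep(poll_interval)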

Adding a link to a thread:
http://post-office.corp.redhat.com/archives/openshift-sme/2016-November/msg00301.html

@jeremyeder @sjug

Docker build issue with Network tests

Network tests no longer work on the master node; the docker build fails because of the OpenShift firewall. The tests need to be run from a separate host.

Implement ec2 exponential back-off for cluster-loader

From old repo:

ekuric commented on Apr 5
The Amazon API will refuse to serve requests if the API rate limit is exceeded; exponential back-off needs to be implemented in all functions that interact with the Amazon API
-> http://docs.aws.amazon.com/general/latest/gr/api-retries.html
@ofthecurerh

ofthecurerh commented on Apr 12
The boto3 library handles exponential backoff and retries, we just need to properly handle the exception.

Example snippet:

def create_volume(self, availability_zone, **kwargs):
    try:
        volume = self.resource.create_volume(
            DryRun=False,
            AvailabilityZone=availability_zone,
            **kwargs)
    except botocore.exceptions.ClientError as err:
        logging.warn('Unexpected Error: %s', err.response['Error']['Code'])
    else:
        return volume
@ekuric

ekuric commented on Apr 14
@ofthecurerh thx, I will check to add this to create/delete scripts involving boto3.
@ekuric

ekuric commented 28 days ago
I have this working in my tests, will create a PR for it.

cluster_loader: template cleaning currently broken

From old repo:

The clean_templates() method has gotten into a bad state somehow. globalvars is not passed to it and there are other undefined variables (e.g. templatefile). I have disabled template cleaning until this can be addressed.

Cluster loader: storage support creates PVCs in default namespace

Something may have changed in OCP since storage support was added to cluster_loader, but currently the tool creates the PVCs in the default namespace.

Adding a namespace to the pvc template and giving it the value of the current namespace where the pod is being created fixes the problem. Preparing a PR for this.
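
A minimal sketch of the fix, assuming the PVC definition is loaded as a dict before being handed to oc create; the function name is illustrative:

import json

def load_pvc_with_namespace(pvc_template_path, namespace):
    # Give the PVC the same namespace as the pod being created, so it no longer
    # lands in the "default" namespace.
    with open(pvc_template_path) as f:
        pvc = json.load(f)
    pvc.setdefault('metadata', {})['namespace'] = namespace
    return pvc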

/cc: @ekuric

masterVertical.sh and pyconfigMasterVirtScale.yaml behaviour

@timothysc
I've run ./masterVertical.sh in svt and got this:

In project clusterproject3 on server https://ip-172-31-41-250.us-west-2.compute.internal:8443

route/route0 not accepted: HostAlreadyClaimed (svc/service0)
  dc/deploymentconfig0 deploys docker.io/openshift/hello-openshift:latest 
    deployment #1 deployed about an hour ago - 1 pod

route/route1 not accepted: HostAlreadyClaimed (svc/service1)
  dc/deploymentconfig1 deploys docker.io/openshift/hello-openshift:latest 
    deployment #1 deployed about an hour ago - 1 pod

svc/service2v0 - 172.30.252.181:80 -> 8080
  dc/deploymentconfig2v0 deploys docker.io/openshift/hello-openshift:latest 
    deployment #1 deployed about an hour ago - 2 pods

bc/buildconfig0 source builds git://github.com/tiwillia/hello-openshift-example.git on istag/imagestream0:latest (from bc/buildconfig0)
  -> istag/imagestream0:latest
  build #1 failed 59 minutes ago

bc/buildconfig1 source builds git://github.com/tiwillia/hello-openshift-example.git on istag/imagestream1:latest (from bc/buildconfig1)
  -> istag/imagestream1:latest
  build/build1-1 failed about an hour ago (can't push to image)

bc/buildconfig2 source builds git://github.com/tiwillia/hello-openshift-example.git on istag/imagestream2:latest (from bc/buildconfig2)
  -> istag/imagestream2:latest
  build/build2-1 failed about an hour ago (can't push to image)

Errors:
  * bc/buildconfig1 is pushing to istag/imagestream1:latest, but the image stream for that tag does not exist.
  * bc/buildconfig2 is pushing to istag/imagestream2:latest, but the image stream for that tag does not exist.
  * build/build1-1 has failed.
  * build/build2-1 has failed.
  * route/route0 was not accepted by router "router": a route in another namespace holds www.example0.com and is older than route0 (HostAlreadyClaimed)
  * route/route1 was not accepted by router "router": a route in another namespace holds www.example1.com and is older than route1 (HostAlreadyClaimed)
  * route/route2 was not accepted by router "router": a route in another namespace holds www.example2.com and is older than route2 (HostAlreadyClaimed)

7 errors and 6 warnings identified, use 'oc status -v' to see details.

Tested on two clusters, same results. Not sure this is the intended behaviour.

Bring network tests up to date with OCP, Ansible, pbench, etc

  • Issue #87 - use the Ansible 2.0 API
  • remove pbench-uperf patching
  • update playbooks for latest OCP template processing syntax
  • respect new pbench-uperf port range
  • investigate svc-to-svc problem
  • investigate output data issue (possible pbench-uperf bug)
  • require user to explicitly provide public ssh key for pods to use
  • ??? more to be added as discovered

add support for ebs dynamic storage allocations

From the old repo

OpenShift supports dynamic storage allocation. We now have support for EBS in cluster loader, but it performs the following create steps:
ebs -> pv -> pvc

With dynamic storage allocation this is reduced to only the pvc step.
@ekuric

ekuric commented 14 days ago
I will send PR for #111
@ofthecurerh

ofthecurerh commented 14 days ago • edited
Is a PR required? This should already work by using a template.

Edit: Link to example: https://github.com/ofthecurerh/svt/blob/master-vert/openshift_scalability/content/quickstarts/cakephp-mysql.json#L113
@ekuric

ekuric commented 14 days ago • edited
@ofthecurerh we still need to update pvc.json dynamically with the pvc claim name, i.e. we still need to create the pvc. So I will be removing the ebs/pv steps, but, unless I am wrong, the pvc step needs to stay
@ofthecurerh

ofthecurerh commented 14 days ago
@ekuric I'm saying that instead of creating a separate function to handle pvc, we should be using the already defined template function. What you're doing in the ebs_create function to replace the values in the pvc json is what templates are designed to do.
@ekuric

ekuric commented 14 days ago • edited
@ofthecurerh show me the code... which template function do you mean, or a link to it?
@ofthecurerh

ofthecurerh commented 14 days ago
@ekuric The cluster-loader can process/create templates; it's nearly equivalent to oc process -f template.json -v SOME_VAR=foo | oc create -f -.

This is the cluster-loader config I'm using: https://github.com/ofthecurerh/svt/blob/master-vert/openshift_scalability/config/master-vert.yaml

And here's an example of a template: https://github.com/ofthecurerh/svt/blob/master-vert/openshift_scalability/content/quickstarts/cakephp-mysql.json

Some pods end up in Error state after using some quickstart templates with OCP 3.4

All quickstart templates work fine on 3.2.x and 3.3.x, however:

$ oc version
oc v3.4.0.9
kubernetes v1.4.0+776c994
features: Basic-Auth GSSAPI Kerberos SPNEGO

Server https://ip-172-31-52-249.us-west-2.compute.internal:8443
openshift v3.4.0.9
kubernetes v1.4.0+776c994

$ oc get pods --all-namespaces
NAMESPACE            NAME                                  READY     STATUS      RESTARTS   AGE
cake                 cakephp-mysql-example-1-9a53a         1/1       Running     0          1h
cake                 cakephp-mysql-example-1-build         0/1       Completed   0          1h
cake                 mysql-1-vwo17                         1/1       Running     0          1h
cakephp-mysql0       cakephp-mysql-example-1-build         0/1       Completed   0          1h
cakephp-mysql0       cakephp-mysql-example-1-deploy        1/1       Running     0          1h
cakephp-mysql0       cakephp-mysql-example-1-hook-pre      0/1       Error       26         1h
dancer-mysql0        dancer-mysql-example-1-build          0/1       Completed   0          1h
dancer-mysql0        dancer-mysql-example-1-deploy         0/1       Error       0          1h
default              docker-registry-2-cozxh               1/1       Running     3          1d
default              registry-console-1-dpsxr              1/1       Running     3          1d
default              router-1-9cm9t                        1/1       Running     3          1d
django-postgresql0   django-psql-example-1-build           0/1       Completed   0          1h
django-postgresql0   django-psql-example-1-deploy          0/1       Error       0          1h
eap64-mysql0         eap-app-1-7pgrz                       1/1       Running     0          1h
eap64-mysql0         eap-app-1-build                       0/1       Completed   0          1h
eap64-mysql0         eap-app-mysql-1-pnaio                 1/1       Running     0          1h
nodejs-mongodb0      nodejs-mongodb-example-1-build        0/1       Completed   0          1h
nodejs-mongodb0      nodejs-mongodb-example-1-yizup        1/1       Running     0          1h
rails-postgresql0    rails-postgresql-example-1-build      0/1       Completed   0          1h
rails-postgresql0    rails-postgresql-example-1-deploy     0/1       Error       0          1h
rails-postgresql0    rails-postgresql-example-1-hook-pre   0/1       Error       0          1h
tomcat8-mongodb0     jws-app-1-5juwl                       1/1       Running     0          1h
tomcat8-mongodb0     jws-app-1-build                       0/1       Completed   0          1h
tomcat8-mongodb0     jws-app-mongodb-1-96aq0               1/1       Running     0          1h

$ oc describe pod cakephp-mysql-example-1-hook-pre -n cakephp-mysql0 
...
  FirstSeen LastSeen    Count   From                            SubobjectPath           Type        Reason      Message
  --------- --------    -----   ----                            -------------           --------    ------      -------
  1h        1h      1   {default-scheduler }                                    Normal      Scheduled   Successfully assigned cakephp-mysql-example-1-hook-pre to ip-172-31-52-249.us-west-2.compute.internal
  1h        1h      1   {kubelet ip-172-31-52-249.us-west-2.compute.internal}   spec.containers{lifecycle}  Normal      Pulling     pulling image "172.30.142.13:5000/cakephp-mysql0/cakephp-mysql-example@sha256:d4f00ca474587c79d94962d98a36b501bc498c066b982cbbf28d477c3f5562c9"
  1h        1h      1   {kubelet ip-172-31-52-249.us-west-2.compute.internal}   spec.containers{lifecycle}  Normal      Pulled      Successfully pulled image "172.30.142.13:5000/cakephp-mysql0/cakephp-mysql-example@sha256:d4f00ca474587c79d94962d98a36b501bc498c066b982cbbf28d477c3f5562c9"
  1h        1h      1   {kubelet ip-172-31-52-249.us-west-2.compute.internal}   spec.containers{lifecycle}  Normal      Created     Created container with docker id ae745fb761d9; Security:[seccomp=unconfined]
  1h        1h      1   {kubelet ip-172-31-52-249.us-west-2.compute.internal}   spec.containers{lifecycle}  Normal      Started     Started container with docker id ae745fb761d9
  1h        1h      1   {kubelet ip-172-31-52-249.us-west-2.compute.internal}   spec.containers{lifecycle}  Normal      Created     Created container with docker id 2631426ece46; Security:[seccomp=unconfined]
  1h        1h      1   {kubelet ip-172-31-52-249.us-west-2.compute.internal}   spec.containers{lifecycle}  Normal      Started     Started container with docker id 2631426ece46
  1h        1h      1   {kubelet ip-172-31-52-249.us-west-2.compute.internal}                   Warning     FailedSync  Error syncing pod, skipping: failed to "StartContainer" for "lifecycle" with CrashLoopBackOff: "Back-off 10s restarting failed container=lifecycle pod=cakephp-mysql-example-1-hook-pre_cakephp-mysql0(5810a073-913e-11e6-84c9-02579073c581)"

  1h    1h  1   {kubelet ip-172-31-52-249.us-west-2.compute.internal}   spec.containers{lifecycle}  Normal  Created     Created container with docker id f0ff58cb4e37; Security:[seccomp=unconfined]
  1h    1h  1   {kubelet ip-172-31-52-249.us-west-2.compute.internal}   spec.containers{lifecycle}  Normal  Started     Started container with docker id f0ff58cb4e37
  1h    1h  2   {kubelet ip-172-31-52-249.us-west-2.compute.internal}                   Warning FailedSync  Error syncing pod, skipping: failed to "StartContainer" for "lifecycle" with CrashLoopBackOff: "Back-off 20s restarting failed container=lifecycle pod=cakephp-mysql-example-1-hook-pre_cakephp-mysql0(5810a073-913e-11e6-84c9-02579073c581)"

  1h    1h  1   {kubelet ip-172-31-52-249.us-west-2.compute.internal}   spec.containers{lifecycle}  Normal  Created     Created container with docker id 3b4d3d4c809c; Security:[seccomp=unconfined]
  1h    1h  1   {kubelet ip-172-31-52-249.us-west-2.compute.internal}   spec.containers{lifecycle}  Normal  Started     Started container with docker id 3b4d3d4c809c
  1h    1h  4   {kubelet ip-172-31-52-249.us-west-2.compute.internal}                   Warning FailedSync  Error syncing pod, skipping: failed to "StartContainer" for "lifecycle" with CrashLoopBackOff: "Back-off 40s restarting failed container=lifecycle pod=cakephp-mysql-example-1-hook-pre_cakephp-mysql0(5810a073-913e-11e6-84c9-02579073c581)"

  1h    1h  1   {kubelet ip-172-31-52-249.us-west-2.compute.internal}   spec.containers{lifecycle}  Normal  Created     Created container with docker id 2b51069e4451; Security:[seccomp=unconfined]
  1h    1h  1   {kubelet ip-172-31-52-249.us-west-2.compute.internal}   spec.containers{lifecycle}  Normal  Started     Started container with docker id 2b51069e4451
  1h    1h  7   {kubelet ip-172-31-52-249.us-west-2.compute.internal}                   Warning FailedSync  Error syncing pod, skipping: failed to "StartContainer" for "lifecycle" with CrashLoopBackOff: "Back-off 1m20s restarting failed container=lifecycle pod=cakephp-mysql-example-1-hook-pre_cakephp-mysql0(5810a073-913e-11e6-84c9-02579073c581)"

  1h    1h  1   {kubelet ip-172-31-52-249.us-west-2.compute.internal}   spec.containers{lifecycle}  Normal  Started     Started container with docker id 9af9c059d42e
  1h    1h  1   {kubelet ip-172-31-52-249.us-west-2.compute.internal}   spec.containers{lifecycle}  Normal  Created     Created container with docker id 9af9c059d42e; Security:[seccomp=unconfined]
  1h    1h  13  {kubelet ip-172-31-52-249.us-west-2.compute.internal}                   Warning FailedSync  Error syncing pod, skipping: failed to "StartContainer" for "lifecycle" with CrashLoopBackOff: "Back-off 2m40s restarting failed container=lifecycle pod=cakephp-mysql-example-1-hook-pre_cakephp-mysql0(5810a073-913e-11e6-84c9-02579073c581)"

  1h    1h  1   {kubelet ip-172-31-52-249.us-west-2.compute.internal}   spec.containers{lifecycle}  Normal  Created     Created container with docker id aad9452568ca; Security:[seccomp=unconfined]
  1h    1h  1   {kubelet ip-172-31-52-249.us-west-2.compute.internal}   spec.containers{lifecycle}  Normal  Started     Started container with docker id aad9452568ca
  1h    1h  1   {kubelet ip-172-31-52-249.us-west-2.compute.internal}   spec.containers{lifecycle}  Normal  Started     Started container with docker id e67f62813644
  1h    1h  1   {kubelet ip-172-31-52-249.us-west-2.compute.internal}   spec.containers{lifecycle}  Normal  Created     Created container with docker id e67f62813644; Security:[seccomp=unconfined]
  1h    1h  1   {kubelet ip-172-31-52-249.us-west-2.compute.internal}   spec.containers{lifecycle}  Normal  Started     Started container with docker id b20cded25344
  1h    1h  1   {kubelet ip-172-31-52-249.us-west-2.compute.internal}   spec.containers{lifecycle}  Normal  Created     Created container with docker id b20cded25344; Security:[seccomp=unconfined]
  1h    4m  27  {kubelet ip-172-31-52-249.us-west-2.compute.internal}   spec.containers{lifecycle}  Normal  Pulled      Container image "172.30.142.13:5000/cakephp-mysql0/cakephp-mysql-example@sha256:d4f00ca474587c79d94962d98a36b501bc498c066b982cbbf28d477c3f5562c9" already present on machine
  1h    4m  19  {kubelet ip-172-31-52-249.us-west-2.compute.internal}   spec.containers{lifecycle}  Normal  Started     (events with common reason combined)
  1h    4m  19  {kubelet ip-172-31-52-249.us-west-2.compute.internal}   spec.containers{lifecycle}  Normal  Created     (events with common reason combined)
  1h    7s  503 {kubelet ip-172-31-52-249.us-west-2.compute.internal}                   Warning FailedSync  Error syncing pod, skipping: failed to "StartContainer" for "lifecycle" with CrashLoopBackOff: "Back-off 5m0s restarting failed container=lifecycle pod=cakephp-mysql-example-1-hook-pre_cakephp-mysql0(5810a073-913e-11e6-84c9-02579073c581)"

  1h    7s  530 {kubelet ip-172-31-52-249.us-west-2.compute.internal}   spec.containers{lifecycle}  Warning BackOff Back-off restarting failed docker container

'-hook-pre' seems to be the common denominator of the failed pods. Looking into templates now.

cluster-loader users not visible on oc get

Copied from old repository

Users created by cluster-loader are not visible when we do "oc get users". The reason is that oauth tokens are only generated when a particular user logs in. We need to add another step to user creation where all the users log in after creation.
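
A rough sketch of that extra step, assuming the generated users share a known password; the user names, password and server URL are placeholders:

import subprocess

def login_users(users, password, server):
    # Log each generated user in once so an oauth token is created and the user
    # shows up in "oc get users".
    for user in users:
        subprocess.check_call(['oc', 'login', server, '-u', user, '-p', password,
                               '--insecure-skip-tls-verify=true'])
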
@hroyrh

hroyrh commented 23 days ago
@ofthecurerh @jeremyeder @timothysc @mffiedler I would like to hear your thoughts on whether we should use "oc login" and then create all the objects for that user, or instead use the "--token" parameter to assign each object to a user. Implementation-wise, the latter option will be easier as we won't need to change much. The reason I am considering the "oc login" option is that, in my opinion, it is closer to realistic customer setups, and so I thought it might be useful from the perspective of running a test. Please share your thoughts on which option we should go for, or whether we should go for both.
@jeremyeder

jeremyeder commented 23 days ago
Normally I do favor doing things exactly how the product is intended to be used by customers, to get the most coverage. But I think we should go with --token. I think it's close enough, and we're testing the login path as well as user density/objects this way as well.

@timothysc

timothysc commented 23 days ago
Just for completeness I would vote doing oc login unless that breaks too many conventions in the test.
@ofthecurerh

ofthecurerh commented 16 days ago
@hroyrh There are a few different things we are talking about here.

  • How do we create users?
  • Do we create project resources as a specific user or as a single superuser? (There should probably be an option to do it either way.)

Creating users can either be done via oc login or by using a user and rolebinding template via oc create. The oc login method has the benefit of rolling this process up into a single command (while also creating an API token). I prefer the oc create method because it allows us more flexibility and control over how the API objects are defined. This method also simplifies the cluster-loader code base by using a single function to create resources rather than having a specialized one for each kind of object.

My suggestion to use the --token option instead of --kubeconfig was aimed at simplifying the process of creating project resources as multiple users. Currently cluster-loader creates objects as the system:admin by copying the admin.kubeconfig file. We'd have to refactor that part to create new kubeconfigs via oc adm plus manage mapping users to their kubeconfig filepath. Additionally when creating a user via oc login IIRC it only creates an API token.

@jeremyeder I'm not advocating we use the product in any non-standard way. The product in this case is OpenShift, not the oc tool. IMO anything that returns a 200 response is valid and is how the product was intended to be used.

Networking tests loopback gets stuck trying to register pbench on nodes

From old repo:

vikaslaad commented 22 days ago
In the networking tests there is a bug: when loopback is run, we need to comment out 3 lines in the inventory file for the pbench registration task. Somehow that if-condition is not working when node names are not passed.
@ofthecurerh

ofthecurerh commented 16 days ago
@vikaslaad could you provide more detail on how you're using it?

There are a few known issues that @schituku and I have run into and that I'm working on a PR for. I'll turn these into actual issues to document what is being worked on.

cluster_loader: Add option to delete projects if they already exist.

A more common scenario than the need for the cleaning option (which we should consider removing altogether) is the case where the user runs cluster_loader multiple times against the same yaml file. The artifacts created by the previous run must be deleted manually (or cluster_loader must be wrapped).

Add a -x option to attempt project deletion if the project already exists. If it already exists and -x is not given, print an error message suggesting the option be used.
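
A minimal sketch of what the check could look like; the option plumbing is omitted and the names are illustrative:

import subprocess

def ensure_project(name, delete_existing=False):
    # delete_existing corresponds to the proposed -x flag.
    exists = subprocess.call(['oc', 'get', 'project', name]) == 0
    if exists and not delete_existing:
        raise SystemExit('Project %s already exists; re-run with -x to delete it first.' % name)
    if exists:
        # Note: project deletion is asynchronous, so a real implementation would
        # wait for the project to disappear before recreating it.
        subprocess.check_call(['oc', 'delete', 'project', name])
    subprocess.check_call(['oc', 'new-project', name])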

Keep processed template in memory instead of writing a temp file to disk

Copied from old repo

The current process for creating a template is to:

  1. Run "oc process -f template.json"
  2. Capture stdout and deserialize to a json object
  3. Serialize the json object and write it to a file on disk

I suggest we skip writing to disk and keep it in memory, but as the raw output instead of a deserialized json object.

An example of this, but not necessarily the actual implementation:
processed = subprocess.Popen(['oc', 'process', '-f', 'template.json'], stdout=subprocess.PIPE)
subprocess.Popen(['oc', 'create', '-f', '-'], stdin=processed.stdout, stdout=subprocess.PIPE)

cluster_loader: pod creation fails when quotas used

From old repo:

Pod creation with quotas is currently broken in cluster_loader using the default quota-default.json file.

Quota creation succeeds but pod creation fails:

Error from server: error when creating "/tmp/tmpSfS4m6": pods "hellopods0" is forbidden: Failed quota: default: must specify cpu,memory

Workaround: remove the quota statement from projects if you don't need it.
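
Beyond the workaround, the underlying fix is for the pod definition to carry cpu/memory values; a minimal sketch, assuming the pod spec is handled as a dict (the default values are illustrative):

def add_default_resources(pod, cpu='100m', memory='128Mi'):
    # Give every container explicit cpu/memory limits so the pod is admitted
    # under a quota that requires those values to be specified.
    for container in pod['spec']['containers']:
        container.setdefault('resources', {})['limits'] = {'cpu': cpu, 'memory': memory}
    return pod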

Single python library for oc/kubectl command invocation

Copied from old repo

Instead of using the cluster-loader for all oc|kubectl actions, there should be a single library to interface with those commands.

This would allow us the flexibility to script out whatever test scenario we need without being limited by the functionality of the cluster-loader.

The cluster-loader could make use of this library too.
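
A minimal sketch of what such a library's entry point could look like; the function name and namespace handling are illustrative:

import subprocess

def oc(*args, **kwargs):
    # Thin wrapper around the oc binary; test scripts (and cluster-loader itself)
    # would call this instead of shelling out on their own.
    cmd = ['oc'] + list(args)
    if kwargs.get('namespace'):
        cmd += ['-n', kwargs['namespace']]
    return subprocess.check_output(cmd)

Callers would then write, for example, oc('get', 'pods', namespace='default') instead of building subprocess calls by hand.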
