OpenShift Operations Workshops

This repository contains lab instructions and supporting content for a series of administrative-focused workshops centered around OpenShift.

The workshops included in this repo are:

  • Red Hat OpenShift Container Platform 4 for Admins RHDP
  • Modern App Dev Roadshow - Ops Track RHDP / More Info
  • Summit 2023 Hands on with OCP Plus Workshop RHDP

If you are a Red Hat employee with access to RHDP, we recommend deploying using the provided RHDP links above.

Requirements / Prerequisites

Doing these labs on your own requires a few things.

AWS

These labs are designed to run on top of an OpenShift 4 cluster that has been installed entirely by the new installer. You will need access to AWS with sufficient permissions and limits to deploy 3 master nodes, 4-6 regular nodes, and NVMe-equipped nodes for storage.

Check out the documentation for Installing on AWS.

OpenShift 4

At this time an OpenShift 4 cluster can be obtained by visiting https://try.openshift.com -- a free membership in the Red Hat Developer program is required.

Deploying the Lab Guide

Deploying the lab guide takes three steps. First, you will gather information about your cluster. Second, you will build a container image from this repository. Third, you will deploy the lab guide using the information you gathered, so that the proper URLs and references are automatically displayed in the guide.

Required Environment Variables

Most of the information can be found in the output of the installer.

Explanation and examples

  • API_URL - URL to access API of the cluster
    • https://api.cluster-gu1d.sandbox101.opentlc.com:6443
  • MASTER_URL - Master Console URL
    • https://console-openshift-console.apps.cluster-gu1d.sandbox101.opentlc.com
  • KUBEADMIN_PASSWORD - Password for kubeadmin
  • SSH_PASSWORD - password for ssh into bastion
  • ROUTE_SUBDOMAIN - Subdomain that apps will reside on
    • apps.cluster-gu1d.sandbox101.opentlc.com
    • apps.mycluster.company.com
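
If you no longer have the installer output handy, some of these values can be derived from one another: ROUTE_SUBDOMAIN is simply the console host minus its console-openshift-console. prefix. A minimal sketch using plain shell string manipulation (the example URL is the one shown above):

```shell
# Derive ROUTE_SUBDOMAIN from MASTER_URL (sketch; example values from above).
MASTER_URL=http://console-openshift-console.apps.cluster-gu1d.sandbox101.opentlc.com

host=${MASTER_URL#*://}                             # strip the scheme
ROUTE_SUBDOMAIN=${host#console-openshift-console.}  # strip the console route's host prefix

echo "$ROUTE_SUBDOMAIN"   # apps.cluster-gu1d.sandbox101.opentlc.com
```

If you already have a logged-in oc session, oc whoami --show-server and oc whoami --show-console print the API and console URLs respectively.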

Specific to Red Hat internal systems

  • GUID - GUID
    • gu1d
  • BASTION_FQDN - Bastion Domain Name
    • bastion.gu1d.sandbox101.opentlc.com

Create a file called workshop-settings.sh using the values from your environment. Here is an example:

โš ๏ธ For export ensure special characters are escaped (ie. use \! in place of !).

API_URL=https://api.openshift4.example.com:6443
MASTER_URL=https://console-openshift-console.apps.openshift4.example.com
KUBEADMIN_PASSWORD=IqJK7-o3hYR-ZTr6c-7sztN
SSH_USERNAME=lab-user
SSH_PASSWORD=apassword
BASTION_FQDN=foo.bar.com
GUID=XXX
ROUTE_SUBDOMAIN=apps.openshift4.example.com
HOME_PATH=/opt/app-root/src

Deploy the Lab Guide

Now that you have the workshop-settings.sh file with the various required variables, you can deploy the lab guide into your cluster.

First, clone the repo:

NOTE Remember to check out the branch you want to test against.

git clone https://github.com/openshiftdemos/openshift-ops-workshops

Next, build a container image using the repo/branch you checked out:

cd openshift-ops-workshops
export QUAY_USER=myusername
export BRANCH=$(git branch --show-current)
podman build -t quay.io/${QUAY_USER}/lab-sample-workshop:${BRANCH} .

Now, log in to quay.io (it's free to sign up) or another registry your cluster has access to:

podman login quay.io

Next, push the container image to your repository:

podman push quay.io/${QUAY_USER}/lab-sample-workshop:${BRANCH}

You will use this image to deploy the lab. The following command will log you in as kubeadmin on systems with the oc client installed:

oc login -u kubeadmin -p $KUBEADMIN_PASSWORD

oc new-project lab-ocp-cns

# This part is needed if you're running on a "local" or "self-provisioned" cluster
oc adm policy add-role-to-user admin kube:admin -n lab-ocp-cns

# Create deployment.
oc new-app -n lab-ocp-cns https://raw.githubusercontent.com/redhat-cop/agnosticd/development/ansible/roles/ocp4-workload-workshop-admin-storage/files/production-cluster-admin.json \
--param TERMINAL_IMAGE="quay.io/${QUAY_USER}/lab-sample-workshop:${BRANCH}" --param PROJECT_NAME="lab-ocp-cns" \
--param WORKSHOP_ENVVARS="$(cat ./workshop-settings.sh)"

# Wait for deployment to finish.

oc rollout status dc/dashboard -n lab-ocp-cns

If you made changes to the container image and want to refresh your deployed Homeroom quickly, execute this:

oc import-image -n lab-ocp-cns dashboard

Doing the Labs

Your lab guide should deploy in a few moments. To find its URL, execute:

oc get route dashboard -n lab-ocp-cns

You should be able to visit that URL and see the lab guide. From here you can follow the instructions in the lab guide.

Notes and Warnings

Remember, this experience is designed for a provisioning system internal to Red Hat. Your lab guide will be mostly accurate, but some details will be slightly off.

  • You aren't likely using lab-user
  • You will probably not need to actively use your GUID
  • You will see lots of output that references your GUID or other environment-specific details
  • Your MachineSets are different depending on the EC2 region you chose

But, generally, everything should work. Just don't be alarmed if something looks slightly different from the lab guide.

Also note that the first lab, where you SSH into the bastion host, is not relevant to you -- you are likely already doing the exercises on the host from which you installed OpenShift.

Troubleshooting

Make sure you are logged in as kubeadmin when creating the project.

If you get a "too many redirects" error, clear your cookies and log in again as kubeadmin. This usually happens if you are using RHPDS and have stopped and started a cluster.

Cleaning up

To delete the deployment, run:

oc delete all,serviceaccount,rolebinding,configmap -l app=admin -n lab-ocp-cns

License

This repository and everything within it are licensed under the GNU General Public License (GPL) v3.0

Contributors

ahsen-shah, aravindhp, ashtondavis, christianh814, cooktheryan, dlbewley, dmesser, dobbymoodge, ianpurdy, ikke-t, jalvarez-rh, jamesfalkner, jchraibi, jewzaam, jmferrer, jnewsome97, kaovilai, kmurudi, mdstjean, mfosterrox, mulbc, mwoodson, netzzer, paddy667, stencell, steven-ellis, stevenbarre, techjw, thoraxe, twiest

Issues

remove prompts in code blocks / examples

Instead of presenting the user with the prompt:

[cloud-user@{{MASTER_HOSTNAME}} ~]$ heketi-cli node list | grep ca777ae0285ef6d8cd7237c862bd591c

Please just have the command:

heketi-cli node list | grep ca777ae0285ef6d8cd7237c862bd591c

This makes copy/paste out of the lab guide much easier, especially for blocks of commands.

installation lab verification fails / checks for wrong node names

- name: Checking status of all the nodes to be 'Ready'
  command: oc get -o jsonpath='{.status.conditions[?(@.reason=="KubeletReady")].type}' node {{ item }}
  with_items:
    - "{{ groups.nodes }}"
  register: status_of_node
  failed_when: "'Ready' not in status_of_node.stdout"

@kmurudi we cannot rely on the hostnames from the ansible inventory. They are different to the hostnames used by OpenShift. I suggest we use openshift_facts.yml supplied with openshift-ansible to determine the hostname.

host names not displayed as needed

After installation is complete, the command # oc get nodes shows IP addresses as node names instead of the given hostnames/instance names.

ldap group sync automation fails

groupsync.yaml gets generated at the cloud-init stage but has the LDAP bindDN and baseDN, as well as the IdM URL, hard-coded.

It would be best to deploy it in generic form with placeholders and then replace them as part of a post-deploy playbook that runs at the cloud-init stage on the master. This is similarly done with the LDAP URLs in /etc/ansible/hosts and the inventory_ldap_auth.yml playbook.

Ansible playbook for OCP installation

For the user to be able to use ansible-playbook and access the config.yml playbook from the openshift-ansible git repository, it should be present in the master host. The inventory file is present but the ansible playbook needed to run the advanced installation is not.

Also, two instances of node01 are present in the list of EC2 instances.

Introduce a WaitCondition handle

We should introduce a WaitConditionHandle in the CFN template to signal CREATE_COMPLETE only when all resources are provisioned and stood up, which includes:

  • all nodes are online and reachable via SSH
  • IdM's LDAP service is reachable on port 389
  • IdM's setup routine has produced ca.crt
  • lab guide URL is reachable

cns-management_automation.yml failure

Nodes node04-node06 are not added to the cluster, so the play applying labels fails.

TASK [label storage nodes] ****************************************************************************************************************************************************************************************
Wednesday 09 August 2017  20:29:09 +0000 (0:00:00.683)       0:03:28.568 ****** 
failed: [localhost] (item=node04.internal.aws.testdrive.openshift.com) => {
    "changed": true, 
    "cmd": "oc label node/node04.internal.aws.testdrive.openshift.com storagenode=glusterfs", 
    "delta": "0:00:00.221782", 
    "end": "2017-08-09 20:29:09.802963", 
    "failed": true, 
    "item": "node04.internal.aws.testdrive.openshift.com", 
    "rc": 1, 
    "start": "2017-08-09 20:29:09.581181"
}

STDERR:

Error from server (NotFound): nodes "node04.internal.aws.testdrive.openshift.com" not found

Change to using a bind user instead of the admin; create a bind user

The following ldif file should be dropped onto the IDM server during environment provisioning:

dn: uid=system,cn=sysaccounts,cn=etc,dc=auth,dc=internal,dc=aws,dc=testdrive,dc=openshift,dc=com
changetype: add
objectclass: account
objectclass: simplesecurityobject
uid: system
userPassword: bindingpassword

Just after provisioning IDM, and before doing any of the user creation, we should execute the following command:

ldapmodify -x -D 'cn=Directory Manager' -w ldapadmin -f /path/to/sysaccount.ldif

Then, we need to change the /etc/ansible/hosts file that gets deployed to use the above DN and password in place of the existing one.

This will at least prevent us from being locked out entirely when the "too many failed logins" error occurs.

We also need to update /home/cloud-user/groupsync.yaml as well, as it appears to contain auth information, too.

default hosts template should have ldap for auth

If we don't do the installation with LDAP for auth, it means the admin has to re-run the installer (albeit with -t master) later. This is a little bit... awkward.

I am thinking we may wish to install with LDAP auth out of the box.

  • it won't affect the system:admin special user
  • verification of installation can then include a simple "oc login" as a user
  • module 2 can become LDAP setup and group manipulation
  • module 3 can become CNS installation and configuration

Thoughts?

@cooktheryan
@dmesser

ldap group sync validation fails

@kmurudi

TASK [Checking if all the groups have been created by 'oc adm groups sync'] *********************************************
Tuesday 25 July 2017  11:13:28 -0400 (0:00:00.547)       0:00:00.637 **********
failed: [master.unset.ocp-admin.aws.openshifttestdrive.com] (item=ose-users) => {
    "changed": true,
    "cmd": [
        "oc",
        "get",
        "group",
        "ose-users"
    ],
    "delta": "0:00:00.201054",
    "end": "2017-07-25 11:13:29.162421",
    "failed": true,
    "item": "ose-users",
    "rc": 1,
    "start": "2017-07-25 11:13:28.961367"
}

STDERR:

Error from server (NotFound): groups "ose-users" not found

Happens after successful execution of ldap_automation.yml

Add master public IP address into /etc/sysconfig/workshopper

Currently we have:

                MASTER_EXTERNAL_FQDN="master.${AWS::AccountId}.${PublicHostedZone}"
                MASTER_INTERNAL_FQDN="master.internal.${PublicHostedZone}"

We probably should also add the master public IP address, since that's what we are going to tell people to SSH into. This also means that, once they find the lab guide, they don't have to worry about going back to the Qwiklab interface to find the IP address.
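
A sketch of what that could look like next to the FQDN variables above (the instance's logical ID MasterInstance and the variable name are assumptions, not taken from the actual template; AWS::EC2::Instance does expose a PublicIp attribute for substitution):

```
                MASTER_PUBLIC_IP="${MasterInstance.PublicIp}"
```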

LDAP auth fails in OpenShift: Unwilling to perform: too many failed logins.

In a fresh, successfully finished deployment I cannot login as an IdM user. It yields "Internal error occurred: unexpected response: 500". In the system logs I can see: "logging error output: "Error: LDAP Result Code 53 Unwilling To Perform: Too many failed logins.".
This behavior is not consistently reproducible but appears every 10-20 deployments. Thoughts?

move all littered content to support folder

We have content littered in several locations. If a script, file, etc. needs to be used during the exercises, it should go into the support folder in this repo. For generated files that will be used (like the groupsync config), the write_files section of cloud-init should be relocated so that this repo is cloned first, and then the files are written out.

Environment specific tests - how to?

When doing lab automation and verification we need access to environment-specific variables: e.g. the device name of the CNS bricks, the default routing suffix for OCP, or the name of the project that we create for CNS. Some of this info could be hard-coded in the lab guide and the automation, but we may want to externalize it for easy updates later.

For the lab guide we are writing this info into /etc/sysconfig/workshopper on the guide node. But tests will likely need to run from the master node and use /etc/ansible/hosts as a second source of info.

What is the best way to get environment-specific information?
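
One low-friction option (a sketch, not an agreed approach): have tests source the same file the lab guide uses, with an override so they can run on hosts where that file lives elsewhere. WORKSHOPPER_ENV is a hypothetical variable name:

```shell
# Sketch: load environment-specific values from the workshopper config file.
# WORKSHOPPER_ENV is a hypothetical override so tests can point elsewhere.
WORKSHOPPER_ENV=${WORKSHOPPER_ENV:-/etc/sysconfig/workshopper}

set -a                    # auto-export everything the file defines
. "$WORKSHOPPER_ENV"
set +a

echo "master: ${MASTER_EXTERNAL_FQDN:-unset}"
```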

app management should come before cns

Since deploying CNS requires a basic understanding of some OpenShift components (services, pods, routes, etc) it makes sense to have the app management lab come before the cns-deploy lab.

create_failed on cloud formation

The classes 'OpenShift Test Drive for Administrators' and 'ocp-admin-testdrive-master-branch' are not being built successfully. When opening the AWS console, the CloudFormation service shows a 'create_failed' error.
