
otp-gitops's Introduction



One Touch Provisioning (OTP) is a pattern that enables the seamless end-to-end provisioning of Red Hat OpenShift clusters, their applications, governance and policies to Public, Private, On-Premises and both Near and Far Edge Clouds, all via Code.

About the Project

One Touch Provisioning

This method/pattern is our opinionated implementation of the GitOps principles, using the latest and greatest tooling available, to enable the hitting of one big red button (figuratively) and provision a platform that delivers Cluster and Virtual Machine provisioning capabilities, Governance and Policy management, observability of Clusters and workloads, and deployment of applications such as IBM Cloud Paks, all from a single command*.

The pattern leans very heavily on technologies such as Red Hat Advanced Cluster Management (RHACM) and OpenShift GitOps (ArgoCD). OpenShift clusters, their applications, governance and policies are defined in Git repositories, and by leveraging RHACM as a function of OpenShift GitOps, the pattern enables the seamless end-to-end provisioning of those environments.

  • Codified, Repeatable and Auditable.

(back to top)

Introduction to Concepts and Technologies leveraged

Before getting started with this pattern, it's important to understand some of the concepts and technologies used. This will help reduce the barrier to entry when adopting the pattern and explain why certain design decisions were made.

(back to top)

Getting Started

Prerequisites

Red Hat OpenShift Cluster

  • Minimum OpenShift v4.10+ is required.

Deploy a "vanilla" Red Hat OpenShift cluster using one of the methods below:

CLI tools

OpenShift CLI
  • Install the OpenShift oc CLI (version 4.10+). The binary can be downloaded from the Help menu in the OpenShift cluster console, or from the OpenShift Mirror website.
Helm CLI
  • Install helm CLI from brew.sh

     brew install helm
KubeSeal CLI (Optional)
  • If you intend to use the SealedSecrets Operator, it's recommended to install the kubeseal CLI from brew.sh

    brew install kubeseal
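
As a minimal sketch of the kubeseal workflow (the secret name, namespace and output file are illustrative), kubeseal encrypts a standard Kubernetes Secret into a SealedSecret that is safe to commit to Git. It needs to reach the SealedSecrets controller in the cluster (or be given its certificate via --cert) to fetch the public key.

    # Render a Secret locally and encrypt it into a SealedSecret manifest
    oc create secret generic my-secret -n my-namespace \
      --from-literal=password='changeme' --dry-run=client -o yaml | \
      kubeseal --format yaml > my-sealedsecret.yaml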

Repository considerations

There are two repository patterns to consider when leveraging GitOps: Monorepo or Polyrepo. For OTP, we have leveraged a Polyrepo structure, which consists of six git repositories within the GitOps workflow. You can learn more on why we chose a Polyrepo here.

  • RHACM Hub GitOps repository

    • This repository contains all the ArgoCD Applications for the infrastructure, services, policies, clusters and application layers. Each ArgoCD Application references a specific resource that will be deployed to the RHACM Hub Cluster or, depending on your chosen configuration, to Spoke Clusters as well. An illustrative Application manifest is shown after this list.
  • Infrastructure GitOps repository

    • This repository is what we've termed a "common repository": it is used across both the RHACM Hub cluster and any Spoke clusters you deploy. It contains the YAMLs for cluster-wide and/or infrastructure-related resources managed by a cluster administrator, such as namespaces, clusterroles, clusterrolebindings and machinesets. By leveraging "common repositories", you can reduce duplication of code and ensure a consistent set of resources is applied each time.
  • Services GitOps repository

    • This repository is also a "common repository", used across both the RHACM Hub cluster and any Spoke clusters you deploy. It contains the YAMLs for resources used by the RHACM Hub and Spoke clusters, including subscriptions for Operators, custom resources, and Helm Charts for third-party tools. These resources would usually be managed by the Administrator(s) and/or a DevOps team supporting application developers.
  • Policies GitOps repository

    • Another "common repository". It contains the YAMLs for the Policies deployed to both the RHACM Hub and Spoke clusters. These resources would usually be managed by the GRC and/or Security teams.
  • Clusters GitOps repository

    • Another "common repository". It contains the YAMLs for resources that deploy OpenShift Clusters. These resources would usually be managed by the Platform Administrator(s) and/or an Ops team supporting the Cloud Platforms.
  • Apps GitOps repository

    • Another "common repository". It contains the YAMLs for resources that deploy applications to the RHACM Hub or Spoke OpenShift Clusters.
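
To illustrate how these repositories are consumed, below is a minimal sketch of an ArgoCD Application pointing at a "common repository". The repository URL, path and project are placeholders for illustration, not the exact values used by OTP.

apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: infra-namespaces                # illustrative name
  namespace: openshift-gitops
spec:
  project: default                      # OTP may define its own ArgoCD projects
  source:
    repoURL: https://github.com/<GIT_ORG>/otp-gitops-infra.git   # placeholder repo URL
    targetRevision: master
    path: namespaces                    # placeholder path within the common repository
  destination:
    server: https://kubernetes.default.svc
  syncPolicy:
    automated:
      prune: true
      selfHeal: true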

Setup of git repositories

  1. Create a new GitHub Organization using instructions from this GitHub documentation.

  2. Create the repositories within your new GitHub Organization and clone them locally.

    GIT_ORG=<new-git-organization> OUTPUT_DIR=<gitops-repos> ./scripts/create-repos.sh
  3. (Optional) Many users may wish to use private Git repositories on GitHub to store their manifests, rather than leaving them publicly readable. The steps for setting up OpenShift GitOps for private repositories can be found here.

  4. Update the default Git URL and branch references in your otp-gitops repository by running the provided ./scripts/set-git-source.sh script.

    GIT_ORG=<GIT_ORG> GIT_BRANCH=master ./scripts/set-git-source.sh
    • You can undo the changes made above by running ./scripts/unset-git-source.sh.

IBM Entitlement Key for IBM Cloud Paks

If you intend to deploy the Infrastructure Automation component of IBM Cloud Pak for Watson AIOps, then please follow the instructions here.

(back to top)

Installation

Install and configure OpenShift GitOps

  1. Install the OpenShift GitOps Operator and create a ClusterRole and ClusterRoleBinding.

    oc apply -k setup/argocd-operator
    # Wait until the ArgoCD Application CRD has been registered
    while ! oc wait crd applications.argoproj.io --timeout=-1s --for=condition=Established 2>/dev/null; do sleep 30; done
    # Wait until all operator pods (excluding Jobs) are Ready
    while ! oc wait pod --timeout=-1s --for=condition=Ready -l '!job-name' -n openshift-gitops > /dev/null; do sleep 30; done
  2. Create a custom ArgoCD instance with custom health checks. To customise which health checks are used, comment out those you don't need in setup/argocd-instance/kustomization.yaml.

    oc apply -k setup/argocd-instance
    while ! oc wait pod --timeout=-1s --for=condition=ContainersReady -l app.kubernetes.io/name=openshift-gitops-otp-server -n openshift-gitops > /dev/null; do sleep 30; done
  3. (Optional) If using IBM Cloud ROKS as a RHACM Hub Cluster, then you will need to configure TLS.

    ./scripts/patch-argocd-tls.sh

Configure Storage and Infrastructure nodes

On AWS, Azure, GCP, vSphere and Baremetal you can run the following script to configure the machinesets, infra nodes and storage definitions for the Cloud you are using for the RHACM Hub Cluster.

This will deploy additional nodes to support OpenShift Data Foundation (ODF) for Persistent Storage, as well as additional nodes to support Infrastructure (aka infra) components, such as RHACM, Quay, Ingress Controllers, the OpenShift Internal Registry and ACS.

This is a design choice to reduce OpenShift licensing requirements, as running these components on Infrastructure nodes does not consume a subscription.

When running on Baremetal, the script will utilise Local Storage for deploying ODF and will autoselect all worker nodes. It will not deploy additional nodes for storage or Infra; this will be improved upon in later versions.

./scripts/infra-mod.sh

IBM Cloud - ROKS

If you are running a managed OpenShift cluster on IBM Cloud, you can deploy OpenShift Data Foundation as an add-on. You will also need to label some of your worker nodes as Infra nodes, otherwise RHACM will fail to deploy.

Attach the following label to the worker nodes you intend to use as Infra nodes.

node-role.kubernetes.io/infra: ''
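
For example, the label can be applied with oc (substitute your own node names):

oc label node <worker-node-name> node-role.kubernetes.io/infra=''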

Install a Local Hashicorp Vault Instance (Optional)

OTP works best when connected to a Secret Store such as Hashicorp Vault. If you already have a pre-existing Vault-like instance available, for example IBM Secrets Manager, you can skip this step and move on to installing the External Secrets Operator. If you'd like to install a local Hashicorp Vault instance into the Hub Cluster, follow the steps below.

oc apply -k setup/hashicorp-vault-chart

Install External Secrets Operator

  1. Install the External Secrets Operator to enable OTP to connect to either a pre-existing Vault-like instance or to the Local Hashicorp Vault instance deployed in the previous step.

    oc apply -k setup/external-secrets-operator
  2. Apply the API Key as a secret that will allow OTP to connect to your Vault-like instance via the External Secrets Operator.

    oc create secret generic ibm-secret --from-literal=apiKey='<APIKEY>' -n kube-system
  3. Configure the ClusterSecretStore with the API Key secret and URL of your Vault-like instance.

    apiVersion: external-secrets.io/v1beta1
    kind: ClusterSecretStore
    metadata:
      name: cluster
      namespace: external-secrets
    spec:
      provider:
        ibm:
          auth:
            secretRef:
              secretApiKeySecretRef:
                name: ibm-secret
                namespace: kube-system
                key: apiKey
          serviceUrl: >-
            https://3f5f4d5b-6179-4d7c-a7a2-72dc28eb4a81.au-syd.secrets-manager.appdomain.cloud
  4. Apply the updated ClusterSecretStore.

    oc apply -f setup/external-secrets-instance/cluster-secret-store.yaml
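
With the ClusterSecretStore in place, individual secrets are synced into the cluster via ExternalSecret resources. Below is a minimal sketch; the names and the remote key are illustrative and depend on what is stored in your Vault-like instance.

apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: cloud-api-key            # illustrative name
  namespace: kube-system
spec:
  refreshInterval: 1h
  secretStoreRef:
    name: cluster                # the ClusterSecretStore defined above
    kind: ClusterSecretStore
  target:
    name: cloud-api-key          # Kubernetes Secret that will be created
  data:
    - secretKey: apiKey
      remoteRef:
        key: <remote-secret-id>  # ID/key of the secret within your Vault-like instance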

Bootstrap the OpenShift RHACM Hub cluster

The bootstrap YAML follows the app of apps pattern.
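
In other words, bootstrap.yaml is a single "parent" ArgoCD Application whose source path contains further Application manifests, so syncing it recursively deploys every layer. A minimal sketch of that shape is below; the names and paths are indicative of the repository layout rather than guaranteed verbatim.

apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: bootstrap
  namespace: openshift-gitops
spec:
  project: default
  source:
    repoURL: https://github.com/<GIT_ORG>/otp-gitops.git   # your RHACM Hub GitOps repository
    targetRevision: master
    path: 0-bootstrap/hub        # directory holding the layer Applications
  destination:
    server: https://kubernetes.default.svc
    namespace: openshift-gitops
  syncPolicy:
    automated: {}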

  1. Retrieve the ArgoCD/GitOps URL and admin password and log into the UI

    oc get route -n openshift-gitops openshift-gitops-otp-server -o template --template='https://{{.spec.host}}'
    
    # Password is not required if using OpenShift as an authorisation provider
    oc extract secrets/openshift-gitops-otp-cluster --keys=admin.password -n openshift-gitops --to=-
  2. The resources required to be deployed for this pattern have been pre-selected. However, you can review and modify the resources deployed by editing the following files.

    0-bootstrap/hub/1-infra/kustomization.yaml
    0-bootstrap/hub/2-services/kustomization.yaml
    0-bootstrap/hub/3-policies/kustomization.yaml
    0-bootstrap/hub/4-clusters/kustomization.yaml
    0-bootstrap/hub/5-apps/kustomization.yaml

Any changes to the kustomization files made before the initial bootstrap must be committed back to your Git repository, otherwise they will not be picked up by OpenShift GitOps.
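
For example (assuming the master branch set earlier via set-git-source.sh):

git add 0-bootstrap/hub
git commit -m "Select resources for bootstrap"
git push origin master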

  1. Deploy the OpenShift GitOps Bootstrap Application.

    oc apply -f 0-bootstrap/hub/bootstrap.yaml
  2. ArgoCD Sync waves are used to manage the order of manifest deployment. This is required because some objects have parent-child relationships and must exist within the RHACM Hub before their children can be successfully deployed. We have seen occasions where applying the Infrastructure, Services and Policies layers at the same time fails; this typically occurs when there are issues provisioning the additional nodes that support the Storage and Infrastructure components. YMMV.

Once the Infrastructure, Services and Policies layers have been deployed, update the 0-bootstrap/hub/kustomization.yaml manifest to enable the Clusters and Apps layers and commit it to Git. OpenShift GitOps will then automatically deploy any resources listed within those Kustomize files.

resources:
- 1-infra/1-infra.yaml
- 2-services/2-services.yaml
- 3-policies/3-policies.yaml
## Uncomment once the above layers have completed.
# - 4-clusters/4-clusters.yaml
# - 5-apps/5-apps.yaml

Installation is successful once all ArgoCD Applications are fully synced without errors.

You will be able to access the RHACM Hub console via the OpenShift console.

(back to top)

Usage

Deploying and Destroying Managed (aka Spoke) OpenShift Clusters via OpenShift GitOps

This pattern treats Managed (aka Spoke) Clusters as OpenShift GitOps Applications. This allows us to Create, Destroy, Hibernate and Import Managed Clusters into Red Hat Advanced Cluster Management via OpenShift GitOps.

Creating and Destroying Managed OpenShift Clusters

  • ClusterPools and ClusterClaims

    • We've now simplified the life-cycling of OpenShift Clusters on AWS, Google Cloud and Azure via the use of ClusterPools and ClusterClaims.

    • ClusterPools allow you to pre-set a common cluster configuration; RHACM takes that configuration and applies it to each cluster it deploys from the pool. For example, a Production Cluster may need to consume specific Compute resources, exist in a multi-zone configuration and run a particular version of OpenShift, and RHACM will deploy a cluster that meets those requirements.

    • Once a Cluster Pool has been created, you can submit ClusterClaims to deploy clusters from that pool, as sketched below.
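
      A minimal ClusterClaim sketch, assuming the aws-cicd-pool ClusterPool enabled in the Clusters layer example below (the claim name matches the project-simple example):

      apiVersion: hive.openshift.io/v1
      kind: ClusterClaim
      metadata:
        name: project-simple
        namespace: aws-cicd-pool     # must be the namespace the ClusterPool lives in
      spec:
        clusterPoolName: aws-cicd-pool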

  • ClusterDeployment

    • The ClusterDeployment method can be used to deploy AWS, Azure, GCP, VMWare, On-Premise, Edge and IBM Cloud OpenShift Clusters.

    • Review the Clusters layer kustomization.yaml to enable/disable the Clusters that will be deployed via OpenShift GitOps.

      resources:
      ## ClusterPools
      ## Example: - argocd/clusterpools/<env>/<cloud>/<clusterpoolname>/<clusterpoolname.yaml>
      - argocd/clusterpools/cicd/aws/aws-cicd-pool/aws-cicd-pool.yaml
      
      ## ClusterClaims
      ## Example : - argocd/clusterclaims/<env>/<cloud>/<clusterclaimname.yaml>
      - argocd/clusterclaims/dev/aws/project-simple.yaml
      
      ## ClusterDeployments
      ## Example : - argocd/<env>/<cloud>/<clustername>/<clustername.yaml>
      - argocd/clusters/prod/aws/aws-prod/aws-prod.yaml
      - argocd/clusters/prod/azure/azure-prod/azure-prod.yaml 
    • We have provided examples for deploying new clusters into AWS, Azure, IBM Cloud and VMWare. Cluster Deployments require your Cloud Provider API Keys so that RHACM can connect to your Cloud Provider and deploy an OpenShift cluster via Terraform. We make use of an external keystore, e.g. Vault, and leverage the External Secrets Operator to pull in the Cloud Provider API keys automatically. This simplifies the creation of new clusters, reduces the values needed and works better at scale. The deployments for the clusters are stored within the Clusters repository, under clusters/deploy/external-secrets/<cloud provider>.

    • Originally the pattern utilised the SealedSecrets Controller to encrypt your API Keys and provided a handy script for each Cloud Provider within the Clusters repository, under clusters/deploy/sealed-secrets/<cloud provider>. This was an acceptable method for 1-5 cluster deployments, but became very cumbersome at scale and was at risk of error and misconfiguration. We will no longer be iterating on the code for cluster deployment via SealedSecrets and will eventually remove it altogether.

The pattern provides full end-to-end deployment of not only Clusters, but also Policies, Governance and Applications.

For more usage examples, please refer to the Documentation

(back to top)

Roadmap

  • OTP cli
  • Ansible Automation integration with Libvirt and VMWare
  • HyperShift Integration
    • HyperShift with OpenShift Virtualisation for Worker nodes
  • Deployment of IBM Cloud Satellite for IBM Managed OpenShift platform within chosen environment.

See the open issues for a full list of proposed features (and known issues).

(back to top)

Contributing

Contributions are what make the open source community such an amazing place to learn, inspire, and create. Any contributions you make are greatly appreciated.

If you have a suggestion that would make this better, please fork the repo and create a pull request. You can also simply open an issue with the tag "enhancement". Don't forget to give the project a star! Thanks again!

  1. Fork the Project
  2. Create your Feature Branch (git checkout -b feature/AmazingFeature)
  3. Commit your Changes (git commit -m 'Add some AmazingFeature')
  4. Push to the Branch (git push origin feature/AmazingFeature)
  5. Open a Pull Request

(back to top)

License

Distributed under the APACHE 2.0 License. See LICENSE for more information.

(back to top)

Contact

Project Link: https://github.com/one-touch-provisioning/otp-gitops

(back to top)

Acknowledgments

This asset has been built on the shoulders of giants and leverages the great work and efforts undertaken by the teams and individuals below. Without those efforts, this pattern would have struggled to get off the ground.

The reference architecture for this GitOps workflow can be found here.

(back to top)

otp-gitops's People

Contributors

benswinney, langley-2, nickmerrett, rampadc, yangliu138, yliu138repo


otp-gitops's Issues

Add Red Hat Quay Registry

Red Hat Quay is a part of the OpenShift Plus offering from Red Hat.

Currently, the asset deploys OpenShift and Red Hat Advanced Cluster Management, two of the four components of OpenShift Plus.

An effort should be made to close out this gap and deploy Quay as part of the bootstrap process.

Refactor Cluster Creation method to include common configuration

Currently, the Cluster Creation method uses separate code for each Cloud.

A common set of base configuration should be utilised, with the individual Cloud components built on top of that common base.

This will help reduce repeated code and allow all clusters to be built from a common set of configuration files.

RHACM Post Install Job fails

Upon deployment of the Red Hat Advanced Cluster Management instance, there is a Post Install Job which adds console links for RHACM to the OpenShift Console.

This job fails with no error logs available, and the console links for RHACM are not added.


Question about Turbonomic

Hi All
I'm Eric Giguere, a CSM based in Quebec / Canada.

I'm currently contributing with the Asset and Architecture team in the GitOps project (Production guides). I've just found your project last week.
I was planning to start a GitOps with ArgoCD recipe to install Turbonomic on an OCP cluster. I've seen in your documentation that your project does or will support this.
Is there any work done yet on the subject?

Thanks!

Improvements to MCM workflow

Working with MCM is challenging; everything is ID based and each cluster has unique IDs.

When creating cloud connections;

investigate whether the provider IDs can be determined at runtime, rather than forcing the user to find them prior to use

When creating deployments/vms;

investigate whether the cloud connection IDs can be determined at runtime, rather than forcing the user to find them prior to use
investigate whether the template IDs can be determined at runtime, rather than forcing the user to find them prior to use

Enable ODF Console via Post Sync job

When installing the ODF Operator, the ODF Console is disabled by default.

To enable it, an Admin must manually enable the console either via the UI or by patching the console operator.
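
A hedged example of the manual patch (assuming the plugin is named odf-console, as in recent ODF releases; note this overwrites any existing spec.plugins list):

oc patch console.operator cluster --type json -p '[{"op": "add", "path": "/spec/plugins", "value": ["odf-console"]}]'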

A post-sync job should be created so that, once ODF is deployed, ArgoCD enables the console automatically.

Advanced Cluster Security and Ansible Automation Platform binding to incorrect PVCs

During bootstrapping of the Hub Cluster, PVCs for Advanced Cluster Security and Ansible Automation Platform will bind using storageclasses other than the OpenShift Data Foundation defaults.


This happens because OpenShift Data Foundation has not fully deployed.

An approach could be to specify the OpenShift Data Foundation storageclasses within the Instance deployments, or to adjust sync-waves and include a pre-sync job that checks the storageclasses exist before deployment.

Install RHACM Hub Cluster to Infrastructure Nodes

The asset currently deploys 3 worker nodes which are labelled for Infrastructure workloads. An OpenShift Container Platform cluster can be configured to contain infrastructure nodes for running approved management components. Running components on infrastructure nodes avoids allocating OpenShift Container Platform subscription quota for the nodes that are running those management components.

The asset should be updated so that the RHACM hub cluster components are deployed to Infrastructure nodes to reduce allocating unnecessary subscriptions.

Better status checking by ArgoCD on cluster creation via RHACM

Improve how ArgoCD checks for the successful completion of a cluster deployment via RHACM.

Currently, ArgoCD does not check the status of the Cluster once it begins its cluster deployment.

A post-sync method should be created to check the success or failure of the deployment.

Leverage ACM Policies to automatically bootstrap Managed Clusters

Currently, to bootstrap a managed cluster, we first use an ACM policy to deploy ArgoCD into the managed Cluster; we then create an ACM Application that points to an ArgoCD bootstrap.yaml within the Spoke Clusters folder, located within the otp-gitops repository.

A more elegant solution would be to utilise a Policy instead to perform this action.

Backup and Restore of Hub Cluster Services

The RHACM Hub cluster is the single source of truth across the pattern. It holds the state of the GitOps implementations and the currently deployed/managed clusters and applications. This creates a single point of failure (SPOF), which is somewhat mitigated by the multi-zone deployment of the OpenShift cluster that the Hub resides on. Should the Cloud Provider running the Hub cluster suffer a regional outage, the Hub cluster will no longer be accessible.

As this pattern is to provide multi-cloud operations, we can look to explore either a Hub of Hubs configuration (quite immature at this point in time) or we can explore the Cluster Backup Operator, which aims to provide a mechanism to Backup and Restore a Hub Cluster to another OpenShift environment.

https://github.com/stolostron/cluster-backup-operator

The proposal is to leverage the Cluster Backup Operator to back up to an ODF S3 endpoint, then leverage VolSync to synchronize the S3 endpoint to another location, either on-premise or in another Public Cloud, and restore the Hub Cluster services there in the event of a failure.

Integrate ACS capabilities

Red Hat Advanced Cluster Security is a part of the OpenShift Plus offering from Red Hat.

Currently, the asset deploys OpenShift and Red Hat Advanced Cluster Management, two of the four components of OpenShift Plus.

An effort should be made to close out this gap and deploy ACS as part of the bootstrap process.

Bootstrap helper CLI

Develop an otp CLI to set up hub clusters, handle configuration and bootstrap the cluster.

Usage 1: Interactive prompting

$ otp setup hub-cluster
Setup hub cluster...

> Create new repositories from templates? (Yn) Y
> What's your IBM Entitlement Key? ey...

[more text and logging here]

Usage 2: Command-line

$ otp setup hub-cluster --ibm-entitlement-key ey... --revision example --org mcm...

The CLI should save configurations into a config file in the folder where the otp command is run, so they can be checked into a repo, with the exception of any secrets.

RHACM Pre-Sync Job to create pull_secret for MultiClusterHub

To enable RHACM to import non-OpenShift clusters (e.g. AKS, EKS, GKE), a pull_secret is required to allow access to the entitled image registry.

This can be gathered from the OpenShift Cluster via the openshift-config namespace. This should be done before the MultiClusterHub instance is created, with a secret created within the open-cluster-management namespace; a sketch of the manual steps is shown below.
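
A sketch of the manual steps such a job would automate, using standard oc commands (the target secret name is illustrative):

# Extract the cluster pull secret to a local .dockerconfigjson file
oc extract secret/pull-secret -n openshift-config --keys=.dockerconfigjson --to=.
# Recreate it as a docker-config secret in the RHACM namespace
oc create secret generic <pull-secret-name> \
  --from-file=.dockerconfigjson=.dockerconfigjson \
  --type=kubernetes.io/dockerconfigjson -n open-cluster-management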

A Pre-Sync ArgoCD job should be created to handle this automatically.
