noironetworks / acc-provision

Tool for provisioning Cisco ACI APIC to integrate with Container Orchestration systems, and generating CNI Plugin Containers' deployment configuration.

License: Apache License 2.0

Languages: Makefile 0.15%, Shell 0.16%, Python 99.68%

acc-provision's People

Contributors

abhijitherekar, abhis2112, akhilamohanan, amittbose, anmol372, apoorva11029, bhavanaashok33, ceridwen, fwardzic, gaurav-dalvi, gautvenk, jayaramsatya, jeffinkottaram, jojimt, kishorejonnala, lavanyavaddavalli, mandeepdhami, mchalla, pariyaashok, pkharat, readams, rupeshsahuoc, shastrinator, siva-muni, smshareef, snaiksat, tanyatukade, tomflynn, vlella, yogeshrajmane

acc-provision's Issues

Provisioning error when using Pre-Existing Tenant with shared L3Out

Hello,

We are currently deploying 20+ OpenShift clusters in a small fabric environment (single site, single pod), but our environment consists of multiple tenants, each with multiple VRFs.

ACI 6.03(e)
Openshift 4.14

We have now run into an issue where acc-provision cannot deploy the ACI resources as planned.

acc-provision configuration:

  • Usage of pre-existing tenant: TENANT_A
  • Tenant/VRF: TENANT_A / VRF_A
  • L3Out in common tenant (shared for all OCP clusters)

With this configuration, acc-provision runs into an error because it tries to find the L3Out in TENANT_A, but the L3Out is actually in the common tenant.
If we change the Tenant/VRF configuration to the common tenant, acc-provision runs fine, but then the cluster BDs/EPGs also end up in the common tenant.

I have already taken a look at the script, but I can't figure out whether changing it would be enough, since some settings also go into the manifests for OCP.
We also tried to move the BDs manually to TENANT_A, but when we do this the ACI controller pod runs into a panic error.

Cheers

Christian

Add support for the following config options for opflex agent (with defaults shown)

"opflex": {
//  // This section controls how persistant config is treated
//  // during opflex startup
    "startup": {
//      // Location of the pol.json that will be read and used on startup
//      // will be provided by the orchestrator, needs to be on a persistent volume
        "policy-file": "<location of the policy file>",
//      // How long to use the above file after agent connects to the leaf
//      // default is 0 meaning as soon as agent connects to leaf we will
//      // stop using the local policy for future resolves
        "policy-duration": 60,
//      // Wait till opflex connects to leaf before using the local policy
//      // default 0, use in combination with policy-duration > 0
//      // This is useful if you want to preserve the old flows until
//      // the leaf connection succeeds and not overwrite them before we
//      // connect to the leaf using the local policy
//      // A related knob timers.switch-sync-delay controls after connection
//      // how much more longer to freeze the flow tables
//      "resolve-aft-conn": false
//  },
   "timers": {
       // Initial switch sync delay
       // How long to wait from PlatformConfig resolution
       // to start the switch sync, default 5 seconds
       "switch-sync-delay":  5,
       // Subsequent switch sync delay
       // In case we have subsequent resolutions pending
       // whats the minimum time any resolution can be
       // pending before we retry switch sync
       // default 0, no further wait
       // If this value is > 0, we keep checking if
       // every pending MO waited at least this long
       // before retrying switch sync
       // Max retries will be 5 so as to not wait
       // forever
       "switch-sync-dynamic": 10

Service graph contract created with scope vrf

I find it is not easy to understand the intended configuration for an internal EPG in a separate tenant to communicate with a Kubernetes service IP (load balancer IP) defined in an L3Out.

For example, Kubernetes is running an app deployment of 3 containers configured with a LoadBalancer service. The acc-provision tool has set up ACI, and the acc-provision output file has been applied via kubectl apply. The L3Out/EPGs etc. are all in the common tenant. When the deployment is created for the app, the service graphs/contracts are all set up correctly, and as expected the service (container app) can be successfully accessed from outside the fabric via the service host address. All good.

What is not clear is the intent in relation to how an internal EPG (i.e. tn-ATEN/ap-AAP/epg-AEPG) not in the same tenant or vrf as the deployed L3Out/VRF/etc (common as above) should communicate with the application via the service IP. Most of the objects created by the ACI containers on the APIC are 'managed' and should any configuration of these objects change, the ACI containers will change the configuration back to the original state. So for example, if I change the subnet settings for the host route in the L3Out EPG to allow the advertisement of the service host IP (enable Shared Security Import Subnet) to another VRF, the change is immediately reversed by the ACI containers. This prevents the route leaking of the host IP. In the same way, if I change the contract scope for the K8 app service (the contract created by the ACI Containers when the K8 app was deployed) from VRF to Global to be able to use this contract in another tenant, the modification is reversed immediately.

kubectl get services -o wide -A
NAMESPACE   NAME    TYPE           CLUSTER-IP     EXTERNAL-IP   PORT(S)          AGE   SELECTOR
...
multicast   mcast   LoadBalancer   172.22.73.15   172.25.0.2    5621:32362/TCP   43h   app=mcast-app

The IP 172.25.0.2 is assigned from the acc-provision extern_dynamic subnet which is configured on the L3Out by the ACI containers when the mcast app is deployed with a LoadBalancer configuration as discussed above.

I did take a look at the configuration option kube_config\snat_operator\contract_scope: global, which does not seem to apply here; I did try this option and reapplied the configuration, but it does not change the app contract scope.

There is little documentation around the Cisco (noironetworks) ACI containers, hence the question.

Does anybody understand the intended way to provide this EPG-to-Kubernetes-service-IP communication? I would have expected that simply allowing the contract scope to be Global instead of the default VRF would be the obvious approach, but as above I can't do this, as the change is reversed soon after submission.

I would expect that the same contract/service graph could be used for EPGs other than the L3Out, keeping the configuration consistent across all access (internal or external to the fabric).

ACI 4.2(6h)
acc-provision 5.1.3.1
K8 1.2

Thanks.

acc-provision openshift-4.5-esx

When trying to provision openshift-4.5-esx on Cisco ACI version 5.1(1h), the following error arises:

[root@ocp-bastion ocp_upi_integrated]# acc-provision -d -c acc-provision-input.yaml -u x -p y -f openshift-4.5-esx
INFO: Loading configuration from "acc-provision-input.yaml"
INFO: Using configuration flavor openshift-4.5-esx
ERR:  KeyError: 'infraRsVlanNs'

I had to change the file /usr/local/lib/python3.6/site-packages/acc_provision/apic_provision.py at line 214:

FROM: path = "/api/node/mo/uni/vmmp-VMware/dom-%s.json?query-target=children" % (vmmdom)
TO: path = "/api/node/mo/uni/vmmp-VMware/dom-%s.json?query-target=children&target-subtree-class=infraRsVlanNs" % (vmmdom)
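
For context, here is a minimal sketch of the kind of APIC REST query involved, using the narrowed query from the change above; the APIC host, credentials and VMM domain name are placeholders.

import requests

# Placeholder values; substitute your APIC host, credentials and VMM domain.
APIC = "https://apic.example.com"
VMM_DOMAIN = "mydom"

session = requests.Session()
session.verify = False  # lab only; verify certificates in production

# Log in to the APIC; the session keeps the APIC-cookie for later requests.
login = {"aaaUser": {"attributes": {"name": "admin", "pwd": "password"}}}
session.post(f"{APIC}/api/aaaLogin.json", json=login).raise_for_status()

# Query the VMM domain's children, restricted to the infraRsVlanNs class,
# mirroring the target-subtree-class filter added in the fix above.
path = (f"{APIC}/api/node/mo/uni/vmmp-VMware/dom-{VMM_DOMAIN}.json"
        "?query-target=children&target-subtree-class=infraRsVlanNs")
resp = session.get(path)
resp.raise_for_status()

for obj in resp.json().get("imdata", []):
    print(obj)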

Invalid configuration for aci_config/system_id

I got "Invalid configuration for aci_config/system_id: aci_config/system_id" when running the acc-provision command, because the system_id contains a "-" character.
It is related to the config_validate checks:

       "aci_config/system_id": (get(("aci_config", "system_id")),
                                 lambda x: required(x) and isname(x, 32))

My network team said that it is not possible to rename the tenant/system-id.
Is a dash in the system-id really not allowed on the ACI side, or is this a Python issue?
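
For illustration only (this is not the actual acc-provision isname implementation), a check of roughly this shape would reject the dash if its allowed character set is limited to word characters:

import re

# Hypothetical sketch of a name check that rejects dashes:
# allow only letters, digits and underscores, up to a maximum length.
def isname(value, maxlen):
    return (value is not None
            and len(value) <= maxlen
            and re.fullmatch(r"\w+", value, re.ASCII) is not None)

print(isname("mycluster", 32))    # True
print(isname("my-cluster", 32))   # False: the dash is rejected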

acc-provision-operator: can't set image version

I used this configuration:

registry:
  image_prefix: quay.io/noiro
  aci_containers_controller_version: 5.2.3.1.1d150da
  aci_containers_host_version: 5.2.3.1.1d150da
  cnideploy_version: 5.2.3.1.1d150da
  opflex_agent_version: 5.2.3.1.1d150da
  openvswitch_version: 5.2.3.1.1d150da
  aci_containers_operator_version: 5.2.3.1.1d150da
  acc_provision_operator: 5.2.3.1.1d150da

All the images are set to 5.2.3.1.1d150da; however, the acc-provision-operator image is still set to 6.0.0.0.179471b:

cat cni-ocpbm |  grep quay | sort | uniq
                "image_prefix": "quay.io/noiro",
        image: quay.io/noiro/acc-provision-operator:6.0.0.0.179471b. <====
          image: quay.io/noiro/aci-containers-controller:5.2.3.1.1d150da
          image: quay.io/noiro/aci-containers-host:5.2.3.1.1d150da
      - image: quay.io/noiro/aci-containers-operator:5.2.3.1.1d150da
          image: quay.io/noiro/cnideploy:5.2.3.1.1d150da
          image: quay.io/noiro/openvswitch:5.2.3.1.1d150da
          image: quay.io/noiro/opflex:5.2.3.1.1d150da
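
As a quick way to spot an image that did not pick up the override, here is a small sketch that pulls every image: reference out of the generated file and summarizes the tags in use; the file name cni-ocpbm is taken from the grep above.

import re
from collections import Counter

# File name from the grep above; adjust to your generated output file.
MANIFEST = "cni-ocpbm"

with open(MANIFEST) as f:
    images = re.findall(r'image:\s*"?([\w./-]+:[\w.-]+)"?', f.read())

# Count how many distinct tags are referenced; more than one usually
# means some component did not honor the configured version override.
tags = Counter(img.rsplit(":", 1)[1] for img in images)
for img in sorted(set(images)):
    print(img)
print("tags in use:", dict(tags))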

Current version of acc-provision does not work with Python 3.10

This is due to how the Mapping class from collections is imported:

>>> from collections import Mapping
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ImportError: cannot import name 'Mapping' from 'collections' (/usr/lib64/python3.10/collections/__init__.py)

This is a deprecated way of importing Mapping and should be migrated to the new location:
<stdin>:1: DeprecationWarning: Using or importing the ABCs
from 'collections' instead of from 'collections.abc' is deprecated
since Python 3.3, and in 3.10 it will stop working
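
A minimal sketch of the usual fix, importing from collections.abc on current interpreters while keeping a fallback for very old ones:

try:
    # Python 3.3+ location; Mapping was removed from plain 'collections' in 3.10.
    from collections.abc import Mapping
except ImportError:
    # Fallback for interpreters older than Python 3.3.
    from collections import Mapping

def is_mapping(obj):
    """Return True if obj behaves like a read-only mapping (e.g. a dict)."""
    return isinstance(obj, Mapping)

print(is_mapping({"aci_config": {}}))       # True
print(is_mapping(["not", "a", "mapping"]))  # False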

[5.2.1.0] Docker images not available at DockerHub

Docker images with version 5.2.2.0.0ef4718 are not available on Docker Hub:

acc-provision/provision/acc_provision/versions.yaml

...
  5.2:
    aci_containers_controller_version: 5.2.2.0.0ef4718
    aci_containers_host_version: 5.2.2.0.0ef4718
    aci_containers_operator_version: 5.2.2.0.0ef4718
    cnideploy_version: 5.2.2.0.0ef4718
    openvswitch_version: 5.2.2.0.5681a9b
    opflex_agent_version: 5.2.2.0.d2739da
    opflex_server_version: 5.2.2.0.d2739da
    gbp_version: 5.2.2.0.8ac1bd4
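
One quick way to verify tag availability is to probe the registry's public API; the sketch below assumes the images live under the noiro namespace on Docker Hub and that its v2 tags endpoint returns 200 for an existing tag (both assumptions), with image names mirrored from the generated output shown elsewhere on this page.

import requests

# Image -> tag pairs taken from the versions.yaml fragment above; the image
# names are assumed to match the repository names on the registry.
IMAGES = {
    "aci-containers-controller": "5.2.2.0.0ef4718",
    "aci-containers-host": "5.2.2.0.0ef4718",
    "aci-containers-operator": "5.2.2.0.0ef4718",
    "cnideploy": "5.2.2.0.0ef4718",
    "openvswitch": "5.2.2.0.5681a9b",
    "opflex": "5.2.2.0.d2739da",
}

NAMESPACE = "noiro"  # assumed Docker Hub namespace

for name, tag in IMAGES.items():
    # Assumed endpoint shape: 200 when the tag exists, 404 otherwise.
    url = f"https://hub.docker.com/v2/repositories/{NAMESPACE}/{name}/tags/{tag}"
    status = requests.get(url, timeout=10).status_code
    print(f"{name}:{tag} -> {'found' if status == 200 else 'missing'}")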

acc-provision: validate image_pull_secret

The image pull secret is supposed to be the name of the secret.
However, inserting the OpenShift auth JSON blob instead results in a crash.

For example if you set:
image_pull_secret: {"auths": {"openshift-mirror.local:5000": {"auth": "redacted","email": "[email protected]"}}}

The generator inside gen_file_list will crash as the yaml file is not valid anymore.
https://github.com/noironetworks/acc-provision/blob/master/provision/acc_provision/acc_provision.py#L999

We should validate the config file for this possible mistake: the image_pull_secret is just the name of the secret, and we should check for that.
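
A possible validation sketch (my own illustration, not the existing acc-provision check), treating the value as a Kubernetes object name, which must be a lowercase DNS-1123 subdomain:

import re

# RFC 1123 subdomain, as required for Kubernetes Secret names:
# lowercase alphanumeric labels separated by '.' or '-', at most 253 characters.
DNS1123_SUBDOMAIN = re.compile(
    r"^[a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*$")

def validate_image_pull_secret(value):
    """Return True if value looks like a secret name, not an auth blob."""
    return (isinstance(value, str)
            and len(value) <= 253
            and DNS1123_SUBDOMAIN.match(value) is not None)

print(validate_image_pull_secret("pull-secret"))                # True
print(validate_image_pull_secret({"auths": {"registry": {}}}))  # False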

Update acc-provision tool to support latest openshift releases

When I run acc-provision version 5.2.3.1, the available flavors for OpenShift are outdated; the latest baremetal flavor is 4.6. Can it be updated for OpenShift 4.9 (and 4.10, which was released recently)?
Thanks a lot!

INFO: Available configuration flavors:
INFO: openshift-4.7-openstack:	Red Hat OpenShift Container Platform 4.7 on OpenStack
INFO: openshift-4.6-openstack:	Red Hat OpenShift Container Platform 4.6 on OpenStack
INFO: openshift-4.6-baremetal:	Red Hat OpenShift Container Platform 4.6 on Baremetal
INFO: openshift-4.5-openstack:	Red Hat OpenShift Container Platform 4.5 on OpenStack
INFO: openshift-4.8-esx:	Red Hat OpenShift Container Platform 4.8 on ESX
INFO: openshift-4.7-esx:	Red Hat OpenShift Container Platform 4.7 on ESX
INFO: openshift-4.6-esx:	Red Hat OpenShift Container Platform 4.6 on ESX
INFO: openshift-4.5-esx:	Red Hat OpenShift Container Platform 4.5 on ESX
INFO: openshift-3.11:	Red Hat OpenShift Container Platform 3.11

Information on dnsnetworkpolicies in ACI CNI plugin

Hello,
We were referred to GitHub by Cisco TAC with this enquiry.

We are trying to find documentation about "dnsnetworkpolicies" and how to use it in Openshift version 4.10.
"dnsnetworkpolicies" is included in Cisco ACI CNI (acc-provision version 5.2.3.3) which we use to integrate ACI version 5.2(5c) with Openshift.

The following policy does not seem to be functional:

---
apiVersion: aci.dnsnetpol/v1beta
kind: DnsNetworkPolicy
metadata:
  name: dns-allow-google
  namespace: na-nettest-curlclient
spec:
  appliedTo:
      namespaceSelector:
        matchLabels:
          kubernetes.io/metadata.name: na-nettest-curlclient
      podSelector:
        matchLabels:
          app: curl-extern
  egress:
    toFqdn:
      matchNames:
        - "google.com"
        - "noordmolenwerf.nl"

Could you provide more information and documentation?

$ oc api-resources
NAME                 SHORTNAMES   APIVERSION             NAMESPACED   KIND
dnsnetworkpolicies                aci.dnsnetpol/v1beta   true         DnsNetworkPolicy

HostPrefix not configurable but CIDR is

Hi guys,

I was trying to do a small lab setup and set the pod network to /24.
Upon deploying, I noticed that the OpenShift playbooks complained about the hostPrefix of /23.
Investigating further showed that in
acc-provision/provision/acc_provision/templates/cluster-network-03-config.yaml
the cidr is configurable (set from the acc-provision script) while the hostPrefix is static and always /23.
I think it should also be configurable, since Red Hat's documentation allows it for a native OpenShift deployment.
Changing it later is otherwise a cumbersome process (unpacking the manifests.tar.gz, making a manual change, and repacking the archive).
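
As a stopgap, here is a small sketch of that manual change, assuming the generated cluster-network-03-config.yaml follows the usual OpenShift Network operator layout (spec.clusterNetwork entries with cidr and hostPrefix); the file path is a placeholder for wherever the archive was unpacked.

import yaml  # PyYAML

# Placeholder path to the manifest extracted from manifests.tar.gz.
PATH = "manifests/cluster-network-03-config.yaml"

with open(PATH) as f:
    cfg = yaml.safe_load(f)

# Set hostPrefix on every clusterNetwork entry; it must not be a shorter
# prefix than the entry's cidr, e.g. 24 for a /24 pod subnet.
for entry in cfg["spec"]["clusterNetwork"]:
    entry["hostPrefix"] = 24

with open(PATH, "w") as f:
    yaml.safe_dump(cfg, f, default_flow_style=False)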
