
kni-upi-lab's Issues

"make all" fails while downloading the openshift-install file

Running "make all" fails with the error below:

tar: openshift-install-linux-.tar.gz: Cannot open: No such file or directory
tar: Error is not recoverable: exiting now
mv: cannot stat ‘openshift-install’: No such file or directory
make: *** [/usr/local/bin/openshift-install] Error 1

The issue was resolved by adding the lines below to common.sh; we are not sure whether this is the right fix.

OPENSHIFT_OCP_MINOR_REL="4.1.0"
export OPENSHIFT_OCP_MINOR_REL
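
The empty version in "openshift-install-linux-.tar.gz" suggests the release variable was unset when the file name was built. As a minimal sketch (the function name installer_tarball_name is hypothetical and not part of the repository), a guard like the following would fail early with a readable message instead of letting tar fail later:

```shell
# Hypothetical helper: build the installer tarball name from a release
# version and fail early when the version is empty.
installer_tarball_name() {
    version="$1"
    if [ -z "$version" ]; then
        echo "OPENSHIFT_OCP_MINOR_REL is empty; cannot build tarball name" >&2
        return 1
    fi
    echo "openshift-install-linux-${version}.tar.gz"
}

# Fall back to the value from the workaround above when the variable is unset.
OPENSHIFT_OCP_MINOR_REL="${OPENSHIFT_OCP_MINOR_REL:-4.1.0}"
installer_tarball_name "$OPENSHIFT_OCP_MINOR_REL"
```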

Second RHCOS worker doesn't join the cluster

A workaround is to manually execute:

ipmitool -I lanplus -H $IPMI_IP -U xxx -P xxx chassis bootdev pxe
ipmitool -I lanplus -H $IPMI_IP -U xxx -P xxx power cycle

It seems Terraform does not execute these commands for the second worker. It would be handy to document where to set Terraform's TF_LOG environment variable for debugging.
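
The manual workaround could be wrapped in a small helper so it can be repeated per worker. This is only a sketch: the function name pxe_power_cycle and the DRY_RUN switch are hypothetical, and the address 192.0.2.10 is a placeholder. By default it only prints the commands.

```shell
# Hypothetical helper: print (DRY_RUN=1, the default) or run the two
# ipmitool commands from the workaround for one host.
pxe_power_cycle() {
    host="$1"; user="$2"; pass="$3"
    for action in "chassis bootdev pxe" "power cycle"; do
        if [ "${DRY_RUN:-1}" = "1" ]; then
            echo "ipmitool -I lanplus -H $host -U $user -P $pass $action"
        else
            # $action is intentionally unquoted so it splits into words.
            ipmitool -I lanplus -H "$host" -U "$user" -P "$pass" $action
        fi
    done
}

# Placeholder address and credentials; replace with the worker's IPMI details.
pxe_power_cycle 192.0.2.10 xxx xxx

# To capture Terraform debug output while reproducing the problem:
#   TF_LOG=DEBUG TF_LOG_PATH=./terraform-debug.log terraform apply
```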

README file missing content

The README does not cover the following:

  1. Installing git and cloning the repository to get started:
    yum install git
    git clone https://github.com/redhat-nfvpe/kni-upi-lab.git

  2. Updating cluster/ha-lab-ipmi-creds with the username and password in base64 format.

  3. The OS requirements of the installer node: which operating systems and versions are supported?

  4. If the BM and Prov CIDRs differ from those used in AF, where and how should they be changed?

  5. Whether the provisioning and baremetal network interfaces must be configured before running this automation, or whether the management IP alone is enough.

  6. An instruction to run prep_bm_host.sh while connected through the management IP. An SSH session from the BM or PROV IP to the management IP will not work, since it is disconnected during bridge creation.
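
For item 2, a sketch of how the base64 values could be produced. The helper name encode_cred is hypothetical, and "admin" is a placeholder value. Using printf rather than echo avoids encoding a trailing newline along with the value.

```shell
# Hypothetical helper: base64-encode a credential without a trailing newline.
encode_cred() {
    printf '%s' "$1" | base64
}

# Placeholder value; substitute the real IPMI username or password.
encode_cred "admin"
```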

prep_bm_host warnings

I re-executed this script after running the clean_up_host script and got the messages below. Please check whether they are harmless.

Configuring baremetal interface (em2) and bridge (baremetal)...
device em2 is not a slave of baremetal
bridge baremetal is still up; can't delete it

You are using pip version 8.1.2, however version 19.2.3 is available.
You should consider upgrading via the 'pip install --upgrade pip' command.

Terraform apply is failing

"terraform apply" fails with the error below.

matchbox_profile.default: Creating...
module.masters.matchbox_profile.master[0]: Creating...
matchbox_profile.default: Creation complete after 0s [id=test1]
module.masters.matchbox_profile.master[0]: Creation complete after 0s [id=test1-master-0]
module.bootstrap.matchbox_profile.bootstrap: Creating...
matchbox_group.default: Creating...
module.masters.matchbox_group.master[0]: Creating...
matchbox_group.default: Creation complete after 0s [id=test1]
module.masters.matchbox_group.master[0]: Creation complete after 0s [id=test1-master-0]
module.bootstrap.matchbox_profile.bootstrap: Creation complete after 0s [id=test1-bootstrap]
module.bootstrap.matchbox_group.bootstrap: Creating...
module.bootstrap.local_file.vm_bootstrap: Creating...
module.bootstrap.local_file.vm_bootstrap: Creation complete after 0s [id=489ef9df0c85d5cbf1e9a422332c69e9e9c01bcd]
module.bootstrap.matchbox_group.bootstrap: Creation complete after 0s [id=test1-bootstrap]
module.masters.null_resource.ipmi_master[0] (local-exec): Set Boot Device to pxe
module.bootstrap.null_resource.vm_bootstrap: Creating...
module.bootstrap.null_resource.vm_bootstrap: Provisioning with 'local-exec'...
module.bootstrap.null_resource.vm_bootstrap (local-exec): Executing: ["/bin/sh" "-c" "rm -f /var/lib/libvirt/images/bootstrap.img || true\nqemu-img create -f qcow2 /var/lib/libvirt/images/bootstrap.img 800G\nchown qemu:qemu /var/lib/libvirt/images/bootstrap.img\nvirsh create /tmp/test1-bootstrap-vm.xml\n"]
module.bootstrap.null_resource.vm_bootstrap (local-exec): qemu-img: /var/lib/libvirt/images/bootstrap.img: Could not create file: No such file or directory
module.bootstrap.null_resource.vm_bootstrap (local-exec): Formatting '/var/lib/libvirt/images/bootstrap.img', fmt=qcow2 size=858993459200 encryption=off cluster_size=65536 lazy_refcounts=off
module.masters.null_resource.ipmi_master[0] (local-exec): Chassis Power Control: Cycle
module.masters.null_resource.ipmi_master[0]: Creation complete after 1s [id=1486622495056344077]
module.bootstrap.null_resource.vm_bootstrap (local-exec): chown: cannot access ‘/var/lib/libvirt/images/bootstrap.img’: No such file or directory
module.bootstrap.null_resource.vm_bootstrap (local-exec): /bin/sh: line 3: virsh: command not found


I checked and found that libvirt was not installed, so I ran the commands below and it worked.

rpm -qa |grep libv # it showed no output for libvirt
systemctl status libvirtd # it showed libvirt was not running
yum install virt-manager libvirt
systemctl start libvirtd
systemctl enable libvirtd
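
A pre-flight check before "terraform apply" would surface this earlier. The sketch below is an assumption about what such a check could look like (the function name missing_commands is hypothetical). It only reports problems and does not install anything.

```shell
# Hypothetical helper: print each command from the argument list that is
# not available on PATH.
missing_commands() {
    for cmd in "$@"; do
        command -v "$cmd" >/dev/null 2>&1 || echo "$cmd"
    done
}

# The failing local-exec step above uses virsh and qemu-img.
missing="$(missing_commands virsh qemu-img)"
if [ -n "$missing" ]; then
    echo "Missing commands:" $missing >&2
fi

# qemu-img could not create the image because this directory did not exist.
if [ ! -d /var/lib/libvirt/images ]; then
    echo "Directory /var/lib/libvirt/images does not exist" >&2
fi
```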

Bootstrap node auto-deletion

Automatic deletion of the bootstrap node is not part of the automation, and the documentation does not cover how to do it manually.

OCP_CLIENT_BINARY_URL and OCP_INSTALL_BINARY_URL return a wrong address

If we set OPENSHIFT_RHCOS_MAJOR_REL and OPENSHIFT_RHCOS_MINOR_REL to "4.1", OCP_CLIENT_BINARY_URL and OCP_INSTALL_BINARY_URL return a wrong address. The problem is located at the end of the "images_and_binaries.sh" file, more exactly in the "grep install-linux.tar" filter.
Solution: delete the "tar" extension from the grep pattern, as in the two lines below, and the issue is solved.

OCP_CLIENT_BINARY_URL="${OCP_BINARIES["$OPENSHIFT_RHCOS_MAJOR_REL"]}$(curl -sS "${OCP_BINARIES["$OPENSHIFT_RHCOS_MAJOR_REL"]}" | grep client-linux. | cut -d '"' -f $FIELD_SELECTOR)"
and
OCP_INSTALL_BINARY_URL="${OCP_BINARIES["$OPENSHIFT_RHCOS_MAJOR_REL"]}$(curl -sS "${OCP_BINARIES["$OPENSHIFT_RHCOS_MAJOR_REL"]}" | grep install-linux. | cut -d '"' -f $FIELD_SELECTOR)"
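
To illustrate why the shorter pattern matches, here is a sketch run against an assumed sample line of the mirror's directory listing. The sample HTML and the field number 2 are assumptions for illustration; the real script uses $FIELD_SELECTOR. In a grep pattern the "." matches any single character, so "install-linux.tar" requires "tar" one character after "install-linux", which a versioned file name does not have.

```shell
# Assumed sample of one line from the mirror's HTML directory listing.
sample_listing='<a href="openshift-install-linux-4.1.0.tar.gz">openshift-install-linux-4.1.0.tar.gz</a>'

# Old pattern: no match, because the version sits between "linux-" and ".tar".
# "|| true" keeps the script going when grep finds nothing.
old_match="$(printf '%s\n' "$sample_listing" | grep 'install-linux.tar' | cut -d '"' -f 2 || true)"

# New pattern: "." matches the "-" that follows "install-linux".
new_match="$(printf '%s\n' "$sample_listing" | grep 'install-linux.' | cut -d '"' -f 2)"

echo "old pattern: '${old_match}'"
echo "new pattern: '${new_match}'"
```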

Worker node doesn't install rhcos

Hello,
I'm facing some issues when trying to install a worker node with RHCOS, although the master installation goes well.
The following capture shows that it fails while fetching image headers, but I don't know what that is about.
[screenshot not preserved]

Many thanks for any help.

Warnings shown when adding workers

Steps to reproduce:

cd ../workers
terraform init
terraform apply --auto-approve

The following warnings are shown:

Warning: Value for undeclared variable

  on terraform.tfvars line 9:
   9: worker_baremetal_interface = "ens803f0"
Warning: Value for undeclared variable

  on terraform.tfvars line 12:
  12: worker_provisioning_interface = "ens803f1"

Static IP specification in install-config.yaml is not working for master and worker

Here is the error I am getting:

[root@localhost kni-upi-lab]# make all
./scripts/gen_config_prov.sh
Generating /home/kni-upi-lab/dnsmasq/prov/etc/dnsmasq.d/dnsmasq.conf...
./scripts/gen_config_bm.sh bm
Using cached manifest values...
Key with no value for key "install-config.platform.hosts.0.sdnIPAddress" failed...
make: *** [dnsmasq/bm/etc/dnsmasq.d/dnsmasq.conf] Error 1

Worker ROOTPW

The rootpw parameter in the worker node kickstart file (centos-worker-kickstart.cfg) is not updated automatically; it has to be updated manually. This should either be covered in the documentation or be automated.

Terraform init is failing

"terraform init" fails with the error below.

[root@localhost cluster]# terraform init
Initializing modules...
Initializing the backend...
Initializing provider plugins...

  • Checking for available provider plugins...
    Provider "matchbox" not available for installation.
    A provider named "matchbox" could not be found in the Terraform Registry.
    This may result from mistyping the provider name, or the given provider may
    be a third-party provider that cannot be installed automatically.
    In the latter case, the plugin must be installed manually by locating and
    downloading a suitable distribution package and placing the plugin's executable
    file in the following directory:
    terraform.d/plugins/linux_amd64
    Terraform detects necessary plugins by inspecting the configuration and state.
    To view the provider versions requested by each module, run
    "terraform providers".
    Error: no provider exists with the given name

A detailed analysis of this issue is in the attachment:

terrform for matchbox provider error.txt

As a workaround, I tried the following and it worked. Please check whether any changes are required in prep_bm_host.sh.

yum install wget

VERSION=v0.3.0

wget https://github.com/poseidon/terraform-provider-matchbox/releases/download/$VERSION/terraform-provider-matchbox-$VERSION-linux-amd64.tar.gz

tar xvf terraform-provider-matchbox-$VERSION-linux-amd64.tar.gz

cd terraform-provider-matchbox-$VERSION-linux-amd64/

mkdir -p ~/.terraform.d/plugins/
mv terraform-provider-matchbox ~/.terraform.d/plugins/

Centos/RHEL Image download

Downloading the CentOS image before installing worker nodes is not covered in the documentation. The ISO contents need to be copied into the matchbox assets directory, for example:

mount -o loop CentOS-7-x86_64-DVD-1810.iso /mnt/
mkdir -p /home/kni-upi-lab/matchbox-data/var/lib/matchbox/assets/centos7
cp -av /mnt/* /home/kni-upi-lab/matchbox-data/var/lib/matchbox/assets/centos7/
umount /mnt/

Support for large root disks

Worker node installation fails when the root disk is larger than 2 TB. To fix this, the line below needs to be added to the CentOS kickstart file. This should be covered in the documentation.

part biosboot --fstype=biosboot --size=1
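
A sketch of how the kickstart line could be emitted only when needed. The 2 TiB threshold (the MBR limit with 512-byte sectors) and the function name needs_biosboot are assumptions; the issue itself only says ">2 TB". The disk size is passed in bytes, as printed for example by "lsblk -b -d -n -o SIZE <device>".

```shell
# Assumed threshold: 2 TiB in bytes.
MBR_LIMIT_BYTES=2199023255552

# Hypothetical helper: succeed when the disk size (bytes) exceeds the limit.
needs_biosboot() {
    [ "$1" -gt "$MBR_LIMIT_BYTES" ]
}

# Example value only: a nominal 4 TB disk.
disk_bytes=4000787030016
if needs_biosboot "$disk_bytes"; then
    echo "part biosboot --fstype=biosboot --size=1"
fi
```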

No error is shown when you have wrong ssh key

The installer does not warn or fail when the user has entered the wrong SSH key in install-config.yaml.

In fact, the installer continues to install OpenShift and fails after 30ms without a proper error message.
Expected: the BM preparation script should check for this, or be made interactive, with a menu to choose the SSH key and other configuration, like the OpenShift installer.
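
As a rough sketch of such a check, the preparation script could at least reject values that do not look like an OpenSSH public key. The function name looks_like_ssh_pubkey is hypothetical. This is a format check only: it catches malformed entries, but not a well-formed key that belongs to the wrong key pair.

```shell
# Hypothetical helper: succeed when the value starts with a common OpenSSH
# public key type followed by a space.
looks_like_ssh_pubkey() {
    case "$1" in
        "ssh-rsa "*|"ssh-ed25519 "*|"ecdsa-sha2-"*" "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Example with an obviously malformed value.
candidate="not-a-key"
if ! looks_like_ssh_pubkey "$candidate"; then
    echo "sshKey does not look like an OpenSSH public key" >&2
fi
```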

Error from prep_bm_host

ipxe.efi is not getting downloaded. Here is the error.

Setting up tftpboot...

  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:--  0:00:19 --:--:--     0
curl: (6) Could not resolve host: boot.ipxe.org; Unknown error

[root@localhost kni-upi-lab]# ls /var/lib/tftpboot/
undionly.kpxe
