redhat-nfvpe / kni-upi-lab
Automated installation for OpenShift 4.x using the User Provided Infrastructure (UPI) guidelines
Getting the error below while running "make all":
tar: openshift-install-linux-.tar.gz: Cannot open: No such file or directory
tar: Error is not recoverable: exiting now
mv: cannot stat ‘openshift-install’: No such file or directory
make: *** [/usr/local/bin/openshift-install] Error 1
The issue was resolved when we added the lines below to common.sh; not sure if this is the right fix.
OPENSHIFT_OCP_MINOR_REL="4.1.0"
export OPENSHIFT_OCP_MINOR_REL
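The empty version in "openshift-install-linux-.tar.gz" suggests the release variable was unset when the tarball name was built. A minimal sketch of a guard that fails early with a readable message instead; the helper name installer_tarball is hypothetical and not part of the repository:

```shell
# Sketch only: build the installer tarball name from a release string and
# refuse to continue when that string is empty.
# installer_tarball is a hypothetical helper, not a function from kni-upi-lab.
installer_tarball() {
    if [ -z "${1:-}" ]; then
        echo "OPENSHIFT_OCP_MINOR_REL is empty; set it before running make" >&2
        return 1
    fi
    printf 'openshift-install-linux-%s.tar.gz\n' "$1"
}

# Example call:
#   installer_tarball "$OPENSHIFT_OCP_MINOR_REL" || exit 1
```

With a check like this, "make all" would stop at the missing variable rather than at the later tar and mv errors.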
A workaround is to manually execute:
ipmitool -I lanplus -H $IPMI_IP -U xxx -P xxx chassis bootdev pxe
ipmitool -I lanplus -H $IPMI_IP -U xxx -P xxx power cycle
It seems those instructions are not executed by Terraform on a second worker. It'd be handy to be able to set the TF_LOG variable somewhere.
The README file doesn't cover the following:
Installing git and cloning the repository to start this work:
yum install git
git clone https://github.com/redhat-nfvpe/kni-upi-lab.git
Updating cluster/ha-lab-ipmi-creds and giving the username and password in base64 format.
Details on the OS requirements of the installer node, such as which OSes and versions are supported.
If the BM and Prov CIDRs are not the same as those used in AF, where and how should they be changed?
Instructions on whether the provisioning and baremetal network interfaces should be configured before running this automation, or whether the management IP alone is enough.
Add this instruction: run prep_bm_host.sh while connected through the management IP. SSH from the BM or PROV IP to the management IP will not work, since the session is disconnected during bridge creation.
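For the base64-encoded credentials mentioned above, something like the following could go into the README. This is only a sketch: encode_cred is a hypothetical helper, and the field names inside cluster/ha-lab-ipmi-creds are not shown here because they are not given in this report.

```shell
# Sketch only: encode a credential for the IPMI credentials file.
# encode_cred is a hypothetical helper name.
encode_cred() {
    # printf is used instead of echo so that no trailing newline
    # ends up inside the encoded value
    printf '%s' "$1" | base64
}

encode_cred admin                            # prints YWRtaW4=
printf '%s' 'YWRtaW4=' | base64 --decode     # prints admin (no trailing newline)
```

Using echo without -n is a common source of credentials that decode with an extra newline, which is why the sketch uses printf.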
We want to use kni-upi-lab with 4.3.
Example : https://openshift-release-artifacts.svc.ci.openshift.org/4.3.0-0.ci-2019-10-08-092115
The ha-lab-ipmi-creds.yaml update needs to be added to the documentation.
I re-executed this script after running the clean_up_host script and am getting the messages below. Please check whether these are OK and won't affect anything.
Configuring baremetal interface (em2) and bridge (baremetal)...
device em2 is not a slave of baremetal
bridge baremetal is still up; can't delete it
You are using pip version 8.1.2, however version 19.2.3 is available.
You should consider upgrading via the 'pip install --upgrade pip' command.
Terraform apply is failing with the error below.
matchbox_profile.default: Creating...
module.masters.matchbox_profile.master[0]: Creating...
matchbox_profile.default: Creation complete after 0s [id=test1]
module.masters.matchbox_profile.master[0]: Creation complete after 0s [id=test1-master-0]
module.bootstrap.matchbox_profile.bootstrap: Creating...
matchbox_group.default: Creating...
module.masters.matchbox_group.master[0]: Creating...
matchbox_group.default: Creation complete after 0s [id=test1]
module.masters.matchbox_group.master[0]: Creation complete after 0s [id=test1-master-0]
module.bootstrap.matchbox_profile.bootstrap: Creation complete after 0s [id=test1-bootstrap]
module.bootstrap.matchbox_group.bootstrap: Creating...
module.bootstrap.local_file.vm_bootstrap: Creating...
module.bootstrap.local_file.vm_bootstrap: Creation complete after 0s [id=489ef9df0c85d5cbf1e9a422332c69e9e9c01bcd]
module.bootstrap.matchbox_group.bootstrap: Creation complete after 0s [id=test1-bootstrap]
module.masters.null_resource.ipmi_master[0] (local-exec): Set Boot Device to pxe
module.bootstrap.null_resource.vm_bootstrap: Creating...
module.bootstrap.null_resource.vm_bootstrap: Provisioning with 'local-exec'...
module.bootstrap.null_resource.vm_bootstrap (local-exec): Executing: ["/bin/sh" "-c" "rm -f /var/lib/libvirt/images/bootstrap.img || true\nqemu-img create -f qcow2 /var/lib/libvirt/images/bootstrap.img 800G\nchown qemu:qemu /var/lib/libvirt/images/bootstrap.img\nvirsh create /tmp/test1-bootstrap-vm.xml\n"]
module.bootstrap.null_resource.vm_bootstrap (local-exec): qemu-img: /var/lib/libvirt/images/bootstrap.img: Could not create file: No such file or directory
module.bootstrap.null_resource.vm_bootstrap (local-exec): Formatting '/var/lib/libvirt/images/bootstrap.img', fmt=qcow2 size=858993459200 encryption=off cluster_size=65536 lazy_refcounts=off
module.masters.null_resource.ipmi_master[0] (local-exec): Chassis Power Control: Cycle
module.masters.null_resource.ipmi_master[0]: Creation complete after 1s [id=1486622495056344077]
module.bootstrap.null_resource.vm_bootstrap (local-exec): chown: cannot access ‘/var/lib/libvirt/images/bootstrap.img’: No such file or directory
module.bootstrap.null_resource.vm_bootstrap (local-exec): /bin/sh: line 3: virsh: command not found
I checked and found that libvirt wasn't installed, so I tried the following and it worked.
rpm -qa |grep libv # it showed no output for libvirt
systemctl status libvirtd # it showed libvirt was not running
yum install virt-manager libvirt
systemctl start libvirtd
systemctl enable libvirtd
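A pre-flight check before "terraform apply" would have surfaced this earlier. A minimal sketch, assuming the bootstrap step only needs the commands seen in the log above (virsh and qemu-img); missing_commands is a hypothetical helper:

```shell
# Sketch only: print the name of every required command that is not on PATH.
# missing_commands is a hypothetical helper, not part of kni-upi-lab.
missing_commands() {
    for cmd in "$@"; do
        command -v "$cmd" >/dev/null 2>&1 || printf '%s\n' "$cmd"
    done
}

# Example call before running terraform:
#   missing="$(missing_commands virsh qemu-img)"
#   [ -z "$missing" ] || { echo "install first: $missing" >&2; exit 1; }
#   [ -d /var/lib/libvirt/images ] || { echo "libvirt image dir missing" >&2; exit 1; }
```

The directory test covers the "Could not create file" error from qemu-img, which appears when /var/lib/libvirt/images does not exist yet.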
Automatic deletion of the bootstrap node is not part of the automation, and the documentation does not cover the step to do it manually.
If we set OPENSHIFT_RHCOS_MAJOR_REL and OPENSHIFT_RHCOS_MINOR_REL to "4.1", OCP_CLIENT_BINARY_URL and OCP_INSTALL_BINARY_URL return a wrong address. The problem is located at the end of the "images_and_binaries.sh" file, more exactly at the "grep install-linux.tar".
Solution: delete the "tar" extension from the grep pattern and the issue is solved.
OCP_CLIENT_BINARY_URL="${OCP_BINARIES["$OPENSHIFT_RHCOS_MAJOR_REL"]}$(curl -sS "${OCP_BINARIES["$OPENSHIFT_RHCOS_MAJOR_REL"]}" | grep client-linux. | cut -d '"' -f $FIELD_SELECTOR)"
and
OCP_INSTALL_BINARY_URL="${OCP_BINARIES["$OPENSHIFT_RHCOS_MAJOR_REL"]}$(curl -sS "${OCP_BINARIES["$OPENSHIFT_RHCOS_MAJOR_REL"]}" | grep install-linux. | cut -d '"' -f $FIELD_SELECTOR)"
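To illustrate why the shorter pattern helps, here is a small sketch. The listing content, the pick_asset helper and the cut field number 2 are all assumptions for the example; the real page layout and the value of FIELD_SELECTOR may differ. In a grep pattern the "." matches exactly one character, so "install-linux.tar" only matches when "tar" follows one character after "linux", which is not the case once a version number is part of the file name.

```shell
# Sketch only: pick a file name out of an HTML directory listing.
# pick_asset and the sample listing below are hypothetical.
pick_asset() {
    # $1: grep pattern; stdin: listing; field 2 is the href value here
    grep -- "$1" | cut -d '"' -f 2
}

listing='<a href="openshift-client-linux-4.1.0.tar.gz">openshift-client-linux-4.1.0.tar.gz</a>
<a href="openshift-install-linux-4.1.0.tar.gz">openshift-install-linux-4.1.0.tar.gz</a>'

# Old pattern: "linux" + one character + "tar" never occurs, so nothing is printed
printf '%s\n' "$listing" | pick_asset 'install-linux.tar'

# Shortened pattern: matches "install-linux-" and prints the install tarball name
printf '%s\n' "$listing" | pick_asset 'install-linux.'
```

With an empty match the URL variable ends up as just the base address, which would explain the wrong download address described above.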
Steps to reproduce:
cd ../workers
terraform init
terraform apply --auto-approve
You will see the following warnings:
Warning: Value for undeclared variable
on terraform.tfvars line 9:
9: worker_baremetal_interface = "ens803f0"
Warning: Value for undeclared variable
on terraform.tfvars line 12:
12: worker_provisioning_interface = "ens803f1"
Here is the error I am getting,
[root@localhost kni-upi-lab]# make all
./scripts/gen_config_prov.sh
Generating /home/kni-upi-lab/dnsmasq/prov/etc/dnsmasq.d/dnsmasq.conf...
./scripts/gen_config_bm.sh bm
Using cached manifest values...
Key with no value for key "install-config.platform.hosts.0.sdnIPAddress" failed...
make: *** [dnsmasq/bm/etc/dnsmasq.d/dnsmasq.conf] Error 1
The rootpw parameter in the worker node kickstart file (centos-worker-kickstart.cfg) is not updated automatically; it has to be updated manually. This should be covered in the documentation or automated.
Getting the error below.
[root@localhost cluster]# terraform init
Initializing modules...
Initializing the backend...
Initializing provider plugins...
A detailed analysis of this issue is in the attachment:
terrform for matchbox provider error.txt
As a workaround, I tried the following and it worked. Please see if any changes are required in prep_bm_host.sh.
yum install wget
VERSION=v0.3.0
wget https://github.com/poseidon/terraform-provider-matchbox/releases/download/$VERSION/terraform-provider-matchbox-$VERSION-linux-amd64.tar.gz
tar xvf terraform-provider-matchbox-$VERSION-linux-amd64.tar.gz
cd terraform-provider-matchbox-$VERSION-linux-amd64/
mkdir -p ~/.terraform.d/plugins/
mv terraform-provider-matchbox ~/.terraform.d/plugins/
Downloading the CentOS binary before worker node installation is not covered in the documentation.
mount -o loop CentOS-7-x86_64-DVD-1810.iso /mnt/
mkdir -p /home/kni-upi-lab/matchbox-data/var/lib/matchbox/assets/centos7
cp -av /mnt/* /home/kni-upi-lab/matchbox-data/var/lib/matchbox/assets/centos7/
umount /mnt/
Worker node installation gives an error when the disk size is >2 TB. To fix this, the line below needs to be added to the CentOS kickstart file. This should be added to the documentation.
part biosboot --fstype=biosboot --size=1
The installer does not warn or fail when the user has entered the wrong SSH key in install-config.yaml.
In fact, the installer continues to install OpenShift and fails after 30ms without a proper error message.
Expected: the BM preparation script should check for this, or be made interactive with a menu to choose the SSH key and other configuration, like the OpenShift installer.
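As a starting point for such a check, here is a rough sketch of a format test the preparation script could run on the key before starting the install. It only catches malformed input (for example a private key or a truncated paste), not a well-formed key that belongs to the wrong key pair. The helper name and the list of accepted key types are assumptions.

```shell
# Sketch only: succeed when the argument looks like a single-line OpenSSH
# public key: "<type> <base64 blob> [comment]".
# looks_like_ssh_pubkey is a hypothetical helper; the accepted key types
# listed in the pattern are an assumption and may need extending.
looks_like_ssh_pubkey() {
    printf '%s\n' "$1" |
        grep -Eq '^(ssh-rsa|ssh-ed25519|ecdsa-sha2-nistp(256|384|521)) [A-Za-z0-9+/]+={0,2}( .*)?$'
}

# Example call with a key already read from install-config.yaml:
#   looks_like_ssh_pubkey "$ssh_key" || { echo "sshKey looks malformed" >&2; exit 1; }
```

How the key is read out of install-config.yaml is left out on purpose, since that depends on the tooling the script already uses.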
The "forward . x.x.x.x" command in gen_coredns is hard-coded.
ipxe.efi is not getting downloaded. Here is the error:
Setting up tftpboot...
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
0 0 0 0 0 0 0 0 --:--:-- 0:00:19 --:--:-- 0
curl: (6) Could not resolve host: boot.ipxe.org; Unknown error
[root@localhost kni-upi-lab]# ls /var/lib/tftpboot/
undionly.kpxe