kubev2v / forklift-controller

This repository has been archived, development moved to https://github.com/kubev2v/forklift

License: Apache License 2.0

Dockerfile 0.18% Makefile 0.32% Go 99.50%
forklift migration kubernetes openshift

forklift-controller's Introduction


forklift-controller

Konveyor Forklift controller.


Logging

Logging can be configured using environment variables:

  • LOG_DEVELOPMENT: Development mode with human-readable logs and (default) verbosity=4.
  • LOG_LEVEL: Set the verbosity.
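
A minimal sketch of how these variables might be consumed, assuming logr-style verbosity on top of zap (common in controller projects); the names and level mapping here are assumptions, not the actual implementation:

package logging

import (
	"os"
	"strconv"

	"go.uber.org/zap"
	"go.uber.org/zap/zapcore"
)

// NewLogger builds a zap logger from the environment.
// logr verbosity V(n) conventionally maps to zap level -n.
func NewLogger() (*zap.Logger, error) {
	cfg := zap.NewProductionConfig()
	verbosity := 0
	if os.Getenv("LOG_DEVELOPMENT") != "" {
		cfg = zap.NewDevelopmentConfig() // human-readable output
		verbosity = 4                    // default verbosity in development mode
	}
	if n, err := strconv.Atoi(os.Getenv("LOG_LEVEL")); err == nil {
		verbosity = n
	}
	cfg.Level = zap.NewAtomicLevelAt(zapcore.Level(-verbosity))
	return cfg.Build()
}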

Verbosity:

  • Info(0) used for Info logging.
    • Reconcile begin, end, error.
    • Condition added, updated, deleted.
    • Plan postponed.
    • Migration (k8s) resources created, deleted.
    • Migration started, stopped, run (with phase), canceled, succeeded, failed.
    • Snapshot created, updated, deleted, changed.
    • Inventory watch ensured.
    • Policy agent disabled.
  • Info(1) used for Info+ logging.
    • Connection testing.
    • Plan postpone details.
    • Pending migration details.
    • Migration (k8s) resources found, updated.
    • Scheduler details.
  • Info(2) used for Info++ logging.
    • Full conditions list.
    • Migrating VM status (full definition).
    • Provider inventory data reconciler started, stopped.
  • Info(3) used for Info+++ logging.
    • Inventory watch: resources changed; queued reconcile events.
    • Data reconciler: models created, updated, deleted.
    • VM validation succeeded.
  • Info(4) used for Debug logging.
    • Policy agent HTTP request.

Profiler

The profiler can be enabled using the following environment variables:

  • PROFILE_KIND: Kind of profile (memory|cpu|mutex).
  • PROFILE_PATH: Profiler output directory.
  • PROFILE_DURATION: The duration (minutes) the profiler will collect data (0=indefinitely).
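
A sketch of one plausible implementation using the standard runtime/pprof package (the actual wiring may differ):

package profile

import (
	"os"
	"path/filepath"
	"runtime"
	"runtime/pprof"
	"strconv"
	"time"
)

// Start begins collecting the profile selected by PROFILE_KIND, writes
// it under PROFILE_PATH, and schedules a stop after PROFILE_DURATION
// minutes (0 or unset leaves the profiler running indefinitely).
func Start() error {
	kind := os.Getenv("PROFILE_KIND")
	if kind == "" {
		return nil // profiling disabled
	}
	f, err := os.Create(filepath.Join(os.Getenv("PROFILE_PATH"), kind+".prof"))
	if err != nil {
		return err
	}
	stop := func() { f.Close() }
	switch kind {
	case "cpu":
		if err := pprof.StartCPUProfile(f); err != nil {
			return err
		}
		stop = func() { pprof.StopCPUProfile(); f.Close() }
	case "memory":
		stop = func() { runtime.GC(); pprof.WriteHeapProfile(f); f.Close() }
	case "mutex":
		runtime.SetMutexProfileFraction(1)
		stop = func() { pprof.Lookup("mutex").WriteTo(f, 0); f.Close() }
	}
	if minutes, _ := strconv.Atoi(os.Getenv("PROFILE_DURATION")); minutes > 0 {
		time.AfterFunc(time.Duration(minutes)*time.Minute, stop)
	}
	return nil
}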

forklift-controller's People

Contributors

agrare, ahadas, aufi, bennyz, fabiendupont, fbladilo, jmontleon, jortel, liranr23, mansam, mnecas, mrnold, nyoxi, rayfordj, yaacov


forklift-controller's Issues

Compilation fails: unrecognized import path "vbom.ml/util"

Compilation currently fails with:

/home/rjones/go/bin/bin/controller-gen object:headerFile="./hack/boilerplate.go.txt" paths="./..."
Error: go [-e -json -compiled=true -test=false -export=false -deps=true -find=false -tags ignore_autogenerated -- ./...]: exit status 1: go: github.com/openshift/[email protected] requires
	vbom.ml/[email protected]: unrecognized import path "vbom.ml/util": https fetch: Get "https://vbom.ml/util?go-get=1": dial tcp 195.20.53.190:443: i/o timeout

The website (vbom.ml) went away 5 years ago. How do I fix this? See also: golang/dep#1169
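
One common workaround, assuming the package's new home is github.com/fvbommel/util (an assumption; verify the module and version before relying on it), is a replace directive in go.mod:

// go.mod (sketch)
replace vbom.ml/util => github.com/fvbommel/util v0.0.3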

Validation - Block VMware VM in inaccessible state

When a VM is in the inaccessible state, it does not report an operating system, so the migration fails, but the error message is misleading.
It would be clearer to detect that the VM is inaccessible and block the migration at the validation stage.

The attribute to identify the state is .runtime.connectionState and the value can be connected, disconnected, orphaned, invalid or inaccessible. The import must fail when the value is not connected.

This information should also be in the inventory to allow the UI to display a warning.
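
A minimal sketch of the proposed check, assuming a vSphere inventory model with a ConnectionState field mirroring .runtime.connectionState (type and field names are assumptions):

package validation

// VM mirrors the subset of the vSphere inventory model used here.
type VM struct {
	// ConnectionState is one of: connected, disconnected, orphaned,
	// invalid, inaccessible.
	ConnectionState string
}

// Connected reports whether the VM can be migrated; any value other
// than "connected" should block the plan at the validation stage.
func Connected(vm *VM) bool {
	return vm.ConnectionState == "connected"
}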

Validate vSphere provider fingerprint

With the current implementation, the vSphere provider is validated by connecting to it with the provided URL and credentials. However, according to pkg/controller/provider/container/vsphere/reconciler.go#L476, the connection is insecure, so it will work even if the certificate is invalid. It would be great to add a validation of the thumbprint, as it is passed to VMIO and if it is incorrect the migration fails.

Technically, the thumbprint is the SHA1 digest of the raw certificate. Here is an example of how to retrieve the certificate and generate the MD5 digest; the implementation could be similar, but should generate the SHA1 digest instead of the MD5 one.

The thumbprint validation should be done before the credentials validation, as an invalid thumbprint means that something is wrong in the trust chain, so sending credentials to a possibly rogue server is not a good idea.
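
A sketch of retrieving the certificate and computing the SHA1 thumbprint with the standard library (the comparison against the provider's configured thumbprint is omitted):

package thumbprint

import (
	"crypto/sha1"
	"crypto/tls"
	"fmt"
	"strings"
)

// Fetch dials addr (host:port), reads the leaf certificate, and returns
// the SHA1 digest of the raw certificate in the colon-separated form
// vSphere displays.
func Fetch(addr string) (string, error) {
	conn, err := tls.Dial("tcp", addr, &tls.Config{InsecureSkipVerify: true})
	if err != nil {
		return "", err
	}
	defer conn.Close()
	sum := sha1.Sum(conn.ConnectionState().PeerCertificates[0].Raw)
	hex := make([]string, len(sum))
	for i, b := range sum {
		hex[i] = fmt.Sprintf("%02X", b)
	}
	return strings.Join(hex, ":"), nil
}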

Allow using names instead of ids in plans and mappings

When a user creates a CR for a plan or a mapping, the VMware networks, storages, VMs... are referenced by id. The UI displays names, then translates them to ids when creating the CRs. However, for a user that would like to interact only with the API, finding these IDs is not trivial, as they are not displayed in the vCenter UI.

Then comes the uniqueness issue of names, so using names would require providing additional information:

  • A VM name is unique within the folder where it is located. So, the unique key is the absolute path of the VM.
  • A Datastore's scope is the datacenter. So, the unique key is the combination of datacenter name and datastore name.
  • A standard vSphere Network's scope is the host. Given that a VM belongs to only one host, the unique key is the network name.
  • A Distributed vSwitch's scope is the datacenter. So, the unique key is the combination of datacenter name and network name. This way, if the datacenter is provided, we can expect the network to be a Distributed vSwitch.

Move VM Import functionality from VMIO into Migration Plan Controller

Introduction

The aim is to move the vm-import functionality from KubeVirt's VMIO project to the Forklift controller. The new implementation should be driven by an Itinerary in the Plan controller's Migration, and the existing Plan and Migration CRs should be used.

Once the VM-import functionality is moved to Forklift, the VMIO project from Kubevirt (including its CRs) will be retired.

Code move structure proposal

Forklift Itinerary phase: code origin (notes, dependencies, conditions)

  • Started
  • CreatePreHook: forklift-controller/blob/main/pkg/controller/plan/migration.go
  • PreHookCreated: forklift-controller/blob/main/pkg/controller/plan/migration.go
  • Validate: virtualmachineimport_controller.go#L332-L339 (validation will be done by the controller before entering the migration itinerary)
  • StopVM: virtualmachineimport_controller.go#L342-L347 (unless WarmImport)
  • PrepareVM: virtualmachineimport_controller.go#L349-L364 (could be split into more steps, virtualmachineimport_controller.go#L728-L846)
  • CreateDataVolumes: virtualmachineimport_controller.go#L377-L387
  • CreateDataVolumesCompleted: virtualmachineimport_controller.go#L377-L387
  • AddDVCheckpoint: virtualmachineimport_controller.go#L389-L399 (if shouldWarmImport)
  • DataVolumesCompleted: virtualmachineimport_controller.go#L389-L399
  • ConvertGuest: virtualmachineimport_controller.go#L401-L411
  • ConvertGuestCompleted: virtualmachineimport_controller.go#L401-L411
  • StartVM: virtualmachineimport_controller.go#L419-L439
  • ResourcesCleanup: virtualmachineimport_controller.go#L419-L439 (part of StartVM in VMIO)
  • CreatePostHook: forklift-controller/blob/main/pkg/controller/plan/migration.go
  • PostHookCreated: forklift-controller/blob/main/pkg/controller/plan/migration.go
  • Completed

Code that can be dropped (not used):

  • VMIO PrepareProvider: virtualmachineimport_controller.go#L291-L309 (replaced by the existing Forklift Inventory)

Open questions

Emitted events: do they need to be kept?
https://github.com/kubevirt/vm-import-operator/blob/master/pkg/controller/virtualmachineimport/virtualmachineimport_controller.go#L77-L98

Forklift CreateImport steps (https://github.com/konveyor/forklift-controller/blob/main/pkg/controller/plan/migration.go#L129-L173): is there anything that needs to be preserved (secrets), or is it fully covered by the steps described above?

Work progress: I'd prefer to create PRs with partial changes (a new itinerary skeleton, then making its steps work in several follow-up PRs) against a new branch (proposed name: vmio_import) and merge that branch to master/main once the whole itinerary works without VMIO. Is that OK or not?

Dependencies

The Forklift-controller currently requires VMIO as a dependency. Once the code is moved, forklift-controller should depend only on what that code requires, rather than on the full VMIO project.

Related resources

Revisit ProviderClient find (get) provider.

Revisit whether ProviderClient.find() is necessary. It seems the 206 returned by the API should be sufficient, and probably better, because it would detect cases where the provider is deleted between the find() and the resource get.

Add VM and service instance properties.

Properties referenced in the inventory requirements document:
Provider (vSphere):

  • Provider.apiVersion (api_version)

VM:

  • TBD (has_opaque_network)
  • VM.passthroughSupported (has_passthrough_device)
  • VM.disks[*].rdm (has_rdm_disk)
  • VM.usbSupported (has_usb_controller)
  • VM.numaNodeAffinity (numa_node_affinity)

Add validation for networks and storage mapped.

Add a validation to make the user aware when not all of the networks and storage used by a plan's VMs are mapped. The result should be a category=Warn condition with its Items[] set to the list of offending VMs. (See Ref.String().)

To support this, let's store the validated refs in the Status of the networkMap and storageMap CRs. This can be cleared and populated in Reconciler.Validate(), which is resolving the refs anyway; we only need the source refs included. This way the validation does not need to re-resolve the refs. Perhaps add a new type, StatusReferences (or something), that embeds a slice and has a Find(ref ref.Ref) method to provide the lookup by iterating and comparing the list.

Since it is provider specific, let's add 2 methods to the Builder. Something like:

  • ValidateNetworkMapped(vmRef ref.Ref) (bool, error)
  • ValidateStorageMapped(vmRef ref.Ref) (bool, error)

The validation method can call these, update the Items on the condition, and call setCondition if the list is not empty. (See other validations.)
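
A sketch of the proposed type and method signatures (Ref stands in for ref.Ref; all names follow the suggestions above and are not final):

package plan

// Ref stands in for ref.Ref.
type Ref struct {
	ID string
}

// StatusReferences holds the source refs resolved by Reconciler.Validate().
type StatusReferences []Ref

// Find provides the lookup by iterating and comparing the list.
func (r StatusReferences) Find(ref Ref) bool {
	for _, m := range r {
		if m.ID == ref.ID {
			return true
		}
	}
	return false
}

// Proposed provider-specific Builder methods:
//   ValidateNetworkMapped(vmRef ref.Ref) (bool, error)
//   ValidateStorageMapped(vmRef ref.Ref) (bool, error)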

https://bugzilla.redhat.com/show_bug.cgi?id=1902487

Add unit tests

We would like to be able to grow unit test coverage from now on.

Allow specifying a NetworkAttachmentDefinition for disks transfer

In the VMware provider, the default disk transfer network is the management network, which is unlikely to be the most performant network. The Host CRD was implemented to allow selecting another network for the disk transfer.

The same concern exists on the OpenShift side, as the pods that transfer the disks are created on the pod network, which may not be the most performant network or may not allow access to the ESXi hosts. It would be beneficial to allow selecting a Multus network for the disk transfer, as Multus is supported in KubeVirt. Multus networks are represented by NetworkAttachmentDefinition (NAD) CRs, which are namespaced.

At the importer pod level, we need to add a k8s.v1.cni.cncf.io/networks annotation with the NAD name.
At the DataVolume level, we need to add a cdi.kubevirt.io/data-network annotation with the NAD name.
At the VMImport level, the way to do it is not decided yet, but an annotation would make sense, too.
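
A sketch of how the annotations could be applied (the helper name is hypothetical; the DataVolume annotation key is the one proposed in this issue):

package plan

import (
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// setTransferNetwork annotates an object with the NAD name so its
// traffic is placed on the selected Multus network.
func setTransferNetwork(meta *metav1.ObjectMeta, nad string) {
	if meta.Annotations == nil {
		meta.Annotations = map[string]string{}
	}
	meta.Annotations["k8s.v1.cni.cncf.io/networks"] = nad  // importer pod
	meta.Annotations["cdi.kubevirt.io/data-network"] = nad // DataVolume (proposed)
}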

Support VM target name.

Add an optional Name field on PlanVM.
Validate that the name is compatible with k8s resource names.
Add concerns to each VMStatus; add a concern when name validation fails. Set a warn condition when any VMs have concerns.
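
The name check could reuse apimachinery's resource-name validation; a sketch (helper name hypothetical):

package plan

import (
	"k8s.io/apimachinery/pkg/util/validation"
)

// validTargetName reports whether name is usable as a k8s resource
// name; the returned problems become the VM's concerns.
func validTargetName(name string) (ok bool, problems []string) {
	problems = validation.IsDNS1123Subdomain(name)
	return len(problems) == 0, problems
}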

Expand network attributes of the Host model

When a vSphere provider is added, the UI leverages the inventory to list the hosts and allow selecting a migration network.
By default, the migration traffic will go over the management network, but the UI cannot display that information. However, the vSphere API exposes the IP address of the ESXi host used for management:

MOR:HostSystem.summary.managementServerIp.

If the inventory contained that attribute, the UI could then identify the default network used for migration, by doing some IP calculation based on the IP addresses and netmasks of the pNICs. The netmask is available as:

MOR:HostSystem.configManager.networkSystem.networkInfo.vnic[*].spec.ip.subnetMask
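
The IP calculation can be as simple as testing which vNIC subnet contains the management IP; a sketch using the standard library:

package host

import "net"

// sameSubnet reports whether ip and nicIP share the subnet defined by
// mask (a dotted-quad netmask such as vnic[*].spec.ip.subnetMask).
func sameSubnet(ip, nicIP, mask string) bool {
	m := net.IPMask(net.ParseIP(mask).To4())
	network := net.IPNet{IP: net.ParseIP(nicIP).To4().Mask(m), Mask: m}
	return network.Contains(net.ParseIP(ip))
}

// Example: sameSubnet("10.0.0.15", "10.0.0.7", "255.255.255.0") == true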

Additional VM properties needed by MA.

  • has_sriov_nic (bool): an SR-IOV NIC is presented to this VM.
    VMware: any MOR:VirtualMachine.config.hardware.device[*] of type VirtualSriovEthernetCard
  • has_vm_ft_config (bool): the VM is configured for fault tolerance.
    VMware: MOR:VirtualMachine.config.ftInfo
  • used_disk_storage (int): the VM's used disk storage.
    VMware: MOR:VirtualMachine.summary.storage.committed

Additional Host CPU properties.

Add Host properties.

  • cpu_sockets (int): number of CPU sockets of this host.
    VMware: MOR:HostSystem.summary.hardware.numCpuPkgs
  • cpu_total_cores (int): total number of CPU cores on this host.
    VMware: MOR:HostSystem.summary.hardware.numCpuCores

Migration fails when any VM pipeline fails.

When a VM (pipeline) has an error, the entire migration is failed.
The other VM pipelines should continue. Likely we just need to delay propagating VM errors to the top-level Migration.Error until the end.

Embed mapping in VirtualMachineImport CR

In the current implementation, migrating a virtual machine generates two custom resources:

  • ResourceMapping for the networks and storages of the virtual machine
  • VirtualMachineImport to trigger the import by VMIO. This CR references the ResourceMapping CR.

When the migration is finished, whether successful or failed, the custom resources are not deleted (expected). Over time, and even more when some migrations fail, the number of custom resources grows, as do the API calls required to get a holistic view of the migrations.

Since each ResourceMapping custom resource belongs to only one VirtualMachineImport, storing the network and storage mappings in the VirtualMachineImport custom resource would reduce the number of custom resources by a factor of two.

RHBZ: https://bugzilla.redhat.com/show_bug.cgi?id=1898939

Test connection as part of Host validation.

Test the connection to the host using the specified IP and credentials in the (optional) secret. When no secret is referenced, the controller will use the associated provider credentials. Let's reflect this in conditions/events much like those used in the provider.

  • Validated
  • ConnectionTested

Also, the Host validation should not use the common provider validation; the Host controller should not require the provider to be ready. Recommend it only check that the provider reference is set and that the provider exists.

Restore power state of the VM

When creating the VirtualMachineImport, it's possible to specify if we want the VM to be powered on after the import by setting
spec.startVm to true. We want to restore the power state of the VM as it was when the migration started.

This means that the inventory needs to collect the VM power state:

  • Attribute: MOR:Vm.runtime.powerState
  • Possible values: poweredOff, poweredOn

RHBZ: https://bugzilla.redhat.com/show_bug.cgi?id=1908021

Add destination namespace on Plan.

Add a destination namespace on Plan for VMs.
The Plan controller needs to use this namespace, when populated, instead of the plan namespace.

Add plan validation to detect duplicate VM target names.

While validating VMs, determine the effective target VM name and set a condition when the effective target name would be a duplicate. This can happen when migrating different VMs that have the same name but live in different folders or datacenters.
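
A sketch of the duplicate detection over the effective target names:

package plan

// duplicates returns the effective target names that occur more than
// once across the plan's VMs; each duplicate is reported once.
func duplicates(names []string) (dup []string) {
	seen := map[string]int{}
	for _, n := range names {
		seen[n]++
		if seen[n] == 2 {
			dup = append(dup, n)
		}
	}
	return
}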

Add support for access and volume mode in storage mapping

When a storage mapping is created, the only information we have is the destination storage class name. However, this is not sufficient, because some access modes and volume modes are not supported by Kubevirt, e.g. Ceph + Filesystem or Cinder + RWX. So, we shouldn't allow unsupported access and volume modes.

The idea is to implement a storage support matrix that would be used to validate access and volume modes for the destination storage classes, as well as provide intelligent default values for each provisioner type. The support matrix would be stored in a Config Map managed by the operator and mounted in the controller and UI pods.

The access modes are documented in Kubernetes > Persistent Volumes > Access Modes.
The volumes modes are documented in Kubernetes > Persistent Volumes > Volume Modes.
The list of provisioners that support block mode is available in Kubernetes > Persistent Volumes > Raw Block Volume Support.

A possible format for the support matrix is:

---
- kubevirt_version: v2.5.0
  supported_storages:
  - provisioner_name: kubernetes.io/aws-ebs
    access_modes:
    - ReadWriteOnce
    volume_modes:
    - Block
    - Filesystem
  - provisioner_name: kubernetes.io/rbd
    access_modes:
    - ReadWriteOnce
    - ReadOnlyMany
    volume_modes:
    - Block
    - Filesystem
  - provisioner_name: kubernetes.io/glusterfs
    access_modes:
    - ReadWriteMany
    - ReadWriteOnce
    - ReadOnlyMany
    volume_modes:
    - Filesystem

Add referenced Host properties to VMStatus.

Get the referenced Host and add its ipAddress and secret ref to each VM status. This insulates against changes to the Host while the plan is executing.

Instead of referencing a Host from each VM listed on the plan, the controller should find a matching Host at the beginning of execution.

Failing to import VM from oVirt to KubeVirt on v2v error

Trying to import a newly created VM from oVirt to KubeVirt fails on v2v with:

calling: settle
libguestfs: trace: v2v: inspect_get_type = "linux"
libguestfs: trace: v2v: inspect_get_package_format "/dev/sda1"
guestfsd: => inspect_get_type (0x1e3) took 0.01 secs
guestfsd: <= inspect_get_package_format (0x1e5) request length 56 bytes
commandrvf: stdout=n stderr=y flags=0x0
commandrvf: udevadm --debug settle -E /dev/sda1
calling: settle
libguestfs: trace: v2v: inspect_get_package_format = "rpm"
libguestfs: trace: v2v: filesize "/var/lib/rpm/Name"
guestfsd: => inspect_get_package_format (0x1e5) took 0.01 secs
guestfsd: <= filesize (0xda) request length 64 bytes
guestfsd: error: /var/lib/rpm/Name: No such file or directory
guestfsd: => filesize (0xda) took 0.00 secs
libguestfs: trace: v2v: filesize = -1 (error)
libguestfs: trace: v2v: inspect_list_applications2 = NULL (error)
virt-v2v: error: libguestfs error: filesize: /var/lib/rpm/Name: No such 
file or directory
rm -rf '/var/tmp/null.LCFkCO'
libguestfs: trace: v2v: close
libguestfs: closing guestfs handle 0x564e86727cd0 (state 2)
libguestfs: trace: v2v: internal_autosync
guestfsd: <= internal_autosync (0x11a) request length 40 bytes
umount-all: /proc/mounts: fsname=/dev/root dir=/ type=ext2 opts=rw,noatime freq=0 passno=0
umount-all: /proc/mounts: fsname=/proc dir=/proc type=proc opts=rw,relatime freq=0 passno=0
umount-all: /proc/mounts: fsname=/sys dir=/sys type=sysfs opts=rw,relatime freq=0 passno=0
umount-all: /proc/mounts: fsname=tmpfs dir=/run type=tmpfs opts=rw,nosuid,relatime,size=701520k,mode=755 freq=0 passno=0
umount-all: /proc/mounts: fsname=/dev dir=/dev type=devtmpfs opts=rw,relatime,size=1748016k,nr_inodes=437004,mode=755 freq=0 passno=0
umount-all: /proc/mounts: fsname=/dev/pts dir=/dev/pts type=devpts opts=rw,relatime,mode=600,ptmxmode=000 freq=0 passno=0
umount-all: /proc/mounts: fsname=shmfs dir=/dev/shm type=tmpfs opts=rw,relatime freq=0 passno=0
umount-all: /proc/mounts: fsname=/dev/sda1 dir=/sysroot type=ext4 opts=rw,relatime freq=0 passno=0
commandrvf: stdout=n stderr=y flags=0x0
commandrvf: umount /sysroot
commandrvf: stdout=n stderr=y flags=0x0
commandrvf: udevadm --debug settle -E /dev/sdb
calling: settle
commandrvf: stdout=n stderr=y flags=0x0
commandrvf: udevadm --debug settle -E /dev/sda
calling: settle
fsync /dev/sda
libguestfs: trace: v2v: internal_autosync = 0
libguestfs: sending SIGTERM to process 211
# oc version
Client Version: 4.9.0-0.okd-2021-11-28-035710
Server Version: 4.9.0-0.okd-2021-12-12-025847
Kubernetes Version: v1.22.1-1824+934e08bc2ce38f-dirty
kubevirt-hyperconverged-operator.v1.5.0
konveyor-forklift-operator.v2.2.0

oVirt 4.4.8.5-1.el8

VM created from Fedora 34 base cloud image from oVirt Glance repository.

The provider conditions are stale.

The provider is added to the DB as part of the main reconcile. The predicate for the main watch ignores updates when generation equals observedGeneration. As a result, the follow-up update, when the inventory is built and the InventoryCreated and Ready conditions are added, is ignored.

We should consider managing the Provider in the DB in the OCP reconciler instead.

Translate mapped network/storage IDs to be host oriented.

When a Host is used for the migration (instead of vSphere API), the controller needs to translate mapped network/storage IDs to be host oriented when building the VM Import CR.

Likely to be easier to pass the Name instead of the ID.

RFE: RHV import / conversion pod names not as user-friendly as VMware

https://github.com/konveyor/forklift-must-gather/blob/af0bf5f53c5937077e94a5bc28f84809e497f359/README.md?plain=1#L87

This line is an example of a VMware importer pod name. RHV importer and conversion pods, however, have names that look like this: importer-<plan>-ed90dfc6-9a17-4a8btnfh, where ed90dfc6-9a17-4a8 is the beginning of the RHV VM ID and btnfh is an ID.

It would be more user-friendly if the VM name could be used instead. If this is not possible, perhaps include -vm- in the name before the ID and a hyphen between the VM ID and 5-char ID: importer-<plan>-vm-ed90dfc6-9a17-4a8-btnfh.

Missing `json:",inline"` tags.

Many of the inlined fields in API types are missing the json:",inline" tag, which results in the proper CRD entries not being generated.
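
For illustration (type names are hypothetical), an embedded field needs the explicit tag so the generator flattens it into the parent schema:

// Timed is embedded (inlined) in other status types.
type Timed struct {
	Started   string `json:"started,omitempty"`
	Completed string `json:"completed,omitempty"`
}

type VMStatus struct {
	Timed `json:",inline"` // the missing tag
	Phase string `json:"phase"`
}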
