microsoft / lis-test
Contains test infrastructure for testing Linux virtual machines on Windows Azure and Hyper-V.
For example, we could add the following parameters to better integrate with Jenkins.
hvServer
vmName
os
ipv4
sshKey
suite
@adriansuhov the test case would verify that non-aligned memory chunks can be added to a running Linux VM.
This removes the restriction in existing releases that memory increases must be 128MB-aligned.
CreateVMs.ps1 doesn't currently work on WS 2012 (no R2).
Initially it fails due to the Generation param, which is not supported in this case.
Other changes might be required when full code gets executed.
On a Gen2VM the system disk is located on SCSI 0,0.
The clean-up script RemoveVhdxHardDisk.ps1 also removes SCSI 0,0 at the end of the test, breaking the test run.
Running the STOR VHDx tests on a remote Nano Server results in several failures due to missing remote capabilities of the scripts.
The work should start with RemoveVhdxHardDisk.ps1; other related scripts might require improvements as well.
Through a LISA flag, a test run will exit immediately and leave the VM in the current state.
This is useful to troubleshoot VM failures in case of call traces and other situations.
Test case Disable_NUMA_By_Kernel fails on WS 2012 (without R2) with the below warning:
FAILED: Could not find VmGeneration variable.
@Jingli1985 are you able to fix this?
Thank you!
In ConfigPtp4l
ptp4l is provided on RHEL and Ubuntu by linuxptp package - must be installed during setup.
Handle the different service names in ConfigChrony:
On RHEL the service name is chronyd; Ubuntu refers to it as chrony.
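A minimal bash sketch of how ConfigChrony could pick the service name per distribution. The function name chrony_service_name and the distro identifiers ("redhat_7", "ubuntu_x") are assumptions for illustration, not the names used in the repository. It relies on the chrony unit being called chronyd on RHEL-family systems and chrony on Ubuntu/Debian.

```shell
#!/bin/bash
# Hypothetical helper: map a distro identifier to the chrony service name.
# The identifier format (e.g. "redhat_7", "ubuntu_x") is an assumption.
chrony_service_name() {
    case "$1" in
        redhat*|centos*|fedora*) echo "chronyd" ;;  # RHEL family
        ubuntu*|debian*)         echo "chrony" ;;   # Debian family
        *)                       return 1 ;;        # unknown distro: let the caller decide
    esac
}
```

It could then be used as `service "$(chrony_service_name "$DISTRO")" restart`, with the caller aborting the test when the function returns non-zero.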
@adriansuhov Several network tests make use of static MAC addresses on the External interface.
These must be converted to use of the random MAC generation interface - improvements to be done to that function.
A list of tests that still make use of static MAC:
There are important files that Microsoft projects should all have that are not present in this repository. A pull request has been opened to add the missing file(s). When the pr is merged this issue will be closed automatically.
Microsoft teams can learn more about this effort and share feedback within the open source guidance available internally.
Calling Initialize-HypervHost.ps1 -git
fails for me, likely because & $cmd $options
returns right away.
I'm not sure whether git-installer.exe has an option to "not fork", so that the ps1 script actually waits for git-installer.exe to return.
The following del git-installer.exe
also does not work, likely because the file is busy. No error is reported.
Finally the test for git.exe
fails because not much has actually been installed by the time "test -f $path/git.exe" is executed.
Current automation measures the throughput exactly after 1 minute, which sometimes is not reliable enough.
Suggestion is to make an average of the throughput after each operation, in order to better obtain the values, then compare.
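The averaging suggested above could look roughly like this in bash. The function name average_throughput is hypothetical, and the samples are assumed to be plain numbers (one per operation) collected by the caller.

```shell
#!/bin/bash
# Hypothetical helper: print the arithmetic mean of all samples passed as
# arguments, rounded to two decimals. Returns 1 when no samples are given.
average_throughput() {
    [ "$#" -gt 0 ] || return 1
    printf '%s\n' "$@" | awk '{ sum += $1 } END { printf "%.2f\n", sum / NR }'
}
```

The comparison would then be done on the averaged values rather than on the single reading taken after one minute.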
Implement changes from LIS/lis-pipeline#124 into lisa code to execute at exit.
I ran into a problem recently. My ICABase checkpoint was created with DM enabled. When running NUMA cases (e.g. CheckNuma), DM_DISABLE.ps1 is invoked, but it didn't actually work, so the case failed.
I did a little debugging. It looks like the problem lies in DM_CONFIGURE_MEMORY.ps1: line 233.
No memory configuration will be done unless all four parameters are present ( vmName, tpEnabled, tPstartupMem, tPmemWeight ).
In the CheckNuma case, however, the tPstartMemory param is not set in the xml files. The memory configuration part of the script is skipped entirely (without leaving any warning message).
I tried to do something about the code, but it looks tricky to me:
One approach is to simply add the startupMem param to all the cases invoking DM_DISABLE.ps1. I do not think this is a good way to address the problem, since you would have to hard code the memory size in the xml files for all cases requiring DM disable (or even more cases). It is also not really reasonable to make startupMem a required param for a user who just wants to disable DM.
Another approach is to add an exception to the " if " in line 233 (add an " elseif "), which is not really safe since the script runs the whole loop when parsing each param. Some unexpected behavior might happen that way.
LIS deploy scenarios automation should also add and validate the suggested SELinux rules that are needed for the LIS daemons.
Please refer to the LIS 4.1.x release notes for the full documentation on the rules to be added.
@OvidiuRusu & @paulaCrismaru
Would it be possible to replace the dependency on putty by implementing OpenSSH? Quick script to set this up:
# install Chocolatey & OpenSSH
Set-ExecutionPolicy Bypass
iex ((New-Object System.Net.WebClient).DownloadString('https://chocolatey.org/install.ps1'))
choco install -y openssh -params "/SSHServerFeature"
refreshenv
At random points during a test run, if the VM does not boot or doesn't get an IP, lisa exits and does not continue with the remaining tests.
This behavior must be changed to have a timeout then force a reboot on the VM.
If this doesn't work, then we must continue with the next test case.
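The control flow described above (wait with a timeout, force one reboot, otherwise move on) can be sketched in bash as follows. This only illustrates the logic; LISA itself is PowerShell, and the names wait_until, ensure_vm_ready and BOOT_TIMEOUT, as well as the check and reboot callbacks, are hypothetical.

```shell
#!/bin/bash
# Assumed default: seconds to wait for the VM before giving up.
BOOT_TIMEOUT=${BOOT_TIMEOUT:-300}

# wait_until TIMEOUT_SECONDS INTERVAL_SECONDS COMMAND [ARGS...]
# Re-run COMMAND until it succeeds or the timeout expires.
wait_until() {
    local timeout=$1 interval=$2
    shift 2
    local deadline=$(( SECONDS + timeout ))
    while ! "$@"; do
        [ "$SECONDS" -ge "$deadline" ] && return 1
        sleep "$interval"
    done
}

# ensure_vm_ready CHECK_CMD REBOOT_CMD
# Returns 0 when the VM is ready, possibly after one forced reboot.
# Returns 1 when the caller should skip to the next test case.
ensure_vm_ready() {
    local check=$1 reboot=$2
    wait_until "$BOOT_TIMEOUT" 1 "$check" && return 0
    "$reboot" || return 1
    wait_until "$BOOT_TIMEOUT" 1 "$check"
}
```

A test loop would call `ensure_vm_ready vm_has_ip force_reboot_vm || continue`, where both callbacks are placeholders for whatever the framework uses to probe and reset the VM.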
Hi Chris,
When running the latest lis-test, we get the following errors, possibly related to the following commit. https://github.com/LIS/lis-test/blob/master/WS2012R2/lisa/utilFunctions.ps1 Stop-VM $(
With the older script it runs smoothly, but I haven't found out yet why this commit induces the error. The PowerShell version is already 5.1, and we did not change the host.
09:19:50 Info : RHEL-7.5--GEN2-A Over-riding default snapshotName from global section to ICABase
09:19:51 Info : RHEL-7.5-.0-x86_64-GEN2-A is being reset to snapshot ICABase
09:19:51 Processing data for a remote command failed with the following error message:
09:19:51 <f:WSManFault xmlns:f="http://schemas.microsoft.com/wbem/wsman/1/wsmanfault"
09:19:51 Code="3221225477" Machine="2016-AUTO"><f:Message><f:ProviderFault
09:19:51 provider="microsoft.powershell" path="C:\Windows\system32\pwrshplugin.dll"></f:
09:19:51 ProviderFault></f:Message></f:WSManFault> For more information, see the
09:19:51 about_Remote_Troubleshooting Help topic.
09:19:51 + CategoryInfo : OperationStopped: (2016-AUTO:String) [], PSRemot
09:19:51 ingTransportException
09:19:51 + FullyQualifiedErrorId : JobFailure
09:19:51 + PSComputerName : 2016-AUTO
Thank you so much.
Best Regards,
Xuemin
This is a feature request for a param that specifies where the VM vhdx file is to be stored, so that the VM is created with the given path.
Core_Heartbeat_PausedCritical.ps1 has a parameter vhdpath=C:\TestVolume.vhdx
This must be changed to a random vhdx name in order to avoid parallel testing conflicts.
Please keep the drive letter at least.
Kdump_Execute.sh:
issue code:
--
systemctl status kdump.service | grep -q "active"
========> In RHEL 7.x the kdump status is "Active: active" / "Active: inactive", so $? is always 0
if [ $? -ne 0 ]; then
    service kdump status | grep "operational"
========> In RHEL 6.x the kdump status is "operational" / "not operational", so $? is always 0
    if [ $? -eq 0 ]; then
        LogMsg "Kdump is active after reboot"
        echo "Success: kdump service is active after reboot." >> ~/summary.log
    else
        LogMsg "ERROR: kdump service is not active after reboot!"
        echo "ERROR: kdump service is not active after reboot!" >> ~/summary.log
        UpdateTestState "TestAborted"
        exit 1
    fi
else
    LogMsg "Kdump is active after reboot"
    echo "Success: kdump service is active after reboot." >> ~/summary.log
fi
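One possible way to avoid the substring matches reported above is to test the status text more strictly. The sketch below is a suggestion, not the script's current code; the function name kdump_status_is_active is hypothetical, and the status strings are the ones quoted in this issue.

```shell
#!/bin/bash
# kdump_status_is_active STATUS_TEXT
# Returns 0 only when STATUS_TEXT reports a running kdump service.
kdump_status_is_active() {
    local status=$1
    # systemd (RHEL 7.x): require "Active: active" so that
    # "Active: inactive" no longer matches.
    if printf '%s\n' "$status" | grep -Eq 'Active:[[:space:]]+active([[:space:]]|$)'; then
        return 0
    fi
    # SysV (RHEL 6.x): accept "operational" but reject "not operational".
    if printf '%s\n' "$status" | grep -q 'operational' &&
       ! printf '%s\n' "$status" | grep -q 'not operational'; then
        return 0
    fi
    return 1
}
```

It could be called as `kdump_status_is_active "$(systemctl status kdump.service 2>&1)"`. On systemd hosts, `systemctl is-active --quiet kdump.service` is a simpler alternative that needs no text parsing.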
issue code:
========> In cases.xml, utils.sh is pushed to the VM. After sourcing it, we can use $DISTRO to check for Red5., Red6., Red7.* and Ubuntu13*, Ubuntu14*; once we have the value, we can use redhat* and ubuntu* to match.
========> In cases.xml, utils.sh is pushed to the VM. After sourcing it, we can use LogMsg directly.
--
LogMsg()
{
    # To add the time-stamp to the log file
    echo `date "+%a %b %d %T %Y"` ": ${1}"
}

UpdateTestState()
{
    echo $1 >> ~/state.txt
}

#######################################################################
#
# LinuxRelease()
#
#######################################################################
LinuxRelease()
{
    DISTRO=`grep -ihs "buntu\|Suse\|Fedora\|Debian\|CentOS\|Red Hat Enterprise Linux" /etc/{issue,*release,*version}`

    case $DISTRO in
        *buntu*)
            echo "UBUNTU";;
        Fedora*)
            echo "FEDORA";;
        CentOS*)
            echo "CENTOS";;
        *SUSE*)
            echo "SLES";;
        RedHat*)
            echo "RHEL";;
        Debian*)
            echo "DEBIAN";;
    esac
}
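If the script relied on the $DISTRO value set by utils.sh instead of its own LinuxRelease(), the matching could be reduced to a few glob patterns, as the notes above suggest. In this sketch the function name distro_family is hypothetical, and the value format ("redhat_7", "ubuntu_x", and so on) is an assumption about what utils.sh provides.

```shell
#!/bin/bash
# Hypothetical helper: map a $DISTRO value to the family names that
# LinuxRelease() used to print. Unknown values return 1.
distro_family() {
    case "$1" in
        redhat*) echo "RHEL" ;;
        centos*) echo "CENTOS" ;;
        fedora*) echo "FEDORA" ;;
        ubuntu*) echo "UBUNTU" ;;
        debian*) echo "DEBIAN" ;;
        suse*)   echo "SLES" ;;
        *)       echo "UNKNOWN"; return 1 ;;
    esac
}
```

After `. utils.sh`, a call such as `family=$(distro_family "$DISTRO")` would replace the local grep over /etc/{issue,*release,*version}.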
Feature description:
You can use the python script lsvmbus in /usr/sbin to get information about devices on the Hyper-V virtual machine bus (VMBus), similar to information commands like lspci.
Automation must validate the functionality of this tool on all supported distributions.
Hi @ilenghel
We have been running the hv-sock-basic case, but it failed on our hosts. I tried to run server_on_host.exe on our hosts, and it did not work. I did a little digging and found that the binaries might have been built using the DLL runtime library. Most of our hosts are installed with Hyper-V Server Core without extra runtime packages, so they do not have the dependencies the binaries need. Would you help check this? If it is true, it would be very helpful if you could recompile the binaries with the C++ (MT) option, as in Recommendation 3 of the README file.
Thank you in advance!
VM Memory feature is now called Runtime Memory Resize in WS 2016.
@adriansuhov please update the tests to reflect the new feature name.
InstallPutty() checks for $filesum.Hash -ne $sums[ "x86/${util}" ]
, but the actual path in sha256sums.txt
is w32, at least for me.
Hi Chris,
Would you consider adding a rerun of failed test cases to the lis-test framework? Even though we continually debug failed test cases, we still cannot make sure that all test cases pass 100% in Jenkins jobs.
Locally we run 110+ cases (a subset of the current lisa-test) on WS2016, 2012R2, 2012, Gen1, Gen2, x86_64 and i386 (for RHEL 6 only) with the same xml for every internal formal build. For the WS2012R2 and 2012 hosts, we need to log in to the host to rerun the failed cases, change the case xml file, and, after they pass, update the test results that were already uploaded to the test case result management system, Polarion. Rerunning and re-updating the test results generally takes time.
After a rerun the cases pass, e.g. the timesync related test cases. Although we can check the failure logs and make some enhancements, if we already know which kinds of cases pass after being rerun twice, it would be great and could save a lot of effort.
We are planning to rerun the failed cases locally, e.g. parse cases.xml, select the failed cases, rerun them, and merge the test results with a local script, but this work cannot be upstreamed and cannot benefit others. It would be great if the test framework could support this feature.
Any suggestions for this feature request?
Thank you so much.
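To make the request more concrete, here is a rough bash sketch of the select-and-rerun step. Everything in it is an assumption for illustration: the results file is taken to hold one "<TestName> <Status>" pair per line, which is not the framework's actual format, and failed_cases, rerun_failed and the runner callback are hypothetical names.

```shell
#!/bin/bash
# failed_cases RESULTS_FILE
# Print the names of all cases whose status is not "Success".
failed_cases() {
    awk '$2 != "Success" { print $1 }' "$1"
}

# rerun_failed RESULTS_FILE MAX_ATTEMPTS RUNNER
# Call RUNNER <name> for every failed case, up to MAX_ATTEMPTS times each.
# Returns 0 only if every failed case eventually passes.
rerun_failed() {
    local results=$1 attempts=$2 runner=$3
    local name try still_failing=0
    for name in $(failed_cases "$results"); do
        try=1
        until "$runner" "$name"; do
            if [ "$try" -ge "$attempts" ]; then
                still_failing=1
                break
            fi
            try=$(( try + 1 ))
        done
    done
    return "$still_failing"
}
```

Merging the rerun outcome back into the original results is left out here, since it depends on the real result format.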
Single Root I/O Virtualization (SR-IOV) specification is a standard for a type of PCI device assignment that can share a single device to multiple virtual machines. SR-IOV improves device performance for virtual machines.
SR-IOV support is present in upstream kernel and ported to newer Linux distributions.
The config-global xml section must be dynamically created as part of running lisa with parameters.
Primarily these values must be handled as lisa params - logfileRootDir (already implemented) and imageStoreDir
@alexngmsft for ack
@Jingli1985 We've observed 2 issues in regards to the STOR tests after the most recent changes. Please fix them or revert the changes.
More tests from VHD and VHDx xml files are affected, these are some samples:
For Add-VHDXForResize.ps1:
Info : VM_name currentTest updated to VHDx_SCSI_0_1_Dynamic_Large_64TB
Info : VM_name transitioned from SystemDown to RunSetupScript
Info : VM_name - running single setup script 'SetupScripts\Add-VHDXForResize.ps1'
The variable '$controllerType' cannot be retrieved because it has not been set.
At D:\lisa\SetupScripts\Add-VHDXForResize.ps1:289 char:10
+ if ( $controllerType -eq "SCSI" )
+ ~~~~~~~~~~~~~~~
+ CategoryInfo : InvalidOperation: (controllerType:String) [], RuntimeException
+ FullyQualifiedErrorId : VariableIsUndefined
The variable '$controllerType' cannot be retrieved because it has not been set.
At D:\lisa\SetupScripts\Add-VHDXForResize.ps1:313 char:64
+ ... $defaultSize $vhdPath $controllerType
+ ~~~~~~~~~~~~~~~
+ CategoryInfo : InvalidOperation: (controllerType:String) [], RuntimeException
+ FullyQualifiedErrorId : VariableIsUndefined
The variable '$sts' cannot be retrieved because it has not been set.
At D:\lisa\SetupScripts\Add-VHDXForResize.ps1:315 char:18
+ if (-not $sts[$sts.Length-1])
+ ~~~~
+ CategoryInfo : InvalidOperation: (sts:String) [], RuntimeException
+ FullyQualifiedErrorId : VariableIsUndefined
For RemoveVhdxHardDisk.ps1:
Info : VM_name running cleanup script setupScripts\RemoveVhdxHardDisk.ps1 for test VHDx_4k_HotADD_Multi_Dynamic_SCSI
The variable '$SCSICount' cannot be retrieved because it has not been set.
At D:\lisa\setupScripts\RemoveVhdxHardDisk.ps1:313 char:28
+ "SCSI" { $SCSICount = $SCSICount +1 }
+ ~~~~~~~~~~
+ CategoryInfo : InvalidOperation: (SCSICount:String) [], RuntimeException
+ FullyQualifiedErrorId : VariableIsUndefined
The variable '$SCSICount' cannot be retrieved because it has not been set.
At D:\lisa\setupScripts\RemoveVhdxHardDisk.ps1:313 char:28
+ "SCSI" { $SCSICount = $SCSICount +1 }
+ ~~~~~~~~~~
+ CategoryInfo : InvalidOperation: (SCSICount:String) [], RuntimeException
+ FullyQualifiedErrorId : VariableIsUndefined
The CreateVMs.ps1 script does its logging with Write-Host and Write-Warning. If LISA is not run interactively, these log messages are lost. Use Write-Output to capture the output and include it in the ica.log file.
Improvement: if the IQN param is specified in the XML, do not use iscsi discover but connect automatically to the target.
Right now the script connects to the first IQN it discovers. On some environments, this may contain important data, which would be deleted by the script.
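A minimal bash sketch of the proposed selection logic: prefer the IQN from the XML when one is given, and fall back to the first discovered target only otherwise. The function name choose_iscsi_target is hypothetical, and the discovery text is assumed to be in the "<ip>:<port>,<tpgt> <iqn>" form printed by iscsiadm sendtargets discovery.

```shell
#!/bin/bash
# choose_iscsi_target REQUESTED_IQN DISCOVERY_OUTPUT
# Print the IQN to log in to. An explicitly requested IQN always wins;
# otherwise the first IQN in the discovery output is used.
choose_iscsi_target() {
    local requested=$1 discovery=$2
    if [ -n "$requested" ]; then
        echo "$requested"
        return 0
    fi
    # Discovery lines look like "<ip>:<port>,<tpgt> <iqn>".
    printf '%s\n' "$discovery" | awk 'NF >= 2 { print $2; exit }'
}
```

With a requested IQN the caller can skip discovery altogether and log in directly, for example with `iscsiadm -m node -T "$iqn" -p "$TARGET_IP" --login`, where TARGET_IP is a placeholder for the test parameter holding the portal address.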
There are three issues when running the STOR_VSS_BackupRestore related test cases; please refer to the comments below (mainly tested on a 2012R2 host).
However, the tests pass on a 2016 host. To fix the issue on 2012R2, either reduce the stress to ./iozone -ag 4G (removing the loop) for both Hyper-V 2012R2 and 2016, or lighten the stress only for the 2012R2 host.
$EventLog = Get-WinEvent -ProviderName Microsoft-Windows-Hyper-V-VMMS: it looks like this command has changed. We get an error that there is no provider named Microsoft-Windows-Hyper-V-VMMS, and when using Get-WinEvent alone we cannot find warning ID 10107 or 10150.
Also, after fsfreeze -f $MountName, the backup can still complete successfully. STOR_VSS_BackupRestore_Fail first checks $sts = startBackup $vmName $driveletter and expects the backup to succeed, while the comment says "Backu should fail". The test case step reads "3. Change host side configuration to insert failure in backup. Or disable VSS daemon in guest.", which differs a little from the test case description. Note that in RHEL 7.4, if the VSS daemon is disabled, an offline backup can still be done.
I am filing the issues here to request help since I am not sure of the exact fix. Do these tests pass on your side?
Could you please help take a look at these three issues? Thank you so much.
Hi Chris,
It looks like https://github.com/LIS/lis-test/blob/master/WS2012R2/lisa/setupscripts/FC_MultipathDetection.ps1 #301 gets the Fibre Channel disks on the host, so it depends on the host FC setup.
The test script FC_multipath_detect.sh, at line #174, compares this number with fcDiskCount in the VM, which seems unreasonable?
$fcDisks = Get-Disk | Where-Object -FilterScript {$_.BusType -Eq "Fibre Channel"}
$fcCount = $fcDisks.Length
Could you please help check whether this line is the correct step to get the expected multipath disk number? I'm not sure how to change it, so I created this issue.
Thank you so much.
Scope is to have a lisa parameter that will collect a series of Hyper-V logs based on the below guide:
We have been experiencing some instability when running LISA. The framework exits with the exception "Provider load failure". This exception is raised in the state engine, which makes the framework exit.
We made attempts to reproduce this issue. It is reproducible, but the reproduction rate is low (less than 1% in my tests). We mostly observe this exception after pause/save cases, so we assume the pause/save operations are related (although we are not really sure).
Since LISA has to do VM operations quite frequently (start-vm, get-vm etc.), this issue has been a headache for us, and we are trying to address it somehow (or at least prevent the framework from exiting).
We still don't have a clear idea about how to address this issue, so we are opening this issue to discuss it. Thank you in advance!
An excerpt from the log:
11:10:13 Info : DoStartSystem( RHEL-6.10-20180301.2-x86_64-GEN2-A )
11:10:13 Info : RHEL-6.10-20180301.2-x86_64-GEN2-A is being started
11:10:14 Info : RHEL-6.10-20180301.2-x86_64-GEN2-A transitioned from StartSystem to SystemStarting
11:10:14 Info : Entering DoSystemStarting( RHEL-6.10-20180301.2-x86_64-GEN2-A )
11:10:14 Debug: vm.ipv4 = 10.73.74.55
11:10:15 Debug: vm.ipv4 = 10.73.74.55 and ipv4 =
11:10:19 Info : Entering DoSystemStarting( RHEL-6.10-20180301.2-x86_64-GEN2-A )
11:10:19 Debug: vm.ipv4 = 10.73.74.55
11:10:19 Debug: vm.ipv4 = 10.73.74.55 and ipv4 =
11:10:22 Info : Entering DoSystemStarting( RHEL-6.10-20180301.2-x86_64-GEN2-A )
11:10:22 Provider load failure
11:10:22 + CategoryInfo : NotSpecified: (:) [Get-VM], VirtualizationExcept
11:10:22 ion
11:10:22 + FullyQualifiedErrorId : Unspecified,Microsoft.HyperV.PowerShell.Commands
11:10:22 .GetVM
11:10:22 + PSComputerName : 2016-Auto
11:10:22
11:10:22 Error: RHEL-6.10-20180301.2-x86_64-GEN2-A SystemStarting entered state without being in a HyperV Running state - disabling VM
11:10:22 Info : RHEL-6.10-20180301.2-x86_64-GEN2-A transitioned from SystemStarting to ForceShutDown
11:10:22 Debug: vm.ipv4 = 10.73.74.55
11:10:22 Provider load failure
11:10:22 + CategoryInfo : InvalidOperation: (:) [Get-WmiObject], Managemen
11:10:22 tException
11:10:22 + FullyQualifiedErrorId : GetWMIManagementException,Microsoft.PowerShell.C
11:10:22 ommands.GetWmiObjectCommand
11:10:22 + PSComputerName : 2016-Auto
11:10:22
11:10:22 Debug: vm.ipv4 = 10.73.74.55 and ipv4 =
11:10:26 Info : DoForceShutdown(RHEL-6.10-20180301.2-x86_64-GEN2-A)
11:10:26 Info : GetTestData(done)
11:10:27 Info : SetRunningTime(done)
11:10:27 Info : RHEL-6.10-20180301.2-x86_64-GEN2-A currentTest lasts 0 Hours, 0 Minutes, 15 seconds.
11:10:27 Info : GetCurentSuite(Functional)
11:10:27 Info : RHEL-6.10-20180301.2-x86_64-GEN2-A transitioned from ForceShutDown to SystemDown
11:10:27 Info : Entering DoSystemDown( RHEL-6.10-20180301.2-x86_64-GEN2-A )
11:10:27 Info : RHEL-6.10-20180301.2-x86_64-GEN2-A currentTest updated to done
11:10:27 Info : RHEL-6.10-20180301.2-x86_64-GEN2-A transitioned from SystemDown to Finished
11:10:27 Info : SaveResultToXML to (Functional,TestResults\cases-20180314-183507)
11:10:27 Info : GetCurentSuite(Functional)
11:10:27 Info : DoStateMachine() exiting
Reproduce script:
$count = 0
while ($True) {
    $a = Save-VM -Name xxx
    if ($? -ne $True) {break}
    $a = Start-VM -Name xxx
    if ($? -ne $True) {break}
    $a = Get-VM -Name xxx
    if ($? -ne $True) {break}
    $a = Get-VMIntegrationService -VMName xxx
    if ($? -ne $True) {break}
    $count += 1
    $count
}
Error message:
Start-VM : Provider load failure
At line:4 char:1
+ Start-VM -Name xxx
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+ CategoryInfo : NotSpecified: (:) [Start-VM], VirtualizationException
+ FullyQualifiedErrorId : Unspecified,Microsoft.HyperV.PowerShell.Commands.StartVM
Kdump.ps1
Network over IPv6 automation currently still relies on a static MAC address.
This must follow the same format as the regular Network tests; we don't need to depend on or define a static MAC address.
We must verify if we run on a supported configuration - both guest OS side and Windows Server edition support for a given test area.
Scope is to define a test case that will write a large number of KVP records in the pool 1 file, then read them.
This could result in a crash of the kvp daemon. Issue is to be fixed in upstream.
Network test cases MaxSyntheticNIC, MaxLegacyNIC, MaxNIC are not stable, get the error log "Error: Unable to set default gateway - 10.xx.xx.254".
The STOR_VSS_BackupRestore PS scripts contain mostly duplicate code, differing only in the test-specific parts.
As much of it as possible must be converted into functions in the main STOR_VSS_BackupRestore script.
For test cases where the testScript is PowerShell, the uploadFiles-like test parameter is ignored.
Kdump_Results.sh:
Code:
case $DISTRO in
if [ $vm2ipv4 != "" ]; then
If $vm2ipv4 is not set, the line above becomes if [ != "" ]; then, which is a syntax error.
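One common way to make this check safe is to quote the expansion and test for a non-empty value. The sketch below wraps it in a hypothetical helper named has_vm2_ip; only the variable name vm2ipv4 comes from the script.

```shell
#!/bin/bash
# Returns 0 when vm2ipv4 is set to a non-empty value.
# Quoting the expansion keeps the test valid even if the variable is unset.
has_vm2_ip() {
    [ -n "${vm2ipv4:-}" ]
}
```

In the script itself the same effect is achieved with `if [ -n "$vm2ipv4" ]; then` or `if [ "$vm2ipv4" != "" ]; then`.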
We need to implement a method that will configure the COM2 port for new VMs and capture the VM serial log during the entire duration of testing.