I have RHEL 6.9 running in a virtual machine. After I ran the "yum update" command, I can no longer connect to the VM.
The virtual BIOS shows that it did not find the disk. The log is attached.
Can you help me?
Can you please upload the VMX file and a screenshot of the VM folder?
ranchuab I updated the first post with the information requested.
Thanks for the help.
Did you take a snapshot before the yum update? If so, you might want to revert.
daphnissov I did not take the snapshot before.
If you didn't, then it's a guest-related issue at this point and not anything to do with ESXi. The VM is powered on; it just won't boot.
I updated the first post with more screenshots.
Yeah, there's nothing wrong there. If your VM was working fine one minute and then you performed a yum update and now it won't boot, it's a problem with the packages/kernel that got upgraded, not with the VM's configuration.
Have you made any changes to the VM configuration? As far as I can see in the vmware.log, did you remove the disk and re-add it?
See the timestamps in the vmware.log:
2018-01-26T11:32:48.959Z| vmx| I120: DICT | scsi0:0.fileName = VM-Linux-yspp0097-LDAP.vmdk |
2018-01-26T11:32:48.960Z| vmx| I120: DICT | scsi1:0.fileName = /vmfs/volumes/52673d7e-c347a1c2-1b79-2c59e53cc874/VM-Linux-yspp0097-LDAP-PROD/VM-Linux-yspp0097-LDAP-PROD.vmdk |
Changes were made here:
2018-01-26T11:42:40.491Z| vmx| I120: DICT | scsi0:0.fileName = VM-Linux-yspp0097-LDAP-PROD_2.vmdk |
2018-01-26T11:42:40.491Z| vmx| I120: DICT | scsi1:0.fileName = VM-Linux-yspp0097-LDAP-PROD_1.vmdk |
Current VMX file:
scsi1:0.fileName = "VM-Linux-yspp0097-LDAP-PROD_1.vmdk"
scsi0:0.fileName = "VM-Linux-yspp0097-LDAP-PROD_2.vmdk"
VM-Linux-yspp0097-LDAP.vmdk is missing here.
ranchuab I did not make any changes to the VM before the yum update. After the system update, since the machine would no longer boot, I did a vMotion to another datastore and host.
daphnissov It is not an operating-system problem. I have already opened a support ticket with Red Hat, and they confirmed that the VM does not load the disk, so I cannot enter rescue mode to try anything.
There is nothing wrong with the virtual BIOS settings for the disk. The VM is using SCSI disks, and SCSI disks are not visible in the Primary/Slave disk options in the BIOS (even on physical PCs/servers).
From the log, it looks like the VM is powering up and the guest OS is somehow hung.
If the yum update included patches for Spectre/Meltdown, try changing the VM hardware compatibility from version 8 to version 10. Version 10 is the maximum that ESXi 5.5 can support.
https://kb.vmware.com/s/article/2007240
More specifically, if the yum update included a Meltdown patch, it might be looking for the PCID feature, which is not exposed to the VM at version 8 compatibility. The Sandy Bridge CPU that you have supports PCID but does not have the INVPCID instruction. There is nothing we can do about that if the Meltdown patch requires the INVPCID instruction, unless you have a server with a Haswell CPU and no EVC masking.
bluefirestorm So, does that mean I have lost my virtual machine?
I've already changed the VM hardware version from 8 to 10, but it's still the same.
Maybe somehow the VMDKs have become unordered. In your VM configuration, try swapping the SCSI ids for the _1 and _2 VMDK files so that scsi0:0 is assigned to _1 and scsi1:0 is assigned to _2.
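With the file names taken from the log above, the swapped vmx lines would look something like this (a sketch only; power the VM off before editing the vmx):

```
scsi0:0.fileName = "VM-Linux-yspp0097-LDAP-PROD_1.vmdk"
scsi1:0.fileName = "VM-Linux-yspp0097-LDAP-PROD_2.vmdk"
```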
daphnissov I made the change, but it did not work.
Any more tips before I tell the company that I lost a machine from the production environment?
Can you show the VMX file at this point? Or just take a screenshot under edit settings with the hard drives expanded.
daphnissov Is that what you wanted to see?
Yes, OK. Just to reduce complexity here, remove the ISO from your CD-ROM's configuration. It's a VMware Tools ISO that's still mounted and shouldn't be.
I want to understand the course of events here. Exactly what steps were taken between when this VM was booting and when it wasn't? Was it only a yum update and nothing more? You didn't reconfigure the VM? You did nothing else? Please be as specific as you can while I look through your log files.
daphnissov The CD-ROM is not mounted or connected.
I just ran the OS update procedure with the yum update command and nothing more; after that procedure, the fault I mentioned occurred.
As the machine was slow to start, I opened its console and it was all black.
What's confusing here is that you seem to only have one VMDK in your VM's configuration, then you power it down and have 2 VMDKs and neither of which, according to the file name, corresponds to the first. So on what date and time did the VM first stop booting correctly?
As late as this time stamp (found in vmware-55.log), you only appear to have one VMDK.
2018-01-26T11:32:48.959Z| vmx| I120: DICT scsi0:0.fileName = VM-Linux-yspp0097-LDAP.vmdk
But thereafter, you now have two (vmware-56.log and vmware.log)
2018-01-26T11:42:40.491Z| vmx| I120: DICT scsi0:0.deviceType = scsi-hardDisk
2018-01-26T11:42:40.491Z| vmx| I120: DICT scsi0:0.fileName = VM-Linux-yspp0097-LDAP-PROD_2.vmdk
2018-01-26T11:42:40.491Z| vmx| I120: DICT sched.scsi0:0.shares = normal
2018-01-26T11:42:40.491Z| vmx| I120: DICT sched.scsi0:0.throughputCap = off
2018-01-26T11:42:40.491Z| vmx| I120: DICT scsi0:0.present = TRUE
2018-01-26T11:42:40.491Z| vmx| I120: DICT ethernet0.virtualDev = e1000
2018-01-26T11:42:40.491Z| vmx| I120: DICT ethernet0.networkName = LAN
2018-01-26T11:42:40.491Z| vmx| I120: DICT ethernet0.addressType = generated
2018-01-26T11:42:40.491Z| vmx| I120: DICT ethernet0.generatedAddress = 00:0c:29:2e:94:b5
2018-01-26T11:42:40.491Z| vmx| I120: DICT ethernet0.present = TRUE
2018-01-26T11:42:40.491Z| vmx| I120: DICT scsi1:0.deviceType = scsi-hardDisk
2018-01-26T11:42:40.491Z| vmx| I120: DICT scsi1:0.fileName = VM-Linux-yspp0097-LDAP-PROD_1.vmdk
2018-01-26T11:42:40.491Z| vmx| I120: DICT sched.scsi1:0.shares = normal
2018-01-26T11:42:40.491Z| vmx| I120: DICT scsi1:0.present = TRUE
So if this VM worked fine up until it was booted at 1/26 11:42, then someone changed the VM's configuration. I want to understand the chain of events leading up to this, because a yum update inside the guest OS has no power to make such a configuration change.
It is a bit strange to see CPUID masks in the vmx configuration file (especially those with .amd).
hostCPUID.0 = "0000000d756e65476c65746e49656e69"
hostCPUID.1 = "000206d70020080017bee3ffbfebfbff"
hostCPUID.80000001 = "0000000000000000000000012c100800"
guestCPUID.0 = "0000000d756e65476c65746e49656e69"
guestCPUID.1 = "000206d200020800969822031fabfbff"
guestCPUID.80000001 = "00000000000000000000000128100800"
userCPUID.0 = "0000000d756e65476c65746e49656e69"
userCPUID.1 = "000206d700200800169822031fabfbff"
userCPUID.80000001 = "00000000000000000000000128100800"
cpuid.80000001.eax.amd = "--------------------------------"
cpuid.80000001.ebx.amd = "--------------------------------"
cpuid.80000001.ecx.amd = "--------------------------------"
cpuid.80000001.edx.amd = "-----------H--------------------"
cpuid.80000001.eax = "--------------------------------"
cpuid.80000001.ebx = "--------------------------------"
cpuid.80000001.ecx = "--------------------------------"
cpuid.80000001.edx = "-----------H--------------------"
The guestCPUID.1 and userCPUID.1 values look like they are masking out some capabilities from the guest OS, including PCID.
The easiest thing would be to just put a # in front of the guestCPUID.1 and userCPUID.1 lines and try to power up. Sorry, I don't have the time and patience to examine bit by bit all the differences in CPU features, but it looks like the PCID capability is masked out.
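In the vmx file that change would look like this, using the hex strings quoted in this thread (make the edit while the VM is powered off):

```
# guestCPUID.1 = "000206d200020800969822031fabfbff"
# userCPUID.1 = "000206d700200800169822031fabfbff"
```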
hostCPUID.1 ecx = 17bee3ff
vs
guestCPUID.1 ecx = 96982203 = 1001:0110:1001:1000:0010:0010:0000:0011
The hex 8 above covers bits 19-16 = 1000 (in hostCPUID.1 the same nibble is hex e = 1110, so bit 17 is set there).
Bit 17 of the guest ecx = 0 means PCID is masked out for the guest.
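As a sanity check, the bit arithmetic above can be verified with a few lines of Python (the ECX values are the ones from the CPUID dump quoted in this thread):

```python
# ECX values of CPUID leaf 1, taken from the vmx dump above
host_ecx = 0x17BEE3FF   # hostCPUID.1 - what the physical CPU reports
guest_ecx = 0x96982203  # guestCPUID.1 - what is exposed to the guest

PCID_BIT = 17  # CPUID.1:ECX bit 17 indicates PCID support


def has_pcid(ecx):
    """Return True if the PCID bit is set in a CPUID.1 ECX value."""
    return bool((ecx >> PCID_BIT) & 1)


print(has_pcid(host_ecx))   # True  - the host CPU supports PCID
print(has_pcid(guest_ecx))  # False - the mask hides PCID from the guest
```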