We were running 6.5u2 and have typically set our Windows VMs to "Check and Upgrade VMware Tools before each power on", so almost 100% of our VMs had version 10287. This has always worked fine: when the servers reboot for Windows updates, they just reboot again automatically for the VMware Tools update during the same maintenance window.
We just upgraded to 6.7u2 and had about 10 Windows Server 2016 servers go offline last night when they were rebooting for Windows updates (and attempting to upgrade Tools to 10341). When I connected to the console on them, they had rebooted for Windows updates, were complaining that they still needed VMware Tools installed, and were missing all of their NICs (VMXNET3). So I had to install VMware Tools, configure the NIC correctly, and reboot them to bring them back online. We have not had any issues with 2008 R2, 2012 R2, or 2019 servers, and those have all updated normally. Has anyone else seen this problem? It almost feels like it is removing VMware Tools during the upgrade and leaving the servers hanging in limbo until they reboot again, but I was more focused on getting the servers back online than on looking for more details about the problem.
I just heard back from VMware Support that they have finally published a KB for this issue.
What I would suggest is:
1. Disable VMware Tools autoupgrade for Windows 2016 VMs in vCenter.
2. Let the Windows update finish all patch installation including patches for September 2019.
3. Enable the VMware Tools autoupgrade for Windows 2016 VMs again in vCenter.
4. Monitor, and repeat steps 1 to 3 if you face the same issue again.
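The steps above can be sketched as a small planning helper. This is only an illustration, not something from VMware Support: the `VmRecord` shape, the `patched` flag, and the function name are hypothetical, and the inventory would have to come from your own tooling. The two policy strings are the values the vSphere API uses for a VM's Tools upgrade policy. The sketch only decides which policy each Windows Server 2016 VM should have; it does not talk to vCenter.

```python
from dataclasses import dataclass

# vSphere API values for a VM's Tools upgrade policy.
MANUAL = "manual"                        # auto-upgrade disabled
AT_POWER_CYCLE = "upgradeAtPowerCycle"   # check and upgrade before each power on


@dataclass
class VmRecord:
    """Hypothetical inventory record; fill it from your own tooling."""
    name: str
    guest_os: str       # e.g. "Microsoft Windows Server 2016 (64-bit)"
    tools_policy: str   # MANUAL or AT_POWER_CYCLE
    patched: bool       # True once Windows Update has finished all patches


def plan_policy_changes(vms):
    """Return (vm_name, new_policy) pairs for Windows Server 2016 VMs.

    Unpatched 2016 VMs get auto-upgrade disabled (step 1); patched ones
    get it enabled again (step 3). Other guests are left alone.
    """
    changes = []
    for vm in vms:
        if "Windows Server 2016" not in vm.guest_os:
            continue
        wanted = AT_POWER_CYCLE if vm.patched else MANUAL
        if vm.tools_policy != wanted:
            changes.append((vm.name, wanted))
    return changes
```

Note that this re-enables auto-upgrade on every patched 2016 VM, including any that were set to manual on purpose, so you may want to track which VMs you disabled in step 1 and only revert those.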
Hi, do you see the same problem if you use the E1000 vAdapter in your 2016 VMs? Did you check that?
Also, check Device Manager in the guest OS when the VMXNET3 adapter is gone: is there an unknown device or not?
I cannot reproduce the problem yet, but we have had 83 servers update their Tools to 10341 automatically, and only the ones on Server 2016 have had issues. My guess is that it only affects VMXNET3; I seem to remember reading that it is more of a software-virtualized NIC that typically performs better than the E1000. For that reason we have very few E1000 VMs (usually only if it is Linux or needs PXE boot, etc.), and with that small a sample size I cannot confirm much else at this point.
Yes, that's right. We also had similar problems with the E1000 in our Win2012R2 servers, but in those cases the vNIC stopped working rather than going missing, so we decided to use VMXNET3. I think removing the vNIC and adding a new one may fix the problem.
As a last recommendation, try migrating the problematic VMs to another host and check their status again.
I tried reproducing it manually with another 2016 server and could not. The only difference between that 2016 server and the ones that had issues this morning was that it had already installed a lot of updates (one was a servicing stack update). So I cannot recreate the problem, but the ones this morning all rebooted for Windows updates and then basically lost their network adapter and were just hung. I did not check Device Manager to see if any adapters were present, but my guess is "no". The reason I say that is that no network adapters showed under "Network Connections", and once I ran the Tools installer the NIC appeared there pretty quickly. Then I assigned it the same IP address it had before, and it did not complain about multiple NICs having the same IP/gateway, etc., which you will normally see when there is a hidden adapter.
If possible, provide us the vmware.log file along with the date and time when the issue occurred.
I can now replicate the problem. If I reboot the VMs normally, they update Tools as needed with no issues. If Windows Update KB4507460 gets installed on the VM and it reboots, Tools does not get installed and the VM is left without a network card. The network card shows in Device Manager, but there is a driver issue until you install Tools manually. Even if I reboot the VM again, it still will not update Tools, even though it is flagged to "Check and Upgrade VMware Tools before each power on". I have snapshots of the lab 2016 server VM so that I can reproduce the issue as needed, but this is a pretty serious problem for anyone who sets their VMs to "Check and Upgrade VMware Tools before each power on".
Once KB4507460 gets installed, it causes networking on the VMs to stop working, and rebooting or even powering off the VM and starting it back up will not get it to install VMware Tools to fix itself, even though it shows "VMware Tools is not installed on this Machine" and "Check and Upgrade VMware Tools before each power on" is checked.
This sounds really crazy, but here is what I am seeing after 5 trials. If you have a VM that is set to "Check and Upgrade VMware Tools before each power on", and Windows Update installs KB4507460 and reboots, the machine will come back up without VMware Tools and without a working network adapter, and you will have an outage. No amount of reboots will get it to install VMware Tools, even though it has no Tools installed and is set to "Check and Upgrade VMware Tools before each power on".
If you have a VM where "Check and Upgrade VMware Tools" is not checked, and you install KB4507460 and it reboots, then everything continues to work fine and you have no issues. You can then install VMware Tools later and everything works fine.
All the VMs I did this on had VMXNET3 adapters, and I am not sure whether it impacts all NICs or just VMs with the VMXNET3 network adapter.
This was an issue in 2008 R2 with a certain KB due to MS patching drivers. I think it applied to E1000E but I cannot recall. Either way, you should really be using VMXNET3 - I haven't seen this issue with that adapter.
Did you try installing VMware Tools manually in the guest OS, by pointing to its default path (/vmimages) as an ISO file?
I have tested this several times to confirm. If I install the Tools manually, or if they install automatically at any other time besides when KB4507460 is getting installed, everything works 100%. If I leave my VMs set to "Check and Upgrade VMware Tools before each power on" and the VM happens to install KB4507460 and reboot, I will have an outage.
We have never had an issue with having our VMs set like that, but I am now concerned there might be other updates similar to KB4507460 that can also cause an outage when combined with the VM option "Check and Upgrade VMware Tools before each power on".
We have now disabled that option on all of our VMs and opened a case with VMware.
Have you had any update on your case? We are also facing the same issue and are currently awaiting a response from VMware Support.
In the meantime, I have disabled auto-update on all Windows Server 2016 VMs.
I will. Case is still open, but has been stalled for a week waiting to hear back. I can replicate the issue on demand and if that option is unchecked, no issues. If that option is checked, then we will have an outage on the server when those conditions occur.
Watching with interest. We are due to upgrade 400 VMs, having gone from 6.5 to 6.7u2, so we will wait to see what happens here and would appreciate any feedback.
Are you aware that the VMXNET3 driver is also included in Windows OS updates now? VMware started doing this recently in an attempt to avoid reboots. Are your Windows updates including driver updates on your estate? Just wondering, in case there is some collision between the OS updates and the VMware Tools update for the NIC.
Do you mean you are seeing this when Windows Update installs KB4507460 and your VMware Tools get upgraded to the version that comes with 6.0u3?
Or did it occur with the same updates/versions, when upgrading tools from 6.0u3 version?
No, same issue as reported: even if VMware Tools is up to date, if the VM is set to check at reboot and you install the KB mentioned in this thread, Tools gets uninstalled and the network is gone.
Here is what the logs show when it goes to update. I have not heard what the cause is yet.
2019-08-05T20:32:53.657Z| vcpu-0| I125: Guest: Installing VMware Tools 10.3.5.7752 (build-10430147)
2019-08-05T20:32:55.820Z| svga| I125: MKSScreenShotMgr: Taking a screenshot
2019-08-05T20:33:02.657Z| vmx| I125: TOOLS autoupgrade protocol version 0
2019-08-05T20:33:02.661Z| vmx| I125: GuestRpc: Got error for channel 0 connection 7: Remote disconnected
2019-08-05T20:33:02.661Z| vmx| I125: GuestRpc: Closing channel 0 connection 7
2019-08-05T20:33:02.661Z| vcpu-1| I125: GuestRpc: Reinitializing Channel 0(toolbox)
2019-08-05T20:33:05.719Z| svga| I125: MKSScreenShotMgr: Taking a screenshot
2019-08-05T20:33:10.582Z| vcpu-0| I125: TOOLS Received tools.set.version rpc call, version = TOOLS_VERSION_NONE (uninstalled), type is
2019-08-05T20:33:10.582Z| vcpu-0| I125: TOOLS Setting toolsVersionStatus = TOOLS_STATUS_NO_TOOLS
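To check other VMs for the same sequence, a vmware.log file can be scanned for an "Installing VMware Tools" line that is later followed by `TOOLS_STATUS_NO_TOOLS`. The following is a minimal sketch that assumes the log format shown in the excerpt above; the function name is hypothetical, and the pattern is only a heuristic taken from this one excerpt, not a confirmed signature of the bug (a later successful install would not be detected here).

```python
import re

# Timestamp is everything before the first "|" on the line.
INSTALL_RE = re.compile(r"^(\S+?)\|.*Guest: Installing VMware Tools (\S+)")
NO_TOOLS_RE = re.compile(r"^(\S+?)\|.*TOOLS_STATUS_NO_TOOLS")


def find_failed_tools_upgrades(lines):
    """Return (install_time, tools_version, no_tools_time) tuples.

    A tuple is emitted each time a Tools install is announced and the
    log later reports that no Tools are installed.
    """
    failures = []
    pending = None  # (timestamp, version) of the last announced install
    for line in lines:
        match = INSTALL_RE.match(line)
        if match:
            pending = (match.group(1), match.group(2))
            continue
        match = NO_TOOLS_RE.match(line)
        if match and pending is not None:
            failures.append((pending[0], pending[1], match.group(1)))
            pending = None
    return failures


if __name__ == "__main__":
    import sys

    # Usage: python scan_tools_log.py /path/to/vmware.log
    with open(sys.argv[1], encoding="utf-8", errors="replace") as handle:
        for start, version, end in find_failed_tools_upgrades(handle):
            print(f"Tools {version}: install at {start}, NO_TOOLS at {end}")
```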