We have only two ESXi hosts, and they are almost identical: a Dell PowerEdge T630 and a PowerEdge R730. On Sunday I updated both from ESXi 6.7 to 6.7U3. I first updated the vCenter appliance to 6.7U3, then remediated each host using the Dell ESXi image VMware-VMvisor-Installer-6.7.0.update03-14320388.x86_64-DellEMC_Customized-A00.iso.
After the update, the T630 seems to be working almost normally (with just a few delays) but the R730 host has much bigger communication delays. Thankfully the VMs appear to be running and working properly. Our issues show up with managing the hosts and backups. The symptoms we've seen so far are:
I don't know where to start looking in the logs. If someone has a suggestion, I will post logs here.
I've attached vmkernel.log from our R730 host.
I see an entry every 10 seconds:
lsi_mr3: megasas_hotplug_work:495: lsi_mr3: Event : Enclosure PD 00(c 00/p0) is unstable
That would be our external enclosure. I will look into disconnecting it to see if that changes anything. First I have to figure out somewhere else to send backups as that is where they are currently stored.
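To check whether the message really repeats every 10 seconds, the gaps between matching entries can be measured with a short script. This is only a rough sketch: it assumes each vmkernel.log line starts with an ISO-style UTC timestamp ending in Z, and the function name event_intervals and the search string are my own choices for illustration.

```python
import re
from datetime import datetime

# Assumption: each log line begins with a timestamp such as
# 2019-01-01T00:00:00.000Z followed by whitespace and the message.
LINE_RE = re.compile(r"^(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2})(?:\.\d+)?Z\s+(.*)$")


def event_intervals(lines, needle="is unstable"):
    """Return the seconds between consecutive log lines containing `needle`.

    Fractional seconds in the timestamp are ignored, so results are whole
    seconds. Lines that do not match the assumed format are skipped.
    """
    stamps = []
    for line in lines:
        match = LINE_RE.match(line)
        if match and needle in match.group(2):
            stamps.append(datetime.strptime(match.group(1), "%Y-%m-%dT%H:%M:%S"))
    # Pair each timestamp with the next one and take the difference.
    return [(later - earlier).total_seconds()
            for earlier, later in zip(stamps, stamps[1:])]
```

Calling event_intervals(open("vmkernel.log", errors="replace")) on a copy of the log returns the list of gaps. A list of values near 10 would confirm the 10-second pattern, and running it against an older rotated log, if one is still around, would show whether the messages predate the upgrade.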
I had not looked at vmkernel.log before this upgrade, so I can't say for sure whether these messages appeared before. I assume they did, because ever since installing this enclosure I have noticed that the iDRAC reports:
Communication with Enclosure 0 on Connector 0 of RAID Controller in Slot 7 is intermittent.
about every 10 minutes or so. It is still reporting this; nothing has changed. The disks have always worked without issues, though. This is an old Dell MD1200 connected through a Perc H30 adapter, which is older hardware that is no longer supported on newer Dell servers, but it works and didn't seem to cause any issues with ESXi 6.7.
Even if that is an issue, it doesn't explain why we see some of the symptoms on our T630, which has no unsupported hardware. I've attached its vmkernel.log as T630-vmkerne.log
The issue is resolved. The disk enclosure error messages in the vmkernel logs were a distraction; they were unrelated. The issue was that I was running Dell OpenManage on the hosts. The instant I uninstalled OpenManage, everything cleared up. This is a known VMware issue: VMware Knowledge Base