VMware Cloud Community
alainrussell
Enthusiast

6.0 U1b - Hosts cannot communicate

Since upgrading to 6.0 U1b this weekend, I'm seeing the following warning on some of my hosts (in both the DR and Prod environments). The warning does not go away until a host is put into maintenance mode or restarted, at which point the other 3 hosts of the 4-host cluster show the message instead. Health/Network checks all show as normal and the cluster is functioning normally, so I'm guessing it's a warning that is not clearing when a host restarts or comes out of maintenance mode?


Accepted Solutions
CHogan
VMware Employee

Folks, multiple customers are hitting this issue after upgrading to 6.0U1b, i.e. messages such as "Host cannot communicate with all other nodes in the VSAN enabled cluster" appearing.

Please file Service Requests with GSS (Support). Quote internal bug reference 1587500.

Thanks

Cormac

http://cormachogan.com

14 Replies
boomboom21
Contributor

I'm seeing the same issue. Everything appeared to work fine until I applied 6.0 U1b.

zyco123
Contributor

Had the same problem yesterday, after I updated my ESXi hosts. Only the master doesn't show the notification; the agent and backup do, as long as they are not in maintenance mode.

I came across different solutions:

Enable IGMP snooping and querier (it has been off since the beginning, with no issues so far).

Upgrade to the latest VCSA version (which I also did, without success).

Finally I just shut down all VMs, disabled vSAN and enabled it again... problem solved.

illvilja
Hot Shot

Hi,

Noticed the same today. After upgrading from 6.0b to 6.0 U1b, the warning message appeared. Will probably file an SR.

srodenburg
Expert

"Finally I just shut down all VMs, disabled vSAN and enabled it again ... problem solved."

Sorry but that is NOT a realistic option in a production environment.

I'm postponing the upgrade until VMware fixes this.

dpnvektor
Contributor

Hosts on 6.0 U1b (3380124), vCenter on 6.0 U1b.

Same issue here, also accompanied by various VSAN-related vSphere alerts that seem to have no relation to the actual health of the cluster and are directly contradicted by the output of the health check plugin.

We are also seeing an issue where the VI Client becomes intermittently unresponsive after the vCenter Server has been up for a few hours, while other management tools (PowerCLI) seem unaffected. This may be an unrelated issue.

AlexanderLiucka
Enthusiast

Add a second VSAN VMkernel adapter on all hosts in the VSAN cluster.

Edit the first VMkernel adapter and remove VSAN traffic from it. Then edit the first VMkernel adapter again and re-enable VSAN traffic. In some cases you may have to do the same with the second VSAN VMkernel adapter.

And yes, we all see this problem after 6.0 U1b.

Paul_Sheard
Enthusiast

Yes, I see this error: "Host cannot communicate with all other nodes in the VSAN enabled cluster"

It appeared after applying this patch: VMware KB: VMware ESXi 6.0, Patch Release ESXi600-201601001

If I reboot one of the 6 hosts, the error goes away on that host. If I then reboot another host, the host that was previously OK gets the error and the one I just rebooted loses it.

The VSAN health check checks out OK, all green. I can ping all VSAN hosts, and I can see all VSAN disk space fine.
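
For anyone who wants to double-check from the host side that the warning really is false, here is a minimal sketch of the checks, meant to be run in the ESXi shell. The interface name (vmk1) and the peer address are placeholders for your own values, and the ESXCLI/VMKPING override variables are only an illustration device so the function can be dry-run somewhere other than a host.

```shell
#!/bin/sh
# Sketch only: confirm VSAN membership and network reachability from one host.
# ESXCLI and VMKPING default to the real tools; overriding them is an
# assumption added here to allow a dry run off-host.
ESXCLI="${ESXCLI:-esxcli}"
VMKPING="${VMKPING:-vmkping}"

check_vsan_host() {
    vmk="$1"      # VSAN-tagged VMkernel interface, e.g. vmk1 (placeholder)
    peer_ip="$2"  # VSAN IP of another host in the cluster (placeholder)

    "$ESXCLI" vsan cluster get        # cluster membership as this host sees it
    "$ESXCLI" vsan network list       # which VMkernel interfaces carry VSAN traffic
    "$VMKPING" -I "$vmk" "$peer_ip"   # ping the peer over the VSAN interface
}
```

Calling something like `check_vsan_host vmk1 192.0.2.11` on each host should show whether every host sees the full member list and can reach its peers, which is what you would expect if the warning is only cosmetic.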

Paul Sheard VMware Consultant (Contract) VCP6 DCV NV CMA DTM
mythumbsclick
Enthusiast

Having exactly the same issue as you all. In Veeam ONE monitoring I notice the following event on the 2 hosts that have the error. (The third is fine but, as mentioned, if I reboot another host, the error clears on that one and the originally healthy host now shows it.) Anyway, this is the Veeam ONE warning:

Description: Fired by event: esx.problem.visorfs.ramdisk.full

Event description: The ramdisk 'vsantraces' is full. As a result, the file /vsantraces/vsantraces--2016-01-18T14h46m15s380.gz could not be written.

Initiated by: Not Set

Knowledge: One of the host's ramdisks reached the limit for the number of files it can contain

Cause: The file table of the ramdisk 'tmp' is full

Resolution: If the files were created by administrative action (for example, accidentally copying files to an incorrect path), then remove the files. If the normal operation of the system appears to cause the error, then contact VMware support

Comparing all 3 hosts, the ramdisks on the 2 hosts in error are 75% full, whereas the non-error host is at 58%. (Thought it was worth mentioning.)
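
In case it helps others compare their hosts the same way, a minimal sketch for listing ramdisk usage from the ESXi shell. The ESXCLI override variable is an assumption added only so the function can be dry-run off-host; on a real host it just calls esxcli.

```shell
#!/bin/sh
# Sketch only: list ramdisk usage on an ESXi host so hosts can be compared.
# ESXCLI defaults to the real tool; the override is for dry runs only.
ESXCLI="${ESXCLI:-esxcli}"

show_ramdisks() {
    # Prints one row per ramdisk, including its space usage.
    "$ESXCLI" system visorfs ramdisk list
}
```

Run `show_ramdisks` on each host and look at the vsantraces row to see how the hosts with the warning compare to the one without.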

Just got off phone with support and they say it’s purely cosmetic and unlikely to be addressed for a couple of months.

Paul_Sheard
Enthusiast

"Just got off phone with support and they say it’s purely cosmetic and unlikely to be addressed for a couple of months."



Joking, right???

Paul Sheard VMware Consultant (Contract) VCP6 DCV NV CMA DTM
dpnvektor
Contributor

Quick update on our post-upgrade issues. We still have the VSAN error referenced in this post (very glad it's cosmetic), but we resolved the client unresponsiveness by regenerating all of our self-signed certs. More details will be documented here shortly: http://www.sym.bio/vsphere-update-issues-regen-certs/

jjgunn
Enthusiast

/etc/init.d/vpxa restart


Restarting vpxa this way worked very well for this cosmetic error on my VxRail 3.0.
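
If you need to apply the restart to several hosts, a minimal sketch of a loop over SSH follows. It assumes SSH is enabled on each host and that you log in as root; the host names in the usage note are placeholders, and the SSH override variable is only there so the loop can be dry-run.

```shell
#!/bin/sh
# Sketch only: restart the vCenter agent (vpxa) on several hosts over SSH.
# SSH defaults to the real client; the override is for dry runs only.
SSH="${SSH:-ssh}"

restart_vpxa() {
    # Each argument is a host name or IP address (placeholders in the example).
    for host in "$@"; do
        "$SSH" "root@$host" "/etc/init.d/vpxa restart"
    done
}
```

For example, `restart_vpxa esx01 esx02` would restart the agent on those two hosts one after the other.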

sungho
Enthusiast

Hi alainrussell,

As you may know, this issue is fixed in ESXi 6.0 Update 2.

VMware ESXi 6.0 Update 2 Release Notes

After upgrade of hosts in a Virtual SAN cluster to ESXi 6.0 Update 1b, some hosts in the cluster might report false warning

After you upgrade Virtual SAN environment to ESXi 6.0 Update 1b, the vCenter Server reports a false warning similar to the following in the Summary tab in the vSphere Web Client and the ESXi host shows a notification triangle:

Host cannot communicate with all other nodes in Virtual SAN enabled cluster

This issue is resolved in this release.

Regards,

Sungho
