our vmkernel.log is SWAMPED with error messages like this:
2014-09-17T15:08:11.763Z cpu0:33523)<7>fnic : 2 :: Start VLAN Discovery
2014-09-17T15:08:11.764Z cpu20:33539)<6>fnic : 2 :: Sending VLAN request...
2014-09-17T15:08:13.038Z cpu0:33523)<7>fnic : 3 :: Start VLAN Discovery
2014-09-17T15:08:13.038Z cpu0:33549)<6>fnic : 3 :: Sending VLAN request...
as you can see these messages appear every other second and sometimes several times a second. our network guys don't see any VLAN discoveries.
this happens on a new three-host Cisco C240M3S cluster with the latest ESXi 5.5 U2 running a VSAN environment for Horizon 6. it doesn't seem to impact performance or functionality, but because of the frequency it makes it very hard to debug other issues because it literally swamps the vmkernel log.
anybody have any idea what is causing this?
ok this issue finally did have an impact. because of all the logging, the SD card ran out of space and the host crashed. VMs kept running (luckily) but the console was completely unresponsive and vCenter lost connection.
i would really appreciate any ideas on how to look for the cause here..
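One way to start narrowing this down is to count how often the message is logged per adapter. The sketch below is only an illustration: the function name count_vlan_discovery is made up for this example, the default log path /var/log/vmkernel.log is an assumption, and the number after "fnic :" is assumed to identify the adapter instance.

```shell
# Hypothetical helper: count "Start VLAN Discovery" messages per fnic number.
# The default log path is an assumption; pass another file as the first argument.
count_vlan_discovery() {
    log="${1:-/var/log/vmkernel.log}"
    # Keep only the discovery lines, reduce each to the number after "fnic :",
    # then count how many times each number occurs.
    grep 'fnic : [0-9]* :: Start VLAN Discovery' "$log" |
        sed 's/.*fnic : \([0-9]*\) ::.*/\1/' |
        sort -n | uniq -c |
        awk '{print "fnic " $2 ": " $1}'
}

# Usage (on the host): count_vlan_discovery
```

If one number dominates the counts, that adapter is the first place to look.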
I don't know the root cause, but could this be related to FCoE?
Do you know if FCoE has been enabled on the NICs in this Cisco cluster?
This is because of DCBX. You have CNA cards and ESXi is trying to discover the FCoE VLAN. If you don't use FCoE, you can disable the FCoE driver on the ESXi hosts. On UCS it's probably the fnic module:
esxcfg-module -d fnic
This should add a line to /etc/vmware/esx.conf
# cat /etc/vmware/esx.conf | grep fnic
/vmkernel/module/fnic/enabled = "false"
Then you need to reboot the ESXi host.
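Before rebooting, the esx.conf check can be wrapped in a small function so it is easy to repeat on each host. This is a minimal sketch, not an official tool: the function name module_disabled is hypothetical, and it only looks for the line format quoted above.

```shell
# Hypothetical helper: succeeds if the config file marks the module as disabled.
# Arguments: module name, then (optionally) the config file to inspect.
module_disabled() {
    module="$1"
    conf="${2:-/etc/vmware/esx.conf}"
    # Match the exact line format shown in the post, e.g.
    # /vmkernel/module/fnic/enabled = "false"
    grep -q "^/vmkernel/module/${module}/enabled = \"false\"" "$conf"
}

# Usage (on the host):
#   module_disabled fnic && echo "fnic is set to disabled"
```

The check only confirms the setting is in the config file; the driver stays loaded until the host has been rebooted.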
disabling fnic seems to solve this issue! is fnic Cisco-specific or could other devices be affected too?