16 Replies Latest reply on Jun 23, 2017 1:34 PM by snamidro

    vSphere ESXi 6.5: external network connectivity lost on DL380 Gen8

    tbraes Lurker

      Dear,

       

      I have a strange issue on my DL380 Gen8 system (not on the VMware HCL): it loses all external connectivity. When I log on through the out-of-band interface (iLO), I can still ping all VMs, and they are running.

      However, the external interfaces lose all connectivity (both the kernel and the VMware guest uplinks).

      Restarting the management agents doesn't help.

      When I look at my kernel log, I see my NVMe card and ntg3 (the network driver) complaining:

      2017-02-01T01:23:39.574Z cpu9:68121)User: 3089: sfcb-smx: wantCoreDump:sfcb-smx signal:6 exitCode:0 coredump:enabled

      2017-02-01T01:23:39.703Z cpu9:68121)UserDump: 3024: sfcb-smx: Dumping cartel 68117 (from world 68121) to file /var/core/sfcb-smx-zdump.002 ...

      2017-02-01T01:23:41.992Z cpu9:68121)UserDump: 3172: sfcb-smx: Userworld(sfcb-smx) coredump complete.

      2017-02-01T10:20:26.125Z cpu2:69084)nvme:nvmeCoreLogError:370:command failed: 0x43077bd885f0.

      2017-02-01T10:22:27.081Z cpu2:68970)nvme:nvmeCoreLogError:370:command failed: 0x43077bd70bf0.

      2017-02-01T10:24:28.580Z cpu2:68970)nvme:nvmeCoreLogError:370:command failed: 0x43077bd71370.

      2017-02-01T10:26:31.329Z cpu2:69175)nvme:nvmeCoreLogError:370:command failed: 0x43077bd71970.

      2017-02-01T10:28:32.559Z cpu2:69175)nvme:nvmeCoreLogError:370:command failed: 0x43077bd71f70.

      2017-02-01T10:30:49.130Z cpu2:68998)nvme:nvmeCoreLogError:370:command failed: 0x43077bd72570.

      2017-02-01T10:32:50.089Z cpu2:69195)nvme:nvmeCoreLogError:370:command failed: 0x43077bd72b70.

      2017-02-01T10:34:53.349Z cpu2:69134)nvme:nvmeCoreLogError:370:command failed: 0x43077bd73170.

      2017-02-01T10:36:54.443Z cpu2:69040)nvme:nvmeCoreLogError:370:command failed: 0x43077bd73770

      2017-02-01T16:18:36.497Z cpu1:68999)WARNING: ntg3-throttled: Ntg3XmitPktList:372: vmnic0:TX ring full (0)

      2017-02-01T16:18:45.193Z cpu22:65645)ntg3:vmnic0:Ntg3UplinkReset:665:Ntg3UplinkReset

      2017-02-01T16:18:45.193Z cpu22:65645)ntg3:vmnic0:Ntg3UplinkQuiesceIO:647:Ntg3UplinkQuiesceIO

      2017-02-01T16:18:45.193Z cpu22:65645)ntg3:vmnic0:Ntg3UplinkStartIO:623:Ntg3UplinkStartIO

      2017-02-01T16:18:55.193Z cpu21:65645)ntg3:vmnic0:Ntg3UplinkReset:665:Ntg3UplinkReset

      2017-02-01T16:18:55.193Z cpu21:65645)ntg3:vmnic0:Ntg3UplinkQuiesceIO:647:Ntg3UplinkQuiesceIO

      2017-02-01T16:18:55.193Z cpu21:65645)ntg3:vmnic0:Ntg3UplinkStartIO:623:Ntg3UplinkStartIO

      2017-02-01T16:19:05.195Z cpu21:65645)ntg3:vmnic0:Ntg3UplinkReset:665:Ntg3UplinkReset

      2017-02-01T16:19:05.195Z cpu21:65645)ntg3:vmnic0:Ntg3UplinkQuiesceIO:647:Ntg3UplinkQuiesceIO

      2017-02-01T16:19:05.195Z cpu21:65645)ntg3:vmnic0:Ntg3UplinkStartIO:623:Ntg3UplinkStartIO

      2017-02-01T15:50:50.684Z cpu9:68980)WARNING: NetPort: 1932: failed to disable port 0x2000005 on vSwitch0: Busy

      2017-02-01T15:50:50.684Z cpu9:68980)NetSched: 701: 0x2000002: received a force quiesce for port 0x2000005, dropped 727 pkts

      2017-02-01T15:50:50.685Z cpu9:68980)NetPort: 1879: disabled port 0x2000005

      2017-02-01T15:50:50.688Z cpu9:68980)Vmxnet3: 17265: Disable Rx queuing; queue size 256 is larger than Vmxnet3RxQueueLimit limit of 64.

      2017-02-01T15:50:50.688Z cpu9:68980)Vmxnet3: 17623: Using default queue delivery for vmxnet3 for port 0x2000005

      2017-02-01T15:50:50.688Z cpu9:68980)NetPort: 1660: enabled port 0x2000005 with mac 00:50:56:a4:3e:25

      2017-02-01T15:50:50.699Z cpu9:68980)NetPort: 1879: disabled port 0x2000005

      2017-02-01T15:50:50.701Z cpu9:68980)Vmxnet3: 17265: Disable Rx queuing; queue size 256 is larger than Vmxnet3RxQueueLimit limit of 64.

      2017-02-01T15:50:50.701Z cpu9:68980)Vmxnet3: 17623: Using default queue delivery for vmxnet3 for port 0x2000005

      2017-02-01T15:50:50.701Z cpu9:68980)NetPort: 1660: enabled port 0x2000005 with mac 00:50:56:a4:3e:25

      2017-02-01T15:50:56.216Z cpu0:68971)WARNING: ntg3-throttled: Ntg3XmitPktList:372: vmnic0:TX ring full (0)
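
      To get an overview of a longer log than the excerpt above, the messages can be tallied per component. Below is a minimal Python sketch, assuming the log lines keep the `timestamp cpuN:world)component:` shape seen in the excerpt. The names `LINE_RE`, `summarize`, `nvme_intervals` and `SAMPLE` are hypothetical helpers made up for illustration, not part of any VMware tool, and the sample lines are copied from the excerpt.

```python
import re
from collections import Counter
from datetime import datetime

# Assumed line shape, taken from the excerpt:
#   <ISO timestamp>Z cpu<N>:<world>)[WARNING: ]<component>: ...
LINE_RE = re.compile(r"^(\S+)Z cpu\d+:\d+\)(?:WARNING: )?([\w-]+):")

# A few lines copied from the excerpt, used only as sample input.
SAMPLE = """\
2017-02-01T10:20:26.125Z cpu2:69084)nvme:nvmeCoreLogError:370:command failed: 0x43077bd885f0.
2017-02-01T10:22:27.081Z cpu2:68970)nvme:nvmeCoreLogError:370:command failed: 0x43077bd70bf0.
2017-02-01T16:18:36.497Z cpu1:68999)WARNING: ntg3-throttled: Ntg3XmitPktList:372: vmnic0:TX ring full (0)
2017-02-01T16:18:45.193Z cpu22:65645)ntg3:vmnic0:Ntg3UplinkReset:665:Ntg3UplinkReset
""".splitlines()


def summarize(lines):
    """Count log lines per component (nvme, ntg3, NetPort, ...)."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match:
            counts[match.group(2)] += 1
    return counts


def nvme_intervals(lines):
    """Return the gaps, in seconds, between consecutive nvme lines."""
    stamps = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match and match.group(2) == "nvme":
            stamps.append(
                datetime.strptime(match.group(1), "%Y-%m-%dT%H:%M:%S.%f")
            )
    return [(b - a).total_seconds() for a, b in zip(stamps, stamps[1:])]


if __name__ == "__main__":
    for component, count in summarize(SAMPLE).most_common():
        print(component, count)
    print(nvme_intervals(SAMPLE))
```

      Run against a full copy of the log, this shows how often each component reports and whether the nvme errors really recur at a fixed interval. It only summarizes; it does not say anything about the cause.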

       

       

      When I restart the box, everything works fine again, sometimes for a day, sometimes for a week; the interval is unpredictable. Does anybody have an idea?
