VMware

4,194 Views 6 Replies Last post: Dec 15, 2010 1:24 AM by kesparlat RSS
chaddy Novice 17 posts since
Jun 3, 2009

Jun 30, 2010 10:52 AM

Lost network connectivity ESX 4

 

Hi all,

 

 

Just this morning this issue happened on our second VMware server; it happened 6 weeks ago on our first VMware server.  Our VMs intermittently became unresponsive over the network and we couldn't connect to the service console.  After rebooting via the CLI we were able to connect to the service console and start up the VMs, and everything works as normal.  No alerts are listed in vCenter, but upon connecting to the CLI I see errors in /var/log/messages and /var/log/vmkwarning, as described in KB1017458.  I verified that we do have the ESX400-201002401-BG patch installed, as mentioned in KB1017458 (it wasn't installed on the first server the last time we experienced this issue).

 

 

 

 

 

vmkwarning log:

 

 

Jun 30 07:31:25 cma32 vmkernel: 6:12:46:44.715 cpu0:4096)VMNIX: WARNING: NetCos: 1075: virtual HW appears wedged (bug number 90831), resetting

Jun 30 07:31:25 cma32 vmkernel: 6:12:46:44.715 cpu11:4119)WARNING: Net: 1205: non-forced disable with 128 packets in flight.

Jun 30 07:31:25 cma32 vmkernel: 6:12:46:44.715 cpu11:4119)WARNING: Net: 1210: forced disable with 128 packets in flight.

Jun 30 07:32:30 cma32 vmkernel: 6:12:47:47.811 cpu1:4238)WARNING: LinNet: netdev_watchdog: NETDEV WATCHDOG: vmnic0: transmit timed out

Jun 30 07:32:31 cma32 vmkernel: 6:12:47:50.795 cpu8:4230)WARNING: LinNet: netdev_watchdog: NETDEV WATCHDOG: vmnic0: transmit timed out

Jun 30 07:33:40 cma32 vmkernel: 6:12:48:59.824 cpu8:4242)WARNING: CpuSched: 965: world 4242(helper21-12) did not yield PCPU 8 for 4001 msec, refCharge=7996 msec, coreCharge=8329 msec, 

Jun 30 07:36:00 cma32 vmkernel: 6:12:51:19.519 cpu5:4289)ALERT: Heartbeat: 518: PCPU 4 didn't have a heartbeat for 3780 seconds. may be locked up

Jun 30 07:36:00 cma32 vmkernel: 6:12:51:19.519 cpu5:4289)WARNING: NMI: 1612: Sending NMI IPI to PCPU 4 to get its backtrace (Src 1, Req 1)

Jun 30 07:36:00 cma32 vmkernel: 6:12:51:19.519 cpu4:4285)ALERT: NMI: 2001: NMI IPI received. Was eip(base):ebp:cs [...](Src 0x1)

Jun 30 07:36:00 cma32 vmkernel: 6:12:51:19.595 cpu13:4280)ALERT: Heartbeat: 518: PCPU 6 didn't have a heartbeat for 3780 seconds. may be locked up

Jun 30 07:36:00 cma32 vmkernel: 6:12:51:19.595 cpu13:4280)WARNING: NMI: 1612: Sending NMI IPI to PCPU 6 to get its backtrace (Src 1, Req 1)

Jun 30 07:36:00 cma32 vmkernel: 6:12:51:19.595 cpu6:4275)ALERT: NMI: 2001: NMI IPI received. Was eip(base):ebp:cs [...](Src 0x1)

Jun 30 07:39:41 cma32 vmkernel: 6:12:55:00.921 cpu0:4096)VMNIX: WARNING: NetCos: 1075: virtual HW appears wedged (bug number 90831), resetting

Jun 30 07:39:41 cma32 vmkernel: 6:12:55:00.921 cpu10:4117)WARNING: Net: 1205: non-forced disable with 128 packets in flight.

Jun 30 07:39:41 cma32 vmkernel: 6:12:55:00.921 cpu10:4117)WARNING: Net: 1210: forced disable with 128 packets in flight.

Jun 30 07:43:40 cma32 vmkernel: 6:12:59:00.071 cpu8:4237)WARNING: CpuSched: 965: world 4237(helper21-7) did not yield PCPU 8 for 4001 msec, refCharge=7998 msec, coreCharge=8331 msec, 

Jun 30 07:47:00 cma32 vmkernel: 6:13:02:20.151 cpu0:4096)VMNIX: WARNING: NetCos: 1075: virtual HW appears wedged (bug number 90831), resetting

Jun 30 07:47:00 cma32 vmkernel: 6:13:02:20.151 cpu8:4117)WARNING: Net: 1205: non-forced disable with 128 packets in flight.

Jun 30 07:47:00 cma32 vmkernel: 6:13:02:20.151 cpu8:4117)WARNING: Net: 1210: forced disable with 128 packets in flight.

Jun 30 07:50:46 cma32 vmkernel: 6:13:06:02.321 cpu1:4242)WARNING: LinNet: netdev_watchdog: NETDEV WATCHDOG: vmnic1: transmit timed out

Jun 30 07:53:28 cma32 vmkernel: 6:13:08:48.321 cpu8:4240)WARNING: CpuSched: 965: world 4240(helper21-10) did not yield PCPU 8 for 4002 msec, refCharge=7995 msec, coreCharge=8241 msec,
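For anyone comparing notes on a similar host, here is a minimal Python sketch that tallies the warning types seen in an excerpt like the one above. The patterns are copied from the message text in this log; the function name `summarize` and the counter keys are made up for illustration, and other ESX builds may word these messages differently.

```python
import re
from collections import Counter

# Patterns taken from the warning text in the log excerpt above.
WATCHDOG = re.compile(r"NETDEV WATCHDOG: (\w+): transmit timed out")
WEDGED = re.compile(r"virtual HW appears wedged")
HEARTBEAT = re.compile(r"PCPU (\d+) didn't have a heartbeat")


def summarize(lines):
    """Count watchdog timeouts per interface, NetCos resets, and stalled PCPUs."""
    counts = Counter()
    for line in lines:
        m = WATCHDOG.search(line)
        if m:
            counts["watchdog:" + m.group(1)] += 1
            continue
        if WEDGED.search(line):
            counts["netcos_reset"] += 1
            continue
        m = HEARTBEAT.search(line)
        if m:
            counts["no_heartbeat:pcpu" + m.group(1)] += 1
    return counts
```

Passing an open file object for a copy of the vmkwarning log (or the messages log, where the watchdog lines name vswif0) to `summarize` gives a per-interface count, which makes it easier to see whether the timeouts cluster on particular vmnics.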

 

 

messages log:

 

 

Jun 30 06:38:44 cma32 kernel: [560207.735093] NETDEV WATCHDOG: vswif0: transmit timed out

Jun 30 06:45:51 cma32 kernel: [560634.167440] NETDEV WATCHDOG: vswif0: transmit timed out

Jun 30 06:51:56 cma32 kernel: [560998.696585] NETDEV WATCHDOG: vswif0: transmit timed out

Jun 30 06:54:42 cma32 kernel: [561164.391621] NETDEV WATCHDOG: vswif0: transmit timed out

Jun 30 07:01:48 cma32 kernel: [561588.905511] NETDEV WATCHDOG: vswif0: transmit timed out

Jun 30 07:08:52 cma32 kernel: [562012.361168] NETDEV WATCHDOG: vswif0: transmit timed out

Jun 30 07:15:51 cma32 kernel: [562430.779261] NETDEV WATCHDOG: vswif0: transmit timed out

Jun 30 07:18:29 cma32 vobd: Jun 30 07:18:29.589: 563515795364us: [vprob.net.redundancy.degraded] Uplink redundancy degraded on virtual switch "vSwitch0". Physical NIC vmnic2 is down. 3 uplinks still up. Affected port groups: "Service Console", "VM Network", "VM Network", "VM Network", "VM Network", "VM Network".

Jun 30 07:24:20 cma32 kernel: [562939.132972] NETDEV WATCHDOG: vswif0: transmit timed out

Jun 30 07:31:25 cma32 kernel: [563362.547808] NETDEV WATCHDOG: vswif0: transmit timed out

Jun 30 07:32:31 cma32 vobd: Jun 30 07:32:31.270: 564321486766us: [vprob.net.redundancy.degraded] Uplink redundancy degraded on virtual switch "vSwitch0". Physical NIC vmnic0 is down. 2 uplinks still up. Affected port groups: "Service Console", "VM Network", "VM Network", "VM Network", "VM Network", "VM Network".

Jun 30 07:39:41 cma32 kernel: [563857.837700] NETDEV WATCHDOG: vswif0: transmit timed out

Jun 30 07:47:00 cma32 kernel: [564296.256806] NETDEV WATCHDOG: vswif0: transmit timed out
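The vobd lines in the messages log record each uplink dropping out of vSwitch0, so a short timeline of those events can help line them up against the watchdog timeouts. A rough Python sketch, assuming the vobd wording matches the two lines above exactly; the function name `uplink_events` is made up for illustration.

```python
import re

# Matches the vobd "redundancy degraded" text from the messages log excerpt.
UPLINK_DOWN = re.compile(
    r'^(\w{3}\s+\d+ \d{2}:\d{2}:\d{2}) .*virtual switch "([^"]+)"\. '
    r"Physical NIC (\w+) is down\. (\d+) uplinks? still up"
)


def uplink_events(lines):
    """Return (timestamp, vswitch, nic, uplinks_left) for each degraded event."""
    events = []
    for line in lines:
        m = UPLINK_DOWN.search(line)
        if m:
            ts, vswitch, nic, left = m.groups()
            events.append((ts, vswitch, nic, int(left)))
    return events
```

Run against the excerpt above, this would list vmnic2 going down first and vmnic0 about 14 minutes later, with the remaining uplink count falling from 3 to 2.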

 

 

 

 

 

Also, the NIC we are using is a 4-port NetXen HP NC375i, which has some documented problems, although most of those problems are about not all 4 ports working, and all 4 are working for me. This issue prompted me to do what I should've done a while ago: set up the service console on its own Intel NIC, so now just the VMs are running on 3 ports of the NetXen NIC.

 

 

Any ideas what could be causing this?

 

 

Thanks,

 

 

-Chad

 

 

 

 

 

   

 

 

Simonds Lurker 1 posts since
Apr 20, 2009
1. Aug 22, 2010 4:57 PM in response to: chaddy
Re: Lost network connectivity ESX 4

I have the same issue now. We just upgraded to new hosts with this card (NC375T) running the latest firmware (4.0.530), and the hosts randomly lose network connectivity...

 

HELP?

Mackopes Enthusiast 64 posts since
Jan 29, 2008
2. Oct 12, 2010 10:42 AM in response to: Simonds
Re: Lost network connectivity ESX 4

 

We had so many problems with the quad-port NetXen cards in our DL370 G6s that we ended up ripping them ALL out and replacing them with Broadcom cards...

 

 

Aaron

 

 

kesparlat Enthusiast 71 posts since
Oct 30, 2007
3. Nov 2, 2010 2:26 AM in response to: Mackopes
Re: Lost network connectivity ESX 4

I have the same problem. In my case, if you look at the physical adapter the link is up, but the Service Console detects it as down. It seems like a driver error.

I've upgraded to this one (4.0.570):

 

http://downloads.vmware.com/d/details/esx4x_qla_nx_nic_dt/ZHcqYmRAdyViZHdlZQ

 

Best regards.

EdZ Enthusiast 37 posts since
Jan 3, 2006
4. Nov 2, 2010 9:51 AM in response to: chaddy
Re: Lost network connectivity ESX 4

 

We are seeing the same thing happening in our environment with ESX 4.0. One of the VMs intermittently loses network connectivity, and it can be restored by clicking the "connected" box in the VM under network settings.

 

 

Thanks,

 

 

Ed

 

 

vGuy Expert 250 posts since
Sep 19, 2007
5. Nov 2, 2010 7:43 PM in response to: EdZ
Re: Lost network connectivity ESX 4

Hello EdZ,

If you are facing a physical NIC issue, all of your VMs will be affected, not just a particular one. I suggest you double-check the settings of the affected VM. For example, the VM port group should be the same on all the ESX hosts in the cluster... HTH
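As a rough illustration of the port group check, here is a small Python sketch that compares port group names across hosts and reports what each host is missing. The host names and port group lists are made up; in practice you would fill them in from each host's network configuration. Names are compared exactly, so a difference in case or a stray space counts as a mismatch.

```python
def portgroup_mismatches(host_portgroups):
    """Map each host to the port group names it lacks relative to the union."""
    all_groups = set().union(*host_portgroups.values())
    return {
        host: sorted(all_groups - set(groups))
        for host, groups in host_portgroups.items()
        if all_groups - set(groups)
    }


# Hypothetical inventory; host names and port group lists are made up.
inventory = {
    "esx01": ["Service Console", "VM Network"],
    "esx02": ["Service Console", "VM Network "],  # trailing space: a different name
}
print(portgroup_mismatches(inventory))
```

An empty result means every host carries the same set of port group names.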

If you find this or any other answer useful please consider awarding points by marking the answer correct or helpful
kesparlat Enthusiast 71 posts since
Oct 30, 2007
6. Dec 15, 2010 1:24 AM in response to: vGuy
Re: Lost network connectivity ESX 4

I submitted an SR with HP (the server vendor), and they confirmed the bug with the Brocade driver. I'm waiting for a newer one.

 

Currently, all the NICs are paired with others from a different chipset (Intel) to avoid loss of network connectivity on the hosts.
