Since upgrading to ESX4/vSphere4, the network on some of our 64-bit servers (Windows 2003) seems to drop for short periods of time.
I'm at a complete loss. Does anybody have any ideas?
Well, I have been having an ongoing conversation with a member of the support staff. However, I was a little bit annoyed yesterday when I was told that he didn't have enough time to look for the root cause of the problem, and he then basically asked me to:
1.) commit some snapshots
2.) apply 1 outstanding service pack relating to battery pack cache running out of power and causing corruption
3.) regen logs (does this mean I have to break my prod environment?!!!)
Anyway, I am not amused, to say the least. Support has ALWAYS been great, and this felt like a fob-off. I have emailed support stating that other people are having the same problem and pointing them to these threads...
I'm seeing this too. We're not using iSCSI at all (well some of the guests are, but nothing on the ESX host itself), it's all NFS.
Sep 24 08:44:45 gsoesx01 vmkernel: 37:16:48:24.620 cpu0:4096)WARNING: NMP: nmp_IssueCommandToDevice: I/O could not be issued to device "mpx.vmhba34:C0:T0:L0" due to Not found
Sep 24 08:44:45 gsoesx01 vmkernel: 37:16:48:24.620 cpu0:4096)WARNING: NMP: nmp_DeviceRetryCommand: Device "mpx.vmhba34:C0:T0:L0": awaiting fast path state update for failover with I/O blocked. No prior reservation exists on the device.
Sep 24 08:44:45 gsoesx01 vmkernel: 37:16:48:24.620 cpu0:4096)WARNING: NMP: nmp_DeviceStartLoop: NMP Device "mpx.vmhba34:C0:T0:L0" is blocked. Not starting I/O from device.
Sep 24 08:45:15 gsoesx01 vmkernel: 37:16:48:54.620 cpu9:4125)WARNING: NMP: nmp_DeviceStartLoop: NMP Device "mpx.vmhba34:C0:T0:L0" is blocked. Not starting I/O from device.
Sep 24 08:45:15 gsoesx01 vmkernel: 37:16:48:54.620 cpu21:4254)WARNING: NMP: nmp_DeviceAttemptFailover: Retry world restore device "mpx.vmhba34:C0:T0:L0" - no more commands to retry
Sep 24 08:45:20 gsoesx01 vmkernel: 37:16:48:59.630 cpu0:4096)WARNING: NMP: nmp_IssueCommandToDevice: I/O could not be issued to device "mpx.vmhba34:C0:T0:L0" due to Not found
Sep 24 08:45:20 gsoesx01 vmkernel: 37:16:48:59.630 cpu0:4096)WARNING: NMP: nmp_DeviceRetryCommand: Device "mpx.vmhba34:C0:T0:L0": awaiting fast path state update for failover with I/O blocked. No prior reservation exists on the device.
Sep 24 08:45:20 gsoesx01 vmkernel: 37:16:48:59.630 cpu0:4096)WARNING: NMP: nmp_DeviceStartLoop: NMP Device "mpx.vmhba34:C0:T0:L0" is blocked. Not starting I/O from device.
Sep 24 08:45:50 gsoesx01 vmkernel: 37:16:49:29.630 cpu1:4125)WARNING: NMP: nmp_DeviceStartLoop: NMP Device "mpx.vmhba34:C0:T0:L0" is blocked. Not starting I/O from device.
Sep 24 08:45:51 gsoesx01 vmkernel: 37:16:49:30.630 cpu2:4254)WARNING: NMP: nmp_DeviceAttemptFailover: Retry world restore device "mpx.vmhba34:C0:T0:L0" - no more commands to retry
How do I see what vmhba34 is? I suspect it's one of the NICs.
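For anyone wading through a log like the one above, here is a minimal Python sketch that counts the NMP warnings per device and pulls the adapter name out of the runtime name. The function names (`summarize_nmp_warnings`, `adapter_of`) and the regular expression are my own, written against the line format pasted in this post only; other vmkernel message formats are not covered. It only tells you which device the warnings point at. Mapping vmhba34 to actual hardware still has to be done on the host itself (I believe `esxcfg-scsidevs -a` lists the adapters on ESX 4, but treat that as an assumption and check it on your build).

```python
import re
from collections import Counter

# Matches vmkernel NMP warnings in the format pasted above, e.g.
# ... vmkernel: 37:16:48:24.620 cpu0:4096)WARNING: NMP: nmp_X: ... "mpx.vmhba34:C0:T0:L0" ...
NMP_LINE = re.compile(
    r'vmkernel: \S+ cpu\d+:\d+\)WARNING: NMP: (?P<func>nmp_\w+): .*?"(?P<device>[^"]+)"'
)


def summarize_nmp_warnings(lines):
    """Count NMP warnings per (device, function) pair; other lines are ignored."""
    counts = Counter()
    for line in lines:
        match = NMP_LINE.search(line)
        if match:
            counts[(match.group("device"), match.group("func"))] += 1
    return counts


def adapter_of(device):
    """Return the adapter part (e.g. 'vmhba34') of a runtime name, or None."""
    match = re.search(r"vmhba\d+", device)
    return match.group(0) if match else None


if __name__ == "__main__":
    # Two lines copied from the log in this post.
    sample = [
        'Sep 24 08:44:45 gsoesx01 vmkernel: 37:16:48:24.620 cpu0:4096)WARNING: NMP: '
        'nmp_IssueCommandToDevice: I/O could not be issued to device '
        '"mpx.vmhba34:C0:T0:L0" due to Not found',
        'Sep 24 08:44:45 gsoesx01 vmkernel: 37:16:48:24.620 cpu0:4096)WARNING: NMP: '
        'nmp_DeviceStartLoop: NMP Device "mpx.vmhba34:C0:T0:L0" is blocked. '
        'Not starting I/O from device.',
    ]
    for (device, func), count in summarize_nmp_warnings(sample).items():
        print(adapter_of(device), device, func, count)
```

To run it over a saved copy of the log, pass an open file object to `summarize_nmp_warnings` instead of the sample list.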
Hello guys,
I'm sorry to revive this post, but I'm experiencing the same problem and can't seem to fix it.
I'm trying to test a scenario where a backup SAN takes over. We are using 2 groups of EqualLogic arrays: I promote the second group and demote the first one. At this point vSphere should switch over to the second group, but instead I lose connection to the whole ESX host and its VMs, and I get spammed by these messages in the vmkernel log file:
May 25 17:33:42 esx2 vmkernel: 4:03:04:46.427 cpu12:4120)FSS: 666: Failed to get object f530 28 2 4bfbdb4d 63b6754c 22007fe5 1e565719 4 1 0 0 0 0 0 :Address temporarily unmapped
May 25 17:33:42 esx2 vmkernel: 4:03:04:46.428 cpu2:4122)FSS: 666: Failed to get object f530 28 2 4bfbdb4d 63b6754c 22007fe5 1e565719 4 1 0 0 0 0 0 :Address temporarily unmapped
esxcfg-rescan vmhba33 returns a nice error: Error: Unable to scan adapters for VMFS
The test VM I had on there is of course out in space, which gets me thinking: what is the best way to get out of this kind of situation?
- Notice one SAN is dead --> Kill all VMs --> Promote second SAN --> Demote first SAN --> Restart VMs?
Thanks !
Alex
Are these problems fixed in vSphere 4.1?
Is there any procedure to solve them in 4.0?
Thanks