VMware Cloud Community
JayDeah
Contributor

extremely long boot time (crash?) on ESXi5 host, pause after "iscsi_vmk started"

OK, I've got some Dell PowerEdge 1950 servers connected to an MD3200i that have been happily running ESXi 4.1 U1 for some time. I have put ESXi 5 on them using a number of scenarios, all of which have the same problem. Any thoughts?

The boot process hangs near the end of the progress bar, and the last entry on the screen is "iscsi_vmk started successfully".

If I press Alt+F12 I can see the server hasn't crashed.

I am using vCenter 5.0 with the latest Update Manager and Web Client.

I have used the following scenarios to install and configure:

1) Host upgrade using Update Manager

2) Clean install of ESXi 5, apply the old ESXi 4.1 host profile, update settings, reboot

3) Clean install of ESXi 5, clean configuration, reboot

Scenario 1 appeared to leave the host hanging; I gave up after 30 minutes and reinstalled.

Scenario 2 didn't seem to like having the ESXi 4 settings, as I ended up with my software iSCSI adapter on vmhba39. It did eventually start after about half an hour!

Scenario 3 has been hanging for about 15 minutes so far. I'm hoping it finally boots!

jbsmith1
Contributor

YES. Thank you for digging into this! I'm having the exact same problem. Here are the relevant lines (they repeat many times) from my vmkernel.log:

2011-09-14T00:55:20.508Z cpu2:4743)FSS: 4333: No FS driver claimed device 'mpx.vmhba0:C0:T0:L0': Not supported

2011-09-14T00:55:20.515Z cpu2:4743)VC: 1449: Device rescan time 15 msec (total number of devices 5)

2011-09-14T00:55:20.515Z cpu2:4743)VC: 1452: Filesystem probe time 11 msec (devices probed 5 of 5)

2011-09-14T00:55:20.606Z cpu2:4743)FSS: 4333: No FS driver claimed device 'mpx.vmhba0:C0:T0:L0': Not supported

And here's the relevant snippet from esxcli storage core path list:

Runtime Name: vmhba0:C0:T0:L0
   Device: mpx.vmhba0:C0:T0:L0
   Device Display Name: Local PLDS CD-ROM (mpx.vmhba0:C0:T0:L0)

I hope there is a patch or at least a viable workaround for this soon!
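
Since these lines repeat many times, it can help to collapse them before reading the log. The following is a minimal Python sketch, not anything shipped with ESXi: it strips the timestamp and `cpuN:world)` prefix and counts how often each remaining message occurs. The prefix pattern is an assumption based only on the lines quoted above, and `count_messages` and `sample` are names made up for illustration.

```python
import re
from collections import Counter

# Timestamp plus "cpuN:WORLD)" prefix, as it appears in the quoted lines.
# Assumption: every line of interest starts this way.
PREFIX = re.compile(r"^\S+Z cpu\d+:\d+\)")


def count_messages(lines):
    """Count log messages after stripping the timestamp/cpu prefix."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        counts[PREFIX.sub("", line, count=1)] += 1
    return counts


sample = [
    "2011-09-14T00:55:20.508Z cpu2:4743)FSS: 4333: No FS driver claimed device 'mpx.vmhba0:C0:T0:L0': Not supported",
    "2011-09-14T00:55:20.515Z cpu2:4743)VC: 1449: Device rescan time 15 msec (total number of devices 5)",
    "2011-09-14T00:55:20.606Z cpu2:4743)FSS: 4333: No FS driver claimed device 'mpx.vmhba0:C0:T0:L0': Not supported",
]

# Most frequent messages first
for message, n in count_messages(sample).most_common():
    print(n, message)
```

On a real host you would feed it the lines of the log file instead of `sample`.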

JayDeah
Contributor

I have had a response from VMware support.

They have removed the MD series arrays from the ESXi 5 HCL... lol

Apparently Dell will be releasing a firmware update later this month, and then they'll look at re-certification.

Not exactly the response I was expecting!

NicolaB
Contributor

Same here,

so bad.

I was installing a production environment with 2 x Dell R510 servers and one MD3220i on a new vSphere 5 infrastructure.

Everything was on the HCL on the 14th and 15th of September, when I was in the datacenter.

Now we're having a lot of issues with one of the two R510s, which loses one datastore while still seeing the LUN, and I discover now that VMware has removed the MD3220i from the HCL?!?

That is not a nice move from VMware. I installed it when it was on the HCL; they should not have put it on the HCL if it was not going to work!

Now I have to sit and wait for new firmware from Dell... so bad.

I cannot even create a datastore now...

2011-09-20T12:06:52.509Z cpu12:2709)WARNING: iscsi_vmk: iscsivmk_StopConnection: vmhba32:CH:1 T:0 CN:0: iSCSI connection is being marked "OFFLINE" (Event:4)
2011-09-20T12:06:52.509Z cpu12:2709)WARNING: iscsi_vmk: iscsivmk_StopConnection: Sess [ISID: 00023d000002 TARGET: iqn.1984-05.com.dell:powervault.md3200i.6782bcb0003fe45d000000003ca53d0c TPGT: 1 TSIH: 0]
2011-09-20T12:06:52.509Z cpu12:2709)WARNING: iscsi_vmk: iscsivmk_StopConnection: Conn [CID: 0 L: 192.168.2.1:61926 R: 192.168.2.254:3260]
2011-09-20T12:06:52.509Z cpu1:2711)NMP: nmp_ThrottleLogForDevice:2318: Cmd 0x2a (0x41244141edc0) to dev "naa.6782bcb0003fe45d0000034b3cd274d7" on path "vmhba32:C1:T0:L10" Failed: H:0x2 D:0x0 P:0x0 Possible sense data: 0x0 0x0 0x0.Act:EVAL
2011-09-20T12:06:52.509Z cpu1:2711)WARNING: NMP: nmp_DeviceRequestFastDeviceProbe:237:NMP device "naa.6782bcb0003fe45d0000034b3cd274d7" state in doubt; requested fast path state update...
2011-09-20T12:06:52.509Z cpu1:2711)ScsiDeviceIO: 2305: Cmd(0x41244141edc0) 0x2a, CmdSN 0x2 to dev "naa.6782bcb0003fe45d0000034b3cd274d7" failed H:0x2 D:0x0 P:0x0 Possible sense data: 0x0 0x0 0x0.
2011-09-20T12:06:52.651Z cpu1:2049)NMP: nmp_ThrottleLogForDevice:2318: Cmd 0x2a (0x41244141edc0) to dev "naa.6782bcb0003fe45d0000034b3cd274d7" on path "vmhba32:C1:T0:L10" Failed: H:0x2 D:0x0 P:0x0 Possible sense data: 0x0 0x0 0x0.Act:EVAL
2011-09-20T12:06:52.795Z cpu1:2049)NMP: nmp_ThrottleLogForDevice:2318: Cmd 0x2a (0x41244141edc0) to dev "naa.6782bcb0003fe45d0000034b3cd274d7" on path "vmhba32:C1:T0:L10" Failed: H:0x2 D:0x0 P:0x0 Possible sense data: 0x0 0x0 0x0.Act:EVAL
2011-09-20T12:06:52.936Z cpu1:2049)NMP: nmp_ThrottleLogForDevice:2318: Cmd 0x2a (0x41244141edc0) to dev "naa.6782bcb0003fe45d0000034b3cd274d7" on path "vmhba32:C1:T0:L10" Failed: H:0x2 D:0x0 P:0x0 Possible sense data: 0x0 0x0 0x0.Act:EVAL
2011-09-20T12:06:53.080Z cpu1:2049)NMP: nmp_ThrottleLogForDevice:2318: Cmd 0x2a (0x41244141edc0) to dev "naa.6782bcb0003fe45d0000034b3cd274d7" on path "vmhba32:C1:T0:L10" Failed: H:0x2 D:0x0 P:0x0 Possible sense data: 0x0 0x0 0x0.Act:EVAL
2011-09-20T12:06:53.198Z cpu4:9654)ScsiCore: 1455: Power-on Reset occurred on vmhba32:C7:T0:L10
2011-09-20T12:06:53.200Z cpu4:9654)ScsiCore: 1455: Power-on Reset occurred on vmhba32:C6:T0:L10
2011-09-20T12:06:53.202Z cpu4:9654)ScsiCore: 1455: Power-on Reset occurred on vmhba32:C5:T0:L10
2011-09-20T12:06:53.206Z cpu4:9654)ScsiCore: 1455: Power-on Reset occurred on vmhba32:C4:T0:L10
2011-09-20T12:06:53.209Z cpu4:9654)ScsiCore: 1455: Power-on Reset occurred on vmhba32:C3:T0:L10
2011-09-20T12:06:53.209Z cpu1:2049)WARNING: NMP: nmp_DeviceRequestFastDeviceProbe:237:NMP device "naa.6782bcb0003fe45d0000034b3cd274d7" state in doubt; requested fast path state update...
2011-09-20T12:06:53.212Z cpu4:9654)ScsiCore: 1455: Power-on Reset occurred on vmhba32:C2:T0:L10
2011-09-20T12:06:53.224Z cpu1:2049)NMP: nmp_ThrottleLogForDevice:2318: Cmd 0x2a (0x41244141edc0) to dev "naa.6782bcb0003fe45d0000034b3cd274d7" on path "vmhba32:C1:T0:L10" Failed: H:0x2 D:0x0 P:0x0 Possible sense data: 0x0 0x0 0x0.Act:EVAL
2011-09-20T12:06:53.366Z cpu1:2049)NMP: nmp_ThrottleLogForDevice:2318: Cmd 0x2a (0x41244141edc0) to dev "naa.6782bcb0003fe45d0000034b3cd274d7" on path "vmhba32:C1:T0:L10" Failed: H:0x2 D:0x0 P:0x0 Possible sense data: 0x0 0x0 0x0.Act:EVAL
2011-09-20T12:06:53.509Z cpu1:2049)NMP: nmp_ThrottleLogForDevice:2318: Cmd 0x2a (0x41244141edc0) to dev "naa.6782bcb0003fe45d0000034b3cd274d7" on path "vmhba32:C1:T0:L10" Failed: H:0x2 D:0x0 P:0x0 Possible sense data: 0x0 0x0 0x0.Act:EVAL
2011-09-20T12:06:53.522Z cpu4:9654)WARNING: VMW_SATP_LSI: satp_lsi_pathIsUsingPreferredController:714:Failed to get volume access control data for path "vmhba32:C1:T0:L10": Busy
2011-09-20T12:06:53.522Z cpu4:9654)VMW_SATP_LSI: satp_lsi_updatePath:680: Failed to update path "vmhba32:C1:T0:L10" Busy
2011-09-20T12:06:53.524Z cpu4:9654)ScsiCore: 1455: Power-on Reset occurred on vmhba32:C0:T0:L10
2011-09-20T12:06:53.651Z cpu1:2049)NMP: nmp_ThrottleLogForDevice:2318: Cmd 0x2a (0x41244141edc0) to dev "naa.6782bcb0003fe45d0000034b3cd274d7" on path "vmhba32:C1:T0:L10" Failed: H:0x2 D:0x0 P:0x0 Possible sense data: 0x0 0x0 0x0.Act:EVAL
2011-09-20T12:06:53.795Z cpu1:2049)NMP: nmp_ThrottleLogForDevice:2318: Cmd 0x2a (0x41244141edc0) to dev "naa.6782bcb0003fe45d0000034b3cd274d7" on path "vmhba32:C1:T0:L10" Failed: H:0x2 D:0x0 P:0x0 Possible sense data: 0x0 0x0 0x0.Act:EVAL
2011-09-20T12:06:53.936Z cpu1:2049)NMP: nmp_ThrottleLogForDevice:2318: Cmd 0x2a (0x41244141edc0) to dev "naa.6782bcb0003fe45d0000034b3cd274d7" on path "vmhba32:C1:T0:L10" Failed: H:0x2 D:0x0 P:0x0 Possible sense data: 0x0 0x0 0x0.Act:EVAL
2011-09-20T12:06:54.080Z cpu1:2049)NMP: nmp_ThrottleLogForDevice:2318: Cmd 0x2a (0x41244141edc0) to dev "naa.6782bcb0003fe45d0000034b3cd274d7" on path "vmhba32:C1:T0:L10" Failed: H:0x2 D:0x0 P:0x0 Possible sense data: 0x0 0x0 0x0.Act:EVAL
2011-09-20T12:06:54.209Z cpu1:2049)WARNING: NMP: nmp_DeviceRequestFastDeviceProbe:237:NMP device "naa.6782bcb0003fe45d0000034b3cd274d7" state in doubt; requested fast path state update...
2011-09-20T12:06:54.224Z cpu1:2049)NMP: nmp_ThrottleLogForDevice:2318: Cmd 0x2a (0x41244141edc0) to dev "naa.6782bcb0003fe45d0000034b3cd274d7" on path "vmhba32:C1:T0:L10" Failed: H:0x2 D:0x0 P:0x0 Possible sense data: 0x0 0x0 0x0.Act:EVAL
2011-09-20T12:06:54.366Z cpu1:2049)NMP: nmp_ThrottleLogForDevice:2318: Cmd 0x2a (0x41244141edc0) to dev "naa.6782bcb0003fe45d0000034b3cd274d7" on path "vmhba32:C1:T0:L10" Failed: H:0x2 D:0x0 P:0x0 Possible sense data: 0x0 0x0 0x0.Act:EVAL
2011-09-20T12:06:54.509Z cpu1:2049)NMP: nmp_ThrottleLogForDevice:2318: Cmd 0x2a (0x41244141edc0) to dev "naa.6782bcb0003fe45d0000034b3cd274d7" on path "vmhba32:C1:T0:L10" Failed: H:0x2 D:0x0 P:0x0 Possible sense data: 0x0 0x0 0x0.Act:EVAL
2011-09-20T12:06:54.515Z cpu2:10594)WARNING: VMW_SATP_LSI: satp_lsi_pathIsUsingPreferredController:714:Failed to get volume access control data for path "vmhba32:C1:T0:L10": Busy
2011-09-20T12:06:54.515Z cpu2:10594)VMW_SATP_LSI: satp_lsi_updatePath:680: Failed to update path "vmhba32:C1:T0:L10" Busy
2011-09-20T12:06:54.651Z cpu1:2049)NMP: nmp_ThrottleLogForDevice:2318: Cmd 0x2a (0x41244141edc0) to dev "naa.6782bcb0003fe45d0000034b3cd274d7" on path "vmhba32:C1:T0:L10" Failed: H:0x2 D:0x0 P:0x0 Possible sense data: 0x0 0x0 0x0.Act:EVAL
2011-09-20T12:06:54.795Z cpu1:2049)NMP: nmp_ThrottleLogForDevice:2318: Cmd 0x2a (0x41244141edc0) to dev "naa.6782bcb0003fe45d0000034b3cd274d7" on path "vmhba32:C1:T0:L10" Failed: H:0x2 D:0x0 P:0x0 Possible sense data: 0x0 0x0 0x0.Act:EVAL
2011-09-20T12:06:54.936Z cpu1:2049)NMP: nmp_ThrottleLogForDevice:2318: Cmd 0x2a (0x41244141edc0) to dev "naa.6782bcb0003fe45d0000034b3cd274d7" on path "vmhba32:C1:T0:L10" Failed: H:0x2 D:0x0 P:0x0 Possible sense data: 0x0 0x0 0x0.Act:EVAL
2011-09-20T12:06:55.080Z cpu1:2049)NMP: nmp_ThrottleLogForDevice:2318: Cmd 0x2a (0x41244141edc0) to dev "naa.6782bcb0003fe45d0000034b3cd274d7" on path "vmhba32:C1:T0:L10" Failed: H:0x2 D:0x0 P:0x0 Possible sense data: 0x0 0x0 0x0.Act:EVAL
2011-09-20T12:06:55.209Z cpu1:2049)WARNING: NMP: nmp_DeviceRequestFastDeviceProbe:237:NMP device "naa.6782bcb0003fe45d0000034b3cd274d7" state in doubt; requested fast path state update...
2011-09-20T12:06:55.224Z cpu1:2049)NMP: nmp_ThrottleLogForDevice:2318: Cmd 0x2a (0x41244141edc0) to dev "naa.6782bcb0003fe45d0000034b3cd274d7" on path "vmhba32:C1:T0:L10" Failed: H:0x2 D:0x0 P:0x0 Possible sense data: 0x0 0x0 0x0.Act:EVAL
2011-09-20T12:06:55.279Z cpu12:2709)iscsi_vmk: iscsivmk_ConnNetRegister: socket 0x41001f8fbc50 network resource pool netsched.pools.persist.iscsi associated
2011-09-20T12:06:55.279Z cpu12:2709)iscsi_vmk: iscsivmk_ConnNetRegister: socket 0x41001f8fbc50 network tracker id 1 tracker.iSCSI.192.168.2.254 associated
2011-09-20T12:06:55.365Z cpu1:2049)NMP: nmp_ThrottleLogForDevice:2318: Cmd 0x2a (0x41244141edc0) to dev "naa.6782bcb0003fe45d0000034b3cd274d7" on path "vmhba32:C1:T0:L10" Failed: H:0x2 D:0x0 P:0x0 Possible sense data: 0x0 0x0 0x0.Act:EVAL
2011-09-20T12:06:55.505Z cpu1:2049)NMP: nmp_ThrottleLogForDevice:2318: Cmd 0x2a (0x41244141edc0) to dev "naa.6782bcb0003fe45d0000034b3cd274d7" on path "vmhba32:C1:T0:L10" Failed: H:0x2 D:0x0 P:0x0 Possible sense data: 0x0 0x0 0x0.Act:EVAL
2011-09-20T12:06:55.519Z cpu8:9311)WARNING: VMW_SATP_LSI: satp_lsi_pathIsUsingPreferredController:714:Failed to get volume access control data for path "vmhba32:C1:T0:L10": Busy
2011-09-20T12:06:55.519Z cpu8:9311)VMW_SATP_LSI: satp_lsi_updatePath:680: Failed to update path "vmhba32:C1:T0:L10" Busy
2011-09-20T12:06:55.556Z cpu12:2709)WARNING: iscsi_vmk: iscsivmk_StartConnection: vmhba32:CH:1 T:0 CN:0: iSCSI connection is being marked "ONLINE"
2011-09-20T12:06:55.556Z cpu12:2709)WARNING: iscsi_vmk: iscsivmk_StartConnection: Sess [ISID: 00023d000002 TARGET: iqn.1984-05.com.dell:powervault.md3200i.6782bcb0003fe45d000000003ca53d0c TPGT: 1 TSIH: 0]
2011-09-20T12:06:55.556Z cpu12:2709)WARNING: iscsi_vmk: iscsivmk_StartConnection: Conn [CID: 0 L: 192.168.2.1:58205 R: 192.168.2.254:3260]
2011-09-20T12:07:06.123Z cpu13:2709)WARNING: iscsi_vmk: iscsivmk_StopConnection: vmhba32:CH:1 T:0 CN:0: iSCSI connection is being marked "OFFLINE" (Event:4)
2011-09-20T12:07:06.123Z cpu13:2709)WARNING: iscsi_vmk: iscsivmk_StopConnection: Sess [ISID: 00023d000002 TARGET: iqn.1984-05.com.dell:powervault.md3200i.6782bcb0003fe45d000000003ca53d0c TPGT: 1 TSIH: 0]
2011-09-20T12:07:06.123Z cpu13:2709)WARNING: iscsi_vmk: iscsivmk_StopConnection: Conn [CID: 0 L: 192.168.2.1:58205 R: 192.168.2.254:3260]
2011-09-20T12:07:06.123Z cpu8:2711)WARNING: NMP: nmp_DeviceRequestFastDeviceProbe:237:NMP device "naa.6782bcb0003fe45d0000034b3cd274d7" state in doubt; requested fast path state update...
2011-09-20T12:07:06.123Z cpu13:2709)WARNING: iscsi_vmk: iscsivmk_TaskMgmtIssue: vmhba32:CH:1 T:0 L:10 : Task mgmt "Abort Task" with itt=0x97e (refITT=0x97d) timed out.
2011-09-20T12:07:06.123Z cpu8:9465)WARNING: VMW_SATP_LSI: satp_lsi_pathIsUsingPreferredController:714:Failed to get volume access control data for path "vmhba32:C1:T0:L10": Failure
2011-09-20T12:07:06.123Z cpu8:9465)VMW_SATP_LSI: satp_lsi_updatePath:680: Failed to update path "vmhba32:C1:T0:L10" Failure
2011-09-20T12:07:06.208Z cpu8:9996)NMP: nmp_ThrottleLogForDevice:2318: Cmd 0x2a (0x41244141edc0) to dev "naa.6782bcb0003fe45d0000034b3cd274d7" on path "vmhba32:C1:T0:L10" Failed: H:0x2 D:0x0 P:0x0 Possible sense data: 0x0 0x0 0x0.Act:EVAL
2011-09-20T12:07:06.208Z cpu8:9996)WARNING: NMP: nmp_DeviceRequestFastDeviceProbe:237:NMP device "naa.6782bcb0003fe45d0000034b3cd274d7" state in doubt; requested fast path state update...
2011-09-20T12:07:06.351Z cpu8:2056)NMP: nmp_ThrottleLogForDevice:2318: Cmd 0x2a (0x41244141edc0) to dev "naa.6782bcb0003fe45d0000034b3cd274d7" on path "vmhba32:C1:T0:L10" Failed: H:0x2 D:0x0 P:0x0 Possible sense data: 0x0 0x0 0x0.Act:EVAL
2011-09-20T12:07:06.493Z cpu8:2056)NMP: nmp_ThrottleLogForDevice:2318: Cmd 0x2a (0x41244141edc0) to dev "naa.6782bcb0003fe45d0000034b3cd274d7" on path "vmhba32:C1:T0:L10" Failed: H:0x2 D:0x0 P:0x0 Possible sense data: 0x0 0x0 0x0.Act:EVAL
2011-09-20T12:07:06.521Z cpu10:9996)WARNING: VMW_SATP_LSI: satp_lsi_pathIsUsingPreferredController:714:Failed to get volume access control data for path "vmhba32:C1:T0:L10": Busy
2011-09-20T12:07:06.521Z cpu10:9996)VMW_SATP_LSI: satp_lsi_updatePath:680: Failed to update path "vmhba32:C1:T0:L10" Busy
2011-09-20T12:07:06.635Z cpu8:2056)NMP: nmp_ThrottleLogForDevice:2318: Cmd 0x2a (0x41244141edc0) to dev "naa.6782bcb0003fe45d0000034b3cd274d7" on path "vmhba32:C1:T0:L10" Failed: H:0x2 D:0x0 P:0x0 Possible sense data: 0x0 0x0 0x0.Act:EVAL
2011-09-20T12:07:06.764Z cpu8:2056)ScsiDeviceIO: 2316: Cmd(0x41244141edc0) 0x2a, CmdSN 0x2 to dev "naa.6782bcb0003fe45d0000034b3cd274d7" failed H:0x5 D:0x0 P:0x0 Possible sense data: 0x0 0x0 0x0.
2011-09-20T12:07:06.765Z cpu8:10760)BC: 1858: Failed to write (uncached) object 'naa.6782bcb0003fe45d0000034b3cd274d7': Timeout
2011-09-20T12:07:07.521Z cpu1:2196)WARNING: VMW_SATP_LSI: satp_lsi_pathIsUsingPreferredController:714:Failed to get volume access control data for path "vmhba32:C1:T0:L10": Busy
2011-09-20T12:07:07.521Z cpu1:2196)VMW_SATP_LSI: satp_lsi_updatePath:680: Failed to update path "vmhba32:C1:T0:L10" Busy
2011-09-20T12:07:08.886Z cpu13:2709)iscsi_vmk: iscsivmk_ConnNetRegister: socket 0x41001f8fbc50 network resource pool netsched.pools.persist.iscsi associated
2011-09-20T12:07:08.886Z cpu13:2709)iscsi_vmk: iscsivmk_ConnNetRegister: socket 0x41001f8fbc50 network tracker id 1 tracker.iSCSI.192.168.2.254 associated
2011-09-20T12:07:09.165Z cpu13:2709)WARNING: iscsi_vmk: iscsivmk_StartConnection: vmhba32:CH:1 T:0 CN:0: iSCSI connection is being marked "ONLINE"
2011-09-20T12:07:09.165Z cpu13:2709)WARNING: iscsi_vmk: iscsivmk_StartConnection: Sess [ISID: 00023d000002 TARGET: iqn.1984-05.com.dell:powervault.md3200i.6782bcb0003fe45d000000003ca53d0c TPGT: 1 TSIH: 0]
2011-09-20T12:07:09.165Z cpu13:2709)WARNING: iscsi_vmk: iscsivmk_StartConnection: Conn [CID: 0 L: 192.168.2.1:56325 R: 192.168.2.254:3260]
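
A log this repetitive is easier to read once the NMP failures are grouped by path. The sketch below is a hedged example in Python, not a VMware tool: it pulls the path, device, SCSI opcode and host status out of the `nmp_ThrottleLogForDevice` lines and counts them. The regular expression is an assumption derived from the lines above. The `HOST_STATUS` names reflect my understanding of VMware's SCSI sense-code documentation and should be checked against the current KB article; `0x2a` is the standard SCSI WRITE(10) opcode. `summarize` and `sample` are illustrative names.

```python
import re
from collections import Counter

# Assumed mapping of the "H:" host status field; verify against VMware's KB.
HOST_STATUS = {
    0x0: "OK",
    0x1: "NO_CONNECT",
    0x2: "BUS_BUSY",
    0x3: "TIMEOUT",
    0x5: "ABORT",
    0x8: "RESET",
}

# Only the opcode seen in this thread; 0x2a is WRITE(10) in the SCSI standard.
OPCODES = {0x2A: "WRITE(10)"}

NMP_FAIL = re.compile(
    r'Cmd (0x[0-9a-f]+) \(0x[0-9a-f]+\) to dev "([^"]+)" '
    r'on path "([^"]+)" Failed: H:(0x[0-9a-f]+)'
)


def summarize(lines):
    """Count NMP command failures per (path, device, opcode, host status)."""
    counts = Counter()
    for line in lines:
        m = NMP_FAIL.search(line)
        if m:
            opcode, device, path, host = m.groups()
            counts[(path, device, int(opcode, 16), int(host, 16))] += 1
    return counts


sample = [
    '2011-09-20T12:06:52.651Z cpu1:2049)NMP: nmp_ThrottleLogForDevice:2318: Cmd 0x2a (0x41244141edc0) to dev "naa.6782bcb0003fe45d0000034b3cd274d7" on path "vmhba32:C1:T0:L10" Failed: H:0x2 D:0x0 P:0x0 Possible sense data: 0x0 0x0 0x0.Act:EVAL',
    '2011-09-20T12:06:52.795Z cpu1:2049)NMP: nmp_ThrottleLogForDevice:2318: Cmd 0x2a (0x41244141edc0) to dev "naa.6782bcb0003fe45d0000034b3cd274d7" on path "vmhba32:C1:T0:L10" Failed: H:0x2 D:0x0 P:0x0 Possible sense data: 0x0 0x0 0x0.Act:EVAL',
    "2011-09-20T12:06:53.198Z cpu4:9654)ScsiCore: 1455: Power-on Reset occurred on vmhba32:C7:T0:L10",
]

for (path, device, opcode, host), n in summarize(sample).items():
    print(path, OPCODES.get(opcode, hex(opcode)), HOST_STATUS.get(host, hex(host)), n)
```

Grouping by path makes it obvious when, as here, every failure is on a single path (`vmhba32:C1:T0:L10`) rather than spread across the adapter.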

BrooklynzFinest
Contributor

I did the same thing, and now if I lose any paths to the datastores, both of my hosts cannot use any paths and all VMs become inoperable. I need to remove all the vmnics but one from the vSwitch used for iSCSI on both hosts from the CLI to regain access to the datastores, and then re-add the vmnics to the vSwitch.

BrooklynzFinest
Contributor

Dell today released new firmware for the MD Series with vSphere 5 support. It's still not back on the HCL, but I am going to apply this firmware and see if it resolves my issue.

07.80.41.60

This release of the RAID controller firmware 07.80.41.60 adds the following new features:
* Support for larger than 2TB physical disks.
* Support for up to 192 physical disks.
--- By default, the storage arrays support up to 120 physical disks.
--- The support for more than 120 physical disks is offered through a premium feature option.
* Snapshot rollback.
* Snapshot scheduler.
* Simple performance monitor.
* MD Storage Manager operation completion progress indicator for prolonged operations.
* Support for 16 snapshots per LUN with a total of 256 snapshots per storage array.
* Support for 512 virtual disks.
--- Note: The actual number of virtual disks which a single host can access depends on the host operating system and cannot exceed 256 virtual disks per host.
* Added support for the following operating systems:
--- Microsoft Windows Server 2008 R2 SP1
--- Red Hat Enterprise Linux 5.6
--- Red Hat Enterprise Linux 5.7
--- Red Hat Enterprise Linux 6.1
------ Note: Red Hat Enterprise Linux 6.0 is not supported.
--- SUSE Linux Enterprise Server 10 SP4
--- VMware ESX 4.1 Update 1
--- VMware ESXi 5

NicolaB
Contributor

finally!

I'm gonna update now.

NicolaB
Contributor

I'm still getting the same errors as above.

NicolaB
Contributor

OK, re-enabling port binding seems to make it work correctly now... I just don't understand why it sees 2 paths per LUN on the same controller, while without port binding it was only one path per controller (and that was right).

Can anyone explain?

Here are some screenshots:

1835578.png

Every VMkernel Port has only one active adapter.

JayDeah
Contributor

Are those subnets routable at layer 3? Do the vmkernels have default gateways? I'm assuming they're all on different VLANs?

EDIT: Just noticed you have 8 different subnets. That's not correct.

Read the manual and you'll see the correct connectivity diagram. I don't think your configuration is supported at all!

And don't you think 8 pNICs per host for iSCSI is a bit overkill?

NicolaB
Contributor

Why? What's the problem? Every VMkernel interface connects to its own iSCSI target, which is in its subnet. I don't see any problem with this; it is well documented and suggested in the Dell design sheet: http://www.delltechcenter.com/page/VMware+ESX+4.0+and+PowerVault+MD3000i (the difference is that I have 4 ports per controller instead of the 2 in the Dell tutorial).

JayDeah
Contributor

Then you should have ended up with 4 separate subnets, with 2 addresses per subnet on the SAN and 1 address per vmk on the host going into these 4 separate subnets.

JayDeah
Contributor

I have just applied the firmware, rebooted a host, and...

...same extreme boot time :(

I have emailed VMware support back asking them to re-open my case.

Other things to note with the firmware: you need the new version of the MD Storage Manager (1.5GB download), and it is also unclear whether you need to apply the "bridge" firmware if you already have the latest version.

Also, during the update you cannot perform a controlled path selection on ESX. It just goes and updates the controllers one after the other with no user input, so DON'T do this on a production environment without pre-approved downtime!

My guests were up and down like yo-yos for 10 minutes!

NicolaB
Contributor

ok I ended up with this config:

ESXi1:

Interface  Port Group/DVPort   IP Family IP Address                              Netmask         Broadcast       MAC Address       MTU     TSO MSS   Enabled Type
vmk0       Management Network  IPv4      10.10.3.1                               255.255.0.0     10.10.255.255   00:26:b9:58:bf:99 1500    65535     true    STATIC
vmk1       VMkernel            IPv4      172.16.125.1                            255.255.255.0   172.16.125.255  00:50:56:7e:05:58 1500    65535     true    STATIC
vmk2       iSCSI_SWA_P1        IPv4      192.168.1.1                             255.255.255.0   192.168.1.255   00:50:56:7c:6b:6f 9000    65535     true    STATIC
vmk3       iSCSI_SWA_P2        IPv4      192.168.2.1                             255.255.255.0   192.168.2.255   00:50:56:7c:19:e0 9000    65535     true    STATIC
vmk4       iSCSI_SWA_P3        IPv4      192.168.3.1                             255.255.255.0   192.168.3.255   00:50:56:7e:97:0c 9000    65535     true    STATIC
vmk5       iSCSI_SWA_P4        IPv4      192.168.4.1                             255.255.255.0   192.168.4.255   00:50:56:73:72:fc 1500    65535     true    STATIC
vmk6       iSCSI_SWB_P1        IPv4      192.168.1.2                             255.255.255.0   192.168.1.255   00:50:56:72:a5:95 9000    65535     true    STATIC
vmk7       iSCSI_SWB_P2        IPv4      192.168.2.2                             255.255.255.0   192.168.2.255   00:50:56:70:85:96 9000    65535     true    STATIC
vmk8       iSCSI_SWB_P3        IPv4      192.168.3.2                             255.255.255.0   192.168.3.255   00:50:56:7e:f9:b5 9000    65535     true    STATIC
vmk9       iSCSI_SWB_P4        IPv4      192.168.4.2                             255.255.255.0   192.168.4.255   00:50:56:78:6a:a7 9000    65535     true    STATIC

ESXi2:

Interface  Port Group/DVPort   IP Family IP Address                              Netmask         Broadcast       MAC Address       MTU     TSO MSS   Enabled Type
vmk0       Management Network  IPv4      10.10.4.1                               255.255.0.0     10.10.255.255   00:26:b9:4f:d5:0d 1500    65535     true    STATIC
vmk1       VMkernel            IPv4      172.16.125.2                            255.255.255.0   172.16.125.255  00:50:56:73:5a:35 1500    65535     true    STATIC
vmk8       iSCSI_SWB_P7        IPv4      192.168.3.4                             255.255.255.0   192.168.3.255   00:50:56:75:a8:d4 9000    65535     true    STATIC
vmk9       iSCSI_SWB_P8        IPv4      192.168.4.4                             255.255.255.0   192.168.4.255   00:50:56:79:13:43 9000    65535     true    STATIC
vmk2       iSCSI_SWA_P5        IPv4      192.168.1.3                             255.255.255.0   192.168.1.255   00:50:56:7b:49:b6 9000    65535     true    STATIC
vmk3       iSCSI_SWA_P6        IPv4      192.168.2.3                             255.255.255.0   192.168.2.255   00:50:56:7f:07:c3 9000    65535     true    STATIC
vmk4       iSCSI_SWA_P7        IPv4      192.168.3.3                             255.255.255.0   192.168.3.255   00:50:56:70:c1:12 9000    65535     true    STATIC
vmk5       iSCSI_SWA_P8        IPv4      192.168.4.3                             255.255.255.0   192.168.4.255   00:50:56:77:b8:cc 9000    65535     true    STATIC
vmk6       iSCSI_SWB_P5        IPv4      192.168.1.4                             255.255.255.0   192.168.1.255   00:50:56:78:e4:da 9000    65535     true    STATIC
vmk7       iSCSI_SWB_P6        IPv4      192.168.2.4                             255.255.255.0   192.168.2.255   00:50:56:7f:7b:06 9000    65535     true    STATIC

SAN iSCSI targets:

CTRL A:

P0: 192.168.1.253

P1: 192.168.2.253

P2: 192.168.3.253

P3: 192.168.4.253

CTRL B:

P0: 192.168.1.254

P1: 192.168.2.254

P2: 192.168.3.254

P3: 192.168.4.254

But LUN scanning takes a lot longer now, and I cannot get a correct reading of all LUNs, while it was working before!
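
With this many VMkernel ports, a stray setting is easy to miss by eye. As a rough aid, here is a small Python sketch that parses rows in the layout of the listings above and reports iSCSI port groups whose MTU differs from an expected value. The expected value of 9000 and the `iSCSI` port-group prefix are assumptions taken from the listing itself, and `parse_vmknic_rows` and `mtu_outliers` are hypothetical helper names.

```python
def parse_vmknic_rows(text):
    """Parse rows of the vmk listing; the port group name may contain spaces."""
    rows = []
    for line in text.splitlines():
        if not line.startswith("vmk"):
            continue  # skip the header and blank lines
        # The last nine columns never contain spaces, so split from the right.
        head, family, ip, netmask, bcast, mac, mtu, tso, enabled, kind = line.rsplit(None, 9)
        iface, portgroup = head.split(None, 1)
        rows.append({
            "iface": iface,
            "portgroup": portgroup.strip(),
            "ip": ip,
            "netmask": netmask,
            "mtu": int(mtu),
        })
    return rows


def mtu_outliers(rows, prefix="iSCSI", expected=9000):
    """Return rows for matching port groups whose MTU is not the expected one."""
    return [r for r in rows if r["portgroup"].startswith(prefix) and r["mtu"] != expected]


# A few rows from the ESXi1 listing, with the column padding shortened
sample = """\
Interface Port Group/DVPort IP Family IP Address Netmask Broadcast MAC Address MTU TSO MSS Enabled Type
vmk0 Management Network IPv4 10.10.3.1 255.255.0.0 10.10.255.255 00:26:b9:58:bf:99 1500 65535 true STATIC
vmk4 iSCSI_SWA_P3 IPv4 192.168.3.1 255.255.255.0 192.168.3.255 00:50:56:7e:97:0c 9000 65535 true STATIC
vmk5 iSCSI_SWA_P4 IPv4 192.168.4.1 255.255.255.0 192.168.4.255 00:50:56:73:72:fc 1500 65535 true STATIC
vmk6 iSCSI_SWB_P1 IPv4 192.168.1.2 255.255.255.0 192.168.1.255 00:50:56:72:a5:95 9000 65535 true STATIC
"""

for row in mtu_outliers(parse_vmknic_rows(sample)):
    print(row["iface"], row["portgroup"], row["mtu"])
```

The management port is ignored on purpose, since it is not expected to use jumbo frames in this setup.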

JayDeah
Contributor

That looks more like what I would expect.

Can you confirm that those iSCSI vmks do NOT have default gateways?

There should not be a way to route between the subnets, else you end up with too many routes, and the extra ones wouldn't work in a failover scenario anyway!

Also, vmk5 on ESXi1 has a 1500 MTU in the listing above.

Wait, I think I can spot a problem...

SWA_P1 and SWB_P1 are on the same subnet.

In your 8x vmk/pNIC scenario (which I still think is overkill), subnets 1+2 should be on SWA and 3+4 should be on SWB.

If a switch fails you should lose 50% of the paths.

This is all detailed in the manuals.
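
The "each subnet lives on one switch" rule can be checked mechanically. The following Python sketch is one way to do it, under stated assumptions: the switch is read from port-group labels of the form `iSCSI_SW<switch>_P<port>` (an assumed naming convention), and the addresses are the ESXi1 iSCSI ports from the listing earlier in the thread. `subnets_spanning_switches` is a made-up helper name.

```python
import ipaddress
from collections import defaultdict

# (port group, IP address, netmask) for the ESXi1 iSCSI ports as posted
VMKS = [
    ("iSCSI_SWA_P1", "192.168.1.1", "255.255.255.0"),
    ("iSCSI_SWA_P2", "192.168.2.1", "255.255.255.0"),
    ("iSCSI_SWA_P3", "192.168.3.1", "255.255.255.0"),
    ("iSCSI_SWA_P4", "192.168.4.1", "255.255.255.0"),
    ("iSCSI_SWB_P1", "192.168.1.2", "255.255.255.0"),
    ("iSCSI_SWB_P2", "192.168.2.2", "255.255.255.0"),
    ("iSCSI_SWB_P3", "192.168.3.2", "255.255.255.0"),
    ("iSCSI_SWB_P4", "192.168.4.2", "255.255.255.0"),
]


def switch_of(portgroup):
    """Extract 'SWA' or 'SWB' from a label like iSCSI_SWA_P1 (assumed format)."""
    return portgroup.split("_")[1]


def subnets_spanning_switches(vmks):
    """Return {subnet: [switches]} for subnets that appear on more than one switch."""
    seen = defaultdict(set)
    for portgroup, ip, netmask in vmks:
        net = ipaddress.ip_interface(f"{ip}/{netmask}").network
        seen[net].add(switch_of(portgroup))
    return {net: sorted(sw) for net, sw in seen.items() if len(sw) > 1}


for net, switches in subnets_spanning_switches(VMKS).items():
    print(net, "is present on", ", ".join(switches))
```

An empty result would mean every subnet is confined to a single switch, which is the layout described above.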

NicolaB
Contributor

so you mean:

ESXi1:

vmk2       iSCSI_SWA_P1        IPv4      192.168.1.1   255.255.255.0   192.168.1.255
vmk3       iSCSI_SWA_P2        IPv4      192.168.2.1   255.255.255.0   192.168.2.255
vmk4       iSCSI_SWA_P3        IPv4      192.168.1.2   255.255.255.0   192.168.1.255
vmk5       iSCSI_SWA_P4        IPv4      192.168.2.2   255.255.255.0   192.168.2.255
vmk6       iSCSI_SWB_P1        IPv4      192.168.3.1   255.255.255.0   192.168.3.255
vmk7       iSCSI_SWB_P2        IPv4      192.168.4.1   255.255.255.0   192.168.4.255
vmk8       iSCSI_SWB_P3        IPv4      192.168.3.2   255.255.255.0   192.168.3.255
vmk9       iSCSI_SWB_P4        IPv4      192.168.4.2   255.255.255.0   192.168.4.255

ESXi2:

vmk2       iSCSI_SWA_P5        IPv4      192.168.1.3   255.255.255.0   192.168.1.255
vmk3       iSCSI_SWA_P6        IPv4      192.168.2.3   255.255.255.0   192.168.2.255
vmk4       iSCSI_SWA_P7        IPv4      192.168.1.4   255.255.255.0   192.168.1.255
vmk5       iSCSI_SWA_P8        IPv4      192.168.2.4   255.255.255.0   192.168.2.255
vmk6       iSCSI_SWB_P5        IPv4      192.168.3.3   255.255.255.0   192.168.3.255
vmk7       iSCSI_SWB_P6        IPv4      192.168.4.3   255.255.255.0   192.168.4.255
vmk8       iSCSI_SWB_P7        IPv4      192.168.3.4   255.255.255.0   192.168.3.255
vmk9       iSCSI_SWB_P8        IPv4      192.168.4.4   255.255.255.0   192.168.4.255

But what would that change from the current config?
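
When re-addressing ports by hand like this, a quick consistency check of the address, netmask and broadcast columns can catch copy-and-paste slips. This is only a sketch using Python's standard `ipaddress` module; `expected_broadcast` and `mismatches` are illustrative names, and the `vmkX` row is a hypothetical, deliberately inconsistent entry rather than anything from a real host.

```python
import ipaddress


def expected_broadcast(ip, netmask):
    """Return the broadcast address implied by an IPv4 address and netmask."""
    return str(ipaddress.ip_interface(f"{ip}/{netmask}").network.broadcast_address)


def mismatches(rows):
    """Yield (name, listed, expected) for rows whose broadcast column disagrees."""
    for name, ip, netmask, listed in rows:
        expected = expected_broadcast(ip, netmask)
        if listed != expected:
            yield name, listed, expected


rows = [
    ("vmk2", "192.168.1.1", "255.255.255.0", "192.168.1.255"),
    ("vmk9", "192.168.4.2", "255.255.255.0", "192.168.4.255"),
    # Hypothetical, deliberately inconsistent row for illustration only
    ("vmkX", "192.168.1.2", "255.255.255.0", "192.168.3.255"),
]

for name, listed, expected in mismatches(rows):
    print(f"{name}: listed {listed}, expected {expected}")
```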

JayDeah
Contributor

What's your switch config? Ports + VLANs?

NicolaB
Contributor

The switch is dedicated to iSCSI. There are no VLANs, only iSCSI traffic.

You can work out the network config by decoding the labels this way:

iSCSI_SWX_PY, where X means switch A or B and Y means the port number on the switch where it's connected.

NicolaB
Contributor

I'm getting a lot of this:

2011-09-27T16:01:36.255Z cpu2:2050)NMP: nmp_ThrottleLogForDevice:2318: Cmd 0x2a (0x4124003a6000) to dev "naa.6782bcb0003fe45d00000fd13ce34283" on path "vmhba32:C1:T0:L11" Failed: H:0x2 D:0x0 P:0x0 Possible sense data: 0x0 0x0 0x0.Act:EVAL
2011-09-27T16:01:36.395Z cpu2:292372)NMP: nmp_ThrottleLogForDevice:2318: Cmd 0x2a (0x4124003a6000) to dev "naa.6782bcb0003fe45d00000fd13ce34283" on path "vmhba32:C1:T0:L11" Failed: H:0x2 D:0x0 P:0x0 Possible sense data: 0x0 0x0 0x0.Act:EVAL
2011-09-27T16:01:36.537Z cpu2:292372)NMP: nmp_ThrottleLogForDevice:2318: Cmd 0x2a (0x4124003a6000) to dev "naa.6782bcb0003fe45d00000fd13ce34283" on path "vmhba32:C1:T0:L11" Failed: H:0x2 D:0x0 P:0x0 Possible sense data: 0x0 0x0 0x0.Act:EVAL
2011-09-27T16:01:36.676Z cpu2:2050)NMP: nmp_ThrottleLogForDevice:2318: Cmd 0x2a (0x4124003a6000) to dev "naa.6782bcb0003fe45d00000fd13ce34283" on path "vmhba32:C1:T0:L11" Failed: H:0x2 D:0x0 P:0x0 Possible sense data: 0x0 0x0 0x0.Act:EVAL
2011-09-27T16:01:36.815Z cpu2:2050)NMP: nmp_ThrottleLogForDevice:2318: Cmd 0x2a (0x4124003a6000) to dev "naa.6782bcb0003fe45d00000fd13ce34283" on path "vmhba32:C1:T0:L11" Failed: H:0x2 D:0x0 P:0x0 Possible sense data: 0x0 0x0 0x0.Act:EVAL
2011-09-27T16:01:36.884Z cpu2:2050)WARNING: NMP: nmp_DeviceRequestFastDeviceProbe:237:NMP device "naa.6782bcb0003fe45d00000fd13ce34283" state in doubt; requested fast path state update...
2011-09-27T16:01:36.954Z cpu2:2050)NMP: nmp_ThrottleLogForDevice:2318: Cmd 0x2a (0x4124003a6000) to dev "naa.6782bcb0003fe45d00000fd13ce34283" on path "vmhba32:C1:T0:L11" Failed: H:0x2 D:0x0 P:0x0 Possible sense data: 0x0 0x0 0x0.Act:EVAL
2011-09-27T16:01:37.094Z cpu2:2050)NMP: nmp_ThrottleLogForDevice:2318: Cmd 0x2a (0x4124003a6000) to dev "naa.6782bcb0003fe45d00000fd13ce34283" on path "vmhba32:C1:T0:L11" Failed: H:0x2 D:0x0 P:0x0 Possible sense data: 0x0 0x0 0x0.Act:EVAL
2011-09-27T16:01:37.189Z cpu8:708973)WARNING: VMW_SATP_LSI: satp_lsi_pathIsUsingPreferredController:714:Failed to get volume access control data for path "vmhba32:C1:T0:L11": Busy
2011-09-27T16:01:37.189Z cpu8:708973)VMW_SATP_LSI: satp_lsi_updatePath:680: Failed to update path "vmhba32:C1:T0:L11" Busy
2011-09-27T16:01:37.230Z cpu2:2050)NMP: nmp_ThrottleLogForDevice:2318: Cmd 0x2a (0x4124003a6000) to dev "naa.6782bcb0003fe45d00000fd13ce34283" on path "vmhba32:C1:T0:L11" Failed: H:0x2 D:0x0 P:0x0 Possible sense data: 0x0 0x0 0x0.Act:EVAL
2011-09-27T16:01:37.373Z cpu2:292372)NMP: nmp_ThrottleLogForDevice:2318: Cmd 0x2a (0x4124003a6000) to dev "naa.6782bcb0003fe45d00000fd13ce34283" on path "vmhba32:C1:T0:L11" Failed: H:0x2 D:0x0 P:0x0 Possible sense data: 0x0 0x0 0x0.Act:EVAL

JayDeah
Contributor

So these are flat switches?

Pretty sure you can't do that either!

I'd strongly suggest you move to a setup with 2 vmks per ESX host and 2 ports per controller in use.

vmk1/pnic1 plugs into switch 1, which plugs into CTRL1_P1 and CTRL2_P1 on subnet 1

vmk2/pnic2 plugs into switch 2, which plugs into CTRL1_P2 and CTRL2_P2 on subnet 1

You can leave the unused ports on the SAN enabled; there's a bug in the MD GUI that means if you disable them it's a right pain to re-enable them!

The above config is fully supported and recommended.

BrooklynzFinest
Contributor

When I applied the firmware update I did not lose any guests during the upgrade. I did get alerts about path degradation, but that would be normal.
