VMware Cloud Community
JayDeah
Contributor

extremely long boot time (crash?) on ESXi5 host, pause after "iscsi_vmk started"

OK, I've got some Dell PowerEdge 1950 servers connected to an MD3200i that have been happily running ESXi 4.1 U1 for some time. I have put ESXi 5 on them using a number of scenarios, all of which have the same problem. Any thoughts?

The boot process hangs near the end of the progress bar, and the last entry on the screen is "iscsi_vmk started successfully".

If I press Alt+F12, I can see the server hasn't crashed.

I am using vCenter 5.0 with the latest Update Manager and Web Client.

I have used the following scenarios to install and configure:

1) Host upgrade using Update Manager

2) Clean install of ESXi 5, apply the old ESXi 4.1 host profile, update settings, reboot

3) Clean install of ESXi 5, clean configuration, reboot

Scenario 1 appeared to leave the host hanging; I gave up after 30 minutes and reinstalled.

Scenario 2 didn't seem to like having the ESXi 4 settings, as I ended up with my software iSCSI adapter on vmhba39. It did eventually start after about half an hour!

Scenario 3 has been hanging for about 15 minutes so far. I'm hoping it finally boots!

95 Replies
lamw
Community Manager

I would be interested to know whether the following hack lets you significantly reduce the iSCSI delay: http://www.virtuallyghetto.com/2011/10/how-to-decrease-iscsi-login-timeout-on.html

Instead of copying out the DB and updating it, you can actually do it within the ESXi 5 BusyBox console:

Change ltime to 1

# vmkiscsid -x "update discovery set 'discovery.ltime'=1"
Sending SQL Request update discovery set 'discovery.ltime'=1
iSCSI MASTER Database opened. (0xffaad008)
Request completed with status: 0

Verify ltime is set to 1

# vmkiscsid -x "select * from discovery"
Sending SQL Request select * from discovery
iSCSI MASTER Database opened. (0xffaad008)
Record iscsi_vmk,172.30.0.235,3260:
   `key`='iscsi_vmk,172.30.0.235,3260'
   `discovery.startup`='manual'
   `discovery.type`='sendtargets'
   `discovery.transport_name`='iscsi_vmk'
   `discovery.ltime`='1'
   `discovery.sendtargets.address`='172.30.0.235'
   `discovery.sendtargets.port`='3260'
   `discovery.sendtargets.auth.authmethod`='None'
   `discovery.sendtargets.auth.username`=[NOT SET]
   `discovery.sendtargets.auth.password`=[NOT SET]
   `discovery.sendtargets.auth.username_in`=[NOT SET]
   `discovery.sendtargets.auth.password_in`=[NOT SET]
   `discovery.sendtargets.timeo.login_timeout`='5'
   `discovery.sendtargets.reopen_max`='5'
   `discovery.sendtargets.timeo.auth_timeout`='45'
   `discovery.sendtargets.timeo.active_timeout`='30'
   `discovery.sendtargets.timeo.idle_timeout`='60'
   `discovery.sendtargets.timeo.noop_out_interval`='15'
   `discovery.sendtargets.timeo.noop_out_timeout`='10'
   `discovery.sendtargets.timeo.replacement_timeout`='10'
   `discovery.sendtargets.iscsi.MaxRecvDataSegmentLength`='32768'
   `node.conn[0].iscsi.HeaderDigest`='None'
   `node.conn[0].iscsi.DataDigest`='None'
   `node.conn[0].iscsi.DelayedAck`='1'
   `discovery.inheritance`='1074790399'
   `node.session.iscsi.MaxOutstandingR2T`='1'
   `node.session.iscsi.FirstBurstLength`='262144'
   `node.session.iscsi.MaxBurstLength`='262144'
   `node.conn[0].iscsi.MaxRecvDataSegmentLength`='131072'
Request completed with status: 0

Initiate backup of ESXi 5 configuration

~ # /sbin/auto-backup.sh
Binary files /etc/vmware/vmkiscsid/vmkiscsid.db and /tmp/auto-backup.3756//etc/vmware/vmkiscsid/vmkiscsid.db differ
Saving current state in /bootbank

Reboot while your iSCSI target is unavailable and see if the timing has been reduced
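
The three steps above (update, verify, back up) could be wrapped in a small script along these lines. This is only a sketch: the function names are hypothetical, the vmkiscsid and auto-backup.sh invocations are copied from the console session above, and it assumes sed and sort are available in the ESXi shell.

```shell
#!/bin/sh
# Sketch only: wraps the console steps shown in this post.
# Function names are hypothetical; the vmkiscsid and auto-backup.sh
# invocations are the ones from the session above.

# Print the SQL statement that sets discovery.ltime to the given value.
ltime_update_sql() {
    printf "update discovery set 'discovery.ltime'=%s\n" "$1"
}

# Read "select * from discovery" output on stdin and print each ltime value.
extract_ltime() {
    sed -n "s/.*\`discovery.ltime\`='\([0-9][0-9]*\)'.*/\1/p"
}

# Apply the change, check that every discovery record reports it,
# then persist the configuration so it survives a reboot.
set_iscsi_ltime() {
    value="$1"
    vmkiscsid -x "$(ltime_update_sql "$value")" || return 1
    current="$(vmkiscsid -x "select * from discovery" | extract_ltime | sort -u)"
    if [ "$current" != "$value" ]; then
        echo "ltime is '$current', expected '$value'" >&2
        return 1
    fi
    /sbin/auto-backup.sh
}
```

Calling `set_iscsi_ltime 1` on the host would then correspond to the manual steps above; nothing runs until the function is called.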

newtovms
Contributor

Is there any progress on this issue? I am getting the exact same issue with my new ESXi 5 setup with 2x Dell R710s and an MD3200i with 2 controllers. It seems to hang on 'iscsi_vmk loaded successfully' for a few minutes, but then hangs for ages on 'mask_path_plugin loaded successfully' before eventually booting up. Sometimes it will then have 'lost' the config files for the VMs and fail to load them, but the LUNs are still there and configured.

MichaelW007
Enthusiast

The patch is expected to be publicly available shortly, but I don't have an exact date. As soon as I'm aware of its availability, I'll post to this discussion.

flix21
Contributor

Catharsis7
Contributor

I'm trying to figure out how to run the patch.  What they have written on the patch page doesn't make any sense.  They say this...

  • This express patch ISO contains all of the fixes in ESXi 5.0 Patch 01, plus the software iSCSI fix.
  • VMware is delivering an ISO file for this patch release due to the nature of this issue. This is not common practice and is only done in special circumstances.

...but the download IS NOT an ISO.  The download is a ZIP file and there are no instructions in it on how to upgrade.  They speak of using the vSphere Update Manager, but when I go to download that, I see that it is a paid-for product.  (Really?  I have to pay to cleanly upgrade my software?)

The patch page says:  "ESXi hosts can be updated by manually downloading the patch ZIP file from the VMware download page and installing the VIB by using the esxcli software vib command."  But they do not say how to do this.

I guess I have to leaf through their 174-page document on upgrading from 4 to 5.0, even though that's NOT what this is.  Bottom line, there are no clear-cut instructions on how to apply this patch.

EDIT:  Looks like vihostupdate.pl doesn't work anymore, either.  When I run it, I get this:  "This operation is NOT supported on 5.0.0 platform."  Any actual instructions or bones VMware wants to throw this way would sure help.  We've waited this long, and they just give us a patch with no clear instruction how to actually apply it.  This is REALLY frustrating.

jtbatstaveren
Contributor

I just did an entity scan, and it seems that the patch ESXi500-201111401-BG is already available via Update Manager right now.

I'm currently updating an ESXi 5.0 host. I hope that this patch does fix the very long boot problem...

Catharsis7
Contributor

That's great for people who have Update Manager...

jtbatstaveren
Contributor

Yep, you're right...

I just remediated a host and yes, it did solve the problem! The complete installation with a host reboot took about 9 minutes.

That's a lot better than the 1.5 hours a host reboot took before this patch was installed.

nateccs808
Contributor

don't fret. google to the rescue.

see here on how to update without update manager: http://communities.vmware.com/people/vmroyale/blog/2011/09/15/updating-esxi-5--single-use-esxcli-how...
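
For anyone without Update Manager, the "esxcli software vib" route quoted earlier in the thread might look roughly like the sketch below when run from the host's console. The helper names and any path you pass in are hypothetical; the sketch assumes the patch ZIP has already been copied to a datastore and that the host is in maintenance mode. `esxcli software vib update -d` expects an absolute path to the offline-bundle ZIP, so it is worth checking `esxcli software vib update --help` on your build before relying on this.

```shell
#!/bin/sh
# Sketch only. patch_cmd and apply_patch are hypothetical helper names;
# the depot path is whatever absolute path the patch ZIP was uploaded to.

# Print the esxcli command for an offline-bundle ZIP.
# Rejects anything that is not an absolute path ending in .zip.
patch_cmd() {
    case "$1" in
        /*.zip) printf 'esxcli software vib update -d %s\n' "$1" ;;
        *) echo "depot must be an absolute path to a .zip file" >&2; return 1 ;;
    esac
}

# Validate the path, then run the update. The host still needs a
# reboot afterwards for the patch to take effect.
apply_patch() {
    patch_cmd "$1" >/dev/null || return 1
    esxcli software vib update -d "$1"
}
```

`patch_cmd` only prints the command, so it can be used as a dry run before calling `apply_patch` with the same path.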

Catharsis7
Contributor

nateccs808 wrote:

don't fret.

Thanks.  Actually, I had already done two servers using this method (after a lot of digging in that 174 pager).  I was going to post my results when I'm done with all servers.

Catharsis7
Contributor

Our two test ESXi servers (with no iSCSI attachments) patched fine.  I've only patched one server with iSCSI so far, and sure enough, the patch works!  It only stayed at the iscsi vmk "loaded successfully" screen for about a minute, then finished booting.  The iSCSI attachment was still there when I got back into vSphere.  Everything looks good.  Won't be able to patch the second server until this evening.

MichaelW007
Enthusiast

Great news, guys, that the patch is working successfully. Thanks to all of you for your help and patience with this. Also thanks to all the team at VMware for getting us this patch out relatively quickly. I'll be patching all my hosts in my lab tonight :)

FredericNass
Contributor

Hey there,

After patching, ESXi 5.0 host boot time went from 15 minutes to 5 minutes, but my guess is it would be even faster if it were not stuck trying to access a rarely used CD-ROM drive:

2011-11-04T09:46:08.716Z cpu12:2659)FSS: 4333: No FS driver claimed device 'mpx.vmhba1:C0:T0:L0': Not supported
2011-11-04T09:46:08.723Z cpu8:2659)VC: 1449: Device rescan time 34 msec (total number of devices 5)
2011-11-04T09:46:08.723Z cpu8:2659)VC: 1452: Filesystem probe time 13 msec (devices probed 5 of 5)
2011-11-04T09:46:08.801Z cpu8:2659)FSS: 4333: No FS driver claimed device 'mpx.vmhba1:C0:T0:L0': Not supported
2011-11-04T09:46:08.805Z cpu8:2659)VC: 1449: Device rescan time 22 msec (total number of devices 5)
2011-11-04T09:46:08.805Z cpu8:2659)VC: 1452: Filesystem probe time 10 msec (devices probed 5 of 5)
2011-11-04T09:46:22.017Z cpu8:2659)FSS: 4333: No FS driver claimed device 'mpx.vmhba1:C0:T0:L0': Not supported
2011-11-04T09:46:22.018Z cpu10:2659)VC: 1449: Device rescan time 28 msec (total number of devices 5)
2011-11-04T09:46:22.018Z cpu10:2659)VC: 1452: Filesystem probe time 10 msec (devices probed 5 of 5)

~ # esxcli storage core path list

sata.vmhba1-sata.0:0-mpx.vmhba1:C0:T0:L0
UID: sata.vmhba1-sata.0:0-mpx.vmhba1:C0:T0:L0
Runtime Name: vmhba1:C0:T0:L0
Device: mpx.vmhba1:C0:T0:L0
Device Display Name: Local TSSTcorp CD-ROM (mpx.vmhba1:C0:T0:L0)
Adapter: vmhba1
Channel: 0
Target: 0
LUN: 0
Plugin: NMP
State: active
Transport: sata
Adapter Identifier: sata.vmhba1
Target Identifier: sata.0:0
Adapter Transport Details: Unavailable or path is unclaimed
Target Transport Details: Unavailable or path is unclaimed

Hope VMware can pay attention to this...
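
To see how often each device triggers that message during boot, the log could be summarised with something like the sketch below. The function name is hypothetical, and it assumes sed, sort, uniq and awk are available in the ESXi shell and that the messages have the same shape as the excerpt above.

```shell
#!/bin/sh
# Sketch only; count_unclaimed is a hypothetical helper name.

# Read vmkernel log lines on stdin and print "<count> <device>" for each
# device named in a "No FS driver claimed device" message.
count_unclaimed() {
    sed -n "s/.*No FS driver claimed device '\([^']*\)'.*/\1/p" |
        sort | uniq -c | awk '{ print $1, $2 }'
}
```

On the host this would be fed the kernel log, for example `count_unclaimed < /var/log/vmkernel.log`, assuming that is where your build writes it.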

MichaelW007
Enthusiast

Why don't you just disconnect it if it's rarely used? The other alternative would be to disable it in the system BIOS. I don't see how this is a problem for VMware to sort out, especially if it's an unsupported device.

FredericNass
Contributor

Because "rarely" is not "never". Plus, doing so might cause trouble and unintended side effects. This is not a USB or external DVD-ROM drive but the Dell R610 server's internal drive.

And especially because this particular device has nothing to offer at that particular stage in the boot process.

MichaelW007
Enthusiast

In that case, the best option is to disable it in the BIOS until it is needed.
