Heiko4444
Contributor

ESXi 4.1 boots very slowly with iSCSI LUNs

Hi

vCenter 4.1 is installed in a virtual machine. There are three ESXi 4.1 Build 260247 hosts, which connect to Dell switches, which in turn connect to a Dell EqualLogic PS6000 iSCSI SAN.

All devices have the latest firmware installed. LUN access on the SAN is granted by initiator name for the ESXi hosts, and this works 100%.

The ESXi hosts are Dell MP610 blades with six physical Broadcom NetXtreme II BCM5709 NICs (bnx2 driver). The Dell SAN presents about seven LUNs to the ESXi hosts.

The ESXi iSCSI configuration is based on the following PDF: Configuring VMware vSphere Software iSCSI with Dell EqualLogic PS Series Storage (we used the software iSCSI adapter).
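
Roughly, the configuration from that document boils down to commands like the following (the vSwitch, port group, uplink, IP address and MTU below are illustrative placeholders, not the exact values of our hosts):

$ esxcfg-vswitch -a vSwitch1                    # create a vSwitch for iSCSI
$ esxcfg-vswitch -m 9000 vSwitch1               # set the vSwitch MTU for jumbo frames
$ esxcfg-vswitch -L vmnic2 vSwitch1             # link a physical uplink
$ esxcfg-vswitch -A iSCSI1 vSwitch1             # add a port group for the VMkernel NIC
$ esxcfg-vmknic -a -i 10.10.10.11 -n 255.255.255.0 -m 9000 iSCSI1   # VMkernel NIC with matching MTU
$ esxcfg-swiscsi -e                             # enable the software iSCSI initiator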

ESXi 4.1 has been installed on the internal storage. No iSCSI boot.

Problem:

Without any iSCSI LUNs presented from the SAN, the ESXi hosts boot normally and you can log on with the vSphere Client immediately. As soon as we add iSCSI LUNs, the boot process takes about 30 minutes before you can do anything. Rescans on the HBAs also take a very long time.

During the boot process we can see on the SAN that all iSCSI logins are fast and successful.

Jumbo frames have been enabled. One problem we found is that the blades' physical NICs cannot be set to an MTU of 9000, as iSCSI will then not connect. I changed it to 8000 and everything works. Even though the blade is supposed to be compatible, it seems there is still a problem. I found this article: http://communities.vmware.com/message/1570122.
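
A simple end-to-end check of the frame size is vmkping with the don't-fragment flag; the SAN address below is just a placeholder (8972 = 9000 minus 28 bytes of IP/ICMP headers, 7972 for an MTU of 8000):

$ vmkping -d -s 8972 10.10.10.50    # fails if anything in the path drops full 9000-byte frames
$ vmkping -d -s 7972 10.10.10.50    # corresponds to the MTU 8000 setting that does work for us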

The other question is that we see all physical NICs as iSCSI adapters. The NetXtreme II BCM5709 NICs appear to be the "S" (iSCSI offload) variant, as far as we can tell (see the attached screenshot).

Could it be that, because the physical NICs are iSCSI-enabled, ESXi tries to scan for LUNs on all of them? We cannot disable it, or I do not know how. :(
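
For reference, the storage adapters can be listed from the console like this (just a listing, it changes nothing); the Broadcom offload adapters show up with the bnx2i driver and the software initiator with iscsi_vmk:

$ esxcfg-scsidevs -a    # list all HBAs the host sees, including the iSCSI-capable NICs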

In short, why does ESXi 4.1 boot so slowly when we attach iSCSI LUNs, and could the cause be the NetXtreme II BCM5709?

Thank you, and I hope someone can assist.

PaulusG
Enthusiast

Hi,

You are using the software iSCSI initiator.

After you have changed the MTU value from 9000 to 8000, did you recreate the vSwitch with a value of 8000?

You can check the current settings with this command:

$ esxcfg-vmknic -l

Enabling jumbo frames or changing an MTU value means destroying and re-creating the existing vSwitch.
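
For example, rebuilding the VMkernel NIC with the new MTU would look roughly like this (the port group, vSwitch and address are placeholders, adjust them to your environment):

$ esxcfg-vmknic -d iSCSI1                                            # remove the existing VMkernel NIC
$ esxcfg-vswitch -m 8000 vSwitch1                                    # set the new MTU on the vSwitch
$ esxcfg-vmknic -a -i 10.10.10.11 -n 255.255.255.0 -m 8000 iSCSI1    # re-add it with the matching MTU
$ esxcfg-vmknic -l                                                   # verify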

When you decide to use jumbo frames, everything end-to-end has to support jumbo frames with the correct MTU. If not, you will experience weird situations (I know from my own experience :) ).

Did you test your configuration completely without Jumbo Frames?

Paul Grevink

Twitter: @PaulGrevink

If you find this information useful, please award points for "correct" or "helpful".

Heiko4444
Contributor

Hi Paulus

No, not with a value of 8000. I have since deleted and re-created everything; the iSCSI vSwitch and vmknics are all on MTU 1500 now, and it is still the same problem. :(

The SAN is at MTU 9000 and cannot be changed; the switches are set to MTU 9216 for iSCSI only.

Heiko

PaulusG
Enthusiast

Hello Heiko4444

Your reference to http://communities.vmware.com/message/1570122 shows that it is not very clear whether the NetXtreme II BCM5709 is fully supported. I think this subject needs some more investigation.

It seems you have a strange situation. Remember: if you want jumbo frames, everything MUST use the same MTU.

The SAN has an MTU of 9000 and cannot be changed.

The blade physical NICs cannot be set to an MTU of 9000, so there is a conflicting situation which has to be resolved.

Paul Grevink

Twitter: @PaulGrevink

If you find this information useful, please award points for "correct" or "helpful".

AndreTheGiant
Immortal
Immortal

Have you tried disabling the iSCSI optimization (offload) on the NIC and using the traditional software initiator instead?

Follow this document:

http://www.equallogic.com/resourcecenter/assetview.aspx?id=8453

Andre

Andrew | http://about.me/amauro | http://vinfrastructure.it/ | @Andrea_Mauro
Heiko4444
Contributor

Andre, the link does not work.

Is it similar to what is described in this PDF? http://www.dell.com/downloads/global/products/pvaul/en/ip-san-best-practices-en.pdf

My experience with Dell has not been good. I am in contact with Dell as well, but no luck so far. :(

Heiko4444
Contributor

I have been searching but cannot find an answer to the VMFS lock problem shown in the attachment. Do you have any ideas?

AndreTheGiant
Immortal

Strange... the link works for me.

Try this one instead:

http://communities.vmware.com/servlet/JiveServlet/download/1387588-29608/Configuring%20VMware%20vSph...

And the other document is not the right one, because it is for the MD3000i or similar arrays.

Regarding your error, I suggest following the configuration that requires multiple initiators.
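
As a rough sketch, binding multiple VMkernel NICs to the software iSCSI adapter looks like this (the vmk and vmhba numbers are placeholders; check yours first with esxcfg-vmknic -l and esxcfg-scsidevs -a):

$ esxcli swiscsi nic add -n vmk1 -d vmhba33    # bind the first VMkernel NIC
$ esxcli swiscsi nic add -n vmk2 -d vmhba33    # bind the second one
$ esxcli swiscsi nic list -d vmhba33           # verify the bindings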

Andre

Andrew | http://about.me/amauro | http://vinfrastructure.it/ | @Andrea_Mauro
Heiko4444
Contributor

No success, same scenario.

I will test with a single LUN and see what happens.

Any other suggestions are welcome. :)

Heiko4444
Contributor

Hi

To give some feedback:

With a single existing LUN assigned to the ESXi 4.1 host we had the same problem.

After investigating with the VMware support team, we found that all datastores were corrupted. The non-destructive remediation was to Storage vMotion all VMs to new datastores and re-create the existing datastores. This resolved the slow boot process.
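
For anyone hitting the same symptom, a quick (and rough) way to spot it is to look for lock and heartbeat errors in the VMkernel messages after a slow boot; I am assuming here that the VMkernel log still sits in /var/log/messages on ESXi 4.1:

$ grep -iE "lock|heartbeat|corrupt" /var/log/messages | tail -n 50    # recent lock/corruption-related messages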
