VMware Cloud Community
rmuniz9336
Enthusiast

Interesting issue installing NICs

Hi everyone,

Got a totally bizarre one here. We're running ESX 3.0.1 on a Dell PowerEdge 6800 with 8 processors, 32 GB of RAM, and almost a terabyte of disk space. Recently, we decided to install a couple of additional NICs. We purchased the NICs (Broadcom, verified by Dell to be the ones needed) and went to install them. Of course, the physical side of it was easy. We turned the machine back on, the OS ran through, found the new hardware, we told it to go ahead and configure it, and waited for the machine to come up.

Now here's where things got interesting. Previously, I'd been able to see the VMs in the VMFS folder just fine. Now, I couldn't see them at all. In fact, before the change, if I ran a df -k, I'd see the sda partitions broken down into sda1 through sda8, plus the sdb1 and sdc1 partitions. Now, all I see is sda2, which apparently houses the OS, etc. If I do an fdisk -l, I see all my partitions, but can't seem to do a thing with them.
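For anyone hitting the same thing, the quickest way to see the discrepancy is to compare what's actually mounted against what the kernel still reports. This is just a sketch; the device names match my box, and yours may differ:

```shell
# Mounted filesystems only -- with the new NICs installed, just sda2 shows here:
df -k
# The partition table itself -- sda1 through sda8 are still all listed:
fdisk -l /dev/sda 2>/dev/null
# The kernel's current view of attached block devices and partitions:
cat /proc/partitions
```

If the partitions show up in fdisk and /proc/partitions but not in df, the disks and their partition tables are intact; it's the mounting/attachment layer (in our case, something ESX-side) that lost track of them.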

I shut down the machine, pull out the cards, turn it back on, and everything is back to normal. The breakdown seems to be between the OS and ESX: I can see the NICs in the BIOS and the OS, but everything on the ESX side has an issue anytime they're installed.

I contacted Dell (since they are our support for the product) and sent them the results of my vm-support, and they're as stumped as I am. The best suggestion they had was a few hardware driver upgrades, and to upgrade from 3.0.1 to 3.0.2.

Wondering if anyone else has seen this, and what you did to fix the problem.

Rich

4 Replies
Schorschi
Expert

Have you checked the assigned interrupt vectors? Just a thought. Sounds like your storage controller is getting slammed on the PCI bus.

We have had so many quirks with PCI bus enumeration and the ESX OS that we no longer add hardware to a running host; when we add PCI devices, we migrate off the host and reload the ESX OS with the hardware already in place the way we want it. You could try running esxcfg-boot (read up on the command-line options), though in effect you have already done that with the new hardware discovery. I don't have docs in front of me, but there is an option to force the ESX host to relearn its configuration; it might be worth trying that again and seeing if it turns up anything interesting.

Our favorite issue... just to illustrate PCI Bus oddness...

Two Dell 6850s that for all practical purposes are identical in every respect: cards in the same slots, same cards by make and model, same processors, same versions of firmware/BIOS, etc. Yet one sees the Broadcom NICs BEFORE the Intel NICs on the PCI bus (per enumeration), and the other sees the Intel NICs BEFORE the Broadcom NICs. Running lspci on both shows the actual enumeration values vary, hence the change in discovery order. We believe there is a difference in the mainboard hardware that is so far below the radar of OMSA and similar tools that it gets missed. Want to bet the mainboards are rev 0.0.1 and 0.0.1b or some such? LOL
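To make that concrete, here's roughly how we compared the two hosts. The filenames and the sample captures below are made up for illustration (on the real boxes you'd run the lspci line on each host and copy the files together):

```shell
# On each host, capture the NIC enumeration order:
#   lspci | grep -i ethernet > /tmp/nics-$(hostname).txt
# Simulated captures showing the swapped order on our two "identical" 6850s:
printf '05:00.0 Broadcom BCM5704\n07:00.0 Intel 82546EB\n' > /tmp/nics-hostA.txt
printf '05:00.0 Intel 82546EB\n07:00.0 Broadcom BCM5704\n' > /tmp/nics-hostB.txt
# Any diff output means the two boards enumerate the cards differently:
diff /tmp/nics-hostA.txt /tmp/nics-hostB.txt || echo "enumeration order differs"
```

Since the slot/bus addresses come straight out of BIOS enumeration, a diff here with otherwise identical hardware points at a board-level difference, exactly the kind of thing OMSA won't surface.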

cmanucy
Hot Shot

How many free slots do you have to install these NICs in (and how much time do you have with the host down?!?).

PCI enumeration is a real pain... some vendors are worse than others. I agree that sometimes re-installing ESX is quicker (keep your /etc/vmware/esx.conf!).

---- Carter Manucy
rmuniz9336
Enthusiast

We have four slots on this board, and I've tried different slots. The bad part is that it's crunch time for some of the developers, so I'm snagging an hour here and there trying to figure the problem out. Beginning to suspect that a reinstall will be our best solution.

rmuniz9336
Enthusiast

Wouldn't doubt that the boards are like you say. All I know is I sure got some folks scratching their heads over at Dell on this one. Interesting thought on the interrupt vectors, though; I'll have to pursue that one. We've looked at the IRQs, but never that. Will pursue it a bit and see what we see.
