VMware Cloud Community
Schorschi
Expert

Re-ordering VMNIC enumeration?

For various reasons, including ESX OS deployment via BladeLogic, the lowest PCI-enumerated NIC is not always VMNIC0. Our service console NIC is not on the same VLAN as the VM NICs or VMotion, so our standard is that VMNIC0 is always the service console NIC, and our network team always plugs the service console cable into the lowest-enumerated Ethernet port. Suppose BladeLogic assigns VMNIC4 or VMNIC5 to that physical port. I cannot just assign VMNIC4 to vswif0 and call it done; I really do have to clean up the configuration so that VMNIC0 is the lowest PCI-enumerated NIC port. Any suggestions? Using esxcfg-vswif, esxcfg-nics, and esxcfg-vswitch, I can find out where the lowest-enumerated port is, but how do I reorder the VMNIC assignments to match?

For example... If we start with...

VMNIC4 - PCI 10:0.0
VMNIC5 - PCI 10:0.1
VMNIC0 - PCI 15:0.0
VMNIC1 - PCI 15:0.1
VMNIC2 - PCI 20:0.0
VMNIC3 - PCI 20:0.1

How can we get to...

VMNIC0 - PCI 10:0.0
VMNIC1 - PCI 10:0.1
VMNIC2 - PCI 15:0.0
VMNIC3 - PCI 15:0.1
VMNIC4 - PCI 20:0.0
VMNIC5 - PCI 20:0.1
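
To make the target concrete, here is a minimal Python sketch of the renaming plan for the example above. It only computes which current name should become which new name; it does not touch any ESX configuration. The helper names (pci_key, plan_renames) are made up for illustration, and it assumes the PCI fields are hexadecimal.

```python
def pci_key(addr):
    """Turn 'bus:slot.func' into a sortable tuple (fields assumed to be hex)."""
    bus, rest = addr.split(":")
    slot, func = rest.split(".")
    return (int(bus, 16), int(slot, 16), int(func, 16))


def plan_renames(current):
    """current maps vmnic name -> PCI address; returns old name -> new name."""
    by_pci = sorted(current, key=lambda name: pci_key(current[name]))
    return {old: "vmnic%d" % index for index, old in enumerate(by_pci)}


# The starting layout from the example above.
current = {
    "vmnic4": "10:0.0", "vmnic5": "10:0.1",
    "vmnic0": "15:0.0", "vmnic1": "15:0.1",
    "vmnic2": "20:0.0", "vmnic3": "20:0.1",
}

plan = plan_renames(current)
for old in sorted(plan):
    print("%s (PCI %s) -> %s" % (old, current[old], plan[old]))
```

The open question is how to get ESX to accept that plan, which is what the rest of the thread is about.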

This question should be worth 20 points, since I suspect it is not going to be trivial or easy. Someone please prove me wrong! Thanks in advance.

Schor-

0 Kudos
16 Replies
kjb007
Immortal

Your /etc/vmware/esx.conf contains your logical to physical mappings. You can change the order there to what you want it to be, and then restart the server.

As far as order goes, the lowest-numbered PCI slot will be registered first, but in some cases the onboard NICs are not recognized immediately and the add-on cards are seen first, and hence numbered first. Second, the NIC you chose when you installed ESX will also become vmnic0, with the remaining NICs numbered accordingly.
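
A rough Python sketch of what editing those mappings could look like. It assumes the vmkname line format that appears in the script output later in this thread, and it works on a string rather than on the real /etc/vmware/esx.conf. The function name renumber is made up, and sorting the device keys as plain strings relies on them being zero-padded. Other entries that refer to vmnic names are not touched, so treat this as an illustration, not a tested fix.

```python
import re

# Matches lines such as: /device/004:00.0/vmkname = "vmnic1"
VMKNAME = re.compile(r'^/device/(?P<dev>[^/]+)/vmkname = "vmnic\d+"$')


def renumber(conf_text):
    """Reassign vmnic names so they follow the order of the device keys."""
    lines = conf_text.splitlines()
    matches = [VMKNAME.match(line) for line in lines]
    devices = sorted(m.group("dev") for m in matches if m)
    new_name = {dev: "vmnic%d" % index for index, dev in enumerate(devices)}
    out = []
    for line, m in zip(lines, matches):
        if m:
            dev = m.group("dev")
            line = '/device/%s/vmkname = "%s"' % (dev, new_name[dev])
        out.append(line)
    return "\n".join(out)


sample = "\n".join([
    '/device/004:00.0/vmkname = "vmnic1"',
    '/device/006:00.0/vmkname = "vmnic2"',
    '/device/008:01.0/vmkname = "vmnic0"',
    '/device/008:01.1/vmkname = "vmnic3"',
    '/device/016:04.0/vmkname = "vmnic4"',
    '/device/016:04.1/vmkname = "vmnic5"',
])

fixed = renumber(sample)
print(fixed)
```

Keep a copy of the original file before changing anything, and expect to re-check the vSwitch uplinks after the reboot.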

-KjB

vExpert/VCP/VCAP vmwise.com / @vmwise -KjB
0 Kudos
lamw
Community Manager

This phenomenon can be explained by using NICs from different manufacturers for your embedded and add-on ports. A detailed description and a fix are at the following link; it looks like it only affects ESX 3.5?

http://www.linuxdynasty.org/script-to-fix-vmware-esx-35-nic-reordering-after-kickstart.html

0 Kudos
Schorschi
Expert

The Python script does not appear to work? I ran it and got no errors, in fact no output at all. I am not familiar with Python, so I am not sure what I should be looking for. I named it restack.py and executed # restack.py. It did create the .orig file. Looking at esx.conf and esx.conf.orig, they appear to be different. After reboot, I did not see any difference, or did I miss a change? PCI enumeration does not match VMNIC enumeration.

0 Kudos
linuxdynasty
Enthusiast

Try the script again; I found a bug with the ordering and fixed it the other day. It would help if you sent me the output of esxcfg-nics -l.

http://www.linuxdynasty.org/script-to-fix-vmware-esx-35-nic-reordering-after-kickstart.html

0 Kudos
linuxdynasty
Enthusiast

Also, after you run the script and reboot, it should reorder your NICs correctly, but you might have to fix the mappings. Meaning, if you adjusted your mappings to work in the current state, my script will adjust the order, but then you might have to remap them using the ESX commands. If you have any more issues with the script, become a member of my site and I will help you debug your issue.

0 Kudos
Schorschi
Expert

An example...

The VMware ESX installer set up vswif0 on VMNIC0, which is not the lowest-order device on the PCI bus; the lowest-order device on the PCI bus was assigned to VMNIC1.

# esxcfg-nics -l
Name    PCI       Driver  Link  Speed     Duplex  MTU   Description
vmnic0  08:01.00  tg3     Down  0Mbps     Half    1500  Broadcom Corporation NetXtreme BCM5704S Gigabit Ethernet
vmnic1  04:00.00  bnx2    Up    1000Mbps  Full    1500  Broadcom Corporation Broadcom NetXtreme II BCM5708 1000Base-SX
vmnic2  06:00.00  bnx2    Up    1000Mbps  Full    1500  Broadcom Corporation Broadcom NetXtreme II BCM5708 1000Base-SX
vmnic3  08:01.01  tg3     Down  0Mbps     Half    1500  Broadcom Corporation NetXtreme BCM5704S Gigabit Ethernet
vmnic4  10:04.00  tg3     Down  0Mbps     Half    1500  Broadcom Corporation Broadcom BCM5715S Gigabit Ethernet
vmnic5  10:04.01  tg3     Down  0Mbps     Half    1500  Broadcom Corporation Broadcom BCM5715S Gigabit Ethernet

This results in a server that never gets network connectivity after reboot. With DHCP it is offline by default; with a static IP, the service console is still blind via vswif0, because the patched physical NIC, VMNIC1, is not bound to vSwitch0, where VMNIC0 is, and of course VMNIC0 in this case is not patched. This specific server is an IBM HS21 blade, but we have seen the same scenario in different variants on HP and Dell as well. What is really odd, per VMware, is that the NICs are split by driver as well as misaligned. The Broadcom BCM5704S NIC ports should be VMNIC0 and VMNIC1 even if the misordering happens.
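
For anyone who wants to check a host for this condition, here is a small Python sketch that parses esxcfg-nics -l style output and reports which names do not line up with PCI order. The listing is pasted in as a string (descriptions left out); the helper names are made up, and the PCI fields are assumed to be hexadecimal.

```python
LISTING = """\
Name PCI Driver Link Speed Duplex MTU Description
vmnic0 08:01.00 tg3 Down 0Mbps Half 1500
vmnic1 04:00.00 bnx2 Up 1000Mbps Full 1500
vmnic2 06:00.00 bnx2 Up 1000Mbps Full 1500
vmnic3 08:01.01 tg3 Down 0Mbps Half 1500
vmnic4 10:04.00 tg3 Down 0Mbps Half 1500
vmnic5 10:04.01 tg3 Down 0Mbps Half 1500
"""


def parse_nics(text):
    """Return (name, pci) pairs for every vmnic row, skipping the header."""
    nics = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[0].startswith("vmnic"):
            nics.append((fields[0], fields[1]))
    return nics


def pci_key(addr):
    """Turn 'bus:slot.func' into a sortable tuple (fields assumed to be hex)."""
    bus, rest = addr.split(":")
    slot, func = rest.split(".")
    return (int(bus, 16), int(slot, 16), int(func, 16))


def misordered(nics):
    """Return (name, pci, expected_name) for NICs out of PCI order."""
    by_pci = sorted(nics, key=lambda nic: pci_key(nic[1]))
    result = []
    for index, (name, pci) in enumerate(by_pci):
        expected = "vmnic%d" % index
        if name != expected:
            result.append((name, pci, expected))
    return result


for name, pci, expected in misordered(parse_nics(LISTING)):
    print("%s at %s should be %s" % (name, pci, expected))
```

On this blade it flags vmnic1, vmnic2 and vmnic0, which are exactly the three ports that are out of place.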

Running the latest script, I still see issues. It does not look like anything was recorded? Is the output not supposed to reflect the new order? And the script output shows 16, not 10, for the last two NIC ports? Should esx.conf not be in hex rather than decimal?

# python ReOrder.py
08:01.0
04:00.0
06:00.0
08:01.1
/device/004:00.0/vmkname = "vmnic1"
/device/006:00.0/vmkname = "vmnic2"
/device/008:01.0/vmkname = "vmnic0"
/device/008:01.1/vmkname = "vmnic3"
/device/016:04.0/vmkname = "vmnic4"
/device/016:04.1/vmkname = "vmnic5"
#

0 Kudos
linuxdynasty
Enthusiast

My newest update to the script should fix that. Please get it off of my site and reboot once you are done running it.

0 Kudos
Schorschi
Expert

You posted a new one in the last 10 minutes? Well 10 minutes after I grabbed what I thought was the latest? Ok, will do. (A few minutes later)

Ok this output looks better...

# python ReOrderAgain.py
08:01.0
04:00.0
06:00.0
08:01.1
/device/004:00.0/vmkname = "vmnic0"
/device/006:00.0/vmkname = "vmnic1"
/device/008:01.0/vmkname = "vmnic2"
/device/008:01.1/vmkname = "vmnic3"
/device/016:04.0/vmkname = "vmnic4"
/device/016:04.1/vmkname = "vmnic5"

But one last question: should the representation be decimal or hexadecimal? Does esxcfg-nics show hexadecimal, while esx.conf is in decimal? The bus shown as 10 by esxcfg-nics appears as 016 in esx.conf.
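
Going by the two outputs in this thread (10:04.00 from esxcfg-nics next to 016:04.0 in esx.conf), the bus number looks like hex in one place and zero-padded decimal in the other. A small Python sketch of that conversion follows. The function name is made up, and leaving the slot field unchanged is an assumption, since the thread only shows slot values that read the same in both bases.

```python
def nics_to_conf(addr):
    """Convert an esxcfg-nics style address to the esx.conf style device key.

    Only the bus conversion (hex -> 3-digit decimal) is backed by the
    outputs in this thread; the slot is passed through unchanged.
    """
    bus, rest = addr.split(":")
    slot, func = rest.split(".")
    return "%03d:%s.%d" % (int(bus, 16), slot, int(func, 16))


for addr in ("04:00.00", "08:01.01", "10:04.00", "10:04.01"):
    print("%s -> %s" % (addr, nics_to_conf(addr)))
```

That would explain the 16 versus 10 difference for the last two ports.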

0 Kudos
linuxdynasty
Enthusiast

Yeah, I did not understand why the VMware developers did that (meaning the whole conversion). But at least the script works, and thanks for your input.

If you need further help, please post on my site, as I'm trying to keep it up to date on the different projects I'm working on.

http://linuxdynasty.org Where IT pros come and share their knowledge.

0 Kudos
depping
Leadership

Cool stuff. I just blogged about it, if you don't mind.

Duncan

My virtualisation blog:

If you find this information useful, please award points for "correct" or "helpful".

linuxdynasty
Enthusiast

No problem whatsoever; actually, I would appreciate it if you posted my link on your site. http://linuxdynasty.org Where IT pros come and share their knowledge.

I'll post your link on my site, as I'm going to create a section on my site for links to other people's sites.

0 Kudos
Schorschi
Expert

I have been talking at length with VMware, and indirectly with the chief developer for kickstart customization for the VMware ESX installer. This is really a driver load sequence issue, since the VMkernel and kickstart (anaconda) do not always agree on when or how the drivers are loaded. Combine that with the fact that the first driver to load expects to own VMNIC0, and that the driver order can flip, and the potential exists for this issue.

What they said was odd was the grouping: ports of the same NIC type should be grouped together (Broadcom with Broadcom, Intel with Intel, etc.) when using dual-port NICs and taking embedded ports into account. However, in my case the specific server is a blade and has all Broadcoms, but they use different drivers: tg3 for some but not all.

The solution, per VMware, is to always use the --device option for the network option in the kickstart configuration file. However, doing this breaks some network build methods, which use the MAC address as a way to specify the NIC port. BladeLogic has this issue, for example, since it uses the MAC address in its kickstart automation process. So the only option we have is to bind all VMNIC interfaces to the service console to complete our download of additional components, then run the re-order script (thanks for the quick fix) as soon as possible in our post-build customization automation to clean up the mess. This makes things interesting if you are using DHCP (or M-DHCP) versus static.

0 Kudos
linuxdynasty
Enthusiast

Thanks for the explanation, but here is what I do not understand: why does VMware not follow the PCI ID order? If they would just follow that, then this would not be an issue at all. Actually, I found the place in their Perl code where they are doing the reordering of the NICs. We have contacted them previously and they said they would fix it in the next release. Since changing their code every time would just not work for us or anyone else, I created that Python script.

But I was glad to be able to help you and anyone else who used my script.

0 Kudos
Schorschi
Expert

I tried to get VMware to look at the issue from the perspective that PCI bus device order should be consistent across every environment, be it the BIOS, the VMware ESX OS installer (Anaconda), or the VMkernel. But something is getting lost in translation. Call it developer inflexibility? Or I guess they just do not see this issue as being as significant as end users do? I am not sure how to characterize the resistance to simply accepting that an issue exists. I have been told in explicit written language that a solution exists, i.e. use the --device method. This is not to say that VMware did not listen, but somehow they routinely discount customer feedback by default. We have seen this become more frequent in recent years; it may be the result of VMware growing faster than is reasonable. Or it may be some odd indirect impact of a mindset that VMware holds from its original management, if you get my implication. I think VMware is going to have to become more flexible in a number of ways to continue to be successful in owning virtualization throughout the industry.

0 Kudos
linuxdynasty
Enthusiast

For any of you who are interested, I continue to add new scripts/programs to my website http://linuxdynasty.org. Some contain scripts/modules about vmware/xen/libvirt.. etc..

0 Kudos
Zciklacekic
Contributor

Is there any update for vSphere 4.1? I'm having the same issue on a Dell R810 after adding a SCSI card for a passthrough backup device. vmnic4 and vmnic5 suddenly disappeared. Your script didn't help to solve the issue.

0 Kudos