When installing ESX I chose the wrong pNIC for the service console. The end result is that vmnic0 is in PCI slot 2, vmnic1 is in PCI slot 1, and so on. Is there any way to rename the vmnics so that the vmnic number matches the PCI slot number? I know I can very easily reinstall ESX, but I'd like to know if there is another way.
Jason
Jason -
Check out this python script. I use it all the time for kickstarts... http://www.linuxdynasty.org/script-to-fix-vmware-esx-35-nic-reordering-after-kickstart.html
Dave
I had trouble with that python script. The author responds to posts on the forum about it though, which deserves a nod.
But I got impatient (see attached):
usage: renum-vmnic < esx.conf > esx.conf.renum
The key to vmnic ordering in esx.conf is to remember that the devices are listed in ascending PCI address order.
The first instance of an Ethernet card in the device section of esx.conf (look for vmkname) should be vmnic0, and increment up one from there for each subsequent device with a vmkname.
Then you need to go to the pnic section and make sure that child 0000 is vmnic0 and child 0001 is vmnic1 and so on. Don't worry about the MAC addresses - they are correct. It's all in PCI address ascending order.
If you end up with more pnic children than vmnics in the device section, these are orphans from ESX trying to remap things. I delete them. You should have one device instance per pnic "child" id, with the vmkname/name being the association between them.
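The device-section rule above can be sketched in a few lines of Python. This is only a minimal illustration of the renumbering logic, not the attached script; the regex handles just the vmkname lines and assumes the zero-padded PCI address format shown elsewhere in this thread:

```python
import re

def renumber_vmnics(conf_text):
    """Renumber /device/<pci>/vmkname entries so that vmnic numbers
    follow ascending PCI address order, as esx.conf expects."""
    pattern = re.compile(r'^/device/([0-9a-fA-F:.]+)/vmkname = "vmnic\d+"')
    lines = conf_text.splitlines()
    # Collect (pci_address, line_index) for every device with a vmkname
    devices = []
    for i, line in enumerate(lines):
        m = pattern.match(line)
        if m:
            devices.append((m.group(1), i))
    # Ascending PCI address order determines the vmnic number
    for n, (pci, i) in enumerate(sorted(devices)):
        lines[i] = '/device/%s/vmkname = "vmnic%d"' % (pci, n)
    return "\n".join(lines)
```

Because the PCI fields in esx.conf are zero-padded, a plain string sort is enough here; the pnic "child" section would still need the matching fix-up described above.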
When you experience this NIC ordering insanity, as I have with DL380 G5s, you will also note that modules.conf is wrong. modules.conf is used when you boot in Linux mode.
For example:
alias eth0 e1000
alias eth1 e1000
alias eth2 e1000
alias eth3 e1000
alias eth4 bnx2
alias eth5 bnx2
In a nutshell, e1000 gets loaded first, even if the Broadcom NICs have lower PCI addresses. Change it to:
alias eth0 bnx2
alias eth1 bnx2
alias eth2 e1000
alias eth3 e1000
alias eth4 e1000
alias eth5 e1000
I'm still working on a script to reorder modules.conf in a portable and programmatic way, based on PCI order, and the device to module mappings.
If you are kickstarting via http or ftp, you may be scratching your head about connectivity. In the above example, where e1000 got loaded first by default, it was the same in the kernel on the ISO. (See for yourself: during ESX text-mode installation, press ALT-F2 once the anaconda installer has kicked off to enter the shell, then 'more /tmp/modules.conf'.)
For the DL380 G5s with this issue, I had to set ksdevice=eth4 in order to use my first onboard NIC (i.e., the first bnx2 device in modules.conf) for network installation. By the time we get to post-install, where my little awk script runs, we are in the kernel installed on the disk, so we've made this right: vmnic0 and eth0 are where they are supposed to be. But you can't do anything about the boot image other than hack the module load order and make a new ISO (I've been tempted to...). We just noted in our build book to use ksdevice=eth4 for our specific G5s.
This script is to make modules.conf correspond to PCI device ID order.
usage: make-modules_conf < /etc/vmware/esx.conf > /etc/modules.conf.renum
It will re-order your modules.conf according to PCI device order and detect modules from /etc/vmware/vmware-devices.map
Use after running the vmnic-renum script during first boot tasks.
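In essence, the final step of such a script reduces to emitting alias lines in PCI order. A bare-bones sketch of that step, assuming you have already resolved each device's driver module (e.g. from /etc/vmware/vmware-devices.map) and sorted the devices by PCI address — this is not the attached script itself:

```python
def reorder_modules_conf(modules_in_pci_order):
    """Given the NIC driver module for each device in ascending PCI
    address order, emit modules.conf alias lines so that eth0..ethN
    match the PCI ordering (e.g. bnx2 before e1000 on a DL380 G5)."""
    return "\n".join(
        "alias eth%d %s" % (i, module)
        for i, module in enumerate(modules_in_pci_order)
    )
```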
Updated with bugfix on Nov29
You can implement the NIC reordering scripts during first boot. Because networking is broken at that point, retrieving scripts during %post is a no-go. So I like to inject the files during pre-install from the kickstart:
%pre
# declare the deploy path
http_addr=http://<deploy_host>/scripts
# get the pre-install script
echo "* GETTING the pre-install script..."
wget $http_addr/scripts/pre-install.sh
echo "*"
# make a backup, create a new file, inject the http_addr variable from above
echo "* INJECTING the http deploy host URL to pre-install.sh..."
mv pre-install.sh pre-install.orig
echo "*"
echo http_addr=$http_addr > pre-install.sh
echo "*"
# append the rest of the script
cat ./pre-install.orig >> pre-install.sh
echo "*"
# make it executable
echo "* MAKING pre-install.sh executable..."
chmod 755 ./pre-install.sh
echo "*"
# so lets go
echo "* STARTING pre-install.sh..."
./pre-install.sh &
echo "*"
Attached is a vanilla version of my pre-install script.
It will install and cue up a post-install.sh of your design on second boot, after the NICs have been re-ordered.
Updated Nov29
I think the audience for this thread is clustered environments, where standardized and automated deployment is considered a requirement.
In order to deliver a consistent host install across varied hardware, you need to be absolutely sure of your NIC assignments. In a big company, many different people touch the various components of the system - one person is racking servers and patching things, another configuring switches, and yet another deploying the OS. Everything is documented, and fully tested beforehand.
I would agree that if you only have one ESX box, you clearly aren't doing HA, and probably aren't looking for this thread. You'd most likely just swap the cables around or tinker with the file by hand, as you suggest.
My hope is that the community can benefit from the time I spent to solve this problem.
P.S. I also hope that editing my posts is not spamming everyone in the thread. This post editor is a little persnickety.
Hey guys, just to let you know I updated the esx_nics_fix.py script that I have posted on http://linuxdynasty.org. It fixes 2 issues...
One was that I assumed there would not be more than one leading 0 at the beginning of the PCI ID in esx.conf; the other, caused by issue 1, was that it would not reorder the vmnic name.
Both issues are fixed... Also, I have yet to need to modify modules.conf, at both my previous employment and my current one.
the script is here http://www.linuxdynasty.org/script-to-fix-vmware-esx-35-nic-reordering-after-kickstart.html
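For anyone rolling their own variant, the leading-zero issue above is worth guarding against explicitly: string-sorting PCI IDs only works while every field is uniformly zero-padded. A small illustration (a hypothetical helper, not part of esx_nics_fix.py) that parses the fields numerically instead:

```python
def pci_key(pci_id):
    """Parse a PCI ID like '011:00.0' into numeric (bus, slot, function)
    so sorting is immune to leading-zero differences."""
    bus, rest = pci_id.split(":")
    slot, func = rest.split(".")
    return (int(bus, 16), int(slot, 16), int(func, 16))

# Same device list as the verbose example later in this thread
ids = ["011:00.0", "005:00.0", "003:00.0", "011:00.1", "015:00.0"]
ordered = sorted(ids, key=pci_key)
```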
Allen -
Kudos on the script. I used it again last week to get out of a jam. The customer forgot to disable the onboard NICs before an install and the NICs got out of order. Incidentally, it also changed the order of the onboard SAS and the PCI QLogic adapters. Not as big a deal, but it would be nice to have a script to fix this as well... it's the same type of info in the esx.conf file.
Dave
************************
Accomplishing the impossible means only that the boss will add it to your regular duties.
Doug Larson
Hey, no problem, I will see what I can do. Make sure you grab my latest script, since it fixes 2 issues in my previous esx_nic_fix.py.
Also, do me a favor: could you post this request on linuxdynasty.org in the Forums section? In the post, please give me a little more detail. For the esx_nics_fix.py script I run esxcfg-nics -l to get the output; how would I do it for the problem you are having with the QLogic adapters and SAS? And what does the entry look like in esx.conf?
Version 2.0 of the script is now available. The article is on linuxdynasty and the download link is there as well. You must become a member to download, and membership is free.
Once you run the script, all you have to do is reboot and VMware will put the NICs back in the correct order.
1 - Added a verbose option to the script.
2 - Fixed the various cases where esx.conf was not updated.
Example below...
./esx_nic_fix.py --verbose
Pre Sort
011:00.0
005:00.0
003:00.0
011:00.1
015:00.0
Post Sort
003:00.0
005:00.0
011:00.0
011:00.1
015:00.0
Original Line: /device/011:00.0/vmkname = "vmnic2"
New Line: /device/011:00.0/vmkname = "vmnic2"
Original Line: /device/005:00.0/vmkname = "vmnic0"
New Line: /device/005:00.0/vmkname = "vmnic1"
Original Line: /device/003:00.0/vmkname = "vmnic1"
New Line: /device/003:00.0/vmkname = "vmnic0"
Original Line: /device/011:00.1/vmkname = "vmnic3"
New Line: /device/011:00.1/vmkname = "vmnic3"
Original Line: /device/015:00.0/vmkname = "vmnic4"
New Line: /device/015:00.0/vmkname = "vmnic4"
You don't have to reinstall ESX; you can simply do:
esxcfg-vswitch -U vmnicx vSwitch0
esxcfg-vswitch -L vmnicx vSwitch0
The first one unlinks vmnic(x) from your service console switch (vSwitch0 by default); the second links the correct vmnic(x) to it. So you really don't need to renumber them (unless, for some reason, you are a control freak with OCD and it drives you nuts to see the numbers unmatched). Otherwise, once your system is up, it won't make a difference anyway.
The reason for the renaming with the script is automation, especially in an environment where you have to kickstart many ESX hosts. If you are running ESX on only a few hosts, then you are right: just log in to each one manually.
EXACTLY. I used this script after a kickstart and before a configure script. Works like a charm.
Dave Convery
VMware vExpert 2009
Careful. We don't want to learn from this.
Bill Watterson, "Calvin and Hobbes"
Will this script work on ESXi via RCLI? (Obviously not for the vswif NICs )
Haven't tested it at all, so can't say.
-Rich
I have not tested this script on ESXi, so I really cannot say. But if you have issues with the vmnic names being out of order relative to the PCI IDs, then I would assume it should work as well.
Will this script work on ESXi via RCLI? (Obviously not for the vswif NICs )
I don't think this will work using the RCLI because of the way the script works; it needs to be executed locally, I believe. Although I haven't tested it, I would assume it would work by entering "unsupported mode" (also sometimes referred to as "Tech Support Mode") and running it from there.
If you DO run it from the unsupported mode on the ESXi server, it WILL change vswif0 as well.
After running the script, a reboot is required. The vswif0 reference could now potentially cause a disconnect if your vswif0 is not pointing to the first NIC on the PCI bus.
The intent here is for scripted installations where the NIC order is not what is expected and needs to be reset. I have seen this primarily on HP servers. It could be differences in the way the isolinux kernel and the hypervisor kernel handle the PCI addresses.
Dave Convery
VMware vExpert 2009
It is not just HP servers that suffer from it; Dell and Fujitsu Siemens servers do as well. I have not really found it to be an issue with HP servers myself, but then I could have been lucky.
If you found this or any other answer useful, please consider using the Helpful or Correct buttons to award points.
Tom Howarth VCP / vExpert
VMware Communities User Moderator
Blog: www.planetvm.net
Contributing author for the upcoming book "VMware Virtual Infrastructure Security: Securing ESX and the Virtual Environment".
You're right, Tom. It is weird, too. I have it with the DL580 servers, but blades are OK. I did a deployment recently with 16 Dell R900 servers and had no problem with it. I include it in all deployments now anyway, just in case. I think it has something to do with the way the kernel treats the addressing of internal components vs. slot components.
Dave Convery
VMware vExpert 2009