VMware Cloud Community
CatHat
Contributor

Updated from 4.0U2 to 4.1, network adapters fail.

Hello there!

Just 10 minutes ago I updated my ESXi whitebox from 4.0U2 (working perfectly) to 4.1 through the VMware vSphere CLI. The update itself went fine, but I can't connect to the machine afterwards. If I ping the machine from my Windows host it replies until I try to connect to ESXi over HTTPS, the vSphere Client, or SSH; then it times out for a while and I can't connect.

I checked the server "(1) messages log" and it says:

e1000_clean_tx_irq : Detected TX unit hang

netdev_watchdog : NETDEV WATCHDOG : vmnic2 : transmit timed out

and after a while:

vmnic2 is down. Affected portgroup: Management network. 0 uplinks up. Failed criteria : 130

and after a while it comes back up:

Uplink vmnic2 had recovered from a transient failure due to watchdog timeout.
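
For anyone who wants to pick these events out of a copied log, here is a minimal Python sketch. It assumes the messages look like the lines quoted above (they were typed by hand, so the real log format may differ), and the `classify` helper and its labels are made up for illustration.

```python
import re

# Substrings taken from the messages quoted in this post, each with a short
# illustrative label. These labels are not ESXi terminology.
PATTERNS = [
    ("tx_hang", re.compile(r"Detected TX unit hang")),
    ("tx_timeout", re.compile(r"(vmnic\d+)\s*:\s*transmit timed out")),
    ("link_down", re.compile(r"(vmnic\d+) is down")),
    ("recovered", re.compile(r"Uplink (vmnic\d+) had recovered")),
]


def classify(line):
    """Return (label, nic) for a known message, or None if nothing matches."""
    for label, pattern in PATTERNS:
        match = pattern.search(line)
        if match:
            # Only some patterns capture the NIC name.
            nic = match.group(1) if pattern.groups else None
            return label, nic
    return None


sample = [
    "e1000_clean_tx_irq : Detected TX unit hang",
    "netdev_watchdog : NETDEV WATCHDOG : vmnic2 : transmit timed out",
    "vmnic2 is down. Affected portgroup: Management network. 0 uplinks up.",
    "Uplink vmnic2 had recovered from a transient failure due to watchdog timeout.",
]
events = [classify(line) for line in sample]
for event in events:
    print(event)
```

Seeing the same NIC go through timeout, down, and recovered in a loop would fit the "replies to ping, then times out" behaviour described above.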

I can't export the log since I can't connect to the machine 😕

What should I do to fix this? Is it a faulty e1000 driver shipped with the update?

My setup (motherboard and NIC):

Asus M4A785TD-V EVO, Socket-AM3, and Intel® PRO/1000 MT Dual Port Server Adapter

Thanks!

28 Replies
CatHat
Contributor

OK, I've managed to track the error down to the motherboard and not the NIC itself. I tried 4 different NICs; they were all detected and they all fail when I try to connect. I did a BIOS update but it didn't help.

I'm really running out of ideas about what could be wrong here. I tried installing ESXi 4.0 again and it works like a charm, so the problem is definitely something introduced in ESXi 4.1 (I also did a clean 4.1 installation from a CD, but the problem is the same).

Guess I have to revert to ESXi 4.0U2 until the next patch or a fix comes out. Too bad...

DSTAVERT
Immortal

Is the Intel NIC a server NIC or a desktop one? Have you checked the download page for Drivers and Tools to see if there are Intel NIC drivers?

These are some of the unfortunate problems with whitebox builds. Your combination of parts doesn't get tested and may or may not get included in patches or updates. I don't mean to give you a hard time, but if this is at all production I would seriously look at supported hardware.

-- David -- VMware Communities Moderator
ashleymilne
Enthusiast

Is your hardware on the HCL? If not, then although it might be a costly endeavour, running VMware on supported hardware is always the best way to go, as updates can cause unexpected issues on unsupported hardware.

CatHat
Contributor

The NIC (Intel® PRO/1000 MT Dual Port Server Adapter) is on the HCL; the motherboard isn't.

I fully understand that supported hardware is the way to go, but I'm a student just trying to learn ESXi and I can't afford certified parts with decent performance, so that's why I built the whitebox (it's been running without problems for about a year). I checked the ESXi Drivers & Tools page but found nothing for my chipset (Intel 82546) :(

I've finally managed to extract the log from /var/log/vmware; I will attach it to this post. If there are other logs of more importance, just ask and I'll try to get them too. Hopefully someone else has had the same problem and found a workaround :)

DSTAVERT
Immortal

There are multiple versions of the card and not all are supported on 4.x, but since you do have it working on 4.0 you may have the right one. Have a close look at the card listing on the HCL and look at the subtext entries. Check whether there are NIC firmware updates, or whether the listed firmware versions differ from the one on your NIC.

-- David -- VMware Communities Moderator
DSTAVERT
Immortal

You can use lspci to see what hardware has been detected.

Use vmkload_mod --list to display the drivers that have been loaded.

-- David -- VMware Communities Moderator
CatHat
Contributor

I'm a little unsure what you meant by subtext entries, but there are two relevant HCL entries that I found:

(1)

(2)

Both of them say they are supported by ESXi 4.1 😕

I also checked Intel's website for firmware, but there doesn't appear to be any for my NIC.

Thanks so far!

DSTAVERT
Immortal

(1)

(2)

Does your card match either of those entries? One of them points to a Sun version of the card. See whether your DID (device ID) matches.

Part of this may be that, although the card is a supportable version, it does not present the PCI ID value that enables the driver, or the correct driver, to load.

Have a look at the following:

http://vm-help.com/esx40i/Hardware_support.php#Intel

-- David -- VMware Communities Moderator
CatHat
Contributor

I attached the output of lspci and vmkload_mod --list, and also a picture of the chipset itself.

I looked into the DID, but it's a little confusing.

A search for 82546GB returned and http://pci-ids.ucw.cz/read/PC/?restrict=0?action=jump. I don't know how to tell which of the listed devices is exactly mine, but if I check the ESXi supported list, DIDs 108a and 10b5 match my 82546GB card. If this is correct, then ESXi loads the e1000.o driver, which is loaded according to the vmkload_mod dump, correct? 😕

Edit: the link doesn't work, but this is what pci-ids.ucw.cz returns for my chip:

ID                    Name                                          Parent
PCI device 8086:105b  82546GB Gigabit Ethernet Controller (Copper)  Intel Corporation
PCI device 8086:1079  82546GB Gigabit Ethernet Controller           Intel Corporation
PCI device 8086:107a  82546GB Gigabit Ethernet Controller           Intel Corporation
PCI device 8086:107b  82546GB Gigabit Ethernet Controller           Intel Corporation
PCI device 8086:108a  82546GB Gigabit Ethernet Controller           Intel Corporation
PCI device 8086:1099  82546GB Gigabit Ethernet Controller (Copper)  Intel Corporation
PCI device 8086:109b  82546GB PRO/1000 GF Quad Port Server Adapter  Intel Corporation
PCI device 8086:10b5  82546GB Gigabit Ethernet Controller (Copper)  Intel Corporation

Thanks!

DSTAVERT
Immortal

See if there is anything here that is useful

http://communities.vmware.com/thread/250070

-- David -- VMware Communities Moderator
CatHat
Contributor

Thanks again, but I've already checked that thread (and 1000 others on Google); so far I've come up empty-handed. It seems I've got a fairly unique error here :(

DSTAVERT
Immortal

Looking at the table for the adapters shows different DID values. There is an e1000 and an e1000e driver. ESX(i) consults a file to decide which driver to load: /etc/vmware/simple.map

Try lspci -v and compare the DID value with the entry in simple.map.

Beyond this you just may need to wait but??

-- David -- VMware Communities Moderator
CatHat
Contributor

On the working ESXi 4.0U2, lspci -v returned a DID value of 8086:1079 for the NIC; that matches the entry 8086:1079 0000:0000 network e1000.o in simple.map.

When I check the same on ESXi 4.1, lspci -v returns a DID value of 8086:1079 and simple.map has 8086:1079 0000:0000 network e1000.

So the only difference is that ESXi 4.1 has e1000 and ESXi 4.0U2 has e1000.o :S
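
A small Python sketch of that comparison, assuming each simple.map entry has the four whitespace-separated fields seen in the two lines above (IDs, sub-IDs, class, module) and that a trailing `.o` is only a module file suffix. `parse_map_line` is a hypothetical helper, not an ESXi tool.

```python
def parse_map_line(line):
    """Split one simple.map-style entry into its four fields (assumed layout)."""
    ids, sub_ids, dev_class, module = line.split()
    vendor, device = ids.split(":")
    # Assumption: "e1000.o" and "e1000" name the same driver module.
    if module.endswith(".o"):
        module = module[:-2]
    return {
        "vendor": vendor,
        "device": device,
        "sub_ids": sub_ids,
        "class": dev_class,
        "module": module,
    }


entry_40u2 = parse_map_line("8086:1079 0000:0000 network e1000.o")
entry_41 = parse_map_line("8086:1079 0000:0000 network e1000")

# Under the assumption above, both releases map the NIC to the same driver.
same_mapping = entry_40u2 == entry_41
print(same_mapping)
```

If the two entries compare equal under that assumption, the mapping file itself is unlikely to be what changed between the releases.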

" Beyond this you just may need to wait but??"

I'm sorry, but I didn't really understand this part :)?

Thanks!

DSTAVERT
Immortal

The e1000.o is just the actual module name.

As for waiting, you may need to wait for an update to 4.1.

Try unloading and reloading the module with vmkload_mod.

-- David -- VMware Communities Moderator
mdwasim
Contributor

What's the speed set on the NIC? If it's 1000, try setting it to 100 and see if it works fine.

I am facing a NIC problem with a Broadcom NIC: at 1000 it doesn't work, but at 100 it works well.

esxcfg-nics -l will show you the list of NICs along with the speed set on the active NIC.

esxcfg-nics -s 100 -d full vmnic0 will set vmnic0 to 100 Mbit, full duplex.

See if this solves the problem.

DSTAVERT
Immortal

Have a look through this download link

http://downloads.vmware.com/d/details/esx_esxi40_intel_82575_82576_dt/ZHcqYmR0QGpidGR3

-- David -- VMware Communities Moderator
CatHat
Contributor

On the speed question: it was already at 100 Mbit, and setting it manually had no effect on the problem.

DSTAVERT: I'm a little unsure how I should install the drivers on that CD (even if they're not for my chipset?). Do I press something during the setup process to specify additional drivers to load?

Thanks for the good support so far!

DSTAVERT
Immortal

I didn't look very closely to check whether your chipset was on the list. It's just another spot to look. Intel CDs come with bundles for ESXi that are installed after the fact with the CLI tools.

-- David -- VMware Communities Moderator