VMware Cloud Community
besson3c
Contributor

e1000_clean_tx_irq: Detected Tx Unit Hang

Hello,

I'm running VMware ESXi 4.0 on a Shuttle PC with an Intel Pro PCI NIC. I can ping this machine about half of the time; the other half of the time the packets are dropped. I cannot access the machine remotely in any way, including via its web page. The built-in network diagnostics work only part of the time. Pinging my gateway IP in particular is what fails intermittently.
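One way to put a number on "about half of the time" is to read the summary line that ping prints at the end of a run. The following is a minimal Python sketch, assuming the common "N packets transmitted, M received" summary format (with or without the word "packets" before "received"). The function name packet_loss_percent and the sample line are illustrative and were not captured from this machine.

```python
import re

# Matches both "10 packets transmitted, 5 received" and
# "10 packets transmitted, 5 packets received".
SUMMARY_RE = re.compile(r"(\d+) packets transmitted, (\d+) (?:packets )?received")


def packet_loss_percent(ping_output):
    """Return the packet loss as a percentage, computed from the summary counts."""
    match = SUMMARY_RE.search(ping_output)
    if match is None:
        raise ValueError("no ping summary line found")
    sent, received = int(match.group(1)), int(match.group(2))
    if sent == 0:
        raise ValueError("no packets were transmitted")
    return 100.0 * (sent - received) / sent


# Hypothetical summary line, for illustration only.
SAMPLE_SUMMARY = "10 packets transmitted, 5 packets received, 50% packet loss"

if __name__ == "__main__":
    print(packet_loss_percent(SAMPLE_SUMMARY))
```

Computing the figure from the two counts, rather than reading the printed percentage, keeps the result independent of how a given ping build rounds it.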

I've tested this same NIC on a different OS, no problems there. The machine itself works fine in other OSes, as does my gateway/router. Looking at the ESXi logs I'm seeing screen after screen of errors such as:

e1000_clean_tx_irq: Detected Tx Unit Hang

and

LinNet: netdev_watchdog: NETDEV WATCHDOG: vmnic0: transmit timed out
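To show how often each of the two messages appears, rather than describing it as "screen after screen", the log lines could be tallied with a short script. This is a rough Python sketch: the function name count_nic_errors and the dictionary keys are made up for illustration, and the log lines are passed in as a list, so no particular log file location is assumed.

```python
# Substrings taken from the two messages quoted in this post.
PATTERNS = {
    "tx_unit_hang": "e1000_clean_tx_irq: Detected Tx Unit Hang",
    "watchdog_timeout": "NETDEV WATCHDOG: vmnic0: transmit timed out",
}


def count_nic_errors(log_lines):
    """Count how many lines contain each known error message."""
    counts = {name: 0 for name in PATTERNS}
    for line in log_lines:
        for name, needle in PATTERNS.items():
            if needle in line:
                counts[name] += 1
    return counts


# The two messages from the post, plus one unrelated placeholder line.
SAMPLE_LINES = [
    "e1000_clean_tx_irq: Detected Tx Unit Hang",
    "LinNet: netdev_watchdog: NETDEV WATCHDOG: vmnic0: transmit timed out",
    "e1000_clean_tx_irq: Detected Tx Unit Hang",
    "some unrelated line",
]

if __name__ == "__main__":
    print(count_nic_errors(SAMPLE_LINES))
```

Pasting the resulting counts, together with the time span the log covers, gives readers a concrete error rate to work with.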

I'm brand new to ESXi and am attempting to evaluate it as we speak. I'm not expecting some sort of magic cure for this sort of kernel related bug, but is there anything that I can provide in the way of information that might be useful? I'm assuming that there is no way for me to get this to work on my hardware at this time...

Thanks in advance for your time and attention!

11 Replies
besson3c
Contributor

Anybody? I'd love some ideas, thoughts, etc.!

Rubeck
Virtuoso

What are the negotiated NIC speed and duplex settings? 100 Mbit or gigabit? Half or full duplex?

What does this NIC connect to?

/Rubeck

besson3c
Contributor

Dropping into the hidden console and doing an "esxcfg-nics -l" I see a single line with the following properties:

name: vmnic0
pci: 00:0b.00
driver: e1000
link: up
speed: 1000Mbps
duplex: full
MAC Address: a mac address
MTU: 1500
Description: Intel Corporation PRO/1000 GT Desktop Adapter

The NIC is connected to my machine's PCI slot. Does this help? I hope so :)
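The properties listed above answer the speed and duplex question, and the same check can be scripted. The Python sketch below parses "key: value" lines as they are transcribed in this post. That is an assumption about the input, since the real "esxcfg-nics -l" output is laid out as a table. The names parse_properties and negotiated are illustrative.

```python
import re


def parse_properties(text):
    """Turn 'key: value' lines into a dict with lower-cased keys."""
    props = {}
    for raw in text.splitlines():
        # Split on the first colon only, so values such as 00:0b.00 stay whole.
        key, sep, value = raw.partition(":")
        if sep and value.strip():
            props[key.strip().lower()] = value.strip()
    return props


def negotiated(props):
    """Return (link state, speed in Mbit/s, duplex) from parsed properties."""
    speed = int(re.match(r"\d+", props["speed"]).group())
    return props["link"], speed, props["duplex"]


# Values as transcribed in the post above.
SAMPLE = """\
name: vmnic0
pci: 00:0b.00
driver: e1000
link: up
speed: 1000Mbps
duplex: full
MTU: 1500
"""

if __name__ == "__main__":
    print(negotiated(parse_properties(SAMPLE)))
```

Reading only the leading digits of the speed field means the unit suffix does not matter.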

besson3c
Contributor

Interesting: after doing an "esxcfg-nics -a" I get:

link: down
speed: 0mbps
duplex: half

Manually trying to assign properties (-s, -d) does not change this, I'm assuming because the link state is now down. I don't see a way to force the link state back to up, so I rebooted the machine to see if it would accept some manual assignments. Same deal: it starts with the link up and all of the expected settings, but assigning a speed (I tried -s 100) forces the link down and produces the output above.

js-hacki
Contributor

I had the same error, but not on an ESX machine.

It was a Debian machine, and the included e1000 driver was messy.

So I updated it and got rid of this error. I don't know if it's possible to update the driver on ESXi.

I've heard that turning TSO off could help...

Rubeck
Virtuoso

I've seen this issue too on an ESX3.5U4... I just removed the NIC as it wasn't needed anyway..

But the TSO thingy is a good start, imo...

http://kb.vmware.com/selfservice/microsites/search.do?cmd=displayKC&docType=kc&externalId=1013413&sl... 0 59089434

/Rubeck

besson3c
Contributor

#ethtool -K vmnic0 tso off
Cannot set device tcp segmentation offload settings: Function not implemented

I guess tso is already off though:

#ethtool -k vmnic0
Offload parameters for vmnic0:
Cannot get device tx csum settings: Function not implemented
Cannot get device tcp segmentation offloading settings: Function not implemented
rx-checksumming: off
tx-checksumming: off
scatter-gather: on
tcp segmentation offload: off
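The report above mixes "Function not implemented" errors with on/off settings, so it helps to separate the two before concluding that TSO is off. This Python sketch parses the text exactly as pasted in this post. The function name parse_offload_report is illustrative, and the sample is the output quoted above.

```python
def parse_offload_report(text):
    """Split an 'ethtool -k' style report into on/off settings and error lines."""
    settings, errors = {}, []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        # Lines such as "Cannot get device ...: Function not implemented".
        if line.startswith("Cannot "):
            errors.append(line)
            continue
        key, sep, value = line.partition(":")
        value = value.strip()
        # Header lines end in a colon with no value and are skipped here.
        if sep and value in ("on", "off"):
            settings[key.strip()] = (value == "on")
    return settings, errors


# Output as pasted in the post above.
REPORT = """\
Offload parameters for vmnic0:
Cannot get device tx csum settings: Function not implemented
Cannot get device tcp segmentation offloading settings: Function not implemented
rx-checksumming: off
tx-checksumming: off
scatter-gather: on
tcp segmentation offload: off
"""

if __name__ == "__main__":
    settings, errors = parse_offload_report(REPORT)
    print(settings)
    print(errors)
```

Because two of the queries failed, the "off" values for tx-checksumming and TSO may just be defaults printed after the error rather than the driver's real state. Keeping the error lines next to the settings makes that caveat visible.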

besson3c
Contributor

I don't get this. Why is this NIC having so many problems? I thought that this was one of the most common NICs available, and known to be 100% compatible with ESXi?

Is there another NIC I should try instead? This one was bought recently from NewEgg, so I can return or replace it. Like I said, it works beautifully in Ubuntu as far as I can tell, so I don't think the problem is a failing card.

besson3c
Contributor

Rubeck wrote:
> I've seen this issue too on an ESX3.5U4... I just removed the NIC as it wasn't needed anyway..
> But the TSO thingy is a good start, imo...
> http://kb.vmware.com/selfservice/microsites/search.do?cmd=displayKC&docType=kc&externalId=1013413&sl... 0 59089434

I'm not sure whether you were suggesting I try disabling flow control and that is why you included that link, but I tried it anyway, and I'm still unable to obtain a DHCP lease or to connect to my host with the vSphere client when I manually assign an IP. I'm still getting some request timeouts when pinging too... So, no fix :(

Any ideas and any theories as to why this NIC is giving me headaches?

besson3c
Contributor

It looks like I need to install a driver, as described here:

http://communities.vmware.com/message/1297009

Unfortunately, this is a pretty silly chicken-and-egg sort of thing, since I need to be able to access my server via vSphere in order to install the driver. Grrrrr!

Any creative ideas as far as how I can install this?

besson3c
Contributor

Probably best to post to my other thread: http://communities.vmware.com/thread/250732

Sorry about this, but I can see that there is little point in messing around with trying to get this card to work w/o installing the driver...
