VMware Cloud Community
vm7user
Enthusiast

WARNING: LinNet: netdev_watchdog: NETDEV WATCHDOG: vmnic0: transmit timed out

Hello,

I have a host running ESXi 6.0 (build 2494585), installed from the HP-customized ISO (from the HP site).

Recently the host and its VMs lost network connectivity. Only a host reboot helped.

Some data from log:

....

cpu21:33241)WARNING: LinNet: netdev_watchdog:3678: NETDEV WATCHDOG: vmnic0: transmit timed out
cpu21:33241)<3>bnx2 0000:04:00.1: vmnic0: <--- start FTQ dump --->

....

cpu21:33241)WARNING: at vmkdrivers/src_92/vmklinux_92/vmware/linux_net.c:3707/netdev_watchdog() (inside vmklinux)
cpu21:33241)Backtrace for current CPU #21, worldID=33241, rbp=0x430572898170
cpu21:33241)0x4390cec9be10:[0x41801e296b4e]vmk_LogBacktraceMessage@vmkernel#nover+0x22 stack: 0x430572898170, 0
cpu21:33241)0x4390cec9be30:[0x41801e91e7b7]watchdog_work_cb@com.vmware.driverAPI#9.2+0x27f stack: 0x430572861ce
cpu21:33241)0x4390cec9bea0:[0x41801e944a5f]vmklnx_workqueue_callout@com.vmware.driverAPI#9.2+0xd7 stack: 0x4305
cpu21:33241)0x4390cec9bf30:[0x41801e24f872]helpFunc@vmkernel#nover+0x4e6 stack: 0x0, 0x430572861ce0, 0x27, 0x0,
cpu21:33241)0x4390cec9bfd0:[0x41801e41231e]CpuSched_StartWorld@vmkernel#nover+0xa2 stack: 0x0, 0x0, 0x0, 0x0, 0
....

I found this article: http://kb.vmware.com/kb/1029070

My host also has a Broadcom NIC, but this article is not applicable to ESXi 6.0.
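For anyone wanting to check how often a host hits this, here is a minimal Python sketch that counts watchdog timeouts per NIC in an exported vmkernel log. The regular expression is modelled only on the log line quoted above, and the function name `count_watchdog_timeouts` is made up for illustration; other builds or drivers may format the message differently.

```python
import re
from collections import Counter

# Pattern modelled on the line quoted in the post; this is an assumption,
# not an official description of the vmkernel log format.
WATCHDOG_RE = re.compile(r"NETDEV WATCHDOG: (?P<nic>\w+): transmit timed out")


def count_watchdog_timeouts(lines):
    """Return a Counter mapping NIC name -> number of transmit timeouts."""
    counts = Counter()
    for line in lines:
        match = WATCHDOG_RE.search(line)
        if match:
            counts[match.group("nic")] += 1
    return counts


# Two lines copied from the post: only the first is a watchdog timeout.
sample = [
    "cpu21:33241)WARNING: LinNet: netdev_watchdog:3678: NETDEV WATCHDOG: vmnic0: transmit timed out",
    "cpu21:33241)<3>bnx2 0000:04:00.1: vmnic0: <--- start FTQ dump --->",
]

if __name__ == "__main__":
    print(count_watchdog_timeouts(sample))
```

To use it on a real host, pass the lines of a copied log file instead of `sample`, for example `count_watchdog_timeouts(open(path))` with a path of your choosing.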


18 Replies
VMwareAndrew
VMware Employee

Please file a support request with VMware Support and we can get you more info.

cesprov
Enthusiast

This is currently a known issue with 6.0, even though they haven't publicly acknowledged it yet. It is supposedly related to how the new version handles CPU interrupts. Open an SR like the guy above says. They have a workaround script to fix this, but it needs to be re-applied to each host after every reboot.

cesprov
Enthusiast

See here for a workaround script.

SeanH2309
Contributor

Update 1a was released yesterday to fix this issue.

VMware KB: VMware ESXi 6.0, Patch ESXi600-201510401-BG: Updates esx-base

time81
Contributor

Did anyone get logs like this AFTER installing this new patch?

'Issue detected on esx1 in cluster1: IntrCookie: 3411: Interrupt received on invalid vector (cpu 5, vector 70); ignoring it. 

(2015-10-08T12:44:31.184Z cpu5:40473)'

All 8 of my ESXi hosts on DL380 G7 have been showing these since I installed that patch yesterday.

The newer DL380 G9 hosts don't have it (other network cards?).

fr8rt8rt
Contributor

So the fix is to ignore the invalid vector? Can that lead to packet loss? Will the message repeat a large number of times? Can you provide more information? Thank you!

BjornJohansson
Enthusiast

Hey internet QA, since VMware QA had some... bad luck with some of the previously released updates: any problems reported with ESXi 6.0 Update 1a? :)

I updated some of my HP BL460c Gen9 and ran Veeam backups. All good so far.

pdraycott
Contributor

Exactly the same here. 3 hosts, all DL380 G7, show these messages after the patch. Are there any fixes for this, and is it something to worry about?

graoult
Contributor

I have the same error after installing the 6.0 Update 1a patch:

ALERT: IntrCookie: 3411: Interrupt received on invalid vector (cpu 13, vector 79); ignoring it.

ESXi on a DL380 G7, with HP firmware and bundle up to date.

I have opened a support case.

pdraycott
Contributor

I have a support case open too. Nothing back yet as they are still examining the logs, and I have informed them of this thread.

I will post any updates.

barthur68
Contributor

2015-10-15T23:02:22.382Z cpu14:32901)ALERT: IntrCookie: 3411: Interrupt received on invalid vector (cpu 14, vector 71); ignoring it.

Clean install of 6.0 U1a on a DL360 G7 using VMware's base image. I didn't use HP's because they haven't released U1 yet and I was hit with the issue in KB2124669.

I also don't see many updates on vibsdepot.hp.com for my server...

wgrixti
Contributor

I am getting the same issue on my Dell R715.

I had upgraded from ESXi 5.5 to ESXi 6.0 using the Dell Custom ISO: VMware-VMvisor-Installer-6.0.0-3029758.x86_64-Dell_Customized-A00

The server was booting with no error messages.

I then applied the hotfix ESXi600-201510001.zip, downloaded directly from VMware, to fix the issues in KB 2124669.

After applying the fix I saw the following near the end of the boot sequence: cpu0:32768 IntrCookie: 3411: Interrupt received on invalid vector (cpu 0, vector 27); ignoring it.

I have tried upgrading again using the newer Dell Custom ISO: VMware-VMvisor-Installer-6.0.0.update01-3073146.x86_64-Dell_Customized-A01

Still no joy. I have opened a support case with VMware, as I am reluctant to put the server back into production until I know what the error is.

wgrixti
Contributor

Just so everyone knows I got a response from support.

The warning we are seeing is a result of the patch for NETDEV WATCHDOG. This means the error has been encountered and has been dealt with.

In the next update there will be better handling for the error.

graoult
Contributor

I got the same response from support: "wait for the next update with the log error, or downgrade to 5.5U3."

Strangely, the four ESXi hosts are strictly identical (FW & HW), but one of them has no error.

lukaszAG
Contributor

I have exactly the same issue on 3 HP hosts DL380 G7, all three hosts were patched yesterday with the 201510401-BG critical patch.

Error on each host:

Description: "Alarm 'Host error' on esx01.office.corp triggered by event 2275635 'Issue detected on esx01.office.corp in My Office: IntrCookie: 3411: Interrupt received on invalid vector (cpu 1, vector 72); ignoring it. (2015-10-27T08:54:42.978Z cpu1:32888)'"
Type: Error
Date Time: 27/10/2015 08:55:52
Target: esx01.office.corp

I also get the following warning:

Description: "IntrCookie: 3411: Interrupt received on invalid vector (cpu 9, vector 72); ignoring it. (2015-10-27T08:55:46.363Z cpu9:36037)"
Type: Warning
Date Time: 27/10/2015 08:55:46
Target: esxi01.office.corp

And possible cause points to:

- Virtual machine creating might fail because the agent was unable to retrieve virtual machine creation options from the host

The possible cause above doesn't actually prevent VM creation; I was able to provision a new VM.

Should I be worried about the above error/warning? How can I fix it?

Lukasz

jfnoriega
Contributor

I have a similar setup and issue. I currently have 3x HP ProLiant DL370 hosts in a cluster. After adding the 3rd host to the cluster and having it upgraded and patched with the latest patches as of two weeks ago, I started receiving these alerts:

**************************************************************************************************************************************************************************************************************************

[VMware vCenter - Alarm alarm.HostErrorAlarm] Issue detected on esxi-3 in Datacenter: IntrCookie: 3411: Interrupt received on invalid vector (cpu 0, vector 70); ignoring it. (2015-10-28T21:44:49.412Z cpu0:32887)

Stateless event alarm

Alarm Definition:

([Event alarm expression: Host error] OR [Event alarm expression: Host warning])

Event details:

Issue detected on esxi-3 in Datacenter: IntrCookie: 3411: Interrupt received on invalid vector (cpu 0, vector 70); ignoring it. (2015-10-28T21:44:49.412Z cpu0:32887)

**************************************************************************************************************************************************************************************************************************
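To compare hosts, CPUs and vectors across many of these alarm mails, the event text can be pulled apart with a regular expression. This is a rough Python sketch: the pattern is inferred only from the event strings quoted in this thread, and `parse_event` is a hypothetical helper name, not part of any VMware tooling.

```python
import re
from datetime import datetime

# Pattern inferred from the "Issue detected on ..." strings in this thread;
# treat it as an assumption about the format, not a specification.
EVENT_RE = re.compile(
    r"Issue detected on (?P<host>\S+) in (?P<location>[^:]+): "
    r"IntrCookie: \d+: Interrupt received on invalid vector "
    r"\(cpu (?P<cpu>\d+), vector (?P<vector>\d+)\); ignoring it\.\s*"
    r"\((?P<timestamp>\S+) cpu\d+:\d+\)"
)


def parse_event(text):
    """Return a dict with host, location, cpu, vector and timestamp, or None."""
    match = EVENT_RE.search(text)
    if match is None:
        return None
    return {
        "host": match.group("host"),
        "location": match.group("location"),
        "cpu": int(match.group("cpu")),
        "vector": int(match.group("vector")),
        # Timestamps in the thread look like 2015-10-28T21:44:49.412Z
        "timestamp": datetime.strptime(
            match.group("timestamp"), "%Y-%m-%dT%H:%M:%S.%fZ"
        ),
    }


# Event text copied from the alarm quoted above.
event_text = (
    "Issue detected on esxi-3 in Datacenter: IntrCookie: 3411: Interrupt "
    "received on invalid vector (cpu 0, vector 70); ignoring it. "
    "(2015-10-28T21:44:49.412Z cpu0:32887)"
)

if __name__ == "__main__":
    print(parse_event(event_text))
```

Collecting the parsed dicts from all hosts makes it easy to see whether the same vectors keep recurring on the affected models.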

Is anyone getting something similar?


For the users who have gotten a response from VMware tech support, did they mention an ETA for that "next update"?



lukaszAG
Contributor

Hi,

I have the same problem and I've opened the support case with VMware and this is their response:

"Engineering have an internal PR for this issue which is due to be resolved in patch 2. We do not have a date as of when patch 2 will be released.

This message can be safely ignored until the patch is released. I will update you as soon as we have any further information. This may be a week or 2."

Because of this issue the event log is filling up the vCenter DB, and as soon as the DB reaches the 10 GB limit the vCenter service shuts down.

shredder08
Contributor

Same problem here on two HP DL360 Gen7, but no problems on DL360p Gen8.

I also opened a ticket and was told that the problem can be ignored and that the log entries will be suppressed in the next esx-base update. Today I installed Patch Release ESXi600-201511001 (from 25.11.2015) with no luck; the problem persists. Sorry VMware, this is not acceptable!


Does anyone have any ideas on how to filter the event in the logs?
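One offline option, sketched here in Python, is to strip the message from an exported copy of the log before reading it. This does not change what ESXi or vCenter records, so it will not stop the vCenter DB from growing; it only makes a log export easier to read. The pattern is based on the messages quoted in this thread, and `drop_invalid_vector_lines` is a made-up name.

```python
import re

# Matches the IntrCookie message as quoted in this thread (an assumption
# about the format; adjust it if your hosts log it differently).
INTR_RE = re.compile(
    r"IntrCookie: \d+: Interrupt received on invalid vector "
    r"\(cpu \d+, vector \d+\); ignoring it\."
)


def drop_invalid_vector_lines(lines):
    """Return the input lines without the 'invalid vector' messages."""
    return [line for line in lines if not INTR_RE.search(line)]


# Lines copied from earlier posts: one IntrCookie alert, one watchdog warning.
sample = [
    "2015-10-15T23:02:22.382Z cpu14:32901)ALERT: IntrCookie: 3411: Interrupt received on invalid vector (cpu 14, vector 71); ignoring it.",
    "cpu21:33241)WARNING: LinNet: netdev_watchdog:3678: NETDEV WATCHDOG: vmnic0: transmit timed out",
]

if __name__ == "__main__":
    for line in drop_invalid_vector_lines(sample):
        print(line)
```

The watchdog warning is deliberately kept, since that is the message that indicates a real loss of connectivity.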
