VMware Cloud Community
Azmidiske
Contributor

Intel 82599 10 Gigabit Dual Port Network Adapter missing after ESXi 6.0 to 6.5u1 upgrade

After upgrading a Cisco UCS server using the Vmware-ESXi-6.5.0-5969303-Custom-Cisco-6.5.1.2.zip image, an Intel 82599 10 Gigabit Dual Port Network Adapter is missing. The missing 10Gb adapter happens to be the one with the active ports. The link on the switch is still active, and the system still sees all of the adapters at the PCI level, including the missing one:

lspci -v | grep -A1 -i ethernet

0000:02:00.0 Ethernet controller Network controller: Intel Corporation I350 Gigabit Network Connection [vmnic0]

         Class 0200: 8086:1521

--

0000:02:00.1 Ethernet controller Network controller: Intel Corporation I350 Gigabit Network Connection [vmnic1]

         Class 0200: 8086:1521

--

0000:02:00.2 Ethernet controller Network controller: Intel Corporation I350 Gigabit Network Connection [vmnic2]

         Class 0200: 8086:1521

--

0000:02:00.3 Ethernet controller Network controller: Intel Corporation I350 Gigabit Network Connection [vmnic3]

         Class 0200: 8086:1521

--

0000:04:00.0 Ethernet controller Network controller: Intel(R) 82599 10 Gigabit Dual Port Network Connection [vmnic4]

         Class 0200: 8086:10fb

--

0000:04:00.1 Ethernet controller Network controller: Intel(R) 82599 10 Gigabit Dual Port Network Connection [vmnic5]

         Class 0200: 8086:10fb

--

0000:07:00.0 Ethernet controller Network controller: Intel(R) 82599 10 Gigabit Dual Port Network Connection [vmnic6]

         Class 0200: 8086:10fb

--

0000:07:00.1 Ethernet controller Network controller: Intel(R) 82599 10 Gigabit Dual Port Network Connection [vmnic7]

         Class 0200: 8086:10fb

--

0000:83:00.0 Ethernet controller Network controller: Intel(R) 82599 10 Gigabit Dual Port Network Connection [vmnic10]

         Class 0200: 8086:10fb

--

0000:83:00.1 Ethernet controller Network controller: Intel(R) 82599 10 Gigabit Dual Port Network Connection [vmnic11]

         Class 0200: 8086:10fb

Here are the device contents of /etc/vmware/esx.conf:

/device/00000:131:00.0/vmkname = "vmnic10"

/device/00000:002:00.1/vmkname = "vmnic1"

/device/00000:004:00.0/vmkname = "vmnic4"

/device/00000:002:00.3/vmkname = "vmnic3"

/device/00000:007:00.0/vmkname = "vmnic6"

/device/00000:007:00.1/vmkname = "vmnic7"

/device/00000:130:00.0/vmkname = "vmhba0"

/device/00000:002:00.2/vmkname = "vmnic2"

/device/00000:002:00.0/vmkname = "vmnic0"

/device/00000:131:00.1/vmkname = "vmnic11"

/device/00000:004:00.1/vmkname = "vmnic5"
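
A note for anyone cross-checking these entries against the lspci output: the bus numbers in esx.conf appear to be decimal, while lspci prints them in hex (131 decimal is 0x83, which matches vmnic10 at 0000:83:00.0). A minimal sketch of the conversion follows. The function name esx_to_pci is made up for illustration, and treating every field as decimal is an assumption inferred from the output in this post, not from VMware documentation.

```shell
# Convert an esx.conf device address (fields assumed to be decimal)
# into lspci-style notation (hex fields).
# esx_to_pci is a hypothetical helper name, not an ESXi tool.
esx_to_pci() {
    # awk reads each field as a decimal number, so leading zeros are harmless
    printf '%s\n' "$1" | awk -F'[:.]' '{ printf "%04x:%02x:%02x.%x\n", $1, $2, $3, $4 }'
}
```

For example, esx_to_pci 00000:131:00.0 gives the address that lspci shows for vmnic10.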

But ESXi is not registering one of them, the adapter previously seen as vmnic4 and vmnic5:

esxcfg-nics -l

Name    PCI          Driver      Link Speed      Duplex MAC Address       MTU    Description

vmnic0  0000:02:00.0 igbn        Up   1000Mbps   Full   50:57:a8:e1:3c:34 1500   Intel Corporation I350 Gigabit Network Connection

vmnic1  0000:02:00.1 igbn        Down 0Mbps      Half   50:57:a8:e1:3c:35 1500   Intel Corporation I350 Gigabit Network Connection

vmnic10 0000:83:00.0 ixgbe       Down 0Mbps      Half   90:e2:ba:99:27:1c 1500   Intel(R) 82599 10 Gigabit Dual Port Network Connection

vmnic11 0000:83:00.1 ixgbe       Down 0Mbps      Half   90:e2:ba:99:27:1d 1500   Intel(R) 82599 10 Gigabit Dual Port Network Connection

vmnic2  0000:02:00.2 igbn        Down 0Mbps      Half   50:57:a8:e1:3c:36 1500   Intel Corporation I350 Gigabit Network Connection

vmnic3  0000:02:00.3 igbn        Down 0Mbps      Half   50:57:a8:e1:3c:37 1500   Intel Corporation I350 Gigabit Network Connection

vmnic6  0000:07:00.0 ixgbe       Down 0Mbps      Half   90:e2:ba:99:95:88 9000   Intel(R) 82599 10 Gigabit Dual Port Network Connection

vmnic7  0000:07:00.1 ixgbe       Down 0Mbps      Half   90:e2:ba:99:95:89 9000   Intel(R) 82599 10 Gigabit Dual Port Network Connection
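
Comparing the two listings by eye gets tedious on hosts with many ports, so here is a rough sketch that prints the vmnic names present in the lspci output but absent from esxcfg-nics -l. The function name missing_nics and the file paths in the usage note are hypothetical. It assumes both outputs were saved to files and that they look like the ones pasted above.

```shell
# Print vmnic names that lspci knows about but esxcfg-nics does not list.
# $1: file holding `lspci -v | grep -A1 -i ethernet` output
# $2: file holding `esxcfg-nics -l` output
# missing_nics is a hypothetical helper name, not an ESXi tool.
missing_nics() {
    # Names registered with the network stack (first column of esxcfg-nics -l)
    registered=$(awk '$1 ~ /^vmnic[0-9]+$/ { print $1 }' "$2")
    # Names assigned at the PCI level (the [vmnicN] suffix in lspci output)
    sed -n 's/.*\[\(vmnic[0-9][0-9]*\)\].*/\1/p' "$1" | while read -r nic; do
        # -x forces a whole-line match so vmnic1 does not match vmnic10
        printf '%s\n' "$registered" | grep -qx "$nic" || printf '%s\n' "$nic"
    done
}
```

Usage would be something like saving the two outputs to /tmp/lspci.txt and /tmp/nics.txt and running missing_nics /tmp/lspci.txt /tmp/nics.txt. With the outputs in this post it should list vmnic4 and vmnic5.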

Updating to the latest driver does not seem to help:

esxcli software vib update -v "/vmfs/volumes/datastore1/net-ixgbe_4.5.3-1OEM.600.0.0.2494585.vib"

   VIBs Installed: INT_bootbank_net-ixgbe_4.5.3-1OEM.600.0.0.2494585

   VIBs Removed: INT_bootbank_net-ixgbe_4.4.1-1OEM.600.0.0.2159203

I also tried this setting:

esxcli system settings kernel set --setting="netNetqueueEnabled" --value="FALSE"

Here are the currently installed drivers when querying for ixgbe:

esxcli software vib list | grep ixgbe

net-ixgbe                      4.5.3-1OEM.600.0.0.2494585             INT                 VMwareCertified   2018-01-26

ixgben                         1.4.1-2vmw.650.1.26.5969303            VMW                 VMwareCertified   2018-01-26

The driver that worked with version 6.0 was 3.7.13 but I was unable to find/download it for 6.5u1.

net-ixgbe                      3.7.13.7.14iov-20vmw.600.0.0.2494585  VMware  VMwareCertified   2016-02-11

Any troubleshooting tips would be appreciated!

7 Replies
daphnissov
Immortal

What's the output of vsish -e get /net/pNics/vmnicN/properties | grep Driver, where vmnicN is the name of the vmnic backed by that adapter?

Azmidiske
Contributor

Not found

vsish -e get /net/pNics/vmnic4/properties | grep Driver

VSISHCmdGetInt():Get failed: Not found

vsish -e get /net/pNics/vmnic5/properties | grep Driver

VSISHCmdGetInt():Get failed: Not found

But when run against one that ESXi does see:

vsish -e get /net/pNics/vmnic6/properties | grep Driver

   Driver Name:ixgbe

   Driver Version:4.5.3-iov

   Driver Firmware Version:0x61b50001, 0.324.97

   Module Interface Used By The Driver:vmklinux

daphnissov
Immortal

I was trying to find a reference to that firmware and cross check against that driver. You might want to see if there is a later firmware available for it. I've seen something like this before when the firmware of the card isn't at a suitably compatible level with the driver generation in use.

Azmidiske
Contributor

Upgraded to the latest firmware for the X520-DA2 NICs and the system, but no change. Something interesting though: when I remove the transceivers and restart the system, ESXi then sees the NIC, but it will not bring up the interfaces if I plug the transceivers back in while it sees them. At this point, it looks like I will have to downgrade, as I am seeing the same symptoms on another Cisco UCS C240 host with the same 10Gb NICs.

gdev87
Contributor

Hey Azmidiske,

Did the downgrade fix the issue? If so, which version did you downgrade to that worked? I am seeing the same issue with a UCS 240M4 and Intel X520-DA2 NICs. It was working fine before, but as soon as I upgraded to ESXi 6.5 the port is no longer available to select, even though the link is up. After a Google search, it seems the X520-DA2 will only support Intel SFP+ modules. If I use a Cisco SFP+, the link will come up but the port is not available/seen in the ESXi web client. Checking the vmkernel log shows me this message:

[root@ESXI-1:~]less /var/log/vmkernel.log | grep -i 'SFP'

2018-02-06T21:45:26.196Z cpu36:66244)<3>ixgbe 0000:86:00.0: failed to load because an unsupported SFP+ or QSFP module type was detected.

Strange how it was working before. Not sure how to proceed at this point other than purchasing an Intel SFP+. I wanted to see if a downgrade helped before I do that.
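
For anyone checking several hosts for the same symptom, here is a small sketch that pulls out the PCI addresses of ports the ixgbe driver rejected for this reason. The function name unsupported_sfp_ports is made up for illustration, and it assumes the log message has exactly the wording shown above. Other driver versions may word it differently.

```shell
# Read vmkernel log text on stdin and print the PCI address of each port
# that ixgbe refused to load because of the transceiver type.
# unsupported_sfp_ports is a hypothetical helper name, not an ESXi tool.
unsupported_sfp_ports() {
    # Capture the address between "ixgbe " and ": failed to load ..."
    sed -n 's/.*ixgbe \([0-9a-fA-F:.]*\): failed to load because an unsupported SFP+ or QSFP module type was detected.*/\1/p' | sort -u
}
```

It could be run as unsupported_sfp_ports < /var/log/vmkernel.log. The addresses it prints can then be matched against the lspci output to see which vmnics are affected.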

Azmidiske
Contributor

Not yet. I tried downgrading to 5.5 and the symptoms were worse: a coredump requiring a restart when I plugged the SFPs in. I was running Juniper SFPs and 6.0.x before. I upgraded to 6.5 and switched to Brocade SFPs at the same time. Not sure if the Juniper SFPs would have continued working with 6.5, but I no longer have access to them to test.

However, I installed Cisco 1000BASE-T copper SFPs in the troubled 10G NICs and they work fine. So it is definitely a quirk between 6.5 and the 10G SFPs. I hope I can try some Intel or Cisco branded 10G SFPs soon.

gdev87
Contributor

I finally gave in and borrowed an Intel 10G SFP+ (FTLX8571D3BCVIT1) and it worked. These newer versions of ESXi are becoming a pain to manage (the web client feels clunky). I guess I will leave the others on 6.0 and hope someone comes up with a workaround.
