VMware Cloud Community
Henning_Svane
Contributor

How to install Mellanox ConnectX-4 in ESXi 6.7

Hi

I cannot get ESXi 6.7 to load a driver for my Mellanox ConnectX-4.

In the ESXi host UI I can see the Mellanox card under Manage -> Hardware.

But it does not show up under the NICs or under the HBAs, so I am not sure where to find it.

If I run the following command from the CLI:

esxcli hardware pci list

I can see that ESXi has found the controller correctly.

But as you can see, Module ID and Module Name have not been set:

   Module ID: -1

   Module Name: None

I am using the inbox driver in ESXi 6.7, as it should support this controller, and I have upgraded the card to the newest firmware so it matches the HCL.

Any way to fix this?

0000:87:00.0
   Address: 0000:87:00.0
   Segment: 0x0000
   Bus: 0x87
   Slot: 0x00
   Function: 0x0
   VMkernel Name: vmnic4
   Vendor Name: Mellanox Technologies
   Device Name: MT27700 Family [ConnectX-4]
   Configured Owner: VMkernel
   Current Owner: VMkernel
   Vendor ID: 0x15b3
   Device ID: 0x1013
   SubVendor ID: 0x15b3
   SubDevice ID: 0x0010
   Device Class: 0x0207
   Device Class Name: Infiniband controller
   Programming Interface: 0x00
   Revision ID: 0x00
   Interrupt Line: 0x0b
   IRQ: 255
   Interrupt Vector: 0x00
   PCI Pin: 0x00
   Spawned Bus: 0x00
   Flags: 0x3201
   Module ID: -1
   Module Name: None
   Chassis: 0
   Physical Slot: 2
   Slot Description: RSC-UN4-88 SLOT2 PCI-E X8
   Passthru Capable: true
   Parent Device: PCI 0:128:3:0
   Dependent Device: PCI 0:135:0:0
   Reset Method: Function reset
   FPT Sharable: true

0000:87:00.1
   Address: 0000:87:00.1
   Segment: 0x0000
   Bus: 0x87
   Slot: 0x00
   Function: 0x1
   VMkernel Name: vmnic5
   Vendor Name: Mellanox Technologies
   Device Name: MT27700 Family [ConnectX-4]
   Configured Owner: VMkernel
   Current Owner: VMkernel
   Vendor ID: 0x15b3
   Device ID: 0x1013
   SubVendor ID: 0x15b3
   SubDevice ID: 0x0010
   Device Class: 0x0207
   Device Class Name: Infiniband controller
   Programming Interface: 0x00
   Revision ID: 0x00
   Interrupt Line: 0x0a
   IRQ: 255
   Interrupt Vector: 0x00
   PCI Pin: 0x01
   Spawned Bus: 0x00
   Flags: 0x3201
   Module ID: -1
   Module Name: None
   Chassis: 0
   Physical Slot: 2
   Slot Description: Chassis slot 2; function 1
   Passthru Capable: true
   Parent Device: PCI 0:128:3:0
   Dependent Device: PCI 0:135:0:1
   Reset Method: Function reset
   FPT Sharable: true
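
For reference, the driver side can be checked with the standard esxcli module commands. This is just a minimal sketch (the grep patterns are only examples); whether the inbox nmlx5_core module is present and enabled is exactly what I am trying to figure out:

   esxcli software vib list | grep nmlx             # is the nmlx5-core VIB installed?
   esxcli system module list | grep nmlx            # is the module enabled/loaded?
   esxcli system module set -m nmlx5_core -e true   # enable it if it shows up disabled
   esxcli network nic list                          # do vmnic4/vmnic5 appear after a reboot?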

time81
Contributor

The stock driver worked for me after upgrading from 6.5 to 6.7.

You can even install the original Mellanox driver for 6.5; that works as well.
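
If you go the async driver route, installing the Mellanox offline bundle is roughly this (the datastore path and bundle file name below are only placeholders for whatever you download from Mellanox for your ESXi version):

   esxcli software vib install -d /vmfs/volumes/datastore1/MLNX-NATIVE-ESX-ConnectX-4-5-offline_bundle.zip
   reboot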

Be prepared for problems though:

The Mellanox ConnectX-4/ConnectX-5 native ESXi driver might exhibit performance degradation when its Default Queue Receive Side Scaling (DRSS) feature is turned on.

Receive Side Scaling (RSS) technology distributes incoming network traffic across several hardware-based receive queues, allowing inbound traffic to be processed by multiple CPUs. In Default Queue Receive Side Scaling (DRSS) mode, the entire device is in RSS mode. The driver presents a single logical queue to the OS, backed by several hardware queues.

The native nmlx5_core driver for the Mellanox ConnectX-4 and ConnectX-5 adapter cards enables the DRSS functionality by default. While DRSS helps to improve performance for many workloads, it could lead to possible performance degradation with certain multi-VM and multi-vCPU workloads.
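
To see what a host is currently running with, the driver's module parameters can be listed (standard esxcli syntax; DRSS and RSS are the parameter names from the release note):

   esxcli system module parameters list -m nmlx5_core | grep -E 'DRSS|RSS'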

Workaround: If significant performance degradation is observed, you can disable the DRSS functionality.

  1. Run the esxcli system module parameters set -m nmlx5_core -p DRSS=0 RSS=0 command.
  2. Reboot the host.
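
In shell terms the workaround is just the parameter change plus a reboot. One note from my side: you normally have to quote the parameter string so that both values are passed to -p as a single argument:

   esxcli system module parameters set -m nmlx5_core -p "DRSS=0 RSS=0"   # turn DRSS/RSS off
   reboot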

This is from the GA 6.7 release notes, and I can tell you that in real life I have problems using Storage vMotion: under 6.7 this problem makes my network speed drop to 5-10 Mbit/s instead of 3 Gbit/s.

Moving VMs around in 6.5 works fast. I would say wait for a patch or use 6.5. :)

billglick
Contributor

Did you ever resolve this? I think I’m having the same issue.

billglick
Contributor

My issue (and perhaps that of the original poster) is that Mellanox does not support InfiniBand mode on their cards under ESXi 6.x. My understanding is that they supported it with older cards under ESXi 5.x, and it marginally worked under ESXi 6.0, but there is now only support for Ethernet mode on their cards under ESXi 6.x.

While Mellanox mentions this in passing in their documentation, it was not very obvious to me. The best documentation I've read about this is in the user threads at https://community.mellanox.com/thread/3379.
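
If the card's ports are in InfiniBand mode (which would match the "Infiniband controller" device class in the output above), the usual fix is to flip the port link type to Ethernet with the Mellanox Firmware Tools (mlxconfig). The sketch below assumes the MFT bundle for ESXi is installed under /opt/mellanox/bin and uses an example device name; use whatever mst status actually reports on your host (2 = ETH for the LINK_TYPE parameters):

   /opt/mellanox/bin/mst status                                            # find the device name, e.g. mt4115_pciconf0
   /opt/mellanox/bin/mlxconfig -d mt4115_pciconf0 query | grep LINK_TYPE   # show the current port types
   /opt/mellanox/bin/mlxconfig -d mt4115_pciconf0 set LINK_TYPE_P1=2 LINK_TYPE_P2=2
   reboot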
