VMware Cloud Community
Eugen12334
Contributor

High latency on ESXi 6.7 with Samsung 970 EVO Plus 1TB

Hi dear VMware community! I'm a rookie with ESXi and have run into a problem with my host, which I describe below. I hope someone can help me with it. Thank you all in advance!

So my setup:

- CPU Intel Xeon Silver 4214

- Supermicro MBD-X11DPL-I-O - ATX

- Supermicro NVMe AOC-SLG3-2M2 with two Samsung M.2 disks in it (Samsung 970 EVO Plus 1TB)

- 4 x Crucial 32GB DDR4-2666 RDIMM

ESXi runs from a USB flash drive.

I installed the Intel-nvme-vmd driver (1.8.0.1001-1oem.670.0.0.8169922) for the vmhba adapter, because with the default driver (iavmd_1.2.0.1011-2vmw.670.0.0.8169922) the host crashes with a purple screen (PSOD).

The problem described below appears some time after I power on the host and start my set of virtual machines (usually after ~20 hours of running under normal load).

All the VMs are configured identically and are spread across the two datastores (some VMs on the 1st disk, some on the 2nd):

Disk - Thick provisioned, eagerly zeroed, with NVMe controller

Network - VMXNET 3

Other options are mostly default.

Free space on both datastores is more than 50%.

So the problem is very high latency in the performance monitor:

1q.png

I started to dig deeper and found very high KAVG at the times when the high latency above shows up (it easily jumps to 1000+):

2q.png
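
For anyone who wants to catch these KAVG spikes without watching the screen, here is a minimal sketch that scans a CSV export for kernel-latency samples above a cutoff. It assumes the export has a timestamp in the first column and that the kernel-latency columns contain the text "Average Kernel MilliSec/Command" in their header; that marker, the simplified column names, the sample values and the 2.0 cutoff are all assumptions for illustration, so check them against your own export.

```python
import csv
import io

# Assumption: header cells of the kernel-latency columns contain this text.
# Verify against the header of your own export before relying on it.
KAVG_MARKER = "Average Kernel MilliSec/Command"

# Made-up sample with simplified, hypothetical column names (not real data).
SAMPLE = "\n".join([
    "Time,vmhba1 Average Kernel MilliSec/Command,vmhba2 Average Kernel MilliSec/Command",
    "t0,0.01,0.02",
    "t1,0.02,1250.0",
])


def find_kavg_spikes(csv_text, threshold_ms=2.0):
    """Return (timestamp, column, value) for each kernel-latency sample above threshold_ms.

    threshold_ms is an arbitrary illustrative cutoff, not an official limit.
    """
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    # Indexes of all columns that look like kernel-latency counters.
    kavg_cols = [i for i, name in enumerate(header) if KAVG_MARKER in name]
    spikes = []
    for row in reader:
        for i in kavg_cols:
            try:
                value = float(row[i])
            except (ValueError, IndexError):
                continue  # skip blank or malformed cells
            if value > threshold_ms:
                spikes.append((row[0], header[i], value))
    return spikes


if __name__ == "__main__":
    for timestamp, column, value in find_kavg_spikes(SAMPLE):
        print(timestamp, column, value)
```

Grouping the hits by column should make it easy to confirm whether the spikes stay with one adapter or disk.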

What is strange is that this high KAVG appears only on one particular SSD (let's call it disk "A"). I tried swapping the disks between the slots on the PCIe card, and the high KAVG stayed with disk A.

At the same time I do not see any queuing taking place:

3q.png

Also, the latency seen inside the VMs looks pretty normal to me...

4q.png

I switched off hardware acceleration (VAAI), but it did not help.

I attached 2 log files:

- vmkernel

- vmkwarning

There are a lot of errors and warnings like:

HppThrottleLogForDevice:564: Cmd 0x42 (0x459a4233da40, 2108176) to dev "t10.NVMe____Samsung_SSD_970_EVO_Plus_1TB____________S4EWNF0M717097A_____00000001" on path "vmhba2:C0:T1:L0" Failed:

2020-02-14T08:56:48.253Z cpu3:2097198)WARNING: HPP: HppThrottleLogForDevice:570: Error status H:0xc D:0x0 P:0x0 Invalid sense data: 0x0 0x0 0x0.
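
To see how often each status shows up and whether the failures all land on the same device and path, the attached logs can be tallied with a short script. This is only a sketch: it assumes every relevant line has the same shape as the two lines quoted above, and the function name `summarize` is made up for illustration. It extracts the H/D/P values as numbers and does not try to interpret them.

```python
import re
from collections import Counter

# Patterns derived only from the two log lines quoted in this post.
STATUS_RE = re.compile(r"H:(0x[0-9a-fA-F]+) D:(0x[0-9a-fA-F]+) P:(0x[0-9a-fA-F]+)")
FAILED_RE = re.compile(r'to dev "([^"]+)" on path "([^"]+)" Failed')

# The two lines quoted above, used as sample input.
SAMPLE_LINES = [
    'HppThrottleLogForDevice:564: Cmd 0x42 (0x459a4233da40, 2108176) to dev '
    '"t10.NVMe____Samsung_SSD_970_EVO_Plus_1TB____________S4EWNF0M717097A_____00000001" '
    'on path "vmhba2:C0:T1:L0" Failed:',
    '2020-02-14T08:56:48.253Z cpu3:2097198)WARNING: HPP: HppThrottleLogForDevice:570: '
    'Error status H:0xc D:0x0 P:0x0 Invalid sense data: 0x0 0x0 0x0.',
]


def summarize(lines):
    """Count (host, device, plugin) status triples and failing (device, path) pairs."""
    statuses = Counter()
    failures = Counter()
    for line in lines:
        status = STATUS_RE.search(line)
        if status:
            # Store the hex fields as integers, e.g. 0xc becomes 12.
            statuses[tuple(int(field, 16) for field in status.groups())] += 1
        failed = FAILED_RE.search(line)
        if failed:
            failures[(failed.group(1), failed.group(2))] += 1
    return statuses, failures


if __name__ == "__main__":
    statuses, failures = summarize(SAMPLE_LINES)
    for (host, device, plugin), count in statuses.most_common():
        print(f"H:{host:#x} D:{device:#x} P:{plugin:#x} seen {count} time(s)")
    for (dev, path), count in failures.most_common():
        print(f"{path} {dev}: {count} failed command(s)")
```

To run it against a real file, pass an open file object instead of `SAMPLE_LINES`, since `summarize` only needs an iterable of lines.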

If any other logs are needed to analyze the issue, I will provide them. I will really appreciate any help!

EDV-Schuster
Contributor

Did you find a solution to that problem? I have a Samsung 980 Pro in a Supermicro server running ESXi 7 and get similar problems...
