9 Replies Latest reply on Dec 15, 2017 6:48 PM by ghSky

    ESXi drops / loses VMFS-partition

    Jixe Lurker

      Hi

       

      I'm experimenting with an ESXi 6.5 installation on an Intel NUC6I5SYH for a lab environment. I'm aware of the "no official support", but please hear me out :-)

       

      As stated, I'm installing ESXi on an Intel NUC which has an Intel 600p NVMe SSD installed. For the most part everything works fine, but in the last month the partitions on the SSD have twice disappeared from ESXi. A simple reboot of the device brings everything back to normal, but while the content of the SSD is inaccessible, the VMs are (of course) not responding.

      I can, however, log on to the web interface of ESXi 6.5, and from there I see that the SSD is still recognized (I can see the make and model of the SSD), but the capacity is "0 bytes". If I log on to the ESXi host via SSH and do a "df -h", I see two partitions: one which is around 4 GB and one which is 0 bytes. This makes me think that the SSD is not totally dead.

       

      Even though VMware is not supporting this setup, I wonder what my next troubleshooting step should be. Does the ESXi-installation have a CLI command to read out SMART-data or "rescan" the SSD for partitions? Something to guide me in a direction if I should RMA the SSD, the NUC or just give up on ESXi in this setup.

       

      I don't really have any logs about the incident since ESXi doesn't have anywhere to write the logs to when this problem occurs.

       

      Thanks!

        • 1. Re: ESXi drops / loses VMFS-partition
          terrytin Lurker

          Hi,

           

          I have exactly the same problem with my ASRock Beebox with an Intel 600p NVMe SSD. Every time it happens, the SSD shows as "0-byte" and I am forced to reboot ESXi.

          I have tried a firmware upgrade for the Intel 600p SSD, but it did not help.

           

          Has anyone had the same problem and found a solution?

          • 2. Re: ESXi drops / loses VMFS-partition
            Marmotte94 Enthusiast
            vExpert

            Hi,

            You can look at all your storage devices by using SSH on the ESXi host.

             

            #esxcli storage core device list
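            To spot a dropped device quickly in that output, a small filter along these lines may help. This is only a sketch: the function name list_unhealthy_devices is made up for illustration, and it assumes the usual layout of the command's output, where each device identifier starts at the beginning of a line and its properties (including "Status:") are indented below it.

```shell
# Sketch: read `esxcli storage core device list` output on stdin and print
# every device whose "Status:" field is anything other than "on".
# The function name is hypothetical, not part of esxcli.
list_unhealthy_devices() {
    awk '
        /^[^ \t]/ { dev = $0 }                  # unindented line: device identifier
        /^[ \t]+Status:/ {
            status = $0
            sub(/^[ \t]+Status:[ \t]*/, "", status)   # keep only the value
            sub(/[ \t\r]+$/, "", status)              # drop trailing whitespace
            if (status != "on") print dev ": " status
        }
    '
}
```

            On the host it would be used as: esxcli storage core device list | list_unhealthy_devices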

             

            Thank you,

            Olivier

            • 3. Re: ESXi drops / loses VMFS-partition
              terrytin Lurker

              Hi,

               

              I suspect it may be a bug in the "nvme" driver.

               

              Anyway, my "esxcli storage core device list" output is below:

              t10.NVMe____INTEL_SSDPEKKW512G7_____________________BTPY631307NR512F____00000001

                 Display Name: Local NVMe Disk (t10.NVMe____INTEL_SSDPEKKW512G7_____________________BTPY631307NR512F____00000001)

                 Has Settable Display Name: true

                 Size: 488386

                 Device Type: Direct-Access

                 Multipath Plugin: NMP

                 Devfs Path: /vmfs/devices/disks/t10.NVMe____INTEL_SSDPEKKW512G7_____________________BTPY631307NR512F____00000001

                 Vendor: NVMe

                 Model: INTEL SSDPEKKW51

                 Revision:  PSF

                 SCSI Level: 6

                 Is Pseudo: false

                 Status: on

                 Is RDM Capable: false

                 Is Local: true

                 Is Removable: false

                 Is SSD: true

                 Is VVOL PE: false

                 Is Offline: false

                 Is Perennially Reserved: false

                 Queue Full Sample Size: 0

                 Queue Full Threshold: 0

                 Thin Provisioning Status: yes

                 Attached Filters:

                 VAAI Status: unknown

                 Other UIDs: vml.0100000000425450593633313330374e523531324620202020494e54454c20

                 Is Shared Clusterwide: false

                 Is Local SAS Device: false

                 Is SAS: false

                 Is USB: false

                 Is Boot USB Device: false

                 Is Boot Device: true

                 Device Max Queue Depth: 256

                 No of outstanding IOs with competing worlds: 32

                 Drive Type: unknown

                 RAID Level: unknown

                 Number of Physical Drives: unknown

                 Protection Enabled: false

                 PI Activated: false

                 PI Type: 0

                 PI Protection Mask: NO PROTECTION

                 Supported Guard Types: NO GUARD SUPPORT

                 DIX Enabled: false

                 DIX Guard Type: NO GUARD SUPPORT

                 Emulated DIX/DIF Enabled: false

              • 4. Re: ESXi drops / loses VMFS-partition
                ivanyeung510 Lurker

                I have the same problem too.

                When I run a large data copy from the NVMe drive to an HDD, it just drops the partition, and a reboot fixes it.

                I wonder if the NVMe drive is overheating and causing the partition drop?

                • 5. Re: ESXi drops / loses VMFS-partition
                  dangbry1978 Lurker

                  Hi,

                   

                  I am having the exact same issue with the Intel 600p and ESXi 6.5 U1 running on a SuperMicro SYS-5028D-TN4T. It seems to work fine until I try to provision a VM, and then I get an error message that the connection to the datastore has been lost. I have updated to the latest Intel 600p firmware. The output of esxcli storage core device list is as follows:

                   

                  [root@pESXi-01:~] esxcli storage core device list

                  t10.NVMe____INTEL_SSDPEKKW010T7_____________________BTPY65320GA71P0H____00000001

                     Display Name: Local NVMe Disk (t10.NVMe____INTEL_SSDPEKKW010T7_____________________BTPY65320GA71P0H____00000001)

                     Has Settable Display Name: true

                     Size: 976762

                     Device Type: Direct-Access

                     Multipath Plugin: NMP

                     Devfs Path:

                     Vendor: NVMe

                     Model: INTEL SSDPEKKW01

                     Revision:  PSF

                     SCSI Level: 6

                     Is Pseudo: false

                     Status: not connected

                     Is RDM Capable: false

                     Is Local: true

                     Is Removable: false

                     Is SSD: true

                     Is VVOL PE: false

                     Is Offline: false

                     Is Perennially Reserved: false

                     Queue Full Sample Size: 0

                     Queue Full Threshold: 0

                     Thin Provisioning Status: yes

                     Attached Filters:

                     VAAI Status: unsupported

                     Other UIDs: vml.01000000004254505936353332304741373150304820202020494e54454c20

                     Is Shared Clusterwide: false

                     Is Local SAS Device: false

                     Is SAS: false

                     Is USB: false

                     Is Boot USB Device: false

                     Is Boot Device: false

                     Device Max Queue Depth: 256

                     No of outstanding IOs with competing worlds: 32

                     Drive Type: unknown

                     RAID Level: unknown

                     Number of Physical Drives: unknown

                     Protection Enabled: false

                     PI Activated: false

                     PI Type: 0

                     PI Protection Mask: NO PROTECTION

                     Supported Guard Types: NO GUARD SUPPORT

                     DIX Enabled: false

                     DIX Guard Type: NO GUARD SUPPORT

                     Emulated DIX/DIF Enabled: false

                   

                  I would be extremely grateful if someone has found a fix and can share it.

                  • 6. Re: ESXi drops / loses VMFS-partition
                    ivanyeung510 Lurker

                    My problem was fixed after attaching a small heatsink to the controller.

                     

                    I suggest that you check your temperature by running the following command:

                     

                    # Print the SMART data (including temperature) for every storage device
                    esxcli storage core device list | grep '  Display Name:' | cut -d'(' -f2 | cut -d')' -f1 | while read -r DISK
                    do
                       echo "********** $DISK **********"
                       esxcli storage core device smart get -d "$DISK"
                    done
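                    If you only care about the temperature, the SMART output can be reduced to a single number. The following is a rough sketch, not a tested tool: the helper names drive_temp and temp_at_or_above are made up, the default limit of 70 is only an example value, and it assumes the SMART table has a "Drive Temperature" row with the current value in the first column after the parameter name.

```shell
# Sketch: read `esxcli storage core device smart get -d <device>` output on
# stdin and print the Value column of the "Drive Temperature" row.
# Assumes the row looks like: "Drive Temperature   45   N/A   N/A".
drive_temp() {
    awk '/^Drive Temperature/ { print $3; exit }'
}

# Hypothetical helper: exit 0 when the temperature read from stdin is at or
# above the limit given as the first argument (example default: 70),
# exit 1 when it is below, and exit 2 when no numeric value is available.
temp_at_or_above() {
    limit=${1:-70}
    temp=$(drive_temp)
    case "$temp" in
        ''|*[!0-9]*) return 2 ;;   # missing or non-numeric (for example N/A)
    esac
    [ "$temp" -ge "$limit" ]
}
```

                    For a single disk this could be run as: esxcli storage core device smart get -d "$DISK" | drive_temp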

                    • 7. Re: ESXi drops / loses VMFS-partition
                      dangbry1978 Lurker

                      Thanks ivanyeung510,

                       

                      It is definitely a heat-related issue. I had the fan set to Optimal speed and have had to put it on full speed to keep the drive working, which unfortunately is significantly noisier. It looks like I will need to add a heatsink myself to allow for the quieter fan setting.

                       

                      I ran the command that you provided, and the temperature when I started to have issues was only 45 degrees, which surprised me. I thought the drive would have a higher threshold before issues appeared.

                      • 8. Re: ESXi drops / loses VMFS-partition
                        ivanyeung510 Lurker

                        nvme_overheat.jpg

                         

                        I have seen it reach 70 degrees.

                        After installing a small heatsink, the maximum temperature is below 50.

                        • 9. Re: ESXi drops / loses VMFS-partition
                          ghSky Lurker

                          Hi,

                           

                          I have exactly the same problem with my ASRock Beebox with an Intel 600p NVMe SSD. Every time it happens, the SSD shows as "0-byte" and I am forced to reboot ESXi.

                          I have tried a firmware upgrade for the Intel 600p SSD, but it did not help.

                           

                          Have you found any solution to this issue?