31 Replies Latest reply on Jun 8, 2016 1:59 PM by 7007VM7007

    Can't create all flash VSAN with PCIe SSD

    7007VM7007 Enthusiast

      Hi All

       

      I'm trying to create my first test VSAN in my lab at home. It's a single-node VSAN. I know it's not supported, but this is just to get me started before I get some more servers.

       

      My hardware is as follows:

       

      Supermicro X10SL7-F

      32GB RAM

      Xeon CPU E3-1230 v3 @ 3.30GHz

      One Samsung 950 Pro 256GB PCIe SSD NVMe

      Two Samsung 840 Pro 128GB SATA SSDs

       

      As a test I was able to create a VSAN datastore using one of the Samsung 840 128GB drives as the cache tier and the other Samsung 840 128GB was used for capacity. This worked great and I was able to place VMs on the vsanDatastore and see dedupe/compression in action!

       

      I have since deleted the above configuration and am now trying to use my Samsung 950 Pro 256GB PCIe SSD NVMe for the cache tier and to then use the two Samsung 840 Pro 128GB SATA SSDs for capacity. When I enable VSAN on the cluster and complete the setup I get the following errors as soon as I try to access the vsanDatastore:

      vsan error.jpg

      The capacity of the vsanDatastore shows as 0.00 B and I can't place any VMs on this datastore.

       

      Can someone assist with getting this to work? I have tried recreating the VSAN setup a few times but it hasn't helped. I know the PCIe SSD works as I can set it up as a single disk (non VSAN) datastore and place VMs on it with no problem.

       

      How can I troubleshoot this? Thanks.

        • 1. Re: Can't create all flash VSAN with PCIe SSD
          depping Champion
          VMware EmployeesUser Moderators

          Is it recognized as an SSD in the UI and as local? If not, tag it accordingly first...

          • 2. Re: Can't create all flash VSAN with PCIe SSD
            7007VM7007 Enthusiast

            I think the answer is yes to both of your questions but here are two screenshots of the PCIe SSD drive:

             

            pcie.jpg

            pcie2.jpg

             

            What am I doing wrong? Thanks for the reply

            • 3. Re: Can't create all flash VSAN with PCIe SSD
              zdickinson Expert

              Good morning, my guess would be that the PCIe card has some partitions or volume information.  Have you cleaned the drive?

               

              Some info here:  http://www.vladan.fr/how-to-delete-partitions-to-prepare-disks-for-vsan/

               

              and here:  http://cormachogan.com/2014/02/18/vsan-part-16-reclaiming-disks-for-other-uses/

               

              Thank you, Zach.

              • 4. Re: Can't create all flash VSAN with PCIe SSD
                7007VM7007 Enthusiast

                This is a brand new disk, so I would be surprised if existing partitions caused this to fail, but I ran the following on the PCIe SSD:

                 

                partedUtil getptbl /vmfs/devices/disks/t10.NVMe____Samsung_SSD_950_PRO_256GB_______________D28D50F15C382500

                 

                and the output was as follows:

                 

                gpt

                31130 255 63 500118192

                 

                I don't think there's a partition to delete in this instance?

                 

                I also ran the same command on the two SATA SSD drives and got the same output on each of them:

                 

                gpt

                15566 255 63 250069680

                 

                So I'm assuming all 3 disks are blank and ready for VSAN use? Is there anything else I have to do to get these drives to work in an all flash VSAN datastore?

                 

                Thanks for the help!
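The two-line `partedUtil getptbl` output quoted above is a partition-table label followed by a single geometry line (cylinders, heads, sectors per track, total sectors); any further lines would describe partitions. The Python below is a minimal sketch of reading that output. It assumes 512-byte sectors and that each partition line starts with the partition number, start sector and end sector. The name `parse_getptbl` is hypothetical and is not part of any VMware tool.

```python
SECTOR_BYTES = 512  # assumption: 512-byte logical sectors


def parse_getptbl(output):
    """Parse the text printed by `partedUtil getptbl` into a small dict."""
    rows = [ln.split() for ln in output.strip().splitlines() if ln.strip()]
    label = rows[0][0]
    # Geometry line: cylinders, heads, sectors per track, total sectors.
    cylinders, heads, sectors_per_track, total_sectors = map(int, rows[1])
    partitions = []
    for fields in rows[2:]:
        # Assumed layout: number, start sector, end sector, then type details.
        partitions.append(
            {"number": int(fields[0]), "start": int(fields[1]), "end": int(fields[2])}
        )
    return {
        "label": label,
        "total_sectors": total_sectors,
        "size_gib": total_sectors * SECTOR_BYTES / 2**30,
        "partitions": partitions,
    }


# The two outputs quoted in the post above.
nvme = parse_getptbl("gpt\n31130 255 63 500118192\n")
sata = parse_getptbl("gpt\n15566 255 63 250069680\n")

for name, info in (("950 PRO", nvme), ("840 PRO", sata)):
    count = len(info["partitions"])
    state = "no partitions" if count == 0 else f"{count} partition(s)"
    print(f"{name}: {info['label']}, {info['size_gib']:.1f} GiB, {state}")
```

With only the label and geometry lines present, the partition list is empty for both disk types, which is consistent with the poster's reading that there is nothing to delete. Under the 512-byte assumption the sizes work out to roughly 238 GiB and 119 GiB.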

                • 5. Re: Can't create all flash VSAN with PCIe SSD
                  zdickinson Expert

                  It does sound like they are ready to be used by vSAN.  It was worth a shot; in our hybrid setup about 50% of the HDDs came with partitions and needed to be cleaned.

                   

                  Maybe try vsan.disks_info <HOST>, described here: http://www.virten.net/2013/12/identify-and-solve-ineligible-disk-problems-in-virtual-san/  to find out why a disk is ineligible.  That is an RVC command, FYI.

                   

                  Thank you, Zach.

                  • 6. Re: Can't create all flash VSAN with PCIe SSD
                    7007VM7007 Enthusiast

                    Thanks for the help, I have never used RVC before but here is the output for the 3 disks I would like to use to create a VSAN datastore:

                     

                    +----------------------------------------------------------------------------------------+-------+---------+------------------------------------------------------------------------+
                    | Local ATA Disk (naa.50025385a01113b7)                                                  | SSD   | 119 GB  | eligible                                                               |
                    | ATA Samsung SSD 840                                                                    |       |         |                                                                        |
                    +----------------------------------------------------------------------------------------+-------+---------+------------------------------------------------------------------------+
                    | Local ATA Disk (naa.50025385a01113ba)                                                  | SSD   | 119 GB  | eligible                                                               |
                    | ATA Samsung SSD 840                                                                    |       |         |                                                                        |
                    +----------------------------------------------------------------------------------------+-------+---------+------------------------------------------------------------------------+
                    | Local NVMe Disk (t10.NVMe____Samsung_SSD_950_PRO_256GB_______________D28D50F15C382500) | SSD   | 238 GB  | eligible                                                               |
                    | NVMe Samsung SSD 950                                                                   |       |         |                                                                        |
                    +----------------------------------------------------------------------------------------+-------+---------+------------------------------------------------------------------------+


                    Apologies for the poor formatting. So all 3 disks are showing as eligible! Is there anything else I can try?


                    Appreciate the help.
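When there are more than a handful of disks, reading the eligibility column by eye gets tedious. The following Python is a rough sketch of pulling the rows out of pasted `vsan.disks_info` text like the table above. It assumes the four pipe-delimited columns shown in the post (name, type, size, state) and treats a row with an empty type cell as the model line of the previous disk. `parse_disks_info` is a hypothetical helper, not an RVC function.

```python
def parse_disks_info(table):
    """Return one dict per disk from pipe-delimited rows of pasted RVC output."""
    disks = []
    for line in table.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # skip border lines and wrapped border fragments
        cells = [c.strip() for c in line.strip("|").split("|")]
        if len(cells) != 4:
            continue  # not a data row in the assumed four-column layout
        name, disk_type, size, state = cells
        if disk_type:
            # First line of an entry: carries type, size and state.
            disks.append(
                {"name": name, "type": disk_type, "size": size, "state": state, "model": ""}
            )
        elif disks:
            # Continuation line: only the model string is filled in.
            disks[-1]["model"] = name
    return disks


SAMPLE = """
| Local ATA Disk (naa.50025385a01113b7) | SSD | 119 GB | eligible |
| ATA Samsung SSD 840 | | | |
| Local ATA Disk (naa.50025385a01113ba) | SSD | 119 GB | eligible |
| ATA Samsung SSD 840 | | | |
| Local NVMe Disk (t10.NVMe____Samsung_SSD_950_PRO_256GB_______________D28D50F15C382500) | SSD | 238 GB | eligible |
| NVMe Samsung SSD 950 | | | |
"""

disks = parse_disks_info(SAMPLE)
not_eligible = [d for d in disks if d["state"] != "eligible"]
print(f"{len(disks)} disks parsed, {len(not_eligible)} not eligible")
```

The sample rows are the three disks from the post with the padding removed; a disk whose state reads anything other than `eligible` would land in `not_eligible` along with the reason text from that column.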

                    • 7. Re: Can't create all flash VSAN with PCIe SSD
                      elerium Hot Shot
                      vExpert

                      Have you tried rebuilding at the cluster level?

                      - delete any disk groups, disable VSAN, move host out of cluster

                      - delete cluster, recreate cluster

                      - move host back into cluster, enable VSAN, create disk groups etc...

                       

                      Also check /var/log/vmkernel.log; more useful error messages sometimes show up there.

                      • 8. Re: Can't create all flash VSAN with PCIe SSD
                        7007VM7007 Enthusiast

                        Thanks for the reply.

                         

                        Unfortunately because I only have a single host in the "cluster" I can't delete the cluster and recreate it.

                         

                        I did try to enable VSAN again with the same 3 disks and after it failed again I had a look in the vmkernel.log logfile. Here is some of the output:

                         

                        2016-05-21T08:56:43.043Z cpu1:34765 opID=dbc4d773)LSOMCommon: LSOMDiskGroupDestroy:1833: Destroyed LSOM's global memory

                        2016-05-21T08:56:43.043Z cpu1:34765 opID=dbc4d773)WARNING: PLOG: PLOGInitDiskGroupMemory:6394: Failed to initialize the memory for the diskgroup: Out of memory

                        2016-05-21T08:56:43.043Z cpu1:34765 opID=dbc4d773)WARNING: PLOG: PLOGAnnounceSSD:6495: Failed to initialize the memory for the diskgroup: Success

                        2016-05-21T08:56:43.056Z cpu6:32797)NMP: nmp_ThrottleLogForDevice:3298: Cmd 0x9e (0x439d808b3540, 0) to dev "mpx.vmhba32:C0:T0:L0" on path "vmhba32:C0:T0:L0" Failed: H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x20 0x0. Act:NONE

                        2016-05-21T08:56:43.060Z cpu6:32797)NMP: nmp_ThrottleLogForDevice:3298: Cmd 0x1a (0x439d808b3540, 0) to dev "mpx.vmhba32:C0:T0:L0" on path "vmhba32:C0:T0:L0" Failed: H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x24 0x0. Act:NONE

                        2016-05-21T08:56:43.217Z cpu1:34765 opID=dbc4d773)Vol3: 2687: Could not open device '570a1804-b7b5175c-f172-00259086cd5c' for probing: Not found

                        2016-05-21T08:56:43.217Z cpu1:34765 opID=dbc4d773)Vol3: 2687: Could not open device '570a1804-b7b5175c-f172-00259086cd5c' for probing: Not found

                        2016-05-21T08:56:43.217Z cpu1:34765 opID=dbc4d773)Vol3: 1078: Could not open device '570a1804-b7b5175c-f172-00259086cd5c' for volume open: Not found

                        2016-05-21T08:56:43.217Z cpu1:34765 opID=dbc4d773)Vol3: 1078: Could not open device '570a1804-b7b5175c-f172-00259086cd5c' for volume open: Not found

                        2016-05-21T08:56:43.217Z cpu1:34765 opID=dbc4d773)FSS: 5334: No FS driver claimed device '570a1804-b7b5175c-f172-00259086cd5c': No filesystem on the device

                        2016-05-21T08:56:43.278Z cpu1:34765 opID=dbc4d773)VC: 3551: Device rescan time 1349 msec (total number of devices 6)

                        2016-05-21T08:56:43.278Z cpu1:34765 opID=dbc4d773)VC: 3554: Filesystem probe time 196 msec (devices probed 6 of 6)

                        2016-05-21T08:56:43.278Z cpu1:34765 opID=dbc4d773)VC: 3556: Refresh open volume time 1 msec

                        - vmkernel.log 4519/4519 100%

                         

                         

                        I've highlighted a couple of entries that caught my attention. I'm not sure why it's saying out of memory when there was almost 9GB of free RAM available?
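The highlighting does not survive in plain text, so here is one hedged way to single out the entries of interest: filter the excerpt for lines the kernel itself marked as `WARNING`. The regular expression is an assumption inferred only from the lines quoted above (timestamp, `cpuN:worldID`, optional `opID=...`, closing parenthesis, message) and is not a documented log grammar. `warnings_only` is a hypothetical helper name.

```python
import re

# Pattern inferred from the quoted excerpt only; other vmkernel.log lines may differ.
LINE_RE = re.compile(
    r"^(?P<ts>\S+) cpu(?P<cpu>\d+):(?P<world>\d+)(?: opID=(?P<opid>\w+))?\)"
    r"(?P<warn>WARNING: )?(?P<msg>.*)$"
)


def warnings_only(log_text):
    """Return (timestamp, message) pairs for lines flagged WARNING."""
    hits = []
    for line in log_text.splitlines():
        match = LINE_RE.match(line.strip())
        if match and match.group("warn"):
            hits.append((match.group("ts"), match.group("msg")))
    return hits


EXCERPT = """\
2016-05-21T08:56:43.043Z cpu1:34765 opID=dbc4d773)LSOMCommon: LSOMDiskGroupDestroy:1833: Destroyed LSOM's global memory
2016-05-21T08:56:43.043Z cpu1:34765 opID=dbc4d773)WARNING: PLOG: PLOGInitDiskGroupMemory:6394: Failed to initialize the memory for the diskgroup: Out of memory
2016-05-21T08:56:43.043Z cpu1:34765 opID=dbc4d773)WARNING: PLOG: PLOGAnnounceSSD:6495: Failed to initialize the memory for the diskgroup: Success
2016-05-21T08:56:43.278Z cpu1:34765 opID=dbc4d773)VC: 3556: Refresh open volume time 1 msec
"""

hits = warnings_only(EXCERPT)
for ts, msg in hits:
    print(ts, msg)
```

On this excerpt the filter keeps the two PLOG lines, including the "Out of memory" failure while initializing disk group memory, and drops the informational lines around them.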

                        • 9. Re: Can't create all flash VSAN with PCIe SSD
                          7007VM7007 Enthusiast

                          Hi All

                           

                          I managed to get my VSAN enabled with the PCIe SSD! Looks like it may have been due to the amount of memory available in the server. I basically started over again by formatting the SATA and PCIe SSD disks. I then shut down ALL my VMs except vCenter and proceeded with enabling VSAN. This time the process completed and I was able to access the vsanDatastore. I have since migrated all my VMs to this datastore. Thanks to all for your valuable assistance.

                           

                          I do have a few more VSAN related questions I hope someone can assist with. I plan on building a 3 node VSAN cluster later this year for use at home to study VMware products and to use it for testing and learning in general. I was thinking of purchasing 3 of the Supermicro SuperServer SYS-5028D-TN4T mini-towers:

                           

                          Supermicro SuperServer SYS-5028D-TN4T mini-tower

                           

                          The server is on the VMware HCL.

                           

                          But the one area I am battling with is the storage/disks to choose. I would like to avoid buying a RAID card for each server if possible but I would still like to get great disk IO so that things like creating a VM from a template/cloning/VSAN perform well. If I use the following disk in the cache tier in VSAN:

                           

                          Samsung  950 Pro 256GB M.2 PCI-e 3.0 x 4 NVMe Solid State Drive

                           

                          and for the capacity tier I was thinking of using two of the following drives in each server:

                           

                          Samsung 256GB SSD 850 PRO SATA 6Gbps 3D NAND Solid State Drive

                           

                          Would this give good read/write performance on each node in the VSAN cluster? I understand that these drives are consumer and not on the HCL but this is for a lab environment. I'm more concerned with choosing the correct disks for great storage performance rather than for getting support due to it being on the HCL.

                           

                          Can I get good disk IO/performance with VSAN and the above configuration even if I don't use any kind of RAID card with cache/BBU?

                           

                          If there are better drives to use for the cache/capacity tier then I would really like to hear your thoughts/recommendations before I proceed with the purchase! This purchase will be costly so I'd like to get it right.

                           

                          The Supermicro Superserver can take 6 SATA drives, one M2 drive on the motherboard and has a single PCIe slot for another PCIe SSD drive.

                           

                          Thank you.

                          • 10. Re: Can't create all flash VSAN with PCIe SSD
                            zdickinson Expert

                            Good morning, glad you got past your first problem.  As to a RAID card...  It's all about the queue depth.  More than likely you're using the onboard SATA, so the queue depth is likely small... 32 or so.  That might lead to poor performance.  I would expect that only for reads, though.  The writes will hit the PCIe device first and then de-stage.  Since it's all flash, there will be no read cache.  That's where I would expect a performance hit.  Thank you, Zach.

                            • 11. Re: Can't create all flash VSAN with PCIe SSD
                              7007VM7007 Enthusiast

                              Thanks Zach.

                               

                              I am using the onboard SAS connectors that come with the LSI 2308. According to the Yellow Bricks website this has a queue depth of 600, so I'm still not sure why the performance is so poor. I'm using an M.2 PCIe NVMe SSD for the write cache, and I'm unsure what the queue depth is for this device.

                               

                              Is a 600 queue depth adequate for my setup? If yes then what else could be the source of the problem? If no then do I need to purchase an HBA on the VMware HCL to get decent speeds?

                               

                              Thanks again for the help as I try to figure this all out!

                               

                              PS: I should mention that the LSI 2308 is running in IT mode.

                              • 12. Re: Can't create all flash VSAN with PCIe SSD
                                7007VM7007 Enthusiast

                                I think I have just discovered what the queue depth is with esxtop:

                                 

                                queue.jpg

                                 

                                I'm assuming vmhba1 which has a queue depth of 600 is the LSI 2308 and that the vmhba2 is (possibly) the PCIe SSD which shows a value of 1024.

                                 

                                From what I have read, VSAN needs a queue depth of 256 or more, and I have 600/1024, so something strange is going on here!
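The comparison in this post can be written down as a small check. The Python below is only a sketch: the 256 minimum is the figure the poster quotes from their reading (it is not verified here), the adapter-to-device mapping is the poster's own guess, and `below_threshold` is a hypothetical helper.

```python
MIN_QUEUE_DEPTH = 256  # figure quoted in the thread, not verified against vendor docs

# Values read from the esxtop screenshot described above; which device sits
# behind each adapter is the poster's assumption.
adapters = {
    "vmhba1 (assumed LSI 2308)": 600,
    "vmhba2 (assumed NVMe SSD)": 1024,
}


def below_threshold(depths, minimum=MIN_QUEUE_DEPTH):
    """Return the adapters whose queue depth is under the given minimum."""
    return {name: depth for name, depth in depths.items() if depth < minimum}


shallow = below_threshold(adapters)
print("Adapters below threshold:", shallow or "none")

# For contrast, the onboard-SATA figure mentioned earlier in the thread (about 32).
print("Onboard SATA example:", below_threshold({"onboard SATA": 32}))
```

Both adapters clear the quoted minimum, so on these numbers queue depth alone does not explain poor performance, whereas an onboard SATA controller at around 32 would be flagged.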

                                • 13. Re: Can't create all flash VSAN with PCIe SSD
                                  zdickinson Expert

                                  Everything seems to line up for good performance.  What's the vSAN networking?  Thank you, Zach.

                                  • 14. Re: Can't create all flash VSAN with PCIe SSD
                                    elerium Hot Shot
                                    vExpert

                                    What kind of poor performance are you seeing? Throughput or latency? Reads or writes, and at what block size? Any metrics you can share? It's hard to diagnose performance issues without details. Also, which RAID driver are you using with the LSI card?

                                     

                                    A queue depth of 600+ on the RAID controller should be more than enough for VSAN. Individual SATA drives max out at a queue depth of 32. NVMe has a queue depth of 1024.

                                     

                                    In terms of performance, the Samsung SSDs should be fine for most workloads; if you really push I/O hard, you may find inconsistent I/O performance. The biggest danger of using consumer SSDs is the lack of power loss protection. If writes are occurring during a power loss, you have a high chance of silent data corruption or data loss that may not be immediately apparent.

                                     

                                    Edit: didn't see earlier esxtop post
