7007VM7007
Enthusiast

Can't create all flash VSAN with PCIe SSD

Hi All

I'm trying to create my first test VSAN in my lab at home. It's a single-node VSAN. I know it's not supported, but this is just to get me started before I get some more servers.

My hardware is as follows:

Supermicro X10SL7-F

32GB RAM

Xeon CPU E3-1230 v3 @ 3.30GHz

One Samsung 950 Pro 256GB PCIe SSD NVMe

Two Samsung 840 Pro 128GB SATA SSDs

As a test I was able to create a VSAN datastore using one of the Samsung 840 128GB drives as the cache tier and the other Samsung 840 128GB was used for capacity. This worked great and I was able to place VMs on the vsanDatastore and see dedupe/compression in action!

I have since deleted the above configuration and am now trying to use my Samsung 950 Pro 256GB PCIe SSD NVMe for the cache tier and to then use the two Samsung 840 Pro 128GB SATA SSDs for capacity. When I enable VSAN on the cluster and complete the setup I get the following errors as soon as I try to access the vsanDatastore:

vsan error.jpg

The capacity of the vsanDatastore shows as 0.00 B and I can't place any VMs on this datastore.

Can someone assist with getting this to work? I have tried recreating the VSAN setup a few times but it hasn't helped. I know the PCIe SSD works as I can set it up as a single disk (non VSAN) datastore and place VMs on it with no problem.

How can I troubleshoot this? Thanks.

31 Replies
depping
Leadership

Is it recognized as an SSD in the UI and as local? If not, tag them accordingly first...

7007VM7007
Enthusiast

I think the answer is yes to both of your questions but here are two screenshots of the PCIe SSD drive:

pcie.jpg

pcie2.jpg

What am I doing wrong? Thanks for the reply 🙂

zdickinson
Expert

Good morning, my guess would be that the PCIe card has some partitions or volume information.  Have you cleaned the drive?

Some info here:  http://www.vladan.fr/how-to-delete-partitions-to-prepare-disks-for-vsan/

and here:  http://cormachogan.com/2014/02/18/vsan-part-16-reclaiming-disks-for-other-uses/
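
If there is anything on it, the cleanup those two posts walk through boils down to something like this from the ESXi shell (the device path is a placeholder, and run getptbl first so you delete the right partition number):

partedUtil getptbl /vmfs/devices/disks/&lt;device&gt;

partedUtil delete /vmfs/devices/disks/&lt;device&gt; 1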

Thank you, Zach.

7007VM7007
Enthusiast

This is a brand new disk, so I would be surprised if existing partitions caused this to fail, but I ran this on the PCIe SSD:

partedUtil getptbl /vmfs/devices/disks/t10.NVMe____Samsung_SSD_950_PRO_256GB_______________D28D50F15C382500

and the output was as follows:

gpt

31130 255 63 500118192

I don't think there's a partition to delete in this instance?

I also ran the same command on the two SATA SSD drives and got the same output on each of them:

gpt

15566 255 63 250069680

So I'm assuming all 3 disks are blank and ready for VSAN use? Is there anything else I have to do to get these drives to work in an all flash VSAN datastore?

Thanks for the help!

zdickinson
Expert

It does sound like they are ready for vSAN to use. It was worth a shot; in our hybrid setup about 50% of the HDDs came with partitions and needed to be cleaned.

Maybe try vsan.disks_info &lt;HOST&gt; from here http://www.virten.net/2013/12/identify-and-solve-ineligible-disk-problems-in-virtual-san/ to find out why it is ineligible. That is an RVC command, FYI.
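
If you haven't used RVC before, the rough flow looks like this (run on the vCenter server; the login and inventory path below are examples, adjust for your environment):

rvc administrator@vsphere.local@localhost

> cd /localhost/&lt;datacenter&gt;/computers/&lt;cluster&gt;

> vsan.disks_info hosts/&lt;host&gt;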

Thank you, Zach.

7007VM7007
Enthusiast

Thanks for the help. I have never used RVC before, but here is the output for the 3 disks I would like to use to create a VSAN datastore:

+------------------------------------------------------------------------------------------+-----+--------+----------+
| Local ATA Disk (naa.50025385a01113b7)                                                     | SSD | 119 GB | eligible |
| ATA Samsung SSD 840                                                                       |     |        |          |
+------------------------------------------------------------------------------------------+-----+--------+----------+
| Local ATA Disk (naa.50025385a01113ba)                                                     | SSD | 119 GB | eligible |
| ATA Samsung SSD 840                                                                       |     |        |          |
+------------------------------------------------------------------------------------------+-----+--------+----------+
| Local NVMe Disk (t10.NVMe____Samsung_SSD_950_PRO_256GB_______________D28D50F15C382500)    | SSD | 238 GB | eligible |
| NVMe Samsung SSD 950                                                                      |     |        |          |
+------------------------------------------------------------------------------------------+-----+--------+----------+


Apologies for the poor formatting. So all 3 disks are showing as eligible! Is there anything else I can try?


Appreciate the help.

elerium
Hot Shot

Have you tried rebuilding at the cluster level?

- delete any disk groups, disable VSAN, move host out of cluster

- delete cluster, recreate cluster

- move host back into cluster, enable VSAN, create disk groups etc...

Also check /var/log/vmkernel.log; more useful error messages sometimes show up there.
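
Something like this while you re-enable VSAN should catch the relevant entries (the filter terms are just a suggestion):

tail -f /var/log/vmkernel.log | grep -iE 'vsan|lsom|plog'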

7007VM7007
Enthusiast

Thanks for the reply.

Unfortunately because I only have a single host in the "cluster" I can't delete the cluster and recreate it.

I did try to enable VSAN again with the same 3 disks and after it failed again I had a look in the vmkernel.log logfile. Here is some of the output:

2016-05-21T08:56:43.043Z cpu1:34765 opID=dbc4d773)LSOMCommon: LSOMDiskGroupDestroy:1833: Destroyed LSOM's global memory

2016-05-21T08:56:43.043Z cpu1:34765 opID=dbc4d773)WARNING: PLOG: PLOGInitDiskGroupMemory:6394: Failed to initialize the memory for the diskgroup: Out of memory

2016-05-21T08:56:43.043Z cpu1:34765 opID=dbc4d773)WARNING: PLOG: PLOGAnnounceSSD:6495: Failed to initialize the memory for the diskgroup: Success

2016-05-21T08:56:43.056Z cpu6:32797)NMP: nmp_ThrottleLogForDevice:3298: Cmd 0x9e (0x439d808b3540, 0) to dev "mpx.vmhba32:C0:T0:L0" on path "vmhba32:C0:T0:L0" Failed: H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x20 0x0. Act:NONE

2016-05-21T08:56:43.060Z cpu6:32797)NMP: nmp_ThrottleLogForDevice:3298: Cmd 0x1a (0x439d808b3540, 0) to dev "mpx.vmhba32:C0:T0:L0" on path "vmhba32:C0:T0:L0" Failed: H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x24 0x0. Act:NONE

2016-05-21T08:56:43.217Z cpu1:34765 opID=dbc4d773)Vol3: 2687: Could not open device '570a1804-b7b5175c-f172-00259086cd5c' for probing: Not found

2016-05-21T08:56:43.217Z cpu1:34765 opID=dbc4d773)Vol3: 2687: Could not open device '570a1804-b7b5175c-f172-00259086cd5c' for probing: Not found

2016-05-21T08:56:43.217Z cpu1:34765 opID=dbc4d773)Vol3: 1078: Could not open device '570a1804-b7b5175c-f172-00259086cd5c' for volume open: Not found

2016-05-21T08:56:43.217Z cpu1:34765 opID=dbc4d773)Vol3: 1078: Could not open device '570a1804-b7b5175c-f172-00259086cd5c' for volume open: Not found

2016-05-21T08:56:43.217Z cpu1:34765 opID=dbc4d773)FSS: 5334: No FS driver claimed device '570a1804-b7b5175c-f172-00259086cd5c': No filesystem on the device

2016-05-21T08:56:43.278Z cpu1:34765 opID=dbc4d773)VC: 3551: Device rescan time 1349 msec (total number of devices 6)

2016-05-21T08:56:43.278Z cpu1:34765 opID=dbc4d773)VC: 3554: Filesystem probe time 196 msec (devices probed 6 of 6)

2016-05-21T08:56:43.278Z cpu1:34765 opID=dbc4d773)VC: 3556: Refresh open volume time 1 msec


A couple of entries caught my attention, in particular the PLOG "Out of memory" warnings. I'm not sure why it's saying out of memory when there was almost 9GB of free RAM available?

7007VM7007
Enthusiast

Hi All

I managed to get my VSAN enabled with the PCIe SSD! It looks like it may have been down to the amount of memory available in the server. I basically started over by reformatting the SATA and PCIe SSDs, then shut down ALL my VMs except vCenter and proceeded with enabling VSAN. This time the process completed and I was able to access the vsanDatastore. I have since migrated all my VMs to this datastore. Thanks to all for your valuable assistance.
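
For anyone else who hits this, a quick sanity check that the node is up and the disks were claimed can be done from the ESXi shell (output fields vary by version):

esxcli vsan cluster get

esxcli vsan storage list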

I do have a few more VSAN related questions I hope someone can assist with. I plan on building a 3 node VSAN cluster later this year for use at home to study VMware products and to use it for testing and learning in general. I was thinking of purchasing 3 of the Supermicro SuperServer SYS-5028D-TN4T mini-towers:

Supermicro SuperServer SYS-5028D-TN4T mini-tower

The server is on the VMware HCL.

But the one area I am battling with is the storage/disks to choose. I would like to avoid buying a RAID card for each server if possible, but I would still like great disk IO so that things like creating a VM from a template, cloning, and VSAN in general perform well. If I use the following disk as the cache tier in VSAN:

Samsung 950 Pro 256GB M.2 PCIe 3.0 x4 NVMe Solid State Drive

and for the capacity tier I was thinking of using two of the following drives in each server:

Samsung 256GB SSD 850 PRO SATA 6Gbps 3D NAND Solid State Drive

Would this give good read/write performance on each node in the VSAN cluster? I understand that these are consumer drives and not on the HCL, but this is for a lab environment. I'm more concerned with choosing the correct disks for great storage performance than with getting support by being on the HCL.

Can I get good disk IO/performance with VSAN and the above configuration even if I don't use any kind of RAID card with cache/BBU?

If there are better drives to use for the cache/capacity tier then I would really like to hear your thoughts/recommendations before I proceed with the purchase! This purchase will be costly so I'd like to get it right 🙂

The Supermicro SuperServer can take six SATA drives and one M.2 drive on the motherboard, and has a single PCIe slot for another PCIe SSD.

Thank you.

zdickinson
Expert

Good morning, glad you got past your first problem. As to a RAID card: it's all about the queue depth. More than likely you're using the onboard SATA, so the queue depth is likely small, 32 or so. That might lead to poor performance, though I would expect it only on reads. Writes will hit the PCIe cache first and then de-stage. Since it's all flash there is no read cache, so reads go straight to the capacity tier; that's where I would expect a performance hit. Thank you, Zach.

7007VM7007
Enthusiast

Thanks Zach.

I am using the onboard SAS connectors that come with the LSI 2308. According to the Yellow-Bricks website this has a queue depth of 600, so I'm still not sure why the performance is so poor. I'm using an M.2 PCIe NVMe SSD for the write cache and I'm unsure what the queue depth is for that device.

Is a 600 queue depth adequate for my setup? If yes, then what else could be the source of the problem? If no, do I need to purchase an HBA on the VMware HCL to get decent speeds?

Thanks again for the help as I try to figure this all out!

PS: I should mention that the LSI 2308 is running in IT mode.

7007VM7007
Enthusiast

I think I have just discovered what the queue depth is with esxtop:

queue.jpg

I'm assuming vmhba1, which has a queue depth of 600, is the LSI 2308, and that vmhba2, which shows a value of 1024, is (possibly) the PCIe SSD.

From what I have read, VSAN needs a queue depth of 256 or more and I have 600/1024, so something strange is going on here!
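
For anyone wanting to repeat this, roughly what I did (keys from memory):

esxtop

Then press 'd' for the disk adapter view; the AQLEN column shows the queue depth per vmhba.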

zdickinson
Expert

Everything seems to line up for good performance.  What's the vSAN networking?  Thank you, Zach.

elerium
Hot Shot

What kind of poor performance are you seeing? Throughput or latency? Read or write? What block size? Any metrics you can share? It's hard to diagnose performance issues without details. Also, which RAID driver are you using with the LSI card?

A queue depth of 600+ should be more than enough for a VSAN RAID controller. Individual disks max out at 32 for any SATA drive; NVMe is 1024.

In terms of performance, the Samsung SSDs should be fine for most workloads; if you really push I/O hard, you may see inconsistent I/O performance. The biggest danger of using consumer SSDs is the lack of power-loss protection: if writes are occurring during a power loss, you have a high chance of silent data corruption or data loss that may not be immediately apparent.

Edit: didn't see earlier esxtop post

7007VM7007
Enthusiast

Zach:

I have two 1Gb NICs that are teamed, with both active. I have a single standard vSwitch and two VMkernel ports: one for management and the other for VSAN. Everything is on one subnet. I only have the single server.
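
For completeness, I also double-checked which VMkernel port is tagged for VSAN traffic from the ESXi shell (output omitted):

esxcli vsan network list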

Elerium:

Ok, so for performance testing I kept it simple. I had two VMs running on the vsanDatastore, both using vmxnet3 vNICs. I copied a 5GB ISO onto one VM's local drive, then copied the ISO between the VMs and measured the speed. The speeds ranged from 20MB/s to 80MB/s. Occasionally I saw spikes to 120-300MB/s, but these were rare and brief. I never once saw a consistent speed (i.e. a constant 200MB/s); the file copy dialog showed the speed going up and down. Considering that I am running an ALL-flash VSAN, these speeds are terrible. My slowest SSD can do 390MB/s writes and over 500MB/s reads, and the PCIe SSD has insane speeds. I also tried copying the ISO from a VM on the vsanDatastore to a VM on another (non-VSAN) datastore (and copied it back again). Again, performance was never more than 20-40MB/s on average. I did see brief spikes to 80MB/s, 120MB/s and even as high as 300MB/s, but they were very brief. Averaging the speeds out after the file copies, it was between 20 and 40MB/s, definitely no more than 80MB/s.

I don't think I am running a RAID driver, as the LSI 2308 is running in IT mode?

Remember, when I did this testing there was no load on the cluster/datastore. We're talking ideal conditions here: no backups running, no other users on the system, etc. I am baffled. Something is fundamentally wrong. I'm sure it's something I have done incorrectly, but I am battling to pinpoint it!

I would have thought that with a queue depth of 1024 on the PCIe NVMe SSD used for the cache tier, and a queue depth of 600 on the LSI 2308 with the Samsung Pro SSDs for capacity, I would get at least 200MB/s read/write speeds?

Thanks to all for contributing. Hopefully we can get to the bottom of this 😉

Edit: I also tried changing the stripe width in the default VSAN policy but this hasn't really helped much.

elerium
Hot Shot

20-40 MB/s for a network copy does seem slow, but the test is flawed because you introduce network speed into a storage test. Network copy speeds then depend on other variables (CPU/threads/queues/OS cache/OS network driver/ESXi network driver, etc.), which makes storage tests over the network mostly meaningless.

To test network speed, use a tool like iperf (check Google for how to use it) and you'll see what I mean; it's rare that an OS (especially Windows) will use the full network speed for a single file copy, and this can become the bottleneck. This is just how the OS behaves and not an indication of a network issue. Without a tool like iperf, you can't accurately test network speed.
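
A minimal iperf run would be something like this (the IP is an example, and parallel streams give a better picture than a single one):

On the first VM: iperf -s

On the second VM: iperf -c &lt;first-vm-ip&gt; -t 60 -P 4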

To test storage speed only, use a tool like IOMeter (more in-depth, can test every I/O scenario) or CrystalDiskMark (a quick and simple benchmark for 4K and sequential), which will benchmark only the storage on the system. This is a much more accurate and repeatable storage test than an ISO copy over the network.

In VSAN health, check under Hardware compatibility -> Controller driver; it should report the LSI driver in use (there is one being used for your LSI HBA).


Also, in the vSphere web client, in Hosts and Clusters view, click on your host; under Monitor -> Performance you should have a number of VSAN performance-related graphs. Check that latency and congestion remain low during tests. These graphs only sample every 5 minutes, so your storage tests should run at least 20 minutes to get any kind of useful data from them.


7007VM7007
Enthusiast

Thanks Elerium!

Before I do any more benchmarks, is there anything else I could be doing wrong that would cause slow storage performance? OK, so I'm not doing the most fantastic storage benchmark, but surely if I copy an ISO from one VM to another I should get more than 20MB/s? Also, VMware-related tasks like migrating VMs or creating a new VM from a template take a long time to complete.

I'll look into the methods you mentioned for benchmarking my storage, but is there anything else I could be doing wrong? I think it's safe to say my disk controllers support enough queue depth, so that leaves my Samsung SSD drives. Could consumer SATA SSDs be THIS bad? The performance is so bad with zero load on it! Would data centre Samsung drives like the SM/PM863 perform better? Those drives are still SATA-based.

I'm open to almost any ideas at this point, even if I have to purchase new hardware 🙂

elerium
Hot Shot

Your slow vMotion is probably down to your slow network links; vMotion on 1Gb (even teamed) is going to be slow. If you're using one host, though, how are you using vMotion?

For a single host I think your SSDs are fine. Once you scale out to more than one host, your network will likely be the bottleneck; 1Gb links will greatly limit storage speed across nodes.

7007VM7007
Enthusiast

Maybe vMotion is the wrong term. It's when I migrate a VM from one datastore to another datastore on the same host. A 40GB VM can take over an hour to migrate; is this normal? I don't think networking can be the root cause of my poor storage performance (although I could be wrong), as everything I am doing is on the same host and never goes over the NICs/LAN. Even if it did go over the 1Gb NIC, I should still see more than 20MB/s?

Having said this, I have (at this stage) given up on VSAN. I just couldn't get my disk performance above 20-80MB/s on my all-flash VSAN datastore. I tried with a RAID card (with cache and BBU) and without, and it didn't make a difference. Not even a brand new PCIe NVMe SSD cache drive could save the day. When copying files between VMs it's strange, because I don't get a consistent copy speed; the speed climbs a lot and then drops a lot. The file copy speed dialog looks like a wave!

I have deleted my VSAN datastore and set up each SSD in my single ESXi server as its own datastore. The performance on the PCIe NVMe SSD is good (but not amazing considering the specs of the drive); I am getting over 300MB/s when copying a 5GB ISO between two VMs on the same datastore. My SATA SSDs are still getting 25MB/s write speeds, although read speeds are close to 200MB/s.

I'm still not sure what I am doing wrong, though. How can I get at least 200MB/s read/write speeds with an all-flash VSAN setup at home? I want to do VDI, templates/cloning, Citrix MCS and so on, which are very disk-IO intensive, so I don't want to sit around for hours waiting for something to happen.

I'd really appreciate any help with this. I definitely have budget for new drives/controllers/etc., so if I need new hardware it's not a problem, but I do want to make sure the performance will be good and that it will work with VSAN.

Thank you!
