I have a three-node vSAN cluster that I'm using to evaluate whether vSAN would be suitable for us. Before I moved any VMs onto the cluster I ran several Iometer tests and got nice results. With a 4GB test file (8k, 65% read, 60% random) I got about 13k IOPS at around 5ms latency, and with a 25GB test file I also got OK results. I then moved 7 VMs onto the vSAN to run some real workload on it. Six of these are testing our own software and one is making software builds. Each VM generates 100-300 IOPS bursts during the workday. Occasionally there can be somewhat higher IOPS bursts, which vSAN handles great:
But mostly I see really high latency on all of the VMs:
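For anyone wanting to reproduce the synthetic test: the Iometer profile above maps roughly to the fio job below, run inside a guest VM. The file path, queue depth, and runtime are my assumptions, not part of the original test.

    # Rough fio equivalent of the Iometer profile (8k blocks, 65% read,
    # 60% random, 4GB working set); iodepth and runtime are guesses
    fio --name=vsan-test \
        --filename=/tmp/vsan-testfile --size=4g \
        --bs=8k --rw=randrw --rwmixread=65 --percentage_random=60 \
        --ioengine=libaio --direct=1 --iodepth=16 \
        --runtime=300 --time_based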
Node specs:
Dell PowerEdge T320
- Intel E5-2420 v2 2.2GHz
- 192GB memory
- PERC H710
- 6x 900GB 2.5" 10k SAS drives
- 1x Samsung PRO series SSD 512GB 2.5" SATA3
- Broadcom 5720 DP 1Gb for VM network traffic
- Broadcom 57810 DP 10Gb for vSAN traffic
- ESXi 5.5 U2
Disks are presented as individual RAID0 virtual disks, with the read policy set to "No Read Ahead" and the write policy set to "Write Through".
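For reference, since the H710 is LSI-based, creating one of those single-drive RAID0 virtual disks with those cache policies looks roughly like this in MegaCli syntax; the enclosure:slot pair and adapter number are placeholders, not from this setup.

    # Single-drive RAID0 with Write Through (WT), No Read Ahead (NORA)
    # and direct I/O; [32:0] is a placeholder enclosure:slot pair
    MegaCli64 -CfgLdAdd -r0 [32:0] WT NORA Direct -a0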
My guess is that the SSD I'm using is choking when several VMs start pushing I/O to the vSAN. I know the SSD isn't on the VMware compatibility list, and I'm not planning to run any real production on this configuration; I just wanted to test vSAN with hardware I had around for other purposes. Could it be the SSD, or is it something else?
Here are some vSAN Observer screenshots taken at the same time as the previous screenshots:
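In case anyone wants to capture the same graphs: vSAN Observer is started from RVC along these lines; the cluster path is a placeholder and the port is the default, neither taken from this setup.

    # From RVC connected to vCenter; serves the Observer UI on port 8010
    vsan.observer ~/computers/<cluster-name> --run-webserver --force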
I am afraid the "consumer" grade SSD isn't cutting it, Henri, unfortunately. I have seen this in the past with other solutions (not VSAN) as well: the SSD simply doesn't do well when multiple workloads access it concurrently.
Is it possible to change the SSD in a disk group by putting the host into maintenance mode and then removing the SSD and adding another one to the group? Or do I have to do a full data migration on every host and delete/recreate the disk group?
I can't imagine you would have to do a full migration; long-term data is not stored on the SSD anyway. You should be able to place the host in maintenance mode, replace the drive as a "failed" drive, then bring it back in. I'll thumb through Cormac and Duncan's book and see if there's anything on it.
Here you go, read through this: http://www.yellow-bricks.com/2013/09/18/vsan-handles-disk-host-failure/
Personally, I would probably do a full migration just out of paranoia, but that's just me.
I've been reading the book and there are mentions of adding and removing disks. Well, it's a test environment, so maybe I'll just test it.
Please do and post the results; I'll follow the thread for my own knowledge.
Thanks,
William
Don't forget to mark Duncan's reply as the answer; he did answer your question.
You need to move all data out, delete the disk group, swap out the SSD, and then create the disk group again... in the current version, that is.
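A rough esxcli sketch of that procedure on ESXi 5.5, for anyone following along; the device names are placeholders, and note that removing the cache SSD tears down the whole disk group.

    # 1. Evacuate all of the host's data (full data migration)
    esxcli system maintenanceMode set --enable true --vsanmode evacuateAllData
    # 2. Removing the cache SSD deletes the entire disk group
    esxcli vsan storage remove --ssd naa.<old-ssd-id>
    # 3. After physically swapping the SSD, recreate the disk group
    esxcli vsan storage add --ssd naa.<new-ssd-id> --disks naa.<hdd-id>
    esxcli system maintenanceMode set --enable false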
Are Intel DC S3500 drives considered consumer grade? My vSAN behaves almost exactly the same. See my other thread for the specs of my setup: Poor All Flash vSAN Performance
I get 400MB+/sec write speeds in a Windows 2012 VM, but latency is off the charts when zipping or copying large filesets (20GB+).
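If it helps narrow down where the latency sits, esxtop's disk device view on the host splits it into device and kernel components; this is standard esxtop usage, nothing vSAN-specific.

    # On the ESXi host shell; press 'u' for the disk device view
    esxtop
    # DAVG/cmd = latency at the device, KAVG/cmd = time spent in the vmkernel,
    # GAVG/cmd = what the guest observes (roughly DAVG + KAVG)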