VMware Cloud Community
andreaspa
Hot Shot

Evaluating VSAN - Need input on a couple of questions

Hi fellow VSAN supporters! 🙂

I am currently evaluating VSAN on three Dell R720s, and I am planning to write an article (in both Swedish and English) once I've concluded my tests.

So far it has been quite a smooth ride setting everything up (thanks to Duncan Epping's and Cormac Hogan's blogs), and performance looks quite good.

Here are my initial observations and questions:

- I understand that you can use different FTT settings for different VMs, but it would be good to have a tool that shows how much space you really have left on the VSAN datastore. For example, with FTT=1 on three hosts, each VM requires double the amount of space. On a regular storage array the numbers for the VMFS datastore stay meaningful (i.e. a 1TB VM takes 1TB on the datastore, while SAN management shows that the volume takes up twice that on the underlying disks). A tool that calculates or shows the free space at different FTT levels would help helpdesks a lot when growing VMDK files or provisioning new VMs.

- Will a VM be part of one or more disk groups? If it is part of many disk groups, will it utilize more than one SSD for reads/writes?

- When running in Auto mode, how will VSAN provision disks? Will it fill one disk group with disks, or will it try to create disk groups with roughly the same number of disks/total storage space?

- I have noticed that VSAN Observer only runs for about two hours, then you have to manually start it again. Is it possible to have it always running in the background for quick and easy access? It would be nice to have these graphs and values directly available from the vSphere Web Client. Perhaps it could be done as an extension you install?

- How is a broken HDD handled? I'm aware that this counts as one failure, but what happens when the drive is replaced, in both auto mode and manual mode? I'm guessing that in manual mode you have to replace the drive and assign it to the same disk group that the old drive was in. How is the data on the other nodes handled? Will it be rebuilt on the witness server's drives? Will there be only one copy left until the node has been repaired?

- Regarding the FTT setting, I've thought about how it handles broken drives. For example, if a drive breaks in one of the hosts, why not try to rebuild the data onto the other drives in the same host if there is space left? Say you have a node with the maximum number of drives: losing the capacity/performance of that entire node feels sub-optimal when only one drive has failed. I would prefer one setting for how many host failures to tolerate and another for how many drive failures per host to tolerate.

- Does VSAN have any similar feature to "Hot Spare" in regular SANs?

- What happens if you have many disk groups and one of the SSDs fails? Can VSAN assign the drives from one disk group to the other disk group[s]? Say I have 2 SSDs and 6 HDDs in two groups, so each group has 1xSSD and 3xHDD. Could VSAN move the three HDDs from the failed group to the other disk group, so I'd have one group with 1xSSD and 6xHDD?

- Do all hosts need to have the same setup with disk groups, or is it only the total capacity that counts?
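As a rough illustration of the space arithmetic in the first question, here is a minimal Python sketch. It assumes that a VM with FTT=n keeps n+1 full copies of its data (which matches the "double the space at FTT=1" observation above) and it ignores witness components, metadata overhead and thin provisioning. The function names are hypothetical; this is not a VSAN tool or API.

```python
def replicas(ftt: int) -> int:
    """Number of full data copies for a given FTT (assumption: FTT + 1)."""
    if ftt < 0:
        raise ValueError("ftt must be >= 0")
    return ftt + 1


def raw_footprint_gb(vmdk_gb: float, ftt: int) -> float:
    """Raw datastore space consumed by a fully written VMDK of vmdk_gb."""
    return vmdk_gb * replicas(ftt)


def provisionable_gb(raw_free_gb: float, ftt: int) -> float:
    """Largest VMDK that still fits in raw_free_gb at the given FTT."""
    return raw_free_gb / replicas(ftt)


if __name__ == "__main__":
    raw_free = 6000.0  # hypothetical raw free space on the datastore, in GB
    for ftt in (0, 1, 2):
        print(f"FTT={ftt}: a 1024 GB VMDK uses {raw_footprint_gb(1024, ftt):.0f} GB raw; "
              f"{provisionable_gb(raw_free, ftt):.0f} GB can still be provisioned")
```

A helpdesk could run something like this against the raw free space reported for the datastore before growing a VMDK, keeping in mind that the real numbers will be somewhat lower once overhead is counted.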

I'll probably have more questions as I play around with it a bit more 🙂

EDIT 2014-04-25:

Came up with another question. When using Update Manager to patch a cluster, do I have to manually put each host into maintenance mode via the Web Client, or can Update Manager put hosts into VSAN maintenance mode for patching? Or is there another preferred way to do this?

Message was edited by: Andreas Paulsson

depping
Leadership

Inline!

andreaspa wrote:

- I understand that you can use different FTT settings for different VMs, but it would be good to have a tool that shows how much space you really have left on the VSAN datastore. For example, with FTT=1 on three hosts, each VM requires double the amount of space. On a regular storage array the numbers for the VMFS datastore stay meaningful (i.e. a 1TB VM takes 1TB on the datastore, while SAN management shows that the volume takes up twice that on the underlying disks). A tool that calculates or shows the free space at different FTT levels would help helpdesks a lot when growing VMDK files or provisioning new VMs. DUNCAN: I have not seen such a tool.

- Will a VM be part of one or more disk groups? If it is part of many disk groups, will it utilize more than one SSD for reads/writes? DUNCAN: Yes, it can be part of multiple disk groups, both on one host and across hosts. The more disk groups you use, the more SSDs you will leverage for reads/writes.

- When running in Auto mode, how will VSAN provision disks? Will it fill one disk group with disks, or will it try to create disk groups with roughly the same number of disks/total storage space? DUNCAN: My understanding is that on each host it will create a 7+1 disk group (seven HDDs plus one SSD) and then start a new one.

- I have noticed that VSAN Observer only runs for about two hours, and then you have to start it again manually. Is it possible to have it always running in the background for quick and easy access? It would be nice to have these graphs and values directly available from the vSphere Web Client. Perhaps it could be done as an extension you install? DUNCAN: No, this is not possible. The Observer is intended for short-term use; it gathers a lot of statistics!

- How is a broken HDD handled? I'm aware that this counts as one failure, but what happens when the drive is replaced, in both auto mode and manual mode? I'm guessing that in manual mode you have to replace the drive and assign it to the same disk group that the old drive was in. How is the data on the other nodes handled? Will it be rebuilt on the witness server's drives? Will there be only one copy left until the node has been repaired? DUNCAN: If the drive fails and its state is "degraded", new copies of your objects will be created instantly. If a host or a disk fails and VSAN doesn't know what happened, it gets the state "absent" and VSAN will wait 60 minutes before creating new copies. I have details on that topic on my blog.

- Regarding the FTT setting, I've thought about how it handles broken drives. For example, if a drive breaks in one of the hosts, why not try to rebuild the data onto the other drives in the same host if there is space left? Say you have a node with the maximum number of drives: losing the capacity/performance of that entire node feels sub-optimal when only one drive has failed. I would prefer one setting for how many host failures to tolerate and another for how many drive failures per host to tolerate. DUNCAN: A single drive failure will NOT render the whole host unusable. Only when the SSD fails does that disk group become unavailable.

- Does VSAN have any feature similar to "Hot Spare" in regular SANs? DUNCAN: In VSAN your whole cluster acts as a "hot spare", provided you make sure there is sufficient disk space available to create new copies when X fails.

- What happens if you have many disk groups and one of the SSDs fails? Can VSAN assign the drives from one disk group to the other disk group[s]? Say I have 2 SSDs and 6 HDDs in two groups, so each group has 1xSSD and 3xHDD. Could VSAN move the three HDDs from the failed group to the other disk group, so I'd have one group with 1xSSD and 6xHDD? DUNCAN: I have not tried that, but theoretically you should be able to do that.

- Do all hosts need to have the same setup with disk groups, or is it only the total capacity that counts? DUNCAN: No, they do not need the same setup.
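The "degraded" versus "absent" behaviour and the "whole cluster as hot spare" idea from the answers above can be sketched as follows. This is only an illustration under stated assumptions: the 60-minute wait comes from the reply, while the names (`ComponentState`, `minutes_until_rebuild`, `can_rebuild`) are hypothetical and do not correspond to any VSAN API.

```python
from enum import Enum

# Wait time before rebuilding "absent" components, as stated in the reply.
REPAIR_DELAY_MINUTES = 60


class ComponentState(Enum):
    DEGRADED = "degraded"  # VSAN knows the device has failed
    ABSENT = "absent"      # VSAN does not know what happened (e.g. host down)


def minutes_until_rebuild(state: ComponentState, minutes_since_failure: float) -> float:
    """Remaining wait before new copies are created (0 means rebuild now)."""
    if state is ComponentState.DEGRADED:
        return 0
    return max(0, REPAIR_DELAY_MINUTES - minutes_since_failure)


def can_rebuild(free_gb_on_surviving_hosts: float, gb_to_recreate: float) -> bool:
    """The cluster only acts as a 'hot spare' if enough free space remains."""
    return free_gb_on_surviving_hosts >= gb_to_recreate


if __name__ == "__main__":
    print(minutes_until_rebuild(ComponentState.DEGRADED, 5))
    print(minutes_until_rebuild(ComponentState.ABSENT, 15))
    print(can_rebuild(free_gb_on_surviving_hosts=500, gb_to_recreate=400))
```

The point of the second function is simply that a rebuild can only start when the surviving disks have room for the new copies, which is why spare capacity has to be planned for up front.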
