How should a storage solution be designed so that you can bring parts of it down for regular maintenance without impacting the availability and performance of your VMs? For example, how do you update storage firmware without impacting VM availability or performance? What are all the pieces that need to be in place for this to work, and what are best practices for planning storage maintenance windows? For instance:
- Multiple redundant storage controllers, so you can bring one down at a time.
- Storage vMotion performed by VMware to move VMs off the system being serviced and back on afterwards.
- What else needs to be in place for this to work, and what are your experiences?
Thanks
NetApp does this pretty well. You have two controllers (NetApp calls them filers) in a single device with near-lossless manual failover. When we upgrade software, we fail all traffic over to one filer, upgrade the offline filer, bring it back up, fail over to that one, and upgrade the other filer.
Now, this can affect performance if your filers are regularly above 50% utilization in any area (CPU, memory, network, etc.). If they're not, performance should not be affected at all (or only very little).
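The takeover/giveback flow described above maps to a few CLI commands on clustered ONTAP. This is only a sketch: the node names (`na01`, `na02`) are placeholders, and exact syntax varies by ONTAP version, so check your release's documentation before running anything.

```shell
# Check HA pair health first: takeover should be reported as possible
# for both nodes before you start.
storage failover show

# Fail all traffic over to na02 by having it take over na01's workload.
storage failover takeover -ofnode na01

# ...upgrade/reboot na01 while na02 serves both nodes' data...

# Return na01's aggregates and traffic once it is back up and healthy.
storage failover giveback -ofnode na01

# Then repeat in the other direction to service na02.
storage failover takeover -ofnode na02
storage failover giveback -ofnode na02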
We use a lot of HP P4000s; their Network RAID provides protection against node failure.
To update SPs (storage processors) you will need multipathing. Be sure you can control which path is being used to access your storage, so you can fail the path over manually without disrupting the system, and check the SPs' utilization so you don't over-utilize the ones you leave running. Even then, there are (I think) parts of the storage system that will need a shutdown one way or another.
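On an ESXi host, the manual path failover can be done with `esxcli`. A sketch, assuming a path runtime name like `vmhba2:C0:T1:L0` (yours will differ; take it from the path list) and that the surviving paths have enough headroom:

```shell
# List all paths so you can see which SP/target each one goes through.
esxcli storage core path list

# Disable the path through the SP you are about to service; I/O fails
# over to the remaining paths per the device's multipathing policy.
esxcli storage core path set --state off --path vmhba2:C0:T1:L0

# ...service/upgrade that SP...

# Re-enable the path once the SP is back online.
esxcli storage core path set --state on --path vmhba2:C0:T1:L0
```

Watch latency and queue depth on the remaining paths while one is disabled, so you know the reduced path set is actually coping.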
If the entire storage device needs to go down at least periodically, what are the strategies for keeping VMs in service during that time? For example, Storage vMotion the VMs elsewhere temporarily and move them back after the maintenance? Other than maintaining twice the needed storage at all times, so that one storage device can go down at a time, are there better strategies?
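The evacuate-and-return approach can be scripted. A sketch using the open-source `govc` CLI against vCenter; the datastore names (`prod-ds`, `temp-ds`) and VM names are placeholders, and flags may differ between govc versions:

```shell
# Sanity-check the datastore that is going into maintenance.
govc datastore.info prod-ds

# Storage vMotion the affected VMs to a temporary datastore.
govc vm.migrate -ds temp-ds vm01 vm02

# ...take the array behind prod-ds down for maintenance...

# Move the VMs back once the array is healthy again.
govc vm.migrate -ds prod-ds vm01 vm02
```

The obvious cost is that `temp-ds` must have capacity and performance for the evacuated VMs for the whole window, which is why this only beats the "keep 2x storage" approach if the temporary capacity is shared across many maintenance events.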
If you have enough nodes in your storage environment, you shouldn't need to bring it down very often, if at all.
If you are talking about data redundancy, then you need not worry: a single storage device should have more than enough disks to cater for several failures at the same time without bringing the business down. In that case, buy a single storage device and add more nodes to it as you grow.
If you are talking about storage redundancy, then the answer is simple: buy two storage devices with fewer nodes on each, then grow them simultaneously.
Cheers,
Paul
OK, thanks for the input.