VMware

jayctd

jayctd's Profile

  • Name: Jered 
  • Email: (Private)
  • Member Since: Jun 3, 2009
  • Last Logged In: Nov 18, 2009 6:02 AM
  • Status Level: Hot Shot (619 points)
  • Signature: ##If you have found my post has answered your question or helpful please mark it as such##

jayctd's Latest Content

blog in The Completegeek

Posted by jayctd Aug 29, 2009 0 Comments

It's becoming kind of a pain to maintain my blog in two places (the one I own and the one on the communities). All my VMware-related posts exist, and will continue to be posted, at www.youdontevenrealize.com/blog


I have gotten enough questions on the VMware community about best practices for EqualLogic configuration, both in general and with respect to a VMware environment, that a write-up seems worthwhile. This is a complex decision-making process with many options, so I will walk through some of the hardware and configuration choices, along with any gotchas I have run across.

I plan on additional editions, including a separate one on configuration for blades.

Hardware:

Network:

Your first choice is the SAN network hardware itself. The basic rule of thumb is to purchase the best switches you can for your budget. The two features you are looking for are flow control and jumbo frame support.

Note: Be sure you can have both features enabled at the same time. We have run across some HP switches that offer both features but, because of resource limitations, do not work with both enabled at once.

We have chosen Cisco 3750s as our baseline SAN switch. Their stacking capability, performance, and reliability make them an ideal switch for really high port count SAN networks.

Disk Technologies:

We have found as we scaled (notably past the 200 virtual mark, though we were seeing hints of performance issues before then) that we needed to tier disk performance for certain types of activity. In the EqualLogic world we get great performance out of even SATA (especially as the groups get bigger, since volumes will stripe across up to 3 arrays, giving good IOPS), but there are certain profiles where you just can't use Tier 2 type disks.

As a general rule we get about 1400 to 1800 IOPS out of any individual SATA volume in a multi-array pool, even with MPIO. We can get closer to 8000 out of a SAS disk pool with the same configuration.

Boot storms also had a large impact on the performance of ESX system drives. As such, for larger ESX environments I recommend going to SAS for your system drives. Data itself is more of a case-by-case decision. With the right configuration we have been able to get very good performance out of SATA, so trying SATA first is not a bad idea.
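The rule of thumb above can be written down as a small helper. This is only a sketch: the thresholds are the rough per-volume figures quoted in this post (using the conservative end of the SATA range), the function name pick_tier is made up for illustration, and real numbers depend heavily on array count and workload.

```python
# Rough per-volume figures quoted in this post (multi-array pool, MPIO enabled).
SATA_IOPS_LOW = 1400  # conservative end of the 1400 to 1800 range
SAS_IOPS = 8000       # "closer to 8000" for a SAS pool


def pick_tier(required_iops, esx_system_drive=False):
    """Suggest a disk tier for a single volume.

    Returns "SATA", "SAS", or None when the requirement exceeds what
    one volume delivered in our environment.
    """
    if esx_system_drive:
        # Boot storms hit system drives hard, so they go straight to SAS.
        return "SAS"
    if required_iops <= SATA_IOPS_LOW:
        return "SATA"  # SATA first when the numbers allow it
    if required_iops <= SAS_IOPS:
        return "SAS"
    return None


if __name__ == "__main__":
    for iops in (1000, 5000, 9000):
        print(iops, pick_tier(iops))
```

A None result just means a single volume is unlikely to be enough on its own and the workload needs a closer look.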

Configuration:

Network:

The ideal configuration is built around redundancy. We HAVE had a case where a DOA switch caused an entire stack to lock up and reboot, so segregating your administrative functions is important. With that design goal in mind, if you can afford it, you should split your SAN across two separate switches or stacks of switches.

You then create an EtherChannel between the two stacks to pass traffic. EqualLogic has made some changes to its stated best practices on how many ports that channel should use based on the number of arrays. Personally, even with more than 12 arrays, we have never seen more than about 1 Gbps through a port channel, even though our channel is much larger than that.

ESX Host configuration:

Best practice is to have a minimum of one port on each of your two networks (I will refer to them as red and green, as that is how I am used to differentiating between them).

We generally attach volumes directly to virtuals with the Microsoft iSCSI initiator. It is worth noting that while MPIO gives you more redundancy, it only really detects link down. Because the virtuals sit behind virtual switches, MPIO on the virtual will not see the failure even if a physical link drops.

As such you will want to create two SAN virtual switches, each with two teamed physical ports, one in each SAN network. The virtuals then get two virtual interfaces, one on each of the two SAN vSwitches.

The vSwitches can then handle the failure of a link, and you can guarantee up to 2 Gbps of SAN connectivity to each host even in a failure situation.
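To see why this layout holds up, here is a toy model of it. It assumes 1 Gbps uplinks and that each vSwitch carries one uplink's worth of SAN traffic as long as at least one of its teamed ports is up. The vSwitch and port names are made up for the example.

```python
from itertools import combinations

LINK_GBPS = 1  # assumption: 1 GbE uplinks

# Two SAN vSwitches, each teamed across the red and green SAN networks.
VSWITCHES = {
    "vSwitchSAN1": ["red-a", "green-a"],
    "vSwitchSAN2": ["red-b", "green-b"],
}


def guest_bandwidth(failed):
    """Gbps left for a virtual with one interface on each SAN vSwitch."""
    total = 0
    for uplinks in VSWITCHES.values():
        # The team hides a dead link as long as one uplink survives.
        if any(port not in failed for port in uplinks):
            total += LINK_GBPS
    return total


def worst_case(n_failures):
    """Lowest bandwidth over every combination of n failed uplinks."""
    ports = [p for uplinks in VSWITCHES.values() for p in uplinks]
    return min(guest_bandwidth(set(c)) for c in combinations(ports, n_failures))


if __name__ == "__main__":
    print("any single link down:", worst_case(1), "Gbps")
    print("whole red network down:", guest_bandwidth({"red-a", "red-b"}), "Gbps")
```

In this model any single link failure, or even the loss of one whole SAN network, still leaves both virtual interfaces working. Bandwidth only drops when both ports of the same vSwitch fail.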

Physical server configuration:

This is a bit easier: simply put one port in each of your two networks. Because the ports are physical, a failure shows up as link down on the server, so no fancy teaming is necessary. Simply use MPIO, which will also perform much better than a team.

MPIO configuration:

I would recommend installing the whole Host Integration Toolkit. It makes configuring MPIO on Windows machines much easier (especially given some of the quirks I have seen out of Server 2008). It also gives you some extra tools and installs additional load balancing options.

Of those, the recommended load balancing configuration is least queue depth, which tends to do a better job than round robin (RR).
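The difference between the two policies shows up as soon as one path is slower than the other. The following is a toy simulation, not how the MPIO driver is actually implemented: one I/O is issued per tick, the red path completes one I/O per tick, and the green path (assumed congested) completes one only every third tick.

```python
from itertools import cycle


def least_queue_depth(depths):
    """Pick the path with the fewest outstanding I/Os."""
    return min(depths, key=depths.get)


def simulate(policy, ticks=30):
    """Return outstanding I/Os per path after issuing one I/O per tick."""
    depths = {"red": 0, "green": 0}
    rotation = cycle(list(depths))  # round robin ignores load entirely
    for tick in range(ticks):
        path = next(rotation) if policy == "rr" else least_queue_depth(depths)
        depths[path] += 1
        # The red path drains one I/O every tick.
        if depths["red"] > 0:
            depths["red"] -= 1
        # The slow green path drains one I/O only every third tick.
        if tick % 3 == 2 and depths["green"] > 0:
            depths["green"] -= 1
    return depths


if __name__ == "__main__":
    print("round robin:      ", simulate("rr"))
    print("least queue depth:", simulate("lqd"))
```

Round robin keeps handing half the I/O to the slow path, so its queue grows, while least queue depth steers work toward whichever path is keeping up.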


I am approaching this from the point of view of our EqualLogic SAN, so some of these points (namely number 4) may not be as big an issue on other SANs. Most of this discussion applies to the 3.5 world; I will address what 4.0 changes at the end.
Pro:
1) One place where VMDKs are a good choice is where you have questionable system administrators with local administrative accounts. In some cases vendors or clients have administrator accounts (they are the exception) so they can handle specific processes. The problem is that if the virtual actually has adapters on the SAN network, they have a more direct avenue for causing problems, whether on purpose or by accident. That can be anything from re-addressing adapters to malware that gets installed because they are an admin (not likely in a controlled environment, but still possible). VMDKs protect the SAN network from those admins.
2) VMDKs also behave more nicely in the case of SAN failure or performance issues. If a virtual loses connectivity to its VMDK, it pauses rather than dropping the volume.
3) Storage VMotion. While moving volumes between pools is easy, moving them between groups or between different SAN vendors is much harder. There are processes for it, but if your data is in a VMDK it is a snap. Simply mount the new SAN, or a volume in a new group, and fly; no downtime is necessary for what would have been a major move. Virtual storage is the best thing for storage administrators.
4) Not having to administer separate SAN volumes and configuration. No setting up ACLs and MPIO configurations, installing HIT, and configuring adapters. This cuts deployment time for a volume from half an hour to 2 minutes, and moves the knowledge out to system administrators rather than storage administrators.
5) We have run across a number of situations lately that required either a Host Integration Toolkit upgrade or an MSISCSI version upgrade to overcome the loss of connectivity to volumes. This is a BIG deal when it means touching 200+ servers. Reducing update and hand maintenance tasks is a huge money saver in terms of labor costs.
6) The last advantage I'll discuss is administration security. Unfortunately, the granularity of permissions on EqualLogic is lacking. You may not want all your system administrators to be group admins just so they can expand a SAN volume. In the past we have written custom interfaces to achieve this without giving them direct permission, but VMDKs achieve it too. A sysadmin no longer needs to be a group administrator on your SAN just to create new volumes or expand drives.
Con:
1) Performance: This is the biggest concern. Because 3.5 does not support MPIO on its software initiator, performance is just not as good in the EqualLogic world, where MPIO has a huge impact.
2) You cannot right now take advantage of some of the advanced features the Host Integration Toolkit provides. Most people are not using these features, so it is not as large a concern.
I am now pleased to say that we are seeing awesome performance (especially with the new beta driver) out of vSphere 4, as the software initiator now has MPIO support. We are likely to evaluate VMDKs as our default configuration (presuming successful testing in production) because of the advantages above.
