VMware Cloud Community
sel57
Enthusiast

Question about VM Redundancy

Hi All,

I have a 12-host production cluster, all running ESXi 5.1. Half of my hosts are 2 socket, 6 core, or 24 logical processors with HT active. The other half are 2 socket, 8 core, or 32 logical processors with HT active.

My team recently wanted a couple of larger VMs deployed, specifically ones with 24 vCPUs. They want to enjoy the benefits of managing from vCenter even for VMs that could certainly warrant having their own dedicated physical servers. This is fine, but I found out the hard way that when one of those larger VMs wants to use all of its allocated resources and they're not available, problems arise and the VM's network becomes intermittent and/or unresponsive.

I've been hesitant to turn on HA or DRS because of these larger VMs. I have a lot of resources available in the cluster, but with 5 to 10 smaller VMs per host, I'm not sure that in the event of a host failure any single remaining host would have ample room for one of these larger VMs. For now, I have each of these larger 24 vCPU VMs dedicated to its own physical (32 logical processor) host. Is fully automated DRS smart enough to move VMs off one server so that it can move one of these bigger VMs onto its own server (if needed)? I'm concerned because I don't want these large VMs becoming unresponsive every time they ramp up and require all of their allocated CPU, only to have it not available until automated DRS does its thing.
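To make the failover worry concrete, here is a minimal sketch of the headcount I have in mind. The host names and the vCPU totals already allocated on each host are made up for illustration, and the 1.0 vCPU per logical processor ceiling is my own assumption. This is only a simple vCPU count, not what HA admission control or DRS actually evaluates.

```python
def hosts_with_room(hosts, needed_vcpus, max_ratio=1.0):
    """Return names of hosts that can accept a VM of needed_vcpus
    without exceeding max_ratio vCPUs per logical processor."""
    fits = []
    for name, (logical_cpus, used_vcpus) in hosts.items():
        if used_vcpus + needed_vcpus <= logical_cpus * max_ratio:
            fits.append(name)
    return fits

# Hypothetical surviving hosts after a failure:
# name -> (logical processors, vCPUs already allocated to small VMs)
survivors = {
    "esx01": (24, 14),
    "esx02": (32, 16),
    "esx03": (32, 6),
}

# Which survivors could take a 24 vCPU VM without going over the ceiling?
candidates = hosts_with_room(survivors, needed_vcpus=24)
print(candidates)
```

With these made-up numbers only one host qualifies, which is the situation I'm afraid of: plenty of free capacity in aggregate, but very few single hosts that can hold a 24 vCPU VM.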

I've thought about creating a separate cluster for these larger VMs, one physical host per large VM with one empty host for failover. I've also looked a little into reserving resources, but I'm not quite sure how that works or how it could affect the cluster resources.

How would you set up such an environment? Your thoughts are appreciated.

2 Replies
steveb05
Enthusiast

You can likely accomplish your goals by using DRS groups and rules.

http://blog.clearpathsg.com/blog/bid/252910/vSphere-5-DRS-Groups-and-Rules

- Steve
Please consider marking this answer "correct" or "helpful" if you found it useful.
Steve Brill
Virtualization Junkie
VMware, SAN/NAS, Networking and Server Infrastructure Engineer
sel57
Enthusiast

Thanks Steve. I think DRS groups and rules are a great start to ensuring the larger VMs stay on the larger (CPU) hosts, but I'll still have to figure out what I'll do about oversubscription.

In one example, I think I ended up with a 24 vCPU VM on a host with 24 logical processors, and the host also had 3 or 4 other small VMs when the large VM just stopped responding and the host CPU shot to something like 170% until I migrated the VM onto its own host. I would imagine even on a slightly larger 32 logical processor host, this could still happen if I let enough of these smaller VMs intermingle with the big ones.
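A rough way to put numbers on that scenario, assuming four small VMs with 2 vCPUs each (an illustrative size, not something I recorded at the time):

```python
def vcpu_ratio(vm_vcpus, logical_processors):
    """Return total allocated vCPUs divided by the host's logical processors."""
    return sum(vm_vcpus) / logical_processors

# One 24 vCPU VM plus four small VMs; the 2 vCPU size is an assumption.
vms = [24, 2, 2, 2, 2]

ratio_24 = vcpu_ratio(vms, 24)  # 2 socket x 6 core host with HT
ratio_32 = vcpu_ratio(vms, 32)  # 2 socket x 8 core host with HT

print(f"24 LP host: {ratio_24:.2f} vCPU per logical processor")
print(f"32 LP host: {ratio_32:.2f} vCPU per logical processor")
```

Under those assumptions the 24 logical processor host is already oversubscribed, and the 32 logical processor host is right at 1 vCPU per logical processor. Since the logical processor counts include HT, the 32 logical processor host has only 16 physical cores, so even that ratio means 2 vCPUs per physical core.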
