VMware Cloud Community
FrostyatCBM
Enthusiast

UPS recommendations (make/model, cabling, software)?

I had a UPS die on me the other day. Fortunately almost everything kept working, apart from a few switches, which we quickly recabled to the remaining UPS. However, I am now running everything through my one remaining APC SmartUPS 3000, and I need to decide what to replace the dead UPS with and how to set things up for the future.

Equipment-wise we need to support:

3 x Dell R710 ESXi 4.1 hosts

2 x Dell R610 Windows servers

1 x Dell MD3200 storage (12 drives)

1 x Dell MD1200 storage (12 drives)

a few switches, a firewall and a router

Previously I had cabled all my servers/storage so that one PSU went to UPS1 and the other to UPS2. So when the UPS died everything kept working. The one remaining UPS is sitting at about 60% load right now.

I was thinking of buying 2 new SmartUPS 3000s so that I have 3 in total, and then ensuring that each UPS is on a separate phase of power. So far so good. But how best to cable up the servers to distribute the load AND ensure that in the event of power loss we can control things properly?

One idea was to cable like this:

-- server #1 to UPS1 and UPS2

-- server #2 to UPS2 and UPS3

-- server #3 to UPS3 and UPS1

but if I lose power (on anywhere from 1 to all 3 phases), working out how to detect the failure and then manage/control the shutdown of VMs/hosts will be horribly complicated. I could really use some suggestions on how best to utilise UPS gear in a setup like mine (3 hosts, 2 storage trays and 2 physical servers), given that I have the flexibility to either completely replace my one remaining SmartUPS 3000 or add 1 or 2 new 3000s to the mix.
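To get a feel for how many failure cases the ring cabling above creates, here is a minimal Python sketch. It is only an illustration: the `CABLING` map mirrors the three-server idea in this post, and the function names and the "ok" / "degraded" / "down" labels are made up for the example, not taken from any UPS vendor's software.

```python
from itertools import combinations

# Hypothetical cabling map for the ring idea: each server has two PSUs,
# each fed from a different UPS. Names are illustrative only.
CABLING = {
    "server1": ("UPS1", "UPS2"),
    "server2": ("UPS2", "UPS3"),
    "server3": ("UPS3", "UPS1"),
}

def device_state(feeds, failed):
    """Return 'ok', 'degraded' (some feeds lost) or 'down' (all feeds lost)."""
    live = [ups for ups in feeds if ups not in failed]
    if len(live) == len(feeds):
        return "ok"
    return "degraded" if live else "down"

def outage_table(cabling):
    """Map every non-empty combination of failed UPSs to per-device states."""
    upses = sorted({ups for feeds in cabling.values() for ups in feeds})
    table = {}
    for count in range(1, len(upses) + 1):
        for failed in combinations(upses, count):
            table[failed] = {
                device: device_state(feeds, failed)
                for device, feeds in cabling.items()
            }
    return table

if __name__ == "__main__":
    for failed, states in outage_table(CABLING).items():
        print(failed, states)
```

With three UPSs there are seven failure combinations to reason about, and any single UPS failure leaves two of the three servers running on one PSU. That is the monitoring complexity the question is worried about.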

5 Replies
FrostyatCBM
Enthusiast

Having investigated the power situation in our building, it turns out that the building used to be occupied by Telstra (Australia's biggest telco) and the power setup in the basement seems pretty good. There seems to be an isolation switch for each of the 3 phases of power.

I'm currently thinking of cabling all PSUs on a VMware ESX host to a single UPS, rather than splitting them across multiple UPSs, as this will make the monitoring and shutdown very simple. I do recognise that this makes the UPS a single point of failure for that host; however, the chance that a UPS will fail is reasonably remote and I guess I can live with that event.

I did think though that I should cable the PSUs on a storage shelf to separate UPSs ... so that if any individual UPS unit fails, the storage system will keep working on the remaining one ... and with my storage systems (with their battery-backed cache) there is no automated shutdown process to worry about (when the UPS battery runs out, it fails over to the controller battery and writes to disk I guess).

J1mbo
Virtuoso

Across my current deployment of 8 APC UPSs, two have failed in the same way that you experienced, so I don't think a single UPS can be depended on.

Instead I would suggest balancing the loads such that storage devices and storage switches have slightly more run-time than the servers - and ensure the batteries are calibrated periodically.

For shutdown, assuming you don't have a generator, you may need to consider just initiating a shutdown after, say, 15 minutes of any failure, especially if the loss of a phase will also take out your comms room cooling equipment. The APC units can work together in this way by adding multiple UPSs to PCNS (PowerChute Network Shutdown).
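The "shut down after 15 minutes of any failure" rule can be sketched in a few lines of Python. This is plain decision logic under stated assumptions, not the PCNS API: the function name, the dictionary layout and the idea of feeding it timestamps from whatever monitoring you use are all hypothetical.

```python
# Assumed policy from the suggestion above: 15 minutes on battery, on any UPS.
SHUTDOWN_AFTER_SECONDS = 15 * 60

def should_shut_down(on_battery_since, now, limit=SHUTDOWN_AFTER_SECONDS):
    """Decide whether to start the shutdown sequence.

    on_battery_since maps a UPS name to the time (in seconds) at which it
    switched to battery, or None if it is currently on mains power.
    Returns True once any UPS has been on battery for at least `limit` seconds.
    """
    return any(
        started is not None and now - started >= limit
        for started in on_battery_since.values()
    )
```

For example, `should_shut_down({"UPS1": None, "UPS2": 0}, now=900)` is true, because UPS2 has been on battery for exactly 15 minutes even though UPS1 is still on mains.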

HTH

FrostyatCBM
Enthusiast

Mmm, yes, good point J1mbo, I'd not taken cooling into account in my thinking. Yes, I think about 15 minutes on battery before shutdown would be acceptable, and that ought to leave plenty of time for the guests and hosts to shut down while the storage arrays still have some battery left (likewise the switches, ADSL and so on).

Whilst having 3 smaller UPSs multiplies the probability that at least one fails compared to a single UPS, I do like the idea of having 2 UPSs to fall back on if one UPS dies on me ... it'd be a manual recabling process, but I think I can live with that.
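The trade-off above can be put into rough numbers with a short Python sketch. It assumes every UPS fails independently with the same probability `p` over some period, which is a simplification (shared power events or a bad battery batch would break it), and the value of `p` used below is a made-up placeholder, not a measured failure rate.

```python
def p_any_fails(p, n):
    """Probability that at least one of n independent UPSs fails."""
    return 1 - (1 - p) ** n

def p_all_fail(p, n):
    """Probability that all n independent UPSs fail."""
    return p ** n

if __name__ == "__main__":
    p = 0.05  # hypothetical per-UPS failure probability, for illustration only
    print("one UPS fails:            ", p)
    print("at least one of three:    ", p_any_fails(p, 3))
    print("all three at the same time:", p_all_fail(p, 3))
```

For small `p`, the chance of seeing some failure is roughly three times higher with three units, while the chance of losing every UPS at once drops to `p` cubed, which is the fallback benefit described above.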

Appreciate your input!

ViRT156
Contributor

My setup is close to yours: 2 x R610s, an MD3200i, an MD1200 and a couple of APC 2200s. I'm using an external server to control the APCs, as it is also doing the monitoring for the MD3200. I have two apcupsd instances running on the external server and am in the testing phase of the shutdown sequence.

How are you shutting down the MD3200 when a power failure is detected?

FrostyatCBM
Enthusiast

Right now we're not shutting anything down automatically when there is a power failure and we have to switch to batteries. It's all very primitive.

What we've decided to do (and I've just signed the purchase order) is buy 3 new EATON 9130 rackmount 3000VA UPS units. I'm going to cable my MD3200 and MD1200 so that one power supply runs to UPS1 and the other to UPS2. With my R710s, I'm going to cable them so that both power supplies run to a single UPS (R710-1 to UPS1, R710-2 to UPS2 and so on). Then I will spread the remaining load manually. EATON have software which integrates directly with vSphere/vCenter, so that if we get a power failure on any host I can schedule an automatic vMotion of the affected VMs to move them to an unaffected host.
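For the "spread the remaining load manually" step, a simple greedy rule (largest device first, always onto the least-loaded UPS) is one way to get a starting point. The Python below is a sketch under assumptions: the function name is made up, and every wattage in the usage example is a placeholder, not a measurement from this setup.

```python
def spread_loads(base_load_w, extra_loads_w):
    """Greedily assign extra devices to the least-loaded UPS.

    base_load_w:   UPS name -> watts already committed (hosts, storage).
    extra_loads_w: device name -> watts still to be placed.
    Returns (plan, totals): device -> chosen UPS, and UPS -> resulting watts.
    """
    totals = dict(base_load_w)
    plan = {}
    # Place the biggest devices first so small ones can even things out.
    ordered = sorted(extra_loads_w.items(), key=lambda kv: kv[1], reverse=True)
    for device, watts in ordered:
        target = min(totals, key=totals.get)
        plan[device] = target
        totals[target] += watts
    return plan, totals

if __name__ == "__main__":
    # Placeholder figures only; measure the real draw before relying on this.
    base = {"UPS1": 900, "UPS2": 900, "UPS3": 400}
    extras = {"switch-a": 100, "firewall": 60, "router": 40}
    plan, totals = spread_loads(base, extras)
    print(plan)
    print(totals)
```

The result is only a balance of steady-state load; it does not account for run-time targets such as giving storage and storage switches more battery time than the servers.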

Regarding the MD3200 and power management ... I'm not 100% certain, but I think the answer is "there is no management".

In fact, I recently had an issue with my MD3200 and went looking for the shutdown/power-off button and discovered that there isn't one in the management application. So when I needed to restart the unit, Dell advised me to basically pull the controller cards out of it, wait a minute, then put them back in. Seems extraordinarily crude! Anyway, I don't think there's anything you can do to shut down these units. Just let the power run out and let the battery-backed cache look after the rest?!
