I must report to our CFO whose main interest and focus is finance, not IT.
I know we should upgrade our servers and SAN and I would like ideas and suggestions on how to justify it beyond the following valid reasons:
1) Two servers are almost 5 years old but they are already outdated and not on the HCL -- HP DL390 G5 servers with DDR2 RAM, which is barely made/sold now. They are almost ready for a new round of CarePacks, too...same for the SAN.
2) The SAN remains on the HCL but needs a complete new set of disks to accommodate future storage. The existing disks are 300 GB 15k drives; 600 or 900 GB drives are possible, but they are only 10k drives, which reduces available IOPS. The SAN is an HP P2000 G3 iSCSI, which is otherwise fine. It could be repurposed to a DR site.
3) One server is an HP DL360 G7 which could continue as a host server if need be...
4) The servers are not in a true N+1 setup -- if the G7 dies, the two G5s cannot handle all the existing VMs, even with the absolute minimum quantity of VMs running. One G5 + G7 might work...
5) Upgrading to more current hardware enables using newer VMware features...
7) We are getting nearer the SAN's storage limits. No over-provisioning messages have appeared, and as many VMs as possible are thin provisioned, but VMs and appliances seem to grow larger over time...
8) We are on VMware 5.0 U2; we would skip to 5.5...
What other things should I consider and discuss to justify upgrading??
I'm sure I'm missing a lot of things to discuss...
Thank you, Tom
Many thanks for the detailed reply!!
It will take a while to review and apply it, particularly collecting data. ☺
On the N+1 part, one server has 96 GB and the other two have 32 GB apiece. Buying more RAM for those two isn't practical because it's old, expensive DDR2.
Thank you again, Tom
In the case of the compute cluster design, I would step back and think through your design. If you haven't already, check out Duncan Epping's "vSphere 5 Clustering Technical Deepdive". Use his arguments on HA slot sizes as the basis for developing a solid compute cluster HA design. His explanation will help make the case for consistently sized servers in a cluster. If you can't fail over all your VMs due to lack of N+1 redundancy, state this as a risk in a clearly presented email. You can state that if, for example, your largest server goes down, the VMs that can't be failed over will experience an outage. If known, state what the business impact of these VMs being down will be in terms of time and money. Present a solid cluster design that requires the new hardware going forward, and put in writing the risks of not implementing it with the required hardware.
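To make the N+1 point concrete, here is a rough sketch of the check in Python. The host RAM figures come from Tom's post (one host with 96 GB, two with 32 GB); the host labels and the 100 GB VM memory demand are made-up placeholders. It only sums RAM and ignores CPU, hypervisor overhead and HA slot sizes, so treat it as a first-pass illustration, not a replacement for a proper HA admission control calculation.

```python
# Host RAM in GB, from the thread: one host with 96 GB, two with 32 GB.
# The host labels are placeholders for illustration.
HOST_RAM_GB = {"DL360-G7": 96, "G5-a": 32, "G5-b": 32}


def surviving_capacity(hosts, failed):
    """Total RAM (GB) left in the cluster after the named host fails."""
    return sum(ram for name, ram in hosts.items() if name != failed)


def n_plus_one_report(hosts, vm_demand_gb):
    """For each single-host failure, return (remaining GB, fits?)."""
    report = {}
    for name in hosts:
        remaining = surviving_capacity(hosts, name)
        report[name] = (remaining, remaining >= vm_demand_gb)
    return report


# ASSUMPTION: 100 GB of VM memory demand is a placeholder, not a
# measured value. Replace it with the real figure from vCenter.
VM_DEMAND_GB = 100

for host, (remaining, fits) in n_plus_one_report(HOST_RAM_GB, VM_DEMAND_GB).items():
    status = "OK" if fits else "SHORTFALL"
    print(f"If {host} fails: {remaining} GB left for {VM_DEMAND_GB} GB of VMs -> {status}")
```

With these numbers, losing either small host leaves 128 GB, but losing the 96 GB host leaves only 64 GB. That asymmetry is the argument for consistently sized hosts.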
To collect data on things like IOPS and do predictive analysis, capacity planning software will help; the Enterprise trial version of vCOPS, or something similar, will get you started. vCenter Server data alone might be very hard to put together in a meaningful, convincing way in this scenario. Try the vCOPS "what if" scenarios as an example of a presentable explanation of what happens when your environment has 20% more VMs 6 months from now on your existing, unupgraded servers and storage array. If one exists, get the HP storage adapter for vCOPS to integrate the storage information. That may help your case for new storage hardware.
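If you want a back-of-the-envelope version of that "what if" before the trial software is installed, something like the following Python sketch may do. It assumes storage use grows in step with VM count at a steady compound rate, which is a simplification, and the 6 TB used / 9 TB usable figures are invented placeholders, not numbers from this thread.

```python
import math


def monthly_rate(six_month_growth):
    """Convert growth over six months (0.20 = 20%) to a compound monthly rate."""
    return (1.0 + six_month_growth) ** (1.0 / 6.0) - 1.0


def projected_usage(used_tb, six_month_growth, months):
    """Projected usage in TB after the given number of months."""
    return used_tb * (1.0 + monthly_rate(six_month_growth)) ** months


def months_until_full(used_tb, capacity_tb, six_month_growth):
    """Whole months until projected usage reaches capacity (growth must be > 0)."""
    if used_tb >= capacity_tb:
        return 0
    rate = monthly_rate(six_month_growth)
    return math.ceil(math.log(capacity_tb / used_tb) / math.log(1.0 + rate))


# HYPOTHETICAL figures: 6 TB used of 9 TB usable, growing 20% every 6 months.
print(f"Usage in 6 months: {projected_usage(6.0, 0.20, 6):.1f} TB")
print(f"Months until full: {months_until_full(6.0, 9.0, 0.20)}")
```

A single "months until full" number is usually easier to put in front of a finance person than a page of performance charts.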
For a finance person, you'll need to work with him on calculating the cost of downtime.
At some point he'll say "OK, so if that server fails and has no warranty, I'll need to spend $x to buy a part, and we'll be offline for one day". You say "but they don't make them any more, so what you will have is a surprise hunt through eBay which may or may not be fruitful. If it isn't, you're in for a surprise new server whether you want to buy one or not. And it may take weeks to get sorted out."
He'll inevitably feel that being without computers isn't so bad, so you ask "can we actually sell a product during that downtime, or are you paying the salaries of people who are just sitting on their thumbs?"
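A minimal sketch of that downtime arithmetic in Python, in case it helps to put both scenarios side by side. Every figure here (staff count, wage, revenue per hour, part and server prices) is a hypothetical placeholder to be replaced with numbers from finance and the business managers.

```python
def downtime_cost(hours, idle_staff, loaded_hourly_wage,
                  lost_revenue_per_hour, one_off_cost=0.0):
    """Estimated cost of an outage: idle salaries + lost revenue + parts/hardware."""
    idle_salaries = hours * idle_staff * loaded_hourly_wage
    lost_revenue = hours * lost_revenue_per_hour
    return idle_salaries + lost_revenue + one_off_cost


# HYPOTHETICAL scenario 1: the optimistic "one day offline, buy a part" case.
one_day = downtime_cost(hours=8, idle_staff=10, loaded_hourly_wage=50.0,
                        lost_revenue_per_hour=1000.0, one_off_cost=500.0)

# HYPOTHETICAL scenario 2: no part can be found, three working weeks
# (15 days of 8 hours) pass before an unplanned new server is in service.
three_weeks = downtime_cost(hours=8 * 15, idle_staff=10, loaded_hourly_wage=50.0,
                            lost_revenue_per_hour=1000.0, one_off_cost=8000.0)

print(f"One day offline:     ${one_day:,.0f}")
print(f"Three weeks offline: ${three_weeks:,.0f}")
```

The model is deliberately crude (it treats every hour as a full loss), but it turns "being without computers" into a dollar figure the CFO can challenge or accept.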
Hi Tom,
Bit hard to talk financial when you're a techie. See if you can get help from your reseller to put together a TCO case he will understand! The Care Packs to keep that equipment under maintenance go up after 3 years and keep rising after that. Get some figures and graphs to show at what point you would be better off buying new now than running another 2-3 years on the old kit....
If you have NO maintenance on key kit, then find out what the lead time is on a replacement for the most critical / expensive component. Add delivery, build time, testing and out-of-hours overtime, and throw that on top of the outage cost to the business. Can people work without their IT servers? Go to the business and ask the managers directly. Ask them for costs....
Keep a record of issues and rate them according to severity, downtime, etc., and show whether the systems are getting more unstable over time. Calculate the actual hours spent on fixing vs operational activity, $$$!
Then use this to justify proper DR: if you get budget for new kit, add DR to it, so you sell a business benefit along with the hardware refresh. Just pick a really warm climate to base the Datacenter, somewhere with a beach and remote wifi and you're set...!!!
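As a rough illustration of that break-even graph, the Python sketch below compares cumulative cost of keeping the old kit (rising yearly maintenance) against buying new (purchase price plus its own yearly costs). The function name and all the cost figures are made up for the example; real Care Pack and purchase quotes would come from the reseller.

```python
from itertools import accumulate


def break_even_year(old_annual_costs, new_purchase, new_annual_costs):
    """First year (1-based) in which the cumulative cost of keeping the old
    kit reaches or exceeds the cumulative cost of buying new, or None if
    that never happens within the years given."""
    old_cumulative = accumulate(old_annual_costs)
    new_cumulative = (new_purchase + c for c in accumulate(new_annual_costs))
    for year, (old, new) in enumerate(zip(old_cumulative, new_cumulative), start=1):
        if old >= new:
            return year
    return None


# HYPOTHETICAL figures in thousands of dollars per year:
# old kit maintenance rises each year; new kit costs 30 up front with
# warranty included for three years, then 5 per year.
old_kit = [10, 15, 20, 25]
new_kit_yearly = [0, 0, 0, 5]

print("Break-even year:", break_even_year(old_kit, 30, new_kit_yearly))
```

With these placeholder numbers the old kit becomes the more expensive option in year 3. Adding the expected outage cost to the old kit's yearly figures would pull that point earlier.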
Mike