I am planning a deployment of over 3000 Virtual Machines to be migrated over the year. I would like to understand the challenges in Migrations, Operations and Governance of an environment this big.
Wow, first post and a huge question. Fundamentally I think you want to talk to a VMware partner or contractor who has proven experience in VMware architecture and design.
You may want to get a VMware Operational Readiness Assessment. This will review your organisation and operations across multiple disciplines to give you a report card and some guidance on the areas you will need to be thinking about.
Review some of the documents over at the new VI:OPS site. Your mileage may vary on these, however: they may say best practice, but you will note that a lot are a little light on detail. Still, you will get some good starting points and may find some gems.
For migration you are really going to want a tool. Given the scale you will want to conduct a formal review/comparison of conversion tools such as Platespin or Vizioncore vConverter.
You are going to want to understand your workloads so you can design your clusters, storage and networking. Again, given the size I would recommend a full VMware Assessment engagement which includes capacity analysis and TCO/ROI plus much more.
Come up with a training scheme appropriate to the different roles in your teams.
Give some good consideration to your backup/restore architecture and methods.
Give some good consideration to a monitoring platform for your new environment, such as vFoglight from Vizioncore, Nimbus or others. If you already have a management platform, consider how you will integrate with it.
How are you going to manage those machines on an ongoing basis? You will want to evaluate whether Stage Manager or Lifecycle Manager will be of assistance. You can purchase the M&A bundle.
Consider if you want to include a VMware Technical Account Manager (TAM) into your environment for the first year to accelerate and assist in your adoption.
Technically, for the design you are going to want to spend a lot of time thinking about scaling up versus scaling out. How many Virtual Centers are you going to run? How many clusters, of what size, and how are you going to differentiate them? I would be thinking of creating a pod architecture which may even include the storage.
What are your workloads like? For example, your solution would look very different if it were a hosting environment of 3000 lightly loaded web servers which are all very similar, compared to an enterprise business environment with many different OSs, workload profiles and user departments. Do you already have a SAN or some VMware?
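To make the scale-up versus scale-out trade-off concrete, here is a rough back-of-the-envelope sketch in Python. Everything in it is an assumption for illustration only: the function name `size_pods`, the consolidation ratios, the cluster size, the one spare host per cluster for failover, and the four clusters per pod. Replace them with figures from your own capacity analysis.

```python
import math


def size_pods(total_vms, vms_per_host, hosts_per_cluster,
              clusters_per_pod, spare_hosts_per_cluster=1):
    """Estimate (hosts, clusters, pods) for a given consolidation ratio.

    All inputs are planning assumptions, not measured values.
    """
    if hosts_per_cluster <= spare_hosts_per_cluster:
        raise ValueError("cluster must have more hosts than spares")
    # Capacity is counted only on the hosts that are not held in reserve.
    usable_hosts = hosts_per_cluster - spare_hosts_per_cluster
    vms_per_cluster = usable_hosts * vms_per_host
    clusters = math.ceil(total_vms / vms_per_cluster)
    hosts = clusters * hosts_per_cluster
    pods = math.ceil(clusters / clusters_per_pod)
    return hosts, clusters, pods


# Scale out: many smaller hosts at an assumed 15 VMs per host.
print(size_pods(3000, vms_per_host=15, hosts_per_cluster=8, clusters_per_pod=4))
# Scale up: fewer, larger hosts at an assumed 40 VMs per host.
print(size_pods(3000, vms_per_host=40, hosts_per_cluster=8, clusters_per_pod=4))
```

With these made-up ratios the scale-out case comes to 232 hosts in 29 clusters across 8 pods, and the scale-up case to 88 hosts in 11 clusters across 3 pods. The point is not the exact numbers but how strongly the consolidation ratio drives the number of clusters and Virtual Centers you end up managing.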
I am sure others will come up with a whole stack more suggestions, go find yourself a great partner to work with. Keep us informed of your progress.
Consider awarding points if this is of use.
Rodos, you did a great job of getting the list started. The thing I would like to add is that you MUST involve the various infrastructure teams early in the planning phase of your environment. This includes networking, storage, security, administration and operations. You also must communicate to your user community what is going to be happening - particularly if you do not have centrally owned IT resources (i.e. if departments or projects own the servers you are going to migrate). Plan what happens to the systems you decommission - how are you going to get them out of the environment?
Plan your migrations thoroughly. With a project of this scope, you will need to have a full time project manager to coordinate all the moving parts. Engage your change management team - figure out how to integrate the migration schedule into your change management process. You may even want to stand up a dedicated CCB to handle the large volume of change requests. You'll need to make sure you incorporate your NOC into the change management process - they'll need to know that the box is going from physical to virtual so they can adjust their monitoring procedures accordingly.
Don't overestimate the number of systems you'll be able to migrate. In projects of this size, it's not the technical issues that will bite you (there will be technical issues...), it's the politics that will kill you. You'll find that you have a server on the schedule to move on Tuesday at 9PM and then, on Tuesday at 5PM, you'll get a call from the business owner saying that they have a critical deadline approaching and you can't take the server down for migration.
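A quick way to see what last-minute cancellations do to the schedule is to work out how many migration slots you have to book each week to still finish on time. The sketch below is only an illustration: the function name `weekly_slots_needed`, the 48 usable weeks (a year minus assumed change freezes) and the 20% cancellation rate are all assumptions, not figures from this thread.

```python
import math


def weekly_slots_needed(total_vms, weeks_available, cancel_rate):
    """Migration slots to book per week so completed moves still hit the target.

    cancel_rate is the assumed fraction of booked migrations that get
    postponed at the last minute by the business owner.
    """
    if not 0 <= cancel_rate < 1:
        raise ValueError("cancel_rate must be in [0, 1)")
    if weeks_available <= 0:
        raise ValueError("weeks_available must be positive")
    required_completions = total_vms / weeks_available
    # Book extra slots to cover the share that will be cancelled.
    return math.ceil(required_completions / (1 - cancel_rate))


# 3000 VMs, 48 usable weeks, no cancellations at all.
print(weekly_slots_needed(3000, 48, 0.0))
# Same target with an assumed 20% of bookings postponed.
print(weekly_slots_needed(3000, 48, 0.2))
```

Under these assumptions the target moves from 63 booked migrations a week to 79. Track your real cancellation rate for the first few weeks and re-run the numbers before committing to an end date.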
Probably the most important thing you need to do is to get the backing of a senior staff member - the more senior, the better. Get them to issue a broadcast email, or better yet, hold an all-hands meeting, to express the importance of this project not only to the company, but to them personally. Make sure that they communicate that the project is a priority for them and that your team has been empowered to effect the changes necessary to achieve the objectives.
Best of luck - keep us informed!
Technical Director, Virtualization
VMware Communities User Moderator
I want to reiterate that you need to have all parties on board from the beginning and have them be part of your design and architecture process. These include, but are not limited to, the following administration teams: Security, Networking, Storage, Application, System, etc.
You as the Virtualization Administrator need to be the facilitator between these often competing teams. Security and Network designs are going to be extremely important. Do not run roughshod over these groups; get their input and use it where appropriate. You will need a senior staff or management person on your side to handle the inevitable conflicts.
I would put together a team for this project that includes people from all the other disciplines. As part of the project, educate the team and get everyone's questions answered before they become an issue, so that the team shares a common set of concepts. In some cases virtualization terminology conflicts with the terminology used by the other disciplines, which can lead to mass confusion.
Edward L. Haletky
VMware Communities User Moderator
Author of the book 'VMWare ESX Server in the Enterprise: Planning and Securing Virtualization Servers', Copyright 2008 Pearson Education.
Blue Gears and SearchVMware Pro Blogs: http://www.astroarch.com/wiki/index.php/Blog_Roll
Top Virtualization Security Links: http://www.astroarch.com/wiki/index.php/Top_Virtualization_Security_Links
To add to my own list...
- - Identify your application support policy, and how you will proceed for applications whose vendors will not provide formal support in a VM. This is becoming less common, but it still does happen.
Will you virtualize only test & dev environments for such applications?
Will your senior management get involved with the negotiations and force the issue (including fielding alternatives)?
Will you take those systems out of scope and leave them physical?
- - Consider your DR situation. Do you have virtualization capabilities in your DR facility? Is that part of this project? Will you virtualize systems that have a role in DR if you do not have virtual capabilities in your DR environment?
- - Pay particular attention to your shared storage infrastructure. It will grow significantly and you will need the cooperation (preferably support) of your storage administrators. Broach the subject of presenting "large" LUNs to servers early - most storage folks are accustomed to presenting lots of smaller LUNs and allowing the OS to do the volume management - this is not a good plan for ESX. MetaLUNs are your friend.
- - Identify the need for 802.1Q VLAN tagging/trunking early and get your networking team onboard. Again, most network admins are not comfortable presenting a trunked (VLAN trunked) port to a "server" - they need to understand that an ESX server should be viewed more as an edge switch than as a typical server.
- - Security. Figure out how you are going to partition your environment. With 3,000 VMs, you'll have lots of options.
- - Are there issues related to regulatory compliance (HIPAA, FDA, SOX, GLBA, EUDPD, ISO-17799, etc.) that you have to comply with? If so, how will you deal with them?
- - Are there systems that have "special" funding sources? Examples include systems funded by State or Federal grants where the funds are restricted to use ONLY by that system?
- - Training - get your administrative team up to speed on VMware technologies. They will have to live with it day-in, day-out. It's critical that their early experiences are good ones.
- - Consider your server acquisition/provisioning process. If you're deploying this large of an environment, you are probably fielding several hundred new systems per year. A priority should be to ensure that any NEW server requirements are filled with virtual machines - otherwise, you'll find yourself in the same situation you're in today a few years from now (I know, I've seen it several times with customers who refused to listen when we told them that containment was more important than consolidation...)
- - Get someone who has "been there, done that" to help you with this project. You may feel that you have the skills internally, but believe me, there are more snakes and alligators out there than you can imagine. Get yourself a good guide and listen to them - you'll be glad you did.
Technical Director, Virtualization
VMware Communities User Moderator
All good answers, and I'll just reiterate the need to get buy-in at an early stage - virtualisation is a major change and it is a fact proven time and time again that people in general are uncomfortable around change.
I would suggest you get a presentation together early on that explains what virtualisation is - including all the common terminology such as clusters, hosts, VMotion etc - and arrange sessions with your application teams so they understand the technology shift. It has been my experience that the bulk of the resistance comes from application owners (and their managers!) not wanting to virtualise, and it is always better to have their support than get a senior manager to force the change on them. If possible, get a proof-of-concept environment set up very early on - nothing fancy - and give the app teams a few VMs to play with so they can see that it really is perfectly safe. If you have the resources you can even demo HA (assuming you're going that far) and demonstrate recovery - that often 'wows' people when they see their servers get recovered so quickly. Infrastructure teams (especially support) are generally more open to virtualisation but again I'll reiterate the need to get their representation on the project team at the earliest opportunity so the magnitude of change is understood and planned for.
Also, as has been stated, understand that this is an infrastructure change, not just a server change, so people need to understand the magnitude and the powers that be need to be prepared to give it the time and resourcing it needs in order to succeed.
PS the fact that you have found this forum gives you a very good chance of not falling into the pitfalls - there are some truly excellent people here who put in considerable amounts of their own time to help virtualisation succeed.
Hey Wiskey, I'm working with a team, over on another VMTN community called VIOPS, to write up a VI3 Server Consolidation 60-point Deployment Blueprint for something smaller but still appropriate for you. Ken, Rodos and Paul are right on the money with some of the characteristics of large-scale.
In fact, our experience of the few really large scale organizations has thrown up some interesting evolutions in approaches to virtualization which I have yet to write up (still on the road talking about it!).
One finding I can share with you is the value of an efficient virtual machine lifecycle, and in particular the ability to recycle virtual machines - recycling VMs means reclaiming resources from rogue or idle VMs. A rogue VM example: one your sysadmin deployed without change management after a Friday lunch in the pub. An idle VM is simply one that was over spec'd with too much CPU, memory etc.
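As a rough illustration of the recycling idea, the Python sketch below flags rogue and idle VMs from an inventory export. The record fields, the `classify` function and the 5% CPU / 20% memory thresholds are all hypothetical; they stand in for whatever your own inventory, change management and monitoring tools actually provide.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class VmRecord:
    """One row of a hypothetical inventory export."""
    name: str
    avg_cpu_pct: float            # average CPU use as % of what was allocated
    avg_mem_pct: float            # average memory use as % of what was allocated
    change_ticket: Optional[str]  # change record the VM was deployed under


def classify(vm, idle_cpu_pct=5.0, idle_mem_pct=20.0):
    """Return 'rogue', 'idle' or 'ok' for a VM record.

    Thresholds are illustrative assumptions, not recommendations.
    """
    # No change record at all: deployed outside change management.
    if not vm.change_ticket:
        return "rogue"
    # Barely using what it was given: a candidate for downsizing or reclaim.
    if vm.avg_cpu_pct < idle_cpu_pct and vm.avg_mem_pct < idle_mem_pct:
        return "idle"
    return "ok"


inventory = [
    VmRecord("web01", 35.0, 60.0, "CHG-1001"),
    VmRecord("test-friday", 12.0, 30.0, None),
    VmRecord("report02", 2.0, 10.0, "CHG-1042"),
]
for vm in inventory:
    print(vm.name, classify(vm))
```

Running a report like this on a regular schedule gives you a list of candidates to chase up with their owners rather than a list of VMs to delete automatically.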
Where are you in your project? Would love to help, perhaps at the same time as writing up the blueprint.