RanjnaAggarwal
VMware Employee

vMotion for database applications

Can anyone tell me whether vMotion is recommended for database applications, and why or why not?

Regards, Ranjna Aggarwal
28 Replies
arturka
Expert

Hi

This is a really tricky question 🙂 In general, vMotion is transparent and harmless for the guest OS and the applications running on it, so I don't see any obstacle to using vMotion with a DB. If you have a very memory-demanding DB (16 GB of RAM or more), it would be nice to have a 10Gb vmnic for vMotion to shorten the migration. With a 1Gb vmnic it will work too, but it will take longer.
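To put rough numbers on the 1Gb versus 10Gb point, here is a back-of-the-envelope sketch. It only estimates one full pass over the VM's memory; a real vMotion makes additional passes for changed pages, so treat the result as a lower bound. The function name and the `efficiency` factor (usable share of the link) are assumptions made for illustration, not VMware figures.

```python
def single_pass_copy_seconds(ram_gb: float, link_gbps: float,
                             efficiency: float = 0.9) -> float:
    """Rough time to send ram_gb of memory once over a link of link_gbps.

    efficiency is an assumed fraction of the link that carries payload.
    """
    ram_bits = ram_gb * 8 * 1e9                 # treat 1 GB as 10^9 bytes
    usable_bits_per_s = link_gbps * 1e9 * efficiency
    return ram_bits / usable_bits_per_s


for link in (1, 10):
    seconds = single_pass_copy_seconds(16, link)
    print(f"16 GB over {link} Gb/s: about {seconds:.0f} s for one pass")
```

With these assumptions, a 16 GB VM needs on the order of 142 s for one pass over 1 Gb/s and about 14 s over 10 Gb/s.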

VCDX77 My blog - http://vmwaremine.com

Virtualinfra
Commander

It depends. If the server is a highly demanding DB, for example a giant VM with 8 cores and 64 GB or 128 GB of RAM running a database, it's not recommended to vMotion such a giant, highly demanding VM during peak time.

When you do a vMotion, the memory state of the VM is transferred from one ESXi host to another. Since the giant VM will have a lot of processes and memory in use, there might be some interruption to DB services that users could notice.

Thanks & Regards Dharshan S VCP 4.0,VTSP 5.0, VCP 5.0

iw123
Commander

vMotion in itself wouldn't be a problem. However, as mentioned, there is an overhead when performing vMotion relative to the size of the virtual machine, so it's probably wise not to move the machine around during peak times. Then again, this is true to a point for any VM. VMs should only be moved around for host maintenance purposes (therefore done during non-peak time) or through DRS, which will only move VMs when necessary to benefit the cluster.

*Please, don't forget the awarding points for "helpful" and/or "correct" answers

rickardnobel
Champion

There are also some improvements in ESXi 5: small micro-pauses are introduced in the VM if memory is changing faster than the available bandwidth can transfer it, which makes vMotion possible even for VMs with very large and rapidly changing RAM.

My VMware blog: www.rickardnobel.se

RanjnaAggarwal
VMware Employee

But if the databases are in a cluster, then even a fraction of a second of downtime can initiate a failover. Any ideas?

Regards, Ranjna Aggarwal

RanjnaAggarwal
VMware Employee

Hi VirtualInfra,

Then which method is recommended for moving the DB virtual machine in that case?

Regards, Ranjna Aggarwal

Virtualinfra
Commander

1. Live vmotion is not suggested for DB servers during peak time (for large VMs). Depending on the load, vMotion can be done during off hours.

2. For virtual machines in a cluster there is no problem. We can migrate the resources from node1 to node2 and vMotion the node1 server, then migrate the resources from node2 back to node1 and vMotion the node2 server.

Thanks & Regards Dharshan S VCP 4.0,VTSP 5.0, VCP 5.0

mcowger
Immortal

1) Totally disagree.  VMware doesn't suggest this.  You do.

I personally have no problem or concern moving a large production DB during business hours if I need to.

--Matt VCDX #52 blog.cowger.us

mcowger
Immortal

The VM is never down. For a brief time (100 ms or less) it does not respond to outside stimuli, but this is no different from a heavily loaded server.

If your cluster software is configured to initiate a failover after less than 1s of unresponsiveness, you have probably configured it incorrectly. Even aggressive Oracle RAC configs generally use timeouts of no less than 30s.
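The comparison being made here can be written down as a tiny check. This is only a sketch: the function name is made up for illustration, and real cluster software decides on failover from missed heartbeats and its own timers rather than a single comparison. The 100 ms and 30 s figures come from this post; the 2 s figure is the worst case quoted elsewhere in the thread.

```python
def pause_triggers_failover(pause_s: float, heartbeat_timeout_s: float) -> bool:
    """True if an unresponsive window reaches the cluster's failover timeout."""
    return pause_s >= heartbeat_timeout_s


# A ~100 ms switch-over pause against a 30 s cluster timeout.
print(pause_triggers_failover(0.1, 30.0))
# Even a pessimistic 2 s pause stays far below a 30 s timeout.
print(pause_triggers_failover(2.0, 30.0))
# Only a sub-second timeout would be hit by a 100 ms pause.
print(pause_triggers_failover(0.1, 0.05))
```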

--Matt VCDX #52 blog.cowger.us

RanjnaAggarwal
VMware Employee

Yeah, my main doubt is about Oracle RAC servers. So that means we can use vMotion for Oracle RAC servers without any issue?

Regards, Ranjna Aggarwal

Virtualinfra
Commander

Matt

Adding to my previous comments: it's my personal suggestion (nowhere in my comments did I mention that it's recommended or suggested by VMware). If I say that VMware suggests something, I will also include the source from VMware.

Thanks & Regards Dharshan S VCP 4.0,VTSP 5.0, VCP 5.0

mcowger
Immortal

Perhaps because English doesn't appear to be your first language, you didn't realize that using the subjunctive voice like this: "Live vmotion is not suggested for DB servers during peak time" implies (in Western English) that the statement is supported by some sort of authority.

--Matt VCDX #52 blog.cowger.us

Virtualinfra
Commander

Matt wrote:

1) Totally disagree.  VMware doesn't suggest this.  You do.

I personally have no problem or concern moving a large production DB during business hours if I need to.

Since you mentioned it, here is a quote, with its source, in which VMware suggests the following.

VMware vMotion and VMware DRS perform best under the following conditions on SQL server:

"Virtual machines with smaller memory sizes are better candidates for migration than larger ones."

As per the above line from the VMware SQL Server best practices, my understanding is that SQL VMs with large memory are not the better candidates for vMotion.

In my personal opinion, I follow this and I don't attempt vMotion during peak hours. I suggest the same, because the memory state of the VM is transferred over a high-speed network from one host to another, and database operations might be affected.

See page 31 of the source below:

http://www.vmware.com/files/pdf/sql_server_best_practices_guide.pdf

Thanks & Regards Dharshan S VCP 4.0,VTSP 5.0, VCP 5.0

rickardnobel
Champion

Dharshan wrote:

VMware vMotion and VMware DRS perform best under the following conditions on SQL server:

"Virtual machines with smaller memory sizes are better candidates for migration than larger ones."

As per the above line from the VMware SQL Server best practices, my understanding is that SQL VMs with large memory are not the better candidates for vMotion.

I think the above just means that, when DRS selects VMs to move, it prefers a smaller VM if one is available, which makes sense. That is not the same as saying DRS won't move a large VM if that should be necessary.

Dharshan wrote:

I don't attempt vMotion during peak hours. I suggest the same, because the memory state of the VM is transferred over a high-speed network from one host to another, and database operations might be affected.

But database operations should not be affected by vMotion. During the actual vMotion the VM is still running on the original host and performs all database access from this location.

My VMware blog: www.rickardnobel.se

Virtualinfra
Commander

Rickard Nobel wrote:

I think the above just means that, when DRS selects VMs to move, it prefers a smaller VM if one is available, which makes sense. That is not the same as saying DRS won't move a large VM if that should be necessary.

It means for both VMware vMotion and VMware DRS. Also, it says large VMs are not the better candidates, not that they can't be moved. See page 31 of the link below.

http://www.vmware.com/files/pdf/sql_server_best_practices_guide.pdf

Rickard Nobel wrote:

Dharshan wrote:

I don't attempt vMotion during peak hours. I suggest the same, because the memory state of the VM is transferred over a high-speed network from one host to another, and database operations might be affected.

But database operations should not be affected by vMotion. During the actual vMotion the VM is still running on the original host and performs all database access from this location.

There are three underlying actions during vMotion, and the second one is the more critical, as follows.

Second:
The active memory and precise execution state of the virtual machine is rapidly transferred over a high speed network, allowing the virtual machine to instantaneously switch from running on the source ESX host to the destination ESX host.

VMotion keeps the transfer period imperceptible to users by keeping track of on-going memory transactions in a bitmap.

Once the entire memory and system state has been copied over to the target ESX host, VMotion suspends the source virtual machine, copies the bitmap to the target ESX host, and resumes the virtual machine on the target ESX host.

This entire process takes less than two seconds on a Gigabit Ethernet network. (So the VM will be suspended on the source ESXi host and resumed on the target ESXi host, which is done in less than 2 seconds. DB servers are highly sensitive servers and this might cause some interruption, so IMHO, I won't do vMotion during peak hours, and I make sure the DB server has enough resources in the cluster.)

See the source link below for vMotion:

http://www.vmware.com/files/pdf/VMware-VMotion-DS-EN.pdf

Thanks & Regards Dharshan S VCP 4.0,VTSP 5.0, VCP 5.0

rickardnobel
Champion

Dharshan wrote:

Rickard Nobel wrote:

I think the above just means that, when DRS selects VMs to move, it prefers a smaller VM if one is available, which makes sense. That is not the same as saying DRS won't move a large VM if that should be necessary.

It means for both VMware vMotion and VMware DRS

DRS is just automatic vMotion, so it just says that DRS will likely select smaller VMs to be moved, which naturally would have the lowest impact for everyone.

Dharshan wrote:

VMotion keeps the transfer period imperceptible to users by keeping track of on-going memory transactions in a bitmap.

Once the entire memory and system state has been copied over to the target ESX host, VMotion suspends the source virtual machine, copies the bitmap to the target ESX host, and resumes the virtual machine on the target ESX host.

There is not just a single copy of the RAM. Once the entire memory has been copied, it goes back and checks which pages in memory have changed, copies those while keeping track of new changes in memory, then copies those changes, keeps track of new changes, copies those, and so on, until the number of changed pages is low enough that the rest can be completed in 0.5 seconds.
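The iterative copy described above can be sketched as a small simulation. This is a simplified model under stated assumptions, not VMware's implementation: the function name, the constant dirty rate, and the constant bandwidth are all made up for illustration. Only the idea of repeated passes and the 0.5 second switch-over target come from the description.

```python
def precopy_rounds(ram_mb, dirty_mb_per_s, bandwidth_mb_per_s,
                   switchover_s=0.5, max_rounds=50):
    """Simulate iterative pre-copy passes.

    Returns (sizes of the passes copied while the VM runs,
             memory left for the final switch-over,
             whether the switch-over target was reached).
    """
    remaining = float(ram_mb)   # the first pass copies all of the RAM
    rounds = []
    for _ in range(max_rounds):
        # Stop iterating once the rest fits into the switch-over window.
        if remaining / bandwidth_mb_per_s <= switchover_s:
            return rounds, remaining, True
        copy_time = remaining / bandwidth_mb_per_s
        rounds.append(remaining)
        # Pages dirtied during this pass must be sent in the next one.
        remaining = min(dirty_mb_per_s * copy_time, float(ram_mb))
    return rounds, remaining, False


# 16 GB VM, 100 MB/s of changed memory, roughly 1000 MB/s of vMotion bandwidth.
rounds, final_mb, converged = precopy_rounds(16384, 100, 1000)
print(len(rounds), round(final_mb, 2), converged)
```

In this model the passes shrink quickly as long as memory changes more slowly than the link can carry it. If the dirty rate is at or above the bandwidth, the loop never converges, which is the case the micro-pauses in ESXi 5 are meant to handle.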

DB servers are highly sensitive servers and this might cause some interruption, so IMHO, I won't do vMotion during peak hours, and I make sure the DB server has enough resources in the cluster.)

See the source link below for vMotion:

http://www.vmware.com/files/pdf/VMware-VMotion-DS-EN.pdf

The document is for vSphere 4.0 and is a bit outdated. Many performance improvements were made in vSphere 5. See this document:

http://www.vmware.com/files/pdf/vmotion-perf-vsphere5.pdf

Page 15 describes vMotion of database applications, and VMware presents some tests done with a fully utilized 4 vCPU / 16 GB RAM SQL Server with 50 000 000 customers in the database. There is a drop in performance in the final phase of the vMotion, but this is an extremely heavily loaded VM and the drop was very short. Since neither Ethernet nor TCP/IP can guarantee that no packets are lost, it is quite natural for most applications that you might have to wait one second more at some point in time.

In the final best practices part of the document there are no recommendations from VMware against vMotion of a DB server.

My VMware blog: www.rickardnobel.se

Virtualinfra
Commander

Rickard Nobel wrote:

http://www.vmware.com/files/pdf/vmotion-perf-vsphere5.pdf

Page 15 describes vMotion of database applications, and VMware presents some tests done with a fully utilized 4 vCPU / 16 GB RAM SQL Server with 50 000 000 customers in the database. There is a drop in performance in the final phase of the vMotion, but this is an extremely heavily loaded VM and the drop was very short. Since neither Ethernet nor TCP/IP can guarantee that no packets are lost, it is quite natural for most applications that you might have to wait one second more at some point in time.

In the final best practices part of the document there are no recommendations from VMware against vMotion of a DB server.

4 vCPU / 16 GB RAM: those are called small VMs.

Large VMs are 8 vCPU and 128 GB RAM or more; those VMs are still not the better candidates for vMotion.

There is not just a single copy of the RAM. Once the entire memory has been copied, it goes back and checks which pages in memory have changed, copies those while keeping track of new changes in memory, then copies those changes, keeps track of new changes, copies those, and so on, until the number of changed pages is low enough that the rest can be completed in 0.5 seconds.

Yes, of course. In my previous comments I also mentioned that the entire memory state will be copied over a high-speed network. But what about when vMotion suspends the source virtual machine, copies the bitmap to the target ESX host, and resumes the virtual machine on the target ESX host, for larger VMs with more than 8 vCPU and 128 GB of RAM?

Thanks & Regards Dharshan S VCP 4.0,VTSP 5.0, VCP 5.0

mcowger
Immortal

That's not how it works.

The suspend only occurs for the last part of the copy, not the entire copy process. The suspend is only active for less than 1s, usually while the last few MB of changed memory in the bitmap are copied.

--Matt VCDX #52 blog.cowger.us

rickardnobel
Champion

Dharshan wrote:

4 vCPU / 16 GB RAM: those are called small VMs.

Large VMs are 8 vCPU and 128 GB RAM or more

Could you please provide a link where it is defined what a "small" or "large" VM is? I am not aware of any such clear definitions.

There is not just a single copy of the RAM. Once the entire memory has been copied, it goes back and checks which pages in memory have changed, copies those while keeping track of new changes in memory, then copies those changes, keeps track of new changes, copies those, and so on, until the number of changed pages is low enough that the rest can be completed in 0.5 seconds.

Yes, of course. In my previous comments I also mentioned that the entire memory state will be copied over a high-speed network. But what about when vMotion suspends the source virtual machine, copies the bitmap to the target ESX host, and resumes the virtual machine on the target ESX host, for larger VMs with more than 8 vCPU and 128 GB of RAM?

Well, the point was that the copy phase is done in many iterations, which typically brings the number of changed pages to a minimum before the switch-over is done.

If there should be a VM with extreme write-to-RAM behavior, the Stun During Page Send function will in those rare cases introduce small micro-pauses into the VM's vCPUs, to a very small degree but enough to reduce the rate of changed pages and complete the final phase. According to VMware this feature makes vMotion possible on any kind of VM under any kind of workload.
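The effect of those micro-pauses can be pictured with a toy model. This is a sketch of the idea only, not how ESXi implements Stun During Page Send: the function name and the `headroom` parameter (how far below the link speed the dirty rate is held) are hypothetical, chosen purely for illustration.

```python
def sdps_throttle(dirty_mb_per_s: float, bandwidth_mb_per_s: float,
                  headroom: float = 0.8):
    """Toy model: slow the guest only when it dirties memory too fast.

    Returns (effective dirty rate, fraction by which the guest is slowed).
    headroom is an assumed target share of the vMotion bandwidth.
    """
    ceiling = bandwidth_mb_per_s * headroom
    if dirty_mb_per_s <= ceiling:
        return dirty_mb_per_s, 0.0          # common case: no pauses needed
    slowdown = 1.0 - ceiling / dirty_mb_per_s
    return ceiling, slowdown


# Typical VM: dirty rate far below the link speed, so nothing is throttled.
print(sdps_throttle(100, 1000))
# Extreme writer: dirty rate above the link speed, so the guest is slowed.
print(sdps_throttle(1200, 1000))
```

The design point is that the throttle is conditional: a VM whose changed pages already fit within the bandwidth is never slowed, and only the rare extreme writer pays a small cost so that the final phase can complete.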

My VMware blog: www.rickardnobel.se