VMware Cloud Community
Broonie27
Contributor

Loss of Network Connectivity During Backups

Hi there,

I'm new to VMware, so this might seem like a stupid question, but when I back up our SQL servers hosted on vSphere 6.5 I see a loss of network connectivity for the VM. The loss lasts anywhere between 2 and 15 seconds. We use CommVault for our backups, and I'm 99.9% sure that it uses the "quiesce" option when taking the pre-backup VSS snapshot.
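To put exact numbers on drops like these, one option is to probe the VM at a fixed interval during the backup window and work out how long each run of failed probes lasted. The following is a minimal sketch, assuming the probe results have already been collected as (timestamp, succeeded) pairs. The function name and the data layout are hypothetical and are not part of vSphere or CommVault.

```python
from datetime import datetime, timedelta


def find_outages(probes):
    """Return (last_ok, next_ok, seconds) for each run of failed probes.

    probes: list of (timestamp, ok) tuples sorted by timestamp.
    An outage is measured from the last successful probe before the
    failures to the first successful probe after them, so the result is
    an upper bound on the real outage length.
    """
    outages = []
    last_ok = None
    in_outage = False
    for ts, ok in probes:
        if ok:
            if in_outage and last_ok is not None:
                outages.append((last_ok, ts, (ts - last_ok).total_seconds()))
            in_outage = False
            last_ok = ts
        else:
            in_outage = True
    return outages


if __name__ == "__main__":
    # Illustrative data only: one probe per second, failing from second 5 to 11.
    start = datetime(2020, 1, 1, 0, 0, 0)
    sample = [(start + timedelta(seconds=i), not (5 <= i <= 11)) for i in range(20)]
    for begin, end, seconds in find_outages(sample):
        print(begin.time(), end.time(), seconds)
```

Lining these windows up against the snapshot create and remove events in the vSphere task list would show whether each drop coincides with the snapshot itself.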

The SQL servers are in an AlwaysOn cluster, so any network disruption between nodes of more than 10 seconds invokes a failover, and that's not something I want happening unless it's planned. I could increase the failover threshold, but I feel that would merely mask the issue.
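The relationship between the heartbeat settings and the observed drops can be sketched as simple arithmetic: the cluster tolerates roughly the heartbeat interval multiplied by the number of missed heartbeats allowed. Windows failover clustering exposes these as properties such as SameSubnetDelay and SameSubnetThreshold. The values below (1000 ms and 10) are assumptions chosen only because they match the 10 second figure above, and the function names are hypothetical. The real values should be read from the cluster.

```python
def heartbeat_tolerance_seconds(delay_ms: int, threshold: int) -> float:
    """Approximate outage the cluster tolerates before declaring a node down."""
    return delay_ms * threshold / 1000.0


def would_trigger_failover(outage_s: float, delay_ms: int = 1000, threshold: int = 10) -> bool:
    """True if an outage of outage_s seconds exceeds the heartbeat tolerance.

    The defaults are assumptions for illustration, not values read from a cluster.
    """
    return outage_s > heartbeat_tolerance_seconds(delay_ms, threshold)


if __name__ == "__main__":
    # The shortest and longest drops reported in this thread.
    for outage in (2, 15):
        print(outage, would_trigger_failover(outage))
```

Under these assumed settings the 2 second drop is absorbed and the 15 second drop is not, which is consistent with failovers happening only on some backups.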

Further, we are seeing application errors during the VSS snapshot period because the application can no longer communicate with its database. This can cause long-running batch jobs to fail completely, which again is something I'd really like to avoid.

So is what I'm seeing expected behavior, and if so, what is the recommendation for backing up SQL VMs in vSphere? Do we have to block all use of the DB/application during a backup window, for example?

If these drops aren't expected, how can I stop them?

Cheers

C

8 Replies
depping
Leadership

In some cases the quiescing of VMs takes a while, depending on the size of the VM etc. What kind of IO is the VM doing?

depping
Leadership

Also, have you enabled the "File System and Application Consistent" option for Commvault?

Broonie27
Contributor

The VM is a SQL server with several disks for logs, DB, tempDB, etc. However, the only disks being backed up by CommVault are the OS disk (45GB) and one used for SQL backups (350GB).

Most of the I/O would be reads and writes to the DB and logs which aren't being backed up and therefore not quiesced.

Broonie27
Contributor

Yup, definitely using the application-consistent option.

Rubeck
Virtuoso

"...DB and logs which aren't being backed up and therefore not quiesced."

I believe they are quiesced no matter what, as quiescing targets application VSS writers and not specific disks.
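To see which writers a quiesce would involve, the guest's VSS writers can be listed with the Windows command `vssadmin list writers`. Here is a minimal sketch that parses that command's text output into writer names and states. The function name is hypothetical, and the sample text in the example is illustrative rather than captured from a real server.

```python
import re


def parse_vss_writers(text):
    """Map each VSS writer name to its reported state.

    text: the output of `vssadmin list writers`, captured as a string.
    A writer with no State line maps to None.
    """
    writers = {}
    current = None
    for line in text.splitlines():
        name = re.match(r"\s*Writer name: '(.+)'", line)
        if name:
            current = name.group(1)
            writers[current] = None
            continue
        state = re.match(r"\s*State: \[(\d+)\]\s*(.*)", line)
        if state and current is not None:
            writers[current] = state.group(2).strip()
    return writers


if __name__ == "__main__":
    # Illustrative sample, trimmed to the fields the parser reads.
    sample = (
        "Writer name: 'SqlServerWriter'\n"
        "   State: [1] Stable\n"
        "   Last error: No error\n"
        "\n"
        "Writer name: 'System Writer'\n"
        "   State: [1] Stable\n"
        "   Last error: No error\n"
    )
    print(parse_vss_writers(sample))
```

If SqlServerWriter appears in the list, an application-level quiesce will involve SQL Server regardless of which virtual disks the backup job selects.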

If this KB still applies, you could try excluding the SqlServerWriter from the quiescing triggered by VMware Tools.

https://kb.vmware.com/s/article/1031200

You can also look into Commvault's AppAware VM snaps. This orchestrates VM snaps along with an in-guest SQL backup agent.

With this you can also do PIT restores, as SQL transaction logs are backed up in the traditional way using a separate schedule. When you do a PIT restore of the VM, transaction logs are then applied automatically within the same job. Though this is another beast, it might be what's needed in your scenario.
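The general idea behind a point-in-time restore from log backups can be sketched as follows: start from the full backup, then apply every later log backup in order, stopping at the first one that covers the target time. This is a generic illustration of the log chain concept, not Commvault's implementation, and the function name and inputs are hypothetical.

```python
from datetime import datetime


def logs_for_point_in_time(full_backup_end, log_backup_ends, target):
    """Return the log backups (by end time) needed to reach target.

    full_backup_end: when the full backup (or VM snapshot) completed.
    log_backup_ends: end times of the transaction log backups.
    target: the point in time to restore to.
    The last entry returned is the log that would be applied only up to target.
    """
    if target < full_backup_end:
        raise ValueError("target is earlier than the full backup")
    chain = []
    for end in sorted(log_backup_ends):
        if end <= full_backup_end:
            continue  # already contained in the full backup
        chain.append(end)
        if end >= target:
            return chain
    raise ValueError("no log backup covers the target time")


if __name__ == "__main__":
    # Illustrative schedule: full backup at midnight, hourly log backups.
    day = datetime(2020, 1, 1)
    logs = [day.replace(hour=h) for h in (1, 2, 3)]
    print(logs_for_point_in_time(day, logs, day.replace(hour=1, minute=30)))
```

In this example the restore would need the 01:00 and 02:00 log backups, with the second applied only up to 01:30.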

Just ideas..

/Rubeck

Broonie27
Contributor

If I exclude the SqlServerWriter VSS writer, won't that stop the ability to create application-consistent backups?

If these network outages are expected behavior, I'm interested to find out how others back up their SQL servers, as I'm concerned we are going about this the wrong way.

Rubeck
Virtuoso

"If I exclude the SqlServerWriter VSS writer, won't that stop the ability to create application-consistent backups?"

Yes, for SQL... but it seems you do SQL backups to a separate drive, right? The file system will still be quiesced before the snapshot is taken.

I normally prefer treating large SQL VMs as physical servers, which of course means having a SQL backup agent installed in the guest, but only if the guest runs NOTHING else but SQL.

That may just be me, though.

/Rubeck 

Broonie27
Contributor

So you wouldn't do a VM-level backup of a SQL server? Would you then need to rebuild the SQL server from scratch and then restore the databases?
