VMware Cloud Community
a2alpha
Expert

snapshot issues - system performance shows n/a indicating downtime when snapshots are taken and committed

Hi,

One of our customers is having issues with snapshots in their environment. Whenever a snapshot is taken, whether automatically by Veeam or manually by us or the customer, the performance graphs show zero readings or n/a. The odd thing is that the system uptime report will show 10 days, drop to zero, and then 20 seconds later be back at 10 days. We know the system isn't actually going down during this period, but we are losing pings. This happens on all virtual machines in their environment, but it seems to be more of an issue on the Exchange and database servers.

At certain points when Veeam backup has taken a snapshot, users have lost connectivity to the servers for 20 to 30 seconds, which is a problem.
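
For anyone wanting to pin down exactly how long these drops last, here is a minimal Python sketch that turns a series of timestamped probe results into outage windows. The function name `find_outages`, the `(timestamp, reachable)` sample format and the 5-second threshold are all assumptions made for illustration; nothing here comes from Veeam or VMware tooling.

```python
from typing import Iterable, List, Tuple


def find_outages(
    samples: Iterable[Tuple[float, bool]],
    min_seconds: float = 5.0,
) -> List[Tuple[float, float]]:
    """Return (start, end) windows in which every probe failed.

    samples: (timestamp_in_seconds, reachable) pairs in time order.
    A window starts at the first failed probe and ends at the next
    successful one. Windows shorter than min_seconds are ignored.
    """
    outages: List[Tuple[float, float]] = []
    start = None      # timestamp of the first failed probe in the current run
    last_fail = None  # timestamp of the most recent failed probe

    for ts, reachable in samples:
        if not reachable:
            if start is None:
                start = ts
            last_fail = ts
        elif start is not None:
            # First successful probe after a run of failures closes the window.
            if ts - start >= min_seconds:
                outages.append((start, ts))
            start = None

    # A run of failures that never recovered within the sample set.
    if start is not None and last_fail - start >= min_seconds:
        outages.append((start, last_fail))
    return outages


if __name__ == "__main__":
    # Hypothetical data: one probe per second, unreachable from t=10 to t=34.
    probes = [(t, not (10 <= t < 35)) for t in range(60)]
    print(find_outages(probes))  # [(10, 35)]
```

Feeding it one probe per second from a monitoring host would show whether the drops really are 20 to 30 seconds or just look that way in the graphs.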

We have never seen anything like this at any of our other sites and were wondering if anyone has seen it before. VMware support has told us not to take snapshots during the day, but one of these servers is a booking system, so the website requires it 24/7.

Thanks in advance,

Dan

9 Replies
Misterx11
Contributor

hi,

We are seeing the same behaviour with our ESX servers, both on 3.5 U4. We are using Doubletake for VMware, which also uses snapshots, and when the snapshot removal process takes place the server loses a couple of pings before returning to normal duty. All other stats are fine.

We are hoping an update of our HP SAN will fix this, but at the moment there isn't time to plan the update. I am hoping someone else has a tested solution, because this behaviour is far from ideal.
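
One way to check whether the dropped pings line up with snapshot removal rather than creation is to compare outage windows against the times of each snapshot operation. Below is a rough Python sketch, assuming you have already collected both sets of timestamps yourself. The name `match_outages`, the operation labels and the 5-second slack are hypothetical and only there for illustration.

```python
from typing import List, Tuple

Window = Tuple[float, float]            # (start, end) in seconds
Operation = Tuple[str, float, float]    # (label, start, end) in seconds


def match_outages(
    outages: List[Window],
    operations: List[Operation],
    slack: float = 5.0,
) -> List[Tuple[Window, str]]:
    """Pair each outage with every operation whose time range overlaps it.

    slack widens each operation's range on both sides, to allow for
    clock differences between the monitoring host and the ESX host.
    """
    hits: List[Tuple[Window, str]] = []
    for o_start, o_end in outages:
        for label, s_start, s_end in operations:
            # Standard interval-overlap test on the widened operation range.
            if o_start <= s_end + slack and s_start - slack <= o_end:
                hits.append(((o_start, o_end), label))
    return hits


if __name__ == "__main__":
    # Hypothetical timings: one outage, one create and one remove operation.
    outages = [(100.0, 125.0)]
    operations = [("create", 10.0, 12.0), ("remove", 95.0, 130.0)]
    for window, label in match_outages(outages, operations):
        print(window, "overlaps snapshot", label)
```

If every outage pairs with a remove and none with a create, that supports the idea that the commit is what causes the drop.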

hernandez80
Contributor

Hi,

Same for us.

We have scripts running during the day that take snapshots, run SAN replication and then remove the snapshots.

We are experiencing the same problem but we have never been able to make it better.

I had a look at this article: http://communities.vmware.com/blogs/Knorrhane/2008/05/05/great-article-about-snapshots


For Exchange, so that users don't get the disconnected popup, we use Outlook cached mode. It also frees up the server.

Hope that helps.

b

a2alpha
Expert

Thanks for responding. In the nicest possible way, I'm glad someone else is seeing this issue! It is so bizarre: as I mentioned above, we have a lot of sites and none of them except this one are having problems.

It's interesting that you think it is the SAN. Is there a reason you came to this conclusion? We were thinking of flattening the environment and installing vSphere from scratch with ESX 4.0.

Misterx11
Contributor

We think it's the SAN because when you take a look at the vmkernel log there are also errors in there.

Plus, there is an update available for our HP SAN which brings a 10 to 20 percent speed improvement, which is always good, and the release notes also mention that it fixes problems related to the errors we found in our vmkernel log. Unfortunately we don't have the time yet to implement this update, so I can't tell you whether this is the real solution.

a2alpha
Expert

Thanks for that info. I may try moving a VM onto local storage to see if that changes anything. Thanks again.

petedr
Virtuoso

We see this as well at times. Another thing to check is the memory allocated to the service console. Is it set to 800MB?

www.phdvirtual.com, makers of esXpress

www.thevirtualheadline.com www.liquidwarelabs.com
a2alpha
Expert

Yep, checked: the service console is set to 800MB. Any other things to try?

Thanks,

Dan

petedr
Virtuoso

Definitely an interesting one. If I have any other ideas I'll post back.

a2alpha
Expert

This is really odd. We have moved Veeam and the VCB proxy onto a separate box so they are not competing with Backup Exec, and we also upgraded VirtualCenter to vSphere vCenter.

Further, we followed the guide on the Veeam site about configuring VCB so that it only has one path. We have managed to get 55MB/s to a USB drive.

Not sure what has resolved the issue, but it's fixed.

Thank you all for your help,

Dan
