Hi,
One of our customers is having issues with snapshots in their environment. Whenever a snapshot is taken, either automatically by Veeam or manually by us or the customer, the performance graphs show zero readings or n/a. The odd thing is that in the system uptime reports, uptime will be at 10 days, drop to zero, and then 20 seconds later be back up to 10 days. We know the system isn't actually going down during this period, but we are losing pings. This is happening on all virtual machines in their environment but seems to be more of an issue on the Exchange and database servers.
At certain points when Veeam Backup has taken a snapshot, users have lost connectivity to the servers for 20 to 30 seconds, which is a problem.
We have never seen anything like this at any of our other sites and were wondering if anyone has seen this before. VMware support has told us not to take snapshots during the day, but one of these servers is a booking system, so the website needs it up 24/7.
Thanks in advance,
Dan
Hi,
We are seeing the same behaviour with our ESX servers, both 3.5 U4. We are using Double-Take for VMware, which also uses snapshots, and when the snapshot removal process takes place the server loses a couple of pings before returning to normal duty. Otherwise all stats are OK.
We are hoping an update of our HP SAN will fix this, but at the moment there isn't time to plan this update. I am hoping someone else will have a tested solution, because this behaviour is far from ideal.
Hi,
Same for us.
We have scripts running during the day that take snapshots, do SAN replication and then remove the snapshots.
We are experiencing the same problem but we have never been able to make it better.
I had a look at this thread: http://communities.vmware.com/blogs/Knorrhane/2008/05/05/great-article-about-snapshots
For Exchange, so that users don't get the disconnected popup, we use Outlook cached mode; it also frees up the server ...
Hope that helps ..
b
Thanks for responding. In the nicest possible way, I'm glad someone else is seeing this issue! It is so bizarre: as I mentioned above, we have a lot of sites and none of them except this one are having problems.
It's interesting that you think it is the SAN. Is there a reason you came to this conclusion? We were thinking of flattening the environment and installing vSphere from scratch with ESX 4.0.
We are thinking it's the SAN because when you look at the vmkernel log there are also errors in there.
Plus, there is an update available for our HP SAN which gives a 10 to 20 percent speed improvement, which is always good, and the release notes also mention that it fixes problems related to the errors we found in our vmkernel log. Unfortunately we don't have the time yet to implement this update, so I can't tell you if this is the real solution.
Thanks for that info, I'll maybe try moving a VM onto local storage and see if that changes anything. Thanks again.
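To compare a VM on the SAN with one on local storage, a rough sketch like the one below could record how long pings are actually lost around a snapshot. It is only an illustration, not something anyone in this thread has run: the function names are made up, the ping flags assume a Linux-style `ping`, and the one-second interval is an arbitrary choice.

```python
import subprocess
import time


def ping_once(host, timeout_s=1):
    """Return True if a single ping gets a reply.

    Assumption: Linux-style flags (-c count, -W timeout in seconds).
    """
    result = subprocess.run(
        ["ping", "-c", "1", "-W", str(timeout_s), host],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def collect_samples(host, duration_s, interval_s=1.0):
    """Ping `host` roughly every `interval_s` seconds for `duration_s` seconds."""
    samples = []
    end = time.time() + duration_s
    while time.time() < end:
        samples.append((time.time(), ping_once(host)))
        time.sleep(interval_s)
    return samples


def outage_windows(samples):
    """Group consecutive failed pings into (start, end, seconds) windows.

    `samples` is a list of (timestamp, reachable) tuples in time order.
    A window starts at the first failed ping and ends at the next good one.
    """
    windows = []
    start = None
    for ts, ok in samples:
        if not ok and start is None:
            start = ts
        elif ok and start is not None:
            windows.append((start, ts, ts - start))
            start = None
    if start is not None:
        # Still unreachable at the last sample: close the window there.
        last_ts = samples[-1][0]
        windows.append((start, last_ts, last_ts - start))
    return windows
```

Running `outage_windows(collect_samples("vm-hostname", 300))` while a snapshot is taken and removed (with `vm-hostname` replaced by the VM under test) would list each gap and its length, so the SAN and local-storage runs can be compared side by side.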
We see this as well at times. Another thing to check, maybe, is the memory allocated to the service console. Is it 800MB?
www.phdvirtual.com, makers of esXpress
Yep, checked, the service console is set to 800MB. Any other things to try?
Thanks,
Dan
Definitely an interesting one. If I have any other ideas I'll post back.
www.phdvirtual.com, makers of esXpress
This is really odd. We have moved Veeam and the VCB proxy onto a separate box so it's not competing with Backup Exec, and we also upgraded VirtualCenter to vSphere vCenter.
Further, we followed the guide on the Veeam site about configuring VCB so it only has one path. We have managed to get 55MB/s to a USB drive.
Not sure what has resolved the issue, but it's fixed.
Thank you all for your help,
Dan