VMware Cloud Community
JSeifried
Contributor

Symantec anti-virus definition updates causing heavy I/O on our SAN, better configuration or product?

Hello,

Currently, when our VM guests (Windows Server 2003) receive new Symantec anti-virus definition updates, the updates cause very heavy disk writes to our SAN. We are using Symantec Corp 10.1.5.5000 on a NetApp 3050 SAN, and all VMDK files reside on the SAN. I have tested 10.1.8.8000 and it did not resolve the issue. With Symantec Endpoint MR4 we see better performance, but it is still too heavy on our SAN infrastructure. Does anyone have suggestions for a next step? Has anyone with a similar configuration had to deal with this issue?

11 Replies
Lightbulb
Virtuoso

Check out the following thread

http://communities.vmware.com/thread/95942

You may want to give Symantec support a call. My recent experiences with their support have not been great but you may get lucky.

cheeko
Expert

This might be caused by the "Quick Scan when new definitions arrive". Find the setting in Client Administrator Only Option > Scans.

If this is set and new defs come in, all VMs in this group will do a quick scan and hammer your SAN ...

We've seen this happening as well and introduced scheduled scans which run at different times. Still a heavy I/O action but not that intense and we can control when and where it's gonna happen.

JSeifried
Contributor

The setting for "Quick scan when new definitions arrive" has been unchecked for a while now while we figure out this issue. I've verified this several times. The issue seems to be isolated to the client itself and how it unpacks the definition file and incorporates it into its engine.

Thanks for the thread. That's actually almost identical to the issue we are seeing now. I wish he had stated what he did to fix his issue.

Also, I did work with Symantec support (both India-based and local) and, in the end, they essentially said there's nothing they can do.

Quigibo
Contributor

The "solution" (which isn't a solution in my opinion) was to use LiveUpdate to deploy the SAV definitions rather than having the SAV server push the defs out. This allowed us to stagger the deployment over a 4-6 hour period at night rather than all VMs getting hit at once.

Since the last post a lot has changed in our environment. We now have 300+ VM servers which would have the same problem if we didn't use LiveUpdate. We also have 500+ VDI desktops that would have the same issue. We have since upgraded our FAS3050 to a FAS6080 and migrated to NFS instead of iSCSI / FC LUNs. Both of these steps helped greatly, since VMDKs located on NFS mounts do not experience LUN queuing / LUN reservations.

One last step on our FAS6080 was to purchase a PAM (Performance Acceleration Module). This is basically a PCIe card with 16GB of RAM (cache) that you put into your NetApp. It then caches all common reads more aggressively than the internal cache. It helps with what NetApp calls a "boot storm" on shared spindles. When we enabled our PAM for the first time and rebooted 100 VDI desktops as a test, the reboots were 98-99% cached on the PAM card and took 1-3 minutes to boot. Disabling the PAM took the boot times up to 10-20 minutes.

It is too bad SAV is so stupid in the way it handles its defs. Their support was useless and blamed it on VMware. It isn't a VMware problem per se; rather, it shows up because users of VMware on NetApp often take snapshots and can measure the amount of binary change happening on the spindles. That is something most VMware users don't do, since they back up VMs the traditional way.

http://blogs.netapp.com/storage_nuts_n_bolts/2008/08/performance-acc.html
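
As a rough illustration of the staggering idea described above, the sketch below spreads one update start time per VM evenly across a night-time window. It only computes the times; how those times would be applied to LiveUpdate is not covered here. The VM names, the 300-VM inventory, the 22:00 start, and the 5-hour window are all hypothetical values picked to fall inside the "4-6 hour" range mentioned in the post.

```python
from datetime import datetime, timedelta


def staggered_start_times(vm_names, window_start, window_hours):
    """Return {vm_name: start_time}, spreading starts evenly over the window."""
    if not vm_names:
        return {}
    # Gap between consecutive VM start times.
    step = timedelta(hours=window_hours) / len(vm_names)
    return {name: window_start + i * step for i, name in enumerate(vm_names)}


# Hypothetical inventory and window (assumptions, not from the thread).
vms = [f"vm{i:03d}" for i in range(300)]
schedule = staggered_start_times(vms, datetime(2009, 1, 1, 22, 0), 5)

print("first:", schedule["vm000"])
print("last: ", schedule["vm299"])
```

With these numbers the starts are one minute apart, so only a handful of VMs are unpacking definitions at any given moment instead of all 300.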

JSeifried
Contributor

Thank you very much for such a quick reply.

I agree that Symantec is very stupid in how they integrate their definition files into their engine. With Corp 10, we were seeing approximately 22.5MB/s of disk writes per client. With Endpoint, we see about 15MB/s of disk writes per client. Currently my directive is to upgrade our server environment (about 200 VM servers) from Corp 10 to Endpoint MR4-MP1a so that we are at least on the latest product. I'll play with the LiveUpdate scenario once I've done that. In case you were looking to change your AV, I had very good experiences in testing with Sunbelt's VIPRE. It only caused 4MB/s of disk writes during definition updates. The next best was ESET's NOD32 at 5MB/s.

Unfortunately, at the moment I don't foresee us being able to upgrade from our 3050 to a 6080 anytime soon, as much as we would like to. I appreciate all the advice you've given and would be more than willing to hear any more that you may have.
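
To put the per-client figures above in perspective, here is a back-of-the-envelope sketch of the peak write demand if every server updated at the same moment. It assumes the per-client rates simply add up and that all 200 VMs start together; both are simplifying assumptions, and the result is demand placed on the SAN, not throughput the SAN would actually deliver.

```python
# Per-client write rates during a definition update, as quoted in the post (MB/s).
WRITE_MB_PER_S = {
    "Symantec Corp 10": 22.5,
    "Symantec Endpoint": 15.0,
    "ESET NOD32": 5.0,
    "Sunbelt VIPRE": 4.0,
}


def aggregate_write_mb_per_s(per_client_mb_per_s, concurrent_clients):
    """Worst-case demand: every client writes at its quoted rate at once."""
    return per_client_mb_per_s * concurrent_clients


CLIENTS = 200  # "about 200 VM servers" from the post

for product, rate in WRITE_MB_PER_S.items():
    total = aggregate_write_mb_per_s(rate, CLIENTS)
    print(f"{product}: {total:.0f} MB/s requested if all {CLIENTS} update together")
```

Even the lightest product in this list still asks for hundreds of MB/s when everything fires at once, which is why reducing concurrency matters as much as the choice of product.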

RParker
Immortal

First I will say I am not a Symantec fan. However, this is one of those 'outside the box' modes. Any antivirus software is going to update your clients. The problem is that since ALL your VMs are hosted on the SAN, they receive these updates simultaneously. So there should be a way to put your VMs in groups, so group A gets its update at 1:00, group B at 2:00, and so on and so forth, so that your VMs aren't getting hit ALL at the same time. That's what you should be looking at.

So we can all blame Symantec and their products for being crummy, but in this case the same thing happens with antivirus scans (all occurring at the same time, so CPU loads are high) and with deploying software simultaneously to those machines. They all would have the same adverse effect, so you have to stage or stagger your updates.

And I view this as less of a Symantec problem and more of an administrative problem.

JSeifried
Contributor

While you are right to a certain point, there is still an issue. When you have a Symantec AV environment and your only method of transporting definitions is via VDTM from your parent server to the clients, there is no way to schedule it. As soon as the parent server downloads new definitions it automatically pushes all definitions to all outdated clients simultaneously if VDTM is enabled. We do not use LiveUpdate in our environment except for when the parent server reaches out directly to Symantec to download definitions.

You can use LiveUpdate as an alternative/backup/primary update method and break the updating out into groups, but this would create a management nightmare. Each group would have to have a max of 5 VMs or it will create too much I/O. 150/5=30 groups. And that is just for our production environment at one site.
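
The grouping arithmetic above can be sketched as follows. This only splits a VM list into batches of at most five; it does not create groups in any Symantec console, and the VM names are hypothetical placeholders.

```python
def batch(items, max_size):
    """Split items into consecutive groups of at most max_size elements."""
    if max_size < 1:
        raise ValueError("max_size must be at least 1")
    return [items[i:i + max_size] for i in range(0, len(items), max_size)]


# Hypothetical names for the 150 production VMs mentioned in the post.
vms = [f"prod-vm{i:03d}" for i in range(150)]
groups = batch(vms, 5)

print(len(groups))  # 30
print(groups[0])
```

Generating the groups from an inventory list like this is the easy part; the management burden described in the post comes from maintaining 30 separate group schedules per site.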

doubleH
Expert

You can change the default method from pushing updates to clients once an update has occurred to having clients pull updates.

If you found this or any other post helpful please consider the use of the Helpfull/Correct buttons to award points

JSeifried
Contributor

In order to do that you would have to enable LiveUpdate. The point of VDTM is for the parent server to push updates to clients automatically.

http://service1.symantec.com/SUPPORT/ent-security.nsf/docid/2002111915202948?Open&src=tranus_ent_sl

JSeifried
Contributor

I found this link, some interesting things to try: . I've done everything up to the registry modification. I plan to test that out later today.

xnih
Contributor

This is an old thread now and a slightly different issue, but we were seeing spikes every ~3 mins with SAV clients and every ~5 mins with SEP clients (not all clients, just some). It appears to be an issue with the heartbeat setting, at least with SEP. Here is the thread on the Symantec forums about it. I only post this because I spent way too much time tracking this down and hope it helps someone else!

https://www-secure.symantec.com/connect/forums/disk-and-cpu-spikes

Above thread will be updated as I know more
