I've recently moved our main backup system from Backup Exec 2014 to Backup Exec 15. It has gone horribly. (It was not my idea to use Backup Exec. I was very familiar with it, and hadn't had too many issues with it previously, but I still knew its reputation. This decision was made before I joined the company. Not wanting to make waves, I went along with it, tuned up the old 2014 setup, and made plans to move to the new BE 15 setup.)
On BE 2014, I managed to adjust things to get the backups stable and performing OK. They weren't great, but they were OK. The old backup server only had 1Gb networking: the previous admin had set it up with two 1Gb links bonded in software (Windows 2012r2 link bonding), and the iSCSI storage also used just a single 1Gb link. It was able to back up even our largest file shares in just over 28 hours, so all the full backups could get done over a weekend. Mostly, backups would average 500-600MB/min.
I started out with the new backup system on the exact same server and storage, but with 10Gb networking using an Intel dual port X520 and Cisco Nexus switches, one link for network and the other for iSCSI traffic. The first 4 backups I put on there worked perfectly, and I was seeing 1.3-1.8 Gb traffic on the server network side and 400-700Mb/s traffic on the iSCSI side. The actual backups, 3 VMs and one physical machine, were showing 1300-1700MB/min speeds in Backup Exec. No problem, I figured. This should make things a lot better. It was tested and ready to take on the rest of the backups.
Not so much.
Upon moving the rest of the servers (VMs and physical) over to the new backup system, it slowed down horribly. The physical machines back up slowly, at about 600-700MB/min, but they aren't horrible. For the VMs, though, I'm getting 180-240MB/min for all the backups, including our big file servers. It is taking upwards of 5 days to complete the jobs. Even the previously fast backups have slowed to the same crawl. I've tried direct VM backups without GRT, VM backups with GRT, and agent backups from the VM OS. All go at the exact same crawl. The VM hosts aren't even close to taxed. I can run 10 VM backups simultaneously and not impact the performance of the VMs' services at all. In fact, there's more than enough performance to spare. The VM OSes (Windows 2008r2) report less than 20% (dual) CPU usage, less than 10% disk busy time, and less than 10% (~100Mb) network usage. The backup storage isn't getting taxed, either. It shows less than 1Mb of activity over the 10Gb link and near zero busy time on the RAID.
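For anyone who wants to line the Backup Exec job rates up against the link speeds, here is a rough Python sketch of the conversion. It assumes that "MB/min" means decimal megabytes (10^6 bytes), which I have not confirmed for Backup Exec, and the function name is just something I made up for illustration.

```python
def mb_per_min_to_mbit_per_s(mb_per_min: float) -> float:
    """Convert a job rate in MB/min to Mbit/s.

    Assumption: 1 MB = 10**6 bytes (decimal), so MB -> Mbit is a factor of 8.
    """
    return mb_per_min * 8 / 60


# Nominal capacity of one 10Gb link, in Mbit/s.
LINK_MBIT_PER_S = 10_000

# Job rates quoted in this post: slow VM jobs, physical machines, early fast jobs.
for rate in (180, 240, 600, 700, 1300, 1700):
    mbit = mb_per_min_to_mbit_per_s(rate)
    share = mbit / LINK_MBIT_PER_S * 100
    print(f"{rate:>5} MB/min = {mbit:6.1f} Mbit/s ({share:.2f}% of a 10Gb link)")
```

Under that assumption, the slow VM jobs at 180-240MB/min work out to roughly 24-32 Mbit/s, which is a tiny fraction of what a 10Gb link can carry.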
I've gone to Veritas's support, and they have nothing. They were the ones who suggested switching up to non-GRT VM backups, and later direct agent backups through the VM's OS. None of their suggestions have helped at all. They decided to write off the support ticket saying we have inferior backup storage, but the storage sees almost no activity. They've completely abandoned me. I'm about a week away from having to roll back, and my job is likely in danger because of it. (It's not totally this issue threatening my job. Long story, but I am good at many things, just not the stuff we do at this company. I'm certainly not dumb.)
I'm at my wit's end. I can't figure this out, and Veritas can't figure it out. Has anyone else out there seen this behavior from BE 15? Is there a solution to this?
On the other side of the coin, the VM restores are working well. I'm currently running 5 backups (the big ones that take 5 days to complete, 3 on host 1 and 2 on host 2) at 200MB/min and I'm able to run a restore of 190GB at almost 900MB/min to a sixth VM on host 2.
Oddly, during the restore, the slow VM backups are actually going faster than before. They're now running at over 300MB/min. It's still slow, and I certainly don't want to have to run restores while I'm backing things up in order to increase performance, but it is an interesting data point.
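To put the restore and backup rates above side by side, here is a quick Python sketch that estimates how long a given amount of data takes at a given job rate. It assumes decimal units (1 GB = 1000 MB), which may not match how Backup Exec counts, and the function name is hypothetical.

```python
def hours_to_transfer(size_gb: float, rate_mb_per_min: float) -> float:
    """Estimate hours needed to move size_gb at rate_mb_per_min.

    Assumption: 1 GB = 1000 MB (decimal units).
    """
    minutes = size_gb * 1000 / rate_mb_per_min
    return minutes / 60


# The 190GB restore at roughly 900MB/min versus the same data at the
# roughly 200MB/min the slow backups are getting.
for label, rate in (("restore at 900MB/min", 900), ("backup at 200MB/min", 200)):
    print(f"190GB {label}: about {hours_to_transfer(190, rate):.1f} hours")
```

With those assumptions, the same 190GB takes about 3.5 hours at the restore rate and nearly 16 hours at the slow backup rate.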