VMware Cloud Community
sgunelius
Hot Shot

BrightStor Agent for Linux - Poor write performance during restore

I recently upgraded my ESX 2.5 server to 3.0.1 by backing up the .vmx & .vmdk files to tape using the 11.5 (SP2) BrightStor Agent for Linux. I saw pretty good backup throughput during the backup operation (1088MB/min) and verified I had a good backup.

I took the 2.5 host offline (DL580 G2), installed a SA-641 controller with 128MB BBWC module and configured 2x146GB U320 10K HDD as 135GB RAID1 array supporting the operating environment and 123GB VMFS3 partition. Once the ESX installation was complete, I installed/configured 2x300GB U320 15K HDD as 279GB RAID1 array supporting a 279GB VMFS3 volume dedicated for the VMs.

I reinstalled the Agent for Linux 11.5 (SP2) in the COS and initiated the restore, but the throughput was very poor (200-240MB/min). The problem seems to be isolated to the agent because I use SCP to transfer files to the VMFS3 partitions and achieve average throughputs of 700-800MB/min.
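For a sense of scale, the rates quoted above can be put on a common footing. This is a minimal Python sketch using only the figures from this post; taking the midpoints of the quoted ranges (220 and 750 MB/min) is a simplification made here for illustration, not something measured.

```python
# Rates quoted in the post, in MB/min. The midpoints of the quoted ranges
# are an assumption made for illustration only.
RATES_MB_PER_MIN = {
    "backup (agent, ESX 2.5)": 1088,
    "restore (agent, ESX 3.0.1)": (200 + 240) / 2,
    "scp to VMFS3": (700 + 800) / 2,
}


def to_mb_per_sec(mb_per_min):
    """Convert a rate in MB/min to MB/s."""
    return mb_per_min / 60.0


def slowdown(reference, observed):
    """How many times slower `observed` is than `reference`."""
    return reference / observed


if __name__ == "__main__":
    restore = RATES_MB_PER_MIN["restore (agent, ESX 3.0.1)"]
    for label, rate in RATES_MB_PER_MIN.items():
        print(f"{label}: {rate:.0f} MB/min = {to_mb_per_sec(rate):.1f} MB/s, "
              f"{slowdown(rate, restore):.1f}x the restore rate")
```

By that arithmetic the agent restore runs at under 4 MB/s, roughly a fifth of the backup rate and less than a third of the SCP rate.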

I verified that the write cache module was enabled and confirmed that all hardware (server, array controller and HDDs) was using the latest firmware. I opened an SR with VMware technical support to see if anything had been misconfigured, but there was nothing unusual in the logs.

Does anyone know if something has to be tweaked in the Agent for Linux configuration to achieve better throughput when performing a restore (writing to VMFS3)? Thank you very much.

Scott

Accepted Solutions
kix1979
Immortal

Keep in mind that restoring to VMFS is going to cause LOTS of SCSI reservation errors, because you are growing a file on disk rather than creating it at full size and filling in the data. Proper procedure for an agent should be to restore to ext3 and then use vmkfstools to import to the VMFS; it will still be faster than a direct import 95% of the time.

On top of that, if it is local storage, what is the RAID controller type? There are known issues with specific ones. Also make sure your firmware is at the appropriate level, and check your duplex settings; all too often these are misconfigured or not set to be permanent, and it will cause lots of fun...

Kix

Thomas H. Bryant III
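The distinction Kix draws, growing a file as data arrives versus creating it at full size and then filling it in, can be sketched on an ordinary filesystem. This Python example only illustrates the two write patterns; it does not touch VMFS or reproduce SCSI reservations, and the function names and chunk size are hypothetical.

```python
import os
import tempfile

CHUNK = 64 * 1024  # hypothetical chunk size for the illustration


def write_growing(path, chunks):
    """Append chunk by chunk, so the file size changes on every write."""
    with open(path, "wb") as f:
        for chunk in chunks:
            f.write(chunk)


def write_preallocated(path, chunks, total_size):
    """Set the final size once, then fill the data in place."""
    with open(path, "wb") as f:
        f.truncate(total_size)  # size is fixed up front (may be sparse)
        f.seek(0)
        for chunk in chunks:
            f.write(chunk)


if __name__ == "__main__":
    data = [bytes([i]) * CHUNK for i in range(16)]
    total = sum(len(c) for c in data)
    with tempfile.TemporaryDirectory() as tmp:
        grown = os.path.join(tmp, "grown.bin")
        prealloc = os.path.join(tmp, "prealloc.bin")
        write_growing(grown, data)
        write_preallocated(prealloc, data, total)
        # Both patterns end with identical contents; only the way the
        # file reaches its final size differs.
        with open(grown, "rb") as a, open(prealloc, "rb") as b:
            print(a.read() == b.read())
```

Both functions produce the same bytes on disk. The point of the post is that on VMFS the first pattern, where the file keeps changing size, is the one said to trigger the reservation errors.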

8 Replies
sgunelius
Hot Shot

I've opened up a ticket with CA to look into this issue and they've been provided the pertinent information, but I've heard no response from them yet.

Is anyone out there currently using Brightstor 11.5 (SP2) with the Agent for Linux running in the COS and backing up across the LAN? If so, what kind of backup rates and more importantly, restore rates have you seen? Thank you again.

Scott

kharbin
Commander

In ESX2, you could use as much CPU, network and disk I/O as you wanted/needed in the service console, as it is just a Linux host running VMware as a service (just type 'service vmware stop' to see). So your agent could use all it needed to make the backup.

In ESX3 the service console is now actually a scheduled VM, completely under control of the vmkernel. So if you did a default install, the service console only has 272MB RAM and 400MHz of CPU and may be sharing a NIC with others. So when you do the restore you are working under these constraints.

There's nothing wrong with the software; it just can't restore very fast because it's starved for resources and being time sliced with all the other VMs.

You could try increasing RAM and CPU to the console to see if it helps.

my 2 cents

Ken Harbin

www.esXpress.com

sgunelius
Hot Shot

Ken,

Thanks for the response. I did allocate the max memory (800MB) for the service console during ESX 3.0.1 installation and when I check using "free -m" from the service console, it indicates 780 total with 760 used.
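A note on reading those numbers: in the older `free` output layout, the "used" figure on the `Mem:` line includes buffers and page cache, so 760 of 780 MB "used" does not by itself mean the console is out of memory. The Python sketch below shows the subtraction, assuming that old-style column layout (total, used, free, shared, buffers, cached). Only the 780 and 760 values come from this post; the buffers and cached values in the sample are hypothetical placeholders.

```python
def used_excluding_cache(free_output):
    """Return (total, used, used_minus_cache) in MB from old-style `free -m` text.

    Assumes the Mem: line has the columns
    total, used, free, shared, buffers, cached.
    """
    for line in free_output.splitlines():
        fields = line.split()
        if fields and fields[0] == "Mem:":
            total, used = int(fields[1]), int(fields[2])
            buffers, cached = int(fields[5]), int(fields[6])
            return total, used, used - buffers - cached
    raise ValueError("no 'Mem:' line found")


# Hypothetical sample: only total=780 and used=760 are from the post;
# the buffers (40) and cached (600) values are made up for illustration.
SAMPLE = """\
             total       used       free     shared    buffers     cached
Mem:           780        760         20          0         40        600
"""

if __name__ == "__main__":
    total, used, real_used = used_excluding_cache(SAMPLE)
    print(f"total={total} MB, used={used} MB, used excluding cache={real_used} MB")
```

Running `free -m` during a restore and applying the same subtraction to the real buffers and cached columns would show whether the agent is actually short of memory.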

Configuration-Software-System Resource Allocation-System Resource Reservation shows 673MHz for CPU and Memory as 0MB, which seems to disagree with Configuration-Hardware-Memory-Physical which reports 800MB for Service Console.

The Service Console has a dedicated Gigabit NIC, so I wouldn't expect any contention with other VMs for network resources.

I may have to upgrade to 3.0.2 now, because I just read the release notes and saw that support has been added for CA BrightStor ARCserve 11.5 SP1, and I happen to be running 11.5 SP2. I had previously checked the "Backup Software Compatibility for ESX Server 3.x" guide and saw that BAB 11.5 was supported (Backup Client running within service console), but there was no specific mention of SP1 or SP2 (SP3 is the latest release, btw). Maybe there are some issues with 3.0.1 and the SP1/SP2 releases of BAB?

If you can think of anything else I should try, please let me know. This really isn't a huge issue as I'll have the hardware to implement VC and VCB shortly, but I was just concerned that something was misconfigured. Thanks again.

Scott

mdippold
Enthusiast

Can you try to restore to an ext3 filesystem and then use vmkfstools to import the files back to the vmfs? Maybe the BrightStor agent uses a way to write the files which doesn't work well in the service console.

Martin

sgunelius
Hot Shot

Martin,

I did test the restore performance to an ext3 partition and saw even worse performance. I understand from your and Kix's response that this is the accepted method for restoring these files. I was just trying to determine whether I had missed something in the configuration or if I had a hardware/software issue.

Kix,

I am using local storage to support the VMFS3 volume. I've got a Smart Array 641 controller with 128MB BBWC module installed. Write cache is enabled and the default 50/50 ratio for read/write cache is being used. I've applied the latest firmware available for the server, array controller and HDDs. I've got the drive bay switch set for simplex, so the single channel is supporting two arrays:

2x146GB U320 10K HDD as 135GB RAID1 array supporting the operating environment and 123GB VMFS3 partition

2x300GB U320 15K HDD as 279GB RAID1 array supporting a 279GB VMFS3 volume dedicated for the VMs

I didn't check the network connection during restore to determine if I was dropping frames or anything, but since the same network connection was used when restoring or using SCP from the same W2K3 server, I didn't think that would be responsible for the vast difference in performance.
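One way to check for dropped frames during a restore is to compare the interface error and drop counters before and after the transfer. The Python sketch below parses text in the Linux `/proc/net/dev` layout. That the service console exposes this file, and the interface name `eth0`, are assumptions made here rather than facts confirmed in this thread. The syntax is also modern Python, which the console itself may not ship, so the snapshots may need to be captured there and parsed elsewhere.

```python
import os


def parse_net_dev(text):
    """Parse /proc/net/dev text into {iface: {counter: value}}."""
    counters = {}
    for line in text.splitlines()[2:]:  # first two lines are headers
        if ":" not in line:
            continue
        iface, data = line.split(":", 1)
        f = [int(x) for x in data.split()]
        # Receive columns come first (8 fields), then transmit (8 fields).
        counters[iface.strip()] = {
            "rx_errs": f[2], "rx_drop": f[3],
            "tx_errs": f[10], "tx_drop": f[11],
        }
    return counters


def delta(before, after, iface):
    """Counter increases for one interface between two snapshots."""
    return {k: after[iface][k] - before[iface][k] for k in before[iface]}


def snapshot(path="/proc/net/dev"):
    """Read and parse the current counters from the given path."""
    with open(path) as fh:
        return parse_net_dev(fh.read())


if __name__ == "__main__":
    if os.path.exists("/proc/net/dev"):
        for name, values in snapshot().items():
            print(name, values)
```

Taking one `snapshot()` before the restore and one after, then calling `delta(before, after, "eth0")`, gives the increase in each counter. Non-zero increases would point at the network path rather than the agent.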

CA technical support just asked me for a Visio diagram of the architecture, but I think I'll tell them to get stuffed and probably just upgrade to ESX 3.0.2. Thank you very much for your responses.

Scott

sgunelius
Hot Shot

Kix,

I couldn't grant another "helpful", so you got "correct" even though the problem persists. I've got a VMware SE onsite tomorrow, so I think I'll perform a test restore and monitor ESX and our network to determine what's going on.

Aside from that, I'm in the process of downloading 3.0.2 and will be scheduling an upgrade for this 3.0.1 server. Hopefully this will solve the problem, although since VMware has only documented support for BrightStor ARCserve Backup 11.5 (SP1), I'll probably have to downgrade the agent from SP2 to be in compliance. Thanks again for your response.

Scott

reed1
Contributor

How did you get on with the downgrade to SP1 and the upgrade to 3.0.2? I have been trying to get SP3 to work with 3.0.2, but no luck as yet. I have a call logged with CA support.

Thanks

Ben
