VMware Cloud Community
peterC_
Contributor

Virtualized Windows SBS 2011 extremely slow

Hi all,

We have a Windows Small Business Server 2011 running on VMware ESXi 5.0.0-469512.

The hypervisor runs four machines: two Debian, one Windows XP, and our production Windows SBS 2011.

The other three machines have their resource settings as low as possible and they run OK (they are just dev machines).

The server has the following configuration:

Dell PowerEdge R310

Intel Xeon X3470

HT Active

PERC H200 Adapter (vmhba2) with the drives in a RAID 5 array (at this point I am not sure whether the cache is enabled or not)

The Windows SBS is extremely unresponsive. Copying a file from a storage server runs at roughly 200 kB/s to 3000 kB/s (bytes, not bits), which is damn low on a gigabit network.
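To put those copy speeds in perspective, here is a rough back-of-the-envelope sketch. It assumes 1 kB = 1000 bytes and a nominal 1 Gbit/s link with no protocol overhead; `link_utilisation` is just an illustrative helper name, not part of any tool.

```python
# Compare the observed copy speed with what a gigabit link could nominally carry.
LINK_BITS_PER_SECOND = 1_000_000_000  # assumption: nominal 1 Gbit/s, no overhead


def link_utilisation(observed_kb_per_s: float) -> float:
    """Return the fraction of the nominal link rate used by a transfer."""
    link_bytes_per_second = LINK_BITS_PER_SECOND / 8  # 125,000,000 bytes/s
    return (observed_kb_per_s * 1000) / link_bytes_per_second


for rate in (200, 3000):  # kB/s, the range quoted in the post
    print(f"{rate} kB/s is {link_utilisation(rate):.2%} of the nominal link rate")
```

Even the best case quoted is only a few percent of what the wire could carry, which suggests the network itself is probably not the bottleneck.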

It has 4 vCPUs and 12 GB RAM assigned (because of Exchange, RAM usage is always at its peak); CPU usage seems low.

esxtop:

    GID  VMNAME   VDEVNAME  NVDISK  CMDS/s  READS/s  WRITES/s  MBREAD/s  MBWRTN/s  LAT/rd  LAT/wr
1490460  sbs2011  -              1  101.47    61.61     39.86     11.12      0.99    9.24   40.19
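For anyone trying to read a row like this, a small sketch that splits it into named fields and derives the average I/O size may help. It assumes the column order shown in the header above, a VM name without spaces, and that esxtop's "MB" means 1024 KB; the class and function names are made up for illustration.

```python
from dataclasses import dataclass

# The sbs2011 row from the post, whitespace-normalised.
ROW = "1490460 sbs2011 - 1 101.47 61.61 39.86 11.12 0.99 9.24 40.19"


@dataclass
class VDiskStats:
    gid: int
    vmname: str
    cmds_per_s: float
    reads_per_s: float
    writes_per_s: float
    mb_read_per_s: float
    mb_written_per_s: float
    lat_rd: float
    lat_wr: float


def parse_row(line: str) -> VDiskStats:
    """Split one data row; fields 2 and 3 (VDEVNAME, NVDISK) are skipped."""
    f = line.split()
    return VDiskStats(int(f[0]), f[1], *(float(x) for x in f[4:11]))


def avg_io_kb(mb_per_s: float, ops_per_s: float) -> float:
    """Average size of one I/O in KB (assumption: 1 MB = 1024 KB)."""
    return mb_per_s * 1024 / ops_per_s if ops_per_s else 0.0


stats = parse_row(ROW)
print(f"avg read size:  {avg_io_kb(stats.mb_read_per_s, stats.reads_per_s):.0f} KB")
print(f"avg write size: {avg_io_kb(stats.mb_written_per_s, stats.writes_per_s):.0f} KB")
print(f"write latency is {stats.lat_wr / stats.lat_rd:.1f}x the read latency")
```

Note that READS/s plus WRITES/s adds up to CMDS/s, and that writes are noticeably slower than reads in this sample.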

I'm not sure about the peak in the screenshot, though :(

It's really weird. I've seen the threads in here about the latency problem, but found no solution. I'll try to check the caching on the PERC H200 (I'm roughly 80% sure it's enabled).

Any ideas about this? I think the machine should be relatively responsive with this configuration, but it isn't.

The other three machines share the rest of the resources (i.e. 4 cores and 4 GB of RAM divided between them).

1 Solution

Accepted Solutions
westcoaster
Enthusiast

According to this Dell article http://content.dell.com/us/en/enterprise/d/campaigns/dell-raid-controllers, the H200 doesn't support cache and doesn't support battery backup. Most likely the controller has two options: write-through and write-back mode. In write-through mode the hard drive's internal cache and Windows caching are disabled, leading to the lowest performance. This is done to increase data safety in the case of a blue screen or power failure. The other option, write-back mode, enables the hard drive's internal cache and improves performance, at an increased risk of data loss or corruption if the system does not shut down cleanly.

If you have good backups and this isn't a mission-critical system, you can consider changing the controller policy to write-back mode. Otherwise you should consider getting a RAID controller that has cache and battery backup.

PS: I think the controller policy setting for caching is set on the virtual disk in the controller setup.
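The trade-off between the two modes can be sketched with a toy model. This is not how the H200 or any real controller is implemented; the class, the method names, and the latency figures (10 ms for a disk write, 0.1 ms for a cache write) are made-up assumptions purely for illustration.

```python
class ToyController:
    """Toy model: write-through waits for the disk, write-back only for the cache."""

    def __init__(self, write_back: bool, disk_ms: float = 10.0, cache_ms: float = 0.1):
        self.write_back = write_back
        self.disk_ms = disk_ms    # hypothetical time to commit one block to disk
        self.cache_ms = cache_ms  # hypothetical time to store one block in cache
        self.on_disk = []         # blocks that are safely persisted
        self.dirty = []           # blocks acknowledged but not yet persisted

    def write(self, block: str) -> float:
        """Accept a block and return the latency the host sees, in ms."""
        if self.write_back:
            self.dirty.append(block)
            return self.cache_ms
        self.on_disk.append(block)
        return self.disk_ms

    def flush(self) -> None:
        """Commit cached blocks to disk (happens in the background in real life)."""
        self.on_disk.extend(self.dirty)
        self.dirty.clear()

    def power_loss(self) -> int:
        """Return how many acknowledged blocks are lost without a battery or UPS."""
        lost = len(self.dirty)
        self.dirty.clear()
        return lost


for mode in (False, True):
    ctrl = ToyController(write_back=mode)
    latency = sum(ctrl.write(f"block{i}") for i in range(100))
    name = "write-back" if mode else "write-through"
    print(f"{name}: {latency:.0f} ms for 100 writes, {ctrl.power_loss()} lost on power failure")
```

The faster mode is exactly the one that can lose acknowledged writes, which is why good backups (or a UPS) matter before switching.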

8 Replies
SG1234
Enthusiast

Is there any way to monitor the array-level stats? Also, how about doing the same amount of I/O on the other guests: is it faster there?

peterC_
Contributor

omg you're right!

I just started an FTP transfer to a Linux machine and it's slow as well (~2.49 MB/s):

    GID  VMNAME  VDEVNAME  NVDISK  CMDS/s  READS/s  WRITES/s  MBREAD/s  MBWRTN/s   LAT/rd   LAT/wr
4459492  dev     -              1    7.06     0.19      6.87      0.00      1.99  2065.04  1225.38

I have no idea what those numbers actually mean. But aren't 2065 and 1225 a bit TOO high?

What should I do? :(

Those machines use thin provisioning.

sparrowangelste
Virtuoso

Check your disk's latency

http://sparrowangelstechnology.blogspot.com/2012/08/check-esxi-hosts-disk-latency-storage.html

Given what you stated, the cache might not be enabled.

--------------------- Sparrowangelstechnology : Vmware lover http://sparrowangelstechnology.blogspot.com
peterC_
Contributor

The question is: WHICH values are within tolerance and which are not?

peterC_
Contributor

Oh, found it:

A value of 25 or more is not good.

Mine is roughly 100-250.
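As a quick sanity check, the latency figures posted earlier in this thread can be compared against that rule of thumb. The sketch below assumes the esxtop LAT/rd and LAT/wr values and the threshold are all in milliseconds; `is_problematic` is a hypothetical helper name.

```python
# Rule of thumb quoted above: 25 or more is "not good" (assumed to be milliseconds).
THRESHOLD_MS = 25.0

# Values copied from the two esxtop rows posted in this thread.
samples = {
    "sbs2011 LAT/rd": 9.24,
    "sbs2011 LAT/wr": 40.19,
    "dev LAT/rd": 2065.04,
    "dev LAT/wr": 1225.38,
}


def is_problematic(latency_ms: float) -> bool:
    """True when a latency sample meets or exceeds the rule-of-thumb threshold."""
    return latency_ms >= THRESHOLD_MS


for name, value in samples.items():
    verdict = "NOT good" if is_problematic(value) else "ok"
    print(f"{name:<15} {value:>8.2f}  {verdict}")
```

Only the read latency on sbs2011 stays under the threshold in these samples; everything else is above it, the dev guest by a very wide margin.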

Whoopsie. I'll check the cache setup.

peterC_
Contributor

Thanks for the tip. I haven't been able to shut down the machine yet, as the virtualized Exchange is mission critical.

First I'll try to find a way to back up the Exchange database (without corrupting it; the file is more fragile than a wine glass) and then I'll set up write-back mode. Hopefully the setting won't discard the array :s

The thing is, I'm not sure whether it is possible to swap the PERC H200 for a controller that supports battery-backed caching without breaking up the array and being offline for a day, or whether a better controller model with a battery is even available for the Dell R310 (it's a 1U unit).

peterC_
Contributor

OK, so after a bit of telephoning, I managed to sort this thing out a little.

Sadly, the PERC H200 will NEVER be as fast as a *REAL* RAID controller, even with cache enabled, but the change in performance on the guest OSes is visible.

Now, this is for the following config (but by analogy you should be able to get it running on other versions, too):

ESXi 5.0 (no update)

DELL R310 with PERC H200

Here's the "awkward" solution (or: how to freakin' enable at least a tiny bit of cache on a server sold in 2011 with a 1980-vintage "RAID" controller, if I should even call it that):

*** WARNING! BE SURE TO HAVE AT LEAST A UPS, OR ELSE YOU COULD LOSE DATA *** THERE IS NO BATTERY BACKUP UNIT FOR A PERC H200 RAID *ahem* CONTROLLER ***

0. PUT THE ESXi HOST IN MAINTENANCE MODE!

1. Download the vSphere CLI for your VMware version (I used https://my.vmware.com/group/vmware/details?downloadGroup=VCLI50U1&productId=242 , a file named VMware-vSphere-CLI-5.0.0-615831.exe; if VMware takes the file down for some reason, just google the exact file name and you'll find it).

2. Install Dell OpenManage Server Administrator (OMSA) for ESXi 5. This is some kind of weird Dell idea of how to remotely manage their servers through the hypervisor *shrug*. Basically you need to install the SERVER software on the ESXi host and the CLIENT software (running as a service) on some Windows machine, which will THEN connect to the OMSA on ESXi. http://www.dell.com/support/drivers/us/en/04/DriverDetails/Product/poweredge-r710?driverId=WWT8H&fil... is the .vib file to install; OMSA 7.2 is the latest version, so get that.

Don't ask me why they call it a bundle when it's only the server software, but moving on. Use GOOGLE (yes, because the Dell website doesn't give you a hint where to get the "client" part of this rabbit hole).

Try searching for OM-SrvAdmin-Dell-Web-WIN-7.2.0-6870_A00.exe . BE SURE TO DOWNLOAD THE VERSION CORRESPONDING TO YOUR SERVER (7.2.0 in this case)! Otherwise you won't be able to connect. (BTW, if you open it afterwards in Chrome, it tells you your browser is not supported. Way to go in 2013!)
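If you want to double-check that the client installer matches the server side before installing, a tiny sketch like this can pull the version out of the file name. It assumes the naming pattern seen in the one file name above (a `-X.Y.Z-` part) and that matching on major.minor is enough, which the post does not confirm; both helper names are hypothetical.

```python
import re
from typing import Optional


def omsa_version(filename: str) -> Optional[str]:
    """Extract an X.Y.Z version from a file name, or None if there is none."""
    match = re.search(r"-(\d+\.\d+\.\d+)-", filename)
    return match.group(1) if match else None


def same_release(client_filename: str, server_version: str) -> bool:
    """Compare major.minor only (assumption: the patch level does not matter)."""
    client = omsa_version(client_filename)
    if client is None:
        return False
    return client.split(".")[:2] == server_version.split(".")[:2]


client_file = "OM-SrvAdmin-Dell-Web-WIN-7.2.0-6870_A00.exe"
print(omsa_version(client_file))
print(same_release(client_file, "7.2"))
```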

3. Now the tricky part, which I have already forgotten how I did, but roughly: copy the *ahem* bundle into the vSphere CLI directory, start the CLI and follow the instructions in ftp://ftp.dell.com/Manuals/all-products/esuprt_electronics/esuprt_software/esuprt_ent_sys_mgmt/dell-... (yes, a PDF; thank god they managed to save it so that you can copy the commands out of it ...)

4. Once you're done (after an hour of googling and trying to install one damn vib in a zip file into ESXi), reboot ESXi (did I mention above that you need to be in maintenance mode?). Then go to the Configuration tab of the ESXi host, find the user variables (ESXi selected, Configuration -> Advanced Settings -> UserVars) and check that UserVars.CIMoemProviderEnabled (or something named nearly like this) is set to 1. With version 7.2 the variable was already set to 1 for me.

5. Now you're ready to go, and if you managed to do everything right (2 cups of coffee, one chocolate and 1 L of Coke), you should be able to start the OpenManage client on your computer. Type in the IP of the ESXi host and your credentials, !!! IGNORE CERTIFICATE !!! and try to log in. If all goes well (BTW, if you ignore the message about the browser, you can run it in Chrome as well), you should be able to configure your Dell server.

6. Click through the configuration of the controller; at one point you will be able to "Change Policy", and there you can set the write cache to YES. Again: if you do NOT have a UPS, I would not recommend doing this.

There, you're done. The rough IOPS gain is about double, and latency on my system went down from 250 to ~120. Still high and practically useless, but it works.
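For what it's worth, here is the arithmetic behind those figures as a small sketch. It assumes the latencies are in milliseconds and reuses the "25 or more is not good" rule of thumb quoted earlier in the thread; the function name is illustrative only.

```python
def percent_reduction(before: float, after: float) -> float:
    """Relative improvement from `before` to `after`, in percent."""
    return (before - after) / before * 100


before_ms, after_ms = 250.0, 120.0  # latency before and after enabling the write cache
threshold_ms = 25.0                 # rule of thumb quoted earlier in the thread

print(f"latency reduced by {percent_reduction(before_ms, after_ms):.0f}%")
print(f"still {after_ms / threshold_ms:.1f}x the rule-of-thumb threshold")
```

So the latency roughly halves, which is consistent with the IOPS roughly doubling, but it remains several times above the threshold.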

I hope we never get a power outage longer than 40 mins :S

Thanks all for your help!
