VMware Cloud Community
ascheler
Contributor

Performance w/ WinXP Pro VMs on iSCSI SAN

I have 3 WinXP Pro VMs as the only VMs on a VMFS volume configured as RAID 5 on a Compellent SAN, with 1 Gb iSCSI connectivity between the SAN and the ESX box. Three users use these VMs as their workstations all day long. They work great for general use (Word, Excel, Outlook, IE, printing), basically normal office work. But downloading or installing software of any kind on any of the 3 VMs drags the performance of all 3 down to the point where they are no longer usable. I'm surprised by the magnitude of the performance hit, even when just installing Adobe Reader. Something is not right. Any thoughts?

22 Replies
christianZ
Champion

How many disks are in your RAID 5?

Are you using the ESX software iSCSI initiator, an iSCSI HBA, or the MS iSCSI initiator inside the VM?

Which SCSI driver do your XP VMs use?

You can take the tests here to compare your results with others:

http://www.vmware.com/community/thread.jspa?threadID=73745

ascheler
Contributor

We have 13 x 300 GB 10k FC drives in our Compellent SAN, and apparently the controller uses all 13 drives to support the RAID 5 LUN. I'm using the ESX software iSCSI initiator and the BusLogic SCSI driver in the XP VMs. (Could LSI Logic be that much better? I have not tried it.) I did run a few benchmarks, and it seems like we are in the neighborhood of others' results in terms of iSCSI storage performance. I expected some performance degradation when one of the three VMs was busy heavily transferring files and installing software, but I didn't expect them all to grind to a halt like they do.

christianZ
Champion

Well I saw your results here:

http://www.vmware.com/community/thread.jspa?threadID=91878&tstart=0

That doesn't look good, I think.

Maybe you can post all your results?

I remember somebody here testing with XP and getting poor results; changing to Windows 2003 gave him better numbers.

You can also try changing the SCSI driver.

ascheler
Contributor

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

TABLE OF RESULTS - PHYSICAL

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

SERVER TYPE: Windows XP Pro

CPU TYPE / NUMBER: CPU / 2

HOST TYPE: DELL PE 2950, 16 GB RAM, 2 X INTEL XEON QUAD CORE 5320 @ 1.86 GHZ

STORAGE TYPE / DISK NUMBER / RAID LEVEL: COMPELLENT / 12 X 10K 300 GB FC / RAID 5

VMFS: 100GB LUN, RDM

SAN TYPE / HBAs : COMPELLENT, ISCSI, INTEL PRO/1000 PT, CISCO 2950 SWITCH

##################################################################################
TEST NAME                     Av. Resp. Time ms    Av. IOs/sec    Av. MB/sec
##################################################################################
Max Throughput-100%Read                    50.8           1171          36.6
RealLife-60%Rand-65%Read                   48.0           1183           9.2
Max Throughput-50%Read                     44.2           1215          37.5
Random-8k-70%Read                          60.1          919.8           7.2
##################################################################################
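A quick way to sanity-check figures like these is to multiply the IOPS column by the test's block size; the product should land near the MB/s column. The block sizes below (32 KB for the Max Throughput runs, 8 KB for the RealLife and Random runs) are an assumption based on the commonly used community IOMeter config, not something stated in this thread. A minimal Python sketch:

```python
# Sanity-check IOMeter results: MB/s should roughly equal IOPS x block size.
# Block sizes are assumed from the usual community test config, not measured.

def mb_per_sec(iops, block_kb):
    """Convert an IOPS figure to MB/s for a given block size."""
    return iops * block_kb / 1024.0

results = {
    # test name: (measured IOPS, assumed block size in KB)
    "Max Throughput-100%Read":  (1171,  32),
    "RealLife-60%Rand-65%Read": (1183,   8),
    "Max Throughput-50%Read":   (1215,  32),
    "Random-8k-70%Read":        (919.8,  8),
}

for name, (iops, block_kb) in results.items():
    print(f"{name}: {mb_per_sec(iops, block_kb):.1f} MB/s")
```

For the 100% read run, 1171 IO/s x 32 KB works out to about 36.6 MB/s, matching the reported value; the other rows land within roughly half an MB/s of their reported figures, so the numbers are at least internally consistent.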

ascheler
Contributor

BTW, I did change Windows XP Pro from BusLogic to LSI Logic and saw an improvement. But I'm still not satisfied with the iSCSI performance; I was expecting more. Compellent suggests that I should get more, and that other customers with similar boxes are seeing more. I get the sense that something just is not right with my setup. VMware suggests patch ESX-6657345. We'll see....

christianZ
Champion

Yes, that doesn't look good.

With 13 FC disks I would expect much better performance.

As I wrote, maybe you should try Windows 2003; in the past I saw big differences between the two (Win XP and 2003).

Have you checked your ESX NICs? Maybe you have an auto-negotiation problem.

Paul_Lalonde
Commander

1) Disable the XP firewall service (services.msc) completely

2) Add the following options to the registry under HKLM\System\CurrentControlSet\Services\Tcpip\Parameters

TcpWindowSize - REG_DWORD - 131400 (decimal)

Tcp1323Opts - REG_DWORD - 3 (decimal)

3) Disable the QoS policy. Open the Group Policy editor: Computer Configuration > Administrative Templates > Network > QoS Packet Scheduler, and set the reserved bandwidth limit to 0.

Then retry your test.
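As an aside on step 2, the TcpWindowSize value is not arbitrary: receive windows are conventionally sized as a whole multiple of the TCP maximum segment size. Assuming a standard 1500-byte Ethernet MTU (an assumption; the MTU isn't stated in this thread), a quick check:

```python
# TCP window sizing sanity check. On standard Ethernet the MSS is the
# 1500-byte MTU minus 40 bytes of IP + TCP headers, i.e. 1460 bytes.
MTU = 1500
IP_TCP_HEADERS = 40
MSS = MTU - IP_TCP_HEADERS   # 1460 bytes

window = 131400              # the registry value from step 2
print(window % MSS)          # 0 -> an exact multiple of the MSS
print(window // MSS)         # 90 full segments per window
```

Since 131400 bytes exceeds the classic 64 KB window limit, Tcp1323Opts = 3 (RFC 1323 window scaling and timestamps) is what makes a window this large usable at all.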

Been there, done all that! :)

Paul

ascheler
Contributor

##################################################################################
TEST NAME                     Av. Resp. Time ms    Av. IOs/sec    Av. MB/sec
##################################################################################
Max Throughput-100%Read                    84.7          704.4          22.0
RealLife-60%Rand-65%Read                   44.4           1283          10.2
Max Throughput-50%Read                     42.0           1391          43.8
Random-8k-70%Read                          44.6           1287          10.0
##################################################################################

ascheler
Contributor

Let me try that first one again........

##################################################################################
TEST NAME                     Av. Resp. Time ms    Av. IOs/sec    Av. MB/sec
##################################################################################
Max Throughput-100%Read                    36.2           1640          51.3
##################################################################################

ascheler
Contributor

Reflecting back on these changes: I implemented them and achieved slightly better performance results. However, I do not understand why. Maybe if I were using the MS iSCSI initiator inside the VM this would help, but the TCP parameters of my VM should have no bearing on the ESX software iSCSI initiator's performance.

I'm going to throw in a QLogic HBA and hope for the best....

mcwill
Expert

Before you try the HBA...

Is flow control enabled on the cisco switch?

Whilst you can't use jumbo frames (and it's debatable whether you should even if they were available), flow control is supported by ESX NICs and can give a large performance boost in situations where the SAN can send data faster than ESX can consume it.

pauliew1978
Enthusiast

Hi,

Looking at your results, there is definitely something wrong here. Your Max Throughput 100% read result is far too slow; you should be seeing nearer 100 MB/s, after all it is a gigabit connection. I would double-check that all the network links have negotiated correctly to 1000 Mb full duplex. I have a RAID 10 set on my SAN with 4 x 7.2k SATA disks and I am getting 100 MB/s on a gigabit connection; you should be up there, if not slightly faster, depending on your network hardware.
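For reference, the arithmetic behind that figure: a gigabit link moves 125 MB/s raw, and per-frame overhead at a 1500-byte MTU eats roughly 5% of that before any iSCSI protocol overhead is counted. A rough sketch, assuming standard Ethernet with no jumbo frames:

```python
# Practical ceiling for sequential iSCSI reads over one gigabit link.
raw_mb_per_sec = 1000 / 8    # 125 MB/s at gigabit line rate

# Per-frame accounting at a 1500-byte MTU: 1460 payload bytes travel in
# 1538 bytes on the wire (preamble 8 + Ethernet header 14 + IP 20 +
# TCP 20 + payload 1460 + FCS 4 + interframe gap 12).
efficiency = 1460 / 1538
ceiling = raw_mb_per_sec * efficiency
print(f"~{ceiling:.0f} MB/s before iSCSI protocol overhead")   # ~119 MB/s
```

iSCSI PDU headers and command processing shave this down further, which is why a healthy single-link setup typically lands around 100-115 MB/s on the 100% sequential read test.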

Paul

pauliew1978
Enthusiast

Also, I think you should check out your disks on the SAN (the RAID controller event log might tell you something). If you have some spare SATA drives and the SAN is hot-pluggable, I would whack them in, create a new RAID 0 set, create a LUN, let VMware use it (install a VM), and then run the test again. If that comes up with good results, you'll know something is wrong with the disk config on the SAN. Are you sure the RAID set hasn't broken and isn't running at degraded performance?

cheers

Paul

crazex
Hot Shot

I saw your original post, so I figured I'd run an IOMeter test on one of my VMs, which runs on a Compellent SAN. I was very surprised at how low the numbers were, as all of my performance graphs from the SAN interface, as well as other tests, gave me much higher results. My VMs are also running much better than they did when they were on physical servers using DAS. My Compellent Setup is:

Redundant Storage Controllers

30+2 10k FC disks - Tier 1

15+1 7.2k FATA disks - Tier 3

I have my high-performance VM LUNs set up in Tier 1 storage, configured as RAID 10, RAID 5 parity 5, and RAID 5 parity 9. The Compellent system uses Data Progression to move data, at the block level, between the tiers based upon usage.

My IOMeter results:

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

TABLE OF RESULTS

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

SERVER TYPE: VM - Windows Server 2003 Standard

CPU TYPE / NUMBER: VCPU / 2 (Intel X5355)

HOST TYPE: Dell PE2950, 16 GB RAM; 2 x Xeon X5355, 2.66 GHz, quad core

STORAGE TYPE / DISK NUMBER / RAID LEVEL: Compellent Storage Center x 2 / 30+2 FC disks, 15+1 FATA disks / R10 / R5-5 / R5-9 / Fibre connected

##################################################################################
TEST NAME                     Av. Resp. Time ms    Av. IOs/sec    Av. MB/sec
##################################################################################
Max Throughput-100%Read                 2.77132        1496.77         46.77
RealLife-60%Rand-65%Read                  11.69         500.37          3.91
Max Throughput-50%Read                     8.97         699.46         21.86
Random-8k-70%Read                         13.25         500.33          3.91
##################################################################################

Can anyone tell me why I'm seeing such low numbers? Could this test be skewed by Compellent's Data Progression?

-Jon-

VMware Certified Professional
christianZ
Champion

Have you checked the statistics in your VC?

Have you tested it on a physical server?

What I suppose is that the 5 minutes the test runs are too few for the Compellent to migrate the blocks to the higher-tier storage.

One could change the time to e.g. 20 minutes and then see whether it changes.

But the answer could only be given by a Compellent specialist, I think.

It would be nice to know the answer.

With ascheler's problem there are only FC disks, so only one storage tier, I think.

It is possible that such tests aren't right for this kind of storage; not sure here.

crazex
Hot Shot

Thanks. My problem is also related only to FC disks, as the LUN in question is only in my Tier 1 storage. When I look at my FC enclosures via the performance logs in the SAN management interface, I'm seeing good IO/sec and MB/sec on all of the disks. And like I said, the performance stats on the LUNs via the SAN management interface also show much higher numbers than my results from IOMeter. I also tried running IOMeter on a physical box and only saw slightly better performance. When I used Xiotach on both the physical box and the VM, I got numbers much closer to what I was expecting.

Xiotach Results:

67%Read/33%Write

5%Sequential/95%Random

229.32 MB/sec Avg.

3969 IO/sec Avg.

I'm going to contact our support person at Compellent and see if I can get some answers.

-Jon-

christianZ
Champion

You can check the numbers from perfmon as well. I have always seen the same values there as in IOMeter, but maybe in this case IOMeter isn't reporting them correctly.

The question for me is: when you see that many IOs in your storage statistics, why wouldn't Windows show them too?

CalVino650
Contributor

ascheler

I noticed that your SAN config is using a Cisco 2950 switch. This switch only has 10/100 ports, with gigabit for uplinks only. This will hurt your performance. For my iSCSI I'm looking at the Cisco 2960G or 3560G.
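The arithmetic behind that concern can be sketched quickly; the 95% frame-efficiency figure below is a rough assumption, not a measurement:

```python
# Back-of-the-envelope link ceilings: what a 10/100 access port would
# cap iSCSI at, versus a proper gigabit port.

def link_ceiling_mb(mbps, efficiency=0.95):
    """Approximate usable MB/s for a link, assuming ~95% frame efficiency."""
    return mbps / 8 * efficiency

fast_ethernet = link_ceiling_mb(100)    # Fast Ethernet access port
gigabit = link_ceiling_mb(1000)         # gigabit port
print(f"100 Mb port caps iSCSI near {fast_ethernet:.0f} MB/s")
print(f"1 Gb port caps iSCSI near {gigabit:.0f} MB/s")
```

Worth noting: the 36.6 MB/s in the earlier results already exceeds what a single 100 Mb port can carry, so the test traffic may have been riding the switch's gigabit uplink ports. The general point stands either way: every hop on the iSCSI path needs to be gigabit.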

crazex
Hot Shot

Well, I've been working with Compellent to troubleshoot my problems with the low IO performance. I decided to run the test with a LUN connected via iSCSI, and I got significantly better results. All of the setup for my test remained constant (Win2k3, dual proc, 2 GB RAM, Tier 1 30+2 FC storage), but I ran the test through a QLA4052 HBA connected to an HP ProCurve 2848 switch. Here are the numbers I got:

##################################################################################
TEST NAME                     Av. Resp. Time ms    Av. IOs/sec    Av. MB/sec
##################################################################################
Max Throughput-100%Read                   10.22        5488.87        171.53
RealLife-60%Rand-65%Read                  13.15        3885.71         30.36
Max Throughput-50%Read                    17.38        2963.63         92.62
Random-8k-70%Read                         13.27        4006.61         31.30
##################################################################################

I am still trying to figure out why my FC numbers are so bad, so I will update when I get the problem straightened out.

-Jon-
