VMware Cloud Community
AllenKyle
Contributor

Poor Performance from MSSQL after P2V

I'm a bit new to ESX and had an experienced friend of mine help me get our SAN/server combo up and running recently. Here's what we've got and what we've done.

2x Dell R900s: 4x hex-core processors, 128GB RAM, 2x 73GB mirrored drives.

EqualLogic SAN with 16x 1TB SATA drives, connected via iSCSI using two QLogic HBAs in each server and two Force10 S25 Gigabit switches set to 1000/full duplex with a 9000 MTU. The SAN is a single RAID 50 with two hot spares. Two 1TB VMFS datastores were created, and servers were balanced across them based on size and activity. We've also set the SAN to snapshot to the servers three times daily (which goes quickly).

We've got seven servers running across both hosts:

2x Terminal Servers

1x Exchange Server

1x Database Server (MSSQL)

1x Phone System Server

2x Domain Controllers

Everything works pretty well overall. Exchange seems a little slower, but our SQL app has lost about 40% of its performance. Nothing appears to be red-lined, but SQL queries just seem a great deal slower. Users complain about the app freezing up when others are running large reports and such. I've been reading up on best practices for SQL in ESX, but I'm finding a lot of misinformation and contradiction. Has anyone here gone through the same issue and found a solid solution for increasing performance? Any advice would be useful. Also, keep in mind that I'm a VM noob at this point, so any articles with instructions would be awesome. Thanks.

21 Replies
RParker
Immortal

You will get many conflicting reports on SQL and how it's used.

Yes, it's supported in a VM. Yes, it WILL work. Yes, MS supports SQL in a VM.

SQL DOES NOT work as well in a VM, however. It's pathetic if you ask me. Like I said, you WILL get conflicting reports.

You just did a P2V; does it work? Apparently not, so there you go.

I have been dealing with this for 3 years now. I have tried 2 vCPUs, Enterprise/Standard Windows. I have tried 2008/2003, both 32-bit and 64-bit. It's just not that difficult. I have tried tweaking, performance tuning, everything to get it to work. NOTHING makes a difference.

I quit trying, and I tell my users: if you want SQL, buy a machine, because I am not wasting a VM on it.

SeanDA
Enthusiast

Did you clean up the VM by removing all the 'nonpresent' hardware from Device Manager? Have you configured PAE extensions etc. in boot.ini?
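
If it helps, this is roughly how I surface the nonpresent devices (a standard Windows trick, nothing VMware-specific). Open a command prompt in the guest and set the flag, then launch Device Manager from that same prompt so it inherits it:

set devmgr_show_nonpresent_devices=1

start devmgmt.msc

Then enable View > Show hidden devices and uninstall the greyed-out NICs, storage controllers and other leftover hardware.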

I have had decent performance from SQL in a VM. Try installing the MS iSCSI Initiator inside the guest VM and connect to LUNs directly from the SAN - you should obtain performance comparable to physical hardware.

You will need a "Virtual Machine Network" port group created on the iSCSI vSwitch so you can connect a second NIC to the VM for use with the MS iSCSI initiator.

Download an eval copy of vSphere and try your VM on that - ensure you upgrade your VM hardware to version 7 and upgrade Tools - try using the new paravirtualised SCSI adapter and NIC for the VM iSCSI connection.

Try enabling Jumbo Frames (set to 9000) on the SAN, the iSCSI switches and ESX.
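
For reference, on ESX 4 the MTU can be set from the service console - vSwitch1 and the port group name below are just placeholders for whatever your iSCSI vSwitch and VMkernel port are called. Set the vSwitch MTU first:

esxcfg-vswitch -m 9000 vSwitch1

Then remove and re-add the iSCSI VMkernel port so it picks up the 9000 MTU:

esxcfg-vmknic -d "iSCSI VMkernel"

esxcfg-vmknic -a -i 10.10.10.11 -n 255.255.255.0 -m 9000 "iSCSI VMkernel"

Verify with esxcfg-vswitch -l and esxcfg-vmknic -l, and make sure the physical switch ports and the EqualLogic interfaces are set to 9000 as well.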

You should hopefully see a big performance increase.

JohnADCO
Expert

I have to wonder what the difference is.

We run a pretty mean DB on SQL and we actually realized performance increases with ESXi 3.5...

Our setup is so simple too, even cheapo iSCSI SANs. No HBAs even, just NIC cards.

Argyle
Enthusiast

If it freezes when users run big reports, I'd look into monitoring your disk performance. Also do some testing with tools like Iometer, and maybe some test cases with the bigger reports.

If you have big SQL reports running (like OLAP), it often doesn't go well if they share the underlying disks and arrays with other types of systems like OLTP. OLAP likes big sequential bandwidth (MB/s), while OLTP likes fast access times and smaller random reads/writes (IOPS). I would split those kinds of disk activities onto separate arrays in your disk system (as you would with physical SQL Servers). If you have extreme performance requirements on the reporting side, I would not bring those into the virtual environment.

When it comes to standard OLTP databases in a virtual environment, we've had great success splitting the SQL Server databases (by creating multiple mdf/ndf files) over multiple VMDKs that are on separate disk arrays, and sometimes even separate disk controllers. Basically the same way you performance-tune physical SQL Servers. A sketch of what that looks like on the SQL side follows.
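
As a rough illustration (the database name, path and size here are made up - adjust to your own layout), adding a second data file on a drive that sits on its own VMDK/array looks like this from a command prompt on the SQL Server:

sqlcmd -S localhost -E -Q "ALTER DATABASE SalesDB ADD FILE (NAME = SalesDB_data2, FILENAME = 'F:\SQLData\SalesDB_data2.ndf', SIZE = 4096MB)"

SQL Server then spreads new allocations across the files in the filegroup (proportional fill), so the I/O gets distributed over both disks.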

highlatency
Contributor

It all depends on the structure of your query in the SQL Server and which constraint it's running up against. If performance was significantly better on your physical server, I'd start by looking at the differences in the hardware platforms. You will potentially lose some performance moving from dedicated disk to a vDisk on VMFS. Look at the VMFS/disk performance (at the ESX host level, not in the VM) and try tuning the SQL Server VM to have far more disk shares than the other servers on the host (say four times the normal allocation). This is done on the Resources tab when you edit the VM configuration: select the VM, right-click and edit settings, go to the Resources tab, select Disk, change the disk shares from Normal to Custom and type in a number. If that seems to help, you may wish to consider using an RDM for your SQL Server disk.

AllenKyle
Contributor

I'll address the replies here (and thanks everyone for the input).

RParker: I still have my old physical server in the event that I have to roll back to it. I just don't want to admit defeat. lol

SeanDA: We removed the old NICs to avoid IP issues, but other than that I'm sure there are some old drivers in place that don't need to be there. I'm going to try building a clean new VM and migrating SQL to it, to be rid of all the old Dell driver/software rubbish on the machine and to ensure that Windows isn't the issue. We'll see how it goes. I'm hoping to avoid tying Windows directly to iSCSI in any way, to ensure there are no issues with vMotion (not that I'm sure there would be a problem). I'm also not eager to move to vSphere until it's a bit more established.

Argyle: Yes, that's exactly the problem we get. Big reports lock up the client or return blank results. It also tanks performance for other users while it's running. Performance on disk I/O isn't huge - I rarely see it break 20-30 mbps, and it usually sits under 12.5 mbps.

On the new VM, I added a third disk for logs, so I'll have the OS on C, data on D and logs on E. I'm not sure if there's a better way to lay out the disk files across VMFS partitions or not. I still have to migrate the app and the data over to test it. I can't see it being any worse.

DougBaer
Commander

For my 2 cents, I will throw out there that, despite a lot of FUD, SQL works fine in a VM.

The usual caveats apply: it must be architected properly (e.g. to segment database files from logs and create the appropriate number of files for each), the profile of the workload may require I/O isolation or other special considerations, and the code within the database may need to be optimized -- I have seen corrupt indices cause poor performance that was initially attributed to the virtualization platform.

That said, I would never recommend P2V'ing a SQL server. When we virtualize SQL, it is generally a new SQL install into the VM and a database migration to the new VM.

Doug Baer, Solution Architect, Advanced Services, Broadcom | VCDX #019, vExpert 2012-23
olegarr
Enthusiast

At my previous place I used an EqualLogic SAN and had SQL VMs running on it...

Here are the three recommendations I would give you right now:

1. You are using iSCSI... If you have not done it yet, create an additional network just for iSCSI traffic and reconfigure your ESX hosts (it's not very difficult). That way you will have separate networks for VMs, the Service Console and iSCSI (see the sketch after this list).

2. Follow MS best practice for SQL and place SQL data (mdf and ldf files) on separate drives. You might even consider using the MS iSCSI initiator inside your VM (I would use it), but it's up to your design...

3. Enable Jumbo Frames on the iSCSI vSwitch, your switch and your SAN (EqualLogic might do it automatically).
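
Rough sketch of point 1 from the ESX service console (the vmnic number, names and IP are only examples - use whatever fits your environment):

esxcfg-vswitch -a vSwitch2

esxcfg-vswitch -L vmnic2 vSwitch2

esxcfg-vswitch -A "iSCSI VMkernel" vSwitch2

esxcfg-vmknic -a -i 10.10.10.12 -n 255.255.255.0 "iSCSI VMkernel"

This gives you a vSwitch dedicated to iSCSI; add a "Virtual Machine Network" port group on the same vSwitch if you decide to run the MS iSCSI initiator inside the VM.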

Good luck

RParker
Immortal

For my 2 cents, I will throw out there that, despite a lot of FUD, SQL works fine in a VM.

True. I never deny this. My problem is that, by comparison, there is a SIGNIFICANT difference in performance between a VM and a physical host, ESPECIALLY for SQL. That's my only contention. You pay big $$$ for SQL, so you should get $$$ performance out of it. If it's not ALL that it can be, then you aren't getting your money's worth, period, end of subject.

That said, I would never recommend P2V'ing a SQL server.

Which makes THAT a valid statement. That's all I have ever said.

JohnADCO
Expert

I consider myself lucky then. I have P2V'd quite a few SQL servers, even systems where the C: drive was the only drive. They run more than OK for me; I actually think we gained performance.

On the meanest of them? The physical server was a 2-processor, dual-core Dell 4600 using internal RAID for storage.

I can only imagine you guys are talking about some really hefty SQL servers. I mean, my biggest databases are in the 50GB range. We do seem to have to run them with dual vCPUs for good performance, but the physical box was dual-proc, dual-core.

RParker
Immortal

On the meanest of them? The physical server was a 2-processor, dual-core Dell 4600 using internal RAID for storage.

I think the issue is you aren't looking at actual benchmark numbers. You feel that since you see the CPU running moderately on the physical box and see the same behavior on the VM, everything must be OK.

But put a SQL benchmark on that physical box and run the same benchmark on the VM, and you will see a HUGE difference. Maybe your users don't complain, because they feel it's all they will get, but for discerning SQL users who KNOW the difference, believe me, the difference is significant.

Run a query, even. I doubt very seriously that any long-running query will take the SAME time in both environments.
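
It doesn't have to be fancy - the server and table names below are obviously placeholders, so point it at one of your own heavy queries and compare the elapsed times SQL Server reports:

sqlcmd -S PHYS-SQL01 -E -Q "SET STATISTICS TIME ON; SELECT COUNT(*) FROM dbo.Orders;"

sqlcmd -S VM-SQL01 -E -Q "SET STATISTICS TIME ON; SELECT COUNT(*) FROM dbo.Orders;"

Run each a few times so caching evens out, then look at the 'SQL Server Execution Times' lines.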

JohnADCO
Expert

Our performance testing is timing actual production use. If production tasks are faster, I am OK with it; no need to go any further. We did quite a lot of testing before implementation.

I was thinking about this further since my last post. On our cheapo SANs, if we share LUNs to house the VMDK files of multiple busy VMs, it stinks - and not just for the SQL servers.

Our SQL servers get dedicated LUNs, one SQL server guest VM per LUN.

I still have the physical box for a little while longer. Got a specific SQL test or tests to perform? I'll give it a shot. I'd like to see it for sure.

RParker
Immortal

I still have the physical box for a little while longer. Got a specific SQL test or tests to perform? I'll give it a shot. I'd like to see it for sure.

Well, you are the first person in years I have seen with identical performance. Good job. I am glad that it worked for you, because for our testing and our SQL servers (yeah, they are monster boxes) it sure didn't work.

But for you, I am glad you found it to be a more convenient benefit. That's really great.

AllenKyle
Contributor

Our database isn't huge. It varies from 10GB to 20GB at most. I've been auditing our setup in an effort to meet the MS best practices. We'll see how it goes.

Yes, we do have a private iSCSI network using some pretty badass stuff. Jumbo frames enabled. Seems fast enough overall.

Thanks all for the input. Please continue to contribute any ideas.

JohnADCO
Expert

The users having it freeze sounds to me like either iSCSI reservations, or the multiple vCPUs assigned not becoming available fast enough. If I load up my SAN with some snapshots on the volume, or I am taxing the controller that owns the virtual disk too much with disk copies, etc., I can produce a similar result as well. I do have to watch what I am doing on the SAN somewhat during production hours.

I have never really benchmarked SQL. What are the most popular tests to run?

dominic7
Virtuoso

I have a few suggestions for you to try out.

First, have you enabled jumbo frames on the HBAs?

I'll also point out that by default ESX doesn't do multipathing, so take a look at esxcfg-mpath and set up the HBAs to do some path switching based on the number of commands or blocks passed through the adapter.
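
To see where you stand now, list the current paths and policy for each LUN from the service console (output varies a bit between ESX versions):

esxcfg-mpath -l

Each LUN will show its paths and whether the policy is Fixed, MRU or Round Robin; the policy can be changed per LUN in the VI Client under the host's storage configuration (Manage Paths).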

Second, have you tried using the Microsoft iSCSI initiator for the data volumes from within the VMs? I've found it to be faster, often twice as fast as even the hardware adapters. I'll link the results of a comparison an old coworker of mine did of iSCSI performance using the VMware software initiator, the QLogic HBA and the Microsoft initiator. I'll spoil the surprise and say that the Microsoft initiator won hands down in all tests. Since the Microsoft iSCSI initiator also does multipathing, you can add a second vNIC to your VM to increase the amount of bandwidth you have available (this can get tricky, as you may need to create multiple vSwitches to ensure that both of your virtual NICs are on separate vSwitches).

Third, have you done any filesystem alignment? If you created the VMFS volume using the VI Client (not the command line) you should be OK there, but you should also look into aligning your data partition:

Ensure that no data exists on the disk. Then open a command prompt and start the disk partitioning utility.

diskpart

Enter the command to select disk 1 (or whatever disk is your data disk).

select disk 1

Then create the aligned primary partition.

create partition primary align=64

All that said, the iSCSI implementation in ESX 4.0 is supposedly vastly superior; if you have the option, you may want to upgrade one of your hosts, vMotion the SQL server there and see what kind of improvement you get. I haven't done any testing of iSCSI in vSphere 4.0.

SeanDA
Enthusiast

I will try and summarise what I believe to be the most important points to look at:

  • Upgrade to vSphere and use the new ParaVirtualised NICs and SCSI adapters which should improve iSCSI performance significantly.

  • Create the VM from scratch - ensure vdisk for the OS is partition aligned (create partition using diskpart)

  • Use an SMP VM with plenty of RAM (at least 4GB) and ensure the multiprocessor HAL is being used.

  • Enable Jumbo frames (9000) on all components in the I/O path (JF are now fully supported in ESX 4)

  • Create separate LUNs on the SAN for system DBs (including tempdb), user DBs and logs, and present these using the MS iSCSI initiator from within the VM.

  • Format the LUNs with NTFS using a larger allocation unit size than the default (try as large as 64K - see the format example after this list).

  • Use stackable switches optimised for iSCSI where possible (Cisco 3750-G or equivalent). Create an EtherChannel group using ports from each switch to aggregate link speed for iSCSI access from within the VM (this also gives fault tolerance in case of switch failure). This should be accessed from a second vNIC in the VM, over a VM port group configured for 'Route based on IP hash' (the multiple uplinks in the vSwitch should be connected to the port-channel ports on the switches).

  • If the SAN supports it, present the iSCSI LUNs to the VM over multiple I/O paths (to different controllers): create a separate vSwitch with the same config as above, configure the VM with two iSCSI vNICs and the MS iSCSI initiator using multiple target addresses, and use MPIO for failover or MCS (multiple connections per session) for load balancing. This works best when each iSCSI NIC is on a different VLAN/IP subnet.

  • Use FC drives in the SAN where possible; SAS are next best, then SATA, which should be avoided for transactional I/O applications.

  • Use as many spindles as possible - more smaller drives work better than fewer larger drives
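
For the NTFS point above, the 64K allocation unit size is set at format time from within the guest (F: here is just a placeholder for the data LUN, and this wipes the volume, so do it before loading data):

format F: /FS:NTFS /A:64K /V:SQLData /Q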

I have had great success virtualising disk-intensive apps on VMware, especially in the more recent releases (vSphere especially), when using as many of the above recommendations as possible. Use Iometer to measure disk performance from a physical server using the MS iSCSI initiator connected to the LUNs. Then connect the LUNs to a VMware VM and repeat the tests. There should be no reason why you shouldn't be able to obtain results within 15% of the physical server.

To troubleshoot VMware network speeds, download iPerf.exe and run the client on the VM and the server on a physical box. Test varying combinations of frame sizes to ascertain the maximum transfer speed.
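
Something like this (the IP is a placeholder; -t is the test length in seconds and -P runs parallel streams). On the physical server:

iperf -s

On the VM:

iperf -c 10.10.10.50 -t 30 -P 4

Compare the throughput with jumbo frames on and off at each hop to see where it drops.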

Make good notes to ensure you build up a picture of what works and what doesn't.

Hope this helps!

Sean

DougBaer
Commander

  • Upgrade to vSphere and use the new ParaVirtualised NICs and SCSI adapters which should improve iSCSI performance significantly.

  • Create the VM from scratch - ensure vdisk for the OS is partition aligned (create partition using diskpart)

  • Format the LUN's with NTFS using a larger Allocation Size than the default (Try as large as 64K)

A couple of considerations here:

  • The paravirtualised SCSI adapter is best used as a secondary SCSI adapter in the VM to host the data volumes (keep the boot drive on the first SCSI adapter, BusLogic/LSI Logic).

  • It is more beneficial to partition-align the data volumes than the OS volume... and simpler.

  • The appropriate NTFS allocation unit size will depend on the workload and array characteristics -- bigger isn't always better.

Doug Baer, Solution Architect, Advanced Services, Broadcom | VCDX #019, vExpert 2012-23
AllenKyle
Contributor

Thanks everyone for the input. I gave SeanDA the credit for the best answer due to having the most complete summary. I've already tried some of those steps and will continue to implement as many as I can. Performance has certainly been improving. Thanks again everyone.
