VMware Cloud Community
jwnchoate
Contributor

Virtualized MS SQL Guests and Performance and Sizing for naysayers

Greetings All,

Sorry this is long, I just wanted to paint the picture of what my environment is and what issues are popping up. If you do read this, reply and help, THANKS!

I need some feedback from all you VMware pros out there on running MS SQL databases and other apps on VMware ESX 4. We have been a VMware shop for 7 years now, and we have grown over the years and been through several upgrades. I'm a VCP and have been pretty much the sole administrator of our environment the entire time. I am not, however, a SQL DBA, so I consider myself fairly unknowledgeable when it comes to tuning and working with databases, particularly MS SQL, though I am not a total noob. This creates friction with vendors and app guys fairly constantly.

About every other month I go through the usual routine about going physical with all of our busiest app servers and leaving VMware for the smaller servers such as print servers. So far, however, we have always managed to continue to virtualize and grow. But every day I get the usual routine from app people and non-technical types about adding MORE CPUs and RAM every time there is a perception of slowness. Sometimes they come to me right out of the box and want 8 CPUs and 8GB of RAM on every new server, but 2 weeks after moving into production I look at the performance numbers and see 5% CPU usage and 3GB of consumed memory. There are several boxes that do spike up for a few hours while a report runs, and if the numbers show we're bottlenecking I will add the resources. Typically I insist they start small and let us attempt to correctly size a box, but it's often a case of me vs. the mob and I am forced to 'give in'.

I do have a few allies, but I'm increasingly outnumbered as we have grown. I HAVE read the manuals and white papers and understand the majority of the docs out there. I have done my own testing and experimenting with various settings and have a fairly good working environment from my point of view. I have seen several very busy servers run amazingly well, which is why we continue to stay virtual. However, I have seen times where the environment does not appear to perform as well as I expect, and there's always room for improvement. I feel that overcrowding and over-allocation contribute in a number of cases.

I have 3 blade environments, but in this case I will narrow it to 2 of the host types and give a quick rundown of what we're running.

The first is a c3000 HP blade array with 8 BL460s. They are dual quad-core Intel boxes with 32G RAM, 4G FC connections and 1G Ethernet each.

The second is a c7000 HP blade array with 4 BL680c's. They are quad six-core Intel boxes with 40G RAM, 8G FC connections and 10G Ethernet each.

My SAN is a Compellent system. It has approximately 64 disks in 4 enclosures (shelves of disks); half are FC disks and the lower tier is SATA. Benchmark results depend on how the VM is set up, but in general I see 150-200MB/s in throughput tests and approximately 3000-4000 IOPS or more when my settings are geared for IOPS. Some tools report differently, so there is always some wiggle room.

We just added the new c7000 blades, so I have only just started to migrate. Before the upgrade, on the 8 x 8-way boxes, I had 113 VMs, 275GB of allocated RAM, and 260 allocated vCPUs. With 256G of physical RAM we had 275G allocated, which is not too terrible an over-allocation; however, I had to plead to upgrade physical RAM from 192G when ballooning was starting to cause me misery. I do feel that allocating 260 virtual CPUs against 64 physical CPUs seems a bit much. The overall ready values and total CPU overhead of the hosts are not too bad, but we have occasional spikes.
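
For anyone who wants to track these ratios over time, here is a minimal sketch of the arithmetic, using only the figures quoted above (8 hosts with 8 cores and 32G each, 260 vCPUs and 275G allocated). The function name is made up for illustration and is not part of any VMware tool.

```python
def overcommit_ratio(allocated: float, physical: float) -> float:
    """Return allocated / physical; a value above 1.0 means over-allocation."""
    if physical <= 0:
        raise ValueError("physical capacity must be positive")
    return allocated / physical


# Figures from the post: 8 hosts, each with 8 cores and 32 GB of RAM.
physical_cores = 8 * 8
physical_ram_gb = 8 * 32

vcpu_ratio = overcommit_ratio(allocated=260, physical=physical_cores)
ram_ratio = overcommit_ratio(allocated=275, physical=physical_ram_gb)

print(f"vCPU : pCPU = {vcpu_ratio:.2f} : 1")
print(f"RAM allocated : physical = {ram_ratio:.2f} : 1")
```

That works out to roughly 4.06 vCPUs per physical core and 1.07 for RAM. A ratio on its own says little; it only means something next to the CPU ready and ballooning numbers mentioned above.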

Now for the 'get to the point' moment :) I have once again been confronted with the 'we can't virtualize MS SQL' mantra. I can agree that physical servers DO run faster in benchmark tests, especially with direct attached storage. However, we officially run several databases on VMs and some on physical boxes, and yes, both have performance issues. I have done several benchmark tests myself, and have even come up with a few tricks that can make my SAN smoke with performance if I sacrifice a little fault tolerance in the setup of the .vmdk. What I don't have is a ton of MS SQL virtualization experience to go with it. I will repeat: I have seen several DB boxes run just fine and operate to the satisfaction of users and admins. The issue often revolves around the 'when to go physical' question.

On several occasions vendors, DBAs and app types have all at one point or another whined that virtualization was the 'root cause' of the problem. Only after some painful weeks or months of going back and forth and jumping through hoops (and 'giving in' to increased resources in the face of irrationalism), trying to prove to them that I had given enough resources and that performance was within tolerable ranges, do they dig deeper and find their problem was some lame configuration issue. THIS HAS HAPPENED MANY TIMES and is quite frustrating!

I would like some feedback and experiences from those of you who have the knowledge to steer me toward the answers I need to arm myself with when these issues arise over and over. I have searched for good docs and whitepapers on running DBs on VMware, covering sizing of volumes, CPU counts, RAM considerations, IOPS and throughput, but have never really come up with real-world info.

Thanks again for feedback and reading this novel!


Accepted Solutions
ChrisDearden
Expert

Good rant!!

SQL is a hard load to virtualise, but it is possible. Just remember that you can have big, busy VMs in your environment, but it will affect the consolidation ratio.

Even if you only get 2 busy SQL boxes on a blade, it can still work out cheaper, even when you take into account the additional VMware licence cost.

Plan storage like it was physical. You don't have to use RDMs, but you do want to make use of additional vSCSI adapters and datastores with a low count of VMs and the correct RAID type for the data (so separate volumes for OS/Data/Logs/Tempdb/Backup).

If this post has been useful , please consider awarding points. @chrisdearden http://jfvi.co.uk http://vsoup.net

9 Replies
jwnchoate
Contributor

Yes, it is a good rant!

Did a little playing around yesterday. I don't have all the information in, nor have I done a long enough analysis yet, but I figured a quick peek would suffice...

This morning I did some side perf testing with identical parameters; here is a small sample.

win2008x64 VM on our 24-core, 8G FC hosts, pvscsi:

2 tests appear to bring out some maximums for our virtual disks on the SAN...

8 threads, 4k sequential reads max out at 9,000 io/s

8 threads, 2M sequential reads max out at 650MB/s

other tests mixing reads and writes with more realistic tests.....

8 threads, 128K full random and 60% reads gave me 1800 io/s and 225MB/s

8 threads, 4k half random and 80% reads gave me 2300 io/s and 9MB/s

win 2k3 physical server, the twin (and idle) backup box to our so-called "busiest" SQL Server:

8 threads, 128k full random and 60% reads gave me 280 io/s and 32MB/s @ 30ms

8 threads, 4k half random and 80% reads gave me 860 io/s and 4MB/s @ 9ms; however, if left to run for a while it slowly ramps up and gets faster and faster, reaching 8,000 io/s over several minutes (perhaps this is memory cache kicking in)

8 threads, 2M sequential reads max out at 285MB/s

______________________________________________________________

Judging from the above, it appears from this simple test that my SAN is actually faster. Both boxes were idle, but one was 64-bit and the other 32-bit. I didn't have an x64 physical box available, so that could be a factor. However, this is also the same SAN where all my production VMs reside, and they showed no effects during the test.
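
As a sanity check on numbers like these, throughput should be roughly io/s multiplied by block size. A minimal sketch, assuming the tool reports MB/s in units of 2^20 bytes and that "128K" and "4k" mean 128 and 4 KiB (both are assumptions; the post does not say which units the benchmark tool uses):

```python
KIB = 1024
MIB = 1024 * 1024


def throughput_mib_s(iops: float, block_bytes: int) -> float:
    """Throughput implied by an io/s figure at a fixed block size."""
    return iops * block_bytes / MIB


# (label, reported io/s, block size in bytes, reported MB/s) from the tests above
results = [
    ("VM, 128K full random, 60% reads", 1800, 128 * KIB, 225),
    ("VM, 4k half random, 80% reads", 2300, 4 * KIB, 9),
    ("Physical, 128k full random, 60% reads", 280, 128 * KIB, 32),
    ("Physical, 4k half random, 80% reads", 860, 4 * KIB, 4),
]

for label, iops, block, reported in results:
    computed = throughput_mib_s(iops, block)
    print(f"{label}: computed {computed:.1f} MB/s, reported {reported} MB/s")
```

The computed values come out at 225.0, 9.0, 35.0 and 3.4 MB/s, close to what was reported, so the io/s and MB/s columns at least agree with each other.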

Another side note: I ran perfmon on our busiest SQL box for only a few minutes during a fairly busy period and found that total disk reads/writes were no more than a few hundred to a thousand per second during peaks, with averages around 50. This is not a long enough sample to see the 'normal' tolerances of performance needs, but I thought it was interesting.

RParker
Immortal

"This is not long enough of a sample to see the 'normal' tolerances of performance needs, but I thought it was interesting."

For the AVERAGE untrained eye, SQL performance is good ENOUGH on VM, but if your goal is ULTIMATE performance, physical is best. If you compare the SAME tests on a physical server (even if it's smaller and using local disk) you will see that PHYSICAL machines are about 20% faster.

That's 20% faster queries and 20% better disk. So for MOST things SQL on a VM is satisfactory (that's the highest I will give it). With SQL and Oracle, purists like myself will see that we can get the MOST out of a physical server for databases. That's the ultimate goal for database performance: speed.

Now, CAN SQL and Oracle be virtualized? YES. IS it fine? YES. Will you notice the speed difference? YES. Certain apps react differently (vCenter included) with SQL on a VM vs physical. The performance is fine, and MAYBE you won't quibble over the difference, but there IS a difference in speed.

"I can agree that physical servers DO run faster when doing benchmark tests."

And from the VERY beginning that's ALL I have ever said about SQL on VMs .....

jwnchoate
Contributor

I am aware that in most cases when you drag race a physical box against a VM, the physical wins. That's a very blanket, no-effort-of-thought statement. You don't need to explain it, as it makes sense. My argument is NOT about drag racing P vs V. It's not even about ULTIMATE performance.

It's about the VM and SAN getting blamed every time there is a performance issue, without any proof or logic.

I know from actual work and effort that our VM environment runs extremely well; in fact, it runs faster than many of our physical boxes now. It was not always that way, and over time it has been built up to be an excellent platform. Can I go out and spend $$'s on a brand new physical box with the intention of beating out a VM? I sure can, but is it realistic in every case?

I'm amazed at how often I get confronted by a DBA, app admin, or vendor who immediately gets horrified when we say we use virtual machines. I ask for proof or specific logical reasons we have to go physical and they can't give me any. Instead I get a link to some blog that says "SAN is slow", "VMs are slow", "physical is better", "direct attached is better". I have done specific tests, and a lot of them, a whole lot. And my results show that our VM environment is performing as well as any physical box. I know I can spec out a physical server to outperform it. But I want to know what our apps and DBs really need, and I get no answers, only fear and hearsay. I run tests on the current prod boxes and their needs are well below what we can do in a VM. I would have no problem if I was presented with a logical reason for a specific need.

jwnchoate
Contributor


This is a real conversation I have had with an app admin:

Admin: I need my new db server to be physical, I don't want users to complain.

Me: That db will only contain that one instance of that app's database. How many io/s, and how much CPU and memory, does your app need?

Admin: I don't know, but if we virtualize it, my screens will take 30 seconds to load and my users will complain.

Me: How do you know that? Have you ever tested this, or done any performance benchmarking to tell me what resources you must have?

Admin: No, it doesn't matter, it's well known that SANs are slow and I have this blog where someone did a test that says direct attached storage is faster. (followed up by an email with a link, "proving" their theory)

Me: I did an actual test on our servers and I ran perfmon on your old physical box. I see you're doing 20% CPU with 8 cores and 3G RAM usage (5G showing available free); your io/s maxed out at 1000 three times in a 24 hour period and averages 75 io/s. I tested our VMs and we can give you 8 cores and 8G RAM (overkill), and I know our SAN is capable of 2000 io/s at a minimum because I tested it. In fact, I compared it to a recently purchased physical server and the performance parameters are close to each other. (in my case this a.m., my VM was much faster on my equipment; perhaps the next nice physical box with a fancy DAS array will win)

Admin: But the vendor (or blog) says that VMs and SAN are slow and I NEED to buy a physical server or I will be disappointed.

Me: Can you show me any PROOF that this will be true?

Admin: No, I haven't had a chance to look at it, but I know it's true. Look at this web page I have, it takes 20 seconds to load and its database is on a VM.

Me: This is the first I have heard of this. Have you done any testing or gathered any data that shows why this is the case?

Admin: No, it's well known that VMs are "slow"

________________________________________

I'm not kidding, this has been my life, and I do understand there are cases where virtual is not practical when there is a GOOD REASON. It's not always this way, but it's a general theme I hear at least once a month. It's not about squeezing every nanosecond out of this database or app. It's about whether we can virtualize an app or a database and get acceptable, and in some cases much better than acceptable, results without having to constantly re-prove it works.
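
The comparison in that conversation can be written down as a simple headroom calculation. This is a minimal sketch using the figures quoted there (8 cores at 20% CPU, 3G RAM in use, a 1000 io/s peak, and a VM offering 8 vCPUs, 8G RAM and 2000 io/s). The class and function names are hypothetical, and it naively assumes one vCPU is worth one of the old box's cores, which real hardware generations will not honour exactly.

```python
from dataclasses import dataclass


@dataclass
class Measured:
    """What perfmon showed on the existing box."""
    cores: int
    cpu_util: float      # average utilisation as a fraction, 0..1
    ram_used_gb: float
    peak_iops: float


@dataclass
class Offer:
    """What the proposed VM would be given."""
    vcpus: int
    ram_gb: float
    iops: float          # tested storage capability


def headroom(measured: Measured, offer: Offer) -> dict:
    """Ratio of offered capacity to measured demand, per resource."""
    busy_cores = measured.cores * measured.cpu_util
    return {
        "cpu": offer.vcpus / busy_cores,
        "ram": offer.ram_gb / measured.ram_used_gb,
        "iops": offer.iops / measured.peak_iops,
    }


old_box = Measured(cores=8, cpu_util=0.20, ram_used_gb=3, peak_iops=1000)
vm = Offer(vcpus=8, ram_gb=8, iops=2000)

for resource, ratio in headroom(old_box, vm).items():
    print(f"{resource}: {ratio:.1f}x measured demand")
```

Anything comfortably above 1x means the VM covers what was measured; a ratio near or below 1x is the kind of hard number that would justify a conversation about going physical.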

RParker
Immortal

"Admin: No, it's well known that VMs are 'slow'"

To a degree, he is right. They ARE slow.. but it's because people seldom take into account WHERE a VM is running. So don't blame them, they don't know any better, any more than the avg VM Admin knows about SQL clustering or Table Transaction queries. If you aren't doing it every day you won't know the intricacies, it's up to US to educate.

"It's about whether we can virtualize an app or a database and get acceptable, and in some cases much better than acceptable, results without having to constantly re-prove it works."

This is the point I was trying to illustrate. When people want something (especially databases, because a TRUE DB Admin KNOWS what they are looking for), they want the BEST, ESPECIALLY when this impacts their job. He will look bad if the performance isn't there.

You already stipulated it's not AS fast as physical, so right there, the Admin has the advantage, that VM's are "slow"... so in reality he is correct. It's not perfect, and MAYBE it's ok performance, but in COMPARISON it's not.

I can come over, rip out 2 cylinders in your car... you drive home.. do you think you will just ignore the fact your car isn't performing as well? I think NOT! That's the difference. He isn't being picky, he is EXPECTING it to be the same to what he is USED to .. BIG BIG difference.

That's what I have to prove over and over again (to be devil's advocate): walk a mile in another's.. cubicle. See what they SEE. It's not the SAME to THEM. They can point to a server, SEE benchmarks, and say "look, it's NOT the same. Give me what I need".

Yes you can show that it's adequate, but would you consider dropping your CPU on your laptop or desktop by 20 or 30% and expect you NOT to notice? Think about it from THEIR point of view.

If this is the FIRST server he has ever set up, yeah, it will run fine.. but it's not. So therefore he can compare, and it's NOT the same. Also consider that SQL is the backbone of EVERY application, web, .NET, server (like vCenter). You slow down the database, everything else is slower, it takes longer to finish, longer to return queries, and longer to respond.. it ALL adds up.

jwnchoate
Contributor

First, if they actually gave me some numbers and worked with logic and reason, I would respect that. If they read it in a blog, slap it on my desk and offer it up as proof, I do not. Period. I ask for something hard and factual and get nothing in return except emotion and guesswork. If they can show me where they WILL get a significant improvement, with real measurable facts that have been tested in our environment, I will listen and possibly agree. But to sit across from me with no evidence and make a blanket statement that the database won't run and nobody will be happy is hogwash. If we went out and overspent every single time someone just 'wants' physical because it makes them feel better, we would not make any money and would not have any cash for other projects.

If they want to play that game, maybe I should google up some blogs about what poor SQL query coding can do, slap them on their desk and demand they look at their code? But I know that isn't a fair assessment. It is a two-way street. I never said I wouldn't listen if they made an effort to work with me and accepted that sometimes I can provide more than enough to support their needs.

Second, I own 2 cars. One is a 4-banger with a turbocharger; it gets great mileage, and I can get to work in about 40 minutes. My second car has 8 cylinders and over 400 hp. I still get to work in about 40 minutes. One car costs twice as much. I still prefer to drive one car over the other, yet my results are the same because there are other factors at play that affect the outcome.

I have had a few cases where we went to physical, but we did it after we tested and had justifiable reasons that we could quantify, and that often stirs up new resources for our virtual environment. I have also had more than a few times where we were able to show we could easily provide more than enough resources in virtual, and after a long process everything was working well in the end. It's the battle that it always takes to get there that amazes me.

jwnchoate
Contributor

One last note. I am getting away from what I am really looking for...

At what point do you decide not to go VM and move to physical? It's a debatable matter for sure. But I want to know how others who virtualize databases go about getting quality and quantity in their assessments of when to go from V to P.

Forget for the moment about P or V. How would you go about asking for new hardware? If your company's management just sits back and says "yes" every time you ask for more resources or a faster server, then you guys must make a huge pile of cash, or you had some large layoffs a few years ago. My management requires us to 'justify' expenditures.

I guess that's how my mentality came about. At some point, whenever someone asked for resources for a project and could not provide evidence, I would get a call and be asked to get with the requestor and come up with some good reasons to make the purchase. Probably because I do ask those questions, and provide the answers myself when I need something.

How does one go about quantifying a SQL box for NEEDS rather than WANTS? Obviously CPU, memory, IOPS and bandwidth all stake their claim in the game. Certainly, if you have a database that does 100 IOPS, you don't need a $20,000 DAS array capable of 15000 IOPS (I pulled those out of my pocket to make my point; they are not real numbers). A VM would work absolutely fine. If you show me a DB that consistently wants 1500 IOPS when it's 'idle' and spikes to 4000-5000 IOPS at several peak times a day, then we can talk turkey about going physical. In the end it could be any factor; I just want to know what you look for so I know what to ask for.
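
One way to put numbers on needs rather than wants is to boil a long perfmon capture down to an average, a 95th percentile and a peak. This is a minimal sketch, assuming the disk io/s samples have already been exported as a plain list of numbers. The sample values below are made up for illustration and are not measurements from this environment, and the helper names are hypothetical.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample with at least pct% of values at or below it."""
    if not samples:
        raise ValueError("need at least one sample")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def summarize(samples):
    """Reduce a series of io/s readings to the figures worth arguing about."""
    return {
        "avg": sum(samples) / len(samples),
        "p95": percentile(samples, 95),
        "peak": max(samples),
    }


# Hypothetical io/s readings; a real capture would cover days, not ten samples.
io_samples = [50, 60, 75, 80, 70, 65, 90, 300, 1000, 55]

summary = summarize(io_samples)
print(f"avg {summary['avg']:.1f} io/s, p95 {summary['p95']} io/s, peak {summary['peak']} io/s")
```

The average says what the box needs most of the time, the peak says what it must survive, and the percentile shows how rare the peak really is. Those three figures, set against tested storage capability, are a fair thing to ask a requestor for.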

jwnchoate
Contributor

RParker's Own Words 🙂

So, in your own words, SQL 2008 in ESX 4 is only "negligibly slower". Wow, this comes as high praise coming from you!

LOL dude, I am sure if you were in a cube just around the corner from me, I would be able to drop by and ask for specific facts and figures that concerned our particular environment and your databases, presented in a fashion that made sense rather than rehashing the latest garbage article from techworld.net. I would be delighted to get an intelligent answer.
