tebruno99
Contributor

ESXi configuration for load balancing http

I just started working for a company using VMware ESXi. The setup here strikes me as a bit odd, and I'm wondering whether there is any benefit in using VMware for its current task and current configuration, or whether the configuration should be redesigned.

Current Setup:

ESXi 3.5 running on 2 servers with 8 cores each (16 cores total).

14 VM instances, all running exactly the same software: Apache 2.2.10. Apache fetches all data from a central NFS server.

Each VM is assigned 1 core and 2 GB of RAM. A load balancer then takes HTTP requests and distributes them across the 14 instances.
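A setup like this is typically expressed in the load balancer's own configuration. The thread never names the balancer in use, so this is only a sketch assuming HAProxy, with made-up names and addresses:

```
# Hypothetical haproxy.cfg fragment -- the balancer product, hostnames,
# and addresses are illustrative assumptions, not from this thread.
listen web_farm 192.0.2.1:80
    balance roundrobin          # rotate requests across the backends
    option httpchk GET /        # drop a VM from rotation if it stops answering
    server vm01 10.0.0.11:80 check
    server vm02 10.0.0.12:80 check
    # ... one "server" line per VM, through vm14
```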

To me this doesn't make much sense, given the amount of memory, storage, and maintenance required by 14 individual instances. Fewer instances would lessen the memory impact while still keeping some reliability if an instance fails. Running 14 copies of the same server behind a load balancer is very confusing to me. Is this a good idea? Is it common?

weinstein5
Immortal

Welcome to the forums! Remember that the whole purpose of the load balancer is to spread the load across multiple servers performing the same function, so it will depend on the load you are experiencing. If at peak load it seems that you have spare capacity, you might be able to reduce the number of VMs.

If you find this or any other answer useful please consider awarding points by marking the answer correct or helpful

tebruno99
Contributor

I'm curious: if there were only 1 server, would load balancing HTTP to multiple instances (8, 4, or 2) be better than just setting up the 1 server without VMware? I'm not opposed to using the VMware instances, but I wonder if this is the best configuration and task for our ESXi server. It seems like a lot of overhead, and the load balancer is still hitting 1 physical server.

jbogardus
Hot Shot

It also depends on the scalability of the application. If it scales linearly, where you can keep adding a certain quantity of CPU and RAM together and keep getting a linear increase in the amount of user traffic supported without performance loss, then it can be practical to run on fewer instances. Consider also, though, that as you add more CPUs to each VM in a VMware environment, you can create a little more contention between VMs, which can be counterproductive if the CPUs are being heavily utilized.

One of the big benefits of VMware can be in scaling applications that don't scale up in a linear manner very well. By creating multiple instances you can achieve better linear scaling for these applications by scaling out, using fewer resources than you would scaling up.

BillyCrook
Contributor

Yes, it is quite common. It's what happens when you combine VMware Inc.'s marketing hype with a sysadmin who doesn't understand what the software actually does.

The setup you describe wastes untold cycles cutting the workload up and reassembling it again. Uninstall VMware. Ditch the load balancer. Keep the website's files on the servers running Apache.

Aside from a headache, and the severe security vulnerabilities that can lurk in proprietary software like VMware ESXi, you might lose a degree of redundancy. If it is really so critical that not a single page load ever fail, ever, and that both servers' files be absolutely in sync, eternally, then use DRBD between the two servers. Preferably in a primary/secondary configuration with ext3 exported over NFS; alternatively primary/primary with GFS2.
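The primary/secondary DRBD setup he describes boils down to one replicated resource; a sketch, with hostnames, disks, and addresses as made-up assumptions:

```
# Hypothetical /etc/drbd.conf resource -- hostnames, devices, and
# addresses are illustrative, not from this thread.
resource r0 {
    protocol C;                  # synchronous replication between the pair
    on web1 {
        device    /dev/drbd0;    # the replicated block device (ext3 goes here)
        disk      /dev/sdb1;
        address   10.0.0.11:7788;
        meta-disk internal;
    }
    on web2 {
        device    /dev/drbd0;
        disk      /dev/sdb1;
        address   10.0.0.12:7788;
        meta-disk internal;
    }
}
```

The primary node mounts /dev/drbd0 and exports it over NFS; on failure the secondary is promoted and takes over the export.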

Either way, implement failure detection and reaction in a bash script, or with Heartbeat and cman if you really want to be fancy. Recent hardware may very well include IPMI, or you could get a switched PDU (for STONITH fencing, if you really think you need that. You are wrong. You do not.) You should be able to fence well enough by shutting off Apache and relinquishing the public IP to the other box.
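The bash-script approach for the two-box case can be sketched roughly as below. The VIP, interface, and peer hostname are made-up assumptions, and the takeover commands are left commented out because they require root and a live network:

```shell
#!/usr/bin/env bash
# Rough failover sketch, not a finished script. VIP, IFACE, and PEER
# are illustrative assumptions, not values from this thread.
VIP="192.0.2.10/24"    # floating service address
IFACE="eth0"
PEER="peer-web"        # the other Apache box

# Map an HTTP status code from the peer to an action.
decide() {
    if [ "$1" = "200" ]; then echo "hold"; else echo "failover"; fi
}

# Probe the peer's Apache; prints the HTTP status code ("000" if no answer).
check_peer() {
    curl -s -o /dev/null -w '%{http_code}' --max-time 5 "http://$1/"
}

# Main loop, commented out in this sketch (needs root and a real network):
# while sleep 5; do
#     if [ "$(decide "$(check_peer "$PEER")")" = "failover" ]; then
#         ip addr add "$VIP" dev "$IFACE"          # claim the VIP
#         arping -q -c 3 -I "$IFACE" "${VIP%/*}"   # refresh neighbors' ARP
#         break
#     fi
# done
```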

DNS load-balances well enough between two IPs, and consumes no resources. Use it.
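Round-robin DNS needs nothing more than two A records for the same name; a zone-file sketch with placeholder addresses:

```
; Hypothetical BIND zone fragment -- addresses are placeholders.
www    300    IN    A    192.0.2.11
www    300    IN    A    192.0.2.12
; Resolvers rotate the order of the answers, roughly halving the load
; across the two boxes.
```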

@[~84334]. OK, so you basically said nothing. Thanks. Have some points.

@jbogardus. Crikey. I must just have missed the part where you can't just run two instances of the same executable ON the host OS. I challenge you to show how running additional non-business OS software accelerates anything or makes any part more efficient.

jbogardus
Hot Shot

@jbogardus. Crikey. I must just have missed the part where you can't just run two instances of the same executable ON the host OS. I challenge you to show how running additional non-business OS software accelerates anything or makes any part more efficient.

I'll put forward an example. It may be more complicated than web servers, which for simple web sites should scale fairly well with additional CPU and memory. But maybe there is something more complicated going on with this particular website, where it's doing a lot more than just serving some static web pages, and it may quite possibly not scale up well.

In any case, my example is Citrix. I've seen several cases of applications running on Citrix where a single Citrix user could peg the CPU at 100%, affecting all the other users on that Citrix server. By scaling out to more Citrix instances in VMs on top of the same physical hardware, the negative effects of the CPU getting pegged by one user could be reduced, with far less effect on the users in the other Citrix VMs, because VMware implements a fairer CPU-sharing method than Windows allows to the Citrix users. This is a little more extreme than a web app example, but there are some web apps that behave this way to a lesser extent.

jbogardus
Hot Shot

@BillyCrook

I think it's an understood principle of distributed computing that some applications scale out better than they scale up, even some web applications. I don't disagree with the general thought that most web applications shouldn't behave so inefficiently that they benefit from being divided into as many instances as this one has been. However, I want to caution against assuming that this particular one will scale up really well. I wanted to present a plausible reason why this multi-instance configuration may have been done, so the environment isn't reconfigured without first questioning and understanding the scalability of the app. One way or another, the application's scalability on a single instance should be determined before deciding what the best configuration for it is.

BillyCrook
Contributor

So you showed a problem ("a single Citrix user could peg the CPU at 100%, affecting all the other users"), and you showed how VMware could mitigate it. However:

1) It does not improve efficiency. It explicitly reduces the compute capability available to every user to that of a single-core machine. That is horrible, and ironic too in the case of Citrix, the use of which often implies the desktop hardware is underpowered, sometimes intentionally.

2) That is not the correct way to solve the problem. If it is intentional malice, then it is a social problem, and you know what I'm about to say about that. If it's unintentional (proprietary web multimedia plugins), or one user who is simply more productive than the rest, then there are two solutions: ban the inefficient software, or use your equivalent of /etc/security/limits to grant a single user access to no more than a certain amount of CPU time, cores, RAM, simultaneous processes, or any one of over a dozen other metrics, splitting resources as fairly as you see fit. You could even permanently de-prioritize 'trouble' users, so they can max out all they want and other users' processes will get first dibs on CPU. Surely your Windows (if for no other reason than for what you paid for it) includes such trivial capability. If it doesn't, then surely Citrix, for what you paid for THAT, does. If it doesn't, then certainly you can find some anonymous 'patched' freeware on a Russian warez site, or Steve Gibson will throw you a bone that does this... Or maybe, just maybe, it's a sign of a deeper failing, and time to reevaluate your (NT) kernel's trustworthiness to fairly allocate resources.
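The /etc/security/limits mechanism mentioned above takes one line per cap; a sketch with a hypothetical user name and made-up values:

```
# /etc/security/limits.conf (pam_limits) -- the user name and numbers
# below are illustrative assumptions, not from this thread.
heavyuser  hard  cpu     60       # max CPU time per process, in minutes
heavyuser  hard  nproc   50       # max simultaneous processes
heavyuser  hard  as      2097152  # max address space, in KB (2 GB)
```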

I routinely "peg" every CPU on all of my systems for extended periods of time. If all of your CPU cores are NOT "pegged" around the clock, then you are wasting that hardware. You didn't buy it to sit idle, did you? I observe no effects (increased latency in the user interface), because I start the most CPU-hungry programs at a lower CPU priority. Even operations as CPU- and disk-intensive as computing the sha256sum of every file on every local filesystem create no noticeable load observable in interactive use.

In the event that users are unaware of their resource consumption, or wish to hoard resources knowingly, it is trivial for an admin to script the automatic deprioritization of known-offending programs and users. I have done this with a simple bash script in a cron job, but there are daemons that do the same thing; and instead of cron, you can "right click" in 'Scheduled Tasks' in 'Control Panel' and choose 'New Task', because that's simpler than "crontab -e".
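A minimal sketch of that cron-job approach, assuming a made-up list of offending program names; the renice/ionice lines are commented out since they need live PIDs and privileges:

```shell
#!/usr/bin/env bash
# Hypothetical deprioritization cron job. The offender list is an
# illustrative assumption, not taken from this thread.
OFFENDERS="transcode|folding|seti"

# Returns success if a command name is on the offender list.
is_offender() {
    echo "$1" | grep -Eq "^($OFFENDERS)$"
}

# Run from cron, e.g.:  */5 * * * * /usr/local/sbin/deprioritize.sh
# for pid in $(pgrep -x -d' ' "$OFFENDERS" 2>/dev/null); do
#     renice -n 19 -p "$pid" >/dev/null   # lowest CPU priority
#     ionice -c3 -p "$pid" 2>/dev/null    # idle I/O class, where supported
# done
```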

The correct solution is for the 'greedy' user to be able to fully use all eight cores in the box (or however many there are), but for other users to get their fair share as well. Fair in this case could mean, specifically, "max 1 running process at a time per user", replacing this misuse of virtualization; or using a better scheduler (from a better kernel), if you want to solve the problem and get better utilization out of your hardware. Virtualizing and running umpteen separate instances of the entire operating system to accomplish this is a shameless hack job and a misallocation of capital (proprietary software licenses, salary, etc.).

And regarding this 'up' and 'out' inanity: yes, I am well aware what scaling up and scaling out mean. I work in HPC. It's just that I have a particular distaste for overuse of buzzwords.

Are you aware that VMware is often abused to "scale out" by running entirely separate instances of the complete operating system and application? For nearly two decades now, most operating systems have been able to 'scale out' by running separate instances of just the application. If you need a special word for this, it is "multitasking". It is significantly more efficient because all of the system services and other managerial overhead of running additional hosts is no longer necessary. It also allows those multiple instances' shared libraries to be loaded into memory only once, so that even more memory can be used for actual business work.
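The memory half of that argument is easy to put rough numbers on. A back-of-envelope sketch, where the per-guest-OS and per-Apache footprints are assumed figures for illustration, not measurements from this thread:

```shell
#!/usr/bin/env bash
# Compare 14 full guest OSes against 14 Apache processes sharing one OS.
# Both footprint figures below are illustrative assumptions.
guest_os_mb=256   # assumed RAM footprint of one guest OS image
apache_mb=64      # assumed RAM footprint of one Apache instance
n=14

vm_total()   { echo $(( n * (guest_os_mb + apache_mb) )); }  # one OS per VM
proc_total() { echo $(( guest_os_mb + n * apache_mb )); }    # one OS, n processes
```

With these assumed figures, the VM approach costs 4480 MB against 1152 MB for plain multitasking: the OS overhead is paid once instead of fourteen times.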

Virtualization doesn't give you more processors. It absolutely will not ever increase your FLOPS or any other measure of CPU capability. The best it can ever do is reduce them only minimally, and there is still the additional cost (per VM) of that VM's operating system itself.

drummonds
Hot Shot

BillyCrook,

Breaking the web instances up into multiple, small virtual machines improves efficiency dramatically. This has been proven through exhaustive analysis. As for decomposition for administrative reasons, most admins I talk to are unwilling to put numerous Apache instances in the same OS install, for a few reasons:

  • Isolation to limit corruption from a single instance

  • Isolation to limit effects of runaway and zombie processes

  • Isolation to limit impact of a security compromise of a single instance

  • Simplicity of management (a larger number of identical instances being preferred to a smaller number of specialized ones)

These are the reasons our customers give me for decomposing Apache this way. As I am not a web administrator, I will not defend them here. But the performance benefit of this decomposition is beyond doubt. When VMware implemented a large number of small Apache VMs, we were able to set a new record for SPECweb2005. See the second figure here for an explanation as to why this is the case:

Scott

More information on my communities blog and on Twitter:

http://communities.vmware.com/blogs/drummonds

More information on my blog and on Twitter: http://vpivot.com http://twitter.com/drummonds
BillyCrook
Contributor

...Yes. because 4/2*2=5 when VMWare is involved...

"limit corruption"? Bad admin. Never limit corruption. Understand and prevent it.

Runaway and zombie processes? Don't run code that does that on any regular basis, and if it does happen by accident, it is the kernel's responsibility to deal with it.

You cannot assume a compromised VM guest can never affect other guests or the host, particularly when the virtualizer is proprietary and contains significant unpublished security vulnerabilities. Separating administrative/trust domains is a valid use of virtualization, i.e. to be able to give every VPS customer "root on his own box". But VMware no more isolates security compromises than NAT is a firewall.

After you "implemented a large number of small Apache VMs" did you use the exact same hardware of the host, and software from the guest, and compare with that? Exactly what 'serialization penalties' did you observe apache to be vulnerable to which vmware's host software was not? How were you configuring apache in the VMs and in the single install? What hardware did you use to provide "5 cpus" for that datapoint in the chart? 3? I would like to see more details if there are any.

drummonds
Hot Shot

Yes. because 4/2*2=5 when VMWare is involved...

Ah, I get it. Forget I said anything. You already have your conclusions.

Scott

BillyCrook
Contributor

I don't believe in magic, and I'm not placated by shiny graphs. I asked tangible questions you should be able to answer if your tests were valid. I would honestly like to hear the answers if they exist. If VMware's vmx process is not subject to the same constraints that other processes are, I'd like to know how you pulled that off.

jbogardus
Hot Shot

Since this discussion has all or almost all of your 4 posts since 2006, I take it that discussing this is all you want, and that it's not specific to this person's question. Why don't you start your own discussion on this instead of completely hijacking this one?

BillyCrook
Contributor

The OP asked me to give my two cents on this, and I did. His question has to do with how VMware can be of benefit in his situation. I don't think it can. If someone does, it would be prudent to explain how, here, not in another thread. What is off topic is my post count.

tebruno99
Contributor

All of our data is stored in NFS shares that all 14 web servers mount. With all of them pointing to the exact same data, how do more instances limit corruption?

Josh26
Virtuoso

BillyCrook,

Breaking the web instances up into multiple, small virtual machines improves efficiency dramatically. This has been proven through exhaustive analysis. Scott

Apache introduced its threaded MPMs, alongside the prefork model, precisely to deal with the issues you mention.

It's been a long time since there's been any sense to this; I can hardly see it being described as "beyond doubt".
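For context, the threaded MPM is tuned with a handful of directives; a sketch of an Apache 2.2 worker configuration, with sizing numbers that are illustrative only:

```
# httpd.conf fragment -- worker (threaded) MPM in Apache 2.2. The numbers
# are illustrative assumptions, not tuned for any particular site.
<IfModule mpm_worker_module>
    StartServers          4
    MaxClients          400
    MinSpareThreads      25
    MaxSpareThreads      75
    ThreadsPerChild      25
</IfModule>
```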

BillyCrook
Contributor

'Limiting corruption' was a vague answer to start with, but... IF your web app were so insecure that it gave strangers root on the box, AND your site's usefulness did not depend on that data being writable, then you could export it read-only, which would prevent any VM instance from editing the data.

Then again, you could just as well rsync it to the web servers at regular intervals and maintain a sane permissions model.
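Both options are one-liners; a sketch, with paths, networks, and hostnames as made-up assumptions:

```
# /etc/exports -- hypothetical read-only export of the web root, so no
# client can alter it. Path and network are illustrative assumptions.
/srv/www  192.0.2.0/24(ro,sync,root_squash)

# crontab on each web server -- hypothetical rsync mirror instead of
# sharing the data live over NFS.
*/15 * * * *  rsync -a --delete nfs-server:/srv/www/ /srv/www/
```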

drummonds
Hot Shot

There are whitepapers, presentations, blog articles, and a record-setting SPECweb2005 submission in the public domain. A presentation on the subject at VMworld 2008 showed the results of a VTune analysis that gave very specific information on Apache's scalability limits. The only way one can doubt these inherent limits, and the correction provided through VMware virtualization, is to disregard the code inspection and pretend that SPEC's reputation is anything other than unimpeachable.

I am not spoon-feeding technical explanations to people who open with accusations of misrepresentation. Go read up on SPEC, read the whitepaper, understand how VTune works, investigate the VTune analysis, and then come back with reasonable discourse and I will offer help, if needed.

Scott

Josh26
Virtuoso

There's no point arguing with someone so set in his methods, so I'll offer this to the original poster.

The one person who seems to think this is a good idea works for VMware.* Every vendor in history has an allegedly unimpeachable third party showing why their product performs better than the competition.

Citrix is knocking at my door with the same technical discussion "proving" that XenServer will do a faster job, but I'm not sinking money into that.

*This is not a guess; it's stated in his blog, in which almost every post declares "VMware makes everything faster".

BillyCrook
Contributor

Exactly! And while I don't doubt that VMware was (one of many things) used to set some record, I asked for specs and details of how they achieved it, and how they did any better with VMware than without. I never got the specs, and I suspect they never even tried it without VMware. So that silly little award says nothing about the benefit of VMware, only that its use doesn't completely kill all chances of winning some award. And frankly, I don't care. Only an idiot buys a computer and software to win an award; the rest of us buy them to do work. Not giving details only further demonstrates the lack of merit.
