VMware Horizon Community
daleallenc
Contributor

View Secure server flaky

I've had to reboot my View Secure gateway a couple of times now because the View Portal was inaccessible. I'm a Windows IIS guy, not an Apache guy, so I'm not even sure what to look for to prevent a recurrence of whatever's happening. The server's in the DMZ and works normally most of the time. Because we're early in our View deployment, we have fewer than a dozen users remotely accessing virtual desktops.

Any ideas?

jflisher
Contributor

Sorry for the delayed response daleallenc. I've been out of town.

We have not had an issue in over five business days. This is the longest period we have made it through. We had support look at the logs, but they could not tell us any more about the Java errors/warnings we saw. Adding the memory seems to have helped. Also, we are using two connection servers (brokers) and have half our users going to one and the other half going to the other. We have a total of 42 concurrent connections.

amcwilliams
Contributor

Hi jflisher,

Aside from the memory changes, were there other changes you made while looking for a solution? We're having similar problems and have plenty of memory in our connection and security servers.

Thanks!

krismcewan
Enthusiast

We have a very similar issue here at a customer site.

We have 3 Connection Servers (View Manager) behind round-robin DNS. Each server has 2 GB of memory and uses only 1 GB at most.

We restarted the Composer service and set it to run under a domain account.

Is this a .Net problem?

A VMware Consultant in Scotland for Taupo Consulting http://www.taupoconsulting.co.uk If you think I'm right or helpful award me some points please

daleallenc
Contributor

I can't speak for the others, but we've narrowed our troubleshooting down to the network adapter or the TCP/IP stack for it. Restarting the service on our Security server, a VM, never helped. When we did a Windows repair of the network connection, everything returned to normal.

krismcewan
Enthusiast

We have a question mark over the network configuration here. The VMs and the ESX hosts are fine; it's what lies beyond them that we have no control over.

We're now trying to convince the customer to get a network company to evaluate their setup.

jflisher
Contributor

In our environment we have about fifty virtual desktops connecting to two connection brokers (view servers). We did the following and have not had any issues in over thirty days.

1. Split the load so 25 machines talk to one server and 25 talk to the other.

2. Removed Microsoft NLB. This only complicated things. Forget Round Robin DNS.

3. Bumped up the RAM.

This solution obviously has a single point of failure: if the View server a client depends on fails, the client needs to re-establish its connection. You can put more than one View server in the View client drop-down list. However, this requires a manual switch. In my opinion this should happen automatically and be transparent to the user. Hopefully we will see this in future releases.
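To make the idea of an automatic switch concrete, here is a minimal Python sketch of trying a list of brokers in order and picking the first one that answers. It is only an illustration of the logic, not something the View client offers: the function names, the example host names and the choice of TCP port 443 as the probe are all assumptions.

```python
import socket
from typing import Callable, Iterable, Optional


def tcp_probe(host: str, port: int = 443, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds.

    Port 443 is an assumption for this sketch; use whatever port your brokers listen on.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def first_reachable(brokers: Iterable[str],
                    probe: Callable[[str], bool] = tcp_probe) -> Optional[str]:
    """Return the first broker for which `probe` succeeds, or None if all fail."""
    for broker in brokers:
        if probe(broker):
            return broker
    return None


# Hypothetical broker names; a fake probe stands in for the network here.
up = {"view2.example.com"}
print(first_reachable(["view1.example.com", "view2.example.com"],
                      probe=lambda host: host in up))
```

Passing the probe in as a parameter keeps the selection logic testable without touching the network; in real use the default `tcp_probe` would be called against the actual broker addresses.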

John Flisher

Information Technology Manager

North Carolina State Ports Authority

From: krismcewan <communities-emailer@vmware.com>

To: <john_flisher@ncports.com>

Date: 05/19/2009 08:54 AM

Subject: New message: "View Secure server flaky"

krismcewan
Enthusiast

We have done a few tweaks too.

The Connection Server service runs under a domain login instead of the local service account.

Set reservations on resource pools.

Removed stale records for decommissioned AD services from DNS.

Seems to have worked a treat.

We have 400 desktops to build up to, over 14 ESX hosts and 3 connection servers, moving to 5 as the rollout happens. Round robin is needed in this instance. We are using Sun Ray servers as the broker, as it's Sun Rays that are the thin clients.

I also found nLite to be very useful in creating gold XP images. Since virtual desktops don't need all the drivers under the sun, stripping the XP footprint down to 1 GB as opposed to 3 GB has made a big improvement.

Chris

jflisher
Contributor

What happens in your round-robin setup if one of the connection servers fails and a client gets sent to that address?

krismcewan
Enthusiast

It should fail over to the next one.

I might try it later and see what happens.

jflisher
Contributor

Let me know how that goes.

The way I understand round robin, if the request goes to the IP address of a server that has failed, the client will not connect. If the request is initiated again on the client end, in your scenario it will have an 80% chance of connecting to one of the remaining 4 out of 5 servers. A 20% failure rate is not acceptable in my environment while someone manually reconfigures DNS or gets the failed server back online.

This is why I think this issue really needs to be brought to the attention of VMware by their customers. Failover of client connections to brokers should be built into View.
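The arithmetic behind the 20% figure can be sketched in a few lines of Python. This is a simplified model, assuming every lookup is equally likely to land on any of the DNS records and that attempts are independent (no client-side caching); the function name and the retry parameter are illustrative only.

```python
def failure_probability(total_servers: int, failed_servers: int, attempts: int = 1) -> float:
    """Chance that every one of `attempts` independent round-robin picks hits a failed server."""
    if total_servers <= 0 or not 0 <= failed_servers <= total_servers:
        raise ValueError("need total_servers > 0 and 0 <= failed_servers <= total_servers")
    per_attempt = failed_servers / total_servers
    return per_attempt ** attempts


# One failed broker out of five, single connection attempt.
print(f"{failure_probability(5, 1):.0%}")
# Same scenario if the user tries a second time after the first attempt fails.
print(f"{failure_probability(5, 1, attempts=2):.0%}")
```

Under this model a retry helps considerably, but the first attempt still fails one time in five until the dead record is removed from DNS.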

krismcewan
Enthusiast

The reconfigure isn't that bad. Just remove the offending server from the DNS file.

Unable to test it just now as the customer is getting their blades reconfigured.

There are a few things that need polishing up in View: the web interface is slow, it only runs on IE, and there is the bottleneck of having 1 connection service.

Hopefully they will make huge improvements when integrating it into vCenter 4

VCP, VTSP4, VSP4, MCSE, MCTS, IBMBCE and anything else I can learn.

amcwilliams
Contributor

Hi Everyone,

We've gone a week now without a problem getting the login page to come up. We found that if a user hit enter 4 or 5 times rapidly after entering the URL, not only did the login page come up for them, it came up normally for all other users as well. That became our workaround although the problem would inevitably resurface several hours later.

After some back and forth with VMware, one of our engineers may have found a lasting solution (at least to our version of the problem).

The error in the VMware View Connection Server logs was:

11:04:17,750 DEBUG <AJP-81> (Request1382) Connection marked as not reusable, closing.
11:04:30,437 DEBUG <SessionHandler> (E161C5E0D89B15A2A2C782B65309268D) hasSessionLostContact(): threshold = Wed May 13 11:03:30 EDT 2009, lastSeen = Wed May 13 11:04:29 EDT 2009

The workaround is to change the timeout value for the connection:

Go to the C:\Program Files\VMware\VMware View\Server\broker\conf directory on the Connection Server.
Open the server.xml file there and change the following line:
<Connector port="8009" enableLookups="false" protocol="AJP/1.3" URIEncoding="UTF-8">
TO
<Connector port="8009" enableLookups="false" protocol="AJP/1.3" URIEncoding="UTF-8" connectionTimeout="900000"/>
You will need to restart the broker service once you have made the change.

This issue arises with some firewalls because of the timeout values they apply to the required connections that pass through them between the Security server and the Connection server.
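For anyone who would rather script the server.xml change than edit it by hand, here is a minimal Python sketch using the standard library's xml.etree.ElementTree. It works on an XML string so it can be tried safely; the sample document and the function name are made up for illustration, and the real server.xml will contain much more. Note that ElementTree drops XML comments when it rewrites a document, so back up the original file before applying anything like this to it.

```python
import xml.etree.ElementTree as ET


def set_ajp_timeout(server_xml: str, timeout_ms: int = 900000) -> str:
    """Return server_xml with connectionTimeout set on every AJP/1.3 Connector element."""
    root = ET.fromstring(server_xml)
    for connector in root.iter("Connector"):
        if connector.get("protocol") == "AJP/1.3":
            connector.set("connectionTimeout", str(timeout_ms))
    return ET.tostring(root, encoding="unicode")


# Hypothetical, heavily trimmed stand-in for the real server.xml.
SAMPLE = """<Server>
  <Service name="Catalina">
    <Connector port="8009" enableLookups="false" protocol="AJP/1.3" URIEncoding="UTF-8"/>
  </Service>
</Server>"""

print(set_ajp_timeout(SAMPLE))
```

The broker service still has to be restarted afterwards, exactly as with the manual edit.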

Hope this helps!

nonoski
Contributor

VMware Server 2 Web Access Connection Loss (vmware-hostd crash) Workarounds

November 16th, 2009, by George Knerr
http://webalution.com/techshare/2009/11/16/vmware-server-2-web-access-connection-loss-vmware-hostd-crash-workarounds/

Summary of Issue

After upgrading to RHEL 5.4, CentOS 5.4 or Ubuntu 9.10, the latest 2.x.x versions of VMware Server suffer serious Web Access GUI connection failures, specifically vmware-hostd crashing repeatedly. This has been found with VMware Server 2.0.0, VMware Server 2.0.1 and VMware Server 2.0.2. VMware Server 2.x.x was stable on the previous revisions of the mentioned OSes. Below are two solutions that "appear" to make for a stable vmware-hostd process. You are strongly advised to satisfy yourself of the stability of vmware-hostd using these solutions before deployment to a mission-critical environment.

Neither solution requires you to stop all VMware-related processes on the host server. The following steps assume vmware-hostd has crashed and left VMware clients still running.

Verify the vmware-hostd Process Has Failed

Note: If you get the below from the ps command (vmware-hostd still listed as running), you have another issue and this document is not for you.

  # ps -ef | grep vmware-hostd
  root 10858 1 0 16:47 ? 00:00:02 /usr/lib/vmware/bin/vmware-hostd -a -d -u /etc/vmware/hostd/config.xml
  root 11055 11026 0 17:02 pts/3 00:00:00 grep vmware-hostd
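The same check can be scripted. Below is a minimal Python sketch that decides from `ps -ef` output whether vmware-hostd is running, ignoring the grep process itself. The function name is made up for this example, and it assumes the usual eight-column `ps -ef` layout (UID PID PPID C STIME TTY TIME CMD).

```python
def hostd_is_running(ps_output: str) -> bool:
    """Return True if a vmware-hostd process, other than a grep for it, appears in `ps -ef` output."""
    for line in ps_output.splitlines():
        # Split into at most 8 fields so the full command line stays in one piece.
        fields = line.split(None, 7)
        if len(fields) < 8:
            continue
        command = fields[7]
        if "vmware-hostd" in command and not command.startswith("grep"):
            return True
    return False


# Sample taken from the ps output shown in the post.
SAMPLE = """\
root 10858 1 0 16:47 ? 00:00:02 /usr/lib/vmware/bin/vmware-hostd -a -d -u /etc/vmware/hostd/config.xml
root 11055 11026 0 17:02 pts/3 00:00:00 grep vmware-hostd
"""

print(hostd_is_running(SAMPLE))
```

On a live host the input would come from something like `subprocess.run(["ps", "-ef"], capture_output=True, text=True).stdout`.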

Regaining VMware Server 2 Web Access GUI Control

If you want to start the vmware-hostd process to manage your VMware Server 2 guest operating systems again, you may do so with the following commands.

  # export LD_LIBRARY_PATH=/usr/lib/vmware/vmacore:/usr/lib/vmware/hostd:/usr/lib/vmware/lib/libxml2.so.2:/usr/lib/vmware/lib/libexpat.so.0:/usr/lib/vmware/lib/libstdc++.so.6:/usr/lib/vmware/lib/libgcc_s.so.1:/usr/lib/vmware/lib/libcrypto.so.0.9.8:/usr/lib/vmware/lib/libssl.so.0.9.8
  # /usr/lib/vmware/bin/vmware-hostd -a -d -u /etc/vmware/hostd/config.xml &
  [1] 11139
  # <hit return/enter>
  [1]+ Done /usr/lib/vmware/bin/vmware-hostd -a -d -u /etc/vmware/hostd/config.xml
  # ps -ef | grep hostd
  root 11140 1 22 17:13 ? 00:00:01 /usr/lib/vmware/bin/vmware-hostd -a -d -u /etc/vmware/hostd/config.xml
  root 11155 11026 0 17:13 pts/3 00:00:00 grep hostd

nohup is not needed in this instance as vmware-hostd runs as a daemon, but the ampersand "&" is. Otherwise you'll get logged output to the screen, and when you exit your session vmware-hostd will stop too.
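Because the LD_LIBRARY_PATH value is long and easy to mistype, here is a small Python sketch that assembles it from a list and builds the matching command line. The directory list and the vmware-hostd arguments are copied from the commands in the post; the function names are made up, and the sketch deliberately does not launch anything.

```python
import os

VMWARE_LIB = "/usr/lib/vmware"

# Library directories in the same order as the export command in the post.
LIB_DIRS = [
    f"{VMWARE_LIB}/vmacore",
    f"{VMWARE_LIB}/hostd",
    f"{VMWARE_LIB}/lib/libxml2.so.2",
    f"{VMWARE_LIB}/lib/libexpat.so.0",
    f"{VMWARE_LIB}/lib/libstdc++.so.6",
    f"{VMWARE_LIB}/lib/libgcc_s.so.1",
    f"{VMWARE_LIB}/lib/libcrypto.so.0.9.8",
    f"{VMWARE_LIB}/lib/libssl.so.0.9.8",
]


def hostd_env(base_env=None):
    """Return a copy of the environment with LD_LIBRARY_PATH set for vmware-hostd."""
    env = dict(os.environ if base_env is None else base_env)
    env["LD_LIBRARY_PATH"] = ":".join(LIB_DIRS)
    return env


def hostd_command():
    """Return the vmware-hostd command line from the post as an argument list."""
    return [f"{VMWARE_LIB}/bin/vmware-hostd", "-a", "-d", "-u", "/etc/vmware/hostd/config.xml"]


print(hostd_env({})["LD_LIBRARY_PATH"])
print(" ".join(hostd_command()))
```

To actually start the process from Python, these two values could be handed to `subprocess.Popen(hostd_command(), env=hostd_env())`, run as root on a host where VMware Server 2 is installed.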

Solving the VMware Server 2 Web Access GUI Connection Failure

I recommend looking at both solutions. I'm currently employing solution #2, but I'll leave that decision up to you. Both allow you to use the /etc/init.d/vmware start/stop script as you normally would and are permanent, unlike the quick fix above for getting the vmware-hostd process up and running again. Again, with both solutions you need to determine whether they in fact produce a stable VMware Server 2 environment before deployment to a mission-critical environment.

SOLUTION #1 (libc-2.5.so reversion - RHEL 5.4 & CentOS 5.4)

Download and copy libc-2.5.so into place:

  # lynx http://mirror.centos.org/centos/5.3/os/x86_64/CentOS/glibc-2.5-34.x86_64.rpm
  # rpm -Uvh --root=/tmp/ --nodeps ./glibc-2.5-34.x86_64.rpm
  # mkdir /usr/lib/vmware/lib/libc.so.6
  # cp /tmp/lib64/libc-2.5.so /usr/lib/vmware/lib/libc.so.6/libc.so.6

Edit /usr/sbin/vmware-hostd, adding the following export command just before the last line in the script:

  # tail -3 /usr/sbin/vmware-hostd
  export LD_LIBRARY_PATH=/usr/lib/vmware/lib/libc.so.6:$LD_LIBRARY_PATH
  eval exec "$DEBUG_CMD" "$binary" "$@"

SOLUTION #2 (Circumventing vmware-hostd library wrapping script - RHEL 5.4, CentOS 5.4 & Ubuntu 9.10)

Here is another method that does not require reverting to an older version of libc-2.5.so. The downside of this solution is that it circumvents the dynamic library path building of the /usr/sbin/vmware-hostd script and executes the /usr/lib/vmware/bin/vmware-hostd binary directly. I do not know if this will present problems in the future or not.

Below is the snippet from the modified /etc/init.d/vmware. You can see I added an LD_LIBRARY_PATH statement, commented out the old exec call and added a new one.

  # Start host agent
  vmware_start_hostd() {
    export LD_LIBRARY_PATH=/usr/lib/vmware/vmacore:/usr/lib/vmware/hostd:/usr/lib/vmware/lib/libxml2.so.2:/usr/lib/vmware/lib/libexpat.so.0:/usr/lib/vmware/lib/libstdc++.so.6:/usr/lib/vmware/lib/libgcc_s.so.1:/usr/lib/vmware/lib/libcrypto.so.0.9.8:/usr/lib/vmware/lib/libssl.so.0.9.8
    vmware_bg_exec "`vmware_product_name` Host Agent" \
      "$vmdb_answer_LIBDIR/bin/vmware-hostd" -a -d -u "$vmware_etc_dir/hostd/config.xml"
      #"$vmdb_answer_SBINDIR/vmware-hostd" -a -d -u "$vmware_etc_dir/hostd/config.xml"
  }

Restart VMware Server 2

If you don't have critical guest OSes running, you can stop the guests via the VMware Server 2 Web Access GUI and restart VMware:

  # /etc/init.d/vmware restart
  Stopping VMware autostart virtual machines:
     Virtual machines
  Stopping VMware management services:
     VMware Virtual Infrastructure Web Access
     VMware Server Host Agent
  Stopping VMware services:
     VMware Authentication Daemon
     VM communication interface socket family:
     Virtual machine communication interface
     Virtual machine monitor
     Bridged networking on /dev/vmnet0
     Host network detection
     DHCP server on /dev/vmnet1
     Host-only networking on /dev/vmnet1
     DHCP server on /dev/vmnet8
     NAT service on /dev/vmnet8
     Host-only networking on /dev/vmnet8
     Virtual ethernet
  Starting VMware services:
     Virtual machine monitor
     Virtual machine communication interface
     VM communication interface socket family:
     Virtual ethernet
     Bridged networking on /dev/vmnet0
     Host-only networking on /dev/vmnet1 (background)
     DHCP server on /dev/vmnet1
     Host-only networking on /dev/vmnet8 (background)
     DHCP server on /dev/vmnet8
     NAT service on /dev/vmnet8
     VMware Server Authentication Daemon (background)
     Shared Memory Available
  Starting VMware management services:
     VMware Server Host Agent (background)
     VMware Virtual Infrastructure Web Access
  Starting VMware autostart virtual machines:
     Virtual machines

As more information on this issue becomes available this post will be updated. Please post your findings too.

This information was generated by my experimentation and the helpful posts of the VMware Community, reference:


  1. Andrew
    November 26th, 2009 at 07:31 | #1

    Thanks for the great blog post. This issue has been driving me nuts!

  2. Shawn
    December 9th, 2009 at 22:53 | #2

    I am so happy Google found this post… I had been fighting this issue for 3 days to no avail.

    You rock!

  3. Michal Rogozinski
    December 28th, 2009 at 17:58 | #3

    Thanks!!! I almost lost my hair because of that stupid issue! That was a really helpful post.

  4. Jouni Renfors
    January 7th, 2010 at 06:35 | #4

    Thanks man. I was ready to move to a different virtualization solution when I couldn't find the problem with VMware.

  5. Pander
    January 28th, 2010 at 03:29 | #5

    Thank you for info!

    Downloaded and used:

    yum downgrade glibc-2.5-34.el5_3.1.i686.rpm
    glibc-common-2.5-34.el5_3.1.i686.rpm glibc-devel-2.5-34.el5_3.1.i686.rpm
    glibc-headers-2.5-34.el5_3.1.i686.rpm

    Then reboot.

    Works good!

cmartin24
Enthusiast

Just wanted to add to this... I've been fighting an issue with errors in the View logs similar to those other users have posted.

Our environment is Client > LAN > Firewall > View Security > Firewall > LAN > Connection Manager.

User "A" was able to connect to the external View Portal and authenticate successfully. The entitled desktops would display; however, the "Status" showed "Not Connected". So the tunnel never appeared to be building correctly.

The user's machine had multiple NICs. Once I adjusted the provider order so that the active NIC was at the top of the list, voilà! I was able to connect successfully. So if you have multiple NICs on your client machine, check the provider order.

Hope this helps someone.
