VMware Horizon Community
Mike_MT
Contributor

Recovering from network outage - any new features or robustness?

I recently had a network glitch that resulted in linked clones losing connectivity with the connection server. The clones and pool never fully recovered and I had to disable the pool, delete the VMs, and clean things up using ADSIedit and SQL. This has been a problem with View - it's unable to recover gracefully from outages / connectivity issues.

I'm on 5.3 - have there been any improvements in the latest 5.x or 6.x code that address connectivity issues, recovering gracefully from outages, etc.?

I am looking into enhancing my View architecture to mitigate this issue by adding another connection server, etc., but I fear this will only limit the damage rather than eliminate the problem completely.

-Mike

6 Replies
Mike_MT
Contributor

Nothing?

Okay - help me out here a little bit. Do I have something configured/built so completely wrong that I'm beyond help...or is this still an issue that everyone deals with and VMware just isn't addressing it with a good solution...?

If it's just my problem that's okay, I can research some more and re-architect if necessary. If it's a larger issue with View that can only be mitigated with architecture, that would be good to know as well.

Mike

mittim12
Immortal

Can you go into detail on what you mean by "they never recovered"? Do you have a replica group that consists of multiple connection brokers?

Mike_MT
Contributor

During my network outage all virtual desktops lost connectivity with their connection server. Once network service was restored, the dedicated desktops resumed operations without a problem, but the linked clones did not. Some of them worked okay for a while but eventually failed to recompose/refresh. Others never reconnected, and I could not delete them using the View console, so I had to use ADSIedit and SQL to manually remove all of the clones.

Is there a difference in how dedicated desktops vs. linked clones operate that would point to why some recovered/reconnected and some did not?

Multiple brokers would not have helped, as this particular outage was a routing issue. With the desktops on different subnets than the servers, they all would have lost connectivity regardless of how many brokers there were.
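
For anyone who wants to confirm that kind of subnet-to-server reachability from a desktop, here is a minimal sketch in Python. It only tests whether a TCP connection can be opened to each broker; it says nothing about the state of the View agent or Composer. The host names in the usage note are hypothetical, and the port is an assumption you should check against your own firewall rules.

```python
import socket


def reachable(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port opens within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unroutable hosts.
        return False


def check_brokers(brokers, port, timeout=2.0):
    """Map each broker host name to whether it is reachable on the given port."""
    return {host: reachable(host, port, timeout) for host in brokers}
```

Run from a desktop on the affected subnet, something like check_brokers(["broker1.example.local", "broker2.example.local"], 4001) would show whether any broker is reachable. The host names are placeholders, and 4001 is the agent-to-broker JMS port I would expect for this View generation, so treat it as an assumption. If every entry comes back False, adding brokers on the same server subnet would not have changed the outcome.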

It's frustrating to have to deal with this issue manually (ADSIedit, etc.) when either the desktops should reconnect automatically or there should at least be a way to force kill all the clones in a pool with database cleanup happening as part of this process and then recompose the pool.

Mike

mittim12
Immortal

I assume the problem was not with the connection broker but with View Composer, which controls refresh and recompose operations. That is also the biggest difference between linked clones and full desktops.

I don't think this is typical behavior. I have been forced to hard boot my vCenter/Composer machine numerous times and have never had any issues performing Composer functions afterward. It's a little late to ask, but what kind of error messages were you seeing, and what status did the View desktops have?

Mike_MT
Contributor

Ah, the Composer <-> linked clones connection was probably at least part of the issue.

It's too late for any diagnostic data, but thanks for asking.

I'm thinking about my architecture and wondering what others are doing.

Do you have your linked clones connected to one or more connection servers and then have your dedicated clients connected to different connection servers? I'm thinking this would at least mitigate the issue of having to reboot/restart a connection server or service and avoid disconnecting at least some clients.

Also, to clarify for myself: is it possible to restart the Composer service (on my vCenter server) and NOT disconnect any clients (linked clone or dedicated)?

Thanks.

mittim12
Immortal

I have a fairly standard configuration with 3 connection brokers in a replica group. Two handle internal connections and one is paired with the security server. As long as you're not using the tunneling feature of the CB, you could reboot it whenever you want without causing a user to become disconnected.

The Composer service only impacts deployment of linked clones, so you can restart it anytime you like.
