VMware {code} Community
jhoagland
Contributor

VIX Perl routines sometimes never return

I have a script that manages the state of several VMs on the local host, which runs VMware Server 1.0.x. I'm using the Perl VIX API for some of the automation, and I've noticed that certain routines sometimes never return (I have waited at least a couple of hours). I've seen this with:

  • VMPowerOn()

  • VMRevertToSnapshot()

  • GetProperties($vmHandle,VIX_PROPERTY_VM_POWER_STATE)

By the time I saw these hangs, the routines were otherwise doing what I wanted. It's just that sometimes they don't return. I don't know of any difference between the times they work and the times they don't.

For VMPowerOn() and VMRevertToSnapshot(), I tried to work around this occasional problem by introducing a timeout with code like the following:

    sub do_revert_with_timeout {
        my ($vmHandle, $snapshotHandle, $timeout) = @_;
        # bail out if the caller did not supply a VM handle
        return undef unless defined($vmHandle);
        $timeout = 150 unless defined $timeout;
        my $err = VIX_OK;
        my $start = time();
        eval {
            $SIG{ALRM} = sub { die "timed out after $timeout seconds\n" };
            alarm($timeout);
            # on timeout, alarm handler above will execute and we'll fall out of this eval
            # on normal exit, we'll fall out of the bottom of the eval with no error
            $err = VMRevertToSnapshot($vmHandle, $snapshotHandle, 0, VIX_INVALID_HANDLE);
            alarm(0);
            $SIG{ALRM} = 'IGNORE';
        };
        my $elapsed = time() - $start;
        if ($@) {
            if ($@ =~ /timed out after/) { # we timed out
                print $@;
                return 0;
            } else { # the method call did a die
                # propagate
                alarm(0);
                die $@;
            }
        }
        return 1;
    }

However, my alarm never goes off (perhaps the method uses SIGALRM internally).

For GetProperties, I had used it thousands of times over weeks and hadn't noticed it having this behavior until today.

Does anyone know what is causing this or how to avoid it? Has anyone else seen this?

Alternatively, does anyone have a suggestion for how to time these calls out without requiring a fork()? (I suppose I could arrange for a SIGUSR1 to be sent to the process periodically, and the signal handler could check how long the call has been running.)

Thanks.

1 Reply
jhoagland
Contributor

I have finally found the reason why I could not time out Perl VIX API calls via an alarm or a signal sent to the process making the call. Newer versions of Perl implement "safe" (deferred) signal handling. A side effect of this is that signal handlers are not invoked while Perl is in the middle of a call to a C routine, such as the underlying VIX routines.

My workaround is to use Perl::Unsafe::Signals from CPAN (http://search.cpan.org/dist/Perl-Unsafe-Signals/Signals.pm) around the VIX call. With that in place, I could use alarm as described in my original post.
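For anyone who wants to try the same idea without installing a CPAN module, here is a rough sketch of a different mechanism with a similar effect. It uses POSIX::sigaction from core Perl, whose handlers are not deferred by default, instead of Perl::Unsafe::Signals. The helper name call_with_timeout and its return convention are made up for illustration, and I have not run this against VIX itself. Dying out of a C routine this way can leave that library in an inconsistent state, which is the risk safe signals were introduced to avoid.

```perl
use strict;
use warnings;
use POSIX qw(SIGALRM);

# Hypothetical helper (not part of VIX): run $code, giving up after
# $timeout seconds. Returns (1, result) on completion, (0, undef) on timeout.
sub call_with_timeout {
    my ($timeout, $code) = @_;

    # Handlers installed via POSIX::sigaction are delivered immediately
    # unless ->safe(1) is set, so this can fire inside a C routine.
    my $action = POSIX::SigAction->new(
        sub { die "timed out after $timeout seconds\n" }
    );
    my $old = POSIX::SigAction->new();
    POSIX::sigaction(SIGALRM, $action, $old)
        or die "sigaction failed: $!";

    my $result;
    my $ok = eval {
        alarm($timeout);
        $result = $code->();
        alarm(0);
        1;
    };
    my $error = $@;

    alarm(0);                          # make sure no alarm is left pending
    POSIX::sigaction(SIGALRM, $old);   # restore the previous handler

    return (1, $result) if $ok;
    return (0, undef)   if $error =~ /^timed out after/;
    die $error;                        # propagate any other failure
}

# Intended use with the call from the original post (requires VIX):
# my ($done, $err) = call_with_timeout(150, sub {
#     VMRevertToSnapshot($vmHandle, $snapshotHandle, 0, VIX_INVALID_HANDLE);
# });
```

If the call times out, the VM and snapshot handles may not be safe to reuse, so treating a timeout as fatal for that handle is probably the cautious choice.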

This still doesn't answer why these routines never exit sometimes.

Another update: I haven't actually seen a hang in GetProperties; the hang was actually in a HostDisconnect of a host handle created in a parent process. That I was able to work around.
