VMware Cloud Community
BjornJohansson
Enthusiast

Virtual Machine marked as invalid

Hello all!

I have looked through the other threads regarding invalid machines in the Infrastructure Client, but none of them seems to apply to my problem.

3 x ESX 3.5 U1/Virtual Center 2.5 U1

In the Infrastructure Client, the VM (running W2k3 + BlackBerry Enterprise Server) is greyed out and marked as "invalid". If I connect directly to the host (with PowerShell), it says the machine is not running, but it is running and responding.

I have restarted the services on all ESX servers, re-registered the VM and restarted the VM. No luck. I cannot see anything weird in the .vmx, but I have attached it in case I missed something.

Please advise, thank you!

Best regards

Björn Johansson

14 Replies
espi3030
Expert

Bjorn,

I understand you mentioned your VM is labeled as "invalid". Here is a post that fixed some VMs that were "orphaned". Perhaps you can try this and see if it corrects your issue?

Hope this helps!

java_cat33
Virtuoso

I had a similar issue yesterday. I removed the VM from the inventory, restarted the vpxa and hostd services, and added the VM back to the inventory. This worked for me.

Also try restarting your VirtualCenter server service.

BjornJohansson
Enthusiast

Thanks guys,

Unfortunately I already tried that without success. Just to be sure, I tried it again and did the following:

  1. Unregistered the VM from the Infrastructure Client

  2. Stopped VMware VirtualCenter Service

  3. Restarted vpxa, mgmt, webAccess and vmkauthd services on all hosts in the cluster

  4. Disabled HA on cluster

  5. Started VirtualCenter service

  6. Added vmx to inventory

  7. Enabled HA on cluster

Still no luck, still marked as invalid. I guess I covered everything there... order ok?

I also checked the VC logs via PowerShell:

Get-VM "blackberry-srv" | Get-VIEvent | Format-Table CreatedTime, FullFormattedMessage -AutoSize

No clues there either, just that the machine is now registered in the datacenter.

Anyone see anything fishy in the .vmx or have any other suggestion?

Thanks!

/Björn

depping
Leadership

I had the same issue once, and the only solution was removing the ESX host from VC and adding it again. For some reason the VM was stuck. Removing the ESX host removed the VM from the database, and when the ESX host is added again there is no stale entry left to add.



Duncan

Blogging: http://www.yellow-bricks.com

If you find this information useful, please award points for "correct" or "helpful".

Karunakar
Hot Shot

Hi,

This issue can also appear when there is a storage access problem, for example if the VM is on a LUN that is not visible in the storage section of the ESX host.

You can try the steps below.

Remove the VM from the inventory.

Refresh the storage section on the ESX host from the VI Client, or run esxcfg-rescan on the ESX console, then restart the mgmt-vmware service.

Then try to add the VM to the inventory of the ESX machine.

This should resolve the issue.
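
For anyone who prefers the service console, the steps above could be sketched roughly as follows. This is only an illustration under stated assumptions: the adapter name and the .vmx path are placeholders, the helper names `run` and `rescan_and_register` are made up, and the commands are printed rather than executed unless DRY_RUN=0 is set.

```shell
# Sketch only: console equivalent of "rescan, restart mgmt-vmware, re-add VM".
# Prints each command by default; set DRY_RUN=0 to actually execute them.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

rescan_and_register() {
    adapter="$1"   # storage adapter to rescan, e.g. vmhba1 (placeholder)
    vmx="$2"       # full path to the VM's .vmx file (placeholder)
    run esxcfg-rescan "$adapter"
    run service mgmt-vmware restart
    run vmware-cmd -s register "$vmx"
}

# Example call (names are hypothetical):
#   rescan_and_register vmhba1 /vmfs/volumes/datastore1/myvm/myvm.vmx
```

Reviewing the printed commands first is a cheap way to catch a wrong adapter name or path before touching a production host.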

-Karunakar

BjornJohansson
Enthusiast

Thanks for the suggestion. Perhaps a stupid question: when I re-registered it, it ended up on another host. Does that mean I have to remove all hosts (we have three) from the cluster? Or would it be sufficient to remove the current host?

The Register Virtual Machine wizard does not allow me to specify a host, only a cluster. I guess that is because DRS is enabled.

Thanks

/Björn

Karunakar
Hot Shot

You do not need to unregister the ESX machine. Try removing the virtual machine from the inventory of the ESX host where you see the VM.

Then, on all the hosts, refresh the storage section, go into Storage Adapters, and rescan all the storage adapters.

Then locate the LUN or datastore where the virtual machine resides, browse the datastore, and find the virtual machine folder, which contains the VM's .vmx file.

Right-click the .vmx file and choose Add to Inventory.

-Karunakar

BjornJohansson
Enthusiast

Yep, I was replying to depping's post. You posted while I was writing it :)

Thanks for the tip though. I did as you suggested without any luck. It actually got worse: another VM is now also marked as invalid. BUT, they reside on the very same LUN, which suggests that it is a storage problem. The VMs also reside on the same host.

The LUN itself contains a bunch of VMs that are successfully registered. I have also checked that it is visible from all hosts that have the LUN presented to them.

Any suggestions? (except removing and adding esx from cluster - I will try that asap)

Thanks guys!

/B

ps. What will happen to the invalid machines when the host is removed from the cluster? I can't migrate them because they are invalid... AFAIK they should continue to run. Or...? Catch 22... :) ds.

mikepodoherty
Expert

I've seen this problem with a CLARiiON SAN: the ESX hosts became unregistered from the SAN. Try re-registering the ESX hosts on the SAN and see if that fixes the problem. It did for us.

BjornJohansson
Enthusiast

Hmm... good thought. We are running HP EVA. Does anyone have any experience with this issue there?

Since my last post I have tried:

  1. Upgraded VirtualCenter to Update 3 - still no luck trying the stuff above

  2. Removed the host from the cluster and added it again - no luck

I'm thinking about shutting down the troublesome VMs (from Remote Desktop) and copying the vmdk files. Then I would create new virtual machines and attach the existing vmdks. Is that something that might work?

But now I'm going home... long f**king day banging my head against the wall... ;)

Thanks guys!

/Björn

BjornJohansson
Enthusiast

Hi guys,

Just wanna give you an update. I unfortunately never got any of the suggestions posted to work. This is the workaround that worked:

  1. Created a new custom, identical VM except without any hard drive

  2. Copied the .vmdk to the new VM's folder

  3. Edited the new VM's hardware and added an existing .vmdk - the one I copied to the folder

  4. Successfully started the VM
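
The thread does not say how the .vmdk was copied in step 2. One possible way from the ESX service console is to clone the disk with vmkfstools instead of a plain file copy. The following is only a sketch under that assumption: the function name `clone_vmdk` and both paths are hypothetical, and the command is only printed unless DRY_RUN=0 is set.

```shell
# Sketch only: clone a virtual disk into the new VM's folder.
# Uses "vmkfstools -i" as a stand-in for the copy step; paths are placeholders.
clone_vmdk() {
    src="$1"   # descriptor .vmdk of the old (invalid) VM
    dst="$2"   # target .vmdk inside the new VM's folder
    if [ "${DRY_RUN:-1}" = "1" ]; then
        # Default: show what would be run without touching any disk
        echo "vmkfstools -i $src $dst"
    else
        vmkfstools -i "$src" "$dst"
    fi
}

# Example call (paths are hypothetical):
#   clone_vmdk /vmfs/volumes/ds1/oldvm/oldvm.vmdk /vmfs/volumes/ds1/newvm/newvm.vmdk
```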

Thanks for all suggestions!

/Björn

blanecatledge
Contributor

Hi Björn,

I experienced the same problem twice – once a few months back and once recently. The first time I experienced this the VMware tech performed the same steps you outlined that resolved your problem. It worked, but it required that the VM be powered off.

I experienced it again and performed the following steps to resolve it:

1. Remove the invalid VM from Virtual Center by pressing the delete key.

2. Delete the vmxf file in the VM’s directory. Note: my vmxf file was empty.

3. Add the VM to the ESX server inventory manually by right clicking the vmx file and choosing “Add to Inventory.” You must connect directly to the ESX server to do this.

4. Add the VM to the Virtual Center inventory following the same steps – right-click the vmx file and choose “Add to Inventory.”

Note: I’ve never heard of a vmxf file before, but the vmx file indicated it was for extended configuration settings.
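
Since the vmxf file in this case was empty, it may be worth checking for other empty .vmxf files before deleting anything. A minimal sketch, assuming shell access to the datastore; the function name `list_empty_vmxf` and the example path are made up, and the function only lists candidates without deleting them.

```shell
# Sketch only: list zero-byte .vmxf files below a directory.
# Nothing is deleted; the output is meant for manual review.
list_empty_vmxf() {
    dir="$1"   # directory to search, e.g. a datastore mount point
    find "$dir" -type f -name '*.vmxf' -size 0
}

# Example call (path is hypothetical):
#   list_empty_vmxf /vmfs/volumes/datastore1
```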

It appears that this problem can be caused by multiple host servers trying to access the metadata on the VMFS partition at the same time. In some cases, it may be a mis-configuration of the host group or LUN.

http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=340814...

http://kb.vmware.com/selfservice/search.do?cmd=displayKC&docType=kc&externalId=1006001&sliceId=1&doc...

http://www.vmware.com/pdf/hds_svd_technote.pdf

This is the entry in the vmkernel log that tipped me off. It was recorded over 50 times on each host, which is not common:

Date11 10:25:49 esxserver vmkernel: 8:02:54:08.798 cpu1:1147)StorageMonitor: 196: vmhba0:2:3:0 status = 24/0 0x0 0x0 0x0

You can find it by running the following command on the ESX host:

#grep 24.0 /var/log/vmkernel
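
To see which storage paths log this entry most often, the grep above could be extended along these lines. This is a rough sketch that assumes each entry sits on one line in the format quoted above; the function name `count_status_24` is made up. Status 24 is decimal for SCSI status 0x18 (RESERVATION CONFLICT), which would fit the metadata contention explanation, but treat that reading of the log field as an assumption.

```shell
# Sketch only: count "status = 24/0" entries per storage path in a vmkernel log.
# Matches the literal "24/0"; the pattern "24.0" would also match e.g. "2410".
count_status_24() {
    logfile="$1"
    grep 'status = 24/0' "$logfile" |
        # Keep only the path, e.g. vmhba0:2:3:0
        sed 's/.*StorageMonitor: [0-9]*: \(vmhba[0-9:]*\) status.*/\1/' |
        sort | uniq -c |
        # Print "path count" instead of uniq's "count path"
        awk '{ print $2, $1 }'
}

# Example call:
#   count_status_24 /var/log/vmkernel
```

Each output line has the form "path count", so a single path with a very high count points at one LUN rather than at the host as a whole.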

Hope this is helpful to others – leave a note on this discussion if it works for you.

Blane

FordyPrefect
Contributor

We had the same issue here after pushing a vmware-tools update to some VMs.

Deleting and re-adding the machine in the VI Client fixed it in one case.

The second machine seemed to be off after that, but connecting via RDP was still possible. We then additionally disconnected the vmware-tools disk from the machine. After that, deleting and re-adding was successful.

HTH

Juergen
