VMware Cloud Community
cougar694u
Enthusiast

Nexus 1000v VEM fails on 2 out of 8 hosts.

I have 8 ESXi hosts. I did a fresh install on each from the Installable CD, directly to 4.0 U1.

We have another 2-node cluster with a working Nexus 1000v primary & secondary. Everything's up and running.

I installed 6 hosts and everything worked great, migrated them to the Nexus DVS, and VUM installed the modules.

I did the 7th host, and when I tried to migrate it to the DVS, it failed with the following error:

Cannot complete a Distributed Virtual Switch operation for one or more host members.

DVS Operation failed on host , error durring the configuration of the host: create dvswitch failed with the following error message: SysinfoException: Node (VSI_NODE_net_create) ; Status(bad0003)= Not found ; Message = Instance(0): Inpute(3) DvsPortset-0 256 cisco_nexus_1000v got (vim.fault.PlatformConfigFault) exception

Then I tried to do host 8 and got the exact same problem. It had worked about 15 minutes prior when I did host 6, nothing changed, then host 7 failed.

If I try to remediate either of these two hosts, either patches or extensions, it fails.

Anyone else have these problems?

~Luke http://thephuck.com
16 Replies
pfazzone
Contributor

Are you using an evaluation license or a regular license? I am wondering (if you are in fact using an evaluation license) if you might be out of VEM licenses on the VSM. The Evaluation license supports up to 16 CPUs per VSM. With 8 hosts, if you have a mix of 2 socket and 4 socket machines, you could be at the limit.
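To illustrate that limit, here is a minimal sketch with hypothetical socket counts (the thread doesn't state the actual hardware mix): a mix of four 2-socket and two 4-socket hosts already consumes the whole 16-socket evaluation allowance, so a 7th host would get no VEM license.

```shell
# Hypothetical socket mix, not from the thread: four 2-socket hosts
# plus two 4-socket hosts across the first six VEMs.
LIMIT=16                    # evaluation license: 16 CPU sockets per VSM
USED=$(( 2+2+2+2+4+4 ))     # sockets consumed by the first six hosts
echo "$USED of $LIMIT sockets used"   # 16 of 16 - the 7th host cannot be licensed
```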

Thanks, pf

pettori
Contributor

Could you also check which VMware license is installed on those servers?

Is it possible that you have Enterprise only, rather than Enterprise Plus?

Thanks

AnatolyVilchins

Given that Distributed Switching is a paid-for licensed service that hasn't been out for 12 months, you inherently MUST have a valid support contract with VMware for this, and as N1000v setup/config is far from a common or simple job, I would strongly advise you to log this problem with VMware support.

Please don't think I'm being unhelpful; I'm more than happy to help out in just about every way I can with SF questions, but when there are vendor experts that you've paid to answer this kind of problem, I would always suggest you go there first. Hope you understand.
http://serverfault.com/questions/104722/nexus-1000v-vem-fails-on-2-out-of-8-hosts

Starwind Software Developer
www.starwindsoftware.com

Kind Regards, Anatoly Vilchinsky
cougar694u
Enthusiast

We're Enterprise Plus with something like a 256-CPU ELA, no eval here.

We have the N1KV up and running on another cluster with port profiles and uplinks; it's all working fine. Even the first 6 hosts worked just fine, but for whatever reason the next 2 failed.

I'll get with our SE and see what he says.

~Luke http://thephuck.com
pettori
Contributor

Hi Cougar,

Can you paste the output of "sh license usage NEXUS1000V_LAN_SERVICES_PKG"?

Thanks

Pierre

lwatta
Hot Shot

If you can't remediate the hosts for patches then it sounds like a VUM issue. Make sure VUM didn't die.

Just check the services on the server and make sure VUM is still running. Might want to also restart it just to be sure.
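On a vSphere 4.x vCenter server that check might look like the following, run from an elevated prompt on the Windows box. The service name "vmware-ufad-vci" is an assumption from the VUM 4.x era; verify the exact name on your install before stopping anything.

```shell
# On the vCenter/VUM server (Windows). "vmware-ufad-vci" is the assumed
# VUM service name for this release; confirm with: sc query state= all
sc query vmware-ufad-vci
sc stop vmware-ufad-vci
sc start vmware-ufad-vci
```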

louis

cougar694u
Enthusiast

@pettori: These are ESXi hosts, so no CLI other than vMA.

@lwatta: VUM still works because I can remediate other ESXi hosts (I just did a remediate to verify).

*EDIT*

I tried remediation on the failed hosts and it's a no-go. I also tried on host 6 (the last host that successfully connected to the DVS), and it fails, too. I can scan and remediate other hosts, just the last ones fail, for whatever reason.

~Luke http://thephuck.com
pettori
Contributor

Hi,

Sorry, I should have been more precise: can you try that command on the VSM, the supervisor of the Nexus 1000V?

Thanks

cougar694u
Enthusiast

I tried to add another host to the n1kv, which fails, so I tried to use VUM to install the extensions and got this:

Host patch VEM400-200911014-BG conflicts with the package vmware-esx-firmware_4.0.0-1.10.219382 installed on the host and cannot be staged. Remove the patch from the baseline and retry stage operation.

Ran vihostupdate -q from the vma to see what's installed and got this:

Bulletin ID            Installed             Summary
---------------------  --------------------  -----------------------------------
qlgc-qlge-100.17       2010-02-09T16:19:44   qlge: net driver for VMware ESX
qlg.831.k1.23vmw       2010-02-09T16:28:54   qla2xxx: scsi driver for VMware ESX
ESXi400-200911203-UG   2010-02-09T16:49:44   VI Client update for 4.0 U1 release
ESXi400-Update01       2010-02-09T16:49:44   VMware ESXi 4.0 Update 1
ESXi400-200912401-BG   2010-02-09T16:51:07   Updates Firmware
ESXi400-200912402-BG   2010-02-09T16:51:07   Updates VMware Tools

It's showing as build 219382 in vCenter.

Any ideas?

~Luke http://thephuck.com
mestery
Contributor

@cougar694u: Build 219382 was released by VMware in December. The matching release from Cisco would have a bulletin ID in the VEM400-200912* range. Can you verify you have the N1KV depot from vmware.com enabled and active in your VUM configuration? If you search for patches in the VUM UI, you should see some with a release date of December 2009; those are the ones you want.

Thanks,

Kyle

lwatta
Hot Shot

If you are running kernel rev 219382 then you need to install VEM400-200912016. It looks like VUM either doesn't have the patch or is confused. Go into the Update Manager tab and make sure that it knows about the Cisco VEM repository, and that the repository link is present and checked. Then download the patches again to be sure.

Just to make sure. Are you adding the host via the networking view or are you trying to force the VEM install by applying a baseline to the host?

You can see the patch matrix at the following URL

http://www.cisco.com/en/US/docs/switches/datacenter/nexus1000/sw/4_0_4_s_v_1_2/compatibility/informa...

cougar694u
Enthusiast

I try to add the host via the networking tab, which is where it fails.

I do have the cisco repo in there, I tried to add it again and it says it's already there.

When trying to download newer patches/extensions, it doesn't find anything newer. It's checked and says connected.

The repo shows two released on 1/4/2010, which are VEM400-200912001-BG and VEM400-200912016-BG.

I did a manual install of VEM400-200912016-BG and it shows up now if I run vihostupdate -q. Prior to the manual install, it never showed up in the update manager tab for the host, only older updates. I tried scanning, rebooting, staging, but nothing would change the four listed. Even after the manual install of the newer VEM, the four still show up as having conflicts.

After the manual install, I tried to add the host and it worked.

~Luke http://thephuck.com
cougar694u
Enthusiast

I'm assuming VUM is the culprit here. On the two hosts at the beginning of the thread, I manually laid down ESXi400-200912001.zip, rebooted, then VEM400-200912016.zip, and I can add them to the DVS without issue.

I've reinstalled VUM once already, though, so I'm not sure what's actually causing my problem.

~Luke http://thephuck.com
cougar694u
Enthusiast

For whatever reason, VUM would not pick up the proper VEM needed, nor would it update the host.

I had to manually run "vihostupdate -i -b ESXi400-200912001.zip" to update to build 219382 of ESXi, reboot the host, then run "vihostupdate -i -b VEM400-200912016.zip" to get the corresponding VEM installed.

After that, adding the host to the Nexus 1000v worked just fine.
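That sequence, as run from the vMA with the vSphere CLI, might look like the sketch below. The hostname is a placeholder; the bundle names are the ones from this thread, and you'd reboot the host from vCenter between the two installs.

```shell
# Run from the vMA; esxi07.example.com is a hypothetical hostname.
# 1. Bring the host to ESXi build 219382 with the 4.0 U1 patch bundle:
vihostupdate --server esxi07.example.com -i -b ESXi400-200912001.zip
# 2. Reboot the host from vCenter, then install the matching VEM bundle:
vihostupdate --server esxi07.example.com -i -b VEM400-200912016.zip
# 3. Verify both bulletins now show as installed:
vihostupdate --server esxi07.example.com -q
```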

~Luke http://thephuck.com
lwatta
Hot Shot

If you don't mind could you open a case with VMware and let us know the case number so we can track the issue with them?

cougar694u
Enthusiast

I'm having this exact same problem again in a completely different environment (new hosts, new VUM, newer Nexus VSM, etc.).

We deployed the VSM for rev (3) of the Nexus; VUM would put all three VEMs on the host, but the VSM would show rev (1) of the VEM.

I uninstalled all VEMs from the hosts, used rev (2) of the VSM, and let VUM apply the VEM. It worked on 2 hosts but failed on subsequent hosts with the same error as in my original post.

Just updating the thread, I haven't called vmware yet, but guess I will on Monday.

~Luke http://thephuck.com