VMware Cloud Community
billdossett
Hot Shot

Auto Deploy, host profiles and distributed virtual switch problems

Hi,

I've been hacking away at this problem in my lab for a few days and I am finally ready to ask for help.

I have two hosts. One is completely compliant, and the host profile was created from it.

The second host has the profile applied to it during autodeploy.

After tracking down various things that were actually non-compliant it boils down to this:

The second host always shows as non-compliant after reboot and remains in maintenance mode.

It is non-compliant because of the networking.

The reference host has a dVSwitch with two NICs. I can reboot this host and it always comes up: the host profile is applied, it exits maintenance mode, and it's perfect.

The second host stays in maintenance mode and is non-compliant because of several messages that I can't copy and paste, but which basically say that the dVSwitch does not exist on the host, that it has no NICs, and that the port groups do not exist. Looking at the host, the temporary vSwitch has been created and the NICs are connected to it; the dVSwitch does exist but has no NICs.

Then the strangest bit... if I apply the profile to the second host, it becomes compliant, the dVS problems go away and it is ready to go. If I reboot it, same thing again.

I guess this indicates some sort of sequencing error on applying the HP, but damned if I can figure it out.  These are completely stateless hosts, both of them.  They are identical... or pretty nearly, the 2nd host does have a dual port network adapter in the riser, but at the moment, that isn't connected to anything, I am just using the two onboard NICs in both hosts.

Saying that, I haven't tried removing that dual port NIC card, I suppose it's worth a try... but I am really grasping at straws here.

Any ideas on where I should look next?  BTW I am using 5.5 in my lab and haven't upgraded to 5.5b yet.

thanks

Bill

Bill Dossett
13 Replies
billdossett
Hot Shot

well, still not got this working...

there was an NTP problem, fixed that.

Looking through syslog I see this:

2014-01-23T16:57:18Z 2014-01-23 16:57:18,754 Host Profiles[35750]: INFO: /sbin/applyHostProfile sending boot msg to console: Applying Host Profile task list...Done

2014-01-23T16:57:18Z 2014-01-23 16:57:18,816 Host Profiles[35750]: WARNING: applyHostProfile utility leaving host in maintenance mode during post boot configuration.

Apply Error: None

Reapply Required State: ['DvsProfile']

Early Boot Failed State: None

Yes, I do have a dVS, and that is what is not being configured.
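For anyone wanting to compare this status across reboots or hosts, the three fields in that warning can be pulled out of the syslog text with a few lines of Python. This is only a sketch: it assumes the field names appear exactly as in the excerpt above, and `parse_apply_status` is a made-up helper name, not part of any VMware tool.

```python
import ast
import re

# Field names copied from the syslog excerpt above.
# parse_apply_status is a hypothetical helper for illustration only.
FIELDS = ("Apply Error", "Reapply Required State", "Early Boot Failed State")


def parse_apply_status(log_text):
    """Return a dict of the applyHostProfile status fields found in log_text."""
    status = {}
    for name in FIELDS:
        # Capture either a bracketed list or a single whitespace-free token.
        match = re.search(re.escape(name) + r":\s*(\[[^\]]*\]|\S+)", log_text)
        if not match:
            continue
        # Tolerate a trailing "^@" (a NUL byte rendered as text in the paste).
        raw = match.group(1).replace("^@", "")
        if raw.startswith("["):
            status[name] = ast.literal_eval(raw)  # "['DvsProfile']" -> list
        elif raw == "None":
            status[name] = None
        else:
            status[name] = raw
    return status


sample = """WARNING: applyHostProfile utility leaving host in maintenance mode during post boot configuration.
Apply Error: None
Reapply Required State: ['DvsProfile']
Early Boot Failed State: None^@"""

status = parse_apply_status(sample)
print(status)
```

A non-empty "Reapply Required State" with no apply error is the combination seen here: nothing failed outright, but the DvsProfile part still has to be applied again after boot.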

Then I get

2014-01-23T16:57:52Z 2014-01-23 16:57:52,207 Host Profiles[36778]: INFO: CheckHostCompliance called

2014-01-23T16:57:52Z 2014-01-23 16:57:52,435 Host Profiles[36778]: INFO: Gathering hostInfo data...

2014-01-23T16:57:59Z 2014-01-23 16:57:59,539 Host Profiles[36778]: INFO: Done gathering hostInfo data (7.1038839817 seconds)

and then

2014-01-23T16:58:06Z 2014-01-23 16:58:06,683 Host Profiles[36778]: INFO: Calling GatherData() for profile type CimXmlIndicationsProfile

2014-01-23T16:58:06Z 2014-01-23 16:58:06,745 Host Profiles[36778]: INFO: IP (127.0.0.1) for hostname (localhost) does not match mgmt vnic ip list ['152.144.155.140'].

2014-01-23T16:58:06Z 2014-01-23 16:58:06,746 Host Profiles[36778]: INFO: CIM Indication plugin selecting first valid mgmt vnic IP address as host IP: 152.144.155.140

2014-01-23T16:58:06Z ComplianceManager: [2014-01-23 16:58:06,746 vmware.runcommand INFO] runcommand called with: args = '/bin/ticket --generate', outfile = 'None', returnoutput = 'True', timeout = '0.0'.

2014-01-23T16:58:06Z 2014-01-23 16:58:06,757 Host Profiles[36778]: INFO: Created CIM ticket dea3235f-f0ab-499d-be5b-341dd683ee49

2014-01-23T16:58:06Z 2014-01-23 16:58:06,989 Host Profiles[36778]: INFO: Calling GatherData() for profile type MotdProfile

2014-01-23T16:58:06Z 2014-01-23 16:58:06,990 Host Profiles[36778]: INFO: Calling GatherData() for profile type PAMLoginMapProfile

and that is it...  end of host profile stuff in the syslog, no errors.

It's funny, as with 5.1 I am pretty sure that the status window shows it connecting the host, then applying the host profile, then disconnecting and reconnecting.  Currently, all I see in my status window is that it disconnects the host and then reconnects...

I have downloaded 5.5b of vCenter. Not sure that will make any difference, but I'm grasping at straws, so I will try that tomorrow morning I guess.

any comments or ideas would certainly be welcome at this point.

Bill Dossett
FlywithBobStar
Contributor

I'm also having an issue here. I've got 2 nested ESXi hosts which I'm playing with. We do have physical ESXi hosts which all use ESXi 5.5 images but a 5.1 host profile; these work fine and use the dvSwitch configuration. With the 5.5 host profile, there are networking issues where the networking is not applied to the host if you're using a dvSwitch. I got this working fine using a standard switch configuration; however, when I switched the config to use our dvSwitches, it doesn't work. It used to leave the host on a DHCP-assigned address, but I've now managed to get it to apply the correct IP. Even so, the host is not on the network and the host profile hasn't completely applied, so it's not manageable from vCenter. I'm using build 1746018, which is 5.5 1a I believe. Not sure why it won't work; none of us in my team at work have sussed it. May be time for a call to VMware? How did you get on? Have you sussed this one?

Bob

VCP4 & VCP5
billdossett
Hot Shot

two things to look at that totally stabilized my system...

1.) I moved the vcenter out of the cluster and put it in another vdatacenter and cluster.  I know that this is not supposed to matter... but, ever since I did that, I have had 0 problems...

2.) are you using vlans? if so, you need a rule or it doesn't complete, much like you describe.  You need a deploy rule that tells it what vlan to use on the mgmt interface when it is first booting, like so:  set deployoptions "vlan-id" 155  ...the question I never figured out was how it ever worked at all in the first place, as I didn't discover this rule until recently and I had problems, but then hosts did boot sometimes... then they would have a problem and not boot after.

also, if your host has ever booted into the cluster and you are rebuilding it, you have to remove it from the cluster when you rebuild it... I guess that is rule 3.

Let me know if any of those helped.  My system is rock solid now that I got that all done... I'd like to find a job that I can use it in now! current place of work motto: if it aint broke don't fix it...  😞

Bill Dossett
FlywithBobStar
Contributor

Excellent! Glad you got yours working. Thank you for the fast reply, I'll give that a go over the next day or so (I'm at a vForum tomorrow in London). I've spent today and yesterday trying various bits. We do use VLANs here. Could you tell me where you added that rule please? In my Host Profile configuration, I've set up Networking Configuration > vSphere Distributed Switch and created my dvSwitch name in there, and populated the fields such as Uplink Port Configuration to use dvUplink1 and the port group name to connect to. Also in Networking Configuration > Host Virtual NIC, I've added the dvSwitch and specified the dvPort to connect to.

Cheers,

Bob

VCP4 & VCP5
billdossett
Hot Shot

Hey, the rule I added was just with my 3 other deploy rules.. I use 3 rules, one for image profile, one for host profile and one for cluster to join, so I just created a 4th rule with the vlan info in it...  I haven't been in my lab for a while as I'm doing a vDatacenter upgrade out in Boulder and that's pretty much taking all my time as the person who set up the storage originally didn't really have a concept of DRS Clusters or storage and I am also doing a hardware refresh...  if I get a minute tomorrow I will login to the lab and check the order of the rules, but I don't think it really matters where the vlan rule goes

Bill Dossett
FlywithBobStar
Contributor

Hey, been doing some more troubleshooting and raised an SR with VMware for this one. We're not using VLANs for this; it's specified as 0 in our builds. What I am having issues with is that the default gateway doesn't get added to the IP configuration and the profile doesn't apply. Even if I manually add the gateway to the IP configuration of the host in the DCUI, the host is still not visible on the network. I did an esxcfg-route -l and can see the following:

Network      Netmask          Gateway       Interface
172.18.8.0   255.255.255.0    Local Subnet  vmk0
default      0.0.0.0          172.18.8.1    vmk0

Followed KB2001426 to try manually adding the default gateway, but I get this error: Duplicate route to network x.x.x.x/xx found.  Please delete the old route first.
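One quick sanity check on the routing table above is whether the default gateway actually sits inside the subnet attached to vmk0. The Python sketch below does that with the standard `ipaddress` module. It assumes the output has exactly the four columns pasted above, and the function names are invented for illustration. Note also that the table already contains a default route, which would be consistent with the "Duplicate route" error.

```python
import ipaddress

# Output columns as pasted above; the parsing below assumes this exact layout.
ROUTES = """\
Network      Netmask          Gateway       Interface
172.18.8.0   255.255.255.0    Local Subnet  vmk0
default      0.0.0.0          172.18.8.1    vmk0
"""


def parse_routes(text):
    """Split each data row into (network, netmask, gateway, interface)."""
    rows = []
    for line in text.splitlines()[1:]:  # skip the header row
        parts = line.split()
        if len(parts) < 4:
            continue
        # The gateway column may contain a space ("Local Subnet").
        rows.append((parts[0], parts[1], " ".join(parts[2:-1]), parts[-1]))
    return rows


def default_gateway_reachable(rows):
    """True if a default gateway lies in a directly attached subnet on the same interface."""
    defaults = [r for r in rows if r[0] == "default"]
    local = [r for r in rows if r[2] == "Local Subnet"]
    for _, _, gateway, iface in defaults:
        for network, netmask, _, local_iface in local:
            subnet = ipaddress.ip_network(f"{network}/{netmask}")
            if iface == local_iface and ipaddress.ip_address(gateway) in subnet:
                return True
    return False


rows = parse_routes(ROUTES)
print(default_gateway_reachable(rows))
```

If this check passes, as it should for the table above, the routing table itself looks self-consistent and the reachability problem is more likely in how vmk0 is attached to the switch than in the route entries.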

So I've tried removing that old route, but you can't because it's in use as a default kernel port. I have checked all the vSphere Distributed Switch settings and cannot see why it won't apply the profile. It doesn't even disconnect and reconnect the host before the apply host configuration stage. The last entry in the syslog on the host shows: failed to get profile for deferred param path network.genericNetStackInstanceProfile [key-vim-profile-host-GenericNetStackInstanceProfile-defaultTcpipStack].genericDnsConfigProfile.



Running out of ideas now :(


VCP4 & VCP5
billdossett
Hot Shot

Hey, I'm quite interested in helping you troubleshoot this one.  I'm trying to specialize in this and can use the problem-solving experience.

I am using all 5.5 in my lab; it's a physical lab, not nested.  I will need to have a look, as I was rebuilding some of it to try and slipstream vShield into my images and I left it about a month ago, so I just need to bookmark where it is and have a look.  I have not seen the problems you describe yet.  I will probably need some more info and will let you know what I need as soon as I am ready to troubleshoot.  Might have some time free this afternoon if I get lucky, and will be in touch.

Bill Dossett
vt-vmwaresjo
Contributor

Hi Bob,

is this problem solved on your side?

We have got exactly the same problem with our Auto Deploy 6.5. Deploying a server with a vSwitch is no problem. When deploying servers with a distributed switch config, the default gateway is missing every time and Auto Deploy fails. The default gateway is configured within the host profile.

No idea what happens...

Thanks

Alex

dgrove12
Contributor

Alex,

Did you ever find a solution to your issue? I am seeing the exact same situation that you describe.  Autodeploy on 6.5 host w/ distributed switch fails and no gateway is entered. 

randomname
Enthusiast

This feature set is still broken. vDS and host profiles have been "supported" since 2009. Auto deploy has been "supported" since 2011. Still can't use them together, and no documentation I've found admitting you can't.

Putting the management interface (vmk0) on a vDS with DHCP almost works. The first time the host attempts to apply the DNS configuration, it is missing a parameter. Doesn't matter how many of the 'DNS' related parameters are populated with what information in the host profile. Later in the boot process after the transition to vDS, it properly populates the parameter and successfully applies, but the previous error condition is not cleared and the host stays in maintenance mode when it should automatically exit.

First pass below. Note the unset virtualNicDevice parameter triggering the error.

2017-07-26T17:12:57Z Host Profiles[68594 opID=MainThread]: INFO: Applying DNS config for defaultTcpipStack at postBoot

2017-07-26T17:12:57Z Host Profiles[68594 opID=MainThread]: INFO: DNS config beforeee first apply: (vim.host.DnsConfigSpec) {    dynamicType = <unset>,    dynamicProperty = (vmodl.DynamicProperty) [],    dhcp = true,    virtualNicDevice = <unset>,    hostName = 'localhost',    domainName = 'XXX',    address = (str) [       'XXX'    ],    searchDomain = (str) [       'XXX'    ],    virtualNicConnection = (vim.host.VirtualNicConnection) {       dynamicType = <unset>,       dynamicProperty = (vmodl.DynamicProperty) [],       portgroup = <unset>,       dvPort = (vim.dvs.PortConnection) {          dynamicType = <unset>,          dynamicProperty = (vmodl.DynamicProperty) [],          switchUuid = 'XXX',          portgroupKey = 'XXX',          portKey = <unset>,          connectionCookie = <unset>       }    } }

2017-07-26T17:12:57Z Host Profiles[68594 opID=MainThread]: WARNING: Invalid dnsConfig? DHCP is set to true, but no vnic for dv/pg. Trying anyway...

2017-07-26T17:12:57Z Host Profiles[68594 opID=MainThread]: WARNING: (vim.host.VirtualNic) [    (vim.host.VirtualNic) {       dynamicType = <unset>,       dynamicProperty = (vmodl.DynamicProperty) [],       device = 'vmk0',       key = 'key-vim.host.VirtualNic-vmk0',       portgroup = 'XXX',       spec = (vim.host.VirtualNic.Specification) {          dynamicType = <unset>,          dynamicProperty = (vmodl.DynamicProperty) [],          ip = (vim.host.IpConfig) {             dynamicType = <unset>,             dynamicProperty = (vmodl.DynamicProperty) [],             dhcp = true,             ipAddress = 'XXX',             subnetMask = '255.255.255.224',             ipV6Config = (vim.host.IpConfig.IpV6AddressConfiguration) {                dynamicType = <unset>,                dynamicProperty = (vmodl.DynamicProperty) [],                ipV6Address = (vim.host.I

2017-07-26T17:12:57Z Host Profiles[68594 opID=MainThread]: pConfig.IpV6Address) [                   (vim.host.IpConfig.IpV6Address) {                      dynamicType = <unset>,                      dynamicProperty = (vmodl.DynamicProperty) [],                      ipAddress = 'fe80::250:56ff:fe64:73f4',                      prefixLength = 64,                      origin = 'other',                      dadState = 'preferred',                      lifetime = <unset>,                      operation = <unset>                   }                ],                autoConfigurationEnabled = true,                dhcpV6Enabled = false             }          },          mac = 'XXX',          distributedVirtualPort = <unset>,          portgroup = 'XXX',          mtu = 1500,          tsoEnabled = true,          netStackInstanceKey = 'defaultTcpipStack',          opaqueNetwork = <unset>,          externalI

2017-07-26T17:12:57Z Host Profiles[68594 opID=MainThread]: d = <unset>,          pinnedPnic = <unset>,          ipRouteSpec = <unset>       },       port = 'key-vim.host.PortGroup.Port-33554438'    } ]

2017-07-26T17:12:57Z Host Profiles[68594 opID=MainThread]: INFO: Applying dnsConfig (vim.host.DnsConfigSpec) {    dynamicType = <unset>,    dynamicProperty = (vmodl.DynamicProperty) [],    dhcp = true,    virtualNicDevice = <unset>,    hostName = 'localhost',    domainName = 'XXX',    address = (str) [       'XXX'    ],    searchDomain = (str) [       'XXX'    ],    virtualNicConnection = <unset> }

2017-07-26T17:12:57Z Host Profiles[68594 opID=MainThread]: ERROR: EngineModule::ApplyHostConfig. Exception: (vmodl.fault.InvalidArgument) {    dynamicType = <unset>,    dynamicProperty = (vmodl.DynamicProperty) [],    msg = 'A specified parameter was not correct: DnsConfig.VirtualNicDevice',    faultCause = <unset>,    faultMessage = (vmodl.LocalizableMessage) [],    invalidProperty = 'DnsConfig.VirtualNicDevice' }

2017-07-26T17:12:57Z Host Profiles[68594 opID=MainThread]: WARNING: EngineModule::ApplyHostConfig. Backtrace:   File "/build/mts/release/bora-5310538/bora/build/esx/release/vmvisor/sys/lib64/python3.5/site-packages/hostprofiles/tests/tools/hpcliModules/engineModule.py", line 546, in ApplyTaskList   File "/build/mts/release/bora-5310538/bora/build/esx/release/vmvisor/sys/lib64/python3.5/site-packages/hostprofiles/pyEngine/applyConfigSpec.py", line 3663, in ApplyHostConfig   File "/build/mts/release/bora-5310538/bora/build/esx/release/vmvisor/sys/lib64/python3.5/site-packages/hostprofiles/pyEngine/applyConfigSpec.py", line 2416, in ApplyNetworkConfigCLI   File "/build/mts/release/bora-5310538/bora/build/esx/release/vmvisor/sys/lib64/python3.5/site-packages/hostprofiles/pyEngine/applyConfigSpec.py", line 1647, in ApplyDnsConfig   File "/build/mts/release/bora

2017-07-26T17:12:57Z Host Profiles[68594 opID=MainThread]: WARNING: -5310538/bora/build/esx/release/vmvisor/sys/lib64/python3.5/site-packages/pyVmomi/VmomiSupport.py", line 557, in <lambda>   File "/build/mts/release/bora-5310538/bora/build/esx/release/vmvisor/sys/lib64/python3.5/site-packages/pyVmomi/VmomiSupport.py", line 363, in _InvokeMethod   File "/build/mts/release/bora-5310538/bora/build/esx/release/vmvisor/sys/lib64/python3.5/site-packages/pyVmomi/SoapAdapter.py", line 1335, in InvokeMethod 

2017-07-26T17:12:57Z Host Profiles[68594 opID=MainThread]: WARNING: excObj is a runtime fault: (vmodl.fault.InvalidArgument) {    dynamicType = <unset>,    dynamicProperty = (vmodl.DynamicProperty) [],    msg = 'A specified parameter was not correct: DnsConfig.VirtualNicDevice',    faultCause = <unset>,    faultMessage = (vmodl.LocalizableMessage) [       (LocalizableMessageWithPath) {          dynamicType = <unset>,          dynamicProperty = (vmodl.DynamicProperty) [],          key = 'com.vmware.vim.profile.engine.UnexpectedError',          arg = (vmodl.KeyAnyValue) [             (vmodl.KeyAnyValue) {                dynamicType = <unset>,                dynamicProperty = (vmodl.DynamicProperty) [],                key = 'error',                value = 'A specified parameter was not correct: DnsConfig.VirtualNicDevice'             },             (vmodl.KeyAnyValue) {                dynami

2017-07-26T17:12:57Z Host Profiles[68594 opID=MainThread]: cType = <unset>,                dynamicProperty = (vmodl.DynamicProperty) [],                key = 'context',                value = 'EngineModule::ApplyHostConfig'             }          ],          message = 'Error: A specified parameter was not correct: DnsConfig.VirtualNicDevice.'       }    ],    invalidProperty = 'DnsConfig.VirtualNicDevice' }

2017-07-26T17:12:57Z Host Profiles[68594 opID=MainThread]: INFO: Cleaned up Host Configuration

2017-07-26T17:12:57Z Host Profiles[68594 opID=MainThread]: INFO: /sbin/applyHostProfile sending boot msg to console: Applying Host Profile task list...Error

2017-07-26T17:12:57Z Host Profiles[68594 opID=MainThread]: ERROR: while applying tasks: (vmodl.fault.InvalidArgument) {    dynamicType = <unset>,    dynamicProperty = (vmodl.DynamicProperty) [],    msg = <unset>,    faultCause = <unset>,    faultMessage = (vmodl.LocalizableMessage) [       (LocalizableMessageWithPath) {          dynamicType = <unset>,          dynamicProperty = (vmodl.DynamicProperty) [],          key = 'com.vmware.vim.profile.engine.UnexpectedError',          arg = (vmodl.KeyAnyValue) [             (vmodl.KeyAnyValue) {                dynamicType = <unset>,                dynamicProperty = (vmodl.DynamicProperty) [],                key = 'error',                value = 'A specified parameter was not correct: DnsConfig.VirtualNicDevice'             },             (vmodl.KeyAnyValue) {                dynamicType = <unset>,                dynamicProperty = (vmodl.DynamicPro

2017-07-26T17:12:57Z Host Profiles[68594 opID=MainThread]: perty) [],                key = 'context',                value = 'EngineModule::ApplyHostConfig'             }          ],          message = 'Error: A specified parameter was not correct: DnsConfig.VirtualNicDevice.'       }    ],    invalidProperty = 'DnsConfig.VirtualNicDevice' }

2017-07-26T17:12:57Z Host Profiles[68594 opID=MainThread]: WARNING: applyHostProfile utility leaving host in maintenance mode during post boot configuration. Apply Error: (vmodl.fault.InvalidArgument) {    dynamicType = <unset>,    dynamicProperty = (vmodl.DynamicProperty) [],    msg = <unset>,    faultCause = <unset>,    faultMessage = (vmodl.LocalizableMessage) [       (LocalizableMessageWithPath) {          dynamicType = <unset>,          dynamicProperty = (vmodl.DynamicProperty) [],          key = 'com.vmware.vim.profile.engine.UnexpectedError',          arg = (vmodl.KeyAnyValue) [             (vmodl.KeyAnyValue) {                dynamicType = <unset>,                dynamicProperty = (vmodl.DynamicProperty) [],                key = 'error',                value = 'A specified parameter was not correct: DnsConfig.VirtualNicDevice'             },             (vmodl.KeyAnyValue) {

2017-07-26T17:12:57Z Host Profiles[68594 opID=MainThread]: dynamicType = <unset>,                dynamicProperty = (vmodl.DynamicProperty) [],                key = 'context',                value = 'EngineModule::ApplyHostConfig'             }          ],          message = 'Error: A specified parameter was not correct: DnsConfig.VirtualNicDevice.'       }    ],    invalidProperty = 'DnsConfig.VirtualNicDevice' } Reapply Required State: ['DvsProfile'] Early Boot Failed State: None

Second pass below. Note that the virtualNicDevice parameter is now populated.

2017-07-26T17:16:06Z Host Profiles[69779 opID=2e26ee82-03-8c-163a]: INFO: Applying dnsConfig (vim.host.DnsConfigSpec) {    dynamicType = <unset>,    dynamicProperty = (vmodl.DynamicProperty) [],    dhcp = true,    virtualNicDevice = 'vmk0',    hostName = 'localhost',    domainName = 'XXX',    address = (str) [       'XXX'    ],    searchDomain = (str) [       'XXX'    ],    virtualNicConnection = <unset> }
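To spot this first-pass/second-pass difference without reading the whole dump, the `virtualNicDevice` value can be extracted from each "Applying dnsConfig" line. The Python below is a rough sketch: the regular expression is an assumption based only on the two excerpts above, the sample lines are abbreviated copies of them, and `dns_apply_devices` is a made-up name.

```python
import re

# Matches the "virtualNicDevice = ..." field inside an "Applying dnsConfig" line.
# The pattern is an assumption based on the two excerpts above.
DEVICE_RE = re.compile(r"Applying dnsConfig .*?virtualNicDevice = (<unset>|'[^']*')")


def dns_apply_devices(lines):
    """Yield (timestamp, device or None) for each 'Applying dnsConfig' log line."""
    for line in lines:
        match = DEVICE_RE.search(line)
        if match:
            timestamp = line.split(" ", 1)[0]
            value = match.group(1)
            yield timestamp, None if value == "<unset>" else value.strip("'")


# Abbreviated versions of the first-pass and second-pass lines quoted above.
log = [
    "2017-07-26T17:12:57Z Host Profiles[68594 opID=MainThread]: INFO: Applying dnsConfig (vim.host.DnsConfigSpec) {    dhcp = true,    virtualNicDevice = <unset>,    hostName = 'localhost' }",
    "2017-07-26T17:16:06Z Host Profiles[69779 opID=2e26ee82-03-8c-163a]: INFO: Applying dnsConfig (vim.host.DnsConfigSpec) {    dhcp = true,    virtualNicDevice = 'vmk0',    hostName = 'localhost' }",
]

results = list(dns_apply_devices(log))
for timestamp, device in results:
    print(timestamp, device or "<unset>")
```

An entry with no device followed a few minutes later by one with `vmk0` matches the pattern described: the DNS config is first applied before the vmkernel interface is known, and only the later attempt has what it needs.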

Putting vmk0 on a vDS with a static IP is when the gateway doesn't get assigned. I've been unable to find a working combination of the half dozen different poorly worded and undocumented host profile parameters which reference 'gateway.'

Leaving vmk0 on a standard switch while putting other vmkernel interfaces on vDS results in the issue I describe here, which is that vmk0 gets deleted but not recreated, leaving the host without a functioning management interface at all.

randomname
Enthusiast

Last week's release of ESXi 6.5 Update 1 appears to have finally fixed management services vmkernel adapters on vDS with auto deploy and host profiles.

DHCP remains broken with the same behavior, as is management on a standard vSwitch with any other vmkernel adapter on a vDS.

MarkusHartmann
Contributor

Same problem here. We did an update from 5.5 to 6.5u1, and now our host profiles and Auto Deploy will not set the default gateway with our vDS. Very frustrating.

After logging in to the DCUI and manually setting the default gateway, everything seems fine: the hosts are reachable and the host profile gets applied.

Opened #SR17578133409, but no update for about 10 days 😕

devs159
Contributor

Ever find a workaround for this?

Tested with DHCP and 6.5 u1e and it's still failing to clear out vmk0. The host gets joined to vCenter in maintenance mode and is compliant with the h/profile. Using standard vSwitches works successfully. It's surprising this doesn't work, considering Auto Deploy has been available since 5.x and nothing is documented saying it's unsupported with dvs.

I did hear there were significant changes to how host profiles are implemented in 6.5u1, but no details were provided. A future release may resolve this problem whenever Engineering get around to fixing it.
