VMware Cloud Community
levans01
Contributor

Upgrade from VCSA 6.0 U3j to VCSA 6.7 Failing in Stage 2

Good Afternoon,

1. For a couple of weeks now I have been working on what should be a fairly straightforward task: upgrading a fully integrated 6.0 VCSA to 6.7.

2. Stage 1 (Deploy vCenter Server Appliance with an Embedded Platform Services Controller) completes without issue.

Image 1.1. Upgrade Stage 2: Data transfer and appliance setup is in progress 

pastedImage_0.png

3. Stage 2 completes two of the three tasks and then complains that the source vCenter has been powered off. This seems like a problem with the workflow. I've tried powering the old vCenter back on and retrying the operation, but it simply fails again with an error.

Image 1.2 VCSA 6.7 Upgrade Failure Error:

pastedImage_2.png

4. The new appliance shows an error indicating that the IMPORT failed. It still has the old IP address on the console. When I reboot the target appliance, it switches to the new appliance FQDN and IP address, but the appliance is clearly broken. The console still shows the upgrade import error:

Image 1.3 VCSA 6.7 VCSA appliance

pastedImage_4.png    

5. When I log on to the VCSA 6.7 at https://FQDN.TLD:5480 I see the following error message:

Image 1.4 VCSA 6.7 Appliance Broken message.

pastedImage_6.png

I think I am doing everything correctly. I don't know if there is another way forward. I recall that several years back there was talk of being able to stand up a new standalone appliance and then manually import the configuration. If anyone has seen this problem and has a solution, I am all ears.

PS. I have a ticket open with VMware support and have provided them with the logs, but I have not received any actionable response yet, other than to try powering on the previous appliance to complete the final step.

Thank you.

7 Replies
jonastro
VMware Employee

Hello levans01

Thanks for posting in VMware communities.

A quick note: after Stage 1 completes, take a snapshot of the new appliance so that, if Stage 2 fails, you can revert to the snapshot and resume from the end of Stage 1.

By default, during Stage 2 the source appliance is powered off and first boot then runs on the destination (upgraded) vCenter appliance.

Can I have the SR number so that I can take a quick look at the logs, if you have uploaded them?

Regards,

Jonathan

vSphere Install Upgrade Team

VMware Cloud Foundation

levans01
Contributor

Jonathan,

Thanks for the quick answer and the tip. Here is the SR number: 20099508402

Thank you.

jonastro
VMware Employee

Hello levans01

I have reviewed the logs.

Here are my findings:

===================================
bootstrap.log

2020-02-07T18:07:50.881Z ERROR transport Command ['/usr/bin/python3', '/usr/lib/vmware/cis_upgrade_runner/UpgradeOrchestrator.py', '-m', 'import', '-f', 'upgrade-import-config.json', '-o', '/var/log/vmware/upgrade/import.json', '--logDir', '/var/log/vmware/upgrade', '--logFileName', 'import-upgrade-runner.log', '--cancelFile', '/var/tmp/upgrade_cancel.op', '-l', 'en', '--logLevel', 'INFO', '--disableScreenLog'] exit-code=1, stdout=, stderr=
2020-02-07T18:07:50.881Z ERROR __main__ ERROR: Fatal error during upgrade IMPORT. For more details take a look at: /var/log/vmware/upgrade/import-upgrade-runner.log
2020-02-07T18:07:50.882Z INFO root Exiting with exit-code 1

import-upgrade-runner.log

2020-02-07T18:07:50.795Z INFO UpgradeRunner Loading upgrade workflow context from /storage/seat/cis-export-folder/system-data/UpgradeRunner.ctx..
2020-02-07T18:07:50.795Z INFO config.config_loader Source com.vmware.vpxd endpoint is not specified and its components could not be found automatically
2020-02-07T18:07:50.796Z WARNING networking_utils Could not find address info for 127.0.0.1
2020-02-07T18:07:50.796Z INFO config.credentials Credentials are not defined for component -- com.vmware.vpxd
2020-02-07T18:07:50.797Z ERROR networking_utils Could not validate host xyz.abc.local: [Errno -3] Temporary failure in name resolution
2020-02-07T18:07:50.798Z ERROR UpgradeRunner Upgrade Runner has encountered an exception
Traceback (most recent call last):
  File "/usr/lib/vmware/cis_upgrade_runner/UpgradeRunner.py", line 1771, in main
    credentials.loadCredentials(configData)
  File "/usr/lib/vmware/cis_upgrade_runner/py/config/credentials.py", line 74, in loadCredentials
===================================

I have scrubbed your FQDN for privacy and security and changed it to xyz.abc.local.

The logs above show that name resolution is failing.
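For anyone triaging a similar failure, the relevant line can be pulled out of import-upgrade-runner.log with a short script. This is only a sketch: the regular expression is inferred from the single log line quoted above and is not an official log format, and the function name `find_resolution_failures` is made up for illustration.

```python
import re

# Pattern inferred from the one "Could not validate host" line in this thread.
# It is an assumption, not a documented log format.
RESOLVE_ERROR = re.compile(
    r"^(?P<ts>\S+) ERROR networking_utils Could not validate host "
    r"(?P<host>\S+): \[Errno (?P<errno>-?\d+)\] (?P<reason>.+)$"
)


def find_resolution_failures(log_text):
    """Return (timestamp, host, errno, reason) for each resolver failure line."""
    failures = []
    for line in log_text.splitlines():
        match = RESOLVE_ERROR.match(line.strip())
        if match:
            failures.append(
                (match["ts"], match["host"], int(match["errno"]), match["reason"])
            )
    return failures
```

Feeding it the contents of /var/log/vmware/upgrade/import-upgrade-runner.log returns one tuple per host that failed validation, which makes it easy to see which name the upgrade runner could not resolve.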

Plan:

Keep the source appliance, the jumpbox from which you run the installer, and the destination appliance all on the same port group and the same ESXi host. This eliminates most network problems, because the VMs then communicate within the virtual switch port group.

Keep DRS in manual mode.

From the jumpbox, use the command prompt to verify both forward and reverse lookup of the vCenter FQDN.
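If Python is available on the jumpbox, the forward and reverse lookup check can also be scripted. The sketch below uses only the standard `socket` module; the function name and the returned dictionary layout are illustrative choices, and the resolver functions are passed in as parameters so the logic can be exercised without touching real DNS.

```python
import socket


def check_forward_reverse(fqdn,
                          forward=socket.gethostbyname,
                          reverse=socket.gethostbyaddr):
    """Resolve fqdn to an IPv4 address, then resolve that address back to a name."""
    result = {"fqdn": fqdn, "ip": None, "reverse_name": None,
              "consistent": False, "error": None}
    try:
        ip = forward(fqdn)
        result["ip"] = ip
        # gethostbyaddr returns (hostname, alias_list, address_list).
        name, _aliases, _addresses = reverse(ip)
        result["reverse_name"] = name
        # Compare case-insensitively and ignore a trailing dot.
        result["consistent"] = name.rstrip(".").lower() == fqdn.rstrip(".").lower()
    except OSError as exc:
        # socket.gaierror and socket.herror are both subclasses of OSError.
        result["error"] = str(exc)
    return result
```

Calling `check_forward_reverse("abc.xyz.com")` with the real vCenter FQDN in place of the masked name reports whether the name resolves and whether the PTR record points back to the same name. Note that this only tests resolution from the machine where it runs, not from the appliance itself.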

Regards,

Jonathan

vSphere Install Upgrade Team

VMware Cloud Foundation

levans01
Contributor

Jonathan,

I will try that to see if it solves my problem. I will post an update on my steps and the outcomes.

Thanks!

levans01
Contributor

1. Forward and reverse lookups for the VCSA FQDN work from the jumpbox where the installer is running.

2. All machines are in the same port group on the VDS, and they all point to the same DNS servers.

3. I moved all machines onto the same host (source appliance, destination appliance, Windows VUM machine, and jumpbox).

4. I started Stage 2 of the installation and it failed at the same point. Task #2 made it to 100% before generating an error.

Image 1.1 Final error.

pastedImage_0.png

jonastro
VMware Employee

Hello,

The engineer working on the SR has been notified.

The two of us will arrange a remote session with you tomorrow morning to get this sorted.

Regards,

Jonathan

VMware Install Upgrade Team

VMware Cloud Foundation

jonastro
VMware Employee

Hello Lee,

It was a pleasure working with you on this case.

To ensure clarity on the resolution of your issue, and as a record for you, below is a summary of what we worked on:

Summary:

The 6.0 to 6.7 appliance upgrade was failing because the hostname could not be resolved.

Cause and Resolution:

Found the following on the source 6.0 vCenter (IP address and hostname have been masked for security reasons):

---------------------------------------------
# /opt/vmware/share/vami/vami_get_network
interface: eth0
config_present: true
config_flags: STATICV4+STATICV6
config_ipv4addr: 192.168.1.10
config_netmask: 255.255.255.0
config_broadcast: 192.168.1.255
config_gatewayv4:
config_ipv6addr: fe80::251:45ff:fef9:441d
config_prefix: 64
config_gatewayv6:
autoipv6:
active_ipv4addr: 192.168.1.10
active_netmask: 255.255.0.0
active_broadcast: 192.168.1.255
active_ipv6addr:
active_prefix:
active_gatewayv4: 192.168.1.1
active_gatewayv6:
---------------------------------------------

In the output above, note that "config_netmask" and "active_netmask" have different values.

The "active_netmask: 255.255.0.0" value is the correct subnet mask for this environment.
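A mismatch like this is easy to miss by eye. As a rough sketch, the output of vami_get_network can be parsed and every config_* field compared against its active_* counterpart. The helper names below are invented for illustration, and the parser assumes the simple "key: value" layout shown above.

```python
SAMPLE = """\
# /opt/vmware/share/vami/vami_get_network
interface: eth0
config_present: true
config_flags: STATICV4+STATICV6
config_ipv4addr: 192.168.1.10
config_netmask: 255.255.255.0
config_broadcast: 192.168.1.255
config_gatewayv4:
config_ipv6addr: fe80::251:45ff:fef9:441d
config_prefix: 64
config_gatewayv6:
autoipv6:
active_ipv4addr: 192.168.1.10
active_netmask: 255.255.0.0
active_broadcast: 192.168.1.255
active_ipv6addr:
active_prefix:
active_gatewayv4: 192.168.1.1
active_gatewayv6:
"""


def parse_vami_output(text):
    """Parse 'key: value' lines into a dict; missing values become ''."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or ":" not in line:
            continue
        # Split on the first colon only, so IPv6 addresses stay intact.
        key, _, value = line.partition(":")
        fields[key.strip()] = value.strip()
    return fields


def config_active_mismatches(fields):
    """Return {suffix: (config_value, active_value)} where the two differ."""
    mismatches = {}
    for key, config_value in fields.items():
        if not key.startswith("config_"):
            continue
        suffix = key[len("config_"):]
        active_key = "active_" + suffix
        if active_key in fields and fields[active_key] != config_value:
            mismatches[suffix] = (config_value, fields[active_key])
    return mismatches


if __name__ == "__main__":
    for name, pair in config_active_mismatches(parse_vami_output(SAMPLE)).items():
        print(name, "config:", repr(pair[0]), "active:", repr(pair[1]))
```

On the masked sample this flags the netmask and the missing configured IPv4 gateway, along with the IPv6 address and prefix that exist only in the stored configuration.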

  • IPv6 was enabled in the environment.
  • Disabled IPv6 manually from the DCUI console.
  • Edited the /etc/hosts file and added the IP address and FQDN of the vCenter server.
  • Made sure the source vCenter, the jumpbox, and the new destination vCenter appliance were all on the same host and the same port group.
  • Started the upgrade and made sure the destination appliance had the same network details as the source (same subnet and gateway).
  • The upgrade completed successfully; however, vCenter was showing HA alarms and was unable to reach the isolation address.
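To double-check the /etc/hosts step from the list above, a small helper can confirm that the file maps the vCenter FQDN to the expected address. This is a sketch only: `hosts_lookup` is a made-up name, and the sample entry (including the short alias "abc") is an assumption built from the masked values in this thread.

```python
def hosts_lookup(hosts_text, fqdn):
    """Return the addresses that hosts-format text maps to fqdn (case-insensitive)."""
    wanted = fqdn.lower()
    addresses = []
    for raw in hosts_text.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        address, *names = line.split()
        if wanted in (name.lower() for name in names):
            addresses.append(address)
    return addresses


# Hypothetical /etc/hosts content using the masked values from the thread.
EXAMPLE_HOSTS = """\
127.0.0.1 localhost
192.168.1.10 abc.xyz.com abc  # vCenter (alias is an assumption)
"""
```

`hosts_lookup(EXAMPLE_HOSTS, "abc.xyz.com")` returns the list of addresses mapped to that name; an empty list means the entry is missing or misspelled.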

After upgrade:

  • On further checking the upgraded 6.7 vCenter, we found that the network configuration had no default gateway and the wrong subnet mask (255.255.255.0).
  • This is the same subnet mask that was configured on the source 6.0 vCenter.
  • The interface was also picking up the same IPv6 address that was observed on the source 6.0 vCenter.
  • We suspect the old, invalid configuration was copied to the new appliance during the upgrade.

Ran the following command on the newly upgraded 6.7 vCenter appliance:

---------------------------------------------
/opt/vmware/share/vami/vami_config_net

Network Configuration for eth0
IPv4 Address:   192.168.1.10
Netmask:        255.255.255.0
IPv6 Address:
Prefix:

Global Configuration
IPv4 Gateway:
IPv6 Gateway:
Hostname:       abc.xyz.com
DNS Servers:    127.0.0.1, 192.168.1.10
Domain Name:
Search Path:
Proxy Server:
---------------------------------------------

Manually edited the configuration using the menu options in "/opt/vmware/share/vami/vami_config_net".

After changes:

---------------------------------------------
Network Configuration for eth0
IPv4 Address:   192.168.1.10
Netmask:        255.255.0.0
IPv6 Address:
Prefix:

Global Configuration
IPv4 Gateway:   192.168.1.1
IPv6 Gateway:
Hostname:       abc.xyz.com
DNS Servers:    192.168.1.10
Domain Name:
Search Path:
Proxy Server:
---------------------------------------------

Once the changes were made, we rebooted the vCenter; all services came back online and were functional.

The HA alarms cleared and vCenter was stable.
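As a sanity check on values like the ones entered above, Python's standard `ipaddress` module can show what a given address, netmask, and gateway combination actually implies. The function name is illustrative, and because the addresses in this thread are masked, the numbers below demonstrate the check rather than describe the real environment.

```python
import ipaddress


def describe_ipv4(address, netmask, gateway=""):
    """Summarise the network implied by an address/netmask pair.

    gateway_on_link is None when no gateway is configured, otherwise
    True/False depending on whether the gateway lies inside the network.
    """
    iface = ipaddress.IPv4Interface(f"{address}/{netmask}")
    info = {
        "network": str(iface.network),
        "prefix_length": iface.network.prefixlen,
        "broadcast": str(iface.network.broadcast_address),
        "gateway_on_link": None,
    }
    if gateway:
        info["gateway_on_link"] = ipaddress.IPv4Address(gateway) in iface.network
    return info


if __name__ == "__main__":
    # Values as copied to the upgraded appliance, then after the manual fix.
    print(describe_ipv4("192.168.1.10", "255.255.255.0"))
    print(describe_ipv4("192.168.1.10", "255.255.0.0", "192.168.1.1"))
```

With the stale mask and no gateway, hosts outside the /24 network are unreachable, which is consistent with the HA isolation address alarms described above.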

Regards,

Jonathan

vSphere Install Upgrade Team

VMware Cloud Foundation