VMware Cloud Community
jaguardin
Contributor
Contributor

VCSA upgrade from U3C to U3E fails at staging

hello,

ive been trying to get this upgrade working with no luck.

did the following:

- shutdown VCSA.

- took snapshot.

- power on. attach ISO patch file.

- stage files

staging fails with:

 

[2022-06-02 15:37:56,775] : Downloading RPM vmware-vsm-7.0.3.00600-9891004.x86_64.rpm ....
[2022-06-02 15:37:58,781] : Downloading RPM VMware-wcpovf-3.0.366-19705778.x86_64.rpm
[2022-06-02 15:37:59,789] : Staging failed. Retry to resume from the current state. Or please collect the VC support bundle.

 

I then ran unstage command:

Command> software-packages unstage

 

Command completes successfully. I re-run the staging command, and still failure.

 

Any suggestions?

Labels (1)
  • C

0 Kudos
15 Replies
jaguardin
Contributor
Contributor

further, in the  ./applmgmt/software-packages.log

2022-06-02T15:49:07.153 [37731]DEBUG:vmware.appliance.update.dbfunctions:Executing {SELECT CASE WHEN count(*) == 0 THEN 0 ELSE 1 END as status FROM progress WHERE _stagekey = 'patch-state' AND _message = 'Install started'}
2022-06-02T15:49:07.153 [37731]DEBUG:vmware.appliance.update.functions:Stage started = [1], Stage Completed =[0], Install started =[0]
2022-06-02T15:49:07.153 [37731]DEBUG:vmware.appliance.update.functions:Resume needed in Stage phase
2022-06-02T15:49:07.153 [37731]ERROR:vmware.appliance.update.update_b2b:Database not initialised before its use.
2022-06-02T15:49:07.153 [37731]INFO:vmware.vherd.base.software_update:Staging failed. Retry to resume from the current state. Or please collect the VC support bundle.

 

could be related to that "database not initialised" error? i could find nothing regarding this.

 

Tags (1)
0 Kudos
8islas
Enthusiast
Enthusiast

Hello;

Is the correct ISO attached? did you download ISO again? Have you tried this?

To perform an update, log in to VAMI via port 5480 and click on the STAGE AND INSTALL option in the Update section.
*Remember to have a valid backup or snapshot before proceeding with the update.(you did it)

If you cannot see that option in your VAMI, please proceed as follow:

Download and Installation via ISO.
To download this patch from VMware Customer Connect, you must navigate to Products and Accounts > Product Patches. From the Select a Product drop-down menu, select VC and from the Select a Version drop-down menu, select 7.0.3.

Attach the VMware-vCenter-Server-Appliance-7.0.3.00600-19717403-patch-FP.iso file to the vCenter Server CD or DVD drive.
Log in to the appliance shell as a user with super administrative privileges (for example, root) and run the following commands:


To stage the ISO:
software-packages stage --iso

To see the staged content:
software-packages list --staged

To install the staged rpms:
software-packages install --staged

VMware vCenter Server 7.0 Update 3e Release Notes

The correct link for download patches is:

Product Patches - VMware Customer Connect

Regards

0 Kudos
jaguardin
Contributor
Contributor

hello, i will try to download the ISO again. i did download it from the link you sent, originally, but maybe it was corrupt?

 

also, i will try the 'stage and install' option. i will update shortly the result.

0 Kudos
jaguardin
Contributor
Contributor

yes the correct iso was attached you can see that in the rpm version is for '7.03.00600' above in my first ost

0 Kudos
Ajay1988
Expert
Expert

Looks like the patching DB got corrupt mostly due to multiple CLI or VAMI attempts.

Please take a snapshot and try the below and keep us posted how the patching goes via CLI. 

service-control --stop applmgmt
rm -rf /storage/core/software-update/updates/*
rm -rf /storage/core/software-update/stage/*
rm -rf /storage/db/patching.db
mv /storage/core/software-packages/staged-configuration.json /storage/core
mv /etc/applmgmt/appliance/software_update_state.conf /storage/core
service-control --start applmgmt

If you think your queries have been answered
Mark this response as "Correct" or "Helpful".

Regards,
AJ
jaguardin
Contributor
Contributor

hello,

it was the ISO file that was the issue, the transfer got corrupted.

staging went through fine - but now i got another error after running the install!

2022-06-03T10:28:20.154 [330]INFO:vmware.vherd.base.software_update:Installation failed. Please collect the VC support.

in the VAMI, it says "Exception occured in postInstallHook"

what should i do in this case?

0 Kudos
jaguardin
Contributor
Contributor

AJ  - i did not run your commands. should i rollback to my snapshot, run the commands you listed, and try running the upgrade again?

0 Kudos
jaguardin
Contributor
Contributor

" [2022-06-03 11:56:21,410] : Installing RPM VMware-Postgres-client-jdbc-13.4.0-18716783.noarch.rpm ....
[2022-06-03 12:01:25,004] : Converting data as part of post install ....
[2022-06-03 12:01:27,014] : Installation failed. Please collect the VC support."

0 Kudos
robc_yk
Enthusiast
Enthusiast

I've seen some positive results with previous builds in the past by connecting to the host the VCSA is running on and mounting the ISO from there then running the commands.

0 Kudos
jaguardin
Contributor
Contributor

that is exactly what i did. this is a new error complaining about

 

"Exception occured in postInstallHook"

0 Kudos
Ajay1988
Expert
Expert

Yes. If the vCSA pathcing attempt failed. Please rollback and try the steps . Please use the CLI method from Appliance Shell. 

If you think your queries have been answered
Mark this response as "Correct" or "Helpful".

Regards,
AJ
0 Kudos
jaguardin
Contributor
Contributor

i did do that already, i have tried multiple times from the cli and it fails with the same error.

i saw a complaint in one of the logs like these (I think it was patchrunner.log):

 

""]ERROR:vmware.vherd.transport.server:Cannot listen: Couldn't listen on ::1:8201: [Errno 99] Cannot assign requested address.""

""ERROR vmware_b2b.patching.executor.hook_executor Patch hook 'wcp:Patch' failed.""

""File "/storage/updatemgr/software-updateyp8zs7oj/stage/scripts/patches/payload/components-script/wcp/__init__.py", line 213, in doPatching doIncrementalPatching(current_version)""

""line 213, in doPatching doIncrementalPatching(current_version)""

""Failed to apply patch roles_groups_users!""

"""An error occurred while invoking external command : '%(0)s""

not in this order, i reverted snapshot already but i had searched the above and got it from my google history. i tried looking through that .py file and its an exception it throws "failed to apply patch..."  but i could not understand the rest of it.

any ideas?

0 Kudos
Ajay1988
Expert
Expert

Ahh..That seems like some roles/privilege missing. Can u share the /var/log/vmware/applmgmt/PatchRunner.log ??
Also do you VC's in linked mode ? 

Also do generate a log bundle from vCSA after this failure; so that it can be used for review. Mostly seems u might have to contact VMware Support to get help as I believe you need to add the missing privileges/roles. 

If you think your queries have been answered
Mark this response as "Correct" or "Helpful".

Regards,
AJ
0 Kudos
jaguardin
Contributor
Contributor

hi Ajay,

I engaged Support and waiting for them to get schedule a time with me.

Yes the vCenters are in linked mode. I took offline snapshots for both before doing any work.

I reverted to the pre upgrade snapshot before the weekend as I did not want to leave the VCSA in bad state. 

I'll let update this thread after for anyone that has this issue, if we get a resolution or not!

0 Kudos
jaguardin
Contributor
Contributor

hello,

to update this, VMware Support fixed the issue. There is pretty much no way I would have figured out how to do that. There were some issues with stale service registrations (whatever that means?), I believe because the certificates the services were registered with, were not synced or old...or something along those lines. The Rep ran a bunch of vmware cli command tools to get this fixed, I kind of got lost along the way of what he did. He also reset our administrator@vsphere.local password. After that patch completed successfully and the post install was fine, and it had the right permissions. I asked Customer Support Rep Manager to possibly have technical team put this in a kb, but I am not sure if that will be possible.  For references and anyone else on the internet, this was the stuff that pointed us in the direction, which I saw in patchrunner.log...if you google-fu and came here, contact support 🙂

 

--------------

---------------

2022-06-06 10:30:00,654 - 33714 -           update_b2b::                   installRpms:2398 -     INFO - Installation of all RPMs successfully completed

2022-06-06 10:30:00,654 - 33714 -          dbfunctions::                    executeDML:  54 -    DEBUG - Executing Query {INSERT INTO progress VALUES(NULL, ?,         DATETIME('now'), ?, ?, ?)} with parameters ('install', 'Installation of all RPMs successfully completed', 100, 70)

2022-06-06 10:30:00,656 - 33714 -         task_manager::                        update:  80 -    DEBUG - UpdateTask: status=RUNNING, progress=80, message={'id': 'com.vmware.appliance.patch_in_progress', 'default_message': 'Converting data as part of post install', 'args': []}

2022-06-06 10:30:00,660 - 33714 -           update_b2b::               _runPostInstall:1879 -    DEBUG - Running the patch hook after rpm installation

2022-06-06 10:30:00,660 - 33714 -           update_b2b::                _runScriptHook: 365 -    DEBUG - Running B2B script with hook PatchHook

2022-06-06 10:30:00,660 - 33714 -           update_b2b::                _runScriptHook: 374 -    DEBUG - update script output to file /var/log/vmware/applmgmt/upgrade_hook_PatchHook

2022-06-06 10:30:00,663 - 33859 -      script_executor::                   executeHook:  41 -    DEBUG - Add script folder /storage/updatemgr/software-updater_s09c6f/stage/scripts to python path

2022-06-06 10:30:00,673 - 33859 -           extensions::                _findExtension:  83 -    DEBUG - Found script hook <module 'update_script' from '/storage/updatemgr/software-updater_s09c6f/stage/scripts/update_script.py'>:PatchHook'

2022-06-06 10:30:00,673 - 33859 -           extensions::                      __call__: 105 -    DEBUG - Executing script hook '<module 'update_script' from '/storage/updatemgr/software-updater_s09c6f/stage/scripts/update_script.py'>:PatchHook' with arguments <vmware.update.specs.UpdateSpec object at 0x7f2e040df050>.

2022-06-06 10:30:00,674 - 33859 -         update_utils::                    runCommand: 340 -    DEBUG - Running command: ['/bin/bash', '--login', '-c', '/usr/bin/python3 "/storage/updatemgr/software-updater_s09c6f/stage/scripts/patches/PatchRunner.py" patch -u "/tmp/tmpapnr9394" -o "/var/log/vmware/applmgmt/upgrade_hook_PatchHook" -d "/storage/updatemgr/software-updater_s09c6f/stage/patch_runner" --disableStdoutLogging']

2022-06-06 10:30:00,675 - 33859 -         update_utils::                    runCommand: 345 -    DEBUG - You can find the output of the command in temp files out /var/log/vmware/applmgmt/update-script-outbpk907co, err /var/log/vmware/applmgmt/update-script-errtyblomg1

2022-06-06 10:30:07,627 - 33859 -         update_utils::                    runCommand: 377 -    DEBUG - STDOUT: b"2022-06-06T17:30:06.581Z  Running command: ['/usr/lib/vmware-vmafd/bin/dir-cli', 'ssogroup', 'create', '--name', 'NsxAdministrators', '--description', 'SSO group to view and modify NSX configuration.', '--login', '']\n2022-06-06T17:30:06.604Z  Done running command\n"

2022-06-06 10:30:07,628 - 33859 -         update_utils::                    runCommand: 378 -    DEBUG - STDERR: b''

2022-06-06 10:30:07,628 - 33859 -        update_script::_executePatchRunnerAndSupplyUserData: 125 -    DEBUG - Patch command patch:

out=b"2022-06-06T17:30:06.581Z  Running command: ['/usr/lib/vmware-vmafd/bin/dir-cli', 'ssogroup', 'create', '--name', 'NsxAdministrators', '--description', 'SSO group to view and modify NSX configuration.', '--login', ']\n2022-06-06T17:30:06.604Z  Done running command\n"

error=b''

rc=1

 

2022-06-06 10:30:07,628 - 33859 -        update_script::_executePatchRunnerAndSupplyUserData: 129 -    ERROR - Patch command patch failed

2022-06-06 10:30:07,629 - 33859 -        update_script::                _reportFailure: 161 -    DEBUG - Error status has been already reported

2022-06-06 10:30:07,629 - 33859 -           extensions::                      __call__: 107 -    DEBUG - The component script returned 'None'

2022-06-06 10:30:07,633 - 33714 -           update_b2b::                _runScriptHook: 389 -    DEBUG - update script result file '{\n    "end_time": "2022-06-06T17:30:07.612Z",\n    "error": {\n        "componentKey": "wcp:Patch",\n        "detail": [\n            {\n                "args": [\n                    "roles_groups_users",\n                    "{\\n    \\"detail\\": [\\n        {\\n            \\"id\\": \\"install.ciscommon.command.errinvoke\\",\\n            \\"translatable\\": \\"An error occurred while invoking external command : \'%(0)s\'\\",\\n            \\"args\\": [\\n                \\"Error 53 while creating SSO group \\\\\\"NsxAdministrators\\\\\\":\\\\ndir-cli failed. Error 53: Possible errors: \\\\nLDAP error: Server is unwilling to perform \\\\nWin Error: Operation failed with error ERROR_BAD_NETPATH (53) \\\\n\\"\\n            ],\\n ..... [\n                "Error 53 while creating SSO group \\"NsxAdministrators\\":\\ndir-cli failed. Error 53: Possible errors: \\nLDAP error: Server is unwilling to perform \\nWin Error: Operation failed with error ERROR_BAD_NETPATH (53) \\n"\n            ],\n            "localized": "An error occurred while invoking external command : \'Error 53 while creating SSO group \\"NsxAdministrators\\":\\ndir-cli failed. Error 53: Possible errors: \\nLDAP error: Server is unwilling to perform \\nWin Error: Operation failed with error ERROR_BAD_NETPATH (53) \\n\'"\n        }\n    ],\n    "componentKey": null,\n    "problemId": null,\n    "resolution": null\n}']}, 'resolution': None}], 'info': [], 'warnings': []} from _runScriptHook

2022-06-06 10:30:07,633 - 33714 -           update_b2b::               postInstallHook:2443 -    DEBUG - PostInstall errors: [{'id': 'com.vmware.apliance.noid', 'time': '2022-06-06T17:30:07.633Z', 'message': {'id': 'wcp.patch.incrementalPatching.fail', 'default_message': 'Failed to apply patch roles_groups_users! Error: {\n    "detail": [\n        {\n            "id": "install.ciscommon.command.errinvoke",\n            "translatable": "An error occurred while invoking external command : \'%(0)s\'",\n            "args": [\n                "Error 53 while creating SSO group \\"NsxAdministrators\\":\\ndir-cli failed. Error 53: Possible errors: \\nLDAP error: Server is unwilling to perform \\nWin Error: Operation failed with error ERROR_BAD_NETPATH (53) \\n"\n            ],\n            "localized": "An error occurred while invoking external command : \'Error 53 while creating SSO group \\"NsxAdministrators\\":\\ndir-cli failed. Error 53: Possible errors: \\nLDAP error: Server is unwilling to perform \\nWin Error: Operation failed with error ERROR_BAD_NETPATH (53) \\n\'"\n        }\n    ],\n    "componentKey": null,\n    "problemId": null,\n    "resolution": null\n}.', 'args': ['roles_groups_users', '{\n    "detail": [\n        {\n            "id": "install.ciscommon.command.errinvoke",\n            "translatable": "An error occurred while invoking external command : \'%(0)s\'",\n            "args": [\n                "Error 53 while creating SSO group \\"NsxAdministrators\\":\\ndir-cli failed. Error 53: Possible errors: \\nLDAP error: Server is unwilling to perform \\nWin Error: Operation failed with error ERROR_BAD_NETPATH (53) \\n"\n            ],\n            "localized": "An error occurred while invoking external command : \'Error 53 while creating SSO group \\"NsxAdministrators\\":\\ndir-cli failed. Error 53: Possible errors: \\nLDAP error: Server is unwilling to perform \\nWin Error: Operation failed with error ERROR_BAD_NETPATH (53) \\n\'"\n        }\n    ],\n    "componentKey": null,\n    "problemId": null,\n    "resolution": null\n}']}, 'resolution': None}]

2022-06-06 10:30:07,634 - 33714 -         task_manager::                        update:  80 -    DEBUG - UpdateTask: status=RESUMABLE, progress=80, message={'id': 'com.vmware.appliance.postinstall_errors', 'default_message': 'Data conversion/Post install hook failed', 'args': []}

2022-06-06 10:30:07,638 - 33714 -          dbfunctions::                    executeDML:  54 -    DEBUG - Executing Query {INSERT INTO progress VALUES(NULL, ?,         DATETIME('now'), ?, ?, ?)} with parameters ('failed', 'Installation failed', 100, 95)

2022-06-06 10:30:07,641 - 33714 -           update_b2b::               postInstallHook:2450 -    DEBUG - Returning from postInstallHook(), data conversion failed

2022-06-06 10:30:07,641 - 33714 -           update_b2b::               postInstallHook:2457 -    ERROR - Exception occured in postInstallHook

2022-06-06 10:30:07,641 - 33714 -         task_manager::                        update:  80 -    DEBUG - UpdateTask: status=RESUMABLE, progress=80, message={'id': 'com.vmware.appliance.plain_message', 'default_message': '%s', 'args': ['Exception occured in postInstallHook']}

2022-06-06 10:30:07,644 - 33714 -           update_b2b::                       install:2684 -    ERROR - Post-Install hook phase failed.

2022-06-06 10:30:07,645 - 33714 -          dbfunctions::                    executeDML:  54 -    DEBUG - Executing Query {SELECT _value from config where _config = ?} with parameters ('failed-state',)

2022-06-06 10:30:07,645 - 33714 -          dbfunctions::                    executeDML:  54 -    DEBUG - Executing Query {SELECT _value from config where _config = ?} with parameters ('failed-state',)

2022-06-06 10:30:07,645 - 33714 -          dbfunctions::                    executeDML:  54 -    DEBUG - Executing Query {UPDATE config SET _value = ? where _config = ?} with parameters (40, 'failed-state')

2022-06-06 10:30:07,645 - 33714 -           update_b2b::                       install:2700 -    DEBUG - Sending installation data to VAC

2022-06-06 10:30:07,645 - 33714 -          dbfunctions::                        openDb:  35 -     INFO - Opening connection to database

2022-06-06 10:30:07,646 - 33714 -          dbfunctions::                    executeDML:  54 -    DEBUG - Executing Query {SELECT _value from config where _config = ?} with parameters ('ceip_status',)

2022-06-06 10:30:07,646 - 33714 -          dbfunctions::                    executeDML:  54 -    DEBUG - Executing Query {SELECT _value from config where _config = ?} with parameters ('toBuild',)

2022-06-06 10:30:07,646 - 33714 -          dbfunctions::                    executeDML:  54 -    DEBUG - Executing Query {SELECT _value from config where _config = ?} with parameters ('fromBuild',)

2022-06-06 10:30:07,646 - 33714 -          dbfunctions::                       execute:  81 -    DEBUG - Executing {SELECT * FROM meta}

2022-06-06 10:30:07,653 - 33714 -  telemetry_functions::                   send_to_vac:  76 -    DEBUG - Sending data to 'http://localhost:15080/analytics/telemetry/ph/api/hyper/send/?_c=vCSA.7_0&_i=vcsa_patching'

2022-06-06 10:30:07,657 - 33714 -       connectionpool::                     _new_conn: 230 -    DEBUG - Starting new HTTP connection (1): localhost:15080

2022-06-06 10:30:07,658 - 33714 -  telemetry_functions::                   send_to_vac:  87 -    ERROR - Data is not sent, retrying to send the data, retry_count = 1/2

2022-06-06 10:30:07,659 - 33714 -       connectionpool::                     _new_conn: 230 -    DEBUG - Starting new HTTP connection (1): localhost:15080

2022-06-06 10:30:07,660 - 33714 -  telemetry_functions::                   send_to_vac:  87 -    ERROR - Data is not sent, retrying to send the data, retry_count = 2/2

2022-06-06 10:30:07,660 - 33714 -  telemetry_functions::                     telemetry: 125 -    ERROR - Failed to send patching meta data to VAC server, Exception : {Retries are exhausted, giving in}

2022-06-06 10:30:07,661 - 33714 -  update_microservice::                     _runEvent: 391 -    DEBUG - Event callback succeeded

2022-06-06 10:30:27,789 - 33714 -  update_microservice::                 waitForEvents: 455 -     INFO - Exiting by timeout

2022-06-06 10:30:27,789 - 33714 -  update_microservice::                       __del__: 341 -    DEBUG - Closing socket...

2022-06-06 10:30:27,789 - 33714 -  update_microservice::                       __del__: 343 -    DEBUG - Removing sockfile

2022-06-06 10:30:27,789 - 33714 -  update_microservice::                _deletePidFile: 279 -    DEBUG - Removing pid file: /var/run/vmware/applmgmt/update_microservice.pid

-----------------------