
vCSA File Backup Fails

Posted by daphnissov Dec 9, 2017

I recently hit an issue in my home lab where vCSA 6.5 U1c was failing the file-based backup through the VAMI with the message “BackupManager encountered an exception. Please check logs for details.” A very generic error message, to be sure, and not at all helpful. I checked the relevant log at /var/log/vmware/applmgmt/backup.log and saw the following.

 

2017-12-09 02:46:10,833 [ConfigFilesBackup:PID-38335] ERROR: Encountered an error during ConfigFiles backup.
Traceback (most recent call last):
  File "/usr/lib/applmgmt/backup_restore/py/vmware/appliance/backup_restore/components/ConfigFiles.py", line 223, in BackupConfigFiles
    logger, args.parts)
  File "/usr/lib/applmgmt/backup_restore/py/vmware/appliance/backup_restore/components/ConfigFiles.py", line 132, in _generateConfigListFiles
    tarInclFile.write('%s\n' % entry)
UnicodeEncodeError: 'ascii' codec can't encode character u'\xe9' in position 100: ordinal not in range(128)

 

Very odd, especially the last line about the ASCII codec error. I went and looked at the Python script to see why it might fail at this step, starting with line 132 to see what it was doing.
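For anyone who hasn't run into this class of error before, the last line of the traceback says that an entry containing é (u'\xe9') was being written through an ASCII codec. Here is a minimal sketch of that failure mode. It is not the appliance's code: it is written for Python 3 (the u'...' notation in the log suggests the appliance script is Python 2), and the helper name and the path are made up for illustration.

```python
import io


def write_entries(entries, encoding):
    """Write one path per line to an in-memory file using the given encoding.

    Hypothetical stand-in for a routine that builds a tar include list.
    """
    raw = io.BytesIO()
    out = io.TextIOWrapper(raw, encoding=encoding, newline="\n")
    for entry in entries:
        out.write("%s\n" % entry)
    out.flush()
    return raw.getvalue()


# Made-up path containing a non-ASCII character (é, U+00E9).
entries = ["/storage/example/caf\u00e9.iso"]

try:
    write_entries(entries, "ascii")
    ascii_error = None
except UnicodeEncodeError as exc:
    # Same exception type as in the backup log.
    ascii_error = exc

# The identical entry is written without complaint as UTF-8.
utf8_bytes = write_entries(entries, "utf-8")

print("ascii failed:", ascii_error)
print("utf-8 wrote:", utf8_bytes)
```

The point is only that a single accented character anywhere in a path is enough to stop an ASCII-only writer, regardless of what the rest of the path looks like.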

 

 

After some more looking through the script, it looks fine. It’s stripping characters off paths to build the file list. Nothing unusual. I just do a cursory check around the vCSA filesystem to see what might be going on. Something catches my eye when I do a df -h /.

 

 

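It might be difficult to see, but the last entry is a UNC path to an SMB share. I then remember that I created a Content Library that is connected via SMB to my Synology. Looking at the Python again, my guess is that it's not handling the two forward slashes well and bombing out because of it, although the traceback itself complains about a non-ASCII character (é, u'\xe9') in one of the entries, so an accented file name on the share may be the real trigger. Either way, the share is the odd one out. Up to the point of failure, the backup task itself is pretty simple.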
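If you'd rather script this check than eyeball df output, something like the following could work. It's a rough sketch, assuming a Python 3 interpreter wherever you run it: one function picks network filesystems out of text in the /proc/mounts layout, and another flags paths that can't be encoded as ASCII, which is what the u'\xe9' in the traceback points at. The function names, the sample mount table, and the sample paths are all invented for illustration.

```python
import os


def network_mounts(mount_table_text, fs_types=("cifs", "smb3", "nfs", "nfs4")):
    """Return (source, mount_point, fs_type) for network filesystems.

    Expects text in the /proc/mounts layout: source, mount point, type, ...
    """
    found = []
    for line in mount_table_text.splitlines():
        fields = line.split()
        if len(fields) >= 3 and fields[2] in fs_types:
            found.append((fields[0], fields[1], fields[2]))
    return found


def is_ascii(path):
    """True if the path survives an ASCII encode, False otherwise."""
    try:
        path.encode("ascii")
    except UnicodeEncodeError:
        return False
    return True


def non_ascii(paths):
    """Keep only the paths an ASCII-only writer would choke on."""
    return [p for p in paths if not is_ascii(p)]


def walk_paths(root):
    """Yield every directory and file path below root."""
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            yield os.path.join(dirpath, name)


# Made-up sample data standing in for a real mount table and share contents.
SAMPLE_MOUNTS = """\
/dev/sda3 / ext4 rw,relatime 0 0
//nas.example/library /mnt/example-share cifs rw,relatime 0 0
"""
SAMPLE_PATHS = [
    "/mnt/example-share/ubuntu.iso",
    "/mnt/example-share/caf\u00e9.ova",
]

shares = network_mounts(SAMPLE_MOUNTS)
suspects = non_ascii(SAMPLE_PATHS)

print(shares)
# ascii() keeps the output printable even on an ASCII-only terminal.
print(ascii(suspects))
```

On a real system you would feed network_mounts the contents of /proc/mounts and pass walk_paths(mount_point) to non_ascii for each share it returns. Walking a large share over the network can be slow, so treat it as a one-off diagnostic.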

 

 

 

 

 

The second screenshot in the backup wizard was initially alarming as the “common” part was showing 0 MB in size, which clearly isn’t right. When running the backup, it fails in very short order and produces no files in the destination path.

I delete the Content Library, make sure it’s unmounted from the filesystem, then attempt the backup again. Complete success.

 

TL;DR: vCSA file-based backup has a bug that causes it to fail if you have a Content Library mounted over SMB. Remove or unmount the Content Library in order to have the backup succeed.

Not a long post, actually more like a "note to self" moment but figured I'd post it where others can see. Probably won't be seen or cared about, but this was really really odd and thought I should write it down for posterity's sake. Anywho...

 

I recently did some major networking reconfigurations in my home lab that involved decommissioning an older D-Link (yes, I know) managed 1 GbE switch in favor of a much nicer L3 Dell PowerConnect 6248 that I got for a song on eBay. I wanted to move to the L3 switch so my lab is better aligned with a real enterprise environment and lets me do just about everything you'd see in one, at least from the perspective of VMware technologies, including, most importantly, NSX. As part of this process, I wanted to move entirely away from vSSs for VM communication and on to a vDS using various VLAN-backed port groups. I'm an extensive vRA user/tester/developer, so vRA is plumbed into everything. Once I got the new switch configured, up and running, and everything migrated to it, I created the new vDS and port groups. Each host had a new uplink dedicated to this vDS. The existing standard switches still had the default "VM Network" port group on vSwitch0. Once the new vDS was up, all VMs were migrated over to it. Templates were migrated next. All good, no interruptions.

 

Once the vDS migration was complete, I reconfigured my vRA lab environment to consume the VLANs through my reservations. At the same time, I deactivated the legacy "VM Network" port group on the standard switches, and later deleted it from all hosts. After a couple of provisioning tests, the VMs weren't landing on the vDS port group they should have. Strange, I thought. I also noticed in the Networking inventory view of my vCenter that a phantom VM Network port group remained, but with no hosts showing. wtf.jpg??

Now, I've encountered this in the past, but it was usually due to templates not being updated. I made sure the templates were updated and reconfirmed no ports were active for any VMs or templates. I also checked each host with esxcli to verify this, and there was no VM Network port group. I also checked vCenter via PowerCLI and *still* no VM Network port group. I decided to give vCenter the ol' razzle dazzle and reboot the sucker. Once back up, the VM Network port group was still freaking there! At this point I was thinking that something in Postgres was jacked and I'd have to excise the tumor manually.

But I also suspected the problem was somehow linked to my vRA-provisioned test machines not getting the correct VLAN. Then I remembered that I'm using linked clones to speed up provisioning. When I took those snapshots, I hadn't yet introduced the vDS, so the templates were joined to VM Network at the time. I'd assumed that shouldn't matter, since vRA should be reconfiguring them via the customization that occurs. I converted the templates to VMs, deleted the snapshots, ensured they were joined (again) to the vDS port group of my choice, and re-snapshotted. Once I deleted those snapshots, the phantom VM Network port group disappeared. And after running a data collection in vRA yet again and updating my blueprints to select the new snapshot name, provisioning was working to the correct port group in my reservations.
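What this suggests is that a snapshot carries its own copy of the VM's network attachment, so an old port group can stay referenced even after the VM's current configuration has moved on. The sketch below shows the kind of check that would have saved me some time: walk each snapshot tree and report every snapshot that still points at a given port group. It runs over a simplified, hypothetical data model. The Snapshot class and the port group names are invented, and pulling the real data out of vCenter (PowerCLI, an SDK, whatever you prefer) is not shown.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Snapshot:
    """Hypothetical, stripped-down view of one snapshot in a tree."""
    name: str
    networks: List[str]  # port groups the NICs pointed at when it was taken
    children: List["Snapshot"] = field(default_factory=list)


def snapshots_on_network(roots, network_name, _prefix=""):
    """Return the tree path of every snapshot that references network_name."""
    hits = []
    for snap in roots:
        path = _prefix + "/" + snap.name
        if network_name in snap.networks:
            hits.append(path)
        # Child snapshots record their own configuration, so check them too.
        hits.extend(snapshots_on_network(snap.children, network_name, path))
    return hits


# Made-up example: the base snapshot predates the move to the vDS.
tree = [
    Snapshot(
        name="base",
        networks=["VM Network"],
        children=[Snapshot(name="patched", networks=["vds-vlan-10"])],
    )
]

stale = snapshots_on_network(tree, "VM Network")
print(stale)
```

Anything this turns up is a snapshot that would need to be deleted and re-taken before the old port group can go away.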

 

Anyway, super odd and something I've never run across before. Hopefully it helps someone (probably won't, though).

 

TL;DR - If you have snapshots of VMs or templates that were taken while they were still attached to a port group on an old vSS, you'll need to delete (commit) those snapshots, or else phantom port groups may remain. vRA may also fail to provision linked clones to the correct port groups until you do.