VMware Cloud Community
jbjbjbjb
Contributor

esx.conf does not purge historical datastores

Good day,

When a datastore is added to a vSphere environment, it is registered in the ESXi host's 'esx.conf'. That is fine and makes sense. However, when the datastore is removed from the environment (cleanly or not), its entry remains in esx.conf. This can become a major problem when you use VADP snap protect backups, where the 'snapshots' from each VM are temporarily mounted on an ESXi host (typically a non-production host), and you are dealing with a large number of datastores and a busy backup schedule. We have seen over 10,000 stale entries in esx.conf, which eventually leads to ESXi boot times of hours and to major host problems, with various services not starting.

Example:

/storage/lun[naa.68b7b2dc0500bc1dfbf5443aa50a079f]/fromUser = "false"

/storage/lun[naa.68b7b2dc0500bc1dfbf5443aa50a079f]/displayName = "EQLOGIC iSCSI Disk (naa.68b7b2dc0500bc1dfbf5443aa50a079f)"

/storage/lun[naa.6019cb017105540c8a22e5316b735d12]/fromUser = "false"

/storage/lun[naa.6019cb017105540c8a22e5316b735d12]/displayName = "EQLOGIC iSCSI Disk (naa.6019cb017105540c8a22e5316b735d12)"

/storage/lun[naa.68b7b2dc05003ceaa5f5f437a50a27ff]/fromUser = "false"

/storage/lun[naa.68b7b2dc05003ceaa5f5f437a50a27ff]/displayName = "EQLOGIC iSCSI Disk (naa.68b7b2dc05003ceaa5f5f437a50a27ff)"
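
To get a feel for how many LUN entries a host has accumulated, something like the following could be run against a copy of esx.conf. This is only a minimal sketch: it assumes every LUN entry follows the `/storage/lun[<id>]/...` key format shown in the example above, and the function name `count_lun_entries` is made up for illustration.

```python
import re
from collections import Counter

# Matches keys shaped like the example lines: /storage/lun[<id>]/<field> = "..."
LUN_KEY = re.compile(r"^/storage/lun\[([^\]]+)\]/")


def count_lun_entries(conf_text):
    """Count esx.conf lines per LUN identifier (hypothetical helper)."""
    counts = Counter()
    for line in conf_text.splitlines():
        match = LUN_KEY.match(line.strip())
        if match:
            counts[match.group(1)] += 1
    return counts


# The six example lines from the post, used as sample input.
sample = '''
/storage/lun[naa.68b7b2dc0500bc1dfbf5443aa50a079f]/fromUser = "false"
/storage/lun[naa.68b7b2dc0500bc1dfbf5443aa50a079f]/displayName = "EQLOGIC iSCSI Disk (naa.68b7b2dc0500bc1dfbf5443aa50a079f)"
/storage/lun[naa.6019cb017105540c8a22e5316b735d12]/fromUser = "false"
/storage/lun[naa.6019cb017105540c8a22e5316b735d12]/displayName = "EQLOGIC iSCSI Disk (naa.6019cb017105540c8a22e5316b735d12)"
/storage/lun[naa.68b7b2dc05003ceaa5f5f437a50a27ff]/fromUser = "false"
/storage/lun[naa.68b7b2dc05003ceaa5f5f437a50a27ff]/displayName = "EQLOGIC iSCSI Disk (naa.68b7b2dc05003ceaa5f5f437a50a27ff)"
'''

counts = count_lun_entries(sample)
print(len(counts), "distinct LUNs,", sum(counts.values()), "lines")
```

The count alone does not say which entries are stale; it only shows how large the /storage/lun section has grown.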

I have observed this persistent behavior in vSphere 5.0 Update 3 and vSphere 5.1 Update 2.

Has anyone else come across this? Does anyone know a way to fix this automatically? It seems like a poor design choice by VMware.

4 Replies
TBKing
Enthusiast

I've run across something similar with regard to networking, where the conf file held old MAC addresses and that caused some issues. Specifically, I was moving HP blades from one chassis to another and assigning a new Virtual Connect profile without reloading ESX.

The fix for me (for networking) was to delete all of the lines with /net/pnic; on the next boot ESX would repopulate the correct number of NICs and MACs.

Then I would re-setup the networks.

I have no idea whether the same approach would work for storage.

jbjbjbjb
Contributor

TBKing,

To fix it, we had to manually purge all stale LUN entries from the config file. You could probably wipe out ALL entries, but then I assume you would need to re-register the valid LUNs in the config.
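
A minimal sketch of what such a manual purge could look like, working on a copy of the file's text rather than on the live esx.conf. It assumes the `/storage/lun[<id>]/...` key format from the example in the first post. The function name `purge_stale_luns` and the `active_ids` set are hypothetical; how to obtain the list of LUNs that are still valid is not covered here. Hand-editing esx.conf is also described as "not supported" elsewhere in this thread, so treat this as an illustration only.

```python
import re

# Matches keys shaped like the example lines: /storage/lun[<id>]/<field> = "..."
LUN_KEY = re.compile(r"^/storage/lun\[([^\]]+)\]/")


def purge_stale_luns(conf_text, active_ids):
    """Drop /storage/lun lines whose ID is not in active_ids.

    Returns the filtered text and the number of lines removed.
    Lines that are not LUN entries are kept unchanged.
    """
    kept = []
    removed = 0
    for line in conf_text.splitlines():
        match = LUN_KEY.match(line.strip())
        if match and match.group(1) not in active_ids:
            removed += 1
            continue
        kept.append(line)
    return "\n".join(kept) + "\n", removed


# Sample input: four of the example lines from the first post.
sample = (
    '/storage/lun[naa.68b7b2dc0500bc1dfbf5443aa50a079f]/fromUser = "false"\n'
    '/storage/lun[naa.68b7b2dc0500bc1dfbf5443aa50a079f]/displayName = "EQLOGIC iSCSI Disk (naa.68b7b2dc0500bc1dfbf5443aa50a079f)"\n'
    '/storage/lun[naa.6019cb017105540c8a22e5316b735d12]/fromUser = "false"\n'
    '/storage/lun[naa.6019cb017105540c8a22e5316b735d12]/displayName = "EQLOGIC iSCSI Disk (naa.6019cb017105540c8a22e5316b735d12)"\n'
)

# Hypothetical: pretend only this LUN is still presented to the host.
active_ids = {"naa.6019cb017105540c8a22e5316b735d12"}

cleaned, removed = purge_stale_luns(sample, active_ids)
print(removed, "stale lines removed")
```

Filtering against a list of known-good IDs, rather than wiping every entry, avoids having to re-register the valid LUNs afterwards.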

Thanks for chiming in with your issue too. Looks like VMware needs to give esx.conf some TLC :)

IshtarHar
Contributor

We're having the same problem, caused by NetApp SnapMirror. VMware shared an internal KB with us, but its commands didn't work: they didn't list the LUNs, so we couldn't remove them. The other options were to manually edit esx.conf, which is "not supported", or to stop using SnapMirror.

If anyone has an actual fix, that'd be top

ang9999
Contributor

Same problem here. We're using Commvault snap backup + Dell EqualLogic storage, but the result is the same. esx.conf needs to be manually purged every few months, or else the host takes over an hour to boot, and HBA rescans time out and can cause the ESX host to disconnect from vCenter.
