Hi,
Upgraded one of our hosts from 3.0.2 to 3.5 and patched up with the (currently) 4 critical patches.
The host in VI client (VC or directly to host) shows NFS datastores as inactive.
Vmotion is unable to move VMs to this host.
Double clicking on the NFS datastore opens up and shows the VM data fine.
Refresh in VI (either on the network page or by right-clicking the datastore) doesn't fix it - it may update the free space, but the datastore remains "inactive" and greyed out.
On the service console, vdf shows the mounts just fine, and I can enter into the /vmfs/volumes/[datastore] folder and read/write just fine.
The only way to regain connectivity is to delete the NFS datastore mounts and re-add them (a time-consuming process). Then it works again until the next reboot, and they're inactive once more!
This is connected to an NFS export on a NetApp filer, and I've not seen this issue with the other 3.0.x hosts. I've searched the logs in /var/log/ for "nfs" and "nas" but found nothing of much interest (though I'm not really sure where to look).
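For anyone hitting the same wall, the delete-and-re-add cycle (plus a log check) from the service console looks roughly like this; note the filer name, export path, and datastore label below are placeholders for your own setup, and flags may vary by build (see `esxcfg-nas -h`):

```shell
# List configured NAS datastores and their state
esxcfg-nas -l

# Remove the stuck datastore and re-add it
# ("netapp01", "/vol/vmware", and "nfs_ds1" are placeholders)
esxcfg-nas -d nfs_ds1
esxcfg-nas -a -o netapp01 -s /vol/vmware nfs_ds1

# Search the VMkernel log for NFS-related messages
grep -iE 'nfs|nas' /var/log/vmkernel
```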
Any ideas?
Regards,
Keith.
oh, and "esxcfg-nas -r" doesn't help either 😕
I have the same problem as you. I'm on ESX 3.5 Update 1 Build 98103. The other 4 hosts that were patched connect to the NAS just fine, but one host will not. The only way I have seen around this is to reboot the NAS (fortunately it's only holding my ISOs and is a Windows 2003 R2 server).
Same situation though: I can browse it through the console and through the VIC, but VirtualCenter says it's "inactive" and I can't point any VM's CD-ROM at any ISOs.
I too am having the same problem. ESX 3.5 Build 98103
Had no problem adding the NFS mount initially... then after a reboot, the NFS mount is stuck as (Inactive).
vmkping to the nfs server works fine. If I delete the NFS mount and try to add it again, that doesn't work either. I have other ESX servers that are still using the NFS, so I am scared to reboot them now.
The log on the NFS server shows a successful connection, but the log on the ESX host shows an error 13 (timeout).
Any ideas what to look for next? I tried esxcfg-firewall -e nfsClient - just in case.
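For anyone else debugging this, the connectivity and firewall checks mentioned above look roughly like this from the service console (the server address is a placeholder for your NFS server):

```shell
# Check VMkernel-level reachability of the NFS server
# (ordinary ping tests the service console, not the VMkernel path)
vmkping 192.168.1.50

# Query the nfsClient firewall service, and enable it if blocked
esxcfg-firewall -q nfsClient
esxcfg-firewall -e nfsClient
```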
Jim
I am also having the same issue. Was there ever a response to this with a solution?
Any help would be appreciated.
I'm having this same problem with my SnapServer 520 and have had it for a while; ESXi 3.5 is installed. Every time I reboot the SnapServer, the NFS datastore goes "inactive" in ESX and can't be made active, and I have to delete the VMs from inventory, delete the datastore, re-add the VMs to inventory, then restart them. This is a major PITA and causes me to lose the historical performance data. It is also forcing me to use local storage for my VMs, which I'm very frustrated by.
I got it!!!!!! It's a race condition. My suspicion is that the networking isn't stabilized by the time the management daemon starts. We run ESXi 3.5, Update 3, and I don't recall if ESX was affected or not; it's been a while since we switched.
Just like so many others, things would be great until a reboot, then the NFS datastores would appear as Inactive. We could double-click on them and browse them, and esxcfg-nas -r wouldn't help.
Then what? Restarting hostd would do it:
/etc/init.d/hostd restart
But how to do that at startup, when it's already supposed to be starting up? A sleep statement in the hostd init script.
vi /etc/init.d/hostd
In the start section, right before the "setsid watchdog.sh ..." line (around line 16), put a "sleep 60s". I didn't have much time to test, but I did find that 30 seconds wasn't adequate with my setup; yours may vary, so play around. It already takes long enough for these boxes to reboot; 60 seconds really doesn't add that much.
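If you'd rather script the change than edit it in vi, a sed pass along these lines should do it. This is a sketch: it assumes the target line really begins with `setsid watchdog.sh` as described, and ESXi's busybox sed may differ slightly from GNU sed, so keep the backup.

```shell
# Keep a backup of the original init script
cp /etc/init.d/hostd /etc/init.d/hostd.orig

# Insert "sleep 60s" immediately before the watchdog line,
# preserving the original indentation
sed 's/^\( *\)setsid watchdog.sh/\1sleep 60s\n\1setsid watchdog.sh/' \
    /etc/init.d/hostd.orig > /etc/init.d/hostd
```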
By the way, we're using a NetApp FAS270 connected with gigabit ethernet...
Good luck fellow ESXers!
I had this problem on ESXi 3.5.0 (build 110271) running standalone with two NFS mounts to a NetApp FAS2050 until I switched from DHCP to a static IP address for the VMkernel interface. It seems ESXi tried to connect to the storage before it had a DHCP lease and never tried again.
My NFS datastores would come up as "inactive" until I refreshed them in the VI Client (Configuration -> Storage), but obviously no VMs would autostart because of this.
Hope it helps!
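For reference, switching the VMkernel interface from DHCP to a static address can also be done from the console with esxcfg-vmknic. This is only a sketch: the port group name, IP, and netmask below are placeholders, and the VI Client (Configuration -> Networking -> port group properties) is the supported route.

```shell
# Show current VMkernel NICs and their addresses
esxcfg-vmknic -l

# Delete the DHCP-configured NIC and re-add it with a static address
# ("VMkernel" is the port group name; IP/netmask are placeholders)
esxcfg-vmknic -d "VMkernel"
esxcfg-vmknic -a -i 192.168.1.10 -n 255.255.255.0 "VMkernel"
```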
Well, that DOES make sense, as far as a race condition goes... but we're just too big to keep track of things like that. There are thirty or forty subnets and at least 700 addressable hosts.
We use a homebrew DHCP/DNS management system that generates the necessary config files for named and dhcpd. Everybody gets their address from DHCP, even though it's a fixed lease...
So, in my opinion, VMware should remove the DHCP client option from their product if it doesn't work right...
Does ANYONE have DHCP working right with their ESX[i] setups? Setups with shared NAS storage? Our dhcpd responds quickly to requests every time, much like a simpler router with a single subnet would, so I'm curious to see whether this condition is limited to people with something close to this specific setup: a NetApp FAS product with ESXi...
Thank you for your suggestion. As long as I put the fixed lease into our management software, I can set the blades to the same addresses statically. That might be easier and more intuitive than editing the hostd file...
Mojo
The problem I was having was when someone rebooted the host, not the NAS. Using ESXi, I was able to create a workaround (it should be detailed above), but it involves entering the 'unsupported' shell and modifying a file. Since it's your NAS rebooting and not your host, and I'm not sure whether you're using ESX or ESXi, I'm not sure what to recommend...
Hi Toonix
I am seeing your problem too (NFS datastore inactive after NAS reboot - not quite the same as original poster I know). Did you find a resolution?
cheers
sorry, duplicate
Kind of, but only if you're using ESXi. Edit /etc/init.d/hostd and add "sleep 60s" right before the "setsid watchdog.sh ...." line... but not in ESX, sorry.
Hi elMojo
Thanks for your quick response. As it happens, I am in fact on ESXi, not ESX. On the downside, I have the problem on NAS reboot rather than ESXi host reboot... I see your fix will address the latter by delaying boot. I guess the root cause of these problems is the same: ESX(i) doesn't retry the connection to an NFS datastore?
Oops, sorry, you were asking Toonix :smileygrin:
Well, I'm not sure my fix will help you -- in my case, ESXi is attempting to connect to the NAS before ESXi's networking is up! Funny. But in your case, ESXi's networking was up and remained up; maybe it just didn't feel like trying again...
I'm running ESX 3i 3.5.0 and have run into the same issue, where the NAS is inactive after rebooting my host. Refresh does not work, and I didn't bother going into the restricted shell.
My datastore is actually an NFS mount from another Linux system. To resolve my problem, I simply restarted the NFS daemon on the Linux system, and my datastores came up automatically.
Not a very complicated setup, but if someone is playing with 3i like I am, using NFS for a datastore, and comes across this problem, this should be a simple fix.
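On the Linux side, the restart jav describes is roughly the following. The init script name is an assumption that varies by distro (e.g. nfs-kernel-server on Debian/Ubuntu, nfsserver on SUSE):

```shell
# On the Linux NFS server (not the ESX host):
# re-export all shares, then restart the NFS daemon
exportfs -ra
/etc/init.d/nfs restart   # Debian/Ubuntu: /etc/init.d/nfs-kernel-server restart
```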
jav
It's just not very automatic... and for some reason restarting the NFS server didn't work for me! Not sure why. We have three NFS datastores, and it is imperative that it all happens automagically. Thanks for the info though.
On Thu, 30 Apr 2009 19:42:15 -0700, javelin
Has this issue been resolved by ESXi 4.1? I'm having the same issues in ESXi 4.0.
Today, I had to take down my NFS share (Synology RS810RP+ running DSM 2.3-1161) and move it to another rack. Doing so meant it was down for a few minutes. When I brought it back up, all 4 of my ESXi hosts could not start any of the guests that used that NFS share. We're only testing with this thing and the guests are NOT mission-critical, so I didn't bother rebooting the hosts. I did try doing a /etc/init.d/hostd restart, but that didn't help. I'll try rebooting the hosts in the morning, but this needs to be fixed. If it's fixed in the 4.1 update, then I'll be golden!
The release notes for the 4.1 update state that tweaks have been made to "improve" NFS capabilities.
2010-09-01
1549 EDT