<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:clearspace="http://www.jivesoftware.com/xmlns/clearspace/rss" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:wfw="http://wellformedweb.org/CommentAPI/" xmlns:opensearch="http://a9.com/-/spec/opensearch/1.1/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>VMware Communities: Message List - NFS Failover issue</title>
    <link>http://communities.vmware.com/community/vmtn/mgmt/vc?view=discussions</link>
    <description>Most recent forum messages</description>
    <language>en</language>
    <pubDate>Wed, 16 Sep 2009 04:17:39 GMT</pubDate>
    <generator>Clearspace 1.10.12 (http://jivesoftware.com/products/clearspace/)</generator>
    <dc:date>2009-09-16T04:17:39Z</dc:date>
    <dc:language>en</dc:language>
    <item>
      <title>Re: NFS Failover issue</title>
      <link>http://communities.vmware.com/message/1364911?tstart=0#1364911</link>
      <description>&lt;br /&gt;
Sure...no problem....&lt;br /&gt;
&lt;p /&gt;
#1 -  3.5 Update 4 as well.&lt;br /&gt;
&lt;p /&gt;
#2 - yes, just the settings recommended by NetApp in TR-3428 (also applied automatically by installing the free NetApp Host Utilities for ESX).&lt;br /&gt;
&lt;p /&gt;
#3 - no, no jumbo frames....and surprisingly no real issues around that at all.&lt;br /&gt;
&lt;p /&gt;
#4 - via GUI I'm afraid....happy to review any CLI commands you're using though.</description>
      <pubDate>Wed, 16 Sep 2009 04:17:39 GMT</pubDate>
      <author>andriven</author>
      <guid>http://communities.vmware.com/message/1364911?tstart=0#1364911</guid>
      <dc:date>2009-09-16T04:17:39Z</dc:date>
      <clearspace:dateToText>2 months, 1 week ago</clearspace:dateToText>
    </item>
    <item>
      <title>Re: NFS Failover issue</title>
      <link>http://communities.vmware.com/message/1351304?tstart=0#1351304</link>
      <description>&lt;br /&gt;
Hi &lt;a class="jive-link-profile" href="http://communities.vmware.com/people/andriven"&gt;andriven&lt;/a&gt;,&lt;br /&gt;
&lt;p /&gt;
&lt;p /&gt;
&lt;p /&gt;
Great news! I'm still in the process of building the ESX over NFS setup here. It's nice to know that everything on your side is going as planned.&lt;br /&gt;
&lt;p /&gt;
 If you don't mind, I'd like to ask a few questions.&lt;br /&gt;
&lt;p /&gt;
 a) Which version of ESX are you using? I'm working with ESX 3.5 update 4.&lt;br /&gt;
&lt;p /&gt;
b) Have you changed anything in the NFS Advanced Settings?&lt;br /&gt;
&lt;p /&gt;
c) Have you enabled Jumbo Frames?&lt;br /&gt;
&lt;p /&gt;
d) Are you configuring your ESX machines via CLI? If so, then I'd like to share with you how I do things. This way we can compare/share ideas.&lt;br /&gt;
&lt;p /&gt;
 Thanks,&lt;br /&gt;
&lt;p /&gt;
&lt;p /&gt;
&lt;p /&gt;
David</description>
      <pubDate>Mon, 31 Aug 2009 21:13:05 GMT</pubDate>
      <author>drobilla</author>
      <guid>http://communities.vmware.com/message/1351304?tstart=0#1351304</guid>
      <dc:date>2009-08-31T21:13:05Z</dc:date>
      <clearspace:dateToText>2 months, 3 weeks ago</clearspace:dateToText>
      <clearspace:replyCount>1</clearspace:replyCount>
    </item>
    <item>
      <title>Re: NFS Failover issue</title>
      <link>http://communities.vmware.com/message/1349866?tstart=0#1349866</link>
      <description>Incidentally, I'm not sure if you're still having issues with this, but I can state from personal experience that NFS can be VERY resilient for cluster and/or switch failover. I just did a NetApp ONTAP upgrade last weekend (which involves cluster failover) on an 8-host ESX HA/DRS cluster connecting to a NetApp 3050 via NFS, with zero issues with VMs.&lt;br /&gt;
&lt;br /&gt;
The thread here is a little bit old so I won't go into anything deep technically...but just wanted to put out there that from my experience there's nothing technically/conceptually intractable about using NFS with VMware/NetApp.</description>
      <pubDate>Sat, 29 Aug 2009 05:12:18 GMT</pubDate>
      <author>andriven</author>
      <guid>http://communities.vmware.com/message/1349866?tstart=0#1349866</guid>
      <dc:date>2009-08-29T05:12:18Z</dc:date>
      <clearspace:dateToText>2 months, 3 weeks ago</clearspace:dateToText>
      <clearspace:replyCount>2</clearspace:replyCount>
    </item>
    <item>
      <title>Re: NFS Failover issue</title>
      <link>http://communities.vmware.com/message/1325409?tstart=0#1325409</link>
      <description>&lt;br /&gt;
@&lt;a class="jive-link-profile" href="http://communities.vmware.com/people/michael12345"&gt;michael12345&lt;/a&gt;&lt;br /&gt;
&lt;p /&gt;
 Hello Michael,&lt;br /&gt;
&lt;p /&gt;
 I'm not sure it will help you, but I figure it can't hurt. Take a look at this post: &lt;a class="jive-link-external" href="http://virtualgeek.typepad.com/virtual_geek/2009/06/a-multivendor-post-to-help-our-mutual-nfs-customers-using-vmware.html"&gt;http://virtualgeek.typepad.com/virtual_geek/2009/06/a-multivendor-post-to-help-our-mutual-nfs-customers-using-vmware.html&lt;/a&gt;&lt;br /&gt;
&lt;p /&gt;
 They actually discuss your issue in the comments.&lt;br /&gt;
&lt;p /&gt;
There are also a few other good posts related to VMware ESX and NFS.&lt;br /&gt;
&lt;p /&gt;
&lt;a class="jive-link-external" href="http://blog.scottlowe.org/2008/04/22/esx-server-ip-storage-and-jumbo-frames/"&gt;ESX Server, IP Storage, and Jumbo Frames&lt;/a&gt; &lt;br /&gt;
&lt;p /&gt;
&lt;a class="jive-link-external" href="http://blog.scottlowe.org/2008/09/05/setting-vmware-esx-vswitch-load-balancing-policy-via-cli/"&gt;Setting VMware ESX vSwitch Load Balancing Policy via CLI&lt;/a&gt; &lt;br /&gt;
&lt;p /&gt;
&lt;a class="jive-link-external" href="http://blog.scottlowe.org/2006/12/04/esx-server-nic-teaming-and-vlan-trunking/"&gt;ESX Server, NIC Teaming, and VLAN Trunking&lt;/a&gt; &lt;br /&gt;
&lt;p /&gt;
 Good luck,&lt;br /&gt;
&lt;p /&gt;
 David &lt;br /&gt;
&lt;p /&gt;
&lt;p /&gt;
&lt;p /&gt;
&lt;br /&gt;</description>
      <pubDate>Thu, 30 Jul 2009 18:00:23 GMT</pubDate>
      <author>drobilla</author>
      <guid>http://communities.vmware.com/message/1325409?tstart=0#1325409</guid>
      <dc:date>2009-07-30T18:00:23Z</dc:date>
      <clearspace:dateToText>3 months, 3 weeks ago</clearspace:dateToText>
      <clearspace:replyCount>3</clearspace:replyCount>
    </item>
    <item>
      <title>Re: NFS Failover issue</title>
      <link>http://communities.vmware.com/message/1253152?tstart=0#1253152</link>
      <description>No, I am still working on this problem and on how to overcome the shortfalls of NFS. Some days I wish I had just gone with Fibre Channel... the added cost would be worth having fewer problems at this point.</description>
      <pubDate>Fri, 15 May 2009 15:24:14 GMT</pubDate>
      <author>michael12345</author>
      <guid>http://communities.vmware.com/message/1253152?tstart=0#1253152</guid>
      <dc:date>2009-05-15T15:24:14Z</dc:date>
      <clearspace:dateToText>6 months, 1 week ago</clearspace:dateToText>
      <clearspace:replyCount>4</clearspace:replyCount>
    </item>
    <item>
      <title>Re: NFS Failover issue</title>
      <link>http://communities.vmware.com/message/1252949?tstart=0#1252949</link>
      <description>&lt;br /&gt;
@JMorz&lt;br /&gt;
&lt;p /&gt;
 Did you ever get an answer/resolution for this?</description>
      <pubDate>Fri, 15 May 2009 12:36:36 GMT</pubDate>
      <author>pgifford</author>
      <guid>http://communities.vmware.com/message/1252949?tstart=0#1252949</guid>
      <dc:date>2009-05-15T12:36:36Z</dc:date>
      <clearspace:dateToText>6 months, 1 week ago</clearspace:dateToText>
      <clearspace:replyCount>5</clearspace:replyCount>
    </item>
    <item>
      <title>Re: NFS Failover issue</title>
      <link>http://communities.vmware.com/message/1252747?tstart=0#1252747</link>
      <description>Something to think about with this problem is that the NetApp is not true active/active. Yes, it will share NVRAM, but it will not share aggregates, so if you have VMs talking to Filer 0 and you drop the network path to Filer 0, the system will not be able to talk to the aggregate (LUN/NFS volume). The NetApp will change aggregate owners only if the second filer can see a problem.&lt;br /&gt;
&lt;br /&gt;
If a request for a file comes to Filer 1, it will hand off the request to Filer 0. If Filer 0 can't talk to you, that's the end of that request; Filer 0 will not send I/O from the aggregates it owns through Filer 1. &lt;br /&gt;
&lt;p /&gt;
I would still use failover even with LACP: you should have NICs bonded with LACP, and then a second set of NICs on a second switch bonded with LACP, using failover and spanning tree blocking. This way, if a switch fails, you have a second link source. Also, I'm not sure how NFS handles dropped packets. &lt;br /&gt;
&lt;br /&gt;
If something has changed on the NetApp, please let me know, but this is how I understand it to work on my NetApp 3070. Thanks.</description>
      <pubDate>Fri, 15 May 2009 05:40:33 GMT</pubDate>
      <author>michael12345</author>
      <guid>http://communities.vmware.com/message/1252747?tstart=0#1252747</guid>
      <dc:date>2009-05-15T05:40:33Z</dc:date>
      <clearspace:dateToText>6 months, 1 week ago</clearspace:dateToText>
    </item>
    <item>
      <title>Re: NFS Failover issue</title>
      <link>http://communities.vmware.com/message/1192701?tstart=0#1192701</link>
      <description>&lt;br /&gt;
  Hi,&lt;br /&gt;
&lt;p /&gt;
In answer to your queries, I got the following responses back from our storage group:&lt;br /&gt;
&lt;p /&gt;
 Do you have the latest NetApp Host Utilities installed on your hosts?  &lt;br /&gt;
&lt;p /&gt;
Not sure of the answer to this one; I will check with my colleague in the storage team again.&lt;br /&gt;
&lt;p /&gt;
 Have you verified that the cluster failover on the filer is working correctly?&lt;br /&gt;
&lt;p /&gt;
We have not had a filer failover yet. As we are active/active, and in a switch failure only one link in the team fails, the filer does NOT fail over. With LACP, all the traffic going down the previously active link should be retransmitted down the remaining active link.&lt;br /&gt;
&lt;p /&gt;
&lt;p /&gt;
&lt;p /&gt;
 Which cfmode are you using?&lt;br /&gt;
&lt;p /&gt;
Negotiated failover enabled (network_interface).&lt;br /&gt;
&lt;p /&gt;
&lt;p /&gt;
&lt;p /&gt;
 What version of DOT are you running?&lt;br /&gt;
&lt;p /&gt;
7.2.4&lt;br /&gt;
&lt;p /&gt;
&lt;p /&gt;
&lt;p /&gt;
 Can you please post your VIF configurations?&lt;br /&gt;
&lt;p /&gt;
SANsideA: 2 links, transmit 'IP Load balancing', VIF Type 'lacp' fail 'default'&lt;br /&gt;
&lt;p /&gt;
         VIF Status     Up      Addr_set&lt;br /&gt;
&lt;p /&gt;
        up:&lt;br /&gt;
&lt;p /&gt;
        e2a: state up, since 22Feb2009 20:09:10 (14+13:25:28)&lt;br /&gt;
&lt;p /&gt;
                mediatype: auto-10g_sr-fd-up&lt;br /&gt;
&lt;p /&gt;
                flags: enabled&lt;br /&gt;
&lt;p /&gt;
                active aggr, aggr port: e2b&lt;br /&gt;
&lt;p /&gt;
                input packets 5621928542, input bytes 4610125046907&lt;br /&gt;
&lt;p /&gt;
                input lacp packets 45337, output lacp packets 41936&lt;br /&gt;
&lt;p /&gt;
                output packets 6658048095, output bytes 7300071603521&lt;br /&gt;
&lt;p /&gt;
                up indications 2, broken indications 0&lt;br /&gt;
&lt;p /&gt;
                drops (if) 0, drops (link) 0&lt;br /&gt;
&lt;p /&gt;
                indication: up at 22Feb2009 20:09:10&lt;br /&gt;
&lt;p /&gt;
                        consecutive 0, transitions 2&lt;br /&gt;
&lt;p /&gt;
        e2b: state up, since 22Feb2009 20:09:10 (14+13:25:28)&lt;br /&gt;
&lt;p /&gt;
                mediatype: auto-10g_sr-fd-up&lt;br /&gt;
&lt;p /&gt;
                flags: enabled&lt;br /&gt;
&lt;p /&gt;
                active aggr, aggr port: e2b&lt;br /&gt;
&lt;p /&gt;
                input packets 6394679077, input bytes 4682641181916&lt;br /&gt;
&lt;p /&gt;
                input lacp packets 45326, output lacp packets 41936&lt;br /&gt;
&lt;p /&gt;
                output packets 6694968077, output bytes 7436061192087&lt;br /&gt;
&lt;p /&gt;
                up indications 2, broken indications 0&lt;br /&gt;
&lt;p /&gt;
                drops (if) 0, drops (link) 0&lt;br /&gt;
&lt;p /&gt;
                indication: up at 22Feb2009 20:09:10&lt;br /&gt;
&lt;p /&gt;
                        consecutive 0, transitions 2&lt;br /&gt;
&lt;p /&gt;
Many thanks!&lt;br /&gt;
&lt;p /&gt;
John</description>
      <pubDate>Mon, 09 Mar 2009 10:12:27 GMT</pubDate>
      <author>JMorz</author>
      <guid>http://communities.vmware.com/message/1192701?tstart=0#1192701</guid>
      <dc:date>2009-03-09T10:12:27Z</dc:date>
      <clearspace:dateToText>8 months, 2 weeks ago</clearspace:dateToText>
      <clearspace:replyCount>1</clearspace:replyCount>
    </item>
    <item>
      <title>Re: NFS Failover issue</title>
      <link>http://communities.vmware.com/message/1191478?tstart=0#1191478</link>
      <description>&lt;br /&gt;
Hi,&lt;br /&gt;
&lt;p /&gt;
Do you have the latest NetApp Host Utilities installed on your hosts?  &lt;br /&gt;
&lt;p /&gt;
 Have you verified that the cluster failover on the filer is working correctly?&lt;br /&gt;
&lt;p /&gt;
 Which cfmode are you using?&lt;br /&gt;
&lt;p /&gt;
What version of DOT are you running?&lt;br /&gt;
&lt;p /&gt;
 Can you please post your VIF configurations?&lt;br /&gt;
&lt;p /&gt;
 Thanks&lt;br /&gt;
&lt;p /&gt;
Paul</description>
      <pubDate>Fri, 06 Mar 2009 21:42:48 GMT</pubDate>
      <author>pgifford</author>
      <guid>http://communities.vmware.com/message/1191478?tstart=0#1191478</guid>
      <dc:date>2009-03-06T21:42:48Z</dc:date>
      <clearspace:dateToText>8 months, 3 weeks ago</clearspace:dateToText>
      <clearspace:replyCount>2</clearspace:replyCount>
    </item>
    <item>
      <title>NFS Failover issue</title>
      <link>http://communities.vmware.com/message/1189834?tstart=0#1189834</link>
      <description>&lt;br /&gt;
Hi folks,&lt;br /&gt;
&lt;p /&gt;
Wondering if anyone on the boards has experienced this particular issue. A quick run-through of our environment:&lt;br /&gt;
&lt;p /&gt;
 7 node cluster running VirtualCenter 2.5 Update 3&lt;br /&gt;
&lt;p /&gt;
Hardware: 7 x DL585 G2s (BIOS up to date)&lt;br /&gt;
&lt;p /&gt;
Storage: NetApp FAS3070c - NFS mounts used for storage&lt;br /&gt;
&lt;p /&gt;
Each host running ESX 3.5 Update 3 (4 critical patches added-on)&lt;br /&gt;
&lt;p /&gt;
150 virtual machines running&lt;br /&gt;
&lt;p /&gt;
5 vSwitches per host (each with 2 pNICs patched to 2 separate physical network switches (2 x Catalyst 6509))&lt;br /&gt;
&lt;p /&gt;
vSwitch Configuration&lt;br /&gt;
&lt;p /&gt;
Load Balancing:  Route based on the Originating Virtual Port ID&lt;br /&gt;
&lt;p /&gt;
Network Failover Detection: Link Status only&lt;br /&gt;
&lt;p /&gt;
Notify Switches: Yes&lt;br /&gt;
&lt;p /&gt;
Failback: Yes&lt;br /&gt;
&lt;p /&gt;
1 vSwitch SC&lt;br /&gt;
&lt;p /&gt;
1 vSwitch VMotion (private VLAN)&lt;br /&gt;
&lt;p /&gt;
1 vSwitch VM Network&lt;br /&gt;
&lt;p /&gt;
1 vSwitch NFS (separate VLAN)&lt;br /&gt;
&lt;p /&gt;
1 vSwitch VM Network (redundant)&lt;br /&gt;
&lt;p /&gt;
Have run repeated physical cable checks to ensure the vmnics are patched properly; all check out and are running in their proper VLANs.&lt;br /&gt;
&lt;p /&gt;
HA/DRS/VMotion all running fine.&lt;br /&gt;
&lt;p /&gt;
Logged a ticket with VMware to verify our storage configuration was fine (included settings made from the NetApp Best Practices guide for ESX) - confirmed running best practices.&lt;br /&gt;
&lt;p /&gt;
VMs generally running fine (no disk errors reported)&lt;br /&gt;
&lt;p /&gt;
Issue:&lt;br /&gt;
&lt;p /&gt;
When performing a single network switch outage, around a quarter of the VMs lose access to their VMDKs (effectively, if you go to the console of the VM, it will display a PXE boot message in DOS). Failover of traffic from one switch to the other can take over 15 minutes on average.&lt;br /&gt;
&lt;p /&gt;
Now, to replicate the issue, both the "active" NFS vSwitch vmnic &lt;i&gt;and&lt;/i&gt; the 10Gb fibre connection running from that same physical switch to the NetApp filer need to be unplugged; the issue does not occur if just one is unplugged.&lt;br /&gt;
&lt;p /&gt;
Tried:&lt;br /&gt;
&lt;p /&gt;
Setting Failback to No&lt;br /&gt;
&lt;p /&gt;
Active/Standby for NFS vSwitch vmnics&lt;br /&gt;
&lt;p /&gt;
Made no difference.&lt;br /&gt;
&lt;p /&gt;
Have tested in a lab environment using a single ESX host and 30 dummy VMs&lt;br /&gt;
&lt;p /&gt;
We see the VMs hang for roughly 2 minutes before coming back to life; in the event logs, a series of disk errors (symmpi) is reported during the hang period.&lt;br /&gt;
&lt;p /&gt;
Our networking group have confirmed Portfast is enabled on the ports and port security is disabled.&lt;br /&gt;
&lt;p /&gt;
 Any suggestions gratefully received - forgive the rather lengthy submission!</description>
      <pubDate>Thu, 05 Mar 2009 15:19:56 GMT</pubDate>
      <author>JMorz</author>
      <guid>http://communities.vmware.com/message/1189834?tstart=0#1189834</guid>
      <dc:date>2009-03-05T15:19:56Z</dc:date>
      <clearspace:dateToText>8 months, 3 weeks ago</clearspace:dateToText>
      <clearspace:replyCount>9</clearspace:replyCount>
    </item>
  </channel>
</rss>

