I've experienced the same things while doing FTP installs.
What type of switches are you connected to?
Cisco? Try enabling "PortFast"
Other switches? Try disabling "STP"
While starting, ESX loads the NIC driver, which causes the switch port to reset. The port reset restarts STP, which can take 20 to 30 seconds, while the NFS transfer is already trying to start. So the first attempt times out, and once the 20 to 30 second period has passed it works.
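To see how long the port really stays blocked after a link reset, a small polling check can be run from another machine on the same switch. This is only a rough sketch in Python, not something ESX or anaconda does itself; the function name wait_for_port, the host name nfs-server.example and the use of TCP port 2049 (the usual NFS port) are assumptions for illustration.

```python
import socket
import time


def wait_for_port(host, port, timeout=60.0, interval=2.0):
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    start = time.monotonic()
    deadline = start + timeout
    while time.monotonic() < deadline:
        try:
            # Short per-attempt timeout so a blocked port does not stall the loop.
            with socket.create_connection((host, port), timeout=interval):
                print("reachable after %.1f s" % (time.monotonic() - start))
                return True
        except OSError:
            # Port not forwarding yet, or the service is down: wait and retry.
            time.sleep(interval)
    return False


# Hypothetical usage (placeholder host, assumed NFS-over-TCP port):
# wait_for_port("nfs-server.example", 2049, timeout=60.0)
```

If the reported delay is consistently in the 20 to 30 second range right after the link comes up, that points at spanning tree rather than at the NFS server.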
I am having this exact same problem and it's driving me crazy!
IBM server using Broadcom NICs connected to a Cisco switch.
Portfast (STP) enabled on port.
PortFast has helped, but I am still getting a DHCP timeout. I have also tried adding the following to my isolinux.cfg:
ksdevice=eth0 nicdelay=50 linksleep=50, but this hasn't helped either.
It would appear that the NIC is recycled up to three times during the setup phase before connecting to the NFS share.
Anything else I can try? I'm scratching my head with this one.
We are using Cisco switches and we have PortFast enabled.
I have experienced the same problem when using ftp to source the installation files (which is why I am trying NFS, hoping that the problem will go away - unfortunately it hasn't).
I am wondering whether other people (other than Mooihoek) have experienced this problem, and if so, what did they do to resolve the problem (other than enabling PortFast).
Interesting, I recently created a new deployment with RDP again (running Cisco switches) and it worked with default settings.
Another site didn't work with default settings (I left the problem with the network department), so it could also have something to do with the networking "behind" the first switch, I guess.
The ESX server and the DHCP and NFS server are all connected to the same switch, so there is nothing "behind" the first switch.
I have been doing all this in my dev lab as a "proof of concept", so that when I get it right I can do it on my production network (which will have multiple switches and VLANs). We will be using Juniper switches in the prod network, so I am trying to source a couple of servers to do a test there to see if the problem is only with the Cisco switch in the dev lab.
At what point are you getting stuck with your NFS install?
There seems to be some trunking in place behind the Cisco switch I am connected to, so I can only assume this is causing the DHCP timeout problems I am seeing. If I connect both IBM servers to a local switch, all is OK.
I am getting prompted that the NFS share cannot be found when the installation looks for the ks.cfg file. If I wait for around 30 seconds and click OK, the NFS share is found and the installation continues without any more prompts. I am fairly sure that DHCP is the issue.
Yep, I'm pretty sure you have the same problem as me! I don't have an answer for you at the moment; I'm still looking into it.
Things I have tried.
Set the port to 1000 Mbps full duplex
Check out the following link, which is useful.
I have a dim memory that this problem is documented in VMware's KB:
No current workaround except for skipping it by pressing OK.
A fix is planned for a future version.
Unfortunately I could not find the article again
Thanks for the link. It has some information that could be helpful.
It mentions a bug with the Broadcom NICs on the HP servers. It just so happens that I am using HP DL380 G4 and G5 servers that have the Broadcom NICs.
I am having a similar problem. The OS I'm deploying via kickstart is RHEL AS4u5 32-bit, though it doesn't matter which OS I deploy: I see the same problem with every 32-bit and 64-bit version of RHEL AS4, as well as Fedora 7 and RHEL5.
The anaconda portions of the boot (with DHCP and tftp) go very well. I can see in VC3 and VC4 that the booted system gets all the right network attributes and have carefully verified them (hostname, IP, GW, dns, NM, broadcast, network, next_server, and domain). In VC3, eventually, I get the error message "reverse name lookup failed", then it says "failed to mount nfs source". It displays a url and file location, both of which are, in fact, correct. Efforts at toying with various anaconda options for the pxeconfig isolinux file provide no improvement.
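Since the first error is "reverse name lookup failed", one thing worth ruling out is whether DNS actually answers a reverse (PTR) query for the addresses involved. The following is only a sketch to run from a working machine on the same network; the helper names are made up, and which address anaconda looks up is not stated here, so checking both the client's and the NFS server's address is an assumption.

```python
import socket


def check_reverse_lookup(ip):
    """Return the host name the resolver gives for ip, or None if the lookup fails."""
    try:
        name, _aliases, _addrs = socket.gethostbyaddr(ip)
        return name
    except OSError:
        # socket.herror and socket.gaierror are both subclasses of OSError.
        return None


def forward_matches(name, ip):
    """Return True if name resolves back to ip (forward-confirmed reverse DNS)."""
    try:
        _name, _aliases, addrs = socket.gethostbyname_ex(name)
    except OSError:
        return False
    return ip in addrs


# Hypothetical usage with the address of the NFS server or of the client:
# name = check_reverse_lookup("192.168.10.2")
# print(name, name is not None and forward_matches(name, "192.168.10.2"))
```

This uses the resolver configuration of the machine it runs on, so it only shows what DNS or /etc/hosts returns there, not what the installer environment sees.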
If I configure the kickstart server (in DHCP and the tftp/pxeconfig link) with the MAC address of another system that I've successfully kickstarted before and boot that other system, the installation goes perfectly to completion. The NFS mount occurs instantly there, whereas it times out and fails on the problem server. When I switch the MAC address on the kickstart server back to the problem box (no other changes), NFS fails at the same point again. When I try to continue the build interactively and enter the NFS server (192.168.10.2) and the directory (/kickstart/ks-files/ks.cfg), it always results in the error message "That directory could not be mounted from the server". Yet a manually built VM on the same physical server as the kickstarted VM can immediately mount the exact same path without fail.
The physical network is a 4-port Linksys hub (yes hub), 1 PC with Fedora 7 installed and configured for the kickstart server, one Dell XPS laptop with VMware Workstation v6 installed (32 bit VM's kickstarted into here always install successfully) and a MacPro with Windoze XP 64 bit installed and it also has VMware Workstation v6 installed (32/64bit VM's kickstarted into here ALWAYS exhibit the problems outlined above). Regardless of isolating this hub from the internet or making it standalone, the results are the same.
To do other troubleshooting, I've 'shared' the C: drive of the MacPro, mapped it to a drive letter on the Dell, and pointed VMware on the Dell to the mounted filesystem on the Mac for the 'problem' VM; it boots and installs to completion 3 times in a row. I try it natively on the Mac again, with failure as before. I've temporarily replaced Windoze XP on the Mac with Windoze 2003, and the VM died at the same place.
Also, I can successfully build manually, via ISO image, any RHEL AS4, RHEL5 or Fedora 7 OS I want, 32-bit or 64-bit, on the MacPro. After the build, NFS mounts work correctly.
Unfortunately, entries to the RedHat kickstart mailing list didn't provide any useful help. One suggestion was to switch over to squid and apache on the kickstart server and 'try' that. Not good suggestions for me in resolving an NFS problem within anaconda. My customer has a pretty widely installed base deployed via NFS based kickstart servers. Nor am I familiar with squid/apache and their complexities.
Reviews of /var/log/messages on the kickstart server show only normal DHCP handshaking. No DNS, DHCP or NFS errors. Any suggestions on what I can do to temporarily increase the volume of messages those services write to syslog, or to some other log file, so I can review them for problems that I'm not aware of yet? Any suggestions on what I can do to further troubleshoot this NFS mount problem?
I was wondering (if anyone else has a thought on this, please chime in): is it possible that Fedora Core 5, an older OS, could be more stable or 'better' for doing this than Fedora 7?
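One way to get more out of the existing log without raising verbosity is to count the DHCP messages per MAC address: a client that sends several DHCPDISCOVERs during one install is re-initialising its NIC more often than the log looks at first glance. The sketch below assumes the DHCP server logs lines shaped roughly like "dhcpd: DHCPDISCOVER from <mac> via <iface>"; that format, the function name and the sample in the comment are assumptions, so adjust the pattern to whatever your log actually contains.

```python
import re
from collections import Counter, defaultdict

# Assumed line shape: "... dhcpd: DHCPDISCOVER from 00:11:22:33:44:55 via eth0"
DHCP_RE = re.compile(
    r"dhcpd(?:\[\d+\])?: (DHCP[A-Z]+)\b.*?\b((?:[0-9a-f]{2}:){5}[0-9a-f]{2})\b",
    re.IGNORECASE,
)


def dhcp_events_by_mac(lines):
    """Map each MAC address to a Counter of the DHCP message types logged for it."""
    events = defaultdict(Counter)
    for line in lines:
        match = DHCP_RE.search(line)
        if match:
            message, mac = match.group(1).upper(), match.group(2).lower()
            events[mac][message] += 1
    return events


# Hypothetical usage on the kickstart server:
# with open("/var/log/messages") as log:
#     for mac, counts in dhcp_events_by_mac(log).items():
#         print(mac, dict(counts))
```

Comparing the counts for the working VM's MAC address against the problem VM's would show whether the failing client goes through more DHCP cycles before the NFS mount is attempted.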