Error message when booting the system: "Some other host already uses address x.x.x.x". The service console ends up with no network connectivity as vswif0 doesn't come up.
It's a fully patched ESX 3.5.0 build 98103 machine. On every reboot, it thinks there's an IP address conflict. There is most definitely NOT an address conflict. I've changed the address of the service console twice to no avail. vswif0 simply will not come up on its own.
The workaround: service network restart. Which works about 50% of the time. vswif0 will come up if I execute the restart enough times.
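For anyone stuck with the same workaround, here is a minimal sketch of automating the repeated restart. The helper names (retry_until, vswif0_is_up) are made up for illustration, and the "inet addr" check is only my assumption about how to tell that vswif0 came up; adjust it to whatever you trust on your console.

```shell
# Run a restart command repeatedly until a check command succeeds.
# Usage: retry_until MAX_TRIES CHECK_CMD RESTART_CMD
retry_until() {
    max=$1; check=$2; restart=$3
    i=1
    while [ "$i" -le "$max" ]; do
        # Restart first, then test whether the interface came up.
        $restart >/dev/null 2>&1
        if $check >/dev/null 2>&1; then
            echo "up after $i attempt(s)"
            return 0
        fi
        i=$((i + 1))
    done
    echo "still down after $max attempt(s)"
    return 1
}

# Hypothetical check: treat vswif0 as up once it has an IPv4 address.
vswif0_is_up() {
    ifconfig vswif0 2>/dev/null | grep -q "inet addr"
}

# Example, run on the service console itself:
#   retry_until 10 vswif0_is_up "service network restart"
```

This only papers over the symptom; it does nothing about whatever is answering the duplicate-address probe.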
Any idea how to fix?
Update: I just removed the two failover NIC's and the problem went away. I had three NIC's attached to vswif0 in a link state detection failover config (all options were at the defaults). Now I'm down to the one original NIC and vswif0 comes up just fine. Was this box seeing itself and too stupid to know that it's seeing itself, like a parakeet with a mirror?
The first/top NIC is a Broadcom (bnx2 driver). The other two are Intel (e1000). Changing the order of the NIC's didn't help. Running with one of each NIC type didn't help. Running with only the pair of e1000's didn't help.
If you had the two NICs teamed, what algorithm were you using? Did they go to the same physical switch, or different physical switches? Teaming NICs is a common configuration, so I don't think that would be your issue.
-KjB
Of the three NIC's, I believe that two go to the same switch and the third goes to a secondary switch. The teaming config was left at the default, just like the other eight ESX hosts we have here; "Route based on originating virtual port ID" and "Link status only". It's only this one fully patched host that's having trouble. Our other systems are 3.5.0 or 3.0.2 with no patches applied and they are doing fine.
I'm not sure if this is a test system, but can you try putting a card you know to be working fine into the system and see whether it has the same problem? I wouldn't lean toward the card if there are two exhibiting the same behavior, but I just want to rule it out.
-KjB
I don't have access to another suitable NIC, unfortunately.
Did you try having the NICs in, but as standby NICs, with one being the active? That way, at least you have some redundancy, which you currently do not.
-KjB
Support recommended "route based on IP hash". No luck there. They also suggested, as a diagnostic, having all NIC's go to the same switch. I think the idea is that if this clears up the issue, there might be a switch configuration change which can be made. I'll go try it now.
This continues to get weird. I just tried it with only one NIC physically connected to the network but all NIC's enabled. The "IP conflict" still comes up about 70% of the time. I think there's something internal to the box messed up.
Do you guys know a series of commands that will blow away all network configuration and re-create it?
Unlink the pnics from the vSwitches
esxcfg-vswitch -U vmnicx vSwitchx
Remove the vmkernel
esxcfg-vmknic -d 'portgroupname'
Remove the vswif interface
esxcfg-vswif -d vswifx
Remove the portgroups
esxcfg-vswitch -D 'pgname' vSwitchx
Remove the vSwitches
esxcfg-vswitch -d vSwitchx
Add everything back in reverse order.
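A rough sketch of what "reverse order" could look like for the service console side. It only prints the commands so you can review them before running anything; print_rebuild is a made-up helper, and the switch, portgroup, interface and address values are placeholders, not taken from this thread.

```shell
# Print (do not run) the commands that re-create a service console
# network, in the reverse of the teardown order above.
# Usage: print_rebuild VSWITCH PORTGROUP VSWIF IP NETMASK VMNIC
print_rebuild() {
    vswitch=$1; pg=$2; vswif=$3; ip=$4; mask=$5; vmnic=$6
    # 1. Create the vSwitch.
    echo "esxcfg-vswitch -a $vswitch"
    # 2. Add the portgroup to it.
    echo "esxcfg-vswitch -A \"$pg\" $vswitch"
    # 3. Create the vswif interface on that portgroup.
    echo "esxcfg-vswif -a $vswif -p \"$pg\" -i $ip -n $mask"
    # 4. Link the physical NIC as the uplink.
    echo "esxcfg-vswitch -L $vmnic $vswitch"
}

# Example with placeholder values:
#   print_rebuild vSwitch0 "Service Console" vswif0 192.168.1.10 255.255.255.0 vmnic0
```

A VMkernel port would go back the same way with esxcfg-vmknic -a on its own portgroup. Do all of this from the physical console or a remote KVM, since removing vswif0 cuts off SSH and VI Client access.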
I realise this is an old topic, but I got this same problem and thought I would post up the culprit which I found on my network.
I modified the "/sbin/ifup" script in the Service Console so that when it does the "arping" to check whether its address is in use it is no longer silent. I did this by changing the line:
arping -q -c 2 -w 3 -D -I $ $
to
arping -c 2 -w 3 -D -I $ $
Each time I restarted networking, I would get the MAC address of the host claiming to use the address. The MAC turned out to be in the HP range, but strangely a different MAC address was reported with almost every network restart. The network guys then traced the MAC addresses back to Windows servers on different VLANs from the ESX hosts.
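If you want to collect the offending MACs without editing /sbin/ifup, something like the following might do. It is a sketch that assumes the non-quiet arping prints each responder's MAC in the usual aa:bb:cc:dd:ee:ff form; extract_macs is a made-up helper name, and the interface and address in the example are placeholders.

```shell
# Read arping output on stdin and print each distinct MAC address found,
# upper-cased so the same MAC in different case is listed once.
extract_macs() {
    grep -oiE '([0-9a-f]{2}:){5}[0-9a-f]{2}' | tr 'a-f' 'A-F' | sort -u
}

# Example: probe for a duplicate of the service console address and
# list whoever answers (interface and IP are placeholders):
#   arping -c 2 -w 3 -D -I vswif0 192.168.1.10 | extract_macs
```

Running it a few times and comparing the lists should show whether the responder keeps changing, as it did here.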
Turns out it is a bug in the Broadcom teaming software we are using....see: