I've managed to mistakenly assign an IP to the service console which was already assigned to the VMkernel on the same vSwitch!
Frustrating thing is, I have TWO other service consoles on the box, but I think it's the DG that's breaking it.
I can't change the IP from the console (I have iLO) as it just hangs whenever I try to run ANY esxcfg commands.
I'm really stuck - been messing with it for 2 hrs now (and it's 03:40 here in the UK at the mo).
All this was whilst I was CAREFULLY shifting things over to VLANs having TESTED it on a test pair of ESX boxes I'd built!
Frustrating as hell, a single typo in the ip address :-(((
Any suggestions guys??? I could REALLY do with getting this going by morning!!
Paul
Well, normally you would use esxcfg-vmknic to set the VMkernel parameters, i.e. "esxcfg-vmknic -i 172.100.5.5" to set the IP (in practice it also wants the netmask and the port group name). Not sure why it is hanging on you; you might be able to do "service network stop" first and then try to change it. You can also change the SC IP by doing "esxcfg-vswif -i 172.100.5.6" (again with the netmask and the vswif name, e.g. vswif0).
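As a rough sketch of what those two commands tend to look like in full on ESX 3.x: the script below only prints them for review and runs nothing. The netmask, the port group name "VMkernel" and the interface name vswif0 are assumptions, not details from this thread, and deleting then re-adding the VMkernel NIC is just one way of changing its IP.

```shell
#!/bin/sh
# Dry run: prints the commands for review; nothing is executed.
# Assumed (not from the thread): netmask, port group "VMkernel", vswif0.
NETMASK=255.255.255.0
VMK_PG="VMkernel"

plan() {
cat <<EOF
esxcfg-vmknic -d "$VMK_PG"
esxcfg-vmknic -a -i 172.100.5.5 -n $NETMASK "$VMK_PG"
esxcfg-vswif -i 172.100.5.6 -n $NETMASK vswif0
EOF
}

plan
```

Check the real names with "esxcfg-vmknic -l" and "esxcfg-vswif -l" before typing any of this on a host.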
Cool, will give that a go once latest reboot is completed...
Alternatively, you can try editing the config files in /etc/sysconfig/network-scripts. There should be files there like ifcfg-vswif0. I've never tried that so I'm not sure if it will work. Do a "service network restart" when complete.
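A minimal sketch of that kind of edit, done on a scratch copy rather than the live file. The file contents and addresses here are made up for illustration. As far as I know ESX 3.x also keeps its network settings in /etc/vmware/esx.conf, so a change made only in ifcfg-vswif0 may not stick.

```shell
#!/bin/sh
# Sketch only: edits a scratch copy, not the live file under
# /etc/sysconfig/network-scripts. Contents and addresses are invented.
set -e
workdir=$(mktemp -d)
cfg="$workdir/ifcfg-vswif0"

cat > "$cfg" <<'EOF'
DEVICE=vswif0
BOOTPROTO=static
IPADDR=172.100.5.5
NETMASK=255.255.255.0
ONBOOT=yes
EOF

# Keep a backup, then rewrite only the IPADDR line.
cp "$cfg" "$cfg.bak"
sed 's/^IPADDR=.*/IPADDR=172.100.5.6/' "$cfg.bak" > "$cfg"

# Show the line that changed.
grep '^IPADDR=' "$cfg"

# On the host, the suggestion above is to follow the edit with:
#   service network restart
```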
Still no joy changing IPs yet, still hangs with either command.
Just waiting for the boot process to time out trying to mount an NFS volume so I can try disabling networking via iLO and see what happens...
Tried editing the vswif0 file with a service restart but no joy. Giving it a full reboot again now to see...
(Appreciate the help!)
I couldn't get this resolved in time so ended up with "Plan B". Luckily, just yesterday I'd built the two "test" hosts within the same blade enclosure.
I ended up re-presenting the LUNs to these blades & firing up the VMs within the test cluster so I got the VMs online in the end, but still haven't resolved the IP conflicts (and got to bed at 0500!).
So, does anyone have any more suggestions?
Basically, somehow (after *first* ensuring that I had TWO extra service consoles, one each on a different pNIC) I inadvertently created a VMkernel port AND a Service Console with the same IP but different VLANs.
Now, I can ping the IP but that's about it.
I can get to the RiLO interface no probs, but every time I try to make any changes using the esxcfg-? commands, they just sit there, with the exception of the esxcfg-vswitch -l command which, after an AGE, slowly starts to list info but I give up after 10 mins or so.
I tried all the suggestions above, even trying to disable networking to see if the commands "wake up" whilst the network isn't confusing the hell out of itself with duplicate IPs.
Would be grateful for any more suggestions before I have to consider vaping the server and rebuilding it...
Regards,
Paul
Hi,
Did you try deleting the VMkernel port group with the VI Client,
and then re-creating it with a new IP, or changing the COS IP afterwards?
I can't actually get to the server via VI client in any /useful/ way. It will connect, eventually, but every click through the config takes an age, but ultimately fails at the point where I'd select a portgroup and click "Edit".
The Edit button pops up a window (eventually), but never populates it with the objects for editing, and complains of timeouts.
The configuration should be in some text files. Can't you go to a lower runlevel where the ESX services are stopped?
Maybe init 2 or something will make the ESX go to a run level where you can edit the files and then reboot?
Maybe the esxcfg- commands will work in a lower runlevel?
Good luck...
I'm starting to make a /little/ headway.
I fired off an "esxcfg-vswif -l" command in the iLO console before leaving for work this morning. By the time I got there (30 mins) the command had completed with the, ultimately, correct IP information for each vswif.
I'm now doing the same with "esxcfg-vswitch -l" which is taking just as long, but is coming back with some results so far.
I seem to be getting somewhere, albeit slowly, by editing the config files directly. However, I don't know where the VLAN info is stored, as I'd like to try removing the VLAN settings if at all possible.
Then, I could try opening up networking a bit (putting it on the same subnet as I'm on, in the default vlan) which should get me somewhere.
Regards,
Paul
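On the question of where the VLAN info lives: as far as I know, ESX 3.x stores port group VLAN IDs in /etc/vmware/esx.conf (something like "grep -i vlan /etc/vmware/esx.conf" should show them), but the usual way to change one is esxcfg-vswitch rather than a hand edit. A dry-run sketch, assuming a port group called "Service Console" on vSwitch0 (both names are assumptions); VLAN ID 0 means no tagging:

```shell
#!/bin/sh
# Dry run: prints the commands for review; nothing is executed.
# Assumed (not from the thread): port group and vSwitch names.
PG="Service Console"
VSWITCH=vSwitch0

plan() {
cat <<EOF
esxcfg-vswitch -l
esxcfg-vswitch -p "$PG" -v 0 $VSWITCH
EOF
}

plan
```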
Sounds like the box is seriously sick; the esxcfg commands should not take so long to return. I'd just rebuild it as a matter of course - probably quicker than trying to work out what's wrong, especially if you have to wait 30 mins for each esxcfg command to complete!
It /should/ be repairable though surely?
All I want to be able to do at this stage is to get access to the esxcfg commands as they should work. Once I get there, I'll strip out all networking manually, recreate vswif0 and then recreate the networking from scratch.
It seems a bit extreme to vape the box for a network mis-config, plus it's actually turning out to be an extremely useful learning exercise which I'd like to get a little more out of if I can...
Paul
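The strip-and-recreate plan described above could look roughly like the sequence below. This is a dry-run sketch that only prints the commands. The uplink vmnic0, the addresses, and the default names vSwitch0 / "Service Console" / vswif0 are all assumptions rather than details from this thread, and removing vswif0 will drop any session that is connected through it, so it belongs on the iLO console.

```shell
#!/bin/sh
# Dry run: prints a rebuild sequence for review; nothing is executed.
# Assumed (not from the thread): uplink, addresses, and default names.
VSWITCH=vSwitch0
UPLINK=vmnic0
PG="Service Console"
SC_IP=192.168.1.10
SC_MASK=255.255.255.0

plan() {
cat <<EOF
esxcfg-vswif -d vswif0
esxcfg-vswitch -d $VSWITCH
esxcfg-vswitch -a $VSWITCH
esxcfg-vswitch -L $UPLINK $VSWITCH
esxcfg-vswitch -A "$PG" $VSWITCH
esxcfg-vswif -a vswif0 -p "$PG" -i $SC_IP -n $SC_MASK
esxcfg-vswitch -l
EOF
}

plan
```

The order matters in this sketch: the vSwitch and its port group have to exist before vswif0 can be attached to them.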
Aha! I now have it at the point where I can VI-client into one of the alternative service consoles although still a little sluggish.
Now to see if I can remove the vswif0 interface & recreate it...
paul
The last time I ran into an issue very similar to yours was last November with an IBM box. We successfully installed ESX 3.x on 6 identical boxes...but this one seemed to have the same issues that your box is having. Even after reinstalling ESX on this box...we still had trouble. I would type in some commands at the console...and it was taking forever to do anything.
I think that the final resolution was to replace some failing components inside the box. We had some IBM folks run the diagnostics and found out that it was bad hardware on this particular box.
Don't forget those 37+ patches! Might as well apply those before you get too far along.
Chris
In this case it's definitely a stupidity issue (on my part) rather than a hardware issue.
Managed to delete all vswifs (basically entered the command and just left it!) and console suddenly massively more responsive.
However, tried re-creating one vswif and it's slowed down again...
Trying to delete all vSwitches, port groups etc. but it won't let me delete vSwitch0 as it says that the "VMkernel" port group has an active port.
What exactly is meant by "active"?
Are you using the VMkernel port group for anything like a mounted NFS share?
At this point, it might be just as quick to reinstall ESX. It only takes about 15 minutes.
Chris
Yes, I am. How can you remove it via service console?
Nah, I'm beating it now, not gonna let it win after all this! It's been brilliant learning it all to be honest - nothing like baptism by fire!
I've now deleted *everything* except vSwitch0 (as it contains a VMkernel with an active port) and vSwitch1 (as that also contains a VMkernel with an active port).
If I can delete these last two, I'm back to square one and can start afresh!
Paul
Found reference to "esxcfg-vmknic" in http://www.vmware.com/community/thread.jspa?messageID=575477 and now deleted everything!
Right, let's see if I can start again...
paul
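For anyone hitting the same "active port" message, the teardown order probably looks something like this: remove the NFS mount, then the VMkernel NIC, then the vSwitch. This is a dry-run sketch that only prints the commands; the NFS label "nfs_store" and the port group and vSwitch names are hypothetical, so list the real ones first.

```shell
#!/bin/sh
# Dry run: prints the commands for review; nothing is executed.
# Hypothetical names: NFS label, port group, and vSwitch.
NFS_LABEL="nfs_store"
VMK_PG="VMkernel"
VSWITCH=vSwitch0

plan() {
cat <<EOF
esxcfg-nas -l
esxcfg-nas -d "$NFS_LABEL"
esxcfg-vmknic -l
esxcfg-vmknic -d "$VMK_PG"
esxcfg-vswitch -d $VSWITCH
EOF
}

plan
```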
Sounds like you're having fun at least.