IPJ
Contributor

Firewall/Network issues on service console?

I have 2 recently built VMware ESX 3.0.2 (61618) servers. Everything appeared to be fine when I set them up in our server room, i.e. I could access the service console for configuration, building VMs, etc.

I have now relocated to my office and desktop PC, which is on a different VLAN to the ESX servers, and the service console is not reachable (via pings or SSH connections). So, in summary, the service console can only be reached from machines on the same VLAN as the service console.

I have opened an SSH session to the service console and pinged my desktop (and several other devices across several VLANs), which responds fine, thus proving the networking and default gateway are set up fine. Also, when I ping my desktop PC from the service console, it appears to open up a hole in the firewall and lets me access the service console from my desktop PC (shortly after I kill the ping, my desktop PC loses access to the service console again).

There doesn't appear to be anything in our network set-up that would cause problems like this. Is there something in ESX I need to configure, for example a firewall setting?

jonathanp
Expert

Did you change the service console port group vlan id?

esxcfg-vswitch -p "Service Console" -v xx vSwitch0

Regards

IPJ
Contributor

Some additional info:

I have the service console connected to a vSwitch which uses vmnics 0, 1 and 2.

vmnic 0 is active, with vmnics 1 and 2 as standby. All ports are configured identically (all ports configured as trunks).

Any virtual servers on the same vSwitch (using vmnics 1 and 2 as active, with 0 as standby) work and behave fine, i.e. they are reachable from any VLAN.

IPJ
Contributor

Thanks for the reply Jonathan.

I'm new to ESX and I'm not totally confident issuing commands via the command line yet, so I have not changed the service console port group VLAN ID (that I know of, at least) by issuing an esxcfg-vswitch command. Is there any way I could have done this through the network configuration wizards?

Thanks again

jonathanp
Expert

You can do this in the VI client: go to Configuration, Networking, and click on "Properties" of the service console virtual switch.

Select the service console port and click "Edit". Click "Continue modifying this connection..."

Change the VLAN ID.

Note: Be sure that this VLAN ID is reachable from the service console connection, because you may lose the service console connection if it is not, and then you would need to go to the command line.

Regards

IPJ
Contributor

Jonathan, just checked and the VLAN ID is correct (doesn't need changing).

jhanekom
Virtuoso

Recently had similar inconsistent results on a newly deployed system before networking settings were finalised. The gateway on my system was not set correctly yet, but somehow ESX was able to discover the gateway to route out to. All things aside, I was able to ping hosts outside my subnet, but they weren't able to ping me. After the network admin confirmed what the final settings would be and I configured my gateway appropriately, all was well.

Check that your default gateway for your Service Console on the ESX host is correct. The setting is in the VI client, on the Configuration tab, under DNS and Routing.
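As a rough way to sanity-check a gateway setting off-host, the sketch below tests whether a gateway address sits inside the interface's own subnet, which it must for the host to reach it directly. The addresses and the /24 mask are made up for illustration; nothing here comes from the poster's actual configuration.

```python
import ipaddress

def gateway_is_on_link(interface_cidr: str, gateway: str) -> bool:
    """Return True if the gateway lies inside the interface's own subnet."""
    iface = ipaddress.ip_interface(interface_cidr)
    return ipaddress.ip_address(gateway) in iface.network

# Hypothetical service console address with two candidate gateways.
print(gateway_is_on_link("10.10.10.10/24", "10.10.10.1"))   # same subnet
print(gateway_is_on_link("10.10.10.10/24", "192.168.1.1"))  # different subnet
```

A gateway that fails this check belongs to some other network (for example a different VLAN) and cannot serve as the next hop for that interface.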

IPJ
Contributor

Thanks Jannie,

The service console gateway appears to be correct; however, the VMkernel gateway is the gateway for a different VLAN.

What is the purpose of the VMkernel gateway?

jhanekom
Virtuoso

So the problem is still there? How many uplinks do you have in the vSwitch that the Service Console port group is attached to? Have you changed any of the default load balancing settings (i.e. from "route based on originating virtual port ID" to "route based on ip hash")? Have you set up any form of EtherChannel or 802.3ad static channels on your physical switches?

The VMotion gateway seems to be more of a practical joke than anything else. The field for a gateway has been there since the early version of VC, but there have not been any real practical applications, to the best of my knowledge. Maybe this will change in the next release of ESX. (Some of the more senior guys ran some tests earlier this year to prove that VMotion can actually work over a routed network. Since you still need high-speed access to shared storage, the applications of this are limited, however. http://communities.vmware.com/thread/89122)

jonathanp
Expert

What's the joke? ;)

The VMkernel gateway is for VMotion, NFS and iSCSI connections.

Let's say, for example, you have your service console on an IP of 10.10.10.10 and VMotion on 192.168.1.10...

The VMkernel gateway points to the gateway that VMkernel traffic is routed to.

Regards

Jon

IPJ
Contributor

Thanks again for the replies, it is still not working .......

I have 3 uplinks into the vswitch, I have all 3 available to the service console, one as active and the other 2 passive. All are confirmed as working (I have pulled leads to make sure the other ports actually kick-in from standby and take the traffic).

The Load Balancing is set to 'Route Based on the originating virtual port ID' and the network does not have any of the settings mentioned.

Any more suggestions would be much appreciated.

Thanks

jhanekom
Virtuoso

Ah, sorry, you did provide details about your networking setup. It's just that it sounded almost like the switch was getting appropriate entries in its ARP tables; that if you pinged your workstation from the ESX server, that would automatically populate the switch ARP tables with the correct entries and allow you to communicate back. It's almost as if all the ports are not in the same broadcast domain.

Does the problem persist if you only have one port active on the entire vSwitch? (either unplug the cable or unlink the other cables from the vSwitch temporarily.)

Regarding the VMkernel gateway: ah, being an FC evangelist I of course forgot about NFS and iSCSI :) Still, I have difficulty seeing the need or appropriateness of routing a multi-gigabit iSCSI connection... not having been involved in any iSCSI rollouts myself I must confess that I don't know how common that scenario is, however.

jonathanp
Expert

Try unlinking the other uplinks and keeping only vmnic0, for example:

esxcfg-vswitch -l (to see linked vmnic)

esxcfg-vswitch -U vmnic# vSwitch0

For the VMkernel gateway, I will try to explain my understanding of this ;) :

Let's say you have 2 different networks that cross 2 physical switches:

1 switch takes care of the VM network and service console

1 switch takes care of storage traffic like NFS/iSCSI.

They are on separate networks.

So the VMkernel gateway is there to send all the storage (VMotion, NFS and iSCSI) traffic to this switch, to avoid having too much traffic on the same network as the VMs.

Am I correct?

Jon

IPJ
Contributor

Jon/Jannie,

I unlinked vmnic 1 and 2 and still get the same behaviour from the service console :(

Jon - you are indeed correct about "to send all the storage (VMotion, NFS and iSCSI) traffic to this switch to avoid having too much traffic on the same network as the VMs".

thanks

jhanekom
Virtuoso

Another thought: duplicate IP? Beyond that, I think we've pretty much exhausted what troubleshooting options we have from the VMware side of the fence.

The "cool" problem would still be if it were a misconfigured 802.3ad channel on the physical switches. Just for fun, when you do get back to the office, physically unplug one of the cables to see if the symptoms disappear. (I suspect the physical switch thinks there is a 802.3ad/Etherchannel type connection and is blasting packets down the wrong link in the channel.)

As to the purpose of vmkernel routing: I'd have to disagree. The whole point of a switched environment is that you don't "send" data to a particular switch - the switching environment is clever enough to direct packets directly between devices. Every time a router gets involved (even if this is a wire-speed routing blade in a high-end switch), you're introducing latency. You'll want your devices to talk to each other over the shortest possible path, which would be a flat, layer 2 network. This can be achieved by either using physically separate links for "normal" IP traffic and iSCSI/VMotion traffic, or even just VLANs.

(Routers are devices that allow hosts on separate subnets to communicate with each other. Think "10.0.0.5 wants to talk to 192.168.7.53. To get there, it needs to route via 10.0.0.1.")
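The routing decision in that example can be sketched in a few lines. This is only an illustration of the general rule (deliver directly if the destination is in your own subnet, otherwise hand the packet to the gateway); the /24 mask and the second destination are assumptions, since the post gives only the three addresses.

```python
import ipaddress

def next_hop(source_cidr: str, destination: str, gateway: str) -> str:
    """Return the address the host sends the frame to for this destination."""
    iface = ipaddress.ip_interface(source_cidr)
    dest = ipaddress.ip_address(destination)
    # On-link destinations are reached directly over layer 2.
    if dest in iface.network:
        return str(dest)
    # Everything else goes to the configured gateway.
    return gateway

# Addresses from the post above; the /24 mask is assumed.
print(next_hop("10.0.0.5/24", "192.168.7.53", "10.0.0.1"))
# A hypothetical neighbour on the same subnet needs no router.
print(next_hop("10.0.0.5/24", "10.0.0.9", "10.0.0.1"))
```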

eahatch
Enthusiast

It may be worth the trouble to drop the firewall and see if you still have the same issue. I don't believe it is a firewall issue, but that would get it out of the picture.

To see your current ruleset (for later reference) type: service iptables status
To stop iptables (the console firewall), as root, type: service iptables stop
As soon as you have tested, restart the firewall: service iptables start
To confirm the rules have reloaded type: service iptables status

This seems like a router issue to me (between your office and the ESX server).

Thanks.

Alan

jonathanp
Expert

Are you able to ping from the service console, or do you lose all connectivity?

Or is it just from the VC server?

Is this a production server?

If not, what I would try is:

Remove the service console vSwitch and vswif and recreate them from scratch:

esxcfg-vswif -d vswif0

esxcfg-vswitch -d vSwitch0

esxcfg-vswitch -a vSwitch0

esxcfg-vswitch -A "Service Console" -v vlanid vSwitch0

esxcfg-vswif -a -i ip_address -n net_mask -p "Service Console" vswif0

and try again.

Regards

IPJ
Contributor

All, thanks for all the help with this and apologies for the (very) delayed response. After a lengthy paternity leave and assignment to other projects I was able to revisit the ESX problem and I now have it resolved.

FYI, the problem was caused by our second service console (Service Console 2), which we added as a 'backdoor' in case anything went wrong. The subnet mask on the vswif that the secondary service console was attached to was set to 255.0.0.0. As soon as this was corrected, the problem was resolved!

Thanks again,

Ian
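For anyone hitting the same symptom, here is a rough illustration of why a 255.0.0.0 mask on a second service console interface could cause it. All addresses, networks and interface names below are made up (the thread never states the real ones), and this is only one plausible reading of the fix. With a /8 mask the second interface claims the whole 10.0.0.0/8 range as directly attached, so traffic for a desktop elsewhere in that range may leave through the backdoor interface instead of going to the default gateway.

```python
import ipaddress

def pick_route(routes, destination):
    """Pick the most specific matching route (longest prefix wins).

    routes: list of (network_cidr, interface_name, gateway_or_None).
    Returns (interface_name, gateway_or_None).
    """
    dest = ipaddress.ip_address(destination)
    matches = []
    for cidr, iface, gw in routes:
        net = ipaddress.ip_network(cidr)
        if dest in net:
            matches.append((net.prefixlen, iface, gw))
    _, iface, gw = max(matches, key=lambda m: m[0])
    return iface, gw

# Hypothetical default route via the first service console (vswif0).
DEFAULT = ("0.0.0.0/0", "vswif0", "10.1.1.1")

# Second interface (hypothetical name vswif1) with the wrong /8 mask.
broken = [("10.1.1.0/24", "vswif0", None), ("10.0.0.0/8", "vswif1", None), DEFAULT]
# Same interface with an assumed, corrected /24 mask.
fixed = [("10.1.1.0/24", "vswif0", None), ("10.2.2.0/24", "vswif1", None), DEFAULT]

desktop = "10.3.3.50"  # hypothetical desktop on another VLAN
print(pick_route(broken, desktop))  # on-link via vswif1, gateway bypassed
print(pick_route(fixed, desktop))   # default route via vswif0
```

On a real host the equivalent check is to look at the service console routing table and confirm that no connected route is wider than the VLAN it actually lives on.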