VMware Cloud Community
x007alfa
Enthusiast

[HELP] - VLAN confusion

Today is probably not my day, but I cannot get VLANs to work properly...

I have 2 ESXi hosts, but I'll worry about just one for now.

It has a 10G link to a Cisco switch, and the management network is set to VLAN 200.

That port on the Cisco switch is configured as a trunk, and VLAN 200 is among the VLANs it carries.

From the Cisco I have a LAG of two 10G links going to a second switch.

Both of them have trunking enabled with all VLANs passed.

At this point I can ping the ESXi management IP from the second switch, and from the Cisco switch as well.

My computer is connected to an untagged port on the second switch that is assigned to VLAN 200.

I cannot ping or access the ESXi host from there... What am I missing? I'm so confused....

7 Replies
x007alfa
Enthusiast

I'll add some more details now about some tests I did and a quick explanation of the setup.

Devices:

  1. 2x Dell R540, each with 2 integrated 1 Gbps LAN ports and 2 10 Gbps ports from an Intel X520 card.
  2. D-Link DGS-1520-28X (managed, 24x 1 Gbps RJ45 + 4x 10 Gbps SFP+)
  3. Cisco CBS350-12SX (managed, 10x 10 Gbps SFP+ + 2 combo RJ45/SFP+ ports)
  4. QNAP NAS with 10 Gbps NICs, which is going to house the VMs on a RAID 5 of 4 TB WD Red Pro disks.
  5. All cables are SFP+ DAC cables for now.

The hosts are equipped with a RAID 1 of 480 GB SSDs for boot and local storage.

I was going to use the Cisco switch as the backbone of the structure.

I thought of 4 VLANs:

  1. ID100 -> iSCSI
  2. ID200 -> Management
  3. ID300 -> Fieldbus
  4. ID400 -> ThinClient Network

I know I should use 2 switches with MLAG to do this properly, but the customer's budget is tight.

My plan was to put the two 10 Gbps ports of each host in a LAG and connect them to a matching LAG on the Cisco switch set to trunk mode.

Same for the NAS, except that its LAG would be in access mode only, without trunking.

I then have a LAG of 2 ports as the uplink to the D-Link switch, which would serve the external network of thin clients and PLCs connected to access ports on the D-Link switch.

The LAG just described is configured as a trunk, tagged for all VLANs and untagged for VLAN 1.
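To make the tagging rules concrete, here is a minimal Python sketch of how an access port and a trunk port classify frames under 802.1Q. It is a simplified model and not any switch's actual behaviour: the `Port` class, the port names, and the rule that access ports drop tagged frames are assumptions for illustration. Only the VLAN IDs come from the setup above.

```python
from dataclasses import dataclass, field
from typing import Optional, Set, Tuple


@dataclass
class Port:
    """Hypothetical model of one switch port (illustration only)."""
    name: str
    mode: str                                        # "access" or "trunk"
    access_vlan: int = 1                             # used when mode == "access"
    native_vlan: int = 1                             # untagged VLAN on a trunk
    allowed: Set[int] = field(default_factory=set)   # tagged VLANs on a trunk

    def ingress(self, tag: Optional[int]) -> Optional[int]:
        """Return the VLAN a received frame lands in, or None if it is dropped."""
        if self.mode == "access":
            # Assumption: an access port only accepts untagged frames.
            return self.access_vlan if tag is None else None
        if tag is None:
            return self.native_vlan
        return tag if tag in self.allowed else None

    def egress(self, vlan: int) -> Tuple[bool, Optional[int]]:
        """Return (forwarded, tag_on_wire) for a frame leaving this port."""
        if self.mode == "access":
            return (vlan == self.access_vlan, None)
        if vlan == self.native_vlan:
            return (True, None)
        if vlan in self.allowed:
            return (True, vlan)
        return (False, None)


# The laptop's access port and the trunked uplink LAG, as described above.
laptop_port = Port("dlink-access", "access", access_vlan=200)
uplink = Port("uplink-lag", "trunk", native_vlan=1, allowed={100, 200, 300, 400})

vlan = laptop_port.ingress(None)   # untagged frame from the laptop
print(vlan, uplink.egress(vlan))
```

In this model, a frame from the laptop only reaches a port group tagged with VLAN 200 if it leaves the uplink carrying tag 200. A frame classified into VLAN 1 would leave the trunk untagged and never match that port group.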

I just set up the hosts with fresh installs of ESXi 7.0.2 from Dell custom images.

On both hosts I have the management network set to the LAG described above, on VLAN 200, with static IPs.

I set up a temporary VM on the first host and left the NAS out for the moment.

Here are the tests I did:

  1. Laptop -> Cisco = ping OK
  2. Laptop -> D-Link = ping OK
  3. Laptop -> Host1 = ping OK
  4. Laptop -> Host2 = ping OK (but it's flaky, sometimes it takes a while to answer ping requests)
  5. Laptop -> VM = ping OK
  6. Host1 -> Host2 from shell = NO PING (I have no idea why...)
  7. Host1 -> VM = ping OK
  8. Host2 -> Host1 from shell = NO PING (same as above, reverse direction)
  9. Host2 -> VM = ping OK (but again flaky, sometimes it doesn't work)
  10. VM -> Laptop = worked this morning, not anymore... 😒
  11. VM -> Host1 = ping OK
  12. VM -> Host2 = ping OK after about 3 minutes...
  13. I did reboot after reboot with varying results...
  14. Sometimes, after a reboot with the LAG active on the hosts, they would not ping.
  15. When I disable the LAG in the console, boom, it starts pinging again....

 

I'm lost people... I'm sure I'm lost in a glass of water but I can't figure this out...

Basically it's a VST (Virtual Switch Tagging) setup, if I read the literature correctly...

Please help me... 😭

alantz
Enthusiast

If your LAG goes across different vendors, are you using LACP? You mention that it starts working when you disable the LAG, which leads me to believe it's a setup issue there.

--Alan--

 

x007alfa
Enthusiast

But when the VM on host 1 tries to reach host 2, the traffic doesn't leave the first switch, I would think...

Anyway, should LACP be enabled for proper operation?

Should the LAG be in MAC mode or IP/MAC mode?

 

Thanks for the answer!
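For context on the MAC versus IP/MAC question: a LAG does not split a single flow across its links. Each end hashes some header fields and uses the result to pick one member link per flow. The toy Python sketch below only illustrates that idea. The XOR-and-modulo hash, the function names, and the addresses are all assumptions for illustration and are not the algorithm of ESXi or of either switch. For standard vSwitches with static link aggregation, VMware's documentation describes the "Route based on IP hash" teaming policy, so it is worth checking that both ends hash on compatible fields.

```python
import ipaddress


def pick_link_by_ip(src_ip: str, dst_ip: str, n_links: int) -> int:
    """Toy IP-based hash: XOR the two addresses, then take modulo the link count."""
    s = int(ipaddress.ip_address(src_ip))
    d = int(ipaddress.ip_address(dst_ip))
    return (s ^ d) % n_links


def pick_link_by_mac(src_mac: str, dst_mac: str, n_links: int) -> int:
    """Toy MAC-based hash using the same XOR-and-modulo idea."""
    s = int(src_mac.replace(":", ""), 16)
    d = int(dst_mac.replace(":", ""), 16)
    return (s ^ d) % n_links


# Hypothetical addresses on the management VLAN, two links in the LAG.
print(pick_link_by_ip("192.168.200.10", "192.168.200.11", 2))
print(pick_link_by_ip("192.168.200.10", "192.168.200.12", 2))
print(pick_link_by_mac("00:50:56:00:00:01", "00:50:56:00:00:02", 2))
```

The same pair of endpoints always maps to the same link, while different pairs can land on different links. If one end treats the ports as a LAG and the other does not, replies may come back on a link the sender does not expect, which is one plausible source of intermittent pings.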

x007alfa
Enthusiast

Please find attached the configurations of the two switches.

They are very basic.

As previously stated, the Cisco switch is all 10 Gbps and is the backbone of the cluster.

The D-Link switch is the access switch for everything outside the cluster.

x007alfa
Enthusiast

Right now my laptop can ping the two switches but not the hosts.

The switches can ping the hosts.

x007alfa
Enthusiast

I have connected a spare NIC to an access port on the D-Link switch and configured the management network to use that NIC without a VLAN.

Now I can see the host from my laptop.

Then I created a VM on Host1 and a port group on the team of 10 Gbps cards with VLAN tag 200.

The VM cannot see the host or my laptop...

But the switches can see the VM... what the actual f--- is going on here? I don't understand anything anymore!

a_p_
Leadership

I'm not a network specialist, so please bear with me if I'm wrong.

From what you write I assume that you are using a vDS (vSphere Distributed Switch). In this case you may need to change from "mode auto", which if I understand it correctly is Cisco's proprietary PAgP negotiation, to "mode on" or "mode active/passive".

For a vSS (vSphere Standard Switch), only "mode on" may work.
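The mode pairing described above can be sketched as a small Python check. This is a rough model under stated assumptions: the keyword meanings follow classic Cisco IOS usage (`active`/`passive` for LACP, `desirable`/`auto` for PAgP, `on` for a static bundle), and other product lines may assign the keywords differently, so check the documentation for the switch in question. The function name `channel_forms` is hypothetical.

```python
LACP_MODES = {"active", "passive"}
PAGP_MODES = {"desirable", "auto"}


def channel_forms(mode_a: str, mode_b: str) -> bool:
    """Return True if a bundle should come up between two ends in these modes."""
    modes = {mode_a, mode_b}
    if modes == {"on"}:
        return True                    # static on both ends, no negotiation
    if "on" in modes:
        return False                   # a static end never answers negotiation
    if modes <= LACP_MODES:
        return "active" in modes       # at least one end must initiate LACP
    if modes <= PAGP_MODES:
        return "desirable" in modes    # at least one end must initiate PAgP
    return False                       # mixed protocols or unknown keywords


# A vSS with static link aggregation behaves like "on";
# a vDS LAG using LACP behaves like "active" or "passive".
for host, switch in [("on", "auto"), ("on", "on"), ("active", "passive")]:
    print(host, switch, channel_forms(host, switch))
```

Under these assumptions, a static bundle on the host facing a negotiating mode on the switch never forms, which would fit the symptom that connectivity returns once the LAG is disabled.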

André
