VMware Cloud Community
Alok0106
Contributor

High Packet Loss in vmkping command

Hi All,

I have configured vSAN across three Dell R640 servers, each with four fiber ports. Two of the four ports carry management and production traffic, and the other two carry vSAN and vMotion traffic (in active/standby mode).

It works well at first, but after some time, or when performing tasks such as creating or migrating a VM, I start seeing errors under vSAN > Health for the checks below, between two of the hosts.

vSAN: MTU check (ping with large packet size)

vMotion: Basic (unicast) connectivity check

In this state, if I log in to the ESXi shell and run vmkping against the other host, I see heavy packet loss (more than 60%). If I log in to the switches and shut down the vSAN port on either switch, so that only one port is active, everything starts working normally again. Even then, the issue eventually recurs, and I have to shut / no shut the ports at the switch level again as a temporary fix.

Can anyone please help me fix this permanently? It has become very dangerous now.
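For anyone reproducing the large-packet test by hand: the payload size passed to vmkping has to leave room for the IPv4 header (20 bytes) and the ICMP header (8 bytes), so a 9000-byte MTU is tested with an 8972-byte payload and a 1500-byte MTU with 1472. The sketch below only builds the command string. The interface name `vmk1` and the target address are placeholders, not values from this thread.

```python
# Minimal sketch: compute the largest unfragmented ICMP payload for a given MTU
# and build the matching vmkping command line.
# ASSUMPTIONS: "vmk1" and "192.168.10.12" are hypothetical placeholders.

IPV4_HEADER_BYTES = 20  # IPv4 header without options
ICMP_HEADER_BYTES = 8   # ICMP echo header


def max_icmp_payload(mtu: int) -> int:
    """Return the largest ICMP payload that fits in one unfragmented IPv4 packet."""
    overhead = IPV4_HEADER_BYTES + ICMP_HEADER_BYTES
    if mtu <= overhead:
        raise ValueError(f"MTU {mtu} is too small to carry an ICMP echo")
    return mtu - overhead


def vmkping_command(vmk: str, target_ip: str, mtu: int, count: int = 20) -> str:
    """Build a vmkping command: -I selects the VMkernel interface, -d sets
    don't-fragment, -s sets the payload size, -c sets the packet count."""
    payload = max_icmp_payload(mtu)
    return f"vmkping -I {vmk} -d -s {payload} -c {count} {target_ip}"


if __name__ == "__main__":
    print(vmkping_command("vmk1", "192.168.10.12", 9000))
```

Running the script prints `vmkping -I vmk1 -d -s 8972 -c 20 192.168.10.12`. If that command fails while a 1472-byte payload succeeds, some hop in the path is probably not passing jumbo frames.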

Thanks in advance.


Accepted Solutions
GreatWhiteTec
VMware Employee

A few things I see:

This sounds like a misconfiguration between the host interfaces and the switch. It could be the port configuration, jumbo frames, etc. How are the ports on the switch configured? Trunk? Port channel?

Also, I would recommend putting vSAN and vMotion on separate VLANs, as well as using NIOC, since both types of traffic share the same interfaces. I would also split the active interfaces between the two: for example, vSAN with NIC1 active / NIC2 standby, and vMotion with NIC2 active / NIC1 standby. If one NIC fails and both traffic types end up contending on the same NIC, NIOC will kick in to help prioritize vSAN traffic. vMotion can be very bursty, hence the recommendation for NIOC. Use shares only. Set vSAN to High and vMotion to Low, or somewhere between Low and Normal.
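To illustrate what the shares do under contention, here is a rough model, not NIOC's actual scheduler. It assumes the default share values (Low = 25, Normal = 50, High = 100), that only vSAN and vMotion are active on the surviving uplink, and a 10 Gbps link. The link speed is an assumption, since it was not stated in the question.

```python
# Rough model of share-based bandwidth division during contention.
# ASSUMPTIONS: default NIOC share values, a 10 Gbps uplink, and only the
# listed traffic types competing for it. This is an illustration only.

SHARE_VALUES = {"low": 25, "normal": 50, "high": 100}


def split_bandwidth(link_gbps: float, shares: dict) -> dict:
    """Divide link bandwidth among traffic types in proportion to their shares.

    `shares` maps a traffic type name to "low", "normal", or "high".
    """
    total = sum(SHARE_VALUES[level] for level in shares.values())
    return {
        traffic: link_gbps * SHARE_VALUES[level] / total
        for traffic, level in shares.items()
    }


if __name__ == "__main__":
    # vSAN at High and vMotion at Low, both forced onto one 10 Gbps NIC.
    print(split_bandwidth(10, {"vsan": "high", "vmotion": "low"}))
```

With these assumptions vSAN is entitled to about 8 Gbps and vMotion to about 2 Gbps while both are saturating the link. Shares only matter during contention; when the link is idle, either traffic type can use all of it.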

Based on your description, I would be inclined to say it is a misconfiguration somewhere... BUT I have also seen this behavior when there is a mismatch between firmware and drivers on the NICs. Firmware and drivers come in pairs/groups. If they are mismatched, you will see some weird behavior.
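One way to check this is to collect the output of `esxcli network nic get -n <vmnic>` from each host and compare the driver and firmware versions against the combination your vendor or the VMware Compatibility Guide lists as supported. The sketch below only extracts the three relevant fields from that output. The sample text is made up for illustration (driver name and version numbers are hypothetical), and the field layout is assumed to match the "Driver Info" section that command prints.

```python
# Minimal sketch: pull driver name, driver version, and firmware version out of
# the text printed by `esxcli network nic get -n <vmnic>`.
# ASSUMPTIONS: the sample below is invented; real values will differ.

SAMPLE_OUTPUT = """\
   Driver Info:
         Bus Info: 0000:3b:00:0
         Driver: examplenic
         Firmware Version: 1.2.3
         Version: 4.5.6
   Link Detected: true
"""

# Map the labels in the esxcli output to friendlier key names.
FIELDS = {
    "Driver": "driver",
    "Version": "driver_version",
    "Firmware Version": "firmware_version",
}


def parse_nic_versions(esxcli_output: str) -> dict:
    """Return the driver name, driver version, and firmware version found in the text."""
    result = {}
    for line in esxcli_output.splitlines():
        # Split on the first colon only, so values containing colons stay intact.
        label, sep, value = line.strip().partition(":")
        if sep and label in FIELDS:
            result[FIELDS[label]] = value.strip()
    return result


if __name__ == "__main__":
    print(parse_nic_versions(SAMPLE_OUTPUT))
```

Comparing the parsed values across all three hosts is a quick way to spot a host whose NIC firmware was left behind after a driver update.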

Hope this helps


3 Replies
sk84
Expert

We need more detailed information to be able to help.

Which version of vSAN do you use?

What do your vSAN VMkernel port settings look like?

How did you configure the port group for vSAN?

Do you use jumbo frames? And if so, are all components configured correctly?

Do you have beacon probing enabled?

---
Regards, Sebastian
VCP6.5-DCV // VCP7-CMA // vSAN 2017 Specialist
Please mark this answer as 'helpful' or 'correct' if you think your question has been answered correctly.
Alok0106
Contributor

Thank you for your reply; I found it very helpful. Because I knew my configuration was fine, I focused on your last paragraph, which says the issue can be related to NIC drivers/firmware.

I contacted my vendor and had the NIC drivers/firmware upgraded, and the issue is now resolved. Thanks again.
