We had an outage last week caused by a VM that had been misconfigured by a vSphere user.
To summarize, two vNICs were added in the same port group and an IPS (Suricata) was run in a specific network mode (AF_PACKET, if I remember correctly).
As a result, the VM started broadcasting ARP requests at an average rate of 700k packets per second.
All of this traffic was broadcast to our secondary site, taking down every active network switch along the way.
We had a hard time locating the source of the problem because the bandwidth it generated was fairly low: about 100 MBps. But 700k pps...
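For anyone wondering why a bandwidth graph hides this kind of storm, here is a rough back-of-the-envelope sketch in Python. It uses the two figures quoted above (700k pps and about 100 MBps, read here as megabytes per second). The 10 Gbit/s uplink speed is purely an assumption for illustration, not something stated in this thread.

```python
# Back-of-the-envelope numbers for the storm described above.
# LINK_GBPS is an assumed uplink speed, not a value from the thread.
PPS = 700_000            # packets per second reported
BYTES_PER_SEC = 100e6    # ~100 MBps, read as megabytes per second
LINK_GBPS = 10           # assumption: 10 Gbit/s uplink


def storm_summary(pps, bytes_per_sec, link_gbps):
    """Return (average bytes per packet, link utilisation in percent)."""
    avg_packet_bytes = bytes_per_sec / pps
    utilisation_pct = bytes_per_sec * 8 / (link_gbps * 1e9) * 100
    return avg_packet_bytes, utilisation_pct


avg_bytes, util = storm_summary(PPS, BYTES_PER_SEC, LINK_GBPS)
print(f"average packet size: {avg_bytes:.0f} bytes")
print(f"link utilisation:    {util:.1f} %")
```

Under these assumptions the link stays well below 10 % utilisation, so a bandwidth-based alarm would stay quiet even though every switch in the L2 domain still has to flood each of those small broadcast frames.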
I didn't find anything in NSX (we don't have NSX, but I assumed it would be the best candidate for this) or in the vDS that would help me prevent this, such as a packets-per-second alarm or limit.
Any ideas?
I have attached a network capture taken at the physical NIC level of the ESXi host running the virtual machine. Note that 39:dd and 96:fe are the MAC addresses of the VM's two vNICs.
Please refer to these documents:
Network I/O control for VM traffic --> Bandwidth Allocation Parameters for Virtual Machine Traffic
Traffic shaping on vsphere switch or port group --> Configure Traffic Shaping for a vSphere Standard Switch or Standard Port Group
Hope that helps
I've not seen anything in vSphere that monitors or controls packet rates. A vDS can do port mirroring, which sends all the packets to another system where you can monitor them with any network tool.
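As a rough illustration of what such monitoring on the mirror destination could look like, here is a minimal Python sketch that finds sources exceeding a packets-per-second threshold. It assumes you have already exported (timestamp, source MAC) pairs from the mirrored traffic with whatever capture tool you prefer. The function name, the threshold and the sample source labels are all hypothetical.

```python
from collections import Counter


def noisy_sources(packets, threshold_pps):
    """Return {src_mac: peak pps} for sources whose peak exceeds threshold_pps.

    packets: iterable of (timestamp_in_seconds, src_mac) tuples.
    """
    # Count packets per (source, one-second bucket).
    per_second = Counter()
    for timestamp, src_mac in packets:
        per_second[(src_mac, int(timestamp))] += 1

    # Keep the busiest second seen for each source.
    peaks = {}
    for (src_mac, _second), count in per_second.items():
        peaks[src_mac] = max(peaks.get(src_mac, 0), count)

    return {mac: peak for mac, peak in peaks.items() if peak > threshold_pps}


# Synthetic sample: one source sends 5000 packets within a single second,
# another sends 3. Both labels are placeholders, not real MAC addresses.
sample = [(10.0 + i / 5000, "storm-source") for i in range(5000)]
sample += [(10.1, "quiet-host"), (10.5, "quiet-host"), (11.2, "quiet-host")]

result = noisy_sources(sample, threshold_pps=1000)
print(result)
```

This only detects the storm after the fact. It does not limit anything, which is why it is a monitoring aid rather than a replacement for a real rate limit.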
Would you like me to move this thread to the NSX area? (I'm a forum moderator)
Unfortunately, there is no feature for that in vSphere or in NSX. What you are looking for is "Storm Control", which can be configured on some physical devices. The same applies to alarms: they cannot be configured for this kind of condition.
Here, for example, is a Cisco forum thread that discusses what to do in this situation: Solved: Prevent broadcast storm from VMs under FI - Cisco Community. To be honest, it is a reasonable solution: they suggest limiting the trunk in vSphere to only the required VLANs, so that broadcasts are not flooded to all the L2 segments.
However, this may not be the answer you were hoping for, so I encourage you to look for more Storm Control discussions on other network vendors' forums.
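To make the idea behind storm control a bit more concrete, here is a conceptual Python sketch of a per-second broadcast limiter. It is only meant to show the principle (count broadcast frames in a fixed one-second window and drop the excess). It is not how any particular switch or vendor implements the feature, and the class name and threshold are made up for the example.

```python
BROADCAST = "ff:ff:ff:ff:ff:ff"


class BroadcastLimiter:
    """Forward at most max_pps broadcast frames per one-second window."""

    def __init__(self, max_pps):
        self.max_pps = max_pps
        self.window = None   # the one-second bucket currently being counted
        self.count = 0       # broadcast frames seen in that bucket

    def allow(self, timestamp, dst_mac):
        """Return True if the frame should be forwarded, False if dropped."""
        if dst_mac.lower() != BROADCAST:
            return True      # non-broadcast traffic is not limited here
        second = int(timestamp)
        if second != self.window:
            # A new window starts: reset the counter.
            self.window = second
            self.count = 0
        self.count += 1
        return self.count <= self.max_pps


# 5000 broadcast frames arrive within the same second, limit is 100 pps.
limiter = BroadcastLimiter(max_pps=100)
forwarded = sum(limiter.allow(i / 5000, BROADCAST) for i in range(5000))
print(forwarded)
```

With a limit like this on the port facing the VM, the rest of the L2 domain would only see the first frames of each second instead of the full storm.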
Thanks for your answer. Because we are on a stretched cluster, we cannot avoid extending L2 between datacenters for the production VLANs.
From what I read, storm control on the Cisco 1000V would be my only option (or storm control on the physical Nexus switches).