Some time ago I configured a Network Load Balancing (NLB) cluster to serve web traffic in our organization. Shortly thereafter I realized that my change to using NLB meant that web traffic was being broadcast throughout the domain! I have a fairly basic knowledge of switches and such, so I'm hoping to get some assistance in configuring my system to prevent this multicast flooding.
First let me share a picture which outlines the small section of my network in question. Both switches are PowerConnect 6248s, joined by a single copper port. On VMHOST1 I have a Hyper-V virtual machine running third-party web proxy software, which is configured in a multicast NLB with a virtual machine running the same software on VMHOST2. The VMHOST servers each connect to their respective switches with a four-port LAG, and of course there is some complexity with a software-based virtual switch on each host server, which is part of the Hyper-V configuration.
The NLB is configured in multicast mode; however, IGMP is not currently enabled. This NLB configuration was specified by the third-party web proxy that I am using in the NLB. I have opened a case with them to understand why they want IGMP turned off. If my VERY limited understanding is correct, and I can turn on IGMP support for my NLB, then I should be able to do some configuration on the switches to better handle the multicast traffic.
Sadly I'm unclear on what needs to be done at the switch level to make this happen. I believe it's some combination of IGMP Snooping, Bridge Multicast Grouping, and maybe some filters but I could really use some help and guidance here to get the specifics.
Thanks a million!
I finally had time to work this all out, and the answer was pretty easy. I actually had the right approach some time ago, but unfortunately was guided away from it.
In any case the solution was as follows:
1. Change the NLB cluster type to Multicast with IGMP
2. For all switches, globally turn Bridge Filtering On, and IGMP Snooping On
3. Ensure that Bridge Multicast Forwarding Mode is set to Forward Unregistered on all switches
4. Create a new Bridge Multicast group on each switch. For the switches which host the NLB, make the LAG ports members of the multicast group. For all other switches, make all uplink ports which connect to the destination switch members.
This tells all of the switches how to direct this particular multicast traffic. Prior to getting this set up, clients would send web requests to the proxy NLB IP, and the request traffic would be flooded across all switch ports on all switches. Now the traffic is directed only to uplink ports and the actual LAG ports of the NLB cluster members.
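For anyone trying to picture what the bridge multicast group changes, here is a minimal Python sketch of the forwarding decision. It is only a model of the behavior described above, not how the switch is implemented. The port names, the MAC address, and the function name are made up for illustration.

```python
def egress_ports(dst_mac, ingress_port, all_ports, multicast_groups):
    """Return the set of ports a multicast frame is sent out of.

    multicast_groups maps a group MAC to the ports registered for it
    (the static Bridge Multicast group in the steps above).
    """
    members = multicast_groups.get(dst_mac.lower())
    if members is None:
        # Unregistered group: with "Forward Unregistered" the frame is
        # flooded to every port except the one it arrived on.
        return set(all_ports) - {ingress_port}
    # Registered group: only the member ports receive the frame.
    return set(members) - {ingress_port}


# Hypothetical switch with three client ports, one uplink and one server LAG.
ports = ["1/g1", "1/g2", "1/g3", "uplink", "lag1"]
nlb_mac = "0100.5e7f.0032"  # hypothetical NLB cluster MAC

before = egress_ports(nlb_mac, "1/g1", ports, {})
after = egress_ports(nlb_mac, "1/g1", ports, {nlb_mac: {"lag1", "uplink"}})

print(sorted(before))  # every port except the ingress port
print(sorted(after))   # only the LAG and the uplink
```

With an empty group table the request from port 1/g1 reaches every other port. Once the group exists, it reaches only the LAG and the uplink, which matches what I saw on the wire.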
Hopefully this will help someone in the future.
That is quite the problem. With IGMP disabled, the switch has no way of knowing which ports want the multicast, so it floods it to the entire subnet of a subscriber. With IGMP, subscribers announce their group membership, and the switch can direct the traffic only out the ports where those subscribers are connected. You probably already know all of that, so let me try to give you some ways to work around this.
One simple method would be to enable IGMP snooping on the switches. IGMP snooping lets the switch listen to IGMP messages and learn which ports have members of each multicast group. It will then forward the traffic only to those ports within the switch. To enable it, run this command from global configuration: "ip igmp snooping". That will enable it for all VLANs. If you put a VLAN number at the end of that command, it will only be enabled for that VLAN.
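As a rough illustration of the idea, the Python sketch below models the table a snooping switch keeps. It is a simplified model, not the switch's actual logic. The class name, group address, and port name are hypothetical.

```python
from collections import defaultdict


class IgmpSnoopingTable:
    """Toy model of the group-to-ports table built by IGMP snooping."""

    def __init__(self):
        self._members = defaultdict(set)

    def membership_report(self, group, port):
        # A host on `port` announced that it wants traffic for `group`.
        self._members[group].add(port)

    def leave(self, group, port):
        # A host on `port` no longer wants traffic for `group`.
        self._members[group].discard(port)
        if not self._members[group]:
            del self._members[group]

    def ports_for(self, group):
        """Return the member ports, or None if no host has joined.

        None means the switch has nothing to go on and the traffic is
        treated as unregistered, which typically means it is flooded.
        """
        ports = self._members.get(group)
        return set(ports) if ports else None


table = IgmpSnoopingTable()
print(table.ports_for("239.255.0.50"))   # no reports seen yet
table.membership_report("239.255.0.50", "lag1")
print(table.ports_for("239.255.0.50"))   # now limited to lag1
```

Note that the table only fills in when hosts actually send IGMP reports. If the application never sends them, the lookup keeps returning nothing and snooping has no ports to restrict the traffic to.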
Another method would be to use VLANs to segment the traffic. The servers most likely have two dual-port NICs, so I will base this off of that assumption. I would suggest creating a VLAN for the multicast traffic. Create VLAN xx for the multicast and connect one dual-port NIC from each server to that LAG. The rest of the configuration would be on the server and application. You would want the NLB traffic to go across a different subnet on that VLAN, so assign it to use that NIC and configure the NICs on the new subnet. You can then use the second NIC for normal traffic to host machines.
The multicast traffic is only used for NLB traffic so that subnet should be the only one subscribing to the multicast. If IGMP snooping does not stop the misdirected traffic then I would use the second method I described to segment that traffic from the rest of the network.
Here is the CLI guide for the 62xx: http://support.dell.com/support/edocs/network/PC62xx/en/CLI/PDF/cli_en.pdf
Thanks very much for your reply. I still have a few questions about your response. I'd like to pursue the IGMP method as it seems easier to me. I'm still waiting on the third-party web proxy publisher to get back to me on enabling IGMP for the NLB.
However, let's assume that this can be done, and so I turn on IGMP Snooping for the switch as you described. Do I need to do anything else? From other material that I was reading I sort of got the impression that I needed to 'map' the NLB MAC address to particular ports on the switch. Also since the multicast traffic needs to traverse the switch uplink to reach the other node, I assumed I would need to 'map' the uplink port with the same MAC address.
Maybe I'm thinking that things are more complicated than they really are, but I didn't think that just enabling snooping was enough on its own.
Yes, you just need to enable IGMP snooping. Static MAC address assignment on the ports is not necessary and I would discourage that practice. The switch builds its MAC address table automatically from the source addresses of the frames it receives, so the MAC addresses are already dynamically assigned. Since IGMP is not enabled by the application it would not matter anyway. Without IGMP being enabled it does not direct the packets to a MAC address.
I've turned on IGMP Snooping globally on both switches, but unfortunately the broadcasting continues. When I look at the Dynamic Address tables within the switches, I do not see the MAC address which is assigned to the NLB cluster, and so I would guess that the switch doesn't really know where to direct the traffic. Any thoughts on what to do next?
I've been looking for a resolution and have not been able to find one. I have found a few workarounds for this problem, but most will not work with NLB in multicast. I contacted our networking escalation group and was informed that the 62xx does not support NLB in multicast mode. They provided me with a possible workaround, but informed me that it would not likely work. Here is the workaround they provided me:
From the VLAN interface (e.g. interface vlan 10) that the servers are on, you can try to direct the multicast to the correct ports.
bridge multicast address 03bf.xxxx.xxxx add 1/g46,1/g47,1/g48
The MAC address will be the virtual MAC address that the application is using to send the multicast. You would want to direct it to the ports that connect the two switches from the primary server. On the switch that connects to the secondary server you would want to direct it to the ports that the server connects to.
You can also do this through the web interface in Switching>Multicast Support>Bridge Multicast Group.
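If you are not sure which virtual MAC address to enter, the Python sketch below derives it from the cluster IP, assuming NLB follows its usual documented scheme (02-BF plus the IP for unicast, 03-BF plus the IP for multicast, and 01-00-5E-7F plus the last two octets of the IP for IGMP multicast). The function name and the 10.0.0.50 address are only examples. Please confirm the real value in the cluster parameters in NLB Manager before using it.

```python
import ipaddress


def nlb_cluster_mac(cluster_ip, mode="multicast"):
    """Return the expected NLB cluster MAC in dotted xxxx.xxxx.xxxx form.

    Assumes the standard NLB addressing scheme; verify against the value
    NLB Manager actually shows for the cluster.
    """
    octets = ipaddress.IPv4Address(cluster_ip).packed  # the 4 raw IP bytes
    if mode == "unicast":
        raw = bytes([0x02, 0xBF]) + octets
    elif mode == "multicast":
        raw = bytes([0x03, 0xBF]) + octets
    elif mode == "igmp":
        # IGMP multicast keeps only the last two octets of the cluster IP.
        raw = bytes([0x01, 0x00, 0x5E, 0x7F]) + octets[2:]
    else:
        raise ValueError("mode must be 'unicast', 'multicast' or 'igmp'")
    hex_digits = raw.hex()
    return ".".join(hex_digits[i:i + 4] for i in range(0, 12, 4))


# Hypothetical cluster IP, for illustration only.
print(nlb_cluster_mac("10.0.0.50", "multicast"))  # 03bf.0a00.0032
print(nlb_cluster_mac("10.0.0.50", "igmp"))       # 0100.5e7f.0032
```

The dotted form matches the way the address is written in the bridge multicast address command above.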
I hope this helps, thanks
Thanks again for your response. Unfortunately the interface seems to not want to bridge a non-IGMP multicast address. I'm thinking that I might do some testing this weekend during a maintenance window to see if I can turn on IGMP for the NLB, and see what I need to do at the switch to get things working. Thanks for your help.
Still not having any luck. I've tried just about everything I can think of on the switch and I cannot get the flooding to stop while still maintaining the functionality of the NLB. The next approach is going to be to try and add a second NIC to each of the machines in the NLB cluster, and either tie them directly together with a crossover cable, or a hub. I've read somewhere that this may work. If anyone has any further comments or suggestions they would be welcomed.
Have you had any luck with this? I am having the same issue.
Unfortunately I haven't had any more time to test the second NIC / hub theory yet. I'll admit that I'm a little surprised that there isn't an easy/clear option for solving this sort of situation. I would think that plenty of companies run NLB clusters in some form, and that there would be some well established documentation on how to work around this issue.
If I am able to solve the problem I'll certainly post something here.