Forum Discussion
ASKRZYNIARZ
Jul 07, 2020 Aspirant
LACP doesn't balance connections between links
Hello, I have some issues with LACP and a M4300-24X24F switch: - I set up LACP bonding between two linux/debian hosts. - I use two 10Gb optical links per machine. - linux hosts hash policy is lay...
ASKRZYNIARZ
Jul 08, 2020 Aspirant
Hi. Thank you for your answer.
I upgraded the switch firmware to version 12.0.11.13
I still have the same problem. Switching to mode 4 (Src IP and Src TCP/UDP Port fields) didn't fix the problem either.
You will receive a link to the maintenance file shortly.
--- Here is some information about the MAC addresses on the receiving side ---
Extracts from maintenance file:
*************** show mac-addr-table ***************

Address Entries Currently in Use............... 16

VLAN ID  MAC Address         Interface              IfIndex  Status
-------  ------------------  ---------------------  -------  ------------
1        00:1E:C1:81:1A:42   1/0/25                 25       Learned
1        10:7B:44:92:74:8B   1/0/25                 25       Learned
1        40:A6:B7:03:DD:24   lag 3                  772      Learned
1        40:A6:B7:03:DD:25   lag 3                  772      Learned
1        40:A6:B7:0B:8F:18   lag 1                  770      Learned
1        40:A6:B7:0B:8F:19   lag 1                  770      Learned
1        40:A6:B7:0B:90:3C   lag 2                  771      Learned
1        40:A6:B7:0B:90:3D   lag 2                  771      Learned
1        84:A9:3E:84:4C:30   1/0/27                 27       Learned
1        84:A9:3E:84:8E:B8   1/0/26                 26       Learned
1        84:A9:3E:8A:86:01   1/0/28                 28       Learned
1        8C:3B:AD:66:7D:80   CPU Interface: 0/15/1  769      Management
1        8C:3B:AD:66:7D:83   vlan 1                 898      Management
1        AC:1F:6B:D9:94:6B   1/0/43                 43       Learned
1        AC:1F:6B:DC:52:6F   1/0/45                 45       Learned
1        AC:1F:6B:DC:52:E9   1/0/47                 47       Learned
and
*************** show lldp remote-device all ***************

LLDP Remote Device Summary

Local
Interface  RemID    Chassis ID            Port ID             System Name
---------  -------  --------------------  ------------------  ------------------
1/0/1      7        40:A6:B7:0B:8F:19     40:A6:B7:0B:8F:19
1/0/2      11       40:A6:B7:0B:8F:18     40:A6:B7:0B:8F:18
1/0/3      12       40:A6:B7:0B:90:3C     40:A6:B7:0B:90:3C
1/0/4      13       40:A6:B7:0B:90:3D     40:A6:B7:0B:90:3D
1/0/5      15       40:A6:B7:03:DD:25     40:A6:B7:03:DD:25
1/0/6      14       40:A6:B7:03:DD:24     40:A6:B7:03:DD:24
1/0/7
1/0/8
If I look at my interface definition, I have:
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: enp24s0f0: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master bond0 state UP group default qlen 1000
link/ether 40:a6:b7:03:dd:24 brd ff:ff:ff:ff:ff:ff
3: enp24s0f1: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master bond0 state UP group default qlen 1000
link/ether 40:a6:b7:03:dd:24 brd ff:ff:ff:ff:ff:ff
4: eno1: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
link/ether ac:1f:6b:d9:94:6a brd ff:ff:ff:ff:ff:ff
5: eno2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
link/ether ac:1f:6b:d9:94:6b brd ff:ff:ff:ff:ff:ff
inet 192.168.42.121/24 brd 192.168.42.255 scope global eno2
valid_lft forever preferred_lft forever
inet6 fe80::ae1f:6bff:fed9:946b/64 scope link
valid_lft forever preferred_lft forever
9: bond0: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
link/ether 40:a6:b7:03:dd:24 brd ff:ff:ff:ff:ff:ff
inet 192.168.120.121/24 brd 192.168.120.255 scope global bond0
valid_lft forever preferred_lft forever
inet 192.168.121.121/24 scope global bond0
valid_lft forever preferred_lft forever
inet6 fe80::42a6:b7ff:fe03:dd24/64 scope link
valid_lft forever preferred_lft forever

enp24s0 is an Intel X710-DA2 network adapter. 40:a6:b7:03:dd:24 and 40:a6:b7:03:dd:25 are the adapter's default MAC addresses, as they were before bond0 was set up.
The 40:a6:b7:03:dd:25 address is discovered from either LLDP or LACPv1 frames:
root@sls-1:/sys/class/net/bond0/bonding# tcpdump -i enp24s0f1 -e | grep dd:25
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on enp24s0f1, link-type EN10MB (Ethernet), capture size 262144 bytes
13:31:25.860533 40:a6:b7:03:dd:25 (oui Unknown) > 01:80:c2:00:00:02 (oui Unknown), ethertype Slow Protocols (0x8809), length 124: LACPv1, length 110
13:31:57.068536 40:a6:b7:03:dd:25 (oui Unknown) > 01:80:c2:00:00:02 (oui Unknown), ethertype Slow Protocols (0x8809), length 124: LACPv1, length 110
13:32:28.268538 40:a6:b7:03:dd:25 (oui Unknown) > 01:80:c2:00:00:02 (oui Unknown), ethertype Slow Protocols (0x8809), length 124: LACPv1, length 110
13:32:59.468532 40:a6:b7:03:dd:25 (oui Unknown) > 01:80:c2:00:00:02 (oui Unknown), ethertype Slow Protocols (0x8809), length 124: LACPv1, length 110
--- some experiments ---
I tried disabling LLDP (I set LLDP Transmit and Receive to disable on the relevant physical interfaces).
According to the maintenance file, 40:a6:b7:03:dd:25 is no longer in the LLDP table, but it is still present in the mac-addr table (assigned to lag 3). I guess it is learned from the LACP frames.
That didn't fix my issue either.
LaurentMa
Jul 08, 2020 NETGEAR Expert
Thank you for the tech-support file. Let me recap what I can see in the file:
- The two relevant LAGs in the M4300-24X24F switch are now LAG1 (1/0/1, 1/0/2) and LAG3 (1/0/5, 1/0/6).
- Your Sender Intel X710-DA2 has enp97s0f0 (40:a6:b7:0b:8f:18) and enp97s0f1 (40:a6:b7:0b:8f:19) under bond0 (40:a6:b7:0b:8f:18).
--> Sender enp97s0f0 (40:a6:b7:0b:8f:18) is connected to 1/0/2 and Sender enp97s0f1 (40:a6:b7:0b:8f:19) is connected to 1/0/1 both under LAG1.
- The Receiver Intel X710-DA2 has enp24s0f0 (40:a6:b7:03:dd:24) and enp24s0f1 (40:a6:b7:03:dd:25) under bond0 (40:a6:b7:03:dd:24).
--> Receiver enp24s0f0 (40:a6:b7:03:dd:24) is connected to 1/0/6 and Receiver enp24s0f1 (40:a6:b7:03:dd:25) is connected to 1/0/5 both under LAG3.
- According to your tests, neither LAG mode 6 nor mode 4 changes the behavior on LAG3: on the receive side, enp24s0f0 (40:a6:b7:03:dd:24) doesn't receive much traffic out of 1/0/6, and all the traffic is received by enp24s0f1 (40:a6:b7:03:dd:25) out of 1/0/5.
ASKRZYNIARZ, I didn't find any obvious issue in the Tech-Support file, so I am escalating it internally to the rest of the team, and we will come back to you here shortly.
Regards,
LaurentMa
Jul 10, 2020 NETGEAR Expert
Hi ASKRZYNIARZ
Thank you for the tech-support file. Everything looks normal in it, which means there is no particular issue with your configuration on the switch.
Please switch back to Mode 6 (Src/Dest IP and TCP/UDP Port fields) for your LAG on the receive side.
On the transmit side, the hashing is OK, but on the receive side the switch is using only one link. This would indicate that the Modulo-N hash algorithm is defeated by your iperf setup. If you think about it, there are only 2 ports in the LAG, so the algorithm outputs 1 or 2 each time. Given that the source and destination MAC and IP addresses are the same for all packets, it all comes down to the 4 different TCP ports that iperf uses on the transmit side. I believe these four ports are defeating the algorithm in this particular setup.
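To make the Modulo-N reasoning above concrete, here is a minimal Python sketch. It is an illustration only: the XOR-then-modulo function is a toy stand-in and not the actual M4300 algorithm, and the sender address, server port and client port lists are assumed values chosen for the example. It shows how, with 2 links, a set of source ports that all share the same low bit can end up on a single link.

```python
from collections import Counter

def toy_link_for_flow(src_ip, dst_ip, src_port, dst_port, n_links=2):
    """Toy XOR-then-modulo hash. NOT the switch's real algorithm."""
    return (src_ip ^ dst_ip ^ src_port ^ dst_port) % n_links

# Assumed values for illustration only.
SRC_IP = int.from_bytes(bytes([192, 168, 120, 120]), "big")  # hypothetical sender address
DST_IP = int.from_bytes(bytes([192, 168, 120, 121]), "big")  # receiver bond0 address
DST_PORT = 5001                                              # assumed iperf server port

same_parity_ports = [50000, 50002, 50004, 50006]  # four flows, all even source ports
mixed_parity_ports = [50000, 50001, 50002, 50003]  # four flows, even and odd

def spread(ports):
    """Count how many flows land on each link index."""
    return Counter(toy_link_for_flow(SRC_IP, DST_IP, p, DST_PORT) for p in ports)

print("same-parity source ports :", dict(spread(same_parity_ports)))
print("mixed-parity source ports:", dict(spread(mixed_parity_ports)))
```

With 2 links, the modulo keeps only the lowest bit of the combined value, so in this toy model the link choice depends solely on the parity of the fields that vary between flows.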
Could you test with more TCP connections, for example 8? Another test would be to change the destination TCP ports for the iperf connections. You could start multiple iperf servers (on the same server) on different ports with the “-p” option.
Please let us know and we'll continue to help you.
Regards,
ASKRZYNIARZ
Jul 15, 2020 Aspirant
Hi,
I ran the tests you suggested, and you are right: my tests were defeating the hash function.
I checked all the available hash functions, and every one of them was defeated in my setup.
By default, iperf uses successive even numbers for client ports.
By forcing odd numbers for some client ports, the connections are balanced as expected.
The main issue is that the various Linux kernels I have at hand (4.19 or 5+) seem to use only even numbers for random client ports, so I guess the connections may never balance properly if the LAG is used exclusively for communication between the same two Linux hosts.
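For anyone who wants to check the port parity observation on their own hosts, here is a small Python sketch. It is only a rough probe under stated assumptions: the helper names are made up for this example, it talks to a throwaway listener on the loopback interface rather than through the LAG, and the result may differ between kernels and configurations.

```python
import socket
from collections import Counter

def parity_counts(ports):
    """Count how many port numbers are even and how many are odd."""
    return Counter("even" if p % 2 == 0 else "odd" for p in ports)

def sample_source_ports(n=16):
    """Open n TCP connections to a temporary loopback listener and
    return the source port the kernel picked for each connection."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the kernel pick a free port
    listener.listen(n)
    clients, ports = [], []
    try:
        for _ in range(n):
            client = socket.create_connection(listener.getsockname())
            clients.append(client)  # keep it open so the port is not reused
            ports.append(client.getsockname()[1])
    finally:
        for client in clients:
            client.close()
        listener.close()
    return ports
```

Running `print(parity_counts(sample_source_ports()))` on each host shows whether the automatically assigned client ports are all even there.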
I don't know whether my use case is uncommon or not, but if it is not, you should consider adding a new hash method that feeds L2 and L3 info to a pseudorandom function (pseudorandom as defined in cryptography) and chooses the port by applying a modulo to the value the function returns.
I guess that would solve any similar issues.
ASKRZYNIARZ
Jul 15, 2020 Aspirant
Edit: the info should be L3 and L4, not L2 and L3.
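The suggested hash method could look roughly like the Python sketch below. This is only an illustration of the idea, not something the switch implements: the function name, the key, the sender address and the port values are all hypothetical, and keyed BLAKE2b is just one possible choice of pseudorandom function. It feeds the L3 and L4 fields to the keyed hash and applies a modulo to the result to pick the link.

```python
import hashlib
import ipaddress
import struct
from collections import Counter

def prf_link_for_flow(src_ip, dst_ip, src_port, dst_port,
                      n_links=2, key=b"hypothetical-switch-key"):
    """Pick a LAG member by passing the L3/L4 fields through a keyed hash."""
    fields = (
        ipaddress.ip_address(src_ip).packed
        + ipaddress.ip_address(dst_ip).packed
        + struct.pack("!HH", src_port, dst_port)  # two 16-bit ports, network order
    )
    digest = hashlib.blake2b(fields, key=key, digest_size=8).digest()
    return int.from_bytes(digest, "big") % n_links

# Assumed example flows: same two hosts, same server port, even client ports only.
even_ports = range(50000, 50064, 2)  # 32 flows
counts = Counter(
    prf_link_for_flow("192.168.120.120", "192.168.120.121", p, 5001)
    for p in even_ports
)
print(dict(counts))
```

Because every input bit influences the digest, flows whose ports differ only above the lowest bit are no longer tied to one link. The split is statistical rather than exact, so a small number of flows can still be unevenly distributed.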