Forum Discussion
ASKRZYNIARZ (Aspirant)
Jul 07, 2020
LACP doesn't balance connections between links
Hello,
I have some issues with LACP on an M4300-24X24F switch:
- I set up LACP bonding between two Linux (Debian) hosts.
- I use two 10Gb optical links per machine.
- The Linux hosts' hash policy is layer3+4
- switch hash mode is 6 Src/Dest IP and TCP/UDP Port fields
- I test the setup with iperf (in parallel mode).
- switch firmware version is 12.0.7.13
- switch boot version is B1.0.0.11
So far, between the sender and the switch, things work as expected: connections are balanced between the two network links.
But between the switch and the receiver, only one network link is used. I expect the switch to balance the connections between the two links, but that is not the case.
Therefore, I am stuck at 10 Gb/s of bandwidth (I was expecting 20 Gb/s).
The two relevant LAGs are lag2 and lag3. According to the management web interface, the active ports are OK and both LAGs are up:
lag2:
- admin mode enabled
- hash mode 6 Src/Dest IP and TCP/UDP Port fields
- STP mode Enable
- Static Mode Disable
- Link Trap Disable
- Configured Ports 1/0/3, 1/0/4
- Active Ports 1/0/3, 1/0/4
- LAG State Up
- Local Preference Mode Disable
lag3:
- admin mode enabled
- hash mode 6 Src/Dest IP and TCP/UDP Port fields
- STP mode Enable
- Static Mode Disable
- Link Trap Disable
- Configured Ports 1/0/5, 1/0/6
- Active Ports 1/0/5, 1/0/6
- LAG State Up
- Local Preference Mode Disable
According to interfaces counters, on the sending side:
- 4234458 packets through enp97s0f0
- 3902574 packets through enp97s0f1
On the receiving side:
- 2 packets through enp24s0f0
- 8136583 packets through enp24s0f1
As you can see from netstat, iperf uses different connections:
ESTAB 0 3317368 192.168.120.123:60222 192.168.120.121:5001 users:(("iperf",pid=103750,fd=3))
ESTAB 0 3844440 192.168.120.123:60226 192.168.120.121:5001 users:(("iperf",pid=103750,fd=5))
ESTAB 0 2997360 192.168.120.123:60224 192.168.120.121:5001 users:(("iperf",pid=103750,fd=6))
ESTAB 0 3974760 192.168.120.123:60220 192.168.120.121:5001 users:(("iperf",pid=103750,fd=4))
Am I wrong to expect the connections to be balanced between enp24s0f0 and enp24s0f1 with Switch Hash mode 6?
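For intuition on why four distinct connections can still end up on a single link, here is a minimal Python sketch of a toy XOR-and-modulo hash over the layer 3+4 fields, modeled on the layer3+4 formula given in older Linux bonding documentation. This is an assumption for illustration only: the algorithm behind the M4300's hash mode 6 is not described in this thread, and `toy_l34_hash` is a hypothetical name.

```python
import ipaddress

def toy_l34_hash(src_ip, dst_ip, src_port, dst_port, n_links):
    """Toy layer 3+4 hash (illustrative assumption, not the switch's algorithm)."""
    ip_xor = int(ipaddress.ip_address(src_ip)) ^ int(ipaddress.ip_address(dst_ip))
    return ((src_port ^ dst_port) ^ (ip_xor & 0xFFFF)) % n_links

# The four iperf connections listed above (source ports taken from the post).
flows = [("192.168.120.123", "192.168.120.121", port, 5001)
         for port in (60222, 60226, 60224, 60220)]

# Link index (0 or 1) that the toy hash picks for each flow on a 2-link LAG.
chosen = [toy_l34_hash(src, dst, sport, dport, 2) for src, dst, sport, dport in flows]
print(chosen)
```

With this toy hash, all four flows map to the same link index, because the four source ports are all even while the addresses and the destination port are identical. That does not show what the switch actually does; it only illustrates that distinct connections do not guarantee distinct hash results, so a test with more streams or with source ports that differ in their low bits may be worth trying.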
--- please find additional technical information below this line ---
sender interface definition:
auto bond0
iface bond0 inet static
    address 192.168.120.123
    netmask 255.255.255.0
    slaves enp97s0f0 enp97s0f1
    bond_mode 802.3ad
    bond_xmit_hash_policy layer3+4
receiver interface definition:
auto bond0
iface bond0 inet static
    address 192.168.120.121
    netmask 255.255.255.0
    slaves enp24s0f0 enp24s0f1
    bond_mode 802.3ad
    bond_xmit_hash_policy layer3+4
Here are the values of the sender interface counters:
Before test:
4: enp97s0f0: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master bond0 state UP mode DEFAULT group default qlen 1000
    link/ether 40:a6:b7:0b:8f:18 brd ff:ff:ff:ff:ff:ff
    RX: bytes     packets   errors dropped overrun mcast
    438291403     6631898   0      3       0       6763
    TX: bytes     packets   errors dropped carrier collsns
    219646627438  145080666 0      0       0       0
5: enp97s0f1: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master bond0 state UP mode DEFAULT group default qlen 1000
    link/ether 40:a6:b7:0b:8f:18 brd ff:ff:ff:ff:ff:ff
    RX: bytes     packets   errors dropped overrun mcast
    240457512     3634166   0      3       0       6209
    TX: bytes     packets   errors dropped carrier collsns
    233533089838  154252127 0      0       0       0
11: bond0: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 40:a6:b7:0b:8f:18 brd ff:ff:ff:ff:ff:ff
    RX: bytes     packets   errors dropped overrun mcast
    507559067     7680048   0      0       0       4846
    TX: bytes     packets   errors dropped carrier collsns
    334964455166  221250750 0      5       0       0
After test:
4: enp97s0f0: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master bond0 state UP mode DEFAULT group default qlen 1000
    link/ether 40:a6:b7:0b:8f:18 brd ff:ff:ff:ff:ff:ff
    RX: bytes     packets   errors dropped overrun mcast
    438291403     6631898   0      3       0       6763
    TX: bytes     packets   errors dropped carrier collsns
    226057578116  149315124 0      0       0       0
5: enp97s0f1: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master bond0 state UP mode DEFAULT group default qlen 1000
    link/ether 40:a6:b7:0b:8f:18 brd ff:ff:ff:ff:ff:ff
    RX: bytes     packets   errors dropped overrun mcast
    261713102     3955756   0      3       0       6209
    TX: bytes     packets   errors dropped carrier collsns
    239441568044  158154701 0      0       0       0
11: bond0: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 40:a6:b7:0b:8f:18 brd ff:ff:ff:ff:ff:ff
    RX: bytes     packets   errors dropped overrun mcast
    528814657     8001638   0      0       0       4846
    TX: bytes     packets   errors dropped carrier collsns
    347283884050  229387782 0      5       0       0
Here are the values of the receiver interface counters:
Before test:
2: enp24s0f0: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master bond0 state UP mode DEFAULT group default qlen 1000
    link/ether 40:a6:b7:03:dd:24 brd ff:ff:ff:ff:ff:ff
    RX: bytes     packets   errors dropped overrun mcast
    286408898772  189180258 0      3       0       6799
    TX: bytes     packets   errors dropped carrier collsns
    422886324     6401287   0      0       0       0
3: enp24s0f1: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master bond0 state UP mode DEFAULT group default qlen 1000
    link/ether 40:a6:b7:03:dd:24 brd ff:ff:ff:ff:ff:ff
    RX: bytes     packets   errors dropped overrun mcast
    195598904910  129201244 0      3       0       6221
    TX: bytes     packets   errors dropped carrier collsns
    281424384     4258683   0      0       0       0
9: bond0: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 40:a6:b7:03:dd:24 brd ff:ff:ff:ff:ff:ff
    RX: bytes     packets   errors dropped overrun mcast
    363792080646  240292223 0      0       0       4832
    TX: bytes     packets   errors dropped carrier collsns
    534048768     8081408   0      1       0       0
After test:
2: enp24s0f0: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master bond0 state UP mode DEFAULT group default qlen 1000
    link/ether 40:a6:b7:03:dd:24 brd ff:ff:ff:ff:ff:ff
    RX: bytes     packets   errors dropped overrun mcast
    286408898956  189180260 0      3       0       6800
    TX: bytes     packets   errors dropped carrier collsns
    433834400     6566968   0      0       0       0
3: enp24s0f1: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master bond0 state UP mode DEFAULT group default qlen 1000
    link/ether 40:a6:b7:03:dd:24 brd ff:ff:ff:ff:ff:ff
    RX: bytes     packets   errors dropped overrun mcast
    207917656870  137337827 0      3       0       6222
    TX: bytes     packets   errors dropped carrier collsns
    291732128     4414594   0      0       0       0
9: bond0: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 40:a6:b7:03:dd:24 brd ff:ff:ff:ff:ff:ff
    RX: bytes     packets   errors dropped overrun mcast
    376110832790  248428808 0      0       0       4834
    TX: bytes     packets   errors dropped carrier collsns
    555304588     8403000   0      1       0       0
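The per-link packet counts quoted earlier can be recomputed from the before/after counters listed above. A minimal Python sketch, with the counter values copied from those listings (the `deltas` and `shares` helper names are my own):

```python
# (before, after) packet counters copied from the listings above:
# TX packets on the sender slaves, RX packets on the receiver slaves.
counters = {
    "sender TX": {
        "enp97s0f0": (145080666, 149315124),
        "enp97s0f1": (154252127, 158154701),
    },
    "receiver RX": {
        "enp24s0f0": (189180258, 189180260),
        "enp24s0f1": (129201244, 137337827),
    },
}

def deltas(side):
    """Packets counted during the test, per interface."""
    return {ifname: after - before for ifname, (before, after) in side.items()}

def shares(side):
    """Fraction of the test traffic carried by each interface."""
    d = deltas(side)
    total = sum(d.values())
    return {ifname: count / total for ifname, count in d.items()}

for name, side in counters.items():
    for ifname, share in shares(side).items():
        print(f"{name} {ifname}: {deltas(side)[ifname]} packets ({share:.1%})")
```

The sender splits its traffic roughly evenly across its two links, while on the receiver essentially all packets arrive on enp24s0f1.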
sender /proc/net/bonding/bond0:
Ethernet Channel Bonding Driver: v3.7.1 (April 27, 2011)

Bonding Mode: IEEE 802.3ad Dynamic link aggregation
Transmit Hash Policy: layer3+4 (1)
MII Status: up
MII Polling Interval (ms): 100
Up Delay (ms): 0
Down Delay (ms): 0
Peer Notification Delay (ms): 0

802.3ad info
LACP rate: slow
Min links: 0
Aggregator selection policy (ad_select): stable
System priority: 65535
System MAC address: 40:a6:b7:0b:8f:18
Active Aggregator Info:
    Aggregator ID: 8
    Number of ports: 2
    Actor Key: 15
    Partner Key: 770
    Partner Mac Address: 8c:3b:ad:66:7d:80

Slave Interface: enp97s0f0
MII Status: up
Speed: 10000 Mbps
Duplex: full
Link Failure Count: 0
Permanent HW addr: 40:a6:b7:0b:8f:18
Slave queue ID: 0
Aggregator ID: 8
Actor Churn State: none
Partner Churn State: none
Actor Churned Count: 0
Partner Churned Count: 0
details actor lacp pdu:
    system priority: 65535
    system mac address: 40:a6:b7:0b:8f:18
    port key: 15
    port priority: 255
    port number: 1
    port state: 61
details partner lacp pdu:
    system priority: 32768
    system mac address: 8c:3b:ad:66:7d:80
    oper key: 770
    port priority: 128
    port number: 2
    port state: 61

Slave Interface: enp97s0f1
MII Status: up
Speed: 10000 Mbps
Duplex: full
Link Failure Count: 0
Permanent HW addr: 40:a6:b7:0b:8f:19
Slave queue ID: 0
Aggregator ID: 8
Actor Churn State: none
Partner Churn State: none
Actor Churned Count: 0
Partner Churned Count: 0
details actor lacp pdu:
    system priority: 65535
    system mac address: 40:a6:b7:0b:8f:18
    port key: 15
    port priority: 255
    port number: 2
    port state: 61
details partner lacp pdu:
    system priority: 32768
    system mac address: 8c:3b:ad:66:7d:80
    oper key: 770
    port priority: 128
    port number: 1
    port state: 61
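The "port state: 61" values in this output are LACP state bitmasks. As a reading aid, here is a small Python sketch that decodes them using the bit positions defined for the LACP actor/partner state field; the flag labels and the `decode_port_state` name are my own shorthand, not output of the bonding driver.

```python
# Actor/partner state bits as defined for LACP (IEEE 802.3ad / 802.1AX).
LACP_STATE_BITS = [
    (0x01, "activity"),         # 1 = active LACP
    (0x02, "short_timeout"),    # 1 = short timeout, 0 = long timeout
    (0x04, "aggregation"),      # link may be aggregated
    (0x08, "synchronization"),  # link is in sync with the aggregator
    (0x10, "collecting"),       # receiving on this link is enabled
    (0x20, "distributing"),     # transmitting on this link is enabled
    (0x40, "defaulted"),        # using default partner info (no LACPDU seen)
    (0x80, "expired"),          # partner information has expired
]

def decode_port_state(value):
    """Return the names of the flags set in an LACP port state byte."""
    return [name for mask, name in LACP_STATE_BITS if value & mask]

print(decode_port_state(61))
```

State 61 decodes to active, aggregatable, in sync, collecting and distributing, with the long timeout (matching "LACP rate: slow"). Since actor and partner both report 61 on every slave, the LACP negotiation itself looks healthy on each link.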
receiver /proc/net/bonding/bond0:
Ethernet Channel Bonding Driver: v3.7.1 (April 27, 2011)

Bonding Mode: IEEE 802.3ad Dynamic link aggregation
Transmit Hash Policy: layer3+4 (1)
MII Status: up
MII Polling Interval (ms): 100
Up Delay (ms): 0
Down Delay (ms): 0
Peer Notification Delay (ms): 0

802.3ad info
LACP rate: slow
Min links: 0
Aggregator selection policy (ad_select): stable
System priority: 65535
System MAC address: 40:a6:b7:03:dd:24
Active Aggregator Info:
    Aggregator ID: 5
    Number of ports: 2
    Actor Key: 15
    Partner Key: 772
    Partner Mac Address: 8c:3b:ad:66:7d:80

Slave Interface: enp24s0f0
MII Status: up
Speed: 10000 Mbps
Duplex: full
Link Failure Count: 0
Permanent HW addr: 40:a6:b7:03:dd:24
Slave queue ID: 0
Aggregator ID: 5
Actor Churn State: none
Partner Churn State: none
Actor Churned Count: 0
Partner Churned Count: 0
details actor lacp pdu:
    system priority: 65535
    system mac address: 40:a6:b7:03:dd:24
    port key: 15
    port priority: 255
    port number: 1
    port state: 61
details partner lacp pdu:
    system priority: 32768
    system mac address: 8c:3b:ad:66:7d:80
    oper key: 772
    port priority: 128
    port number: 6
    port state: 61

Slave Interface: enp24s0f1
MII Status: up
Speed: 10000 Mbps
Duplex: full
Link Failure Count: 0
Permanent HW addr: 40:a6:b7:03:dd:25
Slave queue ID: 0
Aggregator ID: 5
Actor Churn State: none
Partner Churn State: none
Actor Churned Count: 0
Partner Churned Count: 0
details actor lacp pdu:
    system priority: 65535
    system mac address: 40:a6:b7:03:dd:24
    port key: 15
    port priority: 255
    port number: 2
    port state: 61
details partner lacp pdu:
    system priority: 32768
    system mac address: 8c:3b:ad:66:7d:80
    oper key: 772
    port priority: 128
    port number: 5
    port state: 61
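A quick sanity check on this kind of output is that every slave reports the same Aggregator ID, since slaves in different aggregators would not share traffic. A minimal Python sketch, assuming the line-per-field layout that /proc/net/bonding/bond0 normally has; `slave_aggregators` is a hypothetical helper and the sample is an excerpt of the receiver output above:

```python
def slave_aggregators(text):
    """Map each slave interface to the Aggregator ID reported for it."""
    result = {}
    current = None  # slave section currently being read, if any
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("Slave Interface:"):
            current = line.split(":", 1)[1].strip()
        elif line.startswith("Aggregator ID:") and current is not None:
            # Lines before the first slave section describe the active aggregator
            # and are skipped because current is still None.
            result[current] = int(line.split(":", 1)[1])
    return result

# Excerpt of the receiver output above, reduced to the relevant fields.
sample = """\
Active Aggregator Info:
    Aggregator ID: 5
Slave Interface: enp24s0f0
Aggregator ID: 5
Slave Interface: enp24s0f1
Aggregator ID: 5
"""

agg = slave_aggregators(sample)
print(agg, "same aggregator:", len(set(agg.values())) == 1)
```

On a live host the same function could be fed the contents of /proc/net/bonding/bond0. Here both receiver slaves sit in aggregator 5, so the bond itself considers both links usable.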
8 Replies
- LaurentMa (NETGEAR Expert)
Hi ASKRZYNIARZ
Thank you for your message. Reaching 20 Gbps over a LAG is not guaranteed (it really depends on your servers' capacity), but you are right: you should expect the traffic to be balanced between the two legs of the LAG on the receive side too.
Thank you for the detailed info; it saves us time in reviewing your config. I don't see anything wrong so far. Can you tell us the results after these few changes:
1- Current software is 12.0.11.13, please upgrade (download section of the M4300 support page below):
https://www.netgear.com/support/product/m4300.aspx
2- Please retest with 12.0.11.13, is it the same?
3- If the same, please change the LAG settings (RECEIVE side) in the switch to hash mode 4 Src IP and Src TCP/UDP Port fields. Since we know iperf is sending from different source ports to the same destination port, retesting this way will show us whether this is a hashing issue or something else. What is the result?
4- If the same, please export the Tech-support file from your switch (Maintenance\Export\HTTP File Export\tech support in the dropdown menu) and contact me by PM with a link to your tech support file. We will look into it and search for the root cause. In that case, something other than the LAG/hash should be causing the issue.
Regards,
- ASKRZYNIARZ (Aspirant)
Hi. Thank you for your answer.
I upgraded the switch firmware to version 12.0.11.13
I still have the same problem. Switching to mode 4 (Src IP and Src TCP/UDP Port fields) didn't fix the problem either.
You will receive a link to the maintenance file shortly.
--- please find here some information about the receiving MAC addresses ---
Extracts from maintenance file:
*************** show mac-addr-table ***************
Address Entries Currently in Use............... 16

VLAN ID  MAC Address        Interface              IfIndex  Status
-------  -----------------  ---------------------  -------  ----------
1        00:1E:C1:81:1A:42  1/0/25                 25       Learned
1        10:7B:44:92:74:8B  1/0/25                 25       Learned
1        40:A6:B7:03:DD:24  lag 3                  772      Learned
1        40:A6:B7:03:DD:25  lag 3                  772      Learned
1        40:A6:B7:0B:8F:18  lag 1                  770      Learned
1        40:A6:B7:0B:8F:19  lag 1                  770      Learned
1        40:A6:B7:0B:90:3C  lag 2                  771      Learned
1        40:A6:B7:0B:90:3D  lag 2                  771      Learned
1        84:A9:3E:84:4C:30  1/0/27                 27       Learned
1        84:A9:3E:84:8E:B8  1/0/26                 26       Learned
1        84:A9:3E:8A:86:01  1/0/28                 28       Learned
1        8C:3B:AD:66:7D:80  CPU Interface: 0/15/1  769      Management
1        8C:3B:AD:66:7D:83  vlan 1                 898      Management
1        AC:1F:6B:D9:94:6B  1/0/43                 43       Learned
1        AC:1F:6B:DC:52:6F  1/0/45                 45       Learned
1        AC:1F:6B:DC:52:E9  1/0/47                 47       Learned
and
*************** show lldp remote-device all ***************
LLDP Remote Device Summary

Local Interface  RemID  Chassis ID         Port ID            System Name
---------------  -----  -----------------  -----------------  -----------
1/0/1            7      40:A6:B7:0B:8F:19  40:A6:B7:0B:8F:19
1/0/2            11     40:A6:B7:0B:8F:18  40:A6:B7:0B:8F:18
1/0/3            12     40:A6:B7:0B:90:3C  40:A6:B7:0B:90:3C
1/0/4            13     40:A6:B7:0B:90:3D  40:A6:B7:0B:90:3D
1/0/5            15     40:A6:B7:03:DD:25  40:A6:B7:03:DD:25
1/0/6            14     40:A6:B7:03:DD:24  40:A6:B7:03:DD:24
1/0/7
1/0/8
If I look at my interface definition, I have:
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: enp24s0f0: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master bond0 state UP group default qlen 1000
    link/ether 40:a6:b7:03:dd:24 brd ff:ff:ff:ff:ff:ff
3: enp24s0f1: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master bond0 state UP group default qlen 1000
    link/ether 40:a6:b7:03:dd:24 brd ff:ff:ff:ff:ff:ff
4: eno1: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether ac:1f:6b:d9:94:6a brd ff:ff:ff:ff:ff:ff
5: eno2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether ac:1f:6b:d9:94:6b brd ff:ff:ff:ff:ff:ff
    inet 192.168.42.121/24 brd 192.168.42.255 scope global eno2
       valid_lft forever preferred_lft forever
    inet6 fe80::ae1f:6bff:fed9:946b/64 scope link
       valid_lft forever preferred_lft forever
9: bond0: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 40:a6:b7:03:dd:24 brd ff:ff:ff:ff:ff:ff
    inet 192.168.120.121/24 brd 192.168.120.255 scope global bond0
       valid_lft forever preferred_lft forever
    inet 192.168.121.121/24 scope global bond0
       valid_lft forever preferred_lft forever
    inet6 fe80::42a6:b7ff:fe03:dd24/64 scope link
       valid_lft forever preferred_lft forever
enp24s0 is an Intel X710-DA2 network adapter. 40:a6:b7:03:dd:24 and 40:a6:b7:03:dd:25 are the default adapter MAC addresses before setting up bond0.
The 40:a6:b7:03:dd:25 address is discovered either from LLDP or from LACPv1 frames:
root@sls-1:/sys/class/net/bond0/bonding# tcpdump -i enp24s0f1 -e | grep dd:25
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on enp24s0f1, link-type EN10MB (Ethernet), capture size 262144 bytes
13:31:25.860533 40:a6:b7:03:dd:25 (oui Unknown) > 01:80:c2:00:00:02 (oui Unknown), ethertype Slow Protocols (0x8809), length 124: LACPv1, length 110
13:31:57.068536 40:a6:b7:03:dd:25 (oui Unknown) > 01:80:c2:00:00:02 (oui Unknown), ethertype Slow Protocols (0x8809), length 124: LACPv1, length 110
13:32:28.268538 40:a6:b7:03:dd:25 (oui Unknown) > 01:80:c2:00:00:02 (oui Unknown), ethertype Slow Protocols (0x8809), length 124: LACPv1, length 110
13:32:59.468532 40:a6:b7:03:dd:25 (oui Unknown) > 01:80:c2:00:00:02 (oui Unknown), ethertype Slow Protocols (0x8809), length 124: LACPv1, length 110
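As a side check, the spacing of these LACPDUs can be computed from the capture timestamps. A minimal Python sketch, with the four timestamps copied from the tcpdump output above:

```python
from datetime import datetime

# Timestamps of the four LACPv1 frames captured above.
stamps = ["13:31:25.860533", "13:31:57.068536", "13:32:28.268538", "13:32:59.468532"]

times = [datetime.strptime(s, "%H:%M:%S.%f") for s in stamps]
# Seconds elapsed between each frame and the next one.
intervals = [(later - earlier).total_seconds() for earlier, later in zip(times, times[1:])]
print([round(i, 3) for i in intervals])
```

The frames are about 31 seconds apart, which is in line with the "LACP rate: slow" setting shown in /proc/net/bonding/bond0, so the capture suggests LACPDUs are being sent on this link at the expected slow rate.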
--- some experiments ---
I tried disabling LLDP (I set LLDP Transmit and Receive to Disable for the relevant physical interfaces).
According to the maintenance file, 40:a6:b7:03:dd:25 is no longer in the LLDP table, but it is still present in the mac-addr table (assigned to lag 3). I guess it is learned from the LACP frames.
That didn't fix my issue either.
- LaurentMa (NETGEAR Expert)
Thank you for the tech-support file. Let me recap what I can see in the file:
- The two relevant LAGs in the M4300-24X24F switch are now LAG1 (1/0/1, 1/0/2) and LAG3 (1/0/5, 1/0/6).
- Your Sender Intel X710-DA2 has enp97s0f0 (40:a6:b7:0b:8f:18) and enp97s0f1 (40:a6:b7:0b:8f:19) under bond0 (40:a6:b7:0b:8f:18).
--> Sender enp97s0f0 (40:a6:b7:0b:8f:18) is connected to 1/0/2 and Sender enp97s0f1 (40:a6:b7:0b:8f:19) is connected to 1/0/1 both under LAG1.
- The Receiver Intel X710-DA2 has enp24s0f0 (40:a6:b7:03:dd:24) and enp24s0f1 (40:a6:b7:03:dd:25) under bond0 (40:a6:b7:03:dd:24).
--> Receiver enp24s0f0 (40:a6:b7:03:dd:24) is connected to 1/0/6 and Receiver enp24s0f1 (40:a6:b7:03:dd:25) is connected to 1/0/5 both under LAG3.
- According to your tests, neither LAG hash mode 6 nor mode 4 changes the behavior on LAG3: enp24s0f0 (40:a6:b7:03:dd:24) receives almost no traffic from 1/0/6, and nearly all the traffic is received by enp24s0f1 (40:a6:b7:03:dd:25) from 1/0/5 on the receive side of LAG3.
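The port-to-interface mapping in this recap can be reproduced by joining the LLDP table from the tech-support file with the "Permanent HW addr" values from /proc/net/bonding/bond0. A minimal Python sketch, with values copied from the outputs posted earlier in the thread (the dictionary names are my own):

```python
# Port ID reported by LLDP on each switch port ("show lldp remote-device all").
lldp_port_id = {
    "1/0/1": "40:A6:B7:0B:8F:19",
    "1/0/2": "40:A6:B7:0B:8F:18",
    "1/0/5": "40:A6:B7:03:DD:25",
    "1/0/6": "40:A6:B7:03:DD:24",
}

# "Permanent HW addr" of each bond slave (/proc/net/bonding/bond0 on both hosts).
permanent_hw = {
    "enp97s0f0": "40:a6:b7:0b:8f:18",
    "enp97s0f1": "40:a6:b7:0b:8f:19",
    "enp24s0f0": "40:a6:b7:03:dd:24",
    "enp24s0f1": "40:a6:b7:03:dd:25",
}

# Join the two tables on the MAC address, ignoring letter case.
mac_to_if = {mac.lower(): ifname for ifname, mac in permanent_hw.items()}
port_to_if = {port: mac_to_if[mac.lower()] for port, mac in lldp_port_id.items()}

for port, ifname in sorted(port_to_if.items()):
    print(f"{port} <-> {ifname}")
```

The permanent addresses are used for the join because, once bonded, both slaves of a host present the same bond0 MAC address in `ip link`, so that address cannot tell the two cables apart.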
ASKRZYNIARZ, I didn't find any obvious issue in the Tech-Support file, so I am escalating it internally to the rest of the team, and we will come back to you here shortly.
Regards,