JonAB
Aspirant
Dec 11, 2025

Issues with LACP with VLANs

Hi all,

Network newbie here. I am having some trouble with VLANs combined with LACP. The setup I have at the moment is purely for educating myself. It consists of:

One Netgear GS724Tv4

One pfSense router virtualized in Proxmox. The router's LAN interface is a separate NIC whose ports are all in the same bond, which is configured for LACP. This bond has a dedicated bridge, which is assigned as the LAN interface for the router. This NIC is connected to ports 1-4 on the switch, which are configured as "lag1".

One Synology RackStation with an SSD. The LAN interface is configured to use all four ports in LACP. It is connected to ports 5-8 on the switch, which are configured as "lag2".

I am using two computers to test the bandwidth of the system, simply by copying a large file from each computer to the NAS. They are connected either to ports 19 and 20 or to ports 21 and 22, depending on which VLAN I want to use.

When connecting the computers to the same VLAN as the NAS everything seems to work just fine:

Ports 19 and 20 are receiving ~2050000 packets each. Lag2, the LACP group of the NAS, is transmitting ~4100000 packets. These packets are distributed between ports 6 and 8. This means that the LACP to the NAS is working, right?

But when moving the computers to another VLAN the trouble begins:


Ports 21 and 22 are receiving ~2050000 packets each. These are transmitted to the router on lag1, and the switch seems to have decided to use ports 2 and 3 for the task.

The router is routing the packets from the VLAN with the computers to the VLAN with the NAS. The switch is receiving the packets on lag1 again, but this time on ports 1 and 3.

So far so good? It seems like the LACP between the switch and the router is working well too?

The switch is transmitting the packets on lag2 to the NAS, but why does it only use port 8 now!?

The consequence is obvious when looking at the speeds of the file transfer...

What am I missing? 


5 Replies

  • StephenB
    Guru - Experienced User
    JonAB wrote:

    What am I missing? 

    LACP is designed so each data flow only runs over one port of the LAG.  The traffic for that flow is not split over multiple ports.  This approach prevents buffer overruns on the final destination link (which is presumed not to be a LAG).  It also ensures that the packets arrive in the order they were sent.

     

    You have two data flows, so there is a 50-50 chance they will be transmitted on the same port (on a 2-port LAG).  Odds are better with a 4-port LAG (1 in 4). Still, there is roughly a 50-50 chance this would happen on at least one of your two LAGs.

     

    In general, LACP does a reasonable job of load balancing when there is traffic from a lot of devices running over the LAG.  But as you are finding, the load balancing doesn't perform well if you only have a couple of data flows.
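A rough check of those odds, as a sketch only. It assumes the hash spreads flows uniformly and independently across the member ports, which is an idealization; with only a handful of MAC addresses a real hash may behave quite differently. The function names are made up for illustration.

```python
from fractions import Fraction


def same_port_probability(ports: int) -> Fraction:
    """Chance that two independent flows land on the same member port."""
    return Fraction(1, ports)


def collision_on_any_lag(port_counts: list[int]) -> Fraction:
    """Chance that the two flows share a port on at least one LAG in the path."""
    no_collision = Fraction(1)
    for ports in port_counts:
        no_collision *= 1 - same_port_probability(ports)
    return 1 - no_collision


two_port = same_port_probability(2)         # single 2-port LAG
four_port = same_port_probability(4)        # single 4-port LAG
either_lag = collision_on_any_lag([4, 4])   # lag1 and lag2, four ports each

print(two_port, four_port, either_lag)      # 1/2 1/4 7/16
```

Under that assumption, two flows crossing two 4-port LAGs share a port somewhere about 44% of the time (7/16), which is close to the 50-50 figure.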

  • Hi Stephen and thank you for your answer!

    I get your point regarding the distribution of the traffic. This is why I tested with two data flows. Of course, I can set up two more computers to test with four data flows... But the thing that is bothering me is this:

    When the sources and destination are on the same VLAN, i.e. the flow only goes through the switch once, the LACP performs flawlessly. I have tested many times, and a different port is always chosen for each flow.

    But when the sources and destination are on different VLANs, i.e. the flow has to go through the router, it is always the same story. The data flow on the lag to the router seems to work just fine. The data transmitted from the switch to the router is distributed over two ports, and the data the switch receives from the router also always arrives on two ports (in this specific test case, that is...)

    So far, everything seems to behave very well!

    The data flows, sent by two computers, are received by the switch. They are re-transmitted on a lag to the router, on two different ports. The router transmits the same flows, but on the right VLAN, and the switch receives them on two different ports.

    But now the switch has to transmit the flows to the NAS, and it always chooses to transmit both flows on the same port, even though I have confirmed that the switch and NAS know how to make proper use of two ports. That of course compromises the bandwidth...

    I get the feeling that there is something that I am missing in the setup, but I cannot find it...

    OK, I get that there is a 50/50 chance that one of my lags would choose to make it this way. But now, lag1 (router - switch) is choosing a good way 100 % of the time, and lag2 (switch - NAS) is choosing to transmit everything on the same port 100 % of the time...

    • StephenB
      Guru - Experienced User
      JonAB wrote:

      OK, I get that there is a 50/50 chance that one of my lags would choose to make it this way. But now, lag1 (router - switch) is choosing a good way 100 % of the time, and lag2 (switch - NAS) is choosing to transmit everything on the same port 100 % of the time...

      The decision on which port to use is based on a hash of the source and destination mac addresses in the ethernet packet.  So it will be deterministic for a specific LAG.  In your particular case, the router is likely rewriting the source mac address.

       

      Note that the transmit decision is made only by the sending device, so flows in the opposite direction can use different ports.
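To illustrate why a MAC-based hash is deterministic, here is a minimal sketch. The switch's real hash is not documented in this thread, so the XOR-then-modulo policy below is an assumption, and `pick_port` and all MAC addresses are invented for the example. What it relies on is that a router forwarding between VLANs sends each frame with its own MAC as the source and the next hop's MAC (here the NAS) as the destination.

```python
def mac_to_int(mac: str) -> int:
    """Convert 'aa:bb:cc:dd:ee:ff' to an integer."""
    return int(mac.replace(":", ""), 16)


def pick_port(src_mac: str, dst_mac: str, ports: list[int]) -> int:
    """Hypothetical policy: XOR both MACs, then take the result modulo the port count."""
    index = (mac_to_int(src_mac) ^ mac_to_int(dst_mac)) % len(ports)
    return ports[index]


LAG2 = [5, 6, 7, 8]              # switch ports facing the NAS
NAS = "02:00:00:00:00:10"        # made-up addresses
PC_A = "02:00:00:00:00:a1"
PC_B = "02:00:00:00:00:a2"
ROUTER = "02:00:00:00:00:01"

# Same VLAN: frames reach the switch with each PC's own source MAC.
same_vlan = {pick_port(PC_A, NAS, LAG2), pick_port(PC_B, NAS, LAG2)}

# Routed: both flows leave the router with the router's MAC as source.
routed = {pick_port(ROUTER, NAS, LAG2), pick_port(ROUTER, NAS, LAG2)}

print(same_vlan, routed)
```

With these example addresses the two same-VLAN flows land on different ports, while the two routed flows present the same MAC pair to the hash and can only ever land on one port.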

       

      While some implementations give you a couple of alternative ways to make the decision, the GS724 does not.    I guess you could try changing to a static lag, and see if the traffic flow changes.  But likely the switch will use the same policy. Another thing you could try is removing one port from the router LAG.  Then (if you are lucky) these flows will use different ports. But different flows (from different devices) will still end up with sub-optimal load-balancing, so these hacks are only useful if you are trying to optimize these specific flows.

       

      Multigig NICs and switches are the cleanest path, but of course are expensive.

       

      Is there a reason you want local data flows to be routed?  Also, what is your ISP internet speed?

       

  • I will start answering your last questions. There is no real reason. At the moment, I am using the setup purely for educational purposes. I want to try different setups to test my understanding, and when I do run into troubles like this, take the chance to learn something! My ISP is nothing special, 300/300...

    "The decision on which port to use is based on a hash of the source and destination mac addresses in the ethernet packet.  So it will be deterministic for a specific LAG.  In your particular case, the router is likely rewriting the source mac address."

    I think this is it! When the router is transmitting the flows, the MACs and IPs are changed. Instead of having unique MACs and IPs, both flows now have the same MAC and IP. This makes the switch decide to transmit them on the same port...

    All right. What should I do instead? Let's say that I want a segmented network for increased security. Let's say that I have different VLANs for different clients. Let's say that I have a NAS serving multiple purposes (of course with multiple pools for the corresponding client groups). Of course, I want to make use of the NAS's four gigabit connections for maximum bandwidth with multiple clients.

    • StephenB
      Guru - Experienced User
      JonAB wrote:

      I think this is it! When the router is transmitting the flows, the MACs and IPs are changed. Instead of having unique MACs and IPs, both flows now have the same MAC and IP. This makes the switch decide to transmit them on the same port...

      I agree that is likely it.  But I do want to point out that the hash uses both the source and destination MAC addresses in the packet.  Both flows would have the same source MAC leaving the router, but they would still have different destination MACs.

       

      JonAB wrote:

      All right. What should I do instead?

      At the end of the day,  LAGs on the switch won't load-balance perfectly, no matter what you do in the config.

       

      If you set up a static LAG on the switch <-> Synology link, you could probably also set up round-robin on the Synology.  That would fully load-balance the outbound path from the Synology (but not the inbound path).  But you could end up with buffer overruns in your clients (or in other LAGs), which will result in packet loss.
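A small sketch of what round-robin means for the outbound path, using a toy model only. The flow names, packet counts, and the `round_robin` helper are made up, and a real bonding driver is more involved than this.

```python
from collections import Counter
from itertools import cycle


def round_robin(packets, ports):
    """Send each packet out the next port in turn, ignoring which flow it belongs to."""
    port_cycle = cycle(ports)
    return [(flow, seq, next(port_cycle)) for flow, seq in packets]


# Two flows of four packets each, interleaved as they might leave the NAS.
packets = [(flow, seq) for seq in range(4) for flow in ("A", "B")]
assigned = round_robin(packets, ports=[1, 2, 3, 4])   # the four NAS interfaces

per_port = Counter(port for _, _, port in assigned)
ports_used_by_a = {port for flow, _, port in assigned if flow == "A"}

print(per_port)          # every interface carries the same number of packets
print(ports_used_by_a)   # flow A is spread over more than one interface
```

The load is even, but a single flow now travels over several links at once. That is why ordering is no longer guaranteed, and why one flow can arrive faster than a single gigabit client link can absorb.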

       

       
