Apprentice

NAT loopback debugging help wanted

As you all know, the RBR 75x and RBR 85x series do not properly support NAT loopback.  The question is: how did they manage to screw this up?  Here's what I've found so far.

 

- they have set up the corresponding SNAT and DNAT rules for the actual translation so NAT loopback *ought to* work, based on the netfilter documentation

 

- the problem lies in the iptables rules they've set up, and likely in the INPUT chain.  Specifically, if you log onto your router and type "ip link set br-lan promisc on" the problem goes away.  Incidentally, when working with Tech Support, they ask you to go to the debug.htm page and provide a WAN/LAN packet trace. For the trace, they turn on promiscuous mode on the br-lan bridge interface to capture all traffic. At this point, NAT loopback starts working. So good luck to anyone trying to reproduce this effect for tech support.

 

- the mysterious kernel module acos_nat.ko is not involved. If you rmmod it, NAT loopback still works in promiscuous mode and ceases to work in non-promiscuous mode

 

- the effect of it working over wireless and not working over wired was spurious. Wired and wireless interfaces are bundled in the br-lan bridge interface. NAT loopback works on neither.
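For anyone who wants to repeat the promiscuous-mode experiment, here is a minimal shell sketch. The helper name is_promisc is made up for illustration, and it assumes the router's shell provides ip, grep and the usual POSIX sh features, which I have not confirmed for every firmware build.

```shell
#!/bin/sh
# Sketch only: is_promisc is a hypothetical helper, not part of any firmware.
# It reads `ip link show <iface>` output on stdin and reports whether the
# PROMISC flag appears in the interface's <...> flag list.
is_promisc() {
    if grep -q '<[^>]*PROMISC[^>]*>'; then
        echo "promisc on"
    else
        echo "promisc off"
    fi
}

# Typical use on the router (changing the flag needs root):
#   ip link show br-lan | is_promisc
#   ip link set br-lan promisc on     # NAT loopback reportedly starts working
#   ip link set br-lan promisc off    # ...and reportedly stops again
```

Checking the flag before and after each loopback attempt makes it easier to tell whether a "working" result was really obtained in non-promiscuous mode.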

 

I'm a bit at a loss as to exactly where they screwed up. Their iptables setup is moderately complicated, so I'm not surprised they have trouble debugging it themselves and instead advertise that only some of their devices support NAT loopback. But they did figure it out for some of their devices, interestingly (unless they run those devices with br-lan in promiscuous mode, haha).

 

Is anybody with deeper iptables experience interested in tracking down what the heck is going on with these devices and why NAT loopback works only in promiscuous mode?

 

 

 

Message 1 of 7
Apprentice

Re: NAT loopback debugging help wanted

So I found out that NAT loopback works if the server being accessed is connected directly to one of the Ethernet ports on the router (rather than through an intermediate switch).  I also found out that the failure is likely due to the

bridge-nf-call-iptables

setting, based on information in a forum of a different consumer device.

 

That is, if you issue:

echo 0 > /proc/sys/net/bridge/bridge-nf-call-iptables

it will work (just as if you turn on promisc mode on br-lan).

So this is interesting.  The RBR 750 uses Linux's virtual bridging (a software implementation of Layer 2 switching) to connect the 3 physical ethernet ports (eth1, eth2, eth3) and the wireless interfaces into a single bridge interface. The problem is not the type or kind of switch, it's the fact that those packets cannot be sent directly to their layer-2 receiver.  I don't fully understand it yet.
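A small sketch for reading and flipping the setting while keeping the old value around, so it can be put back afterwards. The helper names and the NF_CALL_FILE variable are invented for this example; the default path is the one quoted above, and the override exists only so the helpers can be tried on a scratch file first.

```shell
#!/bin/sh
# Sketch only: NF_CALL_FILE, nf_call_get and nf_call_set are illustrative names.
NF_CALL_FILE="${NF_CALL_FILE:-/proc/sys/net/bridge/bridge-nf-call-iptables}"

# Print the current value:
#   1 = bridged IPv4 frames are passed through the iptables chains
#   0 = bridged frames bypass iptables
nf_call_get() {
    if [ -r "$NF_CALL_FILE" ]; then
        cat "$NF_CALL_FILE"
    else
        echo "unavailable"
    fi
}

# Set a new value (0 or 1) and print the previous one so it can be restored.
nf_call_set() {
    case "$1" in
        0|1) ;;
        *) echo "usage: nf_call_set 0|1" >&2; return 2 ;;
    esac
    [ -w "$NF_CALL_FILE" ] || return 1
    old=$(nf_call_get)
    echo "$1" > "$NF_CALL_FILE" || return 1
    echo "$old"
}

# Typical use on the router (needs root):
#   old=$(nf_call_set 0)    # test NAT loopback now
#   nf_call_set "$old"      # put the original value back
```

As far as I can tell the value does not survive a reboot unless something in the firmware sets it again, so treat this as a test aid rather than a permanent fix.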

 

 

Message 2 of 7
Apprentice

Re: NAT loopback debugging help wanted

How do you log in to the router? I have not been able to enable telnet on it. Any pointers for that? I can then probably start looking at the iptables rules.

 

Thanks

 

Message 3 of 7
Aspirant

Re: NAT loopback debugging help wanted

Normally a Linux bridge does not forward packets back out of the port they arrived on, to avoid loops. If you have the server and client on the same port, you can try turning hairpin mode on for that port.

 

brctl hairpin br0 ethX on
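
To confirm whether hairpin mode actually took effect on each port, something like the following sketch may help. It assumes the kernel exposes the usual bridge sysfs layout (/sys/class/net/<bridge>/brif/<port>/hairpin_mode), which may not hold on every vendor kernel; hairpin_state and SYSFS_NET are names made up for this example.

```shell
#!/bin/sh
# Sketch only: hairpin_state and SYSFS_NET are illustrative names.
# SYSFS_NET can be overridden to point at a test directory.
SYSFS_NET="${SYSFS_NET:-/sys/class/net}"

# Print the hairpin setting (0 = off, 1 = on) of every port attached
# to the given bridge, one port per line.
hairpin_state() {
    bridge="$1"
    for port_dir in "$SYSFS_NET/$bridge"/brif/*; do
        # Skip the unexpanded glob and ports without a hairpin_mode file.
        [ -r "$port_dir/hairpin_mode" ] || continue
        printf '%s hairpin=%s\n' "${port_dir##*/}" "$(cat "$port_dir/hairpin_mode")"
    done
}

# Typical use (the bridge in this thread is br-lan rather than br0):
#   hairpin_state br-lan
```
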
Message 4 of 7
Apprentice

Re: NAT loopback debugging help wanted


@minesweeper wrote:

Normally a Linux bridge does not forward packets back out of the port they arrived on, to avoid loops. If you have the server and client on the same port, you can try turning hairpin mode on for that port.

 

brctl hairpin br0 ethX on

I've already tried that and it didn't work.  But I tell you what: Netgear has actually sent me a trial firmware they claim fixes the problem. As soon as I can take the router down, I'll try it out.

Message 5 of 7
Apprentice

Re: NAT loopback debugging help wanted

Well, running

echo 0 > /proc/sys/net/bridge/bridge-nf-call-iptables

makes it work. And reading about bridge-nf-call-iptables, it looks like there is discussion in the community about changing the default to 0 (maybe in a different context) here: https://wiki.libvirt.org/page/Net.bridge.bridge-nf-call_and_sysctl.conf

So for now I have changed the value to 0, until the hot-fix or new firmware fixes that. I just need to make sure NAT/routing functionality is not impacted (i.e. the router continues to send packets from outside through iptables and does not act like a plain bridge); I will check it from an external host. I don't believe it will be, because the bridge works at layer 2, and routing, being layer 3, should not be affected.
Message 6 of 7
Apprentice

Re: NAT loopback debugging help wanted


@gb777 wrote:

So I found out that NAT loopback works if the server being accessed is connected directly to one of the Ethernet ports on the router (rather than through an intermediate switch).  I also found out that the failure is likely due to the

bridge-nf-call-iptables

setting, based on information in a forum of a different consumer device.

 

That is, if you issue:

echo 0 > /proc/sys/net/bridge/bridge-nf-call-iptables

it will work (just as if you turn on promisc mode on br-lan).

So this is interesting.  The RBR 750 uses Linux's virtual bridging (a software implementation of Layer 2 switching) to connect the 3 physical ethernet ports (eth1, eth2, eth3) and the wireless interfaces into a single bridge interface. The problem is not the type or kind of switch, it's the fact that those packets cannot be sent directly to their layer-2 receiver.  I don't fully understand it yet.

 

 


So I've been working with Netgear Tech Support on this, too.  They were friendly enough to let me describe the problem.  I mentioned to them that setting bridge-nf-call-iptables to 0 fixed the issue.

 

Now, they've sent me a custom firmware to try.  Of course, only under an NDA.  Not the kind where you can't talk about it, just the kind where you can't share the trial software they shared. I installed the firmware and guess what, it now works....

So I took a closer look at how they've changed the configuration in the new firmware.

 

I've compared the iptables before and after, and they're identical. I do not know if they've made other changes in the trial firmware, except that ....

 

... the trial firmware has bridge-nf-call-iptables set to 0.
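
For anyone repeating the before/after comparison, here is a sketch of one way to diff two iptables-save dumps without being misled by timestamps and packet counters. It assumes iptables-save, sed, diff and mktemp are available on the box (or that the dumps are copied to a PC first); normalize_rules and rules_differ are names invented for this example.

```shell
#!/bin/sh
# Sketch only: normalize_rules and rules_differ are illustrative names.

# Strip the parts of an iptables-save dump that change between runs even
# when the rules do not: "# Generated/Completed" comment lines and the
# [packets:bytes] counters, plus any trailing whitespace left behind.
normalize_rules() {
    sed -e '/^#/d' -e 's/\[[0-9]*:[0-9]*\]//g' -e 's/[[:space:]]*$//' "$1"
}

# Diff two dumps after normalization.
# Returns 0 if the rules are identical, 1 if they differ, 2 on setup failure.
rules_differ() {
    before_tmp=$(mktemp) || return 2
    after_tmp=$(mktemp) || { rm -f "$before_tmp"; return 2; }
    normalize_rules "$1" > "$before_tmp"
    normalize_rules "$2" > "$after_tmp"
    diff "$before_tmp" "$after_tmp"
    status=$?
    rm -f "$before_tmp" "$after_tmp"
    return "$status"
}

# Typical use:
#   iptables-save > /tmp/rules-old.txt      # on the old firmware
#   iptables-save > /tmp/rules-new.txt      # on the trial firmware
#   rules_differ /tmp/rules-old.txt /tmp/rules-new.txt && echo "rules identical"
```

An empty diff only says the rule sets match; it says nothing about sysctl values such as bridge-nf-call-iptables, which have to be compared separately.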

Message 7 of 7