[nf-next,0/5] Dynamic hook interface binding

Message ID 20240503195045.6934-1-phil@nwl.cc

Message

Phil Sutter May 3, 2024, 7:50 p.m. UTC
Currently, netdev-family chains and flowtables expect their interfaces
to exist at creation time. In practice, this bites users of virtual
interfaces if these happen to be created after the nftables service
starts up and loads the stored ruleset.
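
For illustration (the interface name is an arbitrary example), this is
roughly what happens today when the chain is created before its
interface:

  nft add table netdev filter
  # rejected with ENOENT on current kernels, "vfoo0" does not exist yet:
  nft add chain netdev filter ingress \
      '{ type filter hook ingress device "vfoo0" priority 0; }'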

Vice versa, if an interface disappears at run time (via module unloading
or 'ip link del'), it also disappears from the ruleset, taking the chain
which binds to it and all of that chain's rules with it. This is
problematic at least for setups which store the running ruleset during
system shutdown.
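
The other direction, again with an arbitrary example name:

  ip link add vfoo0 type dummy
  nft add table netdev filter
  nft add chain netdev filter ingress \
      '{ type filter hook ingress device "vfoo0" priority 0; }'
  nft add rule netdev filter ingress counter
  # deleting the device currently removes the chain and its rule as well:
  ip link del vfoo0
  nft list table netdev filter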

This series attempts to solve these problems by effectively making
netdev hooks name-based: If no matching interface is found at hook
creation time, it will be inactive until a matching interface appears.
If a bound interface is renamed, the inactive hooks are searched for one
matching the new name.

Ruleset dumps will stabilize in that regard. To still provide
information about which existing interfaces a chain/flowtable currently
binds to, new netlink attributes *_ACT_DEVS are introduced which are
filled from the active hooks only.

This series is also prep work for a simple wildcard interface binding
similar to the wildcard interface matching in the meta expression. It should
suffice to turn struct nft_hook::ops into an array of all matching
interfaces, but the respective code does not exist yet.

Phil Sutter (5):
  netfilter: nf_tables: Store user-defined hook ifname
  netfilter: nf_tables: Relax hook interface binding
  netfilter: nf_tables: Report active interfaces to user space
  netfilter: nf_tables: Dynamic hook interface binding
  netfilter: nf_tables: Correctly handle NETDEV_RENAME events

 include/net/netfilter/nf_tables.h        |   4 +-
 include/uapi/linux/netfilter/nf_tables.h |   6 +-
 net/netfilter/nf_tables_api.c            | 185 +++++++++++++++--------
 net/netfilter/nft_chain_filter.c         |  70 +++++----
 4 files changed, 172 insertions(+), 93 deletions(-)

Comments

Pablo Neira Ayuso May 10, 2024, 12:13 a.m. UTC | #1
Hi Phil,

On Fri, May 03, 2024 at 09:50:40PM +0200, Phil Sutter wrote:
> Currently, netdev-family chains and flowtables expect their interfaces
> to exist at creation time. In practice, this bites users of virtual
> interfaces if these happen to be created after the nftables service
> starts up and loads the stored ruleset.
> 
> Vice versa, if an interface disappears at run time (via module unloading
> or 'ip link del'), it also disappears from the ruleset, taking the chain
> which binds to it and all of that chain's rules with it. This is
> problematic at least for setups which store the running ruleset during
> system shutdown.
> 
> This series attempts to solve these problems by effectively making
> netdev hooks name-based: If no matching interface is found at hook
> creation time, it will be inactive until a matching interface appears.
> If a bound interface is renamed, the inactive hooks are searched for one
> matching the new name.
> 
> Ruleset dumps will stabilize in that regard. To still provide
> information about which existing interfaces a chain/flowtable currently
> binds to, new netlink attributes *_ACT_DEVS are introduced which are
> filled from the active hooks only.
> 
> This series is also prep work for a simple wildcard interface binding
> similar to the wildcard interface matching in the meta expression. It should
> suffice to turn struct nft_hook::ops into an array of all matching
> interfaces, but the respective code does not exist yet.

Before taking a closer look: Would it be possible to have a torture
test to exercise this path from userspace?

Thanks!
Phil Sutter May 15, 2024, 12:30 p.m. UTC | #2
Hi Pablo,

On Fri, May 10, 2024 at 02:13:11AM +0200, Pablo Neira Ayuso wrote:
> Before taking a closer look: Would it be possible to have a torture
> test to exercise this path from userspace?

Please kindly find my torture script attached. It does the following
(an abridged sketch follows the list):

1) Create three netns connected by VETH pairs:
   client [cr0]<->[rc0] router [rs0]<->[sr0] server

2) In router ns, add an nftables ruleset with:
   - A netdev chain for each interface rcN and rsN (N in 0..9)
   - A flowtable for each interface pair (rcN, rsN) (N in 0..9)
   - A base chain in forward hook with ten rules adding traffic to
     the respective flowtable.

3) Run iperf3 between client and server ns for a minute

4) While iperf runs, rename rcN -> rc((N+1)%10) (same for rsN) in a busy
   loop.
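
An abridged sketch of the script (the actual script is attached to this
mail; addressing, error handling and the exact ruleset are left out or
simplified here):

#!/bin/bash
# Torture test sketch: three netns, netdev hooks/flowtables for interface
# names which mostly do not exist (yet), iperf3 traffic plus a rename loop.

ip netns add client
ip netns add router
ip netns add server

# veth pairs: client [cr0]<->[rc0] router [rs0]<->[sr0] server
ip link add cr0 type veth peer name rc0
ip link add sr0 type veth peer name rs0
ip link set cr0 netns client
ip link set rc0 netns router
ip link set sr0 netns server
ip link set rs0 netns router
# ... bring links up, assign addresses, enable forwarding in "router" ...

# Ruleset in the router netns: a netdev chain per expected interface name
# and a flowtable per (rcN, rsN) pair. Only rc0/rs0 exist, so with this
# series applied the remaining hooks simply start out inactive.
{
    echo "table netdev t {"
    for n in $(seq 0 9); do
        echo " chain in_rc$n { type filter hook ingress device \"rc$n\" priority 0; }"
        echo " chain in_rs$n { type filter hook ingress device \"rs$n\" priority 0; }"
    done
    echo "}"
    echo "table inet ft {"
    for n in $(seq 0 9); do
        echo " flowtable f$n { hook ingress priority 0; devices = { rc$n, rs$n }; }"
    done
    echo " chain forward { type filter hook forward priority 0;"
    for n in $(seq 0 9); do
        echo "  ip protocol tcp flow add @f$n"
    done
    echo " }"
    echo "}"
} | ip netns exec router nft -f -

# traffic between client and server netns for a minute
ip netns exec server iperf3 -s -D
ip netns exec client iperf3 -c "$SERVER_ADDR" -t 60 &  # address of sr0, setup omitted
IPERF_PID=$!

# while iperf3 runs, rename rcN -> rc((N+1)%10) (and rsN likewise) in a
# busy loop so active and inactive hooks keep trading places
while kill -0 "$IPERF_PID" 2>/dev/null; do
    for n in $(seq 0 9); do
        ip -netns router link set "rc$n" name "rc$(( (n + 1) % 10 ))" 2>/dev/null
        ip -netns router link set "rs$n" name "rs$(( (n + 1) % 10 ))" 2>/dev/null
    done
done
wait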

In the meantime, I have extended the series by an extra patch adding
notifications for each hook update and had a (patched) 'nft monitor'
running in parallel.

WDYT, is something still missing I could add to the test? Also, I'm not
sure whether I should add it to netfilter selftests as it doesn't have a
defined failure outcome.

Cheers, Phil
Florian Westphal May 15, 2024, 1:24 p.m. UTC | #3
Phil Sutter <phil@nwl.cc> wrote:
> WDYT, is something still missing I could add to the test? Also, I'm not
> sure whether I should add it to netfilter selftests as it doesn't have a
> defined failure outcome.

Isn't the expected outcome "did not crash"?

You could just append a test for /proc/sys/kernel/tainted,
i.e. script ran and no splat was triggered.
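
E.g. (a minimal sketch; "run_torture_test" is a placeholder for the
script discussed above):

# fail if the kernel taint mask changed during the run, i.e. a
# WARN/BUG/splat fired somewhere
read -r taint_before < /proc/sys/kernel/tainted
run_torture_test
read -r taint_after < /proc/sys/kernel/tainted
if [ "$taint_after" != "$taint_before" ]; then
    echo "FAIL: kernel got tainted during the test run" >&2
    exit 1
fi
echo "PASS"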

As the selftests are run at regular intervals on the netdev
CI, the only critical factor is total test run time. So far
the netfilter tests are not too bad, and for the much slower debug
kernels the script can detect this and spend fewer cycles.
Phil Sutter May 15, 2024, 3:32 p.m. UTC | #4
On Wed, May 15, 2024 at 03:24:10PM +0200, Florian Westphal wrote:
> Phil Sutter <phil@nwl.cc> wrote:
> > WDYT, is something still missing I could add to the test? Also, I'm not
> > sure whether I should add it to netfilter selftests as it doesn't have a
> > defined failure outcome.
> 
> Isn't the expected outcome "did not crash"?
> 
> You could just append a test for /proc/sys/kernel/tainted,
> i.e. script ran and no splat was triggered.

Fair point, thanks!

> As the selftests are run at regular intervals on the netdev
> CI, the only critical factor is total test run time. So far
> the netfilter tests are not too bad, and for the much slower debug
> kernels the script can detect this and spend fewer cycles.

My script is bound by the configured iperf3 test time, so in theory it
should take the same time irrespective of system performance.

Cheers, Phil