diff mbox series

iptables-nft: Wrong payload merge of rule filter - "! --sport xx ! --dport xx"

Message ID CAPtndGDEJVWXcggRkw66YLjhu3QyUjJ5j4YEbvJLj-qbPkQaPg@mail.gmail.com
State Changes Requested
Headers show
Series iptables-nft: Wrong payload merge of rule filter - "! --sport xx ! --dport xx" | expand

Commit Message

Sriram Rajagopalan March 8, 2024, 9:19 a.m. UTC
Hi,

iptables-nft based on nftables has an issue with the way the rule
filter - "! --sport xx ! --dport xx" is wrongly merged and rendered.

This experiment below demonstrates the issue  -

% ip -d addr show vm1; ip -d addr show vm2

203: vm1@vm2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue
state UP group default qlen 1000

    link/ether 2a:a2:43:13:94:d7 brd ff:ff:ff:ff:ff:ff promiscuity 0
minmtu 68 maxmtu 65535

    veth numtxqueues 1 numrxqueues 1 gso_max_size 65536 gso_max_segs 65535

    inet6 fe80::28a2:43ff:fe13:94d7/64 scope link

       valid_lft forever preferred_lft forever

202: vm2@vm1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue
state UP group default qlen 1000

    link/ether 9e:00:fa:a3:c9:48 brd ff:ff:ff:ff:ff:ff promiscuity 1
minmtu 68 maxmtu 65535

    veth numtxqueues 1 numrxqueues 1 gso_max_size 65536 gso_max_segs 65535

    inet 2.2.2.2/32 scope global vm2

       valid_lft forever preferred_lft forever

    inet6 fe80::9c00:faff:fea3:c948/64 scope link

       valid_lft forever preferred_lft forever

% export IPTABLES=/usr/sbin/iptables-nft; sudo $IPTABLES -A INPUT -p
tcp ! --sport 22 ! --dport 22 -i vm2; echo -e "\n---- Before data
----\n"; sudo $IPTABLES -L INPUT -vvvn; sudo python -c "from scapy.all
import *; sendp(Ether(dst='9e:00:fa:a3:c9:48')/IP(src='1.1.1.1',
dst='2.2.2.2')/TCP(sport=23, dport=22), iface='vm1')"; echo -e "\n----
After data with either one of tcp sport/dport being 22 ----\n"; sudo
$IPTABLES -L INPUT -vn; sudo python -c "from scapy.all import *;
sendp(Ether(dst='9e:00:fa:a3:c9:48')/IP(src='1.1.1.1',
dst='2.2.2.2')/TCP(sport=23, dport=23), iface='vm1')"; echo -e "\n----
After data with neither one of tcp sport/dport being 22 ----\n"; sudo
$IPTABLES -L INPUT -vn; sudo $IPTABLES -D INPUT -p tcp ! --sport 22 !
--dport 22 -i vm2

---- Before data ----

ip filter INPUT 39
  [ meta load iifname => reg 1 ]
  [ cmp eq reg 1 0x00326d76 ]
  [ meta load l4proto => reg 1 ]
  [ cmp eq reg 1 0x00000006 ]
  [ payload load 4b @ transport header + 0 => reg 1 ]
  [ cmp neq reg 1 0x16001600 ]
  [ counter pkts 0 bytes 0 ]


Chain INPUT (policy ACCEPT 0 packets, 0 bytes)

 pkts bytes target     prot opt in     out     source               destination

    0     0            tcp  --  vm2    *       0.0.0.0/0
0.0.0.0/0            tcp spt:!22 dpt:!22


---- After data with either one of tcp sport/dport being 22 ----

# Warning: iptables-legacy tables present, use iptables-legacy to see them

Chain INPUT (policy ACCEPT 0 packets, 0 bytes)

 pkts bytes target     prot opt in     out     source               destination

    1    46            tcp  --  vm2    *       0.0.0.0/0
0.0.0.0/0            tcp spt:!22 dpt:!22


---- After data with neither one of tcp sport/dport being 22 ----

# Warning: iptables-legacy tables present, use iptables-legacy to see them

Chain INPUT (policy ACCEPT 0 packets, 0 bytes)

 pkts bytes target     prot opt in     out     source               destination

    2    92            tcp  --  vm2    *       0.0.0.0/0
0.0.0.0/0            tcp spt:!22 dpt:!22


% export IPTABLES=/usr/local/sbin/iptables-legacy; sudo $IPTABLES -A
INPUT -p tcp ! --sport 22 ! --dport 22 -i vm2; echo -e "\n---- Before
data ----\n"; sudo $IPTABLES -L INPUT -vvvn; sudo python -c "from
scapy.all import *;
sendp(Ether(dst='9e:00:fa:a3:c9:48')/IP(src='1.1.1.1',
dst='2.2.2.2')/TCP(sport=23, dport=22), iface='vm1')"; echo -e "\n----
After data with either one of tcp sport/dport being 22 ----\n"; sudo
$IPTABLES -L INPUT -vn; sudo python -c "from scapy.all import *;
sendp(Ether(dst='9e:00:fa:a3:c9:48')/IP(src='1.1.1.1',
dst='2.2.2.2')/TCP(sport=23, dport=23), iface='vm1')"; echo -e "\n----
After data with neither one of tcp sport/dport being 22 ----\n"; sudo
$IPTABLES -L INPUT -vn; sudo $IPTABLES -D INPUT -p tcp ! --sport 22 !
--dport 22 -i vm2


---- Before data ----

ip filter INPUT 41
  [ meta load iifname => reg 1 ]
  [ cmp eq reg 1 0x00326d76 ]
  [ payload load 1b @ network header + 9 => reg 1 ]
  [ cmp eq reg 1 0x00000006 ]
  [ payload load 2b @ transport header + 0 => reg 1 ]
  [ cmp neq reg 1 0x00001600 ]
  [ payload load 2b @ transport header + 2 => reg 1 ]
  [ cmp neq reg 1 0x00001600 ]
  [ counter pkts 0 bytes 0 ]


Chain INPUT (policy ACCEPT 13824 packets, 994K bytes)

 pkts bytes target     prot opt in     out     source               destination

    0     0            tcp  --  vm2    *       0.0.0.0/0
0.0.0.0/0            tcp spt:!22 dpt:!22


---- After data with either one of tcp sport/dport being 22 ----

Chain INPUT (policy ACCEPT 13827 packets, 994K bytes)

 pkts bytes target     prot opt in     out     source               destination

    0     0            tcp  --  vm2    *       0.0.0.0/0
0.0.0.0/0            tcp spt:!22 dpt:!22


---- After data with neither one of tcp sport/dport being 22 ----

Chain INPUT (policy ACCEPT 13831 packets, 994K bytes)

 pkts bytes target     prot opt in     out     source               destination

    1    46            tcp  --  vm2    *       0.0.0.0/0
0.0.0.0/0            tcp spt:!22 dpt:!22

With iptables-nft the logical AND of sport neq and dport neq is
resulting the payload getting merged incorrectly -


  [ payload load 4b @ transport header + 0 => reg 1 ]
  [ cmp neq reg 1 0x16001600 ]


v/s

With iptables-legacy the logical AND of sport neq and dport neq is
resulting in using separate payload matches correctly -

  [ payload load 2b @ transport header + 0 => reg 1 ]
  [ cmp neq reg 1 0x00001600 ]
  [ payload load 2b @ transport header + 2 => reg 1 ]
  [ cmp neq reg 1 0x00001600 ]

The iptables-nft has the issue where either one of tcp sport/dport
being 22 still matches the iptables rule - "tcp ! --sport 22 ! --dport
22" while it should not have matched, whereas with the
iptables-legacy, the packet with either one of tcp sport/dport being
22 does not match the iptables rule - "tcp ! --sport 22 ! --dport 22".


The below patch to iptables-nft fixes this issue -

Author: Sriram Rajagopalan <bglsriram@gmail.com>
Date:   Fri Mar 07 20:09:38 2024 -0800

iptables: Fixed the issue with combining the payload in case of invert
filter for tcp src and dst ports

Signed-off-by: Sriram Rajagopalan <bglsriram@gmail.com>
Signed-off-by: Sriram Rajagopalan <sriramr@arista.com>

                        j = i;

Please review the patches above and provide your feedback. Please let
me know of any questions/clarifications.

Thanks,
Sriram

Comments

Phil Sutter March 8, 2024, 11:30 a.m. UTC | #1
Hi Sriram,

On Fri, Mar 08, 2024 at 02:49:38PM +0530, Sriram Rajagopalan wrote:
> iptables-nft based on nftables has an issue with the way the rule
> filter - "! --sport xx ! --dport xx" is wrongly merged and rendered.

I agree with your analysis and the patches look fine. Could you please
submit them formally?

[...]
> % export IPTABLES=/usr/local/sbin/iptables-legacy; sudo $IPTABLES -A
> INPUT -p tcp ! --sport 22 ! --dport 22 -i vm2; echo -e "\n---- Before
> data ----\n"; sudo $IPTABLES -L INPUT -vvvn; sudo python -c "from
> scapy.all import *;
> sendp(Ether(dst='9e:00:fa:a3:c9:48')/IP(src='1.1.1.1',
> dst='2.2.2.2')/TCP(sport=23, dport=22), iface='vm1')"; echo -e "\n----
> After data with either one of tcp sport/dport being 22 ----\n"; sudo
> $IPTABLES -L INPUT -vn; sudo python -c "from scapy.all import *;
> sendp(Ether(dst='9e:00:fa:a3:c9:48')/IP(src='1.1.1.1',
> dst='2.2.2.2')/TCP(sport=23, dport=23), iface='vm1')"; echo -e "\n----
> After data with neither one of tcp sport/dport being 22 ----\n"; sudo
> $IPTABLES -L INPUT -vn; sudo $IPTABLES -D INPUT -p tcp ! --sport 22 !
> --dport 22 -i vm2
> 
> 
> ---- Before data ----
> 
> ip filter INPUT 41
>   [ meta load iifname => reg 1 ]
>   [ cmp eq reg 1 0x00326d76 ]
>   [ payload load 1b @ network header + 9 => reg 1 ]
>   [ cmp eq reg 1 0x00000006 ]
>   [ payload load 2b @ transport header + 0 => reg 1 ]
>   [ cmp neq reg 1 0x00001600 ]
>   [ payload load 2b @ transport header + 2 => reg 1 ]
>   [ cmp neq reg 1 0x00001600 ]
>   [ counter pkts 0 bytes 0 ]

You're fibbing here: That netlink debug output can't come from
iptables-legacy. I suspect it actually comes from your patched
iptables-nft or nft too. :)

[...]
> Author: Sriram Rajagopalan <bglsriram@gmail.com>
> Date:   Fri Mar 07 20:09:38 2024 -0800
> 
> iptables: Fixed the issue with combining the payload in case of invert
> filter for tcp src and dst ports
> 
> Signed-off-by: Sriram Rajagopalan <bglsriram@gmail.com>
> Signed-off-by: Sriram Rajagopalan <sriramr@arista.com>

Maybe avoid the double SoB? Apart from that:

Acked-by: Phil Sutter <phil@nwl.cc>

Thanks, Phil
Florian Westphal March 8, 2024, 1:37 p.m. UTC | #2
Sriram Rajagopalan <bglsriram@gmail.com> wrote:
> diff --git a/src/rule.c b/src/rule.c
> index 342c43fb..5def185d 100644
> --- a/src/rule.c
> +++ b/src/rule.c
> @@ -2661,8 +2661,13 @@ static void payload_do_merge(struct stmt *sa[],
> unsigned int n)
>         for (j = 0, i = 1; i < n; i++) {
>                 stmt = sa[i];
>                 this = stmt->expr;
> -
> -               if (!payload_can_merge(last->left, this->left) ||
> +               /* We should not be merging two OP_NEQs. For example -
> +                * tcp sport != 22 tcp dport != 23 should not result in
> +                * [ payload load 4b @ transport header + 0 => reg 1 ]
> +                * [ cmp neq reg 1 0x17001600 ]
> +                */
> +               if (this->op == OP_NEQ ||
> +                   !payload_can_merge(last->left, this->left) ||
>                     !relational_ops_match(last, this)) {
>                         last = this;
>                         j = i;
> 
> Please review the patches above and provide your feedback. Please let
> me know of any questions/clarifications.


Probably better to do:

diff --git a/src/rule.c b/src/rule.c
--- a/src/rule.c
+++ b/src/rule.c
@@ -2766,7 +2766,6 @@ static void stmt_reduce(const struct rule *rule)
                        switch (stmt->expr->op) {
                        case OP_EQ:
                        case OP_IMPLICIT:
-                       case OP_NEQ:
                                break;
                        default:
                                continue;
Sriram Rajagopalan March 12, 2024, 10:24 a.m. UTC | #3
On Fri, Mar 8, 2024 at 7:07 PM Florian Westphal <fw@strlen.de> wrote:
>
> Sriram Rajagopalan <bglsriram@gmail.com> wrote:
> > diff --git a/src/rule.c b/src/rule.c
> > index 342c43fb..5def185d 100644
> > --- a/src/rule.c
> > +++ b/src/rule.c
> > @@ -2661,8 +2661,13 @@ static void payload_do_merge(struct stmt *sa[],
> > unsigned int n)
> >         for (j = 0, i = 1; i < n; i++) {
> >                 stmt = sa[i];
> >                 this = stmt->expr;
> > -
> > -               if (!payload_can_merge(last->left, this->left) ||
> > +               /* We should not be merging two OP_NEQs. For example -
> > +                * tcp sport != 22 tcp dport != 23 should not result in
> > +                * [ payload load 4b @ transport header + 0 => reg 1 ]
> > +                * [ cmp neq reg 1 0x17001600 ]
> > +                */
> > +               if (this->op == OP_NEQ ||
> > +                   !payload_can_merge(last->left, this->left) ||
> >                     !relational_ops_match(last, this)) {
> >                         last = this;
> >                         j = i;
> >
> > Please review the patches above and provide your feedback. Please let
> > me know of any questions/clarifications.
>
>
> Probably better to do:
>
> diff --git a/src/rule.c b/src/rule.c
> --- a/src/rule.c
> +++ b/src/rule.c
> @@ -2766,7 +2766,6 @@ static void stmt_reduce(const struct rule *rule)
>                         switch (stmt->expr->op) {
>                         case OP_EQ:
>                         case OP_IMPLICIT:
> -                       case OP_NEQ:
>                                 break;
>                         default:
>                                 continue;
>

Sure, it makes sense to prevent this at the caller of
payload_do_merge(), i.e within stmt_reduce() itself.

Thanks,
Sriram
Florian Westphal March 12, 2024, 10:34 a.m. UTC | #4
Sriram Rajagopalan <bglsriram@gmail.com> wrote:
> Sure, it makes sense to prevent this at the caller of
> payload_do_merge(), i.e within stmt_reduce() itself.

Will you submit a patch?

Thanks,
Florian
Sriram Rajagopalan March 13, 2024, 9:01 a.m. UTC | #5
On Fri, Mar 8, 2024 at 5:00 PM Phil Sutter <phil@nwl.cc> wrote:
>
> Hi Sriram,
>
> On Fri, Mar 08, 2024 at 02:49:38PM +0530, Sriram Rajagopalan wrote:
> > iptables-nft based on nftables has an issue with the way the rule
> > filter - "! --sport xx ! --dport xx" is wrongly merged and rendered.
>
> I agree with your analysis and the patches look fine. Could you please
> submit them formally?
Sure, thanks!
>
> [...]
> > % export IPTABLES=/usr/local/sbin/iptables-legacy; sudo $IPTABLES -A
> > INPUT -p tcp ! --sport 22 ! --dport 22 -i vm2; echo -e "\n---- Before
> > data ----\n"; sudo $IPTABLES -L INPUT -vvvn; sudo python -c "from
> > scapy.all import *;
> > sendp(Ether(dst='9e:00:fa:a3:c9:48')/IP(src='1.1.1.1',
> > dst='2.2.2.2')/TCP(sport=23, dport=22), iface='vm1')"; echo -e "\n----
> > After data with either one of tcp sport/dport being 22 ----\n"; sudo
> > $IPTABLES -L INPUT -vn; sudo python -c "from scapy.all import *;
> > sendp(Ether(dst='9e:00:fa:a3:c9:48')/IP(src='1.1.1.1',
> > dst='2.2.2.2')/TCP(sport=23, dport=23), iface='vm1')"; echo -e "\n----
> > After data with neither one of tcp sport/dport being 22 ----\n"; sudo
> > $IPTABLES -L INPUT -vn; sudo $IPTABLES -D INPUT -p tcp ! --sport 22 !
> > --dport 22 -i vm2
> >
> >
> > ---- Before data ----
> >
> > ip filter INPUT 41
> >   [ meta load iifname => reg 1 ]
> >   [ cmp eq reg 1 0x00326d76 ]
> >   [ payload load 1b @ network header + 9 => reg 1 ]
> >   [ cmp eq reg 1 0x00000006 ]
> >   [ payload load 2b @ transport header + 0 => reg 1 ]
> >   [ cmp neq reg 1 0x00001600 ]
> >   [ payload load 2b @ transport header + 2 => reg 1 ]
> >   [ cmp neq reg 1 0x00001600 ]
> >   [ counter pkts 0 bytes 0 ]
>
> You're fibbing here: That netlink debug output can't come from
> iptables-legacy. I suspect it actually comes from your patched
> iptables-nft or nft too. :)
Oh.. yeah, probably yes. iptables-legacy would not use the VM instructions.
Sorry, for mixing up things.

>
> [...]
> > Author: Sriram Rajagopalan <bglsriram@gmail.com>
> > Date:   Fri Mar 07 20:09:38 2024 -0800
> >
> > iptables: Fixed the issue with combining the payload in case of invert
> > filter for tcp src and dst ports
> >
> > Signed-off-by: Sriram Rajagopalan <bglsriram@gmail.com>
> > Signed-off-by: Sriram Rajagopalan <sriramr@arista.com>
>
> Maybe avoid the double SoB? Apart from that:
>
> Acked-by: Phil Sutter <phil@nwl.cc>
>
> Thanks, Phil
Sriram Rajagopalan March 13, 2024, 9:02 a.m. UTC | #6
On Tue, Mar 12, 2024 at 4:04 PM Florian Westphal <fw@strlen.de> wrote:
>
> Sriram Rajagopalan <bglsriram@gmail.com> wrote:
> > Sure, it makes sense to prevent this at the caller of
> > payload_do_merge(), i.e within stmt_reduce() itself.
>
> Will you submit a patch?
Sure, submitted the patch separately.
>
> Thanks,
> Florian
diff mbox series

Patch

diff --git a/iptables/nft.c b/iptables/nft.c

index dae6698d..38227d51 100644

--- a/iptables/nft.c
+++ b/iptables/nft.c
@@ -1307,14 +1307,12 @@  static int add_nft_tcpudp(struct nft_handle
*h,struct nftnl_rule *r,
        uint8_t reg;
        int ret;

-       if (src[0] && src[0] == src[1] &&
+       if (!invert_src &&
+           src[0] && src[0] == src[1] &&
            dst[0] && dst[0] == dst[1] &&
            invert_src == invert_dst) {
                uint32_t combined = dst[0] | (src[0] << 16);

-               if (invert_src)
-                       op = NFT_CMP_NEQ;
-
                expr = gen_payload(h, NFT_PAYLOAD_TRANSPORT_HEADER, 0, 4, &reg);
                if (!expr)
                        return -ENOMEM;

This issue is also present with the nft
tool(https://git.netfilter.org/nftables/) as well and the below patch
fixes the issue with the nft tool -

Author: Sriram Rajagopalan <bglsriram@gmail.com>
Date:   Fri Mar 08 10:06:30 2024 -0800

nftables: Fixed the issue with merging the payload in case of invert
filter for tcp src and dst ports

Signed-off-by: Sriram Rajagopalan <bglsriram@gmail.com>
Signed-off-by: Sriram Rajagopalan <sriramr@arista.com>

diff --git a/src/rule.c b/src/rule.c
index 342c43fb..5def185d 100644
--- a/src/rule.c
+++ b/src/rule.c
@@ -2661,8 +2661,13 @@  static void payload_do_merge(struct stmt *sa[],
unsigned int n)
        for (j = 0, i = 1; i < n; i++) {
                stmt = sa[i];
                this = stmt->expr;
-
-               if (!payload_can_merge(last->left, this->left) ||
+               /* We should not be merging two OP_NEQs. For example -
+                * tcp sport != 22 tcp dport != 23 should not result in
+                * [ payload load 4b @ transport header + 0 => reg 1 ]
+                * [ cmp neq reg 1 0x17001600 ]
+                */
+               if (this->op == OP_NEQ ||
+                   !payload_can_merge(last->left, this->left) ||
                    !relational_ops_match(last, this)) {
                        last = this;