diff mbox

[RFC] netfilter: xt_TEE: IPv4 Don't Fragmet options

Message ID rvd97lb.cb16bfe930b2100854041c0a3d714500@obelix.schillstrom.com
State Changes Requested
Headers show

Commit Message

Hans Schillstrom June 14, 2012, 6:17 a.m. UTC
Hello,

I  think it is wrong to always force the DF bit in IPv4, it's better to have an option
If an application don't set the DF bit, usually it doesn't expect to get an icmp back either.
The result is that the packet will be dropped...

To retain backwards compatibility I suggest adding a new option like

--ipv4-df-copy  Do not force "Don't Fragment" on the copied packet just copy the bit.

In IPv6 we don't have that option, so nothing has to be done there.




--
Regards 
Hans Schillstrom
+46 70 699 7150


--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Pablo Neira Ayuso June 14, 2012, 5:52 p.m. UTC | #1
On Thu, Jun 14, 2012 at 08:17:35AM +0200, Hans Schillstrom wrote:
> Hello,
> 
> I  think it is wrong to always force the DF bit in IPv4, it's better
> to have an option If an application don't set the DF bit, usually it
> doesn't expect to get an icmp back either.  The result is that the
> packet will be dropped...
> 
> To retain backwards compatibility I suggest adding a new option like
> 
> --ipv4-df-copy  Do not force "Don't Fragment" on the copied packet
> just copy the bit.
> 
> In IPv6 we don't have that option, so nothing has to be done there.
> 
> 
> diff --git a/include/linux/netfilter/xt_TEE.h b/include/linux/netfilter/xt_TEE.h
> index 5c21d5c..e5fca8a 100644
> --- a/include/linux/netfilter/xt_TEE.h
> +++ b/include/linux/netfilter/xt_TEE.h
> @@ -4,6 +4,7 @@
>  struct xt_tee_tginfo {
>         union nf_inet_addr gw;
>         char oif[16];
> +       int df_copy;

This breaks backward compatibility. If you some new field, you usually
have to add a new target revision.

Moreover, something like "flags" would be better, in case we need to add
anything else in the future without modifying the binary layout of the
target info.

>         /* used internally by the kernel */
>         struct xt_tee_priv *priv __attribute__((aligned(8)));
> diff --git a/net/netfilter/xt_TEE.c b/net/netfilter/xt_TEE.c
> index ee2e5bc..e9a1ca7 100644
> --- a/net/netfilter/xt_TEE.c
> +++ b/net/netfilter/xt_TEE.c
> @@ -117,7 +117,8 @@ tee_tg4(struct sk_buff *skb, const struct xt_action_param *par)
>          * decreased MTU on the clone route. IPv6 does this too.
>          */
>         iph = ip_hdr(skb);
> -       iph->frag_off |= htons(IP_DF);
> +       if (!info->df_copy)
> +               iph->frag_off |= htons(IP_DF);
>         if (par->hooknum == NF_INET_PRE_ROUTING ||
>             par->hooknum == NF_INET_LOCAL_IN)
>                 --iph->ttl;
> 
> 
> --
> Regards 
> Hans Schillstrom
> +46 70 699 7150
> 
> 
> --
> To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Pablo Neira Ayuso June 14, 2012, 6:59 p.m. UTC | #2
On Thu, Jun 14, 2012 at 07:52:23PM +0200, Pablo Neira Ayuso wrote:
> On Thu, Jun 14, 2012 at 08:17:35AM +0200, Hans Schillstrom wrote:
> > Hello,
> > 
> > I  think it is wrong to always force the DF bit in IPv4, it's better
> > to have an option If an application don't set the DF bit, usually it
> > doesn't expect to get an icmp back either.  The result is that the
> > packet will be dropped...

I don't understand what effect you're observing to propose this
change. Could you clarify this?
--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Florian Westphal June 14, 2012, 8:07 p.m. UTC | #3
Pablo Neira Ayuso <pablo@netfilter.org> wrote:
> On Thu, Jun 14, 2012 at 08:17:35AM +0200, Hans Schillstrom wrote:
> > I  think it is wrong to always force the DF bit in IPv4, it's better
> > to have an option If an application don't set the DF bit, usually it
> > doesn't expect to get an icmp back either.  The result is that the
> > packet will be dropped...
> > 
> > To retain backwards compatibility I suggest adding a new option like
> > 
> > --ipv4-df-copy  Do not force "Don't Fragment" on the copied packet
> > just copy the bit.
> > 
> > In IPv6 we don't have that option, so nothing has to be done there.
> > --- a/net/netfilter/xt_TEE.c
> > +++ b/net/netfilter/xt_TEE.c
> > @@ -117,7 +117,8 @@ tee_tg4(struct sk_buff *skb, const struct xt_action_param *par)
> >          * decreased MTU on the clone route. IPv6 does this too.
> >          */
> >         iph = ip_hdr(skb);
> > -       iph->frag_off |= htons(IP_DF);
> > +       if (!info->df_copy)
> > +               iph->frag_off |= htons(IP_DF);

Wouldn't it make more sense to just remove the
iph->frag_off |= htons(IP_DF);
line?  I don't think forcing DF is a good idea.

Or are you dealing with some application that sets DF, but
then fails to handle the icmp error?
--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Jan Engelhardt June 14, 2012, 11:55 p.m. UTC | #4
On Thursday 2012-06-14 08:17, Hans Schillstrom wrote:

>Hello,
>
>I  think it is wrong to always force the DF bit in IPv4, it's better to have an option

Do you experience an actual problem?

>If an application don't set the DF bit, usually it doesn't expect to 
>get an icmp back either.

Applications often don't have the means to set DF, think SOCK_STREAM.

>The result is that the packet will be dropped...

And exactly because of that, an ICMP message should be generated, to 
notify the sender about a reduced MTU, so that the TEE destination does 
in fact get the messages.

>
>To retain backwards compatibility I suggest adding a new option like
>
>--ipv4-df-copy  Do not force "Don't Fragment" on the copied packet just copy the bit.
>
>In IPv6 we don't have that option, so nothing has to be done there.
>
>
>diff --git a/include/linux/netfilter/xt_TEE.h b/include/linux/netfilter/xt_TEE.h
>index 5c21d5c..e5fca8a 100644
>--- a/include/linux/netfilter/xt_TEE.h
>+++ b/include/linux/netfilter/xt_TEE.h
>@@ -4,6 +4,7 @@
> struct xt_tee_tginfo {
>        union nf_inet_addr gw;
>        char oif[16];
>+       int df_copy;
>
>        /* used internally by the kernel */
>        struct xt_tee_priv *priv __attribute__((aligned(8)));

As Pablo mentioned, you cannot touch this structure.

"int" is also a bad idea. - See my very own "Writing Netfilter Modules" 
pdf for details.
--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/include/linux/netfilter/xt_TEE.h b/include/linux/netfilter/xt_TEE.h
index 5c21d5c..e5fca8a 100644
--- a/include/linux/netfilter/xt_TEE.h
+++ b/include/linux/netfilter/xt_TEE.h
@@ -4,6 +4,7 @@ 
 struct xt_tee_tginfo {
        union nf_inet_addr gw;
        char oif[16];
+       int df_copy;

        /* used internally by the kernel */
        struct xt_tee_priv *priv __attribute__((aligned(8)));
diff --git a/net/netfilter/xt_TEE.c b/net/netfilter/xt_TEE.c
index ee2e5bc..e9a1ca7 100644
--- a/net/netfilter/xt_TEE.c
+++ b/net/netfilter/xt_TEE.c
@@ -117,7 +117,8 @@  tee_tg4(struct sk_buff *skb, const struct xt_action_param *par)
         * decreased MTU on the clone route. IPv6 does this too.
         */
        iph = ip_hdr(skb);
-       iph->frag_off |= htons(IP_DF);
+       if (!info->df_copy)
+               iph->frag_off |= htons(IP_DF);
        if (par->hooknum == NF_INET_PRE_ROUTING ||
            par->hooknum == NF_INET_LOCAL_IN)
                --iph->ttl;