diff mbox

[net-next-2.6] net/ipv4: push IP options to CB in ip_fragment

Message ID 20100901165743.GB17843@stratus.com
State Superseded, archived
Delegated to: David Miller
Headers show

Commit Message

Bandan Das Sept. 1, 2010, 4:57 p.m. UTC
On  0, Herbert Xu <herbert@gondor.hengli.com.au> wrote:
> On Tue, Aug 31, 2010 at 11:17:51AM +0200, Eric Dumazet wrote:
> >
> > Once again, the IP stack -> bridge -> IP stack flow bites us,
> > because bridge likes to dirty IPCB.
> 
> OK, so we're talking about a locally transmitted packet, with
> IP options leaving the IP stack, entering bridging, and then
> reentering the IP stack?
> 
> In that case the packet should no longer be treated as an IP
> packet when it enters the bridge.  So if it did have options
> and we want to support that in bridging then we need to parse
> IP options there as my comment suggested.

Ok. So, I am not sure if re-exporting ip_compile_options is a 
good idea nor am I sure if replicating its behavior in a different 
function is.  It was removed from the list of exported symbols way
back in 2005. 

commit 0742fd53a3774781255bd1e471e7aa2e4a82d5f7
Author: Adrian Bunk <bunk@stusta.de>
Date:   Tue Aug 9 19:35:47 2005 -0700

    [IPV4]: possible cleanups
    
    This patch contains the following possible cleanups:
    - make needlessly global code static
    - #if 0 the following unused global function:
      - xfrm4_state.c: xfrm4_state_fini
    - remove the following unneeded EXPORT_SYMBOL's:
      - ip_output.c: ip_finish_output
      - ip_output.c: sysctl_ip_default_ttl
      - fib_frontend.c: ip_dev_find
      - inetpeer.c: inet_peer_idlock
      - ip_options.c: ip_options_compile
      - ip_options.c: ip_options_undo
      - net/core/request_sock.c: sysctl_max_syn_backlog

But, nevertheless, I moved the call to ip_options_compile to 
br_nf_dev_queue_xmit(). Does something like this look ok ?
(Previously sent patch : http://www.spinics.net/lists/kernel/msg1077537.html)

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Herbert Xu Sept. 3, 2010, 4:49 a.m. UTC | #1
On Wed, Sep 01, 2010 at 12:57:43PM -0400, Bandan Das wrote:
> 
> diff --git a/net/bridge/br_netfilter.c b/net/bridge/br_netfilter.c
> index 2c911c0..de44271 100644
> --- a/net/bridge/br_netfilter.c
> +++ b/net/bridge/br_netfilter.c
> @@ -759,9 +759,21 @@ static unsigned int br_nf_forward_arp(unsigned int hook, struct sk_buff *skb,
>  #if defined(CONFIG_NF_CONNTRACK_IPV4) || defined(CONFIG_NF_CONNTRACK_IPV4_MODULE)
>  static int br_nf_dev_queue_xmit(struct sk_buff *skb)
>  {
> +       struct ip_options *opt;
> +       struct iphdr *iph;
> +       struct net_device *dev = skb->dev;
> +
>         if (skb->nfct != NULL && skb->protocol == htons(ETH_P_IP) &&
>             skb->len + nf_bridge_mtu_reduction(skb) > skb->dev->mtu &&
> -           !skb_is_gso(skb))
> +           !skb_is_gso(skb)) {
> +               iph = ip_hdr(skb);
> +               opt = &(IPCB(skb)->opt);
> +               opt->optlen = iph->ihl*4 - sizeof(struct iphdr);
> +               if (ip_options_compile(dev_net(dev), opt, skb)){
> +                       IP_INC_STATS(dev_net(dev), IPSTATS_MIB_INHDRERRORS);
> +                       memset(IPCB(skb), 0, sizeof(struct inet_skb_parm));
> +               }
> +       }

1. Only parse options if ihl > 5.
2. Please audit the IP stack to ensure that this does not mangle
the packet.  We should not write to the packet here.
3. Please check whether SRR is handled correctly (see ip_rcv_options).

This should go into a helper function as this isn't the only entry
point from the bridge into the IP stack.

Also it may be worth considering whether we should replace
ip_fragment here with something that only refragments a frag_list
since the only time we want to fragment here is if we reassembled
an IP datagram due to netfilter.

Thanks,
diff mbox

Patch

diff --git a/net/bridge/br_netfilter.c b/net/bridge/br_netfilter.c
index 2c911c0..de44271 100644
--- a/net/bridge/br_netfilter.c
+++ b/net/bridge/br_netfilter.c
@@ -759,9 +759,21 @@  static unsigned int br_nf_forward_arp(unsigned int hook, struct sk_buff *skb,
 #if defined(CONFIG_NF_CONNTRACK_IPV4) || defined(CONFIG_NF_CONNTRACK_IPV4_MODULE)
 static int br_nf_dev_queue_xmit(struct sk_buff *skb)
 {
+       struct ip_options *opt;
+       struct iphdr *iph;
+       struct net_device *dev = skb->dev;
+
        if (skb->nfct != NULL && skb->protocol == htons(ETH_P_IP) &&
            skb->len + nf_bridge_mtu_reduction(skb) > skb->dev->mtu &&
-           !skb_is_gso(skb))
+           !skb_is_gso(skb)) {
+               iph = ip_hdr(skb);
+               opt = &(IPCB(skb)->opt);
+               opt->optlen = iph->ihl*4 - sizeof(struct iphdr);
+               if (ip_options_compile(dev_net(dev), opt, skb)){
+                       IP_INC_STATS(dev_net(dev), IPSTATS_MIB_INHDRERRORS);
+                       memset(IPCB(skb), 0, sizeof(struct inet_skb_parm));
+               }
+       }
                return ip_fragment(skb, br_dev_queue_push_xmit);
        else
                return br_dev_queue_push_xmit(skb);
diff --git a/net/ipv4/ip_options.c b/net/ipv4/ip_options.c
index ba9836c..72fe82c 100644
--- a/net/ipv4/ip_options.c
+++ b/net/ipv4/ip_options.c
@@ -466,7 +466,7 @@  error:
        }
        return -EINVAL;
 }
-
+EXPORT_SYMBOL(ip_options_compile);
 
 /*
  *     Undo all the changes done by ip_options_compile().