Message ID | 55BFDFF3.2030309@cumulusnetworks.com |
---|---|
State | RFC, archived |
Delegated to: | David Miller |
On 03/08/15 22:41, roopa wrote:
> On 8/3/15, 9:39 AM, Robert Shearman wrote:
>> Locally-generated IPv4 packets, such as from applications running on
>> the host or traceroute/ping currently don't have lwtunnel output
>> redirected encap applied. However, they should do in the same way as
>> for forwarded packets and this patch series addresses that.
>>
>> Robert Shearman (2):
>>    lwtunnel: set skb protocol and dev
>>    ipv4: apply lwtunnel encap for locally-generated packets
>>
>>   net/core/lwtunnel.c | 12 ++++++++++--
>>   net/ipv4/route.c    |  2 ++
>>   2 files changed, 12 insertions(+), 2 deletions(-)
>>
> Thanks for this patch Robert. Looks good.
> I have been thinking of sending a similar patch out for this and
> since i was also looking at ip fragmentation, I have a slightly
> different patch which I think should also take care of
> encapsulating locally generated packets too. This patch moves the output
> redirection to after ip fragmentation.
> What do you think about the below (I have briefly tested it. Was
> planning to test some more before sending it out as RFC) ?

I'm glad you're looking at fragmentation - this does need to be
implemented at some point.

While it looks like fragmentation should work, the issue is that now
post-routing netfilter modules will be presented with un-encapsulated
packets without distinguishing them from encapsulated packets. An
example of why this is a problem is that this would prevent operators
from implementing rules to prevent non-control IP packets being output
onto an interface in an MPLS core, and I have seen service providers
doing this sort of thing in the past. So I think this is a pretty big
deal for MPLS.

There are possibly other less obvious use cases that would be prevented
by this change. So as long as you can keep these working, I'd be fine
with such an approach.

Thanks,
Rob
diff --git a/include/net/lwtunnel.h b/include/net/lwtunnel.h
index 918e03c..7816805 100644
--- a/include/net/lwtunnel.h
+++ b/include/net/lwtunnel.h
@@ -18,6 +18,7 @@ struct lwtunnel_state {
 	__u16		flags;
 	atomic_t	refcnt;
 	int		len;
+	__u16		headroom;
 	__u8		data[0];
 };
 
diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c
index 6bf89a6..ae3119f 100644
--- a/net/ipv4/ip_output.c
+++ b/net/ipv4/ip_output.c
@@ -73,6 +73,7 @@
 #include <net/icmp.h>
 #include <net/checksum.h>
 #include <net/inetpeer.h>
+#include <net/lwtunnel.h>
 #include <linux/igmp.h>
 #include <linux/netfilter_ipv4.h>
 #include <linux/netfilter_bridge.h>
@@ -201,6 +202,9 @@ static int ip_finish_output2(struct sock *sk, struct sk_buff *skb)
 		skb = skb2;
 	}
 
+	if (lwtunnel_output_redirect(rt->rt_lwtstate))
+		return lwtunnel_output(sk, skb);
+
 	rcu_read_lock_bh();
 	nexthop = (__force u32) rt_nexthop(rt, ip_hdr(skb)->daddr);
 	neigh = __ipv4_neigh_lookup_noref(dev, nexthop);
diff --git a/net/ipv4/route.c b/net/ipv4/route.c
index d3964fa..4e07b9a 100644
--- a/net/ipv4/route.c
+++ b/net/ipv4/route.c
@@ -1234,6 +1234,9 @@ static unsigned int ipv4_mtu(const struct dst_entry *dst)
 
 	mtu = dst->dev->mtu;
 
+	if (lwtunnel_output_redirect(rt->rt_lwtstate))
+		mtu -= rt->rt_lwtstate->headroom;
+
 	if (unlikely(dst_metric_locked(dst, RTAX_MTU))) {
 		if (rt->rt_uses_gateway && mtu > 576)