Message ID | 20130729145017.GD2490@order.stressinduktion.org |
---|---|
State | RFC, archived |
Delegated to: | David Miller |
Headers | show |
On Mon, Jul 29, 2013 at 04:50:17PM +0200, Hannes Frederic Sowa wrote: > xfrm6_local_error/xfrm4_tunnel_check_size report mtu errors back to a > socket in case it is locally generated. If the packet first traversed > a 6in4/4in6 tunnel before passing the xfrm layer, we could get a panic > because of address family type mismatch in the error reporting functions. > So the skb is still owned by a socket of the inner address family. Is this intentional? Maybe the ndo_start_xmit() function of the tunnel device should orphan the skb if we tunnel the packet through a different address family. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tue, Jul 30, 2013 at 10:21:18AM +0200, Steffen Klassert wrote: > On Mon, Jul 29, 2013 at 04:50:17PM +0200, Hannes Frederic Sowa wrote: > > xfrm6_local_error/xfrm4_tunnel_check_size report mtu errors back to a > > socket in case it is locally generated. If the packet first traversed > > a 6in4/4in6 tunnel before passing the xfrm layer, we could get a panic > > because of address family type mismatch in the error reporting functions. > > > > So the skb is still owned by a socket of the inner address family. > Is this intentional? Maybe the ndo_start_xmit() function of the > tunnel device should orphan the skb if we tunnel the packet > through a different address family. I thought about this, too. But then we would stop accounting the data to the socket while it is travelling the stack. I don't know about the possible problems resulting from this. Greetings, Hannes -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tue, Jul 30, 2013 at 10:30:17AM +0200, Hannes Frederic Sowa wrote: > On Tue, Jul 30, 2013 at 10:21:18AM +0200, Steffen Klassert wrote: > > On Mon, Jul 29, 2013 at 04:50:17PM +0200, Hannes Frederic Sowa wrote: > > > xfrm6_local_error/xfrm4_tunnel_check_size report mtu errors back to a > > > socket in case it is locally generated. If the packet first traversed > > > a 6in4/4in6 tunnel before passing the xfrm layer, we could get a panic > > > because of address family type mismatch in the error reporting functions. > > > > > > > So the skb is still owned by a socket of the inner address family. > > Is this intentional? Maybe the ndo_start_xmit() function of the > > tunnel device should orphan the skb if we tunnel the packet > > through a different address family. > > I thought about this, too. But then we would stop accounting the data > to the socket while it is travelling the stack. I don't know about the > possible problems resulting from this. > I'm also not absolutely sure, but we reinsert the packet to the ipv4/ipv6 output path which is also used to output forwarded packets. So the code should be prepared for handling a skb without socket context. There are already situations where we orphan the skb in some tunnel xmit functions. For example if we tunnel through another namespace. Lets see if anyone else has an opinion to that. I'll go and look what we can do if we have to fix it in the IPsec code in the meantime. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tue, Jul 30, 2013 at 12:26:11PM +0200, Steffen Klassert wrote: > On Tue, Jul 30, 2013 at 10:30:17AM +0200, Hannes Frederic Sowa wrote: > > On Tue, Jul 30, 2013 at 10:21:18AM +0200, Steffen Klassert wrote: > > > On Mon, Jul 29, 2013 at 04:50:17PM +0200, Hannes Frederic Sowa wrote: > > > > xfrm6_local_error/xfrm4_tunnel_check_size report mtu errors back to a > > > > socket in case it is locally generated. If the packet first traversed > > > > a 6in4/4in6 tunnel before passing the xfrm layer, we could get a panic > > > > because of address family type mismatch in the error reporting functions. > > > > > > > > > > So the skb is still owned by a socket of the inner address family. > > > Is this intentional? Maybe the ndo_start_xmit() function of the > > > tunnel device should orphan the skb if we tunnel the packet > > > through a different address family. > > > > I thought about this, too. But then we would stop accounting the data > > to the socket while it is travelling the stack. I don't know about the > > possible problems resulting from this. > > > > I'm also not absolutely sure, but we reinsert the packet to > the ipv4/ipv6 output path which is also used to output forwarded > packets. So the code should be prepared for handling a skb without > socket context. > > There are already situations where we orphan the skb in some > tunnel xmit functions. For example if we tunnel through > another namespace. Somehow this seems the way to go. Even if we get a matching address family socket we would still call ipv6_local_error with the wrong fl6.daddr for that socket. I do think IPv4 has the same issue, but I have not checked. Because skbs could circulate multiple times through devices, a check as (skb->dev->type | tunnel_types) seems to be not enough, too. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
From: Hannes Frederic Sowa <hannes@stressinduktion.org> Date: Tue, 30 Jul 2013 12:40:38 +0200 > On Tue, Jul 30, 2013 at 12:26:11PM +0200, Steffen Klassert wrote: >> On Tue, Jul 30, 2013 at 10:30:17AM +0200, Hannes Frederic Sowa wrote: >> > On Tue, Jul 30, 2013 at 10:21:18AM +0200, Steffen Klassert wrote: >> > > On Mon, Jul 29, 2013 at 04:50:17PM +0200, Hannes Frederic Sowa wrote: >> > > > xfrm6_local_error/xfrm4_tunnel_check_size report mtu errors back to a >> > > > socket in case it is locally generated. If the packet first traversed >> > > > a 6in4/4in6 tunnel before passing the xfrm layer, we could get a panic >> > > > because of address family type mismatch in the error reporting functions. >> > > > >> > > >> > > So the skb is still owned by a socket of the inner address family. >> > > Is this intentional? Maybe the ndo_start_xmit() function of the >> > > tunnel device should orphan the skb if we tunnel the packet >> > > through a different address family. >> > >> > I thought about this, too. But then we would stop accounting the data >> > to the socket while it is travelling the stack. I don't know about the >> > possible problems resulting from this. >> > >> >> I'm also not absolutely sure, but we reinsert the packet to >> the ipv4/ipv6 output path which is also used to output forwarded >> packets. So the code should be prepared for handling a skb without >> socket context. >> >> There are already situations where we orphan the skb in some >> tunnel xmit functions. For example if we tunnel through >> another namespace. > > Somehow this seems the way to go. Agreed, I think we should just orphan the SKB. There was talk about this when the skb_scrub_packet() interface was added, maybe we should do it unconditionally after all. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tue, Jul 30, 2013 at 12:26:11PM +0200, Steffen Klassert wrote: > Lets see if anyone else has an opinion to that. I'll go > and look what we can do if we have to fix it in the IPsec > code in the meantime. If you have not yet done so, I would try to find a solution over the weekend. Thanks, Hannes -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu, Aug 01, 2013 at 10:11:50AM +0200, Hannes Frederic Sowa wrote: > > If you have not yet done so, I would try to find a solution over the weekend. > I have not yet done so, please go ahead. Thanks a lot! -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/net/ipv4/xfrm4_output.c b/net/ipv4/xfrm4_output.c index 327a617..61f6cae 100644 --- a/net/ipv4/xfrm4_output.c +++ b/net/ipv4/xfrm4_output.c @@ -32,10 +32,10 @@ static int xfrm4_tunnel_check_size(struct sk_buff *skb) dst = skb_dst(skb); mtu = dst_mtu(dst); if (skb->len > mtu) { - if (skb->sk) + if (skb->sk && skb->sk->sk_family == AF_INET) ip_local_error(skb->sk, EMSGSIZE, ip_hdr(skb)->daddr, inet_sk(skb->sk)->inet_dport, mtu); - else + else if (!skb->sk) icmp_send(skb, ICMP_DEST_UNREACH, ICMP_FRAG_NEEDED, htonl(mtu)); ret = -EMSGSIZE; diff --git a/net/ipv6/xfrm6_output.c b/net/ipv6/xfrm6_output.c index 8755a30..29f6db7 100644 --- a/net/ipv6/xfrm6_output.c +++ b/net/ipv6/xfrm6_output.c @@ -59,6 +59,9 @@ static void xfrm6_local_error(struct sk_buff *skb, u32 mtu) struct flowi6 fl6; struct sock *sk = skb->sk; + if (sk->sk_family != AF_INET6) + return; + fl6.fl6_dport = inet_sk(sk)->inet_dport; fl6.daddr = ipv6_hdr(skb)->daddr;
xfrm6_local_error/xfrm4_tunnel_check_size report mtu errors back to a socket in case it is locally generated. If the packet first traversed a 6in4/4in6 tunnel before passing the xfrm layer, we could get a panic because of address family type mismatch in the error reporting functions. This bug fixes only the panic part of the excellent bug report <https://bugzilla.kernel.org/show_bug.cgi?id=58691> by vi0oss. It is unclear to me how to let these mtu change events travel up the stack. A match on (skb->sk && (skb->dev | tunnel_types)) could be made first and we forcefully decrease the mtu of that interface with a warning? This seems to be fragile to me. Reported-by: <vi0oss@gmail.com> Cc: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> --- net/ipv4/xfrm4_output.c | 4 ++-- net/ipv6/xfrm6_output.c | 3 +++ 2 files changed, 5 insertions(+), 2 deletions(-)