Message ID | 1522075727-19860-1-git-send-email-alexey.kodanev@oracle.com |
---|---|
State | Changes Requested, archived |
Delegated to: | David Miller |
Headers | show |
Series | [net,v2] udp6: set dst cache for a connected sk before udp_v6_send_skb | expand |
On 03/26/2018 07:48 AM, Alexey Kodanev wrote: > After commit 33c162a980fe ("ipv6: datagram: Update dst cache of a > connected datagram sk during pmtu update"), when the error occurs on > sending datagram in udpv6_sendmsg() due to ICMPV6_PKT_TOOBIG type, > error handler can trigger the following path and call ip6_dst_store(): > > udpv6_err() > ip6_sk_update_pmtu() > ip6_datagram_dst_update() > ip6_dst_lookup_flow(), can create a RTF_CACHE clone > ... > ip6_dst_store() > > It can happen before a connected UDP socket invokes ip6_dst_store() > in the end of udpv6_sendmsg(), on destination release, as a result, > the last one changes dst to the old one, preventing getting updated > dst cache on the next udpv6_sendmsg() call. > > This patch moves ip6_dst_store() in udpv6_sendmsg(), so that it is > invoked after ip6_sk_dst_lookup_flow() and before udp_v6_send_skb(). > > Also, increase refcnt for dst, when passing it to ip6_dst_store() > because after that the dst cache can be released by other calls > to ip6_dst_store() with the same socket. > > Fixes: 33c162a980fe ("ipv6: datagram: Update dst cache of a connected datagram sk during pmtu update") > Signed-off-by: Alexey Kodanev <alexey.kodanev@oracle.com> > --- > Reviewed-by: Eric Dumazet <edumazet@google.com> Thanks Alexey.
On Mon, Mar 26, 2018 at 05:48:47PM +0300, Alexey Kodanev wrote: > After commit 33c162a980fe ("ipv6: datagram: Update dst cache of a > connected datagram sk during pmtu update"), when the error occurs on > sending datagram in udpv6_sendmsg() due to ICMPV6_PKT_TOOBIG type, > error handler can trigger the following path and call ip6_dst_store(): > > udpv6_err() > ip6_sk_update_pmtu() > ip6_datagram_dst_update() > ip6_dst_lookup_flow(), can create a RTF_CACHE clone Instead of ip6_dst_lookup_flow(), you meant the RTF_CACHE route created in ip6_update_pmtu() > ... > ip6_dst_store() > > It can happen before a connected UDP socket invokes ip6_dst_store() > in the end of udpv6_sendmsg(), on destination release, as a result, > the last one changes dst to the old one, preventing getting updated > dst cache on the next udpv6_sendmsg() call. > > This patch moves ip6_dst_store() in udpv6_sendmsg(), so that it is > invoked after ip6_sk_dst_lookup_flow() and before udp_v6_send_skb(). After this patch, the above udpv6_err() path could not happen after ip6_sk_dst_lookup_flow() and before the ip6_dst_store() in udpv6_sendmsg()? > > Also, increase refcnt for dst, when passing it to ip6_dst_store() > because after that the dst cache can be released by other calls > to ip6_dst_store() with the same socket. > > Fixes: 33c162a980fe ("ipv6: datagram: Update dst cache of a connected datagram sk during pmtu update") > Signed-off-by: Alexey Kodanev <alexey.kodanev@oracle.com> > --- > > v2: * remove 'release_dst:' label > > * move ip6_dst_store() below MSG_CONFIRM check as > suggested by Eric and add dst_clone() > > * add 'Fixes' commit. > > > net/ipv6/udp.c | 29 +++++++++++------------------ > 1 file changed, 11 insertions(+), 18 deletions(-) > > diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c > index 52e3ea0..4508e5a 100644 > --- a/net/ipv6/udp.c > +++ b/net/ipv6/udp.c > @@ -1303,6 +1303,16 @@ int udpv6_sendmsg(struct sock *sk, struct msghdr *msg, size_t len) > goto do_confirm; > back_from_confirm: > > + if (connected) > + ip6_dst_store(sk, dst_clone(dst), > + ipv6_addr_equal(&fl6.daddr, &sk->sk_v6_daddr) ? > + &sk->sk_v6_daddr : NULL, > +#ifdef CONFIG_IPV6_SUBTREES > + ipv6_addr_equal(&fl6.saddr, &np->saddr) ? > + &np->saddr : > +#endif > + NULL); > + > /* Lockless fast path for the non-corking case */ > if (!corkreq) { > struct sk_buff *skb; > @@ -1314,7 +1324,7 @@ int udpv6_sendmsg(struct sock *sk, struct msghdr *msg, size_t len) > err = PTR_ERR(skb); > if (!IS_ERR_OR_NULL(skb)) > err = udp_v6_send_skb(skb, &fl6); > - goto release_dst; > + goto out; > } > > lock_sock(sk); > @@ -1348,23 +1358,6 @@ int udpv6_sendmsg(struct sock *sk, struct msghdr *msg, size_t len) > err = np->recverr ? net_xmit_errno(err) : 0; > release_sock(sk); > > -release_dst: > - if (dst) { > - if (connected) { > - ip6_dst_store(sk, dst, > - ipv6_addr_equal(&fl6.daddr, &sk->sk_v6_daddr) ? > - &sk->sk_v6_daddr : NULL, > -#ifdef CONFIG_IPV6_SUBTREES > - ipv6_addr_equal(&fl6.saddr, &np->saddr) ? > - &np->saddr : > -#endif > - NULL); > - } else { > - dst_release(dst); > - } > - dst = NULL; > - } > - > out: > dst_release(dst); > fl6_sock_release(flowlabel); > -- > 1.8.3.1 >
On 26.03.2018 20:02, Martin KaFai Lau wrote: > On Mon, Mar 26, 2018 at 05:48:47PM +0300, Alexey Kodanev wrote: >> After commit 33c162a980fe ("ipv6: datagram: Update dst cache of a >> connected datagram sk during pmtu update"), when the error occurs on >> sending datagram in udpv6_sendmsg() due to ICMPV6_PKT_TOOBIG type, >> error handler can trigger the following path and call ip6_dst_store(): >> >> udpv6_err() >> ip6_sk_update_pmtu() >> ip6_datagram_dst_update() >> ip6_dst_lookup_flow(), can create a RTF_CACHE clone > Instead of ip6_dst_lookup_flow(), > you meant the RTF_CACHE route created in ip6_update_pmtu() > Right, or even earlier... I was using vti tunnel and it invokes skb_dst_update_pmtu() on this error, then sends ICMPv6_PKT_TOOBIG. >> ... >> ip6_dst_store() >> >> It can happen before a connected UDP socket invokes ip6_dst_store() >> in the end of udpv6_sendmsg(), on destination release, as a result, >> the last one changes dst to the old one, preventing getting updated >> dst cache on the next udpv6_sendmsg() call. >> >> This patch moves ip6_dst_store() in udpv6_sendmsg(), so that it is >> invoked after ip6_sk_dst_lookup_flow() and before udp_v6_send_skb(). > After this patch, the above udpv6_err() path could not happen after > ip6_sk_dst_lookup_flow() and before the ip6_dst_store() in udpv6_sendmsg()? > May be we could minimize this if save it in ip6_sk_dst_lookup_flow() for a connected UDP sockets only if we're not getting it from a cache for some reason? diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c index a8a9195..0204f52 100644 --- a/net/ipv6/ip6_output.c +++ b/net/ipv6/ip6_output.c @@ -1115,13 +1115,30 @@ struct dst_entry *ip6_dst_lookup_flow(const struct sock *sk, struct flowi6 *fl6, * error code. */ struct dst_entry *ip6_sk_dst_lookup_flow(struct sock *sk, struct flowi6 *fl6, - const struct in6_addr *final_dst) + const struct in6_addr *final_dst, + bool connected) { struct dst_entry *dst = sk_dst_check(sk, inet6_sk(sk)->dst_cookie); dst = ip6_sk_dst_check(sk, dst, fl6); - if (!dst) - dst = ip6_dst_lookup_flow(sk, fl6, final_dst); + if (dst) + return dst; + + dst = ip6_dst_lookup_flow(sk, fl6, final_dst); + + if (connected && !IS_ERR(dst)) + ip6_dst_store(sk, dst_clone(dst), ...); Thanks, Alexey >> >> Also, increase refcnt for dst, when passing it to ip6_dst_store() >> because after that the dst cache can be released by other calls >> to ip6_dst_store() with the same socket. >> >> Fixes: 33c162a980fe ("ipv6: datagram: Update dst cache of a connected datagram sk during pmtu update") >> Signed-off-by: Alexey Kodanev <alexey.kodanev@oracle.com> >> --- >> >> v2: * remove 'release_dst:' label >> >> * move ip6_dst_store() below MSG_CONFIRM check as >> suggested by Eric and add dst_clone() >> >> * add 'Fixes' commit. >> >> >> net/ipv6/udp.c | 29 +++++++++++------------------ >> 1 file changed, 11 insertions(+), 18 deletions(-) >> >> diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c >> index 52e3ea0..4508e5a 100644 >> --- a/net/ipv6/udp.c >> +++ b/net/ipv6/udp.c >> @@ -1303,6 +1303,16 @@ int udpv6_sendmsg(struct sock *sk, struct msghdr *msg, size_t len) >> goto do_confirm; >> back_from_confirm: >> >> + if (connected)>> + ip6_dst_store(sk, dst_clone(dst), >> + ipv6_addr_equal(&fl6.daddr, &sk->sk_v6_daddr) ? >> + &sk->sk_v6_daddr : NULL, >> +#ifdef CONFIG_IPV6_SUBTREES >> + ipv6_addr_equal(&fl6.saddr, &np->saddr) ? >> + &np->saddr : >> +#endif >> + NULL); >> + >> /* Lockless fast path for the non-corking case */ >> if (!corkreq) { >> struct sk_buff *skb; >> @@ -1314,7 +1324,7 @@ int udpv6_sendmsg(struct sock *sk, struct msghdr *msg, size_t len) >> err = PTR_ERR(skb); >> if (!IS_ERR_OR_NULL(skb)) >> err = udp_v6_send_skb(skb, &fl6); >> - goto release_dst; >> + goto out; >> } >> >> lock_sock(sk); >> @@ -1348,23 +1358,6 @@ int udpv6_sendmsg(struct sock *sk, struct msghdr *msg, size_t len) >> err = np->recverr ? net_xmit_errno(err) : 0; >> release_sock(sk); >> >> -release_dst: >> - if (dst) { >> - if (connected) { >> - ip6_dst_store(sk, dst, >> - ipv6_addr_equal(&fl6.daddr, &sk->sk_v6_daddr) ? >> - &sk->sk_v6_daddr : NULL, >> -#ifdef CONFIG_IPV6_SUBTREES >> - ipv6_addr_equal(&fl6.saddr, &np->saddr) ? >> - &np->saddr : >> -#endif >> - NULL); >> - } else { >> - dst_release(dst); >> - } >> - dst = NULL; >> - } >> - >> out: >> dst_release(dst); >> fl6_sock_release(flowlabel); >> -- >> 1.8.3.1 >>
On Tue, Mar 27, 2018 at 04:27:30PM +0300, Alexey Kodanev wrote: > On 26.03.2018 20:02, Martin KaFai Lau wrote: > > On Mon, Mar 26, 2018 at 05:48:47PM +0300, Alexey Kodanev wrote: > >> After commit 33c162a980fe ("ipv6: datagram: Update dst cache of a > >> connected datagram sk during pmtu update"), when the error occurs on > >> sending datagram in udpv6_sendmsg() due to ICMPV6_PKT_TOOBIG type, > >> error handler can trigger the following path and call ip6_dst_store(): > >> > >> udpv6_err() > >> ip6_sk_update_pmtu() > >> ip6_datagram_dst_update() > >> ip6_dst_lookup_flow(), can create a RTF_CACHE clone > > Instead of ip6_dst_lookup_flow(), > > you meant the RTF_CACHE route created in ip6_update_pmtu() > > > > Right, or even earlier... I was using vti tunnel and it invokes > skb_dst_update_pmtu() on this error, then sends ICMPv6_PKT_TOOBIG. > > >> ... > >> ip6_dst_store() > >> > >> It can happen before a connected UDP socket invokes ip6_dst_store() > >> in the end of udpv6_sendmsg(), on destination release, as a result, > >> the last one changes dst to the old one, preventing getting updated > >> dst cache on the next udpv6_sendmsg() call. > >> > >> This patch moves ip6_dst_store() in udpv6_sendmsg(), so that it is > >> invoked after ip6_sk_dst_lookup_flow() and before udp_v6_send_skb(). > > After this patch, the above udpv6_err() path could not happen after > > ip6_sk_dst_lookup_flow() and before the ip6_dst_store() in udpv6_sendmsg()? > > > > May be we could minimize this if save it in ip6_sk_dst_lookup_flow() > for a connected UDP sockets only if we're not getting it from a cache > for some reason? I am just not clear how moving it earlier can completely stop the described issue. Beside, the next ICMPV6_PKT_TOOBIG will be received again and eventually rectify sk->sk_dst_cache? > > diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c > index a8a9195..0204f52 100644 > --- a/net/ipv6/ip6_output.c > +++ b/net/ipv6/ip6_output.c > @@ -1115,13 +1115,30 @@ struct dst_entry *ip6_dst_lookup_flow(const struct sock *sk, struct flowi6 *fl6, > * error code. > */ > struct dst_entry *ip6_sk_dst_lookup_flow(struct sock *sk, struct flowi6 *fl6, > - const struct in6_addr *final_dst) > + const struct in6_addr *final_dst, > + bool connected) > { > struct dst_entry *dst = sk_dst_check(sk, inet6_sk(sk)->dst_cookie); > > dst = ip6_sk_dst_check(sk, dst, fl6); > - if (!dst) > - dst = ip6_dst_lookup_flow(sk, fl6, final_dst); > + if (dst) > + return dst; > + > + dst = ip6_dst_lookup_flow(sk, fl6, final_dst); > + > + if (connected && !IS_ERR(dst)) > + ip6_dst_store(sk, dst_clone(dst), ...); Right, I think doing dst_store() only if ip6_sk_dst_check()/ sk_dst_check() returns NULL is a more sensible thing to do here. > > Thanks, > Alexey > > >> > >> Also, increase refcnt for dst, when passing it to ip6_dst_store() > >> because after that the dst cache can be released by other calls > >> to ip6_dst_store() with the same socket. > >> > >> Fixes: 33c162a980fe ("ipv6: datagram: Update dst cache of a connected datagram sk during pmtu update") > >> Signed-off-by: Alexey Kodanev <alexey.kodanev@oracle.com> > >> --- > >> > >> v2: * remove 'release_dst:' label > >> > >> * move ip6_dst_store() below MSG_CONFIRM check as > >> suggested by Eric and add dst_clone() > >> > >> * add 'Fixes' commit. > >> > >> > >> net/ipv6/udp.c | 29 +++++++++++------------------ > >> 1 file changed, 11 insertions(+), 18 deletions(-) > >> > >> diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c > >> index 52e3ea0..4508e5a 100644 > >> --- a/net/ipv6/udp.c > >> +++ b/net/ipv6/udp.c > >> @@ -1303,6 +1303,16 @@ int udpv6_sendmsg(struct sock *sk, struct msghdr *msg, size_t len) > >> goto do_confirm; > >> back_from_confirm: > >> > >> + if (connected)>> + ip6_dst_store(sk, dst_clone(dst), > >> + ipv6_addr_equal(&fl6.daddr, &sk->sk_v6_daddr) ? > >> + &sk->sk_v6_daddr : NULL, > >> +#ifdef CONFIG_IPV6_SUBTREES > >> + ipv6_addr_equal(&fl6.saddr, &np->saddr) ? > >> + &np->saddr : > >> +#endif > >> + NULL); > >> + > >> /* Lockless fast path for the non-corking case */ > >> if (!corkreq) { > >> struct sk_buff *skb; > >> @@ -1314,7 +1324,7 @@ int udpv6_sendmsg(struct sock *sk, struct msghdr *msg, size_t len) > >> err = PTR_ERR(skb); > >> if (!IS_ERR_OR_NULL(skb)) > >> err = udp_v6_send_skb(skb, &fl6); > >> - goto release_dst; > >> + goto out; > >> } > >> > >> lock_sock(sk); > >> @@ -1348,23 +1358,6 @@ int udpv6_sendmsg(struct sock *sk, struct msghdr *msg, size_t len) > >> err = np->recverr ? net_xmit_errno(err) : 0; > >> release_sock(sk); > >> > >> -release_dst: > >> - if (dst) { > >> - if (connected) { > >> - ip6_dst_store(sk, dst, > >> - ipv6_addr_equal(&fl6.daddr, &sk->sk_v6_daddr) ? > >> - &sk->sk_v6_daddr : NULL, > >> -#ifdef CONFIG_IPV6_SUBTREES > >> - ipv6_addr_equal(&fl6.saddr, &np->saddr) ? > >> - &np->saddr : > >> -#endif > >> - NULL); > >> - } else { > >> - dst_release(dst); > >> - } > >> - dst = NULL; > >> - } > >> - > >> out: > >> dst_release(dst); > >> fl6_sock_release(flowlabel); > >> -- > >> 1.8.3.1 > >> >
On 27.03.2018 22:00, Martin KaFai Lau wrote: > On Tue, Mar 27, 2018 at 04:27:30PM +0300, Alexey Kodanev wrote: >> On 26.03.2018 20:02, Martin KaFai Lau wrote: >>> On Mon, Mar 26, 2018 at 05:48:47PM +0300, Alexey Kodanev wrote: >>>> After commit 33c162a980fe ("ipv6: datagram: Update dst cache of a >>>> connected datagram sk during pmtu update"), when the error occurs on >>>> sending datagram in udpv6_sendmsg() due to ICMPV6_PKT_TOOBIG type, >>>> error handler can trigger the following path and call ip6_dst_store(): >>>> >>>> udpv6_err() >>>> ip6_sk_update_pmtu() >>>> ip6_datagram_dst_update() >>>> ip6_dst_lookup_flow(), can create a RTF_CACHE clone >>> Instead of ip6_dst_lookup_flow(), >>> you meant the RTF_CACHE route created in ip6_update_pmtu() >>> >> >> Right, or even earlier... I was using vti tunnel and it invokes >> skb_dst_update_pmtu() on this error, then sends ICMPv6_PKT_TOOBIG. >> >>>> ... >>>> ip6_dst_store() >>>> >>>> It can happen before a connected UDP socket invokes ip6_dst_store() >>>> in the end of udpv6_sendmsg(), on destination release, as a result, >>>> the last one changes dst to the old one, preventing getting updated >>>> dst cache on the next udpv6_sendmsg() call. >>>> >>>> This patch moves ip6_dst_store() in udpv6_sendmsg(), so that it is >>>> invoked after ip6_sk_dst_lookup_flow() and before udp_v6_send_skb(). >>> After this patch, the above udpv6_err() path could not happen after >>> ip6_sk_dst_lookup_flow() and before the ip6_dst_store() in udpv6_sendmsg()? >>> >> >> May be we could minimize this if save it in ip6_sk_dst_lookup_flow() >> for a connected UDP sockets only if we're not getting it from a cache >> for some reason? > I am just not clear how moving it earlier can completely stop the > described issue. Beside, the next ICMPV6_PKT_TOOBIG will be > received again and eventually rectify sk->sk_dst_cache? With a request/response scenario in my environment, it always happens after udp_v6_send_skb() and before the 'release_dst:' in udpv6_sendmsg(). So the socket there never gets updated cache when resending datagram again after ICMPV6_PKT_TOOBIG... dst cache is always reset to the old one in the end of udpv6_sendmsg(). > >> >> diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c >> index a8a9195..0204f52 100644 >> --- a/net/ipv6/ip6_output.c >> +++ b/net/ipv6/ip6_output.c >> @@ -1115,13 +1115,30 @@ struct dst_entry *ip6_dst_lookup_flow(const struct sock *sk, struct flowi6 *fl6, >> * error code. >> */ >> struct dst_entry *ip6_sk_dst_lookup_flow(struct sock *sk, struct flowi6 *fl6, >> - const struct in6_addr *final_dst) >> + const struct in6_addr *final_dst, >> + bool connected) >> { >> struct dst_entry *dst = sk_dst_check(sk, inet6_sk(sk)->dst_cookie); >> >> dst = ip6_sk_dst_check(sk, dst, fl6); >> - if (!dst) >> - dst = ip6_dst_lookup_flow(sk, fl6, final_dst); >> + if (dst) >> + return dst; >> + >> + dst = ip6_dst_lookup_flow(sk, fl6, final_dst); >> + >> + if (connected && !IS_ERR(dst)) >> + ip6_dst_store(sk, dst_clone(dst), ...); > Right, I think doing dst_store() only if ip6_sk_dst_check()/ > sk_dst_check() returns NULL is a more sensible thing to > do here. OK, will prepare the patch. Thanks, Alexey
diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c index 52e3ea0..4508e5a 100644 --- a/net/ipv6/udp.c +++ b/net/ipv6/udp.c @@ -1303,6 +1303,16 @@ int udpv6_sendmsg(struct sock *sk, struct msghdr *msg, size_t len) goto do_confirm; back_from_confirm: + if (connected) + ip6_dst_store(sk, dst_clone(dst), + ipv6_addr_equal(&fl6.daddr, &sk->sk_v6_daddr) ? + &sk->sk_v6_daddr : NULL, +#ifdef CONFIG_IPV6_SUBTREES + ipv6_addr_equal(&fl6.saddr, &np->saddr) ? + &np->saddr : +#endif + NULL); + /* Lockless fast path for the non-corking case */ if (!corkreq) { struct sk_buff *skb; @@ -1314,7 +1324,7 @@ int udpv6_sendmsg(struct sock *sk, struct msghdr *msg, size_t len) err = PTR_ERR(skb); if (!IS_ERR_OR_NULL(skb)) err = udp_v6_send_skb(skb, &fl6); - goto release_dst; + goto out; } lock_sock(sk); @@ -1348,23 +1358,6 @@ int udpv6_sendmsg(struct sock *sk, struct msghdr *msg, size_t len) err = np->recverr ? net_xmit_errno(err) : 0; release_sock(sk); -release_dst: - if (dst) { - if (connected) { - ip6_dst_store(sk, dst, - ipv6_addr_equal(&fl6.daddr, &sk->sk_v6_daddr) ? - &sk->sk_v6_daddr : NULL, -#ifdef CONFIG_IPV6_SUBTREES - ipv6_addr_equal(&fl6.saddr, &np->saddr) ? - &np->saddr : -#endif - NULL); - } else { - dst_release(dst); - } - dst = NULL; - } - out: dst_release(dst); fl6_sock_release(flowlabel);
After commit 33c162a980fe ("ipv6: datagram: Update dst cache of a connected datagram sk during pmtu update"), when the error occurs on sending datagram in udpv6_sendmsg() due to ICMPV6_PKT_TOOBIG type, error handler can trigger the following path and call ip6_dst_store(): udpv6_err() ip6_sk_update_pmtu() ip6_datagram_dst_update() ip6_dst_lookup_flow(), can create a RTF_CACHE clone ... ip6_dst_store() It can happen before a connected UDP socket invokes ip6_dst_store() in the end of udpv6_sendmsg(), on destination release, as a result, the last one changes dst to the old one, preventing getting updated dst cache on the next udpv6_sendmsg() call. This patch moves ip6_dst_store() in udpv6_sendmsg(), so that it is invoked after ip6_sk_dst_lookup_flow() and before udp_v6_send_skb(). Also, increase refcnt for dst, when passing it to ip6_dst_store() because after that the dst cache can be released by other calls to ip6_dst_store() with the same socket. Fixes: 33c162a980fe ("ipv6: datagram: Update dst cache of a connected datagram sk during pmtu update") Signed-off-by: Alexey Kodanev <alexey.kodanev@oracle.com> --- v2: * remove 'release_dst:' label * move ip6_dst_store() below MSG_CONFIRM check as suggested by Eric and add dst_clone() * add 'Fixes' commit. net/ipv6/udp.c | 29 +++++++++++------------------ 1 file changed, 11 insertions(+), 18 deletions(-)