Message ID | 1361305694-8303-1-git-send-email-nhorman@tuxdriver.com |
---|---|
State | RFC, archived |
Delegated to: | David Miller |
Headers | show |
On Tue, 2013-02-19 at 15:28 -0500, Neil Horman wrote: > static inline void rt6_update_expires(struct rt6_info *rt, int timeout) > { > if (!(rt->rt6i_flags & RTF_EXPIRES)) { > - if (rt->dst.from) > - dst_release(rt->dst.from); > + dst_release(rt->dst.from); > /* dst_set_expires relies on expires == 0 > * if it has not been set previously. > */ > rt->dst.expires = 0; > + rt6->dst.from = NULL; > } > Sorry you didnt really address the problem, only reduce the race window. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tue, Feb 19, 2013 at 01:17:45PM -0800, Eric Dumazet wrote: > On Tue, 2013-02-19 at 15:28 -0500, Neil Horman wrote: > > > static inline void rt6_update_expires(struct rt6_info *rt, int timeout) > > { > > if (!(rt->rt6i_flags & RTF_EXPIRES)) { > > - if (rt->dst.from) > > - dst_release(rt->dst.from); > > + dst_release(rt->dst.from); > > /* dst_set_expires relies on expires == 0 > > * if it has not been set previously. > > */ > > rt->dst.expires = 0; > > + rt6->dst.from = NULL; > > } > > > > Sorry you didnt really address the problem, only reduce the race window. > I kinda had a feeling you would say that, but the only other solution I see here is to either introduce some locking to protect the from pointer, or two revert the patch that introduced the from pointer alltogether, neither of which sounds appealing to me. I suppose we could use an xchng to atomically update the from pointer, so there was only ever one context that was able to free it in rt6_update_path. Does that seem reasonable to you? Neil -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tue, 2013-02-19 at 16:49 -0500, Neil Horman wrote: > On Tue, Feb 19, 2013 at 01:17:45PM -0800, Eric Dumazet wrote: > > On Tue, 2013-02-19 at 15:28 -0500, Neil Horman wrote: > > > > > static inline void rt6_update_expires(struct rt6_info *rt, int timeout) > > > { > > > if (!(rt->rt6i_flags & RTF_EXPIRES)) { > > > - if (rt->dst.from) > > > - dst_release(rt->dst.from); > > > + dst_release(rt->dst.from); > > > /* dst_set_expires relies on expires == 0 > > > * if it has not been set previously. > > > */ > > > rt->dst.expires = 0; > > > + rt6->dst.from = NULL; > > > } > > > > > > > Sorry you didnt really address the problem, only reduce the race window. > > > I kinda had a feeling you would say that, but the only other solution I see here > is to either introduce some locking to protect the from pointer, or two revert > the patch that introduced the from pointer alltogether, neither of which sounds > appealing to me. I suppose we could use an xchng to atomically update the from > pointer, so there was only ever one context that was able to free it in > rt6_update_path. Does that seem reasonable to you? I believe the setting of rt6->dst.from is safe : It should be done at : - dst creation time, when we are the only user. - dst destry time, when we are the only user. We only have to do the dst_release() at the right place, when we are the last user of the dst. So rt6_update_expires() should not mess with rt6->dst.from at all. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Hi Eric & Neil, On 2013/02/20 05:55, Eric Dumazet wrote: > On Tue, 2013-02-19 at 16:49 -0500, Neil Horman wrote: >> On Tue, Feb 19, 2013 at 01:17:45PM -0800, Eric Dumazet wrote: >>> On Tue, 2013-02-19 at 15:28 -0500, Neil Horman wrote: >>> >>>> static inline void rt6_update_expires(struct rt6_info *rt, int timeout) >>>> { >>>> if (!(rt->rt6i_flags & RTF_EXPIRES)) { >>>> - if (rt->dst.from) >>>> - dst_release(rt->dst.from); >>>> + dst_release(rt->dst.from); >>>> /* dst_set_expires relies on expires == 0 >>>> * if it has not been set previously. >>>> */ >>>> rt->dst.expires = 0; >>>> + rt6->dst.from = NULL; >>>> } >>>> >>> >>> Sorry you didnt really address the problem, only reduce the race window. >>> >> I kinda had a feeling you would say that, but the only other solution I see here >> is to either introduce some locking to protect the from pointer, or two revert >> the patch that introduced the from pointer alltogether, neither of which sounds >> appealing to me. I suppose we could use an xchng to atomically update the from >> pointer, so there was only ever one context that was able to free it in >> rt6_update_path. Does that seem reasonable to you? > Thanks for your research on this problem. > I believe the setting of rt6->dst.from is safe : > It should be done at : > - dst creation time, when we are the only user. > - dst destry time, when we are the only user. > > We only have to do the dst_release() at the right place, when we are the > last user of the dst. > > So rt6_update_expires() should not mess with rt6->dst.from at all. > How can we? one usage of rt6_update_expires and rt6_set_expires is changing rt6->dst.from to rt6->dst.expires, we should release the already holded reference of rt6->dst.from. I think maybe we should rework the commit 1716a96101c49186b (ipv6: fix problem with expired dst cache). just force setting flag RTF_EXPIRES and expires for the rt6 which copied from the rt6(which is generated from RA). The only problem of this situation is this copied rt6's expire time may be imprecise, we may receive a new RA with a new expire time. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
> static inline void rt6_clean_expires(struct rt6_info *rt) > { > - if (!(rt->rt6i_flags & RTF_EXPIRES) && rt->dst.from) > + if (!rt->rt6i_flags & RTF_EXPIRES) > dst_release(rt->dst.from); > > rt->rt6i_flags &= ~RTF_EXPIRES; > rt->dst.from = NULL; > + rt->dst.expires = 0; > } > > static inline void rt6_set_expires(struct rt6_info *rt, unsigned long expires) > { > - if (!(rt->rt6i_flags & RTF_EXPIRES) && rt->dst.from) > + if (!rt->rt6i_flags & RTF_EXPIRES) > dst_release(rt->dst.from); > > rt->rt6i_flags |= RTF_EXPIRES; > + rt->dst.from = NULL; > rt->dst.expires = expires; > } > > static inline void rt6_update_expires(struct rt6_info *rt, int timeout) > { > if (!(rt->rt6i_flags & RTF_EXPIRES)) { > - if (rt->dst.from) > - dst_release(rt->dst.from); > + dst_release(rt->dst.from); > /* dst_set_expires relies on expires == 0 > * if it has not been set previously. > */ > rt->dst.expires = 0; > + rt6->dst.from = NULL; > } Aren't there also problems with setting and clearing RTF_EXPIRES? Since that flag looks as though it was the descriminant for the union it probably isn't needed - provided dst.expires is never 0 when valid. David -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, Feb 20, 2013 at 10:55:35AM -0000, David Laight wrote: > > static inline void rt6_clean_expires(struct rt6_info *rt) > > { > > - if (!(rt->rt6i_flags & RTF_EXPIRES) && rt->dst.from) > > + if (!rt->rt6i_flags & RTF_EXPIRES) > > dst_release(rt->dst.from); > > > > rt->rt6i_flags &= ~RTF_EXPIRES; > > rt->dst.from = NULL; > > + rt->dst.expires = 0; > > } > > > > static inline void rt6_set_expires(struct rt6_info *rt, unsigned long expires) > > { > > - if (!(rt->rt6i_flags & RTF_EXPIRES) && rt->dst.from) > > + if (!rt->rt6i_flags & RTF_EXPIRES) > > dst_release(rt->dst.from); > > > > rt->rt6i_flags |= RTF_EXPIRES; > > + rt->dst.from = NULL; > > rt->dst.expires = expires; > > } > > > > static inline void rt6_update_expires(struct rt6_info *rt, int timeout) > > { > > if (!(rt->rt6i_flags & RTF_EXPIRES)) { > > - if (rt->dst.from) > > - dst_release(rt->dst.from); > > + dst_release(rt->dst.from); > > /* dst_set_expires relies on expires == 0 > > * if it has not been set previously. > > */ > > rt->dst.expires = 0; > > + rt6->dst.from = NULL; > > } > > Aren't there also problems with setting and clearing RTF_EXPIRES? > Since that flag looks as though it was the descriminant for the union > it probably isn't needed - provided dst.expires is never 0 when valid. > > David The use of RTF_EXPIRES is weak at this point, if there are multiple accessors, but I think the point is moot, in that the only thing we ever do when we change the flag is release dst.from, which is safe. Neil -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
> > Aren't there also problems with setting and clearing RTF_EXPIRES? > > Since that flag looks as though it was the descriminant for the union > > it probably isn't needed - provided dst.expires is never 0 when valid. > > > > David > > The use of RTF_EXPIRES is weak at this point, if there are multiple accessors, > but I think the point is moot, in that the only thing we ever do when we change > the flag is release dst.from, which is safe. > > Neil I was also worried about the RMW cycles of rt6i_flags itself. David -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/include/net/dst.h b/include/net/dst.h index 3da47e0..6b7ebcf 100644 --- a/include/net/dst.h +++ b/include/net/dst.h @@ -36,13 +36,10 @@ struct dst_entry { struct net_device *dev; struct dst_ops *ops; unsigned long _metrics; - union { - unsigned long expires; - /* point to where the dst_entry copied from */ - struct dst_entry *from; - }; + unsigned long expires; + /* point to where the dst_entry copied from */ + struct dst_entry *from; struct dst_entry *path; - void *__pad0; #ifdef CONFIG_XFRM struct xfrm_state *xfrm; #else diff --git a/include/net/ip6_fib.h b/include/net/ip6_fib.h index 6919a50..a285e37 100644 --- a/include/net/ip6_fib.h +++ b/include/net/ip6_fib.h @@ -164,31 +164,33 @@ static inline struct inet6_dev *ip6_dst_idev(struct dst_entry *dst) static inline void rt6_clean_expires(struct rt6_info *rt) { - if (!(rt->rt6i_flags & RTF_EXPIRES) && rt->dst.from) + if (!rt->rt6i_flags & RTF_EXPIRES) dst_release(rt->dst.from); rt->rt6i_flags &= ~RTF_EXPIRES; rt->dst.from = NULL; + rt->dst.expires = 0; } static inline void rt6_set_expires(struct rt6_info *rt, unsigned long expires) { - if (!(rt->rt6i_flags & RTF_EXPIRES) && rt->dst.from) + if (!rt->rt6i_flags & RTF_EXPIRES) dst_release(rt->dst.from); rt->rt6i_flags |= RTF_EXPIRES; + rt->dst.from = NULL; rt->dst.expires = expires; } static inline void rt6_update_expires(struct rt6_info *rt, int timeout) { if (!(rt->rt6i_flags & RTF_EXPIRES)) { - if (rt->dst.from) - dst_release(rt->dst.from); + dst_release(rt->dst.from); /* dst_set_expires relies on expires == 0 * if it has not been set previously. */ rt->dst.expires = 0; + rt6->dst.from = NULL; } dst_set_expires(&rt->dst, timeout); @@ -199,7 +201,7 @@ static inline void rt6_set_from(struct rt6_info *rt, struct rt6_info *from) { struct dst_entry *new = (struct dst_entry *) from; - if (!(rt->rt6i_flags & RTF_EXPIRES) && rt->dst.from) { + if (!rt->rt6i_flags & RTF_EXPIRES) { if (new == rt->dst.from) return; dst_release(rt->dst.from); @@ -207,6 +209,7 @@ static inline void rt6_set_from(struct rt6_info *rt, struct rt6_info *from) rt->rt6i_flags &= ~RTF_EXPIRES; rt->dst.from = new; + rt->dst.expires = 0; dst_hold(new); }
I've been looking into some random ipv6 crashes lately that occur around ip6_link_failure. This and simmilar partial stack traces: ip6_link_failure+0xbe/0xd0 ipip6_tunnel_xmit+0x7e7/0x860 [sit] dev_hard_start_xmit+0x3e3/0x690 dev_queue_xmit+0x38f/0x610 neigh_direct_output+0x11/0x20 ip6_finish_output2+0x90/0x340 ? ac6_proc_exit+0x20/0x20 ip6_finish_output+0x98/0xc0 ip6_output ? __ip6_local_out ip6_local_out ip6_push_pending ? ip6_append_data udp_v6_push_pending_frames udpv6_sendmsg were all I had. Eric D. Made this click with me this morning however, noting a possible/likely race condition in the ipv6 code. It appears that, if a dst entry is accessed by multiple cpus (and it appears that certainly can happen, as routes created via ip6_rt_copy are hashed back into the fib, for future lookups), then the use of the rt6i_flags field can race, leading to multiple conflicting uses of the aforementioned union (jiffies being interpreted as the from pointer and vice versa). Eric suggested that this be fixed in rt6_update_expires, but looking at the other uses of this union I don't think thats a complete fix. All the accessors for the expired|from union in the dst entry seem to rely on the RTF_EXPIRES flag being accessed and updated atomically, and thats just not the case. I think the only fix here, that doesn't involve additional locking, is to separate the expires and from fields into their own storage, so as not to trample one another. This is currently untested, but I've given it to several people who can reproduce this problem and are testing it now. I'll post again when I have results from them. Signed-off-by: Neil Horman <nhorman@tuxdriver.com> CC: eric.dumazet@gmail.com CC: David Miller <davem@davemloft.net> CC: Gao feng <gaofeng@cn.fujitsu.com> CC: Jiri Bohac <jbohac@suse.cz> --- include/net/dst.h | 9 +++------ include/net/ip6_fib.h | 13 ++++++++----- 2 files changed, 11 insertions(+), 11 deletions(-)