| Message ID | 20170316183238.15806.14293.stgit@localhost.localdomain |
| --- | --- |
| State | Changes Requested, archived |
| Delegated to: | David Miller |
On Thu, 2017-03-16 at 11:32 -0700, Alexander Duyck wrote:
> From: Sridhar Samudrala <sridhar.samudrala@intel.com>
>
> Fix sk_mark_napi_id() and sk_mark_napi_id_once() to set sk_napi_id only if
> skb->napi_id is a valid value.
>
> This happens in loopback paths where skb->napi_id is not updated in
> rx path and holds sender_cpu that is set in xmit path.
>
> Signed-off-by: Sridhar Samudrala <sridhar.samudrala@intel.com>
> Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
> ---
>  include/net/busy_poll.h | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/include/net/busy_poll.h b/include/net/busy_poll.h
> index c0452de83086..67991635953e 100644
> --- a/include/net/busy_poll.h
> +++ b/include/net/busy_poll.h
> @@ -116,7 +116,8 @@ static inline bool sk_busy_loop(struct sock *sk, int nonblock)
>  static inline void sk_mark_napi_id(struct sock *sk, const struct sk_buff *skb)
>  {
>  #ifdef CONFIG_NET_RX_BUSY_POLL
> -	sk->sk_napi_id = skb->napi_id;
> +	if (skb->napi_id > (u32)NR_CPUS)
> +		sk->sk_napi_id = skb->napi_id;
>  #endif
>  }
>
> @@ -125,7 +126,7 @@ static inline void sk_mark_napi_id_once(struct sock *sk,
> 						const struct sk_buff *skb)
>  {
>  #ifdef CONFIG_NET_RX_BUSY_POLL
> -	if (!sk->sk_napi_id)
> +	if (!sk->sk_napi_id && (skb->napi_id > (u32)NR_CPUS))
>  		sk->sk_napi_id = skb->napi_id;
>  #endif
>  }

It is not clear why this patch is needed.

What you describe here is the case where we might receive packets for a socket
coming from different interfaces?

If skb->napi_id is a sender_cpu, why should we prevent overwriting the
sk_napi_id with it, knowing that busy polling will simply ignore the
invalid value?

Do not get me wrong:

I simply try to understand why the test about napi_id validity is now
done twice:

1) At the time we are writing into sk->sk_napi_id

2) At busy polling time, when we read sk->sk_napi_id
On Thu, Mar 16, 2017 at 3:05 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> On Thu, 2017-03-16 at 11:32 -0700, Alexander Duyck wrote:
>> From: Sridhar Samudrala <sridhar.samudrala@intel.com>
>>
>> Fix sk_mark_napi_id() and sk_mark_napi_id_once() to set sk_napi_id only if
>> skb->napi_id is a valid value.
>>
>> This happens in loopback paths where skb->napi_id is not updated in
>> rx path and holds sender_cpu that is set in xmit path.
>>
>> [patch quoted in full above, snipped]
>
> It is not clear why this patch is needed.
>
> What you describe here is the case where we might receive packets for a
> socket coming from different interfaces?
>
> If skb->napi_id is a sender_cpu, why should we prevent overwriting the
> sk_napi_id with it, knowing that busy polling will simply ignore the
> invalid value?
>
> Do not get me wrong:
>
> I simply try to understand why the test about napi_id validity is now
> done twice:
>
> 1) At the time we are writing into sk->sk_napi_id

I would argue that this is the one piece we were missing.

> 2) At busy polling time, when we read sk->sk_napi_id

Unless there was something recently added, I don't think this was ever
checked. Instead we start digging into the hash looking for an ID that
won't ever be there. Maybe we should add something to napi_by_id that
just returns NULL in these cases.

On top of that, I think there end up being several spots where once we
lock in a non-NAPI ID it is stuck, such as the sk_mark_napi_id_once
call. I figure we are better off locking in an actual NAPI ID rather
than getting a sender_cpu stuck in there.
On 3/16/2017 3:05 PM, Eric Dumazet wrote:
> On Thu, 2017-03-16 at 11:32 -0700, Alexander Duyck wrote:
>> From: Sridhar Samudrala <sridhar.samudrala@intel.com>
>>
>> Fix sk_mark_napi_id() and sk_mark_napi_id_once() to set sk_napi_id only if
>> skb->napi_id is a valid value.
>>
>> This happens in loopback paths where skb->napi_id is not updated in
>> rx path and holds sender_cpu that is set in xmit path.
>>
>> [patch quoted in full above, snipped]
>
> It is not clear why this patch is needed.
>
> What you describe here is the case where we might receive packets for a
> socket coming from different interfaces?

This is seen with AF_UNIX or AF_INET sockets over loopback.

> If skb->napi_id is a sender_cpu, why should we prevent overwriting the
> sk_napi_id with it, knowing that busy polling will simply ignore the
> invalid value?

We are not checking for an invalid value (< NR_CPUS) in busy poll; any
non-zero sk->sk_napi_id is considered valid.

If we don't want to add this check while setting sk->sk_napi_id, we
could change the check in ep_set_busy_poll_napi_id() to check for an
invalid value rather than a non-zero value.

> Do not get me wrong:
>
> I simply try to understand why the test about napi_id validity is now
> done twice:
>
> 1) At the time we are writing into sk->sk_napi_id
>
> 2) At busy polling time, when we read sk->sk_napi_id
On Thu, 2017-03-16 at 15:33 -0700, Alexander Duyck wrote:
> On Thu, Mar 16, 2017 at 3:05 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
>> It is not clear why this patch is needed.
>>
>> What you describe here is the case where we might receive packets for a
>> socket coming from different interfaces?
>>
>> If skb->napi_id is a sender_cpu, why should we prevent overwriting the
>> sk_napi_id with it, knowing that busy polling will simply ignore the
>> invalid value?
>>
>> Do not get me wrong:
>>
>> I simply try to understand why the test about napi_id validity is now
>> done twice:
>>
>> 1) At the time we are writing into sk->sk_napi_id
>
> I would argue that this is the one piece we were missing.
>
>> 2) At busy polling time, when we read sk->sk_napi_id
>
> Unless there was something recently added, I don't think this was ever
> checked. Instead we start digging into the hash looking for an ID that
> won't ever be there. Maybe we should add something to napi_by_id that
> just returns NULL in these cases.

But this is exactly what should happen.

For an invalid ID, we return NULL from napi_by_id().

No need to add code for that, since the function is meant to deal with
valid cases.

> On top of that, I think there end up being several spots where once we
> lock in a non-NAPI ID it is stuck, such as the sk_mark_napi_id_once
> call. I figure we are better off locking in an actual NAPI ID rather
> than getting a sender_cpu stuck in there.

Are you referring to sk_mark_napi_id_once()?

Since this is only used by UDP, I would be OK to avoid the 'locking'
for 'sender_cpu' ids.
On Thu, Mar 16, 2017 at 3:50 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> On Thu, 2017-03-16 at 15:33 -0700, Alexander Duyck wrote:
>> [earlier discussion snipped]
>>
>> Unless there was something recently added, I don't think this was ever
>> checked. Instead we start digging into the hash looking for an ID that
>> won't ever be there. Maybe we should add something to napi_by_id that
>> just returns NULL in these cases.
>
> But this is exactly what should happen.
>
> For an invalid ID, we return NULL from napi_by_id().
>
> No need to add code for that, since the function is meant to deal with
> valid cases.

I don't know. My concern here is about the cost of going through all
that code just for something that we know shouldn't be valid. If
nothing else I might update sk_can_busy_loop so that it doesn't try
busy looping on a sk_napi_id that is NR_CPUS or less.

>> On top of that, I think there end up being several spots where once we
>> lock in a non-NAPI ID it is stuck, such as the sk_mark_napi_id_once
>> call. I figure we are better off locking in an actual NAPI ID rather
>> than getting a sender_cpu stuck in there.
>
> Are you referring to sk_mark_napi_id_once()?
>
> Since this is only used by UDP, I would be OK to avoid the 'locking'
> for 'sender_cpu' ids.

What I probably can do is go through and replace all the spots where
we were checking for sk_napi_id being 0, and instead replace it with
a check against NR_CPUS.
On Thu, 2017-03-16 at 19:40 -0700, Alexander Duyck wrote:
> I don't know. My concern here is about the cost of going through all
> that code just for something that we know shouldn't be valid. If
> nothing else I might update sk_can_busy_loop so that it doesn't try
> busy looping on a sk_napi_id that is NR_CPUS or less.

But why would that be a win?

If napi_by_id() returns NULL, we immediately give up (goto out;).

So why should we add code that will not be useful for the vast majority
of cases, where the ID is valid and we need to do the hash lookup?

Is libc trying to avoid doing syscalls like close(-1)?
On Thu, 2017-03-16 at 19:40 -0700, Alexander Duyck wrote:
> What I probably can do is go through and replace all the spots where
> we were checking for sk_napi_id being 0, and instead replace it with
> a check against NR_CPUS.

This seems like a good idea.
On Thu, Mar 16, 2017 at 7:57 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> On Thu, 2017-03-16 at 19:40 -0700, Alexander Duyck wrote:
>> What I probably can do is go through and replace all the spots where
>> we were checking for sk_napi_id being 0, and instead replace it with
>> a check against NR_CPUS.
>
> This seems like a good idea.

Right. This is the path I am planning to go with. It will require about
the same amount of code change, but it replaces those checks. I just
have to make sure I catch all the spots where we were checking it for 0,
but that shouldn't be too difficult an issue.
```diff
diff --git a/include/net/busy_poll.h b/include/net/busy_poll.h
index c0452de83086..67991635953e 100644
--- a/include/net/busy_poll.h
+++ b/include/net/busy_poll.h
@@ -116,7 +116,8 @@ static inline bool sk_busy_loop(struct sock *sk, int nonblock)
 static inline void sk_mark_napi_id(struct sock *sk, const struct sk_buff *skb)
 {
 #ifdef CONFIG_NET_RX_BUSY_POLL
-	sk->sk_napi_id = skb->napi_id;
+	if (skb->napi_id > (u32)NR_CPUS)
+		sk->sk_napi_id = skb->napi_id;
 #endif
 }
 
@@ -125,7 +126,7 @@ static inline void sk_mark_napi_id_once(struct sock *sk,
 						const struct sk_buff *skb)
 {
 #ifdef CONFIG_NET_RX_BUSY_POLL
-	if (!sk->sk_napi_id)
+	if (!sk->sk_napi_id && (skb->napi_id > (u32)NR_CPUS))
 		sk->sk_napi_id = skb->napi_id;
 #endif
 }
```