Message ID | 20180326065443.7880-1-bhole_prashant_q7@lab.ntt.co.jp |
---|---|
State | Changes Requested, archived |
Delegated to: | BPF Maintainers |
Headers | show |
Series | [bpf-next] bpf: sockmap: initialize sg table entries properly | expand |
On 03/25/2018 11:54 PM, Prashant Bhole wrote: > When CONFIG_DEBUG_SG is set, sg->sg_magic is initialized to SG_MAGIC, > when sg table is initialized using sg_init_table(). Magic is checked > while navigating the scatterlist. We hit BUG_ON when magic check is > failed. > > Fixed following things: > - Initialization of sg table in bpf_tcp_sendpage() was missing, > initialized it using sg_init_table() > > - bpf_tcp_sendmsg() initializes sg table using sg_init_table() before > entering the loop, but further consumed sg entries are initialized > using memset. Fixed it by replacing memset with sg_init_table() in > function bpf_tcp_push() > > Signed-off-by: Prashant Bhole <bhole_prashant_q7@lab.ntt.co.jp> > --- > kernel/bpf/sockmap.c | 11 +++++++---- > 1 file changed, 7 insertions(+), 4 deletions(-) > > diff --git a/kernel/bpf/sockmap.c b/kernel/bpf/sockmap.c > index 69c5bccabd22..8a848a99d768 100644 > --- a/kernel/bpf/sockmap.c > +++ b/kernel/bpf/sockmap.c > @@ -312,7 +312,7 @@ static int bpf_tcp_push(struct sock *sk, int apply_bytes, > md->sg_start++; > if (md->sg_start == MAX_SKB_FRAGS) > md->sg_start = 0; > - memset(sg, 0, sizeof(*sg)); > + sg_init_table(sg, 1); Looks OK here. > > if (md->sg_start == md->sg_end) > break; > @@ -763,10 +763,14 @@ static int bpf_tcp_sendpage(struct sock *sk, struct page *page, > > lock_sock(sk); > > - if (psock->cork_bytes) > + if (psock->cork_bytes) { > m = psock->cork; > - else > + sg = &m->sg_data[m->sg_end]; > + } else { > m = &md; > + sg = m->sg_data; > + sg_init_table(sg, MAX_SKB_FRAGS); sg_init_table() does an unnecessary memset() though. We probably either want a new scatterlist API or just open code this, #ifdef CONFIG_DEBUG_SG { unsigned int i; for (i = 0; i < nents; i++) sgl[i].sg_magic = SG_MAGIC; } > + } > > /* Catch case where ring is full and sendpage is stalled. */ > if (unlikely(m->sg_end == m->sg_start && > @@ -774,7 +778,6 @@ static int bpf_tcp_sendpage(struct sock *sk, struct page *page, > goto out_err; > > psock->sg_size += size; > - sg = &m->sg_data[m->sg_end]; > sg_set_page(sg, page, size, offset); > get_page(page); > m->sg_copy[m->sg_end] = true; > Nice, catch. I probably should audit though code paths as well and run the test suite with CONFIG_DEBUG_SG. There might be a couple other spots where I open coded the sg elements. Thanks, John
On 3/27/2018 12:15 PM, John Fastabend wrote: > On 03/25/2018 11:54 PM, Prashant Bhole wrote: >> When CONFIG_DEBUG_SG is set, sg->sg_magic is initialized to SG_MAGIC, >> when sg table is initialized using sg_init_table(). Magic is checked >> while navigating the scatterlist. We hit BUG_ON when magic check is >> failed. >> >> Fixed following things: >> - Initialization of sg table in bpf_tcp_sendpage() was missing, >> initialized it using sg_init_table() >> >> - bpf_tcp_sendmsg() initializes sg table using sg_init_table() before >> entering the loop, but further consumed sg entries are initialized >> using memset. Fixed it by replacing memset with sg_init_table() in >> function bpf_tcp_push() >> >> Signed-off-by: Prashant Bhole <bhole_prashant_q7@lab.ntt.co.jp> >> --- >> kernel/bpf/sockmap.c | 11 +++++++---- >> 1 file changed, 7 insertions(+), 4 deletions(-) >> >> diff --git a/kernel/bpf/sockmap.c b/kernel/bpf/sockmap.c >> index 69c5bccabd22..8a848a99d768 100644 >> --- a/kernel/bpf/sockmap.c >> +++ b/kernel/bpf/sockmap.c >> @@ -312,7 +312,7 @@ static int bpf_tcp_push(struct sock *sk, int apply_bytes, >> md->sg_start++; >> if (md->sg_start == MAX_SKB_FRAGS) >> md->sg_start = 0; >> - memset(sg, 0, sizeof(*sg)); >> + sg_init_table(sg, 1); > > Looks OK here. > >> >> if (md->sg_start == md->sg_end) >> break; >> @@ -763,10 +763,14 @@ static int bpf_tcp_sendpage(struct sock *sk, struct page *page, >> >> lock_sock(sk); >> >> - if (psock->cork_bytes) >> + if (psock->cork_bytes) { >> m = psock->cork; >> - else >> + sg = &m->sg_data[m->sg_end]; >> + } else { >> m = &md; >> + sg = m->sg_data; >> + sg_init_table(sg, MAX_SKB_FRAGS); > > sg_init_table() does an unnecessary memset() though. We > probably either want a new scatterlist API or just open > code this, > > #ifdef CONFIG_DEBUG_SG > { > unsigned int i; > for (i = 0; i < nents; i++) > sgl[i].sg_magic = SG_MAGIC; > } Similar sg_init_table() is present in bpf_tcp_sendmsg(). I agree that it causes unnecessary memset, but I don't agree with open coded fix. I am still with V1. Thanks. -Prashant
On 03/27/2018 10:41 AM, Prashant Bhole wrote: > On 3/27/2018 12:15 PM, John Fastabend wrote: >> On 03/25/2018 11:54 PM, Prashant Bhole wrote: >>> When CONFIG_DEBUG_SG is set, sg->sg_magic is initialized to SG_MAGIC, >>> when sg table is initialized using sg_init_table(). Magic is checked >>> while navigating the scatterlist. We hit BUG_ON when magic check is >>> failed. >>> >>> Fixed following things: >>> - Initialization of sg table in bpf_tcp_sendpage() was missing, >>> initialized it using sg_init_table() >>> >>> - bpf_tcp_sendmsg() initializes sg table using sg_init_table() before >>> entering the loop, but further consumed sg entries are initialized >>> using memset. Fixed it by replacing memset with sg_init_table() in >>> function bpf_tcp_push() >>> >>> Signed-off-by: Prashant Bhole <bhole_prashant_q7@lab.ntt.co.jp> >>> --- >>> kernel/bpf/sockmap.c | 11 +++++++---- >>> 1 file changed, 7 insertions(+), 4 deletions(-) >>> >>> diff --git a/kernel/bpf/sockmap.c b/kernel/bpf/sockmap.c >>> index 69c5bccabd22..8a848a99d768 100644 >>> --- a/kernel/bpf/sockmap.c >>> +++ b/kernel/bpf/sockmap.c >>> @@ -312,7 +312,7 @@ static int bpf_tcp_push(struct sock *sk, int apply_bytes, >>> md->sg_start++; >>> if (md->sg_start == MAX_SKB_FRAGS) >>> md->sg_start = 0; >>> - memset(sg, 0, sizeof(*sg)); >>> + sg_init_table(sg, 1); >> >> Looks OK here. >> >>> if (md->sg_start == md->sg_end) >>> break; >>> @@ -763,10 +763,14 @@ static int bpf_tcp_sendpage(struct sock *sk, struct page *page, >>> lock_sock(sk); >>> - if (psock->cork_bytes) >>> + if (psock->cork_bytes) { >>> m = psock->cork; >>> - else >>> + sg = &m->sg_data[m->sg_end]; >>> + } else { >>> m = &md; >>> + sg = m->sg_data; >>> + sg_init_table(sg, MAX_SKB_FRAGS); >> >> sg_init_table() does an unnecessary memset() though. We >> probably either want a new scatterlist API or just open >> code this, >> >> #ifdef CONFIG_DEBUG_SG >> { >> unsigned int i; >> for (i = 0; i < nents; i++) >> sgl[i].sg_magic = SG_MAGIC; >> } > > Similar sg_init_table() is present in bpf_tcp_sendmsg(). > I agree that it causes unnecessary memset, but I don't agree with open coded fix. But then lets fix is properly and add a static inline helper to the include/linux/scatterlist.h header like ... static inline void sg_init_debug_marker(struct scatterlist *sgl, unsigned int nents) { #ifdef CONFIG_DEBUG_SG unsigned int i; for (i = 0; i < nents; i++) sgl[i].sg_magic = SG_MAGIC; #endif } ... and reuse it in all the places that would otherwise open-code this, as well as sg_init_table(): void sg_init_table(struct scatterlist *sgl, unsigned int nents) { memset(sgl, 0, sizeof(*sgl) * nents); sg_init_debug_marker(sgl, nents); sg_mark_end(&sgl[nents - 1]); } This would be a lot cleaner than having this duplicated in various places. Thanks, Daniel
On 3/27/2018 6:05 PM, Daniel Borkmann wrote: > On 03/27/2018 10:41 AM, Prashant Bhole wrote: >> On 3/27/2018 12:15 PM, John Fastabend wrote: >>> On 03/25/2018 11:54 PM, Prashant Bhole wrote: >>>> When CONFIG_DEBUG_SG is set, sg->sg_magic is initialized to SG_MAGIC, >>>> when sg table is initialized using sg_init_table(). Magic is checked >>>> while navigating the scatterlist. We hit BUG_ON when magic check is >>>> failed. >>>> >>>> Fixed following things: >>>> - Initialization of sg table in bpf_tcp_sendpage() was missing, >>>> initialized it using sg_init_table() >>>> >>>> - bpf_tcp_sendmsg() initializes sg table using sg_init_table() before >>>> entering the loop, but further consumed sg entries are initialized >>>> using memset. Fixed it by replacing memset with sg_init_table() in >>>> function bpf_tcp_push() >>>> >>>> Signed-off-by: Prashant Bhole <bhole_prashant_q7@lab.ntt.co.jp> >>>> --- >>>> kernel/bpf/sockmap.c | 11 +++++++---- >>>> 1 file changed, 7 insertions(+), 4 deletions(-) >>>> >>>> diff --git a/kernel/bpf/sockmap.c b/kernel/bpf/sockmap.c >>>> index 69c5bccabd22..8a848a99d768 100644 >>>> --- a/kernel/bpf/sockmap.c >>>> +++ b/kernel/bpf/sockmap.c >>>> @@ -312,7 +312,7 @@ static int bpf_tcp_push(struct sock *sk, int apply_bytes, >>>> md->sg_start++; >>>> if (md->sg_start == MAX_SKB_FRAGS) >>>> md->sg_start = 0; >>>> - memset(sg, 0, sizeof(*sg)); >>>> + sg_init_table(sg, 1); >>> >>> Looks OK here. >>> >>>> if (md->sg_start == md->sg_end) >>>> break; >>>> @@ -763,10 +763,14 @@ static int bpf_tcp_sendpage(struct sock *sk, struct page *page, >>>> lock_sock(sk); >>>> - if (psock->cork_bytes) >>>> + if (psock->cork_bytes) { >>>> m = psock->cork; >>>> - else >>>> + sg = &m->sg_data[m->sg_end]; >>>> + } else { >>>> m = &md; >>>> + sg = m->sg_data; >>>> + sg_init_table(sg, MAX_SKB_FRAGS); >>> >>> sg_init_table() does an unnecessary memset() though. We >>> probably either want a new scatterlist API or just open >>> code this, >>> >>> #ifdef CONFIG_DEBUG_SG >>> { >>> unsigned int i; >>> for (i = 0; i < nents; i++) >>> sgl[i].sg_magic = SG_MAGIC; >>> } >> >> Similar sg_init_table() is present in bpf_tcp_sendmsg(). >> I agree that it causes unnecessary memset, but I don't agree with open coded fix. > > But then lets fix is properly and add a static inline helper to the > include/linux/scatterlist.h header like ... > > static inline void sg_init_debug_marker(struct scatterlist *sgl, > unsigned int nents) > { > #ifdef CONFIG_DEBUG_SG > unsigned int i; > > for (i = 0; i < nents; i++) > sgl[i].sg_magic = SG_MAGIC; > #endif > } > > ... and reuse it in all the places that would otherwise open-code this, > as well as sg_init_table(): > > void sg_init_table(struct scatterlist *sgl, unsigned int nents) > { > memset(sgl, 0, sizeof(*sgl) * nents); > sg_init_debug_marker(sgl, nents); > sg_mark_end(&sgl[nents - 1]); > } > > This would be a lot cleaner than having this duplicated in various places. Daniel, This is a good suggestion. Is it ok if I submit both changes in a patch series? How scatterlist related changes will be picked up by other subsystems? -Prashant
On 03/28/2018 08:18 AM, Prashant Bhole wrote: > On 3/27/2018 6:05 PM, Daniel Borkmann wrote: >> On 03/27/2018 10:41 AM, Prashant Bhole wrote: >>> On 3/27/2018 12:15 PM, John Fastabend wrote: >>>> On 03/25/2018 11:54 PM, Prashant Bhole wrote: >>>>> When CONFIG_DEBUG_SG is set, sg->sg_magic is initialized to SG_MAGIC, >>>>> when sg table is initialized using sg_init_table(). Magic is checked >>>>> while navigating the scatterlist. We hit BUG_ON when magic check is >>>>> failed. >>>>> >>>>> Fixed following things: >>>>> - Initialization of sg table in bpf_tcp_sendpage() was missing, >>>>> initialized it using sg_init_table() >>>>> >>>>> - bpf_tcp_sendmsg() initializes sg table using sg_init_table() before >>>>> entering the loop, but further consumed sg entries are initialized >>>>> using memset. Fixed it by replacing memset with sg_init_table() in >>>>> function bpf_tcp_push() >>>>> >>>>> Signed-off-by: Prashant Bhole <bhole_prashant_q7@lab.ntt.co.jp> >>>>> --- >>>>> kernel/bpf/sockmap.c | 11 +++++++---- >>>>> 1 file changed, 7 insertions(+), 4 deletions(-) >>>>> >>>>> diff --git a/kernel/bpf/sockmap.c b/kernel/bpf/sockmap.c >>>>> index 69c5bccabd22..8a848a99d768 100644 >>>>> --- a/kernel/bpf/sockmap.c >>>>> +++ b/kernel/bpf/sockmap.c >>>>> @@ -312,7 +312,7 @@ static int bpf_tcp_push(struct sock *sk, int apply_bytes, >>>>> md->sg_start++; >>>>> if (md->sg_start == MAX_SKB_FRAGS) >>>>> md->sg_start = 0; >>>>> - memset(sg, 0, sizeof(*sg)); >>>>> + sg_init_table(sg, 1); >>>> >>>> Looks OK here. >>>> >>>>> if (md->sg_start == md->sg_end) >>>>> break; >>>>> @@ -763,10 +763,14 @@ static int bpf_tcp_sendpage(struct sock *sk, struct page *page, >>>>> lock_sock(sk); >>>>> - if (psock->cork_bytes) >>>>> + if (psock->cork_bytes) { >>>>> m = psock->cork; >>>>> - else >>>>> + sg = &m->sg_data[m->sg_end]; >>>>> + } else { >>>>> m = &md; >>>>> + sg = m->sg_data; >>>>> + sg_init_table(sg, MAX_SKB_FRAGS); >>>> >>>> sg_init_table() does an unnecessary memset() though. We >>>> probably either want a new scatterlist API or just open >>>> code this, >>>> >>>> #ifdef CONFIG_DEBUG_SG >>>> { >>>> unsigned int i; >>>> for (i = 0; i < nents; i++) >>>> sgl[i].sg_magic = SG_MAGIC; >>>> } >>> >>> Similar sg_init_table() is present in bpf_tcp_sendmsg(). >>> I agree that it causes unnecessary memset, but I don't agree with open coded fix. >> >> But then lets fix is properly and add a static inline helper to the >> include/linux/scatterlist.h header like ... >> >> static inline void sg_init_debug_marker(struct scatterlist *sgl, >> unsigned int nents) >> { >> #ifdef CONFIG_DEBUG_SG >> unsigned int i; >> >> for (i = 0; i < nents; i++) >> sgl[i].sg_magic = SG_MAGIC; >> #endif >> } >> >> ... and reuse it in all the places that would otherwise open-code this, >> as well as sg_init_table(): >> >> void sg_init_table(struct scatterlist *sgl, unsigned int nents) >> { >> memset(sgl, 0, sizeof(*sgl) * nents); >> sg_init_debug_marker(sgl, nents); >> sg_mark_end(&sgl[nents - 1]); >> } >> >> This would be a lot cleaner than having this duplicated in various places. > > Daniel, This is a good suggestion. Is it ok if I submit both changes in > a patch series? Sure, that's fine. > How scatterlist related changes will be picked up by other subsystems? Once this gets applied into bpf-next, this will be pushed to net-next tree, and during the merge window net-next will be pulled into Linus' tree if this is what you are asking. Then also other subsystems outside of bpf/networking can make use of the sg_init_debug_marker() helper if suitable for their situation. > -Prashant >
On 3/28/2018 5:51 PM, Daniel Borkmann wrote: > On 03/28/2018 08:18 AM, Prashant Bhole wrote: >> On 3/27/2018 6:05 PM, Daniel Borkmann wrote: >>> On 03/27/2018 10:41 AM, Prashant Bhole wrote: >>>> On 3/27/2018 12:15 PM, John Fastabend wrote: >>>>> On 03/25/2018 11:54 PM, Prashant Bhole wrote: >>>>>> When CONFIG_DEBUG_SG is set, sg->sg_magic is initialized to SG_MAGIC, >>>>>> when sg table is initialized using sg_init_table(). Magic is checked >>>>>> while navigating the scatterlist. We hit BUG_ON when magic check is >>>>>> failed. >>>>>> >>>>>> Fixed following things: >>>>>> - Initialization of sg table in bpf_tcp_sendpage() was missing, >>>>>> initialized it using sg_init_table() >>>>>> >>>>>> - bpf_tcp_sendmsg() initializes sg table using sg_init_table() before >>>>>> entering the loop, but further consumed sg entries are initialized >>>>>> using memset. Fixed it by replacing memset with sg_init_table() in >>>>>> function bpf_tcp_push() >>>>>> >>>>>> Signed-off-by: Prashant Bhole <bhole_prashant_q7@lab.ntt.co.jp> >>>>>> --- >>>>>> kernel/bpf/sockmap.c | 11 +++++++---- >>>>>> 1 file changed, 7 insertions(+), 4 deletions(-) >>>>>> >>>>>> diff --git a/kernel/bpf/sockmap.c b/kernel/bpf/sockmap.c >>>>>> index 69c5bccabd22..8a848a99d768 100644 >>>>>> --- a/kernel/bpf/sockmap.c >>>>>> +++ b/kernel/bpf/sockmap.c >>>>>> @@ -312,7 +312,7 @@ static int bpf_tcp_push(struct sock *sk, int apply_bytes, >>>>>> md->sg_start++; >>>>>> if (md->sg_start == MAX_SKB_FRAGS) >>>>>> md->sg_start = 0; >>>>>> - memset(sg, 0, sizeof(*sg)); >>>>>> + sg_init_table(sg, 1); >>>>> >>>>> Looks OK here. >>>>> >>>>>> if (md->sg_start == md->sg_end) >>>>>> break; >>>>>> @@ -763,10 +763,14 @@ static int bpf_tcp_sendpage(struct sock *sk, struct page *page, >>>>>> lock_sock(sk); >>>>>> - if (psock->cork_bytes) >>>>>> + if (psock->cork_bytes) { >>>>>> m = psock->cork; >>>>>> - else >>>>>> + sg = &m->sg_data[m->sg_end]; >>>>>> + } else { >>>>>> m = &md; >>>>>> + sg = m->sg_data; >>>>>> + sg_init_table(sg, MAX_SKB_FRAGS); >>>>> >>>>> sg_init_table() does an unnecessary memset() though. We >>>>> probably either want a new scatterlist API or just open >>>>> code this, >>>>> >>>>> #ifdef CONFIG_DEBUG_SG >>>>> { >>>>> unsigned int i; >>>>> for (i = 0; i < nents; i++) >>>>> sgl[i].sg_magic = SG_MAGIC; >>>>> } >>>> >>>> Similar sg_init_table() is present in bpf_tcp_sendmsg(). >>>> I agree that it causes unnecessary memset, but I don't agree with open coded fix. >>> >>> But then lets fix is properly and add a static inline helper to the >>> include/linux/scatterlist.h header like ... >>> >>> static inline void sg_init_debug_marker(struct scatterlist *sgl, >>> unsigned int nents) >>> { >>> #ifdef CONFIG_DEBUG_SG >>> unsigned int i; >>> >>> for (i = 0; i < nents; i++) >>> sgl[i].sg_magic = SG_MAGIC; >>> #endif >>> } >>> >>> ... and reuse it in all the places that would otherwise open-code this, >>> as well as sg_init_table(): >>> >>> void sg_init_table(struct scatterlist *sgl, unsigned int nents) >>> { >>> memset(sgl, 0, sizeof(*sgl) * nents); >>> sg_init_debug_marker(sgl, nents); >>> sg_mark_end(&sgl[nents - 1]); >>> } >>> >>> This would be a lot cleaner than having this duplicated in various places. >> >> Daniel, This is a good suggestion. Is it ok if I submit both changes in >> a patch series? > > Sure, that's fine. > >> How scatterlist related changes will be picked up by other subsystems? > > Once this gets applied into bpf-next, this will be pushed to net-next tree, > and during the merge window net-next will be pulled into Linus' tree if this > is what you are asking. Then also other subsystems outside of bpf/networking > can make use of the sg_init_debug_marker() helper if suitable for their > situation. Thanks. I am submitting V2 soon. -Prashant
diff --git a/kernel/bpf/sockmap.c b/kernel/bpf/sockmap.c index 69c5bccabd22..8a848a99d768 100644 --- a/kernel/bpf/sockmap.c +++ b/kernel/bpf/sockmap.c @@ -312,7 +312,7 @@ static int bpf_tcp_push(struct sock *sk, int apply_bytes, md->sg_start++; if (md->sg_start == MAX_SKB_FRAGS) md->sg_start = 0; - memset(sg, 0, sizeof(*sg)); + sg_init_table(sg, 1); if (md->sg_start == md->sg_end) break; @@ -763,10 +763,14 @@ static int bpf_tcp_sendpage(struct sock *sk, struct page *page, lock_sock(sk); - if (psock->cork_bytes) + if (psock->cork_bytes) { m = psock->cork; - else + sg = &m->sg_data[m->sg_end]; + } else { m = &md; + sg = m->sg_data; + sg_init_table(sg, MAX_SKB_FRAGS); + } /* Catch case where ring is full and sendpage is stalled. */ if (unlikely(m->sg_end == m->sg_start && @@ -774,7 +778,6 @@ static int bpf_tcp_sendpage(struct sock *sk, struct page *page, goto out_err; psock->sg_size += size; - sg = &m->sg_data[m->sg_end]; sg_set_page(sg, page, size, offset); get_page(page); m->sg_copy[m->sg_end] = true;
When CONFIG_DEBUG_SG is set, sg->sg_magic is initialized to SG_MAGIC, when sg table is initialized using sg_init_table(). Magic is checked while navigating the scatterlist. We hit BUG_ON when magic check is failed. Fixed following things: - Initialization of sg table in bpf_tcp_sendpage() was missing, initialized it using sg_init_table() - bpf_tcp_sendmsg() initializes sg table using sg_init_table() before entering the loop, but further consumed sg entries are initialized using memset. Fixed it by replacing memset with sg_init_table() in function bpf_tcp_push() Signed-off-by: Prashant Bhole <bhole_prashant_q7@lab.ntt.co.jp> --- kernel/bpf/sockmap.c | 11 +++++++---- 1 file changed, 7 insertions(+), 4 deletions(-)