mbox series

[B,0/3] Fix oops in skb_segment for Bionic series

Message ID 20210212204159.28472-1-gpiccoli@canonical.com
Headers show
Series Fix oops in skb_segment for Bionic series | expand

Message

Guilherme G. Piccoli Feb. 12, 2021, 8:41 p.m. UTC
BugLink: https://bugs.launchpad.net/bugs/1915552


[Impact]
* It was reported upstream [0] that an eBPF NAT64 filter caused an oops due to
bad handling of GRO headers length on SKB segmentation path; the discussion is
rich in details, and eventually the reporter sent a fix patch for that [1], as
well as a test scenario in test_bpf kernel module that reproduces the issue.

[0] https://lore.kernel.org/netdev/9265b93f-253d-6b8c-f2b8-4b54eff1835c@fb.com/
[1] https://lore.kernel.org/netdev/20180321233104.2142764-1-yhs@fb.com/

* The fix patch landed on v4.17 and for some reason didn't reach the stable
kernels; by testing our Bionic v4.15 kernel I was able to reproduce the issue,
observing the following stack trace (details in the testing section below):

kernel BUG at net/core/skbuff.c:3703!
Modules linked in: test_bpf(E+) isofs nls_iso8859_1 dm_multipath ...
RIP: 0010:skb_segment+0xa34/0xce0
[...]
Call Trace:
 test_bpf_init.part.7+0x767/0x7d1 [test_bpf]
 test_bpf_init+0xfc/0x82f [test_bpf]
 do_one_initcall+0x52/0x19f
[...]

* Interesting to mention that this fix is not complete in the sense there was
another corner case reported after that [2], which was fixed by another
patch [3], this one released in kernel v5.3 and present in the stable tree
(hence backported to our Bionic 4.15 kernels).

[2] https://lore.kernel.org/netdev/20190826170724.25ff616f@pixies/
[3] http://git.kernel.org/linus/3dcbdb134f32 ("net: gso: Fix skb_segment splat when splitting gso_size mangled skb having linear-headed frag_list")

* So we are hereby backporting both the original fix patch [4] as well
as the test_bpf patch (and a fix for it) [5][6] for Ubuntu Bionic v4.15-based
kernels.

[4] http://git.kernel.org/linus/13acc94eff12 ("net: permit skb_segment on head_frag frag_list skb")
[5] http://git.kernel.org/linus/76db8087c4c9 ("net: bpf: add a test for skb_segment in test_bpf module")
[6] http://git.kernel.org/linus/99fe29d3a25f ("test_bpf: Fix NULL vs IS_ERR() check in test_skb_segment()")

[Test Case]
* One could use a NAT64 filter, but with the aforementioned patches [5][6]
hereby backported, one can also use the kernel infrastructure, by loading the
test_bpf module:

insmod /lib/modules/$(uname -r)/kernel/lib/test_bpf.ko

If patches [5] [6] are included and kernel doesn't contain the fix [4], an oops
will be observed.

[Where problems could occur]

* The backported patches are present upstream since v4.17, and no fixes were
released for them (other than [6], included here), so from the testing
point-of-view, these patches are being exercised for a while with no issues.

* That said, if a problem would be triggered by these patches, hypothetically
it would affect SKB segmentation, the net/core code - a bad check could case an
oops in this code or they could present a pretty small overhead due to more
checks in the hot path.


Dan Carpenter (1):
  test_bpf: Fix NULL vs IS_ERR() check in test_skb_segment()

Yonghong Song (2):
  net: permit skb_segment on head_frag frag_list skb
  net: bpf: add a test for skb_segment in test_bpf module

 lib/test_bpf.c    | 93 ++++++++++++++++++++++++++++++++++++++++++++++-
 net/core/skbuff.c | 26 +++++++++++--
 2 files changed, 113 insertions(+), 6 deletions(-)

Comments

Thadeu Lima de Souza Cascardo Feb. 12, 2021, 9:02 p.m. UTC | #1
On Fri, Feb 12, 2021 at 05:41:56PM -0300, Guilherme G. Piccoli wrote:
> BugLink: https://bugs.launchpad.net/bugs/1915552
> 

Nice writeup. Good testing for the bug.

Acked-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>

> 
> [Impact]
> * It was reported upstream [0] that an eBPF NAT64 filter caused an oops due to
> bad handling of GRO headers length on SKB segmentation path; the discussion is
> rich in details, and eventually the reporter sent a fix patch for that [1], as
> well as a test scenario in test_bpf kernel module that reproduces the issue.
> 
> [0] https://lore.kernel.org/netdev/9265b93f-253d-6b8c-f2b8-4b54eff1835c@fb.com/
> [1] https://lore.kernel.org/netdev/20180321233104.2142764-1-yhs@fb.com/
> 
> * The fix patch landed on v4.17 and for some reason didn't reach the stable
> kernels; by testing our Bionic v4.15 kernel I was able to reproduce the issue,
> observing the following stack trace (details in the testing section below):
> 
> kernel BUG at net/core/skbuff.c:3703!
> Modules linked in: test_bpf(E+) isofs nls_iso8859_1 dm_multipath ...
> RIP: 0010:skb_segment+0xa34/0xce0
> [...]
> Call Trace:
>  test_bpf_init.part.7+0x767/0x7d1 [test_bpf]
>  test_bpf_init+0xfc/0x82f [test_bpf]
>  do_one_initcall+0x52/0x19f
> [...]
> 
> * Interesting to mention that this fix is not complete in the sense there was
> another corner case reported after that [2], which was fixed by another
> patch [3], this one released in kernel v5.3 and present in the stable tree
> (hence backported to our Bionic 4.15 kernels).
> 
> [2] https://lore.kernel.org/netdev/20190826170724.25ff616f@pixies/
> [3] http://git.kernel.org/linus/3dcbdb134f32 ("net: gso: Fix skb_segment splat when splitting gso_size mangled skb having linear-headed frag_list")
> 
> * So we are hereby backporting both the original fix patch [4] as well
> as the test_bpf patch (and a fix for it) [5][6] for Ubuntu Bionic v4.15-based
> kernels.
> 
> [4] http://git.kernel.org/linus/13acc94eff12 ("net: permit skb_segment on head_frag frag_list skb")
> [5] http://git.kernel.org/linus/76db8087c4c9 ("net: bpf: add a test for skb_segment in test_bpf module")
> [6] http://git.kernel.org/linus/99fe29d3a25f ("test_bpf: Fix NULL vs IS_ERR() check in test_skb_segment()")
> 
> [Test Case]
> * One could use a NAT64 filter, but with the aforementioned patches [5][6]
> hereby backported, one can also use the kernel infrastructure, by loading the
> test_bpf module:
> 
> insmod /lib/modules/$(uname -r)/kernel/lib/test_bpf.ko
> 
> If patches [5] [6] are included and kernel doesn't contain the fix [4], an oops
> will be observed.
> 
> [Where problems could occur]
> 
> * The backported patches are present upstream since v4.17, and no fixes were
> released for them (other than [6], included here), so from the testing
> point-of-view, these patches are being exercised for a while with no issues.
> 
> * That said, if a problem would be triggered by these patches, hypothetically
> it would affect SKB segmentation, the net/core code - a bad check could case an
> oops in this code or they could present a pretty small overhead due to more
> checks in the hot path.
> 
> 
> Dan Carpenter (1):
>   test_bpf: Fix NULL vs IS_ERR() check in test_skb_segment()
> 
> Yonghong Song (2):
>   net: permit skb_segment on head_frag frag_list skb
>   net: bpf: add a test for skb_segment in test_bpf module
> 
>  lib/test_bpf.c    | 93 ++++++++++++++++++++++++++++++++++++++++++++++-
>  net/core/skbuff.c | 26 +++++++++++--
>  2 files changed, 113 insertions(+), 6 deletions(-)
> 
> -- 
> 2.29.0
> 
> 
> -- 
> kernel-team mailing list
> kernel-team@lists.ubuntu.com
> https://lists.ubuntu.com/mailman/listinfo/kernel-team
Stefan Bader Feb. 15, 2021, 7:43 a.m. UTC | #2
On 12.02.21 21:41, Guilherme G. Piccoli wrote:
> BugLink: https://bugs.launchpad.net/bugs/1915552
> 
> 
> [Impact]
> * It was reported upstream [0] that an eBPF NAT64 filter caused an oops due to
> bad handling of GRO headers length on SKB segmentation path; the discussion is
> rich in details, and eventually the reporter sent a fix patch for that [1], as
> well as a test scenario in test_bpf kernel module that reproduces the issue.
> 
> [0] https://lore.kernel.org/netdev/9265b93f-253d-6b8c-f2b8-4b54eff1835c@fb.com/
> [1] https://lore.kernel.org/netdev/20180321233104.2142764-1-yhs@fb.com/
> 
> * The fix patch landed on v4.17 and for some reason didn't reach the stable
> kernels; by testing our Bionic v4.15 kernel I was able to reproduce the issue,
> observing the following stack trace (details in the testing section below):
> 
> kernel BUG at net/core/skbuff.c:3703!
> Modules linked in: test_bpf(E+) isofs nls_iso8859_1 dm_multipath ...
> RIP: 0010:skb_segment+0xa34/0xce0
> [...]
> Call Trace:
>  test_bpf_init.part.7+0x767/0x7d1 [test_bpf]
>  test_bpf_init+0xfc/0x82f [test_bpf]
>  do_one_initcall+0x52/0x19f
> [...]
> 
> * Interesting to mention that this fix is not complete in the sense there was
> another corner case reported after that [2], which was fixed by another
> patch [3], this one released in kernel v5.3 and present in the stable tree
> (hence backported to our Bionic 4.15 kernels).
> 
> [2] https://lore.kernel.org/netdev/20190826170724.25ff616f@pixies/
> [3] http://git.kernel.org/linus/3dcbdb134f32 ("net: gso: Fix skb_segment splat when splitting gso_size mangled skb having linear-headed frag_list")
> 
> * So we are hereby backporting both the original fix patch [4] as well
> as the test_bpf patch (and a fix for it) [5][6] for Ubuntu Bionic v4.15-based
> kernels.
> 
> [4] http://git.kernel.org/linus/13acc94eff12 ("net: permit skb_segment on head_frag frag_list skb")
> [5] http://git.kernel.org/linus/76db8087c4c9 ("net: bpf: add a test for skb_segment in test_bpf module")
> [6] http://git.kernel.org/linus/99fe29d3a25f ("test_bpf: Fix NULL vs IS_ERR() check in test_skb_segment()")
> 
> [Test Case]
> * One could use a NAT64 filter, but with the aforementioned patches [5][6]
> hereby backported, one can also use the kernel infrastructure, by loading the
> test_bpf module:
> 
> insmod /lib/modules/$(uname -r)/kernel/lib/test_bpf.ko
> 
> If patches [5] [6] are included and kernel doesn't contain the fix [4], an oops
> will be observed.
> 
> [Where problems could occur]
> 
> * The backported patches are present upstream since v4.17, and no fixes were
> released for them (other than [6], included here), so from the testing
> point-of-view, these patches are being exercised for a while with no issues.
> 
> * That said, if a problem would be triggered by these patches, hypothetically
> it would affect SKB segmentation, the net/core code - a bad check could case an
> oops in this code or they could present a pretty small overhead due to more
> checks in the hot path.
> 
> 
> Dan Carpenter (1):
>   test_bpf: Fix NULL vs IS_ERR() check in test_skb_segment()
> 
> Yonghong Song (2):
>   net: permit skb_segment on head_frag frag_list skb
>   net: bpf: add a test for skb_segment in test_bpf module
> 
>  lib/test_bpf.c    | 93 ++++++++++++++++++++++++++++++++++++++++++++++-
>  net/core/skbuff.c | 26 +++++++++++--
>  2 files changed, 113 insertions(+), 6 deletions(-)
> 
For users only the first patch is relevant and as far as one can tell with net
code there is still a case in the flow that is as before with an additional case
added. And the next 2 patches add a way to test this. So

Acked-by: Stefan Bader <stefan.bader@canonical.com>
Stefan Bader Feb. 15, 2021, 7:49 a.m. UTC | #3
On 12.02.21 21:41, Guilherme G. Piccoli wrote:
> BugLink: https://bugs.launchpad.net/bugs/1915552
> 
> 
> [Impact]
> * It was reported upstream [0] that an eBPF NAT64 filter caused an oops due to
> bad handling of GRO headers length on SKB segmentation path; the discussion is
> rich in details, and eventually the reporter sent a fix patch for that [1], as
> well as a test scenario in test_bpf kernel module that reproduces the issue.
> 
> [0] https://lore.kernel.org/netdev/9265b93f-253d-6b8c-f2b8-4b54eff1835c@fb.com/
> [1] https://lore.kernel.org/netdev/20180321233104.2142764-1-yhs@fb.com/
> 
> * The fix patch landed on v4.17 and for some reason didn't reach the stable
> kernels; by testing our Bionic v4.15 kernel I was able to reproduce the issue,
> observing the following stack trace (details in the testing section below):
> 
> kernel BUG at net/core/skbuff.c:3703!
> Modules linked in: test_bpf(E+) isofs nls_iso8859_1 dm_multipath ...
> RIP: 0010:skb_segment+0xa34/0xce0
> [...]
> Call Trace:
>  test_bpf_init.part.7+0x767/0x7d1 [test_bpf]
>  test_bpf_init+0xfc/0x82f [test_bpf]
>  do_one_initcall+0x52/0x19f
> [...]
> 
> * Interesting to mention that this fix is not complete in the sense there was
> another corner case reported after that [2], which was fixed by another
> patch [3], this one released in kernel v5.3 and present in the stable tree
> (hence backported to our Bionic 4.15 kernels).
> 
> [2] https://lore.kernel.org/netdev/20190826170724.25ff616f@pixies/
> [3] http://git.kernel.org/linus/3dcbdb134f32 ("net: gso: Fix skb_segment splat when splitting gso_size mangled skb having linear-headed frag_list")
> 
> * So we are hereby backporting both the original fix patch [4] as well
> as the test_bpf patch (and a fix for it) [5][6] for Ubuntu Bionic v4.15-based
> kernels.
> 
> [4] http://git.kernel.org/linus/13acc94eff12 ("net: permit skb_segment on head_frag frag_list skb")
> [5] http://git.kernel.org/linus/76db8087c4c9 ("net: bpf: add a test for skb_segment in test_bpf module")
> [6] http://git.kernel.org/linus/99fe29d3a25f ("test_bpf: Fix NULL vs IS_ERR() check in test_skb_segment()")
> 
> [Test Case]
> * One could use a NAT64 filter, but with the aforementioned patches [5][6]
> hereby backported, one can also use the kernel infrastructure, by loading the
> test_bpf module:
> 
> insmod /lib/modules/$(uname -r)/kernel/lib/test_bpf.ko
> 
> If patches [5] [6] are included and kernel doesn't contain the fix [4], an oops
> will be observed.
> 
> [Where problems could occur]
> 
> * The backported patches are present upstream since v4.17, and no fixes were
> released for them (other than [6], included here), so from the testing
> point-of-view, these patches are being exercised for a while with no issues.
> 
> * That said, if a problem would be triggered by these patches, hypothetically
> it would affect SKB segmentation, the net/core code - a bad check could case an
> oops in this code or they could present a pretty small overhead due to more
> checks in the hot path.
> 
> 
> Dan Carpenter (1):
>   test_bpf: Fix NULL vs IS_ERR() check in test_skb_segment()
> 
> Yonghong Song (2):
>   net: permit skb_segment on head_frag frag_list skb
>   net: bpf: add a test for skb_segment in test_bpf module
> 
>  lib/test_bpf.c    | 93 ++++++++++++++++++++++++++++++++++++++++++++++-
>  net/core/skbuff.c | 26 +++++++++++--
>  2 files changed, 113 insertions(+), 6 deletions(-)
> 
One thing I realized after sending the ack. Somehow the bug report and
submission were not matching on the source. I assumed from the descriptions that
the Bionic primary kernel was correct (and not linux-azure). If that is wrong,
you have to shout rather sooner than later.

-Stefan
Guilherme G. Piccoli Feb. 15, 2021, 12:07 p.m. UTC | #4
Thanks for noticing that Stefan! I guess I've used an opened
linux-azure LP to produce the new LP and got this confused hehe
Indeed, the correct is Linux/generic, in order it get delivered
automatically to all 4.15-based flavors. Seems it's fixed now, right?
Stefan Bader Feb. 15, 2021, 12:40 p.m. UTC | #5
On 15.02.21 13:07, Guilherme Piccoli wrote:
> Thanks for noticing that Stefan! I guess I've used an opened
> linux-azure LP to produce the new LP and got this confused hehe
> Indeed, the correct is Linux/generic, in order it get delivered
> automatically to all 4.15-based flavors. Seems it's fixed now, right?
> 

Yes, all good. Applied to bionic:linux/master-next. Thanks.

-Stefan