Message ID | 1400007214-3236-1-git-send-email-ast@plumgrid.com |
---|---|
State | Changes Requested, archived |
Delegated to: | David Miller |
Headers | show |
On Tue, 2014-05-13 at 11:53 -0700, Alexei Starovoitov wrote: > bpf_alloc_binary() adds 128 bytes of room to JITed program image > and rounds it up to the nearest page size. If image size is close > to page size (like 4000), it is rounded to two pages: > round_up(4000 + 4 + 128) == 8192 > then 'hole' is computed as 8192 - (4000 + 4) = 4188 > If prandom_u32() % hole selects a number >= 4096, then kernel will crash > during bpf_jit_free(): > > kernel BUG at arch/x86/mm/pageattr.c:887! > Call Trace: > [<ffffffff81037285>] change_page_attr_set_clr+0x135/0x460 > [<ffffffff81694cc0>] ? _raw_spin_unlock_irq+0x30/0x50 > [<ffffffff810378ff>] set_memory_rw+0x2f/0x40 > [<ffffffffa01a0d8d>] bpf_jit_free_deferred+0x2d/0x60 > [<ffffffff8106bf98>] process_one_work+0x1d8/0x6a0 > [<ffffffff8106bf38>] ? process_one_work+0x178/0x6a0 > [<ffffffff8106c90c>] worker_thread+0x11c/0x370 > > since bpf_jit_free() does: > unsigned long addr = (unsigned long)fp->bpf_func & PAGE_MASK; > struct bpf_binary_header *header = (void *)addr; > to compute start address of 'bpf_binary_header' > and header->pages will pass junk to: > set_memory_rw(addr, header->pages); > > Fix it by picking image offset always out of the first page. You mean, offset should be in first page ? > > While at it make the offset to be within first half of the page, > so there is some room for CPU to run before it hits page miss. > > Fixes: 314beb9bcabfd ("x86: bpf_jit_comp: secure bpf jit against spraying attacks") > Signed-off-by: Alexei Starovoitov <ast@plumgrid.com> > --- > > s390 commit aa2d2c73c21f ("s390/bpf,jit: address randomize and write protect jit code") > seems to have the same problem > > arch/x86/net/bpf_jit_comp.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c > index dc01773..c6ab7a0 100644 > --- a/arch/x86/net/bpf_jit_comp.c > +++ b/arch/x86/net/bpf_jit_comp.c > @@ -171,7 +171,7 @@ static struct bpf_binary_header *bpf_alloc_binary(unsigned int proglen, > memset(header, 0xcc, sz); /* fill whole space with int3 instructions */ > > header->pages = sz / PAGE_SIZE; > - hole = sz - (proglen + sizeof(*header)); > + hole = min(sz - (proglen + sizeof(*header)), PAGE_SIZE / 2); > > /* insert a random number of int3 instructions before BPF code */ > *image_ptr = &header->image[prandom_u32() % hole]; Good catch, but I am not sure about the PAGE_SIZE / 2 The argument of not having code ending on (or being very close of) page boundary seems orthogonal to this bug fix. Thanks -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tue, May 13, 2014 at 1:23 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote: > On Tue, 2014-05-13 at 11:53 -0700, Alexei Starovoitov wrote: >> bpf_alloc_binary() adds 128 bytes of room to JITed program image >> and rounds it up to the nearest page size. If image size is close >> to page size (like 4000), it is rounded to two pages: >> round_up(4000 + 4 + 128) == 8192 >> then 'hole' is computed as 8192 - (4000 + 4) = 4188 >> If prandom_u32() % hole selects a number >= 4096, then kernel will crash >> during bpf_jit_free(): >> >> kernel BUG at arch/x86/mm/pageattr.c:887! >> Call Trace: >> [<ffffffff81037285>] change_page_attr_set_clr+0x135/0x460 >> [<ffffffff81694cc0>] ? _raw_spin_unlock_irq+0x30/0x50 >> [<ffffffff810378ff>] set_memory_rw+0x2f/0x40 >> [<ffffffffa01a0d8d>] bpf_jit_free_deferred+0x2d/0x60 >> [<ffffffff8106bf98>] process_one_work+0x1d8/0x6a0 >> [<ffffffff8106bf38>] ? process_one_work+0x178/0x6a0 >> [<ffffffff8106c90c>] worker_thread+0x11c/0x370 >> >> since bpf_jit_free() does: >> unsigned long addr = (unsigned long)fp->bpf_func & PAGE_MASK; >> struct bpf_binary_header *header = (void *)addr; >> to compute start address of 'bpf_binary_header' >> and header->pages will pass junk to: >> set_memory_rw(addr, header->pages); >> >> Fix it by picking image offset always out of the first page. > > You mean, offset should be in first page ? yes. &header->image[prandom_u32() % hole] should in the same page as &header >> >> While at it make the offset to be within first half of the page, >> so there is some room for CPU to run before it hits page miss. >> >> Fixes: 314beb9bcabfd ("x86: bpf_jit_comp: secure bpf jit against spraying attacks") >> Signed-off-by: Alexei Starovoitov <ast@plumgrid.com> >> --- >> >> s390 commit aa2d2c73c21f ("s390/bpf,jit: address randomize and write protect jit code") >> seems to have the same problem >> >> arch/x86/net/bpf_jit_comp.c | 2 +- >> 1 file changed, 1 insertion(+), 1 deletion(-) >> >> diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c >> index dc01773..c6ab7a0 100644 >> --- a/arch/x86/net/bpf_jit_comp.c >> +++ b/arch/x86/net/bpf_jit_comp.c >> @@ -171,7 +171,7 @@ static struct bpf_binary_header *bpf_alloc_binary(unsigned int proglen, >> memset(header, 0xcc, sz); /* fill whole space with int3 instructions */ >> >> header->pages = sz / PAGE_SIZE; >> - hole = sz - (proglen + sizeof(*header)); >> + hole = min(sz - (proglen + sizeof(*header)), PAGE_SIZE / 2); >> >> /* insert a random number of int3 instructions before BPF code */ >> *image_ptr = &header->image[prandom_u32() % hole]; > > Good catch, but I am not sure about the PAGE_SIZE / 2 > > The argument of not having code ending on (or being very close of) page > boundary seems orthogonal to this bug fix. Gotta pick some number... page/2 seems good enough to have large range for prandom() to choose and better performance. Another alternative is to do min(…, PAGE_SIZE - sizeof(*header)), but that is harder to understand. Also just realized that I miscalculated the breaking point: "If prandom_u32() % hole selects a number >= 4096, then kernel will crash" it should read: "… >= 4092 ..." since sizeof(*header) needs to be accounted for. > Thanks > > -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 05/13/2014 01:34 PM, Alexei Starovoitov wrote: >> >> The argument of not having code ending on (or being very close of) page >> boundary seems orthogonal to this bug fix. > > Gotta pick some number... page/2 seems good enough to have > large range for prandom() to choose and better performance. > Another alternative is to do min(…, PAGE_SIZE - sizeof(*header)), > but that is harder to understand. > The latter is correct by construction, and thus doesn't end up with the question "what is going on here" or has hidden failure conditions. > Also just realized that I miscalculated the breaking point: > "If prandom_u32() % hole selects a number >= 4096, then kernel will crash" > it should read: "… >= 4092 ..." > since sizeof(*header) needs to be accounted for. No, it should read PAGE_SIZE - sizeof(*header) if anything. -hpa -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
From: "H. Peter Anvin" <hpa@zytor.com> Date: Tue, 13 May 2014 14:28:55 -0700 > On 05/13/2014 01:34 PM, Alexei Starovoitov wrote: >>> >>> The argument of not having code ending on (or being very close of) page >>> boundary seems orthogonal to this bug fix. >> >> Gotta pick some number... page/2 seems good enough to have >> large range for prandom() to choose and better performance. >> Another alternative is to do min(…, PAGE_SIZE - sizeof(*header)), >> but that is harder to understand. >> > > The latter is correct by construction, and thus doesn't end up with the > question "what is going on here" or has hidden failure conditions. Agreed. >> Also just realized that I miscalculated the breaking point: >> "If prandom_u32() % hole selects a number >= 4096, then kernel will crash" >> it should read: "… >= 4092 ..." >> since sizeof(*header) needs to be accounted for. > > No, it should read PAGE_SIZE - sizeof(*header) if anything. Also agreed. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tue, May 13, 2014 at 11:53:34AM -0700, Alexei Starovoitov wrote: > bpf_alloc_binary() adds 128 bytes of room to JITed program image > and rounds it up to the nearest page size. If image size is close > to page size (like 4000), it is rounded to two pages: > round_up(4000 + 4 + 128) == 8192 > then 'hole' is computed as 8192 - (4000 + 4) = 4188 > If prandom_u32() % hole selects a number >= 4096, then kernel will crash > during bpf_jit_free(): [...] > Fixes: 314beb9bcabfd ("x86: bpf_jit_comp: secure bpf jit against spraying attacks") > Signed-off-by: Alexei Starovoitov <ast@plumgrid.com> > --- > > s390 commit aa2d2c73c21f ("s390/bpf,jit: address randomize and write protect jit code") > seems to have the same problem Yes, that's the same bug on s390. Would you mind fixing s390 as well, since I assume you're going to send a new patch for x86? Would be good to keep the code quite identical so these issues can be easily seen across architectures. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c index dc01773..c6ab7a0 100644 --- a/arch/x86/net/bpf_jit_comp.c +++ b/arch/x86/net/bpf_jit_comp.c @@ -171,7 +171,7 @@ static struct bpf_binary_header *bpf_alloc_binary(unsigned int proglen, memset(header, 0xcc, sz); /* fill whole space with int3 instructions */ header->pages = sz / PAGE_SIZE; - hole = sz - (proglen + sizeof(*header)); + hole = min(sz - (proglen + sizeof(*header)), PAGE_SIZE / 2); /* insert a random number of int3 instructions before BPF code */ *image_ptr = &header->image[prandom_u32() % hole];
bpf_alloc_binary() adds 128 bytes of room to JITed program image and rounds it up to the nearest page size. If image size is close to page size (like 4000), it is rounded to two pages: round_up(4000 + 4 + 128) == 8192 then 'hole' is computed as 8192 - (4000 + 4) = 4188 If prandom_u32() % hole selects a number >= 4096, then kernel will crash during bpf_jit_free(): kernel BUG at arch/x86/mm/pageattr.c:887! Call Trace: [<ffffffff81037285>] change_page_attr_set_clr+0x135/0x460 [<ffffffff81694cc0>] ? _raw_spin_unlock_irq+0x30/0x50 [<ffffffff810378ff>] set_memory_rw+0x2f/0x40 [<ffffffffa01a0d8d>] bpf_jit_free_deferred+0x2d/0x60 [<ffffffff8106bf98>] process_one_work+0x1d8/0x6a0 [<ffffffff8106bf38>] ? process_one_work+0x178/0x6a0 [<ffffffff8106c90c>] worker_thread+0x11c/0x370 since bpf_jit_free() does: unsigned long addr = (unsigned long)fp->bpf_func & PAGE_MASK; struct bpf_binary_header *header = (void *)addr; to compute start address of 'bpf_binary_header' and header->pages will pass junk to: set_memory_rw(addr, header->pages); Fix it by picking image offset always out of the first page. While at it make the offset to be within first half of the page, so there is some room for CPU to run before it hits page miss. Fixes: 314beb9bcabfd ("x86: bpf_jit_comp: secure bpf jit against spraying attacks") Signed-off-by: Alexei Starovoitov <ast@plumgrid.com> --- s390 commit aa2d2c73c21f ("s390/bpf,jit: address randomize and write protect jit code") seems to have the same problem arch/x86/net/bpf_jit_comp.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)