Message ID | 1367585820-2674-1-git-send-email-dborkman@redhat.com |
---|---|
State | Accepted, archived |
Delegated to: | David Miller |
Headers | show |
From: Daniel Borkmann <dborkman@redhat.com> Date: Fri, 3 May 2013 14:57:00 +0200 > Jakub reported that it is fairly easy to trigger the BUG() macro > from user space with TPACKET_V3's RX_RING by just giving a wrong > header status flag. We already had a similar situation in commit > 7f5c3e3a80e6654 (``af_packet: remove BUG statement in > tpacket_destruct_skb'') where this was the case in the TX_RING > side that could be triggered from user space. So really, don't use > BUG() or BUG_ON() unless there's really no way out, and i.e. > don't use it for consistency checking when there's user space > involved, no excuses, especially not if you're slapping the user > with WARN + dump_stack + BUG all at once. The two functions are > of concern: > > prb_retire_current_block() [when block status != TP_STATUS_KERNEL] > prb_open_block() [when block_status != TP_STATUS_KERNEL] > > Calls to prb_open_block() are guarded by ealier checks if block_status > is really TP_STATUS_KERNEL (racy!), but the first one BUG() is easily > triggable from user space. System behaves still stable after they are > removed. Also remove that yoda condition entirely, since it's already > guarded. > > Reported-by: Jakub Zawadzki <darkjames-ws@darkjames.pl> > Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Applied and queued up for -stable, thanks Daniel. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Fri, May 3, 2013 at 8:57 AM, Daniel Borkmann <dborkman@redhat.com> wrote: > > Jakub reported that it is fairly easy to trigger the BUG() macro > from user space with TPACKET_V3's RX_RING by just giving a wrong > header status flag. We already had a similar situation in commit > 7f5c3e3a80e6654 (``af_packet: remove BUG statement in > tpacket_destruct_skb'') where this was the case in the TX_RING > side that could be triggered from user space. So really, don't use > BUG() or BUG_ON() unless there's really no way out, and i.e. > don't use it for consistency checking when there's user space > involved, no excuses, especially not if you're slapping the user > with WARN + dump_stack + BUG all at once. The two functions are remember, you were able to figure out what the problem was because of the WARN and the forceful BUG. Is BUG ok? Depends on how you look at it. If you are doing serious network monitoring on a real life appliance we don't really want folks to write crappy application. May be BUG is too harsh. But then I don't see a WARN in this patch either (may be I overlooked it). > of concern: > > prb_retire_current_block() [when block status != TP_STATUS_KERNEL] > prb_open_block() [when block_status != TP_STATUS_KERN > > Calls to prb_open_block() are guarded by ealier checks if block_status > is really TP_STATUS_KERNEL (racy!), but the first one BUG() is easily > triggable from user space. System behaves still stable after they are > removed. Also remove that yoda condition entirely, since it's already > guarded. What yoda? It's called defensive programming. It's a single check and not going to cost you anything. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 05/07/2013 06:59 PM, chetan loke wrote: > On Fri, May 3, 2013 at 8:57 AM, Daniel Borkmann <dborkman@redhat.com> wrote: >> >> Jakub reported that it is fairly easy to trigger the BUG() macro >> from user space with TPACKET_V3's RX_RING by just giving a wrong >> header status flag. We already had a similar situation in commit >> 7f5c3e3a80e6654 (``af_packet: remove BUG statement in >> tpacket_destruct_skb'') where this was the case in the TX_RING >> side that could be triggered from user space. So really, don't use >> BUG() or BUG_ON() unless there's really no way out, and i.e. >> don't use it for consistency checking when there's user space >> involved, no excuses, especially not if you're slapping the user >> with WARN + dump_stack + BUG all at once. The two functions are > > remember, you were able to figure out what the problem was because of > the WARN and the forceful BUG. > Is BUG ok? Depends on how you look at it. If you are doing serious > network monitoring on a real life appliance > we don't really want folks to write crappy application. May be BUG is Right, and now we treat people who write crappy applications with BUG? BUG is only for consistency checks *within* the kernel and not meant to be triggered on purpose from user space. > too harsh. But then I don't see a WARN in this patch either (may be I > overlooked it). So what's the point? If people see a WARN() in their kernel log they go and complain on netdev that AF_PACKET is broken although its their application? There was already a discussion on netdev with a different BUG macro in the TX_RING, where we concluded that it shouldn't be a warn either. >> of concern: >> >> prb_retire_current_block() [when block status != TP_STATUS_KERNEL] >> prb_open_block() [when block_status != TP_STATUS_KERN > >> >> Calls to prb_open_block() are guarded by ealier checks if block_status >> is really TP_STATUS_KERNEL (racy!), but the first one BUG() is easily >> triggable from user space. System behaves still stable after they are >> removed. Also remove that yoda condition entirely, since it's already >> guarded. > > What yoda? It's called defensive programming. It's a single check and > not going to cost you anything. Yoda condition is called if you compare ``CONSTANT == var'', actually the other way round as it should be in CodingStyle. But that's not the big issue here, as Jakub pointed out it's racy because you do this comparison two times and status could change in between, thus you trigger the next BUG there, like (pseudo code): if (frame_status & USERPSACE) { ... } else { if (frame_status & KERNEL) { ... } else { BUG } } -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tue, 2013-05-07 at 22:49 +0200, Daniel Borkmann wrote: > Right, and now we treat people who write crappy applications with BUG? > > BUG is only for consistency checks *within* the kernel and not meant to > be triggered on purpose from user space. +1 -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
From: Eric Dumazet <eric.dumazet@gmail.com> Date: Tue, 07 May 2013 14:56:27 -0700 > On Tue, 2013-05-07 at 22:49 +0200, Daniel Borkmann wrote: > >> Right, and now we treat people who write crappy applications with BUG? >> >> BUG is only for consistency checks *within* the kernel and not meant to >> be triggered on purpose from user space. > > +1 +1 -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tue, May 7, 2013 at 4:49 PM, Daniel Borkmann <dborkman@redhat.com> wrote: >> remember, you were able to figure out what the problem was because of >> the WARN and the forceful BUG. >> Is BUG ok? Depends on how you look at it. If you are doing serious >> network monitoring on a real life appliance >> we don't really want folks to write crappy application. May be BUG is > > > Right, and now we treat people who write crappy applications with BUG? > > BUG is only for consistency checks *within* the kernel and not meant to > be triggered on purpose from user space. > I thought I made myself clear when I said, may be BUG is too harsh :). > >> too harsh. But then I don't see a WARN in this patch either (may be I >> overlooked it). > > > So what's the point? If people see a WARN() in their kernel log they go > and complain on netdev that AF_PACKET is broken although its their > application? > It's not complaining. Its called bug-reporting. Exactly, by removing the WARN you deprived the user from fixing him/her buggy application. Infact, that could have been a kernel bug. Who knows. Pls re-read the code again. The code will also guard itself when you start modifying it and forget to check the block-status before calling it. There are checks like these everywhere. The caller doesn't trust the callee. If you don't even print a warning then we will never know what happened. Again, you were able to see that the user-space was writing crap because of this very same WARN + BUG. As mentioned earlier, you can strip the BUG. But we need the WARN back. Use WARN_ONCE. So this is what you need. if (status == STATUS_KERNEL) { ... return; } if (status == STATUS_USER) return; WARN_ONE(); There are two kinds of users: 1) User-A runs a one-shot capture. The corruption may/may-not happen during that time. 2) A capture appliance is running 24x7 monitoring a network. If the corruption does happen then some monitoring process will detect it (when it scans the kernel-log file) and will raise an alert. Or even sysadmins will look at the logfiles from time to time. Your patch does NOT help in case 2). I can't mine the log file because nothing was printed. This could go on for hours and we will lose all the capture data during all that time. There are SLAs around appliances. > There was already a discussion on netdev with a different BUG macro in > the TX_RING, where we concluded that it shouldn't be a warn either. > > >>> of concern: >>> >>> prb_retire_current_block() [when block status != TP_STATUS_KERNEL] >>> prb_open_block() [when block_status != TP_STATUS_KERN Those checks are there today. But no guarantees when the next person who is modifying the code does so without understanding it first. >>> >>> Calls to prb_open_block() are guarded by ealier checks if block_status >>> is really TP_STATUS_KERNEL (racy!), but the first one BUG() is easily racy? How? > Yoda condition is called if you compare ``CONSTANT == var'', actually the > other way round as it should be in CodingStyle. But that's not the big issue FYI: Long back compilers weren't smart enough to warn the user about if (var = CONSTANT), so you had to teach yourself to always write (CONSTANT == var). -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 05/08/2013 10:32 PM, chetan loke wrote: > On Tue, May 7, 2013 at 4:49 PM, Daniel Borkmann <dborkman@redhat.com> wrote: [...] >> So what's the point? If people see a WARN() in their kernel log they go >> and complain on netdev that AF_PACKET is broken although its their >> application? > > It's not complaining. Its called bug-reporting. Exactly, by removing > the WARN you deprived the user from fixing him/her buggy application. I'm sorry Chetan, I still don't see the justification. If the user does crap, he'll get an -EINVAL when trying to setup an {R,T}X_RING wrongly. How he access frames in his application and give them back to the kernel, he will get from some docs. Do you really think that such users who write crappy applications will care to look into the Linux kernel source code to locate the warning and know that this might be in relation to their code? Rather, in best case, they might send a bug report to netdev where you have to tell them that it's not a kernel bug, but a bug in their user space application. Fact is, we currently do *not* warn him in case of TPACKET_V1 or TPACKET_V2 either, and yet it works for users! Neither do we warn in case of an RX_RING nor TX_RING. So we *cannot* have double standards here, either we do not warn at all or we do warn in *every* case. (What would give justification to only WARN in TPACKET_V3 but not in TPACKET_V1 and TPACKET_V2 while they all are based on the same underlying technology?) And personally, I think the first case is more sane. Imho, the kernel is not responsible to do *bug reports* to the user in this scenario. It makes sense when you have syscalls and return an -EINVAL with possible cause explanations in man pages, but we don't have this here. I think it doesn't make sense to trigger a WARN instead, i.e. because it can also be triggered maliciously/on purpose from user space. So the man-page of packet(7) would then need a description saying that mis-usage will lead to kernel warnings? Having that said, I also think that in af_packet.c's packet_set_ring() the WARN with "Tx-ring is not supported." is too much and a simple -EINVAL or better -EOPNOTSUPP or the like would have been sufficient. Currently we have a WARN + -EINVAL for that case. But anyway, that's just a side note ... If you think differently, then please go ahead and and put WARNs all over the place for all other TPACKET versions as well and also for netlink's mmap that is based on af_packet's mmap, which went in in 3.10, and also document it in the man-page. I don't think this is reasonable. Thanks, Daniel -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, May 8, 2013 at 8:51 PM, Daniel Borkmann <dborkman@redhat.com> wrote: > > > I'm sorry Chetan, I still don't see the justification. If the user does > crap, he'll get an -EINVAL when trying to setup an {R,T}X_RING wrongly. No :). The open_block check is a runtime thingy and not at ring_init/mmap time(At init time, the mmap'd block is zero'd by the kernel and so you will not detect this). The open_block check will be triggered by an incoming frame. > How he access frames in his application and give them back to the kernel, > he will get from some docs. Do you really think that such users who write > crappy applications will care to look into the Linux kernel source code > to locate the warning and know that this might be in relation to their Normally people(sane code) will look for the return value from a syscall. mmap is special because there's no syscall to run once you mmap/setup the rings. Since there's no syscall there's no return value. That's why we try to detect it at run-time and try to WARN the user if we can. > code? Rather, in best case, they might send a bug report to netdev where > you have to tell them that it's not a kernel bug, but a bug in their user > space application. > No, we wouldn't know if it's a kernel bug or a user-triggered bug. All we can infer is that something is not right. But you are correct you can't tell if it's a kernel/user bug. So the original WARN stmt was printing the status_bit. We can add a text - "memory mismatch/corruption detected. Either due to kernel or user code." > Fact is, we currently do *not* warn him in case of TPACKET_V1 or TPACKET_V2 > either, and yet it works for users! Neither do we warn in case of an By works, it means the corruption goes un-noticed and un-detected. The get_frame_status function simply returns if the correct status isn't set in the frame. In the _v3 it's a little different because we only check for the BLOCK_STATUS and not the frame_status. > RX_RING nor TX_RING. So we *cannot* have double standards here, either we > do not warn at all or we do warn in *every* case. (What would give > justification to only WARN in TPACKET_V3 but not in TPACKET_V1 and > TPACKET_V2 while they all are based on the same underlying technology?) The v1/v2 behavior is still intact. i.e. - they don't get the warning. But by removing the WARN, we have now changed the _v3 behavior from kernel 3.3. > > And personally, I think the first case is more sane. Imho, the kernel is > not responsible to do *bug reports* to the user in this scenario. It makes > sense when you have syscalls and return an -EINVAL with possible cause > explanations in man pages, but we don't have this here. I think it doesn't Can't return -EINVAL after mmap goes live - explained above. > > If you think differently, then please go ahead and and put WARNs all over > the place for all other TPACKET versions as well and also for netlink's I don't think I'm asking to change the existing behavior of other flavors and that too for shipping releases. I can send the patch just for _v3 but wanted to first discuss it on netdev and didn't want to step on your patch. > > Thanks, > > Daniel Chetan -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 05/09/2013 08:11 PM, chetan loke wrote: > On Wed, May 8, 2013 at 8:51 PM, Daniel Borkmann <dborkman@redhat.com> wrote: [..] > Normally people(sane code) will look for the return value from a > syscall. mmap is special because there's no syscall to run once you > mmap/setup the rings. Since there's no syscall there's no return > value. That's why we try to detect it at run-time and try to WARN the > user if we can. Well, I'm aware of all the internals and the problem statement. No need for further explanations. I also don't want to have endless, tiring discussions on netdev about a WARN_ON_ONCE statement either. I still don't like it for the reasons I stated ;-) but if you want to send a patch, I'm not going to stand in the way. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c index dd5cd49..8ec1bca 100644 --- a/net/packet/af_packet.c +++ b/net/packet/af_packet.c @@ -742,36 +742,33 @@ static void prb_open_block(struct tpacket_kbdq_core *pkc1, smp_rmb(); - if (likely(TP_STATUS_KERNEL == BLOCK_STATUS(pbd1))) { + /* We could have just memset this but we will lose the + * flexibility of making the priv area sticky + */ - /* We could have just memset this but we will lose the - * flexibility of making the priv area sticky - */ - BLOCK_SNUM(pbd1) = pkc1->knxt_seq_num++; - BLOCK_NUM_PKTS(pbd1) = 0; - BLOCK_LEN(pbd1) = BLK_PLUS_PRIV(pkc1->blk_sizeof_priv); - getnstimeofday(&ts); - h1->ts_first_pkt.ts_sec = ts.tv_sec; - h1->ts_first_pkt.ts_nsec = ts.tv_nsec; - pkc1->pkblk_start = (char *)pbd1; - pkc1->nxt_offset = pkc1->pkblk_start + BLK_PLUS_PRIV(pkc1->blk_sizeof_priv); - BLOCK_O2FP(pbd1) = (__u32)BLK_PLUS_PRIV(pkc1->blk_sizeof_priv); - BLOCK_O2PRIV(pbd1) = BLK_HDR_LEN; - pbd1->version = pkc1->version; - pkc1->prev = pkc1->nxt_offset; - pkc1->pkblk_end = pkc1->pkblk_start + pkc1->kblk_size; - prb_thaw_queue(pkc1); - _prb_refresh_rx_retire_blk_timer(pkc1); + BLOCK_SNUM(pbd1) = pkc1->knxt_seq_num++; + BLOCK_NUM_PKTS(pbd1) = 0; + BLOCK_LEN(pbd1) = BLK_PLUS_PRIV(pkc1->blk_sizeof_priv); - smp_wmb(); + getnstimeofday(&ts); - return; - } + h1->ts_first_pkt.ts_sec = ts.tv_sec; + h1->ts_first_pkt.ts_nsec = ts.tv_nsec; - WARN(1, "ERROR block:%p is NOT FREE status:%d kactive_blk_num:%d\n", - pbd1, BLOCK_STATUS(pbd1), pkc1->kactive_blk_num); - dump_stack(); - BUG(); + pkc1->pkblk_start = (char *)pbd1; + pkc1->nxt_offset = pkc1->pkblk_start + BLK_PLUS_PRIV(pkc1->blk_sizeof_priv); + + BLOCK_O2FP(pbd1) = (__u32)BLK_PLUS_PRIV(pkc1->blk_sizeof_priv); + BLOCK_O2PRIV(pbd1) = BLK_HDR_LEN; + + pbd1->version = pkc1->version; + pkc1->prev = pkc1->nxt_offset; + pkc1->pkblk_end = pkc1->pkblk_start + pkc1->kblk_size; + + prb_thaw_queue(pkc1); + _prb_refresh_rx_retire_blk_timer(pkc1); + + smp_wmb(); } /* @@ -862,10 +859,6 @@ static void prb_retire_current_block(struct tpacket_kbdq_core *pkc, prb_close_block(pkc, pbd, po, status); return; } - - WARN(1, "ERROR-pbd[%d]:%p\n", pkc->kactive_blk_num, pbd); - dump_stack(); - BUG(); } static int prb_curr_blk_in_use(struct tpacket_kbdq_core *pkc,
Jakub reported that it is fairly easy to trigger the BUG() macro from user space with TPACKET_V3's RX_RING by just giving a wrong header status flag. We already had a similar situation in commit 7f5c3e3a80e6654 (``af_packet: remove BUG statement in tpacket_destruct_skb'') where this was the case in the TX_RING side that could be triggered from user space. So really, don't use BUG() or BUG_ON() unless there's really no way out, and i.e. don't use it for consistency checking when there's user space involved, no excuses, especially not if you're slapping the user with WARN + dump_stack + BUG all at once. The two functions are of concern: prb_retire_current_block() [when block status != TP_STATUS_KERNEL] prb_open_block() [when block_status != TP_STATUS_KERNEL] Calls to prb_open_block() are guarded by ealier checks if block_status is really TP_STATUS_KERNEL (racy!), but the first one BUG() is easily triggable from user space. System behaves still stable after they are removed. Also remove that yoda condition entirely, since it's already guarded. Reported-by: Jakub Zawadzki <darkjames-ws@darkjames.pl> Signed-off-by: Daniel Borkmann <dborkman@redhat.com> --- In general, TPACKET_V3 needs some bigger rework/cleanup in mid-term future anyway. Bug triggering is possible since introduction of v3. net/packet/af_packet.c | 53 ++++++++++++++++++++++---------------------------- 1 file changed, 23 insertions(+), 30 deletions(-)