diff mbox

[net] packet: tpacket_v3: do not trigger bug() on wrong header status

Message ID 1367585820-2674-1-git-send-email-dborkman@redhat.com
State Accepted, archived
Delegated to: David Miller
Headers show

Commit Message

Daniel Borkmann May 3, 2013, 12:57 p.m. UTC
Jakub reported that it is fairly easy to trigger the BUG() macro
from user space with TPACKET_V3's RX_RING by just giving a wrong
header status flag. We already had a similar situation in commit
7f5c3e3a80e6654 (``af_packet: remove BUG statement in
tpacket_destruct_skb'') where this was the case in the TX_RING
side that could be triggered from user space. So really, don't use
BUG() or BUG_ON() unless there's really no way out, and i.e.
don't use it for consistency checking when there's user space
involved, no excuses, especially not if you're slapping the user
with WARN + dump_stack + BUG all at once. The two functions are
of concern:

  prb_retire_current_block() [when block status != TP_STATUS_KERNEL]
  prb_open_block() [when block_status != TP_STATUS_KERNEL]

Calls to prb_open_block() are guarded by ealier checks if block_status
is really TP_STATUS_KERNEL (racy!), but the first one BUG() is easily
triggable from user space. System behaves still stable after they are
removed. Also remove that yoda condition entirely, since it's already
guarded.

Reported-by: Jakub Zawadzki <darkjames-ws@darkjames.pl>
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
---
 In general, TPACKET_V3 needs some bigger rework/cleanup in mid-term
 future anyway. Bug triggering is possible since introduction of v3.

 net/packet/af_packet.c | 53 ++++++++++++++++++++++----------------------------
 1 file changed, 23 insertions(+), 30 deletions(-)

Comments

David Miller May 3, 2013, 8:12 p.m. UTC | #1
From: Daniel Borkmann <dborkman@redhat.com>
Date: Fri,  3 May 2013 14:57:00 +0200

> Jakub reported that it is fairly easy to trigger the BUG() macro
> from user space with TPACKET_V3's RX_RING by just giving a wrong
> header status flag. We already had a similar situation in commit
> 7f5c3e3a80e6654 (``af_packet: remove BUG statement in
> tpacket_destruct_skb'') where this was the case in the TX_RING
> side that could be triggered from user space. So really, don't use
> BUG() or BUG_ON() unless there's really no way out, and i.e.
> don't use it for consistency checking when there's user space
> involved, no excuses, especially not if you're slapping the user
> with WARN + dump_stack + BUG all at once. The two functions are
> of concern:
> 
>   prb_retire_current_block() [when block status != TP_STATUS_KERNEL]
>   prb_open_block() [when block_status != TP_STATUS_KERNEL]
> 
> Calls to prb_open_block() are guarded by ealier checks if block_status
> is really TP_STATUS_KERNEL (racy!), but the first one BUG() is easily
> triggable from user space. System behaves still stable after they are
> removed. Also remove that yoda condition entirely, since it's already
> guarded.
> 
> Reported-by: Jakub Zawadzki <darkjames-ws@darkjames.pl>
> Signed-off-by: Daniel Borkmann <dborkman@redhat.com>

Applied and queued up for -stable, thanks Daniel.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
chetan L May 7, 2013, 4:59 p.m. UTC | #2
On Fri, May 3, 2013 at 8:57 AM, Daniel Borkmann <dborkman@redhat.com> wrote:
>
> Jakub reported that it is fairly easy to trigger the BUG() macro
> from user space with TPACKET_V3's RX_RING by just giving a wrong
> header status flag. We already had a similar situation in commit
> 7f5c3e3a80e6654 (``af_packet: remove BUG statement in
> tpacket_destruct_skb'') where this was the case in the TX_RING
> side that could be triggered from user space. So really, don't use
> BUG() or BUG_ON() unless there's really no way out, and i.e.
> don't use it for consistency checking when there's user space
> involved, no excuses, especially not if you're slapping the user
> with WARN + dump_stack + BUG all at once. The two functions are

remember, you were able to figure out what the problem was because of
the WARN and the forceful BUG.
Is BUG ok? Depends on how you look at it. If you are doing serious
network monitoring on a real life appliance
we don't really want folks to write crappy application. May be BUG is
too harsh. But then I don't see a WARN in this patch either (may be I
overlooked it).


> of concern:
>
>   prb_retire_current_block() [when block status != TP_STATUS_KERNEL]
>   prb_open_block() [when block_status != TP_STATUS_KERN

>
> Calls to prb_open_block() are guarded by ealier checks if block_status
> is really TP_STATUS_KERNEL (racy!), but the first one BUG() is easily
> triggable from user space. System behaves still stable after they are
> removed. Also remove that yoda condition entirely, since it's already
> guarded.

What yoda? It's called defensive programming. It's a single check and
not going to cost you anything.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Daniel Borkmann May 7, 2013, 8:49 p.m. UTC | #3
On 05/07/2013 06:59 PM, chetan loke wrote:
> On Fri, May 3, 2013 at 8:57 AM, Daniel Borkmann <dborkman@redhat.com> wrote:
>>
>> Jakub reported that it is fairly easy to trigger the BUG() macro
>> from user space with TPACKET_V3's RX_RING by just giving a wrong
>> header status flag. We already had a similar situation in commit
>> 7f5c3e3a80e6654 (``af_packet: remove BUG statement in
>> tpacket_destruct_skb'') where this was the case in the TX_RING
>> side that could be triggered from user space. So really, don't use
>> BUG() or BUG_ON() unless there's really no way out, and i.e.
>> don't use it for consistency checking when there's user space
>> involved, no excuses, especially not if you're slapping the user
>> with WARN + dump_stack + BUG all at once. The two functions are
>
> remember, you were able to figure out what the problem was because of
> the WARN and the forceful BUG.
> Is BUG ok? Depends on how you look at it. If you are doing serious
> network monitoring on a real life appliance
> we don't really want folks to write crappy application. May be BUG is

Right, and now we treat people who write crappy applications with BUG?

BUG is only for consistency checks *within* the kernel and not meant to
be triggered on purpose from user space.

> too harsh. But then I don't see a WARN in this patch either (may be I
> overlooked it).

So what's the point? If people see a WARN() in their kernel log they go
and complain on netdev that AF_PACKET is broken although its their
application?

There was already a discussion on netdev with a different BUG macro in
the TX_RING, where we concluded that it shouldn't be a warn either.

>> of concern:
>>
>>    prb_retire_current_block() [when block status != TP_STATUS_KERNEL]
>>    prb_open_block() [when block_status != TP_STATUS_KERN
>
>>
>> Calls to prb_open_block() are guarded by ealier checks if block_status
>> is really TP_STATUS_KERNEL (racy!), but the first one BUG() is easily
>> triggable from user space. System behaves still stable after they are
>> removed. Also remove that yoda condition entirely, since it's already
>> guarded.
>
> What yoda? It's called defensive programming. It's a single check and
> not going to cost you anything.

Yoda condition is called if you compare ``CONSTANT == var'', actually the
other way round as it should be in CodingStyle. But that's not the big issue
here, as Jakub pointed out it's racy because you do this comparison two
times and status could change in between, thus you trigger the next BUG
there, like (pseudo code):

if (frame_status & USERPSACE) {
   ...
} else {
    if (frame_status & KERNEL) {
       ...
    } else {
       BUG
    }
}
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Eric Dumazet May 7, 2013, 9:56 p.m. UTC | #4
On Tue, 2013-05-07 at 22:49 +0200, Daniel Borkmann wrote:

> Right, and now we treat people who write crappy applications with BUG?
> 
> BUG is only for consistency checks *within* the kernel and not meant to
> be triggered on purpose from user space.

+1



--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
David Miller May 7, 2013, 9:58 p.m. UTC | #5
From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Tue, 07 May 2013 14:56:27 -0700

> On Tue, 2013-05-07 at 22:49 +0200, Daniel Borkmann wrote:
> 
>> Right, and now we treat people who write crappy applications with BUG?
>> 
>> BUG is only for consistency checks *within* the kernel and not meant to
>> be triggered on purpose from user space.
> 
> +1

+1
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
chetan L May 8, 2013, 8:32 p.m. UTC | #6
On Tue, May 7, 2013 at 4:49 PM, Daniel Borkmann <dborkman@redhat.com> wrote:

>> remember, you were able to figure out what the problem was because of
>> the WARN and the forceful BUG.
>> Is BUG ok? Depends on how you look at it. If you are doing serious
>> network monitoring on a real life appliance
>> we don't really want folks to write crappy application. May be BUG is
>
>
> Right, and now we treat people who write crappy applications with BUG?
>
> BUG is only for consistency checks *within* the kernel and not meant to
> be triggered on purpose from user space.
>

I thought I made myself clear when I said, may be BUG is too harsh :).


>
>> too harsh. But then I don't see a WARN in this patch either (may be I
>> overlooked it).
>
>
> So what's the point? If people see a WARN() in their kernel log they go
> and complain on netdev that AF_PACKET is broken although its their
> application?
>

It's not complaining. Its called bug-reporting. Exactly, by removing
the WARN you deprived the user from fixing him/her buggy application.
Infact, that could have been a kernel bug. Who knows.

Pls re-read the code again. The code will also guard itself when you
start modifying it and forget to check the block-status before calling
it. There are checks like these everywhere. The caller doesn't trust
the callee.

If you don't even print a warning then we will never know what
happened. Again, you were able to see that the user-space was writing
crap because of this very same WARN + BUG. As mentioned earlier, you
can strip the BUG. But we need the WARN back. Use WARN_ONCE.


So this is what you need.

if (status == STATUS_KERNEL) {
...
return;
}
if (status == STATUS_USER)
return;
WARN_ONE();

There are two kinds of users:
1) User-A runs a one-shot capture. The corruption may/may-not happen
during that time.
2) A capture appliance is running 24x7 monitoring a network. If the
corruption does happen then some monitoring process will detect it
(when it scans the kernel-log file) and will raise an alert. Or even
sysadmins will look at the logfiles from time to time.

Your patch does NOT help in case 2). I can't mine the log file because
nothing was printed. This could go on for hours and we will lose all
the capture data during all that time. There are SLAs around
appliances.


> There was already a discussion on netdev with a different BUG macro in
> the TX_RING, where we concluded that it shouldn't be a warn either.
>
>
>>> of concern:
>>>
>>>    prb_retire_current_block() [when block status != TP_STATUS_KERNEL]
>>>    prb_open_block() [when block_status != TP_STATUS_KERN

Those checks are there today. But no guarantees when the next person
who is modifying the code does so without understanding it first.


>>>
>>> Calls to prb_open_block() are guarded by ealier checks if block_status
>>> is really TP_STATUS_KERNEL (racy!), but the first one BUG() is easily

racy? How?



> Yoda condition is called if you compare ``CONSTANT == var'', actually the
> other way round as it should be in CodingStyle. But that's not the big issue

FYI: Long back compilers weren't smart enough to warn the user about
if (var = CONSTANT), so you had to teach yourself to always write
(CONSTANT == var).
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Daniel Borkmann May 9, 2013, 12:51 a.m. UTC | #7
On 05/08/2013 10:32 PM, chetan loke wrote:
> On Tue, May 7, 2013 at 4:49 PM, Daniel Borkmann <dborkman@redhat.com> wrote:
  [...]
>> So what's the point? If people see a WARN() in their kernel log they go
>> and complain on netdev that AF_PACKET is broken although its their
>> application?
>
> It's not complaining. Its called bug-reporting. Exactly, by removing
> the WARN you deprived the user from fixing him/her buggy application.

I'm sorry Chetan, I still don't see the justification. If the user does
crap, he'll get an -EINVAL when trying to setup an {R,T}X_RING wrongly.
How he access frames in his application and give them back to the kernel,
he will get from some docs. Do you really think that such users who write
crappy applications will care to look into the Linux kernel source code
to locate the warning and know that this might be in relation to their
code? Rather, in best case, they might send a bug report to netdev where
you have to tell them that it's not a kernel bug, but a bug in their user
space application.

Fact is, we currently do *not* warn him in case of TPACKET_V1 or TPACKET_V2
either, and yet it works for users! Neither do we warn in case of an
RX_RING nor TX_RING. So we *cannot* have double standards here, either we
do not warn at all or we do warn in *every* case. (What would give
justification to only WARN in TPACKET_V3 but not in TPACKET_V1 and
TPACKET_V2 while they all are based on the same underlying technology?)

And personally, I think the first case is more sane. Imho, the kernel is
not responsible to do *bug reports* to the user in this scenario. It makes
sense when you have syscalls and return an -EINVAL with possible cause
explanations in man pages, but we don't have this here. I think it doesn't
make sense to trigger a WARN instead, i.e. because it can also be triggered
maliciously/on purpose from user space. So the man-page of packet(7) would
then need a description saying that mis-usage will lead to kernel warnings?

Having that said, I also think that in af_packet.c's packet_set_ring() the
WARN with "Tx-ring is not supported." is too much and a simple -EINVAL or
better -EOPNOTSUPP or the like would have been sufficient. Currently we have
a WARN + -EINVAL for that case. But anyway, that's just a side note ...

If you think differently, then please go ahead and and put WARNs all over
the place for all other TPACKET versions as well and also for netlink's
mmap that is based on af_packet's mmap, which went in in 3.10, and also
document it in the man-page. I don't think this is reasonable.

Thanks,

Daniel
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
chetan L May 9, 2013, 6:11 p.m. UTC | #8
On Wed, May 8, 2013 at 8:51 PM, Daniel Borkmann <dborkman@redhat.com> wrote:
>
>
> I'm sorry Chetan, I still don't see the justification. If the user does
> crap, he'll get an -EINVAL when trying to setup an {R,T}X_RING wrongly.

No :). The open_block check is a runtime thingy and not at
ring_init/mmap time(At init time, the mmap'd block is zero'd by the
kernel and so you will not detect this). The open_block check will be
triggered by an incoming frame.

> How he access frames in his application and give them back to the kernel,
> he will get from some docs. Do you really think that such users who write
> crappy applications will care to look into the Linux kernel source code
> to locate the warning and know that this might be in relation to their

Normally people(sane code) will look for the return value from a
syscall. mmap is special because there's no syscall to run once you
mmap/setup the rings. Since there's no syscall there's no return
value. That's why we try to detect it at run-time and try to WARN the
user if we can.

> code? Rather, in best case, they might send a bug report to netdev where
> you have to tell them that it's not a kernel bug, but a bug in their user
> space application.
>

No, we wouldn't know if it's a kernel bug or a user-triggered bug. All
we can infer is that something is not right. But you are correct you
can't tell if it's a kernel/user bug. So the original WARN stmt was
printing the status_bit. We can add a text - "memory
mismatch/corruption detected. Either due to kernel or user code."

> Fact is, we currently do *not* warn him in case of TPACKET_V1 or TPACKET_V2
> either, and yet it works for users! Neither do we warn in case of an

By works, it means the corruption goes un-noticed and un-detected.

The get_frame_status function simply returns if the correct status
isn't set in the frame. In the _v3 it's a little different because we
only check for
the BLOCK_STATUS and not the frame_status.

> RX_RING nor TX_RING. So we *cannot* have double standards here, either we
> do not warn at all or we do warn in *every* case. (What would give
> justification to only WARN in TPACKET_V3 but not in TPACKET_V1 and
> TPACKET_V2 while they all are based on the same underlying technology?)

The v1/v2 behavior is still intact. i.e. - they don't get the warning.
But by removing the WARN, we have now changed the _v3 behavior from
kernel 3.3.

>
> And personally, I think the first case is more sane. Imho, the kernel is
> not responsible to do *bug reports* to the user in this scenario. It makes
> sense when you have syscalls and return an -EINVAL with possible cause
> explanations in man pages, but we don't have this here. I think it doesn't

Can't return -EINVAL after mmap goes live - explained above.



>
> If you think differently, then please go ahead and and put WARNs all over
> the place for all other TPACKET versions as well and also for netlink's

I don't think I'm asking to change the existing behavior of other
flavors and that too for shipping releases. I can send the patch just
for _v3 but wanted to first discuss it on netdev and didn't want to
step on your patch.


>
> Thanks,
>
> Daniel

Chetan
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Daniel Borkmann May 9, 2013, 6:32 p.m. UTC | #9
On 05/09/2013 08:11 PM, chetan loke wrote:
> On Wed, May 8, 2013 at 8:51 PM, Daniel Borkmann <dborkman@redhat.com> wrote:
[..]
> Normally people(sane code) will look for the return value from a
> syscall. mmap is special because there's no syscall to run once you
> mmap/setup the rings. Since there's no syscall there's no return
> value. That's why we try to detect it at run-time and try to WARN the
> user if we can.

Well, I'm aware of all the internals and the problem statement. No need
for further explanations.

I also don't want to have endless, tiring discussions on netdev about a
WARN_ON_ONCE statement either.

I still don't like it for the reasons I stated ;-) but if you want to send
a patch, I'm not going to stand in the way.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c
index dd5cd49..8ec1bca 100644
--- a/net/packet/af_packet.c
+++ b/net/packet/af_packet.c
@@ -742,36 +742,33 @@  static void prb_open_block(struct tpacket_kbdq_core *pkc1,
 
 	smp_rmb();
 
-	if (likely(TP_STATUS_KERNEL == BLOCK_STATUS(pbd1))) {
+	/* We could have just memset this but we will lose the
+	 * flexibility of making the priv area sticky
+	 */
 
-		/* We could have just memset this but we will lose the
-		 * flexibility of making the priv area sticky
-		 */
-		BLOCK_SNUM(pbd1) = pkc1->knxt_seq_num++;
-		BLOCK_NUM_PKTS(pbd1) = 0;
-		BLOCK_LEN(pbd1) = BLK_PLUS_PRIV(pkc1->blk_sizeof_priv);
-		getnstimeofday(&ts);
-		h1->ts_first_pkt.ts_sec = ts.tv_sec;
-		h1->ts_first_pkt.ts_nsec = ts.tv_nsec;
-		pkc1->pkblk_start = (char *)pbd1;
-		pkc1->nxt_offset = pkc1->pkblk_start + BLK_PLUS_PRIV(pkc1->blk_sizeof_priv);
-		BLOCK_O2FP(pbd1) = (__u32)BLK_PLUS_PRIV(pkc1->blk_sizeof_priv);
-		BLOCK_O2PRIV(pbd1) = BLK_HDR_LEN;
-		pbd1->version = pkc1->version;
-		pkc1->prev = pkc1->nxt_offset;
-		pkc1->pkblk_end = pkc1->pkblk_start + pkc1->kblk_size;
-		prb_thaw_queue(pkc1);
-		_prb_refresh_rx_retire_blk_timer(pkc1);
+	BLOCK_SNUM(pbd1) = pkc1->knxt_seq_num++;
+	BLOCK_NUM_PKTS(pbd1) = 0;
+	BLOCK_LEN(pbd1) = BLK_PLUS_PRIV(pkc1->blk_sizeof_priv);
 
-		smp_wmb();
+	getnstimeofday(&ts);
 
-		return;
-	}
+	h1->ts_first_pkt.ts_sec = ts.tv_sec;
+	h1->ts_first_pkt.ts_nsec = ts.tv_nsec;
 
-	WARN(1, "ERROR block:%p is NOT FREE status:%d kactive_blk_num:%d\n",
-		pbd1, BLOCK_STATUS(pbd1), pkc1->kactive_blk_num);
-	dump_stack();
-	BUG();
+	pkc1->pkblk_start = (char *)pbd1;
+	pkc1->nxt_offset = pkc1->pkblk_start + BLK_PLUS_PRIV(pkc1->blk_sizeof_priv);
+
+	BLOCK_O2FP(pbd1) = (__u32)BLK_PLUS_PRIV(pkc1->blk_sizeof_priv);
+	BLOCK_O2PRIV(pbd1) = BLK_HDR_LEN;
+
+	pbd1->version = pkc1->version;
+	pkc1->prev = pkc1->nxt_offset;
+	pkc1->pkblk_end = pkc1->pkblk_start + pkc1->kblk_size;
+
+	prb_thaw_queue(pkc1);
+	_prb_refresh_rx_retire_blk_timer(pkc1);
+
+	smp_wmb();
 }
 
 /*
@@ -862,10 +859,6 @@  static void prb_retire_current_block(struct tpacket_kbdq_core *pkc,
 		prb_close_block(pkc, pbd, po, status);
 		return;
 	}
-
-	WARN(1, "ERROR-pbd[%d]:%p\n", pkc->kactive_blk_num, pbd);
-	dump_stack();
-	BUG();
 }
 
 static int prb_curr_blk_in_use(struct tpacket_kbdq_core *pkc,