diff mbox

pktgen and spin_lock_bh in xmit path

Message ID 4ADE3253.10302@gmail.com
State RFC, archived
Delegated to: David Miller
Headers show

Commit Message

Eric Dumazet Oct. 20, 2009, 9:57 p.m. UTC
Ben Greear a écrit :
> That's definitely a nasty little issue.  Using skb_set_queue_mapping
> in pktgen makes it run for me, but may just be getting lucky with the
> mac-vlan interfaces which will do the dev_queue_xmit (but, I don't so much
> care exactly what queue is used as long as things don't crash and the
> link doesn't reset).
> 
> Don't worry about a quick patch on my account.  I seem to have it working
> to at least some degree (no funny crashes, no link watchdog timeouts).
> 

Could you try following patch ?

This makes queue_mapping invariant if set in range [0 ... real_num_tx_queues-1]



--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Ben Greear Oct. 20, 2009, 11:17 p.m. UTC | #1
On 10/20/2009 02:57 PM, Eric Dumazet wrote:
> Ben Greear a écrit :
>> That's definitely a nasty little issue.  Using skb_set_queue_mapping
>> in pktgen makes it run for me, but may just be getting lucky with the
>> mac-vlan interfaces which will do the dev_queue_xmit (but, I don't so much
>> care exactly what queue is used as long as things don't crash and the
>> link doesn't reset).
>>
>> Don't worry about a quick patch on my account.  I seem to have it working
>> to at least some degree (no funny crashes, no link watchdog timeouts).
>>
>
> Could you try following patch ?
>
> This makes queue_mapping invariant if set in range [0 ... real_num_tx_queues-1]

Yes, that runs w/out causing link resets and without crashes (just tested it for
a few minutes).

Interestingly, the pkts sent by pktgen on the mac-vlans end up in
tx-queues that match processor ID, even though I'm on .31 where mac-vlans
have only one tx-queue and pktgen is setting the queue to 0 in the skb
(per your previous patch).

Must be something somewhere doing the mapping of processor ids to txqs.

I'll try to figure this out this evening or tomorrow.

Thanks,
Ben
Ben Greear Oct. 21, 2009, 3:05 a.m. UTC | #2
Ben Greear wrote:
> On 10/20/2009 02:57 PM, Eric Dumazet wrote:
>> Ben Greear a écrit :
>>> That's definitely a nasty little issue.  Using skb_set_queue_mapping
>>> in pktgen makes it run for me, but may just be getting lucky with the
>>> mac-vlan interfaces which will do the dev_queue_xmit (but, I don't 
>>> so much
>>> care exactly what queue is used as long as things don't crash and the
>>> link doesn't reset).
>>>
>>> Don't worry about a quick patch on my account.  I seem to have it 
>>> working
>>> to at least some degree (no funny crashes, no link watchdog timeouts).
>>>
>>
>> Could you try following patch ?
>>
>> This makes queue_mapping invariant if set in range [0 ... 
>> real_num_tx_queues-1]
>
> Yes, that runs w/out causing link resets and without crashes (just 
> tested it for
> a few minutes).
>
> Interestingly, the pkts sent by pktgen on the mac-vlans end up in
> tx-queues that match processor ID, even though I'm on .31 where mac-vlans
> have only one tx-queue and pktgen is setting the queue to 0 in the skb
> (per your previous patch).
Ok, this is because ixgbe implements the ndo_select_queue, which is 
called from
dev_pick_tx.

So, as far as I can tell, as long as you are using ixgbe with 82559, it 
doesn't matter what you set
for the queue-map in the skb..it's always over-written.

I don't know if this is a bug or a feature, but it explains the behavior 
with tx and rx queues
that I saw when using pktgen on mac-vlans...

Thanks,
Ben
Eric Dumazet Oct. 21, 2009, 3:12 a.m. UTC | #3
Ben Greear a écrit :
> Ben Greear wrote:
>> On 10/20/2009 02:57 PM, Eric Dumazet wrote:
>>> Ben Greear a écrit :
>>>> That's definitely a nasty little issue.  Using skb_set_queue_mapping
>>>> in pktgen makes it run for me, but may just be getting lucky with the
>>>> mac-vlan interfaces which will do the dev_queue_xmit (but, I don't
>>>> so much
>>>> care exactly what queue is used as long as things don't crash and the
>>>> link doesn't reset).
>>>>
>>>> Don't worry about a quick patch on my account.  I seem to have it
>>>> working
>>>> to at least some degree (no funny crashes, no link watchdog timeouts).
>>>>
>>>
>>> Could you try following patch ?
>>>
>>> This makes queue_mapping invariant if set in range [0 ...
>>> real_num_tx_queues-1]
>>
>> Yes, that runs w/out causing link resets and without crashes (just
>> tested it for
>> a few minutes).
>>
>> Interestingly, the pkts sent by pktgen on the mac-vlans end up in
>> tx-queues that match processor ID, even though I'm on .31 where mac-vlans
>> have only one tx-queue and pktgen is setting the queue to 0 in the skb
>> (per your previous patch).
> Ok, this is because ixgbe implements the ndo_select_queue, which is
> called from
> dev_pick_tx.
> 
> So, as far as I can tell, as long as you are using ixgbe with 82559, it
> doesn't matter what you set
> for the queue-map in the skb..it's always over-written.
> 
> I don't know if this is a bug or a feature, but it explains the behavior
> with tx and rx queues
> that I saw when using pktgen on mac-vlans...
> 

We have many bugs in this area :)

So we probably need to reset skb_set_queue_mapping(skb, queue_map);
each time skb is transmitted by pktgen.

Or else, pktgen will break if using bonding driver -> ixgbe, since bonding
uses only one txqueue (it is not yet multiqueue aware)

Thanks, I'll take care of official patches submission

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Eric Dumazet Oct. 21, 2009, 3:59 a.m. UTC | #4
Eric Dumazet a écrit :
> Ben Greear a écrit :
>> Ben Greear wrote:
>>> On 10/20/2009 02:57 PM, Eric Dumazet wrote:
>>>> Ben Greear a écrit :
>>>>> That's definitely a nasty little issue.  Using skb_set_queue_mapping
>>>>> in pktgen makes it run for me, but may just be getting lucky with the
>>>>> mac-vlan interfaces which will do the dev_queue_xmit (but, I don't
>>>>> so much
>>>>> care exactly what queue is used as long as things don't crash and the
>>>>> link doesn't reset).
>>>>>
>>>>> Don't worry about a quick patch on my account.  I seem to have it
>>>>> working
>>>>> to at least some degree (no funny crashes, no link watchdog timeouts).
>>>>>
>>>> Could you try following patch ?
>>>>
>>>> This makes queue_mapping invariant if set in range [0 ...
>>>> real_num_tx_queues-1]
>>> Yes, that runs w/out causing link resets and without crashes (just
>>> tested it for
>>> a few minutes).
>>>
>>> Interestingly, the pkts sent by pktgen on the mac-vlans end up in
>>> tx-queues that match processor ID, even though I'm on .31 where mac-vlans
>>> have only one tx-queue and pktgen is setting the queue to 0 in the skb
>>> (per your previous patch).
>> Ok, this is because ixgbe implements the ndo_select_queue, which is
>> called from
>> dev_pick_tx.
>>
>> So, as far as I can tell, as long as you are using ixgbe with 82559, it
>> doesn't matter what you set
>> for the queue-map in the skb..it's always over-written.
>>
>> I don't know if this is a bug or a feature, but it explains the behavior
>> with tx and rx queues
>> that I saw when using pktgen on mac-vlans...
>>
> 
> We have many bugs in this area :)
> 
> So we probably need to reset skb_set_queue_mapping(skb, queue_map);
> each time skb is transmitted by pktgen.
> 
> Or else, pktgen will break if using bonding driver -> ixgbe, since bonding
> uses only one txqueue (it is not yet multiqueue aware)
> 

After some thoughts, I believe user is in error :)

pktgen should not use "clone XXX" pkts if macvlan is used (or any other driver
that ultimatly calls dev_queue_xmit() and queue packet), since skb queue anchor
is shared and would be overwritten.

1) Only way to use "clone XXXX" pkts is when using real device.

2) Also, using macvlan in pktgen is sub-optimal, since you can already put any
MAC addresses in pktgen pkts, you dont need to go through macvlan layer.

3) If ixgbe overwrites skb->queue_mapping to current cpu, you should setup pktgen
 queue_map_min and queue_map_max to match you cpu number, or use QUEUE_MAP_CPU pktgen flag
Or else, pktgen wont get  the appropriate txq (and lock) before calling driver start_xmit()

Unfortunatly, the (queue_map_min==queue_map_max) case needs a patch that might be not present in 2.6.31
(commit 896a7cf8d846a9e86fb823be16f4f14ffeb7f074 : pktgen: Fix multiqueue handling)

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Ben Greear Oct. 21, 2009, 5 a.m. UTC | #5
Eric Dumazet wrote:
> pktgen should not use "clone XXX" pkts if macvlan is used (or any other driver
> that ultimatly calls dev_queue_xmit() and queue packet), since skb queue anchor
> is shared and would be overwritten.
>   
> After some thoughts, I believe user is in error :)
I tried to explain in my original post:  The problem arises when
when the hard-start-xmit fails with NETDEV_TX_BUSY.  Part of the
hard-start-xmit logic for virtual devices can call dev_queue_xmit, which 
can ultimately
change the queue mapping and yet may still return NETDEV_TX_BUSY.

pktgen would try to resend this skb next loop, and this is where it would
blow up.

I have a patched macvlan logic and a patched dev queue xmit logic that 
allows
me to return NETDEV_TX_BUSY when underlying device fails to transmit.

It may be that my hacked macvlan is the only virtual device that could ever
return NETDEV_TX_BUSY, and if that is the case, I don't think the bug
could ever be hit in official kernel code.  My opinion is that the 
current pktgen code makes
too many assumptions, so unless there is a performance penalty, I still
think it should be cleaned up.  But, I may be too paranoid.
> 1) Only way to use "clone XXXX" pkts is when using real device.
>   
Agreed, and I was not cloning pkts on the mac-vlan interface.
> 2) Also, using macvlan in pktgen is sub-optimal, since you can already put any
> MAC addresses in pktgen pkts, you dont need to go through macvlan layer.
>   
It's sub-optimal for massive pkt pushing, but still useful for sending 
multiple distinct flows
across a single physical wire.
> 3) If ixgbe overwrites skb->queue_mapping to current cpu, you should setup pktgen
>  queue_map_min and queue_map_max to match you cpu number, or use QUEUE_MAP_CPU pktgen flag
> Or else, pktgen wont get  the appropriate txq (and lock) before calling driver start_xmit()
>   
The hard-start-xmit path doesn't call the driver's queue-mapping logic, 
so you
only get that fun when transmitting through mac-vlans (or .1q vlans, 
etc).  There appears to be
no watchdog for virtual devices, and the dev_queue_xmit path updates the 
proper txq, so, as long as you
aren't using that +1 variant of the skb set queue map logic in pktgen, I 
think you will be fine.  The
current code is fine in this manner, but your patch broke it w/out the 
second patch to remove the +1
logic.

Thanks,
Ben
Eric Dumazet Oct. 21, 2009, 5:14 a.m. UTC | #6
Ben Greear a écrit :
> Eric Dumazet wrote:
>> pktgen should not use "clone XXX" pkts if macvlan is used (or any
>> other driver
>> that ultimatly calls dev_queue_xmit() and queue packet), since skb
>> queue anchor
>> is shared and would be overwritten.
>>   After some thoughts, I believe user is in error :)
> I tried to explain in my original post:  The problem arises when
> when the hard-start-xmit fails with NETDEV_TX_BUSY.  Part of the
> hard-start-xmit logic for virtual devices can call dev_queue_xmit, which
> can ultimately
> change the queue mapping and yet may still return NETDEV_TX_BUSY.
> 
> pktgen would try to resend this skb next loop, and this is where it would
> blow up.
> 
> I have a patched macvlan logic and a patched dev queue xmit logic that
> allows
> me to return NETDEV_TX_BUSY when underlying device fails to transmit.
> 
> It may be that my hacked macvlan is the only virtual device that could ever
> return NETDEV_TX_BUSY, and if that is the case, I don't think the bug
> could ever be hit in official kernel code.  My opinion is that the
> current pktgen code makes
> too many assumptions, so unless there is a performance penalty, I still
> think it should be cleaned up.  But, I may be too paranoid.

If a virtual device changes skb->queue_map, it must consume skb,
or it breaks caller.

Alternative would be to restore queue_map to its initial value in
your hacked macvlan when it wants to return NETDEV_TX_BUSY status.


We could add a WARN_ON(skb_get_queue_mapping(pkt_dev->skb) != queue_map);
in pktgen, to catch driver errors but pktgen assumption is right IMHO

@@ -3466,6 +3471,7 @@ static void pktgen_xmit(struct pktgen_dev *pkt_dev)
 		/* fallthru */
 	case NETDEV_TX_LOCKED:
 	case NETDEV_TX_BUSY:
+		WARN_ON(skb_get_queue_mapping(pkt_dev->skb) != queue_map);
 		/* Retry it next time */
 		atomic_dec(&(pkt_dev->skb->users));
 		pkt_dev->last_ok = 0;
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Ben Greear Oct. 21, 2009, 5:40 a.m. UTC | #7
Eric Dumazet wrote:
> Ben Greear a écrit :
>   
>> Eric Dumazet wrote:
>>     
>>> pktgen should not use "clone XXX" pkts if macvlan is used (or any
>>> other driver
>>> that ultimatly calls dev_queue_xmit() and queue packet), since skb
>>> queue anchor
>>> is shared and would be overwritten.
>>>   After some thoughts, I believe user is in error :)
>>>       
>> I tried to explain in my original post:  The problem arises when
>> when the hard-start-xmit fails with NETDEV_TX_BUSY.  Part of the
>> hard-start-xmit logic for virtual devices can call dev_queue_xmit, which
>> can ultimately
>> change the queue mapping and yet may still return NETDEV_TX_BUSY.
>>
>> pktgen would try to resend this skb next loop, and this is where it would
>> blow up.
>>
>> I have a patched macvlan logic and a patched dev queue xmit logic that
>> allows
>> me to return NETDEV_TX_BUSY when underlying device fails to transmit.
>>
>> It may be that my hacked macvlan is the only virtual device that could ever
>> return NETDEV_TX_BUSY, and if that is the case, I don't think the bug
>> could ever be hit in official kernel code.  My opinion is that the
>> current pktgen code makes
>> too many assumptions, so unless there is a performance penalty, I still
>> think it should be cleaned up.  But, I may be too paranoid.
>>     
>
> If a virtual device changes skb->queue_map, it must consume skb,
> or it breaks caller.
>   
> Alternative would be to restore queue_map to its initial value in
> your hacked macvlan when it wants to return NETDEV_TX_BUSY status.
>   
Yeah, that sounds fine to me, should be easy and cheap.
> We could add a WARN_ON(skb_get_queue_mapping(pkt_dev->skb) != queue_map);
> in pktgen, to catch driver errors but pktgen assumption is right IMHO
>
> @@ -3466,6 +3471,7 @@ static void pktgen_xmit(struct pktgen_dev *pkt_dev)
>  		/* fallthru */
>  	case NETDEV_TX_LOCKED:
>  	case NETDEV_TX_BUSY:
> +		WARN_ON(skb_get_queue_mapping(pkt_dev->skb) != queue_map);
>  		/* Retry it next time */
>  		atomic_dec(&(pkt_dev->skb->users));
>  		pkt_dev->last_ok = 0;
>   
That looks good too.

Thanks,
Ben
diff mbox

Patch

diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index df7b23a..3ef2429 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -2023,17 +2023,17 @@  static inline void skb_copy_queue_mapping(struct sk_buff *to, const struct sk_bu
 
 static inline void skb_record_rx_queue(struct sk_buff *skb, u16 rx_queue)
 {
-	skb->queue_mapping = rx_queue + 1;
+	skb->queue_mapping = rx_queue;
 }
 
 static inline u16 skb_get_rx_queue(const struct sk_buff *skb)
 {
-	return skb->queue_mapping - 1;
+	return skb->queue_mapping;
 }
 
 static inline bool skb_rx_queue_recorded(const struct sk_buff *skb)
 {
-	return (skb->queue_mapping != 0);
+	return (skb->queue_mapping != 0xffff);
 }
 
 extern u16 skb_tx_hash(const struct net_device *dev,
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 80a9616..5f04f00 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -198,6 +198,7 @@  struct sk_buff *__alloc_skb(unsigned int size, gfp_t gfp_mask,
 	memset(skb, 0, offsetof(struct sk_buff, tail));
 	skb->truesize = size + sizeof(struct sk_buff);
 	atomic_set(&skb->users, 1);
+	skb->queue_mapping = 0xffff;
 	skb->head = data;
 	skb->data = data;
 	skb_reset_tail_pointer(skb);