Patchwork 3.7.8/amd64 full interrupt hangs due to iwlwifi under big nfs copies out

login
register
mail settings
Submitter Eric Dumazet
Date Feb. 20, 2013, 3:11 p.m.
Message ID <1361373064.19353.180.camel@edumazet-glaptop>
Download mbox | patch
Permalink /patch/222098/
State RFC
Delegated to: David Miller
Headers show

Comments

Eric Dumazet - Feb. 20, 2013, 3:11 p.m.
On Wed, 2013-02-20 at 10:15 +0100, Johannes Berg wrote:

> OTOH, this affects the protocol, and when you really can't allocate any
> order-1 pages you pointed out yourself that many other things also won't
> work, so I'm not really sure it makes a big difference if we change the
> driver?

It will make a huge difference, even on non pressure mode, as TCP
receive window will grow twice faster.

Unless wifi speed reaches 10Gbps, following patch should do the trick




--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Johannes Berg - Feb. 20, 2013, 4:20 p.m.
On Wed, 2013-02-20 at 07:11 -0800, Eric Dumazet wrote:
> On Wed, 2013-02-20 at 10:15 +0100, Johannes Berg wrote:
> 
> > OTOH, this affects the protocol, and when you really can't allocate any
> > order-1 pages you pointed out yourself that many other things also won't
> > work, so I'm not really sure it makes a big difference if we change the
> > driver?
> 
> It will make a huge difference, even on non pressure mode, as TCP
> receive window will grow twice faster.

Hmm, why does that depend on the allocation size?

> Unless wifi speed reaches 10Gbps, following patch should do the trick
> 
> 
> diff --git a/drivers/net/wireless/iwlwifi/dvm/rx.c b/drivers/net/wireless/iwlwifi/dvm/rx.c
> index a4eed20..77a3ee3 100644
> --- a/drivers/net/wireless/iwlwifi/dvm/rx.c
> +++ b/drivers/net/wireless/iwlwifi/dvm/rx.c
> @@ -750,7 +750,12 @@ static void iwlagn_pass_packet_to_mac80211(struct iwl_priv *priv,
>  	/* Dont use dev_alloc_skb(), we'll have enough headroom once
>  	 * ieee80211_hdr pulled.
>  	 */
> -	skb = alloc_skb(128, GFP_ATOMIC);
> +	fraglen = 128;
> +	/* if we use order-1 pages, copy to get better TCP performance */
> +	if (rxb->truesize > PAGE_SIZE)
> +		fraglen = max_t(unsigned, fraglen, len);
> +
> +	skb = alloc_skb(fraglen, GFP_ATOMIC);

Hmm, I don't quite understand -- that's not doing any copy?

FWIW if you do the copy you should not "steal" the pages, then they'd be
recycled in the RX ring right away.

johannes

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Eric Dumazet - Feb. 20, 2013, 4:55 p.m.
On Wed, 2013-02-20 at 17:20 +0100, Johannes Berg wrote:
> On Wed, 2013-02-20 at 07:11 -0800, Eric Dumazet wrote:
> > On Wed, 2013-02-20 at 10:15 +0100, Johannes Berg wrote:
> > 
> > > OTOH, this affects the protocol, and when you really can't allocate any
> > > order-1 pages you pointed out yourself that many other things also won't
> > > work, so I'm not really sure it makes a big difference if we change the
> > > driver?
> > 
> > It will make a huge difference, even on non pressure mode, as TCP
> > receive window will grow twice faster.
> 
> Hmm, why does that depend on the allocation size?

I guess you missed all the patches about skb->truesize on netdev

> 
> > Unless wifi speed reaches 10Gbps, following patch should do the trick
> > 
> > 
> > diff --git a/drivers/net/wireless/iwlwifi/dvm/rx.c b/drivers/net/wireless/iwlwifi/dvm/rx.c
> > index a4eed20..77a3ee3 100644
> > --- a/drivers/net/wireless/iwlwifi/dvm/rx.c
> > +++ b/drivers/net/wireless/iwlwifi/dvm/rx.c
> > @@ -750,7 +750,12 @@ static void iwlagn_pass_packet_to_mac80211(struct iwl_priv *priv,
> >  	/* Dont use dev_alloc_skb(), we'll have enough headroom once
> >  	 * ieee80211_hdr pulled.
> >  	 */
> > -	skb = alloc_skb(128, GFP_ATOMIC);
> > +	fraglen = 128;
> > +	/* if we use order-1 pages, copy to get better TCP performance */
> > +	if (rxb->truesize > PAGE_SIZE)
> > +		fraglen = max_t(unsigned, fraglen, len);
> > +
> > +	skb = alloc_skb(fraglen, GFP_ATOMIC);
> 
> Hmm, I don't quite understand -- that's not doing any copy?
> 
> FWIW if you do the copy you should not "steal" the pages, then they'd be
> recycled in the RX ring right away.

Code should just works, please read the following lines in the same
function....

        /* If frame is small enough to fit in skb->head, pull it completely.
         * If not, only pull ieee80211_hdr so that splice() or TCP coalesce
         * are more efficient.
         */
        hdrlen = (len <= skb_tailroom(skb)) ? len : sizeof(*hdr);

        memcpy(skb_put(skb, hdrlen), hdr, hdrlen);
        fraglen = len - hdrlen;

        if (fraglen) {
                int offset = (void *)hdr + hdrlen -
                             rxb_addr(rxb) + rxb_offset(rxb);

                skb_add_rx_frag(skb, 0, rxb_steal_page(rxb), offset,
                                fraglen, rxb->truesize);
        }



--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Johannes Berg - Feb. 20, 2013, 4:59 p.m.
On Wed, 2013-02-20 at 08:55 -0800, Eric Dumazet wrote:

> > > It will make a huge difference, even on non pressure mode, as TCP
> > > receive window will grow twice faster.
> > 
> > Hmm, why does that depend on the allocation size?
> 
> I guess you missed all the patches about skb->truesize on netdev

Yeah, I don't follow netdev much any more...

> > > -	skb = alloc_skb(128, GFP_ATOMIC);
> > > +	fraglen = 128;
> > > +	/* if we use order-1 pages, copy to get better TCP performance */
> > > +	if (rxb->truesize > PAGE_SIZE)
> > > +		fraglen = max_t(unsigned, fraglen, len);
> > > +
> > > +	skb = alloc_skb(fraglen, GFP_ATOMIC);
> > 
> > Hmm, I don't quite understand -- that's not doing any copy?
> > 
> > FWIW if you do the copy you should not "steal" the pages, then they'd be
> > recycled in the RX ring right away.
> 
> Code should just works, please read the following lines in the same
> function....
> 
>         /* If frame is small enough to fit in skb->head, pull it completely.
>          * If not, only pull ieee80211_hdr so that splice() or TCP coalesce
>          * are more efficient.
>          */

Oh, right, though I guess the comment is now wrong since practically
every packet will be copied either here or in mac80211 (A-MSDUs are
split up there)

johannes

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Johannes Berg - Feb. 20, 2013, 5:01 p.m.
On Wed, 2013-02-20 at 08:55 -0800, Eric Dumazet wrote:

> > > Unless wifi speed reaches 10Gbps, following patch should do the trick
> > > 
> > > 
> > > diff --git a/drivers/net/wireless/iwlwifi/dvm/rx.c b/drivers/net/wireless/iwlwifi/dvm/rx.c
> > > index a4eed20..77a3ee3 100644
> > > --- a/drivers/net/wireless/iwlwifi/dvm/rx.c
> > > +++ b/drivers/net/wireless/iwlwifi/dvm/rx.c
> > > @@ -750,7 +750,12 @@ static void iwlagn_pass_packet_to_mac80211(struct iwl_priv *priv,
> > >  	/* Dont use dev_alloc_skb(), we'll have enough headroom once
> > >  	 * ieee80211_hdr pulled.
> > >  	 */
> > > -	skb = alloc_skb(128, GFP_ATOMIC);
> > > +	fraglen = 128;
> > > +	/* if we use order-1 pages, copy to get better TCP performance */
> > > +	if (rxb->truesize > PAGE_SIZE)
> > > +		fraglen = max_t(unsigned, fraglen, len);
> > > +
> > > +	skb = alloc_skb(fraglen, GFP_ATOMIC);
> > 
> > Hmm, I don't quite understand -- that's not doing any copy?
> > 
> > FWIW if you do the copy you should not "steal" the pages, then they'd be
> > recycled in the RX ring right away.
> 
> Code should just works, please read the following lines in the same
> function....

FWIW, I think just using order-0 pages and turning 8k A-MSDUs off by
default makes more sense, A-MSDU is rarely used to begin with ...

Also, if we copy larger frames here, we should also take into account
the (variable) 802.11 header length to avoid copying into a position
where the IP header ends up being unaligned.

johannes

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Eric Dumazet - Feb. 20, 2013, 5:24 p.m.
On Wed, 2013-02-20 at 18:01 +0100, Johannes Berg wrote:

> FWIW, I think just using order-0 pages and turning 8k A-MSDUs off by
> default makes more sense, A-MSDU is rarely used to begin with ...
> 

My suggested patch makes sure that if someone needs 8k A-MSDU, iwlwifi
still works correctly.

> Also, if we copy larger frames here, we should also take into account
> the (variable) 802.11 header length to avoid copying into a position
> where the IP header ends up being unaligned.

But the current code already has this problem (if its a problem at all,
as on x86 NET_IP_ALIGN is 0)

I only increase available tailroom in the skb. Right now there is enough
room for IP and tcp headers.



--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Eric Dumazet - Feb. 20, 2013, 5:39 p.m.
On Wed, 2013-02-20 at 17:59 +0100, Johannes Berg wrote:

> Yeah, I don't follow netdev much any more...
> 

Short answer : tcp stack has autotuning sk_rcvbuf, tracking _memory_ use
of a socket.

If network layer provides 8k fat skbs (holding 1500 bytes), the
advertised TCP receive window will be much smaller than if network layer
provides 2k skbs.

Its quite visible in a tcpdump in the beginning of a tcp session.

You'll see how linux grows the receive window, before and after my
patch. You'll see how the sender can send its data faster.

Providing nice skbs is a way to speedup Internet browsing, as most http
traffic happen exactly while tcp receiver didnt yet advertised a big
window. (Especially visible with large RTT)

> > 
> >         /* If frame is small enough to fit in skb->head, pull it completely.
> >          * If not, only pull ieee80211_hdr so that splice() or TCP coalesce
> >          * are more efficient.
> >          */
> 
> Oh, right, though I guess the comment is now wrong since practically
> every packet will be copied either here or in mac80211 (A-MSDUs are
> split up there)

Comment is not 'wrong', it describes what happens here.

If there is enough room in skb->head to avoid attaching a page fragment
to the skb, lets copy the frame so that the page fragment can be reused
by the NIC driver immediately.



--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Johannes Berg - Feb. 20, 2013, 6:16 p.m.
On Wed, 2013-02-20 at 09:24 -0800, Eric Dumazet wrote:
> On Wed, 2013-02-20 at 18:01 +0100, Johannes Berg wrote:
> 
> > FWIW, I think just using order-0 pages and turning 8k A-MSDUs off by
> > default makes more sense, A-MSDU is rarely used to begin with ...
> > 
> 
> My suggested patch makes sure that if someone needs 8k A-MSDU, iwlwifi
> still works correctly.

True, but if we don't advertise 8k A-MSDU then they won't be used, and
it'll still work correctly, just not be able to receive such large
A-MSDUs. Linux can't even transmit _any_ A-MSDUs, and many (most?) APs
don't either.

> > Also, if we copy larger frames here, we should also take into account
> > the (variable) 802.11 header length to avoid copying into a position
> > where the IP header ends up being unaligned.
> 
> But the current code already has this problem (if its a problem at all,
> as on x86 NET_IP_ALIGN is 0)

Yes and no. The current code doesn't pull in much data, not even the IP
header, so worst case it copies 14 bytes (ethernet header) in mac80211
to obtain alignment. By pulling in everything here, it later has to copy
everything again later to obtain alignment.

johannes

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Eric Dumazet - Feb. 20, 2013, 7:17 p.m.
On Wed, 2013-02-20 at 19:16 +0100, Johannes Berg wrote:

> Yes and no. The current code doesn't pull in much data, not even the IP
> header, so worst case it copies 14 bytes (ethernet header) in mac80211
> to obtain alignment. By pulling in everything here, it later has to copy
> everything again later to obtain alignment.

Can you give me pointers to the code doing that ?

On x86 we should not doing anything.



--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Johannes Berg - Feb. 20, 2013, 7:58 p.m.
On Wed, 2013-02-20 at 11:17 -0800, Eric Dumazet wrote:
> On Wed, 2013-02-20 at 19:16 +0100, Johannes Berg wrote:
> 
> > Yes and no. The current code doesn't pull in much data, not even the IP
> > header, so worst case it copies 14 bytes (ethernet header) in mac80211
> > to obtain alignment. By pulling in everything here, it later has to copy
> > everything again later to obtain alignment.
> 
> Can you give me pointers to the code doing that ?

Sure: net/mac80211/rx.c ~line 1900, look for #ifndef
CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS.

> On x86 we should not doing anything.

We don't do anything if CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS is
defined.

johannes

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Johannes Berg - Feb. 20, 2013, 8:09 p.m.
On Wed, 2013-02-20 at 07:11 -0800, Eric Dumazet wrote:

> diff --git a/drivers/net/wireless/iwlwifi/dvm/rx.c b/drivers/net/wireless/iwlwifi/dvm/rx.c
> index a4eed20..77a3ee3 100644
> --- a/drivers/net/wireless/iwlwifi/dvm/rx.c
> +++ b/drivers/net/wireless/iwlwifi/dvm/rx.c
> @@ -750,7 +750,12 @@ static void iwlagn_pass_packet_to_mac80211(struct iwl_priv *priv,
>  	/* Dont use dev_alloc_skb(), we'll have enough headroom once
>  	 * ieee80211_hdr pulled.
>  	 */
> -	skb = alloc_skb(128, GFP_ATOMIC);
> +	fraglen = 128;
> +	/* if we use order-1 pages, copy to get better TCP performance */
> +	if (rxb->truesize > PAGE_SIZE)
> +		fraglen = max_t(unsigned, fraglen, len);

Btw, shouldn't this rather be something like

if (rxb->truesize > PAGE_SIZE && len < SKB_WITH_OVERHEAD(PAGE_SIZE))

to avoid allocating large SKBs in the case that this actually is a big
8k A-MSDU? That wouldn't be given to the TCP RX path anyway because
mac80211 will split the A-MSDU into MSDUs and memcpy along the way.

Anyway we're preparing a patch for 4k A-MSDU (which is really just
something like 3782 bytes or something).

johannes

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Eric Dumazet - Feb. 20, 2013, 8:14 p.m.
On Wed, 2013-02-20 at 20:58 +0100, Johannes Berg wrote:

> Sure: net/mac80211/rx.c ~line 1900, look for #ifndef
> CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS.
> 
> > On x86 we should not doing anything.
> 
> We don't do anything if CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS is
> defined.

Thanks !

BTW, comment is outdated, as it mentions __skb_push(), while
its not used.



--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Johannes Berg - Feb. 20, 2013, 8:27 p.m.
On Wed, 2013-02-20 at 12:14 -0800, Eric Dumazet wrote:
> On Wed, 2013-02-20 at 20:58 +0100, Johannes Berg wrote:
> 
> > Sure: net/mac80211/rx.c ~line 1900, look for #ifndef
> > CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS.
> > 
> > > On x86 we should not doing anything.
> > 
> > We don't do anything if CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS is
> > defined.
> 
> Thanks !

:-)
Hey, I introduced that config symbol back then for precisely this
reason ;-)

> BTW, comment is outdated, as it mentions __skb_push(), while
> its not used.

Oops, good point, now I wonder what it did actually refer to. Time to
dig in git history, I'll fix it.

johannes

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Patch

diff --git a/drivers/net/wireless/iwlwifi/dvm/rx.c b/drivers/net/wireless/iwlwifi/dvm/rx.c
index a4eed20..77a3ee3 100644
--- a/drivers/net/wireless/iwlwifi/dvm/rx.c
+++ b/drivers/net/wireless/iwlwifi/dvm/rx.c
@@ -750,7 +750,12 @@  static void iwlagn_pass_packet_to_mac80211(struct iwl_priv *priv,
 	/* Dont use dev_alloc_skb(), we'll have enough headroom once
 	 * ieee80211_hdr pulled.
 	 */
-	skb = alloc_skb(128, GFP_ATOMIC);
+	fraglen = 128;
+	/* if we use order-1 pages, copy to get better TCP performance */
+	if (rxb->truesize > PAGE_SIZE)
+		fraglen = max_t(unsigned, fraglen, len);
+
+	skb = alloc_skb(fraglen, GFP_ATOMIC);
 	if (!skb) {
 		IWL_ERR(priv, "alloc_skb failed\n");
 		return;