Message ID | 1339595401.11011.48.camel@shinybook.infradead.org |
---|---|
State | RFC, archived |
Delegated to: | David Miller |
Headers | show |
On Wed, Jun 13, 2012 at 02:50:01PM +0100, David Woodhouse wrote: > On Wed, 2012-06-13 at 10:57 +0100, David Woodhouse wrote: > > This doesn't look *so* evil... if the basic concept of using > > skb_orphan() and then setting our own destructor is OK, then I'll work > > out the rest of the details and do it for l2tp too. > > Stupid dwmw2. With patch this time... Does this actually work? Could the skb not end up sitting on the receive queue of a user socket indefinitely, deferring all further transmits? From an ISP point of view, PPPoE and L2TP are most typically used on links where the congestion point is not the local interface the packets are being pumped into (think of the vast majority of ethernet based DSL modems), and this kind of transmit overhead is a pure waste of CPU cycles. The only solution that generically works in most PPPoE/L2TP situations is to shape outgoing traffic to the speed of limiting link. Maybe there's a PPP extension that does flow control... -ben -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, 2012-06-13 at 11:55 -0400, Benjamin LaHaise wrote: > Does this actually work? Could the skb not end up sitting on the > receive queue of a user socket indefinitely, deferring all further > transmits? From an ISP point of view, I haven't tried it; only compiled it. Certainly, the similar approach in PPPoATM in commit 9d02daf7 *does* work for limiting the bufferbloat and keeping the queues under control. And it'll let me do BQL for PPPoA. I'm looking at this from the client side, not the ISP side. And in that case the local interface *is* the bottleneck. When it's a PPPoE over br2684 interface and it's full, we should stop the PPP netdev from spewing packets at it, rather than just dropping them. On the ISP side if the skb ends up sitting on a receive queue of a user socket, and nothing is servicing that socket, surely the transmits on this channel weren't happening anyway?
On Wed, Jun 13, 2012 at 05:11:34PM +0100, David Woodhouse wrote: > On Wed, 2012-06-13 at 11:55 -0400, Benjamin LaHaise wrote: > > Does this actually work? Could the skb not end up sitting on the > > receive queue of a user socket indefinitely, deferring all further > > transmits? From an ISP point of view, > > I haven't tried it; only compiled it. Certainly, the similar approach in > PPPoATM in commit 9d02daf7 *does* work for limiting the bufferbloat and > keeping the queues under control. And it'll let me do BQL for PPPoA. > > I'm looking at this from the client side, not the ISP side. And in that > case the local interface *is* the bottleneck. When it's a PPPoE over > br2684 interface and it's full, we should stop the PPP netdev from > spewing packets at it, rather than just dropping them. I would contend that PPPoE over br2684 is not the common case. The vast majority of users in client mode are going to be using PPPoE over an ethernet link to a DSL modem (or cable or wireless radios even). Just look at what DSL modems are available for users in computer stores / what ISPs actually ship to their users. Real ATM exposing devices are rare. > On the ISP side if the skb ends up sitting on a receive queue of a user > socket, and nothing is servicing that socket, surely the transmits on > this channel weren't happening anyway? True, but it's a design issue we've had to contend with elsewhere in the various tunnelling protocols. Don't get me wrong: I am very much in favour of intelligent queue management, but this approach simply does not work for the vast majority of PPPoE users, while it adds overhead that will negatively impact access concentrators. If you can somehow restrict the overhead to only impacting your use-case, that would be an improvement. -ben
> I would contend that PPPoE over br2684 is not the common case. The vast > majority of users in client mode are going to be using PPPoE over an > ethernet link to a DSL modem (or cable or wireless radios even). Just look > at what DSL modems are available for users in computer stores / what ISPs > actually ship to their users. Real ATM exposing devices are rare. PPPoA is common in the UK. David -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, 2012-06-13 at 12:31 -0400, Benjamin LaHaise wrote: > I would contend that PPPoE over br2684 is not the common case. The vast > majority of users in client mode are going to be using PPPoE over an > ethernet link to a DSL modem (or cable or wireless radios even). Just look > at what DSL modems are available for users in computer stores / what ISPs > actually ship to their users. Real ATM exposing devices are rare. I'm looking at the class of device on which OpenWRT runs. Linux is *on* the router with the ADSL port, not connected to it via Ethernet. I can buy lots of these in the shop. Anything that's an ADSL *router* rather than *modem* is likely to be running, or at least capable of running, Linux. Admittedly, if you have access to the native ADSL interface then you'd do a lot better to run PPPoA — but I already fixed this issue for PPPoA. There are people in some parts of the world who are using PPPoEoA and putting up with the resulting MTU issues because the *ISP* doesn't support proper PPPoA. And even if it *were* rare... this is the case that *should* work best, where we have complete control of the hardware. There's no excuse for the behaviour that we currently see with PPPoE on BR2684. > > On the ISP side if the skb ends up sitting on a receive queue of a user > > socket, and nothing is servicing that socket, surely the transmits on > > this channel weren't happening anyway? > > True, but it's a design issue we've had to contend with elsewhere in the > various tunnelling protocols. > > Don't get me wrong: I am very much in favour of intelligent queue > management, but this approach simply does not work for the vast majority > of PPPoE users, while it adds overhead that will negatively impact access > concentrators. I think that's largely true of BQL in general, isn't it? That's OK; it's a config option. I suspect if I make this accounting of PPPoE / L2TP packets conditional on BQL (or perhaps on a separate config option PPP_BQL) that ought to address your concern about the cases where you don't need it?
On Wed, 2012-06-13 at 17:32 +0100, David Laight wrote: > > I would contend that PPPoE over br2684 is not the common case. The vast > > majority of users in client mode are going to be using PPPoE over an > > ethernet link to a DSL modem (or cable or wireless radios even). Just look > > at what DSL modems are available for users in computer stores / what ISPs > > actually ship to their users. Real ATM exposing devices are rare. > > PPPoA is common in the UK. In the UK you tend to have the option of using PPPoA *or* PPPoE over BR2684. The ISP's kit will handle both. Ben's comment was about the *hardware*, though. If your "ADSL modem" is a separate box which just bridges Ethernet to BR2684 on the ADSL link, you're limited to using the PPPoE protocol over that. Obviously, if you have a proper ADSL *router* and it's not just a PPP bridge, then you can — and should — use PPPoA.
On Wed, Jun 13, 2012 at 05:53:03PM +0100, David Woodhouse wrote: > I'm looking at the class of device on which OpenWRT runs. Linux is *on* > the router with the ADSL port, not connected to it via Ethernet. Ah, yes, that's a worthwhile pursuit. > And even if it *were* rare... this is the case that *should* work best, > where we have complete control of the hardware. There's no excuse for > the behaviour that we currently see with PPPoE on BR2684. *nod* > I think that's largely true of BQL in general, isn't it? That's OK; it's > a config option. I suspect if I make this accounting of PPPoE / L2TP > packets conditional on BQL (or perhaps on a separate config option > PPP_BQL) that ought to address your concern about the cases where you > don't need it? That would help. On the whole question of PPPoE over intermediate ethernet links to ADSL modems, I think it would be possible to limit latency by implementing a sliding window clocked using LCP ECHO requests. Does this sound worthwhile implementing? What sort of queue depths are you looking at for the ATM devices you're working on? -ben
On Wed, 2012-06-13 at 13:21 -0400, Benjamin LaHaise wrote: > On the whole question of PPPoE over intermediate ethernet links to ADSL > modems, I think it would be possible to limit latency by implementing a > sliding window clocked using LCP ECHO requests. Does this sound worthwhile > implementing? Maybe, for someone who actually cares about those who are using separate ADSL modems, and doesn't think they should just better hardware if they care about the performance and queue management :) > What sort of queue depths are you looking at for the ATM devices > you're working on? We're managing the queue to keep it short. On the PPPoATM side, we (now) strictly limit the number of packets between the ppp_generic code and the ATM device. It's basically one in-flight and one waiting to be handed to the device in the TX done interrupt. PPP is designed to feed new packets with low latency, after all. The Solos ADSL device *used* to eat as many packets as it had internal RAM to store them in, acknowledging the "transmit" as soon as it had buffered them. I got Nathan to fix that some time ago, and the internal queue is fairly short now.
On Wed, Jun 13, 2012 at 02:50:01PM +0100, David Woodhouse wrote: > On Wed, 2012-06-13 at 10:57 +0100, David Woodhouse wrote: > > This doesn't look *so* evil... if the basic concept of using > > skb_orphan() and then setting our own destructor is OK, then I'll work > > out the rest of the details and do it for l2tp too. > > Stupid dwmw2. With patch this time... > +static void pppoe_skb_destructor(struct sk_buff *skb) > +{ > + struct sock *sk = skb->sk; > + struct pppox_sock *po = pppox_sk(sk); > + > + atomic_dec(&po->inflight); > + /* Schedule a call to ppp_output_wakeup(chan), if it was already blocked. > + Mind for race conditions with another CPU which is in pppoe_xmit() > + right now. See commit 9d02daf7 in pppoatm. */ > + sock_put(sk); > +} Umm, how does ppp_output_wakeup() actually get called? Paul. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu, 2012-06-14 at 16:18 +1000, Paul Mackerras wrote:
> Umm, how does ppp_output_wakeup() actually get called?
Not directly from the destructor; it'll need to be from a tasklet like
the PPPoATM code does it. And the same race conditions, and the same
handling of them which makes me slightly uneasy as it depends quite
intimately on ppp_generic's internal locking scheme, will apply.
That's what I meant when I said I'd work out the rest of the details...
On Thu, 2012-06-14 at 16:18 +1000, Paul Mackerras wrote:
> Umm, how does ppp_output_wakeup() actually get called?
In fact I'm thinking of eliminating ppp_output_wakeup() in the general
case.
The idea (and it is *just* an idea so far) is to introduce
ppp_sent_queue(), ppp_completed_queue() and ppp_reset_queue()¹ functions
which take a ppp_chan and map onto the corresponding netdev_* functions
for BQL.
Having done that, we should be able to trigger the wakeup automatically
from the ppp_completed_queue() function, and there's no need for channel
drivers to call ppp_output_wakeup() directly. Not only do we get proper
holistic queue length management, we also move the flow control into PPP
and get rid of the horrid dependency on internal PPP locking that's
documented in commit 9d02daf75², and which we'd have to address on the
PPPoX side too.
And the overhead that Ben is concerned about should be fairly minimal.
diff --git a/drivers/net/ppp/pppoe.c b/drivers/net/ppp/pppoe.c index cbf7047..ddaf156 100644 --- a/drivers/net/ppp/pppoe.c +++ b/drivers/net/ppp/pppoe.c @@ -689,6 +689,8 @@ static int pppoe_connect(struct socket *sock, struct sockaddr *uservaddr, sk->sk_state = PPPOX_CONNECTED; } + atomic_set(&po->inflight, -2); + po->num = sp->sa_addr.pppoe.sid; end: @@ -952,9 +954,34 @@ abort: * sends PPP frame over PPPoE socket * ***********************************************************************/ +static void pppoe_skb_destructor(struct sk_buff *skb) +{ + struct sock *sk = skb->sk; + struct pppox_sock *po = pppox_sk(sk); + + atomic_dec(&po->inflight); + /* Schedule a call to ppp_output_wakeup(chan), if it was already blocked. + Mind for race conditions with another CPU which is in pppoe_xmit() + right now. See commit 9d02daf7 in pppoatm. */ + sock_put(sk); +} + static int pppoe_xmit(struct ppp_channel *chan, struct sk_buff *skb) { struct sock *sk = (struct sock *)chan->private; + struct pppox_sock *po = pppox_sk(sk); + + if (!atomic_inc_not_zero(&po->inflight)) + return 0; + + /* mine! all mine! */ + skb_orphan(skb); + skb->destructor = pppoe_skb_destructor; + /* XXX: Are there other implications of setting this? Should we use ->cb? */ + skb->sk = sk; + + sock_hold(sk); + return __pppoe_xmit(sk, skb); } diff --git a/include/linux/if_pppox.h b/include/linux/if_pppox.h index 09c474c..339c75d 100644 --- a/include/linux/if_pppox.h +++ b/include/linux/if_pppox.h @@ -186,6 +186,7 @@ struct pppox_sock { /* struct sock must be the first member of pppox_sock */ struct sock sk; struct ppp_channel chan; + atomic_t inflight; struct pppox_sock *next; /* for hash table */ union { struct pppoe_opt pppoe;