mbox series

[net-next,v3,0/3] net_sched: reflect tx_queue_len change for pfifo_fast

Message ID 20180126022624.20442-1-xiyou.wangcong@gmail.com
Headers show
Series net_sched: reflect tx_queue_len change for pfifo_fast | expand

Message

Cong Wang Jan. 26, 2018, 2:26 a.m. UTC
This pathcset restores the pfifo_fast qdisc behavior of dropping
packets based on latest dev->tx_queue_len. Patch 1 introduces
a helper, patch 2 introduces a new Qdisc ops which is called when
we modify tx_queue_len, patch 3 implements this ops for pfifo_fast.

Please see each patch for details.

---
v3: use skb_array_resize_multiple()
v2: handle error case for ->change_tx_queue_len()

Cong Wang (3):
  net: introduce helper dev_change_tx_queue_len()
  net_sched: plug in qdisc ops change_tx_queue_len
  net_sched: implement ->change_tx_queue_len() for pfifo_fast

 include/linux/netdevice.h |  1 +
 include/net/sch_generic.h |  2 ++
 net/core/dev.c            | 29 +++++++++++++++++++++++++++
 net/core/net-sysfs.c      | 25 +----------------------
 net/core/rtnetlink.c      | 18 +++++------------
 net/sched/sch_generic.c   | 51 +++++++++++++++++++++++++++++++++++++++++++++++
 6 files changed, 89 insertions(+), 37 deletions(-)

Comments

John Fastabend Jan. 29, 2018, 5:35 a.m. UTC | #1
On 01/25/2018 06:26 PM, Cong Wang wrote:
> This pathcset restores the pfifo_fast qdisc behavior of dropping
> packets based on latest dev->tx_queue_len. Patch 1 introduces
> a helper, patch 2 introduces a new Qdisc ops which is called when
> we modify tx_queue_len, patch 3 implements this ops for pfifo_fast.
> 
> Please see each patch for details.
> 

Overall this series is better than what we have at the moment, but
a better fix would preallocate the memory, to avoid ENOMEM errors,
and add a ptr_ring API to use the preallocated buffers.

We have time (its only in net-next) so lets do the complete fix
rather than this series IMO.

.John

> ---
> v3: use skb_array_resize_multiple()
> v2: handle error case for ->change_tx_queue_len()
> 
> Cong Wang (3):
>   net: introduce helper dev_change_tx_queue_len()
>   net_sched: plug in qdisc ops change_tx_queue_len
>   net_sched: implement ->change_tx_queue_len() for pfifo_fast
> 
>  include/linux/netdevice.h |  1 +
>  include/net/sch_generic.h |  2 ++
>  net/core/dev.c            | 29 +++++++++++++++++++++++++++
>  net/core/net-sysfs.c      | 25 +----------------------
>  net/core/rtnetlink.c      | 18 +++++------------
>  net/sched/sch_generic.c   | 51 +++++++++++++++++++++++++++++++++++++++++++++++
>  6 files changed, 89 insertions(+), 37 deletions(-)
>
Cong Wang Jan. 29, 2018, 5:57 a.m. UTC | #2
On Sun, Jan 28, 2018 at 9:35 PM, John Fastabend
<john.fastabend@gmail.com> wrote:
> On 01/25/2018 06:26 PM, Cong Wang wrote:
>> This pathcset restores the pfifo_fast qdisc behavior of dropping
>> packets based on latest dev->tx_queue_len. Patch 1 introduces
>> a helper, patch 2 introduces a new Qdisc ops which is called when
>> we modify tx_queue_len, patch 3 implements this ops for pfifo_fast.
>>
>> Please see each patch for details.
>>
>
> Overall this series is better than what we have at the moment, but
> a better fix would preallocate the memory, to avoid ENOMEM errors,

I am not against for better ENOMEM error handling, but I still have to
remind you that attach_one_default_qdisc() doesn't handle it either.
Since no one complained about it, why this one is so special?


> and add a ptr_ring API to use the preallocated buffers.


What ptr_ring API could cure netdev tx queues problem here?
Looks like you still don't understand the problem here.


>
> We have time (its only in net-next) so lets do the complete fix
> rather than this series IMO.
>

Why not just accept this and complete the error handling later
given the fact that I already add a TODO? IOW, why it is now?
Cong Wang Jan. 29, 2018, 6:01 a.m. UTC | #3
On Sun, Jan 28, 2018 at 9:35 PM, John Fastabend
<john.fastabend@gmail.com> wrote:
> We have time (its only in net-next) so lets do the complete fix
> rather than this series IMO.

Also have to remind you: this patchset fixes a regression introduced
by your patchset in net-next, it is not a new feature. I don't think it is
a serious regression, but it is still one...
John Fastabend Jan. 29, 2018, 6:09 a.m. UTC | #4
On 01/28/2018 09:57 PM, Cong Wang wrote:
> On Sun, Jan 28, 2018 at 9:35 PM, John Fastabend
> <john.fastabend@gmail.com> wrote:
>> On 01/25/2018 06:26 PM, Cong Wang wrote:
>>> This pathcset restores the pfifo_fast qdisc behavior of dropping
>>> packets based on latest dev->tx_queue_len. Patch 1 introduces
>>> a helper, patch 2 introduces a new Qdisc ops which is called when
>>> we modify tx_queue_len, patch 3 implements this ops for pfifo_fast.
>>>
>>> Please see each patch for details.
>>>
>>
>> Overall this series is better than what we have at the moment, but
>> a better fix would preallocate the memory, to avoid ENOMEM errors,
> 
> I am not against for better ENOMEM error handling, but I still have to
> remind you that attach_one_default_qdisc() doesn't handle it either.
> Since no one complained about it, why this one is so special?

Its not we should fix both cases. And also clean up the multiple
net sync calls in these paths as well.

> 
> 
>> and add a ptr_ring API to use the preallocated buffers.
> 
> 
> What ptr_ring API could cure netdev tx queues problem here?
> Looks like you still don't understand the problem here.
> 

The resize multiple array API can only fail due to alloc failures.
We need to break this API into two pieces preallocate the memory
and then commit array changes. Alternatively the qdisc layer could
allocate new arrays and then swap them after all the arrays been
initialized correctly using ptr_ring APIs below the ptr_ring
resize API calls.

Having ptr_ring API operations to support this seems best.

> 
>>
>> We have time (its only in net-next) so lets do the complete fix
>> rather than this series IMO.
>>
> 
> Why not just accept this and complete the error handling later
> given the fact that I already add a TODO? IOW, why it is now?
> 

Because its not correct and its not too much work to get it so
the error is avoided.

I don't mind if the patches go in as-is a follow up patch can
also fix the TODO item.

.John
Cong Wang Jan. 29, 2018, 6:25 a.m. UTC | #5
On Sun, Jan 28, 2018 at 10:09 PM, John Fastabend
<john.fastabend@gmail.com> wrote:
> On 01/28/2018 09:57 PM, Cong Wang wrote:
>> On Sun, Jan 28, 2018 at 9:35 PM, John Fastabend
>> <john.fastabend@gmail.com> wrote:
>>> On 01/25/2018 06:26 PM, Cong Wang wrote:
>>>> This pathcset restores the pfifo_fast qdisc behavior of dropping
>>>> packets based on latest dev->tx_queue_len. Patch 1 introduces
>>>> a helper, patch 2 introduces a new Qdisc ops which is called when
>>>> we modify tx_queue_len, patch 3 implements this ops for pfifo_fast.
>>>>
>>>> Please see each patch for details.
>>>>
>>>
>>> Overall this series is better than what we have at the moment, but
>>> a better fix would preallocate the memory, to avoid ENOMEM errors,
>>
>> I am not against for better ENOMEM error handling, but I still have to
>> remind you that attach_one_default_qdisc() doesn't handle it either.
>> Since no one complained about it, why this one is so special?
>
> Its not we should fix both cases. And also clean up the multiple
> net sync calls in these paths as well.

Now can we agree error handling can be improved later? You
already agree this is not a new problem introduced by this patchset,
why do you want to block a regression fix just because of an old
problem which I will fix later?


>
>>
>>
>>> and add a ptr_ring API to use the preallocated buffers.
>>
>>
>> What ptr_ring API could cure netdev tx queues problem here?
>> Looks like you still don't understand the problem here.
>>
>
> The resize multiple array API can only fail due to alloc failures.
> We need to break this API into two pieces preallocate the memory
> and then commit array changes. Alternatively the qdisc layer could
> allocate new arrays and then swap them after all the arrays been
> initialized correctly using ptr_ring APIs below the ptr_ring
> resize API calls.
>
> Having ptr_ring API operations to support this seems best.
>


Apparently qdisc layer doesn't care about ptr_ring, as you describe
here this potentially needs to change two layers: 1) qdisc layer
2) ptr_ring API. It is more work than just a simple error handling,
potentially bigger than this patchset itself.


>>
>>>
>>> We have time (its only in net-next) so lets do the complete fix
>>> rather than this series IMO.
>>>
>>
>> Why not just accept this and complete the error handling later
>> given the fact that I already add a TODO? IOW, why it is now?
>>
>
> Because its not correct and its not too much work to get it so
> the error is avoided.

So you must want a fix for net too, because attach_one_default_qdisc()
"is not correct" either and "it is not too much work".

I can't agree on any of your claims here. Sorry.
David Miller Jan. 29, 2018, 5:43 p.m. UTC | #6
From: Cong Wang <xiyou.wangcong@gmail.com>
Date: Thu, 25 Jan 2018 18:26:21 -0800

> This pathcset restores the pfifo_fast qdisc behavior of dropping
> packets based on latest dev->tx_queue_len. Patch 1 introduces
> a helper, patch 2 introduces a new Qdisc ops which is called when
> we modify tx_queue_len, patch 3 implements this ops for pfifo_fast.
> 
> Please see each patch for details.
> 
> ---
> v3: use skb_array_resize_multiple()
> v2: handle error case for ->change_tx_queue_len()

Series applied, thanks Cong.

Please follow up with John about making the queue allocation use
a prepare + commit phase.

And as for the TX queue state handling on change, I think it's
fine to purge the whole queue.  That is definitely better than
allowing the queue length to be visibly over the tx_queue_len
setting.

Thank you.
Cong Wang Jan. 30, 2018, 12:12 a.m. UTC | #7
On Mon, Jan 29, 2018 at 9:43 AM, David Miller <davem@davemloft.net> wrote:
>
> Please follow up with John about making the queue allocation use
> a prepare + commit phase.

Will do it once net-next is re-open.


>
> And as for the TX queue state handling on change, I think it's
> fine to purge the whole queue.  That is definitely better than
> allowing the queue length to be visibly over the tx_queue_len
> setting.
>

OK. Thanks.