
[net-next,10/10] net: hns3: Add mqprio support when interacting with network stack

Message ID 1505992913-107256-11-git-send-email-linyunsheng@huawei.com
State Changes Requested, archived
Delegated to: David Miller
Series Add support for DCB feature in hns3 driver

Commit Message

Yunsheng Lin Sept. 21, 2017, 11:21 a.m. UTC
When using tc qdisc to configure DCB parameters, dcb_ops->setup_tc
is used to tell the hclge_dcb module to do the setup.
When using lldptool to configure DCB parameters, the hclge_dcb module
calls client_ops->setup_tc to tell the network stack which queues
and priorities are used for each specific tc.

Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
---
 .../net/ethernet/hisilicon/hns3/hns3pf/hns3_enet.c | 135 +++++++++++++++++----
 1 file changed, 111 insertions(+), 24 deletions(-)

Comments

Jiri Pirko Sept. 22, 2017, 12:55 p.m. UTC | #1
Thu, Sep 21, 2017 at 01:21:53PM CEST, linyunsheng@huawei.com wrote:
>When using tc qdisc to configure DCB parameters, dcb_ops->setup_tc
>is used to tell the hclge_dcb module to do the setup.
>When using lldptool to configure DCB parameters, the hclge_dcb module
>calls client_ops->setup_tc to tell the network stack which queues
>and priorities are used for each specific tc.
>
>Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>

[...]
>-static int hns3_setup_tc(struct net_device *netdev, u8 tc)
>+static int hns3_setup_tc(struct net_device *netdev, u8 tc, u8 *prio_tc)
> {
> 	struct hns3_nic_priv *priv = netdev_priv(netdev);
> 	struct hnae3_handle *h = priv->ae_handle;
> 	struct hnae3_knic_private_info *kinfo = &h->kinfo;
>+	bool if_running = netif_running(netdev);
> 	unsigned int i;
> 	int ret;
> 
> 	if (tc > HNAE3_MAX_TC)
> 		return -EINVAL;
> 
>-	if (kinfo->num_tc == tc)
>-		return 0;
>-
> 	if (!netdev)
> 		return -EINVAL;
> 
>-	if (!tc) {
>+	if (if_running) {
>+		(void)hns3_nic_net_stop(netdev);
>+		msleep(100);
>+	}
>+
>+	ret = (kinfo->dcb_ops && kinfo->dcb_ops->setup_tc) ?
>+		kinfo->dcb_ops->setup_tc(h, tc, prio_tc) : -EOPNOTSUPP;

This is most odd. Why do you call dcb_ops from ndo_setup_tc callback?
Why are you mixing this together? prio->tc mapping can be done
directly in dcbnl
Jiri Pirko Sept. 22, 2017, 4:03 p.m. UTC | #2
Fri, Sep 22, 2017 at 04:11:51PM CEST, linyunsheng@huawei.com wrote:
>Hi, Jiri
>
>>>-	if (!tc) {
>>>+	if (if_running) {
>>>+		(void)hns3_nic_net_stop(netdev);
>>>+		msleep(100);
>>>+	}
>>>+
>>>+	ret = (kinfo->dcb_ops && kinfo->dcb_ops->setup_tc) ?
>>>+		kinfo->dcb_ops->setup_tc(h, tc, prio_tc) : -EOPNOTSUPP;
>
>>This is most odd. Why do you call dcb_ops from ndo_setup_tc callback?
>>Why are you mixing this together? prio->tc mapping can be done
>>directly in dcbnl
>
>Here is what we do in dcb_ops->setup_tc:
>Firstly, if the current tc num is different from the tc num
>that the user provides, then we set up the queues for each
>tc.
>
>Secondly, we tell the hardware the pri-to-tc mapping that
>the stack is using. In the rx direction, our hardware needs
>that mapping to put different packets into different TCs'
>queues according to the priority of the packet; then
>RSS decides which specific queue in the tc the packet
>should go to.
>
>By mixing, I suppose you meant why we need the
>pri-to-tc information?

by mixing, I mean what I wrote. You are calling dcb_ops callback from
ndo_setup_tc callback. So you are mixing DCBNL subsystem and TC
subsystem. Why? Why do you need sch_mqprio? Why DCBNL is not enough for
all?



>I hope I did not misunderstand your question, thanks
>for your time reviewing.
Yunsheng Lin Sept. 23, 2017, 12:47 a.m. UTC | #3
Hi, Jiri

On 2017/9/23 0:03, Jiri Pirko wrote:
[...]
> by mixing, I mean what I wrote. You are calling dcb_ops callback from
> ndo_setup_tc callback. So you are mixing DCBNL subsystem and TC
> subsystem. Why? Why do you need sch_mqprio? Why DCBNL is not enough for
> all?

When using lldptool, dcbnl is involved.

But when using tc qdisc, dcbnl is not involved. Below are a few key
steps in the call graph in the kernel when the tc qdisc cmd is executed.

cmd:
tc qdisc add dev eth0 root handle 1:0 mqprio num_tc 4 map 1 2 3 3 1 3 1 1 hw 1

call graph:
rtnetlink_rcv_msg -> tc_modify_qdisc -> qdisc_create -> mqprio_init ->
hns3_nic_setup_tc

When hns3_nic_setup_tc is called, we need to know the tc num and the
prio_tc mapping from the tc_mqprio_qopt which is provided as a parameter
of the ndo_setup_tc function, and dcb_ops is our hardware-specific
method to set up the tc related parameters in the hardware, so this is why
we call the dcb_ops callback in the ndo_setup_tc callback.

I hope this will answer your question, thanks for your time.

Jiri Pirko Sept. 24, 2017, 11:37 a.m. UTC | #4
Sat, Sep 23, 2017 at 02:47:20AM CEST, linyunsheng@huawei.com wrote:
>Hi, Jiri
>
[...]
>When using lldptool, dcbnl is involved.
>
>But when using tc qdisc, dcbnl is not involved. Below are a few key
>steps in the call graph in the kernel when the tc qdisc cmd is executed.
>
>cmd:
>tc qdisc add dev eth0 root handle 1:0 mqprio num_tc 4 map 1 2 3 3 1 3 1 1 hw 1
>
>call graph:
>rtnetlink_rcv_msg -> tc_modify_qdisc -> qdisc_create -> mqprio_init ->
>hns3_nic_setup_tc
>
>When hns3_nic_setup_tc is called, we need to know the tc num and the
>prio_tc mapping from the tc_mqprio_qopt which is provided as a parameter
>of the ndo_setup_tc function, and dcb_ops is our hardware-specific
>method to set up the tc related parameters in the hardware, so this is why
>we call the dcb_ops callback in the ndo_setup_tc callback.
>
>I hope this will answer your question, thanks for your time.

Okay. I understand that you have a usecase for mqprio mapping offload
without lldptool being involved. Ok. I believe it is wrong to call dcb_ops
from tc callback. You should have a generic layer inside the driver and
call it from both dcb_ops and tc callbacks.

Also, what happens if I run lldptool concurrently with mqprio? Who wins
and is going to configure the mapping?


Yunsheng Lin Sept. 25, 2017, 12:45 a.m. UTC | #5
Hi, Jiri

On 2017/9/24 19:37, Jiri Pirko wrote:
[...]
> Okay. I understand that you have a usecase for mqprio mapping offload
> without lldptool being involved. Ok. I believe it is wrong to call dcb_ops
> from tc callback. You should have a generic layer inside the driver and
> call it from both dcb_ops and tc callbacks.

Actually, dcb_ops is our generic layer inside the driver.
Below is high level architecture:

       [ tc qdisc ]	       [ lldpad ]
             |                     |
             |                     |
             |                     |
       [ hns3_enet ]        [ hns3_dcbnl ]
             \                    /
                \              /
                   \        /
                 [ hclge_dcb ]
                   /      \
                /            \
             /                  \
     [ hclge_main ]        [ hclge_tm ]

hns3_enet.c implements the ndo_setup_tc callback.
hns3_dcbnl.c implements the dcbnl_rtnl_ops for the stack's DCBNL subsystem.
hclge_dcb implements the dcb_ops.
So we already have a generic layer that both tc and dcbnl call into.
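
To make that layer concrete, the two dcb_ops hooks this patch uses have
roughly the following shape (the struct and member names here are inferred
from the call sites in this patch; the real ops struct in the driver has
more members):

	struct hnae3_dcb_ops {
		/* tc/mqprio path: called from ndo_setup_tc to program the
		 * tc num and the prio_tc map into the hardware
		 */
		int (*setup_tc)(struct hnae3_handle *handle, u8 tc, u8 *prio_tc);

		/* lldpad/dcbnl path: called from client_ops->setup_tc to
		 * refresh kinfo's tc/queue info before it is pushed to the stack
		 */
		int (*map_update)(struct hnae3_handle *handle);
	};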

> 
> Also, what happens If I run lldptool concurrently with mqprio? Who wins
> and is going to configure the mapping?

Both the lldptool and tc qdisc cmds use the rtnl interface provided by the
stack, so they are both protected by rtnl_lock and we do not have to do the
locking in the driver.

The locking is in rtnetlink_rcv_msg:

	rtnl_lock();
	handlers = rtnl_dereference(rtnl_msg_handlers[family]);
	if (handlers) {
		doit = READ_ONCE(handlers[type].doit);
		if (doit)
			err = doit(skb, nlh, extack);
	}
	rtnl_unlock();

Thanks.

Jiri Pirko Sept. 25, 2017, 6:57 a.m. UTC | #6
Mon, Sep 25, 2017 at 02:45:08AM CEST, linyunsheng@huawei.com wrote:
>Hi, Jiri
>
[...]
>> Also, what happens If I run lldptool concurrently with mqprio? Who wins
>> and is going to configure the mapping?
>
>Both the lldptool and tc qdisc cmds use the rtnl interface provided by the
>stack, so they are both protected by rtnl_lock and we do not have to do the
>locking in the driver.

I was not asking about locking, which is obvious, I was asking about the
behaviour. Like for example:
If I use tc to configure some mapping, later on I run lldptool and change
the mapping. Does the tc dump show the updated values or the original
ones?

Yunsheng Lin Sept. 25, 2017, 7:22 a.m. UTC | #7
Hi, Jiri

On 2017/9/25 14:57, Jiri Pirko wrote:
[...]
> I was not asking about locking, which is obvious, I was asking about the
> behaviour. Like for example:
> If I use tc to configure some mapping, later on I run lldptool and change
> the mapping. Does the tc dump show the updated values or the original
> ones?

In that case, tc dump shows the updated values.
Normally, we use tc qdisc to configure the netdev to use mqprio, and
then use lldptool to configure the tc_prio mapping, tc schedule mode,
tc bandwidth and pfc option.

If lldptool changes the tc num and tc_prio mapping, it tells the tc
system through the following functions, which are called in client_ops->setup_tc:
netdev_set_tc_queue
netdev_set_prio_tc_map

So lldptool and tc qdisc can work together.
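
A condensed view of that path, taken from hns3_client_setup_tc() in this
patch (error handling and the netdev stop/start around it are omitted):

	netdev_set_num_tc(ndev, tc);

	/* per-TC queue count/offset as reported by the hclge layer */
	for (i = 0; i < HNAE3_MAX_TC; i++) {
		struct hnae3_tc_info *tc_info = &kinfo->tc_info[i];

		if (tc_info->enable)
			netdev_set_tc_queue(ndev, tc_info->tc,
					    tc_info->tqp_count,
					    tc_info->tqp_offset);
	}

	/* prio -> tc mapping configured via lldptool */
	for (i = 0; i < HNAE3_MAX_USER_PRIO; i++)
		netdev_set_prio_tc_map(ndev, i, kinfo->prio_tc[i]);

	/* resize the real tx/rx queue count (rss_size * num_tc) */
	hns3_nic_set_real_num_queue(ndev);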




Patch

diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hns3_enet.c b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hns3_enet.c
index 11dab26..31fcda4 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hns3_enet.c
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hns3_enet.c
@@ -196,6 +196,32 @@  static void hns3_vector_gl_rl_init(struct hns3_enet_tqp_vector *tqp_vector)
 	tqp_vector->tx_group.flow_level = HNS3_FLOW_LOW;
 }
 
+static int hns3_nic_set_real_num_queue(struct net_device *netdev)
+{
+	struct hns3_nic_priv *priv = netdev_priv(netdev);
+	struct hnae3_handle *h = priv->ae_handle;
+	struct hnae3_knic_private_info *kinfo = &h->kinfo;
+	unsigned int queue_size = kinfo->rss_size * kinfo->num_tc;
+	int ret;
+
+	ret = netif_set_real_num_tx_queues(netdev, queue_size);
+	if (ret) {
+		netdev_err(netdev,
+			   "netif_set_real_num_tx_queues fail, ret=%d!\n",
+			   ret);
+		return ret;
+	}
+
+	ret = netif_set_real_num_rx_queues(netdev, queue_size);
+	if (ret) {
+		netdev_err(netdev,
+			   "netif_set_real_num_rx_queues fail, ret=%d!\n", ret);
+		return ret;
+	}
+
+	return 0;
+}
+
 static int hns3_nic_net_up(struct net_device *netdev)
 {
 	struct hns3_nic_priv *priv = netdev_priv(netdev);
@@ -232,26 +258,13 @@  static int hns3_nic_net_up(struct net_device *netdev)
 
 static int hns3_nic_net_open(struct net_device *netdev)
 {
-	struct hns3_nic_priv *priv = netdev_priv(netdev);
-	struct hnae3_handle *h = priv->ae_handle;
 	int ret;
 
 	netif_carrier_off(netdev);
 
-	ret = netif_set_real_num_tx_queues(netdev, h->kinfo.num_tqps);
-	if (ret) {
-		netdev_err(netdev,
-			   "netif_set_real_num_tx_queues fail, ret=%d!\n",
-			   ret);
-		return ret;
-	}
-
-	ret = netif_set_real_num_rx_queues(netdev, h->kinfo.num_tqps);
-	if (ret) {
-		netdev_err(netdev,
-			   "netif_set_real_num_rx_queues fail, ret=%d!\n", ret);
+	ret = hns3_nic_set_real_num_queue(netdev);
+	if (ret)
 		return ret;
-	}
 
 	ret = hns3_nic_net_up(netdev);
 	if (ret) {
@@ -1193,32 +1206,40 @@  static void hns3_nic_udp_tunnel_del(struct net_device *netdev,
 	}
 }
 
-static int hns3_setup_tc(struct net_device *netdev, u8 tc)
+static int hns3_setup_tc(struct net_device *netdev, u8 tc, u8 *prio_tc)
 {
 	struct hns3_nic_priv *priv = netdev_priv(netdev);
 	struct hnae3_handle *h = priv->ae_handle;
 	struct hnae3_knic_private_info *kinfo = &h->kinfo;
+	bool if_running = netif_running(netdev);
 	unsigned int i;
 	int ret;
 
 	if (tc > HNAE3_MAX_TC)
 		return -EINVAL;
 
-	if (kinfo->num_tc == tc)
-		return 0;
-
 	if (!netdev)
 		return -EINVAL;
 
-	if (!tc) {
+	if (if_running) {
+		(void)hns3_nic_net_stop(netdev);
+		msleep(100);
+	}
+
+	ret = (kinfo->dcb_ops && kinfo->dcb_ops->setup_tc) ?
+		kinfo->dcb_ops->setup_tc(h, tc, prio_tc) : -EOPNOTSUPP;
+	if (ret)
+		goto err_out;
+
+	if (tc <= 1) {
 		netdev_reset_tc(netdev);
-		return 0;
+		goto out;
 	}
 
 	/* Set num_tc for netdev */
 	ret = netdev_set_num_tc(netdev, tc);
 	if (ret)
-		return ret;
+		goto err_out;
 
 	/* Set per TC queues for the VSI */
 	for (i = 0; i < HNAE3_MAX_TC; i++) {
@@ -1229,7 +1250,14 @@  static int hns3_setup_tc(struct net_device *netdev, u8 tc)
 					    kinfo->tc_info[i].tqp_offset);
 	}
 
-	return 0;
+out:
+	ret = hns3_nic_set_real_num_queue(netdev);
+
+err_out:
+	if (if_running)
+		(void)hns3_nic_net_open(netdev);
+
+	return ret;
 }
 
 static int hns3_nic_setup_tc(struct net_device *dev, enum tc_setup_type type,
@@ -1240,7 +1268,7 @@  static int hns3_nic_setup_tc(struct net_device *dev, enum tc_setup_type type,
 	if (type != TC_SETUP_MQPRIO)
 		return -EOPNOTSUPP;
 
-	return hns3_setup_tc(dev, mqprio->num_tc);
+	return hns3_setup_tc(dev, mqprio->num_tc, mqprio->prio_tc_map);
 }
 
 static int hns3_vlan_rx_add_vid(struct net_device *netdev,
@@ -2848,10 +2876,69 @@  static void hns3_link_status_change(struct hnae3_handle *handle, bool linkup)
 	}
 }
 
+static int hns3_client_setup_tc(struct hnae3_handle *handle, u8 tc)
+{
+	struct hnae3_knic_private_info *kinfo = &handle->kinfo;
+	struct net_device *ndev = kinfo->netdev;
+	bool if_running = netif_running(ndev);
+	int ret;
+	u8 i;
+
+	if (tc > HNAE3_MAX_TC)
+		return -EINVAL;
+
+	if (!ndev)
+		return -ENODEV;
+
+	ret = netdev_set_num_tc(ndev, tc);
+	if (ret)
+		return ret;
+
+	if (if_running) {
+		(void)hns3_nic_net_stop(ndev);
+		msleep(100);
+	}
+
+	ret = (kinfo->dcb_ops && kinfo->dcb_ops->map_update) ?
+		kinfo->dcb_ops->map_update(handle) : -EOPNOTSUPP;
+	if (ret)
+		goto err_out;
+
+	if (tc <= 1) {
+		netdev_reset_tc(ndev);
+		goto out;
+	}
+
+	for (i = 0; i < HNAE3_MAX_TC; i++) {
+		struct hnae3_tc_info *tc_info = &kinfo->tc_info[i];
+
+		if (tc_info->enable)
+			netdev_set_tc_queue(ndev,
+					    tc_info->tc,
+					    tc_info->tqp_count,
+					    tc_info->tqp_offset);
+	}
+
+	for (i = 0; i < HNAE3_MAX_USER_PRIO; i++) {
+		netdev_set_prio_tc_map(ndev, i,
+				       kinfo->prio_tc[i]);
+	}
+
+out:
+	ret = hns3_nic_set_real_num_queue(ndev);
+
+err_out:
+	if (if_running)
+		(void)hns3_nic_net_open(ndev);
+
+	return ret;
+}
+
 const struct hnae3_client_ops client_ops = {
 	.init_instance = hns3_client_init,
 	.uninit_instance = hns3_client_uninit,
 	.link_status_change = hns3_link_status_change,
+	.setup_tc = hns3_client_setup_tc,
 };
 
 /* hns3_init_module - Driver registration routine