Patchwork [net-next,3/3] net/mlx4_en: Enable open-lldp DCB support

login
register
mail settings
Submitter Or Gerlitz
Date April 4, 2013, 2:26 p.m.
Message ID <1365085574-12057-4-git-send-email-ogerlitz@mellanox.com>
Download mbox | patch
Permalink /patch/233846/
State Changes Requested
Delegated to: David Miller
Headers show

Comments

Or Gerlitz - April 4, 2013, 2:26 p.m.
From: Sagi Grimberg <sagig@mellanox.com>

The lldpad daemon queries the driver caps via the getcaps and getstate
routines. Added the prpoer dbcnl_ops entries to support that.

Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlx4/en_dcb_nl.c |   37 +++++++++++++++++++++++-
 drivers/net/ethernet/mellanox/mlx4/en_netdev.c |    2 +
 drivers/net/ethernet/mellanox/mlx4/mlx4_en.h   |    4 ++-
 3 files changed, 41 insertions(+), 2 deletions(-)
John Fastabend - April 4, 2013, 3:58 p.m.
On 4/4/2013 7:26 AM, Or Gerlitz wrote:
> From: Sagi Grimberg <sagig@mellanox.com>
>
> The lldpad daemon queries the driver caps via the getcaps and getstate
> routines. Added the prpoer dbcnl_ops entries to support that.
>
> Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
> ---

Does lldpad work now with the mlx4_en driver?

Reviewed-by: John Fastabend <john.r.fastabend@intel.com>

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Sagi Grimberg - April 4, 2013, 4:08 p.m.
On 4/4/2013 6:58 PM, John Fastabend wrote:
> On 4/4/2013 7:26 AM, Or Gerlitz wrote:
>> From: Sagi Grimberg <sagig@mellanox.com>
>>
>> The lldpad daemon queries the driver caps via the getcaps and getstate
>> routines. Added the prpoer dbcnl_ops entries to support that.
>>
>> Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
>> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
>> ---
>
> Does lldpad work now with the mlx4_en driver?
>
> Reviewed-by: John Fastabend <john.r.fastabend@intel.com>
>

Yes.

-Sagi
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
John Fastabend - April 4, 2013, 7:01 p.m.
On 4/4/2013 9:08 AM, Sagi Grimberg wrote:
> On 4/4/2013 6:58 PM, John Fastabend wrote:
>> On 4/4/2013 7:26 AM, Or Gerlitz wrote:
>>> From: Sagi Grimberg <sagig@mellanox.com>
>>>
>>> The lldpad daemon queries the driver caps via the getcaps and getstate
>>> routines. Added the prpoer dbcnl_ops entries to support that.
>>>
>>> Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
>>> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
>>> ---
>>
>> Does lldpad work now with the mlx4_en driver?
>>
>> Reviewed-by: John Fastabend <john.r.fastabend@intel.com>
>>
>
> Yes.
>
> -Sagi

So I looked at your code and lldpad a bit closer. I think this
just works around a bug in lldpad. Also with this patch lldpad
would try to xmit lldp pkts which would conflict with your firmware
agent right?

For IEEE DCBX modes lldpad should check the return values from
the getdcbx dcbnl_rtnl_ops which you already have filled out. I
would expect your driver to return DCB_CAP_DCBX_LLD_MANAGED to
indicate it is being managed by firmware.

See bnx2x_dcbnl_get_dcbx() for an example of an implementation
which I believe is correct. I'll fix the user space lldpad bug.

Looks like I added my reveiwed-by line too quickly.

Thanks,
.John
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Or Gerlitz - April 4, 2013, 9:18 p.m.
John Fastabend <john.r.fastabend@intel.com> wrote:
> On 4/4/2013 9:08 AM, Sagi Grimberg wrote:
>> On 4/4/2013 6:58 PM, John Fastabend wrote:
>>> On 4/4/2013 7:26 AM, Or Gerlitz wrote:

>>> Does lldpad work now with the mlx4_en driver?
>>> Reviewed-by: John Fastabend <john.r.fastabend@intel.com>

>> Yes.
>> -Sagi

> So I looked at your code and lldpad a bit closer. I think this
> just works around a bug in lldpad. Also with this patch lldpad
> would try to xmit lldp pkts which would conflict with your firmware agent right?

We don't use firmware agent


> For IEEE DCBX modes lldpad should check the return values from
> the getdcbx dcbnl_rtnl_ops which you already have filled out. I
> would expect your driver to return DCB_CAP_DCBX_LLD_MANAGED to
> indicate it is being managed by firmware.


> See bnx2x_dcbnl_get_dcbx() for an example of an implementation
> which I believe is correct. I'll fix the user space lldpad bug.

can you be more specific what's the user space bug?

> Looks like I added my reveiwed-by line too quickly.

With this patch I can get the following

# lldptool -i eth3 -T -V ETS-CFG
tsa=0:ets,1:ets,2:ets,3:ets,4:ets,5:ets,6:ets,7:ets
up2tc=0:0,1:1,2:2,3:3,4:4,5:5,6:6,7:7 tcbw=12,12,12,12,13,13,13,13
TSA = 0:ets 1:ets 2:ets 3:ets 4:ets 5:ets 6:ets 7:ets
up2tc = 0:0 1:1 2:2 3:3 4:4 5:5 6:6 7:7
tcbw = 12% 12% 12% 12% 13% 13% 13% 13%

# lldptool -i eth3 -t -V ETS-CFG
IEEE 8021QAZ ETS Configuration TLV
         Willing: yes
         CBS: not supported
         MAX_TCS: 8
         PRIO_MAP: 0:0 1:1 2:2 3:3 4:4 5:5 6:6 7:7
         TC Bandwidth: 12% 12% 12% 12% 13% 13% 13% 13%
         TSA_MAP: 0:ets 1:ets 2:ets 3:ets 4:ets 5:ets 6:ets 7:ets

where without it, an error is returned, so basically, the patch allows
1st and most to do gets/sets where without it, it doesn't work,
anything wrong with that?

Or.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
John Fastabend - April 4, 2013, 11:54 p.m.
On 04/04/2013 02:18 PM, Or Gerlitz wrote:
>   John Fastabend <john.r.fastabend@intel.com> wrote:
>> On 4/4/2013 9:08 AM, Sagi Grimberg wrote:
>>> On 4/4/2013 6:58 PM, John Fastabend wrote:
>>>> On 4/4/2013 7:26 AM, Or Gerlitz wrote:
>
>>>> Does lldpad work now with the mlx4_en driver?
>>>> Reviewed-by: John Fastabend <john.r.fastabend@intel.com>
>
>>> Yes.
>>> -Sagi
>
>> So I looked at your code and lldpad a bit closer. I think this
>> just works around a bug in lldpad. Also with this patch lldpad
>> would try to xmit lldp pkts which would conflict with your firmware agent right?
>
> We don't use firmware agent
>

Ah OK. Your mlx4_en_dcbnl_getdcbx() routine should also return
DCB_CAP_DCBX_HOST then just to be clear. I should write a patch
to clarify those defines I guess.

>
>> For IEEE DCBX modes lldpad should check the return values from
>> the getdcbx dcbnl_rtnl_ops which you already have filled out. I
>> would expect your driver to return DCB_CAP_DCBX_LLD_MANAGED to
>> indicate it is being managed by firmware.
>
>
>> See bnx2x_dcbnl_get_dcbx() for an example of an implementation
>> which I believe is correct. I'll fix the user space lldpad bug.
>
> can you be more specific what's the user space bug?
>

Sure.

In lldpad before the IEEE DCBx (802.1Qaz) TLVs are initialized and the
hardware is managed with the 802.1Qaz module there is a check to verify
the hardware actually supports 802.1Qaz because we don't have a SW
fall back.

Today in lldpad this check is done via a DCB_CMD_GCAP netlink msg. But
this message is really built for the older CEE version of DCBX. In
./include/net/dcbnl.h it is listed in the CEE ops section and it uses
the CEE defines such as *_PG_* instead of including the more precise
*_ETS_* attributes. It also does not give any indication of the
firmware status.

A better check for IEEE support would be to use the getdcbx op via
DCB_CMD_GDCBX which has an explicit define for IEEE/CEE and fw
or host managed.

	#define DCB_CAP_DCBX_HOST               0x01
	#define DCB_CAP_DCBX_LLD_MANAGED        0x02
	#define DCB_CAP_DCBX_VER_CEE            0x04
	#define DCB_CAP_DCBX_VER_IEEE           0x08
	#define DCB_CAP_DCBX_STATIC             0x10

If this check is used your driver would work as is and lldpad would be
able to learn in a single call if (a) the hardware supports DCB (b)
which version CEE or IEEE or both and (c) if the firmware is already
managing DCBX. This reduces driver code somewhat which is a nice thing
and is more precise.

So as a heads up I am planning to fix/improve lldpad to use these bits
and it looks like every driver implementing the IEEE dcbnl ops does in
fact implement the getdcbx op so it should not break any drivers and
be backwards compatible. Meaning running a new lldpad version on an old
kernel will still work.

>> Looks like I added my reveiwed-by line too quickly.
>
> With this patch I can get the following
>
> # lldptool -i eth3 -T -V ETS-CFG
> tsa=0:ets,1:ets,2:ets,3:ets,4:ets,5:ets,6:ets,7:ets
> up2tc=0:0,1:1,2:2,3:3,4:4,5:5,6:6,7:7 tcbw=12,12,12,12,13,13,13,13
> TSA = 0:ets 1:ets 2:ets 3:ets 4:ets 5:ets 6:ets 7:ets
> up2tc = 0:0 1:1 2:2 3:3 4:4 5:5 6:6 7:7
> tcbw = 12% 12% 12% 12% 13% 13% 13% 13%
>
> # lldptool -i eth3 -t -V ETS-CFG
> IEEE 8021QAZ ETS Configuration TLV
>           Willing: yes
>           CBS: not supported
>           MAX_TCS: 8
>           PRIO_MAP: 0:0 1:1 2:2 3:3 4:4 5:5 6:6 7:7
>           TC Bandwidth: 12% 12% 12% 12% 13% 13% 13% 13%
>           TSA_MAP: 0:ets 1:ets 2:ets 3:ets 4:ets 5:ets 6:ets 7:ets
>
> where without it, an error is returned, so basically, the patch allows
> 1st and most to do gets/sets where without it, it doesn't work,
> anything wrong with that?

Agree it doesn't work without this patch but we can fix it in user
space. This has the nice benefit of letting lldpad work with mlx4 on
older kernels. In general I think its easier to update user space then
the kernel for most users. I'll send you a lldpad patch in a few
moments.

Thanks,
John

>
> Or.
> --
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>
Or Gerlitz - April 5, 2013, 7:15 a.m.
John Fastabend <john.fastabend@gmail.com> wrote:

> Agree it doesn't work without this patch but we can fix it in user
> space. This has the nice benefit of letting lldpad work with mlx4 on
> older kernels. In general I think its easier to update user space then
> the kernel for most users. I'll send you a lldpad patch in a few moments.

So to be sure we're on the same page, what we should do is enhance
mlx4_en_dcbnl_getdcbx() routine to also return DCB_CAP_DCBX_HOST and
work with your user space patch? sounds nice, except for what happens
if people run with older versions of lldpad..

Or.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
John Fastabend - April 5, 2013, 7:52 a.m.
On 04/05/2013 12:15 AM, Or Gerlitz wrote:
> John Fastabend <john.fastabend@gmail.com> wrote:
>
>> Agree it doesn't work without this patch but we can fix it in user
>> space. This has the nice benefit of letting lldpad work with mlx4 on
>> older kernels. In general I think its easier to update user space then
>> the kernel for most users. I'll send you a lldpad patch in a few moments.
>
> So to be sure we're on the same page, what we should do is enhance
> mlx4_en_dcbnl_getdcbx() routine to also return DCB_CAP_DCBX_HOST and

Right I think this is a better approach in the long run.

> work with your user space patch? sounds nice, except for what happens
> if people run with older versions of lldpad..
>

Well it won't work. But on the flip side all your older drivers _will_
work with the new daemon. My working assumption is for most users its
easier to upgrade a user space pkg than a kernel. And the simpler we
can keep the kernel code the better. Oh and also if there is a bug in
user space we shouldn't fix it with kernel code.

> Or.
>

Patch

diff --git a/drivers/net/ethernet/mellanox/mlx4/en_dcb_nl.c b/drivers/net/ethernet/mellanox/mlx4/en_dcb_nl.c
index b7dc59f..f9f9164 100644
--- a/drivers/net/ethernet/mellanox/mlx4/en_dcb_nl.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_dcb_nl.c
@@ -36,6 +36,31 @@ 
 
 #include "mlx4_en.h"
 
+static u8 mlx4_en_dcbnl_getcap(struct net_device *dev, int capid, u8 *cap)
+{
+	struct mlx4_en_priv *priv = netdev_priv(dev);
+
+	switch (capid) {
+	case DCB_CAP_ATTR_PFC:
+		*cap = true;
+		break;
+	case DCB_CAP_ATTR_UP2TC:
+		if (priv->mdev->dev->caps.flags & MLX4_DEV_CAP_FLAG_SET_ETH_SCHED)
+			*cap = true;
+		else
+			*cap = false;
+		break;
+	case DCB_CAP_ATTR_DCBX:
+		*cap = priv->dcbx_cap;
+		break;
+	default:
+		*cap = false;
+		break;
+	}
+
+	return 0;
+}
+
 static int mlx4_en_dcbnl_ieee_getets(struct net_device *dev,
 				   struct ieee_ets *ets)
 {
@@ -217,6 +242,13 @@  static int mlx4_en_dcbnl_ieee_getmaxrate(struct net_device *dev,
 	return 0;
 }
 
+static u8 mlx4_en_dcbnl_get_state(struct net_device *dev)
+{
+	struct mlx4_en_priv *priv = netdev_priv(dev);
+
+	return !!(priv->flags & MLX4_EN_FLAG_DCB_ENABLED);
+}
+
 static int mlx4_en_dcbnl_ieee_setmaxrate(struct net_device *dev,
 		struct ieee_maxrate *maxrate)
 {
@@ -243,18 +275,21 @@  static int mlx4_en_dcbnl_ieee_setmaxrate(struct net_device *dev,
 }
 
 const struct dcbnl_rtnl_ops mlx4_en_dcbnl_ops = {
+	.getstate       = mlx4_en_dcbnl_get_state,
 	.ieee_getets	= mlx4_en_dcbnl_ieee_getets,
 	.ieee_setets	= mlx4_en_dcbnl_ieee_setets,
 	.ieee_getmaxrate = mlx4_en_dcbnl_ieee_getmaxrate,
 	.ieee_setmaxrate = mlx4_en_dcbnl_ieee_setmaxrate,
 	.ieee_getpfc	= mlx4_en_dcbnl_ieee_getpfc,
 	.ieee_setpfc	= mlx4_en_dcbnl_ieee_setpfc,
-
+	.getcap		= mlx4_en_dcbnl_getcap,
 	.getdcbx	= mlx4_en_dcbnl_getdcbx,
 	.setdcbx	= mlx4_en_dcbnl_setdcbx,
 };
 
 const struct dcbnl_rtnl_ops mlx4_en_dcbnl_pfc_ops = {
+	.getstate       = mlx4_en_dcbnl_get_state,
 	.ieee_getpfc	= mlx4_en_dcbnl_ieee_getpfc,
 	.ieee_setpfc	= mlx4_en_dcbnl_ieee_setpfc,
+	.getcap		= mlx4_en_dcbnl_getcap,
 };
diff --git a/drivers/net/ethernet/mellanox/mlx4/en_netdev.c b/drivers/net/ethernet/mellanox/mlx4/en_netdev.c
index 62795b5..a390de0 100644
--- a/drivers/net/ethernet/mellanox/mlx4/en_netdev.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_netdev.c
@@ -2013,6 +2013,8 @@  int mlx4_en_init_netdev(struct mlx4_en_dev *mdev, int port,
 	INIT_WORK(&priv->linkstate_task, mlx4_en_linkstate);
 	INIT_DELAYED_WORK(&priv->stats_task, mlx4_en_do_get_stats);
 #ifdef CONFIG_MLX4_EN_DCB
+	priv->dcbx_cap = DCB_CAP_DCBX_HOST;
+	priv->flags |= MLX4_EN_FLAG_DCB_ENABLED;
 	if (!mlx4_is_slave(priv->mdev->dev)) {
 		if (mdev->dev->caps.flags & MLX4_DEV_CAP_FLAG_SET_ETH_SCHED) {
 			dev->dcbnl_ops = &mlx4_en_dcbnl_ops;
diff --git a/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h b/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h
index d4cb5d3..71960b1 100644
--- a/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h
+++ b/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h
@@ -440,7 +440,8 @@  enum {
 	MLX4_EN_FLAG_ENABLE_HW_LOOPBACK	= (1 << 2),
 	/* whether we need to drop packets that hardware loopback-ed */
 	MLX4_EN_FLAG_RX_FILTER_NEEDED	= (1 << 3),
-	MLX4_EN_FLAG_FORCE_PROMISC	= (1 << 4)
+	MLX4_EN_FLAG_FORCE_PROMISC	= (1 << 4),
+	MLX4_EN_FLAG_DCB_ENABLED	= (1 << 5)
 };
 
 #define MLX4_EN_MAC_HASH_SIZE (1 << BITS_PER_BYTE)
@@ -529,6 +530,7 @@  struct mlx4_en_priv {
 #ifdef CONFIG_MLX4_EN_DCB
 	struct ieee_ets ets;
 	u16 maxrate[IEEE_8021QAZ_MAX_TCS];
+	u8 dcbx_cap;
 #endif
 #ifdef CONFIG_RFS_ACCEL
 	spinlock_t filters_lock;