[RFC,rdma-next,13/18] RDMA/mlx5: Enable decap and packet reformat on flow tables

Message ID 20180716082305.11744-14-leon@kernel.org
State RFC, archived
Delegated to: David Miller
Series Flow actions to mutate packets

Commit Message

Leon Romanovsky July 16, 2018, 8:23 a.m. UTC
From: Mark Bloch <markb@mellanox.com>

If NIC RX flow tables support decap opertion, enable it on creation.
If NIC TX flow tables support reformat opertion, enable it on creation.

Signed-off-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
---
 drivers/infiniband/hw/mlx5/main.c | 15 +++++++++++----
 1 file changed, 11 insertions(+), 4 deletions(-)

--
2.14.4

Comments

Or Gerlitz July 16, 2018, 9:23 p.m. UTC | #1
On Mon, Jul 16, 2018 at 11:23 AM, Leon Romanovsky <leon@kernel.org> wrote:
> From: Mark Bloch <markb@mellanox.com>
>
> If NIC RX flow tables support decap opertion, enable it on creation.

opertion --> operation

> If NIC TX flow tables support reformat opertion, enable it on creation.

What is the trigger to use the decap flag on RX table or encap flag on
TX table?

Please note that we have a short blanket w.r.t. mutual usage by
NIC vs e-Switch steering. Did you consider doing that on demand?
Mark Bloch July 16, 2018, 9:46 p.m. UTC | #2
> -----Original Message-----
> From: Or Gerlitz [mailto:gerlitz.or@gmail.com]
> Sent: Monday, July 16, 2018 2:24 PM
> To: Mark Bloch <markb@mellanox.com>
> Cc: Doug Ledford <dledford@redhat.com>; Jason Gunthorpe
> <jgg@mellanox.com>; Leon Romanovsky <leonro@mellanox.com>; RDMA
> mailing list <linux-rdma@vger.kernel.org>; Saeed Mahameed
> <saeedm@mellanox.com>; linux-netdev <netdev@vger.kernel.org>
> Subject: Re: [RFC PATCH rdma-next 13/18] RDMA/mlx5: Enable decap and
> packet reformat on flow tables
> 
> On Mon, Jul 16, 2018 at 11:23 AM, Leon Romanovsky <leon@kernel.org>
> wrote:
> > From: Mark Bloch <markb@mellanox.com>
> >
> > If NIC RX flow tables support decap opertion, enable it on creation.
> 
> opertion --> operation
> 
> > If NIC TX flow tables support reformat opertion, enable it on creation.
> 
> What is the trigger to use the decap flag on RX table or encap flag on
> TX table?
> 

Always enabling it carries no performance penalty, so that's what I do when it is supported.
 
> Please note that we have a short blanket w.r.t mutual usage by

FDB and NIC steering tables have different limitations, so encap/decap on NIC steering
has nothing to do with the limitations the FDB has with those operations.

> NIC vs e-Switch  steering, did you consider to do that on demand?

The flow table needs to be created with those flags set if we want to attach
a decap/packet reformat action to it. BTW, there is no modify action for those bits,
so that's why I'm doing it on creation.

Mark
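
A minimal sketch of the policy described in this reply, assuming a simplified capability snapshot: the flags are computed once, before the table is created, because they cannot be changed afterwards. The names `ft_caps`, `ft_dir`, `ft_creation_flags` and the `FT_FLAG_*` values are hypothetical stand-ins for illustration, not the mlx5 driver's definitions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag bits; the values are illustrative only. */
#define FT_FLAG_EN_REFORMAT (1u << 0)
#define FT_FLAG_EN_DECAP    (1u << 1)

/* Direction of the flow table being created. */
enum ft_dir { FT_RX, FT_TX };

/* Hypothetical capability snapshot, standing in for the device cap queries. */
struct ft_caps {
	bool rx_decap;    /* RX tables may carry decap actions */
	bool tx_reformat; /* TX tables may carry packet reformat actions */
};

/*
 * Decide the creation-time flags for a table. Each direction consults
 * only its own capability, and an unsupported feature leaves the flag clear.
 */
static uint32_t ft_creation_flags(const struct ft_caps *caps, enum ft_dir dir)
{
	uint32_t flags = 0;

	if (dir == FT_RX && caps->rx_decap)
		flags |= FT_FLAG_EN_DECAP;
	if (dir == FT_TX && caps->tx_reformat)
		flags |= FT_FLAG_EN_REFORMAT;

	return flags;
}
```

Under these assumptions, a device that reports only `rx_decap` gets `FT_FLAG_EN_DECAP` on RX tables and no flags on TX tables.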
Or Gerlitz July 17, 2018, 12:47 p.m. UTC | #3
On Tue, Jul 17, 2018 at 12:46 AM, Mark Bloch <markb@mellanox.com> wrote:
>> From: Or Gerlitz [mailto:gerlitz.or@gmail.com]

>> > If NIC RX flow tables support decap opertion, enable it on creation.
>> opertion --> operation

saw it?

>> > If NIC TX flow tables support reformat opertion, enable it on creation.

opertion --> operation

>> What is the trigger to use the decap flag on RX table or encap flag on
>> TX table?

> It has no performance penalty to always enable that, so that's what I do if supported.

I was not referring to performance, see below

>> Please note that we have a short blanket w.r.t mutual usage by

> FDB and NIC steering tables have different limitations, so encap/decap on NIC steering
> have nothing to do with the limitations the FDB has with those operations.

No! AFAIK they are related: the FW maintains three states for encap
(decap): NONE, FDB, or NIC. If the state is NIC, an FDB table can't be
created with encap set, and the other way around: if the state is FDB,
a NIC TX table can't be created with encap set, etc. This is the short
blanket I was referring to; you can check me.

>> NIC vs e-Switch  steering, did you consider to do that on demand?
>
> The flow table needs to be created with those flags set if we want to attach
> decap/packet reformat action to it. BTW, there is no modify action for those bits
> so that's why I'm doing it on creation.

The question was whether you can let the application tell you that it wants to use
rules with encap/decap, as we did in the devlink switchdev API (encap enabled).
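
The "short blanket" described in this comment can be sketched as a small ownership model. This is only an illustration of the three states as the comment states them (NONE, FDB, NIC). The transition out of NONE on the first successful creation is an assumption, and `encap_owner`, `encap_state` and `encap_try_claim` are hypothetical names, not firmware or driver interfaces.

```c
#include <stdbool.h>

/* The three states named in the comment above. */
enum encap_owner { ENCAP_OWNER_NONE, ENCAP_OWNER_FDB, ENCAP_OWNER_NIC };

/* Hypothetical model of the firmware-side bookkeeping. */
struct encap_state {
	enum encap_owner owner;
};

/*
 * Model an attempt by `who` (FDB or NIC) to create a table with encap set.
 * Assumption: the first claimant moves the state out of NONE; after that,
 * only the same domain may create further encap-enabled tables.
 */
static bool encap_try_claim(struct encap_state *st, enum encap_owner who)
{
	if (who == ENCAP_OWNER_NONE)
		return false; /* not a valid claimant */

	if (st->owner == ENCAP_OWNER_NONE) {
		st->owner = who;
		return true;
	}

	return st->owner == who;
}
```

In this model, once NIC steering has claimed encap, an FDB request is refused, and the reverse holds when FDB claims first.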
Mark Bloch July 17, 2018, 4:29 p.m. UTC | #4
> -----Original Message-----
> From: linux-rdma-owner@vger.kernel.org [mailto:linux-rdma-
> owner@vger.kernel.org] On Behalf Of Or Gerlitz
> Sent: Tuesday, July 17, 2018 5:47 AM
> To: Mark Bloch <markb@mellanox.com>
> Cc: Doug Ledford <dledford@redhat.com>; Jason Gunthorpe
> <jgg@mellanox.com>; Leon Romanovsky <leonro@mellanox.com>; RDMA
> mailing list <linux-rdma@vger.kernel.org>; Saeed Mahameed
> <saeedm@mellanox.com>; linux-netdev <netdev@vger.kernel.org>
> Subject: Re: [RFC PATCH rdma-next 13/18] RDMA/mlx5: Enable decap and
> packet reformat on flow tables
> 
> On Tue, Jul 17, 2018 at 12:46 AM, Mark Bloch <markb@mellanox.com> wrote:
> >> From: Or Gerlitz [mailto:gerlitz.or@gmail.com]
> 
> >> > If NIC RX flow tables support decap opertion, enable it on creation.
> >> opertion --> operation
> 
> saw it?

yes, sorry I didn't say so 😊
> 
> >> > If NIC TX flow tables support reformat opertion, enable it on creation.
> 
> opertion --> operation
> 
> >> What is the trigger to use the decap flag on RX table or encap flag on
> >> TX table?
> 
> > It has no performance penalty to always enable that, so that's what I do if
> supported.
> 
> I was not referring to performance, see below
> 
> >> Please note that we have a short blanket w.r.t mutual usage by
> 
> > FDB and NIC steering tables have different limitations, so encap/decap on
> NIC steering
> > have nothing to do with the limitations the FDB has with those operations.
> 
> no! AFAIK it has to do, the FW maintains three states for encap(decap)
> NONE, FDB or NIC
> if the state is NIC, an FDB table can't be created with encap set, and
> the other way around, if the
> state is FDB, NIC TX table can't be created with encap set, etc. This
> is the short blanket I was
> referring too, you can check me.

Or, I'm sorry, I just realized you haven't seen the updated version of the patch set (it will be sent without the RFC tag).
The updated one doesn't allow TX steering in switchdev mode, as today
we lack the API (on the RDMA side) to specify to which rep the rules should be applied.

Also, once in switchdev mode, the FW turns off the cap flag for encap, which means the VFs won't create
a flow table with the encap flag set. Because we require that the VFs not be bound when moving to switchdev
mode, they will always see the updated caps.

Does that address your concerns?

Mark. 

> 
> >> NIC vs e-Switch  steering, did you consider to do that on demand?
> >
> > The flow table needs to be created with those flags set if we want to attach
> > decap/packet reformat action to it. BTW, there is no modify action for
> those bits
> > so that's why I'm doing it on creation.
> 
> The question was if you can let the application tell you that they want to use
> rules with encap/decap, as we did in the devlink switchdev API (encap
> enabled)
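
The two switchdev rules from this reply can be modeled roughly as follows. This is a sketch under stated assumptions, not the updated patch: `rdma_tx_steering_allowed`, `vf_visible_encap_cap`, `vf_tx_creation_flags` and `FT_FLAG_EN_REFORMAT` are hypothetical names, and the capability is reduced to a single boolean.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag bit; the value is illustrative only. */
#define FT_FLAG_EN_REFORMAT (1u << 0)

/* Rule 1: RDMA TX steering is refused while the e-switch is in switchdev mode. */
static bool rdma_tx_steering_allowed(bool switchdev)
{
	return !switchdev;
}

/*
 * Rule 2: in switchdev mode the firmware stops advertising the encap cap.
 * `base_cap` is what the device would report outside switchdev mode.
 */
static bool vf_visible_encap_cap(bool base_cap, bool switchdev)
{
	return base_cap && !switchdev;
}

/*
 * Flags a VF would use for a TX table. Assumption: the VF binds after the
 * mode change, so it always reads the current cap.
 */
static uint32_t vf_tx_creation_flags(bool base_cap, bool switchdev)
{
	return vf_visible_encap_cap(base_cap, switchdev) ? FT_FLAG_EN_REFORMAT : 0;
}
```

With these assumptions, a VF on a switchdev device never sets the reformat flag, so in this model it does not compete with the FDB for encap.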

Patch

diff --git a/drivers/infiniband/hw/mlx5/main.c b/drivers/infiniband/hw/mlx5/main.c
index fe4640fe025b..ecbf9f3e12d8 100644
--- a/drivers/infiniband/hw/mlx5/main.c
+++ b/drivers/infiniband/hw/mlx5/main.c
@@ -3036,14 +3036,15 @@  enum flow_table_type {
 static struct mlx5_ib_flow_prio *_get_prio(struct mlx5_flow_namespace *ns,
 					   struct mlx5_ib_flow_prio *prio,
 					   int priority,
-					   int num_entries, int num_groups)
+					   int num_entries, int num_groups,
+					   u32 flags)
 {
 	struct mlx5_flow_table *ft;

 	ft = mlx5_create_auto_grouped_flow_table(ns, priority,
 						 num_entries,
 						 num_groups,
-						 0, 0);
+						 0, flags);
 	if (IS_ERR(ft))
 		return ERR_CAST(ft);

@@ -3063,6 +3064,7 @@  static struct mlx5_ib_flow_prio *get_flow_table(struct mlx5_ib_dev *dev,
 	int max_table_size;
 	int num_entries;
 	int num_groups;
+	u32 flags = 0;
 	int priority;

 	max_table_size = BIT(MLX5_CAP_FLOWTABLE_NIC_RX(dev->mdev,
@@ -3079,11 +3081,15 @@  static struct mlx5_ib_flow_prio *get_flow_table(struct mlx5_ib_dev *dev,
 		if (ft_type == MLX5_IB_FT_RX) {
 			fn_type = MLX5_FLOW_NAMESPACE_BYPASS;
 			prio = &dev->flow_db->prios[priority];
+			if (MLX5_CAP_FLOWTABLE_NIC_RX(dev->mdev, decap))
+				flags |= MLX5_FLOW_TABLE_TUNNEL_EN_DECAP;
 		} else {
 			max_table_size = BIT(MLX5_CAP_FLOWTABLE_NIC_TX(dev->mdev,
 								       log_max_ft_size));
 			fn_type = MLX5_FLOW_NAMESPACE_EGRESS;
 			prio = &dev->flow_db->egress_prios[priority];
+			if (MLX5_CAP_FLOWTABLE_NIC_TX(dev->mdev, reformat))
+				flags |= MLX5_FLOW_TABLE_TUNNEL_EN_REFORMAT;
 		}
 		ns = mlx5_get_flow_namespace(dev->mdev, fn_type);
 		num_entries = MLX5_FS_MAX_ENTRIES;
@@ -3119,7 +3125,8 @@  static struct mlx5_ib_flow_prio *get_flow_table(struct mlx5_ib_dev *dev,

 	ft = prio->flow_table;
 	if (!ft)
-		return _get_prio(ns, prio, priority, num_entries, num_groups);
+		return _get_prio(ns, prio, priority, num_entries, num_groups,
+				 flags);

 	return prio;
 }
@@ -3694,7 +3701,7 @@  static struct mlx5_ib_flow_prio *_get_flow_table(struct mlx5_ib_dev *dev,
 		return prio;

 	return _get_prio(ns, prio, priority, MLX5_FS_MAX_ENTRIES,
-			 MLX5_FS_MAX_TYPES);
+			 MLX5_FS_MAX_TYPES, 0);
 }

 static struct mlx5_ib_flow_handler *