[v3] i40e: Add macvlan support on i40e

Message ID 20190116235414.4946-1-harshitha.ramamurthy@intel.com
State Changes Requested
Delegated to: Jeff Kirsher

Commit Message

Harshitha Ramamurthy Jan. 16, 2019, 11:54 p.m. UTC
This patch enables macvlan offloads for i40e. The idea is to use
channels as macvlan interfaces. The channels are VSIs of
type VMDQ. When the first macvlan is created, the maximum number of
channels possible are created. From then on, as a macvlan interface
is created, a mac filter is added to these already created channels
(VSIs).

This patch builds on top of the recent changes which move
away from the select_queue implementation of picking the tx queue.

Steps to configure macvlan offloads:
1. sudo ethtool -K ens261f1 l2-fwd-offload on
2. sudo ip link add link ens261f1 name macvlan1 type macvlan
3. sudo ip link set macvlan1 up

Signed-off-by: Harshitha Ramamurthy <harshitha.ramamurthy@intel.com>
---
v3: Renamed i40e_remove_macvlan_channels() to i40e_free_macvlan_channels()
    Fixed an indentation issue and added some comments
    Removed a code name from the commit message
    Addressed some more of Shannon's comments 

v2: Addressed Shannon's comments
    Added a new function to remove all macvlan VSIs

 drivers/net/ethernet/intel/i40e/i40e.h      |  26 ++
 drivers/net/ethernet/intel/i40e/i40e_main.c | 481 +++++++++++++++++++-
 2 files changed, 505 insertions(+), 2 deletions(-)

Comments

Shannon Nelson Feb. 6, 2019, 5:56 p.m. UTC | #1
On Wed, Jan 16, 2019 at 3:54 PM Harshitha Ramamurthy
<harshitha.ramamurthy@intel.com> wrote:
>
> This patch enables macvlan offloads for i40e. The idea is to use
> channels as macvlan interfaces. The channels are VSIs of
> type VMDQ. When the first macvlan is created, the maximum number of
> channels possible are created. From then on, as a macvlan interface
> is created, a mac filter is added to these already created channels
> (VSIs).
>
> This patch builds on top of the recent changes which move
> away from the select_queue implementation of picking the tx queue.
>
> Steps to configure macvlan offloads:
> 1. sudo ethtool -K ens261f1 l2-fwd-offload on
> 2. sudo ip link add link ens261f1 name macvlan1 type macvlan
> 3. sudo ip link set macvlan1 up
>
> Signed-off-by: Harshitha Ramamurthy <harshitha.ramamurthy@intel.com>

Hi Harshitha, thanks for your patience, I've had a challenging couple of weeks.

The only heartburn I have at this point is the hard-coded numbers in
fwd_add when working out how many queues to set aside for macvlan
offloads.  I'd rather see those as fractions of a constant, where that
constant might change for different i40e devices.  However, I realize
the xl7xx family probably won't change there, so it will probably be
fine.

Acked-by: Shannon Nelson <shannon.lee.nelson@gmail.com>

Cheers,
sln
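
As a rough illustration of the suggestion in comment #1 (thresholds in fwd_add written as fractions of a constant instead of literals), the split could be sketched in plain userspace C as below. Only I40E_MAX_MACVLANS and its value of 128 come from the patch; struct macvlan_split and macvlan_split_for() are hypothetical names, and the guard against a vector count smaller than the PF's share is an addition that the posted patch does not have.

```c
#include <assert.h>
#include <stdint.h>

#define I40E_MAX_MACVLANS 128	/* value taken from the patch */

/* Hypothetical result type: queues per macvlan and number of macvlans. */
struct macvlan_split {
	uint16_t q_per_macvlan;
	uint16_t macvlan_cnt;
};

/* Hypothetical helper mirroring the branches in i40e_fwd_add(), with the
 * literals 64, 32 and 16 rewritten as fractions of I40E_MAX_MACVLANS.
 */
static struct macvlan_split macvlan_split_for(uint16_t vectors)
{
	const uint16_t max = I40E_MAX_MACVLANS;
	struct macvlan_split s = { 0, 0 };
	uint16_t pf_qs;

	if (vectors <= max && vectors > max / 2) {
		/* covers the patch's first two branches: 4 Qs each, 32 for the PF */
		s.q_per_macvlan = 4;
		pf_qs = max / 4;
	} else if (vectors <= max / 2 && vectors > max / 4) {
		/* 2 Qs per macvlan, 16 Qs for the PF */
		s.q_per_macvlan = 2;
		pf_qs = max / 8;
	} else {
		/* 1 Q per macvlan, 16 Qs for the PF */
		s.q_per_macvlan = 1;
		pf_qs = max / 8;
	}

	/* Assumption, not in the patch: avoid u16 wrap-around for small counts. */
	if (vectors > pf_qs)
		s.macvlan_cnt = (vectors - pf_qs) / s.q_per_macvlan;

	return s;
}
```

With 128 vectors this yields 24 macvlans of 4 queues each, the same result as (128 - 32) / 4 in the posted code; a device with a different maximum would only need the constant changed.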
Bowers, AndrewX Feb. 8, 2019, 10:07 p.m. UTC | #2
> -----Original Message-----
> From: Intel-wired-lan [mailto:intel-wired-lan-bounces@osuosl.org] On
> Behalf Of Harshitha Ramamurthy
> Sent: Wednesday, January 16, 2019 3:54 PM
> To: intel-wired-lan@lists.osuosl.org
> Cc: Duyck, Alexander H <alexander.h.duyck@intel.com>
> Subject: [Intel-wired-lan] [PATCH v3] i40e: Add macvlan support on i40e
> 
> This patch enables macvlan offloads for i40e. The idea is to use channels as
> macvlan interfaces. The channels are VSIs of type VMDQ. When the first
> macvlan is created, the maximum number of channels possible are created.
> From then on, as a macvlan interface is created, a mac filter is added to these
> already created channels (VSIs).
> 
> This patch builds on top of the recent changes which move away from the
> select_queue implementation of picking the tx queue.
> 
> Steps to configure macvlan offloads:
> 1. sudo ethtool -K ens261f1 l2-fwd-offload on
> 2. sudo ip link add link ens261f1 name macvlan1 type macvlan
> 3. sudo ip link set macvlan1 up
> 
> Signed-off-by: Harshitha Ramamurthy <harshitha.ramamurthy@intel.com>
> ---
> v3: Renamed i40e_remove_macvlan_channels() to
> i40e_free_macvlan_channels()
>     Fixed an indentation issue and added some comments
>     Removed a code name from the commit message
>     Addressed some more of Shannon's comments
> 
> v2: Addressed Shannon's comments
>     Added a new function to remove all macvlan VSIs
> 
>  drivers/net/ethernet/intel/i40e/i40e.h      |  26 ++
>  drivers/net/ethernet/intel/i40e/i40e_main.c | 481 +++++++++++++++++++-
>  2 files changed, 505 insertions(+), 2 deletions(-)

Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Bowers, AndrewX Feb. 13, 2019, 10:21 p.m. UTC | #3
> -----Original Message-----
> From: Intel-wired-lan [mailto:intel-wired-lan-bounces@osuosl.org] On
> Behalf Of Harshitha Ramamurthy
> Sent: Wednesday, January 16, 2019 3:54 PM
> To: intel-wired-lan@lists.osuosl.org
> Cc: Duyck, Alexander H <alexander.h.duyck@intel.com>
> Subject: [Intel-wired-lan] [PATCH v3] i40e: Add macvlan support on i40e
> 
> This patch enables macvlan offloads for i40e. The idea is to use channels as
> macvlan interfaces. The channels are VSIs of type VMDQ. When the first
> macvlan is created, the maximum number of channels possible are created.
> From then on, as a macvlan interface is created, a mac filter is added to these
> already created channels (VSIs).
> 
> This patch builds on top of the recent changes which move away from the
> select_queue implementation of picking the tx queue.
> 
> Steps to configure macvlan offloads:
> 1. sudo ethtool -K ens261f1 l2-fwd-offload on
> 2. sudo ip link add link ens261f1 name macvlan1 type macvlan
> 3. sudo ip link set macvlan1 up
> 
> Signed-off-by: Harshitha Ramamurthy <harshitha.ramamurthy@intel.com>
> ---
> v3: Renamed i40e_remove_macvlan_channels() to
> i40e_free_macvlan_channels()
>     Fixed an indentation issue and added some comments
>     Removed a code name from the commit message
>     Addressed some more of Shannon's comments
> 
> v2: Addressed Shannon's comments
>     Added a new function to remove all macvlan VSIs
> 
>  drivers/net/ethernet/intel/i40e/i40e.h      |  26 ++
>  drivers/net/ethernet/intel/i40e/i40e_main.c | 481 +++++++++++++++++++-
>  2 files changed, 505 insertions(+), 2 deletions(-)

Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Brown, Aaron F Feb. 15, 2019, 4:03 a.m. UTC | #4
> From: Intel-wired-lan [mailto:intel-wired-lan-bounces@osuosl.org] On
> Behalf Of Bowers, AndrewX
> Sent: Wednesday, February 13, 2019 2:21 PM
> To: intel-wired-lan@lists.osuosl.org
> Subject: Re: [Intel-wired-lan] [PATCH v3] i40e: Add macvlan support on i40e
> 
> > -----Original Message-----
> > From: Intel-wired-lan [mailto:intel-wired-lan-bounces@osuosl.org] On
> > Behalf Of Harshitha Ramamurthy
> > Sent: Wednesday, January 16, 2019 3:54 PM
> > To: intel-wired-lan@lists.osuosl.org
> > Cc: Duyck, Alexander H <alexander.h.duyck@intel.com>
> > Subject: [Intel-wired-lan] [PATCH v3] i40e: Add macvlan support on i40e
> >
> > This patch enables macvlan offloads for i40e. The idea is to use channels as
> > macvlan interfaces. The channels are VSIs of type VMDQ. When the first
> > macvlan is created, the maximum number of channels possible are created.
> > From then on, as a macvlan interface is created, a mac filter is added to
> > these already created channels (VSIs).
> >
> > This patch builds on top of the recent changes which move away from the
> > select_queue implementation of picking the tx queue.
> >
> > Steps to configure macvlan offloads:
> > 1. sudo ethtool -K ens261f1 l2-fwd-offload on
> > 2. sudo ip link add link ens261f1 name macvlan1 type macvlan
> > 3. sudo ip link set macvlan1 up
> >
> > Signed-off-by: Harshitha Ramamurthy <harshitha.ramamurthy@intel.com>
> > ---
> > v3: Renamed i40e_remove_macvlan_channels() to
> > i40e_free_macvlan_channels()
> >     Fixed an indentation issue and added some comments
> >     Removed a code name from the commit message
> >     Addressed some more of Shannon's comments
> >
> > v2: Addressed Shannon's comments
> >     Added a new function to remove all macvlan VSIs
> >
> >  drivers/net/ethernet/intel/i40e/i40e.h      |  26 ++
> >  drivers/net/ethernet/intel/i40e/i40e_main.c | 481 +++++++++++++++++++-
> >  2 files changed, 505 insertions(+), 2 deletions(-)
> 
> Tested-by: Andrew Bowers <andrewx.bowers@intel.com>

This may be premature.  Andrew reported a call trace when disabling rx vlan offloads (ethtool -K ethX rxvlan off) while working on patches later in the queue.  And on another system I am getting a consistent system freeze if I try to disable rx vlan offloads.  While my system is freezing too quickly to get a trace, the trigger is identical, so I suspect they are the same basic issue.  I bisected my system and found the freezes on rx vlan disable started with the introduction of this patch.

A snippet of the call trace Andrew reported with the other system is as follows:
[  118.981539] CPU: 41 PID: 4097 Comm: ethtool Not tainted 5.0.0-rc5WW07.3-ABNxQ-Reverted+ #14
[  118.981542] Hardware name: Intel Corporation S2600STB/S2600STB, BIOS SE5C620.86B.00.01.0015.110720180833 11/07/2018
[  118.981567] RIP: 0010:i40e_set_features+0xfb/0x2c0 [i40e]
[  118.981572] Code: 00 00 be 00 20 00 00 4c 89 ef e8 10 e3 ff ff eb b1 48 8b 85 40 0f 00 00 48 8b 9d a0 0f 00 00 4c 8d 70 08 48 8d 85 a0 0f 00 00 <4c> 8b 3b 48 39 d8 48 89 04 24 75 18 e9 78 ff ff ff 4c 89 f8 48 3b
[  118.981576] RSP: 0018:ffffc1180b947b88 EFLAGS: 00010282
[  118.981580] RAX: ffff9e768557ffa0 RBX: 0000000000000000 RCX: 0000000000000000
[  118.981583] RDX: 0000000001300002 RSI: 0000000006000000 RDI: ffff9e7680ec4350
[  118.981586] RBP: ffff9e768557f000 R08: 0000000000000080 R09: ffff9e76809eaa60
[  118.981589] R10: 0000000000000000 R11: 0000000000000046 R12: 000900e81fd54ab3
[  118.981592] R13: ffff9e7680ec4000 R14: ffff9e7680ec4008 R15: 0000000000000000
[  118.981597] FS:  00007f948d325740(0000) GS:ffff9e768f140000(0000) knlGS:0000000000000000
[  118.981600] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  118.981603] CR2: 0000000000000000 CR3: 00000017223b0001 CR4: 00000000007606e0
[  118.981607] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  118.981610] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  118.981612] PKRU: 55555554
[  118.981614] Call Trace:
[  118.981631]  __netdev_update_features+0x234/0x980
[  118.981639]  dev_ethtool+0x1076/0x26f0
[  118.981647]  ? netdev_run_todo+0x61/0x290
[  118.981654]  dev_ioctl+0x1d7/0x3a0
[  118.981662]  sock_do_ioctl+0xa0/0x140
[  118.981671]  ? __handle_mm_fault+0xac1/0x1360
[  118.981676]  sock_ioctl+0x19e/0x320
[  118.981685]  ? selinux_file_ioctl+0x161/0x200
[  118.981692]  do_vfs_ioctl+0xa5/0x620
[  118.981699]  ksys_ioctl+0x60/0x90
[  118.981705]  __x64_sys_ioctl+0x16/0x20
[  118.981715]  do_syscall_64+0x5b/0x160
[  118.981725]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[  118.981730] RIP: 0033:0x7f948d41c09b
[  118.981734] Code: 0f 1e fa 48 8b 05 ed bd 0c 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 66 0f 1f 44 00 00 f3 0f 1e fa b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d bd bd 0c 00 f7 d8 64 89 01 48
[  118.981738] RSP: 002b:00007ffe031fc988 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[  118.981742] RAX: ffffffffffffffda RBX: 00007ffe031fca10 RCX: 00007f948d41c09b
[  118.981745] RDX: 00007ffe031fca20 RSI: 0000000000008946 RDI: 0000000000000003
[  118.981748] RBP: 000055786a29d260 R08: 0000000000000001 R09: 0000000010000000
[  118.981751] R10: 0000000000000000 R11: 0000000000000246 R12: 000000000000000c
[  118.981754] R13: 000055786a29d980 R14: 0000557869cf0cc8 R15: 000055786a29d280
[  118.981757] Modules linked in: xt_CHECKSUM ipt_MASQUERADE tun bridge stp llc ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 xt_conntrack ebtable_nat ip6table_nat nf_nat_ipv6 ip6table_mangle ip6table_raw ip6table_security iptable_nat nf_nat_ipv4 nf_nat iptable_mangle iptable_raw iptable_security nf_conntrack nf_defrag_ipv6 devlink nf_defrag_ipv4 libcrc32c ip_set nfnetlink ebtable_filter ebtables ib_isert ip6table_filter iscsi_target_mod ip6_tables ib_srpt target_core_mod ib_srp scsi_transport_srp ib_ipoib rpcrdma ib_umad sunrpc vfat fat rdma_ucm ib_uverbs intel_rapl skx_edac nfit ib_iser x86_pkg_temp_thermal intel_powerclamp coretemp rdma_cm iw_cm kvm_intel ib_cm libiscsi scsi_transport_iscsi kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel intel_cstate i40iw intel_uncore ib_core intel_rapl_perf ipmi_ssif mei_me iTCO_wdt joydev iTCO_vendor_support ipmi_si mei pcspkr ioatdma i2c_i801 ipmi_devintf lpc_ich ipmi_msghandler dca acpi_power_meter acpi_pad pcc_cpufreq ast 
 i2c_algo_bit
[  118.981834]  drm_kms_helper ttm drm i40e crc32c_intel wmi
[  118.981846] CR2: 0000000000000000
[  118.981851] ---[ end trace dccc97c7b510ec6f ]---
[  119.028612] RIP: 0010:i40e_set_features+0xfb/0x2c0 [i40e]
[  119.028618] Code: 00 00 be 00 20 00 00 4c 89 ef e8 10 e3 ff ff eb b1 48 8b 85 40 0f 00 00 48 8b 9d a0 0f 00 00 4c 8d 70 08 48 8d 85 a0 0f 00 00 <4c> 8b 3b 48 39 d8 48 89 04 24 75 18 e9 78 ff ff ff 4c 89 f8 48 3b
[  119.028620] RSP: 0018:ffffc1180b947b88 EFLAGS: 00010282
[  119.028623] RAX: ffff9e768557ffa0 RBX: 0000000000000000 RCX: 0000000000000000
[  119.028624] RDX: 0000000001300002 RSI: 0000000006000000 RDI: ffff9e7680ec4350
[  119.028625] RBP: ffff9e768557f000 R08: 0000000000000080 R09: ffff9e76809eaa60
[  119.028626] R10: 0000000000000000 R11: 0000000000000046 R12: 000900e81fd54ab3
[  119.028628] R13: ffff9e7680ec4000 R14: ffff9e7680ec4008 R15: 0000000000000000
[  119.028629] FS:  00007f948d325740(0000) GS:ffff9e768f140000(0000) knlGS:0000000000000000
[  119.028630] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  119.028631] CR2: 0000000000000000 CR3: 00000017223b0001 CR4: 00000000007606e0
[  119.028633] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  119.028634] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  119.028635] PKRU: 55555554
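
One possible reading of the trace, offered as a hypothesis and not a confirmed diagnosis: RBX and CR2 are both zero and the fault is inside i40e_set_features(). In this patch vsi->macvlan_list is only initialised in i40e_setup_macvlans(), while i40e_set_features() calls i40e_del_all_macvlans() whenever l2-fwd-offload is not set. If no macvlan was ever created, the list walk would then start from a zeroed head. The userspace sketch below uses a minimal stand-in for the kernel's struct list_head (struct demo_vsi, demo_list_add_tail() and demo_del_all_macvlans() are hypothetical names) to show how checking macvlan_cnt before walking avoids touching such a head.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal userspace stand-in for the kernel's struct list_head. */
struct list_head {
	struct list_head *next, *prev;
};

static void init_list_head(struct list_head *head)
{
	head->next = head;
	head->prev = head;
}

static void demo_list_add_tail(struct list_head *node, struct list_head *head)
{
	node->prev = head->prev;
	node->next = head;
	head->prev->next = node;
	head->prev = node;
}

/* Hypothetical, cut-down VSI holding only the two fields that matter here. */
struct demo_vsi {
	struct list_head macvlan_list;
	int macvlan_cnt;
};

/* Returns the number of entries visited. When macvlan_cnt is zero the list
 * head may never have been initialised, so the walk is skipped entirely.
 */
static int demo_del_all_macvlans(struct demo_vsi *vsi)
{
	struct list_head *pos, *tmp;
	int visited = 0;

	if (!vsi->macvlan_cnt)
		return 0;

	for (pos = vsi->macvlan_list.next, tmp = pos->next;
	     pos != &vsi->macvlan_list;
	     pos = tmp, tmp = pos->next)
		visited++;

	return visited;
}
```

Without the macvlan_cnt check, the first step of the loop would read pos->next through a NULL pos on a zero-filled structure, which would match a fault address of 0. Initialising the list head when the VSI is set up would be another way to get the same protection.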

Patch

diff --git a/drivers/net/ethernet/intel/i40e/i40e.h b/drivers/net/ethernet/intel/i40e/i40e.h
index c06a4b5cdfae..107edcf2cf19 100644
--- a/drivers/net/ethernet/intel/i40e/i40e.h
+++ b/drivers/net/ethernet/intel/i40e/i40e.h
@@ -27,6 +27,7 @@ 
 #include <net/ip6_checksum.h>
 #include <linux/ethtool.h>
 #include <linux/if_vlan.h>
+#include <linux/if_macvlan.h>
 #include <linux/if_bridge.h>
 #include <linux/clocksource.h>
 #include <linux/net_tstamp.h>
@@ -391,6 +392,11 @@  struct i40e_flex_pit {
 	u8 pit_index;
 };
 
+struct i40e_fwd_adapter {
+	struct net_device *netdev;
+	int bit_no;
+};
+
 struct i40e_channel {
 	struct list_head list;
 	bool initialized;
@@ -405,11 +411,25 @@  struct i40e_channel {
 	struct i40e_aqc_vsi_properties_data info;
 
 	u64 max_tx_rate;
+	struct i40e_fwd_adapter *fwd;
 
 	/* track this channel belongs to which VSI */
 	struct i40e_vsi *parent_vsi;
 };
 
+static inline bool i40e_is_channel_macvlan(struct i40e_channel *ch)
+{
+	return !!ch->fwd;
+}
+
+static inline u8 *i40e_channel_mac(struct i40e_channel *ch)
+{
+	if (i40e_is_channel_macvlan(ch))
+		return ch->fwd->netdev->dev_addr;
+	else
+		return NULL;
+}
+
 /* struct that defines the Ethernet device */
 struct i40e_pf {
 	struct pci_dev *pdev;
@@ -787,6 +807,12 @@  struct i40e_vsi {
 	struct list_head ch_list;
 	u16 tc_seid_map[I40E_MAX_TRAFFIC_CLASS];
 
+	/* macvlan fields */
+#define I40E_MAX_MACVLANS	128 /* Max HW capable vectors - 1 on FVL */
+	DECLARE_BITMAP(fwd_bitmask, I40E_MAX_MACVLANS);
+	struct list_head macvlan_list;
+	int macvlan_cnt;
+
 	void *priv;	/* client driver data reference. */
 
 	/* VSI specific handlers */
diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c b/drivers/net/ethernet/intel/i40e/i40e_main.c
index 6675d1bb873c..6365064d8d3d 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_main.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_main.c
@@ -5818,8 +5818,10 @@  static int i40e_add_channel(struct i40e_pf *pf, u16 uplink_seid,
 		return -ENOENT;
 	}
 
-	/* Success, update channel */
-	ch->enabled_tc = enabled_tc;
+	/* Success, update channel, set enabled_tc only if the channel
+	 * is not a macvlan
+	 */
+	ch->enabled_tc = !i40e_is_channel_macvlan(ch) && enabled_tc;
 	ch->seid = ctxt.seid;
 	ch->vsi_number = ctxt.vsi_number;
 	ch->stat_counter_idx = cpu_to_le16(ctxt.info.stat_counter_idx);
@@ -6811,6 +6813,473 @@  static void i40e_vsi_set_default_tc_config(struct i40e_vsi *vsi)
 	}
 }
 
+/**
+ * i40e_del_macvlan_filter
+ * @hw: pointer to the HW structure
+ * @seid: seid of the channel VSI
+ * @macaddr: the mac address to apply as a filter
+ * @aq_err: store the admin Q error
+ *
+ * This function deletes a mac filter on the channel VSI which serves as the
+ * macvlan. Returns 0 on success.
+ **/
+static i40e_status i40e_del_macvlan_filter(struct i40e_hw *hw, u16 seid,
+					   const u8 *macaddr, int *aq_err)
+{
+	struct i40e_aqc_remove_macvlan_element_data element;
+	i40e_status status;
+
+	memset(&element, 0, sizeof(element));
+	ether_addr_copy(element.mac_addr, macaddr);
+	element.vlan_tag = 0;
+	element.flags = I40E_AQC_MACVLAN_DEL_PERFECT_MATCH;
+	status = i40e_aq_remove_macvlan(hw, seid, &element, 1, NULL);
+	*aq_err = hw->aq.asq_last_status;
+	return status;
+}
+
+/**
+ * i40e_add_macvlan_filter
+ * @hw: pointer to the HW structure
+ * @seid: seid of the channel VSI
+ * @macaddr: the mac address to apply as a filter
+ * @aq_err: store the admin Q error
+ *
+ * This function adds a mac filter on the channel VSI which serves as the
+ * macvlan. Returns 0 on success.
+ **/
+static i40e_status i40e_add_macvlan_filter(struct i40e_hw *hw, u16 seid,
+					   const u8 *macaddr, int *aq_err)
+{
+	struct i40e_aqc_add_macvlan_element_data element;
+	i40e_status status;
+	u16 cmd_flags = 0;
+
+	ether_addr_copy(element.mac_addr, macaddr);
+	element.vlan_tag = 0;
+	element.queue_number = 0;
+	element.match_method = I40E_AQC_MM_ERR_NO_RES;
+	cmd_flags |= I40E_AQC_MACVLAN_ADD_PERFECT_MATCH;
+	element.flags = cpu_to_le16(cmd_flags);
+	status = i40e_aq_add_macvlan(hw, seid, &element, 1, NULL);
+	*aq_err = hw->aq.asq_last_status;
+	return status;
+}
+
+/**
+ * i40e_reset_ch_rings - Reset the queue contexts in a channel
+ * @vsi: the VSI we want to access
+ * @ch: the channel we want to access
+ */
+static void i40e_reset_ch_rings(struct i40e_vsi *vsi, struct i40e_channel *ch)
+{
+	struct i40e_ring *tx_ring, *rx_ring;
+	u16 pf_q;
+	int i;
+
+	for (i = 0; i < ch->num_queue_pairs; i++) {
+		pf_q = ch->base_queue + i;
+		tx_ring = vsi->tx_rings[pf_q];
+		tx_ring->ch = NULL;
+		rx_ring = vsi->rx_rings[pf_q];
+		rx_ring->ch = NULL;
+	}
+}
+
+/**
+ * i40e_free_macvlan_channels
+ * @vsi: the VSI we want to access
+ *
+ * This function frees the Qs of the channel VSI from
+ * the stack and also deletes the channel VSIs which
+ * serve as macvlans.
+ */
+static void i40e_free_macvlan_channels(struct i40e_vsi *vsi)
+{
+	struct i40e_channel *ch, *ch_tmp;
+	int ret;
+
+	if (list_empty(&vsi->macvlan_list))
+		return;
+
+	list_for_each_entry_safe(ch, ch_tmp, &vsi->macvlan_list, list) {
+		struct i40e_vsi *parent_vsi;
+
+		if (i40e_is_channel_macvlan(ch)) {
+			i40e_reset_ch_rings(vsi, ch);
+			clear_bit(ch->fwd->bit_no, vsi->fwd_bitmask);
+			netdev_unbind_sb_channel(vsi->netdev, ch->fwd->netdev);
+			netdev_set_sb_channel(ch->fwd->netdev, 0);
+			kfree(ch->fwd);
+			ch->fwd = NULL;
+		}
+
+		list_del(&ch->list);
+		parent_vsi = ch->parent_vsi;
+		if (!parent_vsi || !ch->initialized) {
+			kfree(ch);
+			continue;
+		}
+
+		/* remove the VSI */
+		ret = i40e_aq_delete_element(&vsi->back->hw, ch->seid,
+					     NULL);
+		if (ret)
+			dev_err(&vsi->back->pdev->dev,
+				"unable to remove channel (%d) for parent VSI(%d)\n",
+				ch->seid, parent_vsi->seid);
+		kfree(ch);
+	}
+	vsi->macvlan_cnt = 0;
+}
+
+/**
+ * i40e_fwd_ring_up - bring the macvlan device up
+ * @vsi: the VSI we want to access
+ * @vdev: macvlan netdevice
+ * @fwd: the private fwd structure
+ */
+static int i40e_fwd_ring_up(struct i40e_vsi *vsi, struct net_device *vdev,
+			    struct i40e_fwd_adapter *fwd)
+{
+	int ret = 0, num_tc = 1,  i, aq_err;
+	struct i40e_channel *ch, *ch_tmp;
+	struct i40e_pf *pf = vsi->back;
+	struct i40e_hw *hw = &pf->hw;
+
+	if (list_empty(&vsi->macvlan_list))
+		return -EINVAL;
+
+	/* Go through the list and find an available channel */
+	list_for_each_entry_safe(ch, ch_tmp, &vsi->macvlan_list, list) {
+		if (!i40e_is_channel_macvlan(ch)) {
+			ch->fwd = fwd;
+			/* record configuration for macvlan interface in vdev */
+			for (i = 0; i < num_tc; i++)
+				netdev_bind_sb_channel_queue(vsi->netdev, vdev,
+							     i,
+							     ch->num_queue_pairs,
+							     ch->base_queue);
+			for (i = 0; i < ch->num_queue_pairs; i++) {
+				struct i40e_ring *tx_ring, *rx_ring;
+				u16 pf_q;
+
+				pf_q = ch->base_queue + i;
+
+				/* Get to TX ring ptr */
+				tx_ring = vsi->tx_rings[pf_q];
+				tx_ring->ch = ch;
+
+				/* Get the RX ring ptr */
+				rx_ring = vsi->rx_rings[pf_q];
+				rx_ring->ch = ch;
+			}
+			break;
+		}
+	}
+
+	/* Guarantee all rings are updated before we update the
+	 * MAC address filter.
+	 */
+	wmb();
+
+	/* Add a mac filter */
+	ret = i40e_add_macvlan_filter(hw, ch->seid, vdev->dev_addr, &aq_err);
+	if (ret) {
+		/* if we cannot add the MAC rule then disable the offload */
+		macvlan_release_l2fw_offload(vdev);
+		for (i = 0; i < ch->num_queue_pairs; i++) {
+			struct i40e_ring *rx_ring;
+			u16 pf_q;
+
+			pf_q = ch->base_queue + i;
+			rx_ring = vsi->rx_rings[pf_q];
+			rx_ring->netdev = NULL;
+		}
+		dev_info(&pf->pdev->dev,
+			 "Error adding mac filter on macvlan err %s, aq_err %s\n",
+			  i40e_stat_str(hw, ret),
+			  i40e_aq_str(hw, aq_err));
+		netdev_err(vdev, "L2fwd offload disabled due to L2 filter error\n");
+	}
+	return ret;
+}
+
+/**
+ * i40e_setup_macvlans - create the channels which will be macvlans
+ * @vsi: the VSI we want to access
+ * @macvlan_cnt: no. of macvlans to be setup
+ * @qcnt: no. of Qs per macvlan
+ * @vdev: macvlan netdevice
+ */
+static int i40e_setup_macvlans(struct i40e_vsi *vsi, u16 macvlan_cnt, u16 qcnt,
+			       struct net_device *vdev)
+{
+	struct i40e_pf *pf = vsi->back;
+	struct i40e_hw *hw = &pf->hw;
+	struct i40e_vsi_context ctxt;
+	u16 sections, qmap, num_qps;
+	struct i40e_channel *ch;
+	int i, pow, ret = 0;
+	u8 offset = 0;
+
+	if (vsi->type != I40E_VSI_MAIN)
+		return -EINVAL;
+	if (!macvlan_cnt)
+		return -EINVAL;
+
+	num_qps = vsi->num_queue_pairs - (macvlan_cnt * qcnt);
+
+	/* find the next higher power-of-2 of num queue pairs */
+	pow = fls(roundup_pow_of_two(num_qps) - 1);
+
+	qmap = (offset << I40E_AQ_VSI_TC_QUE_OFFSET_SHIFT) |
+		(pow << I40E_AQ_VSI_TC_QUE_NUMBER_SHIFT);
+
+	/* Setup context bits for the main VSI */
+	sections = I40E_AQ_VSI_PROP_QUEUE_MAP_VALID;
+	sections |= I40E_AQ_VSI_PROP_SCHED_VALID;
+	memset(&ctxt, 0, sizeof(ctxt));
+	ctxt.seid = vsi->seid;
+	ctxt.pf_num = vsi->back->hw.pf_id;
+	ctxt.vf_num = 0;
+	ctxt.uplink_seid = vsi->uplink_seid;
+	ctxt.info = vsi->info;
+	ctxt.info.tc_mapping[0] = cpu_to_le16(qmap);
+	ctxt.info.mapping_flags |= cpu_to_le16(I40E_AQ_VSI_QUE_MAP_CONTIG);
+	ctxt.info.queue_mapping[0] = cpu_to_le16(vsi->base_queue);
+	ctxt.info.valid_sections |= cpu_to_le16(sections);
+
+	/* Reconfigure RSS for main VSI with new max queue count */
+	vsi->rss_size = max_t(u16, num_qps, qcnt);
+	ret = i40e_vsi_config_rss(vsi);
+	if (ret) {
+		dev_info(&pf->pdev->dev,
+			 "Failed to reconfig RSS for num_queues (%u)\n",
+			 vsi->rss_size);
+		return ret;
+	}
+	vsi->reconfig_rss = true;
+	dev_dbg(&vsi->back->pdev->dev,
+		"Reconfigured RSS with num_queues (%u)\n", vsi->rss_size);
+	vsi->next_base_queue = num_qps;
+	vsi->cnt_q_avail = vsi->num_queue_pairs - num_qps;
+
+	/* Update the VSI after updating the VSI queue-mapping
+	 * information
+	 */
+	ret = i40e_aq_update_vsi_params(hw, &ctxt, NULL);
+	if (ret) {
+		dev_info(&pf->pdev->dev,
+			 "Update vsi tc config failed, err %s aq_err %s\n",
+			 i40e_stat_str(hw, ret),
+			 i40e_aq_str(hw, hw->aq.asq_last_status));
+		return ret;
+	}
+	/* update the local VSI info with updated queue map */
+	i40e_vsi_update_queue_map(vsi, &ctxt);
+	vsi->info.valid_sections = 0;
+
+	/* Create channels for macvlans */
+	INIT_LIST_HEAD(&vsi->macvlan_list);
+	for (i = 0; i < macvlan_cnt; i++) {
+		ch = kzalloc(sizeof(*ch), GFP_KERNEL);
+		if (!ch) {
+			ret = -ENOMEM;
+			goto err_free;
+		}
+		INIT_LIST_HEAD(&ch->list);
+		ch->num_queue_pairs = qcnt;
+		if (!i40e_setup_channel(pf, vsi, ch)) {
+			ret = -EINVAL;
+			goto err_free;
+		}
+		ch->parent_vsi = vsi;
+		vsi->cnt_q_avail -= ch->num_queue_pairs;
+		vsi->macvlan_cnt++;
+		list_add_tail(&ch->list, &vsi->macvlan_list);
+	}
+	return ret;
+
+err_free:
+	dev_info(&pf->pdev->dev, "Failed to setup macvlans\n");
+	i40e_free_macvlan_channels(vsi);
+	return ret;
+}
+
+/**
+ * i40e_fwd_add - configure macvlans
+ * @netdev: net device to configure
+ * @vdev: macvlan netdevice
+ **/
+static void *i40e_fwd_add(struct net_device *netdev, struct net_device *vdev)
+{
+	struct i40e_netdev_priv *np = netdev_priv(netdev);
+	u16 q_per_macvlan = 0, macvlan_cnt = 0, vectors;
+	struct i40e_vsi *vsi = np->vsi;
+	struct i40e_pf *pf = vsi->back;
+	struct i40e_fwd_adapter *fwd;
+	int avail_macvlan, ret;
+
+	if ((pf->flags & I40E_FLAG_DCB_ENABLED)) {
+		netdev_info(netdev, "Macvlans are not supported when DCB is enabled\n");
+		return ERR_PTR(-EINVAL);
+	}
+	if ((pf->flags & I40E_FLAG_TC_MQPRIO)) {
+		netdev_info(netdev, "Macvlans are not supported when HW TC offload is on\n");
+		return ERR_PTR(-EINVAL);
+	}
+
+	/* The macvlan device has to be a single Q device so that the
+	 * tc_to_txq field can be reused to pick the tx queue.
+	 */
+	if (netif_is_multiqueue(vdev))
+		return ERR_PTR(-ERANGE);
+
+	if (!vsi->macvlan_cnt) {
+		/* reserve bit 0 for the pf device */
+		set_bit(0, vsi->fwd_bitmask);
+
+		/* Try to reserve as many queues as possible for macvlans. First
+		 * reserve 3/4th of max vectors, then half, then quarter and
+		 * calculate Qs per macvlan as you go
+		 */
+		vectors = pf->num_lan_msix;
+		if (vectors <= I40E_MAX_MACVLANS && vectors > 96) {
+			/* allocate 4 Qs per macvlan and 32 Qs to the PF*/
+			q_per_macvlan = 4;
+			macvlan_cnt = (vectors - 32) / 4;
+		} else if (vectors <= 96 && vectors > 64) {
+			/* allocate 4 Qs per macvlan and 32 Qs to the PF*/
+			q_per_macvlan = 4;
+			macvlan_cnt = (vectors - 32) / 4;
+		} else if (vectors <= 64 && vectors > 32) {
+			/* allocate 2 Qs per macvlan and 16 Qs to the PF*/
+			q_per_macvlan = 2;
+			macvlan_cnt = (vectors - 16) / 2;
+		} else {
+			/* allocate 1 Q per macvlan 16 Qs to the PF*/
+			q_per_macvlan = 1;
+			macvlan_cnt = (vectors - 16);
+		}
+		if (macvlan_cnt == 0)
+			return ERR_PTR(-EBUSY);
+
+		/* Quiesce VSI queues */
+		i40e_quiesce_vsi(vsi);
+
+		/* sets up the macvlans but does not "enable" them */
+		ret = i40e_setup_macvlans(vsi, macvlan_cnt, q_per_macvlan,
+					  vdev);
+		if (ret)
+			return ERR_PTR(ret);
+
+		/* Unquiesce VSI */
+		i40e_unquiesce_vsi(vsi);
+	}
+	avail_macvlan = find_first_zero_bit(vsi->fwd_bitmask,
+					    vsi->macvlan_cnt);
+	if (avail_macvlan >= I40E_MAX_MACVLANS)
+		return ERR_PTR(-EBUSY);
+
+	/* create the fwd struct */
+	fwd = kzalloc(sizeof(*fwd), GFP_KERNEL);
+	if (!fwd)
+		return ERR_PTR(-ENOMEM);
+
+	set_bit(avail_macvlan, vsi->fwd_bitmask);
+	fwd->bit_no = avail_macvlan;
+	netdev_set_sb_channel(vdev, avail_macvlan);
+	fwd->netdev = vdev;
+
+	if (!netif_running(netdev))
+		return fwd;
+
+	/* Set fwd ring up */
+	ret = i40e_fwd_ring_up(vsi, vdev, fwd);
+	if (ret) {
+		/* unbind the queues and drop the subordinate channel config */
+		netdev_unbind_sb_channel(netdev, vdev);
+		netdev_set_sb_channel(vdev, 0);
+
+		kfree(fwd);
+		return ERR_PTR(-EINVAL);
+	}
+	return fwd;
+}
+
+/**
+ * i40e_del_all_macvlans - Delete all the mac filters on the channels
+ * @vsi: the VSI we want to access
+ */
+static void i40e_del_all_macvlans(struct i40e_vsi *vsi)
+{
+	struct i40e_channel *ch, *ch_tmp;
+	struct i40e_pf *pf = vsi->back;
+	struct i40e_hw *hw = &pf->hw;
+	int aq_err, ret = 0;
+
+	list_for_each_entry_safe(ch, ch_tmp, &vsi->macvlan_list, list) {
+		if (i40e_is_channel_macvlan(ch)) {
+			ret = i40e_del_macvlan_filter(hw, ch->seid,
+						      i40e_channel_mac(ch),
+						      &aq_err);
+			if (!ret) {
+				/* Reset queue contexts */
+				i40e_reset_ch_rings(vsi, ch);
+				clear_bit(ch->fwd->bit_no, vsi->fwd_bitmask);
+				netdev_unbind_sb_channel(vsi->netdev,
+							 ch->fwd->netdev);
+				netdev_set_sb_channel(ch->fwd->netdev, 0);
+				kfree(ch->fwd);
+				ch->fwd = NULL;
+			}
+		}
+	}
+}
+
+/**
+ * i40e_fwd_del - delete macvlan interfaces
+ * @netdev: net device to configure
+ * @vdev: macvlan netdevice
+ */
+static void i40e_fwd_del(struct net_device *netdev, void *vdev)
+{
+	struct i40e_netdev_priv *np = netdev_priv(netdev);
+	struct i40e_fwd_adapter *fwd = vdev;
+	struct i40e_channel *ch, *ch_tmp;
+	struct i40e_vsi *vsi = np->vsi;
+	struct i40e_pf *pf = vsi->back;
+	struct i40e_hw *hw = &pf->hw;
+	int aq_err, ret = 0;
+
+	/* Find the channel associated with the macvlan and del mac filter */
+	list_for_each_entry_safe(ch, ch_tmp, &vsi->macvlan_list, list) {
+		if (i40e_is_channel_macvlan(ch) &&
+		    ether_addr_equal(i40e_channel_mac(ch),
+				     fwd->netdev->dev_addr)) {
+			ret = i40e_del_macvlan_filter(hw, ch->seid,
+						      i40e_channel_mac(ch),
+						      &aq_err);
+			if (!ret) {
+				/* Reset queue contexts */
+				i40e_reset_ch_rings(vsi, ch);
+				clear_bit(ch->fwd->bit_no, vsi->fwd_bitmask);
+				netdev_unbind_sb_channel(netdev, fwd->netdev);
+				netdev_set_sb_channel(fwd->netdev, 0);
+				kfree(ch->fwd);
+				ch->fwd = NULL;
+			} else {
+				dev_info(&pf->pdev->dev,
+					 "Error deleting mac filter on macvlan err %s, aq_err %s\n",
+					  i40e_stat_str(hw, ret),
+					  i40e_aq_str(hw, aq_err));
+			}
+			break;
+		}
+	}
+}
+
 /**
  * i40e_setup_tc - configure multiple traffic classes
  * @netdev: net device to configure
@@ -11581,6 +12050,9 @@  static int i40e_set_features(struct net_device *netdev,
 		return -EINVAL;
 	}
 
+	if (!(features & NETIF_F_HW_L2FW_DOFFLOAD))
+		i40e_del_all_macvlans(vsi);
+
 	need_reset = i40e_set_ntuple(pf, features);
 
 	if (need_reset)
@@ -12314,6 +12786,8 @@  static const struct net_device_ops i40e_netdev_ops = {
 	.ndo_bpf		= i40e_xdp,
 	.ndo_xdp_xmit		= i40e_xdp_xmit,
 	.ndo_xsk_async_xmit	= i40e_xsk_async_xmit,
+	.ndo_dfwd_add_station	= i40e_fwd_add,
+	.ndo_dfwd_del_station	= i40e_fwd_del,
 };
 
 /**
@@ -12373,6 +12847,9 @@  static int i40e_config_netdev(struct i40e_vsi *vsi)
 	/* record features VLANs can make use of */
 	netdev->vlan_features |= hw_enc_features | NETIF_F_TSO_MANGLEID;
 
+	/* enable macvlan offloads */
+	netdev->hw_features |= NETIF_F_HW_L2FW_DOFFLOAD;
+
 	hw_features = hw_enc_features		|
 		      NETIF_F_HW_VLAN_CTAG_TX	|
 		      NETIF_F_HW_VLAN_CTAG_RX;