diff mbox series

[net,02/11] net/mlx5e: Fix multicast counter not up-to-date in "ip -s"

Message ID 20200702221923.650779-3-saeedm@mellanox.com
State Changes Requested
Delegated to: David Miller
Headers show
Series [net,01/11] net/mlx5: Fix eeprom support for SFP module | expand

Commit Message

Saeed Mahameed July 2, 2020, 10:19 p.m. UTC
From: Ron Diskin <rondi@mellanox.com>

Currently the FW does not generate events for counters other than error
counters. Unlike ".get_ethtool_stats", ".ndo_get_stats64" (which ip -s
uses) might run in atomic context, while the FW interface is non atomic.
Thus, 'ip' is not allowed to issue fw commands, so it will only display
cached counters in the driver.

Add a SW counter (mcast_packets) in the driver to count rx multicast
packets. The counter also counts broadcast packets, as we consider it a
special case of multicast.
Use the counter value when calling "ip -s"/"ifconfig".  Display the new
counter when calling "ethtool -S", and add a matching counter
(mcast_bytes) for completeness.

Fixes: f62b8bb8f2d3 ("net/mlx5: Extend mlx5_core to support ConnectX-4 Ethernet functionality")
Signed-off-by: Ron Diskin <rondi@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/en/txrx.h  | 5 +++++
 drivers/net/ethernet/mellanox/mlx5/core/en_main.c  | 8 +-------
 drivers/net/ethernet/mellanox/mlx5/core/en_rx.c    | 6 ++++++
 drivers/net/ethernet/mellanox/mlx5/core/en_stats.c | 6 ++++++
 drivers/net/ethernet/mellanox/mlx5/core/en_stats.h | 4 ++++
 5 files changed, 22 insertions(+), 7 deletions(-)

Comments

Jakub Kicinski July 3, 2020, 1:47 a.m. UTC | #1
On Thu,  2 Jul 2020 15:19:14 -0700 Saeed Mahameed wrote:
> From: Ron Diskin <rondi@mellanox.com>
> 
> Currently the FW does not generate events for counters other than error
> counters. Unlike ".get_ethtool_stats", ".ndo_get_stats64" (which ip -s
> uses) might run in atomic context, while the FW interface is non atomic.
> Thus, 'ip' is not allowed to issue fw commands, so it will only display
> cached counters in the driver.
> 
> Add a SW counter (mcast_packets) in the driver to count rx multicast
> packets. The counter also counts broadcast packets, as we consider it a
> special case of multicast.
> Use the counter value when calling "ip -s"/"ifconfig".  Display the new
> counter when calling "ethtool -S", and add a matching counter
> (mcast_bytes) for completeness.

What is the problem that is being solved here exactly?

Device counts mcast wrong / unsuitably?

> Fixes: f62b8bb8f2d3 ("net/mlx5: Extend mlx5_core to support ConnectX-4 Ethernet functionality")
> Signed-off-by: Ron Diskin <rondi@mellanox.com>
> Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Saeed Mahameed July 3, 2020, 3:45 a.m. UTC | #2
On Thu, 2020-07-02 at 18:47 -0700, Jakub Kicinski wrote:
> On Thu,  2 Jul 2020 15:19:14 -0700 Saeed Mahameed wrote:
> > From: Ron Diskin <rondi@mellanox.com>
> > 
> > Currently the FW does not generate events for counters other than
> > error
> > counters. Unlike ".get_ethtool_stats", ".ndo_get_stats64" (which ip
> > -s
> > uses) might run in atomic context, while the FW interface is non
> > atomic.
> > Thus, 'ip' is not allowed to issue fw commands, so it will only
> > display
> > cached counters in the driver.
> > 
> > Add a SW counter (mcast_packets) in the driver to count rx
> > multicast
> > packets. The counter also counts broadcast packets, as we consider
> > it a
> > special case of multicast.
> > Use the counter value when calling "ip -s"/"ifconfig".  Display the
> > new
> > counter when calling "ethtool -S", and add a matching counter
> > (mcast_bytes) for completeness.
> 
> What is the problem that is being solved here exactly?
> 
> Device counts mcast wrong / unsuitably?
> 

To read mcast counter we need to execute FW command which is blocking,
we can't block in atomic context .ndo_get_stats64 :( .. we have to
count in SW. 

the previous approach wasn't accurate as we read the mcast counter in a
background thread triggered by the previous read.. so we were off by
the interval between two reads.

> > Fixes: f62b8bb8f2d3 ("net/mlx5: Extend mlx5_core to support
> > ConnectX-4 Ethernet functionality")
> > Signed-off-by: Ron Diskin <rondi@mellanox.com>
> > Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
> > Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Jakub Kicinski July 3, 2020, 4:25 a.m. UTC | #3
On Fri, 3 Jul 2020 03:45:45 +0000 Saeed Mahameed wrote:
> On Thu, 2020-07-02 at 18:47 -0700, Jakub Kicinski wrote:
> > On Thu,  2 Jul 2020 15:19:14 -0700 Saeed Mahameed wrote:  
> > > From: Ron Diskin <rondi@mellanox.com>
> > > 
> > > Currently the FW does not generate events for counters other than
> > > error
> > > counters. Unlike ".get_ethtool_stats", ".ndo_get_stats64" (which ip
> > > -s
> > > uses) might run in atomic context, while the FW interface is non
> > > atomic.
> > > Thus, 'ip' is not allowed to issue fw commands, so it will only
> > > display
> > > cached counters in the driver.
> > > 
> > > Add a SW counter (mcast_packets) in the driver to count rx
> > > multicast
> > > packets. The counter also counts broadcast packets, as we consider
> > > it a
> > > special case of multicast.
> > > Use the counter value when calling "ip -s"/"ifconfig".  Display the
> > > new
> > > counter when calling "ethtool -S", and add a matching counter
> > > (mcast_bytes) for completeness.  
> > 
> > What is the problem that is being solved here exactly?
> > 
> > Device counts mcast wrong / unsuitably?
> >   
> 
> To read mcast counter we need to execute FW command which is blocking,
> we can't block in atomic context .ndo_get_stats64 :( .. we have to
> count in SW. 
> 
> the previous approach wasn't accurate as we read the mcast counter in a
> background thread triggered by the previous read.. so we were off by
> the interval between two reads.

And that's bad enough to cause trouble? What's the worst case time
delta you're seeing?

> > > Fixes: f62b8bb8f2d3 ("net/mlx5: Extend mlx5_core to support
> > > ConnectX-4 Ethernet functionality")
> > > Signed-off-by: Ron Diskin <rondi@mellanox.com>
> > > Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
> > > Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Saeed Mahameed July 3, 2020, 6:15 a.m. UTC | #4
On Thu, 2020-07-02 at 21:25 -0700, Jakub Kicinski wrote:
> On Fri, 3 Jul 2020 03:45:45 +0000 Saeed Mahameed wrote:
> > On Thu, 2020-07-02 at 18:47 -0700, Jakub Kicinski wrote:
> > > On Thu,  2 Jul 2020 15:19:14 -0700 Saeed Mahameed wrote:  
> > > > From: Ron Diskin <rondi@mellanox.com>
> > > > 
> > > > Currently the FW does not generate events for counters other
> > > > than
> > > > error
> > > > counters. Unlike ".get_ethtool_stats", ".ndo_get_stats64"
> > > > (which ip
> > > > -s
> > > > uses) might run in atomic context, while the FW interface is
> > > > non
> > > > atomic.
> > > > Thus, 'ip' is not allowed to issue fw commands, so it will only
> > > > display
> > > > cached counters in the driver.
> > > > 
> > > > Add a SW counter (mcast_packets) in the driver to count rx
> > > > multicast
> > > > packets. The counter also counts broadcast packets, as we
> > > > consider
> > > > it a
> > > > special case of multicast.
> > > > Use the counter value when calling "ip -s"/"ifconfig".  Display
> > > > the
> > > > new
> > > > counter when calling "ethtool -S", and add a matching counter
> > > > (mcast_bytes) for completeness.  
> > > 
> > > What is the problem that is being solved here exactly?
> > > 
> > > Device counts mcast wrong / unsuitably?
> > >   
> > 
> > To read mcast counter we need to execute FW command which is
> > blocking,
> > we can't block in atomic context .ndo_get_stats64 :( .. we have to
> > count in SW. 
> > 
> > the previous approach wasn't accurate as we read the mcast counter
> > in a
> > background thread triggered by the previous read.. so we were off
> > by
> > the interval between two reads.
> 
> And that's bad enough to cause trouble? What's the worst case time
> delta you're seeing?
> 

Depends on the user frequency to read stats,
if you read stats once every 5 minutes then mcast stats are off by 5
minutes..

Just thinking out loud, is it ok of we busy loop and wait for FW
response for mcast stats commands ? 

In ethtool -S though, they are accurate since we grab them on the spot
from FW.

> > > > Fixes: f62b8bb8f2d3 ("net/mlx5: Extend mlx5_core to support
> > > > ConnectX-4 Ethernet functionality")
> > > > Signed-off-by: Ron Diskin <rondi@mellanox.com>
> > > > Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
> > > > Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Jakub Kicinski July 3, 2020, 5:59 p.m. UTC | #5
On Fri, 3 Jul 2020 06:15:09 +0000 Saeed Mahameed wrote:
> > > To read mcast counter we need to execute FW command which is
> > > blocking,
> > > we can't block in atomic context .ndo_get_stats64 :( .. we have to
> > > count in SW. 
> > > 
> > > the previous approach wasn't accurate as we read the mcast counter
> > > in a
> > > background thread triggered by the previous read.. so we were off
> > > by
> > > the interval between two reads.  
> > 
> > And that's bad enough to cause trouble? What's the worst case time
> > delta you're seeing?
> 
> Depends on the user frequency to read stats,
> if you read stats once every 5 minutes then mcast stats are off by 5
> minutes..
> 
> Just thinking out loud, is it ok of we busy loop and wait for FW
> response for mcast stats commands ? 
> 
> In ethtool -S though, they are accurate since we grab them on the spot
> from FW.

I don't really feel too strongly, I'm just trying to get the details
because I feel like the situation is going to be increasingly common.
It'd be quite sad if drivers had to reimplement all stats in sw.

I thought it would be entirely reasonable for the driver to read the
stats from a delayed work every 1/2 HZ and cache that. We do have a
knob in ethtool IRQ coalescing settings for stats writeback frequency.

I'm not sure what locks procfs actually holds, if its something that
could impact reading other files - it'd probably be a bad idea to busy
wait.
Saeed Mahameed July 6, 2020, 7:40 p.m. UTC | #6
On Fri, 2020-07-03 at 10:59 -0700, Jakub Kicinski wrote:
> On Fri, 3 Jul 2020 06:15:09 +0000 Saeed Mahameed wrote:
> > > > To read mcast counter we need to execute FW command which is
> > > > blocking,
> > > > we can't block in atomic context .ndo_get_stats64 :( .. we have
> > > > to
> > > > count in SW. 
> > > > 
> > > > the previous approach wasn't accurate as we read the mcast
> > > > counter
> > > > in a
> > > > background thread triggered by the previous read.. so we were
> > > > off
> > > > by
> > > > the interval between two reads.  
> > > 
> > > And that's bad enough to cause trouble? What's the worst case
> > > time
> > > delta you're seeing?
> > 
> > Depends on the user frequency to read stats,
> > if you read stats once every 5 minutes then mcast stats are off by
> > 5
> > minutes..
> > 
> > Just thinking out loud, is it ok of we busy loop and wait for FW
> > response for mcast stats commands ? 
> > 
> > In ethtool -S though, they are accurate since we grab them on the
> > spot
> > from FW.
> 
> I don't really feel too strongly, I'm just trying to get the details
> because I feel like the situation is going to be increasingly common.
> It'd be quite sad if drivers had to reimplement all stats in sw.
> 

Depends on HW, our HW/FW supports providing stats per (Vport/function).
which means if a packet got lost between the NIC and the netdev queue,
it will be counted as rx-packet/mcast, although we have a private
counter to show this drop in ethtool but will be counted in rx counter
in netdev stats, if we used hw stats.

so this is why i always prefer SW stats for netdev reported stats, all
we need to count in SW {rx,tx} X {packets, bytes} + rx mcast packets.

This gives more flexibility and correctness, any given HW can create
multiple netdevs on the same function, we need the netdev stats to
reflect traffic that only went through that netdev.

> I thought it would be entirely reasonable for the driver to read the
> stats from a delayed work every 1/2 HZ and cache that. We do have a
> knob in ethtool IRQ coalescing settings for stats writeback
> frequency.
> 

Some customers didn't like this since for drivers that implement this
their CPU power utilization will be slightly higher on idle.

> I'm not sure what locks procfs actually holds, if its something that
> could impact reading other files - it'd probably be a bad idea to
> busy
> wait.

Agreed.
Jakub Kicinski July 6, 2020, 7:57 p.m. UTC | #7
On Mon, 6 Jul 2020 19:40:50 +0000 Saeed Mahameed wrote:
> > I don't really feel too strongly, I'm just trying to get the details
> > because I feel like the situation is going to be increasingly common.
> > It'd be quite sad if drivers had to reimplement all stats in sw.
> 
> Depends on HW, our HW/FW supports providing stats per (Vport/function).
> which means if a packet got lost between the NIC and the netdev queue,
> it will be counted as rx-packet/mcast, although we have a private
> counter to show this drop in ethtool but will be counted in rx counter
> in netdev stats, if we used hw stats.
> 
> so this is why i always prefer SW stats for netdev reported stats, all
> we need to count in SW {rx,tx} X {packets, bytes} + rx mcast packets.

If that was indeed the intention it'd had been done in the core, not
each driver separately..

> This gives more flexibility and correctness, any given HW can create
> multiple netdevs on the same function, we need the netdev stats to
> reflect traffic that only went through that netdev.
> 
> > I thought it would be entirely reasonable for the driver to read the
> > stats from a delayed work every 1/2 HZ and cache that. We do have a
> > knob in ethtool IRQ coalescing settings for stats writeback
> > frequency.
> 
> Some customers didn't like this since for drivers that implement this
> their CPU power utilization will be slightly higher on idle.

Other customers may dislike the per packet cycles.

I don't really mind, I just found the commit message to be lacking 
for a fix, which this supposedly is.

Also looks like you report the total number of mcast packets in ethtool
-S, which should be identical to ip -s? If so please remove that.
Saeed Mahameed July 7, 2020, 1:51 a.m. UTC | #8
On Mon, 2020-07-06 at 12:57 -0700, Jakub Kicinski wrote:
> On Mon, 6 Jul 2020 19:40:50 +0000 Saeed Mahameed wrote:
> > > I don't really feel too strongly, I'm just trying to get the
> > > details
> > > because I feel like the situation is going to be increasingly
> > > common.
> > > It'd be quite sad if drivers had to reimplement all stats in sw.
> > 
> > Depends on HW, our HW/FW supports providing stats per
> > (Vport/function).
> > which means if a packet got lost between the NIC and the netdev
> > queue,
> > it will be counted as rx-packet/mcast, although we have a private
> > counter to show this drop in ethtool but will be counted in rx
> > counter
> > in netdev stats, if we used hw stats.
> > 
> > so this is why i always prefer SW stats for netdev reported stats,
> > all
> > we need to count in SW {rx,tx} X {packets, bytes} + rx mcast
> > packets.
> 
> If that was indeed the intention it'd had been done in the core, not
> each driver separately..
> 

this is why it depends on the HW. unfortunately in our case HW stats !=
SW stats.

> > This gives more flexibility and correctness, any given HW can
> > create
> > multiple netdevs on the same function, we need the netdev stats to
> > reflect traffic that only went through that netdev.
> > 
> > > I thought it would be entirely reasonable for the driver to read
> > > the
> > > stats from a delayed work every 1/2 HZ and cache that. We do have
> > > a
> > > knob in ethtool IRQ coalescing settings for stats writeback
> > > frequency.
> > 
> > Some customers didn't like this since for drivers that implement
> > this
> > their CPU power utilization will be slightly higher on idle.
> 
> Other customers may dislike the per packet cycles.
> 
> I don't really mind, I just found the commit message to be lacking 
> for a fix, which this supposedly is.
> 

Yes commit message could use some improvement.

I think it is even worse than i thought, we removed  the background
refreshing of stats due to the FW events support to updates some
counters. Multicast counter can't be refreshed by FW events since it is
a data path counters. 

> Also looks like you report the total number of mcast packets in
> ethtool
> -S, which should be identical to ip -s? If so please remove that.

why ? it is ok to report the same counter both in ehttool and netdev
stats.
Jakub Kicinski July 7, 2020, 2:07 a.m. UTC | #9
On Tue, 7 Jul 2020 01:51:21 +0000 Saeed Mahameed wrote:
> > Also looks like you report the total number of mcast packets in
> > ethtool
> > -S, which should be identical to ip -s? If so please remove that.  
> 
> why ? it is ok to report the same counter both in ehttool and netdev
> stats.

I don't think it is, Stephen and I have been trying to catch this in
review for a while now. It's an entirely unnecessary code duplication.
We should steer towards proper APIs first if those exist.

Using ethtool -S stats gets very messy very quickly in production.
Saeed Mahameed July 7, 2020, 3:29 a.m. UTC | #10
On Mon, 2020-07-06 at 19:07 -0700, Jakub Kicinski wrote:
> On Tue, 7 Jul 2020 01:51:21 +0000 Saeed Mahameed wrote:
> > > Also looks like you report the total number of mcast packets in
> > > ethtool
> > > -S, which should be identical to ip -s? If so please remove
> > > that.  
> > 
> > why ? it is ok to report the same counter both in ehttool and
> > netdev
> > stats.
> 
> I don't think it is, Stephen and I have been trying to catch this in
> review for a while now. It's an entirely unnecessary code
> duplication.

Code duplication shouldn't be a factor. For example, in mlx5 we have a
generic mechanism to define and report stats, for the netdev stats to
be reported in ethtoool all we need to do is define the representing
string and store those counters in the SW stats struct.

> We should steer towards proper APIs first if those exist.
> 
> Using ethtool -S stats gets very messy very quickly in production.

I agree on ethtool getting messy very quickly, but i disagree on not
reporting netdev stats, I don't think the 4 netdev stats are the reason
for messy ethtool.

Ethtool -S is meant for verbosity and debug, and it is full of
driver/HW specific counters, how are you planing to audit all of those
?

We had an idea in the past to allow users to choose what stats groups
to report to ethtool, we can follow up on this if this is something
other drivers might be interested in.

example: 

ethtool -S eth0 --groups XDP,SW,PER_QUEUE,PCI,PORT,DRIVER_SPECIFIC
Where non DRIVER_SPECIFC groups are standardize stats objects.. 

-Saeed.
Jakub Kicinski July 7, 2020, 4:37 p.m. UTC | #11
On Tue, 7 Jul 2020 03:29:18 +0000 Saeed Mahameed wrote:
> On Mon, 2020-07-06 at 19:07 -0700, Jakub Kicinski wrote:
> > On Tue, 7 Jul 2020 01:51:21 +0000 Saeed Mahameed wrote:  
> > > > Also looks like you report the total number of mcast packets in
> > > > ethtool
> > > > -S, which should be identical to ip -s? If so please remove
> > > > that.    
> > > 
> > > why ? it is ok to report the same counter both in ehttool and
> > > netdev
> > > stats.  
> > 
> > I don't think it is, Stephen and I have been trying to catch this in
> > review for a while now. It's an entirely unnecessary code
> > duplication.  
> 
> Code duplication shouldn't be a factor. For example, in mlx5 we have a
> generic mechanism to define and report stats, for the netdev stats to
> be reported in ethtoool all we need to do is define the representing
> string and store those counters in the SW stats struct.

And sum them up in a loop. One day we'll have a better API for standard
stats and will will we still argue then that ethtool -S should
duplicate _everything_.

Drivers should put more focus on providing useful information in
standard statistics in the first place, then think about exposing more
details as needed.

> > We should steer towards proper APIs first if those exist.
> > 
> > Using ethtool -S stats gets very messy very quickly in production.  
> 
> I agree on ethtool getting messy very quickly, but i disagree on not
> reporting netdev stats, I don't think the 4 netdev stats are the reason
> for messy ethtool.
> 
> Ethtool -S is meant for verbosity and debug, and it is full of
> driver/HW specific counters, how are you planing to audit all of those
> ?

I don't understand how verbosity, debug, and being full of HW specific
counters has any relevance for this patch - which is reporting a pure
SW driver stat.

> We had an idea in the past to allow users to choose what stats groups
> to report to ethtool, we can follow up on this if this is something
> other drivers might be interested in.
> 
> example: 
> 
> ethtool -S eth0 --groups XDP,SW,PER_QUEUE,PCI,PORT,DRIVER_SPECIFIC
> Where non DRIVER_SPECIFC groups are standardize stats objects.. 

Attributes are useful but the primary problem is the fact that we,
driver developers, seem to be funneling all our creative passion into
coming up with new names for statistics.

Look for example how many names we have for etherStatsPkts256to511Octets
And nobody(!) thought it's a good idea to just name the counter what
it's called in the RFC.

Policing a free form string interface in review is just unworkable.

Anyway.. I'm sidetracking.
diff mbox series

Patch

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/txrx.h b/drivers/net/ethernet/mellanox/mlx5/core/en/txrx.h
index bfd3e1161bc6..5b554e79d734 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/txrx.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/txrx.h
@@ -13,6 +13,11 @@  enum mlx5e_icosq_wqe_type {
 	MLX5E_ICOSQ_WQE_UMR_RX,
 };
 
+static inline bool mlx5e_skb_is_multicast(struct sk_buff *skb)
+{
+	return skb->pkt_type == PACKET_MULTICAST || skb->pkt_type == PACKET_BROADCAST;
+}
+
 static inline bool
 mlx5e_wqc_has_room_for(struct mlx5_wq_cyc *wq, u16 cc, u16 pc, u16 n)
 {
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
index a836a02a2116..ee4cc723d225 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
@@ -3558,6 +3558,7 @@  void mlx5e_fold_sw_stats64(struct mlx5e_priv *priv, struct rtnl_link_stats64 *s)
 
 		s->rx_packets   += rq_stats->packets + xskrq_stats->packets;
 		s->rx_bytes     += rq_stats->bytes + xskrq_stats->bytes;
+		s->multicast    += rq_stats->mcast_packets + xskrq_stats->mcast_packets;
 
 		for (j = 0; j < priv->max_opened_tc; j++) {
 			struct mlx5e_sq_stats *sq_stats = &channel_stats->sq[j];
@@ -3573,7 +3574,6 @@  void
 mlx5e_get_stats(struct net_device *dev, struct rtnl_link_stats64 *stats)
 {
 	struct mlx5e_priv *priv = netdev_priv(dev);
-	struct mlx5e_vport_stats *vstats = &priv->stats.vport;
 	struct mlx5e_pport_stats *pstats = &priv->stats.pport;
 
 	/* In switchdev mode, monitor counters doesn't monitor
@@ -3608,12 +3608,6 @@  mlx5e_get_stats(struct net_device *dev, struct rtnl_link_stats64 *stats)
 	stats->rx_errors = stats->rx_length_errors + stats->rx_crc_errors +
 			   stats->rx_frame_errors;
 	stats->tx_errors = stats->tx_aborted_errors + stats->tx_carrier_errors;
-
-	/* vport multicast also counts packets that are dropped due to steering
-	 * or rx out of buffer
-	 */
-	stats->multicast =
-		VPORT_COUNTER_GET(vstats, received_eth_multicast.packets);
 }
 
 static void mlx5e_set_rx_mode(struct net_device *dev)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
index dbb1c6323967..ce023f1c45d5 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
@@ -50,6 +50,7 @@ 
 #include "en/xdp.h"
 #include "en/xsk/rx.h"
 #include "en/health.h"
+#include "en/txrx.h"
 
 static inline bool mlx5e_rx_hw_stamp(struct hwtstamp_config *config)
 {
@@ -1020,6 +1021,11 @@  static inline void mlx5e_build_rx_skb(struct mlx5_cqe64 *cqe,
 		mlx5e_enable_ecn(rq, skb);
 
 	skb->protocol = eth_type_trans(skb, netdev);
+
+	if (unlikely(mlx5e_skb_is_multicast(skb))) {
+		stats->mcast_packets++;
+		stats->mcast_bytes += cqe_bcnt;
+	}
 }
 
 static inline void mlx5e_complete_rx_cqe(struct mlx5e_rq *rq,
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_stats.c b/drivers/net/ethernet/mellanox/mlx5/core/en_stats.c
index f009fe09e99b..f028971016d2 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_stats.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_stats.c
@@ -114,6 +114,8 @@  static const struct counter_desc sw_stats_desc[] = {
 
 	{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, rx_lro_packets) },
 	{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, rx_lro_bytes) },
+	{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, rx_mcast_packets) },
+	{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, rx_mcast_bytes) },
 	{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, rx_ecn_mark) },
 	{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, rx_removed_vlan_packets) },
 	{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, rx_csum_unnecessary) },
@@ -243,6 +245,8 @@  static MLX5E_DECLARE_STATS_GRP_OP_UPDATE_STATS(sw)
 		s->rx_bytes	+= rq_stats->bytes;
 		s->rx_lro_packets += rq_stats->lro_packets;
 		s->rx_lro_bytes	+= rq_stats->lro_bytes;
+		s->rx_mcast_packets += rq_stats->mcast_packets;
+		s->rx_mcast_bytes += rq_stats->mcast_bytes;
 		s->rx_ecn_mark	+= rq_stats->ecn_mark;
 		s->rx_removed_vlan_packets += rq_stats->removed_vlan_packets;
 		s->rx_csum_none	+= rq_stats->csum_none;
@@ -1458,6 +1462,8 @@  static const struct counter_desc rq_stats_desc[] = {
 	{ MLX5E_DECLARE_RX_STAT(struct mlx5e_rq_stats, xdp_redirect) },
 	{ MLX5E_DECLARE_RX_STAT(struct mlx5e_rq_stats, lro_packets) },
 	{ MLX5E_DECLARE_RX_STAT(struct mlx5e_rq_stats, lro_bytes) },
+	{ MLX5E_DECLARE_RX_STAT(struct mlx5e_rq_stats, mcast_packets) },
+	{ MLX5E_DECLARE_RX_STAT(struct mlx5e_rq_stats, mcast_bytes) },
 	{ MLX5E_DECLARE_RX_STAT(struct mlx5e_rq_stats, ecn_mark) },
 	{ MLX5E_DECLARE_RX_STAT(struct mlx5e_rq_stats, removed_vlan_packets) },
 	{ MLX5E_DECLARE_RX_STAT(struct mlx5e_rq_stats, wqe_err) },
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_stats.h b/drivers/net/ethernet/mellanox/mlx5/core/en_stats.h
index 2b83ba990714..86af203f5b05 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_stats.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_stats.h
@@ -119,6 +119,8 @@  struct mlx5e_sw_stats {
 	u64 tx_nop;
 	u64 rx_lro_packets;
 	u64 rx_lro_bytes;
+	u64 rx_mcast_packets;
+	u64 rx_mcast_bytes;
 	u64 rx_ecn_mark;
 	u64 rx_removed_vlan_packets;
 	u64 rx_csum_unnecessary;
@@ -286,6 +288,8 @@  struct mlx5e_rq_stats {
 	u64 csum_none;
 	u64 lro_packets;
 	u64 lro_bytes;
+	u64 mcast_packets;
+	u64 mcast_bytes;
 	u64 ecn_mark;
 	u64 removed_vlan_packets;
 	u64 xdp_drop;