diff mbox series

[ovs-dev,v2] netdev-dpdk: Dump packets that fail Tx preparation.

Message ID 20240307193942.3949516-1-i.maximets@ovn.org
State Accepted
Commit 2c4ffd2f8a23bed1f2141e81abcc68ccba129e3f
Headers show
Series [ovs-dev,v2] netdev-dpdk: Dump packets that fail Tx preparation. | expand

Checks

Context Check Description
ovsrobot/apply-robot success apply and check: success
ovsrobot/github-robot-_Build_and_Test success github build: passed
ovsrobot/intel-ovs-compilation success test: success

Commit Message

Ilya Maximets March 7, 2024, 7:39 p.m. UTC
It's hard to debug situations where driver rejects packets for some
reason.  Dumping out the mbuf should help with that.

Sample output looks like this:

  |netdev_dpdk(pmd-c03/id:8)|DBG|ovs-p1: First invalid packet:
  dump mbuf at 0x1180bce140, iova=0x2cb7ce400, buf_len=2176
    pkt_len=64, ol_flags=0x2, nb_segs=1, port=65535, ptype=0
    segment at 0x1180bce140, data=0x1180bce580, len=90, off=384, refcnt=1
    Dump data at [0x1180bce580], len=64
  00000000: 33 33 00 00 00 16 AA 27 91 F9 4D 96 86 DD 60 00 | 33.....'..M...`.
  00000010: 00 00 00 24 00 01 00 00 00 00 00 00 00 00 00 00 | ...$............
  00000020: 00 00 00 00 00 00 FF 02 00 00 00 00 00 00 00 00 | ................
  00000030: 00 00 00 00 00 16 3A 00 05 02 00 00 01 00 8F 00 | ......:.........

Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
---

Version 2:
  * Only dumping the first invalid packet as we don't actually
    know if the other ones correct or not.  [Kevin]

Also, we may want to have this backported to 3.3 and maybe 3.2,
as I suspect those will be branches where will will actually have
to debug offloading rejection issues.  We already have a few.
For example: https://github.com/openvswitch/ovs-issues/issues/321
for which I actually prepared this patch in the first place.

This is a debug-only change that should not have any impact
on end users.

Thoughts?

 lib/netdev-dpdk.c | 31 +++++++++++++++++++++++++++++++
 1 file changed, 31 insertions(+)

Comments

Eelco Chaudron March 8, 2024, 10:14 a.m. UTC | #1
On 7 Mar 2024, at 20:39, Ilya Maximets wrote:

> It's hard to debug situations where driver rejects packets for some
> reason.  Dumping out the mbuf should help with that.
>
> Sample output looks like this:
>
>   |netdev_dpdk(pmd-c03/id:8)|DBG|ovs-p1: First invalid packet:
>   dump mbuf at 0x1180bce140, iova=0x2cb7ce400, buf_len=2176
>     pkt_len=64, ol_flags=0x2, nb_segs=1, port=65535, ptype=0
>     segment at 0x1180bce140, data=0x1180bce580, len=90, off=384, refcnt=1
>     Dump data at [0x1180bce580], len=64
>   00000000: 33 33 00 00 00 16 AA 27 91 F9 4D 96 86 DD 60 00 | 33.....'..M...`.
>   00000010: 00 00 00 24 00 01 00 00 00 00 00 00 00 00 00 00 | ...$............
>   00000020: 00 00 00 00 00 00 FF 02 00 00 00 00 00 00 00 00 | ................
>   00000030: 00 00 00 00 00 16 3A 00 05 02 00 00 01 00 8F 00 | ......:.........
>
> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>

This debug addition looks good to me.

Acked-by: Eelco Chaudron <echaudro@redhat.com>
Kevin Traynor March 8, 2024, 11:21 a.m. UTC | #2
On 07/03/2024 19:39, Ilya Maximets wrote:
> It's hard to debug situations where driver rejects packets for some
> reason.  Dumping out the mbuf should help with that.
> 
> Sample output looks like this:
> 
>   |netdev_dpdk(pmd-c03/id:8)|DBG|ovs-p1: First invalid packet:
>   dump mbuf at 0x1180bce140, iova=0x2cb7ce400, buf_len=2176
>     pkt_len=64, ol_flags=0x2, nb_segs=1, port=65535, ptype=0
>     segment at 0x1180bce140, data=0x1180bce580, len=90, off=384, refcnt=1
>     Dump data at [0x1180bce580], len=64
>   00000000: 33 33 00 00 00 16 AA 27 91 F9 4D 96 86 DD 60 00 | 33.....'..M...`.
>   00000010: 00 00 00 24 00 01 00 00 00 00 00 00 00 00 00 00 | ...$............
>   00000020: 00 00 00 00 00 00 FF 02 00 00 00 00 00 00 00 00 | ................
>   00000030: 00 00 00 00 00 16 3A 00 05 02 00 00 01 00 8F 00 | ......:.........
> 
> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>

Acked-by: Kevin Traynor <ktraynor@redhat.com>

> ---
> 
> Version 2:
>   * Only dumping the first invalid packet as we don't actually
>     know if the other ones correct or not.  [Kevin]
> 
> Also, we may want to have this backported to 3.3 and maybe 3.2,
> as I suspect those will be branches where will will actually have
> to debug offloading rejection issues.  We already have a few.
> For example: https://github.com/openvswitch/ovs-issues/issues/321
> for which I actually prepared this patch in the first place.
> 
> This is a debug-only change that should not have any impact
> on end users.
> 
> Thoughts?
> 

I think it's reasonable to backport it. It is non-intrusive for the
normal case and will be helpful for debug.

>  lib/netdev-dpdk.c | 31 +++++++++++++++++++++++++++++++
>  1 file changed, 31 insertions(+)
> 
> diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
> index 45f61930d..9444c53b1 100644
> --- a/lib/netdev-dpdk.c
> +++ b/lib/netdev-dpdk.c
> @@ -2664,6 +2664,35 @@ netdev_dpdk_prep_hwol_batch(struct netdev_dpdk *dev, struct rte_mbuf **pkts,
>      return cnt;
>  }
>  
> +static void
> +netdev_dpdk_mbuf_dump(const char *prefix, const char *message,
> +                      const struct rte_mbuf *mbuf)
> +{
> +    static struct vlog_rate_limit dump_rl = VLOG_RATE_LIMIT_INIT(5, 5);
> +    char *response = NULL;
> +    FILE *stream;
> +    size_t size;
> +
> +    if (VLOG_DROP_DBG(&dump_rl)) {
> +        return;
> +    }
> +
> +    stream = open_memstream(&response, &size);
> +    if (!stream) {
> +        VLOG_ERR("Unable to open memstream for mbuf dump: %s.",
> +                 ovs_strerror(errno));
> +        return;
> +    }
> +
> +    rte_pktmbuf_dump(stream, mbuf, rte_pktmbuf_pkt_len(mbuf));
> +
> +    fclose(stream);
> +
> +    VLOG_DBG(prefix ? "%s: %s:\n%s" : "%s%s:\n%s",
> +             prefix ? prefix : "", message, response);
> +    free(response);
> +}
> +
>  /* Tries to transmit 'pkts' to txq 'qid' of device 'dev'.  Takes ownership of
>   * 'pkts', even in case of failure.
>   *
> @@ -2680,6 +2709,8 @@ netdev_dpdk_eth_tx_burst(struct netdev_dpdk *dev, int qid,
>          VLOG_WARN_RL(&rl, "%s: Output batch contains invalid packets. "
>                       "Only %u/%u are valid: %s", netdev_get_name(&dev->up),
>                       nb_tx_prep, cnt, rte_strerror(rte_errno));
> +        netdev_dpdk_mbuf_dump(netdev_get_name(&dev->up),
> +                              "First invalid packet", pkts[nb_tx_prep]);
>      }
>  
>      while (nb_tx != nb_tx_prep) {
Ilya Maximets March 8, 2024, 9:25 p.m. UTC | #3
On 3/8/24 12:21, Kevin Traynor wrote:
> On 07/03/2024 19:39, Ilya Maximets wrote:
>> It's hard to debug situations where driver rejects packets for some
>> reason.  Dumping out the mbuf should help with that.
>>
>> Sample output looks like this:
>>
>>   |netdev_dpdk(pmd-c03/id:8)|DBG|ovs-p1: First invalid packet:
>>   dump mbuf at 0x1180bce140, iova=0x2cb7ce400, buf_len=2176
>>     pkt_len=64, ol_flags=0x2, nb_segs=1, port=65535, ptype=0
>>     segment at 0x1180bce140, data=0x1180bce580, len=90, off=384, refcnt=1
>>     Dump data at [0x1180bce580], len=64
>>   00000000: 33 33 00 00 00 16 AA 27 91 F9 4D 96 86 DD 60 00 | 33.....'..M...`.
>>   00000010: 00 00 00 24 00 01 00 00 00 00 00 00 00 00 00 00 | ...$............
>>   00000020: 00 00 00 00 00 00 FF 02 00 00 00 00 00 00 00 00 | ................
>>   00000030: 00 00 00 00 00 16 3A 00 05 02 00 00 01 00 8F 00 | ......:.........
>>
>> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
> 
> Acked-by: Kevin Traynor <ktraynor@redhat.com>
> 
>> ---
>>
>> Version 2:
>>   * Only dumping the first invalid packet as we don't actually
>>     know if the other ones correct or not.  [Kevin]
>>
>> Also, we may want to have this backported to 3.3 and maybe 3.2,
>> as I suspect those will be branches where will will actually have
>> to debug offloading rejection issues.  We already have a few.
>> For example: https://github.com/openvswitch/ovs-issues/issues/321
>> for which I actually prepared this patch in the first place.
>>
>> This is a debug-only change that should not have any impact
>> on end users.
>>
>> Thoughts?
>>
> 
> I think it's reasonable to backport it. It is non-intrusive for the
> normal case and will be helpful for debug.

Thanks, Kevin and Eelco!

Applied and backported down to 3.2.

Best regards, Ilya Maximets.
diff mbox series

Patch

diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
index 45f61930d..9444c53b1 100644
--- a/lib/netdev-dpdk.c
+++ b/lib/netdev-dpdk.c
@@ -2664,6 +2664,35 @@  netdev_dpdk_prep_hwol_batch(struct netdev_dpdk *dev, struct rte_mbuf **pkts,
     return cnt;
 }
 
+static void
+netdev_dpdk_mbuf_dump(const char *prefix, const char *message,
+                      const struct rte_mbuf *mbuf)
+{
+    static struct vlog_rate_limit dump_rl = VLOG_RATE_LIMIT_INIT(5, 5);
+    char *response = NULL;
+    FILE *stream;
+    size_t size;
+
+    if (VLOG_DROP_DBG(&dump_rl)) {
+        return;
+    }
+
+    stream = open_memstream(&response, &size);
+    if (!stream) {
+        VLOG_ERR("Unable to open memstream for mbuf dump: %s.",
+                 ovs_strerror(errno));
+        return;
+    }
+
+    rte_pktmbuf_dump(stream, mbuf, rte_pktmbuf_pkt_len(mbuf));
+
+    fclose(stream);
+
+    VLOG_DBG(prefix ? "%s: %s:\n%s" : "%s%s:\n%s",
+             prefix ? prefix : "", message, response);
+    free(response);
+}
+
 /* Tries to transmit 'pkts' to txq 'qid' of device 'dev'.  Takes ownership of
  * 'pkts', even in case of failure.
  *
@@ -2680,6 +2709,8 @@  netdev_dpdk_eth_tx_burst(struct netdev_dpdk *dev, int qid,
         VLOG_WARN_RL(&rl, "%s: Output batch contains invalid packets. "
                      "Only %u/%u are valid: %s", netdev_get_name(&dev->up),
                      nb_tx_prep, cnt, rte_strerror(rte_errno));
+        netdev_dpdk_mbuf_dump(netdev_get_name(&dev->up),
+                              "First invalid packet", pkts[nb_tx_prep]);
     }
 
     while (nb_tx != nb_tx_prep) {