Message ID | 20240307193942.3949516-1-i.maximets@ovn.org |
---|---|
State | Accepted |
Commit | 2c4ffd2f8a23bed1f2141e81abcc68ccba129e3f |
Headers | show |
Series | [ovs-dev,v2] netdev-dpdk: Dump packets that fail Tx preparation. | expand |
Context | Check | Description |
---|---|---|
ovsrobot/apply-robot | success | apply and check: success |
ovsrobot/github-robot-_Build_and_Test | success | github build: passed |
ovsrobot/intel-ovs-compilation | success | test: success |
On 7 Mar 2024, at 20:39, Ilya Maximets wrote: > It's hard to debug situations where driver rejects packets for some > reason. Dumping out the mbuf should help with that. > > Sample output looks like this: > > |netdev_dpdk(pmd-c03/id:8)|DBG|ovs-p1: First invalid packet: > dump mbuf at 0x1180bce140, iova=0x2cb7ce400, buf_len=2176 > pkt_len=64, ol_flags=0x2, nb_segs=1, port=65535, ptype=0 > segment at 0x1180bce140, data=0x1180bce580, len=90, off=384, refcnt=1 > Dump data at [0x1180bce580], len=64 > 00000000: 33 33 00 00 00 16 AA 27 91 F9 4D 96 86 DD 60 00 | 33.....'..M...`. > 00000010: 00 00 00 24 00 01 00 00 00 00 00 00 00 00 00 00 | ...$............ > 00000020: 00 00 00 00 00 00 FF 02 00 00 00 00 00 00 00 00 | ................ > 00000030: 00 00 00 00 00 16 3A 00 05 02 00 00 01 00 8F 00 | ......:......... > > Signed-off-by: Ilya Maximets <i.maximets@ovn.org> This debug addition looks good to me. Acked-by: Eelco Chaudron <echaudro@redhat.com>
On 07/03/2024 19:39, Ilya Maximets wrote: > It's hard to debug situations where driver rejects packets for some > reason. Dumping out the mbuf should help with that. > > Sample output looks like this: > > |netdev_dpdk(pmd-c03/id:8)|DBG|ovs-p1: First invalid packet: > dump mbuf at 0x1180bce140, iova=0x2cb7ce400, buf_len=2176 > pkt_len=64, ol_flags=0x2, nb_segs=1, port=65535, ptype=0 > segment at 0x1180bce140, data=0x1180bce580, len=90, off=384, refcnt=1 > Dump data at [0x1180bce580], len=64 > 00000000: 33 33 00 00 00 16 AA 27 91 F9 4D 96 86 DD 60 00 | 33.....'..M...`. > 00000010: 00 00 00 24 00 01 00 00 00 00 00 00 00 00 00 00 | ...$............ > 00000020: 00 00 00 00 00 00 FF 02 00 00 00 00 00 00 00 00 | ................ > 00000030: 00 00 00 00 00 16 3A 00 05 02 00 00 01 00 8F 00 | ......:......... > > Signed-off-by: Ilya Maximets <i.maximets@ovn.org> Acked-by: Kevin Traynor <ktraynor@redhat.com> > --- > > Version 2: > * Only dumping the first invalid packet as we don't actually > know if the other ones correct or not. [Kevin] > > Also, we may want to have this backported to 3.3 and maybe 3.2, > as I suspect those will be branches where will will actually have > to debug offloading rejection issues. We already have a few. > For example: https://github.com/openvswitch/ovs-issues/issues/321 > for which I actually prepared this patch in the first place. > > This is a debug-only change that should not have any impact > on end users. > > Thoughts? > I think it's reasonable to backport it. It is non-intrusive for the normal case and will be helpful for debug. > lib/netdev-dpdk.c | 31 +++++++++++++++++++++++++++++++ > 1 file changed, 31 insertions(+) > > diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c > index 45f61930d..9444c53b1 100644 > --- a/lib/netdev-dpdk.c > +++ b/lib/netdev-dpdk.c > @@ -2664,6 +2664,35 @@ netdev_dpdk_prep_hwol_batch(struct netdev_dpdk *dev, struct rte_mbuf **pkts, > return cnt; > } > > +static void > +netdev_dpdk_mbuf_dump(const char *prefix, const char *message, > + const struct rte_mbuf *mbuf) > +{ > + static struct vlog_rate_limit dump_rl = VLOG_RATE_LIMIT_INIT(5, 5); > + char *response = NULL; > + FILE *stream; > + size_t size; > + > + if (VLOG_DROP_DBG(&dump_rl)) { > + return; > + } > + > + stream = open_memstream(&response, &size); > + if (!stream) { > + VLOG_ERR("Unable to open memstream for mbuf dump: %s.", > + ovs_strerror(errno)); > + return; > + } > + > + rte_pktmbuf_dump(stream, mbuf, rte_pktmbuf_pkt_len(mbuf)); > + > + fclose(stream); > + > + VLOG_DBG(prefix ? "%s: %s:\n%s" : "%s%s:\n%s", > + prefix ? prefix : "", message, response); > + free(response); > +} > + > /* Tries to transmit 'pkts' to txq 'qid' of device 'dev'. Takes ownership of > * 'pkts', even in case of failure. > * > @@ -2680,6 +2709,8 @@ netdev_dpdk_eth_tx_burst(struct netdev_dpdk *dev, int qid, > VLOG_WARN_RL(&rl, "%s: Output batch contains invalid packets. " > "Only %u/%u are valid: %s", netdev_get_name(&dev->up), > nb_tx_prep, cnt, rte_strerror(rte_errno)); > + netdev_dpdk_mbuf_dump(netdev_get_name(&dev->up), > + "First invalid packet", pkts[nb_tx_prep]); > } > > while (nb_tx != nb_tx_prep) {
On 3/8/24 12:21, Kevin Traynor wrote: > On 07/03/2024 19:39, Ilya Maximets wrote: >> It's hard to debug situations where driver rejects packets for some >> reason. Dumping out the mbuf should help with that. >> >> Sample output looks like this: >> >> |netdev_dpdk(pmd-c03/id:8)|DBG|ovs-p1: First invalid packet: >> dump mbuf at 0x1180bce140, iova=0x2cb7ce400, buf_len=2176 >> pkt_len=64, ol_flags=0x2, nb_segs=1, port=65535, ptype=0 >> segment at 0x1180bce140, data=0x1180bce580, len=90, off=384, refcnt=1 >> Dump data at [0x1180bce580], len=64 >> 00000000: 33 33 00 00 00 16 AA 27 91 F9 4D 96 86 DD 60 00 | 33.....'..M...`. >> 00000010: 00 00 00 24 00 01 00 00 00 00 00 00 00 00 00 00 | ...$............ >> 00000020: 00 00 00 00 00 00 FF 02 00 00 00 00 00 00 00 00 | ................ >> 00000030: 00 00 00 00 00 16 3A 00 05 02 00 00 01 00 8F 00 | ......:......... >> >> Signed-off-by: Ilya Maximets <i.maximets@ovn.org> > > Acked-by: Kevin Traynor <ktraynor@redhat.com> > >> --- >> >> Version 2: >> * Only dumping the first invalid packet as we don't actually >> know if the other ones correct or not. [Kevin] >> >> Also, we may want to have this backported to 3.3 and maybe 3.2, >> as I suspect those will be branches where will will actually have >> to debug offloading rejection issues. We already have a few. >> For example: https://github.com/openvswitch/ovs-issues/issues/321 >> for which I actually prepared this patch in the first place. >> >> This is a debug-only change that should not have any impact >> on end users. >> >> Thoughts? >> > > I think it's reasonable to backport it. It is non-intrusive for the > normal case and will be helpful for debug. Thanks, Kevin and Eelco! Applied and backported down to 3.2. Best regards, Ilya Maximets.
diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c index 45f61930d..9444c53b1 100644 --- a/lib/netdev-dpdk.c +++ b/lib/netdev-dpdk.c @@ -2664,6 +2664,35 @@ netdev_dpdk_prep_hwol_batch(struct netdev_dpdk *dev, struct rte_mbuf **pkts, return cnt; } +static void +netdev_dpdk_mbuf_dump(const char *prefix, const char *message, + const struct rte_mbuf *mbuf) +{ + static struct vlog_rate_limit dump_rl = VLOG_RATE_LIMIT_INIT(5, 5); + char *response = NULL; + FILE *stream; + size_t size; + + if (VLOG_DROP_DBG(&dump_rl)) { + return; + } + + stream = open_memstream(&response, &size); + if (!stream) { + VLOG_ERR("Unable to open memstream for mbuf dump: %s.", + ovs_strerror(errno)); + return; + } + + rte_pktmbuf_dump(stream, mbuf, rte_pktmbuf_pkt_len(mbuf)); + + fclose(stream); + + VLOG_DBG(prefix ? "%s: %s:\n%s" : "%s%s:\n%s", + prefix ? prefix : "", message, response); + free(response); +} + /* Tries to transmit 'pkts' to txq 'qid' of device 'dev'. Takes ownership of * 'pkts', even in case of failure. * @@ -2680,6 +2709,8 @@ netdev_dpdk_eth_tx_burst(struct netdev_dpdk *dev, int qid, VLOG_WARN_RL(&rl, "%s: Output batch contains invalid packets. " "Only %u/%u are valid: %s", netdev_get_name(&dev->up), nb_tx_prep, cnt, rte_strerror(rte_errno)); + netdev_dpdk_mbuf_dump(netdev_get_name(&dev->up), + "First invalid packet", pkts[nb_tx_prep]); } while (nb_tx != nb_tx_prep) {
It's hard to debug situations where driver rejects packets for some reason. Dumping out the mbuf should help with that. Sample output looks like this: |netdev_dpdk(pmd-c03/id:8)|DBG|ovs-p1: First invalid packet: dump mbuf at 0x1180bce140, iova=0x2cb7ce400, buf_len=2176 pkt_len=64, ol_flags=0x2, nb_segs=1, port=65535, ptype=0 segment at 0x1180bce140, data=0x1180bce580, len=90, off=384, refcnt=1 Dump data at [0x1180bce580], len=64 00000000: 33 33 00 00 00 16 AA 27 91 F9 4D 96 86 DD 60 00 | 33.....'..M...`. 00000010: 00 00 00 24 00 01 00 00 00 00 00 00 00 00 00 00 | ...$............ 00000020: 00 00 00 00 00 00 FF 02 00 00 00 00 00 00 00 00 | ................ 00000030: 00 00 00 00 00 16 3A 00 05 02 00 00 01 00 8F 00 | ......:......... Signed-off-by: Ilya Maximets <i.maximets@ovn.org> --- Version 2: * Only dumping the first invalid packet as we don't actually know if the other ones correct or not. [Kevin] Also, we may want to have this backported to 3.3 and maybe 3.2, as I suspect those will be branches where will will actually have to debug offloading rejection issues. We already have a few. For example: https://github.com/openvswitch/ovs-issues/issues/321 for which I actually prepared this patch in the first place. This is a debug-only change that should not have any impact on end users. Thoughts? lib/netdev-dpdk.c | 31 +++++++++++++++++++++++++++++++ 1 file changed, 31 insertions(+)