Message ID | 20240523075049.1936263-1-wushaohua@chinatelecom.cn |
---|---|
State | Rejected |
Delegated to: | Ilya Maximets |
Headers | show |
Series | [ovs-dev] dpif-netdev: Resolved flow table reference issue. | expand |
Context | Check | Description |
---|---|---|
ovsrobot/apply-robot | success | apply and check: success |
ovsrobot/github-robot-_Build_and_Test | fail | github build: failed |
ovsrobot/intel-ovs-compilation | success | test: success |
On 5/23/24 09:50, wushaohua@chinatelecom.cn wrote: > From: Shaohua Wu <wushaohua@chinatelecom.cn> > > description: > The statistics for the cached DP flow table are updated in > packet_batch_per_flow_execute. There might be a timing gap > where the dp flow in the batch reaches its aging time and > is prematurely aged. If actions on the dp flows are > executed at this time,and the memory for the DP flow > has been released, it can lead to a crash. > > 0 raise () from /lib64/libc.so.6 > 1 abort () from /lib64/libc.so.6 > 2 odp_execute_actions at lib/odp-execute.c:1166 > 3 dp_netdev_execute_actions at lib/dpif-netdev.c:11339 > 4 packet_batch_per_flow_execute at lib/dpif-netdev.c:8537 > 5 dp_netdev_input__ at lib/dpif-netdev.c:10722 > 6 dp_netdev_input at lib/dpif-netdev.c:10731 > 7 dp_netdev_process_rxq_port at lib/dpif-netdev.c:6332 > 8 dpif_netdev_run at lib/dpif-netdev.c:7343 > 9 dpif_run at lib/dpif.c:479 > 10 type_run at ofproto/ofproto-dpif.c:370 > 11 ofproto_type_run at ofproto/ofproto.c:1789 > 12 bridge_run__ at vswitchd/bridge.c:3245 > 13 bridge_run at vswitchd/bridge.c:3310 > 14 main at vswitchd/ovs-vswitchd.c:127 > (gdb) f 4 > (gdb) p flow->ref_cnt > $4 = {count = 0} > (gdb) p flow->dead > $5 = true Hmm. Thanks for the patch! Though the flow is supposed to be protected by RCU and PMD threads are not supposed to enter quiescent state in the middle of packet processing, so the flow should not be freed until the processing is over. The flow will be dead in this case with a zero counter, but it is not freed, i.e. should be accessible. Do you know what exactly failed in odp_execute_actions ? Your version of the code seems to be very different from the upstream main branch, so line numbers do not help much. Best regards, Ilya Maximets.
diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c index c7f9e1490..8f96efde0 100644 --- a/lib/dpif-netdev.c +++ b/lib/dpif-netdev.c @@ -8255,12 +8255,13 @@ static inline void packet_batch_per_flow_init(struct packet_batch_per_flow *batch, struct dp_netdev_flow *flow) { - flow->batch = batch; - - batch->flow = flow; - dp_packet_batch_init(&batch->array); - batch->byte_count = 0; - batch->tcp_flags = 0; + if (dp_netdev_flow_ref(flow)) { + flow->batch = batch; + batch->flow = flow; + dp_packet_batch_init(&batch->array); + batch->byte_count = 0; + batch->tcp_flags = 0; + } } static inline void @@ -8322,9 +8323,11 @@ packet_enqueue_to_flow_map(struct dp_packet *packet, size_t index) { struct dp_packet_flow_map *map = &flow_map[index]; - map->flow = flow; - map->packet = packet; - map->tcp_flags = tcp_flags; + if (dp_netdev_flow_ref(flow)) { + map->flow = flow; + map->packet = packet; + map->tcp_flags = tcp_flags; + } } /* SMC lookup function for a batch of packets. @@ -8888,6 +8891,7 @@ dp_netdev_input__(struct dp_netdev_pmd_thread *pmd, } dp_netdev_queue_batches(map->packet, map->flow, map->tcp_flags, batches, &n_batches); + dp_netdev_flow_unref(map->flow); } /* All the flow batches need to be reset before any call to @@ -8905,6 +8909,7 @@ dp_netdev_input__(struct dp_netdev_pmd_thread *pmd, for (i = 0; i < n_batches; i++) { packet_batch_per_flow_execute(&batches[i], pmd); + dp_netdev_flow_unref(batches[i].flow); } }