Message ID | 1319706145-26599-1-git-send-email-wudxw@linux.vnet.ibm.com |
---|---|
State | New |
Headers | show |
On Thu, Oct 27, 2011 at 10:02 AM, Mark Wu <wudxw@linux.vnet.ibm.com> wrote: > Now queue flushing and sent callback could be invoked even on delivery > failure. We add a checking of receiver's return value to avoid this > case. > > Signed-off-by: Mark Wu <wudxw@linux.vnet.ibm.com> > --- > net/queue.c | 12 +++++++----- > 1 files changed, 7 insertions(+), 5 deletions(-) What problem are you trying to fix? > @@ -251,7 +253,7 @@ void qemu_net_queue_flush(NetQueue *queue) > break; > } > > - if (packet->sent_cb) { > + if (ret > 0 && packet->sent_cb) { > packet->sent_cb(packet->sender, ret); This looks wrong. ret is passed as an argument to the callback. You are skipping the callback on error and not giving it a chance to see negative ret. Looking at virtio_net_tx_complete() this causes a virtqueue element leak. Stefan
On 10/28/2011 05:13 PM, Stefan Hajnoczi wrote: > On Thu, Oct 27, 2011 at 10:02 AM, Mark Wu<wudxw@linux.vnet.ibm.com> wrote: >> Now queue flushing and sent callback could be invoked even on delivery >> failure. We add a checking of receiver's return value to avoid this >> case. >> >> Signed-off-by: Mark Wu<wudxw@linux.vnet.ibm.com> >> --- >> net/queue.c | 12 +++++++----- >> 1 files changed, 7 insertions(+), 5 deletions(-) > What problem are you trying to fix? > >> @@ -251,7 +253,7 @@ void qemu_net_queue_flush(NetQueue *queue) >> break; >> } >> >> - if (packet->sent_cb) { >> + if (ret> 0&& packet->sent_cb) { >> packet->sent_cb(packet->sender, ret); > This looks wrong. ret is passed as an argument to the callback. You > are skipping the callback on error and not giving it a chance to see > negative ret. > > Looking at virtio_net_tx_complete() this causes a virtqueue element leak. Thanks for your review! Yes, that's a problem. I thought only tap call queue send function with a callback (tap_send_completed) and confirmed that no memory leak in the case of tap. I agree that it will cause a descriptor leak, but actually virtio_net_tx_complete doesn't check 'ret'. It just pushes the elem to the virtqueue with 'async_tx.len' and flushes the tx queue. I think it assumes the callback is only called on success. Otherwise, it doesn't make sense for me. My point is that flush shouldn't happen on a deliver failure. Probably it will cause more failures. tap_send_completed assumes it's called on successfully deliver a packet too because it re-enables polling of tap fd. That's why I add a checking of 'ret'. I am not sure if the original code really needs a fix because it will not cause any visible problems. > Stefan >
On Fri, Oct 28, 2011 at 11:02 AM, Mark Wu <wudxw@linux.vnet.ibm.com> wrote: > On 10/28/2011 05:13 PM, Stefan Hajnoczi wrote: >> >> On Thu, Oct 27, 2011 at 10:02 AM, Mark Wu<wudxw@linux.vnet.ibm.com> >> wrote: >>> >>> Now queue flushing and sent callback could be invoked even on delivery >>> failure. We add a checking of receiver's return value to avoid this >>> case. >>> >>> Signed-off-by: Mark Wu<wudxw@linux.vnet.ibm.com> >>> --- >>> net/queue.c | 12 +++++++----- >>> 1 files changed, 7 insertions(+), 5 deletions(-) >> >> What problem are you trying to fix? >> >>> @@ -251,7 +253,7 @@ void qemu_net_queue_flush(NetQueue *queue) >>> break; >>> } >>> >>> - if (packet->sent_cb) { >>> + if (ret> 0&& packet->sent_cb) { >>> packet->sent_cb(packet->sender, ret); >> >> This looks wrong. ret is passed as an argument to the callback. You >> are skipping the callback on error and not giving it a chance to see >> negative ret. >> >> Looking at virtio_net_tx_complete() this causes a virtqueue element leak. > > Thanks for your review! > Yes, that's a problem. I thought only tap call queue send function with a > callback (tap_send_completed) and confirmed that no memory leak in the case > of tap. I agree that it will cause a > descriptor leak, but actually virtio_net_tx_complete doesn't check 'ret'. It > just pushes the elem to the virtqueue with 'async_tx.len' and flushes the tx > queue. I think it assumes the callback is only called on success. Otherwise, > it doesn't make sense for me. My point is that flush shouldn't happen on a > deliver failure. Probably it will cause more failures. tap_send_completed > assumes it's called on successfully deliver a packet too because it > re-enables polling of tap fd. That's why I add a checking of 'ret'. > > I am not sure if the original code really needs a fix because it will not > cause any visible problems. In that case I would leave it. I agree that if we just got an error it is likely that sending more will also cause an error. But we don't know when in the future things will succeed. Also Ethernet is lossly and the guest networking stack is designed to handle failures and dropped packets. Stefan
diff --git a/net/queue.c b/net/queue.c index 1ab5247..c9a027c 100644 --- a/net/queue.c +++ b/net/queue.c @@ -190,8 +190,9 @@ ssize_t qemu_net_queue_send(NetQueue *queue, qemu_net_queue_append(queue, sender, flags, data, size, sent_cb); return 0; } - - qemu_net_queue_flush(queue); + if (ret > 0) { + qemu_net_queue_flush(queue); + } return ret; } @@ -214,8 +215,9 @@ ssize_t qemu_net_queue_send_iov(NetQueue *queue, qemu_net_queue_append_iov(queue, sender, flags, iov, iovcnt, sent_cb); return 0; } - - qemu_net_queue_flush(queue); + if (ret > 0) { + qemu_net_queue_flush(queue); + } return ret; } @@ -251,7 +253,7 @@ void qemu_net_queue_flush(NetQueue *queue) break; } - if (packet->sent_cb) { + if (ret > 0 && packet->sent_cb) { packet->sent_cb(packet->sender, ret); }
Now queue flushing and sent callback could be invoked even on delivery failure. We add a checking of receiver's return value to avoid this case. Signed-off-by: Mark Wu <wudxw@linux.vnet.ibm.com> --- net/queue.c | 12 +++++++----- 1 files changed, 7 insertions(+), 5 deletions(-)