diff mbox

net/hub: remove can_receive handler

Message ID 1366284715-10107-1-git-send-email-s.fedorov@samsung.com
State New
Headers show

Commit Message

Sergey Fedorov April 18, 2013, 11:31 a.m. UTC
Network hub should always receive incoming packets. Then forward them to
the appropriate port queue and let the qemu_send_packet() do the right
things. If the destination queue cannot receive the packet it will be
appended to the queue. When the receiver call
qemu_flush_queued_packets() later the queue will be really flushed and
no packets will be stalled in the sender network queue.

Signed-off-by: Sergey Fedorov <s.fedorov@samsung.com>
---
 net/hub.c |   20 --------------------
 1 file changed, 20 deletions(-)

Comments

Stefan Hajnoczi April 22, 2013, 11:47 a.m. UTC | #1
On Thu, Apr 18, 2013 at 03:31:55PM +0400, Sergey Fedorov wrote:
> Network hub should always receive incoming packets. Then forward them to
> the appropriate port queue and let the qemu_send_packet() do the right
> things. If the destination queue cannot receive the packet it will be
> appended to the queue. When the receiver call
> qemu_flush_queued_packets() later the queue will be really flushed and
> no packets will be stalled in the sender network queue.
> 
> Signed-off-by: Sergey Fedorov <s.fedorov@samsung.com>
> ---
>  net/hub.c |   20 --------------------
>  1 file changed, 20 deletions(-)

What is the point of this change?  There is no semantic difference for
well-behaved net clients.

Does it fix a bug, if so, please include details?

Stefan
Sergey Fedorov April 22, 2013, 12:26 p.m. UTC | #2
On 04/22/2013 03:47 PM, Stefan Hajnoczi wrote:
> On Thu, Apr 18, 2013 at 03:31:55PM +0400, Sergey Fedorov wrote:
>> Network hub should always receive incoming packets. Then forward them to
>> the appropriate port queue and let the qemu_send_packet() do the right
>> things. If the destination queue cannot receive the packet it will be
>> appended to the queue. When the receiver call
>> qemu_flush_queued_packets() later the queue will be really flushed and
>> no packets will be stalled in the sender network queue.
>>
>> Signed-off-by: Sergey Fedorov <s.fedorov@samsung.com>
>> ---
>>   net/hub.c |   20 --------------------
>>   1 file changed, 20 deletions(-)
> What is the point of this change?  There is no semantic difference for
> well-behaved net clients.
>
> Does it fix a bug, if so, please include details?
>
> Stefan
>
>

Yes, this fixes a bug. There were packet stalls when using user-mode 
networking with USB network device. slirp_output() calls 
qemu_send_packet() which eventually calls qemu_net_queue_send(). 
qemu_net_queue_send() calls qemu_can_send_packet(), which calls 
can_receive() callback of network hub. Then net_hub_port_can_receive() 
also calls qemu_can_send_packet() for each port except packet source port.

Sometimes USB network device is not able to receive packet and 
qemu_can_send_packet() returns false. In my case there is no more ports 
and net_hub_port_can_receive() returns false. So qemu_net_queue_send() 
call qemu_net_queue_append() instead of qemu_net_queue_deliver(). 
qemu_net_queue_append() appends the packet to the receiving port of the 
network hub which is not flushed when USB netork device calls 
qemu_flush_queued_packets(). It is flushed only when slirp resend the 
packet by timeout.

Actually there is no need in net_hub_port_can_receive() as the network 
hub can always receive packets and pass it to its port network clients 
with qemu_send_packet(). And if the destination port network client 
cannot receive the packet it will be queued in the *destination* port 
network client queue. Queued packets from that queue will be delivered 
as soon as the network client call qemu_flush_queued_packets().
Stefan Hajnoczi April 22, 2013, 2:57 p.m. UTC | #3
On Mon, Apr 22, 2013 at 04:26:16PM +0400, Fedorov Sergey wrote:
> On 04/22/2013 03:47 PM, Stefan Hajnoczi wrote:
> >On Thu, Apr 18, 2013 at 03:31:55PM +0400, Sergey Fedorov wrote:
> >>Network hub should always receive incoming packets. Then forward them to
> >>the appropriate port queue and let the qemu_send_packet() do the right
> >>things. If the destination queue cannot receive the packet it will be
> >>appended to the queue. When the receiver call
> >>qemu_flush_queued_packets() later the queue will be really flushed and
> >>no packets will be stalled in the sender network queue.
> >>
> >>Signed-off-by: Sergey Fedorov <s.fedorov@samsung.com>
> >>---
> >>  net/hub.c |   20 --------------------
> >>  1 file changed, 20 deletions(-)
> >What is the point of this change?  There is no semantic difference for
> >well-behaved net clients.
> >
> >Does it fix a bug, if so, please include details?
> >
> >Stefan
> >
> >
> 
> Yes, this fixes a bug. There were packet stalls when using user-mode
> networking with USB network device. slirp_output() calls
> qemu_send_packet() which eventually calls qemu_net_queue_send().
> qemu_net_queue_send() calls qemu_can_send_packet(), which calls
> can_receive() callback of network hub. Then
> net_hub_port_can_receive() also calls qemu_can_send_packet() for
> each port except packet source port.
> 
> Sometimes USB network device is not able to receive packet and
> qemu_can_send_packet() returns false. In my case there is no more
> ports and net_hub_port_can_receive() returns false. So
> qemu_net_queue_send() call qemu_net_queue_append() instead of
> qemu_net_queue_deliver(). qemu_net_queue_append() appends the packet
> to the receiving port of the network hub which is not flushed when
> USB netork device calls qemu_flush_queued_packets(). It is flushed
> only when slirp resend the packet by timeout.
> 
> Actually there is no need in net_hub_port_can_receive() as the
> network hub can always receive packets and pass it to its port
> network clients with qemu_send_packet(). And if the destination port
> network client cannot receive the packet it will be queued in the
> *destination* port network client queue. Queued packets from that
> queue will be delivered as soon as the network client call
> qemu_flush_queued_packets().

Please confirm the bug is still present in qemu.git/master.  It should
have been fixed by the following commit:

  commit 199ee608f0d08510b5c6c37f31a7fbff211d63c4
  Author: Luigi Rizzo <rizzo@iet.unipi.it>
  Date:   Tue Feb 5 17:53:31 2013 +0100

      net: fix qemu_flush_queued_packets() in presence of a hub

Stefan
Sergey Fedorov April 22, 2013, 3:27 p.m. UTC | #4
On 04/22/2013 06:57 PM, Stefan Hajnoczi wrote:
> On Mon, Apr 22, 2013 at 04:26:16PM +0400, Fedorov Sergey wrote:
>> On 04/22/2013 03:47 PM, Stefan Hajnoczi wrote:
>>> On Thu, Apr 18, 2013 at 03:31:55PM +0400, Sergey Fedorov wrote:
>>>> Network hub should always receive incoming packets. Then forward them to
>>>> the appropriate port queue and let the qemu_send_packet() do the right
>>>> things. If the destination queue cannot receive the packet it will be
>>>> appended to the queue. When the receiver call
>>>> qemu_flush_queued_packets() later the queue will be really flushed and
>>>> no packets will be stalled in the sender network queue.
>>>>
>>>> Signed-off-by: Sergey Fedorov <s.fedorov@samsung.com>
>>>> ---
>>>>   net/hub.c |   20 --------------------
>>>>   1 file changed, 20 deletions(-)
>>> What is the point of this change?  There is no semantic difference for
>>> well-behaved net clients.
>>>
>>> Does it fix a bug, if so, please include details?
>>>
>>> Stefan
>>>
>>>
>> Yes, this fixes a bug. There were packet stalls when using user-mode
>> networking with USB network device. slirp_output() calls
>> qemu_send_packet() which eventually calls qemu_net_queue_send().
>> qemu_net_queue_send() calls qemu_can_send_packet(), which calls
>> can_receive() callback of network hub. Then
>> net_hub_port_can_receive() also calls qemu_can_send_packet() for
>> each port except packet source port.
>>
>> Sometimes USB network device is not able to receive packet and
>> qemu_can_send_packet() returns false. In my case there is no more
>> ports and net_hub_port_can_receive() returns false. So
>> qemu_net_queue_send() call qemu_net_queue_append() instead of
>> qemu_net_queue_deliver(). qemu_net_queue_append() appends the packet
>> to the receiving port of the network hub which is not flushed when
>> USB netork device calls qemu_flush_queued_packets(). It is flushed
>> only when slirp resend the packet by timeout.
>>
>> Actually there is no need in net_hub_port_can_receive() as the
>> network hub can always receive packets and pass it to its port
>> network clients with qemu_send_packet(). And if the destination port
>> network client cannot receive the packet it will be queued in the
>> *destination* port network client queue. Queued packets from that
>> queue will be delivered as soon as the network client call
>> qemu_flush_queued_packets().
> Please confirm the bug is still present in qemu.git/master.  It should
> have been fixed by the following commit:
>
>    commit 199ee608f0d08510b5c6c37f31a7fbff211d63c4
>    Author: Luigi Rizzo <rizzo@iet.unipi.it>
>    Date:   Tue Feb 5 17:53:31 2013 +0100
>
>        net: fix qemu_flush_queued_packets() in presence of a hub
>
> Stefan
>
Yes, this commit fixes a bug but from other side. I think it's better to 
just let the qemu_send_packet() do the right things.

E.g. network hub has 3 ports. Suppose when iterating through port list 
in net_hub_port_can_receive() a packet is successfully delivered to the 
first port, and then is queued in the source port queue because the 
second port cannot receive packets. Later net_hub_flush() will flush the 
packet from the source port queue and it will be delivered in every 
port. But it had been already delivered to one of them. So it will be 
delivered twice to some ports. Moreover there is less chance to dequeue 
the packet if several clients can't receive periodically.

Anyway, actually there is no need in net_hub_port_can_receive() as the 
network hub can always receive  packets and pass it to its port network 
clients with qemu_send_packet(). I think it's more natural solution.

Sergey
Paolo Bonzini April 22, 2013, 4:09 p.m. UTC | #5
Il 22/04/2013 17:27, Fedorov Sergey ha scritto:
> 
> E.g. network hub has 3 ports. Suppose when iterating through port list
> in net_hub_port_can_receive() a packet is successfully delivered to the
> first port, and then is queued in the source port queue because the
> second port cannot receive packets. Later net_hub_flush() will flush the
> packet from the source port queue and it will be delivered in every
> port. But it had been already delivered to one of them. So it will be
> delivered twice to some ports. Moreover there is less chance to dequeue
> the packet if several clients can't receive periodically.

Perhaps it is indeed wrong to do this blocking in can_receive()...
You're right that a hubport can always receive, but the hub itself
should have a queue.  If one port cannot receive, the packet should be
appended to the hub's queue.  And net_hub_flush will just go through the
hub's queue.

> Anyway, actually there is no need in net_hub_port_can_receive() as the
> network hub can always receive  packets and pass it to its port network
> clients with qemu_send_packet(). I think it's more natural solution.

I think the point was to keep dumps in sync with what actually happened
on the other ports.  Otherwise a "-net dump" port will show the packet
immediately, even though it hasn't been delivered yet.

Paolo
Stefan Hajnoczi April 23, 2013, 6:58 a.m. UTC | #6
On Mon, Apr 22, 2013 at 07:27:21PM +0400, Fedorov Sergey wrote:
> On 04/22/2013 06:57 PM, Stefan Hajnoczi wrote:
> >On Mon, Apr 22, 2013 at 04:26:16PM +0400, Fedorov Sergey wrote:
> >>On 04/22/2013 03:47 PM, Stefan Hajnoczi wrote:
> >>>On Thu, Apr 18, 2013 at 03:31:55PM +0400, Sergey Fedorov wrote:
> >>>>Network hub should always receive incoming packets. Then forward them to
> >>>>the appropriate port queue and let the qemu_send_packet() do the right
> >>>>things. If the destination queue cannot receive the packet it will be
> >>>>appended to the queue. When the receiver call
> >>>>qemu_flush_queued_packets() later the queue will be really flushed and
> >>>>no packets will be stalled in the sender network queue.
> >>>>
> >>>>Signed-off-by: Sergey Fedorov <s.fedorov@samsung.com>
> >>>>---
> >>>>  net/hub.c |   20 --------------------
> >>>>  1 file changed, 20 deletions(-)
> >>>What is the point of this change?  There is no semantic difference for
> >>>well-behaved net clients.
> >>>
> >>>Does it fix a bug, if so, please include details?
> >>>
> >>>Stefan
> >>>
> >>>
> >>Yes, this fixes a bug. There were packet stalls when using user-mode
> >>networking with USB network device. slirp_output() calls
> >>qemu_send_packet() which eventually calls qemu_net_queue_send().
> >>qemu_net_queue_send() calls qemu_can_send_packet(), which calls
> >>can_receive() callback of network hub. Then
> >>net_hub_port_can_receive() also calls qemu_can_send_packet() for
> >>each port except packet source port.
> >>
> >>Sometimes USB network device is not able to receive packet and
> >>qemu_can_send_packet() returns false. In my case there is no more
> >>ports and net_hub_port_can_receive() returns false. So
> >>qemu_net_queue_send() call qemu_net_queue_append() instead of
> >>qemu_net_queue_deliver(). qemu_net_queue_append() appends the packet
> >>to the receiving port of the network hub which is not flushed when
> >>USB netork device calls qemu_flush_queued_packets(). It is flushed
> >>only when slirp resend the packet by timeout.
> >>
> >>Actually there is no need in net_hub_port_can_receive() as the
> >>network hub can always receive packets and pass it to its port
> >>network clients with qemu_send_packet(). And if the destination port
> >>network client cannot receive the packet it will be queued in the
> >>*destination* port network client queue. Queued packets from that
> >>queue will be delivered as soon as the network client call
> >>qemu_flush_queued_packets().
> >Please confirm the bug is still present in qemu.git/master.  It should
> >have been fixed by the following commit:
> >
> >   commit 199ee608f0d08510b5c6c37f31a7fbff211d63c4
> >   Author: Luigi Rizzo <rizzo@iet.unipi.it>
> >   Date:   Tue Feb 5 17:53:31 2013 +0100
> >
> >       net: fix qemu_flush_queued_packets() in presence of a hub
> >
> >Stefan
> >
> Yes, this commit fixes a bug but from other side. I think it's
> better to just let the qemu_send_packet() do the right things.
> 
> E.g. network hub has 3 ports. Suppose when iterating through port
> list in net_hub_port_can_receive() a packet is successfully
> delivered to the first port, and then is queued in the source port
> queue because the second port cannot receive packets. Later
> net_hub_flush() will flush the packet from the source port queue and
> it will be delivered in every port. But it had been already
> delivered to one of them. So it will be delivered twice to some
> ports. Moreover there is less chance to dequeue the packet if
> several clients can't receive periodically.

Did you mean "second port" instead of "source port" in the beginning?

I don't see a scenario where the packet is delivered to the same port
twice.  If one port can receive then the packet will be sent and other
ports will queue the packet.  Note that the source port will not queue
the packet.  When the blocked ports can receive again their send queues
will be flushed.

Paolo mentioned that we want to provide a consistent view of packets,
especially when -net dump is used.

Beyond that, we also want to avoid growing net queues indefinitely.  If
the hub does not implement .can_receive() then it relies on growing
queues (keeping packets buffered in memory).

Stefan
Sergey Fedorov April 23, 2013, 7:27 a.m. UTC | #7
On 04/22/2013 08:09 PM, Paolo Bonzini wrote:
> Il 22/04/2013 17:27, Fedorov Sergey ha scritto:
>> E.g. network hub has 3 ports. Suppose when iterating through port list
>> in net_hub_port_can_receive() a packet is successfully delivered to the
>> first port, and then is queued in the source port queue because the
>> second port cannot receive packets. Later net_hub_flush() will flush the
>> packet from the source port queue and it will be delivered in every
>> port. But it had been already delivered to one of them. So it will be
>> delivered twice to some ports. Moreover there is less chance to dequeue
>> the packet if several clients can't receive periodically.
> Perhaps it is indeed wrong to do this blocking in can_receive()...
> You're right that a hubport can always receive, but the hub itself
> should have a queue.  If one port cannot receive, the packet should be
> appended to the hub's queue.  And net_hub_flush will just go through the
> hub's queue.
>
>> Anyway, actually there is no need in net_hub_port_can_receive() as the
>> network hub can always receive  packets and pass it to its port network
>> clients with qemu_send_packet(). I think it's more natural solution.
> I think the point was to keep dumps in sync with what actually happened
> on the other ports.  Otherwise a "-net dump" port will show the packet
> immediately, even though it hasn't been delivered yet.
>
> Paolo
>
The user documentation says that `-net dump' dumps network traffic on 
VLAN. There is nothing said about interfaces connected to VALN nor 
synchronization with them. In my view, VLAN hub is a device that 
forwards an incoming packet to all other ports as soon as possible, just 
like real hardware hub. I think of `-net dump' as a special network 
interface that dumps every packet received by VLAN hub itself, not 
delivered to its clients. If we want to dump packets actually delivered 
to network interfaces we need to do it in per-interface manner. Maybe I 
misunderstand?
Sergey Fedorov April 23, 2013, 7:41 a.m. UTC | #8
On 04/23/2013 10:58 AM, Stefan Hajnoczi wrote:
> On Mon, Apr 22, 2013 at 07:27:21PM +0400, Fedorov Sergey wrote:
>> On 04/22/2013 06:57 PM, Stefan Hajnoczi wrote:
>>> On Mon, Apr 22, 2013 at 04:26:16PM +0400, Fedorov Sergey wrote:
>>>> On 04/22/2013 03:47 PM, Stefan Hajnoczi wrote:
>>>>> On Thu, Apr 18, 2013 at 03:31:55PM +0400, Sergey Fedorov wrote:
>>>>>> Network hub should always receive incoming packets. Then forward them to
>>>>>> the appropriate port queue and let the qemu_send_packet() do the right
>>>>>> things. If the destination queue cannot receive the packet it will be
>>>>>> appended to the queue. When the receiver call
>>>>>> qemu_flush_queued_packets() later the queue will be really flushed and
>>>>>> no packets will be stalled in the sender network queue.
>>>>>>
>>>>>> Signed-off-by: Sergey Fedorov <s.fedorov@samsung.com>
>>>>>> ---
>>>>>>   net/hub.c |   20 --------------------
>>>>>>   1 file changed, 20 deletions(-)
>>>>> What is the point of this change?  There is no semantic difference for
>>>>> well-behaved net clients.
>>>>>
>>>>> Does it fix a bug, if so, please include details?
>>>>>
>>>>> Stefan
>>>>>
>>>>>
>>>> Yes, this fixes a bug. There were packet stalls when using user-mode
>>>> networking with USB network device. slirp_output() calls
>>>> qemu_send_packet() which eventually calls qemu_net_queue_send().
>>>> qemu_net_queue_send() calls qemu_can_send_packet(), which calls
>>>> can_receive() callback of network hub. Then
>>>> net_hub_port_can_receive() also calls qemu_can_send_packet() for
>>>> each port except packet source port.
>>>>
>>>> Sometimes USB network device is not able to receive packet and
>>>> qemu_can_send_packet() returns false. In my case there is no more
>>>> ports and net_hub_port_can_receive() returns false. So
>>>> qemu_net_queue_send() call qemu_net_queue_append() instead of
>>>> qemu_net_queue_deliver(). qemu_net_queue_append() appends the packet
>>>> to the receiving port of the network hub which is not flushed when
>>>> USB netork device calls qemu_flush_queued_packets(). It is flushed
>>>> only when slirp resend the packet by timeout.
>>>>
>>>> Actually there is no need in net_hub_port_can_receive() as the
>>>> network hub can always receive packets and pass it to its port
>>>> network clients with qemu_send_packet(). And if the destination port
>>>> network client cannot receive the packet it will be queued in the
>>>> *destination* port network client queue. Queued packets from that
>>>> queue will be delivered as soon as the network client call
>>>> qemu_flush_queued_packets().
>>> Please confirm the bug is still present in qemu.git/master.  It should
>>> have been fixed by the following commit:
>>>
>>>    commit 199ee608f0d08510b5c6c37f31a7fbff211d63c4
>>>    Author: Luigi Rizzo <rizzo@iet.unipi.it>
>>>    Date:   Tue Feb 5 17:53:31 2013 +0100
>>>
>>>        net: fix qemu_flush_queued_packets() in presence of a hub
>>>
>>> Stefan
>>>
>> Yes, this commit fixes a bug but from other side. I think it's
>> better to just let the qemu_send_packet() do the right things.
>>
>> E.g. network hub has 3 ports. Suppose when iterating through port
>> list in net_hub_port_can_receive() a packet is successfully
>> delivered to the first port, and then is queued in the source port
>> queue because the second port cannot receive packets. Later
>> net_hub_flush() will flush the packet from the source port queue and
>> it will be delivered in every port. But it had been already
>> delivered to one of them. So it will be delivered twice to some
>> ports. Moreover there is less chance to dequeue the packet if
>> several clients can't receive periodically.
> Did you mean "second port" instead of "source port" in the beginning?
>
> I don't see a scenario where the packet is delivered to the same port
> twice.  If one port can receive then the packet will be sent and other
> ports will queue the packet.
I agree. I was wrong with that.
> Note that the source port will not queue the packet.
The source port will queue the packet if no port, except the source one, 
can receive.
> When the blocked ports can receive again their send queues
> will be flushed.
>
> Paolo mentioned that we want to provide a consistent view of packets,
> especially when -net dump is used.
>
> Beyond that, we also want to avoid growing net queues indefinitely.  If
> the hub does not implement .can_receive() then it relies on growing
> queues (keeping packets buffered in memory).
No, net_hub_receive() calls qemu_send_packet(). If the destination queue 
cannot receive the packet qemu_net_queue_append() will take care of 
queue->nq_maxlen.
>
> Stefan
>
Sergey Fedorov April 23, 2013, 9:32 a.m. UTC | #9
On 04/23/2013 10:58 AM, Stefan Hajnoczi wrote:
> On Mon, Apr 22, 2013 at 07:27:21PM +0400, Fedorov Sergey wrote:
>> On 04/22/2013 06:57 PM, Stefan Hajnoczi wrote:
>>> On Mon, Apr 22, 2013 at 04:26:16PM +0400, Fedorov Sergey wrote:
>>>> On 04/22/2013 03:47 PM, Stefan Hajnoczi wrote:
>>>>> On Thu, Apr 18, 2013 at 03:31:55PM +0400, Sergey Fedorov wrote:
>>>>>> Network hub should always receive incoming packets. Then forward them to
>>>>>> the appropriate port queue and let the qemu_send_packet() do the right
>>>>>> things. If the destination queue cannot receive the packet it will be
>>>>>> appended to the queue. When the receiver call
>>>>>> qemu_flush_queued_packets() later the queue will be really flushed and
>>>>>> no packets will be stalled in the sender network queue.
>>>>>>
>>>>>> Signed-off-by: Sergey Fedorov <s.fedorov@samsung.com>
>>>>>> ---
>>>>>>   net/hub.c |   20 --------------------
>>>>>>   1 file changed, 20 deletions(-)
>>>>> What is the point of this change?  There is no semantic difference for
>>>>> well-behaved net clients.
>>>>>
>>>>> Does it fix a bug, if so, please include details?
>>>>>
>>>>> Stefan
>>>>>
>>>>>
>>>> Yes, this fixes a bug. There were packet stalls when using user-mode
>>>> networking with USB network device. slirp_output() calls
>>>> qemu_send_packet() which eventually calls qemu_net_queue_send().
>>>> qemu_net_queue_send() calls qemu_can_send_packet(), which calls
>>>> can_receive() callback of network hub. Then
>>>> net_hub_port_can_receive() also calls qemu_can_send_packet() for
>>>> each port except packet source port.
>>>>
>>>> Sometimes USB network device is not able to receive packet and
>>>> qemu_can_send_packet() returns false. In my case there is no more
>>>> ports and net_hub_port_can_receive() returns false. So
>>>> qemu_net_queue_send() call qemu_net_queue_append() instead of
>>>> qemu_net_queue_deliver(). qemu_net_queue_append() appends the packet
>>>> to the receiving port of the network hub which is not flushed when
>>>> USB netork device calls qemu_flush_queued_packets(). It is flushed
>>>> only when slirp resend the packet by timeout.
>>>>
>>>> Actually there is no need in net_hub_port_can_receive() as the
>>>> network hub can always receive packets and pass it to its port
>>>> network clients with qemu_send_packet(). And if the destination port
>>>> network client cannot receive the packet it will be queued in the
>>>> *destination* port network client queue. Queued packets from that
>>>> queue will be delivered as soon as the network client call
>>>> qemu_flush_queued_packets().
>>> Please confirm the bug is still present in qemu.git/master.  It should
>>> have been fixed by the following commit:
>>>
>>>    commit 199ee608f0d08510b5c6c37f31a7fbff211d63c4
>>>    Author: Luigi Rizzo <rizzo@iet.unipi.it>
>>>    Date:   Tue Feb 5 17:53:31 2013 +0100
>>>
>>>        net: fix qemu_flush_queued_packets() in presence of a hub
>>>
>>> Stefan
>>>
>> Yes, this commit fixes a bug but from other side. I think it's
>> better to just let the qemu_send_packet() do the right things.
>>
>> E.g. network hub has 3 ports. Suppose when iterating through port
>> list in net_hub_port_can_receive() a packet is successfully
>> delivered to the first port, and then is queued in the source port
>> queue because the second port cannot receive packets. Later
>> net_hub_flush() will flush the packet from the source port queue and
>> it will be delivered in every port. But it had been already
>> delivered to one of them. So it will be delivered twice to some
>> ports. Moreover there is less chance to dequeue the packet if
>> several clients can't receive periodically.
> Did you mean "second port" instead of "source port" in the beginning?
>
> I don't see a scenario where the packet is delivered to the same port
> twice.  If one port can receive then the packet will be sent and other
> ports will queue the packet.
Whether dump network client can always receive? If so, then the packet 
will always be sent to the dump client regardless of whether the other 
clients can receive. Is there actually any synchronization?
> Note that the source port will not queue
> the packet.  When the blocked ports can receive again their send queues
> will be flushed.
>
> Paolo mentioned that we want to provide a consistent view of packets,
> especially when -net dump is used.
>
> Beyond that, we also want to avoid growing net queues indefinitely.  If
> the hub does not implement .can_receive() then it relies on growing
> queues (keeping packets buffered in memory).
>
> Stefan
>
Stefan Hajnoczi April 23, 2013, 11:48 a.m. UTC | #10
On Tue, Apr 23, 2013 at 01:32:11PM +0400, Fedorov Sergey wrote:
> 
> On 04/23/2013 10:58 AM, Stefan Hajnoczi wrote:
> >On Mon, Apr 22, 2013 at 07:27:21PM +0400, Fedorov Sergey wrote:
> >>On 04/22/2013 06:57 PM, Stefan Hajnoczi wrote:
> >>>On Mon, Apr 22, 2013 at 04:26:16PM +0400, Fedorov Sergey wrote:
> >>>>On 04/22/2013 03:47 PM, Stefan Hajnoczi wrote:
> >>>>>On Thu, Apr 18, 2013 at 03:31:55PM +0400, Sergey Fedorov wrote:
> >>>>>>Network hub should always receive incoming packets. Then forward them to
> >>>>>>the appropriate port queue and let the qemu_send_packet() do the right
> >>>>>>things. If the destination queue cannot receive the packet it will be
> >>>>>>appended to the queue. When the receiver call
> >>>>>>qemu_flush_queued_packets() later the queue will be really flushed and
> >>>>>>no packets will be stalled in the sender network queue.
> >>>>>>
> >>>>>>Signed-off-by: Sergey Fedorov <s.fedorov@samsung.com>
> >>>>>>---
> >>>>>>  net/hub.c |   20 --------------------
> >>>>>>  1 file changed, 20 deletions(-)
> >>>>>What is the point of this change?  There is no semantic difference for
> >>>>>well-behaved net clients.
> >>>>>
> >>>>>Does it fix a bug, if so, please include details?
> >>>>>
> >>>>>Stefan
> >>>>>
> >>>>>
> >>>>Yes, this fixes a bug. There were packet stalls when using user-mode
> >>>>networking with USB network device. slirp_output() calls
> >>>>qemu_send_packet() which eventually calls qemu_net_queue_send().
> >>>>qemu_net_queue_send() calls qemu_can_send_packet(), which calls
> >>>>can_receive() callback of network hub. Then
> >>>>net_hub_port_can_receive() also calls qemu_can_send_packet() for
> >>>>each port except packet source port.
> >>>>
> >>>>Sometimes USB network device is not able to receive packet and
> >>>>qemu_can_send_packet() returns false. In my case there is no more
> >>>>ports and net_hub_port_can_receive() returns false. So
> >>>>qemu_net_queue_send() call qemu_net_queue_append() instead of
> >>>>qemu_net_queue_deliver(). qemu_net_queue_append() appends the packet
> >>>>to the receiving port of the network hub which is not flushed when
> >>>>USB netork device calls qemu_flush_queued_packets(). It is flushed
> >>>>only when slirp resend the packet by timeout.
> >>>>
> >>>>Actually there is no need in net_hub_port_can_receive() as the
> >>>>network hub can always receive packets and pass it to its port
> >>>>network clients with qemu_send_packet(). And if the destination port
> >>>>network client cannot receive the packet it will be queued in the
> >>>>*destination* port network client queue. Queued packets from that
> >>>>queue will be delivered as soon as the network client call
> >>>>qemu_flush_queued_packets().
> >>>Please confirm the bug is still present in qemu.git/master.  It should
> >>>have been fixed by the following commit:
> >>>
> >>>   commit 199ee608f0d08510b5c6c37f31a7fbff211d63c4
> >>>   Author: Luigi Rizzo <rizzo@iet.unipi.it>
> >>>   Date:   Tue Feb 5 17:53:31 2013 +0100
> >>>
> >>>       net: fix qemu_flush_queued_packets() in presence of a hub
> >>>
> >>>Stefan
> >>>
> >>Yes, this commit fixes a bug but from other side. I think it's
> >>better to just let the qemu_send_packet() do the right things.
> >>
> >>E.g. network hub has 3 ports. Suppose when iterating through port
> >>list in net_hub_port_can_receive() a packet is successfully
> >>delivered to the first port, and then is queued in the source port
> >>queue because the second port cannot receive packets. Later
> >>net_hub_flush() will flush the packet from the source port queue and
> >>it will be delivered in every port. But it had been already
> >>delivered to one of them. So it will be delivered twice to some
> >>ports. Moreover there is less chance to dequeue the packet if
> >>several clients can't receive periodically.
> >Did you mean "second port" instead of "source port" in the beginning?
> >
> >I don't see a scenario where the packet is delivered to the same port
> >twice.  If one port can receive then the packet will be sent and other
> >ports will queue the packet.
> Whether dump network client can always receive? If so, then the
> packet will always be sent to the dump client regardless of whether
> the other clients can receive. Is there actually any
> synchronization?

Yes, the dump net client can always receive.  Other net clients may not.

There is no explicit synchronization between send queues on the hub.

Stefan
Sergey Fedorov April 23, 2013, 11:58 a.m. UTC | #11
On 04/23/2013 03:48 PM, Stefan Hajnoczi wrote:
> On Tue, Apr 23, 2013 at 01:32:11PM +0400, Fedorov Sergey wrote:
>> On 04/23/2013 10:58 AM, Stefan Hajnoczi wrote:
>>> On Mon, Apr 22, 2013 at 07:27:21PM +0400, Fedorov Sergey wrote:
>>>> On 04/22/2013 06:57 PM, Stefan Hajnoczi wrote:
>>>>> On Mon, Apr 22, 2013 at 04:26:16PM +0400, Fedorov Sergey wrote:
>>>>>> On 04/22/2013 03:47 PM, Stefan Hajnoczi wrote:
>>>>>>> On Thu, Apr 18, 2013 at 03:31:55PM +0400, Sergey Fedorov wrote:
>>>>>>>> Network hub should always receive incoming packets. Then forward them to
>>>>>>>> the appropriate port queue and let the qemu_send_packet() do the right
>>>>>>>> things. If the destination queue cannot receive the packet it will be
>>>>>>>> appended to the queue. When the receiver call
>>>>>>>> qemu_flush_queued_packets() later the queue will be really flushed and
>>>>>>>> no packets will be stalled in the sender network queue.
>>>>>>>>
>>>>>>>> Signed-off-by: Sergey Fedorov <s.fedorov@samsung.com>
>>>>>>>> ---
>>>>>>>>   net/hub.c |   20 --------------------
>>>>>>>>   1 file changed, 20 deletions(-)
>>>>>>> What is the point of this change?  There is no semantic difference for
>>>>>>> well-behaved net clients.
>>>>>>>
>>>>>>> Does it fix a bug, if so, please include details?
>>>>>>>
>>>>>>> Stefan
>>>>>>>
>>>>>>>
>>>>>> Yes, this fixes a bug. There were packet stalls when using user-mode
>>>>>> networking with USB network device. slirp_output() calls
>>>>>> qemu_send_packet() which eventually calls qemu_net_queue_send().
>>>>>> qemu_net_queue_send() calls qemu_can_send_packet(), which calls
>>>>>> can_receive() callback of network hub. Then
>>>>>> net_hub_port_can_receive() also calls qemu_can_send_packet() for
>>>>>> each port except packet source port.
>>>>>>
>>>>>> Sometimes USB network device is not able to receive packet and
>>>>>> qemu_can_send_packet() returns false. In my case there is no more
>>>>>> ports and net_hub_port_can_receive() returns false. So
>>>>>> qemu_net_queue_send() call qemu_net_queue_append() instead of
>>>>>> qemu_net_queue_deliver(). qemu_net_queue_append() appends the packet
>>>>>> to the receiving port of the network hub which is not flushed when
>>>>>> USB netork device calls qemu_flush_queued_packets(). It is flushed
>>>>>> only when slirp resend the packet by timeout.
>>>>>>
>>>>>> Actually there is no need in net_hub_port_can_receive() as the
>>>>>> network hub can always receive packets and pass it to its port
>>>>>> network clients with qemu_send_packet(). And if the destination port
>>>>>> network client cannot receive the packet it will be queued in the
>>>>>> *destination* port network client queue. Queued packets from that
>>>>>> queue will be delivered as soon as the network client call
>>>>>> qemu_flush_queued_packets().
>>>>> Please confirm the bug is still present in qemu.git/master.  It should
>>>>> have been fixed by the following commit:
>>>>>
>>>>>    commit 199ee608f0d08510b5c6c37f31a7fbff211d63c4
>>>>>    Author: Luigi Rizzo <rizzo@iet.unipi.it>
>>>>>    Date:   Tue Feb 5 17:53:31 2013 +0100
>>>>>
>>>>>        net: fix qemu_flush_queued_packets() in presence of a hub
>>>>>
>>>>> Stefan
>>>>>
>>>> Yes, this commit fixes a bug but from other side. I think it's
>>>> better to just let the qemu_send_packet() do the right things.
>>>>
>>>> E.g. network hub has 3 ports. Suppose when iterating through port
>>>> list in net_hub_port_can_receive() a packet is successfully
>>>> delivered to the first port, and then is queued in the source port
>>>> queue because the second port cannot receive packets. Later
>>>> net_hub_flush() will flush the packet from the source port queue and
>>>> it will be delivered in every port. But it had been already
>>>> delivered to one of them. So it will be delivered twice to some
>>>> ports. Moreover there is less chance to dequeue the packet if
>>>> several clients can't receive periodically.
>>> Did you mean "second port" instead of "source port" in the beginning?
>>>
>>> I don't see a scenario where the packet is delivered to the same port
>>> twice.  If one port can receive then the packet will be sent and other
>>> ports will queue the packet.
>> Whether dump network client can always receive? If so, then the
>> packet will always be sent to the dump client regardless of whether
>> the other clients can receive. Is there actually any
>> synchronization?
> Yes, the dump net client can always receive.  Other net clients may not.
>
> There is no explicit synchronization between send queues on the hub.
>
> Stefan
>
So why don't we apply this patch?
Stefan Hajnoczi April 23, 2013, noon UTC | #12
On Tue, Apr 23, 2013 at 11:41:42AM +0400, Fedorov Sergey wrote:
> >Beyond that, we also want to avoid growing net queues indefinitely.  If
> >the hub does not implement .can_receive() then it relies on growing
> >queues (keeping packets buffered in memory).
> No, net_hub_receive() calls qemu_send_packet(). If the destination
> queue cannot receive the packet qemu_net_queue_append() will take
> care of queue->nq_maxlen.

You are right, sorry.  We do discard packets at nq_maxlen.

The problem with ignoring .can_receive() on the hub is that it breaks
flow control.  For example, net/tap.c is designed to avoid reading more
packets if its peer cannot receive (see tap_can_send()).

If the hub claims it can always receive we waste cycles reading packets
from the tap device only to discard them.

Since qemu.git already has a fix which preserves flow control, I am not
going to merge your patch.

Stefan
Sergey Fedorov April 23, 2013, 1:14 p.m. UTC | #13
On 04/23/2013 04:00 PM, Stefan Hajnoczi wrote:
> On Tue, Apr 23, 2013 at 11:41:42AM +0400, Fedorov Sergey wrote:
>>> Beyond that, we also want to avoid growing net queues indefinitely.  If
>>> the hub does not implement .can_receive() then it relies on growing
>>> queues (keeping packets buffered in memory).
>> No, net_hub_receive() calls qemu_send_packet(). If the destination
>> queue cannot receive the packet qemu_net_queue_append() will take
>> care of queue->nq_maxlen.
> You are right, sorry.  We do discard packets at nq_maxlen.
>
> The problem with ignoring .can_receive() on the hub is that it breaks
> flow control.  For example, net/tap.c is designed to avoid reading more
> packets if its peer cannot receive (see tap_can_send()).
>
> If the hub claims it can always receive we waste cycles reading packets
> from the tap device only to discard them.
>
> Since qemu.git already has a fix which preserves flow control, I am not
> going to merge your patch.
>
> Stefan
>
OK, I see. Thanks for attention.
Sergey Fedorov Oct. 21, 2013, 11:44 a.m. UTC | #14
On 04/23/2013 04:00 PM, Stefan Hajnoczi wrote:
> On Tue, Apr 23, 2013 at 11:41:42AM +0400, Fedorov Sergey wrote:
>>> Beyond that, we also want to avoid growing net queues indefinitely.  If
>>> the hub does not implement .can_receive() then it relies on growing
>>> queues (keeping packets buffered in memory).
>> No, net_hub_receive() calls qemu_send_packet(). If the destination
>> queue cannot receive the packet qemu_net_queue_append() will take
>> care of queue->nq_maxlen.
> You are right, sorry.  We do discard packets at nq_maxlen.
>
> The problem with ignoring .can_receive() on the hub is that it breaks
> flow control.  For example, net/tap.c is designed to avoid reading more
> packets if its peer cannot receive (see tap_can_send()).
>
> If the hub claims it can always receive we waste cycles reading packets
> from the tap device only to discard them.
>
> Since qemu.git already has a fix which preserves flow control, I am not
> going to merge your patch.
>
> Stefan
>
>

Dear, Stefan Hajnoczi,

After our discussion about this patch I decided to keep my patch in our 
branch until rebase onto a new release. Recently I have rebased our 
branch onto v1.5.3 and reverted my patch. Then I face an issue when 
using user-mode networking with USB network device for mounting root 
file system through NFS. Fragmented UDP packets from host to guest does 
not handled properly. Seems that some fragments is lost or somehow 
stalled. See guest tcpdump log below.

03:16:52.259690 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto 
UDP (17), length 164)
     10.0.2.15.3369105030 > 10.0.2.2.nfs: 136 readdirplus fh 
Unknown/0100070004001200000000002873593C9B3C43388E23748B0BAD870C00000000 
512 bytes @ 0 max 4096 verf 0000000000000000
03:16:52.262323 IP (tos 0x0, ttl 64, id 16, offset 0, flags [+], proto 
UDP (17), length 1500)
     10.0.2.2.nfs > 10.0.2.15.3369105030: reply ok 1472 readdirplus 
POST: DIR 40777 ids 0/0 sz 4096 verf 0000000000000000
03:16:52.264592 IP (tos 0x0, ttl 64, id 16, offset 1480, flags [+], 
proto UDP (17), length 1500)
     10.0.2.2 > 10.0.2.15: udp
03:16:54.462961 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto 
UDP (17), length 164)
     10.0.2.15.3369105030 > 10.0.2.2.nfs: 136 readdirplus fh 
Unknown/0100070004001200000000002873593C9B3C43388E23748B0BAD870C00000000 
512 bytes @ 0 max 4096 verf 0000000000000000
03:16:54.466300 IP (tos 0x0, ttl 64, id 17, offset 0, flags [+], proto 
UDP (17), length 1500)
     10.0.2.2.nfs > 10.0.2.15.3369105030: reply ok 1472 readdirplus 
POST: DIR 40777 ids 0/0 sz 4096 verf 0000000000000000
03:16:54.467084 IP (tos 0x0, ttl 64, id 17, offset 1480, flags [+], 
proto UDP (17), length 1500)
     10.0.2.2 > 10.0.2.15: udp
...

I didn't investigate the cause of the problem in detail. I just reverted

commit 199ee608f0d08510b5c6c37f31a7fbff211d63c4
Author: Luigi Rizzo <rizzo@iet.unipi.it>
Date:   Tue Feb 5 17:53:31 2013 +0100

     net: fix qemu_flush_queued_packets() in presence of a hub

And then applied my patch. After that everything works fine for me. See 
guest tcpdump log below.

04:45:15.897245 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto 
UDP (17), length 164)
     10.0.2.15.3642011847 > 10.0.2.2.nfs: 136 readdirplus fh 
Unknown/0100070004001200000000002873593C9B3C43388E23748B0BAD870C00000000 
512 bytes @ 0 max 4096 verf 0000000000000000
04:45:15.899686 IP (tos 0x0, ttl 64, id 15, offset 0, flags [+], proto 
UDP (17), length 1500)
     10.0.2.2.nfs > 10.0.2.15.3642011847: reply ok 1472 readdirplus 
POST: DIR 40777 ids 0/0 sz 4096 verf 0000000000000000
04:45:15.906253 IP (tos 0x0, ttl 64, id 15, offset 1480, flags [+], 
proto UDP (17), length 1500)
     10.0.2.2 > 10.0.2.15: udp
04:45:15.906687 IP (tos 0x0, ttl 64, id 15, offset 2960, flags [none], 
proto UDP (17), length 240)
     10.0.2.2 > 10.0.2.15: udp

So there must be something wrong with already applied patch. What could 
you suggest?
Sergey Fedorov Oct. 21, 2013, 11:52 a.m. UTC | #15
On 10/21/2013 03:44 PM, Fedorov Sergey wrote:
> On 04/23/2013 04:00 PM, Stefan Hajnoczi wrote:
>> On Tue, Apr 23, 2013 at 11:41:42AM +0400, Fedorov Sergey wrote:
>>>> Beyond that, we also want to avoid growing net queues 
>>>> indefinitely.  If
>>>> the hub does not implement .can_receive() then it relies on growing
>>>> queues (keeping packets buffered in memory).
>>> No, net_hub_receive() calls qemu_send_packet(). If the destination
>>> queue cannot receive the packet qemu_net_queue_append() will take
>>> care of queue->nq_maxlen.
>> You are right, sorry.  We do discard packets at nq_maxlen.
>>
>> The problem with ignoring .can_receive() on the hub is that it breaks
>> flow control.  For example, net/tap.c is designed to avoid reading more
>> packets if its peer cannot receive (see tap_can_send()).
>>
>> If the hub claims it can always receive we waste cycles reading packets
>> from the tap device only to discard them.
>>
>> Since qemu.git already has a fix which preserves flow control, I am not
>> going to merge your patch.
>>
>> Stefan
>>
>>
>
> Dear, Stefan Hajnoczi,
>
> After our discussion about this patch I decided to keep my patch in 
> our branch until rebase onto a new release. Recently I have rebased 
> our branch onto v1.5.3 and reverted my patch. Then I face an issue 
> when using user-mode networking with USB network device for mounting 
> root file system through NFS. Fragmented UDP packets from host to 
> guest does not handled properly. Seems that some fragments is lost or 
> somehow stalled. See guest tcpdump log below.
>
> 03:16:52.259690 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto 
> UDP (17), length 164)
>     10.0.2.15.3369105030 > 10.0.2.2.nfs: 136 readdirplus fh 
> Unknown/0100070004001200000000002873593C9B3C43388E23748B0BAD870C00000000 
> 512 bytes @ 0 max 4096 verf 0000000000000000
> 03:16:52.262323 IP (tos 0x0, ttl 64, id 16, offset 0, flags [+], proto 
> UDP (17), length 1500)
>     10.0.2.2.nfs > 10.0.2.15.3369105030: reply ok 1472 readdirplus 
> POST: DIR 40777 ids 0/0 sz 4096 verf 0000000000000000
> 03:16:52.264592 IP (tos 0x0, ttl 64, id 16, offset 1480, flags [+], 
> proto UDP (17), length 1500)
>     10.0.2.2 > 10.0.2.15: udp
> 03:16:54.462961 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto 
> UDP (17), length 164)
>     10.0.2.15.3369105030 > 10.0.2.2.nfs: 136 readdirplus fh 
> Unknown/0100070004001200000000002873593C9B3C43388E23748B0BAD870C00000000 
> 512 bytes @ 0 max 4096 verf 0000000000000000
> 03:16:54.466300 IP (tos 0x0, ttl 64, id 17, offset 0, flags [+], proto 
> UDP (17), length 1500)
>     10.0.2.2.nfs > 10.0.2.15.3369105030: reply ok 1472 readdirplus 
> POST: DIR 40777 ids 0/0 sz 4096 verf 0000000000000000
> 03:16:54.467084 IP (tos 0x0, ttl 64, id 17, offset 1480, flags [+], 
> proto UDP (17), length 1500)
>     10.0.2.2 > 10.0.2.15: udp
> ...
>
> I didn't investigate the cause of the problem in detail. I just reverted
>
> commit 199ee608f0d08510b5c6c37f31a7fbff211d63c4
> Author: Luigi Rizzo <rizzo@iet.unipi.it>
> Date:   Tue Feb 5 17:53:31 2013 +0100
>
>     net: fix qemu_flush_queued_packets() in presence of a hub
>
> And then applied my patch. After that everything works fine for me. 
> See guest tcpdump log below.
>
> 04:45:15.897245 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto 
> UDP (17), length 164)
>     10.0.2.15.3642011847 > 10.0.2.2.nfs: 136 readdirplus fh 
> Unknown/0100070004001200000000002873593C9B3C43388E23748B0BAD870C00000000 
> 512 bytes @ 0 max 4096 verf 0000000000000000
> 04:45:15.899686 IP (tos 0x0, ttl 64, id 15, offset 0, flags [+], proto 
> UDP (17), length 1500)
>     10.0.2.2.nfs > 10.0.2.15.3642011847: reply ok 1472 readdirplus 
> POST: DIR 40777 ids 0/0 sz 4096 verf 0000000000000000
> 04:45:15.906253 IP (tos 0x0, ttl 64, id 15, offset 1480, flags [+], 
> proto UDP (17), length 1500)
>     10.0.2.2 > 10.0.2.15: udp
> 04:45:15.906687 IP (tos 0x0, ttl 64, id 15, offset 2960, flags [none], 
> proto UDP (17), length 240)
>     10.0.2.2 > 10.0.2.15: udp
>
> So there must be something wrong with already applied patch. What 
> could you suggest?
>

Sorry, I missed that Anthony Liguori email address has been changed. So 
I resend the email.
Sergey Fedorov Oct. 28, 2013, 7:26 a.m. UTC | #16
On 10/21/2013 03:52 PM, Fedorov Sergey wrote:
>
> On 10/21/2013 03:44 PM, Fedorov Sergey wrote:
>> On 04/23/2013 04:00 PM, Stefan Hajnoczi wrote:
>>> On Tue, Apr 23, 2013 at 11:41:42AM +0400, Fedorov Sergey wrote:
>>>>> Beyond that, we also want to avoid growing net queues 
>>>>> indefinitely.  If
>>>>> the hub does not implement .can_receive() then it relies on growing
>>>>> queues (keeping packets buffered in memory).
>>>> No, net_hub_receive() calls qemu_send_packet(). If the destination
>>>> queue cannot receive the packet qemu_net_queue_append() will take
>>>> care of queue->nq_maxlen.
>>> You are right, sorry.  We do discard packets at nq_maxlen.
>>>
>>> The problem with ignoring .can_receive() on the hub is that it breaks
>>> flow control.  For example, net/tap.c is designed to avoid reading more
>>> packets if its peer cannot receive (see tap_can_send()).
>>>
>>> If the hub claims it can always receive we waste cycles reading packets
>>> from the tap device only to discard them.
>>>
>>> Since qemu.git already has a fix which preserves flow control, I am not
>>> going to merge your patch.
>>>
>>> Stefan
>>>
>>>
>>
>> Dear, Stefan Hajnoczi,
>>
>> After our discussion about this patch I decided to keep my patch in 
>> our branch until rebase onto a new release. Recently I have rebased 
>> our branch onto v1.5.3 and reverted my patch. Then I face an issue 
>> when using user-mode networking with USB network device for mounting 
>> root file system through NFS. Fragmented UDP packets from host to 
>> guest does not handled properly. Seems that some fragments is lost or 
>> somehow stalled. See guest tcpdump log below.
>>
>> 03:16:52.259690 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], 
>> proto UDP (17), length 164)
>>     10.0.2.15.3369105030 > 10.0.2.2.nfs: 136 readdirplus fh 
>> Unknown/0100070004001200000000002873593C9B3C43388E23748B0BAD870C00000000 
>> 512 bytes @ 0 max 4096 verf 0000000000000000
>> 03:16:52.262323 IP (tos 0x0, ttl 64, id 16, offset 0, flags [+], 
>> proto UDP (17), length 1500)
>>     10.0.2.2.nfs > 10.0.2.15.3369105030: reply ok 1472 readdirplus 
>> POST: DIR 40777 ids 0/0 sz 4096 verf 0000000000000000
>> 03:16:52.264592 IP (tos 0x0, ttl 64, id 16, offset 1480, flags [+], 
>> proto UDP (17), length 1500)
>>     10.0.2.2 > 10.0.2.15: udp
>> 03:16:54.462961 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], 
>> proto UDP (17), length 164)
>>     10.0.2.15.3369105030 > 10.0.2.2.nfs: 136 readdirplus fh 
>> Unknown/0100070004001200000000002873593C9B3C43388E23748B0BAD870C00000000 
>> 512 bytes @ 0 max 4096 verf 0000000000000000
>> 03:16:54.466300 IP (tos 0x0, ttl 64, id 17, offset 0, flags [+], 
>> proto UDP (17), length 1500)
>>     10.0.2.2.nfs > 10.0.2.15.3369105030: reply ok 1472 readdirplus 
>> POST: DIR 40777 ids 0/0 sz 4096 verf 0000000000000000
>> 03:16:54.467084 IP (tos 0x0, ttl 64, id 17, offset 1480, flags [+], 
>> proto UDP (17), length 1500)
>>     10.0.2.2 > 10.0.2.15: udp
>> ...
>>
>> I didn't investigate the cause of the problem in detail. I just reverted
>>
>> commit 199ee608f0d08510b5c6c37f31a7fbff211d63c4
>> Author: Luigi Rizzo <rizzo@iet.unipi.it>
>> Date:   Tue Feb 5 17:53:31 2013 +0100
>>
>>     net: fix qemu_flush_queued_packets() in presence of a hub
>>
>> And then applied my patch. After that everything works fine for me. 
>> See guest tcpdump log below.
>>
>> 04:45:15.897245 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], 
>> proto UDP (17), length 164)
>>     10.0.2.15.3642011847 > 10.0.2.2.nfs: 136 readdirplus fh 
>> Unknown/0100070004001200000000002873593C9B3C43388E23748B0BAD870C00000000 
>> 512 bytes @ 0 max 4096 verf 0000000000000000
>> 04:45:15.899686 IP (tos 0x0, ttl 64, id 15, offset 0, flags [+], 
>> proto UDP (17), length 1500)
>>     10.0.2.2.nfs > 10.0.2.15.3642011847: reply ok 1472 readdirplus 
>> POST: DIR 40777 ids 0/0 sz 4096 verf 0000000000000000
>> 04:45:15.906253 IP (tos 0x0, ttl 64, id 15, offset 1480, flags [+], 
>> proto UDP (17), length 1500)
>>     10.0.2.2 > 10.0.2.15: udp
>> 04:45:15.906687 IP (tos 0x0, ttl 64, id 15, offset 2960, flags 
>> [none], proto UDP (17), length 240)
>>     10.0.2.2 > 10.0.2.15: udp
>>
>> So there must be something wrong with already applied patch. What 
>> could you suggest?
>>
>
> Sorry, I missed that Anthony Liguori email address has been changed. 
> So I resend the email.
>

Ping.
Stefan Hajnoczi Oct. 29, 2013, 2:55 p.m. UTC | #17
On Mon, Oct 21, 2013 at 03:44:46PM +0400, Fedorov Sergey wrote:
> After our discussion about this patch I decided to keep my patch in
> our branch until rebase onto a new release. Recently I have rebased
> our branch onto v1.5.3 and reverted my patch. Then I face an issue
> when using user-mode networking with USB network device for mounting
> root file system through NFS. Fragmented UDP packets from host to
> guest does not handled properly. Seems that some fragments is lost
> or somehow stalled. See guest tcpdump log below.
> 
> 03:16:52.259690 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF],
> proto UDP (17), length 164)
>     10.0.2.15.3369105030 > 10.0.2.2.nfs: 136 readdirplus fh Unknown/0100070004001200000000002873593C9B3C43388E23748B0BAD870C00000000
> 512 bytes @ 0 max 4096 verf 0000000000000000
> 03:16:52.262323 IP (tos 0x0, ttl 64, id 16, offset 0, flags [+],
> proto UDP (17), length 1500)
>     10.0.2.2.nfs > 10.0.2.15.3369105030: reply ok 1472 readdirplus
> POST: DIR 40777 ids 0/0 sz 4096 verf 0000000000000000
> 03:16:52.264592 IP (tos 0x0, ttl 64, id 16, offset 1480, flags [+],
> proto UDP (17), length 1500)
>     10.0.2.2 > 10.0.2.15: udp
> 03:16:54.462961 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF],
> proto UDP (17), length 164)
>     10.0.2.15.3369105030 > 10.0.2.2.nfs: 136 readdirplus fh Unknown/0100070004001200000000002873593C9B3C43388E23748B0BAD870C00000000
> 512 bytes @ 0 max 4096 verf 0000000000000000
> 03:16:54.466300 IP (tos 0x0, ttl 64, id 17, offset 0, flags [+],
> proto UDP (17), length 1500)
>     10.0.2.2.nfs > 10.0.2.15.3369105030: reply ok 1472 readdirplus
> POST: DIR 40777 ids 0/0 sz 4096 verf 0000000000000000
> 03:16:54.467084 IP (tos 0x0, ttl 64, id 17, offset 1480, flags [+],
> proto UDP (17), length 1500)
>     10.0.2.2 > 10.0.2.15: udp
> ...
> 
> I didn't investigate the cause of the problem in detail. I just reverted
> 
> commit 199ee608f0d08510b5c6c37f31a7fbff211d63c4
> Author: Luigi Rizzo <rizzo@iet.unipi.it>
> Date:   Tue Feb 5 17:53:31 2013 +0100
> 
>     net: fix qemu_flush_queued_packets() in presence of a hub
> 
> And then applied my patch. After that everything works fine for me.
> See guest tcpdump log below.
> 
> 04:45:15.897245 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF],
> proto UDP (17), length 164)
>     10.0.2.15.3642011847 > 10.0.2.2.nfs: 136 readdirplus fh Unknown/0100070004001200000000002873593C9B3C43388E23748B0BAD870C00000000
> 512 bytes @ 0 max 4096 verf 0000000000000000
> 04:45:15.899686 IP (tos 0x0, ttl 64, id 15, offset 0, flags [+],
> proto UDP (17), length 1500)
>     10.0.2.2.nfs > 10.0.2.15.3642011847: reply ok 1472 readdirplus
> POST: DIR 40777 ids 0/0 sz 4096 verf 0000000000000000
> 04:45:15.906253 IP (tos 0x0, ttl 64, id 15, offset 1480, flags [+],
> proto UDP (17), length 1500)
>     10.0.2.2 > 10.0.2.15: udp
> 04:45:15.906687 IP (tos 0x0, ttl 64, id 15, offset 2960, flags
> [none], proto UDP (17), length 240)
>     10.0.2.2 > 10.0.2.15: udp
> 
> So there must be something wrong with already applied patch. What
> could you suggest?

The next step is to investigate the cause.

Perhaps hw/usb/dev-network.c:usb_net_handle_datain() is not calling
qemu_flush_queued_packets() every time in_buf[] is read completely.
This if statement looks strange to me:

if (s->in_ptr >= s->in_len &&
    (is_rndis(s) || (s->in_len & (64 - 1)) || !len)) {
    /* no short packet necessary */
    usb_net_reset_in_buf(s);
}

Try placing printfs to find out whether qemu_flush_queued_packets() is
getting called when you see packet loss.

Stefan
Sergey Fedorov Oct. 30, 2013, 10:29 a.m. UTC | #18
On 10/29/2013 06:55 PM, Stefan Hajnoczi wrote:
> On Mon, Oct 21, 2013 at 03:44:46PM +0400, Fedorov Sergey wrote:
>> After our discussion about this patch I decided to keep my patch in
>> our branch until rebase onto a new release. Recently I have rebased
>> our branch onto v1.5.3 and reverted my patch. Then I face an issue
>> when using user-mode networking with USB network device for mounting
>> root file system through NFS. Fragmented UDP packets from host to
>> guest does not handled properly. Seems that some fragments is lost
>> or somehow stalled. See guest tcpdump log below.
>>
>> 03:16:52.259690 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF],
>> proto UDP (17), length 164)
>>      10.0.2.15.3369105030 > 10.0.2.2.nfs: 136 readdirplus fh Unknown/0100070004001200000000002873593C9B3C43388E23748B0BAD870C00000000
>> 512 bytes @ 0 max 4096 verf 0000000000000000
>> 03:16:52.262323 IP (tos 0x0, ttl 64, id 16, offset 0, flags [+],
>> proto UDP (17), length 1500)
>>      10.0.2.2.nfs > 10.0.2.15.3369105030: reply ok 1472 readdirplus
>> POST: DIR 40777 ids 0/0 sz 4096 verf 0000000000000000
>> 03:16:52.264592 IP (tos 0x0, ttl 64, id 16, offset 1480, flags [+],
>> proto UDP (17), length 1500)
>>      10.0.2.2 > 10.0.2.15: udp
>> 03:16:54.462961 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF],
>> proto UDP (17), length 164)
>>      10.0.2.15.3369105030 > 10.0.2.2.nfs: 136 readdirplus fh Unknown/0100070004001200000000002873593C9B3C43388E23748B0BAD870C00000000
>> 512 bytes @ 0 max 4096 verf 0000000000000000
>> 03:16:54.466300 IP (tos 0x0, ttl 64, id 17, offset 0, flags [+],
>> proto UDP (17), length 1500)
>>      10.0.2.2.nfs > 10.0.2.15.3369105030: reply ok 1472 readdirplus
>> POST: DIR 40777 ids 0/0 sz 4096 verf 0000000000000000
>> 03:16:54.467084 IP (tos 0x0, ttl 64, id 17, offset 1480, flags [+],
>> proto UDP (17), length 1500)
>>      10.0.2.2 > 10.0.2.15: udp
>> ...
>>
>> I didn't investigate the cause of the problem in detail. I just reverted
>>
>> commit 199ee608f0d08510b5c6c37f31a7fbff211d63c4
>> Author: Luigi Rizzo <rizzo@iet.unipi.it>
>> Date:   Tue Feb 5 17:53:31 2013 +0100
>>
>>      net: fix qemu_flush_queued_packets() in presence of a hub
>>
>> And then applied my patch. After that everything works fine for me.
>> See guest tcpdump log below.
>>
>> 04:45:15.897245 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF],
>> proto UDP (17), length 164)
>>      10.0.2.15.3642011847 > 10.0.2.2.nfs: 136 readdirplus fh Unknown/0100070004001200000000002873593C9B3C43388E23748B0BAD870C00000000
>> 512 bytes @ 0 max 4096 verf 0000000000000000
>> 04:45:15.899686 IP (tos 0x0, ttl 64, id 15, offset 0, flags [+],
>> proto UDP (17), length 1500)
>>      10.0.2.2.nfs > 10.0.2.15.3642011847: reply ok 1472 readdirplus
>> POST: DIR 40777 ids 0/0 sz 4096 verf 0000000000000000
>> 04:45:15.906253 IP (tos 0x0, ttl 64, id 15, offset 1480, flags [+],
>> proto UDP (17), length 1500)
>>      10.0.2.2 > 10.0.2.15: udp
>> 04:45:15.906687 IP (tos 0x0, ttl 64, id 15, offset 2960, flags
>> [none], proto UDP (17), length 240)
>>      10.0.2.2 > 10.0.2.15: udp
>>
>> So there must be something wrong with already applied patch. What
>> could you suggest?
> The next step is to investigate the cause.
>
> Perhaps hw/usb/dev-network.c:usb_net_handle_datain() is not calling
> qemu_flush_queued_packets() every time in_buf[] is read completely.
> This if statement looks strange to me:
>
> if (s->in_ptr >= s->in_len &&
>      (is_rndis(s) || (s->in_len & (64 - 1)) || !len)) {
>      /* no short packet necessary */
>      usb_net_reset_in_buf(s);
> }
>
> Try placing printfs to find out whether qemu_flush_queued_packets() is
> getting called when you see packet loss.
>
> Stefan
>

Seems that I have figured out the problem. net_hub_flush() does not 
flush source port. And qemu_flush_queued_packets() also returns after 
calling net_hub_flush(). So I think the problem is that neither 
qemu_flush_queued_packets() nor net_hub_flush() call 
qemu_net_queue_flush() for the source port. So I think it sould be fixed 
in qemu_flush_queued_packets() by removing the return statement after 
calling net_hub_flush(). That fix does work for me. So I could submit 
that patch after getting permission for that.
Stefan Hajnoczi Oct. 30, 2013, 12:55 p.m. UTC | #19
On Wed, Oct 30, 2013 at 02:29:00PM +0400, Fedorov Sergey wrote:
> On 10/29/2013 06:55 PM, Stefan Hajnoczi wrote:
> >On Mon, Oct 21, 2013 at 03:44:46PM +0400, Fedorov Sergey wrote:
> >>After our discussion about this patch I decided to keep my patch in
> >>our branch until rebase onto a new release. Recently I have rebased
> >>our branch onto v1.5.3 and reverted my patch. Then I face an issue
> >>when using user-mode networking with USB network device for mounting
> >>root file system through NFS. Fragmented UDP packets from host to
> >>guest does not handled properly. Seems that some fragments is lost
> >>or somehow stalled. See guest tcpdump log below.
> >>
> >>03:16:52.259690 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF],
> >>proto UDP (17), length 164)
> >>     10.0.2.15.3369105030 > 10.0.2.2.nfs: 136 readdirplus fh Unknown/0100070004001200000000002873593C9B3C43388E23748B0BAD870C00000000
> >>512 bytes @ 0 max 4096 verf 0000000000000000
> >>03:16:52.262323 IP (tos 0x0, ttl 64, id 16, offset 0, flags [+],
> >>proto UDP (17), length 1500)
> >>     10.0.2.2.nfs > 10.0.2.15.3369105030: reply ok 1472 readdirplus
> >>POST: DIR 40777 ids 0/0 sz 4096 verf 0000000000000000
> >>03:16:52.264592 IP (tos 0x0, ttl 64, id 16, offset 1480, flags [+],
> >>proto UDP (17), length 1500)
> >>     10.0.2.2 > 10.0.2.15: udp
> >>03:16:54.462961 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF],
> >>proto UDP (17), length 164)
> >>     10.0.2.15.3369105030 > 10.0.2.2.nfs: 136 readdirplus fh Unknown/0100070004001200000000002873593C9B3C43388E23748B0BAD870C00000000
> >>512 bytes @ 0 max 4096 verf 0000000000000000
> >>03:16:54.466300 IP (tos 0x0, ttl 64, id 17, offset 0, flags [+],
> >>proto UDP (17), length 1500)
> >>     10.0.2.2.nfs > 10.0.2.15.3369105030: reply ok 1472 readdirplus
> >>POST: DIR 40777 ids 0/0 sz 4096 verf 0000000000000000
> >>03:16:54.467084 IP (tos 0x0, ttl 64, id 17, offset 1480, flags [+],
> >>proto UDP (17), length 1500)
> >>     10.0.2.2 > 10.0.2.15: udp
> >>...
> >>
> >>I didn't investigate the cause of the problem in detail. I just reverted
> >>
> >>commit 199ee608f0d08510b5c6c37f31a7fbff211d63c4
> >>Author: Luigi Rizzo <rizzo@iet.unipi.it>
> >>Date:   Tue Feb 5 17:53:31 2013 +0100
> >>
> >>     net: fix qemu_flush_queued_packets() in presence of a hub
> >>
> >>And then applied my patch. After that everything works fine for me.
> >>See guest tcpdump log below.
> >>
> >>04:45:15.897245 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF],
> >>proto UDP (17), length 164)
> >>     10.0.2.15.3642011847 > 10.0.2.2.nfs: 136 readdirplus fh Unknown/0100070004001200000000002873593C9B3C43388E23748B0BAD870C00000000
> >>512 bytes @ 0 max 4096 verf 0000000000000000
> >>04:45:15.899686 IP (tos 0x0, ttl 64, id 15, offset 0, flags [+],
> >>proto UDP (17), length 1500)
> >>     10.0.2.2.nfs > 10.0.2.15.3642011847: reply ok 1472 readdirplus
> >>POST: DIR 40777 ids 0/0 sz 4096 verf 0000000000000000
> >>04:45:15.906253 IP (tos 0x0, ttl 64, id 15, offset 1480, flags [+],
> >>proto UDP (17), length 1500)
> >>     10.0.2.2 > 10.0.2.15: udp
> >>04:45:15.906687 IP (tos 0x0, ttl 64, id 15, offset 2960, flags
> >>[none], proto UDP (17), length 240)
> >>     10.0.2.2 > 10.0.2.15: udp
> >>
> >>So there must be something wrong with already applied patch. What
> >>could you suggest?
> >The next step is to investigate the cause.
> >
> >Perhaps hw/usb/dev-network.c:usb_net_handle_datain() is not calling
> >qemu_flush_queued_packets() every time in_buf[] is read completely.
> >This if statement looks strange to me:
> >
> >if (s->in_ptr >= s->in_len &&
> >     (is_rndis(s) || (s->in_len & (64 - 1)) || !len)) {
> >     /* no short packet necessary */
> >     usb_net_reset_in_buf(s);
> >}
> >
> >Try placing printfs to find out whether qemu_flush_queued_packets() is
> >getting called when you see packet loss.
> >
> >Stefan
> >
> 
> Seems that I have figured out the problem. net_hub_flush() does not
> flush source port. And qemu_flush_queued_packets() also returns
> after calling net_hub_flush(). So I think the problem is that
> neither qemu_flush_queued_packets() nor net_hub_flush() call
> qemu_net_queue_flush() for the source port. So I think it sould be
> fixed in qemu_flush_queued_packets() by removing the return
> statement after calling net_hub_flush(). That fix does work for me.
> So I could submit that patch after getting permission for that.

Sounds good to me.

I have CCed Luigi in case he wants to comment.

Stefan
diff mbox

Patch

diff --git a/net/hub.c b/net/hub.c
index df32074..552e970 100644
--- a/net/hub.c
+++ b/net/hub.c
@@ -87,25 +87,6 @@  static NetHub *net_hub_new(int id)
     return hub;
 }
 
-static int net_hub_port_can_receive(NetClientState *nc)
-{
-    NetHubPort *port;
-    NetHubPort *src_port = DO_UPCAST(NetHubPort, nc, nc);
-    NetHub *hub = src_port->hub;
-
-    QLIST_FOREACH(port, &hub->ports, next) {
-        if (port == src_port) {
-            continue;
-        }
-
-        if (qemu_can_send_packet(&port->nc)) {
-            return 1;
-        }
-    }
-
-    return 0;
-}
-
 static ssize_t net_hub_port_receive(NetClientState *nc,
                                     const uint8_t *buf, size_t len)
 {
@@ -132,7 +113,6 @@  static void net_hub_port_cleanup(NetClientState *nc)
 static NetClientInfo net_hub_port_info = {
     .type = NET_CLIENT_OPTIONS_KIND_HUBPORT,
     .size = sizeof(NetHubPort),
-    .can_receive = net_hub_port_can_receive,
     .receive = net_hub_port_receive,
     .receive_iov = net_hub_port_receive_iov,
     .cleanup = net_hub_port_cleanup,