diff mbox series

[v5,24/24] slirp: fix ipv6 timers

Message ID 20180725121706.12867.98787.stgit@pasha-VirtualBox
State New
Series Fixing record/replay and adding reverse debugging

Commit Message

Pavel Dovgalyuk July 25, 2018, 12:17 p.m. UTC
ICMP implementation for IPv6 uses timers based on virtual clock.
This is incorrect because this service is not related to the guest state.
This patch switches these timers from the virtual clock to the realtime clock.

Signed-off-by: Pavel Dovgalyuk <Pavel.Dovgaluk@ispras.ru>
---
 slirp/ip6_icmp.c |    6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

Comments

Samuel Thibault July 25, 2018, 1:44 p.m. UTC | #1
Pavel Dovgalyuk, on Wed 25 Jul 2018 15:17:06 +0300, wrote:
> ICMP implementation for IPv6 uses timers based on virtual clock.
> This is incorrect because this service is not related to the guest state.

? Why not?  The RAs are seen by the guest.  As documented:

 * @QEMU_CLOCK_REALTIME: Real time clock
 *
 * The real time clock should be used only for stuff which does not
 * change the virtual machine state, as it runs even if the virtual
 * machine is stopped.

There is no reason to "send RAs" while the machine is stopped.

Samuel
Pavel Dovgalyuk July 26, 2018, 7:08 a.m. UTC | #2
> From: Samuel Thibault [mailto:samuel.thibault@gnu.org]
> Pavel Dovgalyuk, on Wed 25 Jul 2018 15:17:06 +0300, wrote:
> > ICMP implementation for IPv6 uses timers based on virtual clock.
> > This is incorrect because this service is not related to the guest state.
> 
> ? Why not?  The RAs are seen by the guest.  

Because the virtual clock should be used by virtual devices.
The slirp module is not a virtual device. Processed packets therefore
become visible to the guest only after they are passed to the virtual network card.
Before that, slirp can create any timers, and they should not change the state of the guest.

> As documented:
> 
>  * @QEMU_CLOCK_REALTIME: Real time clock
>  *
>  * The real time clock should be used only for stuff which does not
>  * change the virtual machine state, as it runs even if the virtual
>  * machine is stopped.
> 
> There is no reason to "send RAs" while the machine is stopped.

I see.
Then we'll need one more clock, one that works like realtime+virtual:
intended for internal QEMU purposes, but stopped while the
VM is stopped.

Pavel Dovgalyuk
Samuel Thibault July 26, 2018, 7:35 a.m. UTC | #3
Pavel Dovgalyuk, on Thu 26 Jul 2018 10:08:29 +0300, wrote:
> > As documented:
> > 
> >  * @QEMU_CLOCK_REALTIME: Real time clock
> >  *
> >  * The real time clock should be used only for stuff which does not
> >  * change the virtual machine state, as it runs even if the virtual
> >  * machine is stopped.
> > 
> > There is no reason to "send RAs" while the machine is stopped.
> 
> I see.
> Then we'll need one more clock. Which works like realtime+virtual:
> intended to be used for the internal QEMU purposes, but stops when
> VM is stopped.

Just to be sure: what is meant by "is stopped"? Is it a pause (thus time
does not advance within the guest), or is it just sleeping because it
has nothing to do?

Samuel
Pavel Dovgalyuk July 26, 2018, 7:37 a.m. UTC | #4
> From: Samuel Thibault [mailto:samuel.thibault@gnu.org]
> Pavel Dovgalyuk, on Thu 26 Jul 2018 10:08:29 +0300, wrote:
> > > As documented:
> > >
> > >  * @QEMU_CLOCK_REALTIME: Real time clock
> > >  *
> > >  * The real time clock should be used only for stuff which does not
> > >  * change the virtual machine state, as it runs even if the virtual
> > >  * machine is stopped.
> > >
> > > There is no reason to "send RAs" while the machine is stopped.
> >
> > I see.
> > Then we'll need one more clock. Which works like realtime+virtual:
> > intended to be used for the internal QEMU purposes, but stops when
> > VM is stopped.
> 
> Just to be sure: what is meant by "is stopped"? Is it a pause (thus time
> does not advance within the guest), or is it just sleeping because it
> has nothing to do?

Paused with an HMP/QMP command,
since the virtual clock runs only while the VM is not paused.

Pavel Dovgalyuk
Samuel Thibault July 26, 2018, 7:40 a.m. UTC | #5
Pavel Dovgalyuk, on Thu 26 Jul 2018 10:37:03 +0300, wrote:
> > From: Samuel Thibault [mailto:samuel.thibault@gnu.org]
> > Pavel Dovgalyuk, on Thu 26 Jul 2018 10:08:29 +0300, wrote:
> > > > As documented:
> > > >
> > > >  * @QEMU_CLOCK_REALTIME: Real time clock
> > > >  *
> > > >  * The real time clock should be used only for stuff which does not
> > > >  * change the virtual machine state, as it runs even if the virtual
> > > >  * machine is stopped.
> > > >
> > > > There is no reason to "send RAs" while the machine is stopped.
> > >
> > > I see.
> > > Then we'll need one more clock. Which works like realtime+virtual:
> > > intended to be used for the internal QEMU purposes, but stops when
> > > VM is stopped.
> > 
> > Just to be sure: what is meant by "is stopped"? Is it a pause (thus time
> > does not advance within the guest), or is it just sleeping because it
> > has nothing to do?
> 
> Paused with HMP/QMP command.
> As virtual clock runs only if VM is not paused.

Then all other uses of qemu_clock in slirp are bogus and need to be
fixed like ip6_icmp: they are using QEMU_CLOCK_REALTIME, but they want
it not to progress while the guest time is not advancing. Otherwise on
guest resume after a long pause basically all TCP/UDP/ARP timings will
have expired.
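
To illustrate the effect, here is a minimal sketch using a toy model
rather than QEMU's timer code. ToyClocks, toy_advance and toy_expired
are hypothetical names, and host time is advanced by hand:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model, not QEMU code: time is advanced explicitly by the caller. */
typedef struct {
    int64_t host_ms;     /* models a clock that always runs (realtime) */
    int64_t virtual_ms;  /* models a clock that stops while paused */
    bool paused;
} ToyClocks;

/* Let `ms` milliseconds of host time elapse. */
static void toy_advance(ToyClocks *c, int64_t ms)
{
    assert(ms >= 0);
    c->host_ms += ms;
    if (!c->paused) {
        c->virtual_ms += ms;  /* guest time only moves while running */
    }
}

/* A timer has expired once its clock reaches the deadline. */
static bool toy_expired(int64_t now_ms, int64_t deadline_ms)
{
    return now_ms >= deadline_ms;
}
```

With a 10 s deadline armed on each clock, then 1 s of running and a
60 s pause, the deadline measured against host_ms has expired on resume
while the one measured against virtual_ms still has 9 s left.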

Samuel
Samuel Thibault July 26, 2018, 8:07 a.m. UTC | #6
Pavel Dovgalyuk, on Thu 26 Jul 2018 10:08:29 +0300, wrote:
> virtual clock should be used by the virtual devices.
> slirp module is not the virtual device. Therefore processed packets
> become visible to the guest after passing to the virtual network card.
> Before that it can create any timers that should not change the state of the guest.

I'm not sure I understand that part correctly. slirp is not a "device"
strictly speaking, but it has a whole foot in the virtual world. All
TCP/UDP/ARP/RA timings are related to the guest timing, so

> > > this service is not related to the guest state.

seems incorrect. At the moment the ip6_icmp timer's current value is not
saved in the guest state, but in principle it should, so that the guest
does see the RAs at a regular rate. In practice we don't care because
the timing is randomized anyway.

> intended to be used for the internal QEMU purposes, but stops when VM
> is stopped.

I again don't understand this. The ip6_icmp timing is not for internal
QEMU purpose, its whole point is how often RAs are sent to the guest.

slirp's guest part is not a device as directly seen by guest I/O, but
it's a router device as seen through guest packets. Think of it like a
USB device, which is seen by the guest through USB packets.

Samuel
Pavel Dovgalyuk July 26, 2018, 8:37 a.m. UTC | #7
> From: Samuel Thibault [mailto:samuel.thibault@gnu.org]
> Pavel Dovgalyuk, on Thu 26 Jul 2018 10:08:29 +0300, wrote:
> > virtual clock should be used by the virtual devices.
> > slirp module is not the virtual device. Therefore processed packets
> > become visible to the guest after passing to the virtual network card.
> > Before that it can create any timers that should not change the state of the guest.
> 
> I'm not sure I understand that part correctly. slirp is not a "device"
> strictly speaking, but it has a whole foot in the virtual world. All
> TCP/UDP/ARP/RA timings are related to the guest timing, so

I don't know all the details of slirp, so let me ask:
if the virtual timer runs very slowly (when it is configured this way with the icount option),
should the timings follow this speed? Or are the timers related to the network devices
(e.g., servers in the outer world)?

> > > > this service is not related to the guest state.
> 
> seems incorrect. At the moment the ip6_icmp timer's current value is not
> saved in the guest state, but in principle it should, so that the guest
> does see the RAs at a regular rate. In practice we don't care because
> the timing is randomized anyway.

Isn't this just a side effect?
I mean that slirp may be replaced by, say, tap, and the guest should not notice
the difference.

> > intended to be used for the internal QEMU purposes, but stops when VM
> > is stopped.
> 
> I again don't understand this. The ip6_icmp timing is not for internal
> QEMU purpose, its whole point is how often RAs are sent to the guest.
> 
> slirp's guest part is not a device as directly seen by guest I/O, but
> it's a router device as seen through guest packets. Think of it like a
> USB device, which is seen by the guest through USB packets.

The record/replay implementation draws a line between the guest state and
the outer world. Everything crossing this line is saved in the log and replayed from it.
In the case of the network, this line is implemented with the network filter.
It takes packets from slirp (or anything else) and passes them (or not) to the guest NIC.
When replaying, the saved packets are injected into the filter directly.
Slirp is part of the outer world, so it shouldn't affect the guest state directly.
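
As a rough illustration of that boundary, here is a minimal toy model in C.
The names (ToyFilter, filter_from_backend, filter_replay_next) are
hypothetical and are not QEMU's filter API. Dropping backend traffic
during replay and the fixed-size log are assumptions of the sketch:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define LOG_MAX 16   /* toy limit on logged packets */
#define PKT_MAX 64   /* toy limit on packet size */

typedef enum { MODE_RECORD, MODE_REPLAY } FilterMode;

typedef struct {
    FilterMode mode;
    unsigned char data[LOG_MAX][PKT_MAX];
    size_t len[LOG_MAX];
    size_t count;  /* packets stored in the log */
    size_t next;   /* next logged packet to replay */
} ToyFilter;

/* Packet arriving from the backend (slirp, tap, ...).
 * Record mode: log it and copy it to `out` for the guest NIC.
 * Replay mode: ignore it. Returns the bytes delivered to the guest. */
static size_t filter_from_backend(ToyFilter *f, const unsigned char *pkt,
                                  size_t n, unsigned char *out)
{
    assert(n <= PKT_MAX);
    if (f->mode == MODE_REPLAY || f->count == LOG_MAX) {
        return 0;
    }
    memcpy(f->data[f->count], pkt, n);
    f->len[f->count++] = n;
    memcpy(out, pkt, n);
    return n;
}

/* Replay mode: inject the next logged packet into `out`.
 * Returns 0 when not replaying or when the log is exhausted. */
static size_t filter_replay_next(ToyFilter *f, unsigned char *out)
{
    if (f->mode != MODE_REPLAY || f->next == f->count) {
        return 0;
    }
    size_t n = f->len[f->next];
    memcpy(out, f->data[f->next++], n);
    return n;
}
```

In this model the guest sees the same bytes in both modes, and nothing
the backend does during replay reaches the guest.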

Pavel Dovgalyuk
Samuel Thibault July 26, 2018, 9:15 a.m. UTC | #8
Pavel Dovgalyuk, on Thu 26 Jul 2018 11:37:57 +0300, wrote:
> > From: Samuel Thibault [mailto:samuel.thibault@gnu.org]
> > Pavel Dovgalyuk, on Thu 26 Jul 2018 10:08:29 +0300, wrote:
> > > virtual clock should be used by the virtual devices.
> > > slirp module is not the virtual device. Therefore processed packets
> > > become visible to the guest after passing to the virtual network card.
> > > Before that it can create any timers that should not change the state of the guest.
> > 
> > I'm not sure I understand that part correctly. slirp is not a "device"
> > strictly speaking, but it has a whole foot in the virtual world. All
> > TCP/UDP/ARP/RA timings are related to the guest timing, so
> 
> I don't know all details of slirp, so let me ask:
> if the virtual timer runs very slowly (when it configured this way with icount option),
> should the timings relate this speed?

Yes. Otherwise the guest will not be fast enough to answer promptly
according to slirp's TCP delays.

> Or the timers are related to the network devices (e.g., servers in the
> outer world)?

No.

> > > > > this service is not related to the guest state.
> > 
> > seems incorrect. At the moment the ip6_icmp timer's current value is not
> > saved in the guest state, but in principle it should, so that the guest
> > does see the RAs at a regular rate. In practice we don't care because
> > the timing is randomized anyway.
> 
> Isn't this just a side effect?
> I mean that slirp may be replaced by, say, tap, and the guest should not notice
> the difference.

Well, if a guest is connected through a tap, the virtual time should
really run as fast as the realtime, and it should not be paused.
Otherwise TCP connections will break since the guest won't be able to
reply fast enough, without even knowing about the issue. Slirp can
compensate for this thanks to a buffer between what happens in the real
world and what happens in the virtual world. Real world timings are
handled by the OS socket implementation, and virtual world timings are
handled with the qemu timer.

> > > intended to be used for the internal QEMU purposes, but stops when VM
> > > is stopped.
> > 
> > I again don't understand this. The ip6_icmp timing is not for internal
> > QEMU purpose, its whole point is how often RAs are sent to the guest.
> > 
> > slirp's guest part is not a device as directly seen by guest I/O, but
> > it's a router device as seen through guest packets. Think of it like a
> > USB device, which is seen by the guest through USB packets.
> 
> Record/replay implementation creates a line between the guest state and
> the outer world. Everything crossing this line is saved in the log replayed.
> In case of network, this line is implemented with the network filter.
> It takes packets from slirp(or anything) and passes(or not) them to the guest nic.
> When replaying, the saved packets are injected into the filter directly.

> Slirp is the part of the outer world,

In normal use it is not. It is a virtual world (its DHCP server, tftp
server, TCP connections, etc.) that lives alongside the guest.

Now, I understand that for record/replay it's simpler to put the line
after slirp.

Ideally slirp's state should be split in two: the part connected
to the real world (data from/to the sockets), and the part connected to
the virtual world (TCP buffering with the guest). So that when pausing,
going back, going forward etc. the slirp buffers act accordingly, TCP
knowing exactly what is supposed to be sent or not (otherwise, TCP
would for instance be really astonished if the guest happens to insist
requesting old data that it has already ACKed).

But that's tricky, and I understand it's simpler to just put the line
after slirp, and let the replay of frames provide the guest (which for
instance has been reset to an older time) with the missing data, and TCP
will nicely cope with duplicate ACKs and spurious re-emissions from the
guest.

That being said, there will be problems with TCP connections if you
pause the guest for a long time: slirp's TCP will time out and reset the
connection. Yes, that happens with tap devices anyway, but slirp acting
as a buffer seems more useful to me.

Samuel
Pavel Dovgalyuk July 31, 2018, 6:58 a.m. UTC | #9
> From: Samuel Thibault [mailto:samuel.thibault@gnu.org]
> Pavel Dovgalyuk, on Thu 26 Jul 2018 11:37:57 +0300, wrote:
> > > From: Samuel Thibault [mailto:samuel.thibault@gnu.org]
> > > Pavel Dovgalyuk, on Thu 26 Jul 2018 10:08:29 +0300, wrote:
> > > > virtual clock should be used by the virtual devices.
> > > > slirp module is not the virtual device. Therefore processed packets
> > > > become visible to the guest after passing to the virtual network card.
> > > > Before that it can create any timers that should not change the state of the guest.
> > >
> > > I'm not sure I understand that part correctly. slirp is not a "device"
> > > strictly speaking, but it has a whole foot in the virtual world. All
> > > TCP/UDP/ARP/RA timings are related to the guest timing, so
> >
> > I don't know all details of slirp, so let me ask:
> > if the virtual timer runs very slowly (when it configured this way with icount option),
> > should the timings relate this speed?
> 
> Yes. Otherwise the guest will not be fast enough to answer promptly
> according to slirp's TCP delays.
> 
> > Or the timers are related to the network devices (e.g., servers in the
> > outer world)?
> 
> No.
> 
> > > > > > this service is not related to the guest state.
> > >
> > > seems incorrect. At the moment the ip6_icmp timer's current value is not
> > > saved in the guest state, but in principle it should, so that the guest
> > > does see the RAs at a regular rate. In practice we don't care because
> > > the timing is randomized anyway.
> >
> > Isn't this just a side effect?
> > I mean that slirp may be replaced by, say, tap, and the guest should not notice
> > the difference.
> 
> Well, if a guest is connected through a tap, the virtual time should
> really run as fast as the realtime, and it should not be paused.
> Otherwise TCP connections will break since the guest won't be able to
> reply fast enough, without even knowing about the issue. Slirp can
> compensate this thanks to a buffer between what happens in the real
> world and what happens in the virtual world. Real world timings are
> handled by the OS socket implementation, and virtual world timings are
> handled with the qemu timer.

Then maybe the solution is a new clock that runs at the frequency of the virtual
clock, but does not affect the replayed core?
This clock should stop when the VM is paused.
It also could be saved in vmstate. As it does not affect the replay,
saving and restoring its state won't break anything.
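
A minimal sketch of the pause accounting and save/restore such a clock
would need, assuming it is derived from a host time source passed in by
the caller. PausableClock and the pclock_* functions are hypothetical
names, not an existing QEMU clock type, and tracking the virtual clock's
frequency under icount is not modelled here:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical clock: follows host time but excludes paused periods. */
typedef struct {
    int64_t offset_ms;     /* host time to subtract from readings */
    int64_t paused_at_ms;  /* host time at which the pause began */
    bool paused;
} PausableClock;

/* Current value: frozen at the pause point while paused. */
static int64_t pclock_get_ms(const PausableClock *c, int64_t host_now_ms)
{
    int64_t ref = c->paused ? c->paused_at_ms : host_now_ms;
    return ref - c->offset_ms;
}

static void pclock_pause(PausableClock *c, int64_t host_now_ms)
{
    assert(!c->paused);
    c->paused_at_ms = host_now_ms;
    c->paused = true;
}

/* On resume, fold the paused duration into the offset. */
static void pclock_resume(PausableClock *c, int64_t host_now_ms)
{
    assert(c->paused);
    c->offset_ms += host_now_ms - c->paused_at_ms;
    c->paused = false;
}

/* Restore from a saved value so the clock continues from saved_ms. */
static void pclock_restore(PausableClock *c, int64_t saved_ms,
                           int64_t host_now_ms)
{
    c->offset_ms = host_now_ms - saved_ms;
    c->paused_at_ms = host_now_ms;
    c->paused = false;
}
```

Saving the clock then amounts to storing the value returned by
pclock_get_ms, and restoring it recomputes the offset against the
current host time.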

> > > > intended to be used for the internal QEMU purposes, but stops when VM
> > > > is stopped.
> > >
> > > I again don't understand this. The ip6_icmp timing is not for internal
> > > QEMU purpose, its whole point is how often RAs are sent to the guest.
> > >
> > > slirp's guest part is not a device as directly seen by guest I/O, but
> > > it's a router device as seen through guest packets. Think of it like a
> > > USB device, which is seen by the guest through USB packets.
> >
> > Record/replay implementation creates a line between the guest state and
> > the outer world. Everything crossing this line is saved in the log replayed.
> > In case of network, this line is implemented with the network filter.
> > It takes packets from slirp(or anything) and passes(or not) them to the guest nic.
> > When replaying, the saved packets are injected into the filter directly.
> 
> > Slirp is the part of the outer world,
> 
> In normal uses it is not. It is a virtual world (its DHCP server, tftp
> server, TCP connexions, etc.) that lives along the guest.
> 
> Now, I understand that for record/replay it's simpler to put the line
> after slirp.
> 
> Ideally slirp's state should be split in two: the part connected
> to the real world (data from/to the sockets), and the part connected to
> the virtual world (TCP buffering with the guest). So that when pausing,
> going back, going forward etc. the slirp buffers act accordingly, TCP
> knowing exactly what is supposed to be sent or not (otherwise, TCP
> would for instance be really astonished if the guest happens to insist
> requesting old data that it has already ACKed).
> 
> But that's tricky, and I understand it's simpler to just put the line
> after slirp, and let the replay of frames provide the guest (which for
> instance has been reset to an older time) with the missing data, and TCP
> will nicely cope with duplicate ACKs and spurious re-emissions from the
> guest.
> 
> That being said, there will be problems with TCP connections if you
> pause the guest for a long time: slirp's TCP will timeout and reset the
> connexion. Yes, that happens with tap devices anyway, but slirp acting
> as a buffer seems more useful to me.

Pavel Dovgalyuk
Samuel Thibault Aug. 1, 2018, 7:22 p.m. UTC | #10
Pavel Dovgalyuk, on Tue 31 Jul 2018 09:58:26 +0300, wrote:
> > From: Samuel Thibault [mailto:samuel.thibault@gnu.org]
> > Pavel Dovgalyuk, on Thu 26 Jul 2018 11:37:57 +0300, wrote:
> > > Or the timers are related to the network devices (e.g., servers in the
> > > outer world)?
> > 
> > No.
> > 
> > > > > > > this service is not related to the guest state.
> > > >
> > > > seems incorrect. At the moment the ip6_icmp timer's current value is not
> > > > saved in the guest state, but in principle it should, so that the guest
> > > > does see the RAs at a regular rate. In practice we don't care because
> > > > the timing is randomized anyway.
> > >
> > > Isn't this just a side effect?
> > > I mean that slirp may be replaced by, say, tap, and the guest should not notice
> > > the difference.
> > 
> > Well, if a guest is connected through a tap, the virtual time should
> > really run as fast as the realtime, and it should not be paused.
> > Otherwise TCP connections will break since the guest won't be able to
> > reply fast enough, without even knowing about the issue. Slirp can
> > compensate this thanks to a buffer between what happens in the real
> > world and what happens in the virtual world. Real world timings are
> > handled by the OS socket implementation, and virtual world timings are
> > handled with the qemu timer.
> 
> Then maybe the solution is the new clock with the frequency of the virtual
> clock, but which does not affect the replayed core?
> This clock should stop when VM is paused.
> It also could be saved in vmstate. As it does not affect the replay,
> saving and restoring its state won't break anything.

I guess so.

Samuel

Patch

diff --git a/slirp/ip6_icmp.c b/slirp/ip6_icmp.c
index ee333d0..e25818e 100644
--- a/slirp/ip6_icmp.c
+++ b/slirp/ip6_icmp.c
@@ -17,7 +17,7 @@  static void ra_timer_handler(void *opaque)
 {
     Slirp *slirp = opaque;
     timer_mod(slirp->ra_timer,
-              qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL) + NDP_Interval);
+              qemu_clock_get_ms(QEMU_CLOCK_REALTIME) + NDP_Interval);
     ndp_send_ra(slirp);
 }
 
@@ -27,9 +27,9 @@  void icmp6_init(Slirp *slirp)
         return;
     }
 
-    slirp->ra_timer = timer_new_ms(QEMU_CLOCK_VIRTUAL, ra_timer_handler, slirp);
+    slirp->ra_timer = timer_new_ms(QEMU_CLOCK_REALTIME, ra_timer_handler, slirp);
     timer_mod(slirp->ra_timer,
-              qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL) + NDP_Interval);
+              qemu_clock_get_ms(QEMU_CLOCK_REALTIME) + NDP_Interval);
 }
 
 void icmp6_cleanup(Slirp *slirp)