Message ID | 20231018123643.1255813-1-benjamin@sipsolutions.net |
---|---|
State | Accepted |
Headers | show |
Series | [1/4] um: irqs: process outstanding IRQs when unblocking signals | expand |
Am 18.10.2023 um 14:36 schrieb benjamin@sipsolutions.net: > From: Benjamin Berg <benjamin.berg@intel.com> > > When in time-travel mode, the eventfd events are read even when signals > are blocked as SIGIO still needs to be processed. In this case, the > event is cleared on the eventfd but the IRQ still needs to be fired > later. > > We did already ensure that the SIGIO handler is run again. However, the > FDs are configured to be level triggered, so that eventfd will not > notify again. As such, add some logic to mark the IRQ as pending and > process it at the next opportunity. > > To avoid duplication, reuse the logic used for the suspend/resume case. > This does not really change anything except for delaying running the > IRQs with timetravel_handler at a slightly later point in time (and > possibly running non-timetravel IRQs that shouldn't happen earlier). > While at it, move marking as pending into irq_event_handler as that is > the more logical place for it to happen. > > Signed-off-by: Benjamin Berg <benjamin.berg@intel.com> I also noticed this problem, but after a discussion with Johannes Berg about this, we come to the conclusion, that all drivers with interrupts need to deal with time travel mode via their own time travel handler. What I actually did, was to write a trivial handler for e.g., the serial line, which simply only call time_travel_add_irq_event(ev). I'm not entirely sure, what the right way to do it, although the outcome seems to be the same, as with no time advance the actual simulation time, when the interrupt is handled is the same. I actually also have some other significant bug fixes on time travel in my tree, but I was too lazy to send them here, are you interested in taking a look at them?
On 20/10/2023 10:15, Benjamin Beichler wrote: > Am 18.10.2023 um 14:36 schrieb benjamin@sipsolutions.net: >> From: Benjamin Berg <benjamin.berg@intel.com> >> >> When in time-travel mode, the eventfd events are read even when signals >> are blocked as SIGIO still needs to be processed. In this case, the >> event is cleared on the eventfd but the IRQ still needs to be fired >> later. >> >> We did already ensure that the SIGIO handler is run again. However, the >> FDs are configured to be level triggered, so that eventfd will not >> notify again. As such, add some logic to mark the IRQ as pending and >> process it at the next opportunity. >> >> To avoid duplication, reuse the logic used for the suspend/resume case. >> This does not really change anything except for delaying running the >> IRQs with timetravel_handler at a slightly later point in time (and >> possibly running non-timetravel IRQs that shouldn't happen earlier). >> While at it, move marking as pending into irq_event_handler as that is >> the more logical place for it to happen. >> >> Signed-off-by: Benjamin Berg <benjamin.berg@intel.com> > > I also noticed this problem, but after a discussion with Johannes Berg about this, we come to the conclusion, that all drivers with interrupts need to deal with time travel mode via their own time travel handler. What I actually did, was to write a trivial handler for e.g., the serial line, which simply only call time_travel_add_irq_event(ev). > > I'm not entirely sure, what the right way to do it, although the outcome seems to be the same, as with no time advance the actual simulation time, when the interrupt is handled is the same. Add an extra registration flag for um_register_irq? enum um_irq_type { IRQ_READ, IRQ_WRITE, IRQ_HAVE_TT_HANDLER, NUM_IRQ_TYPES, }; Ones that do not register get the default one which advances the time. Ones that do - have it left to them and do with time whatever they need. > > I actually also have some other significant bug fixes on time travel in my tree, but I was too lazy to send them here, are you interested in taking a look at them? > > _______________________________________________ > linux-um mailing list > linux-um@lists.infradead.org > http://lists.infradead.org/mailman/listinfo/linux-um
On Fri, 2023-10-20 at 11:15 +0200, Benjamin Beichler wrote: > Am 18.10.2023 um 14:36 schrieb benjamin@sipsolutions.net: > > From: Benjamin Berg <benjamin.berg@intel.com> > > > > When in time-travel mode, the eventfd events are read even when signals > > are blocked as SIGIO still needs to be processed. In this case, the > > event is cleared on the eventfd but the IRQ still needs to be fired > > later. > > > > We did already ensure that the SIGIO handler is run again. However, the > > FDs are configured to be level triggered, so that eventfd will not > > notify again. As such, add some logic to mark the IRQ as pending and > > process it at the next opportunity. > > > > To avoid duplication, reuse the logic used for the suspend/resume case. > > This does not really change anything except for delaying running the > > IRQs with timetravel_handler at a slightly later point in time (and > > possibly running non-timetravel IRQs that shouldn't happen earlier). > > While at it, move marking as pending into irq_event_handler as that is > > the more logical place for it to happen. > > > > Signed-off-by: Benjamin Berg <benjamin.berg@intel.com> > > I also noticed this problem, but after a discussion with Johannes Berg > about this, we come to the conclusion, that all drivers with interrupts > need to deal with time travel mode via their own time travel handler. > What I actually did, was to write a trivial handler for e.g., the serial > line, which simply only call time_travel_add_irq_event(ev). Yes, we could instead just assert that we do not have any IRQ without a timetravel handler. We really shouldn't have, but we do have one for stdin unfortunately (which is at least used by the hostap hwsim tests, so we might want to not remove it until we have a different solution). What we could do is just add an stdin timetravel handler as you have done, but print a warning when the stdin IRQ is registered[1]. And then, if someone tries to register an IRQ without a time-travel handler we just panic(), it should never happen. Benjamin [1] We'll also need to set stdin to be a "null" channel and then somehow avoid registering the IRQ in that case. > I'm not entirely sure, what the right way to do it, although the outcome > seems to be the same, as with no time advance the actual simulation > time, when the interrupt is handled is the same. > > I actually also have some other significant bug fixes on time travel in > my tree, but I was too lazy to send them here, are you interested in > taking a look at them? > _______________________________________________ > linux-um mailing list > linux-um@lists.infradead.org > http://lists.infradead.org/mailman/listinfo/linux-um
Am 20.10.2023 um 11:26 schrieb Anton Ivanov: >> I also noticed this problem, but after a discussion with Johannes Berg >> about this, we come to the conclusion, that all drivers with >> interrupts need to deal with time travel mode via their own time >> travel handler. What I actually did, was to write a trivial handler >> for e.g., the serial line, which simply only call >> time_travel_add_irq_event(ev). >> >> I'm not entirely sure, what the right way to do it, although the >> outcome seems to be the same, as with no time advance the actual >> simulation time, when the interrupt is handled is the same. > > Add an extra registration flag for um_register_irq? > > > enum um_irq_type { > IRQ_READ, > IRQ_WRITE, > IRQ_HAVE_TT_HANDLER, > NUM_IRQ_TYPES, > }; > > Ones that do not register get the default one which advances the time. > Ones that do - have it left to them and do with time whatever they need. > There is already mechanics for it, as we have um_request_irq_tt and um_request_irq. I don't think an additional type enhance the semantics. >> >> I actually also have some other significant bug fixes on time travel >> in my tree, but I was too lazy to send them here, are you interested >> in taking a look at them? >>
Am 20.10.2023 um 11:59 schrieb Benjamin Berg: >> >> I also noticed this problem, but after a discussion with Johannes Berg >> about this, we come to the conclusion, that all drivers with interrupts >> need to deal with time travel mode via their own time travel handler. >> What I actually did, was to write a trivial handler for e.g., the serial >> line, which simply only call time_travel_add_irq_event(ev). > > Yes, we could instead just assert that we do not have any IRQ without a > timetravel handler. We really shouldn't have, but we do have one for > stdin unfortunately (which is at least used by the hostap hwsim tests, > so we might want to not remove it until we have a different solution). > > What we could do is just add an stdin timetravel handler as you have > done, but print a warning when the stdin IRQ is registered[1]. And > then, if someone tries to register an IRQ without a time-travel handler > we just panic(), it should never happen. Can you explain, why a time travel handler for stdin may be bad? It sounds like you want to avoid it, but I see no immediate problem. > > Benjamin > > [1] We'll also need to set stdin to be a "null" channel and then > somehow avoid registering the IRQ in that case. >
On Fri, 2023-10-20 at 12:38 +0200, Benjamin Beichler wrote: > > Can you explain, why a time travel handler for stdin may be bad? It > sounds like you want to avoid it, but I see no immediate problem. > I need to read the thread, but this one's easy ;-) The thing is that on such a channel you don't send an ACK when you've seen that there's something to handle. As a result, the sender will continue running while you're trying to request a new schedule entry from the controller. As a result, it may run past your new schedule entry because it didn't know about it yet (this would likely bring down the controller and crash the simulation), or the relative order of the two entries is undefined, in the sense that it depends on the process scheduling of the host. johannes
Am 20.10.2023 um 13:39 schrieb Johannes Berg: > On Fri, 2023-10-20 at 12:38 +0200, Benjamin Beichler wrote: >> >> Can you explain, why a time travel handler for stdin may be bad? It >> sounds like you want to avoid it, but I see no immediate problem. >> > > I need to read the thread, but this one's easy ;-) > > The thing is that on such a channel you don't send an ACK when you've > seen that there's something to handle. As a result, the sender will > continue running while you're trying to request a new schedule entry > from the controller. As a result, it may run past your new schedule > entry because it didn't know about it yet (this would likely bring down > the controller and crash the simulation), or the relative order of the > two entries is undefined, in the sense that it depends on the process > scheduling of the host. Sorry, but I did not get this. What may run past the schedule entry? Is your assumption, that the "thing" connected to stdin is always totally unaware of the time travel mode? When our (to be published) simulation send something on serial lines (I think it does not matter whether it is a socket or a pipe), we expect that the uml instance needs to be run as long as it changes back to idle/wait state before the simulation time is advanced. Since the basis model of the time travel mode is, that you have infinite amount of processing power, the interrupt needs to be always handled at the current time. Maybe my think-model only holds for "smaller" amounts of data (maybe one page or something?) or not the free-running mode, but I'm not completely convinced. :-D Maybe we need to define, a bit more formally, how the (designed) processing model of interrupts in time travel mode is. > > johannes > > _______________________________________________ > linux-um mailing list > linux-um@lists.infradead.org > http://lists.infradead.org/mailman/listinfo/linux-um
On Fri, 2023-10-20 at 14:06 +0200, Benjamin Beichler wrote: > Am 20.10.2023 um 13:39 schrieb Johannes Berg: > > On Fri, 2023-10-20 at 12:38 +0200, Benjamin Beichler wrote: > > > > > > Can you explain, why a time travel handler for stdin may be bad? It > > > sounds like you want to avoid it, but I see no immediate problem. > > > > > > > I need to read the thread, but this one's easy ;-) > > > > The thing is that on such a channel you don't send an ACK when you've > > seen that there's something to handle. As a result, the sender will > > continue running while you're trying to request a new schedule entry > > from the controller. As a result, it may run past your new schedule > > entry because it didn't know about it yet (this would likely bring down > > the controller and crash the simulation), or the relative order of the > > two entries is undefined, in the sense that it depends on the process > > scheduling of the host. > Sorry, but I did not get this. What may run past the schedule entry? Is > your assumption, that the "thing" connected to stdin is always totally > unaware of the time travel mode? No, I'm not assuming that, if that's done all bets are off anyway. I'm assuming that both are connected to the controller. > When our (to be published) simulation send something on serial lines (I > think it does not matter whether it is a socket or a pipe), we expect > that the uml instance needs to be run as long as it changes back to > idle/wait state before the simulation time is advanced. Since the basis > model of the time travel mode is, that you have infinite amount of > processing power, the interrupt needs to be always handled at the > current time. Yes but you need to schedule for the interrupt, and you don't necessarily know what 'current time' is at interrupt time. So let's say you have "something" that's scheduled to run at times - 1000 - 2000 - 3000 and free-until is 10000 or something high. Now it sends a message to Linux stdio at 1000. But it doesn't have a way to wait for ack. So it continues, checks the schedule and free-until, and can advance time to 2000 since it doesn't yet know Linux requested time to run at 1000. This gives you something not predictable - it depends on the host scheduling, whichever ran first (Linux started next or the other continued, or both and it gets messed up). > Maybe my think-model only holds for "smaller" amounts of data (maybe one > page or something?) or not the free-running mode, but I'm not completely > convinced. :-D Nah it doesn't really matter how much data there is. > Maybe we need to define, a bit more formally, how the (designed) > processing model of interrupts in time travel mode is. That's probably a separate Master's thesis ;-) johannes
On Fri, 2023-10-20 at 14:20 +0200, Johannes Berg wrote: > > Yes but you need to schedule for the interrupt, and you don't > necessarily know what 'current time' is at interrupt time. > > So let's say you have "something" that's scheduled to run at times > - 1000 > - 2000 > - 3000 > > and free-until is 10000 or something high. > It can also happen without free-until, then it just depends which one of the two - they're running in parallel now (linux doing time-travel interrupt handling and adding the event, the other thing continuing to schedule and doing the next entry) - asks the controller first. johannes
>> Yes but you need to schedule for the interrupt, and you don't >> necessarily know what 'current time' is at interrupt time. >> >> So let's say you have "something" that's scheduled to run at times >> - 1000 >> - 2000 >> - 3000 >> >> and free-until is 10000 or something high. >> > It can also happen without free-until, then it just depends which one of > the two - they're running in parallel now (linux doing time-travel > interrupt handling and adding the event, the other thing continuing to > schedule and doing the next entry) - asks the controller first. Since they happen at the same time, most discrete event simulations make the assumption, that the actual order of execution should not be part of the semantics. Although, most of the implementations demand a deterministic order (for repetition) but even that requirement is sometimes lifted. So this is no contradiction to common discrete event simulations.
On Fri, 2023-10-20 at 14:43 +0200, Benjamin Beichler wrote: > > > Yes but you need to schedule for the interrupt, and you don't > > > necessarily know what 'current time' is at interrupt time. > > > > > > So let's say you have "something" that's scheduled to run at times > > > - 1000 > > > - 2000 > > > - 3000 > > > > > > and free-until is 10000 or something high. > > > > > It can also happen without free-until, then it just depends which one of > > the two - they're running in parallel now (linux doing time-travel > > interrupt handling and adding the event, the other thing continuing to > > schedule and doing the next entry) - asks the controller first. > > Since they happen at the same time, most discrete event simulations make > the assumption, that the actual order of execution should not be part of > the semantics. > Well, fair, but they can bounce more messages around them, and that should change things. I don't think we've built a simulation system here that can strictly assume this is true. In any case, it can get worse - if he sender process actually updated the controller to 2000 by the time the linux actually calls time_travel_add_irq_event(), the linux event will run at 2000 instead of 1000. All this is why we have the vhost-user extensions for ACK messages etc. johannes
> On Fri, 2023-10-20 at 14:06 +0200, Benjamin Beichler wrote: >> Am 20.10.2023 um 13:39 schrieb Johannes Berg: >>> On Fri, 2023-10-20 at 12:38 +0200, Benjamin Beichler wrote: >>>> Can you explain, why a time travel handler for stdin may be bad? It >>>> sounds like you want to avoid it, but I see no immediate problem. >>>> >>> I need to read the thread, but this one's easy ;-) >>> >>> The thing is that on such a channel you don't send an ACK when you've >>> seen that there's something to handle. As a result, the sender will >>> continue running while you're trying to request a new schedule entry >>> from the controller. As a result, it may run past your new schedule >>> entry because it didn't know about it yet (this would likely bring down >>> the controller and crash the simulation), or the relative order of the >>> two entries is undefined, in the sense that it depends on the process >>> scheduling of the host. >> Sorry, but I did not get this. What may run past the schedule entry? Is >> your assumption, that the "thing" connected to stdin is always totally >> unaware of the time travel mode? > No, I'm not assuming that, if that's done all bets are off anyway. I'm > assuming that both are connected to the controller. > >> When our (to be published) simulation send something on serial lines (I >> think it does not matter whether it is a socket or a pipe), we expect >> that the uml instance needs to be run as long as it changes back to >> idle/wait state before the simulation time is advanced. Since the basis >> model of the time travel mode is, that you have infinite amount of >> processing power, the interrupt needs to be always handled at the >> current time. > Yes but you need to schedule for the interrupt, and you don't > necessarily know what 'current time' is at interrupt time. > > So let's say you have "something" that's scheduled to run at times > - 1000 > - 2000 > - 3000 > > and free-until is 10000 or something high. > > Now it sends a message to Linux stdio at 1000. > > But it doesn't have a way to wait for ack. So it continues, checks the > schedule and free-until, and can advance time to 2000 since it doesn't > yet know Linux requested time to run at 1000. Do I get it right, that you anticipate that all parts of the simulation may run at different paces and the controller is only needed for synchronization? My mind model is, that simulation time advances in the controller are only done, If all parts of the simulation reached the current simulation time and are at the event processed/idle/wait state. Therefore, the thing connected to stdio will not advance to some other time, if some other component has outstanding events (i.e., an interrupt at this case). Of course free-running is a problematic in this mind model, but in this mode you willingly sacrifice the precision of events for performance. > > This gives you something not predictable - it depends on the host > scheduling, whichever ran first (Linux started next or the other > continued, or both and it gets messed up). > >> Maybe my think-model only holds for "smaller" amounts of data (maybe one >> page or something?) or not the free-running mode, but I'm not completely >> convinced. :-D > Nah it doesn't really matter how much data there is. > >> Maybe we need to define, a bit more formally, how the (designed) >> processing model of interrupts in time travel mode is. > That's probably a separate Master's thesis ;-) It could, but some basic definition for a common understanding might help. However, do you like to have a look at my current state of time travel changes in github or do you think an RFC to the mailing list is better? I like github for a quick dive into patches, but your whole workflow is maybe different ;-)
On Fri, 2023-10-20 at 14:58 +0200, Benjamin Beichler wrote: > > On Fri, 2023-10-20 at 14:06 +0200, Benjamin Beichler wrote: > > > Am 20.10.2023 um 13:39 schrieb Johannes Berg: > > > > On Fri, 2023-10-20 at 12:38 +0200, Benjamin Beichler wrote: > > > > > Can you explain, why a time travel handler for stdin may be bad? It > > > > > sounds like you want to avoid it, but I see no immediate problem. > > > > > > > > > I need to read the thread, but this one's easy ;-) > > > > > > > > The thing is that on such a channel you don't send an ACK when you've > > > > seen that there's something to handle. As a result, the sender will > > > > continue running while you're trying to request a new schedule entry > > > > from the controller. As a result, it may run past your new schedule > > > > entry because it didn't know about it yet (this would likely bring down > > > > the controller and crash the simulation), or the relative order of the > > > > two entries is undefined, in the sense that it depends on the process > > > > scheduling of the host. > > > Sorry, but I did not get this. What may run past the schedule entry? Is > > > your assumption, that the "thing" connected to stdin is always totally > > > unaware of the time travel mode? > > No, I'm not assuming that, if that's done all bets are off anyway. I'm > > assuming that both are connected to the controller. > > > > > When our (to be published) simulation send something on serial lines (I > > > think it does not matter whether it is a socket or a pipe), we expect > > > that the uml instance needs to be run as long as it changes back to > > > idle/wait state before the simulation time is advanced. Since the basis > > > model of the time travel mode is, that you have infinite amount of > > > processing power, the interrupt needs to be always handled at the > > > current time. > > Yes but you need to schedule for the interrupt, and you don't > > necessarily know what 'current time' is at interrupt time. > > > > So let's say you have "something" that's scheduled to run at times > > - 1000 > > - 2000 > > - 3000 > > > > and free-until is 10000 or something high. > > > > Now it sends a message to Linux stdio at 1000. > > > > But it doesn't have a way to wait for ack. So it continues, checks the > > schedule and free-until, and can advance time to 2000 since it doesn't > > yet know Linux requested time to run at 1000. > > Do I get it right, that you anticipate that all parts of the simulation > may run at different paces and the controller is only needed for > synchronization? Not sure what you mean by that. What I _wanted_ is that only one can run at a time (from a host perspective), so e.g. virtio stuff works like A is running A -> B virtio: A puts data on ring virtio ring A sends message on vhost-user socket to indicate ring update to B A waits for ACK for that message B sees the ring update message B requests time from controller B sends ACK to A A continues running A gets to the next scheduling point but now sees free-until==now A releases time slice to controller controller assigns time slice to B B runs to process the virtio ring data So you see that only one is running at a time. Without the ACK, you'd have B sees the ring update message B requests time from the controller in parallel with A continues running A gets to the next scheduling point but it might either not see free-until==now, or other things. > My mind model is, that simulation time advances in the controller are > only done, If all parts of the simulation reached the current simulation > time and are at the event processed/idle/wait state. Yeah that's a simpler model, and in what we implemented it's true if you do have the ACK messages. If don't have free-until, that's also possibly true, although we might have implementation bugs in a sense with time_travel_add_irq_event() since that looks at the current time. And also, if you request something the controller thinks is already in the past because the other side updated the controller due to free-until, the simulation just crashes. It's also possible that B (in the situation above) actually *does* ask the controller who should run next, and the controller simply doesn't yet know: B sends message to A's stdin (say at 1000) B requests to run at 2000 from controller B releases time to controller controller sets time to 2000 and tells B to run A requests scheduling at 1000 controller *crashes* This is actually quite likely if the host is running 6.6 with the new EEVDF scheduler, because it will prefer to let B continue running, and then it just needs to pick the controller over A next. This again is because of the lack of synchronization in editing the schedule (sending messages to the controller) which the ACK message solves. Hence our internal use cases for "text protocol" actually use virtio- console rather than stdio, so you do have ACK on the vhost-user protocol again. > Therefore, the > thing connected to stdio will not advance to some other time, if some > other component has outstanding events (i.e., an interrupt at this > case). Yes, but by the time it actually asks "are there outstanding events", they might no be added yet, due to hos scheduling order! > Of course free-running is a problematic in this mind model, but > in this mode you willingly sacrifice the precision of events for > performance. If you're referring to free-until: that isn't free-running, it's just meant to reduce the communication overhead if you know nobody else is running. The controller maintains this. Btw we have further overhead reductions with shared memory, need to push all that out. > > > Maybe we need to define, a bit more formally, how the (designed) > > > processing model of interrupts in time travel mode is. > > That's probably a separate Master's thesis ;-) > > It could, but some basic definition for a common understanding might help. :) Does the above help a bit? But yeah maybe I should try to write it down more formally. > However, do you like to have a look at my current state of time travel > changes in github or do you think an RFC to the mailing list is better? > I like github for a quick dive into patches, but your whole workflow is > maybe different ;-) > RFC to the list is probably easier to comment on etc. johannes
>> My mind model is, that simulation time advances in the controller are >> only done, If all parts of the simulation reached the current >> simulation time and are at the event processed/idle/wait state. > Yeah that's a simpler model, and in what we implemented it's true if > you do have the ACK messages. If don't have free-until, that's also > possibly true, although we might have implementation bugs in a sense > with time_travel_add_irq_event() since that looks at the current time. > And also, if you request something the controller thinks is already in > the past because the other side updated the controller due to > free-until, the simulation just crashes. It's also possible that B (in > the situation above) actually *does* ask the controller who should run > next, and the controller simply doesn't yet know: B sends message to > A's stdin (say at 1000) B requests to run at 2000 from controller B > releases time to controller controller sets time to 2000 and tells B > to run A requests scheduling at 1000 controller *crashes* I think, I get now, why I don't have this Problem: My virtio simulation and my controller/simulation-master are tightly coupled and indeed the same process. When I create a new event for the UML-instance, it is anticipated in the simulation master. So my sequence is more like: B sends message to A's stdin (say at 1000) B tells the controller, that it activated A, and time should not advance until A has sent a request for processing an interrupt and has gone back to waiting state B requests to run at 2000 from controller B releases time to controller ... I see that without further information from the device simulation to the controller, it is quite harder. Nonetheless, I'm not totally sure, how this interacts with the timing semantics of the interrupts here. Either the solution of Benjamin Berg or mine (with the trivial tt-handler have the same problem). I will post my patches the next days, you may have a look on them, maybe you also experienced some of my problems :-)
On Fri, 2023-10-20 at 17:47 +0200, Benjamin Beichler wrote: > > I think, I get now, why I don't have this Problem: My virtio simulation > and my controller/simulation-master are tightly coupled and indeed the > same process. When I create a new event for the UML-instance, it is > anticipated in the simulation master. > > So my sequence is more like: > B sends message to A's stdin (say at 1000) > B tells the controller, that it activated A, and time should not advance > until A has sent a request for processing an interrupt and has gone back > to waiting state > B requests to run at 2000 from controller > B releases time to controller > ... Ahh, OK. I thought you were probably using our controller from the usfstl, but of course you don't have to. I think I did push the shared memory optimisation there though, but I suspect I haven't posted the Linux client version. > I see that without further information from the device simulation to the > controller, it is quite harder. Right, I wanted to handle that in the other side (hence the ACK) since you might not always know _when_ the interrupt is scheduled there, even if it's currently always "immediately". > Nonetheless, I'm not totally sure, how this interacts with the timing > semantics of the interrupts here. Either the solution of Benjamin Berg > or mine (with the trivial tt-handler have the same problem). Oh, yeah, his changes do have the same problem. He's just fixing that interrupts got lost completely in some already _otherwise_ broken cases, to make it work better without hanging, not to make it work correctly. To make it work correctly you shouldn't use stdio at all, or I guess have some external helper thing like you do :) johannes
diff --git a/arch/um/kernel/irq.c b/arch/um/kernel/irq.c index 635d44606bfe..ceda4bd2e5ed 100644 --- a/arch/um/kernel/irq.c +++ b/arch/um/kernel/irq.c @@ -37,7 +37,7 @@ struct irq_reg { bool pending; bool wakeup; #ifdef CONFIG_UML_TIME_TRAVEL_SUPPORT - bool pending_on_resume; + bool pending_event; void (*timetravel_handler)(int, int, void *, struct time_travel_event *); struct time_travel_event event; @@ -56,6 +56,9 @@ static DEFINE_SPINLOCK(irq_lock); static LIST_HEAD(active_fds); static DECLARE_BITMAP(irqs_allocated, UM_LAST_SIGNAL_IRQ); static bool irqs_suspended; +#ifdef CONFIG_UML_TIME_TRAVEL_SUPPORT +static bool irqs_pending; +#endif static void irq_io_loop(struct irq_reg *irq, struct uml_pt_regs *regs) { @@ -84,9 +87,12 @@ static void irq_event_handler(struct time_travel_event *ev) { struct irq_reg *reg = container_of(ev, struct irq_reg, event); - /* do nothing if suspended - just to cause a wakeup */ - if (irqs_suspended) + /* do nothing if suspended; just cause a wakeup and mark as pending */ + if (irqs_suspended) { + irqs_pending = true; + reg->pending_event = true; return; + } generic_handle_irq(reg->irq); } @@ -110,16 +116,47 @@ static bool irq_do_timetravel_handler(struct irq_entry *entry, if (!reg->event.pending) return false; - if (irqs_suspended) - reg->pending_on_resume = true; return true; } + +static void irq_do_pending_events(bool timetravel_handlers_only) +{ + struct irq_entry *entry; + + if (!irqs_pending || timetravel_handlers_only) + return; + + irqs_pending = false; + + list_for_each_entry(entry, &active_fds, list) { + enum um_irq_type t; + + for (t = 0; t < NUM_IRQ_TYPES; t++) { + struct irq_reg *reg = &entry->reg[t]; + + /* + * Any timetravel_handler was invoked already, just + * directly run the IRQ. + */ + if (reg->pending_event) { + irq_enter(); + generic_handle_irq(reg->irq); + irq_exit(); + reg->pending_event = false; + } + } + } +} #else static bool irq_do_timetravel_handler(struct irq_entry *entry, enum um_irq_type t) { return false; } + +static void irq_do_pending_events(bool timetravel_handlers_only) +{ +} #endif static void sigio_reg_handler(int idx, struct irq_entry *entry, enum um_irq_type t, @@ -145,6 +182,8 @@ static void sigio_reg_handler(int idx, struct irq_entry *entry, enum um_irq_type */ if (timetravel_handlers_only) { #ifdef CONFIG_UML_TIME_TRAVEL_SUPPORT + reg->pending_event = true; + irqs_pending = true; mark_sigio_pending(); #endif return; @@ -162,6 +201,10 @@ static void _sigio_handler(struct uml_pt_regs *regs, if (timetravel_handlers_only && !um_irq_timetravel_handler_used()) return; + /* Flush out pending events that were ignored due to time-travel. */ + if (!irqs_suspended) + irq_do_pending_events(timetravel_handlers_only); + while (1) { /* This is now lockless - epoll keeps back-referencesto the irqs * which have trigger it so there is no need to walk the irq @@ -543,30 +586,7 @@ void um_irqs_resume(void) unsigned long flags; - local_irq_save(flags); -#ifdef CONFIG_UML_TIME_TRAVEL_SUPPORT - /* - * We don't need to lock anything here since we're in resume - * and nothing else is running, but have disabled IRQs so we - * don't try anything else with the interrupt list from there. - */ - list_for_each_entry(entry, &active_fds, list) { - enum um_irq_type t; - - for (t = 0; t < NUM_IRQ_TYPES; t++) { - struct irq_reg *reg = &entry->reg[t]; - - if (reg->pending_on_resume) { - irq_enter(); - generic_handle_irq(reg->irq); - irq_exit(); - reg->pending_on_resume = false; - } - } - } -#endif - - spin_lock(&irq_lock); + spin_lock_irqsave(&irq_lock, flags); list_for_each_entry(entry, &active_fds, list) { if (entry->suspended) { int err = os_set_fd_async(entry->fd);