Message ID | 20231103-bb-timetravel-patches-v1-10-e2c68efcf664@uni-rostock.de |
---|---|
State | Changes Requested |
Headers | show |
Series | Several Time Travel Mode Enhancements | expand |
On Fri, 2023-11-03 at 16:41 +0000, Benjamin Beichler wrote: > This slows down external TT-mode as more simulation roundtrips are > required, and it unnecessarily affects the determinism and accuracy of > the simulation. I still don't think this is really true, it doesn't really affect determinism? It makes it ... different, sure, but not non-deterministic? > +static const int suspicious_busy_loop_syscalls[] = { > + 36, //sys_getitimer > + 96, //sys_gettimeofday > + 201, //sys_time > + 224, //sys_timer_gettime > + 228, //sys_clock_gettime > + 287, //sys_timerfd_gettime > +}; That's kind of awful. Surely we can use __NR_timer_gettime etc. here at least? > +static bool suspicious_busy_loop(void) > +{ > + int i; > + int syscall = syscall_get_nr(current, task_pt_regs(current)); > + > + for (i = 0; i < ARRAY_SIZE(suspicious_busy_loop_syscalls); i++) { > + if (suspicious_busy_loop_syscalls[i] == syscall) > + return true; > + } Might also be faster to have a bitmap? But ... also kind of awkward I guess. I dunno. I'm not even sure what you're trying to achieve - apart from "determinism" which seems odd or even wrong, and speed, which is probably easier done with a better free-until and the shared memory calendar we have been working on. johannes
Am 06.11.2023 um 21:51 schrieb Johannes Berg: > On Fri, 2023-11-03 at 16:41 +0000, Benjamin Beichler wrote: >> This slows down external TT-mode as more simulation roundtrips are >> required, and it unnecessarily affects the determinism and accuracy of >> the simulation. > I still don't think this is really true, it doesn't really affect > determinism? It makes it ... different, sure, but not non-deterministic? I intentionally kept it vague, but what I meant is that it's unnecessarily challenging to determine. Perhaps I should mention that I'm running an unmodified Ubuntu rootfs with systemd, which starts many daemons and other processes. To me, it seems illogical to delay everything just because one process is waiting for a timestamp. At the moment, we haven't patched the random device that fetches random bytes from the host (do you already have a patch for this?), so complete repeatability isn't guaranteed at the moment. However, that could be a logical next step. >> +static const int suspicious_busy_loop_syscalls[] = { >> + 36, //sys_getitimer >> + 96, //sys_gettimeofday >> + 201, //sys_time >> + 224, //sys_timer_gettime >> + 228, //sys_clock_gettime >> + 287, //sys_timerfd_gettime >> +}; > That's kind of awful. Surely we can use __NR_timer_gettime etc. here at > least? Actually, this was a quick attempt to address the issue, and during that period, I couldn't locate the appropriate macros. These numbers are generated from arch/x86/entry/syscalls/syscall_64.tbl (or 32 if configured in that manner). I might be overlooking something, but it seems that __NR_timer_gettime isn't defined in the kernel. If you have a better reference for this translation, I'd appreciate it. I could verify if the current syscall translates into the corresponding function symbol in the um-syscall table. >> +static bool suspicious_busy_loop(void) >> +{ >> + int i; >> + int syscall = syscall_get_nr(current, task_pt_regs(current)); >> + >> + for (i = 0; i < ARRAY_SIZE(suspicious_busy_loop_syscalls); i++) { >> + if (suspicious_busy_loop_syscalls[i] == syscall) >> + return true; >> + } > Might also be faster to have a bitmap? But ... also kind of awkward I > guess. Actually, a short fixed size array should be optimized quite well with loop unrolling or other stuff, isn't it? I could also do a switch with all calls, but this loop seems for me the easiest. > I dunno. I'm not even sure what you're trying to achieve - apart from > "determinism" which seems odd or even wrong, and speed, which is > probably easier done with a better free-until and the shared memory > calendar we have been working on. In my perspective, delaying get_timer only serves as a tie-breaker for poorly behaving software that resorts to a busy-loop while waiting for time to advance. While this behavior might not be uncommon, why penalize all processes for it? Consider an experiment where I aim to measure the impact of network latency on software. Sometimes, the response latency fluctuates because a background task was scheduled randomly but deterministically at the same time, obtaining a timestamp that blocks all other processes and advances simulation time. This action simply undermines the utility of the time travel mode unnecessarily. However, software actively waiting for time advancement in a busy loop achieves its goal. It’s almost a win-win situation, isn't it? Benjamin
On Fri, 2023-11-10 at 16:54 +0100, Benjamin Beichler wrote: > At the moment, we haven't patched the random device that fetches random > bytes from the host (do you already have a patch for this?), > so complete repeatability isn't guaranteed at the moment. However, that > could be a logical next step. Right, we have the attached kernel patches internally. This simply disables some of the random sources and replaces os_getrandom with returning static random from the UML_RANDOM environment variable. I doubt that it makes sense to upstream these patches, but may we can include them as patch files in USFSTL or so. The second piece is using a mount namespace to ensure that the linux command line is identical between runs and that the location of all files that are accessed directly from the host through hostfs never changes. The last piece was setting GLIBC_TUNABLES=-AVX512CD in the environment just in case the CPU feature set is slightly different. That would cause ld.so to search for a different set of optimized library versions (affecting syscalls and with that randomness). Benjamin
On Fri, 2023-11-10 at 16:54 +0100, Benjamin Beichler wrote: > Am 06.11.2023 um 21:51 schrieb Johannes Berg: > > On Fri, 2023-11-03 at 16:41 +0000, Benjamin Beichler wrote: > > > This slows down external TT-mode as more simulation roundtrips are > > > required, and it unnecessarily affects the determinism and accuracy of > > > the simulation. > > I still don't think this is really true, it doesn't really affect > > determinism? It makes it ... different, sure, but not non-deterministic? > I intentionally kept it vague, but what I meant is that it's > unnecessarily challenging to determine. Yeah, ok, fair enough. > Perhaps I should mention that I'm running an unmodified Ubuntu rootfs > with systemd, which starts many daemons and other processes. That sounds like a bit of a nightmare, to be honest, wouldn't you want to keep things under tighter control? But I guess it really depends on what you're trying to achieve. > To me, it seems illogical to delay everything just because one process > is waiting for a timestamp. Yeah I guess we'll just have to disagree ;-) You're running some process, so you've kind of decided to "give it time" of sorts, and in a normal system reading time will always take time, just like everything else :) But anyway, I'm not really opposed to this patch, it's just ... not great, I guess? And like I said, makes more sense to squash 9 and 10? > At the moment, we haven't patched the random device that fetches random > bytes from the host (do you already have a patch for this?), > so complete repeatability isn't guaranteed at the moment. However, that > could be a logical next step. > > > +static const int suspicious_busy_loop_syscalls[] = { > > > + 36, //sys_getitimer > > > + 96, //sys_gettimeofday > > > + 201, //sys_time > > > + 224, //sys_timer_gettime > > > + 228, //sys_clock_gettime > > > + 287, //sys_timerfd_gettime > > > +}; > > That's kind of awful. Surely we can use __NR_timer_gettime etc. here at > > least? > Actually, this was a quick attempt to address the issue, and during that > period, I couldn't locate the appropriate macros. > > These numbers are generated from arch/x86/entry/syscalls/syscall_64.tbl > (or 32 if configured in that manner). > > I might be overlooking something, but it seems that __NR_timer_gettime > isn't defined in the kernel. If you have a better reference for this > translation, I'd appreciate it. Look at the arch/x86/include/generated/uapi/asm/unistd*.h files after you build the tree. How do they actually get generated? Beats me. > > > +static bool suspicious_busy_loop(void) > > > +{ > > > + int i; > > > + int syscall = syscall_get_nr(current, task_pt_regs(current)); > > > + > > > + for (i = 0; i < ARRAY_SIZE(suspicious_busy_loop_syscalls); i++) { > > > + if (suspicious_busy_loop_syscalls[i] == syscall) > > > + return true; > > > + } > > Might also be faster to have a bitmap? But ... also kind of awkward I > > guess. > Actually, a short fixed size array should be optimized quite well with > loop unrolling or other stuff, isn't it? I could also do a switch with > all calls, but this loop seems for me the easiest. Yeah, maybe. I haven't checked the output. > > I dunno. I'm not even sure what you're trying to achieve - apart from > > "determinism" which seems odd or even wrong, and speed, which is > > probably easier done with a better free-until and the shared memory > > calendar we have been working on. > In my perspective, delaying get_timer only serves as a tie-breaker for > poorly behaving software that resorts to a busy-loop while waiting for > time to advance. Yeah I guess that's true. > While this behavior might not be uncommon, why penalize all processes > for it? Well I think we have a different sense of "penalize", I wouldn't say that. I mean, you can't reasonably expect getting a timestamp doesn't take any time at all, that's just not how physical reality works? Now we're bending the rules here in that a lot of things that normally take time suddenly don't, but I guess I don't fully understand why you're so keen on bending the rules _all the way_. But I think that's this: > Consider an experiment where I aim to measure the impact of network > latency on software. Sometimes, the response latency fluctuates > because a background task was scheduled randomly but deterministically "randomly but deterministically" is kind of fun 😂️ > at the same time, obtaining a timestamp that blocks all other > processes and advances simulation time. This action simply undermines > the utility of the time travel mode unnecessarily. > > However, software actively waiting for time advancement in a busy loop > achieves its goal. It’s almost a win-win situation, isn't it? Fair enough. johannes
diff --git a/arch/um/kernel/time.c b/arch/um/kernel/time.c index f76237cfc1ea..1782f1ed140e 100644 --- a/arch/um/kernel/time.c +++ b/arch/um/kernel/time.c @@ -22,6 +22,7 @@ #include <linux/time-internal.h> #include <linux/um_timetravel.h> #include <shared/init.h> +#include <asm/syscall-generic.h> #ifdef CONFIG_UML_TIME_TRAVEL_SUPPORT enum time_travel_mode time_travel_mode; @@ -721,6 +722,27 @@ static irqreturn_t um_timer(int irq, void *dev) return IRQ_HANDLED; } +static const int suspicious_busy_loop_syscalls[] = { + 36, //sys_getitimer + 96, //sys_gettimeofday + 201, //sys_time + 224, //sys_timer_gettime + 228, //sys_clock_gettime + 287, //sys_timerfd_gettime +}; + +static bool suspicious_busy_loop(void) +{ + int i; + int syscall = syscall_get_nr(current, task_pt_regs(current)); + + for (i = 0; i < ARRAY_SIZE(suspicious_busy_loop_syscalls); i++) { + if (suspicious_busy_loop_syscalls[i] == syscall) + return true; + } + return false; +}; + static u64 timer_read(struct clocksource *cs) { static unsigned long long last_timer_read; @@ -742,11 +764,12 @@ static u64 timer_read(struct clocksource *cs) if (last_timer_read != time_travel_time) { last_timer_read = time_travel_time; consecutive_calls_at_same_time = 0; - } else { + } else if (suspicious_busy_loop()) { consecutive_calls_at_same_time++; } if (!irqs_disabled() && !in_interrupt() && !in_softirq() && - !time_travel_ext_waiting && consecutive_calls_at_same_time > 10) + !time_travel_ext_waiting && consecutive_calls_at_same_time > 10 && + suspicious_busy_loop()) time_travel_update_time(time_travel_time + TIMER_MULTIPLIER, false);
Some userspace programs use the current timestamp as a looping condition (directly or indirectly), which can lead to infinite loops in TT-mode since the timestamp doesn't change until the next time step. Commit 065038706f77 ("um: Support time travel mode") introduced a workaround by always inserting a time step when the current timestamp is read. However, this introduces delays, even in cases where such a loop doesn't exist, for example, in filesystem code that updates access timestamps. This slows down external TT-mode as more simulation roundtrips are required, and it unnecessarily affects the determinism and accuracy of the simulation. This patch attempts to detect potential busy loops by identifying the currently executed syscall that would return a timestamp. If 10 consecutive calls at the same timestamp occur, the timer_read is delayed. Since the patch does not consider the PID of the caller, it might be fooled by legitimate processes that take a single timestamp. However, the result is no worse than the original behavior. Signed-off-by: Benjamin Beichler <benjamin.beichler@uni-rostock.de> --- arch/um/kernel/time.c | 27 +++++++++++++++++++++++++-- 1 file changed, 25 insertions(+), 2 deletions(-)