Message ID | 20230808042001.411094-1-npiggin@gmail.com |
---|---|
Series | ppc: record-replay enablement and fixes |
Hello,

On 8/8/23 06:19, Nicholas Piggin wrote:
> The patches in this series have been seen a few times in various
> iterations.
>
> There are two main pieces, some assorted small fixes and tests for
> record-replay, plus a large set of decrementer fixes. I merged these
> into one series rather than send decrementer fixes alone first, because
> record-replay has been very good at uncovering timer problems, so it's
> good to have those test cases in at the same time IMO.
>
> Some of the fixes we might take to stable, but unclear which.
> Decrementer fixes were a bit of a tangle so maybe we just leave those
> alone since they work okay.
>
> The decrementer is not emulated perfectly still. Underflow from -ve
> to +ve is not implemented, for one. I started doing that but it's not
> trivial so better stop here for now.
>
> For record-replay, pseries is now quite solid with rr. Surely some
> issues to iron out but it is becoming usable.
>
> powernv record-replay has some known problems migrating edge-triggered
> decrementer, and edge-triggered msgsnd. Also it seems to get stuck in
> xive init somewhere when replaying from checkpoint, so there is probably
> some state in xive not being reset. But at least it runs the avocado
> test and seems close to working, so I've added that test case so we
> don't go backwards (ha!).
>
> Other machine types might not be too far off if there is interest. I
> found it quite difficult to find these problems though, reverse
> debugging will sometimes just lock up, stop at wrong location, or abort
> with wrong event. Difficult to understand what went wrong. Worst case I
> had to basically bisect the replay of the trace, and find the minimum
> length of replay that hit the problem -- that sometimes would land near
> a mtDEC or timer interrupt or similar.
>
> Thanks,
> Nick
>
> Nicholas Piggin (19):
>   ppc/vhyp: reset exception state when handling vhyp hcall
>   ppc/vof: Fix missed fields in VOF cleanup
>   hw/ppc/ppc.c: Tidy over-long lines
>   hw/ppc: Introduce functions for conversion between timebase and
>     nanoseconds
>   host-utils: Add muldiv64_round_up
>   hw/ppc: Round up the decrementer interval when converting to ns
>   hw/ppc: Avoid decrementer rounding errors
>   target/ppc: Sign-extend large decrementer to 64-bits
>   hw/ppc: Always store the decrementer value
>   target/ppc: Migrate DECR SPR
>   hw/ppc: Reset timebase facilities on machine reset
>   hw/ppc: Read time only once to perform decrementer write
>   target/ppc: Fix CPU reservation migration for record-replay
>   target/ppc: Fix timebase reset with record-replay
>   spapr: Fix machine reset deadlock from replay-record
>   spapr: Fix record-replay machine reset consuming too many events
>   tests/avocado: boot ppc64 pseries replay-record test to Linux VFS
>     mount
>   tests/avocado: reverse-debugging cope with re-executing breakpoints
>   tests/avocado: ppc64 reverse debugging tests for pseries and powernv
>
>  hw/ppc/mac_oldworld.c              |   1 +
>  hw/ppc/pegasos2.c                  |   1 +
>  hw/ppc/pnv_core.c                  |   2 +
>  hw/ppc/ppc.c                       | 236 +++++++++++++++++++----------
>  hw/ppc/prep.c                      |   1 +
>  hw/ppc/spapr.c                     |  32 +++-
>  hw/ppc/spapr_cpu_core.c            |   2 +
>  hw/ppc/vof.c                       |   2 +
>  include/hw/ppc/ppc.h               |   3 +-
>  include/hw/ppc/spapr.h             |   2 +
>  include/qemu/host-utils.h          |  21 ++-
>  target/ppc/compat.c                |  19 +++
>  target/ppc/cpu.h                   |   3 +
>  target/ppc/excp_helper.c           |   3 +
>  target/ppc/machine.c               |  40 ++++-
>  target/ppc/translate.c             |   4 +
>  tests/avocado/replay_kernel.py     |   3 +-
>  tests/avocado/reverse_debugging.py |  54 ++++++-
>  18 files changed, 330 insertions(+), 99 deletions(-)
>

I am preparing a PR with this series. It is time to take a look at it
if you haven't already!

Thanks,

C.
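
For readers skimming the patch titles, here is a minimal sketch of two ideas the decrementer patches name: rounding up when converting timebase ticks to nanoseconds, and sign-extending a large decrementer value to 64 bits. This is not the code from the series. The `sketch_*` names, signatures and parameter widths are assumptions made for illustration, and the 128-bit intermediate relies on the GCC/Clang `unsigned __int128` extension.

```c
#include <stdint.h>

/*
 * Hypothetical helper (illustration only, not the series' implementation):
 * compute ceil(a * b / c) using a 128-bit intermediate so the product
 * cannot overflow. Assumes c != 0.
 */
static uint64_t sketch_muldiv64_round_up(uint64_t a, uint32_t b, uint32_t c)
{
    unsigned __int128 prod = (unsigned __int128)a * b;

    return (uint64_t)((prod + c - 1) / c);
}

/*
 * Hypothetical helper: convert timebase ticks to nanoseconds, rounding up.
 * A timer armed for the rounded-up interval does not expire before the
 * requested number of ticks has elapsed; truncation could make it fire
 * up to one tick early.
 */
static uint64_t sketch_tb_to_ns_round_up(uint64_t ticks, uint32_t tb_freq_hz)
{
    return sketch_muldiv64_round_up(ticks, 1000000000u, tb_freq_hz);
}

/*
 * Hypothetical helper: sign-extend an nr_bits-wide decrementer value to
 * 64 bits. Assumes 1 <= nr_bits <= 64. The final unsigned-to-signed
 * conversion is implementation-defined in C, but wraps as expected on
 * GCC and Clang.
 */
static int64_t sketch_sign_extend_decr(uint64_t value, unsigned nr_bits)
{
    uint64_t sign = 1ULL << (nr_bits - 1);
    uint64_t mask = (nr_bits == 64) ? ~0ULL : ((1ULL << nr_bits) - 1);

    value &= mask;
    return (int64_t)((value ^ sign) - sign);
}
```

With an assumed 512 MHz timebase, one tick is just under 2 ns: truncating division would give 1 ns, whereas `sketch_tb_to_ns_round_up(1, 512000000)` gives 2 ns, so the timer cannot fire before the tick has passed.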