Message ID | 5461DBC4.9090508@redhat.com |
---|---|
State | New |
Headers | show |
On 11/11/2014 10:49 AM, Carlos O'Donell wrote: > For argv only. What instruction is the block copy? Are you > talking about lm/stm? I was thinking of EX combined with MVC. But of course you're right about envp and argp, where you'd have to count the number of entries before forming the count. > + # Now we have to zero out the envp entries after NULL to allow\n\ > + # start.S to properly find auxv by skipping zeroes.\n\ > + # zero out loop:\n\ > + lhi %r7,0\n\ > +.L3: st %r7,0(%r6) # Store zero.\n\ > + ahi %r6,4 # Advance dest pointer.\n\ > + ahi %r1,-1 # Subtract one from the word count.\n\ > + ltr %r1,%r1\n\ > + jne .L3 # Keep copying if the word count is non-zero.\n\ In the alpha port, I just copy auxv down as well. r~
On Tue, Nov 11, 2014 at 04:49:56AM -0500, Carlos O'Donell wrote: > + # Set the back chain to zero again\n\ > +.L4: xc 0(4,%r15),0(%r15)\n\ I don't think this is needed now, since you're not shifting the stack pointer anymore. Siddhesh
On 11/11/2014 10:49 AM, Carlos O'Donell wrote: > On 11/11/2014 04:28 AM, Richard Henderson wrote: >> On 11/11/2014 06:31 AM, Carlos O'Donell wrote: >>> Any clever ideas on how to fix this without copying up a large >>> portion of the stack? >> >> Nope, because other targets do in fact have to do just that. > > Right. > >> I'm actually surprised that almost all of them don't. I suppose >> that just depends on how the ABI is set up to pass parameters to >> the user _start... > > Yes, on hppa the ABI was such that I didn't depend on the layout, > but s390 does, thus I think it is non-trivial. > > The argv might be trivial to copy, but the size of envp is unknown, > and _start from the application side in start.S expects argv and > envp to be right up against eachother AFAICT. Worse it wants to > scan past them in order to find auxv. The saving grace is that > the auxv scan parses an arbitrary number of zeros between envp > and auxv. So I can move the gap to be between envp and auxv. > >> Fortunately, s390 has a block copy instruction, so the move >> should be trivial to implement. > > For argv only. What instruction is the block copy? Are you > talking about lm/stm? > > The most naive fix I have working is as follows (I've handed > off to Siddhesh to have a look since I'm out of time this > evening/morning). Thanks for working on this! Really surprising that this stayed unnoticed for that long. I thought the testsuite invokes the testcases as ld.so parameter as well? As I understand it an alignment fix is only required if _dl_skip_args is odd. So what about checking the least significant bit (tmll) and copying argv/envv down by 4 bytes? That way we would not need to copy anything in 50% of the cases? Would you like to continue working on it or should we try to take over? (Stefan or myself) Bye, -Andreas- > > diff --git a/sysdeps/s390/s390-32/dl-machine.h b/sysdeps/s390/s390-32/dl-machine.h > index c56185c..b189552 100644 > --- a/sysdeps/s390/s390-32/dl-machine.h > +++ b/sysdeps/s390/s390-32/dl-machine.h > @@ -166,18 +166,47 @@ _dl_start_user:\n\ > # See if we were run as a command with the executable file\n\ > # name as an extra leading argument.\n\ > l %r1,_dl_skip_args@GOT12(0,%r12)\n\ > - l %r1,0(%r1) # load _dl_skip_args\n\ > + l %r1,0(%r1) # load _dl_skip_args\n\ > + ltr %r1,%r1\n\ > + je .L4 # Skip the arg adjustment if there were none.\n\ > # Get the original argument count.\n\ > l %r0,96(%r15)\n\ > # Subtract _dl_skip_args from it.\n\ > sr %r0,%r1\n\ > - # Adjust the stack pointer to skip _dl_skip_args words.\n\ > - sll %r1,2\n\ > - ar %r15,%r1\n\ > - # Set the back chain to zero again\n\ > - xc 0(4,%r15),0(%r15)\n\ > # Store back the modified argument count.\n\ > st %r0,96(%r15)\n\ > + # Copy argv and envp forward to account for skipped argv entries.\n\ > + # We skipped at least one argument or we would not get here.\n\ > + la %r6,100(%r15) # Destination pointer i.e. &argv[0]\n\ > + lr %r5,%r6\n\ > + lr %r0,%r1\n\ > + sll %r0,2\n\ # Number of skipped bytes.\n\ > + ar %r5,%r0 # Source pointer = Dest + Skipped args.\n\ > + # argv copy loop:\n\ > +.L1: l %r7,0(%r5) # Load a word from the source.\n\ > + st %r7,0(%r6) # Store the word in the destination.\n\ > + ahi %r5,4\n\ > + ahi %r6,4\n\ > + ltr %r7,%r7\n\ > + jne .L1 # Stop after copying the NULL.\n\ > + # envp copy loop:\n\ > +.L2: l %r7,0(%r5) # Load a word from the source.\n\ > + st %r7,0(%r6) # Store the word in the destination.\n\ > + ahi %r5,4\n\ > + ahi %r6,4\n\ > + ltr %r7,%r7\n\ > + jne .L2 # Stop after copying the NULL.\n\ > + # Now we have to zero out the envp entries after NULL to allow\n\ > + # start.S to properly find auxv by skipping zeroes.\n\ > + # zero out loop:\n\ > + lhi %r7,0\n\ > +.L3: st %r7,0(%r6) # Store zero.\n\ > + ahi %r6,4 # Advance dest pointer.\n\ > + ahi %r1,-1 # Subtract one from the word count.\n\ > + ltr %r1,%r1\n\ > + jne .L3 # Keep copying if the word count is non-zero.\n\ > + # Set the back chain to zero again\n\ > +.L4: xc 0(4,%r15),0(%r15)\n\ > # The special initializer gets called with the stack just\n\ > # as the application's entry point will see it; it can\n\ > # switch stacks if it moves these contents over.\n\ > --- > > Cheers, > Carlos. >
Andreas Krebbel <krebbel@linux.vnet.ibm.com> writes: > As I understand it an alignment fix is only required if _dl_skip_args is > odd. So what about > checking the least significant bit (tmll) and copying argv/envv down by 4 > bytes? That way we would > not need to copy anything in 50% of the cases? I don't think it makes sense optimizing for this case, which is rare outside of runing the glibc testsuite. Andreas.
On 11/11/2014 10:49 AM, Carlos O'Donell wrote: > On 11/11/2014 04:28 AM, Richard Henderson wrote: >> On 11/11/2014 06:31 AM, Carlos O'Donell wrote: >>> Any clever ideas on how to fix this without copying up a large >>> portion of the stack? >> >> Nope, because other targets do in fact have to do just that. > > Right. > >> I'm actually surprised that almost all of them don't. I suppose >> that just depends on how the ABI is set up to pass parameters to >> the user _start... > > Yes, on hppa the ABI was such that I didn't depend on the layout, > but s390 does, thus I think it is non-trivial. > > The argv might be trivial to copy, but the size of envp is unknown, > and _start from the application side in start.S expects argv and > envp to be right up against eachother AFAICT. Worse it wants to > scan past them in order to find auxv. The saving grace is that > the auxv scan parses an arbitrary number of zeros between envp > and auxv. So I can move the gap to be between envp and auxv. > >> Fortunately, s390 has a block copy instruction, so the move >> should be trivial to implement. > > For argv only. What instruction is the block copy? Are you > talking about lm/stm? > > The most naive fix I have working is as follows (I've handed > off to Siddhesh to have a look since I'm out of time this > evening/morning). > > diff --git a/sysdeps/s390/s390-32/dl-machine.h b/sysdeps/s390/s390-32/dl-machine.h > index c56185c..b189552 100644 > --- a/sysdeps/s390/s390-32/dl-machine.h > +++ b/sysdeps/s390/s390-32/dl-machine.h > @@ -166,18 +166,47 @@ _dl_start_user:\n\ > # See if we were run as a command with the executable file\n\ > # name as an extra leading argument.\n\ > l %r1,_dl_skip_args@GOT12(0,%r12)\n\ > - l %r1,0(%r1) # load _dl_skip_args\n\ > + l %r1,0(%r1) # load _dl_skip_args\n\ > + ltr %r1,%r1\n\ > + je .L4 # Skip the arg adjustment if there were none.\n\ > # Get the original argument count.\n\ > l %r0,96(%r15)\n\ > # Subtract _dl_skip_args from it.\n\ > sr %r0,%r1\n\ > - # Adjust the stack pointer to skip _dl_skip_args words.\n\ > - sll %r1,2\n\ > - ar %r15,%r1\n\ > - # Set the back chain to zero again\n\ > - xc 0(4,%r15),0(%r15)\n\ > # Store back the modified argument count.\n\ > st %r0,96(%r15)\n\ > + # Copy argv and envp forward to account for skipped argv entries.\n\ > + # We skipped at least one argument or we would not get here.\n\ > + la %r6,100(%r15) # Destination pointer i.e. &argv[0]\n\ > + lr %r5,%r6\n\ > + lr %r0,%r1\n\ > + sll %r0,2\n\ # Number of skipped bytes.\n\ > + ar %r5,%r0 # Source pointer = Dest + Skipped args.\n\ > + # argv copy loop:\n\ > +.L1: l %r7,0(%r5) # Load a word from the source.\n\ > + st %r7,0(%r6) # Store the word in the destination.\n\ > + ahi %r5,4\n\ > + ahi %r6,4\n\ > + ltr %r7,%r7\n\ > + jne .L1 # Stop after copying the NULL.\n\ > + # envp copy loop:\n\ > +.L2: l %r7,0(%r5) # Load a word from the source.\n\ > + st %r7,0(%r6) # Store the word in the destination.\n\ > + ahi %r5,4\n\ > + ahi %r6,4\n\ > + ltr %r7,%r7\n\ > + jne .L2 # Stop after copying the NULL.\n\ > + # Now we have to zero out the envp entries after NULL to allow\n\ > + # start.S to properly find auxv by skipping zeroes.\n\ > + # zero out loop:\n\ > + lhi %r7,0\n\ > +.L3: st %r7,0(%r6) # Store zero.\n\ > + ahi %r6,4 # Advance dest pointer.\n\ > + ahi %r1,-1 # Subtract one from the word count.\n\ > + ltr %r1,%r1\n\ > + jne .L3 # Keep copying if the word count is non-zero.\n\ > + # Set the back chain to zero again\n\ > +.L4: xc 0(4,%r15),0(%r15)\n\ > # The special initializer gets called with the stack just\n\ > # as the application's entry point will see it; it can\n\ > # switch stacks if it moves these contents over.\n\ The patch is ok to apply with the xc removed. Thanks! A minor improvement might be to make use of an index reg in the loops. Perhaps something like this (untested): + # argv copy loop:\n\ + lhi %r8,0 +.L1: l %r7,0(%r8,%r5) # Load a word from the source.\n\ + st %r7,0(%r8,%r6) # Store the word in the destination.\n\ + ahi %r8,4\n\ + ltr %r7,%r7\n\ + jne .L1 # Stop after copying the NULL.\n\ + # envp copy loop:\n\ +.L2: l %r7,0(%r8,%r5) # Load a word from the source.\n\ + st %r7,0(%r8,%r6) # Store the word in the destination.\n\ + ahi %r8,4\n\ + ltr %r7,%r7\n\ + jne .L2 # Stop after copying the NULL.\n\ + # Now we have to zero out the envp entries after NULL to allow\n\ + # start.S to properly find auxv by skipping zeroes.\n\ + # zero out loop:\n\ + lhi %r7,0\n\ +.L3: st %r7,0(%r8,%r6) # Store zero.\n\ + ahi %r8,4 # Advance dest pointer.\n\ + ahi %r1,-1 # Subtract one from the word count.\n\ + ltr %r1,%r1\n\ + jne .L3 # Keep copying if the word count is non-zero.\n\ + # Set the back chain to zero again\n\ +.L4: Bye, -Andreas-
On 11/11/2014 09:54 AM, Andreas Krebbel wrote: > Would you like to continue working on it or should we try to take > over? (Stefan or myself) We can get it done. It would be great to have you review the final patch. Please note that the patch I posted is incomplete, it fails to readjust _dl_argv which is cached by the loader and needs to be changed if argv is moved. Simple fix though. I also don't know how long s390 lasted without this breaking something. I guess aligned stacks don't really matter all that much ;-) Cheers, Carlos.
On 11/12/2014 05:00 AM, Carlos O'Donell wrote: > On 11/11/2014 09:54 AM, Andreas Krebbel wrote: >> Would you like to continue working on it or should we try to take >> over? (Stefan or myself) > > We can get it done. It would be great to have you review the final > patch. > > Please note that the patch I posted is incomplete, it fails to > readjust _dl_argv which is cached by the loader and needs to be > changed if argv is moved. Simple fix though. > > I also don't know how long s390 lasted without this breaking > something. I guess aligned stacks don't really matter all that > much ;-) The reason probably is that the broken alignment never reaches the executable. In _start.S we correct the alignment again. So it really is only a problem for code executed between the argv adjustments done by ld.so and the entry point of the executable. -Andreas- > > Cheers, > Carlos. >
On Wed, Nov 12, 2014 at 10:05:45AM +0100, Andreas Krebbel wrote: > On 11/12/2014 05:00 AM, Carlos O'Donell wrote: > > On 11/11/2014 09:54 AM, Andreas Krebbel wrote: > >> Would you like to continue working on it or should we try to take > >> over? (Stefan or myself) > > > > We can get it done. It would be great to have you review the final > > patch. > > > > Please note that the patch I posted is incomplete, it fails to > > readjust _dl_argv which is cached by the loader and needs to be > > changed if argv is moved. Simple fix though. > > > > I also don't know how long s390 lasted without this breaking > > something. I guess aligned stacks don't really matter all that > > much ;-) > > The reason probably is that the broken alignment never reaches the executable. In _start.S we > correct the alignment again. So it really is only a problem for code executed between the argv > adjustments done by ld.so and the entry point of the executable. How/why does any code get executed in that interval anyway? Rich
Rich Felker <dalias@libc.org> writes: > On Wed, Nov 12, 2014 at 10:05:45AM +0100, Andreas Krebbel wrote: >> The reason probably is that the broken alignment never reaches the executable. In _start.S we >> correct the alignment again. So it really is only a problem for code executed between the argv >> adjustments done by ld.so and the entry point of the executable. > > How/why does any code get executed in that interval anyway? It's the dynamic linker itself and the init code. Andreas.
diff --git a/sysdeps/s390/s390-32/dl-machine.h b/sysdeps/s390/s390-32/dl-machine.h index c56185c..b189552 100644 --- a/sysdeps/s390/s390-32/dl-machine.h +++ b/sysdeps/s390/s390-32/dl-machine.h @@ -166,18 +166,47 @@ _dl_start_user:\n\ # See if we were run as a command with the executable file\n\ # name as an extra leading argument.\n\ l %r1,_dl_skip_args@GOT12(0,%r12)\n\ - l %r1,0(%r1) # load _dl_skip_args\n\ + l %r1,0(%r1) # load _dl_skip_args\n\ + ltr %r1,%r1\n\ + je .L4 # Skip the arg adjustment if there were none.\n\ # Get the original argument count.\n\ l %r0,96(%r15)\n\ # Subtract _dl_skip_args from it.\n\ sr %r0,%r1\n\ - # Adjust the stack pointer to skip _dl_skip_args words.\n\ - sll %r1,2\n\ - ar %r15,%r1\n\ - # Set the back chain to zero again\n\ - xc 0(4,%r15),0(%r15)\n\ # Store back the modified argument count.\n\ st %r0,96(%r15)\n\ + # Copy argv and envp forward to account for skipped argv entries.\n\ + # We skipped at least one argument or we would not get here.\n\ + la %r6,100(%r15) # Destination pointer i.e. &argv[0]\n\ + lr %r5,%r6\n\ + lr %r0,%r1\n\ + sll %r0,2\n\ # Number of skipped bytes.\n\ + ar %r5,%r0 # Source pointer = Dest + Skipped args.\n\ + # argv copy loop:\n\ +.L1: l %r7,0(%r5) # Load a word from the source.\n\ + st %r7,0(%r6) # Store the word in the destination.\n\ + ahi %r5,4\n\ + ahi %r6,4\n\ + ltr %r7,%r7\n\ + jne .L1 # Stop after copying the NULL.\n\ + # envp copy loop:\n\ +.L2: l %r7,0(%r5) # Load a word from the source.\n\ + st %r7,0(%r6) # Store the word in the destination.\n\ + ahi %r5,4\n\ + ahi %r6,4\n\ + ltr %r7,%r7\n\ + jne .L2 # Stop after copying the NULL.\n\ + # Now we have to zero out the envp entries after NULL to allow\n\ + # start.S to properly find auxv by skipping zeroes.\n\ + # zero out loop:\n\ + lhi %r7,0\n\ +.L3: st %r7,0(%r6) # Store zero.\n\ + ahi %r6,4 # Advance dest pointer.\n\ + ahi %r1,-1 # Subtract one from the word count.\n\ + ltr %r1,%r1\n\ + jne .L3 # Keep copying if the word count is non-zero.\n\ + # Set the back chain to zero again\n\ +.L4: xc 0(4,%r15),0(%r15)\n\ # The special initializer gets called with the stack just\n\ # as the application's entry point will see it; it can\n\ # switch stacks if it moves these contents over.\n\