diff mbox

[v2,MIPS] fix CRT_CALL_STATIC_FUNCTION macro

Message ID 6D39441BF12EF246A7ABCE6654B0235320FCA3F1@LEMAIL01.le.imgtec.org
State New
Headers show

Commit Message

Matthew Fortune Feb. 5, 2015, 8:51 p.m. UTC
Hi Petar,

I've put your patch inline below and switched to plain text. I suspect
your post was bounced by gcc-patches.

I'm OK with this change but I'd like Catherine to comment before committing.
It seems a shame to duplicate the block of code but it is probably just as
ugly to define a macro for the la/dla instruction.

For future reference, it is best not to include changelog content in a patch
but instead just paste into the email. Also the ChangeLog you need for this
change is gcc/ChangeLog (as confusing as that may be at first).

Thanks,
Matthew

> From: Petar Jovanovic [mailto:petar.jovanovic@rt-rk.com] 
> Sent: 05 February 2015 19:28
> To: gcc-patches@gcc.gnu.org; 'Maciej W. Rozycki'; Matthew Fortune
> Subject: [PATCH v2][MIPS] fix CRT_CALL_STATIC_FUNCTION macro
>
> v2:
> - add ChangeLog entry
> - use DLA instead of LA for n64
>
> PTAL. Thanks.
>
> Regards,
> Petar

---
 ChangeLog              |  5 +++++
 gcc/config/mips/mips.h | 23 +++++++++++++++++++----
 2 files changed, 24 insertions(+), 4 deletions(-)

Comments

Maciej W. Rozycki Feb. 6, 2015, 10:43 a.m. UTC | #1
On Thu, 5 Feb 2015, Matthew Fortune wrote:

> I'm OK with this change but I'd like Catherine to comment before committing.
> It seems a shame to duplicate the block of code but it is probably just as
> ugly to define a macro for the la/dla instruction.

 Native systems have <sys/asm.h> for such ABI dependencies, including 
stuff to set up $gp.  Perhaps we could reuse these bits, the licence I 
think allows us to.

  Maciej
Matthew Fortune Feb. 6, 2015, 10:56 a.m. UTC | #2
Maciej W. Rozycki <macro@linux-mips.org> writes:
> On Thu, 5 Feb 2015, Matthew Fortune wrote:
> 
> > I'm OK with this change but I'd like Catherine to comment before
> committing.
> > It seems a shame to duplicate the block of code but it is probably just
> as
> > ugly to define a macro for the la/dla instruction.
> 
>  Native systems have <sys/asm.h> for such ABI dependencies, including
> stuff to set up $gp.  Perhaps we could reuse these bits, the licence I
> think allows us to.

That's a good idea. Perhaps I should take that on as part of some cleanup
of the MIPS backend in the next stage1. I'm looking to rework how the
ISA_HAS logic works so perhaps there would be value in doing this mostly in
a header that can also be used for assembly programmers. That would naturally
mean we get all the other nice assembly macros available in the backend of
GCC too.

Thanks,
Matthew
Maciej W. Rozycki Feb. 6, 2015, 12:23 p.m. UTC | #3
On Fri, 6 Feb 2015, Matthew Fortune wrote:

> >  Native systems have <sys/asm.h> for such ABI dependencies, including
> > stuff to set up $gp.  Perhaps we could reuse these bits, the licence I
> > think allows us to.
> 
> That's a good idea. Perhaps I should take that on as part of some cleanup
> of the MIPS backend in the next stage1. I'm looking to rework how the
> ISA_HAS logic works so perhaps there would be value in doing this mostly in
> a header that can also be used for assembly programmers. That would naturally
> mean we get all the other nice assembly macros available in the backend of
> GCC too.

 This consideration made me realise I've had a patch outstanding for some 
10 years to convert all the `BAL x' instructions there to `BLTZAL $0, x'.  
This has always been a good idea in case implementations recognised the 
special case and avoided involving branch prediction, and I believe it has 
become even more apparent with r6 calling it NAL.

 I'll see if I can submit it to glibc soon -- as you may have been aware 
10 years ago wasn't exactly the most friendly period in glibc maintenance 
and hence I wasn't very prompt with patch submissions.

  Maciej
Moore, Catherine Feb. 6, 2015, 3:12 p.m. UTC | #4
> -----Original Message-----
> From: Matthew Fortune [mailto:Matthew.Fortune@imgtec.com]
> Sent: Thursday, February 05, 2015 3:52 PM
> Subject: RE: [PATCH v2][MIPS] fix CRT_CALL_STATIC_FUNCTION macro
> > 
> I've put your patch inline below and switched to plain text. I suspect your
> post was bounced by gcc-patches.
> 
> I'm OK with this change but I'd like Catherine to comment before committing.
> It seems a shame to duplicate the block of code but it is probably just as ugly
> to define a macro for the la/dla instruction.

The patch itself is OK, although I agree with other's comments about the unfortunate duplication of code.
Petar,  would you please add your testcase to the gcc testsuite and repost the patch.
Thanks,
Catherine

> 
> For future reference, it is best not to include changelog content in a patch
> but instead just paste into the email. Also the ChangeLog you need for this
> change is gcc/ChangeLog (as confusing as that may be at first).
> 
> Thanks,
> Matthew
> 
> > From: Petar Jovanovic [mailto:petar.jovanovic@rt-rk.com]
> > Sent: 05 February 2015 19:28
> > To: gcc-patches@gcc.gnu.org; 'Maciej W. Rozycki'; Matthew Fortune
> > Subject: [PATCH v2][MIPS] fix CRT_CALL_STATIC_FUNCTION macro
> >
> > v2:
> > - add ChangeLog entry
> > - use DLA instead of LA for n64
> >
> > PTAL. Thanks.
> >
> > Regards,
> > Petar
> 
> ---
>  ChangeLog              |  5 +++++
>  gcc/config/mips/mips.h | 23 +++++++++++++++++++----
>  2 files changed, 24 insertions(+), 4 deletions(-)
> 
> diff --git a/ChangeLog b/ChangeLog
> index 5c61c66..3a15f4a 100644
> --- a/ChangeLog
> +++ b/ChangeLog
> @@ -1,3 +1,8 @@
> +2015-02-05  Petar Jovanovic  <petar.jovanovic@rt-rk.com>
> +
> +	* config/mips/mips.h (CRT_CALL_STATIC_FUNCTION): Fix the macro
> to use
> +	la/jalr instead of jal.
> +
>  2015-02-02  Janis Johnson  <janis.marie.johnson@gmail.com>
> 
>  	* MAINTAINERS (Various Maintainers: testsuite): Remove myself.
> diff --git a/gcc/config/mips/mips.h b/gcc/config/mips/mips.h index
> ec69ed5..4bd83f5 100644
> --- a/gcc/config/mips/mips.h
> +++ b/gcc/config/mips/mips.h
> @@ -3034,11 +3034,11 @@ while (0)
>  	nop\n\
>  1:	.cpload $31\n\
>  	.set reorder\n\
> -	jal " USER_LABEL_PREFIX #FUNC "\n\
> +	la $25, " USER_LABEL_PREFIX #FUNC "\n\
> +	jalr $25\n\
>  	.set pop\n\
>  	" TEXT_SECTION_ASM_OP);
> -#elif ((defined _ABIN32 && _MIPS_SIM == _ABIN32) \
> -       || (defined _ABI64 && _MIPS_SIM == _ABI64))
> +#elif (defined _ABIN32 && _MIPS_SIM == _ABIN32)
>  #define CRT_CALL_STATIC_FUNCTION(SECTION_OP, FUNC)	\
>     asm (SECTION_OP "\n\
>  	.set push\n\
> @@ -3048,7 +3048,22 @@ while (0)
>  	nop\n\
>  1:	.set reorder\n\
>  	.cpsetup $31, $2, 1b\n\
> -	jal " USER_LABEL_PREFIX #FUNC "\n\
> +	la $25, " USER_LABEL_PREFIX #FUNC "\n\
> +	jalr $25\n\
> +	.set pop\n\
> +	" TEXT_SECTION_ASM_OP);
> +#elif (defined _ABI64 && _MIPS_SIM == _ABI64)
> +#define CRT_CALL_STATIC_FUNCTION(SECTION_OP, FUNC)	\
> +   asm (SECTION_OP "\n\
> +	.set push\n\
> +	.set nomips16\n\
> +	.set noreorder\n\
> +	bal 1f\n\
> +	nop\n\
> +1:	.set reorder\n\
> +	.cpsetup $31, $2, 1b\n\
> +	dla $25, " USER_LABEL_PREFIX #FUNC "\n\
> +	jalr $25\n\
>  	.set pop\n\
>  	" TEXT_SECTION_ASM_OP);
>  #endif
> --
> 1.8.2.1
Mike Stump Feb. 6, 2015, 5:26 p.m. UTC | #5
On Feb 6, 2015, at 4:23 AM, Maciej W. Rozycki <macro@linux-mips.org> wrote:
> This consideration made me realise I've had a patch outstanding for some 
> 10 years to convert all the `BAL x' instructions there to `BLTZAL $0, x'.  
> This has always been a good idea in case implementations recognised the 
> special case and avoided involving branch prediction, and I believe it has 
> become even more apparent with r6 calling it NAL.

Ick, no.
Matthew Fortune Feb. 6, 2015, 5:41 p.m. UTC | #6
Mike Stump <mikestump@comcast.net> writes:
> On Feb 6, 2015, at 4:23 AM, Maciej W. Rozycki <macro@linux-mips.org>
> wrote:
> > This consideration made me realise I've had a patch outstanding for
> > some
> > 10 years to convert all the `BAL x' instructions there to `BLTZAL $0,
> x'.
> > This has always been a good idea in case implementations recognised
> > the special case and avoided involving branch prediction, and I
> > believe it has become even more apparent with r6 calling it NAL.
> 
> Ick, no.

What part of this are you referring to?

NAL (bizarre name or not) is the least intrusive way to obtain the PC on
MIPS <= R5. The use of BAL for this, albeit common, has a high risk of
affecting hardware optimisations like return predictors by introducing
a call that will never return.

This is a change that I am also planning to propagate to as many projects
as possible. If you can see a problem with using BLTZAL for this purpose
please could you explain as it may have been overlooked?

Thanks,
Matthew
Maciej W. Rozycki Feb. 6, 2015, 5:46 p.m. UTC | #7
On Fri, 6 Feb 2015, Mike Stump wrote:

> > This consideration made me realise I've had a patch outstanding for some 
> > 10 years to convert all the `BAL x' instructions there to `BLTZAL $0, x'.  
> > This has always been a good idea in case implementations recognised the 
> > special case and avoided involving branch prediction, and I believe it has 
> > become even more apparent with r6 calling it NAL.
> 
> Ick, no.

 Hmm, have you hit `send' too quickly by any chance?  There's surely a 
further part missing from your post where you'd explain what you mean.

  Maciej
Mike Stump Feb. 6, 2015, 6:02 p.m. UTC | #8
On Feb 6, 2015, at 9:41 AM, Matthew Fortune <Matthew.Fortune@imgtec.com> wrote:
> Mike Stump <mikestump@comcast.net> writes:
>> On Feb 6, 2015, at 4:23 AM, Maciej W. Rozycki <macro@linux-mips.org>
>> wrote:
>>> This consideration made me realise I've had a patch outstanding for
>>> some
>>> 10 years to convert all the `BAL x' instructions there to `BLTZAL $0,
>> x'.
>>> This has always been a good idea in case implementations recognised
>>> the special case and avoided involving branch prediction, and I
>>> believe it has become even more apparent with r6 calling it NAL.
>> 
>> Ick, no.
> 
> What part of this are you referring to?

The first part.  Ah, yes, I had mentally flipped the two cases.  I mistakingly thought you wanted to change all the BALs to BLTZAL, which you don’t want to do.  You only want to flip the non-calls to that form, which is perfectly reasonable.

Personally, the call form of bal in my book should be called call, and the non-call form of it should be called bal, but, I realize it is likely to late to do much about now.  If one went down this path, then even changing it away from bal is wrong.  That’s the basis of why I fired up the first email.  Maybe more a lament at this point.
Maciej W. Rozycki Feb. 6, 2015, 7:19 p.m. UTC | #9
On Fri, 6 Feb 2015, Mike Stump wrote:

> Personally, the call form of bal in my book should be called call, and 
> the non-call form of it should be called bal, but, I realize it is 
> likely to late to do much about now.  If one went down this path, then 
> even changing it away from bal is wrong.  That’s the basis of why I 
> fired up the first email.  Maybe more a lament at this point.

 I'm confused, these all are conditional short-distance calls.  And I 
suspect that they were included in the original MIPS architecture for the 
very purpose of letting relocatable code find its own position in memory 
at the run time -- and the rest was added to maximise the benefit of the 
opcode(s) that had to be there anyway.

 And it's just, owing to how the architecture has been structured, that 
there are exactly two cases (of the 64 encodings available) where the 
condition is fixed to either `true' or `false', respectively.  And if 
anyhow, I'd call any assembly-language idiom of `BLTZAL $0, foo' just 
`MOVE $31, $pc' or maybe to avoid confusion `[D]ADDIU $31, $pc, 8'.

 Note that `foo' is never used in that operation (and that the MIPS16 and 
microMIPS, and now also r6 instruction sets actually have such direct 
operations that have a wider range of immediates available; although care 
has to be taken about them as they may have unusual alignment or other 
restrictions).

  Maciej
Petar Jovanovic Feb. 13, 2015, 12:28 a.m. UTC | #10
-----Original Message-----
From: Moore, Catherine [mailto:Catherine_Moore@mentor.com] 
Sent: Friday, February 6, 2015 4:13 PM
To: Matthew Fortune; Petar Jovanovic; gcc-patches@gcc.gnu.org; 'Maciej W.
Rozycki'
Subject: RE: [PATCH v2][MIPS] fix CRT_CALL_STATIC_FUNCTION macro


> Petar,  would you please add your testcase to the gcc testsuite and repost
the patch.

Hi Catherine,

Sure, I can repost the patch and add the test case I used.
My concern - and I would appreciate feedback from you and others on it - is
that the test I originally used to report [1] the problem creates a large
executable (380 MB).

Here is a variant of it:

/* { dg-do run } */

#include <stdio.h>

#if defined(__GLIBC__) && (__GLIBC__ < 2 || (__GLIBC__ == 2 &&
__GLIBC_MINOR__ < 21))
#define BROKEN
#endif

int
main (void)
{
#ifndef BROKEN
  asm(".fill 100000000, 4, 0x00000000\n");
#endif
  return 0;
}

The test also takes time to execute. Obviously, I can make it slightly more
complex and make execute time short. Another path is to introduce and use
options such as "Wl,-Ttext-segment" to make its size smaller. Any other
ideas?

Regards,
Petar

[1] https://sourceware.org/bugzilla/show_bug.cgi?id=17601
Moore, Catherine Feb. 13, 2015, 1:36 a.m. UTC | #11
> -----Original Message-----
> From: Petar Jovanovic [mailto:petar.jovanovic@rt-rk.com]
> Sent: Thursday, February 12, 2015 7:28 PM
> Subject: RE: [PATCH v2][MIPS] fix CRT_CALL_STATIC_FUNCTION macro
> 
> -----Original Message-----
> From: Moore, Catherine [mailto:Catherine_Moore@mentor.com]
> Sent: Friday, February 6, 2015 4:13 PM
> To: Matthew Fortune; Petar Jovanovic; gcc-patches@gcc.gnu.org; 'Maciej W.
> Rozycki'
> Subject: RE: [PATCH v2][MIPS] fix CRT_CALL_STATIC_FUNCTION macro
> 
> 
> > Petar,  would you please add your testcase to the gcc testsuite and repost
> the patch.
> 
> Hi Catherine,
> 
> Sure, I can repost the patch and add the test case I used.
> My concern - and I would appreciate feedback from you and others on it - is
> that the test I originally used to report [1] the problem creates a large
> executable (380 MB).
> 
> 
> The test also takes time to execute. Obviously, I can make it slightly more
> complex and make execute time short. Another path is to introduce and use
> options such as "Wl,-Ttext-segment" to make its size smaller. Any other
> ideas?
> 
Hi Petar,

There isn't any need to execute a large testcase.  Instead, try adding a short version of your test to the directory gcc/testsuite/gcc.target/mips.
There are examples of other tests there they check for specific assembler sequences by using scan-assembler.   See umips-swp-1.c (and others) for a specific example of how to do this.
Let me know if you need additional help.
Thanks,
Catherine
Petar Jovanovic April 16, 2015, 4:53 p.m. UTC | #12
-----Original Message-----
From: Moore, Catherine [mailto:Catherine_Moore@mentor.com] 
Sent: Friday, February 13, 2015 2:37 AM
To: Petar Jovanovic; 'Matthew Fortune'; gcc-patches@gcc.gnu.org; 'Maciej W.
Rozycki'
Cc: Moore, Catherine
Subject: RE: [PATCH v2][MIPS] fix CRT_CALL_STATIC_FUNCTION macro

> Hi Petar,
> 
> There isn't any need to execute a large testcase.  Instead, try adding a
short version of your test to the directory gcc/testsuite/gcc.target/mips.
> There are examples of other tests there they check for specific assembler
sequences by using scan-assembler.   See umips-swp-1.c (and others) for a
specific example of how to do this.
> Let me know if you need additional help.
> Thanks,
> Catherine

Hi Catherine,

Sorry for late response, I have missed to follow up on this.
IIUC, scan-assembler will scan assembly file generated from a test file.
This particular change will - on the other hand - modify crtend.o and thus
init section. Is there a 'scan-assembler' functionality over a final linked
executable, as that would actually test the change?

Let me know. Thanks.

Regards,
Petar
Maciej W. Rozycki April 16, 2015, 6:22 p.m. UTC | #13
On Thu, 16 Apr 2015, Petar Jovanovic wrote:

> > There isn't any need to execute a large testcase.  Instead, try adding a
> short version of your test to the directory gcc/testsuite/gcc.target/mips.
> > There are examples of other tests there they check for specific assembler
> sequences by using scan-assembler.   See umips-swp-1.c (and others) for a
> specific example of how to do this.
> > Let me know if you need additional help.
> > Thanks,
> > Catherine
> 
> Hi Catherine,
> 
> Sorry for late response, I have missed to follow up on this.
> IIUC, scan-assembler will scan assembly file generated from a test file.
> This particular change will - on the other hand - modify crtend.o and thus
> init section. Is there a 'scan-assembler' functionality over a final linked
> executable, as that would actually test the change?

 You'd need `objdump' for doing that and there is no ready-to-use solution 
within the GCC test suite available, you'd have to cook something up 
yourself, perhaps starting with `[find_binutils_prog objdump]'.

 Another solution might be using an auxiliary linker script (`-Wl,-T,...' 
or maybe just an implicit linker script will do; see the LD manual for 
details) to place the sections the jump is made between apart manually at 
addresses appropriate for JAL to fail.  They span does not have to be 
large -- when placed correctly, even `JAL .' can jump to the wrong place, 
which is what LD is supposed to catch as a relocation error -- so a test 
executable successfully linked with your correction in place won't be 
large, it doesn't have to take more than 2 virtual pages.

 The downside of the latter solution are possible portability issues with 
the linker script.  You won't have to run your executable AFAICT from 
glibc BZ #17601 as you'll get a link error if a jump target is out of 
range.  So you could make it a mere link test, no need to run it.

  Maciej
Moore, Catherine April 16, 2015, 8:38 p.m. UTC | #14
> -----Original Message-----
> From: Maciej W. Rozycki [mailto:macro@linux-mips.org]
> Sent: Thursday, April 16, 2015 2:23 PM
> To: Petar Jovanovic
> Cc: Moore, Catherine; 'Matthew Fortune'; gcc-patches@gcc.gnu.org
> Subject: RE: [PATCH v2][MIPS] fix CRT_CALL_STATIC_FUNCTION macro
> 
> On Thu, 16 Apr 2015, Petar Jovanovic wrote:
> 
> > > There isn't any need to execute a large testcase.  Instead, try
> > > adding a
> > short version of your test to the directory gcc/testsuite/gcc.target/mips.
> > > There are examples of other tests there they check for specific
> > > assembler
> > sequences by using scan-assembler.   See umips-swp-1.c (and others) for a
> > specific example of how to do this.
> > > Let me know if you need additional help.
> > > Thanks,
> > > Catherine
> >
> > Hi Catherine,
> >
> > Sorry for late response, I have missed to follow up on this.
> > IIUC, scan-assembler will scan assembly file generated from a test file.
> > This particular change will - on the other hand - modify crtend.o and
> > thus init section. Is there a 'scan-assembler' functionality over a
> > final linked executable, as that would actually test the change?
> 
Hi Petar,
It looks like I answered a little too quickly the first time around.  I'm sorry.
In any case, Maciej is correct.  I think a link-time test should do it.
Thanks,
Catherine

>  You'd need `objdump' for doing that and there is no ready-to-use solution
> within the GCC test suite available, you'd have to cook something up
> yourself, perhaps starting with `[find_binutils_prog objdump]'.
> 
>  Another solution might be using an auxiliary linker script (`-Wl,-T,...'
> or maybe just an implicit linker script will do; see the LD manual for
> details) to place the sections the jump is made between apart manually at
> addresses appropriate for JAL to fail.  They span does not have to be large --
> when placed correctly, even `JAL .' can jump to the wrong place, which is
> what LD is supposed to catch as a relocation error -- so a test executable
> successfully linked with your correction in place won't be large, it doesn't
> have to take more than 2 virtual pages.
> 
>  The downside of the latter solution are possible portability issues with the
> linker script.  You won't have to run your executable AFAICT from glibc BZ
> #17601 as you'll get a link error if a jump target is out of range.  So you could
> make it a mere link test, no need to run it.
> 
>   Maciej
diff mbox

Patch

diff --git a/ChangeLog b/ChangeLog
index 5c61c66..3a15f4a 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,8 @@ 
+2015-02-05  Petar Jovanovic  <petar.jovanovic@rt-rk.com>
+
+	* config/mips/mips.h (CRT_CALL_STATIC_FUNCTION): Fix the macro to use
+	la/jalr instead of jal.
+
 2015-02-02  Janis Johnson  <janis.marie.johnson@gmail.com>
 
 	* MAINTAINERS (Various Maintainers: testsuite): Remove myself.
diff --git a/gcc/config/mips/mips.h b/gcc/config/mips/mips.h
index ec69ed5..4bd83f5 100644
--- a/gcc/config/mips/mips.h
+++ b/gcc/config/mips/mips.h
@@ -3034,11 +3034,11 @@  while (0)
 	nop\n\
 1:	.cpload $31\n\
 	.set reorder\n\
-	jal " USER_LABEL_PREFIX #FUNC "\n\
+	la $25, " USER_LABEL_PREFIX #FUNC "\n\
+	jalr $25\n\
 	.set pop\n\
 	" TEXT_SECTION_ASM_OP);
-#elif ((defined _ABIN32 && _MIPS_SIM == _ABIN32) \
-       || (defined _ABI64 && _MIPS_SIM == _ABI64))
+#elif (defined _ABIN32 && _MIPS_SIM == _ABIN32)
 #define CRT_CALL_STATIC_FUNCTION(SECTION_OP, FUNC)	\
    asm (SECTION_OP "\n\
 	.set push\n\
@@ -3048,7 +3048,22 @@  while (0)
 	nop\n\
 1:	.set reorder\n\
 	.cpsetup $31, $2, 1b\n\
-	jal " USER_LABEL_PREFIX #FUNC "\n\
+	la $25, " USER_LABEL_PREFIX #FUNC "\n\
+	jalr $25\n\
+	.set pop\n\
+	" TEXT_SECTION_ASM_OP);
+#elif (defined _ABI64 && _MIPS_SIM == _ABI64)
+#define CRT_CALL_STATIC_FUNCTION(SECTION_OP, FUNC)	\
+   asm (SECTION_OP "\n\
+	.set push\n\
+	.set nomips16\n\
+	.set noreorder\n\
+	bal 1f\n\
+	nop\n\
+1:	.set reorder\n\
+	.cpsetup $31, $2, 1b\n\
+	dla $25, " USER_LABEL_PREFIX #FUNC "\n\
+	jalr $25\n\
 	.set pop\n\
 	" TEXT_SECTION_ASM_OP);
 #endif