diff mbox

Use the middle-end boolean_type_node

Message ID 1481662749-31499-1-git-send-email-blomqvist.janne@gmail.com
State New
Headers show

Commit Message

Janne Blomqvist Dec. 13, 2016, 8:59 p.m. UTC
Use the boolean_type_node setup by the middle-end instead of
redefining it. boolean_type_node is not used in GFortran for any
ABI-visible stuff, only internally as the type of boolean
expressions. There appears to be one exception to this, namely the
caf_get* and caf_send* calls which have boolean_type_node
arguments. However, on the library side they seem to use C _Bool, so I
suspect this might be a case of a argument mismatch that hasn't
affected anything so far.

The practical effect of this is that the size of such variables will
be the same as a C _Bool or C++ bool, that is, on most targets a
single byte. Previously we redefined boolean_type_node to be a Fortran
default logical kind sized variable, that is 4 or 8 bytes depending on
compile options. This might enable slightly more compact code, in case
the optimizer determines that the result of such a generated
comparison expression needs to be stored in some temporary location
rather than being used immediately.

Regression tested on x86_64-pc-linux-gnu, Ok for trunk?

2016-12-13  Janne Blomqvist  <jb@gcc.gnu.org>

	* trans-types.c (gfc_init_types): Don't redefine boolean type node.
---
 gcc/fortran/trans-types.c | 4 ----
 1 file changed, 4 deletions(-)

Comments

Janne Blomqvist Dec. 20, 2016, 2:56 p.m. UTC | #1
PING!

On Tue, Dec 13, 2016 at 10:59 PM, Janne Blomqvist
<blomqvist.janne@gmail.com> wrote:
> Use the boolean_type_node setup by the middle-end instead of
> redefining it. boolean_type_node is not used in GFortran for any
> ABI-visible stuff, only internally as the type of boolean
> expressions. There appears to be one exception to this, namely the
> caf_get* and caf_send* calls which have boolean_type_node
> arguments. However, on the library side they seem to use C _Bool, so I
> suspect this might be a case of a argument mismatch that hasn't
> affected anything so far.
>
> The practical effect of this is that the size of such variables will
> be the same as a C _Bool or C++ bool, that is, on most targets a
> single byte. Previously we redefined boolean_type_node to be a Fortran
> default logical kind sized variable, that is 4 or 8 bytes depending on
> compile options. This might enable slightly more compact code, in case
> the optimizer determines that the result of such a generated
> comparison expression needs to be stored in some temporary location
> rather than being used immediately.
>
> Regression tested on x86_64-pc-linux-gnu, Ok for trunk?
>
> 2016-12-13  Janne Blomqvist  <jb@gcc.gnu.org>
>
>         * trans-types.c (gfc_init_types): Don't redefine boolean type node.
> ---
>  gcc/fortran/trans-types.c | 4 ----
>  1 file changed, 4 deletions(-)
>
> diff --git a/gcc/fortran/trans-types.c b/gcc/fortran/trans-types.c
> index 354308f..e8dafa0 100644
> --- a/gcc/fortran/trans-types.c
> +++ b/gcc/fortran/trans-types.c
> @@ -961,10 +961,6 @@ gfc_init_types (void)
>                         wi::mask (n, UNSIGNED,
>                                   TYPE_PRECISION (size_type_node)));
>
> -  boolean_type_node = gfc_get_logical_type (gfc_default_logical_kind);
> -  boolean_true_node = build_int_cst (boolean_type_node, 1);
> -  boolean_false_node = build_int_cst (boolean_type_node, 0);
> -
>    /* ??? Shouldn't this be based on gfc_index_integer_kind or so?  */
>    gfc_charlen_int_kind = 4;
>    gfc_charlen_type_node = gfc_get_int_type (gfc_charlen_int_kind);
> --
> 2.7.4
>
Steve Kargl Dec. 20, 2016, 2:59 p.m. UTC | #2
Thought I gave an 'ok', but apparently never sent email.
Sorry about that.  Yes, ok to commit.
Dominik Vogt Jan. 3, 2017, 2:14 p.m. UTC | #3
On Tue, Dec 13, 2016 at 10:59:09PM +0200, Janne Blomqvist wrote:
> Use the boolean_type_node setup by the middle-end instead of
> redefining it. boolean_type_node is not used in GFortran for any
> ABI-visible stuff, only internally as the type of boolean
> expressions. There appears to be one exception to this, namely the
> caf_get* and caf_send* calls which have boolean_type_node
> arguments. However, on the library side they seem to use C _Bool, so I
> suspect this might be a case of a argument mismatch that hasn't
> affected anything so far.
> 
> The practical effect of this is that the size of such variables will
> be the same as a C _Bool or C++ bool, that is, on most targets a
> single byte. Previously we redefined boolean_type_node to be a Fortran
> default logical kind sized variable, that is 4 or 8 bytes depending on
> compile options. This might enable slightly more compact code, in case
> the optimizer determines that the result of such a generated
> comparison expression needs to be stored in some temporary location
> rather than being used immediately.

This patch costs several thousand additional instructions in
Spec2006 on s390x ("lines" = instructions):

  410.bwaves:     +28 lines (2 funcs bigger)
  437.leslie3d:   +43 lines (5 funcs bigger)
  434.zeusmp:   +2650 lines (15 funcs bigger)
  459.GemsFDTD:   +65 lines (7 funcs bigger)
  454.calculix:  +474 lines (23 funcs bigger)
  465.tonto:    +2182 lines (221 funcs bigger)
  481.wrf:      +4988 lines (117 funcs bigger)
  416.gamess:   +3723 lines (466 funcs bigger)

s390x has a "compare with immediate and jump relative" instruction
for 32 bit, but for an 8 bit quantities it needs separate compare
and jump instructions, e.g.

  cijne   %r1,0,... <bi_cgstab_block_+0xfe2>

->

  tmll    %r1,1
  jne     ... <bi_cgstab_block_+0xfe6>

Instead of hard coding a specific type, should one ask the backend
for the preferred type?

> Regression tested on x86_64-pc-linux-gnu, Ok for trunk?
> 
> 2016-12-13  Janne Blomqvist  <jb@gcc.gnu.org>
> 
> 	* trans-types.c (gfc_init_types): Don't redefine boolean type node.
> ---
>  gcc/fortran/trans-types.c | 4 ----
>  1 file changed, 4 deletions(-)
> 
> diff --git a/gcc/fortran/trans-types.c b/gcc/fortran/trans-types.c
> index 354308f..e8dafa0 100644
> --- a/gcc/fortran/trans-types.c
> +++ b/gcc/fortran/trans-types.c
> @@ -961,10 +961,6 @@ gfc_init_types (void)
>  			wi::mask (n, UNSIGNED,
>  				  TYPE_PRECISION (size_type_node)));
>  
> -  boolean_type_node = gfc_get_logical_type (gfc_default_logical_kind);
> -  boolean_true_node = build_int_cst (boolean_type_node, 1);
> -  boolean_false_node = build_int_cst (boolean_type_node, 0);
> -
>    /* ??? Shouldn't this be based on gfc_index_integer_kind or so?  */
>    gfc_charlen_int_kind = 4;
>    gfc_charlen_type_node = gfc_get_int_type (gfc_charlen_int_kind);
> -- 
> 2.7.4


Ciao

Dominik ^_^  ^_^
Jakub Jelinek Jan. 3, 2017, 2:20 p.m. UTC | #4
On Tue, Jan 03, 2017 at 03:14:46PM +0100, Dominik Vogt wrote:
> This patch costs several thousand additional instructions in
> Spec2006 on s390x ("lines" = instructions):
> 
>   410.bwaves:     +28 lines (2 funcs bigger)
>   437.leslie3d:   +43 lines (5 funcs bigger)
>   434.zeusmp:   +2650 lines (15 funcs bigger)
>   459.GemsFDTD:   +65 lines (7 funcs bigger)
>   454.calculix:  +474 lines (23 funcs bigger)
>   465.tonto:    +2182 lines (221 funcs bigger)
>   481.wrf:      +4988 lines (117 funcs bigger)
>   416.gamess:   +3723 lines (466 funcs bigger)
> 
> s390x has a "compare with immediate and jump relative" instruction
> for 32 bit, but for an 8 bit quantities it needs separate compare
> and jump instructions, e.g.
> 
>   cijne   %r1,0,... <bi_cgstab_block_+0xfe2>
> 
> ->
> 
>   tmll    %r1,1
>   jne     ... <bi_cgstab_block_+0xfe6>
> 
> Instead of hard coding a specific type, should one ask the backend
> for the preferred type?

The gfc_init_types change is an ABI change, at least if the fortran FE
bool type is ever stored in memory and accessed by multiple TUs, or
passed as argument etc.  And the difference between the C/C++ _Bool/bool
and fortran FE bool has caused lots of issues in the past, so if it can be
the same type, it is preferrable.

	Jakub
Janne Blomqvist Jan. 3, 2017, 2:31 p.m. UTC | #5
On Tue, Jan 3, 2017 at 4:20 PM, Jakub Jelinek <jakub@redhat.com> wrote:
> On Tue, Jan 03, 2017 at 03:14:46PM +0100, Dominik Vogt wrote:
>> This patch costs several thousand additional instructions in
>> Spec2006 on s390x ("lines" = instructions):
>>
>>   410.bwaves:     +28 lines (2 funcs bigger)
>>   437.leslie3d:   +43 lines (5 funcs bigger)
>>   434.zeusmp:   +2650 lines (15 funcs bigger)
>>   459.GemsFDTD:   +65 lines (7 funcs bigger)
>>   454.calculix:  +474 lines (23 funcs bigger)
>>   465.tonto:    +2182 lines (221 funcs bigger)
>>   481.wrf:      +4988 lines (117 funcs bigger)
>>   416.gamess:   +3723 lines (466 funcs bigger)
>>
>> s390x has a "compare with immediate and jump relative" instruction
>> for 32 bit, but for an 8 bit quantities it needs separate compare
>> and jump instructions, e.g.
>>
>>   cijne   %r1,0,... <bi_cgstab_block_+0xfe2>
>>
>> ->
>>
>>   tmll    %r1,1
>>   jne     ... <bi_cgstab_block_+0xfe6>
>>
>> Instead of hard coding a specific type, should one ask the backend
>> for the preferred type?

Hmm, that's sort of the opposite of what I had hoped for.. :-/

Is there some way to ask the backend what the preferred type is, then?

(The snide answer, why didn't the s390 ABi define
bool/_Bool/boolean_type_node to be a 32 bit type if 8 bit types are
problematic? But that's of course water under the bridge by now...)

> The gfc_init_types change is an ABI change, at least if the fortran FE
> bool type is ever stored in memory and accessed by multiple TUs, or
> passed as argument etc.

Based on the quick audit I did when I wrote the patch, the only time
it's used except as a local temp variable, is for a couple of the
co-array intrinsics, where the corresponding library implementation
actually uses C _Bool (I suspect it has worked by accident if the args
are passed in registers).

>  And the difference between the C/C++ _Bool/bool
> and fortran FE bool has caused lots of issues in the past, so if it can be
> the same type, it is preferrable.
>
>         Jakub
Andreas Krebbel Jan. 3, 2017, 4:50 p.m. UTC | #6
On 01/03/2017 03:31 PM, Janne Blomqvist wrote:
> On Tue, Jan 3, 2017 at 4:20 PM, Jakub Jelinek <jakub@redhat.com> wrote:
>> On Tue, Jan 03, 2017 at 03:14:46PM +0100, Dominik Vogt wrote:
>>> This patch costs several thousand additional instructions in
>>> Spec2006 on s390x ("lines" = instructions):
>>>
>>>   410.bwaves:     +28 lines (2 funcs bigger)
>>>   437.leslie3d:   +43 lines (5 funcs bigger)
>>>   434.zeusmp:   +2650 lines (15 funcs bigger)
>>>   459.GemsFDTD:   +65 lines (7 funcs bigger)
>>>   454.calculix:  +474 lines (23 funcs bigger)
>>>   465.tonto:    +2182 lines (221 funcs bigger)
>>>   481.wrf:      +4988 lines (117 funcs bigger)
>>>   416.gamess:   +3723 lines (466 funcs bigger)
>>>
>>> s390x has a "compare with immediate and jump relative" instruction
>>> for 32 bit, but for an 8 bit quantities it needs separate compare
>>> and jump instructions, e.g.
>>>
>>>   cijne   %r1,0,... <bi_cgstab_block_+0xfe2>
>>>
>>> ->
>>>
>>>   tmll    %r1,1
>>>   jne     ... <bi_cgstab_block_+0xfe6>
>>>
>>> Instead of hard coding a specific type, should one ask the backend
>>> for the preferred type?
> 
> Hmm, that's sort of the opposite of what I had hoped for.. :-/
> 
> Is there some way to ask the backend what the preferred type is, then?
> 
> (The snide answer, why didn't the s390 ABi define
> bool/_Bool/boolean_type_node to be a 32 bit type if 8 bit types are
> problematic? But that's of course water under the bridge by now...)

The regression with 8 bit boolean types surfaced with the z10 machine. The ABI is much older. Since
we cannot change it anymore we should rather talk to the hardware guys to add the instruction we
need. So for now we probably have to live with the regression in the Fortran cases since as I
understand it your change fixes an actual problem.

-Andreas-

> 
>> The gfc_init_types change is an ABI change, at least if the fortran FE
>> bool type is ever stored in memory and accessed by multiple TUs, or
>> passed as argument etc.
> 
> Based on the quick audit I did when I wrote the patch, the only time
> it's used except as a local temp variable, is for a couple of the
> co-array intrinsics, where the corresponding library implementation
> actually uses C _Bool (I suspect it has worked by accident if the args
> are passed in registers).
> 
>>  And the difference between the C/C++ _Bool/bool
>> and fortran FE bool has caused lots of issues in the past, so if it can be
>> the same type, it is preferrable.
>>
>>         Jakub
> 
> 
>
FX Coudert Jan. 3, 2017, 7:16 p.m. UTC | #7
> The gfc_init_types change is an ABI change, at least if the fortran FE
> bool type is ever stored in memory and accessed by multiple TUs, or
> passed as argument etc.  And the difference between the C/C++ _Bool/bool
> and fortran FE bool has caused lots of issues in the past, so if it can be
> the same type, it is preferrable.

The patch committed doesn’t change the way Fortran LOGICAL types are emitted. The fact that the default LOGICAL kind is different (on most platforms) from C/C++ boolean is due to the standard, and not our own choice of ABI.

As Jane says, boolean_type_node in the Fortran front-end is only used for intermediate values and temporaries; it is used every time we build a COND_EXPR. It is not part of the ABI.

FX
FX Coudert Jan. 3, 2017, 7:20 p.m. UTC | #8
> The regression with 8 bit boolean types surfaced with the z10 machine. The ABI is much older. Since
> we cannot change it anymore we should rather talk to the hardware guys to add the instruction we
> need. So for now we probably have to live with the regression in the Fortran cases since as I
> understand it your change fixes an actual problem.

As far as I understand (and Jane will correct me if I am wrong), the patch does not fix anything in particular. The idea was that, by transitioning from having all boolean expressions from “int” to “bool” (in C terms), and thus from 32-bit to 8-bit on “typical” targets, the optimizer might be able to emit more compact code. I am not sure this was tested.

So: maybe it is a case of "Profile. Don't speculate.”

FX
Janne Blomqvist Jan. 3, 2017, 7:40 p.m. UTC | #9
On Tue, Jan 3, 2017 at 6:50 PM, Andreas Krebbel
<krebbel@linux.vnet.ibm.com> wrote:
> The regression with 8 bit boolean types surfaced with the z10 machine. The ABI is much older. Since
> we cannot change it anymore we should rather talk to the hardware guys to add the instruction we
> need. So for now we probably have to live with the regression in the Fortran cases since as I
> understand it your change fixes an actual problem.

I do think this is fixable without waiting for potential improved
hardware.  The vast majority of uses of boolean_type_node in the
Fortran frontend is code like

          tmp = fold_build2_loc (input_location, GT_EXPR,
                                 boolean_type_node, from_len,
                                 build_zero_cst (TREE_TYPE (from_len)));
          tmp = fold_build3_loc (input_location, COND_EXPR,
                                 void_type_node, tmp, extcopy, stdcopy);

where the result of a boolean expression is returned as a
boolean_type_node.  This is a more like high-level language semantics,
whereas on the asm level boolean expressions are of course evaluated
with integer arithmetic (or are there targets where this is not
true?).

So while the Fortran frontend shouldn't redefine the ABI specified
boolean_type_node, for situations like the above we could instead use
some fast_scalar_non_abi_bool_type_node (better name suggestions
welcome?), if we had some mechanism for returning such information
from the backend? Or is there some universal best choice here?

Or since boolean types are not actually present at the asm level, what
is the best choice in cases like the above in order to generate code
that can be lowered to as efficient code as possible? Should the
result of the fold_build2_loc be of, say, TREE_TYPE(from_len) instead
of boolean_type_node (or fast_scalar_non_abi_bool_type_node or
whatever we come up with)?


(Repost without html formatting, sorry for the spam)
Jakub Jelinek Jan. 3, 2017, 7:45 p.m. UTC | #10
On Tue, Jan 03, 2017 at 09:40:02PM +0200, Janne Blomqvist wrote:
> I do think this is fixable without waiting for potential improved
> hardware.  The vast majority of uses of boolean_type_node in the
> Fortran frontend is code like
> 
>           tmp = fold_build2_loc (input_location, GT_EXPR,
>                                  boolean_type_node, from_len,
>                                  build_zero_cst (TREE_TYPE (from_len)));
>           tmp = fold_build3_loc (input_location, COND_EXPR,
>                                  void_type_node, tmp, extcopy, stdcopy);
> 
> where the result of a boolean expression is returned as a
> boolean_type_node.  This is a more like high-level language semantics,
> whereas on the asm level boolean expressions are of course evaluated
> with integer arithmetic (or are there targets where this is not
> true?).
> 
> So while the Fortran frontend shouldn't redefine the ABI specified
> boolean_type_node, for situations like the above we could instead use
> some fast_scalar_non_abi_bool_type_node (better name suggestions
> welcome?), if we had some mechanism for returning such information
> from the backend? Or is there some universal best choice here?
> 
> Or since boolean types are not actually present at the asm level, what
> is the best choice in cases like the above in order to generate code
> that can be lowered to as efficient code as possible? Should the
> result of the fold_build2_loc be of, say, TREE_TYPE(from_len) instead
> of boolean_type_node (or fast_scalar_non_abi_bool_type_node or
> whatever we come up with)?
> 
> 
> (Repost without html formatting, sorry for the spam)

Well, generally if it helps Fortran on z10, it will likely help C and C++
too, so you probably instead want some type promotion pass which will know
that on the particular target it is beneficial to promote booleans to
32-bits and know under what circumstances (the loads and stores of course
need to remain the same size for ABI reasons, but as soon as you have it in
SSA_NAME, it depends on what you use it for, sometimes it will be beneficial
to promote it, sometimes not.  Of course, that is something that is too late
for GCC 7.

	Jakub
Janne Blomqvist Jan. 4, 2017, 9:51 a.m. UTC | #11
On Tue, Jan 3, 2017 at 9:20 PM, FX <fxcoudert@gmail.com> wrote:
>> The regression with 8 bit boolean types surfaced with the z10 machine. The ABI is much older. Since
>> we cannot change it anymore we should rather talk to the hardware guys to add the instruction we
>> need. So for now we probably have to live with the regression in the Fortran cases since as I
>> understand it your change fixes an actual problem.
>
> As far as I understand (and Jane will correct me if I am wrong),

Sure, allow me to correct you: It's Janne, with two n in the middle. :)

> the patch does not fix anything in particular. The idea was that, by transitioning from having all boolean expressions from “int” to “bool” (in C terms), and thus from 32-bit to 8-bit on “typical” targets, the optimizer might be able to emit more compact code.

My motivations were:

- As you say, hopefully more compact code.

- Fixing the co-array intrinsics which pass arguments as
boolean_type_node and on the library side the corresponding arguments
are C _Bool's. I think this has worked previously accidentally, as
arguments are passed in registers anyway, and stack slots are 8
bytes(?) on x86-64. But of course, we shouldn't let this excuse us
from actually fixing it.

- Redefining an ABI type specified by the middle end is bad form, IMHO.

> I am not sure this was tested.
>
> So: maybe it is a case of "Profile. Don't speculate.”

I don't have access to spec2k6, but I just checked the polyhedron 2011
benchmark suite. Target x86_64-pc-linux-gnu, compile options -O3
-funroll-loops -ffast-math -ftree-loop-linear -march=native (native ==
amdfam10 here). The line counts of the .s files (~number of
instructions) are (working directory is current trunk,
gf_trunk_boolean_type_node is trunk with my patch reverted):

  6856 ac.s
  6856 ../gf_trunk_boolean_type_node/ac.s
 235206 aermod.s
 235337 ../gf_trunk_boolean_type_node/aermod.s
 22310 air.s
 22477 ../gf_trunk_boolean_type_node/air.s
 15221 capacita.s
 15252 ../gf_trunk_boolean_type_node/capacita.s
  4905 channel2.s
  4859 ../gf_trunk_boolean_type_node/channel2.s
  34269 doduc.s
  33610 ../gf_trunk_boolean_type_node/doduc.s
 11963 fatigue2.s
 11959 ../gf_trunk_boolean_type_node/fatigue2.s
 17369 gas_dyn2.s
 17304 ../gf_trunk_boolean_type_node/gas_dyn2.s
  31098 induct2.s
  31098 ../gf_trunk_boolean_type_node/induct2.s
  8366 linpk.s
  8367 ../gf_trunk_boolean_type_node/linpk.s
 15139 mdbx.s
 15155 ../gf_trunk_boolean_type_node/mdbx.s
  5302 mp_prop_design.s
  5314 ../gf_trunk_boolean_type_node/mp_prop_design.s
 17420 nf.s
 17457 ../gf_trunk_boolean_type_node/nf.s
 20804 protein.s
 20713 ../gf_trunk_boolean_type_node/protein.s
  52476 rnflow.s
  52449 ../gf_trunk_boolean_type_node/rnflow.s
  37339 test_fpu2.s
  37377 ../gf_trunk_boolean_type_node/test_fpu2.s
  8466 tfft2.s
  8448 ../gf_trunk_boolean_type_node/tfft2.s


So 7 of 17 benchmarks are slightly bigger after the patch (at least
for channel2 it seems due to trunk unrolling a loop which didn't
happen with the patch reverted, haven't checked others), 8 benchmarks
are slightly smaller after the patch, and 2 are equally long. Runtime
differences seem to be a wash, though I didn't test that very
carefully.
diff mbox

Patch

diff --git a/gcc/fortran/trans-types.c b/gcc/fortran/trans-types.c
index 354308f..e8dafa0 100644
--- a/gcc/fortran/trans-types.c
+++ b/gcc/fortran/trans-types.c
@@ -961,10 +961,6 @@  gfc_init_types (void)
 			wi::mask (n, UNSIGNED,
 				  TYPE_PRECISION (size_type_node)));
 
-  boolean_type_node = gfc_get_logical_type (gfc_default_logical_kind);
-  boolean_true_node = build_int_cst (boolean_type_node, 1);
-  boolean_false_node = build_int_cst (boolean_type_node, 0);
-
   /* ??? Shouldn't this be based on gfc_index_integer_kind or so?  */
   gfc_charlen_int_kind = 4;
   gfc_charlen_type_node = gfc_get_int_type (gfc_charlen_int_kind);