diff mbox

LTO streaming of TARGET_OPTIMIZE_NODE

Message ID 546DDE1C.6060203@t-online.de
State New
Headers show

Commit Message

Bernd Schmidt Nov. 20, 2014, 12:27 p.m. UTC
On 11/13/2014 05:06 AM, Jan Hubicka wrote:
> this patch adds infrastructure for proper streaming and merging of
> TREE_TARGET_OPTION.

This breaks the offloading path via LTO since it introduces an 
incompatibility in LTO format between host and offload machine.

A very quick patch to fix it is below - the OpenACC testcase I was using 
seems to be working again with this. Thoughts, suggestions?


Bernd

Comments

Richard Biener Nov. 20, 2014, 1:20 p.m. UTC | #1
On Thu, 20 Nov 2014, Bernd Schmidt wrote:

> On 11/13/2014 05:06 AM, Jan Hubicka wrote:
> > this patch adds infrastructure for proper streaming and merging of
> > TREE_TARGET_OPTION.
> 
> This breaks the offloading path via LTO since it introduces an incompatibility
> in LTO format between host and offload machine.
> 
> A very quick patch to fix it is below - the OpenACC testcase I was using seems
> to be working again with this. Thoughts, suggestions?

The offload target needs to have the same target options as the host?

Are the offload functions marked somehow?  That is, can we avoid
setting TREE_TARGET_OPTION on them?  Or rather we need to have a
default TREE_TARGET_OPTION node for the offload target which we'd
need to set - how would you otherwise transfer different offload
target options to the offload compile?  How do you transfer
offload target options to the offload compile at all?

I think this just shows conceptual issues with the LTO approach...

Thanks,
Richard.
Bernd Schmidt Nov. 20, 2014, 2:21 p.m. UTC | #2
On 11/20/2014 02:20 PM, Richard Biener wrote:
> On Thu, 20 Nov 2014, Bernd Schmidt wrote:
>
>> On 11/13/2014 05:06 AM, Jan Hubicka wrote:
>>> this patch adds infrastructure for proper streaming and merging of
>>> TREE_TARGET_OPTION.
>>
>> This breaks the offloading path via LTO since it introduces an incompatibility
>> in LTO format between host and offload machine.
>>
>> A very quick patch to fix it is below - the OpenACC testcase I was using seems
>> to be working again with this. Thoughts, suggestions?
>
> The offload target needs to have the same target options as the host?

Not really meaningful I'd think.

> Are the offload functions marked somehow?  That is, can we avoid
> setting TREE_TARGET_OPTION on them?

Well, they are mostly generated automatically by omp-low.c, so 
TREE_TARGET_OPTION wouldn't normally be set anyway. So the field is 
unnecessary, we just can't write it out since the two compilers involved 
disagree on its layout.

> Or rather we need to have a
> default TREE_TARGET_OPTION node for the offload target which we'd
> need to set - how would you otherwise transfer different offload
> target options to the offload compile?  How do you transfer
> offload target options to the offload compile at all?

ABI options are transferred via the -foffload-abi mechanism. No other 
target options can be transferred.

> I think this just shows conceptual issues with the LTO approach...

I don't think running into a few problems demonstrates a conceptual 
problem when it works fine with some fairly small patches.


Bernd
Jan Hubicka Nov. 20, 2014, 9:12 p.m. UTC | #3
> On 11/20/2014 02:20 PM, Richard Biener wrote:
> >On Thu, 20 Nov 2014, Bernd Schmidt wrote:
> >
> >>On 11/13/2014 05:06 AM, Jan Hubicka wrote:
> >>>this patch adds infrastructure for proper streaming and merging of
> >>>TREE_TARGET_OPTION.
> >>
> >>This breaks the offloading path via LTO since it introduces an incompatibility
> >>in LTO format between host and offload machine.
> >>
> >>A very quick patch to fix it is below - the OpenACC testcase I was using seems
> >>to be working again with this. Thoughts, suggestions?
> >
> >The offload target needs to have the same target options as the host?
> 
> Not really meaningful I'd think.
> 
> >Are the offload functions marked somehow?  That is, can we avoid
> >setting TREE_TARGET_OPTION on them?
> 
> Well, they are mostly generated automatically by omp-low.c, so
> TREE_TARGET_OPTION wouldn't normally be set anyway. So the field is
> unnecessary, we just can't write it out since the two compilers
> involved disagree on its layout.

I am currently populating TREE_TARGET_OPTION in free lang data that is probably
called after omp-low and incrementally I plan to set it even for newly constructed
functions (profiling, ctors etc.) that are built during IPA, so we do not really need
to rely on sane global state at link time.
This however has nothing to do with offloading.

Honza

> 
> >Or rather we need to have a
> >default TREE_TARGET_OPTION node for the offload target which we'd
> >need to set - how would you otherwise transfer different offload
> >target options to the offload compile?  How do you transfer
> >offload target options to the offload compile at all?
> 
> ABI options are transferred via the -foffload-abi mechanism. No
> other target options can be transferred.
> 
> >I think this just shows conceptual issues with the LTO approach...
> 
> I don't think running into a few problems demonstrates a conceptual
> problem when it works fine with some fairly small patches.
> 
> 
> Bernd
Jakub Jelinek Jan. 8, 2015, 2:11 p.m. UTC | #4
On Thu, Nov 20, 2014 at 01:27:08PM +0100, Bernd Schmidt wrote:
> On 11/13/2014 05:06 AM, Jan Hubicka wrote:
> >this patch adds infrastructure for proper streaming and merging of
> >TREE_TARGET_OPTION.
> 
> This breaks the offloading path via LTO since it introduces an
> incompatibility in LTO format between host and offload machine.
> 
> A very quick patch to fix it is below - the OpenACC testcase I was using
> seems to be working again with this. Thoughts, suggestions?

I actually think this patch makes a lot of sense.  Target option nodes
by definition are target specific, generally there is no mapping between
host and offloading target features.  So, the host target options
are not useful to the offloading target.  And, because the amount of bits
streamed is also target specific, say x86_64 will have different and
incompatible cl_target_option_stream_{out,in} from nvptx, and even
for Intel MIC offloading it doesn't make much sense, what CPU is certain
function targetting doesn't necessarily have any relation to the Intel MIC
that will offload it.

I agree the strcmp (section_name_prefix, LTO_SECTION_NAME_PREFIX) == 0
checks in the patch aren't very nice, that could be replaced by
some bool flag alongside of section_name_prefix that would be set
where section_name_prefix is set:
cgraphunit.c:	  section_name_prefix = OFFLOAD_SECTION_NAME_PREFIX;
cgraphunit.c:	  section_name_prefix = LTO_SECTION_NAME_PREFIX;
lto-streamer.c:const char *section_name_prefix = LTO_SECTION_NAME_PREFIX;
lto/lto.c:    section_name_prefix = OFFLOAD_SECTION_NAME_PREFIX;

(call it say bool lto_stream_offload_p ?).

Also note that the patch fixes all the current regressions in Intel MIC
(emulated) offloading caused by the r218767

	Jakub
Thomas Schwinge Jan. 9, 2015, 11:07 a.m. UTC | #5
Hi!

On Thu, 8 Jan 2015 15:11:49 +0100, Jakub Jelinek <jakub@redhat.com> wrote:
> On Thu, Nov 20, 2014 at 01:27:08PM +0100, Bernd Schmidt wrote:
> > On 11/13/2014 05:06 AM, Jan Hubicka wrote:
> > >this patch adds infrastructure for proper streaming and merging of
> > >TREE_TARGET_OPTION.
> > 
> > This breaks the offloading path via LTO since it introduces an
> > incompatibility in LTO format between host and offload machine.
> > 
> > A very quick patch to fix it is below - the OpenACC testcase I was using
> > seems to be working again with this. Thoughts, suggestions?
> 
> I actually think

Thanks for picking up this issue!

> this patch makes a lot of sense.  Target option nodes
> by definition are target specific, generally there is no mapping between
> host and offloading target features.  So, the host target options
> are not useful to the offloading target.  And, because the amount of bits
> streamed is also target specific, say x86_64 will have different and
> incompatible cl_target_option_stream_{out,in} from nvptx, and even
> for Intel MIC offloading it doesn't make much sense, what CPU is certain
> function targetting doesn't necessarily have any relation to the Intel MIC
> that will offload it.

> Also note that the patch fixes all the current regressions in Intel MIC
> (emulated) offloading caused by the r218767

(Which has been filed as <https://gcc.gnu.org/PR64412>, by the way.)

I'm confirming that Bernd's patch resolves the intelmic offloading
regressions, but I still do see issues in all nvptx offloading, but
cannot tell yet what's going on.  (Reverting Honza's patch resolves
these; but maybe it's something that is solely an issue with the nvptx
offloading path.)


Grüße,
 Thomas
diff mbox

Patch

diff --git a/gcc/lto-streamer-out.c b/gcc/lto-streamer-out.c
index be041e9..3c4b8c9 100644
--- a/gcc/lto-streamer-out.c
+++ b/gcc/lto-streamer-out.c
@@ -65,7 +65,7 @@  along with GCC; see the file COPYING3.  If not see
 #include "streamer-hooks.h"
 #include "cfgloop.h"
 #include "builtins.h"
-
+#include "lto-section-names.h"
 
 static void lto_write_tree (struct output_block*, tree, bool);
 
@@ -944,7 +944,9 @@  hash_tree (struct streamer_tree_cache_d *cache, hash_map<tree, hashval_t> *map,
     hstate.add (TRANSLATION_UNIT_LANGUAGE (t),
 			strlen (TRANSLATION_UNIT_LANGUAGE (t)));
 
-  if (CODE_CONTAINS_STRUCT (code, TS_TARGET_OPTION))
+  if (CODE_CONTAINS_STRUCT (code, TS_TARGET_OPTION)
+      /* We don't stream these when passing things to a different target.  */
+      && strcmp (section_name_prefix, LTO_SECTION_NAME_PREFIX) == 0)
     hstate.add_wide_int (cl_target_option_hash (TREE_TARGET_OPTION (t)));
 
   if (CODE_CONTAINS_STRUCT (code, TS_OPTIMIZATION))
diff --git a/gcc/tree-streamer-in.c b/gcc/tree-streamer-in.c
index a2a2382..88d36d3 100644
--- a/gcc/tree-streamer-in.c
+++ b/gcc/tree-streamer-in.c
@@ -514,8 +514,10 @@  unpack_value_fields (struct data_in *data_in, struct bitpack_d *bp, tree expr)
 	vec_safe_grow (CONSTRUCTOR_ELTS (expr), length);
     }
 
+#ifndef ACCEL_COMPILER
   if (CODE_CONTAINS_STRUCT (code, TS_TARGET_OPTION))
     cl_target_option_stream_in (data_in, bp, TREE_TARGET_OPTION (expr));
+#endif
 
   if (code == OMP_CLAUSE)
     unpack_ts_omp_clause_value_fields (data_in, bp, expr);
@@ -779,7 +781,9 @@  lto_input_ts_function_decl_tree_pointers (struct lto_input_block *ib,
   DECL_VINDEX (expr) = stream_read_tree (ib, data_in);
   /* DECL_STRUCT_FUNCTION is loaded on demand by cgraph_get_body.  */
   DECL_FUNCTION_PERSONALITY (expr) = stream_read_tree (ib, data_in);
+#ifndef ACCEL_COMPILER
   DECL_FUNCTION_SPECIFIC_TARGET (expr) = stream_read_tree (ib, data_in);
+#endif
   DECL_FUNCTION_SPECIFIC_OPTIMIZATION (expr) = stream_read_tree (ib, data_in);
 
   /* If the file contains a function with an EH personality set,
diff --git a/gcc/tree-streamer-out.c b/gcc/tree-streamer-out.c
index b959454..fca101e 100644
--- a/gcc/tree-streamer-out.c
+++ b/gcc/tree-streamer-out.c
@@ -47,6 +47,7 @@  along with GCC; see the file COPYING3.  If not see
 #include "tree-streamer.h"
 #include "data-streamer.h"
 #include "streamer-hooks.h"
+#include "lto-section-names.h"
 
 /* Output the STRING constant to the string
    table in OB.  Then put the index onto the INDEX_STREAM.  */
@@ -463,7 +464,9 @@  streamer_pack_tree_bitfields (struct output_block *ob,
   if (CODE_CONTAINS_STRUCT (code, TS_CONSTRUCTOR))
     bp_pack_var_len_unsigned (bp, CONSTRUCTOR_NELTS (expr));
 
-  if (CODE_CONTAINS_STRUCT (code, TS_TARGET_OPTION))
+  if (CODE_CONTAINS_STRUCT (code, TS_TARGET_OPTION)
+      /* Don't stream these when passing things to a different target.  */
+      && strcmp (section_name_prefix, LTO_SECTION_NAME_PREFIX) == 0)
     cl_target_option_stream_out (ob, bp, TREE_TARGET_OPTION (expr));
 
   if (code == OMP_CLAUSE)
@@ -678,7 +681,9 @@  write_ts_function_decl_tree_pointers (struct output_block *ob, tree expr,
   stream_write_tree (ob, DECL_VINDEX (expr), ref_p);
   /* DECL_STRUCT_FUNCTION is handled by lto_output_function.  */
   stream_write_tree (ob, DECL_FUNCTION_PERSONALITY (expr), ref_p);
-  stream_write_tree (ob, DECL_FUNCTION_SPECIFIC_TARGET (expr), ref_p);
+  /* Don't stream these when passing things to a different target.  */
+  if (strcmp (section_name_prefix, LTO_SECTION_NAME_PREFIX) == 0)
+    stream_write_tree (ob, DECL_FUNCTION_SPECIFIC_TARGET (expr), ref_p);
   stream_write_tree (ob, DECL_FUNCTION_SPECIFIC_OPTIMIZATION (expr), ref_p);
 }