diff mbox series

LTO: Fix writing of toplevel asm with offloading [PR109816]

Message ID 74555a9a-8eb8-14ac-a5bd-d0ab15c9acc1@codesourcery.com
State New
Headers show
Series LTO: Fix writing of toplevel asm with offloading [PR109816] | expand

Commit Message

Tobias Burnus May 12, 2023, 12:46 p.m. UTC
Long standing issue but as top-level 'asm' statement were rare, it did not show up.
However, the fix for PR108969 in commit r14-321-g9a41d2cdbcd added code

+#elif defined(_GLIBCXX_SYMVER_GNU)
+  __extension__ __asm (".globl _ZSt21ios_base_library_initv");
q
libstdc++-v3/include/std/iostream. This was then duly written by the offloading-device
lto1 for digestion by the device-target assembler. While the llvm-mc linker user by
GCN did accept .globl, nvptx's ptxas did choke on it.

Additionally, as the assembly was already written for offloading, the output was
lost on the host when using LTO for not only for offload but for real (i.e. with -flto).

Has someone an idea how to check whether the offloading-code assembler does not
contain the _ZSt21ios_base_library_initv while the host-side (before or after LTO)
should contain it, but only with _GLIBCXX_SYMVER_GNU?
Otherwise, the testcase tests only and at least whether it breaks with nvptx
as ptxas does not like the symbol.

* * *

Tested (manually + running the OvO and sollve-testsuite) on x86-64-gnu-linux with nvptx
offloading and with "make check -k" on x86-64-gnu-linux, albeit without offloading configured.
The installed-build regtesting of "make check-target-libgomp" seems to be currently broken
as it does run all checking code (check_effective_target...) but does not seem to find
any actual testcase to be run, probably a side effect of the recent testsuite changes.

OK for mainline and GCC 13?

Tobias
-----------------
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht München, HRB 106955

Comments

Richard Biener May 12, 2023, 1:10 p.m. UTC | #1
On Fri, 12 May 2023, Tobias Burnus wrote:

> Long standing issue but as top-level 'asm' statement were rare, it did not
> show up.
> However, the fix for PR108969 in commit r14-321-g9a41d2cdbcd added code
> 
> +#elif defined(_GLIBCXX_SYMVER_GNU)
> +  __extension__ __asm (".globl _ZSt21ios_base_library_initv");
> q
> libstdc++-v3/include/std/iostream. This was then duly written by the
> offloading-device
> lto1 for digestion by the device-target assembler. While the llvm-mc linker
> user by
> GCN did accept .globl, nvptx's ptxas did choke on it.
> 
> Additionally, as the assembly was already written for offloading, the output
> was
> lost on the host when using LTO for not only for offload but for real (i.e.
> with -flto).
> 
> Has someone an idea how to check whether the offloading-code assembler does
> not
> contain the _ZSt21ios_base_library_initv while the host-side (before or after
> LTO)
> should contain it, but only with _GLIBCXX_SYMVER_GNU?
> Otherwise, the testcase tests only and at least whether it breaks with nvptx
> as ptxas does not like the symbol.
> 
> * * *
> 
> Tested (manually + running the OvO and sollve-testsuite) on x86-64-gnu-linux
> with nvptx
> offloading and with "make check -k" on x86-64-gnu-linux, albeit without
> offloading configured.
> The installed-build regtesting of "make check-target-libgomp" seems to be
> currently broken
> as it does run all checking code (check_effective_target...) but does not seem
> to find
> any actual testcase to be run, probably a side effect of the recent testsuite
> changes.
> 
> OK for mainline and GCC 13?

Without any idea about the details above the patch looks quite reasonable,
thus OK for trunk.  OK for branch after a while if no issues showed up.

Thanks,
Richard.
Thomas Schwinge May 13, 2023, 9:18 a.m. UTC | #2
Hi!

On 2023-05-12T14:46:21+0200, Tobias Burnus <tobias@codesourcery.com> wrote:
> Long standing issue but as top-level 'asm' statement were rare, it did not show up.

Thanks for tracking this down, and fixing it!  Presumably latent
"forever" (here: since code offloading support was added, approximately
one decade ago).

> However, the fix for PR108969 in commit r14-321-g9a41d2cdbcd added code
>
> +#elif defined(_GLIBCXX_SYMVER_GNU)
> +  __extension__ __asm (".globl _ZSt21ios_base_library_initv");
> q
> libstdc++-v3/include/std/iostream. This was then duly written by the offloading-device
> lto1 for digestion by the device-target assembler. While the llvm-mc linker user by
> GCN did accept .globl, nvptx's ptxas did choke on it.
>
> Additionally, as the assembly was already written for offloading, the output was
> lost on the host when using LTO for not only for offload but for real (i.e. with -flto).

> Has someone an idea how to check whether the offloading-code assembler does not
> contain the _ZSt21ios_base_library_initv while the host-side (before or after LTO)
> should contain it, but only with _GLIBCXX_SYMVER_GNU?
> Otherwise, the testcase tests only and at least whether it breaks with nvptx
> as ptxas does not like the symbol.

We can construct test cases with their own top-level 'asm', outside of
the libstdc++ context?  I'll try to come up with something.


> The installed-build regtesting of "make check-target-libgomp" seems to be currently broken
> as it does run all checking code (check_effective_target...) but does not seem to find
> any actual testcase to be run, probably a side effect of the recent testsuite changes.

FUD.  ;-O

We've since determined that Tobias' testing problem was due to
'//net/[...]' (double-slash) paths usage (with '/net' served by
'/etc/auto.master:/net -hosts'), which apparently confused DejaGnu/TCL
'find' (as used in 'libgomp/testsuite/libgomp.*/*.exp').

Per
<https://inbox.sourceware.org/877cte9cfa.fsf@euler.schwinge.homeip.net>:

| By the way, all changes (individually) tested in a number of different
| configurations: '--enable-languages=[...]', native vs. cross, build-tree
| vs. installed testing, etc.


Grüße
 Thomas
-----------------
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht München, HRB 106955
diff mbox series

Patch

LTO: Fix writing of toplevel asm with offloading [PR109816]

When offloading was enabled, top-level 'asm' were added to the offloading section,
confusing assemblers which did not support the syntax. Additionally, with offloading
and -flto, the top-level assembler code did not end up in the host files.

As r14-321-g9a41d2cdbcd added top-level 'asm' to some libstdc++ header files, the issue became
more apparent, causing fails with nvptx for C++ testcases.

	PR libstdc++/109816

gcc/ChangeLog:
	* lto-cgraph.cc (output_symtab): Guard lto_output_toplevel_asms by
	'!lto_stream_offload_p'.

libgomp/ChangeLog:

	* testsuite/libgomp.c++/target-map-class-1.C: New test.
	* testsuite/libgomp.c++/target-map-class-2.C: New test.

 gcc/lto-cgraph.cc                                  |  2 +-
 libgomp/testsuite/libgomp.c++/target-map-class-1.C | 98 ++++++++++++++++++++++
 libgomp/testsuite/libgomp.c++/target-map-class-2.C |  6 ++
 3 files changed, 105 insertions(+), 1 deletion(-)

diff --git a/gcc/lto-cgraph.cc b/gcc/lto-cgraph.cc
index 805c785..aed5e9d 100644
--- a/gcc/lto-cgraph.cc
+++ b/gcc/lto-cgraph.cc
@@ -1020,7 +1020,7 @@  output_symtab (void)
      When doing WPA we must output every asm just once.  Since we do not partition asm
      nodes at all, output them to first output.  This is kind of hack, but should work
      well.  */
-  if (!asm_nodes_output)
+  if (!asm_nodes_output && !lto_stream_offload_p)
     {
       asm_nodes_output = true;
       lto_output_toplevel_asms ();
diff --git a/libgomp/testsuite/libgomp.c++/target-map-class-1.C b/libgomp/testsuite/libgomp.c++/target-map-class-1.C
new file mode 100644
index 0000000..ad4802d
--- /dev/null
+++ b/libgomp/testsuite/libgomp.c++/target-map-class-1.C
@@ -0,0 +1,98 @@ 
+/* PR middle-end/109816  */
+
+/* This variant: without -flto, see target-map-class-2.C for -flto. */
+
+/* iostream.h adds 'globl _ZSt21ios_base_library_initv' with _GLIBCXX_SYMVER_GNU,
+   but it shouldn't end up in the offload assembly but only in the host assembly. */
+
+/* Example based on sollve_vv's test_target_data_map_classes.cpp; however,
+   relevant is only the 'include' and not the actual executable code.  */
+
+#include <iostream>
+#include <omp.h>
+
+using namespace std;
+
+#define N 1000
+
+struct A
+{
+  int *h_array;
+  int size, sum;
+
+  A (int *array, const int s) : h_array(array), size(s), sum(0) { }
+  ~A() { h_array = NULL; }
+};
+
+void
+test_map_tofrom_class_heap ()
+{
+  int *array = new int[N];
+  A *obj = new A (array, N);
+
+  #pragma omp target map(from: array[:N]) map(tofrom: obj[:1])
+    {
+      int *tmp_h_array = obj->h_array;
+      obj->h_array = array;
+      int tmp = 0;
+      for (int i = 0; i < N; ++i)
+	{
+	  obj->h_array[i] = 4*i;
+	  tmp += 3;
+	}
+      obj->h_array = tmp_h_array;
+      obj->sum = tmp;
+    }
+
+  for (int i = 0; i < N; ++i)
+    if (obj->h_array[i] != 4*i)
+      __builtin_abort ();
+
+  if (3*N != obj->sum)
+    {
+      std::cout << "sum: " << obj->sum << std::endl;
+      __builtin_abort ();
+    }
+
+  delete obj;
+  delete[] array;
+}
+
+void
+test_map_tofrom_class_stack ()
+{
+  int array[N];
+  A obj(array, N);
+
+  #pragma omp target map(from: array[:N]) map(tofrom: obj)
+    {
+      int *tmp_h_array = obj.h_array;
+      obj.h_array = array;
+      int tmp = 0;
+      for (int i = 0; i < N; ++i)
+	{
+	  obj.h_array[i] = 7*i;
+	  tmp += 5;
+	}
+      obj.h_array = tmp_h_array;
+      obj.sum = tmp;
+    }
+
+  for (int i = 0; i < N; ++i)
+    if (obj.h_array[i] != 7*i)
+      __builtin_abort ();
+
+  if (5*N != obj.sum)
+    {
+      std::cout << "sum: " << obj.sum << std::endl;
+      __builtin_abort ();
+    }
+}
+
+int
+main()
+{
+  test_map_tofrom_class_heap();
+  test_map_tofrom_class_stack();
+  return 0;
+}
diff --git a/libgomp/testsuite/libgomp.c++/target-map-class-2.C b/libgomp/testsuite/libgomp.c++/target-map-class-2.C
new file mode 100644
index 0000000..1ef20f7
--- /dev/null
+++ b/libgomp/testsuite/libgomp.c++/target-map-class-2.C
@@ -0,0 +1,6 @@ 
+/* { dg-additional-options "-flto" }  */
+/* PR middle-end/109816  */
+
+/* This variant: with -flto, see target-map-class-1.C for without -flto. */
+
+#include "target-map-class-1.C"