From patchwork Fri May 12 12:46:21 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Tobias Burnus X-Patchwork-Id: 1780581 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=2620:52:3:1:0:246e:9693:128c; helo=sourceware.org; envelope-from=gcc-patches-bounces+incoming=patchwork.ozlabs.org@gcc.gnu.org; receiver=) Received: from sourceware.org (server2.sourceware.org [IPv6:2620:52:3:1:0:246e:9693:128c]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-384) server-digest SHA384) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4QHpRp4qTGz20KD for ; Fri, 12 May 2023 22:46:46 +1000 (AEST) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id B70F33856DFB for ; Fri, 12 May 2023 12:46:44 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from esa3.mentor.iphmx.com (esa3.mentor.iphmx.com [68.232.137.180]) by sourceware.org (Postfix) with ESMTPS id 250523858C54 for ; Fri, 12 May 2023 12:46:32 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 250523858C54 Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=codesourcery.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=mentor.com X-IronPort-AV: E=Sophos;i="5.99,269,1677571200"; d="diff'?scan'208";a="5045733" Received: from orw-gwy-02-in.mentorg.com ([192.94.38.167]) by esa3.mentor.iphmx.com with ESMTP; 12 May 2023 04:46:26 -0800 IronPort-SDR: UlQPo+HQS584e7SG0nOhd9WEiU8vztzILBYntvc7t6Sd2EsuKGRMM5IjHjtVi7zHNslSyVBHjc 8cpmiVi8Ox89DdsymyBHw18QbWz/dMT5vGvTNDWA9q2plT3GMr/uBGjXXYxYvBagWY1GDEsbnL 9CI1bsV9SqlcrHHaa5qoAtyqCQ3hatDsNomT8/q7Iar5Aw4sut+rjZ5B6e8Whvo4kmzLUcO/zv oVN+HMxMDsE8cIFpuqHYY0PZz/MDlvB5FYUGotrt0V88pZuBXhHF5/C5e+iJDSzSLw8+7k7ELa KRI= Message-ID: <74555a9a-8eb8-14ac-a5bd-d0ab15c9acc1@codesourcery.com> Date: Fri, 12 May 2023 14:46:21 +0200 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Thunderbird/102.10.1 Content-Language: en-US To: gcc-patches , Richard Biener From: Tobias Burnus Subject: [Patch] LTO: Fix writing of toplevel asm with offloading [PR109816] X-Originating-IP: [137.202.0.90] X-ClientProxiedBy: svr-ies-mbx-12.mgc.mentorg.com (139.181.222.12) To svr-ies-mbx-12.mgc.mentorg.com (139.181.222.12) X-Spam-Status: No, score=-11.3 required=5.0 tests=BAYES_00, GIT_PATCH_0, HEADER_FROM_DIFFERENT_DOMAINS, KAM_DMARC_STATUS, SPF_HELO_PASS, SPF_PASS, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gcc-patches-bounces+incoming=patchwork.ozlabs.org@gcc.gnu.org Sender: "Gcc-patches" Long standing issue but as top-level 'asm' statement were rare, it did not show up. However, the fix for PR108969 in commit r14-321-g9a41d2cdbcd added code +#elif defined(_GLIBCXX_SYMVER_GNU) + __extension__ __asm (".globl _ZSt21ios_base_library_initv"); q libstdc++-v3/include/std/iostream. This was then duly written by the offloading-device lto1 for digestion by the device-target assembler. While the llvm-mc linker user by GCN did accept .globl, nvptx's ptxas did choke on it. Additionally, as the assembly was already written for offloading, the output was lost on the host when using LTO for not only for offload but for real (i.e. with -flto). Has someone an idea how to check whether the offloading-code assembler does not contain the _ZSt21ios_base_library_initv while the host-side (before or after LTO) should contain it, but only with _GLIBCXX_SYMVER_GNU? Otherwise, the testcase tests only and at least whether it breaks with nvptx as ptxas does not like the symbol. * * * Tested (manually + running the OvO and sollve-testsuite) on x86-64-gnu-linux with nvptx offloading and with "make check -k" on x86-64-gnu-linux, albeit without offloading configured. The installed-build regtesting of "make check-target-libgomp" seems to be currently broken as it does run all checking code (check_effective_target...) but does not seem to find any actual testcase to be run, probably a side effect of the recent testsuite changes. OK for mainline and GCC 13? Tobias ----------------- Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht München, HRB 106955 LTO: Fix writing of toplevel asm with offloading [PR109816] When offloading was enabled, top-level 'asm' were added to the offloading section, confusing assemblers which did not support the syntax. Additionally, with offloading and -flto, the top-level assembler code did not end up in the host files. As r14-321-g9a41d2cdbcd added top-level 'asm' to some libstdc++ header files, the issue became more apparent, causing fails with nvptx for C++ testcases. PR libstdc++/109816 gcc/ChangeLog: * lto-cgraph.cc (output_symtab): Guard lto_output_toplevel_asms by '!lto_stream_offload_p'. libgomp/ChangeLog: * testsuite/libgomp.c++/target-map-class-1.C: New test. * testsuite/libgomp.c++/target-map-class-2.C: New test. gcc/lto-cgraph.cc | 2 +- libgomp/testsuite/libgomp.c++/target-map-class-1.C | 98 ++++++++++++++++++++++ libgomp/testsuite/libgomp.c++/target-map-class-2.C | 6 ++ 3 files changed, 105 insertions(+), 1 deletion(-) diff --git a/gcc/lto-cgraph.cc b/gcc/lto-cgraph.cc index 805c785..aed5e9d 100644 --- a/gcc/lto-cgraph.cc +++ b/gcc/lto-cgraph.cc @@ -1020,7 +1020,7 @@ output_symtab (void) When doing WPA we must output every asm just once. Since we do not partition asm nodes at all, output them to first output. This is kind of hack, but should work well. */ - if (!asm_nodes_output) + if (!asm_nodes_output && !lto_stream_offload_p) { asm_nodes_output = true; lto_output_toplevel_asms (); diff --git a/libgomp/testsuite/libgomp.c++/target-map-class-1.C b/libgomp/testsuite/libgomp.c++/target-map-class-1.C new file mode 100644 index 0000000..ad4802d --- /dev/null +++ b/libgomp/testsuite/libgomp.c++/target-map-class-1.C @@ -0,0 +1,98 @@ +/* PR middle-end/109816 */ + +/* This variant: without -flto, see target-map-class-2.C for -flto. */ + +/* iostream.h adds 'globl _ZSt21ios_base_library_initv' with _GLIBCXX_SYMVER_GNU, + but it shouldn't end up in the offload assembly but only in the host assembly. */ + +/* Example based on sollve_vv's test_target_data_map_classes.cpp; however, + relevant is only the 'include' and not the actual executable code. */ + +#include +#include + +using namespace std; + +#define N 1000 + +struct A +{ + int *h_array; + int size, sum; + + A (int *array, const int s) : h_array(array), size(s), sum(0) { } + ~A() { h_array = NULL; } +}; + +void +test_map_tofrom_class_heap () +{ + int *array = new int[N]; + A *obj = new A (array, N); + + #pragma omp target map(from: array[:N]) map(tofrom: obj[:1]) + { + int *tmp_h_array = obj->h_array; + obj->h_array = array; + int tmp = 0; + for (int i = 0; i < N; ++i) + { + obj->h_array[i] = 4*i; + tmp += 3; + } + obj->h_array = tmp_h_array; + obj->sum = tmp; + } + + for (int i = 0; i < N; ++i) + if (obj->h_array[i] != 4*i) + __builtin_abort (); + + if (3*N != obj->sum) + { + std::cout << "sum: " << obj->sum << std::endl; + __builtin_abort (); + } + + delete obj; + delete[] array; +} + +void +test_map_tofrom_class_stack () +{ + int array[N]; + A obj(array, N); + + #pragma omp target map(from: array[:N]) map(tofrom: obj) + { + int *tmp_h_array = obj.h_array; + obj.h_array = array; + int tmp = 0; + for (int i = 0; i < N; ++i) + { + obj.h_array[i] = 7*i; + tmp += 5; + } + obj.h_array = tmp_h_array; + obj.sum = tmp; + } + + for (int i = 0; i < N; ++i) + if (obj.h_array[i] != 7*i) + __builtin_abort (); + + if (5*N != obj.sum) + { + std::cout << "sum: " << obj.sum << std::endl; + __builtin_abort (); + } +} + +int +main() +{ + test_map_tofrom_class_heap(); + test_map_tofrom_class_stack(); + return 0; +} diff --git a/libgomp/testsuite/libgomp.c++/target-map-class-2.C b/libgomp/testsuite/libgomp.c++/target-map-class-2.C new file mode 100644 index 0000000..1ef20f7 --- /dev/null +++ b/libgomp/testsuite/libgomp.c++/target-map-class-2.C @@ -0,0 +1,6 @@ +/* { dg-additional-options "-flto" } */ +/* PR middle-end/109816 */ + +/* This variant: with -flto, see target-map-class-1.C for without -flto. */ + +#include "target-map-class-1.C"