{"id":2224025,"url":"http://patchwork.ozlabs.org/api/patches/2224025/?format=json","web_url":"http://patchwork.ozlabs.org/project/gcc/patch/ab8f2c2f-e5ef-439f-95fe-4fddc885737d@baylibre.com/","project":{"id":17,"url":"http://patchwork.ozlabs.org/api/projects/17/?format=json","name":"GNU Compiler Collection","link_name":"gcc","list_id":"gcc-patches.gcc.gnu.org","list_email":"gcc-patches@gcc.gnu.org","web_url":null,"scm_url":null,"webscm_url":null,"list_archive_url":"","list_archive_url_format":"","commit_url_format":""},"msgid":"<ab8f2c2f-e5ef-439f-95fe-4fddc885737d@baylibre.com>","list_archive_url":null,"date":"2026-04-16T14:46:38","name":"nvptx: support -march=native with offloading","commit_ref":null,"pull_url":null,"state":"new","archived":false,"hash":"a36c32b4b222462e69af3ea9d7a7f21b3b76cb6a","submitter":{"id":87873,"url":"http://patchwork.ozlabs.org/api/people/87873/?format=json","name":"Tobias Burnus","email":"tburnus@baylibre.com"},"delegate":null,"mbox":"http://patchwork.ozlabs.org/project/gcc/patch/ab8f2c2f-e5ef-439f-95fe-4fddc885737d@baylibre.com/mbox/","series":[{"id":500174,"url":"http://patchwork.ozlabs.org/api/series/500174/?format=json","web_url":"http://patchwork.ozlabs.org/project/gcc/list/?series=500174","date":"2026-04-16T14:46:38","name":"nvptx: support -march=native with offloading","version":1,"mbox":"http://patchwork.ozlabs.org/series/500174/mbox/"}],"comments":"http://patchwork.ozlabs.org/api/patches/2224025/comments/","check":"pending","checks":"http://patchwork.ozlabs.org/api/patches/2224025/checks/","tags":{},"related":[],"headers":{"Return-Path":"<gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org>","X-Original-To":["incoming@patchwork.ozlabs.org","gcc-patches@gcc.gnu.org"],"Delivered-To":["patchwork-incoming@legolas.ozlabs.org","gcc-patches@gcc.gnu.org"],"Authentication-Results":["legolas.ozlabs.org;\n\tdkim=pass (2048-bit key;\n unprotected) header.d=baylibre-com.20251104.gappssmtp.com\n 
header.i=@baylibre-com.20251104.gappssmtp.com header.a=rsa-sha256\n header.s=20251104 header.b=BRYladFD;\n\tdkim-atps=neutral","legolas.ozlabs.org;\n spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org\n (client-ip=2620:52:6:3111::32; helo=vm01.sourceware.org;\n envelope-from=gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org;\n receiver=patchwork.ozlabs.org)","sourceware.org;\n\tdkim=pass (2048-bit key,\n unprotected) header.d=baylibre-com.20251104.gappssmtp.com\n header.i=@baylibre-com.20251104.gappssmtp.com header.a=rsa-sha256\n header.s=20251104 header.b=BRYladFD","sourceware.org;\n dmarc=none (p=none dis=none) header.from=baylibre.com","sourceware.org; spf=pass smtp.mailfrom=baylibre.com","server2.sourceware.org;\n arc=none smtp.remote-ip=209.85.221.49"],"Received":["from vm01.sourceware.org (vm01.sourceware.org\n [IPv6:2620:52:6:3111::32])\n\t(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)\n\t key-exchange x25519 server-signature ECDSA (secp384r1) server-digest SHA384)\n\t(No client certificate requested)\n\tby legolas.ozlabs.org (Postfix) with ESMTPS id 4fxLSw69LPz1yG9\n\tfor <incoming@patchwork.ozlabs.org>; Fri, 17 Apr 2026 00:47:11 +1000 (AEST)","from vm01.sourceware.org (localhost [127.0.0.1])\n\tby sourceware.org (Postfix) with ESMTP id 8101D4BAE7C2\n\tfor <incoming@patchwork.ozlabs.org>; Thu, 16 Apr 2026 14:47:09 +0000 (GMT)","from mail-wr1-f49.google.com (mail-wr1-f49.google.com\n [209.85.221.49])\n by sourceware.org (Postfix) with ESMTPS id 7CA194BB3B81\n for <gcc-patches@gcc.gnu.org>; Thu, 16 Apr 2026 14:46:41 +0000 (GMT)","by mail-wr1-f49.google.com with SMTP id\n ffacd0b85a97d-43d7605ec91so4199009f8f.3\n for <gcc-patches@gcc.gnu.org>; Thu, 16 Apr 2026 07:46:41 -0700 (PDT)","from ?IPV6:2001:16b8:3df4:5200:4f9d:fe54:a905:1e3f?\n ([2001:16b8:3df4:5200:4f9d:fe54:a905:1e3f])\n by smtp.gmail.com with ESMTPSA id\n ffacd0b85a97d-43ead3d5ea9sm14473881f8f.21.2026.04.16.07.46.39\n (version=TLS1_3 
cipher=TLS_AES_128_GCM_SHA256 bits=128/128);\n Thu, 16 Apr 2026 07:46:39 -0700 (PDT)"],"DKIM-Filter":["OpenDKIM Filter v2.11.0 sourceware.org 8101D4BAE7C2","OpenDKIM Filter v2.11.0 sourceware.org 7CA194BB3B81"],"DMARC-Filter":"OpenDMARC Filter v1.4.2 sourceware.org 7CA194BB3B81","ARC-Filter":"OpenARC Filter v1.0.0 sourceware.org 7CA194BB3B81","ARC-Seal":"i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1776350801; cv=none;\n b=eYwZSmcjl2PoS/anTUcxyP1Jw5iQ3CL7FsNu9g0qi3395iV3NlLQehMQzbqvfMF/rax4cYKVSdeE87RMHw08P6jyiw+xkg/HA3omi0MBBgJquWBi6GacX6E3KMV6y14AMNR5JQWc03rAy82JcCcnLGvfw+5Ro1I2oNXCZyf813w=","ARC-Message-Signature":"i=1; a=rsa-sha256; d=sourceware.org; s=key;\n t=1776350801; c=relaxed/simple;\n bh=MyGqRL0yExLLIHbzCN7ZL4I8GmfWePdck6G0hZuvzOU=;\n h=DKIM-Signature:Message-ID:Date:MIME-Version:To:From:Subject;\n b=f64liY0A+5t6Io9yontiyrxatpDqcooCZRgYzOi7jp18K19ujI/NGFJOwjzPk3qq2YiMJHrl/hdVw7zjlmwjMlDC34HiMBfDvAlIkaYAqi1tfOc4bPovZCtQ6khVFR/O/ZMi9rZGTQpp5Zr4binh1jsL1XadksJyQfmk7T4sj/Q=","ARC-Authentication-Results":"i=1; server2.sourceware.org","DKIM-Signature":"v=1; a=rsa-sha256; c=relaxed/relaxed;\n d=baylibre-com.20251104.gappssmtp.com; s=20251104; t=1776350800;\n x=1776955600;\n darn=gcc.gnu.org;\n h=subject:from:cc:to:content-language:user-agent:mime-version:date\n :message-id:from:to:cc:subject:date:message-id:reply-to;\n bh=qEHhSfUKa8J5oYTzirydgSgMxn/+Nu9cBQW8oQhQmKQ=;\n b=BRYladFDB7IWySCSyNQEUAxQADeUm+br5vCEHRo3LKNy98eUbfsAyyNudUGVWZWIHy\n uQgeK6WN8W8VtLrZor27VjkIfGJ4c9he6ndH6R/k84DrGHAp6D8lYzmONyGdPs9koQrt\n CSx6u0JarsFNPRUJAP6qdO33ukoGvq8VMPsxYjXh1uD9Sdp4Xwhi9GIvuDb3nnLoIL3f\n gph4QgMCOE3zwQ2YNzKjb7EkdaTqwy1cJHpUr0mmSjt3010i25L0uWQw3ThqNiXrQZ1l\n IE09uh0G8q9cucYC/D8deqIwCV9KrNr+cOrIx9Di+GIvMnSyjRfVT6mAo8mTwHtNMAqu\n v+5w==","X-Google-DKIM-Signature":"v=1; a=rsa-sha256; c=relaxed/relaxed;\n d=1e100.net; s=20251104; t=1776350800; x=1776955600;\n h=subject:from:cc:to:content-language:user-agent:mime-version:date\n 
:message-id:x-gm-gg:x-gm-message-state:from:to:cc:subject:date\n :message-id:reply-to;\n bh=qEHhSfUKa8J5oYTzirydgSgMxn/+Nu9cBQW8oQhQmKQ=;\n b=ItcFtcktLbTUBJdnLRebH4RJWrbV8c5j7X0QTpcfBoSC+VdtWNsFAqLsG1NlZggWc/\n EJ7CdnoMl0sWdV+PwTSifW3KY6j+JBzB+x5nODYUKpACbrhAixu5Aex+3lARWY/txt/7\n lpOXd/TW8Yxz3SUnYy2tFpkeuBSDkt2PE5PFzmrT5R7z9/U8g5fPHVX5IL9Nl5NJou+k\n 4NqvbOeXvTl9TmRhYpW1sFx1Yp83OxashRabjJQO+2syQVd2kLoM3G4O8Fc7CS5wY6Rf\n LeuW5T/P5jGdmAg6yaMkpGyvcjjLKuWOaODbit0dBsngZyxe/iwoNLkbaR0XRIMCCcth\n 6b9g==","X-Gm-Message-State":"AOJu0YxI5OfoHV+93uodWYTlY/BkoEKvYOlyJv9jePbOia9x1twVuSsi\n DDNlQJROfly6yz2Xf2E1ej7XYXB4GDIaiyqu9TOWzxL/7CzmRD+5txcxa7Ftm4hz5p4e9Ld6QoW\n vhbDfRJF+/g==","X-Gm-Gg":"AeBDies25Rg9IRg3BCpicRmgGSsNKRVc02Yz5MuNdoLCkRJ2Crh4J459lxh1J+Pxa/f\n zNbR+BWIla0kme4wHWGTp6Gjb1Fd2kDkksWdxJEWE31aiDMNkBOVr9j1qxCH78Wk6YHfgOLKM9p\n NBtKlZaKzyJApBkbrzxJOjHykGetor5+6pV+0FX2YhHmD7IhfCiCMHIleKkwKRKZzCkfdVA47rs\n msIOsWtp/plu1lg/z8N6ghOXKr9fNvCsyut3p8XbsffgcgIES7ycWUJKk6hi91ulgRSzLnsisdj\n /xKZGSKOA0epSwNjaeXuUFYIGilP8Fp4ZjDQLZfyB9yY91jsEuiGlJowIqePsQud2vwYDqZ7w5E\n hPBHkVLg+NfLoqclYKMK5eEAwXQxirQaJHMcuIoR96pqgQ6oLJ+KUY7X7lM6TghFdVZThHyDfla\n BoCu2TzWDMoYp2LC5Za24dwLJ5TMXwoH50NEuQWo10nKSKeNCIVbyTYMHvkfv1eiPKVzNa1LrW1\n 3b62hf7//u/","X-Received":"by 2002:a05:6000:605:b0:43d:7b90:fa2a with SMTP id\n ffacd0b85a97d-43d7b90fe46mr25122907f8f.3.1776350800246;\n Thu, 16 Apr 2026 07:46:40 -0700 (PDT)","Content-Type":"multipart/mixed; boundary=\"------------H4jO7557KIFfBVbPnI5Xs0sV\"","Message-ID":"<ab8f2c2f-e5ef-439f-95fe-4fddc885737d@baylibre.com>","Date":"Thu, 16 Apr 2026 16:46:38 +0200","MIME-Version":"1.0","User-Agent":"Mozilla Thunderbird","Content-Language":"en-US","To":"gcc-patches <gcc-patches@gcc.gnu.org>,\n Thomas Schwinge <tschwinge@baylibre.com>","Cc":"Andrew Stubbs <ams@baylibre.com>,\n =?utf-8?q?Arsen_Arsenovi=C4=87?= <aarsenovic@baylibre.com>","From":"Tobias Burnus <tburnus@baylibre.com>","Subject":"[Patch] nvptx: support -march=native with 
offloading","X-BeenThere":"gcc-patches@gcc.gnu.org","X-Mailman-Version":"2.1.30","Precedence":"list","List-Id":"Gcc-patches mailing list <gcc-patches.gcc.gnu.org>","List-Unsubscribe":"<https://gcc.gnu.org/mailman/options/gcc-patches>,\n <mailto:gcc-patches-request@gcc.gnu.org?subject=unsubscribe>","List-Archive":"<https://gcc.gnu.org/pipermail/gcc-patches/>","List-Post":"<mailto:gcc-patches@gcc.gnu.org>","List-Help":"<mailto:gcc-patches-request@gcc.gnu.org?subject=help>","List-Subscribe":"<https://gcc.gnu.org/mailman/listinfo/gcc-patches>,\n <mailto:gcc-patches-request@gcc.gnu.org?subject=subscribe>","Errors-To":"gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org"},"content":"This came up on IRC when richi wondered whether there should be such a feature:\n   -foffload-options=-march=native\n\nThe obvious place to handle it seems to be mkoffload. Well, attached is an attempt\n  to do so for Nvidia, which works fine on my laptop:\n\n-march=native  becomes  -march-map=sm_86  which then becomes -misa=sm_80.\n\nThoughts, comments, remarks?\nOr even OK for GCC 17 mainline?\n\nTobias","diff":"nvptx: support -march=native with offloading\n\nSupport -foffload-options=nvptx-none=-march=native by querying\nin mkoffload the CUDA library for the compute capability of all\ndevices; pass the least capable one to the nvptx compiler as:\n  -march-map=sm_XX.\n\ngcc/ChangeLog:\n\n\t* config/nvptx/mkoffload.cc (get_cuda_error, get_native_arch): Add.\n\t(main): Call it for -march=native/-misa=native.\n\t* doc/invoke.texi (nvptx): Document that -march=native is supported.\n\n gcc/config/nvptx/mkoffload.cc | 90 +++++++++++++++++++++++++++++++++++++++++++\n gcc/doc/invoke.texi           |  4 +-\n 2 files changed, 93 insertions(+), 1 deletion(-)\n\ndiff --git a/gcc/config/nvptx/mkoffload.cc b/gcc/config/nvptx/mkoffload.cc\nindex 6a2a3bc939b..ba9abe88e94 100644\n--- a/gcc/config/nvptx/mkoffload.cc\n+++ b/gcc/config/nvptx/mkoffload.cc\n@@ -30,6 +30,7 @@\n #define IN_TARGET_CODE 1\n \n #include 
\"config.h\"\n+#define INCLUDE_DLFCN_H\n #include \"system.h\"\n #include \"coretypes.h\"\n #include \"obstack.h\"\n@@ -97,6 +98,92 @@ maybe_unlink (const char *file)\n     fprintf (stderr, \"[Leaving %s]\\n\", file);\n }\n \n+\n+/* Handle -march=native.  */\n+\n+typedef enum { CUDA_SUCCESS = 0 } CUresult;\n+\n+static const char *\n+get_cuda_error (CUresult err,\n+\t\tCUresult (*cuGetErrorString) (CUresult, const char **))\n+{\n+  static const char fallback[] = \"unknown error\";\n+  const char *msg;\n+\n+  if (cuGetErrorString && cuGetErrorString (err, &msg) == CUDA_SUCCESS)\n+    return msg;\n+  return fallback;\n+}\n+\n+static char *\n+get_native_arch ()\n+{\n+  const char *cuda_runtime_lib = \"libcuda.so.1\";\n+\n+  void *h = dlopen (cuda_runtime_lib, RTLD_LAZY);\n+  if (!h)\n+    fatal_error (input_location,\n+\t\t \"Requested %<-march=native%> but %<libcuda.so.1%> \"\n+\t\t \"cannot be opened\");\n+\n+  CUresult (*cuInit)(unsigned int);\n+  CUresult (*cuGetErrorString) (CUresult, const char **);\n+  CUresult (*cuDeviceGetCount) (int *);\n+  CUresult (*cuDeviceComputeCapability) (int *, int *, int);\n+\n+  cuInit = (typeof cuInit) dlsym (h, \"cuInit\");\n+  cuGetErrorString = (typeof cuGetErrorString) dlsym (h, \"cuGetErrorString\");\n+  cuDeviceGetCount = (typeof cuDeviceGetCount) dlsym (h, \"cuDeviceGetCount\");\n+  cuDeviceComputeCapability\n+    = (typeof cuDeviceComputeCapability) dlsym (h, \"cuDeviceComputeCapability\");\n+\n+  if (!cuInit || !cuDeviceGetCount || !cuDeviceComputeCapability)\n+    fatal_error (input_location,\n+\t\t \"requested %<-march=native%> but %<libcuda.so.1%> lacks \"\n+\t\t \"API functions\");\n+\n+  int n;\n+  CUresult res = (*cuInit) (0);\n+  if (res == CUDA_SUCCESS)\n+    res = (*cuDeviceGetCount) (&n);\n+  if (res != CUDA_SUCCESS)\n+    fatal_error (input_location,\n+\t\t \"requested %<-march=native%> but CUDA runtime failed: %s\",\n+\t\t get_cuda_error (res, cuGetErrorString));\n+  if (n == 0)\n+    fatal_error 
(input_location,\n+\t\t \"requested %<-march=native%> but system reports no nvptx GPU\");\n+\n+  bool differ = false;\n+  int major = 0, minor = 0;\n+  for (int dev = 0; dev < n; ++dev)\n+    {\n+      int maj, min;\n+      res = cuDeviceComputeCapability (&maj, &min, dev);\n+      if (res != CUDA_SUCCESS)\n+\tfatal_error (input_location,\n+\t\t     \"requested %<-march=native%> but CUDA runtime failed: %s\",\n+\t\t     get_cuda_error (res, cuGetErrorString));\n+      if (dev != 0 && (major != maj || minor != min))\n+\tdiffer = true;\n+\n+      if (dev == 0 || maj < major || (major == maj && min < minor))\n+\t{\n+\t  major = maj;\n+\t  minor = min;\n+\t}\n+    }\n+  if (differ)\n+    warning (0, \"requested %<-march=native%> but found multiple devices, using \"\n+\t\t\"lowest common compute capability %<sm_%d%d%>\", major, minor);\n+\n+  static char buf[sizeof(\"-march-map=sm_\") + 5 + 2];\n+  size_t len = snprintf (buf, sizeof (buf), \"-march-map=sm_%d%d\", major, minor);\n+  gcc_assert (len < sizeof (buf));\n+\n+  return buf;\n+}\n+\n /* Add or change the value of an environment variable, outputting the\n    change to standard error if in verbose mode.  */\n static void\n@@ -745,6 +832,9 @@ main (int argc, char **argv)\n       else if (strcmp (argv[i], \"-dumpbase\") == 0\n \t       && i + 1 < argc)\n \tdumppfx = argv[++i];\n+      else if (strcmp (argv[i], \"-march=native\") == 0\n+\t       || strcmp (argv[i], \"-misa=native\") == 0)\n+\targv[i] = get_native_arch ();\n       /* Translate host into offloading libraries.  
*/\n       else if (strcmp (argv[i], \"-l_GCC_gfortran\") == 0\n \t       || strcmp (argv[i], \"-l_GCC_m\") == 0\ndiff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi\nindex a3ac487eeaa..e4e28aab618 100644\n--- a/gcc/doc/invoke.texi\n+++ b/gcc/doc/invoke.texi\n@@ -30598,7 +30598,9 @@ Valid architecture strings are\n @samp{sm_70}, @samp{sm_75},\n @samp{sm_80}, and @samp{sm_89}.\n The default depends on how the compiler has been configured, see\n-@option{--with-arch}.\n+@option{--with-arch}.  When @option{-march=native} is specified for\n+offloading, the ISA closest to that of the least-capable PTX device\n+is used.\n \n This option sets the value of the preprocessor macro\n @code{__PTX_SM__}; for instance, for @samp{sm_35}, it has the value\n","prefixes":[]}