Patchwork [ARM] -m{cpu,tune,arch}=native

Submitter Andrew Stubbs
Date Sept. 6, 2011, 1:35 p.m.
Message ID <4E66219E.4070706@codesourcery.com>
Permalink /patch/113562/
State New

Comments

Andrew Stubbs - Sept. 6, 2011, 1:35 p.m.
This update adds many more "magic numbers" for various ARM CPUs, and 
also ensures that the implementer is ARM (as opposed to Marvell, etc.). 
The list is far from comprehensive, but it should cover many (but by no 
means all) of the cores in current use and it would not be hard to add 
support for other implementers and CPU names in future.

It has been suggested that this patch should use auxv rather than 
/proc/cpuinfo. Does anybody here have any insight/preferences?

Is the patch OK?

Andrew
Richard Earnshaw - Sept. 9, 2011, 11:55 a.m.
On 06/09/11 14:35, Andrew Stubbs wrote:
> This update adds many more "magic numbers" for various ARM CPUs, and 
> also ensures that the implementer is ARM (as opposed to Marvell, etc.). 
> The list is far from comprehensive, but it should cover many (but by no 
> means all) of the cores in current use and it would not be hard to add 
> support for other implementers and CPU names in future.
> 
> It has been suggested that this patch should use auxv rather than 
> /proc/cpuinfo. Does anybody here have any insight/preferences?
> 
> Is the patch OK?
> 
> Andrew
> 
> 
> tune-native.patch
> 
> 
> 2011-08-27  Andrew Stubbs  <ams@codesourcery.com>
> 
> 	gcc/
> 	* config.host (arm*-*-linux*): Add driver-arm.o and x-arm.
> 	* config/arm/arm.opt: Add 'native' processor_type and
> 	arm_arch enum values.
> 	* config/arm/arm.h (host_detect_local_cpu): New prototype.
> 	(EXTRA_SPEC_FUNCTIONS): New define.
> 	(MCPU_MTUNE_NATIVE_SPECS): New define.
> 	(DRIVER_SELF_SPECS): New define.
> 	* config/arm/driver-arm.c: New file.
> 	* config/arm/x-arm: New file.
> 	* doc/invoke.texi (ARM Options): Document -mcpu=native,
> 	-mtune=native and -march=native.
> 

The part number field is meaningless outside the context of a
specific vendor -- only taken as a pair can they refer to a specific
part.  So why is the vendor field hard-coded rather than factored into
the table of parts?

Maybe it would be better to have a table of tables, with the top-level
table being indexed by vendor id.  Something like

struct vendor_cpu {
  const char *part_no;
  const char *arch_name;
  const char *cpu_name;
};

struct all_cpus {
  const char *vendor_no;
  const struct vendor_cpu *vendor_parts;
};

struct vendor_cpu vendor_arm[] = { ... };

Now your code will allow easy addition of third-party cores.
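
The table-of-tables idea above could be fleshed out roughly as follows. This is only a sketch under stated assumptions: the struct layouts follow the suggestion, the part entries are a subset copied from the patch's cpu_table, and the names all_cpus_table and lookup_cpu are hypothetical and do not appear in the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct vendor_cpu {
  const char *part_no;
  const char *arch_name;
  const char *cpu_name;
};

struct all_cpus {
  const char *vendor_no;
  const struct vendor_cpu *vendor_parts;
};

/* Subset of the entries listed in the patch's cpu_table.  */
static const struct vendor_cpu vendor_arm[] = {
  {"0x926", "armv5te", "arm926ej-s"},
  {"0xc08", "armv7-a", "cortex-a8"},
  {"0xc09", "armv7-a", "cortex-a9"},
  {NULL, NULL, NULL}
};

/* Top-level table indexed by implementer code (hypothetical name).
   Adding a third-party vendor means adding one row here plus its
   own parts table.  */
static const struct all_cpus all_cpus_table[] = {
  {"0x41", vendor_arm},
  {NULL, NULL}
};

/* Hypothetical helper: return the entry for the (vendor, part) pair,
   or NULL if either half is unknown.  */
static const struct vendor_cpu *
lookup_cpu (const char *vendor_no, const char *part_no)
{
  const struct all_cpus *v;
  const struct vendor_cpu *p;

  assert (vendor_no != NULL && part_no != NULL);

  for (v = all_cpus_table; v->vendor_no != NULL; v++)
    if (strcmp (v->vendor_no, vendor_no) == 0)
      {
        for (p = v->vendor_parts; p->part_no != NULL; p++)
          if (strcmp (p->part_no, part_no) == 0)
            return p;
        return NULL;
      }

  return NULL;
}
```

With this layout a part number is only ever interpreted relative to its vendor's table, so an unknown implementer yields no match even when the part number happens to coincide with an ARM one.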

R.

Patch

2011-08-27  Andrew Stubbs  <ams@codesourcery.com>

	gcc/
	* config.host (arm*-*-linux*): Add driver-arm.o and x-arm.
	* config/arm/arm.opt: Add 'native' processor_type and
	arm_arch enum values.
	* config/arm/arm.h (host_detect_local_cpu): New prototype.
	(EXTRA_SPEC_FUNCTIONS): New define.
	(MCPU_MTUNE_NATIVE_SPECS): New define.
	(DRIVER_SELF_SPECS): New define.
	* config/arm/driver-arm.c: New file.
	* config/arm/x-arm: New file.
	* doc/invoke.texi (ARM Options): Document -mcpu=native,
	-mtune=native and -march=native.

--- a/gcc/config.host
+++ b/gcc/config.host
@@ -100,6 +100,14 @@  case ${host} in
 esac
 
 case ${host} in
+  arm*-*-linux*)
+    case ${target} in
+      arm*-*-*)
+	host_extra_gcc_objs="driver-arm.o"
+	host_xmake_file="${host_xmake_file} arm/x-arm"
+	;;
+    esac
+    ;;
   alpha*-*-linux* | alpha*-dec-osf*)
     case ${target} in
       alpha*-*-linux* | alpha*-dec-osf*)
--- a/gcc/config/arm/arm.h
+++ b/gcc/config/arm/arm.h
@@ -2223,4 +2223,21 @@  extern int making_const_table;
    instruction.  */
 #define MAX_LDM_STM_OPS 4
 
+/* -mcpu=native handling only makes sense with compiler running on
+   an ARM chip.  */
+#if defined(__arm__)
+extern const char *host_detect_local_cpu (int argc, const char **argv);
+# define EXTRA_SPEC_FUNCTIONS						\
+  { "local_cpu_detect", host_detect_local_cpu },
+
+# define MCPU_MTUNE_NATIVE_SPECS					\
+   " %{march=native:%<march=native %:local_cpu_detect(arch)}"		\
+   " %{mcpu=native:%<mcpu=native %:local_cpu_detect(cpu)}"		\
+   " %{mtune=native:%<mtune=native %:local_cpu_detect(tune)}"
+#else
+# define MCPU_MTUNE_NATIVE_SPECS ""
+#endif
+
+#define DRIVER_SELF_SPECS MCPU_MTUNE_NATIVE_SPECS
+
 #endif /* ! GCC_ARM_H */
--- a/gcc/config/arm/arm.opt
+++ b/gcc/config/arm/arm.opt
@@ -80,6 +80,11 @@  march=
 Target RejectNegative Joined Enum(arm_arch) Var(arm_arch_option)
 Specify the name of the target architecture
 
+; Other arm_arch values are loaded from arm-tables.opt
+; but that is a generated file and this is an odd-one-out.
+EnumValue
+Enum(arm_arch) String(native) Value(-1) DriverOnly
+
 marm
 Target Report RejectNegative InverseMask(THUMB)
 Generate code in 32 bit ARM state.
@@ -233,6 +238,11 @@  mtune=
 Target RejectNegative Joined Enum(processor_type) Var(arm_tune_option) Init(arm_none)
 Tune code for the given processor
 
+; Other processor_type values are loaded from arm-tables.opt
+; but that is a generated file and this is an odd-one-out.
+EnumValue
+Enum(processor_type) String(native) Value(-1) DriverOnly
+
 mwords-little-endian
 Target Report RejectNegative Mask(LITTLE_WORDS)
 Assume big endian bytes, little endian words.  This option is deprecated.
--- /dev/null
+++ b/gcc/config/arm/driver-arm.c
@@ -0,0 +1,108 @@ 
+/* Subroutines for the gcc driver.
+   Copyright (C) 2011 Free Software Foundation, Inc.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify
+it under the terms of the GNU General Public License as published by
+the Free Software Foundation; either version 3, or (at your option)
+any later version.
+
+GCC is distributed in the hope that it will be useful,
+but WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+GNU General Public License for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+<http://www.gnu.org/licenses/>.  */
+
+#include "config.h"
+#include "system.h"
+#include "coretypes.h"
+#include "tm.h"
+
+static struct {
+  const char *part_no;
+  const char *arch_name;
+  const char *cpu_name;
+} cpu_table[] = {
+    {"0x926", "armv5te", "arm926ej-s"},
+    {"0xa26", "armv5te", "arm1026ej-s"},
+    {"0xb02", "armv6k", "mpcore"},
+    {"0xb36", "armv6j", "arm1136j-s"},
+    {"0xb56", "armv6t2", "arm1156t2-s"},
+    {"0xb76", "armv6zk", "arm1176jz-s"},
+    {"0xc05", "armv7-a", "cortex-a5"},
+    {"0xc08", "armv7-a", "cortex-a8"},
+    {"0xc09", "armv7-a", "cortex-a9"},
+    {"0xc0f", "armv7-a", "cortex-a15"},
+    {"0xc14", "armv7-r", "cortex-r4"},
+    {"0xc15", "armv7-r", "cortex-r5"},
+    {"0xc20", "armv6-m", "cortex-m0"},
+    {"0xc21", "armv6-m", "cortex-m1"},
+    {"0xc23", "armv7-m", "cortex-m3"},
+    {"0xc24", "armv7e-m", "cortex-m4"},
+    {NULL, NULL, NULL}
+};
+
+/* This will be called by the spec parser in gcc.c when it sees
+   a %:local_cpu_detect(args) construct.  Currently it will be called
+   with either "arch", "cpu" or "tune" as argument depending on if
+   -march=native, -mcpu=native or -mtune=native is to be substituted.
+
+   It returns a string containing new command line parameters to be
+   put at the place of the above options, depending on what CPU
+   this is executed.  E.g. "-march=armv7-a" on a Cortex-A8 for
+   -march=native.  If the routine can't detect a known processor,
+   the -march, -mcpu or -mtune option is discarded.
+
+   ARGC and ARGV are set depending on the actual arguments given
+   in the spec.  */
+const char *
+host_detect_local_cpu (int argc, const char **argv)
+{
+  const char *val = NULL;
+  char buf[128];
+  FILE *f;
+  bool arch;
+
+  if (argc < 1)
+    return NULL;
+
+  arch = strcmp (argv[0], "arch") == 0;
+  if (!arch && strcmp (argv[0], "cpu") != 0 && strcmp (argv[0], "tune"))
+    return NULL;
+
+  f = fopen ("/proc/cpuinfo", "r");
+  if (f == NULL)
+    return NULL;
+
+  while (fgets (buf, sizeof (buf), f) != NULL)
+    {
+      /* Stop, leaving VAL unset, unless the CPU implementer is ARM (0x41).  */
+      if (strncmp (buf, "CPU implementer", sizeof ("CPU implementer") - 1) == 0
+	  && strstr (buf, "0x41") == NULL)
+	break;
+
+      /* Detect arch/cpu.  */
+      if (strncmp (buf, "CPU part", sizeof ("CPU part") - 1) == 0)
+	{
+	  int i;
+	  for (i = 0; cpu_table[i].part_no != NULL; i++)
+	    if (strstr (buf, cpu_table[i].part_no) != NULL)
+	      {
+		val = arch ? cpu_table[i].arch_name : cpu_table[i].cpu_name;
+		break;
+	      }
+	  break;
+	}
+    }
+
+  fclose (f);
+
+  if (val == NULL)
+    return NULL;
+
+  return concat ("-m", argv[0], "=", val, NULL);
+}
--- /dev/null
+++ b/gcc/config/arm/x-arm
@@ -0,0 +1,3 @@ 
+driver-arm.o: $(srcdir)/config/arm/driver-arm.c \
+  $(CONFIG_H) $(SYSTEM_H)
+	$(COMPILER) -c $(ALL_COMPILERFLAGS) $(ALL_CPPFLAGS) $(INCLUDES) $<
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -10319,6 +10319,11 @@  assembly code.  Permissible names are: @samp{arm2}, @samp{arm250},
 @samp{fa526}, @samp{fa626},
 @samp{fa606te}, @samp{fa626te}, @samp{fmp626}, @samp{fa726te}.
 
+@option{-mcpu=native} causes the compiler to auto-detect the CPU
+of the build computer.  At present, this feature is only supported on
+Linux, and not all architectures are recognised.  If the auto-detect is
+unsuccessful the option has no effect.
+
 @item -mtune=@var{name}
 @opindex mtune
 This option is very similar to the @option{-mcpu=} option, except that
@@ -10330,6 +10335,11 @@  will generate based on the CPU specified by a @option{-mcpu=} option.
 For some ARM implementations better performance can be obtained by using
 this option.
 
+@option{-mtune=native} causes the compiler to auto-detect the CPU
+of the build computer.  At present, this feature is only supported on
+Linux, and not all architectures are recognised.  If the auto-detect is
+unsuccessful the option has no effect.
+
 @item -march=@var{name}
 @opindex march
 This specifies the name of the target ARM architecture.  GCC uses this
@@ -10343,6 +10353,11 @@  of the @option{-mcpu=} option.  Permissible names are: @samp{armv2},
 @samp{armv7}, @samp{armv7-a}, @samp{armv7-r}, @samp{armv7-m},
 @samp{iwmmxt}, @samp{iwmmxt2}, @samp{ep9312}.
 
+@option{-march=native} causes the compiler to auto-detect the architecture
+of the build computer.  At present, this feature is only supported on
+Linux, and not all architectures are recognised.  If the auto-detect is
+unsuccessful the option has no effect.
+
 @item -mfpu=@var{name}
 @itemx -mfpe=@var{number}
 @itemx -mfp=@var{number}
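
The detection loop in driver-arm.c above finds the implementer and part number by substring search (strstr) on each /proc/cpuinfo line. A possible alternative, shown here only as a sketch and not as what the patch does, is to parse the hexadecimal value after the colon and compare numbers. The helper names parse_field and read_cpu_id are hypothetical, and the code assumes the "key<tab>: 0x..." line layout that ARM Linux kernels print.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* If LINE starts with KEY, parse the hexadecimal number after the
   colon into *VALUE and return 1; otherwise return 0.  */
static int
parse_field (const char *line, const char *key, unsigned long *value)
{
  size_t klen = strlen (key);
  const char *colon;
  char *end;

  assert (line != NULL && key != NULL && value != NULL);

  if (strncmp (line, key, klen) != 0)
    return 0;
  colon = strchr (line + klen, ':');
  if (colon == NULL)
    return 0;

  /* Base 16 accepts leading whitespace and an optional "0x" prefix.  */
  *value = strtoul (colon + 1, &end, 16);
  return end != colon + 1;
}

/* Scan cpuinfo-style text from STREAM and record the first
   implementer and part numbers seen.  Returns 1 only if both
   fields were found.  */
static int
read_cpu_id (FILE *stream, unsigned long *implementer, unsigned long *part)
{
  char buf[128];
  int have_impl = 0, have_part = 0;

  while (fgets (buf, sizeof (buf), stream) != NULL)
    {
      if (!have_impl && parse_field (buf, "CPU implementer", implementer))
        have_impl = 1;
      else if (!have_part && parse_field (buf, "CPU part", part))
        have_part = 1;
    }

  return have_impl && have_part;
}
```

A caller would open /proc/cpuinfo, pass the stream to read_cpu_id, close the stream on every path, and then look the numeric pair up in whatever table the driver uses. Comparing numbers avoids relying on where a hex string happens to appear within the line.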