Message ID | 56CBACE4.9070309@samsung.com |
---|---|
State | New |
On Mon, Feb 22, 2016 at 06:50:44PM -0600, Evandro Menezes wrote:
> In preparation for the patch adding the Newton series also for
> square root, I'd like to propose this patch changing the name of the
> existing tuning flag for the reciprocal square root.

This is fine, other names like sw_rsqrt, expand_rsqrt, nr_rsqrt would also
be OK. Pick your favourite!

One comment on the replacement invoke.texi text below, otherwise this is
OK to apply now.

> diff --git a/gcc/config/aarch64/aarch64.opt b/gcc/config/aarch64/aarch64.opt
> index 5cbd4cd..155d2bd 100644
> --- a/gcc/config/aarch64/aarch64.opt
> +++ b/gcc/config/aarch64/aarch64.opt
> @@ -151,5 +151,5 @@ PC relative literal loads.
>
>  mlow-precision-recip-sqrt
>  Common Var(flag_mrecip_low_precision_sqrt) Optimization
> -When calculating a sqrt approximation, run fewer steps.
> +Calculate the reciprocal square-root approximation in fewer steps.
>  This reduces precision, but can result in faster computation.
> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
> index 490df93..eeff24d 100644
> --- a/gcc/doc/invoke.texi
> +++ b/gcc/doc/invoke.texi
> @@ -12879,12 +12879,10 @@ corresponding flag to the linker.
>  @item -mno-low-precision-recip-sqrt
>  @opindex -mlow-precision-recip-sqrt
>  @opindex -mno-low-precision-recip-sqrt
> -The square root estimate uses two steps instead of three for double-precision,
> -and one step instead of two for single-precision.
> -Thus reducing latency and precision.
> -This is only relevant if @option{-ffast-math} activates
> -reciprocal square root estimate instructions.
> -Which in turn depends on the target processor.
> +The reciprocal square root approximation uses one step less than otherwise,
> +thus reducing latency and precision.

When calculating the reciprocal square root approximation, use one less
step than otherwise, thus reducing latency and precision.

Thanks,
James
On 02/26/16 08:59, James Greenhalgh wrote:
> On Mon, Feb 22, 2016 at 06:50:44PM -0600, Evandro Menezes wrote:
>> In preparation for the patch adding the Newton series also for
>> square root, I'd like to propose this patch changing the name of the
>> existing tuning flag for the reciprocal square root.
> This is fine, other names like sw_rsqrt, expand_rsqrt, nr_rsqrt would also
> be OK. Pick your favourite!
>
> One comment on the replacement invoke.texi text below, otherwise this is
> OK to apply now.
>
>> diff --git a/gcc/config/aarch64/aarch64.opt b/gcc/config/aarch64/aarch64.opt
>> index 5cbd4cd..155d2bd 100644
>> --- a/gcc/config/aarch64/aarch64.opt
>> +++ b/gcc/config/aarch64/aarch64.opt
>> @@ -151,5 +151,5 @@ PC relative literal loads.
>>
>>  mlow-precision-recip-sqrt
>>  Common Var(flag_mrecip_low_precision_sqrt) Optimization
>> -When calculating a sqrt approximation, run fewer steps.
>> +Calculate the reciprocal square-root approximation in fewer steps.
>>  This reduces precision, but can result in faster computation.
>> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
>> index 490df93..eeff24d 100644
>> --- a/gcc/doc/invoke.texi
>> +++ b/gcc/doc/invoke.texi
>> @@ -12879,12 +12879,10 @@ corresponding flag to the linker.
>>  @item -mno-low-precision-recip-sqrt
>>  @opindex -mlow-precision-recip-sqrt
>>  @opindex -mno-low-precision-recip-sqrt
>> -The square root estimate uses two steps instead of three for double-precision,
>> -and one step instead of two for single-precision.
>> -Thus reducing latency and precision.
>> -This is only relevant if @option{-ffast-math} activates
>> -reciprocal square root estimate instructions.
>> -Which in turn depends on the target processor.
>> +The reciprocal square root approximation uses one step less than otherwise,
>> +thus reducing latency and precision.
> When calculating the reciprocal square root approximation, use one less
> step than otherwise, thus reducing latency and precision.
>

Checked in as r233772.

Thank you,
On 02/26/16 17:42, Evandro Menezes wrote:
> On 02/26/16 08:59, James Greenhalgh wrote:
>> On Mon, Feb 22, 2016 at 06:50:44PM -0600, Evandro Menezes wrote:
>>> In preparation for the patch adding the Newton series also for
>>> square root, I'd like to propose this patch changing the name of the
>>> existing tuning flag for the reciprocal square root.
>> This is fine, other names like sw_rsqrt, expand_rsqrt, nr_rsqrt would also
>> be OK. Pick your favourite!
>>
>> One comment on the replacement invoke.texi text below, otherwise this is
>> OK to apply now.
>>
>>> diff --git a/gcc/config/aarch64/aarch64.opt b/gcc/config/aarch64/aarch64.opt
>>> index 5cbd4cd..155d2bd 100644
>>> --- a/gcc/config/aarch64/aarch64.opt
>>> +++ b/gcc/config/aarch64/aarch64.opt
>>> @@ -151,5 +151,5 @@ PC relative literal loads.
>>>  mlow-precision-recip-sqrt
>>>  Common Var(flag_mrecip_low_precision_sqrt) Optimization
>>> -When calculating a sqrt approximation, run fewer steps.
>>> +Calculate the reciprocal square-root approximation in fewer steps.
>>>  This reduces precision, but can result in faster computation.
>>> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
>>> index 490df93..eeff24d 100644
>>> --- a/gcc/doc/invoke.texi
>>> +++ b/gcc/doc/invoke.texi
>>> @@ -12879,12 +12879,10 @@ corresponding flag to the linker.
>>>  @item -mno-low-precision-recip-sqrt
>>>  @opindex -mlow-precision-recip-sqrt
>>>  @opindex -mno-low-precision-recip-sqrt
>>> -The square root estimate uses two steps instead of three for double-precision,
>>> -and one step instead of two for single-precision.
>>> -Thus reducing latency and precision.
>>> -This is only relevant if @option{-ffast-math} activates
>>> -reciprocal square root estimate instructions.
>>> -Which in turn depends on the target processor.
>>> +The reciprocal square root approximation uses one step less than otherwise,
>>> +thus reducing latency and precision.
>> When calculating the reciprocal square root approximation, use one less
>> step than otherwise, thus reducing latency and precision.
>>
>
> Checked in as r233772.

But not without some log hiccups, sorry...
From 7043444f83c12de0ab50627a8b386e3070050591 Mon Sep 17 00:00:00 2001
From: Evandro Menezes <e.menezes@samsung.com>
Date: Mon, 22 Feb 2016 17:49:09 -0600
Subject: [PATCH] Rename the reciprocal square root tuning flag

Rename the tuning option to enable the Newton series for the reciprocal square
root to reflect its approximative characteristic.

2016-02-22  Evandro Menezes  <e.menezes@samsung.com>

gcc/
	* config/aarch64/aarch64-tuning-flags.def: Rename tuning flag to
	AARCH64_EXTRA_TUNE_APPROX_RSQRT.
	* config/aarch64/aarch64.c (xgene1_tunings): Use new name.
	(use_rsqrt_p): Likewise.
	* config/aarch64/aarch64.opt (mlow-precision-recip-sqrt): Reword the
	text explaining this option.
	* doc/invoke.texi (-mlow-precision-recip-sqrt): Likewise.
---
 gcc/config/aarch64/aarch64-tuning-flags.def |  2 +-
 gcc/config/aarch64/aarch64.c                |  4 ++--
 gcc/config/aarch64/aarch64.opt              |  2 +-
 gcc/doc/invoke.texi                         | 10 ++++------
 4 files changed, 8 insertions(+), 10 deletions(-)

diff --git a/gcc/config/aarch64/aarch64-tuning-flags.def b/gcc/config/aarch64/aarch64-tuning-flags.def
index 8036cfe..7e45a0c 100644
--- a/gcc/config/aarch64/aarch64-tuning-flags.def
+++ b/gcc/config/aarch64/aarch64-tuning-flags.def
@@ -29,5 +29,5 @@ AARCH64_TUNE_ to give an enum name. */
 AARCH64_EXTRA_TUNING_OPTION ("rename_fma_regs", RENAME_FMA_REGS)
-AARCH64_EXTRA_TUNING_OPTION ("recip_sqrt", RECIP_SQRT)
+AARCH64_EXTRA_TUNING_OPTION ("approx_rsqrt", APPROX_RSQRT)
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 923a4b3..ebf47da 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -586,7 +586,7 @@ static const struct tune_params xgene1_tunings =
   0, /* max_case_values. */
   0, /* cache_line_size. */
   tune_params::AUTOPREFETCHER_OFF, /* autoprefetcher_model. */
-  (AARCH64_EXTRA_TUNE_RECIP_SQRT) /* tune_flags. */
+  (AARCH64_EXTRA_TUNE_APPROX_RSQRT) /* tune_flags. */
 };
 
 /* Support for fine-grained override of the tuning structures. */
@@ -7469,7 +7469,7 @@ use_rsqrt_p (void)
   return (!flag_trapping_math
 	  && flag_unsafe_math_optimizations
 	  && ((aarch64_tune_params.extra_tuning_flags
-	       & AARCH64_EXTRA_TUNE_RECIP_SQRT)
+	       & AARCH64_EXTRA_TUNE_APPROX_RSQRT)
 	      || flag_mrecip_low_precision_sqrt));
 }
 
diff --git a/gcc/config/aarch64/aarch64.opt b/gcc/config/aarch64/aarch64.opt
index 5cbd4cd..155d2bd 100644
--- a/gcc/config/aarch64/aarch64.opt
+++ b/gcc/config/aarch64/aarch64.opt
@@ -151,5 +151,5 @@ PC relative literal loads.
 
 mlow-precision-recip-sqrt
 Common Var(flag_mrecip_low_precision_sqrt) Optimization
-When calculating a sqrt approximation, run fewer steps.
+Calculate the reciprocal square-root approximation in fewer steps.
 This reduces precision, but can result in faster computation.
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 490df93..eeff24d 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -12879,12 +12879,10 @@ corresponding flag to the linker.
 @item -mno-low-precision-recip-sqrt
 @opindex -mlow-precision-recip-sqrt
 @opindex -mno-low-precision-recip-sqrt
-The square root estimate uses two steps instead of three for double-precision,
-and one step instead of two for single-precision.
-Thus reducing latency and precision.
-This is only relevant if @option{-ffast-math} activates
-reciprocal square root estimate instructions.
-Which in turn depends on the target processor.
+The reciprocal square root approximation uses one step less than otherwise,
+thus reducing latency and precision.
+This is only relevant if @option{-ffast-math} enables the reciprocal square root
+approximation, which in turn depends on the target processor.
 
 @item -march=@var{name}
 @opindex march
-- 
2.6.3
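As a reading aid for the patch above, here is a simplified, standalone model of how a tuning-flag list can be turned into bit-mask enumerators and then tested the way use_rsqrt_p does. It is a sketch under assumptions, not the GCC source: the TUNING_OPTIONS macro stands in for the .def file, and the names tune_index, tune_flag, struct options and use_rsqrt_model are all hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in for the .def file: one X(...) entry per tuning option.  */
#define TUNING_OPTIONS(X)                 \
  X ("rename_fma_regs", RENAME_FMA_REGS)  \
  X ("approx_rsqrt", APPROX_RSQRT)

/* First expansion: assign each option a bit position.  */
enum tune_index
{
#define X(str, name) TUNE_##name##_INDEX,
  TUNING_OPTIONS (X)
#undef X
  TUNE_INDEX_END
};

/* Second expansion: turn each position into a single-bit mask.  */
enum tune_flag
{
#define X(str, name) EXTRA_TUNE_##name = 1u << TUNE_##name##_INDEX,
  TUNING_OPTIONS (X)
#undef X
};

static_assert (EXTRA_TUNE_APPROX_RSQRT == (1u << 1),
	       "approx_rsqrt is the second option, so it owns bit 1");

/* Hypothetical bundle of the settings consulted by the predicate.  */
struct options
{
  bool trapping_math;
  bool unsafe_math_optimizations;
  bool low_precision_recip_sqrt;
  unsigned extra_tuning_flags;
};

/* Mirrors the shape of the condition in the patch: both math settings
   must allow the approximation, and either the CPU tuning or the
   low-precision option must request it.  */
bool
use_rsqrt_model (const struct options *o)
{
  return (!o->trapping_math
	  && o->unsafe_math_optimizations
	  && ((o->extra_tuning_flags & EXTRA_TUNE_APPROX_RSQRT)
	      || o->low_precision_recip_sqrt));
}
```

In this model, renaming the option only changes the second argument of the X(...) entry; every enumerator derived from it follows, which is why the patch touches the .def file plus the two places that spell out the generated name.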