Message ID | 4D119DA9.4060002@codesourcery.com |
---|---|
State | New |
Headers | show |
Hi Jie, I'm not sure I totally understand your patch. On Wed, 2010-12-22 at 14:41 +0800, Jie Zhang wrote: > Based on Ricard's explanation, it will be good to allow larger index > range for DImode addressing on modern ARM chips. Like this patch does. I > use multiple_operation_profitable_p to decide if it's good to use LDM or > not. Currently multiple_operation_profitable_p is not fine tuned. In > future the four conditionals about "low" in Is this even going to work ? The patch didn't seem to define multiple_operation_profitable_p anywhere. > > + if ((low != -8 && low != -4 && low != 0 && low != 4) > + || !multiple_operation_profitable_p (false, 2, val)) > > should be moved into multiple_operation_profitable_p. Since the code is identical - why not move it in there today or is that what you are saying ? Ideally I think it would be better to move this in as a flag into the costs structure per core . > > It's still under testing. Is it OK if the result is good? What is the size or performance impact of this on something like SPECINT2k ? I thought that vortex or was it chess in there that uses a bit of DImode arithmetic ? cheers Ramana > > > Regards,
On 12/23/2010 08:58 PM, Ramana Radhakrishnan wrote: > > Hi Jie, > > I'm not sure I totally understand your patch. > > > On Wed, 2010-12-22 at 14:41 +0800, Jie Zhang wrote: >> Based on Ricard's explanation, it will be good to allow larger index >> range for DImode addressing on modern ARM chips. Like this patch does. I >> use multiple_operation_profitable_p to decide if it's good to use LDM or >> not. Currently multiple_operation_profitable_p is not fine tuned. In >> future the four conditionals about "low" in > > Is this even going to work ? The patch didn't seem to define > multiple_operation_profitable_p anywhere. > multiple_operation_profitable_p was added by Bernd 4 months ago. > >> >> + if ((low != -8&& low != -4&& low != 0&& low != 4) >> + || !multiple_operation_profitable_p (false, 2, val)) >> >> should be moved into multiple_operation_profitable_p. > > Since the code is identical - why not move it in there today or is that > what you are saying ? Ideally I think it would be better to move this in > as a flag into the costs structure per core . > I'm not sure if it's good to move it into multiple_operation_profitable_p since it's also called from some places. Moving into it but not affecting other code needs more time thinking. For now I just try to keep my patch as simple as possible to ease code review. We can move it into multiple_operation_profitable_p later when multiple_operation_profitable_p will be changed largely. >> >> It's still under testing. Is it OK if the result is good? > > What is the size or performance impact of this on something like > SPECINT2k ? I thought that vortex or was it chess in there that uses a > bit of DImode arithmetic ? > I have not done the performance benchmarking. I just did the regression testing. There are no regressions for arm-none-eabi target with c,c++,lto and the default multilib. And there are no regressions found with the backported patch for our (CodeSourcery now Mentor Graphics) internal 4.5 branch for arm-none-linux-gnueabi target with c,c++,lto and "-march=armv7-a -mfloat-abi=softfp -mfpu=neon". Regards,
On Thu, 2010-12-23 at 21:29 +0800, Jie Zhang wrote: > On 12/23/2010 08:58 PM, Ramana Radhakrishnan wrote: > > > > Hi Jie, > > > > I'm not sure I totally understand your patch. > > > > > > On Wed, 2010-12-22 at 14:41 +0800, Jie Zhang wrote: > >> Based on Ricard's explanation, it will be good to allow larger index > >> range for DImode addressing on modern ARM chips. Like this patch does. I > >> use multiple_operation_profitable_p to decide if it's good to use LDM or > >> not. Currently multiple_operation_profitable_p is not fine tuned. In > >> future the four conditionals about "low" in > > > > Is this even going to work ? The patch didn't seem to define > > multiple_operation_profitable_p anywhere. > > > multiple_operation_profitable_p was added by Bernd 4 months ago. I suppose I was confused by the additional declaration in your patch and probably was looking in the wrong buffer yesteray. Sorry about that. > > > > >> > >> + if ((low != -8&& low != -4&& low != 0&& low != 4) > >> + || !multiple_operation_profitable_p (false, 2, val)) > >> > >> should be moved into multiple_operation_profitable_p. > > > > Since the code is identical - why not move it in there today or is that > > what you are saying ? Ideally I think it would be better to move this in > > as a flag into the costs structure per core . > > > I'm not sure if it's good to move it into > multiple_operation_profitable_p since it's also called from some places. > Moving into it but not affecting other code needs more time thinking. > For now I just try to keep my patch as simple as possible to ease code > review. We can move it into multiple_operation_profitable_p later when > multiple_operation_profitable_p will be changed largely. Then I would suggest that you add a comment or a FIXME: that marks this as something that has to be looked at when we look at tuning multiple_operation_profitable_p . cheers Ramana
* config/arm/arm.c (multiple_operation_profitable_p): Declare. (arm_legitimize_address): Adjust handling for DImode and DFmode. (arm_legitimize_reload_address): Likewise. diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c index 95bde4a..c926e17 100644 --- a/gcc/config/arm/arm.c +++ b/gcc/config/arm/arm.c @@ -250,6 +250,7 @@ static bool arm_builtin_support_vector_misalignment (enum machine_mode mode, bool is_packed); static void arm_conditional_register_usage (void); static reg_class_t arm_preferred_rename_class (reg_class_t rclass); +static bool multiple_operation_profitable_p (bool, int, HOST_WIDE_INT); /* Table of machine attributes. */ @@ -6230,11 +6231,14 @@ arm_legitimize_address (rtx x, rtx orig_x, enum machine_mode mode) rtx base_reg, val; n = INTVAL (xop1); - /* VFP addressing modes actually allow greater offsets, but for - now we just stick with the lowest common denominator. */ if (mode == DImode || ((TARGET_SOFT_FLOAT || TARGET_VFP) && mode == DFmode)) { + HOST_WIDE_INT orig_n = n; + + /* We first see if LDM is profitable. If not, we will use + the lowest common denominator suitable for two LDRs + and VLDR, and for LDRD if TARGET_LDRD. */ low_n = n & 0x0f; n &= ~0x0f; if (low_n > 4) @@ -6242,6 +6246,14 @@ arm_legitimize_address (rtx x, rtx orig_x, enum machine_mode mode) n += 16; low_n -= 16; } + if ((low_n != -8 && low_n != -4 && low_n != 0 && low_n != 4) + || !multiple_operation_profitable_p (false, 2, orig_n)) + { + HOST_WIDE_INT mask = (TARGET_LDRD ? 0xfc : 0x3fc); + n = orig_n; + low_n = (n >= 0 ? (n & mask) : -((-n) & mask)); + n -= low_n; + } } else { @@ -6404,7 +6416,18 @@ arm_legitimize_reload_address (rtx *p, HOST_WIDE_INT low, high; if (mode == DImode || (mode == DFmode && TARGET_SOFT_FLOAT)) - low = ((val & 0xf) ^ 0x8) - 0x8; + { + /* We first see if LDM is profitable. If not, we will use + the lowest common denominator suitable for two LDRs + and VLDR, and for LDRD if TARGET_LDRD. */ + low = ((val & 0xf) ^ 0x8) - 0x8; + if ((low != -8 && low != -4 && low != 0 && low != 4) + || !multiple_operation_profitable_p (false, 2, val)) + { + HOST_WIDE_INT mask = (TARGET_LDRD ? 0xfc : 0x3fc); + low = (val >= 0 ? (val & mask) : -((-val) & mask)); + } + } else if (TARGET_MAVERICK && TARGET_HARD_FLOAT) /* Need to be careful, -256 is not a valid offset. */ low = val >= 0 ? (val & 0xff) : -((-val) & 0xff);