[3/4] MIPS16/GCC: Improve `casesi_internal_mips16_<mode>'s instruction count estimate

Message ID alpine.DEB.2.20.17.1611121010130.10580@tp.orcam.me.uk
State Accepted
Commit Message

Maciej W. Rozycki Nov. 14, 2016, 3:45 a.m. UTC
A typical code sequence produced by the `casesi_internal_mips16_<mode>' 
insn is like this:

	sltu	$3, 11	 # 16	casesi_internal_mips16_si	[length = 32]
	bteqz	$L2
	sll	$5, $3, 1
	la	$3, $L4
	addu	$5, $3, $5
	lh	$5, 0($5)
	addu	$3, $3, $5
	j	$3
	.align	1
	.align	2
	.type	__jump_foo_4, @object
__jump_foo_4:
$L4:

which in turn assembles to this binary code:

   a:	5b0b      	sltiu	v1,11
   c:	601d      	bteqz	48 <__pool_foo_12>
   e:	3564      	sll	a1,v1,1
  10:	0b03      	la	v1,1c <__jump_foo_4>
  12:	e3b5      	addu	a1,v1,a1
  14:	8da0      	lh	a1,0(a1)
  16:	e3ad      	addu	v1,a1
  18:	eb80      	jrc	v1
  1a:	6500      	nop

0000001c <__jump_foo_4>:

As you can see the code length estimate is 32, which in turn comes from 
the instruction count being set to 16 for the insn, telling the compiler 
that the pattern will produce the equivalent of 16 regular (16-bit or 
unextended) MIPS16 instructions, as per the attribute's definition.

This estimate is too pessimistic as this pattern will never actually 
reach so many instructions.  Taking the instructions produced one by one 
we have:

1.	sltu	$3, 11     => 1 or 2 depending on the immediate  => 2

2.	bteqz	$L2	   => 1 or 2 depending on label distance => 2

3.	sll	$5, $3, 1  => (HImode) fixed 1
	sll	$5, $3, 2  => (SImode) fixed 1                   => 1

4.	la	$3, $L4    => (Pmode == SImode) fixed 1 as $L4
		              is close and word-aligned
	dla	$3, $L4    => (Pmode == DImode) fixed 1 as $L4
		              is close and word-aligned          => 1

5.	addu	$5, $3, $5 => (Pmode == SImode) fixed 1
	daddu	$5, $3, $5 => (Pmode == DImode) fixed 1          => 1

6.	lh	$5, 0($5)  => (HImode) fixed 1
	lw	$5, 0($5)  => (SImode) fixed 1                   => 1

7.	addu	$3, $3, $5 => (Pmode == SImode) fixed 1
	daddu	$3, $3, $5 => (Pmode == DImode) fixed 1          => 1

8.	j	$3         => 1 if JRC is used or 2 if JR/NOP is => 2
                                                                 ----
                                                                   11
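
The tally above can be double-checked with a short sketch.  The step 
labels and function names below are illustrative only and do not appear 
in GCC; the byte length assumes the same 2 bytes per instruction unit 
that turns the old count of 16 into the reported length of 32.

```python
# Worst-case number of 16-bit MIPS16 instruction units for each step,
# transcribed from the tally in the commit message.  The keys are
# illustrative labels, not GCC identifiers.
WORST_CASE_UNITS = {
    "1. sltu": 2,         # 1 or 2 depending on the immediate
    "2. bteqz": 2,        # 1 or 2 depending on label distance
    "3. sll": 1,          # fixed for both HImode and SImode
    "4. la/dla": 1,       # fixed, as $L4 is close and word-aligned
    "5. addu/daddu": 1,   # fixed
    "6. lh/lw": 1,        # fixed
    "7. addu/daddu": 1,   # fixed
    "8. j": 2,            # 1 if JRC is used, 2 if JR/NOP is
}


def worst_case_insn_count(units=WORST_CASE_UNITS):
    """Sum the per-step worst cases."""
    return sum(units.values())


def length_in_bytes(insn_count):
    """Assumed scaling: 2 bytes per unextended MIPS16 instruction."""
    return insn_count * 2


print(worst_case_insn_count())                   # 11
print(length_in_bytes(16))                       # 32, the old estimate
print(length_in_bytes(worst_case_insn_count()))  # 22
```

Under that assumption the length estimate for the pattern drops from 32 
to 22 bytes.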

Word alignment of the jump table start is explicitly arranged by 
ASM_OUTPUT_BEFORE_CASE_LABEL and is beneficial as we can use the short 
encoding of LA at no loss in code size, because any 2-byte padding
produced by the `.align 2' pseudo-op would otherwise be consumed by the 
extended form of LA required to encode a PC-relative offset which is not 
a multiple of 4, possibly at some performance loss due to the extra 
instruction halfword fetch.

Set the instruction count to 11 then.

	gcc/
	* config/mips/mips.md (casesi_internal_mips16_<mode>): Set 
	`insn_count' to 11 rather than 16.
---
 OK to apply?

  Maciej

gcc-mips16-casesi-insn-count.diff

Comments

Matthew Fortune Nov. 16, 2016, 10:47 a.m. UTC | #1
Maciej Rozycki <Maciej.Rozycki@imgtec.com> writes:
> 	gcc/
> 	* config/mips/mips.md (casesi_internal_mips16_<mode>): Set
> 	`insn_count' to 11 rather than 16.
> ---
>  OK to apply?

Good catch again. OK.

Thanks,
Matthew

Patch

Index: gcc/gcc/config/mips/mips.md
===================================================================
--- gcc.orig/gcc/config/mips/mips.md	2016-11-12 10:57:12.544746018 +0000
+++ gcc/gcc/config/mips/mips.md	2016-11-12 10:57:13.972699749 +0000
@@ -6444,7 +6444,7 @@ 
 
   return "j\t%4";
 }
-  [(set_attr "insn_count" "16")])
+  [(set_attr "insn_count" "11")])
 
 ;; For TARGET_USE_GOT, we save the gp in the jmp_buf as well.
 ;; While it is possible to either pull it off the stack (in the