Patchwork [1/2] powerpc: split the math emulation into two parts

Submitter Kevin Hao
Date July 16, 2013, 11:57 a.m.
Message ID <1373975836-11928-2-git-send-email-haokexin@gmail.com>
Permalink /patch/259393/
State Accepted
Commit e05c0e81b0628808a7490c35d1803644a18b0405
Delegated to: Benjamin Herrenschmidt

Comments

Kevin Hao - July 16, 2013, 11:57 a.m.
Some SoCs (such as the FSL BookE) have a hardware FPU that does not
implement every floating point instruction. Unfortunately, some versions
of gcc emit these unimplemented instructions, so math emulation has to be
enabled to work around the problem. Building support for emulating every
floating point instruction is redundant in this case, so split the math
emulation into two parts: one for SoCs that have no FPU at all, and one
for SoCs that have a hardware FPU and need only a few specific floating
point instructions emulated.

Signed-off-by: Kevin Hao <haokexin@gmail.com>
---
 arch/powerpc/Kconfig           | 20 ++++++++++++++++++++
 arch/powerpc/math-emu/Makefile | 24 ++++++++++++------------
 arch/powerpc/math-emu/math.c   | 20 ++++++++++++++------
 3 files changed, 46 insertions(+), 18 deletions(-)
Kumar Gala - July 22, 2013, 2:36 p.m.
On Jul 16, 2013, at 6:57 AM, Kevin Hao wrote:

> For some SoC (such as the FSL BookE) even though there does have
> a hardware FPU, but not all floating point instructions are
> implemented. Unfortunately some versions of gcc do use these
> unimplemented instructions. Then we have to enable the math emulation
> to workaround this issue. It seems a little redundant to have the
> support to emulate all the floating point instructions in this case.
> So split the math emulation into two parts. One is for the SoC which
> doesn't have FPU at all and the other for the SoC which does have the
> hardware FPU and only need some special floating point instructions to
> be emulated.
> 
> Signed-off-by: Kevin Hao <haokexin@gmail.com>
> ---
> arch/powerpc/Kconfig           | 20 ++++++++++++++++++++
> arch/powerpc/math-emu/Makefile | 24 ++++++++++++------------
> arch/powerpc/math-emu/math.c   | 20 ++++++++++++++------
> 3 files changed, 46 insertions(+), 18 deletions(-)

Why make the split? What harm is there in just turning on the full emulation code to handle the unimplemented cases?

Who says that some other implementation doesn't need something that you have in CONFIG_MATH_EMULATION_FULL?

Is the kernel code size really an issue?

- k
Scott Wood - July 22, 2013, 5:25 p.m.
On 07/22/2013 09:36:05 AM, Kumar Gala wrote:
> 
> On Jul 16, 2013, at 6:57 AM, Kevin Hao wrote:
> 
> > For some SoC (such as the FSL BookE) even though there does have
> > a hardware FPU, but not all floating point instructions are
> > implemented. Unfortunately some versions of gcc do use these
> > unimplemented instructions. Then we have to enable the math emulation
> > to workaround this issue. It seems a little redundant to have the
> > support to emulate all the floating point instructions in this case.
> > So split the math emulation into two parts. One is for the SoC which
> > doesn't have FPU at all and the other for the SoC which does have the
> > hardware FPU and only need some special floating point instructions to
> > be emulated.
> >
> > Signed-off-by: Kevin Hao <haokexin@gmail.com>
> > ---
> > arch/powerpc/Kconfig           | 20 ++++++++++++++++++++
> > arch/powerpc/math-emu/Makefile | 24 ++++++++++++------------
> > arch/powerpc/math-emu/math.c   | 20 ++++++++++++++------
> > 3 files changed, 46 insertions(+), 18 deletions(-)
> 
> why make the split, what harm is there in just turning on the full  
> emulation code to handle the unimplemented cases?

My main motivation in requesting it was to contain the increase in  
build time -- math-emu always stuck out to me as something that took a  
noticeable amount of time to build.  It also reduces the increase in  
kernel image size.

> who says what some other implementation doesn't need something that  
> you have in CONFIG_MATH_EMULATION_FULL?

The point is to include any instructions that are known to be missing  
in any chip's FPU (excluding chips that don't have an FPU at all).  If  
it is discovered that some chip is missing an instruction that we  
didn't account for, then we'd move that instruction from one list to  
the other.

> Is the kernel code size really an issue?

It can be when you're storing it on flash -- especially when the growth  
is out of control because of the need to justify pruning low-hanging  
fruit such as this.

-Scott
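
The math.c hunk in the patch below relies on redefining a declaration macro partway through the file, so that routines outside the reduced set become empty inline stubs and need no object file. The following is a minimal standalone sketch of that idea, not the kernel code itself: `HW_UNIMPLEMENTED_ONLY`, `emu_fsqrt`, `emu_fadd` and `emulate` are hypothetical names, and having the stub return 0 is an assumption of this sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for CONFIG_MATH_EMULATION_HW_UNIMPLEMENTED.
 * Keep it at 1 here: with 0, emu_fadd would need a real definition. */
#define HW_UNIMPLEMENTED_ONLY 1

/* Default form: declare a routine whose body lives elsewhere. */
#define FLOATFUNC(x) extern int x(void *, void *, void *, void *)

/* Routines that every configuration needs keep a real definition. */
FLOATFUNC(emu_fsqrt);

#if HW_UNIMPLEMENTED_ONLY
/*
 * Reduced build: later uses of FLOATFUNC expand to an inline stub that
 * returns 0. The trailing prototype only absorbs the caller's semicolon.
 */
#undef FLOATFUNC
#define FLOATFUNC(x) \
	static inline int x(void *a, void *b, void *c, void *d) \
	{ (void)a; (void)b; (void)c; (void)d; return 0; } \
	static inline int x(void *, void *, void *, void *)
#endif

/* A stub in the reduced build, an external declaration otherwise. */
FLOATFUNC(emu_fadd);

/* Toy "real" routine: returns 1 so callers can tell it from a stub. */
int emu_fsqrt(void *a, void *b, void *c, void *d)
{
	(void)a; (void)b; (void)c; (void)d;
	return 1;
}

/* Toy dispatcher keyed on a made-up opcode number (0 or 1). */
int emulate(int opcode)
{
	assert(opcode == 0 || opcode == 1);
	return opcode == 0 ? emu_fsqrt(NULL, NULL, NULL, NULL)
			   : emu_fadd(NULL, NULL, NULL, NULL);
}
```

In this sketch, `emulate(0)` reaches the real routine while `emulate(1)` reaches the stub, so the dispatcher compiles unchanged in either configuration and only the set of linked objects differs.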

Patch

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index 3bf72cd..7205989 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -312,6 +312,26 @@  config MATH_EMULATION
 	  such as fsqrt on cores that do have an FPU but do not implement
 	  them (such as Freescale BookE).
 
+choice
+	prompt "Math emulation options"
+	default MATH_EMULATION_FULL
+	depends on MATH_EMULATION
+
+config	MATH_EMULATION_FULL
+	bool "Emulate all the floating point instructions"
+	---help---
+	  Selecting this option enables the kernel to emulate all
+	  floating point instructions. If your SoC doesn't have
+	  an FPU, you should select this.
+
+config MATH_EMULATION_HW_UNIMPLEMENTED
+	bool "Just emulate the FPU unimplemented instructions"
+	---help---
+	  Select this if your SoC has a hardware FPU that does not
+	  implement some floating point instructions.
+
+endchoice
+
 config PPC_TRANSACTIONAL_MEM
        bool "Transactional Memory support for POWERPC"
        depends on PPC_BOOK3S_64
diff --git a/arch/powerpc/math-emu/Makefile b/arch/powerpc/math-emu/Makefile
index 8d035d2..1b46ab4 100644
--- a/arch/powerpc/math-emu/Makefile
+++ b/arch/powerpc/math-emu/Makefile
@@ -1,15 +1,15 @@ 
-
-obj-$(CONFIG_MATH_EMULATION)	+= fabs.o fadd.o fadds.o fcmpo.o fcmpu.o \
-					fctiw.o fctiwz.o fdiv.o fdivs.o \
-					fmadd.o fmadds.o fmsub.o fmsubs.o \
-					fmul.o fmuls.o fnabs.o fneg.o \
-					fnmadd.o fnmadds.o fnmsub.o fnmsubs.o \
-					fres.o fre.o frsp.o fsel.o lfs.o \
-					frsqrte.o frsqrtes.o \
-					fsqrt.o	fsqrts.o fsub.o fsubs.o \
-					mcrfs.o mffs.o mtfsb0.o mtfsb1.o \
-					mtfsf.o mtfsfi.o stfiwx.o stfs.o \
-					math.o fmr.o lfd.o stfd.o
+math-emu-common-objs = math.o fre.o fsqrt.o fsqrts.o frsqrtes.o mtfsf.o mtfsfi.o
+obj-$(CONFIG_MATH_EMULATION_HW_UNIMPLEMENTED) += $(math-emu-common-objs)
+obj-$(CONFIG_MATH_EMULATION_FULL) += $(math-emu-common-objs) fabs.o fadd.o \
+					fadds.o fcmpo.o fcmpu.o fctiw.o \
+					fctiwz.o fdiv.o fdivs.o  fmadd.o \
+					fmadds.o fmsub.o fmsubs.o fmul.o \
+					fmuls.o fnabs.o fneg.o fnmadd.o \
+					fnmadds.o fnmsub.o fnmsubs.o fres.o \
+					frsp.o fsel.o lfs.o frsqrte.o fsub.o \
+					fsubs.o  mcrfs.o mffs.o mtfsb0.o \
+					mtfsb1.o stfiwx.o stfs.o math.o \
+					fmr.o lfd.o stfd.o
 
 obj-$(CONFIG_SPE)		+= math_efp.o
 
diff --git a/arch/powerpc/math-emu/math.c b/arch/powerpc/math-emu/math.c
index d1ebac7..bc90162 100644
--- a/arch/powerpc/math-emu/math.c
+++ b/arch/powerpc/math-emu/math.c
@@ -14,6 +14,20 @@ 
 
 #define FLOATFUNC(x)	extern int x(void *, void *, void *, void *)
 
+/* Instructions that a hardware FPU may not implement */
+FLOATFUNC(fre);
+FLOATFUNC(frsqrtes);
+FLOATFUNC(fsqrt);
+FLOATFUNC(fsqrts);
+FLOATFUNC(mtfsf);
+FLOATFUNC(mtfsfi);
+
+#ifdef CONFIG_MATH_EMULATION_HW_UNIMPLEMENTED
+#undef FLOATFUNC
+#define FLOATFUNC(x)	static inline int x(void *op1, void *op2, void *op3, \
+						 void *op4) { return 0; }
+#endif
+
 FLOATFUNC(fadd);
 FLOATFUNC(fadds);
 FLOATFUNC(fdiv);
@@ -43,8 +57,6 @@  FLOATFUNC(mcrfs);
 FLOATFUNC(mffs);
 FLOATFUNC(mtfsb0);
 FLOATFUNC(mtfsb1);
-FLOATFUNC(mtfsf);
-FLOATFUNC(mtfsfi);
 
 FLOATFUNC(lfd);
 FLOATFUNC(lfs);
@@ -59,13 +71,9 @@  FLOATFUNC(fnabs);
 FLOATFUNC(fneg);
 
 /* Optional */
-FLOATFUNC(fre);
 FLOATFUNC(fres);
 FLOATFUNC(frsqrte);
-FLOATFUNC(frsqrtes);
 FLOATFUNC(fsel);
-FLOATFUNC(fsqrt);
-FLOATFUNC(fsqrts);
 
 
 #define OP31		0x1f		/*   31 */