Patchwork [U-Boot] arm: add 64-64 bit divider

Submitter Che-liang Chiou
Date Aug. 31, 2011, 10:38 a.m.
Message ID <1314787130-1043-1-git-send-email-clchiou@chromium.org>
Permalink /patch/112505/
State Changes Requested

Comments

Che-liang Chiou - Aug. 31, 2011, 10:38 a.m.
This patch adds a 64-64 bit divider that supports ARMv4 and above.

Because the clz (count leading zeros) instruction was not added until
ARMv5, the divider implements a clz function for ARMv4 targets.

The divider was tested with the following test driver code, run under
qemu-arm:

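  #include <stdio.h>

  int main(void)
  {
    /* unsigned long long matches the %llx conversions used below */
    unsigned long long a, b, q, r;
    /* stop on EOF or on a malformed line (fewer than four fields) */
    while (scanf("%llx %llx %llx %llx", &a, &b, &q, &r) == 4)
      printf("%016llx %016llx %016llx %016llx\n", a, b, a / b, a % b);
    return 0;
  }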

Signed-off-by: Che-Liang Chiou <clchiou@chromium.org>
Cc: Albert Aribaud <albert.u.boot@aribaud.net>
---
This patch was also tested with `MAKEALL -a arm`

 arch/arm/lib/Makefile    |    1 +
 arch/arm/lib/_uldivmod.S |  266 ++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 267 insertions(+), 0 deletions(-)
 create mode 100644 arch/arm/lib/_uldivmod.S
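
For reference, a self-checking variant of the test driver might look like
the sketch below. It is not part of the patch: the helper name
check_divmod is hypothetical, and the sketch assumes a hosted C
environment. On 32-bit ARM EABI targets the compiler lowers 64-bit
unsigned / and % to calls to __aeabi_uldivmod, so the helper exercises the
routine under test.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: returns 1 when a / b and a % b match the expected
 * quotient q and remainder r, and the expected pair is itself consistent
 * (q * b + r == a with r < b). Returns 0 otherwise. b must be non-zero.
 */
static int check_divmod(uint64_t a, uint64_t b, uint64_t q, uint64_t r)
{
	assert(b != 0);
	return (a / b == q) && (a % b == r) && (q * b + r == a) && (r < b);
}
```

Each input line of the original driver (a, b, q, r) maps onto one call,
so a mismatch can be reported directly instead of diffing the output.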
Marek Vasut - Aug. 31, 2011, 11:56 a.m.
On Wednesday, August 31, 2011 12:38:50 PM Che-Liang Chiou wrote:
> This patch adds a 64-64 bit divider that supports ARMv4 and above.
> 
> Because the clz (count leading zeros) instruction was not added until
> ARMv5, the divider implements a clz function for ARMv4 targets.
> 
> The divider was tested with the following test driver code, run under
> qemu-arm:
> 
>   int main(void)
>   {
>     uint64_t a, b, q, r;
>     while (scanf("%llx %llx %llx %llx", &a, &b, &q, &r) > 0)
>       printf("%016llx %016llx %016llx %016llx\n", a, b, a / b, a % b);
>     return 0;
>   }
> 
> Signed-off-by: Che-Liang Chiou <clchiou@chromium.org>
> Cc: Albert Aribaud <albert.u.boot@aribaud.net>
> ---

Hi,

Do you see a performance hit that prevents you from using the default "C"
version?

Cheers
Mike Frysinger - Aug. 31, 2011, 2:32 p.m.
On Wednesday, August 31, 2011 06:38:50 Che-Liang Chiou wrote:
> This patch adds a 64-64 bit divider that supports ARMv4 and above.

why ?  if you're doing 64 bit divides, chances are you're doing something 
fundamentally wrong.  perhaps you should fix that instead.

this is also why we have the do_div() helper macro.

so until your changelog documents the actual *reason* for this patch: NAK
-mike
Marek Vasut - Aug. 31, 2011, 3:11 p.m.
On Wednesday, August 31, 2011 04:32:52 PM Mike Frysinger wrote:
> On Wednesday, August 31, 2011 06:38:50 Che-Liang Chiou wrote:
> > This patch adds a 64-64 bit divider that supports ARMv4 and above.
> 
> why ?  if you're doing 64 bit divides, chances are you're doing something
> fundamentally wrong.  perhaps you should fix that instead.

Oh come on Mike, what about too big NAND memories ?
> 
> this is also why we have the do_div() helper macro.
> 
> so until your changelog documents the actual *reason* for this patch: NAK

The reason is likely it's faster. But I don't think it matters, that's why I 
commented on this already.

If he's fixing something by this (like I mistakenly did some time ago), there's 
really something wrong.

> -mike
Mike Frysinger - Aug. 31, 2011, 3:27 p.m.
On Wednesday, August 31, 2011 11:11:00 Marek Vasut wrote:
> On Wednesday, August 31, 2011 04:32:52 PM Mike Frysinger wrote:
> > On Wednesday, August 31, 2011 06:38:50 Che-Liang Chiou wrote:
> > > This patch adds a 64-64 bit divider that supports ARMv4 and above.
> > 
> > why ?  if you're doing 64 bit divides, chances are you're doing something
> > fundamentally wrong.  perhaps you should fix that instead.
> 
> Oh come on Mike, what about too big NAND memories ?

Linux hasnt had a problem supporting large NAND without a 64bit divide 
routine.  why are we special ?

> > this is also why we have the do_div() helper macro.
> > 
> > so until your changelog documents the actual *reason* for this patch: NAK
> 
> The reason is likely it's faster.

let's see actual #'s
-mike
Marek Vasut - Aug. 31, 2011, 3:33 p.m.
On Wednesday, August 31, 2011 05:27:46 PM Mike Frysinger wrote:
> On Wednesday, August 31, 2011 11:11:00 Marek Vasut wrote:
> > On Wednesday, August 31, 2011 04:32:52 PM Mike Frysinger wrote:
> > > On Wednesday, August 31, 2011 06:38:50 Che-Liang Chiou wrote:
> > > > This patch adds a 64-64 bit divider that supports ARMv4 and above.
> > > 
> > > why ?  if you're doing 64 bit divides, chances are you're doing
> > > something fundamentally wrong.  perhaps you should fix that instead.
> > 
> > Oh come on Mike, what about too big NAND memories ?
> 
> Linux hasnt had a problem supporting large NAND without a 64bit divide
> routine.  why are we special ?

Because someone (?) has to fix the code that uses do_div() ;-)

> 
> > > this is also why we have the do_div() helper macro.
> > > 
> > > so until your changelog documents the actual *reason* for this patch:
> > > NAK
> > 
> > The reason is likely it's faster.
> 
> let's see actual #'s

True, will you make the measurements? ;-)

Still, I'd stick with the plain-C version, it doesn't matter I guess.

Cheers
> -mike
Mike Frysinger - Aug. 31, 2011, 4:05 p.m.
On Wednesday, August 31, 2011 11:33:59 Marek Vasut wrote:
> On Wednesday, August 31, 2011 05:27:46 PM Mike Frysinger wrote:
> > On Wednesday, August 31, 2011 11:11:00 Marek Vasut wrote:
> > > On Wednesday, August 31, 2011 04:32:52 PM Mike Frysinger wrote:
> > > > On Wednesday, August 31, 2011 06:38:50 Che-Liang Chiou wrote:
> > > > > This patch adds a 64-64 bit divider that supports ARMv4 and above.
> > > > 
> > > > why ?  if you're doing 64 bit divides, chances are you're doing
> > > > something fundamentally wrong.  perhaps you should fix that instead.
> > > 
> > > Oh come on Mike, what about too big NAND memories ?
> > 
> > Linux hasnt had a problem supporting large NAND without a 64bit divide
> > routine.  why are we special ?
> 
> Because someone (?) has to fix the code that uses do_div() ;-)

sure ... the guy with the problem gets to post the fix :)
-mike
Marek Vasut - Aug. 31, 2011, 4:30 p.m.
On Wednesday, August 31, 2011 06:05:29 PM Mike Frysinger wrote:
> On Wednesday, August 31, 2011 11:33:59 Marek Vasut wrote:
> > On Wednesday, August 31, 2011 05:27:46 PM Mike Frysinger wrote:
> > > On Wednesday, August 31, 2011 11:11:00 Marek Vasut wrote:
> > > > On Wednesday, August 31, 2011 04:32:52 PM Mike Frysinger wrote:
> > > > > On Wednesday, August 31, 2011 06:38:50 Che-Liang Chiou wrote:
> > > > > > This patch adds a 64-64 bit divider that supports ARMv4 and
> > > > > > above.
> > > > > 
> > > > > why ?  if you're doing 64 bit divides, chances are you're doing
> > > > > something fundamentally wrong.  perhaps you should fix that
> > > > > instead.
> > > > 
> > > > Oh come on Mike, what about too big NAND memories ?
> > > 
> > > Linux hasnt had a problem supporting large NAND without a 64bit divide
> > > routine.  why are we special ?
> > 
> > Because someone (?) has to fix the code that uses do_div() ;-)
> 
> sure ... the guy with the problem gets to post the fix :)

Cool, would that be ... you ? ;-)

Cheers
Mike Frysinger - Aug. 31, 2011, 5:13 p.m.
On Wednesday, August 31, 2011 12:30:25 Marek Vasut wrote:
> On Wednesday, August 31, 2011 06:05:29 PM Mike Frysinger wrote:
> > On Wednesday, August 31, 2011 11:33:59 Marek Vasut wrote:
> > > On Wednesday, August 31, 2011 05:27:46 PM Mike Frysinger wrote:
> > > > On Wednesday, August 31, 2011 11:11:00 Marek Vasut wrote:
> > > > > On Wednesday, August 31, 2011 04:32:52 PM Mike Frysinger wrote:
> > > > > > On Wednesday, August 31, 2011 06:38:50 Che-Liang Chiou wrote:
> > > > > > > This patch adds a 64-64 bit divider that supports ARMv4 and
> > > > > > > above.
> > > > > > 
> > > > > > why ?  if you're doing 64 bit divides, chances are you're doing
> > > > > > something fundamentally wrong.  perhaps you should fix that
> > > > > > instead.
> > > > > 
> > > > > Oh come on Mike, what about too big NAND memories ?
> > > > 
> > > > Linux hasnt had a problem supporting large NAND without a 64bit
> > > > divide routine.  why are we special ?
> > > 
> > > Because someone (?) has to fix the code that uses do_div() ;-)
> > 
> > sure ... the guy with the problem gets to post the fix :)
> 
> Cool, would that be ... you ? ;-)

no, because it's building fine for me, thus i dont have a problem
-mike
Wolfgang Denk - Aug. 31, 2011, 8:03 p.m.
Dear Che-Liang Chiou,

In message <1314787130-1043-1-git-send-email-clchiou@chromium.org> you wrote:
> This patch adds a 64-64 bit divider that supports ARMv4 and above.

To summarize the misc feedback:  Please explain in detail which
problem you are trying to fix.  We see no need for this patch so far.

Best regards,

Wolfgang Denk
Che-liang Chiou - Sept. 1, 2011, 10:09 a.m.
Hi,

Thanks for the insightful comments. Here are my responses:

* Why don't I implement the divider in C?
It is not because I think it's performance critical (I haven't
benchmarked it yet), but because I had the (probably wrong) impression
that the divider has to be written in assembly: all dividers in
arch/arm/lib/ are written in ARM assembly. What is the policy here for
choosing between assembly and C?

* When do we need a 64-bit divider?
In kernel code, do_div() is used for various purposes, so I think
U-Boot would need a 64-bit divider fairly often.

* Do we need a 64-64 bit divider?
do_div() defines 64-32 bit division semantics (dividend is 64-bit and
divisor is 32-bit), and this patch implements a 64-64 bit divider
(both dividend and divisor are 64-bit). I have to admit that I can't
think of scenarios or reasons to justify a 64-64 bit divider instead
of a 64-32 bit divider, except that a 64-64 bit divider is more
generic than a 64-32 bit one.
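
To make the 64-32 semantics concrete, here is a minimal sketch in portable
C. The function do_div_sketch is a hypothetical stand-in, not the real
macro: do_div() takes its 64-bit operand by name and updates it in place,
whereas this sketch takes a pointer so that it can be an ordinary function.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for do_div(): divides *n by a 32-bit base,
 * stores the quotient back in *n and returns the remainder.
 */
static uint32_t do_div_sketch(uint64_t *n, uint32_t base)
{
	uint32_t rem;

	assert(base != 0);
	rem = (uint32_t)(*n % base);	/* remainder always fits in 32 bits */
	*n /= base;			/* quotient replaces the dividend */
	return rem;
}
```

The point of the narrower interface is that the remainder is always
smaller than the 32-bit divisor, so only the quotient needs 64 bits.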

So I guess we can agree that a 64-bit divider is a feature that is nice
to have, and we should decide:
* Do we need a 64-64 bit divider or a 64-32 bit one?
* Do we write it in C or assembly?

Depending on our decisions, I will rewrite (or abandon) this patch accordingly.

Regards,
Che-Liang

On Thu, Sep 1, 2011 at 4:03 AM, Wolfgang Denk <wd@denx.de> wrote:
> Dear Che-Liang Chiou,
>
> In message <1314787130-1043-1-git-send-email-clchiou@chromium.org> you wrote:
>> This patch adds a 64-64 bit divider that supports ARMv4 and above.
>
> To summarize the misc feedback:  Please explain in detail which
> problem you are trying to fix.  We see no need for this patch so far.
>
> Best regards,
>
> Wolfgang Denk
>
> --
> DENX Software Engineering GmbH,     MD: Wolfgang Denk & Detlev Zundel
> HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany
> Phone: (+49)-8142-66989-10 Fax: (+49)-8142-66989-80 Email: wd@denx.de
> "Success covers a multitude of blunders."       - George Bernard Shaw
>
Marek Vasut - Sept. 1, 2011, 10:16 a.m.
On Thursday, September 01, 2011 12:09:18 PM Che-liang Chiou wrote:
> Hi,
> 
> Thanks for the insightful comments. Here are my responses:
> 
> * Why don't I implement the divider in C?
> It is not because I think it's performance critical (I haven't
> benchmarked it yet), but because I have a probably wrong impression
> that the divider has to be written in assembly --- all dividers in
> arch/arm/lib/ are written in ARM assembly. What is the policy here for
> using assembly or C?

No, C is just fine and is more generic. Those assembler versions are just 
optimized things, you don't need to be bothered by those.

> 
> * When do we need a 64-bit divider?
> In kernel code do_div() is used for various purposes. So I think it
> should be quite often that we would need a 64-bit divider in U-Boot.

Not much really ... and for the rare cases, we can do with do_div() as is.

> 
> * Do we need a 64-64 bit divider?
> do_div() defines 64-32 bit division semantics (dividend is 64-bit and
> divisor is 32-bit), and this patch implements a 64-64 bit divider
> (both dividend and divisor are 64-bit). I have to admit that I can't
> think of scenarios or reasons to justify a 64-64 bit divider instead
> of a 64-32 bit divider, except that a 64-64 bit divider is more
> generic than a 64-32 bit one.

So we don't need 64/64 divide at all.

> 
> So I guess we can agree that a 64-bit divider is feature that is nice
> to have, and we should decide:
> * Do we need a 64-64 bit divider or a 64-32 bit one?

64-32 is do_div()

> * Do we write it in C or assembly?

C is OK.

> 
> Depending on our decisions, I will rewrite (or abandon) this patch
> accordingly.

Look, I don't mean to be rough, but honestly, I see no use for this code.
Adding code that would just sit there unused is bad.

Cheers

> 
> Regards,
> Che-Liang
> 
> On Thu, Sep 1, 2011 at 4:03 AM, Wolfgang Denk <wd@denx.de> wrote:
> > Dear Che-Liang Chiou,
> > 
> > In message <1314787130-1043-1-git-send-email-clchiou@chromium.org> you wrote:
> >> This patch adds a 64-64 bit divider that supports ARMv4 and above.
> > 
> > To summarize the misc feedback:  Please explain in detail which
> > problem you are trying to fix.  We see no need for this patch so far.
> > 
> > Best regards,
> > 
> > Wolfgang Denk
> > 
> > --
> > DENX Software Engineering GmbH,     MD: Wolfgang Denk & Detlev Zundel
> > HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany
> > Phone: (+49)-8142-66989-10 Fax: (+49)-8142-66989-80 Email: wd@denx.de
> > "Success covers a multitude of blunders."       - George Bernard Shaw
Che-liang Chiou - Sept. 1, 2011, 10:30 a.m.
Hi Marek,

I will abandon this patch and submit a new patch that is adapted from
do_div() and lib/div64.c of the Linux kernel. Does this sound okay to you?

Regards,
Che-Liang

On Thu, Sep 1, 2011 at 6:16 PM, Marek Vasut <marek.vasut@gmail.com> wrote:
> On Thursday, September 01, 2011 12:09:18 PM Che-liang Chiou wrote:
>> Hi,
>>
>> Thanks for the insightful comments. Here are my responses:
>>
>> * Why don't I implement the divider in C?
>> It is not because I think it's performance critical (I haven't
>> benchmarked it yet), but because I have a probably wrong impression
>> that the divider has to be written in assembly --- all dividers in
>> arch/arm/lib/ are written in ARM assembly. What is the policy here for
>> using assembly or C?
>
> No, C is just fine and is more generic. Those assembler versions are just
> optimized things, you don't need to be bothered by those.
>
>>
>> * When do we need a 64-bit divider?
>> In kernel code do_div() is used for various purposes. So I think it
>> should be quite often that we would need a 64-bit divider in U-Boot.
>
> Not much really ... and for the rare cases, we can do with do_div() as is.
>
>>
>> * Do we need a 64-64 bit divider?
>> do_div() defines 64-32 bit division semantics (dividend is 64-bit and
>> divisor is 32-bit), and this patch implements a 64-64 bit divider
>> (both dividend and divisor are 64-bit). I have to admit that I can't
>> think of scenarios or reasons to justify a 64-64 bit divider instead
>> of a 64-32 bit divider, except that a 64-64 bit divider is more
>> generic than a 64-32 bit one.
>
> So we don't need 64/64 divide at all.
>
>>
>> So I guess we can agree that a 64-bit divider is feature that is nice
>> to have, and we should decide:
>> * Do we need a 64-64 bit divider or a 64-32 bit one?
>
> 64-32 is do_div()
>
>> * Do we write it in C or assembly?
>
> C is OK.
>
>>
>> Depending on our decisions, I will rewrite (or abandon) this patch
>> accordingly.
>
> Look, I don't mean to be rough, but honestly. I see no use for this code. Adding
> code to anywhere so it'd just sit there is bad.
>
> Cheers
>
>>
>> Regards,
>> Che-Liang
>>
>> On Thu, Sep 1, 2011 at 4:03 AM, Wolfgang Denk <wd@denx.de> wrote:
>> > Dear Che-Liang Chiou,
>> >
>> > In message <1314787130-1043-1-git-send-email-clchiou@chromium.org> you wrote:
>> >> This patch adds a 64-64 bit divider that supports ARMv4 and above.
>> >
>> > To summarize the misc feedback:  Please explain in detail which
>> > problem you are trying to fix.  We see no need for this patch so far.
>> >
>> > Best regards,
>> >
>> > Wolfgang Denk
>> >
>> > --
>> > DENX Software Engineering GmbH,     MD: Wolfgang Denk & Detlev Zundel
>> > HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany
>> > Phone: (+49)-8142-66989-10 Fax: (+49)-8142-66989-80 Email: wd@denx.de
>> > "Success covers a multitude of blunders."       - George Bernard Shaw
>
Marek Vasut - Sept. 1, 2011, 10:42 a.m.
On Thursday, September 01, 2011 12:30:47 PM Che-liang Chiou wrote:
> Hi Marek,
> 
> I will abandon this patch and submit a new patch that is adapted from
> do_div() and lib64.c of the Linux kernel. Does this sound okay to you?

I'm not against it, but is it worth the effort? Like ... why do we need it ?
> 
> Regards,
> Che-Liang
[...]
Che-liang Chiou - Sept. 1, 2011, 12:06 p.m.
Hi Marek,

do_div() and lib/div64.c from the Linux kernel have been in U-Boot
since at least October 2006 (the earliest record that I can find; see
commit 7b64fef3).

Regards,
Che-Liang

On Thu, Sep 1, 2011 at 6:42 PM, Marek Vasut <marek.vasut@gmail.com> wrote:
> On Thursday, September 01, 2011 12:30:47 PM Che-liang Chiou wrote:
>> Hi Marek,
>>
>> I will abandon this patch and submit a new patch that is adapted from
>> do_div() and lib64.c of the Linux kernel. Does this sound okay to you?
>
> I'm not against it, but is it worth the effort? Like ... why do we need it ?
>>
>> Regards,
>> Che-Liang
> [...]
>
Wolfgang Denk - Sept. 1, 2011, 1:07 p.m.
Dear Che-liang Chiou,

In message <CANJuy2K9uWdxT6T=mMj0yLiV3cAJUHLNC4LGDv21sP-DMGVzUg@mail.gmail.com> you wrote:
> 
> do_div() and lib/div64.c of linux kernel has been ported to U-Boot
> since Oct, 2006 (this date is the earliest record that I can find; see
> commit 7b64fef3).

Indeed, and so far nobody ever needed the patch you submitted, so
please explain in detail why you need it now?

Best regards,

Wolfgang Denk
Che-liang Chiou - Sept. 2, 2011, 3:12 a.m.
Dear Wolfgang,

I am convinced that a 64-64 bit divider (this patch) is not needed. Is
there any way to mark a patch as "abandoned"?

Regards,
Che-Liang

On Thu, Sep 1, 2011 at 9:07 PM, Wolfgang Denk <wd@denx.de> wrote:
> Dear Che-liang Chiou,
>
> In message <CANJuy2K9uWdxT6T=mMj0yLiV3cAJUHLNC4LGDv21sP-DMGVzUg@mail.gmail.com> you wrote:
>>
>> do_div() and lib/div64.c of linux kernel has been ported to U-Boot
>> since Oct, 2006 (this date is the earliest record that I can find; see
>> commit 7b64fef3).
>
> Indeed, and so far nobody ever needed the patch you submitted, so
> please explain in detail why you need it now?
>
> Best regards,
>
> Wolfgang Denk
>
> --
> DENX Software Engineering GmbH,     MD: Wolfgang Denk & Detlev Zundel
> HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany
> Phone: (+49)-8142-66989-10 Fax: (+49)-8142-66989-80 Email: wd@denx.de
> "More software projects have gone awry for lack of calendar time than
> for all other causes combined."
>                         - Fred Brooks, Jr., _The Mythical Man Month_
>
Wolfgang Denk - Sept. 7, 2011, 9:14 p.m.
Dear Che-liang Chiou,

In message <CANJuy2+BB7tA70vHoTq3LJA-o4ymGCSdP9PnFzrx-uWret7nqQ@mail.gmail.com> you wrote:
> 
> So I guess we can agree that a 64-bit divider is feature that is nice
> to have, and we should decide:
> * Do we need a 64-64 bit divider or a 64-32 bit one?
> * Do we write it in C or assembly?

The situation is simple:  there is no code in U-Boot that needs this
feature, and we try to avoid adding dead code.

If you don't have a use case at hand that actually requires this, then
please let's drop it.

Thanks.

Best regards,

Wolfgang Denk
Graeme Russ - Sept. 20, 2011, 10:45 a.m.
Hi Wolfgang,

On 08/09/11 07:14, Wolfgang Denk wrote:
> Dear Che-liang Chiou,
> 
> In message <CANJuy2+BB7tA70vHoTq3LJA-o4ymGCSdP9PnFzrx-uWret7nqQ@mail.gmail.com> you wrote:
>>
>> So I guess we can agree that a 64-bit divider is feature that is nice
>> to have, and we should decide:
>> * Do we need a 64-64 bit divider or a 64-32 bit one?
>> * Do we write it in C or assembly?
> 
> The situation is simple:  there is no code in U-Boot that needs this
> feature, and we try to avoid adding dead code.
> 
> If you don't have a use case at hand that actually requires this, then
> please let's drop it.

You'll laugh at this - the Intel High Precision Event Timers (HPET) are
defined to a resolution of femtoseconds and you end up with code in
get_timer() like:

	u32 count_low;
	u32 count_high;
	u32 fs_per_tick;
	u64 ticks;
	u64 fs;
	u32 ms;

	count_low = readl(&hpet_registers->main_count_low);
	count_high = readl(&hpet_registers->main_count_high);
	fs_per_tick = readl(&hpet_registers->counter_clk_period);

	ticks = ((u64)count_high << 32) | ((u64)count_low);
	fs = fs_per_tick * ticks;
	ms = (u32)lldiv(fs, 1000000000000);

But I can right shift both divisor and dividend by 12 bits without losing
any significant precision which turns it into:

	ms = (u32)lldiv(fs >> 12, 244140625);

So I almost needed a 64 bit divisor.
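
The shift works because 10^12 = 2^12 * 5^12 and 5^12 = 244140625 fits in
32 bits. For unsigned integers, dividing by 2^12 and then by 5^12 gives the
same floor as dividing by 10^12 in one step, so the result is exact. A
minimal sketch of the arithmetic follows. The helper names are
hypothetical and plain C division stands in for lldiv().

```c
#include <assert.h>
#include <stdint.h>

/* 10^12 fs per ms = 2^12 * 5^12; 5^12 = 244140625 fits in 32 bits. */
#define FS_PER_MS_SHIFT	12
#define FS_PER_MS_ODD	244140625U

/* Hypothetical helper: fs to ms using only a 64-by-32 bit division. */
static uint32_t fs_to_ms(uint64_t fs)
{
	return (uint32_t)((fs >> FS_PER_MS_SHIFT) / FS_PER_MS_ODD);
}

/* Reference version with a full 64-bit divisor, for comparison only. */
static uint32_t fs_to_ms_ref(uint64_t fs)
{
	return (uint32_t)(fs / 1000000000000ULL);
}
```

Both helpers truncate the result to 32 bits, as the cast in the original
snippet does.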

Regards,

Graeme
Wolfgang Denk - Sept. 20, 2011, 11:28 a.m.
Dear Graeme Russ,

In message <4E786EBA.5040805@gmail.com> you wrote:
> 
> You'll laugh at this - the Intel High Precision Event Timers (HPET) are
> defined to a resolution of femto-seconds and you end up with code in
> get_timer() like:

I have to admit that I have never been able to laugh about x86 design
issues.  But then, Intel told us the Pentium would have "RISK"
features...


Best regards,

Wolfgang Denk
Graeme Russ - Sept. 20, 2011, 11:40 a.m.
On 20/09/11 21:28, Wolfgang Denk wrote:
> Dear Graeme Russ,
> 
> In message <4E786EBA.5040805@gmail.com> you wrote:
>>
>> You'll laugh at this - the Intel High Precision Event Timers (HPET) are
>> defined to a resolution of femto-seconds and you end up with code in
>> get_timer() like:
> 
> I have to admit that I have never been able to laugh about x86 design
> issues.  But then, Intel told us the Pentium would have "RISK"
> features...

*ROFL*

Well actually, it's not really an x86 thing - Any architecture could
implement HPET. Using femto-seconds as the time-base and defining a 'tick'
as a number of femto-seconds makes a lot of sense - It allows preservation
of timer accuracy through the comparators so interrupts can be generated
with extreme precision while actually allowing the source clock to be
pretty much any frequency. They were, after all, designed for multi-media
applications to solve the horrendous sub-ms accuracy issue with the older
programmable timers.

Regards,

Graeme

Patch

diff --git a/arch/arm/lib/Makefile b/arch/arm/lib/Makefile
index 300c8fa..31770dd 100644
--- a/arch/arm/lib/Makefile
+++ b/arch/arm/lib/Makefile
@@ -33,6 +33,7 @@  GLSOBJS	+= _divsi3.o
 GLSOBJS	+= _lshrdi3.o
 GLSOBJS	+= _modsi3.o
 GLSOBJS	+= _udivsi3.o
+GLSOBJS	+= _uldivmod.o
 GLSOBJS	+= _umodsi3.o
 
 GLCOBJS	+= div0.o
diff --git a/arch/arm/lib/_uldivmod.S b/arch/arm/lib/_uldivmod.S
new file mode 100644
index 0000000..9e3a5e6
--- /dev/null
+++ b/arch/arm/lib/_uldivmod.S
@@ -0,0 +1,266 @@ 
+/*
+ * Copyright (c) 2011 The Chromium OS Authors.
+ * See file CREDITS for list of people who contributed to this
+ * project.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License as
+ * published by the Free Software Foundation; either version 2 of
+ * the License, or (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place, Suite 330, Boston,
+ * MA 02111-1307 USA
+ */
+
+/*
+ * A, Q = r0 + (r1 << 32)
+ * B, R = r2 + (r3 << 32)
+ * A / B = Q ... R
+ */
+
+	.text
+	.global	__aeabi_uldivmod
+	.type	__aeabi_uldivmod, function
+	.align	0
+
+/* armv4 does not support clz (count leading zero) instruction. */
+#if __LINUX_ARM_ARCH__ <= 4
+#  define CLZ(dst, src)		bl	L_clz_ ## dst ## _ ## src
+#  define CLZEQ(dst, src)	bleq	L_clz_ ## dst ## _ ## src
+#else
+#  define CLZ(dst, src)		clz	dst, src
+#  define CLZEQ(dst, src)	clzeq	dst, src
+#endif
+
+A_0	.req	r0
+A_1	.req	r1
+B_0	.req	r2
+B_1	.req	r3
+C_0	.req	r4
+C_1	.req	r5
+D_0	.req	r6
+D_1	.req	r7
+
+Q_0	.req	r0
+Q_1	.req	r1
+R_0	.req	r2
+R_1	.req	r3
+
+__aeabi_uldivmod:
+	stmfd	sp!, {r4, r5, r6, r7, lr}
+	@ Test if B == 0
+	orrs	ip, B_0, B_1		@ Z set -> B == 0
+	beq	L_div_by_0
+	@ Test if B is power of 2: (B & (B - 1)) == 0
+	subs	C_0, B_0, #1
+	sbc	C_1, B_1, #0
+	tst	C_0, B_0
+	tsteq	B_1, C_1
+	beq	L_pow2
+	@ Test if A_1 == B_1 == 0
+	orrs	ip, A_1, B_1
+	beq	L_div_32_32
+
+L_div_64_64:
+	mov	C_0, #1
+	mov	C_1, #0
+	@ D_0 = clz A
+	CLZ(D_0, A_1)
+	teq	A_1, #0
+	CLZEQ(ip, A_0)
+	teq	A_1, #0
+	addeq	D_0, D_0, ip
+	@ D_1 = clz B
+	CLZ(D_1, B_1)
+	teq	B_1, #0
+	CLZEQ(ip, B_0)
+	teq	B_1, #0
+	addeq	D_1, D_1, ip
+	@ if clz B - clz A <= 0: goto L_done_shift
+	subs	D_0, D_1, D_0
+	bls	L_done_shift
+	subs	D_1, D_0, #32
+	rsb	ip, D_0, #32
+	@ B <<= (clz B - clz A)
+	movmi	B_1, B_1, lsl D_0
+	orrmi	B_1, B_1, B_0, lsr ip
+	movpl	B_1, B_0, lsl D_1
+	mov	B_0, B_0, lsl D_0
+	@ C = 1 << (clz B - clz A)
+	movmi	C_1, C_1, lsl D_0
+	orrmi	C_1, C_1, C_0, lsr ip
+	movpl	C_1, C_0, lsl D_1
+	mov	C_0, C_0, lsl D_0
+L_done_shift:
+	mov	D_0, #0
+	mov	D_1, #0
+	@ C: current bit; D: result
+L_subtract:
+	@ if A >= B
+	cmp	A_1, B_1
+	cmpeq	A_0, B_0
+	bcc	L_update
+	@ A -= B
+	subs	A_0, A_0, B_0
+	sbc	A_1, A_1, B_1
+	@ D |= C
+	orr	D_0, D_0, C_0
+	orr	D_1, D_1, C_1
+L_update:
+	@ if A == 0: break
+	orrs	ip, A_1, A_0
+	beq	L_exit
+	@ C >>= 1
+	movs	C_1, C_1, lsr #1
+	movs	C_0, C_0, rrx
+	@ if C == 0: break
+	orrs	ip, C_1, C_0
+	beq	L_exit
+	@ B >>= 1
+	movs	B_1, B_1, lsr #1
+	mov	B_0, B_0, rrx
+	b	L_subtract
+L_exit:
+	@ Note: A, B & Q, R are aliases
+	mov	R_0, A_0
+	mov	R_1, A_1
+	mov	Q_0, D_0
+	mov	Q_1, D_1
+	ldmfd	sp!, {r4, r5, r6, r7, pc}
+
+L_div_32_32:
+	@ Note:	A_0 &	r0 are aliases
+	@	Q_1	r1
+	mov	r1, B_0
+	bl	__aeabi_uidivmod
+	mov	R_0, r1
+	mov	R_1, #0
+	mov	Q_1, #0
+	ldmfd	sp!, {r4, r5, r6, r7, pc}
+
+L_pow2:
+	@ Note: A, B and Q, R are aliases
+	@ R = A & (B - 1)
+	and	C_0, A_0, C_0
+	and	C_1, A_1, C_1
+	@ Q = A >> log2(B)
+	@ Note: B must not be 0 here!
+	CLZ(D_0, B_0)
+	add	D_1, D_0, #1
+	rsbs	D_0, D_0, #31
+	movpl	A_0, A_0, lsr D_0
+	orrpl	A_0, A_0, A_1, lsl D_1
+	bpl	L_1
+	CLZ(D_0, B_1)
+	rsb	D_0, D_0, #31
+	mov	A_0, A_1, lsr D_0
+	add	D_0, D_0, #32
+L_1:
+	mov	A_1, A_1, lsr D_0
+	@ Mov back C to R
+	mov	R_0, C_0
+	mov	R_1, C_1
+	ldmfd	sp!, {r4, r5, r6, r7, pc}
+
+L_div_by_0:
+	bl	__div0
+	@ As wrong as it could be
+	mov	Q_0, #0
+	mov	Q_1, #0
+	mov	R_0, #0
+	mov	R_1, #0
+	ldmfd	sp!, {r4, r5, r6, r7, pc}
+
+#if __LINUX_ARM_ARCH__ <= 4
+/*
+ * count leading zero
+ *
+ * input	: r0
+ * output	: r0
+ * destroy	: r1, r2, r3, r4, r5
+ */
+L_clz:
+	mov	r1, #0		// clz result
+	mov	r2, #0xf0000000	// mask
+	mov	r3, #28		// shift amount
+	adr	r4, L_clz_table
+L_clz_loop:
+	teq	r2, #0
+	beq	L_clz_loop_done
+	ands	r5, r0, r2
+	mov	r5, r5, lsr r3
+	ldrsb	r5, [r4, r5]
+	add	r1, r1, r5
+	mov	r2, r2, lsr #4
+	add	r3, r3, #-4
+	beq	L_clz_loop
+L_clz_loop_done:
+	mov	r0, r1
+	mov	pc, lr
+L_clz_table:
+	.byte	4
+	.byte	3
+	.byte	2
+	.byte	2
+	.byte	1
+	.byte	1
+	.byte	1
+	.byte	1
+	.byte	0
+	.byte	0
+	.byte	0
+	.byte	0
+	.byte	0
+	.byte	0
+	.byte	0
+	.byte	0
+
+L_clz_D_0_A_1:
+	stmfd	sp!, {r0, r1, r2, r3, r4, r5, lr}
+	mov	r0, A_1
+	bl	L_clz
+	mov	D_0, r0
+	ldmfd	sp!, {r0, r1, r2, r3, r4, r5, pc}
+
+L_clz_ip_A_0:
+	stmfd	sp!, {r0, r1, r2, r3, r4, r5, lr}
+	mov	r0, A_0
+	bl	L_clz
+	mov	ip, r0
+	ldmfd	sp!, {r0, r1, r2, r3, r4, r5, pc}
+
+L_clz_D_1_B_1:
+	stmfd	sp!, {r0, r1, r2, r3, r4, r5, lr}
+	mov	r0, B_1
+	bl	L_clz
+	mov	D_1, r0
+	ldmfd	sp!, {r0, r1, r2, r3, r4, r5, pc}
+
+L_clz_ip_B_0:
+	stmfd	sp!, {r0, r1, r2, r3, r4, r5, lr}
+	mov	r0, B_0
+	bl	L_clz
+	mov	ip, r0
+	ldmfd	sp!, {r0, r1, r2, r3, r4, r5, pc}
+
+L_clz_D_0_B_0:
+	stmfd	sp!, {r0, r1, r2, r3, r4, r5, lr}
+	mov	r0, B_0
+	bl	L_clz
+	mov	D_0, r0
+	ldmfd	sp!, {r0, r1, r2, r3, r4, r5, pc}
+
+L_clz_D_0_B_1:
+	stmfd	sp!, {r0, r1, r2, r3, r4, r5, lr}
+	mov	r0, B_1
+	bl	L_clz
+	mov	D_0, r0
+	ldmfd	sp!, {r0, r1, r2, r3, r4, r5, pc}
+#endif /* __LINUX_ARM_ARCH__  */
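
For readers who prefer C, the structure of the assembly above can be
sketched roughly as follows. This is an illustration under stated
assumptions, not the patch itself: all function names are hypothetical,
the 32/32 fast path is omitted, and division by zero is treated as a
precondition rather than routed to __div0.

```c
#include <assert.h>
#include <stdint.h>

/* Nibble-table count-leading-zeros, same idea as the ARMv4 fallback. */
static unsigned clz32_sketch(uint32_t x)
{
	static const uint8_t table[16] = {
		4, 3, 2, 2, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0
	};
	unsigned n = 0;
	int shift;

	for (shift = 28; shift >= 0; shift -= 4) {
		unsigned nibble = (x >> shift) & 0xf;

		n += table[nibble];
		if (nibble != 0)	/* first non-zero nibble ends the count */
			break;
	}
	return n;
}

static unsigned clz64_sketch(uint64_t x)
{
	uint32_t hi = (uint32_t)(x >> 32);

	return hi ? clz32_sketch(hi) : 32 + clz32_sketch((uint32_t)x);
}

/* Shift-and-subtract division: returns a / b and stores a % b in *rem. */
static uint64_t udivmod64_sketch(uint64_t a, uint64_t b, uint64_t *rem)
{
	uint64_t bit = 1, quot = 0;
	unsigned za, zb;

	assert(b != 0);
	if ((b & (b - 1)) == 0) {	/* power of two: mask and shift */
		*rem = a & (b - 1);
		return a >> (63 - clz64_sketch(b));
	}
	za = clz64_sketch(a);
	zb = clz64_sketch(b);
	if (zb > za) {			/* align the top bit of b with a */
		b <<= zb - za;
		bit <<= zb - za;
	}
	while (bit != 0) {
		if (a >= b) {		/* this quotient bit is set */
			a -= b;
			quot |= bit;
		}
		b >>= 1;
		bit >>= 1;
	}
	*rem = a;
	return quot;
}
```

Unlike the assembly, the loop here does not exit early when the running
remainder reaches zero. That early exit is only an optimization, and
leaving it out keeps the sketch shorter.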