
fbuffer: improve toggle cursor performance

Message ID 20150527001113.2506.23159.stgit@bahia.huguette.org (mailing list archive)
State Not Applicable

Commit Message

Greg Kurz May 27, 2015, 12:11 a.m. UTC
SLOF currently calls hv-logical-load and hv-logical-store for every pixel
when enabling or disabling the cursor. This is suboptimal when writing one
char at a time to the console since terminal-write always toggles the cursor.
And this is precisely what grub is doing when the user wants to edit a menu
entry... the result is an incredibly slow and barely usable interface.

The inner loop in fb8-toggle-cursor handles a contiguous region: it can be
converted to hv-logical-memop. The result is 32 times fewer hcalls per char
and a serious improvement in grub usability.

Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
---
 slof/fs/fbuffer.fs |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)
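
For a rough sense of where such a factor can come from, here is a small Python model (not SLOF code) that counts hypervisor calls for one cursor toggle. The cell geometry (8x16 glyph, 2 bytes per pixel) is an assumption picked so that the ratio comes out at 32; the patch does not state the actual values. The model also assumes each rb@ and each rb! costs exactly one hcall, as the description above suggests.

```python
# Hypothetical model: count hypervisor calls needed to invert one cursor cell.
# char_width, char_height and screen_depth (bytes per pixel) are example
# values chosen for illustration, not taken from the patch.

def hcalls_per_byte_loop(char_width, char_height, screen_depth):
    """Old scheme: one load (rb@) plus one store (rb!) for every byte."""
    bytes_per_row = char_width * screen_depth
    return char_height * bytes_per_row * 2


def hcalls_memop(char_width, char_height, screen_depth):
    """New scheme: one memop call per cursor row, whatever the row length."""
    return char_height


old = hcalls_per_byte_loop(char_width=8, char_height=16, screen_depth=2)
new = hcalls_memop(char_width=8, char_height=16, screen_depth=2)
print(old, new, old // new)  # prints "512 16 32"
```

In this model the ratio is 2 * char_width * screen_depth and does not depend on the glyph height, since both schemes iterate over the same number of rows.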

Comments

Nikunj A Dadhania May 27, 2015, 5:11 a.m. UTC | #1
Greg Kurz <gkurz@linux.vnet.ibm.com> writes:

> SLOF currently calls hv-logical-load and hv-logical-store for every pixel
> when enabling or disabling the cursor. This is suboptimal when writing one
> char at a time to the console since terminal-write always toggles the cursor.
> And this is precisely what grub is doing when the user wants to edit a menu
> entry... the result is an incredibly slow and barely usable interface.
>
> The inner loop in fb8-toggle-cursor handles a contiguous region: it can be
> converted to hv-logical-memop. The result is 32 times fewer hcalls per char
> and a serious improvement in grub usability.
>
> Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
> ---
>  slof/fs/fbuffer.fs |    4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/slof/fs/fbuffer.fs b/slof/fs/fbuffer.fs
> index 756f05a..46b59bf 100644
> --- a/slof/fs/fbuffer.fs
> +++ b/slof/fs/fbuffer.fs
> @@ -99,8 +99,8 @@ CREATE bitmap-buffer 400 4 * allot
>  : fb8-toggle-cursor ( -- )
>  	line# fb8-line2addr column# fb8-columns2bytes +
>  	char-height 0 ?DO
> -		char-width screen-depth * 0 ?DO dup dup rb@ -1 xor swap rb! 1+ LOOP
> -		screen-width screen-depth * + char-width screen-depth * -
> +		dup dup 0 char-width screen-depth * 1 hv-logical-memop drop
> +		screen-width screen-depth * +

Why did you drop "char-width screen-depth * -" in the new code? This is
not mentioned in the description.

Regards
Nikunj
Thomas Huth May 27, 2015, 5:59 a.m. UTC | #2
On Wed, 27 May 2015 02:11:13 +0200
Greg Kurz <gkurz@linux.vnet.ibm.com> wrote:

> SLOF currently calls hv-logical-load and hv-logical-store for every pixel
> when enabling or disabling the cursor. This is suboptimal when writing one
> char at a time to the console since terminal-write always toggles the cursor.
> And this is precisely what grub is doing when the user wants to edit a menu
> entry... the result is an incredibly slow and barely usable interface.
> 
> The inner loop in fb8-toggle-cursor handles a contiguous region: it can be
> converted to hv-logical-memop. The result is 32 times fewer hcalls per char
> and a serious improvement in grub usability.

Good idea for an optimization!

> Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
> ---
>  slof/fs/fbuffer.fs |    4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/slof/fs/fbuffer.fs b/slof/fs/fbuffer.fs
> index 756f05a..46b59bf 100644
> --- a/slof/fs/fbuffer.fs
> +++ b/slof/fs/fbuffer.fs
> @@ -99,8 +99,8 @@ CREATE bitmap-buffer 400 4 * allot
>  : fb8-toggle-cursor ( -- )
>  	line# fb8-line2addr column# fb8-columns2bytes +
>  	char-height 0 ?DO
> -		char-width screen-depth * 0 ?DO dup dup rb@ -1 xor swap rb! 1+ LOOP
> -		screen-width screen-depth * + char-width screen-depth * -
> +		dup dup 0 char-width screen-depth * 1 hv-logical-memop drop
> +		screen-width screen-depth * +
>  	LOOP drop
>  ;

If you use hv-logical-memop in this file here, you definitely break
board-js2x, since this is bare metal and hv-logical-memop is not
defined there.

I think you should either move the new function to board-qemu and handle
it there like it is done for hcall-invert-screen already, or we could
think of introducing a helper function that is defined by each board
which does the xor operation on a memory region (that way we could
maybe also unify hcall-invert-screen and fb8-invert-screen again).
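
A minimal sketch of that second idea, written in Python rather than Forth so it can be run anywhere. Every name here (BareMetalBoard, HypervisorBoard, xor_region, toggle_cursor, invert_screen) is hypothetical and only illustrates the shape: common code calls one board-supplied helper that inverts a memory region, and each board decides how to do it.

```python
# Hypothetical sketch: common framebuffer code calls a board-supplied
# "xor a memory region" helper instead of naming a hypervisor call directly.
# The framebuffer is simulated with a bytearray; nothing here is SLOF API.

class BareMetalBoard:
    """No hypervisor available: invert the region one byte at a time."""

    def __init__(self, fb):
        self.fb = fb

    def xor_region(self, addr, length):
        for i in range(addr, addr + length):
            self.fb[i] ^= 0xFF


class HypervisorBoard:
    """Region is inverted by a single (simulated) hypervisor call."""

    def __init__(self, fb):
        self.fb = fb
        self.hcalls = 0

    def xor_region(self, addr, length):
        self.hcalls += 1
        region = self.fb[addr:addr + length]
        self.fb[addr:addr + length] = bytes(b ^ 0xFF for b in region)


def toggle_cursor(board, addr, row_bytes, rows, stride):
    """Common code: one helper call per cursor row."""
    for _ in range(rows):
        board.xor_region(addr, row_bytes)
        addr += stride


def invert_screen(board, screen_bytes):
    """Common code: the whole screen is just one big region."""
    board.xor_region(0, screen_bytes)


bare = BareMetalBoard(bytearray(256))
hv = HypervisorBoard(bytearray(256))
for board in (bare, hv):
    toggle_cursor(board, addr=80, row_bytes=8, rows=4, stride=32)
print(bare.fb == hv.fb, hv.hcalls)
```

With a helper of this kind, the cursor toggle and the screen inversion share one code path, and only the helper differs between boards.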

 Thomas
Greg Kurz May 27, 2015, 9:01 a.m. UTC | #3
On Wed, 27 May 2015 10:41:06 +0530
Nikunj A Dadhania <nikunj@linux.vnet.ibm.com> wrote:

> Greg Kurz <gkurz@linux.vnet.ibm.com> writes:
> 
> > SLOF currently calls hv-logical-load and hv-logical-store for every pixel
> > when enabling or disabling the cursor. This is suboptimal when writing one
> > char at a time to the console since terminal-write always toggles the cursor.
> > And this is precisely what grub is doing when the user wants to edit a menu
> > entry... the result is an incredibly slow and barely usable interface.
> >
> > The inner loop in fb8-toggle-cursor handles a contiguous region: it can be
> > converted to hv-logical-memop. The result is 32 times fewer hcalls per char
> > and a serious improvement in grub usability.
> >
> > Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
> > ---
> >  slof/fs/fbuffer.fs |    4 ++--
> >  1 file changed, 2 insertions(+), 2 deletions(-)
> >
> > diff --git a/slof/fs/fbuffer.fs b/slof/fs/fbuffer.fs
> > index 756f05a..46b59bf 100644
> > --- a/slof/fs/fbuffer.fs
> > +++ b/slof/fs/fbuffer.fs
> > @@ -99,8 +99,8 @@ CREATE bitmap-buffer 400 4 * allot
> >  : fb8-toggle-cursor ( -- )
> >  	line# fb8-line2addr column# fb8-columns2bytes +
> >  	char-height 0 ?DO
> > -		char-width screen-depth * 0 ?DO dup dup rb@ -1 xor swap rb! 1+ LOOP
> > -		screen-width screen-depth * + char-width screen-depth * -
> > +		dup dup 0 char-width screen-depth * 1 hv-logical-memop drop
> > +		screen-width screen-depth * +
> 
> Why did you drop "char-width screen-depth * -" in the new code? This is
> not mentioned in the description.
> 

This is because the current inner loop increments the address. When the loop
ends, we're pointing at the next char, that is char-width * screen-depth bytes
too far.

In the new code, the address is duped on the stack before calling hv-logical-memop,
so we don't need to fix it up when proceeding to the next line.

In my first attempt, I forgot to drop the subtraction and got an interesting
visual result :)
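
The address arithmetic can be checked with a small Python model of both loop shapes over a simulated framebuffer. This is only a sketch: the bytearray, the sizes, and the helper names (toggle_old, toggle_new, toggle_buggy, xor_region) are made up for illustration, and xor_region merely stands in for the memop call.

```python
# Model of the old and new loop shapes over a simulated framebuffer.
# All sizes below are made-up example values.

def toggle_old(fb, addr, char_width, char_height, screen_width, screen_depth):
    """Per-byte loop: the address is advanced inside the inner loop."""
    row_bytes = char_width * screen_depth
    stride = screen_width * screen_depth
    for _ in range(char_height):
        for _ in range(row_bytes):
            fb[addr] ^= 0xFF            # rb@ -1 xor swap rb!
            addr += 1                   # 1+
        addr += stride - row_bytes      # undo the advance, then go down a line


def xor_region(fb, addr, length):
    """Stand-in for the memop: inverts a region, caller's address unchanged."""
    for i in range(addr, addr + length):
        fb[i] ^= 0xFF


def toggle_new(fb, addr, char_width, char_height, screen_width, screen_depth):
    """Region call per row: the address never moves within the row."""
    row_bytes = char_width * screen_depth
    stride = screen_width * screen_depth
    for _ in range(char_height):
        xor_region(fb, addr, row_bytes)
        addr += stride                  # no subtraction needed


def toggle_buggy(fb, addr, char_width, char_height, screen_width, screen_depth):
    """Region call per row, but with the old subtraction left in by mistake."""
    row_bytes = char_width * screen_depth
    stride = screen_width * screen_depth
    for _ in range(char_height):
        xor_region(fb, addr, row_bytes)
        addr += stride - row_bytes      # each row lands row_bytes too early


SCREEN_W, DEPTH, CW, CH = 32, 1, 8, 4
START = 2 * SCREEN_W * DEPTH + 16       # a cell on line 2, byte column 16
fb_old, fb_new, fb_bug = bytearray(256), bytearray(256), bytearray(256)
toggle_old(fb_old, START, CW, CH, SCREEN_W, DEPTH)
toggle_new(fb_new, START, CW, CH, SCREEN_W, DEPTH)
toggle_buggy(fb_bug, START, CW, CH, SCREEN_W, DEPTH)
print(fb_old == fb_new, fb_bug == fb_new)
```

In the buggy variant each successive row starts one cell width earlier than the previous one, so the inverted block is skewed sideways instead of forming a rectangle.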

> Regards
> Nikunj

Cheers.

--
Greg
Nikunj A Dadhania May 27, 2015, 9:21 a.m. UTC | #4
Greg Kurz <gkurz@linux.vnet.ibm.com> writes:

> On Wed, 27 May 2015 10:41:06 +0530
> Nikunj A Dadhania <nikunj@linux.vnet.ibm.com> wrote:
>
>> Greg Kurz <gkurz@linux.vnet.ibm.com> writes:
>> 
>> > SLOF currently calls hv-logical-load and hv-logical-store for every pixel
>> > when enabling or disabling the cursor. This is suboptimal when writing one
>> > char at a time to the console since terminal-write always toggles the cursor.
>> > And this is precisely what grub is doing when the user wants to edit a menu
>> > entry... the result is an incredibly slow and barely usable interface.
>> >
>> > The inner loop in fb8-toggle-cursor handles a contiguous region: it can be
>> > converted to hv-logical-memop. The result is 32 times fewer hcalls per char
>> > and a serious improvement in grub usability.
>> >
>> > Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
>> > ---
>> >  slof/fs/fbuffer.fs |    4 ++--
>> >  1 file changed, 2 insertions(+), 2 deletions(-)
>> >
>> > diff --git a/slof/fs/fbuffer.fs b/slof/fs/fbuffer.fs
>> > index 756f05a..46b59bf 100644
>> > --- a/slof/fs/fbuffer.fs
>> > +++ b/slof/fs/fbuffer.fs
>> > @@ -99,8 +99,8 @@ CREATE bitmap-buffer 400 4 * allot
>> >  : fb8-toggle-cursor ( -- )
>> >  	line# fb8-line2addr column# fb8-columns2bytes +
>> >  	char-height 0 ?DO
>> > -		char-width screen-depth * 0 ?DO dup dup rb@ -1 xor swap rb! 1+ LOOP
>> > -		screen-width screen-depth * + char-width screen-depth * -
>> > +		dup dup 0 char-width screen-depth * 1 hv-logical-memop drop
>> > +		screen-width screen-depth * +
>> 
>> Why did you drop "char-width screen-depth * -" in the new code? This is
>> not mentioned in the description.
>> 
>
> This is because the current inner loop increments the address. When the loop
> ends, we're pointing at the next char, that is char-width * screen-depth bytes
> too far.
>
> In the new code, the address is duped on the stack before calling hv-logical-memop,
> so we don't need to fix it up when proceeding to the next line.

Ah OK, I missed that 1+ in the loop.

> In my first attempt, I forgot to drop the subtraction and got an interesting
> visual result :)

Regards
Nikunj
Greg Kurz May 27, 2015, 9:24 a.m. UTC | #5
On Wed, 27 May 2015 07:59:34 +0200
Thomas Huth <thuth@redhat.com> wrote:

> On Wed, 27 May 2015 02:11:13 +0200
> Greg Kurz <gkurz@linux.vnet.ibm.com> wrote:
> 
> > SLOF currently calls hv-logical-load and hv-logical-store for every pixel
> > when enabling or disabling the cursor. This is suboptimal when writing one
> > char at a time to the console since terminal-write always toggles the cursor.
> > And this is precisely what grub is doing when the user wants to edit a menu
> > entry... the result is an incredibly slow and barely usable interface.
> > 
> > The inner loop in fb8-toggle-cursor handles a contiguous region: it can be
> > converted to hv-logical-memop. The result is 32 times fewer hcalls per char
> > and a serious improvement in grub usability.
> 
> Good idea for an optimization!
> 

Heh no big deal... the hardest part was to find that the LOAD/STORE avalanche
was coming from these rb@ and rb! words. SLOF is still a mysterious beast to
me :)

> > Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
> > ---
> >  slof/fs/fbuffer.fs |    4 ++--
> >  1 file changed, 2 insertions(+), 2 deletions(-)
> > 
> > diff --git a/slof/fs/fbuffer.fs b/slof/fs/fbuffer.fs
> > index 756f05a..46b59bf 100644
> > --- a/slof/fs/fbuffer.fs
> > +++ b/slof/fs/fbuffer.fs
> > @@ -99,8 +99,8 @@ CREATE bitmap-buffer 400 4 * allot
> >  : fb8-toggle-cursor ( -- )
> >  	line# fb8-line2addr column# fb8-columns2bytes +
> >  	char-height 0 ?DO
> > -		char-width screen-depth * 0 ?DO dup dup rb@ -1 xor swap rb! 1+ LOOP
> > -		screen-width screen-depth * + char-width screen-depth * -
> > +		dup dup 0 char-width screen-depth * 1 hv-logical-memop drop
> > +		screen-width screen-depth * +
> >  	LOOP drop
> >  ;
> 
> If you use hv-logical-memop in this file here, you definitely break
> board-js2x, since this is bare metal and hv-logical-memop is not
> defined there.
> 

Of course, this is common code... I'll remember for next time. :)

> I think you should either move the new function to board-qemu and handle
> it there like it is done for hcall-invert-screen already, or we could
> think of introducing a helper function that is defined by each board
> which does the xor operation on a memory region (that way we could
> maybe also unify hcall-invert-screen and fb8-invert-screen again).
> 

I guess the first proposal is the obvious fix. From there, we can
work out a patchset for the second proposal.

>  Thomas
> 

Cheers.

--
Greg

Patch

diff --git a/slof/fs/fbuffer.fs b/slof/fs/fbuffer.fs
index 756f05a..46b59bf 100644
--- a/slof/fs/fbuffer.fs
+++ b/slof/fs/fbuffer.fs
@@ -99,8 +99,8 @@  CREATE bitmap-buffer 400 4 * allot
 : fb8-toggle-cursor ( -- )
 	line# fb8-line2addr column# fb8-columns2bytes +
 	char-height 0 ?DO
-		char-width screen-depth * 0 ?DO dup dup rb@ -1 xor swap rb! 1+ LOOP
-		screen-width screen-depth * + char-width screen-depth * -
+		dup dup 0 char-width screen-depth * 1 hv-logical-memop drop
+		screen-width screen-depth * +
 	LOOP drop
 ;