diff mbox

Optimize screendump

Message ID 1308495273-23383-1-git-send-email-avi@redhat.com
State New
Headers show

Commit Message

Avi Kivity June 19, 2011, 2:54 p.m. UTC
When running kvm-autotest, fputc() is often the second highest (sometimes #1)
function showing up in a profile.  This is due to fputc() locking the file
for every byte written.

Optimize by using fputc_unlocked().  Since the file is local to the caller,
clearly no locking is needed.  According to the manual, _GNU_SOURCE is all
that's needed for the function to be present.

Signed-off-by: Avi Kivity <avi@redhat.com>
---
 hw/vga.c |    6 +++---
 1 files changed, 3 insertions(+), 3 deletions(-)

Comments

malc June 19, 2011, 3:05 p.m. UTC | #1
On Sun, 19 Jun 2011, Avi Kivity wrote:

CONFORMING TO
       The     four	functions    getc_unlocked(), getchar_unlocked(),
       putc_unlocked(), putchar_unlocked() are in POSIX.1-2001.
       The non-standard *_unlocked() variants occur on a few Unix systems, and
       are available in recent glibc.  They should probably not be used.

> When running kvm-autotest, fputc() is often the second highest (sometimes #1)
> function showing up in a profile.  This is due to fputc() locking the file
> for every byte written.
> 
> Optimize by using fputc_unlocked().  Since the file is local to the caller,
> clearly no locking is needed.  According to the manual, _GNU_SOURCE is all
> that's needed for the function to be present.
> 
> Signed-off-by: Avi Kivity <avi@redhat.com>
> ---
>  hw/vga.c |    6 +++---
>  1 files changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/hw/vga.c b/hw/vga.c
> index d5bc582..8b63358 100644
> --- a/hw/vga.c
> +++ b/hw/vga.c
> @@ -2369,9 +2369,9 @@ int ppm_save(const char *filename, struct DisplaySurface *ds)
>                  (ds->pf.gmax + 1);
>              b = ((v >> ds->pf.bshift) & ds->pf.bmax) * 256 /
>                  (ds->pf.bmax + 1);
> -            fputc(r, f);
> -            fputc(g, f);
> -            fputc(b, f);
> +            fputc_unlocked(r, f);
> +            fputc_unlocked(g, f);
> +            fputc_unlocked(b, f);
>              d += ds->pf.bytes_per_pixel;
>          }
>          d1 += ds->linesize;
>
Stefan Hajnoczi June 19, 2011, 3:22 p.m. UTC | #2
On Sun, Jun 19, 2011 at 3:54 PM, Avi Kivity <avi@redhat.com> wrote:
> When running kvm-autotest, fputc() is often the second highest (sometimes #1)
> function showing up in a profile.  This is due to fputc() locking the file
> for every byte written.
>
> Optimize by using fputc_unlocked().  Since the file is local to the caller,
> clearly no locking is needed.  According to the manual, _GNU_SOURCE is all
> that's needed for the function to be present.
>
> Signed-off-by: Avi Kivity <avi@redhat.com>
> ---
>  hw/vga.c |    6 +++---
>  1 files changed, 3 insertions(+), 3 deletions(-)

"The nonstandard *_unlocked() variants occur on a few Unix systems,
and are available in recent glibc. They should probably not be used."

I wonder if this will break non-Linux platforms.  Perhaps buffer an
entire row of pixels instead and only fwrite(3) at the end of the
outer loop.

Stefan
Avi Kivity June 19, 2011, 3:46 p.m. UTC | #3
On 06/19/2011 06:22 PM, Stefan Hajnoczi wrote:
> I wonder if this will break non-Linux platforms.  Perhaps buffer an
> entire row of pixels instead and only fwrite(3) at the end of the
> outer loop.

That's how I wrote this in the first place.  Since the consensus is 
against these functions, I'll submit that version instead.
Andreas Färber June 19, 2011, 3:53 p.m. UTC | #4
Am 19.06.2011 um 17:46 schrieb Avi Kivity:

> On 06/19/2011 06:22 PM, Stefan Hajnoczi wrote:
>> I wonder if this will break non-Linux platforms.  Perhaps buffer an
>> entire row of pixels instead and only fwrite(3) at the end of the
>> outer loop.
>
> That's how I wrote this in the first place.  Since the consensus is  
> against these functions, I'll submit that version instead.

Maybe add a qemu_fputc_unlocked() and do a configure check for it?

Andreas
Avi Kivity June 19, 2011, 4:04 p.m. UTC | #5
On 06/19/2011 06:53 PM, Andreas Färber wrote:
> Am 19.06.2011 um 17:46 schrieb Avi Kivity:
>
>> On 06/19/2011 06:22 PM, Stefan Hajnoczi wrote:
>>> I wonder if this will break non-Linux platforms.  Perhaps buffer an
>>> entire row of pixels instead and only fwrite(3) at the end of the
>>> outer loop.
>>
>> That's how I wrote this in the first place.  Since the consensus is 
>> against these functions, I'll submit that version instead.
>
> Maybe add a qemu_fputc_unlocked() and do a configure check for it?

Good idea.  I'll try that, unless people disagree.
Alexander Graf June 19, 2011, 5 p.m. UTC | #6
On 19.06.2011, at 18:04, Avi Kivity wrote:

> On 06/19/2011 06:53 PM, Andreas Färber wrote:
>> Am 19.06.2011 um 17:46 schrieb Avi Kivity:
>> 
>>> On 06/19/2011 06:22 PM, Stefan Hajnoczi wrote:
>>>> I wonder if this will break non-Linux platforms.  Perhaps buffer an
>>>> entire row of pixels instead and only fwrite(3) at the end of the
>>>> outer loop.
>>> 
>>> That's how I wrote this in the first place.  Since the consensus is against these functions, I'll submit that version instead.
>> 
>> Maybe add a qemu_fputc_unlocked() and do a configure check for it?
> 
> Good idea.  I'll try that, unless people disagree.

Writing by row should be faster and pretty straight forward, no?


Alex
Avi Kivity June 20, 2011, 8:04 a.m. UTC | #7
On 06/19/2011 08:00 PM, Alexander Graf wrote:
> On 19.06.2011, at 18:04, Avi Kivity wrote:
>
> >  On 06/19/2011 06:53 PM, Andreas Färber wrote:
> >>  Am 19.06.2011 um 17:46 schrieb Avi Kivity:
> >>
> >>>  On 06/19/2011 06:22 PM, Stefan Hajnoczi wrote:
> >>>>  I wonder if this will break non-Linux platforms.  Perhaps buffer an
> >>>>  entire row of pixels instead and only fwrite(3) at the end of the
> >>>>  outer loop.
> >>>
> >>>  That's how I wrote this in the first place.  Since the consensus is against these functions, I'll submit that version instead.
> >>
> >>  Maybe add a qemu_fputc_unlocked() and do a configure check for it?
> >
> >  Good idea.  I'll try that, unless people disagree.
>
> Writing by row should be faster and pretty straight forward, no?

I don't see how it's faster, but I guess I'll do that, it's a local 
issue and is best addressed locally.
diff mbox

Patch

diff --git a/hw/vga.c b/hw/vga.c
index d5bc582..8b63358 100644
--- a/hw/vga.c
+++ b/hw/vga.c
@@ -2369,9 +2369,9 @@  int ppm_save(const char *filename, struct DisplaySurface *ds)
                 (ds->pf.gmax + 1);
             b = ((v >> ds->pf.bshift) & ds->pf.bmax) * 256 /
                 (ds->pf.bmax + 1);
-            fputc(r, f);
-            fputc(g, f);
-            fputc(b, f);
+            fputc_unlocked(r, f);
+            fputc_unlocked(g, f);
+            fputc_unlocked(b, f);
             d += ds->pf.bytes_per_pixel;
         }
         d1 += ds->linesize;