ui/gtk: fix NULL pointer dereference

Message ID E1lIzWX-0003qN-Me@lizzy.crudebyte.com
State New
Series ui/gtk: fix NULL pointer dereference

Commit Message

Christian Schoenebeck March 7, 2021, 7:38 p.m. UTC
DisplaySurface pointer passed to gd_switch() can be NULL, so check this
before trying to dereference it.

Fixes: c821a58ee7 ("ui/console: Pass placeholder surface to display")
Reported-by: Coverity (CID 1448421)
Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
---
 ui/gtk.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

Comments

Akihiko Odaki March 8, 2021, 3:45 a.m. UTC | #1
On Mon, 8 Mar 2021 at 4:57, Christian Schoenebeck <qemu_oss@crudebyte.com> wrote:
>
> DisplaySurface pointer passed to gd_switch() can be NULL, so check this
> before trying to dereference it.
>
> Fixes: c821a58ee7 ("ui/console: Pass placeholder surface to display")
> Reported-by: Coverity (CID 1448421)
> Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
> ---
>  ui/gtk.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/ui/gtk.c b/ui/gtk.c
> index 3edaf041de..a27b27d004 100644
> --- a/ui/gtk.c
> +++ b/ui/gtk.c
> @@ -567,7 +567,7 @@ static void gd_switch(DisplayChangeListener *dcl,
>      }
>      vc->gfx.ds = surface;
>
> -    if (surface->format == PIXMAN_x8r8g8b8) {
> +    if (surface && surface->format == PIXMAN_x8r8g8b8) {
>          /*
>           * PIXMAN_x8r8g8b8 == CAIRO_FORMAT_RGB24
>           *
> @@ -580,7 +580,7 @@ static void gd_switch(DisplayChangeListener *dcl,
>               surface_width(surface),
>               surface_height(surface),
>               surface_stride(surface));
> -    } else {
> +    } else if (surface) {
>          /* Must convert surface, use pixman to do it. */
>          vc->gfx.convert = pixman_image_create_bits(PIXMAN_x8r8g8b8,
>                                                     surface_width(surface),
> --
> 2.20.1
>

When will the DisplaySurface pointer passed to gd_switch() be NULL?
Also, it affects other displays so it should be fixed in ui/console.c,
or fix all relevant displays.
Christian Schoenebeck March 8, 2021, 10:39 a.m. UTC | #2
On Monday, 8 March 2021 04:45:24 CET Akihiko Odaki wrote:
> On Mon, 8 Mar 2021 at 4:57, Christian Schoenebeck <qemu_oss@crudebyte.com> wrote:
> > DisplaySurface pointer passed to gd_switch() can be NULL, so check this
> > before trying to dereference it.
> > 
> > Fixes: c821a58ee7 ("ui/console: Pass placeholder surface to display")
> > Reported-by: Coverity (CID 1448421)
> > Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
> > ---
> >  ui/gtk.c | 4 ++--
> >  1 file changed, 2 insertions(+), 2 deletions(-)
> >
> > diff --git a/ui/gtk.c b/ui/gtk.c
> > index 3edaf041de..a27b27d004 100644
> > --- a/ui/gtk.c
> > +++ b/ui/gtk.c
> > @@ -567,7 +567,7 @@ static void gd_switch(DisplayChangeListener *dcl,
> >      }
> >      vc->gfx.ds = surface;
> >
> > -    if (surface->format == PIXMAN_x8r8g8b8) {
> > +    if (surface && surface->format == PIXMAN_x8r8g8b8) {
> >          /*
> >           * PIXMAN_x8r8g8b8 == CAIRO_FORMAT_RGB24
> >           *
> > @@ -580,7 +580,7 @@ static void gd_switch(DisplayChangeListener *dcl,
> >               surface_width(surface),
> >               surface_height(surface),
> >               surface_stride(surface));
> > -    } else {
> > +    } else if (surface) {
> >          /* Must convert surface, use pixman to do it. */
> >          vc->gfx.convert = pixman_image_create_bits(PIXMAN_x8r8g8b8,
> >                                                     surface_width(surface),
> > --
> > 2.20.1
> 
> When will the DisplaySurface pointer passed to gd_switch() be NULL?
> Also, it affects other displays so it should be fixed in ui/console.c,
> or fix all relevant displays.

This was just about silencing the mentioned automated Coverity defects report.
If you have a better solution, then just ignore this patch.

Best regards,
Christian Schoenebeck
Akihiko Odaki March 8, 2021, 11:31 a.m. UTC | #3
On Mon, 8 Mar 2021 at 19:39, Christian Schoenebeck <qemu_oss@crudebyte.com> wrote:
>
> This was just about silencing the mentioned automated Coverity defects report.
> If you have a better solution, then just ignore this patch.
>
> Best regards,
> Christian Schoenebeck
>
>

I do not have access to the Coverity defect report. I'd appreciate the
details if you can provide them. I suspect I made a mistake somewhere else
in ui/gtk.c in c821a58ee7 ("ui/console: Pass placeholder surface to
display").

Thanks,
Akihiko Odaki
Christian Schoenebeck March 8, 2021, 12:42 p.m. UTC | #4
On Monday, 8 March 2021 12:31:33 CET Akihiko Odaki wrote:
> On Mon, 8 Mar 2021 at 19:39, Christian Schoenebeck <qemu_oss@crudebyte.com> wrote:
> > This was just about silencing the mentioned automated Coverity defects
> > report. If you have a better solution, then just ignore this patch.
> > 
> > Best regards,
> > Christian Schoenebeck
> 
> I do not have an access to Coverity defects report. I'd appreciate the
> details if you provide one. I suspect I made a mistake somewhere else
> ui/gtk.c in c821a58ee7 ("ui/console: Pass placeholder surface to
> display").

Unfortunately Coverity's defects reports are not very verbose. In this case:

*** CID 1448421:    (FORWARD_NULL)
/qemu/ui/gtk.c: 570 in gd_switch()
564             surface_width(vc->gfx.ds) == surface_width(surface) &&
565             surface_height(vc->gfx.ds) == surface_height(surface)) {
566             resized = false;
567         }
568         vc->gfx.ds = surface;
569     
>>> CID 1448421:    (FORWARD_NULL)
>>> Dereferencing null pointer "surface".
570         if (surface->format == PIXMAN_x8r8g8b8) {
571             /*
572              * PIXMAN_x8r8g8b8 == CAIRO_FORMAT_RGB24
573              *
574              * No need to convert, use surface directly.  Should be the
575              * common case as this is qemu_default_pixelformat(32) too.

So no detailed path is outlined that may lead to the detected situation (i.e. 
no call stack or conditions like you would get e.g. with clang's static 
analyzer).

There are false positives sometimes, but they should be silenced in some way.

So as you assume "surface" pointer should never be NULL, why did you remove 
the return statement in gd_switch() with c821a58ee7 then? Redundancy?

diff --git a/ui/gtk.c b/ui/gtk.c
index c32ee34edc..3edaf041de 100644
--- a/ui/gtk.c
+++ b/ui/gtk.c
@@ -567,10 +567,6 @@ static void gd_switch(DisplayChangeListener *dcl,
     }
     vc->gfx.ds = surface;
 
-    if (!surface) {
-        return;
-    }
-
     if (surface->format == PIXMAN_x8r8g8b8) {
         /*
          * PIXMAN_x8r8g8b8 == CAIRO_FORMAT_RGB24

I read your change as meaning that you wanted to reach the end of the
function in case of surface == NULL.

Best regards,
Christian Schoenebeck
Akihiko Odaki March 8, 2021, 1:21 p.m. UTC | #5
On Mon, 8 Mar 2021 at 21:42, Christian Schoenebeck <qemu_oss@crudebyte.com> wrote:
>
> Unfortunately Coverity's defects reports are not very verbose. In this case:
>
> *** CID 1448421:    (FORWARD_NULL)
> /qemu/ui/gtk.c: 570 in gd_switch()
> 564             surface_width(vc->gfx.ds) == surface_width(surface) &&
> 565             surface_height(vc->gfx.ds) == surface_height(surface)) {
> 566             resized = false;
> 567         }
> 568         vc->gfx.ds = surface;
> 569
> >>> CID 1448421:    (FORWARD_NULL)
> >>> Dereferencing null pointer "surface".
> 570         if (surface->format == PIXMAN_x8r8g8b8) {
> 571             /*
> 572              * PIXMAN_x8r8g8b8 == CAIRO_FORMAT_RGB24
> 573              *
> 574              * No need to convert, use surface directly.  Should be the
> 575              * common case as this is qemu_default_pixelformat(32) too.
>
> So no detailed path is outlined that may lead to the detected situation (i.e.
> no call stack or conditions like you would get e.g. with clang's static
> analyzer).

Hmm, Coverity must have decided that the surface can somehow be NULL. I
hope it is a false positive...

>
> There are false positives sometimes, but they should be silenced in some way.
>
> So as you assume "surface" pointer should never be NULL, why did you remove
> the return statement in gd_switch() with c821a58ee7 then? Redundancy?
>
> diff --git a/ui/gtk.c b/ui/gtk.c
> index c32ee34edc..3edaf041de 100644
> --- a/ui/gtk.c
> +++ b/ui/gtk.c
> @@ -567,10 +567,6 @@ static void gd_switch(DisplayChangeListener *dcl,
>      }
>      vc->gfx.ds = surface;
>
> -    if (!surface) {
> -        return;
> -    }
> -
>      if (surface->format == PIXMAN_x8r8g8b8) {
>          /*
>           * PIXMAN_x8r8g8b8 == CAIRO_FORMAT_RGB24
>
> I was reading your change as you wanted to reach the end of the function in
> case of surface == NULL.
>
> Best regards,
> Christian Schoenebeck
>
>

Redundancy is one reason.

It is also intended to prevent people writing ui/console code from
assuming that displays accept NULL as a surface. In reality, some other
displays dereferenced surfaces without checking for NULL even before this
change. Code that checks whether the surface is NULL is confusing when
reading the source. At runtime, the pointer dereferences that follow
the conditional effectively assert that the pointer is not NULL, and so
keep code that produces NULL from getting in.

Regards,
Akihiko Odaki
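
As an illustration of the "never NULL" contract described above, here is a
minimal sketch that states the contract with an explicit assert() instead of
relying on the dereference itself to fault. This is not what the thread
proposes for ui/gtk.c; DemoSurface, DemoDisplay and demo_switch() are
hypothetical stand-ins, not QEMU types or functions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a display surface; not QEMU's DisplaySurface. */
typedef struct DemoSurface {
    int width;
    int height;
} DemoSurface;

/* Hypothetical listener state that remembers the current surface. */
typedef struct DemoDisplay {
    const DemoSurface *current;
} DemoDisplay;

/*
 * Contract: surface is never NULL. The assert documents that contract
 * for readers and checks it at runtime (unless NDEBUG is defined).
 * Returns the number of pixels in the new surface.
 */
static int demo_switch(DemoDisplay *d, const DemoSurface *surface)
{
    assert(surface != NULL);
    d->current = surface;
    return surface->width * surface->height;
}
```

A caller that passes NULL then fails at the assert, which names the violated
condition, rather than at a later dereference.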
Peter Maydell March 8, 2021, 1:37 p.m. UTC | #6
On Mon, 8 Mar 2021 at 13:32, Christian Schoenebeck
<qemu_oss@crudebyte.com> wrote:
>
> On Monday, 8 March 2021 12:31:33 CET Akihiko Odaki wrote:
> > On Mon, 8 Mar 2021 at 19:39, Christian Schoenebeck <qemu_oss@crudebyte.com> wrote:
> > > This was just about silencing the mentioned automated Coverity defects
> > > report. If you have a better solution, then just ignore this patch.
> > >
> > > Best regards,
> > > Christian Schoenebeck
> >
> > I do not have an access to Coverity defects report. I'd appreciate the
> > details if you provide one. I suspect I made a mistake somewhere else
> > ui/gtk.c in c821a58ee7 ("ui/console: Pass placeholder surface to
> > display").
>
> Unfortunately Coverity's defects reports are not very verbose.

The online defect viewer is a bit better for showing why it thought
something was an issue. In this case we have at the top of the function:

    trace_gd_switch(vc->label,
                    surface ? surface_width(surface)  : 0,
                    surface ? surface_height(surface) : 0);

which tests whether surface is NULL, implying that sometimes it is.

Then later we have:
    if (vc->gfx.ds && surface &&

also checking surface for NULL-ness.

Finally we have:
    if (surface->format == PIXMAN_x8r8g8b8) {

which dereferences surface without checking if it's NULL.

So there is definitely a bug here:
(1) either surface can never be NULL, and all the places where
the function is testing for NULL-ness are wrong and need to be removed
(2) or surface can be NULL, and we should check here too

Coverity can't tell us which of the two possibilities is right, of course.

thanks
-- PMM
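
The two options listed above can be sketched side by side. This is only an
illustration under assumed names (DemoSurface, DEMO_FORMAT_RGB24 and both
functions are hypothetical, not taken from ui/gtk.c): whichever contract is
chosen, every use of the pointer inside the function has to agree with it.

```c
#include <stddef.h>

/* Hypothetical stand-in for a display surface; not QEMU's DisplaySurface. */
typedef struct DemoSurface {
    int format;
    int width;
} DemoSurface;

enum { DEMO_FORMAT_RGB24 = 1 };

/* Option (1): surface is never NULL, so it is not tested anywhere. */
static int demo_width_nonnull(const DemoSurface *surface)
{
    return surface->format == DEMO_FORMAT_RGB24 ? surface->width : 0;
}

/* Option (2): surface may be NULL, so it is checked before any use. */
static int demo_width_nullable(const DemoSurface *surface)
{
    if (!surface) {
        return 0;
    }
    return surface->format == DEMO_FORMAT_RGB24 ? surface->width : 0;
}
```

A function that tests the pointer in one place and dereferences it unchecked
in another mixes the two contracts, which is the inconsistency described in
the message above.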
Akihiko Odaki March 8, 2021, 1:57 p.m. UTC | #7
On Mon, 8 Mar 2021 at 22:38, Peter Maydell <peter.maydell@linaro.org> wrote:
>
> The online defect viewer is a bit better for showing why it thought
> something was an issue. In this case we have at the top of the function:
>
>     trace_gd_switch(vc->label,
>                     surface ? surface_width(surface)  : 0,
>                     surface ? surface_height(surface) : 0);
>
> which tests whether surface is NULL, implying that sometimes it is.
>
> Then later we have:
>     if (vc->gfx.ds && surface &&
>
> also checking surface for NULL-ness.
>
> Finally we have:
>     if (surface->format == PIXMAN_x8r8g8b8) {
>
> which dereferences surface without checking if it's NULL.
>
> So there is definitely a bug here:
> (1) either surface can never be NULL, and all the places where
> the function is testing for NULL-ness are wrong and need to be removed
> (2) or surface can be NULL, and we should check here too
>
> Coverity can't tell us which of the two possibilities is right, of course.
>
> thanks
> -- PMM

c821a58ee7 ("ui/console: Pass placeholder surface to display")
intended to eliminate the possibility that surface is NULL, so (1) is
the case. I am preparing a patch to remove NULL checks.

Thanks,
Akihiko Odaki
Christian Schoenebeck March 8, 2021, 2:03 p.m. UTC | #8
On Monday, 8 March 2021 14:37:44 CET Peter Maydell wrote:
> On Mon, 8 Mar 2021 at 13:32, Christian Schoenebeck
> 
> <qemu_oss@crudebyte.com> wrote:
> > On Monday, 8 March 2021 12:31:33 CET Akihiko Odaki wrote:
> > > On Mon, 8 Mar 2021 at 19:39, Christian Schoenebeck <qemu_oss@crudebyte.com> wrote:
> > > > This was just about silencing the mentioned automated Coverity defects
> > > > report. If you have a better solution, then just ignore this patch.
> > > > 
> > > > Best regards,
> > > > Christian Schoenebeck
> > > 
> > > I do not have an access to Coverity defects report. I'd appreciate the
> > > details if you provide one. I suspect I made a mistake somewhere else
> > > ui/gtk.c in c821a58ee7 ("ui/console: Pass placeholder surface to
> > > display").
> > 
> > Unfortunately Coverity's defects reports are not very verbose.
> 
> The online defect viewer is a bit better for showing why it thought
> something was an issue. In this case we have at the top of the function:

Ah, good to know. Actually never looked into the online viewer. Thanks Peter!

>     trace_gd_switch(vc->label,
>                     surface ? surface_width(surface)  : 0,
>                     surface ? surface_height(surface) : 0);
> 
> which tests whether surface is NULL, implying that sometimes it is.
> 
> Then later we have:
>     if (vc->gfx.ds && surface &&
> 
> also checking surface for NULL-ness.
> 
> Finally we have:
>     if (surface->format == PIXMAN_x8r8g8b8) {
> 
> which dereferences surface without checking if it's NULL.
> 
> So there is definitely a bug here:
> (1) either surface can never be NULL, and all the places where
> the function is testing for NULL-ness are wrong and need to be removed
> (2) or surface can be NULL, and we should check here too
> 
> Coverity can't tell us which of the two possibilities is right, of course.

BTW, there is __nonnull supported by clang, e.g.:

static void foo(void *__nonnull p) {
	...
}

Maybe as an optionally defined macro (if supported by compiler) this could be 
a useful tool for such intended nonnull designs, as it immediately emits 
compiler errors.

Best regards,
Christian Schoenebeck
Akihiko Odaki March 8, 2021, 2:17 p.m. UTC | #9
On Mon, 8 Mar 2021 at 23:03, Christian Schoenebeck <qemu_oss@crudebyte.com> wrote:
>
> BTW, there is __nonnull supported by clang, e.g.:
>
> static void foo(void *__nonnull p) {
>         ...
> }
>
> Maybe as an optionally defined macro (if supported by compiler) this could be
> a useful tool for such intended nonnull designs, as it immediately emits
> compiler errors.
>
> Best regards,
> Christian Schoenebeck
>
>

GCC has nonnull attribute and clang accepts it too. However, it
specifies argument indices, which is harder to understand and to
maintain.
__attribute__((nonnull(2)))
void f(void *k, void *l);

Regards,
Akihiko Odaki
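
To make the index-based form above concrete, here is a small sketch assuming
a GCC- or clang-compatible compiler. demo_copy() is a hypothetical helper,
not a QEMU function. The indices are 1-based positions in the parameter
list, so they have to be updated by hand whenever parameters are added or
reordered.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: copies the NUL-terminated string src into dst,
 * truncating to fit dst_size bytes, and returns the number of characters
 * copied (excluding the terminating NUL).
 *
 * nonnull(1, 3) marks the first and third parameters (dst and src) as
 * never NULL. Assumes a compiler that accepts GCC-style attributes.
 */
static __attribute__((nonnull(1, 3)))
size_t demo_copy(char *dst, size_t dst_size, const char *src)
{
    size_t len = strlen(src);

    if (dst_size == 0) {
        return 0;
    }
    if (len >= dst_size) {
        len = dst_size - 1;
    }
    memcpy(dst, src, len);
    dst[len] = '\0';
    return len;
}
```

Passing a literal NULL for dst or src would typically produce a -Wnonnull
diagnostic; a NULL that only appears at runtime is not necessarily caught.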
Philippe Mathieu-Daudé March 8, 2021, 2:30 p.m. UTC | #10
On 3/8/21 3:17 PM, Akihiko Odaki wrote:
> On Mon, 8 Mar 2021 at 23:03, Christian Schoenebeck <qemu_oss@crudebyte.com> wrote:
>>
>> BTW, there is __nonnull supported by clang, e.g.:
>>
>> static void foo(void *__nonnull p) {
>>         ...
>> }
>>
>> Maybe as an optionally defined macro (if supported by compiler) this could be
>> a useful tool for such intended nonnull designs, as it immediately emits
>> compiler errors.
>>
>> Best regards,
>> Christian Schoenebeck
>>
>>
> 
> GCC has nonnull attribute and clang accepts it too. However, it
> specifies argument indices, which is harder to understand and to
> maintain.
> __attribute__((nonnull(2)))
> void f(void *k, void *l);

Richard once suggested to add QEMU_NONNULL(), I have been using
it on a series trying to enforce non-null uses of QOM
'struct Object *owner' but it didn't work out because migrations
of MemoryRegion, some have NULL owner in MachineState.

I also discarded it because Daniel said it could have side-effects
https://www.mail-archive.com/qemu-devel@nongnu.org/msg720739.html
Christian Schoenebeck March 8, 2021, 2:57 p.m. UTC | #11
On Monday, 8 March 2021 15:30:23 CET Philippe Mathieu-Daudé wrote:
> On 3/8/21 3:17 PM, Akihiko Odaki wrote:
> > On Mon, 8 Mar 2021 at 23:03, Christian Schoenebeck <qemu_oss@crudebyte.com> wrote:
> >> BTW, there is __nonnull supported by clang, e.g.:
> >> 
> >> static void foo(void *__nonnull p) {
> >> 
> >>         ...
> >> 
> >> }
> >> 
> >> Maybe as an optionally defined macro (if supported by compiler) this
> >> could be a useful tool for such intended nonnull designs, as it
> >> immediately emits compiler errors.
> >> 
> >> Best regards,
> >> Christian Schoenebeck
> > 
> > GCC has nonnull attribute and clang accepts it too. However, it
> > specifies argument indices, which is harder to understand and to
> > maintain.
> > __attribute__((nonnull(2)))
> > void f(void *k, void *l);
> 
> Richard once suggested to add QEMU_NONNULL(), I have been using
> it on a series trying to enforce non-null uses of QOM
> 'struct Object *owner' but it didn't work out because migrations
> of MemoryRegion, some have NULL owner in MachineState.
> 
> I also discarded it because Daniel said it could have side-effects
> https://www.mail-archive.com/qemu-devel@nongnu.org/msg720739.html

Yes, but the optimizer part could be disabled with
-fno-delete-null-pointer-checks which would render it a pure diagnostic
feature:

https://gcc.gnu.org/onlinedocs/gcc/Common-Function-Attributes.html#index-nonnull-function-attribute

Is there an example where the compiler failed to detect a NULL user case?

Best regards,
Christian Schoenebeck
Akihiko Odaki March 9, 2021, 4:20 a.m. UTC | #12
On Mon, 8 Mar 2021 at 23:58, Christian Schoenebeck <qemu_oss@crudebyte.com> wrote:
>
> Yes, but the optimizer part could be disabled with
> -fno-delete-null-pointer-checks which would render it a pure diagnostic
> feature:
>
> https://gcc.gnu.org/onlinedocs/gcc/Common-Function-Attributes.html#index-nonnull-function-attribute
>
> Is there an example where the compiler failed to detect a NULL user case?
>
> Best regards,
> Christian Schoenebeck
>
>

-fno-delete-null-pointer-checks also prevents the compiler from inferring
that a pointer is never NULL from the fact that it is dereferenced
somewhere else. It also disables
-fisolate-erroneous-paths-dereference, which turns code paths with
NULL pointer dereferences into traps. I suspect these side effects are
too important to ignore.

Perhaps we could define QEMU_NONNULL as it once was, and document that
it affects runtime behavior and should not be blindly added to
functions that already exist. We may also be able to enable
-fisolate-erroneous-paths-attribute, which turns code paths that pass
NULL pointers to such functions into traps, if we explicitly state
that it has runtime effects.

Regards,
Akihiko Odaki
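
A rough sketch of what an optionally defined, documented macro along these
lines could look like. DEMO_NONNULL is a hypothetical name used only for
illustration; it is not QEMU's past or present QEMU_NONNULL definition, and
DemoSurface and demo_surface_area() are likewise made up.

```c
#include <stddef.h>

/* Compilers without __has_attribute are treated as not supporting it. */
#ifndef __has_attribute
#define __has_attribute(x) 0
#endif

/*
 * DEMO_NONNULL(indices...): hypothetical wrapper around the nonnull
 * attribute, expanding to nothing where the attribute is unavailable.
 *
 * Note: this is not a pure diagnostic. The optimizer may assume the
 * marked arguments are never NULL and drop NULL checks inside the
 * function, so it should not be added blindly to existing functions.
 */
#if __has_attribute(nonnull)
#define DEMO_NONNULL(...) __attribute__((nonnull(__VA_ARGS__)))
#else
#define DEMO_NONNULL(...)
#endif

/* Hypothetical stand-in for a display surface; not QEMU's DisplaySurface. */
typedef struct DemoSurface {
    int width;
    int height;
} DemoSurface;

/* surface (parameter 1) must not be NULL. */
static DEMO_NONNULL(1)
int demo_surface_area(const DemoSurface *surface)
{
    return surface->width * surface->height;
}
```

Keeping the runtime caveat next to the macro definition is one way to
document that it affects more than diagnostics.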

Patch

diff --git a/ui/gtk.c b/ui/gtk.c
index 3edaf041de..a27b27d004 100644
--- a/ui/gtk.c
+++ b/ui/gtk.c
@@ -567,7 +567,7 @@  static void gd_switch(DisplayChangeListener *dcl,
     }
     vc->gfx.ds = surface;
 
-    if (surface->format == PIXMAN_x8r8g8b8) {
+    if (surface && surface->format == PIXMAN_x8r8g8b8) {
         /*
          * PIXMAN_x8r8g8b8 == CAIRO_FORMAT_RGB24
          *
@@ -580,7 +580,7 @@  static void gd_switch(DisplayChangeListener *dcl,
              surface_width(surface),
              surface_height(surface),
              surface_stride(surface));
-    } else {
+    } else if (surface) {
         /* Must convert surface, use pixman to do it. */
         vc->gfx.convert = pixman_image_create_bits(PIXMAN_x8r8g8b8,
                                                    surface_width(surface),