diff mbox series

usb-mtp: Fix build with gcc 9

Message ID 155137665241.44413.528147250140332507.stgit@bahia.lan
State New
Headers show
Series usb-mtp: Fix build with gcc 9 | expand

Commit Message

Greg Kurz Feb. 28, 2019, 5:57 p.m. UTC
Build fails with gcc 9:

  CC      hw/usb/dev-mtp.o
hw/usb/dev-mtp.c: In function ‘usb_mtp_write_metadata’:
hw/usb/dev-mtp.c:1754:36: error: taking address of packed member of ‘struct <anonymous>’ may result in an unaligned pointer value [-Werror=address-of-packed-member]
 1754 |                             dataset->filename);
      |                             ~~~~~~~^~~~~~~~~~
cc1: all warnings being treated as errors

The way the MTP protocol encodes strings with a leading 8-bit unsigned
integer containing the string length causes the @filename field of the
packed ObjectInfo structure to be unaligned.

Use a temporary buffer for the utf16 filename instead of passing the
dataset->filename pointer around.

Signed-off-by: Greg Kurz <groug@kaod.org>
---
 hw/usb/dev-mtp.c |   18 +++++++++++++++---
 1 file changed, 15 insertions(+), 3 deletions(-)

Comments

Peter Maydell Feb. 28, 2019, 6:28 p.m. UTC | #1
On Thu, 28 Feb 2019 at 17:57, Greg Kurz <groug@kaod.org> wrote:
>
> Build fails with gcc 9:
>
>   CC      hw/usb/dev-mtp.o
> hw/usb/dev-mtp.c: In function ‘usb_mtp_write_metadata’:
> hw/usb/dev-mtp.c:1754:36: error: taking address of packed member of ‘struct <anonymous>’ may result in an unaligned pointer value [-Werror=address-of-packed-member]
>  1754 |                             dataset->filename);
>       |                             ~~~~~~~^~~~~~~~~~
> cc1: all warnings being treated as errors
>
> The way the MTP protocol encodes strings with a leading 8-bit unsigned
> integer containing the string length causes the @filename field of the
> packed ObjectInfo structure to be unaligned.
>
> Use a temporary buffer for the utf16 filename instead of passing the
> dataset->filename pointer around.
>
> Signed-off-by: Greg Kurz <groug@kaod.org>
> ---
>  hw/usb/dev-mtp.c |   18 +++++++++++++++---
>  1 file changed, 15 insertions(+), 3 deletions(-)
>
> diff --git a/hw/usb/dev-mtp.c b/hw/usb/dev-mtp.c
> index 4ee4fc5a893a..8a0ef7f9f4a8 100644
> --- a/hw/usb/dev-mtp.c
> +++ b/hw/usb/dev-mtp.c
> @@ -1692,13 +1692,25 @@ static void usb_mtp_write_metadata(MTPState *s, uint64_t dlen)
>      MTPObject *o;
>      MTPObject *p = usb_mtp_object_lookup(s, s->dataset.parent_handle);
>      uint32_t next_handle = s->next_handle;
> +    uint16_t *utf16_filename;
> +    uint8_t filename_len;
>
>      assert(!s->write_pending);
>      assert(p != NULL);
>
> -    filename = utf16_to_str(MIN(dataset->length,
> -                                dlen - offsetof(ObjectInfo, filename)),
> -                            dataset->filename);
> +    /*
> +     * MTP Specification 3.2.3 Strings
> +     * Strings in PTP (and thus MTP) consist of standard 2-byte Unicode
> +     * characters as defined by ISO 10646. Strings begin with a single, 8-bit
> +     * unsigned integer that identifies the number of characters to follow (not
> +     * bytes).
> +     *
> +     * This causes dataset->filename to be unaligned.
> +     */
> +    filename_len = MIN(dataset->length, dlen - offsetof(ObjectInfo, filename));
> +    utf16_filename = g_memdup(dataset->filename, filename_len * 2);
> +    filename = utf16_to_str(filename_len, utf16_filename);
> +    g_free(utf16_filename);

I think there's an underlying problem with this code which we
should deal with differently. The 'dataset' local in this
file is (I think) pointing at on-the-wire information from
the USB device, but we're treating it as an array of
host-order uint16_t values. Is this really correct on a
big-endian host ? Do we do the right thing if we are
passed a malicious USB packet that ends halfway through a
utf16_t character, or do we index off the end of the packet
data ?

I think that we should define the "filename" field in
ObjectInfo to be a uint8_t array, make utf16_to_str()
take a uint8_t* for its data array, and have it do the
reading of data from the array with lduw_he_p(), which
can handle accessing unaligned data.

We should also check what the endianness of other fields in
the ObjectInfo struct is (eg "format" and "size" and see
whether we should be doing byte swapping here.

PS: it is a bit confusing that in this function the local
variable "dataset" is a pointer to a struct of entirely
different type to the one that s->dataset is.

thanks
-- PMM
Greg Kurz March 1, 2019, 2:59 p.m. UTC | #2
On Thu, 28 Feb 2019 18:28:23 +0000
Peter Maydell <peter.maydell@linaro.org> wrote:

> On Thu, 28 Feb 2019 at 17:57, Greg Kurz <groug@kaod.org> wrote:
> >
> > Build fails with gcc 9:
> >
> >   CC      hw/usb/dev-mtp.o
> > hw/usb/dev-mtp.c: In function ‘usb_mtp_write_metadata’:
> > hw/usb/dev-mtp.c:1754:36: error: taking address of packed member of ‘struct <anonymous>’ may result in an unaligned pointer value [-Werror=address-of-packed-member]
> >  1754 |                             dataset->filename);
> >       |                             ~~~~~~~^~~~~~~~~~
> > cc1: all warnings being treated as errors
> >
> > The way the MTP protocol encodes strings with a leading 8-bit unsigned
> > integer containing the string length causes the @filename field of the
> > packed ObjectInfo structure to be unaligned.
> >
> > Use a temporary buffer for the utf16 filename instead of passing the
> > dataset->filename pointer around.
> >
> > Signed-off-by: Greg Kurz <groug@kaod.org>
> > ---
> >  hw/usb/dev-mtp.c |   18 +++++++++++++++---
> >  1 file changed, 15 insertions(+), 3 deletions(-)
> >
> > diff --git a/hw/usb/dev-mtp.c b/hw/usb/dev-mtp.c
> > index 4ee4fc5a893a..8a0ef7f9f4a8 100644
> > --- a/hw/usb/dev-mtp.c
> > +++ b/hw/usb/dev-mtp.c
> > @@ -1692,13 +1692,25 @@ static void usb_mtp_write_metadata(MTPState *s, uint64_t dlen)
> >      MTPObject *o;
> >      MTPObject *p = usb_mtp_object_lookup(s, s->dataset.parent_handle);
> >      uint32_t next_handle = s->next_handle;
> > +    uint16_t *utf16_filename;
> > +    uint8_t filename_len;
> >
> >      assert(!s->write_pending);
> >      assert(p != NULL);
> >
> > -    filename = utf16_to_str(MIN(dataset->length,
> > -                                dlen - offsetof(ObjectInfo, filename)),
> > -                            dataset->filename);
> > +    /*
> > +     * MTP Specification 3.2.3 Strings
> > +     * Strings in PTP (and thus MTP) consist of standard 2-byte Unicode
> > +     * characters as defined by ISO 10646. Strings begin with a single, 8-bit
> > +     * unsigned integer that identifies the number of characters to follow (not
> > +     * bytes).
> > +     *
> > +     * This causes dataset->filename to be unaligned.
> > +     */
> > +    filename_len = MIN(dataset->length, dlen - offsetof(ObjectInfo, filename));
> > +    utf16_filename = g_memdup(dataset->filename, filename_len * 2);
> > +    filename = utf16_to_str(filename_len, utf16_filename);
> > +    g_free(utf16_filename);  
> 
> I think there's an underlying problem with this code which we
> should deal with differently. The 'dataset' local in this
> file is (I think) pointing at on-the-wire information from
> the USB device, but we're treating it as an array of
> host-order uint16_t values. Is this really correct on a
> big-endian host ?

I don't know much about usb-mtp and the MTP spec says:

https://theta360blog.files.wordpress.com/2016/04/mtpforusb-ifv1-1.pdf

3.1.1 Multi-byte Data

The standard format for multi-byte data in this specification is
big-endian. That is, the bits within a byte will be read such that
the most significant byte is read first. The actual multi-byte data
sent over the transport may not necessarily adhere to this same
format, and the actual multi-byte data used on the devices may also
use a different multi-byte format. The big-endian convention only
applies within this document, except where otherwise stated.

So I'm not sure about what the code should really do here... :-\

> Do we do the right thing if we are
> passed a malicious USB packet that ends halfway through a
> utf16_t character, or do we index off the end of the packet
> data ?
> 

Can you elaborate ?

> I think that we should define the "filename" field in
> ObjectInfo to be a uint8_t array, make utf16_to_str()
> take a uint8_t* for its data array, and have it do the
> reading of data from the array with lduw_he_p(), which
> can handle accessing unaligned data.
> 
> We should also check what the endianness of other fields in
> the ObjectInfo struct is (eg "format" and "size" and see
> whether we should be doing byte swapping here.
> 

I don't have any idea on that... the code just seems to assume
everything is host endian.

> PS: it is a bit confusing that in this function the local
> variable "dataset" is a pointer to a struct of entirely
> different type to the one that s->dataset is.
> 

Maybe Gerd or Bandan can comment on that.

> thanks
> -- PMM
Peter Maydell March 1, 2019, 3:15 p.m. UTC | #3
On Fri, 1 Mar 2019 at 14:59, Greg Kurz <groug@kaod.org> wrote:
>
> On Thu, 28 Feb 2019 18:28:23 +0000
> Peter Maydell <peter.maydell@linaro.org> wrote:
>
> > On Thu, 28 Feb 2019 at 17:57, Greg Kurz <groug@kaod.org> wrote:
> > > +    /*
> > > +     * MTP Specification 3.2.3 Strings
> > > +     * Strings in PTP (and thus MTP) consist of standard 2-byte Unicode
> > > +     * characters as defined by ISO 10646. Strings begin with a single, 8-bit
> > > +     * unsigned integer that identifies the number of characters to follow (not
> > > +     * bytes).
> > > +     *
> > > +     * This causes dataset->filename to be unaligned.
> > > +     */
> > > +    filename_len = MIN(dataset->length, dlen - offsetof(ObjectInfo, filename));
> > > +    utf16_filename = g_memdup(dataset->filename, filename_len * 2);
> > > +    filename = utf16_to_str(filename_len, utf16_filename);
> > > +    g_free(utf16_filename);

> > Do we do the right thing if we are
> > passed a malicious USB packet that ends halfway through a
> > utf16_t character, or do we index off the end of the packet
> > data ?
> >
>
> Can you elaborate ?

If the data packet length is such that the packet stops
one byte into a two-byte character in the filename,
do we try to read the two bytes that straddle
the end of the buffer ?

In fact this expression looks bogus:
 filename_len = MIN(dataset->length, dlen - offsetof(ObjectInfo, filename));

because dataset->length (as the comment says) is a
count of 2-byte characters, but the calculation involving
offsetof() is a byte count. So we don't correctly clip
the character count and utf16_to_str() will happily
read off the end of the buffer it is passed.

thanks
-- PMM
Greg Kurz March 1, 2019, 3:34 p.m. UTC | #4
On Fri, 1 Mar 2019 15:15:43 +0000
Peter Maydell <peter.maydell@linaro.org> wrote:

> On Fri, 1 Mar 2019 at 14:59, Greg Kurz <groug@kaod.org> wrote:
> >
> > On Thu, 28 Feb 2019 18:28:23 +0000
> > Peter Maydell <peter.maydell@linaro.org> wrote:
> >  
> > > On Thu, 28 Feb 2019 at 17:57, Greg Kurz <groug@kaod.org> wrote:  
> > > > +    /*
> > > > +     * MTP Specification 3.2.3 Strings
> > > > +     * Strings in PTP (and thus MTP) consist of standard 2-byte Unicode
> > > > +     * characters as defined by ISO 10646. Strings begin with a single, 8-bit
> > > > +     * unsigned integer that identifies the number of characters to follow (not
> > > > +     * bytes).
> > > > +     *
> > > > +     * This causes dataset->filename to be unaligned.
> > > > +     */
> > > > +    filename_len = MIN(dataset->length, dlen - offsetof(ObjectInfo, filename));
> > > > +    utf16_filename = g_memdup(dataset->filename, filename_len * 2);
> > > > +    filename = utf16_to_str(filename_len, utf16_filename);
> > > > +    g_free(utf16_filename);  
> 
> > > Do we do the right thing if we are
> > > passed a malicious USB packet that ends halfway through a
> > > utf16_t character, or do we index off the end of the packet
> > > data ?
> > >  
> >
> > Can you elaborate ?  
> 
> If the data packet length is such that the packet stops
> one byte into a two-byte character in the filename,
> do we try to read the two bytes that straddle
> the end of the buffer ?
> 
> In fact this expression looks bogus:
>  filename_len = MIN(dataset->length, dlen - offsetof(ObjectInfo, filename));
> 
> because dataset->length (as the comment says) is a
> count of 2-byte characters, but the calculation involving
> offsetof() is a byte count. So we don't correctly clip
> the character count and utf16_to_str() will happily
> read off the end of the buffer it is passed.
> 

Oh you're right, this should rather be:

    filename_len = MIN(dataset->length * 2, dlen - offsetof(ObjectInfo, filename)) / 2;

So that if length is say 20 (ie, 40 bytes) but the buffer only has say 39 bytes
for the filename, we clip at 19 characters.

Since the current code already has the issue, I guess this should be fixed
in its own patch.

> thanks
> -- PMM
Bandan Das March 1, 2019, 9:08 p.m. UTC | #5
Greg Kurz <groug@kaod.org> writes:
...
>> 
>> I think there's an underlying problem with this code which we
>> should deal with differently. The 'dataset' local in this
>> file is (I think) pointing at on-the-wire information from
>> the USB device, but we're treating it as an array of
>> host-order uint16_t values. Is this really correct on a
>> big-endian host ?
>
> I don't know much about usb-mtp and the MTP spec says:
>
> https://theta360blog.files.wordpress.com/2016/04/mtpforusb-ifv1-1.pdf
>
> 3.1.1 Multi-byte Data
>
> The standard format for multi-byte data in this specification is
> big-endian. That is, the bits within a byte will be read such that
> the most significant byte is read first. The actual multi-byte data
> sent over the transport may not necessarily adhere to this same
> format, and the actual multi-byte data used on the devices may also
> use a different multi-byte format. The big-endian convention only
> applies within this document, except where otherwise stated.
>
> So I'm not sure about what the code should really do here... :-\
>

If I remember correctly, with USB transport, multibyte values are
little endian and it supersedes the MTP spec? (which is why the code works
as expected on a little endian host). As Peter said, some byte swapping
is probably needed for this to work on big endian hosts.

>> Do we do the right thing if we are
>> passed a malicious USB packet that ends halfway through a
>> utf16_t character, or do we index off the end of the packet
>> data ?
>> 
>
> Can you elaborate ?
>
>> I think that we should define the "filename" field in
>> ObjectInfo to be a uint8_t array, make utf16_to_str()
>> take a uint8_t* for its data array, and have it do the
>> reading of data from the array with lduw_he_p(), which
>> can handle accessing unaligned data.
>> 
>> We should also check what the endianness of other fields in
>> the ObjectInfo struct is (eg "format" and "size" and see
>> whether we should be doing byte swapping here.
>> 
>
> I don't have any idea on that... the code just seems to assume
> everything is host endian.
>
>> PS: it is a bit confusing that in this function the local
>> variable "dataset" is a pointer to a struct of entirely
>> different type to the one that s->dataset is.
>> 
>
> Maybe Gerd or Bandan can comment on that.
>
>> thanks
>> -- PMM
diff mbox series

Patch

diff --git a/hw/usb/dev-mtp.c b/hw/usb/dev-mtp.c
index 4ee4fc5a893a..8a0ef7f9f4a8 100644
--- a/hw/usb/dev-mtp.c
+++ b/hw/usb/dev-mtp.c
@@ -1692,13 +1692,25 @@  static void usb_mtp_write_metadata(MTPState *s, uint64_t dlen)
     MTPObject *o;
     MTPObject *p = usb_mtp_object_lookup(s, s->dataset.parent_handle);
     uint32_t next_handle = s->next_handle;
+    uint16_t *utf16_filename;
+    uint8_t filename_len;
 
     assert(!s->write_pending);
     assert(p != NULL);
 
-    filename = utf16_to_str(MIN(dataset->length,
-                                dlen - offsetof(ObjectInfo, filename)),
-                            dataset->filename);
+    /*
+     * MTP Specification 3.2.3 Strings
+     * Strings in PTP (and thus MTP) consist of standard 2-byte Unicode
+     * characters as defined by ISO 10646. Strings begin with a single, 8-bit
+     * unsigned integer that identifies the number of characters to follow (not
+     * bytes).
+     *
+     * This causes dataset->filename to be unaligned.
+     */
+    filename_len = MIN(dataset->length, dlen - offsetof(ObjectInfo, filename));
+    utf16_filename = g_memdup(dataset->filename, filename_len * 2);
+    filename = utf16_to_str(filename_len, utf16_filename);
+    g_free(utf16_filename);
 
     if (strchr(filename, '/')) {
         usb_mtp_queue_result(s, RES_PARAMETER_NOT_SUPPORTED, d->trans,