Patchwork [for-1.4,01/12] qmp: Fix design bug and read beyond buffer in memchar-write

login
register
mail settings
Submitter Markus Armbruster
Date Feb. 5, 2013, 4:22 p.m.
Message ID <1360081335-6594-2-git-send-email-armbru@redhat.com>
Download mbox | patch
Permalink /patch/218327/
State New
Headers show

Comments

Markus Armbruster - Feb. 5, 2013, 4:22 p.m.
Command memchar-write takes data and size parameter.  Begs the
question what happens when data doesn't match size.

With format base64, qmp_memchar_write() copies the full data argument,
regardless of size argument.

With format utf8, qmp_memchar_write() copies size bytes from data,
happily reading beyond data.  Copies crap from the heap or even
crashes.

Drop the size parameter, and always copy the full data argument.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
---
 hmp.c            | 4 +---
 qapi-schema.json | 4 +---
 qemu-char.c      | 8 +++-----
 qmp-commands.hx  | 4 +---
 4 files changed, 6 insertions(+), 14 deletions(-)
Markus Armbruster - Feb. 6, 2013, 9:06 a.m.
Markus Armbruster <armbru@redhat.com> writes:

> Eric Blake <eblake@redhat.com> writes:
>
>> On 02/05/2013 09:22 AM, Markus Armbruster wrote:
>>> Command memchar-write takes data and size parameter.  Begs the
>>> question what happens when data doesn't match size.
>>> 
>>> With format base64, qmp_memchar_write() copies the full data argument,
>>> regardless of size argument.
>>> 
>>> With format utf8, qmp_memchar_write() copies size bytes from data,
>>> happily reading beyond data.  Copies crap from the heap or even
>>> crashes.
>>> 
>>> Drop the size parameter, and always copy the full data argument.
>>> 
>>> Signed-off-by: Markus Armbruster <armbru@redhat.com>
>>> ---
>>>  hmp.c            | 4 +---
>>>  qapi-schema.json | 4 +---
>>>  qemu-char.c      | 8 +++-----
>>>  qmp-commands.hx  | 4 +---
>>>  4 files changed, 6 insertions(+), 14 deletions(-)
>>
>>>      if (has_format && (format == DATA_FORMAT_BASE64)) {
>>>          write_data = g_base64_decode(data, &write_count);
>>>      } else {
>>>          write_data = (uint8_t *)data;
>>> +        write_count = strlen(data);
>>>      }
>>
>> Obviously, base64 is the only way to write an embedded NUL.  But what
>
> Consider the JSON string "this \\u0000 is fun".  But our JSON parser
> chokes on it, so we can ignore it until that's fixed.  See my "[PATCH]
> check-qjson: More thorough testing of UTF-8 in strings", specifically
> the comment right at the beginning of utf8_string().
>
>> happens if the user requests base64 encoding, but the utf8 string that
>> got passed in through JSON is not a valid base64-encoded string?  Does
>> g_base64_decode report an error in that case, and should you be handling
>> the error here?
>
> Good question.  I wish it had occured to GLib developers:
> http://developer.gnome.org/glib/stable/glib-Base64-Encoding.html#g-base64-decode
>
> Seriously, I need to find out what the heck g_base64_decode() does when
> it's fed crap.  If it fails in a reliable and detectable manner, I need
> to handle the failure.  If not, I need to replace the function.
>
> Moreover, I should document which characters outside the base64 alphabet
> are silently ignored, if any.  All whitespace?  Just newlines?

As far as I can tell, it never fails, but silently ignores characters
outside the alphabet [A-Za-z0-9+/], as well as unpadded suffixes.  Oh,
and it does something weird when padding appears in the middle.
Craptastic.

We can either document this behavior as a feature, or we replace the
function by one that accepts only valid base64.  If we do the latter, we
better specify the language we want to accept now, but I guess we could
choose to delay the actual checking post 1.4.

There's another use of g_base64_decode() in qmp_guest_file_write().
Which means guest agent command guest-file-write is similarly
ill-defined.  Mike, anything to be done there?


---<test program>---
#include <glib.h>
#include <stdio.h>
#include <stdlib.h>

int
main(int argc, char *argv[])
{
    char **ap, *b64;
    unsigned char *buf;
    size_t sz, i;

    for (ap = argv + 1; *ap; ap++) {
	printf("in : %s\n", *ap);
	buf = g_base64_decode(*ap, &sz);
	printf("out:");
	for (i = 0; i < sz; i++) {
	    printf(" %02x", buf[i]);
	}
	putchar('\n');
	b64 = g_base64_encode(buf, sz);
	printf("b64: %s\n", b64);
	free(buf);
    }
}
---<test run>---
in : 1
out:
b64: 
in : 1=
out:
b64: 
in : 1==
out:
b64: 
in : 1===
out: d4
b64: 1A==
in : 12
out:
b64: 
in : 123
out:
b64: 
in : 1234
out: d7 6d f8
b64: 1234
in : =1234
out: 03 5d b7
b64: A123
in : 1=234
out: d4 0d b7
b64: 1A23
in : <>?,./watch this!@#$%^&*()_+
out: ff 06 ad 72 1b 61
b64: /watchth
in : /watchthis+
out: ff 06 ad 72 1b 61
b64: /watchth
Luiz Capitulino - Feb. 6, 2013, 12:41 p.m.
On Wed, 06 Feb 2013 10:06:03 +0100
Markus Armbruster <armbru@redhat.com> wrote:

> Markus Armbruster <armbru@redhat.com> writes:
> 
> > Eric Blake <eblake@redhat.com> writes:
> >
> >> On 02/05/2013 09:22 AM, Markus Armbruster wrote:
> >>> Command memchar-write takes data and size parameter.  Begs the
> >>> question what happens when data doesn't match size.
> >>> 
> >>> With format base64, qmp_memchar_write() copies the full data argument,
> >>> regardless of size argument.
> >>> 
> >>> With format utf8, qmp_memchar_write() copies size bytes from data,
> >>> happily reading beyond data.  Copies crap from the heap or even
> >>> crashes.
> >>> 
> >>> Drop the size parameter, and always copy the full data argument.
> >>> 
> >>> Signed-off-by: Markus Armbruster <armbru@redhat.com>
> >>> ---
> >>>  hmp.c            | 4 +---
> >>>  qapi-schema.json | 4 +---
> >>>  qemu-char.c      | 8 +++-----
> >>>  qmp-commands.hx  | 4 +---
> >>>  4 files changed, 6 insertions(+), 14 deletions(-)
> >>
> >>>      if (has_format && (format == DATA_FORMAT_BASE64)) {
> >>>          write_data = g_base64_decode(data, &write_count);
> >>>      } else {
> >>>          write_data = (uint8_t *)data;
> >>> +        write_count = strlen(data);
> >>>      }
> >>
> >> Obviously, base64 is the only way to write an embedded NUL.  But what
> >
> > Consider the JSON string "this \\u0000 is fun".  But our JSON parser
> > chokes on it, so we can ignore it until that's fixed.  See my "[PATCH]
> > check-qjson: More thorough testing of UTF-8 in strings", specifically
> > the comment right at the beginning of utf8_string().
> >
> >> happens if the user requests base64 encoding, but the utf8 string that
> >> got passed in through JSON is not a valid base64-encoded string?  Does
> >> g_base64_decode report an error in that case, and should you be handling
> >> the error here?
> >
> > Good question.  I wish it had occured to GLib developers:
> > http://developer.gnome.org/glib/stable/glib-Base64-Encoding.html#g-base64-decode
> >
> > Seriously, I need to find out what the heck g_base64_decode() does when
> > it's fed crap.  If it fails in a reliable and detectable manner, I need
> > to handle the failure.  If not, I need to replace the function.
> >
> > Moreover, I should document which characters outside the base64 alphabet
> > are silently ignored, if any.  All whitespace?  Just newlines?
> 
> As far as I can tell, it never fails, but silently ignores characters
> outside the alphabet [A-Za-z0-9+/], as well as unpadded suffixes.  Oh,
> and it does something weird when padding appears in the middle.
> Craptastic.
> 
> We can either document this behavior as a feature, or we replace the
> function by one that accepts only valid base64.  If we do the latter, we
> better specify the language we want to accept now, but I guess we could
> choose to delay the actual checking post 1.4.
> 
> There's another use of g_base64_decode() in qmp_guest_file_write().
> Which means guest agent command guest-file-write is similarly
> ill-defined.  Mike, anything to be done there?

Long time ago Jan warned us about this and he even had patches for
adding a base64 encoder/decoder to qemu.

I think we won't have other option if we want reliable base64
support.
Markus Armbruster - Feb. 6, 2013, 1:11 p.m.
Luiz Capitulino <lcapitulino@redhat.com> writes:

> On Wed, 06 Feb 2013 10:06:03 +0100
> Markus Armbruster <armbru@redhat.com> wrote:
>
>> Markus Armbruster <armbru@redhat.com> writes:
>> 
>> > Eric Blake <eblake@redhat.com> writes:
>> >
>> >> On 02/05/2013 09:22 AM, Markus Armbruster wrote:
>> >>> Command memchar-write takes data and size parameter.  Begs the
>> >>> question what happens when data doesn't match size.
>> >>> 
>> >>> With format base64, qmp_memchar_write() copies the full data argument,
>> >>> regardless of size argument.
>> >>> 
>> >>> With format utf8, qmp_memchar_write() copies size bytes from data,
>> >>> happily reading beyond data.  Copies crap from the heap or even
>> >>> crashes.
>> >>> 
>> >>> Drop the size parameter, and always copy the full data argument.
>> >>> 
>> >>> Signed-off-by: Markus Armbruster <armbru@redhat.com>
>> >>> ---
>> >>>  hmp.c            | 4 +---
>> >>>  qapi-schema.json | 4 +---
>> >>>  qemu-char.c      | 8 +++-----
>> >>>  qmp-commands.hx  | 4 +---
>> >>>  4 files changed, 6 insertions(+), 14 deletions(-)
>> >>
>> >>>      if (has_format && (format == DATA_FORMAT_BASE64)) {
>> >>>          write_data = g_base64_decode(data, &write_count);
>> >>>      } else {
>> >>>          write_data = (uint8_t *)data;
>> >>> +        write_count = strlen(data);
>> >>>      }
>> >>
>> >> Obviously, base64 is the only way to write an embedded NUL.  But what
>> >
>> > Consider the JSON string "this \\u0000 is fun".  But our JSON parser
>> > chokes on it, so we can ignore it until that's fixed.  See my "[PATCH]
>> > check-qjson: More thorough testing of UTF-8 in strings", specifically
>> > the comment right at the beginning of utf8_string().
>> >
>> >> happens if the user requests base64 encoding, but the utf8 string that
>> >> got passed in through JSON is not a valid base64-encoded string?  Does
>> >> g_base64_decode report an error in that case, and should you be handling
>> >> the error here?
>> >
>> > Good question.  I wish it had occured to GLib developers:
>> > http://developer.gnome.org/glib/stable/glib-Base64-Encoding.html#g-base64-decode
>> >
>> > Seriously, I need to find out what the heck g_base64_decode() does when
>> > it's fed crap.  If it fails in a reliable and detectable manner, I need
>> > to handle the failure.  If not, I need to replace the function.
>> >
>> > Moreover, I should document which characters outside the base64 alphabet
>> > are silently ignored, if any.  All whitespace?  Just newlines?
>> 
>> As far as I can tell, it never fails, but silently ignores characters
>> outside the alphabet [A-Za-z0-9+/], as well as unpadded suffixes.  Oh,
>> and it does something weird when padding appears in the middle.
>> Craptastic.
>> 
>> We can either document this behavior as a feature, or we replace the
>> function by one that accepts only valid base64.  If we do the latter, we
>> better specify the language we want to accept now, but I guess we could
>> choose to delay the actual checking post 1.4.
>> 
>> There's another use of g_base64_decode() in qmp_guest_file_write().
>> Which means guest agent command guest-file-write is similarly
>> ill-defined.  Mike, anything to be done there?
>
> Long time ago Jan warned us about this and he even had patches for
> adding a base64 encoder/decoder to qemu.
>
> I think we won't have other option if we want reliable base64
> support.

How about this: I respin this series right away with documentation
changed to demand data contains only characters from the base64
alphabet.

Actually enforcing that will be a separate series.  Let me know if you
want me to try to get it into 1.4.
Luiz Capitulino - Feb. 6, 2013, 1:23 p.m.
On Tue,  5 Feb 2013 17:22:04 +0100
Markus Armbruster <armbru@redhat.com> wrote:

> Command memchar-write takes data and size parameter.  Begs the
> question what happens when data doesn't match size.
> 
> With format base64, qmp_memchar_write() copies the full data argument,
> regardless of size argument.
> 
> With format utf8, qmp_memchar_write() copies size bytes from data,
> happily reading beyond data.  Copies crap from the heap or even
> crashes.
> 
> Drop the size parameter, and always copy the full data argument.
> 
> Signed-off-by: Markus Armbruster <armbru@redhat.com>
> ---
>  hmp.c            | 4 +---
>  qapi-schema.json | 4 +---
>  qemu-char.c      | 8 +++-----
>  qmp-commands.hx  | 4 +---
>  4 files changed, 6 insertions(+), 14 deletions(-)
> 
> diff --git a/hmp.c b/hmp.c
> index 1689e6f..9fdf1ce 100644
> --- a/hmp.c
> +++ b/hmp.c
> @@ -664,13 +664,11 @@ void hmp_pmemsave(Monitor *mon, const QDict *qdict)
>  
>  void hmp_memchar_write(Monitor *mon, const QDict *qdict)
>  {
> -    uint32_t size;
>      const char *chardev = qdict_get_str(qdict, "device");
>      const char *data = qdict_get_str(qdict, "data");
>      Error *errp = NULL;
>  
> -    size = strlen(data);
> -    qmp_memchar_write(chardev, size, data, false, 0, &errp);
> +    qmp_memchar_write(chardev, data, false, 0, &errp);
>  
>      hmp_handle_error(mon, &errp);
>  }
> diff --git a/qapi-schema.json b/qapi-schema.json
> index cdd8384..9e2cbbd 100644
> --- a/qapi-schema.json
> +++ b/qapi-schema.json
> @@ -346,8 +346,6 @@
>  #
>  # @device: the name of the memory char device.
>  #
> -# @size: the size to write in bytes.
> -#
>  # @data: the source data write to memchar.

I agree with this change, but I have an observation.

This is fine for QMP on the wire, as data will always be a string. But
for QMP as a C interface, this means that we require data to be passed
as a string (ie. null-terminated).

I've just discussed this with Markus on IRC, and his suggestion is
that the C interface should be different. I agree with him.

But, should we at least make this explicity? Like changing the
@DataFormat doc for @utf8 to:

 @utf8: The data format is a null-terminated utf8 string

 or

 @utf8: The data format is a json string

>  #
>  # @format: #optional the format of the data write to chardev 'memory',
> @@ -359,7 +357,7 @@
>  # Since: 1.4
>  ##
>  { 'command': 'memchar-write',
> -  'data': {'device': 'str', 'size': 'int', 'data': 'str',
> +  'data': {'device': 'str', 'data': 'str',
>             '*format': 'DataFormat'} }
>  
>  ##
> diff --git a/qemu-char.c b/qemu-char.c
> index ac5d62d..9c1dd13 100644
> --- a/qemu-char.c
> +++ b/qemu-char.c
> @@ -2753,9 +2753,8 @@ static bool qemu_is_chr(const CharDriverState *chr, const char *filename)
>      return strcmp(chr->filename, filename);
>  }
>  
> -void qmp_memchar_write(const char *device, int64_t size,
> -                       const char *data, bool has_format,
> -                       enum DataFormat format,
> +void qmp_memchar_write(const char *device, const char *data,
> +                       bool has_format, enum DataFormat format,
>                         Error **errp)
>  {
>      CharDriverState *chr;
> @@ -2774,12 +2773,11 @@ void qmp_memchar_write(const char *device, int64_t size,
>          return;
>      }
>  
> -    write_count = (gsize)size;
> -
>      if (has_format && (format == DATA_FORMAT_BASE64)) {
>          write_data = g_base64_decode(data, &write_count);
>      } else {
>          write_data = (uint8_t *)data;
> +        write_count = strlen(data);
>      }
>  
>      ret = cirmem_chr_write(chr, write_data, write_count);
> diff --git a/qmp-commands.hx b/qmp-commands.hx
> index bbb21f3..8468f10 100644
> --- a/qmp-commands.hx
> +++ b/qmp-commands.hx
> @@ -467,7 +467,7 @@ EQMP
>  
>      {
>          .name       = "memchar-write",
> -        .args_type  = "device:s,size:i,data:s,format:s?",
> +        .args_type  = "device:s,data:s,format:s?",
>          .mhandler.cmd_new = qmp_marshal_input_memchar_write,
>      },
>  
> @@ -481,7 +481,6 @@ char device.
>  Arguments:
>  
>  - "device": the name of the char device, must be unique (json-string)
> -- "size": the memory size, in bytes, should be power of 2 (json-int)
>  - "data": the source data write to memory (json-string)
>  - "format": the data format write to memory, default is
>              utf8. (json-string, optional)
> @@ -491,7 +490,6 @@ Example:
>  
>  -> { "execute": "memchar-write",
>                  "arguments": { "device": foo,
> -                               "size": 8,
>                                 "data": "abcdefgh",
>                                 "format": "utf8" } }
>  <- { "return": {} }
Luiz Capitulino - Feb. 6, 2013, 1:25 p.m.
On Wed, 06 Feb 2013 14:11:10 +0100
Markus Armbruster <armbru@redhat.com> wrote:

> Luiz Capitulino <lcapitulino@redhat.com> writes:
> 
> > On Wed, 06 Feb 2013 10:06:03 +0100
> > Markus Armbruster <armbru@redhat.com> wrote:
> >
> >> Markus Armbruster <armbru@redhat.com> writes:
> >> 
> >> > Eric Blake <eblake@redhat.com> writes:
> >> >
> >> >> On 02/05/2013 09:22 AM, Markus Armbruster wrote:
> >> >>> Command memchar-write takes data and size parameter.  Begs the
> >> >>> question what happens when data doesn't match size.
> >> >>> 
> >> >>> With format base64, qmp_memchar_write() copies the full data argument,
> >> >>> regardless of size argument.
> >> >>> 
> >> >>> With format utf8, qmp_memchar_write() copies size bytes from data,
> >> >>> happily reading beyond data.  Copies crap from the heap or even
> >> >>> crashes.
> >> >>> 
> >> >>> Drop the size parameter, and always copy the full data argument.
> >> >>> 
> >> >>> Signed-off-by: Markus Armbruster <armbru@redhat.com>
> >> >>> ---
> >> >>>  hmp.c            | 4 +---
> >> >>>  qapi-schema.json | 4 +---
> >> >>>  qemu-char.c      | 8 +++-----
> >> >>>  qmp-commands.hx  | 4 +---
> >> >>>  4 files changed, 6 insertions(+), 14 deletions(-)
> >> >>
> >> >>>      if (has_format && (format == DATA_FORMAT_BASE64)) {
> >> >>>          write_data = g_base64_decode(data, &write_count);
> >> >>>      } else {
> >> >>>          write_data = (uint8_t *)data;
> >> >>> +        write_count = strlen(data);
> >> >>>      }
> >> >>
> >> >> Obviously, base64 is the only way to write an embedded NUL.  But what
> >> >
> >> > Consider the JSON string "this \\u0000 is fun".  But our JSON parser
> >> > chokes on it, so we can ignore it until that's fixed.  See my "[PATCH]
> >> > check-qjson: More thorough testing of UTF-8 in strings", specifically
> >> > the comment right at the beginning of utf8_string().
> >> >
> >> >> happens if the user requests base64 encoding, but the utf8 string that
> >> >> got passed in through JSON is not a valid base64-encoded string?  Does
> >> >> g_base64_decode report an error in that case, and should you be handling
> >> >> the error here?
> >> >
> >> > Good question.  I wish it had occured to GLib developers:
> >> > http://developer.gnome.org/glib/stable/glib-Base64-Encoding.html#g-base64-decode
> >> >
> >> > Seriously, I need to find out what the heck g_base64_decode() does when
> >> > it's fed crap.  If it fails in a reliable and detectable manner, I need
> >> > to handle the failure.  If not, I need to replace the function.
> >> >
> >> > Moreover, I should document which characters outside the base64 alphabet
> >> > are silently ignored, if any.  All whitespace?  Just newlines?
> >> 
> >> As far as I can tell, it never fails, but silently ignores characters
> >> outside the alphabet [A-Za-z0-9+/], as well as unpadded suffixes.  Oh,
> >> and it does something weird when padding appears in the middle.
> >> Craptastic.
> >> 
> >> We can either document this behavior as a feature, or we replace the
> >> function by one that accepts only valid base64.  If we do the latter, we
> >> better specify the language we want to accept now, but I guess we could
> >> choose to delay the actual checking post 1.4.
> >> 
> >> There's another use of g_base64_decode() in qmp_guest_file_write().
> >> Which means guest agent command guest-file-write is similarly
> >> ill-defined.  Mike, anything to be done there?
> >
> > Long time ago Jan warned us about this and he even had patches for
> > adding a base64 encoder/decoder to qemu.
> >
> > I think we won't have other option if we want reliable base64
> > support.
> 
> How about this: I respin this series right away with documentation
> changed to demand data contains only characters from the base64
> alphabet.

Seems good enough.

> Actually enforcing that will be a separate series.  Let me know if you
> want me to try to get it into 1.4.

I don't think so. Time is short, and as already noted qemu-ga also
has this problem for some time now.
Peter Maydell - Feb. 6, 2013, 1:30 p.m.
On 6 February 2013 09:06, Markus Armbruster <armbru@redhat.com> wrote:
> As far as I can tell, it never fails, but silently ignores characters
> outside the alphabet [A-Za-z0-9+/]

This bit at least is required behaviour: see RFC2045 section 6.8:

   Any characters outside of the base64 alphabet are to be ignored in
   base64-encoded data.

(thanks to Tony Finch for pointing that one out to me.)

-- PMM
Markus Armbruster - Feb. 6, 2013, 1:51 p.m.
Luiz Capitulino <lcapitulino@redhat.com> writes:

> On Tue,  5 Feb 2013 17:22:04 +0100
> Markus Armbruster <armbru@redhat.com> wrote:
>
>> Command memchar-write takes data and size parameter.  Begs the
>> question what happens when data doesn't match size.
>> 
>> With format base64, qmp_memchar_write() copies the full data argument,
>> regardless of size argument.
>> 
>> With format utf8, qmp_memchar_write() copies size bytes from data,
>> happily reading beyond data.  Copies crap from the heap or even
>> crashes.
>> 
>> Drop the size parameter, and always copy the full data argument.
>> 
>> Signed-off-by: Markus Armbruster <armbru@redhat.com>
>> ---
>>  hmp.c            | 4 +---
>>  qapi-schema.json | 4 +---
>>  qemu-char.c      | 8 +++-----
>>  qmp-commands.hx  | 4 +---
>>  4 files changed, 6 insertions(+), 14 deletions(-)
>> 
>> diff --git a/hmp.c b/hmp.c
>> index 1689e6f..9fdf1ce 100644
>> --- a/hmp.c
>> +++ b/hmp.c
>> @@ -664,13 +664,11 @@ void hmp_pmemsave(Monitor *mon, const QDict *qdict)
>>  
>>  void hmp_memchar_write(Monitor *mon, const QDict *qdict)
>>  {
>> -    uint32_t size;
>>      const char *chardev = qdict_get_str(qdict, "device");
>>      const char *data = qdict_get_str(qdict, "data");
>>      Error *errp = NULL;
>>  
>> -    size = strlen(data);
>> -    qmp_memchar_write(chardev, size, data, false, 0, &errp);
>> +    qmp_memchar_write(chardev, data, false, 0, &errp);
>>  
>>      hmp_handle_error(mon, &errp);
>>  }
>> diff --git a/qapi-schema.json b/qapi-schema.json
>> index cdd8384..9e2cbbd 100644
>> --- a/qapi-schema.json
>> +++ b/qapi-schema.json
>> @@ -346,8 +346,6 @@
>>  #
>>  # @device: the name of the memory char device.
>>  #
>> -# @size: the size to write in bytes.
>> -#
>>  # @data: the source data write to memchar.
>
> I agree with this change, but I have an observation.
>
> This is fine for QMP on the wire, as data will always be a string. But
> for QMP as a C interface, this means that we require data to be passed
> as a string (ie. null-terminated).
>
> I've just discussed this with Markus on IRC, and his suggestion is
> that the C interface should be different. I agree with him.
>
> But, should we at least make this explicity? Like changing the
> @DataFormat doc for @utf8 to:
>
>  @utf8: The data format is a null-terminated utf8 string

But what does that mean?  See below.

>  or
>
>  @utf8: The data format is a json string

That's trivially true for any json-string argument, including the data
argument with format base64.

Have a look at PATCH 12/12:

 # @memchar-write:
[...]
+# @format: #optional data encoding (default 'utf8').
+#          - base64: data must be base64 encoded text.  It's binary
+#            decoding gets written.
+#          - utf8: data's UTF-8 encoding is written
+#          - data itself is always Unicode regardless of format, like
+#            any other string.
[...]
 # @memchar-read:
[...]
+# @format: #optional data encoding (default 'utf8').
+#          - base64: the data read is returned in base64 encoding.
+#          - utf8: the data read is interpreted as UTF-8.
[Bug doc snipped for clarity...]
+#          - The return value is always Unicode regardless of format,
+#            like any other string.

Back to your question, namely how to document enumeration DataFormat.
Perhaps:

##
# @DataFormat:
#
# An enumeration of data format.
#
# @utf8: Data is a UTF-8 string (RFC 3629)
#
# @base64: Data is a Base64 encoded binary (RFC 3548)
#
# Since: 1.4
##

This omits the fact that data itself is a JSON string, because that's
"obvious".  Hmm.  Let me try -v:

##
# @DataFormat:
#
# An enumeration describing how a data string is to be converted.
#
# @utf8: The string is to be converted to UTF-8 (RFC 3629).  Always
#        possible, because the string is in Unicode.
#
# @base64: The string must be Base64 (RFC 3548) encoded binary, and is
#          to be converted back to binary.
#
# Since: 1.4
##
Markus Armbruster - Feb. 6, 2013, 1:56 p.m.
Peter Maydell <peter.maydell@linaro.org> writes:

> On 6 February 2013 09:06, Markus Armbruster <armbru@redhat.com> wrote:
>> As far as I can tell, it never fails, but silently ignores characters
>> outside the alphabet [A-Za-z0-9+/]
>
> This bit at least is required behaviour: see RFC2045 section 6.8:
>
>    Any characters outside of the base64 alphabet are to be ignored in
>    base64-encoded data.
>
> (thanks to Tony Finch for pointing that one out to me.)

RFC 2045 is "Multipurpose Internet Mail Extensions (MIME) Part One:
Format of Internet Message Bodies".  As such, it is about a *transfer
encoding* of Base64.

RFC 3548 "The Base16, Base32, and Base64 Data Encodings":

2.3.  Interpretation of non-alphabet characters in encoded data

   Base encodings use a specific, reduced, alphabet to encode binary
   data.  Non alphabet characters could exist within base encoded data,
   caused by data corruption or by design.  Non alphabet characters may
   be exploited as a "covert channel", where non-protocol data can be
   sent for nefarious purposes.  Non alphabet characters might also be
   sent in order to exploit implementation errors leading to, e.g.,
   buffer overflow attacks.

   Implementations MUST reject the encoding if it contains characters
   outside the base alphabet when interpreting base encoded data, unless
   the specification referring to this document explicitly states
   otherwise.  Such specifications may, as MIME does, instead state that
   characters outside the base encoding alphabet should simply be
   ignored when interpreting data ("be liberal in what you accept").
   Note that this means that any CRLF constitute "non alphabet
   characters" and are ignored.  Furthermore, such specifications may
   consider the pad character, "=", as not part of the base alphabet
   until the end of the string.  If more than the allowed number of pad
   characters are found at the end of the string, e.g., a base 64 string
   terminated with "===", the excess pad characters could be ignored.

8.  Security Considerations
[...]
   If non-alphabet characters are ignored, instead of causing rejection
   of the entire encoding (as recommended), a covert channel that can be
   used to "leak" information is made possible.  The implications of
   this should be understood in applications that do not follow the
   recommended practice.
[...]
Luiz Capitulino - Feb. 6, 2013, 3:10 p.m.
On Wed, 06 Feb 2013 14:51:44 +0100
Markus Armbruster <armbru@redhat.com> wrote:

> Back to your question, namely how to document enumeration DataFormat.
> Perhaps:
> 
> ##
> # @DataFormat:
> #
> # An enumeration of data format.
> #
> # @utf8: Data is a UTF-8 string (RFC 3629)
> #
> # @base64: Data is a Base64 encoded binary (RFC 3548)
> #
> # Since: 1.4
> ##
> 
> This omits the fact that data itself is a JSON string, because that's
> "obvious".  Hmm.  Let me try -v:

That looks good to me.
Markus Armbruster - Feb. 6, 2013, 4:15 p.m.
Luiz Capitulino <lcapitulino@redhat.com> writes:

> On Wed, 06 Feb 2013 14:51:44 +0100
> Markus Armbruster <armbru@redhat.com> wrote:
>
>> Back to your question, namely how to document enumeration DataFormat.
>> Perhaps:
>> 
>> ##
>> # @DataFormat:
>> #
>> # An enumeration of data format.
>> #
>> # @utf8: Data is a UTF-8 string (RFC 3629)
>> #
>> # @base64: Data is a Base64 encoded binary (RFC 3548)
>> #
>> # Since: 1.4
>> ##
>> 
>> This omits the fact that data itself is a JSON string, because that's
>> "obvious".  Hmm.  Let me try -v:
>
> That looks good to me.

Sold.
Michael Roth - Feb. 6, 2013, 4:36 p.m.
On Wed, Feb 06, 2013 at 10:06:03AM +0100, Markus Armbruster wrote:
> Markus Armbruster <armbru@redhat.com> writes:
> 
> > Eric Blake <eblake@redhat.com> writes:
> >
> >> On 02/05/2013 09:22 AM, Markus Armbruster wrote:
> >>> Command memchar-write takes data and size parameter.  Begs the
> >>> question what happens when data doesn't match size.
> >>> 
> >>> With format base64, qmp_memchar_write() copies the full data argument,
> >>> regardless of size argument.
> >>> 
> >>> With format utf8, qmp_memchar_write() copies size bytes from data,
> >>> happily reading beyond data.  Copies crap from the heap or even
> >>> crashes.
> >>> 
> >>> Drop the size parameter, and always copy the full data argument.
> >>> 
> >>> Signed-off-by: Markus Armbruster <armbru@redhat.com>
> >>> ---
> >>>  hmp.c            | 4 +---
> >>>  qapi-schema.json | 4 +---
> >>>  qemu-char.c      | 8 +++-----
> >>>  qmp-commands.hx  | 4 +---
> >>>  4 files changed, 6 insertions(+), 14 deletions(-)
> >>
> >>>      if (has_format && (format == DATA_FORMAT_BASE64)) {
> >>>          write_data = g_base64_decode(data, &write_count);
> >>>      } else {
> >>>          write_data = (uint8_t *)data;
> >>> +        write_count = strlen(data);
> >>>      }
> >>
> >> Obviously, base64 is the only way to write an embedded NUL.  But what
> >
> > Consider the JSON string "this \\u0000 is fun".  But our JSON parser
> > chokes on it, so we can ignore it until that's fixed.  See my "[PATCH]
> > check-qjson: More thorough testing of UTF-8 in strings", specifically
> > the comment right at the beginning of utf8_string().
> >
> >> happens if the user requests base64 encoding, but the utf8 string that
> >> got passed in through JSON is not a valid base64-encoded string?  Does
> >> g_base64_decode report an error in that case, and should you be handling
> >> the error here?
> >
> > Good question.  I wish it had occured to GLib developers:
> > http://developer.gnome.org/glib/stable/glib-Base64-Encoding.html#g-base64-decode
> >
> > Seriously, I need to find out what the heck g_base64_decode() does when
> > it's fed crap.  If it fails in a reliable and detectable manner, I need
> > to handle the failure.  If not, I need to replace the function.
> >
> > Moreover, I should document which characters outside the base64 alphabet
> > are silently ignored, if any.  All whitespace?  Just newlines?
> 
> As far as I can tell, it never fails, but silently ignores characters
> outside the alphabet [A-Za-z0-9+/], as well as unpadded suffixes.  Oh,
> and it does something weird when padding appears in the middle.
> Craptastic.
> 
> We can either document this behavior as a feature, or we replace the
> function by one that accepts only valid base64.  If we do the latter, we
> better specify the language we want to accept now, but I guess we could
> choose to delay the actual checking post 1.4.
> 
> There's another use of g_base64_decode() in qmp_guest_file_write().
> Which means guest agent command guest-file-write is similarly
> ill-defined.  Mike, anything to be done there?

For qemu-ga I think the documentation makes it clear enough that we're
expecting b64 inputs rather than just interpreting random input as b64,
so I don't see a problem with making the checks stricter in the future.

One other concern though:

> 
> 
> ---<test program>---
> #include <glib.h>
> #include <stdio.h>
> #include <stdlib.h>
> 
> int
> main(int argc, char *argv[])
> {
>     char **ap, *b64;
>     unsigned char *buf;
>     size_t sz, i;
> 
>     for (ap = argv + 1; *ap; ap++) {
> 	printf("in : %s\n", *ap);
> 	buf = g_base64_decode(*ap, &sz);
> 	printf("out:");
> 	for (i = 0; i < sz; i++) {
> 	    printf(" %02x", buf[i]);
> 	}
> 	putchar('\n');
> 	b64 = g_base64_encode(buf, sz);
> 	printf("b64: %s\n", b64);
> 	free(buf);
>     }
> }
> ---<test run>---
> in : 1
> out:
> b64: 
> in : 1=
> out:
> b64: 
> in : 1==
> out:
> b64: 
> in : 1===
> out: d4
> b64: 1A==
> in : 12
> out:
> b64: 
> in : 123
> out:
> b64: 
> in : 1234
> out: d7 6d f8
> b64: 1234
> in : =1234
> out: 03 5d b7
> b64: A123
> in : 1=234
> out: d4 0d b7
> b64: 1A23
> in : <>?,./watch this!@#$%^&*()_+
> out: ff 06 ad 72 1b 61
> b64: /watchth
> in : /watchthis+
> out: ff 06 ad 72 1b 61
> b64: /watchth

Am I misinterpreting the output or is base64_encode() actually spitting
*out* invalid base64 strings for certain inputs? If so that seems pretty
bad for something like guest-file-read, where inputs to base64_encode()
are for all intents completely random. Addressing it in hard freeze may
not be reasonable, since qemu-ga users must already be prepared to receive
garbage from malicious/buggy agents, but I'd certainly pick up a fix for a
subsequent stable release.
Markus Armbruster - Feb. 6, 2013, 8:14 p.m.
mdroth <mdroth@linux.vnet.ibm.com> writes:

> On Wed, Feb 06, 2013 at 10:06:03AM +0100, Markus Armbruster wrote:
>> Markus Armbruster <armbru@redhat.com> writes:
>> 
>> > Eric Blake <eblake@redhat.com> writes:
>> >
>> >> On 02/05/2013 09:22 AM, Markus Armbruster wrote:
>> >>> Command memchar-write takes data and size parameter.  Begs the
>> >>> question what happens when data doesn't match size.
>> >>> 
>> >>> With format base64, qmp_memchar_write() copies the full data argument,
>> >>> regardless of size argument.
>> >>> 
>> >>> With format utf8, qmp_memchar_write() copies size bytes from data,
>> >>> happily reading beyond data.  Copies crap from the heap or even
>> >>> crashes.
>> >>> 
>> >>> Drop the size parameter, and always copy the full data argument.
>> >>> 
>> >>> Signed-off-by: Markus Armbruster <armbru@redhat.com>
>> >>> ---
>> >>>  hmp.c            | 4 +---
>> >>>  qapi-schema.json | 4 +---
>> >>>  qemu-char.c      | 8 +++-----
>> >>>  qmp-commands.hx  | 4 +---
>> >>>  4 files changed, 6 insertions(+), 14 deletions(-)
>> >>
>> >>>      if (has_format && (format == DATA_FORMAT_BASE64)) {
>> >>>          write_data = g_base64_decode(data, &write_count);
>> >>>      } else {
>> >>>          write_data = (uint8_t *)data;
>> >>> +        write_count = strlen(data);
>> >>>      }
>> >>
>> >> Obviously, base64 is the only way to write an embedded NUL.  But what
>> >
>> > Consider the JSON string "this \\u0000 is fun".  But our JSON parser
>> > chokes on it, so we can ignore it until that's fixed.  See my "[PATCH]
>> > check-qjson: More thorough testing of UTF-8 in strings", specifically
>> > the comment right at the beginning of utf8_string().
>> >
>> >> happens if the user requests base64 encoding, but the utf8 string that
>> >> got passed in through JSON is not a valid base64-encoded string?  Does
>> >> g_base64_decode report an error in that case, and should you be handling
>> >> the error here?
>> >
>> > Good question.  I wish it had occured to GLib developers:
>> > http://developer.gnome.org/glib/stable/glib-Base64-Encoding.html#g-base64-decode
>> >
>> > Seriously, I need to find out what the heck g_base64_decode() does when
>> > it's fed crap.  If it fails in a reliable and detectable manner, I need
>> > to handle the failure.  If not, I need to replace the function.
>> >
>> > Moreover, I should document which characters outside the base64 alphabet
>> > are silently ignored, if any.  All whitespace?  Just newlines?
>> 
>> As far as I can tell, it never fails, but silently ignores characters
>> outside the alphabet [A-Za-z0-9+/], as well as unpadded suffixes.  Oh,
>> and it does something weird when padding appears in the middle.
>> Craptastic.
>> 
>> We can either document this behavior as a feature, or we replace the
>> function by one that accepts only valid base64.  If we do the latter, we
>> better specify the language we want to accept now, but I guess we could
>> choose to delay the actual checking post 1.4.
>> 
>> There's another use of g_base64_decode() in qmp_guest_file_write().
>> Which means guest agent command guest-file-write is similarly
>> ill-defined.  Mike, anything to be done there?
>
> For qemu-ga I think the documentation makes it clear enough that we're
> expecting b64 inputs rather than just interpreting random input as b64,
> so I don't see a problem with making the checks stricter in the future.
>
> One other concern though:
>
>> 
>> 
>> ---<test program>---
>> #include <glib.h>
>> #include <stdio.h>
>> #include <stdlib.h>
>> 
>> int
>> main(int argc, char *argv[])
>> {
>>     char **ap, *b64;
>>     unsigned char *buf;
>>     size_t sz, i;
>> 
>>     for (ap = argv + 1; *ap; ap++) {
>> 	printf("in : %s\n", *ap);
>> 	buf = g_base64_decode(*ap, &sz);
>> 	printf("out:");
>> 	for (i = 0; i < sz; i++) {
>> 	    printf(" %02x", buf[i]);
>> 	}
>> 	putchar('\n');
>> 	b64 = g_base64_encode(buf, sz);
>> 	printf("b64: %s\n", b64);
>> 	free(buf);
>>     }
>> }
>> ---<test run>---
>> in : 1
>> out:
>> b64: 
>> in : 1=
>> out:
>> b64: 
>> in : 1==
>> out:
>> b64: 
>> in : 1===
>> out: d4
>> b64: 1A==
>> in : 12
>> out:
>> b64: 
>> in : 123
>> out:
>> b64: 
>> in : 1234
>> out: d7 6d f8
>> b64: 1234
>> in : =1234
>> out: 03 5d b7
>> b64: A123
>> in : 1=234
>> out: d4 0d b7
>> b64: 1A23
>> in : <>?,./watch this!@#$%^&*()_+
>> out: ff 06 ad 72 1b 61
>> b64: /watchth
>> in : /watchthis+
>> out: ff 06 ad 72 1b 61
>> b64: /watchth
>
> Am I misinterpreting the output or is base64_encode() actually spitting
> *out* invalid base64 strings for certain inputs? If so that seems pretty
> bad for something like guest-file-read, where inputs to base64_encode()
> are for all intents completely random. Addressing it in hard freeze may
> not be reasonable, since qemu-ga users must already be prepared to receive
> garbage from malicious/buggy agents, but I'd certainly pick up a fix for a
> subsequent stable release.

Which base64_encode() outputs in my test run do you suspect of being
bad?
Michael Roth - Feb. 6, 2013, 8:39 p.m.
On Wed, Feb 06, 2013 at 09:14:12PM +0100, Markus Armbruster wrote:
> mdroth <mdroth@linux.vnet.ibm.com> writes:
> 
> > On Wed, Feb 06, 2013 at 10:06:03AM +0100, Markus Armbruster wrote:
> >> Markus Armbruster <armbru@redhat.com> writes:
> >> 
> >> > Eric Blake <eblake@redhat.com> writes:
> >> >
> >> >> On 02/05/2013 09:22 AM, Markus Armbruster wrote:
> >> >>> Command memchar-write takes data and size parameter.  Begs the
> >> >>> question what happens when data doesn't match size.
> >> >>> 
> >> >>> With format base64, qmp_memchar_write() copies the full data argument,
> >> >>> regardless of size argument.
> >> >>> 
> >> >>> With format utf8, qmp_memchar_write() copies size bytes from data,
> >> >>> happily reading beyond data.  Copies crap from the heap or even
> >> >>> crashes.
> >> >>> 
> >> >>> Drop the size parameter, and always copy the full data argument.
> >> >>> 
> >> >>> Signed-off-by: Markus Armbruster <armbru@redhat.com>
> >> >>> ---
> >> >>>  hmp.c            | 4 +---
> >> >>>  qapi-schema.json | 4 +---
> >> >>>  qemu-char.c      | 8 +++-----
> >> >>>  qmp-commands.hx  | 4 +---
> >> >>>  4 files changed, 6 insertions(+), 14 deletions(-)
> >> >>
> >> >>>      if (has_format && (format == DATA_FORMAT_BASE64)) {
> >> >>>          write_data = g_base64_decode(data, &write_count);
> >> >>>      } else {
> >> >>>          write_data = (uint8_t *)data;
> >> >>> +        write_count = strlen(data);
> >> >>>      }
> >> >>
> >> >> Obviously, base64 is the only way to write an embedded NUL.  But what
> >> >
> >> > Consider the JSON string "this \\u0000 is fun".  But our JSON parser
> >> > chokes on it, so we can ignore it until that's fixed.  See my "[PATCH]
> >> > check-qjson: More thorough testing of UTF-8 in strings", specifically
> >> > the comment right at the beginning of utf8_string().
> >> >
> >> >> happens if the user requests base64 encoding, but the utf8 string that
> >> >> got passed in through JSON is not a valid base64-encoded string?  Does
> >> >> g_base64_decode report an error in that case, and should you be handling
> >> >> the error here?
> >> >
> >> > Good question.  I wish it had occured to GLib developers:
> >> > http://developer.gnome.org/glib/stable/glib-Base64-Encoding.html#g-base64-decode
> >> >
> >> > Seriously, I need to find out what the heck g_base64_decode() does when
> >> > it's fed crap.  If it fails in a reliable and detectable manner, I need
> >> > to handle the failure.  If not, I need to replace the function.
> >> >
> >> > Moreover, I should document which characters outside the base64 alphabet
> >> > are silently ignored, if any.  All whitespace?  Just newlines?
> >> 
> >> As far as I can tell, it never fails, but silently ignores characters
> >> outside the alphabet [A-Za-z0-9+/], as well as unpadded suffixes.  Oh,
> >> and it does something weird when padding appears in the middle.
> >> Craptastic.
> >> 
> >> We can either document this behavior as a feature, or we replace the
> >> function by one that accepts only valid base64.  If we do the latter, we
> >> better specify the language we want to accept now, but I guess we could
> >> choose to delay the actual checking post 1.4.
> >> 
> >> There's another use of g_base64_decode() in qmp_guest_file_write().
> >> Which means guest agent command guest-file-write is similarly
> >> ill-defined.  Mike, anything to be done there?
> >
> > For qemu-ga I think the documentation makes it clear enough that we're
> > expecting b64 inputs rather than just interpreting random input as b64,
> > so I don't see a problem with making the checks stricter in the future.
> >
> > One other concern though:
> >
> >> 
> >> 
> >> ---<test program>---
> >> #include <glib.h>
> >> #include <stdio.h>
> >> #include <stdlib.h>
> >> 
> >> int
> >> main(int argc, char *argv[])
> >> {
> >>     char **ap, *b64;
> >>     unsigned char *buf;
> >>     size_t sz, i;
> >> 
> >>     for (ap = argv + 1; *ap; ap++) {
> >> 	printf("in : %s\n", *ap);
> >> 	buf = g_base64_decode(*ap, &sz);
> >> 	printf("out:");
> >> 	for (i = 0; i < sz; i++) {
> >> 	    printf(" %02x", buf[i]);
> >> 	}
> >> 	putchar('\n');
> >> 	b64 = g_base64_encode(buf, sz);
> >> 	printf("b64: %s\n", b64);
> >> 	free(buf);
> >>     }
> >> }
> >> ---<test run>---
> >> in : 1
> >> out:
> >> b64: 
> >> in : 1=
> >> out:
> >> b64: 
> >> in : 1==
> >> out:
> >> b64: 
> >> in : 1===
> >> out: d4
> >> b64: 1A==
> >> in : 12
> >> out:
> >> b64: 
> >> in : 123
> >> out:
> >> b64: 
> >> in : 1234
> >> out: d7 6d f8
> >> b64: 1234
> >> in : =1234
> >> out: 03 5d b7
> >> b64: A123
> >> in : 1=234
> >> out: d4 0d b7
> >> b64: 1A23
> >> in : <>?,./watch this!@#$%^&*()_+
> >> out: ff 06 ad 72 1b 61
> >> b64: /watchth
> >> in : /watchthis+
> >> out: ff 06 ad 72 1b 61
> >> b64: /watchth
> >
> > Am I misinterpreting the output or is base64_encode() actually spitting
> > *out* invalid base64 strings for certain inputs? If so that seems pretty
> > bad for something like guest-file-read, where inputs to base64_encode()
> > are for all intents completely random. Addressing it in hard freeze may
> > not be reasonable, since qemu-ga users must already be prepared to receive
> > garbage from malicious/buggy agents, but I'd certainly pick up a fix for a
> > subsequent stable release.
> 
> Which base64_encode() outputs in my test run do you suspect of being
> bad?
> 

My mistake. The last 2 caught my eye, but I didn't realize "/" was a valid
base64 character, and that characters were being truncated from the
original input due to padding restrictions, I thought it was just
fudging up the plaintext.
Markus Armbruster - Feb. 7, 2013, 6:01 a.m.
mdroth <mdroth@linux.vnet.ibm.com> writes:

> On Wed, Feb 06, 2013 at 09:14:12PM +0100, Markus Armbruster wrote:
>> mdroth <mdroth@linux.vnet.ibm.com> writes:
>> 
>> > On Wed, Feb 06, 2013 at 10:06:03AM +0100, Markus Armbruster wrote:
[...]
>> >> ---<test run>---
>> >> in : 1
>> >> out:
>> >> b64: 
>> >> in : 1=
>> >> out:
>> >> b64: 
>> >> in : 1==
>> >> out:
>> >> b64: 
>> >> in : 1===
>> >> out: d4
>> >> b64: 1A==
>> >> in : 12
>> >> out:
>> >> b64: 
>> >> in : 123
>> >> out:
>> >> b64: 
>> >> in : 1234
>> >> out: d7 6d f8
>> >> b64: 1234
>> >> in : =1234
>> >> out: 03 5d b7
>> >> b64: A123
>> >> in : 1=234
>> >> out: d4 0d b7
>> >> b64: 1A23
>> >> in : <>?,./watch this!@#$%^&*()_+
>> >> out: ff 06 ad 72 1b 61
>> >> b64: /watchth
>> >> in : /watchthis+
>> >> out: ff 06 ad 72 1b 61
>> >> b64: /watchth
>> >
>> > Am I misinterpreting the output or is base64_encode() actually spitting
>> > *out* invalid base64 strings for certain inputs? If so that seems pretty
>> > bad for something like guest-file-read, where inputs to base64_encode()
>> > are for all intents completely random. Addressing it in hard freeze may
>> > not be reasonable, since qemu-ga users must already be prepared to receive
>> > garbage from malicious/buggy agents, but I'd certainly pick up a fix for a
>> > subsequent stable release.
>> 
>> Which base64_encode() outputs in my test run do you suspect of being
>> bad?
>> 
>
> My mistake. The last 2 caught my eye, but I didn't realize "/" was a valid
> base64 character, and that characters were being truncated from the
> original input due to padding restrictions, I thought it was just
> fudging up the plaintext.

You're correct.  And I'm glad we don't have yet another problem!

Patch

diff --git a/hmp.c b/hmp.c
index 1689e6f..9fdf1ce 100644
--- a/hmp.c
+++ b/hmp.c
@@ -664,13 +664,11 @@  void hmp_pmemsave(Monitor *mon, const QDict *qdict)
 
 void hmp_memchar_write(Monitor *mon, const QDict *qdict)
 {
-    uint32_t size;
     const char *chardev = qdict_get_str(qdict, "device");
     const char *data = qdict_get_str(qdict, "data");
     Error *errp = NULL;
 
-    size = strlen(data);
-    qmp_memchar_write(chardev, size, data, false, 0, &errp);
+    qmp_memchar_write(chardev, data, false, 0, &errp);
 
     hmp_handle_error(mon, &errp);
 }
diff --git a/qapi-schema.json b/qapi-schema.json
index cdd8384..9e2cbbd 100644
--- a/qapi-schema.json
+++ b/qapi-schema.json
@@ -346,8 +346,6 @@ 
 #
 # @device: the name of the memory char device.
 #
-# @size: the size to write in bytes.
-#
 # @data: the source data write to memchar.
 #
 # @format: #optional the format of the data write to chardev 'memory',
@@ -359,7 +357,7 @@ 
 # Since: 1.4
 ##
 { 'command': 'memchar-write',
-  'data': {'device': 'str', 'size': 'int', 'data': 'str',
+  'data': {'device': 'str', 'data': 'str',
            '*format': 'DataFormat'} }
 
 ##
diff --git a/qemu-char.c b/qemu-char.c
index ac5d62d..9c1dd13 100644
--- a/qemu-char.c
+++ b/qemu-char.c
@@ -2753,9 +2753,8 @@  static bool qemu_is_chr(const CharDriverState *chr, const char *filename)
     return strcmp(chr->filename, filename);
 }
 
-void qmp_memchar_write(const char *device, int64_t size,
-                       const char *data, bool has_format,
-                       enum DataFormat format,
+void qmp_memchar_write(const char *device, const char *data,
+                       bool has_format, enum DataFormat format,
                        Error **errp)
 {
     CharDriverState *chr;
@@ -2774,12 +2773,11 @@  void qmp_memchar_write(const char *device, int64_t size,
         return;
     }
 
-    write_count = (gsize)size;
-
     if (has_format && (format == DATA_FORMAT_BASE64)) {
         write_data = g_base64_decode(data, &write_count);
     } else {
         write_data = (uint8_t *)data;
+        write_count = strlen(data);
     }
 
     ret = cirmem_chr_write(chr, write_data, write_count);
diff --git a/qmp-commands.hx b/qmp-commands.hx
index bbb21f3..8468f10 100644
--- a/qmp-commands.hx
+++ b/qmp-commands.hx
@@ -467,7 +467,7 @@  EQMP
 
     {
         .name       = "memchar-write",
-        .args_type  = "device:s,size:i,data:s,format:s?",
+        .args_type  = "device:s,data:s,format:s?",
         .mhandler.cmd_new = qmp_marshal_input_memchar_write,
     },
 
@@ -481,7 +481,6 @@  char device.
 Arguments:
 
 - "device": the name of the char device, must be unique (json-string)
-- "size": the memory size, in bytes, should be power of 2 (json-int)
 - "data": the source data write to memory (json-string)
 - "format": the data format write to memory, default is
             utf8. (json-string, optional)
@@ -491,7 +490,6 @@  Example:
 
 -> { "execute": "memchar-write",
                 "arguments": { "device": foo,
-                               "size": 8,
                                "data": "abcdefgh",
                                "format": "utf8" } }
 <- { "return": {} }