[5/9] qapi_sized_buffer

Message ID 1363200988-17865-6-git-send-email-jschopp@linux.vnet.ibm.com
State New
Commit Message

Joel Schopp March 13, 2013, 6:56 p.m. UTC
Add a sized buffer interface to qapi.

Cc: Michael Tsirkin <mst@redhat.com>
Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
Signed-off-by: Joel Schopp <jschopp@linux.vnet.ibm.com>
---
 include/qapi/visitor-impl.h |    2 ++
 include/qapi/visitor.h      |    2 ++
 qapi/qapi-visit-core.c      |    8 ++++++++
 3 files changed, 12 insertions(+)

Comments

Michael Roth March 13, 2013, 8:52 p.m. UTC | #1
On Wed, Mar 13, 2013 at 01:56:24PM -0500, Joel Schopp wrote:
> Add a sized buffer interface to qapi.

Isn't this just a special case of the visit_*_carray() interfaces? We
should avoid new interfaces if possible, since it adds to feature
disparities between visitor implementations.

> 
> Cc: Michael Tsirkin <mst@redhat.com>
> Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
> Signed-off-by: Joel Schopp <jschopp@linux.vnet.ibm.com>
> ---
>  include/qapi/visitor-impl.h |    2 ++
>  include/qapi/visitor.h      |    2 ++
>  qapi/qapi-visit-core.c      |    8 ++++++++
>  3 files changed, 12 insertions(+)
> 
> diff --git a/include/qapi/visitor-impl.h b/include/qapi/visitor-impl.h
> index 9d87f2d..dc0e25c 100644
> --- a/include/qapi/visitor-impl.h
> +++ b/include/qapi/visitor-impl.h
> @@ -38,6 +38,8 @@ struct Visitor
>                           size_t elem_count, size_t elem_size, Error **errp);
>      void (*next_carray)(Visitor *v, Error **errp);
>      void (*end_carray)(Visitor *v, Error **errp);
> +    void (*type_sized_buffer)(Visitor *v, uint8_t **obj, size_t size,
> +                              const char *name, Error **errp);
> 
>      /* May be NULL */
>      void (*start_optional)(Visitor *v, bool *present, const char *name,
> diff --git a/include/qapi/visitor.h b/include/qapi/visitor.h
> index 74bddef..7c7bb98 100644
> --- a/include/qapi/visitor.h
> +++ b/include/qapi/visitor.h
> @@ -55,5 +55,7 @@ void visit_start_carray(Visitor *v, void **obj, const char *name,
>                          size_t elem_count, size_t elem_size, Error **errp);
>  void visit_next_carray(Visitor *v, Error **errp);
>  void visit_end_carray(Visitor *v, Error **errp);
> +void visit_type_sized_buffer(Visitor *v, uint8_t **obj, size_t len,
> +                             const char *name, Error **errp);
> 
>  #endif
> diff --git a/qapi/qapi-visit-core.c b/qapi/qapi-visit-core.c
> index d9982f8..4b36a54 100644
> --- a/qapi/qapi-visit-core.c
> +++ b/qapi/qapi-visit-core.c
> @@ -338,3 +338,11 @@ void visit_end_carray(Visitor *v, Error **errp)
>          v->end_carray(v, errp);
>      }
>  }
> +
> +void visit_type_sized_buffer(Visitor *v, uint8_t **obj, size_t len,
> +                             const char *name, Error **errp)
> +{
> +    if (!error_is_set(errp)) {
> +        v->type_sized_buffer(v, obj, len, name, errp);
> +    }
> +}
> -- 
> 1.7.10.4
> 
>
Stefan Berger March 13, 2013, 10 p.m. UTC | #2
On 03/13/2013 04:52 PM, mdroth wrote:
> On Wed, Mar 13, 2013 at 01:56:24PM -0500, Joel Schopp wrote:
>> Add a sized buffer interface to qapi.
> Isn't this just a special case of the visit_*_carray() interfaces? We
> should avoid new interfaces if possible, since it adds to feature
> disparities between visitor implementations.

Yes, it's a special case and carray seems more general.
However, I don't understand the interface of carray. It has a 
start_carray with all parameters given, then a next_carray and an 
end_carray. Why do we need multiple calls if one call (start_carray) 
could be used to serialize all the data already?

Regards,
     Stefan
Michael Roth March 13, 2013, 11:18 p.m. UTC | #3
On Wed, Mar 13, 2013 at 06:00:24PM -0400, Stefan Berger wrote:
> On 03/13/2013 04:52 PM, mdroth wrote:
> >On Wed, Mar 13, 2013 at 01:56:24PM -0500, Joel Schopp wrote:
> >>Add a sized buffer interface to qapi.
> >Isn't this just a special case of the visit_*_carray() interfaces? We
> >should avoid new interfaces if possible, since it adds to feature
> >disparities between visitor implementations.
> 
> Yes, it's a special case and carray seems more general.
> However, I don't understand the interface of carray. It has a
> start_carray with all parameters given, then a next_carray and an
> end_carray. Why do we need multiple calls if one call (start_carray)
> could be used to serialize all the data already?

Visitors don't have any knowledge of the data structures they're visiting
outside of what we tell them via the visit_*() API.

For output, visit_start_carray() only provides enough information to
instruct a visitor on how to calculate the offsets of each element (and
perhaps allocate some memory for its own internal buffers etc.)

For input, it provides enough to allocate storage, and calculate offsets
to each element it's deserializing into.

As far as what to do with each element, we need to make additional calls
to instruct it.

For example, a visitor for a 16-element array of:

typedef struct ComplexType {
    int32_t foo;
    char *bar;
} ComplexType;

would look something like:

visit_start_carray(v, ...); // instruct visitor how to calculate offsets
for (i = 0; i < 16; i++) {
    visit_type_ComplexType(v, ...); // instruct visitor how to handle elem
    visit_next_carray(v, ...); // instruct visitor to move to next offset
}
visit_end_carray(v, ...); // instruct visitor to finalize array
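
To make the division of labour concrete, here is a rough, self-contained
sketch. ToyVisitor and the toy_*() functions are hypothetical and are not
QEMU's Visitor API; the "serialized output" is reduced to a running sum so
the example stays short. It only shows which piece of information each
call hands to the visitor.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical toy visitor, NOT QEMU's Visitor: just enough state to show
 * what start/next/end_carray each tell it. */
typedef struct ToyVisitor {
    uint8_t *base;      /* start of the array being visited */
    size_t elem_size;   /* stride between elements */
    size_t elem_count;  /* number of elements announced by start_carray */
    size_t index;       /* element the visitor currently points at */
    int64_t sum;        /* stand-in for the serialized output */
} ToyVisitor;

/* Tell the visitor how to calculate element offsets. */
static void toy_start_carray(ToyVisitor *v, void *obj,
                             size_t elem_count, size_t elem_size)
{
    v->base = obj;
    v->elem_count = elem_count;
    v->elem_size = elem_size;
    v->index = 0;
}

/* Tell the visitor how to handle the current element. */
static void toy_type_int32(ToyVisitor *v)
{
    int32_t val;

    assert(v->index < v->elem_count);
    memcpy(&val, v->base + v->index * v->elem_size, sizeof(val));
    v->sum += val;
}

/* Tell the visitor to move to the next offset. */
static void toy_next_carray(ToyVisitor *v)
{
    v->index++;
}

/* Tell the visitor the array is complete. */
static void toy_end_carray(ToyVisitor *v)
{
    v->base = NULL;
}

/* Caller-side loop, same shape as the example above. */
static int64_t toy_visit_int32_array(int32_t *arr, size_t n)
{
    ToyVisitor v = {0};

    toy_start_carray(&v, arr, n, sizeof(arr[0]));
    for (size_t i = 0; i < n; i++) {
        toy_type_int32(&v);
        toy_next_carray(&v);
    }
    toy_end_carray(&v);
    return v.sum;
}
```

The visitor never learns the element type from toy_start_carray(); it
only learns it from whichever toy_type_*() call the caller issues per
element.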

> 
> Regards,
>     Stefan
> 
> 
>
Stefan Berger March 14, 2013, 1:48 a.m. UTC | #4
On 03/13/2013 07:18 PM, mdroth wrote:
> On Wed, Mar 13, 2013 at 06:00:24PM -0400, Stefan Berger wrote:
>> On 03/13/2013 04:52 PM, mdroth wrote:
>>
> Visitors don't have any knowledge of the data structures they're visiting
> outside of what we tell them via the visit_*() API.
>
> [...]
>
> For example, a visitor for a 16-element array of:
>
> typedef struct ComplexType {
>      int32_t foo;
>      char *bar;
> } ComplexType;
>
> would look something like:
>
> visit_start_carray(v, ...); // instruct visitor how to calculate offsets
> for (i = 0; i < 16; i++) {
>      visit_type_ComplexType(v, ...) // instruct visitor how to handle elem
>      visit_next_carray(v, ...); // instruct visitor to move to next offset
> }
> visit_end_carray(v, ...); // instruct visitor to finalize array

Given the example above, I think we will need the sized buffer. The
sized buffer targets binary arrays and their encoding. If I were to
encode an 'unsigned char[n]' (e.g., n=200) using n, n/2 or n/4 loop
iterations like the above, breaking it apart into u8, u16 or u32
respectively, I think this would 'not be good', also considering the
2 bytes for tag and length that ASN.1 adds for every such datatype
(u8, u16, u32). The sized buffer allows you to, for example, take a
memory page and write it out in one chunk, adding only a few bytes of
ASN.1 'decoration' around the actual data.
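
The overhead can be estimated with a small calculation. This is a sketch
under simplifying assumptions: every value carries a one-byte tag and a
DER-style length field, the content is exactly the raw bytes, and any
enclosing SEQUENCE or sign padding for integers is ignored. The function
names are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Bytes needed for a DER length field describing `len` content bytes:
 * one byte in short form (len < 128), otherwise one prefix byte plus the
 * minimal big-endian representation of len. */
static size_t der_length_octets(size_t len)
{
    size_t n = 1;

    if (len < 128) {
        return 1;
    }
    while (len > 0) {
        n++;
        len >>= 8;
    }
    return n;
}

/* Size of one tag-length-value item with a single-byte tag. */
static size_t der_tlv_size(size_t len)
{
    return 1 + der_length_octets(len) + len;
}

/* Size when a buffer of `total` bytes is split into elements of
 * `elem_size` bytes that are each tagged individually. */
static size_t der_per_element_size(size_t total, size_t elem_size)
{
    assert(elem_size > 0 && total % elem_size == 0);
    return (total / elem_size) * der_tlv_size(elem_size);
}
```

With n=200 this gives 600, 400 and 300 bytes for the u8, u16 and u32
splits, against 203 bytes for a single chunk; a 4096-byte page written as
one chunk comes to 4100 bytes.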

    Stefan
Michael Roth March 14, 2013, 12:18 p.m. UTC | #5
On Wed, Mar 13, 2013 at 09:48:11PM -0400, Stefan Berger wrote:
> On 03/13/2013 07:18 PM, mdroth wrote:
> >On Wed, Mar 13, 2013 at 06:00:24PM -0400, Stefan Berger wrote:
> >>On 03/13/2013 04:52 PM, mdroth wrote:
> >>
> >Visitors don't have any knowledge of the data structures they're visiting
> >outside of what we tell them via the visit_*() API.
> >
> >[...]
> >
> >For example, a visitor for a 16-element array of:
> >
> >typedef struct ComplexType {
> >     int32_t foo;
> >     char *bar;
> >} ComplexType;
> >
> >would look something like:
> >
> >visit_start_carray(v, ...); // instruct visitor how to calculate offsets
> >for (i = 0; i < 16; i++) {
> >     visit_type_ComplexType(v, ...) // instruct visitor how to handle elem
> >     visit_next_carray(v, ...); // instruct visitor to move to next offset
> >}
> >visit_end_carray(v, ...); // instruct visitor to finalize array
> 
> Given this example above, I think we will need the sized buffer. The
> sized buffer targets  binary arrays and their encoding. If I was to
> encode an 'unsigned char[n]' (e.g., n=200) using n, or n/2 or n/4
> loops like above breaking it apart in u8, u16 or u32 respectively I
> think this would 'not bed good' also considering the 2 bytes for tag
> and length being added by ASN.1 for every such datatype
> (u8,u16,u32). The sized buffer allows you to for example take a
> memory page and write it out in one chunk adding a few bytes of
> ASN.1 'decoration' around the actual data.

You could actually do it with this interface as well. The Visitor will
need to maintain some internal state to differentiate what it does with
subsequent visit_type*/visit_next_carray/visit_end_carray calls. There's
no reason it couldn't also track the elem size so it could tag a buffer
"en masse" when visit_end_carray() gets called.

QMP*Visitor does something similar already for building up lists/arrays
before tacking them onto the parent structure.
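
A minimal sketch of that idea follows. ToyAsn1Out and the toy_*()
functions are hypothetical, not an existing QEMU visitor, and the buffer
is capped at 255 bytes so the length field never needs more than two
bytes. Inside a carray the u8 elements are only collected; the single
OCTET STRING tag (0x04) and length are written when the array ends.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_MAX 255  /* assumption: keeps the length field at <= 2 bytes */

/* Hypothetical output visitor state. */
typedef struct ToyAsn1Out {
    uint8_t pending[TOY_MAX];  /* element bytes collected inside a carray */
    size_t pending_len;
    int in_carray;             /* state that changes how type_uint8 behaves */
    uint8_t out[TOY_MAX + 3];  /* encoded output */
    size_t out_len;
} ToyAsn1Out;

static void toy_put(ToyAsn1Out *v, uint8_t b)
{
    assert(v->out_len < sizeof(v->out));
    v->out[v->out_len++] = b;
}

static void toy_start_carray(ToyAsn1Out *v)
{
    v->in_carray = 1;
    v->pending_len = 0;
}

static void toy_type_uint8(ToyAsn1Out *v, uint8_t val)
{
    if (v->in_carray) {
        /* Inside an array: buffer the byte, emit no tag yet. */
        assert(v->pending_len < TOY_MAX);
        v->pending[v->pending_len++] = val;
    } else {
        /* Standalone value: tagged on its own (simplified to a
         * one-byte OCTET STRING for this sketch). */
        toy_put(v, 0x04);
        toy_put(v, 1);
        toy_put(v, val);
    }
}

static void toy_next_carray(ToyAsn1Out *v)
{
    (void)v;  /* nothing to do: elements are appended in order */
}

static void toy_end_carray(ToyAsn1Out *v)
{
    size_t len = v->pending_len;

    toy_put(v, 0x04);                 /* one tag for the whole array */
    if (len < 128) {
        toy_put(v, (uint8_t)len);     /* short-form length */
    } else {
        toy_put(v, 0x81);             /* long form, one length byte */
        toy_put(v, (uint8_t)len);
    }
    assert(v->out_len + len <= sizeof(v->out));
    memcpy(v->out + v->out_len, v->pending, len);
    v->out_len += len;
    v->in_carray = 0;
}

/* Caller-side loop: one visit_type call per byte, one tag in the output. */
static size_t toy_encode_bytes(ToyAsn1Out *v, const uint8_t *data, size_t n)
{
    memset(v, 0, sizeof(*v));
    toy_start_carray(v);
    for (size_t i = 0; i < n; i++) {
        toy_type_uint8(v, data[i]);
        toy_next_carray(v);
    }
    toy_end_carray(v);
    return v->out_len;
}
```

The caller still issues one call per element, so the cost moves from
encoded size to function-call overhead and an extra copy inside the
visitor.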

> 
>    Stefan
>
Stefan Berger March 14, 2013, 1:39 p.m. UTC | #6
On 03/14/2013 08:18 AM, mdroth wrote:
> On Wed, Mar 13, 2013 at 09:48:11PM -0400, Stefan Berger wrote:
>> On 03/13/2013 07:18 PM, mdroth wrote:
>>> On Wed, Mar 13, 2013 at 06:00:24PM -0400, Stefan Berger wrote:
>>>> On 03/13/2013 04:52 PM, mdroth wrote:
>>>>
>>> Visitors don't have any knowledge of the data structures they're visiting
>>> outside of what we tell them via the visit_*() API.
>>>
>>> [...]
>>>
>>> For example, a visitor for a 16-element array of:
>>>
>>> typedef struct ComplexType {
>>>      int32_t foo;
>>>      char *bar;
>>> } ComplexType;
>>>
>>> would look something like:
>>>
>>> visit_start_carray(v, ...); // instruct visitor how to calculate offsets
>>> for (i = 0; i < 16; i++) {
>>>      visit_type_ComplexType(v, ...) // instruct visitor how to handle elem
>>>      visit_next_carray(v, ...); // instruct visitor to move to next offset
>>> }
>>> visit_end_carray(v, ...); // instruct visitor to finalize array
>> Given this example above, I think we will need the sized buffer. The
>> sized buffer targets  binary arrays and their encoding. If I was to
>> encode an 'unsigned char[n]' (e.g., n=200) using n, or n/2 or n/4
>> loops like above breaking it apart in u8, u16 or u32 respectively I
>> think this would 'not bed good' also considering the 2 bytes for tag
>> and length being added by ASN.1 for every such datatype
>> (u8,u16,u32). The sized buffer allows you to for example take a
>> memory page and write it out in one chunk adding a few bytes of
>> ASN.1 'decoration' around the actual data.
> You could do it with this interface as well actually. The Visitor will
> need to maintain some internal state to differentiate what it does with
> subsequent visit_type*/visit_next_carray/visit_end_carry. There's no
> reason it couldn't also track the elem size so it could tag a buffer
> "en masse" when visit_end_carray() gets called.

It depends on what you pass into visit_start_carray. In your case, if you
pass in ComplexType you would presumably pass in sizeof(ComplexType) as
the size of each element. The problem is that you now have char *bar, a
string pointer, hanging off of this structure. How would you handle
that? Serializing ComplexType's foo and the raw pointer obviously won't
do it. You need to follow the string pointer and serialize the string as
well. So we have different use cases here, serializing ComplexType
versus a plain array, with the carray calls somehow having to figure out
which one applies themselves -- how??

    Stefan
Michael Roth March 14, 2013, 2:28 p.m. UTC | #7
On Thu, Mar 14, 2013 at 09:39:14AM -0400, Stefan Berger wrote:
> On 03/14/2013 08:18 AM, mdroth wrote:
> >On Wed, Mar 13, 2013 at 09:48:11PM -0400, Stefan Berger wrote:
> >>On 03/13/2013 07:18 PM, mdroth wrote:
> >>>On Wed, Mar 13, 2013 at 06:00:24PM -0400, Stefan Berger wrote:
> >>>>On 03/13/2013 04:52 PM, mdroth wrote:
> >>>>
> >>>Visitors don't have any knowledge of the data structures they're visiting
> >>>outside of what we tell them via the visit_*() API.
> >>>
> >>>[...]
> >>>
> >>>For example, a visitor for a 16-element array of:
> >>>
> >>>typedef struct ComplexType {
> >>>     int32_t foo;
> >>>     char *bar;
> >>>} ComplexType;
> >>>
> >>>would look something like:
> >>>
> >>>visit_start_carray(v, ...); // instruct visitor how to calculate offsets
> >>>for (i = 0; i < 16; i++) {
> >>>     visit_type_ComplexType(v, ...) // instruct visitor how to handle elem
> >>>     visit_next_carray(v, ...); // instruct visitor to move to next offset
> >>>}
> >>>visit_end_carray(v, ...); // instruct visitor to finalize array
> >>Given this example above, I think we will need the sized buffer. The
> >>sized buffer targets  binary arrays and their encoding. If I was to
> >>encode an 'unsigned char[n]' (e.g., n=200) using n, or n/2 or n/4
> >>loops like above breaking it apart in u8, u16 or u32 respectively I
> >>think this would 'not bed good' also considering the 2 bytes for tag
> >>and length being added by ASN.1 for every such datatype
> >>(u8,u16,u32). The sized buffer allows you to for example take a
> >>memory page and write it out in one chunk adding a few bytes of
> >>ASN.1 'decoration' around the actual data.
> >You could do it with this interface as well actually. The Visitor will
> >need to maintain some internal state to differentiate what it does with
> >subsequent visit_type*/visit_next_carray/visit_end_carry. There's no
> >reason it couldn't also track the elem size so it could tag a buffer
> >"en masse" when visit_end_carray() gets called.
> 
> It depends on what you pass into visit_start_carray. In your case if
> you pass in ComplexType you would pass in a sizeof(ComplexType) for
> the size of each element presumably. The problem is now you have
> char *foo, a string pointer, hanging off of this structure. How
> would you handle that? Serializing ComplexType's foo and pointer
> obviously won't do it.

Why not?  visit_type_ComplexType() knows how to deal with
the individual fields, including the string pointer. I'm not sure
what's at issue here.

In this case the handling for ComplexType would look something like:

visit_type_Complex:
    visit_start_struct
    visit_type_uint32 //foo
    visit_type_str //bar
    visit_end_struct

Granted, strings are easier to deal with. If char * was instead a plain
old uint8_t*, we'd need a nested call to start_carray for each element.
In this case it would look something like:

visit_type_Complex:
    visit_start_struct
    visit_type_uint32 //foo field
    visit_start_carray //bar field
    for (i = 0; i < len_of_bar; i++):
        visit_type_uint8
        visit_next_carray
    visit_end_carray
    visit_end_struct

The key is knowing the length. In open-coded visitor routines we know
this, or know where to get it. For routines generated from QAPI schemas
we'd need a way to tell the code generators how to find the size, or
state the size in the schema directly. I had some patches to do this, but
we don't have a QAPI user that needs this yet. When we do,
visit_*_carray() should be able to handle it, so we should consolidate
around that interface, since there are a lot of things to consider in
the scope of what a visitor implementation may be used for.
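
For an open-coded routine, the nested case might look roughly like the
sketch below. Everything here is hypothetical: TraceVisitor and the
toy_*() functions only record the order of calls, and the bar_len field
is an assumption about where the length would be stored.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Variant of the struct with a byte buffer; bar_len is a hypothetical
 * field holding the length the visitor routine needs. */
typedef struct Complex {
    uint32_t foo;
    uint8_t *bar;
    size_t bar_len;
} Complex;

/* Toy visitor that records one character per call. */
typedef struct TraceVisitor {
    char trace[128];
    size_t used;
} TraceVisitor;

static void toy_trace(TraceVisitor *v, char c)
{
    assert(v->used + 1 < sizeof(v->trace));
    v->trace[v->used++] = c;
    v->trace[v->used] = '\0';
}

static void toy_start_struct(TraceVisitor *v) { toy_trace(v, '{'); }
static void toy_end_struct(TraceVisitor *v) { toy_trace(v, '}'); }
static void toy_type_uint32(TraceVisitor *v, uint32_t *val)
{
    (void)val;
    toy_trace(v, 'W');
}
static void toy_type_uint8(TraceVisitor *v, uint8_t *val)
{
    (void)val;
    toy_trace(v, 'b');
}
static void toy_start_carray(TraceVisitor *v, size_t count, size_t size)
{
    (void)count;
    (void)size;
    toy_trace(v, '[');
}
static void toy_next_carray(TraceVisitor *v) { toy_trace(v, ','); }
static void toy_end_carray(TraceVisitor *v) { toy_trace(v, ']'); }

/* Open-coded visitor routine: the length comes from the struct itself. */
static void toy_visit_type_Complex(TraceVisitor *v, Complex *c)
{
    toy_start_struct(v);
    toy_type_uint32(v, &c->foo);                        /* foo field */
    toy_start_carray(v, c->bar_len, sizeof(c->bar[0])); /* bar field */
    for (size_t i = 0; i < c->bar_len; i++) {
        toy_type_uint8(v, &c->bar[i]);
        toy_next_carray(v);
    }
    toy_end_carray(v);
    toy_end_struct(v);
}

/* Convenience check on the recorded call order. */
static int toy_trace_is(const TraceVisitor *v, const char *expected)
{
    return strcmp(v->trace, expected) == 0;
}
```

A generated routine would need the equivalent of bar_len from the
schema, which is the missing piece mentioned above.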

> would you handle that? Serializing ComplexType's foo and pointer
> obviously won't do it. You need to follow the string pointer and
> serialize that as well. So we have different use cases here when
> wanting to serialize ComplexType versus a plain array with the
> carray calls somehow having to figure it out themselves -- how ??

For a plain array we'd just replace visit_type_ComplexType() with
visit_type_uint{8,16,32,64} and change the loop/elem_size params
accordingly.

> 
>    Stefan
>
Stefan Berger March 14, 2013, 2:51 p.m. UTC | #8
On 03/14/2013 10:28 AM, mdroth wrote:
> On Thu, Mar 14, 2013 at 09:39:14AM -0400, Stefan Berger wrote:
>> On 03/14/2013 08:18 AM, mdroth wrote:
>>> On Wed, Mar 13, 2013 at 09:48:11PM -0400, Stefan Berger wrote:
>>>> On 03/13/2013 07:18 PM, mdroth wrote:
>>>>> On Wed, Mar 13, 2013 at 06:00:24PM -0400, Stefan Berger wrote:
>>>>>> On 03/13/2013 04:52 PM, mdroth wrote:
>>>>>>
>>>>> Visitors don't have any knowledge of the data structures they're visiting
>>>>> outside of what we tell them via the visit_*() API.
>>>>>
>>>>> [...]
>>>>>
>>>>> For example, a visitor for a 16-element array of:
>>>>>
>>>>> typedef struct ComplexType {
>>>>>      int32_t foo;
>>>>>      char *bar;
>>>>> } ComplexType;
>>>>>
>>>>> would look something like:
>>>>>
>>>>> visit_start_carray(v, ...); // instruct visitor how to calculate offsets
>>>>> for (i = 0; i < 16; i++) {
>>>>>      visit_type_ComplexType(v, ...) // instruct visitor how to handle elem
>>>>>      visit_next_carray(v, ...); // instruct visitor to move to next offset
>>>>> }
>>>>> visit_end_carray(v, ...); // instruct visitor to finalize array
>>>> Given this example above, I think we will need the sized buffer. The
>>>> sized buffer targets  binary arrays and their encoding. If I was to
>>>> encode an 'unsigned char[n]' (e.g., n=200) using n, or n/2 or n/4
>>>> loops like above breaking it apart in u8, u16 or u32 respectively I
>>>> think this would 'not bed good' also considering the 2 bytes for tag
>>>> and length being added by ASN.1 for every such datatype
>>>> (u8,u16,u32). The sized buffer allows you to for example take a
>>>> memory page and write it out in one chunk adding a few bytes of
>>>> ASN.1 'decoration' around the actual data.
>>> You could do it with this interface as well actually. The Visitor will
>>> need to maintain some internal state to differentiate what it does with
>>> subsequent visit_type*/visit_next_carray/visit_end_carry. There's no
>>> reason it couldn't also track the elem size so it could tag a buffer
>>> "en masse" when visit_end_carray() gets called.
>> It depends on what you pass into visit_start_carray. In your case if
>> you pass in ComplexType you would pass in a sizeof(ComplexType) for
>> the size of each element presumably. The problem is now you have
>> char *foo, a string pointer, hanging off of this structure. How
>> would you handle that? Serializing ComplexType's foo and pointer
>> obviously won't do it.
> Why not?  visit_type_ComplexType() knows how to deal with
> the individual fields, including the string pointer. I'm not sure
> what's at issue here.
>
> In this case the handling for ComplexType would look something like:
>
> visit_type_Complex:
>      visit_start_struct
>      visit_type_uin32 //foo
>      visit_type_str //bar
>      visit_end_struct
>
> Granted, strings are easier to deal with. If char * was instead a plain
> old uint8_t*, we'd need a nested call to start_carray for each element.
> in this case it would look something like:
>
> visit_type_Complex:
>      visit_start_struct
>      visit_type_uin32 //foo field
>      visit_start_carray //bar field
>      for (i = 0; i < len_of_bar; i++):
>          visit_type_uint8
>          visit_next_carray
>      visit_end_carray

You really want to create a separate element for each element in this 
potentially large binary array? I guess it depends on the underlying 
data, but this has the potential of generating a lot of control code 
around each such byte... As said, for ASN.1 encoding, each such byte 
would be decorated with a tag and a length value, consuming 2 more bytes 
per byte.

    Stefan
Michael Roth March 14, 2013, 3:11 p.m. UTC | #9
On Thu, Mar 14, 2013 at 10:51:49AM -0400, Stefan Berger wrote:
> On 03/14/2013 10:28 AM, mdroth wrote:
> >On Thu, Mar 14, 2013 at 09:39:14AM -0400, Stefan Berger wrote:
> >>On 03/14/2013 08:18 AM, mdroth wrote:
> >>>On Wed, Mar 13, 2013 at 09:48:11PM -0400, Stefan Berger wrote:
> >>>>On 03/13/2013 07:18 PM, mdroth wrote:
> >>>>>On Wed, Mar 13, 2013 at 06:00:24PM -0400, Stefan Berger wrote:
> >>>>>>On 03/13/2013 04:52 PM, mdroth wrote:
> >>>>>>
> >>>>>Visitors don't have any knowledge of the data structures they're visiting
> >>>>>outside of what we tell them via the visit_*() API.
> >>>>>
> >>>>>[...]
> >>>>>
> >>>>>For example, a visitor for a 16-element array of:
> >>>>>
> >>>>>typedef struct ComplexType {
> >>>>>     int32_t foo;
> >>>>>     char *bar;
> >>>>>} ComplexType;
> >>>>>
> >>>>>would look something like:
> >>>>>
> >>>>>visit_start_carray(v, ...); // instruct visitor how to calculate offsets
> >>>>>for (i = 0; i < 16; i++) {
> >>>>>     visit_type_ComplexType(v, ...) // instruct visitor how to handle elem
> >>>>>     visit_next_carray(v, ...); // instruct visitor to move to next offset
> >>>>>}
> >>>>>visit_end_carray(v, ...); // instruct visitor to finalize array
> >>>>Given this example above, I think we will need the sized buffer. The
> >>>>sized buffer targets  binary arrays and their encoding. If I was to
> >>>>encode an 'unsigned char[n]' (e.g., n=200) using n, or n/2 or n/4
> >>>>loops like above breaking it apart in u8, u16 or u32 respectively I
> >>>>think this would 'not bed good' also considering the 2 bytes for tag
> >>>>and length being added by ASN.1 for every such datatype
> >>>>(u8,u16,u32). The sized buffer allows you to for example take a
> >>>>memory page and write it out in one chunk adding a few bytes of
> >>>>ASN.1 'decoration' around the actual data.
> >>>You could do it with this interface as well actually. The Visitor will
> >>>need to maintain some internal state to differentiate what it does with
> >>>subsequent visit_type*/visit_next_carray/visit_end_carry. There's no
> >>>reason it couldn't also track the elem size so it could tag a buffer
> >>>"en masse" when visit_end_carray() gets called.
> >>It depends on what you pass into visit_start_carray. In your case if
> >>you pass in ComplexType you would pass in a sizeof(ComplexType) for
> >>the size of each element presumably. The problem is now you have
> >>char *foo, a string pointer, hanging off of this structure. How
> >>would you handle that? Serializing ComplexType's foo and pointer
> >>obviously won't do it.
> >Why not?  visit_type_ComplexType() knows how to deal with
> >the individual fields, including the string pointer. I'm not sure
> >what's at issue here.
> >
> >In this case the handling for ComplexType would look something like:
> >
> >visit_type_Complex:
> >     visit_start_struct
> >     visit_type_uin32 //foo
> >     visit_type_str //bar
> >     visit_end_struct
> >
> >Granted, strings are easier to deal with. If char * was instead a plain
> >old uint8_t*, we'd need a nested call to start_carray for each element.
> >in this case it would look something like:
> >
> >visit_type_Complex:
> >     visit_start_struct
> >     visit_type_uin32 //foo field
> >     visit_start_carray //bar field
> >     for (i = 0; i < len_of_bar; i++):
> >         visit_type_uint8
> >         visit_next_carray
> >     visit_end_carray
> 
> You really want to create a separate element for each element in
> this potentially large binary array? I guess it depends on the
> underlying data, but this has the potential of generating a lot of
> control code around each such byte... As said, for ASN.1 encoding,
> each such byte would be decorated with a tag and a length value,
> consuming 2 more bytes per byte.

I addressed this earlier. Your visitor doesn't have to tag each
element: if it knows it's handling an array (because we told it via
start_carray()), it can buffer the elements internally and tag the array
en masse when end_carray() is issued.

> 
>    Stefan
>
Stefan Berger March 14, 2013, 3:24 p.m. UTC | #10
On 03/14/2013 11:11 AM, mdroth wrote:
> On Thu, Mar 14, 2013 at 10:51:49AM -0400, Stefan Berger wrote:
>> On 03/14/2013 10:28 AM, mdroth wrote:
>>> On Thu, Mar 14, 2013 at 09:39:14AM -0400, Stefan Berger wrote:
>>>> On 03/14/2013 08:18 AM, mdroth wrote:
>>>>> On Wed, Mar 13, 2013 at 09:48:11PM -0400, Stefan Berger wrote:
>>>>>> On 03/13/2013 07:18 PM, mdroth wrote:
>>>>>>> On Wed, Mar 13, 2013 at 06:00:24PM -0400, Stefan Berger wrote:
>>>>>>>> On 03/13/2013 04:52 PM, mdroth wrote:
>>>>>>>>
>>>>>>> Visitors don't have any knowledge of the data structures they're visiting
>>>>>>> outside of what we tell them via the visit_*() API.
>>>>>>>
>>>>>>> [...]
>>>>>>>
>>>>>>> For example, a visitor for a 16-element array of:
>>>>>>>
>>>>>>> typedef struct ComplexType {
>>>>>>>      int32_t foo;
>>>>>>>      char *bar;
>>>>>>> } ComplexType;
>>>>>>>
>>>>>>> would look something like:
>>>>>>>
>>>>>>> visit_start_carray(v, ...); // instruct visitor how to calculate offsets
>>>>>>> for (i = 0; i < 16; i++) {
>>>>>>>      visit_type_ComplexType(v, ...) // instruct visitor how to handle elem
>>>>>>>      visit_next_carray(v, ...); // instruct visitor to move to next offset
>>>>>>> }
>>>>>>> visit_end_carray(v, ...); // instruct visitor to finalize array
>>>>>> Given this example above, I think we will need the sized buffer. The
>>>>>> sized buffer targets  binary arrays and their encoding. If I was to
>>>>>> encode an 'unsigned char[n]' (e.g., n=200) using n, or n/2 or n/4
>>>>>> loops like above breaking it apart in u8, u16 or u32 respectively I
>>>>>> think this would 'not bed good' also considering the 2 bytes for tag
>>>>>> and length being added by ASN.1 for every such datatype
>>>>>> (u8,u16,u32). The sized buffer allows you to for example take a
>>>>>> memory page and write it out in one chunk adding a few bytes of
>>>>>> ASN.1 'decoration' around the actual data.
>>>>> You could do it with this interface as well actually. The Visitor will
>>>>> need to maintain some internal state to differentiate what it does with
>>>>> subsequent visit_type*/visit_next_carray/visit_end_carry. There's no
>>>>> reason it couldn't also track the elem size so it could tag a buffer
>>>>> "en masse" when visit_end_carray() gets called.
>>>> It depends on what you pass into visit_start_carray. In your case if
>>>> you pass in ComplexType you would pass in a sizeof(ComplexType) for
>>>> the size of each element presumably. The problem is now you have
>>>> char *foo, a string pointer, hanging off of this structure. How
>>>> would you handle that? Serializing ComplexType's foo and pointer
>>>> obviously won't do it.
>>> Why not?  visit_type_ComplexType() knows how to deal with
>>> the individual fields, including the string pointer. I'm not sure
>>> what's at issue here.
>>>
>>> In this case the handling for ComplexType would look something like:
>>>
>>> visit_type_Complex:
>>>      visit_start_struct
>>>      visit_type_uin32 //foo
>>>      visit_type_str //bar
>>>      visit_end_struct
>>>
>>> Granted, strings are easier to deal with. If char * was instead a plain
>>> old uint8_t*, we'd need a nested call to start_carray for each element.
>>> in this case it would look something like:
>>>
>>> visit_type_Complex:
>>>      visit_start_struct
>>>      visit_type_uin32 //foo field
>>>      visit_start_carray //bar field
>>>      for (i = 0; i < len_of_bar; i++):
>>>          visit_type_uint8
>>>          visit_next_carray
>>>      visit_end_carray
>> You really want to create a separate element for each element in
>> this potentially large binary array? I guess it depends on the
>> underlying data, but this has the potential of generating a lot of
>> control code around each such byte... As said, for ASN.1 encoding,
>> each such byte would be decorated with a tag and a length value,
>> consuming 2 more bytes per byte.
> I addressed this earlier. Your visitor doesn't have tag each
> element: if it know it's handling an array (because we told it via
> start_carray()), it can buffer them internally and tag the array en
> masse when end_carray() is issued.

If we were to do this using carray on an array of structs of the
following type

struct SimpleStruct {
     uint8_t a;
     uint8_t b;
     uint32_t c;
};

then would the serialization of a and b be buffered and flushed once the
32-bit output visitor (or any output visitor other than the uint8_t one)
is called? That makes the implementation a lot more difficult.

   Stefan
Michael Roth March 14, 2013, 9:06 p.m. UTC | #11
On Thu, Mar 14, 2013 at 11:24:03AM -0400, Stefan Berger wrote:
> On 03/14/2013 11:11 AM, mdroth wrote:
> >On Thu, Mar 14, 2013 at 10:51:49AM -0400, Stefan Berger wrote:
> >>On 03/14/2013 10:28 AM, mdroth wrote:
> >>>On Thu, Mar 14, 2013 at 09:39:14AM -0400, Stefan Berger wrote:
> >>>>On 03/14/2013 08:18 AM, mdroth wrote:
> >>>>>On Wed, Mar 13, 2013 at 09:48:11PM -0400, Stefan Berger wrote:
> >>>>>>On 03/13/2013 07:18 PM, mdroth wrote:
> >>>>>>>On Wed, Mar 13, 2013 at 06:00:24PM -0400, Stefan Berger wrote:
> >>>>>>>>On 03/13/2013 04:52 PM, mdroth wrote:
> >>>>>>>>
> >>>>>>>Visitors don't have any knowledge of the data structures they're visiting
> >>>>>>>outside of what we tell them via the visit_*() API.
> >>>>>>>
> >>>>>>>[...]
> >>>>>>>
> >>>>>>>For example, a visitor for a 16-element array of:
> >>>>>>>
> >>>>>>>typedef struct ComplexType {
> >>>>>>>     int32_t foo;
> >>>>>>>     char *bar;
> >>>>>>>} ComplexType;
> >>>>>>>
> >>>>>>>would look something like:
> >>>>>>>
> >>>>>>>visit_start_carray(v, ...); // instruct visitor how to calculate offsets
> >>>>>>>for (i = 0; i < 16; i++) {
> >>>>>>>     visit_type_ComplexType(v, ...) // instruct visitor how to handle elem
> >>>>>>>     visit_next_carray(v, ...); // instruct visitor to move to next offset
> >>>>>>>}
> >>>>>>>visit_end_carray(v, ...); // instruct visitor to finalize array
> >>>>>>Given this example above, I think we will need the sized buffer. The
> >>>>>>sized buffer targets  binary arrays and their encoding. If I was to
> >>>>>>encode an 'unsigned char[n]' (e.g., n=200) using n, or n/2 or n/4
> >>>>>>loops like above breaking it apart in u8, u16 or u32 respectively I
> >>>>>>think this would 'not bed good' also considering the 2 bytes for tag
> >>>>>>and length being added by ASN.1 for every such datatype
> >>>>>>(u8,u16,u32). The sized buffer allows you to for example take a
> >>>>>>memory page and write it out in one chunk adding a few bytes of
> >>>>>>ASN.1 'decoration' around the actual data.
> >>>>>You could do it with this interface as well actually. The Visitor will
> >>>>>need to maintain some internal state to differentiate what it does with
> >>>>>subsequent visit_type*/visit_next_carray/visit_end_carry. There's no
> >>>>>reason it couldn't also track the elem size so it could tag a buffer
> >>>>>"en masse" when visit_end_carray() gets called.
> >>>>It depends on what you pass into visit_start_carray. In your case if
> >>>>you pass in ComplexType you would pass in a sizeof(ComplexType) for
> >>>>the size of each element presumably. The problem is now you have
> >>>>char *foo, a string pointer, hanging off of this structure. How
> >>>>would you handle that? Serializing ComplexType's foo and pointer
> >>>>obviously won't do it.
> >>>Why not?  visit_type_ComplexType() knows how to deal with
> >>>the individual fields, including the string pointer. I'm not sure
> >>>what's at issue here.
> >>>
> >>>In this case the handling for ComplexType would look something like:
> >>>
> >>>visit_type_Complex:
> >>>     visit_start_struct
> >>>     visit_type_uin32 //foo
> >>>     visit_type_str //bar
> >>>     visit_end_struct
> >>>
> >>>Granted, strings are easier to deal with. If char * was instead a plain
> >>>old uint8_t*, we'd need a nested call to start_carray for each element.
> >>>in this case it would look something like:
> >>>
> >>>visit_type_Complex:
> >>>     visit_start_struct
> >>>     visit_type_uin32 //foo field
> >>>     visit_start_carray //bar field
> >>>     for (i = 0; i < len_of_bar; i++):
> >>>         visit_type_uint8
> >>>         visit_next_carray
> >>>     visit_end_carray
> >>You really want to create a separate element for each element in
> >>this potentially large binary array? I guess it depends on the
> >>underlying data, but this has the potential of generating a lot of
> >>control code around each such byte... As said, for ASN.1 encoding,
> >>each such byte would be decorated with a tag and a length value,
> >>consuming 2 more bytes per byte.
> >I addressed this earlier. Your visitor doesn't have tag each
> >element: if it know it's handling an array (because we told it via
> >start_carray()), it can buffer them internally and tag the array en
> >masse when end_carray() is issued.
> 
> If we were to do this using carray on an array of structs of the
> following type
> 
> struct SimpleStruct {
>     uint8_t a;
>     uint8_t b;
>     uint32_t c;
> };
> 
> then the serialization of a and b would be buffered and flushed once
> the 32bit output visitor (or any other than uint8_t output visitor)
> would be called?

I don't quite understand. For a struct, we'd tag each field
individually, right?

It's avoiding the need to tag each element in a list, each SimpleStruct,
that's at issue, right? We have a special case for u8 arrays that is
currently handled by qapi_sized_buffer(), and now we're trying to
generalize this optimized handling for more complex data types via
visit_carray_*?

I guess my first question is whether or not it's possible for more complex
data types. For u8 arrays we seem to use a special OCTET_STRING encoding.

If we're not sure we can do this more generally, we do have the option
of only special-casing u8 arrays to use the OCTET_STRING encoding, and
handle the others with "non-optimized" encodings.

If you have an idea for what a generalized, optimized encoding that's
applicable for non-u8 types would look like, we can work through that if
you have an example optimized encoding for, say, an array of
SimpleStruct.

> 
>   Stefan
>
Stefan Berger March 15, 2013, 2:05 a.m. UTC | #12
On 03/14/2013 05:06 PM, mdroth wrote:
> On Thu, Mar 14, 2013 at 11:24:03AM -0400, Stefan Berger wrote:
>> On 03/14/2013 11:11 AM, mdroth wrote:
>>> On Thu, Mar 14, 2013 at 10:51:49AM -0400, Stefan Berger wrote:
>>>> On 03/14/2013 10:28 AM, mdroth wrote:
>>>>> On Thu, Mar 14, 2013 at 09:39:14AM -0400, Stefan Berger wrote:
>>>>>> On 03/14/2013 08:18 AM, mdroth wrote:
>>>>>>> On Wed, Mar 13, 2013 at 09:48:11PM -0400, Stefan Berger wrote:
>>>>>>>> On 03/13/2013 07:18 PM, mdroth wrote:
>>>>>>>>> On Wed, Mar 13, 2013 at 06:00:24PM -0400, Stefan Berger wrote:
>>>>>>>>>> On 03/13/2013 04:52 PM, mdroth wrote:
>>>>>>>>>>
>>>>>>>>> Visitors don't have any knowledge of the data structures they're visiting
>>>>>>>>> outside of what we tell them via the visit_*() API.
>>>>>>>>>
>>>>>>>>> [...]
>>>>>>>>>
>>>>>>>>> For example, a visitor for a 16-element array of:
>>>>>>>>>
>>>>>>>>> typedef struct ComplexType {
>>>>>>>>>      int32_t foo;
>>>>>>>>>      char *bar;
>>>>>>>>> } ComplexType;
>>>>>>>>>
>>>>>>>>> would look something like:
>>>>>>>>>
>>>>>>>>> visit_start_carray(v, ...); // instruct visitor how to calculate offsets
>>>>>>>>> for (i = 0; i < 16; i++) {
>>>>>>>>>      visit_type_ComplexType(v, ...) // instruct visitor how to handle elem
>>>>>>>>>      visit_next_carray(v, ...); // instruct visitor to move to next offset
>>>>>>>>> }
>>>>>>>>> visit_end_carray(v, ...); // instruct visitor to finalize array
>>>>>>>> Given this example above, I think we will need the sized buffer. The
>>>>>>>> sized buffer targets binary arrays and their encoding. If I was to
>>>>>>>> encode an 'unsigned char[n]' (e.g., n=200) using n, or n/2 or n/4
>>>>>>>> loops like above breaking it apart in u8, u16 or u32 respectively I
>>>>>>>> think this would 'not be good' also considering the 2 bytes for tag
>>>>>>>> and length being added by ASN.1 for every such datatype
>>>>>>>> (u8,u16,u32). The sized buffer allows you to for example take a
>>>>>>>> memory page and write it out in one chunk adding a few bytes of
>>>>>>>> ASN.1 'decoration' around the actual data.
>>>>>>> You could do it with this interface as well actually. The Visitor will
>>>>>>> need to maintain some internal state to differentiate what it does with
>>>>>>> subsequent visit_type*/visit_next_carray/visit_end_carray. There's no
>>>>>>> reason it couldn't also track the elem size so it could tag a buffer
>>>>>>> "en masse" when visit_end_carray() gets called.
>>>>>> It depends on what you pass into visit_start_carray. In your case if
>>>>>> you pass in ComplexType you would pass in a sizeof(ComplexType) for
>>>>>> the size of each element presumably. The problem is now you have
>>>>>> char *foo, a string pointer, hanging off of this structure. How
>>>>>> would you handle that? Serializing ComplexType's foo and pointer
>>>>>> obviously won't do it.
>>>>> Why not?  visit_type_ComplexType() knows how to deal with
>>>>> the individual fields, including the string pointer. I'm not sure
>>>>> what's at issue here.
>>>>>
>>>>> In this case the handling for ComplexType would look something like:
>>>>>
>>>>> visit_type_Complex:
>>>>>      visit_start_struct
>>>>>      visit_type_uint32 //foo
>>>>>      visit_type_str //bar
>>>>>      visit_end_struct
>>>>>
>>>>> Granted, strings are easier to deal with. If char * was instead a plain
>>>>> old uint8_t*, we'd need a nested call to start_carray for each element.
>>>>> In this case it would look something like:
>>>>>
>>>>> visit_type_Complex:
>>>>>      visit_start_struct
>>>>>      visit_type_uint32 //foo field
>>>>>      visit_start_carray //bar field
>>>>>      for (i = 0; i < len_of_bar; i++):
>>>>>          visit_type_uint8
>>>>>          visit_next_carray
>>>>>      visit_end_carray
>>>> You really want to create a separate element for each element in
>>>> this potentially large binary array? I guess it depends on the
>>>> underlying data, but this has the potential of generating a lot of
>>>> control code around each such byte... As said, for ASN.1 encoding,
>>>> each such byte would be decorated with a tag and a length value,
>>>> consuming 2 more bytes per byte.
>>> I addressed this earlier. Your visitor doesn't have to tag each
>>> element: if it knows it's handling an array (because we told it via
>>> start_carray()), it can buffer them internally and tag the array en
>>> masse when end_carray() is issued.
>> If we were to do this using carray on an array of structs of the
>> following type
>>
>> struct SimpleStruct {
>>      uint8_t a;
>>      uint8_t b;
>>      uint32_t c;
>> };
>>
>> then the serialization of a and b would be buffered and flushed once
>> the 32bit output visitor (or any other than uint8_t output visitor)
>> would be called?
> I don't quite understand. For a struct, we'd tag each field
> individually, right?
>
> It's avoiding the need to tag each element in a list, each SimpleStruct,
> that's at issue, right? We have a special case for u8 arrays that is
> currently handled by qapi_sized_buffer(), and now we're trying to
> generalize this optimized handling for more complex data types via
> visit_carray_*?

I have to admit there's one mistake in the sized buffer implementation:
it should take the width of each element so that each element can be
encoded in network byte order. I think that is the key point here, since
it makes the bytestream portable. Using the element width we can then walk
the array of n elements and write them out in network byte order in one
go. I also see the sized buffer more as a generalization of the string
visitor, which just happens to work on a null-terminated array of bytes.
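
To illustrate the element-width idea, here is a rough sketch that is not
part of this patch. The helper name sized_buffer_to_be and its signature
are made up for illustration; it assumes element widths of 1, 2, 4 or 8
bytes and an output buffer of at least count * width bytes:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper (not in the patch): copy `count` elements of
 * `width` bytes each from host byte order into big-endian (network)
 * byte order. Returns the number of bytes written to `out`.
 */
size_t sized_buffer_to_be(const void *src, size_t count, size_t width,
                          uint8_t *out)
{
    const uint8_t *in = src;
    size_t i, b;

    assert(width == 1 || width == 2 || width == 4 || width == 8);

    for (i = 0; i < count; i++) {
        uint64_t val;

        /* Load the element as a host-order integer of the given width. */
        if (width == 1) {
            val = in[0];
        } else if (width == 2) {
            uint16_t t;
            memcpy(&t, in, sizeof(t));
            val = t;
        } else if (width == 4) {
            uint32_t t;
            memcpy(&t, in, sizeof(t));
            val = t;
        } else {
            uint64_t t;
            memcpy(&t, in, sizeof(t));
            val = t;
        }

        /* Emit the most significant byte first. */
        for (b = 0; b < width; b++) {
            out[i * width + b] = (uint8_t)(val >> (8 * (width - 1 - b)));
        }
        in += width;
    }
    return count * width;
}
```

A uint16_t[2] holding 0x0102 and 0x0304 would come out as the bytes
01 02 03 04 on both little- and big-endian hosts.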

Using the carray it seems we would create special cases in the
implementation for when a u8, u16, u32, or u64 follows versus a string,
list or any more complex data type. The examples above seem to need to
buffer the u8 elements and then write them out, unlike when handling an
array of structs. With the sized buffer we can do all of this with a single
call.
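
For comparison, a rough sketch of the size difference under DER-style
tag/length rules. The function names are hypothetical, and the per-byte
figure assumes every byte gets its own one-byte tag and one-byte length,
as in the 2-bytes-per-byte estimate earlier in this thread:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Number of bytes a DER length field needs for a content length n. */
size_t der_length_size(size_t n)
{
    size_t bytes = 1;               /* short form: one byte if n < 128 */

    if (n >= 128) {
        while (n > 0) {             /* long form: 0x80|k, then k bytes */
            bytes++;
            n >>= 8;
        }
    }
    return bytes;
}

/*
 * Write n bytes as a single OCTET STRING (tag 0x04).
 * `out` must have room for 1 + der_length_size(n) + n bytes.
 * Returns the number of bytes written.
 */
size_t der_put_octet_string(uint8_t *out, const uint8_t *data, size_t n)
{
    size_t len_bytes = der_length_size(n);
    size_t pos = 0, i;

    out[pos++] = 0x04;
    if (len_bytes == 1) {
        out[pos++] = (uint8_t)n;
    } else {
        size_t k = len_bytes - 1;   /* number of length octets */

        out[pos++] = (uint8_t)(0x80 | k);
        for (i = 0; i < k; i++) {
            out[pos++] = (uint8_t)(n >> (8 * (k - 1 - i)));
        }
    }
    memcpy(out + pos, data, n);
    return pos + n;
}

/* Assumed size if each byte is tagged alone: tag + length + value. */
size_t der_per_byte_size(size_t n)
{
    return 3 * n;
}
```

For n = 200 the single OCTET STRING takes 203 bytes (tag, two length
bytes, data) against 600 bytes when each byte is tagged separately.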

> I guess my first question is whether or not it's possible for more complex
> data types. For u8 arrays we seem to use a special OCTET_STRING encoding.
>
> If we're not sure we can do this more generally, we do have the option
> of only special-casing u8 arrays to use the OCTET_STRING encoding, and
> handle the others with "non-optimized" encodings.
>
> If you have an idea for what a generalized, optimized encoding that's
> applicable for non-u8 types would look like, we can work through that if
> you have an example optimized encoding for, say, an array of
> SimpleStruct.
For an array of SimpleStruct we use the carray. No endianness conversion
is necessary here for the structure as a whole, but each u16, u32, u64
inside such a structure will be written out correctly with its
respective visitor. A u16[] or u32[] inside that structure would then be
handled with a sized buffer walking the array and normalizing each
element's endianness.
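
A rough sketch of how I picture the SimpleStruct array being walked. The
toy_* helpers are hypothetical stand-ins for the real visitors, and the
tags and lengths are left out to keep it short:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct SimpleStruct {
    uint8_t a;
    uint8_t b;
    uint32_t c;
};

/* Hypothetical stand-in for a u32 output visitor: emit big-endian. */
size_t toy_put_u32(uint8_t *out, uint32_t v)
{
    out[0] = (uint8_t)(v >> 24);
    out[1] = (uint8_t)(v >> 16);
    out[2] = (uint8_t)(v >> 8);
    out[3] = (uint8_t)v;
    return 4;
}

/*
 * Walk the array element by element, as a carray loop would, and let
 * each field be written by the helper for its own type. The struct as
 * a whole is never byte-swapped; only the multi-byte field c is.
 * `out` must have room for 6 bytes per element.
 */
size_t toy_put_simple_struct_array(uint8_t *out,
                                   const struct SimpleStruct *arr, size_t n)
{
    size_t pos = 0, i;

    for (i = 0; i < n; i++) {
        out[pos++] = arr[i].a;                    /* u8: no conversion */
        out[pos++] = arr[i].b;                    /* u8: no conversion */
        pos += toy_put_u32(out + pos, arr[i].c);  /* u32: normalized */
    }
    return pos;
}
```

Each element ends up as 6 bytes in the stream regardless of any padding
the compiler puts into struct SimpleStruct.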

Strictly speaking, for ASN.1 encoding we don't need a carray visitor 
(but maybe other visitor types need it). It can be simulated with the 
struct visitor, which in effect also causes one more level of nesting, 
just that the identifier would be different, which really is the only 
difference then.

    Stefan

Patch

diff --git a/include/qapi/visitor-impl.h b/include/qapi/visitor-impl.h
index 9d87f2d..dc0e25c 100644
--- a/include/qapi/visitor-impl.h
+++ b/include/qapi/visitor-impl.h
@@ -38,6 +38,8 @@  struct Visitor
                          size_t elem_count, size_t elem_size, Error **errp);
     void (*next_carray)(Visitor *v, Error **errp);
     void (*end_carray)(Visitor *v, Error **errp);
+    void (*type_sized_buffer)(Visitor *v, uint8_t **obj, size_t size,
+                              const char *name, Error **errp);
 
     /* May be NULL */
     void (*start_optional)(Visitor *v, bool *present, const char *name,
diff --git a/include/qapi/visitor.h b/include/qapi/visitor.h
index 74bddef..7c7bb98 100644
--- a/include/qapi/visitor.h
+++ b/include/qapi/visitor.h
@@ -55,5 +55,7 @@  void visit_start_carray(Visitor *v, void **obj, const char *name,
                         size_t elem_count, size_t elem_size, Error **errp);
 void visit_next_carray(Visitor *v, Error **errp);
 void visit_end_carray(Visitor *v, Error **errp);
+void visit_type_sized_buffer(Visitor *v, uint8_t **obj, size_t len,
+                             const char *name, Error **errp);
 
 #endif
diff --git a/qapi/qapi-visit-core.c b/qapi/qapi-visit-core.c
index d9982f8..4b36a54 100644
--- a/qapi/qapi-visit-core.c
+++ b/qapi/qapi-visit-core.c
@@ -338,3 +338,11 @@  void visit_end_carray(Visitor *v, Error **errp)
         v->end_carray(v, errp);
     }
 }
+
+void visit_type_sized_buffer(Visitor *v, uint8_t **obj, size_t len,
+                             const char *name, Error **errp)
+{
+    if (!error_is_set(errp)) {
+        v->type_sized_buffer(v, obj, len, name, errp);
+    }
+}