Patchwork [V2,07/10] snapshot: qmp use new internal API for external snapshot transaction

login
register
mail settings
Submitter Wayne Xia
Date Jan. 7, 2013, 7:28 a.m.
Message ID <1357543689-11415-9-git-send-email-xiawenc@linux.vnet.ibm.com>
Download mbox | patch
Permalink /patch/209870/
State New
Headers show

Comments

Wayne Xia - Jan. 7, 2013, 7:28 a.m.
This patch switch to internal common API to take group external
snapshots from qmp_transaction interface. qmp layer simply does
a translation from user input.

Signed-off-by: Wenchao Xia <xiawenc@linux.vnet.ibm.com>
---
 blockdev.c |  215 ++++++++++++++++++++++++------------------------------------
 1 files changed, 87 insertions(+), 128 deletions(-)
Stefan Hajnoczi - Jan. 9, 2013, 12:44 p.m.
On Mon, Jan 07, 2013 at 03:28:06PM +0800, Wenchao Xia wrote:
>   This patch switch to internal common API to take group external
> snapshots from qmp_transaction interface. qmp layer simply does
> a translation from user input.
> 
> Signed-off-by: Wenchao Xia <xiawenc@linux.vnet.ibm.com>
> ---
>  blockdev.c |  215 ++++++++++++++++++++++++------------------------------------
>  1 files changed, 87 insertions(+), 128 deletions(-)

An internal API for snapshots is not necessary.  qmp_transaction() is
already usable both from the monitor and C code.

The QAPI code generator creates structs that can be accessed directly
from C.  qmp_transaction(), BlockdevAction, and BlockdevActionList *is*
the snapshot API.  It just doesn't support internal snapshots yet, which
is what you are trying to add.

To add internal snapshot support, define a BlockdevInternalSnapshot type
in qapi-schema.json and add internal snapshot support in
qmp_transaction().

qmp_transaction() was designed with this in mind from the beginning and
dispatches based on BlockdevAction->kind.

The patch series will become much smaller while still adding internal
snapshot support.

Stefan
Wayne Xia - Jan. 10, 2013, 3:21 a.m.
于 2013-1-9 20:44, Stefan Hajnoczi 写道:
> On Mon, Jan 07, 2013 at 03:28:06PM +0800, Wenchao Xia wrote:
>>    This patch switch to internal common API to take group external
>> snapshots from qmp_transaction interface. qmp layer simply does
>> a translation from user input.
>>
>> Signed-off-by: Wenchao Xia <xiawenc@linux.vnet.ibm.com>
>> ---
>>   blockdev.c |  215 ++++++++++++++++++++++++------------------------------------
>>   1 files changed, 87 insertions(+), 128 deletions(-)
>
> An internal API for snapshots is not necessary.  qmp_transaction() is
> already usable both from the monitor and C code.
>
> The QAPI code generator creates structs that can be accessed directly
> from C.  qmp_transaction(), BlockdevAction, and BlockdevActionList *is*
> the snapshot API.  It just doesn't support internal snapshots yet, which
> is what you are trying to add.
>
> To add internal snapshot support, define a BlockdevInternalSnapshot type
> in qapi-schema.json and add internal snapshot support in
> qmp_transaction().
>
> qmp_transaction() was designed with this in mind from the beginning and
> dispatches based on BlockdevAction->kind.
>
> The patch series will become much smaller while still adding internal
> snapshot support.
>
> Stefan
>

   As API, qmp_transaction have following disadvantages:
1) interface is based on string not data type inside qemu, that means
other function calling it result in: bdrv->string->bdrv
2) all capability are forced to be exposed.
3) need structure to record each transaction state, such as
BlkTransactionStates. Extending it is equal to add an internal layer.

   Actually I started up by use qmp_transaction as API, but soon
found that work is almost done around BlkTransactionStates, so
added a layer around it clearly.
Stefan Hajnoczi - Jan. 10, 2013, 12:41 p.m.
On Thu, Jan 10, 2013 at 11:21:22AM +0800, Wenchao Xia wrote:
> 于 2013-1-9 20:44, Stefan Hajnoczi 写道:
> >On Mon, Jan 07, 2013 at 03:28:06PM +0800, Wenchao Xia wrote:
> >>   This patch switch to internal common API to take group external
> >>snapshots from qmp_transaction interface. qmp layer simply does
> >>a translation from user input.
> >>
> >>Signed-off-by: Wenchao Xia <xiawenc@linux.vnet.ibm.com>
> >>---
> >>  blockdev.c |  215 ++++++++++++++++++++++++------------------------------------
> >>  1 files changed, 87 insertions(+), 128 deletions(-)
> >
> >An internal API for snapshots is not necessary.  qmp_transaction() is
> >already usable both from the monitor and C code.
> >
> >The QAPI code generator creates structs that can be accessed directly
> >from C.  qmp_transaction(), BlockdevAction, and BlockdevActionList *is*
> >the snapshot API.  It just doesn't support internal snapshots yet, which
> >is what you are trying to add.
> >
> >To add internal snapshot support, define a BlockdevInternalSnapshot type
> >in qapi-schema.json and add internal snapshot support in
> >qmp_transaction().
> >
> >qmp_transaction() was designed with this in mind from the beginning and
> >dispatches based on BlockdevAction->kind.
> >
> >The patch series will become much smaller while still adding internal
> >snapshot support.
> >
> >Stefan
> >
> 
>   As API, qmp_transaction have following disadvantages:
> 1) interface is based on string not data type inside qemu, that means
> other function calling it result in: bdrv->string->bdrv

Use bdrv_get_device_name().  You already need to fill in filename or
snapshot name strings.  This is not a big disadvantage.

> 2) all capability are forced to be exposed.

Is there something you cannot expose?

> 3) need structure to record each transaction state, such as
> BlkTransactionStates. Extending it is equal to add an internal layer.

I agree that extending it is equal coding effort to adding an internal
layer because you'll need to refactor qmp_transaction() a bit to really
support additional action types.

But it's the right thing to do.  Don't add unnecessary layers just
because writing new code is more fun than extending existing code.

>   Actually I started up by use qmp_transaction as API, but soon
> found that work is almost done around BlkTransactionStates, so
> added a layer around it clearly.

BlkTransactionStates is only relevant to external snapshots because they
change the BlkDriverState chain and must be able to undo that.

For internal snapshots you simply need to store the snapshot name so you
can delete it if there is an error partway through.  (BTW it's not
possible to completely restore state if you allow overwriting existing
internal snapshots unless you do something like taking a backup snapshot
of the existing snapshot first!)

Stefan
Wayne Xia - Jan. 11, 2013, 6:22 a.m.
于 2013-1-10 20:41, Stefan Hajnoczi 写道:
> On Thu, Jan 10, 2013 at 11:21:22AM +0800, Wenchao Xia wrote:
>> 于 2013-1-9 20:44, Stefan Hajnoczi 写道:
>>> On Mon, Jan 07, 2013 at 03:28:06PM +0800, Wenchao Xia wrote:
>>>>    This patch switch to internal common API to take group external
>>>> snapshots from qmp_transaction interface. qmp layer simply does
>>>> a translation from user input.
>>>>
>>>> Signed-off-by: Wenchao Xia <xiawenc@linux.vnet.ibm.com>
>>>> ---
>>>>   blockdev.c |  215 ++++++++++++++++++++++++------------------------------------
>>>>   1 files changed, 87 insertions(+), 128 deletions(-)
>>>
>>> An internal API for snapshots is not necessary.  qmp_transaction() is
>>> already usable both from the monitor and C code.
>>>
>>> The QAPI code generator creates structs that can be accessed directly
>> >from C.  qmp_transaction(), BlockdevAction, and BlockdevActionList *is*
>>> the snapshot API.  It just doesn't support internal snapshots yet, which
>>> is what you are trying to add.
>>>
>>> To add internal snapshot support, define a BlockdevInternalSnapshot type
>>> in qapi-schema.json and add internal snapshot support in
>>> qmp_transaction().
>>>
>>> qmp_transaction() was designed with this in mind from the beginning and
>>> dispatches based on BlockdevAction->kind.
>>>
>>> The patch series will become much smaller while still adding internal
>>> snapshot support.
>>>
>>> Stefan
>>>
>>
>>    As API, qmp_transaction have following disadvantages:
>> 1) interface is based on string not data type inside qemu, that means
>> other function calling it result in: bdrv->string->bdrv
>
> Use bdrv_get_device_name().  You already need to fill in filename or
> snapshot name strings.  This is not a big disadvantage.
>
   Yes, not a big disadvantage, but why not save string operation but
use (bdrv*) as much as possible?

what happens will be:

hmp-snapshot
     |
qmp-snapshot
     |---------
              |
         qmp-transaction            savevm(may be other..)
              |----------------------|
                             |
               internal transaction layer

>> 2) all capability are forced to be exposed.
>
> Is there something you cannot expose?
>
   As other component in qemu can use it, some option may
be used only in qemu not to user. For eg, vm-state-size.

>> 3) need structure to record each transaction state, such as
>> BlkTransactionStates. Extending it is equal to add an internal layer.
>
> I agree that extending it is equal coding effort to adding an internal
> layer because you'll need to refactor qmp_transaction() a bit to really
> support additional action types.
>
> But it's the right thing to do.  Don't add unnecessary layers just
> because writing new code is more fun than extending existing code.
>
  If this layer is not added but depending only qmp_transaction, there
will be many "if else" fragment. I have tried that and the code
is awkful, this layer did not bring extra burden only make what
happens inside qmp_transaction clearer, I did not add this layer just
for fun.


>>    Actually I started up by use qmp_transaction as API, but soon
>> found that work is almost done around BlkTransactionStates, so
>> added a layer around it clearly.
>
> BlkTransactionStates is only relevant to external snapshots because they
> change the BlkDriverState chain and must be able to undo that.
>
> For internal snapshots you simply need to store the snapshot name so you
> can delete it if there is an error partway through.  (BTW it's not
> possible to completely restore state if you allow overwriting existing
> internal snapshots unless you do something like taking a backup snapshot
> of the existing snapshot first!)
>
   Still I need BlkTransactionStates to record BlkDriverState because
the granularity is block device. And yes it is not possible to
completely restore state in overwrite case, which can be solved later
and mark it in comments now.

> Stefan
>
Stefan Hajnoczi - Jan. 11, 2013, 9:12 a.m.
On Fri, Jan 11, 2013 at 02:22:28PM +0800, Wenchao Xia wrote:
> 于 2013-1-10 20:41, Stefan Hajnoczi 写道:
> >On Thu, Jan 10, 2013 at 11:21:22AM +0800, Wenchao Xia wrote:
> >>于 2013-1-9 20:44, Stefan Hajnoczi 写道:
> >>>On Mon, Jan 07, 2013 at 03:28:06PM +0800, Wenchao Xia wrote:
> >>>>   This patch switch to internal common API to take group external
> >>>>snapshots from qmp_transaction interface. qmp layer simply does
> >>>>a translation from user input.
> >>>>
> >>>>Signed-off-by: Wenchao Xia <xiawenc@linux.vnet.ibm.com>
> >>>>---
> >>>>  blockdev.c |  215 ++++++++++++++++++++++++------------------------------------
> >>>>  1 files changed, 87 insertions(+), 128 deletions(-)
> >>>
> >>>An internal API for snapshots is not necessary.  qmp_transaction() is
> >>>already usable both from the monitor and C code.
> >>>
> >>>The QAPI code generator creates structs that can be accessed directly
> >>>from C.  qmp_transaction(), BlockdevAction, and BlockdevActionList *is*
> >>>the snapshot API.  It just doesn't support internal snapshots yet, which
> >>>is what you are trying to add.
> >>>
> >>>To add internal snapshot support, define a BlockdevInternalSnapshot type
> >>>in qapi-schema.json and add internal snapshot support in
> >>>qmp_transaction().
> >>>
> >>>qmp_transaction() was designed with this in mind from the beginning and
> >>>dispatches based on BlockdevAction->kind.
> >>>
> >>>The patch series will become much smaller while still adding internal
> >>>snapshot support.
> >>>
> >>>Stefan
> >>>
> >>
> >>   As API, qmp_transaction have following disadvantages:
> >>1) interface is based on string not data type inside qemu, that means
> >>other function calling it result in: bdrv->string->bdrv
> >
> >Use bdrv_get_device_name().  You already need to fill in filename or
> >snapshot name strings.  This is not a big disadvantage.
> >
>   Yes, not a big disadvantage, but why not save string operation but
> use (bdrv*) as much as possible?
> 
> what happens will be:
> 
> hmp-snapshot
>     |
> qmp-snapshot
>     |---------
>              |
>         qmp-transaction            savevm(may be other..)
>              |----------------------|
>                             |
>               internal transaction layer

Saving the string operation is not worth duplicating the API.

> >>2) all capability are forced to be exposed.
> >
> >Is there something you cannot expose?
> >
>   As other component in qemu can use it, some option may
> be used only in qemu not to user. For eg, vm-state-size.

When we hit a limitation of QAPI then it needs to be extended.  I'm sure
there's a solution for splitting or hiding parts of the QAPI generated
API.

> >>3) need structure to record each transaction state, such as
> >>BlkTransactionStates. Extending it is equal to add an internal layer.
> >
> >I agree that extending it is equal coding effort to adding an internal
> >layer because you'll need to refactor qmp_transaction() a bit to really
> >support additional action types.
> >
> >But it's the right thing to do.  Don't add unnecessary layers just
> >because writing new code is more fun than extending existing code.
> >
>  If this layer is not added but depending only qmp_transaction, there
> will be many "if else" fragment. I have tried that and the code
> is awkful, this layer did not bring extra burden only make what
> happens inside qmp_transaction clearer, I did not add this layer just
> for fun.
> 
> 
> >>   Actually I started up by use qmp_transaction as API, but soon
> >>found that work is almost done around BlkTransactionStates, so
> >>added a layer around it clearly.

The qmp_transaction() implementation can be changed, I'm not saying you
have to hack in more if statements.  It's cleanest to introduce a
BdrvActionOps abstraction:

typedef struct BdrvActionOps BdrvActionOps;
typedef struct BdrvTransactionState {
    const BdrvActionOps *ops;
    QLIST_ENTRY(BdrvTransactionState);
} BdrvTransactionState;

struct BdrvActionOps {
    int (*prepare)(BdrvTransactionState *s, ...);
    int (*commit)(BdrvTransactionState *s, ...);
    int (*rollback)(BdrvTransactionState *s, ...);
};

BdrvTransactionState *bdrv_transaction_create(BlockdevAction *action);

Then qmp_transaction() can be generic code that steps through the
transactions.  This is similar to what your series does and I think it's
the right direction.

But please don't duplicate the qmp_transaction() and
BlockdevAction/BlockdevActionList APIs.  In other words, change the
engine, not the whole car.

Stefan
Wayne Xia - Jan. 14, 2013, 2:56 a.m.
于 2013-1-11 17:12, Stefan Hajnoczi 写道:
> On Fri, Jan 11, 2013 at 02:22:28PM +0800, Wenchao Xia wrote:
>> 于 2013-1-10 20:41, Stefan Hajnoczi 写道:
>>> On Thu, Jan 10, 2013 at 11:21:22AM +0800, Wenchao Xia wrote:
>>>> 于 2013-1-9 20:44, Stefan Hajnoczi 写道:
>>>>> On Mon, Jan 07, 2013 at 03:28:06PM +0800, Wenchao Xia wrote:
>>>>>>    This patch switch to internal common API to take group external
>>>>>> snapshots from qmp_transaction interface. qmp layer simply does
>>>>>> a translation from user input.
>>>>>>
>>>>>> Signed-off-by: Wenchao Xia <xiawenc@linux.vnet.ibm.com>
>>>>>> ---
>>>>>>   blockdev.c |  215 ++++++++++++++++++++++++------------------------------------
>>>>>>   1 files changed, 87 insertions(+), 128 deletions(-)
>>>>>
>>>>> An internal API for snapshots is not necessary.  qmp_transaction() is
>>>>> already usable both from the monitor and C code.
>>>>>
>>>>> The QAPI code generator creates structs that can be accessed directly
>>>> >from C.  qmp_transaction(), BlockdevAction, and BlockdevActionList *is*
>>>>> the snapshot API.  It just doesn't support internal snapshots yet, which
>>>>> is what you are trying to add.
>>>>>
>>>>> To add internal snapshot support, define a BlockdevInternalSnapshot type
>>>>> in qapi-schema.json and add internal snapshot support in
>>>>> qmp_transaction().
>>>>>
>>>>> qmp_transaction() was designed with this in mind from the beginning and
>>>>> dispatches based on BlockdevAction->kind.
>>>>>
>>>>> The patch series will become much smaller while still adding internal
>>>>> snapshot support.
>>>>>
>>>>> Stefan
>>>>>
>>>>
>>>>    As API, qmp_transaction have following disadvantages:
>>>> 1) interface is based on string not data type inside qemu, that means
>>>> other function calling it result in: bdrv->string->bdrv
>>>
>>> Use bdrv_get_device_name().  You already need to fill in filename or
>>> snapshot name strings.  This is not a big disadvantage.
>>>
>>    Yes, not a big disadvantage, but why not save string operation but
>> use (bdrv*) as much as possible?
>>
>> what happens will be:
>>
>> hmp-snapshot
>>      |
>> qmp-snapshot
>>      |---------
>>               |
>>          qmp-transaction            savevm(may be other..)
>>               |----------------------|
>>                              |
>>                internal transaction layer
>
> Saving the string operation is not worth duplicating the API.
>
   I agree with you for this line:), but,  it is a weight on the balance
of choice, pls consider it together with issues below.

>>>> 2) all capability are forced to be exposed.
>>>
>>> Is there something you cannot expose?
>>>
>>    As other component in qemu can use it, some option may
>> be used only in qemu not to user. For eg, vm-state-size.
>
> When we hit a limitation of QAPI then it needs to be extended.  I'm sure
> there's a solution for splitting or hiding parts of the QAPI generated
> API.
>
   I can't think it out now, it seems to be a bit tricky.

>>>> 3) need structure to record each transaction state, such as
>>>> BlkTransactionStates. Extending it is equal to add an internal layer.
>>>
>>> I agree that extending it is equal coding effort to adding an internal
>>> layer because you'll need to refactor qmp_transaction() a bit to really
>>> support additional action types.
>>>
>>> But it's the right thing to do.  Don't add unnecessary layers just
>>> because writing new code is more fun than extending existing code.
>>>
>>   If this layer is not added but depending only qmp_transaction, there
>> will be many "if else" fragment. I have tried that and the code
>> is awkful, this layer did not bring extra burden only make what
>> happens inside qmp_transaction clearer, I did not add this layer just
>> for fun.
>>
>>
>>>>    Actually I started up by use qmp_transaction as API, but soon
>>>> found that work is almost done around BlkTransactionStates, so
>>>> added a layer around it clearly.
>
> The qmp_transaction() implementation can be changed, I'm not saying you
> have to hack in more if statements.  It's cleanest to introduce a
> BdrvActionOps abstraction:
>
> typedef struct BdrvActionOps BdrvActionOps;
> typedef struct BdrvTransactionState {
>      const BdrvActionOps *ops;
>      QLIST_ENTRY(BdrvTransactionState);
> } BdrvTransactionState;
>
> struct BdrvActionOps {
>      int (*prepare)(BdrvTransactionState *s, ...);
>      int (*commit)(BdrvTransactionState *s, ...);
>      int (*rollback)(BdrvTransactionState *s, ...);
> };
>
> BdrvTransactionState *bdrv_transaction_create(BlockdevAction *action);
>
> Then qmp_transaction() can be generic code that steps through the
> transactions.
   With internal API, qmp_transaction can still be generic code with
a translate from bdrv* to char* at caller level.

   This is similar to what your series does and I think it's
> the right direction.
>
> But please don't duplicate the qmp_transaction() and
> BlockdevAction/BlockdevActionList APIs.  In other words, change the
> engine, not the whole car.
>
> Stefan
>

   If my understanding is correct, the BdrvActionOps need to be extended
as following:
struct BdrvActionOps {
      /* need following for callback functions */
      const char *sn_name;
      BlockDriverState *bs;
      ...
      int (*prepare)(BdrvTransactionState *s, ...);
      int (*commit)(BdrvTransactionState *s, ...);
      int (*rollback)(BdrvTransactionState *s, ...);
};
Or an opaque* should used for every BdrvActionOps.

Comparation:
The way above:
  1) translate from BlockdevAction to BdrvTransactionState by
     bdrv_transaction_create().
  2) enqueue BdrvTransactionState by
     some code.
  3) execute them by
     a new function, name it as BdrvActionOpsRun().

Internal API way:
  1) translate BlockdevAction to BlkTransStates by
     fill_blk_trs().
  2) enqueue BlkTransStates to BlkTransStates by
     add_transaction().
  3) execute them by
     submit_transaction().

   It seems the way above will end as something like an internal
layer, but without clear APIs tips what it is doing. Please reconsider
the advantages about a clear internal API layer.
Stefan Hajnoczi - Jan. 14, 2013, 10:06 a.m.
On Mon, Jan 14, 2013 at 10:56:30AM +0800, Wenchao Xia wrote:
> 于 2013-1-11 17:12, Stefan Hajnoczi 写道:
> >On Fri, Jan 11, 2013 at 02:22:28PM +0800, Wenchao Xia wrote:
> >>于 2013-1-10 20:41, Stefan Hajnoczi 写道:
> >>>On Thu, Jan 10, 2013 at 11:21:22AM +0800, Wenchao Xia wrote:
> >>>>于 2013-1-9 20:44, Stefan Hajnoczi 写道:
> >>>>>On Mon, Jan 07, 2013 at 03:28:06PM +0800, Wenchao Xia wrote:
> >>>>>>   This patch switch to internal common API to take group external
> >>>>>>snapshots from qmp_transaction interface. qmp layer simply does
> >>>>>>a translation from user input.
> >>>>>>
> >>>>>>Signed-off-by: Wenchao Xia <xiawenc@linux.vnet.ibm.com>
> >>>>>>---
> >>>>>>  blockdev.c |  215 ++++++++++++++++++++++++------------------------------------
> >>>>>>  1 files changed, 87 insertions(+), 128 deletions(-)
> >>>>>
> >>>>>An internal API for snapshots is not necessary.  qmp_transaction() is
> >>>>>already usable both from the monitor and C code.
> >>>>>
> >>>>>The QAPI code generator creates structs that can be accessed directly
> >>>>>from C.  qmp_transaction(), BlockdevAction, and BlockdevActionList *is*
> >>>>>the snapshot API.  It just doesn't support internal snapshots yet, which
> >>>>>is what you are trying to add.
> >>>>>
> >>>>>To add internal snapshot support, define a BlockdevInternalSnapshot type
> >>>>>in qapi-schema.json and add internal snapshot support in
> >>>>>qmp_transaction().
> >>>>>
> >>>>>qmp_transaction() was designed with this in mind from the beginning and
> >>>>>dispatches based on BlockdevAction->kind.
> >>>>>
> >>>>>The patch series will become much smaller while still adding internal
> >>>>>snapshot support.
> >>>>>
> >>>>>Stefan
> >>>>>
> >>>>
> >>>>   As API, qmp_transaction have following disadvantages:
> >>>>1) interface is based on string not data type inside qemu, that means
> >>>>other function calling it result in: bdrv->string->bdrv
> >>>
> >>>Use bdrv_get_device_name().  You already need to fill in filename or
> >>>snapshot name strings.  This is not a big disadvantage.
> >>>
> >>   Yes, not a big disadvantage, but why not save string operation but
> >>use (bdrv*) as much as possible?
> >>
> >>what happens will be:
> >>
> >>hmp-snapshot
> >>     |
> >>qmp-snapshot
> >>     |---------
> >>              |
> >>         qmp-transaction            savevm(may be other..)
> >>              |----------------------|
> >>                             |
> >>               internal transaction layer
> >
> >Saving the string operation is not worth duplicating the API.
> >
>   I agree with you for this line:), but,  it is a weight on the balance
> of choice, pls consider it together with issues below.
> 
> >>>>2) all capability are forced to be exposed.
> >>>
> >>>Is there something you cannot expose?
> >>>
> >>   As other component in qemu can use it, some option may
> >>be used only in qemu not to user. For eg, vm-state-size.
> >
> >When we hit a limitation of QAPI then it needs to be extended.  I'm sure
> >there's a solution for splitting or hiding parts of the QAPI generated
> >API.
> >
>   I can't think it out now, it seems to be a bit tricky.
> 
> >>>>3) need structure to record each transaction state, such as
> >>>>BlkTransactionStates. Extending it is equal to add an internal layer.
> >>>
> >>>I agree that extending it is equal coding effort to adding an internal
> >>>layer because you'll need to refactor qmp_transaction() a bit to really
> >>>support additional action types.
> >>>
> >>>But it's the right thing to do.  Don't add unnecessary layers just
> >>>because writing new code is more fun than extending existing code.
> >>>
> >>  If this layer is not added but depending only qmp_transaction, there
> >>will be many "if else" fragment. I have tried that and the code
> >>is awkful, this layer did not bring extra burden only make what
> >>happens inside qmp_transaction clearer, I did not add this layer just
> >>for fun.
> >>
> >>
> >>>>   Actually I started up by use qmp_transaction as API, but soon
> >>>>found that work is almost done around BlkTransactionStates, so
> >>>>added a layer around it clearly.
> >
> >The qmp_transaction() implementation can be changed, I'm not saying you
> >have to hack in more if statements.  It's cleanest to introduce a
> >BdrvActionOps abstraction:
> >
> >typedef struct BdrvActionOps BdrvActionOps;
> >typedef struct BdrvTransactionState {
> >     const BdrvActionOps *ops;
> >     QLIST_ENTRY(BdrvTransactionState);
> >} BdrvTransactionState;
> >
> >struct BdrvActionOps {
> >     int (*prepare)(BdrvTransactionState *s, ...);
> >     int (*commit)(BdrvTransactionState *s, ...);
> >     int (*rollback)(BdrvTransactionState *s, ...);
> >};
> >
> >BdrvTransactionState *bdrv_transaction_create(BlockdevAction *action);
> >
> >Then qmp_transaction() can be generic code that steps through the
> >transactions.
>   With internal API, qmp_transaction can still be generic code with
> a translate from bdrv* to char* at caller level.
> 
>   This is similar to what your series does and I think it's
> >the right direction.
> >
> >But please don't duplicate the qmp_transaction() and
> >BlockdevAction/BlockdevActionList APIs.  In other words, change the
> >engine, not the whole car.
> >
> >Stefan
> >
> 
>   If my understanding is correct, the BdrvActionOps need to be extended
> as following:
> struct BdrvActionOps {
>      /* need following for callback functions */
>      const char *sn_name;
>      BlockDriverState *bs;
>      ...
>      int (*prepare)(BdrvTransactionState *s, ...);
>      int (*commit)(BdrvTransactionState *s, ...);
>      int (*rollback)(BdrvTransactionState *s, ...);
> };
> Or an opaque* should used for every BdrvActionOps.

It is nice to keep *Ops structs read-only so they can be static const.
This way the ops are shared between all instances of the same action
type.  Also the function pointers can be in read-only memory pages,
which is a slight security win since it prevents memory corruption
exploits from taking advantage of function pointers to execute arbitrary
code.

In the pseudo-code I posted the sn_name or bs fields go into an
action-specific state struct:

typedef struct {
    BdrvTransactionState common;
    char *backup_sn_name;
} InternalSnapshotTransactionState;

typedef struct {
    BdrvTransactionState common;
    BlockDriverState *old_bs;
    BlockDriverState *new_bs;
} ExternalSnapshotTransactionState;

> Comparation:
> The way above:
>  1) translate from BlockdevAction to BdrvTransactionState by
>     bdrv_transaction_create().
>  2) enqueue BdrvTransactionState by
>     some code.
>  3) execute them by
>     a new function, name it as BdrvActionOpsRun().

If you include .prepare() in the transaction creation, then it becomes
simpler:

states = []
for action in actions:
    result = bdrv_transaction_create(action)  # invokes .prepare()
    if result is error:
        for state in states:
	    state.rollback()
	return
    states.append(result)
for state in states:
    state.commit()

Because we don't wait until BdrvActionOpsRun() before processing the
transaction, there's no need to translate from BlockdevAction to
BdrvTransactionState.  The BdrvTransactionState struct really only has
state required to commit/rollback the transaction.

(Even if it becomes necessary to keep information from BlockdevAction
after .prepare() returns, just keep a pointer to BlockdevAction.  Don't
duplicate it.)

> Internal API way:
>  1) translate BlockdevAction to BlkTransStates by
>     fill_blk_trs().
>  2) enqueue BlkTransStates to BlkTransStates by
>     add_transaction().
>  3) execute them by
>     submit_transaction().
> 
>   It seems the way above will end as something like an internal
> layer, but without clear APIs tips what it is doing. Please reconsider
> the advantages about a clear internal API layer.

I'm not convinced by the internal API approach.  It took me a while when
reviewing the code before I understood what was actually going on
because of the qmp_transaction() and BlockdevAction duplication code.

I see the internal API approach as an unnecessary layer of indirection.
It makes the code more complicated to understand and maintain.  Next
time we add something to qmp_transaction() it would also be necessary to
duplicate that change for the internal API.  It creates unnecessary
work.

Just embrace QAPI, the point of it was to eliminate these external <->
internal translations we were doing all the time.

Stefan
Wayne Xia - Jan. 15, 2013, 7:03 a.m.
于 2013-1-14 18:06, Stefan Hajnoczi 写道:
> On Mon, Jan 14, 2013 at 10:56:30AM +0800, Wenchao Xia wrote:
>> 于 2013-1-11 17:12, Stefan Hajnoczi 写道:
>>> On Fri, Jan 11, 2013 at 02:22:28PM +0800, Wenchao Xia wrote:
>>>> 于 2013-1-10 20:41, Stefan Hajnoczi 写道:
>>>>> On Thu, Jan 10, 2013 at 11:21:22AM +0800, Wenchao Xia wrote:
>>>>>> 于 2013-1-9 20:44, Stefan Hajnoczi 写道:
>>>>>>> On Mon, Jan 07, 2013 at 03:28:06PM +0800, Wenchao Xia wrote:
>>>>>>>>    This patch switch to internal common API to take group external
>>>>>>>> snapshots from qmp_transaction interface. qmp layer simply does
>>>>>>>> a translation from user input.
>>>>>>>>
>>>>>>>> Signed-off-by: Wenchao Xia <xiawenc@linux.vnet.ibm.com>
>>>>>>>> ---
>>>>>>>>   blockdev.c |  215 ++++++++++++++++++++++++------------------------------------
>>>>>>>>   1 files changed, 87 insertions(+), 128 deletions(-)
>>>>>>>
>>>>>>> An internal API for snapshots is not necessary.  qmp_transaction() is
>>>>>>> already usable both from the monitor and C code.
>>>>>>>
>>>>>>> The QAPI code generator creates structs that can be accessed directly
>>>>>> >from C.  qmp_transaction(), BlockdevAction, and BlockdevActionList *is*
>>>>>>> the snapshot API.  It just doesn't support internal snapshots yet, which
>>>>>>> is what you are trying to add.
>>>>>>>
>>>>>>> To add internal snapshot support, define a BlockdevInternalSnapshot type
>>>>>>> in qapi-schema.json and add internal snapshot support in
>>>>>>> qmp_transaction().
>>>>>>>
>>>>>>> qmp_transaction() was designed with this in mind from the beginning and
>>>>>>> dispatches based on BlockdevAction->kind.
>>>>>>>
>>>>>>> The patch series will become much smaller while still adding internal
>>>>>>> snapshot support.
>>>>>>>
>>>>>>> Stefan
>>>>>>>
>>>>>>
>>>>>>    As API, qmp_transaction have following disadvantages:
>>>>>> 1) interface is based on string not data type inside qemu, that means
>>>>>> other function calling it result in: bdrv->string->bdrv
>>>>>
>>>>> Use bdrv_get_device_name().  You already need to fill in filename or
>>>>> snapshot name strings.  This is not a big disadvantage.
>>>>>
>>>>    Yes, not a big disadvantage, but why not save string operation but
>>>> use (bdrv*) as much as possible?
>>>>
>>>> what happens will be:
>>>>
>>>> hmp-snapshot
>>>>      |
>>>> qmp-snapshot
>>>>      |---------
>>>>               |
>>>>          qmp-transaction            savevm(may be other..)
>>>>               |----------------------|
>>>>                              |
>>>>                internal transaction layer
>>>
>>> Saving the string operation is not worth duplicating the API.
>>>
>>    I agree with you for this line:), but,  it is a weight on the balance
>> of choice, pls consider it together with issues below.
>>
>>>>>> 2) all capability are forced to be exposed.
>>>>>
>>>>> Is there something you cannot expose?
>>>>>
>>>>    As other component in qemu can use it, some option may
>>>> be used only in qemu not to user. For eg, vm-state-size.
>>>
>>> When we hit a limitation of QAPI then it needs to be extended.  I'm sure
>>> there's a solution for splitting or hiding parts of the QAPI generated
>>> API.
>>>
>>    I can't think it out now, it seems to be a bit tricky.
>>
>>>>>> 3) need structure to record each transaction state, such as
>>>>>> BlkTransactionStates. Extending it is equal to add an internal layer.
>>>>>
>>>>> I agree that extending it is equal coding effort to adding an internal
>>>>> layer because you'll need to refactor qmp_transaction() a bit to really
>>>>> support additional action types.
>>>>>
>>>>> But it's the right thing to do.  Don't add unnecessary layers just
>>>>> because writing new code is more fun than extending existing code.
>>>>>
>>>>   If this layer is not added but depending only qmp_transaction, there
>>>> will be many "if else" fragment. I have tried that and the code
>>>> is awkful, this layer did not bring extra burden only make what
>>>> happens inside qmp_transaction clearer, I did not add this layer just
>>>> for fun.
>>>>
>>>>
>>>>>>    Actually I started up by use qmp_transaction as API, but soon
>>>>>> found that work is almost done around BlkTransactionStates, so
>>>>>> added a layer around it clearly.
>>>
>>> The qmp_transaction() implementation can be changed, I'm not saying you
>>> have to hack in more if statements.  It's cleanest to introduce a
>>> BdrvActionOps abstraction:
>>>
>>> typedef struct BdrvActionOps BdrvActionOps;
>>> typedef struct BdrvTransactionState {
>>>      const BdrvActionOps *ops;
>>>      QLIST_ENTRY(BdrvTransactionState);
>>> } BdrvTransactionState;
>>>
>>> struct BdrvActionOps {
>>>      int (*prepare)(BdrvTransactionState *s, ...);
>>>      int (*commit)(BdrvTransactionState *s, ...);
>>>      int (*rollback)(BdrvTransactionState *s, ...);
>>> };
>>>
>>> BdrvTransactionState *bdrv_transaction_create(BlockdevAction *action);
>>>
>>> Then qmp_transaction() can be generic code that steps through the
>>> transactions.
>>    With internal API, qmp_transaction can still be generic code with
>> a translate from bdrv* to char* at caller level.
>>
>>    This is similar to what your series does and I think it's
>>> the right direction.
>>>
>>> But please don't duplicate the qmp_transaction() and
>>> BlockdevAction/BlockdevActionList APIs.  In other words, change the
>>> engine, not the whole car.
>>>
>>> Stefan
>>>
>>
>>    If my understanding is correct, the BdrvActionOps need to be extended
>> as following:
>> struct BdrvActionOps {
>>       /* need following for callback functions */
>>       const char *sn_name;
>>       BlockDriverState *bs;
>>       ...
>>       int (*prepare)(BdrvTransactionState *s, ...);
>>       int (*commit)(BdrvTransactionState *s, ...);
>>       int (*rollback)(BdrvTransactionState *s, ...);
>> };
>> Or an opaque* should used for every BdrvActionOps.
>
> It is nice to keep *Ops structs read-only so they can be static const.
> This way the ops are shared between all instances of the same action
> type.  Also the function pointers can be in read-only memory pages,
> which is a slight security win since it prevents memory corruption
> exploits from taking advantage of function pointers to execute arbitrary
> code.
>
   Seems good, I will package callback functions into *Ops, thanks.

> In the pseudo-code I posted the sn_name or bs fields go into an
> action-specific state struct:
>
> typedef struct {
>      BdrvTransactionState common;
>      char *backup_sn_name;
> } InternalSnapshotTransactionState;
>
> typedef struct {
>      BdrvTransactionState common;
>      BlockDriverState *old_bs;
>      BlockDriverState *new_bs;
> } ExternalSnapshotTransactionState;
>
>> Comparation:
>> The way above:
>>   1) translate from BlockdevAction to BdrvTransactionState by
>>      bdrv_transaction_create().
>>   2) enqueue BdrvTransactionState by
>>      some code.
>>   3) execute them by
>>      a new function, name it as BdrvActionOpsRun().
>
> If you include .prepare() in the transaction creation, then it becomes
> simpler:
>
> states = []
> for action in actions:
>      result = bdrv_transaction_create(action)  # invokes .prepare()
>      if result is error:
>          for state in states:
> 	    state.rollback()
> 	return
>      states.append(result)
> for state in states:
>      state.commit()
>
> Because we don't wait until BdrvActionOpsRun() before processing the
> transaction, there's no need to translate from BlockdevAction to
> BdrvTransactionState.  The BdrvTransactionState struct really only has
> state required to commit/rollback the transaction.
>
> (Even if it becomes necessary to keep information from BlockdevAction
> after .prepare() returns, just keep a pointer to BlockdevAction.  Don't
> duplicate it.)
>
   OK, *BlockdevAction plus *BlockDriverState and some other
data used internal will be added in states.

>> Internal API way:
>>   1) translate BlockdevAction to BlkTransStates by
>>      fill_blk_trs().
>>   2) enqueue BlkTransStates to BlkTransStates by
>>      add_transaction().
>>   3) execute them by
>>      submit_transaction().
>>
>>    It seems the way above will end as something like an internal
>> layer, but without clear APIs tips what it is doing. Please reconsider
>> the advantages about a clear internal API layer.
>
> I'm not convinced by the internal API approach.  It took me a while when
> reviewing the code before I understood what was actually going on
> because of the qmp_transaction() and BlockdevAction duplication code.
>
> I see the internal API approach as an unnecessary layer of indirection.
> It makes the code more complicated to understand and maintain.  Next
> time we add something to qmp_transaction() it would also be necessary to
> duplicate that change for the internal API.  It creates unnecessary
> work.
>
   Basic process is almost the same in two approaches, I'd like to
adjust the code to avoid data duplication as much as possible, and
see if some function can be removed when code keeps clear, in next
version.

> Just embrace QAPI, the point of it was to eliminate these external <->
> internal translations we were doing all the time.
>
> Stefan
>
Wayne Xia - March 12, 2013, 8:30 a.m.
于 2013-1-15 15:03, Wenchao Xia 写道:
> 于 2013-1-14 18:06, Stefan Hajnoczi 写道:
>> On Mon, Jan 14, 2013 at 10:56:30AM +0800, Wenchao Xia wrote:
>>> 于 2013-1-11 17:12, Stefan Hajnoczi 写道:
>>>> On Fri, Jan 11, 2013 at 02:22:28PM +0800, Wenchao Xia wrote:
>>>>> 于 2013-1-10 20:41, Stefan Hajnoczi 写道:
>>>>>> On Thu, Jan 10, 2013 at 11:21:22AM +0800, Wenchao Xia wrote:
>>>>>>> 于 2013-1-9 20:44, Stefan Hajnoczi 写道:
>>>>>>>> On Mon, Jan 07, 2013 at 03:28:06PM +0800, Wenchao Xia wrote:
>>>>>>>>>    This patch switch to internal common API to take group external
>>>>>>>>> snapshots from qmp_transaction interface. qmp layer simply does
>>>>>>>>> a translation from user input.
>>>>>>>>>
>>>>>>>>> Signed-off-by: Wenchao Xia <xiawenc@linux.vnet.ibm.com>
>>>>>>>>> ---
>>>>>>>>>   blockdev.c |  215
>>>>>>>>> ++++++++++++++++++++++++------------------------------------
>>>>>>>>>   1 files changed, 87 insertions(+), 128 deletions(-)
>>>>>>>>
>>>>>>>> An internal API for snapshots is not necessary.
>>>>>>>> qmp_transaction() is
>>>>>>>> already usable both from the monitor and C code.
>>>>>>>>
>>>>>>>> The QAPI code generator creates structs that can be accessed
>>>>>>>> directly
>>>>>>> >from C.  qmp_transaction(), BlockdevAction, and
>>>>>>> BlockdevActionList *is*
>>>>>>>> the snapshot API.  It just doesn't support internal snapshots
>>>>>>>> yet, which
>>>>>>>> is what you are trying to add.
>>>>>>>>
>>>>>>>> To add internal snapshot support, define a
>>>>>>>> BlockdevInternalSnapshot type
>>>>>>>> in qapi-schema.json and add internal snapshot support in
>>>>>>>> qmp_transaction().
>>>>>>>>
>>>>>>>> qmp_transaction() was designed with this in mind from the
>>>>>>>> beginning and
>>>>>>>> dispatches based on BlockdevAction->kind.
>>>>>>>>
>>>>>>>> The patch series will become much smaller while still adding
>>>>>>>> internal
>>>>>>>> snapshot support.
>>>>>>>>
>>>>>>>> Stefan
>>>>>>>>
>>>>>>>
>>>>>>>    As API, qmp_transaction have following disadvantages:
>>>>>>> 1) interface is based on string not data type inside qemu, that
>>>>>>> means
>>>>>>> other function calling it result in: bdrv->string->bdrv
>>>>>>
>>>>>> Use bdrv_get_device_name().  You already need to fill in filename or
>>>>>> snapshot name strings.  This is not a big disadvantage.
>>>>>>
>>>>>    Yes, not a big disadvantage, but why not save string operation but
>>>>> use (bdrv*) as much as possible?
>>>>>
>>>>> what happens will be:
>>>>>
>>>>> hmp-snapshot
>>>>>      |
>>>>> qmp-snapshot
>>>>>      |---------
>>>>>               |
>>>>>          qmp-transaction            savevm(may be other..)
>>>>>               |----------------------|
>>>>>                              |
>>>>>                internal transaction layer
>>>>
>>>> Saving the string operation is not worth duplicating the API.
>>>>
>>>    I agree with you for this line:), but,  it is a weight on the balance
>>> of choice, pls consider it together with issues below.
>>>
>>>>>>> 2) all capability are forced to be exposed.
>>>>>>
>>>>>> Is there something you cannot expose?
>>>>>>
>>>>>    As other component in qemu can use it, some option may
>>>>> be used only in qemu not to user. For eg, vm-state-size.
>>>>
>>>> When we hit a limitation of QAPI then it needs to be extended.  I'm
>>>> sure
>>>> there's a solution for splitting or hiding parts of the QAPI generated
>>>> API.
>>>>
>>>    I can't think it out now, it seems to be a bit tricky.
>>>
>>>>>>> 3) need structure to record each transaction state, such as
>>>>>>> BlkTransactionStates. Extending it is equal to add an internal
>>>>>>> layer.
>>>>>>
>>>>>> I agree that extending it is equal coding effort to adding an
>>>>>> internal
>>>>>> layer because you'll need to refactor qmp_transaction() a bit to
>>>>>> really
>>>>>> support additional action types.
>>>>>>
>>>>>> But it's the right thing to do.  Don't add unnecessary layers just
>>>>>> because writing new code is more fun than extending existing code.
>>>>>>
>>>>>   If this layer is not added but depending only qmp_transaction, there
>>>>> will be many "if else" fragment. I have tried that and the code
>>>>> is awkful, this layer did not bring extra burden only make what
>>>>> happens inside qmp_transaction clearer, I did not add this layer just
>>>>> for fun.
>>>>>
>>>>>
>>>>>>>    Actually I started up by use qmp_transaction as API, but soon
>>>>>>> found that work is almost done around BlkTransactionStates, so
>>>>>>> added a layer around it clearly.
>>>>
>>>> The qmp_transaction() implementation can be changed, I'm not saying you
>>>> have to hack in more if statements.  It's cleanest to introduce a
>>>> BdrvActionOps abstraction:
>>>>
>>>> typedef struct BdrvActionOps BdrvActionOps;
>>>> typedef struct BdrvTransactionState {
>>>>      const BdrvActionOps *ops;
>>>>      QLIST_ENTRY(BdrvTransactionState);
>>>> } BdrvTransactionState;
>>>>
>>>> struct BdrvActionOps {
>>>>      int (*prepare)(BdrvTransactionState *s, ...);
>>>>      int (*commit)(BdrvTransactionState *s, ...);
>>>>      int (*rollback)(BdrvTransactionState *s, ...);
>>>> };
>>>>
>>>> BdrvTransactionState *bdrv_transaction_create(BlockdevAction *action);
>>>>
>>>> Then qmp_transaction() can be generic code that steps through the
>>>> transactions.
>>>    With internal API, qmp_transaction can still be generic code with
>>> a translate from bdrv* to char* at caller level.
>>>
>>>    This is similar to what your series does and I think it's
>>>> the right direction.
>>>>
>>>> But please don't duplicate the qmp_transaction() and
>>>> BlockdevAction/BlockdevActionList APIs.  In other words, change the
>>>> engine, not the whole car.
>>>>
>>>> Stefan
>>>>
>>>
>>>    If my understanding is correct, the BdrvActionOps need to be extended
>>> as following:
>>> struct BdrvActionOps {
>>>       /* need following for callback functions */
>>>       const char *sn_name;
>>>       BlockDriverState *bs;
>>>       ...
>>>       int (*prepare)(BdrvTransactionState *s, ...);
>>>       int (*commit)(BdrvTransactionState *s, ...);
>>>       int (*rollback)(BdrvTransactionState *s, ...);
>>> };
>>> Or an opaque* should used for every BdrvActionOps.
>>
>> It is nice to keep *Ops structs read-only so they can be static const.
>> This way the ops are shared between all instances of the same action
>> type.  Also the function pointers can be in read-only memory pages,
>> which is a slight security win since it prevents memory corruption
>> exploits from taking advantage of function pointers to execute arbitrary
>> code.
>>
>    Seems good, I will package callback functions into *Ops, thanks.
>
>> In the pseudo-code I posted the sn_name or bs fields go into an
>> action-specific state struct:
>>
>> typedef struct {
>>      BdrvTransactionState common;
>>      char *backup_sn_name;
>> } InternalSnapshotTransactionState;
>>
>> typedef struct {
>>      BdrvTransactionState common;
>>      BlockDriverState *old_bs;
>>      BlockDriverState *new_bs;
>> } ExternalSnapshotTransactionState;
>>
>>> Comparation:
>>> The way above:
>>>   1) translate from BlockdevAction to BdrvTransactionState by
>>>      bdrv_transaction_create().
>>>   2) enqueue BdrvTransactionState by
>>>      some code.
>>>   3) execute them by
>>>      a new function, name it as BdrvActionOpsRun().
>>
>> If you include .prepare() in the transaction creation, then it becomes
>> simpler:
>>
>> states = []
>> for action in actions:
>>      result = bdrv_transaction_create(action)  # invokes .prepare()
>>      if result is error:
>>          for state in states:
>>         state.rollback()
>>     return
>>      states.append(result)
>> for state in states:
>>      state.commit()
>>
>> Because we don't wait until BdrvActionOpsRun() before processing the
>> transaction, there's no need to translate from BlockdevAction to
>> BdrvTransactionState.  The BdrvTransactionState struct really only has
>> state required to commit/rollback the transaction.
>>
>> (Even if it becomes necessary to keep information from BlockdevAction
>> after .prepare() returns, just keep a pointer to BlockdevAction.  Don't
>> duplicate it.)
>>
>    OK, *BlockdevAction plus *BlockDriverState and some other
> data used internal will be added in states.
>
>>> Internal API way:
>>>   1) translate BlockdevAction to BlkTransStates by
>>>      fill_blk_trs().
>>>   2) enqueue BlkTransStates to BlkTransStates by
>>>      add_transaction().
>>>   3) execute them by
>>>      submit_transaction().
>>>
>>>    It seems the way above will end as something like an internal
>>> layer, but without clear APIs tips what it is doing. Please reconsider
>>> the advantages about a clear internal API layer.
>>
>> I'm not convinced by the internal API approach.  It took me a while when
>> reviewing the code before I understood what was actually going on
>> because of the qmp_transaction() and BlockdevAction duplication code.
>>
>> I see the internal API approach as an unnecessary layer of indirection.
>> It makes the code more complicated to understand and maintain.  Next
>> time we add something to qmp_transaction() it would also be necessary to
>> duplicate that change for the internal API.  It creates unnecessary
>> work.
>>
>    Basic process is almost the same in two approaches, I'd like to
> adjust the code to avoid data duplication as much as possible, and
> see if some function can be removed when code keeps clear, in next
> version.
>
>> Just embrace QAPI, the point of it was to eliminate these external <->
>> internal translations we were doing all the time.
>>
>> Stefan
>>
>
>
Hi, Stefan
   I redesigned the structure, Following is the fake code:

typedef struct BdrvActionOps {
     /* check the request's validation, allocate p_opaque if needed */
     int (*check)(BlockdevAction *action, void **p_opaque, Error **errp);
     /* take the action */
     int (*submit)(BlockdevAction *action, void *opaque, Error **errp);
     /* update emulator */
     int (*commit)(BlockdevAction *action, void *opaque, Error **errp);
     /* cancel the action */
     int (*rollback)(BlockdevAction *action, void *opaque, Error **errp);
} BdrvActionOps;

typedef struct BlkTransactionStates {
     BlockdevAction *action;
     void *opaque;
     BdrvActionOps *ops;
     QSIMPLEQ_ENTRY(BlkTransactionStates) entry;
} BlkTransactionStates;

/* call ops->check and return state* to be enqueued */
static BlkTransactionStates *transaction_create(BlockdevAction *action,
                                                 Error **errp);

void qmp_transaction(BlockdevActionList *dev_list, Error **errp)
{
     BlockdevActionList *dev_entry = dev_list;
     BlkTransactionStates *state;

     while (NULL != dev_entry) {
         state = transaction_create(dev_entry->value, errp);
         /* enqueue */
         dev_entry = dev_entry->next;
     }

     /* use queue with submit, commit, rollback callback */
}


   In this way, parameter duplication is saved, but one problem remains:
parameter can't be hidden to user such as vm_state_size, but this would
not be a problem if hmp "savevm" use his own code about block snapshot
later, I mean not use qmp_transaction(). What do you think about the
design? Do you have a better way to solve this problem?
Stefan Hajnoczi - March 12, 2013, 3:43 p.m.
On Tue, Mar 12, 2013 at 04:30:41PM +0800, Wenchao Xia wrote:
> 于 2013-1-15 15:03, Wenchao Xia 写道:
> >于 2013-1-14 18:06, Stefan Hajnoczi 写道:
> >>On Mon, Jan 14, 2013 at 10:56:30AM +0800, Wenchao Xia wrote:
> >>>于 2013-1-11 17:12, Stefan Hajnoczi 写道:
> >>>>On Fri, Jan 11, 2013 at 02:22:28PM +0800, Wenchao Xia wrote:
> >>>>>于 2013-1-10 20:41, Stefan Hajnoczi 写道:
> >>>>>>On Thu, Jan 10, 2013 at 11:21:22AM +0800, Wenchao Xia wrote:
> >>>>>>>于 2013-1-9 20:44, Stefan Hajnoczi 写道:
> >>>>>>>>On Mon, Jan 07, 2013 at 03:28:06PM +0800, Wenchao Xia wrote:
> >>>>>>>>>   This patch switch to internal common API to take group external
> >>>>>>>>>snapshots from qmp_transaction interface. qmp layer simply does
> >>>>>>>>>a translation from user input.
> >>>>>>>>>
> >>>>>>>>>Signed-off-by: Wenchao Xia <xiawenc@linux.vnet.ibm.com>
> >>>>>>>>>---
> >>>>>>>>>  blockdev.c |  215
> >>>>>>>>>++++++++++++++++++++++++------------------------------------
> >>>>>>>>>  1 files changed, 87 insertions(+), 128 deletions(-)
> >>>>>>>>
> >>>>>>>>An internal API for snapshots is not necessary.
> >>>>>>>>qmp_transaction() is
> >>>>>>>>already usable both from the monitor and C code.
> >>>>>>>>
> >>>>>>>>The QAPI code generator creates structs that can be accessed
> >>>>>>>>directly
> >>>>>>>>from C.  qmp_transaction(), BlockdevAction, and
> >>>>>>>BlockdevActionList *is*
> >>>>>>>>the snapshot API.  It just doesn't support internal snapshots
> >>>>>>>>yet, which
> >>>>>>>>is what you are trying to add.
> >>>>>>>>
> >>>>>>>>To add internal snapshot support, define a
> >>>>>>>>BlockdevInternalSnapshot type
> >>>>>>>>in qapi-schema.json and add internal snapshot support in
> >>>>>>>>qmp_transaction().
> >>>>>>>>
> >>>>>>>>qmp_transaction() was designed with this in mind from the
> >>>>>>>>beginning and
> >>>>>>>>dispatches based on BlockdevAction->kind.
> >>>>>>>>
> >>>>>>>>The patch series will become much smaller while still adding
> >>>>>>>>internal
> >>>>>>>>snapshot support.
> >>>>>>>>
> >>>>>>>>Stefan
> >>>>>>>>
> >>>>>>>
> >>>>>>>   As API, qmp_transaction have following disadvantages:
> >>>>>>>1) interface is based on string not data type inside qemu, that
> >>>>>>>means
> >>>>>>>other function calling it result in: bdrv->string->bdrv
> >>>>>>
> >>>>>>Use bdrv_get_device_name().  You already need to fill in filename or
> >>>>>>snapshot name strings.  This is not a big disadvantage.
> >>>>>>
> >>>>>   Yes, not a big disadvantage, but why not save string operation but
> >>>>>use (bdrv*) as much as possible?
> >>>>>
> >>>>>what happens will be:
> >>>>>
> >>>>>hmp-snapshot
> >>>>>     |
> >>>>>qmp-snapshot
> >>>>>     |---------
> >>>>>              |
> >>>>>         qmp-transaction            savevm(may be other..)
> >>>>>              |----------------------|
> >>>>>                             |
> >>>>>               internal transaction layer
> >>>>
> >>>>Saving the string operation is not worth duplicating the API.
> >>>>
> >>>   I agree with you for this line:), but,  it is a weight on the balance
> >>>of choice, pls consider it together with issues below.
> >>>
> >>>>>>>2) all capability are forced to be exposed.
> >>>>>>
> >>>>>>Is there something you cannot expose?
> >>>>>>
> >>>>>   As other component in qemu can use it, some option may
> >>>>>be used only in qemu not to user. For eg, vm-state-size.
> >>>>
> >>>>When we hit a limitation of QAPI then it needs to be extended.  I'm
> >>>>sure
> >>>>there's a solution for splitting or hiding parts of the QAPI generated
> >>>>API.
> >>>>
> >>>   I can't think it out now, it seems to be a bit tricky.
> >>>
> >>>>>>>3) need structure to record each transaction state, such as
> >>>>>>>BlkTransactionStates. Extending it is equal to add an internal
> >>>>>>>layer.
> >>>>>>
> >>>>>>I agree that extending it is equal coding effort to adding an
> >>>>>>internal
> >>>>>>layer because you'll need to refactor qmp_transaction() a bit to
> >>>>>>really
> >>>>>>support additional action types.
> >>>>>>
> >>>>>>But it's the right thing to do.  Don't add unnecessary layers just
> >>>>>>because writing new code is more fun than extending existing code.
> >>>>>>
> >>>>>  If this layer is not added but depending only qmp_transaction, there
> >>>>>will be many "if else" fragment. I have tried that and the code
> >>>>>is awkful, this layer did not bring extra burden only make what
> >>>>>happens inside qmp_transaction clearer, I did not add this layer just
> >>>>>for fun.
> >>>>>
> >>>>>
> >>>>>>>   Actually I started up by use qmp_transaction as API, but soon
> >>>>>>>found that work is almost done around BlkTransactionStates, so
> >>>>>>>added a layer around it clearly.
> >>>>
> >>>>The qmp_transaction() implementation can be changed, I'm not saying you
> >>>>have to hack in more if statements.  It's cleanest to introduce a
> >>>>BdrvActionOps abstraction:
> >>>>
> >>>>typedef struct BdrvActionOps BdrvActionOps;
> >>>>typedef struct BdrvTransactionState {
> >>>>     const BdrvActionOps *ops;
> >>>>     QLIST_ENTRY(BdrvTransactionState);
> >>>>} BdrvTransactionState;
> >>>>
> >>>>struct BdrvActionOps {
> >>>>     int (*prepare)(BdrvTransactionState *s, ...);
> >>>>     int (*commit)(BdrvTransactionState *s, ...);
> >>>>     int (*rollback)(BdrvTransactionState *s, ...);
> >>>>};
> >>>>
> >>>>BdrvTransactionState *bdrv_transaction_create(BlockdevAction *action);
> >>>>
> >>>>Then qmp_transaction() can be generic code that steps through the
> >>>>transactions.
> >>>   With internal API, qmp_transaction can still be generic code with
> >>>a translate from bdrv* to char* at caller level.
> >>>
> >>>   This is similar to what your series does and I think it's
> >>>>the right direction.
> >>>>
> >>>>But please don't duplicate the qmp_transaction() and
> >>>>BlockdevAction/BlockdevActionList APIs.  In other words, change the
> >>>>engine, not the whole car.
> >>>>
> >>>>Stefan
> >>>>
> >>>
> >>>   If my understanding is correct, the BdrvActionOps need to be extended
> >>>as following:
> >>>struct BdrvActionOps {
> >>>      /* need following for callback functions */
> >>>      const char *sn_name;
> >>>      BlockDriverState *bs;
> >>>      ...
> >>>      int (*prepare)(BdrvTransactionState *s, ...);
> >>>      int (*commit)(BdrvTransactionState *s, ...);
> >>>      int (*rollback)(BdrvTransactionState *s, ...);
> >>>};
> >>>Or an opaque* should used for every BdrvActionOps.
> >>
> >>It is nice to keep *Ops structs read-only so they can be static const.
> >>This way the ops are shared between all instances of the same action
> >>type.  Also the function pointers can be in read-only memory pages,
> >>which is a slight security win since it prevents memory corruption
> >>exploits from taking advantage of function pointers to execute arbitrary
> >>code.
> >>
> >   Seems good, I will package callback functions into *Ops, thanks.
> >
> >>In the pseudo-code I posted the sn_name or bs fields go into an
> >>action-specific state struct:
> >>
> >>typedef struct {
> >>     BdrvTransactionState common;
> >>     char *backup_sn_name;
> >>} InternalSnapshotTransactionState;
> >>
> >>typedef struct {
> >>     BdrvTransactionState common;
> >>     BlockDriverState *old_bs;
> >>     BlockDriverState *new_bs;
> >>} ExternalSnapshotTransactionState;
> >>
> >>>Comparation:
> >>>The way above:
> >>>  1) translate from BlockdevAction to BdrvTransactionState by
> >>>     bdrv_transaction_create().
> >>>  2) enqueue BdrvTransactionState by
> >>>     some code.
> >>>  3) execute them by
> >>>     a new function, name it as BdrvActionOpsRun().
> >>
> >>If you include .prepare() in the transaction creation, then it becomes
> >>simpler:
> >>
> >>states = []
> >>for action in actions:
> >>     result = bdrv_transaction_create(action)  # invokes .prepare()
> >>     if result is error:
> >>         for state in states:
> >>        state.rollback()
> >>    return
> >>     states.append(result)
> >>for state in states:
> >>     state.commit()
> >>
> >>Because we don't wait until BdrvActionOpsRun() before processing the
> >>transaction, there's no need to translate from BlockdevAction to
> >>BdrvTransactionState.  The BdrvTransactionState struct really only has
> >>state required to commit/rollback the transaction.
> >>
> >>(Even if it becomes necessary to keep information from BlockdevAction
> >>after .prepare() returns, just keep a pointer to BlockdevAction.  Don't
> >>duplicate it.)
> >>
> >   OK, *BlockdevAction plus *BlockDriverState and some other
> >data used internal will be added in states.
> >
> >>>Internal API way:
> >>>  1) translate BlockdevAction to BlkTransStates by
> >>>     fill_blk_trs().
> >>>  2) enqueue BlkTransStates to BlkTransStates by
> >>>     add_transaction().
> >>>  3) execute them by
> >>>     submit_transaction().
> >>>
> >>>   It seems the way above will end as something like an internal
> >>>layer, but without clear APIs tips what it is doing. Please reconsider
> >>>the advantages about a clear internal API layer.
> >>
> >>I'm not convinced by the internal API approach.  It took me a while when
> >>reviewing the code before I understood what was actually going on
> >>because of the qmp_transaction() and BlockdevAction duplication code.
> >>
> >>I see the internal API approach as an unnecessary layer of indirection.
> >>It makes the code more complicated to understand and maintain.  Next
> >>time we add something to qmp_transaction() it would also be necessary to
> >>duplicate that change for the internal API.  It creates unnecessary
> >>work.
> >>
> >   Basic process is almost the same in two approaches, I'd like to
> >adjust the code to avoid data duplication as much as possible, and
> >see if some function can be removed when code keeps clear, in next
> >version.
> >
> >>Just embrace QAPI, the point of it was to eliminate these external <->
> >>internal translations we were doing all the time.
> >>
> >>Stefan
> >>
> >
> >
> Hi, Stefan
>   I redesigned the structure, Following is the fake code:
> 
> typedef struct BdrvActionOps {
>     /* check the request's validation, allocate p_opaque if needed */
>     int (*check)(BlockdevAction *action, void **p_opaque, Error **errp);
>     /* take the action */
>     int (*submit)(BlockdevAction *action, void *opaque, Error **errp);
>     /* update emulator */
>     int (*commit)(BlockdevAction *action, void *opaque, Error **errp);
>     /* cancel the action */
>     int (*rollback)(BlockdevAction *action, void *opaque, Error **errp);
> } BdrvActionOps;
> 
> typedef struct BlkTransactionStates {
>     BlockdevAction *action;
>     void *opaque;
>     BdrvActionOps *ops;
>     QSIMPLEQ_ENTRY(BlkTransactionStates) entry;
> } BlkTransactionStates;
> 
> /* call ops->check and return state* to be enqueued */
> static BlkTransactionStates *transaction_create(BlockdevAction *action,
>                                                 Error **errp);
> 
> void qmp_transaction(BlockdevActionList *dev_list, Error **errp)
> {
>     BlockdevActionList *dev_entry = dev_list;
>     BlkTransactionStates *state;
> 
>     while (NULL != dev_entry) {
>         state = transaction_create(dev_entry->value, errp);
>         /* enqueue */
>         dev_entry = dev_entry->next;
>     }
> 
>     /* use queue with submit, commit, rollback callback */
> }
> 
> 
>   In this way, parameter duplication is saved, but one problem remains:
> parameter can't be hidden to user such as vm_state_size, but this would
> not be a problem if hmp "savevm" use his own code about block snapshot
> later, I mean not use qmp_transaction(). What do you think about the
> design? Do you have a better way to solve this problem?

Can you explain the vm_state_size problem again?  Sorry I forgot - I
think it had something to do with having an internal parameter in the
action that should not be exposed via QMP/HMP?

Stefan
Wayne Xia - March 13, 2013, 1:36 a.m.
于 2013-3-12 23:43, Stefan Hajnoczi 写道:
> On Tue, Mar 12, 2013 at 04:30:41PM +0800, Wenchao Xia wrote:
>> 于 2013-1-15 15:03, Wenchao Xia 写道:
>>> 于 2013-1-14 18:06, Stefan Hajnoczi 写道:
>>>> On Mon, Jan 14, 2013 at 10:56:30AM +0800, Wenchao Xia wrote:
>>>>> 于 2013-1-11 17:12, Stefan Hajnoczi 写道:
>>>>>> On Fri, Jan 11, 2013 at 02:22:28PM +0800, Wenchao Xia wrote:
>>>>>>> 于 2013-1-10 20:41, Stefan Hajnoczi 写道:
>>>>>>>> On Thu, Jan 10, 2013 at 11:21:22AM +0800, Wenchao Xia wrote:
>>>>>>>>> 于 2013-1-9 20:44, Stefan Hajnoczi 写道:
>>>>>>>>>> On Mon, Jan 07, 2013 at 03:28:06PM +0800, Wenchao Xia wrote:
>>>>>>>>>>>    This patch switch to internal common API to take group external
>>>>>>>>>>> snapshots from qmp_transaction interface. qmp layer simply does
>>>>>>>>>>> a translation from user input.
>>>>>>>>>>>
>>>>>>>>>>> Signed-off-by: Wenchao Xia <xiawenc@linux.vnet.ibm.com>
>>>>>>>>>>> ---
>>>>>>>>>>>   blockdev.c |  215
>>>>>>>>>>> ++++++++++++++++++++++++------------------------------------
>>>>>>>>>>>   1 files changed, 87 insertions(+), 128 deletions(-)
>>>>>>>>>>
>>>>>>>>>> An internal API for snapshots is not necessary.
>>>>>>>>>> qmp_transaction() is
>>>>>>>>>> already usable both from the monitor and C code.
>>>>>>>>>>
>>>>>>>>>> The QAPI code generator creates structs that can be accessed
>>>>>>>>>> directly
>>>>>>>>> >from C.  qmp_transaction(), BlockdevAction, and
>>>>>>>>> BlockdevActionList *is*
>>>>>>>>>> the snapshot API.  It just doesn't support internal snapshots
>>>>>>>>>> yet, which
>>>>>>>>>> is what you are trying to add.
>>>>>>>>>>
>>>>>>>>>> To add internal snapshot support, define a
>>>>>>>>>> BlockdevInternalSnapshot type
>>>>>>>>>> in qapi-schema.json and add internal snapshot support in
>>>>>>>>>> qmp_transaction().
>>>>>>>>>>
>>>>>>>>>> qmp_transaction() was designed with this in mind from the
>>>>>>>>>> beginning and
>>>>>>>>>> dispatches based on BlockdevAction->kind.
>>>>>>>>>>
>>>>>>>>>> The patch series will become much smaller while still adding
>>>>>>>>>> internal
>>>>>>>>>> snapshot support.
>>>>>>>>>>
>>>>>>>>>> Stefan
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>>    As API, qmp_transaction have following disadvantages:
>>>>>>>>> 1) interface is based on string not data type inside qemu, that
>>>>>>>>> means
>>>>>>>>> other function calling it result in: bdrv->string->bdrv
>>>>>>>>
>>>>>>>> Use bdrv_get_device_name().  You already need to fill in filename or
>>>>>>>> snapshot name strings.  This is not a big disadvantage.
>>>>>>>>
>>>>>>>    Yes, not a big disadvantage, but why not save string operation but
>>>>>>> use (bdrv*) as much as possible?
>>>>>>>
>>>>>>> what happens will be:
>>>>>>>
>>>>>>> hmp-snapshot
>>>>>>>      |
>>>>>>> qmp-snapshot
>>>>>>>      |---------
>>>>>>>               |
>>>>>>>          qmp-transaction            savevm(may be other..)
>>>>>>>               |----------------------|
>>>>>>>                              |
>>>>>>>                internal transaction layer
>>>>>>
>>>>>> Saving the string operation is not worth duplicating the API.
>>>>>>
>>>>>    I agree with you for this line:), but,  it is a weight on the balance
>>>>> of choice, pls consider it together with issues below.
>>>>>
>>>>>>>>> 2) all capability are forced to be exposed.
>>>>>>>>
>>>>>>>> Is there something you cannot expose?
>>>>>>>>
>>>>>>>    As other component in qemu can use it, some option may
>>>>>>> be used only in qemu not to user. For eg, vm-state-size.
>>>>>>
>>>>>> When we hit a limitation of QAPI then it needs to be extended.  I'm
>>>>>> sure
>>>>>> there's a solution for splitting or hiding parts of the QAPI generated
>>>>>> API.
>>>>>>
>>>>>    I can't think it out now, it seems to be a bit tricky.
>>>>>
>>>>>>>>> 3) need structure to record each transaction state, such as
>>>>>>>>> BlkTransactionStates. Extending it is equal to add an internal
>>>>>>>>> layer.
>>>>>>>>
>>>>>>>> I agree that extending it is equal coding effort to adding an
>>>>>>>> internal
>>>>>>>> layer because you'll need to refactor qmp_transaction() a bit to
>>>>>>>> really
>>>>>>>> support additional action types.
>>>>>>>>
>>>>>>>> But it's the right thing to do.  Don't add unnecessary layers just
>>>>>>>> because writing new code is more fun than extending existing code.
>>>>>>>>
>>>>>>>   If this layer is not added but depending only qmp_transaction, there
>>>>>>> will be many "if else" fragment. I have tried that and the code
>>>>>>> is awkful, this layer did not bring extra burden only make what
>>>>>>> happens inside qmp_transaction clearer, I did not add this layer just
>>>>>>> for fun.
>>>>>>>
>>>>>>>
>>>>>>>>>    Actually I started up by use qmp_transaction as API, but soon
>>>>>>>>> found that work is almost done around BlkTransactionStates, so
>>>>>>>>> added a layer around it clearly.
>>>>>>
>>>>>> The qmp_transaction() implementation can be changed, I'm not saying you
>>>>>> have to hack in more if statements.  It's cleanest to introduce a
>>>>>> BdrvActionOps abstraction:
>>>>>>
>>>>>> typedef struct BdrvActionOps BdrvActionOps;
>>>>>> typedef struct BdrvTransactionState {
>>>>>>      const BdrvActionOps *ops;
>>>>>>      QLIST_ENTRY(BdrvTransactionState);
>>>>>> } BdrvTransactionState;
>>>>>>
>>>>>> struct BdrvActionOps {
>>>>>>      int (*prepare)(BdrvTransactionState *s, ...);
>>>>>>      int (*commit)(BdrvTransactionState *s, ...);
>>>>>>      int (*rollback)(BdrvTransactionState *s, ...);
>>>>>> };
>>>>>>
>>>>>> BdrvTransactionState *bdrv_transaction_create(BlockdevAction *action);
>>>>>>
>>>>>> Then qmp_transaction() can be generic code that steps through the
>>>>>> transactions.
>>>>>    With internal API, qmp_transaction can still be generic code with
>>>>> a translate from bdrv* to char* at caller level.
>>>>>
>>>>>    This is similar to what your series does and I think it's
>>>>>> the right direction.
>>>>>>
>>>>>> But please don't duplicate the qmp_transaction() and
>>>>>> BlockdevAction/BlockdevActionList APIs.  In other words, change the
>>>>>> engine, not the whole car.
>>>>>>
>>>>>> Stefan
>>>>>>
>>>>>
>>>>>    If my understanding is correct, the BdrvActionOps need to be extended
>>>>> as following:
>>>>> struct BdrvActionOps {
>>>>>       /* need following for callback functions */
>>>>>       const char *sn_name;
>>>>>       BlockDriverState *bs;
>>>>>       ...
>>>>>       int (*prepare)(BdrvTransactionState *s, ...);
>>>>>       int (*commit)(BdrvTransactionState *s, ...);
>>>>>       int (*rollback)(BdrvTransactionState *s, ...);
>>>>> };
>>>>> Or an opaque* should used for every BdrvActionOps.
>>>>
>>>> It is nice to keep *Ops structs read-only so they can be static const.
>>>> This way the ops are shared between all instances of the same action
>>>> type.  Also the function pointers can be in read-only memory pages,
>>>> which is a slight security win since it prevents memory corruption
>>>> exploits from taking advantage of function pointers to execute arbitrary
>>>> code.
>>>>
>>>    Seems good, I will package callback functions into *Ops, thanks.
>>>
>>>> In the pseudo-code I posted the sn_name or bs fields go into an
>>>> action-specific state struct:
>>>>
>>>> typedef struct {
>>>>      BdrvTransactionState common;
>>>>      char *backup_sn_name;
>>>> } InternalSnapshotTransactionState;
>>>>
>>>> typedef struct {
>>>>      BdrvTransactionState common;
>>>>      BlockDriverState *old_bs;
>>>>      BlockDriverState *new_bs;
>>>> } ExternalSnapshotTransactionState;
>>>>
>>>>> Comparation:
>>>>> The way above:
>>>>>   1) translate from BlockdevAction to BdrvTransactionState by
>>>>>      bdrv_transaction_create().
>>>>>   2) enqueue BdrvTransactionState by
>>>>>      some code.
>>>>>   3) execute them by
>>>>>      a new function, name it as BdrvActionOpsRun().
>>>>
>>>> If you include .prepare() in the transaction creation, then it becomes
>>>> simpler:
>>>>
>>>> states = []
>>>> for action in actions:
>>>>      result = bdrv_transaction_create(action)  # invokes .prepare()
>>>>      if result is error:
>>>>          for state in states:
>>>>         state.rollback()
>>>>     return
>>>>      states.append(result)
>>>> for state in states:
>>>>      state.commit()
>>>>
>>>> Because we don't wait until BdrvActionOpsRun() before processing the
>>>> transaction, there's no need to translate from BlockdevAction to
>>>> BdrvTransactionState.  The BdrvTransactionState struct really only has
>>>> state required to commit/rollback the transaction.
>>>>
>>>> (Even if it becomes necessary to keep information from BlockdevAction
>>>> after .prepare() returns, just keep a pointer to BlockdevAction.  Don't
>>>> duplicate it.)
>>>>
>>>    OK, *BlockdevAction plus *BlockDriverState and some other
>>> data used internal will be added in states.
>>>
>>>>> Internal API way:
>>>>>   1) translate BlockdevAction to BlkTransStates by
>>>>>      fill_blk_trs().
>>>>>   2) enqueue BlkTransStates to BlkTransStates by
>>>>>      add_transaction().
>>>>>   3) execute them by
>>>>>      submit_transaction().
>>>>>
>>>>>    It seems the way above will end as something like an internal
>>>>> layer, but without clear APIs tips what it is doing. Please reconsider
>>>>> the advantages about a clear internal API layer.
>>>>
>>>> I'm not convinced by the internal API approach.  It took me a while when
>>>> reviewing the code before I understood what was actually going on
>>>> because of the qmp_transaction() and BlockdevAction duplication code.
>>>>
>>>> I see the internal API approach as an unnecessary layer of indirection.
>>>> It makes the code more complicated to understand and maintain.  Next
>>>> time we add something to qmp_transaction() it would also be necessary to
>>>> duplicate that change for the internal API.  It creates unnecessary
>>>> work.
>>>>
>>>    Basic process is almost the same in two approaches, I'd like to
>>> adjust the code to avoid data duplication as much as possible, and
>>> see if some function can be removed when code keeps clear, in next
>>> version.
>>>
>>>> Just embrace QAPI, the point of it was to eliminate these external <->
>>>> internal translations we were doing all the time.
>>>>
>>>> Stefan
>>>>
>>>
>>>
>> Hi, Stefan
>>    I redesigned the structure, Following is the fake code:
>>
>> typedef struct BdrvActionOps {
>>      /* check the request's validation, allocate p_opaque if needed */
>>      int (*check)(BlockdevAction *action, void **p_opaque, Error **errp);
>>      /* take the action */
>>      int (*submit)(BlockdevAction *action, void *opaque, Error **errp);
>>      /* update emulator */
>>      int (*commit)(BlockdevAction *action, void *opaque, Error **errp);
>>      /* cancel the action */
>>      int (*rollback)(BlockdevAction *action, void *opaque, Error **errp);
>> } BdrvActionOps;
>>
>> typedef struct BlkTransactionStates {
>>      BlockdevAction *action;
>>      void *opaque;
>>      BdrvActionOps *ops;
>>      QSIMPLEQ_ENTRY(BlkTransactionStates) entry;
>> } BlkTransactionStates;
>>
>> /* call ops->check and return state* to be enqueued */
>> static BlkTransactionStates *transaction_create(BlockdevAction *action,
>>                                                  Error **errp);
>>
>> void qmp_transaction(BlockdevActionList *dev_list, Error **errp)
>> {
>>      BlockdevActionList *dev_entry = dev_list;
>>      BlkTransactionStates *state;
>>
>>      while (NULL != dev_entry) {
>>          state = transaction_create(dev_entry->value, errp);
>>          /* enqueue */
>>          dev_entry = dev_entry->next;
>>      }
>>
>>      /* use queue with submit, commit, rollback callback */
>> }
>>
>>
>>    In this way, parameter duplication is saved, but one problem remains:
>> parameter can't be hidden to user such as vm_state_size, but this would
>> not be a problem if hmp "savevm" use his own code about block snapshot
>> later, I mean not use qmp_transaction(). What do you think about the
>> design? Do you have a better way to solve this problem?
> 
> Can you explain the vm_state_size problem again?  Sorry I forgot - I
> think it had something to do with having an internal parameter in the
> action that should not be exposed via QMP/HMP?
> 
  Yep, this parameter will be used by qemu "live savevm" code later,
but should not expose it to user in qmp interface.

> Stefan
>
Stefan Hajnoczi - March 13, 2013, 8:42 a.m.
On Wed, Mar 13, 2013 at 09:36:06AM +0800, Wenchao Xia wrote:
> 于 2013-3-12 23:43, Stefan Hajnoczi 写道:
> > On Tue, Mar 12, 2013 at 04:30:41PM +0800, Wenchao Xia wrote:
> >> 于 2013-1-15 15:03, Wenchao Xia 写道:
> >>> 于 2013-1-14 18:06, Stefan Hajnoczi 写道:
> >>>> On Mon, Jan 14, 2013 at 10:56:30AM +0800, Wenchao Xia wrote:
> >>>>> 于 2013-1-11 17:12, Stefan Hajnoczi 写道:
> >>>>>> On Fri, Jan 11, 2013 at 02:22:28PM +0800, Wenchao Xia wrote:
> >>>>>>> 于 2013-1-10 20:41, Stefan Hajnoczi 写道:
> >>>>>>>> On Thu, Jan 10, 2013 at 11:21:22AM +0800, Wenchao Xia wrote:
> >>>>>>>>> 于 2013-1-9 20:44, Stefan Hajnoczi 写道:
> >>>>>>>>>> On Mon, Jan 07, 2013 at 03:28:06PM +0800, Wenchao Xia wrote:
> >>>>>>>>>>>    This patch switch to internal common API to take group external
> >>>>>>>>>>> snapshots from qmp_transaction interface. qmp layer simply does
> >>>>>>>>>>> a translation from user input.
> >>>>>>>>>>>
> >>>>>>>>>>> Signed-off-by: Wenchao Xia <xiawenc@linux.vnet.ibm.com>
> >>>>>>>>>>> ---
> >>>>>>>>>>>   blockdev.c |  215
> >>>>>>>>>>> ++++++++++++++++++++++++------------------------------------
> >>>>>>>>>>>   1 files changed, 87 insertions(+), 128 deletions(-)
> >>>>>>>>>>
> >>>>>>>>>> An internal API for snapshots is not necessary.
> >>>>>>>>>> qmp_transaction() is
> >>>>>>>>>> already usable both from the monitor and C code.
> >>>>>>>>>>
> >>>>>>>>>> The QAPI code generator creates structs that can be accessed
> >>>>>>>>>> directly
> >>>>>>>>> >from C.  qmp_transaction(), BlockdevAction, and
> >>>>>>>>> BlockdevActionList *is*
> >>>>>>>>>> the snapshot API.  It just doesn't support internal snapshots
> >>>>>>>>>> yet, which
> >>>>>>>>>> is what you are trying to add.
> >>>>>>>>>>
> >>>>>>>>>> To add internal snapshot support, define a
> >>>>>>>>>> BlockdevInternalSnapshot type
> >>>>>>>>>> in qapi-schema.json and add internal snapshot support in
> >>>>>>>>>> qmp_transaction().
> >>>>>>>>>>
> >>>>>>>>>> qmp_transaction() was designed with this in mind from the
> >>>>>>>>>> beginning and
> >>>>>>>>>> dispatches based on BlockdevAction->kind.
> >>>>>>>>>>
> >>>>>>>>>> The patch series will become much smaller while still adding
> >>>>>>>>>> internal
> >>>>>>>>>> snapshot support.
> >>>>>>>>>>
> >>>>>>>>>> Stefan
> >>>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>    As API, qmp_transaction have following disadvantages:
> >>>>>>>>> 1) interface is based on string not data type inside qemu, that
> >>>>>>>>> means
> >>>>>>>>> other function calling it result in: bdrv->string->bdrv
> >>>>>>>>
> >>>>>>>> Use bdrv_get_device_name().  You already need to fill in filename or
> >>>>>>>> snapshot name strings.  This is not a big disadvantage.
> >>>>>>>>
> >>>>>>>    Yes, not a big disadvantage, but why not save string operation but
> >>>>>>> use (bdrv*) as much as possible?
> >>>>>>>
> >>>>>>> what happens will be:
> >>>>>>>
> >>>>>>> hmp-snapshot
> >>>>>>>      |
> >>>>>>> qmp-snapshot
> >>>>>>>      |---------
> >>>>>>>               |
> >>>>>>>          qmp-transaction            savevm(may be other..)
> >>>>>>>               |----------------------|
> >>>>>>>                              |
> >>>>>>>                internal transaction layer
> >>>>>>
> >>>>>> Saving the string operation is not worth duplicating the API.
> >>>>>>
> >>>>>    I agree with you for this line:), but,  it is a weight on the balance
> >>>>> of choice, pls consider it together with issues below.
> >>>>>
> >>>>>>>>> 2) all capability are forced to be exposed.
> >>>>>>>>
> >>>>>>>> Is there something you cannot expose?
> >>>>>>>>
> >>>>>>>    As other component in qemu can use it, some option may
> >>>>>>> be used only in qemu not to user. For eg, vm-state-size.
> >>>>>>
> >>>>>> When we hit a limitation of QAPI then it needs to be extended.  I'm
> >>>>>> sure
> >>>>>> there's a solution for splitting or hiding parts of the QAPI generated
> >>>>>> API.
> >>>>>>
> >>>>>    I can't think it out now, it seems to be a bit tricky.
> >>>>>
> >>>>>>>>> 3) need structure to record each transaction state, such as
> >>>>>>>>> BlkTransactionStates. Extending it is equal to add an internal
> >>>>>>>>> layer.
> >>>>>>>>
> >>>>>>>> I agree that extending it is equal coding effort to adding an
> >>>>>>>> internal
> >>>>>>>> layer because you'll need to refactor qmp_transaction() a bit to
> >>>>>>>> really
> >>>>>>>> support additional action types.
> >>>>>>>>
> >>>>>>>> But it's the right thing to do.  Don't add unnecessary layers just
> >>>>>>>> because writing new code is more fun than extending existing code.
> >>>>>>>>
> >>>>>>>   If this layer is not added but depending only qmp_transaction, there
> >>>>>>> will be many "if else" fragment. I have tried that and the code
> >>>>>>> is awkful, this layer did not bring extra burden only make what
> >>>>>>> happens inside qmp_transaction clearer, I did not add this layer just
> >>>>>>> for fun.
> >>>>>>>
> >>>>>>>
> >>>>>>>>>    Actually I started up by use qmp_transaction as API, but soon
> >>>>>>>>> found that work is almost done around BlkTransactionStates, so
> >>>>>>>>> added a layer around it clearly.
> >>>>>>
> >>>>>> The qmp_transaction() implementation can be changed, I'm not saying you
> >>>>>> have to hack in more if statements.  It's cleanest to introduce a
> >>>>>> BdrvActionOps abstraction:
> >>>>>>
> >>>>>> typedef struct BdrvActionOps BdrvActionOps;
> >>>>>> typedef struct BdrvTransactionState {
> >>>>>>      const BdrvActionOps *ops;
> >>>>>>      QLIST_ENTRY(BdrvTransactionState);
> >>>>>> } BdrvTransactionState;
> >>>>>>
> >>>>>> struct BdrvActionOps {
> >>>>>>      int (*prepare)(BdrvTransactionState *s, ...);
> >>>>>>      int (*commit)(BdrvTransactionState *s, ...);
> >>>>>>      int (*rollback)(BdrvTransactionState *s, ...);
> >>>>>> };
> >>>>>>
> >>>>>> BdrvTransactionState *bdrv_transaction_create(BlockdevAction *action);
> >>>>>>
> >>>>>> Then qmp_transaction() can be generic code that steps through the
> >>>>>> transactions.
> >>>>>    With internal API, qmp_transaction can still be generic code with
> >>>>> a translate from bdrv* to char* at caller level.
> >>>>>
> >>>>>    This is similar to what your series does and I think it's
> >>>>>> the right direction.
> >>>>>>
> >>>>>> But please don't duplicate the qmp_transaction() and
> >>>>>> BlockdevAction/BlockdevActionList APIs.  In other words, change the
> >>>>>> engine, not the whole car.
> >>>>>>
> >>>>>> Stefan
> >>>>>>
> >>>>>
> >>>>>    If my understanding is correct, the BdrvActionOps need to be extended
> >>>>> as following:
> >>>>> struct BdrvActionOps {
> >>>>>       /* need following for callback functions */
> >>>>>       const char *sn_name;
> >>>>>       BlockDriverState *bs;
> >>>>>       ...
> >>>>>       int (*prepare)(BdrvTransactionState *s, ...);
> >>>>>       int (*commit)(BdrvTransactionState *s, ...);
> >>>>>       int (*rollback)(BdrvTransactionState *s, ...);
> >>>>> };
> >>>>> Or an opaque* should used for every BdrvActionOps.
> >>>>
> >>>> It is nice to keep *Ops structs read-only so they can be static const.
> >>>> This way the ops are shared between all instances of the same action
> >>>> type.  Also the function pointers can be in read-only memory pages,
> >>>> which is a slight security win since it prevents memory corruption
> >>>> exploits from taking advantage of function pointers to execute arbitrary
> >>>> code.
> >>>>
> >>>    Seems good, I will package callback functions into *Ops, thanks.
> >>>
> >>>> In the pseudo-code I posted the sn_name or bs fields go into an
> >>>> action-specific state struct:
> >>>>
> >>>> typedef struct {
> >>>>      BdrvTransactionState common;
> >>>>      char *backup_sn_name;
> >>>> } InternalSnapshotTransactionState;
> >>>>
> >>>> typedef struct {
> >>>>      BdrvTransactionState common;
> >>>>      BlockDriverState *old_bs;
> >>>>      BlockDriverState *new_bs;
> >>>> } ExternalSnapshotTransactionState;
> >>>>
> >>>>> Comparation:
> >>>>> The way above:
> >>>>>   1) translate from BlockdevAction to BdrvTransactionState by
> >>>>>      bdrv_transaction_create().
> >>>>>   2) enqueue BdrvTransactionState by
> >>>>>      some code.
> >>>>>   3) execute them by
> >>>>>      a new function, name it as BdrvActionOpsRun().
> >>>>
> >>>> If you include .prepare() in the transaction creation, then it becomes
> >>>> simpler:
> >>>>
> >>>> states = []
> >>>> for action in actions:
> >>>>      result = bdrv_transaction_create(action)  # invokes .prepare()
> >>>>      if result is error:
> >>>>          for state in states:
> >>>>         state.rollback()
> >>>>     return
> >>>>      states.append(result)
> >>>> for state in states:
> >>>>      state.commit()
> >>>>
> >>>> Because we don't wait until BdrvActionOpsRun() before processing the
> >>>> transaction, there's no need to translate from BlockdevAction to
> >>>> BdrvTransactionState.  The BdrvTransactionState struct really only has
> >>>> state required to commit/rollback the transaction.
> >>>>
> >>>> (Even if it becomes necessary to keep information from BlockdevAction
> >>>> after .prepare() returns, just keep a pointer to BlockdevAction.  Don't
> >>>> duplicate it.)
> >>>>
> >>>    OK, *BlockdevAction plus *BlockDriverState and some other
> >>> data used internal will be added in states.
> >>>
> >>>>> Internal API way:
> >>>>>   1) translate BlockdevAction to BlkTransStates by
> >>>>>      fill_blk_trs().
> >>>>>   2) enqueue BlkTransStates to BlkTransStates by
> >>>>>      add_transaction().
> >>>>>   3) execute them by
> >>>>>      submit_transaction().
> >>>>>
> >>>>>    It seems the way above will end as something like an internal
> >>>>> layer, but without clear APIs tips what it is doing. Please reconsider
> >>>>> the advantages about a clear internal API layer.
> >>>>
> >>>> I'm not convinced by the internal API approach.  It took me a while when
> >>>> reviewing the code before I understood what was actually going on
> >>>> because of the qmp_transaction() and BlockdevAction duplication code.
> >>>>
> >>>> I see the internal API approach as an unnecessary layer of indirection.
> >>>> It makes the code more complicated to understand and maintain.  Next
> >>>> time we add something to qmp_transaction() it would also be necessary to
> >>>> duplicate that change for the internal API.  It creates unnecessary
> >>>> work.
> >>>>
> >>>    Basic process is almost the same in two approaches, I'd like to
> >>> adjust the code to avoid data duplication as much as possible, and
> >>> see if some function can be removed when code keeps clear, in next
> >>> version.
> >>>
> >>>> Just embrace QAPI, the point of it was to eliminate these external <->
> >>>> internal translations we were doing all the time.
> >>>>
> >>>> Stefan
> >>>>
> >>>
> >>>
> >> Hi, Stefan
> >>    I redesigned the structure, Following is the fake code:
> >>
> >> typedef struct BdrvActionOps {
> >>      /* check the request's validation, allocate p_opaque if needed */
> >>      int (*check)(BlockdevAction *action, void **p_opaque, Error **errp);
> >>      /* take the action */
> >>      int (*submit)(BlockdevAction *action, void *opaque, Error **errp);
> >>      /* update emulator */
> >>      int (*commit)(BlockdevAction *action, void *opaque, Error **errp);
> >>      /* cancel the action */
> >>      int (*rollback)(BlockdevAction *action, void *opaque, Error **errp);
> >> } BdrvActionOps;
> >>
> >> typedef struct BlkTransactionStates {
> >>      BlockdevAction *action;
> >>      void *opaque;
> >>      BdrvActionOps *ops;
> >>      QSIMPLEQ_ENTRY(BlkTransactionStates) entry;
> >> } BlkTransactionStates;
> >>
> >> /* call ops->check and return state* to be enqueued */
> >> static BlkTransactionStates *transaction_create(BlockdevAction *action,
> >>                                                  Error **errp);
> >>
> >> void qmp_transaction(BlockdevActionList *dev_list, Error **errp)
> >> {
> >>      BlockdevActionList *dev_entry = dev_list;
> >>      BlkTransactionStates *state;
> >>
> >>      while (NULL != dev_entry) {
> >>          state = transaction_create(dev_entry->value, errp);
> >>          /* enqueue */
> >>          dev_entry = dev_entry->next;
> >>      }
> >>
> >>      /* use queue with submit, commit, rollback callback */
> >> }
> >>
> >>
> >>    In this way, parameter duplication is saved, but one problem remains:
> >> parameter can't be hidden to user such as vm_state_size, but this would
> >> not be a problem if hmp "savevm" use his own code about block snapshot
> >> later, I mean not use qmp_transaction(). What do you think about the
> >> design? Do you have a better way to solve this problem?
> > 
> > Can you explain the vm_state_size problem again?  Sorry I forgot - I
> > think it had something to do with having an internal parameter in the
> > action that should not be exposed via QMP/HMP?
> > 
>   Yep, this parameter will be used by qemu "live savevm" code later,
> but should not expose it to user in qmp interface.

The simplest solution is:

void bdrv_transaction(...)
{
    /* Common code here */
}

void qmp_transaction(...)
{
    /* Reject optional internal fields here */
    if (opts->has_vm_state_size) {
        error_setg(errp, "internal field vm_state_size must not be set");
	return;
    }

    bdrv_transaction(...);
}

If transaction is used from inside QEMU with vm_state_size, then call
bdrv_transaction() instead of qmp_transaction() to skip the public field
checks.

My second idea is to add QAPI syntax to make a field as "private" or
"internal".  The marshalling code will skip or return an error if the
field is set.

This is a little nicer since all callers use qmp_transaction() but we're
guaranteed that external JSON cannot make use of the field.

Stefan
Kevin Wolf - March 13, 2013, 10:18 a.m.
Am 12.03.2013 um 09:30 hat Wenchao Xia geschrieben:
>   I redesigned the structure, Following is the fake code:
> 
> typedef struct BdrvActionOps {
>     /* check the request's validation, allocate p_opaque if needed */
>     int (*check)(BlockdevAction *action, void **p_opaque, Error **errp);
>     /* take the action */
>     int (*submit)(BlockdevAction *action, void *opaque, Error **errp);
>     /* update emulator */
>     int (*commit)(BlockdevAction *action, void *opaque, Error **errp);
>     /* cancel the action */
>     int (*rollback)(BlockdevAction *action, void *opaque, Error **errp);
> } BdrvActionOps;

Why do you need the split of prepare into check/submit?

If you have prepare/commit/abort, everybody will recognise this as the
standard transaction pattern because this is just how it's done.
Deviating from it needs a good justification in my opinion.

Kevin
Wayne Xia - March 14, 2013, 5:08 a.m.
于 2013-3-13 18:18, Kevin Wolf 写道:
> Am 12.03.2013 um 09:30 hat Wenchao Xia geschrieben:
>>    I redesigned the structure, Following is the fake code:
>>
>> typedef struct BdrvActionOps {
>>      /* check the request's validation, allocate p_opaque if needed */
>>      int (*check)(BlockdevAction *action, void **p_opaque, Error **errp);
>>      /* take the action */
>>      int (*submit)(BlockdevAction *action, void *opaque, Error **errp);
>>      /* update emulator */
>>      int (*commit)(BlockdevAction *action, void *opaque, Error **errp);
>>      /* cancel the action */
>>      int (*rollback)(BlockdevAction *action, void *opaque, Error **errp);
>> } BdrvActionOps;
>
> Why do you need the split of prepare into check/submit?
>
> If you have prepare/commit/abort, everybody will recognise this as the
> standard transaction pattern because this is just how it's done.
> Deviating from it needs a good justification in my opinion.
>
> Kevin
>

   My thought is rejecting the request in *check if parameter invalid
before take any action, while submit do the real action, to reduce
the chance to of rolling back when some request not valid in the batch.
Kevin Wolf - March 14, 2013, 8:22 a.m.
Am 14.03.2013 um 06:08 hat Wenchao Xia geschrieben:
> 于 2013-3-13 18:18, Kevin Wolf 写道:
> >Am 12.03.2013 um 09:30 hat Wenchao Xia geschrieben:
> >>   I redesigned the structure, Following is the fake code:
> >>
> >>typedef struct BdrvActionOps {
> >>     /* check the request's validation, allocate p_opaque if needed */
> >>     int (*check)(BlockdevAction *action, void **p_opaque, Error **errp);
> >>     /* take the action */
> >>     int (*submit)(BlockdevAction *action, void *opaque, Error **errp);
> >>     /* update emulator */
> >>     int (*commit)(BlockdevAction *action, void *opaque, Error **errp);
> >>     /* cancel the action */
> >>     int (*rollback)(BlockdevAction *action, void *opaque, Error **errp);
> >>} BdrvActionOps;
> >
> >Why do you need the split of prepare into check/submit?
> >
> >If you have prepare/commit/abort, everybody will recognise this as the
> >standard transaction pattern because this is just how it's done.
> >Deviating from it needs a good justification in my opinion.
> >
> >Kevin
> >
> 
>   My thought is rejecting the request in *check if parameter invalid
> before take any action, while submit do the real action, to reduce
> the chance to of rolling back when some request not valid in the batch.

Okay, so it's not strictly needed, but an optimisation of the error
case?

Does it work well when the transaction includes an operation that
depends on the previous one, like create a snapshot and then do
something with this snapshot?

Anyway, even if we think it works and is worth the effort to optimise
such error cases, please use names that are consistent with the
transactions used for reopening: (check/)prepare/commit/abort.

Kevin
Wayne Xia - March 18, 2013, 10 a.m.
于 2013-3-14 16:22, Kevin Wolf 写道:
> Am 14.03.2013 um 06:08 hat Wenchao Xia geschrieben:
>> 于 2013-3-13 18:18, Kevin Wolf 写道:
>>> Am 12.03.2013 um 09:30 hat Wenchao Xia geschrieben:
>>>>    I redesigned the structure, Following is the fake code:
>>>>
>>>> typedef struct BdrvActionOps {
>>>>      /* check the request's validation, allocate p_opaque if needed */
>>>>      int (*check)(BlockdevAction *action, void **p_opaque, Error **errp);
>>>>      /* take the action */
>>>>      int (*submit)(BlockdevAction *action, void *opaque, Error **errp);
>>>>      /* update emulator */
>>>>      int (*commit)(BlockdevAction *action, void *opaque, Error **errp);
>>>>      /* cancel the action */
>>>>      int (*rollback)(BlockdevAction *action, void *opaque, Error **errp);
>>>> } BdrvActionOps;
>>>
>>> Why do you need the split of prepare into check/submit?
>>>
>>> If you have prepare/commit/abort, everybody will recognise this as the
>>> standard transaction pattern because this is just how it's done.
>>> Deviating from it needs a good justification in my opinion.
>>>
>>> Kevin
>>>
>>
>>    My thought is rejecting the request in *check if parameter invalid
>> before take any action, while submit do the real action, to reduce
>> the chance to of rolling back when some request not valid in the batch.
>
> Okay, so it's not strictly needed, but an optimisation of the error
> case?
>
> Does it work well when the transaction includes an operation that
> depends on the previous one, like create a snapshot and then do
> something with this snapshot?
>
   This seems to complex, since prepare of all actions are executed
in first round, they may interrupt each other later. So I am
thinking make it more clear as complete one job one time, which
may change the old qmp_transcation() logic a little.

> Anyway, even if we think it works and is worth the effort to optimise
> such error cases, please use names that are consistent with the
> transactions used for reopening: (check/)prepare/commit/abort.
>
   In above way check/prepare can be merged, how do you think of it?

> Kevin
>

Patch

diff --git a/blockdev.c b/blockdev.c
index 668506d..299039f 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -1173,157 +1173,116 @@  void qmp_blockdev_snapshot_sync(const char *device, const char *snapshot_file,
                        errp);
 }
 
-
-/* New and old BlockDriverState structs for group snapshots */
-typedef struct BlkTransactionStates {
-    BlockDriverState *old_bs;
-    BlockDriverState *new_bs;
-    QSIMPLEQ_ENTRY(BlkTransactionStates) entry;
-} BlkTransactionStates;
-
-/*
- * 'Atomic' group snapshots.  The snapshots are taken as a set, and if any fail
- *  then we do not pivot any of the devices in the group, and abandon the
- *  snapshots
- */
-void qmp_transaction(BlockdevActionList *dev_list, Error **errp)
+/* translation from qmp commands */
+static int fill_blk_trs_ext_create_sync(BlockdevSnapshot *create_sync,
+                                        BlkTransStatesSync *st_sync,
+                                        Error **errp)
 {
-    int ret = 0;
-    BlockdevActionList *dev_entry = dev_list;
-    BlkTransactionStates *states, *next;
-    Error *local_err = NULL;
-
-    QSIMPLEQ_HEAD(snap_bdrv_states, BlkTransactionStates) snap_bdrv_states;
-    QSIMPLEQ_INIT(&snap_bdrv_states);
+    const char *format = "qcow2";
+    enum NewImageMode mode = NEW_IMAGE_MODE_ABSOLUTE_PATHS;
+    BlockDriverState *bs;
+    const char *device = create_sync->device;
+    const char *name = create_sync->snapshot_file;
 
-    /* drain all i/o before any snapshots */
-    bdrv_drain_all();
+    if (create_sync->has_mode) {
+        mode = create_sync->mode;
+    }
+    if (create_sync->has_format) {
+        format = create_sync->format;
+    }
 
-    /* We don't do anything in this loop that commits us to the snapshot */
-    while (NULL != dev_entry) {
-        BlockdevAction *dev_info = NULL;
-        BlockDriver *proto_drv;
-        BlockDriver *drv;
-        int flags;
-        enum NewImageMode mode;
-        const char *new_image_file;
-        const char *device;
-        const char *format = "qcow2";
+    /* find the target bs */
+    bs = bdrv_find(device);
+    if (!bs) {
+        error_set(errp, QERR_DEVICE_NOT_FOUND, device);
+        return -1;
+    }
 
-        dev_info = dev_entry->value;
-        dev_entry = dev_entry->next;
+    switch (mode) {
+    case NEW_IMAGE_MODE_EXISTING:
+        st_sync->use_existing = true;
+        break;
+    case NEW_IMAGE_MODE_ABSOLUTE_PATHS:
+        st_sync->use_existing = false;
+        break;
+    default:
+        error_setg(errp, "Device %s requested invalid snapshot"
+                         " mode %d.", device, mode);
+        return -1;
+    }
 
-        states = g_malloc0(sizeof(BlkTransactionStates));
-        QSIMPLEQ_INSERT_TAIL(&snap_bdrv_states, states, entry);
+    st_sync->external.new_image_file = name;
+    st_sync->external.format = format;
+    st_sync->external.old_bs = bs;
 
-        switch (dev_info->kind) {
-        case BLOCKDEV_ACTION_KIND_BLOCKDEV_SNAPSHOT_SYNC:
-            device = dev_info->blockdev_snapshot_sync->device;
-            if (!dev_info->blockdev_snapshot_sync->has_mode) {
-                dev_info->blockdev_snapshot_sync->mode = NEW_IMAGE_MODE_ABSOLUTE_PATHS;
-            }
-            new_image_file = dev_info->blockdev_snapshot_sync->snapshot_file;
-            if (dev_info->blockdev_snapshot_sync->has_format) {
-                format = dev_info->blockdev_snapshot_sync->format;
-            }
-            mode = dev_info->blockdev_snapshot_sync->mode;
-            break;
-        default:
-            abort();
-        }
+    return 0;
+}
 
-        drv = bdrv_find_format(format);
-        if (!drv) {
-            error_set(errp, QERR_INVALID_BLOCK_FORMAT, format);
-            goto delete_and_fail;
-        }
+static int fill_blk_trs(BlockdevAction *dev_info,
+                        BlkTransStates *states,
+                        SNTime *time,
+                        const char *time_str,
+                        Error **errp)
+{
+    int ret = 0;
 
-        states->old_bs = bdrv_find(device);
-        if (!states->old_bs) {
-            error_set(errp, QERR_DEVICE_NOT_FOUND, device);
-            goto delete_and_fail;
-        }
+    switch (dev_info->kind) {
+    case BLOCKDEV_ACTION_KIND_BLOCKDEV_SNAPSHOT_SYNC:
+        states->st_sync.type = BLK_SNAPSHOT_EXTERNAL;
+        states->st_sync.op = BLK_SN_SYNC_CREATE;
+        ret = fill_blk_trs_ext_create_sync(dev_info->blockdev_snapshot_sync,
+                                           &states->st_sync,
+                                           errp);
+        break;
+    default:
+        abort();
+    }
 
-        if (!bdrv_is_inserted(states->old_bs)) {
-            error_set(errp, QERR_DEVICE_HAS_NO_MEDIUM, device);
-            goto delete_and_fail;
-        }
+    return ret;
+}
 
-        if (bdrv_in_use(states->old_bs)) {
-            error_set(errp, QERR_DEVICE_IN_USE, device);
-            goto delete_and_fail;
-        }
+/* Here this funtion prepare the request list, submit for atomic snapshot. */
+void qmp_transaction(BlockdevActionList *dev_list, Error **errp)
+{
+    BlockdevActionList *dev_entry = dev_list;
+    BlkTransStates *states;
+    int ret;
 
-        if (!bdrv_is_read_only(states->old_bs)) {
-            if (bdrv_flush(states->old_bs)) {
-                error_set(errp, QERR_IO_ERROR);
-                goto delete_and_fail;
-            }
-        }
+    BlkTransStatesList *snap_bdrv_states = blk_trans_st_list_new();
 
-        flags = states->old_bs->open_flags;
+    /* translate qmp request */
+    /* for group snapshot create we use same time stamp here */
+    SNTime time = get_sn_time();
+    char time_str[256];
+    generate_sn_name_from_time(&time, time_str, sizeof(time_str));
+    while (NULL != dev_entry) {
+        BlockdevAction *dev_info = NULL;
 
-        proto_drv = bdrv_find_protocol(new_image_file);
-        if (!proto_drv) {
-            error_set(errp, QERR_INVALID_BLOCK_FORMAT, format);
-            goto delete_and_fail;
-        }
+        dev_info = dev_entry->value;
+        dev_entry = dev_entry->next;
 
-        /* create new image w/backing file */
-        if (mode != NEW_IMAGE_MODE_EXISTING) {
-            bdrv_img_create(new_image_file, format,
-                            states->old_bs->filename,
-                            states->old_bs->drv->format_name,
-                            NULL, -1, flags, &local_err);
-            if (error_is_set(&local_err)) {
-                error_propagate(errp, local_err);
-                goto delete_and_fail;
-            }
+        states = blk_trans_st_new();
+        ret = fill_blk_trs(dev_info, states, &time, time_str, errp);
+        if (ret < 0) {
+            blk_trans_st_delete(&states);
+            goto exit;
         }
 
-        /* We will manually add the backing_hd field to the bs later */
-        states->new_bs = bdrv_new("");
-        ret = bdrv_open(states->new_bs, new_image_file,
-                        flags | BDRV_O_NO_BACKING, drv);
-        if (ret != 0) {
-            error_set(errp, QERR_OPEN_FILE_FAILED, new_image_file);
-            goto delete_and_fail;
+        ret = add_transaction(snap_bdrv_states, states, errp);
+        if (ret < 0) {
+            blk_trans_st_delete(&states);
+            goto exit;
         }
     }
 
+    /* submit to internal API, no need to check return for no following
+       action now. */
+    submit_transaction(snap_bdrv_states, errp);
 
-    /* Now we are going to do the actual pivot.  Everything up to this point
-     * is reversible, but we are committed at this point */
-    QSIMPLEQ_FOREACH(states, &snap_bdrv_states, entry) {
-        /* This removes our old bs from the bdrv_states, and adds the new bs */
-        bdrv_append(states->new_bs, states->old_bs);
-        /* We don't need (or want) to use the transactional
-         * bdrv_reopen_multiple() across all the entries at once, because we
-         * don't want to abort all of them if one of them fails the reopen */
-        bdrv_reopen(states->new_bs, states->new_bs->open_flags & ~BDRV_O_RDWR,
-                    NULL);
-    }
-
-    /* success */
-    goto exit;
-
-delete_and_fail:
-    /*
-    * failure, and it is all-or-none; abandon each new bs, and keep using
-    * the original bs for all images
-    */
-    QSIMPLEQ_FOREACH(states, &snap_bdrv_states, entry) {
-        if (states->new_bs) {
-             bdrv_delete(states->new_bs);
-        }
-    }
 exit:
-    QSIMPLEQ_FOREACH_SAFE(states, &snap_bdrv_states, entry, next) {
-        g_free(states);
-    }
+    blk_trans_st_list_delete(&snap_bdrv_states);
 }
 
-
 static void eject_device(BlockDriverState *bs, int force, Error **errp)
 {
     if (bdrv_in_use(bs)) {