diff mbox series

[v5,30/31] crypto: delegate permission functions to JobDriver .pre_run

Message ID 20211124064418.3120601-31-eesposit@redhat.com
State New
Headers show
Series block layer: split block APIs in global state and I/O | expand

Commit Message

Emanuele Giuseppe Esposito Nov. 24, 2021, 6:44 a.m. UTC
block_crypto_amend_options_generic_luks uses the block layer
permission API, therefore it should be called with the BQL held.

However, the same function is being called ib two BlockDriver
callbacks: bdrv_amend_options (under BQL) and bdrv_co_amend (I/O).

The latter is I/O because it is invoked by block/amend.c's
blockdev_amend_run(), a .run callback of the amend JobDriver

Therefore we want to 1) change block_crypto_amend_options_generic_luks
to use the permission API only when the BQL is held, and
2) use the .pre_run JobDriver callback to check for
permissions before switching to the job aiocontext. This has also
the benefit of applying the same permission operation to all
amend implementations, not only luks.

Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
---
 block/amend.c  | 20 ++++++++++++++++++++
 block/crypto.c | 18 ++++++++++++------
 2 files changed, 32 insertions(+), 6 deletions(-)

Comments

Hanna Czenczek Dec. 17, 2021, 12:29 p.m. UTC | #1
On 24.11.21 07:44, Emanuele Giuseppe Esposito wrote:
> block_crypto_amend_options_generic_luks uses the block layer
> permission API, therefore it should be called with the BQL held.
>
> However, the same function is being called ib two BlockDriver

s/ ib / by /

> callbacks: bdrv_amend_options (under BQL) and bdrv_co_amend (I/O).
>
> The latter is I/O because it is invoked by block/amend.c's
> blockdev_amend_run(), a .run callback of the amend JobDriver
>
> Therefore we want to 1) change block_crypto_amend_options_generic_luks
> to use the permission API only when the BQL is held, and
> 2) use the .pre_run JobDriver callback to check for
> permissions before switching to the job aiocontext. This has also
> the benefit of applying the same permission operation to all
> amend implementations, not only luks.
>
> Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
> ---
>   block/amend.c  | 20 ++++++++++++++++++++
>   block/crypto.c | 18 ++++++++++++------
>   2 files changed, 32 insertions(+), 6 deletions(-)
>
> diff --git a/block/amend.c b/block/amend.c
> index 392df9ef83..fba6add51a 100644
> --- a/block/amend.c
> +++ b/block/amend.c
> @@ -53,10 +53,30 @@ static int coroutine_fn blockdev_amend_run(Job *job, Error **errp)
>       return ret;
>   }
>   
> +static int blockdev_amend_refresh_perms(Job *job, Error **errp)
> +{
> +    BlockdevAmendJob *s = container_of(job, BlockdevAmendJob, common);
> +
> +    return bdrv_child_refresh_perms(s->bs, s->bs->file, errp);
> +}

I miss some documentation for this function, why we do it and how it 
works together with the bdrv_co_amend implementation.

I was trying to come up with an example text, but then I wondered – how 
does it actually work?  bdrv_child_refresh_perms() eventually ends up in 
block_crypto_child_perms().  However, that will only return exceptional 
permissions if crypto->updating_keys is true. But that’s set only in 
block_crypto_amend_options_generic_luks() – i.e. when the job runs.  
That’s exactly why that function calls bdrv_child_refresh_perms() only 
after it has modified crypto->updating_keys.

Reproducer (amend on a LUKS image with read-only=true, so it doesn’t 
have the WRITE permission continuously, but needs to take it as an 
exception in block_crypto_child_perms()):

$ qemu-img create \
     -f luks \
     --object secret,id=sec0,data=123456 \
     -o key-secret=sec0 \
     test.luks \
     64M
Formatting 'test.luks', fmt=luks size=67108864 key-secret=sec0

$ ./qemu-system-x86_64 \
     -object secret,id=sec0,data=123456 \
     -object iothread,id=iothr0 \
     -blockdev file,node-name=node0,filename=test.luks \
     -blockdev 
luks,node-name=node1,key-secret=sec0,file=node0,read-only=true \
     -device virtio-blk,drive=node1,iothread=iothr0 -qmp stdio \
     <<EOF
{"execute": "qmp_capabilities"}
{
     "execute": "x-blockdev-amend",
     "arguments": {
         "job-id": "amend0",
         "node-name": "node1",
         "options": {
             "driver": "luks",
             "state": "active",
             "new-secret": "sec0"
         }
     }
}
EOF

{"QMP": {"version": {"qemu": {"micro": 93, "minor": 1, "major": 6}, 
"package": "v6.2.0-rc3-50-gdb635fc4e7"}, "capabilities": ["oob"]}}
{"return": {}}
{"timestamp": {"seconds": 1639742600, "microseconds": 574641}, "event": 
"JOB_STATUS_CHANGE", "data": {"status": "created", "id": "amend0"}}
{"timestamp": {"seconds": 1639742600, "microseconds": 574919}, "event": 
"JOB_STATUS_CHANGE", "data": {"status": "running", "id": "amend0"}}
{"return": {}}
qemu-system-x86_64: ../block/io.c:2041: bdrv_co_write_req_prepare: 
Assertion `child->perm & BLK_PERM_WRITE' failed.
[1]    55880 IOT instruction (core dumped)  ./qemu-system-x86_64 -object 
secret,id=sec0,data=123456 -object  -blockdev


I believe this means we need some new block driver function to prepare 
for an amendment operation.  If so, another question comes up, which is 
whether this preparatory function should then also call 
bdrv_child_refresh_perms(), and then whether we should have a clean-up 
function for symmetry.

> +
> +static int blockdev_amend_pre_run(Job *job, Error **errp)
> +{
> +    return blockdev_amend_refresh_perms(job, errp);
> +}
> +
> +static void blockdev_amend_clean(Job *job)
> +{
> +    Error *errp;
> +    blockdev_amend_refresh_perms(job, &errp);

Do we really want to ignore this error?  If so, we shouldn’t pass a 
pointer to an unused local variable, but NULL.

If we don’t want to ignore it, we have the option of doing what you do 
here and then at least reporting a potential error with 
error_report_err(), and then freeing it, and we also must initialize 
errp to NULL in this case.

If we expect no error to happen (e.g. because we require the amend 
implementation to only release/share permissions and not acquire/unshare 
them), then I’d expect passing &error_abort here.

> +}
> +
>   static const JobDriver blockdev_amend_job_driver = {
>       .instance_size = sizeof(BlockdevAmendJob),
>       .job_type      = JOB_TYPE_AMEND,
>       .run           = blockdev_amend_run,
> +    .pre_run       = blockdev_amend_pre_run,
> +    .clean         = blockdev_amend_clean,
>   };
>   
>   void qmp_x_blockdev_amend(const char *job_id,
> diff --git a/block/crypto.c b/block/crypto.c
> index c8ba4681e2..82f154516c 100644
> --- a/block/crypto.c
> +++ b/block/crypto.c
> @@ -780,6 +780,7 @@ block_crypto_get_specific_info_luks(BlockDriverState *bs, Error **errp)
>   static int
>   block_crypto_amend_options_generic_luks(BlockDriverState *bs,
>                                           QCryptoBlockAmendOptions *amend_options,
> +                                        bool under_bql,

This name makes sense in the context of this series, but not so much 
outside of it.

I’d rename it to e.g. “in_amend_job” (and invert its value), and then 
explain that we don’t need to refresh the child permissions when running 
in an amend job, because that job has already taken care of that.

OTOH, given that I believe we need some separate preparatory function 
anyway, perhaps we should just pull out the bdrv_child_refresh_perms() 
from this function altogether, so that we have:

block_crypto_amend_options_luks():

/* sets updating_keys to true, and invokes bdrv_child_refresh_perms() */
block_crypto_amend_options_prepare();
block_crypto_amend_options_generic_luks();
/* sets updating_keys to false, and invokes bdrv_child_refresh_perms() */
block_crypto_amend_options_clean();


block_crypto_co_amend_luks():

/* No need to prepare or clean up, that is taken care of by the amend job */
block_crypto_amend_options_generic_luks();


(If we decide not to put bdrv_child_refresh_perms() into 
prepare()/clean(), then it would need to be called by 
block_crypto_amend_options_luks(); and if we decide not to have a 
block_crypto_amend_options_clean(), then we’d need to inline it fully.)

Hanna

>                                           bool force,
>                                           Error **errp)
>   {
> @@ -791,9 +792,12 @@ block_crypto_amend_options_generic_luks(BlockDriverState *bs,
>   
>       /* apply for exclusive read/write permissions to the underlying file*/
>       crypto->updating_keys = true;
> -    ret = bdrv_child_refresh_perms(bs, bs->file, errp);
> -    if (ret) {
> -        goto cleanup;
> +
> +    if (under_bql) {
> +        ret = bdrv_child_refresh_perms(bs, bs->file, errp);
> +        if (ret) {
> +            goto cleanup;
> +        }
>       }
>   
>       ret = qcrypto_block_amend_options(crypto->block,
> @@ -806,7 +810,9 @@ block_crypto_amend_options_generic_luks(BlockDriverState *bs,
>   cleanup:
>       /* release exclusive read/write permissions to the underlying file*/
>       crypto->updating_keys = false;
> -    bdrv_child_refresh_perms(bs, bs->file, errp);
> +    if (under_bql) {
> +        bdrv_child_refresh_perms(bs, bs->file, errp);
> +    }
>       return ret;
>   }
>   
> @@ -834,7 +840,7 @@ block_crypto_amend_options_luks(BlockDriverState *bs,
>           goto cleanup;
>       }
>       ret = block_crypto_amend_options_generic_luks(bs, amend_options,
> -                                                  force, errp);
> +                                                  true, force, errp);
>   cleanup:
>       qapi_free_QCryptoBlockAmendOptions(amend_options);
>       return ret;
> @@ -853,7 +859,7 @@ coroutine_fn block_crypto_co_amend_luks(BlockDriverState *bs,
>           .u.luks = *qapi_BlockdevAmendOptionsLUKS_base(&opts->u.luks),
>       };
>       return block_crypto_amend_options_generic_luks(bs, &amend_opts,
> -                                                   force, errp);
> +                                                   false, force, errp);
>   }
>   
>   static void
Hanna Czenczek Dec. 17, 2021, 2:32 p.m. UTC | #2
On 17.12.21 13:29, Hanna Reitz wrote:
> On 24.11.21 07:44, Emanuele Giuseppe Esposito wrote:
>> block_crypto_amend_options_generic_luks uses the block layer
>> permission API, therefore it should be called with the BQL held.
>>
>> However, the same function is being called ib two BlockDriver
>
> s/ ib / by /
>
>> callbacks: bdrv_amend_options (under BQL) and bdrv_co_amend (I/O).
>>
>> The latter is I/O because it is invoked by block/amend.c's
>> blockdev_amend_run(), a .run callback of the amend JobDriver
>>
>> Therefore we want to 1) change block_crypto_amend_options_generic_luks
>> to use the permission API only when the BQL is held, and
>> 2) use the .pre_run JobDriver callback to check for
>> permissions before switching to the job aiocontext. This has also
>> the benefit of applying the same permission operation to all
>> amend implementations, not only luks.
>>
>> Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
>> ---
>>   block/amend.c  | 20 ++++++++++++++++++++
>>   block/crypto.c | 18 ++++++++++++------
>>   2 files changed, 32 insertions(+), 6 deletions(-)
>>
>> diff --git a/block/amend.c b/block/amend.c
>> index 392df9ef83..fba6add51a 100644
>> --- a/block/amend.c
>> +++ b/block/amend.c
>> @@ -53,10 +53,30 @@ static int coroutine_fn blockdev_amend_run(Job 
>> *job, Error **errp)
>>       return ret;
>>   }
>>   +static int blockdev_amend_refresh_perms(Job *job, Error **errp)
>> +{
>> +    BlockdevAmendJob *s = container_of(job, BlockdevAmendJob, common);
>> +
>> +    return bdrv_child_refresh_perms(s->bs, s->bs->file, errp);
>> +}
>
> I miss some documentation for this function, why we do it and how it 
> works together with the bdrv_co_amend implementation.
>
> I was trying to come up with an example text, but then I wondered – 
> how does it actually work?  bdrv_child_refresh_perms() eventually ends 
> up in block_crypto_child_perms().  However, that will only return 
> exceptional permissions if crypto->updating_keys is true. But that’s 
> set only in block_crypto_amend_options_generic_luks() – i.e. when the 
> job runs.  That’s exactly why that function calls 
> bdrv_child_refresh_perms() only after it has modified 
> crypto->updating_keys.
>
> Reproducer (amend on a LUKS image with read-only=true, so it doesn’t 
> have the WRITE permission continuously, but needs to take it as an 
> exception in block_crypto_child_perms()):
>
> $ qemu-img create \
>     -f luks \
>     --object secret,id=sec0,data=123456 \
>     -o key-secret=sec0 \
>     test.luks \
>     64M
> Formatting 'test.luks', fmt=luks size=67108864 key-secret=sec0
>
> $ ./qemu-system-x86_64 \
>     -object secret,id=sec0,data=123456 \
>     -object iothread,id=iothr0 \
>     -blockdev file,node-name=node0,filename=test.luks \
>     -blockdev 
> luks,node-name=node1,key-secret=sec0,file=node0,read-only=true \
>     -device virtio-blk,drive=node1,iothread=iothr0 -qmp stdio \
>     <<EOF
> {"execute": "qmp_capabilities"}
> {
>     "execute": "x-blockdev-amend",
>     "arguments": {
>         "job-id": "amend0",
>         "node-name": "node1",
>         "options": {
>             "driver": "luks",
>             "state": "active",
>             "new-secret": "sec0"
>         }
>     }
> }
> EOF
>
> {"QMP": {"version": {"qemu": {"micro": 93, "minor": 1, "major": 6}, 
> "package": "v6.2.0-rc3-50-gdb635fc4e7"}, "capabilities": ["oob"]}}
> {"return": {}}
> {"timestamp": {"seconds": 1639742600, "microseconds": 574641}, 
> "event": "JOB_STATUS_CHANGE", "data": {"status": "created", "id": 
> "amend0"}}
> {"timestamp": {"seconds": 1639742600, "microseconds": 574919}, 
> "event": "JOB_STATUS_CHANGE", "data": {"status": "running", "id": 
> "amend0"}}
> {"return": {}}
> qemu-system-x86_64: ../block/io.c:2041: bdrv_co_write_req_prepare: 
> Assertion `child->perm & BLK_PERM_WRITE' failed.
> [1]    55880 IOT instruction (core dumped)  ./qemu-system-x86_64 
> -object secret,id=sec0,data=123456 -object  -blockdev

Addendum: Running the iotests, I realized that (because 295 and 296 fail 
for luks) of course it doesn’t matter whether the job runs in the main 
loop or not in order to reproduce this assertion failure, so the 
`-object iothread,id=iothr0` and `-device 
virtio-blk,drive=node1,iothread=iothr0` can be dropped from the qemu 
command line.

Hanna
Emanuele Giuseppe Esposito Dec. 20, 2021, 3:47 p.m. UTC | #3
On 17/12/2021 13:29, Hanna Reitz wrote:
> On 24.11.21 07:44, Emanuele Giuseppe Esposito wrote:
>> block_crypto_amend_options_generic_luks uses the block layer
>> permission API, therefore it should be called with the BQL held.
>>
>> However, the same function is being called ib two BlockDriver
> 
> s/ ib / by /
> 
>> callbacks: bdrv_amend_options (under BQL) and bdrv_co_amend (I/O).
>>
>> The latter is I/O because it is invoked by block/amend.c's
>> blockdev_amend_run(), a .run callback of the amend JobDriver
>>
>> Therefore we want to 1) change block_crypto_amend_options_generic_luks
>> to use the permission API only when the BQL is held, and
>> 2) use the .pre_run JobDriver callback to check for
>> permissions before switching to the job aiocontext. This has also
>> the benefit of applying the same permission operation to all
>> amend implementations, not only luks.
>>
>> Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
>> ---
>>   block/amend.c  | 20 ++++++++++++++++++++
>>   block/crypto.c | 18 ++++++++++++------
>>   2 files changed, 32 insertions(+), 6 deletions(-)
>>
>> diff --git a/block/amend.c b/block/amend.c
>> index 392df9ef83..fba6add51a 100644
>> --- a/block/amend.c
>> +++ b/block/amend.c
>> @@ -53,10 +53,30 @@ static int coroutine_fn blockdev_amend_run(Job 
>> *job, Error **errp)
>>       return ret;
>>   }
>> +static int blockdev_amend_refresh_perms(Job *job, Error **errp)
>> +{
>> +    BlockdevAmendJob *s = container_of(job, BlockdevAmendJob, common);
>> +
>> +    return bdrv_child_refresh_perms(s->bs, s->bs->file, errp);
>> +}
> 
> I miss some documentation for this function, why we do it and how it 
> works together with the bdrv_co_amend implementation.

> 
> I was trying to come up with an example text, but then I wondered – how 
> does it actually work?  bdrv_child_refresh_perms() eventually ends up in 
> block_crypto_child_perms().  However, that will only return exceptional 
> permissions if crypto->updating_keys is true. But that’s set only in 
> block_crypto_amend_options_generic_luks() – i.e. when the job runs. 
> That’s exactly why that function calls bdrv_child_refresh_perms() only 
> after it has modified crypto->updating_keys.
> 
> Reproducer (amend on a LUKS image with read-only=true, so it doesn’t 
> have the WRITE permission continuously, but needs to take it as an 
> exception in block_crypto_child_perms()):
> 
> $ qemu-img create \
>      -f luks \
>      --object secret,id=sec0,data=123456 \
>      -o key-secret=sec0 \
>      test.luks \
>      64M
> Formatting 'test.luks', fmt=luks size=67108864 key-secret=sec0
> 
> $ ./qemu-system-x86_64 \
>      -object secret,id=sec0,data=123456 \
>      -object iothread,id=iothr0 \
>      -blockdev file,node-name=node0,filename=test.luks \
>      -blockdev 
> luks,node-name=node1,key-secret=sec0,file=node0,read-only=true \
>      -device virtio-blk,drive=node1,iothread=iothr0 -qmp stdio \
>      <<EOF
> {"execute": "qmp_capabilities"}
> {
>      "execute": "x-blockdev-amend",
>      "arguments": {
>          "job-id": "amend0",
>          "node-name": "node1",
>          "options": {
>              "driver": "luks",
>              "state": "active",
>              "new-secret": "sec0"
>          }
>      }
> }
> EOF
> 
> {"QMP": {"version": {"qemu": {"micro": 93, "minor": 1, "major": 6}, 
> "package": "v6.2.0-rc3-50-gdb635fc4e7"}, "capabilities": ["oob"]}}
> {"return": {}}
> {"timestamp": {"seconds": 1639742600, "microseconds": 574641}, "event": 
> "JOB_STATUS_CHANGE", "data": {"status": "created", "id": "amend0"}}
> {"timestamp": {"seconds": 1639742600, "microseconds": 574919}, "event": 
> "JOB_STATUS_CHANGE", "data": {"status": "running", "id": "amend0"}}
> {"return": {}}
> qemu-system-x86_64: ../block/io.c:2041: bdrv_co_write_req_prepare: 
> Assertion `child->perm & BLK_PERM_WRITE' failed.
> [1]    55880 IOT instruction (core dumped)  ./qemu-system-x86_64 -object 
> secret,id=sec0,data=123456 -object  -blockdev
> 
> 
> I believe this means we need some new block driver function to prepare 
> for an amendment operation.  If so, another question comes up, which is 
> whether this preparatory function should then also call 
> bdrv_child_refresh_perms(), and then whether we should have a clean-up 
> function for symmetry.
> 

Yes, unfortunately it means that (see at the end of the mail for more).

I think it does not work because of crypto->updating_keys missing in 
blockdev_amend_pre_run(). That is why the permission is not correctly 
set and the example fails.


>> +
>> +static int blockdev_amend_pre_run(Job *job, Error **errp)
>> +{
>> +    return blockdev_amend_refresh_perms(job, errp);
>> +}
>> +
>> +static void blockdev_amend_clean(Job *job)
>> +{
>> +    Error *errp;
>> +    blockdev_amend_refresh_perms(job, &errp);
> 
> Do we really want to ignore this error?  If so, we shouldn’t pass a 
> pointer to an unused local variable, but NULL.
> 
> If we don’t want to ignore it, we have the option of doing what you do 
> here and then at least reporting a potential error with 
> error_report_err(), and then freeing it, and we also must initialize 
> errp to NULL in this case.

Going with this one above, thanks.
> 
> If we expect no error to happen (e.g. because we require the amend 
> implementation to only release/share permissions and not acquire/unshare 
> them), then I’d expect passing &error_abort here.
> 
>> +}
>> +
>>   static const JobDriver blockdev_amend_job_driver = {
>>       .instance_size = sizeof(BlockdevAmendJob),
>>       .job_type      = JOB_TYPE_AMEND,
>>       .run           = blockdev_amend_run,
>> +    .pre_run       = blockdev_amend_pre_run,
>> +    .clean         = blockdev_amend_clean,
>>   };
>>   void qmp_x_blockdev_amend(const char *job_id,
>> diff --git a/block/crypto.c b/block/crypto.c
>> index c8ba4681e2..82f154516c 100644
>> --- a/block/crypto.c
>> +++ b/block/crypto.c
>> @@ -780,6 +780,7 @@ 
>> block_crypto_get_specific_info_luks(BlockDriverState *bs, Error **errp)
>>   static int
>>   block_crypto_amend_options_generic_luks(BlockDriverState *bs,
>>                                           QCryptoBlockAmendOptions 
>> *amend_options,
>> +                                        bool under_bql,
> 
> This name makes sense in the context of this series, but not so much 
> outside of it.
> 
> I’d rename it to e.g. “in_amend_job” (and invert its value), and then 
> explain that we don’t need to refresh the child permissions when running 
> in an amend job, because that job has already taken care of that.
> 
> OTOH, given that I believe we need some separate preparatory function 
> anyway, perhaps we should just pull out the bdrv_child_refresh_perms() 
> from this function altogether, so that we have:
> 
> block_crypto_amend_options_luks():
> 
> /* sets updating_keys to true, and invokes bdrv_child_refresh_perms() */
> block_crypto_amend_options_prepare();
> block_crypto_amend_options_generic_luks();
> /* sets updating_keys to false, and invokes bdrv_child_refresh_perms() */
> block_crypto_amend_options_clean();
> 
> 
> block_crypto_co_amend_luks():
> 
> /* No need to prepare or clean up, that is taken care of by the amend 
> job */
> block_crypto_amend_options_generic_luks();
> 
> 
> (If we decide not to put bdrv_child_refresh_perms() into 
> prepare()/clean(), then it would need to be called by 
> block_crypto_amend_options_luks(); and if we decide not to have a 
> block_crypto_amend_options_clean(), then we’d need to inline it fully.)

So a couple of things I will change (according with your feedbacks):

- Remove the assertion job->aio_context == qemu_in_main_thread() done in 
job_co_entry, as it is wrong. I don't know why I added that, but we 
cannot assume that job->run() always run in the main context, because 
the job aiocontext can be different. I don't think there is a test doing 
that now, but it is possible. If run() was in the main context, then 
bdrv_co_amend (called only in blockdev_amend_run) would be GS too, but 
it isn't, also according with your comment in v4:

"[...] .bdrv_co_amend very much strikes me like a GS function, but
it isn’t.  I’m afraid it must work on nodes that are not in the main
context, and it launches a job, so AFAIU we absolutely cannot run it
under the BQL."

- Introduce block_crypto_amend_options_prepare and 
block_crypto_amend_options_clean, as you suggested above. These fix the 
GS call stack of block_crypto_amend_options_generic_luks()

- Introduce .bdrv_pre_run() and .bdrv_cleanup(), respectively called by 
.job_pre_run() and .job_cleanup(). The reason is that we need to set 
crypto->updating_keys, otherwise the job amend won't temporary give the 
write permission so the example above would fail.

So for the I/O callstack of block_crypto_amend_options_generic_luks() we 
will have:
job->pre_run():
	.bdrv_pre_run();
		crypto->update_keys = true;
	blockdev_amend_refresh_perms()

job->run():
	block_crypto_amend_options_generic_luks()

job->cleanup():
	.bdrv_cleanup();
		crypto->update_keys = false;
	blockdev_amend_refresh_perms()

Thank you,
Emanuele

> 
> Hanna
> 
>>                                           bool force,
>>                                           Error **errp)
>>   {
>> @@ -791,9 +792,12 @@ 
>> block_crypto_amend_options_generic_luks(BlockDriverState *bs,
>>       /* apply for exclusive read/write permissions to the underlying 
>> file*/
>>       crypto->updating_keys = true;
>> -    ret = bdrv_child_refresh_perms(bs, bs->file, errp);
>> -    if (ret) {
>> -        goto cleanup;
>> +
>> +    if (under_bql) {
>> +        ret = bdrv_child_refresh_perms(bs, bs->file, errp);
>> +        if (ret) {
>> +            goto cleanup;
>> +        }
>>       }
>>       ret = qcrypto_block_amend_options(crypto->block,
>> @@ -806,7 +810,9 @@ 
>> block_crypto_amend_options_generic_luks(BlockDriverState *bs,
>>   cleanup:
>>       /* release exclusive read/write permissions to the underlying 
>> file*/
>>       crypto->updating_keys = false;
>> -    bdrv_child_refresh_perms(bs, bs->file, errp);
>> +    if (under_bql) {
>> +        bdrv_child_refresh_perms(bs, bs->file, errp);
>> +    }
>>       return ret;
>>   }
>> @@ -834,7 +840,7 @@ block_crypto_amend_options_luks(BlockDriverState *bs,
>>           goto cleanup;
>>       }
>>       ret = block_crypto_amend_options_generic_luks(bs, amend_options,
>> -                                                  force, errp);
>> +                                                  true, force, errp);
>>   cleanup:
>>       qapi_free_QCryptoBlockAmendOptions(amend_options);
>>       return ret;
>> @@ -853,7 +859,7 @@ coroutine_fn 
>> block_crypto_co_amend_luks(BlockDriverState *bs,
>>           .u.luks = *qapi_BlockdevAmendOptionsLUKS_base(&opts->u.luks),
>>       };
>>       return block_crypto_amend_options_generic_luks(bs, &amend_opts,
>> -                                                   force, errp);
>> +                                                   false, force, errp);
>>   }
>>   static void
>
Hanna Czenczek Dec. 23, 2021, 5:15 p.m. UTC | #4
On 20.12.21 16:47, Emanuele Giuseppe Esposito wrote:
>
>
> On 17/12/2021 13:29, Hanna Reitz wrote:
>> On 24.11.21 07:44, Emanuele Giuseppe Esposito wrote:
>>> block_crypto_amend_options_generic_luks uses the block layer
>>> permission API, therefore it should be called with the BQL held.
>>>
>>> However, the same function is being called ib two BlockDriver
>>
>> s/ ib / by /
>>
>>> callbacks: bdrv_amend_options (under BQL) and bdrv_co_amend (I/O).
>>>
>>> The latter is I/O because it is invoked by block/amend.c's
>>> blockdev_amend_run(), a .run callback of the amend JobDriver
>>>
>>> Therefore we want to 1) change block_crypto_amend_options_generic_luks
>>> to use the permission API only when the BQL is held, and
>>> 2) use the .pre_run JobDriver callback to check for
>>> permissions before switching to the job aiocontext. This has also
>>> the benefit of applying the same permission operation to all
>>> amend implementations, not only luks.
>>>
>>> Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
>>> ---
>>>   block/amend.c  | 20 ++++++++++++++++++++
>>>   block/crypto.c | 18 ++++++++++++------
>>>   2 files changed, 32 insertions(+), 6 deletions(-)
>>>
>>> diff --git a/block/amend.c b/block/amend.c
>>> index 392df9ef83..fba6add51a 100644
>>> --- a/block/amend.c
>>> +++ b/block/amend.c
>>> @@ -53,10 +53,30 @@ static int coroutine_fn blockdev_amend_run(Job 
>>> *job, Error **errp)
>>>       return ret;
>>>   }
>>> +static int blockdev_amend_refresh_perms(Job *job, Error **errp)
>>> +{
>>> +    BlockdevAmendJob *s = container_of(job, BlockdevAmendJob, common);
>>> +
>>> +    return bdrv_child_refresh_perms(s->bs, s->bs->file, errp);
>>> +}
>>
>> I miss some documentation for this function, why we do it and how it 
>> works together with the bdrv_co_amend implementation.
>
>>
>> I was trying to come up with an example text, but then I wondered – 
>> how does it actually work? bdrv_child_refresh_perms() eventually ends 
>> up in block_crypto_child_perms().  However, that will only return 
>> exceptional permissions if crypto->updating_keys is true. But that’s 
>> set only in block_crypto_amend_options_generic_luks() – i.e. when the 
>> job runs. That’s exactly why that function calls 
>> bdrv_child_refresh_perms() only after it has modified 
>> crypto->updating_keys.
>>
>> Reproducer (amend on a LUKS image with read-only=true, so it doesn’t 
>> have the WRITE permission continuously, but needs to take it as an 
>> exception in block_crypto_child_perms()):
>>
>> $ qemu-img create \
>>      -f luks \
>>      --object secret,id=sec0,data=123456 \
>>      -o key-secret=sec0 \
>>      test.luks \
>>      64M
>> Formatting 'test.luks', fmt=luks size=67108864 key-secret=sec0
>>
>> $ ./qemu-system-x86_64 \
>>      -object secret,id=sec0,data=123456 \
>>      -object iothread,id=iothr0 \
>>      -blockdev file,node-name=node0,filename=test.luks \
>>      -blockdev 
>> luks,node-name=node1,key-secret=sec0,file=node0,read-only=true \
>>      -device virtio-blk,drive=node1,iothread=iothr0 -qmp stdio \
>>      <<EOF
>> {"execute": "qmp_capabilities"}
>> {
>>      "execute": "x-blockdev-amend",
>>      "arguments": {
>>          "job-id": "amend0",
>>          "node-name": "node1",
>>          "options": {
>>              "driver": "luks",
>>              "state": "active",
>>              "new-secret": "sec0"
>>          }
>>      }
>> }
>> EOF
>>
>> {"QMP": {"version": {"qemu": {"micro": 93, "minor": 1, "major": 6}, 
>> "package": "v6.2.0-rc3-50-gdb635fc4e7"}, "capabilities": ["oob"]}}
>> {"return": {}}
>> {"timestamp": {"seconds": 1639742600, "microseconds": 574641}, 
>> "event": "JOB_STATUS_CHANGE", "data": {"status": "created", "id": 
>> "amend0"}}
>> {"timestamp": {"seconds": 1639742600, "microseconds": 574919}, 
>> "event": "JOB_STATUS_CHANGE", "data": {"status": "running", "id": 
>> "amend0"}}
>> {"return": {}}
>> qemu-system-x86_64: ../block/io.c:2041: bdrv_co_write_req_prepare: 
>> Assertion `child->perm & BLK_PERM_WRITE' failed.
>> [1]    55880 IOT instruction (core dumped)  ./qemu-system-x86_64 
>> -object secret,id=sec0,data=123456 -object  -blockdev
>>
>>
>> I believe this means we need some new block driver function to 
>> prepare for an amendment operation.  If so, another question comes 
>> up, which is whether this preparatory function should then also call 
>> bdrv_child_refresh_perms(), and then whether we should have a 
>> clean-up function for symmetry.
>>
>
> Yes, unfortunately it means that (see at the end of the mail for more).
>
> I think it does not work because of crypto->updating_keys missing in 
> blockdev_amend_pre_run(). That is why the permission is not correctly 
> set and the example fails.
>
>
>>> +
>>> +static int blockdev_amend_pre_run(Job *job, Error **errp)
>>> +{
>>> +    return blockdev_amend_refresh_perms(job, errp);
>>> +}
>>> +
>>> +static void blockdev_amend_clean(Job *job)
>>> +{
>>> +    Error *errp;
>>> +    blockdev_amend_refresh_perms(job, &errp);
>>
>> Do we really want to ignore this error?  If so, we shouldn’t pass a 
>> pointer to an unused local variable, but NULL.
>>
>> If we don’t want to ignore it, we have the option of doing what you 
>> do here and then at least reporting a potential error with 
>> error_report_err(), and then freeing it, and we also must initialize 
>> errp to NULL in this case.
>
> Going with this one above, thanks.
>>
>> If we expect no error to happen (e.g. because we require the amend 
>> implementation to only release/share permissions and not 
>> acquire/unshare them), then I’d expect passing &error_abort here.
>>
>>> +}
>>> +
>>>   static const JobDriver blockdev_amend_job_driver = {
>>>       .instance_size = sizeof(BlockdevAmendJob),
>>>       .job_type      = JOB_TYPE_AMEND,
>>>       .run           = blockdev_amend_run,
>>> +    .pre_run       = blockdev_amend_pre_run,
>>> +    .clean         = blockdev_amend_clean,
>>>   };
>>>   void qmp_x_blockdev_amend(const char *job_id,
>>> diff --git a/block/crypto.c b/block/crypto.c
>>> index c8ba4681e2..82f154516c 100644
>>> --- a/block/crypto.c
>>> +++ b/block/crypto.c
>>> @@ -780,6 +780,7 @@ 
>>> block_crypto_get_specific_info_luks(BlockDriverState *bs, Error **errp)
>>>   static int
>>>   block_crypto_amend_options_generic_luks(BlockDriverState *bs,
>>> QCryptoBlockAmendOptions *amend_options,
>>> +                                        bool under_bql,
>>
>> This name makes sense in the context of this series, but not so much 
>> outside of it.
>>
>> I’d rename it to e.g. “in_amend_job” (and invert its value), and then 
>> explain that we don’t need to refresh the child permissions when 
>> running in an amend job, because that job has already taken care of 
>> that.
>>
>> OTOH, given that I believe we need some separate preparatory function 
>> anyway, perhaps we should just pull out the 
>> bdrv_child_refresh_perms() from this function altogether, so that we 
>> have:
>>
>> block_crypto_amend_options_luks():
>>
>> /* sets updating_keys to true, and invokes bdrv_child_refresh_perms() */
>> block_crypto_amend_options_prepare();
>> block_crypto_amend_options_generic_luks();
>> /* sets updating_keys to false, and invokes 
>> bdrv_child_refresh_perms() */
>> block_crypto_amend_options_clean();
>>
>>
>> block_crypto_co_amend_luks():
>>
>> /* No need to prepare or clean up, that is taken care of by the amend 
>> job */
>> block_crypto_amend_options_generic_luks();
>>
>>
>> (If we decide not to put bdrv_child_refresh_perms() into 
>> prepare()/clean(), then it would need to be called by 
>> block_crypto_amend_options_luks(); and if we decide not to have a 
>> block_crypto_amend_options_clean(), then we’d need to inline it fully.)
>
> So a couple of things I will change (according with your feedbacks):
>
> - Remove the assertion job->aio_context == qemu_in_main_thread() done 
> in job_co_entry, as it is wrong. I don't know why I added that, but we 
> cannot assume that job->run() always run in the main context, because 
> the job aiocontext can be different. I don't think there is a test 
> doing that now, but it is possible. If run() was in the main context, 
> then bdrv_co_amend (called only in blockdev_amend_run) would be GS 
> too, but it isn't, also according with your comment in v4:
>
> "[...] .bdrv_co_amend very much strikes me like a GS function, but
> it isn’t.  I’m afraid it must work on nodes that are not in the main
> context, and it launches a job, so AFAIU we absolutely cannot run it
> under the BQL."
>
> - Introduce block_crypto_amend_options_prepare and 
> block_crypto_amend_options_clean, as you suggested above. These fix 
> the GS call stack of block_crypto_amend_options_generic_luks()
>
> - Introduce .bdrv_pre_run() and .bdrv_cleanup(), respectively called 
> by .job_pre_run() and .job_cleanup(). The reason is that we need to 
> set crypto->updating_keys, otherwise the job amend won't temporary 
> give the write permission so the example above would fail.
>
> So for the I/O callstack of block_crypto_amend_options_generic_luks() 
> we will have:
> job->pre_run():
>     .bdrv_pre_run();
>         crypto->update_keys = true;
>     blockdev_amend_refresh_perms()
>
> job->run():
>     block_crypto_amend_options_generic_luks()
>
> job->cleanup():
>     .bdrv_cleanup();
>         crypto->update_keys = false;
>     blockdev_amend_refresh_perms()

Sounds good!  The only adjustment I’d make is to add “amend” somewhere 
in the .bdrv functions (e.g. “.bdrv_amend_pre_run” and 
“.bdrv_amend_cleanup”), because AFAIU they’ll still be amend-specific, 
right?

(Happy holidays :))

Hanna
diff mbox series

Patch

diff --git a/block/amend.c b/block/amend.c
index 392df9ef83..fba6add51a 100644
--- a/block/amend.c
+++ b/block/amend.c
@@ -53,10 +53,30 @@  static int coroutine_fn blockdev_amend_run(Job *job, Error **errp)
     return ret;
 }
 
+static int blockdev_amend_refresh_perms(Job *job, Error **errp)
+{
+    BlockdevAmendJob *s = container_of(job, BlockdevAmendJob, common);
+
+    return bdrv_child_refresh_perms(s->bs, s->bs->file, errp);
+}
+
+static int blockdev_amend_pre_run(Job *job, Error **errp)
+{
+    return blockdev_amend_refresh_perms(job, errp);
+}
+
+static void blockdev_amend_clean(Job *job)
+{
+    Error *errp;
+    blockdev_amend_refresh_perms(job, &errp);
+}
+
 static const JobDriver blockdev_amend_job_driver = {
     .instance_size = sizeof(BlockdevAmendJob),
     .job_type      = JOB_TYPE_AMEND,
     .run           = blockdev_amend_run,
+    .pre_run       = blockdev_amend_pre_run,
+    .clean         = blockdev_amend_clean,
 };
 
 void qmp_x_blockdev_amend(const char *job_id,
diff --git a/block/crypto.c b/block/crypto.c
index c8ba4681e2..82f154516c 100644
--- a/block/crypto.c
+++ b/block/crypto.c
@@ -780,6 +780,7 @@  block_crypto_get_specific_info_luks(BlockDriverState *bs, Error **errp)
 static int
 block_crypto_amend_options_generic_luks(BlockDriverState *bs,
                                         QCryptoBlockAmendOptions *amend_options,
+                                        bool under_bql,
                                         bool force,
                                         Error **errp)
 {
@@ -791,9 +792,12 @@  block_crypto_amend_options_generic_luks(BlockDriverState *bs,
 
     /* apply for exclusive read/write permissions to the underlying file*/
     crypto->updating_keys = true;
-    ret = bdrv_child_refresh_perms(bs, bs->file, errp);
-    if (ret) {
-        goto cleanup;
+
+    if (under_bql) {
+        ret = bdrv_child_refresh_perms(bs, bs->file, errp);
+        if (ret) {
+            goto cleanup;
+        }
     }
 
     ret = qcrypto_block_amend_options(crypto->block,
@@ -806,7 +810,9 @@  block_crypto_amend_options_generic_luks(BlockDriverState *bs,
 cleanup:
     /* release exclusive read/write permissions to the underlying file*/
     crypto->updating_keys = false;
-    bdrv_child_refresh_perms(bs, bs->file, errp);
+    if (under_bql) {
+        bdrv_child_refresh_perms(bs, bs->file, errp);
+    }
     return ret;
 }
 
@@ -834,7 +840,7 @@  block_crypto_amend_options_luks(BlockDriverState *bs,
         goto cleanup;
     }
     ret = block_crypto_amend_options_generic_luks(bs, amend_options,
-                                                  force, errp);
+                                                  true, force, errp);
 cleanup:
     qapi_free_QCryptoBlockAmendOptions(amend_options);
     return ret;
@@ -853,7 +859,7 @@  coroutine_fn block_crypto_co_amend_luks(BlockDriverState *bs,
         .u.luks = *qapi_BlockdevAmendOptionsLUKS_base(&opts->u.luks),
     };
     return block_crypto_amend_options_generic_luks(bs, &amend_opts,
-                                                   force, errp);
+                                                   false, force, errp);
 }
 
 static void