Message ID | 1435761950-26714-1-git-send-email-stefanha@redhat.com |
---|---|
State | New |
Headers | show |
On 01/07/2015 16:45, Stefan Hajnoczi wrote: > If mirror has more free buffers than IOV_MAX, preadv(2)/pwritev(2) > EINVAL failures may be encountered. > > It is possible to trigger this by setting granularity to a low value > like 8192. > > This patch stops appending chunks once IOV_MAX is reached. > > The spurious EINVAL failure can be reproduced with a qcow2 image file > and the following QMP invocation: > > qmp.command('drive-mirror', device='virtio0', target='/tmp/r7.s1', > granularity=8192, sync='full', mode='absolute-paths', > format='raw') > > While the guest is running dd if=/dev/zero of=/var/tmp/foo oflag=direct > bs=4k. > > Cc: Jeff Cody <jcody@redhat.com> > Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> > --- > block/mirror.c | 4 ++++ > trace-events | 1 + > 2 files changed, 5 insertions(+) > > diff --git a/block/mirror.c b/block/mirror.c > index 048e452..985ad00 100644 > --- a/block/mirror.c > +++ b/block/mirror.c > @@ -241,6 +241,10 @@ static uint64_t coroutine_fn mirror_iteration(MirrorBlockJob *s) > trace_mirror_break_buf_busy(s, nb_chunks, s->in_flight); > break; > } > + if (IOV_MAX < nb_chunks + added_chunks) { No Yoda conditions... apart from that, Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> > + trace_mirror_break_iov_max(s, nb_chunks, added_chunks); > + break; > + } > > /* We have enough free space to copy these sectors. */ > bitmap_set(s->in_flight_bitmap, next_chunk, added_chunks); > diff --git a/trace-events b/trace-events > index 52b7efa..943cd0c 100644 > --- a/trace-events > +++ b/trace-events > @@ -94,6 +94,7 @@ mirror_yield(void *s, int64_t cnt, int buf_free_count, int in_flight) "s %p dirt > mirror_yield_in_flight(void *s, int64_t sector_num, int in_flight) "s %p sector_num %"PRId64" in_flight %d" > mirror_yield_buf_busy(void *s, int nb_chunks, int in_flight) "s %p requested chunks %d in_flight %d" > mirror_break_buf_busy(void *s, int nb_chunks, int in_flight) "s %p requested chunks %d in_flight %d" > +mirror_break_iov_max(void *s, int nb_chunks, int added_chunks) "s %p requested chunks %d added_chunks %d" > > # block/backup.c > backup_do_cow_enter(void *job, int64_t start, int64_t sector_num, int nb_sectors) "job %p start %"PRId64" sector_num %"PRId64" nb_sectors %d" >
On Wed, Jul 1, 2015 at 3:47 PM, Paolo Bonzini <pbonzini@redhat.com> wrote: > On 01/07/2015 16:45, Stefan Hajnoczi wrote: >> If mirror has more free buffers than IOV_MAX, preadv(2)/pwritev(2) >> EINVAL failures may be encountered. >> >> It is possible to trigger this by setting granularity to a low value >> like 8192. >> >> This patch stops appending chunks once IOV_MAX is reached. >> >> The spurious EINVAL failure can be reproduced with a qcow2 image file >> and the following QMP invocation: >> >> qmp.command('drive-mirror', device='virtio0', target='/tmp/r7.s1', >> granularity=8192, sync='full', mode='absolute-paths', >> format='raw') >> >> While the guest is running dd if=/dev/zero of=/var/tmp/foo oflag=direct >> bs=4k. >> >> Cc: Jeff Cody <jcody@redhat.com> >> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> >> --- >> block/mirror.c | 4 ++++ >> trace-events | 1 + >> 2 files changed, 5 insertions(+) >> >> diff --git a/block/mirror.c b/block/mirror.c >> index 048e452..985ad00 100644 >> --- a/block/mirror.c >> +++ b/block/mirror.c >> @@ -241,6 +241,10 @@ static uint64_t coroutine_fn mirror_iteration(MirrorBlockJob *s) >> trace_mirror_break_buf_busy(s, nb_chunks, s->in_flight); >> break; >> } >> + if (IOV_MAX < nb_chunks + added_chunks) { > > No Yoda conditions... apart from that, > > Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> I found it annoying to write it backwards too, but it's for consistency: if (s->buf_free_count < nb_chunks + added_chunks) { trace_mirror_break_buf_busy(s, nb_chunks, s->in_flight); break; } if (IOV_MAX < nb_chunks + added_chunks) { trace_mirror_break_iov_max(s, nb_chunks, added_chunks); break; } It's the same type of check as s->buf_free_count (which isn't modified by this loop either so it's a yoda conditional).
On 01/07/2015 16:59, Stefan Hajnoczi wrote: > I found it annoying to write it backwards too, but it's for consistency: > > if (s->buf_free_count < nb_chunks + added_chunks) { > trace_mirror_break_buf_busy(s, nb_chunks, s->in_flight); > break; > } > if (IOV_MAX < nb_chunks + added_chunks) { > trace_mirror_break_iov_max(s, nb_chunks, added_chunks); > break; > } > > It's the same type of check as s->buf_free_count (which isn't modified > by this loop either so it's a yoda conditional). Hmm, right. The problem goes back to: while (nb_chunks == 0 && s->buf_free_count < added_chunks) { trace_mirror_yield_buf_busy(s, nb_chunks, s->in_flight); qemu_coroutine_yield(); } where s->buf_free_count _is_ modified by the loop. The if below: if (s->buf_free_count < nb_chunks + added_chunks) { trace_mirror_break_buf_busy(s, nb_chunks, s->in_flight); break; } is written as a < check for consistency, and the one you add exacerbates the problem. If you want you can change the < to > in the "while" loop as well; otherwise the patch is okay as is. Paolo
On Wed, Jul 1, 2015 at 4:03 PM, Paolo Bonzini <pbonzini@redhat.com> wrote: > On 01/07/2015 16:59, Stefan Hajnoczi wrote: >> I found it annoying to write it backwards too, but it's for consistency: >> >> if (s->buf_free_count < nb_chunks + added_chunks) { >> trace_mirror_break_buf_busy(s, nb_chunks, s->in_flight); >> break; >> } >> if (IOV_MAX < nb_chunks + added_chunks) { >> trace_mirror_break_iov_max(s, nb_chunks, added_chunks); >> break; >> } >> >> It's the same type of check as s->buf_free_count (which isn't modified >> by this loop either so it's a yoda conditional). > > Hmm, right. The problem goes back to: > > while (nb_chunks == 0 && s->buf_free_count < added_chunks) { > trace_mirror_yield_buf_busy(s, nb_chunks, s->in_flight); > qemu_coroutine_yield(); > } > > where s->buf_free_count _is_ modified by the loop. The if below: > > if (s->buf_free_count < nb_chunks + added_chunks) { > trace_mirror_break_buf_busy(s, nb_chunks, s->in_flight); > break; > } > > is written as a < check for consistency, and the one you add exacerbates > the problem. If you want you can change the < to > in the "while" loop > as well; otherwise the patch is okay as is. Let's leave it.
Am 01.07.2015 um 16:45 hat Stefan Hajnoczi geschrieben: > If mirror has more free buffers than IOV_MAX, preadv(2)/pwritev(2) > EINVAL failures may be encountered. > > It is possible to trigger this by setting granularity to a low value > like 8192. > > This patch stops appending chunks once IOV_MAX is reached. > > The spurious EINVAL failure can be reproduced with a qcow2 image file > and the following QMP invocation: > > qmp.command('drive-mirror', device='virtio0', target='/tmp/r7.s1', > granularity=8192, sync='full', mode='absolute-paths', > format='raw') > > While the guest is running dd if=/dev/zero of=/var/tmp/foo oflag=direct > bs=4k. > > Cc: Jeff Cody <jcody@redhat.com> > Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> This looks like a backend-specific problem with raw-posix. Wouldn't it make more sense to either let raw-posix split requests it can't handle or introduce a bs->iov_max instead of spreading knowledge about specific backends into all callers of the block layer? I'm not objecting to taking this patch for now to fix the bug in 2.4, but I don't think it's the proper solution. Kevin > block/mirror.c | 4 ++++ > trace-events | 1 + > 2 files changed, 5 insertions(+) > > diff --git a/block/mirror.c b/block/mirror.c > index 048e452..985ad00 100644 > --- a/block/mirror.c > +++ b/block/mirror.c > @@ -241,6 +241,10 @@ static uint64_t coroutine_fn mirror_iteration(MirrorBlockJob *s) > trace_mirror_break_buf_busy(s, nb_chunks, s->in_flight); > break; > } > + if (IOV_MAX < nb_chunks + added_chunks) { > + trace_mirror_break_iov_max(s, nb_chunks, added_chunks); > + break; > + } > > /* We have enough free space to copy these sectors. */ > bitmap_set(s->in_flight_bitmap, next_chunk, added_chunks); > diff --git a/trace-events b/trace-events > index 52b7efa..943cd0c 100644 > --- a/trace-events > +++ b/trace-events > @@ -94,6 +94,7 @@ mirror_yield(void *s, int64_t cnt, int buf_free_count, int in_flight) "s %p dirt > mirror_yield_in_flight(void *s, int64_t sector_num, int in_flight) "s %p sector_num %"PRId64" in_flight %d" > mirror_yield_buf_busy(void *s, int nb_chunks, int in_flight) "s %p requested chunks %d in_flight %d" > mirror_break_buf_busy(void *s, int nb_chunks, int in_flight) "s %p requested chunks %d in_flight %d" > +mirror_break_iov_max(void *s, int nb_chunks, int added_chunks) "s %p requested chunks %d added_chunks %d" > > # block/backup.c > backup_do_cow_enter(void *job, int64_t start, int64_t sector_num, int nb_sectors) "job %p start %"PRId64" sector_num %"PRId64" nb_sectors %d" > -- > 2.4.3 > >
On Wed, Jul 8, 2015 at 12:14 PM, Kevin Wolf <kwolf@redhat.com> wrote: > Am 01.07.2015 um 16:45 hat Stefan Hajnoczi geschrieben: >> If mirror has more free buffers than IOV_MAX, preadv(2)/pwritev(2) >> EINVAL failures may be encountered. >> >> It is possible to trigger this by setting granularity to a low value >> like 8192. >> >> This patch stops appending chunks once IOV_MAX is reached. >> >> The spurious EINVAL failure can be reproduced with a qcow2 image file >> and the following QMP invocation: >> >> qmp.command('drive-mirror', device='virtio0', target='/tmp/r7.s1', >> granularity=8192, sync='full', mode='absolute-paths', >> format='raw') >> >> While the guest is running dd if=/dev/zero of=/var/tmp/foo oflag=direct >> bs=4k. >> >> Cc: Jeff Cody <jcody@redhat.com> >> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> > > This looks like a backend-specific problem with raw-posix. Wouldn't it > make more sense to either let raw-posix split requests it can't handle > or introduce a bs->iov_max instead of spreading knowledge about specific > backends into all callers of the block layer? > > I'm not objecting to taking this patch for now to fix the bug in 2.4, > but I don't think it's the proper solution. I will send a patch for QEMU 2.5 that introduces an iov max field in BlockLimits. It will convert the request merging code which currently hardcodes IOV_MAX too. In practice there are probably only two values that get used: IOV_MAX and INT_MAX, but we already have BlockLimits and it doesn't hurt to add this. Stefan
Sorry Jeff, my "Cc:" line in the commit description wasn't picked up. Please merge through your block/jobs tree for QEMU 2.5. Thanks, Stefan On Wed, Jul 1, 2015 at 3:45 PM, Stefan Hajnoczi <stefanha@redhat.com> wrote: > If mirror has more free buffers than IOV_MAX, preadv(2)/pwritev(2) > EINVAL failures may be encountered. > > It is possible to trigger this by setting granularity to a low value > like 8192. > > This patch stops appending chunks once IOV_MAX is reached. > > The spurious EINVAL failure can be reproduced with a qcow2 image file > and the following QMP invocation: > > qmp.command('drive-mirror', device='virtio0', target='/tmp/r7.s1', > granularity=8192, sync='full', mode='absolute-paths', > format='raw') > > While the guest is running dd if=/dev/zero of=/var/tmp/foo oflag=direct > bs=4k. > > Cc: Jeff Cody <jcody@redhat.com> > Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> > --- > block/mirror.c | 4 ++++ > trace-events | 1 + > 2 files changed, 5 insertions(+) > > diff --git a/block/mirror.c b/block/mirror.c > index 048e452..985ad00 100644 > --- a/block/mirror.c > +++ b/block/mirror.c > @@ -241,6 +241,10 @@ static uint64_t coroutine_fn mirror_iteration(MirrorBlockJob *s) > trace_mirror_break_buf_busy(s, nb_chunks, s->in_flight); > break; > } > + if (IOV_MAX < nb_chunks + added_chunks) { > + trace_mirror_break_iov_max(s, nb_chunks, added_chunks); > + break; > + } > > /* We have enough free space to copy these sectors. */ > bitmap_set(s->in_flight_bitmap, next_chunk, added_chunks); > diff --git a/trace-events b/trace-events > index 52b7efa..943cd0c 100644 > --- a/trace-events > +++ b/trace-events > @@ -94,6 +94,7 @@ mirror_yield(void *s, int64_t cnt, int buf_free_count, int in_flight) "s %p dirt > mirror_yield_in_flight(void *s, int64_t sector_num, int in_flight) "s %p sector_num %"PRId64" in_flight %d" > mirror_yield_buf_busy(void *s, int nb_chunks, int in_flight) "s %p requested chunks %d in_flight %d" > mirror_break_buf_busy(void *s, int nb_chunks, int in_flight) "s %p requested chunks %d in_flight %d" > +mirror_break_iov_max(void *s, int nb_chunks, int added_chunks) "s %p requested chunks %d added_chunks %d" > > # block/backup.c > backup_do_cow_enter(void *job, int64_t start, int64_t sector_num, int nb_sectors) "job %p start %"PRId64" sector_num %"PRId64" nb_sectors %d" > -- > 2.4.3 > >
On Wed, Jul 01, 2015 at 03:45:50PM +0100, Stefan Hajnoczi wrote: > If mirror has more free buffers than IOV_MAX, preadv(2)/pwritev(2) > EINVAL failures may be encountered. > > It is possible to trigger this by setting granularity to a low value > like 8192. > > This patch stops appending chunks once IOV_MAX is reached. > > The spurious EINVAL failure can be reproduced with a qcow2 image file > and the following QMP invocation: > > qmp.command('drive-mirror', device='virtio0', target='/tmp/r7.s1', > granularity=8192, sync='full', mode='absolute-paths', > format='raw') > > While the guest is running dd if=/dev/zero of=/var/tmp/foo oflag=direct > bs=4k. > Thanks, Applied to my 'block-next' branch: https://github.com/codyprime/qemu-kvm-jtc/tree/block-next -Jeff
diff --git a/block/mirror.c b/block/mirror.c index 048e452..985ad00 100644 --- a/block/mirror.c +++ b/block/mirror.c @@ -241,6 +241,10 @@ static uint64_t coroutine_fn mirror_iteration(MirrorBlockJob *s) trace_mirror_break_buf_busy(s, nb_chunks, s->in_flight); break; } + if (IOV_MAX < nb_chunks + added_chunks) { + trace_mirror_break_iov_max(s, nb_chunks, added_chunks); + break; + } /* We have enough free space to copy these sectors. */ bitmap_set(s->in_flight_bitmap, next_chunk, added_chunks); diff --git a/trace-events b/trace-events index 52b7efa..943cd0c 100644 --- a/trace-events +++ b/trace-events @@ -94,6 +94,7 @@ mirror_yield(void *s, int64_t cnt, int buf_free_count, int in_flight) "s %p dirt mirror_yield_in_flight(void *s, int64_t sector_num, int in_flight) "s %p sector_num %"PRId64" in_flight %d" mirror_yield_buf_busy(void *s, int nb_chunks, int in_flight) "s %p requested chunks %d in_flight %d" mirror_break_buf_busy(void *s, int nb_chunks, int in_flight) "s %p requested chunks %d in_flight %d" +mirror_break_iov_max(void *s, int nb_chunks, int added_chunks) "s %p requested chunks %d added_chunks %d" # block/backup.c backup_do_cow_enter(void *job, int64_t start, int64_t sector_num, int nb_sectors) "job %p start %"PRId64" sector_num %"PRId64" nb_sectors %d"
If mirror has more free buffers than IOV_MAX, preadv(2)/pwritev(2) EINVAL failures may be encountered. It is possible to trigger this by setting granularity to a low value like 8192. This patch stops appending chunks once IOV_MAX is reached. The spurious EINVAL failure can be reproduced with a qcow2 image file and the following QMP invocation: qmp.command('drive-mirror', device='virtio0', target='/tmp/r7.s1', granularity=8192, sync='full', mode='absolute-paths', format='raw') While the guest is running dd if=/dev/zero of=/var/tmp/foo oflag=direct bs=4k. Cc: Jeff Cody <jcody@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> --- block/mirror.c | 4 ++++ trace-events | 1 + 2 files changed, 5 insertions(+)