Message ID | 1336066583-10503-1-git-send-email-sw@weilnetz.de |
---|---|
State | Accepted |
On 03.05.2012 19:36, Stefan Weil wrote:
> The QEMU emulation which is currently used with Raspberry PI images
> (qemu-system-arm -M versatilepb ...) accesses memory which was freed.
>
> Valgrind output (extract):
>
> ==17857== Invalid write of size 4
> ==17857==    at 0x24EB06: scsi_req_unref (scsi-bus.c:1273)
> ==17857==    by 0x24FFAE: scsi_read_complete (scsi-disk.c:277)
> ==17857==    by 0x152ACC: bdrv_co_em_bh (block.c:3363)
> ==17857==    by 0x13D49C: qemu_bh_poll (async.c:71)
> ==17857==    by 0x211A8C: main_loop_wait (main-loop.c:503)
> ==17857==    by 0x207954: main_loop (vl.c:1555)
> ==17857==    by 0x20E9C9: main (vl.c:3653)
> ==17857==  Address 0x1c54383c is 12 bytes inside a block of size 260 free'd
> ==17857==    at 0x4824B3A: free (vg_replace_malloc.c:366)
> ==17857==    by 0x20ADFA: free_and_trace (vl.c:2250)
> ==17857==    by 0x4899FC5: g_free (in /lib/libglib-2.0.so.0.2400.1)
> ==17857==    by 0x24EB3B: scsi_req_unref (scsi-bus.c:1277)
> ==17857==    by 0x24F003: scsi_req_complete (scsi-bus.c:1383)
> ==17857==    by 0x25022A: scsi_read_data (scsi-disk.c:334)
> ==17857==    by 0x24EB9F: scsi_req_continue (scsi-bus.c:1289)
> ==17857==    by 0x1C7787: lsi_do_dma (lsi53c895a.c:575)
> ==17857==    by 0x1C8CDA: lsi_execute_script (lsi53c895a.c:1147)
> ==17857==    by 0x1C74EA: lsi_resume_script (lsi53c895a.c:510)
> ==17857==    by 0x1C7ECD: lsi_transfer_data (lsi53c895a.c:746)
> ==17857==    by 0x24EC90: scsi_req_data (scsi-bus.c:1307)

Hi Paolo,

this is the result of a bisect to narrow down the source of the problem:

ac6684264642f1aea7cba5c0c3907409b1f7f904 is the first bad commit
commit ac6684264642f1aea7cba5c0c3907409b1f7f904
Author: Paolo Bonzini <pbonzini@redhat.com>
Date:   Thu Apr 19 11:55:28 2012 +0200

    scsi: support FUA on reads

    To force unit access on reads, flush the cache *before* doing the read.

    Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

Regards,
Stefan

>
> (There are some more similar messages.)
>
> This patch adds an assertion which also detects those errors:
>
> Calling scsi_req_unref is not allowed when the previous call
> of that function has decremented refcount to 0, because in this
> case req was freed.
>
> Signed-off-by: Stefan Weil <sw@weilnetz.de>
> ---
>
> There are chances that this patch breaks some test scenarios,
> but that is intentional: we should not pretend that there are
> no errors when there are some.
>
> The Raspberry PI emulation with QEMU is currently used by
> a lot of people.
>
> Please apply this patch for the tests of QEMU 1.1.
>
> Of course we should also fix the problem which triggers the
> assertion. I still don't know whether it is caused by
> lsi53c895a.c or by the scsi code.

It is the scsi code, see git bisect result.
On 03/05/2012 22:58, Stefan Weil wrote:
> On 03.05.2012 19:36, Stefan Weil wrote:
>> The QEMU emulation which is currently used with Raspberry PI images
>> (qemu-system-arm -M versatilepb ...) accesses memory which was freed.
>>
>> Valgrind output (extract):
>>
>> ==17857== Invalid write of size 4
>> ==17857==    at 0x24EB06: scsi_req_unref (scsi-bus.c:1273)
>> ==17857==    by 0x24FFAE: scsi_read_complete (scsi-disk.c:277)
>> ==17857==    by 0x152ACC: bdrv_co_em_bh (block.c:3363)
>> ==17857==    by 0x13D49C: qemu_bh_poll (async.c:71)
>> ==17857==    by 0x211A8C: main_loop_wait (main-loop.c:503)
>> ==17857==    by 0x207954: main_loop (vl.c:1555)
>> ==17857==    by 0x20E9C9: main (vl.c:3653)
>> ==17857==  Address 0x1c54383c is 12 bytes inside a block of size 260 free'd
>> ==17857==    at 0x4824B3A: free (vg_replace_malloc.c:366)
>> ==17857==    by 0x20ADFA: free_and_trace (vl.c:2250)
>> ==17857==    by 0x4899FC5: g_free (in /lib/libglib-2.0.so.0.2400.1)
>> ==17857==    by 0x24EB3B: scsi_req_unref (scsi-bus.c:1277)
>> ==17857==    by 0x24F003: scsi_req_complete (scsi-bus.c:1383)
>> ==17857==    by 0x25022A: scsi_read_data (scsi-disk.c:334)
>> ==17857==    by 0x24EB9F: scsi_req_continue (scsi-bus.c:1289)
>> ==17857==    by 0x1C7787: lsi_do_dma (lsi53c895a.c:575)
>> ==17857==    by 0x1C8CDA: lsi_execute_script (lsi53c895a.c:1147)
>> ==17857==    by 0x1C74EA: lsi_resume_script (lsi53c895a.c:510)
>> ==17857==    by 0x1C7ECD: lsi_transfer_data (lsi53c895a.c:746)
>> ==17857==    by 0x24EC90: scsi_req_data (scsi-bus.c:1307)

Yes, this was reported by David Gibson too. Interesting that virtio-scsi
doesn't show it; probably it's the sglist support that hides it.

I queued the fix and I'm sending the pull request in a matter of minutes.
The patch is a good addition so I queued it too, thanks.

Paolo
diff --git a/hw/scsi-bus.c b/hw/scsi-bus.c
index dbdb99c..62779c7 100644
--- a/hw/scsi-bus.c
+++ b/hw/scsi-bus.c
@@ -1270,6 +1270,7 @@ SCSIRequest *scsi_req_ref(SCSIRequest *req)
 
 void scsi_req_unref(SCSIRequest *req)
 {
+    assert(req->refcount > 0);
     if (--req->refcount == 0) {
         if (req->ops->free_req) {
             req->ops->free_req(req);
The QEMU emulation which is currently used with Raspberry PI images
(qemu-system-arm -M versatilepb ...) accesses memory which was freed.

Valgrind output (extract):

==17857== Invalid write of size 4
==17857==    at 0x24EB06: scsi_req_unref (scsi-bus.c:1273)
==17857==    by 0x24FFAE: scsi_read_complete (scsi-disk.c:277)
==17857==    by 0x152ACC: bdrv_co_em_bh (block.c:3363)
==17857==    by 0x13D49C: qemu_bh_poll (async.c:71)
==17857==    by 0x211A8C: main_loop_wait (main-loop.c:503)
==17857==    by 0x207954: main_loop (vl.c:1555)
==17857==    by 0x20E9C9: main (vl.c:3653)
==17857==  Address 0x1c54383c is 12 bytes inside a block of size 260 free'd
==17857==    at 0x4824B3A: free (vg_replace_malloc.c:366)
==17857==    by 0x20ADFA: free_and_trace (vl.c:2250)
==17857==    by 0x4899FC5: g_free (in /lib/libglib-2.0.so.0.2400.1)
==17857==    by 0x24EB3B: scsi_req_unref (scsi-bus.c:1277)
==17857==    by 0x24F003: scsi_req_complete (scsi-bus.c:1383)
==17857==    by 0x25022A: scsi_read_data (scsi-disk.c:334)
==17857==    by 0x24EB9F: scsi_req_continue (scsi-bus.c:1289)
==17857==    by 0x1C7787: lsi_do_dma (lsi53c895a.c:575)
==17857==    by 0x1C8CDA: lsi_execute_script (lsi53c895a.c:1147)
==17857==    by 0x1C74EA: lsi_resume_script (lsi53c895a.c:510)
==17857==    by 0x1C7ECD: lsi_transfer_data (lsi53c895a.c:746)
==17857==    by 0x24EC90: scsi_req_data (scsi-bus.c:1307)

(There are some more similar messages.)

This patch adds an assertion which also detects those errors:

Calling scsi_req_unref is not allowed when the previous call
of that function has decremented refcount to 0, because in this
case req was freed.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
---

There are chances that this patch breaks some test scenarios,
but that is intentional: we should not pretend that there are
no errors when there are some.

The Raspberry PI emulation with QEMU is currently used by
a lot of people.

Please apply this patch for the tests of QEMU 1.1.

Of course we should also fix the problem which triggers the
assertion. I still don't know whether it is caused by
lsi53c895a.c or by the scsi code.

Thanks,
Stefan Weil

 hw/scsi-bus.c |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)
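
The rule stated in the commit message (no unref after the count has reached 0) can be illustrated outside QEMU with a minimal sketch. Everything below is hypothetical and is not the QEMU code: `Request`, `request_new`, `request_ref`, `request_unref` and the `freed_flag` field are stand-ins invented for this illustration; only the shape of the assertion mirrors the patch.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for SCSIRequest; only the reference count matters. */
typedef struct Request {
    int refcount;
    int *freed_flag;    /* illustration only: set to 1 when the object is freed */
} Request;

/* Allocate a request that starts with one reference held by the caller. */
Request *request_new(int *freed_flag)
{
    Request *req = malloc(sizeof(*req));
    if (req == NULL) {
        abort();
    }
    req->refcount = 1;
    req->freed_flag = freed_flag;
    *freed_flag = 0;
    return req;
}

/* Take an additional reference. */
Request *request_ref(Request *req)
{
    req->refcount++;
    return req;
}

/* Drop one reference; the object is freed when the last one goes away. */
void request_unref(Request *req)
{
    /* Same shape as the assertion in the patch: the caller must still
     * hold a reference. A count of 0 means a previous unref already
     * freed the object. */
    assert(req->refcount > 0);
    if (--req->refcount == 0) {
        *req->freed_flag = 1;
        free(req);
    }
}
```

With balanced calls (one `request_unref` per `request_new` or `request_ref`) the assertion never fires. Note that such a check is best effort: in the double-unref case it reads memory that was already freed, which is itself undefined behaviour, so it serves as a debugging aid alongside tools like Valgrind and does not replace them.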