Patchwork [QEMU-KVM] : Megasas + TCM_Loop + SG_IO into Windows XP guests

Submitter Nicholas A. Bellinger
Date May 13, 2010, 9:38 p.m.
Message ID <1273786731.13658.49.camel@haakon2.linux-iscsi.org>
Permalink /patch/52522/
State New

Comments

Nicholas A. Bellinger - May 13, 2010, 9:38 p.m.
Greetings Hannes and co,

I have been spending a bit of time trying Megasas HBA emulation +
TCM_Loop + SG_IO with a Windows XP SP2 KVM guest..  So far, I noticed
that hw/scsi-generic.c:execute_command_run() using bdrv_aio_ioctl()
appears to be broken for XP guests, which causes the first 36-byte
INQUIRY sent via SG_IO to never make it back to QEMU and results in the
win32 LSI driver taking the LUN offline, et al.  Note that everything
does appear to be functioning as expected in kernel space for the first
INQUIRY with the TCM_Loop LLD and Linux/SCSI code (AFAICT), and Linux KVM
guests using megasas emulation are still working.

So, I ended up requiring the following quick hack for
hw/scsi-generic.c:execute_command_run() to make SG_IO function
synchronously using bdrv_ioctl(), which at least gets LUN registration
and basic control path CDBs working for the XP guest.

Here is how it looks in action on a v2.6.34-rc7 host so far:

http://www.linux-iscsi.org/images/TCM-KVM-megasas-XP-05132010.png
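
For reference, the synchronous path described above amounts to one blocking SG_IO ioctl per command. The following is a minimal standalone sketch of a 36-byte INQUIRY issued that way from userspace, using the sg_io_hdr_t interface from <scsi/sg.h>. It is not the QEMU code: prepare_inquiry() and inquiry_sync() are hypothetical helper names, and the 5 second timeout and 32-byte sense buffer are arbitrary assumptions.

```c
#include <assert.h>
#include <string.h>
#include <sys/ioctl.h>
#include <scsi/sg.h>

#define INQUIRY_OPCODE  0x12
#define INQUIRY_STD_LEN 36   /* the 36-byte standard INQUIRY mentioned above */

/* Fill a 6-byte INQUIRY CDB and the sg_io_hdr_t that describes it.
 * (Hypothetical helper, not part of QEMU.) */
void prepare_inquiry(sg_io_hdr_t *hdr, unsigned char cdb[6],
                     unsigned char *data, unsigned char *sense,
                     unsigned char sense_len)
{
    assert(hdr && cdb && data && sense);

    memset(cdb, 0, 6);
    cdb[0] = INQUIRY_OPCODE;
    cdb[4] = INQUIRY_STD_LEN;              /* allocation length */

    memset(hdr, 0, sizeof(*hdr));
    hdr->interface_id    = 'S';            /* same value scsi-generic sets */
    hdr->dxfer_direction = SG_DXFER_FROM_DEV;
    hdr->cmd_len         = 6;
    hdr->cmdp            = cdb;
    hdr->dxfer_len       = INQUIRY_STD_LEN;
    hdr->dxferp          = data;
    hdr->mx_sb_len       = sense_len;
    hdr->sbp             = sense;
    hdr->timeout         = 5000;           /* milliseconds, assumed value */
}

/* Issue the INQUIRY and block until it completes.
 * Returns -1 if the ioctl fails, otherwise the SCSI status byte (0 = GOOD). */
int inquiry_sync(int fd, unsigned char data[INQUIRY_STD_LEN])
{
    unsigned char cdb[6];
    unsigned char sense[32];
    sg_io_hdr_t hdr;

    prepare_inquiry(&hdr, cdb, data, sense, sizeof(sense));
    if (ioctl(fd, SG_IO, &hdr) < 0) {
        return -1;
    }
    return hdr.status;
}
```

Calling inquiry_sync() on a file descriptor opened on a /dev/sg* node should show whether the host side answers the INQUIRY independently of QEMU. The caller is blocked for the whole command, which is the cost of the synchronous approach.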




Beyond the initial LUN registration and control CDB parts, doing bulk
DATA_SG_IO traffic is completing successfully (and everything looks sane
with TCM_Loop and Linux/SCSI) but it appears that the correct blocks are
not actually getting written/read by megasas.  This appears to be the
case with both hw/scsi-generic.c and hw/scsi-disk.c modes of operation
for megasas with the win32 XP guest.

So I was wondering if anyone is aware of known issues with QEMU
asynchronous SG_IO into MSFT KVM guests with virtio or hw/lsi53c895a.c,
or would this be something specific to megasas HBA emulation and XP
guests..?

Hannes, which MSFT guest + driver did you get working stably with bulk
DATA_SG_IO and hw/scsi-disk.c..?

Best,

--nab
Hannes Reinecke - May 14, 2010, 7:22 a.m.
Nicholas A. Bellinger wrote:
> Greetings Hannes and co,
> 
> I have been spending a bit of time trying Megasas HBA emulation +
> TCM_Loop + SG_IO with a Windows XP SP2 KVM guests..  So far, I noticed
> that hw/scsi-generic.c:execute_command_run() using bdev_aio_ioctl()
> appears to be broken for XP guests, which causes the first 36-byte
> INQUIRY sent via SG_IO to never make it back to QEMU and results in the
> win32 LSI drive taking the LUN offline, et al.  Note that everything
> does appear to be functioning as expected in kernel space for the first
> INQUIRY with the TCM_Loop LLD and Linux/SCSI code (AFAICT) and Linux KVM
> guests using megasas emulation are still working.
> 
Now that is really odd. Have you checked if it works with the
'normal' KVM disk backend?

> So, I ended up needing requiring the following quick hack for
> hw/scsi-generic.c:execute_command_run() to make SG_IO function
> synchronously using bdrv_ioctl(), which at least gets LUN registration
> and basic control path CDBs working for the XP guest.
> 
> Here is how it looks in action on a v2.6.34-rc7 host so far:
> 
> http://www.linux-iscsi.org/images/TCM-KVM-megasas-XP-05132010.png
> 
> 
> diff --git a/hw/scsi-generic.c b/hw/scsi-generic.c
> index 6c58742..aa1eb83 100644
> --- a/hw/scsi-generic.c
> +++ b/hw/scsi-generic.c
> @@ -140,6 +140,7 @@ static int execute_command_run(SCSIGenericReq *r,
>  {
>      BlockDriverState *bdrv = r->req.dev->conf.dinfo->bdrv;
>      SCSIGenericState *s = DO_UPCAST(SCSIGenericState, qdev, r->req.dev);
> +    int ret;
>  
>      r->io_header.interface_id = 'S';
>      r->io_header.dxfer_direction = sgdir[r->req.cmd.mode];
> @@ -161,11 +162,16 @@ static int execute_command_run(SCSIGenericReq *r,
>      printf("\n");
>      }
>  #endif
> +#if 0
>      r->req.aiocb = bdrv_aio_ioctl(bdrv, SG_IO, &r->io_header, complete, r);
>      if (r->req.aiocb == NULL) {
>          BADF("execute_command: read failed !\n");
>          return -1;
>      }
> +#else
> +    ret = bdrv_ioctl(bdrv, SG_IO, &r->io_header);
> +    complete((void *)r, ret);
> +#endif
>  
>      return 0;
>  }
> 
> 
> Beyond the initial LUN registration and control CDB parts, doing bulk
> DATA_SG_IO traffic is completing successfully (and everything looks sane
> with TCM_Loop and Linux/SCSI) but it appears that the correct blocks are
> not actually getting written/read by megasas.  This appears to be the
> case with both hw/scsi-generic.c and hw/scsi-disk.c modes of operation
> for megasas with the win32 XP guest.
> 
Oh. Hmm.

> So I was wondering if anyone aware of known issues with QEMU
> asynchronous SG_IO into MSFT KVM guests with virtio or hw/lsi53c895a.c,
> or would this be something specific to megasas HBA emulation and XP
> guests..?
> 
> Hannes, which MSFT guest + driver did you get work stable with bulk
> DATA_SG_IO and hw/scsi-disk.c..?
> 
Well, I have two more patches for megasas.
The one is just a cleanup to remove duplicate definitions, but the
other contains a real issue with a misjudged cast in megasas_enqueue_frame().
Not sure if that helps here, but it's worth a try nevertheless.

I'll be sending them with separate mails.

Let's see if I can find some time working on the megasas emulation.
Maybe I find something.
Last time I checked it was with a Windows7 build, but I didn't do
any real tests there. Basically just checking if the system boots up :-)

Cheers,

Hannes
Nicholas A. Bellinger - May 14, 2010, 9:42 a.m.
On Fri, 2010-05-14 at 09:22 +0200, Hannes Reinecke wrote:
> Nicholas A. Bellinger wrote:
> > Greetings Hannes and co,
> > 
<SNIP>
> 
> > So, I ended up needing requiring the following quick hack for
> > hw/scsi-generic.c:execute_command_run() to make SG_IO function
> > synchronously using bdrv_ioctl(), which at least gets LUN registration
> > and basic control path CDBs working for the XP guest.
> > 
> > Here is how it looks in action on a v2.6.34-rc7 host so far:
> > 
> > http://www.linux-iscsi.org/images/TCM-KVM-megasas-XP-05132010.png
> > 
> > 
> > diff --git a/hw/scsi-generic.c b/hw/scsi-generic.c
> > index 6c58742..aa1eb83 100644
> > --- a/hw/scsi-generic.c
> > +++ b/hw/scsi-generic.c
> > @@ -140,6 +140,7 @@ static int execute_command_run(SCSIGenericReq *r,
> >  {
> >      BlockDriverState *bdrv = r->req.dev->conf.dinfo->bdrv;
> >      SCSIGenericState *s = DO_UPCAST(SCSIGenericState, qdev, r->req.dev);
> > +    int ret;
> >  
> >      r->io_header.interface_id = 'S';
> >      r->io_header.dxfer_direction = sgdir[r->req.cmd.mode];
> > @@ -161,11 +162,16 @@ static int execute_command_run(SCSIGenericReq *r,
> >      printf("\n");
> >      }
> >  #endif
> > +#if 0
> >      r->req.aiocb = bdrv_aio_ioctl(bdrv, SG_IO, &r->io_header, complete, r);
> >      if (r->req.aiocb == NULL) {
> >          BADF("execute_command: read failed !\n");
> >          return -1;
> >      }
> > +#else
> > +    ret = bdrv_ioctl(bdrv, SG_IO, &r->io_header);
> > +    complete((void *)r, ret);
> > +#endif
> >  
> >      return 0;
> >  }
> > 
> > 
> > Beyond the initial LUN registration and control CDB parts, doing bulk
> > DATA_SG_IO traffic is completing successfully (and everything looks sane
> > with TCM_Loop and Linux/SCSI) but it appears that the correct blocks are
> > not actually getting written/read by megasas.  This appears to be the
> > case with both hw/scsi-generic.c and hw/scsi-disk.c modes of operation
> > for megasas with the win32 XP guest.
> > 
> Oh. Hmm.
>
> > So I was wondering if anyone aware of known issues with QEMU
> > asynchronous SG_IO into MSFT KVM guests with virtio or hw/lsi53c895a.c,
> > or would this be something specific to megasas HBA emulation and XP
> > guests..?
> > 
> > Hannes, which MSFT guest + driver did you get work stable with bulk
> > DATA_SG_IO and hw/scsi-disk.c..?
> > 
> Well, I have two more patches for megasas.
> The one is just a cleanup to remove duplicate definitions, but the
> other contains a real issue with a misjudged cast in megasas_enqueue_frame().
> Not sure if that helps here, but it's worth a try nevertheless.
> 
> I'll be sending them with separate mails.
> 

Thanks, applied both patches to the megasas friendly qemu-kvm.git tree.

> Let's see if I can find some time working on the megasas emulation.
> Maybe I find something.
> Last time I checked it was with a Windows7 build, but I didn't do
> any real tests there. Basically just checking if the system boots up :-)
> 

Nothing fancy just yet.  This is involving a normal NTFS filesystem
format on a small TCM/FILEIO LUN using scsi-generic and a userspace
FILEIO with scsi-disk.

This involves the XP guest waiting until the very last READ_10 once the
format has completed (eg: all WRITE and VERIFY CDBs complete with GOOD
status AFAICT) before announcing that mkfs.ntfs failed without any
helpful exception message (due to missing metadata of some sort I would
assume..?)

So perhaps dumping QEMU and TCM_Loop SCSI payloads to determine if any
correct blocks from megasas_handle_io() are actually making it out to
KVM host is going to be my next option.  ;)

I might try with a newer (and 64-bit) version of Windows Server
and see if that can produce some manner of more useful information for
us.

Best,

--nab
Nicholas A. Bellinger - May 17, 2010, 9:09 p.m.
On Fri, 2010-05-14 at 02:42 -0700, Nicholas A. Bellinger wrote:
> On Fri, 2010-05-14 at 09:22 +0200, Hannes Reinecke wrote:
> > Nicholas A. Bellinger wrote:
> > > Greetings Hannes and co,
> > > 
> <SNIP>
> > Let's see if I can find some time working on the megasas emulation.
> > Maybe I find something.
> > Last time I checked it was with a Windows7 build, but I didn't do
> > any real tests there. Basically just checking if the system boots up :-)
> > 
> 
> Nothing fancy just yet.  This is involving a normal NTFS filesystem
> format on a small TCM/FILEIO LUN using scsi-generic and a userspace
> FILEIO with scsi-disk.
> 
> This involves the XP guest waiting until the very last READ_10 once the
> format has completed (eg: all WRITE and VERIFY CDBs complete with GOOD
> status AFAICT) before announcing that mkfs.ntfs failed without any
> helpful exception message (due to missing metadata of some sort I would
> assume..?)
> 
> So perhaps dumping QEMU and TCM_Loop SCSI payloads to determine if any
> correct blocks from megasas_handle_io() are actually making it out to
> KVM host is going to be my next option.  ;)
> 

Greetings Hannes,

So I spent some more time with XP guests this weekend, and I noticed two
things immediately when using hw/lsi53c895a.c instead of hw/megasas.c
with the same two TCM_Loop SAS LUNs via SG_IO from last week:

1) With lsi53c895a, XP guests are able to boot successfully w/ out the
synchronous SG_IO hack that is currently required to get past the first
36-byte INQUIRY for megasas + XP SP2

2) With lsi53c895a, XP is able to successfully create and mount a NTFS
filesystem, reboot, and read blocks appear to be functioning properly.
FYI I have not run any 'write known pattern then read-back and compare
blocks' data integrity tests from within the XP guests just yet, but I
am confident that TCM scatterlist -> se_mem_t mapping is working as
expected on the KVM Host.

Furthermore, after formatting a 5 GB TCM/FILEIO LUN with lsi53c895a, and
then rebooting with megasas with the same two configured TCM_Loop SG_IO
devices, it appears to be able to mount and read blocks successfully.
Attempting to write new blocks on the mounted filesystem also appears to
work to some degree, but throughput slows down to a crawl during XP
guest buffer cache flush, which is likely attributed to the use of my
quick SYNC SG_IO hack.

So it appears that there are two separate issues here, and AFAICT they
both look to be XP and megasas specific.  For #2, it may be something
about the format of the incoming scatterlists generated during XP's
mkfs.ntfs that is causing some issues.  While watching output during fs
creation, I noticed the following WRITE_10s with a starting 4088 byte
scatterlist and a trailing 8 byte scatterlist:

megasas: writel mmio 40: 2b0b003
megasas: Found mapped frame 2 context 82b0b000 pa 2b0b000
megasas: Enqueue frame context 82b0b000 tail 493 busy 1
megasas: LD SCSI dev 2 lun 0 sdev 0xdc0230 xfer 16384
scsi-generic: Using cur_addr: 0x000000000ff6c008 cur_len: 0x0000000000000ff8
scsi-generic: Adding iovec for mem: 0x7f1783b96008 len: 0x0000000000000ff8
scsi-generic: Using cur_addr: 0x000000000fd6e000 cur_len: 0x0000000000001000
scsi-generic: Adding iovec for mem: 0x7f1783998000 len: 0x0000000000001000
scsi-generic: Using cur_addr: 0x000000000fe2f000 cur_len: 0x0000000000001000
scsi-generic: Adding iovec for mem: 0x7f1783a59000 len: 0x0000000000001000
scsi-generic: Using cur_addr: 0x000000000fdf0000 cur_len: 0x0000000000001000
scsi-generic: Adding iovec for mem: 0x7f1783a1a000 len: 0x0000000000001000
scsi-generic: Using cur_addr: 0x000000000fded000 cur_len: 0x0000000000000008
scsi-generic: Adding iovec for mem: 0x7f1783a17000 len: 0x0000000000000008
scsi-generic: execute IOV: iovec_count: 5, dxferp: 0xd92420, dxfer_len: 16384
scsi-generic: -----------------------> Issuing SG_IO CDB len 10: 0x2a 00 00 00 fa be 00 00 20 00 
scsi-generic: scsi_write_complete() ret = 0
scsi-generic: Command complete 0x0xd922c0 tag=0x82b0b000 status=0
megasas: LD SCSI req 0xd922c0 cmd 0xda92c0 lun 0xdc0230 finished with status 0 len 16384
megasas: Complete frame context 82b0b000 tail 493 busy 0 doorbell 0
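
As a sanity check on the log above, the CDB and the iovec list can be compared against each other. The sketch below decodes the 10-byte CDB (opcode 0x2a, WRITE_10) and sums the five segment lengths. The helper names decode_rw10() and sum_segments() are made up for illustration, and the 512-byte block size is an assumption that the log itself does not state.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Fields of a READ_10/WRITE_10 CDB that matter for this check. */
struct rw10 {
    uint8_t  opcode;   /* 0x28 = READ_10, 0x2a = WRITE_10 */
    uint32_t lba;      /* bytes 2..5, big-endian */
    uint16_t blocks;   /* bytes 7..8, big-endian */
};

/* Decode a 10-byte READ/WRITE CDB (hypothetical helper). */
struct rw10 decode_rw10(const uint8_t cdb[10])
{
    struct rw10 d;

    assert(cdb != NULL);
    d.opcode = cdb[0];
    d.lba    = ((uint32_t)cdb[2] << 24) | ((uint32_t)cdb[3] << 16) |
               ((uint32_t)cdb[4] << 8)  |  (uint32_t)cdb[5];
    d.blocks = (uint16_t)((cdb[7] << 8) | cdb[8]);
    return d;
}

/* Total number of bytes described by a list of segment lengths. */
size_t sum_segments(const size_t *len, size_t count)
{
    size_t total = 0;

    assert(len != NULL || count == 0);
    for (size_t i = 0; i < count; i++) {
        total += len[i];
    }
    return total;
}
```

Fed the values from the log, the CDB decodes to LBA 0xfabe with 0x20 (32) blocks, and the segments 0xff8 + 3 * 0x1000 + 0x8 add up to 16384 bytes. With 512-byte blocks that is exactly 32 blocks, so the lengths handed to SG_IO look self-consistent for this command; whether the data behind them is correct is a separate question.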

Also, the final READ_10 that produces the 'could not create filesystem'
exception is for LBA 63 and XP looking for the first FS blocks after
GPT.

Could there be some breakage in megasas with a length < PAGE_SIZE for
the scatterlist..?    As lsi53c895a seems to work OK for this case, is
there something about the logic of parsing the incoming struct
scatterlists that is different between the two HBA drivers..?  AFAICT
both are using Gerd's common code in hw/scsi-bus.c, unless there is
something about megasas_map_sgl() that is causing issues with the
above..?
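
One possible reading of the 4088-byte head and 8-byte tail is that the guest buffer simply starts 8 bytes into a 4 KiB page, so splitting it on page boundaries yields a short first and last segment. The sketch below illustrates that splitting under the assumption of a virtually contiguous buffer and 4096-byte pages. split_by_page() is a hypothetical helper and is not taken from megasas_map_sgl() or any other QEMU code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE_BYTES 4096u   /* assumed guest page size */

/* Split [addr, addr + len) into chunks that never cross a page boundary.
 * Writes the chunk lengths to seg_len[] and returns how many were written,
 * or 0 if max_segs was too small (or len was 0). */
size_t split_by_page(uint64_t addr, size_t len, size_t *seg_len, size_t max_segs)
{
    size_t n = 0;

    assert(seg_len != NULL);
    while (len > 0 && n < max_segs) {
        /* Bytes left between addr and the end of its page. */
        size_t room  = PAGE_SIZE_BYTES - (size_t)(addr % PAGE_SIZE_BYTES);
        size_t chunk = len < room ? len : room;

        seg_len[n++] = chunk;
        addr += chunk;
        len  -= chunk;
    }
    return len == 0 ? n : 0;
}
```

For a 16384-byte buffer whose start address ends in 0x008, this gives five segments of 4088, 4096, 4096, 4096 and 8 bytes, the same shape as the iovec list in the log. Any SGL parsing code that assumes every element except the last is a full page would mishandle this layout, which may be worth checking on the megasas side.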

Best,

--nab
Hannes Reinecke - May 18, 2010, 9:43 a.m.
Nicholas A. Bellinger wrote:
> On Fri, 2010-05-14 at 02:42 -0700, Nicholas A. Bellinger wrote:
>> On Fri, 2010-05-14 at 09:22 +0200, Hannes Reinecke wrote:
>>> Nicholas A. Bellinger wrote:
>>>> Greetings Hannes and co,
>>>>
>> <SNIP>
>>> Let's see if I can find some time working on the megasas emulation.
>>> Maybe I find something.
>>> Last time I checked it was with a Windows7 build, but I didn't do
>>> any real tests there. Basically just checking if the system boots up :-)
>>>
>> Nothing fancy just yet.  This is involving a normal NTFS filesystem
>> format on a small TCM/FILEIO LUN using scsi-generic and a userspace
>> FILEIO with scsi-disk.
>>
>> This involves the XP guest waiting until the very last READ_10 once the
>> format has completed (eg: all WRITE and VERIFY CDBs complete with GOOD
>> status AFAICT) before announcing that mkfs.ntfs failed without any
>> helpful exception message (due to missing metadata of some sort I would
>> assume..?)
>>
>> So perhaps dumping QEMU and TCM_Loop SCSI payloads to determine if any
>> correct blocks from megasas_handle_io() are actually making it out to
>> KVM host is going to be my next option.  ;)
>>
> 
> Greetings Hannes,
> 
> So I spent some more time with XP guests this weekend, and I noticed two
> things immediately when using hw/lsi53c895a.c instead of hw/megasas.c
> with the same two TCM_Loop SAS LUNs via SG_IO from last week:
> 
> 1) With lsi53c895a, XP guests are able to boot successfully w/ out the
> synchronous SG_IO hack that is currently required to get past the first
> 36-byte INQUIRY for megasas + XP SP2
> 
> 2) With lsi53c895a, XP is able to successfully create and mount a NTFS
> filesystem, reboot, and read blocks appear to be functioning properly.
> FYI I have not run any 'write known pattern then read-back and compare
> blocks' data integrity tests from with in the XP guests just yet, but I
> am confident that TCM scatterlist -> se_mem_t mapping is working as
> expected on the KVM Host.
> 
> Futhermore, after formatting a 5 GB TCM/FILEIO LUN with lsi53c895a, and
> then rebooting with megasas with the same two configured TCM_Loop SG_IO
> devices, it appears to be able to mount and read blocks successfully.
> Attempting to write new blocks on the mounted filesystem also appears to
> work to some degree, but throughput slows down to a crawl during XP
> guest buffer cache flush, which is likely attributed to the use of my
> quick SYNC SG_IO hack.
> 
> So it appears that there are two seperate issues here, and AFAICT they
> both look to be XP and megasas specific.  For #2, it may be something
> about the format of the incoming scatterlists generated during XP's
> mkfs.ntfs that is causing some issues.  While watching output during fs
> creation, I noticed the following WRITE_10s with a starting 4088 byte
> scatterlist and a trailing 8 byte scatterlist:
> 
> megasas: writel mmio 40: 2b0b003
> megasas: Found mapped frame 2 context 82b0b000 pa 2b0b000
> megasas: Enqueue frame context 82b0b000 tail 493 busy 1
> megasas: LD SCSI dev 2 lun 0 sdev 0xdc0230 xfer 16384
> scsi-generic: Using cur_addr: 0x000000000ff6c008 cur_len: 0x0000000000000ff8
> scsi-generic: Adding iovec for mem: 0x7f1783b96008 len: 0x0000000000000ff8
> scsi-generic: Using cur_addr: 0x000000000fd6e000 cur_len: 0x0000000000001000
> scsi-generic: Adding iovec for mem: 0x7f1783998000 len: 0x0000000000001000
> scsi-generic: Using cur_addr: 0x000000000fe2f000 cur_len: 0x0000000000001000
> scsi-generic: Adding iovec for mem: 0x7f1783a59000 len: 0x0000000000001000
> scsi-generic: Using cur_addr: 0x000000000fdf0000 cur_len: 0x0000000000001000
> scsi-generic: Adding iovec for mem: 0x7f1783a1a000 len: 0x0000000000001000
> scsi-generic: Using cur_addr: 0x000000000fded000 cur_len: 0x0000000000000008
> scsi-generic: Adding iovec for mem: 0x7f1783a17000 len: 0x0000000000000008
> scsi-generic: execute IOV: iovec_count: 5, dxferp: 0xd92420, dxfer_len: 16384
> scsi-generic: -----------------------> Issuing SG_IO CDB len 10: 0x2a 00 00 00 fa be 00 00 20 00 
> scsi-generic: scsi_write_complete() ret = 0
> scsi-generic: Command complete 0x0xd922c0 tag=0x82b0b000 status=0
> megasas: LD SCSI req 0xd922c0 cmd 0xda92c0 lun 0xdc0230 finished with status 0 len 16384
> megasas: Complete frame context 82b0b000 tail 493 busy 0 doorbell 0
> 
> Also, the final READ_10 that produces the 'could not create filesystem'
> exception is for LBA 63 and XP looking for the first FS blocks after
> GPT.
> 
> Could there be some breakage in megasas with a length < PAGE_SIZE for
> the scatterlist..?    As lsi53c895a seems to work OK for this case, is
> there something about the logic of parsing the incoming struct
> scatterlists that is different between the two HBA drivers..?  AFAICT
> both are using Gerd's common code in hw/scsi-bus.c, unless there is
> something about megasas_map_sgl() that is causing issues with the
> above..?
> 

The usual disclaimer here: I'm less than happy with the current SCSI disk handling.
Currently we have the two options:
- Using 'scsi-disk', which will _emulate_ a SCSI disk internally, but allows the use of
  asynchronous I/O using normal read/write syscalls
- Using 'scsi-generic', which will allow you to pass through any SCSI device, but
  disallows asynchronous I/O and requires you to use the SG_IO interface.
The latter also implies that the host will mark _all_ I/O commands as 'block_pc',
so the code path within the kernel is quite different from those taken by I/Os
coming in via the 'scsi-disk' emulation.
Guess it's time to have a 'scsi-passthrough' device ...

Other than that: Think we have to investigate.
If you could send me a quick setup guide on how to configure TCM_Loop for an
existing device I'd give it a go ...

Thanks,

Hannes
Nicholas A. Bellinger - May 18, 2010, 11:18 a.m.
On Tue, 2010-05-18 at 11:43 +0200, Hannes Reinecke wrote:
> Nicholas A. Bellinger wrote:
> > On Fri, 2010-05-14 at 02:42 -0700, Nicholas A. Bellinger wrote:
> >> On Fri, 2010-05-14 at 09:22 +0200, Hannes Reinecke wrote:
> >>> Nicholas A. Bellinger wrote:
> >>>> Greetings Hannes and co,
> >>>>
> >> <SNIP>
> >>> Let's see if I can find some time working on the megasas emulation.
> >>> Maybe I find something.
> >>> Last time I checked it was with a Windows7 build, but I didn't do
> >>> any real tests there. Basically just checking if the system boots up :-)
> >>>
> >> Nothing fancy just yet.  This is involving a normal NTFS filesystem
> >> format on a small TCM/FILEIO LUN using scsi-generic and a userspace
> >> FILEIO with scsi-disk.
> >>
> >> This involves the XP guest waiting until the very last READ_10 once the
> >> format has completed (eg: all WRITE and VERIFY CDBs complete with GOOD
> >> status AFAICT) before announcing that mkfs.ntfs failed without any
> >> helpful exception message (due to missing metadata of some sort I would
> >> assume..?)
> >>
> >> So perhaps dumping QEMU and TCM_Loop SCSI payloads to determine if any
> >> correct blocks from megasas_handle_io() are actually making it out to
> >> KVM host is going to be my next option.  ;)
> >>
> > 
> > Greetings Hannes,
> > 
> > So I spent some more time with XP guests this weekend, and I noticed two
> > things immediately when using hw/lsi53c895a.c instead of hw/megasas.c
> > with the same two TCM_Loop SAS LUNs via SG_IO from last week:
> > 
> > 1) With lsi53c895a, XP guests are able to boot successfully w/ out the
> > synchronous SG_IO hack that is currently required to get past the first
> > 36-byte INQUIRY for megasas + XP SP2
> > 
> > 2) With lsi53c895a, XP is able to successfully create and mount a NTFS
> > filesystem, reboot, and read blocks appear to be functioning properly.
> > FYI I have not run any 'write known pattern then read-back and compare
> > blocks' data integrity tests from with in the XP guests just yet, but I
> > am confident that TCM scatterlist -> se_mem_t mapping is working as
> > expected on the KVM Host.
> > 
> > Futhermore, after formatting a 5 GB TCM/FILEIO LUN with lsi53c895a, and
> > then rebooting with megasas with the same two configured TCM_Loop SG_IO
> > devices, it appears to be able to mount and read blocks successfully.
> > Attempting to write new blocks on the mounted filesystem also appears to
> > work to some degree, but throughput slows down to a crawl during XP
> > guest buffer cache flush, which is likely attributed to the use of my
> > quick SYNC SG_IO hack.
> > 
> > So it appears that there are two seperate issues here, and AFAICT they
> > both look to be XP and megasas specific.  For #2, it may be something
> > about the format of the incoming scatterlists generated during XP's
> > mkfs.ntfs that is causing some issues.  While watching output during fs
> > creation, I noticed the following WRITE_10s with a starting 4088 byte
> > scatterlist and a trailing 8 byte scatterlist:
> > 
> > megasas: writel mmio 40: 2b0b003
> > megasas: Found mapped frame 2 context 82b0b000 pa 2b0b000
> > megasas: Enqueue frame context 82b0b000 tail 493 busy 1
> > megasas: LD SCSI dev 2 lun 0 sdev 0xdc0230 xfer 16384
> > scsi-generic: Using cur_addr: 0x000000000ff6c008 cur_len: 0x0000000000000ff8
> > scsi-generic: Adding iovec for mem: 0x7f1783b96008 len: 0x0000000000000ff8
> > scsi-generic: Using cur_addr: 0x000000000fd6e000 cur_len: 0x0000000000001000
> > scsi-generic: Adding iovec for mem: 0x7f1783998000 len: 0x0000000000001000
> > scsi-generic: Using cur_addr: 0x000000000fe2f000 cur_len: 0x0000000000001000
> > scsi-generic: Adding iovec for mem: 0x7f1783a59000 len: 0x0000000000001000
> > scsi-generic: Using cur_addr: 0x000000000fdf0000 cur_len: 0x0000000000001000
> > scsi-generic: Adding iovec for mem: 0x7f1783a1a000 len: 0x0000000000001000
> > scsi-generic: Using cur_addr: 0x000000000fded000 cur_len: 0x0000000000000008
> > scsi-generic: Adding iovec for mem: 0x7f1783a17000 len: 0x0000000000000008
> > scsi-generic: execute IOV: iovec_count: 5, dxferp: 0xd92420, dxfer_len: 16384
> > scsi-generic: -----------------------> Issuing SG_IO CDB len 10: 0x2a 00 00 00 fa be 00 00 20 00 
> > scsi-generic: scsi_write_complete() ret = 0
> > scsi-generic: Command complete 0x0xd922c0 tag=0x82b0b000 status=0
> > megasas: LD SCSI req 0xd922c0 cmd 0xda92c0 lun 0xdc0230 finished with status 0 len 16384
> > megasas: Complete frame context 82b0b000 tail 493 busy 0 doorbell 0
> > 
> > Also, the final READ_10 that produces the 'could not create filesystem'
> > exception is for LBA 63 and XP looking for the first FS blocks after
> > GPT.
> > 
> > Could there be some breakage in megasas with a length < PAGE_SIZE for
> > the scatterlist..?    As lsi53c895a seems to work OK for this case, is
> > there something about the logic of parsing the incoming struct
> > scatterlists that is different between the two HBA drivers..?  AFAICT
> > both are using Gerd's common code in hw/scsi-bus.c, unless there is
> > something about megasas_map_sgl() that is causing issues with the
> > above..?
> > 
> 
> The usual disclaimer here: I'm less than happy with the current SCSI disk handling.
> Currently we have the two options:
> - Using 'scsi-disk', which will _emulate_ a SCSI disk internally, but allow to use
>   asynchronous I/O using normal read/write syscalls
> - Using 'scsi-generic', which will allow you to pass-through any SCSI device, but
>   disallow asynchronous I/O and requires you to use the SG_IO interface.

Well, this is only true so far for the SYNC SG_IO patch with KVM XP
guests.  The asynchronous I/O still works as expected for Linux KVM
guests at 10 Gb/sec throughput.

> The latter also implies that the host will mark _all_ I/O commands as 'block_pc',
> so the code path within the kernel is quite different from those taken by I/Os
> coming in via the 'scsi-disk' emulation.
> Guess it's time to have a 'scsi-passthrough' device ...

Currently with QEMU-KVM hw/scsi-generic.c and STGT usr/bs_sg.c we are
expecting drivers/scsi/sg.c:sg_start_req() to the passed return
hp->iov_count..

> 
> Other than that: Think we have to investigate.
> If you could send me a quite setup guide on how to configure TCM_Loop for an
> existing device I'd give it a go ...
> 

Sure, the setup for a TCM/IBLOCK device with the TCM_Loop fabric module
is:

tcm_node --block <$HBA/$DEV> <$UDEV_PATH>

and then set up the TCM_Loop virtual SAS endpoint with TCM/LIO 4.0,
creating a nexus and LUN=0, with:

tcm_loop --createnexus 1
tcm_loop --addlun <$SAS_TARGET_PORT> 1 0 $HBA/$DEV

Best,

--nab
Nicholas A. Bellinger - May 30, 2010, 4:25 a.m.
On Tue, 2010-05-18 at 04:18 -0700, Nicholas A. Bellinger wrote:
> On Tue, 2010-05-18 at 11:43 +0200, Hannes Reinecke wrote:
> > Nicholas A. Bellinger wrote:
> > > On Fri, 2010-05-14 at 02:42 -0700, Nicholas A. Bellinger wrote:
> > > Greetings Hannes,
> > > 
> > > So I spent some more time with XP guests this weekend, and I noticed two
> > > things immediately when using hw/lsi53c895a.c instead of hw/megasas.c
> > > with the same two TCM_Loop SAS LUNs via SG_IO from last week:
> > > 
> > > 1) With lsi53c895a, XP guests are able to boot successfully w/ out the
> > > synchronous SG_IO hack that is currently required to get past the first
> > > 36-byte INQUIRY for megasas + XP SP2
> > > 
> > > 2) With lsi53c895a, XP is able to successfully create and mount a NTFS
> > > filesystem, reboot, and read blocks appear to be functioning properly.
> > > FYI I have not run any 'write known pattern then read-back and compare
> > > blocks' data integrity tests from with in the XP guests just yet, but I
> > > am confident that TCM scatterlist -> se_mem_t mapping is working as
> > > expected on the KVM Host.
> > > 
> > > Futhermore, after formatting a 5 GB TCM/FILEIO LUN with lsi53c895a, and
> > > then rebooting with megasas with the same two configured TCM_Loop SG_IO
> > > devices, it appears to be able to mount and read blocks successfully.
> > > Attempting to write new blocks on the mounted filesystem also appears to
> > > work to some degree, but throughput slows down to a crawl during XP
> > > guest buffer cache flush, which is likely attributed to the use of my
> > > quick SYNC SG_IO hack.
> > > 
> > > So it appears that there are two seperate issues here, and AFAICT they
> > > both look to be XP and megasas specific.  For #2, it may be something
> > > about the format of the incoming scatterlists generated during XP's
> > > mkfs.ntfs that is causing some issues.  While watching output during fs
> > > creation, I noticed the following WRITE_10s with a starting 4088 byte
> > > scatterlist and a trailing 8 byte scatterlist:
> > > 
> > > megasas: writel mmio 40: 2b0b003
> > > megasas: Found mapped frame 2 context 82b0b000 pa 2b0b000
> > > megasas: Enqueue frame context 82b0b000 tail 493 busy 1
> > > megasas: LD SCSI dev 2 lun 0 sdev 0xdc0230 xfer 16384
> > > scsi-generic: Using cur_addr: 0x000000000ff6c008 cur_len: 0x0000000000000ff8
> > > scsi-generic: Adding iovec for mem: 0x7f1783b96008 len: 0x0000000000000ff8
> > > scsi-generic: Using cur_addr: 0x000000000fd6e000 cur_len: 0x0000000000001000
> > > scsi-generic: Adding iovec for mem: 0x7f1783998000 len: 0x0000000000001000
> > > scsi-generic: Using cur_addr: 0x000000000fe2f000 cur_len: 0x0000000000001000
> > > scsi-generic: Adding iovec for mem: 0x7f1783a59000 len: 0x0000000000001000
> > > scsi-generic: Using cur_addr: 0x000000000fdf0000 cur_len: 0x0000000000001000
> > > scsi-generic: Adding iovec for mem: 0x7f1783a1a000 len: 0x0000000000001000
> > > scsi-generic: Using cur_addr: 0x000000000fded000 cur_len: 0x0000000000000008
> > > scsi-generic: Adding iovec for mem: 0x7f1783a17000 len: 0x0000000000000008
> > > scsi-generic: execute IOV: iovec_count: 5, dxferp: 0xd92420, dxfer_len: 16384
> > > scsi-generic: -----------------------> Issuing SG_IO CDB len 10: 0x2a 00 00 00 fa be 00 00 20 00 
> > > scsi-generic: scsi_write_complete() ret = 0
> > > scsi-generic: Command complete 0x0xd922c0 tag=0x82b0b000 status=0
> > > megasas: LD SCSI req 0xd922c0 cmd 0xda92c0 lun 0xdc0230 finished with status 0 len 16384
> > > megasas: Complete frame context 82b0b000 tail 493 busy 0 doorbell 0
> > > 
> > > Also, the final READ_10 that produces the 'could not create filesystem'
> > > exception is for LBA 63 and XP looking for the first FS blocks after
> > > GPT.
> > > 
> > > Could there be some breakage in megasas with a length < PAGE_SIZE for
> > > the scatterlist..?    As lsi53c895a seems to work OK for this case, is
> > > there something about the logic of parsing the incoming struct
> > > scatterlists that is different between the two HBA drivers..?  AFAICT
> > > both are using Gerd's common code in hw/scsi-bus.c, unless there is
> > > something about megasas_map_sgl() that is causing issues with the
> > > above..?
> > > 
> > 
> > The usual disclaimer here: I'm less than happy with the current SCSI disk handling.
> > Currently we have the two options:
> > - Using 'scsi-disk', which will _emulate_ a SCSI disk internally, but allow to use
> >   asynchronous I/O using normal read/write syscalls
> > - Using 'scsi-generic', which will allow you to pass-through any SCSI device, but
> >   disallow asynchronous I/O and requires you to use the SG_IO interface.
> 
> Well, this is only true so far for the SYNC SG_IO patch with KVM XP
> guests.  The asynchronous I/O still works as expected for Linux KVM
> guests for 10 Gb/sec sec throughput.
> 
> > The latter also implies that the host will mark _all_ I/O commands as 'block_pc',
> > so the code path within the kernel is quite different from those taken by I/Os
> > coming in via the 'scsi-disk' emulation.
> > Guess it's time to have a 'scsi-passthrough' device ...
> 
> Currently with QEMU-KVM hw/scsi-generic.c and STGT usr/bs_sg.c we are
> expecting driver/scsi/sg.c:sg_start_req() to the passed return
> hp->iov_count..
> 

Greetings Hannes and Gerd,

Just a quick update on this one..  After giving megasas a shot with
Windows 7 x64, I am able to successfully format, mount and copy blocks
to a TCM/FILEIO LUN backstore and TCM_Loop SAS target port with SG_IO:

http://www.linux-iscsi.org/index.php/Image:TCM-KVM-Megasas-8708EM2-Windows7-x64.png

The 8708EM2 driver for the emulated LSI DFI PowerPC RAID Core was
automatically configured and did not require any extra setup once
megasas emulation was enabled from the QEMU CLI.  Seriously, a very
splendid job Dr. Hannes.  8-)

Also it's worth mentioning that I am still running C: on QEMU IDE
emulation, as I could not quite figure out how to boot from a megasas LD
with option-rom.  What QEMU CLI options were required to make this work
again..?

Anyways, the issue remains in the megasas friendly qemu-kvm.git tree and
still appears to be specific to hw/megasas.c LDs for a v2.6.34 KVM
x86_64 host into 32-bit XP guests with both scsi-disk and scsi-generic
QEMU backstores.  I am thinking about having a look with the
i386-softmmu target and see if that makes any difference..  What do you
think..?

Best,

--nab
Gerd Hoffmann - May 31, 2010, 9:52 a.m.
Hi,

> Also it's worth mentioning that I am still running C: on QEMU IDE
> emulation, as I could not quite figure out how to boot for a megasas LD
> with option-rom.  What QEMU CLI ops where requried to make this work
> again..?

-option-rom $file

or

-device megasas,romfile=$file

cheers,
   Gerd
Alexander Graf - May 31, 2010, 7:18 p.m.
Am 31.05.2010 um 11:52 schrieb Gerd Hoffmann <kraxel@redhat.com>:

>  Hi,
>
>> Also it's worth mentioning that I am still running C: on QEMU IDE
>> emulation, as I could not quite figure out how to boot for a  
>> megasas LD
>> with option-rom.  What QEMU CLI ops where requried to make this work
>> again..?
>
> -option-rom $file
>
> or
>
> -device megasas,romfile=$file

Or -drive ...,boot=on if you're using qemu-kvm.

Alex

Patch

diff --git a/hw/scsi-generic.c b/hw/scsi-generic.c
index 6c58742..aa1eb83 100644
--- a/hw/scsi-generic.c
+++ b/hw/scsi-generic.c
@@ -140,6 +140,7 @@  static int execute_command_run(SCSIGenericReq *r,
 {
     BlockDriverState *bdrv = r->req.dev->conf.dinfo->bdrv;
     SCSIGenericState *s = DO_UPCAST(SCSIGenericState, qdev, r->req.dev);
+    int ret;
 
     r->io_header.interface_id = 'S';
     r->io_header.dxfer_direction = sgdir[r->req.cmd.mode];
@@ -161,11 +162,16 @@  static int execute_command_run(SCSIGenericReq *r,
     printf("\n");
     }
 #endif
+#if 0
     r->req.aiocb = bdrv_aio_ioctl(bdrv, SG_IO, &r->io_header, complete, r);
     if (r->req.aiocb == NULL) {
         BADF("execute_command: read failed !\n");
         return -1;
     }
+#else
+    ret = bdrv_ioctl(bdrv, SG_IO, &r->io_header);
+    complete((void *)r, ret);
+#endif
 
     return 0;
 }