ibmvscsi: use GFP_ATOMIC with dma_alloc_coherent in map_sg_data

Message ID 1547089136-20264-1-git-send-email-tyreld@linux.vnet.ibm.com (mailing list archive)
State Not Applicable
Headers show
Series ibmvscsi: use GFP_ATOMIC with dma_alloc_coherent in map_sg_data | expand

Checks

Context Check Description
snowpatch_ozlabs/apply_patch success next/apply_patch Successfully applied
snowpatch_ozlabs/build-ppc64le success build succeeded & removed 0 sparse warning(s)
snowpatch_ozlabs/build-ppc64be success build succeeded & removed 0 sparse warning(s)
snowpatch_ozlabs/build-ppc64e success build succeeded & removed 0 sparse warning(s)
snowpatch_ozlabs/build-pmac32 success build succeeded & removed 0 sparse warning(s)
snowpatch_ozlabs/checkpatch success total: 0 errors, 0 warnings, 0 checks, 8 lines checked

Commit Message

Tyrel Datwyler Jan. 10, 2019, 2:58 a.m. UTC
While mapping DMA for the scatter list when a scsi command is queued, the
existing call to dma_alloc_coherent() in our map_sg_data() function
passes zero for the gfp_flags parameter. We are most definitely in atomic
context at this point, as queue_command() is called in softirq context
and we also hold the scsi host lock spinlock.

Fix this by passing GFP_ATOMIC to dma_alloc_coherent() to prevent any
sleeping-in-atomic-context deadlock.

Fixes: 4dddbc26c389 ("[SCSI] ibmvscsi: handle large scatter/gather lists")
Cc: stable@vger.kernel.org
Signed-off-by: Tyrel Datwyler <tyreld@linux.vnet.ibm.com>
---
 drivers/scsi/ibmvscsi/ibmvscsi.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Brian King Jan. 10, 2019, 2:56 p.m. UTC | #1
On 01/09/2019 08:58 PM, Tyrel Datwyler wrote:
> While mapping DMA for the scatter list when a scsi command is queued, the
> existing call to dma_alloc_coherent() in our map_sg_data() function
> passes zero for the gfp_flags parameter. We are most definitely in atomic
> context at this point, as queue_command() is called in softirq context
> and we also hold the scsi host lock spinlock.
> 
> Fix this by passing GFP_ATOMIC to dma_alloc_coherent() to prevent any
> sleeping-in-atomic-context deadlock.
> 
> Fixes: 4dddbc26c389 ("[SCSI] ibmvscsi: handle large scatter/gather lists")
> Cc: stable@vger.kernel.org
> Signed-off-by: Tyrel Datwyler <tyreld@linux.vnet.ibm.com>
> ---
>  drivers/scsi/ibmvscsi/ibmvscsi.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/scsi/ibmvscsi/ibmvscsi.c b/drivers/scsi/ibmvscsi/ibmvscsi.c
> index 1135e74..cb8535e 100644
> --- a/drivers/scsi/ibmvscsi/ibmvscsi.c
> +++ b/drivers/scsi/ibmvscsi/ibmvscsi.c
> @@ -731,7 +731,7 @@ static int map_sg_data(struct scsi_cmnd *cmd,
>  		evt_struct->ext_list = (struct srp_direct_buf *)
>  			dma_alloc_coherent(dev,
>  					   SG_ALL * sizeof(struct srp_direct_buf),
> -					   &evt_struct->ext_list_token, 0);
> +					   &evt_struct->ext_list_token, GFP_ATOMIC);
>  		if (!evt_struct->ext_list) {
>  			if (!firmware_has_feature(FW_FEATURE_CMO))
>  				sdev_printk(KERN_ERR, cmd->device,
> 

Reviewed-by: Brian King <brking@linux.vnet.ibm.com>
Christoph Hellwig Jan. 10, 2019, 3:07 p.m. UTC | #2
On Wed, Jan 09, 2019 at 06:58:56PM -0800, Tyrel Datwyler wrote:
> While mapping DMA for the scatter list when a scsi command is queued, the
> existing call to dma_alloc_coherent() in our map_sg_data() function
> passes zero for the gfp_flags parameter. We are most definitely in atomic
> context at this point, as queue_command() is called in softirq context
> and we also hold the scsi host lock spinlock.
> 
> Fix this by passing GFP_ATOMIC to dma_alloc_coherent() to prevent any
> sleeping-in-atomic-context deadlock.

This is a pretty clear sign you should not be using dma_alloc_coherent
to start with.  GFP_ATOMIC support in many of the implementations either
doesn't work at all or is severely constrained.  Given that the
descriptor is written by the OS and read by the hardware exactly once
there is no point in having the coherent mapping to start with.
Tyrel Datwyler Jan. 10, 2019, 8:11 p.m. UTC | #3
On 01/10/2019 07:07 AM, Christoph Hellwig wrote:
> On Wed, Jan 09, 2019 at 06:58:56PM -0800, Tyrel Datwyler wrote:
>> While mapping DMA for the scatter list when a scsi command is queued, the
>> existing call to dma_alloc_coherent() in our map_sg_data() function
>> passes zero for the gfp_flags parameter. We are most definitely in atomic
>> context at this point, as queue_command() is called in softirq context
>> and we also hold the scsi host lock spinlock.
>>
>> Fix this by passing GFP_ATOMIC to dma_alloc_coherent() to prevent any
>> sleeping-in-atomic-context deadlock.
> 
> This is a pretty clear sign you should not be using dma_alloc_coherent
> to start with.  GFP_ATOMIC support in many of the implementations either
> doesn't work at all or is severely constrained.  Given that the
> descriptor is written by the OS and read by the hardware exactly once
> there is no point in having the coherent mapping to start with.
> 

This allocation isn't a single-use allocation. The driver is just lazy about allocating our ext_list area for large SG lists (i.e. up to SG_ALL entries). When the driver was first written it only supported up to 10 indirect SRP buffers. James Bottomley added the large SG support back in 2005 with the commit referenced in the fixes tag, 4dddbc26c389. We only allocate the ext_list when we come across an SG list requiring more than 10 indirect buffers; once allocated, the buffer is reused for all subsequent large requests.
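
A minimal sketch of that lazy allocate-and-reuse pattern, pieced together from the description above and the hunk in this patch; MAX_INDIRECT_BUFS and the surrounding control flow are assumptions for illustration, not the verbatim driver code:

        /* Large SG list: more entries than fit in the inline descriptors. */
        if (sg_mapped > MAX_INDIRECT_BUFS) {
                if (!evt_struct->ext_list) {
                        /* First large request: allocate space for up to
                         * SG_ALL indirect descriptors once. */
                        evt_struct->ext_list = (struct srp_direct_buf *)
                                dma_alloc_coherent(dev,
                                        SG_ALL * sizeof(struct srp_direct_buf),
                                        &evt_struct->ext_list_token,
                                        GFP_ATOMIC);
                        if (!evt_struct->ext_list)
                                return 0;       /* mapping failed */
                }
                /* Subsequent large requests reuse the same buffer via
                 * evt_struct->ext_list and ext_list_token; nothing is
                 * re-allocated or freed here. */
        }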

-Tyrel
Tyrel Datwyler Jan. 10, 2019, 11:15 p.m. UTC | #4
On 01/10/2019 07:07 AM, Christoph Hellwig wrote:
> On Wed, Jan 09, 2019 at 06:58:56PM -0800, Tyrel Datwyler wrote:
>> While mapping DMA for the scatter list when a scsi command is queued, the
>> existing call to dma_alloc_coherent() in our map_sg_data() function
>> passes zero for the gfp_flags parameter. We are most definitely in atomic
>> context at this point, as queue_command() is called in softirq context
>> and we also hold the scsi host lock spinlock.
>>
>> Fix this by passing GFP_ATOMIC to dma_alloc_coherent() to prevent any
>> sleeping-in-atomic-context deadlock.
> 
> This is a pretty clear sign you should not be using dma_alloc_coherent
> to start with.  GFP_ATOMIC support in many of the implementations either
> doesn't work at all or is severely constrained.

On a secondary note, I was unaware of the GFP_ATOMIC limitations. Should this be
added to the documentation somewhere? I don't see any mention of it in
DMA-API-HOWTO.txt:

Using Consistent DMA mappings
=============================

To allocate and map large (PAGE_SIZE or so) consistent DMA regions,
you should do::

        dma_addr_t dma_handle;

        cpu_addr = dma_alloc_coherent(dev, size, &dma_handle, gfp);

where device is a ``struct device *``. This may be called in interrupt
context with the GFP_ATOMIC flag.

-Tyrel

> Given that the
> descriptor is written by the OS and read by the hardware exactly once
> there is no point in having the coherent mapping to start with.
>
Christoph Hellwig Jan. 11, 2019, 6:24 p.m. UTC | #5
On Thu, Jan 10, 2019 at 12:11:53PM -0800, Tyrel Datwyler wrote:
> This allocation isn't a single-use allocation. The driver is just lazy about allocating our ext_list area for large SG lists (i.e. up to SG_ALL entries). When the driver was first written it only supported up to 10 indirect SRP buffers. James Bottomley added the large SG support back in 2005 with the commit referenced in the fixes tag, 4dddbc26c389. We only allocate the ext_list when we come across an SG list requiring more than 10 indirect buffers; once allocated, the buffer is reused for all subsequent large requests.

I think the right fix is to just allocate the buffer for the ext_list
as part of the scsi command using the .cmd_size field in the host
template, and then dma map it in queuecommand and unmap it on
completion.
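
A hedged sketch of what that suggestion could look like; the private-data struct, the queuecommand body, and the use of shost->dma_dev as the mapping device are illustrative assumptions, not the actual ibmvscsi implementation:

        #include <linux/dma-mapping.h>
        #include <scsi/scsi.h>
        #include <scsi/scsi_cmnd.h>
        #include <scsi/scsi_host.h>
        #include <scsi/srp.h>

        /* Per-command private data reserved by the midlayer via .cmd_size. */
        struct ibmvscsi_cmd_priv {
                struct srp_direct_buf ext_list[SG_ALL];
                dma_addr_t ext_list_token;
        };

        static struct scsi_host_template driver_template = {
                /* ... existing fields ... */
                .cmd_size = sizeof(struct ibmvscsi_cmd_priv),
        };

        static int ibmvscsi_queuecommand(struct Scsi_Host *shost,
                                         struct scsi_cmnd *cmd)
        {
                struct ibmvscsi_cmd_priv *priv = scsi_cmd_priv(cmd);
                struct device *dev = shost->dma_dev;

                /* Streaming map of the per-command descriptor table; no
                 * coherent allocation is needed in the I/O path. */
                priv->ext_list_token = dma_map_single(dev, priv->ext_list,
                                                      sizeof(priv->ext_list),
                                                      DMA_TO_DEVICE);
                if (dma_mapping_error(dev, priv->ext_list_token))
                        return SCSI_MLQUEUE_HOST_BUSY;

                /* ... fill in the descriptors and send the SRP request ... */
                return 0;
        }

        /* On completion (or error), undo the mapping:
         *      dma_unmap_single(dev, priv->ext_list_token,
         *                       sizeof(priv->ext_list), DMA_TO_DEVICE);
         */

With the buffer owned by the command and mapped as a streaming mapping, the GFP_ATOMIC coherent allocation, and its platform-specific pool limitations, disappears from the hot path entirely.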
Christoph Hellwig Jan. 11, 2019, 6:27 p.m. UTC | #6
On Thu, Jan 10, 2019 at 03:15:35PM -0800, Tyrel Datwyler wrote:
> On a secondary note, I was unaware of the GFP_ATOMIC limitations. Should this be
> added to the documentation somewhere? I don't see any mention of it in
> DMA-API-HOWTO.txt.

The DMA documentation unfortunately doesn't seem very good.  It's been
on my todo list to eventually update it, but I'm still discovering
various warts.

GFP_ATOMIC allocations generally work fine on DMA coherent
architectures, but tend to cause problems on a lot of non-coherent
ones, with the notable exceptions of arm and arm64, which go to great
lengths to introduce special pools for them.  But that code is rarely
exercised, so I found various bugs e.g. in the arm64 iommu code for
this case.

But more importantly, there really should be no need for the coherent
allocation from irq context - we only need coherent mappings for descriptors
that don't have clear ownership, and anything allocated in the I/O path
generally has that.

Patch

diff --git a/drivers/scsi/ibmvscsi/ibmvscsi.c b/drivers/scsi/ibmvscsi/ibmvscsi.c
index 1135e74..cb8535e 100644
--- a/drivers/scsi/ibmvscsi/ibmvscsi.c
+++ b/drivers/scsi/ibmvscsi/ibmvscsi.c
@@ -731,7 +731,7 @@  static int map_sg_data(struct scsi_cmnd *cmd,
 		evt_struct->ext_list = (struct srp_direct_buf *)
 			dma_alloc_coherent(dev,
 					   SG_ALL * sizeof(struct srp_direct_buf),
-					   &evt_struct->ext_list_token, 0);
+					   &evt_struct->ext_list_token, GFP_ATOMIC);
 		if (!evt_struct->ext_list) {
 			if (!firmware_has_feature(FW_FEATURE_CMO))
 				sdev_printk(KERN_ERR, cmd->device,