diff mbox

Hang: 2.6.32.4 sky2/DMAR (was [PATCH] sky2: Fix WARNING: at lib/dma-debug.c:902 check_sync)

Message ID 20100128090835.0d93e53a@nehalam
State Not Applicable, archived
Delegated to: David Miller
Headers show

Commit Message

Stephen Hemminger Jan. 28, 2010, 5:08 p.m. UTC
On Thu, 28 Jan 2010 11:43:16 -0500
Michael Breuer <mbreuer@majjas.com> wrote:

> Update: I played with dma-debug. Was being disabled due to lack of 
> memory. I forced it back on while pumping traffic through and got this:
> Jan 28 11:39:30 mail kernel: ------------[ cut here ]------------
> Jan 28 11:39:30 mail kernel: WARNING: at lib/dma-debug.c:902 
> check_sync+0xc1/0x43f()
> Jan 28 11:39:30 mail kernel: Hardware name: System Product Name
> Jan 28 11:39:30 mail kernel: sky2 0000:06:00.0: DMA-API: device driver 
> tries to sync DMA memory it has not allocated [device 

This test in dma-debug is bogus. Because the debug code matches
dma based on address and size; and is perfectly valid to sync a value
less than size. This is the patch I sent earlier, it isn't 100%
correct but it will let you keep testing
....................................


This should fix the dma-debug API code (and documentation), to
avoid false positives when sync is done on a partial map.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Michael Breuer Jan. 28, 2010, 6:46 p.m. UTC | #1
On 1/28/2010 12:08 PM, Stephen Hemminger wrote:
> On Thu, 28 Jan 2010 11:43:16 -0500
> Michael Breuer<mbreuer@majjas.com>  wrote:
>
>    
>> Update: I played with dma-debug. Was being disabled due to lack of
>> memory. I forced it back on while pumping traffic through and got this:
>> Jan 28 11:39:30 mail kernel: ------------[ cut here ]------------
>> Jan 28 11:39:30 mail kernel: WARNING: at lib/dma-debug.c:902
>> check_sync+0xc1/0x43f()
>> Jan 28 11:39:30 mail kernel: Hardware name: System Product Name
>> Jan 28 11:39:30 mail kernel: sky2 0000:06:00.0: DMA-API: device driver
>> tries to sync DMA memory it has not allocated [device
>>      
> This test in dma-debug is bogus. Because the debug code matches
> dma based on address and size; and is perfectly valid to sync a value
> less than size. This is the patch I sent earlier, it isn't 100%
> correct but it will let you keep testing
> ....................................
>
>
> This should fix the dma-debug API code (and documentation), to
> avoid false positives when sync is done on a partial map.
>
> Signed-off-by: Stephen Hemminger<shemminger@vyatta.com>
>
> --- a/Documentation/DMA-API.txt	2010-01-20 15:17:01.390143729 -0800
> +++ b/Documentation/DMA-API.txt	2010-01-20 15:18:48.967875255 -0800
> @@ -377,9 +377,10 @@ void
>   pci_dma_sync_sg(struct pci_dev *hwdev, struct scatterlist *sg,
>   		       int nelems, int direction)
>
> -Synchronise a single contiguous or scatter/gather mapping.  All the
> -parameters must be the same as those passed into the single mapping
> -API.
> +Synchronise a single contiguous or scatter/gather mapping.  The
> +device and handle must be the same as those passed into the single mapping
> +API. The size can be less than the original mapping if only part
> +of the mapping needs to be accessed.
>
>   Notes:  You must do this:
>
> --- a/lib/dma-debug.c	2010-01-20 15:22:55.919519883 -0800
> +++ b/lib/dma-debug.c	2010-01-20 15:26:31.648895638 -0800
> @@ -285,11 +285,9 @@ static struct dma_debug_entry *hash_buck
>   	}
>
>   	/*
> -	 * If we have multiple matches but no perfect-fit, just return
> -	 * NULL.
> +	 * If we have multiple matches but no perfect-fit
> +	 * return best value and let caller deal with it.
>   	 */
> -	ret = (matches == 1) ? ret : NULL;
> -
>   	return ret;
>   }
>
>    
Ok - applied. Noise gone... however I'm not sure whether I'll be able to 
keep dma-debug going long enough to catch anything. num_free_entries 
keeps dropping... looks like entries are not freed. I'm running with a 
huge number for now & sky2 as the driver filter. Is there a reason that 
entries wouldn't be unmapped, or is dma-debug.c just not processing the 
unmap correctly?
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Jarek Poplawski Jan. 28, 2010, 10:34 p.m. UTC | #2
On Thu, Jan 28, 2010 at 01:46:17PM -0500, Michael Breuer wrote:
> On 1/28/2010 12:08 PM, Stephen Hemminger wrote:
> >--- a/lib/dma-debug.c	2010-01-20 15:22:55.919519883 -0800
> >+++ b/lib/dma-debug.c	2010-01-20 15:26:31.648895638 -0800
> >@@ -285,11 +285,9 @@ static struct dma_debug_entry *hash_buck
> >  	}
> >
> >  	/*
> >-	 * If we have multiple matches but no perfect-fit, just return
> >-	 * NULL.
> >+	 * If we have multiple matches but no perfect-fit
> >+	 * return best value and let caller deal with it.
> >  	 */
> >-	ret = (matches == 1) ? ret : NULL;
> >-
> >  	return ret;
> >  }
> >
> Ok - applied. Noise gone... however I'm not sure whether I'll be
> able to keep dma-debug going long enough to catch anything.
> num_free_entries keeps dropping... looks like entries are not freed.
> I'm running with a huge number for now & sky2 as the driver filter.
> Is there a reason that entries wouldn't be unmapped, or is
> dma-debug.c just not processing the unmap correctly?

Do you mean it's after this patch or earlier too? I think you might
use my sky2/receive_copy/pci_unmap_len patch instead to get rid of
this warning.

Btw, since 1000 was too much, maybe you could try copybreak=256 yet,
plus additional ping or some other source of shorter packets. And how
about trying this new switch?

Jarek P.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Michael Breuer Jan. 28, 2010, 10:43 p.m. UTC | #3
On 1/28/2010 5:34 PM, Jarek Poplawski wrote:
> On Thu, Jan 28, 2010 at 01:46:17PM -0500, Michael Breuer wrote:
>    
>> On 1/28/2010 12:08 PM, Stephen Hemminger wrote:
>>      
>>> --- a/lib/dma-debug.c	2010-01-20 15:22:55.919519883 -0800
>>> +++ b/lib/dma-debug.c	2010-01-20 15:26:31.648895638 -0800
>>> @@ -285,11 +285,9 @@ static struct dma_debug_entry *hash_buck
>>>   	}
>>>
>>>   	/*
>>> -	 * If we have multiple matches but no perfect-fit, just return
>>> -	 * NULL.
>>> +	 * If we have multiple matches but no perfect-fit
>>> +	 * return best value and let caller deal with it.
>>>   	 */
>>> -	ret = (matches == 1) ? ret : NULL;
>>> -
>>>   	return ret;
>>>   }
>>>
>>>        
>> Ok - applied. Noise gone... however I'm not sure whether I'll be
>> able to keep dma-debug going long enough to catch anything.
>> num_free_entries keeps dropping... looks like entries are not freed.
>> I'm running with a huge number for now&  sky2 as the driver filter.
>> Is there a reason that entries wouldn't be unmapped, or is
>> dma-debug.c just not processing the unmap correctly?
>>      
> Do you mean it's after this patch or earlier too? I think you might
> use my sky2/receive_copy/pci_unmap_len patch instead to get rid of
> this warning.
>
> Btw, since 1000 was too much, maybe you could try copybreak=256 yet,
> plus additional ping or some other source of shorter packets. And how
> about trying this new switch?
>
> Jarek P.
>    
This is with the pci_unmap_len patch as well as the dma-debug patch. I'm 
not getting any warnings - but dma-debug num-free-entries drops until 
zero and then debugging is disabled. I started with 8,000,000 about 
three hours ago - without load I'm already down to less than half that. 
Again - only looking at sky2. Looks like the dma-debug hash table is 
reducing one entry for every packet, but never increasing the 
num_entries (no unmap perhaps).
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Jarek Poplawski Jan. 28, 2010, 10:56 p.m. UTC | #4
On Thu, Jan 28, 2010 at 05:43:34PM -0500, Michael Breuer wrote:
> This is with the pci_unmap_len patch as well as the dma-debug patch.
> I'm not getting any warnings - but dma-debug num-free-entries drops
> until zero and then debugging is disabled. I started with 8,000,000
> about three hours ago - without load I'm already down to less than
> half that. Again - only looking at sky2. Looks like the dma-debug
> hash table is reducing one entry for every packet, but never
> increasing the num_entries (no unmap perhaps).

OK, then, until there is some new fix you better turn it off.

Jarek P.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Michael Breuer Jan. 28, 2010, 10:59 p.m. UTC | #5
On 1/28/2010 5:56 PM, Jarek Poplawski wrote:
> On Thu, Jan 28, 2010 at 05:43:34PM -0500, Michael Breuer wrote:
>    
>> This is with the pci_unmap_len patch as well as the dma-debug patch.
>> I'm not getting any warnings - but dma-debug num-free-entries drops
>> until zero and then debugging is disabled. I started with 8,000,000
>> about three hours ago - without load I'm already down to less than
>> half that. Again - only looking at sky2. Looks like the dma-debug
>> hash table is reducing one entry for every packet, but never
>> increasing the num_entries (no unmap perhaps).
>>      
> OK, then, until there is some new fix you better turn it off.
>
> Jarek P.
>    
:( any further suggestions for helping track this down?
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

--- a/Documentation/DMA-API.txt	2010-01-20 15:17:01.390143729 -0800
+++ b/Documentation/DMA-API.txt	2010-01-20 15:18:48.967875255 -0800
@@ -377,9 +377,10 @@  void
 pci_dma_sync_sg(struct pci_dev *hwdev, struct scatterlist *sg,
 		       int nelems, int direction)
 
-Synchronise a single contiguous or scatter/gather mapping.  All the
-parameters must be the same as those passed into the single mapping
-API.
+Synchronise a single contiguous or scatter/gather mapping.  The
+device and handle must be the same as those passed into the single mapping
+API. The size can be less than the original mapping if only part
+of the mapping needs to be accessed.
 
 Notes:  You must do this:
 
--- a/lib/dma-debug.c	2010-01-20 15:22:55.919519883 -0800
+++ b/lib/dma-debug.c	2010-01-20 15:26:31.648895638 -0800
@@ -285,11 +285,9 @@  static struct dma_debug_entry *hash_buck
 	}
 
 	/*
-	 * If we have multiple matches but no perfect-fit, just return
-	 * NULL.
+	 * If we have multiple matches but no perfect-fit
+	 * return best value and let caller deal with it.
 	 */
-	ret = (matches == 1) ? ret : NULL;
-
 	return ret;
 }