diff mbox

net: fec_main: dma_map() only the length of the skb

Message ID 1385556253-4130-1-git-send-email-bigeasy@linutronix.de
State Changes Requested, archived
Delegated to: David Miller
Headers show

Commit Message

Sebastian Andrzej Siewior Nov. 27, 2013, 12:44 p.m. UTC
On tx submit the driver always dma_map_single() FEC_ENET_TX_FRSIZE (=2048)
bytes. This works because we don't overwrite any memory after the data buffer,
we remove it from cache if it was there. So we hurt performace in case the
mapping of a smaller area makes a difference.
There is also a bug: If the data area starts shortly before the end of
RAM say 0xc7fffa10 and the RAM ends at 0xc8000000 then we have enough
space to fit the data area (according to skb->len) but we would map beyond
end of ram if we are using 2048. In v2.6.31 (against which kernel this patch
made) there is the following check in dma_cache_maint():

|BUG_ON(!virt_addr_valid(start) || !virt_addr_valid(start + size - 1));

Since the area starting at 0xc8000000 is no longer virt_addr_valid() we
BUG() during dma_map_single(). The BUG() statement was removed in v3.5-rc1 as
per 2dc6a016 ("ARM: dma-mapping: use asm-generic/dma-mapping-common.h").

This patch was tested on v2.6.31 and then forward-ported and compile
tested only against the net tree. I think it is still worth fixing
mainline even after the BUG() statement is gone.

Cc: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
---
It would be nice if someone could test this on current kernel.
Is this worth pushing stable?

Marek: Was there a special reason why the check was removed? Would it make
sense to bring it back say under CONFIG_DMA_DEBUG?

 drivers/net/ethernet/freescale/fec_main.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

Comments

Fugang Duan Nov. 27, 2013, 1:08 p.m. UTC | #1
From: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Data: Wednesday, November 27, 2013 8:44 PM

>To: netdev@vger.kernel.org
>Cc: David S. Miller; Estevam Fabio-R49496; Frank Li; Duan Fugang-B38611; Jim
>Baxter; Marek Szyprowski; Sebastian Andrzej Siewior
>Subject: [PATCH] net: fec_main: dma_map() only the length of the skb
>
>On tx submit the driver always dma_map_single() FEC_ENET_TX_FRSIZE (=2048)
>bytes. This works because we don't overwrite any memory after the data buffer,
>we remove it from cache if it was there. So we hurt performace in case the
>mapping of a smaller area makes a difference.
>There is also a bug: If the data area starts shortly before the end of RAM say
>0xc7fffa10 and the RAM ends at 0xc8000000 then we have enough space to fit the
>data area (according to skb->len) but we would map beyond end of ram if we are
>using 2048. In v2.6.31 (against which kernel this patch
>made) there is the following check in dma_cache_maint():
>
>|BUG_ON(!virt_addr_valid(start) || !virt_addr_valid(start + size - 1));
>
>Since the area starting at 0xc8000000 is no longer virt_addr_valid() we
>BUG() during dma_map_single(). The BUG() statement was removed in v3.5-rc1 as
>per 2dc6a016 ("ARM: dma-mapping: use asm-generic/dma-mapping-common.h").
>
>This patch was tested on v2.6.31 and then forward-ported and compile tested
>only against the net tree. I think it is still worth fixing mainline even after
>the BUG() statement is gone.
>
>Cc: Marek Szyprowski <m.szyprowski@samsung.com>
>Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
>---
>It would be nice if someone could test this on current kernel.
>Is this worth pushing stable?
>
>Marek: Was there a special reason why the check was removed? Would it make
>sense to bring it back say under CONFIG_DMA_DEBUG?
>
> drivers/net/ethernet/freescale/fec_main.c | 8 ++++----
> 1 file changed, 4 insertions(+), 4 deletions(-)
>
>diff --git a/drivers/net/ethernet/freescale/fec_main.c
>b/drivers/net/ethernet/freescale/fec_main.c
>index 4cbebf3..29d8dc4 100644
>--- a/drivers/net/ethernet/freescale/fec_main.c
>+++ b/drivers/net/ethernet/freescale/fec_main.c
>@@ -385,7 +385,7 @@ fec_enet_start_xmit(struct sk_buff *skb, struct net_device
>*ndev)
> 	 * data.
> 	 */
> 	bdp->cbd_bufaddr = dma_map_single(&fep->pdev->dev, bufaddr,
>-			FEC_ENET_TX_FRSIZE, DMA_TO_DEVICE);
>+			skb->len, DMA_TO_DEVICE);
> 	if (dma_mapping_error(&fep->pdev->dev, bdp->cbd_bufaddr)) {
> 		bdp->cbd_bufaddr = 0;
> 		fep->tx_skbuff[index] = NULL;
>@@ -779,11 +779,11 @@ fec_enet_tx(struct net_device *ndev)
> 		else
> 			index = bdp - fep->tx_bd_base;
>
>-		dma_unmap_single(&fep->pdev->dev, bdp->cbd_bufaddr,
>-				FEC_ENET_TX_FRSIZE, DMA_TO_DEVICE);
>+		skb = fep->tx_skbuff[index];
>+		dma_unmap_single(&fep->pdev->dev, bdp->cbd_bufaddr, skb->len,
>+				DMA_TO_DEVICE);
> 		bdp->cbd_bufaddr = 0;
>
>-		skb = fep->tx_skbuff[index];
>
> 		/* Check for errors. */
> 		if (status & (BD_ENET_TX_HB | BD_ENET_TX_LC |
>--
>1.8.4.4
>
In fact, there have one memory copy for enet as below since enet have 16 bytes data buffer alignment request.
if (((unsigned long) bufaddr) & FEC_ALIGNMENT) {
                memcpy(fep->tx_bounce[index], skb->data, skb->len);
                bufaddr = fep->tx_bounce[index];
}

So, the bug you describe at commit log shouldn't exist.

Anyway, it is better to use the real packet size for the mapping.


I will do overnight stress test for the patch tomorrow.

Thanks,
Andy

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Sebastian Andrzej Siewior Nov. 27, 2013, 1:38 p.m. UTC | #2
- <frank.li@freescale.net> because MTA complains

On 11/27/2013 02:08 PM, Fugang Duan wrote:
> In fact, there have one memory copy for enet as below since enet have 16 bytes data buffer alignment request.
> if (((unsigned long) bufaddr) & FEC_ALIGNMENT) {
>                 memcpy(fep->tx_bounce[index], skb->data, skb->len);
>                 bufaddr = fep->tx_bounce[index];
> }

This memcpy() is only executed if the buffer isn't properly aligned
which shouldn't be the rule but an exception. This aligment check is
also available in v2.6.31 where the BUG_ON() statement was triggered.

> 
> So, the bug you describe at commit log shouldn't exist.
> 
> Anyway, it is better to use the real packet size for the mapping.
> 
> 
> I will do overnight stress test for the patch tomorrow.

Thanks.

> 
> Thanks,
> Andy
> 

Sebastian
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Fugang Duan Nov. 28, 2013, 1:18 a.m. UTC | #3
From: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Sent: Wednesday, November 27, 2013 9:38 PM

>To: Duan Fugang-B38611
>Cc: netdev@vger.kernel.org; David S. Miller; Estevam Fabio-R49496; Frank Li;
>Jim Baxter; Marek Szyprowski
>Subject: Re: [PATCH] net: fec_main: dma_map() only the length of the skb
>
>- <frank.li@freescale.net> because MTA complains
>
>On 11/27/2013 02:08 PM, Fugang Duan wrote:
>> In fact, there have one memory copy for enet as below since enet have 16
>bytes data buffer alignment request.
>> if (((unsigned long) bufaddr) & FEC_ALIGNMENT) {
>>                 memcpy(fep->tx_bounce[index], skb->data, skb->len);
>>                 bufaddr = fep->tx_bounce[index]; }
>
>This memcpy() is only executed if the buffer isn't properly aligned which
>shouldn't be the rule but an exception. This aligment check is also available
>in v2.6.31 where the BUG_ON() statement was triggered.
>
In fact, the skb->data address is not aligned with FEC_ALIGNMENT, so memcpy() always is called at here.

Thanks,
Andy 

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Sebastian Andrzej Siewior Nov. 28, 2013, 11:24 a.m. UTC | #4
On 11/28/2013 02:18 AM, Fugang Duan wrote:
- <frank.li@freescale.net> because MTA complains (now really)

>> On 11/27/2013 02:08 PM, Fugang Duan wrote:
>>> In fact, there have one memory copy for enet as below since enet have 16
>> bytes data buffer alignment request.
>>> if (((unsigned long) bufaddr) & FEC_ALIGNMENT) {
>>>                 memcpy(fep->tx_bounce[index], skb->data, skb->len);
>>>                 bufaddr = fep->tx_bounce[index]; }
>>
>> This memcpy() is only executed if the buffer isn't properly aligned which
>> shouldn't be the rule but an exception. This aligment check is also available
>> in v2.6.31 where the BUG_ON() statement was triggered.
>>
> In fact, the skb->data address is not aligned with FEC_ALIGNMENT, so memcpy() always is called at here.

I am aware of NET_IP_ALIGN. The pointers I mentioned in the log were
from a printk() shortly before that bug triggered. So I am not making
this up :) I am not aware of the load that triggered this.

Anyway, adding this

   WARN_ON(!((unsigned long)skb->data & 0xf));

in my e1000 on v3.12 triggers shortly after boot and according to the
backtrace it comes aoe. So that is at least one user :)
Since that aligment is for IP, I am not sure if non-IP based protocol
are using this. I know however that the workload, that triggered the
problem, is not IP-based.

> 
> Thanks,
> Andy 

Sebastian
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
David Miller Dec. 2, 2013, 1:25 a.m. UTC | #5
From: Fugang Duan <fugang.duan@freescale.com>
Date: Wed, 27 Nov 2013 13:08:18 +0000

> I will do overnight stress test for the patch tomorrow.

So, what was the result of your testing?

Can you please Ack this patch if there are no problems with it and
your testing went well?

Thanks.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Fugang Duan Dec. 2, 2013, 2:04 a.m. UTC | #6
From: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Data: Wednesday, November 27, 2013 8:44 PM

>To: netdev@vger.kernel.org
>Cc: David S. Miller; Estevam Fabio-R49496; Frank Li; Duan Fugang-B38611; Jim
>Baxter; Marek Szyprowski; Sebastian Andrzej Siewior
>Subject: [PATCH] net: fec_main: dma_map() only the length of the skb
>
>On tx submit the driver always dma_map_single() FEC_ENET_TX_FRSIZE (=2048)
>bytes. This works because we don't overwrite any memory after the data buffer,
>we remove it from cache if it was there. So we hurt performace in case the
>mapping of a smaller area makes a difference.
>There is also a bug: If the data area starts shortly before the end of RAM say
>0xc7fffa10 and the RAM ends at 0xc8000000 then we have enough space to fit the
>data area (according to skb->len) but we would map beyond end of ram if we are
>using 2048. In v2.6.31 (against which kernel this patch
>made) there is the following check in dma_cache_maint():
>
>|BUG_ON(!virt_addr_valid(start) || !virt_addr_valid(start + size - 1));
>
>Since the area starting at 0xc8000000 is no longer virt_addr_valid() we
>BUG() during dma_map_single(). The BUG() statement was removed in v3.5-rc1 as
>per 2dc6a016 ("ARM: dma-mapping: use asm-generic/dma-mapping-common.h").
>
>This patch was tested on v2.6.31 and then forward-ported and compile tested
>only against the net tree. I think it is still worth fixing mainline even after
>the BUG() statement is gone.
>
>Cc: Marek Szyprowski <m.szyprowski@samsung.com>
>Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
>---
>It would be nice if someone could test this on current kernel.
>Is this worth pushing stable?
>
>Marek: Was there a special reason why the check was removed? Would it make
>sense to bring it back say under CONFIG_DMA_DEBUG?
>
> drivers/net/ethernet/freescale/fec_main.c | 8 ++++----
> 1 file changed, 4 insertions(+), 4 deletions(-)
>
>diff --git a/drivers/net/ethernet/freescale/fec_main.c
>b/drivers/net/ethernet/freescale/fec_main.c
>index 4cbebf3..29d8dc4 100644
>--- a/drivers/net/ethernet/freescale/fec_main.c
>+++ b/drivers/net/ethernet/freescale/fec_main.c
>@@ -385,7 +385,7 @@ fec_enet_start_xmit(struct sk_buff *skb, struct net_device
>*ndev)
> 	 * data.
> 	 */
> 	bdp->cbd_bufaddr = dma_map_single(&fep->pdev->dev, bufaddr,
>-			FEC_ENET_TX_FRSIZE, DMA_TO_DEVICE);
>+			skb->len, DMA_TO_DEVICE);
> 	if (dma_mapping_error(&fep->pdev->dev, bdp->cbd_bufaddr)) {
> 		bdp->cbd_bufaddr = 0;
> 		fep->tx_skbuff[index] = NULL;
>@@ -779,11 +779,11 @@ fec_enet_tx(struct net_device *ndev)
> 		else
> 			index = bdp - fep->tx_bd_base;
>
>-		dma_unmap_single(&fep->pdev->dev, bdp->cbd_bufaddr,
>-				FEC_ENET_TX_FRSIZE, DMA_TO_DEVICE);
>+		skb = fep->tx_skbuff[index];
>+		dma_unmap_single(&fep->pdev->dev, bdp->cbd_bufaddr, skb->len,
>+				DMA_TO_DEVICE);
> 		bdp->cbd_bufaddr = 0;
>
>-		skb = fep->tx_skbuff[index];

Pls also remove the redundant blank line.
>
> 		/* Check for errors. */
> 		if (status & (BD_ENET_TX_HB | BD_ENET_TX_LC |
>--
>1.8.4.4
>
It pass weekend stress test, and performance have little change for imx6q sabresd platform(cpu frequence is 799MHz, set cpufreq governor to performance):
The previous tx bandwidth for tcp: 427Mhz
After the patch, tx bandwidth for tcp: 440Mhz.
It means dma map memory size can impact networking performance.



Tested-by: Fugang Duan <B38611@freescale.com>

Thanks,
Andy

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Fugang Duan Dec. 2, 2013, 2:14 a.m. UTC | #7
From: Fugang Duan <fugang.duan@freescale.com>
Data: Monday, December 02, 2013 10:05 AM + 800

>To: Sebastian Andrzej Siewior; netdev@vger.kernel.org
>Cc: David S. Miller; Estevam Fabio-R49496; Frank Li; Jim Baxter; Marek
>Szyprowski
>Subject: RE: [PATCH] net: fec_main: dma_map() only the length of the skb
>
>From: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
>Data: Wednesday, November 27, 2013 8:44 PM
>
>>To: netdev@vger.kernel.org
>>Cc: David S. Miller; Estevam Fabio-R49496; Frank Li; Duan
>>Fugang-B38611; Jim Baxter; Marek Szyprowski; Sebastian Andrzej Siewior
>>Subject: [PATCH] net: fec_main: dma_map() only the length of the skb
>>
>>On tx submit the driver always dma_map_single() FEC_ENET_TX_FRSIZE
>>(=2048) bytes. This works because we don't overwrite any memory after
>>the data buffer, we remove it from cache if it was there. So we hurt
>>performace in case the mapping of a smaller area makes a difference.
>>There is also a bug: If the data area starts shortly before the end of
>>RAM say
>>0xc7fffa10 and the RAM ends at 0xc8000000 then we have enough space to
>>fit the data area (according to skb->len) but we would map beyond end
>>of ram if we are using 2048. In v2.6.31 (against which kernel this
>>patch
>>made) there is the following check in dma_cache_maint():
>>
>>|BUG_ON(!virt_addr_valid(start) || !virt_addr_valid(start + size - 1));
>>
>>Since the area starting at 0xc8000000 is no longer virt_addr_valid() we
>>BUG() during dma_map_single(). The BUG() statement was removed in
>>v3.5-rc1 as per 2dc6a016 ("ARM: dma-mapping: use asm-generic/dma-mapping-
>common.h").
>>
>>This patch was tested on v2.6.31 and then forward-ported and compile
>>tested only against the net tree. I think it is still worth fixing
>>mainline even after the BUG() statement is gone.
>>
>>Cc: Marek Szyprowski <m.szyprowski@samsung.com>
>>Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
>>---
>>
[...]
>It pass weekend stress test, and performance have little change for imx6q
>sabresd platform(cpu frequence is 799MHz, set cpufreq governor to performance):
>The previous tx bandwidth for tcp: 427Mhz After the patch, tx bandwidth for tcp:
>440Mhz.

427Mhz -> 427Mbps
440Mhz -> 440Mbps

>It means dma map memory size can impact networking performance.
>
>
>
>Tested-by: Fugang Duan <B38611@freescale.com>
>
>Thanks,
>Andy
>

Sorry, it is slip of the pen.

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Marek Szyprowski Dec. 3, 2013, 7:36 a.m. UTC | #8
Hello,

On 2013-11-27 13:44, Sebastian Andrzej Siewior wrote:
> On tx submit the driver always dma_map_single() FEC_ENET_TX_FRSIZE (=2048)
> bytes. This works because we don't overwrite any memory after the data buffer,
> we remove it from cache if it was there. So we hurt performace in case the
> mapping of a smaller area makes a difference.
> There is also a bug: If the data area starts shortly before the end of
> RAM say 0xc7fffa10 and the RAM ends at 0xc8000000 then we have enough
> space to fit the data area (according to skb->len) but we would map beyond
> end of ram if we are using 2048. In v2.6.31 (against which kernel this patch
> made) there is the following check in dma_cache_maint():
>
> |BUG_ON(!virt_addr_valid(start) || !virt_addr_valid(start + size - 1));
>
> Since the area starting at 0xc8000000 is no longer virt_addr_valid() we
> BUG() during dma_map_single(). The BUG() statement was removed in v3.5-rc1 as
> per 2dc6a016 ("ARM: dma-mapping: use asm-generic/dma-mapping-common.h").
>
> This patch was tested on v2.6.31 and then forward-ported and compile
> tested only against the net tree. I think it is still worth fixing
> mainline even after the BUG() statement is gone.
>
> Cc: Marek Szyprowski <m.szyprowski@samsung.com>
> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
> ---
> It would be nice if someone could test this on current kernel.
> Is this worth pushing stable?
>
> Marek: Was there a special reason why the check was removed? Would it make
> sense to bring it back say under CONFIG_DMA_DEBUG?

I'm sorry for a delay. The check has been removed during conversion to 
common,
generic dma_map_ops implementation. During those convesion 
dma_(un)map_single
calls have been implemented on top of dma_(un)map_page operation, what 
removed
those additional check (*_page based function didn't have such check). I 
agree
that it might be a good idea to bring them back conditionally under
CONFIG_DMA_DEBUG. Would you like to send a respective patch?

Best regards
Sebastian Andrzej Siewior Dec. 3, 2013, 7:54 a.m. UTC | #9
On 12/03/2013 08:36 AM, Marek Szyprowski wrote:
> Hello,
Hi,

> I'm sorry for a delay. The check has been removed during conversion to
> common,
> generic dma_map_ops implementation. During those convesion
> dma_(un)map_single
> calls have been implemented on top of dma_(un)map_page operation, what
> removed
> those additional check (*_page based function didn't have such check). I
> agree
> that it might be a good idea to bring them back conditionally under
> CONFIG_DMA_DEBUG. Would you like to send a respective patch?

Yes, but it will take a while…

> Best regards

Sebastian
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/drivers/net/ethernet/freescale/fec_main.c b/drivers/net/ethernet/freescale/fec_main.c
index 4cbebf3..29d8dc4 100644
--- a/drivers/net/ethernet/freescale/fec_main.c
+++ b/drivers/net/ethernet/freescale/fec_main.c
@@ -385,7 +385,7 @@  fec_enet_start_xmit(struct sk_buff *skb, struct net_device *ndev)
 	 * data.
 	 */
 	bdp->cbd_bufaddr = dma_map_single(&fep->pdev->dev, bufaddr,
-			FEC_ENET_TX_FRSIZE, DMA_TO_DEVICE);
+			skb->len, DMA_TO_DEVICE);
 	if (dma_mapping_error(&fep->pdev->dev, bdp->cbd_bufaddr)) {
 		bdp->cbd_bufaddr = 0;
 		fep->tx_skbuff[index] = NULL;
@@ -779,11 +779,11 @@  fec_enet_tx(struct net_device *ndev)
 		else
 			index = bdp - fep->tx_bd_base;
 
-		dma_unmap_single(&fep->pdev->dev, bdp->cbd_bufaddr,
-				FEC_ENET_TX_FRSIZE, DMA_TO_DEVICE);
+		skb = fep->tx_skbuff[index];
+		dma_unmap_single(&fep->pdev->dev, bdp->cbd_bufaddr, skb->len,
+				DMA_TO_DEVICE);
 		bdp->cbd_bufaddr = 0;
 
-		skb = fep->tx_skbuff[index];
 
 		/* Check for errors. */
 		if (status & (BD_ENET_TX_HB | BD_ENET_TX_LC |