Message ID | 20110705180650.GF2959@hmsreliant.think-freely.org |
---|---|
State | RFC, archived |
Delegated to: | David Miller |
Headers | show |
Le mardi 05 juillet 2011 à 14:06 -0400, Neil Horman a écrit : > On Tue, Jul 05, 2011 at 06:47:21PM +0200, Eric Dumazet wrote: > > Le mardi 05 juillet 2011 à 12:42 -0400, Neil Horman a écrit : > > > On Tue, Jul 05, 2011 at 06:12:32PM +0200, Eric Dumazet wrote: > > > > > > So all descriptors before prod are guaranteed to be ready for host > > > > consume... Fact that a dma access is running on 'next descriptor' should > > > > be irrelevant. > > > > > > > But we handle more than one descriptor per b44_rx call - theres a while loop in > > > there where we do advance to the next descriptor. > > > > Yes, but we advance up to 'prod', which is the very last safe > > descriptor. > > > > If hardware advertises descriptor X being ready to be handled by host, > > while DMA on this X descriptor is not yet finished, this would be a > > really useless hardware ;) > > > > > > > > -- > > To unsubscribe from this list: send the line "unsubscribe netdev" in > > the body of a message to majordomo@vger.kernel.org > > More majordomo info at http://vger.kernel.org/majordomo-info.html > > > > > Something else just jumped out at me. During b44_open, we call b44_init_rings. > This function allocates bp->rx_pending skb's and iteratively puts them in the rx > dma ring. bp->rx_pending is initalized to B44_DEF_RX_RING_PENDING, which is > defined as 200 (just about half of the 512 entries that the dma ring actually > supports in the hardware. This is normally ok, as subsequent calls to > b44_alloc_rx_skb will fill in entries in the ring as those skbs are consumed. > The problem with this however is that b44_alloc_rx_skb only sets the > DESC_CTRL_EOT bit in the descriptor of the 512th entry, indicating that the > hardware should wrap around and reset the index counter. If a large volume of > traffic is pushed through the adapter early on after initalization, or if the > cpu is busy during init, it would be possible that the ring buffer would fill up > prior to having additional entries added to the ring, the result being that the > dma engine would reach the end of the allocated descriptors, not see an EOT bit > set, and continue on using unallocated descriptors. > > Just a theory, but it would be interesting to see if the problem subsided if you > ensured that you allocated a full descriptor ring on b44_open > Neil > > diff --git a/drivers/net/b44.c b/drivers/net/b44.c > index 3d247f3..1b58a7c 100644 > --- a/drivers/net/b44.c > +++ b/drivers/net/b44.c > @@ -57,7 +57,7 @@ > #define B44_MAX_MTU 1500 > > #define B44_RX_RING_SIZE 512 > -#define B44_DEF_RX_RING_PENDING 200 > +#define B44_DEF_RX_RING_PENDING 512 > #define B44_RX_RING_BYTES (sizeof(struct dma_desc) * \ > B44_RX_RING_SIZE) > #define B44_TX_RING_SIZE 512 No Please take time to read the driver again. 200 desc are setup, and NIC is not allowed to use more than 200 descs. ( B44_DMARX_PTR ) We carefuly advance this pointer after a new desc(s) is(are) setup -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Le mardi 05 juillet 2011 à 20:13 +0200, Eric Dumazet a écrit : > Le mardi 05 juillet 2011 à 14:06 -0400, Neil Horman a écrit : > > On Tue, Jul 05, 2011 at 06:47:21PM +0200, Eric Dumazet wrote: > > > Le mardi 05 juillet 2011 à 12:42 -0400, Neil Horman a écrit : > > > > On Tue, Jul 05, 2011 at 06:12:32PM +0200, Eric Dumazet wrote: > > > > > > > > So all descriptors before prod are guaranteed to be ready for host > > > > > consume... Fact that a dma access is running on 'next descriptor' should > > > > > be irrelevant. > > > > > > > > > But we handle more than one descriptor per b44_rx call - theres a while loop in > > > > there where we do advance to the next descriptor. > > > > > > Yes, but we advance up to 'prod', which is the very last safe > > > descriptor. > > > > > > If hardware advertises descriptor X being ready to be handled by host, > > > while DMA on this X descriptor is not yet finished, this would be a > > > really useless hardware ;) > > > > > > > > > > > > -- > > > To unsubscribe from this list: send the line "unsubscribe netdev" in > > > the body of a message to majordomo@vger.kernel.org > > > More majordomo info at http://vger.kernel.org/majordomo-info.html > > > > > > > > > Something else just jumped out at me. During b44_open, we call b44_init_rings. > > This function allocates bp->rx_pending skb's and iteratively puts them in the rx > > dma ring. bp->rx_pending is initalized to B44_DEF_RX_RING_PENDING, which is > > defined as 200 (just about half of the 512 entries that the dma ring actually > > supports in the hardware. This is normally ok, as subsequent calls to > > b44_alloc_rx_skb will fill in entries in the ring as those skbs are consumed. > > The problem with this however is that b44_alloc_rx_skb only sets the > > DESC_CTRL_EOT bit in the descriptor of the 512th entry, indicating that the > > hardware should wrap around and reset the index counter. If a large volume of > > traffic is pushed through the adapter early on after initalization, or if the > > cpu is busy during init, it would be possible that the ring buffer would fill up > > prior to having additional entries added to the ring, the result being that the > > dma engine would reach the end of the allocated descriptors, not see an EOT bit > > set, and continue on using unallocated descriptors. > > > > Just a theory, but it would be interesting to see if the problem subsided if you > > ensured that you allocated a full descriptor ring on b44_open > > Neil > > > > diff --git a/drivers/net/b44.c b/drivers/net/b44.c > > index 3d247f3..1b58a7c 100644 > > --- a/drivers/net/b44.c > > +++ b/drivers/net/b44.c > > @@ -57,7 +57,7 @@ > > #define B44_MAX_MTU 1500 > > > > #define B44_RX_RING_SIZE 512 > > -#define B44_DEF_RX_RING_PENDING 200 > > +#define B44_DEF_RX_RING_PENDING 512 > > #define B44_RX_RING_BYTES (sizeof(struct dma_desc) * \ > > B44_RX_RING_SIZE) > > #define B44_TX_RING_SIZE 512 > > No > > Please take time to read the driver again. > > 200 desc are setup, and NIC is not allowed to use more than 200 descs. > > ( B44_DMARX_PTR ) > > We carefuly advance this pointer after a new desc(s) is(are) setup > > Then, maybe the driver model is completely wrong, and should really setup 512 buffers, or use less descs but set EOT on last one. Currently it uses a 200 sliding window out of the 512 descs. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Le mardi 05 juillet 2011 à 20:32 +0200, Eric Dumazet a écrit : > Then, maybe the driver model is completely wrong, and should really > setup 512 buffers, or use less descs but set EOT on last one. > > Currently it uses a 200 sliding window out of the 512 descs. > > One thing we could do would be to allocate a special guard buffer and set all 'out of window' descriptors to point to this guard buffer, and periodically check if buffer is dirtied by the card. (first word would be enough) (instead of setting desc->addr to NULL, set to dma_map_single(guard_buffer)) -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tue, Jul 05, 2011 at 08:45:16PM +0200, Eric Dumazet wrote: > Le mardi 05 juillet 2011 à 20:32 +0200, Eric Dumazet a écrit : > > > Then, maybe the driver model is completely wrong, and should really > > setup 512 buffers, or use less descs but set EOT on last one. > > > > Currently it uses a 200 sliding window out of the 512 descs. > > > > > > One thing we could do would be to allocate a special guard buffer and > set all 'out of window' descriptors to point to this guard buffer, and > periodically check if buffer is dirtied by the card. > > (first word would be enough) > > (instead of setting desc->addr to NULL, set to > dma_map_single(guard_buffer)) > I think this is a goo idea, at least for testing. It seems odd to me that we have the B44_DMARX_PTR value which indicates (ostensibly) the pointer to the descriptor to be processed next (the documentation isnt' very verbose on the subject), along with the EOT bit on a descriptor. It seems like both the register and the bit are capable of conveying the same (or at least overlapping) information. I think what I'm having the most trouble with is understanding when the hw looks at the EOT bit in the descriptor. If it completes a DMA and sees the EOT bit set, does the next DMA occur to the descriptor pointed to by the DMARX_ADDR register? Of does it stall until such time as the DMARX_PTR register is rotated around? What if it doesn't see the EOT bit set? Does it just keep going with the next descriptor? Also, there seems to be some inconsistency in the settnig of the B44_DMARX_PTR register. In bnx2_init_hw its set to the value of bp->rx_pending, which is defined as being 200. But in b44_rx its advanced by sizeof(struct dma_desc) for every iteration. So in b44_init_hw we write the value 200 to it, ostensibly indicating a limit of 200 descriptors, but in b44_rx we iteratively write the values 0, 8, 16, 24...4*n to the register to indicate which descriptor we're indexing? Something really doesn't sit right with me there. In the former case we treat the register as holding number of entries, and in the latter we treat it as holding a byte offset into an array. Or am I missing something? Regards Neil > > > -- > To unsubscribe from this list: send the line "unsubscribe netdev" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Le mardi 05 juillet 2011 à 15:53 -0400, Neil Horman a écrit : > I think this is a goo idea, at least for testing. It seems odd to me that we > have the B44_DMARX_PTR value which indicates (ostensibly) the pointer to the > descriptor to be processed next (the documentation isnt' very verbose on the > subject), along with the EOT bit on a descriptor. It seems like both the > register and the bit are capable of conveying the same (or at least overlapping) > information. > > I think what I'm having the most trouble with is understanding when the hw looks > at the EOT bit in the descriptor. If it completes a DMA and sees the EOT bit > set, does the next DMA occur to the descriptor pointed to by the DMARX_ADDR > register? Of does it stall until such time as the DMARX_PTR register is rotated > around? What if it doesn't see the EOT bit set? Does it just keep going with > the next descriptor? > > Also, there seems to be some inconsistency in the settnig of the B44_DMARX_PTR > register. In bnx2_init_hw its set to the value of bp->rx_pending, which is > defined as being 200. But in b44_rx its advanced by sizeof(struct dma_desc) for > every iteration. So in b44_init_hw we write the value 200 to it, ostensibly > indicating a limit of 200 descriptors, but in b44_rx we iteratively write the > values 0, 8, 16, 24...4*n to the register to indicate which descriptor we're > indexing? Something really doesn't sit right with me there. In the former case > we treat the register as holding number of entries, and in the latter we treat > it as holding a byte offset into an array. Or am I missing something? > Yes, definitely this needs some clarification. More over, when we hit the last entry (currently at slot 511), the EOT instructs hardware to go back to slot 0, while our real window is 511-200 -> 511 . Slot 0 contains garbage (well, an old value) Its late here so I dont plan to send a patch before 8/10 hours. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Le mardi 05 juillet 2011 à 22:02 +0200, Eric Dumazet a écrit : > Le mardi 05 juillet 2011 à 15:53 -0400, Neil Horman a écrit : > > I think this is a goo idea, at least for testing. It seems odd to me that we > > have the B44_DMARX_PTR value which indicates (ostensibly) the pointer to the > > descriptor to be processed next (the documentation isnt' very verbose on the > > subject), along with the EOT bit on a descriptor. It seems like both the > > register and the bit are capable of conveying the same (or at least overlapping) > > information. > > > > I think what I'm having the most trouble with is understanding when the hw looks > > at the EOT bit in the descriptor. If it completes a DMA and sees the EOT bit > > set, does the next DMA occur to the descriptor pointed to by the DMARX_ADDR > > register? Of does it stall until such time as the DMARX_PTR register is rotated > > around? What if it doesn't see the EOT bit set? Does it just keep going with > > the next descriptor? Since there is no OWN bit (at least not on the online doc I got : it says the rx_ring is read only by the NIC), I would say we really need to advance DMARX_PTR to signal NIC a new entry is available for following incoming frames. This is the reason rx_pending max value is B44_RX_RING_SIZE - 1, or else chip could loop on a circular rx_ring. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/drivers/net/b44.c b/drivers/net/b44.c index 3d247f3..1b58a7c 100644 --- a/drivers/net/b44.c +++ b/drivers/net/b44.c @@ -57,7 +57,7 @@ #define B44_MAX_MTU 1500 #define B44_RX_RING_SIZE 512 -#define B44_DEF_RX_RING_PENDING 200 +#define B44_DEF_RX_RING_PENDING 512 #define B44_RX_RING_BYTES (sizeof(struct dma_desc) * \ B44_RX_RING_SIZE) #define B44_TX_RING_SIZE 512