Message ID | 744e01ea2c93200765ba8a77f0e6b0ca6baca513.1570662004.git.lorenzo@kernel.org |
---|---|
State | Changes Requested |
Delegated to: | David Miller |
Series | add XDP support to mvneta driver |
On Thu, 10 Oct 2019 01:18:34 +0200
Lorenzo Bianconi <lorenzo@kernel.org> wrote:

> mvneta driver can run on not cache coherent devices so it is
> necessary to sync dma buffers before sending them to the device
> in order to avoid memory corruption. This patch introduce a performance
> penalty and it is necessary to introduce a more sophisticated logic
> in order to avoid dma sync as much as we can

Report with benchmarks here:
 https://github.com/xdp-project/xdp-project/blob/master/areas/arm64/board_espressobin08_bench_xdp.org

We are testing this on an Espressobin board, and do see a huge
performance cost associated with this DMA-sync. Regardless, we still
want to get this patch merged, to move forward with XDP support for
this driver.

We promised each other (on IRC freenode #xdp) that we will follow up
with a solution/mitigation, after this patchset is merged. There are
several ideas, that likely should get separate upstream review.

> Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>

> ---
>  drivers/net/ethernet/marvell/mvneta.c | 4 ++++
>  1 file changed, 4 insertions(+)
>
> diff --git a/drivers/net/ethernet/marvell/mvneta.c b/drivers/net/ethernet/marvell/mvneta.c
> index 79a6bac0192b..ba4aa9bbc798 100644
> --- a/drivers/net/ethernet/marvell/mvneta.c
> +++ b/drivers/net/ethernet/marvell/mvneta.c
> @@ -1821,6 +1821,7 @@ static int mvneta_rx_refill(struct mvneta_port *pp,
>  			    struct mvneta_rx_queue *rxq,
>  			    gfp_t gfp_mask)
>  {
> +	enum dma_data_direction dma_dir;
>  	dma_addr_t phys_addr;
>  	struct page *page;
>
> @@ -1830,6 +1831,9 @@ static int mvneta_rx_refill(struct mvneta_port *pp,
>  		return -ENOMEM;
>
>  	phys_addr = page_pool_get_dma_addr(page) + pp->rx_offset_correction;
> +	dma_dir = page_pool_get_dma_dir(rxq->page_pool);
> +	dma_sync_single_for_device(pp->dev->dev.parent, phys_addr,
> +				   MVNETA_MAX_RX_BUF_SIZE, dma_dir);
>  	mvneta_rx_desc_fill(rx_desc, phys_addr, page, rxq);
>
>  	return 0;
Hi Lorenzo, Jesper,

On Thu, Oct 10, 2019 at 09:08:31AM +0200, Jesper Dangaard Brouer wrote:
> On Thu, 10 Oct 2019 01:18:34 +0200
> Lorenzo Bianconi <lorenzo@kernel.org> wrote:
>
> > mvneta driver can run on not cache coherent devices so it is
> > necessary to sync dma buffers before sending them to the device
> > in order to avoid memory corruption. This patch introduce a performance
> > penalty and it is necessary to introduce a more sophisticated logic
> > in order to avoid dma sync as much as we can
>
> Report with benchmarks here:
>  https://github.com/xdp-project/xdp-project/blob/master/areas/arm64/board_espressobin08_bench_xdp.org
>
> We are testing this on an Espressobin board, and do see a huge
> performance cost associated with this DMA-sync. Regardless, we still
> want to get this patch merged, to move forward with XDP support for
> this driver.
>
> We promised each other (on IRC freenode #xdp) that we will follow up
> with a solution/mitigation, after this patchset is merged. There are
> several ideas, that likely should get separate upstream review.

I think mentioning that the patch *introduces* a performance penalty is a bit
misleading.
The dma sync does have a performance penalty but it was always there.
The initial driver was mapping the DMA with DMA_FROM_DEVICE, which implies
syncing as well. In page_pool we do not explicitly sync buffers on allocation
and leave it up to the driver writer (and allow him some tricks to avoid that),
thus this patch is needed.

In any case what Jesper mentions is correct, we do have a plan :)

>
> > Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
>
> Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
>
> > ---
> >  drivers/net/ethernet/marvell/mvneta.c | 4 ++++
> >  1 file changed, 4 insertions(+)
> >
> > diff --git a/drivers/net/ethernet/marvell/mvneta.c b/drivers/net/ethernet/marvell/mvneta.c
> > index 79a6bac0192b..ba4aa9bbc798 100644
> > --- a/drivers/net/ethernet/marvell/mvneta.c
> > +++ b/drivers/net/ethernet/marvell/mvneta.c
> > @@ -1821,6 +1821,7 @@ static int mvneta_rx_refill(struct mvneta_port *pp,
> >  			    struct mvneta_rx_queue *rxq,
> >  			    gfp_t gfp_mask)
> >  {
> > +	enum dma_data_direction dma_dir;
> >  	dma_addr_t phys_addr;
> >  	struct page *page;
> >
> > @@ -1830,6 +1831,9 @@ static int mvneta_rx_refill(struct mvneta_port *pp,
> >  		return -ENOMEM;
> >
> >  	phys_addr = page_pool_get_dma_addr(page) + pp->rx_offset_correction;
> > +	dma_dir = page_pool_get_dma_dir(rxq->page_pool);
> > +	dma_sync_single_for_device(pp->dev->dev.parent, phys_addr,
> > +				   MVNETA_MAX_RX_BUF_SIZE, dma_dir);
> >  	mvneta_rx_desc_fill(rx_desc, phys_addr, page, rxq);
> >
> >  	return 0;
>
>
>
> --
> Best regards,
>   Jesper Dangaard Brouer
>   MSc.CS, Principal Kernel Engineer at Red Hat
>   LinkedIn: http://www.linkedin.com/in/brouer

Thanks!
/Ilias
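Ilias's accounting argument can be sketched with a toy model in plain C. All names below are hypothetical and none of this is kernel API. The model assumes the old receive path mapped a buffer on every refill and that each mapping implied one sync, while a page-pool-style path maps each page once and performs no sync unless the driver asks for it.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical counters for a toy model; none of this is kernel API. */
struct toy_stats {
	unsigned long maps;      /* DMA mapping operations */
	unsigned long dev_syncs; /* syncs done before the device owns a buffer */
};

/* Scheme A: map a buffer on every refill; the mapping implies a sync. */
static struct toy_stats toy_map_per_refill(unsigned long refills)
{
	struct toy_stats st = { 0, 0 };

	for (unsigned long i = 0; i < refills; i++) {
		st.maps++;
		st.dev_syncs++; /* assumed to be implied by the mapping */
	}
	return st;
}

/*
 * Scheme B: a pool maps each page once and recycles it without syncing.
 * Any sync has to come from the driver on refill.
 */
static struct toy_stats toy_pool_refill(unsigned long refills,
					unsigned long pool_pages,
					bool driver_syncs)
{
	struct toy_stats st = { 0, 0 };

	assert(pool_pages > 0);
	for (unsigned long i = 0; i < refills; i++) {
		if (i < pool_pages)
			st.maps++; /* first use of a page: map it once */
		if (driver_syncs)
			st.dev_syncs++; /* explicit sync added by the driver */
	}
	return st;
}
```

With 1000 refills over a 64-page pool, both schemes perform 1000 device syncs once the driver syncs explicitly; only the number of mappings drops (1000 versus 64). Without the driver sync the pool path performs none, which is the gap the patch closes.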
> Hi Lorenzo, Jesper,
>
> On Thu, Oct 10, 2019 at 09:08:31AM +0200, Jesper Dangaard Brouer wrote:
> > On Thu, 10 Oct 2019 01:18:34 +0200
> > Lorenzo Bianconi <lorenzo@kernel.org> wrote:
> >
> > > mvneta driver can run on not cache coherent devices so it is
> > > necessary to sync dma buffers before sending them to the device
> > > in order to avoid memory corruption. This patch introduce a performance
> > > penalty and it is necessary to introduce a more sophisticated logic
> > > in order to avoid dma sync as much as we can
> >
> > Report with benchmarks here:
> >  https://github.com/xdp-project/xdp-project/blob/master/areas/arm64/board_espressobin08_bench_xdp.org
> >
> > We are testing this on an Espressobin board, and do see a huge
> > performance cost associated with this DMA-sync. Regardless, we still
> > want to get this patch merged, to move forward with XDP support for
> > this driver.
> >
> > We promised each other (on IRC freenode #xdp) that we will follow up
> > with a solution/mitigation, after this patchset is merged. There are
> > several ideas, that likely should get separate upstream review.
>
> I think mentioning that the patch *introduces* a performance penalty is a bit
> misleading.
> The dma sync does have a performance penalty but it was always there.
> The initial driver was mapping the DMA with DMA_FROM_DEVICE, which implies
> syncing as well. In page_pool we do not explicitly sync buffers on allocation
> and leave it up to the driver writer (and allow him some tricks to avoid that),
> thus this patch is needed.

Reviewing the commit log I definitely agree, I will rewrite it in v3.
Thx

Regards,
Lorenzo

>
> In any case what Jesper mentions is correct, we do have a plan :)
>
> >
> > > Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
> >
> > Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
> >
> > > ---
> > >  drivers/net/ethernet/marvell/mvneta.c | 4 ++++
> > >  1 file changed, 4 insertions(+)
> > >
> > > diff --git a/drivers/net/ethernet/marvell/mvneta.c b/drivers/net/ethernet/marvell/mvneta.c
> > > index 79a6bac0192b..ba4aa9bbc798 100644
> > > --- a/drivers/net/ethernet/marvell/mvneta.c
> > > +++ b/drivers/net/ethernet/marvell/mvneta.c
> > > @@ -1821,6 +1821,7 @@ static int mvneta_rx_refill(struct mvneta_port *pp,
> > >  			    struct mvneta_rx_queue *rxq,
> > >  			    gfp_t gfp_mask)
> > >  {
> > > +	enum dma_data_direction dma_dir;
> > >  	dma_addr_t phys_addr;
> > >  	struct page *page;
> > >
> > > @@ -1830,6 +1831,9 @@ static int mvneta_rx_refill(struct mvneta_port *pp,
> > >  		return -ENOMEM;
> > >
> > >  	phys_addr = page_pool_get_dma_addr(page) + pp->rx_offset_correction;
> > > +	dma_dir = page_pool_get_dma_dir(rxq->page_pool);
> > > +	dma_sync_single_for_device(pp->dev->dev.parent, phys_addr,
> > > +				   MVNETA_MAX_RX_BUF_SIZE, dma_dir);
> > >  	mvneta_rx_desc_fill(rx_desc, phys_addr, page, rxq);
> > >
> > >  	return 0;
> >
> >
> >
> > --
> > Best regards,
> >   Jesper Dangaard Brouer
> >   MSc.CS, Principal Kernel Engineer at Red Hat
> >   LinkedIn: http://www.linkedin.com/in/brouer
>
> Thanks!
> /Ilias
> On Thu, 10 Oct 2019 01:18:34 +0200
> Lorenzo Bianconi <lorenzo@kernel.org> wrote:
>
> > mvneta driver can run on not cache coherent devices so it is
> > necessary to sync dma buffers before sending them to the device
> > in order to avoid memory corruption. This patch introduce a performance
> > penalty and it is necessary to introduce a more sophisticated logic
> > in order to avoid dma sync as much as we can
>
> Report with benchmarks here:
>  https://github.com/xdp-project/xdp-project/blob/master/areas/arm64/board_espressobin08_bench_xdp.org

Thx a lot Jesper for the detailed report, I will include this info in the commit log.

Regards,
Lorenzo

>
> We are testing this on an Espressobin board, and do see a huge
> performance cost associated with this DMA-sync. Regardless, we still
> want to get this patch merged, to move forward with XDP support for
> this driver.
>
> We promised each other (on IRC freenode #xdp) that we will follow up
> with a solution/mitigation, after this patchset is merged. There are
> several ideas, that likely should get separate upstream review.
>
> > Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
>
> Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
>
> > ---
> >  drivers/net/ethernet/marvell/mvneta.c | 4 ++++
> >  1 file changed, 4 insertions(+)
> >
> > diff --git a/drivers/net/ethernet/marvell/mvneta.c b/drivers/net/ethernet/marvell/mvneta.c
> > index 79a6bac0192b..ba4aa9bbc798 100644
> > --- a/drivers/net/ethernet/marvell/mvneta.c
> > +++ b/drivers/net/ethernet/marvell/mvneta.c
> > @@ -1821,6 +1821,7 @@ static int mvneta_rx_refill(struct mvneta_port *pp,
> >  			    struct mvneta_rx_queue *rxq,
> >  			    gfp_t gfp_mask)
> >  {
> > +	enum dma_data_direction dma_dir;
> >  	dma_addr_t phys_addr;
> >  	struct page *page;
> >
> > @@ -1830,6 +1831,9 @@ static int mvneta_rx_refill(struct mvneta_port *pp,
> >  		return -ENOMEM;
> >
> >  	phys_addr = page_pool_get_dma_addr(page) + pp->rx_offset_correction;
> > +	dma_dir = page_pool_get_dma_dir(rxq->page_pool);
> > +	dma_sync_single_for_device(pp->dev->dev.parent, phys_addr,
> > +				   MVNETA_MAX_RX_BUF_SIZE, dma_dir);
> >  	mvneta_rx_desc_fill(rx_desc, phys_addr, page, rxq);
> >
> >  	return 0;
>
>
>
> --
> Best regards,
>   Jesper Dangaard Brouer
>   MSc.CS, Principal Kernel Engineer at Red Hat
>   LinkedIn: http://www.linkedin.com/in/brouer
diff --git a/drivers/net/ethernet/marvell/mvneta.c b/drivers/net/ethernet/marvell/mvneta.c
index 79a6bac0192b..ba4aa9bbc798 100644
--- a/drivers/net/ethernet/marvell/mvneta.c
+++ b/drivers/net/ethernet/marvell/mvneta.c
@@ -1821,6 +1821,7 @@ static int mvneta_rx_refill(struct mvneta_port *pp,
 			    struct mvneta_rx_queue *rxq,
 			    gfp_t gfp_mask)
 {
+	enum dma_data_direction dma_dir;
 	dma_addr_t phys_addr;
 	struct page *page;
 
@@ -1830,6 +1831,9 @@ static int mvneta_rx_refill(struct mvneta_port *pp,
 		return -ENOMEM;
 
 	phys_addr = page_pool_get_dma_addr(page) + pp->rx_offset_correction;
+	dma_dir = page_pool_get_dma_dir(rxq->page_pool);
+	dma_sync_single_for_device(pp->dev->dev.parent, phys_addr,
+				   MVNETA_MAX_RX_BUF_SIZE, dma_dir);
 	mvneta_rx_desc_fill(rx_desc, phys_addr, page, rxq);
 
 	return 0;
mvneta driver can run on not cache coherent devices so it is
necessary to sync dma buffers before sending them to the device
in order to avoid memory corruption. This patch introduce a performance
penalty and it is necessary to introduce a more sophisticated logic
in order to avoid dma sync as much as we can

Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
---
 drivers/net/ethernet/marvell/mvneta.c | 4 ++++
 1 file changed, 4 insertions(+)
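The hazard described in the commit message can be illustrated with a small userspace toy model. This is only a sketch: every `toy_*` name is hypothetical, it is not kernel code, and it makes no claim about how mvneta or any real architecture implements cache maintenance. The model assumes a write-back CPU cache that the device bypasses, and a stand-in sync that writes back dirty lines and then drops them. What a real `dma_sync_single_for_device()` does is architecture-specific.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define TOY_LINE   64                      /* bytes per toy cache line */
#define TOY_NLINES 4
#define TOY_BUF    (TOY_LINE * TOY_NLINES) /* size of the toy RX buffer */

/* Hypothetical toy model, not kernel code: RAM plus a write-back CPU cache. */
struct toy_sys {
	unsigned char ram[TOY_BUF];
	unsigned char cache[TOY_BUF];
	bool valid[TOY_NLINES];
	bool dirty[TOY_NLINES];
};

/* Load a line from RAM into the cache if it is not already cached. */
static void toy_fill(struct toy_sys *s, size_t line)
{
	if (!s->valid[line]) {
		memcpy(&s->cache[line * TOY_LINE], &s->ram[line * TOY_LINE], TOY_LINE);
		s->valid[line] = true;
	}
}

/* CPU stores land in the cache and mark the line dirty. */
static void toy_cpu_write(struct toy_sys *s, size_t off, unsigned char v)
{
	assert(off < TOY_BUF);
	toy_fill(s, off / TOY_LINE);
	s->cache[off] = v;
	s->dirty[off / TOY_LINE] = true;
}

/* CPU loads are served from the cache whenever the line is valid. */
static unsigned char toy_cpu_read(struct toy_sys *s, size_t off)
{
	assert(off < TOY_BUF);
	toy_fill(s, off / TOY_LINE);
	return s->cache[off];
}

/* Device DMA goes straight to RAM; the non-coherent cache is not updated. */
static void toy_dma_write(struct toy_sys *s, unsigned char v, size_t len)
{
	assert(len <= TOY_BUF);
	memset(s->ram, v, len);
}

/* Stand-in for a sync-for-device: write back dirty lines, then drop them. */
static void toy_sync_for_device(struct toy_sys *s, size_t len)
{
	assert(len <= TOY_BUF);
	for (size_t line = 0; line * TOY_LINE < len; line++) {
		if (s->dirty[line])
			memcpy(&s->ram[line * TOY_LINE], &s->cache[line * TOY_LINE], TOY_LINE);
		s->dirty[line] = false;
		s->valid[line] = false;
	}
}

/* Returns the first byte the CPU sees after the device received a packet. */
static unsigned char toy_rx_once(bool sync_before_refill)
{
	struct toy_sys s;

	memset(&s, 0, sizeof(s));
	toy_cpu_write(&s, 0, 0x11);        /* CPU touched the recycled buffer */
	if (sync_before_refill)
		toy_sync_for_device(&s, TOY_BUF);
	toy_dma_write(&s, 0xAB, TOY_BUF);  /* device writes a new packet */
	return toy_cpu_read(&s, 0);
}
```

In this model `toy_rx_once(false)` returns the stale `0x11` because the CPU still holds its old cached line, while `toy_rx_once(true)` returns the `0xAB` written by the device. On real hardware a dirty line that is evicted later can also overwrite the freshly received data in memory, which is the corruption the commit message refers to.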