[mlx5-next,00/10] Use ODP MRs for kernel ULPs

Message ID 20200115124340.79108-1-leon@kernel.org

Message

Leon Romanovsky Jan. 15, 2020, 12:43 p.m. UTC
From: Leon Romanovsky <leonro@mellanox.com>

Hi,

The following series extends the MR creation routines to allow creation of
user MRs through kernel ULPs acting as a proxy. The immediate use case is to
allow RDS to work over FS-DAX, which requires ODP (on-demand paging)
MRs to be created; creating such MRs was not possible prior to this
series.

The first part of this patchset extends RDMA with a special verb,
ib_reg_user_mr(). The common use case for this function is a userspace
application that allocates memory for HCA access while the responsibility
for registering that memory with the HCA lies with a kernel ULP. The ULP
acts as an agent for the userspace application.

The second part provides advise MR functionality for ULPs. This is an
integral part of ODP flows and is used to trigger page faults in advance,
preparing memory before running the working set.
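
To illustrate why prefetching matters for ODP, here is a minimal userspace
sketch. It is a toy model only and does not use or reproduce the kernel API:
every name in it (model_odp_mr, model_reg_odp_mr, model_prefetch,
model_access, the 4 KiB page size and the 64-page limit) is a hypothetical
stand-in chosen for this example. Registration maps nothing, a data-path
access to an unmapped page counts as a fault, and a prefetch maps pages ahead
of time so that later accesses fault less.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* All names below are hypothetical; this is a model, not kernel code. */
#define MODEL_PAGE_SHIFT 12u
#define MODEL_PAGE_SIZE  (1u << MODEL_PAGE_SHIFT)
#define MODEL_MAX_PAGES  64u

struct model_odp_mr {
	uint64_t start;                /* first byte covered by the MR */
	uint64_t length;               /* number of bytes covered */
	size_t npages;                 /* pages spanned by the range */
	bool present[MODEL_MAX_PAGES]; /* true once a page is mapped */
	unsigned int faults;           /* faults taken on the data path */
};

/* Register a range without mapping any page (the on-demand part). */
static int model_reg_odp_mr(struct model_odp_mr *mr, uint64_t start,
			    uint64_t length)
{
	uint64_t first, last;

	if (length == 0)
		return -1;
	first = start >> MODEL_PAGE_SHIFT;
	last = (start + length - 1) >> MODEL_PAGE_SHIFT;
	if (last - first + 1 > MODEL_MAX_PAGES)
		return -1;

	memset(mr, 0, sizeof(*mr));
	mr->start = start;
	mr->length = length;
	mr->npages = (size_t)(last - first + 1);
	return 0;
}

/* Map all pages in [addr, addr + len); return pages newly mapped, or -1. */
static int model_map_range(struct model_odp_mr *mr, uint64_t addr,
			   uint64_t len)
{
	uint64_t base = mr->start >> MODEL_PAGE_SHIFT;
	uint64_t first, last, i;
	int mapped = 0;

	/* Reject empty ranges and ranges outside the registered region. */
	if (len == 0 || addr < mr->start ||
	    addr - mr->start > mr->length ||
	    len > mr->length - (addr - mr->start))
		return -1;

	first = (addr >> MODEL_PAGE_SHIFT) - base;
	last = ((addr + len - 1) >> MODEL_PAGE_SHIFT) - base;
	assert(last < mr->npages);
	for (i = first; i <= last; i++) {
		if (!mr->present[i]) {
			mr->present[i] = true;
			mapped++;
		}
	}
	return mapped;
}

/* Prefetch: map pages in advance, outside the data path. */
static int model_prefetch(struct model_odp_mr *mr, uint64_t addr,
			  uint64_t len)
{
	return model_map_range(mr, addr, len);
}

/* Data-path access: each page that is not yet mapped costs one fault. */
static int model_access(struct model_odp_mr *mr, uint64_t addr,
			uint64_t len)
{
	int mapped = model_map_range(mr, addr, len);

	if (mapped < 0)
		return -1;
	mr->faults += (unsigned int)mapped;
	return 0;
}
```

In this model, registering eight pages and touching the first one costs one
fault; calling model_prefetch() over the whole range afterwards maps the
remaining seven pages, and subsequent accesses inside the range leave the
fault counter unchanged.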

The third part is the actual user of those in-kernel APIs.

Thanks

Hans Westgaard Ry (3):
  net/rds: Detect need of On-Demand-Paging memory registration
  net/rds: Handle ODP mr registration/unregistration
  net/rds: Use prefetch for On-Demand-Paging MR

Jason Gunthorpe (1):
  RDMA/mlx5: Fix handling of IOVA != user_va in ODP paths

Leon Romanovsky (1):
  RDMA/mlx5: Don't fake udata for kernel path

Moni Shoua (5):
  IB: Allow calls to ib_umem_get from kernel ULPs
  IB/core: Introduce ib_reg_user_mr
  IB/core: Add interface to advise_mr for kernel users
  IB/mlx5: Add ODP WQE handlers for kernel QPs
  IB/mlx5: Mask out unsupported ODP capabilities for kernel QPs

 drivers/infiniband/core/umem.c                |  27 +--
 drivers/infiniband/core/umem_odp.c            |  29 +--
 drivers/infiniband/core/verbs.c               |  41 +++++
 drivers/infiniband/hw/bnxt_re/ib_verbs.c      |  12 +-
 drivers/infiniband/hw/cxgb4/mem.c             |   2 +-
 drivers/infiniband/hw/efa/efa_verbs.c         |   4 +-
 drivers/infiniband/hw/hns/hns_roce_cq.c       |   2 +-
 drivers/infiniband/hw/hns/hns_roce_db.c       |   3 +-
 drivers/infiniband/hw/hns/hns_roce_mr.c       |   4 +-
 drivers/infiniband/hw/hns/hns_roce_qp.c       |   2 +-
 drivers/infiniband/hw/hns/hns_roce_srq.c      |   5 +-
 drivers/infiniband/hw/i40iw/i40iw_verbs.c     |   5 +-
 drivers/infiniband/hw/mlx4/cq.c               |   2 +-
 drivers/infiniband/hw/mlx4/doorbell.c         |   3 +-
 drivers/infiniband/hw/mlx4/mr.c               |   8 +-
 drivers/infiniband/hw/mlx4/qp.c               |   5 +-
 drivers/infiniband/hw/mlx4/srq.c              |   3 +-
 drivers/infiniband/hw/mlx5/cq.c               |   6 +-
 drivers/infiniband/hw/mlx5/devx.c             |   2 +-
 drivers/infiniband/hw/mlx5/doorbell.c         |   3 +-
 drivers/infiniband/hw/mlx5/main.c             |  51 ++++--
 drivers/infiniband/hw/mlx5/mlx5_ib.h          |  12 +-
 drivers/infiniband/hw/mlx5/mr.c               |  20 +--
 drivers/infiniband/hw/mlx5/odp.c              |  33 ++--
 drivers/infiniband/hw/mlx5/qp.c               | 167 +++++++++++-------
 drivers/infiniband/hw/mlx5/srq.c              |   2 +-
 drivers/infiniband/hw/mthca/mthca_provider.c  |   2 +-
 drivers/infiniband/hw/ocrdma/ocrdma_verbs.c   |   2 +-
 drivers/infiniband/hw/qedr/verbs.c            |   9 +-
 drivers/infiniband/hw/vmw_pvrdma/pvrdma_cq.c  |   2 +-
 drivers/infiniband/hw/vmw_pvrdma/pvrdma_mr.c  |   2 +-
 drivers/infiniband/hw/vmw_pvrdma/pvrdma_qp.c  |   7 +-
 drivers/infiniband/hw/vmw_pvrdma/pvrdma_srq.c |   2 +-
 drivers/infiniband/sw/rdmavt/mr.c             |   2 +-
 drivers/infiniband/sw/rxe/rxe_mr.c            |   2 +-
 include/rdma/ib_umem.h                        |   4 +-
 include/rdma/ib_umem_odp.h                    |   6 +-
 include/rdma/ib_verbs.h                       |   9 +
 net/rds/ib.c                                  |   7 +
 net/rds/ib.h                                  |   3 +-
 net/rds/ib_mr.h                               |   7 +-
 net/rds/ib_rdma.c                             |  83 ++++++++-
 net/rds/ib_send.c                             |  44 +++--
 net/rds/rdma.c                                | 156 +++++++++++-----
 net/rds/rds.h                                 |  13 +-
 45 files changed, 559 insertions(+), 256 deletions(-)

--
2.20.1

Comments

Leon Romanovsky Jan. 16, 2020, 6:59 a.m. UTC | #1
On Wed, Jan 15, 2020 at 02:43:30PM +0200, Leon Romanovsky wrote:
> [...]
>  45 files changed, 559 insertions(+), 256 deletions(-)

Thanks, Santosh, for your review.

David,
Is it OK to route these patches through the RDMA tree, given that
we are touching a lot of files in drivers/infiniband/*?

There is no conflict between the netdev and RDMA versions of RDS, but to be
on the safe side, I'll put all this code in the mlx5-next tree.

Thanks

Jason Gunthorpe Jan. 16, 2020, 1:57 p.m. UTC | #2
On Thu, Jan 16, 2020 at 06:59:29AM +0000, Leon Romanovsky wrote:
> >  45 files changed, 559 insertions(+), 256 deletions(-)
> 
> Thanks Santosh for your review.
> 
> David,
> Is it ok to route those patches through RDMA tree given the fact that
> we are touching a lot of files in drivers/infiniband/* ?
> 
> There is no conflict between netdev and RDMA versions of RDS, but to be
> on safe side, I'll put all this code to mlx5-next tree.

Er, let's not contaminate mlx5-next with this..

It looks like it applies cleanly to -rc6, so if it has to be in both
trees, a clean PR against -rc5/6 is the way to do it.

Santosh, do you anticipate more RDS patches this cycle?

Jason

Leon Romanovsky Jan. 16, 2020, 2:04 p.m. UTC | #3
On Thu, Jan 16, 2020 at 01:57:05PM +0000, Jason Gunthorpe wrote:
> On Thu, Jan 16, 2020 at 06:59:29AM +0000, Leon Romanovsky wrote:
> > >  45 files changed, 559 insertions(+), 256 deletions(-)
> >
> > Thanks Santosh for your review.
> >
> > David,
> > Is it ok to route those patches through RDMA tree given the fact that
> > we are touching a lot of files in drivers/infiniband/* ?
> >
> > There is no conflict between netdev and RDMA versions of RDS, but to be
> > on safe side, I'll put all this code to mlx5-next tree.
>
> Er, lets not contaminate the mlx5-next with this..
>
> It looks like it applies clean to -rc6 so if it has to be in both
> trees a clean PR against -rc5/6 is the way to do it.

Yes, it applies cleanly.

>
> Santos, do you anticipate more RDS patches this cycle?
>
> Jason

Santosh Shilimkar Jan. 16, 2020, 7:34 p.m. UTC | #4
On 1/16/20 5:57 AM, Jason Gunthorpe wrote:
> On Thu, Jan 16, 2020 at 06:59:29AM +0000, Leon Romanovsky wrote:
>>>   45 files changed, 559 insertions(+), 256 deletions(-)
>>
>> Thanks Santosh for your review.
>>
>> David,
>> Is it ok to route those patches through RDMA tree given the fact that
>> we are touching a lot of files in drivers/infiniband/* ?
>>
>> There is no conflict between netdev and RDMA versions of RDS, but to be
>> on safe side, I'll put all this code to mlx5-next tree.
> 
> Er, lets not contaminate the mlx5-next with this..
> 
> It looks like it applies clean to -rc6 so if it has to be in both
> trees a clean PR against -rc5/6 is the way to do it.
> 
> Santos, do you anticipate more RDS patches this cycle?
> 

Not for the upcoming merge window, AFAIK.

Jason Gunthorpe Jan. 17, 2020, 2:12 p.m. UTC | #5
On Thu, Jan 16, 2020 at 11:34:18AM -0800, santosh.shilimkar@oracle.com wrote:
> On 1/16/20 5:57 AM, Jason Gunthorpe wrote:
> > On Thu, Jan 16, 2020 at 06:59:29AM +0000, Leon Romanovsky wrote:
> > > >   45 files changed, 559 insertions(+), 256 deletions(-)
> > > 
> > > Thanks Santosh for your review.
> > > 
> > > David,
> > > Is it ok to route those patches through RDMA tree given the fact that
> > > we are touching a lot of files in drivers/infiniband/* ?
> > > 
> > > There is no conflict between netdev and RDMA versions of RDS, but to be
> > > on safe side, I'll put all this code to mlx5-next tree.
> > 
> > Er, lets not contaminate the mlx5-next with this..
> > 
> > It looks like it applies clean to -rc6 so if it has to be in both
> > trees a clean PR against -rc5/6 is the way to do it.
> > 
> > Santos, do you anticipate more RDS patches this cycle?
> > 
> 
> Not for upcoming merge window afaik.

In that case, DaveM, will you ack so we can take it through RDMA?

The RDMA pieces look OK to me; like Santosh, I have reviewed many
versions of this already.

Thanks,
Jason