diff mbox

[2/2] siw: Add support for CRC32C offload instruction using libcrypto crc32c-intel

Message ID 1285187425-10950-1-git-send-email-nab@linux-iscsi.org
State Not Applicable, archived
Delegated to: David Miller
Headers show

Commit Message

Nicholas A. Bellinger Sept. 22, 2010, 8:30 p.m. UTC
From: Nicholas Bellinger <nab@linux-iscsi.org>

This patch updates siw_create_qp() to check for the CONFIG_X86 + cpu_has_xmm4_2
dependent use of the CRC32C instruction offload using libcrypto crc32c-intel.ko.
This patch will by default use crc32c-intel when available, and fall back to the
legacy slicing by 1x libcrypto crc32c.ko code when the instruction offload is not
availabe.

Because of the ability to selectively use c_tx->crc_enabled and c_rx->crc_enabled,
this patch has to use a handful of less than appealing goto's in order to handle the
default case where the offload is enabled by default, the crc32c-intel module is
not available and we need to fall back to slicing by 1x crc32c.ko, or we need to
fail all together.

Note that this code has only been compile tested so far, but a similar patch
to LIO-Target (w/o the selectively use of TX/RX checksums) to use crc32c-intel.ko
instruction offload is available here:

http://git.kernel.org/?p=linux/kernel/git/nab/lio-core-2.6.git;a=commitdiff;h=c92ef80235302a1f891f636691c928c3bb4e9ac1

Signed-off-by: Nicholas A. Bellinger <nab@linux-iscsi.org>
---
 softiwarp/siw_verbs.c |   37 ++++++++++++++++++++++++++++++++++++-
 1 files changed, 36 insertions(+), 1 deletions(-)

Comments

Nicholas A. Bellinger Sept. 22, 2010, 8:38 p.m. UTC | #1
On Wed, 2010-09-22 at 22:39 +0200, Andi Kleen wrote:
> On 9/22/2010 10:30 PM, Nicholas A. Bellinger wrote:
> > From: Nicholas Bellinger<nab@linux-iscsi.org>
> >
> > This patch updates siw_create_qp() to check for the CONFIG_X86 + cpu_has_xmm4_2
> > dependent use of the CRC32C instruction offload using libcrypto crc32c-intel.ko.
> > This patch will by default use crc32c-intel when available, and fall back to the
> > legacy slicing by 1x libcrypto crc32c.ko code when the instruction offload is not
> > availabe.
> 
> I don't think every caller should handle checks like this. The crypto 
> layer should load the right driver
> instead and provide the best driver under a generic algorithm name.
> 

Indeed, this would clean up the explict RX/TX CRC32C case quite a bit..
Unfortuately I am too busy with other items atm to cook up this patch,
but I would be happy to test it if someone wants to take it.  ;)

> Need CPUID module auto probing. I have an older patch that needs some fixes.
> 

Hmm, I don't see how that fits in here exactly.  Would you mind
elaborating a bit..?

Thanks!

--nab


--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Andi Kleen Sept. 22, 2010, 8:39 p.m. UTC | #2
On 9/22/2010 10:30 PM, Nicholas A. Bellinger wrote:
> From: Nicholas Bellinger<nab@linux-iscsi.org>
>
> This patch updates siw_create_qp() to check for the CONFIG_X86 + cpu_has_xmm4_2
> dependent use of the CRC32C instruction offload using libcrypto crc32c-intel.ko.
> This patch will by default use crc32c-intel when available, and fall back to the
> legacy slicing by 1x libcrypto crc32c.ko code when the instruction offload is not
> availabe.

I don't think every caller should handle checks like this. The crypto 
layer should load the right driver
instead and provide the best driver under a generic algorithm name.

Need CPUID module auto probing. I have an older patch that needs some fixes.

-Andi

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Jason Gunthorpe Sept. 22, 2010, 8:59 p.m. UTC | #3
On Wed, Sep 22, 2010 at 01:38:21PM -0700, Nicholas A. Bellinger wrote:

> Indeed, this would clean up the explict RX/TX CRC32C case quite a
> bit..  Unfortuately I am too busy with other items atm to cook up
> this patch, but I would be happy to test it if someone wants to take
> it.  ;)

Do you even need to do anything at all? crc32c is provided by
crc32c-intel at a higher priority already, so it should be used if it
is available..
 
> > Need CPUID module auto probing. I have an older patch that needs
> > some fixes.

> Hmm, I don't see how that fits in here exactly.  Would you mind
> elaborating a bit..?

Unless the module is loaded the optimized algorithm will not be
available for automatic use. Maybe your patch causes it to autoload
because of the by-name reference? Identifying modules to load by CPUID
will let userspace auto load the appropriate ones based on CPU...

Jason
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Nicholas A. Bellinger Sept. 22, 2010, 9 p.m. UTC | #4
On Wed, 2010-09-22 at 14:59 -0600, Jason Gunthorpe wrote:
> On Wed, Sep 22, 2010 at 01:38:21PM -0700, Nicholas A. Bellinger wrote:
> 
> > Indeed, this would clean up the explict RX/TX CRC32C case quite a
> > bit..  Unfortuately I am too busy with other items atm to cook up
> > this patch, but I would be happy to test it if someone wants to take
> > it.  ;)
> 
> Do you even need to do anything at all? crc32c is provided by
> crc32c-intel at a higher priority already, so it should be used if it
> is available..

I believe with the current libcrypto code that consumers are still
required to explictly ask for crc32c-intel offload.

>  
> > > Need CPUID module auto probing. I have an older patch that needs
> > > some fixes.
> 
> > Hmm, I don't see how that fits in here exactly.  Would you mind
> > elaborating a bit..?
> 
> Unless the module is loaded the optimized algorithm will not be
> available for automatic use. Maybe your patch causes it to autoload
> because of the by-name reference? Identifying modules to load by CPUID
> will let userspace auto load the appropriate ones based on CPU...
> 

Correct, this patch is so that autoload of crc32c-intel.ko 'just works'
and we do the fallback to the legacy slicing by 1x crc32c.ko when the
former is not availabe.

But I definately agree with Andi here that we should just add a wrapper
around the crypto_alloc_hash() usage of crc32c-intel and crc32c for
libcrypto consumers..

Thanks for your comments Jason!

--nab


--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Jason Gunthorpe Sept. 22, 2010, 9:18 p.m. UTC | #5
[cc list chopped]

On Wed, Sep 22, 2010 at 02:00:41PM -0700, Nicholas A. Bellinger wrote:
> On Wed, 2010-09-22 at 14:59 -0600, Jason Gunthorpe wrote:
> > On Wed, Sep 22, 2010 at 01:38:21PM -0700, Nicholas A. Bellinger wrote:
> > 
> > > Indeed, this would clean up the explict RX/TX CRC32C case quite a
> > > bit..  Unfortuately I am too busy with other items atm to cook up
> > > this patch, but I would be happy to test it if someone wants to take
> > > it.  ;)
> > 
> > Do you even need to do anything at all? crc32c is provided by
> > crc32c-intel at a higher priority already, so it should be used if it
> > is available..
> 
> I believe with the current libcrypto code that consumers are still
> required to explictly ask for crc32c-intel offload.

Really? It all looks OK to me.. What does your /proc/crypto say on a
system with crc32c-intel support?

Jason
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Nicholas A. Bellinger Sept. 22, 2010, 9:38 p.m. UTC | #6
On Wed, 2010-09-22 at 15:18 -0600, Jason Gunthorpe wrote:
> [cc list chopped]
> 
> On Wed, Sep 22, 2010 at 02:00:41PM -0700, Nicholas A. Bellinger wrote:
> > On Wed, 2010-09-22 at 14:59 -0600, Jason Gunthorpe wrote:
> > > On Wed, Sep 22, 2010 at 01:38:21PM -0700, Nicholas A. Bellinger wrote:
> > > 
> > > > Indeed, this would clean up the explict RX/TX CRC32C case quite a
> > > > bit..  Unfortuately I am too busy with other items atm to cook up
> > > > this patch, but I would be happy to test it if someone wants to take
> > > > it.  ;)
> > > 
> > > Do you even need to do anything at all? crc32c is provided by
> > > crc32c-intel at a higher priority already, so it should be used if it
> > > is available..
> > 
> > I believe with the current libcrypto code that consumers are still
> > required to explictly ask for crc32c-intel offload.
> 
> Really? It all looks OK to me.. What does your /proc/crypto say on a
> system with crc32c-intel support?

After a fresh boot /proc/crypto looks like:

name         : stdrng
driver       : krng
module       : kernel
priority     : 200
refcnt       : 1
selftest     : passed
type         : rng
seedsize     : 0

name         : crc32c
driver       : crc32c-generic
module       : kernel
priority     : 100
refcnt       : 2
selftest     : passed
type         : shash
blocksize    : 1
digestsize   : 4

name         : sha1
driver       : sha1-generic
module       : kernel
priority     : 0
refcnt       : 1
selftest     : passed
type         : shash
blocksize    : 64
digestsize   : 20

Once I start up the LIO-Target stack and some iSCSI Initiators login
and request crc32c-intel using crypto_alloc_hash() using a method similar
to what this patch for Softiwarp does, the following appears in at the top
of /proc/crypto:

name         : crc32c
driver       : crc32c-intel
module       : crc32c_intel
priority     : 200
refcnt       : 5
selftest     : passed
type         : shash
blocksize    : 1
digestsize   : 4

So I think the main bit here is the ability to request crc32c-intel.ko first,
and then fall back to crc32c.ko when the former is not available on CONFIG_X86.

Best,

--nab

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Jason Gunthorpe Sept. 22, 2010, 10:06 p.m. UTC | #7
On Wed, Sep 22, 2010 at 02:38:31PM -0700, Nicholas A. Bellinger wrote:

> So I think the main bit here is the ability to request
> crc32c-intel.ko first, and then fall back to crc32c.ko when the
> former is not available on CONFIG_X86.

Well, it is what Andi said, everything is working fine but there is no
mechanism to autoload the accelerated crypto module. If you did
modprobe crc32c_intel prior to loading your driver it would
automatically get crc32c-intel when it asks for crc32c since it is
loaded and a higher priority.

So, the drivers are correct to just request crc32c .. The work around
to limited autoprobing is so trivial (modprob crc32_intel) I'm not
sure including extra autoprobing code in the drivers is worthwhile?

Jason
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Nicholas A. Bellinger Sept. 22, 2010, 10:36 p.m. UTC | #8
On Wed, 2010-09-22 at 16:06 -0600, Jason Gunthorpe wrote:
> On Wed, Sep 22, 2010 at 02:38:31PM -0700, Nicholas A. Bellinger wrote:
> 
> > So I think the main bit here is the ability to request
> > crc32c-intel.ko first, and then fall back to crc32c.ko when the
> > former is not available on CONFIG_X86.
> 
> Well, it is what Andi said, everything is working fine but there is no
> mechanism to autoload the accelerated crypto module. If you did
> modprobe crc32c_intel prior to loading your driver it would
> automatically get crc32c-intel when it asks for crc32c since it is
> loaded and a higher priority.
> 

Ah, OK.  I see what you mean now here wrt to libcrypto priorities and
crc32c + crc32c_intel modules.   My apologies for the in-experience with
libcrypto here..

> So, the drivers are correct to just request crc32c .. The work around
> to limited autoprobing is so trivial (modprob crc32_intel) I'm not
> sure including extra autoprobing code in the drivers is worthwhile?
> 

Indeed, I am happy to drop this patch if Bernard would be nice enough to
add a 'modprobe crc32_intel' into the SIW scripts.

Thanks for your comments Jason!

--nab

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Bernard Metzler Sept. 23, 2010, 3:31 p.m. UTC | #9
linux-rdma-owner@vger.kernel.org wrote on 09/23/2010 12:36:29 AM:

> On Wed, 2010-09-22 at 16:06 -0600, Jason Gunthorpe wrote:
> > On Wed, Sep 22, 2010 at 02:38:31PM -0700, Nicholas A. Bellinger wrote:
> >
> > > So I think the main bit here is the ability to request
> > > crc32c-intel.ko first, and then fall back to crc32c.ko when the
> > > former is not available on CONFIG_X86.
> >
> > Well, it is what Andi said, everything is working fine but there is no
> > mechanism to autoload the accelerated crypto module. If you did
> > modprobe crc32c_intel prior to loading your driver it would
> > automatically get crc32c-intel when it asks for crc32c since it is
> > loaded and a higher priority.
> >
>
> Ah, OK.  I see what you mean now here wrt to libcrypto priorities and
> crc32c + crc32c_intel modules.   My apologies for the in-experience with
> libcrypto here..
>
> > So, the drivers are correct to just request crc32c .. The work around
> > to limited autoprobing is so trivial (modprob crc32_intel) I'm not
> > sure including extra autoprobing code in the drivers is worthwhile?
> >
>
> Indeed, I am happy to drop this patch if Bernard would be nice enough to
> add a 'modprobe crc32_intel' into the SIW scripts.
>

Ok, thanks for the CRC comments, quite instructive.  To sum up, now I'll
add
a minimum siw bringup script to the kernel part.

Bernard

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/softiwarp/siw_verbs.c b/softiwarp/siw_verbs.c
index 27e2e5e..00ac03c 100644
--- a/softiwarp/siw_verbs.c
+++ b/softiwarp/siw_verbs.c
@@ -348,7 +348,7 @@  struct ib_qp *siw_create_qp(struct ib_pd *ofa_pd, struct ib_qp_init_attr *attrs,
 	struct siw_iwarp_rx		*c_rx;
 	struct siw_uresp_create_qp	uresp;
 
-	int rv = 0;
+	int rv = 0, crc32c_offload = 1;
 
 	dprint(DBG_OBJ|DBG_CM, ": new QP on device %s\n",
 		ofa_dev->name);
@@ -460,6 +460,23 @@  struct ib_qp *siw_create_qp(struct ib_pd *ofa_pd, struct ib_qp_init_attr *attrs,
 	c_tx->crc_enabled = c_rx->crc_enabled = CONFIG_RDMA_SIW_CRC_ENFORCED;
 
 	if (c_tx->crc_enabled) {
+#ifdef CONFIG_X86
+		/*
+		 * Check for the Nehalem optimized crc32c-intel instructions
+		 * This is only currently available while running on bare-metal,
+		 * and is not yet available with QEMU-KVM guests.
+		 */
+		if (cpu_has_xmm4_2 && crc32c_offload) {
+			c_tx->mpa_crc_hd.tfm = crypto_alloc_hash("crc32c-intel",
+						0, CRYPTO_ALG_ASYNC);
+			if (IS_ERR(c_tx->mpa_crc_hd.tfm)) {
+				crc32c_offload = 0;
+				goto check_legacy_tx;
+			}
+			goto check_rx;
+		}
+check_legacy_tx:
+#endif /* CONFIG_X86 */
 		c_tx->mpa_crc_hd.tfm =
 			crypto_alloc_hash("crc32c", 0, CRYPTO_ALG_ASYNC);
 		if (IS_ERR(c_tx->mpa_crc_hd.tfm)) {
@@ -469,7 +486,24 @@  struct ib_qp *siw_create_qp(struct ib_pd *ofa_pd, struct ib_qp_init_attr *attrs,
 			goto remove_qp;
 		}
 	}
+check_rx:
 	if (c_rx->crc_enabled) {
+#ifdef CONFIG_X86
+		/*
+		 * Check for the Nehalem optimized crc32c-intel instructions
+		 * This is only currently available while running on bare-metal,
+		 * and is not yet available with QEMU-KVM guests.
+		 */
+		if (cpu_has_xmm4_2 && crc32c_offload) {
+			c_rx->mpa_crc_hd.tfm = crypto_alloc_hash("crc32c-intel",
+						0, CRYPTO_ALG_ASYNC);
+			if (IS_ERR(c_rx->mpa_crc_hd.tfm))
+				goto check_legacy_rx;
+
+			goto after_crc32c;
+		}
+check_legacy_rx:
+#endif /* CONFIG_X86 */
 		c_rx->mpa_crc_hd.tfm =
 			crypto_alloc_hash("crc32c", 0, CRYPTO_ALG_ASYNC);
 		if (IS_ERR(c_rx->mpa_crc_hd.tfm)) {
@@ -478,6 +512,7 @@  struct ib_qp *siw_create_qp(struct ib_pd *ofa_pd, struct ib_qp_init_attr *attrs,
 			goto remove_qp;
 		}
 	}
+after_crc32c:
 	atomic_set(&qp->tx_ctx.in_use, 0);
 
 	qp->ofa_qp.qp_num = QP_ID(qp);