Message ID | 20171216010117.13275-1-jengelh@inai.de |
---|---|
State | Changes Requested |
Delegated to: | David Miller |
Headers | show |
Series | crypto: n2 - cure use after free | expand |
From: Jan Engelhardt <jengelh@inai.de> Date: Sat, 16 Dec 2017 02:01:17 +0100 > queue_cache_init is called for every crypto processor found. > > When first invoked, queue_cache[0] is NULL and queue_cache_init will > allocate a kmem_cache. If queue_cache_init returns a failure code, the > caller, grab_global_resources, will call queue_cache_destroy to release said > kmem_cache, but it does this without setting queue_cache_init[0] to NULL. That's not what's happening exactly. queue_cache_init() is not failing and returning a failure code. In fact, it is a very simple function which does nothing more than create kmem caches so is very much unlikely to fail, especially with the repeatability that you are seeing (ie. every time). Instead, what fails is the algorithm registry which you should look more deeply into the cause of. And this failure path is how we lead to the problem. This does need to be fixed, so please fix you commit message and _also_, more importantly, please CC: the crypto list as well as the crypto maintainer so that Herbert can see and integrate the fix. I would also like you to look into why the algorithm registry fails, if the selftest is running and getting incorrect results for the algorithm that is a huge issue and must be investigated and fixed. That is the true regression which is causing the failure path you see to run at all. Thank you. -- To unsubscribe from this list: send the line "unsubscribe sparclinux" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tuesday 2017-12-19 16:31, David Miller wrote: > >Instead, what fails is the algorithm registry which you should look >more deeply into the cause of. You are right. The registration failure is because the crypto layer expects halg->statesize to be non-zero, and drivers/crypto/n2_core.c does not set it, causing breakage since possibly v4.2-rc1-182-g8996eafdcbad (commit by rmk): (1471) halg = &ahash->halg; halg->digestsize = tmpl->digest_size; Nevertheless, I think that the error pathing in n2_core.c should be made robust as well. Should I resubmit with a new commit message? -- To unsubscribe from this list: send the line "unsubscribe sparclinux" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
From: Jan Engelhardt <jengelh@inai.de> Date: Tue, 19 Dec 2017 16:42:39 +0100 (CET) > Nevertheless, I think that the error pathing in n2_core.c should be made > robust as well. I completely agree. > Should I resubmit with a new commit message? Yes. -- To unsubscribe from this list: send the line "unsubscribe sparclinux" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/drivers/crypto/n2_core.c b/drivers/crypto/n2_core.c index 48de52cf2ecc..662e709812cc 100644 --- a/drivers/crypto/n2_core.c +++ b/drivers/crypto/n2_core.c @@ -1625,6 +1625,7 @@ static int queue_cache_init(void) CWQ_ENTRY_SIZE, 0, NULL); if (!queue_cache[HV_NCS_QTYPE_CWQ - 1]) { kmem_cache_destroy(queue_cache[HV_NCS_QTYPE_MAU - 1]); + queue_cache[HV_NCS_QTYPE_MAU - 1] = NULL; return -ENOMEM; } return 0; @@ -1634,6 +1635,8 @@ static void queue_cache_destroy(void) { kmem_cache_destroy(queue_cache[HV_NCS_QTYPE_MAU - 1]); kmem_cache_destroy(queue_cache[HV_NCS_QTYPE_CWQ - 1]); + queue_cache[HV_NCS_QTYPE_MAU - 1] = NULL; + queue_cache[HV_NCS_QTYPE_CWQ - 1] = NULL; } static long spu_queue_register_workfn(void *arg)
queue_cache_init is called for every crypto processor found. When first invoked, queue_cache[0] is NULL and queue_cache_init will allocate a kmem_cache. If queue_cache_init returns a failure code, the caller, grab_global_resources, will call queue_cache_destroy to release said kmem_cache, but it does this without setting queue_cache_init[0] to NULL. So when the second processor gets probed, queue_cache_init will leave queue_cache[0] at its bogus value, causing a BUG() to trigger when queue_cache[0] is eventually passed to kmem_cache_zalloc: n2_crypto: Found N2CP at /virtual-devices@100/n2cp@7 n2_crypto: Registered NCS HVAPI version 2.0 called queue_cache_init n2_crypto: md5 alg registration failed n2cp f028687c: /virtual-devices@100/n2cp@7: Unable to register algorithms. called queue_cache_destroy n2cp: probe of f028687c failed with error -22 n2_crypto: Found NCP at /virtual-devices@100/ncp@6 n2_crypto: Registered NCS HVAPI version 2.0 called queue_cache_init kernel BUG at mm/slab.c:2993! Call Trace: [0000000000604488] kmem_cache_alloc+0x1a8/0x1e0 (inlined) kmem_cache_zalloc (inlined) new_queue (inlined) spu_queue_setup (inlined) handle_exec_unit [0000000010c61eb4] spu_mdesc_scan+0x1f4/0x460 [n2_crypto] [0000000010c62b80] n2_mau_probe+0x100/0x220 [n2_crypto] [000000000084b174] platform_drv_probe+0x34/0xc0 Signed-off-by: Jan Engelhardt <jengelh@inai.de> --- drivers/crypto/n2_core.c | 3 +++ 1 file changed, 3 insertions(+)