From patchwork Tue Jul 11 06:06:39 2017 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Cyril Bur X-Patchwork-Id: 786483 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Received: from lists.ozlabs.org (lists.ozlabs.org [103.22.144.68]) (using TLSv1.2 with cipher ADH-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 3x6BWm6Bnsz9s7C for ; Tue, 11 Jul 2017 16:10:12 +1000 (AEST) Received: from lists.ozlabs.org (lists.ozlabs.org [IPv6:2401:3900:2:1::3]) by lists.ozlabs.org (Postfix) with ESMTP id 3x6BWm4cqyzDqgY for ; Tue, 11 Jul 2017 16:10:12 +1000 (AEST) X-Original-To: skiboot@lists.ozlabs.org Delivered-To: skiboot@lists.ozlabs.org Received: from mx0a-001b2d01.pphosted.com (mx0a-001b2d01.pphosted.com [148.163.156.1]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by lists.ozlabs.org (Postfix) with ESMTPS id 3x6BWd1SGCzDqb4 for ; Tue, 11 Jul 2017 16:10:05 +1000 (AEST) Received: from pps.filterd (m0098393.ppops.net [127.0.0.1]) by mx0a-001b2d01.pphosted.com (8.16.0.21/8.16.0.21) with SMTP id v6B693tS138056 for ; Tue, 11 Jul 2017 02:10:02 -0400 Received: from e23smtp06.au.ibm.com (e23smtp06.au.ibm.com [202.81.31.148]) by mx0a-001b2d01.pphosted.com with ESMTP id 2bmbr6a0fc-1 (version=TLSv1.2 cipher=AES256-SHA bits=256 verify=NOT) for ; Tue, 11 Jul 2017 02:10:01 -0400 Received: from localhost by e23smtp06.au.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted for from ; Tue, 11 Jul 2017 16:09:58 +1000 Received: from d23relay08.au.ibm.com (202.81.31.227) by e23smtp06.au.ibm.com (202.81.31.212) with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted; Tue, 11 Jul 2017 16:09:56 +1000 Received: from d23av03.au.ibm.com (d23av03.au.ibm.com [9.190.234.97]) by d23relay08.au.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id v6B68eaO21495948 for ; Tue, 11 Jul 2017 16:08:40 +1000 Received: from d23av03.au.ibm.com (localhost [127.0.0.1]) by d23av03.au.ibm.com (8.14.4/8.14.4/NCO v10.0 AVout) with ESMTP id v6B68V6I019079 for ; Tue, 11 Jul 2017 16:08:32 +1000 Received: from ozlabs.au.ibm.com (ozlabs.au.ibm.com [9.192.253.14]) by d23av03.au.ibm.com (8.14.4/8.14.4/NCO v10.0 AVin) with ESMTP id v6B68VQ8019069; Tue, 11 Jul 2017 16:08:31 +1000 Received: from camb691.ozlabs.ibm.com (haven.au.ibm.com [9.192.254.114]) (using TLSv1.2 with cipher DHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by ozlabs.au.ibm.com (Postfix) with ESMTPSA id BAC99A0351; Tue, 11 Jul 2017 16:08:39 +1000 (AEST) From: Cyril Bur To: skiboot@lists.ozlabs.org Date: Tue, 11 Jul 2017 16:06:39 +1000 X-Mailer: git-send-email 2.13.2 In-Reply-To: <20170711060639.960-1-cyril.bur@au1.ibm.com> References: <20170711060639.960-1-cyril.bur@au1.ibm.com> X-TM-AS-MML: disable x-cbid: 17071106-0040-0000-0000-00000344B6BC X-IBM-AV-DETECTION: SAVI=unused REMOTE=unused XFE=unused x-cbparentid: 17071106-0041-0000-0000-00000CC02306 Message-Id: <20170711060639.960-4-cyril.bur@au1.ibm.com> X-Proofpoint-Virus-Version: vendor=fsecure engine=2.50.10432:, , definitions=2017-07-11_02:, , signatures=0 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 spamscore=0 suspectscore=1 malwarescore=0 phishscore=0 adultscore=0 bulkscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1703280000 definitions=main-1707110093 Subject: [Skiboot] [PATCH v2 4/4] core/flash: Make opal_flash_op() actually asynchronous X-BeenThere: skiboot@lists.ozlabs.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Mailing list for skiboot development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: alistair@popple.id.au, rlippert@google.com MIME-Version: 1.0 Errors-To: skiboot-bounces+incoming=patchwork.ozlabs.org@lists.ozlabs.org Sender: "Skiboot" This patch provides a simple (although not particularly efficient) asynchronous capability to the opal_flash interface. The advantage of this approach is that it doesn't require any changing of blocklevel or its backends to provide an asynchronous implementation. This is also the disadvantage of this implementation as all it actually does is break the work up in chunks that it can performed quickly, but still synchronously. Only a backend could provide a more asynchronous implementation. This solves a problem we have right now with the opal_flash_erase call where it can block in Skiboot for around three minutes. This causes a variety of problems in Linux due to a processor being gone for a long time. For example: [ 98.610043] INFO: rcu_sched detected stalls on CPUs/tasks: [ 98.610050] 113-...: (1 GPs behind) idle=96f/140000000000000/0 softirq=527/528 fqs=1044 [ 98.610051] (detected by 112, t=2102 jiffies, g=223, c=222, q=123) [ 98.610060] Task dump for CPU 113: [ 98.610062] pflash R running task 0 3335 3333 0x00040004 [ 98.610066] Call Trace: [ 98.610070] [c000001fdd847730] [0000000000000001] 0x1 (unreliable) [ 98.610076] [c000001fdd847900] [c000000000013854] __switch_to+0x1e8/0x1f4 [ 98.610081] [c000001fdd847960] [c0000000006122c4] __schedule+0x32c/0x874 [ 98.610083] [c000001fdd847a30] [c000001fdd847b40] 0xc000001fdd847b40 It is for this reason that breaking the work up in smaller chunks solves this problem as Skiboot can return the CPU to Linux between chunks to avoid Linux getting upset. Reported-By: Samuel Mendoza-Jonas Signed-off-by: Cyril Bur --- V2: Use the second suggestion after Alistair didn't like v1 Refactored the code from what was initially sent core/flash.c | 136 ++++++++++++++++++++++---------- doc/opal-api/opal-flash-110-111-112.rst | 4 + 2 files changed, 100 insertions(+), 40 deletions(-) diff --git a/core/flash.c b/core/flash.c index 177f7ae1..f953490e 100644 --- a/core/flash.c +++ b/core/flash.c @@ -28,6 +28,23 @@ #include #include #include +#include +#include + +enum flash_op { + FLASH_OP_READ, + FLASH_OP_WRITE, + FLASH_OP_ERASE, +}; + +struct flash_async_info { + enum flash_op op; + struct timer poller; + uint64_t token; + uint64_t pos; + uint64_t len; + uint64_t buf; +}; struct flash { struct list_node list; @@ -36,6 +53,7 @@ struct flash { uint64_t size; uint32_t block_size; int id; + struct flash_async_info async; }; static LIST_HEAD(flashes); @@ -263,6 +281,64 @@ static int num_flashes(void) return i; } +/* + * Called with flash lock held, drop it on async completion + */ +static void flash_poll(struct timer *t __unused, void *data, + uint64_t now __unused) +{ + struct flash *flash = data; + uint64_t len; + int rc; + + len = MIN(flash->async.len, flash->block_size); + + /* + * These ops intentionally have no smarts (ecc correction or erase + * before write) to them. + * Skiboot is simply exposing the PNOR flash to the host. + * The host is expected to understand that this is a raw flash + * device and treat it as such. + */ + + switch (flash->async.op) { + case FLASH_OP_READ: + rc = blocklevel_raw_read(flash->bl, flash->async.pos, + (void *)flash->async.buf, len); + break; + case FLASH_OP_WRITE: + rc = blocklevel_raw_write(flash->bl, flash->async.pos, + (void *)flash->async.buf, len); + break; + case FLASH_OP_ERASE: + rc = blocklevel_erase(flash->bl, flash->async.pos, len); + break; + default: + assert(0); + } + + if (rc) { + rc = OPAL_HARDWARE; + goto out; + } + + flash->async.pos += len; + flash->async.buf += len; + flash->async.len -= len; + if (flash->async.len) { + /* + * We want to get called pretty much straight away, just have + * to be sure that we jump back out to Linux so that we don't + * cause RCU or the scheduler to freak. + */ + schedule_timer(&flash->async.poller, msecs_to_tb(1)); + return; + } +out: + opal_queue_msg(OPAL_MSG_ASYNC_COMP, NULL, NULL, flash->async.token, rc); + flash_release(flash); +} + int flash_register(struct blocklevel_device *bl) { uint64_t size; @@ -295,6 +371,7 @@ int flash_register(struct blocklevel_device *bl) flash->size = size; flash->block_size = block_size; flash->id = num_flashes(); + init_timer(&flash->async.poller, flash_poll, flash); list_add(&flashes, &flash->list); @@ -323,17 +400,10 @@ int flash_register(struct blocklevel_device *bl) return OPAL_SUCCESS; } -enum flash_op { - FLASH_OP_READ, - FLASH_OP_WRITE, - FLASH_OP_ERASE, -}; - static int64_t opal_flash_op(enum flash_op op, uint64_t id, uint64_t offset, uint64_t buf, uint64_t size, uint64_t token) { struct flash *flash = NULL; - int rc; list_for_each(&flashes, flash, list) if (flash->id == id) @@ -350,44 +420,30 @@ static int64_t opal_flash_op(enum flash_op op, uint64_t id, uint64_t offset, || offset + size > flash->size) { prlog(PR_DEBUG, "Requested flash op %d beyond flash size %" PRIu64 "\n", op, flash->size); - rc = OPAL_PARAMETER; - goto err; - } - - /* - * These ops intentionally have no smarts (ecc correction or erase - * before write) to them. - * Skiboot is simply exposing the PNOR flash to the host. - * The host is expected to understand that this is a raw flash - * device and treat it as such. - */ - switch (op) { - case FLASH_OP_READ: - rc = blocklevel_raw_read(flash->bl, offset, (void *)buf, size); - break; - case FLASH_OP_WRITE: - rc = blocklevel_raw_write(flash->bl, offset, (void *)buf, size); - break; - case FLASH_OP_ERASE: - rc = blocklevel_erase(flash->bl, offset, size); - break; - default: - assert(0); + flash_release(flash); + return OPAL_PARAMETER; } - if (rc) { - rc = OPAL_HARDWARE; - goto err; - } + flash->async.token = token; + flash->async.op = op; + flash->async.pos = offset; + flash->async.buf = buf; + flash->async.len = size; - flash_release(flash); + /* Kick off the process */ + flash_poll(&flash->async.poller, flash, mftb()); - opal_queue_msg(OPAL_MSG_ASYNC_COMP, NULL, NULL, token, rc); + /* + * As of 1/07/2017 the powernv_flash driver in Linux will handle + * OPAL_SUCCESS as an error, the only thing that makes it handle + * things as though they're working is receiving + * OPAL_ASYNC_COMPLETION. + * + * This pretty much rules out any optimisation where the required + * work could be done stright away with no need for + * OPAL_ASYNC_COMPLETION. Oh well. + */ return OPAL_ASYNC_COMPLETION; - -err: - flash_release(flash); - return rc; } static int64_t opal_flash_read(uint64_t id, uint64_t offset, uint64_t buf, diff --git a/doc/opal-api/opal-flash-110-111-112.rst b/doc/opal-api/opal-flash-110-111-112.rst index 71ba866d..086c4095 100644 --- a/doc/opal-api/opal-flash-110-111-112.rst +++ b/doc/opal-api/opal-flash-110-111-112.rst @@ -20,6 +20,10 @@ success, the calls will return ``OPAL_ASYNC_COMPLETION``, and an opal_async_completion message will be sent (with the appropriate token argument) when the operation completes. +Due to an error in the powernv_flash driver in Linux these three OPAL +calls should never return ``OPAL_SUCCESS`` as the driver is likely to +treat this return value as an error. + All calls share the same return values: ``OPAL_ASYNC_COMPLETION``