From: Grant Erickson
To: linux-mtd@lists.infradead.org
Cc: Jarkko Lavinen, Artem Bityutskiy
Subject: [PATCH v3] Retry Large Buffer Allocations
Date: Wed, 6 Apr 2011 08:56:42 -0700
Message-Id: <1302105402-12990-1-git-send-email-marathon96@gmail.com>
X-Patchwork-Id: 90037

When handling user space read or write requests via mtd_{read,write}
or JFFS2 medium scan requests, exponentially back off on the size of
the requested kernel transfer
buffer until it succeeds or until the requested transfer buffer size
falls below the page size.

This helps ensure the operation can succeed under low-memory,
highly-fragmented situations albeit somewhat more slowly.

v2: Incorporated coding style and comment feedback from Artem.
v3: Incorporated more feedback from Artem. Retargeted patch against
    l2-mtd-2.6.

Signed-off-by: Grant Erickson
---
 drivers/mtd/mtdchar.c   |   50 +++++++++++++++++++++-------------------------
 drivers/mtd/mtdcore.c   |   41 ++++++++++++++++++++++++++++++++++++++
 fs/jffs2/scan.c         |   11 +++++----
 include/linux/mtd/mtd.h |    2 +
 4 files changed, 72 insertions(+), 32 deletions(-)

1.7.4.2

diff --git a/drivers/mtd/mtdchar.c b/drivers/mtd/mtdchar.c
index 145b3d0d..8651a79 100644
--- a/drivers/mtd/mtdchar.c
+++ b/drivers/mtd/mtdchar.c
@@ -166,10 +166,23 @@ static int mtd_close(struct inode *inode, struct file *file)
 	return 0;
 } /* mtd_close */
 
-/* FIXME: This _really_ needs to die. In 2.5, we should lock the
-   userspace buffer down and use it directly with readv/writev.
-*/
-#define MAX_KMALLOC_SIZE 0x20000
+/* Back in April 2005, Linus wrote:
+ *
+ *   FIXME: This _really_ needs to die. In 2.5, we should lock the
+ *   userspace buffer down and use it directly with readv/writev.
+ *
+ * The implementation below, using mtd_kmalloc_up_to, mitigates
+ * allocation failures when the system is under low-memory situations
+ * or if memory is highly fragmented at the cost of reducing the
+ * performance of the requested transfer due to a smaller buffer size.
+ *
+ * A more complex but more memory-efficient implementation based on
+ * get_user_pages and iovecs to cover extents of those pages is a
+ * longer-term goal, as intimated by Linus above. However, for the
+ * write case, this requires yet more complex head and tail transfer
+ * handling when those head and tail offsets and sizes are such that
+ * alignment requirements are not met in the NAND subdriver.
+ */
 
 static ssize_t mtd_read(struct file *file, char __user *buf, size_t count,loff_t *ppos)
 {
@@ -179,6 +192,7 @@ static ssize_t mtd_read(struct file *file, char __user *buf, size_t count,loff_t
 	size_t total_retlen=0;
 	int ret=0;
 	int len;
+	size_t size = count;
 	char *kbuf;
 
 	DEBUG(MTD_DEBUG_LEVEL0,"MTD_read\n");
@@ -189,23 +203,12 @@ static ssize_t mtd_read(struct file *file, char __user *buf, size_t count,loff_t
 	if (!count)
 		return 0;
 
-	/* FIXME: Use kiovec in 2.5 to lock down the user's buffers
-	   and pass them directly to the MTD functions */
-
-	if (count > MAX_KMALLOC_SIZE)
-		kbuf=kmalloc(MAX_KMALLOC_SIZE, GFP_KERNEL);
-	else
-		kbuf=kmalloc(count, GFP_KERNEL);
-
+	kbuf = mtd_kmalloc_up_to(&size);
 	if (!kbuf)
 		return -ENOMEM;
 
 	while (count) {
-
-		if (count > MAX_KMALLOC_SIZE)
-			len = MAX_KMALLOC_SIZE;
-		else
-			len = count;
+		len = min_t(size_t, count, size);
 
 		switch (mfi->mode) {
 		case MTD_MODE_OTP_FACTORY:
@@ -268,6 +271,7 @@ static ssize_t mtd_write(struct file *file, const char __user *buf, size_t count
 {
 	struct mtd_file_info *mfi = file->private_data;
 	struct mtd_info *mtd = mfi->mtd;
+	size_t size = count;
 	char *kbuf;
 	size_t retlen;
 	size_t total_retlen=0;
@@ -285,20 +289,12 @@ static ssize_t mtd_write(struct file *file, const char __user *buf, size_t count
 	if (!count)
 		return 0;
 
-	if (count > MAX_KMALLOC_SIZE)
-		kbuf=kmalloc(MAX_KMALLOC_SIZE, GFP_KERNEL);
-	else
-		kbuf=kmalloc(count, GFP_KERNEL);
-
+	kbuf = mtd_kmalloc_up_to(&size);
 	if (!kbuf)
 		return -ENOMEM;
 
 	while (count) {
-
-		if (count > MAX_KMALLOC_SIZE)
-			len = MAX_KMALLOC_SIZE;
-		else
-			len = count;
+		len = min_t(size_t, count, size);
 
 		if (copy_from_user(kbuf, buf, len)) {
 			kfree(kbuf);
diff --git a/drivers/mtd/mtdcore.c b/drivers/mtd/mtdcore.c
index da69bc8..6f720cc 100644
--- a/drivers/mtd/mtdcore.c
+++ b/drivers/mtd/mtdcore.c
@@ -638,6 +638,46 @@ int default_mtd_writev(struct mtd_info *mtd, const struct kvec *vecs,
 	return ret;
 }
 
+/**
+ * mtd_kmalloc_up_to - allocate a contiguous buffer up to the specified size
+ * @size: A pointer to the ideal or maximum size of the allocation. Points
+ *        to the actual allocation size on success.
+ *
+ * This routine attempts to allocate a contiguous kernel buffer up to
+ * the specified size, backing off the size of the request exponentially
+ * until the request succeeds or until the allocation size falls below
+ * the system page size. This attempts to make sure it does not adversely
+ * impact system performance, so when allocating more than one page, we
+ * ask the memory allocator to avoid re-trying, swapping, writing back
+ * or performing I/O.
+ *
+ * This is called, for example by mtd_{read,write} and jffs2_scan_medium,
+ * to handle smaller (i.e. degraded) buffer allocations under low- or
+ * fragmented-memory situations where such reduced allocations, from a
+ * requested ideal, are allowed.
+ *
+ * Returns a pointer to the allocated buffer on success; otherwise, NULL.
+ */
+void *mtd_kmalloc_up_to(size_t *size)
+{
+	gfp_t flags = __GFP_NOWARN | __GFP_WAIT |
+		       __GFP_NORETRY | __GFP_NO_KSWAPD;
+	size_t try;
+	void *kbuf;
+
+	try = min_t(size_t, *size, KMALLOC_MAX_SIZE);
+
+	do {
+		if (try <= PAGE_SIZE)
+			flags = GFP_KERNEL;
+
+		kbuf = kmalloc(try, flags);
+	} while (!kbuf && ((try >>= 1) >= PAGE_SIZE));
+
+	*size = try;
+	return kbuf;
+}
+
 EXPORT_SYMBOL_GPL(add_mtd_device);
 EXPORT_SYMBOL_GPL(del_mtd_device);
 EXPORT_SYMBOL_GPL(get_mtd_device);
@@ -648,6 +688,7 @@ EXPORT_SYMBOL_GPL(__put_mtd_device);
 EXPORT_SYMBOL_GPL(register_mtd_user);
 EXPORT_SYMBOL_GPL(unregister_mtd_user);
 EXPORT_SYMBOL_GPL(default_mtd_writev);
+EXPORT_SYMBOL_GPL(mtd_kmalloc_up_to);
 
 #ifdef CONFIG_PROC_FS
 
diff --git a/fs/jffs2/scan.c b/fs/jffs2/scan.c
index b632ddd..0850037 100644
--- a/fs/jffs2/scan.c
+++ b/fs/jffs2/scan.c
@@ -117,14 +117,15 @@ int jffs2_scan_medium(struct jffs2_sb_info *c)
 		else
 			buf_size = PAGE_SIZE;
 
-		/* Respect kmalloc limitations */
-		if (buf_size > 128*1024)
-			buf_size = 128*1024;
+		D1(printk(KERN_DEBUG "Trying to allocate readbuf of %d "
+			"bytes\n", buf_size));
 
-		D1(printk(KERN_DEBUG "Allocating readbuf of %d bytes\n", buf_size));
-		flashbuf = kmalloc(buf_size, GFP_KERNEL);
+		flashbuf = mtd_kmalloc_up_to(&buf_size);
 		if (!flashbuf)
 			return -ENOMEM;
+
+		D1(printk(KERN_DEBUG "Allocated readbuf of %d bytes\n",
+			buf_size));
 	}
 
 	if (jffs2_sum_active()) {
diff --git a/include/linux/mtd/mtd.h b/include/linux/mtd/mtd.h
index 9d5306b..a5d31ba 100644
--- a/include/linux/mtd/mtd.h
+++ b/include/linux/mtd/mtd.h
@@ -348,7 +348,8 @@ int default_mtd_writev(struct mtd_info *mtd, const struct kvec *vecs,
 int default_mtd_readv(struct mtd_info *mtd, struct kvec *vecs,
 		      unsigned long count, loff_t from, size_t *retlen);
 
+void *mtd_kmalloc_up_to(size_t *size);
+
 #ifdef CONFIG_MTD_PARTITIONS
 void mtd_erase_callback(struct erase_info *instr);
 #else
--
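
For readers who want to try the back-off idea outside the kernel, here is
a minimal user-space sketch. It is not the code from this patch. The names
sketch_alloc_up_to, sketch_fragmented_alloc, sketch_passes and
SKETCH_PAGE_SIZE are hypothetical, malloc stands in for kmalloc, no GFP
flags are modelled, and the 16 KiB failure threshold only simulates a
fragmented allocator. Unlike the routine in the patch, the sketch leaves
*size untouched on failure; that is a choice made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the kernel page size; not taken from the patch. */
#define SKETCH_PAGE_SIZE ((size_t)4096)

/* Hypothetical allocator hook so that allocation failures can be simulated. */
typedef void *(*sketch_alloc_fn)(size_t len);

/* Simulated fragmentation: every request above 16 KiB fails. */
static void *sketch_fragmented_alloc(size_t len)
{
	return (len > 16 * 1024) ? NULL : malloc(len);
}

/*
 * Try to allocate *size bytes, halving the request after each failure
 * until it succeeds or the next attempt would fall below
 * SKETCH_PAGE_SIZE. On success *size holds the length obtained; on
 * failure NULL is returned and *size is left unchanged.
 */
static void *sketch_alloc_up_to(sketch_alloc_fn alloc, size_t *size)
{
	size_t try_len;
	void *buf;

	assert(alloc != NULL && size != NULL);
	try_len = *size;

	do {
		buf = alloc(try_len);
	} while (buf == NULL && (try_len >>= 1) >= SKETCH_PAGE_SIZE);

	if (buf != NULL)
		*size = try_len;
	return buf;
}

/*
 * Number of passes a caller needs to move count bytes through a buffer
 * of buf_len bytes, mirroring the min(count, size) loop in the callers.
 */
static size_t sketch_passes(size_t count, size_t buf_len)
{
	size_t passes = 0;

	assert(buf_len > 0);
	while (count) {
		size_t len = count < buf_len ? count : buf_len;

		count -= len;
		passes++;
	}
	return passes;
}
```

With the simulated allocator, a 128 KiB request is halved three times and
succeeds at 16 KiB. A caller then needs more passes through the smaller
buffer to move the same amount of data, which is the slowdown the commit
message mentions.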