diff mbox

[RFC,1/1] nand/denali: Add runtime pm support for denali controller driver

Message ID 20100906080838.GA6528@intel.com
State RFC
Headers show

Commit Message

Dong, Chuanxiao Sept. 6, 2010, 8:08 a.m. UTC
Hello,
	I know there already were some comments, so here I open a new thread
	submitt this patch again as a RFC one and attach my understanding and
	confusing about how to implement runtime pm for NAND controller MTD
	driver. Hope I can got many help from all of you. Thanks.

	As my understanding, NAND controller MTD driver cannot know when it
	will be hold to do reading/writing/eraseing or when it will be
	released. So there does not have any open/release operation for MTD
	driver, it has to call runtime pm get before it need to access
	hardware and call runtime pm put after finishing access.

	But I have concern about doing this. Since MTD NAND driver is
	reading/writting only one page and can be hold by only one uplayer
	program at each time, get/put will cause very frequently MTD driver
	suspend/resume callback. Is that reasonable? If not, do you guys have
	any idea about how to implement runtime pm for this kind of driver?

	If let MTD framework to call runtime pm get/put callback before
	hold/after release MTD, I think it will be a little better.
	At least, device won't be suspend/resume so much frequently.

	In this RFC patch, I added a timer to implement runtime pm. Device
	only need to suspend/resume every 30s or even longer. Arjan already
	disagree about this for it is duplicating runtime pm refcnt. But I
	cannot found a better way for this...

From 453587bc1ae9f4c0e88d34487a457f9510d979a1 Mon Sep 17 00:00:00 2001
From: Chuanxiao Dong <chuanxiao.dong@intel.com>
Date: Mon, 6 Sep 2010 14:31:12 +0800
Subject: [PATCH] nand/denali: Add runtime pm framework in denali.c

Denali NAND controller has no capability to shut power down
in MRST platform, so now driver do nothing in runtime_suspend/
runtime_resume routine. This patch add a framework to implement
runtime power management

Signed-off-by: Chuanxiao Dong <chuanxiao.dong@intel.com>
---
 drivers/mtd/nand/denali.c |  125 +++++++++++++++++++++++++++++++++++++++++++++
 1 files changed, 125 insertions(+), 0 deletions(-)

Comments

Arjan van de Ven Sept. 6, 2010, 7:53 p.m. UTC | #1
On Mon, 6 Sep 2010 16:08:38 +0800
"Chuanxiao.Dong" <chuanxiao.dong@intel.com> wrote:


> +/* NOW denali NAND controller in MRST has no way to cut power off
> + * once it is power on, so just let these functions be empty
> + * */

this isn't right; you should put the device in PCI D3 state.



> +
> +static struct dev_pm_ops denali_pm = {
> +	.runtime_suspend = denali_runtime_suspend,
> +	.runtime_resume = denali_runtime_resume,
> +	.runtime_idle = denali_runtime_idle,

you don't really need a runtime_idle function


> +};
> +/* support denali runtime pm */
> +static void denali_timer_add(struct denali_nand_info *denali)

and this is where I get very nervous about your patch.

The runtime pm infrastructure already has timers, you don't need to
implement one!

in fact, I think the code is rather not using the runtime PM
infrastructure right.

The runtime PM framework offers you a reference counter for activity,
so what really ought to be done instead of this polling timer, is to
take a refcount each time something is submitted to the hardware..

> +	if (chip->state != FL_READY ||

... and decremented each time something is completed. This will change
the chip->state for you... each place chip->state is used is a
candidate for this proper refcounting.


basically, this timer in the driver is rather misplaced and should be
replaced by using the refcount from the runtime PM infrastructure
correctly...... and I suspect your patch gets a LOT simpler as well for
free.
Dong, Chuanxiao Sept. 7, 2010, 2:12 a.m. UTC | #2
Hi
See my comments below.
Thanks
--Chuanxiao

> -----Original Message-----
> From: Arjan van de Ven [mailto:arjan@infradead.org]
> Sent: Tuesday, September 07, 2010 3:53 AM
> To: Dong, Chuanxiao
> Cc: dwmw2@infradead.org; linux-mtd@lists.infradead.org
> Subject: Re: [RFC][PATCH 1/1]nand/denali: Add runtime pm support for denali
> controller driver
> 
> On Mon, 6 Sep 2010 16:08:38 +0800
> "Chuanxiao.Dong" <chuanxiao.dong@intel.com> wrote:
> 
> 
> > +/* NOW denali NAND controller in MRST has no way to cut power off
> > + * once it is power on, so just let these functions be empty
> > + * */
> 
> this isn't right; you should put the device in PCI D3 state.

This will be update after I got a correct runtime PM framework using.

> 
> > +
> > +static struct dev_pm_ops denali_pm = {
> > +	.runtime_suspend = denali_runtime_suspend,
> > +	.runtime_resume = denali_runtime_resume,
> > +	.runtime_idle = denali_runtime_idle,
> 
> you don't really need a runtime_idle function
> 
> 
> > +};
> > +/* support denali runtime pm */
> > +static void denali_timer_add(struct denali_nand_info *denali)
> 
> and this is where I get very nervous about your patch.
> 
> The runtime pm infrastructure already has timers, you don't need to
> implement one!
> 
> in fact, I think the code is rather not using the runtime PM
> infrastructure right.
> 
> The runtime PM framework offers you a reference counter for activity,
> so what really ought to be done instead of this polling timer, is to
> take a refcount each time something is submitted to the hardware..
> 
> > +	if (chip->state != FL_READY ||
> 
> ... and decremented each time something is completed. This will change
> the chip->state for you... each place chip->state is used is a
> candidate for this proper refcounting.

And this is what I am confusing. Each place chip->state is used is in MTD framework....no in MTD driver
itself. MTD driver never know when it is opened or released.
Each time before hardware has something to do chip->state is already set to be something like READING/
WRITING by MTD framework. And after finishing the hardware work, chip->state does not change. Then
return back to MTD framework. If there are still something for hardware to do, chip->state will not change.
Only when there is nothing to do, MTD framework will set chip->state to be READY after MTD driver finishing
the last hardware work.
That is the process for MTD driver working. It only maintains the hardware working part. Before or after it finishes
the hardware work, the chip->state is always to be READING/WRITING. It never knows when there will have the
first or the last hardware access.
So that is why I add a timer for driver to detect the chip->state. And that is why I think it's better for MTD framework
to do some runtime PM work.

> 
> basically, this timer in the driver is rather misplaced and should be
> replaced by using the refcount from the runtime PM infrastructure
> correctly...... and I suspect your patch gets a LOT simpler as well for
> free.
Arjan van de Ven Sept. 7, 2010, 3:34 a.m. UTC | #3
On Tue, 7 Sep 2010 10:12:09 +0800
"Dong, Chuanxiao" <chuanxiao.dong@intel.com> wrote:

> Hi
> See my comments below.
> Thanks
> --Chuanxiao
> 
> > -----Original Message-----
> > From: Arjan van de Ven [mailto:arjan@infradead.org]
> > Sent: Tuesday, September 07, 2010 3:53 AM
> > To: Dong, Chuanxiao
> > Cc: dwmw2@infradead.org; linux-mtd@lists.infradead.org
> > Subject: Re: [RFC][PATCH 1/1]nand/denali: Add runtime pm support
> > for denali controller driver
> > 
> > On Mon, 6 Sep 2010 16:08:38 +0800
> > "Chuanxiao.Dong" <chuanxiao.dong@intel.com> wrote:
> > 
> > 
> > > +/* NOW denali NAND controller in MRST has no way to cut power off
> > > + * once it is power on, so just let these functions be empty
> > > + * */
> > 
> > this isn't right; you should put the device in PCI D3 state.
> 
> This will be update after I got a correct runtime PM framework using.
> 
> > 
> > > +
> > > +static struct dev_pm_ops denali_pm = {
> > > +	.runtime_suspend = denali_runtime_suspend,
> > > +	.runtime_resume = denali_runtime_resume,
> > > +	.runtime_idle = denali_runtime_idle,
> > 
> > you don't really need a runtime_idle function
> > 
> > 
> > > +};
> > > +/* support denali runtime pm */
> > > +static void denali_timer_add(struct denali_nand_info *denali)
> > 
> > and this is where I get very nervous about your patch.
> > 
> > The runtime pm infrastructure already has timers, you don't need to
> > implement one!
> > 
> > in fact, I think the code is rather not using the runtime PM
> > infrastructure right.
> > 
> > The runtime PM framework offers you a reference counter for
> > activity, so what really ought to be done instead of this polling
> > timer, is to take a refcount each time something is submitted to
> > the hardware..
> > 
> > > +	if (chip->state != FL_READY ||
> > 
> > ... and decremented each time something is completed. This will
> > change the chip->state for you... each place chip->state is used is
> > a candidate for this proper refcounting.
> 
> And this is what I am confusing. Each place chip->state is used is in
> MTD framework....no in MTD driver itself. MTD driver never know when
> it is opened or released. Each time before hardware has something to
> do chip->state is already set to be something like READING/ WRITING
> by MTD framework. And after finishing the hardware work, chip->state
> does not change. Then return back to MTD framework. If there are
> still something for hardware to do, chip->state will not change. Only
> when there is nothing to do, MTD framework will set chip->state to be
> READY after MTD driver finishing the last hardware work. That is the
> process for MTD driver working. It only maintains the hardware
> working part. Before or after it finishes the hardware work, the
> chip->state is always to be READING/WRITING. It never knows when
> there will have the first or the last hardware access. So that is why
> I add a timer for driver to detect the chip->state. And that is why I
> think it's better for MTD framework to do some runtime PM work.

if you take the refcount every time you get a command, and drop a count
each time a command completes.... this is all not too important.. it'll
just work
Dong, Chuanxiao Sept. 7, 2010, 4:03 a.m. UTC | #4
> if you take the refcount every time you get a command, and drop a count
> each time a command completes.... this is all not too important.. it'll
> just work
If I take the refcount every time I got a command and drop a count each time a command completes, the controller device will be resumed every time when it gets a command and suspended after it finishes a command. For a 4k page NAND, processing just 1M data will cause device resume/suspend 256 times....This will influence the performance....Although denali controller cannot be really resumed/suspend, to these devices who can resume/suspend, I think it is not a good way.
Right now there are 6 patches for denali MTD driver pending. After these 6 patches are accepted and Denali MTD driver can be used on MRST platform, I will resubmit a runtime pm patch. Arjan, do you agree?
Arjan van de Ven Sept. 7, 2010, 4:34 a.m. UTC | #5
On Tue, 7 Sep 2010 12:03:40 +0800
"Dong, Chuanxiao" <chuanxiao.dong@intel.com> wrote:

> 
> > if you take the refcount every time you get a command, and drop a
> > count each time a command completes.... this is all not too
> > important.. it'll just work
> If I take the refcount every time I got a command and drop a count
> each time a command completes, the controller device will be resumed
> every time when it gets a command and suspended after it finishes a
> command. For a 4k page NAND, processing just 1M data will cause
> device resume/suspend 256 times....This will influence the
> performance...

so this is where you can tell the runtime PM system to use a timer, and
only turn on power saving after some idle time.

you don't need to do that yourself... the runtime PM system has this
capability built in!
diff mbox

Patch

diff --git a/drivers/mtd/nand/denali.c b/drivers/mtd/nand/denali.c
index 532fe07..976b6b9 100644
--- a/drivers/mtd/nand/denali.c
+++ b/drivers/mtd/nand/denali.c
@@ -25,6 +25,10 @@ 
 #include <linux/pci.h>
 #include <linux/mtd/mtd.h>
 #include <linux/module.h>
+#include <linux/pm.h>
+#include <linux/pm_runtime.h>
+#include <linux/jiffies.h>
+#include <linux/param.h>
 
 #include "denali.h"
 
@@ -119,6 +123,9 @@  static const uint32_t reset_complete[4] = {INTR_STATUS0__RST_COMP,
 							INTR_STATUS2__RST_COMP,
 							INTR_STATUS3__RST_COMP};
 
+/* record interrupt jiffies */
+static unsigned long last_intr;
+
 /* forward declarations */
 static void clear_interrupts(struct denali_nand_info *denali);
 static uint32_t wait_for_irq(struct denali_nand_info *denali,
@@ -856,6 +863,89 @@  static int read_data_from_flash_mem(struct denali_nand_info *denali,
 	return i*4; /* intent is to return the number of bytes read */
 }
 
+/* NOW denali NAND controller in MRST has no way to cut power off
+ * once it is power on, so just let these functions be empty
+ * */
+#ifdef CONFIG_PM_RUNTIME
+static int denali_runtime_suspend(struct device *dev)
+{
+	dev_info(dev, "%s: disable_depth %d usage_count %d", __func__,
+			dev->power.disable_depth, dev->power.usage_count.counter);
+	dev_info(dev, "%s called", __func__);
+	/* Denali Controller in MRST doesn't have any capbility
+	 * to shut power down or resume back, no register or GPIO
+	 * can do this, so here do nothing
+	 * */
+	return 0;
+}
+
+static int denali_runtime_resume(struct device *dev)
+{
+	dev_info(dev, "%s: disable_depth %d usage_count %d", __func__,
+			dev->power.disable_depth, dev->power.usage_count.counter);
+	dev_info(dev, "%s called", __func__);
+	/* Denali Controller in MRST doesn't have any capbility
+	 * to shut power down or resume back, no register or GPIO
+	 * can do this, so here do nothing
+	 * */
+	return 0;
+}
+
+static int denali_runtime_idle(struct device *dev)
+{
+	dev_info(dev, "%s: disable_depth %d usage_count %d", __func__,
+			dev->power.disable_depth, dev->power.usage_count.counter);
+	dev_info(dev, "%s called", __func__);
+	/* denali controller will only be resume once during
+	 * read/write/erase operation, so if call runtime idle,
+	 * means device can be suspend
+	 * */
+	return 0;
+}
+
+static struct dev_pm_ops denali_pm = {
+	.runtime_suspend = denali_runtime_suspend,
+	.runtime_resume = denali_runtime_resume,
+	.runtime_idle = denali_runtime_idle,
+};
+/* support denali runtime pm */
+static void denali_timer_add(struct denali_nand_info *denali)
+{
+	int ret;
+	struct device *dev = &denali->dev->dev;
+	if (denali->pm_status == DENALI_OFF) {
+		ret = pm_runtime_get_sync(dev);
+		denali->pm_status = DENALI_ON;
+		denali->timer.expires = jiffies + 30 * HZ;
+		add_timer(&denali->timer);
+	}
+}
+
+static void denali_timer_func(unsigned long ptr)
+{
+	struct denali_nand_info *denali =
+		(struct denali_nand_info *)ptr;
+	struct device *dev = &denali->dev->dev;
+	struct nand_chip *chip = &denali->nand;
+	int ret;
+
+	if (denali->pm_status == DENALI_OFF)
+		BUG();
+
+	if (chip->state != FL_READY ||
+		jiffies - last_intr < 1000) {
+		denali->timer.expires = jiffies + 30 * HZ;
+		add_timer(&denali->timer);
+	} else {
+		ret = pm_runtime_put_sync(dev);
+		denali->pm_status = DENALI_OFF;
+	}
+}
+#else
+static inline void denali_timer_add(struct denali_nand_info *denali) {}
+static inline void denali_timer_func(unsigned long ptr) {}
+#endif
+
 /* writes OOB data to the device */
 static int write_oob_data(struct mtd_info *mtd, uint8_t *buf, int page)
 {
@@ -865,6 +955,8 @@  static int write_oob_data(struct mtd_info *mtd, uint8_t *buf, int page)
 						INTR_STATUS0__PROGRAM_FAIL;
 	int status = 0;
 
+	denali_timer_add(denali);
+
 	denali->page = page;
 
 	if (denali_send_pipeline_cmd(denali, false, false, SPARE_ACCESS,
@@ -892,6 +984,8 @@  static void read_oob_data(struct mtd_info *mtd, uint8_t *buf, int page)
 	uint32_t irq_mask = INTR_STATUS0__LOAD_COMP,
 			 irq_status = 0, addr = 0x0, cmd = 0x0;
 
+	denali_timer_add(denali);
+
 	denali->page = page;
 
 	if (denali_send_pipeline_cmd(denali, false, true, SPARE_ACCESS,
@@ -1054,6 +1148,8 @@  static void write_page(struct mtd_info *mtd, struct nand_chip *chip,
 	uint32_t irq_mask = INTR_STATUS0__DMA_CMD_COMP |
 						INTR_STATUS0__PROGRAM_FAIL;
 
+	denali_timer_add(denali);
+
 	/* if it is a raw xfer, we want to disable ecc, and send
 	 * the spare area.
 	 * !raw_xfer - enable ecc
@@ -1149,6 +1245,8 @@  static int denali_read_page(struct mtd_info *mtd, struct nand_chip *chip,
 			    INTR_STATUS0__ECC_ERR;
 	bool check_erased_page = false;
 
+	denali_timer_add(denali);
+
 	if (page != denali->page) {
 		dev_err(&denali->dev->dev, "IN %s: page %d is not"
 				" equal to denali->page %d, investigate!!",
@@ -1200,6 +1298,8 @@  static int denali_read_page_raw(struct mtd_info *mtd, struct nand_chip *chip,
 	uint32_t irq_status = 0;
 	uint32_t irq_mask = INTR_STATUS0__DMA_CMD_COMP;
 
+	denali_timer_add(denali);
+
 	if (page != denali->page) {
 		dev_err(&denali->dev->dev, "IN %s: page %d is not"
 				" equal to denali->page %d, investigate!!",
@@ -1263,6 +1363,8 @@  static void denali_erase(struct mtd_info *mtd, int page)
 
 	uint32_t cmd = 0x0, irq_status = 0;
 
+	denali_timer_add(denali);
+
 	/* clear interrupts */
 	clear_interrupts(denali);
 
@@ -1312,6 +1414,7 @@  static void denali_cmdfunc(struct mtd_info *mtd, unsigned int cmd, int col,
 		denali->page = page;
 		break;
 	case NAND_CMD_RESET:
+		denali_timer_add(denali);
 		reset_bank(denali);
 		break;
 	case NAND_CMD_READOOB:
@@ -1435,6 +1538,12 @@  void denali_drv_init(struct denali_nand_info *denali)
 
 	/* initialize our irq_status variable to indicate no interrupts */
 	denali->irq_status = 0;
+
+	/* Initilize denali timer */
+	init_timer(&denali->timer);
+	denali->timer.data = (unsigned long)denali;
+	denali->timer.function = &denali_timer_func;
+	denali->pm_status = DENALI_ON;
 }
 
 /* driver entry point */
@@ -1669,6 +1778,12 @@  static int denali_pci_probe(struct pci_dev *dev, const struct pci_device_id *id)
 				ret);
 		goto failed_req_irq;
 	}
+
+	/* init pm runtime */
+	denali->pm_status = DENALI_OFF;
+	pm_runtime_allow(&dev->dev);
+	pm_runtime_put_noidle(&dev->dev);
+
 	return 0;
 
 failed_req_irq:
@@ -1694,6 +1809,10 @@  static void denali_pci_remove(struct pci_dev *dev)
 {
 	struct denali_nand_info *denali = pci_get_drvdata(dev);
 
+	del_timer(&denali->timer);
+	if (denali->pm_status == DENALI_ON)
+		pm_runtime_put_sync(&dev->dev);
+
 	nand_release(&denali->mtd);
 	del_mtd_device(&denali->mtd);
 
@@ -1707,11 +1826,17 @@  static void denali_pci_remove(struct pci_dev *dev)
 							PCI_DMA_BIDIRECTIONAL);
 	pci_set_drvdata(dev, NULL);
 	kfree(denali);
+	pm_runtime_get_noresume(&dev->dev);
 }
 
 MODULE_DEVICE_TABLE(pci, denali_pci_ids);
 
 static struct pci_driver denali_pci_driver = {
+#ifdef CONFIG_PM_RUNTIME
+	.driver = {
+		.pm = &denali_pm,
+	},
+#endif
 	.name = DENALI_NAND_NAME,
 	.id_table = denali_pci_ids,
 	.probe = denali_pci_probe,