Message ID | 20160107230513.GL109450@google.com |
---|---|
State | Superseded |
Headers | show |
On 8 January 2016 at 00:05, Brian Norris <computersforpeace@gmail.com> wrote: > drivers/bcma/driver_chipcommon_sflash.c already nicely sets us up a > struct resource for this window, but we just aren't using it. Use it > now! Works nice :) [ 1.333884] [bcm47xxsflash_bcma_probe] res->start:0x1c000000 > This removes some (implicit) MIPS dependencies and makes the code more > portable, whether we need it or not :) So now we have following forwardtrace: devm_ioremap_nocache ioremap_nocache __ioremap_mode __ioremap CKSEG1ADDR It results in different address than KSEG0ADDR: [ 1.339752] [bcm47xxsflash_bcma_probe] KSEG0ADDR(BCMA_SOC_FLASH2):9c000000 [ 1.346848] [bcm47xxsflash_bcma_probe] devm_ioremap_nocache:bc000000 But it still works as expected! :) [ 1.609426] 6 bcm47xxpart partitions found on MTD device bcm47xxsflash [ 1.616169] Creating 6 MTD partitions on "bcm47xxsflash": > Signed-off-by: Brian Norris <computersforpeace@gmail.com> Tested-by: Rafał Miłecki <zajec5@gmail.com>
On Fri, 8 Jan 2016, Rafał Miłecki wrote: > > This removes some (implicit) MIPS dependencies and makes the code more > > portable, whether we need it or not :) > > So now we have following forwardtrace: > devm_ioremap_nocache > ioremap_nocache > __ioremap_mode > __ioremap > CKSEG1ADDR > > It results in different address than KSEG0ADDR: > [ 1.339752] [bcm47xxsflash_bcma_probe] KSEG0ADDR(BCMA_SOC_FLASH2):9c000000 > [ 1.346848] [bcm47xxsflash_bcma_probe] devm_ioremap_nocache:bc000000 > > But it still works as expected! :) > [ 1.609426] 6 bcm47xxpart partitions found on MTD device bcm47xxsflash > [ 1.616169] Creating 6 MTD partitions on "bcm47xxsflash": It is a functional change though and I think the change from a cached to uncached mapping (i.e. from `ioremap' to `ioremap_nocache') has to be a separate patch, so that both changes can be reviewed independently. Maciej
On 8 January 2016 at 15:01, Maciej W. Rozycki <macro@linux-mips.org> wrote: > On Fri, 8 Jan 2016, Rafał Miłecki wrote: > >> > This removes some (implicit) MIPS dependencies and makes the code more >> > portable, whether we need it or not :) >> >> So now we have following forwardtrace: >> devm_ioremap_nocache >> ioremap_nocache >> __ioremap_mode >> __ioremap >> CKSEG1ADDR >> >> It results in different address than KSEG0ADDR: >> [ 1.339752] [bcm47xxsflash_bcma_probe] KSEG0ADDR(BCMA_SOC_FLASH2):9c000000 >> [ 1.346848] [bcm47xxsflash_bcma_probe] devm_ioremap_nocache:bc000000 >> >> But it still works as expected! :) >> [ 1.609426] 6 bcm47xxpart partitions found on MTD device bcm47xxsflash >> [ 1.616169] Creating 6 MTD partitions on "bcm47xxsflash": > > It is a functional change though and I think the change from a cached to > uncached mapping (i.e. from `ioremap' to `ioremap_nocache') has to be a > separate patch, so that both changes can be reviewed independently. We didn't switch from 'ioremap' but from KSEG0ADDR. What exactly should be a separated patch?
On Fri, Jan 08, 2016 at 02:01:00PM +0000, Maciej W. Rozycki wrote: > On Fri, 8 Jan 2016, Rafał Miłecki wrote: > > > > This removes some (implicit) MIPS dependencies and makes the code more > > > portable, whether we need it or not :) > > > > So now we have following forwardtrace: > > devm_ioremap_nocache > > ioremap_nocache > > __ioremap_mode > > __ioremap > > CKSEG1ADDR I just noticed that ioremap() and ioremap_nocache() are the same on MIPS. So I could just do devm_ioremap_resource() and save myself a few lines... > > It results in different address than KSEG0ADDR: > > [ 1.339752] [bcm47xxsflash_bcma_probe] KSEG0ADDR(BCMA_SOC_FLASH2):9c000000 > > [ 1.346848] [bcm47xxsflash_bcma_probe] devm_ioremap_nocache:bc000000 > > > > But it still works as expected! :) > > [ 1.609426] 6 bcm47xxpart partitions found on MTD device bcm47xxsflash > > [ 1.616169] Creating 6 MTD partitions on "bcm47xxsflash": > > It is a functional change though and I think the change from a cached to > uncached mapping (i.e. from `ioremap' to `ioremap_nocache') has to be a > separate patch, so that both changes can be reviewed independently. As I noted before sending my patch, I don't think this driver should have been using KSEG0 anyway; it should have been KSEG1, right? I can note that in the patch description, but I don't really see why it needs to be a separate patch. Brian
On Fri, 8 Jan 2016, Brian Norris wrote: > > > > This removes some (implicit) MIPS dependencies and makes the code more > > > > portable, whether we need it or not :) > > > > > > So now we have following forwardtrace: > > > devm_ioremap_nocache > > > ioremap_nocache > > > __ioremap_mode > > > __ioremap > > > CKSEG1ADDR > > I just noticed that ioremap() and ioremap_nocache() are the same on > MIPS. So I could just do devm_ioremap_resource() and save myself a few > lines... My bad, I wrote from memory and didn't double check what the caching mode for plain `ioremap' is. You need to use `ioremap_cache' for a cached mapping. Unfortunately the MIPS port is missing this generic interface and defines `ioremap_cachable' instead. I've just posted an obvious fix for this problem. Ralf, can you pick this fix for 4.5? It qualifies as obvious I believe, similar to a recent `ioremap_uc' addition. > > > It results in different address than KSEG0ADDR: > > > [ 1.339752] [bcm47xxsflash_bcma_probe] KSEG0ADDR(BCMA_SOC_FLASH2):9c000000 > > > [ 1.346848] [bcm47xxsflash_bcma_probe] devm_ioremap_nocache:bc000000 > > > > > > But it still works as expected! :) > > > [ 1.609426] 6 bcm47xxpart partitions found on MTD device bcm47xxsflash > > > [ 1.616169] Creating 6 MTD partitions on "bcm47xxsflash": > > > > It is a functional change though and I think the change from a cached to > > uncached mapping (i.e. from `ioremap' to `ioremap_nocache') has to be a > > separate patch, so that both changes can be reviewed independently. > > As I noted before sending my patch, I don't think this driver should > have been using KSEG0 anyway; it should have been KSEG1, right? I can > note that in the patch description, but I don't really see why it needs > to be a separate patch. You did mention that, but didn't actually justify why an uncached mapping is required here. This code is in a function called `bcm47xxsflash_read' and reads from an MMIO region, presumably flash memory which behaves like ordinary memory on reads (i.e. no side effects). Therefore using a cached mapping will in most cases result in much better performance as the CPU will load (prefetch) data in cacheline-sized quantities rather than hitting the external bus every time with a word-sized quantity transferred only. Switching to an uncached mapping would invalidate this optimisation and is independent from a build error fix. Therefore it requires a separate patch and review. Perhaps `memremap' should be used instead (possibly with the MEMREMAP_WT policy), but this is exactly why a separate review is required for that part. Maciej
On Fri, 8 Jan 2016, Rafał Miłecki wrote: > >> > This removes some (implicit) MIPS dependencies and makes the code more > >> > portable, whether we need it or not :) > >> > >> So now we have following forwardtrace: > >> devm_ioremap_nocache > >> ioremap_nocache > >> __ioremap_mode > >> __ioremap > >> CKSEG1ADDR > >> > >> It results in different address than KSEG0ADDR: > >> [ 1.339752] [bcm47xxsflash_bcma_probe] KSEG0ADDR(BCMA_SOC_FLASH2):9c000000 > >> [ 1.346848] [bcm47xxsflash_bcma_probe] devm_ioremap_nocache:bc000000 > >> > >> But it still works as expected! :) > >> [ 1.609426] 6 bcm47xxpart partitions found on MTD device bcm47xxsflash > >> [ 1.616169] Creating 6 MTD partitions on "bcm47xxsflash": > > > > It is a functional change though and I think the change from a cached to > > uncached mapping (i.e. from `ioremap' to `ioremap_nocache') has to be a > > separate patch, so that both changes can be reviewed independently. > > We didn't switch from 'ioremap' but from KSEG0ADDR. What exactly > should be a separated patch? See my other reply -- KSEG0ADDR (cached mapping) corresponds to `ioremap_cache', whereas `ioremap_nocache' (or its generic `ioremap_uc' alias) or plain `ioremap' correspond to KSEG1ADDR (uncached mapping). Consequently a change that switches from KSEG0ADDR to `ioremap_nocache' or `ioremap' includes a functional change along a build error (portability) fix. Therefore such a change has to be split into two, so that the functional change can be reviewed separately from the portability fix. Maciej
On 9 January 2016 at 03:10, Maciej W. Rozycki <macro@linux-mips.org> wrote: > On Fri, 8 Jan 2016, Brian Norris wrote: >> > > It results in different address than KSEG0ADDR: >> > > [ 1.339752] [bcm47xxsflash_bcma_probe] KSEG0ADDR(BCMA_SOC_FLASH2):9c000000 >> > > [ 1.346848] [bcm47xxsflash_bcma_probe] devm_ioremap_nocache:bc000000 >> > > >> > > But it still works as expected! :) >> > > [ 1.609426] 6 bcm47xxpart partitions found on MTD device bcm47xxsflash >> > > [ 1.616169] Creating 6 MTD partitions on "bcm47xxsflash": >> > >> > It is a functional change though and I think the change from a cached to >> > uncached mapping (i.e. from `ioremap' to `ioremap_nocache') has to be a >> > separate patch, so that both changes can be reviewed independently. >> >> As I noted before sending my patch, I don't think this driver should >> have been using KSEG0 anyway; it should have been KSEG1, right? I can >> note that in the patch description, but I don't really see why it needs >> to be a separate patch. > > You did mention that, but didn't actually justify why an uncached mapping > is required here. > > This code is in a function called `bcm47xxsflash_read' and reads from an > MMIO region, presumably flash memory which behaves like ordinary memory on > reads (i.e. no side effects). Therefore using a cached mapping will in > most cases result in much better performance as the CPU will load > (prefetch) data in cacheline-sized quantities rather than hitting the > external bus every time with a word-sized quantity transferred only. So I wanted to stick to the cached mapping, but it appears it's not possible with devm_*. We have two options: 1) devm_ioremap_nocache - it obviously won't be cached mapping 2) devm_ioremap_resource - it uses devm_ioremap which uses ioremap which is nocache on MIPS Should I introduce a new devm_ioremap_resource_cache with some devm_ioremap_cache helper? And then use it in bcm47xxsflash?
On Sat, 16 Jan 2016, Rafał Miłecki wrote: > > This code is in a function called `bcm47xxsflash_read' and reads from an > > MMIO region, presumably flash memory which behaves like ordinary memory on > > reads (i.e. no side effects). Therefore using a cached mapping will in > > most cases result in much better performance as the CPU will load > > (prefetch) data in cacheline-sized quantities rather than hitting the > > external bus every time with a word-sized quantity transferred only. > > So I wanted to stick to the cached mapping, but it appears it's not > possible with devm_*. We have two options: > 1) devm_ioremap_nocache - it obviously won't be cached mapping > 2) devm_ioremap_resource - it uses devm_ioremap which uses ioremap > which is nocache on MIPS > > Should I introduce a new > devm_ioremap_resource_cache > with some > devm_ioremap_cache > helper? And then use it in bcm47xxsflash? This sounds like a plan to me, except that with `devm_ioremap_wc' also in the view and all the three `devm_ioremap*' functions being identical -- except from the use of a different `ioremap_*' call -- it looks to me it really asks for a `mode' argument and all the code to be unified. Compatibility macros (or, perhaps better, static inline functions) could be provided to preserve the internal API for the existing code. I.e. there would be a new `devm_ioremap_resource_mode' entry point which calls `devm_ioremap_mode', which then selects, perhaps with `switch' on `mode', from the available `ioremap*' calls. Maciej
On Sat, Jan 16, 2016 at 01:38:11AM +0100, Rafał Miłecki wrote:
> So I wanted to stick to the cached mapping, [...]
I mentioned this earlier on, but I don't feel like I've gotten a clear
answer. Is a cached mapping actually safe here? From the looks of it,
the memory mapping is a read-only memory-mapped flash, and flash writes
/ erasures are done through a different bus (register writes vis BCMA
bus). So if we have a cached mapping of that memory, it doens't
naturally synchronize with any write/erase operations. Doesn't this mean
you might get stale data if you do a sequence of read / erase / read,
for instance, since the 2nd read will return cached data from the 1st
read?
IIUC, this could be solved by:
(a) using an uncached mapping or
(b) explicitly invalidating the relevant region after doing flash writes
or erasures
But I wonder why you haven't seen any problems if you've been using
KSEG0 (cached) this whole time. Maybe just luck? Or you don't actually
write to the flash that much?
Brian
On 23 January 2016 at 22:49, Brian Norris <computersforpeace@gmail.com> wrote: > On Sat, Jan 16, 2016 at 01:38:11AM +0100, Rafał Miłecki wrote: >> So I wanted to stick to the cached mapping, [...] > > I mentioned this earlier on, but I don't feel like I've gotten a clear > answer. Is a cached mapping actually safe here? From the looks of it, > the memory mapping is a read-only memory-mapped flash, and flash writes > / erasures are done through a different bus (register writes vis BCMA > bus). So if we have a cached mapping of that memory, it doens't > naturally synchronize with any write/erase operations. Doesn't this mean > you might get stale data if you do a sequence of read / erase / read, > for instance, since the 2nd read will return cached data from the 1st > read? > > IIUC, this could be solved by: > (a) using an uncached mapping or > (b) explicitly invalidating the relevant region after doing flash writes > or erasures > > But I wonder why you haven't seen any problems if you've been using > KSEG0 (cached) this whole time. Maybe just luck? Or you don't actually > write to the flash that much? Now you pointed this difference between reads and writes I sounds worrying indeed. I'm not aware of ever hitting this problem but maybe I just didn't use flash in a way triggering it? I'm looking for a way to test it. Using user space I could try doing something like: echo foo > a.txt cat a.txt echo bar > a.txt cat a.txt I guess even more reliable test would to be test in in kernel space. I guess I could modify bcm47xxsflash_write to read flash region that is going to be modified: before modification and after. Both reads using KSEG0ADDR. Then compare if the second read matches was was written. Does my idea for tests make sense?
On Sat, 23 Jan 2016, Brian Norris wrote: > IIUC, this could be solved by: > (a) using an uncached mapping or > (b) explicitly invalidating the relevant region after doing flash writes > or erasures Flash writes are usually much, much less frequent than reads, so optimising for reads is IMO the right direction. So a cached mapping is a good choice, however invalidation must then be done after a write. > But I wonder why you haven't seen any problems if you've been using > KSEG0 (cached) this whole time. Maybe just luck? Or you don't actually > write to the flash that much? That depends on cache usage, any stale lines may well have usually gone in the course of regular cache line replacement, making the issue remain unnoticed. Maciej
On 24 January 2016 at 21:26, Maciej W. Rozycki <macro@imgtec.com> wrote: > On Sat, 23 Jan 2016, Brian Norris wrote: > >> IIUC, this could be solved by: >> (a) using an uncached mapping or >> (b) explicitly invalidating the relevant region after doing flash writes >> or erasures > > Flash writes are usually much, much less frequent than reads, so > optimising for reads is IMO the right direction. So a cached mapping is a > good choice, however invalidation must then be done after a write. Can you give me some hint where to look at for cache invalidation?
On Sun, 24 Jan 2016, Rafał Miłecki wrote: > On 24 January 2016 at 21:26, Maciej W. Rozycki <macro@imgtec.com> wrote: > > On Sat, 23 Jan 2016, Brian Norris wrote: > > > >> IIUC, this could be solved by: > >> (a) using an uncached mapping or > >> (b) explicitly invalidating the relevant region after doing flash writes > >> or erasures > > > > Flash writes are usually much, much less frequent than reads, so > > optimising for reads is IMO the right direction. So a cached mapping is a > > good choice, however invalidation must then be done after a write. > > Can you give me some hint where to look at for cache invalidation? There is `flush_data_cache_page' only it would seem, which is also supported by the MIPS platform only. It makes unnecessary writebacks before invalidation, however these aren't really supposed to happen as no cache line involved is expected to be dirty. Implementing `invalidate_data_cache_page', which would avoid these unnecessary writebacks, should be straightforward as hardware provides the necessary operations and actually the MIPS port has suitable low-level helpers already implemented, for use by `dma_cache_inv'. So that would merely be a semi-mechanical copy, paste, rename operation applied to our source. The bigger problem is the lack of portability of this interface to other platforms, although I suspect some hardware may simply fail to provide required operations. For example x86 only defines the sledgehammer INVD/WBINVD instructions, which operate on the whole cache hierarchy at once rather than on a line-by-line and cache level/part basis. Maciej
diff --git a/drivers/bcma/driver_chipcommon_sflash.c b/drivers/bcma/driver_chipcommon_sflash.c index 7e11ef4cb7db..5534cbcc222f 100644 --- a/drivers/bcma/driver_chipcommon_sflash.c +++ b/drivers/bcma/driver_chipcommon_sflash.c @@ -145,7 +145,6 @@ int bcma_sflash_init(struct bcma_drv_cc *cc) return -ENOTSUPP; } - sflash->window = BCMA_SOC_FLASH2; sflash->blocksize = e->blocksize; sflash->numblocks = e->numblocks; sflash->size = sflash->blocksize * sflash->numblocks; diff --git a/drivers/mtd/devices/bcm47xxsflash.c b/drivers/mtd/devices/bcm47xxsflash.c index 347bb83db864..ce6b7e51569d 100644 --- a/drivers/mtd/devices/bcm47xxsflash.c +++ b/drivers/mtd/devices/bcm47xxsflash.c @@ -2,6 +2,7 @@ #include <linux/module.h> #include <linux/slab.h> #include <linux/delay.h> +#include <linux/ioport.h> #include <linux/mtd/mtd.h> #include <linux/platform_device.h> #include <linux/bcma/bcma.h> @@ -109,8 +110,7 @@ static int bcm47xxsflash_read(struct mtd_info *mtd, loff_t from, size_t len, if ((from + len) > mtd->size) return -EINVAL; - memcpy_fromio(buf, (void __iomem *)KSEG0ADDR(b47s->window + from), - len); + memcpy_fromio(buf, b47s->window + from, len); *retlen = len; return len; @@ -275,15 +275,34 @@ static void bcm47xxsflash_bcma_cc_write(struct bcm47xxsflash *b47s, u16 offset, static int bcm47xxsflash_bcma_probe(struct platform_device *pdev) { - struct bcma_sflash *sflash = dev_get_platdata(&pdev->dev); + struct device *dev = &pdev->dev; + struct bcma_sflash *sflash = dev_get_platdata(dev); struct bcm47xxsflash *b47s; + struct resource *res; int err; - b47s = devm_kzalloc(&pdev->dev, sizeof(*b47s), GFP_KERNEL); + b47s = devm_kzalloc(dev, sizeof(*b47s), GFP_KERNEL); if (!b47s) return -ENOMEM; sflash->priv = b47s; + res = platform_get_resource(pdev, IORESOURCE_MEM, 0); + if (!res) { + dev_err(dev, "invalid resource\n"); + return -EINVAL; + } + if (!devm_request_mem_region(dev, res->start, resource_size(res), + res->name)) { + dev_err(dev, "can't request region for resource %pR\n", res); + return -EBUSY; + } + b47s->window = devm_ioremap_nocache(dev, res->start, + resource_size(res)); + if (!b47s->window) { + dev_err(dev, "ioremap failed for resource %pR\n", res); + return -ENOMEM; + } + b47s->bcma_cc = container_of(sflash, struct bcma_drv_cc, sflash); b47s->cc_read = bcm47xxsflash_bcma_cc_read; b47s->cc_write = bcm47xxsflash_bcma_cc_write; @@ -297,11 +316,10 @@ static int bcm47xxsflash_bcma_probe(struct platform_device *pdev) break; } - b47s->window = sflash->window; b47s->blocksize = sflash->blocksize; b47s->numblocks = sflash->numblocks; b47s->size = sflash->size; - bcm47xxsflash_fill_mtd(b47s, &pdev->dev); + bcm47xxsflash_fill_mtd(b47s, dev); err = mtd_device_parse_register(&b47s->mtd, probes, NULL, NULL, 0); if (err) { diff --git a/drivers/mtd/devices/bcm47xxsflash.h b/drivers/mtd/devices/bcm47xxsflash.h index fe93daf4f489..1564b62b412e 100644 --- a/drivers/mtd/devices/bcm47xxsflash.h +++ b/drivers/mtd/devices/bcm47xxsflash.h @@ -65,7 +65,8 @@ struct bcm47xxsflash { enum bcm47xxsflash_type type; - u32 window; + void __iomem *window; + u32 blocksize; u16 numblocks; u32 size; diff --git a/include/linux/bcma/bcma_driver_chipcommon.h b/include/linux/bcma/bcma_driver_chipcommon.h index db51a6ffb7d6..03e6fea9e9ce 100644 --- a/include/linux/bcma/bcma_driver_chipcommon.h +++ b/include/linux/bcma/bcma_driver_chipcommon.h @@ -583,7 +583,6 @@ struct mtd_info; struct bcma_sflash { bool present; - u32 window; u32 blocksize; u16 numblocks; u32 size;