Message ID | 1267371341-16684-4-git-send-email-sebastian@breakpoint.cc |
---|---|
State | Rejected |
Delegated to: | David Miller |
Headers | show |
From: Sebastian Andrzej Siewior <sebastian@breakpoint.cc> Date: Sun, 28 Feb 2010 16:35:41 +0100 > the pio callbacks are called with different kind of buffers. It could be > a straight kernel addr, kernel stack or a kmaped highmem page. > Some of this break the virt_to_page() assumptions. > This patch moves the dcache flush from architecture code into generic > ide code. ide_pio_bytes() is the only place where user pages might be > written as far as I can see. > The dcache flush is avoided in two cases: > - data which is written to the device (i.e. they are comming from the > userland) This needs a flush on sparc, otherwise an alias now exists in the kernel side copy of the buffer. The D-cache flush is intentionally unconditional for PIO mode. I definitely don't want to take the same risks you guys seem to be willing to take for this optimization which is of questionable value. I also, intrinsically, really don't like these changes. For one thing, you're optimizing PIO mode. Secondly, IDE is in deep maintainence mode, if you want to optimize cache flushing do it in the ATA layer. Thanks. -- To unsubscribe from this list: send the line "unsubscribe linux-ide" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
* David Miller | 2010-02-28 18:34:17 [-0800]: >From: Sebastian Andrzej Siewior <sebastian@breakpoint.cc> >Date: Sun, 28 Feb 2010 16:35:41 +0100 > >> the pio callbacks are called with different kind of buffers. It could be >> a straight kernel addr, kernel stack or a kmaped highmem page. >> Some of this break the virt_to_page() assumptions. >> This patch moves the dcache flush from architecture code into generic >> ide code. ide_pio_bytes() is the only place where user pages might be >> written as far as I can see. >> The dcache flush is avoided in two cases: >> - data which is written to the device (i.e. they are comming from the >> userland) > >This needs a flush on sparc, otherwise an alias now exists in the >kernel side copy of the buffer. The D-cache flush is intentionally >unconditional for PIO mode. I definitely don't want to take the same >risks you guys seem to be willing to take for this optimization which >is of questionable value. It is not us guys it is just me. And if it is a stupid thing to do then I get probably punched by Ralf as well once he gets back. The part I don't get is why you have to flush dcache after the copy process. I would understand a flush before reading. The dcache alias in kernel shouldn't hurt since it is not used anymore. Unless we read twice from the same page. Is this the trick? >I also, intrinsically, really don't like these changes. > >For one thing, you're optimizing PIO mode. > >Secondly, IDE is in deep maintainence mode, if you want to optimize >cache flushing do it in the ATA layer. This patch patch was mostly driven by the fact that the buffer can be a normal kernel mapping or a virtual address. The latter doesn't work with virt_to_page(). Anyway I should probably spent more time getting ATA layer wokring than on the IDE layer since it is somehow working since patch#1. > >Thanks. Sebastian -- To unsubscribe from this list: send the line "unsubscribe linux-ide" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
From: Sebastian Andrzej Siewior <sebastian@breakpoint.cc> Date: Mon, 1 Mar 2010 20:58:58 +0100 > The part I don't get is why you have to flush dcache after the copy > process. I would understand a flush before reading. The dcache alias in > kernel shouldn't hurt since it is not used anymore. Unless we read twice > from the same page. Is this the trick? Anything that puts the data into the cache on the kernel side is a problem. The page is still potentially in user space, as a result there will be thus two active mappings in the cache, one in the kernel side and one in the user side. The user can then do various operations which can access either mapping. Writing to it via write() system call, writing to it via mmap(), making the kernel write to it by doing a read() with the buffer pointing into the mmap() area. All we need is a modification on either side for the other one to potentially become stale. >>Secondly, IDE is in deep maintainence mode, if you want to optimize >>cache flushing do it in the ATA layer. > This patch patch was mostly driven by the fact that the buffer can be a > normal kernel mapping or a virtual address. The latter doesn't work with > virt_to_page(). > Anyway I should probably spent more time getting ATA layer wokring than > on the IDE layer since it is somehow working since patch#1. All buffers passed to the architecture IDE ops should be PAGE_OFFSET relative kernel virtual addresses for which __pa() works. Are kmap()'d things ending up here? It all works out on sparc64 because we don't have highmem and kernel stacks are just in normal kernel pages. -- To unsubscribe from this list: send the line "unsubscribe linux-ide" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
* David Miller | 2010-03-01 16:39:59 [-0800]: >All buffers passed to the architecture IDE ops should be PAGE_OFFSET >relative kernel virtual addresses for which __pa() works. > >Are kmap()'d things ending up here? Ah, bounce buffers are used. So no, there no highmem which get kmap()ed. Sebastian -- To unsubscribe from this list: send the line "unsubscribe linux-ide" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Mon, Mar 01, 2010 at 04:39:59PM -0800, David Miller wrote: > From: Sebastian Andrzej Siewior <sebastian@breakpoint.cc> > Date: Mon, 1 Mar 2010 20:58:58 +0100 > > > The part I don't get is why you have to flush dcache after the copy > > process. I would understand a flush before reading. The dcache alias in > > kernel shouldn't hurt since it is not used anymore. Unless we read twice > > from the same page. Is this the trick? > > Anything that puts the data into the cache on the kernel > side is a problem. The page is still potentially in user > space, as a result there will be thus two active mappings > in the cache, one in the kernel side and one in the user > side. The user can then do various operations which can > access either mapping. > > Writing to it via write() system call, writing to it via > mmap(), making the kernel write to it by doing a read() > with the buffer pointing into the mmap() area. > > All we need is a modification on either side for the other > one to potentially become stale. > > >>Secondly, IDE is in deep maintainence mode, if you want to optimize > >>cache flushing do it in the ATA layer. > > This patch patch was mostly driven by the fact that the buffer can be a > > normal kernel mapping or a virtual address. The latter doesn't work with > > virt_to_page(). > > Anyway I should probably spent more time getting ATA layer wokring than > > on the IDE layer since it is somehow working since patch#1. > > All buffers passed to the architecture IDE ops should be PAGE_OFFSET > relative kernel virtual addresses for which __pa() works. > > Are kmap()'d things ending up here? > > It all works out on sparc64 because we don't have highmem and > kernel stacks are just in normal kernel pages. Highmem on MIPS has always been a bastard child though that will probably change now that 32-bit embedded systems are coming with more memory than can be mapped as lowmem. The system in question that Sebastian is talking about has the IDE cache problems because it has no aliases. Neither the IDE code nor the cache flush layer does any cacheflushes because there are no cache aliases [1] to deal with. So eventually code from a page loaded from an IDE disk gets executed and if the CPU's I-cache refilled from the L2 or memory, not from the D-cache stale data may get executed. In an earlier mail in this thread you called the MIPS ins/outs risky. Maybe - but we have to. At least at the time when this code was written the IDE ins/outs variants was possibly being called with interrupts disabled. The normal MIPS cache flushing functions are based on smp_call_function() on SMP systems [2] so can't be used to do cache maintenance hence we have to avoid them. Ralf [1] Configuring page size to 16k or 64k would get rid of aliases on all MIPS processors with VIPT D-caches, so introduce the same situation. Such configurations are getting common now. [2] Because the CACHE instruction only affects the local CPU cache. -- To unsubscribe from this list: send the line "unsubscribe linux-ide" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/arch/mips/include/asm/mach-generic/ide.h b/arch/mips/include/asm/mach-generic/ide.h index 9360586..7370845 100644 --- a/arch/mips/include/asm/mach-generic/ide.h +++ b/arch/mips/include/asm/mach-generic/ide.h @@ -20,35 +20,16 @@ #include <asm/processor.h> /* MIPS port and memory-mapped I/O string operations. */ -static inline void __ide_set_pages_dirty(const void *addr, unsigned long size) -{ - unsigned long end = (unsigned long)addr + size; - - if (!(cpu_has_dc_aliases || !cpu_has_ic_fills_f_dc)) - return; - - while ((unsigned long)addr < end) { - struct page *p = virt_to_page(addr); - struct address_space *mapping = page_mapping(p); - - if (mapping && mapping_mapped(mapping)) - SetPageDcacheDirty(p); - addr += PAGE_SIZE; - } -} - static inline void __ide_insw(unsigned long port, void *addr, unsigned int count) { insw(port, addr, count); - __ide_set_pages_dirty(addr, count * 2); } static inline void __ide_insl(unsigned long port, void *addr, unsigned int count) { insl(port, addr, count); - __ide_set_pages_dirty(addr, count * 4); } static inline void __ide_outsw(unsigned long port, const void *addr, @@ -66,13 +47,11 @@ static inline void __ide_outsl(unsigned long port, const void *addr, static inline void __ide_mm_insw(void __iomem *port, void *addr, u32 count) { readsw(port, addr, count); - __ide_set_pages_dirty(addr, count * 2); } static inline void __ide_mm_insl(void __iomem *port, void *addr, u32 count) { readsl(port, addr, count); - __ide_set_pages_dirty(addr, count * 4); } static inline void __ide_mm_outsw(void __iomem *port, void *addr, u32 count) diff --git a/arch/sparc/include/asm/ide.h b/arch/sparc/include/asm/ide.h index b7af3d6..c1037ca 100644 --- a/arch/sparc/include/asm/ide.h +++ b/arch/sparc/include/asm/ide.h @@ -34,9 +34,6 @@ static inline void __ide_insw(void __iomem *port, void *dst, u32 count) { -#if defined(CONFIG_SPARC64) && defined(DCACHE_ALIASING_POSSIBLE) - unsigned long end = (unsigned long)dst + (count << 1); -#endif u16 *ps = dst; u32 *pi; @@ -56,17 +53,10 @@ static inline void __ide_insw(void __iomem *port, void *dst, u32 count) ps = (u16 *)pi; if(count) *ps++ = __raw_readw(port); - -#if defined(CONFIG_SPARC64) && defined(DCACHE_ALIASING_POSSIBLE) - __flush_dcache_range((unsigned long)dst, end); -#endif } static inline void __ide_outsw(void __iomem *port, const void *src, u32 count) { -#if defined(CONFIG_SPARC64) && defined(DCACHE_ALIASING_POSSIBLE) - unsigned long end = (unsigned long)src + (count << 1); -#endif const u16 *ps = src; const u32 *pi; @@ -86,10 +76,6 @@ static inline void __ide_outsw(void __iomem *port, const void *src, u32 count) ps = (const u16 *)pi; if(count) __raw_writew(*ps, port); - -#if defined(CONFIG_SPARC64) && defined(DCACHE_ALIASING_POSSIBLE) - __flush_dcache_range((unsigned long)src, end); -#endif } #endif /* __KERNEL__ */ diff --git a/drivers/ide/ide-taskfile.c b/drivers/ide/ide-taskfile.c index cc8633c..95a9922 100644 --- a/drivers/ide/ide-taskfile.c +++ b/drivers/ide/ide-taskfile.c @@ -273,6 +273,13 @@ void ide_pio_bytes(ide_drive_t *drive, struct ide_cmd *cmd, if (page_is_high) local_irq_restore(flags); + if (!write) { + struct address_space *mapping = page_mapping(page); + + if (mapping && mapping_mapped(mapping)) + flush_dcache_page(page); + } + len -= nr_bytes; } }
the pio callbacks are called with different kind of buffers. It could be a straight kernel addr, kernel stack or a kmaped highmem page. Some of this break the virt_to_page() assumptions. This patch moves the dcache flush from architecture code into generic ide code. ide_pio_bytes() is the only place where user pages might be written as far as I can see. The dcache flush is avoided in two cases: - data which is written to the device (i.e. they are comming from the userland) - pages without a mapping. Those requests should be issued by vfs and not go to the userland. Signed-off-by: Sebastian Andrzej Siewior <sebastian@breakpoint.cc> --- arch/mips/include/asm/mach-generic/ide.h | 21 --------------------- arch/sparc/include/asm/ide.h | 14 -------------- drivers/ide/ide-taskfile.c | 7 +++++++ 3 files changed, 7 insertions(+), 35 deletions(-)