Message ID | 20080923224535.6869.76210.stgit@jtkirshe-mobile.jf.intel.com |
---|---|
State | Rejected, archived |
Delegated to: | Jeff Garzik |
Headers | show |
From: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Date: Tue, 23 Sep 2008 15:45:54 -0700 > From: Bruce Allan <bruce.w.allan@intel.com> > > A number of users have reported NVM corruption on various ICHx platform > LOMs. One possible reasons for this could be unexpected and/or malicious > writes to the flash memory area mapped into kernel memory. Once the > interface is up, there should be very few reads/writes of the mapped flash > memory. This patch makes use of the x86 set_memory_*() functions to set > the mapped memory read-only and temporarily set it writable only when the > driver needs to write to it. With the memory set read-only, any unexpected > write will be logged with a stack dump indicating the offending code. > > Since these LOMs are only on x86 ICHx platforms, it does not matter that > this API is not yet available on other architectures, however it is > dependent on a previous patch that exports these function name symbols. > > Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> > Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> If the X server or similar is causing this problem, this patch won't help. The X server maps MMIO space using mmap() in userspace, and you're only protecting the kernel side mapping. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tue, 23 Sep 2008, David Miller wrote: > > A number of users have reported NVM corruption on various ICHx platform > > LOMs. One possible reasons for this could be unexpected and/or malicious > > writes to the flash memory area mapped into kernel memory. Once the > > interface is up, there should be very few reads/writes of the mapped flash > > memory. This patch makes use of the x86 set_memory_*() functions to set > > the mapped memory read-only and temporarily set it writable only when the > > driver needs to write to it. With the memory set read-only, any unexpected > > write will be logged with a stack dump indicating the offending code. > > Since these LOMs are only on x86 ICHx platforms, it does not matter that > > this API is not yet available on other architectures, however it is > > dependent on a previous patch that exports these function name symbols. > If the X server or similar is causing this problem, this patch won't > help. The X server maps MMIO space using mmap() in userspace, and you're > only protecting the kernel side mapping. Yes, probably not a vanilla kernel material, but could be very well used for testing to further isolate whether it is kernel or userspace corrupting the memory mapped eeprom. At least I will build suse kernel package with these patches applied and ask users who are able to reproduce the problem and willing to test (if there are any that have backported their EEPROM properly before running new kernel :) ) to give it a try.
From: Jiri Kosina <jkosina@suse.cz> Date: Wed, 24 Sep 2008 01:45:53 +0200 (CEST) > Yes, probably not a vanilla kernel material, but could be very well used > for testing to further isolate whether it is kernel or userspace > corrupting the memory mapped eeprom. Absolutely. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
David Miller wrote: > From: Jiri Kosina <jkosina@suse.cz> > Date: Wed, 24 Sep 2008 01:45:53 +0200 (CEST) > >> Yes, probably not a vanilla kernel material, but could be very well used >> for testing to further isolate whether it is kernel or userspace >> corrupting the memory mapped eeprom. > > Absolutely. Would it help and would it be feasible to drastically shorten the lifetime of the MMIO mapping? I.e. only have it mapped briefly during the driver probe, and temporarily whenever a userspace tool like ethtool needs it.
On Tue, Sep 23, 2008 at 03:45:54PM -0700, Jeff Kirsher wrote: > +#ifdef _ASM_X86_CACHEFLUSH_H > + set_memory_rw((unsigned long)hw->flash_address, > + hw->flash_len >> PAGE_SHIFT); > +#endif > writew(val, hw->flash_address + reg); > +#ifdef _ASM_X86_CACHEFLUSH_H > + set_memory_ro((unsigned long)hw->flash_address, > + hw->flash_len >> PAGE_SHIFT); > +#endif > } Hi Jeff, You're making the entire read-only mapping read-write for the single writel call... why not just limit it to the page that the writel is targetting? regards, Kyle [sorry, I've only been following this at a glance, but it's somewhat important for rawhide... does someone have a way to reproduce this at whim? Has someone tried catching it using an IOMMU on one of the newer Intel boxes, if it is a DMA going awry?] -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
It's a single page. We have not been able to reproduce it in-house as yet. -----Original Message----- From: Kyle McMartin [mailto:kyle@redhat.com] Sent: Wednesday, September 24, 2008 10:54 AM To: Kirsher, Jeffrey T Cc: jeff@garzik.org; mingo@elte.hu; davem@davemloft.net; netdev@vger.kernel.org; linux-kernel@vger.kernel.org; Brandeburg, Jesse; Ronciak, John; Allan, Bruce W Subject: Re: [PATCH 2/3] e1000e: Useset_memory_ro()/set_memory_rw() to protect flash memory On Tue, Sep 23, 2008 at 03:45:54PM -0700, Jeff Kirsher wrote: > +#ifdef _ASM_X86_CACHEFLUSH_H > + set_memory_rw((unsigned long)hw->flash_address, > + hw->flash_len >> PAGE_SHIFT); > +#endif > writew(val, hw->flash_address + reg); > +#ifdef _ASM_X86_CACHEFLUSH_H > + set_memory_ro((unsigned long)hw->flash_address, > + hw->flash_len >> PAGE_SHIFT); > +#endif > } Hi Jeff, You're making the entire read-only mapping read-write for the single writel call... why not just limit it to the page that the writel is targetting? regards, Kyle [sorry, I've only been following this at a glance, but it's somewhat important for rawhide... does someone have a way to reproduce this at whim? Has someone tried catching it using an IOMMU on one of the newer Intel boxes, if it is a DMA going awry?] -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, 24 Sep 2008, Kyle McMartin wrote: > [sorry, I've only been following this at a glance, but it's somewhat > important for rawhide... does someone have a way to reproduce this at > whim? Has someone tried catching it using an IOMMU on one of the newer > Intel boxes, if it is a DMA going awry?] We have a few machines in-house that are affected by this, and they currently have EEPROMs of their ethernet cards filled all over with 0xff. After Karsten finishes writing the tool to restore the original contents of the EEPROM, we will immediately start debugging this.
diff --git a/drivers/net/e1000e/e1000.h b/drivers/net/e1000e/e1000.h index ac4e506..2786754 100644 --- a/drivers/net/e1000e/e1000.h +++ b/drivers/net/e1000e/e1000.h @@ -36,6 +36,7 @@ #include <linux/workqueue.h> #include <linux/io.h> #include <linux/netdevice.h> +#include <asm/cacheflush.h> #include "hw.h" diff --git a/drivers/net/e1000e/hw.h b/drivers/net/e1000e/hw.h index 74f263a..dd25009 100644 --- a/drivers/net/e1000e/hw.h +++ b/drivers/net/e1000e/hw.h @@ -863,6 +863,7 @@ struct e1000_hw { u8 __iomem *hw_addr; u8 __iomem *flash_address; + resource_size_t flash_len; struct e1000_mac_info mac; struct e1000_fc_info fc; diff --git a/drivers/net/e1000e/ich8lan.c b/drivers/net/e1000e/ich8lan.c index 9e38452..f47c60e 100644 --- a/drivers/net/e1000e/ich8lan.c +++ b/drivers/net/e1000e/ich8lan.c @@ -176,12 +176,28 @@ static inline u32 __er32flash(struct e1000_hw *hw, unsigned long reg) static inline void __ew16flash(struct e1000_hw *hw, unsigned long reg, u16 val) { +#ifdef _ASM_X86_CACHEFLUSH_H + set_memory_rw((unsigned long)hw->flash_address, + hw->flash_len >> PAGE_SHIFT); +#endif writew(val, hw->flash_address + reg); +#ifdef _ASM_X86_CACHEFLUSH_H + set_memory_ro((unsigned long)hw->flash_address, + hw->flash_len >> PAGE_SHIFT); +#endif } static inline void __ew32flash(struct e1000_hw *hw, unsigned long reg, u32 val) { +#ifdef _ASM_X86_CACHEFLUSH_H + set_memory_rw((unsigned long)hw->flash_address, + hw->flash_len >> PAGE_SHIFT); +#endif writel(val, hw->flash_address + reg); +#ifdef _ASM_X86_CACHEFLUSH_H + set_memory_ro((unsigned long)hw->flash_address, + hw->flash_len >> PAGE_SHIFT); +#endif } #define er16flash(reg) __er16flash(hw, (reg)) diff --git a/drivers/net/e1000e/netdev.c b/drivers/net/e1000e/netdev.c index d266510..0e51841 100644 --- a/drivers/net/e1000e/netdev.c +++ b/drivers/net/e1000e/netdev.c @@ -4364,7 +4364,6 @@ static int __devinit e1000_probe(struct pci_dev *pdev, struct e1000_hw *hw; const struct e1000_info *ei = e1000_info_tbl[ent->driver_data]; resource_size_t mmio_start, mmio_len; - resource_size_t flash_start, flash_len; static int cards_found; int i, err, pci_using_dac; @@ -4434,11 +4433,15 @@ static int __devinit e1000_probe(struct pci_dev *pdev, if ((adapter->flags & FLAG_HAS_FLASH) && (pci_resource_flags(pdev, 1) & IORESOURCE_MEM)) { - flash_start = pci_resource_start(pdev, 1); - flash_len = pci_resource_len(pdev, 1); - adapter->hw.flash_address = ioremap(flash_start, flash_len); + adapter->hw.flash_len = pci_resource_len(pdev, 1); + adapter->hw.flash_address = ioremap(pci_resource_start(pdev, 1), + adapter->hw.flash_len); if (!adapter->hw.flash_address) goto err_flashmap; +#ifdef _ASM_X86_CACHEFLUSH_H + set_memory_ro((unsigned long)adapter->hw.flash_address, + adapter->hw.flash_len >> PAGE_SHIFT); +#endif } /* construct the net_device struct */