From patchwork Tue Feb 10 18:56:53 2009
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Phil Dennis-Jordan
X-Patchwork-Id: 22884
X-Patchwork-Delegate: davem@davemloft.net
Return-Path:
X-Original-To: patchwork-incoming@ozlabs.org
Delivered-To: patchwork-incoming@ozlabs.org
Received: from vger.kernel.org (vger.kernel.org [209.132.176.167])
	by ozlabs.org (Postfix) with ESMTP id 964B0DDD0B
	for ; Wed, 11 Feb 2009 05:57:01 +1100 (EST)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1754503AbZBJS44 (ORCPT ); Tue, 10 Feb 2009 13:56:56 -0500
Received: (majordomo@vger.kernel.org) by vger.kernel.org
	id S1753887AbZBJS44 (ORCPT ); Tue, 10 Feb 2009 13:56:56 -0500
Received: from fg-out-1718.google.com ([72.14.220.158]:8727 "EHLO
	fg-out-1718.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1750945AbZBJS44 (ORCPT );
	Tue, 10 Feb 2009 13:56:56 -0500
Received: by fg-out-1718.google.com with SMTP id 16so18485fgg.17
	for ; Tue, 10 Feb 2009 10:56:53 -0800 (PST)
MIME-Version: 1.0
Received: by 10.86.99.9 with SMTP id w9mr39270fgb.31.1234292213350;
	Tue, 10 Feb 2009 10:56:53 -0800 (PST)
Date: Tue, 10 Feb 2009 19:56:53 +0100
Message-ID:
Subject: [PATCH] skge: Fix/workaround for DMA mask quirk on ASUS P5NSLI/Marvell Yukon-Lite
From: Phillip Michael Jordan
To: shemminger@linux-foundation.org
Cc: netdev@vger.kernel.org
Sender: netdev-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: netdev@vger.kernel.org

From: Phillip Michael Jordan

The onboard Marvell Yukon-Lite gigabit ethernet chip on my ASUS P5NSLI
motherboard with the nForce570 SLI/Intel chipset (any BIOS version,
including latest), using the skge module, stopped working after upgrading
the system to more than 3GB of physical RAM. The problem has been around
for a while, at least since 2.6.22.

Symptoms on earlier kernels (at least up to 2.6.27) are severely corrupted
ethernet packets (observed via wireshark), associated IP packet loss, and
eventually no packets being delivered at all. As of 2.6.29-rc4, the kernel
panics about 1-2 seconds after insmod with 8GB of memory installed; as far
as I can tell, this is due to memory corruption.

I have now traced this problem to DMA to/from memory above the 32-bit
boundary, which fails even though the pci_set_dma_mask() and
pci_set_consistent_dma_mask() calls in skge_probe() apparently succeed
with DMA_64BIT_MASK. Switching to DMA_32BIT_MASK makes the problem
disappear entirely, so this patch against 2.6.29-rc4 does just that for
the affected system, identifying the board via DMI data and the ethernet
chip via vendor/product ID.

I've tried to make it as unintrusive as possible, and to make it easy to
add other devices that behave similarly in the future. Nothing changes for
devices not on the blacklist (admittedly, I am unable to verify this due
to lack of other skge hardware).

Searching the web, I found that others have had similar problems, though
not on the same motherboard. Passing iommu=force to the kernel seems to
have worked in some of those cases. In my case, it just breaks a number of
other PCI(e) devices, including all of USB, video, etc., and skge still
doesn't work. I can therefore only conclude that there is a bug in either
the chipset or the BIOS.

Signed-off-by: Phillip Michael Jordan
---
I don't have documentation for the hardware, so I'm fighting the symptoms
here. Oddly enough, no other device in my system seems to suffer from the
problem, so I struggled to pin the fix somewhere other than in skge. I'm
not sure whether querying DMI data is the canonical way of detecting
quirks like this; if there's a better way, I'd appreciate some information
on that.

The patch applies cleanly to earlier kernel versions. Comments &
suggestions welcome!
phil

diff --git a/drivers/net/skge.c b/drivers/net/skge.c
index c9dbb06..8d4127a 100644
--- a/drivers/net/skge.c
+++ b/drivers/net/skge.c
@@ -39,6 +39,7 @@
 #include
 #include
 #include
+#include
 #include
 
 #include "skge.h"
@@ -3891,12 +3892,48 @@ static void __devinit skge_show_addr(struct net_device *dev)
 		       dev->name, dev->dev_addr);
 }
 
+/* nonzero if the device has troubles with 64-bit DMA address mask on
+ * this system. */
+static int __devinit skge_use_32bit_dma_quirk(struct pci_dev *pdev)
+{
+	/* Blacklist of Motherboard(s) & onboard chips that incorrectly report
+	 * 64-bit DMA mask capability and require forcing 32-bit mask to work. */
+	static struct pci_device_id marvell_4320[] =
+	{
+		{ PCI_DEVICE(PCI_VENDOR_ID_MARVELL, 0x4320) },
+		{ }
+	};
+	static struct dmi_system_id quirk_devices[] = {
+		{
+			.ident = "Marvell 88E8001 on ASUS P5NSLI",
+			.matches = {
+				DMI_MATCH(DMI_BOARD_VENDOR, "ASUSTeK Computer INC."),
+				DMI_MATCH(DMI_BOARD_NAME, "P5NSLI")
+			},
+			.driver_data = marvell_4320
+		},
+		{ } /* terminate list */
+	};
+
+	/* see if we can find our system on the blacklist */
+	const struct dmi_system_id* remaining = quirk_devices;
+	while ((remaining = dmi_first_match(remaining)) != NULL)
+	{
+		/* found the motherboard, check whether the current net device is quirky */
+		if (pci_match_id((const struct pci_device_id*)remaining->driver_data, pdev))
+			return 1;
+		++remaining;
+	}
+
+	return 0;
+}
+
 static int __devinit skge_probe(struct pci_dev *pdev,
 				const struct pci_device_id *ent)
 {
 	struct net_device *dev, *dev1;
 	struct skge_hw *hw;
-	int err, using_dac = 0;
+	int err, using_dac = 0, dma_32bit_quirk = 0;
 
 	err = pci_enable_device(pdev);
 	if (err) {
@@ -3912,7 +3949,10 @@ static int __devinit skge_probe(struct pci_dev *pdev,
 
 	pci_set_master(pdev);
 
-	if (!pci_set_dma_mask(pdev, DMA_64BIT_MASK)) {
+	/* check if we're on a system which falsely claims to allow 64-bit DMA mask */
+	dma_32bit_quirk = skge_use_32bit_dma_quirk(pdev);
+
+	if (!dma_32bit_quirk && !pci_set_dma_mask(pdev, DMA_64BIT_MASK)) {
 		using_dac = 1;
 		err = pci_set_consistent_dma_mask(pdev, DMA_64BIT_MASK);
 	} else if (!(err = pci_set_dma_mask(pdev, DMA_32BIT_MASK))) {
@@ -3958,9 +3998,10 @@ static int __devinit skge_probe(struct pci_dev *pdev,
 	if (err)
 		goto err_out_iounmap;
 
-	printk(KERN_INFO PFX DRV_VERSION " addr 0x%llx irq %d chip %s rev %d\n",
+	printk(KERN_INFO PFX DRV_VERSION " addr 0x%llx irq %d chip %s rev %d %s\n",
 	       (unsigned long long)pci_resource_start(pdev, 0), pdev->irq,
-	       skge_board_name(hw), hw->chip_rev);
+	       skge_board_name(hw), hw->chip_rev,
+	       dma_32bit_quirk ? "32-bit DMA mask quirk on" : "");
 
 	dev = skge_devinit(hw, 0, using_dac);
 	if (!dev)
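
For anyone who wants to experiment with the matching logic outside the kernel, the following is a minimal user-space sketch in C of the two-level blacklist check (board first, then chip) and the mask fallback the patch describes. Everything prefixed model_ or MODEL_ is hypothetical and is not part of skge or any kernel API. The board strings are compared with an exact strcmp(), which is a simplification and may not match how the kernel's DMI matching behaves. The device ID 0x4320 is taken from the patch; 0x11ab is the Marvell PCI vendor ID.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for the kernel's pci_device_id and
 * dmi_system_id tables; illustration only. */
struct model_pci_id {
	uint16_t vendor;
	uint16_t device;
};

struct model_board_quirk {
	const char *board_vendor;         /* NULL terminates the table */
	const char *board_name;
	const struct model_pci_id *chips; /* list ends with vendor == 0 */
};

#define MODEL_VENDOR_MARVELL 0x11ab

static const struct model_pci_id model_marvell_4320[] = {
	{ MODEL_VENDOR_MARVELL, 0x4320 },
	{ 0, 0 }
};

/* Adding another affected board would mean adding another row here. */
static const struct model_board_quirk model_quirks[] = {
	{ "ASUSTeK Computer INC.", "P5NSLI", model_marvell_4320 },
	{ NULL, NULL, NULL }
};

/* Returns 1 only when the board AND the chip are both on the blacklist. */
static int model_use_32bit_dma_quirk(const char *board_vendor,
				     const char *board_name,
				     uint16_t pci_vendor, uint16_t pci_device)
{
	const struct model_board_quirk *q;
	const struct model_pci_id *id;

	for (q = model_quirks; q->board_vendor != NULL; q++) {
		if (strcmp(q->board_vendor, board_vendor) != 0 ||
		    strcmp(q->board_name, board_name) != 0)
			continue;
		/* board matched, now check the chip list for this board */
		for (id = q->chips; id->vendor != 0; id++) {
			if (id->vendor == pci_vendor &&
			    id->device == pci_device)
				return 1;
		}
	}
	return 0;
}

/* Models the probe decision: returns the mask width that would be used
 * (64 or 32), or 0 if neither mask could be set. set64_ok and set32_ok
 * stand for "setting that mask succeeded". */
static int model_pick_dma_bits(int quirk, int set64_ok, int set32_ok)
{
	if (!quirk && set64_ok)
		return 64;
	if (set32_ok)
		return 32;
	return 0;
}

/* Small self-check of the two helpers above. */
static void model_self_check(void)
{
	int quirk = model_use_32bit_dma_quirk("ASUSTeK Computer INC.",
					      "P5NSLI",
					      MODEL_VENDOR_MARVELL, 0x4320);

	assert(quirk == 1);
	assert(model_pick_dma_bits(quirk, 1, 1) == 32);
}
```

The point of the sketch is that the quirk only short-circuits the 64-bit attempt: a blacklisted board/chip pair goes straight to the 32-bit path even when the 64-bit mask would have been accepted, while every other combination keeps the existing behaviour.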