Patchwork ucc_geth: fix ethtool set ring param bug

login
register
mail settings
Submitter Liang Li
Date Aug. 30, 2010, 2:47 p.m.
Message ID <1283179677-21262-1-git-send-email-liang.li@windriver.com>
Download mbox | patch
Permalink /patch/63085/
State Superseded
Delegated to: Kumar Gala
Headers show

Comments

Liang Li - Aug. 30, 2010, 2:47 p.m.
It's common sense that when we should do change to driver ring
desc/buffer etc only after 'stop/shutdown' the device. When we
do change while devices/driver is running, kernel oops occur:

[
root@fsl_8569mds:/root> ethtool -G eth0 tx 256
root@fsl_8569mds:/root> Oops: Kernel access of bad area, sig: 11 [#1]
MPC8569 MDS
last sysfs file: /sys/kernel/uevent_seqnum
Modules linked in:
NIP: 00000000 LR: c0072fbc CTR: 00000000
REGS: effefef0 TRAP: 0400   Not tainted  (2.6.36-rc3-00002-g6c3b118-dirty)
MSR: 00021000 <ME,CE>  CR: 24442048  XER: 00000000
TASK = c0518350[0] 'swapper' THREAD: c0544000
GPR00: 00000000 effeffa0 c0518350 00000020 ef0be000 ef005000 80000000 00000200
GPR08: c03b5d00 00000000 f1010080 ef08d458 000dda96 00000000 3ffb2900 00000000
GPR16: 00000000 3ffa8948 3fff1314 3ffac3f8 00000000 00000000 00000000 00000000
GPR24: 00000000 00000000 00000000 c0530000 00000000 00000000 00000020 ef3709c0
NIP [00000000] (null)
LR [c0072fbc] handle_IRQ_event+0x4c/0x12c
Call Trace:
[effeffa0] [c0019414] qe_ic_mask_irq+0x1c/0x90 (unreliable)
[effeffc0] [c0075b74] handle_level_irq+0x88/0x128
[effeffd0] [c001ca44] qe_ic_cascade_muxed_mpic+0x50/0x88
[effefff0] [c000d5fc] call_handle_irq+0x18/0x28
[c0545ea0] [c0004f3c] do_IRQ+0xa8/0x140
[c0545ed0] [c000e2bc] ret_from_except+0x0/0x18
-- Exception: 501 at cpu_idle+0x9c/0xdc
    LR = cpu_idle+0x9c/0xdc
[c0545f90] [c0008318] cpu_idle+0x54/0xdc (unreliable)
[c0545fb0] [c000231c] rest_init+0x68/0x7c
[c0545fc0] [c04e686c] start_kernel+0x230/0x2b0
[c0545ff0] [c000039c] skpinv+0x2b4/0x2f0
Instruction dump:
XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX
XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX
Kernel panic - not syncing: Fatal exception in interrupt
Rebooting in 180 seconds..
]

Then the natural solution would be 'stop driver/device then adjust
ring buffer parameter then reactivate driver/device'.

Signed-off-by: Liang Li <liang.li@windriver.com>
---
 drivers/net/ucc_geth.c         |    4 ++--
 drivers/net/ucc_geth.h         |    2 ++
 drivers/net/ucc_geth_ethtool.c |   17 +++++++++++------
 3 files changed, 15 insertions(+), 8 deletions(-)
Ben Hutchings - Aug. 31, 2010, 2:41 p.m.
On Mon, 2010-08-30 at 22:47 +0800, Liang Li wrote:
> It's common sense that when we should do change to driver ring
> desc/buffer etc only after 'stop/shutdown' the device. When we
> do change while devices/driver is running, kernel oops occur:
[...]
> diff --git a/drivers/net/ucc_geth_ethtool.c b/drivers/net/ucc_geth_ethtool.c
> index 6f92e48..1b37aaa 100644
> --- a/drivers/net/ucc_geth_ethtool.c
> +++ b/drivers/net/ucc_geth_ethtool.c
> @@ -255,13 +255,18 @@ uec_set_ringparam(struct net_device *netdev,
>  		return -EINVAL;
>  	}
>  
> -	ug_info->bdRingLenRx[queue] = ring->rx_pending;
> -	ug_info->bdRingLenTx[queue] = ring->tx_pending;
> -
>  	if (netif_running(netdev)) {
> -		/* FIXME: restart automatically */
> -		printk(KERN_INFO
> -			"Please re-open the interface.\n");
> +		printk(KERN_INFO "Stopping interface %s.\n", netdev->name);
> +		ucc_geth_close(netdev);
> +
> +		ug_info->bdRingLenRx[queue] = ring->rx_pending;
> +		ug_info->bdRingLenTx[queue] = ring->tx_pending;
> +
> +		printk(KERN_INFO "Reactivating interface %s.\n", netdev->name);
> +		ucc_geth_open(netdev);

What if ucc_geth_open() fails?

Ben.

> +	} else {
> +		ug_info->bdRingLenRx[queue] = ring->rx_pending;
> +		ug_info->bdRingLenTx[queue] = ring->tx_pending;
>  	}
>  
>  	return ret;
Liang Li - Aug. 31, 2010, 3:16 p.m.
On Tue, Aug 31, 2010 at 03:41:22PM +0100, Ben Hutchings wrote:
> On Mon, 2010-08-30 at 22:47 +0800, Liang Li wrote:
> > It's common sense that when we should do change to driver ring
> > desc/buffer etc only after 'stop/shutdown' the device. When we
> > do change while devices/driver is running, kernel oops occur:
> [...]
> > diff --git a/drivers/net/ucc_geth_ethtool.c b/drivers/net/ucc_geth_ethtool.c
> > index 6f92e48..1b37aaa 100644
> > --- a/drivers/net/ucc_geth_ethtool.c
> > +++ b/drivers/net/ucc_geth_ethtool.c
> > @@ -255,13 +255,18 @@ uec_set_ringparam(struct net_device *netdev,
> >  		return -EINVAL;
> >  	}
> >  
> > -	ug_info->bdRingLenRx[queue] = ring->rx_pending;
> > -	ug_info->bdRingLenTx[queue] = ring->tx_pending;
> > -
> >  	if (netif_running(netdev)) {
> > -		/* FIXME: restart automatically */
> > -		printk(KERN_INFO
> > -			"Please re-open the interface.\n");
> > +		printk(KERN_INFO "Stopping interface %s.\n", netdev->name);
> > +		ucc_geth_close(netdev);
> > +
> > +		ug_info->bdRingLenRx[queue] = ring->rx_pending;
> > +		ug_info->bdRingLenTx[queue] = ring->tx_pending;
> > +
> > +		printk(KERN_INFO "Reactivating interface %s.\n", netdev->name);
> > +		ucc_geth_open(netdev);
> 
> What if ucc_geth_open() fails?

I did some runtime tests but did not witness the ucc_geth_open fail.
Assume it may fail for some reason, then I tend to think give out
warnings for request user to open/enable it mannually?  Or we may need
to keep the 'FIXME' line?

Thanks,
				-Liang Li

> 
> Ben.
> 
> > +	} else {
> > +		ug_info->bdRingLenRx[queue] = ring->rx_pending;
> > +		ug_info->bdRingLenTx[queue] = ring->tx_pending;
> >  	}
> >  
> >  	return ret;
> 
> -- 
> Ben Hutchings, Senior Software Engineer, Solarflare Communications
> Not speaking for my employer; that's the marketing department's job.
> They asked us to note that Solarflare product names are trademarked.
>
Ben Hutchings - Aug. 31, 2010, 3:23 p.m.
On Tue, 2010-08-31 at 23:16 +0800, Liang Li wrote:
> On Tue, Aug 31, 2010 at 03:41:22PM +0100, Ben Hutchings wrote:
> > On Mon, 2010-08-30 at 22:47 +0800, Liang Li wrote:
> > > It's common sense that when we should do change to driver ring
> > > desc/buffer etc only after 'stop/shutdown' the device. When we
> > > do change while devices/driver is running, kernel oops occur:
> > [...]
> > > diff --git a/drivers/net/ucc_geth_ethtool.c b/drivers/net/ucc_geth_ethtool.c
> > > index 6f92e48..1b37aaa 100644
> > > --- a/drivers/net/ucc_geth_ethtool.c
> > > +++ b/drivers/net/ucc_geth_ethtool.c
> > > @@ -255,13 +255,18 @@ uec_set_ringparam(struct net_device *netdev,
> > >  		return -EINVAL;
> > >  	}
> > >  
> > > -	ug_info->bdRingLenRx[queue] = ring->rx_pending;
> > > -	ug_info->bdRingLenTx[queue] = ring->tx_pending;
> > > -
> > >  	if (netif_running(netdev)) {
> > > -		/* FIXME: restart automatically */
> > > -		printk(KERN_INFO
> > > -			"Please re-open the interface.\n");
> > > +		printk(KERN_INFO "Stopping interface %s.\n", netdev->name);
> > > +		ucc_geth_close(netdev);
> > > +
> > > +		ug_info->bdRingLenRx[queue] = ring->rx_pending;
> > > +		ug_info->bdRingLenTx[queue] = ring->tx_pending;
> > > +
> > > +		printk(KERN_INFO "Reactivating interface %s.\n", netdev->name);
> > > +		ucc_geth_open(netdev);
> > 
> > What if ucc_geth_open() fails?
> 
> I did some runtime tests but did not witness the ucc_geth_open fail.
> Assume it may fail for some reason, then I tend to think give out
> warnings for request user to open/enable it mannually?  Or we may need
> to keep the 'FIXME' line?

The easy way out is to allow changing the ring size only while the
interface is down.  The hard way is to make the change in such a way
that you can roll back in case of failure.

Ben.

Patch

diff --git a/drivers/net/ucc_geth.c b/drivers/net/ucc_geth.c
index 1b0aef3..9779185 100644
--- a/drivers/net/ucc_geth.c
+++ b/drivers/net/ucc_geth.c
@@ -3485,7 +3485,7 @@  err:
 
 /* Called when something needs to use the ethernet device */
 /* Returns 0 for success. */
-static int ucc_geth_open(struct net_device *dev)
+int ucc_geth_open(struct net_device *dev)
 {
 	struct ucc_geth_private *ugeth = netdev_priv(dev);
 	int err;
@@ -3542,7 +3542,7 @@  err:
 }
 
 /* Stops the kernel queue, and halts the controller */
-static int ucc_geth_close(struct net_device *dev)
+int ucc_geth_close(struct net_device *dev)
 {
 	struct ucc_geth_private *ugeth = netdev_priv(dev);
 
diff --git a/drivers/net/ucc_geth.h b/drivers/net/ucc_geth.h
index ef1fbeb..f5cb874 100644
--- a/drivers/net/ucc_geth.h
+++ b/drivers/net/ucc_geth.h
@@ -1235,5 +1235,7 @@  int init_flow_control_params(u32 automatic_flow_control_mode,
 		u32 __iomem *upsmr_register, u32 __iomem *uempr_register,
 		u32 __iomem *maccfg1_register);
 
+extern int ucc_geth_open(struct net_device *);
+extern int ucc_geth_close(struct net_device *);
 
 #endif				/* __UCC_GETH_H__ */
diff --git a/drivers/net/ucc_geth_ethtool.c b/drivers/net/ucc_geth_ethtool.c
index 6f92e48..1b37aaa 100644
--- a/drivers/net/ucc_geth_ethtool.c
+++ b/drivers/net/ucc_geth_ethtool.c
@@ -255,13 +255,18 @@  uec_set_ringparam(struct net_device *netdev,
 		return -EINVAL;
 	}
 
-	ug_info->bdRingLenRx[queue] = ring->rx_pending;
-	ug_info->bdRingLenTx[queue] = ring->tx_pending;
-
 	if (netif_running(netdev)) {
-		/* FIXME: restart automatically */
-		printk(KERN_INFO
-			"Please re-open the interface.\n");
+		printk(KERN_INFO "Stopping interface %s.\n", netdev->name);
+		ucc_geth_close(netdev);
+
+		ug_info->bdRingLenRx[queue] = ring->rx_pending;
+		ug_info->bdRingLenTx[queue] = ring->tx_pending;
+
+		printk(KERN_INFO "Reactivating interface %s.\n", netdev->name);
+		ucc_geth_open(netdev);
+	} else {
+		ug_info->bdRingLenRx[queue] = ring->rx_pending;
+		ug_info->bdRingLenTx[queue] = ring->tx_pending;
 	}
 
 	return ret;