Message ID | a06cf09ed5096bf8f9434d90828adfa70586e936.1278307573.git.richard.cochran@omicron.at |
---|---|
State | Rejected, archived |
Delegated to: | David Miller |
Headers | show |
From: Richard Cochran <richardcochran@gmail.com> Date: Mon, 5 Jul 2010 07:33:14 +0200 > In order to support hardware time stamping from a PHY, it is necessary to > read from the PHY while running in_interrupt(). This patch allows a mii > bus to operate in an atomic context. An mii_bus driver may declare itself > capable for this mode. Drivers which do not do this will remain with the > default that bus operations may sleep. > > Before commit 35b5f6b1a82b5c586e0b24c711dc6ba944e88ef1 mii bus > operations were protected with spin locks. That commit replaced the > locks with mutexs in order to accommodate i2c buses that need to > sleep. Thus, this patch restores the original behavior as a run time > option. > > Signed-off-by: Richard Cochran <richard.cochran@omicron.at> Ok, I'm not to happy about this one, to be honest. Conditional locking is always a problem waiting to happen. Question: What context can you call mdiobus_lock() in? Answer: You absolutely cannot know. Therefore every piece of code, except those with special knowledge of what kind of device is behind the mdiobus object, have to assume the worst which is that the interface can only be called in sleepable contexts. We need to find another way to accomodate this, I really don't want to see this kind of situation where the locking facility type is conditional upon the device sitting behind the interface. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Mon, Jul 5, 2010 at 12:33 AM, Richard Cochran <richardcochran@gmail.com> wrote: > In order to support hardware time stamping from a PHY, it is necessary to > read from the PHY while running in_interrupt(). This patch allows a mii > bus to operate in an atomic context. An mii_bus driver may declare itself > capable for this mode. Drivers which do not do this will remain with the > default that bus operations may sleep. > > Before commit 35b5f6b1a82b5c586e0b24c711dc6ba944e88ef1 mii bus > operations were protected with spin locks. That commit replaced the > locks with mutexs in order to accommodate i2c buses that need to > sleep. Thus, this patch restores the original behavior as a run time > option. The reason we moved to mutexes was because we didn't want PHY operations to be done in interrupt context. They are too slow. Also, some MII busses will trigger an interrupt when the operation is done, which means their driver has the option of sleeping. As such, it was an original principle of the PHY lib that MII transactions were not allowed in interrupt context. *Certainly*, once you *do* allow MII transactions in interrupt context, you *cannot* use spin_lock(). You at least have to use spin_lock_irq[save]. Also, I agree with David's comments, and Grant's. There's got to be another way to do this. Andy -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tue, Jul 06, 2010 at 12:09:10PM -0500, Andy Fleming wrote: > allowed in interrupt context. *Certainly*, once you *do* allow MII > transactions in interrupt context, you *cannot* use spin_lock(). You > at least have to use spin_lock_irq[save]. Okay, I see. Before the change to mutexes the code used spin_lock_bh. > Also, I agree with David's comments, and Grant's. There's got to be > another way to do this. I understand what troubles you about the proposed changes. However, I cannot see a better way to get a PHY time stamp other than to read it out during the napi poll and hard_xmit functions. It is a chicken and egg problem. Consider the receive path: 1. PTP Packet passed through PHY. PHY recognizes it and stores a time stamp along with some UID from the packet. 2. Napi calls the MAC driver's poll function. 3. MAC driver acquires packet. At this point, if we want to have a hardware time stamp, we must read it out over the mdio bus, before handing the packet over to the stack via netif_receive_skb(). If we decide to defer the packet delivery (in step 3), we have to know whether the PHY will have a time stamp for this packet. The only way to do this is to compare the UIDs, but that requires reading it over the mdio bus. One could simply defer *all* packets, but that would really stink. Possibly, we could defer all likely packets, for example all PTP packets. Changing all the MAC driver to call a time stamping hook that conditionally consumes the skbuffs would also stink. Could the packet be deferred by the stack, early in netif_receive_skb? If so, a work queue could read the PHY and then deliver the deferred packets at some later time. IMHO, it better just to check for a PHY time stamp immediately, and live with the performance hit. I believe that PTP users will accept this. Naturally, it requires allowing reading the mdio bus in two non-sleeping contexts. There are perhaps other ways to provide PHY time stamps, but I am doubtful that they would work better, in general. The National Semiconductor PHYTER can provide the time stamps as status frames, and it can also insert time stamps directly into the incoming and outgoing frames. * Using status frames is an attractive idea, because it obviates the need to read over the mdio bus. However, you still have the problem of matching the status frames to the PTP packets. * Inserting a time stamp into outgoing frames presents no problems (this is the so called "PTP one step" operation). However, for incoming frames the PHYTER inserts the time stamp into the PTP message at a programmable offset. This invalidates the UDP checksum for PTPv1 packets (PTPv2 checksums will be automatically corrected.) Even if we implement one of these alternatives, I am not sure that other PHYs will also offer the same capabilities as the PHYTER. Well, there you have it. I welcome any ideas on how to go about offering PHY time stamping. Thanks, Richard -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, Jul 07, 2010 at 09:18:32AM +0200, Richard Cochran wrote: > Consider the receive path: > > 1. PTP Packet passed through PHY. PHY recognizes it and stores a time > stamp along with some UID from the packet. > > 2. Napi calls the MAC driver's poll function. > > 3. MAC driver acquires packet. At this point, if we want to have a > hardware time stamp, we must read it out over the mdio bus, before > handing the packet over to the stack via netif_receive_skb(). > > If we decide to defer the packet delivery (in step 3), we have to know > whether the PHY will have a time stamp for this packet. The only way > to do this is to compare the UIDs, but that requires reading it over > the mdio bus. (Forwarding Andy's message to me that was missing a CC to the list) On Thu, Jul 08, 2010 at 03:40:41PM -0500, Andy Fleming wrote: Wait, is the intent for this mdio read to be done for *every* packet? MDIO is spec'ed out to go up to 2.5MHz. Each transaction takes 64 cycles. And I see that reading the timestamp from the PHY you submitted support for takes at *least* 2 transactions. The fastest you can process packets would then be under 20,000 packets per second. Even on a 100Mb link with full-sized packets, you would be 40% done receiving the next packet before you had passed the packet on to the stack. Each MDIO transaction takes 25,600 cycles on a gigahertz processor. It's just too long, IMO, and it looks like this code will end up doing up to...10? I wasn't able to find an example of how you were going to use the time-stamp functions you provided. Could you please go into a little more detail about how you intended this to work? -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu, Jul 08, 2010 at 03:40:41PM -0500, Andy Fleming wrote: > > Wait, is the intent for this mdio read to be done for *every* packet? > MDIO is spec'ed out to go up to 2.5MHz. Each transaction takes 64 > cycles. And I see that reading the timestamp from the PHY you > submitted support for takes at *least* 2 transactions. The fastest > you can process packets would then be under 20,000 packets per second. > Even on a 100Mb link with full-sized packets, you would be 40% done > receiving the next packet before you had passed the packet on to the > stack. Each MDIO transaction takes 25,600 cycles on a gigahertz > processor. It's just too long, IMO, and it looks like this code will > end up doing up to...10? Well, it is actually worse than that. From my measurements, 64 cycles at 2.5 MHz (25.6 usec) is too optimistic. On one Xscale IXP platform, I get 35 to 40 usec per read. Also, the PHYTER needs six reads for a Rx and four for a Tx time stamp. So you are right, it is a very long time to leave interrupts off. However, not every packet needs a time stamp. The PHY devices that I know of all are selective. That is, they recognize PTP packets in hardware and only provide time stamps for those special packets. The packet rate is not that hight. For example, the "sync" message comes once every two seconds in PTPv1 and up to ten times per second in PTPv2. I have been working on an alternative, which I will post soon. It goes like this... (comments, please, before I get too far into it!) * General - phy registers itself with net_device { opaque pointer } - ts_candidate(skb): function to detect likely packets, returns true if 1. phy_device pointer exists in net_device 2. data look like PTP, using a BPF - work queue: 1. read out time stamps 2. match times with queued skbs 3. deliver skbs with time stamps (or without on timeout) * Rx path - netif_receive_skb: if ts_candidate(skb), defer(skb). * Tx path - hook in each MAC driver - if ts_candidate(skb), clone and defer(skb). > I wasn't able to find an example of how you were going to use the > time-stamp functions you provided. Could you please go into a little > more detail about how you intended this to work? My last PTP series included the PHY stuff, too. There you can see how it all fits together. http://marc.info/?l=linux-netdev&m=127661796627120&w=3 Thanks, Richard -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/drivers/net/fsl_pq_mdio.c b/drivers/net/fsl_pq_mdio.c index b4c41d7..0b34f50 100644 --- a/drivers/net/fsl_pq_mdio.c +++ b/drivers/net/fsl_pq_mdio.c @@ -145,7 +145,7 @@ static int fsl_pq_mdio_reset(struct mii_bus *bus) struct fsl_pq_mdio __iomem *regs = fsl_pq_mdio_get_regs(bus); int timeout = PHY_INIT_TIMEOUT; - mutex_lock(&bus->mdio_lock); + mdiobus_lock(bus); /* Reset the management interface */ out_be32(®s->miimcfg, MIIMCFG_RESET); @@ -157,7 +157,7 @@ static int fsl_pq_mdio_reset(struct mii_bus *bus) while ((in_be32(®s->miimind) & MIIMIND_BUSY) && timeout--) cpu_relax(); - mutex_unlock(&bus->mdio_lock); + mdiobus_unlock(bus); if (timeout < 0) { printk(KERN_ERR "%s: The MII Bus is stuck!\n", diff --git a/drivers/net/phy/mdio_bus.c b/drivers/net/phy/mdio_bus.c index 6a6b819..ad6bed8 100644 --- a/drivers/net/phy/mdio_bus.c +++ b/drivers/net/phy/mdio_bus.c @@ -37,6 +37,32 @@ #include <asm/uaccess.h> /** + * mdiobus_lock - locks a given bus for read or write operations. + * @bus: target mii_bus + */ +void mdiobus_lock(struct mii_bus *bus) +{ + if (MDIOBUS_SLEEPS_RW == bus->locktype) + mutex_lock(&bus->lock.m); + else + spin_lock(&bus->lock.s); +} +EXPORT_SYMBOL(mdiobus_lock); + +/** + * mdiobus_unlock - unlocks a given bus for read or write operations. + * @bus: target mii_bus + */ +void mdiobus_unlock(struct mii_bus *bus) +{ + if (MDIOBUS_SLEEPS_RW == bus->locktype) + mutex_unlock(&bus->lock.m); + else + spin_unlock(&bus->lock.s); +} +EXPORT_SYMBOL(mdiobus_unlock); + +/** * mdiobus_alloc - allocate a mii_bus structure * * Description: called by a bus driver to allocate an mii_bus @@ -107,7 +133,10 @@ int mdiobus_register(struct mii_bus *bus) return -EINVAL; } - mutex_init(&bus->mdio_lock); + if (MDIOBUS_SLEEPS_RW == bus->locktype) + mutex_init(&bus->lock.m); + else + spin_lock_init(&bus->lock.s); if (bus->reset) bus->reset(bus); @@ -212,11 +241,12 @@ int mdiobus_read(struct mii_bus *bus, int addr, u32 regnum) { int retval; - BUG_ON(in_interrupt()); + if (MDIOBUS_SLEEPS_RW == bus->locktype) + BUG_ON(in_interrupt()); - mutex_lock(&bus->mdio_lock); + mdiobus_lock(bus); retval = bus->read(bus, addr, regnum); - mutex_unlock(&bus->mdio_lock); + mdiobus_unlock(bus); return retval; } @@ -237,11 +267,12 @@ int mdiobus_write(struct mii_bus *bus, int addr, u32 regnum, u16 val) { int err; - BUG_ON(in_interrupt()); + if (MDIOBUS_SLEEPS_RW == bus->locktype) + BUG_ON(in_interrupt()); - mutex_lock(&bus->mdio_lock); + mdiobus_lock(bus); err = bus->write(bus, addr, regnum, val); - mutex_unlock(&bus->mdio_lock); + mdiobus_unlock(bus); return err; } diff --git a/include/linux/phy.h b/include/linux/phy.h index 7a8caac..93ea55f 100644 --- a/include/linux/phy.h +++ b/include/linux/phy.h @@ -98,11 +98,20 @@ struct mii_bus { int (*write)(struct mii_bus *bus, int phy_id, int regnum, u16 val); int (*reset)(struct mii_bus *bus); + /* Indicates whether bus may be used from an atomic context. */ + enum { + MDIOBUS_SLEEPS_RW, + MDIOBUS_ATOMIC_RW + } locktype; + /* - * A lock to ensure that only one thing can read/write + * A lock or mutex to ensure that only one thing can read/write * the MDIO bus at a time */ - struct mutex mdio_lock; + union { + struct mutex m; + spinlock_t s; + } lock; struct device *parent; enum { @@ -127,6 +136,8 @@ struct mii_bus { }; #define to_mii_bus(d) container_of(d, struct mii_bus, dev) +void mdiobus_lock(struct mii_bus *bus); +void mdiobus_unlock(struct mii_bus *bus); struct mii_bus *mdiobus_alloc(void); int mdiobus_register(struct mii_bus *bus); void mdiobus_unregister(struct mii_bus *bus);
In order to support hardware time stamping from a PHY, it is necessary to read from the PHY while running in_interrupt(). This patch allows a mii bus to operate in an atomic context. An mii_bus driver may declare itself capable for this mode. Drivers which do not do this will remain with the default that bus operations may sleep. Before commit 35b5f6b1a82b5c586e0b24c711dc6ba944e88ef1 mii bus operations were protected with spin locks. That commit replaced the locks with mutexs in order to accommodate i2c buses that need to sleep. Thus, this patch restores the original behavior as a run time option. Signed-off-by: Richard Cochran <richard.cochran@omicron.at> --- drivers/net/fsl_pq_mdio.c | 4 +- drivers/net/phy/mdio_bus.c | 45 +++++++++++++++++++++++++++++++++++++------ include/linux/phy.h | 15 ++++++++++++- 3 files changed, 53 insertions(+), 11 deletions(-)