Message ID | 87ha7cwiry.fsf@xmission.com |
---|---|
State | RFC, archived |
Delegated to: | David Miller |
From: ebiederm@xmission.com (Eric W. Biederman)
Date: Wed, 05 Mar 2014 11:24:33 -0800

> Now that I have looked closer the printk generating a printk problem
> seems to be something that is best solved at the printk level.

I'm not so sure that disallowing printk recursion is necessary.

If you consider an error printk emitted from a device driver's
transmit function during netconsole output, netpoll handles this
transparently already.

Basically, what happens right now in this situation is that netpoll
queues it up when recursion is detected, and delayed work is scheduled
to process such pending packets.

The only issue at hand is the IRQ context bit.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
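The queue-on-recursion behaviour described above can be sketched in userspace C. This is a toy model only, not the netpoll code: `struct model_poll`, `model_send`, `model_driver_xmit`, and `model_delayed_work` are hypothetical names invented for illustration, and a plain flag stands in for whatever recursion detection netpoll actually uses.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define QUEUE_MAX 16

/* Hypothetical userspace model; none of these names are kernel APIs. */
struct model_poll {
	bool in_xmit;                   /* set while the transmit path runs */
	const char *pending[QUEUE_MAX]; /* packets deferred due to recursion */
	size_t npending;
	size_t nsent;                   /* packets that reached the "wire" */
};

static void model_send(struct model_poll *mp, const char *pkt, bool log_error);

/* Stand-in for a driver transmit routine that may itself log an error. */
static void model_driver_xmit(struct model_poll *mp, const char *pkt,
			      bool log_error)
{
	(void)pkt;
	mp->nsent++;
	if (log_error)
		model_send(mp, "error from xmit", false); /* re-enters send path */
}

static void model_send(struct model_poll *mp, const char *pkt, bool log_error)
{
	if (mp->in_xmit) {
		/* Recursion detected: queue the packet instead of sending it. */
		assert(mp->npending < QUEUE_MAX);
		mp->pending[mp->npending++] = pkt;
		return;
	}
	mp->in_xmit = true;
	model_driver_xmit(mp, pkt, log_error);
	mp->in_xmit = false;
}

/* Stand-in for the delayed work that drains the queue later. */
static void model_delayed_work(struct model_poll *mp)
{
	for (size_t i = 0; i < mp->npending; i++) {
		mp->in_xmit = true;
		model_driver_xmit(mp, mp->pending[i], false);
		mp->in_xmit = false;
	}
	mp->npending = 0;
}
```

Calling `model_send(&mp, "hello", true)` on a zeroed `struct model_poll` sends one packet and leaves the nested error message queued; a later `model_delayed_work(&mp)` sends the queued one.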
David Miller <davem@davemloft.net> writes:

> From: ebiederm@xmission.com (Eric W. Biederman)
> Date: Wed, 05 Mar 2014 11:24:33 -0800
>
>> Now that I have looked closer the printk generating a printk problem
>> seems to be something that is best solved at the printk level.
>
> I'm not so sure that disallowing printk recursion is necessary.
>
> If you consider an error printk emitted from a device driver's
> transmit function during netconsole output, netpoll handles this
> transparently already.
>
> Basically, what happens right now in this situation is that netpoll
> queues it up when recursion is detected, and delayed work is scheduled
> to process such pending packets.

Except that printk does not recurse into netpoll again. printk adds the
message to printk's ring buffer, and then the next time through the
loop, console_unlock writes that message out.

I have had warnings from printk kill a couple of machines, which is
largely why I am anxious to fix netpoll. Further, I have experimentally
verified that I can still kill a machine that way in 3.14-rcX.

> The only issue at hand is the IRQ context bit.

That is the only issue that is a networking stack issue, and I am happy
to focus there. If we don't get printks generating warnings, the machine
won't lock up.

I am slowly working my way through reading the code and verifying I
really understand what is going on, so I can reasonably say the routines
in the appropriate drivers should be safe in hard irq context.

Hopefully I will have patches in the next couple of days.

Eric
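A toy userspace model may make the ring-buffer point concrete. It is a sketch under stated assumptions, not the printk implementation: `struct model_log`, `model_printk`, `model_console_write`, and `model_console_unlock` are hypothetical names, the buffer is a flat array, not a real ring, and the `limit` argument exists only so the demonstration terminates.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define LOG_MAX 64

/* Hypothetical model of a log buffer drained by a console loop. */
struct model_log {
	const char *msgs[LOG_MAX];
	size_t next;        /* slot where the next message is stored */
	size_t written;     /* next message to hand to the console */
	bool console_warns; /* does the console path warn on every write? */
};

/* Append to the buffer; never calls into the console directly. */
static void model_printk(struct model_log *log, const char *msg)
{
	if (log->next < LOG_MAX)
		log->msgs[log->next++] = msg;
}

/* Stand-in for a console write whose driver path may emit a warning. */
static void model_console_write(struct model_log *log, const char *msg)
{
	(void)msg;
	if (log->console_warns)
		model_printk(log, "warning from console path");
}

/*
 * Drain loop: each pass writes one message. A warning generated during
 * the write lands in the buffer and is picked up on the next pass, so a
 * console that always warns never lets the loop finish on its own.
 * Returns the number of messages written, capped at limit.
 */
static size_t model_console_unlock(struct model_log *log, size_t limit)
{
	size_t n = 0;

	while (log->written < log->next && n < limit) {
		model_console_write(log, log->msgs[log->written++]);
		n++;
	}
	return n;
}
```

With `console_warns` false, one `model_printk` followed by `model_console_unlock` writes a single message and returns. With it true, the loop runs until the artificial limit is hit and still has a message outstanding.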
diff --git a/drivers/net/netconsole.c b/drivers/net/netconsole.c
index ba2f5e710af1..aaa9062061c8 100644
--- a/drivers/net/netconsole.c
+++ b/drivers/net/netconsole.c
@@ -734,6 +734,7 @@ static void write_msg(struct console *con, const char *msg, unsigned int len)
 	unsigned long flags;
 	struct netconsole_target *nt;
 	const char *tmp;
+	bool hard_irq;
 
 	if (oops_only && !oops_in_progress)
 		return;
@@ -742,6 +743,9 @@ static void write_msg(struct console *con, const char *msg, unsigned int len)
 		return;
 
 	spin_lock_irqsave(&target_list_lock, flags);
+	hard_irq = in_irq();
+	if (!hard_irq)
+		irq_enter();
 	list_for_each_entry(nt, &target_list, list) {
 		netconsole_target_get(nt);
 		if (nt->enabled && netif_running(nt->np.dev)) {
@@ -761,6 +765,8 @@ static void write_msg(struct console *con, const char *msg, unsigned int len)
 		}
 		netconsole_target_put(nt);
 	}
+	if (!hard_irq)
+		irq_exit();
 	spin_unlock_irqrestore(&target_list_lock, flags);
 }
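The guard pattern in the patch (enter hard irq context only if not already in it, and undo it symmetrically) can be modelled with a simple depth counter. This is a userspace sketch only: `model_in_irq`, `model_irq_enter`, `model_irq_exit`, and `model_write_msg` are hypothetical stand-ins, and the kernel's `irq_enter()`/`irq_exit()` do considerably more than adjust a counter.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the hard irq nesting count. */
static unsigned int model_hardirq_depth;

static bool model_in_irq(void)
{
	return model_hardirq_depth > 0;
}

static void model_irq_enter(void)
{
	model_hardirq_depth++;
}

static void model_irq_exit(void)
{
	assert(model_hardirq_depth > 0);
	model_hardirq_depth--;
}

/*
 * Mirrors the shape of the patched write_msg(): remember whether we
 * were already in hard irq context, enter it if not, do the work, and
 * leave only if we were the one who entered. Returns whether the work
 * section observed hard irq context.
 */
static bool model_write_msg(void)
{
	bool hard_irq = model_in_irq();
	bool seen;

	if (!hard_irq)
		model_irq_enter();

	seen = model_in_irq(); /* the send loop would run here */

	if (!hard_irq)
		model_irq_exit();
	return seen;
}
```

In this model the work section always sees hard irq context, and the caller's context is unchanged on return, whether or not the call started inside a (modelled) interrupt.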