Message ID | 1402924639-5164-15-git-send-email-peter@hurleysoftware.com (mailing list archive) |
---|---|
State | Not Applicable |
On Monday 16 June 2014 09:17:11 Peter Hurley wrote:
> tty_wait_until_sent_from_close() drops the tty lock while waiting
> for the tty driver to finish sending previously accepted data (ie.,
> data remaining in its write buffer and transmit fifo).
>
> However, dropping the tty lock is a hold-over from when the tty
> lock was system-wide; ie., one lock for all ttys.
>
> Since commit 89c8d91e31f267703e365593f6bfebb9f6d2ad01,
> 'tty: localise the lock', dropping the tty lock has not been necessary.
>
> CC: Karsten Keil <isdn@linux-pingi.de>
> CC: linuxppc-dev@lists.ozlabs.org
> Signed-off-by: Peter Hurley <peter@hurleysoftware.com>

I don't understand the second half of the changelog; it doesn't seem
to fit here: the deadlock that we are trying to avoid here happens
when the *same* tty needs the lock to complete the function that
sends the pending data. I don't think we still do that any more,
but it doesn't seem related to the tty lock being system-wide or not.

Arnd
From: Arnd Bergmann
> On Monday 16 June 2014 09:17:11 Peter Hurley wrote:
> > tty_wait_until_sent_from_close() drops the tty lock while waiting
> > for the tty driver to finish sending previously accepted data (ie.,
> > data remaining in its write buffer and transmit fifo).
> >
> > However, dropping the tty lock is a hold-over from when the tty
> > lock was system-wide; ie., one lock for all ttys.
> >
> > Since commit 89c8d91e31f267703e365593f6bfebb9f6d2ad01,
> > 'tty: localise the lock', dropping the tty lock has not been necessary.
> >
> > CC: Karsten Keil <isdn@linux-pingi.de>
> > CC: linuxppc-dev@lists.ozlabs.org
> > Signed-off-by: Peter Hurley <peter@hurleysoftware.com>
>
> I don't understand the second half of the changelog, it doesn't seem
> to fit here: there deadlock that we are trying to avoid here happens
> when the *same* tty needs the lock to complete the function that
> sends the pending data. I don't think we do still do that any more,
> but it doesn't seem related to the tty lock being system-wide or not.

While I've not looked at the code in question, my thought was that
holding any lock while waiting for output to drain (or anything else,
really) probably isn't a good idea. You might find that something else
needs the lock, even if only some kind of status request.

David
On 06/17/2014 04:00 AM, Arnd Bergmann wrote:
> On Monday 16 June 2014 09:17:11 Peter Hurley wrote:
>> tty_wait_until_sent_from_close() drops the tty lock while waiting
>> for the tty driver to finish sending previously accepted data (ie.,
>> data remaining in its write buffer and transmit fifo).
>>
>> However, dropping the tty lock is a hold-over from when the tty
>> lock was system-wide; ie., one lock for all ttys.
>>
>> Since commit 89c8d91e31f267703e365593f6bfebb9f6d2ad01,
>> 'tty: localise the lock', dropping the tty lock has not been necessary.
>>
>> CC: Karsten Keil <isdn@linux-pingi.de>
>> CC: linuxppc-dev@lists.ozlabs.org
>> Signed-off-by: Peter Hurley <peter@hurleysoftware.com>
>
> I don't understand the second half of the changelog, it doesn't seem
> to fit here: there deadlock that we are trying to avoid here happens
> when the *same* tty needs the lock to complete the function that
> sends the pending data. I don't think we do still do that any more,
> but it doesn't seem related to the tty lock being system-wide or not.

The tty lock is not used in the i/o path; its purpose is to
mutually exclude state changes in open(), close() and hangup().

The commit that added this [1] comments that _other_ ttys may wait
for this tty to complete, and comments in the code note that this
function should be removed when the system-wide tty mutex was removed
(which happened with the commit noted in the changelog).

Regards,
Peter Hurley

[1]
commit a57a7bf3fc7eff00f07eb9c805774d911a3f2472
Author: Jiri Slaby <jslaby@suse.cz>
Date:   Thu Aug 25 15:12:06 2011 +0200

    TTY: define tty_wait_until_sent_from_close

    We need this helper to fix system stalls. The issue is that the rest
    of the system TTYs wait for us to finish waiting. This wasn't an issue
    with BKL. BKL used to unlock implicitly.

    This is based on the Arnd suggestion.

    Signed-off-by: Jiri Slaby <jslaby@suse.cz>
    Acked-by: Arnd Bergmann <arnd@arndb.de>
    Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
From: Peter Hurley
...
> > I don't understand the second half of the changelog, it doesn't seem
> > to fit here: there deadlock that we are trying to avoid here happens
> > when the *same* tty needs the lock to complete the function that
> > sends the pending data. I don't think we do still do that any more,
> > but it doesn't seem related to the tty lock being system-wide or not.
>
> The tty lock is not used in the i/o path; it's purpose is to
> mutually exclude state changes in open(), close() and hangup().
>
> The commit that added this [1] comments that _other_ ttys may wait
> for this tty to complete, and comments in the code note that this
> function should be removed when the system-wide tty mutex was removed
> (which happened with the commit noted in the changelog).

What happens if another process tries to do a non-blocking open
while you are sleeping in close waiting for output to drain?

Hopefully this returns before that data has drained.

David
On Tuesday 17 June 2014 11:03:50 David Laight wrote:
> From: Peter Hurley
> ...
> > > I don't understand the second half of the changelog, it doesn't seem
> > > to fit here: there deadlock that we are trying to avoid here happens
> > > when the *same* tty needs the lock to complete the function that
> > > sends the pending data. I don't think we do still do that any more,
> > > but it doesn't seem related to the tty lock being system-wide or not.
> >
> > The tty lock is not used in the i/o path; it's purpose is to
> > mutually exclude state changes in open(), close() and hangup().
> >
> > The commit that added this [1] comments that _other_ ttys may wait
> > for this tty to complete, and comments in the code note that this
> > function should be removed when the system-wide tty mutex was removed
> > (which happened with the commit noted in the changelog).
>
> What happens if another process tries to do a non-blocking open
> while you are sleeping in close waiting for output to drain?
>
> Hopefully this returns before that data has drained.

Before the patch, I believe tty_reopen() would return -EIO because
the TTY_CLOSING flag is set. After the patch, tty_open() blocks
on tty_lock() before calling tty_reopen(). AFAICT, this is independent
of O_NONBLOCK.

Arnd
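
The before/after difference described in this reply can be modelled in userspace. What follows is only a sketch under stated assumptions: `struct model_tty`, its `closing` flag and `model_reopen()` are hypothetical stand-ins built on a pthread mutex, not the kernel's tty_reopen() or tty_lock().

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for a tty: one per-tty lock plus a "closing" flag. */
struct model_tty {
	pthread_mutex_t lock;
	bool closing;	/* plays the role of TTY_CLOSING in this model */
};

/*
 * Model of a reopen attempt racing with a close.
 *
 * If the closer dropped the lock before sleeping (the pre-patch case),
 * the lock is free here: the opener gets in at once, sees the closing
 * flag and fails with -EIO.
 *
 * If the closer kept the lock for the whole drain (the post-patch case),
 * the opener sleeps in pthread_mutex_lock() until the close is done,
 * whatever open flags it was given.
 */
int model_reopen(struct model_tty *tty)
{
	int ret = 0;

	assert(tty != NULL);

	pthread_mutex_lock(&tty->lock);
	if (tty->closing)
		ret = -EIO;
	pthread_mutex_unlock(&tty->lock);
	return ret;
}
```

Note that the model takes no open flags at all, which mirrors the observation that the blocking happens independently of O_NONBLOCK.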
On 06/17/2014 07:03 AM, David Laight wrote:
> From: Peter Hurley
> ...
>>> I don't understand the second half of the changelog, it doesn't seem
>>> to fit here: there deadlock that we are trying to avoid here happens
>>> when the *same* tty needs the lock to complete the function that
>>> sends the pending data. I don't think we do still do that any more,
>>> but it doesn't seem related to the tty lock being system-wide or not.
>>
>> The tty lock is not used in the i/o path; it's purpose is to
>> mutually exclude state changes in open(), close() and hangup().
>>
>> The commit that added this [1] comments that _other_ ttys may wait
>> for this tty to complete, and comments in the code note that this
>> function should be removed when the system-wide tty mutex was removed
>> (which happened with the commit noted in the changelog).
>
> What happens if another process tries to do a non-blocking open
> while you are sleeping in close waiting for output to drain?
>
> Hopefully this returns before that data has drained.

Good point.

tty_open() should be trylocking both mutexes anyway in O_NONBLOCK.

Regards,
Peter Hurley
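
The trylock idea can be illustrated with a small userspace analogue. This is a minimal sketch, not kernel code: `struct model_tty` and `model_lock_for_open()` are hypothetical names, a pthread mutex stands in for the tty lock, and returning -EAGAIN on contention is an assumption made for the example (the thread does not settle on an error code here).

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <pthread.h>

/* Hypothetical stand-in for a tty with a per-tty lock. */
struct model_tty {
	pthread_mutex_t lock;
};

/*
 * Take the lock on behalf of an opener.
 *
 * With O_NONBLOCK the opener must never sleep, so it only tries the
 * lock and reports contention to the caller. Without O_NONBLOCK it
 * sleeps until the current holder (for example a close that is still
 * draining output) releases the lock.
 *
 * Returns 0 with the lock held, or a negative errno without it.
 */
int model_lock_for_open(struct model_tty *tty, int open_flags)
{
	assert(tty != NULL);

	if (open_flags & O_NONBLOCK) {
		/* pthread_mutex_trylock() returns EBUSY instead of sleeping. */
		if (pthread_mutex_trylock(&tty->lock) != 0)
			return -EAGAIN;	/* error code is an assumption */
		return 0;
	}

	pthread_mutex_lock(&tty->lock);
	return 0;
}
```

A caller that gets 0 back owns the lock and is expected to release it with pthread_mutex_unlock() once the open has finished.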
> Before the patch, I believe tty_reopen() would return -EIO because
> the TTY_CLOSING flag is set. After the patch, tty_open() blocks
> on tty_lock() before calling tty_reopen(). AFAICT, this is independent
> of O_NONBLOCK.

That would be a bug then. Returning -EIO is fine (if unfriendly). An
O_NONBLOCK open can't block in this case, though, because the port could
take a long time to give up trying to dribble its bits (up to 30 seconds
or so).

Alan
On 06/17/2014 07:32 AM, Peter Hurley wrote:
> On 06/17/2014 07:03 AM, David Laight wrote:
>> From: Peter Hurley
>> ...
>>>> I don't understand the second half of the changelog, it doesn't seem
>>>> to fit here: there deadlock that we are trying to avoid here happens
>>>> when the *same* tty needs the lock to complete the function that
>>>> sends the pending data. I don't think we do still do that any more,
>>>> but it doesn't seem related to the tty lock being system-wide or not.
>>>
>>> The tty lock is not used in the i/o path; it's purpose is to
>>> mutually exclude state changes in open(), close() and hangup().
>>>
>>> The commit that added this [1] comments that _other_ ttys may wait
>>> for this tty to complete, and comments in the code note that this
>>> function should be removed when the system-wide tty mutex was removed
>>> (which happened with the commit noted in the changelog).
>>
>> What happens if another process tries to do a non-blocking open
>> while you are sleeping in close waiting for output to drain?
>>
>> Hopefully this returns before that data has drained.
>
> Good point.
>
> tty_open() should be trylocking both mutexes anyway in O_NONBLOCK.

Further, the tty lock should not be nested within the tty_mutex lock
in a reopen, regardless of O_NONBLOCK.

AFAICT, the tty_mutex in the reopen scenario is only protecting the
tty count bump of the linked tty (if the tty is a pty).

I think with some refactoring and returning with a tty reference held
from both tty_open_current_tty() and tty_driver_lookup_tty(), the tty
lock in tty_open() can be attempted without nesting in the tty_mutex.

Regardless, I'll be splitting this series and I'll be sure to cc
you all when I resubmit these changes (after testing).

Regards,
Peter Hurley
On Mon, Jun 16, 2014 at 09:17:11AM -0400, Peter Hurley wrote:
> tty_wait_until_sent_from_close() drops the tty lock while waiting
> for the tty driver to finish sending previously accepted data (ie.,
> data remaining in its write buffer and transmit fifo).
>
> However, dropping the tty lock is a hold-over from when the tty
> lock was system-wide; ie., one lock for all ttys.
>
> Since commit 89c8d91e31f267703e365593f6bfebb9f6d2ad01,
> 'tty: localise the lock', dropping the tty lock has not been necessary.
>
> CC: Karsten Keil <isdn@linux-pingi.de>
> CC: linuxppc-dev@lists.ozlabs.org
> Signed-off-by: Peter Hurley <peter@hurleysoftware.com>
> ---
>  drivers/isdn/i4l/isdn_tty.c   |  2 +-
>  drivers/tty/hvc/hvc_console.c |  2 +-
>  drivers/tty/hvc/hvcs.c        |  2 +-
>  drivers/tty/tty_port.c        | 11 ++---------
>  include/linux/tty.h           | 18 ------------------
>  5 files changed, 5 insertions(+), 30 deletions(-)

I've applied the first 13 patches in this series, as it looks like you
were going to split things up from here, right?

Can you refresh these and resend when you have that done?

thanks,

greg k-h
On 07/10/2014 07:09 PM, Greg Kroah-Hartman wrote:
> On Mon, Jun 16, 2014 at 09:17:11AM -0400, Peter Hurley wrote:
>> tty_wait_until_sent_from_close() drops the tty lock while waiting
>> for the tty driver to finish sending previously accepted data (ie.,
>> data remaining in its write buffer and transmit fifo).
>>
>> However, dropping the tty lock is a hold-over from when the tty
>> lock was system-wide; ie., one lock for all ttys.
>>
>> Since commit 89c8d91e31f267703e365593f6bfebb9f6d2ad01,
>> 'tty: localise the lock', dropping the tty lock has not been necessary.
>>
>> CC: Karsten Keil <isdn@linux-pingi.de>
>> CC: linuxppc-dev@lists.ozlabs.org
>> Signed-off-by: Peter Hurley <peter@hurleysoftware.com>
>> ---
>>  drivers/isdn/i4l/isdn_tty.c   |  2 +-
>>  drivers/tty/hvc/hvc_console.c |  2 +-
>>  drivers/tty/hvc/hvcs.c        |  2 +-
>>  drivers/tty/tty_port.c        | 11 ++---------
>>  include/linux/tty.h           | 18 ------------------
>>  5 files changed, 5 insertions(+), 30 deletions(-)
>
> I've applied the first 13 patches in this series, as it looks like you
> were going to split things up from here, right?

Yes, thanks for doing that.

> Can you refresh these and resend when you have that done?

Unfortunately, that probably won't be until after the 3.17 merge window,
for 3.18. The tty_open() rework is not trivial and there is an issue with
the ldisc flush removal patch. I'm hoping to include the tty flow control
fixes with that stuff as well.

Regards,
Peter Hurley
On 06/17/2014 07:03 AM, David Laight wrote:
> From: Peter Hurley
> ...
>>> I don't understand the second half of the changelog, it doesn't seem
>>> to fit here: there deadlock that we are trying to avoid here happens
>>> when the *same* tty needs the lock to complete the function that
>>> sends the pending data. I don't think we do still do that any more,
>>> but it doesn't seem related to the tty lock being system-wide or not.
>>
>> The tty lock is not used in the i/o path; it's purpose is to
>> mutually exclude state changes in open(), close() and hangup().
>>
>> The commit that added this [1] comments that _other_ ttys may wait
>> for this tty to complete, and comments in the code note that this
>> function should be removed when the system-wide tty mutex was removed
>> (which happened with the commit noted in the changelog).

I just wanted to revisit this discussion briefly so I can clarify
the situation regarding holding the tty lock while closing, and how
that affects parallel opens.

I've unnested the tty lock from the tty mutex (which I'm still testing)
but will be submitting after the merge window re-opens for 3.19. So this
is more relevant now.

The original patch that led to this thread is here:
https://lkml.org/lkml/2014/6/16/306

> What happens if another process tries to do a non-blocking open
> while you are sleeping in close waiting for output to drain?
>
> Hopefully this returns before that data has drained.

Current mainline blocks on _any_ racing re-open while this lock is
dropped in tty_wait_until_sent_from_close(); blocking while ASYNC_CLOSING
has been in mainline since at least 2.6.29 and that just merged existing
code together.

See tty_port_block_til_ready(); note the test for O_NONBLOCK is after
the wait while ASYNC_CLOSING.

IOW, currently a non-blocking open will sleep for the _entire_ duration
of a parallel hardware shutdown, and when it wakes, the error return will
cause a release of its tty, and it will restart with a fresh attempt to
open.

Same with a blocking open that is already waiting; when it's woken, the
hardware shutdown has already completed so ASYNC_INITIALIZED is cleared,
which forces a release and restart too.

The point being that holding the tty lock across the _entire_ close
is equivalent to the current outcome, regardless of O_NONBLOCK.

I'm reluctant to start returning EAGAIN for non-blocking tty opens
because no tty driver does that now, and I don't think userspace will
deal well with new return codes from tty opens.

Regards,
Peter Hurley
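
The ordering point made in this message (the wait for a closing port comes before the O_NONBLOCK test) can be reduced to a small decision function. This is a simplified model for illustration only: the enum, its values and `model_block_til_ready()` are hypothetical and do not reproduce the body of tty_port_block_til_ready().

```c
#include <fcntl.h>
#include <stdbool.h>

/* Hypothetical outcomes of the model; not kernel return values. */
enum model_open_step {
	MODEL_SLEEP_UNTIL_CLOSED,	/* wait out the close, then fail and restart */
	MODEL_RETURN_AT_ONCE,		/* non-blocking open proceeds without waiting */
	MODEL_MAY_BLOCK			/* blocking open goes on to its normal wait */
};

/*
 * Decide what an opener does, checking conditions in the order the
 * message describes: first "is the port closing?", only then
 * "was O_NONBLOCK requested?".
 */
enum model_open_step model_block_til_ready(bool port_closing, int open_flags)
{
	/* Checked first, so O_NONBLOCK is never consulted on this path. */
	if (port_closing)
		return MODEL_SLEEP_UNTIL_CLOSED;

	if (open_flags & O_NONBLOCK)
		return MODEL_RETURN_AT_ONCE;

	return MODEL_MAY_BLOCK;
}
```

With this ordering, `model_block_til_ready(true, O_NONBLOCK)` and `model_block_til_ready(true, 0)` give the same answer, which is the sense in which the outcome is the same regardless of O_NONBLOCK.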
> The point being that holding the tty lock across the _entire_ close
> is equivalent to the current outcome, regardless of O_NONBLOCK.
>
> I'm reluctant to start returning EGAIN for non-blocking tty opens
> because no tty driver does that now, and I don't think userspace will
> deal well with new return codes from tty opens.

I do not know whether the non-blocking case matters. The blocking open
does need to wait; when I broke that case before, I broke the console
login programs (mingetty).

Returning EAGAIN would also only work if poll/select did the right thing.
Currently Linux can't support a System5 style ttymon process because of
this limitation, which means, for example, that systemd can't implement a
single thread to manage all console prompts/setup.

Alan
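
One possible reading of the poll/select remark: a failed open() yields no file descriptor, so a caller that saw EAGAIN would have nothing to pass to poll() or select() and would be left retrying on a timer. The sketch below shows that fallback from the userspace side. It is an assumption-laden illustration: `open_retry_nonblock()` is a hypothetical helper, the 100 ms pause is arbitrary, and, as noted in the thread, tty drivers do not return EAGAIN from open today.

```c
#include <errno.h>
#include <fcntl.h>
#include <time.h>

/*
 * Hypothetical helper: open a device without blocking, retrying a
 * bounded number of times if the open reports EAGAIN.
 *
 * Returns a file descriptor on success, or a negative errno.
 */
int open_retry_nonblock(const char *path, int max_tries)
{
	/* Arbitrary pause between attempts: 100 ms. */
	const struct timespec pause = { .tv_sec = 0, .tv_nsec = 100 * 1000 * 1000 };

	for (int i = 0; i < max_tries; i++) {
		int fd = open(path, O_RDWR | O_NOCTTY | O_NONBLOCK);

		if (fd >= 0)
			return fd;
		if (errno != EAGAIN)
			return -errno;

		/* No descriptor exists yet, so there is nothing to poll on. */
		nanosleep(&pause, NULL);
	}
	return -EAGAIN;
}
```

A single-threaded manager of many consoles would have to run one such timer per pending device, which is the kind of busy retrying that readiness notification is meant to avoid.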
diff --git a/drivers/isdn/i4l/isdn_tty.c b/drivers/isdn/i4l/isdn_tty.c
index 3c5f249..732f68a 100644
--- a/drivers/isdn/i4l/isdn_tty.c
+++ b/drivers/isdn/i4l/isdn_tty.c
@@ -1587,7 +1587,7 @@ isdn_tty_close(struct tty_struct *tty, struct file *filp)
 	 * line status register.
 	 */
 	if (port->flags & ASYNC_INITIALIZED) {
-		tty_wait_until_sent_from_close(tty, 3000);	/* 30 seconds timeout */
+		tty_wait_until_sent(tty, 3000);	/* 30 seconds timeout */
 		/*
 		 * Before we drop DTR, make sure the UART transmitter
 		 * has completely drained; this is especially
diff --git a/drivers/tty/hvc/hvc_console.c b/drivers/tty/hvc/hvc_console.c
index 0ff7fda..2297dc7 100644
--- a/drivers/tty/hvc/hvc_console.c
+++ b/drivers/tty/hvc/hvc_console.c
@@ -417,7 +417,7 @@ static void hvc_close(struct tty_struct *tty, struct file * filp)
 		 * there is no buffered data otherwise sleeps on a wait queue
 		 * waking periodically to check chars_in_buffer().
 		 */
-		tty_wait_until_sent_from_close(tty, HVC_CLOSE_WAIT);
+		tty_wait_until_sent(tty, HVC_CLOSE_WAIT);
 	} else {
 		if (hp->port.count < 0)
 			printk(KERN_ERR "hvc_close %X: oops, count is %d\n",
diff --git a/drivers/tty/hvc/hvcs.c b/drivers/tty/hvc/hvcs.c
index 81e939e..236302d 100644
--- a/drivers/tty/hvc/hvcs.c
+++ b/drivers/tty/hvc/hvcs.c
@@ -1230,7 +1230,7 @@ static void hvcs_close(struct tty_struct *tty, struct file *filp)
 		irq = hvcsd->vdev->irq;
 		spin_unlock_irqrestore(&hvcsd->lock, flags);
 
-		tty_wait_until_sent_from_close(tty, HVCS_CLOSE_WAIT);
+		tty_wait_until_sent(tty, HVCS_CLOSE_WAIT);
 
 		/*
 		 * This line is important because it tells hvcs_open that this
diff --git a/drivers/tty/tty_port.c b/drivers/tty/tty_port.c
index 1b93357..6b6214b 100644
--- a/drivers/tty/tty_port.c
+++ b/drivers/tty/tty_port.c
@@ -464,10 +464,7 @@ static void tty_port_drain_delay(struct tty_port *port, struct tty_struct *tty)
 	schedule_timeout_interruptible(timeout);
 }
 
-/* Caller holds tty lock.
- * NB: may drop and reacquire tty lock (in tty_wait_until_sent_from_close())
- * so tty and tty port may have changed state (but not hung up or reopened).
- */
+/* Caller holds tty lock. */
 int tty_port_close_start(struct tty_port *port,
 				struct tty_struct *tty, struct file *filp)
 {
@@ -505,7 +502,7 @@ int tty_port_close_start(struct tty_port *port,
 		if (tty->flow_stopped)
 			tty_driver_flush_buffer(tty);
 		if (port->closing_wait != ASYNC_CLOSING_WAIT_NONE)
-			tty_wait_until_sent_from_close(tty, port->closing_wait);
+			tty_wait_until_sent(tty, port->closing_wait);
 		if (port->drain_delay)
 			tty_port_drain_delay(port, tty);
 	}
@@ -545,10 +542,6 @@ EXPORT_SYMBOL(tty_port_close_end);
  * tty_port_close
  *
  * Caller holds tty lock
- *
- * NB: may drop and reacquire tty lock (in tty_port_close_start()->
- * tty_wait_until_sent_from_close()) so tty and tty_port may have changed
- * state (but not hung up or reopened).
  */
 void tty_port_close(struct tty_port *port, struct tty_struct *tty,
 							struct file *filp)
diff --git a/include/linux/tty.h b/include/linux/tty.h
index 1c3316a..f3eb70d 100644
--- a/include/linux/tty.h
+++ b/include/linux/tty.h
@@ -644,24 +644,6 @@ extern void __lockfunc tty_unlock_pair(struct tty_struct *tty,
 					struct tty_struct *tty2);
 
 /*
- * this shall be called only from where BTM is held (like close)
- *
- * We need this to ensure nobody waits for us to finish while we are waiting.
- * Without this we were encountering system stalls.
- *
- * This should be indeed removed with BTM removal later.
- *
- * Locking: BTM required. Nobody is allowed to hold port->mutex.
- */
-static inline void tty_wait_until_sent_from_close(struct tty_struct *tty,
-						long timeout)
-{
-	tty_unlock(tty); /* tty->ops->close holds the BTM, drop it while waiting */
-	tty_wait_until_sent(tty, timeout);
-	tty_lock(tty);
-}
-
-/*
  * wait_event_interruptible_tty -- wait for a condition with the tty lock held
  *
  * The condition we are waiting for might take a long time to
tty_wait_until_sent_from_close() drops the tty lock while waiting
for the tty driver to finish sending previously accepted data (ie.,
data remaining in its write buffer and transmit fifo).

However, dropping the tty lock is a hold-over from when the tty
lock was system-wide; ie., one lock for all ttys.

Since commit 89c8d91e31f267703e365593f6bfebb9f6d2ad01,
'tty: localise the lock', dropping the tty lock has not been necessary.

CC: Karsten Keil <isdn@linux-pingi.de>
CC: linuxppc-dev@lists.ozlabs.org
Signed-off-by: Peter Hurley <peter@hurleysoftware.com>
---
 drivers/isdn/i4l/isdn_tty.c   |  2 +-
 drivers/tty/hvc/hvc_console.c |  2 +-
 drivers/tty/hvc/hvcs.c        |  2 +-
 drivers/tty/tty_port.c        | 11 ++---------
 include/linux/tty.h           | 18 ------------------
 5 files changed, 5 insertions(+), 30 deletions(-)