
patch "tty: serial: OMAP: ensure FIFO levels are set correctly in non-DMA" added to tty tree

Message ID alpine.DEB.2.00.1201261225520.32667@utopia.booyaka.com
State New

Pull-request

git://git.pwsan.com/linux-2.6 omap_serial_fixes_3.3rc

Message

Paul Walmsley Jan. 26, 2012, 7:34 p.m. UTC
On Thu, 26 Jan 2012, Greg KH wrote:

> Ok, I've just reverted both of these patches for now, please send new
> ones when you have them.

Thanks.  A pull request is at the bottom of this message.  The patches 
are also available from the mailing list archives here:

http://marc.info/?l=linux-arm-kernel&m=132754676014389&w=2
http://marc.info/?l=linux-arm-kernel&m=132754677914395&w=2
http://marc.info/?l=linux-arm-kernel&m=132754676014388&w=2


- Paul

The following changes since commit dcd6c92267155e70a94b3927bce681ce74b80d1f:

  Linux 3.3-rc1 (2012-01-19 15:04:48 -0800)

are available in the git repository at:
  git://git.pwsan.com/linux-2.6 omap_serial_fixes_3.3rc

Paul Walmsley (3):
      tty: serial: OMAP: use a 1-byte RX FIFO threshold in PIO mode
      tty: serial: OMAP: block idle while the UART is transferring data in PIO mode
      tty: serial: OMAP: wakeup latency constraint is in microseconds, not milliseconds

 arch/arm/mach-omap2/serial.c     |    8 ++++----
 drivers/tty/serial/omap-serial.c |   30 +++++++++++++++++++++++++-----
 2 files changed, 29 insertions(+), 9 deletions(-)

Comments

Paul Walmsley Feb. 2, 2012, 8:03 p.m. UTC | #1
Hi Greg,

On Thu, 26 Jan 2012, Paul Walmsley wrote:

> On Thu, 26 Jan 2012, Greg KH wrote:
> 
> > Ok, I've just reverted both of these patches for now, please send new
> > ones when you have them.
> 
> Thanks.  A pull request is at the bottom of this message.  The patches 
> are also available from the mailing list archives here:
> 
> http://marc.info/?l=linux-arm-kernel&m=132754676014389&w=2
> http://marc.info/?l=linux-arm-kernel&m=132754677914395&w=2
> http://marc.info/?l=linux-arm-kernel&m=132754676014388&w=2

Any comments on these? 


- Paul

> - Paul
> 
> The following changes since commit dcd6c92267155e70a94b3927bce681ce74b80d1f:
> 
>   Linux 3.3-rc1 (2012-01-19 15:04:48 -0800)
> 
> are available in the git repository at:
>   git://git.pwsan.com/linux-2.6 omap_serial_fixes_3.3rc
> 
> Paul Walmsley (3):
>       tty: serial: OMAP: use a 1-byte RX FIFO threshold in PIO mode
>       tty: serial: OMAP: block idle while the UART is transferring data in PIO mode
>       tty: serial: OMAP: wakeup latency constraint is in microseconds, not milliseconds
> 
>  arch/arm/mach-omap2/serial.c     |    8 ++++----
>  drivers/tty/serial/omap-serial.c |   30 +++++++++++++++++++++++++-----
>  2 files changed, 29 insertions(+), 9 deletions(-)
> 


- Paul
Greg KH Feb. 2, 2012, 8:22 p.m. UTC | #2
On Thu, Feb 02, 2012 at 01:03:01PM -0700, Paul Walmsley wrote:
> Hi Greg,
> 
> On Thu, 26 Jan 2012, Paul Walmsley wrote:
> 
> > On Thu, 26 Jan 2012, Greg KH wrote:
> > 
> > > Ok, I've just reverted both of these patches for now, please send new
> > > ones when you have them.
> > 
> > Thanks.  A pull request is at the bottom of this message.  The patches 
> > are also available from the mailing list archives here:
> > 
> > http://marc.info/?l=linux-arm-kernel&m=132754676014389&w=2
> > http://marc.info/?l=linux-arm-kernel&m=132754677914395&w=2
> > http://marc.info/?l=linux-arm-kernel&m=132754676014388&w=2
> 
> Any comments on these? 

They are in my to-apply queue, I'm getting to them...

greg k-h
Greg Kroah-Hartman Feb. 2, 2012, 9:02 p.m. UTC | #3
On Thu, Jan 26, 2012 at 12:34:50PM -0700, Paul Walmsley wrote:
> On Thu, 26 Jan 2012, Greg KH wrote:
> 
> > Ok, I've just reverted both of these patches for now, please send new
> > ones when you have them.
> 
> Thanks.  A pull request is at the bottom of this message.  The patches 
> are also available from the mailing list archives here:
> 
> http://marc.info/?l=linux-arm-kernel&m=132754676014389&w=2
> http://marc.info/?l=linux-arm-kernel&m=132754677914395&w=2
> http://marc.info/?l=linux-arm-kernel&m=132754676014388&w=2

Now applied.

greg k-h
Paul Walmsley Feb. 3, 2012, 5:45 a.m. UTC | #4
Hello Neil.

On Fri, 3 Feb 2012, NeilBrown wrote:

> Can I comment??...  They are good but ....
> 
> I've tried two approaches to getting serial behaviour that I am happy with.
> They are with and without runtime pm.
> 
> If I enable runtime pm by setting power/autosuspend_delay_ms (e.g. to 5000)

Runtime PM should be enabled even with power/autosuspend_delay_ms = 0.  
I think you are simply enabling the autosuspend timer.  There was a 
significant behavior change here from 3.2, I believe.
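
For reference, a minimal driver-side sketch of how this autosuspend machinery
is normally wired up; the calls are the standard runtime PM API, 'pdev' stands
for the driver's platform device and the 5000 ms value is only an example:

	/* Sketch only: typical runtime PM autosuspend setup in a probe path.
	 * The delay is illustrative; user space can later change it through
	 * the power/autosuspend_delay_ms attribute discussed here. */
	pm_runtime_use_autosuspend(&pdev->dev);
	pm_runtime_set_autosuspend_delay(&pdev->dev, 5000);
	pm_runtime_enable(&pdev->dev);

With that in place, writing -1 to autosuspend_delay_ms bumps the usage count
and effectively blocks runtime suspend, as discussed below.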

> then CPUIDLE enters lower states and I think it uses less power but I
> sometimes lose the first char I type (that is known) but also I sometimes get
> corruption on output.

I don't see any output corruption on 35xx BeagleBoard, either with or 
without these patches.  So this is presumably a 37xx-centric problem, and 
unrelated to these patches, I would guess.

> The problem seems to be that we turn off the clocks when the kernel's ring
> buffer is empty rather than waiting for the UART's fifo to be empty.

pm_runtime_put*() calls will write to the CM_*CLKEN registers if the 
usecount goes to zero.  But the PRCM should not actually turn off the 
clocks if the UART isn't idle.  So I would suggest that you first try some 
hwmod CLOCKACTIVITY changes to fix this (described briefly below).

> I can remove this effect with:
> 
> diff --git a/drivers/tty/serial/omap-serial.c b/drivers/tty/serial/omap-serial.c
> index f809041..c7ef760 100644
> --- a/drivers/tty/serial/omap-serial.c
> +++ b/drivers/tty/serial/omap-serial.c
> @@ -440,7 +440,8 @@ static unsigned int serial_omap_tx_empty(struct uart_port *port)
>  	spin_lock_irqsave(&up->port.lock, flags);
>  	ret = serial_in(up, UART_LSR) & UART_LSR_TEMT ? TIOCSER_TEMT : 0;
>  	spin_unlock_irqrestore(&up->port.lock, flags);
> -	pm_runtime_put(&up->pdev->dev);
> +	pm_runtime_mark_last_busy(&up->pdev->dev);
> +	pm_runtime_put_autosuspend(&up->pdev->dev);
>  	return ret;
>  }

Well this change probably makes sense anyway, just to keep the autosuspend 
behavior consistent when an autosuspend timeout is set.  But the effect of 
this patch may be a little different than what you think; see below.

> i.e. when the tx buffer is empty, so turn the clocks off immediately, 
> but wait a while for the fifo to empty.  Hopefully the auto-suspend time 
> is enough.

Hmm, this statement is a little unclear to me.  The clocks won't be turned 
off immediately - the request to disable the clocks only happens when the 
autosuspend timer expires.  And even then, as mentioned above, it's just a 
request.  The hardware should not actually disable the functional clock 
until the UART FIFO is drained...

Anyway, consider trying a different CLOCKACTIVITY setting for the UARTs.  
There are fields and flags for this in the hwmod code; see for example the 
I2C SYSCONFIG settings in mach-omap2/omap_hwmod_3xxx_data.c.  It's 
possible that the UART is currently assuming that its functional clock 
will stay on when it idle-acks.  That might cause the corruption that you 
are seeing.
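
For illustration only, a rough sketch of what adding CLOCKACTIVITY to the UART
hwmod class could look like, modelled on that I2C sysconfig; the flag and field
names are the existing hwmod ones, but whether the UART SYSC register actually
implements these bits still has to be checked against the TRM:

	/* Hypothetical, untested sketch based on the OMAP3 I2C sysconfig in
	 * mach-omap2/omap_hwmod_3xxx_data.c; the register offsets are the
	 * ones already used for the OMAP3 UART hwmods. */
	static struct omap_hwmod_class_sysconfig omap3xxx_uart_sysc = {
		.rev_offs	= 0x50,
		.sysc_offs	= 0x54,
		.syss_offs	= 0x58,
		.sysc_flags	= (SYSC_HAS_SIDLEMODE | SYSC_HAS_ENAWAKEUP |
				   SYSC_HAS_SOFTRESET | SYSC_HAS_AUTOIDLE |
				   SYSC_HAS_CLOCKACTIVITY),	/* the new bit */
		.idlemodes	= (SIDLE_FORCE | SIDLE_NO | SIDLE_SMART),
		.clockact	= CLOCKACT_TEST_ICLK,	/* value the I2C hwmods use */
		.sysc_fields	= &omap_hwmod_sysc_type1,
	};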

> But I decided not to pursue this further as turning off the clocks seems 
> like the wrong thing to be doing.

The clocks should be getting disabled when the autosuspend timer is 0 
also.  It's just that the UART driver will request to disable the clocks 
immediately, rather than waiting.

> The OMAP UARTs have auto-suspend/auto-wake so I would rather depend on 
> that.  And turning off the clocks loses that first character at 115200 
> and above (not below).

If you make a change that causes the kernel to keep the UART clocks on, 
the enclosing clockdomain and any associated clockdomains (probably 
CORE_L3, CORE_L4 & PER) will be prevented from going to sleep.  So just be 
aware that the strategy you are considering will incur an energy 
consumption cost.  The lowest C-state you should manage to reach should be 
C4, at least in mainline terms.

Incidentally, it's unclear to me how you are keeping the clocks on?  Are 
you disabling runtime PM for the driver in some way?

> So I explored leaving the runtime_pm setting as it was.  This set a maximum
> latency of 4444 usec (which should be 5555 as there are 10 bits per char) 
> so it didn't get very low down the CPUIDLE levels.
>
> So I disabled the latency setting when hardware handshaking is used (as you
> suggested once):

[snip]

> and now CPUIDLE uses lower states (especially if I enable_off_mode).

That's good.  One important note is that the CPUIdle statistics don't keep 
track of what C-state was actually entered -- it simply tracks what 
C-state that CPUIdle intended to enter.  So you'll want to check 
/debug/pm_debug/count to be sure.  Also, the PM debug counts are 
approximate.

Incidentally, I have some patches to fix the latency calculation here that 
are in the works, similar to the ones you describe.  The current 
calculation in the driver is pretty broken, but since the changes to fix 
the calculation are rather involved, Kevin and I thought it would be best 
to defer most of them to the v3.4 merge window.  The calculation fix in 
the 3.3-rc series is simply intended to deal with a very basic power 
management regression, as you know - not to make it ideal, which is more 
complicated.  Anyway, the patches are at git://git.pwsan.com/linux-2.6 in 
the branch 'omap_serial_fix_pm_qos_formula_devel_3.4'.  Keep in mind that 
at least one patch in that branch is broken, but perhaps you might get an 
idea of where they are trying to go.  Comments welcome.

> However I am using a lot more power than in 3.2.  

Is this without disabling the UART clocks?  If the driver is keeping the 
UART clocks enabled, then increased power consumption is definitely 
expected.

> That probably isn't all UART-related, but there is one interesting 
> observation.
> 
> If I watch /sys/kernel/debug/pm_debug/time and see where the time is spent
> over a 30second period when the systems is mostly idle:
> 
> With runtime_pm disabled for UARTs, both PER and CORE are permanently ON,
> and MPU is OFF nearly all the time. (This is with off_mode enabled).

How are you disabling runtime PM?  Is this just with the autosuspend 
timeout set to 0?

> If I then enabled runtime_pm with a 5 second autosuspend delay:
>   CORE is still permanently ON (I think the USB might be doing that).
>   PER  is OFF (7 seconds) RET (15 seconds) and ON (8 seconds).
> but more surprising
>   MPU  is OFF (8 sec) RET (8 sec) INA (9 sec) ON (4 secs).
> 
> So we get to turn PER off at the cost of turning the MPU on a lot.
> (the periods in each state are only repeatable to a precision of 20%-50%).
> 
> I understand that PER won't go OFF without runtime PM, but I would expect
> it to at least go to INA as the UART is in smart-idle mode

You are using UART3?  That is in PER and its functional clock comes via 
the PER clockdomain.  If the UART functional clock is prevented from being 
turned off, how can PER go inactive when the UART3 functional clock is 
still enabled?

> The net result is that with runtime_pm enabled I'm seeing an extra 7mA (at
> 4V). That might not be very precise, but it is in the opposite direction to
> my expectations.
> 
> So there is something odd here...  ideas?
> 
> NeilBrown
> 
> p.s. who should I formally submit OMAP-UART patches to?  I have a couple of
> others such as the below that I should submit somewhere.

Would suggest sending them to linux-serial@vger.kernel.org, 
linux-omap@vger.kernel.org, linux-arm-kernel@lists.infradead.org, and Alan 
Cox since he is the serial maintainer?  And you should probably also cc 
me, since I seem to be stuck with fixing this driver so I can test my 
other patches; and cc Govindraj as well.


- Paul
Govindraj Feb. 3, 2012, 6:56 a.m. UTC | #5
On Fri, Feb 3, 2012 at 9:37 AM, NeilBrown <neilb@suse.de> wrote:
> On Thu, 2 Feb 2012 13:03:01 -0700 (MST) Paul Walmsley <paul@pwsan.com> wrote:
>
>> Hi Greg,
>>
>> On Thu, 26 Jan 2012, Paul Walmsley wrote:
>>
>> > On Thu, 26 Jan 2012, Greg KH wrote:
>> >
>> > > Ok, I've just reverted both of these patches for now, please send new
>> > > ones when you have them.
>> >
>> > Thanks.  A pull request is at the bottom of this message.  The patches
>> > are also available from the mailing list archives here:
>> >
>> > http://marc.info/?l=linux-arm-kernel&m=132754676014389&w=2
>> > http://marc.info/?l=linux-arm-kernel&m=132754677914395&w=2
>> > http://marc.info/?l=linux-arm-kernel&m=132754676014388&w=2
>>
>> Any comments on these?
>
> Can I comment??...  They are good but ....
>
> I've tried two approaches to getting serial behaviour that I am happy with.
> They are with and without runtime pm.
>
> If I enable runtime pm by setting power/autosuspend_delay_ms (e.g. to 5000)
> then CPUIDLE enters lower states and I think it uses less power but I
> sometimes lose the first char I type (that is known) but also I sometimes get
> corruption on output.
>
> The problem seems to be that we turn off the clocks when the kernel's ring
> buffer is empty rather than waiting for the UART's fifo to be empty.
> So I get corruption near the end of a stream of output ... not right at the
> end so something must be turning the clocks back on somehow.
>
> I can remove this effect with:
>
> diff --git a/drivers/tty/serial/omap-serial.c b/drivers/tty/serial/omap-serial.c
> index f809041..c7ef760 100644
> --- a/drivers/tty/serial/omap-serial.c
> +++ b/drivers/tty/serial/omap-serial.c
> @@ -440,7 +440,8 @@ static unsigned int serial_omap_tx_empty(struct uart_port *port)
>        spin_lock_irqsave(&up->port.lock, flags);
>        ret = serial_in(up, UART_LSR) & UART_LSR_TEMT ? TIOCSER_TEMT : 0;
>        spin_unlock_irqrestore(&up->port.lock, flags);
> -       pm_runtime_put(&up->pdev->dev);
> +       pm_runtime_mark_last_busy(&up->pdev->dev);
> +       pm_runtime_put_autosuspend(&up->pdev->dev);
>        return ret;
>  }
>

But looking into it UART_LSR_TEMT("include/linux/serial_reg.h") => 0x40

from omap trm:

TX_SR_E => bit 6
"Read 0x1: Transmitter hold (TX FIFO) and shift registers are empty."

So it means all data from tx fifo has shifted out and no pending data in
tx fifo.

But if we had used UART_LSR_THRE (0x20) here for the tx fifo emptiness check,
then from the omap trm:

TX_FIFO_E => bit 5

"Read 0x1: Transmit hold register (TX FIFO) is empty.
The transmission is not necessarily complete"

So by the time the serial_omap_tx_empty check reports empty, all data has
been shifted out of the tx fifo.
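
In other words, as a tiny illustrative sketch (the helper names below are made
up; the bit definitions are the real ones from include/linux/serial_reg.h):

	#include <linux/serial_reg.h>

	/* UART_LSR_THRE (bit 5, 0x20): TX hold register/FIFO is empty, but the
	 * shift register may still be clocking out the last character.
	 * UART_LSR_TEMT (bit 6, 0x40): FIFO and shift register are both empty,
	 * i.e. the transmission is really complete. */
	static bool tx_fifo_empty(unsigned int lsr)	/* hypothetical helper */
	{
		return lsr & UART_LSR_THRE;
	}

	static bool tx_really_done(unsigned int lsr)	/* what tx_empty tests */
	{
		return lsr & UART_LSR_TEMT;
	}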

>
> i.e. when the tx buffer is empty, so turn the clocks off immediately, but
> wait a while for the fifo to empty.  Hopefully the auto-suspend time is
> enough.
>

[...]

>
> p.s. who should I formally submit OMAP-UART patches to?  I have a couple of
> others such as the below that I should submit somewhere.
>
>
> diff --git a/arch/arm/mach-omap2/serial.c b/arch/arm/mach-omap2/serial.c
> index 247d894..35a649f 100644
> --- a/arch/arm/mach-omap2/serial.c
> +++ b/arch/arm/mach-omap2/serial.c
> @@ -54,11 +54,9 @@
>
>  struct omap_uart_state {
>        int num;
> -       int can_sleep;
>
>        struct list_head node;
>        struct omap_hwmod *oh;
> -       struct platform_device *pdev;
>  };
>
>  static LIST_HEAD(uart_list);
> @@ -381,8 +379,6 @@ void __init omap_serial_init_port(struct omap_board_data *bd
>
>        oh->mux = omap_hwmod_mux_init(bdata->pads, bdata->pads_cnt);
>
> -       uart->pdev = pdev;
> -
>        oh->dev_attr = uart;
>
>        if (((cpu_is_omap34xx() || cpu_is_omap44xx()) && bdata->pads)

Acked-by: Govindraj.R <govindraj.raja@ti.com>

Please post out this part as a patch,
I think this change has to go through linux-omap tree.

--
Thanks,
Govindraj.R
NeilBrown Feb. 3, 2012, 9:54 a.m. UTC | #6
On Thu, 2 Feb 2012 22:45:53 -0700 (MST) Paul Walmsley <paul@pwsan.com> wrote:

> Hello Neil.
> 
> On Fri, 3 Feb 2012, NeilBrown wrote:
> 
> > Can I comment??...  They are good but ....
> > 
> > I've tried two approaches to getting serial behaviour that I am happy with.
> > They are with and without runtime pm.
> > 
> > If I enable runtime pm by setting power/autosuspend_delay_ms (e.g. to 5000)
> 
> Runtime PM should be enabled even with power/autosuspend_delay_ms = 0.  
> I think you are simply enabling the autosuspend timer.  There was a 
> significant behavior change here from 3.2, I believe.

However the default autosuspend_delay_ms is "-1", not "0", and "-1" does
disable runtime_pm completely.  It increments usage_count (see
update_autosuspend()) so runtime_pm can never suspend the device.

> 
> > then CPUIDLE enters lower states and I think it uses less power but I
> > sometimes lose the first char I type (that is known) but also I sometimes get
> > corruption on output.
> 
> I don't see any output corruption on 35xx BeagleBoard, either with or 
> without these patches.  So this is presumably a 37xx-centric problem, and 
> unrelated to these patches, I would guess.

Maybe it is 37xx specific.  I think this is a DM3730.


> 
> > The problem seems to be that we turn off the clocks when the kernel's ring
> > buffer is empty rather than waiting for the UART's fifo to be empty.
> 
> pm_runtime_put*() calls will write to the CM_*CLKEN registers if the 
> usecount goes to zero.  But the PRCM should not actually turn off the 
> clocks if the UART isn't idle.  So I would suggest that you first try some 
> hwmod CLOCKACTIVITY changes to fix this (described briefly below).

Hmm... I thought it was the other way around - CLKEN could gate the clock
off, and PRCM could also turn it off after a handshake.  But I finally found
the relevant text:

   The software effect is immediate and direct. The functional clock is
   turned on as soon as the bit is set, and turned off if the bit is cleared
   and the clock is not required by any module.

so it seems I was wrong.

Still - something is definitely going wrong because I definitely and regularly
see garbage characters.  And the patch fixes it.  So some runtime-suspend
handler must be doing something bad, and it must happen while characters
are being written.


> 
> > I can remove this effect with:
> > 
> > diff --git a/drivers/tty/serial/omap-serial.c b/drivers/tty/serial/omap-serial.c
> > index f809041..c7ef760 100644
> > --- a/drivers/tty/serial/omap-serial.c
> > +++ b/drivers/tty/serial/omap-serial.c
> > @@ -440,7 +440,8 @@ static unsigned int serial_omap_tx_empty(struct uart_port *port)
> >  	spin_lock_irqsave(&up->port.lock, flags);
> >  	ret = serial_in(up, UART_LSR) & UART_LSR_TEMT ? TIOCSER_TEMT : 0;
> >  	spin_unlock_irqrestore(&up->port.lock, flags);
> > -	pm_runtime_put(&up->pdev->dev);
> > +	pm_runtime_mark_last_busy(&up->pdev->dev);
> > +	pm_runtime_put_autosuspend(&up->pdev->dev);
> >  	return ret;
> >  }
> 
> Well this change probably makes sense anyway, just to keep the autosuspend 
> behavior consistent when an autosuspend timeout is set.  But the effect of 
> this patch may be a little different than what you think; see below.
> 
> > i.e. when the tx buffer is empty, so turn the clocks off immediately, 
> > but wait a while for the fifo to empty.  Hopefully the auto-suspend time 
> > is enough.
> 
> Hmm, this statement is a little unclear to me.  The clocks won't be turned 
> off immediately - the request to disable the clocks only happens when the 
> autosuspend timer expires.  And even then, as mentioned above, it's just a 
> request.  The hardware should not actually disable the functional clock 
> until the UART FIFO is drained...

If you pm_runtime_put_autosuspend(), the suspend won't happen immediately but
will wait the timeout.
If you pm_runtime_put_sync(), the suspend happens straight away.
If you pm_runtime_put(), the suspend is schedule immediately in another
thread, so it happens very soon.  It doesn't wait for a timer to expire (no
timer is ticking at this point).

Just because an autosuspend timeout was set earlier, that won't cause
pm_runtime_put() to delay in suspending the device when it is called.

So I think it does request that the clocks be turned off immediately.
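
As a small sketch for readers following along (these are real runtime PM
calls; 'dev' is just a placeholder, and in practice a driver pairs exactly one
put variant with each matching get):

	#include <linux/pm_runtime.h>

	static void put_variants_sketch(struct device *dev)
	{
		/* drops the usage count and queues an idle/suspend request to
		 * run "very soon" in another thread, regardless of any
		 * autosuspend delay that was configured */
		pm_runtime_put(dev);

		/* drops the usage count and, if it reaches zero, suspends the
		 * device synchronously, right here in the caller's context */
		pm_runtime_put_sync(dev);

		/* records activity and only schedules the suspend for after
		 * the autosuspend delay has expired */
		pm_runtime_mark_last_busy(dev);
		pm_runtime_put_autosuspend(dev);
	}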


> 
> Anyway, consider trying a different CLOCKACTIVITY setting for the UARTs. 

My TRM says that SYSC for the UARTs doesn't have CLOCKACTIVITY. Only
 AUTOIDLE SOFTRESET ENAWAKEUP IDLEMODE

Is it worth trying anyway?
 
> There are fields and flags for this in the hwmod code; see for example the 
> I2C SYSCONFIG settings in mach-omap2/omap_hwmod_3xxx_data.c.  It's 
> possible that the UART is currently assuming that its functional clock 
> will stay on when it idle-acks.  That might cause the corruption that you 
> are seeing.
> 
> > But I decided not to pursue this further as turning off the clocks seems 
> > like the wrong thing to be doing.
> 
> The clocks should be getting disabled when the autosuspend timer is 0 
> also.  It's just that the UART driver will request to disable the clocks 
> immediately, rather than waiting.

But not when it is -1.


> 
> > The OMAP UARTs have auto-suspend/auto-wake so I would rather depend on 
> > that.  And turning off the clocks loses that first character at 115200 
> > and above (not below).
> 
> If you make a change that causes the kernel to keep the UART clocks on, 
> the enclosing clockdomain and any associated clockdomains (probably 
> CORE_L3, CORE_L4 & PER) will be prevented from going to sleep.  So just be 
> aware that the strategy you are considering will incur an energy 
> consumption cost.  The lowest C-state you should manage to reach should be 
> C4, at least in mainline terms.
> 
> Incidentally, it's unclear to me how you are keeping the clocks on?  Are 
> you disabling runtime PM for the driver in some way?

Yes, leaving autosuspend_delay_ms at the default of -1 .... :-)

> 
> > So I explored leaving the runtime_pm setting as it was.  This set a maximum
> > latency of 4444 usec (which should be 5555 as there are 10 bits per char) 
> > so it didn't get very low down the CPUIDLE levels.
> >
> > So I disabled the latency setting when hardware handshaking is used (as you
> > suggested once):
> 
> [snip]
> 
> > and now CPUIDLE uses lower states (especially if I enable_off_mode).
> 
> That's good.  One important note is that the CPUIdle statistics don't keep 
> track of what C-state was actually entered -- it simply tracks what 
> C-state that CPUIdle intended to enter.  So you'll want to check 
> /debug/pm_debug/count to be sure.  Also, the PM debug counts are 
> approximate.

Yes, I'd begun to realise that - thanks.

> 
> Incidentally, I have some patches to fix the latency calculation here that 
> are in the works, similar to the ones you describe.  The current 
> calculation in the driver is pretty broken, but since the changes to fix 
> the calculation are rather involved, Kevin and I thought it would be best 
> to defer most of them to the v3.4 merge window.  The calculation fix in 
> the 3.3-rc series is simply intended to deal with a very basic power 
> management regression, as you know - not to make it ideal, which is more 
> complicated.  Anyway, the patches are at git://git.pwsan.com/linux-2.6 in 
> the branch 'omap_serial_fix_pm_qos_formula_devel_3.4'.  Keep in mind that 
> at least one patch in that branch is broken, but perhaps you might get an 
> idea of where they are trying to go.  Comments welcome.
> 
> > However I am using a lot more power than in 3.2.  
> 
> Is this without disabling the UART clocks?  If the driver is keeping the 
> UART clocks enabled, then increased power consumption is definitely 
> expected.

Both. With clocks kept on (autosuspend == -1) I'm using about 30 mA more than
before.  With clocks allowed to go off it is closer to 40mA more !!! Weird,
isn't it?

> 
> > That probably isn't all UART-related, but there is one interesting 
> > observation.
> > 
> > If I watch /sys/kernel/debug/pm_debug/time and see where the time is spent
> > over a 30second period when the systems is mostly idle:
> > 
> > With runtime_pm disabled for UARTs, both PER and CORE are permanently ON,
> > and MPU is OFF nearly all the time. (This is with off_mode enabled).
> 
> How are you disabling runtime PM?  Is this just with the autosuspend 
> timeout set to 0?
     -1.


> 
> > If I then enabled runtime_pm with a 5 second autosuspend delay:
> >   CORE is still permanently ON (I think the USB might be doing that).
> >   PER  is OFF (7 seconds) RET (15 seconds) and ON (8 seconds).
> > but more surprising
> >   MPU  is OFF (8 sec) RET (8 sec) INA (9 sec) ON (4 secs).
> > 
> > So we get to turn PER off at the cost of turning the MPU on a lot.
> > (the periods in each state are only repeatable to a precision of 20%-50%).
> > 
> > I understand that PER won't go OFF without runtime PM, but I would expect
> > it to at least go to INA as the UART is in smart-idle mode
> 
> You are using UART3?  That is in PER and its functional clock comes via 
> the PER clockdomain.  If the UART functional clock is prevented from being 
> turned off, how can PER go inactive when the UART3 functional clock is 
> still enabled?

I was thinking the auto-suspend would kick in - but I had it backwards it
seems.  I'll have to rethink things.
(yes, UART3).

> 
> > The net result is that with runtime_pm enabled I'm seeing an extra 7mA (at
> > 4V). That might not be very precise, but it is in the opposite direction to
> > my expectations.
> > 
> > So there is something odd here...  ideas?
> > 
> > NeilBrown
> > 
> > p.s. who should I formally submit OMAP-UART patches to?  I have a couple of
> > others such as the below that I should submit somewhere.
> 
> Would suggest sending them to linux-serial@vger.kernel.org, 
> linux-omap@vger.kernel.org, linux-arm-kernel@lists.infradead.org, and Alan 
> Cox since he is the serial maintainer?  And you should probably also cc 
> me, since I seem to be stuck with fixing this driver so I can test my 
> other patches; and cc Govindraj as well.
> 
> 
> - Paul

Thanks for your time.
I'm going to have to come up with a better theory for why my patch works.

Thanks,
NeilBrown
Gražvydas Ignotas Feb. 3, 2012, 11:42 a.m. UTC | #7
On Fri, Feb 3, 2012 at 11:54 AM, NeilBrown <neilb@suse.de> wrote:
> On Thu, 2 Feb 2012 22:45:53 -0700 (MST) Paul Walmsley <paul@pwsan.com> wrote:
>> On Fri, 3 Feb 2012, NeilBrown wrote:
>>
>> > then CPUIDLE enters lower states and I think it uses less power but I
>> > sometimes lose the first char I type (that is known) but also I sometimes get
>> > corruption on output.
>>
>> I don't see any output corruption on 35xx BeagleBoard, either with or
>> without these patches.  So this is presumably a 37xx-centric problem, and
>> unrelated to these patches, I would guess.
>
> Maybe it is 37xx specific.  I think this is a DM3730.

Not sure if it's the same problem but with 3530 on 3.2 with
sleep_timeout set, I usually get first char dropped (as expected) but
sometimes I get corrupted char instead too. Corrupt char seems to
almost always happen if I set cpufreq to powersave, on performance it's
almost always ok, so maybe it's some timing problem.
NeilBrown Feb. 3, 2012, 12:07 p.m. UTC | #8
On Fri, 3 Feb 2012 12:26:14 +0530 Govindraj <govindraj.ti@gmail.com> wrote:

> On Fri, Feb 3, 2012 at 9:37 AM, NeilBrown <neilb@suse.de> wrote:
> > On Thu, 2 Feb 2012 13:03:01 -0700 (MST) Paul Walmsley <paul@pwsan.com> wrote:
> >
> >> Hi Greg,
> >>
> >> On Thu, 26 Jan 2012, Paul Walmsley wrote:
> >>
> >> > On Thu, 26 Jan 2012, Greg KH wrote:
> >> >
> >> > > Ok, I've just reverted both of these patches for now, please send new
> >> > > ones when you have them.
> >> >
> >> > Thanks.  A pull request is at the bottom of this message.  The patches
> >> > are also available from the mailing list archives here:
> >> >
> >> > http://marc.info/?l=linux-arm-kernel&m=132754676014389&w=2
> >> > http://marc.info/?l=linux-arm-kernel&m=132754677914395&w=2
> >> > http://marc.info/?l=linux-arm-kernel&m=132754676014388&w=2
> >>
> >> Any comments on these?
> >
> > Can I comment??...  They are good but ....
> >
> > I've tried two approaches to getting serial behaviour that I am happy with.
> > They are with and without runtime pm.
> >
> > If I enable runtime pm by setting power/autosuspend_delay_ms (e.g. to 5000)
> > then CPUIDLE enters lower states and I think it uses less power but I
> > sometimes lose the first char I type (that is known) but also I sometimes get
> > corruption on output.
> >
> > The problem seems to be that we turn off the clocks when the kernel's ring
> > buffer is empty rather than waiting for the UART's fifo to be empty.
> > So I get corruption near the end of a stream of output ... not right at the
> > end so something must be turning the clocks back on somehow.
> >
> > I can remove this effect with:
> >
> > diff --git a/drivers/tty/serial/omap-serial.c b/drivers/tty/serial/omap-serial.c
> > index f809041..c7ef760 100644
> > --- a/drivers/tty/serial/omap-serial.c
> > +++ b/drivers/tty/serial/omap-serial.c
> > @@ -440,7 +440,8 @@ static unsigned int serial_omap_tx_empty(struct uart_port *port)
> >        spin_lock_irqsave(&up->port.lock, flags);
> >        ret = serial_in(up, UART_LSR) & UART_LSR_TEMT ? TIOCSER_TEMT : 0;
> >        spin_unlock_irqrestore(&up->port.lock, flags);
> > -       pm_runtime_put(&up->pdev->dev);
> > +       pm_runtime_mark_last_busy(&up->pdev->dev);
> > +       pm_runtime_put_autosuspend(&up->pdev->dev);
> >        return ret;
> >  }
> >
> 
> But looking into it UART_LSR_TEMT("include/linux/serial_reg.h") => 0x40
> 
> from omap trm:
> 
> TX_SR_E => bit 6
> "Read 0x1: Transmitter hold (TX FIFO) and shift registers are empty."
> 
> So it means all data from tx fifo has shifted out and no pending data in
> tx fifo.
> 
> But we had used UART_LSR_THRE (0x20) here for tx fifo emptiness comparison
> then  from omap trm
> 
> TX_FIFO_E => bit 5
> 
> "Read 0x1: Transmit hold register (TX FIFO) is empty.
> The transmission is not necessarily complete"
> 
> So I think all data has been shifted out from tx fifo for
> serial_omap_tx_empty check.

Useful and interesting - thanks.
However what it really shows me is that I was misunderstanding the code
(If I spent the time to verify every conclusion that I jumped to, I'd never
get anywhere :-( ).  Better clarify that now.

So this function - serial_omap_tx_empty() is called to test if the
tx queue is empty.

It is called in a loop at the end of uart_wait_until_sent.

uart_wait_until_sent is called from various places, but particularly when
doing an 'stty' ioctl.

The loop isn't a very tight loop though it would be tighter if jiffies were
smaller.  As it is it checks every jiffy and usually loops 3 times when I
see corruption.

So when I

   cat somefile

to the serial console, most of the file comes out fine.  But after 'cat'
finishes and returns to the shell - while some chars are still in the FIFO - 
the shell does an 'stty' ioctl to make sure the settings are still OK.
This ioctl calls serial_omap_tx_empty which calls pm_runtime_put() which
immediately suspends the uart, which seems to stop some clock - even though
we think it shouldn't.

(If I use 'dash' instead of 'bash', that shell doesn't fiddle with stty, and
I don't see any corruption).

After a bit more experimentation, I find that if either UART3 or UART4 (which
are numbers 2 and 3 in sysfs!!!) have auto_suspend_delay set to -1, then I
don't see any corruption.
But if both are set to 5000, then I do.

(The settings of UART1 and UART2 seem to be irrelevant - presumably because
they are in CORE, not PER).

So if either uart wants PER_48M_FCLK on, then it stays on.  But if neither
explicitly requests it and they are only keeping it on by being active ... then
there is room for a hiccup it seems.
Or maybe it isn't the clocks ... maybe the problem is that PER goes into
RET mode, which it does for about 40msec.

If I run
 cat /sys/kernel/debug/pm_debug/time; stty 115200; \
 cat /sys/kernel/debug/pm_debug/time

then I see corruption near the end of the first output of the 'time' file,
and the time that PER is in RET goes up by about 40msec.
This could be because the clock goes off, or maybe it has some other cause
that I'm not aware of.

So yeah - I was quite wrong in my original analysis, thanks for encouraging
me to sort it out.



This is awfully similar to the problem I had with HDQ - clock seeming to
disappear when it should not.

There is something important I'm missing.... time to go back to the TRM I
guess..



> 
> >
> > i.e. when the tx buffer is empty, so turn the clocks off immediately, but
> > wait a while for the fifo to empty.  Hopefully the auto-suspend time is
> > enough.
> >
> 
> [...]
> 
> >
> > p.s. who should I formally submit OMAP-UART patches to?  I have a couple of
> > others such as the below that I should submit somewhere.
> >
> >
> > diff --git a/arch/arm/mach-omap2/serial.c b/arch/arm/mach-omap2/serial.c
> > index 247d894..35a649f 100644
> > --- a/arch/arm/mach-omap2/serial.c
> > +++ b/arch/arm/mach-omap2/serial.c
> > @@ -54,11 +54,9 @@
> >
> >  struct omap_uart_state {
> >        int num;
> > -       int can_sleep;
> >
> >        struct list_head node;
> >        struct omap_hwmod *oh;
> > -       struct platform_device *pdev;
> >  };
> >
> >  static LIST_HEAD(uart_list);
> > @@ -381,8 +379,6 @@ void __init omap_serial_init_port(struct omap_board_data *bd
> >
> >        oh->mux = omap_hwmod_mux_init(bdata->pads, bdata->pads_cnt);
> >
> > -       uart->pdev = pdev;
> > -
> >        oh->dev_attr = uart;
> >
> >        if (((cpu_is_omap34xx() || cpu_is_omap44xx()) && bdata->pads)
> 
> Acked-by: Govindraj.R <govindraj.raja@ti.com>
> 
> Please post out this part as a patch,
> I think this change has to go through linux-omap tree.

I'll do that, thanks.

NeilBrown
NeilBrown Feb. 3, 2012, 12:11 p.m. UTC | #9
On Fri, 3 Feb 2012 13:42:13 +0200 Grazvydas Ignotas <notasas@gmail.com> wrote:

> On Fri, Feb 3, 2012 at 11:54 AM, NeilBrown <neilb@suse.de> wrote:
> > On Thu, 2 Feb 2012 22:45:53 -0700 (MST) Paul Walmsley <paul@pwsan.com> wrote:
> >> On Fri, 3 Feb 2012, NeilBrown wrote:
> >>
> >> > then CPUIDLE enters lower states and I think it uses less power but I
> >> > sometimes lose the first char I type (that is known) but also I sometimes get
> >> > corruption on output.
> >>
> >> I don't see any output corruption on 35xx BeagleBoard, either with or
> >> without these patches.  So this is presumably a 37xx-centric problem, and
> >> unrelated to these patches, I would guess.
> >
> > Maybe it is 37xx specific.  I think this is a DM3730.
> 
> Not sure if it's the same problem but with 3530 on 3.2 with
> sleep_timeout set, I usually get first char dropped (as expected) but
> sometimes I get corrupted char instead too. Corrupt char seems to
> almost always happen if I set cpufreq to powersave, on performance it's
> almost always ok, so maybe it's some timing problem,

I see that too - I'm glad someone else does :-)

If I look at the corruption as compared to the expected character, it is
always the case some it has been shifted down a few bits, with '1's shifted
in the top.
As bytes are sent lsb first followed by stop bits which are '1', this is
completely consistent with clocking the bits in a little bit late.

I see this with baud rates of 115200 and higher, but never with lower.

My theory is that there is a delay between the falling RX line waking the
system up, and the CPU enabling the UART - whether enabling the clocks or
doing a full config, I am not sure - though I think the former.

Maybe if we could enable the UART clocks immediately after returning from the
WFI instruction we could avoid the corruption....

NeilBrown
Felipe Contreras Feb. 3, 2012, 12:12 p.m. UTC | #10
On Fri, Feb 3, 2012 at 6:07 AM, NeilBrown <neilb@suse.de> wrote:
> If I then enabled runtime_pm with a 5 second autosuspend delay:
>  CORE is still permanently ON (I think the USB might be doing that).

Maybe related to this:
http://thread.gmane.org/gmane.linux.usb.general/56096
Russell King - ARM Linux Feb. 3, 2012, 12:20 p.m. UTC | #11
On Fri, Feb 03, 2012 at 11:07:20PM +1100, NeilBrown wrote:
> However what it really shows me is that I was misunderstanding the code
> (If I spent the time to verify every conclusion that I jumped to, I'd never
> get anywhere :-( ).  Better clarify that now.
> 
> So this function - serial_omap_tx_empty() is called to test if the
> tx queue is empty.

Not just the queue.  The documentation for that function is accurate:

  tx_empty(port)
        This function tests whether the transmitter fifo and shifter
        for the port described by 'port' is empty.  If it is empty,
        this function should return TIOCSER_TEMT, otherwise return 0.
        If the port does not support this operation, then it should
        return TIOCSER_TEMT.

So, if this is returning TIOCSER_TEMT while there's still a transmission
in progress, then it's buggy.

> So when I
> 
>    cat somefile
> 
> to the serial console, most of the file comes out fine.  But after 'cat'
> finishes and returns to the shell - while some chars are still in the FIFO - 
> the shell does an 'stty' ioctl to make sure the settings are still OK.
> This ioctl calls serial_omap_tx_empty which calls pm_runtime_put() which
> immediately suspends the uart, which seems to stop some clock - even though
> we think it shouldn't.

If there's an on-going transmission, why is the runtime PM count zero,
meaning that the UART can sleep at the point where serial_omap_tx_empty()
is being called - and obviously there's still characters in the FIFO?

I guess this is highlighting a problem with doing runtime PM with
serial ports: you don't know when the port has _actually_ finished
sending its final character, which is even more of a problem if the
port does hardware CTS flow control itself.

It seems that runtime PM should be checking whether the TX FIFO is empty
before shutting the port down - and that probably means a hook into the
idle stuff.

Note though, that serial_omap_tx_empty() can be called via ioctl from
userspace, so this function would still need the runtime callbacks in
it so that the register is readable at _any_ time that the port is open.
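
As a rough sketch of that suggestion (hypothetical code, not the actual
omap-serial callback; it borrows the driver's uart_omap_port/serial_in helpers
and simply vetoes the runtime suspend while data is still draining):

	/* Hypothetical: refuse runtime suspend while the TX FIFO or shift
	 * register still holds data; the runtime PM core will retry later. */
	static int serial_omap_runtime_suspend_sketch(struct device *dev)
	{
		struct uart_omap_port *up = dev_get_drvdata(dev);

		if (!(serial_in(up, UART_LSR) & UART_LSR_TEMT))
			return -EBUSY;	/* still transmitting */

		/* ... existing context save / idle handling would follow ... */
		return 0;
	}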
Paul Walmsley Feb. 3, 2012, 7:34 p.m. UTC | #12
On Fri, 3 Feb 2012, NeilBrown wrote:

> On Thu, 2 Feb 2012 22:45:53 -0700 (MST) Paul Walmsley <paul@pwsan.com> wrote:
> 
> > On Fri, 3 Feb 2012, NeilBrown wrote:
> > 
> > > If I enable runtime pm by setting power/autosuspend_delay_ms (e.g. to 5000)
> > 
> > Runtime PM should be enabled even with power/autosuspend_delay_ms = 0.  
> > I think you are simply enabling the autosuspend timer.  There was a 
> > significant behavior change here from 3.2, I believe.
> 
> However the default autosuspend_delay_ms is "-1", not "0", and "-1" does
> disable runbtime_pm completely.  It increments usage_count (see
> update_autosuspend()) so runtime_pm can never suspend the device.

OK good, so you are indeed keeping the clocks enabled, then.

> Hmm... I thought it was the other way around - CLKEN could gate the clock
> off, and PRCM could also turn it off after a handshake.  But I finally found
> the relevant text:
> 
>    The software effect is immediate and direct. The functional clock is
>    turned on as soon as the bit is set, and turned off if the bit is cleared
>    and the clock is not required by any module.
> 
> so it seems I was wrong.

Well, one shouldn't take the TRM too seriously.  But in this case, yes I 
think you had a slightly different idea than what the hardware people
intended ;-)

> Still - something is definitely going wrong because I definitely and regularly
> see garbage characters.  And the patch fixes it.  So some runtime-suspend
> handler must be doing something bad, and it must happen while characters
> are being written.

It's certainly possible that there is another idle bug in the UART where 
it could be mistakenly idle-acking when there are bytes left to be 
transmitted.  But the patch 'tty: serial: OMAP: block idle while the UART 
is transferring data in PIO mode' should fix that.  Are you running with 
those patches applied?  Also, are you using PIO or DMA?

> > > i.e. when the tx buffer is empty, so turn the clocks off immediately, 
> > > but wait a while for the fifo to empty.  Hopefully the auto-suspend time 
> > > is enough.
> > 
> > Hmm, this statement is a little unclear to me.  The clocks won't be turned 
> > off immediately - the request to disable the clocks only happens when the 
> > autosuspend timer expires.  And even then, as mentioned above, it's just a 
> > request.  The hardware should not actually disable the functional clock 
> > until the UART FIFO is drained...
> 
> If you pm_runtime_put_autosuspend(), the suspend won't happen immediately but
> will wait the timeout.
> If you pm_runtime_put_sync(), the suspend happens straight away.
> If you pm_runtime_put(), the suspend is schedule immediately in another
> thread, so it happens very soon.  It doesn't wait for a timer to expire (no
> timer is ticking at this point).
> 
> Just because an autosuspend timeout was set earlier, that won't cause
> pm_runtime_put() to delay in suspending the device when it is called.
> 
> So I think it does request that the clocks be turned off immediately.

I was under the impression you were referring to the behavior after your 
patch was applied, rather than before it.  My misunderstanding.

> > Anyway, consider trying a different CLOCKACTIVITY setting for the UARTs. 
> 
> My TRM says that SYSC for the UARTs doesn't have CLOCKACTIVITY. Only
>  AUTOIDLE SOFTRESET ENAWAKEUP IDLEMODE
> 
> Is it worth trying anyway?

Sure, why not.  It's fast to try, and if it happens to work, it's better 
than hacking the driver..  

> > Incidentally, I have some patches to fix the latency calculation here that 
> > are in the works, similar to the ones you describe.  The current 
> > calculation in the driver is pretty broken, but since the changes to fix 
> > the calculation are rather involved, Kevin and I thought it would be best 
> > to defer most of them to the v3.4 merge window.  The calculation fix in 
> > the 3.3-rc series is simply intended to deal with a very basic power 
> > management regression, as you know - not to make it ideal, which is more 
> > complicated.  Anyway, the patches are at git://git.pwsan.com/linux-2.6 in 
> > the branch 'omap_serial_fix_pm_qos_formula_devel_3.4'.  Keep in mind that 
> > at least one patch in that branch is broken, but perhaps you might get an 
> > idea of where they are trying to go.  Comments welcome.
> > 
> > > However I am using a lot more power than in 3.2.  
> > 
> > Is this without disabling the UART clocks?  If the driver is keeping the 
> > UART clocks enabled, then increased power consumption is definitely 
> > expected.
> 
> Both. With clocks kept on (autosuspend == -1) I'm using about 30 mA more than
> before.  With clocks allowed to go off it is closer to 40mA more !!! Weird,
> isn't it?

Interesting.  A few questions.  Do you have the PMIC and the OMAP 
configured to scale voltage in retention?  Also, does the power effect 
differ if you use different autosuspend timeouts?

> > > If I then enabled runtime_pm with a 5 second autosuspend delay:
> > >   CORE is still permanently ON (I think the USB might be doing that).
> > >   PER  is OFF (7 seconds) RET (15 seconds) and ON (8 seconds).
> > > but more surprising
> > >   MPU  is OFF (8 sec) RET (8 sec) INA (9 sec) ON (4 secs).
> > > 
> > > So we get to turn PER off at the cost of turning the MPU on a lot.
> > > (the periods in each state are only repeatable to a precision of 20%-50%).

Hmmm yeah, that does seem weird that the MPU would stay out of a low power 
state just because the autosuspend timer was enabled.  

> > > p.s. who should I formally submit OMAP-UART patches to?  I have a couple of
> > > others such as the below that I should submit somewhere.
> > 
> > Would suggest sending them to linux-serial@vger.kernel.org, 
> > linux-omap@vger.kernel.org, linux-arm-kernel@lists.infradead.org, and Alan 
> > Cox since he is the serial maintainer?  And you should probably also cc 
> > me, since I seem to be stuck with fixing this driver so I can test my 
> > other patches; and cc Govindraj as well.
>
> Thanks for your time.

Thanks for helping us clean up this driver :-)

By the way, you might want to cc Kevin Hilman on any serial patches too, 
since he has been working in this area also.


- Paul
Paul Walmsley Feb. 3, 2012, 7:42 p.m. UTC | #13
On Fri, 3 Feb 2012, Grazvydas Ignotas wrote:

> On Fri, Feb 3, 2012 at 11:54 AM, NeilBrown <neilb@suse.de> wrote:
> > On Thu, 2 Feb 2012 22:45:53 -0700 (MST) Paul Walmsley <paul@pwsan.com> wrote:
> >> On Fri, 3 Feb 2012, NeilBrown wrote:
> >>
> >> > then CPUIDLE enters lower states and I think it uses less power but I
> >> > sometimes lose the first char I type (that is known) but also I sometimes get
> >> > corruption on output.
> >>
> >> I don't see any output corruption on 35xx BeagleBoard, either with or
> >> without these patches.  So this is presumably a 37xx-centric problem, and
> >> unrelated to these patches, I would guess.
> >
> > Maybe it is 37xx specific.  I think this is a DM3730.
> 
> Not sure if it's the same problem but with 3530 on 3.2 with
> sleep_timeout set, I usually get first char dropped (as expected) but
> sometimes I get corrupted char instead too. Corrupt char seems to
> almost always happen if I set cpufreq to powersave, on performance it's
> almost always ok, so maybe it's some timing problem,

OK so let's distinguish between two corruption situations:

1. The first character transmitted to the OMAP UART in a serial console 
when the UART powerdomain is in a non-functional, low power state (e.g., 
RET or below) is corrupted.  This is not actually output corruption, this 
is input corruption.

2. Characters are corrupted while the OMAP UART is transmitting data, but 
there has been no recent data sent to the OMAP.

Case 1 is expected and is almost certainly not a bug.  As Neil mentioned 
it should be bps-rate dependent.  It occurs when the first character 
transmitted to the OMAP wakes the chip up via I/O ring/chain wakeup.
I/O ring/chain wakeup is driven by a 32KiHz clock and is therefore 
relatively high-latency.  So this could easily cause the first character 
or first few characters to be lost or corrupted, depending on the exact 
sequence of events, the low power state that the chip was in, etc.

Case 2 is not expected.  That is likely a bug somewhere.  Neil, this is 
what I understood that you are experiencing.  Is that correct?

Gražvydas, are you seeing case 1 or 2 (or something completely different 
;-) ?


- Paul
Paul Walmsley Feb. 3, 2012, 7:49 p.m. UTC | #14
On Fri, 3 Feb 2012, NeilBrown wrote:

> My theory is that there is a delay between the falling RX line waking the
> system up, and the CPU enabling the UART - whether enabling the clocks or
> doing a full config, I am not sure - though I think the former.
> 
> Maybe if we could enable the UART clocks immediately after returning from the
> WFI instruction we could avoid the corruption....

The PRCM should be re-enabling the UART's functional clock itself, with no 
kernel involvement.  The sequence should go something like this 
(simplified):

1. I/O wakeup occurs

2. CORE & PER powerdomains are awakened

3. The UART notices an event on its input lines and deasserts its idle-ack

4. The PRCM re-enables the clocks to the UART

5. The UART receives the input character into its FIFO and raises an MPU 
   interrupt

That's the theory, anyway.

- Paul
Paul Walmsley Feb. 3, 2012, 7:54 p.m. UTC | #15
On Fri, 3 Feb 2012, Russell King - ARM Linux wrote:

> If there's an on-going transmission, why is the runtime PM count zero,
> meaning that the UART can sleep at the point where serial_omap_tx_empty()
> is being called - and obviously there's still characters in the FIFO?

In the case of OMAP, that's the point of the idle protocol.  If there's 
data left in the TX FIFO, the UART should prevent the rest of the chip 
from cutting its clocks until the TX FIFO is drained.  Even if the MPU has 
requested that the UART clocks be cut.

Of course, there may well be a bug that prevents this from working the 
way it was intended.


- Paul
Paul Walmsley Feb. 3, 2012, 8:10 p.m. UTC | #16
One other comment..

On Fri, 3 Feb 2012, NeilBrown wrote:

> On Thu, 2 Feb 2012 22:45:53 -0700 (MST) Paul Walmsley <paul@pwsan.com> wrote:
> 
> > On Fri, 3 Feb 2012, NeilBrown wrote:
> > 
> > > I can remove this effect with:
> > > 
> > > diff --git a/drivers/tty/serial/omap-serial.c b/drivers/tty/serial/omap-serial.c
> > > index f809041..c7ef760 100644
> > > --- a/drivers/tty/serial/omap-serial.c
> > > +++ b/drivers/tty/serial/omap-serial.c
> > > @@ -440,7 +440,8 @@ static unsigned int serial_omap_tx_empty(struct uart_port *port)
> > >  	spin_lock_irqsave(&up->port.lock, flags);
> > >  	ret = serial_in(up, UART_LSR) & UART_LSR_TEMT ? TIOCSER_TEMT : 0;
> > >  	spin_unlock_irqrestore(&up->port.lock, flags);
> > > -	pm_runtime_put(&up->pdev->dev);
> > > +	pm_runtime_mark_last_busy(&up->pdev->dev);
> > > +	pm_runtime_put_autosuspend(&up->pdev->dev);
> > >  	return ret;
> > >  }

It's a little surprising that this helps.  The pm_runtime_get*() and 
_put*() in serial_omap_tx_empty() are just intended to ensure that the 
UART's clocks are running for that LSR register read.

Considering your theory that the UART clocks are being cut while there's 
still data in the FIFO, you might consider removing this code at the end 
of transmit_chars():

	if (uart_circ_empty(xmit))
		serial_omap_stop_tx(&up->port);



- Paul
Paul Walmsley Feb. 3, 2012, 8:34 p.m. UTC | #17
On Fri, 3 Feb 2012, Paul Walmsley wrote:

> 2. CORE & PER powerdomains are awakened

Incidentally, there might be a missing clkdm_add_wakeup() in 
mach-omap2/pm34xx.c to wake up PER when CORE is awakened.  For people who 
are seeing some input character corruption when CORE/PER enters a 
low-power state, that might be worth experimenting with.


- Paul
NeilBrown Feb. 3, 2012, 8:44 p.m. UTC | #18
On Fri, 3 Feb 2012 12:42:22 -0700 (MST) Paul Walmsley <paul@pwsan.com> wrote:

> On Fri, 3 Feb 2012, Grazvydas Ignotas wrote:
> 
> > On Fri, Feb 3, 2012 at 11:54 AM, NeilBrown <neilb@suse.de> wrote:
> > > On Thu, 2 Feb 2012 22:45:53 -0700 (MST) Paul Walmsley <paul@pwsan.com> wrote:
> > >> On Fri, 3 Feb 2012, NeilBrown wrote:
> > >>
> > >> > then CPUIDLE enters lower states and I think it uses less power but I
> > >> > sometimes lose the first char I type (that is known) but also I sometimes get
> > >> > corruption on output.
> > >>
> > >> I don't see any output corruption on 35xx BeagleBoard, either with or
> > >> without these patches.  So this is presumably a 37xx-centric problem, and
> > >> unrelated to these patches, I would guess.
> > >
> > > Maybe it is 37xx specific.  I think this is a DM3730.
> > 
> > Not sure if it's the same problem but with 3530 on 3.2 with
> > sleep_timeout set, I usually get first char dropped (as expected) but
> > sometimes I get corrupted char instead too. Corrupt char seems to
> > almost always happen if I set cpufreq to powersave, on performance it's
> > almost always ok, so maybe it's some timing problem,
> 
> OK so let's distinguish between two corruption situations:
> 
> 1. The first character transmitted to the OMAP UART in a serial console 
> when the UART powerdomain is in a non-functional, low power state (e.g., 
> RET or below) is corrupted.  This is not actually output corruption, this 
> is input corruption.
> 
> 2. Characters are corrupted while the OMAP UART is transmitting data, but 
> there has been no recent data sent to the OMAP.
> 
> Case 1 is expected and is almost certainly not a bug.  As Neil mentioned 
> it should be bps-rate dependent.  It occurs when the first character 
> transmitted to the OMAP wakes the chip up via I/O ring/chain wakeup.
> I/O ring/chain wakeup is driven by a 32KiHz clock and is therefore 
> relatively high-latency.  So this could easily cause the first character 
> or first few characters to be lost or corrupted, depending on the exact 
> sequence of events, the low power state that the chip was in, etc.

A 32KiHz clock cycles roughly every 30 microseconds.
At 115200 bps, there is one bit every 8.7 microseconds.

When I type "return" - which looks like 0101100001111... on the wire,
I see '0xC3' which looks like 011000011111... on the wire.
So we lost exactly 2 bits, or a delay around 17 microseconds.

I find it hard to reconcile that delay with the cause being a 32KiHZ clock.

If the first char I type is a space (0x20 or 0000001001111111) then
the character received is 0x90 (0000010011111) which is exactly 1 bit missing,
so an 8 or 9 usec delay.
If the first char I type is 'o' (0x6f, or 0111101101111111) then the character
received is 0xfb (01101111111111) which misses 5 bits.
I think it misses the first bit, then waits for a start bit (0), then takes
the next 8 bits.

At 230400 bps, I always lose at least 2 bits.
At 460800 bps, I seem to lose at least 3 bits.
(above there, nothing works at all ... could be my USB/serial cable at fault).

So it looks a lot like a delay which is a small number of microseconds.
Could be the wake-up latency in the I/O ring/chain, but doesn't look like the
32 KiHz clock :-)
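
For reference, the arithmetic behind those numbers (plain user-space C, nothing
OMAP-specific; 8N1 framing assumed):

	#include <stdio.h>

	int main(void)
	{
		double bit_us  = 1e6 / 115200.0;  /* ~8.68 us per bit at 115200 bps */
		double lost_us = 2 * bit_us;      /* two lost bits: ~17.4 us of delay */
		double p32k_us = 1e6 / 32768.0;   /* one 32 KiHz cycle: ~30.5 us */

		printf("bit time:       %.2f us\n", bit_us);
		printf("two lost bits:  %.2f us\n", lost_us);
		printf("32 KiHz period: %.2f us\n", p32k_us);
		return 0;
	}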

> 
> Case 2 is not expected.  That is likely a bug somewhere.  Neil, this is 
> what I understood that you are experiencing.  Is that correct?

Correct.

Thanks,
NeilBrown

> 
> Gražvydas, are you seeing case 1 or 2 (or something completely different 
> ;-) ?
> 
> 
> - Paul
Paul Walmsley Feb. 3, 2012, 9:04 p.m. UTC | #19
On Sat, 4 Feb 2012, NeilBrown wrote:

> On Fri, 3 Feb 2012 12:42:22 -0700 (MST) Paul Walmsley <paul@pwsan.com> wrote:
>
> > Case 1 is expected and is almost certainly not a bug.  As Neil mentioned 
> > it should be bps-rate dependent.  It occurs when the first character 
> > transmitted to the OMAP wakes the chip up via I/O ring/chain wakeup.
> > I/O ring/chain wakeup is driven by a 32KiHz clock and is therefore 
> > relatively high-latency.  So this could easily cause the first character 
> > or first few characters to be lost or corrupted, depending on the exact 
> > sequence of events, the low power state that the chip was in, etc.
> 
> A 32KiHz clock cycles roughly every 30 microseconds.
> At 115200 bps, there is one bit every 8.7 microseconds.
> 
> When I type "return" - which looks like 0101100001111... on the wire,
> I see '0xC3' which looks like 011000011111... on the wire.
> So we lost exactly 2 bits, or a delay around 17 microseconds.
> 
> I find it hard to reconcile that delay with the cause being a 32KiHZ clock.

If you're not disabling the HF clock oscillator via the AUTOEXTCLKMODE 
bits, the wakeup logic may be getting clocked by the sys_clk, which is 
quite a bit faster.  That might explain the corruption patterns you're 
seeing.


- Paul
Paul Walmsley Feb. 3, 2012, 9:42 p.m. UTC | #20
One correction on this part...

On Fri, 3 Feb 2012, Paul Walmsley wrote:

> On Fri, 3 Feb 2012, NeilBrown wrote:
> 
> > My theory is that there is a delay between the falling RX line waking the
> > system up, and the CPU enabling the UART - whether enabling the clocks or
> > doing a full config, I am not sure - though I think the former.
> > 
> > Maybe if we could enable the UART clocks immediately after returning from the
> > WFI instruction we could avoid the corruption....
> 
> The PRCM should be re-enabling the UART's functional clock itself, with no 
> kernel involvement.  The sequence should go something like this 
> (simplified):
> 
> 1. I/O wakeup occurs
> 
> 2. CORE & PER powerdomains are awakened
> 
> 3. The UART notices an event on its input lines and deasserts its idle-ack

It just occurred to me that, supposedly, the only UART input line that is 
attached to the SWAKEUP signal is CTS.  So the UART may not in fact be 
able to deassert its idle-ack autonomously at this point.

So you might want to give your clock re-enable after WFI idea a shot!  It 
would be interesting if it helps.

I regret the oversight, 


- Paul
NeilBrown Feb. 3, 2012, 9:59 p.m. UTC | #21
On Fri, 3 Feb 2012 13:10:28 -0700 (MST) Paul Walmsley <paul@pwsan.com> wrote:

> One other comment..
> 
> On Fri, 3 Feb 2012, NeilBrown wrote:
> 
> > On Thu, 2 Feb 2012 22:45:53 -0700 (MST) Paul Walmsley <paul@pwsan.com> wrote:
> > 
> > > On Fri, 3 Feb 2012, NeilBrown wrote:
> > > 
> > > > I can remove this effect with:
> > > > 
> > > > diff --git a/drivers/tty/serial/omap-serial.c b/drivers/tty/serial/omap-serial.c
> > > > index f809041..c7ef760 100644
> > > > --- a/drivers/tty/serial/omap-serial.c
> > > > +++ b/drivers/tty/serial/omap-serial.c
> > > > @@ -440,7 +440,8 @@ static unsigned int serial_omap_tx_empty(struct uart_port *port)
> > > >  	spin_lock_irqsave(&up->port.lock, flags);
> > > >  	ret = serial_in(up, UART_LSR) & UART_LSR_TEMT ? TIOCSER_TEMT : 0;
> > > >  	spin_unlock_irqrestore(&up->port.lock, flags);
> > > > -	pm_runtime_put(&up->pdev->dev);
> > > > +	pm_runtime_mark_last_busy(&up->pdev->dev);
> > > > +	pm_runtime_put_autosuspend(&up->pdev->dev);
> > > >  	return ret;
> > > >  }
> 
> It's a little surprising that this helps.  The pm_runtime_get*() and 
> _put*() in serial_omap_tx_empty() are just intended to ensure that the 
> UART's clocks are running for that LSR register read.
> 
> Considering your theory that the UART clocks are being cut while there's 
> still data in the FIFO, you might consider removing this code at the end 
> of transmit_chars():
> 
> 	if (uart_circ_empty(xmit))
> 		serial_omap_stop_tx(&up->port);

I read the code and chickened out of just removing that.
serial_omap_stop_tx seems to do 2 things:
 1/ tell the uart to stop sending interrupts when the tx fifo is empty
 2/ set forceidle (really smartidle) on the uart.

I didn't feel comfortable removing '1' as I thought it might generate an
interrupt storm .. maybe not.
Instead I just removed '2'.  In fact I replaced the 'set_forceidle' call with
'set_noidle'.  So the uart should never report that it was idle.

I did this with my other patch removed so pm_runtime_put() was still being
called.

Result:  I still get corruption.
So having the UART say "no, I'm not idle" does *not* stop the clock
being turned off when we use omap_hwmod_idle() to turn off the clocks.

When we turn off a clock, if that is the last clock in the clock-domain, we
also turn off the clock-domain (I think).
Could it be that the clock-domain doesn't do any handshaking with modules,
and so turns off the clocks even though they are being used?

NeilBrown
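
For reference, the experiment described above amounts to roughly the
following swap in serial_omap_stop_tx() -- a sketch against the 3.3-era
omap-serial.c, which calls the platform set_forceidle/set_noidle hooks;
the surrounding context may differ in other trees:

	/* before: drop back to force/smart-idle once the TX buffer is empty */
	if (!up->use_dma && pdata->set_forceidle)
		pdata->set_forceidle(up->pdev);

	/* experiment: keep the UART in no-idle instead */
	if (!up->use_dma && pdata->set_noidle)
		pdata->set_noidle(up->pdev);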
NeilBrown Feb. 3, 2012, 10:10 p.m. UTC | #22
On Fri, 3 Feb 2012 14:42:09 -0700 (MST) Paul Walmsley <paul@pwsan.com> wrote:

> One correction on this part...
> 
> On Fri, 3 Feb 2012, Paul Walmsley wrote:
> 
> > On Fri, 3 Feb 2012, NeilBrown wrote:
> > 
> > > My theory is that there is a delay between the falling RX line waking the
> > > system up, and the CPU enabling the UART - whether enabling the clocks or
> > > doing a full config, I am not sure - though I think the former.
> > > 
> > > Maybe if we could enable the UART clocks immediately after returning from the
> > > WFI instruction we could avoid the corruption....
> > 
> > The PRCM should be re-enabling the UART's functional clock itself, with no 
> > kernel involvement.  The sequence should go something like this 
> > (simplified):
> > 
> > 1. I/O wakeup occurs
> > 
> > 2. CORE & PER powerdomains are awakened
> > 
> > 3. The UART notices an event on its input lines and deasserts its idle-ack
> 
> It just occurred to me that, supposedly, the only UART input line that is 
> attached to the SWAKEUP signal is CTS.  So the UART may not in fact be 
> able to deassert its idle-ack autonomously at this point.

How does that relate to the RX_CTS_WU_EN bit which enables an interrupt on 
    "a falling edge of pins RX, nCTS, or nDSR"

This seems to be a "wakeup interrupt", but it isn't clear what it wakes up.

> 
> So you might want to give your clock re-enable after WFI idea a shot!  It 
> would be interesting if it helps.

Might be a bit beyond me at the moment :-(

Thanks,
NeilBrown

> 
> I regret the oversight, 
> 
> 
> - Paul
Paul Walmsley Feb. 3, 2012, 10:30 p.m. UTC | #23
On Sat, 4 Feb 2012, NeilBrown wrote:

> On Fri, 3 Feb 2012 14:42:09 -0700 (MST) Paul Walmsley <paul@pwsan.com> wrote:
> 
> > On Fri, 3 Feb 2012, Paul Walmsley wrote:
> > 
> > > On Fri, 3 Feb 2012, NeilBrown wrote:
> > > 
> > > > My theory is that there is a delay between the falling RX line waking the
> > > > system up, and the CPU enabling the UART - whether enabling the clocks or
> > > > doing a full config, I am not sure - though I think the former.
> > > > 
> > > > Maybe if we could enable the UART clocks immediately after returning from the
> > > > WFI instruction we could avoid the corruption....
> > > 
> > > The PRCM should be re-enabling the UART's functional clock itself, with no 
> > > kernel involvement.  The sequence should go something like this 
> > > (simplified):
> > > 
> > > 1. I/O wakeup occurs
> > > 
> > > 2. CORE & PER powerdomains are awakened
> > > 
> > > 3. The UART notices an event on its input lines and deasserts its idle-ack
> > 
> > It just occurred to me that, supposedly, the only UART input line that is 
> > attached to the SWAKEUP signal is CTS.  So the UART may not in fact be 
> > able to deassert its idle-ack autonomously at this point.
> 
> How does that relate to the RX_CTS_WU_EN bit which enables an interrupt on 
>     "a falling edge of pins RX, nCTS, or nDSR"

That's the bit I'm talking about :-)  Maybe it might work appropriately, 
then, if it also tests RX.  Section 19.3.2.3 "Wake-up Request" only 
mentions the CTS lines.  Flip a coin ;-)

> This seems to be a "wakeup interrupt", but it isn't clear what it wakes up.

The UART.  It should send an SWAKEUP to the PRCM and bring the UART out of 
idle-ack.

- Paul
Paul Walmsley Feb. 3, 2012, 11:02 p.m. UTC | #24
On Sat, 4 Feb 2012, NeilBrown wrote:

> On Fri, 3 Feb 2012 13:10:28 -0700 (MST) Paul Walmsley <paul@pwsan.com> wrote:
> 
> > Considering your theory that the UART clocks are being cut while there's 
> > still data in the FIFO, you might consider removing this code at the end 
> > of transmit_chars():
> > 
> > 	if (uart_circ_empty(xmit))
> > 		serial_omap_stop_tx(&up->port);
> 
> I read the code and chickened out of just removing that.
> serial_omap_stop_tx seems to do 2 things:
>  1/ tell the uart to stop sending interrupts when the tx fifo is empty
>  2/ set forceidle (really smartidle) on the uart.
> 
> I didn't feel comfortable removing '1' as I thought it might generate an
> interrupt storm .. maybe not.

Might be worth a try.  In theory, since the current UART driver sets the 
TX_EMPTY flag in the SCR register, the UART should only raise a TX 
interrupt when the FIFO + shift register are totally empty.  So hopefully 
you should only get one extra interrupt per TTY transmit operation.

> Instead I just removed '2'.  In fact I replaced the 'set_forceidle' call with
> 'set_noidle'.  So the uart should never report that it was idle.
> 
> I did this with my other patch removed so pm_runtime_put() was still being
> called.
> 
> Result:  I still get corruption.
> So having the UART say "no, I'm not idle" does *not* stop the clock
> being turned off when we use omap_hwmod_idle() to turn off the clocks.

Hmm that's doubtful.  If that's really so, then we should be seeing 
massive UART transmit problems.  I'd expect that the driver wouldn't be 
able to get any transmit buffers out the door at all before the UART's 
fclk is cut.

What's probably happening in this case is that the hwmod code is rewriting 
the UART SIDLEMODE bits in the hwmod code's _idle() function.  This gets 
called as part of the PM runtime suspend operation.  So it's bypassing 
your debugging hack :-)  The hwmod code expects to control the SYSCONFIG 
register bits itself, and the current way that the UART driver messes with 
the SYSCONFIG bits is a total hack that the hwmod code is not expecting.
You could try disabling that behavior in _idle_sysc() by adding a hack to 
skip it if it's the UART3 hwmod.

> When we turn off a clock, if that is the last clock in the clock-domain, we
> also turn off the clock-domain (I think).

That's only true if the clockdomain is programmed to use 
software-supervised idle.  CORE & PER should both be programmed to 
hardware-supervised idle by mach-omap2/pm34xx.c.  In that case, we let the 
PRCM put the clockdomain to sleep by itself.

> Could it be that the clock-domain doesn't do any handshaking with modules,
> and so turns off the clocks even though they are being used?

Probably not -- I'd think that hardware-supervised idle wouldn't work at 
all if that were true.


- Paul
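
For anyone who wants to try that debugging hack, it is essentially a
two-line early return at the top of _idle_sysc() in
arch/arm/mach-omap2/omap_hwmod.c (a throwaway experiment, not a fix;
the "uart3" hwmod name is assumed here):

	/* debugging hack: leave UART3's SYSCONFIG idle bits alone so the
	 * driver's no-idle setting is not rewritten on runtime suspend */
	if (!strcmp(oh->name, "uart3"))
		return;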
NeilBrown Feb. 4, 2012, 12:01 a.m. UTC | #25
On Fri, 3 Feb 2012 16:02:42 -0700 (MST) Paul Walmsley <paul@pwsan.com> wrote:

> On Sat, 4 Feb 2012, NeilBrown wrote:
> 
> > On Fri, 3 Feb 2012 13:10:28 -0700 (MST) Paul Walmsley <paul@pwsan.com> wrote:
> > 
> > > Considering your theory that the UART clocks are being cut while there's 
> > > still data in the FIFO, you might consider removing this code at the end 
> > > of transmit_chars():
> > > 
> > > 	if (uart_circ_empty(xmit))
> > > 		serial_omap_stop_tx(&up->port);
> > 
> > I read the code and chickened out of just removing that.
> > serial_omap_stop_tx seems to do 2 things:
> >  1/ tell the uart to stop sending interrupts when the tx fifo is empty
> >  2/ set forceidle (really smartidle) on the uart.
> > 
> > I didn't feel comfortable removing '1' as I thought it might generate an
> > interrupt storm .. maybe not.
> 
> Might be worth a try.  In theory, since the current UART driver sets the 
> TX_EMPTY flag in the SCR register, the UART should only raise a TX 
> interrupt when the FIFO + shift register are totally empty.  So hopefully 
> you should only get one extra interrupt per TTY transmit operation.
> 
> > Instead I just removed '2'.  In fact I replaced the 'set_forceidle' call with
> > 'set_noidle'.  So the uart should never report that it was idle.
> > 
> > I did this with my other patch removed so pm_runtime_put() was still being
> > called.
> > 
> > Result:  I still get corruption.
> > So having the UART say "no, I'm not idle" does *not* stop the clock
> > being turned off when we use omap_hwmod_idle() to turn off the clocks.
> 
> Hmm that's doubtful.  If that's really so, then we should be seeing 
> massive UART transmit problems.  I'd expect that the driver wouldn't be 
> able to get any transmit buffers out the door at all before the UART's 
> fclk is cut.

Guess what happens if I set autosuspend_delay_ms to 0?
Massive transmit problems.  Driver can hardly get anything out before the
UART's fclk is cut...


> 
> What's probably happening in this case is that the hwmod code is rewriting 
> the UART SIDLEMODE bits in the hwmod code's _idle() function.  This gets 
> called as part of the PM runtime suspend operation.  So it's bypassing 
> your debugging hack :-)  The hwmod code expects to control the SYSCONFIG 
> register bits itself, and the current way that the UART driver messes with 
> the SYSCONFIG bits is a total hack that that hwmod code is not expecting.  
> You could try disabling that behavior in _idle_sysc() by adding a hack to 
> skip it if it's the UART3 hwmod.
> 
> > When we turn off a clock, if that is the last clock in the clock-domain, we
> > also turn off the clock-domain (I think).
> 
> That's only true if the clockdomain is programmed to use 
> software-supervised idle.  CORE & PER should both be programmed to 
> hardware-supervised idle by mach-omap2/pm34xx.c.  In that case, we let the 
> PRCM put the clockdomain to sleep by itself.
> 
> > Could it be that the clock-domain doesn't do any handshaking with modules,
> > and so turns off the clocks even though they are being used?
> 
> Probably not -- I'd think that hardware-supervised idle wouldn't work at 
> all if that were true.

Thanks for those hints.  Next time I dive into the code/doco they might help
me understand a bit more.

Thanks,
NeilBrown
Woodruff, Richard Feb. 4, 2012, 12:23 a.m. UTC | #26
> From: linux-omap-owner@vger.kernel.org [mailto:linux-omap-
> owner@vger.kernel.org] On Behalf Of NeilBrown

> > Not sure if it's the same problem but with 3530 on 3.2 with
> > sleep_timeout set, I usually get first char dropped (as expected) but
> > sometimes I get corrupted char instead too. Corrupt char seems to
> > almost always happen if I set cpufreq to powersave, on performance it's
> > almost always ok, so maybe it's some timing problem,
> 
> I see that too - I'm glad someone else does :-)

When you have aggressive PM working at the SOC level, you have many times lost a character on the UART, ever since OMAP2. A strange but true statement: it is nice to see it losing a character on mainline, as it is an indication that PM is likely working.

If you just hook up simple RX and TX lines and no other flow control, it is very likely, especially with older OMAPs, that you can lose the 'wake' character on the debug console. The UART operates on a clock derived from a 96MHz DPLL which was probably stopped. When the wakeup event hits the IO ring, many internals may need to repower and the source DPLL needs to relock. This can all take a while and you can lose the start bit at a high baud rate. If you use flow control you might be able to get ahead of it.

As the silicon process has shrunk from 90nm (omap2) to 65nm (omap3) to 45nm (omap4), the DPLL relock times have dropped a lot. With certain DPLL parameters it could take hundreds of uS to relock in OMAP2. And there are more variables in the latency stack-up than just the DPLL. Most of these have improved (gotten smaller) over time.

The hack for the debug console was always an activity timer along with denying idle while the SW and HW TX FIFOs had characters in them. This made the debug console usable.

One irritation was some internal interrupt sources were not linked to low power wakeup events. If you were in interrupt mode and got characters below watermark you could sleep before interrupt status showed up (as you had to wait several frame times before functional interrupt asserted) but there was no wake at anticipated frame timeout because lack of linking of internal event to wake event.

Outside of debug console, this loss has not been huge. Protocols like irda would retransmit their magic wake packets. You can move between DMA and interrupt modes with activity. So far there has been a work around per attached device.

If your screen blanks and you hit a char to wakeup, do you expect to see the character or do you throw it away?  You can set some timeout or policy to deny lower c-states which can ensure a debug console doesn't have any issue.  If your application is a phone it is not likely something you would do.

Maybe your issue today is due to some other software bug... but at the end of the trail you still may have an issue unless your attached hardware accommodates.

Regards,
Richard W.
Paul Walmsley Feb. 4, 2012, 12:59 a.m. UTC | #27
On Sat, 4 Feb 2012, Woodruff, Richard wrote:

> When you have aggressive PM working at the SOC level, you have many 
> times lost a character on the UART, ever since OMAP2. A strange but 
> true statement: it is nice to see it losing a character on mainline, 
> as it is an indication that PM is likely working.

We've been losing wakeup characters in mainline for many years now ;-)

> One irritation was some internal interrupt sources were not linked to 
> low power wakeup events. If you were in interrupt mode and got 
> characters below watermark you could sleep before interrupt status 
> showed up (as you had to wait several frame times before functional 
> interrupt asserted) but there was no wake at anticipated frame timeout 
> because lack of linking of internal event to wake event.

Indeed, it seems that we are just now working around these wakeup-related 
bugs.  Kind of surprising that no errata showed up for them.

What's particularly remarkable is that it looks like the UARTs will 
idle-ack while their transmit FIFOs have data in them (!)


- Paul
Woodruff, Richard Feb. 4, 2012, 1:46 a.m. UTC | #28
> From: Paul Walmsley [mailto:paul@pwsan.com]
> Sent: Friday, February 03, 2012 7:00 PM

> > One irritation was some internal interrupt sources were not linked to
> > low power wakeup events. If you were in interrupt mode and got
> > characters below watermark you could sleep before interrupt status
> > showed up (as you had to wait several frame times before functional
> > interrupt asserted) but there was no wake at anticipated frame timeout
> > because lack of linking of internal event to wake event.
> 
> Indeed, it seems that we are just now working around these wakeup-related
> bugs.  Kind of surprising that no errata showed up for them.

There have been errata over time in this area. Several I hit were updated at 3630 time. UART did get IER2 but I don't recall all details for UART.  Probably that is not being used.

> What's particularly remarkable is that it looks like the UARTs will
> idle-ack while their transmit FIFOs have data in them (!)

Generally a module can ACK its ICLK if it is not used internally. The FCLK can push data out without ICLK and is controlled separately always (omap4 changed encoding, to optional clock). This allows interconnect to idle during tx to save power. The trick is to ensure all module wakeup plumbing is enabled so a functional tx irq will flow.  Audits last showed several drivers missing steps (omap specific). Some drivers seemed to rely on static dependencies or coincident neighbor activity to allow their functional interrupt to flow... too many interdependent custom details... and yes some errata.

Anyway, even with all SOC specific wake bits you may lose the character with the latency of restart. The point I was raising was that the external chip hookup cannot be neglected, as it's part of the equation.

Regards,
Richard W.
Paul Walmsley Feb. 4, 2012, 2:12 a.m. UTC | #29
On Fri, 3 Feb 2012, Paul Walmsley wrote:

> Will also give the CLOCKACTIVITY bits a quick test.

... which doesn't help.  So, software workaround it is.


- Paul
NeilBrown Feb. 4, 2012, 2:31 a.m. UTC | #30
On Sat, 4 Feb 2012 00:23:09 +0000 "Woodruff, Richard" <r-woodruff2@ti.com>
wrote:

> 
> > From: linux-omap-owner@vger.kernel.org [mailto:linux-omap-
> > owner@vger.kernel.org] On Behalf Of NeilBrown
> 
> > > Not sure if it's the same problem but with 3530 on 3.2 with
> > > sleep_timeout set, I usually get first char dropped (as expected) but
> > > sometimes I get corrupted char instead too. Corrupt char seems to
> > > almost always happen if I set cpufreq to powersave, on performance it's
> > > almost always ok, so maybe it's some timing problem,
> > 
> > I see that too - I'm glad someone else does :-)
> 
> When you have aggressive PM working at the SOC level, you have many times lost a character on the UART, ever since OMAP2. A strange but true statement: it is nice to see it losing a character on mainline, as it is an indication that PM is likely working.
> 
> If you just hook up simple RX and TX lines and no other flow control, it is very likely, especially with older OMAPs, that you can lose the 'wake' character on the debug console. The UART operates on a clock derived from a 96MHz DPLL which was probably stopped. When the wakeup event hits the IO ring, many internals may need to repower and the source DPLL needs to relock. This can all take a while and you can lose the start bit at a high baud rate. If you use flow control you might be able to get ahead of it.

So... if flow control is available, then when we idle the uart we should set
the trigger so that RTS is de-asserted as soon as one character arrives.
That would minimise the number of corrupt characters we receive and ensure we
resync as early as possible (I have seen 2 corrupt characters when CR,NL
arrive back-to-back.  Neither gets through correctly).

Actually ... could we make the off-mode setting of the RTS pin be "ready to
send", but as soon as we wake up, it is reset to "don't send now" until
everything is properly awake and configured?  That should ensure only one
byte is lost.


> Outside of debug console, this loss has not been huge. Protocols like irda would retransmit their magic wake packets. You can move between DMA and interrupt modes with activity. So far there has been a work around per attached device.


What about bluetooth?  HCI/UART doesn't seem to have a lot of error
handling.  Maybe it has enough though.
(I have bluetooth on UART1 ... of course we might not have the same problems
on UART1 .. I haven't played with bluetooth much yet).

Thanks for the insights,

NeilBrown
Paul Walmsley Feb. 4, 2012, 2:39 a.m. UTC | #31
On Sat, 4 Feb 2012, Woodruff, Richard wrote:

> There have been errata over time in this area. Several I hit were 
> updated at 3630 time. UART did get IER2 but I don't recall all details 
> for UART.  Probably that is not being used.

Govindraj sent an RFC patch a few days ago to add IER2, which is good, but 
we're still awaiting the followup patch for it.

> > From: Paul Walmsley [mailto:paul@pwsan.com]
> > Sent: Friday, February 03, 2012 7:00 PM
> 
> > What's particularly remarkable is that it looks like the UARTs will
> > idle-ack while their transmit FIFOs have data in them (!)
> 
> Generally a module can ACK its ICLK if it is not used internally. The 
> FCLK can push data out without ICLK and is controlled separately always 
> (omap4 changed encoding, to optional clock). This allows interconnect to 
> idle during tx to save power. 

Yep, that's a good point.  Unfortunately the PER has a hardware sleep 
dependency with the CORE_L3 clockdomain on OMAP3... so I'm not sure how 
much power we'd be able to save.  Perhaps some: it appears that the UART3 
functional clock comes from the CORE_L4 clockdomain.  So it might be worth 
implementing some extra intelligence here.  The kernel code is disabling 
both the ICLK and the FCLK simultaneously, so that may not be optimal in 
this situation.

In the short-term, on the kernel side, we should just keep the PM runtime 
count non-zero when the UART is transmitting.  Since we can get an 
interrupt when the TX is done, or close to being done anyway, we can just 
disable the clocks at that point.  Not ideal, but should work.

> The trick is to ensure all module wakeup plumbing is enabled so a 
> functional tx irq will flow.  Audits last showed several drivers missing 
> steps (omap specific). Some drivers seemed to rely on static 
> dependencies or coincident neighbor activity to allow their functional 
> interrupt to flow... too many interdependent custom details... and yes 
> some errata.

Yeah.  I think we've got an acceptable workaround for the missing TX 
wakeup problem.  And we've got a somewhat unpleasant workaround for the 
missing RX timeout wakeup problem.   Now we just need to put together a 
strategy for the idle-during-TX problem...

> Anyway, even with all SOC specific wake bits you may lose the character 
> with the latency of restart. The point I was raising was that the external 
> chip hookup cannot be neglected, as it's part of the equation.

Indeed.

Thanks for the info -- it's always nice to see your posts on the lists --


- Paul
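
A rough sketch of that short-term idea (not a tested patch; the names
follow the existing omap-serial driver, and exactly where the reference
should be dropped is still an open question):

static void serial_omap_start_tx(struct uart_port *port)
{
	struct uart_omap_port *up = (struct uart_omap_port *)port;

	/* hold a runtime PM reference for as long as a transmit is pending */
	pm_runtime_get_sync(&up->pdev->dev);
	/* ... existing code: enable THRI / set up DMA ... */
}

/* then, in the TX interrupt path, once the FIFO and shift register are
 * empty (the SCR TX_EMPTY behaviour discussed earlier), release it: */
	pm_runtime_mark_last_busy(&up->pdev->dev);
	pm_runtime_put_autosuspend(&up->pdev->dev);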
NeilBrown Feb. 4, 2012, 3:09 a.m. UTC | #32
On Fri, 3 Feb 2012 19:06:19 -0700 (MST) Paul Walmsley <paul@pwsan.com> wrote:

> Hi Neil
> 
> On Sat, 4 Feb 2012, NeilBrown wrote:
> 
> > Guess what happens if I set autosuspend_delay_ms to 0?
> > Massive transmit problems.  Driver can hardly get anything out before the
> > UART's fclk is cut...
> 
> Just reproduced this on 35xx BeagleBoard.  Looks like the UART is indeed 
> going idle while the TX FIFO has bytes in it.

That makes me happy :-)

> 
> Here's a patch that helps.  It seems to work down to an 
> autosuspend_delay_ms of 1 ms.  Without it, the best I can get is 8 ms.
> 
> Of course, ideally it should work fine at autosuspend_delay_ms = 0, so 
> likely there's some other infelicity that we're currently missing.
> 
> Neil, care to give this a test and confirm it on your setup?

Yes, that seems to make the output corruption go away.

Even with small autosuspend_delay_ms down to 0 it doesn't corrupt output,
but as the first input byte is corrupted, I cannot really type with those
setting (so I ssh to gain control again).

The patch disables the IDLEMODE_SMART setting that happens on runtime
suspend/resume so that the IDLEMODE_NO setting stays in force.

So it clearly isn't "stopping the clocks" that is the problem - as I first
imagined - but rather the SIDLE handshake isn't doing what we think it should
do.

Thanks,
NeilBrown
Paul Walmsley Feb. 4, 2012, 3:16 a.m. UTC | #33
On Sat, 4 Feb 2012, NeilBrown wrote:

> On Fri, 3 Feb 2012 19:06:19 -0700 (MST) Paul Walmsley <paul@pwsan.com> wrote:
> > 
> > Here's a patch that helps.  It seems to work down to an 
> > autosuspend_delay_ms of 1 ms.  Without it, the best I can get is 8 ms.
> > 
> > Of course, ideally it should work fine at autosuspend_delay_ms = 0, so 
> > likely there's some other infelicity that we're currently missing.
> > 
> > Neil, care to give this a test and confirm it on your setup?
> 
> Yes, that seems to make the output corruption go away.

Cool, thanks for the test :-)

> Even with small autosuspend_delay_ms down to 0 it doesn't corrupt output,
> but as the first input byte is corrupted, I cannot really type with those
> setting (so I ssh to gain control again).

Could you try pasting in a buffer from another window?  If I paste in the 
buffer at the bottom of this message a few times, I see some output 
corruption. 


- Paul


 
;
;
;
;
cat  /sys/devices/platform/omap/omap_uart.2/power/autosuspend_delay_ms
echo 0 > /sys/devices/platform/omap/omap_uart.2/power/autosuspend_delay_ms
cat /proc/interrupts
;
NeilBrown Feb. 4, 2012, 3:43 a.m. UTC | #34
On Fri, 3 Feb 2012 20:16:08 -0700 (MST) Paul Walmsley <paul@pwsan.com> wrote:

> On Sat, 4 Feb 2012, NeilBrown wrote:
> 
> > On Fri, 3 Feb 2012 19:06:19 -0700 (MST) Paul Walmsley <paul@pwsan.com> wrote:
> > > 
> > > Here's a patch that helps.  It seems to work down to an 
> > > autosuspend_delay_ms of 1 ms.  Without it, the best I can get is 8 ms.
> > > 
> > > Of course, ideally it should work fine at autosuspend_delay_ms = 0, so 
> > > likely there's some other infelicity that we're currently missing.
> > > 
> > > Neil, care to give this a test and confirm it on your setup?
> > 
> > Yes, that seems to make the output corruption go away.
> 
> Cool, thanks for the test :-)
> 
> > Even with small autosuspend_delay_ms down to 0 it doesn't corrupt output,
> > but as the first input byte is corrupted, I cannot really type with those
> > setting (so I ssh to gain control again).
> 
> Could you try pasting in a buffer from another window?  If I paste in the 
> buffer at the bottom of this message a few times, I see some output 
> corruption. 

I have to set autosuspend_delay_ms for omap_uart.3 as well before the
behaviour is significant.
But then I see no output corruption.  Lots of input corruption of course but
the output looks fine.

NeilBrown


> 
> 
> - Paul
> 
> 
>  
> ;
> ;
> ;
> ;
> cat  /sys/devices/platform/omap/omap_uart.2/power/autosuspend_delay_ms
> echo 0 > /sys/devices/platform/omap/omap_uart.2/power/autosuspend_delay_ms
> cat /proc/interrupts
> ;
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-omap" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
Paul Walmsley Feb. 4, 2012, 3:56 a.m. UTC | #35
On Sat, 4 Feb 2012, NeilBrown wrote:

> I have to set autosuspend_delay_ms for omap_uart.3 as well before the
> behaviour is significant.
> But then I see no output corruption.  Lots of input corruption of course but
> the output looks fine.

OK.  Is the input corruption at the beginning of the pasted buffer, or the 
middle?  And this is with CPUIdle enabled?

With CPUIdle disabled here, what I thought was output corruption occurs in 
the middle of the pasted buffer occasionally.  But it might be input 
corruption, if the CPU manages to empty the RX FIFO while the TX FIFO is 
empty.


- Paul
NeilBrown Feb. 4, 2012, 4:17 a.m. UTC | #36
On Fri, 3 Feb 2012 20:56:07 -0700 (MST) Paul Walmsley <paul@pwsan.com> wrote:

> On Sat, 4 Feb 2012, NeilBrown wrote:
> 
> > I have to set autosuspend_delay_ms for omap_uart.3 as well before the
> > behaviour is significant.
> > But then I see no output corruption.  Lots of input corruption of course but
> > the output looks fine.
> 
> OK.  Is the input corruption at the beginning of the pasted buffer, or the 
> middle?  And this is with CPUIdle enabled?
> 
> With CPUIdle disabled here, what I thought was output corruption occurs in 
> the middle of the pasted buffer occasionally.  But it might be input 
> corruption, if the CPU manages to empty the RX FIFO while the TX FIFO is 
> empty.
> 
> 
> - Paul

Yes, cpu-idle is enabled.

I think corruption is mostly early, though it isn't very consistent.

e.g.

# C!jHhzys/Y?omap/omap_uart.2/power/autosuspend_delay_ms
-bash: !jHhzys/Y?omap/omap_uart.2/power/autosuspend_delay_ms: event not found
# echo 0 > /sys/devices/platFK/////mpWWt.]au%e_mHHHhQ 5
-bash: /sys/devices/platFK/////mpWWt.]au%e_mHHHhQ: No such file or directory


NeilBrown
Gražvydas Ignotas Feb. 4, 2012, 4 p.m. UTC | #37
On Fri, Feb 3, 2012 at 9:42 PM, Paul Walmsley <paul@pwsan.com> wrote:
> On Fri, 3 Feb 2012, Grazvydas Ignotas wrote:
>> On Fri, Feb 3, 2012 at 11:54 AM, NeilBrown <neilb@suse.de> wrote:
>> > Maybe it is 37xx specific.  I think this is a DM3730.
>>
>> Not sure if it's the same problem but with 3530 on 3.2 with
>> sleep_timeout set, I usually get first char dropped (as expected) but
>> sometimes I get corrupted char instead too. Corrupt char seems to
>> almost always happen if I set cpufreq to powersave, on performance it's
>> almost always ok, so maybe it's some timing problem,
>
> OK so let's distinguish between two corruption situations:
>
> 1. The first character transmitted to the OMAP UART in a serial console
> when the UART powerdomain is in a non-functional, low power state (e.g.,
> RET or below) is corrupted.  This is not actually output corruption, this
> is input corruption.
>
> 2. Characters are corrupted while the OMAP UART is transmitting data, but
> there has been no recent data sent to the OMAP.
>
> Case 1 is expected and is almost certainly not a bug.  As Neil mentioned
> it should be bps-rate dependent.  It occurs when the first character
> transmitted to the OMAP wakes the chip up via I/O ring/chain wakeup.
> I/O ring/chain wakeup is driven by a 32KiHz clock and is therefore
> relatively high-latency.  So this could easily cause the first character
> or first few characters to be lost or corrupted, depending on the exact
> sequence of events, the low power state that the chip was in, etc.
>
> Case 2 is not expected.  That is likely a bug somewhere.  Neil, this is
> what I understood that you are experiencing.  Is that correct?
>
> Gražvydas, are you seeing case 1 or 2 (or something completely different
> ;-) ?

It's case 1. What I wanted to say is that first char is most often
nicely dropped and does not get into the terminal, so I can just type
the command after it. But in some cases terminal gets corrupted char
instead, so I must then first get rid of it somehow to successfully
send a command, which is annoying a bit. I thought that maybe there is
code somewhere that gets rid of first bad char received and maybe it
can be tuned, but judging on further discussions it's all done by
hardware?

I've also noticed if I paste a command instead, up to 3 characters can
be lost, and in some cases I get 3 corrupted chars there instead. I
paste a command to both wake the board and read the fuel gauge just
before it updates to see how much current board was draining while
suspended. I insert 3 spaces at the start of command to be eaten by
wakeup, but if it decides to corrupt those chars instead of dropping,
the whole command is ruined. It's all at 115200 baud rate.
Paul Walmsley Feb. 4, 2012, 4:31 p.m. UTC | #38
On Sat, 4 Feb 2012, Grazvydas Ignotas wrote:

> It's case 1. What I wanted to say is that first char is most often
> nicely dropped and does not get into the terminal, so I can just type
> the command after it. But in some cases terminal gets corrupted char
> instead, so I must then first get rid of it somehow to successfully
> send a command, which is annoying a bit. I thought that maybe there is
> code somewhere that gets rid of first bad char received and maybe it
> can be tuned, but judging on further discussions it's all done by
> hardware?
> 
> I've also noticed if I paste a command instead, up to 3 characters can
> be lost, and in some cases I get 3 corrupted chars there instead. I
> paste a command to both wake the board and read the fuel gauge just
> before it updates to see how much current board was draining while
> suspended. I insert 3 spaces at the start of command to be eaten by
> wakeup, but if it decides to corrupt those chars instead of dropping,
> the whole command is ruined. It's all at 115200 baud rate.

Aside from trying some of the muxing suggestions that Neil proposed, 
perhaps the UART driver should clear the RX FIFO if the UART detects a 
framing error?  e.g., section 17.4.4.1.3.5 "Error Detection" in the
34xx TRM vZT.


- Paul
Russell King - ARM Linux Feb. 4, 2012, 4:39 p.m. UTC | #39
On Sat, Feb 04, 2012 at 06:00:56PM +0200, Grazvydas Ignotas wrote:
> It's case 1. What I wanted to say is that first char is most often
> nicely dropped and does not get into the terminal, so I can just type
> the command after it. But in some cases terminal gets corrupted char
> instead, so I must then first get rid of it somehow to successfully
> send a command, which is annoying a bit. I thought that maybe there is
> code somewhere that gets rid of first bad char received and maybe it
> can be tuned, but judging on further discussions it's all done by
> hardware?

If it's the case that the UART bitclock is derived from a PLL which is
shutdown while idle, then no matter which way you look at it - if the
port is open, and therefore the port is expecting to transfer data,
the port must _never_ be allowed to go into any low power state, period.

If it does, then the PLL stops, and it takes time for the PLL to re-lock.
That time will cause a character to be dropped, which is exactly what
people are reporting in this thread.

Moreover, if that then means that the OMAP CPU cores can't be put into a
low power state, then that's the hit that _has_ to be taken because of
the design of the hardware.

It is entirely unacceptable to drop characters on a serial port through
the use of PM.  Many serial protocols just will not cope with that kind
of behaviour - yes, serial protocols may have retry built-in, but will
they retry _before_ the port re-idles and the PLL shuts down?  Can you
be sure?

If you can't, then you can't do PM in this area while the port is open.
Runtime power management is _supposed_ to be transparent.  If it isn't,
it's a bug plain and simple, which blocks the ability for the device to
even _use_ runtime power management.

There's absolutely no argument here.  OMAP's hardware auto idle on the
UART which results in characters being dropped is quite clearly broken.

So, what I suggest is reverting back to standard FIFO thresholds, and
then doing the PM in software: if the kernel transmit buffer holds
characters, or the device FIFO contains characters, PM on the transmit
side must be denied.  If the port is _open_, PM on the receive side
must be denied.  If you don't have a distinction between the transmit
and receive sides, then that becomes a very simple rule: if the port is
open, runtime PM of the serial port is denied and the port must remain
active all the time that it's open.  It's that simple, no ifs or buts.

Anything else, which results in characters lost, is buggy.
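
Expressed against the existing driver, that simple rule is essentially a
runtime PM reference held across open and close -- a sketch only, not a
submitted patch:

static int serial_omap_startup(struct uart_port *port)
{
	struct uart_omap_port *up = (struct uart_omap_port *)port;

	/* port is being opened: deny runtime PM until it is closed again */
	pm_runtime_get_sync(&up->pdev->dev);
	/* ... existing IRQ/FIFO setup ... */
	return 0;
}

static void serial_omap_shutdown(struct uart_port *port)
{
	struct uart_omap_port *up = (struct uart_omap_port *)port;

	/* ... existing teardown ... */
	pm_runtime_put_sync(&up->pdev->dev);	/* port closed: PM allowed again */
}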
Paul Walmsley Feb. 4, 2012, 4:49 p.m. UTC | #40
On Sat, 4 Feb 2012, Russell King - ARM Linux wrote:

> If you can't, then you can't do PM in this area while the port is open.
> Runtime power management is _supposed_ to be transparent.  If it isn't,
> it's a bug plain and simple, which blocks the ability for the device to
> even _use_ runtime power management.
> 
> There's absolutely no argument here.  OMAP's hardware auto idle on the
> UART which results in characters being dropped is quite clearly broken.
> 
> So, what I suggest is reverting back to standard FIFO thresholds, and
> then doing the PM in software: if the kernel transmit buffer holds
> characters, or the device FIFO contains characters, PM on the transmit
> side must be denied.  If the port is _open_, PM on the receive side
> must be denied.  If you don't have a distinction between the transmit
> and receive sides, then that becomes a very simple rule: if the port is
> open, runtime PM of the serial port is denied and the port must remain
> active all the time that it's open.  It's that simple, no ifs or buts.
> 
> Anything else, which results in characters lost, is buggy.

There is indeed an argument here.  The decision of how to act in this 
situation needs to be up to the user of the serial port.

The default behavior needs to be what you state: to not lose characters.  
And indeed that is what it is in v3.3: the UART will not enter idle when 
the PM runtime autosuspend timeout is -1.  

But in cases where there is a protocol that can handle retries, the system 
integrator may well prefer the large power savings available by letting 
the chip enter device idle, and take the added delay in the retransmission 
process.

As in many power management situations, the choice needs to be up to the 
user of the serial port or the system administrator, with the default 
mode being to not lose data.  We must not remove that choice from them, 
otherwise they will just hack it in later.


- Paul
Paul Walmsley Feb. 4, 2012, 4:55 p.m. UTC | #41
On Sat, 4 Feb 2012, Paul Walmsley wrote:

> The default behavior needs to be what you state: to not lose characters.  
> And indeed that is what it is in v3.3: the UART will not enter idle when 
> the PM runtime autosuspend timeout is -1.  

One technical correction on this section -- rather, the CORE* clockdomains 
will not be allowed to go idle when the PM runtime autosuspend timeout is 
-1.  And that is what causes the character loss.  The rest of the E-mail 
stands.

The question of whether the UART should be allowed to enter idle is a 
different one, since the current driver is not properly working in this 
regard, which is why we're not getting RX timeout events, etc.  We should 
be able to have the proper RX timeout behavior, and therefore use normal 
FIFO thresholds, by fixing some of the bugs in the existing driver.


- Paul
Russell King - ARM Linux Feb. 4, 2012, 4:57 p.m. UTC | #42
On Sat, Feb 04, 2012 at 09:31:29AM -0700, Paul Walmsley wrote:
> Aside from trying some of the muxing suggestions that Neil proposed, 
> perhaps the UART driver should clear the RX FIFO if the UART detects a 
> framing error?  e.g., section 17.4.4.1.3.5 "Error Detection" in the
> 34xx TRM vZT.

Paul, do you have a desire to totally destroy serial ports on OMAP,
because these kinds of suggestions are where you're going with it.
You're also in danger of making the OMAP serial ports not conform to
POSIX conventions.

Errors in received characters are already dealt with.  Depending on the
termios settings, the driver should be detecting framing errors, and
discarding characters with framing errors.

And no amount of FIFO clearing recovers from a framing error.  Consider
this as the transmitted bitstream:

0x67 0x66 0x66 0x66 0x66
S01234567TS01234567TS01234567TS01234567TS01234567T
01110011010011001101001100110100110011010011001101

This can be successfully received without framing errors here, and you'll
never be the wiser.

S01234567TS01234567TS01234567TS01234567TS01234567T
0011010011001101001100110100110011010011001101

So, this gets received as 0x96 0x96 0x96 0x96.  You have no idea that the
stream you're receiving is incorrect - it appears to be correctly framed.

S01234567TS01234567TS01234567TS01234567TS01234567T
010011001101001100110100110011010011001101

Or maybe 0x99 0x99 0x99 0x99 ... again, no framing error.

Take a different pattern - 'linux\n':
0x6c 0x69 0x6e 0x75 0x78 0x0a
S01234567TS01234567TS01234567TS01234567TS01234567TS01234567T
000110110101001011010011101101010101110100001111010010100001

This could be received as:
S01234567T S01234567T   S01234567T   S01234567TS01234567TS01234567T
00110110101001011010011101101010101110100001111010010100001
         *          *            *                      *
0110110101 001011010011101101010101110100001111010010100001
                    *            *                      *
0110101001 0110100111   01101010101110100001111010010100001
                                 *                      *

where the '*' indicates where we end up with framing errors.  Again,
we have some characters which appear to be received validly, others
which aren't.

Let's take the last case.  You've received 0x2b 0xcb before you've
discovered a framing error.  You can't wipe out those two characters
you've already delivered to the upper layers - and it would be totally
incorrect to do so.  The framing error might be a single bit error.

So please, stop trying to bastardize this stuff by coming up with mad
work-arounds which just make things worse for this obviously broken
hardware.

The fact of the matter is - if idling the serial port results in the
clocks being stopped, and you can't _instantly_ restart the clocks when
the RXD line first goes low to start sampling the first bit of the
transmission, you _absolutely_ _can_ _not_ allow the port to have its
clock stopped all the time which you expect to receive a character from
it.  No ifs or buts.  You can't.  There's no getting away from that.
There's no work-around.  There's nothing you can do with a framing error
to solve it.  Emptying the FIFO won't help (as I've shown above with
the valid characters received before the framing error).

So, just admit that this is broken and don't allow the port to be idled
while it's in use.
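
The reframing effect described above is easy to reproduce with a few
lines of host-side C: encode bytes as 8N1 frames (start bit 0, eight
data bits LSB first, stop bit 1), throw away the first few bits -- as a
receiver whose clock starts late effectively does -- and decode what is
left.  Purely illustrative; the byte values are the ones from the
example above.

#include <stdio.h>

static int bits[512];
static int nbits;

static void encode(unsigned char c)
{
	int i;

	bits[nbits++] = 0;			/* start bit */
	for (i = 0; i < 8; i++)
		bits[nbits++] = (c >> i) & 1;	/* data bits, LSB first */
	bits[nbits++] = 1;			/* stop bit */
}

static void decode(int skip)
{
	int i = skip;

	printf("skip %d bits:", skip);
	while (i + 10 <= nbits) {
		int j, c = 0;

		if (bits[i] != 0) {		/* hunt for a start bit */
			i++;
			continue;
		}
		for (j = 0; j < 8; j++)
			c |= bits[i + 1 + j] << j;
		if (bits[i + 9] == 1)
			printf(" 0x%02x", c);
		else
			printf(" 0x%02x(FE)", c);	/* framing error */
		i += 10;
	}
	printf("\n");
}

int main(void)
{
	unsigned char msg[] = { 0x67, 0x66, 0x66, 0x66, 0x66 };
	int i, skip;

	for (i = 0; i < 5; i++)
		encode(msg[i]);
	for (skip = 0; skip <= 6; skip++)
		decode(skip);		/* skip 0 = nothing lost */
	return 0;
}

With nothing lost the decoder prints 0x67 0x66 0x66 0x66 0x66; with a
few leading bits dropped it prints runs of 0x96 or 0x99 with no framing
error at all, matching the bitstreams above, and "(FE)" marks the cases
where a framing error does show up.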
Russell King - ARM Linux Feb. 4, 2012, 5:01 p.m. UTC | #43
On Sat, Feb 04, 2012 at 09:49:57AM -0700, Paul Walmsley wrote:
> There is indeed an argument here.  The decision of how to act in this 
> situation needs to be up to the user of the serial port.
> 
> The default behavior needs to be what you state: to not lose characters.  
> And indeed that is what it is in v3.3: the UART will not enter idle when 
> the PM runtime autosuspend timeout is -1.  
> 
> But in cases where there is a protocol that can handle retries, the system 
> integrator may well prefer the large power savings available by letting 
> the chip enter device idle, and take the added delay in the retransmission 
> process.

Rubbish.  Let's say I hook an OMAP platform up to a GPS, and the system
integrator has decided to set the idle timeout on all UARTs to .5 sec.
The GPS transmits data every second.  Yes, it effectively retries each
second, but there's no way to receive its complete transmission _ever_.

So, if this is allowed, OMAP is broken.  Plain and simple.

> As in many power management situations, the choice needs to be up to the 
> user of the serial port or the system administrator, with the default 
> mode being to not lose data.  We must not remove that choice from them, 
> otherwise they will just hack it in later.

And then they'll whinge that they get lost characters and abandon the
attempt.  Good, they learnt that the hardware is broken, and they
learnt why it's not implemented.
Paul Walmsley Feb. 4, 2012, 5:22 p.m. UTC | #44
On Sat, 4 Feb 2012, Russell King - ARM Linux wrote:

> On Sat, Feb 04, 2012 at 09:49:57AM -0700, Paul Walmsley wrote:
> > There is indeed an argument here.  The decision of how to act in this 
> > situation needs to be up to the user of the serial port.
> > 
> > The default behavior needs to be what you state: to not lose characters.  
> > And indeed that is what it is in v3.3: the UART will not enter idle when 
> > the PM runtime autosuspend timeout is -1.  
> > 
> > But in cases where there is a protocol that can handle retries, the system 
> > integrator may well prefer the large power savings available by letting 
> > the chip enter device idle, and take the added delay in the retransmission 
> > process.
> 
> Rubbish.  Let's say I hook an OMAP platform up to a GPS, and the system
> integrator has decided to set the idle timeout on all UARTs to .5 sec.
> The GPS transmits data every second.  Yes, it effectively retries each
> second, but there's no way to receive its complete transmission _ever_.

No, that is not an example of a protocol with a retry.  That is an example 
of a protocol that has no provision for reliable data delivery.  Sending a 
new data string one second later is not a retry.

In such situations, the system integrator would just use the UART in the 
default (lossless) mode.  And if they don't, they'll have to deal with the 
consequences that they chose.  Those of us who ship battery-powered Linux 
devices are indeed capable of making this choice.

One could argue that the PM runtime autosuspend timeout is not the 
appropriate place to change this setting, and that it should be somewhere 
else.  That's fine.  But that's a separate issue from removing the 
functionality completely.


- Paul
Paul Walmsley Feb. 4, 2012, 5:32 p.m. UTC | #45
On Sat, 4 Feb 2012, Russell King - ARM Linux wrote:

> [detailed discussion of framing errors]

Thanks for the detailed description.  If the driver is in fact discarding 
characters with framing errors -- which I have not personally verified -- 
then taking further action there is pointless.


- Paul
Russell King - ARM Linux Feb. 4, 2012, 5:47 p.m. UTC | #46
On Sat, Feb 04, 2012 at 10:22:27AM -0700, Paul Walmsley wrote:
> No, that is not an example of a protocol with a retry.  That is an example 
> of a protocol that has no provision for reliable data delivery.  Sending a 
> new data string one second later is not a retry.
> 
> In such situations, the system integrator would just use the UART in the 
> default (lossless) mode.  And if they don't, they'll have to deal with the 
> consequences that they chose.  Those of us who ship battery-powered Linux 
> devices are indeed capable of making this choice.

Okay, lets see.  You're making a battery powered Linux device.  It has
a standard RS232 serial port available, and you allow users to load
'apps' onto it.

Do you run the serial ports in lossless mode?
Russell King - ARM Linux Feb. 4, 2012, 5:55 p.m. UTC | #47
On Sat, Feb 04, 2012 at 10:32:16AM -0700, Paul Walmsley wrote:
> On Sat, 4 Feb 2012, Russell King - ARM Linux wrote:
> 
> > [detailed discussion of framing errors]
> 
> Thanks for the detailed description.  If the driver is in fact discarding 
> characters with framing errors -- which I have not personally verified -- 
> then taking further action there is pointless.

Paul, I know you don't particularly like me getting involved with OMAP
issues, but tough.  You don't seem to understand some of these issues so
you're going to get more explanation.

And you didn't get the point, and why I included the detailed illustration.
No, the character with a framing error is not discarded out of the FIFO.
It is kept in the FIFO along with its corresponding error bits.

When that character comes to the top of the FIFO, the error bits for that
character become available.

When we read the character, we also read the error bits.  If the termios
settings are such that _userspace_ asks for characters with errors to be
ignored, then even if the UART received a character with a framing error,
the bit pattern that it received _must_ still be passed to userspace.

Userspace can also ask for characters in error to be received, but marked
with special markers.

Userspace can also ask for characters in error to be discarded.

Same with parity errors and the like.

But, the thing which really hurts is that you've totally and utterly
failed to understand my point that it is possible to receive characters
in a serial stream _without_ errors but for the received characters to
be completely different from what was transmitted - and there's nothing
you can do about that.

That is why shutting down the clock whenever you're expecting to receive
a character is totally and utterly the wrong thing to be doing in any
case whatsoever, even if your protocol retries.
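
For completeness, the ignore/mark/discard choices described above are
selected from userspace with standard termios input flags.  A minimal
sketch (the device path is just an example, and only one of the three
behaviours is applied here):

#include <fcntl.h>
#include <termios.h>
#include <unistd.h>

int main(void)
{
	struct termios tio;
	int fd = open("/dev/ttyO2", O_RDWR | O_NOCTTY);

	if (fd < 0 || tcgetattr(fd, &tio) < 0)
		return 1;

	/* discard bytes that arrived with framing or parity errors */
	tio.c_iflag |= INPCK | IGNPAR;
	tio.c_iflag &= ~PARMRK;

	/* alternatively:
	 *   mark bad bytes:      c_iflag |= INPCK | PARMRK;  c_iflag &= ~IGNPAR;
	 *   deliver everything:  c_iflag &= ~(INPCK | IGNPAR | PARMRK);
	 */

	if (tcsetattr(fd, TCSANOW, &tio) < 0)
		return 1;
	close(fd);
	return 0;
}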
Tony Lindgren Feb. 4, 2012, 6:59 p.m. UTC | #48
* Russell King - ARM Linux <linux@arm.linux.org.uk> [120204 09:16]:
> On Sat, Feb 04, 2012 at 10:22:27AM -0700, Paul Walmsley wrote:
> > No, that is not an example of a protocol with a retry.  That is an example 
> > of a protocol that has no provision for reliable data delivery.  Sending a 
> > new data string one second later is not a retry.
> > 
> > In such situations, the system integrator would just use the UART in the 
> > default (lossless) mode.  And if they don't, they'll have to deal with the 
> > consequences that they chose.  Those of us who ship battery-powered Linux 
> > devices are indeed capable of making this choice.
> 
> Okay, lets see.  You're making a battery powered Linux device.  It has
> a standard RS232 serial port available, and you allow users to load
> 'apps' onto it.
> 
> Do you run the serial ports in lossless mode?

The default should always be the lossless mode. If the clocks for the
serial port are cut off based on a timer, the timer should be port
specific, and default to 0.

Then if some app using the port wants to intentionally enable automatic
disabling of the clocks, it can still do it.

Regards,

Tony
Paul Walmsley Feb. 4, 2012, 7:24 p.m. UTC | #49
On Sat, 4 Feb 2012, Russell King - ARM Linux wrote:

> On Sat, Feb 04, 2012 at 10:22:27AM -0700, Paul Walmsley wrote:
> > No, that is not an example of a protocol with a retry.  That is an example 
> > of a protocol that has no provision for reliable data delivery.  Sending a 
> > new data string one second later is not a retry.
> > 
> > In such situations, the system integrator would just use the UART in the 
> > default (lossless) mode.  And if they don't, they'll have to deal with the 
> > consequences that they chose.  Those of us who ship battery-powered Linux 
> > devices are indeed capable of making this choice.
> 
> Okay, lets see.  You're making a battery powered Linux device.  It has
> a standard RS232 serial port available, and you allow users to load
> 'apps' onto it.
> 
> Do you run the serial ports in lossless mode?

Not every serial port is available to arbitrary 'apps.'.  Not every 
battery-powered Linux device allows users to run arbitrary 'apps.'

On devices that do allow users to load arbitrary 'apps,' and that allow 
those 'apps' to have direct access to the serial ports, I personally 
believe that system integrators should not change the default OMAP serial 
setting, which is to run the serial ports in lossless mode.

Here is another example.  Suppose someone builds a GPS receiver with an 
OMAP that is capable of sending NMEA position sentences, once per second, 
to a remotely connected serial device.  No receive traffic is expected on 
that port.

The position you seem to be advocating is that the mainline Linux kernel 
should not support any ability to allow the system integrator to 
affirmatively instruct the SoC to enter device idle between those position 
sentences.  This will cause the SoC to consume energy to losslessly 
handle an incoming serial character that will never come.  Is that really 
what you're advocating?


- Paul
Paul Walmsley Feb. 4, 2012, 7:37 p.m. UTC | #50
On Sat, 4 Feb 2012, Russell King - ARM Linux wrote:

> On Sat, Feb 04, 2012 at 10:32:16AM -0700, Paul Walmsley wrote:
> > On Sat, 4 Feb 2012, Russell King - ARM Linux wrote:
> > 
> > > [detailed discussion of framing errors]
> > 
> > Thanks for the detailed description.  If the driver is in fact discarding 
> > characters with framing errors -- which I have not personally verified -- 
> > then taking further action there is pointless.
> 
> Paul, I know you don't particularly like me getting involved with OMAP
> issues, but tough.  You don't seem to understand some of these issues so
> you're going to get more explanation.

Hehe.  Oh, hurt me again with more explanation, please ;-)  I can't take 
it ;-)  I happen to enjoy many of your technical explanations.  Doesn't 
necessarily mean that we're always in agreement, though.

As for the part about not wanting you involved with OMAP, that's an 
interesting perspective, considering I mentioned to you at ELC-E last year 
my appreciation of your technical review on the lists.  And you'll 
probably note, if you care to review the lists, that many of my E-mail 
responses express gratitude for your comments... including the one you 
quoted.

Of course the snarky, personal bits can be unnecessarily irritating, but 
anyone who's in this line of work just has to deal with them, it seems.

So if you want to believe otherwise, there's nothing I can do to control 
that.  But you should represent this as your personal perspective, and 
not as someone else's, e.g., mine.


- Paul
Russell King - ARM Linux Feb. 4, 2012, 8:07 p.m. UTC | #51
On Sat, Feb 04, 2012 at 12:24:07PM -0700, Paul Walmsley wrote:
> On Sat, 4 Feb 2012, Russell King - ARM Linux wrote:
> 
> > On Sat, Feb 04, 2012 at 10:22:27AM -0700, Paul Walmsley wrote:
> > > No, that is not an example of a protocol with a retry.  That is an example 
> > > of a protocol that has no provision for reliable data delivery.  Sending a 
> > > new data string one second later is not a retry.
> > > 
> > > In such situations, the system integrator would just use the UART in the 
> > > default (lossless) mode.  And if they don't, they'll have to deal with the 
> > > consequences that they chose.  Those of us who ship battery-powered Linux 
> > > devices are indeed capable of making this choice.
> > 
> > Okay, lets see.  You're making a battery powered Linux device.  It has
> > a standard RS232 serial port available, and you allow users to load
> > 'apps' onto it.
> > 
> > Do you run the serial ports in lossless mode?
> 
> Not every serial port is available to arbitrary 'apps.'.  Not every 
> battery-powered Linux device allows users to run arbitrary 'apps.'
> 
> On devices that do allow users to load arbitrary 'apps,' and that allow 
> those 'apps' to have direct access to the serial ports, I personally 
> believe that system integrators should not change the default OMAP serial 
> setting, which is to run the serial ports in lossless mode.
> 
> Here is another example.  Suppose someone builds a GPS receiver with an 
> OMAP that is capable of sending NMEA position sentences, once per second, 
> to a remotely connected serial device.  No receive traffic is expected on 
> that port.
> 
> The position you seem to be advocating is that the mainline Linux kernel 
> should not support any ability to allow the system integrator to 
> affirmatively instruct the SoC to enter device idle between those position 
> sentences.  This will cause the SoC to consume energy to losslessly 
> handle an incoming serial character that will never come.  Is that really 
> what you're advocating?

Stop procrastinating.  Please answer my question.  Then I'll answer yours.
Russell King - ARM Linux Feb. 5, 2012, 12:16 p.m. UTC | #52
On Sat, Feb 04, 2012 at 12:37:06PM -0700, Paul Walmsley wrote:
> On Sat, 4 Feb 2012, Russell King - ARM Linux wrote:
> 
> > On Sat, Feb 04, 2012 at 10:32:16AM -0700, Paul Walmsley wrote:
> > > On Sat, 4 Feb 2012, Russell King - ARM Linux wrote:
> > > 
> > > > [detailed discussion of framing errors]
> > > 
> > > Thanks for the detailed description.  If the driver is in fact discarding 
> > > characters with framing errors -- which I have not personally verified -- 
> > > then taking further action there is pointless.
> > 
> > Paul, I know you don't particularly like me getting involved with OMAP
> > issues, but tough.  You don't seem to understand some of these issues so
> > you're going to get more explanation.
> 
> Hehe.  Oh, hurt me again with more explanation, please ;-)  I can't take 
> it ;-)  I happen to enjoy many of your technical explanations.  Doesn't 
> necessarily mean that we're always in agreement, though.
> 
> As for the part about not wanting you involved with OMAP, that's an 
> interesting perspective, considering I mentioned to you at ELC-E last year 
> my appreciation of your technical review on the lists.  And you'll 
> probably note, if you care to review the lists, that many of my E-mail 
> responses express gratitude for your comments... including the one you 
> quoted.
> 
> Of course the snarky, personal bits can be unnecessarily irritating, but 
> anyone who's in this line of work just has to deal with them, it seems.

Paul, you're being two-faced.  You can see the reaction that your message
below gained:

"On Sat, 4 Feb 2012, Russell King - ARM Linux wrote:
"> [detailed discussion of framing errors]
"
"Thanks for the detailed description.  If the driver is in fact discarding
"characters with framing errors -- which I have not personally verified --
"then taking further action there is pointless."

which, _to_ _me_, looks like you're trying to get rid of my input from
this problem, and _that_ is extremely irritating.  You talk of irritating,
but maybe you don't realise that you're _just_ as irritating too at times
- and that's not the first time you've attempted to cut off my input in
that way.

Especially when you start making suggestions like throwing away an entire
FIFO load of data when you get a framing error.  I think you have a
fundamental misunderstanding about UARTs or what's required from them.

Now, the fact is that POSIX allows user programs to tell the TTY drivers
what behaviour they want, and it's essentially one of the following:

1. Ignore errors and receive all characters from the UART whether
   they be in error or not.
2. Receive characters in the FIFO and mark characters in error.
3. Receive all properly received characters.

Which errors cause this behaviour can be controlled individually.  Here's
the extract straight from POSIX:

   If IGNBRK is set, a break condition detected on input shall be ignored;
   that is, not put on the input queue and therefore not read by any process.
   If IGNBRK is not set and BRKINT is set, the break condition shall flush
   the input and output queues, and if the terminal is the controlling
   terminal of a foreground process group, the break condition shall generate
   a single SIGINT signal to that foreground process group. If neither IGNBRK
   nor BRKINT is set, a break condition shall be read as a single 0x00, or if
   PARMRK is set, as 0xff 0x00 0x00.

   If IGNPAR is set, a byte with a framing or parity error (other than break)
   shall be ignored.

   If PARMRK is set, and IGNPAR is not set, a byte with a framing or parity
   error (other than break) shall be given to the application as the
   three-byte sequence 0xff 0x00 X, where 0xff 0x00 is a two-byte flag
   preceding each sequence and X is the data of the byte received in error.
   To avoid ambiguity in this case, if ISTRIP is not set, a valid byte of
   0xff is given to the application as 0xff 0xff. If neither PARMRK nor
   IGNPAR is set, a framing or parity error (other than break) shall be given
   to the application as a single byte 0x00.

So, as you can see, the serial driver does not have the option of throwing
characters away just because an error bit is set - the behaviour required
is left to userspace to decide, and is implemented partly by the tty
layers and partly by the serial driver.  The requirements are well defined,
and 8250 follows them - and I've just checked omap-serial against 8250
and it does the same.
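
A minimal userspace sketch of selecting between those behaviours via
termios (the device path and the particular flag combination here are
assumptions for illustration, not anything mandated by this thread):

#include <fcntl.h>
#include <stdio.h>
#include <termios.h>
#include <unistd.h>

int main(void)
{
	struct termios tio;
	int fd = open("/dev/ttyO0", O_RDWR | O_NOCTTY);	/* example port only */

	if (fd < 0 || tcgetattr(fd, &tio) < 0) {
		perror("open/tcgetattr");
		return 1;
	}

	/* Behaviour 3 above: silently drop bytes with framing/parity errors. */
	tio.c_iflag |= IGNPAR;
	tio.c_iflag &= ~(PARMRK | ISTRIP);

	/* Behaviour 2 instead would clear IGNPAR and set PARMRK, so bad bytes
	 * arrive as the 0xff 0x00 X sequence from the POSIX extract above. */

	if (tcsetattr(fd, TCSANOW, &tio) < 0) {
		perror("tcsetattr");
		return 1;
	}

	close(fd);
	return 0;
}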

The fact is that the way OMAP implements the power management around UARTs
is broken, because it results in corrupted characters being received.
It's as plain and simple as that.

And there's nothing you can do to solve the problem of the broken PM
causing a badly received character which _may_ have a framing error
being passed to userspace.

If the termios settings are correct, the badly framed character shouldn't
be passed to a shell - but that doesn't stop characters which don't have
framing errors but still aren't received as they were transmitted being
passed.  And there's absolutely _nothing_ you can do about that as long
as the broken PM is enabled on the port.

As I illustrated in my 'detailed discussion of framing errors'.
Woodruff, Richard Feb. 5, 2012, 3:37 p.m. UTC | #53
> From: linux-omap-owner@vger.kernel.org [mailto:linux-omap-
> owner@vger.kernel.org] On Behalf Of Russell King - ARM Linux
> Sent: Saturday, February 04, 2012 10:40 AM

> It is entirely unacceptable to drop characters on a serial port through
> the use of PM.  Many serial protocols just will not cope with that kind
> of behaviour - yes, serial protocols may have retry built-in, but will
> they retry _before_ the port re-idles and the PLL shuts down?  Can you
> be sure?

What is acceptable depends on the hardware and application stack-ups being used. There are ways to work around issues along the whole path in SW and HW.  There is no single setup which makes all combinations work. Some old PC chassis may define one of many standards, but they probably were not defined with very low power tradeoffs in mind.

If the board designer hooks up all possible serial signals for uart/rs232/rs422/standard-xyz or just rx/tx, or adds some glue logic, or adds a smart peripheral, or ..., there will be constraints and ways to cope with the issue being discussed.

For most OMAP reference platforms the HW design links the available UARTs with the peripherals likely to be used in that timeframe. When it comes to making each UART work with PM, some changes are usually needed per interface. Depending on the given stack-up (including bugs/limitations), some per-interface tweak is always needed. It might be, as you say, that you have to defeat PM on a port and only release it after protocol handshaking with some modem firmware, or it might be that the debug UART expectation is lowered and the port is allowed to work in a lossy mode so as not to destroy platform power for a non-production port.

[x] What is acceptable is not black and white.  Is there some QOS mapping which can be set per channel which allows runtime PM to pick the best choice (which may allow for loss and frame issues)?

Regards,
Richard W.
Russell King - ARM Linux Feb. 5, 2012, 4:03 p.m. UTC | #54
On Sun, Feb 05, 2012 at 03:37:21PM +0000, Woodruff, Richard wrote:
> [x] What is acceptable is not black and white.  Is there some
> QOS mapping which can be set per channel which allows runtime PM to
> pick the best choice (which may allow for loss and frame issues)?

What you're asking is whether there's anything in the kernel which can
predict when the next character is to arrive.

For many serial protocols, that would require a non-causal universe.  So
far, I'm afraid, physics hasn't managed to provably invent time travel
in any useful way.  Once physics has, then maybe this can be fixed.

But, the fact of the matter is that deriving the UART clocks from a PLL
which takes a finite time to lock, and the PLL is shut down during runtime
PM _is_ _going_ _to_ _cause_ _problems_.  There is absolutely no getting
away from that.

Let's take your modem example.  Modems today would typically be used with
some IP based protocol, probably PPP based.  Incoming traffic on IP is
entirely unpredictable.  You wouldn't know when the next character would
be received.

One solution to this is to transmit an 0xff character before your real
data to ensure that your end is awake and properly synchronized...

Therefore, if you're running PPP (or, shudder, SLIP) over serial, you'd
have to totally defeat the PM support for the port.  No, not just for the
initial negotiation, but all the time that the PPP connection was
established.

You can't rely on the remote end retrying, because of backoffs.

You'll find that setting a PM timeout of maybe 5 seconds works, but as
soon as you experience a flakey link somewhere, the TCP could backoff to
10 seconds or even minutes with its retry.  At that point you're into a
vicious circle because you're now putting the port into low power mode,
which will drop the first character.  That means that TCP's retry will
get dropped on the floor, which in turn will increase the TCP backoff.
And so the cycle continues with ever increasing TCP backoffs and no
forward progress.

... but the problem with the 0xff character approach here is that most
ISPs won't allow you to drop by and fiddle with their PPP termination to
add that workaround.  They'll rightfully tell you to go screw yourself
and get some hardware which works.

So, go ahead with having PM drop random characters if you want, but don't
expect anyone in their right mind to accept broken workarounds to the
kernel serial driver to try to drop maybe 16 characters or more at a time
on the floor when a framing error occurs just because the PM is broken.

And let's not forget the problem that current kernels have on OMAP34xx
platforms.  Literally minutes to get a dmesg out over a 115200 baud serial
port, 16 characters at a time.  I guess you're going to try to justify
that as 'acceptable behaviour' from the system.  Yes, of course it is.
If you're dealing with a 1960s computer which processes that slowly.

And I thought we were in 2012.
Woodruff, Richard Feb. 5, 2012, 5:57 p.m. UTC | #55
> From: Russell King - ARM Linux [mailto:linux@arm.linux.org.uk]
> Sent: Sunday, February 05, 2012 10:03 AM
> To: Woodruff, Richard

> On Sun, Feb 05, 2012 at 03:37:21PM +0000, Woodruff, Richard wrote:
> > [x] What is acceptable is not black and white.  Is there some
> > QOS mapping which can be set per channel which allows runtime PM to
> > pick the best choice (which may allow for loss and frame issues)?
> 
> What you're asking is whether there's anything in the kernel which can
> predict when the next character is to arrive.

No, this was not the comment's intent.

> But, the fact of the matter is that deriving the UART clocks from a PLL
> which takes a finite time to lock, and the PLL is shut down during runtime
> PM _is_ _going_ _to_ _cause_ _problems_.  There is absolutely no getting
> away from that.

Yes this is one of the issues to be worked around.

> Let's take your modem example.  Modems today would typically be used with
> some IP based protocol, probably PPP based.  Incoming traffic on IP is
> entirely unpredictable.  You wouldn't know when the next character would
> be received.
> 
> One solution to this is to transmit an 0xff character before your real
> data to ensure that your end is awake and properly synchronized...

This approach as you say has issues.  This is solved in different ways for modems.

<sleep> My observation is that modem software, which in many cases talks PPP/IP over serial of some sort (might be UART, might be USB), will send a command to the modem to go into a low-power mode. Now you can cut clocks without hurting the modem and still get the SoC power savings.

<wake> When some event happens at the modem or processor (a timer near a beacon, or other), the modem or apps processor can signal the other side with some wake event (maybe over a GPIO), which then puts the system in a state where it can receive data in a trusted manner.

The modem channel driver tries to inform the kernel about entering/exiting these modes to set expectations.

> So, go ahead with having PM drop random characters if you want, but don't
> expect anyone in their right mind to accept broken workarounds to the
> kernel serial driver to try to drop maybe 16 characters or more at a time
> on the floor when a framing error occurs just because the PM is broken.

No character was dropped in the modem example.  On the UART-to-Debug console it may be OK to drop a character.  Either needs a coordination hook.

Each stack needs some way to adjust expectations.  Finding a way to isolate this to a sub-layer of the stack without breaking everything is always the quest.

> And let's not forget the problem that current kernels have on OMAP34xx
> platforms.  Literally minutes to get a dmesg out over a 115200 baud serial
> port, 16 characters at a time.  I guess you're going to try to justify
> that as 'acceptable behaviour' from the system.  Yes, of course it is.
> If you're dealing with a 1960s computer which processes that slowly.
> 
> And I thought we were in 2012.

This looks like a different issue.  Years back, with custom kernels carrying lots of hacks, this was not the case.  These days the job of making it work more generally is harder.  Evolving for PM tradeoffs is painful in HW and SW.

Regards,
Richard W.
NeilBrown Feb. 6, 2012, 11:58 p.m. UTC | #56
On Sun, 5 Feb 2012 17:57:40 +0000 "Woodruff, Richard" <r-woodruff2@ti.com>
wrote:

> 
> > From: Russell King - ARM Linux [mailto:linux@arm.linux.org.uk]
> > Sent: Sunday, February 05, 2012 10:03 AM
> > To: Woodruff, Richard
> 
> > On Sun, Feb 05, 2012 at 03:37:21PM +0000, Woodruff, Richard wrote:
> > > [x] What is acceptable is not black and white.  Is there some
> > > QOS mapping which can be set per channel which allows runtime PM to
> > > pick the best choice (which may allow for loss and frame issues)?
> > 
> > What you're asking is whether there's anything in the kernel which can
> > predict when the next character is to arrive.
> 
> No, this was not the comment's intent.
> 
> > But, the fact of the matter is that deriving the UART clocks from a PLL
> > which takes a finite time to lock, and the PLL is shut down during runtime
> > PM _is_ _going_ _to_ _cause_ _problems_.  There is absolutely no getting
> > away from that.
> 
> Yes this is one of the issues to be worked around.
> 
> > Let's take your modem example.  Modems today would typically be used with
> > some IP based protocol, probably PPP based.  Incoming traffic on IP is
> > entirely unpredictable.  You wouldn't know when the next character would
> > be received.
> > 
> > One solution to this is to transmit an 0xff character before your real
> > data to ensure that your end is awake and properly synchronized...
> 
> This approach as you say has issues.  This is solved in different ways for modems.
> 
> <sleep> My observation is that modem software, which in many cases talks PPP/IP over serial of some sort (might be UART, might be USB), will send a command to the modem to go into a low-power mode. Now you can cut clocks without hurting the modem and still get the SoC power savings.
> 
> <wake> When some event happens at the modem or processor (a timer near a beacon, or other), the modem or apps processor can signal the other side with some wake event (maybe over a GPIO), which then puts the system in a state where it can receive data in a trusted manner.
> 
> The modem channel driver tries to inform the kernel about entering/exiting these modes to set expectations.

I think the correct way to "inform the kernel" would be with the 'CREAD'
c_cflag in termios.  Clearing it indicates that we don't expect to read, which
would allow the UART to go to sleep.
When the GPIO interrupt arrives, we would use termios to set CREAD and then
start talking to the modem.
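
A minimal sketch of that CREAD toggle, assuming an already-open descriptor
on the modem port (the helper names are invented for illustration):

#include <termios.h>

/* Sketch only: stop expecting input so the UART may be idled. */
int port_allow_sleep(int fd)
{
	struct termios tio;

	if (tcgetattr(fd, &tio) < 0)
		return -1;
	tio.c_cflag &= ~CREAD;		/* we don't expect to read */
	return tcsetattr(fd, TCSANOW, &tio);
}

/* Sketch only: called once the GPIO wake event has been seen. */
int port_resume_reading(int fd)
{
	struct termios tio;

	if (tcgetattr(fd, &tio) < 0)
		return -1;
	tio.c_cflag |= CREAD;		/* expect input again */
	return tcsetattr(fd, TCSANOW, &tio);
}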

However it is easy to imagine situations where this wouldn't be enough.
A fairly obvious way to wake a sleeping serial connection is with a 'break'.
I have a GPS which can be put to sleep and then woken by sending a 'break'.
Similarly the OMAP uart can reliably receive a break when asleep, but cannot
reliably receive any other input.

Though maybe if BRKINT is set in c_iflag, then we could still receive a break
even when CREAD is clear.  Then clearing CREAD would be enough to allow
low-power mode.
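
For the break-to-wake case, the userspace side can be as small as a
tcsendbreak() call; a sketch, with the device path left as the caller's
assumption:

#include <fcntl.h>
#include <termios.h>
#include <unistd.h>

/* Sketch only: wake a sleeping device (e.g. a GPS) by sending a break. */
int wake_by_break(const char *path)
{
	int fd = open(path, O_RDWR | O_NOCTTY);

	if (fd < 0)
		return -1;
	tcsendbreak(fd, 0);	/* duration 0: 0.25-0.5s of zero bits per POSIX */
	close(fd);
	return 0;
}
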

> 
> > So, go ahead with having PM drop random characters if you want, but don't
> > expect anyone in their right mind to accept broken workarounds to the
> > kernel serial driver to try to drop maybe 16 characters or more at a time
> > on the floor when a framing error occurs just because the PM is broken.
> 
> No character was dropped in the modem example.  On the UART-to-Debug console it may be OK to drop a character.  Either needs a coordination hook.

I don't think it is really OK to drop chars on the UART-to-Debug console.
However it is OK to drop the baud rate to 57600, where we can wake up in time
to catch the first bit.  So if you want power saving, drop the console
baud rate.

So I would suggest:
 - remove the autosuspend timeouts
 - allow runtime_pm to shutdown the device when it is not open, or when
   rate is 57600 or below, or when 'CREAD' is clear
 - keep runtime_pm active whenever there are bytes in the output queue or
   fifo

The only case this wouldn't support is when a device wakes up the SoC by
sending a non-break character which it is OK to receive corrupted.  The tty
would have to be in !CREAD for that to happen, and then there would be no way
for the app to know that a non-break character was received.
Would it be reasonable to treat any input while CREAD is clear as a break?
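
Condensed into code, a rough sketch of the policy suggested above; the
struct and function names are invented for illustration and do not come
from omap-serial:

#include <stdbool.h>

/* Invented for illustration: a snapshot of the relevant port state. */
struct uart_pm_state {
	bool		open;
	bool		cread;		/* termios CREAD flag */
	unsigned int	baud;
	unsigned int	tx_pending;	/* bytes in output queue + FIFO */
};

/* May the UART runtime-suspend right now, under the suggested policy? */
bool uart_may_runtime_suspend(const struct uart_pm_state *s)
{
	if (s->tx_pending)		/* keep active while transmitting */
		return false;
	if (!s->open)			/* closed ports can always sleep */
		return true;
	if (!s->cread)			/* nobody expects input */
		return true;
	return s->baud <= 57600;	/* slow enough to wake for the first bit */
}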

NeilBrown
Woodruff, Richard Feb. 7, 2012, 1 a.m. UTC | #57
> From: NeilBrown [mailto:neilb@suse.de]
> Sent: Friday, February 03, 2012 8:31 PM

> So... if flow control is available, then when we idle the uart we should
> set <snip> ...

Yes, I think you can improve the situation with such tricks.

Unfortunately there are a few types of low-power idle wakeups which muddy the water when trying to understand TRMs.

- The IOpad type wakes are the ones being discussed now. These are used in conjunction with isolation rings which stop external signals from propagating into the chip and causing undefined things. This protection is used to enable OFF mode (but can be armed outside of OFF mode from 34xx and beyond). The wakeup event here travels through the pins to the wakeup domain, then to the PRCM, which reactivates subdomains, and signals can then reflow (if they are still valid). You get an IOpad status which can map to a function.

- A shade up from idle is module idle wakes. Here, if the isolation latch is not enabled, you need to program the OMAP OCP/IP wake-function wrappers in the UART itself. Here the wake signal comes through the pads to the UART IP, which then signals the PRCM to reflow signals.

- A wakeup which no one seems to use on top of this: the 16x50 UART has some internal sleep/wake features. The generic Linux driver might know about these, but they are rarely used.
 
> > Outside of debug console, this loss has not been huge. Protocols like
> > irda would retransmit their magic wake packets. You can move between DMA
> > and interrupt modes with activity. So far there has been a work around per
> > attached device.
> 
> What about bluetooth?  HCI/UART doesn't seem to have a lot of error
> handling.  Maybe it has enough though.
> (I have bluetooth on UART1 ... of course we might not have the same problems
> on UART1 .. I haven't played with bluetooth much yet).

I heard of solutions but don't recall the details, as I personally didn't donate blood to get them working. I recall some activity timer + wake packets but would have to dig up old PPTs.

Regards,
Richard W.
Woodruff, Richard Feb. 7, 2012, 1:13 a.m. UTC | #58
> From: NeilBrown [mailto:neilb@suse.de]
> Sent: Monday, February 06, 2012 5:58 PM
> To: Woodruff, Richard

Apologies for mangled mails... I am years overdue in ditching my current method.

> I don't think it is really OK to drop chars on the UART-to-Debug console.
> However it is OK to drop the baud rate to 57600, where we can wake up in time
> to catch the first bit.  So if you want power saving, drop the console
> baud rate.
> 
> So I would suggest:
>  - remove the autosuspend timeouts
>  - allow runtime_pm to shutdown the device when it is not open, or when
>    rate is 57600 or below, or when 'CREAD' is clear
>  - keep runtime_pm active whenever there are bytes in the output queue or
>    fifo

Yes, slower baud plus use of flow control should help. OMAP4 is roughly 10x better than OMAP2, with OMAP3 in the middle.

> The only case this wouldn't support is when a device wakes up the SoC by
> sending a non-break character which it is OK to receive corrupted.  The tty
> would have to be in !CREAD for that to happen, and then there would be no way
> for the app to know that a non-break character was received.
> Would it be reasonable to treat any input while CREAD is clear as a break?

Others would have to comment on this.  I never took this step, as I was OK with degradation on the debug console.

If I ever wanted better logs, I tended to telnet/ssh in on a network port, which was better at retrying at a higher level.

Regards,
Richard W.
Paul Walmsley Feb. 8, 2012, 3:50 p.m. UTC | #59
Hi

Just a quick note.  Haven't had the chance to follow up on these threads 
due to travel and other obligations, but plan to do so soon.

regards -

- Paul