
[v3,0/4] New Fuzzy Sync library API

Message ID 20181010140405.24496-1-rpalethorpe@suse.com

Message

Richard Palethorpe Oct. 10, 2018, 2:04 p.m. UTC
Changes from v2 -> v3:
 Add warning if test times out before sampling has finished
 Take timestamp after delay so bias is not included during sampling
 Use absolute max dev ratio
 Don't discard stats after delay bias change
 Correct default exec time
 Fix some documentation typos
 Drop atomic store of exit
 Use NULL instead of 0 in pthread_join
 Include range bounds
 Use static inline instead of unused

Richard Palethorpe (4):
  tst_timer: Add nano second conversions
  fzsync: Simplify API with start/end race calls and limit exec time
  Convert tests to use fzsync_{start,end}_race API
  fzsync: Add delay bias for difficult races

 include/tst_fuzzy_sync.h                      | 785 ++++++++++++++----
 include/tst_timer.h                           |  11 +
 lib/newlib_tests/test16.c                     |  62 +-
 testcases/cve/cve-2014-0196.c                 |  37 +-
 testcases/cve/cve-2016-7117.c                 |  59 +-
 testcases/cve/cve-2017-2671.c                 |  32 +-
 testcases/kernel/syscalls/inotify/inotify09.c |  33 +-
 .../kernel/syscalls/ipc/shmctl/shmctl05.c     |  30 +-
 8 files changed, 725 insertions(+), 324 deletions(-)

Comments

Cyril Hrubis Oct. 18, 2018, 3:02 p.m. UTC | #1
Hi!
I've dusted off my old pxa270 PDA and tried to compare the different
implementations of the fuzzy sync library:

|-------------------------------------------------------------|
| test          |  old library  |  new library                |
|-------------------------------------------------------------|
| shmctl05      | timeouts      | timeouts                    |
|-------------------------------------------------------------|
| inotify09     | timeouts      | exits in sampling with WARN |
|-------------------------------------------------------------|
| cve-2017-2671 | kernel crash  | kernel crash                |
|-------------------------------------------------------------|
| cve-2016-7117 | kernel crash  | exits in sampling with WARN |
|-------------------------------------------------------------|
| cve-2014-0196 | timeouts      | exits in sampling with WARN |
|-------------------------------------------------------------|

The shmctl05 times out because remap_file_pages is too slow and we
fail to do even one iteration. It's possible that this is because we are
hitting the race as well, since this is kernel 3.0.0, but I cannot say
that for sure.

The real problem is that we fail to calibrate because the machine is
too slow and we do not manage to take the minimal amount of samples
before the default timeout.

If I increase the timeout percentage to 0.5, we manage to take at least
the minimal amount of samples and to trigger the cve-2016-7117 from time
to time. But it looks like the bias computation does not work reliably
there, I am not sure why. Looking at the latest version, adding the
bias no longer resets the averages, which may be the reason, because the
bias seems to be more or less the same as the minimal number of samples.

So there are a few things to consider. First, the default timeout
percentage could probably be increased so that we do not have to
tune LTP_TIMEOUT_MUL even on slower processors. The downside is that
these testcases would take longer on modern hardware. Maybe we can do
some simple CPU benchmarking to calibrate the timeout.

The second thing to consider is if and how to tune the minimal amount
of samples. Maybe we can make the minimal amount of samples smaller
and then exit the calibration once our deviation has been small enough
three times in a row. But then there is this bias that we have to take
into account somehow.
Richard Palethorpe Oct. 22, 2018, 9:24 a.m. UTC | #2
Hello,

Cyril Hrubis <chrubis@suse.cz> writes:

> Hi!
> I've dusted off my old pxa270 PDA and tried to compare the different
> implementations of the fuzzy sync library:

Good stuff!

>
> |-------------------------------------------------------------|
> | test          |  old library  |  new library                |
> |-------------------------------------------------------------|
> | shmctl05      | timeouts      | timeouts                    |
> |-------------------------------------------------------------|
> | inotify09     | timeouts      | exits in sampling with WARN |
> |-------------------------------------------------------------|
> | cve-2017-2671 | kernel crash  | kernel crash                |
> |-------------------------------------------------------------|
> | cve-2016-7117 | kernel crash  | exits in sampling with WARN |
> |-------------------------------------------------------------|
> | cve-2014-0196 | timeouts      | exits in sampling with WARN |
> |-------------------------------------------------------------|
>
> The shmctl05 times out because remap_file_pages is too slow and we
> fail to do even one iteration. It's possible that this is because we are
> hitting the race as well, since this is kernel 3.0.0, but I cannot say
> that for sure.
>
> The real problem is that we fail to calibrate because the machine is
> too slow and we do not manage to take the minimal amount of samples
> before the default timeout.
>
> If I increase the timeout percentage to 0.5, we manage to take at least
> the minimal amount of samples and to trigger the cve-2016-7117 from time
> to time. But it looks like the bias computation does not work reliably
> there, I am not sure why. Looking at the latest version, adding the
> bias no longer resets the averages, which may be the reason, because the
> bias seems to be more or less the same as the minimal number of samples.

Sounds correct. I guess context switches take a large number of cycles
on this CPU relative to x86.

>
> So there are a few things to consider. First, the default timeout
> percentage could probably be increased so that we do not have to
> tune LTP_TIMEOUT_MUL even on slower processors. The downside is that
> these testcases would take longer on modern hardware. Maybe we can do
> some simple CPU benchmarking to calibrate the timeout.

Perhaps the test runner or test library should tune LTP_TIMEOUT_MUL?
Assuming the user allows it.

>
> The second thing to consider is if and how to tune the minimal amount
> of samples. Maybe we can make the minimal amount of samples smaller
> and then exit the calibration once our deviation has been small enough
> three times in a row. But then there is this bias that we have to take
> into account somehow.

I think the only way is to benchmark a selection of syscalls and then
pass this data to the test somehow. Then it can calculate some
reasonable time and sample limits.

However I also think this is beyond the scope of this patch set because
fuzzy sync tests are just one potential user of such metrics. I suspect
also that it will be a big enough change to justify its own discussion
and patch set.

For now, if we increase the minimum time limit and samples so that
cve-2016-7117 behaves sensibly on a pxa270 then we are probably covering
most users. The downside is that we are wasting some time and
electricity on server grade hardware, but at least the tests are being
performed correctly on most hardware.

--
Thank you,
Richard.
Li Wang Oct. 26, 2018, 7:31 a.m. UTC | #3
On Mon, Oct 22, 2018 at 5:24 PM, Richard Palethorpe <rpalethorpe@suse.de>
wrote:

> Hello,
>
> Cyril Hrubis <chrubis@suse.cz> writes:
>
> > Hi!
> > I've dusted off my old pxa270 PDA and tried to compare the different
> > implementations of the fuzzy sync library:
>
> Good stuff!
>
> >
> > |-------------------------------------------------------------|
> > | test          |  old library  |  new library                |
> > |-------------------------------------------------------------|
> > | shmctl05      | timeouts      | timeouts                    |
> > |-------------------------------------------------------------|
> > | inotify09     | timeouts      | exits in sampling with WARN |
> > |-------------------------------------------------------------|
> > | cve-2017-2671 | kernel crash  | kernel crash                |
> > |-------------------------------------------------------------|
> > | cve-2016-7117 | kernel crash  | exits in sampling with WARN |
> > |-------------------------------------------------------------|
> > | cve-2014-0196 | timeouts      | exits in sampling with WARN |
> > |-------------------------------------------------------------|
> >
> > The shmctl05 times out because remap_file_pages is too slow and we
> > fail to do even one iteration. It's possible that this is because we are
> > hitting the race as well, since this is kernel 3.0.0, but I cannot say
> > that for sure.
> >
> > The real problem is that we fail to calibrate because the machine is
> > too slow and we do not manage to take the minimal amount of samples
> > before the default timeout.
> >
> > If I increase the timeout percentage to 0.5, we manage to take at least
> > the minimal amount of samples and to trigger the cve-2016-7117 from time
> > to time. But it looks like the bias computation does not work reliably
> > there, I am not sure why. Looking at the latest version, adding the
> > bias no longer resets the averages, which may be the reason, because the
> > bias seems to be more or less the same as the minimal number of samples.
>
> Sounds correct. I guess context switches take a large number of cycles
> on this CPU relative to x86.
>
> >
> > So there are a few things to consider. First, the default timeout
> > percentage could probably be increased so that we do not have to
> > tune LTP_TIMEOUT_MUL even on slower processors. The downside is that
> > these testcases would take longer on modern hardware. Maybe we can do
> > some simple CPU benchmarking to calibrate the timeout.
>
> Perhaps the test runner or test library should tune LTP_TIMEOUT_MUL?
> Assuming the user allows it.
>
> >
> > The second thing to consider is if and how to tune the minimal amount
> > of samples. Maybe we can make the minimal amount of samples smaller
> > and then exit the calibration once our deviation has been small enough
> > three times in a row. But then there is this bias that we have to take
> > into account somehow.
>
> I think the only way is to benchmark a selection of syscalls and then
> pass this data to the test somehow. Then it can calculate some
> reasonable time and sample limits.
>

Maybe we can also reduce the sampling time by removing the pair->diff_ss
average counting.

Looking at the pair->delay algorithm:

        per_spin_time = fabsf(pair->diff_ab.avg) / pair->spins_avg.avg;
        time_delay = drand48() * (pair->diff_sa.avg + pair->diff_sb.avg) -
                pair->diff_sb.avg;
        pair->delay += (int)(time_delay / per_spin_time);

the pair->diff_ss is not in use, so why do we do the average calculation
in tst_upd_diff_stat()? On the other hand, it functionally overlaps with
pair->diff_ab; we could reduce the total sampling time by a quarter if we
remove it.


> However I also think this is beyond the scope of this patch set because
> fuzzy sync tests are just one potential user of such metrics. I suspect
> also that it will be a big enough change to justify its own discussion
> and patch set.
>
> For now, if we increase the minimum time limit and samples so that
> cve-2016-7117 behaves sensibly on a pxa270 then we are probably covering
> most users. The downside is that we are wasting some time and
> electricity on server grade hardware, but at least the tests are being
> performed correctly on most hardware.
>
> --
> Thank you,
> Richard.
>
> --
> Mailing list info: https://lists.linux.it/listinfo/ltp
>
Richard Palethorpe Oct. 29, 2018, 10:04 a.m. UTC | #4
Hello,

Li Wang <liwang@redhat.com> writes:

> On Mon, Oct 22, 2018 at 5:24 PM, Richard Palethorpe <rpalethorpe@suse.de>
> wrote:
>
>> Hello,
>>
>> Cyril Hrubis <chrubis@suse.cz> writes:
>>
>> > Hi!
>> > I've dusted off my old pxa270 PDA and tried to compare the different
>> > implementations of the fuzzy sync library:
>>
>> Good stuff!
>>
>> >
>> > |-------------------------------------------------------------|
>> > | test          |  old library  |  new library                |
>> > |-------------------------------------------------------------|
>> > | shmctl05      | timeouts      | timeouts                    |
>> > |-------------------------------------------------------------|
>> > | inotify09     | timeouts      | exits in sampling with WARN |
>> > |-------------------------------------------------------------|
>> > | cve-2017-2671 | kernel crash  | kernel crash                |
>> > |-------------------------------------------------------------|
>> > | cve-2016-7117 | kernel crash  | exits in sampling with WARN |
>> > |-------------------------------------------------------------|
>> > | cve-2014-0196 | timeouts      | exits in sampling with WARN |
>> > |-------------------------------------------------------------|
>> >
>> > The shmctl05 times out because remap_file_pages is too slow and we
>> > fail to do even one iteration. It's possible that this is because we are
>> > hitting the race as well, since this is kernel 3.0.0, but I cannot say
>> > that for sure.
>> >
>> > The real problem is that we fail to calibrate because the machine is
>> > too slow and we do not manage to take the minimal amount of samples
>> > before the default timeout.
>> >
>> > If I increase the timeout percentage to 0.5, we manage to take at least
>> > the minimal amount of samples and to trigger the cve-2016-7117 from time
>> > to time. But it looks like the bias computation does not work reliably
>> > there, I am not sure why. Looking at the latest version, adding the
>> > bias no longer resets the averages, which may be the reason, because the
>> > bias seems to be more or less the same as the minimal number of samples.
>>
>> Sounds correct. I guess context switches take a large number of cycles
>> on this CPU relative to x86.
>>
>> >
>> > So there are a few things to consider. First, the default timeout
>> > percentage could probably be increased so that we do not have to
>> > tune LTP_TIMEOUT_MUL even on slower processors. The downside is that
>> > these testcases would take longer on modern hardware. Maybe we can do
>> > some simple CPU benchmarking to calibrate the timeout.
>>
>> Perhaps the test runner or test library should tune LTP_TIMEOUT_MUL?
>> Assuming the user allows it.
>>
>> >
>> > The second thing to consider is if and how to tune the minimal amount
>> > of samples. Maybe we can make the minimal amount of samples smaller
>> > and then exit the calibration once our deviation has been small enough
>> > three times in a row. But then there is this bias that we have to take
>> > into account somehow.
>>
>> I think the only way is to benchmark a selection of syscalls and then
>> pass this data to the test somehow. Then it can calculate some
>> reasonable time and sample limits.
>>
>
> Maybe we can also reduce the sampling time by removing the pair->diff_ss
> average counting.
>
> Looking at the pair->delay algorithm:
>
>         per_spin_time = fabsf(pair->diff_ab.avg) / pair->spins_avg.avg;
>         time_delay = drand48() * (pair->diff_sa.avg + pair->diff_sb.avg) -
>                 pair->diff_sb.avg;
>         pair->delay += (int)(time_delay / per_spin_time);
>
> the pair->diff_ss is not in use, so why do we do the average calculation
> in tst_upd_diff_stat()? On the other hand, it functionally overlaps with
> pair->diff_ab; we could reduce the total sampling time by a quarter if we
> remove it.

It is just a few maths ops and a highly predictable branch on data that
should (at least) be in the cache. Compared to a context switch or even
a memory barrier (on non-x86), it should be insignificant.

>
>
>> However I also think this is beyond the scope of this patch set because
>> fuzzy sync tests are just one potential user of such metrics. I suspect
>> also that it will be a big enough change to justify its own discussion
>> and patch set.
>>
>> For now, if we increase the minimum time limit and samples so that
>> cve-2016-7117 behaves sensibly on a pxa270 then we are probably covering
>> most users. The downside is that we are wasting some time and
>> electricity on server grade hardware, but at least the tests are being
>> performed correctly on most hardware.
>>
>> --
>> Thank you,
>> Richard.
>>
>> --
>> Mailing list info: https://lists.linux.it/listinfo/ltp
>>


--
Thank you,
Richard.