Message ID | 20220629213428.3065430-10-adhemerval.zanella@linaro.org |
---|---|
State | New |
Series | Add arc4random support |
On Wed, Jun 29, 2022 at 2:36 PM Adhemerval Zanella via Libc-alpha <libc-alpha@sourceware.org> wrote: > > --- > manual/math.texi | 45 +++++++++++++++++++++++++++++++++++++++++++++ > 1 file changed, 45 insertions(+) > > diff --git a/manual/math.texi b/manual/math.texi > index 477a18b6d1..ab96726e57 100644 > --- a/manual/math.texi > +++ b/manual/math.texi > @@ -1447,6 +1447,7 @@ systems. > * ISO Random:: @code{rand} and friends. > * BSD Random:: @code{random} and friends. > * SVID Random:: @code{drand48} and friends. > +* High Quality Random:: @code{arc4random} and friends. > @end menu > > @node ISO Random > @@ -1985,6 +1986,50 @@ This function is a GNU extension and should not be used in portable > programs. > @end deftypefun > > +@node High Quality Random > +@subsection High Quality Random Number Functions > + > +This section describes the random number functions provided as a GNU > +extension, based on OpenBSD interfaces. > + > +@Theglibc{} uses kernel entropy obtained either through @code{getrandom} > +or by reading @file{/dev/urandom} to seed and periodically re-seed the > +internal state. A per-thread data pool is used, which allows fast output > +generation. > + Are we committing to per-thread data pools? I thought there were ideas to use rseq. > +Although these functions provide higher random quality than ISO, BSD, and > +SVID functions, these still use a Pseudo-Random generator and should not > +be used in cryptographic contexts. > + > +The internal state is cleared and reseed with kernel entropy on @code{fork} > +and @code{_Fork}. It is not cleared for either direct @code{clone} syscall > +or when using @theglibc{} @code{syscall} function. > + > +The prototypes for these functions are in @file{stdlib.h}. 
> +@pindex stdlib.h > + > +@deftypefun int32_t arc4random (void) > +@standards{BSD, stdlib.h} > +@safety{@mtsafe{}@asunsafe{@asucorrupt{}}@acsafe{}} > +This function returns a single 32-bit value in the range of @code{0} to > +@code{2^32−1} (inclusive), which is twice the range of @code{rand} and > +@code{random}. > +@end deftypefun > + > +@deftypefun void arc4random (void *@var{buffer}, size_t @var{length}) > +@standards{BSD, stdlib.h} > +@safety{@mtsafe{}@asunsafe{@asucorrupt{}}@acsafe{}} > +This function fills the region @var{buffer} of @var{length} with random data. > +@end deftypefun > + > +@deftypefun uint32_t arc4random_uniform (uint32_t @var{upper_bound}) > +@standards{BSD, stdlib.h} > +@safety{@mtsafe{}@asunsafe{@asucorrupt{}}@acsafe{}} > +This function returns a single 32-bit value, uniformly distributed but > +less than the @var{upper_bound}. It avoids the @w{modulo bias} when the > +upper bound is not a power of two. > +@end deftypefun > + > @node FP Function Optimizations > @section Is Fast Code or Small Code preferred? > @cindex Optimization > -- > 2.34.1 >
> On 29 Jun 2022, at 18:45, Noah Goldstein <goldstein.w.n@gmail.com> wrote: > > On Wed, Jun 29, 2022 at 2:36 PM Adhemerval Zanella via Libc-alpha > <libc-alpha@sourceware.org> wrote: >> >> --- >> manual/math.texi | 45 +++++++++++++++++++++++++++++++++++++++++++++ >> 1 file changed, 45 insertions(+) >> >> diff --git a/manual/math.texi b/manual/math.texi >> index 477a18b6d1..ab96726e57 100644 >> --- a/manual/math.texi >> +++ b/manual/math.texi >> @@ -1447,6 +1447,7 @@ systems. >> * ISO Random:: @code{rand} and friends. >> * BSD Random:: @code{random} and friends. >> * SVID Random:: @code{drand48} and friends. >> +* High Quality Random:: @code{arc4random} and friends. >> @end menu >> >> @node ISO Random >> @@ -1985,6 +1986,50 @@ This function is a GNU extension and should not be used in portable >> programs. >> @end deftypefun >> >> +@node High Quality Random >> +@subsection High Quality Random Number Functions >> + >> +This section describes the random number functions provided as a GNU >> +extension, based on OpenBSD interfaces. >> + >> +@Theglibc{} uses kernel entropy obtained either through @code{getrandom} >> +or by reading @file{/dev/urandom} to seed and periodically re-seed the >> +internal state. A per-thread data pool is used, which allows fast output >> +generation. >> + > > Are we committing to per-thread data pools? I thought there were ideas to > use rseq.

For this version, yes, since it works on all supported kernels (even the ones without getentropy support) and on all architectures. I do not know how feasible it would be to implement per-CPU caches with rseq, and it would require a fallback for older kernels (most likely a per-thread cache, as in this version), although it might be a future improvement.
> On 29 Jun 2022, at 18:34, Adhemerval Zanella <adhemerval.zanella@linaro.org> wrote: > > --- > manual/math.texi | 45 +++++++++++++++++++++++++++++++++++++++++++++ > 1 file changed, 45 insertions(+) > > diff --git a/manual/math.texi b/manual/math.texi > index 477a18b6d1..ab96726e57 100644 > --- a/manual/math.texi > +++ b/manual/math.texi > @@ -1447,6 +1447,7 @@ systems. > * ISO Random:: @code{rand} and friends. > * BSD Random:: @code{random} and friends. > * SVID Random:: @code{drand48} and friends. > +* High Quality Random:: @code{arc4random} and friends. > @end menu > > @node ISO Random > @@ -1985,6 +1986,50 @@ This function is a GNU extension and should not be used in portable > programs. > @end deftypefun > > +@node High Quality Random > +@subsection High Quality Random Number Functions > + > +This section describes the random number functions provided as a GNU > +extension, based on OpenBSD interfaces. > + > +@Theglibc{} uses kernel entropy obtained either through @code{getrandom} > +or by reading @file{/dev/urandom} to seed and periodically re-seed the > +internal state. A per-thread data pool is used, which allows fast output > +generation. > + > +Although these functions provide higher random quality than ISO, BSD, and > +SVID functions, these still use a Pseudo-Random generator and should not > +be used in cryptographic contexts. > + > +The internal state is cleared and reseed with kernel entropy on @code{fork} > +and @code{_Fork}. It is not cleared for either direct @code{clone} syscall > +or when using @theglibc{} @code{syscall} function. > + > +The prototypes for these functions are in @file{stdlib.h}. > +@pindex stdlib.h > + > +@deftypefun int32_t arc4random (void) > +@standards{BSD, stdlib.h} > +@safety{@mtsafe{}@asunsafe{@asucorrupt{}}@acsafe{}} > +This function returns a single 32-bit value in the range of @code{0} to > +@code{2^32−1} (inclusive), which is twice the range of @code{rand} and > +@code{random}. 
> +@end deftypefun
> +
> +@deftypefun void arc4random (void *@var{buffer}, size_t @var{length})

This should be arc4random_buf; I have fixed it locally.

> +@standards{BSD, stdlib.h}
> +@safety{@mtsafe{}@asunsafe{@asucorrupt{}}@acsafe{}}
> +This function fills the region @var{buffer} of @var{length} with random data.
> +@end deftypefun
> +
> +@deftypefun uint32_t arc4random_uniform (uint32_t @var{upper_bound})
> +@standards{BSD, stdlib.h}
> +@safety{@mtsafe{}@asunsafe{@asucorrupt{}}@acsafe{}}
> +This function returns a single 32-bit value, uniformly distributed but
> +less than the @var{upper_bound}. It avoids the @w{modulo bias} when the
> +upper bound is not a power of two.
> +@end deftypefun
> +
> @node FP Function Optimizations
> @section Is Fast Code or Small Code preferred?
> @cindex Optimization
> --
> 2.34.1
>
On Wed, Jun 29, 2022 at 2:53 PM Adhemerval Zanella <adhemerval.zanella@linaro.org> wrote: > > > > > On 29 Jun 2022, at 18:45, Noah Goldstein <goldstein.w.n@gmail.com> wrote: > > > > On Wed, Jun 29, 2022 at 2:36 PM Adhemerval Zanella via Libc-alpha > > <libc-alpha@sourceware.org> wrote: > >> > >> --- > >> manual/math.texi | 45 +++++++++++++++++++++++++++++++++++++++++++++ > >> 1 file changed, 45 insertions(+) > >> > >> diff --git a/manual/math.texi b/manual/math.texi > >> index 477a18b6d1..ab96726e57 100644 > >> --- a/manual/math.texi > >> +++ b/manual/math.texi > >> @@ -1447,6 +1447,7 @@ systems. > >> * ISO Random:: @code{rand} and friends. > >> * BSD Random:: @code{random} and friends. > >> * SVID Random:: @code{drand48} and friends. > >> +* High Quality Random:: @code{arc4random} and friends. > >> @end menu > >> > >> @node ISO Random > >> @@ -1985,6 +1986,50 @@ This function is a GNU extension and should not be used in portable > >> programs. > >> @end deftypefun > >> > >> +@node High Quality Random > >> +@subsection High Quality Random Number Functions > >> + > >> +This section describes the random number functions provided as a GNU > >> +extension, based on OpenBSD interfaces. > >> + > >> +@Theglibc{} uses kernel entropy obtained either through @code{getrandom} > >> +or by reading @file{/dev/urandom} to seed and periodically re-seed the > >> +internal state. A per-thread data pool is used, which allows fast output > >> +generation. > >> + > > > > Are we committing to per-thread data pools? I thought there were ideas to > > use rseq. > > For this version yes, since it works on all supported kernels (even for the > ones without getentropy support) and on all architectures. I do not know how > feasible it would be to implement per-cpu caches along with rseq and it would > require a fallback for older kernel (most likely a per-thread cache as this > version), although it might be future improvement. 
Do we want to explicitly say "per-thread buffer" if we may want to experiment with something else? It seems like the kind of statement that might make it impossible to re-implement this another way.

What about something like: "The data pool is implemented to minimize cross-core contention, allowing fast output generation"?
Hi,

On 30/06/2022 at 00:05, Noah Goldstein via Libc-alpha wrote:
> On Wed, Jun 29, 2022 at 2:53 PM Adhemerval Zanella > <adhemerval.zanella@linaro.org> wrote: >> >> >>> On 29 Jun 2022, at 18:45, Noah Goldstein <goldstein.w.n@gmail.com> wrote: >>> >>> On Wed, Jun 29, 2022 at 2:36 PM Adhemerval Zanella via Libc-alpha >>> <libc-alpha@sourceware.org> wrote: >>>> --- >>>> manual/math.texi | 45 +++++++++++++++++++++++++++++++++++++++++++++ >>>> 1 file changed, 45 insertions(+) >>>> >>>> diff --git a/manual/math.texi b/manual/math.texi >>>> index 477a18b6d1..ab96726e57 100644 >>>> --- a/manual/math.texi >>>> +++ b/manual/math.texi >>>> @@ -1447,6 +1447,7 @@ systems. >>>> * ISO Random:: @code{rand} and friends. >>>> * BSD Random:: @code{random} and friends. >>>> * SVID Random:: @code{drand48} and friends. >>>> +* High Quality Random:: @code{arc4random} and friends. >>>> @end menu >>>> >>>> @node ISO Random >>>> @@ -1985,6 +1986,50 @@ This function is a GNU extension and should not be used in portable >>>> programs. >>>> @end deftypefun >>>> >>>> +@node High Quality Random >>>> +@subsection High Quality Random Number Functions >>>> + >>>> +This section describes the random number functions provided as a GNU >>>> +extension, based on OpenBSD interfaces. >>>> + >>>> +@Theglibc{} uses kernel entropy obtained either through @code{getrandom} >>>> +or by reading @file{/dev/urandom} to seed and periodically re-seed the >>>> +internal state. A per-thread data pool is used, which allows fast output >>>> +generation. >>>> + >>> Are we committing to per-thread data pools? I thought there were ideas to >>> use rseq. >> For this version yes, since it works on all supported kernels (even for the >> ones without getentropy support) and on all architectures. I do not know how >> feasible it would be to implement per-cpu caches along with rseq and it would >> require a fallback for older kernel (most likely a per-thread cache as this >> version), although it might be future improvement.
> I guess do we want to explicitly say per-thread buffer if we may want
> to experiment
> with something else?
>
> Just seems like the kind of thing that might make it impossible to re-implement
> another way.
>
> What about something like:
>
> "The data-pool is implemented to minimize cross-core contention
> allowing fast output generation"?

"Each thread has its own independent random stream"
> On 29 Jun 2022, at 19:05, Noah Goldstein <goldstein.w.n@gmail.com> wrote: > > On Wed, Jun 29, 2022 at 2:53 PM Adhemerval Zanella > <adhemerval.zanella@linaro.org> wrote: >> >> >> >>> On 29 Jun 2022, at 18:45, Noah Goldstein <goldstein.w.n@gmail.com> wrote: >>> >>> On Wed, Jun 29, 2022 at 2:36 PM Adhemerval Zanella via Libc-alpha >>> <libc-alpha@sourceware.org> wrote: >>>> >>>> --- >>>> manual/math.texi | 45 +++++++++++++++++++++++++++++++++++++++++++++ >>>> 1 file changed, 45 insertions(+) >>>> >>>> diff --git a/manual/math.texi b/manual/math.texi >>>> index 477a18b6d1..ab96726e57 100644 >>>> --- a/manual/math.texi >>>> +++ b/manual/math.texi >>>> @@ -1447,6 +1447,7 @@ systems. >>>> * ISO Random:: @code{rand} and friends. >>>> * BSD Random:: @code{random} and friends. >>>> * SVID Random:: @code{drand48} and friends. >>>> +* High Quality Random:: @code{arc4random} and friends. >>>> @end menu >>>> >>>> @node ISO Random >>>> @@ -1985,6 +1986,50 @@ This function is a GNU extension and should not be used in portable >>>> programs. >>>> @end deftypefun >>>> >>>> +@node High Quality Random >>>> +@subsection High Quality Random Number Functions >>>> + >>>> +This section describes the random number functions provided as a GNU >>>> +extension, based on OpenBSD interfaces. >>>> + >>>> +@Theglibc{} uses kernel entropy obtained either through @code{getrandom} >>>> +or by reading @file{/dev/urandom} to seed and periodically re-seed the >>>> +internal state. A per-thread data pool is used, which allows fast output >>>> +generation. >>>> + >>> >>> Are we committing to per-thread data pools? I thought there were ideas to >>> use rseq. >> >> For this version yes, since it works on all supported kernels (even for the >> ones without getentropy support) and on all architectures. 
I do not know how
>> feasible it would be to implement per-cpu caches along with rseq and it would
>> require a fallback for older kernel (most likely a per-thread cache as this
>> version), although it might be future improvement.
>
> I guess do we want to explicitly say per-thread buffer if we may want
> to experiment
> with something else?
>
> Just seems like the kind of thing that might make it impossible to re-implement
> another way.
>
> What about something like:
>
> "The data-pool is implemented to minimize cross-core contention
> allowing fast output generation"?

I take it the idea is to avoid adding too much implementation detail to the manual, although I take it that the documentation and the arc4random specification do not define performance or complexity details (as is done for some STL containers, for instance). If we ever change the arc4random implementation, I would expect this description to be updated as well.

But it does make sense to define expectations, so I will update the text with your suggestion (and Yann's remarks).
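To make the "per-thread data pool" idea under discussion concrete, here is a minimal sketch in C. It assumes a thread-local buffer that is refilled in bulk and then consumed without any locking, which is one way to read the wording in the patch; the names `rand_pool`, `pool_refill` and `pool_next_u32` are hypothetical. The sketch also simplifies heavily: it buffers bytes straight from `getrandom` instead of running the seeded pseudo-random generator that the manual text describes.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <sys/random.h>
#include <sys/types.h>

enum { POOL_BYTES = 64 };

/* Hypothetical per-thread pool: a small buffer plus bookkeeping.  */
struct rand_pool
{
  unsigned char data[POOL_BYTES];
  size_t avail;                 /* Unconsumed bytes at the end of DATA.  */
  unsigned long refills;        /* Number of refills, for illustration.  */
};

/* One pool per thread, so threads never contend on shared state.  */
static _Thread_local struct rand_pool tls_pool;

/* Refill the whole pool.  Simplification: bytes come directly from the
   kernel rather than from a generator keyed with kernel entropy.  */
static void
pool_refill (struct rand_pool *p)
{
  size_t done = 0;
  while (done < POOL_BYTES)
    {
      ssize_t n = getrandom (p->data + done, POOL_BYTES - done, 0);
      if (n > 0)
        done += (size_t) n;
      else if (n < 0 && errno == EINTR)
        continue;
      else
        abort ();               /* No fallback in this sketch.  */
    }
  p->avail = POOL_BYTES;
  p->refills++;
}

/* Return the next 32-bit value from the calling thread's pool.  */
static uint32_t
pool_next_u32 (void)
{
  struct rand_pool *p = &tls_pool;
  if (p->avail < sizeof (uint32_t))
    pool_refill (p);
  assert (p->avail >= sizeof (uint32_t));

  uint32_t v;
  memcpy (&v, p->data + (POOL_BYTES - p->avail), sizeof v);
  p->avail -= sizeof v;
  return v;
}
```

With a 64-byte pool, sixteen calls to `pool_next_u32` are served by a single refill. A per-CPU design based on rseq, as raised in the thread, would replace the `_Thread_local` storage while keeping the same caller-visible behaviour, which is the argument for keeping the manual wording implementation-neutral.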
diff --git a/manual/math.texi b/manual/math.texi
index 477a18b6d1..ab96726e57 100644
--- a/manual/math.texi
+++ b/manual/math.texi
@@ -1447,6 +1447,7 @@ systems.
 * ISO Random:: @code{rand} and friends.
 * BSD Random:: @code{random} and friends.
 * SVID Random:: @code{drand48} and friends.
+* High Quality Random:: @code{arc4random} and friends.
 @end menu
 
 @node ISO Random
@@ -1985,6 +1986,50 @@ This function is a GNU extension and should not be used in portable
 programs.
 @end deftypefun
 
+@node High Quality Random
+@subsection High Quality Random Number Functions
+
+This section describes the random number functions provided as a GNU
+extension, based on OpenBSD interfaces.
+
+@Theglibc{} uses kernel entropy obtained either through @code{getrandom}
+or by reading @file{/dev/urandom} to seed and periodically re-seed the
+internal state. A per-thread data pool is used, which allows fast output
+generation.
+
+Although these functions provide higher random quality than ISO, BSD, and
+SVID functions, these still use a Pseudo-Random generator and should not
+be used in cryptographic contexts.
+
+The internal state is cleared and reseed with kernel entropy on @code{fork}
+and @code{_Fork}. It is not cleared for either direct @code{clone} syscall
+or when using @theglibc{} @code{syscall} function.
+
+The prototypes for these functions are in @file{stdlib.h}.
+@pindex stdlib.h
+
+@deftypefun int32_t arc4random (void)
+@standards{BSD, stdlib.h}
+@safety{@mtsafe{}@asunsafe{@asucorrupt{}}@acsafe{}}
+This function returns a single 32-bit value in the range of @code{0} to
+@code{2^32−1} (inclusive), which is twice the range of @code{rand} and
+@code{random}.
+@end deftypefun
+
+@deftypefun void arc4random (void *@var{buffer}, size_t @var{length})
+@standards{BSD, stdlib.h}
+@safety{@mtsafe{}@asunsafe{@asucorrupt{}}@acsafe{}}
+This function fills the region @var{buffer} of @var{length} with random data.
+@end deftypefun
+
+@deftypefun uint32_t arc4random_uniform (uint32_t @var{upper_bound})
+@standards{BSD, stdlib.h}
+@safety{@mtsafe{}@asunsafe{@asucorrupt{}}@acsafe{}}
+This function returns a single 32-bit value, uniformly distributed but
+less than the @var{upper_bound}. It avoids the @w{modulo bias} when the
+upper bound is not a power of two.
+@end deftypefun
+
 @node FP Function Optimizations
 @section Is Fast Code or Small Code preferred?
 @cindex Optimization
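The modulo-bias remark in the `arc4random_uniform` description can be illustrated with rejection sampling. The sketch below is not taken from the patch: `uniform_below` is a hypothetical name, and `demo_next` is a deterministic xorshift stand-in for a 32-bit source, used only so the example is self-contained. It shows one well-known way to get an unbiased value below `upper_bound`: discard the few low outputs that would make some residues more likely than others.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Any function returning uniformly distributed 32-bit values.  */
typedef uint32_t (*u32_source) (void);

/* Hypothetical helper: return a value in [0, upper_bound) without
   modulo bias, drawing 32-bit values from NEXT.  */
static uint32_t
uniform_below (uint32_t upper_bound, u32_source next)
{
  assert (next != NULL);
  if (upper_bound < 2)
    return 0;

  /* 2^32 mod upper_bound, computed in 32-bit unsigned arithmetic.
     Values below MIN are the "excess" that would bias the result;
     MIN is 0 when upper_bound is a power of two.  */
  uint32_t min = -upper_bound % upper_bound;

  for (;;)
    {
      uint32_t r = next ();
      if (r >= min)
        return r % upper_bound;
      /* Otherwise reject and draw again.  */
    }
}

/* Stand-in source for demonstration only: xorshift32, NOT a
   high-quality or unpredictable generator.  */
static uint32_t demo_state = 2463534242u;

static uint32_t
demo_next (void)
{
  demo_state ^= demo_state << 13;
  demo_state ^= demo_state >> 17;
  demo_state ^= demo_state << 5;
  return demo_state;
}
```

For example, `uniform_below (6, demo_next)` simulates a die roll in the range 0 to 5. Fewer than half of all draws can be rejected for any bound, so the loop is expected to finish after less than two iterations on average.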