diff mbox series

[2/4] Revise manual sections on cryptography and random numbers.

Message ID 20180506175153.22629-3-zackw@panix.com
State New
Headers show
Series Revise crypto documentation, deprecate DES, add --disable-crypt | expand

Commit Message

Zack Weinberg May 6, 2018, 5:51 p.m. UTC
This is a major rewrite of the description of 'crypt'.  It also makes
minor improvements to the description of the obsolete DES functions,
and introduces some new introductory exposition to the "random number
generation" chapter.

A few highlights of the content changes:

 - All references to FIPS have been removed.  The "Legal Problems"
   section has also been removed, but I left a one-paragraph
   cautionary note that there _are_, or have been, legal problems at
   the top of the chapter.  I also removed the paragraph saying that
   systems not connected to a network need no user authentication,
   because that's a pretty rare situation nowadays.  (It still says
   "sometimes it is necessary" to authenticate the user, though.)

 - I added documentation for all of the hash functions that glibc
   actually supports, but not for the additional hash functions
   supported by libxcrypt.  If we're going to keep this manual section
   around after the transition is more advanced, it would probably
   make sense to add them then.

 - There is much more detailed discussion of how to generate a salt,
   and the failure behavior for crypt is documented.  (Returning an
   invalid hash on failure is what libxcrypt does; Solar Designer's
   notes say that this was done "for compatibility with old programs
   that assume crypt can never fail".)

 - As far as I can tell, the header 'crypt.h' is entirely a GNU
   invention, and never existed on any other Unix lineage.  The
   functions 'crypt', 'encrypt', and 'setkey', however, were in Issue
   1 of the SVID and are now in the XSI component of POSIX.  I tried
   to make all of the @standards annotations consistent with this, but
   I'm not sure I got them perfectly right.

 - The genpass.c example has been improved to use getentropy instead
   of the current time to generate the salt.  It uses more random
   bytes than is strictly necessary because I didn't want to
   complicate the code with proper base64 encoding.

 - The testpass.c example has been improved to take a username as a
   command-line argument and validate that user's password, instead of
   hardwiring a hash.  It really ought to read passwords out of the
   shadow file (getspnam() instead of getpwnam()), but we don't have
   any documentation for the shadow file functions, and I don't want
   to delay this work even more by writing some.

 - I tried to make the explanation of the encoding of the arguments to
   'encrypt' and 'setkey' better.  It's still a bit confusing but I
   don't think it's worth spending any more effort on.

 - I changed "hardware device" to "hardware DES accelerator" because a
   hardware device could be anything.

 - DES_FAILED is a macro, not a function, and the description of the
   value it returns was backwards.

	* manual/crypt.texi (Cryptographic Functions): New initial
	exposition.
	(Legal Problems): Section removed, replaced by a short caution in
	the initial exposition for 'Cryptographic Functions'.
	(crypt): Section renamed to 'Password Storage'.  Full rewrite.
	(DES Encryption): Section renamed to 'Obsolete Encryption'.
	Remove all mention of FIPS.  State clearly that none of these
	functions should be used anymore.  Various other improvements to
	the text.
	* manual/examples/genpass.c: Generate a salt using getentropy
	instead of the current time.
	* manual/examples/testpass.c: Instead of a hard-coded password
	hash, demonstrate reading passwords from the user database.
	* manual/string.texi (strfry, memfrob): Remove cross-references to
	the obsolete DES encryption functions.

	* manual/random.texi (Random Number Generation): New initial
	exposition.
	(Pseudo-Random Numbers): No need to xref the very next section.
	(Unpredictable Bytes): Improve initial exposition.  Clarify error
	behavior of getentropy and getrandom.
---
 manual/crypt.texi          | 445 ++++++++++++++++++++-----------------
 manual/examples/genpass.c  |  38 ++--
 manual/examples/testpass.c |  34 ++-
 manual/random.texi         |  84 +++++--
 manual/string.texi         |   6 +-
 5 files changed, 358 insertions(+), 249 deletions(-)

Comments

Florian Weimer May 8, 2018, 2:23 p.m. UTC | #1
On 05/06/2018 07:51 PM, Zack Weinberg wrote:

> +@Theglibc{} is not a general-purpose cryptography library, but it does
> +include one-way hash functions for password storage, and a set of
> +encryption functions based on the obsolete DES block cipher.  The
> +latter are present for historical reasons only, and should not be used
> +at all anymore; any existing programs that still use them need to be
> +updated to use ciphers that are still secure.

Okay.

> +Many countries have historically placed legal restrictions on the
> +import, export, possession, or use of cryptographic tools.  These
> +restrictions change frequently, so we cannot give you specific advice
> +about what the situation might be in your country.  To the best of our
> +knowledge, no one has objected to the export or import of @theglibc{}
> +in many years,

I think people routinely file notices for glibc with the appropriate 
authorities.  The whole thing doesn't really work the way the text 
implies.  It's not that nobody complains, it's that all this work is 
done proactively.  I don't think we need to get into detail here.

> but if you are writing a program that performs
> +cryptographic operations, we advise you to find out what your
> +country's laws might be, first.  The one-way hash functions for
> +password storage are less likely to be restricted than the DES-based
> +encryption functions, because they cannot be used to send secret
> +messages---the recipient would not be able to decrypt them.

It's fairly easy to turn a one-way hash function into a stream cipher, 
so I'm not sure if the final sentence is correct.

The HMAC construction can also be used with this and give you a cipher:

   <https://people.csail.mit.edu/rivest/Chaffing.txt>

I'm also not sure if DES qualifies as a cryptographic algorithm anymore, 
so the actual situation might well be the opposite of what you state. 
Sorry.

> +Programs that handle passwords must take special care not to reveal
> +them to anyone, no matter what.  It is not enough to keep them in a
> +file that is only accessible with special privileges.  The file might
> +be ``leaked'' via a bug or misconfiguration, and system administrators
> +shouldn't learn everyone's password even if they have to edit that
> +file for some reason.  To avoid this, passwords should also be
> +converted into @dfn{one-way hashes}, using a @dfn{one-way function},
> +before they are stored.  It is believed that the only way to recover a
> +password from its one-way hash is to guess possible passwords, hash
> +them, and compare the output with the hashes stored in the password
> +file.  The one-way functions are designed to make this process
> +impractically slow, for all but the most obvious guesses.  (Do not use
> +a word from the dictionary as your password.)

Okay.

> +@Theglibc{} provides an interface to four different one-way functions,
> +based on the SHA-2-512, SHA-2-256, MD5, and DES cryptographic
> +primitives.  New passwords should be hashed with either of the
> +SHA-based functions.  The others are too weak for newly set passwords,
> +but we continue to support them for verifying old passwords.  The
> +DES-based hash is especially weak, because it ignores all but the
> +first eight characters of its input.

Okay.

> +The function @code{crypt} converts a password string, @var{phrase},
> +into a one-way hash suitable for storage in the user database.  The
> +hash will consist entirely of printable ASCII characters.  It will not
> +contain whitespace, nor any of the characters @kbd{:}, @kbd{;},
> +@kbd{*}, @kbd{!}, or @kbd{\}.

@code{':'} etc.?  These are bytes, not keyboard characters.

> +The @var{salt} parameter controls which one-way function is used, and
> +it also ensures that the output of the one-way function is different
> +for every user, even if they have the same password.  This makes it
> +harder to guess passwords from a large user database.  Without salt,
> +the intruder could make a guess, run @code{crypt} on it once, and
> +compare the result with all the hashed passwords.  Salt forces the
> +intruder to make many calls to @code{crypt} for each guess.

Okay.

> +To verify a password, pass the previously hashed password as the
> +@var{salt}.  To hash a new password for storage, set @var{salt} to a
> +string consisting of a prefix plus a sequence of randomly chosen
> +characters, according to this table:
> +
> +@multitable @columnfractions .15 .1 .3
> +@headitem Algorithm @tab Prefix @tab Random sequence
> +@item SHA-2-512
> +@tab @samp{$6$}
> +@tab 16 characters
> +@item SHA-2-256
> +@tab @samp{$5$}
> +@tab 16 characters
> +@item MD5
> +@tab @samp{$1$}
> +@tab 8 characters
> +@item DES
> +@tab @samp{}
> +@tab 2 characters
> +@end multitable

Maybe use @code{"$6$"} etc. for the prefixes?  No strong preference on 
my part though.

> +In all cases, the random characters should be chosen from the alphabet
> +@code{./0-9A-Za-z}.

The code checks this, right?  Or perhaps only for DES.

> +With all of the hash functions @emph{except} DES, @var{phrase} can be
> +arbitrarily long, and all eight bits of each byte are significant.
> +With DES, all but the first eight characters of @var{phrase} are ignored,
> +and the eighth bit of each byte is also ignored.

Okay.

> +@code{crypt} can fail.  Some implementations return @code{NULL} on
> +failure, and others return an @emph{invalid} hashed passphrase, which
> +will begin with a @kbd{*} and will not be the same as @var{salt}.  In

@code{'*'}?

> +either case, @code{errno} will be set to indicate the problem.  Some
> +of the possible error codes are:
> +
> +@table @code
> +@item EINVAL
> +@var{salt} is invalid; neither a previously hashed password, nor a
> +well-formed new salt for any of the supported hash functions.
> +
> +@item EPERM
> +The hash function selected by @var{salt} has been disabled by local
> +policy.
> +
> +@item ENOMEM
> +Failed to allocate internal scratch storage.
> +
> +@item ENOSYS
> +@itemx EOPNOTSUPP
> +Hashing passphrases is not supported at all, or the hash function
> +selected by @var{salt} is not supported.  @Theglibc{} does not use
> +these error codes, but they may be encountered on other operating
> +systems.
> +@end table
> +
> +@code{crypt} uses static storage for both internal scratchwork and the
> +string it returns.  It is not safe to call @code{crypt} from multiple
> +threads simultaneously, and the string it returns will be overwritten
> +by any subsequent call to @code{crypt}.

Okay.

> +@code{crypt} is specified in the X/Open Portability Guide and is
> +present on nearly all historical Unix systems.  However, XPG does not
> +specify any one-way functions.

> +@code{crypt} is declared in @file{unistd.h}.  @Theglibc{} also
> +declares this function in @file{crypt.h}.

Okay.

>   @end deftypefun
>   
> -@deftypefun {char *} crypt_r (const char *@var{key}, const char *@var{salt}, {struct crypt_data *} @var{data})
> +@deftypefun {char *} crypt_r (const char *@var{phrase}, const char *@var{salt}, {struct crypt_data *} @var{data})
>   @standards{GNU, crypt.h}
>   @safety{@prelim{}@mtsafe{}@asunsafe{@asucorrupt{} @asulock{} @ascuheap{} @ascudlopen{}}@acunsafe{@aculock{} @acsmem{}}}
>   @c Compared with crypt, this function fixes the @mtasurace:crypt
>   @c problem, but nothing else.
>   
> -The @code{crypt_r} function does the same thing as @code{crypt}, but
> -takes an extra parameter which includes space for its result (among
> -other things), so it can be reentrant.  @code{data@w{->}initialized} must be
> -cleared to zero before the first time @code{crypt_r} is called.
> -
> -The @code{crypt_r} function is a GNU extension.
> +The function @code{crypt_r} is a reentrant version of @code{crypt}.

I don't think crypt_r is reentrant, it is merely thread-safe.  But no 
strong opinion.

> +Instead of static storage, it uses the memory pointed to by its
> +@var{data} argument for both scratchwork and the string it returns.
> +It can safely be used from multiple threads, as long as different
> +@var{data} objects are used in each thread.  The string it returns
> +will still be overwritten by another call with the same @var{data}.
> +@var{data} must be a pointer to a @samp{struct crypt_data} object
> +allocated by the caller, and before one of these objects is used for
> +the first time, it must be erased using
> +@samp{@w{memset(data, 0, sizeof data)}}
> +or similar.

Missing space before parentheses.

Perhaps add a warning that struct crypt_data is very large and should 
not be allocated on the stack?

> +The following program shows how to use @code{crypt} the first time a
> +password is entered.  It uses @code{getentropy} to make the salt as
> +unpredictable as possible; @pxref{Unpredictable Bytes}.

Okay.

>   @smallexample
>   @include genpass.c.texi
>   @end smallexample
>   
> -The next program shows how to verify a password.  It prompts the user
> -for a password and prints ``Access granted.'' if the user types
> -@code{GNU libc manual}.
> +@c When we have documentation for getspnam, change the demo to use it.
> +The next program shows how to verify a password stored in the system
> +user database in the traditional manner; @pxref{User Database}.  (On
> +newer installations, the passwords are instead stored in a ``shadow''
> +database that requires special privileges to access, and this program
> +will not work.)

Okay.

> +@node Obsolete Encryption
> +@section Obsolete Encryption
> +
> +For historical reasons, @theglibc{} includes several functions which
> +perform encryption using the obsolete Data Encryption Standard (DES).
> +None of these functions should be used anymore.  Instead, use one of
> +the many free encryption libraries that use modern ciphers.

Okay.  Maybe also say that these functions are not thread-safe?

> +DES is a block cipher standardized by the US government in 1977.  It
> +is no longer considered to be secure, and has been withdrawn as a
> +standard, because it only has @math{2@sup{56}} possible keys, so
> +testing all of them is practical.  In 1998, it cost US$250,000 to
> +build a massively parallel computer that could find the DES key that
> +encrypted a known message in three days.  In 2018, the same process
> +takes only one day, and there are online services that will do it for
> +less than US$100 per message.

Okay.

> +The @code{setkey} function prepares to perform DES encryption or
> +decryption using the key @var{key}.  @var{key} should point to an
> +array of 64 @code{char}s, each of which must be set to either @samp{0}
> +or @samp{1}; that is, each byte stores one bit of the key.  Every
> +eighth byte (array indices 7, 15, 23, @dots{}) must be set to give it
> +plus the preceding group of seven bytes odd parity.

“must be set to the negation of the parity of the preceding seven 
bits/bytes”?

>  For instance, if
> +there are three bytes set to @samp{1} among bytes 0 through 6, then
> +byte 7 must be set to @samp{0}, and similarly if there are four bytes
> +set to @samp{1} among bytes 8 through 14, then byte 15 must be set to
> +@samp{0}, and so on.  Thus, of the 64 bytes, only 56 can be used to
> +supply key data.

Okay.

> +The @code{setkey} function is specified in the X/Open Portability
> +Guide and is declared in @file{stdlib.h}.  @Theglibc{} also declares
> +this function in @file{crypt.h}.
>   @end deftypefun

Okay.

>   @deftypefun void encrypt (char *@var{block}, int @var{edflag})
> -@standards{BSD, crypt.h}
> -@standards{SVID, crypt.h}
> +@standards{X/Open, unistd.h}
> +@standards{GNU, crypt.h}
>   @safety{@prelim{}@mtunsafe{@mtasurace{:crypt}}@asunsafe{@asucorrupt{} @asulock{}}@acunsafe{@aculock{}}}
>   @c Same issues as setkey.
>   
> -The @code{encrypt} function encrypts @var{block} if
> -@var{edflag} is 0, otherwise it decrypts @var{block}, using a key
> -previously set by @code{setkey}.  The result is
> -placed in @var{block}.
> +The @code{encrypt} function encrypts @var{block} if @var{edflag} is 0,
> +otherwise it decrypts @var{block}, using a key previously set by
> +@code{setkey}.  The result overwrites the previous value of
> +@var{block}.
>   
> -Like @code{setkey}, @var{block} is specified as an array of 64 bits each
> -stored in a @code{char}, but there are no parity bits in @var{block}.
> +Like @code{setkey}, @var{block} is as an array of 64 @code{char}s,
> +each of which stores one bit of the block to be encrypted.  Unlike
> +@code{setkey}, there are no parity bits.  All 64 of the bits are
> +treated as data.
> +
> +The @code{encrypt} function is specified in the X/Open Portability
> +Guide and is declared in @file{unistd.h}.  @Theglibc{} also declares
> +this function in @file{crypt.h}.

Okay.

> @@ -245,25 +267,30 @@ stored in a @code{char}, but there are no parity bits in @var{block}.
>   These are reentrant versions of @code{setkey} and @code{encrypt}.  The
>   only difference is the extra parameter, which stores the expanded
>   version of @var{key}.  Before calling @code{setkey_r} the first time,
> -@code{data->initialized} must be cleared to zero.
> -@end deftypefun
> +@code{data} must be erased using
> +@samp{@w{memset(data, 0, sizeof data)}}
> +or similar.

Missing space before parenthesis.

> -The @code{setkey_r} and @code{encrypt_r} functions are GNU extensions.
> -@code{setkey}, @code{encrypt}, @code{setkey_r}, and @code{encrypt_r} are
> -defined in @file{crypt.h}.
> +@code{setkey_r} and @code{encrypt_r} are GNU extensions.  They are
> +declared in @file{crypt.h}.
> +@end deftypefun
>   
>   @deftypefun int ecb_crypt (char *@var{key}, char *@var{blocks}, unsigned int @var{len}, unsigned int @var{mode})
>   @standards{SUNRPC, rpc/des_crypt.h}
>   @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
>   
>   The function @code{ecb_crypt} encrypts or decrypts one or more blocks
> -using DES.  Each block is encrypted independently.
> +using DES.  Each block is encrypted independently, which means that if
> +any two input blocks are the same, then their encryptions will also be
> +the same.  This is an additional weakness in the encryption, on top of
> +the weakness of DES itself.

Maybe mention Electronic Codebook Mode (ECB)?

>   The @var{blocks} and the @var{key} are stored packed in 8-bit bytes, so
>   that the first bit of the key is the most-significant bit of
>   @code{key[0]} and the 63rd bit of the key is stored as the
> -least-significant bit of @code{key[7]}.  The @var{key} should have the
> -correct parity.
> +least-significant bit of @code{key[7]}.  The least-significant bit of
> +each byte must be chosen to give each byte odd parity, as with
> +@code{setkey}.

Okay.

>   @var{len} is the number of bytes in @var{blocks}.  It should be a
>   multiple of 8 (so that there are a whole number of blocks to encrypt).
> @@ -271,33 +298,33 @@ multiple of 8 (so that there are a whole number of blocks to encrypt).
>   
>   The result of the encryption replaces the input in @var{blocks}.
>   
> -The @var{mode} parameter is the bitwise OR of two of the following:
> +The @var{mode} parameter should be the bitwise OR of one of these two
> +constants:
>   
>   @vtable @code
>   @item DES_ENCRYPT
>   @standards{SUNRPC, rpc/des_crypt.h}
> -This constant, used in the @var{mode} parameter, specifies that
>   @var{blocks} is to be encrypted.
>   
>   @item DES_DECRYPT
>   @standards{SUNRPC, rpc/des_crypt.h}
> -This constant, used in the @var{mode} parameter, specifies that
>   @var{blocks} is to be decrypted.
> +@end vtable

Okay.

> +@noindent
> +with one of these two constants:
>   
> +@vtable @code
>   @item DES_HW
>   @standards{SUNRPC, rpc/des_crypt.h}
> -This constant, used in the @var{mode} parameter, asks to use a hardware
> -device.  If no hardware device is available, encryption happens anyway,
> -but in software.
> +Use a hardware DES accelerator, if available.  If no accelerator is
> +available, encryption or decryption will still occur, but in software.
>   
>   @item DES_SW
>   @standards{SUNRPC, rpc/des_crypt.h}
> -This constant, used in the @var{mode} parameter, specifies that no
> -hardware device is to be used.
> +Do not use a hardware DES accelerator.
>   @end vtable

Okay.

>   @deftypefun int cbc_crypt (char *@var{key}, char *@var{blocks}, unsigned int @var{len}, unsigned int @var{mode}, char *@var{ivec})
>   @standards{SUNRPC, rpc/des_crypt.h}
>   @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
> @@ -324,7 +351,8 @@ bytes.
>   Otherwise, all the parameters are similar to those for @code{ecb_crypt}.
>   @end deftypefun
>   
> -The result of the function will be one of these values:
> +@code{ecb_crypt} and @code{cbc_crypt} both return one of the following
> +values:
>   
>   @vtable @code
>   @item DESERR_NONE
> @@ -333,7 +361,7 @@ The encryption succeeded.
>   
>   @item DESERR_NOHWDEVICE
>   @standards{SUNRPC, rpc/des_crypt.h}
> -The encryption succeeded, but there was no hardware device available.
> +The encryption succeeded, but there was no hardware accelerator available.
>   
>   @item DESERR_HWERROR
>   @standards{SUNRPC, rpc/des_crypt.h}
> @@ -344,13 +372,20 @@ The encryption failed because of a hardware problem.
>   The encryption failed because of a bad parameter, for instance @var{len}
>   is not a multiple of 8 or @var{len} is larger than @code{DES_MAXDATA}.
>   @end vtable
> +@end deftypefun

Okay.

> +For convenience, there is a macro which distinguishes successful from
> +failure result codes:
>   
> -@deftypefun int DES_FAILED (int @var{err})
> +@deftypefn Macro int DES_FAILED (int @var{err})
>   @standards{SUNRPC, rpc/des_crypt.h}
>   @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
> -This macro returns 1 if @var{err} is a `success' result code from
> -@code{ecb_crypt} or @code{cbc_crypt}, and 0 otherwise.
> -@end deftypefun
> +This macro returns a nonzero value if @var{err} is a failure result
> +code from @code{ecb_crypt} or @code{cbc_crypt}, and 0 otherwise.
> +@end deftypefn
> +
> +There is also a helper function which assists in producing well-formed
> +DES keys:

Okay.

>   @deftypefun void des_setparity (char *@var{key})
>   @standards{SUNRPC, rpc/des_crypt.h}
> diff --git a/manual/examples/genpass.c b/manual/examples/genpass.c
> index 5edb2e9b8a..fe10a847cf 100644
> --- a/manual/examples/genpass.c
> +++ b/manual/examples/genpass.c
> @@ -16,34 +16,44 @@
>   */
>   
>   #include <stdio.h>
> -#include <time.h>
>   #include <unistd.h>
>   #include <crypt.h>
>   
>   int
>   main(void)
>   {
> -  unsigned long seed[2];
> -  char salt[] = "$1$........";
> -  const char *const seedchars =
> +  unsigned char ubytes[16];
> +  char salt[20];
> +  const char *const saltchars =
>       "./0123456789ABCDEFGHIJKLMNOPQRST"
>       "UVWXYZabcdefghijklmnopqrstuvwxyz";
>     char *password;
>     int i;
>   
> -  /* Generate a (not very) random seed.
> -     You should do it better than this...  */
> -  seed[0] = time(NULL);
> -  seed[1] = getpid() ^ (seed[0] >> 14 & 0x30000);
> -
> -  /* Turn it into printable characters from `seedchars'.  */
> -  for (i = 0; i < 8; i++)
> -    salt[3+i] = seedchars[(seed[i/5] >> (i%5)*6) & 0x3f];
> +  /* Retrieve 16 unpredictable bytes from the operating system.  */
> +  if (getentropy (ubytes, sizeof ubytes))
> +    {
> +      perror ("getentropy");
> +      return 1;
> +    }

Explicit check of getentropy return value against zero?  It's not 
exactly a boolean flag, after all.

> +
> +  /* Use them to fill in the salt string.  */
> +  salt[0] = '$';
> +  salt[1] = '5'; /* SHA-256 */
> +  salt[2] = '$';
> +  for (i = 0; i < 16; i++)
> +    salt[3+i] = saltchars[ubytes[i] & 0x3f];
> +  salt[19] = '\0';
>   
>     /* Read in the user's password and encrypt it.  */
> -  password = crypt(getpass("Password:"), salt);
> +  password = crypt (getpass ("Password:"), salt);
> +  if (!password)
> +    {
> +      perror ("crypt");
> +      return 1;
> +    }
>   
>     /* Print the results.  */
> -  puts(password);
> +  puts (password);
>     return 0;
>   }

Okay.

> diff --git a/manual/examples/testpass.c b/manual/examples/testpass.c
> index 19f1ae7de0..3cc2383ea8 100644
> --- a/manual/examples/testpass.c
> +++ b/manual/examples/testpass.c
> @@ -15,29 +15,45 @@
>      along with this program; if not, if not, see <http://www.gnu.org/licenses/>.
>   */
>   
> +#include <sys/types.h>
> +#include <pwd.h>
>   #include <stdio.h>
>   #include <string.h>
>   #include <unistd.h>
>   #include <crypt.h>
>   
>   int
> -main(void)
> +main(int argc, char **argv)
>   {
> -  /* Hashed form of "GNU libc manual".  */
> -  const char *const pass = "$1$/iSaq7rB$EoUw5jJPPvAPECNaaWzMK/";
> -
>     char *result;
>     int ok;
> +  struct passwd *pw;
> +
> +  if (argc < 2)
> +    {
> +      fprintf (stderr, "usage: %s username\n", argv[0]);
> +      return 1;
> +    }
> +
> +  /* Retrieve the password for user argv[1]
> +     from the user database.  */
> +  pw = getpwnam (argv[1]);

Check error?

> -/*@group*/
>     /* Read in the user's password and encrypt it,
>        passing the expected password in as the salt.  */
> -  result = crypt(getpass("Password:"), pass);
> -/*@end group*/
> +  result = crypt (getpass ("Password:"), pw->pw_passwd);
> +  if (!result)
> +    {
> +      perror ("crypt");
> +      return 1;
> +    }
> +
> +  printf ("Stored:  %s\n", pw->pw_passwd);
> +  printf ("Current: %s\n", result);
>   
>     /* Test the result.  */
> -  ok = strcmp (result, pass) == 0;
> +  ok = strcmp (result, pw->pw_passwd) == 0;

Maybe add a comment that this could result in a timing oracle?

> -  puts(ok ? "Access granted." : "Access denied.");
> +  puts (ok ? "Access granted." : "Access denied.");
>     return ok ? 0 : 1;
>   }
> diff --git a/manual/random.texi b/manual/random.texi
> index c53ed7fc22..c68a43aca0 100644
> --- a/manual/random.texi
> +++ b/manual/random.texi
> @@ -2,6 +2,52 @@
>   @chapter Random Number Generation
>   @c %MENU% Various ways to generate random values.
>   
> +Many algorithms require a source of @dfn{random numbers}, or to be
> +more precise, sequences of numbers chosen uniformly at random from
> +some subset of the integers or reals.  There are several different
> +ways to generate random numbers, depending on how stringent your
> +requirements are.
> +
> +A @dfn{pseudo-random number generator} (PRNG) produces a sequence of
> +numbers that @emph{appears} to be random, and has statistical
> +properties matching what we expect of numbers chosen uniformly at
> +random.  However, an ordinary PRNG doesn't guarantee that its output
> +is unguessable.  Also, the output of a PRNG depends on a relatively
> +small @dfn{seed} value, and so there are only a small number of
> +sequences that it can produce; astronomically small, relative to the
> +total number of random sequences.  If the seed is reused, the output
> +will be exactly the same, which is sometimes exactly what you want,
> +and sometimes disastrous.

Okay.

> +A @dfn{cryptographically strong pseudo-random number generator}
> +(CSPRNG) is a PRNG that @emph{does} guarantee its output is
> +unguessable.  Formally, there is no deterministic, polynomial-time
> +algorithm@footnote{Assuming
> +@iftex
> +@c Don't typeset NP as if multiplying N by P. Use text italic for both.
> +@math{\hbox{\it P} \neq \hbox{\it NP}}@c
> +@end iftex
> +@ifnottex
> +@math{P ≠ NP}@c
> +@end ifnottex
> +.} that can tell the difference between the output of
> +a CSPRNG and a sequence of numbers that really were chosen uniformly
> +at random.  A CSPRNG still uses a seed and can only produce an
> +astronomically small number of random sequences.

I'm not sure if this detail is necessary.  If it is, you need to add 
“independent” somewhere.

> +Finally, a @dfn{true random number generator} (TRNG) produces random
> +numbers not from a seed, but from some unpredictable physical process.
> +In principle, a TRNG's output is unguessable, and is not limited to an
> +astronomically small number of sequences.  However, TRNGs are very
> +slow, and because there is no seed, there is no way to make one repeat
> +its output.  Usually, it is best to use a TRNG only to choose the seed
> +for a PRNG or CSPRNG.
> +
> +At present, @theglibc{} offers a variety of ordinary PRNGs, and on
> +some operating systems it also offers access to an OS-provided TRNG.
> +We may add a CSPRNG in the future.

If this refers to arc4random, I don't think it's a CSPRNG under your 
definition.  And we shouldn't claim it has formal properties initially 
because that would make glibc subject to FIPS certification again, which 
is something we should avoid.

> +
> +
>   @menu
>   * Pseudo-Random Numbers::       Sequences of numbers with apparently
>                                    random distribution, but not difficult
> @@ -30,8 +76,7 @@ is convenient when you are debugging a program, but it is unhelpful if
>   you want the program to behave unpredictably.  If you want a different
>   pseudo-random series each time your program runs, you must specify a
>   different seed each time.  For ordinary purposes, basing the seed on the
> -current time works well.  For random numbers in cryptography,
> -@pxref{Unpredictable Bytes}.
> +current time works well.

Okay.

> @@ -595,14 +640,18 @@ programs.
>   @node Unpredictable Bytes
>   @section Generating Unpredictable Bytes
>   
> -Some cryptographic applications (such as session key generation) need
> -unpredictable bytes.
> +Some cryptographic applications need truly unpredictable bytes.
> +@Theglibc{} provides two functions for this purpose, both of which
> +access a true random generator implemented by the operating system.

I don't think the Linux urandom generator qualifies as a TRNG, sorry.

> +Not all operating systems support these functions; programs that use
> +them must be prepared for them to fail.  They are slow, and can only
> +produce short sequences of unpredictable bytes.  Most programs should
> +use these functions only to seed a cryptographically strong
> +pseudo-random generator.

The functions aren't *that* slow .  But I don't have a proposal for 
improving the wording.

> -In general, application code should use a deterministic random bit
> -generator, which could call the @code{getentropy} function described
> -below internally to obtain randomness to seed the generator.  The
> -@code{getrandom} function is intended for low-level applications which
> -need additional control over the blocking behavior.
> +Most programs should use @code{getentropy}.  The @code{getrandom}
> +function is intended for low-level applications which need additional
> +control over the blocking behavior.

Okay.

>   @deftypefun int getentropy (void *@var{buffer}, size_t @var{length})
>   @standards{GNU, sys/random.h}
> @@ -623,7 +672,9 @@ will keep blocking even if a signal arrives, and return only after the
>   entropy pool has been initialized.
>   
>   The @code{getentropy} function can fail with several errors, some of
> -which are listed below.
> +which are listed below.  Applications must check for failure.
> +However, when it succeeds, it will always produce exactly @var{length}
> +bytes of random data.

Okay.

> @@ -645,11 +696,11 @@ could not be overwritten with random data for an unspecified reason.
>   @standards{GNU, sys/random.h}
>   @safety{@mtsafe{}@assafe{}@acsafe{}}
>   
> -This function writes @var{length} bytes of random data to the array
> -starting at @var{buffer}.  On success, this function returns the number
> -of bytes which have been written to the buffer (which can be less than
> -@var{length}).  On error, @code{-1} is returned, and @code{errno} is
> -updated accordingly.
> +This function writes up to @var{length} bytes of random data to the
> +array starting at @var{buffer}.  On success, this function returns the
> +number of bytes which have been written to the buffer, which can be
> +less than @var{length}.  On error, @code{-1} is returned, and
> +@code{errno} is updated accordingly.

Okay.

>   The @code{getrandom} function is declared in the header file
>   @file{sys/random.h}.  It is a GNU extension.
> @@ -676,7 +727,8 @@ booted and the pool has not yet been initialized.
>   
>   The @code{getrandom} function can fail with several errors, some of
>   which are listed below.  In addition, the function may not fill the
> -buffer completely and return a value less than @var{length}.
> +buffer completely.  Applications must check for failure and for short
> +writes.

Okay.

>   @table @code
>   @item ENOSYS
> diff --git a/manual/string.texi b/manual/string.texi
> index b07cfb4550..f113777f2d 100644
> --- a/manual/string.texi
> +++ b/manual/string.texi
> @@ -2469,8 +2469,6 @@ Unlike Rot13, @code{memfrob} works on arbitrary binary data, not just
>   text.
>   @cindex Rot13
>   
> -For true encryption, @xref{Cryptographic Functions}.
> -

Okay.

>   This function is declared in @file{string.h}.
>   @pindex string.h
>   
> @@ -2487,9 +2485,7 @@ Note that @code{memfrob} a second time on the same data structure
>   returns it to its original state.
>   
>   This is a good function for hiding information from someone who doesn't
> -want to see it or doesn't want to see it very much.  To really prevent
> -people from retrieving the information, use stronger encryption such as
> -that described in @xref{Cryptographic Functions}.
> +want to see it or doesn't want to see it very much.

Hmm.  Okay I guess.

Thanks,
Florian
Zack Weinberg May 8, 2018, 3:58 p.m. UTC | #2
On Tue, May 8, 2018 at 10:23 AM, Florian Weimer <fweimer@redhat.com> wrote:
> On 05/06/2018 07:51 PM, Zack Weinberg wrote:
>> +Many countries have historically placed legal restrictions on the
>> +import, export, possession, or use of cryptographic tools.  These
>> +restrictions change frequently, so we cannot give you specific advice
>> +about what the situation might be in your country.  To the best of our
>> +knowledge, no one has objected to the export or import of @theglibc{}
>> +in many years,
>
> I think people routinely file notices for glibc with the appropriate
> authorities.  The whole thing doesn't really work the way the text implies.
> It's not that nobody complains, it's that all this work is done proactively.
> I don't think we need to get into detail here.
>
>> but if you are writing a program that performs
>> +cryptographic operations, we advise you to find out what your
>> +country's laws might be, first.  The one-way hash functions for
>> +password storage are less likely to be restricted than the DES-based
>> +encryption functions, because they cannot be used to send secret
>> +messages---the recipient would not be able to decrypt them.
>
> It's fairly easy to turn a one-way hash function into a stream cipher, so
> I'm not sure if the final sentence is correct.
>
> The HMAC construction can also be used with this and give you a cipher:
>
>   <https://people.csail.mit.edu/rivest/Chaffing.txt>
>
> I'm also not sure if DES qualifies as a cryptographic algorithm anymore, so
> the actual situation might well be the opposite of what you state. Sorry.

I will think about this some more, and possibly consult one of my
actual lawyer friends over at the EFF.  Do you have any thoughts on
more appropriate wording?  Or, do you think we should not discuss this
at all, which is how I had it in the previous revision?

>> +The function @code{crypt} converts a password string, @var{phrase},
>> +into a one-way hash suitable for storage in the user database.  The
>> +hash will consist entirely of printable ASCII characters.  It will not
>> +contain whitespace, nor any of the characters @kbd{:}, @kbd{;},
>> +@kbd{*}, @kbd{!}, or @kbd{\}.
>
> @code{':'} etc.?  These are bytes, not keyboard characters.

I always have problems deciding which of @samp/@code/@kbd/... to use
for this sort of thing.  The current Texinfo manual says specifically
to use @samp for single characters "unless @kbd or @key is more
appropriate", and I think you're right that it shouldn't be @kbd, so
I'll change them to @samp in the next revision.

>> +@tab @samp{$6$}
>
> Maybe use @code{"$6$"} etc. for the prefixes?  No strong preference on my
> part though.

In this case I think we need to use @samp, because the prefix for DES
is the empty string, which will come out as ‘’ with @samp but nothing
at all with @code.

> The code checks [that the salt uses only certain characters], right?  Or perhaps only for DES.

It looks like the code only checks for DES, but I think it's
appropriate to document that the salt should use only those characters
anyway.

>> +will begin with a @kbd{*} and will not be the same as @var{salt}.  In
> @code{'*'}?

@samp, again, I think.

>> -The @code{crypt_r} function does the same thing as @code{crypt}, but
>> -takes an extra parameter which includes space for its result (among
>> -other things), so it can be reentrant.  @code{data@w{->}initialized} must
>> be
>> -cleared to zero before the first time @code{crypt_r} is called.
>> -
>> -The @code{crypt_r} function is a GNU extension.
>> +The function @code{crypt_r} is a reentrant version of @code{crypt}.
>
> I don't think crypt_r is reentrant, it is merely thread-safe.  But no strong
> opinion.

The term 'reentrant' was used by the text this replaced (see above
quoted diff) but I think I will change it to 'thread-safe' here and
elsewhere.

>> +@samp{@w{memset(data, 0, sizeof data)}}
>
> Missing space before parentheses.

Will fix.

> Perhaps add a warning that struct crypt_data is very large and should not be
> allocated on the stack?

Yeah, good idea.

> Okay.  Maybe also say that these functions are not thread-safe?

Only encrypt and setkey are not thread-safe.  I will make sure this is
clear, though.

>> +The @code{setkey} function prepares to perform DES encryption or
>> +decryption using the key @var{key}.  @var{key} should point to an
>> +array of 64 @code{char}s, each of which must be set to either @samp{0}
>> +or @samp{1}; that is, each byte stores one bit of the key.  Every
>> +eighth byte (array indices 7, 15, 23, @dots{}) must be set to give it
>> +plus the preceding group of seven bytes odd parity.
>
> “must be set to the negation of the parity of the preceding seven
> bits/bytes”?

What I wrote is confusing, but I'm not sure this suggestion is correct
either.  I'll think about it some more.

>> +@samp{@w{memset(data, 0, sizeof data)}}
>
> Missing space before parenthesis.

Will fix.

>>     The function @code{ecb_crypt} encrypts or decrypts one or more blocks
>> -using DES.  Each block is encrypted independently.
>> +using DES.  Each block is encrypted independently, which means that if
>> +any two input blocks are the same, then their encryptions will also be
>> +the same.  This is an additional weakness in the encryption, on top of
>> +the weakness of DES itself.
>
> Maybe mention Electronic Codebook Mode (ECB)?

Good point, not everyone already knows what "ecb" means :)

>> +  if (getentropy (ubytes, sizeof ubytes))
>> +    {
>> +      perror ("getentropy");
>> +      return 1;
>> +    }
>
> Explicit check of getentropy return value against zero?  It's not exactly a
> boolean flag, after all.

getentropy is documented to return 0 on success or -1 on failure, so I
think 'if (getentropy (...))' is the way to go here.  If it returned
the number of bytes read, that would be a different story, but it's
getrandom that does that.

>> +  /* Retrieve the password for user argv[1]
>> +     from the user database.  */
>> +  pw = getpwnam (argv[1]);
>
> Check error?

Doh! Will fix.

>> -  ok = strcmp (result, pass) == 0;
>> +  ok = strcmp (result, pw->pw_passwd) == 0;
>
> Maybe add a comment that this could result in a timing oracle?

It doesn't, because we are comparing the hashes, not the raw
passwords.  Maybe I should say _that_?

>> +A @dfn{cryptographically strong pseudo-random number generator}
>> +(CSPRNG) is a PRNG that @emph{does} guarantee its output is
>> +unguessable.  Formally, there is no deterministic, polynomial-time
>> +algorithm@footnote{Assuming
>> +@iftex
>> +@c Don't typeset NP as if multiplying N by P. Use text italic for both.
>> +@math{\hbox{\it P} \neq \hbox{\it NP}}@c
>> +@end iftex
>> +@ifnottex
>> +@math{P ≠ NP}@c
>> +@end ifnottex
>> +.} that can tell the difference between the output of
>> +a CSPRNG and a sequence of numbers that really were chosen uniformly
>> +at random.  A CSPRNG still uses a seed and can only produce an
>> +astronomically small number of random sequences.
>
> I'm not sure if this detail is necessary.  If it is, you need to add
> “independent” somewhere.

I'm afraid I don't understand which bit of the above text is the
detail you think might not be necessary, or where the word
"independent" should appear.  This is probably because I have so
thoroughly internalized how the academic cryptography literature uses
the term "CSPRNG" that I no longer know how to explain it to someone
who's new to the topic. Could you clarify, please?

>> +Finally, a @dfn{true random number generator} (TRNG) produces random
>> +numbers not from a seed, but from some unpredictable physical process.
>> +In principle, a TRNG's output is unguessable, and is not limited to an
>> +astronomically small number of sequences.  However, TRNGs are very
>> +slow, and because there is no seed, there is no way to make one repeat
>> +its output.  Usually, it is best to use a TRNG only to choose the seed
>> +for a PRNG or CSPRNG.
>> +
>> +At present, @theglibc{} offers a variety of ordinary PRNGs, and on
>> +some operating systems it also offers access to an OS-provided TRNG.
>> +We may add a CSPRNG in the future.
>
> If this refers to arc4random, I don't think it's a CSPRNG under your
> definition.

Yes, that was the intent; why would it not qualify?

> And we shouldn't claim it has formal properties initially
> because that would make glibc subject to FIPS certification again, which is
> something we should avoid.

I meant to use "formal" in the mathematical sense -- "these are the
details of how the guarantee is made accessible to mathematical
proof".  Can you point me at some explanation of what phrasing might
or might not make a FIPS certification requirement apply?  I only know
the barest outline of that process.


>> +Some cryptographic applications need truly unpredictable bytes.
>> +@Theglibc{} provides two functions for this purpose, both of which
>> +access a true random generator implemented by the operating system.
>
> I don't think the Linux urandom generator qualifies as a TRNG, sorry.

Hmm.  I see why you say that, but for most programs' purposes it is
good enough.  Maybe that's what I should say, and also that if you
need something stronger you need to get yourself an actual piece of
hardware?

>
>> +Not all operating systems support these functions; programs that use
>> +them must be prepared for them to fail.  They are slow, and can only
>> +produce short sequences of unpredictable bytes.  Most programs should
>> +use these functions only to seed a cryptographically strong
>> +pseudo-random generator.
>
> The functions aren't *that* slow .  But I don't have a proposal for
> improving the wording.

I think maybe I just won't say they're slow.  The major reason for
wanting to use them only for seeds is the 256-byte limit on the output
of getentropy.

>>   This function is declared in @file{string.h}.
>>   @pindex string.h
>>   @@ -2487,9 +2485,7 @@ Note that @code{memfrob} a second time on the same
>> data structure
>>   returns it to its original state.
>>     This is a good function for hiding information from someone who
>> doesn't
>> -want to see it or doesn't want to see it very much.  To really prevent
>> -people from retrieving the information, use stronger encryption such as
>> -that described in @xref{Cryptographic Functions}.
>> +want to see it or doesn't want to see it very much.
>
> Hmm.  Okay I guess.

The point of this change is the manual shouldn't be endorsing DES even
by implication as a way of "really prevent[ing]" people from
retrieving information.  How about I mention libgcrypt again, instead
of saying nothing at all?

zw
Florian Weimer May 8, 2018, 5:51 p.m. UTC | #3
On 05/08/2018 05:58 PM, Zack Weinberg wrote:
> I will think about this some more, and possibly consult one of my
> actual lawyer friends over at the EFF.  Do you have any thoughts on
> more appropriate wording?  Or, do you think we should not discuss this
> at all, which is how I had it in the previous revision?

Not discussing it at all is fine with me.  I can't provide any concrete 
guidance for the current U.S. situation.

>>> +The function @code{crypt} converts a password string, @var{phrase},
>>> +into a one-way hash suitable for storage in the user database.  The
>>> +hash will consist entirely of printable ASCII characters.  It will not
>>> +contain whitespace, nor any of the characters @kbd{:}, @kbd{;},
>>> +@kbd{*}, @kbd{!}, or @kbd{\}.
>>
>> @code{':'} etc.?  These are bytes, not keyboard characters.
> 
> I always have problems deciding which of @samp/@code/@kbd/... to use
> for this sort of thing.  The current Texinfo manual says specifically
> to use @samp for single characters "unless @kbd or @key is more
> appropriate", and I think you're right that it shouldn't be @kbd, so
> I'll change them to @samp in the next revision.

I meant ':' as a C expression, hence wrapping it in @code.

>>> +@tab @samp{$6$}
>>
>> Maybe use @code{"$6$"} etc. for the prefixes?  No strong preference on my
>> part though.
> 
> In this case I think we need to use @samp, because the prefix for DES
> is the empty string, which will come out as ‘’ with @samp but nothing
> at all with @code.

Again, with "…" as a wrapper, so that the empty string is actually 
visible.  No strong opinion though.

>> The code checks [that the salt uses only certain characters], right?  Or perhaps only for DES.
> 
> It looks like the code only checks for DES, but I think it's
> appropriate to document that the salt should use only those characters
> anyway.

Right.

>>> +  if (getentropy (ubytes, sizeof ubytes))
>>> +    {
>>> +      perror ("getentropy");
>>> +      return 1;
>>> +    }
>>
>> Explicit check of getentropy return value against zero?  It's not exactly a
>> boolean flag, after all.
> 
> getentropy is documented to return 0 on success or -1 on failure, so I
> think 'if (getentropy (...))' is the way to go here.

Hmm.  I thought that the C/POSIX 0/-1 return value does not really 
qualify as a boolean because the meaning is reversed and the bit pattern 
is non-standard, but I'm not sure if that's part of our coding guidelines.

>>> -  ok = strcmp (result, pass) == 0;
>>> +  ok = strcmp (result, pw->pw_passwd) == 0;
>>
>> Maybe add a comment that this could result in a timing oracle?
> 
> It doesn't, because we are comparing the hashes, not the raw
> passwords.  Maybe I should say _that_?

It still makes me nervous. 8-/

In some cases, the comparison could go in the other direction.

>>> +A @dfn{cryptographically strong pseudo-random number generator}
>>> +(CSPRNG) is a PRNG that @emph{does} guarantee its output is
>>> +unguessable.  Formally, there is no deterministic, polynomial-time
>>> +algorithm@footnote{Assuming
>>> +@iftex
>>> +@c Don't typeset NP as if multiplying N by P. Use text italic for both.
>>> +@math{\hbox{\it P} \neq \hbox{\it NP}}@c
>>> +@end iftex
>>> +@ifnottex
>>> +@math{P ≠ NP}@c
>>> +@end ifnottex
>>> +.} that can tell the difference between the output of
>>> +a CSPRNG and a sequence of numbers that really were chosen uniformly
>>> +at random.  A CSPRNG still uses a seed and can only produce an
>>> +astronomically small number of random sequences.
>>
>> I'm not sure if this detail is necessary.  If it is, you need to add
>> “independent” somewhere.
> 
> I'm afraid I don't understand which bit of the above text is the
> detail you think might not be necessary, or where the word
> "independent" should appear.

The sequence of random variables against which the CSPRNG is compared 
must be uniformly distributed and independent, or something like that. 
This is something I last thought about probably fifteen years ago, so do 
not take my word for it.

However, I still think that this definition still has two problems: It 
is hard to prove anything in this area, so the definition isn't really 
helpful for documentation (because we don't know if our algorithm 
actually has that property).  And I'm vary of asymptomatics 
(“polynomial-time”) in this context anyway because our algorithms have 
fixed-size inputs and internal state, so everything is O(1) anyway.

>>> +Finally, a @dfn{true random number generator} (TRNG) produces random
>>> +numbers not from a seed, but from some unpredictable physical process.
>>> +In principle, a TRNG's output is unguessable, and is not limited to an
>>> +astronomically small number of sequences.  However, TRNGs are very
>>> +slow, and because there is no seed, there is no way to make one repeat
>>> +its output.  Usually, it is best to use a TRNG only to choose the seed
>>> +for a PRNG or CSPRNG.
>>> +
>>> +At present, @theglibc{} offers a variety of ordinary PRNGs, and on
>>> +some operating systems it also offers access to an OS-provided TRNG.
>>> +We may add a CSPRNG in the future.
>>
>> If this refers to arc4random, I don't think it's a CSPRNG under your
>> definition.
> 
> Yes, that was the intent; why would it not qualify?

We don't have proof for any NIST-approved algorithm that it is a CSPRNG, 
do we?

And existing proofs are often useless, see section 6 of 
<https://eprint.iacr.org/2006/229> (Neal Koblitz and Alfred Menezes, 
“Another Look at "Provable Security" II”).

>>> +Some cryptographic applications need truly unpredictable bytes.
>>> +@Theglibc{} provides two functions for this purpose, both of which
>>> +access a true random generator implemented by the operating system.
>>
>> I don't think the Linux urandom generator qualifies as a TRNG, sorry.
> 
> Hmm.  I see why you say that, but for most programs' purposes it is
> good enough.  Maybe that's what I should say, and also that if you
> need something stronger you need to get yourself an actual piece of
> hardware?

I think we should steer clear of crypto politics (including proof 
politics) and just say it's unpredictable.

> I think maybe I just won't say they're slow.  The major reason for
> wanting to use them only for seeds is the 256-byte limit on the output
> of getentropy.

That's a good point.

>>>    This function is declared in @file{string.h}.
>>>    @pindex string.h
>>>    @@ -2487,9 +2485,7 @@ Note that @code{memfrob} a second time on the same
>>> data structure
>>>    returns it to its original state.
>>>      This is a good function for hiding information from someone who
>>> doesn't
>>> -want to see it or doesn't want to see it very much.  To really prevent
>>> -people from retrieving the information, use stronger encryption such as
>>> -that described in @xref{Cryptographic Functions}.
>>> +want to see it or doesn't want to see it very much.
>>
>> Hmm.  Okay I guess.
> 
> The point of this change is the manual shouldn't be endorsing DES even
> by implication as a way of "really prevent[ing]" people from
> retrieving information.  How about I mention libgcrypt again, instead
> of saying nothing at all?

I like this idea.

Thanks,
Florian
Rical Jasan May 9, 2018, 11:36 a.m. UTC | #4
On 05/06/2018 10:51 AM, Zack Weinberg wrote:
> This is a major rewrite of the description of 'crypt'.  It also makes
> minor improvements to the description of the obsolete DES functions,
> and introduces some new introductory exposition to the "random number
> generation" chapter.
> 
> A few highlights of the content changes:
> 
>  - All references to FIPS have been removed.  The "Legal Problems"
>    section has also been removed, but I left a one-paragraph
>    cautionary note that there _are_, or have been, legal problems at
>    the top of the chapter.  I also removed the paragraph saying that
>    systems not connected to a network need no user authentication,
>    because that's a pretty rare situation nowadays.  (It still says
>    "sometimes it is necessary" to authenticate the user, though.)
> 
>  - I added documentation for all of the hash functions that glibc
>    actually supports, but not for the additional hash functions
>    supported by libxcrypt.  If we're going to keep this manual section
>    around after the transition is more advanced, it would probably
>    make sense to add them then.
> 
>  - There is much more detailed discussion of how to generate a salt,
>    and the failure behavior for crypt is documented.  (Returning an
>    invalid hash on failure is what libxcrypt does; Solar Designer's
>    notes say that this was done "for compatibility with old programs
>    that assume crypt can never fail".)
> 
>  - As far as I can tell, the header 'crypt.h' is entirely a GNU
>    invention, and never existed on any other Unix lineage.  The
>    functions 'crypt', 'encrypt', and 'setkey', however, were in Issue
>    1 of the SVID and are now in the XSI component of POSIX.  I tried
>    to make all of the @standards annotations consistent with this, but
>    I'm not sure I got them perfectly right.
> 
>  - The genpass.c example has been improved to use getentropy instead
>    of the current time to generate the salt.  It uses more random
>    bytes than is strictly necessary because I didn't want to
>    complicate the code with proper base64 encoding.
> 
>  - The testpass.c example has been improved to take a username as a
>    command-line argument and validate that user's password, instead of
>    hardwiring a hash.  It really ought to read passwords out of the
>    shadow file (getspnam() instead of getpwnam()), but we don't have
>    any documentation for the shadow file functions, and I don't want
>    to delay this work even more by writing some.
> 
>  - I tried to make the explanation of the encoding of the arguments to
>    'encrypt' and 'setkey' better.  It's still a bit confusing but I
>    don't think it's worth spending any more effort on.
> 
>  - I changed "hardware device" to "hardware DES accelerator" because a
>    hardware device could be anything.
> 
>  - DES_FAILED is a macro, not a function, and the description of the
>    value it returns was backwards.
> 
> 	* manual/crypt.texi (Cryptographic Functions): New initial
> 	exposition.
> 	(Legal Problems): Section removed, replaced by a short caution in
> 	the initial exposition for 'Cryptographic Functions'.
> 	(crypt): Section renamed to 'Password Storage'.  Full rewrite.
> 	(DES Encryption): Section renamed to 'Obsolete Encryption'.
> 	Remove all mention of FIPS.  State clearly that none of these
> 	functions should be used anymore.  Various other improvements to
> 	the text.
> 	* manual/examples/genpass.c: Generate a salt using getentropy
> 	instead of the current time.
> 	* manual/examples/testpass.c: Instead of a hard-coded password
> 	hash, demonstrate reading passwords from the user database.
> 	* manual/string.texi (strfry, memfrob): Remove cross-references to
> 	the obsolete DES encryption functions.
> 
> 	* manual/random.texi (Random Number Generation): New initial
> 	exposition.
> 	(Pseudo-Random Numbers): No need to xref the very next section.
> 	(Unpredictable Bytes): Improve initial exposition.  Clarify error
> 	behavior of getentropy and getrandom.
> ---

Another impeccable introduction, thank you.

>  manual/crypt.texi          | 445 ++++++++++++++++++++-----------------
>  manual/examples/genpass.c  |  38 ++--
>  manual/examples/testpass.c |  34 ++-
>  manual/random.texi         |  84 +++++--
>  manual/string.texi         |   6 +-
>  5 files changed, 358 insertions(+), 249 deletions(-)
> 
> diff --git a/manual/crypt.texi b/manual/crypt.texi
> index c24306c07c..a3c02be055 100644
> --- a/manual/crypt.texi
> +++ b/manual/crypt.texi
> @@ -2,73 +2,35 @@
>  @chapter Cryptographic Functions
>  @c %MENU% Password storage and obsolete encryption functions
>  
> -@vindex AUTH_DES
> -@cindex FIPS 140-2
> -It also provides support for Secure RPC, and some library functions that
> -can be used to perform normal DES encryption.  The @code{AUTH_DES}
> -authentication flavor in Secure RPC, as provided by @theglibc{},
> -uses DES and does not comply with FIPS 140-2 nor does any other use of DES
> -within @theglibc{}.  It is recommended that Secure RPC should not be used
> -for systems that need to comply with FIPS 140-2 since all flavors of
> -encrypted authentication use normal DES.
> +@Theglibc{} is not a general-purpose cryptography library, but it does
> +include one-way hash functions for password storage, and a set of
> +encryption functions based on the obsolete DES block cipher.  The
> +latter are present for historical reasons only, and should not be used
> +at all anymore; any existing programs that still use them need to be
> +updated to use ciphers that are still secure.

Could we say, "ciphers that are considered secure by modern standards"?
I know it's a bit of a mouthful, but "still" feels like DES originally
had a contemporary that has stood the test of time, or that glibc has
other ciphers that should be (or have been) used instead.  Maybe this is
a good moment to make yet another plug at using a cryptography library
for cryptography?  The introduction to Obsolete Encryption says "use one
of the many free encryption libraries that use modern ciphers", which
could work equally as well here too.

> +Many countries have historically placed legal restrictions on the
> +import, export, possession, or use of cryptographic tools.  These
> +restrictions change frequently, so we cannot give you specific advice
> +about what the situation might be in your country.  To the best of our
> +knowledge, no one has objected to the export or import of @theglibc{}
> +in many years, but if you are writing a program that performs
> +cryptographic operations, we advise you to find out what your
> +country's laws might be, first.  The one-way hash functions for
> +password storage are less likely to be restricted than the DES-based
> +encryption functions, because they cannot be used to send secret
> +messages---the recipient would not be able to decrypt them.

Noting Florian's comment in his reply, we could probably get rid of the
last sentence and keep the spirit of the paragraph intact.

>  
>  @menu
> -* Legal Problems::              This software can get you locked up, or worse.
> -* crypt::                       A one-way function for passwords.
> -* DES Encryption::              Routines for DES encryption.
> +* Password Storage::		A one-way function for passwords.
> +* Obsolete Encryption::		Routines for encrypting and decrypting
> +				  data using the obsolete DES block cipher.
>  @end menu
>  
> -@node Legal Problems
> -@section Legal Problems
> -
> -Because of the continuously changing state of the law, it's not possible
> -to provide a definitive survey of the laws affecting cryptography.
> -Instead, this section warns you of some of the known trouble spots; this
> -may help you when you try to find out what the laws of your country are.
> -
> -Some countries require that you have a license to use, possess, or import
> -cryptography.  These countries are believed to include Byelorussia,
> -Burma, India, Indonesia, Israel, Kazakhstan, Pakistan, Russia, and Saudi
> -Arabia.
> -
> -Some countries restrict the transmission of encrypted messages by radio;
> -some telecommunications carriers restrict the transmission of encrypted
> -messages over their network.
> -
> -Many countries have some form of export control for encryption software.
> -The Wassenaar Arrangement is a multilateral agreement between 33
> -countries (Argentina, Australia, Austria, Belgium, Bulgaria, Canada, the
> -Czech Republic, Denmark, Finland, France, Germany, Greece, Hungary,
> -Ireland, Italy, Japan, Luxembourg, the Netherlands, New Zealand, Norway,
> -Poland, Portugal, the Republic of Korea, Romania, the Russian
> -Federation, the Slovak Republic, Spain, Sweden, Switzerland, Turkey,
> -Ukraine, the United Kingdom and the United States) which restricts some
> -kinds of encryption exports.  Different countries apply the arrangement
> -in different ways; some do not allow the exception for certain kinds of
> -``public domain'' software (which would include this library), some
> -only restrict the export of software in tangible form, and others impose
> -significant additional restrictions.
> -
> -The United States has additional rules.  This software would generally
> -be exportable under 15 CFR 740.13(e), which permits exports of
> -``encryption source code'' which is ``publicly available'' and which is
> -``not subject to an express agreement for the payment of a licensing fee or
> -royalty for commercial production or sale of any product developed with
> -the source code'' to most countries.
> -
> -The rules in this area are continuously changing.  If you know of any
> -information in this manual that is out-of-date, please report it to
> -the bug database.  @xref{Reporting Bugs}.
> -
> -@node crypt
> -@section Encrypting Passwords
> -
> -On many systems, it is unnecessary to have any kind of user
> -authentication; for instance, a workstation which is not connected to a
> -network probably does not need any user authentication, because to use
> -the machine an intruder must have physical access.
> -
> -Sometimes, however, it is necessary to be sure that a user is authorized
> +@node Password Storage
> +@section Password Storage
> +
> +Sometimes it is necessary to be sure that a user is authorized
>  to use some service a machine provides---for instance, to log in as a
>  particular user id (@pxref{Users and Groups}).  One traditional way of
>  doing this is for each user to choose a secret @dfn{password}; then, the
> @@ -76,164 +38,224 @@ system can ask someone claiming to be a user what the user's password
>  is, and if the person gives the correct password then the system can
>  grant the appropriate privileges.
>  
> -If all the passwords are just stored in a file somewhere, then this file
> -has to be very carefully protected.  To avoid this, passwords are run
> -through a @dfn{one-way function}, a function which makes it difficult to
> -work out what its input was by looking at its output, before storing in
> -the file.
> -
> -@Theglibc{} provides a one-way function that is compatible with
> -the behavior of the @code{crypt} function introduced in FreeBSD 2.0.
> -It supports two one-way algorithms: one based on the MD5
> -message-digest algorithm that is compatible with modern BSD systems,
> -and the other based on the Data Encryption Standard (DES) that is
> -compatible with Unix systems.
> -
> -@deftypefun {char *} crypt (const char *@var{key}, const char *@var{salt})
> -@standards{BSD, crypt.h}
> -@standards{SVID, crypt.h}
> +Programs that handle passwords must take special care not to reveal
> +them to anyone, no matter what.  It is not enough to keep them in a
> +file that is only accessible with special privileges.  The file might
> +be ``leaked'' via a bug or misconfiguration, and system administrators
> +shouldn't learn everyone's password even if they have to edit that
> +file for some reason.  To avoid this, passwords should also be
> +converted into @dfn{one-way hashes}, using a @dfn{one-way function},
> +before they are stored.  It is believed that the only way to recover a
> +password from its one-way hash is to guess possible passwords, hash
> +them, and compare the output with the hashes stored in the password
> +file.  The one-way functions are designed to make this process
> +impractically slow, for all but the most obvious guesses.  (Do not use
> +a word from the dictionary as your password.)
> +
> +@Theglibc{} provides an interface to four different one-way functions,
> +based on the SHA-2-512, SHA-2-256, MD5, and DES cryptographic
> +primitives.  New passwords should be hashed with either of the
> +SHA-based functions.  The others are too weak for newly set passwords,
> +but we continue to support them for verifying old passwords.  The
> +DES-based hash is especially weak, because it ignores all but the
> +first eight characters of its input.
> +
> +@deftypefun {char *} crypt (const char *@var{phrase}, const char *@var{salt})

I suppose changing the variable name from KEY to PHRASE is more
accessible/intuitive to a wider audience, so OK, but it breaks
consistency with the other functions below.

> +@standards{X/Open, unistd.h}
> +@standards{GNU, crypt.h}

POSIX says crypt is in unistd.h (which is marked XSI) and has been
present from Issue 1, having been derived from Issue 1 of the SVID.  I
do not see a header include in the SYNOPSIS of crypt/encrypt/setkey in
my copy of the SVID 4Ev1, however, and I can't speak for BSD, so I'm not
sure if this is just /more correct/, or what.

It also touches on which standard we want to report (i.e., the first,
latest, all, etc.), and I think we may have been heading towards the
oldest relevant standard (which would probably be POSIX/XSI here; i.e.,
X/Open, if I'm understanding the interplay correctly).  Normally I would
suggest mentioning the origins in the text after changing the @standards
from BSD and SVID to X/Open, but since the POSIX docs point to the SVID,
I think there's enough of a trail left behind.

Perhaps someone else can weigh in on whether crypt.h is BSD or GNU.

>  @safety{@prelim{}@mtunsafe{@mtasurace{:crypt}}@asunsafe{@asucorrupt{} @asulock{} @ascuheap{} @ascudlopen{}}@acunsafe{@aculock{} @acsmem{}}}
>  @c Besides the obvious problem of returning a pointer into static
>  @c storage, the DES initializer takes an internal lock with the usual
> -@c set of problems for AS- and AC-Safety.  The FIPS mode checker and the
> -@c NSS implementations of may leak file descriptors if canceled.  The
> +@c set of problems for AS- and AC-Safety.
> +@c The NSS implementations may leak file descriptors if cancelled.
>  @c The MD5, SHA256 and SHA512 implementations will malloc on long keys,
>  @c and NSS relies on dlopening, which brings about another can of worms.
>  
> -The @code{crypt} function takes a password, @var{key}, as a string, and
> -a @var{salt} character array which is described below, and returns a
> -printable ASCII string which starts with another salt.  It is believed
> -that, given the output of the function, the best way to find a @var{key}
> -that will produce that output is to guess values of @var{key} until the
> -original value of @var{key} is found.
> -
> -The @var{salt} parameter does two things.  Firstly, it selects which
> -algorithm is used, the MD5-based one or the DES-based one.  Secondly, it
> -makes life harder for someone trying to guess passwords against a file
> -containing many passwords; without a @var{salt}, an intruder can make a
> -guess, run @code{crypt} on it once, and compare the result with all the
> -passwords.  With a @var{salt}, the intruder must run @code{crypt} once
> -for each different salt.
> -
> -For the MD5-based algorithm, the @var{salt} should consist of the string
> -@code{$1$}, followed by up to 8 characters, terminated by either
> -another @code{$} or the end of the string.  The result of @code{crypt}
> -will be the @var{salt}, followed by a @code{$} if the salt didn't end
> -with one, followed by 22 characters from the alphabet
> -@code{./0-9A-Za-z}, up to 34 characters total.  Every character in the
> -@var{key} is significant.
> -
> -For the DES-based algorithm, the @var{salt} should consist of two
> -characters from the alphabet @code{./0-9A-Za-z}, and the result of
> -@code{crypt} will be those two characters followed by 11 more from the
> -same alphabet, 13 in total.  Only the first 8 characters in the
> -@var{key} are significant.
> -
> -The MD5-based algorithm has no limit on the useful length of the
> -password used, and is slightly more secure.  It is therefore preferred
> -over the DES-based algorithm.
> -
> -When the user enters their password for the first time, the @var{salt}
> -should be set to a new string which is reasonably random.  To verify a
> -password against the result of a previous call to @code{crypt}, pass
> -the result of the previous call as the @var{salt}.
> +The function @code{crypt} converts a password string, @var{phrase},
> +into a one-way hash suitable for storage in the user database.  The
> +hash will consist entirely of printable ASCII characters.  It will not
> +contain whitespace, nor any of the characters @kbd{:}, @kbd{;},
> +@kbd{*}, @kbd{!}, or @kbd{\}.
> +
> +The @var{salt} parameter controls which one-way function is used, and
> +it also ensures that the output of the one-way function is different
> +for every user, even if they have the same password.  This makes it
> +harder to guess passwords from a large user database.  Without salt,
> +the intruder could make a guess, run @code{crypt} on it once, and
> +compare the result with all the hashed passwords.  Salt forces the
> +intruder to make many calls to @code{crypt} for each guess.
> +
> +To verify a password, pass the previously hashed password as the
> +@var{salt}.  To hash a new password for storage, set @var{salt} to a
> +string consisting of a prefix plus a sequence of randomly chosen
> +characters, according to this table:
> +
> +@multitable @columnfractions .15 .1 .3
> +@headitem Algorithm @tab Prefix @tab Random sequence
> +@item SHA-2-512
> +@tab @samp{$6$}
> +@tab 16 characters
> +@item SHA-2-256
> +@tab @samp{$5$}
> +@tab 16 characters
> +@item MD5
> +@tab @samp{$1$}
> +@tab 8 characters
> +@item DES
> +@tab @samp{}
> +@tab 2 characters
> +@end multitable
> +
> +In all cases, the random characters should be chosen from the alphabet
> +@code{./0-9A-Za-z}.
> +
> +With all of the hash functions @emph{except} DES, @var{phrase} can be
> +arbitrarily long, and all eight bits of each byte are significant.
> +With DES, all but the first eight characters of @var{phrase} are ignored,
> +and the eighth bit of each byte is also ignored.
> +
> +@code{crypt} can fail.  Some implementations return @code{NULL} on
> +failure, and others return an @emph{invalid} hashed passphrase, which
> +will begin with a @kbd{*} and will not be the same as @var{salt}.  In
> +either case, @code{errno} will be set to indicate the problem.  Some
> +of the possible error codes are:
> +
> +@table @code
> +@item EINVAL
> +@var{salt} is invalid; neither a previously hashed password, nor a
> +well-formed new salt for any of the supported hash functions.
> +
> +@item EPERM
> +The hash function selected by @var{salt} has been disabled by local
> +policy.
> +
> +@item ENOMEM
> +Failed to allocate internal scratch storage.
> +
> +@item ENOSYS
> +@itemx EOPNOTSUPP
> +Hashing passphrases is not supported at all, or the hash function
> +selected by @var{salt} is not supported.  @Theglibc{} does not use
> +these error codes, but they may be encountered on other operating
> +systems.
> +@end table
> +
> +@code{crypt} uses static storage for both internal scratchwork and the
> +string it returns.  It is not safe to call @code{crypt} from multiple
> +threads simultaneously, and the string it returns will be overwritten
> +by any subsequent call to @code{crypt}.
> +
> +@code{crypt} is specified in the X/Open Portability Guide and is
> +present on nearly all historical Unix systems.  However, XPG does not

"the XPG"

> +specify any one-way functions.
> +
> +@code{crypt} is declared in @file{unistd.h}.  @Theglibc{} also
> +declares this function in @file{crypt.h}.
>  @end deftypefun
>  
> -@deftypefun {char *} crypt_r (const char *@var{key}, const char *@var{salt}, {struct crypt_data *} @var{data})
> +@deftypefun {char *} crypt_r (const char *@var{phrase}, const char *@var{salt}, {struct crypt_data *} @var{data})

I believe it's safe to strip the braces around the struct and remove the
intervening space between the pointer and variable, for formatting
consistency.  (The braces are important around the return type, at least
for generating the Summary, IIRC.)

>  @standards{GNU, crypt.h}
>  @safety{@prelim{}@mtsafe{}@asunsafe{@asucorrupt{} @asulock{} @ascuheap{} @ascudlopen{}}@acunsafe{@aculock{} @acsmem{}}}
>  @c Compared with crypt, this function fixes the @mtasurace:crypt
>  @c problem, but nothing else.
>  
> -The @code{crypt_r} function does the same thing as @code{crypt}, but
> -takes an extra parameter which includes space for its result (among
> -other things), so it can be reentrant.  @code{data@w{->}initialized} must be
> -cleared to zero before the first time @code{crypt_r} is called.
> -
> -The @code{crypt_r} function is a GNU extension.
> +The function @code{crypt_r} is a reentrant version of @code{crypt}.
> +Instead of static storage, it uses the memory pointed to by its
> +@var{data} argument for both scratchwork and the string it returns.
> +It can safely be used from multiple threads, as long as different
> +@var{data} objects are used in each thread.  The string it returns
> +will still be overwritten by another call with the same @var{data}.
> +@var{data} must be a pointer to a @samp{struct crypt_data} object

Shouldn't the @samp be @code?

> +allocated by the caller, and before one of these objects is used for
> +the first time, it must be erased using
> +@samp{@w{memset(data, 0, sizeof data)}}
> +or similar.

In the past, we've avoided using function calls inline within
paragraphs, so this should be, "erased using:

@smallexample
memset (data, 0, sizeof (data));
@end smallexample

@noident
or similar."

(Also note syntax/style differences in the C code.)

> +
> +@code{crypt_r} is a GNU extension.  It is declared in @file{crypt.h},
> +as is @samp{struct crypt_data}.

@code, again, I think.

>  @end deftypefun
>  
> -The @code{crypt} and @code{crypt_r} functions are prototyped in the
> -header @file{crypt.h}.
> -
> -The following short program is an example of how to use @code{crypt} the
> -first time a password is entered.  Note that the @var{salt} generation
> -is just barely acceptable; in particular, it is not unique between
> -machines, and in many applications it would not be acceptable to let an
> -attacker know what time the user's password was last set.
> +The following program shows how to use @code{crypt} the first time a
> +password is entered.  It uses @code{getentropy} to make the salt as
> +unpredictable as possible; @pxref{Unpredictable Bytes}.
>  
>  @smallexample
>  @include genpass.c.texi
>  @end smallexample
>  
> -The next program shows how to verify a password.  It prompts the user
> -for a password and prints ``Access granted.'' if the user types
> -@code{GNU libc manual}.
> +@c When we have documentation for getspnam, change the demo to use it.

As much as I hate doing things like this, if you're going to leave a
note-to-future-self, at least use a FIXME or TODO on the line, because
you know, someday, somebody's gonna grep 'em all and fix 'em.

Or just file a bug on the lack of documentation.  ;)

> +The next program shows how to verify a password stored in the system
> +user database in the traditional manner; @pxref{User Database}.  (On
> +newer installations, the passwords are instead stored in a ``shadow''
> +database that requires special privileges to access, and this program
> +will not work.)
>  
>  @smallexample
>  @include testpass.c.texi
>  @end smallexample
>  
> -@node DES Encryption
> -@section DES Encryption
> -
> -@cindex FIPS 46-3
> -The Data Encryption Standard is described in the US Government Federal
> -Information Processing Standards (FIPS) 46-3 published by the National
> -Institute of Standards and Technology.  The DES has been very thoroughly
> -analyzed since it was developed in the late 1970s, and no new
> -significant flaws have been found.
> -
> -However, the DES uses only a 56-bit key (plus 8 parity bits), and a
> -machine has been built in 1998 which can search through all possible
> -keys in about 6 days, which cost about US$200000; faster searches would
> -be possible with more money.  This makes simple DES insecure for most
> -purposes, and NIST no longer permits new US government systems
> -to use simple DES.
> -
> -For serious encryption functionality, it is recommended that one of the
> -many free encryption libraries be used instead of these routines.
> -
> -The DES is a reversible operation which takes a 64-bit block and a
> -64-bit key, and produces another 64-bit block.  Usually the bits are
> -numbered so that the most-significant bit, the first bit, of each block
> -is numbered 1.
> -
> -Under that numbering, every 8th bit of the key (the 8th, 16th, and so
> -on) is not used by the encryption algorithm itself.  But the key must
> -have odd parity; that is, out of bits 1 through 8, and 9 through 16, and
> -so on, there must be an odd number of `1' bits, and this completely
> -specifies the unused bits.
> +@node Obsolete Encryption
> +@section Obsolete Encryption

Some @cindices for "Data Encryption Standard" and "DES" would be great here.

> +
> +For historical reasons, @theglibc{} includes several functions which
> +perform encryption using the obsolete Data Encryption Standard (DES).
> +None of these functions should be used anymore.  Instead, use one of
> +the many free encryption libraries that use modern ciphers.
> +
> +DES is a block cipher standardized by the US government in 1977.  It
> +is no longer considered to be secure, and has been withdrawn as a
> +standard, because it only has @math{2@sup{56}} possible keys, so
> +testing all of them is practical.  In 1998, it cost US$250,000 to
> +build a massively parallel computer that could find the DES key that
> +encrypted a known message in three days.  In 2018, the same process
> +takes only one day, and there are online services that will do it for
> +less than US$100 per message.

This is a good update on the state of the art.

>  @deftypefun void setkey (const char *@var{key})
> -@standards{BSD, crypt.h}
> -@standards{SVID, crypt.h}
> +@standards{X/Open, stdlib.h}
> +@standards{GNU, crypt.h}

Same comments apply here as for crypt, above.

>  @safety{@prelim{}@mtunsafe{@mtasurace{:crypt}}@asunsafe{@asucorrupt{} @asulock{}}@acunsafe{@aculock{}}}
>  @c The static buffer stores the key, making it fundamentally
>  @c thread-unsafe.  The locking issues are only in the initialization
>  @c path; cancelling the initialization will leave the lock held, it
>  @c would otherwise repeat the initialization on the next call.
>  
> -The @code{setkey} function sets an internal data structure to be an
> -expanded form of @var{key}.  @var{key} is specified as an array of 64
> -bits each stored in a @code{char}, the first bit is @code{key[0]} and
> -the 64th bit is @code{key[63]}.  The @var{key} should have the correct
> -parity.
> +The @code{setkey} function prepares to perform DES encryption or
> +decryption using the key @var{key}.  @var{key} should point to an
> +array of 64 @code{char}s, each of which must be set to either @samp{0}
> +or @samp{1}; that is, each byte stores one bit of the key.  Every
> +eighth byte (array indices 7, 15, 23, @dots{}) must be set to give it
> +plus the preceding group of seven bytes odd parity.  For instance, if

I feel this explanation is a vast improvement -- particularly given the
following explanation -- but I would inject some commas for grammar's
sake: "... it, plus the preceding group of seven bytes, odd parity."

I haven't vetted the source for correctness, but I think this paragraph
is much more understandable, overall, even going so far as to explain
parity without defining it...

> +there are three bytes set to @samp{1} among bytes 0 through 6, then
> +byte 7 must be set to @samp{0}, and similarly if there are four bytes
> +set to @samp{1} among bytes 8 through 14, then byte 15 must be set to
> +@samp{0}, and so on.  Thus, of the 64 bytes, only 56 can be used to

...although for odd parity, I believe the latter case should set the
parity bit/byte to 1.

As to the use of @samp for the 0's and 1's, I think they should be @math
(see [0]).

> +supply key data.
> +
> +The @code{setkey} function is specified in the X/Open Portability
> +Guide and is declared in @file{stdlib.h}.  @Theglibc{} also declares
> +this function in @file{crypt.h}.
>  @end deftypefun
>  
>  @deftypefun void encrypt (char *@var{block}, int @var{edflag})
> -@standards{BSD, crypt.h}
> -@standards{SVID, crypt.h}
> +@standards{X/Open, unistd.h}
> +@standards{GNU, crypt.h}

Same comments apply here as for crypt, above.

>  @safety{@prelim{}@mtunsafe{@mtasurace{:crypt}}@asunsafe{@asucorrupt{} @asulock{}}@acunsafe{@aculock{}}}
>  @c Same issues as setkey.
>  
> -The @code{encrypt} function encrypts @var{block} if
> -@var{edflag} is 0, otherwise it decrypts @var{block}, using a key
> -previously set by @code{setkey}.  The result is
> -placed in @var{block}.
> +The @code{encrypt} function encrypts @var{block} if @var{edflag} is 0,
> +otherwise it decrypts @var{block}, using a key previously set by
> +@code{setkey}.  The result overwrites the previous value of
> +@var{block}.
>  
> -Like @code{setkey}, @var{block} is specified as an array of 64 bits each
> -stored in a @code{char}, but there are no parity bits in @var{block}.
> +Like @code{setkey}, @var{block} is as an array of 64 @code{char}s,

"is an array"

> +each of which stores one bit of the block to be encrypted.  Unlike
> +@code{setkey}, there are no parity bits.  All 64 of the bits are
> +treated as data.

This is a good distinction.

> +The @code{encrypt} function is specified in the X/Open Portability
> +Guide and is declared in @file{unistd.h}.  @Theglibc{} also declares
> +this function in @file{crypt.h}.
>  @end deftypefun
>  
>  @deftypefun void setkey_r (const char *@var{key}, {struct crypt_data *} @var{data})

We could strip the braces from the struct and collapse the space here as
well.  This isn't necessarily everywhere we could do this, which could
even be a patch in and of itself---I'm only pointing it out because it
showed up in the diff.

> @@ -245,25 +267,30 @@ stored in a @code{char}, but there are no parity bits in @var{block}.
>  These are reentrant versions of @code{setkey} and @code{encrypt}.  The
>  only difference is the extra parameter, which stores the expanded
>  version of @var{key}.  Before calling @code{setkey_r} the first time,
> -@code{data->initialized} must be cleared to zero.
> -@end deftypefun
> +@code{data} must be erased using
> +@samp{@w{memset(data, 0, sizeof data)}}
> +or similar.

An @smallexample would avoid the paragraph-inline function here, as above.

> -The @code{setkey_r} and @code{encrypt_r} functions are GNU extensions.
> -@code{setkey}, @code{encrypt}, @code{setkey_r}, and @code{encrypt_r} are
> -defined in @file{crypt.h}.
> +@code{setkey_r} and @code{encrypt_r} are GNU extensions.  They are
> +declared in @file{crypt.h}.
> +@end deftypefun
>  
>  @deftypefun int ecb_crypt (char *@var{key}, char *@var{blocks}, unsigned int @var{len}, unsigned int @var{mode})
>  @standards{SUNRPC, rpc/des_crypt.h}
>  @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
>  
>  The function @code{ecb_crypt} encrypts or decrypts one or more blocks
> -using DES.  Each block is encrypted independently.

Normally, we would write this in such a way that ECB was spelled out
beforehand, but it's viable to work in parenthetically here.

> +using DES.  Each block is encrypted independently, which means that if
> +any two input blocks are the same, then their encryptions will also be
> +the same.  This is an additional weakness in the encryption, on top of
> +the weakness of DES itself.
>  
>  The @var{blocks} and the @var{key} are stored packed in 8-bit bytes, so
>  that the first bit of the key is the most-significant bit of
>  @code{key[0]} and the 63rd bit of the key is stored as the
> -least-significant bit of @code{key[7]}.  The @var{key} should have the
> -correct parity.
> +least-significant bit of @code{key[7]}.  The least-significant bit of
> +each byte must be chosen to give each byte odd parity, as with
> +@code{setkey}.

OK.

>  @var{len} is the number of bytes in @var{blocks}.  It should be a
>  multiple of 8 (so that there are a whole number of blocks to encrypt).
> @@ -271,33 +298,33 @@ multiple of 8 (so that there are a whole number of blocks to encrypt).
>  
>  The result of the encryption replaces the input in @var{blocks}.
>  
> -The @var{mode} parameter is the bitwise OR of two of the following:
> +The @var{mode} parameter should be the bitwise OR of one of these two
> +constants:
>  
>  @vtable @code
>  @item DES_ENCRYPT
>  @standards{SUNRPC, rpc/des_crypt.h}
> -This constant, used in the @var{mode} parameter, specifies that
>  @var{blocks} is to be encrypted.
>  
>  @item DES_DECRYPT
>  @standards{SUNRPC, rpc/des_crypt.h}
> -This constant, used in the @var{mode} parameter, specifies that
>  @var{blocks} is to be decrypted.
> +@end vtable

OK, but is there a default?

> +@noindent
> +with one of these two constants:
>  
> +@vtable @code
>  @item DES_HW
>  @standards{SUNRPC, rpc/des_crypt.h}
> -This constant, used in the @var{mode} parameter, asks to use a hardware
> -device.  If no hardware device is available, encryption happens anyway,
> -but in software.
> +Use a hardware DES accelerator, if available.  If no accelerator is
> +available, encryption or decryption will still occur, but in software.
>  
>  @item DES_SW
>  @standards{SUNRPC, rpc/des_crypt.h}
> -This constant, used in the @var{mode} parameter, specifies that no
> -hardware device is to be used.
> +Do not use a hardware DES accelerator.
>  @end vtable
>  
> -@end deftypefun
> -
>  @deftypefun int cbc_crypt (char *@var{key}, char *@var{blocks}, unsigned int @var{len}, unsigned int @var{mode}, char *@var{ivec})

I think the indentation breaks here (i.e., the "@end deftypefun"
shouldn't have been removed).

>  @standards{SUNRPC, rpc/des_crypt.h}
>  @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
> @@ -324,7 +351,8 @@ bytes.
>  Otherwise, all the parameters are similar to those for @code{ecb_crypt}.
>  @end deftypefun
>  
> -The result of the function will be one of these values:

We should reindent back here, after the function descriptions.

> +@code{ecb_crypt} and @code{cbc_crypt} both return one of the following
> +values:
>  
>  @vtable @code
>  @item DESERR_NONE
> @@ -333,7 +361,7 @@ The encryption succeeded.
>  
>  @item DESERR_NOHWDEVICE
>  @standards{SUNRPC, rpc/des_crypt.h}
> -The encryption succeeded, but there was no hardware device available.
> +The encryption succeeded, but there was no hardware accelerator available.
>  
>  @item DESERR_HWERROR
>  @standards{SUNRPC, rpc/des_crypt.h}
> @@ -344,13 +372,20 @@ The encryption failed because of a hardware problem.
>  The encryption failed because of a bad parameter, for instance @var{len}
>  is not a multiple of 8 or @var{len} is larger than @code{DES_MAXDATA}.
>  @end vtable
> +@end deftypefun
> +
> +For convenience, there is a macro which distinguishes successful from
> +failure result codes:
>  
> -@deftypefun int DES_FAILED (int @var{err})
> +@deftypefn Macro int DES_FAILED (int @var{err})

OK (Function-like macro).

>  @standards{SUNRPC, rpc/des_crypt.h}
>  @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
> -This macro returns 1 if @var{err} is a `success' result code from
> -@code{ecb_crypt} or @code{cbc_crypt}, and 0 otherwise.
> -@end deftypefun
> +This macro returns a nonzero value if @var{err} is a failure result
> +code from @code{ecb_crypt} or @code{cbc_crypt}, and 0 otherwise.
> +@end deftypefn

Good.

> +There is also a helper function which assists in producing well-formed
> +DES keys:
>  
>  @deftypefun void des_setparity (char *@var{key})
>  @standards{SUNRPC, rpc/des_crypt.h}


> diff --git a/manual/examples/genpass.c b/manual/examples/genpass.c
...> diff --git a/manual/examples/testpass.c b/manual/examples/testpass.c
...> diff --git a/manual/random.texi b/manual/random.texi

I have to pass on these for now due to time constraints, but I think
I've thrown in a few cents worth.

Thank you,
Rical

[0] https://sourceware.org/ml/libc-alpha/2017-11/msg00653.html
Joseph Myers May 9, 2018, 3:47 p.m. UTC | #5
On Wed, 9 May 2018, Rical Jasan wrote:

> Perhaps someone else can weigh in on whether crypt.h is BSD or GNU.

crypt.h is not in the CSRG BSD history (so not in BSD 4.4, which is what 
would typically have been meant by BSD in the manual).  See e.g. 
<https://github.com/weiss/original-bsd>.
diff mbox series

Patch

diff --git a/manual/crypt.texi b/manual/crypt.texi
index c24306c07c..a3c02be055 100644
--- a/manual/crypt.texi
+++ b/manual/crypt.texi
@@ -2,73 +2,35 @@ 
 @chapter Cryptographic Functions
 @c %MENU% Password storage and obsolete encryption functions
 
-@vindex AUTH_DES
-@cindex FIPS 140-2
-It also provides support for Secure RPC, and some library functions that
-can be used to perform normal DES encryption.  The @code{AUTH_DES}
-authentication flavor in Secure RPC, as provided by @theglibc{},
-uses DES and does not comply with FIPS 140-2 nor does any other use of DES
-within @theglibc{}.  It is recommended that Secure RPC should not be used
-for systems that need to comply with FIPS 140-2 since all flavors of
-encrypted authentication use normal DES.
+@Theglibc{} is not a general-purpose cryptography library, but it does
+include one-way hash functions for password storage, and a set of
+encryption functions based on the obsolete DES block cipher.  The
+latter are present for historical reasons only, and should not be used
+at all anymore; any existing programs that still use them need to be
+updated to use ciphers that are still secure.
+
+Many countries have historically placed legal restrictions on the
+import, export, possession, or use of cryptographic tools.  These
+restrictions change frequently, so we cannot give you specific advice
+about what the situation might be in your country.  To the best of our
+knowledge, no one has objected to the export or import of @theglibc{}
+in many years, but if you are writing a program that performs
+cryptographic operations, we advise you to find out what your
+country's laws might be, first.  The one-way hash functions for
+password storage are less likely to be restricted than the DES-based
+encryption functions, because they cannot be used to send secret
+messages---the recipient would not be able to decrypt them.
 
 @menu
-* Legal Problems::              This software can get you locked up, or worse.
-* crypt::                       A one-way function for passwords.
-* DES Encryption::              Routines for DES encryption.
+* Password Storage::		A one-way function for passwords.
+* Obsolete Encryption::		Routines for encrypting and decrypting
+				  data using the obsolete DES block cipher.
 @end menu
 
-@node Legal Problems
-@section Legal Problems
-
-Because of the continuously changing state of the law, it's not possible
-to provide a definitive survey of the laws affecting cryptography.
-Instead, this section warns you of some of the known trouble spots; this
-may help you when you try to find out what the laws of your country are.
-
-Some countries require that you have a license to use, possess, or import
-cryptography.  These countries are believed to include Byelorussia,
-Burma, India, Indonesia, Israel, Kazakhstan, Pakistan, Russia, and Saudi
-Arabia.
-
-Some countries restrict the transmission of encrypted messages by radio;
-some telecommunications carriers restrict the transmission of encrypted
-messages over their network.
-
-Many countries have some form of export control for encryption software.
-The Wassenaar Arrangement is a multilateral agreement between 33
-countries (Argentina, Australia, Austria, Belgium, Bulgaria, Canada, the
-Czech Republic, Denmark, Finland, France, Germany, Greece, Hungary,
-Ireland, Italy, Japan, Luxembourg, the Netherlands, New Zealand, Norway,
-Poland, Portugal, the Republic of Korea, Romania, the Russian
-Federation, the Slovak Republic, Spain, Sweden, Switzerland, Turkey,
-Ukraine, the United Kingdom and the United States) which restricts some
-kinds of encryption exports.  Different countries apply the arrangement
-in different ways; some do not allow the exception for certain kinds of
-``public domain'' software (which would include this library), some
-only restrict the export of software in tangible form, and others impose
-significant additional restrictions.
-
-The United States has additional rules.  This software would generally
-be exportable under 15 CFR 740.13(e), which permits exports of
-``encryption source code'' which is ``publicly available'' and which is
-``not subject to an express agreement for the payment of a licensing fee or
-royalty for commercial production or sale of any product developed with
-the source code'' to most countries.
-
-The rules in this area are continuously changing.  If you know of any
-information in this manual that is out-of-date, please report it to
-the bug database.  @xref{Reporting Bugs}.
-
-@node crypt
-@section Encrypting Passwords
-
-On many systems, it is unnecessary to have any kind of user
-authentication; for instance, a workstation which is not connected to a
-network probably does not need any user authentication, because to use
-the machine an intruder must have physical access.
-
-Sometimes, however, it is necessary to be sure that a user is authorized
+@node Password Storage
+@section Password Storage
+
+Sometimes it is necessary to be sure that a user is authorized
 to use some service a machine provides---for instance, to log in as a
 particular user id (@pxref{Users and Groups}).  One traditional way of
 doing this is for each user to choose a secret @dfn{password}; then, the
@@ -76,164 +38,224 @@  system can ask someone claiming to be a user what the user's password
 is, and if the person gives the correct password then the system can
 grant the appropriate privileges.
 
-If all the passwords are just stored in a file somewhere, then this file
-has to be very carefully protected.  To avoid this, passwords are run
-through a @dfn{one-way function}, a function which makes it difficult to
-work out what its input was by looking at its output, before storing in
-the file.
-
-@Theglibc{} provides a one-way function that is compatible with
-the behavior of the @code{crypt} function introduced in FreeBSD 2.0.
-It supports two one-way algorithms: one based on the MD5
-message-digest algorithm that is compatible with modern BSD systems,
-and the other based on the Data Encryption Standard (DES) that is
-compatible with Unix systems.
-
-@deftypefun {char *} crypt (const char *@var{key}, const char *@var{salt})
-@standards{BSD, crypt.h}
-@standards{SVID, crypt.h}
+Programs that handle passwords must take special care not to reveal
+them to anyone, no matter what.  It is not enough to keep them in a
+file that is only accessible with special privileges.  The file might
+be ``leaked'' via a bug or misconfiguration, and system administrators
+shouldn't learn everyone's password even if they have to edit that
+file for some reason.  To avoid this, passwords should also be
+converted into @dfn{one-way hashes}, using a @dfn{one-way function},
+before they are stored.  It is believed that the only way to recover a
+password from its one-way hash is to guess possible passwords, hash
+them, and compare the output with the hashes stored in the password
+file.  The one-way functions are designed to make this process
+impractically slow, for all but the most obvious guesses.  (Do not use
+a word from the dictionary as your password.)
+
+@Theglibc{} provides an interface to four different one-way functions,
+based on the SHA-2-512, SHA-2-256, MD5, and DES cryptographic
+primitives.  New passwords should be hashed with either of the
+SHA-based functions.  The others are too weak for newly set passwords,
+but we continue to support them for verifying old passwords.  The
+DES-based hash is especially weak, because it ignores all but the
+first eight characters of its input.
+
+@deftypefun {char *} crypt (const char *@var{phrase}, const char *@var{salt})
+@standards{X/Open, unistd.h}
+@standards{GNU, crypt.h}
 @safety{@prelim{}@mtunsafe{@mtasurace{:crypt}}@asunsafe{@asucorrupt{} @asulock{} @ascuheap{} @ascudlopen{}}@acunsafe{@aculock{} @acsmem{}}}
 @c Besides the obvious problem of returning a pointer into static
 @c storage, the DES initializer takes an internal lock with the usual
-@c set of problems for AS- and AC-Safety.  The FIPS mode checker and the
-@c NSS implementations of may leak file descriptors if canceled.  The
+@c set of problems for AS- and AC-Safety.
+@c The NSS implementations may leak file descriptors if cancelled.
 @c The MD5, SHA256 and SHA512 implementations will malloc on long keys,
 @c and NSS relies on dlopening, which brings about another can of worms.
 
-The @code{crypt} function takes a password, @var{key}, as a string, and
-a @var{salt} character array which is described below, and returns a
-printable ASCII string which starts with another salt.  It is believed
-that, given the output of the function, the best way to find a @var{key}
-that will produce that output is to guess values of @var{key} until the
-original value of @var{key} is found.
-
-The @var{salt} parameter does two things.  Firstly, it selects which
-algorithm is used, the MD5-based one or the DES-based one.  Secondly, it
-makes life harder for someone trying to guess passwords against a file
-containing many passwords; without a @var{salt}, an intruder can make a
-guess, run @code{crypt} on it once, and compare the result with all the
-passwords.  With a @var{salt}, the intruder must run @code{crypt} once
-for each different salt.
-
-For the MD5-based algorithm, the @var{salt} should consist of the string
-@code{$1$}, followed by up to 8 characters, terminated by either
-another @code{$} or the end of the string.  The result of @code{crypt}
-will be the @var{salt}, followed by a @code{$} if the salt didn't end
-with one, followed by 22 characters from the alphabet
-@code{./0-9A-Za-z}, up to 34 characters total.  Every character in the
-@var{key} is significant.
-
-For the DES-based algorithm, the @var{salt} should consist of two
-characters from the alphabet @code{./0-9A-Za-z}, and the result of
-@code{crypt} will be those two characters followed by 11 more from the
-same alphabet, 13 in total.  Only the first 8 characters in the
-@var{key} are significant.
-
-The MD5-based algorithm has no limit on the useful length of the
-password used, and is slightly more secure.  It is therefore preferred
-over the DES-based algorithm.
-
-When the user enters their password for the first time, the @var{salt}
-should be set to a new string which is reasonably random.  To verify a
-password against the result of a previous call to @code{crypt}, pass
-the result of the previous call as the @var{salt}.
+The function @code{crypt} converts a password string, @var{phrase},
+into a one-way hash suitable for storage in the user database.  The
+hash will consist entirely of printable ASCII characters.  It will not
+contain whitespace, nor any of the characters @kbd{:}, @kbd{;},
+@kbd{*}, @kbd{!}, or @kbd{\}.
+
+The @var{salt} parameter controls which one-way function is used, and
+it also ensures that the output of the one-way function is different
+for every user, even if they have the same password.  This makes it
+harder to guess passwords from a large user database.  Without salt,
+the intruder could make a guess, run @code{crypt} on it once, and
+compare the result with all the hashed passwords.  Salt forces the
+intruder to make many calls to @code{crypt} for each guess.
+
+To verify a password, pass the previously hashed password as the
+@var{salt}.  To hash a new password for storage, set @var{salt} to a
+string consisting of a prefix plus a sequence of randomly chosen
+characters, according to this table:
+
+@multitable @columnfractions .15 .1 .3
+@headitem Algorithm @tab Prefix @tab Random sequence
+@item SHA-2-512
+@tab @samp{$6$}
+@tab 16 characters
+@item SHA-2-256
+@tab @samp{$5$}
+@tab 16 characters
+@item MD5
+@tab @samp{$1$}
+@tab 8 characters
+@item DES
+@tab @samp{}
+@tab 2 characters
+@end multitable
+
+In all cases, the random characters should be chosen from the alphabet
+@code{./0-9A-Za-z}.
+
+With all of the hash functions @emph{except} DES, @var{phrase} can be
+arbitrarily long, and all eight bits of each byte are significant.
+With DES, all but the first eight characters of @var{phrase} are ignored,
+and the eighth bit of each byte is also ignored.
+
+@code{crypt} can fail.  Some implementations return @code{NULL} on
+failure, and others return an @emph{invalid} hashed passphrase, which
+will begin with a @kbd{*} and will not be the same as @var{salt}.  In
+either case, @code{errno} will be set to indicate the problem.  Some
+of the possible error codes are:
+
+@table @code
+@item EINVAL
+@var{salt} is invalid; neither a previously hashed password, nor a
+well-formed new salt for any of the supported hash functions.
+
+@item EPERM
+The hash function selected by @var{salt} has been disabled by local
+policy.
+
+@item ENOMEM
+Failed to allocate internal scratch storage.
+
+@item ENOSYS
+@itemx EOPNOTSUPP
+Hashing passphrases is not supported at all, or the hash function
+selected by @var{salt} is not supported.  @Theglibc{} does not use
+these error codes, but they may be encountered on other operating
+systems.
+@end table
+
+@code{crypt} uses static storage for both internal scratchwork and the
+string it returns.  It is not safe to call @code{crypt} from multiple
+threads simultaneously, and the string it returns will be overwritten
+by any subsequent call to @code{crypt}.
+
+@code{crypt} is specified in the X/Open Portability Guide and is
+present on nearly all historical Unix systems.  However, XPG does not
+specify any one-way functions.
+
+@code{crypt} is declared in @file{unistd.h}.  @Theglibc{} also
+declares this function in @file{crypt.h}.
 @end deftypefun
 
-@deftypefun {char *} crypt_r (const char *@var{key}, const char *@var{salt}, {struct crypt_data *} @var{data})
+@deftypefun {char *} crypt_r (const char *@var{phrase}, const char *@var{salt}, {struct crypt_data *} @var{data})
 @standards{GNU, crypt.h}
 @safety{@prelim{}@mtsafe{}@asunsafe{@asucorrupt{} @asulock{} @ascuheap{} @ascudlopen{}}@acunsafe{@aculock{} @acsmem{}}}
 @c Compared with crypt, this function fixes the @mtasurace:crypt
 @c problem, but nothing else.
 
-The @code{crypt_r} function does the same thing as @code{crypt}, but
-takes an extra parameter which includes space for its result (among
-other things), so it can be reentrant.  @code{data@w{->}initialized} must be
-cleared to zero before the first time @code{crypt_r} is called.
-
-The @code{crypt_r} function is a GNU extension.
+The function @code{crypt_r} is a reentrant version of @code{crypt}.
+Instead of static storage, it uses the memory pointed to by its
+@var{data} argument for both scratchwork and the string it returns.
+It can safely be used from multiple threads, as long as different
+@var{data} objects are used in each thread.  The string it returns
+will still be overwritten by another call with the same @var{data}.
+@var{data} must be a pointer to a @samp{struct crypt_data} object
+allocated by the caller, and before one of these objects is used for
+the first time, it must be erased using
+@samp{@w{memset(data, 0, sizeof data)}}
+or similar.
+
+@code{crypt_r} is a GNU extension.  It is declared in @file{crypt.h},
+as is @samp{struct crypt_data}.
 @end deftypefun
 
-The @code{crypt} and @code{crypt_r} functions are prototyped in the
-header @file{crypt.h}.
-
-The following short program is an example of how to use @code{crypt} the
-first time a password is entered.  Note that the @var{salt} generation
-is just barely acceptable; in particular, it is not unique between
-machines, and in many applications it would not be acceptable to let an
-attacker know what time the user's password was last set.
+The following program shows how to use @code{crypt} the first time a
+password is entered.  It uses @code{getentropy} to make the salt as
+unpredictable as possible; @pxref{Unpredictable Bytes}.
 
 @smallexample
 @include genpass.c.texi
 @end smallexample
 
-The next program shows how to verify a password.  It prompts the user
-for a password and prints ``Access granted.'' if the user types
-@code{GNU libc manual}.
+@c When we have documentation for getspnam, change the demo to use it.
+The next program shows how to verify a password stored in the system
+user database in the traditional manner; @pxref{User Database}.  (On
+newer installations, the passwords are instead stored in a ``shadow''
+database that requires special privileges to access, and this program
+will not work.)
 
 @smallexample
 @include testpass.c.texi
 @end smallexample
 
-@node DES Encryption
-@section DES Encryption
-
-@cindex FIPS 46-3
-The Data Encryption Standard is described in the US Government Federal
-Information Processing Standards (FIPS) 46-3 published by the National
-Institute of Standards and Technology.  The DES has been very thoroughly
-analyzed since it was developed in the late 1970s, and no new
-significant flaws have been found.
-
-However, the DES uses only a 56-bit key (plus 8 parity bits), and a
-machine has been built in 1998 which can search through all possible
-keys in about 6 days, which cost about US$200000; faster searches would
-be possible with more money.  This makes simple DES insecure for most
-purposes, and NIST no longer permits new US government systems
-to use simple DES.
-
-For serious encryption functionality, it is recommended that one of the
-many free encryption libraries be used instead of these routines.
-
-The DES is a reversible operation which takes a 64-bit block and a
-64-bit key, and produces another 64-bit block.  Usually the bits are
-numbered so that the most-significant bit, the first bit, of each block
-is numbered 1.
-
-Under that numbering, every 8th bit of the key (the 8th, 16th, and so
-on) is not used by the encryption algorithm itself.  But the key must
-have odd parity; that is, out of bits 1 through 8, and 9 through 16, and
-so on, there must be an odd number of `1' bits, and this completely
-specifies the unused bits.
+@node Obsolete Encryption
+@section Obsolete Encryption
+
+For historical reasons, @theglibc{} includes several functions which
+perform encryption using the obsolete Data Encryption Standard (DES).
+None of these functions should be used anymore.  Instead, use one of
+the many free encryption libraries that use modern ciphers.
+
+DES is a block cipher standardized by the US government in 1977.  It
+is no longer considered to be secure, and has been withdrawn as a
+standard, because it only has @math{2@sup{56}} possible keys, so
+testing all of them is practical.  In 1998, it cost US$250,000 to
+build a massively parallel computer that could find the DES key that
+encrypted a known message in three days.  In 2018, the same process
+takes only one day, and there are online services that will do it for
+less than US$100 per message.
 
 @deftypefun void setkey (const char *@var{key})
-@standards{BSD, crypt.h}
-@standards{SVID, crypt.h}
+@standards{X/Open, stdlib.h}
+@standards{GNU, crypt.h}
 @safety{@prelim{}@mtunsafe{@mtasurace{:crypt}}@asunsafe{@asucorrupt{} @asulock{}}@acunsafe{@aculock{}}}
 @c The static buffer stores the key, making it fundamentally
 @c thread-unsafe.  The locking issues are only in the initialization
 @c path; cancelling the initialization will leave the lock held, it
 @c would otherwise repeat the initialization on the next call.
 
-The @code{setkey} function sets an internal data structure to be an
-expanded form of @var{key}.  @var{key} is specified as an array of 64
-bits each stored in a @code{char}, the first bit is @code{key[0]} and
-the 64th bit is @code{key[63]}.  The @var{key} should have the correct
-parity.
+The @code{setkey} function prepares to perform DES encryption or
+decryption using the key @var{key}.  @var{key} should point to an
+array of 64 @code{char}s, each of which must be set to either @samp{0}
+or @samp{1}; that is, each byte stores one bit of the key.  Every
+eighth byte (array indices 7, 15, 23, @dots{}) must be set to give it
+plus the preceding group of seven bytes odd parity.  For instance, if
+there are three bytes set to @samp{1} among bytes 0 through 6, then
+byte 7 must be set to @samp{0}, and similarly if there are four bytes
+set to @samp{1} among bytes 8 through 14, then byte 15 must be set to
+@samp{0}, and so on.  Thus, of the 64 bytes, only 56 can be used to
+supply key data.
+
+The @code{setkey} function is specified in the X/Open Portability
+Guide and is declared in @file{stdlib.h}.  @Theglibc{} also declares
+this function in @file{crypt.h}.
 @end deftypefun
 
 @deftypefun void encrypt (char *@var{block}, int @var{edflag})
-@standards{BSD, crypt.h}
-@standards{SVID, crypt.h}
+@standards{X/Open, unistd.h}
+@standards{GNU, crypt.h}
 @safety{@prelim{}@mtunsafe{@mtasurace{:crypt}}@asunsafe{@asucorrupt{} @asulock{}}@acunsafe{@aculock{}}}
 @c Same issues as setkey.
 
-The @code{encrypt} function encrypts @var{block} if
-@var{edflag} is 0, otherwise it decrypts @var{block}, using a key
-previously set by @code{setkey}.  The result is
-placed in @var{block}.
+The @code{encrypt} function encrypts @var{block} if @var{edflag} is 0,
+otherwise it decrypts @var{block}, using a key previously set by
+@code{setkey}.  The result overwrites the previous value of
+@var{block}.
 
-Like @code{setkey}, @var{block} is specified as an array of 64 bits each
-stored in a @code{char}, but there are no parity bits in @var{block}.
+Like @code{setkey}, @var{block} is as an array of 64 @code{char}s,
+each of which stores one bit of the block to be encrypted.  Unlike
+@code{setkey}, there are no parity bits.  All 64 of the bits are
+treated as data.
+
+The @code{encrypt} function is specified in the X/Open Portability
+Guide and is declared in @file{unistd.h}.  @Theglibc{} also declares
+this function in @file{crypt.h}.
 @end deftypefun
 
 @deftypefun void setkey_r (const char *@var{key}, {struct crypt_data *} @var{data})
@@ -245,25 +267,30 @@  stored in a @code{char}, but there are no parity bits in @var{block}.
 These are reentrant versions of @code{setkey} and @code{encrypt}.  The
 only difference is the extra parameter, which stores the expanded
 version of @var{key}.  Before calling @code{setkey_r} the first time,
-@code{data->initialized} must be cleared to zero.
-@end deftypefun
+@code{data} must be erased using
+@samp{@w{memset(data, 0, sizeof data)}}
+or similar.
 
-The @code{setkey_r} and @code{encrypt_r} functions are GNU extensions.
-@code{setkey}, @code{encrypt}, @code{setkey_r}, and @code{encrypt_r} are
-defined in @file{crypt.h}.
+@code{setkey_r} and @code{encrypt_r} are GNU extensions.  They are
+declared in @file{crypt.h}.
+@end deftypefun
 
 @deftypefun int ecb_crypt (char *@var{key}, char *@var{blocks}, unsigned int @var{len}, unsigned int @var{mode})
 @standards{SUNRPC, rpc/des_crypt.h}
 @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
 
 The function @code{ecb_crypt} encrypts or decrypts one or more blocks
-using DES.  Each block is encrypted independently.
+using DES.  Each block is encrypted independently, which means that if
+any two input blocks are the same, then their encryptions will also be
+the same.  This is an additional weakness in the encryption, on top of
+the weakness of DES itself.
 
 The @var{blocks} and the @var{key} are stored packed in 8-bit bytes, so
 that the first bit of the key is the most-significant bit of
 @code{key[0]} and the 63rd bit of the key is stored as the
-least-significant bit of @code{key[7]}.  The @var{key} should have the
-correct parity.
+least-significant bit of @code{key[7]}.  The least-significant bit of
+each byte must be chosen to give each byte odd parity, as with
+@code{setkey}.
 
 @var{len} is the number of bytes in @var{blocks}.  It should be a
 multiple of 8 (so that there are a whole number of blocks to encrypt).
@@ -271,33 +298,33 @@  multiple of 8 (so that there are a whole number of blocks to encrypt).
 
 The result of the encryption replaces the input in @var{blocks}.
 
-The @var{mode} parameter is the bitwise OR of two of the following:
+The @var{mode} parameter should be the bitwise OR of one of these two
+constants:
 
 @vtable @code
 @item DES_ENCRYPT
 @standards{SUNRPC, rpc/des_crypt.h}
-This constant, used in the @var{mode} parameter, specifies that
 @var{blocks} is to be encrypted.
 
 @item DES_DECRYPT
 @standards{SUNRPC, rpc/des_crypt.h}
-This constant, used in the @var{mode} parameter, specifies that
 @var{blocks} is to be decrypted.
+@end vtable
+
+@noindent
+with one of these two constants:
 
+@vtable @code
 @item DES_HW
 @standards{SUNRPC, rpc/des_crypt.h}
-This constant, used in the @var{mode} parameter, asks to use a hardware
-device.  If no hardware device is available, encryption happens anyway,
-but in software.
+Use a hardware DES accelerator, if available.  If no accelerator is
+available, encryption or decryption will still occur, but in software.
 
 @item DES_SW
 @standards{SUNRPC, rpc/des_crypt.h}
-This constant, used in the @var{mode} parameter, specifies that no
-hardware device is to be used.
+Do not use a hardware DES accelerator.
 @end vtable
 
-@end deftypefun
-
 @deftypefun int cbc_crypt (char *@var{key}, char *@var{blocks}, unsigned int @var{len}, unsigned int @var{mode}, char *@var{ivec})
 @standards{SUNRPC, rpc/des_crypt.h}
 @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
@@ -324,7 +351,8 @@  bytes.
 Otherwise, all the parameters are similar to those for @code{ecb_crypt}.
 @end deftypefun
 
-The result of the function will be one of these values:
+@code{ecb_crypt} and @code{cbc_crypt} both return one of the following
+values:
 
 @vtable @code
 @item DESERR_NONE
@@ -333,7 +361,7 @@  The encryption succeeded.
 
 @item DESERR_NOHWDEVICE
 @standards{SUNRPC, rpc/des_crypt.h}
-The encryption succeeded, but there was no hardware device available.
+The encryption succeeded, but there was no hardware accelerator available.
 
 @item DESERR_HWERROR
 @standards{SUNRPC, rpc/des_crypt.h}
@@ -344,13 +372,20 @@  The encryption failed because of a hardware problem.
 The encryption failed because of a bad parameter, for instance @var{len}
 is not a multiple of 8 or @var{len} is larger than @code{DES_MAXDATA}.
 @end vtable
+@end deftypefun
+
+For convenience, there is a macro which distinguishes successful from
+failure result codes:
 
-@deftypefun int DES_FAILED (int @var{err})
+@deftypefn Macro int DES_FAILED (int @var{err})
 @standards{SUNRPC, rpc/des_crypt.h}
 @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
-This macro returns 1 if @var{err} is a `success' result code from
-@code{ecb_crypt} or @code{cbc_crypt}, and 0 otherwise.
-@end deftypefun
+This macro returns a nonzero value if @var{err} is a failure result
+code from @code{ecb_crypt} or @code{cbc_crypt}, and 0 otherwise.
+@end deftypefn
+
+There is also a helper function which assists in producing well-formed
+DES keys:
 
 @deftypefun void des_setparity (char *@var{key})
 @standards{SUNRPC, rpc/des_crypt.h}
diff --git a/manual/examples/genpass.c b/manual/examples/genpass.c
index 5edb2e9b8a..fe10a847cf 100644
--- a/manual/examples/genpass.c
+++ b/manual/examples/genpass.c
@@ -16,34 +16,44 @@ 
 */
 
 #include <stdio.h>
-#include <time.h>
 #include <unistd.h>
 #include <crypt.h>
 
 int
 main(void)
 {
-  unsigned long seed[2];
-  char salt[] = "$1$........";
-  const char *const seedchars =
+  unsigned char ubytes[16];
+  char salt[20];
+  const char *const saltchars =
     "./0123456789ABCDEFGHIJKLMNOPQRST"
     "UVWXYZabcdefghijklmnopqrstuvwxyz";
   char *password;
   int i;
 
-  /* Generate a (not very) random seed.
-     You should do it better than this...  */
-  seed[0] = time(NULL);
-  seed[1] = getpid() ^ (seed[0] >> 14 & 0x30000);
-
-  /* Turn it into printable characters from `seedchars'.  */
-  for (i = 0; i < 8; i++)
-    salt[3+i] = seedchars[(seed[i/5] >> (i%5)*6) & 0x3f];
+  /* Retrieve 16 unpredictable bytes from the operating system.  */
+  if (getentropy (ubytes, sizeof ubytes))
+    {
+      perror ("getentropy");
+      return 1;
+    }
+
+  /* Use them to fill in the salt string.  */
+  salt[0] = '$';
+  salt[1] = '5'; /* SHA-256 */
+  salt[2] = '$';
+  for (i = 0; i < 16; i++)
+    salt[3+i] = saltchars[ubytes[i] & 0x3f];
+  salt[19] = '\0';
 
   /* Read in the user's password and encrypt it.  */
-  password = crypt(getpass("Password:"), salt);
+  password = crypt (getpass ("Password:"), salt);
+  if (!password)
+    {
+      perror ("crypt");
+      return 1;
+    }
 
   /* Print the results.  */
-  puts(password);
+  puts (password);
   return 0;
 }
diff --git a/manual/examples/testpass.c b/manual/examples/testpass.c
index 19f1ae7de0..3cc2383ea8 100644
--- a/manual/examples/testpass.c
+++ b/manual/examples/testpass.c
@@ -15,29 +15,45 @@ 
    along with this program; if not, if not, see <http://www.gnu.org/licenses/>.
 */
 
+#include <sys/types.h>
+#include <pwd.h>
 #include <stdio.h>
 #include <string.h>
 #include <unistd.h>
 #include <crypt.h>
 
 int
-main(void)
+main(int argc, char **argv)
 {
-  /* Hashed form of "GNU libc manual".  */
-  const char *const pass = "$1$/iSaq7rB$EoUw5jJPPvAPECNaaWzMK/";
-
   char *result;
   int ok;
+  struct passwd *pw;
+
+  if (argc < 2)
+    {
+      fprintf (stderr, "usage: %s username\n", argv[0]);
+      return 1;
+    }
+
+  /* Retrieve the password for user argv[1]
+     from the user database.  */
+  pw = getpwnam (argv[1]);
 
-/*@group*/
   /* Read in the user's password and encrypt it,
      passing the expected password in as the salt.  */
-  result = crypt(getpass("Password:"), pass);
-/*@end group*/
+  result = crypt (getpass ("Password:"), pw->pw_passwd);
+  if (!result)
+    {
+      perror ("crypt");
+      return 1;
+    }
+
+  printf ("Stored:  %s\n", pw->pw_passwd);
+  printf ("Current: %s\n", result);
 
   /* Test the result.  */
-  ok = strcmp (result, pass) == 0;
+  ok = strcmp (result, pw->pw_passwd) == 0;
 
-  puts(ok ? "Access granted." : "Access denied.");
+  puts (ok ? "Access granted." : "Access denied.");
   return ok ? 0 : 1;
 }
diff --git a/manual/random.texi b/manual/random.texi
index c53ed7fc22..c68a43aca0 100644
--- a/manual/random.texi
+++ b/manual/random.texi
@@ -2,6 +2,52 @@ 
 @chapter Random Number Generation
 @c %MENU% Various ways to generate random values.
 
+Many algorithms require a source of @dfn{random numbers}, or to be
+more precise, sequences of numbers chosen uniformly at random from
+some subset of the integers or reals.  There are several different
+ways to generate random numbers, depending on how stringent your
+requirements are.
+
+A @dfn{pseudo-random number generator} (PRNG) produces a sequence of
+numbers that @emph{appears} to be random, and has statistical
+properties matching what we expect of numbers chosen uniformly at
+random.  However, an ordinary PRNG doesn't guarantee that its output
+is unguessable.  Also, the output of a PRNG depends on a relatively
+small @dfn{seed} value, and so there are only a small number of
+sequences that it can produce; astronomically small, relative to the
+total number of random sequences.  If the seed is reused, the output
+will be exactly the same, which is sometimes exactly what you want,
+and sometimes disastrous.
+
+A @dfn{cryptographically strong pseudo-random number generator}
+(CSPRNG) is a PRNG that @emph{does} guarantee its output is
+unguessable.  Formally, there is no deterministic, polynomial-time
+algorithm@footnote{Assuming
+@iftex
+@c Don't typeset NP as if multiplying N by P. Use text italic for both.
+@math{\hbox{\it P} \neq \hbox{\it NP}}@c
+@end iftex
+@ifnottex
+@math{P ≠ NP}@c
+@end ifnottex
+.} that can tell the difference between the output of
+a CSPRNG and a sequence of numbers that really were chosen uniformly
+at random.  A CSPRNG still uses a seed and can only produce an
+astronomically small number of random sequences.
+
+Finally, a @dfn{true random number generator} (TRNG) produces random
+numbers not from a seed, but from some unpredictable physical process.
+In principle, a TRNG's output is unguessable, and is not limited to an
+astronomically small number of sequences.  However, TRNGs are very
+slow, and because there is no seed, there is no way to make one repeat
+its output.  Usually, it is best to use a TRNG only to choose the seed
+for a PRNG or CSPRNG.
+
+At present, @theglibc{} offers a variety of ordinary PRNGs, and on
+some operating systems it also offers access to an OS-provided TRNG.
+We may add a CSPRNG in the future.
+
+
 @menu
 * Pseudo-Random Numbers::       Sequences of numbers with apparently
                                  random distribution, but not difficult
@@ -30,8 +76,7 @@  is convenient when you are debugging a program, but it is unhelpful if
 you want the program to behave unpredictably.  If you want a different
 pseudo-random series each time your program runs, you must specify a
 different seed each time.  For ordinary purposes, basing the seed on the
-current time works well.  For random numbers in cryptography,
-@pxref{Unpredictable Bytes}.
+current time works well.
 
 You can obtain repeatable sequences of numbers on a particular machine type
 by specifying the same initial seed value for the random number
@@ -595,14 +640,18 @@  programs.
 @node Unpredictable Bytes
 @section Generating Unpredictable Bytes
 
-Some cryptographic applications (such as session key generation) need
-unpredictable bytes.
+Some cryptographic applications need truly unpredictable bytes.
+@Theglibc{} provides two functions for this purpose, both of which
+access a true random generator implemented by the operating system.
+Not all operating systems support these functions; programs that use
+them must be prepared for them to fail.  They are slow, and can only
+produce short sequences of unpredictable bytes.  Most programs should
+use these functions only to seed a cryptographically strong
+pseudo-random generator.
 
-In general, application code should use a deterministic random bit
-generator, which could call the @code{getentropy} function described
-below internally to obtain randomness to seed the generator.  The
-@code{getrandom} function is intended for low-level applications which
-need additional control over the blocking behavior.
+Most programs should use @code{getentropy}.  The @code{getrandom}
+function is intended for low-level applications which need additional
+control over the blocking behavior.
 
 @deftypefun int getentropy (void *@var{buffer}, size_t @var{length})
 @standards{GNU, sys/random.h}
@@ -623,7 +672,9 @@  will keep blocking even if a signal arrives, and return only after the
 entropy pool has been initialized.
 
 The @code{getentropy} function can fail with several errors, some of
-which are listed below.
+which are listed below.  Applications must check for failure.
+However, when it succeeds, it will always produce exactly @var{length}
+bytes of random data.
 
 @table @code
 @item ENOSYS
@@ -645,11 +696,11 @@  could not be overwritten with random data for an unspecified reason.
 @standards{GNU, sys/random.h}
 @safety{@mtsafe{}@assafe{}@acsafe{}}
 
-This function writes @var{length} bytes of random data to the array
-starting at @var{buffer}.  On success, this function returns the number
-of bytes which have been written to the buffer (which can be less than
-@var{length}).  On error, @code{-1} is returned, and @code{errno} is
-updated accordingly.
+This function writes up to @var{length} bytes of random data to the
+array starting at @var{buffer}.  On success, this function returns the
+number of bytes which have been written to the buffer, which can be
+less than @var{length}.  On error, @code{-1} is returned, and
+@code{errno} is updated accordingly.
 
 The @code{getrandom} function is declared in the header file
 @file{sys/random.h}.  It is a GNU extension.
@@ -676,7 +727,8 @@  booted and the pool has not yet been initialized.
 
 The @code{getrandom} function can fail with several errors, some of
 which are listed below.  In addition, the function may not fill the
-buffer completely and return a value less than @var{length}.
+buffer completely.  Applications must check for failure and for short
+writes.
 
 @table @code
 @item ENOSYS
diff --git a/manual/string.texi b/manual/string.texi
index b07cfb4550..f113777f2d 100644
--- a/manual/string.texi
+++ b/manual/string.texi
@@ -2469,8 +2469,6 @@  Unlike Rot13, @code{memfrob} works on arbitrary binary data, not just
 text.
 @cindex Rot13
 
-For true encryption, @xref{Cryptographic Functions}.
-
 This function is declared in @file{string.h}.
 @pindex string.h
 
@@ -2487,9 +2485,7 @@  Note that @code{memfrob} a second time on the same data structure
 returns it to its original state.
 
 This is a good function for hiding information from someone who doesn't
-want to see it or doesn't want to see it very much.  To really prevent
-people from retrieving the information, use stronger encryption such as
-that described in @xref{Cryptographic Functions}.
+want to see it or doesn't want to see it very much.
 
 @strong{Portability Note:}  This function is unique to @theglibc{}.