
[2/3] manual: Document ld.so --list-diagnostics output

Message ID b17c05bcb9bdb932274db611445e19c83abfaa6c.1691172895.git.fweimer@redhat.com
State New
Series Document ld.so --list-diagnostics, add syntax tests

Commit Message

Florian Weimer Aug. 4, 2023, 6:16 p.m. UTC
---
 manual/dynlink.texi | 279 ++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 279 insertions(+)

Comments

Arsen Arsenović Aug. 4, 2023, 11:33 p.m. UTC | #1
Hi Florian,

Florian Weimer <fweimer@redhat.com> writes:

> ---
>  manual/dynlink.texi | 279 ++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 279 insertions(+)
>
> diff --git a/manual/dynlink.texi b/manual/dynlink.texi
> index 45bf5a5b55..fc2cd2f0a4 100644
> --- a/manual/dynlink.texi
> +++ b/manual/dynlink.texi
> @@ -13,9 +13,288 @@ as plugins) later at run time.
>  Dynamic linkers are sometimes called @dfn{dynamic loaders}.
>  
>  @menu
> +* Dynamic Linker Invocation::   Explicit invocation of the dynamic linker.
>  * Dynamic Linker Introspection::    Interfaces for querying mapping information.
>  @end menu
>  
> +@node Dynamic Linker Invocation
> +
> +@cindex program interpreter
> +When a dynamically linked program starts, the operating system
> +automatically loads the dynamic linker along with the program.
> +@Theglibc{} also supports invoking the dynamic linker explicitly to
> +launch a program.  This command uses the implied dynamic linker
> +(also sometimes called the @dfn{program interpreter}):
> +
> +@smallexample
> +sh -c 'echo "Hello, world!"'
> +@end smallexample
> +
> +This command specifies the dynamic linker explicitly:
> +
> +@smallexample
> +ld.so /bin/sh -c 'echo "Hello, world!"'
> +@end smallexample
> +
> +Note that @command{ld.so} does not search the @env{PATH} environment
> +variable, so the full file name of the executable needs to be specified.
> +
> +The @command{ld.so} program supports various options.  Options start
> +@samp{--} and need to come before the program that is being launched.
> +Some of the supported options are listed below.
> +
> +@table @code
> +@item --list-diagnostics
> +Print system diagnostic information in a machine-readable format.
> +@xref{Dynamic Linker Diagnostics}.
> +@end table
> +
> +@menu
> +* Dynamic Linker Diagnostics::   Obtaining system diagnostic information.
> +@end menu
> +
> +@node Dynamic Linker Diagnostics
> +@section Dynamic Linker Diagnostics
> +@cindex diagnostics (dynamic linker)
> +
> +The @samp{ld.so --list-diagnostics} produces machine-readable
> +diagnostics output.  This output contains system data that affects
> +behavior of @theglibc{}, and potentially application behavior as well.
> +
> +The exact set of diagnostic items can change between releases of
> +@theglibc{}.  The output format itself is not expected to change
> +radically.
> +
> +The following table shows some example lines that can be written by the
> +diagnostics command.
> +
> +@table @code
> +@item dl_pagesize=0x1000
> +The system page size is 4096 bytes.
> +
> +@item env[0x14]="LANG=en_US.UTF-8"
> +This item indicates that the 21st environment variable at process
> +startup contains a setting for @code{LANG}.
> +
> +@item env_filtered[0x22]="DISPLAY"
> +The 35th environment variable is @code{DISPLAY}.  Its value is not
> +included in the output for privacy reasons because it is not recognized
> +as harmless by the diagnostics code.
> +
> +@item path.prefix="/usr"
> +This means that @theglibc{} was configured with @code{--prefix=/usr}.
> +
> +@item path.system_dirs[0x0]="/lib64/"
> +@itemx path.system_dirs[0x1]="/usr/lib64/"
> +The built-in dynamic linker search path contains two directories,
> +@code{/lib64} and @code{/usr/lib64}.
> +@end table
> +
> +@subsection Dynamic Linker Diagnostics Output Format
> +
> +As seen above, diagnostic lines assign values (integers or strings) to a
> +sequences of labeled subscripts, separated by @samp{.}.  Some subscripts
> +have integer indices associated with them.  The subscript indices are
> +not necessarily contiguous or small, so an associative array should be
> +used to store them.  Currently, all integers fit into the 64-bit
> +unsigned integer range.  Every access path to a value has a fixed type
> +(string or integer) independently of subscript index values.  Likewise,
> +whether a subscript is indexed does not depend on previous indices (but
> +may depend on previous subscript labels).
> +
> +A syntax description in ABNF (RFC 5234) follows.  Note that
> +@code{%x30-39} denotes the range of decimal digits.  Diagnostic output
> +lines are expected to match the @code{line} production.
> +
> +@c ABNF-START
> +@smallexample
> +HEXDIG = %x30-39 / %x61-6f ; lowercase a-f only
> +ALPHA = %x41-5a / %x61-7a / %x7f ; letters and underscore
> +ALPHA-NUMERIC = ALPHA / %x30-39 / "_"
> +DQUOTE = %x22 ; "
> +
> +; Numbers are always hexadecimal and use a 0x prefix.
> +hex-value-prefix = %x30 %x78
> +hex-value = hex-value-prefix 1*HEXDIG
> +
> +; Strings use octal escape sequences and \\, \".
> +string-char = %x20-21 / %x23-5c / %x5d-7e ; printable but not "\
> +string-quoted-octal = %x30-33 2*2%x30-37
> +string-quoted = "\" ("\" / DQUOTE / string-quoted-octal)
> +string-value = DQUOTE *(string-char / string-quoted) DQUOTE
> +
> +value = hex-value / string-value
> +
> +label = ALPHA *ALPHA-NUMERIC
> +index = "[" hex-value "]"
> +subscript = label [index]
> +
> +line = subscript *("." subscript) "=" value
> +@end smallexample
> +
> +Output lines can be parsed with the following Python function.  It
> +assumes lines formatted according to the ABNF @code{line} production
> +above.
> +
> +@c PYTHON-START
> +@smallexample
> +def parse_line(line):
> +    """Parse a line of --list-diagnostics output.
> +
> +    This function returns a pair (SUBSCRIPTS, VALUE).  VALUE is either
> +    a byte string or an integer.  SUBSCRIPT is a tuple of (LABEL,
> +    INDEX) pairs, where LABEL is a field identifier (a string), and
> +    INDEX is an integer or None, to indicate that this field is not
> +    indexed.
> +
> +    """
> +
> +    # Extract the list of subscripts before the value.
> +    idx = 0
> +    subscripts = []
> +    while line[idx] != '=':
> +        start_idx = idx
> +
> +        # Extract the label.
> +        while line[idx] not in '[.=':
> +            idx += 1
> +        label = line[start_idx:idx]
> +
> +        if line[idx] == '[':
> +            # Subscript with a 0x index.
> +            assert label
> +            close_bracket = line.index(']', idx)
> +            index = line[idx + 1:close_bracket]
> +            assert index.startswith('0x')
> +            index = int(index, 0)
> +            subscripts.append((label, index))
> +            idx = close_bracket + 1
> +        else: # '.' or '='.
> +            if label:
> +                subscripts.append((label, None))
> +            if line[idx] == '.':
> +                idx += 1
> +
> +    # The value is either a string or a 0x number.
> +    value = line[idx + 1:]
> +    if value[0] == '"':
> +        # Decode the escaped string into a byte string.
> +        assert value[-1] == '"'
> +        idx = 1
> +        result = []
> +        while True:
> +            ch = value[idx]
> +            if ch == '\\':
> +                if value[idx + 1] in '"\\':
> +                    result.append(ord(value[idx + 1]))
> +                    idx += 2
> +                else:
> +                    result.append(int(value[idx + 1:idx + 4], 8))
> +                    idx += 4
> +            elif ch == '"':
> +                assert idx == len(value) - 1
> +                break
> +            else:
> +                result.append(ord(value[idx]))
> +                idx += 1
> +        value = bytes(result)
> +    else:
> +        # Convert the value into an integer.
> +        assert value.startswith('0x')
> +        value = int(value, 0)
> +    return (tuple(subscripts), value)
> +@end smallexample
> +
> +@subsection Dynamic Linker Diagnostics Values
> +
> +As mentioned above, the set of diagnostics may change between
> +@theglibc{} releases.  Nevertheless, the following table documents a few
> +common diagnostic items.
> +
> +@table @code
> +@item dl_dst_lib=@var{string}
> +The @code{$LIB} dynamic string token expands to @var{string}.
> +
> +@item dl_hwcap=@var{integer}
> +@itemx dl_hwcap2=@var{integer}
> +@cindex HWCAP (diagnostics)
> +The HWCAP and HWCAP2 values, as returned for @code{getauxval}, and as
> +used in other places depending on the architecture.

Please put index entries before the @items they refer to.  I've recently
gone over all GCC manuals to correct this exact error, as we've made some
upstream changes to Texinfo that rely on index-then-item ordering to
provide a nice pilcrow anchor for copyable links.

See
https://inbox.sourceware.org/gcc-patches/20230223102714.3606058-3-arsen@aarsen.me/
for some context, as well as the GCC docs for the resulting pilcrows
(note that there's no Texinfo release which will demonstrate this yet,
so we use a snapshot for GCC).

> +@item dl_pagesize=@var{integer}
> +@cindex page size (diagnostics)
> +The system page size is @var{integer} bytes.
> +
> +@item dl_platform=@var{string}
> +The @code{$PLATFORM} dynamic string token expands to @var{string}.
> +
> +@item dso.libc=@var{string}
> +This is the soname of the shared @code{libc} object that is part of
> +@theglibc{}.  On most architectures, this is @code{libc.so.6}.
> +
> +@item env[@var{index}]=@var{string}
> +@itemx env_filtered[@var{index}]=@var{string}
> +An environment variable from the process environment.  The integer
> +@var{index} is the array index in the environment array.  Variables
> +under @code{env} include the variable value after the @samp{=} (assuming
> +that it was present), variables under @code{env_filtered} do not.
> +
> +@item path.prefix=@var{string}
> +This indicates that @theglibc{} was configured using
> +@samp{--prefix=@var{string}}.
> +
> +@item path.sysconfdir=@var{string}
> +@Theglibc{} was configured (perhaps implicitly) with
> +@samp{--sysconfdir=@var{string}} (typically @code{/etc}).
> +
> +@item path.system_dirs[@var{index}]=@var{string}
> +These items list the elements of the built-in array that describes the
> +default library search path.  The value @var{string} a directory file
> +name with a trailing @samp{/}.
> +
> +@item path.rtld=@var{string}
> +This string indicates the application binary interface (ABI) file name
> +of the run-time dynamic linker.
> +
> +@item version.release="stable"
> +@itemx version.release="development"
> +The value @code{"stable"} indicates that this build of @theglibc{} is
> +from a release branch.  Releases labeled as @code{"development"} are
> +unreleased development versions.
> +
> +@item version.version="@var{major}.@var{minor}"
> +@itemx version.version="@var{major}.@var{minor}.9000"
> +@cindex version (diagnostics)
> +@Theglibc{} version.  Development releases end in @samp{.9000}.
> +
> +@item auxv[@var{index}].a_type=@var{type}
> +@itemx auxv[@var{index}].a_val=@var{integer}
> +@itemx auxv[@var{index}].a_val_string=@var{string}
> +@cindex auxiliary vector (diagnostics)
> +An entry in the auxiliary vector (specific to Linux).  The values
> +@var{type} (an integer) and @var{integer} correspond to the members of
> +@code{struct auxv}.  If the value is a string, @code{a_val_string} is
> +used instead of @code{a_val}, so that values have consistent types.
> +
> +The @code{AT_HWCAP} and @code{AT_HWCAP2} values in this output do not
> +reflect adjustment by @theglibc{}.
> +
> +@item uname.sysname=@var{string}
> +@itemx uname.nodename=@var{string}
> +@itemx uname.release=@var{string}
> +@itemx uname.version=@var{string}
> +@itemx uname.machine=@var{string}
> +@itemx uname.domain=@var{string}
> +These Linux-specific items show the values of @code{struct utsname}, as
> +reported by the @code{uname} function.  @xref{Platform Type}.
> +
> +@item x86.cpu_features.@dots{}
> +@cindex CPUID (diagnostics)
> +These items are specific to the i386 and x86-64 architectures.  They
> +reflect supported CPU feature and information on cache geometry, mostly
> +collected using the @code{CPUID} instruction.
> +@end table
> +
>  @node Dynamic Linker Introspection
>  @section Dynamic Linker Introspection

Thank you!  Have a lovely night.
Florian Weimer Aug. 8, 2023, 3:38 p.m. UTC | #2
* Arsen Arsenović:

>> +@item dl_hwcap=@var{integer}
>> +@itemx dl_hwcap2=@var{integer}
>> +@cindex HWCAP (diagnostics)
>> +The HWCAP and HWCAP2 values, as returned for @code{getauxval}, and as
>> +used in other places depending on the architecture.
>
> Please put indices before @items they refer to.  I've recently gone over
> all GCC manuals to correct for this exact error, as we've made some
> upstream changes to Texinfo rely on index-then-item ordering to provide
> a nice pilcrow anchor for copyable links.
>
> See
> https://inbox.sourceware.org/gcc-patches/20230223102714.3606058-3-arsen@aarsen.me/
> for some context, as well as the GCC docs for the resulting pilcrows
> (note that there's no Texinfo release which will demonstrate this yet,
> so we use a snapshot for GCC).

Thank you, I've changed this locally.

Florian
Adhemerval Zanella Aug. 21, 2023, 4:24 p.m. UTC | #3
Looks good, some minor comments below.

On 04/08/23 15:16, Florian Weimer via Libc-alpha wrote:
> ---
>  manual/dynlink.texi | 279 ++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 279 insertions(+)
> 
> diff --git a/manual/dynlink.texi b/manual/dynlink.texi
> index 45bf5a5b55..fc2cd2f0a4 100644
> --- a/manual/dynlink.texi
> +++ b/manual/dynlink.texi
> @@ -13,9 +13,288 @@ as plugins) later at run time.
>  Dynamic linkers are sometimes called @dfn{dynamic loaders}.
>  
>  @menu
> +* Dynamic Linker Invocation::   Explicit invocation of the dynamic linker.
>  * Dynamic Linker Introspection::    Interfaces for querying mapping information.
>  @end menu
>  
> +@node Dynamic Linker Invocation
> +
> +@cindex program interpreter
> +When a dynamically linked program starts, the operating system
> +automatically loads the dynamic linker along with the program.
> +@Theglibc{} also supports invoking the dynamic linker explicitly to
> +launch a program.  This command uses the implied dynamic linker
> +(also sometimes called the @dfn{program interpreter}):
> +
> +@smallexample
> +sh -c 'echo "Hello, world!"'
> +@end smallexample
> +
> +This command specifies the dynamic linker explicitly:
> +
> +@smallexample
> +ld.so /bin/sh -c 'echo "Hello, world!"'
> +@end smallexample
> +
> +Note that @command{ld.so} does not search the @env{PATH} environment
> +variable, so the full file name of the executable needs to be specified.
> +
> +The @command{ld.so} program supports various options.  Options start
> +@samp{--} and need to come before the program that is being launched.
> +Some of the supported options are listed below.
> +
> +@table @code
> +@item --list-diagnostics
> +Print system diagnostic information in a machine-readable format.
> +@xref{Dynamic Linker Diagnostics}.
> +@end table
> +
> +@menu
> +* Dynamic Linker Diagnostics::   Obtaining system diagnostic information.
> +@end menu
> +
> +@node Dynamic Linker Diagnostics
> +@section Dynamic Linker Diagnostics
> +@cindex diagnostics (dynamic linker)
> +
> +The @samp{ld.so --list-diagnostics} produces machine-readable
> +diagnostics output.  This output contains system data that affects
> +behavior of @theglibc{}, and potentially application behavior as well.

Maybe 'the behavior'.

> +
> +The exact set of diagnostic items can change between releases of
> +@theglibc{}.  The output format itself is not expected to change
> +radically.
> +
> +The following table shows some example lines that can be written by the
> +diagnostics command.
> +
> +@table @code
> +@item dl_pagesize=0x1000
> +The system page size is 4096 bytes.
> +
> +@item env[0x14]="LANG=en_US.UTF-8"
> +This item indicates that the 21st environment variable at process
> +startup contains a setting for @code{LANG}.
> +
> +@item env_filtered[0x22]="DISPLAY"
> +The 35th environment variable is @code{DISPLAY}.  Its value is not
> +included in the output for privacy reasons because it is not recognized
> +as harmless by the diagnostics code.
> +
> +@item path.prefix="/usr"
> +This means that @theglibc{} was configured with @code{--prefix=/usr}.
> +
> +@item path.system_dirs[0x0]="/lib64/"
> +@itemx path.system_dirs[0x1]="/usr/lib64/"
> +The built-in dynamic linker search path contains two directories,
> +@code{/lib64} and @code{/usr/lib64}.
> +@end table
> +
> +@subsection Dynamic Linker Diagnostics Output Format
> +
> +As seen above, diagnostic lines assign values (integers or strings) to a
> +sequences of labeled subscripts, separated by @samp{.}.  Some subscripts

s/sequences/sequence

> +have integer indices associated with them.  The subscript indices are
> +not necessarily contiguous or small, so an associative array should be
> +used to store them.  Currently, all integers fit into the 64-bit
> +unsigned integer range.  Every access path to a value has a fixed type
> +(string or integer) independently of subscript index values.  Likewise,

s/independently/independent

> +whether a subscript is indexed does not depend on previous indices (but
> +may depend on previous subscript labels).
> +
> +A syntax description in ABNF (RFC 5234) follows.  Note that
> +@code{%x30-39} denotes the range of decimal digits.  Diagnostic output
> +lines are expected to match the @code{line} production.
> +
> +@c ABNF-START
> +@smallexample
> +HEXDIG = %x30-39 / %x61-6f ; lowercase a-f only
> +ALPHA = %x41-5a / %x61-7a / %x7f ; letters and underscore
> +ALPHA-NUMERIC = ALPHA / %x30-39 / "_"
> +DQUOTE = %x22 ; "
> +
> +; Numbers are always hexadecimal and use a 0x prefix.
> +hex-value-prefix = %x30 %x78
> +hex-value = hex-value-prefix 1*HEXDIG
> +
> +; Strings use octal escape sequences and \\, \".
> +string-char = %x20-21 / %x23-5c / %x5d-7e ; printable but not "\
> +string-quoted-octal = %x30-33 2*2%x30-37
> +string-quoted = "\" ("\" / DQUOTE / string-quoted-octal)
> +string-value = DQUOTE *(string-char / string-quoted) DQUOTE
> +
> +value = hex-value / string-value
> +
> +label = ALPHA *ALPHA-NUMERIC
> +index = "[" hex-value "]"
> +subscript = label [index]
> +
> +line = subscript *("." subscript) "=" value
> +@end smallexample
> +
> +Output lines can be parsed with the following Python function.  It
> +assumes lines formatted according to the ABNF @code{line} production
> +above.

I don't think it would be useful to add the Python script example below:
it is not a full example (the reader will need to fill in the blanks to
actually run it), and the ABNF description should suffice.

> +
> +@c PYTHON-START
> +@smallexample
> +def parse_line(line):
> +    """Parse a line of --list-diagnostics output.
> +
> +    This function returns a pair (SUBSCRIPTS, VALUE).  VALUE is either
> +    a byte string or an integer.  SUBSCRIPT is a tuple of (LABEL,
> +    INDEX) pairs, where LABEL is a field identifier (a string), and
> +    INDEX is an integer or None, to indicate that this field is not
> +    indexed.
> +
> +    """
> +
> +    # Extract the list of subscripts before the value.
> +    idx = 0
> +    subscripts = []
> +    while line[idx] != '=':
> +        start_idx = idx
> +
> +        # Extract the label.
> +        while line[idx] not in '[.=':
> +            idx += 1
> +        label = line[start_idx:idx]
> +
> +        if line[idx] == '[':
> +            # Subscript with a 0x index.
> +            assert label
> +            close_bracket = line.index(']', idx)
> +            index = line[idx + 1:close_bracket]
> +            assert index.startswith('0x')
> +            index = int(index, 0)
> +            subscripts.append((label, index))
> +            idx = close_bracket + 1
> +        else: # '.' or '='.
> +            if label:
> +                subscripts.append((label, None))
> +            if line[idx] == '.':
> +                idx += 1
> +
> +    # The value is either a string or a 0x number.
> +    value = line[idx + 1:]
> +    if value[0] == '"':
> +        # Decode the escaped string into a byte string.
> +        assert value[-1] == '"'
> +        idx = 1
> +        result = []
> +        while True:
> +            ch = value[idx]
> +            if ch == '\\':
> +                if value[idx + 1] in '"\\':
> +                    result.append(ord(value[idx + 1]))
> +                    idx += 2
> +                else:
> +                    result.append(int(value[idx + 1:idx + 4], 8))
> +                    idx += 4
> +            elif ch == '"':
> +                assert idx == len(value) - 1
> +                break
> +            else:
> +                result.append(ord(value[idx]))
> +                idx += 1
> +        value = bytes(result)
> +    else:
> +        # Convert the value into an integer.
> +        assert value.startswith('0x')
> +        value = int(value, 0)
> +    return (tuple(subscripts), value)
> +@end smallexample
> +
> +@subsection Dynamic Linker Diagnostics Values
> +
> +As mentioned above, the set of diagnostics may change between
> +@theglibc{} releases.  Nevertheless, the following table documents a few
> +common diagnostic items.
> +
> +@table @code
> +@item dl_dst_lib=@var{string}
> +The @code{$LIB} dynamic string token expands to @var{string}.

s/to/to a

> +
> +@item dl_hwcap=@var{integer}
> +@itemx dl_hwcap2=@var{integer}
> +@cindex HWCAP (diagnostics)
> +The HWCAP and HWCAP2 values, as returned for @code{getauxval}, and as
> +used in other places depending on the architecture.
> +
> +@item dl_pagesize=@var{integer}
> +@cindex page size (diagnostics)
> +The system page size is @var{integer} bytes.

Maybe add that it is in hexadecimal.

> +
> +@item dl_platform=@var{string}
> +The @code{$PLATFORM} dynamic string token expands to @var{string}.

s/to/to a

> +
> +@item dso.libc=@var{string}
> +This is the soname of the shared @code{libc} object that is part of
> +@theglibc{}.  On most architectures, this is @code{libc.so.6}.
> +
> +@item env[@var{index}]=@var{string}
> +@itemx env_filtered[@var{index}]=@var{string}
> +An environment variable from the process environment.  The integer
> +@var{index} is the array index in the environment array.  Variables
> +under @code{env} include the variable value after the @samp{=} (assuming
> +that it was present), variables under @code{env_filtered} do not.
> +
> +@item path.prefix=@var{string}
> +This indicates that @theglibc{} was configured using
> +@samp{--prefix=@var{string}}.
> +
> +@item path.sysconfdir=@var{string}
> +@Theglibc{} was configured (perhaps implicitly) with
> +@samp{--sysconfdir=@var{string}} (typically @code{/etc}).
> +
> +@item path.system_dirs[@var{index}]=@var{string}
> +These items list the elements of the built-in array that describes the
> +default library search path.  The value @var{string} a directory file

s/a /is a

> +name with a trailing @samp{/}.
> +
> +@item path.rtld=@var{string}
> +This string indicates the application binary interface (ABI) file name
> +of the run-time dynamic linker.
> +
> +@item version.release="stable"
> +@itemx version.release="development"
> +The value @code{"stable"} indicates that this build of @theglibc{} is
> +from a release branch.  Releases labeled as @code{"development"} are
> +unreleased development versions.
> +
> +@item version.version="@var{major}.@var{minor}"
> +@itemx version.version="@var{major}.@var{minor}.9000"
> +@cindex version (diagnostics)
> +@Theglibc{} version.  Development releases end in @samp{.9000}.
> +
> +@item auxv[@var{index}].a_type=@var{type}
> +@itemx auxv[@var{index}].a_val=@var{integer}
> +@itemx auxv[@var{index}].a_val_string=@var{string}
> +@cindex auxiliary vector (diagnostics)
> +An entry in the auxiliary vector (specific to Linux).  The values
> +@var{type} (an integer) and @var{integer} correspond to the members of
> +@code{struct auxv}.  If the value is a string, @code{a_val_string} is
> +used instead of @code{a_val}, so that values have consistent types.
> +
> +The @code{AT_HWCAP} and @code{AT_HWCAP2} values in this output do not
> +reflect adjustment by @theglibc{}.
> +
> +@item uname.sysname=@var{string}
> +@itemx uname.nodename=@var{string}
> +@itemx uname.release=@var{string}
> +@itemx uname.version=@var{string}
> +@itemx uname.machine=@var{string}
> +@itemx uname.domain=@var{string}
> +These Linux-specific items show the values of @code{struct utsname}, as
> +reported by the @code{uname} function.  @xref{Platform Type}.
> +
> +@item x86.cpu_features.@dots{}
> +@cindex CPUID (diagnostics)
> +These items are specific to the i386 and x86-64 architectures.  They
> +reflect supported CPU feature and information on cache geometry, mostly

s/feature/features

> +collected using the @code{CPUID} instruction.
> +@end table
> +
>  @node Dynamic Linker Introspection
>  @section Dynamic Linker Introspection
>
Florian Weimer Aug. 23, 2023, 6:33 a.m. UTC | #4
* Adhemerval Zanella Netto:

>> +Output lines can be parsed with the following Python function.  It
>> +assumes lines formatted according to the ABNF @code{line} production
>> +above.
>
> I don't think it would be useful to add the python script example below,
> it is not an full example (reader will need to fill the blacks to actually
> run it) and he ABNF description should be suffice.

I'm going to remove it.
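
With the ABNF as the only normative description, a reader may still want a
quick way to check lines against it.  The following is a minimal sketch,
assuming Python 3 and a hand translation of the line production into a
regular expression.  It follows the intent stated in the ABNF comments
(hex digits are 0-9 and lowercase a-f, labels use ASCII letters, digits
and underscore, strings are printable ASCII with \\, \" and octal
escapes).  The names LINE_RE and is_valid_line are illustrative and are
not part of glibc.

```python
import re

# Hand translation of the ABNF into regular-expression fragments.
# Assumption: character ranges follow the ABNF comments, not any
# particular implementation in glibc.
HEX_VALUE = r'0x[0-9a-f]+'
LABEL = r'[A-Za-z_][A-Za-z0-9_]*'
SUBSCRIPT = LABEL + r'(?:\[' + HEX_VALUE + r'\])?'

# A string body is printable ASCII except " and \, or one of the
# escapes \\, \" or a three-digit octal sequence starting with 0-3.
STRING_VALUE = r'"(?:[ !#-\[\]-~]|\\(?:[\\"]|[0-3][0-7]{2}))*"'

LINE_RE = re.compile(
    SUBSCRIPT + r'(?:\.' + SUBSCRIPT + r')*'
    + r'=(?:' + HEX_VALUE + r'|' + STRING_VALUE + r')')


def is_valid_line(line):
    """Return True if LINE (without newline) matches the line production."""
    return LINE_RE.fullmatch(line) is not None
```

This only validates syntax; it does not decode escapes or convert
numbers, so it complements a parser rather than replacing one.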

>> +As mentioned above, the set of diagnostics may change between
>> +@theglibc{} releases.  Nevertheless, the following table documents a few
>> +common diagnostic items.
>> +
>> +@table @code
>> +@item dl_dst_lib=@var{string}
>> +The @code{$LIB} dynamic string token expands to @var{string}.
>
> s/to/to a

I think @var{string} as a variable should not have an article.

>> +
>> +@item dl_hwcap=@var{integer}
>> +@itemx dl_hwcap2=@var{integer}
>> +@cindex HWCAP (diagnostics)
>> +The HWCAP and HWCAP2 values, as returned for @code{getauxval}, and as
>> +used in other places depending on the architecture.
>> +
>> +@item dl_pagesize=@var{integer}
>> +@cindex page size (diagnostics)
>> +The system page size is @var{integer} bytes.
>
> Maybe add it is in hexadecimal.

I'm going to add a sentence to the introductory paragraph:

All numbers are in hexadecimal, with a @samp{0x} prefix.

I've applied the rest of your suggestions.

Thanks,
Florian
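
The Output Format subsection quoted above notes that subscript indices
are not necessarily contiguous or small, so an associative array should
be used to store them.  A minimal sketch of that idea follows, assuming
Python 3 and input in the (SUBSCRIPTS, VALUE) shape described in the
docstring of the parse_line function from the patch.  The helper name
store and the sample data are hypothetical, chosen to mirror the example
lines in the manual text.

```python
def store(tree, subscripts, value):
    """Store VALUE in the nested dict TREE under SUBSCRIPTS.

    SUBSCRIPTS is a sequence of (LABEL, INDEX) pairs, with INDEX set to
    None for subscripts that are not indexed.
    """
    # Flatten the subscripts into a key path: each label, followed by
    # its integer index if it has one.
    keys = []
    for label, index in subscripts:
        keys.append(label)
        if index is not None:
            keys.append(index)

    # Walk (and create) intermediate dicts, then assign the leaf value.
    node = tree
    for key in keys[:-1]:
        node = node.setdefault(key, {})
    node[keys[-1]] = value


# Hypothetical parsed lines, matching the examples in the manual text.
parsed = [
    ((('dl_pagesize', None),), 0x1000),
    ((('path', None), ('system_dirs', 0x0)), b'/lib64/'),
    ((('path', None), ('system_dirs', 0x1)), b'/usr/lib64/'),
    ((('env', 0x14),), b'LANG=en_US.UTF-8'),
]

tree = {}
for subscripts, value in parsed:
    store(tree, subscripts, value)
```

Using dicts for indexed subscripts means a sparse index such as 0x14
costs one entry rather than a 21-element list.  The sketch relies on the
stated property that every access path has a fixed type, so a key is
never both a leaf value and an intermediate node.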

Patch

diff --git a/manual/dynlink.texi b/manual/dynlink.texi
index 45bf5a5b55..fc2cd2f0a4 100644
--- a/manual/dynlink.texi
+++ b/manual/dynlink.texi
@@ -13,9 +13,288 @@  as plugins) later at run time.
 Dynamic linkers are sometimes called @dfn{dynamic loaders}.
 
 @menu
+* Dynamic Linker Invocation::   Explicit invocation of the dynamic linker.
 * Dynamic Linker Introspection::    Interfaces for querying mapping information.
 @end menu
 
+@node Dynamic Linker Invocation
+
+@cindex program interpreter
+When a dynamically linked program starts, the operating system
+automatically loads the dynamic linker along with the program.
+@Theglibc{} also supports invoking the dynamic linker explicitly to
+launch a program.  This command uses the implied dynamic linker
+(also sometimes called the @dfn{program interpreter}):
+
+@smallexample
+sh -c 'echo "Hello, world!"'
+@end smallexample
+
+This command specifies the dynamic linker explicitly:
+
+@smallexample
+ld.so /bin/sh -c 'echo "Hello, world!"'
+@end smallexample
+
+Note that @command{ld.so} does not search the @env{PATH} environment
+variable, so the full file name of the executable needs to be specified.
+
+The @command{ld.so} program supports various options.  Options start
+with @samp{--} and need to come before the program that is being launched.
+Some of the supported options are listed below.
+
+@table @code
+@item --list-diagnostics
+Print system diagnostic information in a machine-readable format.
+@xref{Dynamic Linker Diagnostics}.
+@end table
+
+@menu
+* Dynamic Linker Diagnostics::   Obtaining system diagnostic information.
+@end menu
+
+@node Dynamic Linker Diagnostics
+@section Dynamic Linker Diagnostics
+@cindex diagnostics (dynamic linker)
+
+The @samp{ld.so --list-diagnostics} command produces machine-readable
+diagnostics output.  This output contains system data that affects
+behavior of @theglibc{}, and potentially application behavior as well.
+
+The exact set of diagnostic items can change between releases of
+@theglibc{}.  The output format itself is not expected to change
+radically.
+
+The following table shows some example lines that can be written by the
+diagnostics command.
+
+@table @code
+@item dl_pagesize=0x1000
+The system page size is 4096 bytes.
+
+@item env[0x14]="LANG=en_US.UTF-8"
+This item indicates that the 21st environment variable at process
+startup contains a setting for @code{LANG}.
+
+@item env_filtered[0x22]="DISPLAY"
+The 35th environment variable is @code{DISPLAY}.  Its value is not
+included in the output for privacy reasons because it is not recognized
+as harmless by the diagnostics code.
+
+@item path.prefix="/usr"
+This means that @theglibc{} was configured with @code{--prefix=/usr}.
+
+@item path.system_dirs[0x0]="/lib64/"
+@itemx path.system_dirs[0x1]="/usr/lib64/"
+The built-in dynamic linker search path contains two directories,
+@code{/lib64} and @code{/usr/lib64}.
+@end table
+
+@subsection Dynamic Linker Diagnostics Output Format
+
+As seen above, diagnostic lines assign values (integers or strings) to a
+sequences of labeled subscripts, separated by @samp{.}.  Some subscripts
+have integer indices associated with them.  The subscript indices are
+not necessarily contiguous or small, so an associative array should be
+used to store them.  Currently, all integers fit into the 64-bit
+unsigned integer range.  Every access path to a value has a fixed type
+(string or integer) independently of subscript index values.  Likewise,
+whether a subscript is indexed does not depend on previous indices (but
+may depend on previous subscript labels).
+
+A syntax description in ABNF (RFC 5234) follows.  Note that
+@code{%x30-39} denotes the range of decimal digits.  Diagnostic output
+lines are expected to match the @code{line} production.
+
+@c ABNF-START
+@smallexample
+HEXDIG = %x30-39 / %x61-66 ; lowercase a-f only
+ALPHA = %x41-5a / %x61-7a / %x5f ; letters and underscore
+ALPHA-NUMERIC = ALPHA / %x30-39 / "_"
+DQUOTE = %x22 ; "
+
+; Numbers are always hexadecimal and use a 0x prefix.
+hex-value-prefix = %x30 %x78
+hex-value = hex-value-prefix 1*HEXDIG
+
+; Strings use octal escape sequences and \\, \".
+string-char = %x20-21 / %x23-5b / %x5d-7e ; printable but not "\
+string-quoted-octal = %x30-33 2*2%x30-37
+string-quoted = "\" ("\" / DQUOTE / string-quoted-octal)
+string-value = DQUOTE *(string-char / string-quoted) DQUOTE
+
+value = hex-value / string-value
+
+label = ALPHA *ALPHA-NUMERIC
+index = "[" hex-value "]"
+subscript = label [index]
+
+line = subscript *("." subscript) "=" value
+@end smallexample
+
+Output lines can be parsed with the following Python function.  It
+assumes lines formatted according to the ABNF @code{line} production
+above.
+
+@c PYTHON-START
+@smallexample
+def parse_line(line):
+    """Parse a line of --list-diagnostics output.
+
+    This function returns a pair (SUBSCRIPTS, VALUE).  VALUE is either
+    a byte string or an integer.  SUBSCRIPTS is a tuple of (LABEL,
+    INDEX) pairs, where LABEL is a field identifier (a string), and
+    INDEX is an integer or None, to indicate that this field is not
+    indexed.
+
+    """
+
+    # Extract the list of subscripts before the value.
+    idx = 0
+    subscripts = []
+    while line[idx] != '=':
+        start_idx = idx
+
+        # Extract the label.
+        while line[idx] not in '[.=':
+            idx += 1
+        label = line[start_idx:idx]
+
+        if line[idx] == '[':
+            # Subscript with a 0x index.
+            assert label
+            close_bracket = line.index(']', idx)
+            index = line[idx + 1:close_bracket]
+            assert index.startswith('0x')
+            index = int(index, 0)
+            subscripts.append((label, index))
+            idx = close_bracket + 1
+        else: # '.' or '='.
+            if label:
+                subscripts.append((label, None))
+            if line[idx] == '.':
+                idx += 1
+
+    # The value is either a string or a 0x number.
+    value = line[idx + 1:]
+    if value[0] == '"':
+        # Decode the escaped string into a byte string.
+        assert value[-1] == '"'
+        idx = 1
+        result = []
+        while True:
+            ch = value[idx]
+            if ch == '\\':
+                if value[idx + 1] in '"\\':
+                    result.append(ord(value[idx + 1]))
+                    idx += 2
+                else:
+                    result.append(int(value[idx + 1:idx + 4], 8))
+                    idx += 4
+            elif ch == '"':
+                assert idx == len(value) - 1
+                break
+            else:
+                result.append(ord(value[idx]))
+                idx += 1
+        value = bytes(result)
+    else:
+        # Convert the value into an integer.
+        assert value.startswith('0x')
+        value = int(value, 0)
+    return (tuple(subscripts), value)
+@end smallexample
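
The (SUBSCRIPTS, VALUE) pairs described above could then be collected into nested dictionaries for easier lookup. The following is a minimal sketch under stated assumptions: `build_tree` is a hypothetical helper, not part of glibc, and the sample pairs are made-up values written in the shape that the parsing function is documented to return, not real output.

```python
def build_tree(items):
    """Collect (SUBSCRIPTS, VALUE) pairs into nested dictionaries.

    Labels become string keys.  An indexed field becomes a dictionary
    keyed by the integer index.  This sketch assumes that no item is
    both a leaf and a prefix of another item.
    """
    tree = {}
    for subscripts, value in items:
        # Flatten the (LABEL, INDEX) pairs into a single key path.
        path = []
        for label, index in subscripts:
            path.append(label)
            if index is not None:
                path.append(index)
        # Walk (and create) the intermediate dictionaries.
        node = tree
        for key in path[:-1]:
            node = node.setdefault(key, {})
        node[path[-1]] = value
    return tree


# Made-up sample data, for illustration only.
items = [
    ((('dl_pagesize', None),), 0x1000),
    ((('auxv', 0), ('a_type', None)), 0x21),
    ((('auxv', 0), ('a_val', None)), 0x7ffd0000),
    ((('path', None), ('system_dirs', 0)), b'/lib64/'),
]
tree = build_tree(items)
```

With this layout, an item such as the first auxiliary vector type is reachable as `tree['auxv'][0]['a_type']`.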
+
+@subsection Dynamic Linker Diagnostics Values
+
+As mentioned above, the set of diagnostics may change between
+@theglibc{} releases.  Nevertheless, the following table documents a few
+common diagnostic items.
+
+@table @code
+@item dl_dst_lib=@var{string}
+The @code{$LIB} dynamic string token expands to @var{string}.
+
+@item dl_hwcap=@var{integer}
+@itemx dl_hwcap2=@var{integer}
+@cindex HWCAP (diagnostics)
+The HWCAP and HWCAP2 values, as returned by @code{getauxval}, and as
+used in other places depending on the architecture.
+
+@item dl_pagesize=@var{integer}
+@cindex page size (diagnostics)
+The system page size is @var{integer} bytes.
+
+@item dl_platform=@var{string}
+The @code{$PLATFORM} dynamic string token expands to @var{string}.
+
+@item dso.libc=@var{string}
+This is the soname of the shared @code{libc} object that is part of
+@theglibc{}.  On most architectures, this is @code{libc.so.6}.
+
+@item env[@var{index}]=@var{string}
+@itemx env_filtered[@var{index}]=@var{string}
+An environment variable from the process environment.  The integer
+@var{index} is the array index in the environment array.  Variables
+under @code{env} include the variable value after the @samp{=} (assuming
+that it was present); variables under @code{env_filtered} do not.
+
+@item path.prefix=@var{string}
+This indicates that @theglibc{} was configured using
+@samp{--prefix=@var{string}}.
+
+@item path.sysconfdir=@var{string}
+@Theglibc{} was configured (perhaps implicitly) with
+@samp{--sysconfdir=@var{string}} (typically @code{/etc}).
+
+@item path.system_dirs[@var{index}]=@var{string}
+These items list the elements of the built-in array that describes the
+default library search path.  The value @var{string} is a directory file
+name with a trailing @samp{/}.
+
+@item path.rtld=@var{string}
+This string indicates the application binary interface (ABI) file name
+of the run-time dynamic linker.
+
+@item version.release="stable"
+@itemx version.release="development"
+The value @code{"stable"} indicates that this build of @theglibc{} is
+from a release branch.  Builds labeled as @code{"development"} are
+unreleased development versions.
+
+@item version.version="@var{major}.@var{minor}"
+@itemx version.version="@var{major}.@var{minor}.9000"
+@cindex version (diagnostics)
+@Theglibc{} version.  Development releases end in @samp{.9000}.
+
+@item auxv[@var{index}].a_type=@var{type}
+@itemx auxv[@var{index}].a_val=@var{integer}
+@itemx auxv[@var{index}].a_val_string=@var{string}
+@cindex auxiliary vector (diagnostics)
+An entry in the auxiliary vector (specific to Linux).  The values
+@var{type} (an integer) and @var{integer} correspond to the members of
+@code{struct auxv}.  If the value is a string, @code{a_val_string} is
+used instead of @code{a_val}, so that values have consistent types.
+
+The @code{AT_HWCAP} and @code{AT_HWCAP2} values in this output do not
+reflect adjustment by @theglibc{}.
+
+@item uname.sysname=@var{string}
+@itemx uname.nodename=@var{string}
+@itemx uname.release=@var{string}
+@itemx uname.version=@var{string}
+@itemx uname.machine=@var{string}
+@itemx uname.domain=@var{string}
+These Linux-specific items show the values of @code{struct utsname}, as
+reported by the @code{uname} function.  @xref{Platform Type}.
+
+@item x86.cpu_features.@dots{}
+@cindex CPUID (diagnostics)
+These items are specific to the i386 and x86-64 architectures.  They
+reflect supported CPU features and information on cache geometry, mostly
+collected using the @code{CPUID} instruction.
+@end table
+
 @node Dynamic Linker Introspection
 @section Dynamic Linker Introspection