[v2,1/2] system_data_types.7: Add 'void *'

Message ID 20201001154946.104626-2-colomar.6.4.3@gmail.com
State New
Series Document 'void *'

Commit Message

Alejandro Colomar Oct. 1, 2020, 3:49 p.m. UTC
Signed-off-by: Alejandro Colomar <colomar.6.4.3@gmail.com>
---
 man7/system_data_types.7 | 47 ++++++++++++++++++++++++++++++++++++++--
 1 file changed, 45 insertions(+), 2 deletions(-)

Comments

Michael Kerrisk (man-pages) Oct. 1, 2020, 4:38 p.m. UTC | #1
Hi Alex,

> +According to the C language standard,
> +a pointer to any object type may be converted to a pointer to
> +.I void
> +and back.
> +POSIX further requires that any pointer,
> +including pointers to functions,
> +may be converted to a pointer to
> +.I void
> +and back.
I know you are correct about POSIX, but which part of the
standard did you find this information in? The only
reference that I find in POSIX is the dlsym() spec. Is it
also covered somewhere else in the standard?

Thanks,

Michael
Alejandro Colomar Oct. 1, 2020, 4:55 p.m. UTC | #2
On 2020-10-01 18:38, Michael Kerrisk (man-pages) wrote:
> Hi Alex,
> 
>> +According to the C language standard,
>> +a pointer to any object type may be converted to a pointer to
>> +.I void
>> +and back.
>> +POSIX further requires that any pointer,
>> +including pointers to functions,
>> +may be converted to a pointer to
>> +.I void
>> +and back.
> I know you are correct about POSIX, but which part of the
> standard did you find this information in? The only
> reference that I find in POSIX is the dlsym() spec. Is it
> also covered somewhere else in the standard?
> 
> Thanks,
> 
> Michael
> 

Hi Michael,

I've been searching, and dlsym is the only one:
Paul Eggert Oct. 1, 2020, 5:32 p.m. UTC | #3
If you're going to document this at all, I suggest documenting 'void' as well as 
'void *', and putting both sets of documentation into the same man page.

For 'void *' you should also mention that one cannot use arithmetic on void * 
pointers, so they're special in that way too. Also, you should warn that because 
one can convert from any pointer type to void * and then to any other pointer 
type, it's a deliberate hole in C's type-checking. It also might not hurt to
mention 'void const *', 'void volatile *', 'void const volatile *', etc.

For 'void' you can mention the usual things, such as functions returning void, 
and functions declared with (void) parameters, why one would want to cast to 
(void), and so forth.

You're starting to document the C language here, and if you're going to do that 
you might as well do it right.
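
A minimal sketch of the 'void *' points above, not taken from the patch. The helper names swap_bytes() and sum_ints() are hypothetical. Both accept any object pointer through a 'void *' parameter, and both convert to a typed pointer before indexing, since ISO C does not define arithmetic on 'void *'.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: swap two non-overlapping objects of `size` bytes. */
void swap_bytes(void *a, void *b, size_t size)
{
    /* Convert first: ISO C defines no arithmetic on `void *`. */
    unsigned char *pa = a;   /* implicit conversion, no cast needed in C */
    unsigned char *pb = b;

    for (size_t i = 0; i < size; i++) {
        unsigned char tmp = pa[i];
        pa[i] = pb[i];
        pb[i] = tmp;
    }
}

/* Hypothetical helper: sum `n` ints reached through a qualified generic
 * pointer. `const void *` promises not to modify the pointed-to data. */
long sum_ints(const void *data, size_t n)
{
    const int *p = data;   /* the caller must really have passed ints */
    long total = 0;

    for (size_t i = 0; i < n; i++)
        total += p[i];
    return total;
}
```

Nothing stops a caller from passing the address of a double to sum_ints(); the compiler accepts it silently, which is the type-checking hole mentioned above.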
Alejandro Colomar Oct. 2, 2020, 8:24 a.m. UTC | #4
Hi Paul,

On 2020-10-01 19:32, Paul Eggert wrote:
 > If you're going to document this at all, I suggest documenting 'void' as
 > well as 'void *', and putting both sets of documentation into the same
 > man page.
 >

All the types we're documenting are in the same page:
system_data_types(7).
And then we have links with the name of each type.

And yes, I also intend to document 'void'.


 > For 'void *' you should also mention that one cannot use arithmetic on
 > void * pointers, so they're special in that way too.

Good suggestion!

 > Also, you should
 > warn that because one can convert from any pointer type to void * and
 > then to any other pointer type, it's a deliberate hole in C's
 > type-checking.

Also good.  I'll talk about generic function parameters for this.

 > It also might not hurt to mention 'void const *', 'void
 > volatile *', 'void const volatile *', etc.

Those are qualifiers for the type,
and I don't see how any of them would apply differently to 'void *'
than to any other pointer type (or any type at all),
so I think they don't belong to system_data_types(7).

However, it might be good that someone starts a page called
'type_qualifiers(7)' or something like that.

I would love that someone documents 'volatile' correctly,
as there aren't many good sources about it.
If someone who knows when to use --and especially when not to use--
'volatile', is reading this, think about it :-)
I still wonder if I used it correctly in the few cases I've had to.

BTW, I'll CC the LKML.

 >
 > For 'void' you can mention the usual things, such as functions returning
 > void, and functions declared with (void) parameters, why one would want
 > to cast to (void), and so forth.

Yes, I was thinking about that.

 >
 > You're starting to document the C language here, and if you're going to
 > do that you might as well do it right.

I'm trying to do so :)

Thanks,

Alex
Alejandro Colomar Oct. 2, 2020, 8:48 a.m. UTC | #5
Hi Michael,

On 2020-10-02 10:24, Alejandro Colomar wrote:
> On 2020-10-01 19:32, Paul Eggert wrote:
>  > For 'void *' you should also mention that one cannot use arithmetic on
>  > void * pointers, so they're special in that way too.
> 
> Good suggestion!
> 
>  > Also, you should
>  > warn that because one can convert from any pointer type to void * and
>  > then to any other pointer type, it's a deliberate hole in C's
>  > type-checking.
> 
> Also good.  I'll talk about generic function parameters for this.
I think the patch as is now is complete enough to be added.

So I won't rewrite it for now.
Please review the patch as is,
and I'll add more info to this type in the future.

Thanks,

Alex
David Laight Oct. 2, 2020, 9:10 a.m. UTC | #6
From: Alejandro Colomar
> Sent: 02 October 2020 09:25
>  > For 'void *' you should also mention that one cannot use arithmetic on
>  > void * pointers, so they're special in that way too.
> 
> Good suggestion!

Except that is a gcc extension that is allowed in the kernel.

>  > Also, you should
>  > warn that because one can convert from any pointer type to void * and
>  > then to any other pointer type, it's a deliberate hole in C's
>  > type-checking.
> 
> Also good.  I'll talk about generic function parameters for this.

That isn't what the C standard says at all.
What it says is that you can cast any data pointer to 'void *'
and then cast it back to the same type.

This matters because the compiler will 'remember' structure
alignment through 'void *' casts.
So you can't use memcpy() to copy from a potentially misaligned
(typed) pointer.

'void *' should only be used for structures that are 'a sequence of bytes'.
(eg things that look a bit like read() or write()).
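
As a rough illustration of the 'sequence of bytes' case, a read()/write()-style interface might look like the sketch below. byte_sum() and load_u32() are hypothetical names, not from any library. load_u32() fetches a value from an offset that may be misaligned without ever forming a uint32_t pointer to the source, so the compiler has no typed pointer from which to assume alignment.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical: treat the buffer purely as bytes, as write(2) does. */
unsigned byte_sum(const void *buf, size_t len)
{
    const unsigned char *p = buf;
    unsigned sum = 0;

    for (size_t i = 0; i < len; i++)
        sum += p[i];
    return sum;
}

/* Hypothetical: fetch a uint32_t from a byte offset that may be misaligned.
 * The source is only ever addressed as unsigned char, never as uint32_t. */
uint32_t load_u32(const void *buf, size_t offset)
{
    uint32_t v;

    memcpy(&v, (const unsigned char *)buf + offset, sizeof v);
    return v;
}
```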

	David

Jonathan Wakely Oct. 2, 2020, 10:49 a.m. UTC | #7
On Fri, 2 Oct 2020 at 09:28, Alejandro Colomar via Gcc <gcc@gcc.gnu.org> wrote:
> However, it might be good that someone starts a page called
> 'type_qualifiers(7)' or something like that.

Who is this for? Who is trying to learn C from man pages? Should
somebody stop them?
Michael Kerrisk (man-pages) via Libc-alpha Oct. 2, 2020, 11:31 a.m. UTC | #8
On Fri, 2 Oct 2020 at 12:49, Jonathan Wakely <jwakely.gcc@gmail.com> wrote:
>
> On Fri, 2 Oct 2020 at 09:28, Alejandro Colomar via Gcc <gcc@gcc.gnu.org> wrote:
> > However, it might be good that someone starts a page called
> > 'type_qualifiers(7)' or something like that.
>
> Who is this for? Who is trying to learn C from man pages? Should
> somebody stop them?

Yes, I think so. To add context, Alex has been doing a lot of work to
build up the new system_data_types(7) page [1], which I think is
especially useful for the POSIX system data types that are used with
various APIs. With the addition of the integer types and 'void *'
things are straying somewhat from POSIX into C. I think there is value
in saying something about those types, but I'm somewhat neutral about
their inclusion in the page. But Alex has done the work, and I'm
willing to include those types in the page.

I do think that something like type_qualifiers(7) strays over the line
of what should be covered in Linux man-pages, which are primarily
about the kernel + libc APIs. [2]

Thanks,

Michael

[1] https://git.kernel.org/pub/scm/docs/man-pages/man-pages.git/tree/man7/system_data_types.7
[2] Mind you, man-pages strayed over the line many years
ago with operators(7), because who ever remembers all of the C
operator precedences?
Michael Kerrisk (man-pages) Oct. 2, 2020, 11:44 a.m. UTC | #9
Hi Alex,

On 10/2/20 10:48 AM, Alejandro Colomar wrote:
> Hi Michael,
> 
> On 2020-10-02 10:24, Alejandro Colomar wrote:
>> On 2020-10-01 19:32, Paul Eggert wrote:
>>  > For 'void *' you should also mention that one cannot use arithmetic on
>>  > void * pointers, so they're special in that way too.
>>
>> Good suggestion!
>>
>>  > Also, you should
>>  > warn that because one can convert from any pointer type to void * and
>>  > then to any other pointer type, it's a deliberate hole in C's
>>  > type-checking.
>>
>> Also good.  I'll talk about generic function parameters for this.
> I think the patch as is now is complete enough to be added.
> 
> So I won't rewrite it for now.
> Please review the patch as is,
> and I'll add more info to this type in the future.

Actually, I would prefer one patch series rather than
patches on patches, please. It also makes review of the overall
'void *' text easier if it's all one patch. So, if you could
squash the patches together and resubmit, that would be helpful.

Thanks,

Michael
Michael Kerrisk (man-pages) Oct. 2, 2020, 11:54 a.m. UTC | #10
Hi Alex,

On 10/1/20 6:55 PM, Alejandro Colomar wrote:
> 
> 
> On 2020-10-01 18:38, Michael Kerrisk (man-pages) wrote:
>> Hi Alex,
>>
>>> +According to the C language standard,
>>> +a pointer to any object type may be converted to a pointer to
>>> +.I void
>>> +and back.
>>> +POSIX further requires that any pointer,
>>> +including pointers to functions,
>>> +may be converted to a pointer to
>>> +.I void
>>> +and back.
>> I know you are correct about POSIX, but which part of the
>> standard did you find this information in? The only
>> reference that I find in POSIX is the dlsym() spec. Is it
>> also covered somewhere else in the standard?
>>
>> Thanks,
>>
>> Michael
>>
> 
> Hi Michael,
> 
> I've been searching, and dlsym is the only one:
> 
> ________
> 
> user@debian:~/Desktop/src/Standards/susv4-2018$ grep -rn "pointer to a 
> function"
> functions/regfree.html:530:&quot;undefined&quot; means that the action 
> by the application is an error, of similar severity to passing a bad 
> pointer to a function.</p>
> functions/dlsym.html:138:<p>Note that conversion from a <b>void *</b> 
> pointer to a function pointer as in:</p>
> functions/regcomp.html:530:&quot;undefined&quot; means that the action 
> by the application is an error, of similar severity to passing a bad 
> pointer to a function.</p>
> functions/regexec.html:530:&quot;undefined&quot; means that the action 
> by the application is an error, of similar severity to passing a bad 
> pointer to a function.</p>
> functions/V2_chap02.html:3039:<p>There are three types of action that 
> can be associated with a signal: SIG_DFL, SIG_IGN, or a pointer to a 
> function. Initially,
> functions/regerror.html:530:&quot;undefined&quot; means that the action 
> by the application is an error, of similar severity to passing a bad 
> pointer to a function.</p>
> user@debian:~/Desktop/src/Standards/susv4-2018$ grep -rn "function pointer"
> basedefs/glob.h.html:165:"../functions/glob.html"><i>glob</i>()</a> 
> prototype definition by removing the <b>restrict</b> qualifier from the 
> function pointer
> xrat/V4_xsh_chap02.html:114:when the application requires it; for 
> example, if its address is to be stored in a function pointer variable.</p>
> functions/dlsym.html:138:<p>Note that conversion from a <b>void *</b> 
> pointer to a function pointer as in:</p>
> user@debian:~/Desktop/src/Standards/susv4-2018$ grep -rn "pointer to 
> function"
> functions/dlsym.html:73:converted from type pointer to function to type 
> pointer to <b>void</b>; otherwise, <i>dlsym</i>() shall return the 
> address of the
> user@debian:~/Desktop/src/Standards/susv4-2018$
> 
>  From those, the only one that documents this is functions/dlsym.
> The rest is noise.
> 
> The most explicit paragraph in dlsym is the following:
> 
> [[
> Note that conversion from a void * pointer to a function pointer as in:
> 
> fptr = (int (*)(int))dlsym(handle, "my_function");
> 
> is not defined by the ISO C standard.
> This standard requires this conversion to work correctly
> on conforming implementations.
> ]]

Okay -- so, one more thing for a revised (squashed) patch.
I think you had better say that the POSIX requirement has existed
only since POSIX.1-2008 Technical Corrigendum 1 (2013).
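
For reference, the conversion in question looks roughly like this in practice. This is only a sketch under stated assumptions: lookup_abs() and int_fn are hypothetical names, the program is dynamically linked, and the C library exports abs(). Older glibc versions may also need -ldl at link time.

```c
#include <assert.h>
#include <dlfcn.h>
#include <stddef.h>

typedef int (*int_fn)(int);

/* Hypothetical helper: look up abs() at run time; returns NULL on failure. */
int_fn lookup_abs(void)
{
    /* A NULL file name asks for a handle to the main program, whose
     * symbol scope includes the libraries loaded with it. */
    void *handle = dlopen(NULL, RTLD_LAZY);
    if (handle == NULL)
        return NULL;

    /* void * to function pointer: not defined by ISO C, but required to
     * work on POSIX-conforming implementations per the dlsym() spec. */
    int_fn fn = (int_fn)dlsym(handle, "abs");

    /* The handle is deliberately left open in this sketch. */
    return fn;
}
```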

Thanks,

Michael
Jonathan Wakely Oct. 2, 2020, 1:06 p.m. UTC | #11
On Fri, 2 Oct 2020 at 12:31, Michael Kerrisk (man-pages)
<mtk.manpages@gmail.com> wrote:
>
> On Fri, 2 Oct 2020 at 12:49, Jonathan Wakely <jwakely.gcc@gmail.com> wrote:
> >
> > On Fri, 2 Oct 2020 at 09:28, Alejandro Colomar via Gcc <gcc@gcc.gnu.org> wrote:
> > > However, it might be good that someone starts a page called
> > > 'type_qualifiers(7)' or something like that.
> >
> > Who is this for? Who is trying to learn C from man pages? Should
> > somebody stop them?
>
> Yes, I think so. To add context, Alex has been doing a lot of work to
> build up the new system_data_types(7) page [1], which I think is
> especially useful for the POSIX system data types that are used with
> various APIs.

It's definitely useful for types like siginfo_t and struct
timeval, which aren't in C.

Trying to document C seems like a huge task, ill-suited for man-pages,
and not worth the effort.

Maybe some people prefer man pages, but for everybody else
https://en.cppreference.com/w/c already exists and seems like a better
use of time.
Alejandro Colomar Oct. 2, 2020, 1:20 p.m. UTC | #12
On 2020-10-02 15:06, Jonathan Wakely wrote:
 > On Fri, 2 Oct 2020 at 12:31, Michael Kerrisk (man-pages)
 > <mtk.manpages@gmail.com> wrote:
 >>
 >> On Fri, 2 Oct 2020 at 12:49, Jonathan Wakely <jwakely.gcc@gmail.com> wrote:
 >>>
 >>> On Fri, 2 Oct 2020 at 09:28, Alejandro Colomar via Gcc <gcc@gcc.gnu.org> wrote:
 >>>> However, it might be good that someone starts a page called
 >>>> 'type_qualifiers(7)' or something like that.
 >>>
 >>> Who is this for? Who is trying to learn C from man pages? Should
 >>> somebody stop them?
 >>
 >> Yes, I think so. To add context, Alex has been doing a lot of work to
 >> build up the new system_data_types(7) page [1], which I think is
 >> especially useful for the POSIX system data types that are used with
 >> various APIs.
 >
 > It's definitely useful for types like struct siginfo_t and struct
 > timeval, which aren't in C.

Hi Jonathan,

But then the line is a bit diffuse.
Would you document 'ssize_t' and not 'size_t'?
Would you not document intN_t types?
Would you document stdint types, including 'intptr_t', and not 'void *'?

I guess the basic types (int, long, ...) can be left out for now,
and apart from 'int' those rarely are the most appropriate types
for most uses.
But other than that, I would document all of the types.
And even... when all of the other types are documented,
it will be only a little extra effort to document those,
so in the future I might consider that.
But yes, priority should probably go to Linux/POSIX-only types.

Thanks,

Alex

 >
 > Trying to document C seems like a huge task, ill-suited for man-pages,
 > and not worth the effort.
 >
 > Maybe some people prefer man pages, but for everybody else
 > https://en.cppreference.com/w/c already exists and seems like a better
 > use of time.
 >
Jonathan Wakely Oct. 2, 2020, 1:27 p.m. UTC | #13
On Fri, 2 Oct 2020 at 14:20, Alejandro Colomar <colomar.6.4.3@gmail.com> wrote:
>
>
>
> On 2020-10-02 15:06, Jonathan Wakely wrote:
>  > On Fri, 2 Oct 2020 at 12:31, Michael Kerrisk (man-pages)
>  > <mtk.manpages@gmail.com> wrote:
>  >>
>  >> On Fri, 2 Oct 2020 at 12:49, Jonathan Wakely <jwakely.gcc@gmail.com>
> wrote:
>  >>>
>  >>> On Fri, 2 Oct 2020 at 09:28, Alejandro Colomar via Gcc
> <gcc@gcc.gnu.org> wrote:
>  >>>> However, it might be good that someone starts a page called
>  >>>> 'type_qualifiers(7)' or something like that.
>  >>>
>  >>> Who is this for? Who is trying to learn C from man pages? Should
>  >>> somebody stop them?
>  >>
>  >> Yes, I think so. To add context, Alex has been doing a lot of work to
>  >> build up the new system_data_types(7) page [1], which I think is
>  >> especially useful for the POSIX system data types that are used with
>  >> various APIs.
>  >
>  > It's definitely useful for types like struct siginfo_t and struct
>  > timeval, which aren't in C.
>
> Hi Jonathan,
>
> But then the line is a bit diffuse.
> Would you document 'ssize_t' and not 'size_t'?

Yes. My documentation for ssize_t would mention size_t, refer to the C
standard, and not define it.

> Would you not document intN_t types?
> Would you document stdint types, including 'intptr_t', and not 'void *'?

I would document neither.

I can see some small value in documenting size_t and the stdint types,
as they are technically defined by the libc headers. But documenting
void* seems very silly. It's one of the most fundamental built-in
parts of the C language, not an interface provided by the system.

> I guess the basic types (int, long, ...) can be left out for now,

I should hope so!

> and apart from 'int' those rarely are the most appropriate types
> for most uses.
> But other than that, I would document all of the types.
> And even... when all of the other types are documented,
> it will be only a little extra effort to document those,
> so in the future I might consider that.

[insert Jurassic Park meme "Your scientists were so preoccupied with
whether or not they could, they didn't stop to think if they should."
]

I don't see value in bloating the man-pages with information nobody
will ever use, and which doesn't (IMHO) belong there anyway. We seem
to fundamentally disagree about what the man pages are for. I don't
think they are supposed to teach C programming from scratch.


> But yes, priority should probably go to Linux/POSIX-only types.
Alejandro Colomar Oct. 2, 2020, 1:51 p.m. UTC | #14
Hi Jonathan,

On 2020-10-02 15:27, Jonathan Wakely wrote:
> On Fri, 2 Oct 2020 at 14:20, Alejandro Colomar <colomar.6.4.3@gmail.com> wrote:
>>
>>
>>
>> On 2020-10-02 15:06, Jonathan Wakely wrote:
>>   > On Fri, 2 Oct 2020 at 12:31, Michael Kerrisk (man-pages)
>>   > <mtk.manpages@gmail.com> wrote:
>>   >>
>>   >> On Fri, 2 Oct 2020 at 12:49, Jonathan Wakely <jwakely.gcc@gmail.com>
>> wrote:
>>   >>>
>>   >>> On Fri, 2 Oct 2020 at 09:28, Alejandro Colomar via Gcc
>> <gcc@gcc.gnu.org> wrote:
>>   >>>> However, it might be good that someone starts a page called
>>   >>>> 'type_qualifiers(7)' or something like that.
>>   >>>
>>   >>> Who is this for? Who is trying to learn C from man pages? Should
>>   >>> somebody stop them?
>>   >>
>>   >> Yes, I think so. To add context, Alex has been doing a lot of work to
>>   >> build up the new system_data_types(7) page [1], which I think is
>>   >> especially useful for the POSIX system data types that are used with
>>   >> various APIs.
>>   >
>>   > It's definitely useful for types like struct siginfo_t and struct
>>   > timeval, which aren't in C.
>>
>> Hi Jonathan,
>>
>> But then the line is a bit diffuse.
>> Would you document 'ssize_t' and not 'size_t'?
> 
> Yes. My documentation for ssize_t would mention size_t, refer to the C
> standard, and not define it.
> 
>> Would you not document intN_t types?
>> Would you document stdint types, including 'intptr_t', and not 'void *'?
> 
> I would document neither.
> 
> I can see some small value in documenting size_t and the stdint types,
> as they are technically defined by the libc headers. But documenting
> void* seems very silly. It's one of the most fundamental built-in
> parts of the C language, not an interface provided by the system.
> 
>> I guess the basic types (int, long, ...) can be left out for now,
> 
> I should hope so!
> 
>> and apart from 'int' those rarely are the most appropriate types
>> for most uses.
>> But other than that, I would document all of the types.
>> And even... when all of the other types are documented,
>> it will be only a little extra effort to document those,
>> so in the future I might consider that.
> 
> [insert Jurassic Park meme "Your scientists were so preoccupied with
> whether or not they could, they didn't stop to think if they should."
> ]
> 
> I don't see value in bloating the man-pages with information nobody
> will ever use, and which doesn't (IMHO) belong there anyway. We seem
> to fundamentally disagree about what the man pages are for. I don't
> think they are supposed to teach C programming from scratch.

Agree in part.
I'll try to think about it again.

In the meantime, I trust Michael to tell me when something is way off :)

Thanks, really!

Alex



> 
> 
>> But yes, priority should probably go to Linux/POSIX-only types.
Paul Eggert Oct. 2, 2020, 5 p.m. UTC | #15
On 10/2/20 2:10 AM, David Laight wrote:
>>   > Also, you should
>>   > warn that because one can convert from any pointer type to void * and
>>   > then to any other pointer type, it's a deliberate hole in C's
>>   > type-checking.
>>
> That isn't what the C standard says at all.
> What it says is that you can cast any data pointer to 'void *'
> and then cast it back to the same type.

I was talking about compile-time checking; you're talking about run-time 
behavior. We're both right in our own domains. It is a tricky area, and this 
suggests that perhaps we shouldn't be trying to document this stuff in a 
libc/kernel manual.
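
To make the two viewpoints concrete, here is a small sketch with hypothetical function names. A C compiler accepts both functions without a cast or a diagnostic, which is the compile-time hole. Only the first is a round trip back to the original type, which is what the standard guarantees at run time. The second assumes that int and float have the same alignment requirement.

```c
#include <assert.h>

/* Round trip int * -> void * -> int *: the result compares equal to the
 * original pointer, so dereferencing it is fine. */
int round_trip(int *ip)
{
    void *vp = ip;
    int *back = vp;

    return *back;
}

/* int * -> void * -> float *: also compiles silently in C. The pointer is
 * returned but never dereferenced here; reading an int object through it
 * would be undefined behavior. */
float *wrong_type(int *ip)
{
    void *vp = ip;
    float *fp = vp;   /* no diagnostic required */

    return fp;
}
```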
Michael Kerrisk (man-pages) Oct. 3, 2020, 8 a.m. UTC | #16
Hi Alex, et al.
On 10/2/20 3:51 PM, Alejandro Colomar wrote:
> 
> Hi Jonathan,
> 
> On 2020-10-02 15:27, Jonathan Wakely wrote:
>> On Fri, 2 Oct 2020 at 14:20, Alejandro Colomar <colomar.6.4.3@gmail.com> wrote:
>>>
>>>
>>>
>>> On 2020-10-02 15:06, Jonathan Wakely wrote:
>>>   > On Fri, 2 Oct 2020 at 12:31, Michael Kerrisk (man-pages)
>>>   > <mtk.manpages@gmail.com> wrote:
>>>   >>
>>>   >> On Fri, 2 Oct 2020 at 12:49, Jonathan Wakely <jwakely.gcc@gmail.com>
>>> wrote:
>>>   >>>
>>>   >>> On Fri, 2 Oct 2020 at 09:28, Alejandro Colomar via Gcc
>>> <gcc@gcc.gnu.org> wrote:
>>>   >>>> However, it might be good that someone starts a page called
>>>   >>>> 'type_qualifiers(7)' or something like that.
>>>   >>>
>>>   >>> Who is this for? Who is trying to learn C from man pages? Should
>>>   >>> somebody stop them?
>>>   >>
>>>   >> Yes, I think so. To add context, Alex has been doing a lot of work to
>>>   >> build up the new system_data_types(7) page [1], which I think is
>>>   >> especially useful for the POSIX system data types that are used with
>>>   >> various APIs.
>>>   >
>>>   > It's definitely useful for types like struct siginfo_t and struct
>>>   > timeval, which aren't in C.
>>>
>>> Hi Jonathan,
>>>
>>> But then the line is a bit diffuse.
>>> Would you document 'ssize_t' and not 'size_t'?
>>
>> Yes. My documentation for ssize_t would mention size_t, refer to the C
>> standard, and not define it.
>>
>>> Would you not document intN_t types?
>>> Would you document stdint types, including 'intptr_t', and not 'void *'?
>>
>> I would document neither.
>>
>> I can see some small value in documenting size_t and the stdint types,
>> as they are technically defined by the libc headers. But documenting
>> void* seems very silly. It's one of the most fundamental built-in
>> parts of the C language, not an interface provided by the system.
>>
>>> I guess the basic types (int, long, ...) can be left out for now,
>>
>> I should hope so!
>>
>>> and apart from 'int' those rarely are the most appropriate types
>>> for most uses.
>>> But other than that, I would document all of the types.
>>> And even... when all of the other types are documented,
>>> it will be only a little extra effort to document those,
>>> so in the future I might consider that.
>>
>> [insert Jurassic Park meme "Your scientists were so preoccupied with
>> whether or not they could, they didn't stop to think if they should."
>> ]
>>
>> I don't see value in bloating the man-pages with information nobody
>> will ever use, and which doesn't (IMHO) belong there anyway. We seem
>> to fundamentally disagree about what the man pages are for. I don't
>> think they are supposed to teach C programming from scratch.
> 
> Agree in part.
> I'll try to think about it again.
> 
> In the meantime, I trust Michael to tell me when something is way off :)
> 
> Thanks, really!
> 
> Alex

So, I think a navigational correction is needed.

My vision was that system_data_types(7) would most usefully document 
the POSIX types, but by now there's too much of C creeping in. I have
been a little slow to react to that, and I apologize for that.
But I think we should not go in that direction

I think it is worth having types like ssize_t and size_t in 
the page, simply because they turn up with so many of the POSIX
APIs, and people often don't understand some details of these
types (such as the necessary printf() specifiers). So, as long as
we're going to have a page about these types, it's fine by
me to include size_t and ssize_t.

Types like [u]intN_t are definitely on the borderline for me. But,
they do appear in various APIs in the Linux interface (either
explicitly, or as the similar __u32, __u64, etc.). And again
many people don't understand some basic details, such as
the PRI and SCN constants, so I think it is useful to have them
briefly summarized in one place, and as long as they are already
in the page, then let's keep them.
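
As a quick sketch of the details mentioned above (the helper names format_types() and parse_i64() are made up for illustration): size_t takes %zu, ssize_t is conventionally printed with %zd, and the fixed-width types use the PRI and SCN macros from <inttypes.h>.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical helper: format one value of each type into `out`. */
int format_types(char *out, size_t outlen, size_t sz, ssize_t ssz, uint32_t u32)
{
    /* %zu for size_t, %zd for ssize_t, PRIu32 for uint32_t */
    return snprintf(out, outlen, "%zu %zd %" PRIu32, sz, ssz, u32);
}

/* Hypothetical helper: parse an int64_t with the matching SCN macro.
 * Returns the sscanf() result, which is 1 when the item was converted. */
int parse_i64(const char *s, int64_t *out)
{
    return sscanf(s, "%" SCNd64, out);
}
```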

I think __int128 etc definitely doesn't belong in this page.

And I'd like to back pedal a bit. I think we really shouldn't have
[u]int_fastN_t
[u]int_leastN_t
in the page. They are C details that have nothing to do with POSIX,
the kernel, or libc. Could you send me a patch to remove these
from the page? And again, my apologies for not being focused 
enough on the big picture sooner.

I don't think 'void' belongs in this page. Nor basic types
such as int, long, etc.

The question of 'void *' is an interesting one. It is something
like a fundamental C type, and not something that comes from POSIX.
But, it does appear in POSIX APIs and often details of using
the type are not well understood. So, as a matter of practicality,
and again since you've done the work, I am inclined to include
this type in the page, just so it can be handily referred to
along with all of the other types.

Looking ahead (and I hope none of the above disheartens you,
since you've done a lot of great work for this page), it would
be good if you could provide a bit of an advance roadmap about
the types that you'd like to add to the page.

Thanks,

Michael
Alejandro Colomar Oct. 3, 2020, 9:16 a.m. UTC | #17
Hi Michael,

On 2020-10-03 10:00, Michael Kerrisk (man-pages) wrote:
 > Hi Alex, et al.
 > On 10/2/20 3:51 PM, Alejandro Colomar wrote:
 >>
 >> Hi Jonathan,
 >>
 >> On 2020-10-02 15:27, Jonathan Wakely wrote:
 >>> On Fri, 2 Oct 2020 at 14:20, Alejandro Colomar <colomar.6.4.3@gmail.com> wrote:
 >>>>
 >>>>
 >>>>
 >>>> On 2020-10-02 15:06, Jonathan Wakely wrote:
 >>>>    > On Fri, 2 Oct 2020 at 12:31, Michael Kerrisk (man-pages)
 >>>>    > <mtk.manpages@gmail.com> wrote:
 >>>>    >>
 >>>>    >> On Fri, 2 Oct 2020 at 12:49, Jonathan Wakely <jwakely.gcc@gmail.com> wrote:
 >>>>    >>>
 >>>>    >>> On Fri, 2 Oct 2020 at 09:28, Alejandro Colomar via Gcc <gcc@gcc.gnu.org> wrote:
 >>>>    >>>> However, it might be good that someone starts a page called
 >>>>    >>>> 'type_qualifiers(7)' or something like that.
 >>>>    >>>
 >>>>    >>> Who is this for? Who is trying to learn C from man pages?
 >>>>    >>> Should somebody stop them?
 >>>>    >>
 >>>>    >> Yes, I think so. To add context, Alex has been doing a lot of
 >>>>    >> work to build up the new system_data_types(7) page [1], which
 >>>>    >> I think is especially useful for the POSIX system data types
 >>>>    >> that are used with various APIs.
 >>>>    >
 >>>>    > It's definitely useful for types like struct siginfo_t and struct
 >>>>    > timeval, which aren't in C.
 >>>>
 >>>> Hi Jonathan,
 >>>>
 >>>> But then the line is a bit diffuse.
 >>>> Would you document 'ssize_t' and not 'size_t'?
 >>>
 >>> Yes. My documentation for ssize_t would mention size_t, refer to the C
 >>> standard, and not define it.
 >>>
 >>>> Would you not document intN_t types?
 >>>> Would you document stdint types, including 'intptr_t', and not 'void *'?
 >>>
 >>> I would document neither.
 >>>
 >>> I can see some small value in documenting size_t and the stdint types,
 >>> as they are technically defined by the libc headers. But documenting
 >>> void* seems very silly. It's one of the most fundamental built-in
 >>> parts of the C language, not an interface provided by the system.
 >>>
 >>>> I guess the basic types (int, long, ...) can be left out for now,
 >>>
 >>> I should hope so!
 >>>
 >>>> and apart from 'int' those rarely are the most appropriate types
 >>>> for most uses.
 >>>> But other than that, I would document all of the types.
 >>>> And even... when all of the other types are documented,
 >>>> it will be only a little extra effort to document those,
 >>>> so in the future I might consider that.
 >>>
 >>> [insert Jurassic Park meme "Your scientists were so preoccupied with
 >>> whether or not they could, they didn't stop to think if they should."
 >>> ]
 >>>
 >>> I don't see value in bloating the man-pages with information nobody
 >>> will ever use, and which doesn't (IMHO) belong there anyway. We seem
 >>> to fundamentally disagree about what the man pages are for. I don't
 >>> think they are supposed to teach C programming from scratch.
 >>
 >> Agree in part.
 >> I'll try to think about it again.
 >>
 >> In the meantime, I trust Michael to tell me when something is way off :)
 >>
 >> Thanks, really!
 >>
 >> Alex
 >
 > So, I think a navigational correction is needed.
 >
 > My vision was that system_data_types(7) would most usefully document
 > the POSIX types, but by now there's too much of C creeping in. I have
 > been a little slow to react to that, and I apologize for that.
 > But I think we should not go in that direction.
 >
 > I think it is worth having types like ssize_t and size_t in
 > the page, simply because they turn up with so many of the POSIX
 > APIs, and people often don't understand some details of these
 > types (such as the necessary printf() specifiers). So, as long as
 > we're going to have a page about these types, it's fine by
 > me to include size_t and ssize_t.
 >
 > Types like [u]intN_t are definitely on the borderline for me. But,
 > they do appear in various APIs in the Linux interface (either
 > explicitly, or as the similar __u32, __u64, etc.). And again
 > many people don't understand some basic details, such as
 > the PRI and SCN constants, so I think it is useful to have them
 > briefly summarized in one place, and as long as they are already
 > in the page, then let's keep them.
 >
 > I think __int128 etc definitely doesn't belong in this page.
 >
 > And I'd like to back pedal a bit. I think we really shouldn't have
 > [u]int_fastN_t
 > [u]int_leastN_t
 > in the page. They are C details that have nothing to do with POSIX,
 > the kernel, or libc. Could you send me a patch to remove these
 > from the page? And again, my apologies for not being focused
 > enough on the big picture sooner.

I'm fine with removing them.
I only added them because while I was adding [u]intN_t,
they were in the same page, and I just took them too.
No problem with removing them.

To be clear, I should remove [u]int_*astN_t, right?

 >
 > I don't think 'void' belongs in this page. Nor basic types
 > such as int, long, etc.
Fine.


 >
 > The question of 'void *' is an interesting one. It is something
 > like a fundamental C type, and not something that comes from POSIX.
 > But, it does appear in POSIX APIs and often details of using
 > the type are not well understood. So, as a matter of practicality,
 > and again since you've done the work, I am inclined to include
 > this type in the page, just so it can be handily referred to
 > along with all of the other types.
 >
 > Looking ahead (and I hope none of the above disheartens you,
 > since you've done a lot of great work for this page),

Actually, not.
It's good to have you tell me what is good for the man page and what's not.
Otherwise, I wouldn't know.
I keep a branch with all of the rejected patches,
just to have an idea of what I should not send you :-)

 > it would
 > be good if you could provide a bit of an advance roadmap about
 > the types that you'd like to add to the page.

Well, I didn't have a clear roadmap.
I had some types which I clearly wanted to document,
and they were ptrdiff_t, and ssize_t,
which I documented in the first patches,
and then I was finding related types,
and I also tended to document types that I knew very well,
to have something useful to add to the description.

I may now start writing about off_t and related types,
which were the ones that made me want this page.

 >
 > Thanks,
 >
 > Michael
 >
 >
 >

Thanks,

Alex
Michael Kerrisk \(man-pages\) Oct. 3, 2020, 11:39 a.m. UTC | #18
Hi Alex,


>  >
>  > The question of 'void *' is an interesting one. It is something
>  > like a fundamental C type, and not something that comes from POSIX.
>  > But, it does appear in POSIX APIs and often details of using
>  > the type are not well understood. So, as a matter of practicality,
>  > and again since you've done the work, I am inclined to include
>  > this type in the page, just so it can be handily referred to
>  > along with all of the other types.
>  >
>  > Looking ahead (and I hope none of the above disheartens you,
>  > since you've done a lot of great work for this page),
> 
> Actually, not.
> It's good to have you tell me what is good for the man page and what's not.
> Otherwise, I wouldn't know.
> I keep a branch with all of the rejected patches,
> just to have an idea of what I should not send you :-)
> 
>  > it would
>  > be good if you could provide a bit of an advance roadmap about
>  > the types that you'd like to add to the page.
> 
> Well, I didn't have a clear roadmap.
> I had some types which I clearly wanted to document,
> and they were ptrdiff_t, and ssize_t,
> which I documented in the first patches,
> and then I was finding related types,
> and I also tended to document types that I knew very well,
> to have something useful to add to the description.
> 
> I may now start writing about off_t and related types,
> which were the ones that made me want this page.

off_t would be great.

In case you are looking for some other candidates, some others
that I would be interested to see go into the page would be

fd_set
clock_t
clockid_t
and probably dev_t


Thanks,

Michael
Alejandro Colomar Oct. 5, 2020, 10:08 p.m. UTC | #19
Hi Michael,

On 2020-10-03 13:39, Michael Kerrisk (man-pages) wrote:
> Hi Alex,
[...]
> 
> off_t would be great.
> 
> In case you are looking for some other candidates, some others
> that I would be interested to see go into the page would be
> 
> fd_set
> clock_t
> clockid_t
> and probably dev_t

Great!

off_t is almost done.  I think I have too many references in "See also".

I'll send you the patch, and trim as you want :)

> 
> 
> Thanks,
> 
> Michael
> 

Cheers,

Alex
Michael Kerrisk \(man-pages\) Oct. 7, 2020, 6:53 a.m. UTC | #20
On 10/6/20 12:08 AM, Alejandro Colomar wrote:
> Hi Michael,
> 
> On 2020-10-03 13:39, Michael Kerrisk (man-pages) wrote:
>> Hi Alex,
> [...]
>>
>> off_t would be great.
>>
>> In case you are looking for some other candidates, some others
>> that I would be interested to see go into the page would be
>>
>> fd_set
>> clock_t
>> clockid_t
>> and probably dev_t
> 
> Great!
> 
> off_t is almost done.  I think I have too many references in "See also".
> 
> I'll send you the patch, and trim as you want :)

Thanks, Alex. I'm teaching a course this week, so I'll be less active;
sorry about that.

Thanks,

Michael
Vincent Lefevre Oct. 8, 2020, 1:52 p.m. UTC | #21
On 2020-10-01 18:55:04 +0200, Alejandro Colomar via Gcc wrote:
> On 2020-10-01 18:38, Michael Kerrisk (man-pages) wrote:
> > > +According to the C language standard,
> > > +a pointer to any object type may be converted to a pointer to
> > > +.I void
> > > +and back.
> > > +POSIX further requires that any pointer,
> > > +including pointers to functions,
> > > +may be converted to a pointer to
> > > +.I void
> > > +and back.
> > I know you are correct about POSIX, but which part of the
> > standard did you find this information in? The only
> > reference that I find in POSIX is the dlsym() spec. Is it
> > > covered also somewhere else in the standard?
[...]
> I've been searching, and dlsym is the only one:
[...]
> The most explicit paragraph in dlsym is the following:
> 
> [[
> Note that conversion from a void * pointer to a function pointer as in:
> 
> fptr = (int (*)(int))dlsym(handle, "my_function");
> 
> is not defined by the ISO C standard.
> This standard requires this conversion to work correctly
> on conforming implementations.
> ]]

I think that "this conversion" applies only to the dlsym context,
and the conversion isn't defined in general. Imagine that the
void * pointer to function pointer conversion requires the compiler
to generate additional code. The compiler may be able to detect
that dlsym will not be used in some contexts (e.g. because of an
always-false condition) and not generate such additional code,
so that the conversion has undefined behavior.
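
To make the case under discussion concrete, here is a minimal sketch of
the general round trip, outside any dlsym() context. It assumes a POSIX
system on which function pointers and void * have the same
representation; the names int_fn, twice, to_void, from_void and
call_via_void are hypothetical and only serve the illustration.

```c
#include <assert.h>

typedef int (*int_fn)(int);

/* Hypothetical function whose address is round-tripped. */
static int twice(int x)
{
    return 2 * x;
}

/* Function pointer -> void *: not defined by ISO C. GCC accepts it as
 * an extension and warns about it under -Wpedantic. */
static void *to_void(int_fn f)
{
    return (void *)f;
}

/* void * -> function pointer: the same cast form as in the dlsym()
 * text quoted above. */
static int_fn from_void(void *p)
{
    return (int_fn)p;
}

/* Round-trip the pointer and call through the result. */
static int call_via_void(int x)
{
    int_fn f = from_void(to_void(twice));

    /* Assumption: the platform preserves the pointer across the
     * round trip, which is exactly what is being debated here. */
    assert(f == twice);
    return f(x);
}
```

Whether POSIX guarantees this in general, or only for values returned by
dlsym(), is the open question; ISO C alone guarantees neither.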
Michael Kerrisk \(man-pages\) via Libc-alpha Oct. 12, 2020, 9:36 a.m. UTC | #22
Hello Vincent,

On Thu, 8 Oct 2020 at 15:52, Vincent Lefevre <vincent@vinc17.net> wrote:
>
> On 2020-10-01 18:55:04 +0200, Alejandro Colomar via Gcc wrote:
> > On 2020-10-01 18:38, Michael Kerrisk (man-pages) wrote:
> > > > +According to the C language standard,
> > > > +a pointer to any object type may be converted to a pointer to
> > > > +.I void
> > > > +and back.
> > > > +POSIX further requires that any pointer,
> > > > +including pointers to functions,
> > > > +may be converted to a pointer to
> > > > +.I void
> > > > +and back.
> > > I know you are correct about POSIX, but which part of the
> > > standard did you find this information in? The only
> > > reference that I find in POSIX is the dlsym() spec. Is it
> > > > covered also somewhere else in the standard?
> [...]
> > I've been searching, and dlsym is the only one:
> [...]
> > The most explicit paragraph in dlsym is the following:
> >
> > [[
> > Note that conversion from a void * pointer to a function pointer as in:
> >
> > fptr = (int (*)(int))dlsym(handle, "my_function");
> >
> > is not defined by the ISO C standard.
> > This standard requires this conversion to work correctly
> > on conforming implementations.
> > ]]
>
> I think that "this conversion" applies only to the dlsym context,
> and the conversion isn't defined in general. Imagine that the
> void * pointer to function pointer conversion requires the compiler
> to generate additional code. The compiler may be able to detect
> that dlsym will not be used in some contexts (e.g. because of an
> always-false condition) and not generate such additional code,
> so that the conversion has undefined behavior.

Thanks. It's a good point that you raise.

I agree that the wording in the standard is not too clear. But I
believe the intent really is to allow
[void *] <==> [function pointer] casts.

The most relevant pieces I can find are as follows:

In the current standard, in CHANGE HISTORY for dlsym():

[[
Issue 6
IEEE Std 1003.1-2001/Cor 1-2002, item XSH/TC1/D6/14 is applied,
correcting an example, and adding text to the RATIONALE describing
issues related to conversion of pointers to functions and back again.
Issue 7
POSIX.1-2008, Technical Corrigendum 1, XSH/TC1-2008/0074 [74] is applied.
]]

https://www.austingroupbugs.net/view.php?id=74
This is a little thin. The initial report says:
"The intent is simply to permit dlsym to use a void * as its return type."
and no one seems to have questioned that.

And then in https://pubs.opengroup.org/onlinepubs/7899949299/toc.pdf
(TC1 for POSIX.1-2001)

there is:

[[
Change Number: XSH/TC1/D6/14 [XSH ERN 13]
On Page: 259  Line: 8566,8590  Section: dlsym
In the EXAMPLES section, change from:
fptr = (int (*)(int))dlsym(handle, "my_function");
to:
*(void **)(&fptr) = dlsym(handle, "my_function");
In the RATIONALE section on Page 260, Line 8590, change from:
"None."
to:
"The C Standard does not require that pointers to functions can be
cast back and forth to pointers to data. Indeed, the C Standard
does not require that an object of type void* can hold a pointer
to a function.  Systems supporting the X/Open System Interfaces
Extension, however, do require that an object of type void* can
hold a pointer to a function.  The result of converting a pointer
to a function into a pointer to another data type (except void*)
is still undefined, however."
]]

And one finds the above text in POSIX.1-2001 TC1 spec for dlsym(),
although it was removed in POSIX.1-2008, and now we have just the
smaller text that is present in the dlsym() page. But along the way, I
can find nothing that speaks against the notion that POSIX was aiming
to allow the more general cast of [void *] <==> [function pointer].
Your thoughts?
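
For reference, the form introduced by XSH/TC1/D6/14 can be sketched next
to a memcpy()-based alternative. This is only an illustration under
stated assumptions: fake_dlsym() is a hypothetical stand-in for dlsym()
so that the example needs no shared object, and int_fn, twice, load_tc1
and load_memcpy are likewise made-up names. Both loaders assume that
void * and the function pointer have the same size and representation.

```c
#include <assert.h>
#include <string.h>

typedef int (*int_fn)(int);

/* Hypothetical function playing the role of the looked-up symbol. */
static int twice(int x)
{
    return 2 * x;
}

/* Stand-in for dlsym(): hands back a function's address as void *. */
static void *fake_dlsym(void)
{
    return (void *)twice;
}

/* Form from the TC1 EXAMPLES change: store the void * through a
 * void ** view of the function pointer object. This avoids a
 * function-pointer cast, but accesses fptr through an lvalue of a
 * different type, which ISO C's aliasing rules do not permit. */
static int_fn load_tc1(void)
{
    int_fn fptr;

    *(void **)(&fptr) = fake_dlsym();
    return fptr;
}

/* Alternative: copy the object representation with memcpy(), which
 * needs neither the cast nor the aliasing store. */
static int_fn load_memcpy(void)
{
    void *p = fake_dlsym();
    int_fn fptr;

    assert(sizeof fptr == sizeof p);
    memcpy(&fptr, &p, sizeof fptr);
    return fptr;
}
```

On such a system both loaders should yield a pointer that compares equal
to twice and can be called normally.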

Thanks,

Michael

Patch

diff --git a/man7/system_data_types.7 b/man7/system_data_types.7
index e42cf2557..e545aa1a0 100644
--- a/man7/system_data_types.7
+++ b/man7/system_data_types.7
@@ -679,7 +679,6 @@  See also the
 .I uintptr_t
 and
 .I void *
-.\" TODO: Document void *
 types in this page.
 .RE
 .\"------------------------------------- lconv ------------------------/
@@ -1781,7 +1780,6 @@  See also the
 .I intptr_t
 and
 .I void *
-.\" TODO: Document void *
 types in this page.
 .RE
 .\"------------------------------------- va_list ----------------------/
@@ -1815,6 +1813,51 @@  See also:
 .BR va_copy (3),
 .BR va_end (3)
 .RE
+.\"------------------------------------- void * -----------------------/
+.TP
+.I void *
+.RS
+According to the C language standard,
+a pointer to any object type may be converted to a pointer to
+.I void
+and back.
+POSIX further requires that any pointer,
+including pointers to functions,
+may be converted to a pointer to
+.I void
+and back.
+.PP
+Conversions from and to any other pointer type are done implicitly,
+not requiring casts at all.
+.PP
+A value of this type can't be dereferenced,
+as it would give a value of type
+.I void
+which is not possible.
+.PP
+The conversion specifier for
+.I void *
+for the
+.BR printf (3)
+and the
+.BR scanf (3)
+families of functions is
+.BR p ;
+resulting commonly in
+.B %p
+for printing
+.I void *
+values.
+.PP
+Conforming to:
+C99 and later; POSIX.1-2001 and later.
+.PP
+See also the
+.I intptr_t
+and
+.I uintptr_t
+types in this page.
+.RE
 .\"--------------------------------------------------------------------/
 .SH NOTES
 The structures described in this manual page shall contain,