Message ID | 56D61A86.3050108@cs.ucla.edu |
---|---|
State | New |
Headers | show |
On 03/01/2016 11:41 PM, Paul Eggert wrote: > On 03/01/2016 02:16 PM, Florian Weimer wrote: >>> Why not use a flexible array member for this? >> For which part, and how exactly? > > Something like the attached patch, say. (Totally untested.) > >> You can't put a flexible array member into a transparent union. > > That's OK. Any such usage of struct dirent would be unportable anyway. > >> If you mean to add some zero-width padding member at the end of the >> struct, after the d_name member, then I'm worried that makes overrunning >> the d_name array member even more undefined than it already is. > > No, no padding member, just use C99 the way it was designed. This > should improve overrun detection in programs like valgrind. With glibc's > current definition these programs can be fooled into thinking that > struct dirent accesses are invalid (outside of array bounds) when they > are actually OK, so people shut off array-bounds checking. If we used > flexible array members, valgrind etc. should know that the array's upper > bound is unknown and should not issue so many false alarms, so people > can leave bounds checking on. I don't think valgrind can see the difference, but you are correct in principle (this is essentially the “undefined” part I was worried about). Unfortunately, GCC does not produce a warning for taking the size of a struct with a flexible member, or for using it in a non-pointer declarator, so it does only half of what we want. And at the cost of changing sizeof (struct dirent), which can't be a good thing. > If flexible arrays are no-go for some reason, I suppose we could use > 'char 'd_name[SIZE_MAX - 1000];' instead. That should get peoples' > attention. :-) GCC refuses to compile the type definition, not just declarations. Refusing declarations with an error would break quite a lot of existing configure tests. struct dirent d; int z; z - d.d_ino; is a common idiom to check for struct members. Florian
On 03/01/2016 03:07 PM, Florian Weimer wrote: > GCC does not produce a warning for taking the size of a > struct with a flexible member, or for using it in a non-pointer > declarator, so it does only half of what we want. Ouch. That's annoying, but I guess C11 requires it. At least we get half a loaf.... > And at the cost of > changing sizeof (struct dirent), which can't be a good thing. Any program that depends on sizeof (struct dirent) is broken already, so this isn't that worrisome.
On 03/02/2016 12:25 AM, Paul Eggert wrote: >> And at the cost of >> changing sizeof (struct dirent), which can't be a good thing. > > Any program that depends on sizeof (struct dirent) is broken already, so > this isn't that worrisome. Just to be clear, you looked at the wrong struct dirent definition for GNU/Linux, there is a sysdeps override. Right now, most programs relying on sizeof (struct dirent) work well in almost all cases. We really don't want to break that. There appears to be an overlap between these programs and users of readdir_r, so once we remove that from the API, we should have better story for struct dirent declarators as well. Florian
On 03/02/2016 12:44 AM, Florian Weimer wrote: > On 03/02/2016 12:25 AM, Paul Eggert wrote: > >>> And at the cost of >>> changing sizeof (struct dirent), which can't be a good thing. >> >> Any program that depends on sizeof (struct dirent) is broken already, so >> this isn't that worrisome. > > Just to be clear, you looked at the wrong struct dirent definition for > GNU/Linux, there is a sysdeps override. > > Right now, most programs relying on sizeof (struct dirent) work well in > almost all cases. We really don't want to break that. There appears to > be an overlap between these programs and users of readdir_r, so once we > remove that from the API, we should have better story for struct dirent > declarators as well. So, it seems like much more could be said about this in documentation. How about the following text for the man page? DESCRIPTION [...] In the glibc implementation, the dirent structure is defined as follows: struct dirent { ino_t d_ino; /* Inode number */ off_t d_off; /* Not an offset; see below */ unsigned short d_reclen; /* Length of this record */ unsigned char d_type; /* Type of file; not supported by all filesystem types */ char d_name[256]; /* Null-terminated filename */ }; [...] NOTES The d_name field The dirent structure definition shown above is taken from the glibc headers, and shows the d_name field with a fixed size. Warning: applications should avoid any dependence on the size of the dname field. POSIX defines it as char d_name[], a char‐ acter array of unspecified size, with at most NAME_MAX charac‐ ters preceding the terminating null byte ('\0'). POSIX.1 explicitly notes that this field should not be used as an lvalue. The standard also notes that the use of sizeof(d_name) (and by implication sizeof(struct dirent)) is incorrect; use strlen(d_name) instead. (On some systems, this field is defined as char d_name[1]!) Note that while the call fpathconf(fd, _PC_NAME_MAX) returns the value 255 for most filesystems, on some filesystems (e.g., CIFS, Windows SMB servers), the null-terminated filename that is (correctly) returned in d_name can actually exceed this size. (In such cases, the d_reclen field will contain a value that exceeds the size of the glibc dirent structure shown above.) Cheers, Michael
Ping on this man page text. Anyone have an opinion for or against? Thanks, Michael On 03/02/2016 11:39 AM, Michael Kerrisk (man-pages) wrote: > On 03/02/2016 12:44 AM, Florian Weimer wrote: >> On 03/02/2016 12:25 AM, Paul Eggert wrote: >> >>>> And at the cost of >>>> changing sizeof (struct dirent), which can't be a good thing. >>> >>> Any program that depends on sizeof (struct dirent) is broken already, so >>> this isn't that worrisome. >> >> Just to be clear, you looked at the wrong struct dirent definition for >> GNU/Linux, there is a sysdeps override. >> >> Right now, most programs relying on sizeof (struct dirent) work well in >> almost all cases. We really don't want to break that. There appears to >> be an overlap between these programs and users of readdir_r, so once we >> remove that from the API, we should have better story for struct dirent >> declarators as well. > > So, it seems like much more could be said about this in documentation. > How about the following text for the man page? > > DESCRIPTION > > [...] > > In the glibc implementation, the dirent structure is defined as > follows: > > struct dirent { > ino_t d_ino; /* Inode number */ > off_t d_off; /* Not an offset; see below */ > unsigned short d_reclen; /* Length of this record */ > unsigned char d_type; /* Type of file; not supported > by all filesystem types */ > char d_name[256]; /* Null-terminated filename */ > }; > > [...] > NOTES > The d_name field > The dirent structure definition shown above is taken from the > glibc headers, and shows the d_name field with a fixed size. > > Warning: applications should avoid any dependence on the size > of the dname field. POSIX defines it as char d_name[], a char‐ > acter array of unspecified size, with at most NAME_MAX charac‐ > ters preceding the terminating null byte ('\0'). > > POSIX.1 explicitly notes that this field should not be used as > an lvalue. The standard also notes that the use of > sizeof(d_name) (and by implication sizeof(struct dirent)) is > incorrect; use strlen(d_name) instead. (On some systems, this > field is defined as char d_name[1]!) > > Note that while the call > > fpathconf(fd, _PC_NAME_MAX) > > returns the value 255 for most filesystems, on some filesystems > (e.g., CIFS, Windows SMB servers), the null-terminated filename > that is (correctly) returned in d_name can actually exceed this > size. (In such cases, the d_reclen field will contain a value > that exceeds the size of the glibc dirent structure shown > above.) > > Cheers, > > Michael >
On 03/02/2016 11:39 AM, Michael Kerrisk (man-pages) wrote:
> of the dname field. POSIX defines it as char d_name[], a char‐
Should be “d_name” instead of “dname”.
Otherwise looks fine.
Thanks,
Florian
On 03/10/2016 12:22 PM, Florian Weimer wrote: > On 03/02/2016 11:39 AM, Michael Kerrisk (man-pages) wrote: >> of the dname field. POSIX defines it as char d_name[], a char‐ > > Should be “d_name” instead of “dname”. Fixed now. > Otherwise looks fine. Thanks, Florian! Cheers, Michael
diff --git a/bits/dirent.h b/bits/dirent.h index 7b79a53..8546c29 100644 --- a/bits/dirent.h +++ b/bits/dirent.h @@ -32,7 +32,7 @@ struct dirent unsigned char d_namlen; /* Length of the file name. */ /* Only this member is in the POSIX standard. */ - char d_name[1]; /* File name (actually longer). */ + char d_name __flexarr; /* File name. */ }; #ifdef __USE_LARGEFILE64 @@ -42,8 +42,7 @@ struct dirent64 unsigned short int d_reclen; unsigned char d_type; unsigned char d_namlen; - - char d_name[1]; + char d_name __flexarr; /* File name. */ }; #endif