diff mbox

glob: Avoid copying the d_name field of struct dirent [BZ #19779]

Message ID 56FBBA94.1040605@redhat.com
State New
Headers show

Commit Message

Florian Weimer March 30, 2016, 11:37 a.m. UTC
On 03/11/2016 11:27 PM, Roland McGrath wrote:
>> --- a/posix/bug-glob2.c
>> +++ b/posix/bug-glob2.c
> [...]
>> @@ -75,7 +87,7 @@ typedef struct
>>    int level;
>>    int idx;
>>    struct dirent d;
>> -  char room_for_dirent[NAME_MAX];
>> +  char room_for_dirent[1000];
>>  } my_DIR;
> 
> There should be some comment about where this arbitrary size comes from.

I changed it to a sizeof.

>> +/* A representation of a directory entry which does not depend on the
>> +   layout of struct dirent, or the size of ino_t.  */
>> +struct abstract_dirent
>> +{
>> +  const char *d_name;
>> +  int d_type;
>> +  bool skip_entry;
>> +};
> 
> I'm not convinced that this is a good name for the struct.  It's not an
> abstract form of 'struct dirent', because the name member points into some
> other buffer whose lifetime is not intrinsically related to this struct.

It's now struct readdir_result.

> uint8_t is big enough for d_type, and will make the struct smaller on
> ILP32.

I assume the struct is turned into scalars anyway, but I added the
__typeof__.

>> +/* Extract name and type from directory entry.  No copy of the name is
>> +   made.  */
>> +static void
>> +convert_dirent (const struct dirent *source, struct abstract_dirent *target)
> 
> I like Paul's suggestion of using a struct-returning function here.  The
> comment should be explicit that the result will point to SOURCE->d_name and
> so SOURCE must be kept live.

I made these changes.

>> +#if defined _DIRENT_HAVE_D_TYPE || defined HAVE_STRUCT_DIRENT_D_TYPE
>> +  target->d_type = source->d_type;
>> +#else
>> +  target->d_type = 0;
>> +#endif
>> +#if (defined POSIX || defined WINDOWS32) && !defined __GNU_LIBRARY__
>> +  /* Posix does not require that the d_ino field be present, and some
>> +     systems do not provide it. */
>> +  target->skip_entry = false;
>> +#else
>> +  target->skip_entry = source->d_ino == 0;
>> +#endif
> 
> For both of these I'd prefer to see an inline or macro that encapsulates
> the #if stuff without other repetition, e.g.:

There are now separate macros for that.

> 	target->type = D_TYPE (source);
> 	target->skip_entry = D_SKIP_ME (source);
> 
>> +	      struct abstract_dirent e;
>> +	      {
>> +		struct dirent *d = (__builtin_expect (flags & GLOB_ALTDIRFUNC, 0)
>> +				    ? ((struct dirent *)
>> +				       (*pglob->gl_readdir) (stream))
>> +				    : __readdir (stream));
> 
> Convert to __glibc_unlikely while you're here (unless the gnulib sharing
> rules say not to, not clear on that off hand).

>> +		if (d == NULL)
>> +		  break;
>> +		convert_dirent (d, &e);
>> +	      }
> 
> This could all be nicely broken out into a function that takes STREAM,
> FLAGS, PGLOB, and returns E.

It's slightly more complicated than that because I accidentally removed
the interesting parts.

Further cleanups should wait until we have a decision whether we will
ever try to synchronize with gnulib again.

>> -		      len = NAMLEN (d);
>> -		      names->name[cur] = (char *) malloc (len + 1);
>> +		      names->name[cur] = strdup (e.d_name);
>>  		      if (names->name[cur] == NULL)
>>  			goto memory_error;
>> -		      *((char *) mempcpy (names->name[cur++], name, len))
>> -			= '\0';
>> +		      ++cur;
> 
> In the _DIRENT_HAVE_D_NAMLEN case, this assumes that strdup is as efficient
> as malloc+mempcpy (with no strlen required).  Perhaps it's close enough,
> but that is a subtle change you didn't mention as intended.

I want to avoid conditionalized code because these things tend to break
after a while (or never work in the first place).

I have tested the attached patch with fake-disabled d_type support, and
on i386-redhat-linux-gnu and x86_64-redhat-linux-gnu.

Florian

Comments

Roland McGrath March 30, 2016, 11:27 p.m. UTC | #1
> > uint8_t is big enough for d_type, and will make the struct smaller on
> > ILP32.
> 
> I assume the struct is turned into scalars anyway, but I added the
> __typeof__.

I like that fine, but we need to check if it's OK in gnulib.

> Further cleanups should wait until we have a decision whether we will
> ever try to synchronize with gnulib again.

The starting place should always be to expect that we want to share
everything we can with gnulib.  If there is something we in the abstract
can share with gnulib but you want to duplicate and maintain separately
rather than share, you need to make a clear case justifying the ways in
which it's better not to share.  Just that we have an annoying amount of
divergence already is not such a justification.  Rather, we should always
start by seeing if we can reharmonize before doing any other changes.  If
some cleanups on one side or the other make it easier to reharmonize, then
that's a reason to do them first.  But otherwise, we should be achieving
harmony first and then fixing things afterwards so the fixes are uniform
in both places.

> >> -		      len = NAMLEN (d);
> >> -		      names->name[cur] = (char *) malloc (len + 1);
> >> +		      names->name[cur] = strdup (e.d_name);
> >>  		      if (names->name[cur] == NULL)
> >>  			goto memory_error;
> >> -		      *((char *) mempcpy (names->name[cur++], name, len))
> >> -			= '\0';
> >> +		      ++cur;
> > 
> > In the _DIRENT_HAVE_D_NAMLEN case, this assumes that strdup is as efficient
> > as malloc+mempcpy (with no strlen required).  Perhaps it's close enough,
> > but that is a subtle change you didn't mention as intended.
> 
> I want to avoid conditionalized code because these things tend to break
> after a while (or never work in the first place).

That might be a sufficient reason to avoid writing such code in the first
place.  But when you are changing something from how it already is, you
need to be clear about what you are changing and why.  My point was not
that I had no idea what your motivation for the specific change might have
been.  My point was that you billed this whole larger change as doing one
thing, and then slipped this in without comment.

In this case, we already have supported configurations of both cases.
Hurd is _DIRENT_HAVE_D_NAMLEN (it uses the generic bits/dirent.h), while
others are not.  So just the configurations that work today not being
broken in the future (and being adequately tested) prevents "tending to
break after a while".

I am making a big deal out of this little case, and it's not because I
think this particular little performance-relevant change is so likely to
be important or even that I personally necessarily object to it.  What I
want to make a big deal about is the project's standards for attention
to detail and for clarity and transparency about intent and effect of
changes.  Sometimes "and I slipped this other little change in while I
was at it" is fine, though the starting place is always that every
separately-motivated change should be a separate change.  What's never
fine is slipping such a change in without comment or discussion, which
is what happened here.  Now we're discussing it rather than it going
unnoticed, which is what matters most.  But I am also concerned by the
way you responded to my review, which to me read as an off-hand remark
about a personal preference as if that were an adequate response.  Such
preferences are normal in the reasoning behind writing new code in a
particular way, and in such cases will likely go unremarked.  But here
you are changing existing code in a way that changes the compiled code
and its performance characteristics, and the only motivation you cited
was your preferences for how code should be all else being equal.  That
completely ignores the history behind the code being how it is today,
and hence ignores the project's key principle of conservatism.

> +/* Extract name and type from directory entry.  No copy of the name is
> +   made.  If SOURCE is NULL, result name is NULL.  */
> +static struct readdir_result
> +convert_dirent (const struct dirent *source)
> +{
> +  if (source == NULL)
> +    return (struct readdir_result) {};
> +  else
> +    return (struct readdir_result)
> +      {
> +	.name = source->d_name,
> +	D_INO_TO_RESULT (source),
> +	D_TYPE_TO_RESULT (source)
> +      };
> +}
> +
> +#ifndef COMPILE_GLOB64
> +/* Like convert_dirent, but works on struct dirent64 instead.  */
> +static struct readdir_result
> +convert_dirent64 (const struct dirent64 *source)
> +{
> +  if (source == NULL)
> +    return (struct readdir_result) {};
> +  else
> +    return (struct readdir_result)
> +      {
> +	.name = source->d_name,
> +	D_INO_TO_RESULT (source),
> +	D_TYPE_TO_RESULT (source)
> +      };
> +}
> +#endif

While inlines are better than macros when all else is equal, this
duplication of something that lends itself to a single-expression form
(?:) is a good reason to make it a macro instead.

Also, compound literals are a GNU extension and I'm not sure if they
meet gnulib's portability requirements or not.


Thanks,
Roland
Paul Eggert March 30, 2016, 11:58 p.m. UTC | #2
> +# if defined _DIRENT_HAVE_D_TYPE || defined HAVE_STRUCT_DIRENT_D_TYPE
 > +  __typeof__ (((struct dirent) {}).d_type) type;
 > +# endif

__typeof__ isn't portable to non-GCC compilers. How about if we just 
declare d_type to be 'unsigned char'? That should be portable enough in 
practice.


 > +# if defined _DIRENT_HAVE_D_TYPE || defined HAVE_STRUCT_DIRENT_D_TYPE
 > +/* Designated initializer based on the d_type member of struct
 > +   dirent.  */
 > +#  define D_TYPE_TO_RESULT(source) .type = (source)->d_type

Gnulib still ports to C89, which makes compound literals dicey. We've 
been thinking of upping that to C99 so this is not an insurmountable 
obstacle (so long as the code sticks to C99-compatible compound 
literals, which I think it does). Still, this macro and its relatives 
are a bit tricky, and if we're going to use tricky macros anyway perhaps 
we'd be better off having them expand to C89 as that shouldn't make 
things much more obscure.


 > +/* True if the directory entry D might be a symbolic link.  */
 > +static inline bool

In Gnulib we typically prefer to say just 'static' without the clutter 
of 'inline', for the same reason we typically don't bother with the 
clutter of 'register'.


 > +# define GL_READDIR(pglob, stream) (pglob)->gl_readdir ((stream))

No need for double parenthesization.
Florian Weimer March 31, 2016, 5:16 p.m. UTC | #3
On 03/31/2016 01:27 AM, Roland McGrath wrote:
>>> uint8_t is big enough for d_type, and will make the struct smaller on
>>> ILP32.
>>
>> I assume the struct is turned into scalars anyway, but I added the
>> __typeof__.
> 
> I like that fine, but we need to check if it's OK in gnulib.

I'm going to use unsigned char instead.

>>>> -		      len = NAMLEN (d);
>>>> -		      names->name[cur] = (char *) malloc (len + 1);
>>>> +		      names->name[cur] = strdup (e.d_name);
>>>>  		      if (names->name[cur] == NULL)
>>>>  			goto memory_error;
>>>> -		      *((char *) mempcpy (names->name[cur++], name, len))
>>>> -			= '\0';
>>>> +		      ++cur;
>>>
>>> In the _DIRENT_HAVE_D_NAMLEN case, this assumes that strdup is as efficient
>>> as malloc+mempcpy (with no strlen required).  Perhaps it's close enough,
>>> but that is a subtle change you didn't mention as intended.
>>
>> I want to avoid conditionalized code because these things tend to break
>> after a while (or never work in the first place).
> 
> That might be a sufficient reason to avoid writing such code in the first
> place.  But when you are changing something from how it already is, you
> need to be clear about what you are changing and why.  My point was not
> that I had no idea what your motivation for the specific change might have
> been.  My point was that you billed this whole larger change as doing one
> thing, and then slipped this in without comment.

Two more things:

For the non-glob64 case, I removed a copy of the name, which amounts to
one traversal less.  The internal strlen operation in strdup adds back a
string traversal.  So there is no net difference of the work done
(except that there are fewer writes).  CONVERT_DIRENT_DIRENT64 is used
in the glob64 code, so there is an unnecessary strlen operation there.

But the glibc tests for GLOB_ALTDIRFUNC do not set the d_namlen field.
Neither does the code in OpenSSH.  (GNU make sets d_namlen, though.)
Not looking at d_namlen avoids introducing application bugs.

d_ino/d_fileno has a similar problem, and even make doesn't set it. We
should probably ignore it altogether in GLOB_ALTDIRFUNC mode.

> In this case, we already have supported configurations of both cases.
> Hurd is _DIRENT_HAVE_D_NAMLEN (it uses the generic bits/dirent.h), while
> others are not.  So just the configurations that work today not being
> broken in the future (and being adequately tested) prevents "tending to
> break after a while".

I'm surprised there aren't any test failures on Hurd.  d_namlen is an
unsigned char, probably there is a sufficiently large value in it by
accident so that we end copying entire names, including the null
terminator, plus some extra bytes.

> I am making a big deal out of this little case, and it's not because I
> think this particular little performance-relevant change is so likely to
> be important or even that I personally necessarily object to it.  What I
> want to make a big deal about is the project's standards for attention
> to detail and for clarity and transparency about intent and effect of
> changes.  Sometimes "and I slipped this other little change in while I
> was at it" is fine, though the starting place is always that every
> separately-motivated change should be a separate change.

I can remove the use of d_namlen as a separate patch, before the copy
avoidance.  I did not realize it has implications on application
behavior at the time, for which I apologize.  But now, I still think
it's a desirable improvement.

> What's never
> fine is slipping such a change in without comment or discussion, which
> is what happened here.  Now we're discussing it rather than it going
> unnoticed, which is what matters most.  But I am also concerned by the
> way you responded to my review, which to me read as an off-hand remark
> about a personal preference as if that were an adequate response.  Such
> preferences are normal in the reasoning behind writing new code in a
> particular way, and in such cases will likely go unremarked.  But here
> you are changing existing code in a way that changes the compiled code
> and its performance characteristics, and the only motivation you cited
> was your preferences for how code should be all else being equal.  That
> completely ignores the history behind the code being how it is today,
> and hence ignores the project's key principle of conservatism.

Sorry, I don't follow.  I expressed a preference for a coding strategy
that reduces risk by aligning behavior across architectures, so that we
can reuse the results of testing on one set of architectures for others.

I think this approach is better than to check in code that wasn't tested
at all, or demand that every developer tests changes in generic code on
Hurd, x32, alpha, and other architectures which have interesting
differences not found anywhere else.

>> +/* Extract name and type from directory entry.  No copy of the name is
>> +   made.  If SOURCE is NULL, result name is NULL.  */
>> +static struct readdir_result
>> +convert_dirent (const struct dirent *source)
>> +{
>> +  if (source == NULL)
>> +    return (struct readdir_result) {};
>> +  else
>> +    return (struct readdir_result)
>> +      {
>> +	.name = source->d_name,
>> +	D_INO_TO_RESULT (source),
>> +	D_TYPE_TO_RESULT (source)
>> +      };
>> +}
>> +
>> +#ifndef COMPILE_GLOB64
>> +/* Like convert_dirent, but works on struct dirent64 instead.  */
>> +static struct readdir_result
>> +convert_dirent64 (const struct dirent64 *source)
>> +{
>> +  if (source == NULL)
>> +    return (struct readdir_result) {};
>> +  else
>> +    return (struct readdir_result)
>> +      {
>> +	.name = source->d_name,
>> +	D_INO_TO_RESULT (source),
>> +	D_TYPE_TO_RESULT (source)
>> +      };
>> +}
>> +#endif
> 
> While inlines are better than macros when all else is equal, this
> duplication of something that lends itself to a single-expression form
> (?:) is a good reason to make it a macro instead.

The call site relies on the function nature because the source argument
is evaluated multiple times.  I could change that by using a temporary
at the call site, but ...

> Also, compound literals are a GNU extension and I'm not sure if they
> meet gnulib's portability requirements or not.

... avoiding the compound literal requires writing a function.  I could
put the shared body into a macro, though.

Florian
Florian Weimer March 31, 2016, 5:39 p.m. UTC | #4
On 03/31/2016 07:16 PM, Florian Weimer wrote:
> On 03/31/2016 01:27 AM, Roland McGrath wrote:
>>>> uint8_t is big enough for d_type, and will make the struct smaller on
>>>> ILP32.
>>>
>>> I assume the struct is turned into scalars anyway, but I added the
>>> __typeof__.
>>
>> I like that fine, but we need to check if it's OK in gnulib.
> 
> I'm going to use unsigned char instead.
> 
>>>>> -		      len = NAMLEN (d);
>>>>> -		      names->name[cur] = (char *) malloc (len + 1);
>>>>> +		      names->name[cur] = strdup (e.d_name);
>>>>>  		      if (names->name[cur] == NULL)
>>>>>  			goto memory_error;
>>>>> -		      *((char *) mempcpy (names->name[cur++], name, len))
>>>>> -			= '\0';
>>>>> +		      ++cur;
>>>>
>>>> In the _DIRENT_HAVE_D_NAMLEN case, this assumes that strdup is as efficient
>>>> as malloc+mempcpy (with no strlen required).  Perhaps it's close enough,
>>>> but that is a subtle change you didn't mention as intended.
>>>
>>> I want to avoid conditionalized code because these things tend to break
>>> after a while (or never work in the first place).
>>
>> That might be a sufficient reason to avoid writing such code in the first
>> place.  But when you are changing something from how it already is, you
>> need to be clear about what you are changing and why.  My point was not
>> that I had no idea what your motivation for the specific change might have
>> been.  My point was that you billed this whole larger change as doing one
>> thing, and then slipped this in without comment.
> 
> Two more things:
> 
> For the non-glob64 case, I removed a copy of the name, which amounts to
> one traversal less.  The internal strlen operation in strdup adds back a
> string traversal.  So there is no net difference of the work done
> (except that there are fewer writes).  CONVERT_DIRENT_DIRENT64 is used
> in the glob64 code, so there is an unnecessary strlen operation there.
> 
> But the glibc tests for GLOB_ALTDIRFUNC do not set the d_namlen field.
> Neither does the code in OpenSSH.  (GNU make sets d_namlen, though.)
> Not looking at d_namlen avoids introducing application bugs.
> 
> d_ino/d_fileno has a similar problem, and even make doesn't set it. We
> should probably ignore it altogether in GLOB_ALTDIRFUNC mode.

Correction: make initializes d_ino to 1.

Florian
Roland McGrath April 1, 2016, 9:08 p.m. UTC | #5
>  > +# if defined _DIRENT_HAVE_D_TYPE || defined HAVE_STRUCT_DIRENT_D_TYPE
>  > +  __typeof__ (((struct dirent) {}).d_type) type;
>  > +# endif
> 
> __typeof__ isn't portable to non-GCC compilers. How about if we just 
> declare d_type to be 'unsigned char'? That should be portable enough in 
> practice.

Fine by me.

> Gnulib still ports to C89, which makes compound literals dicey. We've 
> been thinking of upping that to C99 so this is not an insurmountable 
> obstacle (so long as the code sticks to C99-compatible compound 
> literals, which I think it does). Still, this macro and its relatives 
> are a bit tricky, and if we're going to use tricky macros anyway perhaps 
> we'd be better off having them expand to C89 as that shouldn't make 
> things much more obscure.

In this case I think it's adequate to use a non-initializing
definition followed by assignments.  That avoids the vagaries of
relying on field order.  Florian can verify that it compiles to the
same thing, which I expect it will.

>  > +/* True if the directory entry D might be a symbolic link.  */
>  > +static inline bool
> 
> In Gnulib we typically prefer to say just 'static' without the clutter 
> of 'inline', for the same reason we typically don't bother with the 
> clutter of 'register'.

Actually, in libc it's our standard style to use just "static" in C
source files.  The principle is that the compiler will decide whether
to inline.  I failed to catch this style violation in my review.  (We
do still use "static inline" in header files, but basically just
because it's shorter to write than "static __attribute__ ((unused))".)

>  > +# define GL_READDIR(pglob, stream) (pglob)->gl_readdir ((stream))
> 
> No need for double parenthesization.

Canonical parenthesization would be:

# define GL_READDIR(pglob, stream) ((pglob)->gl_readdir (stream))


Thanks,
Roland
Roland McGrath April 1, 2016, 9:55 p.m. UTC | #6
> I can remove the use of d_namlen as a separate patch, before the copy
> avoidance.  I did not realize it has implications on application
> behavior at the time, for which I apologize.  But now, I still think
> it's a desirable improvement.

That sounds great.

> Sorry, I don't follow.  I expressed a preference for a coding strategy
> that reduces risk by aligning behavior across architectures, so that we
> can reuse the results of testing on one set of architectures for others.

My point was that you included a change that did not seem relevant to the
stated goal, and your response to my review still didn't clearly say why
the status quo had to be changed.  Separating small changes out into their
own review threads is the best way to be sure that the rationale for the
change was stated clearly and gets discussed fully.

> > While inlines are better than macros when all else is equal, this
> > duplication of something that lends itself to a single-expression form
> > (?:) is a good reason to make it a macro instead.
> 
> The call site relies on the function nature because the source argument
> is evaluated multiple times.  I could change that by using a temporary
> at the call site, but ...
> 
> > Also, compound literals are a GNU extension and I'm not sure if they
> > meet gnulib's portability requirements or not.
> 
> ... avoiding the compound literal requires writing a function.  I could
> put the shared body into a macro, though.

As you can tell, I didn't drill down to all the details of what's possible.
I just noticed the glaring duplication and wanted to push back on that.
Sometimes duplication is the best of the available options, which are all
imperfect.  When that's so, it merits a comment about why duplication is
what makes the most sense.  The way to read my comment is, "This looks like
a fishy level of duplication; see if you can find something better, or
explain why this is the best we can do."


Thanks,
Roland
diff mbox

Patch

2016-03-30  Florian Weimer  <fweimer@redhat.com>

	[BZ #19779]
	CVE-2016-1234
	Avoid copying names of directory entries.
	* posix/glob.c (NAMELEN, DIRENT_MUST_BE, DIRENT_MIGHT_BE_SYMLINK)
	(DIRENT_MIGHT_BE_DIR, CONVERT_D_NAMLEN, CONVERT_D_INO)
	(CONVERT_D_TYPE, CONVERT_DIRENT_DIRENT64, REAL_DIR_ENTRY)
	(NAME_MAX): Remove macros.
	(struct readdir_result): New type.
	(D_TYPE_TO_RESULT, D_TYPE_TO_RESULT, GL_READDIR): New macros.
	(readdir_result_might_be_symlink, readdir_result_might_be_dir):
	New functions.
	(convert_dirent, convert_dirent64): New function.
	(glob_in_dir): Use struct dirent_result.  Call convert_dirent or
	convert_dirent64.  Adjust references to the readdir result.  Use
	strdup instead of NAMELEN and mempcpy.
	* sysdeps/unix/sysv/linux/i386/glob64.c:
	(convert_dirent, GL_READDIR): Redefine for second file inclusion.
	* posix/bug-glob2.c (LONG_NAME): Define.
	(filesystem): Add LONG_NAME.
	(my_DIR): Increase the size of room_for_dirent.

diff --git a/posix/bug-glob2.c b/posix/bug-glob2.c
index ddf5ec9..fca048b 100644
--- a/posix/bug-glob2.c
+++ b/posix/bug-glob2.c
@@ -40,6 +40,17 @@ 
 # define PRINTF(fmt, args...)
 #endif
 
+#define LONG_NAME \
+  "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx" \
+  "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx" \
+  "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx" \
+  "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx" \
+  "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx" \
+  "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx" \
+  "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx" \
+  "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx" \
+  "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx" \
+  "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
 
 static struct
 {
@@ -58,6 +69,7 @@  static struct
       { ".", 3, DT_DIR, 0755 },
       { "..", 3, DT_DIR, 0755 },
       { "a", 3, DT_REG, 0644 },
+      { LONG_NAME, 3, DT_REG, 0644 },
     { "unreadable", 2, DT_DIR, 0111 },
       { ".", 3, DT_DIR, 0111 },
       { "..", 3, DT_DIR, 0755 },
@@ -75,7 +87,7 @@  typedef struct
   int level;
   int idx;
   struct dirent d;
-  char room_for_dirent[NAME_MAX];
+  char room_for_dirent[sizeof (LONG_NAME)];
 } my_DIR;
 
 
diff --git a/posix/glob.c b/posix/glob.c
index 0c04c3c..e043edc 100644
--- a/posix/glob.c
+++ b/posix/glob.c
@@ -24,6 +24,7 @@ 
 #include <errno.h>
 #include <sys/types.h>
 #include <sys/stat.h>
+#include <stdbool.h>
 #include <stddef.h>
 
 /* Outcomment the following line for production quality code.  */
@@ -57,10 +58,8 @@ 
 
 #if defined HAVE_DIRENT_H || defined __GNU_LIBRARY__
 # include <dirent.h>
-# define NAMLEN(dirent) strlen((dirent)->d_name)
 #else
 # define dirent direct
-# define NAMLEN(dirent) (dirent)->d_namlen
 # ifdef HAVE_SYS_NDIR_H
 #  include <sys/ndir.h>
 # endif
@@ -75,82 +74,8 @@ 
 # endif /* HAVE_VMSDIR_H */
 #endif
 
-
-/* In GNU systems, <dirent.h> defines this macro for us.  */
-#ifdef _D_NAMLEN
-# undef NAMLEN
-# define NAMLEN(d) _D_NAMLEN(d)
-#endif
-
-/* When used in the GNU libc the symbol _DIRENT_HAVE_D_TYPE is available
-   if the `d_type' member for `struct dirent' is available.
-   HAVE_STRUCT_DIRENT_D_TYPE plays the same role in GNULIB.  */
-#if defined _DIRENT_HAVE_D_TYPE || defined HAVE_STRUCT_DIRENT_D_TYPE
-/* True if the directory entry D must be of type T.  */
-# define DIRENT_MUST_BE(d, t)	((d)->d_type == (t))
-
-/* True if the directory entry D might be a symbolic link.  */
-# define DIRENT_MIGHT_BE_SYMLINK(d) \
-    ((d)->d_type == DT_UNKNOWN || (d)->d_type == DT_LNK)
-
-/* True if the directory entry D might be a directory.  */
-# define DIRENT_MIGHT_BE_DIR(d)	 \
-    ((d)->d_type == DT_DIR || DIRENT_MIGHT_BE_SYMLINK (d))
-
-#else /* !HAVE_D_TYPE */
-# define DIRENT_MUST_BE(d, t)		false
-# define DIRENT_MIGHT_BE_SYMLINK(d)	true
-# define DIRENT_MIGHT_BE_DIR(d)		true
-#endif /* HAVE_D_TYPE */
-
-/* If the system has the `struct dirent64' type we use it internally.  */
-#if defined _LIBC && !defined COMPILE_GLOB64
-# if defined HAVE_DIRENT_H || defined __GNU_LIBRARY__
-#  define CONVERT_D_NAMLEN(d64, d32)
-# else
-#  define CONVERT_D_NAMLEN(d64, d32) \
-  (d64)->d_namlen = (d32)->d_namlen;
-# endif
-
-# if (defined POSIX || defined WINDOWS32) && !defined __GNU_LIBRARY__
-#  define CONVERT_D_INO(d64, d32)
-# else
-#  define CONVERT_D_INO(d64, d32) \
-  (d64)->d_ino = (d32)->d_ino;
-# endif
-
-# ifdef _DIRENT_HAVE_D_TYPE
-#  define CONVERT_D_TYPE(d64, d32) \
-  (d64)->d_type = (d32)->d_type;
-# else
-#  define CONVERT_D_TYPE(d64, d32)
-# endif
-
-# define CONVERT_DIRENT_DIRENT64(d64, d32) \
-  memcpy ((d64)->d_name, (d32)->d_name, NAMLEN (d32) + 1);		      \
-  CONVERT_D_NAMLEN (d64, d32)						      \
-  CONVERT_D_INO (d64, d32)						      \
-  CONVERT_D_TYPE (d64, d32)
-#endif
-
-
-#if (defined POSIX || defined WINDOWS32) && !defined __GNU_LIBRARY__
-/* Posix does not require that the d_ino field be present, and some
-   systems do not provide it. */
-# define REAL_DIR_ENTRY(dp) 1
-#else
-# define REAL_DIR_ENTRY(dp) (dp->d_ino != 0)
-#endif /* POSIX */
-
 #include <stdlib.h>
 #include <string.h>
-
-/* NAME_MAX is usually defined in <dirent.h> or <limits.h>.  */
-#include <limits.h>
-#ifndef NAME_MAX
-# define NAME_MAX (sizeof (((struct dirent *) 0)->d_name))
-#endif
-
 #include <alloca.h>
 
 #ifdef _LIBC
@@ -195,8 +120,104 @@ 
 
 static const char *next_brace_sub (const char *begin, int flags) __THROWNL;
 
+/* A representation of a directory entry which does not depend on the
+   layout of struct dirent, or the size of ino_t.  */
+struct readdir_result
+{
+  const char *name;
+  bool skip_entry;
+# if defined _DIRENT_HAVE_D_TYPE || defined HAVE_STRUCT_DIRENT_D_TYPE
+  __typeof__ (((struct dirent) {}).d_type) type;
+# endif
+};
+
+# if defined _DIRENT_HAVE_D_TYPE || defined HAVE_STRUCT_DIRENT_D_TYPE
+/* Designated initializer based on the d_type member of struct
+   dirent.  */
+#  define D_TYPE_TO_RESULT(source) .type = (source)->d_type
+
+/* True if the directory entry D might be a symbolic link.  */
+static inline bool
+readdir_result_might_be_symlink (struct readdir_result d)
+{
+  return d.type == DT_UNKNOWN || d.type == DT_LNK;
+}
+
+/* True if the directory entry D might be a directory.  */
+static inline bool
+readdir_result_might_be_dir (struct readdir_result d)
+{
+  return d.type == DT_DIR || readdir_result_might_be_symlink (d);
+}
+# else
+#  define D_TYPE_TO_RESULT(source)
+
+/* If we do not have type information, symbolic links and directories
+   are always a possibility.  */
+
+static inline bool
+readdir_result_might_be_symlink (struct readdir_result d)
+{
+  return true;
+}
+
+static inline bool
+readdir_result_might_be_dir (struct readdir_result d)
+{
+  return true;
+}
+
+# endif
+
+# if (defined POSIX || defined WINDOWS32) && !defined __GNU_LIBRARY__
+/* Designated initializer for skip_entry.  POSIX does not require that
+   the d_ino field be present, and some systems do not provide it. */
+#  define D_INO_TO_RESULT(source) .skip_entry = false
+# else
+#  define D_INO_TO_RESULT(source) .skip_entry = (source)->d_ino == 0
+# endif
+
 #endif /* !defined _LIBC || !defined GLOB_ONLY_P */
 
+/* Call gl_readdir on STREAM.  This macro can be overridden to reduce
+   type safety if an old interface version needs to be supported.  */
+#ifndef GL_READDIR
+# define GL_READDIR(pglob, stream) (pglob)->gl_readdir ((stream))
+#endif
+
+/* Extract name and type from directory entry.  No copy of the name is
+   made.  If SOURCE is NULL, result name is NULL.  */
+static struct readdir_result
+convert_dirent (const struct dirent *source)
+{
+  if (source == NULL)
+    return (struct readdir_result) {};
+  else
+    return (struct readdir_result)
+      {
+	.name = source->d_name,
+	D_INO_TO_RESULT (source),
+	D_TYPE_TO_RESULT (source)
+      };
+}
+
+#ifndef COMPILE_GLOB64
+/* Like convert_dirent, but works on struct dirent64 instead.  */
+static struct readdir_result
+convert_dirent64 (const struct dirent64 *source)
+{
+  if (source == NULL)
+    return (struct readdir_result) {};
+  else
+    return (struct readdir_result)
+      {
+	.name = source->d_name,
+	D_INO_TO_RESULT (source),
+	D_TYPE_TO_RESULT (source)
+      };
+}
+#endif
+
 #ifndef attribute_hidden
 # define attribute_hidden
 #endif
@@ -1553,56 +1574,36 @@  glob_in_dir (const char *pattern, const char *directory, int flags,
 
 	  while (1)
 	    {
-	      const char *name;
-	      size_t len;
-#if defined _LIBC && !defined COMPILE_GLOB64
-	      struct dirent64 *d;
-	      union
-		{
-		  struct dirent64 d64;
-		  char room [offsetof (struct dirent64, d_name[0])
-			     + NAME_MAX + 1];
-		}
-	      d64buf;
-
-	      if (__glibc_unlikely (flags & GLOB_ALTDIRFUNC))
-		{
-		  struct dirent *d32 = (*pglob->gl_readdir) (stream);
-		  if (d32 != NULL)
-		    {
-		      CONVERT_DIRENT_DIRENT64 (&d64buf.d64, d32);
-		      d = &d64buf.d64;
-		    }
-		  else
-		    d = NULL;
-		}
-	      else
-		d = __readdir64 (stream);
+	      struct readdir_result d;
+	      {
+		if (__builtin_expect (flags & GLOB_ALTDIRFUNC, 0))
+		  d = convert_dirent (GL_READDIR (pglob, stream));
+		else
+		  {
+#ifdef COMPILE_GLOB64
+		    d = convert_dirent (__readdir (stream));
 #else
-	      struct dirent *d = (__builtin_expect (flags & GLOB_ALTDIRFUNC, 0)
-				  ? ((struct dirent *)
-				     (*pglob->gl_readdir) (stream))
-				  : __readdir (stream));
+		    d = convert_dirent64 (__readdir64 (stream));
 #endif
-	      if (d == NULL)
+		  }
+	      }
+	      if (d.name == NULL)
 		break;
-	      if (! REAL_DIR_ENTRY (d))
+	      if (d.skip_entry)
 		continue;
 
 	      /* If we shall match only directories use the information
 		 provided by the dirent call if possible.  */
-	      if ((flags & GLOB_ONLYDIR) && !DIRENT_MIGHT_BE_DIR (d))
+	      if ((flags & GLOB_ONLYDIR) && !readdir_result_might_be_dir (d))
 		continue;
 
-	      name = d->d_name;
-
-	      if (fnmatch (pattern, name, fnm_flags) == 0)
+	      if (fnmatch (pattern, d.name, fnm_flags) == 0)
 		{
 		  /* If the file we found is a symlink we have to
 		     make sure the target file exists.  */
-		  if (!DIRENT_MIGHT_BE_SYMLINK (d)
-		      || link_exists_p (dfd, directory, dirlen, name, pglob,
-					flags))
+		  if (!readdir_result_might_be_symlink (d)
+		      || link_exists_p (dfd, directory, dirlen, d.name,
+					pglob, flags))
 		    {
 		      if (cur == names->count)
 			{
@@ -1622,12 +1623,10 @@  glob_in_dir (const char *pattern, const char *directory, int flags,
 			  names = newnames;
 			  cur = 0;
 			}
-		      len = NAMLEN (d);
-		      names->name[cur] = (char *) malloc (len + 1);
+		      names->name[cur] = strdup (d.name);
 		      if (names->name[cur] == NULL)
 			goto memory_error;
-		      *((char *) mempcpy (names->name[cur++], name, len))
-			= '\0';
+		      ++cur;
 		      ++nfound;
 		    }
 		}
diff --git a/sysdeps/unix/sysv/linux/i386/glob64.c b/sysdeps/unix/sysv/linux/i386/glob64.c
index b4fcd1a..bc0fabc 100644
--- a/sysdeps/unix/sysv/linux/i386/glob64.c
+++ b/sysdeps/unix/sysv/linux/i386/glob64.c
@@ -1,3 +1,21 @@ 
+/* Two glob variants with 64-bit support, for dirent64 and __olddirent64.
+   Copyright (C) 1998-2016 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+
 #include <dirent.h>
 #include <glob.h>
 #include <sys/stat.h>
@@ -38,11 +56,15 @@  int __old_glob64 (const char *__pattern, int __flags,
 
 #undef dirent
 #define dirent __old_dirent64
+#undef GL_READDIR
+# define GL_READDIR(pglob, stream) \
+  ((struct __old_dirent64 *) (pglob)->gl_readdir ((stream)))
 #undef __readdir
 #define __readdir(dirp) __old_readdir64 (dirp)
 #undef glob
 #define glob(pattern, flags, errfunc, pglob) \
   __old_glob64 (pattern, flags, errfunc, pglob)
+#define convert_dirent __old_convert_dirent
 #define glob_in_dir __old_glob_in_dir
 #define GLOB_ATTRIBUTE attribute_compat_text_section