diff mbox

libstdc++/67747 Allocate space for dirent::d_name

Message ID 20151001172337.GI12094@redhat.com
State New
Headers show

Commit Message

Jonathan Wakely Oct. 1, 2015, 5:23 p.m. UTC
On 30/09/15 09:30 -0600, Martin Sebor wrote:
>On 09/30/2015 05:01 AM, Jonathan Wakely wrote:
>>On 29/09/15 12:54 -0600, Martin Sebor wrote:
>>>On 09/29/2015 05:37 AM, Jonathan Wakely wrote:
>>>>POSIX says that dirent::d_name has an unspecified length, so calls to
>>>>readdir_r must pass a buffer with enough trailing space for
>>>>{NAME_MAX}+1 characters. I wasn't doing that, which works OK on
>>>>GNU/Linux and BSD where d_name is a large array, but fails on Solaris
>>>>32-bit.
>>>>
>>>>This uses pathconf to get NAME_MAX and allocates a buffer.
>>>>
>>>>Tested powerpc64le-linux and x86_64-dragonfly4.1, I'm going to commit
>>>>this to trunk today (and backport all the filesystem fixes to
>>>>gcc-5-branch).
>>>
>>>Calling pathconf is only necessary when _POSIX_NO_TRUNC is zero
>>>which I think exists mainly for legacy file systems. Otherwise,
>>>it's safe to use NAME_MAX instead. Avoiding the call to pathconf
>>
>>Oh, nice. I was using NAME_MAX originally but the glibc readdir_r(3)
>>man-page has an example using pathconf and says that should be used,
>>so I went down that route.
>
>GLIBC pathconf calls statfs to get the NAME_MAX value from the OS,
>so there's some chance that some unusual file system will define
>it as greater than 255. I tested all those on my Fedora PC and on
>my RHEL 7.2 powerpc64le box and didn't find one.
>
>There also isn't one in the Wikipedia comparison of file systems:
>  https://en.wikipedia.org/wiki/Comparison_of_file_systems
>
>But to be 100% safe, we would need to call pathconf if readdir
>failed due to truncation.
>
>>Can we be sure NAME_MAX will never be too big for the stack?
>
>When it's defined I think it's probably safe in practice but not
>guaranteed. Also, like all of these <limits.h> constants, it's not
>guaranteed to be defined at all when the implementation doesn't
>impose a limit. I believe AIX is one implementation that doesn't
>define it. So some preprocessor magic will be required to deal
>with that.

OK, latest version attached. This defines a helper type,
dirent_buffer, which either contains aligned_union<N, dirent> or
unique_ptr<void*, op_delete> depending on whether NAME_MAX is defined
or whether we have to use pathconf at run-time.

The dirent_buffer is created as part of the directory_iterator or
recursive_directory_iterator internals, so only once per iterator, and
shared between copies of that iterator. When NAME_MAX is defined there
is no second allocation, we just allocate a bit more space.

Tested powerpc64le-linux and x86_64-dragonfly.

I've started a build on powerpc-aix too but I don't know when the
tests for that will finish.

Comments

Jonathan Wakely Oct. 1, 2015, 6:38 p.m. UTC | #1
On 01/10/15 18:23 +0100, Jonathan Wakely wrote:
>+    struct op_delete {
>+	void operator()(void* p) const { ::operator delete(p); }
>+    };
>+    std::unique_ptr<::dirent*, op_delete> ptr;

Oops, that should be dirent not dirent* i.e.

    std::unique_ptr<::dirent, op_delete> ptr;

(Found on AIX, where NAME_MAX isn't defined.)
diff mbox

Patch

commit 6b30f2675aee599e43bcb55594de3abd2221c3ae
Author: Jonathan Wakely <jwakely@redhat.com>
Date:   Thu Oct 1 17:01:24 2015 +0100

    PR libstdc++/67747 Allocate space for dirent::d_name
    
    	PR libstdc++/67747
    	* src/filesystem/dir.cc (_Dir::direntp): New member.
    	(dirent_size, dirent_buffer): New helpers for readdir results.
    	(native_readdir) [_GLIBCXX_FILESYSTEM_IS_WINDOWS]: Copy to supplied
    	dirent object. Handle end of directory.
    	(_Dir::advance): Allocate space for d_name.
    	(directory_iterator(const path&, directory_options, error_code*)):
    	Allocate a dirent_buffer alongside the _Dir object.
    	(recursive_directory_iterator::_Dir_stack): Add dirent_buffer member,
    	constructor and push() function that sets the element's entp.

diff --git a/libstdc++-v3/src/filesystem/dir.cc b/libstdc++-v3/src/filesystem/dir.cc
index bce751c..9372074 100644
--- a/libstdc++-v3/src/filesystem/dir.cc
+++ b/libstdc++-v3/src/filesystem/dir.cc
@@ -25,8 +25,12 @@ 
 #include <experimental/filesystem>
 #include <utility>
 #include <stack>
+#include <stddef.h>
 #include <string.h>
 #include <errno.h>
+#ifdef _GLIBCXX_HAVE_UNISTD_H
+# include <unistd.h>
+#endif
 #ifdef _GLIBCXX_HAVE_DIRENT_H
 # ifdef _GLIBCXX_HAVE_SYS_TYPES_H
 #  include <sys/types.h>
@@ -51,7 +55,7 @@  struct fs::_Dir
 
   _Dir(_Dir&& d)
   : dirp(std::exchange(d.dirp, nullptr)), path(std::move(d.path)),
-    entry(std::move(d.entry)), type(d.type)
+    entry(std::move(d.entry)), type(d.type), entp(d.entp)
   { }
 
   _Dir& operator=(_Dir&&) = delete;
@@ -64,20 +68,64 @@  struct fs::_Dir
   fs::path		path;
   directory_entry	entry;
   file_type		type = file_type::none;
+  ::dirent*		entp = nullptr;
 };
 
 namespace
 {
+#ifdef _GLIBCXX_FILESYSTEM_IS_WINDOWS
+# undef NAME_MAX
+# define NAME_MAX 260
+#endif
+
+  // Size needed for struct dirent with {NAME_MAX} + 1 chars
+  static constexpr size_t
+  dirent_size(size_t name_max)
+  { return offsetof(::dirent, d_name) + name_max + 1; }
+
+  // Manage a buffer large enough for struct dirent with {NAME_MAX} + 1 chars
+  struct dirent_buffer
+  {
+#ifdef NAME_MAX
+    dirent_buffer(const fs::path&) { }
+
+    ::dirent* get() { return reinterpret_cast<::dirent*>(&ent); }
+
+    std::aligned_union<dirent_size(NAME_MAX), ::dirent>::type ent;
+#else
+
+    dirent_buffer(const fs::path& path __attribute__((__unused__)))
+    {
+      long name_max = 255;  // An informed guess.
+#ifdef _GLIBCXX_HAVE_UNISTD_H
+      long pc_name_max = pathconf(path.c_str(), _PC_NAME_MAX);
+      if (pc_name_max != -1)
+	name_max = pc_name_max;
+#endif
+      ptr.reset(static_cast<::dirent*>(::operator new(dirent_size(name_max))));
+    }
+
+    ::dirent* get() const { return ptr.get(); }
+
+    struct op_delete {
+	void operator()(void* p) const { ::operator delete(p); }
+    };
+    std::unique_ptr<::dirent*, op_delete> ptr;
+#endif
+  };
+
   template<typename Bitmask>
-    inline bool is_set(Bitmask obj, Bitmask bits)
+    inline bool
+    is_set(Bitmask obj, Bitmask bits)
     {
       return (obj & bits) != Bitmask::none;
     }
 
   // Returns {dirp, p} on success, {nullptr, p} on error.
   // If an ignored EACCES error occurs returns {}.
-  fs::_Dir
-  open_dir(const fs::path& p, fs::directory_options options, std::error_code* ec)
+  inline fs::_Dir
+  open_dir(const fs::path& p, fs::directory_options options,
+	   std::error_code* ec)
   {
     if (ec)
       ec->clear();
@@ -100,7 +148,7 @@  namespace
   }
 
   inline fs::file_type
-  get_file_type(const dirent& d __attribute__((__unused__)))
+  get_file_type(const ::dirent& d __attribute__((__unused__)))
   {
 #ifdef _GLIBCXX_HAVE_STRUCT_DIRENT_D_TYPE
     switch (d.d_type)
@@ -129,12 +177,24 @@  namespace
 #endif
   }
 
-  int
+  inline int
   native_readdir(DIR* dirp, ::dirent*& entryp)
   {
 #ifdef _GLIBCXX_FILESYSTEM_IS_WINDOWS
-    if ((entryp = ::readdir(dirp)))
-      return 0;
+    const int saved_errno = errno;
+    errno = 0;
+    if (auto entp = ::readdir(dirp))
+      {
+	errno = saved_errno;
+	memcpy(entryp, entp, dirent_size(strlen(entp->d_name)));
+	return 0;
+      }
+    else if (errno == 0) // End of directory reached.
+      {
+	errno = saved_errno;
+	entryp = nullptr;
+	return 0;
+      }
     return errno;
 #else
     return ::readdir_r(dirp, entryp, &entryp);
@@ -142,6 +202,7 @@  namespace
   }
 }
 
+
 // Returns false when the end of the directory entries is reached.
 // Reports errors by setting ec or throwing.
 bool
@@ -150,8 +211,7 @@  fs::_Dir::advance(error_code* ec, directory_options options)
   if (ec)
     ec->clear();
 
-  ::dirent ent;
-  ::dirent* result = &ent;
+  auto result = entp;
   if (int err = native_readdir(dirp, result))
     {
       if (err == EACCES
@@ -168,10 +228,10 @@  fs::_Dir::advance(error_code* ec, directory_options options)
   else if (result != nullptr)
     {
       // skip past dot and dot-dot
-      if (!strcmp(ent.d_name, ".") || !strcmp(ent.d_name, ".."))
+      if (!strcmp(entp->d_name, ".") || !strcmp(entp->d_name, ".."))
 	return advance(ec, options);
-      entry = fs::directory_entry{path / ent.d_name};
-      type = get_file_type(ent);
+      entry = fs::directory_entry{path / entp->d_name};
+      type = get_file_type(*entp);
       return true;
     }
   else
@@ -190,9 +250,17 @@  directory_iterator(const path& p, directory_options options, error_code* ec)
 
   if (dir.dirp)
     {
-      auto sp = std::make_shared<fs::_Dir>(std::move(dir));
-      if (sp->advance(ec, options))
-	_M_dir.swap(sp);
+      // A _Dir with an associated dirent_buffer.
+      struct Dir_with_buffer : _Dir
+      {
+	Dir_with_buffer(_Dir&& d) : _Dir(std::move(d)), buf(this->path)
+	{ this->entp = buf.get(); }
+
+	dirent_buffer buf;
+      };
+      _M_dir = std::make_shared<Dir_with_buffer>(std::move(dir));
+      if (!_M_dir->advance(ec, options))
+	_M_dir.reset();
     }
   else if (!dir.path.empty())
     {
@@ -241,7 +309,17 @@  using Dir_iter_pair = std::pair<fs::_Dir, fs::directory_iterator>;
 
 struct fs::recursive_directory_iterator::_Dir_stack : std::stack<_Dir>
 {
+  _Dir_stack(const path& p) : buf(p) { }
+
+  void push(_Dir&& d)
+  {
+    d.entp = buf.get();
+    stack::push(std::move(d));
+  }
+
   void clear() { c.clear(); }
+
+  dirent_buffer buf;
 };
 
 fs::recursive_directory_iterator::
@@ -251,7 +329,7 @@  recursive_directory_iterator(const path& p, directory_options options,
 {
   if (DIR* dirp = ::opendir(p.c_str()))
     {
-      _M_dirs = std::make_shared<_Dir_stack>();
+      _M_dirs = std::make_shared<_Dir_stack>(p);
       _M_dirs->push(_Dir{ dirp, p });
       if (!_M_dirs->top().advance(ec))
 	_M_dirs.reset();