diff mbox

[libstdc++/61347] std::distance(list.first(),list.end()) in O(1)

Message ID alpine.DEB.2.11.1504131033550.13447@stedding.saclay.inria.fr
State New
Headers show

Commit Message

Marc Glisse April 13, 2015, 11:42 a.m. UTC
Hello,

this patch makes std::distance(list.first(),list.end()) a constant time 
operation when optimizing, with no penalty for other calls. We could do 
the test always (no __builtin_constant_p) but then it will add a small 
runtime penalty for other calls, someone else can take responsibility for 
that. I could protect the whole overload with #ifdef __OPTIMIZE__ (at -O0 
the compiler does not remove the test ++end==first as dead code), but I 
assume it is better to minimize differences. There are other ways to 
specialize distance, but overloading __distance seems to have the least 
drawbacks (most others involve a new extra file). From an optimization 
POV, it would be a bit better to avoid the while loop and instead call a 
separate function that does it (say the regular __distance), it would make 
the function smaller and thus easier to inline, but it is simpler this way 
and works.

We could do something similar for std::set, but C++ will not let us find 
the address of _Rb_tree_impl::_M_node_count from that of 
_Rb_tree_impl::_M_header, except when _Key_compare is pod, which luckily 
is an easily available information. Avoiding this complication is why I 
insisted on wrapping the size of std::list in a _List_node<size_t> for 
gcc-5, which may look a bit strange at first sight.

2015-04-13  Marc Glisse  <marc.glisse@inria.fr>

 	PR libstdc++/61347
 	* include/bits/stl_iterator_base_funcs.h (distance): Do not
 	qualify the call to __distance.
 	* include/bits/stl_list.h (__distance): New overloads for
 	_List_iterator and _List_const_iterator.
 	* testsuite/23_containers/list/61347.cc: New testcase.

Comments

Jonathan Wakely April 13, 2015, 1:40 p.m. UTC | #1
On 13/04/15 13:42 +0200, Marc Glisse wrote:
>this patch makes std::distance(list.first(),list.end()) a constant 
>time operation when optimizing, with no penalty for other calls. We 
>could do the test always (no __builtin_constant_p) but then it will 
>add a small runtime penalty for other calls, someone else can take 
>responsibility for that.

I like the way you've done it. No penalty for other calls is great
and IMHO it's OK that the optimisation only happens when optimising.

(Does it work even at -Og?)

>I could protect the whole overload with 
>#ifdef __OPTIMIZE__ (at -O0 the compiler does not remove the test 
>++end==first as dead code), but I assume it is better to minimize 
>differences.

Agreed.

>There are other ways to specialize distance, but 
>overloading __distance seems to have the least drawbacks (most others 
>involve a new extra file). From an optimization POV, it would be a bit 
>better to avoid the while loop and instead call a separate function 
>that does it (say the regular __distance), it would make the function 
>smaller and thus easier to inline, but it is simpler this way and 
>works.

Is there any (dis)advantage to making one call the other, to avoid
duplicating the function bodies?

  template<typename _Tp>
    inline ptrdiff_t
    __distance(_GLIBCXX_STD_C::_List_iterator<_Tp> __first,
	       _GLIBCXX_STD_C::_List_iterator<_Tp> __last,
               input_iterator_tag __tag)
    {
      typedef _GLIBCXX_STD_C::_List_const_iterator<_Tp> _CIter;
      return std::__distance(_CIter(__first), _CIter(__last), __tag);
    }

>We could do something similar for std::set, but C++ will not let us 
>find the address of _Rb_tree_impl::_M_node_count from that of 
>_Rb_tree_impl::_M_header, except when _Key_compare is pod, which 
>luckily is an easily available information. Avoiding this complication 
>is why I insisted on wrapping the size of std::list in a 
>_List_node<size_t> for gcc-5, which may look a bit strange at first 
>sight.

Sadly, that node is going to look even stranger when I finish adding
support for C++11 allocators, as the type of node becomes dependent on
the allocator's pointer, which makes _List_node<size_t> much more
complicated :-(

I'll have to remember to add additional __distance overloads to handle
the new _List_ptr_iterator and _List_ptr_const_iterator types that
will be used for fancy pointers (although if I forget the optimisation
will still work for std::list<T, std::allocator<T>>, just not for the
vanishingly small number of people using allocators with fancy
pointers).

>Index: include/bits/stl_iterator_base_funcs.h
>===================================================================
>--- include/bits/stl_iterator_base_funcs.h	(revision 222041)
>+++ include/bits/stl_iterator_base_funcs.h	(working copy)
>@@ -107,22 +107,21 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>    *  n may be negative.
>    *
>    *  For random access iterators, this uses their @c + and @c - operations
>    *  and are constant time.  For other %iterator classes they are linear time.
>   */
>   template<typename _InputIterator>
>     inline typename iterator_traits<_InputIterator>::difference_type
>     distance(_InputIterator __first, _InputIterator __last)
>     {
>       // concept requirements -- taken care of in __distance
>-      return std::__distance(__first, __last,
>-			     std::__iterator_category(__first));
>+      return __distance(__first, __last, std::__iterator_category(__first));

This should still be a qualified call to std::__distance, otherwise the
compiler might end up instantiating things to figure out if there are
any associated namespaces, e.g. for vector<unique_ptr<T>>::iterator we
don't want to know T's base classes and rheir associated namespaces.

>+  // Detect when distance is used to compute the size of the whole list.
>+  template<typename _Tp>
>+    inline ptrdiff_t
>+    __distance(_GLIBCXX_STD_C::_List_iterator<_Tp> __first,
>+	       _GLIBCXX_STD_C::_List_iterator<_Tp> __last,
>+               input_iterator_tag)
>+    {
>+      typedef _GLIBCXX_STD_C::_List_node<size_t> _Sentinel;
>+      _GLIBCXX_STD_C::_List_iterator<_Tp> __beyond = __last;
>+      ++__beyond;
>+      bool __whole = __first == __beyond;
>+      if (__builtin_constant_p (__whole) && __whole)

This is clever :-)

This shouldn't interfere with any changes we might need to test before
backporting to the gcc-5-branch, so with the std:: qualification
restored on the call to __distance it's OK for trunk.

(I'll trust your judgment/masurements for whether it's worth making
the _List_iterator overload call the _List_const_iterator one.)
Marc Glisse April 13, 2015, 2:14 p.m. UTC | #2
On Mon, 13 Apr 2015, Jonathan Wakely wrote:

> On 13/04/15 13:42 +0200, Marc Glisse wrote:
>> this patch makes std::distance(list.first(),list.end()) a constant time 
>> operation when optimizing, with no penalty for other calls. We could do the 
>> test always (no __builtin_constant_p) but then it will add a small runtime 
>> penalty for other calls, someone else can take responsibility for that.
>
> I like the way you've done it. No penalty for other calls is great
> and IMHO it's OK that the optimisation only happens when optimising.
>
> (Does it work even at -Og?)

No, the testcase currently passes with -O1 but not -Og.

>> I could protect the whole overload with #ifdef __OPTIMIZE__ (at -O0 the 
>> compiler does not remove the test ++end==first as dead code), but I assume 
>> it is better to minimize differences.
>
> Agreed.
>
>> There are other ways to specialize distance, but overloading __distance 
>> seems to have the least drawbacks (most others involve a new extra file). 
>> From an optimization POV, it would be a bit better to avoid the while loop 
>> and instead call a separate function that does it (say the regular 
>> __distance), it would make the function smaller and thus easier to inline, 
>> but it is simpler this way and works.
>
> Is there any (dis)advantage to making one call the other, to avoid
> duplicating the function bodies?
>
> template<typename _Tp>
>   inline ptrdiff_t
>   __distance(_GLIBCXX_STD_C::_List_iterator<_Tp> __first,
> 	       _GLIBCXX_STD_C::_List_iterator<_Tp> __last,
>              input_iterator_tag __tag)
>   {
>     typedef _GLIBCXX_STD_C::_List_const_iterator<_Tp> _CIter;
>     return std::__distance(_CIter(__first), _CIter(__last), __tag);
>   }

I don't see any disadvantage, I'll do that.

>> We could do something similar for std::set, but C++ will not let us find 
>> the address of _Rb_tree_impl::_M_node_count from that of 
>> _Rb_tree_impl::_M_header, except when _Key_compare is pod, which luckily is 
>> an easily available information. Avoiding this complication is why I 
>> insisted on wrapping the size of std::list in a _List_node<size_t> for 
>> gcc-5, which may look a bit strange at first sight.
>
> Sadly, that node is going to look even stranger when I finish adding
> support for C++11 allocators, as the type of node becomes dependent on
> the allocator's pointer, which makes _List_node<size_t> much more
> complicated :-(

But then I assume _List_node_base is the part really getting more 
complicated, so without looking too closely it seems almost orthogonal.

> I'll have to remember to add additional __distance overloads to handle
> the new _List_ptr_iterator and _List_ptr_const_iterator types that
> will be used for fancy pointers (although if I forget the optimisation
> will still work for std::list<T, std::allocator<T>>, just not for the
> vanishingly small number of people using allocators with fancy
> pointers).
>
>> Index: include/bits/stl_iterator_base_funcs.h
>> ===================================================================
>> --- include/bits/stl_iterator_base_funcs.h	(revision 222041)
>> +++ include/bits/stl_iterator_base_funcs.h	(working copy)
>> @@ -107,22 +107,21 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>>    *  n may be negative.
>>    *
>>    *  For random access iterators, this uses their @c + and @c - operations
>>    *  and are constant time.  For other %iterator classes they are linear 
>> time.
>>   */
>>   template<typename _InputIterator>
>>     inline typename iterator_traits<_InputIterator>::difference_type
>>     distance(_InputIterator __first, _InputIterator __last)
>>     {
>>       // concept requirements -- taken care of in __distance
>> -      return std::__distance(__first, __last,
>> -			     std::__iterator_category(__first));
>> +      return __distance(__first, __last, 
>> std::__iterator_category(__first));
>
> This should still be a qualified call to std::__distance, otherwise the
> compiler might end up instantiating things to figure out if there are
> any associated namespaces, e.g. for vector<unique_ptr<T>>::iterator we
> don't want to know T's base classes and rheir associated namespaces.

Then the approach will not work. C++ overload resolution will not consider 
the later __distance declarations in stl_list.h if the call is qualified 
(I didn't change it gratuitously). A forward declaration of list iterators 
and those __distance overloads in bits/stl_iterator_base_funcs.h is not 
very appealing but may work (or it may cause issues, I don't know). 
Otherwise, I guess we are back to creating a new file 
bits/list_iterator.h, which <list> includes if <iterator> has already been 
included and vice versa, and which provides overloads for distance 
directly.

>> +  // Detect when distance is used to compute the size of the whole list.
>> +  template<typename _Tp>
>> +    inline ptrdiff_t
>> +    __distance(_GLIBCXX_STD_C::_List_iterator<_Tp> __first,
>> +	       _GLIBCXX_STD_C::_List_iterator<_Tp> __last,
>> +               input_iterator_tag)
>> +    {
>> +      typedef _GLIBCXX_STD_C::_List_node<size_t> _Sentinel;
>> +      _GLIBCXX_STD_C::_List_iterator<_Tp> __beyond = __last;
>> +      ++__beyond;
>> +      bool __whole = __first == __beyond;
>> +      if (__builtin_constant_p (__whole) && __whole)
>
> This is clever :-)
>
> This shouldn't interfere with any changes we might need to test before
> backporting to the gcc-5-branch, so with the std:: qualification
> restored on the call to __distance it's OK for trunk.
>
> (I'll trust your judgment/masurements for whether it's worth making
> the _List_iterator overload call the _List_const_iterator one.)
Jonathan Wakely April 13, 2015, 3:10 p.m. UTC | #3
On 13/04/15 16:14 +0200, Marc Glisse wrote:
>On Mon, 13 Apr 2015, Jonathan Wakely wrote:
>
>>On 13/04/15 13:42 +0200, Marc Glisse wrote:
>>>this patch makes std::distance(list.first(),list.end()) a constant 
>>>time operation when optimizing, with no penalty for other calls. 
>>>We could do the test always (no __builtin_constant_p) but then it 
>>>will add a small runtime penalty for other calls, someone else can 
>>>take responsibility for that.
>>
>>I like the way you've done it. No penalty for other calls is great
>>and IMHO it's OK that the optimisation only happens when optimising.
>>
>>(Does it work even at -Og?)
>
>No, the testcase currently passes with -O1 but not -Og.

OK, we can live with that.

>>Sadly, that node is going to look even stranger when I finish adding
>>support for C++11 allocators, as the type of node becomes dependent on
>>the allocator's pointer, which makes _List_node<size_t> much more
>>complicated :-(
>
>But then I assume _List_node_base is the part really getting more 
>complicated, so without looking too closely it seems almost 
>orthogonal.

In order to avoid making any changes to std::list<T, std::allocator<T>>
I'm adding an entire parallel hierarchy of a new node base type
(parameterized on the allocator's void_pointer type), a new node type
and new iterators (parameterized on the pointer type), which will only
be used when !is_pointer<Alloc::pointer>. So it will just mean adding
two new overloads for the two new iterator types.

Not a big deal, as long as I remember to do it :-)

>>This should still be a qualified call to std::__distance, otherwise the
>>compiler might end up instantiating things to figure out if there are
>>any associated namespaces, e.g. for vector<unique_ptr<T>>::iterator we
>>don't want to know T's base classes and rheir associated namespaces.
>
>Then the approach will not work. C++ overload resolution will not 
>consider the later __distance declarations in stl_list.h if the call 
>is qualified (I didn't change it gratuitously).

Ahhh, yes, of course.

>A forward declaration 
>of list iterators and those __distance overloads in 
>bits/stl_iterator_base_funcs.h is not very appealing but may work (or 
>it may cause issues, I don't know). Otherwise, I guess we are back to 
>creating a new file bits/list_iterator.h, which <list> includes if 
><iterator> has already been included and vice versa, and which 
>provides overloads for distance directly.

I don't have a preference, but I think the forward declarations should
work without problems. <list> includes bits/stl_iterator_base_funcs.h
so if the forward declarations didn't match the definitions for some
reason we'd know right away.
diff mbox

Patch

Index: include/bits/stl_iterator_base_funcs.h
===================================================================
--- include/bits/stl_iterator_base_funcs.h	(revision 222041)
+++ include/bits/stl_iterator_base_funcs.h	(working copy)
@@ -107,22 +107,21 @@  _GLIBCXX_BEGIN_NAMESPACE_VERSION
    *  n may be negative.
    *
    *  For random access iterators, this uses their @c + and @c - operations
    *  and are constant time.  For other %iterator classes they are linear time.
   */
   template<typename _InputIterator>
     inline typename iterator_traits<_InputIterator>::difference_type
     distance(_InputIterator __first, _InputIterator __last)
     {
       // concept requirements -- taken care of in __distance
-      return std::__distance(__first, __last,
-			     std::__iterator_category(__first));
+      return __distance(__first, __last, std::__iterator_category(__first));
     }
 
   template<typename _InputIterator, typename _Distance>
     inline void
     __advance(_InputIterator& __i, _Distance __n, input_iterator_tag)
     {
       // concept requirements
       __glibcxx_function_requires(_InputIteratorConcept<_InputIterator>)
       _GLIBCXX_DEBUG_ASSERT(__n >= 0);
       while (__n--)
Index: include/bits/stl_list.h
===================================================================
--- include/bits/stl_list.h	(revision 222041)
+++ include/bits/stl_list.h	(working copy)
@@ -1861,13 +1861,64 @@  _GLIBCXX_END_NAMESPACE_CXX11
     operator>=(const list<_Tp, _Alloc>& __x, const list<_Tp, _Alloc>& __y)
     { return !(__x < __y); }
 
   /// See std::list::swap().
   template<typename _Tp, typename _Alloc>
     inline void
     swap(list<_Tp, _Alloc>& __x, list<_Tp, _Alloc>& __y)
     { __x.swap(__y); }
 
 _GLIBCXX_END_NAMESPACE_CONTAINER
+
+#if _GLIBCXX_USE_CXX11_ABI
+_GLIBCXX_BEGIN_NAMESPACE_VERSION
+
+  // Detect when distance is used to compute the size of the whole list.
+  template<typename _Tp>
+    inline ptrdiff_t
+    __distance(_GLIBCXX_STD_C::_List_iterator<_Tp> __first,
+	       _GLIBCXX_STD_C::_List_iterator<_Tp> __last,
+               input_iterator_tag)
+    {
+      typedef _GLIBCXX_STD_C::_List_node<size_t> _Sentinel;
+      _GLIBCXX_STD_C::_List_iterator<_Tp> __beyond = __last;
+      ++__beyond;
+      bool __whole = __first == __beyond;
+      if (__builtin_constant_p (__whole) && __whole)
+	return static_cast<const _Sentinel*>(__last._M_node)->_M_data;
+
+      ptrdiff_t __n = 0;
+      while (__first != __last)
+	{
+	  ++__first;
+	  ++__n;
+	}
+      return __n;
+    }
+
+  template<typename _Tp>
+    inline ptrdiff_t
+    __distance(_GLIBCXX_STD_C::_List_const_iterator<_Tp> __first,
+	       _GLIBCXX_STD_C::_List_const_iterator<_Tp> __last,
+               input_iterator_tag)
+    {
+      typedef _GLIBCXX_STD_C::_List_node<size_t> _Sentinel;
+      _GLIBCXX_STD_C::_List_const_iterator<_Tp> __beyond = __last;
+      ++__beyond;
+      bool __whole = __first == __beyond;
+      if (__builtin_constant_p (__whole) && __whole)
+	return static_cast<const _Sentinel*>(__last._M_node)->_M_data;
+
+      ptrdiff_t __n = 0;
+      while (__first != __last)
+	{
+	  ++__first;
+	  ++__n;
+	}
+      return __n;
+    }
+
+_GLIBCXX_END_NAMESPACE_VERSION
+#endif
 } // namespace std
 
 #endif /* _STL_LIST_H */
Index: testsuite/23_containers/list/61347.cc
===================================================================
--- testsuite/23_containers/list/61347.cc	(revision 0)
+++ testsuite/23_containers/list/61347.cc	(working copy)
@@ -0,0 +1,49 @@ 
+// { dg-options "-std=gnu++11 -O2 -D_GLIBCXX_USE_CXX11_ABI" }
+
+// Copyright (C) 2015 Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library.  This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+
+// This library is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+// GNU General Public License for more details.
+
+// You should have received a copy of the GNU General Public License along
+// with this library; see the file COPYING3.  If not see
+// <http://www.gnu.org/licenses/>.
+
+#include <list>
+#include <iterator>
+#include <testsuite_hooks.h>
+
+__attribute__((__noinline__, __noclone__))
+void testm(std::list<short>& l)
+{
+  bool b = std::distance(l.begin(), l.end()) == l.size();
+  VERIFY( __builtin_constant_p(b) );
+  VERIFY( b );
+}
+
+__attribute__((__noinline__, __noclone__))
+void testc(const std::list<short>& l)
+{
+  bool b = std::distance(l.begin(), l.end()) == l.size();
+  VERIFY( __builtin_constant_p(b) );
+  VERIFY( b );
+}
+
+int main()
+{
+  bool test __attribute__((unused)) = true;
+
+#if _GLIBCXX_USE_DUAL_ABI
+  std::list<short> l;
+  testm(l);
+  testc(l);
+#endif
+}