Message ID | CAFTKwZ-PPOX=Vw0u7Cabe86TcWBcSZAwnyMr_vhhB4zVzdiX4w@mail.gmail.com |
---|---|
State | New |
Headers | show |
On 09/10/16 16:23 +0100, Elliot Goodrich wrote: >Hi, > >If we unroll the loop so that we iterate both forwards and backwards, >we can take advantage of memory-level parallelism when chasing >pointers. This means that reverse takes 35% less time when nodes are >randomly scattered in memory and about the same time if nodes are >contiguous. > >Further, as our node pointers will never alias, we can interleave the >swaps of the next and previous pointers to remove further data >dependencies. This takes another 5% off the time when nodes are >scattered in memory and takes 20% off when nodes are contiguous. > >All in all we save 20%-40% depending on the memory layout. Nice, thanks for the patch. Do you have (or are you willing to sign) a copyright assignment for GCC? See https://gcc.gnu.org/contribute.html#legal for details. >For future improvement, by passing whether there is an odd or even >number of nodes in the list we can hoist one of the ifs out of the >loop and gain another 5-10% but most likely this is only possible when >_GLIBCXX_USE_CXX11_ABI is defined and size() is O(1). This would bring >the saving to 30%-45%. Is it worth writing a new overload of >_M_reverse which takes the size of the list? That certainly seems worthwhile. Do we need an overload or can it just be done with #if? It seems to me we'd either want to use the size, or not use it, we wouldn't want both versions defined at once. That suggests #if to me.
I haven't yet but I will try and sort it out tomorrow. If we're replacing the current method with one that takes a size parameter when _GLIBCXX_USE_CXX11_ABI is defined, is this going to cause any issues with ABI compatibility? If not, then I agree that we should go with the #if version. On 10 October 2016 at 17:12, Jonathan Wakely <jwakely@redhat.com> wrote: > On 09/10/16 16:23 +0100, Elliot Goodrich wrote: >> >> Hi, >> >> If we unroll the loop so that we iterate both forwards and backwards, >> we can take advantage of memory-level parallelism when chasing >> pointers. This means that reverse takes 35% less time when nodes are >> randomly scattered in memory and about the same time if nodes are >> contiguous. >> >> Further, as our node pointers will never alias, we can interleave the >> swaps of the next and previous pointers to remove further data >> dependencies. This takes another 5% off the time when nodes are >> scattered in memory and takes 20% off when nodes are contiguous. >> >> All in all we save 20%-40% depending on the memory layout. > > > Nice, thanks for the patch. > > Do you have (or are you willing to sign) a copyright assignment for > GCC? > > See https://gcc.gnu.org/contribute.html#legal for details. > >> For future improvement, by passing whether there is an odd or even >> number of nodes in the list we can hoist one of the ifs out of the >> loop and gain another 5-10% but most likely this is only possible when >> _GLIBCXX_USE_CXX11_ABI is defined and size() is O(1). This would bring >> the saving to 30%-45%. Is it worth writing a new overload of >> _M_reverse which takes the size of the list? > > > That certainly seems worthwhile. Do we need an overload or can it just > be done with #if? It seems to me we'd either want to use the size, or > not use it, we wouldn't want both versions defined at once. That > suggests #if to me. >
diff --git a/libstdc++-v3/src/c++98/list.cc b/libstdc++-v3/src/c++98/list.cc index 62b812a..aeb2402 100644 --- a/libstdc++-v3/src/c++98/list.cc +++ b/libstdc++-v3/src/c++98/list.cc @@ -112,15 +112,36 @@ namespace std _GLIBCXX_VISIBILITY(default) void _List_node_base::_M_reverse() _GLIBCXX_USE_NOEXCEPT { - _List_node_base* __tmp = this; - do - { - std::swap(__tmp->_M_next, __tmp->_M_prev); + _List_node_base* __forward = this; + _List_node_base* __backward = __forward->_M_prev; + if (__forward == __backward) { + return; + } - // Old next node is now prev. - __tmp = __tmp->_M_prev; - } - while (__tmp != this); + // unroll the loop to take advantage of memory-level parallelism + while (true) { + _List_node_base* __forward_next = __forward->_M_next; + _List_node_base* __backward_prev = __backward->_M_prev; + + __forward->_M_next = __forward->_M_prev; + __backward->_M_prev = __backward->_M_next; + + __backward->_M_next = __backward_prev; + __forward->_M_prev = __forward_next; + + if (__forward_next == __backward_prev) { + // if there is an odd number of nodes, swap the last one + std::swap(__forward_next->_M_next, __forward_next->_M_prev); + return; + } + + if (__forward_next == __backward) { + return; + } + + __forward = __forward_next; + __backward = __backward_prev; + } } void diff --git a/libstdc++-v3/testsuite/23_containers/list/operations/reverse/1.cc b/libstdc++-v3/testsuite/23_containers/list/operations/reverse/1.cc new file mode 100644 index 0000000..7eab83b --- /dev/null +++ b/libstdc++-v3/testsuite/23_containers/list/operations/reverse/1.cc @@ -0,0 +1,88 @@ +// Copyright (C) 2016-2016 Free Software Foundation, Inc. +// +// This file is part of the GNU ISO C++ Library. This library is free +// software; you can redistribute it and/or modify it under the +// terms of the GNU General Public License as published by the +// Free Software Foundation; either version 3, or (at your option) +// any later version. + +// This library is distributed in the hope that it will be useful, +// but WITHOUT ANY WARRANTY; without even the implied warranty of +// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +// GNU General Public License for more details. + +// You should have received a copy of the GNU General Public License along +// with this library; see the file COPYING3. If not see +// <http://www.gnu.org/licenses/>. + +// 23.2.2.4 list operations [lib.list.ops] + +#include <testsuite_hooks.h> +#include <list> + +// reverse +void +test01() +{ + bool test __attribute__((unused)) = true; + + typedef std::list<int> list_type; + typedef list_type::iterator iterator_type; + + const int A[] = {0, 1, 2, 3}; + const int N = sizeof(A) / sizeof(int); + + list_type list01(A, A + N); + + // iterators should still be valid after reverse + iterator_type i = list01.begin(); + iterator_type i0 = i++; + iterator_type i1 = i++; + iterator_type i2 = i++; + iterator_type i3 = i++; + + // reverse a list of size 4 + list01.reverse(); + i = list01.begin(); + VERIFY(*i++ == 3); + VERIFY(*i++ == 2); + VERIFY(*i++ == 1); + VERIFY(*i++ == 0); + + // iterators should be unchanged after reverse + VERIFY(*i0 == 0); + VERIFY(*i1 == 1); + VERIFY(*i2 == 2); + VERIFY(*i3 == 3); + + // reverse a list of size 3 + list01.pop_back(); + list01.reverse(); + i = list01.begin(); + VERIFY(*i++ == 1); + VERIFY(*i++ == 2); + VERIFY(*i++ == 3); + + // reverse a list of size 2 + list01.pop_back(); + list01.reverse(); + i = list01.begin(); + VERIFY(*i++ == 2); + VERIFY(*i++ == 1); + + // reverse a list of size 1 + list01.pop_back(); + list01.reverse(); + i = list01.begin(); + VERIFY(*i++ == 2); + + // reverse an empty list + list01.pop_back(); + list01.reverse(); +} + +int main() +{ + test01(); + return 0; +}