diff mbox series

[committed] libstdc++: Fix code size regressions in std::vector [PR110060]

Message ID 20230601150958.268109-1-jwakely@redhat.com
State New
Headers show
Series [committed] libstdc++: Fix code size regressions in std::vector [PR110060] | expand

Commit Message

Jonathan Wakely June 1, 2023, 3:09 p.m. UTC
Tested powerpc64le-linux. Pusshed to trunk.

-- >8 --

My r14-1452-gfb409a15d9babc change to add optimization hints to
std::vector causes regressions because it makes std::vector::size() and
std::vector::capacity() too big to inline. That's the opposite of what
I wanted, so revert the changes to those functions.

To achieve the original aim of optimizing vec.assign(vec.size(), x) we
can add a local optimization hint to _M_fill_assign, so that it doesn't
affect all other uses of size() and capacity().

Additionally, add the same hint to the _M_assign_aux overload for
forward iterators and add that to the testcase.

It would be nice to similarly optimize:
  if (vec1.size() == vec2.size()) vec1 = vec2;
but adding hints to operator=(const vector&) doesn't help. Presumably
the relationships between the two sizes and two capacities are too
complex to track effectively.

libstdc++-v3/ChangeLog:

	PR libstdc++/110060
	* include/bits/stl_vector.h (_Vector_base::_M_invariant):
	Remove.
	(vector::size, vector::capacity): Remove calls to _M_invariant.
	* include/bits/vector.tcc (vector::_M_fill_assign): Add
	optimization hint to reallocating path.
	(vector::_M_assign_aux(FwdIter, FwdIter, forward_iterator_tag)):
	Likewise.
	* testsuite/23_containers/vector/capacity/invariant.cc: Moved
	to...
	* testsuite/23_containers/vector/modifiers/assign/no_realloc.cc:
	...here. Check assign(FwdIter, FwdIter) too.
	* testsuite/23_containers/vector/types/1.cc: Revert addition
	of -Wno-stringop-overread option.
---
 libstdc++-v3/include/bits/stl_vector.h        | 23 +------------------
 libstdc++-v3/include/bits/vector.tcc          | 17 ++++++++++----
 .../assign/no_realloc.cc}                     |  6 +++++
 .../testsuite/23_containers/vector/types/1.cc |  2 +-
 4 files changed, 20 insertions(+), 28 deletions(-)
 rename libstdc++-v3/testsuite/23_containers/vector/{capacity/invariant.cc => modifiers/assign/no_realloc.cc} (70%)

Comments

Maxim Kuvyrkov June 8, 2023, 8:57 a.m. UTC | #1
Hi Jonathan,

Interestingly, this increases code-size of -O3 code on aarch64-linux-gnu on SPEC CPU2017's 641.leela_s benchmark [1].

In particular, FastBoard::get_nearby_enemies() grew from 1444 to 2212 bytes.  This seems like a corner-case; the rest of SPEC CPU2017 is, mostly, neutral to this patch.  Is this something you may be interested in investigating?  I'll be happy to assist.

Looking at assembly, one of the differences I see is that the "after" version has calls to realloc_insert(), while "before" version seems to have them inlined [2]. 

[1] https://git.linaro.org/toolchain/ci/interesting-commits.git/tree/gcc/sha1/b7b255e77a271974479c34d1db3daafc04b920bc/tcwg_bmk-code_size-cpu2017fast/status.txt

[2] 641.leela_s is non-GPL/non-BSD benchmark, and I'm not sure if I can post its compiled and/or preprocessed code publicly.  I assume RedHat has SPEC CPU2017 license, and I can post details to you privately.

Kind regards,

--
Maxim Kuvyrkov
https://www.linaro.org




> On Jun 1, 2023, at 19:09, Jonathan Wakely via Gcc-patches <gcc-patches@gcc.gnu.org> wrote:
> 
> Tested powerpc64le-linux. Pusshed to trunk.
> 
> -- >8 --
> 
> My r14-1452-gfb409a15d9babc change to add optimization hints to
> std::vector causes regressions because it makes std::vector::size() and
> std::vector::capacity() too big to inline. That's the opposite of what
> I wanted, so revert the changes to those functions.
> 
> To achieve the original aim of optimizing vec.assign(vec.size(), x) we
> can add a local optimization hint to _M_fill_assign, so that it doesn't
> affect all other uses of size() and capacity().
> 
> Additionally, add the same hint to the _M_assign_aux overload for
> forward iterators and add that to the testcase.
> 
> It would be nice to similarly optimize:
>  if (vec1.size() == vec2.size()) vec1 = vec2;
> but adding hints to operator=(const vector&) doesn't help. Presumably
> the relationships between the two sizes and two capacities are too
> complex to track effectively.
> 
> libstdc++-v3/ChangeLog:
> 
> PR libstdc++/110060
> * include/bits/stl_vector.h (_Vector_base::_M_invariant):
> Remove.
> (vector::size, vector::capacity): Remove calls to _M_invariant.
> * include/bits/vector.tcc (vector::_M_fill_assign): Add
> optimization hint to reallocating path.
> (vector::_M_assign_aux(FwdIter, FwdIter, forward_iterator_tag)):
> Likewise.
> * testsuite/23_containers/vector/capacity/invariant.cc: Moved
> to...
> * testsuite/23_containers/vector/modifiers/assign/no_realloc.cc:
> ...here. Check assign(FwdIter, FwdIter) too.
> * testsuite/23_containers/vector/types/1.cc: Revert addition
> of -Wno-stringop-overread option.
> ---
> libstdc++-v3/include/bits/stl_vector.h        | 23 +------------------
> libstdc++-v3/include/bits/vector.tcc          | 17 ++++++++++----
> .../assign/no_realloc.cc}                     |  6 +++++
> .../testsuite/23_containers/vector/types/1.cc |  2 +-
> 4 files changed, 20 insertions(+), 28 deletions(-)
> rename libstdc++-v3/testsuite/23_containers/vector/{capacity/invariant.cc => modifiers/assign/no_realloc.cc} (70%)
> 
> diff --git a/libstdc++-v3/include/bits/stl_vector.h b/libstdc++-v3/include/bits/stl_vector.h
> index e593be443bc..70ced3d101f 100644
> --- a/libstdc++-v3/include/bits/stl_vector.h
> +++ b/libstdc++-v3/include/bits/stl_vector.h
> @@ -389,23 +389,6 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
> 
>     protected:
> 
> -      __attribute__((__always_inline__))
> -      _GLIBCXX20_CONSTEXPR void
> -      _M_invariant() const
> -      {
> -#if __OPTIMIZE__
> - if (this->_M_impl._M_finish < this->_M_impl._M_start)
> -  __builtin_unreachable();
> - if (this->_M_impl._M_finish > this->_M_impl._M_end_of_storage)
> -  __builtin_unreachable();
> -
> - size_t __sz = this->_M_impl._M_finish - this->_M_impl._M_start;
> - size_t __cap = this->_M_impl._M_end_of_storage - this->_M_impl._M_start;
> - if (__sz > __cap)
> -  __builtin_unreachable();
> -#endif
> -      }
> -
>       _GLIBCXX20_CONSTEXPR
>       void
>       _M_create_storage(size_t __n)
> @@ -1005,10 +988,7 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
>       _GLIBCXX_NODISCARD _GLIBCXX20_CONSTEXPR
>       size_type
>       size() const _GLIBCXX_NOEXCEPT
> -      {
> - _Base::_M_invariant();
> - return size_type(this->_M_impl._M_finish - this->_M_impl._M_start);
> -      }
> +      { return size_type(this->_M_impl._M_finish - this->_M_impl._M_start); }
> 
>       /**  Returns the size() of the largest possible %vector.  */
>       _GLIBCXX_NODISCARD _GLIBCXX20_CONSTEXPR
> @@ -1095,7 +1075,6 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
>       size_type
>       capacity() const _GLIBCXX_NOEXCEPT
>       {
> - _Base::_M_invariant();
> return size_type(this->_M_impl._M_end_of_storage
>   - this->_M_impl._M_start);
>       }
> diff --git a/libstdc++-v3/include/bits/vector.tcc b/libstdc++-v3/include/bits/vector.tcc
> index d6fdea2dd01..acd11e2dc68 100644
> --- a/libstdc++-v3/include/bits/vector.tcc
> +++ b/libstdc++-v3/include/bits/vector.tcc
> @@ -270,15 +270,18 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
>     vector<_Tp, _Alloc>::
>     _M_fill_assign(size_t __n, const value_type& __val)
>     {
> +      const size_type __sz = size();
>       if (__n > capacity())
> {
> +  if (__n <= __sz)
> +    __builtin_unreachable();
>  vector __tmp(__n, __val, _M_get_Tp_allocator());
>  __tmp._M_impl._M_swap_data(this->_M_impl);
> }
> -      else if (__n > size())
> +      else if (__n > __sz)
> {
>  std::fill(begin(), end(), __val);
> -  const size_type __add = __n - size();
> +  const size_type __add = __n - __sz;
>  _GLIBCXX_ASAN_ANNOTATE_GROW(__add);
>  this->_M_impl._M_finish =
>    std::__uninitialized_fill_n_a(this->_M_impl._M_finish,
> @@ -316,10 +319,14 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
>       _M_assign_aux(_ForwardIterator __first, _ForwardIterator __last,
>    std::forward_iterator_tag)
>       {
> + const size_type __sz = size();
> const size_type __len = std::distance(__first, __last);
> 
> if (__len > capacity())
>  {
> +    if (__len <= __sz)
> +      __builtin_unreachable();
> +
>    _S_check_init_len(__len, _M_get_Tp_allocator());
>    pointer __tmp(_M_allocate_and_copy(__len, __first, __last));
>    std::_Destroy(this->_M_impl._M_start, this->_M_impl._M_finish,
> @@ -332,14 +339,14 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
>    this->_M_impl._M_finish = this->_M_impl._M_start + __len;
>    this->_M_impl._M_end_of_storage = this->_M_impl._M_finish;
>  }
> - else if (size() >= __len)
> + else if (__sz >= __len)
>  _M_erase_at_end(std::copy(__first, __last, this->_M_impl._M_start));
> else
>  {
>    _ForwardIterator __mid = __first;
> -    std::advance(__mid, size());
> +    std::advance(__mid, __sz);
>    std::copy(__first, __mid, this->_M_impl._M_start);
> -    const size_type __attribute__((__unused__)) __n = __len - size();
> +    const size_type __attribute__((__unused__)) __n = __len - __sz;
>    _GLIBCXX_ASAN_ANNOTATE_GROW(__n);
>    this->_M_impl._M_finish =
>      std::__uninitialized_copy_a(__mid, __last,
> diff --git a/libstdc++-v3/testsuite/23_containers/vector/capacity/invariant.cc b/libstdc++-v3/testsuite/23_containers/vector/modifiers/assign/no_realloc.cc
> similarity index 70%
> rename from libstdc++-v3/testsuite/23_containers/vector/capacity/invariant.cc
> rename to libstdc++-v3/testsuite/23_containers/vector/modifiers/assign/no_realloc.cc
> index d68db694add..659f18dba56 100644
> --- a/libstdc++-v3/testsuite/23_containers/vector/capacity/invariant.cc
> +++ b/libstdc++-v3/testsuite/23_containers/vector/modifiers/assign/no_realloc.cc
> @@ -1,5 +1,6 @@
> // { dg-do compile }
> // { dg-options "-O3 -g0" }
> +// { dg-require-normal-mode "" }
> // { dg-final { scan-assembler-not "_Znw" } }
> // GCC should be able to optimize away the paths involving reallocation.
> 
> @@ -14,3 +15,8 @@ void fill_val(std::vector<int>& vec, int i)
> {
>   vec.assign(vec.size(), i);
> }
> +
> +void fill_range(std::vector<int>& vec, const int* first)
> +{
> +  vec.assign(first, first + vec.size());
> +}
> diff --git a/libstdc++-v3/testsuite/23_containers/vector/types/1.cc b/libstdc++-v3/testsuite/23_containers/vector/types/1.cc
> index 9be07d9fd5c..079e5af9556 100644
> --- a/libstdc++-v3/testsuite/23_containers/vector/types/1.cc
> +++ b/libstdc++-v3/testsuite/23_containers/vector/types/1.cc
> @@ -18,7 +18,7 @@
> // <http://www.gnu.org/licenses/>.
> 
> // { dg-do compile }
> -// { dg-options "-Wno-unused-result -Wno-stringop-overread" }
> +// { dg-options "-Wno-unused-result" }
> 
> #include <vector>
> #include <testsuite_greedy_ops.h>
> -- 
> 2.40.1
>
Jonathan Wakely June 8, 2023, 9:05 a.m. UTC | #2
On Thu, 8 Jun 2023 at 09:58, Maxim Kuvyrkov <maxim.kuvyrkov@linaro.org>
wrote:

> Hi Jonathan,
>
> Interestingly, this increases code-size of -O3 code on aarch64-linux-gnu
> on SPEC CPU2017's 641.leela_s benchmark [1].
>
> In particular, FastBoard::get_nearby_enemies() grew from 1444 to 2212
> bytes.  This seems like a corner-case; the rest of SPEC CPU2017 is, mostly,
> neutral to this patch.  Is this something you may be interested in
> investigating?  I'll be happy to assist.
>

I'd certainly like to avoid the regression, but I'm too dumb to understand
most inlining bugs myself.


>
> Looking at assembly, one of the differences I see is that the "after"
> version has calls to realloc_insert(), while "before" version seems to have
> them inlined [2].
>
> [1]
> https://git.linaro.org/toolchain/ci/interesting-commits.git/tree/gcc/sha1/b7b255e77a271974479c34d1db3daafc04b920bc/tcwg_bmk-code_size-cpu2017fast/status.txt
>
>
I find it annoying that adding `if (n < sz) __builtin_unreachable()` seems
to affect the size estimates for the function, and so perturbs inlining
decisions. That code shouldn't add any actual instructions, so shouldn't
affect size estimates.

I mentioned this in a meeting last week and Jason suggested checking
whether using __builtin_assume has the same undesirable consequences, so I
think I'll start by investigating that.



> [2] 641.leela_s is non-GPL/non-BSD benchmark, and I'm not sure if I can
> post its compiled and/or preprocessed code publicly.  I assume RedHat has
> SPEC CPU2017 license, and I can post details to you privately.
>
>
Yes, I think I can get the benchmark code from Vlad.

Thanks for bringing this to my attention.
Jakub Jelinek June 8, 2023, 9:13 a.m. UTC | #3
On Thu, Jun 08, 2023 at 10:05:43AM +0100, Jonathan Wakely via Gcc-patches wrote:
> > Looking at assembly, one of the differences I see is that the "after"
> > version has calls to realloc_insert(), while "before" version seems to have
> > them inlined [2].
> >
> > [1]
> > https://git.linaro.org/toolchain/ci/interesting-commits.git/tree/gcc/sha1/b7b255e77a271974479c34d1db3daafc04b920bc/tcwg_bmk-code_size-cpu2017fast/status.txt
> >
> >
> I find it annoying that adding `if (n < sz) __builtin_unreachable()` seems
> to affect the size estimates for the function, and so perturbs inlining
> decisions. That code shouldn't add any actual instructions, so shouldn't
> affect size estimates.
> 
> I mentioned this in a meeting last week and Jason suggested checking
> whether using __builtin_assume has the same undesirable consequences, so I

We don't support __builtin_assume (intentionally), if you mean [[assume(n>=sz)]],
then because n >= sz doesn't have side-effects, it will be lowered to
exactly that if (n < sz) __builtin_unreachable(); - you can look at
-fdump-tree-all to confirm that.

I agree that the inliner should ignore if (comparison) __builtin_unreachable();
from costs estimation.  And inliner should ignore what we emit for [[assume()]]
if there are side-effects.  CCing Honza.

	Jakub
Jonathan Wakely June 8, 2023, 11:43 a.m. UTC | #4
On Thu, 8 Jun 2023 at 09:58, Maxim Kuvyrkov <maxim.kuvyrkov@linaro.org>
wrote:

> Hi Jonathan,
>
> Interestingly, this increases code-size of -O3 code on aarch64-linux-gnu
> on SPEC CPU2017's 641.leela_s benchmark [1].
>
> In particular, FastBoard::get_nearby_enemies() grew from 1444 to 2212
> bytes.  This seems like a corner-case; the rest of SPEC CPU2017 is, mostly,
> neutral to this patch.  Is this something you may be interested in
> investigating?  I'll be happy to assist.
>
> Looking at assembly, one of the differences I see is that the "after"
> version has calls to realloc_insert(), while "before" version seems to have
> them inlined [2].
>

Was the size of that function stable at (approximately) 1444 bytes prior to
my most recent change?

Is it possible that r14-1452-gfb409a15d9babc caused the size to shrink, and
then r14-1470-gb7b255e77a2719 caused it to grow again?





>
> [1]
> https://git.linaro.org/toolchain/ci/interesting-commits.git/tree/gcc/sha1/b7b255e77a271974479c34d1db3daafc04b920bc/tcwg_bmk-code_size-cpu2017fast/status.txt
>
> [2] 641.leela_s is non-GPL/non-BSD benchmark, and I'm not sure if I can
> post its compiled and/or preprocessed code publicly.  I assume RedHat has
> SPEC CPU2017 license, and I can post details to you privately.
>
> Kind regards,
>
> --
> Maxim Kuvyrkov
> https://www.linaro.org
>
>
>
>
> > On Jun 1, 2023, at 19:09, Jonathan Wakely via Gcc-patches <
> gcc-patches@gcc.gnu.org> wrote:
> >
> > Tested powerpc64le-linux. Pusshed to trunk.
> >
> > -- >8 --
> >
> > My r14-1452-gfb409a15d9babc change to add optimization hints to
> > std::vector causes regressions because it makes std::vector::size() and
> > std::vector::capacity() too big to inline. That's the opposite of what
> > I wanted, so revert the changes to those functions.
> >
> > To achieve the original aim of optimizing vec.assign(vec.size(), x) we
> > can add a local optimization hint to _M_fill_assign, so that it doesn't
> > affect all other uses of size() and capacity().
> >
> > Additionally, add the same hint to the _M_assign_aux overload for
> > forward iterators and add that to the testcase.
> >
> > It would be nice to similarly optimize:
> >  if (vec1.size() == vec2.size()) vec1 = vec2;
> > but adding hints to operator=(const vector&) doesn't help. Presumably
> > the relationships between the two sizes and two capacities are too
> > complex to track effectively.
> >
> > libstdc++-v3/ChangeLog:
> >
> > PR libstdc++/110060
> > * include/bits/stl_vector.h (_Vector_base::_M_invariant):
> > Remove.
> > (vector::size, vector::capacity): Remove calls to _M_invariant.
> > * include/bits/vector.tcc (vector::_M_fill_assign): Add
> > optimization hint to reallocating path.
> > (vector::_M_assign_aux(FwdIter, FwdIter, forward_iterator_tag)):
> > Likewise.
> > * testsuite/23_containers/vector/capacity/invariant.cc: Moved
> > to...
> > * testsuite/23_containers/vector/modifiers/assign/no_realloc.cc:
> > ...here. Check assign(FwdIter, FwdIter) too.
> > * testsuite/23_containers/vector/types/1.cc: Revert addition
> > of -Wno-stringop-overread option.
> > ---
> > libstdc++-v3/include/bits/stl_vector.h        | 23 +------------------
> > libstdc++-v3/include/bits/vector.tcc          | 17 ++++++++++----
> > .../assign/no_realloc.cc}                     |  6 +++++
> > .../testsuite/23_containers/vector/types/1.cc |  2 +-
> > 4 files changed, 20 insertions(+), 28 deletions(-)
> > rename
> libstdc++-v3/testsuite/23_containers/vector/{capacity/invariant.cc =>
> modifiers/assign/no_realloc.cc} (70%)
> >
> > diff --git a/libstdc++-v3/include/bits/stl_vector.h
> b/libstdc++-v3/include/bits/stl_vector.h
> > index e593be443bc..70ced3d101f 100644
> > --- a/libstdc++-v3/include/bits/stl_vector.h
> > +++ b/libstdc++-v3/include/bits/stl_vector.h
> > @@ -389,23 +389,6 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
> >
> >     protected:
> >
> > -      __attribute__((__always_inline__))
> > -      _GLIBCXX20_CONSTEXPR void
> > -      _M_invariant() const
> > -      {
> > -#if __OPTIMIZE__
> > - if (this->_M_impl._M_finish < this->_M_impl._M_start)
> > -  __builtin_unreachable();
> > - if (this->_M_impl._M_finish > this->_M_impl._M_end_of_storage)
> > -  __builtin_unreachable();
> > -
> > - size_t __sz = this->_M_impl._M_finish - this->_M_impl._M_start;
> > - size_t __cap = this->_M_impl._M_end_of_storage -
> this->_M_impl._M_start;
> > - if (__sz > __cap)
> > -  __builtin_unreachable();
> > -#endif
> > -      }
> > -
> >       _GLIBCXX20_CONSTEXPR
> >       void
> >       _M_create_storage(size_t __n)
> > @@ -1005,10 +988,7 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
> >       _GLIBCXX_NODISCARD _GLIBCXX20_CONSTEXPR
> >       size_type
> >       size() const _GLIBCXX_NOEXCEPT
> > -      {
> > - _Base::_M_invariant();
> > - return size_type(this->_M_impl._M_finish - this->_M_impl._M_start);
> > -      }
> > +      { return size_type(this->_M_impl._M_finish -
> this->_M_impl._M_start); }
> >
> >       /**  Returns the size() of the largest possible %vector.  */
> >       _GLIBCXX_NODISCARD _GLIBCXX20_CONSTEXPR
> > @@ -1095,7 +1075,6 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
> >       size_type
> >       capacity() const _GLIBCXX_NOEXCEPT
> >       {
> > - _Base::_M_invariant();
> > return size_type(this->_M_impl._M_end_of_storage
> >   - this->_M_impl._M_start);
> >       }
> > diff --git a/libstdc++-v3/include/bits/vector.tcc
> b/libstdc++-v3/include/bits/vector.tcc
> > index d6fdea2dd01..acd11e2dc68 100644
> > --- a/libstdc++-v3/include/bits/vector.tcc
> > +++ b/libstdc++-v3/include/bits/vector.tcc
> > @@ -270,15 +270,18 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
> >     vector<_Tp, _Alloc>::
> >     _M_fill_assign(size_t __n, const value_type& __val)
> >     {
> > +      const size_type __sz = size();
> >       if (__n > capacity())
> > {
> > +  if (__n <= __sz)
> > +    __builtin_unreachable();
> >  vector __tmp(__n, __val, _M_get_Tp_allocator());
> >  __tmp._M_impl._M_swap_data(this->_M_impl);
> > }
> > -      else if (__n > size())
> > +      else if (__n > __sz)
> > {
> >  std::fill(begin(), end(), __val);
> > -  const size_type __add = __n - size();
> > +  const size_type __add = __n - __sz;
> >  _GLIBCXX_ASAN_ANNOTATE_GROW(__add);
> >  this->_M_impl._M_finish =
> >    std::__uninitialized_fill_n_a(this->_M_impl._M_finish,
> > @@ -316,10 +319,14 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
> >       _M_assign_aux(_ForwardIterator __first, _ForwardIterator __last,
> >    std::forward_iterator_tag)
> >       {
> > + const size_type __sz = size();
> > const size_type __len = std::distance(__first, __last);
> >
> > if (__len > capacity())
> >  {
> > +    if (__len <= __sz)
> > +      __builtin_unreachable();
> > +
> >    _S_check_init_len(__len, _M_get_Tp_allocator());
> >    pointer __tmp(_M_allocate_and_copy(__len, __first, __last));
> >    std::_Destroy(this->_M_impl._M_start, this->_M_impl._M_finish,
> > @@ -332,14 +339,14 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
> >    this->_M_impl._M_finish = this->_M_impl._M_start + __len;
> >    this->_M_impl._M_end_of_storage = this->_M_impl._M_finish;
> >  }
> > - else if (size() >= __len)
> > + else if (__sz >= __len)
> >  _M_erase_at_end(std::copy(__first, __last, this->_M_impl._M_start));
> > else
> >  {
> >    _ForwardIterator __mid = __first;
> > -    std::advance(__mid, size());
> > +    std::advance(__mid, __sz);
> >    std::copy(__first, __mid, this->_M_impl._M_start);
> > -    const size_type __attribute__((__unused__)) __n = __len - size();
> > +    const size_type __attribute__((__unused__)) __n = __len - __sz;
> >    _GLIBCXX_ASAN_ANNOTATE_GROW(__n);
> >    this->_M_impl._M_finish =
> >      std::__uninitialized_copy_a(__mid, __last,
> > diff --git
> a/libstdc++-v3/testsuite/23_containers/vector/capacity/invariant.cc
> b/libstdc++-v3/testsuite/23_containers/vector/modifiers/assign/no_realloc.cc
> > similarity index 70%
> > rename from
> libstdc++-v3/testsuite/23_containers/vector/capacity/invariant.cc
> > rename to
> libstdc++-v3/testsuite/23_containers/vector/modifiers/assign/no_realloc.cc
> > index d68db694add..659f18dba56 100644
> > --- a/libstdc++-v3/testsuite/23_containers/vector/capacity/invariant.cc
> > +++
> b/libstdc++-v3/testsuite/23_containers/vector/modifiers/assign/no_realloc.cc
> > @@ -1,5 +1,6 @@
> > // { dg-do compile }
> > // { dg-options "-O3 -g0" }
> > +// { dg-require-normal-mode "" }
> > // { dg-final { scan-assembler-not "_Znw" } }
> > // GCC should be able to optimize away the paths involving reallocation.
> >
> > @@ -14,3 +15,8 @@ void fill_val(std::vector<int>& vec, int i)
> > {
> >   vec.assign(vec.size(), i);
> > }
> > +
> > +void fill_range(std::vector<int>& vec, const int* first)
> > +{
> > +  vec.assign(first, first + vec.size());
> > +}
> > diff --git a/libstdc++-v3/testsuite/23_containers/vector/types/1.cc
> b/libstdc++-v3/testsuite/23_containers/vector/types/1.cc
> > index 9be07d9fd5c..079e5af9556 100644
> > --- a/libstdc++-v3/testsuite/23_containers/vector/types/1.cc
> > +++ b/libstdc++-v3/testsuite/23_containers/vector/types/1.cc
> > @@ -18,7 +18,7 @@
> > // <http://www.gnu.org/licenses/>.
> >
> > // { dg-do compile }
> > -// { dg-options "-Wno-unused-result -Wno-stringop-overread" }
> > +// { dg-options "-Wno-unused-result" }
> >
> > #include <vector>
> > #include <testsuite_greedy_ops.h>
> > --
> > 2.40.1
> >
>
>
Richard Biener June 9, 2023, 9 a.m. UTC | #5
On Thu, Jun 8, 2023 at 11:15 AM Jakub Jelinek via Gcc-patches
<gcc-patches@gcc.gnu.org> wrote:
>
> On Thu, Jun 08, 2023 at 10:05:43AM +0100, Jonathan Wakely via Gcc-patches wrote:
> > > Looking at assembly, one of the differences I see is that the "after"
> > > version has calls to realloc_insert(), while "before" version seems to have
> > > them inlined [2].
> > >
> > > [1]
> > > https://git.linaro.org/toolchain/ci/interesting-commits.git/tree/gcc/sha1/b7b255e77a271974479c34d1db3daafc04b920bc/tcwg_bmk-code_size-cpu2017fast/status.txt
> > >
> > >
> > I find it annoying that adding `if (n < sz) __builtin_unreachable()` seems
> > to affect the size estimates for the function, and so perturbs inlining
> > decisions. That code shouldn't add any actual instructions, so shouldn't
> > affect size estimates.
> >
> > I mentioned this in a meeting last week and Jason suggested checking
> > whether using __builtin_assume has the same undesirable consequences, so I
>
> We don't support __builtin_assume (intentionally), if you mean [[assume(n>=sz)]],
> then because n >= sz doesn't have side-effects, it will be lowered to
> exactly that if (n < sz) __builtin_unreachable(); - you can look at
> -fdump-tree-all to confirm that.
>
> I agree that the inliner should ignore if (comparison) __builtin_unreachable();
> from costs estimation.  And inliner should ignore what we emit for [[assume()]]
> if there are side-effects.  CCing Honza.

Agreed, that would be nice.  Note that we have inliner limits in place to avoid
compile-time and memory-usage explosion as well so these kind of
"tricks" may be a way to
defeat them.

Richard.

>
>         Jakub
>
diff mbox series

Patch

diff --git a/libstdc++-v3/include/bits/stl_vector.h b/libstdc++-v3/include/bits/stl_vector.h
index e593be443bc..70ced3d101f 100644
--- a/libstdc++-v3/include/bits/stl_vector.h
+++ b/libstdc++-v3/include/bits/stl_vector.h
@@ -389,23 +389,6 @@  _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
 
     protected:
 
-      __attribute__((__always_inline__))
-      _GLIBCXX20_CONSTEXPR void
-      _M_invariant() const
-      {
-#if __OPTIMIZE__
-	if (this->_M_impl._M_finish < this->_M_impl._M_start)
-	  __builtin_unreachable();
-	if (this->_M_impl._M_finish > this->_M_impl._M_end_of_storage)
-	  __builtin_unreachable();
-
-	size_t __sz = this->_M_impl._M_finish - this->_M_impl._M_start;
-	size_t __cap = this->_M_impl._M_end_of_storage - this->_M_impl._M_start;
-	if (__sz > __cap)
-	  __builtin_unreachable();
-#endif
-      }
-
       _GLIBCXX20_CONSTEXPR
       void
       _M_create_storage(size_t __n)
@@ -1005,10 +988,7 @@  _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
       _GLIBCXX_NODISCARD _GLIBCXX20_CONSTEXPR
       size_type
       size() const _GLIBCXX_NOEXCEPT
-      {
-	_Base::_M_invariant();
-	return size_type(this->_M_impl._M_finish - this->_M_impl._M_start);
-      }
+      { return size_type(this->_M_impl._M_finish - this->_M_impl._M_start); }
 
       /**  Returns the size() of the largest possible %vector.  */
       _GLIBCXX_NODISCARD _GLIBCXX20_CONSTEXPR
@@ -1095,7 +1075,6 @@  _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
       size_type
       capacity() const _GLIBCXX_NOEXCEPT
       {
-	_Base::_M_invariant();
 	return size_type(this->_M_impl._M_end_of_storage
 			   - this->_M_impl._M_start);
       }
diff --git a/libstdc++-v3/include/bits/vector.tcc b/libstdc++-v3/include/bits/vector.tcc
index d6fdea2dd01..acd11e2dc68 100644
--- a/libstdc++-v3/include/bits/vector.tcc
+++ b/libstdc++-v3/include/bits/vector.tcc
@@ -270,15 +270,18 @@  _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
     vector<_Tp, _Alloc>::
     _M_fill_assign(size_t __n, const value_type& __val)
     {
+      const size_type __sz = size();
       if (__n > capacity())
 	{
+	  if (__n <= __sz)
+	    __builtin_unreachable();
 	  vector __tmp(__n, __val, _M_get_Tp_allocator());
 	  __tmp._M_impl._M_swap_data(this->_M_impl);
 	}
-      else if (__n > size())
+      else if (__n > __sz)
 	{
 	  std::fill(begin(), end(), __val);
-	  const size_type __add = __n - size();
+	  const size_type __add = __n - __sz;
 	  _GLIBCXX_ASAN_ANNOTATE_GROW(__add);
 	  this->_M_impl._M_finish =
 	    std::__uninitialized_fill_n_a(this->_M_impl._M_finish,
@@ -316,10 +319,14 @@  _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
       _M_assign_aux(_ForwardIterator __first, _ForwardIterator __last,
 		    std::forward_iterator_tag)
       {
+	const size_type __sz = size();
 	const size_type __len = std::distance(__first, __last);
 
 	if (__len > capacity())
 	  {
+	    if (__len <= __sz)
+	      __builtin_unreachable();
+
 	    _S_check_init_len(__len, _M_get_Tp_allocator());
 	    pointer __tmp(_M_allocate_and_copy(__len, __first, __last));
 	    std::_Destroy(this->_M_impl._M_start, this->_M_impl._M_finish,
@@ -332,14 +339,14 @@  _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
 	    this->_M_impl._M_finish = this->_M_impl._M_start + __len;
 	    this->_M_impl._M_end_of_storage = this->_M_impl._M_finish;
 	  }
-	else if (size() >= __len)
+	else if (__sz >= __len)
 	  _M_erase_at_end(std::copy(__first, __last, this->_M_impl._M_start));
 	else
 	  {
 	    _ForwardIterator __mid = __first;
-	    std::advance(__mid, size());
+	    std::advance(__mid, __sz);
 	    std::copy(__first, __mid, this->_M_impl._M_start);
-	    const size_type __attribute__((__unused__)) __n = __len - size();
+	    const size_type __attribute__((__unused__)) __n = __len - __sz;
 	    _GLIBCXX_ASAN_ANNOTATE_GROW(__n);
 	    this->_M_impl._M_finish =
 	      std::__uninitialized_copy_a(__mid, __last,
diff --git a/libstdc++-v3/testsuite/23_containers/vector/capacity/invariant.cc b/libstdc++-v3/testsuite/23_containers/vector/modifiers/assign/no_realloc.cc
similarity index 70%
rename from libstdc++-v3/testsuite/23_containers/vector/capacity/invariant.cc
rename to libstdc++-v3/testsuite/23_containers/vector/modifiers/assign/no_realloc.cc
index d68db694add..659f18dba56 100644
--- a/libstdc++-v3/testsuite/23_containers/vector/capacity/invariant.cc
+++ b/libstdc++-v3/testsuite/23_containers/vector/modifiers/assign/no_realloc.cc
@@ -1,5 +1,6 @@ 
 // { dg-do compile }
 // { dg-options "-O3 -g0" }
+// { dg-require-normal-mode "" }
 // { dg-final { scan-assembler-not "_Znw" } }
 // GCC should be able to optimize away the paths involving reallocation.
 
@@ -14,3 +15,8 @@  void fill_val(std::vector<int>& vec, int i)
 {
   vec.assign(vec.size(), i);
 }
+
+void fill_range(std::vector<int>& vec, const int* first)
+{
+  vec.assign(first, first + vec.size());
+}
diff --git a/libstdc++-v3/testsuite/23_containers/vector/types/1.cc b/libstdc++-v3/testsuite/23_containers/vector/types/1.cc
index 9be07d9fd5c..079e5af9556 100644
--- a/libstdc++-v3/testsuite/23_containers/vector/types/1.cc
+++ b/libstdc++-v3/testsuite/23_containers/vector/types/1.cc
@@ -18,7 +18,7 @@ 
 // <http://www.gnu.org/licenses/>.
 
 // { dg-do compile }
-// { dg-options "-Wno-unused-result -Wno-stringop-overread" }
+// { dg-options "-Wno-unused-result" }
 
 #include <vector>
 #include <testsuite_greedy_ops.h>