From patchwork Tue Nov 21 16:24:06 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jan Hubicka X-Patchwork-Id: 1866923 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; dkim=pass (1024-bit key; unprotected) header.d=ucw.cz header.i=@ucw.cz header.a=rsa-sha256 header.s=gen1 header.b=Os5V9FO4; dkim-atps=neutral Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=2620:52:3:1:0:246e:9693:128c; helo=server2.sourceware.org; envelope-from=gcc-patches-bounces+incoming=patchwork.ozlabs.org@gcc.gnu.org; receiver=patchwork.ozlabs.org) Received: from server2.sourceware.org (server2.sourceware.org [IPv6:2620:52:3:1:0:246e:9693:128c]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (secp384r1) server-digest SHA384) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4SZV7s147jz1ySN for ; Wed, 22 Nov 2023 03:24:25 +1100 (AEDT) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 643A9385B83C for ; Tue, 21 Nov 2023 16:24:22 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from nikam.ms.mff.cuni.cz (nikam.ms.mff.cuni.cz [195.113.20.16]) by sourceware.org (Postfix) with ESMTPS id 4116B3858D35; Tue, 21 Nov 2023 16:24:08 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 4116B3858D35 Authentication-Results: sourceware.org; dmarc=fail (p=none dis=none) header.from=ucw.cz Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=kam.mff.cuni.cz ARC-Filter: OpenARC Filter v1.0.0 sourceware.org 4116B3858D35 Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=195.113.20.16 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; 
t=1700583850; cv=none; b=qBr3te7I2uHa/qPeIzycOnZYE+jei5sqQgg7DHD+DP75HDN9jylQBswbSFI+ux81KEm0bF0g/i/W5Fh4MwBNbObhNkNV0uy4zsJ9nBlDMwjIH0gbD2rz9I0oJZ5ANUYUVrzZYG6nhR8gezv9VLfDF9oI5UBhy3V5xVtTUy702SU= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1700583850; c=relaxed/simple; bh=SR34EQ8ZyS9Uu56570KQwnDgMB/4IRJJiyASEfbVsSM=; h=DKIM-Signature:Date:From:To:Subject:Message-ID:MIME-Version; b=A8mN9598rFxNx16YM+uKNrLkwjbkuqPMkgqbbD14fMjo0oVuXF3KI8qACoOasOcIyhxBUZ5j8nh9KBdZ5+Y1qbqrT9GYj5oN5PkSIwy0ILHzq1TVze1PQw2f4NSXHFguR/oos4z0oRlb6ZKimQB4q3n9gIydfA61uUDedk20YJA= ARC-Authentication-Results: i=1; server2.sourceware.org Received: by nikam.ms.mff.cuni.cz (Postfix, from userid 16202) id D08B328B8C5; Tue, 21 Nov 2023 17:24:06 +0100 (CET) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ucw.cz; s=gen1; t=1700583846; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version:content-type:content-type; bh=uE/M0KalHSCiCE7AeR54yvthQotgDOzCwudTCKWA6CY=; b=Os5V9FO4qruyE4VGdHI3J084F8ismD341N6DO0FaeKkAFsek+Znq+aW1j5hZ+reMT3E6VF qku/6V5f0AzFkUVYlBEbMMqj59/rR+Sejxvlmcss1C48/H1lG+RoNQY6wAtD2Wo70nRRtA 3KS0xp+eWVe4zY1txtjoAq23+Y+wCXk= Date: Tue, 21 Nov 2023 17:24:06 +0100 From: Jan Hubicka To: gcc-patches@gcc.gnu.org, libstdc++@gcc.gnu.org, jwakely@redhat.com Subject: libstdc++: Turn memmove to memcpy in vector reallocations Message-ID: MIME-Version: 1.0 Content-Disposition: inline X-Spam-Status: No, score=-11.0 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, GIT_PATCH_0, HEADER_FROM_DIFFERENT_DOMAINS, JMQ_SPF_NEUTRAL, RCVD_IN_MSPIKE_H3, RCVD_IN_MSPIKE_WL, SPF_HELO_NONE, SPF_PASS, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: 
List-Help: List-Subscribe: , Errors-To: gcc-patches-bounces+incoming=patchwork.ozlabs.org@gcc.gnu.org

Hi,
this patch turns memmove into memcpy where we can, and also avoids the
extra guard that checks whether the block is non-empty.  This does not
show up as a performance improvement in my push_back micro-benchmark
because vector reallocation does not happen that often.  In general,
however, we optimize memcpy better than memmove (we can inline it in some
cases).  Saving 3 extra instructions makes push_back more likely to be
inlined, though (estimate is now 23).

I also filed PR112653.  I think that for the default allocator we should
be able to work out from PTA that the memmove can be a memcpy.

Honestly, I am not quite sure whether I need the first
__relocate_copy_a_1 template.  It handles the case where we can't use
memmove, but with my limited C++ skills I don't see how to get rid of it
or make it a wrapper for __relocate_a_1, which is identical.

Regtested on x86_64-linux.

libstdc++-v3/ChangeLog:

	* include/bits/stl_uninitialized.h (__relocate_copy_a_1): New
	functions.
	(__relocate_a_1): Do not check count to be non-zero before calling
	memmove.
	(__relocate_copy_a): New function.
	* include/bits/stl_vector.h (_S_do_relocate_copy): New member
	function.
	* include/bits/vector.tcc (reserve, _M_realloc_append,
	_M_realloc_insert, _M_default_append): Use _S_relocate_copy.

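For illustration only (this sketch is not part of the patch): when a vector
reallocates, the new storage is freshly allocated and cannot overlap the old
block, so the overlap handling that memmove provides is not needed and memcpy
is sufficient.  The helper name relocate_disjoint is hypothetical.  The sketch
keeps a count check because std::memcpy requires valid pointers even for a
zero size, whereas the patch calls __builtin_memcpy unconditionally.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <type_traits>

// Hypothetical helper, not libstdc++ code.  Relocates [first, last) into
// storage starting at result, assuming the two blocks do not overlap
// (as is the case for a freshly allocated vector buffer).
template <typename T>
T* relocate_disjoint(T* first, T* last, T* result)
{
  static_assert(std::is_trivially_copyable<T>::value,
                "this sketch only handles trivially copyable types");
  const std::ptrdiff_t count = last - first;
  // Disjoint source and destination: memcpy is enough, no memmove needed.
  if (count > 0)
    std::memcpy(result, first, static_cast<std::size_t>(count) * sizeof(T));
  return result + count;
}
```

The returned pointer is one past the last relocated element, mirroring how
the vector code uses the result to compute its new finish pointer.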
diff --git a/libstdc++-v3/include/bits/stl_uninitialized.h b/libstdc++-v3/include/bits/stl_uninitialized.h
index 1282af3bc43..983fa315e1b 100644
--- a/libstdc++-v3/include/bits/stl_uninitialized.h
+++ b/libstdc++-v3/include/bits/stl_uninitialized.h
@@ -1104,6 +1104,28 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
                                  std::__addressof(*__first), __alloc);
       return __cur;
     }
+  template <typename _InputIterator, typename _ForwardIterator,
+            typename _Allocator>
+    _GLIBCXX20_CONSTEXPR
+    inline _ForwardIterator
+    __relocate_copy_a_1(_InputIterator __first, _InputIterator __last,
+                        _ForwardIterator __result, _Allocator& __alloc)
+    noexcept(noexcept(std::__relocate_object_a(std::addressof(*__result),
+                                               std::addressof(*__first),
+                                               __alloc)))
+    {
+      typedef typename iterator_traits<_InputIterator>::value_type
+        _ValueType;
+      typedef typename iterator_traits<_ForwardIterator>::value_type
+        _ValueType2;
+      static_assert(std::is_same<_ValueType, _ValueType2>::value,
+          "relocation is only possible for values of the same type");
+      _ForwardIterator __cur = __result;
+      for (; __first != __last; ++__first, (void)++__cur)
+        std::__relocate_object_a(std::__addressof(*__cur),
+                                 std::__addressof(*__first), __alloc);
+      return __cur;
+    }
 
 #if _GLIBCXX_HOSTED
   template <typename _Tp, typename _Up>
@@ -1114,20 +1136,46 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
                    [[__maybe_unused__]] allocator<_Up>& __alloc) noexcept
     {
       ptrdiff_t __count = __last - __first;
-      if (__count > 0)
-        {
 #ifdef __cpp_lib_is_constant_evaluated
-          if (std::is_constant_evaluated())
+      if (std::is_constant_evaluated())
+        {
+          // Can't use memmove. Wrap the pointer so that __relocate_a_1
+          // resolves to the non-trivial overload above.
+          if (__count > 0)
             {
-              // Can't use memmove. Wrap the pointer so that __relocate_a_1
-              // resolves to the non-trivial overload above.
               __gnu_cxx::__normal_iterator<_Tp*, void> __out(__result);
               __out = std::__relocate_a_1(__first, __last, __out, __alloc);
               return __out.base();
             }
+          return __result;
+        }
 #endif
-          __builtin_memmove(__result, __first, __count * sizeof(_Tp));
+      __builtin_memmove(__result, __first, __count * sizeof(_Tp));
+      return __result + __count;
+    }
+  template <typename _Tp, typename _Up>
+    _GLIBCXX20_CONSTEXPR
+    inline __enable_if_t<std::__is_bitwise_relocatable<_Tp>::value, _Tp*>
+    __relocate_copy_a_1(_Tp* __first, _Tp* __last,
+                        _Tp* __result,
+                        [[__maybe_unused__]] allocator<_Up>& __alloc) noexcept
+    {
+      ptrdiff_t __count = __last - __first;
+#ifdef __cpp_lib_is_constant_evaluated
+      if (std::is_constant_evaluated())
+        {
+          // Can't use memcpy. Wrap the pointer so that __relocate_copy_a_1
+          // resolves to the non-trivial overload above.
+          if (__count > 0)
+            {
+              __gnu_cxx::__normal_iterator<_Tp*, void> __out(__result);
+              __out = std::__relocate_a_1(__first, __last, __out, __alloc);
+              return __out.base();
+            }
+          return __result;
         }
+#endif
+      __builtin_memcpy(__result, __first, __count * sizeof(_Tp));
       return __result + __count;
     }
 #endif
@@ -1146,6 +1194,20 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
                                  std::__niter_base(__last),
                                  std::__niter_base(__result), __alloc);
     }
+  template <typename _InputIterator, typename _ForwardIterator,
+            typename _Allocator>
+    _GLIBCXX20_CONSTEXPR
+    inline _ForwardIterator
+    __relocate_copy_a(_InputIterator __first, _InputIterator __last,
+                      _ForwardIterator __result, _Allocator& __alloc)
+    noexcept(noexcept(__relocate_copy_a_1(std::__niter_base(__first),
+                                          std::__niter_base(__last),
+                                          std::__niter_base(__result), __alloc)))
+    {
+      return std::__relocate_copy_a_1(std::__niter_base(__first),
+                                      std::__niter_base(__last),
+                                      std::__niter_base(__result), __alloc);
+    }
 
   /// @endcond
 #endif // C++11
diff --git a/libstdc++-v3/include/bits/stl_vector.h b/libstdc++-v3/include/bits/stl_vector.h
index 973f4d7e2e9..4f9dba6c3fe 100644
--- a/libstdc++-v3/include/bits/stl_vector.h
+++ b/libstdc++-v3/include/bits/stl_vector.h
@@ -507,6 +507,31 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
 #else
         using __do_it = __bool_constant<_S_use_relocate()>;
         return _S_do_relocate(__first, __last, __result, __alloc, __do_it{});
+#endif
+      }
+      static pointer
+      _S_do_relocate_copy(pointer __first, pointer __last, pointer __result,
+                          _Tp_alloc_type& __alloc, true_type) noexcept
+      {
+        return std::__relocate_a(__first, __last, __result, __alloc);
+      }
+
+      static pointer
+      _S_do_relocate_copy(pointer, pointer, pointer __result,
+                          _Tp_alloc_type&, false_type) noexcept
+      { return __result; }
+      // same as _S_relocate but assumes that the destination block
+      // is disjoint (as in memcpy)
+      static _GLIBCXX20_CONSTEXPR pointer
+      _S_relocate_copy(pointer __first, pointer __last, pointer __result,
+                       _Tp_alloc_type& __alloc) noexcept
+      {
+#if __cpp_if_constexpr
+        // All callers have already checked _S_use_relocate() so just do it.
+        return std::__relocate_copy_a(__first, __last, __result, __alloc);
+#else
+        using __do_it = __bool_constant<_S_use_relocate()>;
+        return _S_do_relocate_copy(__first, __last, __result, __alloc, __do_it{});
 #endif
       }
 #endif // C++11
diff --git a/libstdc++-v3/include/bits/vector.tcc b/libstdc++-v3/include/bits/vector.tcc
index 0ccef7911b3..2468ad85f49 100644
--- a/libstdc++-v3/include/bits/vector.tcc
+++ b/libstdc++-v3/include/bits/vector.tcc
@@ -77,8 +77,8 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
           if _GLIBCXX17_CONSTEXPR (_S_use_relocate())
             {
               __tmp = this->_M_allocate(__n);
-              _S_relocate(this->_M_impl._M_start, this->_M_impl._M_finish,
-                          __tmp, _M_get_Tp_allocator());
+              _S_relocate_copy(this->_M_impl._M_start, this->_M_impl._M_finish,
+                               __tmp, _M_get_Tp_allocator());
             }
           else
 #endif
@@ -515,11 +515,11 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
           if _GLIBCXX17_CONSTEXPR (_S_use_relocate())
             {
               // Relocation cannot throw.
-              __new_finish = _S_relocate(__old_start, __position.base(),
-                                         __new_start, _M_get_Tp_allocator());
+              __new_finish = _S_relocate_copy(__old_start, __position.base(),
+                                              __new_start, _M_get_Tp_allocator());
               ++__new_finish;
-              __new_finish = _S_relocate(__position.base(), __old_finish,
-                                         __new_finish, _M_get_Tp_allocator());
+              __new_finish = _S_relocate_copy(__position.base(), __old_finish,
+                                              __new_finish, _M_get_Tp_allocator());
             }
           else
 #endif
@@ -644,8 +644,8 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
           if _GLIBCXX17_CONSTEXPR (_S_use_relocate())
             {
               // Relocation cannot throw.
-              __new_finish = _S_relocate(__old_start, __old_finish,
-                                         __new_start, _M_get_Tp_allocator());
+              __new_finish = _S_relocate_copy(__old_start, __old_finish,
+                                              __new_start, _M_get_Tp_allocator());
               ++__new_finish;
             }
           else
@@ -865,8 +865,8 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
 
               if _GLIBCXX17_CONSTEXPR (_S_use_relocate())
                 {
-                  _S_relocate(__old_start, __old_finish,
-                              __new_start, _M_get_Tp_allocator());
+                  _S_relocate_copy(__old_start, __old_finish,
+                                   __new_start, _M_get_Tp_allocator());
                 }
               else
                 {