From patchwork Mon Nov 11 18:51:22 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Sandiford X-Patchwork-Id: 1193071 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=209.132.180.131; helo=sourceware.org; envelope-from=gcc-patches-return-512994-incoming=patchwork.ozlabs.org@gcc.gnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=none (p=none dis=none) header.from=arm.com Authentication-Results: ozlabs.org; dkim=pass (1024-bit key; unprotected) header.d=gcc.gnu.org header.i=@gcc.gnu.org header.b="CmYgFV57"; dkim-atps=neutral Received: from sourceware.org (server1.sourceware.org [209.132.180.131]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 47Bg2f3YYYz9sSs for ; Tue, 12 Nov 2019 05:51:37 +1100 (AEDT) DomainKey-Signature: a=rsa-sha1; c=nofws; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender:from :to:subject:references:date:in-reply-to:message-id:mime-version :content-type; q=dns; s=default; b=lUYeanjQWUEFSjRew+dDfDFV0FhfK Xp+MIkKGlyR2f4Ou9r2sES9m/dUfL/gTaPftSSnTVw86MHsqRf8U7573GNQnw+4o wtQdznyExAXkZwFGz8jeaCWm95M/vol6cUeb9vXyUNtO3VlQ4D5EnPquU/MsffiG FAZ+jiztOLTNjM= DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender:from :to:subject:references:date:in-reply-to:message-id:mime-version :content-type; s=default; bh=iRLA/8UtPh1xjTit1UHeR4avTcs=; b=CmY gFV575rlkC8iMDpfpruXMtjMYu/eastKz4hlKUykl3YQ5xiOp117n1GQWDKfC4Sb dNpxHOfF0w2L67R4LTd/c4zRT4XP3D4Hda/RlQO9Zym7GizY8Ujfq6PFyaU/jyr/ iWX3DkRAeuCs15T0z+9TftsjJ2bqFEQO8EhnNKKY= Received: (qmail 16581 invoked by alias); 11 Nov 2019 18:51:28 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Unsubscribe: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Delivered-To: mailing list gcc-patches@gcc.gnu.org Received: (qmail 16574 invoked by uid 89); 11 Nov 2019 18:51:27 -0000 Authentication-Results: sourceware.org; auth=none X-Spam-SWARE-Status: No, score=-11.8 required=5.0 tests=AWL, BAYES_00, GIT_PATCH_1, GIT_PATCH_2, GIT_PATCH_3, KAM_ASCII_DIVIDERS, SPF_PASS autolearn=ham version=3.3.1 spammy=apart, wi X-HELO: foss.arm.com Received: from foss.arm.com (HELO foss.arm.com) (217.140.110.172) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Mon, 11 Nov 2019 18:51:25 +0000 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id A50DC1FB for ; Mon, 11 Nov 2019 10:51:23 -0800 (PST) Received: from localhost (e121540-lin.manchester.arm.com [10.32.98.126]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 30A913F52E for ; Mon, 11 Nov 2019 10:51:23 -0800 (PST) From: Richard Sandiford To: gcc-patches@gcc.gnu.org Mail-Followup-To: gcc-patches@gcc.gnu.org, richard.sandiford@arm.com Subject: [7/8] Use a single comparison for index-based alias checks References: Date: Mon, 11 Nov 2019 18:51:22 +0000 In-Reply-To: (Richard Sandiford's message of "Mon, 11 Nov 2019 18:45:00 +0000") Message-ID: User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/26.1 (gnu/linux) MIME-Version: 1.0 X-IsSubscribed: yes This patch rewrites the index-based alias checks to use conditions of the form: (unsigned T) (a - b + bias) <= limit E.g. before the patch: struct s { int x[100]; }; void f1 (struct s *s1, int a, int b) { for (int i = 0; i < 32; ++i) s1->x[i + a] += s1->x[i + b]; } used: add w3, w1, 3 cmp w3, w2 add w3, w2, 3 ccmp w1, w3, 0, ge ble .L2 whereas after the patch it uses: sub w3, w1, w2 add w3, w3, 3 cmp w3, 6 bls .L2 The patch also fixes the seg_len1 and seg_len2 negation for cases in which seg_len is a "negative unsigned" value narrower than 64 bits, like it is for 32-bit targets. Previously we'd end up with values like 0xffffffff000000001 instead of 1. 2019-11-11 Richard Sandiford gcc/ * tree-data-ref.c (create_intersect_range_checks_index): Rewrite the index tests to have the form (unsigned T) (B - A + bias) <= limit. gcc/testsuite/ * gcc.dg/vect/vect-alias-check-18.c: New test. * gcc.dg/vect/vect-alias-check-19.c: Likewise. * gcc.dg/vect/vect-alias-check-20.c: Likewise. Index: gcc/tree-data-ref.c =================================================================== --- gcc/tree-data-ref.c 2019-11-11 18:31:01.263118514 +0000 +++ gcc/tree-data-ref.c 2019-11-11 18:31:05.099091743 +0000 @@ -1744,7 +1744,9 @@ prune_runtime_alias_test_list (vec + <--- A: -ve step ---> + +-----+-------+-----+-------+-----+ + | n-1 | ..... | 0 | ..... | n-1 | + +-----+-------+-----+-------+-----+ + <--- B: +ve step ---> + <-- idx_len --> + | + min + + where "n" is the number of scalar iterations covered by the segment + and where each access spans idx_access units. + + A is the range of bytes accessed when the step is negative, + B is the range when the step is positive. + + When checking for general overlap, we need to test whether + the range: + + [min1 + low_offset1, min2 + high_offset1 + idx_access1 - 1] + + overlaps: + + [min2 + low_offset2, min2 + high_offset2 + idx_access2 - 1] + + where: + + low_offsetN = +ve step ? 0 : -idx_lenN; + high_offsetN = +ve step ? idx_lenN : 0; + + This is equivalent to testing whether: + + min1 + low_offset1 <= min2 + high_offset2 + idx_access2 - 1 + && min2 + low_offset2 <= min1 + high_offset1 + idx_access1 - 1 + + Converting this into a single test, there is an overlap if: + + 0 <= min2 - min1 + bias <= limit + + where bias = high_offset2 + idx_access2 - 1 - low_offset1 + limit = (high_offset1 - low_offset1 + idx_access1 - 1) + + (high_offset2 - low_offset2 + idx_access2 - 1) + i.e. limit = idx_len1 + idx_access1 - 1 + idx_len2 + idx_access2 - 1 + + Combining the tests requires limit to be computable in an unsigned + form of the index type; if it isn't, we fall back to the usual + pointer-based checks. */ + poly_offset_int limit = (idx_len1 + idx_access1 - 1 + + idx_len2 + idx_access2 - 1); + tree utype = unsigned_type_for (TREE_TYPE (min1)); + if (!wi::fits_to_tree_p (limit, utype)) + return false; + + poly_offset_int low_offset1 = neg_step ? -idx_len1 : 0; + poly_offset_int high_offset2 = neg_step ? 0 : idx_len2; + poly_offset_int bias = high_offset2 + idx_access2 - 1 - low_offset1; + + tree subject = fold_build2 (MINUS_EXPR, utype, + fold_convert (utype, min2), + fold_convert (utype, min1)); + subject = fold_build2 (PLUS_EXPR, utype, subject, + wide_int_to_tree (utype, bias)); + tree part_cond_expr = fold_build2 (GT_EXPR, boolean_type_node, subject, + wide_int_to_tree (utype, limit)); if (*cond_expr) *cond_expr = fold_build2 (TRUTH_AND_EXPR, boolean_type_node, *cond_expr, part_cond_expr); Index: gcc/testsuite/gcc.dg/vect/vect-alias-check-18.c =================================================================== --- /dev/null 2019-09-17 11:41:18.176664108 +0100 +++ gcc/testsuite/gcc.dg/vect/vect-alias-check-18.c 2019-11-11 18:31:05.099091743 +0000 @@ -0,0 +1,64 @@ +#define N 200 +#define DIST 32 + +typedef signed char sc; +typedef unsigned char uc; +typedef signed short ss; +typedef unsigned short us; +typedef int si; +typedef unsigned int ui; +typedef signed long long sll; +typedef unsigned long long ull; + +#define FOR_EACH_TYPE(M) \ + M (sc) M (uc) \ + M (ss) M (us) \ + M (si) M (ui) \ + M (sll) M (ull) \ + M (float) M (double) + +#define TEST_VALUE(I) ((I) * 11 / 2) + +#define ADD_TEST(TYPE) \ + TYPE a_##TYPE[N * 2]; \ + void __attribute__((noinline, noclone)) \ + test_##TYPE (int x, int y) \ + { \ + for (int i = 0; i < N; ++i) \ + a_##TYPE[x - i] += a_##TYPE[y - i]; \ + } + +#define DO_TEST(TYPE) \ + for (int i = 0; i < DIST * 2; ++i) \ + { \ + for (int j = 0; j < N + DIST * 2; ++j) \ + a_##TYPE[j] = TEST_VALUE (j); \ + test_##TYPE (i + N - 1, DIST + N - 1); \ + for (int j = 0; j < N + DIST * 2; ++j) \ + { \ + TYPE expected; \ + if (j < i || j >= i + N) \ + expected = TEST_VALUE (j); \ + else if (i >= DIST) \ + expected = ((TYPE) TEST_VALUE (j) \ + + (TYPE) TEST_VALUE (j + DIST - i)); \ + else \ + expected = ((TYPE) TEST_VALUE (j) \ + + a_##TYPE[j + DIST - i]); \ + if (expected != a_##TYPE[j]) \ + __builtin_abort (); \ + } \ + } + +FOR_EACH_TYPE (ADD_TEST) + +int +main (void) +{ + FOR_EACH_TYPE (DO_TEST) + return 0; +} + +/* { dg-final { scan-tree-dump {flags: *WAR\n} "vect" { target vect_int } } } */ +/* { dg-final { scan-tree-dump "using an index-based overlap test" "vect" } } */ +/* { dg-final { scan-tree-dump-not "using an address-based" "vect" } } */ Index: gcc/testsuite/gcc.dg/vect/vect-alias-check-19.c =================================================================== --- /dev/null 2019-09-17 11:41:18.176664108 +0100 +++ gcc/testsuite/gcc.dg/vect/vect-alias-check-19.c 2019-11-11 18:31:05.099091743 +0000 @@ -0,0 +1,62 @@ +#define N 200 +#define DIST 32 + +typedef signed char sc; +typedef unsigned char uc; +typedef signed short ss; +typedef unsigned short us; +typedef int si; +typedef unsigned int ui; +typedef signed long long sll; +typedef unsigned long long ull; + +#define FOR_EACH_TYPE(M) \ + M (sc) M (uc) \ + M (ss) M (us) \ + M (si) M (ui) \ + M (sll) M (ull) \ + M (float) M (double) + +#define ADD_TEST(TYPE) \ + TYPE a_##TYPE[N * 2]; \ + void __attribute__((noinline, noclone)) \ + test_##TYPE (int x, int y) \ + { \ + for (int i = 0; i < N; ++i) \ + { \ + a_##TYPE[i + x] = i; \ + a_##TYPE[i + y] = 42 - i * 2; \ + } \ + } + +#define DO_TEST(TYPE) \ + for (int i = 0; i < DIST * 2; ++i) \ + { \ + __builtin_memset (a_##TYPE, 0, sizeof (a_##TYPE)); \ + test_##TYPE (DIST, i); \ + for (int j = 0; j < N + DIST * 2; ++j) \ + { \ + TYPE expected = 0; \ + if (i > DIST && j >= i && j < i + N) \ + expected = 42 - (j - i) * 2; \ + if (j >= DIST && j < DIST + N) \ + expected = j - DIST; \ + if (i <= DIST && j >= i && j < i + N) \ + expected = 42 - (j - i) * 2; \ + if (expected != a_##TYPE[j]) \ + __builtin_abort (); \ + } \ + } + +FOR_EACH_TYPE (ADD_TEST) + +int +main (void) +{ + FOR_EACH_TYPE (DO_TEST) + return 0; +} + +/* { dg-final { scan-tree-dump {flags: *WAW\n} "vect" { target vect_int } } } */ +/* { dg-final { scan-tree-dump "using an index-based overlap test" "vect" } } */ +/* { dg-final { scan-tree-dump-not "using an address-based" "vect" } } */ Index: gcc/testsuite/gcc.dg/vect/vect-alias-check-20.c =================================================================== --- /dev/null 2019-09-17 11:41:18.176664108 +0100 +++ gcc/testsuite/gcc.dg/vect/vect-alias-check-20.c 2019-11-11 18:31:05.099091743 +0000 @@ -0,0 +1,66 @@ +#define N 200 +#define DIST 32 + +typedef signed char sc; +typedef unsigned char uc; +typedef signed short ss; +typedef unsigned short us; +typedef int si; +typedef unsigned int ui; +typedef signed long long sll; +typedef unsigned long long ull; + +#define FOR_EACH_TYPE(M) \ + M (sc) M (uc) \ + M (ss) M (us) \ + M (si) M (ui) \ + M (sll) M (ull) \ + M (float) M (double) + +#define TEST_VALUE(I) ((I) * 11 / 2) + +#define ADD_TEST(TYPE) \ + TYPE a_##TYPE[N * 2]; \ + TYPE __attribute__((noinline, noclone)) \ + test_##TYPE (int x, int y) \ + { \ + TYPE res = 0; \ + for (int i = 0; i < N; ++i) \ + { \ + a_##TYPE[i + x] = i; \ + res += a_##TYPE[i + y]; \ + } \ + return res; \ + } + +#define DO_TEST(TYPE) \ + for (int i = 0; i < DIST * 2; ++i) \ + { \ + for (int j = 0; j < N + DIST * 2; ++j) \ + a_##TYPE[j] = TEST_VALUE (j); \ + TYPE res = test_##TYPE (DIST, i); \ + for (int j = 0; j < N; ++j) \ + if (a_##TYPE[j + DIST] != (TYPE) j) \ + __builtin_abort (); \ + TYPE expected_res = 0; \ + for (int j = i; j < i + N; ++j) \ + if (i <= DIST && j >= DIST && j < DIST + N) \ + expected_res += j - DIST; \ + else \ + expected_res += TEST_VALUE (j); \ + if (expected_res != res) \ + __builtin_abort (); \ + } + +FOR_EACH_TYPE (ADD_TEST) + +int +main (void) +{ + FOR_EACH_TYPE (DO_TEST) + return 0; +} + +/* { dg-final { scan-tree-dump {flags: *RAW\n} "vect" { target vect_int } } } */ +/* { dg-final { scan-tree-dump "using an index-based overlap test" "vect" } } */ +/* { dg-final { scan-tree-dump-not "using an address-based" "vect" } } */