From patchwork Wed May 17 21:41:44 2017
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Thomas Koenig
X-Patchwork-Id: 763775
To: "fortran@gcc.gnu.org", gcc-patches
From: Thomas Koenig
Subject: [patch, fortran] Handle MATMUL(TRANSPOSE(A),B) in inline matmul
Message-ID: <619ade24-e4ce-82b0-9214-533feb74941c@netcologne.de>
Date: Wed, 17 May 2017 23:41:44 +0200

Hello world,

after receiving no negative feedback on my RFC patch, I have decided
to submit it.

The attached patch handles MATMUL(TRANSPOSE(A),B) when inlining MATMUL.
The inlined code is a bit faster than the library version.

Regression-tested.  OK for trunk?

Regards

	Thomas

2017-05-17  Thomas Koenig

	PR fortran/66094
	* frontend-passes.c (matrix_case): Add A2TB2.
	(inline_limit_check): Handle MATMUL(TRANSPOSE(A),B).
	(inline_matmul_assign): Likewise.

2017-05-17  Thomas Koenig

	PR fortran/66094
	* gfortran.dg/inline_matmul_16.f90: New test.

Index: frontend-passes.c
===================================================================
--- frontend-passes.c	(revision 247809)
+++ frontend-passes.c	(working copy)
@@ -112,7 +112,7 @@ static int var_num = 1;
 
 /* What sort of matrix we are dealing with when inlining MATMUL.  */
 
-enum matrix_case { none=0, A2B2, A2B1, A1B2, A2B2T };
+enum matrix_case { none=0, A2B2, A2B1, A1B2, A2B2T, A2TB2 };
 
 /* Keep track of the number of expressions we have inserted so far
    using create_var.  */
@@ -2252,7 +2252,7 @@ inline_limit_check (gfc_expr *a, gfc_expr *b, enum
   gfc_typespec ts;
   gfc_expr *cond;
 
-  gcc_assert (m_case == A2B2 || m_case == A2B2T);
+  gcc_assert (m_case == A2B2 || m_case == A2B2T || m_case == A2TB2);
 
   /* Calculation is done in real to avoid integer overflow.  */
 
@@ -2425,6 +2425,20 @@ matmul_lhs_realloc (gfc_expr *c, gfc_expr *a, gfc_
       cond = build_logical_expr (INTRINSIC_OR, ne1, ne2);
       break;
 
+    case A2TB2:
+
+      ar->start[0] = get_array_inq_function (GFC_ISYM_SIZE, a, 2);
+      ar->start[1] = get_array_inq_function (GFC_ISYM_SIZE, b, 2);
+
+      ne1 = build_logical_expr (INTRINSIC_NE,
+                                get_array_inq_function (GFC_ISYM_SIZE, c, 1),
+                                get_array_inq_function (GFC_ISYM_SIZE, a, 2));
+      ne2 = build_logical_expr (INTRINSIC_NE,
+                                get_array_inq_function (GFC_ISYM_SIZE, c, 2),
+                                get_array_inq_function (GFC_ISYM_SIZE, b, 2));
+      cond = build_logical_expr (INTRINSIC_OR, ne1, ne2);
+      break;
+
     case A2B1:
       ar->start[0] = get_array_inq_function (GFC_ISYM_SIZE, a, 1);
       cond = build_logical_expr (INTRINSIC_NE,
@@ -3009,7 +3023,7 @@ inline_matmul_assign (gfc_code **c, int *walk_subt
 
   a = expr2->value.function.actual;
   matrix_a = check_conjg_transpose_variable (a->expr, &conjg_a, &transpose_a);
-  if (transpose_a || matrix_a == NULL)
+  if (matrix_a == NULL)
     return 0;
 
   b = a->next;
@@ -3026,27 +3040,36 @@ inline_matmul_assign (gfc_code **c, int *walk_subt
       || gfc_check_dependency (expr1, matrix_b, true))
     return 0;
 
+  m_case = none;
   if (matrix_a->rank == 2)
     {
-      if (matrix_b->rank == 1)
-        m_case = A2B1;
+      if (transpose_a)
+        {
+          if (matrix_b->rank == 2 && !transpose_b)
+            m_case = A2TB2;
+        }
       else
         {
-          if (transpose_b)
-            m_case = A2B2T;
-          else
-            m_case = A2B2;
+          if (matrix_b->rank == 1)
+            m_case = A2B1;
+          else /* matrix_b->rank == 2 */
+            {
+              if (transpose_b)
+                m_case = A2B2T;
+              else
+                m_case = A2B2;
+            }
         }
     }
-  else
+  else /* matrix_a->rank == 1 */
     {
-      /* Vector * Transpose(B) not handled yet.  */
-      if (transpose_b)
-        m_case = none;
-      else
-        m_case = A1B2;
+      if (matrix_b->rank == 2)
+        {
+          if (!transpose_b)
+            m_case = A1B2;
+        }
     }
-
+
   if (m_case == none)
     return 0;
 
@@ -3250,6 +3273,37 @@ inline_matmul_assign (gfc_code **c, int *walk_subt
           next_code_point = &test->next;
 
         }
+
+      if (m_case == A2TB2)
+        {
+          c1 = get_array_inq_function (GFC_ISYM_SIZE, expr1, 1);
+          a2 = get_array_inq_function (GFC_ISYM_SIZE, matrix_a, 2);
+
+          test = runtime_error_ne (c1, a2, "Incorrect extent in return array in "
+                                   "MATMUL intrinsic for dimension 1: "
+                                   "is %ld, should be %ld");
+
+          *next_code_point = test;
+          next_code_point = &test->next;
+
+          c2 = get_array_inq_function (GFC_ISYM_SIZE, expr1, 2);
+          b2 = get_array_inq_function (GFC_ISYM_SIZE, matrix_b, 2);
+          test = runtime_error_ne (c2, b2, "Incorrect extent in return array in "
+                                   "MATMUL intrinsic for dimension 2: "
+                                   "is %ld, should be %ld");
+          *next_code_point = test;
+          next_code_point = &test->next;
+
+          a1 = get_array_inq_function (GFC_ISYM_SIZE, matrix_a, 1);
+          b1 = get_array_inq_function (GFC_ISYM_SIZE, matrix_b, 1);
+
+          test = runtime_error_ne (b1, a1, "Incorrect extent in argument B in "
+                                   "MATMUL intrinsic for dimension 1: "
+                                   "is %ld, should be %ld");
+          *next_code_point = test;
+          next_code_point = &test->next;
+
+        }
     }
 
   *next_code_point = assign_zero;
@@ -3331,6 +3385,39 @@ inline_matmul_assign (gfc_code **c, int *walk_subt
 
       break;
 
+    case A2TB2:
+      inline_limit_check (matrix_a, matrix_b, m_case);
+
+      u1 = get_size_m1 (matrix_a, 2);
+      u2 = get_size_m1 (matrix_b, 2);
+      u3 = get_size_m1 (matrix_a, 1);
+
+      do_1 = create_do_loop (gfc_copy_expr (zero), u1, NULL, &co->loc, ns);
+      do_2 = create_do_loop (gfc_copy_expr (zero), u2, NULL, &co->loc, ns);
+      do_3 = create_do_loop (gfc_copy_expr (zero), u3, NULL, &co->loc, ns);
+
+      do_1->block->next = do_2;
+      do_2->block->next = do_3;
+      do_3->block->next = assign_matmul;
+
+      var_1 = do_1->ext.iterator->var;
+      var_2 = do_2->ext.iterator->var;
+      var_3 = do_3->ext.iterator->var;
+
+      list[0] = var_1;
+      list[1] = var_2;
+      cscalar = scalarized_expr (co->expr1, list, 2);
+
+      list[0] = var_3;
+      list[1] = var_1;
+      ascalar = scalarized_expr (matrix_a, list, 2);
+
+      list[0] = var_3;
+      list[1] = var_2;
+      bscalar = scalarized_expr (matrix_b, list, 2);
+
+      break;
+
     case A2B1:
       u1 = get_size_m1 (matrix_b, 1);
       u2 = get_size_m1 (matrix_a, 1);
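
Editorial note, not part of the submitted patch: the A2TB2 case above builds three nested DO loops (do_1 over SIZE(A,2), do_2 over SIZE(B,2), do_3 over SIZE(A,1)) around an assignment to c(var_1,var_2) using a(var_3,var_1) and b(var_3,var_2). The following is a minimal C sketch of what that loop nest appears to compute, assuming zero-based indices and flat column-major storage. The function name matmul_at_b and the IDX macro are hypothetical and exist only for illustration; they are not GCC code.

```c
#include <assert.h>
#include <stddef.h>

/* Column-major (Fortran-order) indexing: element (i, j) of a matrix with
   `rows` rows is stored at offset i + j * rows.  Hypothetical helper.  */
#define IDX(i, j, rows) ((i) + (j) * (rows))

/* Hypothetical illustration of C = MATMUL(TRANSPOSE(A), B).
   A is k x m, B is k x n, C is m x n, all stored column-major.  */
void
matmul_at_b (size_t m, size_t n, size_t k,
             const double *a, const double *b, double *c)
{
  assert (a != NULL && b != NULL && c != NULL);

  /* Corresponds to the assign_zero step: clear the result first.  */
  for (size_t idx = 0; idx < m * n; idx++)
    c[idx] = 0.0;

  for (size_t i = 0; i < m; i++)        /* do_1: 0 .. SIZE(A,2)-1 */
    for (size_t j = 0; j < n; j++)      /* do_2: 0 .. SIZE(B,2)-1 */
      for (size_t l = 0; l < k; l++)    /* do_3: 0 .. SIZE(A,1)-1 */
        /* c(i,j) = c(i,j) + a(l,i) * b(l,j) */
        c[IDX (i, j, m)] += a[IDX (l, i, k)] * b[IDX (l, j, k)];
}
```

In this sketch the innermost index l is the first subscript of both a and b, so with column-major storage both operands are read with unit stride in the inner loop. Whether that is the reason for the loop order chosen in the patch is an assumption; the message only reports that the inlined version is a bit faster than the library one.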