From patchwork Fri Apr 6 16:52:36 2018
X-Patchwork-Submitter: Thomas König
X-Patchwork-Id: 895764
To: "fortran@gcc.gnu.org", gcc-patches
From: Thomas König
Subject: [patch, libfortran] Fix PR 85253, buffer overrun in matmul
Message-ID: <5fcb79b6-415f-7b0e-7ff3-2bbd5610ad68@tkoenig.net>
Date: Fri, 6 Apr 2018 18:52:36 +0200

Hello world,

the attached patch fixes a buffer overrun in matmul, a GCC 8 regression.
There is no test case, since the overrun was only detectable with the
address sanitizer or with valgrind.

Regression-tested on trunk.  OK?

Regards

        Thomas

2018-04-06  Thomas Koenig

        PR libfortran/85253
        * m4/matmul_internal.m4: If ycount == 1, add one more row to
        the internal buffer.
        * generated/matmul_c10.c: Regenerated.
* generated/matmul_c16.c: Regenerated. * generated/matmul_c4.c: Regenerated. * generated/matmul_c8.c: Regenerated. * generated/matmul_i1.c: Regenerated. * generated/matmul_i16.c: Regenerated. * generated/matmul_i2.c: Regenerated. * generated/matmul_i4.c: Regenerated. * generated/matmul_i8.c: Regenerated. * generated/matmul_r10.c: Regenerated. * generated/matmul_r16.c: Regenerated. * generated/matmul_r4.c: Regenerated. * generated/matmul_r8.c: Regenerated. * generated/matmulavx128_c10.c: Regenerated. * generated/matmulavx128_c16.c: Regenerated. * generated/matmulavx128_c4.c: Regenerated. * generated/matmulavx128_c8.c: Regenerated. * generated/matmulavx128_i1.c: Regenerated. * generated/matmulavx128_i16.c: Regenerated. * generated/matmulavx128_i2.c: Regenerated. * generated/matmulavx128_i4.c: Regenerated. * generated/matmulavx128_i8.c: Regenerated. * generated/matmulavx128_r10.c: Regenerated. * generated/matmulavx128_r16.c: Regenerated. * generated/matmulavx128_r4.c: Regenerated. * generated/matmulavx128_r8.c: Regenerated. Index: m4/matmul_internal.m4 =================================================================== --- m4/matmul_internal.m4 (Revision 259152) +++ m4/matmul_internal.m4 (Arbeitskopie) @@ -234,7 +234,7 @@ sinclude(`matmul_asm_'rtype_code`.m4')dnl /* Adjust size of t1 to what is needed. */ index_type t1_dim; - t1_dim = (a_dim1-1) * 256 + b_dim1; + t1_dim = (a_dim1 - (ycount > 1)) * 256 + b_dim1; if (t1_dim > 65536) t1_dim = 65536; Index: generated/matmul_c10.c =================================================================== --- generated/matmul_c10.c (Revision 259152) +++ generated/matmul_c10.c (Arbeitskopie) @@ -318,7 +318,7 @@ matmul_c10_avx (gfc_array_c10 * const restrict ret /* Adjust size of t1 to what is needed. 
*/ index_type t1_dim; - t1_dim = (a_dim1-1) * 256 + b_dim1; + t1_dim = (a_dim1 - (ycount > 1)) * 256 + b_dim1; if (t1_dim > 65536) t1_dim = 65536; @@ -870,7 +870,7 @@ matmul_c10_avx2 (gfc_array_c10 * const restrict re /* Adjust size of t1 to what is needed. */ index_type t1_dim; - t1_dim = (a_dim1-1) * 256 + b_dim1; + t1_dim = (a_dim1 - (ycount > 1)) * 256 + b_dim1; if (t1_dim > 65536) t1_dim = 65536; @@ -1422,7 +1422,7 @@ matmul_c10_avx512f (gfc_array_c10 * const restrict /* Adjust size of t1 to what is needed. */ index_type t1_dim; - t1_dim = (a_dim1-1) * 256 + b_dim1; + t1_dim = (a_dim1 - (ycount > 1)) * 256 + b_dim1; if (t1_dim > 65536) t1_dim = 65536; @@ -1988,7 +1988,7 @@ matmul_c10_vanilla (gfc_array_c10 * const restrict /* Adjust size of t1 to what is needed. */ index_type t1_dim; - t1_dim = (a_dim1-1) * 256 + b_dim1; + t1_dim = (a_dim1 - (ycount > 1)) * 256 + b_dim1; if (t1_dim > 65536) t1_dim = 65536; @@ -2614,7 +2614,7 @@ matmul_c10 (gfc_array_c10 * const restrict retarra /* Adjust size of t1 to what is needed. */ index_type t1_dim; - t1_dim = (a_dim1-1) * 256 + b_dim1; + t1_dim = (a_dim1 - (ycount > 1)) * 256 + b_dim1; if (t1_dim > 65536) t1_dim = 65536; Index: generated/matmul_c16.c =================================================================== --- generated/matmul_c16.c (Revision 259152) +++ generated/matmul_c16.c (Arbeitskopie) @@ -318,7 +318,7 @@ matmul_c16_avx (gfc_array_c16 * const restrict ret /* Adjust size of t1 to what is needed. */ index_type t1_dim; - t1_dim = (a_dim1-1) * 256 + b_dim1; + t1_dim = (a_dim1 - (ycount > 1)) * 256 + b_dim1; if (t1_dim > 65536) t1_dim = 65536; @@ -870,7 +870,7 @@ matmul_c16_avx2 (gfc_array_c16 * const restrict re /* Adjust size of t1 to what is needed. */ index_type t1_dim; - t1_dim = (a_dim1-1) * 256 + b_dim1; + t1_dim = (a_dim1 - (ycount > 1)) * 256 + b_dim1; if (t1_dim > 65536) t1_dim = 65536; @@ -1422,7 +1422,7 @@ matmul_c16_avx512f (gfc_array_c16 * const restrict /* Adjust size of t1 to what is needed. 
*/ index_type t1_dim; - t1_dim = (a_dim1-1) * 256 + b_dim1; + t1_dim = (a_dim1 - (ycount > 1)) * 256 + b_dim1; if (t1_dim > 65536) t1_dim = 65536; @@ -1988,7 +1988,7 @@ matmul_c16_vanilla (gfc_array_c16 * const restrict /* Adjust size of t1 to what is needed. */ index_type t1_dim; - t1_dim = (a_dim1-1) * 256 + b_dim1; + t1_dim = (a_dim1 - (ycount > 1)) * 256 + b_dim1; if (t1_dim > 65536) t1_dim = 65536; @@ -2614,7 +2614,7 @@ matmul_c16 (gfc_array_c16 * const restrict retarra /* Adjust size of t1 to what is needed. */ index_type t1_dim; - t1_dim = (a_dim1-1) * 256 + b_dim1; + t1_dim = (a_dim1 - (ycount > 1)) * 256 + b_dim1; if (t1_dim > 65536) t1_dim = 65536; Index: generated/matmul_c4.c =================================================================== --- generated/matmul_c4.c (Revision 259152) +++ generated/matmul_c4.c (Arbeitskopie) @@ -318,7 +318,7 @@ matmul_c4_avx (gfc_array_c4 * const restrict retar /* Adjust size of t1 to what is needed. */ index_type t1_dim; - t1_dim = (a_dim1-1) * 256 + b_dim1; + t1_dim = (a_dim1 - (ycount > 1)) * 256 + b_dim1; if (t1_dim > 65536) t1_dim = 65536; @@ -870,7 +870,7 @@ matmul_c4_avx2 (gfc_array_c4 * const restrict reta /* Adjust size of t1 to what is needed. */ index_type t1_dim; - t1_dim = (a_dim1-1) * 256 + b_dim1; + t1_dim = (a_dim1 - (ycount > 1)) * 256 + b_dim1; if (t1_dim > 65536) t1_dim = 65536; @@ -1422,7 +1422,7 @@ matmul_c4_avx512f (gfc_array_c4 * const restrict r /* Adjust size of t1 to what is needed. */ index_type t1_dim; - t1_dim = (a_dim1-1) * 256 + b_dim1; + t1_dim = (a_dim1 - (ycount > 1)) * 256 + b_dim1; if (t1_dim > 65536) t1_dim = 65536; @@ -1988,7 +1988,7 @@ matmul_c4_vanilla (gfc_array_c4 * const restrict r /* Adjust size of t1 to what is needed. */ index_type t1_dim; - t1_dim = (a_dim1-1) * 256 + b_dim1; + t1_dim = (a_dim1 - (ycount > 1)) * 256 + b_dim1; if (t1_dim > 65536) t1_dim = 65536; @@ -2614,7 +2614,7 @@ matmul_c4 (gfc_array_c4 * const restrict retarray, /* Adjust size of t1 to what is needed. 
*/ index_type t1_dim; - t1_dim = (a_dim1-1) * 256 + b_dim1; + t1_dim = (a_dim1 - (ycount > 1)) * 256 + b_dim1; if (t1_dim > 65536) t1_dim = 65536; Index: generated/matmul_c8.c =================================================================== --- generated/matmul_c8.c (Revision 259152) +++ generated/matmul_c8.c (Arbeitskopie) @@ -318,7 +318,7 @@ matmul_c8_avx (gfc_array_c8 * const restrict retar /* Adjust size of t1 to what is needed. */ index_type t1_dim; - t1_dim = (a_dim1-1) * 256 + b_dim1; + t1_dim = (a_dim1 - (ycount > 1)) * 256 + b_dim1; if (t1_dim > 65536) t1_dim = 65536; @@ -870,7 +870,7 @@ matmul_c8_avx2 (gfc_array_c8 * const restrict reta /* Adjust size of t1 to what is needed. */ index_type t1_dim; - t1_dim = (a_dim1-1) * 256 + b_dim1; + t1_dim = (a_dim1 - (ycount > 1)) * 256 + b_dim1; if (t1_dim > 65536) t1_dim = 65536; @@ -1422,7 +1422,7 @@ matmul_c8_avx512f (gfc_array_c8 * const restrict r /* Adjust size of t1 to what is needed. */ index_type t1_dim; - t1_dim = (a_dim1-1) * 256 + b_dim1; + t1_dim = (a_dim1 - (ycount > 1)) * 256 + b_dim1; if (t1_dim > 65536) t1_dim = 65536; @@ -1988,7 +1988,7 @@ matmul_c8_vanilla (gfc_array_c8 * const restrict r /* Adjust size of t1 to what is needed. */ index_type t1_dim; - t1_dim = (a_dim1-1) * 256 + b_dim1; + t1_dim = (a_dim1 - (ycount > 1)) * 256 + b_dim1; if (t1_dim > 65536) t1_dim = 65536; @@ -2614,7 +2614,7 @@ matmul_c8 (gfc_array_c8 * const restrict retarray, /* Adjust size of t1 to what is needed. */ index_type t1_dim; - t1_dim = (a_dim1-1) * 256 + b_dim1; + t1_dim = (a_dim1 - (ycount > 1)) * 256 + b_dim1; if (t1_dim > 65536) t1_dim = 65536; Index: generated/matmul_i1.c =================================================================== --- generated/matmul_i1.c (Revision 259152) +++ generated/matmul_i1.c (Arbeitskopie) @@ -318,7 +318,7 @@ matmul_i1_avx (gfc_array_i1 * const restrict retar /* Adjust size of t1 to what is needed. 
*/ index_type t1_dim; - t1_dim = (a_dim1-1) * 256 + b_dim1; + t1_dim = (a_dim1 - (ycount > 1)) * 256 + b_dim1; if (t1_dim > 65536) t1_dim = 65536; @@ -870,7 +870,7 @@ matmul_i1_avx2 (gfc_array_i1 * const restrict reta /* Adjust size of t1 to what is needed. */ index_type t1_dim; - t1_dim = (a_dim1-1) * 256 + b_dim1; + t1_dim = (a_dim1 - (ycount > 1)) * 256 + b_dim1; if (t1_dim > 65536) t1_dim = 65536; @@ -1422,7 +1422,7 @@ matmul_i1_avx512f (gfc_array_i1 * const restrict r /* Adjust size of t1 to what is needed. */ index_type t1_dim; - t1_dim = (a_dim1-1) * 256 + b_dim1; + t1_dim = (a_dim1 - (ycount > 1)) * 256 + b_dim1; if (t1_dim > 65536) t1_dim = 65536; @@ -1988,7 +1988,7 @@ matmul_i1_vanilla (gfc_array_i1 * const restrict r /* Adjust size of t1 to what is needed. */ index_type t1_dim; - t1_dim = (a_dim1-1) * 256 + b_dim1; + t1_dim = (a_dim1 - (ycount > 1)) * 256 + b_dim1; if (t1_dim > 65536) t1_dim = 65536; @@ -2614,7 +2614,7 @@ matmul_i1 (gfc_array_i1 * const restrict retarray, /* Adjust size of t1 to what is needed. */ index_type t1_dim; - t1_dim = (a_dim1-1) * 256 + b_dim1; + t1_dim = (a_dim1 - (ycount > 1)) * 256 + b_dim1; if (t1_dim > 65536) t1_dim = 65536; Index: generated/matmul_i16.c =================================================================== --- generated/matmul_i16.c (Revision 259152) +++ generated/matmul_i16.c (Arbeitskopie) @@ -318,7 +318,7 @@ matmul_i16_avx (gfc_array_i16 * const restrict ret /* Adjust size of t1 to what is needed. */ index_type t1_dim; - t1_dim = (a_dim1-1) * 256 + b_dim1; + t1_dim = (a_dim1 - (ycount > 1)) * 256 + b_dim1; if (t1_dim > 65536) t1_dim = 65536; @@ -870,7 +870,7 @@ matmul_i16_avx2 (gfc_array_i16 * const restrict re /* Adjust size of t1 to what is needed. */ index_type t1_dim; - t1_dim = (a_dim1-1) * 256 + b_dim1; + t1_dim = (a_dim1 - (ycount > 1)) * 256 + b_dim1; if (t1_dim > 65536) t1_dim = 65536; @@ -1422,7 +1422,7 @@ matmul_i16_avx512f (gfc_array_i16 * const restrict /* Adjust size of t1 to what is needed. 
*/ index_type t1_dim; - t1_dim = (a_dim1-1) * 256 + b_dim1; + t1_dim = (a_dim1 - (ycount > 1)) * 256 + b_dim1; if (t1_dim > 65536) t1_dim = 65536; @@ -1988,7 +1988,7 @@ matmul_i16_vanilla (gfc_array_i16 * const restrict /* Adjust size of t1 to what is needed. */ index_type t1_dim; - t1_dim = (a_dim1-1) * 256 + b_dim1; + t1_dim = (a_dim1 - (ycount > 1)) * 256 + b_dim1; if (t1_dim > 65536) t1_dim = 65536; @@ -2614,7 +2614,7 @@ matmul_i16 (gfc_array_i16 * const restrict retarra /* Adjust size of t1 to what is needed. */ index_type t1_dim; - t1_dim = (a_dim1-1) * 256 + b_dim1; + t1_dim = (a_dim1 - (ycount > 1)) * 256 + b_dim1; if (t1_dim > 65536) t1_dim = 65536; Index: generated/matmul_i2.c =================================================================== --- generated/matmul_i2.c (Revision 259152) +++ generated/matmul_i2.c (Arbeitskopie) @@ -318,7 +318,7 @@ matmul_i2_avx (gfc_array_i2 * const restrict retar /* Adjust size of t1 to what is needed. */ index_type t1_dim; - t1_dim = (a_dim1-1) * 256 + b_dim1; + t1_dim = (a_dim1 - (ycount > 1)) * 256 + b_dim1; if (t1_dim > 65536) t1_dim = 65536; @@ -870,7 +870,7 @@ matmul_i2_avx2 (gfc_array_i2 * const restrict reta /* Adjust size of t1 to what is needed. */ index_type t1_dim; - t1_dim = (a_dim1-1) * 256 + b_dim1; + t1_dim = (a_dim1 - (ycount > 1)) * 256 + b_dim1; if (t1_dim > 65536) t1_dim = 65536; @@ -1422,7 +1422,7 @@ matmul_i2_avx512f (gfc_array_i2 * const restrict r /* Adjust size of t1 to what is needed. */ index_type t1_dim; - t1_dim = (a_dim1-1) * 256 + b_dim1; + t1_dim = (a_dim1 - (ycount > 1)) * 256 + b_dim1; if (t1_dim > 65536) t1_dim = 65536; @@ -1988,7 +1988,7 @@ matmul_i2_vanilla (gfc_array_i2 * const restrict r /* Adjust size of t1 to what is needed. */ index_type t1_dim; - t1_dim = (a_dim1-1) * 256 + b_dim1; + t1_dim = (a_dim1 - (ycount > 1)) * 256 + b_dim1; if (t1_dim > 65536) t1_dim = 65536; @@ -2614,7 +2614,7 @@ matmul_i2 (gfc_array_i2 * const restrict retarray, /* Adjust size of t1 to what is needed. 
*/ index_type t1_dim; - t1_dim = (a_dim1-1) * 256 + b_dim1; + t1_dim = (a_dim1 - (ycount > 1)) * 256 + b_dim1; if (t1_dim > 65536) t1_dim = 65536; Index: generated/matmul_i4.c =================================================================== --- generated/matmul_i4.c (Revision 259152) +++ generated/matmul_i4.c (Arbeitskopie) @@ -318,7 +318,7 @@ matmul_i4_avx (gfc_array_i4 * const restrict retar /* Adjust size of t1 to what is needed. */ index_type t1_dim; - t1_dim = (a_dim1-1) * 256 + b_dim1; + t1_dim = (a_dim1 - (ycount > 1)) * 256 + b_dim1; if (t1_dim > 65536) t1_dim = 65536; @@ -870,7 +870,7 @@ matmul_i4_avx2 (gfc_array_i4 * const restrict reta /* Adjust size of t1 to what is needed. */ index_type t1_dim; - t1_dim = (a_dim1-1) * 256 + b_dim1; + t1_dim = (a_dim1 - (ycount > 1)) * 256 + b_dim1; if (t1_dim > 65536) t1_dim = 65536; @@ -1422,7 +1422,7 @@ matmul_i4_avx512f (gfc_array_i4 * const restrict r /* Adjust size of t1 to what is needed. */ index_type t1_dim; - t1_dim = (a_dim1-1) * 256 + b_dim1; + t1_dim = (a_dim1 - (ycount > 1)) * 256 + b_dim1; if (t1_dim > 65536) t1_dim = 65536; @@ -1988,7 +1988,7 @@ matmul_i4_vanilla (gfc_array_i4 * const restrict r /* Adjust size of t1 to what is needed. */ index_type t1_dim; - t1_dim = (a_dim1-1) * 256 + b_dim1; + t1_dim = (a_dim1 - (ycount > 1)) * 256 + b_dim1; if (t1_dim > 65536) t1_dim = 65536; @@ -2614,7 +2614,7 @@ matmul_i4 (gfc_array_i4 * const restrict retarray, /* Adjust size of t1 to what is needed. */ index_type t1_dim; - t1_dim = (a_dim1-1) * 256 + b_dim1; + t1_dim = (a_dim1 - (ycount > 1)) * 256 + b_dim1; if (t1_dim > 65536) t1_dim = 65536; Index: generated/matmul_i8.c =================================================================== --- generated/matmul_i8.c (Revision 259152) +++ generated/matmul_i8.c (Arbeitskopie) @@ -318,7 +318,7 @@ matmul_i8_avx (gfc_array_i8 * const restrict retar /* Adjust size of t1 to what is needed. 
*/ index_type t1_dim; - t1_dim = (a_dim1-1) * 256 + b_dim1; + t1_dim = (a_dim1 - (ycount > 1)) * 256 + b_dim1; if (t1_dim > 65536) t1_dim = 65536; @@ -870,7 +870,7 @@ matmul_i8_avx2 (gfc_array_i8 * const restrict reta /* Adjust size of t1 to what is needed. */ index_type t1_dim; - t1_dim = (a_dim1-1) * 256 + b_dim1; + t1_dim = (a_dim1 - (ycount > 1)) * 256 + b_dim1; if (t1_dim > 65536) t1_dim = 65536; @@ -1422,7 +1422,7 @@ matmul_i8_avx512f (gfc_array_i8 * const restrict r /* Adjust size of t1 to what is needed. */ index_type t1_dim; - t1_dim = (a_dim1-1) * 256 + b_dim1; + t1_dim = (a_dim1 - (ycount > 1)) * 256 + b_dim1; if (t1_dim > 65536) t1_dim = 65536; @@ -1988,7 +1988,7 @@ matmul_i8_vanilla (gfc_array_i8 * const restrict r /* Adjust size of t1 to what is needed. */ index_type t1_dim; - t1_dim = (a_dim1-1) * 256 + b_dim1; + t1_dim = (a_dim1 - (ycount > 1)) * 256 + b_dim1; if (t1_dim > 65536) t1_dim = 65536; @@ -2614,7 +2614,7 @@ matmul_i8 (gfc_array_i8 * const restrict retarray, /* Adjust size of t1 to what is needed. */ index_type t1_dim; - t1_dim = (a_dim1-1) * 256 + b_dim1; + t1_dim = (a_dim1 - (ycount > 1)) * 256 + b_dim1; if (t1_dim > 65536) t1_dim = 65536; Index: generated/matmul_r10.c =================================================================== --- generated/matmul_r10.c (Revision 259152) +++ generated/matmul_r10.c (Arbeitskopie) @@ -318,7 +318,7 @@ matmul_r10_avx (gfc_array_r10 * const restrict ret /* Adjust size of t1 to what is needed. */ index_type t1_dim; - t1_dim = (a_dim1-1) * 256 + b_dim1; + t1_dim = (a_dim1 - (ycount > 1)) * 256 + b_dim1; if (t1_dim > 65536) t1_dim = 65536; @@ -870,7 +870,7 @@ matmul_r10_avx2 (gfc_array_r10 * const restrict re /* Adjust size of t1 to what is needed. */ index_type t1_dim; - t1_dim = (a_dim1-1) * 256 + b_dim1; + t1_dim = (a_dim1 - (ycount > 1)) * 256 + b_dim1; if (t1_dim > 65536) t1_dim = 65536; @@ -1422,7 +1422,7 @@ matmul_r10_avx512f (gfc_array_r10 * const restrict /* Adjust size of t1 to what is needed. 
*/ index_type t1_dim; - t1_dim = (a_dim1-1) * 256 + b_dim1; + t1_dim = (a_dim1 - (ycount > 1)) * 256 + b_dim1; if (t1_dim > 65536) t1_dim = 65536; @@ -1988,7 +1988,7 @@ matmul_r10_vanilla (gfc_array_r10 * const restrict /* Adjust size of t1 to what is needed. */ index_type t1_dim; - t1_dim = (a_dim1-1) * 256 + b_dim1; + t1_dim = (a_dim1 - (ycount > 1)) * 256 + b_dim1; if (t1_dim > 65536) t1_dim = 65536; @@ -2614,7 +2614,7 @@ matmul_r10 (gfc_array_r10 * const restrict retarra /* Adjust size of t1 to what is needed. */ index_type t1_dim; - t1_dim = (a_dim1-1) * 256 + b_dim1; + t1_dim = (a_dim1 - (ycount > 1)) * 256 + b_dim1; if (t1_dim > 65536) t1_dim = 65536; Index: generated/matmul_r16.c =================================================================== --- generated/matmul_r16.c (Revision 259152) +++ generated/matmul_r16.c (Arbeitskopie) @@ -318,7 +318,7 @@ matmul_r16_avx (gfc_array_r16 * const restrict ret /* Adjust size of t1 to what is needed. */ index_type t1_dim; - t1_dim = (a_dim1-1) * 256 + b_dim1; + t1_dim = (a_dim1 - (ycount > 1)) * 256 + b_dim1; if (t1_dim > 65536) t1_dim = 65536; @@ -870,7 +870,7 @@ matmul_r16_avx2 (gfc_array_r16 * const restrict re /* Adjust size of t1 to what is needed. */ index_type t1_dim; - t1_dim = (a_dim1-1) * 256 + b_dim1; + t1_dim = (a_dim1 - (ycount > 1)) * 256 + b_dim1; if (t1_dim > 65536) t1_dim = 65536; @@ -1422,7 +1422,7 @@ matmul_r16_avx512f (gfc_array_r16 * const restrict /* Adjust size of t1 to what is needed. */ index_type t1_dim; - t1_dim = (a_dim1-1) * 256 + b_dim1; + t1_dim = (a_dim1 - (ycount > 1)) * 256 + b_dim1; if (t1_dim > 65536) t1_dim = 65536; @@ -1988,7 +1988,7 @@ matmul_r16_vanilla (gfc_array_r16 * const restrict /* Adjust size of t1 to what is needed. 
*/ index_type t1_dim; - t1_dim = (a_dim1-1) * 256 + b_dim1; + t1_dim = (a_dim1 - (ycount > 1)) * 256 + b_dim1; if (t1_dim > 65536) t1_dim = 65536; @@ -2614,7 +2614,7 @@ matmul_r16 (gfc_array_r16 * const restrict retarra /* Adjust size of t1 to what is needed. */ index_type t1_dim; - t1_dim = (a_dim1-1) * 256 + b_dim1; + t1_dim = (a_dim1 - (ycount > 1)) * 256 + b_dim1; if (t1_dim > 65536) t1_dim = 65536; Index: generated/matmul_r4.c =================================================================== --- generated/matmul_r4.c (Revision 259152) +++ generated/matmul_r4.c (Arbeitskopie) @@ -318,7 +318,7 @@ matmul_r4_avx (gfc_array_r4 * const restrict retar /* Adjust size of t1 to what is needed. */ index_type t1_dim; - t1_dim = (a_dim1-1) * 256 + b_dim1; + t1_dim = (a_dim1 - (ycount > 1)) * 256 + b_dim1; if (t1_dim > 65536) t1_dim = 65536; @@ -870,7 +870,7 @@ matmul_r4_avx2 (gfc_array_r4 * const restrict reta /* Adjust size of t1 to what is needed. */ index_type t1_dim; - t1_dim = (a_dim1-1) * 256 + b_dim1; + t1_dim = (a_dim1 - (ycount > 1)) * 256 + b_dim1; if (t1_dim > 65536) t1_dim = 65536; @@ -1422,7 +1422,7 @@ matmul_r4_avx512f (gfc_array_r4 * const restrict r /* Adjust size of t1 to what is needed. */ index_type t1_dim; - t1_dim = (a_dim1-1) * 256 + b_dim1; + t1_dim = (a_dim1 - (ycount > 1)) * 256 + b_dim1; if (t1_dim > 65536) t1_dim = 65536; @@ -1988,7 +1988,7 @@ matmul_r4_vanilla (gfc_array_r4 * const restrict r /* Adjust size of t1 to what is needed. */ index_type t1_dim; - t1_dim = (a_dim1-1) * 256 + b_dim1; + t1_dim = (a_dim1 - (ycount > 1)) * 256 + b_dim1; if (t1_dim > 65536) t1_dim = 65536; @@ -2614,7 +2614,7 @@ matmul_r4 (gfc_array_r4 * const restrict retarray, /* Adjust size of t1 to what is needed. 
*/ index_type t1_dim; - t1_dim = (a_dim1-1) * 256 + b_dim1; + t1_dim = (a_dim1 - (ycount > 1)) * 256 + b_dim1; if (t1_dim > 65536) t1_dim = 65536; Index: generated/matmul_r8.c =================================================================== --- generated/matmul_r8.c (Revision 259152) +++ generated/matmul_r8.c (Arbeitskopie) @@ -318,7 +318,7 @@ matmul_r8_avx (gfc_array_r8 * const restrict retar /* Adjust size of t1 to what is needed. */ index_type t1_dim; - t1_dim = (a_dim1-1) * 256 + b_dim1; + t1_dim = (a_dim1 - (ycount > 1)) * 256 + b_dim1; if (t1_dim > 65536) t1_dim = 65536; @@ -870,7 +870,7 @@ matmul_r8_avx2 (gfc_array_r8 * const restrict reta /* Adjust size of t1 to what is needed. */ index_type t1_dim; - t1_dim = (a_dim1-1) * 256 + b_dim1; + t1_dim = (a_dim1 - (ycount > 1)) * 256 + b_dim1; if (t1_dim > 65536) t1_dim = 65536; @@ -1422,7 +1422,7 @@ matmul_r8_avx512f (gfc_array_r8 * const restrict r /* Adjust size of t1 to what is needed. */ index_type t1_dim; - t1_dim = (a_dim1-1) * 256 + b_dim1; + t1_dim = (a_dim1 - (ycount > 1)) * 256 + b_dim1; if (t1_dim > 65536) t1_dim = 65536; @@ -1988,7 +1988,7 @@ matmul_r8_vanilla (gfc_array_r8 * const restrict r /* Adjust size of t1 to what is needed. */ index_type t1_dim; - t1_dim = (a_dim1-1) * 256 + b_dim1; + t1_dim = (a_dim1 - (ycount > 1)) * 256 + b_dim1; if (t1_dim > 65536) t1_dim = 65536; @@ -2614,7 +2614,7 @@ matmul_r8 (gfc_array_r8 * const restrict retarray, /* Adjust size of t1 to what is needed. */ index_type t1_dim; - t1_dim = (a_dim1-1) * 256 + b_dim1; + t1_dim = (a_dim1 - (ycount > 1)) * 256 + b_dim1; if (t1_dim > 65536) t1_dim = 65536; Index: generated/matmulavx128_c10.c =================================================================== --- generated/matmulavx128_c10.c (Revision 259152) +++ generated/matmulavx128_c10.c (Arbeitskopie) @@ -283,7 +283,7 @@ matmul_c10_avx128_fma3 (gfc_array_c10 * const rest /* Adjust size of t1 to what is needed. 
*/ index_type t1_dim; - t1_dim = (a_dim1-1) * 256 + b_dim1; + t1_dim = (a_dim1 - (ycount > 1)) * 256 + b_dim1; if (t1_dim > 65536) t1_dim = 65536; @@ -836,7 +836,7 @@ matmul_c10_avx128_fma4 (gfc_array_c10 * const rest /* Adjust size of t1 to what is needed. */ index_type t1_dim; - t1_dim = (a_dim1-1) * 256 + b_dim1; + t1_dim = (a_dim1 - (ycount > 1)) * 256 + b_dim1; if (t1_dim > 65536) t1_dim = 65536; Index: generated/matmulavx128_c16.c =================================================================== --- generated/matmulavx128_c16.c (Revision 259152) +++ generated/matmulavx128_c16.c (Arbeitskopie) @@ -283,7 +283,7 @@ matmul_c16_avx128_fma3 (gfc_array_c16 * const rest /* Adjust size of t1 to what is needed. */ index_type t1_dim; - t1_dim = (a_dim1-1) * 256 + b_dim1; + t1_dim = (a_dim1 - (ycount > 1)) * 256 + b_dim1; if (t1_dim > 65536) t1_dim = 65536; @@ -836,7 +836,7 @@ matmul_c16_avx128_fma4 (gfc_array_c16 * const rest /* Adjust size of t1 to what is needed. */ index_type t1_dim; - t1_dim = (a_dim1-1) * 256 + b_dim1; + t1_dim = (a_dim1 - (ycount > 1)) * 256 + b_dim1; if (t1_dim > 65536) t1_dim = 65536; Index: generated/matmulavx128_c4.c =================================================================== --- generated/matmulavx128_c4.c (Revision 259152) +++ generated/matmulavx128_c4.c (Arbeitskopie) @@ -283,7 +283,7 @@ matmul_c4_avx128_fma3 (gfc_array_c4 * const restri /* Adjust size of t1 to what is needed. */ index_type t1_dim; - t1_dim = (a_dim1-1) * 256 + b_dim1; + t1_dim = (a_dim1 - (ycount > 1)) * 256 + b_dim1; if (t1_dim > 65536) t1_dim = 65536; @@ -836,7 +836,7 @@ matmul_c4_avx128_fma4 (gfc_array_c4 * const restri /* Adjust size of t1 to what is needed. 
*/ index_type t1_dim; - t1_dim = (a_dim1-1) * 256 + b_dim1; + t1_dim = (a_dim1 - (ycount > 1)) * 256 + b_dim1; if (t1_dim > 65536) t1_dim = 65536; Index: generated/matmulavx128_c8.c =================================================================== --- generated/matmulavx128_c8.c (Revision 259152) +++ generated/matmulavx128_c8.c (Arbeitskopie) @@ -283,7 +283,7 @@ matmul_c8_avx128_fma3 (gfc_array_c8 * const restri /* Adjust size of t1 to what is needed. */ index_type t1_dim; - t1_dim = (a_dim1-1) * 256 + b_dim1; + t1_dim = (a_dim1 - (ycount > 1)) * 256 + b_dim1; if (t1_dim > 65536) t1_dim = 65536; @@ -836,7 +836,7 @@ matmul_c8_avx128_fma4 (gfc_array_c8 * const restri /* Adjust size of t1 to what is needed. */ index_type t1_dim; - t1_dim = (a_dim1-1) * 256 + b_dim1; + t1_dim = (a_dim1 - (ycount > 1)) * 256 + b_dim1; if (t1_dim > 65536) t1_dim = 65536; Index: generated/matmulavx128_i1.c =================================================================== --- generated/matmulavx128_i1.c (Revision 259152) +++ generated/matmulavx128_i1.c (Arbeitskopie) @@ -283,7 +283,7 @@ matmul_i1_avx128_fma3 (gfc_array_i1 * const restri /* Adjust size of t1 to what is needed. */ index_type t1_dim; - t1_dim = (a_dim1-1) * 256 + b_dim1; + t1_dim = (a_dim1 - (ycount > 1)) * 256 + b_dim1; if (t1_dim > 65536) t1_dim = 65536; @@ -836,7 +836,7 @@ matmul_i1_avx128_fma4 (gfc_array_i1 * const restri /* Adjust size of t1 to what is needed. */ index_type t1_dim; - t1_dim = (a_dim1-1) * 256 + b_dim1; + t1_dim = (a_dim1 - (ycount > 1)) * 256 + b_dim1; if (t1_dim > 65536) t1_dim = 65536; Index: generated/matmulavx128_i16.c =================================================================== --- generated/matmulavx128_i16.c (Revision 259152) +++ generated/matmulavx128_i16.c (Arbeitskopie) @@ -283,7 +283,7 @@ matmul_i16_avx128_fma3 (gfc_array_i16 * const rest /* Adjust size of t1 to what is needed. 
*/ index_type t1_dim; - t1_dim = (a_dim1-1) * 256 + b_dim1; + t1_dim = (a_dim1 - (ycount > 1)) * 256 + b_dim1; if (t1_dim > 65536) t1_dim = 65536; @@ -836,7 +836,7 @@ matmul_i16_avx128_fma4 (gfc_array_i16 * const rest /* Adjust size of t1 to what is needed. */ index_type t1_dim; - t1_dim = (a_dim1-1) * 256 + b_dim1; + t1_dim = (a_dim1 - (ycount > 1)) * 256 + b_dim1; if (t1_dim > 65536) t1_dim = 65536; Index: generated/matmulavx128_i2.c =================================================================== --- generated/matmulavx128_i2.c (Revision 259152) +++ generated/matmulavx128_i2.c (Arbeitskopie) @@ -283,7 +283,7 @@ matmul_i2_avx128_fma3 (gfc_array_i2 * const restri /* Adjust size of t1 to what is needed. */ index_type t1_dim; - t1_dim = (a_dim1-1) * 256 + b_dim1; + t1_dim = (a_dim1 - (ycount > 1)) * 256 + b_dim1; if (t1_dim > 65536) t1_dim = 65536; @@ -836,7 +836,7 @@ matmul_i2_avx128_fma4 (gfc_array_i2 * const restri /* Adjust size of t1 to what is needed. */ index_type t1_dim; - t1_dim = (a_dim1-1) * 256 + b_dim1; + t1_dim = (a_dim1 - (ycount > 1)) * 256 + b_dim1; if (t1_dim > 65536) t1_dim = 65536; Index: generated/matmulavx128_i4.c =================================================================== --- generated/matmulavx128_i4.c (Revision 259152) +++ generated/matmulavx128_i4.c (Arbeitskopie) @@ -283,7 +283,7 @@ matmul_i4_avx128_fma3 (gfc_array_i4 * const restri /* Adjust size of t1 to what is needed. */ index_type t1_dim; - t1_dim = (a_dim1-1) * 256 + b_dim1; + t1_dim = (a_dim1 - (ycount > 1)) * 256 + b_dim1; if (t1_dim > 65536) t1_dim = 65536; @@ -836,7 +836,7 @@ matmul_i4_avx128_fma4 (gfc_array_i4 * const restri /* Adjust size of t1 to what is needed. 
*/ index_type t1_dim; - t1_dim = (a_dim1-1) * 256 + b_dim1; + t1_dim = (a_dim1 - (ycount > 1)) * 256 + b_dim1; if (t1_dim > 65536) t1_dim = 65536; Index: generated/matmulavx128_i8.c =================================================================== --- generated/matmulavx128_i8.c (Revision 259152) +++ generated/matmulavx128_i8.c (Arbeitskopie) @@ -283,7 +283,7 @@ matmul_i8_avx128_fma3 (gfc_array_i8 * const restri /* Adjust size of t1 to what is needed. */ index_type t1_dim; - t1_dim = (a_dim1-1) * 256 + b_dim1; + t1_dim = (a_dim1 - (ycount > 1)) * 256 + b_dim1; if (t1_dim > 65536) t1_dim = 65536; @@ -836,7 +836,7 @@ matmul_i8_avx128_fma4 (gfc_array_i8 * const restri /* Adjust size of t1 to what is needed. */ index_type t1_dim; - t1_dim = (a_dim1-1) * 256 + b_dim1; + t1_dim = (a_dim1 - (ycount > 1)) * 256 + b_dim1; if (t1_dim > 65536) t1_dim = 65536; Index: generated/matmulavx128_r10.c =================================================================== --- generated/matmulavx128_r10.c (Revision 259152) +++ generated/matmulavx128_r10.c (Arbeitskopie) @@ -283,7 +283,7 @@ matmul_r10_avx128_fma3 (gfc_array_r10 * const rest /* Adjust size of t1 to what is needed. */ index_type t1_dim; - t1_dim = (a_dim1-1) * 256 + b_dim1; + t1_dim = (a_dim1 - (ycount > 1)) * 256 + b_dim1; if (t1_dim > 65536) t1_dim = 65536; @@ -836,7 +836,7 @@ matmul_r10_avx128_fma4 (gfc_array_r10 * const rest /* Adjust size of t1 to what is needed. */ index_type t1_dim; - t1_dim = (a_dim1-1) * 256 + b_dim1; + t1_dim = (a_dim1 - (ycount > 1)) * 256 + b_dim1; if (t1_dim > 65536) t1_dim = 65536; Index: generated/matmulavx128_r16.c =================================================================== --- generated/matmulavx128_r16.c (Revision 259152) +++ generated/matmulavx128_r16.c (Arbeitskopie) @@ -283,7 +283,7 @@ matmul_r16_avx128_fma3 (gfc_array_r16 * const rest /* Adjust size of t1 to what is needed. 
*/ index_type t1_dim; - t1_dim = (a_dim1-1) * 256 + b_dim1; + t1_dim = (a_dim1 - (ycount > 1)) * 256 + b_dim1; if (t1_dim > 65536) t1_dim = 65536; @@ -836,7 +836,7 @@ matmul_r16_avx128_fma4 (gfc_array_r16 * const rest /* Adjust size of t1 to what is needed. */ index_type t1_dim; - t1_dim = (a_dim1-1) * 256 + b_dim1; + t1_dim = (a_dim1 - (ycount > 1)) * 256 + b_dim1; if (t1_dim > 65536) t1_dim = 65536; Index: generated/matmulavx128_r4.c =================================================================== --- generated/matmulavx128_r4.c (Revision 259152) +++ generated/matmulavx128_r4.c (Arbeitskopie) @@ -283,7 +283,7 @@ matmul_r4_avx128_fma3 (gfc_array_r4 * const restri /* Adjust size of t1 to what is needed. */ index_type t1_dim; - t1_dim = (a_dim1-1) * 256 + b_dim1; + t1_dim = (a_dim1 - (ycount > 1)) * 256 + b_dim1; if (t1_dim > 65536) t1_dim = 65536; @@ -836,7 +836,7 @@ matmul_r4_avx128_fma4 (gfc_array_r4 * const restri /* Adjust size of t1 to what is needed. */ index_type t1_dim; - t1_dim = (a_dim1-1) * 256 + b_dim1; + t1_dim = (a_dim1 - (ycount > 1)) * 256 + b_dim1; if (t1_dim > 65536) t1_dim = 65536; Index: generated/matmulavx128_r8.c =================================================================== --- generated/matmulavx128_r8.c (Revision 259152) +++ generated/matmulavx128_r8.c (Arbeitskopie) @@ -283,7 +283,7 @@ matmul_r8_avx128_fma3 (gfc_array_r8 * const restri /* Adjust size of t1 to what is needed. */ index_type t1_dim; - t1_dim = (a_dim1-1) * 256 + b_dim1; + t1_dim = (a_dim1 - (ycount > 1)) * 256 + b_dim1; if (t1_dim > 65536) t1_dim = 65536; @@ -836,7 +836,7 @@ matmul_r8_avx128_fma4 (gfc_array_r8 * const restri /* Adjust size of t1 to what is needed. */ index_type t1_dim; - t1_dim = (a_dim1-1) * 256 + b_dim1; + t1_dim = (a_dim1 - (ycount > 1)) * 256 + b_dim1; if (t1_dim > 65536) t1_dim = 65536;
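
To see the effect of the one-line change in isolation, the sizing expression can be lifted into a standalone sketch. This is only an illustration under stated assumptions, not code from libgfortran: the function names t1_dim_old and t1_dim_patched are hypothetical, ptrdiff_t stands in for index_type, and the real code computes t1_dim inline. It shows only the arithmetic; how t1 is later indexed is not modelled.

```c
#include <stddef.h>

/* Hypothetical helper: the sizing expression before the patch.
   ptrdiff_t is an assumed stand-in for libgfortran's index_type.  */
ptrdiff_t
t1_dim_old (ptrdiff_t a_dim1, ptrdiff_t b_dim1)
{
  ptrdiff_t t1_dim = (a_dim1 - 1) * 256 + b_dim1;
  if (t1_dim > 65536)
    t1_dim = 65536;
  return t1_dim;
}

/* Hypothetical helper: the sizing expression after the patch.
   In C, (ycount > 1) evaluates to 0 or 1, so for ycount == 1 nothing
   is subtracted and the buffer gets one more block of 256 elements.  */
ptrdiff_t
t1_dim_patched (ptrdiff_t a_dim1, ptrdiff_t b_dim1, ptrdiff_t ycount)
{
  ptrdiff_t t1_dim = (a_dim1 - (ycount > 1)) * 256 + b_dim1;
  if (t1_dim > 65536)
    t1_dim = 65536;
  return t1_dim;
}
```

With a_dim1 = 4 and b_dim1 = 3, the old expression gives 771 elements regardless of ycount, while the patched one gives 1027 for ycount == 1 and still 771 for ycount > 1. The cap at 65536 is unchanged in both versions.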