From patchwork Tue Dec 8 21:10:01 2015
X-Patchwork-Submitter: Nathan Sidwell
X-Patchwork-Id: 554053
To: GCC Patches
From: Nathan Sidwell
Subject: [PTX] initialization fragment emission
Message-ID: <56674729.2030604@acm.org>
Date: Tue, 8 Dec 2015 16:10:01 -0500

This patch completes the reworking of the initializer emission machinery.
I've collected all the global variables it uses to hold state into a single
structure, and simplified the variables there -- for instance, we no longer
need a mode variable, nor do we need to cons up a CONST_INT rtx for each
fragment.

nathan

2015-12-08  Nathan Sidwell

	gcc/
	* config/nvptx/nvptx.c (decl_chunk_size, decl_chunk_mode,
	decl_offset, init_part, object_size, object_finished): Replace
	with ...
	(struct init_frag): ... this new struct variable.
	(begin_decl_field, output_decl_chunk): Replace with ...
	(output_init_frag): ... this new function.
	(nvptx_assemble_value): Reimplement.
	(nvptx_assemble_integer, nvptx_output_skip): Adjust.
	(nvptx_assemble_decl_begin, nvptx_assemble_decl_end): Adjust.
	(nvptx_output_aligned_decl): Call nvptx_assemble_decl_end.

	gcc/testsuite/
	* gcc.target/nvptx/trailing-init.c: New.

Index: config/nvptx/nvptx.c
===================================================================
--- config/nvptx/nvptx.c	(revision 231418)
+++ config/nvptx/nvptx.c	(working copy)
@@ -1484,73 +1484,70 @@ nvptx_hard_regno_mode_ok (int regno, mac
   return mode == cfun->machine->ret_reg_mode;
 }
 
-/* Machinery to output constant initializers.  When beginning an initializer,
-   we decide on a chunk size (which is visible in ptx in the type used), and
-   then all initializer data is buffered until a chunk is filled and ready to
-   be written out.  */
-
-/* Used when assembling integers to ensure data is emitted in
-   pieces whose size matches the declaration we printed.  */
-static unsigned int decl_chunk_size;
-static machine_mode decl_chunk_mode;
-/* Used in the same situation, to keep track of the byte offset
-   into the initializer.  */
-static unsigned HOST_WIDE_INT decl_offset;
-/* The initializer part we are currently processing.  */
-static HOST_WIDE_INT init_part;
-/* The total size of the object.  */
-static unsigned HOST_WIDE_INT object_size;
-/* True if we found a skip extending to the end of the object.  Used to
-   assert that no data follows.  */
-static bool object_finished;
-
-/* Write the necessary separator string to begin a new initializer value.  */
+/* Machinery to output constant initializers.  When beginning an
+   initializer, we decide on a fragment size (which is visible in ptx
+   in the type used), and then all initializer data is buffered until
+   a fragment is filled and ready to be written out.  */
+
+static struct
+{
+  unsigned HOST_WIDE_INT mask; /* Mask for storing fragment.  */
+  unsigned HOST_WIDE_INT val; /* Current fragment value.  */
+  unsigned HOST_WIDE_INT remaining; /* Remaining bytes to be written
+                                       out.  */
+  unsigned size;  /* Fragment size to accumulate.  */
+  unsigned offset;  /* Offset within current fragment.  */
+  bool started;   /* Whether we've output any initializer.  */
+} init_frag;
+
+/* The current fragment is full, write it out.  SYM may provide a
+   symbolic reference we should output, in which case the fragment
+   value is the addend.  */
 
 static void
-begin_decl_field (void)
+output_init_frag (rtx sym)
 {
-  /* We never see decl_offset at zero by the time we get here.  */
-  if (decl_offset == decl_chunk_size)
-    fprintf (asm_out_file, " = { ");
-  else
-    fprintf (asm_out_file, ", ");
-}
+  fprintf (asm_out_file, init_frag.started ? ", " : " = { ");
+  unsigned HOST_WIDE_INT val = init_frag.val;
 
-/* Output the currently stored chunk as an initializer value.  */
+  init_frag.started = true;
+  init_frag.val = 0;
+  init_frag.offset = 0;
+  init_frag.remaining--;
+
+  if (sym)
+    {
+      fprintf (asm_out_file, "generic(");
+      output_address (VOIDmode, sym);
+      fprintf (asm_out_file, val ? ") + " : ")");
+    }
 
-static void
-output_decl_chunk (void)
-{
-  begin_decl_field ();
-  output_address (VOIDmode, gen_int_mode (init_part, decl_chunk_mode));
-  init_part = 0;
+  if (!sym || val)
+    fprintf (asm_out_file, HOST_WIDE_INT_PRINT_DEC, val);
 }
 
-/* Add value VAL sized SIZE to the data we're emitting, and keep writing
-   out chunks as they fill up.  */
+/* Add value VAL of size SIZE to the data we're emitting, and keep
+   writing out chunks as they fill up.  */
 
 static void
-nvptx_assemble_value (HOST_WIDE_INT val, unsigned int size)
+nvptx_assemble_value (unsigned HOST_WIDE_INT val, unsigned size)
 {
-  unsigned HOST_WIDE_INT chunk_offset = decl_offset % decl_chunk_size;
-  gcc_assert (!object_finished);
-  while (size > 0)
+  val &= ((unsigned HOST_WIDE_INT)2 << (size * BITS_PER_UNIT - 1)) - 1;
+
+  for (unsigned part = 0; size; size -= part)
     {
-      int this_part = size;
-      if (chunk_offset + this_part > decl_chunk_size)
-        this_part = decl_chunk_size - chunk_offset;
-      HOST_WIDE_INT val_part;
-      HOST_WIDE_INT mask = 2;
-      mask <<= this_part * BITS_PER_UNIT - 1;
-      val_part = val & (mask - 1);
-      init_part |= val_part << (BITS_PER_UNIT * chunk_offset);
-      val >>= BITS_PER_UNIT * this_part;
-      size -= this_part;
-      decl_offset += this_part;
-      if (decl_offset % decl_chunk_size == 0)
-        output_decl_chunk ();
+      val >>= part * BITS_PER_UNIT;
+      part = init_frag.size - init_frag.offset;
+      if (part > size)
+        part = size;
+
+      unsigned HOST_WIDE_INT partial
+        = val << (init_frag.offset * BITS_PER_UNIT);
+      init_frag.val |= partial & init_frag.mask;
+      init_frag.offset += part;
 
-      chunk_offset = 0;
+      if (init_frag.offset == init_frag.size)
+        output_init_frag (NULL);
     }
 }
 
@@ -1567,8 +1564,7 @@ nvptx_assemble_integer (rtx x, unsigned
       gcc_unreachable ();
 
     case CONST_INT:
-      val = INTVAL (x);
-      nvptx_assemble_value (val, size);
+      nvptx_assemble_value (INTVAL (x), size);
       break;
 
     case CONST:
@@ -1580,19 +1576,13 @@ nvptx_assemble_integer (rtx x, unsigned
       /* FALLTHROUGH */
 
     case SYMBOL_REF:
-      gcc_assert (size = decl_chunk_size);
-      if (decl_offset % decl_chunk_size != 0)
+      gcc_assert (size == init_frag.size);
+      if (init_frag.offset)
         sorry ("cannot emit unaligned pointers in ptx assembly");
-      decl_offset += size;
-      begin_decl_field ();
 
       nvptx_maybe_record_fnsym (x);
-      fprintf (asm_out_file, "generic(");
-      output_address (VOIDmode, x);
-      fprintf (asm_out_file, ")");
-
-      if (val)
-        fprintf (asm_out_file, " + " HOST_WIDE_INT_PRINT_DEC, val);
+      init_frag.val = val;
+      output_init_frag (x);
       break;
     }
 
@@ -1606,21 +1596,28 @@ nvptx_assemble_integer (rtx x, unsigned
 void
 nvptx_output_skip (FILE *, unsigned HOST_WIDE_INT size)
 {
-  if (decl_offset + size >= object_size)
+  /* Finish the current fragment, if it's started.  */
+  if (init_frag.offset)
     {
-      if (decl_offset % decl_chunk_size != 0)
-        nvptx_assemble_value (0, decl_chunk_size);
-      object_finished = true;
-      return;
+      unsigned part = init_frag.size - init_frag.offset;
+      if (part > size)
+        part = (unsigned) size;
+      size -= part;
+      nvptx_assemble_value (0, part);
     }
 
-  while (size > decl_chunk_size)
+  /* If this skip doesn't terminate the initializer, write as many
+     remaining pieces as possible directly.  */
+  if (size < init_frag.remaining * init_frag.size)
     {
-      nvptx_assemble_value (0, decl_chunk_size);
-      size -= decl_chunk_size;
+      while (size >= init_frag.size)
+        {
+          size -= init_frag.size;
+          output_init_frag (NULL_RTX);
+        }
+      if (size)
+        nvptx_assemble_value (0, size);
     }
-  while (size-- > 0)
-    nvptx_assemble_value (0, 1);
 }
 
 /* Output a string STR with length SIZE.  As in nvptx_output_skip we
@@ -1662,15 +1659,18 @@ nvptx_assemble_decl_begin (FILE *file, c
 
   elt_size |= GET_MODE_SIZE (elt_mode);
   elt_size &= -elt_size; /* Extract LSB set.  */
-  elt_mode = mode_for_size (elt_size * BITS_PER_UNIT, MODE_INT, 0);
-
-  decl_chunk_size = elt_size;
-  decl_chunk_mode = elt_mode;
-  decl_offset = 0;
-  init_part = 0;
 
-  object_size = size;
-  object_finished = !size;
+  init_frag.size = elt_size;
+  /* Avoid undefined shift behaviour by using '2'.  */
+  init_frag.mask = ((unsigned HOST_WIDE_INT)2
+                    << (elt_size * BITS_PER_UNIT - 1)) - 1;
+  init_frag.val = 0;
+  init_frag.offset = 0;
+  init_frag.started = false;
+  /* Size might not be a multiple of elt size, if there's an
+     initialized trailing struct array with smaller type than
+     elt_size. */
+  init_frag.remaining = (size + elt_size - 1) / elt_size;
 
   fprintf (file, "%s .align %d .u%d ",
            section, align / BITS_PER_UNIT,
@@ -1680,8 +1680,7 @@ nvptx_assemble_decl_begin (FILE *file, c
   if (size)
     /* We make everything an array, to simplify any initialization
        emission.  */
-    fprintf (file, "[" HOST_WIDE_INT_PRINT_DEC "]",
-             (size + elt_size - 1) / elt_size);
+    fprintf (file, "[" HOST_WIDE_INT_PRINT_DEC "]", init_frag.remaining);
 }
 
 /* Called when the initializer for a decl has been completely output through
@@ -1690,14 +1689,10 @@ nvptx_assemble_decl_begin (FILE *file, c
 static void
 nvptx_assemble_decl_end (void)
 {
-  if (decl_offset != 0)
-    {
-      if (!object_finished && decl_offset % decl_chunk_size != 0)
-        nvptx_assemble_value (0, decl_chunk_size);
-
-      fprintf (asm_out_file, " }");
-    }
-  fprintf (asm_out_file, ";\n");
+  if (init_frag.offset)
+    /* This can happen with a packed struct with trailing array member.  */
+    nvptx_assemble_value (0, init_frag.size - init_frag.offset);
+  fprintf (asm_out_file, init_frag.started ? " };\n" : ";\n");
 }
 
 /* Output an uninitialized common or file-scope variable.  */
@@ -1714,7 +1709,7 @@ nvptx_output_aligned_decl (FILE *file, c
 
   nvptx_assemble_decl_begin (file, name, section_for_decl (decl),
                              TREE_TYPE (decl), size, align);
-  fprintf (file, ";\n");
+  nvptx_assemble_decl_end ();
 }
 
 /* Implement TARGET_ASM_DECLARE_CONSTANT_NAME.  Begin the process of
Index: testsuite/gcc.target/nvptx/trailing-init.c
===================================================================
--- testsuite/gcc.target/nvptx/trailing-init.c	(revision 0)
+++ testsuite/gcc.target/nvptx/trailing-init.c	(working copy)
@@ -0,0 +1,18 @@
+/* { dg-additional-options "-Wno-pedantic" } */
+
+struct trailing
+{
+  unsigned m;
+  short ary[];
+} trailing =
+  {.ary = {1}};
+
+struct packed
+{
+  unsigned m;
+  short ary[];
+} __attribute__ ((packed)) packed =
+  {.ary = {2}};
+
+/* { dg-final { scan-assembler ".align 1 .u32 packed\\\[2\\\] = { 0, 2 };" } } */
+/* { dg-final { scan-assembler ".align 4 .u32 trailing\\\[2\\\] = { 0, 1 };" } } */