Message ID | mptftq95ds2.fsf@arm.com |
---|---|
State | New |
Series | Add commentary to (SET_)TYPE_VECTOR_SUBPARTS |
On Tue, Apr 23, 2019 at 10:25 AM Richard Sandiford
<richard.sandiford@arm.com> wrote:
>
> This patch explains the encoding used for the precision field in
> TYPE_VECTOR_SUBPARTS when NUM_POLY_INT_COEFFS == 2.  OK to install?

OK.

Richard.

> Richard
>
>
> 2019-04-19  Richard Sandiford  <richard.sandiford@arm.com>
>
> gcc/
>         * tree.h (TYPE_VECTOR_SUBPARTS, SET_TYPE_VECTOR_SUBPARTS): Add
>         commentary about the encoding of precision.
>
> Index: gcc/tree.h
> ===================================================================
> --- gcc/tree.h	2019-04-23 09:21:46.206208219 +0100
> +++ gcc/tree.h	2019-04-23 09:21:58.898166354 +0100
> @@ -3734,6 +3734,8 @@ TYPE_VECTOR_SUBPARTS (const_tree node)
>    unsigned int precision = VECTOR_TYPE_CHECK (node)->type_common.precision;
>    if (NUM_POLY_INT_COEFFS == 2)
>      {
> +      /* See the corresponding code in SET_TYPE_VECTOR_SUBPARTS for a
> +         description of the encoding.  */
>        poly_uint64 res = 0;
>        res.coeffs[0] = HOST_WIDE_INT_1U << (precision & 0xff);
>        if (precision & 0x100)
> @@ -3756,6 +3758,21 @@ SET_TYPE_VECTOR_SUBPARTS (tree node, pol
>    gcc_assert (index >= 0);
>    if (NUM_POLY_INT_COEFFS == 2)
>      {
> +      /* We have two coefficients that are each in the range 1 << [0, 63],
> +         so supporting all combinations would require 6 bits per coefficient
> +         and 12 bits in total.  Since the precision field is only 10 bits
> +         in size, we need to be more restrictive than that.
> +
> +         At present, coeff[1] is always either 0 (meaning that the number
> +         of units is constant) or equal to coeff[0] (meaning that the number
> +         of units is N + X * N for some target-dependent zero-based runtime
> +         parameter X).  We can therefore encode coeff[1] in a single bit.
> +
> +         The most compact encoding would be to use mask 0x3f for coeff[0]
> +         and 0x40 for coeff[1], leaving 0x380 unused.  It's possible to
> +         get slightly more efficient code on some hosts if we instead
> +         treat the shift amount as an independent byte, so here we use
> +         0xff for coeff[0] and 0x100 for coeff[1].  */
>        unsigned HOST_WIDE_INT coeff1 = subparts.coeffs[1];
>        gcc_assert (coeff1 == 0 || coeff1 == coeff0);
>        VECTOR_TYPE_CHECK (node)->type_common.precision
Index: gcc/tree.h
===================================================================
--- gcc/tree.h	2019-04-23 09:21:46.206208219 +0100
+++ gcc/tree.h	2019-04-23 09:21:58.898166354 +0100
@@ -3734,6 +3734,8 @@ TYPE_VECTOR_SUBPARTS (const_tree node)
   unsigned int precision = VECTOR_TYPE_CHECK (node)->type_common.precision;
   if (NUM_POLY_INT_COEFFS == 2)
     {
+      /* See the corresponding code in SET_TYPE_VECTOR_SUBPARTS for a
+         description of the encoding.  */
       poly_uint64 res = 0;
       res.coeffs[0] = HOST_WIDE_INT_1U << (precision & 0xff);
       if (precision & 0x100)
@@ -3756,6 +3758,21 @@ SET_TYPE_VECTOR_SUBPARTS (tree node, pol
   gcc_assert (index >= 0);
   if (NUM_POLY_INT_COEFFS == 2)
     {
+      /* We have two coefficients that are each in the range 1 << [0, 63],
+         so supporting all combinations would require 6 bits per coefficient
+         and 12 bits in total.  Since the precision field is only 10 bits
+         in size, we need to be more restrictive than that.
+
+         At present, coeff[1] is always either 0 (meaning that the number
+         of units is constant) or equal to coeff[0] (meaning that the number
+         of units is N + X * N for some target-dependent zero-based runtime
+         parameter X).  We can therefore encode coeff[1] in a single bit.
+
+         The most compact encoding would be to use mask 0x3f for coeff[0]
+         and 0x40 for coeff[1], leaving 0x380 unused.  It's possible to
+         get slightly more efficient code on some hosts if we instead
+         treat the shift amount as an independent byte, so here we use
+         0xff for coeff[0] and 0x100 for coeff[1].  */
       unsigned HOST_WIDE_INT coeff1 = subparts.coeffs[1];
       gcc_assert (coeff1 == 0 || coeff1 == coeff0);
       VECTOR_TYPE_CHECK (node)->type_common.precision
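
For readers who want to experiment with the encoding described in the new comment, the following is a minimal standalone sketch, not the code in gcc/tree.h. The names subparts_t, encode_subparts and decode_subparts are hypothetical, a plain two-element struct stands in for poly_uint64, and the statements that fall outside the hunks above (how bit 0x100 is set on store and how coeffs[1] is filled in on load) are assumptions inferred from the comment.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a two-coefficient poly_uint64.
struct subparts_t {
  uint64_t coeffs[2];
};

// Pack the pair into a 10-bit value: the low byte (mask 0xff) holds
// log2 (coeffs[0]) and bit 0x100 records whether coeffs[1] == coeffs[0].
inline unsigned encode_subparts(subparts_t s) {
  uint64_t coeff0 = s.coeffs[0];
  uint64_t coeff1 = s.coeffs[1];
  // coeffs[0] must be a power of two in the range 1 << [0, 63].
  assert(coeff0 != 0 && (coeff0 & (coeff0 - 1)) == 0);
  // coeffs[1] is either 0 (constant length) or equal to coeffs[0].
  assert(coeff1 == 0 || coeff1 == coeff0);
  unsigned index = 0;
  while ((coeff0 >> index) != 1)
    ++index;
  // Assumption: the flag bit is simply added to the shift amount.
  return index + (coeff1 != 0 ? 0x100u : 0u);
}

// Unpack a value produced by encode_subparts.
inline subparts_t decode_subparts(unsigned precision) {
  unsigned shift = precision & 0xff;
  assert(shift < 64);  // guard the shift in this sketch
  subparts_t res = {{0, 0}};
  res.coeffs[0] = uint64_t{1} << shift;
  // Assumption: a set flag bit means coeffs[1] mirrors coeffs[0].
  if (precision & 0x100)
    res.coeffs[1] = res.coeffs[0];
  return res;
}
```

Under these assumptions, a constant 4-element vector is stored as 0x002, and a vector with 4 + 4 * X elements is stored as 0x102. Only 0x3f and one flag bit are strictly needed; the wider 0xff mask trades three unused bits for being able to treat the shift amount as a whole byte.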