Message ID | mptmuf33scs.fsf@arm.com |
---|---|
State | New |
Series | Make assemble_real generate canonical CONST_INTs |
On Tue, Sep 17, 2019 at 4:33 PM Richard Sandiford <richard.sandiford@arm.com> wrote:
>
> assemble_real used GEN_INT to create integers directly from the
> longs returned by real_to_target. assemble_integer then went on
> to interpret the const_ints as though they had the mode corresponding
> to the accompanying size parameter:
>
>       imode = mode_for_size (size * BITS_PER_UNIT, mclass, 0).require ();
>
>       for (i = 0; i < size; i += subsize)
>         {
>           rtx partial = simplify_subreg (omode, x, imode, i);
>
> But in the assemble_real case, X might not be canonical for IMODE.
>
> If the interface to assemble_integer is supposed to allow outputting
> (say) the low 4 bytes of a DImode integer, then the simplify_subreg
> above is wrong. But if the number of bytes passed to assemble_integer
> is supposed to be the number of bytes that the integer actually contains,
> assemble_real is wrong.
>
> This patch takes the latter interpretation and makes assemble_real
> generate const_ints that are canonical for the number of bytes passed.
>
> The flip_storage_order handling assumes that each long is a full
> SImode, which e.g. excludes BITS_PER_UNIT != 8 and float formats
> whose memory size is not a multiple of 32 bits (which includes
> HFmode at least). The patch therefore leaves that code alone.
> If interpreting each integer as SImode is correct, the const_ints
> that it generates are also correct.
>
> Tested on aarch64-linux-gnu and x86_64-linux-gnu. Also tested
> by making sure that there were no new errors from a range of
> cross-built targets. OK to install?
>
> Richard
>
>
> 2019-09-17 Richard Sandiford <richard.sandiford@arm.com>
>
> gcc/
>         * varasm.c (assemble_real): Generate canonical const_ints.
>
> Index: gcc/varasm.c
> ===================================================================
> --- gcc/varasm.c 2019-09-05 08:49:30.829739618 +0100
> +++ gcc/varasm.c 2019-09-17 15:30:10.400740515 +0100
> @@ -2873,25 +2873,27 @@ assemble_real (REAL_VALUE_TYPE d, scalar
>    real_to_target (data, &d, mode);
>
>    /* Put out the first word with the specified alignment. */
> +  unsigned int chunk_nunits = MIN (nunits, units_per);
>    if (reverse)
>      elt = flip_storage_order (SImode, gen_int_mode (data[nelts - 1], SImode));
>    else
> -    elt = GEN_INT (data[0]);
> -  assemble_integer (elt, MIN (nunits, units_per), align, 1);
> -  nunits -= units_per;
> +    elt = GEN_INT (sext_hwi (data[0], chunk_nunits * BITS_PER_UNIT));

Why the apparent difference between the storage-order flipping
variant using gen_int_mode vs. the GEN_INT with sext_hwi?
Can't we use gen_int_mode in the non-flipping path and be done with that?

> +  assemble_integer (elt, chunk_nunits, align, 1);
> +  nunits -= chunk_nunits;
>
>    /* Subsequent words need only 32-bit alignment. */
>    align = min_align (align, 32);
>
>    for (int i = 1; i < nelts; i++)
>      {
> +      chunk_nunits = MIN (nunits, units_per);
>        if (reverse)
>          elt = flip_storage_order (SImode,
>                                    gen_int_mode (data[nelts - 1 - i], SImode));
>        else
> -        elt = GEN_INT (data[i]);
> -      assemble_integer (elt, MIN (nunits, units_per), align, 1);
> -      nunits -= units_per;
> +        elt = GEN_INT (sext_hwi (data[i], chunk_nunits * BITS_PER_UNIT));
> +      assemble_integer (elt, chunk_nunits, align, 1);
> +      nunits -= chunk_nunits;
>      }
>  }
>
Richard Biener <richard.guenther@gmail.com> writes:
> On Tue, Sep 17, 2019 at 4:33 PM Richard Sandiford
> <richard.sandiford@arm.com> wrote:
>>
>> assemble_real used GEN_INT to create integers directly from the
>> longs returned by real_to_target. assemble_integer then went on
>> to interpret the const_ints as though they had the mode corresponding
>> to the accompanying size parameter:
>>
>>       imode = mode_for_size (size * BITS_PER_UNIT, mclass, 0).require ();
>>
>>       for (i = 0; i < size; i += subsize)
>>         {
>>           rtx partial = simplify_subreg (omode, x, imode, i);
>>
>> But in the assemble_real case, X might not be canonical for IMODE.
>>
>> If the interface to assemble_integer is supposed to allow outputting
>> (say) the low 4 bytes of a DImode integer, then the simplify_subreg
>> above is wrong. But if the number of bytes passed to assemble_integer
>> is supposed to be the number of bytes that the integer actually contains,
>> assemble_real is wrong.
>>
>> This patch takes the latter interpretation and makes assemble_real
>> generate const_ints that are canonical for the number of bytes passed.
>>
>> The flip_storage_order handling assumes that each long is a full
>> SImode, which e.g. excludes BITS_PER_UNIT != 8 and float formats
>> whose memory size is not a multiple of 32 bits (which includes
>> HFmode at least). The patch therefore leaves that code alone.
>> If interpreting each integer as SImode is correct, the const_ints
>> that it generates are also correct.
>>
>> Tested on aarch64-linux-gnu and x86_64-linux-gnu. Also tested
>> by making sure that there were no new errors from a range of
>> cross-built targets. OK to install?
>>
>> Richard
>>
>>
>> 2019-09-17 Richard Sandiford <richard.sandiford@arm.com>
>>
>> gcc/
>>         * varasm.c (assemble_real): Generate canonical const_ints.
>>
>> Index: gcc/varasm.c
>> ===================================================================
>> --- gcc/varasm.c 2019-09-05 08:49:30.829739618 +0100
>> +++ gcc/varasm.c 2019-09-17 15:30:10.400740515 +0100
>> @@ -2873,25 +2873,27 @@ assemble_real (REAL_VALUE_TYPE d, scalar
>>    real_to_target (data, &d, mode);
>>
>>    /* Put out the first word with the specified alignment. */
>> +  unsigned int chunk_nunits = MIN (nunits, units_per);
>>    if (reverse)
>>      elt = flip_storage_order (SImode, gen_int_mode (data[nelts - 1], SImode));
>>    else
>> -    elt = GEN_INT (data[0]);
>> -  assemble_integer (elt, MIN (nunits, units_per), align, 1);
>> -  nunits -= units_per;
>> +    elt = GEN_INT (sext_hwi (data[0], chunk_nunits * BITS_PER_UNIT));
>
> Why the apparent difference between the storage-order flipping
> variant using gen_int_mode vs. the GEN_INT with sext_hwi?
> Can't we use gen_int_mode in the non-flipping path and be done with that?

Yeah, I mentioned this in the covering note. The flip_storage_order
stuff only seems to work for floats that are a multiple of 32 bits in
size, so it doesn't e.g. handle HFmode or 80-bit floats, whereas the
new "else" does. Hard-coding SImode also hard-codes BITS_PER_UNIT==8,
unlike the "else".

So if anything, it's flip_storage_order that might need to change
to avoid hard-coding SImode. That doesn't look like a trivial change
though. E.g. the number of bytes passed to assemble_integer would need
to match the number of bytes in data[nelts - 1] rather than data[0].
The alignment code below would also need to be adjusted. Fixing that
(if it is a bug) seems like a separate change and TBH I'd rather not
touch it here.

Thanks,
Richard

>
>> +  assemble_integer (elt, chunk_nunits, align, 1);
>> +  nunits -= chunk_nunits;
>>
>>    /* Subsequent words need only 32-bit alignment. */
>>    align = min_align (align, 32);
>>
>>    for (int i = 1; i < nelts; i++)
>>      {
>> +      chunk_nunits = MIN (nunits, units_per);
>>        if (reverse)
>>          elt = flip_storage_order (SImode,
>>                                    gen_int_mode (data[nelts - 1 - i], SImode));
>>        else
>> -        elt = GEN_INT (data[i]);
>> -      assemble_integer (elt, MIN (nunits, units_per), align, 1);
>> -      nunits -= units_per;
>> +        elt = GEN_INT (sext_hwi (data[i], chunk_nunits * BITS_PER_UNIT));
>> +      assemble_integer (elt, chunk_nunits, align, 1);
>> +      nunits -= chunk_nunits;
>>      }
>>  }
>>
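As a rough illustration of the chunking that the new "else" path relies on, the sketch below mimics how the patched code sizes each piece: every chunk is MIN (nunits, units_per) units, and nunits then drops by the size of the chunk actually emitted. `chunk_sizes_sketch` is a hypothetical helper written for this note, not GCC code, and the 10-unit case simply assumes a float format that occupies 10 bytes in memory.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of GCC: record the size in units of each
   chunk that a loop shaped like the patched assemble_real would pass to
   assemble_integer.  Returns the number of chunks written to OUT.  */
size_t
chunk_sizes_sketch (unsigned int nunits, unsigned int units_per,
                    unsigned int *out, size_t max_out)
{
  assert (units_per > 0);
  size_t n = 0;
  while (nunits > 0 && n < max_out)
    {
      /* Same expression as MIN (nunits, units_per) in the patch.  */
      unsigned int chunk_nunits = nunits < units_per ? nunits : units_per;
      out[n++] = chunk_nunits;
      /* Subtract what was emitted, so the count never goes below zero.  */
      nunits -= chunk_nunits;
    }
  return n;
}
```

With nunits = 2 and units_per = 4 (an HFmode-like case) this produces a single 2-unit chunk; for the assumed 10-unit format it produces chunks of 4, 4 and 2 units.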
On Wed, Sep 18, 2019 at 11:41 AM Richard Sandiford <richard.sandiford@arm.com> wrote:
>
> Richard Biener <richard.guenther@gmail.com> writes:
> > On Tue, Sep 17, 2019 at 4:33 PM Richard Sandiford
> > <richard.sandiford@arm.com> wrote:
> >>
> >> assemble_real used GEN_INT to create integers directly from the
> >> longs returned by real_to_target. assemble_integer then went on
> >> to interpret the const_ints as though they had the mode corresponding
> >> to the accompanying size parameter:
> >>
> >>       imode = mode_for_size (size * BITS_PER_UNIT, mclass, 0).require ();
> >>
> >>       for (i = 0; i < size; i += subsize)
> >>         {
> >>           rtx partial = simplify_subreg (omode, x, imode, i);
> >>
> >> But in the assemble_real case, X might not be canonical for IMODE.
> >>
> >> If the interface to assemble_integer is supposed to allow outputting
> >> (say) the low 4 bytes of a DImode integer, then the simplify_subreg
> >> above is wrong. But if the number of bytes passed to assemble_integer
> >> is supposed to be the number of bytes that the integer actually contains,
> >> assemble_real is wrong.
> >>
> >> This patch takes the latter interpretation and makes assemble_real
> >> generate const_ints that are canonical for the number of bytes passed.
> >>
> >> The flip_storage_order handling assumes that each long is a full
> >> SImode, which e.g. excludes BITS_PER_UNIT != 8 and float formats
> >> whose memory size is not a multiple of 32 bits (which includes
> >> HFmode at least). The patch therefore leaves that code alone.
> >> If interpreting each integer as SImode is correct, the const_ints
> >> that it generates are also correct.
> >>
> >> Tested on aarch64-linux-gnu and x86_64-linux-gnu. Also tested
> >> by making sure that there were no new errors from a range of
> >> cross-built targets. OK to install?
> >>
> >> Richard
> >>
> >>
> >> 2019-09-17 Richard Sandiford <richard.sandiford@arm.com>
> >>
> >> gcc/
> >>         * varasm.c (assemble_real): Generate canonical const_ints.
> >>
> >> Index: gcc/varasm.c
> >> ===================================================================
> >> --- gcc/varasm.c 2019-09-05 08:49:30.829739618 +0100
> >> +++ gcc/varasm.c 2019-09-17 15:30:10.400740515 +0100
> >> @@ -2873,25 +2873,27 @@ assemble_real (REAL_VALUE_TYPE d, scalar
> >>    real_to_target (data, &d, mode);
> >>
> >>    /* Put out the first word with the specified alignment. */
> >> +  unsigned int chunk_nunits = MIN (nunits, units_per);
> >>    if (reverse)
> >>      elt = flip_storage_order (SImode, gen_int_mode (data[nelts - 1], SImode));
> >>    else
> >> -    elt = GEN_INT (data[0]);
> >> -  assemble_integer (elt, MIN (nunits, units_per), align, 1);
> >> -  nunits -= units_per;
> >> +    elt = GEN_INT (sext_hwi (data[0], chunk_nunits * BITS_PER_UNIT));
> >
> > Why the apparent difference between the storage-order flipping
> > variant using gen_int_mode vs. the GEN_INT with sext_hwi?
> > Can't we use gen_int_mode in the non-flipping path and be done with that?
>
> Yeah, I mentioned this in the covering note. The flip_storage_order
> stuff only seems to work for floats that are a multiple of 32 bits in
> size, so it doesn't e.g. handle HFmode or 80-bit floats, whereas the
> new "else" does. Hard-coding SImode also hard-codes BITS_PER_UNIT==8,
> unlike the "else".
>
> So if anything, it's flip_storage_order that might need to change
> to avoid hard-coding SImode. That doesn't look like a trivial change
> though. E.g. the number of bytes passed to assemble_integer would need
> to match the number of bytes in data[nelts - 1] rather than data[0].
> The alignment code below would also need to be adjusted. Fixing that
> (if it is a bug) seems like a separate change and TBH I'd rather not
> touch it here.

Hmm, ok. Patch is OK then.

Thanks,
Richard.

> Thanks,
> Richard
>
> >
> >> +  assemble_integer (elt, chunk_nunits, align, 1);
> >> +  nunits -= chunk_nunits;
> >>
> >>    /* Subsequent words need only 32-bit alignment. */
> >>    align = min_align (align, 32);
> >>
> >>    for (int i = 1; i < nelts; i++)
> >>      {
> >> +      chunk_nunits = MIN (nunits, units_per);
> >>        if (reverse)
> >>          elt = flip_storage_order (SImode,
> >>                                    gen_int_mode (data[nelts - 1 - i], SImode));
> >>        else
> >> -        elt = GEN_INT (data[i]);
> >> -      assemble_integer (elt, MIN (nunits, units_per), align, 1);
> >> -      nunits -= units_per;
> >> +        elt = GEN_INT (sext_hwi (data[i], chunk_nunits * BITS_PER_UNIT));
> >> +      assemble_integer (elt, chunk_nunits, align, 1);
> >> +      nunits -= chunk_nunits;
> >>      }
> >>  }
> >>
On Wed, 18 Sep 2019 at 11:41, Richard Sandiford <richard.sandiford@arm.com> wrote:
>
> Richard Biener <richard.guenther@gmail.com> writes:
> > On Tue, Sep 17, 2019 at 4:33 PM Richard Sandiford
> > <richard.sandiford@arm.com> wrote:
> >>
> >> assemble_real used GEN_INT to create integers directly from the
> >> longs returned by real_to_target. assemble_integer then went on
> >> to interpret the const_ints as though they had the mode corresponding
> >> to the accompanying size parameter:
> >>
> >>       imode = mode_for_size (size * BITS_PER_UNIT, mclass, 0).require ();
> >>
> >>       for (i = 0; i < size; i += subsize)
> >>         {
> >>           rtx partial = simplify_subreg (omode, x, imode, i);
> >>
> >> But in the assemble_real case, X might not be canonical for IMODE.
> >>
> >> If the interface to assemble_integer is supposed to allow outputting
> >> (say) the low 4 bytes of a DImode integer, then the simplify_subreg
> >> above is wrong. But if the number of bytes passed to assemble_integer
> >> is supposed to be the number of bytes that the integer actually contains,
> >> assemble_real is wrong.
> >>
> >> This patch takes the latter interpretation and makes assemble_real
> >> generate const_ints that are canonical for the number of bytes passed.
> >>
> >> The flip_storage_order handling assumes that each long is a full
> >> SImode, which e.g. excludes BITS_PER_UNIT != 8 and float formats
> >> whose memory size is not a multiple of 32 bits (which includes
> >> HFmode at least). The patch therefore leaves that code alone.
> >> If interpreting each integer as SImode is correct, the const_ints
> >> that it generates are also correct.
> >>
> >> Tested on aarch64-linux-gnu and x86_64-linux-gnu. Also tested
> >> by making sure that there were no new errors from a range of
> >> cross-built targets. OK to install?
> >>
> >> Richard
> >>
> >>
> >> 2019-09-17 Richard Sandiford <richard.sandiford@arm.com>
> >>
> >> gcc/
> >>         * varasm.c (assemble_real): Generate canonical const_ints.
> >>
> >> Index: gcc/varasm.c
> >> ===================================================================
> >> --- gcc/varasm.c 2019-09-05 08:49:30.829739618 +0100
> >> +++ gcc/varasm.c 2019-09-17 15:30:10.400740515 +0100
> >> @@ -2873,25 +2873,27 @@ assemble_real (REAL_VALUE_TYPE d, scalar
> >>    real_to_target (data, &d, mode);
> >>
> >>    /* Put out the first word with the specified alignment. */
> >> +  unsigned int chunk_nunits = MIN (nunits, units_per);
> >>    if (reverse)
> >>      elt = flip_storage_order (SImode, gen_int_mode (data[nelts - 1], SImode));
> >>    else
> >> -    elt = GEN_INT (data[0]);
> >> -  assemble_integer (elt, MIN (nunits, units_per), align, 1);
> >> -  nunits -= units_per;
> >> +    elt = GEN_INT (sext_hwi (data[0], chunk_nunits * BITS_PER_UNIT));
> >
> > Why the apparent difference between the storage-order flipping
> > variant using gen_int_mode vs. the GEN_INT with sext_hwi?
> > Can't we use gen_int_mode in the non-flipping path and be done with that?
>
> Yeah, I mentioned this in the covering note. The flip_storage_order
> stuff only seems to work for floats that are a multiple of 32 bits in
> size, so it doesn't e.g. handle HFmode or 80-bit floats, whereas the
> new "else" does. Hard-coding SImode also hard-codes BITS_PER_UNIT==8,
> unlike the "else".
>
> So if anything, it's flip_storage_order that might need to change
> to avoid hard-coding SImode. That doesn't look like a trivial change
> though. E.g. the number of bytes passed to assemble_integer would need
> to match the number of bytes in data[nelts - 1] rather than data[0].
> The alignment code below would also need to be adjusted. Fixing that
> (if it is a bug) seems like a separate change and TBH I'd rather not
> touch it here.
>

Hi Richard,

I suspect you've probably noticed already, but in case you haven't:
this patch causes a regression on arm:
FAIL: gcc.target/arm/fp16-compile-alt-3.c scan-assembler \t.short\t49152
FAIL: gcc.target/arm/fp16-compile-ieee-3.c scan-assembler \t.short\t49152

Christophe

> Thanks,
> Richard
>
> >
> >> +  assemble_integer (elt, chunk_nunits, align, 1);
> >> +  nunits -= chunk_nunits;
> >>
> >>    /* Subsequent words need only 32-bit alignment. */
> >>    align = min_align (align, 32);
> >>
> >>    for (int i = 1; i < nelts; i++)
> >>      {
> >> +      chunk_nunits = MIN (nunits, units_per);
> >>        if (reverse)
> >>          elt = flip_storage_order (SImode,
> >>                                    gen_int_mode (data[nelts - 1 - i], SImode));
> >>        else
> >> -        elt = GEN_INT (data[i]);
> >> -      assemble_integer (elt, MIN (nunits, units_per), align, 1);
> >> -      nunits -= units_per;
> >> +        elt = GEN_INT (sext_hwi (data[i], chunk_nunits * BITS_PER_UNIT));
> >> +      assemble_integer (elt, chunk_nunits, align, 1);
> >> +      nunits -= chunk_nunits;
> >>      }
> >>  }
> >>
Christophe Lyon <christophe.lyon@linaro.org> writes:
> On Wed, 18 Sep 2019 at 11:41, Richard Sandiford
> <richard.sandiford@arm.com> wrote:
>>
>> Richard Biener <richard.guenther@gmail.com> writes:
>> > On Tue, Sep 17, 2019 at 4:33 PM Richard Sandiford
>> > <richard.sandiford@arm.com> wrote:
>> >>
>> >> assemble_real used GEN_INT to create integers directly from the
>> >> longs returned by real_to_target. assemble_integer then went on
>> >> to interpret the const_ints as though they had the mode corresponding
>> >> to the accompanying size parameter:
>> >>
>> >>       imode = mode_for_size (size * BITS_PER_UNIT, mclass, 0).require ();
>> >>
>> >>       for (i = 0; i < size; i += subsize)
>> >>         {
>> >>           rtx partial = simplify_subreg (omode, x, imode, i);
>> >>
>> >> But in the assemble_real case, X might not be canonical for IMODE.
>> >>
>> >> If the interface to assemble_integer is supposed to allow outputting
>> >> (say) the low 4 bytes of a DImode integer, then the simplify_subreg
>> >> above is wrong. But if the number of bytes passed to assemble_integer
>> >> is supposed to be the number of bytes that the integer actually contains,
>> >> assemble_real is wrong.
>> >>
>> >> This patch takes the latter interpretation and makes assemble_real
>> >> generate const_ints that are canonical for the number of bytes passed.
>> >>
>> >> The flip_storage_order handling assumes that each long is a full
>> >> SImode, which e.g. excludes BITS_PER_UNIT != 8 and float formats
>> >> whose memory size is not a multiple of 32 bits (which includes
>> >> HFmode at least). The patch therefore leaves that code alone.
>> >> If interpreting each integer as SImode is correct, the const_ints
>> >> that it generates are also correct.
>> >>
>> >> Tested on aarch64-linux-gnu and x86_64-linux-gnu. Also tested
>> >> by making sure that there were no new errors from a range of
>> >> cross-built targets. OK to install?
>> >>
>> >> Richard
>> >>
>> >>
>> >> 2019-09-17 Richard Sandiford <richard.sandiford@arm.com>
>> >>
>> >> gcc/
>> >>         * varasm.c (assemble_real): Generate canonical const_ints.
>> >>
>> >> Index: gcc/varasm.c
>> >> ===================================================================
>> >> --- gcc/varasm.c 2019-09-05 08:49:30.829739618 +0100
>> >> +++ gcc/varasm.c 2019-09-17 15:30:10.400740515 +0100
>> >> @@ -2873,25 +2873,27 @@ assemble_real (REAL_VALUE_TYPE d, scalar
>> >>    real_to_target (data, &d, mode);
>> >>
>> >>    /* Put out the first word with the specified alignment. */
>> >> +  unsigned int chunk_nunits = MIN (nunits, units_per);
>> >>    if (reverse)
>> >>      elt = flip_storage_order (SImode, gen_int_mode (data[nelts - 1], SImode));
>> >>    else
>> >> -    elt = GEN_INT (data[0]);
>> >> -  assemble_integer (elt, MIN (nunits, units_per), align, 1);
>> >> -  nunits -= units_per;
>> >> +    elt = GEN_INT (sext_hwi (data[0], chunk_nunits * BITS_PER_UNIT));
>> >
>> > Why the apparent difference between the storage-order flipping
>> > variant using gen_int_mode vs. the GEN_INT with sext_hwi?
>> > Can't we use gen_int_mode in the non-flipping path and be done with that?
>>
>> Yeah, I mentioned this in the covering note. The flip_storage_order
>> stuff only seems to work for floats that are a multiple of 32 bits in
>> size, so it doesn't e.g. handle HFmode or 80-bit floats, whereas the
>> new "else" does. Hard-coding SImode also hard-codes BITS_PER_UNIT==8,
>> unlike the "else".
>>
>> So if anything, it's flip_storage_order that might need to change
>> to avoid hard-coding SImode. That doesn't look like a trivial change
>> though. E.g. the number of bytes passed to assemble_integer would need
>> to match the number of bytes in data[nelts - 1] rather than data[0].
>> The alignment code below would also need to be adjusted. Fixing that
>> (if it is a bug) seems like a separate change and TBH I'd rather not
>> touch it here.
>>
>
> Hi Richard,
>
> I suspect you've probably noticed already, but in case you haven't:
> this patch causes a regression on arm:
> FAIL: gcc.target/arm/fp16-compile-alt-3.c scan-assembler \t.short\t49152
> FAIL: gcc.target/arm/fp16-compile-ieee-3.c scan-assembler \t.short\t49152

Hadn't noticed that actually (but should have) -- thanks for the
heads up. I've applied the below as obvious after testing on armeb-eabi.

Richard


2019-09-26 Richard Sandiford <richard.sandiford@arm.com>

gcc/testsuite/
        * gcc.target/arm/fp16-compile-alt-3.c: Expect (__fp16) -2.0 to be
        written as a negative short rather than a positive one.
        * gcc.target/arm/fp16-compile-ieee-3.c: Likewise.

Index: gcc/testsuite/gcc.target/arm/fp16-compile-alt-3.c
===================================================================
--- gcc/testsuite/gcc.target/arm/fp16-compile-alt-3.c 2019-03-08 18:14:28.836998325 +0000
+++ gcc/testsuite/gcc.target/arm/fp16-compile-alt-3.c 2019-09-26 11:42:47.502378676 +0100
@@ -7,4 +7,4 @@ __fp16 xx = -2.0;

 /* { dg-final { scan-assembler "\t.size\txx, 2" } } */
-/* { dg-final { scan-assembler "\t.short\t49152" } } */
+/* { dg-final { scan-assembler "\t.short\t-16384" } } */
Index: gcc/testsuite/gcc.target/arm/fp16-compile-ieee-3.c
===================================================================
--- gcc/testsuite/gcc.target/arm/fp16-compile-ieee-3.c 2019-03-08 18:14:28.732998720 +0000
+++ gcc/testsuite/gcc.target/arm/fp16-compile-ieee-3.c 2019-09-26 11:42:47.506378645 +0100
@@ -6,4 +6,4 @@ __fp16 xx = -2.0;

 /* { dg-final { scan-assembler "\t.size\txx, 2" } } */
-/* { dg-final { scan-assembler "\t.short\t49152" } } */
+/* { dg-final { scan-assembler "\t.short\t-16384" } } */
Index: gcc/varasm.c
===================================================================
--- gcc/varasm.c 2019-09-05 08:49:30.829739618 +0100
+++ gcc/varasm.c 2019-09-17 15:30:10.400740515 +0100
@@ -2873,25 +2873,27 @@ assemble_real (REAL_VALUE_TYPE d, scalar
   real_to_target (data, &d, mode);

   /* Put out the first word with the specified alignment. */
+  unsigned int chunk_nunits = MIN (nunits, units_per);
   if (reverse)
     elt = flip_storage_order (SImode, gen_int_mode (data[nelts - 1], SImode));
   else
-    elt = GEN_INT (data[0]);
-  assemble_integer (elt, MIN (nunits, units_per), align, 1);
-  nunits -= units_per;
+    elt = GEN_INT (sext_hwi (data[0], chunk_nunits * BITS_PER_UNIT));
+  assemble_integer (elt, chunk_nunits, align, 1);
+  nunits -= chunk_nunits;

   /* Subsequent words need only 32-bit alignment. */
   align = min_align (align, 32);

   for (int i = 1; i < nelts; i++)
     {
+      chunk_nunits = MIN (nunits, units_per);
       if (reverse)
         elt = flip_storage_order (SImode,
                                   gen_int_mode (data[nelts - 1 - i], SImode));
       else
-        elt = GEN_INT (data[i]);
-      assemble_integer (elt, MIN (nunits, units_per), align, 1);
-      nunits -= units_per;
+        elt = GEN_INT (sext_hwi (data[i], chunk_nunits * BITS_PER_UNIT));
+      assemble_integer (elt, chunk_nunits, align, 1);
+      nunits -= chunk_nunits;
     }
 }
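To make the 49152 vs. -16384 change in the arm tests concrete, here is a minimal standalone sketch of what canonicalising a value for a given width involves. The names `sext_sketch` and `is_canonical_sketch` are hypothetical stand-ins written for illustration; they are not GCC's `sext_hwi` or any other GCC API. The sketch assumes a 64-bit host integer and the usual two's complement conversion from unsigned to signed.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in, not GCC code: sign-extend the low PREC bits of
   SRC to a full 64-bit signed value.  */
int64_t
sext_sketch (int64_t src, unsigned int prec)
{
  assert (prec >= 1 && prec <= 64);
  if (prec == 64)
    return src;
  uint64_t mask = (UINT64_C (1) << prec) - 1;
  uint64_t low = (uint64_t) src & mask;
  uint64_t sign = UINT64_C (1) << (prec - 1);
  /* Flip the sign bit and subtract it back: this propagates the sign bit
     of the PREC-bit value into all higher bits.  */
  return (int64_t) ((low ^ sign) - sign);
}

/* Assumption for this sketch: a value counts as canonical for PREC bits
   when it already equals the sign extension of its low PREC bits.  */
int
is_canonical_sketch (int64_t value, unsigned int prec)
{
  return sext_sketch (value, prec) == value;
}
```

The half-precision encoding of -2.0 is 0xC000, i.e. 49152 when read as an unsigned 16-bit number. `sext_sketch (0xC000, 16)` gives -16384, which matches the value the updated tests scan for. Both numbers share the same low 16 bits, so the bytes placed in the object file should not change; only the textual operand of `.short` does.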