Message ID | 20240412081328.11183-1-adiupina@astralinux.ru |
---|---|
State | New |
Headers | show |
Series | [RFC] prevent overflow in xlnx_dpdma_desc_get_source_address() | expand |
On Fri, 12 Apr 2024 at 09:13, Alexandra Diupina <adiupina@astralinux.ru> wrote: > > Overflow can occur in a situation where desc->source_address > has a maximum value (pow(2, 32) - 1), so add a cast to a > larger type before the assignment. > > Found by Linux Verification Center (linuxtesting.org) with SVACE. > > Fixes: d3c6369a96 ("introduce xlnx-dpdma") > Signed-off-by: Alexandra Diupina <adiupina@astralinux.ru> > --- > hw/dma/xlnx_dpdma.c | 20 ++++++++++---------- > 1 file changed, 10 insertions(+), 10 deletions(-) > > diff --git a/hw/dma/xlnx_dpdma.c b/hw/dma/xlnx_dpdma.c > index 1f5cd64ed1..224259225c 100644 > --- a/hw/dma/xlnx_dpdma.c > +++ b/hw/dma/xlnx_dpdma.c > @@ -175,24 +175,24 @@ static uint64_t xlnx_dpdma_desc_get_source_address(DPDMADescriptor *desc, > > switch (frag) { > case 0: > - addr = desc->source_address > - + (extract32(desc->address_extension, 16, 12) << 20); > + addr = (uint64_t)(desc->source_address > + + (extract32(desc->address_extension, 16, 12) << 20)); Unless I'm confused, this cast doesn't help, because we will have already done a 32-bit addition of desc->source_address and the value from the address_extension part, so it doesn't change the result. If we want to do the addition at 64 bits then using extract64() would be the simplest way to arrange for that. However, I can't figure out what this code is trying to do and make that line up with the data sheet; maybe this isn't the right datasheet for this device? https://docs.amd.com/r/en-US/ug1085-zynq-ultrascale-trm/ADDR_EXT-Field The datasheet suggests that we should take 32 bits of the address from one field (here desc->source_address) and 16 bits of the address from another (here desc->address_extension's high bits) and combine them to make a 48 bit address. But this code is only looking at 12 bits of the high 16 in address_extension, and it doesn't shift them right enough to put them into bits [47:32] of the final address. Xilinx folks: what hardware is this modelling, and is it really the right behaviour? Also, this device looks like it has a host-endianness bug: the DPDMADescriptor struct is read directly from guest memory in dma_memory_read(), but the device never does anything to swap the fields from guest memory order to host memory order. So this is likely broken on big-endian hosts. thanks -- PMM
Peter, thank you! I agree with you that as mentioned in the documentation https://docs.amd.com/r/en-US/ug1085-zynq-ultrascale-trm/ADDR_EXT-Field, we should take 32 bits of the address from one field (for example, case 1, SRC_ADDR2_EXT - in code it is desc->source_address2) and 16 bits (high or low) of the address from another field (ADDR_EXT_23 - in code it is desc->address_extension_23, we need [15:0] bits) and combine them to make a 48 bit address. Therefore, I suggest making the following changes to the code so that it matches the documentation: static uint64_t xlnx_dpdma_desc_get_source_address(DPDMADescriptor *desc, uint8_t frag) { uint64_t addr = 0; assert(frag < 5); switch (frag) { case 0: addr = (uint64_t)desc->source_address + (extract64(desc->address_extension, 16, 16) << 32); break; case 1: addr = (uint64_t)desc->source_address2 + (extract64(desc->address_extension_23, 0, 16) << 32); break; case 2: addr = (uint64_t)desc->source_address3 + (extract64(desc->address_extension_23, 16, 16) << 32); break; case 3: addr = (uint64_t)desc->source_address4 + (extract64(desc->address_extension_45, 0, 16) << 32); break; case 4: addr = (uint64_t)desc->source_address5 + (extract64(desc->address_extension_45, 16, 16) << 32); break; default: addr = 0; break; } return addr; } This change adds a type cast and also uses extract64() instead of extract32() to avoid integer overflow on addition (there was a typo in the previous letter). Also in extract64() extracts a bit field with a length of 16 bits instead of 12, the shift is changed to 32 so that the extracted field fits into bits [47:32] of the final address. if this calculation is correct, I'm ready to create a second version of the patch. 12/04/24 13:06, Peter Maydell пишет: > On Fri, 12 Apr 2024 at 09:13, Alexandra Diupina <adiupina@astralinux.ru> wrote: >> Overflow can occur in a situation where desc->source_address >> has a maximum value (pow(2, 32) - 1), so add a cast to a >> larger type before the assignment. >> >> Found by Linux Verification Center (linuxtesting.org) with SVACE. >> >> Fixes: d3c6369a96 ("introduce xlnx-dpdma") >> Signed-off-by: Alexandra Diupina <adiupina@astralinux.ru> >> --- >> hw/dma/xlnx_dpdma.c | 20 ++++++++++---------- >> 1 file changed, 10 insertions(+), 10 deletions(-) >> >> diff --git a/hw/dma/xlnx_dpdma.c b/hw/dma/xlnx_dpdma.c >> index 1f5cd64ed1..224259225c 100644 >> --- a/hw/dma/xlnx_dpdma.c >> +++ b/hw/dma/xlnx_dpdma.c >> @@ -175,24 +175,24 @@ static uint64_t xlnx_dpdma_desc_get_source_address(DPDMADescriptor *desc, >> >> switch (frag) { >> case 0: >> - addr = desc->source_address >> - + (extract32(desc->address_extension, 16, 12) << 20); >> + addr = (uint64_t)(desc->source_address >> + + (extract32(desc->address_extension, 16, 12) << 20)); > Unless I'm confused, this cast doesn't help, because we > will have already done a 32-bit addition of desc->source_address > and the value from the address_extension part, so it doesn't > change the result. > > If we want to do the addition at 64 bits then using extract64() > would be the simplest way to arrange for that. > > However, I can't figure out what this code is trying to do and > make that line up with the data sheet; maybe this isn't the > right datasheet for this device? > > https://docs.amd.com/r/en-US/ug1085-zynq-ultrascale-trm/ADDR_EXT-Field > > The datasheet suggests that we should take 32 bits of the address > from one field (here desc->source_address) and 16 bits of the > address from another (here desc->address_extension's high bits) > and combine them to make a 48 bit address. But this code is only > looking at 12 bits of the high 16 in address_extension, and it > doesn't shift them right enough to put them into bits [47:32] > of the final address. > > Xilinx folks: what hardware is this modelling, and is it > really the right behaviour? > > Also, this device looks like it has a host-endianness bug: the > DPDMADescriptor struct is read directly from guest memory in > dma_memory_read(), but the device never does anything to swap > the fields from guest memory order to host memory order. So > this is likely broken on big-endian hosts. > > thanks > -- PMM
+ To: Fred On Tue, 16 Apr 2024 at 19:56, Alexandra Diupina <adiupina@astralinux.ru> wrote: > Peter, thank you! I agree with you that > as mentioned in the documentation > https://docs.amd.com/r/en-US/ug1085-zynq-ultrascale-trm/ADDR_EXT-Field, > we should take 32 bits of the address from one field > (for example, case 1, SRC_ADDR2_EXT - in code it is desc->source_address2) > and 16 bits (high or low) of the address from another field > (ADDR_EXT_23 - in code it is desc->address_extension_23, we need [15:0] > bits) > and combine them to make a 48 bit address. > > Therefore, I suggest making the following changes to the code > so that it matches the documentation: > > static uint64_t xlnx_dpdma_desc_get_source_address(DPDMADescriptor *desc, > uint8_t frag) > { > uint64_t addr = 0; > assert(frag < 5); > > switch (frag) { > case 0: > addr = (uint64_t)desc->source_address > + (extract64(desc->address_extension, 16, 16) << 32); > break; > case 1: > addr = (uint64_t)desc->source_address2 > + (extract64(desc->address_extension_23, 0, 16) << 32); > break; > case 2: > addr = (uint64_t)desc->source_address3 > + (extract64(desc->address_extension_23, 16, 16) << 32); > break; > case 3: > addr = (uint64_t)desc->source_address4 > + (extract64(desc->address_extension_45, 0, 16) << 32); > break; > case 4: > addr = (uint64_t)desc->source_address5 > + (extract64(desc->address_extension_45, 16, 16) << 32); > break; > default: > addr = 0; > break; > } > > return addr; > } > > > This change adds a type cast and also uses extract64() instead of > extract32() > to avoid integer overflow on addition (there was a typo in the previous > letter). > Also in extract64() extracts a bit field with a length of 16 bits > instead of 12, > the shift is changed to 32 so that the extracted field fits into bits > [47:32] of the final address. > > if this calculation is correct, I'm ready to create a second version of > the patch. > > > > > 12/04/24 13:06, Peter Maydell пишет: > > On Fri, 12 Apr 2024 at 09:13, Alexandra Diupina <adiupina@astralinux.ru> > wrote: > >> Overflow can occur in a situation where desc->source_address > >> has a maximum value (pow(2, 32) - 1), so add a cast to a > >> larger type before the assignment. > >> > >> Found by Linux Verification Center (linuxtesting.org) with SVACE. > >> > >> Fixes: d3c6369a96 ("introduce xlnx-dpdma") > >> Signed-off-by: Alexandra Diupina <adiupina@astralinux.ru> > >> --- > >> hw/dma/xlnx_dpdma.c | 20 ++++++++++---------- > >> 1 file changed, 10 insertions(+), 10 deletions(-) > >> > >> diff --git a/hw/dma/xlnx_dpdma.c b/hw/dma/xlnx_dpdma.c > >> index 1f5cd64ed1..224259225c 100644 > >> --- a/hw/dma/xlnx_dpdma.c > >> +++ b/hw/dma/xlnx_dpdma.c > >> @@ -175,24 +175,24 @@ static uint64_t > xlnx_dpdma_desc_get_source_address(DPDMADescriptor *desc, > >> > >> switch (frag) { > >> case 0: > >> - addr = desc->source_address > >> - + (extract32(desc->address_extension, 16, 12) << 20); > >> + addr = (uint64_t)(desc->source_address > >> + + (extract32(desc->address_extension, 16, 12) << 20)); > > Unless I'm confused, this cast doesn't help, because we > > will have already done a 32-bit addition of desc->source_address > > and the value from the address_extension part, so it doesn't > > change the result. > > > > If we want to do the addition at 64 bits then using extract64() > > would be the simplest way to arrange for that. > > > > However, I can't figure out what this code is trying to do and > > make that line up with the data sheet; maybe this isn't the > > right datasheet for this device? > > > > https://docs.amd.com/r/en-US/ug1085-zynq-ultrascale-trm/ADDR_EXT-Field > > > > The datasheet suggests that we should take 32 bits of the address > > from one field (here desc->source_address) and 16 bits of the > > address from another (here desc->address_extension's high bits) > > and combine them to make a 48 bit address. But this code is only > > looking at 12 bits of the high 16 in address_extension, and it > > doesn't shift them right enough to put them into bits [47:32] > > of the final address. > > > > Xilinx folks: what hardware is this modelling, and is it > > really the right behaviour? > > > > Also, this device looks like it has a host-endianness bug: the > > DPDMADescriptor struct is read directly from guest memory in > > dma_memory_read(), but the device never does anything to swap > > the fields from guest memory order to host memory order. So > > this is likely broken on big-endian hosts. > > > > thanks > > -- PMM > > >
Hi, > -----Original Message----- > From: qemu-devel-bounces+fkonrad=amd.com@nongnu.org <qemu-devel-bounces+fkonrad=amd.com@nongnu.org> On Behalf Of > Peter Maydell > Sent: Friday, April 12, 2024 12:07 PM > To: Alexandra Diupina <adiupina@astralinux.ru> > Cc: Alistair Francis <alistair@alistair23.me>; Edgar E. Iglesias <edgar.iglesias@gmail.com>; qemu-arm@nongnu.org; qemu- > devel@nongnu.org; sdl.qemu@linuxtesting.org > Subject: Re: [PATCH RFC] prevent overflow in xlnx_dpdma_desc_get_source_address() > > On Fri, 12 Apr 2024 at 09:13, Alexandra Diupina <adiupina@astralinux.ru> wrote: > > > > Overflow can occur in a situation where desc->source_address > > has a maximum value (pow(2, 32) - 1), so add a cast to a > > larger type before the assignment. > > > > Found by Linux Verification Center (linuxtesting.org) with SVACE. > > > > Fixes: d3c6369a96 ("introduce xlnx-dpdma") > > Signed-off-by: Alexandra Diupina <adiupina@astralinux.ru> > > --- > > hw/dma/xlnx_dpdma.c | 20 ++++++++++---------- > > 1 file changed, 10 insertions(+), 10 deletions(-) > > > > diff --git a/hw/dma/xlnx_dpdma.c b/hw/dma/xlnx_dpdma.c > > index 1f5cd64ed1..224259225c 100644 > > --- a/hw/dma/xlnx_dpdma.c > > +++ b/hw/dma/xlnx_dpdma.c > > @@ -175,24 +175,24 @@ static uint64_t xlnx_dpdma_desc_get_source_address(DPDMADescriptor *desc, > > > > switch (frag) { > > case 0: > > - addr = desc->source_address > > - + (extract32(desc->address_extension, 16, 12) << 20); > > + addr = (uint64_t)(desc->source_address > > + + (extract32(desc->address_extension, 16, 12) << 20)); > > Unless I'm confused, this cast doesn't help, because we > will have already done a 32-bit addition of desc->source_address > and the value from the address_extension part, so it doesn't > change the result. > > If we want to do the addition at 64 bits then using extract64() > would be the simplest way to arrange for that. > > However, I can't figure out what this code is trying to do and > make that line up with the data sheet; maybe this isn't the > right datasheet for this device? > > https://docs.amd.com/r/en-US/ug1085-zynq-ultrascale-trm/ADDR_EXT-Field > > The datasheet suggests that we should take 32 bits of the address > from one field (here desc->source_address) and 16 bits of the > address from another (here desc->address_extension's high bits) > and combine them to make a 48 bit address. But this code is only > looking at 12 bits of the high 16 in address_extension, and it > doesn't shift them right enough to put them into bits [47:32] > of the final address. > > Xilinx folks: what hardware is this modelling, and is it > really the right behaviour? Looks like this is the right documentation. Most probably the descriptor field changed since I did that model, or I got really confused. > > Also, this device looks like it has a host-endianness bug: the > DPDMADescriptor struct is read directly from guest memory in > dma_memory_read(), but the device never does anything to swap > the fields from guest memory order to host memory order. So > this is likely broken on big-endian hosts. > Yes indeed. Best Regards, Fred > thanks > -- PMM
17/04/24 13:05, Konrad, Frederic пишет: > Hi, > >> -----Original Message----- >> From: qemu-devel-bounces+fkonrad=amd.com@nongnu.org <qemu-devel-bounces+fkonrad=amd.com@nongnu.org> On Behalf Of >> Peter Maydell >> Sent: Friday, April 12, 2024 12:07 PM >> To: Alexandra Diupina <adiupina@astralinux.ru> >> Cc: Alistair Francis <alistair@alistair23.me>; Edgar E. Iglesias <edgar.iglesias@gmail.com>; qemu-arm@nongnu.org; qemu- >> devel@nongnu.org; sdl.qemu@linuxtesting.org >> Subject: Re: [PATCH RFC] prevent overflow in xlnx_dpdma_desc_get_source_address() >> >> On Fri, 12 Apr 2024 at 09:13, Alexandra Diupina <adiupina@astralinux.ru> wrote: >>> Overflow can occur in a situation where desc->source_address >>> has a maximum value (pow(2, 32) - 1), so add a cast to a >>> larger type before the assignment. >>> >>> Found by Linux Verification Center (linuxtesting.org) with SVACE. >>> >>> Fixes: d3c6369a96 ("introduce xlnx-dpdma") >>> Signed-off-by: Alexandra Diupina <adiupina@astralinux.ru> >>> --- >>> hw/dma/xlnx_dpdma.c | 20 ++++++++++---------- >>> 1 file changed, 10 insertions(+), 10 deletions(-) >>> >>> diff --git a/hw/dma/xlnx_dpdma.c b/hw/dma/xlnx_dpdma.c >>> index 1f5cd64ed1..224259225c 100644 >>> --- a/hw/dma/xlnx_dpdma.c >>> +++ b/hw/dma/xlnx_dpdma.c >>> @@ -175,24 +175,24 @@ static uint64_t xlnx_dpdma_desc_get_source_address(DPDMADescriptor *desc, >>> >>> switch (frag) { >>> case 0: >>> - addr = desc->source_address >>> - + (extract32(desc->address_extension, 16, 12) << 20); >>> + addr = (uint64_t)(desc->source_address >>> + + (extract32(desc->address_extension, 16, 12) << 20)); >> Unless I'm confused, this cast doesn't help, because we >> will have already done a 32-bit addition of desc->source_address >> and the value from the address_extension part, so it doesn't >> change the result. >> >> If we want to do the addition at 64 bits then using extract64() >> would be the simplest way to arrange for that. >> >> However, I can't figure out what this code is trying to do and >> make that line up with the data sheet; maybe this isn't the >> right datasheet for this device? >> >> https://docs.amd.com/r/en-US/ug1085-zynq-ultrascale-trm/ADDR_EXT-Field >> >> The datasheet suggests that we should take 32 bits of the address >> from one field (here desc->source_address) and 16 bits of the >> address from another (here desc->address_extension's high bits) >> and combine them to make a 48 bit address. But this code is only >> looking at 12 bits of the high 16 in address_extension, and it >> doesn't shift them right enough to put them into bits [47:32] >> of the final address. >> >> Xilinx folks: what hardware is this modelling, and is it >> really the right behaviour? > Looks like this is the right documentation. Most probably the descriptor field changed > since I did that model, or I got really confused. > >> Also, this device looks like it has a host-endianness bug: the >> DPDMADescriptor struct is read directly from guest memory in >> dma_memory_read(), but the device never does anything to swap >> the fields from guest memory order to host memory order. So >> this is likely broken on big-endian hosts. >> > Yes indeed. > > Best Regards, > Fred > >> thanks >> -- PMM hw/dma/xlnx_dpdma.c defines a dma_ops structure with the endianness field set to DEVICE_NATIVE_ENDIAN. Doesn't this guarantee that there will be no endian errors? Alexandra
On Tue, 23 Apr 2024 at 11:23, Alexandra Diupina <adiupina@astralinux.ru> wrote: > 17/04/24 13:05, Konrad, Frederic пишет: > > Peter Maydell wrote: > >> Also, this device looks like it has a host-endianness bug: the > >> DPDMADescriptor struct is read directly from guest memory in > >> dma_memory_read(), but the device never does anything to swap > >> the fields from guest memory order to host memory order. So > >> this is likely broken on big-endian hosts. > hw/dma/xlnx_dpdma.c defines a dma_ops structure > with the endianness field set to DEVICE_NATIVE_ENDIAN. > Doesn't this guarantee that there will be no endian errors? The endianness on a MemoryRegionOps struct tells the memory core how to handle guest accesses to the *registers* that struct is for. So if the .endianness is DEVICE_LITTLE_ENDIAN and the guest is big-endian, the memory core will byteswap the data values. That means that the read and write functions always take and return "value" arguments which are what you expect (i.e. the guest wrote 0x12345678 and the write fn value argument is 0x12345678). However, if a device does direct accesses to guest *memory* (like a DMA-capable device will do), that is something it has to handle the endianness for itself. (The device's manual should say what way round it does memory loads.) There are two basic ways a device can do guest memory access: (1) we provide functions which say "load from guest memory a value of this type with this endianness"; for instance ldq_le_dma() will load a 64-bit little-endian value into a host C uint64_t, and ldq_be_dma() will do the same for a 64-bit big-endian value. There are similar functions for stores. This can be the simplest approach but it's a bit less efficient because it goes up and down the memory subsystem for each data item being read. (2) we provide functions which are "load/store a sequence of bytes from guest memory". This is what dma_memory_read() does. Because it's just a sequence of bytes, the device code is then responsible for interpreting what those bytes mean: maybe they're just bytes, or maybe they're a sequence of little-endian 32-bit values, or maybe they're something else. The device code also has to deal with the fact that the alignment of these values in memory is not necessarily the same as what the host CPU requires. These things together mean you can't simply cast the pointer to a sequence-of-bytes to a host type, or tell dma_memory_read() to write the sequence-of-bytes to a host struct type. We provide functions like ldl_le_p() to say "read a little-endian 32-bit value from this host pointer location" to assist in pulling values out of a sequence-of-bytes array. Or if the data is all of the same type (eg it's an array of 32-bit values) you can dma_memory_read() it into a host uint32_t[] array and then use le32_to_cpu() to convert those uint32_t values to the host's endianness. (I use the dma_* family of functions here as an example since that's what xlnx_dpdma.c happens to use; the same general principle applies for our various other families of guest load/store functions, like the address_space_* ones.) thanks -- PMM
diff --git a/hw/dma/xlnx_dpdma.c b/hw/dma/xlnx_dpdma.c index 1f5cd64ed1..224259225c 100644 --- a/hw/dma/xlnx_dpdma.c +++ b/hw/dma/xlnx_dpdma.c @@ -175,24 +175,24 @@ static uint64_t xlnx_dpdma_desc_get_source_address(DPDMADescriptor *desc, switch (frag) { case 0: - addr = desc->source_address - + (extract32(desc->address_extension, 16, 12) << 20); + addr = (uint64_t)(desc->source_address + + (extract32(desc->address_extension, 16, 12) << 20)); break; case 1: - addr = desc->source_address2 - + (extract32(desc->address_extension_23, 0, 12) << 8); + addr = (uint64_t)(desc->source_address2 + + (extract32(desc->address_extension_23, 0, 12) << 8)); break; case 2: - addr = desc->source_address3 - + (extract32(desc->address_extension_23, 16, 12) << 20); + addr = (uint64_t)(desc->source_address3 + + (extract32(desc->address_extension_23, 16, 12) << 20)); break; case 3: - addr = desc->source_address4 - + (extract32(desc->address_extension_45, 0, 12) << 8); + addr = (uint64_t)(desc->source_address4 + + (extract32(desc->address_extension_45, 0, 12) << 8)); break; case 4: - addr = desc->source_address5 - + (extract32(desc->address_extension_45, 16, 12) << 20); + addr = (uint64_t)(desc->source_address5 + + (extract32(desc->address_extension_45, 16, 12) << 20)); break; default: addr = 0;
Overflow can occur in a situation where desc->source_address has a maximum value (pow(2, 32) - 1), so add a cast to a larger type before the assignment. Found by Linux Verification Center (linuxtesting.org) with SVACE. Fixes: d3c6369a96 ("introduce xlnx-dpdma") Signed-off-by: Alexandra Diupina <adiupina@astralinux.ru> --- hw/dma/xlnx_dpdma.c | 20 ++++++++++---------- 1 file changed, 10 insertions(+), 10 deletions(-)