diff mbox

vfio: add check for memory region overflow condition

Message ID jpg8u1blcdp.fsf@linux.bootlegged.copy
State New
Headers show

Commit Message

Bandan Das March 21, 2016, 10 p.m. UTC
vfio_listener_region_add for a iommu mr results in
an overflow assert since emulated iommu memory region is initialized
with UINT64_MAX. Add a check just like memory_region_size()
does.

Signed-off-by: Bandan Das <bsd@redhat.com>
---
 hw/vfio/common.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

Comments

Alex Williamson March 21, 2016, 10:34 p.m. UTC | #1
On Mon, 21 Mar 2016 18:00:50 -0400
Bandan Das <bsd@redhat.com> wrote:

> vfio_listener_region_add for a iommu mr results in
> an overflow assert since emulated iommu memory region is initialized
> with UINT64_MAX. Add a check just like memory_region_size()
> does.
> 
> Signed-off-by: Bandan Das <bsd@redhat.com>
> ---
>  hw/vfio/common.c | 7 ++++++-
>  1 file changed, 6 insertions(+), 1 deletion(-)
> 
> diff --git a/hw/vfio/common.c b/hw/vfio/common.c
> index fb588d8..269244b 100644
> --- a/hw/vfio/common.c
> +++ b/hw/vfio/common.c
> @@ -349,7 +349,12 @@ static void vfio_listener_region_add(MemoryListener *listener,
>      if (int128_ge(int128_make64(iova), llend)) {
>          return;
>      }
> -    end = int128_get64(llend);
> +
> +    if (int128_eq(llend, int128_2_64())) {
> +            end = UINT64_MAX;
> +    } else {
> +            end = int128_get64(llend);
> +    }
>  
>      if ((iova < container->min_iova) || ((end - 1) > container->max_iova)) {
>          error_report("vfio: IOMMU container %p can't map guest IOVA region"

But now all the calculations where we use end-1 are wrong.  See the
discussion with Pierre Morel in the January qemu-devel archives.
There's a solution in there, but I never saw a follow-up from Pierre
with a revised patch.  Thanks,

Alex
Bandan Das March 22, 2016, 12:06 a.m. UTC | #2
Alex Williamson <alex.williamson@redhat.com> writes:

> On Mon, 21 Mar 2016 18:00:50 -0400
> Bandan Das <bsd@redhat.com> wrote:
>
>> vfio_listener_region_add for a iommu mr results in
>> an overflow assert since emulated iommu memory region is initialized
>> with UINT64_MAX. Add a check just like memory_region_size()
>> does.
>> 
>> Signed-off-by: Bandan Das <bsd@redhat.com>
>> ---
>>  hw/vfio/common.c | 7 ++++++-
>>  1 file changed, 6 insertions(+), 1 deletion(-)
>> 
>> diff --git a/hw/vfio/common.c b/hw/vfio/common.c
>> index fb588d8..269244b 100644
>> --- a/hw/vfio/common.c
>> +++ b/hw/vfio/common.c
>> @@ -349,7 +349,12 @@ static void vfio_listener_region_add(MemoryListener *listener,
>>      if (int128_ge(int128_make64(iova), llend)) {
>>          return;
>>      }
>> -    end = int128_get64(llend);
>> +
>> +    if (int128_eq(llend, int128_2_64())) {
>> +            end = UINT64_MAX;
>> +    } else {
>> +            end = int128_get64(llend);
>> +    }
>>  
>>      if ((iova < container->min_iova) || ((end - 1) > container->max_iova)) {
>>          error_report("vfio: IOMMU container %p can't map guest IOVA region"
>
> But now all the calculations where we use end-1 are wrong.  See the
> discussion with Pierre Morel in the January qemu-devel archives.
> There's a solution in there, but I never saw a follow-up from Pierre
> with a revised patch.  Thanks,

I am missing something. When end < UIN64_MAX, end - 1 calculations are valid because
the patch doesn't change that behavior. When end is UINT64_MAX, int128_get64() doesn't know how
to calculate this value and we are just feeding it manually. The patch is just the opposite
of what memory_region_init() did to init the mem region in the first place:
   mr->size = int128_make64(size);
   if (size == UINT64_MAX) {
      mr->size = int128_2_64();
   }
So, end - 1 is still valid for end = UINT64_MAX, no ?

> Alex
Alex Williamson March 22, 2016, 12:30 a.m. UTC | #3
On Mon, 21 Mar 2016 20:06:32 -0400
Bandan Das <bsd@redhat.com> wrote:

> Alex Williamson <alex.williamson@redhat.com> writes:
> 
> > On Mon, 21 Mar 2016 18:00:50 -0400
> > Bandan Das <bsd@redhat.com> wrote:
> >  
> >> vfio_listener_region_add for a iommu mr results in
> >> an overflow assert since emulated iommu memory region is initialized
> >> with UINT64_MAX. Add a check just like memory_region_size()
> >> does.
> >> 
> >> Signed-off-by: Bandan Das <bsd@redhat.com>
> >> ---
> >>  hw/vfio/common.c | 7 ++++++-
> >>  1 file changed, 6 insertions(+), 1 deletion(-)
> >> 
> >> diff --git a/hw/vfio/common.c b/hw/vfio/common.c
> >> index fb588d8..269244b 100644
> >> --- a/hw/vfio/common.c
> >> +++ b/hw/vfio/common.c
> >> @@ -349,7 +349,12 @@ static void vfio_listener_region_add(MemoryListener *listener,
> >>      if (int128_ge(int128_make64(iova), llend)) {
> >>          return;
> >>      }
> >> -    end = int128_get64(llend);
> >> +
> >> +    if (int128_eq(llend, int128_2_64())) {
> >> +            end = UINT64_MAX;
> >> +    } else {
> >> +            end = int128_get64(llend);
> >> +    }
> >>  
> >>      if ((iova < container->min_iova) || ((end - 1) > container->max_iova)) {
> >>          error_report("vfio: IOMMU container %p can't map guest IOVA region"  
> >
> > But now all the calculations where we use end-1 are wrong.  See the
> > discussion with Pierre Morel in the January qemu-devel archives.
> > There's a solution in there, but I never saw a follow-up from Pierre
> > with a revised patch.  Thanks,  
> 
> I am missing something. When end < UIN64_MAX, end - 1 calculations are valid because
> the patch doesn't change that behavior. When end is UINT64_MAX, int128_get64() doesn't know how
> to calculate this value and we are just feeding it manually. The patch is just the opposite
> of what memory_region_init() did to init the mem region in the first place:
>    mr->size = int128_make64(size);
>    if (size == UINT64_MAX) {
>       mr->size = int128_2_64();
>    }
> So, end - 1 is still valid for end = UINT64_MAX, no ?

int128_2_64() is not equal to UINT64_MAX, so assigning UIN64_MAX to
@end is clearing altering the value.  If we had a range from zero to
int128_2_64() then the size of that region is int128_2_64().  If we
alter @end to be UINT64_MAX, then the size is only UINT64_MAX and @end
- 1 is off by one versus the case where we use the value directly.
You're effectively changing @end to be the last address in the range,
but only in some cases, and not adjusting the remaining code to match.
Not only that, but the vfio map command is probably going to fail if we
pass in such an unaligned size since the mapping granularity is
likely the system page size.  Thanks,

Alex
Bandan Das March 22, 2016, 1:54 a.m. UTC | #4
Alex Williamson <alex.williamson@redhat.com> writes:

> On Mon, 21 Mar 2016 20:06:32 -0400
> Bandan Das <bsd@redhat.com> wrote:
>
>> Alex Williamson <alex.williamson@redhat.com> writes:
>> 
>> > On Mon, 21 Mar 2016 18:00:50 -0400
>> > Bandan Das <bsd@redhat.com> wrote:
>> >  
>> >> vfio_listener_region_add for a iommu mr results in
>> >> an overflow assert since emulated iommu memory region is initialized
>> >> with UINT64_MAX. Add a check just like memory_region_size()
>> >> does.
>> >> 
>> >> Signed-off-by: Bandan Das <bsd@redhat.com>
>> >> ---
>> >>  hw/vfio/common.c | 7 ++++++-
>> >>  1 file changed, 6 insertions(+), 1 deletion(-)
>> >> 
>> >> diff --git a/hw/vfio/common.c b/hw/vfio/common.c
>> >> index fb588d8..269244b 100644
>> >> --- a/hw/vfio/common.c
>> >> +++ b/hw/vfio/common.c
>> >> @@ -349,7 +349,12 @@ static void vfio_listener_region_add(MemoryListener *listener,
>> >>      if (int128_ge(int128_make64(iova), llend)) {
>> >>          return;
>> >>      }
>> >> -    end = int128_get64(llend);
>> >> +
>> >> +    if (int128_eq(llend, int128_2_64())) {
>> >> +            end = UINT64_MAX;
>> >> +    } else {
>> >> +            end = int128_get64(llend);
>> >> +    }
>> >>  
>> >>      if ((iova < container->min_iova) || ((end - 1) > container->max_iova)) {
>> >>          error_report("vfio: IOMMU container %p can't map guest IOVA region"  
>> >
>> > But now all the calculations where we use end-1 are wrong.  See the
>> > discussion with Pierre Morel in the January qemu-devel archives.
>> > There's a solution in there, but I never saw a follow-up from Pierre
>> > with a revised patch.  Thanks,  
>> 
>> I am missing something. When end < UIN64_MAX, end - 1 calculations are valid because
>> the patch doesn't change that behavior. When end is UINT64_MAX, int128_get64() doesn't know how
>> to calculate this value and we are just feeding it manually. The patch is just the opposite
>> of what memory_region_init() did to init the mem region in the first place:
>>    mr->size = int128_make64(size);
>>    if (size == UINT64_MAX) {
>>       mr->size = int128_2_64();
>>    }
>> So, end - 1 is still valid for end = UINT64_MAX, no ?
>
> int128_2_64() is not equal to UINT64_MAX, so assigning UIN64_MAX to
> @end is clearing altering the value.  If we had a range from zero to

I thought in128_2_64 is the 128 bit representation of UINT64_MAX. The
if condition in memory_region_init doesn't make sense otherwise.

> int128_2_64() then the size of that region is int128_2_64().  If we
> alter @end to be UINT64_MAX, then the size is only UINT64_MAX and @end
> - 1 is off by one versus the case where we use the value directly.

Ok, you mean something like:
int128_get64(int128_sub(int128_2_64(), int128_make64(1)));  for (end - 1) ?
But we still have to deal with (end - iova) when calling vfio_dmap_map().
int128_get64() will definitely assert for iova = 0. 

> You're effectively changing @end to be the last address in the range,

No, I think I am changing "end" to what we initally started with for size
before converting to 128 bit.

> but only in some cases, and not adjusting the remaining code to match.
> Not only that, but the vfio map command is probably going to fail if we
> pass in such an unaligned size since the mapping granularity is

Trying to map such a large region is wrong anyway, I am still trying
to workout a solution to avoid calling memory_region_init_iommu()
with UINT64_MAX which is what emulated vt-d currently does.

> likely the system page size.  Thanks,
>
> Alex
Alex Williamson March 22, 2016, 2:16 a.m. UTC | #5
On Mon, 21 Mar 2016 21:54:48 -0400
Bandan Das <bsd@redhat.com> wrote:

> Alex Williamson <alex.williamson@redhat.com> writes:
> 
> > On Mon, 21 Mar 2016 20:06:32 -0400
> > Bandan Das <bsd@redhat.com> wrote:
> >  
> >> Alex Williamson <alex.williamson@redhat.com> writes:
> >>   
> >> > On Mon, 21 Mar 2016 18:00:50 -0400
> >> > Bandan Das <bsd@redhat.com> wrote:
> >> >    
> >> >> vfio_listener_region_add for a iommu mr results in
> >> >> an overflow assert since emulated iommu memory region is initialized
> >> >> with UINT64_MAX. Add a check just like memory_region_size()
> >> >> does.
> >> >> 
> >> >> Signed-off-by: Bandan Das <bsd@redhat.com>
> >> >> ---
> >> >>  hw/vfio/common.c | 7 ++++++-
> >> >>  1 file changed, 6 insertions(+), 1 deletion(-)
> >> >> 
> >> >> diff --git a/hw/vfio/common.c b/hw/vfio/common.c
> >> >> index fb588d8..269244b 100644
> >> >> --- a/hw/vfio/common.c
> >> >> +++ b/hw/vfio/common.c
> >> >> @@ -349,7 +349,12 @@ static void vfio_listener_region_add(MemoryListener *listener,
> >> >>      if (int128_ge(int128_make64(iova), llend)) {
> >> >>          return;
> >> >>      }
> >> >> -    end = int128_get64(llend);
> >> >> +
> >> >> +    if (int128_eq(llend, int128_2_64())) {
> >> >> +            end = UINT64_MAX;
> >> >> +    } else {
> >> >> +            end = int128_get64(llend);
> >> >> +    }
> >> >>  
> >> >>      if ((iova < container->min_iova) || ((end - 1) > container->max_iova)) {
> >> >>          error_report("vfio: IOMMU container %p can't map guest IOVA region"    
> >> >
> >> > But now all the calculations where we use end-1 are wrong.  See the
> >> > discussion with Pierre Morel in the January qemu-devel archives.
> >> > There's a solution in there, but I never saw a follow-up from Pierre
> >> > with a revised patch.  Thanks,    
> >> 
> >> I am missing something. When end < UIN64_MAX, end - 1 calculations are valid because
> >> the patch doesn't change that behavior. When end is UINT64_MAX, int128_get64() doesn't know how
> >> to calculate this value and we are just feeding it manually. The patch is just the opposite
> >> of what memory_region_init() did to init the mem region in the first place:
> >>    mr->size = int128_make64(size);
> >>    if (size == UINT64_MAX) {
> >>       mr->size = int128_2_64();
> >>    }
> >> So, end - 1 is still valid for end = UINT64_MAX, no ?  
> >
> > int128_2_64() is not equal to UINT64_MAX, so assigning UIN64_MAX to
> > @end is clearing altering the value.  If we had a range from zero to  
> 
> I thought in128_2_64 is the 128 bit representation of UINT64_MAX. The
> if condition in memory_region_init doesn't make sense otherwise.

2^64 cannot be represented with a uint64_t, 2^64 - 1 can:

int128_2_64 = 1_0000_0000_0000_0000h
UINT64_MAX  =   ffff_ffff_ffff_ffffh
 
> > int128_2_64() then the size of that region is int128_2_64().  If we
> > alter @end to be UINT64_MAX, then the size is only UINT64_MAX and @end
> > - 1 is off by one versus the case where we use the value directly.  
> 
> Ok, you mean something like:
> int128_get64(int128_sub(int128_2_64(), int128_make64(1)));  for (end - 1) ?
> But we still have to deal with (end - iova) when calling vfio_dmap_map().
> int128_get64() will definitely assert for iova = 0. 

I don't know that that's the most efficient way to handle it, but @end
represents a different thing by imposing that -1 and it needs to be
handled in the reset of the code.

> > You're effectively changing @end to be the last address in the range,  
> 
> No, I think I am changing "end" to what we initally started with for size
> before converting to 128 bit.

Nope, it's the difference between the size of the region and the last
address of the region.

> > but only in some cases, and not adjusting the remaining code to match.
> > Not only that, but the vfio map command is probably going to fail if we
> > pass in such an unaligned size since the mapping granularity is  
> 
> Trying to map such a large region is wrong anyway, I am still trying
> to workout a solution to avoid calling memory_region_init_iommu()
> with UINT64_MAX which is what emulated vt-d currently does.

Right, the address width of the IOMMU on x86 is typically nowhere near
2^64, so if you take the vfio_dma_map path, you'll surely explode.
Does this fix actually fix anything or just move us to the next
assert?  Thanks,

Alex
Peter Xu March 22, 2016, 3:01 a.m. UTC | #6
On Mon, Mar 21, 2016 at 06:00:50PM -0400, Bandan Das wrote:
> 
> vfio_listener_region_add for a iommu mr results in
> an overflow assert since emulated iommu memory region is initialized
> with UINT64_MAX. Add a check just like memory_region_size()
> does.

Hi, Bandan,

In case you missed this:

https://lists.gnu.org/archive/html/qemu-devel/2016-03/msg02865.html

-- peterx
Bandan Das March 22, 2016, 6:55 p.m. UTC | #7
Alex Williamson <alex.williamson@redhat.com> writes:
...
>> >>    mr->size = int128_make64(size);
>> >>    if (size == UINT64_MAX) {
>> >>       mr->size = int128_2_64();
>> >>    }
>> >> So, end - 1 is still valid for end = UINT64_MAX, no ?  
>> >
>> > int128_2_64() is not equal to UINT64_MAX, so assigning UIN64_MAX to
>> > @end is clearing altering the value.  If we had a range from zero to  
>> 
>> I thought in128_2_64 is the 128 bit representation of UINT64_MAX. The
>> if condition in memory_region_init doesn't make sense otherwise.
>
> 2^64 cannot be represented with a uint64_t, 2^64 - 1 can:
>
> int128_2_64 = 1_0000_0000_0000_0000h
> UINT64_MAX  =   ffff_ffff_ffff_ffffh

Thanks, understood this part. I still don't understand the if condition
in memory_region_init however. Unless, that function actually takes the
last address for the size parameter and in that case, it should be
UINT64_MAX-1 for a size of UINT64_MAX.

>> > int128_2_64() then the size of that region is int128_2_64().  If we
>> > alter @end to be UINT64_MAX, then the size is only UINT64_MAX and @end
>> > - 1 is off by one versus the case where we use the value directly.  
>> 
>> Ok, you mean something like:
>> int128_get64(int128_sub(int128_2_64(), int128_make64(1)));  for (end - 1) ?
>> But we still have to deal with (end - iova) when calling vfio_dmap_map().
>> int128_get64() will definitely assert for iova = 0. 
>
> I don't know that that's the most efficient way to handle it, but @end
> represents a different thing by imposing that -1 and it needs to be
> handled in the reset of the code.
>
>> > You're effectively changing @end to be the last address in the range,  
>> 
>> No, I think I am changing "end" to what we initally started with for size
>> before converting to 128 bit.
>
> Nope, it's the difference between the size of the region and the last
> address of the region.

Ok, but note that it's the "size" that actually asserts here since the
offset is 0. So, we started with a size UINT64_MAX but end with mr->size =
128_2_64().

>> > but only in some cases, and not adjusting the remaining code to match.
>> > Not only that, but the vfio map command is probably going to fail if we
>> > pass in such an unaligned size since the mapping granularity is  
>> 
>> Trying to map such a large region is wrong anyway, I am still trying
>> to workout a solution to avoid calling memory_region_init_iommu()
>> with UINT64_MAX which is what emulated vt-d currently does.
>
> Right, the address width of the IOMMU on x86 is typically nowhere near
> 2^64, so if you take the vfio_dma_map path, you'll surely explode.

And it does. If we fix this assert, then vfio_dma_map() attempts mapping
this direct mapped address range starting from 0 and prints a 
warning message; happens for the whole range and goes on for ever.
The overflow check seemed to me like something we should fix, but now
I am more confused then ever!

> Does this fix actually fix anything or just move us to the next
> assert?  Thanks,
>
> Alex
Bandan Das March 22, 2016, 7:07 p.m. UTC | #8
Peter Xu <peterx@redhat.com> writes:

> On Mon, Mar 21, 2016 at 06:00:50PM -0400, Bandan Das wrote:
>> 
>> vfio_listener_region_add for a iommu mr results in
>> an overflow assert since emulated iommu memory region is initialized
>> with UINT64_MAX. Add a check just like memory_region_size()
>> does.
>
> Hi, Bandan,
>
> In case you missed this:
>
> https://lists.gnu.org/archive/html/qemu-devel/2016-03/msg02865.html

Thank you Peter, I wasn't aware. But unfortunately, I don't think he's doing the right
thing either when handing @end ? Oh well, it's a RFC :) 

> -- peterx
Alex Williamson March 22, 2016, 7:31 p.m. UTC | #9
On Tue, 22 Mar 2016 14:55:14 -0400
Bandan Das <bsd@redhat.com> wrote:

> Alex Williamson <alex.williamson@redhat.com> writes:
> ...
> >> >>    mr->size = int128_make64(size);
> >> >>    if (size == UINT64_MAX) {
> >> >>       mr->size = int128_2_64();
> >> >>    }
> >> >> So, end - 1 is still valid for end = UINT64_MAX, no ?    
> >> >
> >> > int128_2_64() is not equal to UINT64_MAX, so assigning UIN64_MAX to
> >> > @end is clearing altering the value.  If we had a range from zero to    
> >> 
> >> I thought in128_2_64 is the 128 bit representation of UINT64_MAX. The
> >> if condition in memory_region_init doesn't make sense otherwise.  
> >
> > 2^64 cannot be represented with a uint64_t, 2^64 - 1 can:
> >
> > int128_2_64 = 1_0000_0000_0000_0000h
> > UINT64_MAX  =   ffff_ffff_ffff_ffffh  
> 
> Thanks, understood this part. I still don't understand the if condition
> in memory_region_init however. Unless, that function actually takes the
> last address for the size parameter and in that case, it should be
> UINT64_MAX-1 for a size of UINT64_MAX.

Seems like some sort of compatibility convention since
memory_region_init() only takes a uint64_t as the size.  memory.c
interprets that as "oh, I know you really mean 2^64".
 
> >> > int128_2_64() then the size of that region is int128_2_64().  If we
> >> > alter @end to be UINT64_MAX, then the size is only UINT64_MAX and @end
> >> > - 1 is off by one versus the case where we use the value directly.    
> >> 
> >> Ok, you mean something like:
> >> int128_get64(int128_sub(int128_2_64(), int128_make64(1)));  for (end - 1) ?
> >> But we still have to deal with (end - iova) when calling vfio_dmap_map().
> >> int128_get64() will definitely assert for iova = 0.   
> >
> > I don't know that that's the most efficient way to handle it, but @end
> > represents a different thing by imposing that -1 and it needs to be
> > handled in the reset of the code.
> >  
> >> > You're effectively changing @end to be the last address in the range,    
> >> 
> >> No, I think I am changing "end" to what we initally started with for size
> >> before converting to 128 bit.  
> >
> > Nope, it's the difference between the size of the region and the last
> > address of the region.  
> 
> Ok, but note that it's the "size" that actually asserts here since the
> offset is 0. So, we started with a size UINT64_MAX but end with mr->size =
> 128_2_64().

A size of UINT64_MAX doesn't make much sense to me, that would mean the
last address is UINT64_MAX - 1.  A size of 2^64 means the last address
is UINT64_MAX, which seems reasonable.
 
> >> > but only in some cases, and not adjusting the remaining code to match.
> >> > Not only that, but the vfio map command is probably going to fail if we
> >> > pass in such an unaligned size since the mapping granularity is    
> >> 
> >> Trying to map such a large region is wrong anyway, I am still trying
> >> to workout a solution to avoid calling memory_region_init_iommu()
> >> with UINT64_MAX which is what emulated vt-d currently does.  
> >
> > Right, the address width of the IOMMU on x86 is typically nowhere near
> > 2^64, so if you take the vfio_dma_map path, you'll surely explode.  
> 
> And it does. If we fix this assert, then vfio_dma_map() attempts mapping
> this direct mapped address range starting from 0 and prints a 
> warning message; happens for the whole range and goes on for ever.
> The overflow check seemed to me like something we should fix, but now
> I am more confused then ever!

Is the MemoryRegion memory_region_is_iommu() such that you're calling
vfio_dma_map() from vfio_iommu_map_notify()?  If so then we should
probably be using 128bit helpers for doing sanity checking and go ahead
and let something assert if we get to the vfio_dma_map() in
vfio_listener_region_add() with a 2^64 size.  Then if you're taking the
memory_region_is_iommu() path, vfio_dma_map() is going to be called
with translations within that 2^64 bit address space, not mapping the
entire space, right?  Thanks,

Alex
Alex Williamson March 22, 2016, 7:31 p.m. UTC | #10
On Tue, 22 Mar 2016 15:07:00 -0400
Bandan Das <bsd@redhat.com> wrote:

> Peter Xu <peterx@redhat.com> writes:
> 
> > On Mon, Mar 21, 2016 at 06:00:50PM -0400, Bandan Das wrote:  
> >> 
> >> vfio_listener_region_add for a iommu mr results in
> >> an overflow assert since emulated iommu memory region is initialized
> >> with UINT64_MAX. Add a check just like memory_region_size()
> >> does.  
> >
> > Hi, Bandan,
> >
> > In case you missed this:
> >
> > https://lists.gnu.org/archive/html/qemu-devel/2016-03/msg02865.html  
> 
> Thank you Peter, I wasn't aware. But unfortunately, I don't think he's doing the right
> thing either when handing @end ? Oh well, it's a RFC :) 

Agree, the fix there is bogus too.
Bandan Das March 22, 2016, 8:55 p.m. UTC | #11
Alex Williamson <alex.williamson@redhat.com> writes:
...
>> 
>> And it does. If we fix this assert, then vfio_dma_map() attempts mapping
>> this direct mapped address range starting from 0 and prints a 
>> warning message; happens for the whole range and goes on for ever.
>> The overflow check seemed to me like something we should fix, but now
>> I am more confused then ever!
>
> Is the MemoryRegion memory_region_is_iommu() such that you're calling
> vfio_dma_map() from vfio_iommu_map_notify()?  If so then we should

Yes, that is correct. This all started after we added the iommu mapping
replay changes but I was wrong about the vfio_dma_map part. Please see
below.

> probably be using 128bit helpers for doing sanity checking and go ahead
> and let something assert if we get to the vfio_dma_map() in
> vfio_listener_region_add() with a 2^64 size.  Then if you're taking the
> memory_region_is_iommu() path, vfio_dma_map() is going to be called
> with translations within that 2^64 bit address space, not mapping the
> entire space, right?  Thanks,

The 128 bit operations make sense...

The error message comes from:
if (!memory_region_is_ram(mr)) {
        error_report("iommu map to non memory area %"HWADDR_PRIx"",
                     xlat);
        goto out;
    }
in vfio_iommu_map_notify() before we even get to vfio_dma_map().

This gets attempted for the entire range because dmar isn't enabled yet and
vtd_iommu_translate() does this direct mapping in 4k increments in the translate
path :
...

    if (!s->dmar_enabled) {
        /* DMAR disabled, passthrough, use 4k-page*/
        ret.iova = addr & VTD_PAGE_MASK_4K;
        ret.translated_addr = addr & VTD_PAGE_MASK_4K;
        ret.addr_mask = ~VTD_PAGE_MASK_4K;
        ret.perm = IOMMU_RW;
        return ret;
    }

I am not sure yet who actually uses it though.
memory_region_iommu_replay() does the whole iteration
if perm != IOMMU_NONE:

void memory_region_iommu_replay(MemoryRegion *mr, Notifier *n,
                                hwaddr granularity, bool is_write)
{
    hwaddr addr;
    IOMMUTLBEntry iotlb;

    for (addr = 0; addr < memory_region_size(mr); addr += granularity) {
        iotlb = mr->iommu_ops->translate(mr, addr, is_write);
        if (iotlb.perm != IOMMU_NONE) {
            n->notify(n, &iotlb);
        }
...

> Alex
Peter Xu March 23, 2016, 2:42 a.m. UTC | #12
On Tue, Mar 22, 2016 at 03:07:00PM -0400, Bandan Das wrote:
> Thank you Peter, I wasn't aware. But unfortunately, I don't think he's doing the right
> thing either when handing @end ? Oh well, it's a RFC :) 

Possibly. Just to make sure you know about the whole thing (rather
than only the @end fix), since I see that fixing the core dump
should possibly be the start to move on to VFIO+IOMMU. As you
mentioned, guessing that it's never a bad thing to know more about
ideas. :)

Thanks.

-- peterx
diff mbox

Patch

diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index fb588d8..269244b 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -349,7 +349,12 @@  static void vfio_listener_region_add(MemoryListener *listener,
     if (int128_ge(int128_make64(iova), llend)) {
         return;
     }
-    end = int128_get64(llend);
+
+    if (int128_eq(llend, int128_2_64())) {
+            end = UINT64_MAX;
+    } else {
+            end = int128_get64(llend);
+    }
 
     if ((iova < container->min_iova) || ((end - 1) > container->max_iova)) {
         error_report("vfio: IOMMU container %p can't map guest IOVA region"