Message ID | 20170517055756.GA31206@pxdev.xzpeter.org |
---|---|
State | New |
Headers | show |
On 05/17/2017 07:57 AM, Peter Xu wrote: > On Wed, May 17, 2017 at 12:23:42PM +0800, Peter Xu wrote: >> On Tue, May 16, 2017 at 06:51:03PM +0200, Maxime Coquelin wrote: >>> Hi Peter, >>> >>> On 05/16/2017 03:24 PM, Maxime Coquelin wrote: >>>> >>>> >>>> On 05/15/2017 10:50 AM, Peter Xu wrote: >>>>> The problem is that, address_space_get_iotlb_entry() shares a lot with >>>>> address_space_translate(). This patch tries to abstract the >>>>> shared elements. >>>>> >>>>> Originally, this work is derived from discussion from VT-d passthrough >>>>> series discussions [1]. But for sure we can just see this series as a >>>>> standalone cleanup. So I posted it separately here. >>>>> >>>>> Smoke tests are done with general VM boots, IOs, especially with vhost >>>>> dmar configurations. >>>>> >>>>> I believe with current series I can throw away the old patch [1], >>>>> which may be good. But before that, please kindly review. Thanks. >>>> >>>> I faced the problem the old patch fixes when declaring and attaching an >>>> IOMMU device, but booting the kernel with intel_iommu=off. >>>> >>>> I tested again with patches 1 & 4 of your series, and I confirm it fixes >>>> the issue: >>>> Tested-by: Maxime Coquelin <maxime.coquelin@redhat.com> >>> >>> I did some more testing with my "vhost-user IOMMU" setup, and the series >>> actually breaks with IOMMU device attached, and intel_iommu=on. >>> >>> The main difference with the previous passing test is the guest RAM >>> size. In the working setup, it is 2G of 2M hugepages, vs. 4G of 2M >>> hugepages in the failing one. Note that I also reproduce with vhost-kernel >>> backend. >>> >>> The error happens in the first vhost_device_iotlb_miss() call: >>> qemu-system-x86_64: Fail to lookup the translated address b5d7c000 >>> >>> I don't have the root cause yet, I'll keep you updated. >> >> Maxime, >> >> Thanks a lot for help testing this series! >> >> I reproduced this problem, and this is not a problem obvious enough >> for me. Let me investigate as well. >> >> -- >> Peter Xu > > Maxime, > > Could you help try adding this change upon current to see whether > problem solved? Hi Peter, Yes, problem is solved with below change. Thanks! Maxime > diff --git a/exec.c b/exec.c > index 697d902..68576a2 100644 > --- a/exec.c > +++ b/exec.c > @@ -521,6 +521,10 @@ IOMMUTLBEntry address_space_get_iotlb_entry(AddressSpace *as, hwaddr addr, > goto iotlb_fail; > } > > + /* Convert memory region offset into address space offset */ > + xlat += section.offset_within_address_space - > + section.offset_within_region; > + > if (plen == (hwaddr)-1) { > /* > * We use default page size here. Logically it only happens > > Thanks in advance, >
On 17/05/2017 09:17, Maxime Coquelin wrote: > > > On 05/17/2017 07:57 AM, Peter Xu wrote: >> On Wed, May 17, 2017 at 12:23:42PM +0800, Peter Xu wrote: >>> On Tue, May 16, 2017 at 06:51:03PM +0200, Maxime Coquelin wrote: >>>> Hi Peter, >>>> >>>> On 05/16/2017 03:24 PM, Maxime Coquelin wrote: >>>>> >>>>> >>>>> On 05/15/2017 10:50 AM, Peter Xu wrote: >>>>>> The problem is that, address_space_get_iotlb_entry() shares a lot >>>>>> with >>>>>> address_space_translate(). This patch tries to abstract the >>>>>> shared elements. >>>>>> >>>>>> Originally, this work is derived from discussion from VT-d >>>>>> passthrough >>>>>> series discussions [1]. But for sure we can just see this series as a >>>>>> standalone cleanup. So I posted it separately here. >>>>>> >>>>>> Smoke tests are done with general VM boots, IOs, especially with >>>>>> vhost >>>>>> dmar configurations. >>>>>> >>>>>> I believe with current series I can throw away the old patch [1], >>>>>> which may be good. But before that, please kindly review. Thanks. >>>>> >>>>> I faced the problem the old patch fixes when declaring and >>>>> attaching an >>>>> IOMMU device, but booting the kernel with intel_iommu=off. >>>>> >>>>> I tested again with patches 1 & 4 of your series, and I confirm it >>>>> fixes >>>>> the issue: >>>>> Tested-by: Maxime Coquelin <maxime.coquelin@redhat.com> >>>> >>>> I did some more testing with my "vhost-user IOMMU" setup, and the >>>> series >>>> actually breaks with IOMMU device attached, and intel_iommu=on. >>>> >>>> The main difference with the previous passing test is the guest RAM >>>> size. In the working setup, it is 2G of 2M hugepages, vs. 4G of 2M >>>> hugepages in the failing one. Note that I also reproduce with >>>> vhost-kernel >>>> backend. >>>> >>>> The error happens in the first vhost_device_iotlb_miss() call: >>>> qemu-system-x86_64: Fail to lookup the translated address b5d7c000 >>>> >>>> I don't have the root cause yet, I'll keep you updated. >>> >>> Maxime, >>> >>> Thanks a lot for help testing this series! >>> >>> I reproduced this problem, and this is not a problem obvious enough >>> for me. Let me investigate as well. >>> >>> -- >>> Peter Xu >> >> Maxime, >> >> Could you help try adding this change upon current to see whether >> problem solved? > > Hi Peter, > > Yes, problem is solved with below change. Cool---Peter, please send the new patch 4 as toplevel message! Thanks, Paolo
On Wed, May 17, 2017 at 09:26:25AM +0200, Paolo Bonzini wrote: > > > On 17/05/2017 09:17, Maxime Coquelin wrote: > > > > > > On 05/17/2017 07:57 AM, Peter Xu wrote: > >> On Wed, May 17, 2017 at 12:23:42PM +0800, Peter Xu wrote: > >>> On Tue, May 16, 2017 at 06:51:03PM +0200, Maxime Coquelin wrote: > >>>> Hi Peter, > >>>> > >>>> On 05/16/2017 03:24 PM, Maxime Coquelin wrote: > >>>>> > >>>>> > >>>>> On 05/15/2017 10:50 AM, Peter Xu wrote: > >>>>>> The problem is that, address_space_get_iotlb_entry() shares a lot > >>>>>> with > >>>>>> address_space_translate(). This patch tries to abstract the > >>>>>> shared elements. > >>>>>> > >>>>>> Originally, this work is derived from discussion from VT-d > >>>>>> passthrough > >>>>>> series discussions [1]. But for sure we can just see this series as a > >>>>>> standalone cleanup. So I posted it separately here. > >>>>>> > >>>>>> Smoke tests are done with general VM boots, IOs, especially with > >>>>>> vhost > >>>>>> dmar configurations. > >>>>>> > >>>>>> I believe with current series I can throw away the old patch [1], > >>>>>> which may be good. But before that, please kindly review. Thanks. > >>>>> > >>>>> I faced the problem the old patch fixes when declaring and > >>>>> attaching an > >>>>> IOMMU device, but booting the kernel with intel_iommu=off. > >>>>> > >>>>> I tested again with patches 1 & 4 of your series, and I confirm it > >>>>> fixes > >>>>> the issue: > >>>>> Tested-by: Maxime Coquelin <maxime.coquelin@redhat.com> > >>>> > >>>> I did some more testing with my "vhost-user IOMMU" setup, and the > >>>> series > >>>> actually breaks with IOMMU device attached, and intel_iommu=on. > >>>> > >>>> The main difference with the previous passing test is the guest RAM > >>>> size. In the working setup, it is 2G of 2M hugepages, vs. 4G of 2M > >>>> hugepages in the failing one. Note that I also reproduce with > >>>> vhost-kernel > >>>> backend. > >>>> > >>>> The error happens in the first vhost_device_iotlb_miss() call: > >>>> qemu-system-x86_64: Fail to lookup the translated address b5d7c000 > >>>> > >>>> I don't have the root cause yet, I'll keep you updated. > >>> > >>> Maxime, > >>> > >>> Thanks a lot for help testing this series! > >>> > >>> I reproduced this problem, and this is not a problem obvious enough > >>> for me. Let me investigate as well. > >>> > >>> -- > >>> Peter Xu > >> > >> Maxime, > >> > >> Could you help try adding this change upon current to see whether > >> problem solved? > > > > Hi Peter, > > > > Yes, problem is solved with below change. > > Cool---Peter, please send the new patch 4 as toplevel message! Will do. Thanks!
diff --git a/exec.c b/exec.c index 697d902..68576a2 100644 --- a/exec.c +++ b/exec.c @@ -521,6 +521,10 @@ IOMMUTLBEntry address_space_get_iotlb_entry(AddressSpace *as, hwaddr addr, goto iotlb_fail; } + /* Convert memory region offset into address space offset */ + xlat += section.offset_within_address_space - + section.offset_within_region; + if (plen == (hwaddr)-1) { /* * We use default page size here. Logically it only happens