diff mbox

[17/30] memory: add address_space_translate

Message ID 51A09018.7000901@web.de
State New
Headers show

Commit Message

Jan Kiszka May 25, 2013, 10:19 a.m. UTC
On 2013-05-25 09:47, Paolo Bonzini wrote:
> Il 25/05/2013 08:40, Jan Kiszka ha scritto:
>> On 2013-05-21 12:57, Paolo Bonzini wrote:
>>> Using phys_page_find to translate an AddressSpace to a
>>> MemoryRegionSection is unwieldy.  It requires to pass the page
>>> index rather than the address, and later
>>> memory_region_section_addr has to be called.  Replace 
>>> memory_region_section_addr with a function that does all of it:
>>> call phys_page_find, compute the offset within the region, and
>>> check how big the current mapping is.  This way, a large flat
>>> region can be written with a single lookup rather than a page at
>>> a time.
>>>
>>> address_space_translate will also provide a single point where
>>> IOMMU forwarding is implemented.
>>>
>>> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> --- cputlb.c
>>> |  20 ++--- exec.c                | 201
>>> +++++++++++++++++++++++++++----------------------- 
>>> include/exec/cputlb.h |  12 ++- include/exec/memory.h |  31
>>> ++++---- translate-all.c       |   6 +- 5 files changed, 143
>>> insertions(+), 127 deletions(-)
>>>
>>> diff --git a/cputlb.c b/cputlb.c index aba7e44..1f85da0 100644 
>>> --- a/cputlb.c +++ b/cputlb.c @@ -248,13 +248,18 @@ void
>>> tlb_set_page(CPUArchState *env, target_ulong vaddr, target_ulong
>>> code_address; uintptr_t addend; CPUTLBEntry *te; -    hwaddr
>>> iotlb; +    hwaddr iotlb, xlat, sz;
>>>
>>> assert(size >= TARGET_PAGE_SIZE); if (size != TARGET_PAGE_SIZE)
>>> { tlb_add_large_page(env, vaddr, size); } -    section =
>>> phys_page_find(address_space_memory.dispatch, paddr >>
>>> TARGET_PAGE_BITS); + +    sz = size; +    section =
>>> address_space_translate(&address_space_memory, paddr, &xlat,
>>> &sz, +                                      false); +
>>> assert(sz >= TARGET_PAGE_SIZE); + #if defined(DEBUG_TLB) 
>>> printf("tlb_set_page: vaddr=" TARGET_FMT_lx " paddr=0x"
>>> TARGET_FMT_plx " prot=%x idx=%d pd=0x%08lx\n", @@ -269,15 +274,14
>>> @@ void tlb_set_page(CPUArchState *env, target_ulong vaddr, } if
>>> (memory_region_is_ram(section->mr) || 
>>> memory_region_is_romd(section->mr)) { -        addend =
>>> (uintptr_t)memory_region_get_ram_ptr(section->mr) -        +
>>> memory_region_section_addr(section, paddr); +        addend =
>>> (uintptr_t)memory_region_get_ram_ptr(section->mr) + xlat; } else
>>> { addend = 0; }
>>>
>>> code_address = address; -    iotlb =
>>> memory_region_section_get_iotlb(env, section, vaddr, paddr,
>>> prot, -                                            &address); +
>>> iotlb = memory_region_section_get_iotlb(env, section, vaddr,
>>> paddr, xlat, +                                            prot,
>>> &address);
>>>
>>> index = (vaddr >> TARGET_PAGE_BITS) & (CPU_TLB_SIZE - 1); 
>>> env->iotlb[mmu_idx][index] = iotlb - vaddr; @@ -300,9 +304,7 @@
>>> void tlb_set_page(CPUArchState *env, target_ulong vaddr, /* Write
>>> access calls the I/O callback.  */ te->addr_write = address |
>>> TLB_MMIO; } else if (memory_region_is_ram(section->mr) -
>>> && !cpu_physical_memory_is_dirty( -
>>> section->mr->ram_addr -                           +
>>> memory_region_section_addr(section, paddr))) { +
>>> && !cpu_physical_memory_is_dirty(section->mr->ram_addr + xlat))
>>> { te->addr_write = address | TLB_NOTDIRTY; } else { 
>>> te->addr_write = address; diff --git a/exec.c b/exec.c index
>>> 82da067..e5ee8ff 100644 --- a/exec.c +++ b/exec.c @@ -182,7
>>> +182,7 @@ static void phys_page_set(AddressSpaceDispatch *d, 
>>> phys_page_set_level(&d->phys_map, &index, &nb, leaf, P_L2_LEVELS
>>> - 1); }
>>>
>>> -MemoryRegionSection *phys_page_find(AddressSpaceDispatch *d,
>>> hwaddr index) +static MemoryRegionSection
>>> *phys_page_find(AddressSpaceDispatch *d, hwaddr index) { 
>>> PhysPageEntry lp = d->phys_map; PhysPageEntry *p; @@ -198,6
>>> +198,22 @@ MemoryRegionSection
>>> *phys_page_find(AddressSpaceDispatch *d, hwaddr index) return
>>> &phys_sections[lp.ptr]; }
>>>
>>> +MemoryRegionSection *address_space_translate(AddressSpace *as,
>>> hwaddr addr, +                                             hwaddr
>>> *xlat, hwaddr *plen, +
>>> bool is_write) +{ +    MemoryRegionSection *section; + +
>>> section = phys_page_find(as->dispatch, addr >>
>>> TARGET_PAGE_BITS); +    /* Compute offset within
>>> MemoryRegionSection */ +    addr -=
>>> section->offset_within_address_space; +    *plen =
>>> MIN(section->size - addr, *plen);
> 
>> This limitation causes problems. Consider two overlapping memory
>> regions A and B. A handles 4-byte accesses and is at least 4 bytes
>> long, B only deals with a single byte. They overlap like this:
> 
>> B (prio 1):   X A (prio 0): XXXX... ^access here with 4 bytes
>> length
> 
>> Now if an access happens at the marked position, it is split into
>> one 2-byte access to A, followed by a one-byte access to B and
>> another one-byte access to A. But the right access emulation would
>> be 4-bytes to A, no?
> 
> I guess it depends on the width of the bus.  On a 16-bit computer the
> right thing to do would be to split the write in two, one two-byte
> access to A and one two-byte access to B.  But as a first
> approximation the fix you suggest is the right one.  Implementing bus
> width in TCG is tricky, so we should get by with the simple fix.
> 
> This patch is not in the pull request I sent, so there is time to make
> it work.  Is it simply
> 
> -    *plen = MIN(section->size - addr, *plen);
> +    addr += section->offset_within_memory_region;
> +    *plen = MIN(section->mr->size - addr, *plen);
> 
> or something like that?  If you post it as a diff I can squash it in.

This works for me now:


Jan

Comments

Paolo Bonzini May 25, 2013, 11:20 a.m. UTC | #1
Il 25/05/2013 12:19, Jan Kiszka ha scritto:
>          addr -= section->offset_within_address_space;
> -        len = MIN(section->size - addr, len);
                     ^^^^^^^^^^^^^   ^^^^

This is the size of a section minus an offset in the section.

> +        diff = int128_sub(section->mr->size, int128_make64(addr));
                             ^^^^^^^^^^^^^^^^^                ^^^^

This is the size of a region minus the same offset in the section.

> +        len = MIN(int128_get64(diff), len);
>  
>          /* Compute offset within MemoryRegion */
>          addr += section->offset_within_region;

So this has to be moved above.  Do you have a branch pushed somewhere
that I can test against?

Paolo
Jan Kiszka May 25, 2013, 11:30 a.m. UTC | #2
On 2013-05-25 13:20, Paolo Bonzini wrote:
> Il 25/05/2013 12:19, Jan Kiszka ha scritto:
>>          addr -= section->offset_within_address_space;
>> -        len = MIN(section->size - addr, len);
>                      ^^^^^^^^^^^^^   ^^^^
> 
> This is the size of a section minus an offset in the section.
> 
>> +        diff = int128_sub(section->mr->size, int128_make64(addr));
>                              ^^^^^^^^^^^^^^^^^                ^^^^
> 
> This is the size of a region minus the same offset in the section.
> 
>> +        len = MIN(int128_get64(diff), len);
>>  
>>          /* Compute offset within MemoryRegion */
>>          addr += section->offset_within_region;
> 
> So this has to be moved above.

Right, fixed.

>  Do you have a branch pushed somewhere
> that I can test against?

git://git.kiszka.org/qemu.git queues/ioport

Jan
Paolo Bonzini May 26, 2013, 8:56 a.m. UTC | #3
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Il 25/05/2013 13:30, Jan Kiszka ha scritto:
> On 2013-05-25 13:20, Paolo Bonzini wrote:
>> Il 25/05/2013 12:19, Jan Kiszka ha scritto:
>>> addr -= section->offset_within_address_space; -        len =
>>> MIN(section->size - addr, len);
>> ^^^^^^^^^^^^^   ^^^^
>> 
>> This is the size of a section minus an offset in the section.
>> 
>>> +        diff = int128_sub(section->mr->size,
>>> int128_make64(addr));
>> ^^^^^^^^^^^^^^^^^                ^^^^
>> 
>> This is the size of a region minus the same offset in the
>> section.
>> 
>>> +        len = MIN(int128_get64(diff), len);
>>> 
>>> /* Compute offset within MemoryRegion */ addr +=
>>> section->offset_within_region;
>> 
>> So this has to be moved above.
> 
> Right, fixed.
> 
>> Do you have a branch pushed somewhere that I can test against?
> 
> git://git.kiszka.org/qemu.git queues/ioport

Nice patches.  Only one thing, how is .impl.unaligned different from
the existing .valid.unaligned?

Paolo
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.19 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/

iQIcBAEBAgAGBQJRoc4vAAoJEBvWZb6bTYby7UEP/0l5Stow6nrKirryCNuds3ky
L3FTjqEhLZ+aKha4rdbYNKhwbvY92rfR/Wo8hMckkFv6wwcieHD+nYQltuIDW3y2
xV7FmCPJd1M1pajIQ5Nk+Ssqf7Qhv7jRX5Kg2L6ksNf5srb5485J9oH4M9hIO6CH
XFHO9GeKmynbeeZmSNFF9mNlQd0leHx6h6IbJxYov/eD0zxp5cxoN7xM6+V5hCP/
UjBa2Zn07AI94BS/64YSHSa155BLxIOS5jJi25PNOWti29wHE8Nhw2KF5pMD/JYM
hPblsTIpFf6zg53FMHmrY/B990Ol+IjNogQhRh3tSfNf/48XWA4l+anLdaMweQf6
TuCEOoDqgNphz6c5lOPjzQB/CJZFUCQzxaUjok499ZQyPY5SpIgjKmzl7Vszos8I
wnYmJJ3xVLyJujvx61rg7qiW3ckyVx0sPHZXt/DWEDUNDHTW+HdqAcrZ8TgJrIni
gQF1VEKNplxdgauIv17RGxnVOXwb8PcoNwXE9Gn6XJ/KKQLh9KukmdufCSBmOARV
iiAo8N4rUK8FaCFXDwrZ3S8OIiyWCSdnud/4wZPcsI+r5gM/rrtlrZoQExyx5meO
9RzLBtL5hwFaQ9erE8ce+GDKbo+nie593z3UiN5NKLuH5KjWig2OYQxV6ylg4Vx7
RscCNjGvppk+2I3PAv9R
=pCeD
-----END PGP SIGNATURE-----
Paolo Bonzini May 26, 2013, 9:01 a.m. UTC | #4
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Il 25/05/2013 13:30, Jan Kiszka ha scritto:
> On 2013-05-25 13:20, Paolo Bonzini wrote:
>> Il 25/05/2013 12:19, Jan Kiszka ha scritto:
>>> addr -= section->offset_within_address_space; -        len =
>>> MIN(section->size - addr, len);
>> ^^^^^^^^^^^^^   ^^^^
>> 
>> This is the size of a section minus an offset in the section.
>> 
>>> +        diff = int128_sub(section->mr->size,
>>> int128_make64(addr));
>> ^^^^^^^^^^^^^^^^^                ^^^^
>> 
>> This is the size of a region minus the same offset in the
>> section.
>> 
>>> +        len = MIN(int128_get64(diff), len);
>>> 
>>> /* Compute offset within MemoryRegion */ addr +=
>>> section->offset_within_region;
>> 
>> So this has to be moved above.
> 
> Right, fixed.
> 
>> Do you have a branch pushed somewhere that I can test against?
> 
> git://git.kiszka.org/qemu.git queues/ioport

And another thing... in "ioport: Switch dispatching to memory core
layer", could you put memory_region_iorange_read/write in
mr->ops->read/write instead of adding an "if" in the dispatch routines?

Paolo

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.19 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/

iQIcBAEBAgAGBQJRoc9MAAoJEBvWZb6bTYbya6IP/Rso0r/qUrRttOZYyC0GX1IL
xBXZEyCDrbV49pQw9O85Lpb2BRzbUOy38RZ/gvrUfGzg9iYo+88nxoA4qiTXSKil
IBxC/Nu5QCcV01zTWLE7dsjAGVVP6td2UWqtI/41T7H1OmhmN9BR2hD0hHhDT6aO
f1KAhzkrdD2aOJHCyWjtehriZFH+J638QuAem8zMIC6kFGOBm475iYBI5l4haXaN
k2Rx66EaOuJRFzZVscHnTw3Ohrk2QZEe8EpBScQIf3hxAlDv4wPDzXYjADUmCGhH
u5dHXSDMgD3+ll5/XLAq3dX/lZPn6nJCzAqV2DPboAxhjpe5+gt0i50r91Di8T+3
Rn8ycEi5hy7Pck++ijwTDs4JGU1vGvT6xpnCmsTwiU3Tw5Yd8lNEQJyOQAMsdsrH
GX+Tsqiowoq8LKVACHSOHouffUB4TM5XdUN0dPGKdohPGyDSlRlkE58mFPeBmMRT
SIhaCPkykpRguNFnkxx5iLl66yUsi5Jb0qeXslffKKx6wCr6N5d3jOeb2jqjG/LT
AAlQ4U5GdKbKU6rDWmcB8GeV2HZh9+ivIUwC4Q4+s+inlaXXh5V4hnNKQ8aHc2MX
wBfq07Zit8bbCUvB+tYf+ORAWHUV1BFWjFRkE4sZ08ouCs45TN7Ug67334GyPvxe
K5Zz+3M9ZKkvnK4kNINg
=71ln
-----END PGP SIGNATURE-----
Jan Kiszka May 26, 2013, 9:02 a.m. UTC | #5
On 2013-05-26 10:56, Paolo Bonzini wrote:
> Il 25/05/2013 13:30, Jan Kiszka ha scritto:
>> On 2013-05-25 13:20, Paolo Bonzini wrote:
>>> Il 25/05/2013 12:19, Jan Kiszka ha scritto:
>>>> addr -= section->offset_within_address_space; -        len =
>>>> MIN(section->size - addr, len);
>>> ^^^^^^^^^^^^^   ^^^^
>>>
>>> This is the size of a section minus an offset in the section.
>>>
>>>> +        diff = int128_sub(section->mr->size,
>>>> int128_make64(addr));
>>> ^^^^^^^^^^^^^^^^^                ^^^^
>>>
>>> This is the size of a region minus the same offset in the
>>> section.
>>>
>>>> +        len = MIN(int128_get64(diff), len);
>>>>
>>>> /* Compute offset within MemoryRegion */ addr +=
>>>> section->offset_within_region;
>>>
>>> So this has to be moved above.
> 
>> Right, fixed.
> 
>>> Do you have a branch pushed somewhere that I can test against?
> 
>> git://git.kiszka.org/qemu.git queues/ioport
> 
> Nice patches.  Only one thing, how is .impl.unaligned different from
> the existing .valid.unaligned?

See memory.h: valid controls is an unaligned access traps or gets
processed, impl manages if it is passed as-is to the device or broken up
and aligned first.

Jan
Jan Kiszka May 26, 2013, 9:12 a.m. UTC | #6
On 2013-05-26 11:01, Paolo Bonzini wrote:
> Il 25/05/2013 13:30, Jan Kiszka ha scritto:
>> On 2013-05-25 13:20, Paolo Bonzini wrote:
>>> Il 25/05/2013 12:19, Jan Kiszka ha scritto:
>>>> addr -= section->offset_within_address_space; -        len =
>>>> MIN(section->size - addr, len);
>>> ^^^^^^^^^^^^^   ^^^^
>>>
>>> This is the size of a section minus an offset in the section.
>>>
>>>> +        diff = int128_sub(section->mr->size,
>>>> int128_make64(addr));
>>> ^^^^^^^^^^^^^^^^^                ^^^^
>>>
>>> This is the size of a region minus the same offset in the
>>> section.
>>>
>>>> +        len = MIN(int128_get64(diff), len);
>>>>
>>>> /* Compute offset within MemoryRegion */ addr +=
>>>> section->offset_within_region;
>>>
>>> So this has to be moved above.
> 
>> Right, fixed.
> 
>>> Do you have a branch pushed somewhere that I can test against?
> 
>> git://git.kiszka.org/qemu.git queues/ioport
> 
> And another thing... in "ioport: Switch dispatching to memory core
> layer", could you put memory_region_iorange_read/write in
> mr->ops->read/write instead of adding an "if" in the dispatch routines?

Not trivially because mr->opaque is passed to the read/write handler,
but memory_region_iorange_read/write needs the region. Can add more data
structures to handles this, but what does it buy us?

Jan
Paolo Bonzini May 26, 2013, 6:23 p.m. UTC | #7
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Il 26/05/2013 11:12, Jan Kiszka ha scritto:
> On 2013-05-26 11:01, Paolo Bonzini wrote:
>> Il 25/05/2013 13:30, Jan Kiszka ha scritto:
>>> On 2013-05-25 13:20, Paolo Bonzini wrote:
>>>> Il 25/05/2013 12:19, Jan Kiszka ha scritto:
>>>>> addr -= section->offset_within_address_space; -        len
>>>>> = MIN(section->size - addr, len);
>>>> ^^^^^^^^^^^^^   ^^^^
>>>> 
>>>> This is the size of a section minus an offset in the
>>>> section.
>>>> 
>>>>> +        diff = int128_sub(section->mr->size, 
>>>>> int128_make64(addr));
>>>> ^^^^^^^^^^^^^^^^^                ^^^^
>>>> 
>>>> This is the size of a region minus the same offset in the 
>>>> section.
>>>> 
>>>>> +        len = MIN(int128_get64(diff), len);
>>>>> 
>>>>> /* Compute offset within MemoryRegion */ addr += 
>>>>> section->offset_within_region;
>>>> 
>>>> So this has to be moved above.
>> 
>>> Right, fixed.
>> 
>>>> Do you have a branch pushed somewhere that I can test
>>>> against?
>> 
>>> git://git.kiszka.org/qemu.git queues/ioport
>> 
>> And another thing... in "ioport: Switch dispatching to memory
>> core layer", could you put memory_region_iorange_read/write in 
>> mr->ops->read/write instead of adding an "if" in the dispatch
>> routines?
> 
> Not trivially because mr->opaque is passed to the read/write
> handler, but memory_region_iorange_read/write needs the region. Can
> add more data structures to handles this, but what does it buy us?

It's simpler, and I think it lets us remove ops->old_portio too.  We
can have something like

typedef MemoryRegionPortioList {
    MemoryRegion mr;
    void *piolist_opaque;
    MemoryRegionPortio pio[];
}

then you can set mr->opaque == mr.  It's more similar to how other
regions are implemented, e.g. subpages.

Paolo
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.19 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/

iQIcBAEBAgAGBQJRolMGAAoJEBvWZb6bTYbym5wP/izYQtpCSLhiAe6LiyN5cSVj
wMe+gSqhqyvi8Ho/GewilHlp1L0UKXQs18KhKqZbEtVETFdvKYoU+sJCPqWbRThW
Gj62dfo1Zn814gnv/CxiebslcI5fiU2UrHtpW1oXBKCZWmLJrfZSJqT5K7o1Z8Y3
qRDcR2sIISIQgAxM/n2EbuJWq/v50BUCDOXgpT3eUrq3rvdQHbh/Ym+y8Zd74sXl
VhK3UmP17ro0YPLtImForhrFutgavtgKuKCVvMTkD04ZLrWrW+QdXJmGl+LuCp+M
L/ib7jaR/4uPJ1RiDPbZexKc956yRsA5gzvJo9kHE0B6IipM+uqVCbrfk96kWrAC
Cg3qGn26cT/bc4zNF7NudJxAErVaHf220iJecoqFXNi6OwZTM6IgHlVH58l+yrRE
swQlxqQHQoYEzm4ZzsuK9C9Y50Y4C6G0LzkEWeyvYi5UdbOxt/hl1yh0pdSnhSSx
47Hw28nZgceDpcySv/Deb3zBzw/Pxx1Z3wkTxMhVpZ6t3Bot/T9hmkGQyuFqvXWF
cdrJEdYmJ2CEkmY2uwKUWCZeMzNtR+qfij+QlcI9RjFoVlmlOySJmBwR55Y7yu5m
H4wRZjeQ8PtfpZH/NmmwkNPOxaghDTaUn9k9/rEQzCDM9b17ddovvDI8JYqo9oaa
K1di55883LPNfkvycya3
=dJkN
-----END PGP SIGNATURE-----
Paolo Bonzini May 27, 2013, 7:20 a.m. UTC | #8
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Il 26/05/2013 11:02, Jan Kiszka ha scritto:
>>> 
>>> Nice patches.  Only one thing, how is .impl.unaligned different
>>> from the existing .valid.unaligned?
> See memory.h: valid controls is an unaligned access traps or gets 
> processed, impl manages if it is passed as-is to the device or
> broken up and aligned first.

I took the following patches:

    exec: Allow unaligned address_space_rw
    exec: Resolve subpages in one step except for IOTLB fills
    exec: Implement subpage_read/write via address_space_rw

to limit the conflicts and because I realized that TCG should never
encounter an IOMMU.  Thus I removed the is_write argument from
address_space_translate_internal and moved the IOMMU handling to
address_space_translate.  I'll push to my iommu branch after some more
testing.  If you reorganize the old_portio handling along the lines I
suggested in my previous email you should have no conflicts when rebasing.

Paolo
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.19 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/

iQIcBAEBAgAGBQJRowkzAAoJEBvWZb6bTYbybiQP/i4S4tUisfJVPlnKosEwxUxk
LY0I6oyeXIQznv/2VC+CkKuNFRpQyPgOoD+Nc4mVdVSEgexdJdNHvT3qQeewkRBB
Kk074LS/tMdtyRFyhf+ZuhYFBFgCV4LINyJ3iZV9E913LqV/cM6CXnvYcOWwNspu
2ZPGLCueG5y0iHcHHFo2ZXRf/OqUKiKIzMMKPzS57+7o7rUqav8wSnQV1Qox/L0G
xeeUZ4BO9a59ULIb63bwhrjcUHS77vddvAcPDe5LifAo1OZqZWXsWvzdj/3VKY0A
Deyt+q02QA9erYPR0K16nLr3j+8z44McPugHWNNmAzQzMXjFIVE1g7kBmqzBLAst
tStrog73ol1xgor5xl1wIqEfGQNGOroNuPEpdlU7NQf4c6aFJ7cWFMJnaOtfZuNB
7eTtQV32lQkqns7ho3wJFsv7k2sWjpBrjrXCNnii+LH/MUUm8KsAQ+w2JVCRHEDK
AEmnniEoCSLMbwZFOiKGT0J5291dZgeK6dbH2NVJ8jYhTWaDGuIldXtGyMvBkvpi
+YZF4Z3vvOPFeT+zl0DPh0xHsaXcblfZbe3pe4aDDlgjmsUAbyNIsTsAIn/+eq0y
WwjuEj76bR8Y+RbJgkQ1VP6mCDqQLaOAiKWniZZ1wjZMBmE3vCwsmi3CNEVMxJOD
XOnhQIEOw5kautUlcnQ3
=1nFS
-----END PGP SIGNATURE-----
Jan Kiszka May 27, 2013, 7:23 a.m. UTC | #9
On 2013-05-27 09:20, Paolo Bonzini wrote:
> Il 26/05/2013 11:02, Jan Kiszka ha scritto:
>>>>
>>>> Nice patches.  Only one thing, how is .impl.unaligned different
>>>> from the existing .valid.unaligned?
>> See memory.h: valid controls is an unaligned access traps or gets
>> processed, impl manages if it is passed as-is to the device or
>> broken up and aligned first.
> 
> I took the following patches:
> 
>     exec: Allow unaligned address_space_rw
>     exec: Resolve subpages in one step except for IOTLB fills
>     exec: Implement subpage_read/write via address_space_rw
> 
> to limit the conflicts and because I realized that TCG should never
> encounter an IOMMU.

Err, why? Will we emulate IOMMUs for TCG differently?

>  Thus I removed the is_write argument from
> address_space_translate_internal and moved the IOMMU handling to
> address_space_translate.  I'll push to my iommu branch after some more
> testing.  If you reorganize the old_portio handling along the lines I
> suggested in my previous email you should have no conflicts when rebasing.

old_portio should rather be eliminated on the long run. But that's a
future story.

Jan
Paolo Bonzini May 27, 2013, 8:19 a.m. UTC | #10
Il 27/05/2013 09:23, Jan Kiszka ha scritto:
> On 2013-05-27 09:20, Paolo Bonzini wrote:
>> Il 26/05/2013 11:02, Jan Kiszka ha scritto:
>>>>>
>>>>> Nice patches.  Only one thing, how is .impl.unaligned different
>>>>> from the existing .valid.unaligned?
>>> See memory.h: valid controls is an unaligned access traps or gets
>>> processed, impl manages if it is passed as-is to the device or
>>> broken up and aligned first.
>>
>> I took the following patches:
>>
>>     exec: Allow unaligned address_space_rw
>>     exec: Resolve subpages in one step except for IOTLB fills
>>     exec: Implement subpage_read/write via address_space_rw
>>
>> to limit the conflicts and because I realized that TCG should never
>> encounter an IOMMU.
> 
> Err, why? Will we emulate IOMMUs for TCG differently?

Because IOMMUs should never be added to address_space_memory.

TCG should only encounter an IOMMU during device emulation (DMA), not
because of reads/writes from the CPU.  So the IOTLBs should never point
to an IOMMU region.

Paolo

>>  Thus I removed the is_write argument from
>> address_space_translate_internal and moved the IOMMU handling to
>> address_space_translate.  I'll push to my iommu branch after some more
>> testing.  If you reorganize the old_portio handling along the lines I
>> suggested in my previous email you should have no conflicts when rebasing.
> 
> old_portio should rather be eliminated on the long run. But that's a
> future story.
> 
> Jan
>
Jan Kiszka May 27, 2013, 8:37 a.m. UTC | #11
On 2013-05-27 10:19, Paolo Bonzini wrote:
> Il 27/05/2013 09:23, Jan Kiszka ha scritto:
>> On 2013-05-27 09:20, Paolo Bonzini wrote:
>>> Il 26/05/2013 11:02, Jan Kiszka ha scritto:
>>>>>>
>>>>>> Nice patches.  Only one thing, how is .impl.unaligned different
>>>>>> from the existing .valid.unaligned?
>>>> See memory.h: valid controls is an unaligned access traps or gets
>>>> processed, impl manages if it is passed as-is to the device or
>>>> broken up and aligned first.
>>>
>>> I took the following patches:
>>>
>>>     exec: Allow unaligned address_space_rw
>>>     exec: Resolve subpages in one step except for IOTLB fills
>>>     exec: Implement subpage_read/write via address_space_rw
>>>
>>> to limit the conflicts and because I realized that TCG should never
>>> encounter an IOMMU.
>>
>> Err, why? Will we emulate IOMMUs for TCG differently?
> 
> Because IOMMUs should never be added to address_space_memory.
> 
> TCG should only encounter an IOMMU during device emulation (DMA), not
> because of reads/writes from the CPU.  So the IOTLBs should never point
> to an IOMMU region.

OK. That's true for the current QEMU architecture, but may no longer be
like this if we once emulate device controllers via TCG (in hybrid
system setups). Anyway, I guess it's fine to optimize this for now.

Jan
Peter Maydell May 27, 2013, 10:33 a.m. UTC | #12
On 27 May 2013 09:19, Paolo Bonzini <pbonzini@redhat.com> wrote:
> Il 27/05/2013 09:23, Jan Kiszka ha scritto:
>> Err, why? Will we emulate IOMMUs for TCG differently?
>
> Because IOMMUs should never be added to address_space_memory.
>
> TCG should only encounter an IOMMU during device emulation (DMA), not
> because of reads/writes from the CPU.  So the IOTLBs should never point
> to an IOMMU region.

This seems a slightly dubious assumption to me. For instance
here's a sample system diagram that puts a Cortex-M3 CPU
behind an IOMMU (the MMU-500 dotted line):
http://www.arm.com/images/CoreLink_MMU-500_in_System.jpg
Admittedly we're a long way from being able to model that
since we don't support multiple CPUs in one system yet.

Can we have an assertion if you try to add an IOMMU to
the CPU's view of memory, so it's obvious if we ever do
run into this case?

thanks
-- PMM
Paolo Bonzini May 27, 2013, 10:45 a.m. UTC | #13
Il 27/05/2013 12:33, Peter Maydell ha scritto:
> On 27 May 2013 09:19, Paolo Bonzini <pbonzini@redhat.com> wrote:
>> Il 27/05/2013 09:23, Jan Kiszka ha scritto:
>>> Err, why? Will we emulate IOMMUs for TCG differently?
>>
>> Because IOMMUs should never be added to address_space_memory.
>>
>> TCG should only encounter an IOMMU during device emulation (DMA), not
>> because of reads/writes from the CPU.  So the IOTLBs should never point
>> to an IOMMU region.
> 
> This seems a slightly dubious assumption to me. For instance
> here's a sample system diagram that puts a Cortex-M3 CPU
> behind an IOMMU (the MMU-500 dotted line):
> http://www.arm.com/images/CoreLink_MMU-500_in_System.jpg
> Admittedly we're a long way from being able to model that
> since we don't support multiple CPUs in one system yet.

It is possible to do it.

One way is to add IOMMU handling to the memory dispatch routines.  This
was present in Avi's patches.  With the changes to propagate errors
through MMIO dispatch it raises some interesting points WRT
time-of-check-to-time-of-use (we need to ensure the translation is only
done once and reused), but it should be doable and anyway isn't the
biggest problem.  The biggest problem is that, I think, this wouldn't
work because all accesses including reading code would be treated as MMIO.

A second way is to use the IOMMU notifiers to flush the CPU TLB entry
whenever the corresponding IOMMU entry changes.  This is probably not
what hardware does, but it is faster and doesn't have the problem of
code accesses.  The previous versions I sent do this, except I didn't
have the IOMMU notifiers yet.

As in other cases, I prefer no code to untested code.  The design is
sane (it isn't mine, so I can say it :)) and we know it can be done.

> Can we have an assertion if you try to add an IOMMU to
> the CPU's view of memory, so it's obvious if we ever do
> run into this case?

Almost, I have an assertion that triggers if tlb_set_page would be
handed out an IOMMU region.

Paolo
Peter Maydell May 27, 2013, 11:33 a.m. UTC | #14
On 27 May 2013 11:45, Paolo Bonzini <pbonzini@redhat.com> wrote:
> It is possible to do it.

> As in other cases, I prefer no code to untested code.  The design is
> sane (it isn't mine, so I can say it :)) and we know it can be done.

Agreed; we're a long way from being able to make use of this.

> Almost, I have an assertion that triggers if tlb_set_page would be
> handed out an IOMMU region.

Sounds good.

thanks
-- PMM
diff mbox

Patch

diff --git a/exec.c b/exec.c
index 1610604..3d36bc7 100644
--- a/exec.c
+++ b/exec.c
@@ -210,13 +210,15 @@  MemoryRegionSection *address_space_translate(AddressSpace *as, hwaddr addr,
     IOMMUTLBEntry iotlb;
     MemoryRegionSection *section;
     hwaddr len = *plen;
+    Int128 diff;
 
     for (;;) {
         section = address_space_lookup_region(as, addr);
 
         /* Compute offset within MemoryRegionSection */
         addr -= section->offset_within_address_space;
-        len = MIN(section->size - addr, len);
+        diff = int128_sub(section->mr->size, int128_make64(addr));
+        len = MIN(int128_get64(diff), len);
 
         /* Compute offset within MemoryRegion */
         addr += section->offset_within_region;