Message ID | 1437389593-15297-1-git-send-email-real@ispras.ru |
---|---|
State | New |
Headers | show |
On 20/07/2015 12:53, Efimov Vasily wrote: > This patch improves PAM emulation. > > PAM defines 4 memory access redirection modes. In mode 1 reads are directed to > RAM and writes are directed to PCI. In mode 2 it is contrary. In mode 0 all > access is directed to PCI. In mode 3 it is directed to RAM. Currently all modes > are emulated using aliases. It is good for modes 0 and 3 but modes 1 and 2 > require more complicated logic. Present API has not needed region type. > > The patch uses ROM-like regions for modes 1 and 2. Each region has I/O callbacks > to redirect access to destination defined by current mode. Write access is > always redirected by callback. If actual read source is RAM or ROM (it is > common case) then ram_addr of PAM region is set to ram_addr of source region > with offset. Otherwise, when source region is an I/O region, reading is > redirected to source region read callback by PAM region one. > > The reasons of ram_addr modification for read redirection are: > - QEMU cannot execute code outside RAM or ROM (while BIOS tries exactly that); > - it is faster because of TLB is used. > > Redirection is based on address spaces: for PCI and for RAM. QEMU has no ones so > PAM creates private address spaces with root regions that alias to actual PCI > and RAM regions. > > The memory commit callbacks are used to keep read source and write destination > address spaces and ram_addr up to date. > > Signed-off-by: Efimov Vasily <real@ispras.ru> Out of curiosity, would it be necessary to flush the TLB when the PAM registers change? In QEMU, the TLB also has the function of a cache in some sense (because, by pointing to a ram_addr_t, it prevents reads, writes or fetches from going through the slow MMIO path). Paolo
21.07.2015 10:46, Paolo Bonzini пишет: > Out of curiosity, would it be necessary to flush the TLB when the PAM > registers change? > > In QEMU, the TLB also has the function of a cache in some sense > (because, by pointing to a ram_addr_t, it prevents reads, writes or > fetches from going through the slow MMIO path). There is exec.c: tcg_commit callback. It calls cpu-exec.c: cpu_reload_memory_map that calls cputlb.c:tlb_flush. When PAM register is changed pam_update is called. Its call is surrounded by memory_region_transaction_begin/_commit couple both in i440fx and MCH9 host bridges. tcg_commit and pam_mem_commit are called during memory_region_transaction_commit execution. To summarize: - TLB is flushed by existing code. - Should I remove pam_update_redirection call from pam_set_current? It will be called by pam_mem_commit soon? Note that a PAM API user should call memory_region_transaction_begin/_commit in this case. > > Paolo > Vasily
On Mon, Jul 20, 2015 at 01:53:13PM +0300, Efimov Vasily wrote: > This patch improves PAM emulation. > > PAM defines 4 memory access redirection modes. In mode 1 reads are directed to > RAM and writes are directed to PCI. In mode 2 it is contrary. In mode 0 all > access is directed to PCI. In mode 3 it is directed to RAM. Currently all modes > are emulated using aliases. It is good for modes 0 and 3 but modes 1 and 2 > require more complicated logic. Present API has not needed region type. Hi, Is the motivation of this patch to improve the correctness of the hardware emulation or is there some feature that the current QEMU implementation prevents? There has been some interest recently in making SeaBIOS boot faster on QEMU and the shadow ram fixups in SeaBIOS have been shown to consume a notable amount of bootup time (~12ms on my machine). Will changing the PAM support improve overall boot times? I wonder if going from "mode 0" -> "mode 2" -> "mode 3" would be more expensive than the current SeaBIOS "mode 0" -> "mode 3" mechanism. Also note that SeaBIOS would still need to copy from the high memory location even if the PAM registers were fully implemented as SeaBIOS often exceeds 128K - the "mode 2" overwrite oneself trick would only work for the last 128K of the rom. Thanks, -Kevin
22.07.2015 19:37, Kevin O'Connor пишет: Hi, > > Hi, > > Is the motivation of this patch to improve the correctness of the > hardware emulation or is there some feature that the current QEMU > implementation prevents? The motivation is to improve correctness. The current QEMU PAM implementation prevents executing any guest that behave closer to specs about PAM than SeaBIOS. > > There has been some interest recently in making SeaBIOS boot faster on > QEMU and the shadow ram fixups in SeaBIOS have been shown to consume a > notable amount of bootup time (~12ms on my machine). > > Will changing the PAM support improve overall boot times? I wonder if > going from "mode 0" -> "mode 2" -> "mode 3" would be more expensive > than the current SeaBIOS "mode 0" -> "mode 3" mechanism. The patch does not touch emulation of modes 0 and 3. SeaBIOS still can use they, so no fixups are required in SeaBIOS. The SeaBIOS patch I listed in letter is only to show difference between current and new PAM implementations. An open-source BIOS is just simple way to do that. I measured time between QEMU start and original SeaBIOS attempt to boot from hard disk. With original PAM it equals 267ms against 269ms with new PAM. 100 measurements are made. Standard deviation is 15ms. Measurements are based on QEMU_CLOCK_REALTIME and SeaBIOS I/O 0x402 port output. I patched isa-debugcon device to add per-line timestamps. Onetime 2ms slowdown seems to be acceptable tradeoff. > > Also note that SeaBIOS would still need to copy from the high memory > location even if the PAM registers were fully implemented as SeaBIOS > often exceeds 128K - the "mode 2" overwrite oneself trick would only > work for the last 128K of the rom. > > Thanks, > -Kevin > Vasily
Ping Vasily
On 20/07/2015 12:53, Efimov Vasily wrote: > + read_src = address_space_translate(read_as, pam_offset, > + &offset_within_read_leaf, &unused, > + false); > + > + if (memory_region_is_ram(read_src) || memory_region_is_romd(read_src)) { > + /* Read source is RAM or ROM. > + Make current PAM region a ROM with body inside read source. */ > + cur->ram_addr = read_src->ram_addr + offset_within_read_leaf; > + cur->rom_device = true; > + cur->romd_mode = true; > + } else { > + /* Make current PAM region a clearly I/O region. */ > + cur->ram_addr = ~(ram_addr_t) 0; > + cur->rom_device = false; > + cur->romd_mode = false; > + } > + > + pam->write_as = write_as; > + pam->read_as = read_as; > +} > + Hi Vasily, I agree that this patch is an improvement compared to the earlier versions, but it's still a bit of an abstraction violation and I'm not sure if it works with KVM. Let's see if we can improve things. Please correct me on the following: 1) For the "Make current PAM region a ROM" case, we can get the ram_addr_t directly from the pc.bios and pc.rom MemoryRegions, and poke into pam->region[1] and pam->region[2] when we create them. 2) For the "Make current PAM region an I/O region" case, you could add an IOMMU region that to 0xc0000-0xfffff. The listener would disable pam->region[1] if address_space_translate returns an I/O region and enable it if it returns RAM/ROM. However, I cannot understand or remember what is the case where you get an I/O region. Paolo
07.09.2015 15:50, Paolo Bonzini пишет: > > Hi Vasily, > > I agree that this patch is an improvement compared to the earlier > versions, but it's still a bit of an abstraction violation and I'm not > sure if it works with KVM. It does not work with KVM. BIOS freezes on 0xCAA26 on mov %ax,%es instruction (according to QEMU disassembler). Last debug messages are: Booting from ROM... Booting from ca80:003c So, a lot of BIOS code is executed. I will investigate the problem. > > Let's see if we can improve things. Please correct me on the following: > > 1) For the "Make current PAM region a ROM" case, we can get the > ram_addr_t directly from the pc.bios and pc.rom MemoryRegions, and poke > into pam->region[1] and pam->region[2] when we create them. Yes, we can. But what if another region is at the address? The -pflash option is an example. By default and with -bios option mtree is: address-space: memory 0000000000000000-ffffffffffffffff (prio 0, RW): system 0000000000000000-0000000007ffffff (prio 0, RW): alias ram-below-4g @pc.ram 0000000000000000-0000000007ffffff 0000000000000000-ffffffffffffffff (prio -1, RW): pci 00000000000c0000-00000000000dffff (prio 1, RW): pc.rom 00000000000e0000-00000000000fffff (prio 1, R-): alias isa-bios @pc.bios 0000000000020000-000000000003ffff 00000000fffc0000-00000000ffffffff (prio 0, R-): pc.bios With -pflash option mtree is: address-space: memory 0000000000000000-ffffffffffffffff (prio 0, RW): system 0000000000000000-0000000007ffffff (prio 0, RW): alias ram-below-4g @pc.ram 0000000000000000-0000000007ffffff 0000000000000000-ffffffffffffffff (prio -1, RW): pci 00000000000c0000-00000000000dffff (prio 1, RW): pc.rom 00000000000e0000-00000000000fffff (prio 1, R-): isa-bios 00000000fffc0000-00000000ffffffff (prio 0, R-): system.flash0 There is significant difference about isa-bios at least. In general, new PAM redirects access to region at the address but not exactly to pc.bios/isa-bios or pc.rom. In other words, I suggest more generic solution. We also can set up redirection at machine initialization, but using of listener makes sure redirection is actual at runtime. I do not know case in which regions at the PAM addresses are changed dynamically during guest work. But even during machine initialization the memory tree is changed multiple times. So, listener at least ensures the last version of tree is used for redirection choice. > > 2) For the "Make current PAM region an I/O region" case, you could add > an IOMMU region that to 0xc0000-0xfffff. The listener would disable > pam->region[1] if address_space_translate returns an I/O region and > enable it if it returns RAM/ROM. As I see, IOMMU cannot be used for CPU access to memory because of assertion (!section->mr->iommu_ops) in exec.c: address_space_translate_for_iotlb IOMMU is used for redirection of device and debug access through address_space_translate. Please correct me if I miss something. > However, I cannot understand or > remember what is the case where you get an I/O region. There is no known case in which an I/O region is at PAM addresses. But it is theoretically possible. Hence, I decide to implement it instead of insert an assertion. Also note, the code also covers case of simple container memory region at the address. > > Paolo >
On 09/09/2015 14:03, Ефимов Василий wrote: > We also can set up redirection at machine initialization, but using of > listener makes sure redirection is actual at runtime. I do not know > case in which regions at the PAM addresses are changed dynamically > during guest work. But even during machine initialization the memory > tree is changed multiple times. So, listener at least ensures the last > version of tree is used for redirection choice. Fair enough. >> 2) For the "Make current PAM region an I/O region" case, you could add >> an IOMMU region that to 0xc0000-0xfffff. The listener would disable >> pam->region[1] if address_space_translate returns an I/O region and >> enable it if it returns RAM/ROM. > As I see, IOMMU cannot be used for CPU access to memory because of > assertion (!section->mr->iommu_ops) in > exec.c: address_space_translate_for_iotlb > IOMMU is used for redirection of device and debug access through > address_space_translate. Please correct me if I miss something. You're right. We could remove the assertion and reuse subpage_ops for IOMMUs, it would not allow running code but it would allow accesses. But it's not necessary because this can never happen in practice. > There is no known case in which an I/O region is at PAM addresses. > But it is theoretically possible. Hence, I decide to implement it > instead of insert an assertion. Let's keep the code simple and assert. If you put a BAR at 0xc0000-0xfffff, RAM wins, so this situation should never happen on x86 chipsets. Paolo
====== Whitespaces are added to prevent attempt to apply SeaBIOS patch to QEMU. diff --git a/src/fw/shadow.c b/src/fw/shadow.c index 4f00006..7249aa2 100644 --- a/src/fw/shadow.c +++ b/src/fw/shadow.c @@ -26,32 +26,62 @@ static void __make_bios_writable_intel(u16 bdf, u32 pam0) { // Make ram from 0xc0000-0xf0000 writable - int clear = 0; + dprintf(1, "PAM mode 1 test begin\n"); + unsigned *m = (unsigned *) BUILD_ROM_START; + + pci_config_writeb(bdf, pam0 + 1, 0x33); + *m = 0xdeafbeef; + + pci_config_writeb(bdf, pam0 + 1, 0x11); + volatile unsigned t = *m; + *m = t; + + pci_config_writeb(bdf, pam0 + 1, 0x33); + t = *m; + + pci_config_writeb(bdf, pam0 + 1, 0x00); + unsigned t2 = *m; + + dprintf(1, "t = 0x%x, t2 = 0x%x\n", t, t2); + + dprintf(1, "PAM mode 1 test end\n"); int i; + unsigned *mem, *mem_limit; for (i=0; i<6; i++) { u32 pam = pam0 + 1 + i; int reg = pci_config_readb(bdf, pam); - if (CONFIG_OPTIONROMS_DEPLOYED && (reg & 0x11) != 0x11) { - // Need to copy optionroms to work around qemu implementation - void *mem = (void*)(BUILD_ROM_START + i * 32*1024); - memcpy((void*)BUILD_BIOS_TMP_ADDR, mem, 32*1024); + if ((reg & 0x11) != 0x11) { + mem = (unsigned *)(BUILD_ROM_START + i * 32 * 1024); + pci_config_writeb(bdf, pam, 0x22); + mem_limit = mem + 32 * 1024 / sizeof(unsigned); + + while (mem < mem_limit) { + volatile unsigned tmp = *mem; + *mem = tmp; + mem++; + } pci_config_writeb(bdf, pam, 0x33); - memcpy(mem, (void*)BUILD_BIOS_TMP_ADDR, 32*1024); - clear = 1; } else { pci_config_writeb(bdf, pam, 0x33); } } - if (clear) - memset((void*)BUILD_BIOS_TMP_ADDR, 0, 32*1024); // Make ram from 0xf0000-0x100000 writable int reg = pci_config_readb(bdf, pam0); - pci_config_writeb(bdf, pam0, 0x30); if (reg & 0x10) // Ram already present. return; + pci_config_writeb(bdf, pam0, 0x22); + mem = (unsigned *)BUILD_BIOS_ADDR; + mem_limit = mem + 32 * 1024 / sizeof(unsigned); + while (mem < mem_limit) { + volatile unsigned tmp = *mem; + *mem = tmp; + mem++; + } + pci_config_writeb(bdf, pam0, 0x33); + // Copy bios. extern u8 code32flat_start[], code32flat_end[]; memcpy(code32flat_start, code32flat_start + BIOS_SRC_OFFSET @@ -61,17 +91,6 @@ __make_bios_writable_intel(u16 bdf, u32 pam0) static void make_bios_writable_intel(u16 bdf, u32 pam0) { - int reg = pci_config_readb(bdf, pam0); - if (!(reg & 0x10)) { - // QEMU doesn't fully implement the piix shadow capabilities - - // if ram isn't backing the bios segment when shadowing is - // disabled, the code itself wont be in memory. So, run the - // code from the high-memory flash location. - u32 pos = (u32)__make_bios_writable_intel + BIOS_SRC_OFFSET; - void (*func)(u16 bdf, u32 pam0) = (void*)pos; - func(bdf, pam0); - return; - } // Ram already present - just enable writes __make_bios_writable_intel(bdf, pam0); } ====== hw/pci-host/pam.c | 167 +++++++++++++++++++++++++++++++++++++++------- include/hw/pci-host/pam.h | 4 +- 2 files changed, 146 insertions(+), 25 deletions(-) diff --git a/hw/pci-host/pam.c b/hw/pci-host/pam.c index 17d826c..e668396 100644 --- a/hw/pci-host/pam.c +++ b/hw/pci-host/pam.c @@ -27,43 +27,162 @@ * THE SOFTWARE. */ -#include "qom/object.h" -#include "sysemu/sysemu.h" #include "hw/pci-host/pam.h" +#include "exec/address-spaces.h" +#include "qemu/bswap.h" + +static bool pam_as_initialized; +static MemoryRegion pam_ram_as_root; +static AddressSpace pam_ram_address_space; +static MemoryRegion pam_pci_as_root; +static AddressSpace pam_pci_address_space; + +static void pam_write(void *opaque, hwaddr addr, uint64_t data, + unsigned size) +{ + PAMMemoryRegion *pam = (PAMMemoryRegion *) opaque; + hwaddr pam_offset = pam->region[0].addr; + uint8_t data1 = (uint8_t) data; + + address_space_write(pam->write_as, pam_offset + addr, + MEMTXATTRS_UNSPECIFIED, &data1, 1); +} + +static uint64_t pam_read(void *opaque, hwaddr addr, unsigned size) +{ + PAMMemoryRegion *pam = (PAMMemoryRegion *) opaque; + hwaddr pam_offset = pam->region[0].addr; + uint8_t ret; + + address_space_read(pam->read_as, pam_offset + addr, + MEMTXATTRS_UNSPECIFIED, &ret, 1); + + return (uint64_t) ret; +} + +static MemoryRegionOps pam_ops = { + .write = pam_write, + .read = pam_read, + .impl.max_access_size = 1, +}; + +static void pam_update_redirection(PAMMemoryRegion *pam) +{ + hwaddr pam_offset = pam->region[0].addr; + AddressSpace *write_as, *read_as; + MemoryRegion *cur, *read_src; + hwaddr unused, offset_within_read_leaf; + + if (pam->current == 1) { + /* Read from RAM and write to PCI */ + write_as = &pam_pci_address_space; + read_as = &pam_ram_address_space; + cur = &pam->region[1]; + } else if (pam->current == 2) { + /* Read from PCI and write to RAM */ + write_as = &pam_ram_address_space; + read_as = &pam_pci_address_space; + cur = &pam->region[2]; + } else { + return; + } + + read_src = address_space_translate(read_as, pam_offset, + &offset_within_read_leaf, &unused, + false); + + if (memory_region_is_ram(read_src) || memory_region_is_romd(read_src)) { + /* Read source is RAM or ROM. + Make current PAM region a ROM with body inside read source. */ + cur->ram_addr = read_src->ram_addr + offset_within_read_leaf; + cur->rom_device = true; + cur->romd_mode = true; + } else { + /* Make current PAM region a clearly I/O region. */ + cur->ram_addr = ~(ram_addr_t) 0; + cur->rom_device = false; + cur->romd_mode = false; + } + + pam->write_as = write_as; + pam->read_as = read_as; +} + +static void pam_set_current(PAMMemoryRegion *pam, unsigned new_current) +{ + assert(new_current <= 3); + + if (new_current != pam->current) { + memory_region_set_enabled(&pam->region[pam->current], false); + pam->current = new_current; + memory_region_set_enabled(&pam->region[new_current], true); + + pam_update_redirection(pam); + } +} + +static void pam_mem_commit(MemoryListener *listener) +{ + PAMMemoryRegion *pam = container_of(listener, PAMMemoryRegion, listener); + pam_update_redirection(pam); +} void init_pam(DeviceState *dev, MemoryRegion *ram_memory, MemoryRegion *system_memory, MemoryRegion *pci_address_space, - PAMMemoryRegion *mem, uint32_t start, uint32_t size) + PAMMemoryRegion *pam, uint32_t start, uint32_t size) { + if (unlikely(!pam_as_initialized)) { + memory_region_init_alias(&pam_ram_as_root, OBJECT(dev), + "pam-ram-root", ram_memory, 0, UINT64_MAX); + address_space_init(&pam_ram_address_space, &pam_ram_as_root, + "PAM_RAM"); + + memory_region_init_alias(&pam_pci_as_root, OBJECT(dev), + "pam-pci-root", pci_address_space, 0, UINT64_MAX); + address_space_init(&pam_pci_address_space, &pam_pci_as_root, + "PAM_PCI"); + + pam_as_initialized = true; + } int i; - /* RAM */ - memory_region_init_alias(&mem->alias[3], OBJECT(dev), "pam-ram", ram_memory, - start, size); - /* ROM (XXX: not quite correct) */ - memory_region_init_alias(&mem->alias[1], OBJECT(dev), "pam-rom", ram_memory, - start, size); - memory_region_set_readonly(&mem->alias[1], true); - - /* XXX: should distinguish read/write cases */ - memory_region_init_alias(&mem->alias[0], OBJECT(dev), "pam-pci", pci_address_space, - start, size); - memory_region_init_alias(&mem->alias[2], OBJECT(dev), "pam-pci", ram_memory, - start, size); - - for (i = 0; i < 4; ++i) { - memory_region_set_enabled(&mem->alias[i], false); + /* All access to PCI */ + memory_region_init_alias(&pam->region[0], OBJECT(dev), "pam-pci", + pci_address_space, start, size); + + /* Read from RAM and write to PCI */ + memory_region_init_io(&pam->region[1], OBJECT(dev), &pam_ops, pam, + "pam-r-ram-w-pci", size); + + /* Read from PCI and write to RAM */ + memory_region_init_io(&pam->region[2], OBJECT(dev), &pam_ops, pam, + "pam-r-pci-w-ram", size); + + /* All access to RAM */ + memory_region_init_alias(&pam->region[3], OBJECT(dev), "pam-ram", + ram_memory, start, size); + + pam->current = 0; + memory_region_add_subregion_overlap(system_memory, start, + &pam->region[0], 1); + memory_region_set_enabled(&pam->region[0], true); + + for (i = 1; i < 4; i++) { + memory_region_set_enabled(&pam->region[i], false); memory_region_add_subregion_overlap(system_memory, start, - &mem->alias[i], 1); + &pam->region[i], 1); } - mem->current = 0; + + pam->listener = (MemoryListener) { + .commit = pam_mem_commit, + }; + + memory_listener_register(&pam->listener, &address_space_memory); } void pam_update(PAMMemoryRegion *pam, int idx, uint8_t val) { assert(0 <= idx && idx <= 12); - memory_region_set_enabled(&pam->alias[pam->current], false); - pam->current = (val >> ((!(idx & 1)) * 4)) & PAM_ATTR_MASK; - memory_region_set_enabled(&pam->alias[pam->current], true); + pam_set_current(pam, (val >> ((!(idx & 1)) * 4)) & PAM_ATTR_MASK); } diff --git a/include/hw/pci-host/pam.h b/include/hw/pci-host/pam.h index 6116c63..838988c 100644 --- a/include/hw/pci-host/pam.h +++ b/include/hw/pci-host/pam.h @@ -82,8 +82,10 @@ #define SMRAM_C_BASE_SEG ((uint8_t)0x2) /* hardwired to b010 */ typedef struct PAMMemoryRegion { - MemoryRegion alias[4]; /* index = PAM value */ + MemoryRegion region[4]; /* index = PAM value */ unsigned current; + AddressSpace *read_as, *write_as; + MemoryListener listener; } PAMMemoryRegion; void init_pam(DeviceState *dev, MemoryRegion *ram, MemoryRegion *system,
This patch improves PAM emulation. PAM defines 4 memory access redirection modes. In mode 1 reads are directed to RAM and writes are directed to PCI. In mode 2 it is contrary. In mode 0 all access is directed to PCI. In mode 3 it is directed to RAM. Currently all modes are emulated using aliases. It is good for modes 0 and 3 but modes 1 and 2 require more complicated logic. Present API has not needed region type. The patch uses ROM-like regions for modes 1 and 2. Each region has I/O callbacks to redirect access to destination defined by current mode. Write access is always redirected by callback. If actual read source is RAM or ROM (it is common case) then ram_addr of PAM region is set to ram_addr of source region with offset. Otherwise, when source region is an I/O region, reading is redirected to source region read callback by PAM region one. The reasons of ram_addr modification for read redirection are: - QEMU cannot execute code outside RAM or ROM (while BIOS tries exactly that); - it is faster because of TLB is used. Redirection is based on address spaces: for PCI and for RAM. QEMU has no ones so PAM creates private address spaces with root regions that alias to actual PCI and RAM regions. The memory commit callbacks are used to keep read source and write destination address spaces and ram_addr up to date. Signed-off-by: Efimov Vasily <real@ispras.ru> --- v2 change: - use address spaces for access redirection Qemu distribution includes SeaBIOS which has hacks to work around incorrect modes 1 and 2 emulation. This patch series is tested using modified SeaBIOS. It is forced to use mode 2 for copying its data. BIOS reads a value from memory and immediately writes it to same address. According to PAM definition, reads are directed to PCI (i.e. to BIOS ROM) and writes are directed to RAM. The patch for SeaBIOS is listed below. Both SeaBIOS versions works with new PAM but the modified one does not work with old PAM.