diff mbox series

[v4,2/3] lib: logic_pio: Reject accesses to unregistered CPU MMIO regions

Message ID 1560262374-67875-3-git-send-email-john.garry@huawei.com
State Changes Requested
Delegated to: Bjorn Helgaas
Headers show
Series Fix ARM64 crash for accessing unmapped IO port regions | expand

Commit Message

John Garry June 11, 2019, 2:12 p.m. UTC
Currently when accessing logical indirect PIO addresses in
logic_{in, out}{,s}, we first ensure that the region is registered.

However, no such check exists for CPU MMIO regions. The CPU MMIO regions
would be registered by the PCI host (when PCI_IOBASE is defined) in
pci_register_io_range().

We have seen scenarios when systems which don't have a PCI host, or they
do but the PCI host probe fails, certain drivers attempts to still attempt
to access PCI IO ports; examples are in [1] and [2].

Such is a case on an ARM64 system without a PCI host:

root@(none)$ insmod hwmon/f71805f.ko
 Unable to handle kernel paging request at virtual address ffff7dfffee0002e
 Mem abort info:
   ESR = 0x96000046
   Exception class = DABT (current EL), IL = 32 bits
   SET = 0, FnV = 0
   EA = 0, S1PTW = 0
 Data abort info:
   ISV = 0, ISS = 0x00000046
   CM = 0, WnR = 1
 swapper pgtable: 4k pages, 48-bit VAs, pgdp = (____ptrval____)
 [ffff7dfffee0002e] pgd=000000000141c003, pud=000000000141d003, pmd=0000000000000000
 Internal error: Oops: 96000046 [#1] PREEMPT SMP
 Modules linked in: f71805f(+)
 CPU: 20 PID: 2736 Comm: insmod Not tainted 5.1.0-rc1-00003-g6f1bfec2a620-dirty #99
 Hardware name: Huawei Taishan 2280 /D05, BIOS Hisilicon D05 IT21 Nemo 2.0 RC0 04/18/2018
 pstate: 80000005 (Nzcv daif -PAN -UAO)
 pc : logic_outb+0x54/0xb8
 lr : f71805f_find+0x2c/0x1b8 [f71805f]
 sp : ffff000025fbba90
 x29: ffff000025fbba90 x28: ffff000008b944d0
 x27: ffff000025fbbdf0 x26: 0000000000000100
 x25: ffff801f8c270580 x24: ffff000011420000
 x23: ffff000025fbbb3e x22: ffff000025fbbb40
 x21: ffff000008b991b8 x20: 0000000000000087
 x19: 000000000000002e x18: ffffffffffffffff
 x17: 0000000000000000 x16: 0000000000000000
 x15: ffff00001127d6c8 x14: 0000000000000000
 x13: 0000000000000000 x12: 0000000000000000
 x11: 0000000000010820 x10: 0000841fdac40000
 x9 : 0000000000000001 x8 : 0000000040000000
 x7 : 0000000000210d00 x6 : 0000000000000000
 x5 : ffff801fb6a46040 x4 : ffff841febeaeda0
 x3 : 0000000000ffbffe x2 : ffff000025fbbb40
 x1 : ffff7dfffee0002e x0 : ffff7dfffee00000
 Process insmod (pid: 2736, stack limit = 0x(____ptrval____))
 Call trace:
  logic_outb+0x54/0xb8
  f71805f_find+0x2c/0x1b8 [f71805f]
  f71805f_init+0x38/0xe48 [f71805f]
  do_one_initcall+0x5c/0x198
  do_init_module+0x54/0x1b0
  load_module+0x1dc4/0x2158
  __se_sys_init_module+0x14c/0x1e8
  __arm64_sys_init_module+0x18/0x20
  el0_svc_common+0x5c/0x100
  el0_svc_handler+0x2c/0x80
  el0_svc+0x8/0xc
 Code: d2bfdc00 f2cfbfe0 f2ffffe0 8b000021 (39000034)
 ---[ end trace 10ea80bde051bbfc ]---
root@(none)$

Well-behaved drivers call request_{muxed_}region() to grab the IO port
region, but success here still doesn't actually mean that there is some IO
port mapped in this region.

This patch adds a check to ensure that the CPU MMIO region is registered
prior to accessing the PCI IO ports.

Any failed checks silently return.

[1] https://lore.kernel.org/linux-pci/56F209A9.4040304@huawei.com
[2] https://lore.kernel.org/linux-arm-kernel/e6995b4a-184a-d8d4-f4d4-9ce75d8f47c0@huawei.com/

Signed-off-by: John Garry <john.garry@huawei.com>
---
 lib/logic_pio.c | 60 +++++++++++++++++++++++++++++++++----------------
 1 file changed, 41 insertions(+), 19 deletions(-)

Comments

Bjorn Helgaas June 13, 2019, 3:20 a.m. UTC | #1
On Tue, Jun 11, 2019 at 10:12:53PM +0800, John Garry wrote:
> Currently when accessing logical indirect PIO addresses in
> logic_{in, out}{,s}, we first ensure that the region is registered.

I think logic_pio is specifically concerned with I/O port space, so
it's a little bit unfortunate that we named this "PIO".

PIO is a general term for "Programmed I/O", which just means the CPU
is involved in each transfer, as opposed to DMA.  The transfers can be
to either MMIO or I/O port space.

So this ends up being a little confusing because I think you mean
"Port I/O", not "Programmed I/O".

> However, no such check exists for CPU MMIO regions. The CPU MMIO regions
> would be registered by the PCI host (when PCI_IOBASE is defined) in
> pci_register_io_range().

IIUC this "CPU MMIO region" is an MMIO region where a memory load or
store from the CPU is converted by a PCI host bridge into an I/O port
transaction on PCI.

Again IIUC, logic_pio supports two kinds of I/O port space accesses:

  1) The simple "bridge converts loads/stores to an MMIO region to PCI
     I/O port transactions" kind, and

  2) The more complicated "somebody supplies logic_pio_host_ops
     functions that do arbitrary magic to generate I/O port
     transactions on some bus.

And this patch is making the first kind smarter.  Previously it would
perform the memory access whenever "addr < MMIO_UPPER_LIMIT", but
after this patch it will only do it if find_io_range() succeeds.

Right?  Sorry for restating what probably should have been obvious to
me.

I have two observations here:

  1) The simple "bridge converts CPU MMIO space to PCI I/O port space"
     flavor is essentially identical to what ia64 (and probably other
     architectures) does.  This should really be combined somehow.

  2) If you made a default set of logic_pio_host_ops that merely did
     loads/stores and maybe added a couple fields in the struct
     logic_pio_hwaddr, I bet you could unify the two kinds so
     logic_inb() would look something like this:

       u8 logic_inb(unsigned long addr)
       {
         struct logic_pio_hwaddr *range = find_io_range(addr);

	 if (!range)
	   return (u8) ~0;

         return (u8) range->ops->in(range->hostdata, addr, sz);
       }

> We have seen scenarios when systems which don't have a PCI host, or they
> do but the PCI host probe fails, certain drivers attempts to still attempt
> to access PCI IO ports; examples are in [1] and [2].
> 
> Such is a case on an ARM64 system without a PCI host:
> 
> root@(none)$ insmod hwmon/f71805f.ko
>  Unable to handle kernel paging request at virtual address ffff7dfffee0002e
>  Mem abort info:
>    ESR = 0x96000046
>    Exception class = DABT (current EL), IL = 32 bits
>    SET = 0, FnV = 0
>    EA = 0, S1PTW = 0
>  Data abort info:
>    ISV = 0, ISS = 0x00000046
>    CM = 0, WnR = 1
>  swapper pgtable: 4k pages, 48-bit VAs, pgdp = (____ptrval____)
>  [ffff7dfffee0002e] pgd=000000000141c003, pud=000000000141d003, pmd=0000000000000000
>  Internal error: Oops: 96000046 [#1] PREEMPT SMP
>  Modules linked in: f71805f(+)
>  CPU: 20 PID: 2736 Comm: insmod Not tainted 5.1.0-rc1-00003-g6f1bfec2a620-dirty #99
>  Hardware name: Huawei Taishan 2280 /D05, BIOS Hisilicon D05 IT21 Nemo 2.0 RC0 04/18/2018
>  pstate: 80000005 (Nzcv daif -PAN -UAO)
>  pc : logic_outb+0x54/0xb8
>  lr : f71805f_find+0x2c/0x1b8 [f71805f]
>  sp : ffff000025fbba90
>  x29: ffff000025fbba90 x28: ffff000008b944d0
>  x27: ffff000025fbbdf0 x26: 0000000000000100
>  x25: ffff801f8c270580 x24: ffff000011420000
>  x23: ffff000025fbbb3e x22: ffff000025fbbb40
>  x21: ffff000008b991b8 x20: 0000000000000087
>  x19: 000000000000002e x18: ffffffffffffffff
>  x17: 0000000000000000 x16: 0000000000000000
>  x15: ffff00001127d6c8 x14: 0000000000000000
>  x13: 0000000000000000 x12: 0000000000000000
>  x11: 0000000000010820 x10: 0000841fdac40000
>  x9 : 0000000000000001 x8 : 0000000040000000
>  x7 : 0000000000210d00 x6 : 0000000000000000
>  x5 : ffff801fb6a46040 x4 : ffff841febeaeda0
>  x3 : 0000000000ffbffe x2 : ffff000025fbbb40
>  x1 : ffff7dfffee0002e x0 : ffff7dfffee00000
>  Process insmod (pid: 2736, stack limit = 0x(____ptrval____))
>  Call trace:
>   logic_outb+0x54/0xb8
>   f71805f_find+0x2c/0x1b8 [f71805f]
>   f71805f_init+0x38/0xe48 [f71805f]
>   do_one_initcall+0x5c/0x198
>   do_init_module+0x54/0x1b0
>   load_module+0x1dc4/0x2158
>   __se_sys_init_module+0x14c/0x1e8
>   __arm64_sys_init_module+0x18/0x20
>   el0_svc_common+0x5c/0x100
>   el0_svc_handler+0x2c/0x80
>   el0_svc+0x8/0xc
>  Code: d2bfdc00 f2cfbfe0 f2ffffe0 8b000021 (39000034)
>  ---[ end trace 10ea80bde051bbfc ]---
> root@(none)$
> 
> Well-behaved drivers call request_{muxed_}region() to grab the IO port
> region, but success here still doesn't actually mean that there is some IO
> port mapped in this region.
> 
> This patch adds a check to ensure that the CPU MMIO region is registered
> prior to accessing the PCI IO ports.
> 
> Any failed checks silently return.

I *think* what you're doing here is making inb/outb/etc work the same
as on x86, i.e., if no device responds to an inb(), the caller gets
~0, and if no device claims an outb() the data gets dropped.

That should be explicit in the commit log.

> [1] https://lore.kernel.org/linux-pci/56F209A9.4040304@huawei.com
> [2] https://lore.kernel.org/linux-arm-kernel/e6995b4a-184a-d8d4-f4d4-9ce75d8f47c0@huawei.com/
> 
> Signed-off-by: John Garry <john.garry@huawei.com>
> ---
>  lib/logic_pio.c | 60 +++++++++++++++++++++++++++++++++----------------
>  1 file changed, 41 insertions(+), 19 deletions(-)
> 
> diff --git a/lib/logic_pio.c b/lib/logic_pio.c
> index 40d9428010e1..47d24f428908 100644
> --- a/lib/logic_pio.c
> +++ b/lib/logic_pio.c
> @@ -126,7 +126,7 @@ static struct logic_pio_hwaddr *find_io_range(unsigned long pio)
>  		if (in_range(pio, range->io_start, range->size))
>  			return range;
>  	}
> -	pr_err("PIO entry token %lx invalid\n", pio);
> +
>  	return NULL;
>  }
>  
> @@ -197,11 +197,12 @@ unsigned long logic_pio_trans_cpuaddr(resource_size_t addr)
>  type logic_in##bw(unsigned long addr)					\
>  {									\
>  	type ret = (type)~0;						\
> +	struct logic_pio_hwaddr *range = find_io_range(addr);		\
>  									\
>  	if (addr < MMIO_UPPER_LIMIT) {					\
> -		ret = read##bw(PCI_IOBASE + addr);			\
> +		if (range)						\
> +			ret = read##bw(PCI_IOBASE + addr);		\
>  	} else if (addr >= MMIO_UPPER_LIMIT && addr < IO_SPACE_LIMIT) { \
> -		struct logic_pio_hwaddr *range = find_io_range(addr);	\
>  		size_t sz = sizeof(type);				\
>  									\
>  		if (range && range->ops)				\
> @@ -214,10 +215,12 @@ type logic_in##bw(unsigned long addr)					\
>  									\
>  void logic_out##bw(type value, unsigned long addr)			\
>  {									\
> +	struct logic_pio_hwaddr *range = find_io_range(addr);		\
> +									\
>  	if (addr < MMIO_UPPER_LIMIT) {					\
> -		write##bw(value, PCI_IOBASE + addr);			\
> +		if (range)						\
> +			write##bw(value, PCI_IOBASE + addr);		\
>  	} else if (addr >= MMIO_UPPER_LIMIT && addr < IO_SPACE_LIMIT) {	\
> -		struct logic_pio_hwaddr *range = find_io_range(addr);	\
>  		size_t sz = sizeof(type);				\
>  									\
>  		if (range && range->ops)				\
> @@ -230,10 +233,12 @@ void logic_out##bw(type value, unsigned long addr)			\
>  									\
>  void logic_ins##bw(unsigned long addr, void *buf, unsigned int cnt)	\
>  {									\
> +	struct logic_pio_hwaddr *range = find_io_range(addr);		\
> +									\
>  	if (addr < MMIO_UPPER_LIMIT) {					\
> -		reads##bw(PCI_IOBASE + addr, buf, cnt);			\
> +		if (range)						\
> +			reads##bw(PCI_IOBASE + addr, buf, cnt);		\
>  	} else if (addr >= MMIO_UPPER_LIMIT && addr < IO_SPACE_LIMIT) {	\
> -		struct logic_pio_hwaddr *range = find_io_range(addr);	\
>  		size_t sz = sizeof(type);				\
>  									\
>  		if (range && range->ops)				\
> @@ -242,16 +247,17 @@ void logic_ins##bw(unsigned long addr, void *buf, unsigned int cnt)	\
>  		else							\
>  			WARN_ON_ONCE(1);				\
>  	}								\
> -									\
>  }									\
>  									\
>  void logic_outs##bw(unsigned long addr, const void *buf,		\
>  		    unsigned int cnt)					\
>  {									\
> +	struct logic_pio_hwaddr *range = find_io_range(addr);		\
> +									\
>  	if (addr < MMIO_UPPER_LIMIT) {					\
> -		writes##bw(PCI_IOBASE + addr, buf, cnt);		\
> +		if (range)						\
> +			writes##bw(PCI_IOBASE + addr, buf, cnt);	\
>  	} else if (addr >= MMIO_UPPER_LIMIT && addr < IO_SPACE_LIMIT) {	\
> -		struct logic_pio_hwaddr *range = find_io_range(addr);	\
>  		size_t sz = sizeof(type);				\
>  									\
>  		if (range && range->ops)				\
> @@ -269,28 +275,44 @@ type logic_in##bw(unsigned long addr)					\
>  {									\
>  	type ret = (type)~0;						\
>  									\
> -	if (addr < MMIO_UPPER_LIMIT)					\
> -		ret = read##bw(PCI_IOBASE + addr);			\
> +	if (addr < MMIO_UPPER_LIMIT) {					\
> +		struct logic_pio_hwaddr *range = find_io_range(addr);	\
> +									\
> +		if (range)						\
> +			ret = read##bw(PCI_IOBASE + addr);		\
> +	}								\
>  	return ret;							\
>  }									\
>  									\
> -void logic_out##bw(type value, unsigned long addr)			\
> +void logic_out##bw(type val, unsigned long addr)			\
>  {									\
> -	if (addr < MMIO_UPPER_LIMIT)					\
> -		write##bw(value, PCI_IOBASE + addr);			\
> +	if (addr < MMIO_UPPER_LIMIT) {					\
> +		struct logic_pio_hwaddr *range = find_io_range(addr);	\
> +									\
> +		if (range)						\
> +			write##bw(val, PCI_IOBASE + addr);		\
> +	}								\
>  }									\
>  									\
>  void logic_ins##bw(unsigned long addr, void *buf, unsigned int cnt)	\
>  {									\
> -	if (addr < MMIO_UPPER_LIMIT)					\
> -		reads##bw(PCI_IOBASE + addr, buf, cnt);			\
> +	if (addr < MMIO_UPPER_LIMIT) {					\
> +		struct logic_pio_hwaddr *range = find_io_range(addr);	\
> +									\
> +		if (range)						\
> +			reads##bw(PCI_IOBASE + addr, buf, cnt);		\
> +	}								\
>  }									\
>  									\
>  void logic_outs##bw(unsigned long addr, const void *buf,		\
>  		    unsigned int cnt)					\
>  {									\
> -	if (addr < MMIO_UPPER_LIMIT)					\
> -		writes##bw(PCI_IOBASE + addr, buf, cnt);		\
> +	if (addr < MMIO_UPPER_LIMIT) {					\
> +		struct logic_pio_hwaddr *range = find_io_range(addr);	\
> +									\
> +		if (range)						\
> +			writes##bw(PCI_IOBASE + addr, buf, cnt);	\
> +	}								\
>  }
>  #endif /* CONFIG_INDIRECT_PIO */
>  
> -- 
> 2.17.1
>
Arnd Bergmann June 13, 2019, 7:47 a.m. UTC | #2
On Thu, Jun 13, 2019 at 5:20 AM Bjorn Helgaas <helgaas@kernel.org> wrote:
>
> On Tue, Jun 11, 2019 at 10:12:53PM +0800, John Garry wrote:
> > Currently when accessing logical indirect PIO addresses in
> > logic_{in, out}{,s}, we first ensure that the region is registered.
>
> I think logic_pio is specifically concerned with I/O port space, so
> it's a little bit unfortunate that we named this "PIO".
>
> PIO is a general term for "Programmed I/O", which just means the CPU
> is involved in each transfer, as opposed to DMA.  The transfers can be
> to either MMIO or I/O port space.
>
> So this ends up being a little confusing because I think you mean
> "Port I/O", not "Programmed I/O".

I think the terms that John uses are more common: I would also
assume that "PIO" (regardless of whether you expand it as Port
or Programmed I/O) refers only to inb/outb and PCI/ISA/LPC
I/O space, and is distinct from "MMIO", which refers to the readl/writel
accessors and PCI memory space.

That is consistent with the usage across at least the x86, powerpc
and ia64 architectures when they refer to PIO.

        Arnd
John Garry June 13, 2019, 10:17 a.m. UTC | #3
On 13/06/2019 04:20, Bjorn Helgaas wrote:
> On Tue, Jun 11, 2019 at 10:12:53PM +0800, John Garry wrote:
>> Currently when accessing logical indirect PIO addresses in
>> logic_{in, out}{,s}, we first ensure that the region is registered.
>

Hi Bjorn,

> I think logic_pio is specifically concerned with I/O port space, so
> it's a little bit unfortunate that we named this "PIO".
>
> PIO is a general term for "Programmed I/O", which just means the CPU
> is involved in each transfer, as opposed to DMA.  The transfers can be
> to either MMIO or I/O port space.
>
> So this ends up being a little confusing because I think you mean
> "Port I/O", not "Programmed I/O".
>

Personally I agree that the naming isn't great. But then Arnd does think 
that "PIO" is appropriate.

There were many different names along the way to this support merged, 
and I think that the naming became almost irrelevant in the end.

>> However, no such check exists for CPU MMIO regions. The CPU MMIO regions
>> would be registered by the PCI host (when PCI_IOBASE is defined) in
>> pci_register_io_range().
>
> IIUC this "CPU MMIO region" is an MMIO region where a memory load or
> store from the CPU is converted by a PCI host bridge into an I/O port
> transaction on PCI.
>

Right

> Again IIUC, logic_pio supports two kinds of I/O port space accesses:
>
>   1) The simple "bridge converts loads/stores to an MMIO region to PCI
>      I/O port transactions" kind, and
>
>   2) The more complicated "somebody supplies logic_pio_host_ops
>      functions that do arbitrary magic to generate I/O port
>      transactions on some bus.

Right

>
> And this patch is making the first kind smarter.  Previously it would
> perform the memory access whenever "addr < MMIO_UPPER_LIMIT", but
> after this patch it will only do it if find_io_range() succeeds.
>
> Right?  Sorry for restating what probably should have been obvious to
> me.
>

Yes, right. A logical PIO range is registered for a PCI MMIO region in 
pci_register_io_range(). As such, if no range is registered, we can 
assume that no PCI MMIO region has been mapped and discard any attempted 
access.

> I have two observations here:
>
>   1) The simple "bridge converts CPU MMIO space to PCI I/O port space"
>      flavor is essentially identical to what ia64 (and probably other
>      architectures) does.  This should really be combined somehow.
>

Maybe. For ia64, it seems to have some "platform" versions of IO port 
accessors, and then also accessors need a fence barrier. I'm not sure 
how well that would fit with logical PIO. It would need further analysis.

IIRC, PPC did have its own custom version of "indirect IO". So we could 
look to factor that out.

>   2) If you made a default set of logic_pio_host_ops that merely did
>      loads/stores and maybe added a couple fields in the struct
>      logic_pio_hwaddr, I bet you could unify the two kinds so
>      logic_inb() would look something like this:
>

Yeah, I did consider this. We do not provide host operators for PCI MMIO 
ranges. We could simply provide regular versions of inb et al for this. 
A small obstacle for this is that we redefine inb et al, so would need 
"direct" versions also. It would be strange.

So I'm not sure on the value of this.

>        u8 logic_inb(unsigned long addr)
>        {
>          struct logic_pio_hwaddr *range = find_io_range(addr);
>
> 	 if (!range)
> 	   return (u8) ~0;
>
>          return (u8) range->ops->in(range->hostdata, addr, sz);
>        }
>
>> We have seen scenarios when systems which don't have a PCI host, or they
>> do but the PCI host probe fails, certain drivers attempts to still attempt
>> to access PCI IO ports; examples are in [1] and [2].
>>
>> Such is a case on an ARM64 system without a PCI host:
>>
>> root@(none)$ insmod hwmon/f71805f.ko
>>  Unable to handle kernel paging request at virtual address ffff7dfffee0002e
>>  Mem abort info:
>>    ESR = 0x96000046
>>    Exception class = DABT (current EL), IL = 32 bits
>>    SET = 0, FnV = 0
>>    EA = 0, S1PTW = 0
>>  Data abort info:
>>    ISV = 0, ISS = 0x00000046
>>    CM = 0, WnR = 1
>>  swapper pgtable: 4k pages, 48-bit VAs, pgdp = (____ptrval____)
>>  [ffff7dfffee0002e] pgd=000000000141c003, pud=000000000141d003, pmd=0000000000000000
>>  Internal error: Oops: 96000046 [#1] PREEMPT SMP
>>  Modules linked in: f71805f(+)
>>  CPU: 20 PID: 2736 Comm: insmod Not tainted 5.1.0-rc1-00003-g6f1bfec2a620-dirty #99
>>  Hardware name: Huawei Taishan 2280 /D05, BIOS Hisilicon D05 IT21 Nemo 2.0 RC0 04/18/2018
>>  pstate: 80000005 (Nzcv daif -PAN -UAO)
>>  pc : logic_outb+0x54/0xb8
>>  lr : f71805f_find+0x2c/0x1b8 [f71805f]
>>  sp : ffff000025fbba90
>>  x29: ffff000025fbba90 x28: ffff000008b944d0
>>  x27: ffff000025fbbdf0 x26: 0000000000000100
>>  x25: ffff801f8c270580 x24: ffff000011420000
>>  x23: ffff000025fbbb3e x22: ffff000025fbbb40
>>  x21: ffff000008b991b8 x20: 0000000000000087
>>  x19: 000000000000002e x18: ffffffffffffffff
>>  x17: 0000000000000000 x16: 0000000000000000
>>  x15: ffff00001127d6c8 x14: 0000000000000000
>>  x13: 0000000000000000 x12: 0000000000000000
>>  x11: 0000000000010820 x10: 0000841fdac40000
>>  x9 : 0000000000000001 x8 : 0000000040000000
>>  x7 : 0000000000210d00 x6 : 0000000000000000
>>  x5 : ffff801fb6a46040 x4 : ffff841febeaeda0
>>  x3 : 0000000000ffbffe x2 : ffff000025fbbb40
>>  x1 : ffff7dfffee0002e x0 : ffff7dfffee00000
>>  Process insmod (pid: 2736, stack limit = 0x(____ptrval____))
>>  Call trace:
>>   logic_outb+0x54/0xb8
>>   f71805f_find+0x2c/0x1b8 [f71805f]
>>   f71805f_init+0x38/0xe48 [f71805f]
>>   do_one_initcall+0x5c/0x198
>>   do_init_module+0x54/0x1b0
>>   load_module+0x1dc4/0x2158
>>   __se_sys_init_module+0x14c/0x1e8
>>   __arm64_sys_init_module+0x18/0x20
>>   el0_svc_common+0x5c/0x100
>>   el0_svc_handler+0x2c/0x80
>>   el0_svc+0x8/0xc
>>  Code: d2bfdc00 f2cfbfe0 f2ffffe0 8b000021 (39000034)
>>  ---[ end trace 10ea80bde051bbfc ]---
>> root@(none)$
>>
>> Well-behaved drivers call request_{muxed_}region() to grab the IO port
>> region, but success here still doesn't actually mean that there is some IO
>> port mapped in this region.
>>
>> This patch adds a check to ensure that the CPU MMIO region is registered
>> prior to accessing the PCI IO ports.
>>
>> Any failed checks silently return.
>
> I *think* what you're doing here is making inb/outb/etc work the same
> as on x86, i.e., if no device responds to an inb(), the caller gets
> ~0, and if no device claims an outb() the data gets dropped.
>

Correct, but with a caveat: when you say no device responds, this means 
that - for arm64 case - no PCI MMIO region is mapped.

> That should be explicit in the commit log.
>
>> [1] https://lore.kernel.org/linux-pci/56F209A9.4040304@huawei.com
>> [2] https://lore.kernel.org/linux-arm-kernel/e6995b4a-184a-d8d4-f4d4-9ce75d8f47c0@huawei.com/
>>

Thanks
Bjorn Helgaas June 13, 2019, 1:46 p.m. UTC | #4
On Thu, Jun 13, 2019 at 11:17:37AM +0100, John Garry wrote:
> On 13/06/2019 04:20, Bjorn Helgaas wrote:
> > On Tue, Jun 11, 2019 at 10:12:53PM +0800, John Garry wrote:
> > > Currently when accessing logical indirect PIO addresses in
> > > logic_{in, out}{,s}, we first ensure that the region is registered.
> 
> > I think logic_pio is specifically concerned with I/O port space, so
> > it's a little bit unfortunate that we named this "PIO".
> > 
> > PIO is a general term for "Programmed I/O", which just means the CPU
> > is involved in each transfer, as opposed to DMA.  The transfers can be
> > to either MMIO or I/O port space.
> > 
> > So this ends up being a little confusing because I think you mean
> > "Port I/O", not "Programmed I/O".
> 
> Personally I agree that the naming isn't great. But then Arnd does think
> that "PIO" is appropriate.
> 
> There were many different names along the way to this support merged, and I
> think that the naming became almost irrelevant in the end.

Yep, Arnd is right.  The "PIO" name contributed a little to my
confusion, but I think the bigger piece was that I read the "indirect
PIO addresses" above as being parallel to the "CPU MMIO regions"
below, when in fact, they are not.  The arguments to logic_inb() are
always port addresses, never CPU MMIO addresses, but in some cases
logic_inb() internally references a CPU MMIO region that corresponds
to the port address.

Possible commit log text:

  The logic_{in,out}*() functions access two regions of I/O port
  addresses:

    1) [0, MMIO_UPPER_LIMIT): these are assumed to be
       LOGIC_PIO_CPU_MMIO regions, where a bridge converts CPU loads
       and stores to MMIO space on its primary side into I/O port
       transactions on its secondary side.

    2) [MMIO_UPPER_LIMIT, IO_SPACE_LIMIT): these are assumed to be
       LOGIC_PIO_INDIRECT regions, where we verify that the region was
       registered by logic_pio_register_range() before calling the
       logic_pio_host_ops functions to perform the access.

  Previously there was no requirement that accesses to the
  LOGIC_PIO_CPU_MMIO area matched anything registered by
  logic_pio_register_range(), and accesses to unregistered I/O ports
  could cause exceptions like the one below.

  Verify that accesses to ports in the LOGIC_PIO_CPU_MMIO area
  correspond to registered ranges.  Accesses to ports outside those
  registered ranges fail (logic_in*() returns ~0 data and logic_out*()
  does nothing).

  This matches the x86 behavior where in*() returns ~0 if no device
  responds, and out*() is dropped if no device claims it.

> >   1) The simple "bridge converts CPU MMIO space to PCI I/O port space"
> >      flavor is essentially identical to what ia64 (and probably other
> >      architectures) does.  This should really be combined somehow.
> 
> Maybe. For ia64, it seems to have some "platform" versions of IO port
> accessors, and then also accessors need a fence barrier. I'm not sure how
> well that would fit with logical PIO. It would need further analysis.

Right.  That shouldn't be part of this series, but I think it would be
nice to someday unify the ia64 add_io_space() path with the
pci_register_io_range() path.  There might have to be ia64-specific
accessors at the bottom for the fences, but I think the top side could
be unified because it's conceptually the same thing -- an MMIO region
that is translated by a bridge to an I/O port region.

> >   2) If you made a default set of logic_pio_host_ops that merely did
> >      loads/stores and maybe added a couple fields in the struct
> >      logic_pio_hwaddr, I bet you could unify the two kinds so
> >      logic_inb() would look something like this:
> 
> Yeah, I did consider this. We do not provide host operators for PCI MMIO
> ranges. We could simply provide regular versions of inb et al for this. A
> small obstacle for this is that we redefine inb et al, so would need
> "direct" versions also. It would be strange.

Yeah, just a thought, maybe it wouldn't work out.

> > > Any failed checks silently return.
> > 
> > I *think* what you're doing here is making inb/outb/etc work the same
> > as on x86, i.e., if no device responds to an inb(), the caller gets
> > ~0, and if no device claims an outb() the data gets dropped.
> 
> Correct, but with a caveat: when you say no device responds, this means that
> - for arm64 case - no PCI MMIO region is mapped.

Yep.  I was describing the x86 behavior, where we don't do any mapping
and all we can say is that no device responded.

Bjorn
John Garry June 13, 2019, 2:09 p.m. UTC | #5
Hi Bjorn,

>> There were many different names along the way to this support merged, and I
>> think that the naming became almost irrelevant in the end.
>
> Yep, Arnd is right.  The "PIO" name contributed a little to my
> confusion, but I think the bigger piece was that I read the "indirect
> PIO addresses" above as being parallel to the "CPU MMIO regions"
> below, when in fact, they are not.  The arguments to logic_inb() are
> always port addresses, never CPU MMIO addresses, but in some cases
> logic_inb() internally references a CPU MMIO region that corresponds
> to the port address.

Right

>
> Possible commit log text:
>
>   The logic_{in,out}*() functions access two regions of I/O port
>   addresses:
>
>     1) [0, MMIO_UPPER_LIMIT): these are assumed to be
>        LOGIC_PIO_CPU_MMIO regions, where a bridge converts CPU loads
>        and stores to MMIO space on its primary side into I/O port
>        transactions on its secondary side.
>
>     2) [MMIO_UPPER_LIMIT, IO_SPACE_LIMIT): these are assumed to be
>        LOGIC_PIO_INDIRECT regions, where we verify that the region was
>        registered by logic_pio_register_range() before calling the
>        logic_pio_host_ops functions to perform the access.
>
>   Previously there was no requirement that accesses to the
>   LOGIC_PIO_CPU_MMIO area matched anything registered by
>   logic_pio_register_range(), and accesses to unregistered I/O ports
>   could cause exceptions like the one below.
>
>   Verify that accesses to ports in the LOGIC_PIO_CPU_MMIO area
>   correspond to registered ranges.  Accesses to ports outside those
>   registered ranges fail (logic_in*() returns ~0 data and logic_out*()
>   does nothing).
>
>   This matches the x86 behavior where in*() returns ~0 if no device
>   responds, and out*() is dropped if no device claims it.

It reads quite well so I can incorporate it. I'd still like to mention 
about request_{muxed_}region(), and how this does not protect against 
accesses to unregistered regions.

>
>>>   1) The simple "bridge converts CPU MMIO space to PCI I/O port space"
>>>      flavor is essentially identical to what ia64 (and probably other
>>>      architectures) does.  This should really be combined somehow.
>>
>> Maybe. For ia64, it seems to have some "platform" versions of IO port
>> accessors, and then also accessors need a fence barrier. I'm not sure how
>> well that would fit with logical PIO. It would need further analysis.
>
> Right.  That shouldn't be part of this series, but I think it would be
> nice to someday unify the ia64 add_io_space() path with the
> pci_register_io_range() path.  There might have to be ia64-specific
> accessors at the bottom for the fences, but I think the top side could
> be unified because it's conceptually the same thing -- an MMIO region
> that is translated by a bridge to an I/O port region.

Yes, it would be good to move any arch-specific port IO function to this 
common framework. To mention it again, what's under 
CONFIG_PPC_INDIRECT_PIO seems an obvious candidate.

>
>>>   2) If you made a default set of logic_pio_host_ops that merely did
>>>      loads/stores and maybe added a couple fields in the struct
>>>      logic_pio_hwaddr, I bet you could unify the two kinds so
>>>      logic_inb() would look something like this:
>>
>> Yeah, I did consider this. We do not provide host operators for PCI MMIO
>> ranges. We could simply provide regular versions of inb et al for this. A
>> small obstacle for this is that we redefine inb et al, so would need
>> "direct" versions also. It would be strange.
>
> Yeah, just a thought, maybe it wouldn't work out.
>
>>>> Any failed checks silently return.
>>>
>>> I *think* what you're doing here is making inb/outb/etc work the same
>>> as on x86, i.e., if no device responds to an inb(), the caller gets
>>> ~0, and if no device claims an outb() the data gets dropped.
>>
>> Correct, but with a caveat: when you say no device responds, this means that
>> - for arm64 case - no PCI MMIO region is mapped.
>
> Yep.  I was describing the x86 behavior, where we don't do any mapping
> and all we can say is that no device responded.
>
> Bjorn
>

Thanks,
John

> .
>
diff mbox series

Patch

diff --git a/lib/logic_pio.c b/lib/logic_pio.c
index 40d9428010e1..47d24f428908 100644
--- a/lib/logic_pio.c
+++ b/lib/logic_pio.c
@@ -126,7 +126,7 @@  static struct logic_pio_hwaddr *find_io_range(unsigned long pio)
 		if (in_range(pio, range->io_start, range->size))
 			return range;
 	}
-	pr_err("PIO entry token %lx invalid\n", pio);
+
 	return NULL;
 }
 
@@ -197,11 +197,12 @@  unsigned long logic_pio_trans_cpuaddr(resource_size_t addr)
 type logic_in##bw(unsigned long addr)					\
 {									\
 	type ret = (type)~0;						\
+	struct logic_pio_hwaddr *range = find_io_range(addr);		\
 									\
 	if (addr < MMIO_UPPER_LIMIT) {					\
-		ret = read##bw(PCI_IOBASE + addr);			\
+		if (range)						\
+			ret = read##bw(PCI_IOBASE + addr);		\
 	} else if (addr >= MMIO_UPPER_LIMIT && addr < IO_SPACE_LIMIT) { \
-		struct logic_pio_hwaddr *range = find_io_range(addr);	\
 		size_t sz = sizeof(type);				\
 									\
 		if (range && range->ops)				\
@@ -214,10 +215,12 @@  type logic_in##bw(unsigned long addr)					\
 									\
 void logic_out##bw(type value, unsigned long addr)			\
 {									\
+	struct logic_pio_hwaddr *range = find_io_range(addr);		\
+									\
 	if (addr < MMIO_UPPER_LIMIT) {					\
-		write##bw(value, PCI_IOBASE + addr);			\
+		if (range)						\
+			write##bw(value, PCI_IOBASE + addr);		\
 	} else if (addr >= MMIO_UPPER_LIMIT && addr < IO_SPACE_LIMIT) {	\
-		struct logic_pio_hwaddr *range = find_io_range(addr);	\
 		size_t sz = sizeof(type);				\
 									\
 		if (range && range->ops)				\
@@ -230,10 +233,12 @@  void logic_out##bw(type value, unsigned long addr)			\
 									\
 void logic_ins##bw(unsigned long addr, void *buf, unsigned int cnt)	\
 {									\
+	struct logic_pio_hwaddr *range = find_io_range(addr);		\
+									\
 	if (addr < MMIO_UPPER_LIMIT) {					\
-		reads##bw(PCI_IOBASE + addr, buf, cnt);			\
+		if (range)						\
+			reads##bw(PCI_IOBASE + addr, buf, cnt);		\
 	} else if (addr >= MMIO_UPPER_LIMIT && addr < IO_SPACE_LIMIT) {	\
-		struct logic_pio_hwaddr *range = find_io_range(addr);	\
 		size_t sz = sizeof(type);				\
 									\
 		if (range && range->ops)				\
@@ -242,16 +247,17 @@  void logic_ins##bw(unsigned long addr, void *buf, unsigned int cnt)	\
 		else							\
 			WARN_ON_ONCE(1);				\
 	}								\
-									\
 }									\
 									\
 void logic_outs##bw(unsigned long addr, const void *buf,		\
 		    unsigned int cnt)					\
 {									\
+	struct logic_pio_hwaddr *range = find_io_range(addr);		\
+									\
 	if (addr < MMIO_UPPER_LIMIT) {					\
-		writes##bw(PCI_IOBASE + addr, buf, cnt);		\
+		if (range)						\
+			writes##bw(PCI_IOBASE + addr, buf, cnt);	\
 	} else if (addr >= MMIO_UPPER_LIMIT && addr < IO_SPACE_LIMIT) {	\
-		struct logic_pio_hwaddr *range = find_io_range(addr);	\
 		size_t sz = sizeof(type);				\
 									\
 		if (range && range->ops)				\
@@ -269,28 +275,44 @@  type logic_in##bw(unsigned long addr)					\
 {									\
 	type ret = (type)~0;						\
 									\
-	if (addr < MMIO_UPPER_LIMIT)					\
-		ret = read##bw(PCI_IOBASE + addr);			\
+	if (addr < MMIO_UPPER_LIMIT) {					\
+		struct logic_pio_hwaddr *range = find_io_range(addr);	\
+									\
+		if (range)						\
+			ret = read##bw(PCI_IOBASE + addr);		\
+	}								\
 	return ret;							\
 }									\
 									\
-void logic_out##bw(type value, unsigned long addr)			\
+void logic_out##bw(type val, unsigned long addr)			\
 {									\
-	if (addr < MMIO_UPPER_LIMIT)					\
-		write##bw(value, PCI_IOBASE + addr);			\
+	if (addr < MMIO_UPPER_LIMIT) {					\
+		struct logic_pio_hwaddr *range = find_io_range(addr);	\
+									\
+		if (range)						\
+			write##bw(val, PCI_IOBASE + addr);		\
+	}								\
 }									\
 									\
 void logic_ins##bw(unsigned long addr, void *buf, unsigned int cnt)	\
 {									\
-	if (addr < MMIO_UPPER_LIMIT)					\
-		reads##bw(PCI_IOBASE + addr, buf, cnt);			\
+	if (addr < MMIO_UPPER_LIMIT) {					\
+		struct logic_pio_hwaddr *range = find_io_range(addr);	\
+									\
+		if (range)						\
+			reads##bw(PCI_IOBASE + addr, buf, cnt);		\
+	}								\
 }									\
 									\
 void logic_outs##bw(unsigned long addr, const void *buf,		\
 		    unsigned int cnt)					\
 {									\
-	if (addr < MMIO_UPPER_LIMIT)					\
-		writes##bw(PCI_IOBASE + addr, buf, cnt);		\
+	if (addr < MMIO_UPPER_LIMIT) {					\
+		struct logic_pio_hwaddr *range = find_io_range(addr);	\
+									\
+		if (range)						\
+			writes##bw(PCI_IOBASE + addr, buf, cnt);	\
+	}								\
 }
 #endif /* CONFIG_INDIRECT_PIO */