diff mbox series

mtd: ubi: avoid expensive do_div() on 32-bit machines

Message ID 20240313084707.3292300-1-arnd@kernel.org
State New
Headers show
Series mtd: ubi: avoid expensive do_div() on 32-bit machines | expand

Commit Message

Arnd Bergmann March 13, 2024, 8:46 a.m. UTC
From: Arnd Bergmann <arnd@arndb.de>

The use of do_div() in ubi_nvmem_reg_read() makes calling it on
32-bit machines rather expensive. Since the 'from' variable is
known to be a 32-bit quantity, it is clearly never needed and
can be optimized into a regular division operation.

Fixes: b8a77b9a5f9c ("mtd: ubi: fix NVMEM over UBI volumes on 32-bit systems")
Fixes: 3ce485803da1 ("mtd: ubi: provide NVMEM layer over UBI volumes")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
---
 drivers/mtd/ubi/nvmem.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

Comments

Zhihao Cheng March 13, 2024, 11:29 a.m. UTC | #1
在 2024/3/13 16:46, Arnd Bergmann 写道:
> From: Arnd Bergmann <arnd@arndb.de>
> 
> The use of do_div() in ubi_nvmem_reg_read() makes calling it on
> 32-bit machines rather expensive. Since the 'from' variable is
> known to be a 32-bit quantity, it is clearly never needed and
> can be optimized into a regular division operation.
> 
Do you meet a performance problem on a 32-bit machine? There are too 
many places invoking do_div, why do you optimize this one?
Have you tested the influence on a x86_64 platform after this patch 
applied? Looks like that do_div is more efficient in x86.

> Fixes: b8a77b9a5f9c ("mtd: ubi: fix NVMEM over UBI volumes on 32-bit systems")
> Fixes: 3ce485803da1 ("mtd: ubi: provide NVMEM layer over UBI volumes")
> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
> ---
>   drivers/mtd/ubi/nvmem.c | 6 +++---
>   1 file changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/mtd/ubi/nvmem.c b/drivers/mtd/ubi/nvmem.c
> index 8aeb9c428e51..a94a1a9aaec1 100644
> --- a/drivers/mtd/ubi/nvmem.c
> +++ b/drivers/mtd/ubi/nvmem.c
> @@ -6,7 +6,6 @@
>   /* UBI NVMEM provider */
>   #include "ubi.h"
>   #include <linux/nvmem-provider.h>
> -#include <asm/div64.h>
>   
>   /* List of all NVMEM devices */
>   static LIST_HEAD(nvmem_devices);
> @@ -27,14 +26,15 @@ static int ubi_nvmem_reg_read(void *priv, unsigned int from,
>   	struct ubi_nvmem *unv = priv;
>   	struct ubi_volume_desc *desc;
>   	uint32_t offs;
> -	uint64_t lnum = from;
> +	uint32_t lnum;
>   	int err = 0;
>   
>   	desc = ubi_open_volume(unv->ubi_num, unv->vol_id, UBI_READONLY);
>   	if (IS_ERR(desc))
>   		return PTR_ERR(desc);
>   
> -	offs = do_div(lnum, unv->usable_leb_size);
> +	offs = from % unv->usable_leb_size;
> +	lnum = from / unv->usable_leb_size;
>   	while (bytes_left) {
>   		to_read = unv->usable_leb_size - offs;
>   
>
Arnd Bergmann March 13, 2024, 11:53 a.m. UTC | #2
On Wed, Mar 13, 2024, at 12:29, Zhihao Cheng wrote:
> 在 2024/3/13 16:46, Arnd Bergmann 写道:
>> From: Arnd Bergmann <arnd@arndb.de>
>> 
>> The use of do_div() in ubi_nvmem_reg_read() makes calling it on
>> 32-bit machines rather expensive. Since the 'from' variable is
>> known to be a 32-bit quantity, it is clearly never needed and
>> can be optimized into a regular division operation.
>> 
> Do you meet a performance problem on a 32-bit machine? There are too 
> many places invoking do_div, why do you optimize this one?
> Have you tested the influence on a x86_64 platform after this patch 
> applied? Looks like that do_div is more efficient in x86.

This one was just introduced. The call site looks like a fast
path and it caused a build regression that Daniel addressed with
an suboptimal commit b8a77b9a5f9c ("mtd: ubi: fix NVMEM over
UBI volumes on 32-bit systems").

The way it usually goes is that someone adds an open-coded
64-bit division that causes a link failure, which prompts
the original developer to either rewrite the code to avoid
the long division if possible, or add do_div() after showing
that it is now performance critical, e.g. only called at
probe time.

     Arnd
Zhihao Cheng March 13, 2024, 12:10 p.m. UTC | #3
在 2024/3/13 19:53, Arnd Bergmann 写道:
> On Wed, Mar 13, 2024, at 12:29, Zhihao Cheng wrote:
>> 在 2024/3/13 16:46, Arnd Bergmann 写道:
>>> From: Arnd Bergmann <arnd@arndb.de>
>>>
>>> The use of do_div() in ubi_nvmem_reg_read() makes calling it on
>>> 32-bit machines rather expensive. Since the 'from' variable is
>>> known to be a 32-bit quantity, it is clearly never needed and
>>> can be optimized into a regular division operation.
>>>
>> Do you meet a performance problem on a 32-bit machine? There are too
>> many places invoking do_div, why do you optimize this one?
>> Have you tested the influence on a x86_64 platform after this patch
>> applied? Looks like that do_div is more efficient in x86.
> 
> This one was just introduced. The call site looks like a fast
> path and it caused a build regression that Daniel addressed with
> an suboptimal commit b8a77b9a5f9c ("mtd: ubi: fix NVMEM over
> UBI volumes on 32-bit systems").
> 
> The way it usually goes is that someone adds an open-coded
> 64-bit division that causes a link failure, which prompts
I'm a little confused, what kind of link failure? Could you show an example?
> the original developer to either rewrite the code to avoid
> the long division if possible, or add do_div() after showing
> that it is now performance critical, e.g. only called at
> probe time.
> 
>       Arnd
> .
>
Arnd Bergmann March 13, 2024, 12:21 p.m. UTC | #4
On Wed, Mar 13, 2024, at 13:10, Zhihao Cheng wrote:
> 在 2024/3/13 19:53, Arnd Bergmann 写道:
>> On Wed, Mar 13, 2024, at 12:29, Zhihao Cheng wrote:
>> 
>> The way it usually goes is that someone adds an open-coded
>> 64-bit division that causes a link failure, which prompts
> I'm a little confused, what kind of link failure? Could you show an example?

The open-coded 64-bit division without using do_div() shows up as

x86_64-linux-ld: drivers/mtd/ubi/nvmem.o: in function `ubi_nvmem_reg_read':
nvmem.c:(.text+0x10a): undefined reference to `__umoddi3'
x86_64-linux-ld: nvmem.c:(.text+0x11f): undefined reference to `__udivdi3'
x86_64-linux-ld: drivers/mtd/ubi/nvmem.o: in function `ubi_nvmem_reg_read.cold':
nvmem.c:(.text.unlikely+0x2d): undefined reference to `__umoddi3'

The idea is that gcc expects __umoddi3 to be provided by libgcc,
but Linux intentionally leaves it out in order to catch accidental
64-bit divisions.

     Arnd
Zhihao Cheng March 13, 2024, 1:29 p.m. UTC | #5
在 2024/3/13 20:21, Arnd Bergmann 写道:
> On Wed, Mar 13, 2024, at 13:10, Zhihao Cheng wrote:
>> 在 2024/3/13 19:53, Arnd Bergmann 写道:
>>> On Wed, Mar 13, 2024, at 12:29, Zhihao Cheng wrote:
>>>
>>> The way it usually goes is that someone adds an open-coded
>>> 64-bit division that causes a link failure, which prompts
>> I'm a little confused, what kind of link failure? Could you show an example?
> 
> The open-coded 64-bit division without using do_div() shows up as
> 
> x86_64-linux-ld: drivers/mtd/ubi/nvmem.o: in function `ubi_nvmem_reg_read':
> nvmem.c:(.text+0x10a): undefined reference to `__umoddi3'
> x86_64-linux-ld: nvmem.c:(.text+0x11f): undefined reference to `__udivdi3'
> x86_64-linux-ld: drivers/mtd/ubi/nvmem.o: in function `ubi_nvmem_reg_read.cold':
> nvmem.c:(.text.unlikely+0x2d): undefined reference to `__umoddi3'
>  > The idea is that gcc expects __umoddi3 to be provided by libgcc,
> but Linux intentionally leaves it out in order to catch accidental
> 64-bit divisions.
> 

Thanks for explaination, which means that do_div is used for 64-bit 
division to solve the link failure caused by missed libgcc. Since 
parameter 'from' is u32, there is no need to invoke do_div on a 32-bit 
platform, you just want to stop the wasting behavior on a 32-bit 
platform. Do I understand right?
Arnd Bergmann March 13, 2024, 1:39 p.m. UTC | #6
On Wed, Mar 13, 2024, at 14:29, Zhihao Cheng wrote:
> 在 2024/3/13 20:21, Arnd Bergmann 写道:
>> On Wed, Mar 13, 2024, at 13:10, Zhihao Cheng wrote:
>>> 在 2024/3/13 19:53, Arnd Bergmann 写道:
>>>> On Wed, Mar 13, 2024, at 12:29, Zhihao Cheng wrote:
>>>>
>>>> The way it usually goes is that someone adds an open-coded
>>>> 64-bit division that causes a link failure, which prompts
>>> I'm a little confused, what kind of link failure? Could you show an example?
>> 
>> The open-coded 64-bit division without using do_div() shows up as
>> 
>> x86_64-linux-ld: drivers/mtd/ubi/nvmem.o: in function `ubi_nvmem_reg_read':
>> nvmem.c:(.text+0x10a): undefined reference to `__umoddi3'
>> x86_64-linux-ld: nvmem.c:(.text+0x11f): undefined reference to `__udivdi3'
>> x86_64-linux-ld: drivers/mtd/ubi/nvmem.o: in function `ubi_nvmem_reg_read.cold':
>> nvmem.c:(.text.unlikely+0x2d): undefined reference to `__umoddi3'
>>  > The idea is that gcc expects __umoddi3 to be provided by libgcc,
>> but Linux intentionally leaves it out in order to catch accidental
>> 64-bit divisions.
>> 
>
> Thanks for explaination, which means that do_div is used for 64-bit 
> division to solve the link failure caused by missed libgcc. Since 
> parameter 'from' is u32, there is no need to invoke do_div on a 32-bit 
> platform, you just want to stop the wasting behavior on a 32-bit 
> platform. Do I understand right?

Yes, correct.

      Arnd
Zhihao Cheng March 13, 2024, 1:41 p.m. UTC | #7
在 2024/3/13 16:46, Arnd Bergmann 写道:
> From: Arnd Bergmann <arnd@arndb.de>
> 
> The use of do_div() in ubi_nvmem_reg_read() makes calling it on
> 32-bit machines rather expensive. Since the 'from' variable is
> known to be a 32-bit quantity, it is clearly never needed and
> can be optimized into a regular division operation.
> 
> Fixes: b8a77b9a5f9c ("mtd: ubi: fix NVMEM over UBI volumes on 32-bit systems")
> Fixes: 3ce485803da1 ("mtd: ubi: provide NVMEM layer over UBI volumes")
> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
> ---
>   drivers/mtd/ubi/nvmem.c | 6 +++---
>   1 file changed, 3 insertions(+), 3 deletions(-)

Reviewed-by: Zhihao Cheng <chengzhihao1@huawei.com>
> 
> diff --git a/drivers/mtd/ubi/nvmem.c b/drivers/mtd/ubi/nvmem.c
> index 8aeb9c428e51..a94a1a9aaec1 100644
> --- a/drivers/mtd/ubi/nvmem.c
> +++ b/drivers/mtd/ubi/nvmem.c
> @@ -6,7 +6,6 @@
>   /* UBI NVMEM provider */
>   #include "ubi.h"
>   #include <linux/nvmem-provider.h>
> -#include <asm/div64.h>
>   
>   /* List of all NVMEM devices */
>   static LIST_HEAD(nvmem_devices);
> @@ -27,14 +26,15 @@ static int ubi_nvmem_reg_read(void *priv, unsigned int from,
>   	struct ubi_nvmem *unv = priv;
>   	struct ubi_volume_desc *desc;
>   	uint32_t offs;
> -	uint64_t lnum = from;
> +	uint32_t lnum;
>   	int err = 0;
>   
>   	desc = ubi_open_volume(unv->ubi_num, unv->vol_id, UBI_READONLY);
>   	if (IS_ERR(desc))
>   		return PTR_ERR(desc);
>   
> -	offs = do_div(lnum, unv->usable_leb_size);
> +	offs = from % unv->usable_leb_size;
> +	lnum = from / unv->usable_leb_size;
>   	while (bytes_left) {
>   		to_read = unv->usable_leb_size - offs;
>   
>
Zhihao Cheng March 13, 2024, 1:43 p.m. UTC | #8
在 2024/3/13 21:39, Arnd Bergmann 写道:

>> Thanks for explaination, which means that do_div is used for 64-bit
>> division to solve the link failure caused by missed libgcc. Since
>> parameter 'from' is u32, there is no need to invoke do_div on a 32-bit
>> platform, you just want to stop the wasting behavior on a 32-bit
>> platform. Do I understand right?
> 
> Yes, correct.
> 
>        Arnd
> .
> 
How do you find it? I mean there are so many types and many do_div 
callers, do you have a static check tool?
Arnd Bergmann March 13, 2024, 1:51 p.m. UTC | #9
On Wed, Mar 13, 2024, at 14:43, Zhihao Cheng wrote:
> 在 2024/3/13 21:39, Arnd Bergmann 写道:
>
>>> Thanks for explaination, which means that do_div is used for 64-bit
>>> division to solve the link failure caused by missed libgcc. Since
>>> parameter 'from' is u32, there is no need to invoke do_div on a 32-bit
>>> platform, you just want to stop the wasting behavior on a 32-bit
>>> platform. Do I understand right?
>> 
> How do you find it? I mean there are so many types and many do_div 
> callers, do you have a static check tool?

I had a local fix with the same effect that I applied when
the code originally broke, but had not gotten around to
sending my fix.

After b8a77b9a5f9c ("mtd: ubi: fix NVMEM over UBI volumes on
32-bit systems") introduced a different workaround, I got the
link error because I had removed the do_div().

     Arnd
diff mbox series

Patch

diff --git a/drivers/mtd/ubi/nvmem.c b/drivers/mtd/ubi/nvmem.c
index 8aeb9c428e51..a94a1a9aaec1 100644
--- a/drivers/mtd/ubi/nvmem.c
+++ b/drivers/mtd/ubi/nvmem.c
@@ -6,7 +6,6 @@ 
 /* UBI NVMEM provider */
 #include "ubi.h"
 #include <linux/nvmem-provider.h>
-#include <asm/div64.h>
 
 /* List of all NVMEM devices */
 static LIST_HEAD(nvmem_devices);
@@ -27,14 +26,15 @@  static int ubi_nvmem_reg_read(void *priv, unsigned int from,
 	struct ubi_nvmem *unv = priv;
 	struct ubi_volume_desc *desc;
 	uint32_t offs;
-	uint64_t lnum = from;
+	uint32_t lnum;
 	int err = 0;
 
 	desc = ubi_open_volume(unv->ubi_num, unv->vol_id, UBI_READONLY);
 	if (IS_ERR(desc))
 		return PTR_ERR(desc);
 
-	offs = do_div(lnum, unv->usable_leb_size);
+	offs = from % unv->usable_leb_size;
+	lnum = from / unv->usable_leb_size;
 	while (bytes_left) {
 		to_read = unv->usable_leb_size - offs;