Message ID | 20240313084707.3292300-1-arnd@kernel.org |
---|---|
State | New |
Headers | show |
Series | mtd: ubi: avoid expensive do_div() on 32-bit machines | expand |
在 2024/3/13 16:46, Arnd Bergmann 写道: > From: Arnd Bergmann <arnd@arndb.de> > > The use of do_div() in ubi_nvmem_reg_read() makes calling it on > 32-bit machines rather expensive. Since the 'from' variable is > known to be a 32-bit quantity, it is clearly never needed and > can be optimized into a regular division operation. > Do you meet a performance problem on a 32-bit machine? There are too many places invoking do_div, why do you optimize this one? Have you tested the influence on a x86_64 platform after this patch applied? Looks like that do_div is more efficient in x86. > Fixes: b8a77b9a5f9c ("mtd: ubi: fix NVMEM over UBI volumes on 32-bit systems") > Fixes: 3ce485803da1 ("mtd: ubi: provide NVMEM layer over UBI volumes") > Signed-off-by: Arnd Bergmann <arnd@arndb.de> > --- > drivers/mtd/ubi/nvmem.c | 6 +++--- > 1 file changed, 3 insertions(+), 3 deletions(-) > > diff --git a/drivers/mtd/ubi/nvmem.c b/drivers/mtd/ubi/nvmem.c > index 8aeb9c428e51..a94a1a9aaec1 100644 > --- a/drivers/mtd/ubi/nvmem.c > +++ b/drivers/mtd/ubi/nvmem.c > @@ -6,7 +6,6 @@ > /* UBI NVMEM provider */ > #include "ubi.h" > #include <linux/nvmem-provider.h> > -#include <asm/div64.h> > > /* List of all NVMEM devices */ > static LIST_HEAD(nvmem_devices); > @@ -27,14 +26,15 @@ static int ubi_nvmem_reg_read(void *priv, unsigned int from, > struct ubi_nvmem *unv = priv; > struct ubi_volume_desc *desc; > uint32_t offs; > - uint64_t lnum = from; > + uint32_t lnum; > int err = 0; > > desc = ubi_open_volume(unv->ubi_num, unv->vol_id, UBI_READONLY); > if (IS_ERR(desc)) > return PTR_ERR(desc); > > - offs = do_div(lnum, unv->usable_leb_size); > + offs = from % unv->usable_leb_size; > + lnum = from / unv->usable_leb_size; > while (bytes_left) { > to_read = unv->usable_leb_size - offs; > >
On Wed, Mar 13, 2024, at 12:29, Zhihao Cheng wrote: > 在 2024/3/13 16:46, Arnd Bergmann 写道: >> From: Arnd Bergmann <arnd@arndb.de> >> >> The use of do_div() in ubi_nvmem_reg_read() makes calling it on >> 32-bit machines rather expensive. Since the 'from' variable is >> known to be a 32-bit quantity, it is clearly never needed and >> can be optimized into a regular division operation. >> > Do you meet a performance problem on a 32-bit machine? There are too > many places invoking do_div, why do you optimize this one? > Have you tested the influence on a x86_64 platform after this patch > applied? Looks like that do_div is more efficient in x86. This one was just introduced. The call site looks like a fast path and it caused a build regression that Daniel addressed with an suboptimal commit b8a77b9a5f9c ("mtd: ubi: fix NVMEM over UBI volumes on 32-bit systems"). The way it usually goes is that someone adds an open-coded 64-bit division that causes a link failure, which prompts the original developer to either rewrite the code to avoid the long division if possible, or add do_div() after showing that it is now performance critical, e.g. only called at probe time. Arnd
在 2024/3/13 19:53, Arnd Bergmann 写道: > On Wed, Mar 13, 2024, at 12:29, Zhihao Cheng wrote: >> 在 2024/3/13 16:46, Arnd Bergmann 写道: >>> From: Arnd Bergmann <arnd@arndb.de> >>> >>> The use of do_div() in ubi_nvmem_reg_read() makes calling it on >>> 32-bit machines rather expensive. Since the 'from' variable is >>> known to be a 32-bit quantity, it is clearly never needed and >>> can be optimized into a regular division operation. >>> >> Do you meet a performance problem on a 32-bit machine? There are too >> many places invoking do_div, why do you optimize this one? >> Have you tested the influence on a x86_64 platform after this patch >> applied? Looks like that do_div is more efficient in x86. > > This one was just introduced. The call site looks like a fast > path and it caused a build regression that Daniel addressed with > an suboptimal commit b8a77b9a5f9c ("mtd: ubi: fix NVMEM over > UBI volumes on 32-bit systems"). > > The way it usually goes is that someone adds an open-coded > 64-bit division that causes a link failure, which prompts I'm a little confused, what kind of link failure? Could you show an example? > the original developer to either rewrite the code to avoid > the long division if possible, or add do_div() after showing > that it is now performance critical, e.g. only called at > probe time. > > Arnd > . >
On Wed, Mar 13, 2024, at 13:10, Zhihao Cheng wrote: > 在 2024/3/13 19:53, Arnd Bergmann 写道: >> On Wed, Mar 13, 2024, at 12:29, Zhihao Cheng wrote: >> >> The way it usually goes is that someone adds an open-coded >> 64-bit division that causes a link failure, which prompts > I'm a little confused, what kind of link failure? Could you show an example? The open-coded 64-bit division without using do_div() shows up as x86_64-linux-ld: drivers/mtd/ubi/nvmem.o: in function `ubi_nvmem_reg_read': nvmem.c:(.text+0x10a): undefined reference to `__umoddi3' x86_64-linux-ld: nvmem.c:(.text+0x11f): undefined reference to `__udivdi3' x86_64-linux-ld: drivers/mtd/ubi/nvmem.o: in function `ubi_nvmem_reg_read.cold': nvmem.c:(.text.unlikely+0x2d): undefined reference to `__umoddi3' The idea is that gcc expects __umoddi3 to be provided by libgcc, but Linux intentionally leaves it out in order to catch accidental 64-bit divisions. Arnd
在 2024/3/13 20:21, Arnd Bergmann 写道: > On Wed, Mar 13, 2024, at 13:10, Zhihao Cheng wrote: >> 在 2024/3/13 19:53, Arnd Bergmann 写道: >>> On Wed, Mar 13, 2024, at 12:29, Zhihao Cheng wrote: >>> >>> The way it usually goes is that someone adds an open-coded >>> 64-bit division that causes a link failure, which prompts >> I'm a little confused, what kind of link failure? Could you show an example? > > The open-coded 64-bit division without using do_div() shows up as > > x86_64-linux-ld: drivers/mtd/ubi/nvmem.o: in function `ubi_nvmem_reg_read': > nvmem.c:(.text+0x10a): undefined reference to `__umoddi3' > x86_64-linux-ld: nvmem.c:(.text+0x11f): undefined reference to `__udivdi3' > x86_64-linux-ld: drivers/mtd/ubi/nvmem.o: in function `ubi_nvmem_reg_read.cold': > nvmem.c:(.text.unlikely+0x2d): undefined reference to `__umoddi3' > > The idea is that gcc expects __umoddi3 to be provided by libgcc, > but Linux intentionally leaves it out in order to catch accidental > 64-bit divisions. > Thanks for explaination, which means that do_div is used for 64-bit division to solve the link failure caused by missed libgcc. Since parameter 'from' is u32, there is no need to invoke do_div on a 32-bit platform, you just want to stop the wasting behavior on a 32-bit platform. Do I understand right?
On Wed, Mar 13, 2024, at 14:29, Zhihao Cheng wrote: > 在 2024/3/13 20:21, Arnd Bergmann 写道: >> On Wed, Mar 13, 2024, at 13:10, Zhihao Cheng wrote: >>> 在 2024/3/13 19:53, Arnd Bergmann 写道: >>>> On Wed, Mar 13, 2024, at 12:29, Zhihao Cheng wrote: >>>> >>>> The way it usually goes is that someone adds an open-coded >>>> 64-bit division that causes a link failure, which prompts >>> I'm a little confused, what kind of link failure? Could you show an example? >> >> The open-coded 64-bit division without using do_div() shows up as >> >> x86_64-linux-ld: drivers/mtd/ubi/nvmem.o: in function `ubi_nvmem_reg_read': >> nvmem.c:(.text+0x10a): undefined reference to `__umoddi3' >> x86_64-linux-ld: nvmem.c:(.text+0x11f): undefined reference to `__udivdi3' >> x86_64-linux-ld: drivers/mtd/ubi/nvmem.o: in function `ubi_nvmem_reg_read.cold': >> nvmem.c:(.text.unlikely+0x2d): undefined reference to `__umoddi3' >> > The idea is that gcc expects __umoddi3 to be provided by libgcc, >> but Linux intentionally leaves it out in order to catch accidental >> 64-bit divisions. >> > > Thanks for explaination, which means that do_div is used for 64-bit > division to solve the link failure caused by missed libgcc. Since > parameter 'from' is u32, there is no need to invoke do_div on a 32-bit > platform, you just want to stop the wasting behavior on a 32-bit > platform. Do I understand right? Yes, correct. Arnd
在 2024/3/13 16:46, Arnd Bergmann 写道: > From: Arnd Bergmann <arnd@arndb.de> > > The use of do_div() in ubi_nvmem_reg_read() makes calling it on > 32-bit machines rather expensive. Since the 'from' variable is > known to be a 32-bit quantity, it is clearly never needed and > can be optimized into a regular division operation. > > Fixes: b8a77b9a5f9c ("mtd: ubi: fix NVMEM over UBI volumes on 32-bit systems") > Fixes: 3ce485803da1 ("mtd: ubi: provide NVMEM layer over UBI volumes") > Signed-off-by: Arnd Bergmann <arnd@arndb.de> > --- > drivers/mtd/ubi/nvmem.c | 6 +++--- > 1 file changed, 3 insertions(+), 3 deletions(-) Reviewed-by: Zhihao Cheng <chengzhihao1@huawei.com> > > diff --git a/drivers/mtd/ubi/nvmem.c b/drivers/mtd/ubi/nvmem.c > index 8aeb9c428e51..a94a1a9aaec1 100644 > --- a/drivers/mtd/ubi/nvmem.c > +++ b/drivers/mtd/ubi/nvmem.c > @@ -6,7 +6,6 @@ > /* UBI NVMEM provider */ > #include "ubi.h" > #include <linux/nvmem-provider.h> > -#include <asm/div64.h> > > /* List of all NVMEM devices */ > static LIST_HEAD(nvmem_devices); > @@ -27,14 +26,15 @@ static int ubi_nvmem_reg_read(void *priv, unsigned int from, > struct ubi_nvmem *unv = priv; > struct ubi_volume_desc *desc; > uint32_t offs; > - uint64_t lnum = from; > + uint32_t lnum; > int err = 0; > > desc = ubi_open_volume(unv->ubi_num, unv->vol_id, UBI_READONLY); > if (IS_ERR(desc)) > return PTR_ERR(desc); > > - offs = do_div(lnum, unv->usable_leb_size); > + offs = from % unv->usable_leb_size; > + lnum = from / unv->usable_leb_size; > while (bytes_left) { > to_read = unv->usable_leb_size - offs; > >
在 2024/3/13 21:39, Arnd Bergmann 写道: >> Thanks for explaination, which means that do_div is used for 64-bit >> division to solve the link failure caused by missed libgcc. Since >> parameter 'from' is u32, there is no need to invoke do_div on a 32-bit >> platform, you just want to stop the wasting behavior on a 32-bit >> platform. Do I understand right? > > Yes, correct. > > Arnd > . > How do you find it? I mean there are so many types and many do_div callers, do you have a static check tool?
On Wed, Mar 13, 2024, at 14:43, Zhihao Cheng wrote: > 在 2024/3/13 21:39, Arnd Bergmann 写道: > >>> Thanks for explaination, which means that do_div is used for 64-bit >>> division to solve the link failure caused by missed libgcc. Since >>> parameter 'from' is u32, there is no need to invoke do_div on a 32-bit >>> platform, you just want to stop the wasting behavior on a 32-bit >>> platform. Do I understand right? >> > How do you find it? I mean there are so many types and many do_div > callers, do you have a static check tool? I had a local fix with the same effect that I applied when the code originally broke, but had not gotten around to sending my fix. After b8a77b9a5f9c ("mtd: ubi: fix NVMEM over UBI volumes on 32-bit systems") introduced a different workaround, I got the link error because I had removed the do_div(). Arnd
diff --git a/drivers/mtd/ubi/nvmem.c b/drivers/mtd/ubi/nvmem.c index 8aeb9c428e51..a94a1a9aaec1 100644 --- a/drivers/mtd/ubi/nvmem.c +++ b/drivers/mtd/ubi/nvmem.c @@ -6,7 +6,6 @@ /* UBI NVMEM provider */ #include "ubi.h" #include <linux/nvmem-provider.h> -#include <asm/div64.h> /* List of all NVMEM devices */ static LIST_HEAD(nvmem_devices); @@ -27,14 +26,15 @@ static int ubi_nvmem_reg_read(void *priv, unsigned int from, struct ubi_nvmem *unv = priv; struct ubi_volume_desc *desc; uint32_t offs; - uint64_t lnum = from; + uint32_t lnum; int err = 0; desc = ubi_open_volume(unv->ubi_num, unv->vol_id, UBI_READONLY); if (IS_ERR(desc)) return PTR_ERR(desc); - offs = do_div(lnum, unv->usable_leb_size); + offs = from % unv->usable_leb_size; + lnum = from / unv->usable_leb_size; while (bytes_left) { to_read = unv->usable_leb_size - offs;