diff mbox series

package/nodejs/nodejs-src: remove .node files with different architecture

Message ID 20231031120713.678783-1-giulio.benetti@benettiengineering.com
State Changes Requested
Headers show
Series package/nodejs/nodejs-src: remove .node files with different architecture | expand

Commit Message

Giulio Benetti Oct. 31, 2023, 12:07 p.m. UTC
Actually nodejs-src fails to build when additional modules with prebuilt
.node files are added. This is due to Buildroot's check-bin-arch that
checks for executable files and libraries that have different architecture
from the one we're building for. So let's go through all .node files in
$(TARGET_DIR)/usr/lib/node_modules and check if the architecture is
different or not found and in case delete the .node file.

Fixes:
https://bugs.busybox.net/show_bug.cgi?id=15823

Signed-off-by: Giulio Benetti <giulio.benetti@benettiengineering.com>
---
 package/nodejs/nodejs-src/nodejs-src.mk | 12 ++++++++++++
 1 file changed, 12 insertions(+)

Comments

Yann E. MORIN Nov. 1, 2023, 3:21 p.m. UTC | #1
Giulio, All,

On 2023-10-31 13:07 +0100, Giulio Benetti spake thusly:
> Actually nodejs-src fails to build when additional modules with prebuilt
> .node files are added. This is due to Buildroot's check-bin-arch that
> checks for executable files and libraries that have different architecture
> from the one we're building for. So let's go through all .node files in
> $(TARGET_DIR)/usr/lib/node_modules and check if the architecture is
> different or not found and in case delete the .node file.

Unfortunately, this is not going to work correctly.

For example, the module failing in #15823 is reported to be serialport,
but really the failing one is the scoped @serialport/bindings-cpp
module.

That module has "prebuilds" binaries for various systems, here's the
list:

    $ find prebuilds/ -type f
    prebuilds/android-arm/node.napi.armv7.node
    prebuilds/darwin-x64+arm64/node.napi.node
    prebuilds/android-arm64/node.napi.armv8.node
    prebuilds/win32-ia32/node.napi.node
    prebuilds/linux-arm/node.napi.armv7.node
    prebuilds/linux-arm/node.napi.armv6.node
    prebuilds/linux-x64/node.napi.musl.node
    prebuilds/linux-x64/node.napi.glibc.node
    prebuilds/linux-arm64/node.napi.armv8.node
    prebuilds/win32-x64/node.napi.node

and running readelf -h on each will report many modules actually matching
the current machine. For example, those two will both match AArch64,
while obviously we only want to keep the latter:

    prebuilds/android-arm64/node.napi.armv8.node
    prebuilds/linux-arm64/node.napi.armv8.node

Same goes for arm, and similar goes for x64 musl vs. glibc.

Those last two are also very interesting: it means that modules need to
be built properly against the correct C library, not just the target
CPU.

Which means that, even a module matching the current arch is not
necessarily applicable to said arch if it was built for another C
library...

Which leads to the question: what happens for such a module when there
is no prebuilt binary for the current arch? Is it going to be compiled
by npm when it installs the module? Thanks to Adam, I could inspect the
tarball of @serial/bindings-cpp@12.0.1, and I could spot .cpp and h
files in the src/ subidr:

    $ wget 'https://registry.npmjs.org/@serialport/bindings-cpp/-/bindings-cpp-12.0.1.tgz'
    $ tar tzf bindings-cpp-12.0.1.tgz |grep src/
    package/src/darwin_list.cpp
    package/src/poller.cpp
    package/src/serialport_linux.cpp
    package/src/serialport_unix.cpp
    package/src/serialport_win.cpp
    package/src/serialport.cpp
    package/src/darwin_list.h
    package/src/poller.h
    package/src/serialport_linux.h
    package/src/serialport_unix.h
    package/src/serialport_win.h
    package/src/serialport.h

There's also binding.gyp that looks like it lists what to build under
which condition.

In which case, how does npm decide that it can use one such prebuild, or
whether it should compile from the source files? Can we force it to
never use prebuilds and always compile?

I'm afraid the proposed patch just covers up a bigger issue by sweeping
the dust under the rug... It will indeed make check-bin-arch happy
because no binary will indeed match another arch, indeed, but that does
not mean the module will be actually correctly installed...

So yes, npm is such a mess that maybe sweeping the dust under the rug is
the only solution we can implement, but it is disappointing...

Maybe we can just also exclude the node modules from check-bin-arch
altogether and be done with it:

    NODEJS_SRC_BIN_ARCH_EXCLUDE = /usr/lib/node_modules/

Regards,
Yann E. MORIN.

> Fixes:
> https://bugs.busybox.net/show_bug.cgi?id=15823
> 
> Signed-off-by: Giulio Benetti <giulio.benetti@benettiengineering.com>
> ---
>  package/nodejs/nodejs-src/nodejs-src.mk | 12 ++++++++++++
>  1 file changed, 12 insertions(+)
> 
> diff --git a/package/nodejs/nodejs-src/nodejs-src.mk b/package/nodejs/nodejs-src/nodejs-src.mk
> index 3452c93728..bdbf0709c6 100644
> --- a/package/nodejs/nodejs-src/nodejs-src.mk
> +++ b/package/nodejs/nodejs-src/nodejs-src.mk
> @@ -247,6 +247,18 @@ define NODEJS_SRC_INSTALL_MODULES
>  	# npm install call below and setting npm_config_rollback=false can both
>  	# help in diagnosing the problem.
>  	$(NPM) install -g $(NODEJS_SRC_MODULES_LIST)
> +
> +	# Remove prebuilt files which are not compatible with the architecture
> +	# and OS(Linux) we're building for. NOTE: .node files that don't have a
> +	# readelf output have different ABI(i.e. Windows, Darwin etc.)
> +	for f in $$(find $(TARGET_DIR)/usr/lib/node_modules -type f -name "*.node"); do \
> +		echo $$f; \
> +		arch=`$(TARGET_READELF) -h "$$f" 2>&1 | \
> +			sed -r -e '/^  Machine: +(.+)/!d; s//\1/;' | head -1`; \
> +		if [ "$$arch" != "$(BR2_READELF_ARCH_NAME)" ]; then \
> +			rm -f $$f; \
> +		fi \
> +	done
>  endef
>  endif
>  
> -- 
> 2.34.1
> 
> _______________________________________________
> buildroot mailing list
> buildroot@buildroot.org
> https://lists.buildroot.org/mailman/listinfo/buildroot
Giulio Benetti Nov. 2, 2023, 11:16 p.m. UTC | #2
Hi Yann, All,

On 01/11/23 16:21, Yann E. MORIN wrote:
> Giulio, All,
> 
> On 2023-10-31 13:07 +0100, Giulio Benetti spake thusly:
>> Actually nodejs-src fails to build when additional modules with prebuilt
>> .node files are added. This is due to Buildroot's check-bin-arch that
>> checks for executable files and libraries that have different architecture
>> from the one we're building for. So let's go through all .node files in
>> $(TARGET_DIR)/usr/lib/node_modules and check if the architecture is
>> different or not found and in case delete the .node file.
> 
> Unfortunately, this is not going to work correctly.
> 
> For example, the module failing in #15823 is reported to be serialport,
> but really the failing one is the scoped @serialport/bindings-cpp
> module.
> 
> That module has "prebuilds" binaries for various systems, here's the
> list:
> 
>      $ find prebuilds/ -type f
>      prebuilds/android-arm/node.napi.armv7.node
>      prebuilds/darwin-x64+arm64/node.napi.node
>      prebuilds/android-arm64/node.napi.armv8.node
>      prebuilds/win32-ia32/node.napi.node
>      prebuilds/linux-arm/node.napi.armv7.node
>      prebuilds/linux-arm/node.napi.armv6.node
>      prebuilds/linux-x64/node.napi.musl.node
>      prebuilds/linux-x64/node.napi.glibc.node
>      prebuilds/linux-arm64/node.napi.armv8.node
>      prebuilds/win32-x64/node.napi.node
> 
> and running readelf -h on each will report many modules actually matching
> the current machine. For example, those two will both match AArch64,
> while obviously we only want to keep the latter:
> 
>      prebuilds/android-arm64/node.napi.armv8.node
>      prebuilds/linux-arm64/node.napi.armv8.node

Yes. I've found it was a better solution compared to what we have now,
but I'd need to check for OS too, but...

> Same goes for arm, and similar goes for x64 musl vs. glibc.
> 
> Those last two are also very interesting: it means that modules need to
> be built properly against the correct C library, not just the target
> CPU.

...you're totally right, I haven't considered libc :-/

> Which means that, even a module matching the current arch is not
> necessarily applicable to said arch if it was built for another C
> library...

yes,

> Which leads to the question: what happens for such a module when there
> is no prebuilt binary for the current arch? Is it going to be compiled
> by npm when it installs the module? Thanks to Adam, I could inspect the
> tarball of @serial/bindings-cpp@12.0.1, and I could spot .cpp and h
> files in the src/ subidr:
> 
>      $ wget 'https://registry.npmjs.org/@serialport/bindings-cpp/-/bindings-cpp-12.0.1.tgz'
>      $ tar tzf bindings-cpp-12.0.1.tgz |grep src/
>      package/src/darwin_list.cpp
>      package/src/poller.cpp
>      package/src/serialport_linux.cpp
>      package/src/serialport_unix.cpp
>      package/src/serialport_win.cpp
>      package/src/serialport.cpp
>      package/src/darwin_list.h
>      package/src/poller.h
>      package/src/serialport_linux.h
>      package/src/serialport_unix.h
>      package/src/serialport_win.h
>      package/src/serialport.h
> 
> There's also binding.gyp that looks like it lists what to build under
> which condition.

yes, .gyp file is another build system, same Mozilla adopted for 
NSS(maybe something else):
https://gyp.gsrc.io/index.md

But NodeJS seems to deal with it without the need to add a gyp infra.

> In which case, how does npm decide that it can use one such prebuild, or
> whether it should compile from the source files? Can we force it to
> never use prebuilds and always compile?

I'm working on this after your review,

> I'm afraid the proposed patch just covers up a bigger issue by sweeping
> the dust under the rug... It will indeed make check-bin-arch happy
> because no binary will indeed match another arch, indeed, but that does
> not mean the module will be actually correctly installed...

exactly,

> So yes, npm is such a mess that maybe sweeping the dust under the rug is
> the only solution we can implement, but it is disappointing...

I think it's not the way to go because of libc, that changes everything,

> Maybe we can just also exclude the node modules from check-bin-arch
> altogether and be done with it:
> 
>      NODEJS_SRC_BIN_ARCH_EXCLUDE = /usr/lib/node_modules/

This ^^^ doesn't fix the libc problem. If you nm:
target/usr/lib/node_modules/serialport/bindings-cpp/build/Release/bindings.node

you will see that NodeJS assume native code to be built with glibc, so
I really need to go with building from scratch. And this works using:
npm rebuild ...
but I'm dealing with installation that is not so easy.

Once I have it building natively with correct libc I will send a patch.

Thank you for the review!

Best regards
Giulio Benetti Nov. 2, 2023, 11:43 p.m. UTC | #3
On 03/11/23 00:16, Giulio Benetti wrote:

[ SNIP ]

>> There's also binding.gyp that looks like it lists what to build under
>> which condition.
> 
> yes, .gyp file is another build system, same Mozilla adopted for 
> NSS(maybe something else):
> https://gyp.gsrc.io/index.md
> 
> But NodeJS seems to deal with it without the need to add a gyp infra.

yes it does have node-gyp that deals with that fortunately,

>> In which case, how does npm decide that it can use one such prebuild, or
>> whether it should compile from the source files? Can we force it to
>> never use prebuilds and always compile?
> 
> I'm working on this after your review,
> 
>> I'm afraid the proposed patch just covers up a bigger issue by sweeping
>> the dust under the rug... It will indeed make check-bin-arch happy
>> because no binary will indeed match another arch, indeed, but that does
>> not mean the module will be actually correctly installed...
> 
> exactly,
> 
>> So yes, npm is such a mess that maybe sweeping the dust under the rug is
>> the only solution we can implement, but it is disappointing...
> 
> I think it's not the way to go because of libc, that changes everything,
> 
>> Maybe we can just also exclude the node modules from check-bin-arch
>> altogether and be done with it:
>>
>>      NODEJS_SRC_BIN_ARCH_EXCLUDE = /usr/lib/node_modules/
> 
> This ^^^ doesn't fix the libc problem. If you nm:
> target/usr/lib/node_modules/serialport/bindings-cpp/build/Release/bindings.node
> 
> you will see that NodeJS assume native code to be built with glibc, so
> I really need to go with building from scratch. And this works using:
> npm rebuild ...

Still not, I'm using:
npm pack ....
to download the tarball that only contains the module but not its
dependencies, so I need to extract and retrieve all the dependencies
including "@serialport/bindings-cpp" that must be natively built and
it contains the prebuilt binaries, so I will have to deal with
installation of the single os-arch directory.

Not an easy task, but I'll take care.

Best regards
Yann E. MORIN Nov. 3, 2023, 9:21 p.m. UTC | #4
Giulio, All,

On 2023-11-03 00:16 +0100, Giulio Benetti spake thusly:
> On 01/11/23 16:21, Yann E. MORIN wrote:
> >On 2023-10-31 13:07 +0100, Giulio Benetti spake thusly:
[--SNIP--]
> >There's also binding.gyp that looks like it lists what to build under
> >which condition.
> yes, .gyp file is another build system, same Mozilla adopted for NSS(maybe
> something else):
> https://gyp.gsrc.io/index.md
> But NodeJS seems to deal with it without the need to add a gyp infra.
> >In which case, how does npm decide that it can use one such prebuild, or
> >whether it should compile from the source files? Can we force it to
> >never use prebuilds and always compile?
> I'm working on this after your review,

Great!

If we can make without the prebuild stuff: is the "prebuilds" directory
name a norm, and if so, can we forcibly remove them in one fell swoop?

> >So yes, npm is such a mess that maybe sweeping the dust under the rug is
> >the only solution we can implement, but it is disappointing...
> I think it's not the way to go because of libc, that changes everything,

Well, I would assume that npm has "magic" to choose the proper prebuilt.
Otherwise, it would not even have the libc as a criteria in the
prebuilds name.

> >Maybe we can just also exclude the node modules from check-bin-arch
> >altogether and be done with it:
> >     NODEJS_SRC_BIN_ARCH_EXCLUDE = /usr/lib/node_modules/
> This ^^^ doesn't fix the libc problem. If you nm:
> target/usr/lib/node_modules/serialport/bindings-cpp/build/Release/bindings.node
> you will see that NodeJS assume native code to be built with glibc, so

Oh, I was just walking the "npm is a mess anyway, just ignore anything
from it, we don't care" road. If we take this stance, then
_BIN_ARCH_EXCLUDE is a trivial way to avoid the check-bin-arch issue.

And I am fine if this is the outcome of your investigations.

> I really need to go with building from scratch. And this works using:
> npm rebuild ...
> but I'm dealing with installation that is not so easy.
> Once I have it building natively with correct libc I will send a patch.

If we can get something that is "relatively" simple, easy, *and*
maintenable, then that's OK. But itherwise, I would say we should not
spend any unreasonable amount of time on the topic...

Thanks for looking into this mess, anyway! 👍 :-)

Regards,
Yann E. MORIN.
Giulio Benetti Nov. 5, 2023, 2:27 p.m. UTC | #5
Hi Yann, All,

On 03/11/23 22:21, Yann E. MORIN wrote:

[ SNIP ]

> 
> Oh, I was just walking the "npm is a mess anyway, just ignore anything
> from it, we don't care" road. If we take this stance, then
> _BIN_ARCH_EXCLUDE is a trivial way to avoid the check-bin-arch issue.
> 
> And I am fine if this is the outcome of your investigations.
> 

Since npm rebuilding is not that easy and fast to deal with I've just 
sent this patch to temporary exclude the /usr/lib/node_modules folder
as you've suggested(forgotten to add "Suggested-by: Yann E. MORIN 
<yann.morin.1998@free.fr>").

Later I hope to find time to deal with npm rebuilding.

Best regards
Yann E. MORIN Nov. 7, 2023, 4:15 p.m. UTC | #6
Giulio, All,

On 2023-11-05 15:27 +0100, Giulio Benetti spake thusly:
> On 03/11/23 22:21, Yann E. MORIN wrote:
> >Oh, I was just walking the "npm is a mess anyway, just ignore anything
> >from it, we don't care" road. If we take this stance, then
> >_BIN_ARCH_EXCLUDE is a trivial way to avoid the check-bin-arch issue.
[--SNIP--]
> Since npm rebuilding is not that easy and fast to deal with I've just sent
> this patch to temporary exclude the /usr/lib/node_modules folder
> as you've suggested(forgotten to add "Suggested-by: Yann E. MORIN
> <yann.morin.1998@free.fr>").
> 
> Later I hope to find time to deal with npm rebuilding.

I think tht it might in fact be what we're doing already, as we use:
    npm_config_build_from_source=true

in the environment when we call npm.

However, I was not able to find any reference to those options... :-/

Regards,
Yann E. MORIN.
Arnout Vandecappelle Nov. 7, 2023, 8:08 p.m. UTC | #7
On 07/11/2023 17:15, Yann E. MORIN wrote:
> Giulio, All,
> 
> On 2023-11-05 15:27 +0100, Giulio Benetti spake thusly:
>> On 03/11/23 22:21, Yann E. MORIN wrote:
>>> Oh, I was just walking the "npm is a mess anyway, just ignore anything
>> >from it, we don't care" road. If we take this stance, then
>>> _BIN_ARCH_EXCLUDE is a trivial way to avoid the check-bin-arch issue.
> [--SNIP--]
>> Since npm rebuilding is not that easy and fast to deal with I've just sent
>> this patch to temporary exclude the /usr/lib/node_modules folder
>> as you've suggested(forgotten to add "Suggested-by: Yann E. MORIN
>> <yann.morin.1998@free.fr>").
>>
>> Later I hope to find time to deal with npm rebuilding.
> 
> I think tht it might in fact be what we're doing already, as we use:
>      npm_config_build_from_source=true
> 
> in the environment when we call npm.

  I guess that even if the binaries for the actual target get rebuilt, the other 
irrelevant binaries that were also downloaded are still there and trigger the issue.

  Regards,
  Arnout

> 
> However, I was not able to find any reference to those options... :-/
> 
> Regards,
> Yann E. MORIN.
>
Yann E. MORIN Nov. 7, 2023, 8:40 p.m. UTC | #8
Arnout, All,

On 2023-11-07 21:08 +0100, Arnout Vandecappelle spake thusly:
> On 07/11/2023 17:15, Yann E. MORIN wrote:
> >On 2023-11-05 15:27 +0100, Giulio Benetti spake thusly:
> >>On 03/11/23 22:21, Yann E. MORIN wrote:
> >>>Oh, I was just walking the "npm is a mess anyway, just ignore anything
> >>>from it, we don't care" road. If we take this stance, then
> >>>_BIN_ARCH_EXCLUDE is a trivial way to avoid the check-bin-arch issue.
> >[--SNIP--]
> >>Since npm rebuilding is not that easy and fast to deal with I've just sent
> >>this patch to temporary exclude the /usr/lib/node_modules folder
> >>as you've suggested(forgotten to add "Suggested-by: Yann E. MORIN
> >><yann.morin.1998@free.fr>").
> >>
> >>Later I hope to find time to deal with npm rebuilding.
> >
> >I think tht it might in fact be what we're doing already, as we use:
> >     npm_config_build_from_source=true
> >
> >in the environment when we call npm.
> 
>  I guess that even if the binaries for the actual target get rebuilt, the
> other irrelevant binaries that were also downloaded are still there and
> trigger the issue.

Yes, absolutely, and confirmed by experience. ;-)

I'm still working on that patch...

Regards,
Yann E. MORIN.
diff mbox series

Patch

diff --git a/package/nodejs/nodejs-src/nodejs-src.mk b/package/nodejs/nodejs-src/nodejs-src.mk
index 3452c93728..bdbf0709c6 100644
--- a/package/nodejs/nodejs-src/nodejs-src.mk
+++ b/package/nodejs/nodejs-src/nodejs-src.mk
@@ -247,6 +247,18 @@  define NODEJS_SRC_INSTALL_MODULES
 	# npm install call below and setting npm_config_rollback=false can both
 	# help in diagnosing the problem.
 	$(NPM) install -g $(NODEJS_SRC_MODULES_LIST)
+
+	# Remove prebuilt files which are not compatible with the architecture
+	# and OS(Linux) we're building for. NOTE: .node files that don't have a
+	# readelf output have different ABI(i.e. Windows, Darwin etc.)
+	for f in $$(find $(TARGET_DIR)/usr/lib/node_modules -type f -name "*.node"); do \
+		echo $$f; \
+		arch=`$(TARGET_READELF) -h "$$f" 2>&1 | \
+			sed -r -e '/^  Machine: +(.+)/!d; s//\1/;' | head -1`; \
+		if [ "$$arch" != "$(BR2_READELF_ARCH_NAME)" ]; then \
+			rm -f $$f; \
+		fi \
+	done
 endef
 endif