diff mbox series

[ovs-dev,ovs,v3] debian/rules: Fix incorrect use of link-time optimizer.

Message ID CALsEdxQFOfCoK8bkDzoCd=YrO7_D_k_FOiuxr+4zk1mtmJ6iyg@mail.gmail.com
State Superseded
Delegated to: Simon Horman
Headers show
Series [ovs-dev,ovs,v3] debian/rules: Fix incorrect use of link-time optimizer. | expand

Checks

Context Check Description
ovsrobot/apply-robot fail apply and check: fail
ovsrobot/intel-ovs-compilation fail test: fail

Commit Message

Roberto Bartzen Acosta Jan. 8, 2024, 11:34 a.m. UTC
Current version of debian/rules simply uses the default lto GCC
optimization settings during the linkage process.

The main problem with this approach is that GCC on OS like Ubuntu
Jammy, for example, can enable the -flto=auto option during the
openvswitch building and linking process. In this case, the linked
dynamic libraries would need to be builded based on the same lto
optimization options, at the risk of not working, according to
documentation [1].

I'm not sure of the real benefits of using this link-time optimization
option, and since when it is enabled it causes problems with shared
libs link libjemalloc, for example, it seems safer overwritten
compiler decision by passing -fno-lto command.

[1] https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html#index-flto

Reported-at: https://bugs.launchpad.net/ubuntu/+source/openvswitch/+bug/2015748
Signed-off-by: Roberto Bartzen Acosta <rbartzen@gmail.com>
---
 debian/rules | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Simon Horman Jan. 9, 2024, 12:53 p.m. UTC | #1
On Mon, Jan 08, 2024 at 08:34:36AM -0300, Roberto Bartzen Acosta via dev wrote:
> Current version of debian/rules simply uses the default lto GCC
> optimization settings during the linkage process.
> 
> The main problem with this approach is that GCC on OS like Ubuntu
> Jammy, for example, can enable the -flto=auto option during the
> openvswitch building and linking process. In this case, the linked
> dynamic libraries would need to be builded based on the same lto
> optimization options, at the risk of not working, according to
> documentation [1].
> 
> I'm not sure of the real benefits of using this link-time optimization
> option, and since when it is enabled it causes problems with shared
> libs link libjemalloc, for example, it seems safer overwritten
> compiler decision by passing -fno-lto command.
> 
> [1] https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html#index-flto
> 
> Reported-at: https://bugs.launchpad.net/ubuntu/+source/openvswitch/+bug/2015748
> Signed-off-by: Roberto Bartzen Acosta <rbartzen@gmail.com>

Hi,

for one reason or another our automation was unable to apply this patch.
But I have done so locally in my own tree (it is not upstream)
and run the GitHub based tests with success:

https://github.com/horms/ovs/actions/runs/7448358289

From my point of view this patch seems reasonable.

Acked-by: Simon Horman <horms@ovn.org>

But I would be interested to hear feedback from others before applying it
to the upstream tree.

> ---
>  debian/rules | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/debian/rules b/debian/rules
> index dc5cc8a65..de8771813 100755
> --- a/debian/rules
> +++ b/debian/rules
> @@ -2,7 +2,7 @@
>  # -*- makefile -*-
>  #export DH_VERBOSE=1
>  export DEB_BUILD_MAINT_OPTIONS = hardening=+all
> -export DEB_CFLAGS_MAINT_APPEND = -fPIC
> +export DEB_CFLAGS_MAINT_APPEND = -fPIC -fno-lto
Ilya Maximets Jan. 10, 2024, 6:37 p.m. UTC | #2
On 1/9/24 13:53, Simon Horman wrote:
> On Mon, Jan 08, 2024 at 08:34:36AM -0300, Roberto Bartzen Acosta via dev wrote:
>> Current version of debian/rules simply uses the default lto GCC
>> optimization settings during the linkage process.
>>
>> The main problem with this approach is that GCC on OS like Ubuntu
>> Jammy, for example, can enable the -flto=auto option during the
>> openvswitch building and linking process. In this case, the linked
>> dynamic libraries would need to be builded based on the same lto
>> optimization options, at the risk of not working, according to
>> documentation [1].
>>
>> I'm not sure of the real benefits of using this link-time optimization

At least for DPDK-enabled builds LTO usually provides noticeable
performance improvement.  In the past I measured up to 10% packet
rate increase with LTO.  Though the gap went a bit lower with modern
compilers.  I think, it should still provide noticeable performance
improvements at least for the userspace datapath.

>> option, and since when it is enabled it causes problems with shared
>> libs link libjemalloc, for example, it seems safer overwritten
>> compiler decision by passing -fno-lto command.

Building shared libraries with LTO is indeed a bit problematic.
I agree that correctness should have a priority over potential
performance benefits.

Just out of curiosity, how do you add jemalloc into dependencies?
Using one of the environment variables?

>>
>> [1] https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html#index-flto
>>
>> Reported-at: https://bugs.launchpad.net/ubuntu/+source/openvswitch/+bug/2015748
>> Signed-off-by: Roberto Bartzen Acosta <rbartzen@gmail.com>
> 
> Hi,
> 
> for one reason or another our automation was unable to apply this patch.
> But I have done so locally in my own tree (it is not upstream)
> and run the GitHub based tests with success:
> 
> https://github.com/horms/ovs/actions/runs/7448358289
> 
> From my point of view this patch seems reasonable.
> 
> Acked-by: Simon Horman <horms@ovn.org>
> 
> But I would be interested to hear feedback from others before applying it
> to the upstream tree.

@Frode, do you have any thoughts on this change?  Is it a problem
also for the main (in-distribution) Ubuntu/Debian builds?

> 
>> ---
>>  debian/rules | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/debian/rules b/debian/rules
>> index dc5cc8a65..de8771813 100755
>> --- a/debian/rules
>> +++ b/debian/rules
>> @@ -2,7 +2,7 @@
>>  # -*- makefile -*-
>>  #export DH_VERBOSE=1
>>  export DEB_BUILD_MAINT_OPTIONS = hardening=+all
>> -export DEB_CFLAGS_MAINT_APPEND = -fPIC
>> +export DEB_CFLAGS_MAINT_APPEND = -fPIC -fno-lto
Roberto Bartzen Acosta Jan. 15, 2024, 11:14 a.m. UTC | #3
Em qua., 10 de jan. de 2024 às 15:37, Ilya Maximets <i.maximets@redhat.com>
escreveu:
>
> On 1/9/24 13:53, Simon Horman wrote:
> > On Mon, Jan 08, 2024 at 08:34:36AM -0300, Roberto Bartzen Acosta via
dev wrote:
> >> Current version of debian/rules simply uses the default lto GCC
> >> optimization settings during the linkage process.
> >>
> >> The main problem with this approach is that GCC on OS like Ubuntu
> >> Jammy, for example, can enable the -flto=auto option during the
> >> openvswitch building and linking process. In this case, the linked
> >> dynamic libraries would need to be builded based on the same lto
> >> optimization options, at the risk of not working, according to
> >> documentation [1].
> >>
> >> I'm not sure of the real benefits of using this link-time optimization
>
> At least for DPDK-enabled builds LTO usually provides noticeable
> performance improvement.  In the past I measured up to 10% packet
> rate increase with LTO.  Though the gap went a bit lower with modern
> compilers.  I think, it should still provide noticeable performance
> improvements at least for the userspace datapath.
>
> >> option, and since when it is enabled it causes problems with shared
> >> libs link libjemalloc, for example, it seems safer overwritten
> >> compiler decision by passing -fno-lto command.
>
> Building shared libraries with LTO is indeed a bit problematic.
> I agree that correctness should have a priority over potential
> performance benefits.
>
> Just out of curiosity, how do you add jemalloc into dependencies?
> Using one of the environment variables?

yeap, I'm just using the standard libs flag on a Ubuntu OS as suggested by
the OpenvSwitch documentation (*LIBS=-ljemalloc* ) [1]

override_dh_auto_configure:	test -d _debian || mkdir _debian	cd
_debian && ( \		test -e Makefile || \		../configure --prefix=/usr
--localstatedir=/var --enable-ssl LIBS=-ljemalloc \					
--sysconfdir=/etc \					 $(DATAPATH_CONFIGURE_OPTS) \					
$(EXTRA_CONFIGURE_OPTS) \					 )


[1] https://docs.openvswitch.org/en/latest/intro/install/general/


>
> >>
> >> [1] https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html#index-flto
> >>
> >> Reported-at:
https://bugs.launchpad.net/ubuntu/+source/openvswitch/+bug/2015748
> >> Signed-off-by: Roberto Bartzen Acosta <rbartzen@gmail.com>
> >
> > Hi,
> >
> > for one reason or another our automation was unable to apply this patch.
> > But I have done so locally in my own tree (it is not upstream)
> > and run the GitHub based tests with success:
> >
> > https://github.com/horms/ovs/actions/runs/7448358289
> >
> > From my point of view this patch seems reasonable.
> >
> > Acked-by: Simon Horman <horms@ovn.org>
> >
> > But I would be interested to hear feedback from others before applying
it
> > to the upstream tree.
>
> @Frode, do you have any thoughts on this change?  Is it a problem
> also for the main (in-distribution) Ubuntu/Debian builds?
>
> >
> >> ---
> >>  debian/rules | 2 +-
> >>  1 file changed, 1 insertion(+), 1 deletion(-)
> >>
> >> diff --git a/debian/rules b/debian/rules
> >> index dc5cc8a65..de8771813 100755
> >> --- a/debian/rules
> >> +++ b/debian/rules
> >> @@ -2,7 +2,7 @@
> >>  # -*- makefile -*-
> >>  #export DH_VERBOSE=1
> >>  export DEB_BUILD_MAINT_OPTIONS = hardening=+all
> >> -export DEB_CFLAGS_MAINT_APPEND = -fPIC
> >> +export DEB_CFLAGS_MAINT_APPEND = -fPIC -fno-lto
>
Frode Nordahl Jan. 25, 2024, 8:01 p.m. UTC | #4
Apologies for the tardy response to this thread, freeze deadlines and
PTO has prevented me from responding sooner.

On Mon, Jan 15, 2024 at 12:14 PM Roberto Bartzen Acosta
<roberto.acosta@luizalabs.com> wrote:
>
>
>
> Em qua., 10 de jan. de 2024 às 15:37, Ilya Maximets <i.maximets@redhat.com> escreveu:
> >
> > On 1/9/24 13:53, Simon Horman wrote:
> > > On Mon, Jan 08, 2024 at 08:34:36AM -0300, Roberto Bartzen Acosta via dev wrote:
> > >> Current version of debian/rules simply uses the default lto GCC
> > >> optimization settings during the linkage process.
> > >>
> > >> The main problem with this approach is that GCC on OS like Ubuntu
> > >> Jammy, for example, can enable the -flto=auto option during the
> > >> openvswitch building and linking process. In this case, the linked
> > >> dynamic libraries would need to be builded based on the same lto
> > >> optimization options, at the risk of not working, according to
> > >> documentation [1].

The documentation also says:
"When producing the final binary, GCC only applies link-time
optimizations to those files that contain bytecode. Therefore, you can
mix and match object files and libraries with GIMPLE bytecodes and
final object code. GCC automatically selects which files to optimize
in LTO mode and which files to link without further processing."

Which to me reads like it is supported to mix LTO optimized and
non-optimized objects?


> > >>
> > >> I'm not sure of the real benefits of using this link-time optimization
> >
> > At least for DPDK-enabled builds LTO usually provides noticeable
> > performance improvement.  In the past I measured up to 10% packet
> > rate increase with LTO.  Though the gap went a bit lower with modern
> > compilers.  I think, it should still provide noticeable performance
> > improvements at least for the userspace datapath.
> >
> > >> option, and since when it is enabled it causes problems with shared
> > >> libs link libjemalloc, for example, it seems safer overwritten
> > >> compiler decision by passing -fno-lto command.
> >
> > Building shared libraries with LTO is indeed a bit problematic.
> > I agree that correctness should have a priority over potential
> > performance benefits.
> >
> > Just out of curiosity, how do you add jemalloc into dependencies?
> > Using one of the environment variables?
>
> yeap, I'm just using the standard libs flag on a Ubuntu OS as suggested by the OpenvSwitch documentation (LIBS=-ljemalloc ) [1]
>
> override_dh_auto_configure:
> test -d _debian || mkdir _debian
> cd _debian && ( \
> test -e Makefile || \
> ../configure --prefix=/usr --localstatedir=/var --enable-ssl LIBS=-ljemalloc \
> --sysconfdir=/etc \
> $(DATAPATH_CONFIGURE_OPTS) \
> $(EXTRA_CONFIGURE_OPTS) \
> )
>
>
> [1] https://docs.openvswitch.org/en/latest/intro/install/general/

This made me scratch my head, because in recent development efforts I
found the need to rebuild a package linking OVS to libunwind, which is
not the default for the package. And this worked fine, despite having
the default Ubuntu LTO settings applied. Furthermore both the
libunwind and libjemalloc packages appear to be built with the default
LTO settings as well.

> >
> > >>
> > >> [1] https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html#index-flto
> > >>
> > >> Reported-at: https://bugs.launchpad.net/ubuntu/+source/openvswitch/+bug/2015748
> > >> Signed-off-by: Roberto Bartzen Acosta <rbartzen@gmail.com>
> > >
> > > Hi,
> > >
> > > for one reason or another our automation was unable to apply this patch.
> > > But I have done so locally in my own tree (it is not upstream)
> > > and run the GitHub based tests with success:
> > >
> > > https://github.com/horms/ovs/actions/runs/7448358289
> > >
> > > From my point of view this patch seems reasonable.
> > >
> > > Acked-by: Simon Horman <horms@ovn.org>
> > >
> > > But I would be interested to hear feedback from others before applying it
> > > to the upstream tree.
> >
> > @Frode, do you have any thoughts on this change?  Is it a problem
> > also for the main (in-distribution) Ubuntu/Debian builds?

TBH, I'm not convinced the LTO configuration is the root of the issue,
and I'd prefer if we left the LTO options alone and figured out what's
really going on here.
Roberto Bartzen Acosta Jan. 25, 2024, 9:33 p.m. UTC | #5
Em qui., 25 de jan. de 2024 às 17:01, Frode Nordahl <
frode.nordahl@canonical.com> escreveu:

> Apologies for the tardy response to this thread, freeze deadlines and
> PTO has prevented me from responding sooner.
>
> On Mon, Jan 15, 2024 at 12:14 PM Roberto Bartzen Acosta
> <roberto.acosta@luizalabs.com> wrote:
> >
> >
> >
> > Em qua., 10 de jan. de 2024 às 15:37, Ilya Maximets <
> i.maximets@redhat.com> escreveu:
> > >
> > > On 1/9/24 13:53, Simon Horman wrote:
> > > > On Mon, Jan 08, 2024 at 08:34:36AM -0300, Roberto Bartzen Acosta via
> dev wrote:
> > > >> Current version of debian/rules simply uses the default lto GCC
> > > >> optimization settings during the linkage process.
> > > >>
> > > >> The main problem with this approach is that GCC on OS like Ubuntu
> > > >> Jammy, for example, can enable the -flto=auto option during the
> > > >> openvswitch building and linking process. In this case, the linked
> > > >> dynamic libraries would need to be builded based on the same lto
> > > >> optimization options, at the risk of not working, according to
> > > >> documentation [1].
>
> The documentation also says:
> "When producing the final binary, GCC only applies link-time
> optimizations to those files that contain bytecode. Therefore, you can
> mix and match object files and libraries with GIMPLE bytecodes and
> final object code. GCC automatically selects which files to optimize
> in LTO mode and which files to link without further processing."
>
> Which to me reads like it is supported to mix LTO optimized and
> non-optimized objects?
>

Yeah, that makes sense but another important doc reference is "The
important thing to keep in mind is that to enable link-time optimizations
you need to use the GCC driver to perform the link step. GCC automatically
performs link-time optimization if any of the objects involved were
compiled with the -flto command-line option." . So, my assumption is that
LTO was automatically enabled because some of the objects involved in
compiling with the jammy dev environment support it.

In that case, the LTO documentation suggests that we always used -flto with
some optimization flag (e.g. -O2) both at compile time and at link time.
So, AFAIU, you should link the code using exactly the same optimization
flags used at compile time, and the statement about "automatically selects
which files to optimize in LTO" is correct as long as it is possible to
read the GIMPLE bytecode of the objects involved for the optimization
option passed (in our case: jemalloc).  As detailed in the doc, the idea of
-flto is to inject in the object files some internal (GIMPLE-bytecode)
representation of the source code, and that is also used at link time
because, for optimization steps, the inlining bytecode happens again when
linking your whole program. The libjemalloc has its own headers/inline (or
inlinable) functions that also need to use LTO with some optimization
option (e.g. -O2) when compiling that library from its source code (and in
that case its GIMPLE bytecode is stored in the library).

In summary, my assumption is that, probably, libjemalloc installed on OS
didn't generate the GIMPLE bytecode properly to be used in the complete
build and linking process with the optimization option passed by
openvswitch. In other words, the issue seems to be with the jemalloc lib in
the OS Jammy.

So, why am I making a possible change to remove the LTO flag from the
openvswitch build process?  because the linkage process is susceptible to
OS-GCC decisions that can generate incoherent results in some cases.


>
> > > >>
> > > >> I'm not sure of the real benefits of using this link-time
> optimization
> > >
> > > At least for DPDK-enabled builds LTO usually provides noticeable
> > > performance improvement.  In the past I measured up to 10% packet
> > > rate increase with LTO.  Though the gap went a bit lower with modern
> > > compilers.  I think, it should still provide noticeable performance
> > > improvements at least for the userspace datapath.
> > >
> > > >> option, and since when it is enabled it causes problems with shared
> > > >> libs link libjemalloc, for example, it seems safer overwritten
> > > >> compiler decision by passing -fno-lto command.
> > >
> > > Building shared libraries with LTO is indeed a bit problematic.
> > > I agree that correctness should have a priority over potential
> > > performance benefits.
> > >
> > > Just out of curiosity, how do you add jemalloc into dependencies?
> > > Using one of the environment variables?
> >
> > yeap, I'm just using the standard libs flag on a Ubuntu OS as suggested
> by the OpenvSwitch documentation (LIBS=-ljemalloc ) [1]
> >
> > override_dh_auto_configure:
> > test -d _debian || mkdir _debian
> > cd _debian && ( \
> > test -e Makefile || \
> > ../configure --prefix=/usr --localstatedir=/var --enable-ssl
> LIBS=-ljemalloc \
> > --sysconfdir=/etc \
> > $(DATAPATH_CONFIGURE_OPTS) \
> > $(EXTRA_CONFIGURE_OPTS) \
> > )
> >
> >
> > [1] https://docs.openvswitch.org/en/latest/intro/install/general/
>
> This made me scratch my head, because in recent development efforts I
> found the need to rebuild a package linking OVS to libunwind, which is
> not the default for the package. And this worked fine, despite having
> the default Ubuntu LTO settings applied. Furthermore both the
> libunwind and libjemalloc packages appear to be built with the default
> LTO settings as well.
>

Please, could you check if you can build and link correctly with jemalloc
in your environment?


>
> > >
> > > >>
> > > >> [1]
> https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html#index-flto
> > > >>
> > > >> Reported-at:
> https://bugs.launchpad.net/ubuntu/+source/openvswitch/+bug/2015748
> > > >> Signed-off-by: Roberto Bartzen Acosta <rbartzen@gmail.com>
> > > >
> > > > Hi,
> > > >
> > > > for one reason or another our automation was unable to apply this
> patch.
> > > > But I have done so locally in my own tree (it is not upstream)
> > > > and run the GitHub based tests with success:
> > > >
> > > > https://github.com/horms/ovs/actions/runs/7448358289
> > > >
> > > > From my point of view this patch seems reasonable.
> > > >
> > > > Acked-by: Simon Horman <horms@ovn.org>
> > > >
> > > > But I would be interested to hear feedback from others before
> applying it
> > > > to the upstream tree.
> > >
> > > @Frode, do you have any thoughts on this change?  Is it a problem
> > > also for the main (in-distribution) Ubuntu/Debian builds?
>
> TBH, I'm not convinced the LTO configuration is the root of the issue,
> and I'd prefer if we left the LTO options alone and figured out what's
> really going on here.
>

I would really appreciate your help figuring this out ;)

Thanks
Roberto

>
> --
> Frode Nordahl
>
> > > >
> > > >> ---
> > > >>  debian/rules | 2 +-
> > > >>  1 file changed, 1 insertion(+), 1 deletion(-)
> > > >>
> > > >> diff --git a/debian/rules b/debian/rules
> > > >> index dc5cc8a65..de8771813 100755
> > > >> --- a/debian/rules
> > > >> +++ b/debian/rules
> > > >> @@ -2,7 +2,7 @@
> > > >>  # -*- makefile -*-
> > > >>  #export DH_VERBOSE=1
> > > >>  export DEB_BUILD_MAINT_OPTIONS = hardening=+all
> > > >> -export DEB_CFLAGS_MAINT_APPEND = -fPIC
> > > >> +export DEB_CFLAGS_MAINT_APPEND = -fPIC -fno-lto
> > >
> >
> >
> > ‘Esta mensagem é direcionada apenas para os endereços constantes no
> cabeçalho inicial. Se você não está listado nos endereços constantes no
> cabeçalho, pedimos-lhe que desconsidere completamente o conteúdo dessa
> mensagem e cuja cópia, encaminhamento e/ou execução das ações citadas estão
> imediatamente anuladas e proibidas’.
> >
> >  ‘Apesar do Magazine Luiza tomar todas as precauções razoáveis para
> assegurar que nenhum vírus esteja presente nesse e-mail, a empresa não
> poderá aceitar a responsabilidade por quaisquer perdas ou danos causados
> por esse e-mail ou por seus anexos’.
>
Frode Nordahl Jan. 25, 2024, 9:44 p.m. UTC | #6
tor. 25. jan. 2024, 22:32 skrev Roberto Bartzen Acosta <
roberto.acosta@luizalabs.com>:

>
>
> Em qui., 25 de jan. de 2024 às 17:01, Frode Nordahl <
> frode.nordahl@canonical.com> escreveu:
>
>> Apologies for the tardy response to this thread, freeze deadlines and
>> PTO has prevented me from responding sooner.
>>
>> On Mon, Jan 15, 2024 at 12:14 PM Roberto Bartzen Acosta
>> <roberto.acosta@luizalabs.com> wrote:
>> >
>> >
>> >
>> > Em qua., 10 de jan. de 2024 às 15:37, Ilya Maximets <
>> i.maximets@redhat.com> escreveu:
>> > >
>> > > On 1/9/24 13:53, Simon Horman wrote:
>> > > > On Mon, Jan 08, 2024 at 08:34:36AM -0300, Roberto Bartzen Acosta
>> via dev wrote:
>> > > >> Current version of debian/rules simply uses the default lto GCC
>> > > >> optimization settings during the linkage process.
>> > > >>
>> > > >> The main problem with this approach is that GCC on OS like Ubuntu
>> > > >> Jammy, for example, can enable the -flto=auto option during the
>> > > >> openvswitch building and linking process. In this case, the linked
>> > > >> dynamic libraries would need to be builded based on the same lto
>> > > >> optimization options, at the risk of not working, according to
>> > > >> documentation [1].
>>
>> The documentation also says:
>> "When producing the final binary, GCC only applies link-time
>> optimizations to those files that contain bytecode. Therefore, you can
>> mix and match object files and libraries with GIMPLE bytecodes and
>> final object code. GCC automatically selects which files to optimize
>> in LTO mode and which files to link without further processing."
>>
>> Which to me reads like it is supported to mix LTO optimized and
>> non-optimized objects?
>>
>
> Yeah, that makes sense but another important doc reference is "The
> important thing to keep in mind is that to enable link-time optimizations
> you need to use the GCC driver to perform the link step. GCC automatically
> performs link-time optimization if any of the objects involved were
> compiled with the -flto command-line option." . So, my assumption is that
> LTO was automatically enabled because some of the objects involved in
> compiling with the jammy dev environment support it.
>
> In that case, the LTO documentation suggests that we always used -flto with
> some optimization flag (e.g. -O2) both at compile time and at link time.
> So, AFAIU, you should link the code using exactly the same optimization
> flags used at compile time, and the statement about "automatically selects
> which files to optimize in LTO" is correct as long as it is possible to
> read the GIMPLE bytecode of the objects involved for the optimization
> option passed (in our case: jemalloc).  As detailed in the doc, the idea of
> -flto is to inject in the object files some internal (GIMPLE-bytecode)
> representation of the source code, and that is also used at link time
> because, for optimization steps, the inlining bytecode happens again when
> linking your whole program. The libjemalloc has its own headers/inline (or
> inlinable) functions that also need to use LTO with some optimization
> option (e.g. -O2) when compiling that library from its source code (and in
> that case its GIMPLE bytecode is stored in the library).
>
> In summary, my assumption is that, probably, libjemalloc installed on OS
> didn't generate the GIMPLE bytecode properly to be used in the complete
> build and linking process with the optimization option passed by
> openvswitch. In other words, the issue seems to be with the jemalloc lib in
> the OS Jammy.
>

If that is the case, should we not fix the jemalloc build instead?

So, why am I making a possible change to remove the LTO flag from the
> openvswitch build process?  because the linkage process is susceptible to
> OS-GCC decisions that can generate incoherent results in some cases.
>
>
>>
>> > > >>
>> > > >> I'm not sure of the real benefits of using this link-time
>> optimization
>> > >
>> > > At least for DPDK-enabled builds LTO usually provides noticeable
>> > > performance improvement.  In the past I measured up to 10% packet
>> > > rate increase with LTO.  Though the gap went a bit lower with modern
>> > > compilers.  I think, it should still provide noticeable performance
>> > > improvements at least for the userspace datapath.
>> > >
>> > > >> option, and since when it is enabled it causes problems with shared
>> > > >> libs link libjemalloc, for example, it seems safer overwritten
>> > > >> compiler decision by passing -fno-lto command.
>> > >
>> > > Building shared libraries with LTO is indeed a bit problematic.
>> > > I agree that correctness should have a priority over potential
>> > > performance benefits.
>> > >
>> > > Just out of curiosity, how do you add jemalloc into dependencies?
>> > > Using one of the environment variables?
>> >
>> > yeap, I'm just using the standard libs flag on a Ubuntu OS as suggested
>> by the OpenvSwitch documentation (LIBS=-ljemalloc ) [1]
>> >
>> > override_dh_auto_configure:
>> > test -d _debian || mkdir _debian
>> > cd _debian && ( \
>> > test -e Makefile || \
>> > ../configure --prefix=/usr --localstatedir=/var --enable-ssl
>> LIBS=-ljemalloc \
>> > --sysconfdir=/etc \
>> > $(DATAPATH_CONFIGURE_OPTS) \
>> > $(EXTRA_CONFIGURE_OPTS) \
>> > )
>> >
>> >
>> > [1] https://docs.openvswitch.org/en/latest/intro/install/general/
>>
>> This made me scratch my head, because in recent development efforts I
>> found the need to rebuild a package linking OVS to libunwind, which is
>> not the default for the package. And this worked fine, despite having
>> the default Ubuntu LTO settings applied. Furthermore both the
>> libunwind and libjemalloc packages appear to be built with the default
>> LTO settings as well.
>>
>
> Please, could you check if you can build and link correctly with jemalloc
> in your environment?
>

It does not, and my point is that this appears to be specifically about
jemalloc and not about shared libraries and LTO in general, which is why
I'm arguing not to change the LTO settings.


>
>> > >
>> > > >>
>> > > >> [1]
>> https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html#index-flto
>> > > >>
>> > > >> Reported-at:
>> https://bugs.launchpad.net/ubuntu/+source/openvswitch/+bug/2015748
>> > > >> Signed-off-by: Roberto Bartzen Acosta <rbartzen@gmail.com>
>> > > >
>> > > > Hi,
>> > > >
>> > > > for one reason or another our automation was unable to apply this
>> patch.
>> > > > But I have done so locally in my own tree (it is not upstream)
>> > > > and run the GitHub based tests with success:
>> > > >
>> > > > https://github.com/horms/ovs/actions/runs/7448358289
>> > > >
>> > > > From my point of view this patch seems reasonable.
>> > > >
>> > > > Acked-by: Simon Horman <horms@ovn.org>
>> > > >
>> > > > But I would be interested to hear feedback from others before
>> applying it
>> > > > to the upstream tree.
>> > >
>> > > @Frode, do you have any thoughts on this change?  Is it a problem
>> > > also for the main (in-distribution) Ubuntu/Debian builds?
>>
>> TBH, I'm not convinced the LTO configuration is the root of the issue,
>> and I'd prefer if we left the LTO options alone and figured out what's
>> really going on here.
>>
>
> I would really appreciate your help figuring this out ;)
>

Sure, let's try to figure it out.

--
Frode Nordahl

Thanks
> Roberto
>
>>
>> --
>> Frode Nordahl
>>
>> > > >
>> > > >> ---
>> > > >>  debian/rules | 2 +-
>> > > >>  1 file changed, 1 insertion(+), 1 deletion(-)
>> > > >>
>> > > >> diff --git a/debian/rules b/debian/rules
>> > > >> index dc5cc8a65..de8771813 100755
>> > > >> --- a/debian/rules
>> > > >> +++ b/debian/rules
>> > > >> @@ -2,7 +2,7 @@
>> > > >>  # -*- makefile -*-
>> > > >>  #export DH_VERBOSE=1
>> > > >>  export DEB_BUILD_MAINT_OPTIONS = hardening=+all
>> > > >> -export DEB_CFLAGS_MAINT_APPEND = -fPIC
>> > > >> +export DEB_CFLAGS_MAINT_APPEND = -fPIC -fno-lto
>> > >
>> >
>> >
>> > ‘Esta mensagem é direcionada apenas para os endereços constantes no
>> cabeçalho inicial. Se você não está listado nos endereços constantes no
>> cabeçalho, pedimos-lhe que desconsidere completamente o conteúdo dessa
>> mensagem e cuja cópia, encaminhamento e/ou execução das ações citadas estão
>> imediatamente anuladas e proibidas’.
>> >
>> >  ‘Apesar do Magazine Luiza tomar todas as precauções razoáveis para
>> assegurar que nenhum vírus esteja presente nesse e-mail, a empresa não
>> poderá aceitar a responsabilidade por quaisquer perdas ou danos causados
>> por esse e-mail ou por seus anexos’.
>>
>
>
> *‘Esta mensagem é direcionada apenas para os endereços constantes no
> cabeçalho inicial. Se você não está listado nos endereços constantes no
> cabeçalho, pedimos-lhe que desconsidere completamente o conteúdo dessa
> mensagem e cuja cópia, encaminhamento e/ou execução das ações citadas estão
> imediatamente anuladas e proibidas’.*
>
>  *‘Apesar do Magazine Luiza tomar todas as precauções razoáveis para
> assegurar que nenhum vírus esteja presente nesse e-mail, a empresa não
> poderá aceitar a responsabilidade por quaisquer perdas ou danos causados
> por esse e-mail ou por seus anexos’.*
>
Frode Nordahl Jan. 26, 2024, 7:36 a.m. UTC | #7
On Thu, Jan 25, 2024 at 10:44 PM Frode Nordahl
<frode.nordahl@canonical.com> wrote:
>
>
>
> tor. 25. jan. 2024, 22:32 skrev Roberto Bartzen Acosta <roberto.acosta@luizalabs.com>:
>>
>>
>>
>> Em qui., 25 de jan. de 2024 às 17:01, Frode Nordahl <frode.nordahl@canonical.com> escreveu:
>>>
>>> Apologies for the tardy response to this thread, freeze deadlines and
>>> PTO has prevented me from responding sooner.
>>>
>>> On Mon, Jan 15, 2024 at 12:14 PM Roberto Bartzen Acosta
>>> <roberto.acosta@luizalabs.com> wrote:
>>> >
>>> >
>>> >
>>> > Em qua., 10 de jan. de 2024 às 15:37, Ilya Maximets <i.maximets@redhat.com> escreveu:
>>> > >
>>> > > On 1/9/24 13:53, Simon Horman wrote:
>>> > > > On Mon, Jan 08, 2024 at 08:34:36AM -0300, Roberto Bartzen Acosta via dev wrote:
>>> > > >> Current version of debian/rules simply uses the default lto GCC
>>> > > >> optimization settings during the linkage process.
>>> > > >>
>>> > > >> The main problem with this approach is that GCC on OS like Ubuntu
>>> > > >> Jammy, for example, can enable the -flto=auto option during the
>>> > > >> openvswitch building and linking process. In this case, the linked
>>> > > >> dynamic libraries would need to be builded based on the same lto
>>> > > >> optimization options, at the risk of not working, according to
>>> > > >> documentation [1].
>>>
>>> The documentation also says:
>>> "When producing the final binary, GCC only applies link-time
>>> optimizations to those files that contain bytecode. Therefore, you can
>>> mix and match object files and libraries with GIMPLE bytecodes and
>>> final object code. GCC automatically selects which files to optimize
>>> in LTO mode and which files to link without further processing."
>>>
>>> Which to me reads like it is supported to mix LTO optimized and
>>> non-optimized objects?
>>
>>
>> Yeah, that makes sense but another important doc reference is "The important thing to keep in mind is that to enable link-time optimizations you need to use the GCC driver to perform the link step. GCC automatically performs link-time optimization if any of the objects involved were compiled with the -flto command-line option." . So, my assumption is that LTO was automatically enabled because some of the objects involved in compiling with the jammy dev environment support it.
>>
>> In that case, the LTO documentation suggests that we always used -flto with some optimization flag (e.g. -O2) both at compile time and at link time. So, AFAIU, you should link the code using exactly the same optimization flags used at compile time, and the statement about "automatically selects which files to optimize in LTO" is correct as long as it is possible to read the GIMPLE bytecode of the objects involved for the optimization option passed (in our case: jemalloc).  As detailed in the doc, the idea of -flto is to inject in the object files some internal (GIMPLE-bytecode) representation of the source code, and that is also used at link time because, for optimization steps, the inlining bytecode happens again when linking your whole program. The libjemalloc has its own headers/inline (or inlinable) functions that also need to use LTO with some optimization option (e.g. -O2) when compiling that library from its source code (and in that case its GIMPLE bytecode is stored in the library).
>>
>> In summary, my assumption is that, probably, libjemalloc installed on OS didn't generate the GIMPLE bytecode properly to be used in the complete build and linking process with the optimization option passed by openvswitch. In other words, the issue seems to be with the jemalloc lib in the OS Jammy.
>
>
> If that is the case, should we not fix the jemalloc build instead?
>
>> So, why am I making a possible change to remove the LTO flag from the openvswitch build process?  because the linkage process is susceptible to OS-GCC decisions that can generate incoherent results in some cases.
>>
>>>
>>>
>>> > > >>
>>> > > >> I'm not sure of the real benefits of using this link-time optimization
>>> > >
>>> > > At least for DPDK-enabled builds LTO usually provides noticeable
>>> > > performance improvement.  In the past I measured up to 10% packet
>>> > > rate increase with LTO.  Though the gap went a bit lower with modern
>>> > > compilers.  I think, it should still provide noticeable performance
>>> > > improvements at least for the userspace datapath.
>>> > >
>>> > > >> option, and since when it is enabled it causes problems with shared
>>> > > >> libs link libjemalloc, for example, it seems safer overwritten
>>> > > >> compiler decision by passing -fno-lto command.
>>> > >
>>> > > Building shared libraries with LTO is indeed a bit problematic.
>>> > > I agree that correctness should have a priority over potential
>>> > > performance benefits.
>>> > >
>>> > > Just out of curiosity, how do you add jemalloc into dependencies?
>>> > > Using one of the environment variables?
>>> >
>>> > yeap, I'm just using the standard libs flag on a Ubuntu OS as suggested by the OpenvSwitch documentation (LIBS=-ljemalloc ) [1]
>>> >
>>> > override_dh_auto_configure:
>>> > test -d _debian || mkdir _debian
>>> > cd _debian && ( \
>>> > test -e Makefile || \
>>> > ../configure --prefix=/usr --localstatedir=/var --enable-ssl LIBS=-ljemalloc \
>>> > --sysconfdir=/etc \
>>> > $(DATAPATH_CONFIGURE_OPTS) \
>>> > $(EXTRA_CONFIGURE_OPTS) \
>>> > )
>>> >
>>> >
>>> > [1] https://docs.openvswitch.org/en/latest/intro/install/general/
>>>
>>> This made me scratch my head, because in recent development efforts I
>>> found the need to rebuild a package linking OVS to libunwind, which is
>>> not the default for the package. And this worked fine, despite having
>>> the default Ubuntu LTO settings applied. Furthermore both the
>>> libunwind and libjemalloc packages appear to be built with the default
>>> LTO settings as well.
>>
>>
>> Please, could you check if you can build and link correctly with jemalloc in your environment?
>
>
> It does not, and my point is that this appears to be specifically about jemalloc and not about shared libraries and LTO in general, which is why I'm arguing not to change the LTO settings.
>
>>
>>>
>>> > >
>>> > > >>
>>> > > >> [1] https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html#index-flto
>>> > > >>
>>> > > >> Reported-at: https://bugs.launchpad.net/ubuntu/+source/openvswitch/+bug/2015748
>>> > > >> Signed-off-by: Roberto Bartzen Acosta <rbartzen@gmail.com>
>>> > > >
>>> > > > Hi,
>>> > > >
>>> > > > for one reason or another our automation was unable to apply this patch.
>>> > > > But I have done so locally in my own tree (it is not upstream)
>>> > > > and run the GitHub based tests with success:
>>> > > >
>>> > > > https://github.com/horms/ovs/actions/runs/7448358289
>>> > > >
>>> > > > From my point of view this patch seems reasonable.
>>> > > >
>>> > > > Acked-by: Simon Horman <horms@ovn.org>
>>> > > >
>>> > > > But I would be interested to hear feedback from others before applying it
>>> > > > to the upstream tree.
>>> > >
>>> > > @Frode, do you have any thoughts on this change?  Is it a problem
>>> > > also for the main (in-distribution) Ubuntu/Debian builds?
>>>
>>> TBH, I'm not convinced the LTO configuration is the root of the issue,
>>> and I'd prefer if we left the LTO options alone and figured out what's
>>> really going on here.
>>
>>
>> I would really appreciate your help figuring this out ;)
>
>
> Sure, let's try to figure it out.

Building the package with:

    EXTRA_CONFIGURE_OPTS='CFLAGS=-fno-builtin LIBS=-ljemalloc'

appears to fix this for me, does that work for you?
Roberto Bartzen Acosta Jan. 26, 2024, 11:44 a.m. UTC | #8
Em qui., 25 de jan. de 2024 às 18:45, Frode Nordahl <
frode.nordahl@canonical.com> escreveu:

>
>
> tor. 25. jan. 2024, 22:32 skrev Roberto Bartzen Acosta <
> roberto.acosta@luizalabs.com>:
>
>>
>>
>> Em qui., 25 de jan. de 2024 às 17:01, Frode Nordahl <
>> frode.nordahl@canonical.com> escreveu:
>>
>>> Apologies for the tardy response to this thread, freeze deadlines and
>>> PTO has prevented me from responding sooner.
>>>
>>> On Mon, Jan 15, 2024 at 12:14 PM Roberto Bartzen Acosta
>>> <roberto.acosta@luizalabs.com> wrote:
>>> >
>>> >
>>> >
>>> > Em qua., 10 de jan. de 2024 às 15:37, Ilya Maximets <
>>> i.maximets@redhat.com> escreveu:
>>> > >
>>> > > On 1/9/24 13:53, Simon Horman wrote:
>>> > > > On Mon, Jan 08, 2024 at 08:34:36AM -0300, Roberto Bartzen Acosta
>>> via dev wrote:
>>> > > >> Current version of debian/rules simply uses the default lto GCC
>>> > > >> optimization settings during the linkage process.
>>> > > >>
>>> > > >> The main problem with this approach is that GCC on OS like Ubuntu
>>> > > >> Jammy, for example, can enable the -flto=auto option during the
>>> > > >> openvswitch building and linking process. In this case, the linked
>>> > > >> dynamic libraries would need to be builded based on the same lto
>>> > > >> optimization options, at the risk of not working, according to
>>> > > >> documentation [1].
>>>
>>> The documentation also says:
>>> "When producing the final binary, GCC only applies link-time
>>> optimizations to those files that contain bytecode. Therefore, you can
>>> mix and match object files and libraries with GIMPLE bytecodes and
>>> final object code. GCC automatically selects which files to optimize
>>> in LTO mode and which files to link without further processing."
>>>
>>> Which to me reads like it is supported to mix LTO optimized and
>>> non-optimized objects?
>>>
>>
>> Yeah, that makes sense but another important doc reference is "The
>> important thing to keep in mind is that to enable link-time optimizations
>> you need to use the GCC driver to perform the link step. GCC automatically
>> performs link-time optimization if any of the objects involved were
>> compiled with the -flto command-line option." . So, my assumption is that
>> LTO was automatically enabled because some of the objects involved in
>> compiling with the jammy dev environment support it.
>>
>> In that case, the LTO documentation suggests that we always used -flto with
>> some optimization flag (e.g. -O2) both at compile time and at link time.
>> So, AFAIU, you should link the code using exactly the same optimization
>> flags used at compile time, and the statement about "automatically selects
>> which files to optimize in LTO" is correct as long as it is possible to
>> read the GIMPLE bytecode of the objects involved for the optimization
>> option passed (in our case: jemalloc).  As detailed in the doc, the idea of
>> -flto is to inject in the object files some internal (GIMPLE-bytecode)
>> representation of the source code, and that is also used at link time
>> because, for optimization steps, the inlining bytecode happens again when
>> linking your whole program. The libjemalloc has its own headers/inline (or
>> inlinable) functions that also need to use LTO with some optimization
>> option (e.g. -O2) when compiling that library from its source code (and in
>> that case its GIMPLE bytecode is stored in the library).
>>
>> In summary, my assumption is that, probably, libjemalloc installed on OS
>> didn't generate the GIMPLE bytecode properly to be used in the complete
>> build and linking process with the optimization option passed by
>> openvswitch. In other words, the issue seems to be with the jemalloc lib in
>> the OS Jammy.
>>
>
> If that is the case, should we not fix the jemalloc build instead?
>

Of course, It would be a good approach! maybe you could let the internal
Canonical SEG OS team know about this.


> So, why am I making a possible change to remove the LTO flag from the
>> openvswitch build process?  because the linkage process is susceptible to
>> OS-GCC decisions that can generate incoherent results in some cases.
>>
>>
>>>
>>> > > >>
>>> > > >> I'm not sure of the real benefits of using this link-time
>>> optimization
>>> > >
>>> > > At least for DPDK-enabled builds LTO usually provides noticeable
>>> > > performance improvement.  In the past I measured up to 10% packet
>>> > > rate increase with LTO.  Though the gap went a bit lower with modern
>>> > > compilers.  I think, it should still provide noticeable performance
>>> > > improvements at least for the userspace datapath.
>>> > >
>>> > > >> option, and since when it is enabled it causes problems with
>>> shared
>>> > > >> libs link libjemalloc, for example, it seems safer overwritten
>>> > > >> compiler decision by passing -fno-lto command.
>>> > >
>>> > > Building shared libraries with LTO is indeed a bit problematic.
>>> > > I agree that correctness should have a priority over potential
>>> > > performance benefits.
>>> > >
>>> > > Just out of curiosity, how do you add jemalloc into dependencies?
>>> > > Using one of the environment variables?
>>> >
>>> > yeap, I'm just using the standard libs flag on a Ubuntu OS as
>>> suggested by the OpenvSwitch documentation (LIBS=-ljemalloc ) [1]
>>> >
>>> > override_dh_auto_configure:
>>> > test -d _debian || mkdir _debian
>>> > cd _debian && ( \
>>> > test -e Makefile || \
>>> > ../configure --prefix=/usr --localstatedir=/var --enable-ssl
>>> LIBS=-ljemalloc \
>>> > --sysconfdir=/etc \
>>> > $(DATAPATH_CONFIGURE_OPTS) \
>>> > $(EXTRA_CONFIGURE_OPTS) \
>>> > )
>>> >
>>> >
>>> > [1] https://docs.openvswitch.org/en/latest/intro/install/general/
>>>
>>> This made me scratch my head, because in recent development efforts I
>>> found the need to rebuild a package linking OVS to libunwind, which is
>>> not the default for the package. And this worked fine, despite having
>>> the default Ubuntu LTO settings applied. Furthermore both the
>>> libunwind and libjemalloc packages appear to be built with the default
>>> LTO settings as well.
>>>
>>
>> Please, could you check if you can build and link correctly with jemalloc
>> in your environment?
>>
>
> It does not, and my point is that this appears to be specifically about
> jemalloc and not about shared libraries and LTO in general, which is why
> I'm arguing not to change the LTO settings.
>
>
It makes sense, although I haven't tested with other unusual shared libs.


>
>>
>>> > >
>>> > > >>
>>> > > >> [1]
>>> https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html#index-flto
>>> > > >>
>>> > > >> Reported-at:
>>> https://bugs.launchpad.net/ubuntu/+source/openvswitch/+bug/2015748
>>> > > >> Signed-off-by: Roberto Bartzen Acosta <rbartzen@gmail.com>
>>> > > >
>>> > > > Hi,
>>> > > >
>>> > > > for one reason or another our automation was unable to apply this
>>> patch.
>>> > > > But I have done so locally in my own tree (it is not upstream)
>>> > > > and run the GitHub based tests with success:
>>> > > >
>>> > > > https://github.com/horms/ovs/actions/runs/7448358289
>>> > > >
>>> > > > From my point of view this patch seems reasonable.
>>> > > >
>>> > > > Acked-by: Simon Horman <horms@ovn.org>
>>> > > >
>>> > > > But I would be interested to hear feedback from others before
>>> applying it
>>> > > > to the upstream tree.
>>> > >
>>> > > @Frode, do you have any thoughts on this change?  Is it a problem
>>> > > also for the main (in-distribution) Ubuntu/Debian builds?
>>>
>>> TBH, I'm not convinced the LTO configuration is the root of the issue,
>>> and I'd prefer if we left the LTO options alone and figured out what's
>>> really going on here.
>>>
>>
>> I would really appreciate your help figuring this out ;)
>>
>
> Sure, let's try to figure it out.
>
> --
> Frode Nordahl
>
> Thanks
>> Roberto
>>
>>>
>>> --
>>> Frode Nordahl
>>>
>>> > > >
>>> > > >> ---
>>> > > >>  debian/rules | 2 +-
>>> > > >>  1 file changed, 1 insertion(+), 1 deletion(-)
>>> > > >>
>>> > > >> diff --git a/debian/rules b/debian/rules
>>> > > >> index dc5cc8a65..de8771813 100755
>>> > > >> --- a/debian/rules
>>> > > >> +++ b/debian/rules
>>> > > >> @@ -2,7 +2,7 @@
>>> > > >>  # -*- makefile -*-
>>> > > >>  #export DH_VERBOSE=1
>>> > > >>  export DEB_BUILD_MAINT_OPTIONS = hardening=+all
>>> > > >> -export DEB_CFLAGS_MAINT_APPEND = -fPIC
>>> > > >> +export DEB_CFLAGS_MAINT_APPEND = -fPIC -fno-lto
>>> > >
>>> >
>>> >
>>> > ‘Esta mensagem é direcionada apenas para os endereços constantes no
>>> cabeçalho inicial. Se você não está listado nos endereços constantes no
>>> cabeçalho, pedimos-lhe que desconsidere completamente o conteúdo dessa
>>> mensagem e cuja cópia, encaminhamento e/ou execução das ações citadas estão
>>> imediatamente anuladas e proibidas’.
>>> >
>>> >  ‘Apesar do Magazine Luiza tomar todas as precauções razoáveis para
>>> assegurar que nenhum vírus esteja presente nesse e-mail, a empresa não
>>> poderá aceitar a responsabilidade por quaisquer perdas ou danos causados
>>> por esse e-mail ou por seus anexos’.
>>>
>>
>>
>> *‘Esta mensagem é direcionada apenas para os endereços constantes no
>> cabeçalho inicial. Se você não está listado nos endereços constantes no
>> cabeçalho, pedimos-lhe que desconsidere completamente o conteúdo dessa
>> mensagem e cuja cópia, encaminhamento e/ou execução das ações citadas estão
>> imediatamente anuladas e proibidas’.*
>>
>>  *‘Apesar do Magazine Luiza tomar todas as precauções razoáveis para
>> assegurar que nenhum vírus esteja presente nesse e-mail, a empresa não
>> poderá aceitar a responsabilidade por quaisquer perdas ou danos causados
>> por esse e-mail ou por seus anexos’.*
>>
>
Roberto Bartzen Acosta Jan. 26, 2024, 12:04 p.m. UTC | #9
Em sex., 26 de jan. de 2024 às 04:37, Frode Nordahl <
frode.nordahl@canonical.com> escreveu:

> On Thu, Jan 25, 2024 at 10:44 PM Frode Nordahl
> <frode.nordahl@canonical.com> wrote:
> >
> >
> >
> > tor. 25. jan. 2024, 22:32 skrev Roberto Bartzen Acosta <
> roberto.acosta@luizalabs.com>:
> >>
> >>
> >>
> >> Em qui., 25 de jan. de 2024 às 17:01, Frode Nordahl <
> frode.nordahl@canonical.com> escreveu:
> >>>
> >>> Apologies for the tardy response to this thread, freeze deadlines and
> >>> PTO has prevented me from responding sooner.
> >>>
> >>> On Mon, Jan 15, 2024 at 12:14 PM Roberto Bartzen Acosta
> >>> <roberto.acosta@luizalabs.com> wrote:
> >>> >
> >>> >
> >>> >
> >>> > Em qua., 10 de jan. de 2024 às 15:37, Ilya Maximets <
> i.maximets@redhat.com> escreveu:
> >>> > >
> >>> > > On 1/9/24 13:53, Simon Horman wrote:
> >>> > > > On Mon, Jan 08, 2024 at 08:34:36AM -0300, Roberto Bartzen Acosta
> via dev wrote:
> >>> > > >> Current version of debian/rules simply uses the default lto GCC
> >>> > > >> optimization settings during the linkage process.
> >>> > > >>
> >>> > > >> The main problem with this approach is that GCC on OS like
> Ubuntu
> >>> > > >> Jammy, for example, can enable the -flto=auto option during the
> >>> > > >> openvswitch building and linking process. In this case, the
> linked
> >>> > > >> dynamic libraries would need to be builded based on the same lto
> >>> > > >> optimization options, at the risk of not working, according to
> >>> > > >> documentation [1].
> >>>
> >>> The documentation also says:
> >>> "When producing the final binary, GCC only applies link-time
> >>> optimizations to those files that contain bytecode. Therefore, you can
> >>> mix and match object files and libraries with GIMPLE bytecodes and
> >>> final object code. GCC automatically selects which files to optimize
> >>> in LTO mode and which files to link without further processing."
> >>>
> >>> Which to me reads like it is supported to mix LTO optimized and
> >>> non-optimized objects?
> >>
> >>
> >> Yeah, that makes sense but another important doc reference is "The
> important thing to keep in mind is that to enable link-time optimizations
> you need to use the GCC driver to perform the link step. GCC automatically
> performs link-time optimization if any of the objects involved were
> compiled with the -flto command-line option." . So, my assumption is that
> LTO was automatically enabled because some of the objects involved in
> compiling with the jammy dev environment support it.
> >>
> >> In that case, the LTO documentation suggests that we always used -flto
> with some optimization flag (e.g. -O2) both at compile time and at link
> time. So, AFAIU, you should link the code using exactly the same
> optimization flags used at compile time, and the statement about
> "automatically selects which files to optimize in LTO" is correct as long
> as it is possible to read the GIMPLE bytecode of the objects involved for
> the optimization option passed (in our case: jemalloc).  As detailed in the
> doc, the idea of -flto is to inject in the object files some internal
> (GIMPLE-bytecode) representation of the source code, and that is also used
> at link time because, for optimization steps, the inlining bytecode happens
> again when linking your whole program. The libjemalloc has its own
> headers/inline (or inlinable) functions that also need to use LTO with some
> optimization option (e.g. -O2) when compiling that library from its source
> code (and in that case its GIMPLE bytecode is stored in the library).
> >>
> >> In summary, my assumption is that, probably, libjemalloc installed on
> OS didn't generate the GIMPLE bytecode properly to be used in the complete
> build and linking process with the optimization option passed by
> openvswitch. In other words, the issue seems to be with the jemalloc lib in
> the OS Jammy.
> >
> >
> > If that is the case, should we not fix the jemalloc build instead?
> >
> >> So, why am I making a possible change to remove the LTO flag from the
> openvswitch build process?  because the linkage process is susceptible to
> OS-GCC decisions that can generate incoherent results in some cases.
> >>
> >>>
> >>>
> >>> > > >>
> >>> > > >> I'm not sure of the real benefits of using this link-time
> optimization
> >>> > >
> >>> > > At least for DPDK-enabled builds LTO usually provides noticeable
> >>> > > performance improvement.  In the past I measured up to 10% packet
> >>> > > rate increase with LTO.  Though the gap went a bit lower with
> modern
> >>> > > compilers.  I think, it should still provide noticeable performance
> >>> > > improvements at least for the userspace datapath.
> >>> > >
> >>> > > >> option, and since when it is enabled it causes problems with
> shared
> >>> > > >> libs link libjemalloc, for example, it seems safer overwritten
> >>> > > >> compiler decision by passing -fno-lto command.
> >>> > >
> >>> > > Building shared libraries with LTO is indeed a bit problematic.
> >>> > > I agree that correctness should have a priority over potential
> >>> > > performance benefits.
> >>> > >
> >>> > > Just out of curiosity, how do you add jemalloc into dependencies?
> >>> > > Using one of the environment variables?
> >>> >
> >>> > yeap, I'm just using the standard libs flag on a Ubuntu OS as
> suggested by the OpenvSwitch documentation (LIBS=-ljemalloc ) [1]
> >>> >
> >>> > override_dh_auto_configure:
> >>> > test -d _debian || mkdir _debian
> >>> > cd _debian && ( \
> >>> > test -e Makefile || \
> >>> > ../configure --prefix=/usr --localstatedir=/var --enable-ssl
> LIBS=-ljemalloc \
> >>> > --sysconfdir=/etc \
> >>> > $(DATAPATH_CONFIGURE_OPTS) \
> >>> > $(EXTRA_CONFIGURE_OPTS) \
> >>> > )
> >>> >
> >>> >
> >>> > [1] https://docs.openvswitch.org/en/latest/intro/install/general/
> >>>
> >>> This made me scratch my head, because in recent development efforts I
> >>> found the need to rebuild a package linking OVS to libunwind, which is
> >>> not the default for the package. And this worked fine, despite having
> >>> the default Ubuntu LTO settings applied. Furthermore both the
> >>> libunwind and libjemalloc packages appear to be built with the default
> >>> LTO settings as well.
> >>
> >>
> >> Please, could you check if you can build and link correctly with
> jemalloc in your environment?
> >
> >
> > It does not, and my point is that this appears to be specifically about
> jemalloc and not about shared libraries and LTO in general, which is why
> I'm arguing not to change the LTO settings.
> >
> >>
> >>>
> >>> > >
> >>> > > >>
> >>> > > >> [1]
> https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html#index-flto
> >>> > > >>
> >>> > > >> Reported-at:
> https://bugs.launchpad.net/ubuntu/+source/openvswitch/+bug/2015748
> >>> > > >> Signed-off-by: Roberto Bartzen Acosta <rbartzen@gmail.com>
> >>> > > >
> >>> > > > Hi,
> >>> > > >
> >>> > > > for one reason or another our automation was unable to apply
> this patch.
> >>> > > > But I have done so locally in my own tree (it is not upstream)
> >>> > > > and run the GitHub based tests with success:
> >>> > > >
> >>> > > > https://github.com/horms/ovs/actions/runs/7448358289
> >>> > > >
> >>> > > > From my point of view this patch seems reasonable.
> >>> > > >
> >>> > > > Acked-by: Simon Horman <horms@ovn.org>
> >>> > > >
> >>> > > > But I would be interested to hear feedback from others before
> applying it
> >>> > > > to the upstream tree.
> >>> > >
> >>> > > @Frode, do you have any thoughts on this change?  Is it a problem
> >>> > > also for the main (in-distribution) Ubuntu/Debian builds?
> >>>
> >>> TBH, I'm not convinced the LTO configuration is the root of the issue,
> >>> and I'd prefer if we left the LTO options alone and figured out what's
> >>> really going on here.
> >>
> >>
> >> I would really appreciate your help figuring this out ;)
> >
> >
> > Sure, let's try to figure it out.
>
> Building the package with:
>
>     EXTRA_CONFIGURE_OPTS='CFLAGS=-fno-builtin LIBS=-ljemalloc'
>
> appears to fix this for me, does that work for you?
>
>
yeah! Thank you!
I tested using a bit different export flags but with the same idea, and it
works!

So, my initial assumption about jemalloc issue with inline functions +
LTO seems to be right! However, enable buildin flag represents to use link
using Inline functions, instead of generating a function call. This saves a
function call and can be more optimized and useful for a large improvement
in terms of performances.

My question then would be what is the best approach: disable LTO or disable
built in...

What do you think about this?




export DEB_CFLAGS_MAINT_APPEND = -fPIC -fno-builtin
override_dh_auto_configure:
test -d _debian || mkdir _debian
cd _debian && ( \
test -e Makefile || \
../configure --prefix=/usr --localstatedir=/var --enable-ssl
LIBS=-ljemalloc \
--sysconfdir=/etc \
$(DATAPATH_CONFIGURE_OPTS) \
$(EXTRA_CONFIGURE_OPTS) \
)

readelf -a
./debian/openvswitch-switch/usr/lib/openvswitch-switch/ovs-vswitchd | grep
jemalloc -B 20 -A 5
   03     .init .plt .plt.got .plt.sec .text .fini
   04     .rodata .eh_frame_hdr .eh_frame
   05     .tdata .init_array .fini_array .data.rel.ro .dynamic .got .data
.bss
   06     .dynamic
   07     .note.gnu.property
   08     .note.gnu.build-id .note.ABI-tag
   09     .tdata .tbss
   10     .note.gnu.property
   11     .eh_frame_hdr
   12
   13     .tdata .init_array .fini_array .data.rel.ro .dynamic .got

Dynamic section at offset 0x26b348 contains 37 entries:
  Tag        Type                         Name/Value
 0x0000000000000001 (NEEDED)             Shared library: [libssl.so.3]
 0x0000000000000001 (NEEDED)             Shared library: [libcrypto.so.3]
 0x0000000000000001 (NEEDED)             Shared library: [libcap-ng.so.0]
 0x0000000000000001 (NEEDED)             Shared library: [libnuma.so.1]
 0x0000000000000001 (NEEDED)             Shared library: [libbpf.so.0]
 0x0000000000000001 (NEEDED)             Shared library: [libm.so.6]
 0x0000000000000001 (NEEDED)             Shared library: [libjemalloc.so.2]
 0x0000000000000001 (NEEDED)             Shared library: [libunbound.so.8]
 0x0000000000000001 (NEEDED)             Shared library: [libunwind.so.8]
 0x0000000000000001 (NEEDED)             Shared library: [libc.so.6]
 0x0000000000000001 (NEEDED)             Shared library:
[ld-linux-x86-64.so.2]
 0x000000000000000c (INIT)               0x22000



> --
> Frode Nordahl
>
> > --
> > Frode Nordahl
> >
> >> Thanks
> >> Roberto
> >>>
> >>>
> >>> --
> >>> Frode Nordahl
> >>>
> >>> > > >
> >>> > > >> ---
> >>> > > >>  debian/rules | 2 +-
> >>> > > >>  1 file changed, 1 insertion(+), 1 deletion(-)
> >>> > > >>
> >>> > > >> diff --git a/debian/rules b/debian/rules
> >>> > > >> index dc5cc8a65..de8771813 100755
> >>> > > >> --- a/debian/rules
> >>> > > >> +++ b/debian/rules
> >>> > > >> @@ -2,7 +2,7 @@
> >>> > > >>  # -*- makefile -*-
> >>> > > >>  #export DH_VERBOSE=1
> >>> > > >>  export DEB_BUILD_MAINT_OPTIONS = hardening=+all
> >>> > > >> -export DEB_CFLAGS_MAINT_APPEND = -fPIC
> >>> > > >> +export DEB_CFLAGS_MAINT_APPEND = -fPIC -fno-lto
> >>> > >
> >>> >
> >>> >
> >>> > ‘Esta mensagem é direcionada apenas para os endereços constantes no
> cabeçalho inicial. Se você não está listado nos endereços constantes no
> cabeçalho, pedimos-lhe que desconsidere completamente o conteúdo dessa
> mensagem e cuja cópia, encaminhamento e/ou execução das ações citadas estão
> imediatamente anuladas e proibidas’.
> >>> >
> >>> >  ‘Apesar do Magazine Luiza tomar todas as precauções razoáveis para
> assegurar que nenhum vírus esteja presente nesse e-mail, a empresa não
> poderá aceitar a responsabilidade por quaisquer perdas ou danos causados
> por esse e-mail ou por seus anexos’.
> >>
> >>
> >>
> >> ‘Esta mensagem é direcionada apenas para os endereços constantes no
> cabeçalho inicial. Se você não está listado nos endereços constantes no
> cabeçalho, pedimos-lhe que desconsidere completamente o conteúdo dessa
> mensagem e cuja cópia, encaminhamento e/ou execução das ações citadas estão
> imediatamente anuladas e proibidas’.
> >>
> >>  ‘Apesar do Magazine Luiza tomar todas as precauções razoáveis para
> assegurar que nenhum vírus esteja presente nesse e-mail, a empresa não
> poderá aceitar a responsabilidade por quaisquer perdas ou danos causados
> por esse e-mail ou por seus anexos’.
>
Frode Nordahl Jan. 26, 2024, 12:21 p.m. UTC | #10
fre. 26. jan. 2024, 13:03 skrev Roberto Bartzen Acosta <
roberto.acosta@luizalabs.com>:

>
>
> Em sex., 26 de jan. de 2024 às 04:37, Frode Nordahl <
> frode.nordahl@canonical.com> escreveu:
>
>> On Thu, Jan 25, 2024 at 10:44 PM Frode Nordahl
>> <frode.nordahl@canonical.com> wrote:
>> >
>> >
>> >
>> > tor. 25. jan. 2024, 22:32 skrev Roberto Bartzen Acosta <
>> roberto.acosta@luizalabs.com>:
>> >>
>> >>
>> >>
>> >> Em qui., 25 de jan. de 2024 às 17:01, Frode Nordahl <
>> frode.nordahl@canonical.com> escreveu:
>> >>>
>> >>> Apologies for the tardy response to this thread, freeze deadlines and
>> >>> PTO has prevented me from responding sooner.
>> >>>
>> >>> On Mon, Jan 15, 2024 at 12:14 PM Roberto Bartzen Acosta
>> >>> <roberto.acosta@luizalabs.com> wrote:
>> >>> >
>> >>> >
>> >>> >
>> >>> > Em qua., 10 de jan. de 2024 às 15:37, Ilya Maximets <
>> i.maximets@redhat.com> escreveu:
>> >>> > >
>> >>> > > On 1/9/24 13:53, Simon Horman wrote:
>> >>> > > > On Mon, Jan 08, 2024 at 08:34:36AM -0300, Roberto Bartzen
>> Acosta via dev wrote:
>> >>> > > >> Current version of debian/rules simply uses the default lto GCC
>> >>> > > >> optimization settings during the linkage process.
>> >>> > > >>
>> >>> > > >> The main problem with this approach is that GCC on OS like
>> Ubuntu
>> >>> > > >> Jammy, for example, can enable the -flto=auto option during the
>> >>> > > >> openvswitch building and linking process. In this case, the
>> linked
>> >>> > > >> dynamic libraries would need to be builded based on the same
>> lto
>> >>> > > >> optimization options, at the risk of not working, according to
>> >>> > > >> documentation [1].
>> >>>
>> >>> The documentation also says:
>> >>> "When producing the final binary, GCC only applies link-time
>> >>> optimizations to those files that contain bytecode. Therefore, you can
>> >>> mix and match object files and libraries with GIMPLE bytecodes and
>> >>> final object code. GCC automatically selects which files to optimize
>> >>> in LTO mode and which files to link without further processing."
>> >>>
>> >>> Which to me reads like it is supported to mix LTO optimized and
>> >>> non-optimized objects?
>> >>
>> >>
>> >> Yeah, that makes sense but another important doc reference is "The
>> important thing to keep in mind is that to enable link-time optimizations
>> you need to use the GCC driver to perform the link step. GCC automatically
>> performs link-time optimization if any of the objects involved were
>> compiled with the -flto command-line option." . So, my assumption is that
>> LTO was automatically enabled because some of the objects involved in
>> compiling with the jammy dev environment support it.
>> >>
>> >> In that case, the LTO documentation suggests that we always used -flto
>> with some optimization flag (e.g. -O2) both at compile time and at link
>> time. So, AFAIU, you should link the code using exactly the same
>> optimization flags used at compile time, and the statement about
>> "automatically selects which files to optimize in LTO" is correct as long
>> as it is possible to read the GIMPLE bytecode of the objects involved for
>> the optimization option passed (in our case: jemalloc).  As detailed in the
>> doc, the idea of -flto is to inject in the object files some internal
>> (GIMPLE-bytecode) representation of the source code, and that is also used
>> at link time because, for optimization steps, the inlining bytecode happens
>> again when linking your whole program. The libjemalloc has its own
>> headers/inline (or inlinable) functions that also need to use LTO with some
>> optimization option (e.g. -O2) when compiling that library from its source
>> code (and in that case its GIMPLE bytecode is stored in the library).
>> >>
>> >> In summary, my assumption is that, probably, libjemalloc installed on
>> OS didn't generate the GIMPLE bytecode properly to be used in the complete
>> build and linking process with the optimization option passed by
>> openvswitch. In other words, the issue seems to be with the jemalloc lib in
>> the OS Jammy.
>> >
>> >
>> > If that is the case, should we not fix the jemalloc build instead?
>> >
>> >> So, why am I making a possible change to remove the LTO flag from the
>> openvswitch build process?  because the linkage process is susceptible to
>> OS-GCC decisions that can generate incoherent results in some cases.
>> >>
>> >>>
>> >>>
>> >>> > > >>
>> >>> > > >> I'm not sure of the real benefits of using this link-time
>> optimization
>> >>> > >
>> >>> > > At least for DPDK-enabled builds LTO usually provides noticeable
>> >>> > > performance improvement.  In the past I measured up to 10% packet
>> >>> > > rate increase with LTO.  Though the gap went a bit lower with
>> modern
>> >>> > > compilers.  I think, it should still provide noticeable
>> performance
>> >>> > > improvements at least for the userspace datapath.
>> >>> > >
>> >>> > > >> option, and since when it is enabled it causes problems with
>> shared
>> >>> > > >> libs link libjemalloc, for example, it seems safer overwritten
>> >>> > > >> compiler decision by passing -fno-lto command.
>> >>> > >
>> >>> > > Building shared libraries with LTO is indeed a bit problematic.
>> >>> > > I agree that correctness should have a priority over potential
>> >>> > > performance benefits.
>> >>> > >
>> >>> > > Just out of curiosity, how do you add jemalloc into dependencies?
>> >>> > > Using one of the environment variables?
>> >>> >
>> >>> > yeap, I'm just using the standard libs flag on a Ubuntu OS as
>> suggested by the OpenvSwitch documentation (LIBS=-ljemalloc ) [1]
>> >>> >
>> >>> > override_dh_auto_configure:
>> >>> > test -d _debian || mkdir _debian
>> >>> > cd _debian && ( \
>> >>> > test -e Makefile || \
>> >>> > ../configure --prefix=/usr --localstatedir=/var --enable-ssl
>> LIBS=-ljemalloc \
>> >>> > --sysconfdir=/etc \
>> >>> > $(DATAPATH_CONFIGURE_OPTS) \
>> >>> > $(EXTRA_CONFIGURE_OPTS) \
>> >>> > )
>> >>> >
>> >>> >
>> >>> > [1] https://docs.openvswitch.org/en/latest/intro/install/general/
>> >>>
>> >>> This made me scratch my head, because in recent development efforts I
>> >>> found the need to rebuild a package linking OVS to libunwind, which is
>> >>> not the default for the package. And this worked fine, despite having
>> >>> the default Ubuntu LTO settings applied. Furthermore both the
>> >>> libunwind and libjemalloc packages appear to be built with the default
>> >>> LTO settings as well.
>> >>
>> >>
>> >> Please, could you check if you can build and link correctly with
>> jemalloc in your environment?
>> >
>> >
>> > It does not, and my point is that this appears to be specifically about
>> jemalloc and not about shared libraries and LTO in general, which is why
>> I'm arguing not to change the LTO settings.
>> >
>> >>
>> >>>
>> >>> > >
>> >>> > > >>
>> >>> > > >> [1]
>> https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html#index-flto
>> >>> > > >>
>> >>> > > >> Reported-at:
>> https://bugs.launchpad.net/ubuntu/+source/openvswitch/+bug/2015748
>> >>> > > >> Signed-off-by: Roberto Bartzen Acosta <rbartzen@gmail.com>
>> >>> > > >
>> >>> > > > Hi,
>> >>> > > >
>> >>> > > > for one reason or another our automation was unable to apply
>> this patch.
>> >>> > > > But I have done so locally in my own tree (it is not upstream)
>> >>> > > > and run the GitHub based tests with success:
>> >>> > > >
>> >>> > > > https://github.com/horms/ovs/actions/runs/7448358289
>> >>> > > >
>> >>> > > > From my point of view this patch seems reasonable.
>> >>> > > >
>> >>> > > > Acked-by: Simon Horman <horms@ovn.org>
>> >>> > > >
>> >>> > > > But I would be interested to hear feedback from others before
>> applying it
>> >>> > > > to the upstream tree.
>> >>> > >
>> >>> > > @Frode, do you have any thoughts on this change?  Is it a problem
>> >>> > > also for the main (in-distribution) Ubuntu/Debian builds?
>> >>>
>> >>> TBH, I'm not convinced the LTO configuration is the root of the issue,
>> >>> and I'd prefer if we left the LTO options alone and figured out what's
>> >>> really going on here.
>> >>
>> >>
>> >> I would really appreciate your help figuring this out ;)
>> >
>> >
>> > Sure, let's try to figure it out.
>>
>> Building the package with:
>>
>>     EXTRA_CONFIGURE_OPTS='CFLAGS=-fno-builtin LIBS=-ljemalloc'
>>
>> appears to fix this for me, does that work for you?
>>
>>
> yeah! Thank you!
> I tested using a bit different export flags but with the same idea, and it
> works!
>
> So, my initial assumption about jemalloc issue with inline functions +
> LTO seems to be right! However, enable buildin flag represents to use link
> using Inline functions, instead of generating a function call. This saves a
> function call and can be more optimized and useful for a large improvement
> in terms of performances.
>
> My question then would be what is the best approach: disable LTO or
> disable built in...
>
> What do you think about this?
>

It depends a bit on what your goal is. The default for the package will
remain using malloc from libc, for which the current defaults are fine,
meaning no change is necessary.

If you will be rebuilding the package anyway you're free to choose the path
that appears best to you.

To help future travelers we should probably mention this in the
documentation around the same place where LIBS=-ljemalloc is currently
mentioned.

Would you be interested in repurposing your patch submission for that?

PS: One alternative might be to patch in jemalloc at runtime without any
rebuilding using the LD_PRELOAD trick [2], have not confirmed whether it
works in this specific case.

2: https://github.com/jemalloc/jemalloc/wiki/Getting-Started

--
Frode Nordahl


>
> export DEB_CFLAGS_MAINT_APPEND = -fPIC -fno-builtin
> override_dh_auto_configure:
> test -d _debian || mkdir _debian
> cd _debian && ( \
> test -e Makefile || \
> ../configure --prefix=/usr --localstatedir=/var --enable-ssl
> LIBS=-ljemalloc \
> --sysconfdir=/etc \
> $(DATAPATH_CONFIGURE_OPTS) \
> $(EXTRA_CONFIGURE_OPTS) \
> )
>
> readelf -a
> ./debian/openvswitch-switch/usr/lib/openvswitch-switch/ovs-vswitchd | grep
> jemalloc -B 20 -A 5
>    03     .init .plt .plt.got .plt.sec .text .fini
>    04     .rodata .eh_frame_hdr .eh_frame
>    05     .tdata .init_array .fini_array .data.rel.ro .dynamic .got .data
> .bss
>    06     .dynamic
>    07     .note.gnu.property
>    08     .note.gnu.build-id .note.ABI-tag
>    09     .tdata .tbss
>    10     .note.gnu.property
>    11     .eh_frame_hdr
>    12
>    13     .tdata .init_array .fini_array .data.rel.ro .dynamic .got
>
> Dynamic section at offset 0x26b348 contains 37 entries:
>   Tag        Type                         Name/Value
>  0x0000000000000001 (NEEDED)             Shared library: [libssl.so.3]
>  0x0000000000000001 (NEEDED)             Shared library: [libcrypto.so.3]
>  0x0000000000000001 (NEEDED)             Shared library: [libcap-ng.so.0]
>  0x0000000000000001 (NEEDED)             Shared library: [libnuma.so.1]
>  0x0000000000000001 (NEEDED)             Shared library: [libbpf.so.0]
>  0x0000000000000001 (NEEDED)             Shared library: [libm.so.6]
>  0x0000000000000001 (NEEDED)             Shared library: [libjemalloc.so.2]
>  0x0000000000000001 (NEEDED)             Shared library: [libunbound.so.8]
>  0x0000000000000001 (NEEDED)             Shared library: [libunwind.so.8]
>  0x0000000000000001 (NEEDED)             Shared library: [libc.so.6]
>  0x0000000000000001 (NEEDED)             Shared library:
> [ld-linux-x86-64.so.2]
>  0x000000000000000c (INIT)               0x22000
>
>
>
>> --
>> Frode Nordahl
>>
>> > --
>> > Frode Nordahl
>> >
>> >> Thanks
>> >> Roberto
>> >>>
>> >>>
>> >>> --
>> >>> Frode Nordahl
>> >>>
>> >>> > > >
>> >>> > > >> ---
>> >>> > > >>  debian/rules | 2 +-
>> >>> > > >>  1 file changed, 1 insertion(+), 1 deletion(-)
>> >>> > > >>
>> >>> > > >> diff --git a/debian/rules b/debian/rules
>> >>> > > >> index dc5cc8a65..de8771813 100755
>> >>> > > >> --- a/debian/rules
>> >>> > > >> +++ b/debian/rules
>> >>> > > >> @@ -2,7 +2,7 @@
>> >>> > > >>  # -*- makefile -*-
>> >>> > > >>  #export DH_VERBOSE=1
>> >>> > > >>  export DEB_BUILD_MAINT_OPTIONS = hardening=+all
>> >>> > > >> -export DEB_CFLAGS_MAINT_APPEND = -fPIC
>> >>> > > >> +export DEB_CFLAGS_MAINT_APPEND = -fPIC -fno-lto
>> >>> > >
>> >>> >
>> >>> >
>> >>> > ‘Esta mensagem é direcionada apenas para os endereços constantes no
>> cabeçalho inicial. Se você não está listado nos endereços constantes no
>> cabeçalho, pedimos-lhe que desconsidere completamente o conteúdo dessa
>> mensagem e cuja cópia, encaminhamento e/ou execução das ações citadas estão
>> imediatamente anuladas e proibidas’.
>> >>> >
>> >>> >  ‘Apesar do Magazine Luiza tomar todas as precauções razoáveis para
>> assegurar que nenhum vírus esteja presente nesse e-mail, a empresa não
>> poderá aceitar a responsabilidade por quaisquer perdas ou danos causados
>> por esse e-mail ou por seus anexos’.
>> >>
>> >>
>> >>
>> >> ‘Esta mensagem é direcionada apenas para os endereços constantes no
>> cabeçalho inicial. Se você não está listado nos endereços constantes no
>> cabeçalho, pedimos-lhe que desconsidere completamente o conteúdo dessa
>> mensagem e cuja cópia, encaminhamento e/ou execução das ações citadas estão
>> imediatamente anuladas e proibidas’.
>> >>
>> >>  ‘Apesar do Magazine Luiza tomar todas as precauções razoáveis para
>> assegurar que nenhum vírus esteja presente nesse e-mail, a empresa não
>> poderá aceitar a responsabilidade por quaisquer perdas ou danos causados
>> por esse e-mail ou por seus anexos’.
>>
>
>
> *‘Esta mensagem é direcionada apenas para os endereços constantes no
> cabeçalho inicial. Se você não está listado nos endereços constantes no
> cabeçalho, pedimos-lhe que desconsidere completamente o conteúdo dessa
> mensagem e cuja cópia, encaminhamento e/ou execução das ações citadas estão
> imediatamente anuladas e proibidas’.*
>
>  *‘Apesar do Magazine Luiza tomar todas as precauções razoáveis para
> assegurar que nenhum vírus esteja presente nesse e-mail, a empresa não
> poderá aceitar a responsabilidade por quaisquer perdas ou danos causados
> por esse e-mail ou por seus anexos’.*
>
Roberto Bartzen Acosta Jan. 26, 2024, 12:51 p.m. UTC | #11
Em sex., 26 de jan. de 2024 às 09:21, Frode Nordahl <
frode.nordahl@canonical.com> escreveu:

>
>
> fre. 26. jan. 2024, 13:03 skrev Roberto Bartzen Acosta <
> roberto.acosta@luizalabs.com>:
>
>>
>>
>> Em sex., 26 de jan. de 2024 às 04:37, Frode Nordahl <
>> frode.nordahl@canonical.com> escreveu:
>>
>>> On Thu, Jan 25, 2024 at 10:44 PM Frode Nordahl
>>> <frode.nordahl@canonical.com> wrote:
>>> >
>>> >
>>> >
>>> > tor. 25. jan. 2024, 22:32 skrev Roberto Bartzen Acosta <
>>> roberto.acosta@luizalabs.com>:
>>> >>
>>> >>
>>> >>
>>> >> Em qui., 25 de jan. de 2024 às 17:01, Frode Nordahl <
>>> frode.nordahl@canonical.com> escreveu:
>>> >>>
>>> >>> Apologies for the tardy response to this thread, freeze deadlines and
>>> >>> PTO has prevented me from responding sooner.
>>> >>>
>>> >>> On Mon, Jan 15, 2024 at 12:14 PM Roberto Bartzen Acosta
>>> >>> <roberto.acosta@luizalabs.com> wrote:
>>> >>> >
>>> >>> >
>>> >>> >
>>> >>> > Em qua., 10 de jan. de 2024 às 15:37, Ilya Maximets <
>>> i.maximets@redhat.com> escreveu:
>>> >>> > >
>>> >>> > > On 1/9/24 13:53, Simon Horman wrote:
>>> >>> > > > On Mon, Jan 08, 2024 at 08:34:36AM -0300, Roberto Bartzen
>>> Acosta via dev wrote:
>>> >>> > > >> Current version of debian/rules simply uses the default lto
>>> GCC
>>> >>> > > >> optimization settings during the linkage process.
>>> >>> > > >>
>>> >>> > > >> The main problem with this approach is that GCC on OS like
>>> Ubuntu
>>> >>> > > >> Jammy, for example, can enable the -flto=auto option during
>>> the
>>> >>> > > >> openvswitch building and linking process. In this case, the
>>> linked
>>> >>> > > >> dynamic libraries would need to be builded based on the same
>>> lto
>>> >>> > > >> optimization options, at the risk of not working, according to
>>> >>> > > >> documentation [1].
>>> >>>
>>> >>> The documentation also says:
>>> >>> "When producing the final binary, GCC only applies link-time
>>> >>> optimizations to those files that contain bytecode. Therefore, you
>>> can
>>> >>> mix and match object files and libraries with GIMPLE bytecodes and
>>> >>> final object code. GCC automatically selects which files to optimize
>>> >>> in LTO mode and which files to link without further processing."
>>> >>>
>>> >>> Which to me reads like it is supported to mix LTO optimized and
>>> >>> non-optimized objects?
>>> >>
>>> >>
>>> >> Yeah, that makes sense but another important doc reference is "The
>>> important thing to keep in mind is that to enable link-time optimizations
>>> you need to use the GCC driver to perform the link step. GCC automatically
>>> performs link-time optimization if any of the objects involved were
>>> compiled with the -flto command-line option." . So, my assumption is that
>>> LTO was automatically enabled because some of the objects involved in
>>> compiling with the jammy dev environment support it.
>>> >>
>>> >> In that case, the LTO documentation suggests that we always used
>>> -flto with some optimization flag (e.g. -O2) both at compile time and at
>>> link time. So, AFAIU, you should link the code using exactly the same
>>> optimization flags used at compile time, and the statement about
>>> "automatically selects which files to optimize in LTO" is correct as long
>>> as it is possible to read the GIMPLE bytecode of the objects involved for
>>> the optimization option passed (in our case: jemalloc).  As detailed in the
>>> doc, the idea of -flto is to inject in the object files some internal
>>> (GIMPLE-bytecode) representation of the source code, and that is also used
>>> at link time because, for optimization steps, the inlining bytecode happens
>>> again when linking your whole program. The libjemalloc has its own
>>> headers/inline (or inlinable) functions that also need to use LTO with some
>>> optimization option (e.g. -O2) when compiling that library from its source
>>> code (and in that case its GIMPLE bytecode is stored in the library).
>>> >>
>>> >> In summary, my assumption is that, probably, libjemalloc installed on
>>> OS didn't generate the GIMPLE bytecode properly to be used in the complete
>>> build and linking process with the optimization option passed by
>>> openvswitch. In other words, the issue seems to be with the jemalloc lib in
>>> the OS Jammy.
>>> >
>>> >
>>> > If that is the case, should we not fix the jemalloc build instead?
>>> >
>>> >> So, why am I making a possible change to remove the LTO flag from the
>>> openvswitch build process?  because the linkage process is susceptible to
>>> OS-GCC decisions that can generate incoherent results in some cases.
>>> >>
>>> >>>
>>> >>>
>>> >>> > > >>
>>> >>> > > >> I'm not sure of the real benefits of using this link-time
>>> optimization
>>> >>> > >
>>> >>> > > At least for DPDK-enabled builds LTO usually provides noticeable
>>> >>> > > performance improvement.  In the past I measured up to 10% packet
>>> >>> > > rate increase with LTO.  Though the gap went a bit lower with
>>> modern
>>> >>> > > compilers.  I think, it should still provide noticeable
>>> performance
>>> >>> > > improvements at least for the userspace datapath.
>>> >>> > >
>>> >>> > > >> option, and since when it is enabled it causes problems with
>>> shared
>>> >>> > > >> libs link libjemalloc, for example, it seems safer overwritten
>>> >>> > > >> compiler decision by passing -fno-lto command.
>>> >>> > >
>>> >>> > > Building shared libraries with LTO is indeed a bit problematic.
>>> >>> > > I agree that correctness should have a priority over potential
>>> >>> > > performance benefits.
>>> >>> > >
>>> >>> > > Just out of curiosity, how do you add jemalloc into dependencies?
>>> >>> > > Using one of the environment variables?
>>> >>> >
>>> >>> > yeap, I'm just using the standard libs flag on a Ubuntu OS as
>>> suggested by the OpenvSwitch documentation (LIBS=-ljemalloc ) [1]
>>> >>> >
>>> >>> > override_dh_auto_configure:
>>> >>> > test -d _debian || mkdir _debian
>>> >>> > cd _debian && ( \
>>> >>> > test -e Makefile || \
>>> >>> > ../configure --prefix=/usr --localstatedir=/var --enable-ssl
>>> LIBS=-ljemalloc \
>>> >>> > --sysconfdir=/etc \
>>> >>> > $(DATAPATH_CONFIGURE_OPTS) \
>>> >>> > $(EXTRA_CONFIGURE_OPTS) \
>>> >>> > )
>>> >>> >
>>> >>> >
>>> >>> > [1] https://docs.openvswitch.org/en/latest/intro/install/general/
>>> >>>
>>> >>> This made me scratch my head, because in recent development efforts I
>>> >>> found the need to rebuild a package linking OVS to libunwind, which
>>> is
>>> >>> not the default for the package. And this worked fine, despite having
>>> >>> the default Ubuntu LTO settings applied. Furthermore both the
>>> >>> libunwind and libjemalloc packages appear to be built with the
>>> default
>>> >>> LTO settings as well.
>>> >>
>>> >>
>>> >> Please, could you check if you can build and link correctly with
>>> jemalloc in your environment?
>>> >
>>> >
>>> > It does not, and my point is that this appears to be specifically
>>> about jemalloc and not about shared libraries and LTO in general, which is
>>> why I'm arguing not to change the LTO settings.
>>> >
>>> >>
>>> >>>
>>> >>> > >
>>> >>> > > >>
>>> >>> > > >> [1]
>>> https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html#index-flto
>>> >>> > > >>
>>> >>> > > >> Reported-at:
>>> https://bugs.launchpad.net/ubuntu/+source/openvswitch/+bug/2015748
>>> >>> > > >> Signed-off-by: Roberto Bartzen Acosta <rbartzen@gmail.com>
>>> >>> > > >
>>> >>> > > > Hi,
>>> >>> > > >
>>> >>> > > > for one reason or another our automation was unable to apply
>>> this patch.
>>> >>> > > > But I have done so locally in my own tree (it is not upstream)
>>> >>> > > > and run the GitHub based tests with success:
>>> >>> > > >
>>> >>> > > > https://github.com/horms/ovs/actions/runs/7448358289
>>> >>> > > >
>>> >>> > > > From my point of view this patch seems reasonable.
>>> >>> > > >
>>> >>> > > > Acked-by: Simon Horman <horms@ovn.org>
>>> >>> > > >
>>> >>> > > > But I would be interested to hear feedback from others before
>>> applying it
>>> >>> > > > to the upstream tree.
>>> >>> > >
>>> >>> > > @Frode, do you have any thoughts on this change?  Is it a problem
>>> >>> > > also for the main (in-distribution) Ubuntu/Debian builds?
>>> >>>
>>> >>> TBH, I'm not convinced the LTO configuration is the root of the
>>> issue,
>>> >>> and I'd prefer if we left the LTO options alone and figured out
>>> what's
>>> >>> really going on here.
>>> >>
>>> >>
>>> >> I would really appreciate your help figuring this out ;)
>>> >
>>> >
>>> > Sure, let's try to figure it out.
>>>
>>> Building the package with:
>>>
>>>     EXTRA_CONFIGURE_OPTS='CFLAGS=-fno-builtin LIBS=-ljemalloc'
>>>
>>> appears to fix this for me, does that work for you?
>>>
>>>
>> yeah! Thank you!
>> I tested using a bit different export flags but with the same idea, and
>> it works!
>>
>> So, my initial assumption about jemalloc issue with inline functions +
>> LTO seems to be right! However, enable buildin flag represents to use link
>> using Inline functions, instead of generating a function call. This saves a
>> function call and can be more optimized and useful for a large improvement
>> in terms of performances.
>>
>> My question then would be what is the best approach: disable LTO or
>> disable built in...
>>
>> What do you think about this?
>>
>
> It depends a bit on what your goal is. The default for the package will
> remain using malloc from libc, for which the current defaults are fine,
> meaning no change is necessary.
>
> If you will be rebuilding the package anyway you're free to choose the
> path that appears best to you.
>
> To help future travelers we should probably mention this in the
> documentation around the same place where LIBS=-ljemalloc is currently
> mentioned.
>
> Would you be interested in repurposing your patch submission for that?
>

Sure, will do!
I'll just update the documentation by mentioning the additional flags that
may be required to link with jemalloc.


>
> PS: One alternative might be to patch in jemalloc at runtime without any
> rebuilding using the LD_PRELOAD trick [2], have not confirmed whether it
> works in this specific case.
>
> 2: https://github.com/jemalloc/jemalloc/wiki/Getting-Started
>
> --
> Frode Nordahl
>
>
>>
>> export DEB_CFLAGS_MAINT_APPEND = -fPIC -fno-builtin
>> override_dh_auto_configure:
>> test -d _debian || mkdir _debian
>> cd _debian && ( \
>> test -e Makefile || \
>> ../configure --prefix=/usr --localstatedir=/var --enable-ssl
>> LIBS=-ljemalloc \
>> --sysconfdir=/etc \
>> $(DATAPATH_CONFIGURE_OPTS) \
>> $(EXTRA_CONFIGURE_OPTS) \
>> )
>>
>> readelf -a
>> ./debian/openvswitch-switch/usr/lib/openvswitch-switch/ovs-vswitchd | grep
>> jemalloc -B 20 -A 5
>>    03     .init .plt .plt.got .plt.sec .text .fini
>>    04     .rodata .eh_frame_hdr .eh_frame
>>    05     .tdata .init_array .fini_array .data.rel.ro .dynamic .got
>> .data .bss
>>    06     .dynamic
>>    07     .note.gnu.property
>>    08     .note.gnu.build-id .note.ABI-tag
>>    09     .tdata .tbss
>>    10     .note.gnu.property
>>    11     .eh_frame_hdr
>>    12
>>    13     .tdata .init_array .fini_array .data.rel.ro .dynamic .got
>>
>> Dynamic section at offset 0x26b348 contains 37 entries:
>>   Tag        Type                         Name/Value
>>  0x0000000000000001 (NEEDED)             Shared library: [libssl.so.3]
>>  0x0000000000000001 (NEEDED)             Shared library: [libcrypto.so.3]
>>  0x0000000000000001 (NEEDED)             Shared library: [libcap-ng.so.0]
>>  0x0000000000000001 (NEEDED)             Shared library: [libnuma.so.1]
>>  0x0000000000000001 (NEEDED)             Shared library: [libbpf.so.0]
>>  0x0000000000000001 (NEEDED)             Shared library: [libm.so.6]
>>  0x0000000000000001 (NEEDED)             Shared library:
>> [libjemalloc.so.2]
>>  0x0000000000000001 (NEEDED)             Shared library: [libunbound.so.8]
>>  0x0000000000000001 (NEEDED)             Shared library: [libunwind.so.8]
>>  0x0000000000000001 (NEEDED)             Shared library: [libc.so.6]
>>  0x0000000000000001 (NEEDED)             Shared library:
>> [ld-linux-x86-64.so.2]
>>  0x000000000000000c (INIT)               0x22000
>>
>>
>>
>>> --
>>> Frode Nordahl
>>>
>>> > --
>>> > Frode Nordahl
>>> >
>>> >> Thanks
>>> >> Roberto
>>> >>>
>>> >>>
>>> >>> --
>>> >>> Frode Nordahl
>>> >>>
>>> >>> > > >
>>> >>> > > >> ---
>>> >>> > > >>  debian/rules | 2 +-
>>> >>> > > >>  1 file changed, 1 insertion(+), 1 deletion(-)
>>> >>> > > >>
>>> >>> > > >> diff --git a/debian/rules b/debian/rules
>>> >>> > > >> index dc5cc8a65..de8771813 100755
>>> >>> > > >> --- a/debian/rules
>>> >>> > > >> +++ b/debian/rules
>>> >>> > > >> @@ -2,7 +2,7 @@
>>> >>> > > >>  # -*- makefile -*-
>>> >>> > > >>  #export DH_VERBOSE=1
>>> >>> > > >>  export DEB_BUILD_MAINT_OPTIONS = hardening=+all
>>> >>> > > >> -export DEB_CFLAGS_MAINT_APPEND = -fPIC
>>> >>> > > >> +export DEB_CFLAGS_MAINT_APPEND = -fPIC -fno-lto
>>> >>> > >
>>> >>> >
>>> >>> >
>>> >>> > ‘Esta mensagem é direcionada apenas para os endereços constantes
>>> no cabeçalho inicial. Se você não está listado nos endereços constantes no
>>> cabeçalho, pedimos-lhe que desconsidere completamente o conteúdo dessa
>>> mensagem e cuja cópia, encaminhamento e/ou execução das ações citadas estão
>>> imediatamente anuladas e proibidas’.
>>> >>> >
>>> >>> >  ‘Apesar do Magazine Luiza tomar todas as precauções razoáveis
>>> para assegurar que nenhum vírus esteja presente nesse e-mail, a empresa não
>>> poderá aceitar a responsabilidade por quaisquer perdas ou danos causados
>>> por esse e-mail ou por seus anexos’.
>>> >>
>>> >>
>>> >>
>>> >> ‘Esta mensagem é direcionada apenas para os endereços constantes no
>>> cabeçalho inicial. Se você não está listado nos endereços constantes no
>>> cabeçalho, pedimos-lhe que desconsidere completamente o conteúdo dessa
>>> mensagem e cuja cópia, encaminhamento e/ou execução das ações citadas estão
>>> imediatamente anuladas e proibidas’.
>>> >>
>>> >>  ‘Apesar do Magazine Luiza tomar todas as precauções razoáveis para
>>> assegurar que nenhum vírus esteja presente nesse e-mail, a empresa não
>>> poderá aceitar a responsabilidade por quaisquer perdas ou danos causados
>>> por esse e-mail ou por seus anexos’.
>>>
>>
>>
>> *‘Esta mensagem é direcionada apenas para os endereços constantes no
>> cabeçalho inicial. Se você não está listado nos endereços constantes no
>> cabeçalho, pedimos-lhe que desconsidere completamente o conteúdo dessa
>> mensagem e cuja cópia, encaminhamento e/ou execução das ações citadas estão
>> imediatamente anuladas e proibidas’.*
>>
>>  *‘Apesar do Magazine Luiza tomar todas as precauções razoáveis para
>> assegurar que nenhum vírus esteja presente nesse e-mail, a empresa não
>> poderá aceitar a responsabilidade por quaisquer perdas ou danos causados
>> por esse e-mail ou por seus anexos’.*
>>
>
diff mbox series

Patch

diff --git a/debian/rules b/debian/rules
index dc5cc8a65..de8771813 100755
--- a/debian/rules
+++ b/debian/rules
@@ -2,7 +2,7 @@ 
 # -*- makefile -*-
 #export DH_VERBOSE=1
 export DEB_BUILD_MAINT_OPTIONS = hardening=+all
-export DEB_CFLAGS_MAINT_APPEND = -fPIC
+export DEB_CFLAGS_MAINT_APPEND = -fPIC -fno-lto

 %:
  dh $@