makefile: detect corrupted elf files

Message ID 20130521214645.GA8863@redhat.com
State New

Commit Message

Michael S. Tsirkin May 21, 2013, 9:46 p.m. UTC
Once in a while make gets killed and doesn't
clean up partial object files after it.
The result is nasty errors from the linker.
This hack checks that each object is well formed before linking,
and rebuilds it if not.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
---

Is below useful for others?

 Makefile.target | 7 +++++++
 1 file changed, 7 insertions(+)

Comments

Peter Maydell May 21, 2013, 10:01 p.m. UTC | #1
On 21 May 2013 22:46, Michael S. Tsirkin <mst@redhat.com> wrote:
> Once in a while make gets killed and doesn't
> clean up partial object files after it.
> Result is nasty errors from link.
> This hack checks object is well formed before linking,
> and rebuilds it if not.
>
> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> ---
>
> Is below useful for others?

Seems to me like this is just working around a make bug:
it is supposed to delete the partial object if it gets
killed.

> +$(all-obj-y): % : $$(if $$(shell size %), , CORRUPTBINARY)

If we do do this, we probably ought to be running the
cross-prefix version of size, not the host version.

thanks
-- PMM
Michael S. Tsirkin May 21, 2013, 10:09 p.m. UTC | #2
On Tue, May 21, 2013 at 11:01:05PM +0100, Peter Maydell wrote:
> On 21 May 2013 22:46, Michael S. Tsirkin <mst@redhat.com> wrote:
> > Once in a while make gets killed and doesn't
> > clean up partial object files after it.
> > Result is nasty errors from link.
> > This hack checks object is well formed before linking,
> > and rebuilds it if not.
> >
> > Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> > ---
> >
> > Is below useful for others?
> 
> Seems to me like this is just working around a make bug:
> it is supposed to delete the partial object if it gets
> killed.

It can't if it gets killed by kill -9 or e.g. OOM killer
(or OS reboot).

> > +$(all-obj-y): % : $$(if $$(shell size %), , CORRUPTBINARY)
> 
> If we do do this, we probably ought to be running the
> cross-prefix version of size, not the host version.

Good point, thanks.

> thanks
> -- PMM
Markus Armbruster May 22, 2013, 7:44 a.m. UTC | #3
"Michael S. Tsirkin" <mst@redhat.com> writes:

> On Tue, May 21, 2013 at 11:01:05PM +0100, Peter Maydell wrote:
>> On 21 May 2013 22:46, Michael S. Tsirkin <mst@redhat.com> wrote:
>> > Once in a while make gets killed and doesn't
>> > clean up partial object files after it.
>> > Result is nasty errors from link.
>> > This hack checks object is well formed before linking,
>> > and rebuilds it if not.
>> >
>> > Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
>> > ---
>> >
>> > Is below useful for others?
>> 
>> Seems to me like this is just working around a make bug:
>> it is supposed to delete the partial object if it gets
>> killed.
>
> It can't if it gets killed by kill -9 or e.g. OOM killer
> (or OS reboot).

Any generated file could be truncated then, not just objects.

If you abort a build with kill -9 or equivalent, you blow away the build
tree and start over.  A sufficiently unlucky truncation could still
build, but not work.

[...]
Michael S. Tsirkin May 22, 2013, 8:37 a.m. UTC | #4
On Wed, May 22, 2013 at 09:44:04AM +0200, Markus Armbruster wrote:
> "Michael S. Tsirkin" <mst@redhat.com> writes:
> 
> > On Tue, May 21, 2013 at 11:01:05PM +0100, Peter Maydell wrote:
> >> On 21 May 2013 22:46, Michael S. Tsirkin <mst@redhat.com> wrote:
> >> > Once in a while make gets killed and doesn't
> >> > clean up partial object files after it.
> >> > Result is nasty errors from link.
> >> > This hack checks object is well formed before linking,
> >> > and rebuilds it if not.
> >> >
> >> > Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> >> > ---
> >> >
> >> > Is below useful for others?
> >> 
> >> Seems to me like this is just working around a make bug:
> >> it is supposed to delete the partial object if it gets
> >> killed.
> >
> > It can't if it gets killed by kill -9 or e.g. OOM killer
> > (or OS reboot).
> 
> Any generated file could be truncated then, not just objects.
> 
> If you abort a build with kill -9 or equivalent, you blow away the build
> tree and start over.  A sufficiently unlucky truncation could still
> build, but not work.
> 
> [...]

At the moment, true.
It's my fault for running -rc kernels all the time I guess, I get
crashes kind of often, and losing more time on make clean
on top of reboot annoys me.
But we actually could make it robust, even against OS crash. Output to a
temporary file then rename.  This hack won't be needed then.

Are people interested? If yes I can implement this.
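
A minimal sketch of the "output to a temporary file then rename" idea, for a
generator that writes to stdout. The helper name write_atomically and the
".tmp.$$" suffix are illustrative assumptions, not part of any posted patch.
The rename is atomic with respect to other processes looking at the
destination; whether the data survives a power loss still depends on the
filesystem, which this sketch does not address.

```sh
#!/bin/sh
# Hypothetical helper: run a command with stdout captured in a temporary
# file next to the destination, and rename it into place only on success.
# Note: plain POSIX sh has no local variables, so dest/tmp/status are global.
write_atomically() {
    dest=$1; shift
    # Keep the temp file in the same directory as the destination so the
    # final mv is a rename within one filesystem.
    tmp="$dest.tmp.$$"
    if "$@" > "$tmp"; then
        mv -f "$tmp" "$dest"
    else
        status=$?
        rm -f "$tmp"          # never leave a partial file behind
        return "$status"
    fi
}

# Example use: write_atomically out.h printf '#define X 1\n'
```

If the command fails, the previous contents of the destination (if any) are
left untouched, so make never sees a half-written target under the final name.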
Peter Maydell May 22, 2013, 8:38 a.m. UTC | #5
On 22 May 2013 09:37, Michael S. Tsirkin <mst@redhat.com> wrote:
> It's my fault for running -rc kernels all the time I guess, I get
> crashes kind of often, and losing more time on make clean
> on top of reboot annoys me.
> But we actually could make it robust, even against OS crash. Output to a
> temporary file then rename.  This hack won't be needed then.

I think that would be better implemented in the compiler/linker :-)
Alternatively, stop doing compiles on horribly unstable kernels.

-- PMM
Paolo Bonzini May 22, 2013, 8:43 a.m. UTC | #6
Il 22/05/2013 10:38, Peter Maydell ha scritto:
> On 22 May 2013 09:37, Michael S. Tsirkin <mst@redhat.com> wrote:
>> It's my fault for running -rc kernels all the time I guess, I get
>> crashes kind of often, and losing more time on make clean
>> on top of reboot annoys me.
>> But we actually could make it robust, even against OS crash. Output to a
>> temporary file then rename.  This hack won't be needed then.
> 
> I think that would be better implemented in the compiler/linker :-)
> Alternatively, stop doing compiles on horribly unstable kernels.

Any filesystem with delayed writes can do this if you have a power loss.

But I agree that this patch doesn't solve the problem.  For example, if
you get stale files in the ccache directory even zapping the build
directory won't do.

Paolo
Michael S. Tsirkin May 22, 2013, 8:46 a.m. UTC | #7
On Wed, May 22, 2013 at 09:38:39AM +0100, Peter Maydell wrote:
> On 22 May 2013 09:37, Michael S. Tsirkin <mst@redhat.com> wrote:
> > It's my fault for running -rc kernels all the time I guess, I get
> > crashes kind of often, and losing more time on make clean
> > on top of reboot annoys me.
> > But we actually could make it robust, even against OS crash. Output to a
> > temporary file then rename.  This hack won't be needed then.
> 
> I think that would be better implemented in the compiler/linker :-)

Ow so you mean you want me to run -rc compiler/linker too?

In any case we are generating quite a bit of
code by ourselves, this would need handling,
we can do it uniformly.

> Alternatively, stop doing compiles on horribly unstable kernels.
> 
> -- PMM

Well I do lots of kernel development so I dislike the alternatives:
- Compile on a stable kernel, reboot, test, reboot back to stable kernel
- Compile on another box and copy bits over
Michael S. Tsirkin May 22, 2013, 8:52 a.m. UTC | #8
On Wed, May 22, 2013 at 10:43:45AM +0200, Paolo Bonzini wrote:
> Il 22/05/2013 10:38, Peter Maydell ha scritto:
> > On 22 May 2013 09:37, Michael S. Tsirkin <mst@redhat.com> wrote:
> >> It's my fault for running -rc kernels all the time I guess, I get
> >> crashes kind of often, and losing more time on make clean
> >> on top of reboot annoys me.
> >> But we actually could make it robust, even against OS crash. Output to a
> >> temporary file then rename.  This hack won't be needed then.
> > 
> > I think that would be better implemented in the compiler/linker :-)
> > Alternatively, stop doing compiles on horribly unstable kernels.
> 
> Any filesystem with delayed writes can do this if you have a power loss.
> 
> But I agree that this patch doesn't solve the problem.  For example, if
> you get stale files in the ccache directory even zapping the build
> directory won't do.
> 
> Paolo

The fix is simple here: don't use ccache.  I don't.

In fact, from what I saw people use ccache to work around makefile bugs,
so they can do make clean; make and have it finish quickly.

Any other examples?
Paolo Bonzini May 22, 2013, 9:22 a.m. UTC | #9
Il 22/05/2013 10:52, Michael S. Tsirkin ha scritto:
> The fix is simple here: don't use ccache.  I don't.
> 
> In fact, from what I saw people use ccache to work around makefile bugs,
> so they can do make clean; make and have it finish quickly.
> 
> Any other examples?

Testing configure patches should be done (also) from a clean build
directory, for example.

Paolo
Michael S. Tsirkin May 22, 2013, 9:42 a.m. UTC | #10
On Wed, May 22, 2013 at 11:22:52AM +0200, Paolo Bonzini wrote:
> Il 22/05/2013 10:52, Michael S. Tsirkin ha scritto:
> > The fix is simple here: don't use ccache.  I don't.
> > 
> > In fact, from what I saw people use ccache to work around makefile bugs,
> > so they can do make clean; make and have it finish quickly.
> > 
> > Any other examples?
> 
> Testing configure patches should be done (also) from a clean build
> directory, for example.
> 
> Paolo

In fact, relying on make clean for testing the build
system is a mistake.  It's easy for it to forget to
remove some temporary file. You really should do
a clean clone.

But testing is a completely separate issue IMO,
I'm not trying to fix that, just reduce the chance
of a failed or corrupted build.
Stefan Hajnoczi May 22, 2013, 9:48 a.m. UTC | #11
On Wed, May 22, 2013 at 12:46:45AM +0300, Michael S. Tsirkin wrote:
> Once in a while make gets killed and doesn't
> clean up partial object files after it.
> Result is nasty errors from link.
> This hack checks object is well formed before linking,
> and rebuilds it if not.
> 
> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> ---
> 
> Is below useful for others?
> 
>  Makefile.target | 7 +++++++
>  1 file changed, 7 insertions(+)
> 
> diff --git a/Makefile.target b/Makefile.target
> index ce4391f..4dddee5 100644
> --- a/Makefile.target
> +++ b/Makefile.target
> @@ -191,3 +191,10 @@ endif
>  
>  GENERATED_HEADERS += config-target.h
>  Makefile: $(GENERATED_HEADERS)
> +
> +.SECONDEXPANSION:
> +
> +.PHONY: CORRUPTBINARY
> +
> +$(all-obj-y): % : $$(if $$(shell size %), , CORRUPTBINARY)

How does size(1) establish the validity of the ELF file?  Is it possible
to sneak past a truncated file (which I think is the only type of
corruption you're trying to protect against)?

Stefan
Michael S. Tsirkin May 22, 2013, 10 a.m. UTC | #12
On Wed, May 22, 2013 at 11:48:54AM +0200, Stefan Hajnoczi wrote:
> On Wed, May 22, 2013 at 12:46:45AM +0300, Michael S. Tsirkin wrote:
> > Once in a while make gets killed and doesn't
> > clean up partial object files after it.
> > Result is nasty errors from link.
> > This hack checks object is well formed before linking,
> > and rebuilds it if not.
> > 
> > Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> > ---
> > 
> > Is below useful for others?
> > 
> >  Makefile.target | 7 +++++++
> >  1 file changed, 7 insertions(+)
> > 
> > diff --git a/Makefile.target b/Makefile.target
> > index ce4391f..4dddee5 100644
> > --- a/Makefile.target
> > +++ b/Makefile.target
> > @@ -191,3 +191,10 @@ endif
> >  
> >  GENERATED_HEADERS += config-target.h
> >  Makefile: $(GENERATED_HEADERS)
> > +
> > +.SECONDEXPANSION:
> > +
> > +.PHONY: CORRUPTBINARY
> > +
> > +$(all-obj-y): % : $$(if $$(shell size %), , CORRUPTBINARY)
> 
> How does size(1) establish the validity of the ELF file?  Is it possible
> to sneak past a truncated file (which I think is the only type of
> corruption you're trying to protect against)?
> 
> Stefan

It just parses the header, so of course it is.
But it does seem to catch the common failure
scenarios for me.
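
As a rough illustration of why any check that only looks at the start of a
file cannot rule out truncation, here is a sketch of a header-only test. This
is not how size(1) is implemented; has_elf_magic is a hypothetical stand-in
that looks at the four ELF magic bytes and nothing else.

```sh
#!/bin/sh
# Hypothetical check: succeed if the first four bytes are the ELF magic
# (0x7f 'E' 'L' 'F'). Nothing beyond those four bytes is inspected.
has_elf_magic() {
    magic=$(od -An -tx1 -N4 "$1" 2>/dev/null | tr -d ' \n')
    [ "$magic" = "7f454c46" ]
}

# Example: a four-byte file that is obviously not a usable object
# still passes, because only the magic is examined.
#   printf '\177ELF' > truncated.o
#   has_elf_magic truncated.o && echo "passes the header check"
```

A check like this catches empty files and files filled with garbage from the
first byte, but a file cut off anywhere after the magic still looks valid.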
Paolo Bonzini May 22, 2013, 10:40 a.m. UTC | #13
Il 22/05/2013 11:42, Michael S. Tsirkin ha scritto:
> On Wed, May 22, 2013 at 11:22:52AM +0200, Paolo Bonzini wrote:
>> Il 22/05/2013 10:52, Michael S. Tsirkin ha scritto:
>>> The fix is simple here: don't use ccache.  I don't.
>>>
>>> In fact, from what I saw people use ccache to work around makefile bugs,
>>> so they can do make clean; make and have it finish quickly.
>>>
>>> Any other examples?
>>
>> Testing configure patches should be done (also) from a clean build
>> directory, for example.
>>
>> Paolo
> 
> In fact, relying on make clean for testing the build
> system is a mistake.  It's easy for it to forget to
> remove some temporary file. You really should do
> a clean clone.

Yes, I use a clean clone (and a clean build directory for each patch),
_hence_ ccache helps reduce test times.

Paolo
Michael S. Tsirkin May 22, 2013, 10:50 a.m. UTC | #14
On Wed, May 22, 2013 at 12:40:23PM +0200, Paolo Bonzini wrote:
> Il 22/05/2013 11:42, Michael S. Tsirkin ha scritto:
> > On Wed, May 22, 2013 at 11:22:52AM +0200, Paolo Bonzini wrote:
> >> Il 22/05/2013 10:52, Michael S. Tsirkin ha scritto:
> >>> The fix is simple here: don't use ccache.  I don't.
> >>>
> >>> In fact, from what I saw people use ccache to work around makefile bugs,
> >>> so they can do make clean; make and have it finish quickly.
> >>>
> >>> Any other examples?
> >>
> >> Testing configure patches should be done (also) from a clean build
> >> directory, for example.
> >>
> >> Paolo
> > 
> > In fact, relying on make clean for testing the build
> > system is a mistake.  It's easy for it to forget to
> > remove some temporary file. You really should do
> > a clean clone.
> 
> Yes, I use a clean clone (and a clean build directory for each patch),
> _hence_ ccache helps reducing test times.
> 
> Paolo

I see, this workflow is the exact reverse of mine:

I do as much as possible in a single tree so I
rely on the makefile dependencies to be correct
to rebuild the right things.

You don't need so many dependencies: just enough
to pull the right bits from the cache.

I can see how any patches correcting in-place rebuilds
won't scratch any of your itches.
Paolo Bonzini May 22, 2013, 10:51 a.m. UTC | #15
Il 22/05/2013 12:50, Michael S. Tsirkin ha scritto:
> On Wed, May 22, 2013 at 12:40:23PM +0200, Paolo Bonzini wrote:
>> Il 22/05/2013 11:42, Michael S. Tsirkin ha scritto:
>>> On Wed, May 22, 2013 at 11:22:52AM +0200, Paolo Bonzini wrote:
>>>> Il 22/05/2013 10:52, Michael S. Tsirkin ha scritto:
>>>>> The fix is simple here: don't use ccache.  I don't.
>>>>>
>>>>> In fact, from what I saw people use ccache to work around makefile bugs,
>>>>> so they can do make clean; make and have it finish quickly.
>>>>>
>>>>> Any other examples?
>>>>
>>>> Testing configure patches should be done (also) from a clean build
>>>> directory, for example.
>>>
>>> In fact, relying on make clean for testing the build
>>> system is a mistake.  It's easy for it to forget to
>>> remove some temporary file. You really should do
>>> a clean clone.
>>
>> Yes, I use a clean clone (and a clean build directory for each patch),
>> _hence_ ccache helps reducing test times.
> 
> I see, this workflow is the exact reverse of mine:
> 
> I do as much as possible in a single tree so I
> rely on the makefile dependencies to be correct
> to rebuild the right things.

Usually I do the same---I just do slightly more thorough testing for
configure patches.

Paolo
Michael S. Tsirkin May 22, 2013, 11:09 a.m. UTC | #16
On Wed, May 22, 2013 at 12:51:42PM +0200, Paolo Bonzini wrote:
> Il 22/05/2013 12:50, Michael S. Tsirkin ha scritto:
> > On Wed, May 22, 2013 at 12:40:23PM +0200, Paolo Bonzini wrote:
> >> Il 22/05/2013 11:42, Michael S. Tsirkin ha scritto:
> >>> On Wed, May 22, 2013 at 11:22:52AM +0200, Paolo Bonzini wrote:
> >>>> Il 22/05/2013 10:52, Michael S. Tsirkin ha scritto:
> >>>>> The fix is simple here: don't use ccache.  I don't.
> >>>>>
> >>>>> In fact, from what I saw people use ccache to work around makefile bugs,
> >>>>> so they can do make clean; make and have it finish quickly.
> >>>>>
> >>>>> Any other examples?
> >>>>
> >>>> Testing configure patches should be done (also) from a clean build
> >>>> directory, for example.
> >>>
> >>> In fact, relying on make clean for testing the build
> >>> system is a mistake.  It's easy for it to forget to
> >>> remove some temporary file. You really should do
> >>> a clean clone.
> >>
> >> Yes, I use a clean clone (and a clean build directory for each patch),
> >> _hence_ ccache helps reducing test times.
> > 
> > I see, this workflow is the exact reverse of mine:
> > 
> > I do as much as possible in a single tree so I
> > rely on the makefile dependencies to be correct
> > to rebuild the right things.
> 
> Usually I do the same---I just do slightly more thorough testing for
> configure patches.
> 
> Paolo

I've no idea what happens with ccache on a crash by the way.
It's possible that it's careful to do renames in order to not leave
corrupted output files behind.
Paolo Bonzini May 22, 2013, 11:12 a.m. UTC | #17
Il 22/05/2013 13:09, Michael S. Tsirkin ha scritto:
> > Usually I do the same---I just do slightly more thorough testing for
> > configure patches.
> 
> I've no idea what happens with ccache on a crash by the way.
> It's possible that it's careful to do renames in order to not leave
> corrupted output files behind.

It doesn't, it leaves 0-sized files.  (Or at least it did last time
power failed during a compilation. :))

Paolo
Michael S. Tsirkin May 22, 2013, 11:35 a.m. UTC | #18
On Wed, May 22, 2013 at 01:12:15PM +0200, Paolo Bonzini wrote:
> Il 22/05/2013 13:09, Michael S. Tsirkin ha scritto:
> > > Usually I do the same---I just do slightly more thorough testing for
> > > configure patches.
> > 
> > I've no idea what happens with ccache on a crash by the way.
> > It's possible that it's careful to do renames in order to not leave
> > corrupted output files behind.
> 
> It doesn't, it leaves 0-sized files.  (Or at least it didn't last time
> power failed during a compilation. :))
> 
> Paolo

Well looking at the source, there's quite a bit of
handling of renames, so maybe ccache hackers will be
interested in fixing this.

Manpage says:
       It should be noted that ccache is susceptible to general storage
       problems. If a bad object file sneaks into the cache for some reason, it
       will of course stay bad. Some possible reasons for erroneous object
       files are bad hardware (disk drive, disk controller, memory, etc), buggy
       drivers or file systems, a bad CCACHE_PREFIX command or compiler
       wrapper.

       ...

       There are no reported issues about ccache producing broken object
       files reproducibly. That doesn’t mean it can’t happen, so if you find
       a repeatable case, please report it.

power failure is not listed ...
Blue Swirl May 25, 2013, 5:32 p.m. UTC | #19
On Wed, May 22, 2013 at 11:35 AM, Michael S. Tsirkin <mst@redhat.com> wrote:
> On Wed, May 22, 2013 at 01:12:15PM +0200, Paolo Bonzini wrote:
>> Il 22/05/2013 13:09, Michael S. Tsirkin ha scritto:
>> > > Usually I do the same---I just do slightly more thorough testing for
>> > > configure patches.
>> >
>> > I've no idea what happens with ccache on a crash by the way.
>> > It's possible that it's careful to do renames in order to not leave
>> > corrupted output files behind.
>>
>> It doesn't, it leaves 0-sized files.  (Or at least it didn't last time
>> power failed during a compilation. :))
>>
>> Paolo
>
> Well looking at the source, there's quite a bit of
> handling of renames, so maybe ccache hackers will be
> interested in fixing this.
>
> Manpage says:
>        It should be noted that ccache is susceptible to general storage
>         problems. If a bad object file sneaks into the cache for some reason, it
>         will of course stay bad. Some possible reasons for erroneous object
>         files are bad hardware (disk drive, disk controller, memory, etc), buggy
>         drivers or file systems, a bad CCACHE_PREFIX command or compiler
>         wrapper.
>
>         ...
>
>
>        There are no reported issues about ccache producing broken object
>        files reproducibly. That doesn’t mean it can’t happen, so if you find
>        a repeatable case, please report it.
>
> power failure is not listed ...

Neither is kill -9 issued by evil BOFH. IIRC I've also had bad builds
and weird errors because the disk was almost completely full (not
necessarily due to ccache). Once I overclocked a machine but I had to
reduce the speed because of random compile errors.

But I think your patch is way too simple to cover possible failure
cases when you can't trust the compile environment. Maybe you should
build in two separate directories and compare the resulting objects or
just the final executables. For added paranoia, build using two
machines which have a different set of components from different
manufacturers, but identical userland.

Another way to handle this would be to enhance GCC and linker to use
atomic operations when producing or combining object files. The tools
could also print a SHA of the object which the next user should
verify. Even better, the object files should include a robust checksum
to ensure integrity.

>
> --
> MST
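
The "print a SHA which the next user should verify" suggestion can be
approximated outside the toolchain with a sidecar checksum file. This is only
a sketch: record_sum and verify_sum are hypothetical helper names, and it
assumes GNU coreutils' sha256sum is installed. It detects a file that changed
or was truncated after the checksum was recorded; it cannot help if the object
was already bad when the checksum was taken.

```sh
#!/bin/sh
# Hypothetical helpers; assumes sha256sum from GNU coreutils.

# Record the checksum of an object in a sidecar file next to it.
record_sum() {
    sha256sum "$1" > "$1.sha256"
}

# Succeed only if the object still matches its recorded checksum.
# The sidecar stores the path as given to record_sum, so call both
# helpers with the same path from the same working directory.
verify_sum() {
    sha256sum -c "$1.sha256" > /dev/null 2>&1
}

# Example use:
#   record_sum foo.o              # right after a successful compile
#   verify_sum foo.o || rm foo.o  # before linking; force a rebuild
```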
Michael S. Tsirkin May 26, 2013, 7:35 a.m. UTC | #20
On Sat, May 25, 2013 at 05:32:24PM +0000, Blue Swirl wrote:
> On Wed, May 22, 2013 at 11:35 AM, Michael S. Tsirkin <mst@redhat.com> wrote:
> > On Wed, May 22, 2013 at 01:12:15PM +0200, Paolo Bonzini wrote:
> >> Il 22/05/2013 13:09, Michael S. Tsirkin ha scritto:
> >> > > Usually I do the same---I just do slightly more thorough testing for
> >> > > configure patches.
> >> >
> >> > I've no idea what happens with ccache on a crash by the way.
> >> > It's possible that it's careful to do renames in order to not leave
> >> > corrupted output files behind.
> >>
> >> It doesn't, it leaves 0-sized files.  (Or at least it didn't last time
> >> power failed during a compilation. :))
> >>
> >> Paolo
> >
> > Well looking at the source, there's quite a bit of
> > handling of renames, so maybe ccache hackers will be
> > interested in fixing this.
> >
> > Manpage says:
> >        It should be noted that ccache is susceptible to general storage
> >         problems. If a bad object file sneaks into the cache for some reason, it
> >         will of course stay bad. Some possible reasons for erroneous object
> >         files are bad hardware (disk drive, disk controller, memory, etc), buggy
> >         drivers or file systems, a bad CCACHE_PREFIX command or compiler
> >         wrapper.
> >
> >         ...
> >
> >
> >        There are no reported issues about ccache producing broken object
> >        files reproducibly. That doesn’t mean it can’t happen, so if you find
> >        a repeatable case, please report it.
> >
> > power failure is not listed ...
> 
> Neither is kill -9 issued by evil BOFH.

So presumably ccache will not fill with junk if you do this.

> IIRC I've also had bad builds
> and weird errors because the disk was almost completely full (not
> necessarily due to ccache). Once I overclocked a machine but I had to
> reduce the speed because of random compile errors.

This could be a parallel build, and possibly we have some
missing dependencies in the makefile.

> 
> But I think your patch is way too simple to cover possible failure
> cases when you can't trust the compile environment.

That's not the intent. Merely to address the common failure
which I personally observe all the time.

> Maybe you should
> build in two separate directories and compare the resulting objects or
> just the final executables. For added paranoia, build using two
> machines which have different set of components from different
> manufacturers, but identical userland.
> 
> Another way to handle this would be to enhance GCC and linker to use
> atomic operations when producing or combining object files. The tools
> could also print a SHA of the object which the next user should
> verify. Even better, the object files should include a robust checksum
> to ensure integrity.

I think we can make the makefile more robust. It can create a temporary
file in same directory and rename when ready. This will prevent
corrupted files from appearing in the first place.


> >
> > --
> > MST
Peter Maydell May 26, 2013, 9:12 a.m. UTC | #21
On 26 May 2013 08:35, Michael S. Tsirkin <mst@redhat.com> wrote:
> On Sat, May 25, 2013 at 05:32:24PM +0000, Blue Swirl wrote:
>> Another way to handle this would be to enhance GCC and linker to use
>> atomic operations when producing or combining object files. The tools
>> could also print a SHA of the object which the next user should
>> verify. Even better, the object files should include a robust checksum
>> to ensure integrity.
>
> I think we can make the makefile more robust. It can create a temporary
> file in same directory and rename when ready. This will prevent
> corrupted files from appearing in the first place.

I definitely think individual project makefiles are the wrong place
to fix this. If create-as-temp-and-rename is useful functionality
it needs to go in the compiler so that everybody benefits. Or you
could write yourself a cc wrapper that did the renaming and use
configure's --cc= flag.

thanks
-- PMM
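
A cc wrapper along those lines might look roughly like the sketch below. The
function name atomic_cc and the ".tmp.$$" suffix are assumptions made for
illustration. It only understands the separated "-o FILE" form (not
"-oFILE"), and it does nothing for other files a compiler may write, such as
dependency files.

```sh
#!/bin/sh
# Hypothetical wrapper: atomic_cc REAL_CC [compiler args...]
# Redirects the "-o FILE" output to a temporary name, runs the real
# compiler, and renames the result into place only if it succeeded.
atomic_cc() {
    real_cc=$1; shift
    out= tmp= prev=
    n=$#
    # Rotate through the argument list once, rebuilding it in order
    # with the output path swapped for the temporary name.
    while [ "$n" -gt 0 ]; do
        arg=$1; shift
        if [ "$prev" = "-o" ]; then
            out=$arg
            tmp="$arg.tmp.$$"
            set -- "$@" "$tmp"
            prev=
        else
            set -- "$@" "$arg"
            prev=$arg
        fi
        n=$((n - 1))
    done
    if [ -z "$out" ]; then
        # No -o seen: nothing to protect, run the compiler unchanged.
        "$real_cc" "$@"
        return
    fi
    if "$real_cc" "$@"; then
        mv -f "$tmp" "$out"
    else
        status=$?
        rm -f "$tmp"
        return "$status"
    fi
}
```

Saved as a script whose last line is something like `atomic_cc gcc "$@"`, it
could then be passed to configure with --cc=, as suggested above.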
Michael S. Tsirkin May 26, 2013, 12:31 p.m. UTC | #22
On Sun, May 26, 2013 at 10:12:21AM +0100, Peter Maydell wrote:
> On 26 May 2013 08:35, Michael S. Tsirkin <mst@redhat.com> wrote:
> > On Sat, May 25, 2013 at 05:32:24PM +0000, Blue Swirl wrote:
> >> Another way to handle this would be to enhance GCC and linker to use
> >> atomic operations when producing or combining object files. The tools
> >> could also print a SHA of the object which the next user should
> >> verify. Even better, the object files should include a robust checksum
> >> to ensure integrity.
> >
> > I think we can make the makefile more robust. It can create a temporary
> > file in same directory and rename when ready. This will prevent
> > corrupted files from appearing in the first place.
> 
> I definitely think individual project makefiles are the wrong place
> to fix this. If create-as-temp-and-rename is useful functionality
> it needs to go in the compiler so that everybody benefits.

This will not help users on existing systems.
Also, it's not just the compiler. We'd have to do it in the linker,
the assembler, ... lots of work.
You are welcome to implement this in the compiler/linker/etc. if you like,
but we will still want to handle it in our makefile as well.

> Or you
> could write yourself a cc wrapper that did the renaming and use
> configure's --cc= flag.
> 
> thanks
> -- PMM

We also run lots of scripts in our makefiles. Would you like
to also change each of them individually? Add wrapper scripts for e.g.
python? What's the benefit as compared to just fixing it all in one
place in the makefile?
Stefan Weil May 26, 2013, 12:48 p.m. UTC | #23
Am 26.05.2013 14:31, schrieb Michael S. Tsirkin:
> On Sun, May 26, 2013 at 10:12:21AM +0100, Peter Maydell wrote:
>> On 26 May 2013 08:35, Michael S. Tsirkin <mst@redhat.com> wrote:
>>> On Sat, May 25, 2013 at 05:32:24PM +0000, Blue Swirl wrote:
>>>> Another way to handle this would be to enhance GCC and linker to use
>>>> atomic operations when producing or combining object files. The tools
>>>> could also print a SHA of the object which the next user should
>>>> verify. Even better, the object files should include a robust checksum
>>>> to ensure integrity.
>>> I think we can make the makefile more robust. It can create a temporary
>>> file in same directory and rename when ready. This will prevent
>>> corrupted files from appearing in the first place.
>> I definitely think individual project makefiles are the wrong place
>> to fix this. If create-as-temp-and-rename is useful functionality
>> it needs to go in the compiler so that everybody benefits.
> This will not help users on existing systems.
> Also it's not just compiler. We'd have to do it in linker,
> asm, ... lots of work.
> You are welcome to implement this in compiler/linker/etc if you like
> but we will still want to handle it in our makefile as well.
>
>> Or you
>> could write yourself a cc wrapper that did the renaming and use
>> configure's --cc= flag.
>>
>> thanks
>> -- PMM
> We also run lots of scripts in our makefiles. Would you like
> to also change each of them individually? Add wrapper scripts for e.g.
> python? What's the benefit as compared to just fixing it all in one
> place in the makefile?


GNU make automatically removes .o files which were built
because of a Makefile rule if that rule returns an error, so
OOM or compiler crashes should not result in corrupted .o
files. The same applies to other kinds of files built by
make.

If there are corrupted files, we have to look whether the
Makefile rules for those files are correct (or exist at all).

Are there other Open Source projects which try to detect
corrupted elf files? I know none.

Regards
Stefan W.
Michael S. Tsirkin May 26, 2013, 1:11 p.m. UTC | #24
On Sun, May 26, 2013 at 02:48:11PM +0200, Stefan Weil wrote:
> Am 26.05.2013 14:31, schrieb Michael S. Tsirkin:
> > On Sun, May 26, 2013 at 10:12:21AM +0100, Peter Maydell wrote:
> >> On 26 May 2013 08:35, Michael S. Tsirkin <mst@redhat.com> wrote:
> >>> On Sat, May 25, 2013 at 05:32:24PM +0000, Blue Swirl wrote:
> >>>> Another way to handle this would be to enhance GCC and linker to use
> >>>> atomic operations when producing or combining object files. The tools
> >>>> could also print a SHA of the object which the next user should
> >>>> verify. Even better, the object files should include a robust checksum
> >>>> to ensure integrity.
> >>> I think we can make the makefile more robust. It can create a temporary
> >>> file in same directory and rename when ready. This will prevent
> >>> corrupted files from appearing in the first place.
> >> I definitely think individual project makefiles are the wrong place
> >> to fix this. If create-as-temp-and-rename is useful functionality
> >> it needs to go in the compiler so that everybody benefits.
> > This will not help users on existing systems.
> > Also it's not just compiler. We'd have to do it in linker,
> > asm, ... lots of work.
> > You are welcome to implement this in compiler/linker/etc if you like
> > but we will still want to handle it in our makefile as well.
> >
> >> Or you
> >> could write yourself a cc wrapper that did the renaming and use
> >> configure's --cc= flag.
> >>
> >> thanks
> >> -- PMM
> > We also run lots of scripts in our makefiles. Would you like
> > to also change each of them individually? Add wrapper scripts for e.g.
> > python? What's the benefit as compared to just fixing it all in one
> > place in the makefile?
> 
> 
> GNU make automatically removes .o files which were built
> because of a Makefile rule if that rule returns an error, so
> OOM or compiler crashes should not result in corrupted .o
> files.

Not if make itself is killed.

> The same applies to other kinds of files built by
> make.

Another problem is power failures and other cases of sudden reboot.

> 
> If there are corrupted files, we have to look whether the
> Makefile rules for those files are correct (or exist at all).
> Are there other Open Source projects which try to detect
> corrupted elf files? I know none.
> 
> Regards
> Stefan W.
> 

It saves me time, at least.
I can keep it out of tree if it rubs others the wrong way
for some reason, it's no big deal.
If I have some spare time I might code up the more
generic thing with create then rename for all files,
if I do we can discuss that.
Peter Maydell May 26, 2013, 1:36 p.m. UTC | #25
On 26 May 2013 13:31, Michael S. Tsirkin <mst@redhat.com> wrote:
> On Sun, May 26, 2013 at 10:12:21AM +0100, Peter Maydell wrote:
>> I definitely think individual project makefiles are the wrong place
>> to fix this. If create-as-temp-and-rename is useful functionality
>> it needs to go in the compiler so that everybody benefits.
>
> This will not help users on existing systems.
> Also it's not just compiler. We'd have to do it in linker,
> asm, ... lots of work.

This is clearly less work than implementing it in the makefile
of every single open source project in the world (or even every
single open source project in Debian).

> You are welcome to implement this in compiler/linker/etc if you like
> but we will still want to handle it in our makefile as well.

I specifically don't want it handled in our makefiles because
it's the wrong place to fix the problem and it will make
our build system more complicated.

-- PMM
Michael S. Tsirkin May 26, 2013, 1:40 p.m. UTC | #26
On Sun, May 26, 2013 at 02:36:28PM +0100, Peter Maydell wrote:
> On 26 May 2013 13:31, Michael S. Tsirkin <mst@redhat.com> wrote:
> > On Sun, May 26, 2013 at 10:12:21AM +0100, Peter Maydell wrote:
> >> I definitely think individual project makefiles are the wrong place
> >> to fix this. If create-as-temp-and-rename is useful functionality
> >> it needs to go in the compiler so that everybody benefits.
> >
> > This will not help users on existing systems.
> > Also it's not just compiler. We'd have to do it in linker,
> > asm, ... lots of work.
> 
> This is clearly less work than implementing it in the makefile
> of every single open source project in the world (or even every
> single open source project in Debian).

You seem to have removed the part that explained that
1. we run scripts in our makefiles so need to handle that anyway
2. we care about users on existing systems

This means that we would need the fix in our makefiles even
if compiler and linker gain this feature.

> > You are welcome to implement this in compiler/linker/etc if you like
> > but we will still want to handle it in our makefile as well.
> 
> I specifically don't want it handled in our makefiles because
> it's the wrong place to fix the problem and it will make
> our build system more complicated.
> 
> -- PMM
Blue Swirl May 26, 2013, 6:20 p.m. UTC | #27
On Sun, May 26, 2013 at 1:40 PM, Michael S. Tsirkin <mst@redhat.com> wrote:
> On Sun, May 26, 2013 at 02:36:28PM +0100, Peter Maydell wrote:
>> On 26 May 2013 13:31, Michael S. Tsirkin <mst@redhat.com> wrote:
>> > On Sun, May 26, 2013 at 10:12:21AM +0100, Peter Maydell wrote:
>> >> I definitely think individual project makefiles are the wrong place
>> >> to fix this. If create-as-temp-and-rename is useful functionality
>> >> it needs to go in the compiler so that everybody benefits.
>> >
>> > This will not help users on existing systems.
>> > Also it's not just compiler. We'd have to do it in linker,
>> > asm, ... lots of work.
>>
>> This is clearly less work than implementing it in the makefile
>> of every single open source project in the world (or even every
>> single open source project in Debian).
>
> You seem to have removed the part that explained that
> 1. we run scripts in our makefiles so need to handle that anyway
> 2. we care about users on existing systems

A generic hook (default none, or maybe "test -s") after object
production and before linkage should be enough but would scale to SHA
producing/verifying tools.
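
As a rough sketch of such a hook: the variable OBJ_CHECK and the function
find_bad_objects below are hypothetical names chosen for this example. The
default check is "test -s"; a stricter tool (for instance the
cross-prefixed size, or a checksum verifier) could be substituted without
changing the caller.

```shell
# OBJ_CHECK (hypothetical) names the command used to verify one object file.
# It defaults to "test -s": the file exists and is not empty.
: "${OBJ_CHECK:=test -s}"

# find_bad_objects FILE...
# Print every file that fails the check; return non-zero if any failed.
find_bad_objects() {
    bad=0
    for obj in "$@"; do
        # OBJ_CHECK is left unquoted on purpose so "test -s" splits in two words.
        if ! $OBJ_CHECK "$obj" >/dev/null 2>&1; then
            printf '%s\n' "$obj"
            bad=1
        fi
    done
    return "$bad"
}
```

A build could run this over its object list before linking and delete
whatever it prints, so that make rebuilds those files.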

>
> This means that we would need the fix in our makefiles even
> if compiler and linker gain this feature.

Depends on the feature. If the object files have robust checksums
which are checked after output and before input, this should be
transparent to the build system.

>
>> > You are welcome to implement this in compiler/linker/etc if you like
>> > but we will still want to handle it in our makefile as well.
>>
>> I specifically don't want it handled in our makefiles because
>> it's the wrong place to fix the problem and it will make
>> our build system more complicated.

+1

Also, what is the worst case scenario? The link fails and you have to
clean up and rebuild? An automated build system can't produce the
expected output if the build machine is unreliable?

>>
>> -- PMM
>
>
Michael S. Tsirkin May 26, 2013, 6:24 p.m. UTC | #28
On Sun, May 26, 2013 at 06:20:17PM +0000, Blue Swirl wrote:
> On Sun, May 26, 2013 at 1:40 PM, Michael S. Tsirkin <mst@redhat.com> wrote:
> > On Sun, May 26, 2013 at 02:36:28PM +0100, Peter Maydell wrote:
> >> On 26 May 2013 13:31, Michael S. Tsirkin <mst@redhat.com> wrote:
> >> > On Sun, May 26, 2013 at 10:12:21AM +0100, Peter Maydell wrote:
> >> >> I definitely think individual project makefiles are the wrong place
> >> >> to fix this. If create-as-temp-and-rename is useful functionality
> >> >> it needs to go in the compiler so that everybody benefits.
> >> >
> >> > This will not help users on existing systems.
> >> > Also it's not just compiler. We'd have to do it in linker,
> >> > asm, ... lots of work.
> >>
> >> This is clearly less work than implementing it in the makefile
> >> of every single open source project in the world (or even every
> >> single open source project in Debian).
> >
> > You seem to have removed the part that explained that
> > 1. we run scripts in our makefiles so need to handle that anyway
> > 2. we care about users on existing systems
> 
> A generic hook (default none, or maybe "test -s") after object
> production and before linkage should be enough but would scale to SHA
> producing/verifying tools.
> 
> >
> > This means that we would need the fix in our makefiles even
> > if compiler and linker gain this feature.
> 
> Depends on the feature. If the object files have robust checksums
> which are checked after output and before input, this should be
> transparent to the build system.
> 
> >
> >> > You are welcome to implement this in compiler/linker/etc if you like
> >> > but we will still want to handle it in our makefile as well.
> >>
> >> I specifically don't want it handled in our makefiles because
> >> it's the wrong place to fix the problem and it will make
> >> our build system more complicated.
> 
> +1
> 
> Also, what is the worst case scenario? The link fails and you have to
> clean up and rebuild? An automated build system can't produce the
> expected output if the build machine is unreliable?

It's a simple issue.
Each time I reboot during build, I have to make clean and rebuild.
This wastes my time so I looked for ways to save the time.
On my system at least, it has no measurable cost,
likely also because size only looks at headers and metadata.

If others are not interested, I can keep it out of tree.

> >>
> >> -- PMM
> >
> >
Blue Swirl May 26, 2013, 7:28 p.m. UTC | #29
On Sun, May 26, 2013 at 6:24 PM, Michael S. Tsirkin <mst@redhat.com> wrote:
> On Sun, May 26, 2013 at 06:20:17PM +0000, Blue Swirl wrote:
>> On Sun, May 26, 2013 at 1:40 PM, Michael S. Tsirkin <mst@redhat.com> wrote:
>> > On Sun, May 26, 2013 at 02:36:28PM +0100, Peter Maydell wrote:
>> >> On 26 May 2013 13:31, Michael S. Tsirkin <mst@redhat.com> wrote:
>> >> > On Sun, May 26, 2013 at 10:12:21AM +0100, Peter Maydell wrote:
>> >> >> I definitely think individual project makefiles are the wrong place
>> >> >> to fix this. If create-as-temp-and-rename is useful functionality
>> >> >> it needs to go in the compiler so that everybody benefits.
>> >> >
>> >> > This will not help users on existing systems.
>> >> > Also it's not just compiler. We'd have to do it in linker,
>> >> > asm, ... lots of work.
>> >>
>> >> This is clearly less work than implementing it in the makefile
>> >> of every single open source project in the world (or even every
>> >> single open source project in Debian).
>> >
>> > You seem to have removed the part that explained that
>> > 1. we run scripts in our makefiles so need to handle that anyway
>> > 2. we care about users on existing systems
>>
>> A generic hook (default none, or maybe "test -s") after object
>> production and before linkage should be enough but would scale to SHA
>> producing/verifying tools.
>>
>> >
>> > This means that we would need the fix in our makefiles even
>> > if compiler and linker gain this feature.
>>
>> Depends on the feature. If the object files have robust checksums
>> which are checked after output and before input, this should be
>> transparent to the build system.
>>
>> >
>> >> > You are welcome to implement this in compiler/linker/etc if you like
>> >> > but we will still want to handle it in our makefile as well.
>> >>
>> >> I specifically don't want it handled in our makefiles because
>> >> it's the wrong place to fix the problem and it will make
>> >> our build system more complicated.
>>
>> +1
>>
>> Also, what is the worst case scenario? The link fails and you have to
>> clean up and rebuild? An automated build system can't produce the
>> expected output if the build machine is unreliable?
>
> It's a simple issue.
> Each time I reboot during build, I have to make clean and rebuild.
> This wastes my time so I looked for ways to save the time.

Compile under a stable kernel and test the bleeding edge kernel only
as KVM guest? Get a different box for testing or compiling? Run 'sync'
every time gcc finishes?

Don't you have bigger problems with file systems due to the crashes?

> On my system at least, it has no measurable cost,
> likely also because size only looks at headers and metadata.

For example on OpenBSD, 'size' does not seem to come from binutils, so
there could be portability issues.

>
> If others are not interested, I can keep it out of tree.

I've had problems with disk close to full, so I'm semi-interested if
the solution does not slow down others and it's not too ugly.

>
>> >>
>> >> -- PMM
>> >
>> >
Michael S. Tsirkin May 26, 2013, 8:15 p.m. UTC | #30
On Sun, May 26, 2013 at 07:28:40PM +0000, Blue Swirl wrote:
> On Sun, May 26, 2013 at 6:24 PM, Michael S. Tsirkin <mst@redhat.com> wrote:
> > On Sun, May 26, 2013 at 06:20:17PM +0000, Blue Swirl wrote:
> >> On Sun, May 26, 2013 at 1:40 PM, Michael S. Tsirkin <mst@redhat.com> wrote:
> >> > On Sun, May 26, 2013 at 02:36:28PM +0100, Peter Maydell wrote:
> >> >> On 26 May 2013 13:31, Michael S. Tsirkin <mst@redhat.com> wrote:
> >> >> > On Sun, May 26, 2013 at 10:12:21AM +0100, Peter Maydell wrote:
> >> >> >> I definitely think individual project makefiles are the wrong place
> >> >> >> to fix this. If create-as-temp-and-rename is useful functionality
> >> >> >> it needs to go in the compiler so that everybody benefits.
> >> >> >
> >> >> > This will not help users on existing systems.
> >> >> > Also it's not just compiler. We'd have to do it in linker,
> >> >> > asm, ... lots of work.
> >> >>
> >> >> This is clearly less work than implementing it in the makefile
> >> >> of every single open source project in the world (or even every
> >> >> single open source project in Debian).
> >> >
> >> > You seem to have removed the part that explained that
> >> > 1. we run scripts in our makefiles so need to handle that anyway
> >> > 2. we care about users on existing systems
> >>
> >> A generic hook (default none, or maybe "test -s") after object
> >> production and before linkage should be enough but would scale to SHA
> >> producing/verifying tools.
> >>
> >> >
> >> > This means that we would need the fix in our makefiles even
> >> > if compiler and linker gain this feature.
> >>
> >> Depends on the feature. If the object files have robust checksums
> >> which are checked after output and before input, this should be
> >> transparent to the build system.
> >>
> >> >
> >> >> > You are welcome to implement this in compiler/linker/etc if you like
> >> >> > but we will still want to handle it in our makefile as well.
> >> >>
> >> >> I specifically don't want it handled in our makefiles because
> >> >> it's the wrong place to fix the problem and it will make
> >> >> our build system more complicated.
> >>
> >> +1
> >>
> >> Also, what is the worst case scenario? The link fails and you have to
> >> clean up and rebuild? An automated build system can't produce the
> >> expected output if the build machine is unreliable?
> >
> > It's a simple issue.
> > Each time I reboot during build, I have to make clean and rebuild.
> > This wastes my time so I looked for ways to save the time.
> 
> Compile under a stable kernel and test the bleeding edge kernel only
> as KVM guest? Get a different box for testing or compiling? Run 'sync'
> every time gcc finishes?

What's the question here?

> Don't you have bigger problems with file systems due to the crashes?

As it happens, no. Maybe because I'm using ext4.
Maybe I'm lucky.

> > On my system at least, it has no measurable cost,
> > likely also because size only looks at headers and metadata.
> 
> For example on OpenBSD, 'size' does not seem to come from binutils, so
> there could be portability issues.

True, I'm not saying it's perfect.

> >
> > If others are not interested, I can keep it out of tree.
> 
> I've had problems with disk close to full, so I'm semi-interested if
> the solution does not slow down others and it's not too ugly.

I think the simplest way to do it is to change the makefile to unlink, create,
then rename. Then you are safe against make being abruptly killed or crashing.
And with a journaled fs, e.g. Linux ext4 mounted with
data=ordered, you are also safe against power failures.

It shouldn't be hard to do and I don't expect this to have any
measurable speed impact.  What do you think?

> >
> >> >>
> >> >> -- PMM
> >> >
> >> >
Blue Swirl May 26, 2013, 8:29 p.m. UTC | #31
On Sun, May 26, 2013 at 8:15 PM, Michael S. Tsirkin <mst@redhat.com> wrote:
> On Sun, May 26, 2013 at 07:28:40PM +0000, Blue Swirl wrote:
>> On Sun, May 26, 2013 at 6:24 PM, Michael S. Tsirkin <mst@redhat.com> wrote:
>> > On Sun, May 26, 2013 at 06:20:17PM +0000, Blue Swirl wrote:
>> >> On Sun, May 26, 2013 at 1:40 PM, Michael S. Tsirkin <mst@redhat.com> wrote:
>> >> > On Sun, May 26, 2013 at 02:36:28PM +0100, Peter Maydell wrote:
>> >> >> On 26 May 2013 13:31, Michael S. Tsirkin <mst@redhat.com> wrote:
>> >> >> > On Sun, May 26, 2013 at 10:12:21AM +0100, Peter Maydell wrote:
>> >> >> >> I definitely think individual project makefiles are the wrong place
>> >> >> >> to fix this. If create-as-temp-and-rename is useful functionality
>> >> >> >> it needs to go in the compiler so that everybody benefits.
>> >> >> >
>> >> >> > This will not help users on existing systems.
>> >> >> > Also it's not just compiler. We'd have to do it in linker,
>> >> >> > asm, ... lots of work.
>> >> >>
>> >> >> This is clearly less work than implementing it in the makefile
>> >> >> of every single open source project in the world (or even every
>> >> >> single open source project in Debian).
>> >> >
>> >> > You seem to have removed the part that explained that
>> >> > 1. we run scripts in our makefiles so need to handle that anyway
>> >> > 2. we care about users on existing systems
>> >>
>> >> A generic hook (default none, or maybe "test -s") after object
>> >> production and before linkage should be enough but would scale to SHA
>> >> producing/verifying tools.
>> >>
>> >> >
>> >> > This means that we would need the fix in our makefiles even
>> >> > if compiler and linker gain this feature.
>> >>
>> >> Depends on the feature. If the object files have robust checksums
>> >> which are checked after output and before input, this should be
>> >> transparent to the build system.
>> >>
>> >> >
>> >> >> > You are welcome to implement this in compiler/linker/etc if you like
>> >> >> > but we will still want to handle it in our makefile as well.
>> >> >>
>> >> >> I specifically don't want it handled in our makefiles because
>> >> >> it's the wrong place to fix the problem and it will make
>> >> >> our build system more complicated.
>> >>
>> >> +1
>> >>
>> >> Also, what is the worst case scenario? The link fails and you have to
>> >> clean up and rebuild? An automated build system can't produce the
>> >> expected output if the build machine is unreliable?
>> >
>> > It's a simple issue.
>> > Each time I reboot during build, I have to make clean and rebuild.
>> > This wastes my time so I looked for ways to save the time.
>>
>> Compile under a stable kernel and test the bleeding edge kernel only
>> as KVM guest? Get a different box for testing or compiling? Run 'sync'
>> every time gcc finishes?
>
> What's the question here?

The question is if any of the suggestions solves the problem?

Also, how about something like this, run post boot:
find -name '*.o' | xargs -iF sh -c 'if test ! -s F; then rm F;fi'

>
>> Don't you have bigger problems with file systems due to the crashes?
>
> As it happens, no. Maybe because I'm using ext4.
> Maybe I'm lucky.
>
>> > On my system at least, it has no measurable cost,
>> > likely also because size only looks at headers and metadata.
>>
>> For example on OpenBSD, 'size' does not seem to come from binutils, so
>> there could be portability issues.
>
> True, I'm not saying it's perfect.
>
>> >
>> > If others are not interested, I can keep it out of tree.
>>
>> I've had problems with disk close to full, so I'm semi-interested if
>> the solution does not slow down others and it's not too ugly.
>
> I think the simplest way to do it is to change makefile to unlink, create
> then rename. Then you are safe against abrupt killing or crashing make.
> And with a journaled fs, if you also have e.g. linux ext4 and mount with
> data=ordered, you are safe against power failures.
>
> It shouldn't be hard to do and I don't expect this to have any
> measurable speed impact.  What do you think?

I'd prefer a more generic solution, like the hook. What you propose
wouldn't protect from the disk full scenario.

>
>> >
>> >> >>
>> >> >> -- PMM
>> >> >
>> >> >
Michael S. Tsirkin May 26, 2013, 8:55 p.m. UTC | #32
On Sun, May 26, 2013 at 08:29:35PM +0000, Blue Swirl wrote:
> On Sun, May 26, 2013 at 8:15 PM, Michael S. Tsirkin <mst@redhat.com> wrote:
> > On Sun, May 26, 2013 at 07:28:40PM +0000, Blue Swirl wrote:
> >> On Sun, May 26, 2013 at 6:24 PM, Michael S. Tsirkin <mst@redhat.com> wrote:
> >> > On Sun, May 26, 2013 at 06:20:17PM +0000, Blue Swirl wrote:
> >> >> On Sun, May 26, 2013 at 1:40 PM, Michael S. Tsirkin <mst@redhat.com> wrote:
> >> >> > On Sun, May 26, 2013 at 02:36:28PM +0100, Peter Maydell wrote:
> >> >> >> On 26 May 2013 13:31, Michael S. Tsirkin <mst@redhat.com> wrote:
> >> >> >> > On Sun, May 26, 2013 at 10:12:21AM +0100, Peter Maydell wrote:
> >> >> >> >> I definitely think individual project makefiles are the wrong place
> >> >> >> >> to fix this. If create-as-temp-and-rename is useful functionality
> >> >> >> >> it needs to go in the compiler so that everybody benefits.
> >> >> >> >
> >> >> >> > This will not help users on existing systems.
> >> >> >> > Also it's not just compiler. We'd have to do it in linker,
> >> >> >> > asm, ... lots of work.
> >> >> >>
> >> >> >> This is clearly less work than implementing it in the makefile
> >> >> >> of every single open source project in the world (or even every
> >> >> >> single open source project in Debian).
> >> >> >
> >> >> > You seem to have removed the part that explained that
> >> >> > 1. we run scripts in our makefiles so need to handle that anyway
> >> >> > 2. we care about users on existing systems
> >> >>
> >> >> A generic hook (default none, or maybe "test -s") after object
> >> >> production and before linkage should be enough but would scale to SHA
> >> >> producing/verifying tools.
> >> >>
> >> >> >
> >> >> > This means that we would need the fix in our makefiles even
> >> >> > if compiler and linker gain this feature.
> >> >>
> >> >> Depends on the feature. If the object files have robust checksums
> >> >> which are checked after output and before input, this should be
> >> >> transparent to the build system.
> >> >>
> >> >> >
> >> >> >> > You are welcome to implement this in compiler/linker/etc if you like
> >> >> >> > but we will still want to handle it in our makefile as well.
> >> >> >>
> >> >> >> I specifically don't want it handled in our makefiles because
> >> >> >> it's the wrong place to fix the problem and it will make
> >> >> >> our build system more complicated.
> >> >>
> >> >> +1
> >> >>
> >> >> Also, what is the worst case scenario? The link fails and you have to
> >> >> clean up and rebuild? An automated build system can't produce the
> >> >> expected output if the build machine is unreliable?
> >> >
> >> > It's a simple issue.
> >> > Each time I reboot during build, I have to make clean and rebuild.
> >> > This wastes my time so I looked for ways to save the time.
> >>
> >> Compile under a stable kernel and test the bleeding edge kernel only
> >> as KVM guest? Get a different box for testing or compiling? Run 'sync'
> >> every time gcc finishes?
> >
> > What's the question here?
> 
> The question is if any of the suggestions solves the problem?
> 
> Also how about something this: post boot, find -name '*.o' | xargs -iF
> sh -c 'if test ! -s F; then rm F;fi'

Maybe. I don't know if empty files are the only kind of corruption one
typically sees after a power failure. I'll experiment when this happens next.

> >
> >> Don't you have bigger problems with file systems due to the crashes?
> >
> > As it happens, no. Maybe because I'm using ext4.
> > Maybe I'm lucky.
> >
> >> > On my system at least, it has no measurable cost,
> >> > likely also because size only looks at headers and metadata.
> >>
> >> For example on OpenBSD, 'size' does not seem to come from binutils, so
> >> there could be portability issues.
> >
> > True, I'm not saying it's perfect.
> >
> >> >
> >> > If others are not interested, I can keep it out of tree.
> >>
> >> I've had problems with disk close to full, so I'm semi-interested if
> >> the solution does not slow down others and it's not too ugly.
> >
> > I think the simplest way to do it is to change makefile to unlink, create
> > then rename. Then you are safe against abrupt killing or crashing make.
> > And with a journaled fs, if you also have e.g. linux ext4 and mount with
> > data=ordered, you are safe against power failures.
> >
> > It shouldn't be hard to do and I don't expect this to have any
> > measurable speed impact.  What do you think?
> 
> I'd prefer a more generic solution, like the hook.

What is meant by the hook?

> What you propose
> wouldn't protect from the disk full scenario.

Why not?
I think it will: renaming a file within the same directory doesn't need any
extra space, so it is almost sure to succeed even when the disk is full.

What I mean is this

gcc -o a.o a.c -> rm -f a.o && gcc -o a.tmp.o a.c && mv a.tmp.o a.o
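
Spelled out as a shell function, the same idea could be sketched as below.
This is only an illustration: compile_atomically is a made-up name, CC
defaulting to cc and the -c flag are assumptions, and rm -f is used so
that a missing a.o does not abort the chain. If the compiler fails (for
example because the disk is full), the target is simply absent afterwards
and make rebuilds it on the next run.

```shell
# compile_atomically SRC OBJ  (hypothetical helper)
# The name OBJ only ever refers to a completely written file.
compile_atomically() {
    src=$1
    obj=$2
    tmp="${obj%.o}.tmp.o"        # a.o -> a.tmp.o, same directory as the target
    rm -f "$obj" "$tmp" || return 1
    # CC is left unquoted so it may carry arguments, e.g. CC="ccache gcc".
    if ${CC:-cc} -c -o "$tmp" "$src"; then
        mv "$tmp" "$obj"         # rename within one directory
    else
        status=$?
        rm -f "$tmp"             # do not leave the partial file behind
        return "$status"
    fi
}
```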

> >
> >> >
> >> >> >>
> >> >> >> -- PMM
> >> >> >
> >> >> >
Anthony Liguori May 26, 2013, 9:03 p.m. UTC | #33
"Michael S. Tsirkin" <mst@redhat.com> writes:

> On Sun, May 26, 2013 at 06:20:17PM +0000, Blue Swirl wrote:
> It's a simple issue.
> Each time I reboot during build, I have to make clean and rebuild.
> This wastes my time so I looked for ways to save the time.
> On my system at least, it has no measurable cost,
> likely also because size only looks at headers and metadata.
>
> If others are not interested, I can keep it out of tree.

You probably should.  Trying to be robust here is going to cause more
headache than it's worth.

I think your problem has better solutions too.  Doing a full build with
all optional dependencies enabled really doesn't take that long.

  $ time ( ~/git/qemu/configure && CCACHE_DISABLE=1 make -j24)
  <lots of output>
  real	2m28.222s
  user	21m33.763s
  sys	1m30.721s

I've switched to this as standard practice since it's so quick.

This is a modest two socket system with spinning disks.  I'm sure it's
even faster with more recent processors and SSDs.  With tmpfs as the
build directory it would probably fly.

Our build parallelizes very well; even if you only have slow systems,
distcc will work wonders.

Regards,

Anthony Liguori

>
>> >>
>> >> -- PMM
>> >
>> >
Michael S. Tsirkin May 28, 2013, 10:33 a.m. UTC | #34
On Sun, May 26, 2013 at 08:29:35PM +0000, Blue Swirl wrote:
> On Sun, May 26, 2013 at 8:15 PM, Michael S. Tsirkin <mst@redhat.com> wrote:
> > On Sun, May 26, 2013 at 07:28:40PM +0000, Blue Swirl wrote:
> >> On Sun, May 26, 2013 at 6:24 PM, Michael S. Tsirkin <mst@redhat.com> wrote:
> >> > On Sun, May 26, 2013 at 06:20:17PM +0000, Blue Swirl wrote:
> >> >> On Sun, May 26, 2013 at 1:40 PM, Michael S. Tsirkin <mst@redhat.com> wrote:
> >> >> > On Sun, May 26, 2013 at 02:36:28PM +0100, Peter Maydell wrote:
> >> >> >> On 26 May 2013 13:31, Michael S. Tsirkin <mst@redhat.com> wrote:
> >> >> >> > On Sun, May 26, 2013 at 10:12:21AM +0100, Peter Maydell wrote:
> >> >> >> >> I definitely think individual project makefiles are the wrong place
> >> >> >> >> to fix this. If create-as-temp-and-rename is useful functionality
> >> >> >> >> it needs to go in the compiler so that everybody benefits.
> >> >> >> >
> >> >> >> > This will not help users on existing systems.
> >> >> >> > Also it's not just compiler. We'd have to do it in linker,
> >> >> >> > asm, ... lots of work.
> >> >> >>
> >> >> >> This is clearly less work than implementing it in the makefile
> >> >> >> of every single open source project in the world (or even every
> >> >> >> single open source project in Debian).
> >> >> >
> >> >> > You seem to have removed the part that explained that
> >> >> > 1. we run scripts in our makefiles so need to handle that anyway
> >> >> > 2. we care about users on existing systems
> >> >>
> >> >> A generic hook (default none, or maybe "test -s") after object
> >> >> production and before linkage should be enough but would scale to SHA
> >> >> producing/verifying tools.
> >> >>
> >> >> >
> >> >> > This means that we would need the fix in our makefiles even
> >> >> > if compiler and linker gain this feature.
> >> >>
> >> >> Depends on the feature. If the object files have robust checksums
> >> >> which are checked after output and before input, this should be
> >> >> transparent to the build system.
> >> >>
> >> >> >
> >> >> >> > You are welcome to implement this in compiler/linker/etc if you like
> >> >> >> > but we will still want to handle it in our makefile as well.
> >> >> >>
> >> >> >> I specifically don't want it handled in our makefiles because
> >> >> >> it's the wrong place to fix the problem and it will make
> >> >> >> our build system more complicated.
> >> >>
> >> >> +1
> >> >>
> >> >> Also, what is the worst case scenario? The link fails and you have to
> >> >> clean up and rebuild? An automated build system can't produce the
> >> >> expected output if the build machine is unreliable?
> >> >
> >> > It's a simple issue.
> >> > Each time I reboot during build, I have to make clean and rebuild.
> >> > This wastes my time so I looked for ways to save the time.
> >>
> >> Compile under a stable kernel and test the bleeding edge kernel only
> >> as KVM guest? Get a different box for testing or compiling? Run 'sync'
> >> every time gcc finishes?
> >
> > What's the question here?
> 
> The question is if any of the suggestions solves the problem?
> 
> Also how about something this: post boot, find -name '*.o' | xargs -iF
> sh -c 'if test ! -s F; then rm F;fi'

On Linux, even easier:
find -name '*.o' -empty -exec rm '{}' ';'

Seems to be enough here. Thanks, I'll use this hack and
leave makefiles alone for now.
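
For systems whose find lacks -empty, a variant restricted to options that
should be available in POSIX find might look like the sketch below. The
function name remove_empty_objects is made up for this example. Like the
one-liner above, it only catches zero-length files, not objects that were
truncated part-way through.

```shell
# remove_empty_objects [DIR]  (hypothetical helper)
# Delete zero-length *.o files under DIR (default: current directory).
remove_empty_objects() {
    dir=${1:-.}
    # -size 0 counts 512-byte blocks rounded up, so it matches only empty files.
    find "$dir" -type f -name '*.o' -size 0 -exec rm -f {} +
}
```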

> >
> >> Don't you have bigger problems with file systems due to the crashes?
> >
> > As it happens, no. Maybe because I'm using ext4.
> > Maybe I'm lucky.
> >
> >> > On my system at least, it has no measurable cost,
> >> > likely also because size only looks at headers and metadata.
> >>
> >> For example on OpenBSD, 'size' does not seem to come from binutils, so
> >> there could be portability issues.
> >
> > True, I'm not saying it's perfect.
> >
> >> >
> >> > If others are not interested, I can keep it out of tree.
> >>
> >> I've had problems with disk close to full, so I'm semi-interested if
> >> the solution does not slow down others and it's not too ugly.
> >
> > I think the simplest way to do it is to change makefile to unlink, create
> > then rename. Then you are safe against abrupt killing or crashing make.
> > And with a journaled fs, if you also have e.g. linux ext4 and mount with
> > data=ordered, you are safe against power failures.
> >
> > It shouldn't be hard to do and I don't expect this to have any
> > measurable speed impact.  What do you think?
> 
> I'd prefer a more generic solution, like the hook. What you propose
> wouldn't protect from the disk full scenario.
> 
> >
> >> >
> >> >> >>
> >> >> >> -- PMM
> >> >> >
> >> >> >

Patch

diff --git a/Makefile.target b/Makefile.target
index ce4391f..4dddee5 100644
--- a/Makefile.target
+++ b/Makefile.target
@@ -191,3 +191,10 @@  endif
 
 GENERATED_HEADERS += config-target.h
 Makefile: $(GENERATED_HEADERS)
+
+.SECONDEXPANSION:
+
+.PHONY: CORRUPTBINARY
+
+$(all-obj-y): % : $$(if $$(shell size %), , CORRUPTBINARY)
+