diff mbox

[U-Boot,v3] dfu: Introduction of the "dfu_hash_algo" env variable for checksum method setting

Message ID 1399552067-31208-1-git-send-email-l.majewski@samsung.com
State Superseded
Delegated to: Marek Vasut
Headers show

Commit Message

Łukasz Majewski May 8, 2014, 12:27 p.m. UTC
Up till now the CRC32 of received data was calculated unconditionally.
The standard crc32 implementation causes long delay when large images
were uploaded.

The "dfu_hash_algo" environment variable gives the opportunity to
enable on demand (when e.g. debugging) the hash (crc32) calculation.
It can be done without need to recompile the u-boot binary and reuses
the generic hash framework.

By default the crc32 is NOT calculated anymore.

Tests results:
400 MiB ums.img file
With 		crc32 calculation: 65 sec [avg 6.29 MB/s]
Without 		crc32 calculation: 25 sec [avg 16.17 MB/s]

Signed-off-by: Lukasz Majewski <l.majewski@samsung.com>
Cc: Marek Vasut <marex@denx.de>
---
Changes for v2:
- Utilization of hash_block generic function to calculate CRC32 checksum
Changes for v3:
- Remove dependency on altering the lib/hash.c generic implementation
- Use of hash_update() function to calculate crc32 in the same way as
  it was done with crc32
---
 drivers/dfu/dfu.c |   47 ++++++++++++++++++++++++++++++++++++++++++-----
 1 file changed, 42 insertions(+), 5 deletions(-)

Comments

Marek Vasut May 8, 2014, 1:07 p.m. UTC | #1
On Thursday, May 08, 2014 at 02:27:47 PM, Lukasz Majewski wrote:
> Up till now the CRC32 of received data was calculated unconditionally.
> The standard crc32 implementation causes long delay when large images
> were uploaded.
> 
> The "dfu_hash_algo" environment variable gives the opportunity to
> enable on demand (when e.g. debugging) the hash (crc32) calculation.
> It can be done without need to recompile the u-boot binary and reuses
> the generic hash framework.
> 
> By default the crc32 is NOT calculated anymore.
> 
> Tests results:
> 400 MiB ums.img file
> With 		crc32 calculation: 65 sec [avg 6.29 MB/s]
> Without 		crc32 calculation: 25 sec [avg 16.17 MB/s]
> 
> Signed-off-by: Lukasz Majewski <l.majewski@samsung.com>
> Cc: Marek Vasut <marex@denx.de>

Reviewed-by: Marek Vasut <marex@denx.de>

Best regards,
Marek Vasut
Wolfgang Denk May 9, 2014, 4:27 a.m. UTC | #2
Dear Lukasz Majewski,

In message <1399552067-31208-1-git-send-email-l.majewski@samsung.com> you wrote:
> Up till now the CRC32 of received data was calculated unconditionally.
> The standard crc32 implementation causes long delay when large images
> were uploaded.
> 
> The "dfu_hash_algo" environment variable gives the opportunity to
> enable on demand (when e.g. debugging) the hash (crc32) calculation.
> It can be done without need to recompile the u-boot binary and reuses
> the generic hash framework.
> 
> By default the crc32 is NOT calculated anymore.

I consider this a VARY BAD idea, as it causes a significant decrease
of reliability and robustness of the systems.  Please do not do this.

In any case, if you introduce this, the behaviour should be
documented, and the default setting should be such as to keep the
previous behaviour, i. e. CRC checking should remain on by default.
then people who are willing to trade reliability for a little speed
can still switch it off, but the unawarerest of the users will not
suffer.


Best regards,

Wolfgang Denk
Łukasz Majewski May 9, 2014, 6:52 a.m. UTC | #3
Hi Wolfgang,

> Dear Lukasz Majewski,
> 
> In message <1399552067-31208-1-git-send-email-l.majewski@samsung.com>
> you wrote:
> > Up till now the CRC32 of received data was calculated
> > unconditionally. The standard crc32 implementation causes long
> > delay when large images were uploaded.
> > 
> > The "dfu_hash_algo" environment variable gives the opportunity to
> > enable on demand (when e.g. debugging) the hash (crc32) calculation.
> > It can be done without need to recompile the u-boot binary and
> > reuses the generic hash framework.
> > 
> > By default the crc32 is NOT calculated anymore.
> 
> I consider this a VARY BAD idea, as it causes a significant decrease
> of reliability and robustness of the systems.  Please do not do this.

I do understand that reliability is very important, but please
consider following arguments:

1. Now calculated crc32 is only used for debugging. 

For automated tests I use MD5 and compare this value before sending
data to target via DFU and after I read it. This testing is done purely
on HOST machine.

Please refer to the discussion which we had at previous version of this
patch:

http://thread.gmane.org/gmane.comp.boot-loaders.u-boot/183311/focus=183455

Participants have agreed, that we shall optionally enable crc32 (or
other algorithm) calculation. 

2. The current crc32 implementation is painfully slow (although I have
only L1 enabled on my target). 

3. With large files (like rootfs images) we store data (to medium) with
32 MiB chunks, which means that when we calculate complete crc32 the
image is already written to its final destination.

Of course we could store the rootfs to some "free" space on the eMMC,
then calculate crc32 and store it to the final position. This however
would take considerable time and require wrapping our binaries to
special headers (as described below). 

4. This patch also allows some flexibility: by setting the env variable
we can decide which algorithm to use (crc32, sha1, etc). It is
appealing since we use the hash_* code anyway.

> 
> In any case, if you introduce this, the behaviour should be
> documented, and the default setting should be such as to keep the
> previous behaviour, i. e. CRC checking should remain on by default.
> then people who are willing to trade reliability for a little speed

I would not touch the code if the speedup wouldn't be so significant.
Reducing flashing time of 400 MiB file from 65 s to 25 s is worth its
effort.

> can still switch it off, but the unawarerest of the users will not
> suffer.

As I've stated previously the crc32 in the current dfu implementation
is only informative.

To take the full advantage of it, we would need to modify the dfu-util
to wrap the sent file to some kind of header or locally write some
script to do that. However, this is not specified by the standard and
would be u-boot's extension of the DFU. 

Even more important issue is that it would work only for small files
(like uImage).

> 
> 
> Best regards,
> 
> Wolfgang Denk
>
Wolfgang Denk May 9, 2014, 8:31 a.m. UTC | #4
Dear Lukasz,

In message <20140509085203.31133238@amdc2363> you wrote:
> 
> For automated tests I use MD5 and compare this value before sending
> data to target via DFU and after I read it. This testing is done purely
> on HOST machine.

This is unsufficient.  You should always verify the image on the
target after the download has completed.

> Participants have agreed, that we shall optionally enable crc32 (or
> other algorithm) calculation. 

If this is the default now, it should remain the default.

> 2. The current crc32 implementation is painfully slow (although I have
> only L1 enabled on my target). 

This is an unrelated problem then, which should excluded from this
discussion here.

> 3. With large files (like rootfs images) we store data (to medium) with
> 32 MiB chunks, which means that when we calculate complete crc32 the
> image is already written to its final destination.

You can still detect if the download was corrupted, report a proper
error and initiate a re-download.  This would at least give you a
chance to react to corrupted data.  Just closing the eyes and hoping
no errors will ever happen has always been a bad strategy.

> 4. This patch also allows some flexibility: by setting the env variable
> we can decide which algorithm to use (crc32, sha1, etc). It is
> appealing since we use the hash_* code anyway.

Agreed.  This was not my point.

What I complained about is the change in behaviour.  I asked to make
the existing behaviour the default, so unaware users will not be
affected. Only if you intentionally want some other behaviour you can
then enable this by setting the env variable.

> > In any case, if you introduce this, the behaviour should be
> > documented, and the default setting should be such as to keep the
> > previous behaviour, i. e. CRC checking should remain on by default.
> > then people who are willing to trade reliability for a little speed
> 
> I would not touch the code if the speedup wouldn't be so significant.
> Reducing flashing time of 400 MiB file from 65 s to 25 s is worth its
> effort.

I disagree, if you pay for the speed by reduced reliability, and if
you don't even inform the user about this new behaviour.

Also, I feel it might be worth to investigate why the checksumming is
slow on your system.

> As I've stated previously the crc32 in the current dfu implementation
> is only informative.

It is pretty useful information, isn't it?

> To take the full advantage of it, we would need to modify the dfu-util
> to wrap the sent file to some kind of header or locally write some
> script to do that. However, this is not specified by the standard and
> would be u-boot's extension of the DFU. 

Ok, add this to the many deficientcies of DFU :-(

> Even more important issue is that it would work only for small files
> (like uImage).

Why so? Can we not calculate CRC even when the transfer is broken
down into several chunks?

Best regards,

Wolfgang Denk
Łukasz Majewski May 9, 2014, 9:54 a.m. UTC | #5
Hi Wolfgang,

> Dear Lukasz,
> 
> In message <20140509085203.31133238@amdc2363> you wrote:
> > 
> > For automated tests I use MD5 and compare this value before sending
> > data to target via DFU and after I read it. This testing is done
> > purely on HOST machine.
> 
> This is unsufficient.  You should always verify the image on the
> target after the download has completed.

I agree.

> 
> > Participants have agreed, that we shall optionally enable crc32 (or
> > other algorithm) calculation. 
> 
> If this is the default now, it should remain the default.
> 
> > 2. The current crc32 implementation is painfully slow (although I
> > have only L1 enabled on my target). 
> 
> This is an unrelated problem then, which should excluded from this
> discussion here.
> 
> > 3. With large files (like rootfs images) we store data (to medium)
> > with 32 MiB chunks, which means that when we calculate complete
> > crc32 the image is already written to its final destination.
> 
> You can still detect if the download was corrupted, report a proper
> error and initiate a re-download.  This would at least give you a
> chance to react to corrupted data. 

In this particular case I would need to chop the large file either at
dfu-util or on some script, where each chunk need to have the header
with its crc32. Then before storing each chunk I can assess if data
wasn't corrupted.

This would provide reliability.

Now, even that I have the crc32 calculated for a chunk, I don't have
the original one to compare.

> Just closing the eyes and hoping
> no errors will ever happen has always been a bad strategy.

+1

> 
> > 4. This patch also allows some flexibility: by setting the env
> > variable we can decide which algorithm to use (crc32, sha1, etc).
> > It is appealing since we use the hash_* code anyway.
> 
> Agreed.  This was not my point.
> 
> What I complained about is the change in behaviour.  I asked to make
> the existing behaviour the default, so unaware users will not be
> affected. Only if you intentionally want some other behaviour you can
> then enable this by setting the env variable.

Ok. I will preserve the default behavior. However, personally I think
that for a long term this proposed solution is better.

> 
> > > In any case, if you introduce this, the behaviour should be
> > > documented, and the default setting should be such as to keep the
> > > previous behaviour, i. e. CRC checking should remain on by
> > > default. then people who are willing to trade reliability for a
> > > little speed
> > 
> > I would not touch the code if the speedup wouldn't be so
> > significant. Reducing flashing time of 400 MiB file from 65 s to 25
> > s is worth its effort.
> 
> I disagree, if you pay for the speed by reduced reliability, and if
> you don't even inform the user about this new behaviour.
> 
> Also, I feel it might be worth to investigate why the checksumming is
> slow on your system.
> 
> > As I've stated previously the crc32 in the current dfu
> > implementation is only informative.
> 
> It is pretty useful information, isn't it?

It depends what do you want to do with it. If you have target
connected via serial to some test setup and log this output and process
it on HOST afterwards, then it is useful. 

Otherwise, you only see on console the CRC, which you can by hand
compare with crc calculated on your host. And this information displays
just after you stored the data to the medium (and corrupted the
previous one).

> 
> > To take the full advantage of it, we would need to modify the
> > dfu-util to wrap the sent file to some kind of header or locally
> > write some script to do that. However, this is not specified by the
> > standard and would be u-boot's extension of the DFU. 
> 
> Ok, add this to the many deficientcies of DFU :-(

The standard only allow the file which is the input to dfu-util to be
protected by CRC. Then dfu-util check this value and strips off the
header.

> 
> > Even more important issue is that it would work only for small files
> > (like uImage).
> 
> Why so? Can we not calculate CRC even when the transfer is broken
> down into several chunks?

To do that one would need to:

- chop the large file to several smaller ones (and the chunk size can
  be different for each platform and must be know for HOST utils)
- calculate crc32 for each chunk
- wrap it to some header not conforming to the DFU standard -it would
  be the u-boot extension
- send each chunk separately to target - by calling dfu-util several
  times.

Handling of this would be difficult because of the need of DFU state
machine extension.


> 
> Best regards,
> 
> Wolfgang Denk
>
Tom Rini May 12, 2014, 2:45 p.m. UTC | #6
On Fri, May 09, 2014 at 10:31:54AM +0200, Wolfgang Denk wrote:
> Dear Lukasz,
> 
> In message <20140509085203.31133238@amdc2363> you wrote:
> > 
> > For automated tests I use MD5 and compare this value before sending
> > data to target via DFU and after I read it. This testing is done purely
> > on HOST machine.
> 
> This is unsufficient.  You should always verify the image on the
> target after the download has completed.

True.  But this patch doesn't really change what you would have to do,
and arguably make it easier.

> > Participants have agreed, that we shall optionally enable crc32 (or
> > other algorithm) calculation. 
> 
> If this is the default now, it should remain the default.

Keep in mind what this current default is.  We say "here was the CRC32".
We do not compare it with an expected value nor do we have the ability
to since we're not passed from the host what the value was.

> > 2. The current crc32 implementation is painfully slow (although I have
> > only L1 enabled on my target). 
> 
> This is an unrelated problem then, which should excluded from this
> discussion here.

Agreed.

> > 3. With large files (like rootfs images) we store data (to medium) with
> > 32 MiB chunks, which means that when we calculate complete crc32 the
> > image is already written to its final destination.
> 
> You can still detect if the download was corrupted, report a proper
> error and initiate a re-download.  This would at least give you a
> chance to react to corrupted data.  Just closing the eyes and hoping
> no errors will ever happen has always been a bad strategy.

Before and after this change, only if the console is being monitored by
some script.  We do not nor are we given an expected hash so we cannot
say data was corrupted.

> > 4. This patch also allows some flexibility: by setting the env variable
> > we can decide which algorithm to use (crc32, sha1, etc). It is
> > appealing since we use the hash_* code anyway.
> 
> Agreed.  This was not my point.
> 
> What I complained about is the change in behaviour.  I asked to make
> the existing behaviour the default, so unaware users will not be
> affected. Only if you intentionally want some other behaviour you can
> then enable this by setting the env variable.

Well, looking at mainline usage of DFU, Lukasz is speaking for about
half of the users / implementors.  Since Denx is working with the other
half, can you shed some light on actual use vs theoretical
possibilities?
Łukasz Majewski May 15, 2014, 7:09 a.m. UTC | #7
Hi Tom, Wolfgang,

> On Fri, May 09, 2014 at 10:31:54AM +0200, Wolfgang Denk wrote:
> > Dear Lukasz,
> > 
> > In message <20140509085203.31133238@amdc2363> you wrote:
> > > 
> > > For automated tests I use MD5 and compare this value before
> > > sending data to target via DFU and after I read it. This testing
> > > is done purely on HOST machine.
> > 
> > This is unsufficient.  You should always verify the image on the
> > target after the download has completed.
> 
> True.  But this patch doesn't really change what you would have to do,
> and arguably make it easier.
> 
> > > Participants have agreed, that we shall optionally enable crc32
> > > (or other algorithm) calculation. 
> > 
> > If this is the default now, it should remain the default.
> 
> Keep in mind what this current default is.  We say "here was the
> CRC32". We do not compare it with an expected value nor do we have
> the ability to since we're not passed from the host what the value
> was.
> 
> > > 2. The current crc32 implementation is painfully slow (although I
> > > have only L1 enabled on my target). 
> > 
> > This is an unrelated problem then, which should excluded from this
> > discussion here.
> 
> Agreed.
> 
> > > 3. With large files (like rootfs images) we store data (to
> > > medium) with 32 MiB chunks, which means that when we calculate
> > > complete crc32 the image is already written to its final
> > > destination.
> > 
> > You can still detect if the download was corrupted, report a proper
> > error and initiate a re-download.  This would at least give you a
> > chance to react to corrupted data.  Just closing the eyes and hoping
> > no errors will ever happen has always been a bad strategy.
> 
> Before and after this change, only if the console is being monitored
> by some script.  We do not nor are we given an expected hash so we
> cannot say data was corrupted.
> 
> > > 4. This patch also allows some flexibility: by setting the env
> > > variable we can decide which algorithm to use (crc32, sha1, etc).
> > > It is appealing since we use the hash_* code anyway.
> > 
> > Agreed.  This was not my point.
> > 
> > What I complained about is the change in behaviour.  I asked to make
> > the existing behaviour the default, so unaware users will not be
> > affected. Only if you intentionally want some other behaviour you
> > can then enable this by setting the env variable.
> 
> Well, looking at mainline usage of DFU, Lukasz is speaking for about
> half of the users / implementors.  Since Denx is working with the
> other half, can you shed some light on actual use vs theoretical
> possibilities?
> 

I don't want to urge anybody on making any conclusion here :-), but I
would be very grateful if we could come up with an agreement.

As I've stated previously, my opinion is similar to the one presented
by Tom in this message.

For me it would be best to not calculate any checksum on default and
only enable it when needed.
Heiko Schocher May 15, 2014, 9:27 a.m. UTC | #8
Hello Lukasz,

Sorry for answering so late to this thread ...

Am 15.05.2014 09:09, schrieb Lukasz Majewski:
> Hi Tom, Wolfgang,
>
>> On Fri, May 09, 2014 at 10:31:54AM +0200, Wolfgang Denk wrote:
>>> Dear Lukasz,
>>>
>>> In message<20140509085203.31133238@amdc2363>  you wrote:
>>>>
>>>> For automated tests I use MD5 and compare this value before
>>>> sending data to target via DFU and after I read it. This testing
>>>> is done purely on HOST machine.
>>>
>>> This is unsufficient.  You should always verify the image on the
>>> target after the download has completed.
>>
>> True.  But this patch doesn't really change what you would have to do,
>> and arguably make it easier.
>>
>>>> Participants have agreed, that we shall optionally enable crc32
>>>> (or other algorithm) calculation.
>>>
>>> If this is the default now, it should remain the default.
>>
>> Keep in mind what this current default is.  We say "here was the
>> CRC32". We do not compare it with an expected value nor do we have
>> the ability to since we're not passed from the host what the value
>> was.
>>
>>>> 2. The current crc32 implementation is painfully slow (although I
>>>> have only L1 enabled on my target).
>>>
>>> This is an unrelated problem then, which should excluded from this
>>> discussion here.
>>
>> Agreed.
>>
>>>> 3. With large files (like rootfs images) we store data (to
>>>> medium) with 32 MiB chunks, which means that when we calculate
>>>> complete crc32 the image is already written to its final
>>>> destination.
>>>
>>> You can still detect if the download was corrupted, report a proper
>>> error and initiate a re-download.  This would at least give you a
>>> chance to react to corrupted data.  Just closing the eyes and hoping
>>> no errors will ever happen has always been a bad strategy.
>>
>> Before and after this change, only if the console is being monitored
>> by some script.  We do not nor are we given an expected hash so we
>> cannot say data was corrupted.
>>
>>>> 4. This patch also allows some flexibility: by setting the env
>>>> variable we can decide which algorithm to use (crc32, sha1, etc).
>>>> It is appealing since we use the hash_* code anyway.
>>>
>>> Agreed.  This was not my point.
>>>
>>> What I complained about is the change in behaviour.  I asked to make
>>> the existing behaviour the default, so unaware users will not be
>>> affected. Only if you intentionally want some other behaviour you
>>> can then enable this by setting the env variable.
>>
>> Well, looking at mainline usage of DFU, Lukasz is speaking for about
>> half of the users / implementors.  Since Denx is working with the
>> other half, can you shed some light on actual use vs theoretical
>> possibilities?
>>
>
> I don't want to urge anybody on making any conclusion here :-), but I
> would be very grateful if we could come up with an agreement.
>
> As I've stated previously, my opinion is similar to the one presented
> by Tom in this message.
>
> For me it would be best to not calculate any checksum on default and
> only enable it when needed.

Hmm.. as we use the calculated crc only for printing it on the console,
the question is really, why should we calculate it?

I try this patch on the siemens boards and report if the speed
impact is there also so big as in your tests. (which board was this?
Are there caches enabled?)

And I ask the customer of the siemens boards, if they check the
crc value on the u-boot console output, if not, I vote for droping
the crc calculation as default ...

BTW: Should such a crc check not be added to dfu-util and u-boot?

bye,
Heiko
Wolfgang Denk May 15, 2014, 11:19 a.m. UTC | #9
Dear Lukasz,

In message <20140515090904.32f1d13d@amdc2363> you wrote:
> 
> > > What I complained about is the change in behaviour.  I asked to make
> > > the existing behaviour the default, so unaware users will not be
> > > affected. Only if you intentionally want some other behaviour you
> > > can then enable this by setting the env variable.
> > 
> > Well, looking at mainline usage of DFU, Lukasz is speaking for about
> > half of the users / implementors.  Since Denx is working with the
> > other half, can you shed some light on actual use vs theoretical
> > possibilities?
> 
> I don't want to urge anybody on making any conclusion here :-), but I
> would be very grateful if we could come up with an agreement.
> 
> As I've stated previously, my opinion is similar to the one presented
> by Tom in this message.
> 
> For me it would be best to not calculate any checksum on default and
> only enable it when needed.

I asked Heiko to run some actual tests on the boards where he has to
maintain DFU for.  For a 288 MiB image he did not measure any
difference - with your patch applied, both with and without CRC
enabled, we would get the same (slow) 1:54 minutes download time.

This reinforces my speculation that you are actually addressing the
wrong problem.  Instead of adding new code and environment variables
and making the system even more complex, we should just leave
everything as is, and you should try to find out why the CRC
calculation is so low for you.  Checking if caches are enabled is
probably among the things that should be done first.


Regarding the checksumming topic in general:  the fact that the DFU
standard defines a method to verify the checksum of the image (dwCRC
field in the DFU File Suffix), but does not transmit this vital data
to the target so the actual file download and storage procedure on the
target is completely unprotected is IMO a serious design flaw of the
DFU protocl.  Do you think it would be possible to have this augmented
/ fixed?



Best regards,

Wolfgang Denk
Łukasz Majewski May 15, 2014, 1:43 p.m. UTC | #10
Hi Wolfgang,

> Dear Lukasz,
> 
> In message <20140515090904.32f1d13d@amdc2363> you wrote:
> > 
> > > > What I complained about is the change in behaviour.  I asked to
> > > > make the existing behaviour the default, so unaware users will
> > > > not be affected. Only if you intentionally want some other
> > > > behaviour you can then enable this by setting the env variable.
> > > 
> > > Well, looking at mainline usage of DFU, Lukasz is speaking for
> > > about half of the users / implementors.  Since Denx is working
> > > with the other half, can you shed some light on actual use vs
> > > theoretical possibilities?
> > 
> > I don't want to urge anybody on making any conclusion here :-), but
> > I would be very grateful if we could come up with an agreement.
> > 
> > As I've stated previously, my opinion is similar to the one
> > presented by Tom in this message.
> > 
> > For me it would be best to not calculate any checksum on default and
> > only enable it when needed.
> 
> I asked Heiko to run some actual tests on the boards where he has to
> maintain DFU for.  For a 288 MiB image he did not measure any
> difference - with your patch applied, both with and without CRC
> enabled, we would get the same (slow) 1:54 minutes download time.

As I've spoken with Heiko, am33xx uses NAND memory. I deal with eMMC.
Moreover, the size of "chunks" are different - 1 MiB and 32 MiB.

I must double check for the rationale for chunk size of 32 MiB at
Trats/Trats2 boards. I suspect, that eMMC write performance depend
on that.

I will perform some performance tests with 1 MiB chunk size and share
the result.

> 
> This reinforces my speculation that you are actually addressing the
> wrong problem.  Instead of adding new code and environment variables
> and making the system even more complex, we should just leave
> everything as is, 

During working on this patch I've replaced the crc32() method with the
call to hash_method(), which IMHO is welcome.

I also don't personally like the crc32, hence I like the choice which
this patch gives me to use other algorithm (for which I've got HW
support on my platform - e.g. MD5).

> and you should try to find out why the CRC
> calculation is so low for you.  Checking if caches are enabled is
> probably among the things that should be done first.

L1 is enabled. L2 has been disabled on purpose (power consumption
reduction). 

> 
> 
> Regarding the checksumming topic in general:  the fact that the DFU
> standard defines a method to verify the checksum of the image (dwCRC
> field in the DFU File Suffix), but does not transmit this vital data
> to the target so the actual file download and storage procedure on the
> target is completely unprotected is IMO a serious design flaw of the
> DFU protocl.  Do you think it would be possible to have this augmented
> / fixed?

Please note that the last revision of DFU is from 2004. I've contacted
Greg KH (one of the original authors) and he replied that no new attempt
to revise the standard was made. 

It is possible to fix this problem, by augmenting the current
implementation.

I saw the idea [*] of defining only one (or special) alt setting and in
this one file embed the header with integrity data, list of
files/partitions images included in this file, and even the information
to where we want to write. In this way we would still comply with DFU
1.1 standard, which would be "wrapped" to some host scripts and special
u-boot code. It even would be possible to leave the current code
untouched. 

The original link with the idea [*]:
http://codelectron.wordpress.com/2014/02/28/flexible-firmware-upgrade/

And the ML discussion:
http://thread.gmane.org/gmane.comp.boot-loaders.u-boot/181715/match=proposal+hack+efficient+usb+dfu+linux+based+boards

The best however, would be to revise the standard to include such
functionality to it. In the same time I'm fully aware that this is
very unlikely to happen.

> 
> 
> 
> Best regards,
> 
> Wolfgang Denk
>
Wolfgang Denk May 15, 2014, 2:07 p.m. UTC | #11
Dear Lukasz,

In message <20140515154334.626923b4@amdc2363> you wrote:
> 
> > This reinforces my speculation that you are actually addressing the
> > wrong problem.  Instead of adding new code and environment variables
> > and making the system even more complex, we should just leave
> > everything as is, 
> 
> During working on this patch I've replaced the crc32() method with the
> call to hash_method(), which IMHO is welcome.

Yes, indeed this is highly welcome.  Thanks a lot for that!

> I also don't personally like the crc32, hence I like the choice which
> this patch gives me to use other algorithm (for which I've got HW
> support on my platform - e.g. MD5).

Well, is this really useful?  dfu-utils provides only CRC caculation,
so where would you get the reference value for any other checksum metod
from?

> > and you should try to find out why the CRC
> > calculation is so low for you.  Checking if caches are enabled is
> > probably among the things that should be done first.
> 
> L1 is enabled. L2 has been disabled on purpose (power consumption
> reduction). 

This certainly contributes to slow code execution.

> Please note that the last revision of DFU is from 2004. I've contacted
> Greg KH (one of the original authors) and he replied that no new attempt
> to revise the standard was made. 

This may just mean that users were just happy with the current
situation.  It's definitely better than if changed had been proposed
but rejected.

> The best however, would be to revise the standard to include such
> functionality to it. In the same time I'm fully aware that this is
> very unlikely to happen.

Why do you think it is unlikely?  Of course, it would require that
someone comes up with such a proposal in the first place.  But you
sound as if you were certain a proposal had no chance for being
considered.  I may be naive, but should we not at least try before
giving up?

Best regards,

Wolfgang Denk
Łukasz Majewski May 16, 2014, 6:08 a.m. UTC | #12
Hi Wolfgang,

> Dear Lukasz,
> 
> In message <20140515154334.626923b4@amdc2363> you wrote:
> > 
> > > This reinforces my speculation that you are actually addressing
> > > the wrong problem.  Instead of adding new code and environment
> > > variables and making the system even more complex, we should just
> > > leave everything as is, 
> > 
> > During working on this patch I've replaced the crc32() method with
> > the call to hash_method(), which IMHO is welcome.
> 
> Yes, indeed this is highly welcome.  Thanks a lot for that!
> 
> > I also don't personally like the crc32, hence I like the choice
> > which this patch gives me to use other algorithm (for which I've
> > got HW support on my platform - e.g. MD5).
> 
> Well, is this really useful?  dfu-utils provides only CRC caculation,
> so where would you get the reference value for any other checksum
> metod from?

I was rather thinking about a test setup with my target connected via
serial console to HOST machine. Then I could compare the CRC32/MD5/SHA1
just after sending the data.

For my target it is better to use MD5 or SHA1 since support for them is
provided via the specialized, embedded crypto IP.

> 
> > > and you should try to find out why the CRC
> > > calculation is so low for you.  Checking if caches are enabled is
> > > probably among the things that should be done first.
> > 
> > L1 is enabled. L2 has been disabled on purpose (power consumption
> > reduction). 
> 
> This certainly contributes to slow code execution.
> 
> > Please note that the last revision of DFU is from 2004. I've
> > contacted Greg KH (one of the original authors) and he replied that
> > no new attempt to revise the standard was made. 
> 
> This may just mean that users were just happy with the current
> situation. 

It is hard to say. 

> It's definitely better than if changed had been proposed
> but rejected.

True.

> 
> > The best however, would be to revise the standard to include such
> > functionality to it. In the same time I'm fully aware that this is
> > very unlikely to happen.
> 
> Why do you think it is unlikely? 

I don't have the experience with preparing USB standards, but I assume
that it is somewhat hard to revise the standard after 10 years.

> Of course, it would require that
> someone comes up with such a proposal in the first place.  But you
> sound as if you were certain a proposal had no chance for being
> considered. 

No, this is not what I meant.

> I may be naive, but should we not at least try before
> giving up?

Unfortunately my time budget is limited and I feel like this has lower
priority than fixing/solving current DFU problems.

> 
> Best regards,
> 
> Wolfgang Denk
>
Łukasz Majewski May 16, 2014, 8:58 a.m. UTC | #13
Hi Wolfgang, Tom,

> Hi Wolfgang,
> 
> > Dear Lukasz,
> > 
> > In message <20140515090904.32f1d13d@amdc2363> you wrote:
> > > 
> > > > > What I complained about is the change in behaviour.  I asked
> > > > > to make the existing behaviour the default, so unaware users
> > > > > will not be affected. Only if you intentionally want some
> > > > > other behaviour you can then enable this by setting the env
> > > > > variable.
> > > > 
> > > > Well, looking at mainline usage of DFU, Lukasz is speaking for
> > > > about half of the users / implementors.  Since Denx is working
> > > > with the other half, can you shed some light on actual use vs
> > > > theoretical possibilities?
> > > 
> > > I don't want to urge anybody on making any conclusion here :-),
> > > but I would be very grateful if we could come up with an
> > > agreement.
> > > 
> > > As I've stated previously, my opinion is similar to the one
> > > presented by Tom in this message.
> > > 
> > > For me it would be best to not calculate any checksum on default
> > > and only enable it when needed.
> > 
> > I asked Heiko to run some actual tests on the boards where he has to
> > maintain DFU for.  For a 288 MiB image he did not measure any
> > difference - with your patch applied, both with and without CRC
> > enabled, we would get the same (slow) 1:54 minutes download time.
> 
> As I've spoken with Heiko, am33xx uses NAND memory. I deal with eMMC.
> Moreover, the size of "chunks" are different - 1 MiB and 32 MiB.
> 
> I must double check for the rationale for chunk size of 32 MiB at
> Trats/Trats2 boards. I suspect, that eMMC write performance depend
> on that.
> 
> I will perform some performance tests with 1 MiB chunk size and share
> the result.

Unfortunately the 32 MiB is fixed for our platform. since lthor uses it
by default.

> 
> > 
> > This reinforces my speculation that you are actually addressing the
> > wrong problem.  Instead of adding new code and environment variables
> > and making the system even more complex, we should just leave
> > everything as is, 
> 
> During working on this patch I've replaced the crc32() method with the
> call to hash_method(), which IMHO is welcome.
> 
> I also don't personally like the crc32, hence I like the choice which
> this patch gives me to use other algorithm (for which I've got HW
> support on my platform - e.g. MD5).
> 
> > and you should try to find out why the CRC
> > calculation is so low for you.  Checking if caches are enabled is
> > probably among the things that should be done first.
> 
> L1 is enabled. L2 has been disabled on purpose (power consumption
> reduction). 

Regarding L2 - our platform requires SMC calls to enable and manage L2
cache. In my opinion support for this in u-boot is an overkill.


Can we conclude this whole discussion? The main point was if we should
keep calculating and displaying crc32 as default for DFU transfers.

I'm for the option to NOT display and calculate it by default (PATCH
v3).
Heiko Schocher May 19, 2014, 2:02 p.m. UTC | #14
Hello Lukasz,

Am 16.05.2014 10:58, schrieb Lukasz Majewski:
> Hi Wolfgang, Tom,
>
>> Hi Wolfgang,
>>
>>> Dear Lukasz,
>>>
>>> In message<20140515090904.32f1d13d@amdc2363>  you wrote:
>>>>
>>>>>> What I complained about is the change in behaviour.  I asked
>>>>>> to make the existing behaviour the default, so unaware users
>>>>>> will not be affected. Only if you intentionally want some
>>>>>> other behaviour you can then enable this by setting the env
>>>>>> variable.
>>>>>
>>>>> Well, looking at mainline usage of DFU, Lukasz is speaking for
>>>>> about half of the users / implementors.  Since Denx is working
>>>>> with the other half, can you shed some light on actual use vs
>>>>> theoretical possibilities?
>>>>
>>>> I don't want to urge anybody on making any conclusion here :-),
>>>> but I would be very grateful if we could come up with an
>>>> agreement.
>>>>
>>>> As I've stated previously, my opinion is similar to the one
>>>> presented by Tom in this message.
>>>>
>>>> For me it would be best to not calculate any checksum on default
>>>> and only enable it when needed.
>>>
>>> I asked Heiko to run some actual tests on the boards where he has to
>>> maintain DFU for.  For a 288 MiB image he did not measure any
>>> difference - with your patch applied, both with and without CRC
>>> enabled, we would get the same (slow) 1:54 minutes download time.
>>
>> As I've spoken with Heiko, am33xx uses NAND memory. I deal with eMMC.
>> Moreover, the size of "chunks" are different - 1 MiB and 32 MiB.
>>
>> I must double check for the rationale for chunk size of 32 MiB at
>> Trats/Trats2 boards. I suspect, that eMMC write performance depend
>> on that.
>>
>> I will perform some performance tests with 1 MiB chunk size and share
>> the result.
>
> Unfortunately the 32 MiB is fixed for our platform. since lthor uses it
> by default.
>
>>
>>>
>>> This reinforces my speculation that you are actually addressing the
>>> wrong problem.  Instead of adding new code and environment variables
>>> and making the system even more complex, we should just leave
>>> everything as is,
>>
>> During working on this patch I've replaced the crc32() method with the
>> call to hash_method(), which IMHO is welcome.
>>
>> I also don't personally like the crc32, hence I like the choice which
>> this patch gives me to use other algorithm (for which I've got HW
>> support on my platform - e.g. MD5).
>>
>>> and you should try to find out why the CRC
>>> calculation is so low for you.  Checking if caches are enabled is
>>> probably among the things that should be done first.
>>
>> L1 is enabled. L2 has been disabled on purpose (power consumption
>> reduction).
>
> Regarding L2 - our platform requires SMC calls to enable and manage L2
> cache. In my opinion support for this in u-boot is an overkill.
>
>
> Can we conclude this whole discussion? The main point was if we should
> keep calculating and displaying crc32 as default for DFU transfers.
>
> I'm for the option to NOT display and calculate it by default (PATCH
> v3).

I talked with the siemens board customer, they also do not check/use
the displayed value from U-Boot ...

So, for me it is OK to not display this value ... but we should add
to DFU such a check ... or?

bye,
Heiko
Lukasz Majewski May 20, 2014, 5:22 p.m. UTC | #15
Hi Heiko,

> 
> Hello Lukasz,
> 
> Am 16.05.2014 10:58, schrieb Lukasz Majewski:
> > Hi Wolfgang, Tom,
> >
> >> Hi Wolfgang,
> >>
> >>> Dear Lukasz,
> >>>
> >>> In message<20140515090904.32f1d13d@amdc2363>  you wrote:
> >>>>
> >>>>>> What I complained about is the change in behaviour.  I asked
> >>>>>> to make the existing behaviour the default, so unaware users
> >>>>>> will not be affected. Only if you intentionally want some
> >>>>>> other behaviour you can then enable this by setting the env
> >>>>>> variable.
> >>>>>
> >>>>> Well, looking at mainline usage of DFU, Lukasz is speaking for
> >>>>> about half of the users / implementors.  Since Denx is working
> >>>>> with the other half, can you shed some light on actual use vs
> >>>>> theoretical possibilities?
> >>>>
> >>>> I don't want to urge anybody on making any conclusion here :-),
> >>>> but I would be very grateful if we could come up with an
> >>>> agreement.
> >>>>
> >>>> As I've stated previously, my opinion is similar to the one
> >>>> presented by Tom in this message.
> >>>>
> >>>> For me it would be best to not calculate any checksum on default
> >>>> and only enable it when needed.
> >>>
> >>> I asked Heiko to run some actual tests on the boards where he has
> >>> to maintain DFU for.  For a 288 MiB image he did not measure any
> >>> difference - with your patch applied, both with and without CRC
> >>> enabled, we would get the same (slow) 1:54 minutes download time.
> >>
> >> As I've spoken with Heiko, am33xx uses NAND memory. I deal with
> >> eMMC. Moreover, the size of "chunks" are different - 1 MiB and 32
> >> MiB.
> >>
> >> I must double check for the rationale for chunk size of 32 MiB at
> >> Trats/Trats2 boards. I suspect, that eMMC write performance depend
> >> on that.
> >>
> >> I will perform some performance tests with 1 MiB chunk size and
> >> share the result.
> >
> > Unfortunately the 32 MiB is fixed for our platform. since lthor
> > uses it by default.
> >
> >>
> >>>
> >>> This reinforces my speculation that you are actually addressing
> >>> the wrong problem.  Instead of adding new code and environment
> >>> variables and making the system even more complex, we should just
> >>> leave everything as is,
> >>
> >> During working on this patch I've replaced the crc32() method with
> >> the call to hash_method(), which IMHO is welcome.
> >>
> >> I also don't personally like the crc32, hence I like the choice
> >> which this patch gives me to use other algorithm (for which I've
> >> got HW support on my platform - e.g. MD5).
> >>
> >>> and you should try to find out why the CRC
> >>> calculation is so low for you.  Checking if caches are enabled is
> >>> probably among the things that should be done first.
> >>
> >> L1 is enabled. L2 has been disabled on purpose (power consumption
> >> reduction).
> >
> > Regarding L2 - our platform requires SMC calls to enable and manage
> > L2 cache. In my opinion support for this in u-boot is an overkill.
> >
> >
> > Can we conclude this whole discussion? The main point was if we
> > should keep calculating and displaying crc32 as default for DFU
> > transfers.
> >
> > I'm for the option to NOT display and calculate it by default (PATCH
> > v3).
> 
> I talked with the siemens board customer, they also do not check/use
> the displayed value from U-Boot ...
> 
> So, for me it is OK to not display this value ...

Ok, so we now have a consensus here.

> but we should add
> to DFU such a check ... or?

Yes, we should add a way to send the CRC32/MD5/SHA1 to our boards.

Unfortunately I'm out of office now, so it is hard for me to develop
something.

However, the rough idea would be to send the CRC32 (or any other
checksum) to a separate alt setting.
> 
> bye,
> Heiko
Łukasz Majewski May 22, 2014, 9:46 a.m. UTC | #16
Hi Heiko,

> Hello Lukasz,
> 
> Am 16.05.2014 10:58, schrieb Lukasz Majewski:
> > Hi Wolfgang, Tom,
> >
> >> Hi Wolfgang,
> >>
> >>> Dear Lukasz,
> >>>
> >>> In message<20140515090904.32f1d13d@amdc2363>  you wrote:
> >>>>
> >>>>>> What I complained about is the change in behaviour.  I asked
> >>>>>> to make the existing behaviour the default, so unaware users
> >>>>>> will not be affected. Only if you intentionally want some
> >>>>>> other behaviour you can then enable this by setting the env
> >>>>>> variable.
> >>>>>
> >>>>> Well, looking at mainline usage of DFU, Lukasz is speaking for
> >>>>> about half of the users / implementors.  Since Denx is working
> >>>>> with the other half, can you shed some light on actual use vs
> >>>>> theoretical possibilities?
> >>>>
> >>>> I don't want to urge anybody on making any conclusion here :-),
> >>>> but I would be very grateful if we could come up with an
> >>>> agreement.
> >>>>
> >>>> As I've stated previously, my opinion is similar to the one
> >>>> presented by Tom in this message.
> >>>>
> >>>> For me it would be best to not calculate any checksum on default
> >>>> and only enable it when needed.
> >>>
> >>> I asked Heiko to run some actual tests on the boards where he has
> >>> to maintain DFU for.  For a 288 MiB image he did not measure any
> >>> difference - with your patch applied, both with and without CRC
> >>> enabled, we would get the same (slow) 1:54 minutes download time.
> >>
> >> As I've spoken with Heiko, am33xx uses NAND memory. I deal with
> >> eMMC. Moreover, the size of "chunks" are different - 1 MiB and 32
> >> MiB.
> >>
> >> I must double check for the rationale for chunk size of 32 MiB at
> >> Trats/Trats2 boards. I suspect, that eMMC write performance depend
> >> on that.
> >>
> >> I will perform some performance tests with 1 MiB chunk size and
> >> share the result.
> >
> > Unfortunately the 32 MiB is fixed for our platform. since lthor
> > uses it by default.
> >
> >>
> >>>
> >>> This reinforces my speculation that you are actually addressing
> >>> the wrong problem.  Instead of adding new code and environment
> >>> variables and making the system even more complex, we should just
> >>> leave everything as is,
> >>
> >> During working on this patch I've replaced the crc32() method with
> >> the call to hash_method(), which IMHO is welcome.
> >>
> >> I also don't personally like the crc32, hence I like the choice
> >> which this patch gives me to use other algorithm (for which I've
> >> got HW support on my platform - e.g. MD5).
> >>
> >>> and you should try to find out why the CRC
> >>> calculation is so low for you.  Checking if caches are enabled is
> >>> probably among the things that should be done first.
> >>
> >> L1 is enabled. L2 has been disabled on purpose (power consumption
> >> reduction).
> >
> > Regarding L2 - our platform requires SMC calls to enable and manage
> > L2 cache. In my opinion support for this in u-boot is an overkill.
> >
> >
> > Can we conclude this whole discussion? The main point was if we
> > should keep calculating and displaying crc32 as default for DFU
> > transfers.
> >
> > I'm for the option to NOT display and calculate it by default (PATCH
> > v3).
> 
> I talked with the siemens board customer, they also do not check/use
> the displayed value from U-Boot ...
> 
> So, for me it is OK to not display this value ... 

Applied this patch (with no default CRC32 calculation - v3) to
u-boot-dfu tree.

> but we should add
> to DFU such a check ... or?
> 
> bye,
> Heiko
diff mbox

Patch

diff --git a/drivers/dfu/dfu.c b/drivers/dfu/dfu.c
index 51b1026..c06f4cc 100644
--- a/drivers/dfu/dfu.c
+++ b/drivers/dfu/dfu.c
@@ -13,6 +13,7 @@ 
 #include <mmc.h>
 #include <fat.h>
 #include <dfu.h>
+#include <hash.h>
 #include <linux/list.h>
 #include <linux/compiler.h>
 
@@ -20,6 +21,7 @@  static bool dfu_reset_request;
 static LIST_HEAD(dfu_list);
 static int dfu_alt_num;
 static int alt_num_cnt;
+static struct hash_algo *dfu_hash_algo;
 
 bool dfu_reset(void)
 {
@@ -99,6 +101,24 @@  unsigned char *dfu_get_buf(void)
 	return dfu_buf;
 }
 
+static char *dfu_get_hash_algo(void)
+{
+	char *s;
+
+	s = getenv("dfu_hash_algo");
+	if (!s)
+		return NULL;
+
+	if (!strcmp(s, "crc32")) {
+		debug("%s: DFU hash method: %s\n", __func__, s);
+		return s;
+	}
+
+	error("DFU hash method: %s not supported!\n", s);
+
+	return NULL;
+}
+
 static int dfu_write_buffer_drain(struct dfu_entity *dfu)
 {
 	long w_size;
@@ -109,8 +129,9 @@  static int dfu_write_buffer_drain(struct dfu_entity *dfu)
 	if (w_size == 0)
 		return 0;
 
-	/* update CRC32 */
-	dfu->crc = crc32(dfu->crc, dfu->i_buf_start, w_size);
+	if (dfu_hash_algo)
+		dfu_hash_algo->hash_update(dfu_hash_algo, &dfu->crc,
+					   dfu->i_buf_start, w_size, 0);
 
 	ret = dfu->write_medium(dfu, dfu->offset, dfu->i_buf_start, &w_size);
 	if (ret)
@@ -134,7 +155,9 @@  int dfu_flush(struct dfu_entity *dfu, void *buf, int size, int blk_seq_num)
 	if (dfu->flush_medium)
 		ret = dfu->flush_medium(dfu);
 
-	printf("\nDFU complete CRC32: 0x%08x\n", dfu->crc);
+	if (dfu_hash_algo)
+		printf("\nDFU complete %s: 0x%08x\n", dfu_hash_algo->name,
+		       dfu->crc);
 
 	/* clear everything */
 	dfu_free_buf();
@@ -234,7 +257,11 @@  static int dfu_read_buffer_fill(struct dfu_entity *dfu, void *buf, int size)
 		/* consume */
 		if (chunk > 0) {
 			memcpy(buf, dfu->i_buf, chunk);
-			dfu->crc = crc32(dfu->crc, buf, chunk);
+			if (dfu_hash_algo)
+				dfu_hash_algo->hash_update(dfu_hash_algo,
+							   &dfu->crc, buf,
+							   chunk, 0);
+
 			dfu->i_buf += chunk;
 			dfu->b_left -= chunk;
 			dfu->r_left -= chunk;
@@ -318,7 +345,9 @@  int dfu_read(struct dfu_entity *dfu, void *buf, int size, int blk_seq_num)
 	}
 
 	if (ret < size) {
-		debug("%s: %s CRC32: 0x%x\n", __func__, dfu->name, dfu->crc);
+		if (dfu_hash_algo)
+			debug("%s: %s %s: 0x%x\n", __func__, dfu->name,
+			      dfu_hash_algo->name, dfu->crc);
 		puts("\nUPLOAD ... done\nCtrl+C to exit ...\n");
 
 		dfu_free_buf();
@@ -393,6 +422,14 @@  int dfu_config_entities(char *env, char *interface, int num)
 	dfu_alt_num = dfu_find_alt_num(env);
 	debug("%s: dfu_alt_num=%d\n", __func__, dfu_alt_num);
 
+	dfu_hash_algo = NULL;
+	s = dfu_get_hash_algo();
+	if (s) {
+		ret = hash_lookup_algo(s, &dfu_hash_algo);
+		if (ret)
+			error("Hash algorithm %s not supported\n", s);
+	}
+
 	dfu = calloc(sizeof(*dfu), dfu_alt_num);
 	if (!dfu)
 		return -1;