diff mbox series

[v9,3/7] env: Allow U-Boot scripts to be placed in a .env file

Message ID 20211019164418.v9.3.Ie78bfbfca0d01d9cba501e127f446ec48e1f7afe@changeid
State Superseded
Delegated to: Tom Rini
Headers show
Series env: Allow environment in text files | expand

Commit Message

Simon Glass Oct. 19, 2021, 10:44 p.m. UTC
At present U-Boot environment variables, and thus scripts, are defined
by CONFIG_EXTRA_ENV_SETTINGS. It is painful to add large amounts of text
to this file and dealing with quoting and newlines is harder than it
should be. It would be better if we could just type the script into a
text file and have it included by U-Boot.

Add a feature that brings in a .env file associated with the board
config, if present. To use it, create a file in a board/<vendor>
directory, typically called <board>.env and controlled by the
CONFIG_ENV_SOURCE_FILE option.

The environment variables should be of the form "var=value". Values can
extend to multiple lines. See the README under 'Environment Variables:'
for more information and an example. Note that variables names may
not end in + due to the += syntax below.

In many cases environment variables need access to the U-Boot CONFIG
variables to select different options. Enable this so that the environment
scripts can be as useful as the ones currently in the board config files.
This uses the C preprocessor, means that comments can be included in the
environment using /* ... */

Also support += to allow variables to be appended to. This is needed when
using the preprocessor.

Signed-off-by: Simon Glass <sjg@chromium.org>
Reviewed-by: Marek Behún <marek.behun@nic.cz>
Tested-by: Marek Behún <marek.behun@nic.cz>
---

Changes in v9:
- Drop mention of other strange characters
- Clarify that the + restriction is on the variable name not its value
- Add some tests for the script
- Deal with leading tabs
- Squash indentation down to one space
- Convert newlines within strings to spaces, which seems more consistent
- Handle appending an empty string to an empty var

Changes in v8:
- Update commit message to avoid mentioning the 'env' subdirectory
- Update commit message to mention the + restriction, etc.
- Overwrite the env file each time, to avoid incremental-build problems

Changes in v7:
- Use 'env' basename instead of 'environment' for intermediate output files
- Show a message indicating the source text file being used
- Give an error if CONFIG_EXTRA_ENV_SETTINGS is also defined
- Use CONFIG_ENV_SOURCE_FILE instead of rules to specify the text-file name
- Make board.env the default name if CONFIG_ENV_SOURCE_FILE is empty
- Rewrite the documentation
- Drop the use of common.env
- Update awk script to output the whole CONFIG string, or just a comment

Changes in v6:
- Combine the two env2string.awk patches into one

Changes in v5:
- Explain how to include the common.env file
- Explain why variables starting with _ , and / are not supported
- Expand the definition of how to declare an environment variable
- Explain what happens to empty variables
- Update maintainer
- Move use of += to this patch
- Explain that environment variables may not end in +

Changes in v4:
- Move this from being part of configuring U-Boot to part of building it
- Don't put the environment in autoconf.mk as it is not needed
- Add documentation in rST format instead of README
- Drop mention of import/export
- Update awk script to ignore blank lines, as generated by clang
- Add documentation in rST format instead of README

Changes in v3:
- Adjust Makefile to generate the .inc and .h files in separate fules
- Add more detail in the README about the format of .env files
- Improve the comment about " in the awk script
- Correctly terminate environment files with \n
- Define __UBOOT_CONFIG__ when collecting environment files

Changes in v2:
- Move .env file from include/configs to board/
- Use awk script to process environment since it is much easier on the brain
- Add information and updated example script to README
- Add dependency rule so that the environment is rebuilt when it changes
- Add separate patch to enable C preprocessor for environment files
- Enable var+=value form to simplify composing variables in multiple steps

 MAINTAINERS               |  7 +++
 Makefile                  | 66 ++++++++++++++++++++++++++-
 config.mk                 |  2 +
 doc/usage/environment.rst | 77 +++++++++++++++++++++++++++++++-
 env/Kconfig               | 18 ++++++++
 env/embedded.c            |  1 +
 include/env_default.h     | 11 +++++
 scripts/env2string.awk    | 71 ++++++++++++++++++++++++++++++
 test/py/tests/test_env.py | 93 +++++++++++++++++++++++++++++++++++++++
 9 files changed, 344 insertions(+), 2 deletions(-)
 create mode 100644 scripts/env2string.awk

Comments

Wolfgang Denk Oct. 21, 2021, 9:50 a.m. UTC | #1
Dear Simon,

In message <20211019164418.v9.3.Ie78bfbfca0d01d9cba501e127f446ec48e1f7afe@changeid> you wrote:
> At present U-Boot environment variables, and thus scripts, are defined
> by CONFIG_EXTRA_ENV_SETTINGS. It is painful to add large amounts of text
> to this file and dealing with quoting and newlines is harder than it
> should be. It would be better if we could just type the script into a
> text file and have it included by U-Boot.
>
> Add a feature that brings in a .env file associated with the board
> config, if present. To use it, create a file in a board/<vendor>
> directory, typically called <board>.env and controlled by the
> CONFIG_ENV_SOURCE_FILE option.
>
> The environment variables should be of the form "var=value". Values can
> extend to multiple lines. See the README under 'Environment Variables:'
> for more information and an example. Note that variables names may
> not end in + due to the += syntax below.

I still object to placing new, arbitrary restrictions on what may or
may not be used in environment variable names.

We have discussed alternative implementations, and trivial changes
like using "=+" instead of "+=" as append operator do not require
such restictions.

Thus, and only for the restictions on variable names:

Naked-by: Wolfgang Denk <wd@denx.de>

Best regards,

Wolfgang Denk
Tom Rini Oct. 21, 2021, 12:23 p.m. UTC | #2
On Thu, Oct 21, 2021 at 11:50:02AM +0200, Wolfgang Denk wrote:
> Dear Simon,
> 
> In message <20211019164418.v9.3.Ie78bfbfca0d01d9cba501e127f446ec48e1f7afe@changeid> you wrote:
> > At present U-Boot environment variables, and thus scripts, are defined
> > by CONFIG_EXTRA_ENV_SETTINGS. It is painful to add large amounts of text
> > to this file and dealing with quoting and newlines is harder than it
> > should be. It would be better if we could just type the script into a
> > text file and have it included by U-Boot.
> >
> > Add a feature that brings in a .env file associated with the board
> > config, if present. To use it, create a file in a board/<vendor>
> > directory, typically called <board>.env and controlled by the
> > CONFIG_ENV_SOURCE_FILE option.
> >
> > The environment variables should be of the form "var=value". Values can
> > extend to multiple lines. See the README under 'Environment Variables:'
> > for more information and an example. Note that variables names may
> > not end in + due to the += syntax below.
> 
> I still object to placing new, arbitrary restrictions on what may or
> may not be used in environment variable names.
> 
> We have discussed alternative implementations, and trivial changes
> like using "=+" instead of "+=" as append operator do not require
> such restictions.
> 
> Thus, and only for the restictions on variable names:
> 
> Naked-by: Wolfgang Denk <wd@denx.de>

Do you have any other feedback on the entire rest of the series?
Because I'm not sure the benefit of "we can still support '+' at the end
of a variable name, if anyone uses that" outweighs "we can more easily
append variables in constructing our environment without relying on
uncommon operators".  To me "=+" as the append syntax is worse than "no
+ at the end of your variables".
Wolfgang Denk Oct. 21, 2021, 1:06 p.m. UTC | #3
Dear Tom,

In message <20211021122325.GX7964@bill-the-cat> you wrote:
> 
> Do you have any other feedback on the entire rest of the series?

I already wrote that I support the concept, and the few nit I saw
have been fixed, I think.  Except this unneeded breaking of backward
compatibility.

> Because I'm not sure the benefit of "we can still support '+' at the end
> of a variable name, if anyone uses that" outweighs "we can more easily
> append variables in constructing our environment without relying on
> uncommon operators".

We introduce a new feature here. Defining an append operator is a
convenience thing. It could probably also solved using the
preprocessor, likely in a more ugly way.

In any case, the feature is new, and the operator is new.

For the implementation it does not matter if we define this operator
as "+=" or '=+" or "=." or something else.

the only difference is that any ioperator starting with an equal
sign is inherently backward compatible without need for arbitrary
new restrictions.

> To me "=+" as the append syntax is worse than "no
> + at the end of your variables".

In which way is it worse? For esthetic reasons?

I confirm that '+=' looks better.  But '+=" is technically broken.



Best regards,

Wolfgang Denk
Marek Behún Oct. 21, 2021, 1:25 p.m. UTC | #4
Hello,

On Thu, 21 Oct 2021 15:06:51 +0200
Wolfgang Denk <wd@denx.de> wrote:

> I confirm that '+=' looks better.  But '+=" is technically broken.

a bit of my opinion:
I think =+ will confuse far more people than + as last character of var
name working weirdly. But I also think that + should be supported as
last character. Therefore I propose backslash escaping in variable name,
i.e.
  var+=value
appends value to var, while
  var\+=value
sets variable with name "var+"

Marek
Marek Behún Oct. 21, 2021, 1:28 p.m. UTC | #5
On Thu, 21 Oct 2021 15:25:37 +0200
Marek Behún <marek.behun@nic.cz> wrote:

> Hello,
> 
> On Thu, 21 Oct 2021 15:06:51 +0200
> Wolfgang Denk <wd@denx.de> wrote:
> 
> > I confirm that '+=' looks better.  But '+=" is technically broken.  
> 
> a bit of my opinion:
> I think =+ will confuse far more people than + as last character of var
> name working weirdly. But I also think that + should be supported as
> last character. Therefore I propose backslash escaping in variable name,
> i.e.
>   var+=value
> appends value to var, while
>   var\+=value
> sets variable with name "var+"
> 
> Marek

Also, I think that it would be better if spaces and tabs were allowed
to indent the .env file, i.e.

	var_a	=	3
	var_bcd	=	7

should set "var_a" to "3", "var_bcd" to "7".

If special character are needed in either name or value, they could be
escaped and/or quoted.

Marek
Wolfgang Denk Oct. 21, 2021, 3:12 p.m. UTC | #6
Dear Marek,

In message <20211021152831.15524883@thinkpad> you wrote:
>
>
> > I think =+ will confuse far more people than + as last character of var

I still fail to see why '=+' could be confusing if properly
documented to be the append operator.

I mean, it is not a new invention of mine.

OpenEmbedded / Yocto uses '=+' a lot, like in

	meta/recipes-kernel/dtc/dtc.inc:

	PACKAGES =+ "${PN}-misc"

Actually they use both '+=' and '=+', like

	RESULT+=${ERRORS}


> > name working weirdly. But I also think that + should be supported as
> > last character. Therefore I propose backslash escaping in variable name,
> > i.e.
> >   var+=value
> > appends value to var, while
> >   var\+=value
> > sets variable with name "var+"

Yes, this is yet another alternative for handling this problem.

> Also, I think that it would be better if spaces and tabs were allowed
> to indent the .env file, i.e.
>
> 	var_a	=	3
> 	var_bcd	=	7
>
> should set "var_a" to "3", "var_bcd" to "7".

Why not...

In the end it boils down that the file format should be properly
defined and have a clear syntax description.  Apparently the
"<name>=<value>\0" format as used in U-Boot persistent storage
should not be taken literally here.


Best regards,

Wolfgang Denk
Tom Rini Oct. 21, 2021, 3:18 p.m. UTC | #7
On Thu, Oct 21, 2021 at 05:12:26PM +0200, Wolfgang Denk wrote:
> Dear Marek,
> 
> In message <20211021152831.15524883@thinkpad> you wrote:
> >
> >
> > > I think =+ will confuse far more people than + as last character of var
> 
> I still fail to see why '=+' could be confusing if properly
> documented to be the append operator.
> 
> I mean, it is not a new invention of mine.
> 
> OpenEmbedded / Yocto uses '=+' a lot, like in
> 
> 	meta/recipes-kernel/dtc/dtc.inc:
> 
> 	PACKAGES =+ "${PN}-misc"
> 
> Actually they use both '+=' and '=+', like
> 
> 	RESULT+=${ERRORS}

The OE example is exactly why I want to avoid =+.  You cannot
interchangeably use += and =+ as they evaluate differently.  See
https://www.yoctoproject.org/docs/3.1/bitbake-user-manual/bitbake-user-manual.html#appending-and-prepending
and that =+ there is the opposite of what we want.  So, more confusion.
Simon Glass Oct. 21, 2021, 3:59 p.m. UTC | #8
Hi Marek,

On Thu, 21 Oct 2021 at 07:28, Marek Behún <marek.behun@nic.cz> wrote:
>
> On Thu, 21 Oct 2021 15:25:37 +0200
> Marek Behún <marek.behun@nic.cz> wrote:
>
> > Hello,
> >
> > On Thu, 21 Oct 2021 15:06:51 +0200
> > Wolfgang Denk <wd@denx.de> wrote:
> >
> > > I confirm that '+=' looks better.  But '+=" is technically broken.
> >
> > a bit of my opinion:
> > I think =+ will confuse far more people than + as last character of var
> > name working weirdly. But I also think that + should be supported as
> > last character. Therefore I propose backslash escaping in variable name,
> > i.e.
> >   var+=value
> > appends value to var, while
> >   var\+=value
> > sets variable with name "var+"

My first preference is to disallow + at the end of an end var. Perhaps
we can start printing a warning if people do it, for a few releases.

My distance second preference is what Marek has here, using a
backslash to escape the + character.

As for =+ ...while I can see how people might parse it (we are setting
the var equal to what it has with an appending string) I think it is a
terrible idea as it is just not what people expect. Also, putting the
+ after the = places (similarly unlikely) restrictions on the
expression.

The current format is basically the same as 'print'. So if I can't
have the first preference, we could ensure that it prints a \ in the
case that the var ends with +

> >
> > Marek
>
> Also, I think that it would be better if spaces and tabs were allowed
> to indent the .env file, i.e.
>
>         var_a   =       3
>         var_bcd =       7
>
> should set "var_a" to "3", "var_bcd" to "7".
>
> If special character are needed in either name or value, they could be
> escaped and/or quoted.

They are allowed in the value but are reduced to a single space in the
front. We need this for multi-line strings (but I'm a bit worried
about it).

We could update it to skip any leading space after the = I think.

I don't like spaces before the = though. It doesn't match the 'print'
output (which has no space) and it is confusing:

fred=echo something; if [
a == b ]; then
fi

We may well have things like this if we try automatic conversion (I
haven't looked though).

I think we need strict rules so it is easy for people to get exactly
the env they want.

Regards,
Simon
Tom Rini Oct. 21, 2021, 4:03 p.m. UTC | #9
On Thu, Oct 21, 2021 at 09:59:38AM -0600, Simon Glass wrote:
> Hi Marek,
> 
> On Thu, 21 Oct 2021 at 07:28, Marek Behún <marek.behun@nic.cz> wrote:
> >
> > On Thu, 21 Oct 2021 15:25:37 +0200
> > Marek Behún <marek.behun@nic.cz> wrote:
> >
> > > Hello,
> > >
> > > On Thu, 21 Oct 2021 15:06:51 +0200
> > > Wolfgang Denk <wd@denx.de> wrote:
> > >
> > > > I confirm that '+=' looks better.  But '+=" is technically broken.
> > >
> > > a bit of my opinion:
> > > I think =+ will confuse far more people than + as last character of var
> > > name working weirdly. But I also think that + should be supported as
> > > last character. Therefore I propose backslash escaping in variable name,
> > > i.e.
> > >   var+=value
> > > appends value to var, while
> > >   var\+=value
> > > sets variable with name "var+"
> 
> My first preference is to disallow + at the end of an end var. Perhaps
> we can start printing a warning if people do it, for a few releases.
> 
> My distance second preference is what Marek has here, using a
> backslash to escape the + character.

How bad does it make the parser look if we allow trailing + in variable
names, by escaping them?  It's seemingly the substantive objection at
this point.
Simon Glass Oct. 21, 2021, 4:51 p.m. UTC | #10
Hi Tom,

On Thu, 21 Oct 2021 at 10:03, Tom Rini <trini@konsulko.com> wrote:
>
> On Thu, Oct 21, 2021 at 09:59:38AM -0600, Simon Glass wrote:
> > Hi Marek,
> >
> > On Thu, 21 Oct 2021 at 07:28, Marek Behún <marek.behun@nic.cz> wrote:
> > >
> > > On Thu, 21 Oct 2021 15:25:37 +0200
> > > Marek Behún <marek.behun@nic.cz> wrote:
> > >
> > > > Hello,
> > > >
> > > > On Thu, 21 Oct 2021 15:06:51 +0200
> > > > Wolfgang Denk <wd@denx.de> wrote:
> > > >
> > > > > I confirm that '+=' looks better.  But '+=" is technically broken.
> > > >
> > > > a bit of my opinion:
> > > > I think =+ will confuse far more people than + as last character of var
> > > > name working weirdly. But I also think that + should be supported as
> > > > last character. Therefore I propose backslash escaping in variable name,
> > > > i.e.
> > > >   var+=value
> > > > appends value to var, while
> > > >   var\+=value
> > > > sets variable with name "var+"
> >
> > My first preference is to disallow + at the end of an end var. Perhaps
> > we can start printing a warning if people do it, for a few releases.
> >
> > My distance second preference is what Marek has here, using a
> > backslash to escape the + character.
>
> How bad does it make the parser look if we allow trailing + in variable
> names, by escaping them?  It's seemingly the substantive objection at
> this point.

I'm pretty sure we can do it easily enough. I'll take a look when I
get back to this.

Regards,
Simon
Rasmus Villemoes Oct. 22, 2021, 6:40 a.m. UTC | #11
On 21/10/2021 18.03, Tom Rini wrote:
> On Thu, Oct 21, 2021 at 09:59:38AM -0600, Simon Glass wrote:
>> Hi Marek,
>>
>> On Thu, 21 Oct 2021 at 07:28, Marek Behún <marek.behun@nic.cz> wrote:
>>>
>>> On Thu, 21 Oct 2021 15:25:37 +0200
>>> Marek Behún <marek.behun@nic.cz> wrote:
>>>
>>>> Hello,
>>>>
>>>> On Thu, 21 Oct 2021 15:06:51 +0200
>>>> Wolfgang Denk <wd@denx.de> wrote:
>>>>
>>>>> I confirm that '+=' looks better.  But '+=" is technically broken.
>>>>
>>>> a bit of my opinion:
>>>> I think =+ will confuse far more people than + as last character of var
>>>> name working weirdly. But I also think that + should be supported as
>>>> last character. Therefore I propose backslash escaping in variable name,
>>>> i.e.
>>>>   var+=value
>>>> appends value to var, while
>>>>   var\+=value
>>>> sets variable with name "var+"
>>
>> My first preference is to disallow + at the end of an end var. Perhaps
>> we can start printing a warning if people do it, for a few releases.
>>
>> My distance second preference is what Marek has here, using a
>> backslash to escape the + character.
> 
> How bad does it make the parser look if we allow trailing + in variable
> names, by escaping them?  It's seemingly the substantive objection at
> this point.
> 

So I don't understand all the bruhahaha around a + at the end of the
varname. Nobody suggests (that I have seen) changing anything about how
U-Boot at runtime interprets and handles variables, so all variable
names that used to be valid continue to be so.

It's just that this _new_ method of generating the environment places
certain restrictions on what can be done. The old-fashioned ways
(mkenvimage, good'ol CONFIG_ENV_EXTRA_SETTINGS with all its warts, and
run-time 'env set') continue to allow people to define whatever env vars
they want. This new method is meant for transitioning in-tree boards'
default environment, and no in-tree boards need anything exotic.

Also, completely independent of whether the subsequent parser is
implemented in awk or python or rust, or what syntax it accepts and the
semantics of that syntax, the fact that we first pass the input through
cpp already means some things just won't work the same way as when given
to mkenvimage, and that applies to both sides of the =:

printf 'x/* huh */y=/* where did this go ? */\nmsg=I like unix\nfive
 spaces=5spaces\n' | gcc -E -P -x assembler-with-cpp -

x y=
msg=I like 1
five spaces=5spaces

[In case its broken by the email, there are actually five spaces in the
printf string between the words "five" and "spaces", the above should
illustrate that cpp collapses those to a single space].

So, I'd much rather we do a cleaner break, and accept (and ignore)
whitespace on either side of the assignment operator - that's how Make
variable assignments work IIRC. And since a lot of people making use of
this will already be familiar with Yocto, I think we should have two
compound assignment operators, .= and +=. That still allows one to use
any non-whitespace, non-= characters (modulo the restrictions imposed by
the whole thing passing through cpp) in variable names, so

foo+=abc

means

foo+ = abc

while could append to foo by saying

foo += abc

That whitespace around the assignment operators would also make the
input files somewhat more readable - there really is no point in having
human-readable, human-editable text files having a format
almost-but-not-quite be that of the on-disk format.

Rasmus
Wolfgang Denk Oct. 22, 2021, 8:06 a.m. UTC | #12
Dear Simon,

In message <CAPnjgZ1BuAJC6Vhp06HrifU9jZqbOEuC+zRauV1DH3rY1qZp3Q@mail.gmail.com> you wrote:
>
> > > i.e.
> > >   var+=value
> > > appends value to var, while
> > >   var\+=value
> > > sets variable with name "var+"
>
> My first preference is to disallow + at the end of an end var. Perhaps
> we can start printing a warning if people do it, for a few releases.

This might seem to be a harmless change, but it is actually a
fundamental one.  And it breaks backward compatiility.  And all this
without need, as a list of alternatives have been suggested.

> My distance second preference is what Marek has here, using a
> backslash to escape the + character.

Actually this has the same problem, as the backslash is also a legal
character in a variable name:

	=> setenv foo\\+ bar
	=> printenv foo\\+  
	foo\+=bar


Yes, it was probably not a good idea not to restrict the allowed
character set when I implemented this stuff 21 years ago, but then
code size was critical - we had U-Boot running from 128 kB EPROM
(you remember these huge chips which were erased under UV light?).

The fact is, '=' and NUL are the only characters that cannot be used
in a variable name.


> As for =+ ...while I can see how people might parse it (we are setting
> the var equal to what it has with an appending string) I think it is a
> terrible idea as it is just not what people expect.

What do people expect? This is a totally new feature, so people will
use what they find in the documentation and in example code.

> Also, putting the
> + after the = places (similarly unlikely) restrictions on the
> expression.

There is a fundamental difference here.

For the '+=' case, there is no way to escape the '+', as all
commonly used escapes are valid characters in the variable name,
too.

With '=+', the '=' defines where the variable name ends, and from
here you can define your own rules as where the value part begins -
this is just a matter how you implement your parser.

> The current format is basically the same as 'print'. So if I can't
> have the first preference, we could ensure that it prints a \ in the
> case that the var ends with +

But '\' is a legal character in the variable name, too. Anything but
'=' and NUL is a legal char. And this makes escaping impossible:

	=> setenv \'foo\\-\' foobar
	=> printenv \'foo\\-\'
	'foo\-'=foobar

> > Also, I think that it would be better if spaces and tabs were allowed
> > to indent the .env file, i.e.
> >
> >         var_a   =       3
> >         var_bcd =       7
> >
> > should set "var_a" to "3", "var_bcd" to "7".
> >
> > If special character are needed in either name or value, they could be
> > escaped and/or quoted.
>
> They are allowed in the value but are reduced to a single space in the
> front. We need this for multi-line strings (but I'm a bit worried
> about it).

You mean this automatically insert a newline between parts? ugh...
I didn't realize this. Did I miss it in the documentation?

> We could update it to skip any leading space after the = I think.

So what if you need a leading space?


> I don't like spaces before the = though. It doesn't match the 'print'
> output (which has no space) and it is confusing:

env print also does not add any spaces after the '='.

> I think we need strict rules so it is easy for people to get exactly
> the env they want.

Strict rules, proper documentation, and a set of examples.


Best regards,

Wolfgang Denk
Wolfgang Denk Oct. 22, 2021, 8:08 a.m. UTC | #13
Dear Tom,

In message <20211021160311.GC3577824@bill-the-cat> you wrote:
> 
> How bad does it make the parser look if we allow trailing + in variable
> names, by escaping them?  It's seemingly the substantive objection at
> this point.

Any escape character is also a legal name character.

Best regards,

Wolfgang Denk
Marek Behún Oct. 22, 2021, 2:04 p.m. UTC | #14
Hello Wolfgang,

On Fri, 22 Oct 2021 10:06:55 +0200
Wolfgang Denk <wd@denx.de> wrote:

> For the '+=' case, there is no way to escape the '+', as all
> commonly used escapes are valid characters in the variable name,
> too.

We can define that backslash is to be also escaped if it is to be used
as variable name:
  weird_var\\\+=abcd

I still think it is far more intuitive than =+.

Anyway, IMO '+' as the last character in varname is a extreme corner
case; I think that no one sane actually uses it.

But even if they did, Simon's patches do not break it. Simon's patches
only disallow it in board environment definition during compilation.

I think it is a completely reasonable thing to diallow board
maintainers (i.e. only U-Boot developers, not users) from using such
construction.

Marek
Tom Rini Oct. 22, 2021, 2:47 p.m. UTC | #15
On Fri, Oct 22, 2021 at 10:08:05AM +0200, Wolfgang Denk wrote:
> Dear Tom,
> 
> In message <20211021160311.GC3577824@bill-the-cat> you wrote:
> > 
> > How bad does it make the parser look if we allow trailing + in variable
> > names, by escaping them?  It's seemingly the substantive objection at
> > this point.
> 
> Any escape character is also a legal name character.

I am struggling to have a non-meme reaction to this.  Perhaps the best
step is just earlier on in the series note that variable names need to
fit within the broadly and commonly used set of characters and assorted
funny business you can do historically needs to be migrated.
Tom Rini Oct. 22, 2021, 2:50 p.m. UTC | #16
On Fri, Oct 22, 2021 at 10:06:55AM +0200, Wolfgang Denk wrote:
> Dear Simon,
> 
> In message <CAPnjgZ1BuAJC6Vhp06HrifU9jZqbOEuC+zRauV1DH3rY1qZp3Q@mail.gmail.com> you wrote:
> >
> > > > i.e.
> > > >   var+=value
> > > > appends value to var, while
> > > >   var\+=value
> > > > sets variable with name "var+"
> >
> > My first preference is to disallow + at the end of an end var. Perhaps
> > we can start printing a warning if people do it, for a few releases.
> 
> This might seem to be a harmless change, but it is actually a
> fundamental one.  And it breaks backward compatiility.  And all this
> without need, as a list of alternatives have been suggested.
> 
> > My distance second preference is what Marek has here, using a
> > backslash to escape the + character.
> 
> Actually this has the same problem, as the backslash is also a legal
> character in a variable name:
> 
> 	=> setenv foo\\+ bar
> 	=> printenv foo\\+  
> 	foo\+=bar
> 
> 
> Yes, it was probably not a good idea not to restrict the allowed
> character set when I implemented this stuff 21 years ago, but then
> code size was critical - we had U-Boot running from 128 kB EPROM
> (you remember these huge chips which were erased under UV light?).
> 
> The fact is, '=' and NUL are the only characters that cannot be used
> in a variable name.
> 
> 
> > As for =+ ...while I can see how people might parse it (we are setting
> > the var equal to what it has with an appending string) I think it is a
> > terrible idea as it is just not what people expect.
> 
> What do people expect? This is a totally new feature, so people will
> use what they find in the documentation and in example code.
> 
> > Also, putting the
> > + after the = places (similarly unlikely) restrictions on the
> > expression.
> 
> There is a fundamental difference here.
> 
> For the '+=' case, there is no way to escape the '+', as all
> commonly used escapes are valid characters in the variable name,
> too.
> 
> With '=+', the '=' defines where the variable name ends, and from
> here you can define your own rules as where the value part begins -
> this is just a matter how you implement your parser.
> 
> > The current format is basically the same as 'print'. So if I can't
> > have the first preference, we could ensure that it prints a \ in the
> > case that the var ends with +
> 
> But '\' is a legal character in the variable name, too. Anything but
> '=' and NUL is a legal char. And this makes escaping impossible:
> 
> 	=> setenv \'foo\\-\' foobar
> 	=> printenv \'foo\\-\'
> 	'foo\-'=foobar
> 
> > > Also, I think that it would be better if spaces and tabs were allowed
> > > to indent the .env file, i.e.
> > >
> > >         var_a   =       3
> > >         var_bcd =       7
> > >
> > > should set "var_a" to "3", "var_bcd" to "7".
> > >
> > > If special character are needed in either name or value, they could be
> > > escaped and/or quoted.
> >
> > They are allowed in the value but are reduced to a single space in the
> > front. We need this for multi-line strings (but I'm a bit worried
> > about it).
> 
> You mean this automatically insert a newline between parts? ugh...
> I didn't realize this. Did I miss it in the documentation?
> 
> > We could update it to skip any leading space after the = I think.
> 
> So what if you need a leading space?
> 
> 
> > I don't like spaces before the = though. It doesn't match the 'print'
> > output (which has no space) and it is confusing:
> 
> env print also does not add any spaces after the '='.
> 
> > I think we need strict rules so it is easy for people to get exactly
> > the env they want.
> 
> Strict rules, proper documentation, and a set of examples.

And sanity and restrictions introduced to our environment variables.
The amount of "fun" things that were allowed by disallowing only NUL and
= from names, and also allowing us to stay crazy tiny are just not
relevant to where we are now.
Wolfgang Denk Oct. 24, 2021, 3:46 p.m. UTC | #17
Dear Tom,

In message <20211022144759.GG3577824@bill-the-cat> you wrote:
> 
> > Any escape character is also a legal name character.
>
> I am struggling to have a non-meme reaction to this.  Perhaps the best
> step is just earlier on in the series note that variable names need to
> fit within the broadly and commonly used set of characters and assorted
> funny business you can do historically needs to be migrated.

Indeed I think this is the most reasonable approach.

Like you cannot write any aritrary code in plain C and have to fall
back to assembler in a few places, this patch series should simply
not claim to be able to support all legal environment settings.

It is a convenience tool, and it is OK if it has a few restrictions,
like for the character set of supported variable names.

But:

1) These restrictions must be clearly documented, both in the commit
   message and in the related documentation/readme.
2) There should be another, more primitive way to generate
   environment settings without these restrictions..

Best regards,

Wolfgang Denk
Tom Rini Oct. 24, 2021, 4:44 p.m. UTC | #18
On Sun, Oct 24, 2021 at 05:46:00PM +0200, Wolfgang Denk wrote:
> Dear Tom,
> 
> In message <20211022144759.GG3577824@bill-the-cat> you wrote:
> > 
> > > Any escape character is also a legal name character.
> >
> > I am struggling to have a non-meme reaction to this.  Perhaps the best
> > step is just earlier on in the series note that variable names need to
> > fit within the broadly and commonly used set of characters and assorted
> > funny business you can do historically needs to be migrated.
> 
> Indeed I think this is the most reasonable approach.
> 
> Like you cannot write any aritrary code in plain C and have to fall
> back to assembler in a few places, this patch series should simply
> not claim to be able to support all legal environment settings.
> 
> It is a convenience tool, and it is OK if it has a few restrictions,
> like for the character set of supported variable names.
> 
> But:
> 
> 1) These restrictions must be clearly documented, both in the commit
>    message and in the related documentation/readme.
> 2) There should be another, more primitive way to generate
>    environment settings without these restrictions..

First, in that we don't have tests today for any of the "interesting"
possible variable options, I have no clue which ones even work as
intended.

Second, yes, an end result here should be that yes, the default
environment should be more easily buildable and integrated with
arbitrary tools, so if something else can parse it (libubootenv?) it can
be done.
Simon Glass Oct. 24, 2021, 7:54 p.m. UTC | #19
Hi Rasmus,

On Fri, 22 Oct 2021 at 00:41, Rasmus Villemoes
<rasmus.villemoes@prevas.dk> wrote:
>
> On 21/10/2021 18.03, Tom Rini wrote:
> > On Thu, Oct 21, 2021 at 09:59:38AM -0600, Simon Glass wrote:
> >> Hi Marek,
> >>
> >> On Thu, 21 Oct 2021 at 07:28, Marek Behún <marek.behun@nic.cz> wrote:
> >>>
> >>> On Thu, 21 Oct 2021 15:25:37 +0200
> >>> Marek Behún <marek.behun@nic.cz> wrote:
> >>>
> >>>> Hello,
> >>>>
> >>>> On Thu, 21 Oct 2021 15:06:51 +0200
> >>>> Wolfgang Denk <wd@denx.de> wrote:
> >>>>
> >>>>> I confirm that '+=' looks better.  But '+=" is technically broken.
> >>>>
> >>>> a bit of my opinion:
> >>>> I think =+ will confuse far more people than + as last character of var
> >>>> name working weirdly. But I also think that + should be supported as
> >>>> last character. Therefore I propose backslash escaping in variable name,
> >>>> i.e.
> >>>>   var+=value
> >>>> appends value to var, while
> >>>>   var\+=value
> >>>> sets variable with name "var+"
> >>
> >> My first preference is to disallow + at the end of an end var. Perhaps
> >> we can start printing a warning if people do it, for a few releases.
> >>
> >> My distance second preference is what Marek has here, using a
> >> backslash to escape the + character.
> >
> > How bad does it make the parser look if we allow trailing + in variable
> > names, by escaping them?  It's seemingly the substantive objection at
> > this point.
> >
>
> So I don't understand all the bruhahaha around a + at the end of the
> varname. Nobody suggests (that I have seen) changing anything about how
> U-Boot at runtime interprets and handles variables, so all variable
> names that used to be valid continue to be so.
>
> It's just that this _new_ method of generating the environment places
> certain restrictions on what can be done. The old-fashioned ways
> (mkenvimage, good'ol CONFIG_ENV_EXTRA_SETTINGS with all its warts, and
> run-time 'env set') continue to allow people to define whatever env vars
> they want. This new method is meant for transitioning in-tree boards'
> default environment, and no in-tree boards need anything exotic.
>
> Also, completely independent of whether the subsequent parser is
> implemented in awk or python or rust, or what syntax it accepts and the
> semantics of that syntax, the fact that we first pass the input through
> cpp already means some things just won't work the same way as when given
> to mkenvimage, and that applies to both sides of the =:
>
> printf 'x/* huh */y=/* where did this go ? */\nmsg=I like unix\nfive
>  spaces=5spaces\n' | gcc -E -P -x assembler-with-cpp -
>
> x y=
> msg=I like 1
> five spaces=5spaces
>
> [In case its broken by the email, there are actually five spaces in the
> printf string between the words "five" and "spaces", the above should
> illustrate that cpp collapses those to a single space].
>
> So, I'd much rather we do a cleaner break, and accept (and ignore)
> whitespace on either side of the assignment operator - that's how Make
> variable assignments work IIRC. And since a lot of people making use of
> this will already be familiar with Yocto, I think we should have two
> compound assignment operators, .= and +=. That still allows one to use
> any non-whitespace, non-= characters (modulo the restrictions imposed by
> the whole thing passing through cpp) in variable names, so
>
> foo+=abc
>
> means
>
> foo+ = abc
>
> while could append to foo by saying
>
> foo += abc
>
> That whitespace around the assignment operators would also make the
> input files somewhat more readable - there really is no point in having
> human-readable, human-editable text files having a format
> almost-but-not-quite be that of the on-disk format.

I am OK with this on the name front, as I assume we don't want to
allow space in a var name!

But how do I do this?

bootargs=console=fred
#ifdef SOMETHING
bootargs+= dm-verity=...
#endif

We need the space between the bootargs.

Regards,
Simon
Rasmus Villemoes Oct. 25, 2021, 7:06 a.m. UTC | #20
On 24/10/2021 21.54, Simon Glass wrote:
> Hi Rasmus,
> 
> On Fri, 22 Oct 2021 at 00:41, Rasmus Villemoes
> <rasmus.villemoes@prevas.dk> wrote:
>>
>> On 21/10/2021 18.03, Tom Rini wrote:
>>> On Thu, Oct 21, 2021 at 09:59:38AM -0600, Simon Glass wrote:
>>>> Hi Marek,
>>>>
>>>> On Thu, 21 Oct 2021 at 07:28, Marek Behún <marek.behun@nic.cz> wrote:
>>>>>
>>>>> On Thu, 21 Oct 2021 15:25:37 +0200
>>>>> Marek Behún <marek.behun@nic.cz> wrote:
>>>>>
>>>>>> Hello,
>>>>>>
>>>>>> On Thu, 21 Oct 2021 15:06:51 +0200
>>>>>> Wolfgang Denk <wd@denx.de> wrote:
>>>>>>
>>>>>>> I confirm that '+=' looks better.  But '+=" is technically broken.
>>>>>>
>>>>>> a bit of my opinion:
>>>>>> I think =+ will confuse far more people than + as last character of var
>>>>>> name working weirdly. But I also think that + should be supported as
>>>>>> last character. Therefore I propose backslash escaping in variable name,
>>>>>> i.e.
>>>>>>   var+=value
>>>>>> appends value to var, while
>>>>>>   var\+=value
>>>>>> sets variable with name "var+"
>>>>
>>>> My first preference is to disallow + at the end of an end var. Perhaps
>>>> we can start printing a warning if people do it, for a few releases.
>>>>
>>>> My distance second preference is what Marek has here, using a
>>>> backslash to escape the + character.
>>>
>>> How bad does it make the parser look if we allow trailing + in variable
>>> names, by escaping them?  It's seemingly the substantive objection at
>>> this point.
>>>
>>
>> So I don't understand all the bruhahaha around a + at the end of the
>> varname. Nobody suggests (that I have seen) changing anything about how
>> U-Boot at runtime interprets and handles variables, so all variable
>> names that used to be valid continue to be so.
>>
>> It's just that this _new_ method of generating the environment places
>> certain restrictions on what can be done. The old-fashioned ways
>> (mkenvimage, good'ol CONFIG_ENV_EXTRA_SETTINGS with all its warts, and
>> run-time 'env set') continue to allow people to define whatever env vars
>> they want. This new method is meant for transitioning in-tree boards'
>> default environment, and no in-tree boards need anything exotic.
>>
>> Also, completely independent of whether the subsequent parser is
>> implemented in awk or python or rust, or what syntax it accepts and the
>> semantics of that syntax, the fact that we first pass the input through
>> cpp already means some things just won't work the same way as when given
>> to mkenvimage, and that applies to both sides of the =:
>>
>> printf 'x/* huh */y=/* where did this go ? */\nmsg=I like unix\nfive
>>  spaces=5spaces\n' | gcc -E -P -x assembler-with-cpp -
>>
>> x y=
>> msg=I like 1
>> five spaces=5spaces
>>
>> [In case its broken by the email, there are actually five spaces in the
>> printf string between the words "five" and "spaces", the above should
>> illustrate that cpp collapses those to a single space].
>>
>> So, I'd much rather we do a cleaner break, and accept (and ignore)
>> whitespace on either side of the assignment operator - that's how Make
>> variable assignments work IIRC. And since a lot of people making use of
>> this will already be familiar with Yocto, I think we should have two
>> compound assignment operators, .= and +=. That still allows one to use
>> any non-whitespace, non-= characters (modulo the restrictions imposed by
>> the whole thing passing through cpp) in variable names, so
>>
>> foo+=abc
>>
>> means
>>
>> foo+ = abc
>>
>> while could append to foo by saying
>>
>> foo += abc
>>
>> That whitespace around the assignment operators would also make the
>> input files somewhat more readable - there really is no point in having
>> human-readable, human-editable text files having a format
>> almost-but-not-quite be that of the on-disk format.
> 
> I am OK with this on the name front, as I assume we don't want to
> allow space in a var name!

Exactly, there's really never any case where that would be sensible. But
I would probably go a bit further and simply restrict varnames to the
usual alphanumerics plus [_.+-] - in particular, exclude single and
double quotes and backslash. That leaves the door open for somebody to
later support "arbitrary" variable names by defining what it would mean
to say e.g.

"abc \n'\"\tfoo" = hahaha

or whatever syntax they'd propose, but there's absolutely no point in
implementing anything like that initially.

> But how do I do this?
> 
> bootargs=console=fred
> #ifdef SOMETHING
> bootargs+= dm-verity=...
> #endif
> 
> We need the space between the bootargs.

Exactly like that, and for the case where you want to append something
_without_ an extra space there's the .= operator I also suggested.

FWIW, I like the idea that an indented line represents a continuation of
the value of the current assignment.

I don't think I've seen it addressed, but how do you deal with CONFIG_
string items? In the config.h file, the macro definition includes the
double quotes, but I don't think one wants that in the env values.

conf = CONFIG_DEFAULT_CONF
bootcmd = bootm $loadaddr#$conf

would attempt to boot e.g. 0x12340000#"foo", and while one can do
de-quotification in U-Boot shell, it's rather cumbersome.

Rasmus
Wolfgang Denk Oct. 25, 2021, 7:48 a.m. UTC | #21
Dear Tom,

In message <20211024164404.GQ3577824@bill-the-cat> you wrote:
> 
> > It is a convenience tool, and it is OK if it has a few restrictions,
> > like for the character set of supported variable names.
> > 
> > But:
> > 
> > 1) These restrictions must be clearly documented, both in the commit
> >    message and in the related documentation/readme.
> > 2) There should be another, more primitive way to generate
> >    environment settings without these restrictions..
>
> First, in that we don't have tests today for any of the "interesting"
> possible variable options, I have no clue which ones even work as
> intended.
>
> Second, yes, an end result here should be that yes, the default
> environment should be more easily buildable and integrated with
> arbitrary tools, so if something else can parse it (libubootenv?) it can
> be done.

Actually I have an even more low-level approach in mind, like the
capability to include (or rather import) binary U-Boot environment
data given in the usual

	<name1>=<value1>\0...<nameN>=<valueN>\0\0

form.  This might come in handy if your data comes from exporting
the environmentof a running system.

Best regards,

Wolfgang Denk
Simon Glass Oct. 25, 2021, 3:18 p.m. UTC | #22
Hi Rasmus,

On Mon, 25 Oct 2021 at 01:06, Rasmus Villemoes
<rasmus.villemoes@prevas.dk> wrote:
>
> On 24/10/2021 21.54, Simon Glass wrote:
> > Hi Rasmus,
> >
> > On Fri, 22 Oct 2021 at 00:41, Rasmus Villemoes
> > <rasmus.villemoes@prevas.dk> wrote:
> >>
> >> On 21/10/2021 18.03, Tom Rini wrote:
> >>> On Thu, Oct 21, 2021 at 09:59:38AM -0600, Simon Glass wrote:
> >>>> Hi Marek,
> >>>>
> >>>> On Thu, 21 Oct 2021 at 07:28, Marek Behún <marek.behun@nic.cz> wrote:
> >>>>>
> >>>>> On Thu, 21 Oct 2021 15:25:37 +0200
> >>>>> Marek Behún <marek.behun@nic.cz> wrote:
> >>>>>
> >>>>>> Hello,
> >>>>>>
> >>>>>> On Thu, 21 Oct 2021 15:06:51 +0200
> >>>>>> Wolfgang Denk <wd@denx.de> wrote:
> >>>>>>
> >>>>>>> I confirm that '+=' looks better.  But '+=" is technically broken.
> >>>>>>
> >>>>>> a bit of my opinion:
> >>>>>> I think =+ will confuse far more people than + as last character of var
> >>>>>> name working weirdly. But I also think that + should be supported as
> >>>>>> last character. Therefore I propose backslash escaping in variable name,
> >>>>>> i.e.
> >>>>>>   var+=value
> >>>>>> appends value to var, while
> >>>>>>   var\+=value
> >>>>>> sets variable with name "var+"
> >>>>
> >>>> My first preference is to disallow + at the end of an end var. Perhaps
> >>>> we can start printing a warning if people do it, for a few releases.
> >>>>
> >>>> My distance second preference is what Marek has here, using a
> >>>> backslash to escape the + character.
> >>>
> >>> How bad does it make the parser look if we allow trailing + in variable
> >>> names, by escaping them?  It's seemingly the substantive objection at
> >>> this point.
> >>>
> >>
> >> So I don't understand all the bruhahaha around a + at the end of the
> >> varname. Nobody suggests (that I have seen) changing anything about how
> >> U-Boot at runtime interprets and handles variables, so all variable
> >> names that used to be valid continue to be so.
> >>
> >> It's just that this _new_ method of generating the environment places
> >> certain restrictions on what can be done. The old-fashioned ways
> >> (mkenvimage, good'ol CONFIG_ENV_EXTRA_SETTINGS with all its warts, and
> >> run-time 'env set') continue to allow people to define whatever env vars
> >> they want. This new method is meant for transitioning in-tree boards'
> >> default environment, and no in-tree boards need anything exotic.
> >>
> >> Also, completely independent of whether the subsequent parser is
> >> implemented in awk or python or rust, or what syntax it accepts and the
> >> semantics of that syntax, the fact that we first pass the input through
> >> cpp already means some things just won't work the same way as when given
> >> to mkenvimage, and that applies to both sides of the =:
> >>
> >> printf 'x/* huh */y=/* where did this go ? */\nmsg=I like unix\nfive
> >>  spaces=5spaces\n' | gcc -E -P -x assembler-with-cpp -
> >>
> >> x y=
> >> msg=I like 1
> >> five spaces=5spaces
> >>
> >> [In case its broken by the email, there are actually five spaces in the
> >> printf string between the words "five" and "spaces", the above should
> >> illustrate that cpp collapses those to a single space].
> >>
> >> So, I'd much rather we do a cleaner break, and accept (and ignore)
> >> whitespace on either side of the assignment operator - that's how Make
> >> variable assignments work IIRC. And since a lot of people making use of
> >> this will already be familiar with Yocto, I think we should have two
> >> compound assignment operators, .= and +=. That still allows one to use
> >> any non-whitespace, non-= characters (modulo the restrictions imposed by
> >> the whole thing passing through cpp) in variable names, so
> >>
> >> foo+=abc
> >>
> >> means
> >>
> >> foo+ = abc
> >>
> >> while could append to foo by saying
> >>
> >> foo += abc
> >>
> >> That whitespace around the assignment operators would also make the
> >> input files somewhat more readable - there really is no point in having
> >> human-readable, human-editable text files having a format
> >> almost-but-not-quite be that of the on-disk format.
> >
> > I am OK with this on the name front, as I assume we don't want to
> > allow space in a var name!
>
> Exactly, there's really never any case where that would be sensible. But
> I would probably go a bit further and simply restrict varnames to the
> usual alphanumerics plus [_.+-] - in particular, exclude single and
> double quotes and backslash. That leaves the door open for somebody to
> later support "arbitrary" variable names by defining what it would mean
> to say e.g.
>
> "abc \n'\"\tfoo" = hahaha
>
> or whatever syntax they'd propose, but there's absolutely no point in
> implementing anything like that initially.
>
> > But how do I do this?
> >
> > bootargs=console=fred
> > #ifdef SOMETHING
> > bootargs+= dm-verity=...
> > #endif
> >
> > We need the space between the bootargs.
>
> Exactly like that, and for the case where you want to append something
> _without_ an extra space there's the .= operator I also suggested.

Do you have a link to the docs for that?

Perhaps we should get this initial thing in and we can take it from
there. I expect that as we start to convert more environments we'll
find more things we need.

>
> FWIW, I like the idea that an indented line represents a continuation of
> the value of the current assignment.
>
> I don't think I've seen it addressed, but how do you deal with CONFIG_
> string items? In the config.h file, the macro definition includes the
> double quotes, but I don't think one wants that in the env values.
>
> conf = CONFIG_DEFAULT_CONF
> bootcmd = bootm $loadaddr#$conf
>
> would attempt to boot e.g. 0x12340000#"foo", and while one can do
> de-quotification in U-Boot shell, it's rather cumbersome.

Badly. I really don't like the stringify stuff so we don't have that
problem, but we have this one.

I would much prefer it be automatic, if possible. This needs some
thought...e.g. I wonder what the default behaviour should be?

In your example, conf would end up being set to "fred" with the quote,
for example, which seem like the wrong default behaviour. Perhaps we
do need something like:

conf=_dequote(CONFIG_DEFAULT_CONF)

Regards,
Simon
Rasmus Villemoes Oct. 25, 2021, 7:52 p.m. UTC | #23
On 25/10/2021 17.18, Simon Glass wrote:
> Hi Rasmus,
> 
> On Mon, 25 Oct 2021 at 01:06, Rasmus Villemoes
> <rasmus.villemoes@prevas.dk> wrote:

>> Exactly, there's really never any case where that would be sensible. But
>> I would probably go a bit further and simply restrict varnames to the
>> usual alphanumerics plus [_.+-] - in particular, exclude single and
>> double quotes and backslash. That leaves the door open for somebody to
>> later support "arbitrary" variable names by defining what it would mean
>> to say e.g.
>>
>> "abc \n'\"\tfoo" = hahaha
>>
>> or whatever syntax they'd propose, but there's absolutely no point in
>> implementing anything like that initially.
>>
>>> But how do I do this?
>>>
>>> bootargs=console=fred
>>> #ifdef SOMETHING
>>> bootargs+= dm-verity=...
>>> #endif
>>>
>>> We need the space between the bootargs.
>>
>> Exactly like that, and for the case where you want to append something
>> _without_ an extra space there's the .= operator I also suggested.
> 
> Do you have a link to the docs for that?
>

https://docs.yoctoproject.org/bitbake/bitbake-user-manual/bitbake-user-manual-metadata.html#appending-and-prepending-with-spaces

As I said, it will be familiar to anyone who has ever dealt with
Yocto/bitbake, and the behaviour of += (of adding the RHS with a space
prepended) is known from Make. I think there are uses for both, and it
shouldn't be very hard to support both operators.
of the current assignment.

>> I don't think I've seen it addressed, but how do you deal with CONFIG_
>> string items? In the config.h file, the macro definition includes the
>> double quotes, but I don't think one wants that in the env values.
>>
>> conf = CONFIG_DEFAULT_CONF
>> bootcmd = bootm $loadaddr#$conf
>>
>> would attempt to boot e.g. 0x12340000#"foo", and while one can do
>> de-quotification in U-Boot shell, it's rather cumbersome.
> 
> Badly. I really don't like the stringify stuff so we don't have that
> problem, but we have this one.
> 
> I would much prefer it be automatic, if possible. This needs some
> thought...e.g. I wonder what the default behaviour should be?

I think it should be the value without quotes, if that is at all
possible. But, it's not completely trivial, because cpp won't do macro
expansion inside what it sees as a string literal, so if one wants a
CONFIG string inside double quotes, one can't do

  foo = "CONFIG_FOO"

Dunno, I think it needs some more thought.

Rasmus
Wolfgang Denk Oct. 26, 2021, 10:15 a.m. UTC | #24
Dear Simon,

In message <CAPnjgZ0Rn00ob09hHZsu-sszbm9-UhDDSkDLmGZ5HeWSzV1H7Q@mail.gmail.com> you wrote:
>
> > > We need the space between the bootargs.
> >
> > Exactly like that, and for the case where you want to append something
> > _without_ an extra space there's the .=3D operator I also suggested.
>
> Do you have a link to the docs for that?
>
> Perhaps we should get this initial thing in and we can take it from
> there. I expect that as we start to convert more environments we'll
> find more things we need.

I think it is not a good idea to use two different operators for the
same appand operation, just to add a space in one case.

So assume I want to start the appended part with a TAB character,
would we define another operator then?

We have problems with escaping characters for the variable _name_
part, but not for the value. We can for example use standard shell
escape rules, like:

	foo += bar
	foo += \ bar
	foo += ' bar'

Best regards,

Wolfgang Denk
Simon Glass Oct. 28, 2021, 2:18 p.m. UTC | #25
Hi,

On Tue, 26 Oct 2021 at 04:16, Wolfgang Denk <wd@denx.de> wrote:
>
> Dear Simon,
>
> In message <CAPnjgZ0Rn00ob09hHZsu-sszbm9-UhDDSkDLmGZ5HeWSzV1H7Q@mail.gmail.com> you wrote:
> >
> > > > We need the space between the bootargs.
> > >
> > > Exactly like that, and for the case where you want to append something
> > > _without_ an extra space there's the .=3D operator I also suggested.
> >
> > Do you have a link to the docs for that?
> >
> > Perhaps we should get this initial thing in and we can take it from
> > there. I expect that as we start to convert more environments we'll
> > find more things we need.
>
> I think it is not a good idea to use two different operators for the
> same appand operation, just to add a space in one case.
>
> So assume I want to start the appended part with a TAB character,
> would we define another operator then?
>
> We have problems with escaping characters for the variable _name_
> part, but not for the value. We can for example use standard shell
> escape rules, like:
>
>         foo += bar
>         foo += \ bar
>         foo += ' bar'
>

I don't mind either way and there is precedent for .= maybe in perl?
Can't remember.

But I think that change would be for user-friendliness, rather than a
strict requirement, so if we have agreement on the series as is now, I
say let's go ahead with that and refine it later. Patches welcome, as
they say.

After that we can discuss:
- this idea to relax the whitespace rules
- the idea of restricting env-var names to A-Za-z_0-9- or similar
- whether we can write a tool to convert all the envs automatically
- if not, what to do to encourage people to migrate so we can drop env
from the ad-hoc CONFIG thing

Regards,
Simon
diff mbox series

Patch

diff --git a/MAINTAINERS b/MAINTAINERS
index 71f468c00a8..36846528368 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -738,6 +738,13 @@  F:	test/env/
 F:	tools/env*
 F:	tools/mkenvimage.c
 
+ENVIRONMENT AS TEXT
+M:	Simon Glass <sjg@chromium.org>
+R:	Wolfgang Denk <wd@denx.de>
+S:	Maintained
+F:	doc/usage/environment.rst
+F:	scripts/env2string.awk
+
 FPGA
 M:	Michal Simek <michal.simek@xilinx.com>
 S:	Maintained
diff --git a/Makefile b/Makefile
index f911f703443..370c8710eb0 100644
--- a/Makefile
+++ b/Makefile
@@ -513,6 +513,7 @@  version_h := include/generated/version_autogenerated.h
 timestamp_h := include/generated/timestamp_autogenerated.h
 defaultenv_h := include/generated/defaultenv_autogenerated.h
 dt_h := include/generated/dt.h
+env_h := include/generated/environment.h
 
 no-dot-config-targets := clean clobber mrproper distclean \
 			 help %docs check% coccicheck \
@@ -1785,6 +1786,69 @@  quiet_cmd_sym ?= SYM     $@
 u-boot.sym: u-boot FORCE
 	$(call if_changed,sym)
 
+# Environment processing
+# ---------------------------------------------------------------------------
+
+# Directory where we expect the .env file, if it exists
+ENV_DIR := $(srctree)/board/$(BOARDDIR)
+
+# Basename of .env file, stripping quotes
+ENV_SOURCE_FILE := $(CONFIG_ENV_SOURCE_FILE:"%"=%)
+
+# Filename of .env file
+ENV_FILE_CFG := $(ENV_DIR)/$(ENV_SOURCE_FILE).env
+
+# Default filename, if CONFIG_ENV_SOURCE_FILE is empty
+ENV_FILE_BOARD := $(ENV_DIR)/$(CONFIG_SYS_BOARD:"%"=%).env
+
+# Select between the CONFIG_ENV_SOURCE_FILE and the default one
+ENV_FILE := $(if $(ENV_SOURCE_FILE),$(ENV_FILE_CFG),$(wildcard $(ENV_FILE_BOARD)))
+
+# Run the environment text file through the preprocessor, but only if it is
+# non-empty, to save time and possible build errors if something is wonky with
+# the board
+quiet_cmd_gen_envp = ENVP    $@
+      cmd_gen_envp = \
+	if [ -s "$(ENV_FILE)" ]; then \
+		$(CPP) -P $(CFLAGS) -x assembler-with-cpp -D__ASSEMBLY__ \
+			-D__UBOOT_CONFIG__ \
+			-I . -I include -I $(srctree)/include \
+			-include linux/kconfig.h -include include/config.h \
+			-I$(srctree)/arch/$(ARCH)/include \
+			$< -o $@; \
+	else \
+		echo -n >$@ ; \
+	fi
+include/generated/env.in: include/generated/env.txt FORCE
+	$(call cmd,gen_envp)
+
+# Regenerate the environment if it changes
+# We use 'wildcard' since the file is not required to exist (at present), in
+# which case we don't want this dependency, but instead should create an empty
+# file
+# This rule is useful since it shows the source file for the environment
+quiet_cmd_envc = ENVC    $@
+      cmd_envc = \
+	if [ -f "$<" ]; then \
+		cat $< > $@; \
+	elif [ -n "$(ENV_SOURCE_FILE)" ]; then \
+		echo "Missing file $(ENV_FILE_CFG)"; \
+	else \
+		echo -n >$@ ; \
+	fi
+
+include/generated/env.txt: $(wildcard $(ENV_FILE)) FORCE
+	$(call cmd,envc)
+
+# Write out the resulting environment, converted to a C string
+quiet_cmd_gen_envt = ENVT    $@
+      cmd_gen_envt = \
+	awk -f $(srctree)/scripts/env2string.awk $< >$@
+$(env_h): include/generated/env.in
+	$(call cmd,gen_envt)
+
+# ---------------------------------------------------------------------------
+
 # The actual objects are generated when descending,
 # make sure no implicit rule kicks in
 $(sort $(u-boot-init) $(u-boot-main)): $(u-boot-dirs) ;
@@ -1840,7 +1904,7 @@  endif
 # prepare2 creates a makefile if using a separate output directory
 prepare2: prepare3 outputmakefile cfg
 
-prepare1: prepare2 $(version_h) $(timestamp_h) $(dt_h) \
+prepare1: prepare2 $(version_h) $(timestamp_h) $(dt_h) $(env_h) \
                    include/config/auto.conf
 ifeq ($(wildcard $(LDSCRIPT)),)
 	@echo >&2 "  Could not find linker script."
diff --git a/config.mk b/config.mk
index 7bb1fd4ed1b..2595aed218b 100644
--- a/config.mk
+++ b/config.mk
@@ -50,8 +50,10 @@  endif
 ifneq ($(BOARD),)
 ifdef	VENDOR
 BOARDDIR = $(VENDOR)/$(BOARD)
+ENVDIR=${vendor}/env
 else
 BOARDDIR = $(BOARD)
+ENVDIR=${board}/env
 endif
 endif
 ifdef	BOARD
diff --git a/doc/usage/environment.rst b/doc/usage/environment.rst
index 7a733b44556..667fd193ea1 100644
--- a/doc/usage/environment.rst
+++ b/doc/usage/environment.rst
@@ -15,7 +15,82 @@  environment is erased by accident, a default environment is provided.
 
 Some configuration options can be set using Environment Variables.
 
-List of environment variables (most likely not complete):
+Text-based Environment
+----------------------
+
+The default environment for a board is created using a `.env` environment file
+using a simple text format. The base filename for this is defined by
+`CONFIG_ENV_SOURCE_FILE`, or `CONFIG_SYS_BOARD` if that is empty.
+
+The file must be in the board directory and have a .env extension, so
+assuming that there is a board vendor, the resulting filename is therefore::
+
+   board/<vendor>/<board>/<CONFIG_ENV_SOURCE_FILE>.env
+
+or::
+
+   board/<vendor>/<board>/<CONFIG_SYS_BOARD>.env
+
+This is a plain text file where you can type your environment variables in
+the form `var=value`. Blank lines and multi-line variables are supported.
+The conversion script looks for a line that starts in column 1 with a string
+and has an equals sign immediately afterwards. Spaces before the = are not
+permitted. It is a good idea to indent your scripts so that only the 'var='
+appears at the start of a line.
+
+To add additional text to a variable you can use var+=value. This text is
+merged into the variable during the make process and made available as a
+single value to U-Boot. To support this, environment variables may not end
+in `+`.
+
+This file can include C-style comments. Blank lines and multi-line
+variables are supported, and you can use normal C preprocessor directives
+and CONFIG defines from your board config also.
+
+For example, for snapper9260 you would create a text file called
+`board/bluewater/snapper9260.env` containing the environment text.
+
+Example::
+
+    stdout=serial
+    #ifdef CONFIG_LCD
+    stdout+=,lcd
+    #endif
+    bootcmd=
+        /* U-Boot script for booting */
+
+        if [ -z ${tftpserverip} ]; then
+            echo "Use 'setenv tftpserverip a.b.c.d' to set IP address."
+        fi
+
+        usb start; setenv autoload n; bootp;
+        tftpboot ${tftpserverip}:
+        bootm
+    failed=
+        /* Print a message when boot fails */
+        echo CONFIG_SYS_BOARD boot failed - please check your image
+        echo Load address is CONFIG_SYS_LOAD_ADDR
+
+If CONFIG_ENV_SOURCE_FILE is empty and the default filename is not present, then
+the old-style C environment is used instead. See below.
+
+Old-style C environment
+-----------------------
+
+Traditionally, the default environment is created in `include/env_default.h`,
+and can be augmented by various `CONFIG` defines. See that file for details. In
+particular you can define `CONFIG_EXTRA_ENV_SETTINGS` in your board file
+to add environment variables.
+
+Board maintainers are encouraged to migrate to the text-based environment as it
+is easier to maintain. The distro-board script still requires the old-style
+environment but work is underway to address this.
+
+
+List of environment variables
+-----------------------------
+
+This is most-likely not complete:
 
 baudrate
     see CONFIG_BAUDRATE
diff --git a/env/Kconfig b/env/Kconfig
index f75f2b13536..b93ad5c8ee0 100644
--- a/env/Kconfig
+++ b/env/Kconfig
@@ -3,6 +3,24 @@  menu "Environment"
 config ENV_SUPPORT
 	def_bool y
 
+config ENV_SOURCE_FILE
+	string "Environment file to use"
+	default ""
+	help
+	  This sets the basename to use to generate the default environment.
+	  This a text file as described in doc/usage/environment.rst
+
+	  The file must be in the board directory and have a .env extension, so
+	  the resulting filename is typically
+	  board/<vendor>/<board>/<CONFIG_ENV_SOURCE_FILE>.env
+
+	  If the file is not present, an error is produced.
+
+	  If this CONFIG is empty, U-Boot uses CONFIG SYS_BOARD as a default, if
+	  the file board/<vendor>/<board>/<SYS_BOARD>.env exists. Otherwise the
+	  environment is assumed to come from the ad-hoc
+	  CONFIG_EXTRA_ENV_SETTINGS #define
+
 config SAVEENV
 	def_bool y if CMD_SAVEENV
 
diff --git a/env/embedded.c b/env/embedded.c
index 208553e6af1..9f26e6cad9c 100644
--- a/env/embedded.c
+++ b/env/embedded.c
@@ -66,6 +66,7 @@ 
 #endif
 
 #define DEFAULT_ENV_INSTANCE_EMBEDDED
+#include <config.h>
 #include <env_default.h>
 
 #ifdef CONFIG_ENV_ADDR_REDUND
diff --git a/include/env_default.h b/include/env_default.h
index 66e203eb6e4..c06506313e5 100644
--- a/include/env_default.h
+++ b/include/env_default.h
@@ -10,6 +10,10 @@ 
 #include <env_callback.h>
 #include <linux/stringify.h>
 
+#ifndef USE_HOSTCC
+#include <generated/environment.h>
+#endif
+
 #ifdef DEFAULT_ENV_INSTANCE_EMBEDDED
 env_t embedded_environment __UBOOT_ENV_SECTION__(environment) = {
 	ENV_CRC,	/* CRC Sum */
@@ -110,6 +114,13 @@  const uchar default_environment[] = {
 #if defined(CONFIG_BOOTCOUNT_BOOTLIMIT) && (CONFIG_BOOTCOUNT_BOOTLIMIT > 0)
 	"bootlimit="	__stringify(CONFIG_BOOTCOUNT_BOOTLIMIT)"\0"
 #endif
+#ifdef CONFIG_EXTRA_ENV_TEXT
+# ifdef CONFIG_EXTRA_ENV_SETTINGS
+# error "Your board uses a text-file environment, so must not define CONFIG_EXTRA_ENV_SETTINGS"
+# endif
+	/* This is created in the Makefile */
+	CONFIG_EXTRA_ENV_TEXT
+#endif
 #ifdef	CONFIG_EXTRA_ENV_SETTINGS
 	CONFIG_EXTRA_ENV_SETTINGS
 #endif
diff --git a/scripts/env2string.awk b/scripts/env2string.awk
new file mode 100644
index 00000000000..aff3adc67f0
--- /dev/null
+++ b/scripts/env2string.awk
@@ -0,0 +1,71 @@ 
+# SPDX-License-Identifier: GPL-2.0+
+#
+# Copyright 2021 Google, Inc
+#
+# SPDX-License-Identifier:	GPL-2.0+
+#
+# Awk script to parse a text file containing an environment and convert it
+# to a C string which can be compiled into U-Boot.
+
+# The resulting output is:
+#
+#   #define CONFIG_EXTRA_ENV_TEXT "<environment here>"
+#
+# If the input is empty, this script outputs a comment instead.
+
+BEGIN {
+	# env holds the env variable we are currently processing
+	env = "";
+	ORS = ""
+}
+
+# Skip empty lines, as these are generated by the clang preprocessor
+NF {
+	# Quote quotes
+	gsub("\"", "\\\"")
+
+	# Is this the start of a new environment variable?
+	if (match($0, "^([^ \t=][^ =]*)=(.*)", arr)) {
+		if (length(env) != 0) {
+			# Record the value of the variable now completed
+			vars[var] = env
+		}
+		var = arr[1]
+		env = arr[2]
+
+		# Deal with +=
+		if (length(env) != 0 && match(var, "(.*)[+]$", var_arr)) {
+			var = var_arr[1]
+			env = vars[var] env
+		}
+	} else {
+		# Change newline to space
+		gsub(/^[ \t]+/, "")
+
+		# Don't keep leading spaces generated by the previous blank line
+		if (length(env) == 0) {
+			env = $0
+		} else {
+			env = env " " $0
+		}
+	}
+}
+
+END {
+	# Record the value of the variable now completed. If the variable is
+	# empty it is not set.
+	if (length(env) != 0) {
+		vars[var] = env
+	}
+
+	if (length(vars) != 0) {
+		printf("%s", "#define CONFIG_EXTRA_ENV_TEXT \"")
+
+		# Print out all the variables
+		for (var in vars) {
+			env = vars[var]
+			print var "=" vars[var] "\\0"
+		}
+		print "\"\n"
+	}
+}
diff --git a/test/py/tests/test_env.py b/test/py/tests/test_env.py
index 9bed2f48d77..c142e2c86a2 100644
--- a/test/py/tests/test_env.py
+++ b/test/py/tests/test_env.py
@@ -7,6 +7,7 @@ 
 import os
 import os.path
 from subprocess import call, check_call, CalledProcessError
+import tempfile
 
 import pytest
 import u_boot_utils
@@ -515,3 +516,95 @@  def test_env_ext4(state_test_env):
     finally:
         if fs_img:
             call('rm -f %s' % fs_img, shell=True)
+
+def test_env_text(u_boot_console):
+    """Test the script that converts the environment to a text file"""
+
+    def check_script(intext, expect_val):
+        """Check a test case
+
+        Args:
+            intext: Text to pass to the script
+            expect_val: Expected value of the CONFIG_EXTRA_ENV_TEXT string, or
+                None if we expect it not to be defined
+        """
+        with tempfile.TemporaryDirectory() as path:
+            fname = os.path.join(path, 'infile')
+            with open(fname, 'w') as inf:
+                print(intext, file=inf)
+            result = u_boot_utils.run_and_log(cons, ['awk', '-f', script, fname])
+            if expect_val is not None:
+                expect = '#define CONFIG_EXTRA_ENV_TEXT "%s"\n' % expect_val
+                assert result == expect
+            else:
+                assert result == ''
+
+    cons = u_boot_console
+    script = os.path.join(cons.config.source_dir, 'scripts', 'env2string.awk')
+
+    # simple script with a single var
+    check_script('fred=123', 'fred=123\\0')
+
+    # no vars
+    check_script('', None)
+
+    # two vars
+    check_script('''fred=123
+ernie=456''', 'fred=123\\0ernie=456\\0')
+
+    # blank lines
+    check_script('''fred=123
+
+
+ernie=456
+
+''', 'fred=123\\0ernie=456\\0')
+
+    # append
+    check_script('''fred=123
+ernie=456
+fred+= 456''', 'fred=123 456\\0ernie=456\\0')
+
+    # append from empty
+    check_script('''fred=
+ernie=456
+fred+= 456''', 'fred= 456\\0ernie=456\\0')
+
+    # variable with + in it
+    check_script('fred+ernie=123', 'fred+ernie=123\\0')
+
+    # ignores variables that are empty
+    check_script('''fred=
+fred+=
+ernie=456''', 'ernie=456\\0')
+
+    # contains quotes
+    check_script('''fred="my var"
+ernie=another"''', 'fred=\\"my var\\"\\0ernie=another\\"\\0')
+
+    # multi-line vars - new vars always start at column 1
+    check_script('''fred=first
+ second
+\tthird with tab
+
+   after blank
+ confusing=oops
+ernie=another"''', 'fred=first second third with tab after blank confusing=oops\\0ernie=another\\"\\0')
+
+    # real-world example
+    check_script('''ubifs_boot=
+	env exists bootubipart ||
+		env set bootubipart UBI;
+	env exists bootubivol ||
+		env set bootubivol boot;
+	if ubi part ${bootubipart} &&
+		ubifsmount ubi${devnum}:${bootubivol};
+	then
+		devtype=ubi;
+		run scan_dev_for_boot;
+	fi
+''',
+        'ubifs_boot=env exists bootubipart || env set bootubipart UBI; '
+        'env exists bootubivol || env set bootubivol boot; '
+        'if ubi part ${bootubipart} && ubifsmount ubi${devnum}:${bootubivol}; '
+        'then devtype=ubi; run scan_dev_for_boot; fi\\0')