mbox series

[v2,00/12] arm64: Kconfig: Update ARCH_EXYNOS select configs

Message ID 20210928235635.1348330-1-willmcvicker@google.com
Headers show
Series arm64: Kconfig: Update ARCH_EXYNOS select configs | expand

Message

William McVicker Sept. 28, 2021, 11:56 p.m. UTC
This is v2 of the series of patches that modularizes a number of core
ARCH_EXYNOS drivers. Based off of the feedback from the v1 series, I have
modularized all of the drivers that are removed from the ARCH_EXYNOS
series of "select XXX". This includes setting the following configs as
tristate:

 * COMMON_CLK_SAMSUNG
 * EXYNOS_ARM64_COMMON_CLK
 * PINCTRL_SAMSUNG
 * PINCTRL_EXYNOS
 * EXYNOS_PMU_ARM64
 * EXYNOS_PM_DOMAINS

Additionally, it introduces the config EXYNOS_PMU_ARM64 and EXYNOS_PMU_ARM
which was previously EXYNOS_PMU and EXYNOS_PMU_ARM_DRIVERS respectively.
The reason for these new configs is because we are not able to easily
modularize the ARMv7 PMU driver due to built-in arch dependencies on
pmu_base_addr under arch/arm/mach-exynos/*. So the new configs split up
the ARM and ARM64 portions into two separate configs.

Overall, these drivers didn't require much refactoring and converted to
modules relatively easily. However, due to my lack of exynos hardware, I
was not able to boot test these changes. I'm mostly concerned about the
CLK_OF_DECLARE() changes having dependencies on early timers. So I'm
requesting help for testing these changes on the respective hardware.

Lastly, this series is based off of [1] since there are dependencies on
EXYNOS_CHIPID from that series..

[1] https://lore.kernel.org/lkml/20210919093114.35987-1-krzysztof.kozlowski@canonical.com/

* From v1:
  - Fixed modifying hidden configs
  - Modularized all the drivers that were touched
  - Removed HAVE_S3C_RTC
  - Updated all Samsung ARCH_XXX configs as suggested from reviews
  - Rebased on top of 5.15-rc3 and pulled in [1]

Will McVicker (12):
  arm64: don't have ARCH_EXYNOS select EXYNOS_CHIPID
  timekeeping: add API for getting timekeeping_suspended
  clk: samsung: add support for CPU clocks
  clk: samsung: exynos5433: update apollo and atlas clock probing
  clk: export __clk_lookup
  clk: samsung: modularize exynos arm64 clk drivers
  clk: samsung: set exynos arm64 clk driver as tristate
  pinctrl: samsung: modularize the ARM and ARM64 pinctrls
  pinctrl: samsung: set PINCTRL_EXYNOS and PINCTRL_SAMSUNG as tristate
  soc: samsung: pmu: modularize the Exynos ARMv8 PMU driver
  soc: samsung: pm_domains: modularize EXYNOS_PM_DOMAINS
  ARM: rtc: remove HAVE_S3C_RTC in favor of direct dependencies

 arch/arm/Kconfig                              |   1 -
 arch/arm/mach-exynos/Kconfig                  |   6 +-
 arch/arm/mach-s3c/Kconfig.s3c64xx             |   1 -
 arch/arm/mach-s5pv210/Kconfig                 |   3 -
 arch/arm64/Kconfig.platforms                  |   6 -
 drivers/clk/clk.c                             |   1 +
 drivers/clk/samsung/Kconfig                   |   5 +-
 drivers/clk/samsung/Makefile                  |   3 +-
 drivers/clk/samsung/clk-cpu.c                 |  28 +-
 drivers/clk/samsung/clk-cpu.h                 |   2 +-
 drivers/clk/samsung/clk-exynos5433.c          | 465 ++++++++----------
 drivers/clk/samsung/clk-exynos7.c             | 177 +++----
 drivers/clk/samsung/clk-pll.c                 |   6 +-
 drivers/clk/samsung/clk.c                     |  35 +-
 drivers/clk/samsung/clk.h                     |  50 +-
 drivers/pinctrl/samsung/Kconfig               |   5 +-
 drivers/pinctrl/samsung/Makefile              |  13 +-
 drivers/pinctrl/samsung/pinctrl-exynos-arm.c  | 102 ++--
 .../pinctrl/samsung/pinctrl-exynos-arm64.c    |  73 +--
 drivers/pinctrl/samsung/pinctrl-exynos.c      |  17 +-
 drivers/pinctrl/samsung/pinctrl-samsung.c     |  11 +-
 drivers/rtc/Kconfig                           |  10 +-
 drivers/soc/samsung/Kconfig                   |  18 +-
 drivers/soc/samsung/Makefile                  |   8 +-
 drivers/soc/samsung/exynos-pmu.c              |  13 +-
 drivers/soc/samsung/exynos-pmu.h              |   2 +-
 drivers/soc/samsung/pm_domains.c              |  12 +-
 include/linux/soc/samsung/exynos-pmu.h        |   2 +-
 include/linux/timekeeping.h                   |   1 +
 kernel/time/timekeeping.c                     |  11 +
 30 files changed, 553 insertions(+), 534 deletions(-)

Comments

Krzysztof Kozlowski Sept. 29, 2021, 1:02 p.m. UTC | #1
On 29/09/2021 01:56, Will McVicker wrote:
> This is v2 of the series of patches that modularizes a number of core
> ARCH_EXYNOS drivers. Based off of the feedback from the v1 series, I have
> modularized all of the drivers that are removed from the ARCH_EXYNOS
> series of "select XXX". This includes setting the following configs as
> tristate:
> 
>  * COMMON_CLK_SAMSUNG
>  * EXYNOS_ARM64_COMMON_CLK
>  * PINCTRL_SAMSUNG
>  * PINCTRL_EXYNOS
>  * EXYNOS_PMU_ARM64
>  * EXYNOS_PM_DOMAINS
> 
> Additionally, it introduces the config EXYNOS_PMU_ARM64 and EXYNOS_PMU_ARM
> which was previously EXYNOS_PMU and EXYNOS_PMU_ARM_DRIVERS respectively.
> The reason for these new configs is because we are not able to easily
> modularize the ARMv7 PMU driver due to built-in arch dependencies on
> pmu_base_addr under arch/arm/mach-exynos/*. So the new configs split up
> the ARM and ARM64 portions into two separate configs.
> 
> Overall, these drivers didn't require much refactoring and converted to
> modules relatively easily. However, due to my lack of exynos hardware, I
> was not able to boot test these changes. I'm mostly concerned about the
> CLK_OF_DECLARE() changes having dependencies on early timers. So I'm
> requesting help for testing these changes on the respective hardware.
> 

These are all not tested at all? In such case, since these are not
trivial changes, please mark the series as RFT.

I will not be able to test these for some days, so it must wait.


Best regards,
Krzysztof
William McVicker Sept. 29, 2021, 7:48 p.m. UTC | #2
On Wed, Sep 29, 2021 at 6:02 AM Krzysztof Kozlowski
<krzysztof.kozlowski@canonical.com> wrote:
>
> On 29/09/2021 01:56, Will McVicker wrote:
> > This is v2 of the series of patches that modularizes a number of core
> > ARCH_EXYNOS drivers. Based off of the feedback from the v1 series, I have
> > modularized all of the drivers that are removed from the ARCH_EXYNOS
> > series of "select XXX". This includes setting the following configs as
> > tristate:
> >
> >  * COMMON_CLK_SAMSUNG
> >  * EXYNOS_ARM64_COMMON_CLK
> >  * PINCTRL_SAMSUNG
> >  * PINCTRL_EXYNOS
> >  * EXYNOS_PMU_ARM64
> >  * EXYNOS_PM_DOMAINS
> >
> > Additionally, it introduces the config EXYNOS_PMU_ARM64 and EXYNOS_PMU_ARM
> > which was previously EXYNOS_PMU and EXYNOS_PMU_ARM_DRIVERS respectively.
> > The reason for these new configs is because we are not able to easily
> > modularize the ARMv7 PMU driver due to built-in arch dependencies on
> > pmu_base_addr under arch/arm/mach-exynos/*. So the new configs split up
> > the ARM and ARM64 portions into two separate configs.
> >
> > Overall, these drivers didn't require much refactoring and converted to
> > modules relatively easily. However, due to my lack of exynos hardware, I
> > was not able to boot test these changes. I'm mostly concerned about the
> > CLK_OF_DECLARE() changes having dependencies on early timers. So I'm
> > requesting help for testing these changes on the respective hardware.
> >
>
> These are all not tested at all? In such case, since these are not
> trivial changes, please mark the series as RFT.
>
> I will not be able to test these for some days, so it must wait.
>
>
> Best regards,
> Krzysztof

+Cc Arnd and Olof,

Hi Krzysztof,

To avoid the scrambled conversation from the first patchset, I'm going
to address all your general questions here in the cover letter thread
so that it's easier for everyone to follow and reference in the
future.

>What is more, it seems you entirely ignored Geert's comments. I pointed
>attention to it last time and you just said you will send v2 instead of
>joining discussion.
>
>It's a NAK for this reason - ignoring what Geert brought: you just broke
>distro configs for Exynos.

First off I did want to chime into the discussion from the previous
patchset, but I felt that Lee and Saravana addressed all your concerns
regarding the intent and feasibility. You also made it clear what the
next steps were that I needed to take.

>Please also explain why Exynos is so special that we deviate from the
>policy for all SoC that critical SoC-related drivers have to be enabled
>(built-in or as module).

I am not actually changing ANY default build configurations here and
I'm not removing any existing configuration. I tried to make it pretty
clear in my original patch series commit messages that none of my
changes modify the default behavior. The .config is the same with and
without my patches. All of these drivers remain enabled as built-in.
So if there is a distro that requires all of these drivers to be
built-in, then they can continue as is without noticing any
difference. IOW, all of these changes are/should be backwards
compatible.

I really appreciate yours and John Stultz's comments regarding
including the "why" in my commit message wording. I will spend more
time on the next series on trying to write a more meaningful commit
message, but before that we can surely discuss the "why" here.

As mentioned by Lee and Saravana, our common goal is to make it easier
for everyone to contribute upstream. In particular, this series of
patches is laying the ground work for distros to have more flexibility
in supporting a wider range of platforms without forcing everyone to
include unnecessary drivers. You said that upstream supports a generic
kernel, but I argue that the upstream "generic" arm64 kernel can't be
considered generic if it builds in SoC specific drivers that can be
modules. This patch series is addressing exactly that -- allow distros
to move SoC specific drivers out of the core kernel and into modules.
Ultimately, our goal is to be able to directly develop with the
upstream kernel on new and old SoCs by not including SoC specific
drivers in our generic kernel distro. This helps the upstream
community in a number of ways:

(1) It makes the ARM64 generic kernel smaller by converting more
drivers into modules
(2) It makes it a lot easier for everyone to develop upstream if they
can directly use the upstream kernel without carrying downstream
changes.

>Even if there was, I think it is good to have dependencies like
>ARCH_EXYNOS, as they let us partition the (19000, as Arnd said recently)
>Kconfig symbols into better manageable groups.  Without these, we cannot
>do better than "depends on ARM || ARM64 || COMPILE_TEST".

My patch series still keeps the dependencies on ARCH_EXYNOS. I am
totally fine with "depends on ARCH_EXYNOS" and totally fine with
"default ARCH_EXYNOS". The problem we have is that ARCH_EXYNOS
forcefully selects SoC specific drivers to be built-in because it just
adds more and more SoC-specific drivers to a generic kernel.

I know you are asking for me to only push changes that have proven to
work. The theory behind these changes has been proven downstream on
other devices and I'm more than willing to help debug any issues that
arise out of this patch series, but since I don't have the hardware
myself I do need help with device testing these changes. We are not
trying to trick upstream in anyway to accept something that is not
functional or going to hurt the upstream community. I am more than
willing to help upstream and am totally willing to work with upstream
to verify all of these changes before they are accepted (feel free to
send me any dusty, unused hardware lying around if you want the extra
help with device testing).

I hope that helps clarifies things! I will address other patch
specific comments in those threads as well.

Thanks,
Will
Krzysztof Kozlowski Sept. 30, 2021, 6:14 a.m. UTC | #3
On 29/09/2021 21:48, Will McVicker wrote:
> On Wed, Sep 29, 2021 at 6:02 AM Krzysztof Kozlowski
> <krzysztof.kozlowski@canonical.com> wrote:
>>
>> On 29/09/2021 01:56, Will McVicker wrote:
>>> This is v2 of the series of patches that modularizes a number of core
>>> ARCH_EXYNOS drivers. Based off of the feedback from the v1 series, I have
>>> modularized all of the drivers that are removed from the ARCH_EXYNOS
>>> series of "select XXX". This includes setting the following configs as
>>> tristate:
>>>
>>>  * COMMON_CLK_SAMSUNG
>>>  * EXYNOS_ARM64_COMMON_CLK
>>>  * PINCTRL_SAMSUNG
>>>  * PINCTRL_EXYNOS
>>>  * EXYNOS_PMU_ARM64
>>>  * EXYNOS_PM_DOMAINS
>>>
>>> Additionally, it introduces the config EXYNOS_PMU_ARM64 and EXYNOS_PMU_ARM
>>> which was previously EXYNOS_PMU and EXYNOS_PMU_ARM_DRIVERS respectively.
>>> The reason for these new configs is because we are not able to easily
>>> modularize the ARMv7 PMU driver due to built-in arch dependencies on
>>> pmu_base_addr under arch/arm/mach-exynos/*. So the new configs split up
>>> the ARM and ARM64 portions into two separate configs.
>>>
>>> Overall, these drivers didn't require much refactoring and converted to
>>> modules relatively easily. However, due to my lack of exynos hardware, I
>>> was not able to boot test these changes. I'm mostly concerned about the
>>> CLK_OF_DECLARE() changes having dependencies on early timers. So I'm
>>> requesting help for testing these changes on the respective hardware.
>>>
>>
>> These are all not tested at all? In such case, since these are not
>> trivial changes, please mark the series as RFT.
>>
>> I will not be able to test these for some days, so it must wait.
>>
>>
>> Best regards,
>> Krzysztof
> 
> +Cc Arnd and Olof,
> 
> Hi Krzysztof,
> 
> To avoid the scrambled conversation from the first patchset, I'm going
> to address all your general questions here in the cover letter thread
> so that it's easier for everyone to follow and reference in the
> future.
> 
>> What is more, it seems you entirely ignored Geert's comments. I pointed
>> attention to it last time and you just said you will send v2 instead of
>> joining discussion.
>>
>> It's a NAK for this reason - ignoring what Geert brought: you just broke
>> distro configs for Exynos.
> 
> First off I did want to chime into the discussion from the previous
> patchset, but I felt that Lee and Saravana addressed all your concerns
> regarding the intent and feasibility. You also made it clear what the
> next steps were that I needed to take.

One of the steps was problem with distros using everything as modules.
They should not receive these drivers as modules.
Reminder: these are essential drivers and all Exynos platforms must have
them as built-in (at least till someone really tests this on multiple
setups).

> 
>> Please also explain why Exynos is so special that we deviate from the
>> policy for all SoC that critical SoC-related drivers have to be enabled
>> (built-in or as module).
> 
> I am not actually changing ANY default build configurations here and
> I'm not removing any existing configuration.

You are changing not default, but selectability which is part of the
enforced configuration to make platforms working. The distros do not
always choose defaults but rather all as modules. Kernel configuration
is huge and complex, so by mistake they could now even disable
potentially essential driver. There is no need to disable for example
essential clock driver on a supported Exynos platform.

> I tried to make it pretty
> clear in my original patch series commit messages that none of my
> changes modify the default behavior. The .config is the same with and
> without my patches. All of these drivers remain enabled as built-in.
> So if there is a distro that requires all of these drivers to be
> built-in, then they can continue as is without noticing any
> difference. IOW, all of these changes are/should be backwards
> compatible.

I was not referring to default neither to backwards compatibility.
Please explain why Exynos is special that it does not require essential
drivers to be selected as built-in. For example why aren't same changes
done for Renesas?

Is that now a new global approach that all SoC drivers should be allowed
to be disabled for ARCH_XXX?

> 
> I really appreciate yours and John Stultz's comments regarding
> including the "why" in my commit message wording. I will spend more
> time on the next series on trying to write a more meaningful commit
> message, but before that we can surely discuss the "why" here.
> 
> As mentioned by Lee and Saravana, our common goal is to make it easier
> for everyone to contribute upstream. In particular, this series of
> patches is laying the ground work for distros to have more flexibility
> in supporting a wider range of platforms without forcing everyone to
> include unnecessary drivers. 

The drivers are usually necessary. Actually, you admitted you didn't
test patchset, so how do you even know that they are unnecessary? How do
 you judge?

> You said that upstream supports a generic
> kernel, but I argue that the upstream "generic" arm64 kernel can't be
> considered generic if it builds in SoC specific drivers that can be
> modules.

Good point, but since having them as modules was not tested, I consider
it as theoretical topic.

> This patch series is addressing exactly that -- allow distros
> to move SoC specific drivers out of the core kernel and into modules.
> Ultimately, our goal is to be able to directly develop with the
> upstream kernel on new and old SoCs by not including SoC specific
> drivers in our generic kernel distro. This helps the upstream
> community in a number of ways:
> 
> (1) It makes the ARM64 generic kernel smaller by converting more
> drivers into modules
> (2) It makes it a lot easier for everyone to develop upstream if they
> can directly use the upstream kernel without carrying downstream
> changes.

I don't understand the point (2) here. Anyone can use upstream kernel
for supported and unsupported platforms. How upstream benefits from a
change affecting supported platforms made for unsupported, downstream
platforms.

> 
>> Even if there was, I think it is good to have dependencies like
>> ARCH_EXYNOS, as they let us partition the (19000, as Arnd said recently)
>> Kconfig symbols into better manageable groups.  Without these, we cannot
>> do better than "depends on ARM || ARM64 || COMPILE_TEST".
> 
> My patch series still keeps the dependencies on ARCH_EXYNOS. I am
> totally fine with "depends on ARCH_EXYNOS" and totally fine with
> "default ARCH_EXYNOS". The problem we have is that ARCH_EXYNOS
> forcefully selects SoC specific drivers to be built-in because it just
> adds more and more SoC-specific drivers to a generic kernel.

The selected drivers are essential for supported platforms. We don't
even know what are these unsupported, downstream platforms you want
customize kernel for. They cannot be audited, cannot be compared.

Therefore I don't agree with calling it a "problem" that we select
*necessary* drivers for supported platforms. It's by design - supported
platforms should receive them without ability to remove.

If you want to change it, let me paste from previous discussion:

Affecting upstream platforms just because vendor/downstream does not
want to mainline some code is unacceptable. Please upstream your drivers
and DTS.

Everyone else are working like this. NXP, Renesas, Xilinx, TI, Rockchip,
AllWinner. Samsung or Google is not special to receive an exception for
this.


> 
> I know you are asking for me to only push changes that have proven to
> work. 

Yep, tested.

> The theory behind these changes has been proven downstream on
> other devices and I'm more than willing to help debug any issues that
> arise out of this patch series, but since I don't have the hardware
> myself I do need help with device testing these changes.

Downstream uses very specific Linux "distro" or fork - Android - with
changes not present in others. Although tests on other
(unsupported/unupstreamed) devices is good, but it's not sufficient. For
example I guess that half or most of Odroid devices are running standard
Linux distro (Arch, Ubuntu).

You also mentioned downstream devices but without actually ever defining
them. Please be more specific. What SoC, what hardware?

Best regards,
Krzysztof
Arnd Bergmann Sept. 30, 2021, 9:01 a.m. UTC | #4
On Thu, Sep 30, 2021 at 8:15 AM Krzysztof Kozlowski
<krzysztof.kozlowski@canonical.com> wrote:
> On 29/09/2021 21:48, Will McVicker wrote:
> > On Wed, Sep 29, 2021 at 6:02 AM Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com> wrote:
> >> What is more, it seems you entirely ignored Geert's comments. I pointed
> >> attention to it last time and you just said you will send v2 instead of
> >> joining discussion.
> >>
> >> It's a NAK for this reason - ignoring what Geert brought: you just broke
> >> distro configs for Exynos.
> >
> > First off I did want to chime into the discussion from the previous
> > patchset, but I felt that Lee and Saravana addressed all your concerns
> > regarding the intent and feasibility. You also made it clear what the
> > next steps were that I needed to take.
>
> One of the steps was problem with distros using everything as modules.
> They should not receive these drivers as modules.
> Reminder: these are essential drivers and all Exynos platforms must have
> them as built-in (at least till someone really tests this on multiple
> setups).

Agreed. I absolutely love the work of the GKI developers to turn more
drivers into loadable modules, but the "correctness-first" principle is
not up for negotiation. If you are uncomfortable with the code or the
amount of testing because you think it breaks something, you should
reject the patches. Moving core platform functionality is fundamentally
hard and it can go wrong in all possible ways where it used to work
by accident because the init order was fixed.

> >> Please also explain why Exynos is so special that we deviate from the
> >> policy for all SoC that critical SoC-related drivers have to be enabled
> >> (built-in or as module).
> >
> > I am not actually changing ANY default build configurations here and
> > I'm not removing any existing configuration.
>
> You are changing not default, but selectability which is part of the
> enforced configuration to make platforms working. The distros do not
> always choose defaults but rather all as modules. Kernel configuration
> is huge and complex, so by mistake they could now even disable
> potentially essential driver. There is no need to disable for example
> essential clock driver on a supported Exynos platform.

I'm not overly worried about the defaults. If the drivers work as loadable
modules, I'm happy with them being loadable modules in distros.
If they don't work this way, then the patches are broken and should
not get merged.

I don't even mind having essential drivers that can be turned off,
since we already have a ton of those (e.g. serial ports on most platforms).
It's up to distros to know which drivers to enable, though having
either reasonable defaults or fail-safe Kconfig dependencies (e.g.
making it impossible to turn off but allowing modules) is clearly
best.

> > I tried to make it pretty
> > clear in my original patch series commit messages that none of my
> > changes modify the default behavior. The .config is the same with and
> > without my patches. All of these drivers remain enabled as built-in.
> > So if there is a distro that requires all of these drivers to be
> > built-in, then they can continue as is without noticing any
> > difference. IOW, all of these changes are/should be backwards
> > compatible.
>
> I was not referring to default neither to backwards compatibility.
> Please explain why Exynos is special that it does not require essential
> drivers to be selected as built-in. For example why aren't same changes
> done for Renesas?
>
> Is that now a new global approach that all SoC drivers should be allowed
> to be disabled for ARCH_XXX?

I wouldn't enforce it either way across platforms. I would prefer drivers
to be loadable modules where possible (and tested), rather than
selected by the platform Kconfig. If you want to ensure the exynos
drivers are impossible to turn into a nonworking state, that's up to you.

> > You said that upstream supports a generic
> > kernel, but I argue that the upstream "generic" arm64 kernel can't be
> > considered generic if it builds in SoC specific drivers that can be
> > modules.
>
> Good point, but since having them as modules was not tested, I consider
> it as theoretical topic.

I actually disagree strongly with labelling the kernel as "non-generic"
just because it requires platform specific support to be built-in rather than
a loadable module. This has never been possible on any platform
I'm aware of, and likely never will, except for minor variations of
an existing platform.

Look at x86 as an example: there are less than a dozen SoC platforms
supported and they are incredibly similar hardware-wise, but the kernel
is anything but "generic" in the sense that was mentioned above.
Most of the platform specific drivers in arch/x86/platform and the
corresponding bits in drivers/{irqchip,clocksource,iommu} are always
built-in, and a lot more is hardwired in architecture code as PCI
quirks or conditional on cpuid or dmi firmware checks.

> >> Even if there was, I think it is good to have dependencies like
> >> ARCH_EXYNOS, as they let us partition the (19000, as Arnd said recently)
> >> Kconfig symbols into better manageable groups.  Without these, we cannot
> >> do better than "depends on ARM || ARM64 || COMPILE_TEST".
> >
> > My patch series still keeps the dependencies on ARCH_EXYNOS. I am
> > totally fine with "depends on ARCH_EXYNOS" and totally fine with
> > "default ARCH_EXYNOS". The problem we have is that ARCH_EXYNOS
> > forcefully selects SoC specific drivers to be built-in because it just
> > adds more and more SoC-specific drivers to a generic kernel.
>
> The selected drivers are essential for supported platforms. We don't
> even know what are these unsupported, downstream platforms you want
> customize kernel for. They cannot be audited, cannot be compared.
>
> Therefore I don't agree with calling it a "problem" that we select
> *necessary* drivers for supported platforms. It's by design - supported
> platforms should receive them without ability to remove.
>
> If you want to change it, let me paste from previous discussion:
>
> Affecting upstream platforms just because vendor/downstream does not
> want to mainline some code is unacceptable. Please upstream your drivers
> and DTS.

Agreed. I understand that it would be convenient for SoC vendors to
never have to upstream their platform code again, and that Android
would benefit from this in the short run.

From my upstream perspective, this is absolutely a non-goal. If it becomes
easier as a side-effect of making the kernel more modular, that's fine.
The actual goal should be to get more people to contribute upstream so
devices run code that has been reviewed and integrated into new kernels.

> > I know you are asking for me to only push changes that have proven to
> > work.
>
> Yep, tested.

I'm generally fine with "obviously correct" ones as well, but it's up to
you to categorize them ;-)

         Arnd
Lee Jones Sept. 30, 2021, 9:23 a.m. UTC | #5
I've taken the liberty of cherry-picking some of the points you have
reiteratted a few times.  Hopefully I can help to address them
adequently.

On Thu, 30 Sep 2021, Krzysztof Kozlowski wrote:
> Reminder: these are essential drivers and all Exynos platforms must have
> them as built-in (at least till someone really tests this on multiple
> setups).

> Therefore I don't agree with calling it a "problem" that we select
> *necessary* drivers for supported platforms. It's by design - supported
> platforms should receive them without ability to remove.

> The selected drivers are essential for supported platforms.

SoC specific drivers are only essential/necessary/required in
images designed to execute solely on a platform that requires them.
For a kernel image which is designed to be generic i.e. one that has
the ability to boot on vast array of platforms, the drivers simply
have to be *available*.

Forcing all H/W drivers that are only *potentially* utilised on *some*
platforms as core binary built-ins doesn't make any technical sense.
The two most important issues this causes are image size and a lack of
configurability/flexibility relating to real-world application i.e.
the one issue we already agreed upon; H/W or features that are too
new (pre-release).

Bloating a generic kernel with potentially hundreds of unnecessary
drivers that will never be executed in the vast majority of instances
doesn't achieve anything.  If we have a kernel image that has the
ability to boot on 10's of architectures which have 10's of platforms
each, that's a whole host of unused/wasted executable space.

In order for vendors to work more closely with upstream, they need the
ability to over-ride a *few* drivers to supplement them with some
functionality which they believe provides them with a competitive edge
(I think you called this "value-add" before) prior to the release of a
device.  This is a requirement that cannot be worked around.

By insisting on drivers (most of which will not be executed in the
vast majority of cases) to be built-in, you are insisting on a
downstream kernel fork, which all of us would like to avoid [0].

[0] Full disclosure: part of my role at Linaro is to keep the Android
kernel running as close to Mainline as possible and encourage/push the
upstream-first mantra, hence my involvement with this and other sets.
I assure you all intentions are good and honourable.  If you haven't
already seen it, please see Todd's most recent update on the goals and
status of GKI:

  Article: https://tinyurl.com/saaen3sp
  Video:   https://youtu.be/O_lCFGinFPM

> We don't even know what are these unsupported, downstream platforms
> you want customize kernel for. They cannot be audited, cannot be
> compared.  Affecting upstream platforms just because
> vendor/downstream does not want to mainline some code is
> unacceptable. Please upstream your drivers and DTS.

> You also mentioned downstream devices but without actually ever defining
> them. Please be more specific. What SoC, what hardware?

Accepting changes based on the proviso that vendors upstream all of
their work-in-progress solutions is an unfair position.  We already
discussed why upstreaming support for bleeding edge H/W and
functionality is unrealistic in terms of competitive advantage.

Besides, we might not be talking about new silicon at all (I don't
believe anyone alluded to that).  The flexibility in configuration
simply allows for generic upstream drivers to be swapped out for ones
which may have more or slightly different functionality (that can't be
publicly shared until release).

> Please explain why Exynos is special that it does not require essential
> drivers to be selected as built-in. For example why aren't same changes
> done for Renesas?

> Everyone else are working like this. NXP, Renesas, Xilinx, TI, Rockchip,
> AllWinner. Samsung or Google is not special to receive an exception for
> this.

Exynos isn't special in this regard.  This applies to any vendor who
releases Android images and wishes to be solve all of the issues the
GKI project addresses (please read the article above for more about
this).

I truly hope this has helped to align my thoughts with yours.
Lee Jones Sept. 30, 2021, 9:30 a.m. UTC | #6
On Thu, 30 Sep 2021, Arnd Bergmann wrote:

> On Thu, Sep 30, 2021 at 8:15 AM Krzysztof Kozlowski
> <krzysztof.kozlowski@canonical.com> wrote:
> > On 29/09/2021 21:48, Will McVicker wrote:
> > > On Wed, Sep 29, 2021 at 6:02 AM Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com> wrote:
> > >> What is more, it seems you entirely ignored Geert's comments. I pointed
> > >> attention to it last time and you just said you will send v2 instead of
> > >> joining discussion.
> > >>
> > >> It's a NAK for this reason - ignoring what Geert brought: you just broke
> > >> distro configs for Exynos.
> > >
> > > First off I did want to chime into the discussion from the previous
> > > patchset, but I felt that Lee and Saravana addressed all your concerns
> > > regarding the intent and feasibility. You also made it clear what the
> > > next steps were that I needed to take.
> >
> > One of the steps was problem with distros using everything as modules.
> > They should not receive these drivers as modules.
> > Reminder: these are essential drivers and all Exynos platforms must have
> > them as built-in (at least till someone really tests this on multiple
> > setups).
> 
> Agreed. I absolutely love the work of the GKI developers to turn more
> drivers into loadable modules, but the "correctness-first" principle is
> not up for negotiation. If you are uncomfortable with the code or the
> amount of testing because you think it breaks something, you should
> reject the patches. Moving core platform functionality is fundamentally
> hard and it can go wrong in all possible ways where it used to work
> by accident because the init order was fixed.
> 
> > >> Please also explain why Exynos is so special that we deviate from the
> > >> policy for all SoC that critical SoC-related drivers have to be enabled
> > >> (built-in or as module).
> > >
> > > I am not actually changing ANY default build configurations here and
> > > I'm not removing any existing configuration.
> >
> > You are changing not default, but selectability which is part of the
> > enforced configuration to make platforms working. The distros do not
> > always choose defaults but rather all as modules. Kernel configuration
> > is huge and complex, so by mistake they could now even disable
> > potentially essential driver. There is no need to disable for example
> > essential clock driver on a supported Exynos platform.
> 
> I'm not overly worried about the defaults. If the drivers work as loadable
> modules, I'm happy with them being loadable modules in distros.
> If they don't work this way, then the patches are broken and should
> not get merged.
> 
> I don't even mind having essential drivers that can be turned off,
> since we already have a ton of those (e.g. serial ports on most platforms).
> It's up to distros to know which drivers to enable, though having
> either reasonable defaults or fail-safe Kconfig dependencies (e.g.
> making it impossible to turn off but allowing modules) is clearly
> best.
> 
> > > I tried to make it pretty
> > > clear in my original patch series commit messages that none of my
> > > changes modify the default behavior. The .config is the same with and
> > > without my patches. All of these drivers remain enabled as built-in.
> > > So if there is a distro that requires all of these drivers to be
> > > built-in, then they can continue as is without noticing any
> > > difference. IOW, all of these changes are/should be backwards
> > > compatible.
> >
> > I was not referring to default neither to backwards compatibility.
> > Please explain why Exynos is special that it does not require essential
> > drivers to be selected as built-in. For example why aren't same changes
> > done for Renesas?
> >
> > Is that now a new global approach that all SoC drivers should be allowed
> > to be disabled for ARCH_XXX?
> 
> I wouldn't enforce it either way across platforms. I would prefer drivers
> to be loadable modules where possible (and tested), rather than
> selected by the platform Kconfig. If you want to ensure the exynos
> drivers are impossible to turn into a nonworking state, that's up to you.
> 
> > > You said that upstream supports a generic
> > > kernel, but I argue that the upstream "generic" arm64 kernel can't be
> > > considered generic if it builds in SoC specific drivers that can be
> > > modules.
> >
> > Good point, but since having them as modules was not tested, I consider
> > it as theoretical topic.
> 
> I actually disagree strongly with labelling the kernel as "non-generic"
> just because it requires platform specific support to be built-in rather than
> a loadable module. This has never been possible on any platform
> I'm aware of, and likely never will, except for minor variations of
> an existing platform.
> 
> Look at x86 as an example: there are less than a dozen SoC platforms
> supported and they are incredibly similar hardware-wise, but the kernel
> is anything but "generic" in the sense that was mentioned above.
> Most of the platform specific drivers in arch/x86/platform and the
> corresponding bits in drivers/{irqchip,clocksource,iommu} are always
> built-in, and a lot more is hardwired in architecture code as PCI
> quirks or conditional on cpuid or dmi firmware checks.
> 
> > >> Even if there was, I think it is good to have dependencies like
> > >> ARCH_EXYNOS, as they let us partition the (19000, as Arnd said recently)
> > >> Kconfig symbols into better manageable groups.  Without these, we cannot
> > >> do better than "depends on ARM || ARM64 || COMPILE_TEST".
> > >
> > > My patch series still keeps the dependencies on ARCH_EXYNOS. I am
> > > totally fine with "depends on ARCH_EXYNOS" and totally fine with
> > > "default ARCH_EXYNOS". The problem we have is that ARCH_EXYNOS
> > > forcefully selects SoC specific drivers to be built-in because it just
> > > adds more and more SoC-specific drivers to a generic kernel.
> >
> > The selected drivers are essential for supported platforms. We don't
> > even know what are these unsupported, downstream platforms you want
> > customize kernel for. They cannot be audited, cannot be compared.
> >
> > Therefore I don't agree with calling it a "problem" that we select
> > *necessary* drivers for supported platforms. It's by design - supported
> > platforms should receive them without ability to remove.
> >
> > If you want to change it, let me paste from previous discussion:
> >
> > Affecting upstream platforms just because vendor/downstream does not
> > want to mainline some code is unacceptable. Please upstream your drivers
> > and DTS.
> 
> Agreed. I understand that it would be convenient for SoC vendors to
> never have to upstream their platform code again, and that Android
> would benefit from this in the short run.
> 
> From my upstream perspective, this is absolutely a non-goal. If it becomes
> easier as a side-effect of making the kernel more modular, that's fine.
> The actual goal should be to get more people to contribute upstream so
> devices run code that has been reviewed and integrated into new kernels.
> 
> > > I know you are asking for me to only push changes that have proven to
> > > work.
> >
> > Yep, tested.
> 
> I'm generally fine with "obviously correct" ones as well, but it's up to
> you to categorize them ;-)

Arnd,

  FWIW, I agree with all of your points.

Krzysztof,

  It sounds like a lack of testing is your main concern.

  How can I help here?  What H/W do I need to be able to fully test this?
Geert Uytterhoeven Sept. 30, 2021, 10:05 a.m. UTC | #7
On Thu, Sep 30, 2021 at 11:01 AM Arnd Bergmann <arnd@arndb.de> wrote:
> On Thu, Sep 30, 2021 at 8:15 AM Krzysztof Kozlowski
> <krzysztof.kozlowski@canonical.com> wrote:
> > On 29/09/2021 21:48, Will McVicker wrote:
> > > On Wed, Sep 29, 2021 at 6:02 AM Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com> wrote:
> > >> What is more, it seems you entirely ignored Geert's comments. I pointed
> > >> attention to it last time and you just said you will send v2 instead of
> > >> joining discussion.
> > >>
> > >> It's a NAK for this reason - ignoring what Geert brought: you just broke
> > >> distro configs for Exynos.
> > >
> > > First off I did want to chime into the discussion from the previous
> > > patchset, but I felt that Lee and Saravana addressed all your concerns
> > > regarding the intent and feasibility. You also made it clear what the
> > > next steps were that I needed to take.
> >
> > One of the steps was problem with distros using everything as modules.
> > They should not receive these drivers as modules.
> > Reminder: these are essential drivers and all Exynos platforms must have
> > them as built-in (at least till someone really tests this on multiple
> > setups).
>
> Agreed. I absolutely love the work of the GKI developers to turn more
> drivers into loadable modules, but the "correctness-first" principle is
> not up for negotiation. If you are uncomfortable with the code or the
> amount of testing because you think it breaks something, you should
> reject the patches. Moving core platform functionality is fundamentally
> hard and it can go wrong in all possible ways where it used to work
> by accident because the init order was fixed.

Exactly.
And patches to change this that haven't been tested at all should be
very clearly marked that way.

> > > You said that upstream supports a generic
> > > kernel, but I argue that the upstream "generic" arm64 kernel can't be
> > > considered generic if it builds in SoC specific drivers that can be
> > > modules.
> >
> > Good point, but since having them as modules was not tested, I consider
> > it as theoretical topic.
>
> I actually disagree strongly with labelling the kernel as "non-generic"
> just because it requires platform specific support to be built-in rather than
> a loadable module. This has never been possible on any platform
> I'm aware of, and likely never will, except for minor variations of
> an existing platform.

There seem to be two different meanings for a "generic" kernel:
  1. To me, a generic kernel is a kernel that can boot on a variety of
     hardware platforms. and will operate correctly with all expected
     functionality.  That is, it should include[1] drivers for all that
     hardware.
     Examples of this are arm/multi_v7_defconfig and arm64/defconfig.
  2. To GKI, a generic kernel is a minimal kernel that doesn't include
     any[2] hardware-specific drivers, and has loadable modules for
     all[2] hardware-specific drivers. Before loading these modules, it
     can run an application on the CPU, but it cannot do any I/O to real
     hardware.

[1] Developers usually want the drivers builtin, distros want them
     modular.
[2] On arm/arm64, this must rely on the GIC and architectured timer
    being built-in (AFAIK these cannot be modules yet), and thus
    precludes booting on Cortex A9 and older that lack GIC and/or
    architectured timer as their primary interrupt controller and timer.
    I'm wondering how this works on other architectures with
    less-standardized interrupt controllers and timers?

> Look at x86 as an example: there are less than a dozen SoC platforms
> supported and they are incredibly similar hardware-wise, but the kernel
> is anything but "generic" in the sense that was mentioned above.
> Most of the platform specific drivers in arch/x86/platform and the
> corresponding bits in drivers/{irqchip,clocksource,iommu} are always
> built-in, and a lot more is hardwired in architecture code as PCI
> quirks or conditional on cpuid or dmi firmware checks.

Indeed. X86 has less variability in the hardware than other
architectures, and has all the critical drivers built-in or in firmware.
Where there have been big differences (remember Voyager?), x86 never
had the option to have a generic (in the meaning of 1 above) kernel
(but James did try, cfr. https://lwn.net/Articles/328469/).

Gr{oetje,eeting}s,

                        Geert
Geert Uytterhoeven Sept. 30, 2021, 10:17 a.m. UTC | #8
Hi Lee,

On Thu, Sep 30, 2021 at 11:23 AM Lee Jones <lee.jones@linaro.org> wrote:
> I've taken the liberty of cherry-picking some of the points you have
> reiteratted a few times.  Hopefully I can help to address them
> adequently.
>
> On Thu, 30 Sep 2021, Krzysztof Kozlowski wrote:
> > Reminder: these are essential drivers and all Exynos platforms must have
> > them as built-in (at least till someone really tests this on multiple
> > setups).
>
> > Therefore I don't agree with calling it a "problem" that we select
> > *necessary* drivers for supported platforms. It's by design - supported
> > platforms should receive them without ability to remove.
>
> > The selected drivers are essential for supported platforms.
>
> SoC specific drivers are only essential/necessary/required in
> images designed to execute solely on a platform that requires them.

Why?

> For a kernel image which is designed to be generic i.e. one that has
> the ability to boot on vast array of platforms, the drivers simply
> have to be *available*.

If the drivers are really essential/necessary/required, this precludes
running the generic kernel image on the platform that requires them,
making the kernel not sufficiently generic.

> Forcing all H/W drivers that are only *potentially* utilised on *some*
> platforms as core binary built-ins doesn't make any technical sense.
> The two most important issues this causes are image size and a lack of
> configurability/flexibility relating to real-world application i.e.
> the one issue we already agreed upon; H/W or features that are too
> new (pre-release).

True, if "potentially".  If not potentially, they must be included.

> Bloating a generic kernel with potentially hundreds of unnecessary
> drivers that will never be executed in the vast majority of instances
> doesn't achieve anything.  If we have a kernel image that has the
> ability to boot on 10's of architectures which have 10's of platforms
> each, that's a whole host of unused/wasted executable space.

The key here is if the driver is required or not to use the platform,
and why it is required.  If the requirement comes from some deficiency
in the kernel code or config system, it should be fixed, if possible.
And the fix should be tested.
If it cannot be fixed, the driver should be included, else it would
preclude running the generic kernel on the affected platform.

> In order for vendors to work more closely with upstream, they need the
> ability to over-ride a *few* drivers to supplement them with some
> functionality which they believe provides them with a competitive edge
> (I think you called this "value-add" before) prior to the release of a
> device.  This is a requirement that cannot be worked around.
>
> By insisting on drivers (most of which will not be executed in the
> vast majority of cases) to be built-in, you are insisting on a
> downstream kernel fork, which all of us would like to avoid [0].
>
> [0] Full disclosure: part of my role at Linaro is to keep the Android
> kernel running as close to Mainline as possible and encourage/push the
> upstream-first mantra, hence my involvement with this and other sets.
> I assure you all intentions are good and honourable.  If you haven't
> already seen it, please see Todd's most recent update on the goals and
> status of GKI:
>
>   Article: https://tinyurl.com/saaen3sp
>   Video:   https://youtu.be/O_lCFGinFPM
>
> > We don't even know what are these unsupported, downstream platforms
> > you want customize kernel for. They cannot be audited, cannot be
> > compared.  Affecting upstream platforms just because
> > vendor/downstream does not want to mainline some code is
> > unacceptable. Please upstream your drivers and DTS.
>
> > You also mentioned downstream devices but without actually ever defining
> > them. Please be more specific. What SoC, what hardware?
>
> Accepting changes based on the proviso that vendors upstream all of
> their work-in-progress solutions is an unfair position.  We already
> discussed why upstreaming support for bleeding edge H/W and
> functionality is unrealistic in terms of competitive advantage.
>
> Besides, we might not be talking about new silicon at all (I don't
> believe anyone alluded to that).  The flexibility in configuration
> simply allows for generic upstream drivers to be swapped out for ones
> which may have more or slightly different functionality (that can't be
> publicly shared until release).

The replacement drivers are already a downstream kernel fork, which you
would like to avoid?

Gr{oetje,eeting}s,

                        Geert
Krzysztof Kozlowski Sept. 30, 2021, 10:33 a.m. UTC | #9
On 30/09/2021 11:30, Lee Jones wrote:
> On Thu, 30 Sep 2021, Arnd Bergmann wrote:
> 
>> On Thu, Sep 30, 2021 at 8:15 AM Krzysztof Kozlowski
>> <krzysztof.kozlowski@canonical.com> wrote:
>>> On 29/09/2021 21:48, Will McVicker wrote:
>>>> On Wed, Sep 29, 2021 at 6:02 AM Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com> wrote:
>>>>> What is more, it seems you entirely ignored Geert's comments. I pointed
>>>>> attention to it last time and you just said you will send v2 instead of
>>>>> joining discussion.
>>>>>
>>>>> It's a NAK for this reason - ignoring what Geert brought: you just broke
>>>>> distro configs for Exynos.
>>>>
>>>> First off I did want to chime into the discussion from the previous
>>>> patchset, but I felt that Lee and Saravana addressed all your concerns
>>>> regarding the intent and feasibility. You also made it clear what the
>>>> next steps were that I needed to take.
>>>
>>> One of the steps was problem with distros using everything as modules.
>>> They should not receive these drivers as modules.
>>> Reminder: these are essential drivers and all Exynos platforms must have
>>> them as built-in (at least till someone really tests this on multiple
>>> setups).
>>
>> Agreed. I absolutely love the work of the GKI developers to turn more
>> drivers into loadable modules, but the "correctness-first" principle is
>> not up for negotiation. If you are uncomfortable with the code or the
>> amount of testing because you think it breaks something, you should
>> reject the patches. Moving core platform functionality is fundamentally
>> hard and it can go wrong in all possible ways where it used to work
>> by accident because the init order was fixed.
>>
>>>>> Please also explain why Exynos is so special that we deviate from the
>>>>> policy for all SoC that critical SoC-related drivers have to be enabled
>>>>> (built-in or as module).
>>>>
>>>> I am not actually changing ANY default build configurations here and
>>>> I'm not removing any existing configuration.
>>>
>>> You are changing not default, but selectability which is part of the
>>> enforced configuration to make platforms working. The distros do not
>>> always choose defaults but rather all as modules. Kernel configuration
>>> is huge and complex, so by mistake they could now even disable
>>> potentially essential driver. There is no need to disable for example
>>> essential clock driver on a supported Exynos platform.
>>
>> I'm not overly worried about the defaults. If the drivers work as loadable
>> modules, I'm happy with them being loadable modules in distros.
>> If they don't work this way, then the patches are broken and should
>> not get merged.
>>
>> I don't even mind having essential drivers that can be turned off,
>> since we already have a ton of those (e.g. serial ports on most platforms).
>> It's up to distros to know which drivers to enable, though having
>> either reasonable defaults or fail-safe Kconfig dependencies (e.g.
>> making it impossible to turn off but allowing modules) is clearly
>> best.
>>
>>>> I tried to make it pretty
>>>> clear in my original patch series commit messages that none of my
>>>> changes modify the default behavior. The .config is the same with and
>>>> without my patches. All of these drivers remain enabled as built-in.
>>>> So if there is a distro that requires all of these drivers to be
>>>> built-in, then they can continue as is without noticing any
>>>> difference. IOW, all of these changes are/should be backwards
>>>> compatible.
>>>
>>> I was not referring to default neither to backwards compatibility.
>>> Please explain why Exynos is special that it does not require essential
>>> drivers to be selected as built-in. For example why aren't same changes
>>> done for Renesas?
>>>
>>> Is that now a new global approach that all SoC drivers should be allowed
>>> to be disabled for ARCH_XXX?
>>
>> I wouldn't enforce it either way across platforms. I would prefer drivers
>> to be loadable modules where possible (and tested), rather than
>> selected by the platform Kconfig. If you want to ensure the exynos
>> drivers are impossible to turn into a nonworking state, that's up to you.
>>
>>>> You said that upstream supports a generic
>>>> kernel, but I argue that the upstream "generic" arm64 kernel can't be
>>>> considered generic if it builds in SoC specific drivers that can be
>>>> modules.
>>>
>>> Good point, but since having them as modules was not tested, I consider
>>> it as theoretical topic.
>>
>> I actually disagree strongly with labelling the kernel as "non-generic"
>> just because it requires platform specific support to be built-in rather than
>> a loadable module. This has never been possible on any platform
>> I'm aware of, and likely never will, except for minor variations of
>> an existing platform.
>>
>> Look at x86 as an example: there are less than a dozen SoC platforms
>> supported and they are incredibly similar hardware-wise, but the kernel
>> is anything but "generic" in the sense that was mentioned above.
>> Most of the platform specific drivers in arch/x86/platform and the
>> corresponding bits in drivers/{irqchip,clocksource,iommu} are always
>> built-in, and a lot more is hardwired in architecture code as PCI
>> quirks or conditional on cpuid or dmi firmware checks.
>>
>>>>> Even if there was, I think it is good to have dependencies like
>>>>> ARCH_EXYNOS, as they let us partition the (19000, as Arnd said recently)
>>>>> Kconfig symbols into better manageable groups.  Without these, we cannot
>>>>> do better than "depends on ARM || ARM64 || COMPILE_TEST".
>>>>
>>>> My patch series still keeps the dependencies on ARCH_EXYNOS. I am
>>>> totally fine with "depends on ARCH_EXYNOS" and totally fine with
>>>> "default ARCH_EXYNOS". The problem we have is that ARCH_EXYNOS
>>>> forcefully selects SoC specific drivers to be built-in because it just
>>>> adds more and more SoC-specific drivers to a generic kernel.
>>>
>>> The selected drivers are essential for supported platforms. We don't
>>> even know what are these unsupported, downstream platforms you want
>>> customize kernel for. They cannot be audited, cannot be compared.
>>>
>>> Therefore I don't agree with calling it a "problem" that we select
>>> *necessary* drivers for supported platforms. It's by design - supported
>>> platforms should receive them without ability to remove.
>>>
>>> If you want to change it, let me paste from previous discussion:
>>>
>>> Affecting upstream platforms just because vendor/downstream does not
>>> want to mainline some code is unacceptable. Please upstream your drivers
>>> and DTS.
>>
>> Agreed. I understand that it would be convenient for SoC vendors to
>> never have to upstream their platform code again, and that Android
>> would benefit from this in the short run.
>>
>> From my upstream perspective, this is absolutely a non-goal. If it becomes
>> easier as a side-effect of making the kernel more modular, that's fine.
>> The actual goal should be to get more people to contribute upstream so
>> devices run code that has been reviewed and integrated into new kernels.
>>
>>>> I know you are asking for me to only push changes that have proven to
>>>> work.
>>>
>>> Yep, tested.
>>
>> I'm generally fine with "obviously correct" ones as well, but it's up to
>> you to categorize them ;-)

Thanks Arnd!

> 
> Arnd,
> 
>   FWIW, I agree with all of your points.>
> Krzysztof,
> 
>   It sounds like a lack of testing is your main concern.
> 
>   How can I help here?  What H/W do I need to be able to fully test this?

The changes here need to be tested on affected platforms (ARMv7 and
ARMv8), when built as a modules on some types of regular distros (e.g.
Arch, Ubuntu). From each of such boot I would be happy to see number of
new dmesg warnings/errors plus number of probe deferrals.

Since the drivers could be switched to modules (and some distros might
do it), they might be hit by surprise regressions in boot performance
due to probe deferrals. This should be also checked on these platforms.
Geert pointed out before that clocks in many cases are not optional -
driver needs them and will wait defer.

Assuming of course that boot succeeds. Minor differences in boot speed
should not be a problem, I think, because distro anyway chosen
all-module approach so it accepts the penalty.

Best regards,
Krzysztof
Krzysztof Kozlowski Sept. 30, 2021, 10:52 a.m. UTC | #10
On 30/09/2021 11:23, Lee Jones wrote:
> I've taken the liberty of cherry-picking some of the points you have
> reiteratted a few times.  Hopefully I can help to address them
> adequently.
> 
> On Thu, 30 Sep 2021, Krzysztof Kozlowski wrote:
>> Reminder: these are essential drivers and all Exynos platforms must have
>> them as built-in (at least till someone really tests this on multiple
>> setups).
> 
>> Therefore I don't agree with calling it a "problem" that we select
>> *necessary* drivers for supported platforms. It's by design - supported
>> platforms should receive them without ability to remove.
> 
>> The selected drivers are essential for supported platforms.
> 
> SoC specific drivers are only essential/necessary/required in
> images designed to execute solely on a platform that requires them.
> For a kernel image which is designed to be generic i.e. one that has
> the ability to boot on vast array of platforms, the drivers simply
> have to be *available*.

By "essential", I meant drivers which are needed for supported platform
to boot properly. Also without significant performance penalty due to
probe deferrals.

Therefore if someone selects ARCH_EXYNOS in mainline kernel, it means
he/she wants such devices to be working fine with generic kernel.

This is the difference with term "drivers have to be available".

> 
> Forcing all H/W drivers that are only *potentially* utilised on *some*
> platforms as core binary built-ins doesn't make any technical sense.
> The two most important issues this causes are image size and a lack of
> configurability/flexibility relating to real-world application i.e.
> the one issue we already agreed upon; H/W or features that are too
> new (pre-release).

Geert responded here. If drivers are essential for supported platforms
to boot, having them built-in has technical sense.
For other drivers not and we do not discuss such cases.

> 
> Bloating a generic kernel with potentially hundreds of unnecessary
> drivers that will never be executed in the vast majority of instances
> doesn't achieve anything.  If we have a kernel image that has the
> ability to boot on 10's of architectures which have 10's of platforms
> each, that's a whole host of unused/wasted executable space.

Geert responded here.

> 
> In order for vendors to work more closely with upstream, they need the
> ability to over-ride a *few* drivers to supplement them with some
> functionality which they believe provides them with a competitive edge
> (I think you called this "value-add" before) prior to the release of a
> device.  This is a requirement that cannot be worked around.
> 
> By insisting on drivers (most of which will not be executed in the
> vast majority of cases) to be built-in, you are insisting on a
> downstream kernel fork, which all of us would like to avoid [0].

The same with all the DTS and hundreds of drivers you keep out of tree.

> 
> [0] Full disclosure: part of my role at Linaro is to keep the Android
> kernel running as close to Mainline as possible and encourage/push the
> upstream-first mantra, hence my involvement with this and other sets.
> I assure you all intentions are good and honourable.  If you haven't
> already seen it, please see Todd's most recent update on the goals and
> status of GKI:
> 
>   Article: https://tinyurl.com/saaen3sp
>   Video:   https://youtu.be/O_lCFGinFPM
> 
>> We don't even know what are these unsupported, downstream platforms
>> you want customize kernel for. They cannot be audited, cannot be
>> compared.  Affecting upstream platforms just because
>> vendor/downstream does not want to mainline some code is
>> unacceptable. Please upstream your drivers and DTS.
> 
>> You also mentioned downstream devices but without actually ever defining
>> them. Please be more specific. What SoC, what hardware?
> 
> Accepting changes based on the proviso that vendors upstream all of
> their work-in-progress solutions is an unfair position.  

You twisted (or ignored) my words here. I said before - sacrificing any
mainline platform (e.g. reliable boot for distro) for an out-of-tree
vendor which does no want to upstream drivers is the unfair position.
One of the incentives of upstreaming is to be able to shape kernel and
be considered in kernel upstream decisions. That's the privileged for
upstreamed platforms - they have an impact.

Vendor decides to stay out - fine, vendor's choice. You loose impact.
Any sad words like "oh upstream does not want to change for me" should
receive a message: "please join the upstream :), so we will change
together".

> We already
> discussed why upstreaming support for bleeding edge H/W and
> functionality is unrealistic in terms of competitive advantage.

Nope, my last point was not responded to. You said that there is no
point for vendor to upstream bleeding edge HW. Then you said that there
is also little point to upstream older HW.

Basically following this approach you agree that vendor does not have to
do anything because it is inconvenient for him.

However upstream has to adapt to downstream vendor, right?

No, this is *unfair to all the platforms we upstreamed*.

> 
> Besides, we might not be talking about new silicon at all (I don't
> believe anyone alluded to that).  The flexibility in configuration
> simply allows for generic upstream drivers to be swapped out for ones
> which may have more or slightly different functionality (that can't be
> publicly shared until release).
> 
>> Please explain why Exynos is special that it does not require essential
>> drivers to be selected as built-in. For example why aren't same changes
>> done for Renesas?
> 
>> Everyone else are working like this. NXP, Renesas, Xilinx, TI, Rockchip,
>> AllWinner. Samsung or Google is not special to receive an exception for
>> this.
> 
> Exynos isn't special in this regard.  This applies to any vendor who
> releases Android images and wishes to be solve all of the issues the
> GKI project addresses (please read the article above for more about
> this).

We have then slightly different goals, because you are driven by Android
release images and this is why you are less interested in NXP and
Renesas. Only some vendors should receive such changes? No, in upstream
we are not focusing on any specific distro and there are other uses of
Exynos (and NXP and Renesas) platforms. Therefore the choice/policy and
overall design tries to match all vendors, not only some subset
convenient to Android.

If Android (or some vendor) wants to have exception for a specific
driver/platform, it must do it in upstream way, by contributing, not by
changing to match downstream needs while ignoring other users.

Best regards,
Krzysztof
Lee Jones Sept. 30, 2021, 10:56 a.m. UTC | #11
On Thu, 30 Sep 2021, Geert Uytterhoeven wrote:

> Hi Lee,
> 
> On Thu, Sep 30, 2021 at 11:23 AM Lee Jones <lee.jones@linaro.org> wrote:
> > I've taken the liberty of cherry-picking some of the points you have
> > reiteratted a few times.  Hopefully I can help to address them
> > adequently.
> >
> > On Thu, 30 Sep 2021, Krzysztof Kozlowski wrote:
> > > Reminder: these are essential drivers and all Exynos platforms must have
> > > them as built-in (at least till someone really tests this on multiple
> > > setups).
> >
> > > Therefore I don't agree with calling it a "problem" that we select
> > > *necessary* drivers for supported platforms. It's by design - supported
> > > platforms should receive them without ability to remove.
> >
> > > The selected drivers are essential for supported platforms.
> >
> > SoC specific drivers are only essential/necessary/required in
> > images designed to execute solely on a platform that requires them.
> 
> Why?

Because without them the image wouldn't functional on any level.

But you're right, there is still no requirement for it to be built-in.

> > For a kernel image which is designed to be generic i.e. one that has
> > the ability to boot on vast array of platforms, the drivers simply
> > have to be *available*.
> 
> If the drivers are really essential/necessary/required, this precludes
> running the generic kernel image on the platform that requires them,
> making the kernel not sufficiently generic.

If they are not at all present, then yes.  However that is not what is
being suggested.  The essential functionality will be provided.  Just
not built-in.

> > Forcing all H/W drivers that are only *potentially* utilised on *some*
> > platforms as core binary built-ins doesn't make any technical sense.
> > The two most important issues this causes are image size and a lack of
> > configurability/flexibility relating to real-world application i.e.
> > the one issue we already agreed upon; H/W or features that are too
> > new (pre-release).
> 
> True, if "potentially".  If not potentially, they must be included.

I'm not sure what you're trying to say here.  Would you mind elaborating?

> > Bloating a generic kernel with potentially hundreds of unnecessary
> > drivers that will never be executed in the vast majority of instances
> > doesn't achieve anything.  If we have a kernel image that has the
> > ability to boot on 10's of architectures which have 10's of platforms
> > each, that's a whole host of unused/wasted executable space.
> 
> The key here is if the driver is required or not to use the platform,
> and why it is required.  If the requirement comes from some deficiency
> in the kernel code or config system, it should be fixed, if possible.
> And the fix should be tested.
> If it cannot be fixed, the driver should be included, else it would
> preclude running the generic kernel on the affected platform.

Sorry, I'm not following.

> > In order for vendors to work more closely with upstream, they need the
> > ability to over-ride a *few* drivers to supplement them with some
> > functionality which they believe provides them with a competitive edge
> > (I think you called this "value-add" before) prior to the release of a
> > device.  This is a requirement that cannot be worked around.
> >
> > By insisting on drivers (most of which will not be executed in the
> > vast majority of cases) to be built-in, you are insisting on a
> > downstream kernel fork, which all of us would like to avoid [0].
> >
> > [0] Full disclosure: part of my role at Linaro is to keep the Android
> > kernel running as close to Mainline as possible and encourage/push the
> > upstream-first mantra, hence my involvement with this and other sets.
> > I assure you all intentions are good and honourable.  If you haven't
> > already seen it, please see Todd's most recent update on the goals and
> > status of GKI:
> >
> >   Article: https://tinyurl.com/saaen3sp
> >   Video:   https://youtu.be/O_lCFGinFPM
> >
> > > We don't even know what are these unsupported, downstream platforms
> > > you want customize kernel for. They cannot be audited, cannot be
> > > compared.  Affecting upstream platforms just because
> > > vendor/downstream does not want to mainline some code is
> > > unacceptable. Please upstream your drivers and DTS.
> >
> > > You also mentioned downstream devices but without actually ever defining
> > > them. Please be more specific. What SoC, what hardware?
> >
> > Accepting changes based on the proviso that vendors upstream all of
> > their work-in-progress solutions is an unfair position.  We already
> > discussed why upstreaming support for bleeding edge H/W and
> > functionality is unrealistic in terms of competitive advantage.
> >
> > Besides, we might not be talking about new silicon at all (I don't
> > believe anyone alluded to that).  The flexibility in configuration
> > simply allows for generic upstream drivers to be swapped out for ones
> > which may have more or slightly different functionality (that can't be
> > publicly shared until release).
> 
> The replacement drivers are already a downstream kernel fork, which you
> would like to avoid?

Avoid, yes, absolutely.

However, in the real world of competitive tech, other than in extreme
cases such as; fully open source, community driven, cute embedded
hobby platforms, there will always be changes required on-top of
upstream projects.  Even as upstream kernel developers we need to
understand and respect that.
Tomasz Figa Sept. 30, 2021, 11:01 a.m. UTC | #12
2021年9月30日(木) 18:23 Lee Jones <lee.jones@linaro.org>:
>
> I've taken the liberty of cherry-picking some of the points you have
> reiteratted a few times.  Hopefully I can help to address them
> adequently.
>
> On Thu, 30 Sep 2021, Krzysztof Kozlowski wrote:
> > Reminder: these are essential drivers and all Exynos platforms must have
> > them as built-in (at least till someone really tests this on multiple
> > setups).
>
> > Therefore I don't agree with calling it a "problem" that we select
> > *necessary* drivers for supported platforms. It's by design - supported
> > platforms should receive them without ability to remove.
>
> > The selected drivers are essential for supported platforms.
>
> SoC specific drivers are only essential/necessary/required in
> images designed to execute solely on a platform that requires them.
> For a kernel image which is designed to be generic i.e. one that has
> the ability to boot on vast array of platforms, the drivers simply
> have to be *available*.
>
> Forcing all H/W drivers that are only *potentially* utilised on *some*
> platforms as core binary built-ins doesn't make any technical sense.
> The two most important issues this causes are image size and a lack of
> configurability/flexibility relating to real-world application i.e.
> the one issue we already agreed upon; H/W or features that are too
> new (pre-release).
>
> Bloating a generic kernel with potentially hundreds of unnecessary
> drivers that will never be executed in the vast majority of instances
> doesn't achieve anything.  If we have a kernel image that has the
> ability to boot on 10's of architectures which have 10's of platforms
> each, that's a whole host of unused/wasted executable space.
>
> In order for vendors to work more closely with upstream, they need the
> ability to over-ride a *few* drivers to supplement them with some
> functionality which they believe provides them with a competitive edge
> (I think you called this "value-add" before) prior to the release of a
> device.  This is a requirement that cannot be worked around.

[Chiming in as a clock driver sub-maintainer and someone who spent a
non-insignificant part of his life on SoC driver bring-up - not as a
Google employee.]

I'd argue that the proper way for them to achieve it would be to
extend the upstream frameworks and/or existing drivers with
appropriate APIs to allow their downstream modules to plug into what's
already available upstream.

Best regards,
Tomasz
Geert Uytterhoeven Sept. 30, 2021, 11:25 a.m. UTC | #13
Hi Lee,

On Thu, Sep 30, 2021 at 12:56 PM Lee Jones <lee.jones@linaro.org> wrote:
> On Thu, 30 Sep 2021, Geert Uytterhoeven wrote:
> > On Thu, Sep 30, 2021 at 11:23 AM Lee Jones <lee.jones@linaro.org> wrote:
> > > I've taken the liberty of cherry-picking some of the points you have
> > > reiteratted a few times.  Hopefully I can help to address them
> > > adequently.
> > >
> > > On Thu, 30 Sep 2021, Krzysztof Kozlowski wrote:
> > > > Reminder: these are essential drivers and all Exynos platforms must have
> > > > them as built-in (at least till someone really tests this on multiple
> > > > setups).
> > >
> > > > Therefore I don't agree with calling it a "problem" that we select
> > > > *necessary* drivers for supported platforms. It's by design - supported
> > > > platforms should receive them without ability to remove.
> > >
> > > > The selected drivers are essential for supported platforms.
> > >
> > > SoC specific drivers are only essential/necessary/required in
> > > images designed to execute solely on a platform that requires them.
> >
> > Why?
>
> Because without them the image wouldn't functional on any level.
>
> But you're right, there is still no requirement for it to be built-in.
>
> > > For a kernel image which is designed to be generic i.e. one that has
> > > the ability to boot on vast array of platforms, the drivers simply
> > > have to be *available*.
> >
> > If the drivers are really essential/necessary/required, this precludes
> > running the generic kernel image on the platform that requires them,
> > making the kernel not sufficiently generic.
>
> If they are not at all present, then yes.  However that is not what is
> being suggested.  The essential functionality will be provided.  Just
> not built-in.

I really meant "essential/necessary/required to be built-in".

> > > Forcing all H/W drivers that are only *potentially* utilised on *some*
> > > platforms as core binary built-ins doesn't make any technical sense.
> > > The two most important issues this causes are image size and a lack of
> > > configurability/flexibility relating to real-world application i.e.
> > > the one issue we already agreed upon; H/W or features that are too
> > > new (pre-release).
> >
> > True, if "potentially".  If not potentially, they must be included.
>
> I'm not sure what you're trying to say here.  Would you mind elaborating?

It was a comment to your "*potentially* utilised on *some* platforms".
It is clear they are not used on the other ("not *some*") platforms, but your
sentence was unclear whether they are always or only sometimes used on
"*some*" platforms.
"always" => "not potentially"
"sometimes" => "potentially".

I hope this makes it more clear.

> > > Bloating a generic kernel with potentially hundreds of unnecessary
> > > drivers that will never be executed in the vast majority of instances
> > > doesn't achieve anything.  If we have a kernel image that has the
> > > ability to boot on 10's of architectures which have 10's of platforms
> > > each, that's a whole host of unused/wasted executable space.
> >
> > The key here is if the driver is required or not to use the platform,
> > and why it is required.  If the requirement comes from some deficiency
> > in the kernel code or config system, it should be fixed, if possible.
> > And the fix should be tested.
> > If it cannot be fixed, the driver should be included, else it would
> > preclude running the generic kernel on the affected platform.
>
> Sorry, I'm not following.

It all depends on why the driver is "required to be built-in".
Depending on the reason behind that requirement, the driver can be
changed from built-in to modular without ill effects on functionality.

Gr{oetje,eeting}s,

                        Geert
Geert Uytterhoeven Sept. 30, 2021, 11:27 a.m. UTC | #14
Hi Tomasz,

On Thu, Sep 30, 2021 at 1:01 PM Tomasz Figa <tomasz.figa@gmail.com> wrote:
> 2021年9月30日(木) 18:23 Lee Jones <lee.jones@linaro.org>:
> > I've taken the liberty of cherry-picking some of the points you have
> > reiteratted a few times.  Hopefully I can help to address them
> > adequently.
> >
> > On Thu, 30 Sep 2021, Krzysztof Kozlowski wrote:
> > > Reminder: these are essential drivers and all Exynos platforms must have
> > > them as built-in (at least till someone really tests this on multiple
> > > setups).
> >
> > > Therefore I don't agree with calling it a "problem" that we select
> > > *necessary* drivers for supported platforms. It's by design - supported
> > > platforms should receive them without ability to remove.
> >
> > > The selected drivers are essential for supported platforms.
> >
> > SoC specific drivers are only essential/necessary/required in
> > images designed to execute solely on a platform that requires them.
> > For a kernel image which is designed to be generic i.e. one that has
> > the ability to boot on vast array of platforms, the drivers simply
> > have to be *available*.
> >
> > Forcing all H/W drivers that are only *potentially* utilised on *some*
> > platforms as core binary built-ins doesn't make any technical sense.
> > The two most important issues this causes are image size and a lack of
> > configurability/flexibility relating to real-world application i.e.
> > the one issue we already agreed upon; H/W or features that are too
> > new (pre-release).
> >
> > Bloating a generic kernel with potentially hundreds of unnecessary
> > drivers that will never be executed in the vast majority of instances
> > doesn't achieve anything.  If we have a kernel image that has the
> > ability to boot on 10's of architectures which have 10's of platforms
> > each, that's a whole host of unused/wasted executable space.
> >
> > In order for vendors to work more closely with upstream, they need the
> > ability to over-ride a *few* drivers to supplement them with some
> > functionality which they believe provides them with a competitive edge
> > (I think you called this "value-add" before) prior to the release of a
> > device.  This is a requirement that cannot be worked around.
>
> [Chiming in as a clock driver sub-maintainer and someone who spent a
> non-insignificant part of his life on SoC driver bring-up - not as a
> Google employee.]
>
> I'd argue that the proper way for them to achieve it would be to
> extend the upstream frameworks and/or existing drivers with
> appropriate APIs to allow their downstream modules to plug into what's
> already available upstream.

Yes, that's one possible solution.
We do have to be careful this would not just become some fishy API
to touch deeply buried internals, and for which there are no valid use
cases in upstream.

Gr{oetje,eeting}s,

                        Geert
Lee Jones Sept. 30, 2021, 11:51 a.m. UTC | #15
On Thu, 30 Sep 2021, Tomasz Figa wrote:

> 2021年9月30日(木) 18:23 Lee Jones <lee.jones@linaro.org>:
> >
> > I've taken the liberty of cherry-picking some of the points you have
> > reiteratted a few times.  Hopefully I can help to address them
> > adequently.
> >
> > On Thu, 30 Sep 2021, Krzysztof Kozlowski wrote:
> > > Reminder: these are essential drivers and all Exynos platforms must have
> > > them as built-in (at least till someone really tests this on multiple
> > > setups).
> >
> > > Therefore I don't agree with calling it a "problem" that we select
> > > *necessary* drivers for supported platforms. It's by design - supported
> > > platforms should receive them without ability to remove.
> >
> > > The selected drivers are essential for supported platforms.
> >
> > SoC specific drivers are only essential/necessary/required in
> > images designed to execute solely on a platform that requires them.
> > For a kernel image which is designed to be generic i.e. one that has
> > the ability to boot on vast array of platforms, the drivers simply
> > have to be *available*.
> >
> > Forcing all H/W drivers that are only *potentially* utilised on *some*
> > platforms as core binary built-ins doesn't make any technical sense.
> > The two most important issues this causes are image size and a lack of
> > configurability/flexibility relating to real-world application i.e.
> > the one issue we already agreed upon; H/W or features that are too
> > new (pre-release).
> >
> > Bloating a generic kernel with potentially hundreds of unnecessary
> > drivers that will never be executed in the vast majority of instances
> > doesn't achieve anything.  If we have a kernel image that has the
> > ability to boot on 10's of architectures which have 10's of platforms
> > each, that's a whole host of unused/wasted executable space.
> >
> > In order for vendors to work more closely with upstream, they need the
> > ability to over-ride a *few* drivers to supplement them with some
> > functionality which they believe provides them with a competitive edge
> > (I think you called this "value-add" before) prior to the release of a
> > device.  This is a requirement that cannot be worked around.
> 
> [Chiming in as a clock driver sub-maintainer and someone who spent a
> non-insignificant part of his life on SoC driver bring-up - not as a
> Google employee.]
> 
> I'd argue that the proper way for them to achieve it would be to
> extend the upstream frameworks and/or existing drivers with
> appropriate APIs to allow their downstream modules to plug into what's
> already available upstream.

Is that the same as exporting symbols to framework APIs?

Since this is already a method GKI uses to allow external modules to
interact with the core kernel/frameworks.  However, it's not possible
to upstream these without an upstream user for each one.
Lee Jones Sept. 30, 2021, 12:08 p.m. UTC | #16
On Thu, 30 Sep 2021, Geert Uytterhoeven wrote:

> Hi Lee,
> 
> On Thu, Sep 30, 2021 at 12:56 PM Lee Jones <lee.jones@linaro.org> wrote:
> > On Thu, 30 Sep 2021, Geert Uytterhoeven wrote:
> > > On Thu, Sep 30, 2021 at 11:23 AM Lee Jones <lee.jones@linaro.org> wrote:
> > > > I've taken the liberty of cherry-picking some of the points you have
> > > > reiteratted a few times.  Hopefully I can help to address them
> > > > adequently.
> > > >
> > > > On Thu, 30 Sep 2021, Krzysztof Kozlowski wrote:
> > > > > Reminder: these are essential drivers and all Exynos platforms must have
> > > > > them as built-in (at least till someone really tests this on multiple
> > > > > setups).
> > > >
> > > > > Therefore I don't agree with calling it a "problem" that we select
> > > > > *necessary* drivers for supported platforms. It's by design - supported
> > > > > platforms should receive them without ability to remove.
> > > >
> > > > > The selected drivers are essential for supported platforms.
> > > >
> > > > SoC specific drivers are only essential/necessary/required in
> > > > images designed to execute solely on a platform that requires them.
> > >
> > > Why?
> >
> > Because without them the image wouldn't functional on any level.
> >
> > But you're right, there is still no requirement for it to be built-in.
> >
> > > > For a kernel image which is designed to be generic i.e. one that has
> > > > the ability to boot on vast array of platforms, the drivers simply
> > > > have to be *available*.
> > >
> > > If the drivers are really essential/necessary/required, this precludes
> > > running the generic kernel image on the platform that requires them,
> > > making the kernel not sufficiently generic.
> >
> > If they are not at all present, then yes.  However that is not what is
> > being suggested.  The essential functionality will be provided.  Just
> > not built-in.
> 
> I really meant "essential/necessary/required to be built-in".

Then I agree with you.  My position is that if they don't *have* to be
built-in, then why force it?

> > > > Forcing all H/W drivers that are only *potentially* utilised on *some*
> > > > platforms as core binary built-ins doesn't make any technical sense.
> > > > The two most important issues this causes are image size and a lack of
> > > > configurability/flexibility relating to real-world application i.e.
> > > > the one issue we already agreed upon; H/W or features that are too
> > > > new (pre-release).
> > >
> > > True, if "potentially".  If not potentially, they must be included.
> >
> > I'm not sure what you're trying to say here.  Would you mind elaborating?
> 
> It was a comment to your "*potentially* utilised on *some* platforms".
> It is clear they are not used on the other ("not *some*") platforms, but your
> sentence was unclear whether they are always or only sometimes used on
> "*some*" platforms.
> "always" => "not potentially"
> "sometimes" => "potentially".
> 
> I hope this makes it more clear.

Not really, but I'll try to clean mine up:

The aim is to have a single kernel (image + modules) that can be
booted on a plethora of platforms.  For the sake of argument say 10.
Let's also say that each of the platforms are equal and will be booted
the same amount of times.

Taking the example above, when I say that the H/W specific drivers
will only be *potentially* utilised, I mean that they will only be
bound and probed 1/10 times i.e. when booted on the associated
platform.  Which means that in the vast majority of boots (9/10) they
will lie dormant, taking up unnecessary space.

Another way to say this would be; the kernel needs to have the
capability to boot all of the supported platforms, but it will only
ever be utilised on one at a time.

> > > > Bloating a generic kernel with potentially hundreds of unnecessary
> > > > drivers that will never be executed in the vast majority of instances
> > > > doesn't achieve anything.  If we have a kernel image that has the
> > > > ability to boot on 10's of architectures which have 10's of platforms
> > > > each, that's a whole host of unused/wasted executable space.
> > >
> > > The key here is if the driver is required or not to use the platform,
> > > and why it is required.  If the requirement comes from some deficiency
> > > in the kernel code or config system, it should be fixed, if possible.
> > > And the fix should be tested.
> > > If it cannot be fixed, the driver should be included, else it would
> > > preclude running the generic kernel on the affected platform.
> >
> > Sorry, I'm not following.
> 
> It all depends on why the driver is "required to be built-in".
> Depending on the reason behind that requirement, the driver can be
> changed from built-in to modular without ill effects on functionality.

Absolutely.

There are cases where drivers simply can't be built as modules.  These
unavoidable situations are legitimate use-cases and the technology/
code-base will have to work around these as required.

The argument here is that if they can be separated and have been shown
to work well in either use-case, then it is my opinion that placing an
artificial barrier up based mostly on politics is not the correct
approach.
Tomasz Figa Sept. 30, 2021, 12:10 p.m. UTC | #17
2021年9月30日(木) 20:51 Lee Jones <lee.jones@linaro.org>:
>
> On Thu, 30 Sep 2021, Tomasz Figa wrote:
>
> > 2021年9月30日(木) 18:23 Lee Jones <lee.jones@linaro.org>:
> > >
> > > I've taken the liberty of cherry-picking some of the points you have
> > > reiteratted a few times.  Hopefully I can help to address them
> > > adequently.
> > >
> > > On Thu, 30 Sep 2021, Krzysztof Kozlowski wrote:
> > > > Reminder: these are essential drivers and all Exynos platforms must have
> > > > them as built-in (at least till someone really tests this on multiple
> > > > setups).
> > >
> > > > Therefore I don't agree with calling it a "problem" that we select
> > > > *necessary* drivers for supported platforms. It's by design - supported
> > > > platforms should receive them without ability to remove.
> > >
> > > > The selected drivers are essential for supported platforms.
> > >
> > > SoC specific drivers are only essential/necessary/required in
> > > images designed to execute solely on a platform that requires them.
> > > For a kernel image which is designed to be generic i.e. one that has
> > > the ability to boot on vast array of platforms, the drivers simply
> > > have to be *available*.
> > >
> > > Forcing all H/W drivers that are only *potentially* utilised on *some*
> > > platforms as core binary built-ins doesn't make any technical sense.
> > > The two most important issues this causes are image size and a lack of
> > > configurability/flexibility relating to real-world application i.e.
> > > the one issue we already agreed upon; H/W or features that are too
> > > new (pre-release).
> > >
> > > Bloating a generic kernel with potentially hundreds of unnecessary
> > > drivers that will never be executed in the vast majority of instances
> > > doesn't achieve anything.  If we have a kernel image that has the
> > > ability to boot on 10's of architectures which have 10's of platforms
> > > each, that's a whole host of unused/wasted executable space.
> > >
> > > In order for vendors to work more closely with upstream, they need the
> > > ability to over-ride a *few* drivers to supplement them with some
> > > functionality which they believe provides them with a competitive edge
> > > (I think you called this "value-add" before) prior to the release of a
> > > device.  This is a requirement that cannot be worked around.
> >
> > [Chiming in as a clock driver sub-maintainer and someone who spent a
> > non-insignificant part of his life on SoC driver bring-up - not as a
> > Google employee.]
> >
> > I'd argue that the proper way for them to achieve it would be to
> > extend the upstream frameworks and/or existing drivers with
> > appropriate APIs to allow their downstream modules to plug into what's
> > already available upstream.
>
> Is that the same as exporting symbols to framework APIs?
>
> Since this is already a method GKI uses to allow external modules to
> interact with the core kernel/frameworks.  However, it's not possible
> to upstream these without an upstream user for each one.

Not necessary the core frameworks, could also be changing the ways the
existing drivers register to allow additional drivers to extend the
functionality rather than completely overwrite them. It's really hard
to tell what the right way would be without knowing the exact things
they find missing in the upstream drivers. As for upstream users, this
is exactly the point - upstream is a bidirectional effort, one takes
from it and should contribute things back.

Generally, the subsystems being mentioned here are so basic (clock,
pinctrl, rtc), that I really can't imagine what kind of rocket science
one might want to hide for competitive reasons... If it's for an
entire SoC, I wonder why Intel and AMD don't have similar concerns and
contribute support for their newest hardware far before the release.
Krzysztof Kozlowski Sept. 30, 2021, 12:15 p.m. UTC | #18
On 30/09/2021 14:10, Tomasz Figa wrote:
> 2021年9月30日(木) 20:51 Lee Jones <lee.jones@linaro.org>:
>>
>> On Thu, 30 Sep 2021, Tomasz Figa wrote:
>>
>>
>> Is that the same as exporting symbols to framework APIs?
>>
>> Since this is already a method GKI uses to allow external modules to
>> interact with the core kernel/frameworks.  However, it's not possible
>> to upstream these without an upstream user for each one.
> 
> Not necessary the core frameworks, could also be changing the ways the
> existing drivers register to allow additional drivers to extend the
> functionality rather than completely overwrite them. 

Yes, the first user could be within the kernel after modifying some of
the drivers.

> It's really hard
> to tell what the right way would be without knowing the exact things
> they find missing in the upstream drivers. As for upstream users, this
> is exactly the point - upstream is a bidirectional effort, one takes
> from it and should contribute things back.
> 
> Generally, the subsystems being mentioned here are so basic (clock,
> pinctrl, rtc), that I really can't imagine what kind of rocket science
> one might want to hide for competitive reasons... If it's for an
> entire SoC, I wonder why Intel and AMD don't have similar concerns and
> contribute support for their newest hardware far before the release.

Lee used the argument of not-disclosing-edge-hw but I also don't see
much of it in the case of few drivers needed to be overridden. Just
bunch of registers for the same stuff we have sine 8 years. Rather the
vendor does not want to commit effort towards upstreaming these...


Best regards,
Krzysztof
Krzysztof Kozlowski Sept. 30, 2021, 12:21 p.m. UTC | #19
On 30/09/2021 11:23, Lee Jones wrote:
> [0] Full disclosure: part of my role at Linaro is to keep the Android
> kernel running as close to Mainline as possible and encourage/push the
> upstream-first mantra, hence my involvement with this and other sets.
> I assure you all intentions are good and honourable.  If you haven't
> already seen it, please see Todd's most recent update on the goals and
> status of GKI:
> 
>   Article: https://tinyurl.com/saaen3sp
>   Video:   https://youtu.be/O_lCFGinFPM
> 

Side topic, why this patchset is in your scope or Will's/Google's scope?
Just drop it from Android main kernel, it will not be your problem. I
mean, really, you don't need this patchset in your tree at all. The only
platform which needs it, the only platform which will loose something
will be one specific vendor. Therefore this will be an incentive for
them to join both discussions and upstream development. :)

Best regards,
Krzysztof
Lee Jones Sept. 30, 2021, 12:32 p.m. UTC | #20
On Thu, 30 Sep 2021, Krzysztof Kozlowski wrote:

> On 30/09/2021 11:23, Lee Jones wrote:
> > I've taken the liberty of cherry-picking some of the points you have
> > reiteratted a few times.  Hopefully I can help to address them
> > adequently.
> > 
> > On Thu, 30 Sep 2021, Krzysztof Kozlowski wrote:
> >> Reminder: these are essential drivers and all Exynos platforms must have
> >> them as built-in (at least till someone really tests this on multiple
> >> setups).
> > 
> >> Therefore I don't agree with calling it a "problem" that we select
> >> *necessary* drivers for supported platforms. It's by design - supported
> >> platforms should receive them without ability to remove.
> > 
> >> The selected drivers are essential for supported platforms.
> > 
> > SoC specific drivers are only essential/necessary/required in
> > images designed to execute solely on a platform that requires them.
> > For a kernel image which is designed to be generic i.e. one that has
> > the ability to boot on vast array of platforms, the drivers simply
> > have to be *available*.
> 
> By "essential", I meant drivers which are needed for supported platform
> to boot properly.

Yes, I know what you meant by essential.

My comment still stands.  As long as they are provided, it should work.

> Also without significant performance penalty due to
> probe deferrals.

Agreed.  We will try to find a way to test this.

> > Forcing all H/W drivers that are only *potentially* utilised on *some*
> > platforms as core binary built-ins doesn't make any technical sense.
> > The two most important issues this causes are image size and a lack of
> > configurability/flexibility relating to real-world application i.e.
> > the one issue we already agreed upon; H/W or features that are too
> > new (pre-release).
> 
> Geert responded here. If drivers are essential for supported platforms
> to boot, having them built-in has technical sense.

We're going to have to agree to disagree here.

None of us are currently aware of any technical reasons why these
particular drivers have to be built-in.  Hopefully we can prove this
out with testing.

> > By insisting on drivers (most of which will not be executed in the
> > vast majority of cases) to be built-in, you are insisting on a
> > downstream kernel fork, which all of us would like to avoid [0].
> 
> The same with all the DTS and hundreds of drivers you keep out of tree.

I do not keep any drivers out of tree. :)

> >> We don't even know what are these unsupported, downstream platforms
> >> you want customize kernel for. They cannot be audited, cannot be
> >> compared.  Affecting upstream platforms just because
> >> vendor/downstream does not want to mainline some code is
> >> unacceptable.
> >>
> >> Please upstream your drivers and DTS.
> > 
> > Accepting changes based on the proviso that vendors upstream all of
> > their work-in-progress solutions is an unfair position.  
> 
> You twisted (or ignored) my words here. I said before - sacrificing any
> mainline platform (e.g. reliable boot for distro) for an out-of-tree
> vendor which does no want to upstream drivers is the unfair position.
> One of the incentives of upstreaming is to be able to shape kernel and
> be considered in kernel upstream decisions. That's the privileged for
> upstreamed platforms - they have an impact.
> 
> Vendor decides to stay out - fine, vendor's choice. You loose impact.
> Any sad words like "oh upstream does not want to change for me" should
> receive a message: "please join the upstream :), so we will change
> together".

No one is proposing to sacrifice anything.

This change is believed to be benign.

> > We already
> > discussed why upstreaming support for bleeding edge H/W and
> > functionality is unrealistic in terms of competitive advantage.
> 
> Nope, my last point was not responded to. You said that there is no
> point for vendor to upstream bleeding edge HW. Then you said that there
> is also little point to upstream older HW.

The driver in question is already upstream.

It is my hope that the vendor will realign with Mainline and upstream
the differences (providing the current competitive edge).  Although I
have no real influence in this regard, as I have no contact with
them.

> Basically following this approach you agree that vendor does not have to
> do anything because it is inconvenient for him.
> 
> However upstream has to adapt to downstream vendor, right?
> 
> No, this is *unfair to all the platforms we upstreamed*.

The point here is that this change should not be a determent to
anyone.

> > Exynos isn't special in this regard.  This applies to any vendor who
> > releases Android images and wishes to be solve all of the issues the
> > GKI project addresses (please read the article above for more about
> > this).
> 
> We have then slightly different goals, because you are driven by Android
> release images and this is why you are less interested in NXP and
> Renesas. Only some vendors should receive such changes? No, in upstream
> we are not focusing on any specific distro and there are other uses of
> Exynos (and NXP and Renesas) platforms. Therefore the choice/policy and
> overall design tries to match all vendors, not only some subset
> convenient to Android.
> 
> If Android (or some vendor) wants to have exception for a specific
> driver/platform, it must do it in upstream way, by contributing, not by
> changing to match downstream needs while ignoring other users.

No users are being ignored.  No damaging changes are being proposed.

This has little to do with Android and everything to do with the
possibly of a more generic kernel.  Moving drivers out of platform
code and into /drivers along with enabling components as modules has
been an on-going effort for many years now.

I'm struggling to see how this is different.
Lee Jones Sept. 30, 2021, 12:34 p.m. UTC | #21
On Thu, 30 Sep 2021, Krzysztof Kozlowski wrote:
> >   It sounds like a lack of testing is your main concern.
> > 
> >   How can I help here?  What H/W do I need to be able to fully test this?
> 
> The changes here need to be tested on affected platforms (ARMv7 and
> ARMv8), when built as a modules on some types of regular distros (e.g.
> Arch, Ubuntu). From each of such boot I would be happy to see number of
> new dmesg warnings/errors plus number of probe deferrals.
> 
> Since the drivers could be switched to modules (and some distros might
> do it), they might be hit by surprise regressions in boot performance
> due to probe deferrals. This should be also checked on these platforms.
> Geert pointed out before that clocks in many cases are not optional -
> driver needs them and will wait defer.
> 
> Assuming of course that boot succeeds. Minor differences in boot speed
> should not be a problem, I think, because distro anyway chosen
> all-module approach so it accepts the penalty.

Do you have any suggestions in terms of devboards?
Krzysztof Kozlowski Sept. 30, 2021, 12:38 p.m. UTC | #22
On 30/09/2021 14:34, Lee Jones wrote:
> On Thu, 30 Sep 2021, Krzysztof Kozlowski wrote:
>>>   It sounds like a lack of testing is your main concern.
>>>
>>>   How can I help here?  What H/W do I need to be able to fully test this?
>>
>> The changes here need to be tested on affected platforms (ARMv7 and
>> ARMv8), when built as a modules on some types of regular distros (e.g.
>> Arch, Ubuntu). From each of such boot I would be happy to see number of
>> new dmesg warnings/errors plus number of probe deferrals.
>>
>> Since the drivers could be switched to modules (and some distros might
>> do it), they might be hit by surprise regressions in boot performance
>> due to probe deferrals. This should be also checked on these platforms.
>> Geert pointed out before that clocks in many cases are not optional -
>> driver needs them and will wait defer.
>>
>> Assuming of course that boot succeeds. Minor differences in boot speed
>> should not be a problem, I think, because distro anyway chosen
>> all-module approach so it accepts the penalty.
> 
> Do you have any suggestions in terms of devboards?

Minimal set:
1. Something with Exynos4 (e.g. Odroid U3 with Exynos4412).
2. Something with Exynos 5422/5800 (e.g. Odroid XU3/XU4/HC1/MC1 or
Chromebook Peach Pi).
3. Exynos5433 - TM2 or TM2E. Boards are not widely available, so we need
to rely on provided testing by Samsung.

What would be good is to also test Exynos3 boards, but this is also not
widely available.


Best regards,
Krzysztof
Lee Jones Sept. 30, 2021, 12:39 p.m. UTC | #23
On Thu, 30 Sep 2021, Krzysztof Kozlowski wrote:

> On 30/09/2021 11:23, Lee Jones wrote:
> > [0] Full disclosure: part of my role at Linaro is to keep the Android
> > kernel running as close to Mainline as possible and encourage/push the
> > upstream-first mantra, hence my involvement with this and other sets.
> > I assure you all intentions are good and honourable.  If you haven't
> > already seen it, please see Todd's most recent update on the goals and
> > status of GKI:
> > 
> >   Article: https://tinyurl.com/saaen3sp
> >   Video:   https://youtu.be/O_lCFGinFPM
> > 
> 
> Side topic, why this patchset is in your scope or Will's/Google's scope?
> Just drop it from Android main kernel, it will not be your problem. I
> mean, really, you don't need this patchset in your tree at all. The only
> platform which needs it, the only platform which will loose something
> will be one specific vendor. Therefore this will be an incentive for
> them to join both discussions and upstream development. :)

How would they fix this besides upstreaming support for unreleased
work-in-progress H/W?

Haven't I explained this several times already? :)
Lee Jones Sept. 30, 2021, 12:45 p.m. UTC | #24
On Thu, 30 Sep 2021, Tomasz Figa wrote:

> 2021年9月30日(木) 20:51 Lee Jones <lee.jones@linaro.org>:
> >
> > On Thu, 30 Sep 2021, Tomasz Figa wrote:
> >
> > > 2021年9月30日(木) 18:23 Lee Jones <lee.jones@linaro.org>:
> > > >
> > > > I've taken the liberty of cherry-picking some of the points you have
> > > > reiteratted a few times.  Hopefully I can help to address them
> > > > adequently.
> > > >
> > > > On Thu, 30 Sep 2021, Krzysztof Kozlowski wrote:
> > > > > Reminder: these are essential drivers and all Exynos platforms must have
> > > > > them as built-in (at least till someone really tests this on multiple
> > > > > setups).
> > > >
> > > > > Therefore I don't agree with calling it a "problem" that we select
> > > > > *necessary* drivers for supported platforms. It's by design - supported
> > > > > platforms should receive them without ability to remove.
> > > >
> > > > > The selected drivers are essential for supported platforms.
> > > >
> > > > SoC specific drivers are only essential/necessary/required in
> > > > images designed to execute solely on a platform that requires them.
> > > > For a kernel image which is designed to be generic i.e. one that has
> > > > the ability to boot on vast array of platforms, the drivers simply
> > > > have to be *available*.
> > > >
> > > > Forcing all H/W drivers that are only *potentially* utilised on *some*
> > > > platforms as core binary built-ins doesn't make any technical sense.
> > > > The two most important issues this causes are image size and a lack of
> > > > configurability/flexibility relating to real-world application i.e.
> > > > the one issue we already agreed upon; H/W or features that are too
> > > > new (pre-release).
> > > >
> > > > Bloating a generic kernel with potentially hundreds of unnecessary
> > > > drivers that will never be executed in the vast majority of instances
> > > > doesn't achieve anything.  If we have a kernel image that has the
> > > > ability to boot on 10's of architectures which have 10's of platforms
> > > > each, that's a whole host of unused/wasted executable space.
> > > >
> > > > In order for vendors to work more closely with upstream, they need the
> > > > ability to over-ride a *few* drivers to supplement them with some
> > > > functionality which they believe provides them with a competitive edge
> > > > (I think you called this "value-add" before) prior to the release of a
> > > > device.  This is a requirement that cannot be worked around.
> > >
> > > [Chiming in as a clock driver sub-maintainer and someone who spent a
> > > non-insignificant part of his life on SoC driver bring-up - not as a
> > > Google employee.]
> > >
> > > I'd argue that the proper way for them to achieve it would be to
> > > extend the upstream frameworks and/or existing drivers with
> > > appropriate APIs to allow their downstream modules to plug into what's
> > > already available upstream.
> >
> > Is that the same as exporting symbols to framework APIs?
> >
> > Since this is already a method GKI uses to allow external modules to
> > interact with the core kernel/frameworks.  However, it's not possible
> > to upstream these without an upstream user for each one.
> 
> Not necessary the core frameworks, could also be changing the ways the
> existing drivers register to allow additional drivers to extend the
> functionality rather than completely overwrite them. It's really hard
> to tell what the right way would be without knowing the exact things
> they find missing in the upstream drivers. As for upstream users, this
> is exactly the point - upstream is a bidirectional effort, one takes
> from it and should contribute things back.
> 
> Generally, the subsystems being mentioned here are so basic (clock,
> pinctrl, rtc), that I really can't imagine what kind of rocket science
> one might want to hide for competitive reasons... If it's for an
> entire SoC, I wonder why Intel and AMD don't have similar concerns and
> contribute support for their newest hardware far before the release.

I don't have visibility into the driver-overrides I'm afraid.

I do know that code-space can be a problem though.  So any way we can
make the core binary smaller (i.e. remove anything that can be built
as a module) will have a positive effect.
Krzysztof Kozlowski Sept. 30, 2021, 1:08 p.m. UTC | #25
On 30/09/2021 14:39, Lee Jones wrote:
> On Thu, 30 Sep 2021, Krzysztof Kozlowski wrote:
> 
>> On 30/09/2021 11:23, Lee Jones wrote:
>>> [0] Full disclosure: part of my role at Linaro is to keep the Android
>>> kernel running as close to Mainline as possible and encourage/push the
>>> upstream-first mantra, hence my involvement with this and other sets.
>>> I assure you all intentions are good and honourable.  If you haven't
>>> already seen it, please see Todd's most recent update on the goals and
>>> status of GKI:
>>>
>>>   Article: https://tinyurl.com/saaen3sp
>>>   Video:   https://youtu.be/O_lCFGinFPM
>>>
>>
>> Side topic, why this patchset is in your scope or Will's/Google's scope?
>> Just drop it from Android main kernel, it will not be your problem. I
>> mean, really, you don't need this patchset in your tree at all. The only
>> platform which needs it, the only platform which will loose something
>> will be one specific vendor. Therefore this will be an incentive for
>> them to join both discussions and upstream development. :)
> 
> How would they fix this besides upstreaming support for unreleased
> work-in-progress H/W?
> 
> Haven't I explained this several times already? :)

Either that way or the same as Will's doing but that's not my question.
I understand you flush the queue of your GKI patches to be closer to
upstream. Reduce the backlog/burden. you can achieve your goal by simply
dropping such patch and making it not your problem. :)


Best regards,
Krzysztof
Lee Jones Sept. 30, 2021, 1:29 p.m. UTC | #26
On Thu, 30 Sep 2021, Krzysztof Kozlowski wrote:

> On 30/09/2021 14:39, Lee Jones wrote:
> > On Thu, 30 Sep 2021, Krzysztof Kozlowski wrote:
> > 
> >> On 30/09/2021 11:23, Lee Jones wrote:
> >>> [0] Full disclosure: part of my role at Linaro is to keep the Android
> >>> kernel running as close to Mainline as possible and encourage/push the
> >>> upstream-first mantra, hence my involvement with this and other sets.
> >>> I assure you all intentions are good and honourable.  If you haven't
> >>> already seen it, please see Todd's most recent update on the goals and
> >>> status of GKI:
> >>>
> >>>   Article: https://tinyurl.com/saaen3sp
> >>>   Video:   https://youtu.be/O_lCFGinFPM
> >>>
> >>
> >> Side topic, why this patchset is in your scope or Will's/Google's scope?
> >> Just drop it from Android main kernel, it will not be your problem. I
> >> mean, really, you don't need this patchset in your tree at all. The only
> >> platform which needs it, the only platform which will loose something
> >> will be one specific vendor. Therefore this will be an incentive for
> >> them to join both discussions and upstream development. :)
> > 
> > How would they fix this besides upstreaming support for unreleased
> > work-in-progress H/W?
> > 
> > Haven't I explained this several times already? :)
> 
> Either that way or the same as Will's doing but that's not my question.
> I understand you flush the queue of your GKI patches to be closer to
> upstream. Reduce the backlog/burden. you can achieve your goal by simply
> dropping such patch and making it not your problem. :)

git reset --hard mainline/master   # job done - tea break  :)

Seriously though, we wish to encourage the use of GKI so all vendors
can enjoy the benefits of more easily updateable/secure code-bases.

I can't see how pushing back on seamlessly benign changes would
benefit them or anyone else.
Geert Uytterhoeven Sept. 30, 2021, 4:09 p.m. UTC | #27
Hi Lee,

On Thu, Sep 30, 2021 at 2:08 PM Lee Jones <lee.jones@linaro.org> wrote:
> On Thu, 30 Sep 2021, Geert Uytterhoeven wrote:
> > On Thu, Sep 30, 2021 at 12:56 PM Lee Jones <lee.jones@linaro.org> wrote:
> > > On Thu, 30 Sep 2021, Geert Uytterhoeven wrote:
> > > > On Thu, Sep 30, 2021 at 11:23 AM Lee Jones <lee.jones@linaro.org> wrote:
> > > > > I've taken the liberty of cherry-picking some of the points you have
> > > > > reiteratted a few times.  Hopefully I can help to address them
> > > > > adequently.
> > > > >
> > > > > On Thu, 30 Sep 2021, Krzysztof Kozlowski wrote:
> > > > > > Reminder: these are essential drivers and all Exynos platforms must have
> > > > > > them as built-in (at least till someone really tests this on multiple
> > > > > > setups).
> > > > >
> > > > > > Therefore I don't agree with calling it a "problem" that we select
> > > > > > *necessary* drivers for supported platforms. It's by design - supported
> > > > > > platforms should receive them without ability to remove.
> > > > >
> > > > > > The selected drivers are essential for supported platforms.
> > > > >
> > > > > SoC specific drivers are only essential/necessary/required in
> > > > > images designed to execute solely on a platform that requires them.
> > > >
> > > > Why?
> > >
> > > Because without them the image wouldn't functional on any level.
> > >
> > > But you're right, there is still no requirement for it to be built-in.
> > >
> > > > > For a kernel image which is designed to be generic i.e. one that has
> > > > > the ability to boot on vast array of platforms, the drivers simply
> > > > > have to be *available*.
> > > >
> > > > If the drivers are really essential/necessary/required, this precludes
> > > > running the generic kernel image on the platform that requires them,
> > > > making the kernel not sufficiently generic.
> > >
> > > If they are not at all present, then yes.  However that is not what is
> > > being suggested.  The essential functionality will be provided.  Just
> > > not built-in.
> >
> > I really meant "essential/necessary/required to be built-in".
>
> Then I agree with you.  My position is that if they don't *have* to be
> built-in, then why force it?
>
> > > > > Forcing all H/W drivers that are only *potentially* utilised on *some*
> > > > > platforms as core binary built-ins doesn't make any technical sense.
> > > > > The two most important issues this causes are image size and a lack of
> > > > > configurability/flexibility relating to real-world application i.e.
> > > > > the one issue we already agreed upon; H/W or features that are too
> > > > > new (pre-release).
> > > >
> > > > True, if "potentially".  If not potentially, they must be included.
> > >
> > > I'm not sure what you're trying to say here.  Would you mind elaborating?
> >
> > It was a comment to your "*potentially* utilised on *some* platforms".
> > It is clear they are not used on the other ("not *some*") platforms, but your
> > sentence was unclear whether they are always or only sometimes used on
> > "*some*" platforms.
> > "always" => "not potentially"
> > "sometimes" => "potentially".
> >
> > I hope this makes it more clear.
>
> Not really, but I'll try to clean mine up:
>
> The aim is to have a single kernel (image + modules) that can be
> booted on a plethora of platforms.  For the sake of argument say 10.
> Let's also say that each of the platforms are equal and will be booted
> the same amount of times.
>
> Taking the example above, when I say that the H/W specific drivers
> will only be *potentially* utilised, I mean that they will only be
> bound and probed 1/10 times i.e. when booted on the associated
> platform.  Which means that in the vast majority of boots (9/10) they
> will lie dormant, taking up unnecessary space.
>
> Another way to say this would be; the kernel needs to have the
> capability to boot all of the supported platforms, but it will only
> ever be utilised on one at a time.

That's true even for drivers for "generic" hardware, right?
E.g. arm64 selects ARM_GIC and ARM_GIC_V3, where most (all?)
platforms have at most one of them.

> > > > > Bloating a generic kernel with potentially hundreds of unnecessary
> > > > > drivers that will never be executed in the vast majority of instances
> > > > > doesn't achieve anything.  If we have a kernel image that has the
> > > > > ability to boot on 10's of architectures which have 10's of platforms
> > > > > each, that's a whole host of unused/wasted executable space.
> > > >
> > > > The key here is if the driver is required or not to use the platform,
> > > > and why it is required.  If the requirement comes from some deficiency
> > > > in the kernel code or config system, it should be fixed, if possible.
> > > > And the fix should be tested.
> > > > If it cannot be fixed, the driver should be included, else it would
> > > > preclude running the generic kernel on the affected platform.
> > >
> > > Sorry, I'm not following.
> >
> > It all depends on why the driver is "required to be built-in".
> > Depending on the reason behind that requirement, the driver can be
> > changed from built-in to modular without ill effects on functionality.
>
> Absolutely.
>
> There are cases where drivers simply can't be built as modules.  These
> unavoidable situations are legitimate use-cases and the technology/
> code-base will have to work around these as required.
>
> The argument here is that if they can be separated and have been shown
> to work well in either use-case, then it is my opinion that placing an
> artificial barrier up based mostly on politics is not the correct
> approach.

Agreed.

Gr{oetje,eeting}s,

                        Geert
Geert Uytterhoeven Sept. 30, 2021, 4:12 p.m. UTC | #28
Hi Lee,

On Thu, Sep 30, 2021 at 3:29 PM Lee Jones <lee.jones@linaro.org> wrote:
> On Thu, 30 Sep 2021, Krzysztof Kozlowski wrote:
> > On 30/09/2021 14:39, Lee Jones wrote:
> > > On Thu, 30 Sep 2021, Krzysztof Kozlowski wrote:
> > >> On 30/09/2021 11:23, Lee Jones wrote:
> > >>> [0] Full disclosure: part of my role at Linaro is to keep the Android
> > >>> kernel running as close to Mainline as possible and encourage/push the
> > >>> upstream-first mantra, hence my involvement with this and other sets.
> > >>> I assure you all intentions are good and honourable.  If you haven't
> > >>> already seen it, please see Todd's most recent update on the goals and
> > >>> status of GKI:
> > >>>
> > >>>   Article: https://tinyurl.com/saaen3sp
> > >>>   Video:   https://youtu.be/O_lCFGinFPM
> > >>>
> > >>
> > >> Side topic, why this patchset is in your scope or Will's/Google's scope?
> > >> Just drop it from Android main kernel, it will not be your problem. I
> > >> mean, really, you don't need this patchset in your tree at all. The only
> > >> platform which needs it, the only platform which will loose something
> > >> will be one specific vendor. Therefore this will be an incentive for
> > >> them to join both discussions and upstream development. :)
> > >
> > > How would they fix this besides upstreaming support for unreleased
> > > work-in-progress H/W?
> > >
> > > Haven't I explained this several times already? :)
> >
> > Either that way or the same as Will's doing but that's not my question.
> > I understand you flush the queue of your GKI patches to be closer to
> > upstream. Reduce the backlog/burden. you can achieve your goal by simply
> > dropping such patch and making it not your problem. :)
>
> git reset --hard mainline/master   # job done - tea break  :)
>
> Seriously though, we wish to encourage the use of GKI so all vendors
> can enjoy the benefits of more easily updateable/secure code-bases.
>
> I can't see how pushing back on seamlessly benign changes would
> benefit them or anyone else.

I like your wording ;-)

Indeed, seamlessly benign changes, which are (1) not tested, and (2)
some believed by the platform maintainer to break the platform.
What can possibly go wrong? ;-)

Gr{oetje,eeting}s,

                        Geert
Lee Jones Sept. 30, 2021, 4:21 p.m. UTC | #29
On Thu, 30 Sep 2021, Geert Uytterhoeven wrote:

> Hi Lee,
> 
> On Thu, Sep 30, 2021 at 3:29 PM Lee Jones <lee.jones@linaro.org> wrote:
> > On Thu, 30 Sep 2021, Krzysztof Kozlowski wrote:
> > > On 30/09/2021 14:39, Lee Jones wrote:
> > > > On Thu, 30 Sep 2021, Krzysztof Kozlowski wrote:
> > > >> On 30/09/2021 11:23, Lee Jones wrote:
> > > >>> [0] Full disclosure: part of my role at Linaro is to keep the Android
> > > >>> kernel running as close to Mainline as possible and encourage/push the
> > > >>> upstream-first mantra, hence my involvement with this and other sets.
> > > >>> I assure you all intentions are good and honourable.  If you haven't
> > > >>> already seen it, please see Todd's most recent update on the goals and
> > > >>> status of GKI:
> > > >>>
> > > >>>   Article: https://tinyurl.com/saaen3sp
> > > >>>   Video:   https://youtu.be/O_lCFGinFPM
> > > >>>
> > > >>
> > > >> Side topic, why this patchset is in your scope or Will's/Google's scope?
> > > >> Just drop it from Android main kernel, it will not be your problem. I
> > > >> mean, really, you don't need this patchset in your tree at all. The only
> > > >> platform which needs it, the only platform which will loose something
> > > >> will be one specific vendor. Therefore this will be an incentive for
> > > >> them to join both discussions and upstream development. :)
> > > >
> > > > How would they fix this besides upstreaming support for unreleased
> > > > work-in-progress H/W?
> > > >
> > > > Haven't I explained this several times already? :)
> > >
> > > Either that way or the same as Will's doing but that's not my question.
> > > I understand you flush the queue of your GKI patches to be closer to
> > > upstream. Reduce the backlog/burden. you can achieve your goal by simply
> > > dropping such patch and making it not your problem. :)
> >
> > git reset --hard mainline/master   # job done - tea break  :)
> >
> > Seriously though, we wish to encourage the use of GKI so all vendors
> > can enjoy the benefits of more easily updateable/secure code-bases.
> >
> > I can't see how pushing back on seamlessly benign changes would
> > benefit them or anyone else.
> 
> I like your wording ;-)
> 
> Indeed, seamlessly benign changes, which are (1) not tested, and (2)
> some believed by the platform maintainer to break the platform.
> What can possibly go wrong? ;-)

William has already shown a willingness to test the series.

There is already a downstream proof-of-concept of this working.

I am hopeful. :)
Geert Uytterhoeven Sept. 30, 2021, 4:26 p.m. UTC | #30
Hi Lee,

On Thu, Sep 30, 2021 at 6:21 PM Lee Jones <lee.jones@linaro.org> wrote:
> On Thu, 30 Sep 2021, Geert Uytterhoeven wrote:
> > On Thu, Sep 30, 2021 at 3:29 PM Lee Jones <lee.jones@linaro.org> wrote:
> > > On Thu, 30 Sep 2021, Krzysztof Kozlowski wrote:
> > > > On 30/09/2021 14:39, Lee Jones wrote:
> > > > > On Thu, 30 Sep 2021, Krzysztof Kozlowski wrote:
> > > > >> On 30/09/2021 11:23, Lee Jones wrote:
> > > > >>> [0] Full disclosure: part of my role at Linaro is to keep the Android
> > > > >>> kernel running as close to Mainline as possible and encourage/push the
> > > > >>> upstream-first mantra, hence my involvement with this and other sets.
> > > > >>> I assure you all intentions are good and honourable.  If you haven't
> > > > >>> already seen it, please see Todd's most recent update on the goals and
> > > > >>> status of GKI:
> > > > >>>
> > > > >>>   Article: https://tinyurl.com/saaen3sp
> > > > >>>   Video:   https://youtu.be/O_lCFGinFPM
> > > > >>>
> > > > >>
> > > > >> Side topic, why this patchset is in your scope or Will's/Google's scope?
> > > > >> Just drop it from Android main kernel, it will not be your problem. I
> > > > >> mean, really, you don't need this patchset in your tree at all. The only
> > > > >> platform which needs it, the only platform which will loose something
> > > > >> will be one specific vendor. Therefore this will be an incentive for
> > > > >> them to join both discussions and upstream development. :)
> > > > >
> > > > > How would they fix this besides upstreaming support for unreleased
> > > > > work-in-progress H/W?
> > > > >
> > > > > Haven't I explained this several times already? :)
> > > >
> > > > Either that way or the same as Will's doing but that's not my question.
> > > > I understand you flush the queue of your GKI patches to be closer to
> > > > upstream. Reduce the backlog/burden. you can achieve your goal by simply
> > > > dropping such patch and making it not your problem. :)
> > >
> > > git reset --hard mainline/master   # job done - tea break  :)
> > >
> > > Seriously though, we wish to encourage the use of GKI so all vendors
> > > can enjoy the benefits of more easily updateable/secure code-bases.
> > >
> > > I can't see how pushing back on seamlessly benign changes would
> > > benefit them or anyone else.
> >
> > I like your wording ;-)
> >
> > Indeed, seamlessly benign changes, which are (1) not tested, and (2)
> > some believed by the platform maintainer to break the platform.
> > What can possibly go wrong? ;-)
>
> William has already shown a willingness to test the series.

I'm looking forward to the results!

Gr{oetje,eeting}s,

                        Geert
William McVicker Sept. 30, 2021, 6:02 p.m. UTC | #31
On Thu, Sep 30, 2021 at 9:26 AM Geert Uytterhoeven <geert@linux-m68k.org> wrote:
>
> Hi Lee,
>
> On Thu, Sep 30, 2021 at 6:21 PM Lee Jones <lee.jones@linaro.org> wrote:
> > On Thu, 30 Sep 2021, Geert Uytterhoeven wrote:
> > > On Thu, Sep 30, 2021 at 3:29 PM Lee Jones <lee.jones@linaro.org> wrote:
> > > > On Thu, 30 Sep 2021, Krzysztof Kozlowski wrote:
> > > > > On 30/09/2021 14:39, Lee Jones wrote:
> > > > > > On Thu, 30 Sep 2021, Krzysztof Kozlowski wrote:
> > > > > >> On 30/09/2021 11:23, Lee Jones wrote:
> > > > > >>> [0] Full disclosure: part of my role at Linaro is to keep the Android
> > > > > >>> kernel running as close to Mainline as possible and encourage/push the
> > > > > >>> upstream-first mantra, hence my involvement with this and other sets.
> > > > > >>> I assure you all intentions are good and honourable.  If you haven't
> > > > > >>> already seen it, please see Todd's most recent update on the goals and
> > > > > >>> status of GKI:
> > > > > >>>
> > > > > >>>   Article: https://tinyurl.com/saaen3sp
> > > > > >>>   Video:   https://youtu.be/O_lCFGinFPM
> > > > > >>>
> > > > > >>
> > > > > >> Side topic, why this patchset is in your scope or Will's/Google's scope?
> > > > > >> Just drop it from Android main kernel, it will not be your problem. I
> > > > > >> mean, really, you don't need this patchset in your tree at all. The only
> > > > > >> platform which needs it, the only platform which will loose something
> > > > > >> will be one specific vendor. Therefore this will be an incentive for
> > > > > >> them to join both discussions and upstream development. :)
> > > > > >
> > > > > > How would they fix this besides upstreaming support for unreleased
> > > > > > work-in-progress H/W?
> > > > > >
> > > > > > Haven't I explained this several times already? :)
> > > > >
> > > > > Either that way or the same as Will's doing but that's not my question.
> > > > > I understand you flush the queue of your GKI patches to be closer to
> > > > > upstream. Reduce the backlog/burden. you can achieve your goal by simply
> > > > > dropping such patch and making it not your problem. :)
> > > >
> > > > git reset --hard mainline/master   # job done - tea break  :)
> > > >
> > > > Seriously though, we wish to encourage the use of GKI so all vendors
> > > > can enjoy the benefits of more easily updateable/secure code-bases.
> > > >
> > > > I can't see how pushing back on seamlessly benign changes would
> > > > benefit them or anyone else.
> > >
> > > I like your wording ;-)
> > >
> > > Indeed, seamlessly benign changes, which are (1) not tested, and (2)
> > > some believed by the platform maintainer to break the platform.
> > > What can possibly go wrong? ;-)
> >
> > William has already shown a willingness to test the series.
>
> I'm looking forward to the results!
>
> Gr{oetje,eeting}s,
>
>                         Geert
>
> --
> Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org
>
> In personal conversations with technical people, I call myself a hacker. But
> when I'm talking to journalists I just say "programmer" or something like that.
>                                 -- Linus Torvalds

Hi Krzysztof and Geert,

Thanks for all the responses! There are a few things I want to respond
to. Since this thread is 32 responses deep, I'm copying and pasting
the parts I'm responding to. I hope that isn't an issue.

>You are changing not default, but selectability which is part of the
>enforced configuration to make platforms working. The distros do not
>always choose defaults but rather all as modules.

I'm guessing you are referring to distros using the "allmodconfig". I
agree that this would affect their platforms in the sense that they
would get a module vs built-in, but obviously that is the choice they
are making by using "allmodconfig". I also agree that my changes
should not break any distros. I definitely don't want you or anyone
else to accept these without them being tested and verified.

>I don't understand the point (2) here. Anyone can use upstream kernel
>for supported and unsupported platforms. How upstream benefits from a
>change affecting supported platforms made for unsupported, downstream
>platforms.

We believe that if we make it easier for SoC vendors to directly use
the upstream kernel during bring-up and during the development stages
of their project, then that will decrease the friction of working with
upstream (less downstream changes) and increase the upstream
contributions. As mentioned in the LPC talk by Todd, the Android
kernel team is constantly pushing to reduce the number of out-of-tree
changes we carry so that we can ultimately use the upstream kernel
directly.

>You also mentioned downstream devices but without actually ever defining
>them. Please be more specific. What SoC, what hardware?

Saravana provided 2 platforms in his response. The first one is
hikey960 devboard and the second one is the Pixel 5 which uses the
Qualcomm sm7250 chipset. In particular the Pixel 5 is running a fully
modular kernel. It has 320+ modules including interrupt controllers,
timers, pinctrl and clocks.

To address your question regarding if Exynos is special? No, Exynos
isn't special. There is a long road ahead of us to get to where we'd
ultimately like to be and ARCH_EXYNOS happens to be along the way.
Currently, we are working through the arm64 ARCH_XXX configs that are
supported by Android to make sure that all drivers included by any
ARCH_XXX are modularized if they can be which benefits all parties.
From my understanding upstream does support modularizing drivers and
we can help with that. I believe this also kind of addresses Geert's
comments regarding what a generic kernel is. We look at it as a kernel
that only includes the absolutely necessary SoC drivers as built-in.
Obviously we can't go back and change older hardware like Cortex A9 to
support this, but we can do our best to support future SoCs.

Lastly, you mentioned a couple of times that we should just disable
ARCH_EXYNOS and move on. At the moment we are not able to do that
because some SoC specific drivers actually do need to be built-in to
the kernel (tainting it's generic-ness unfortunately) and are
protected by these ARCH_XXX configs. One example that you or Geert
pointed out is the early console serial driver which Greg did try to
decouple from ARCH_EXYNOS but was turned down. There are likely other
reasons as well that I don't know of off hand.

I hope Lee and I answered all your questions. Ultimately, I think we
all understand what is being debated here and it sounds like we all
agree that we are okay with modules as long as (1) they are proven to
work as both built-in and as modules and (2) can't be disabled if they
are required by the platform, but can be built-in or modular. Let me
know if I misunderstood that.

Regarding the testing, I will look around for a devboard that I can
test my clock driver changes on since those are the most contested
ones. I obviously am not going to be able to find all the HW variants
you mentioned, but I will try to find an ARM64 Exynos7 or Exynos5433
devboard since my patchset only modifies the clocks for these SoCs
which seem to be the most contested. Once I provide some testing
results, I'm hoping someone upstream can help verify that on the other
affected hardware. Of course this will take some time though.

In the meantime, I will try to push some of these easier patches that
don't need extensive testing, but are required before modularizing the
drivers.

Thanks,
Will
Christoph Hellwig Oct. 1, 2021, 4:01 a.m. UTC | #32
On Thu, Sep 30, 2021 at 09:10:31PM +0900, Tomasz Figa wrote:
> Generally, the subsystems being mentioned here are so basic (clock,
> pinctrl, rtc), that I really can't imagine what kind of rocket science
> one might want to hide for competitive reasons... If it's for an
> entire SoC, I wonder why Intel and AMD don't have similar concerns and
> contribute support for their newest hardware far before the release.

There is no reason at all, and to be honest this whole discussion with
these bullshit arguments from the Google/Linaro/SoC vendor crowd just
shows how on crack these people are, and shows a good example of why
we should not support these models at all.  There is no good reason
to "overide" uptream functionality EVER.  Stop digging yourselves into
your ever bigger holes and just f***king contribute upstream NOW.  Just
as we always have we should not give you more rope to shoot yoursel
while ausing us extra overhead.
Christoph Hellwig Oct. 1, 2021, 4:04 a.m. UTC | #33
On Thu, Sep 30, 2021 at 01:39:35PM +0100, Lee Jones wrote:
> How would they fix this besides upstreaming support for unreleased
> work-in-progress H/W?
> 
> Haven't I explained this several times already? :)

No you haven't.  Mostly likely because there is absolutely no good
explanation.  And if google/Linaro think they want to create hooks
for stupid vendor modules we need to do whatever we can to make your
life as miserable as possible.
Saravana Kannan Oct. 1, 2021, 4:52 a.m. UTC | #34
On Thu, Sep 30, 2021 at 9:02 PM Christoph Hellwig <hch@infradead.org> wrote:
>
> On Thu, Sep 30, 2021 at 09:10:31PM +0900, Tomasz Figa wrote:
> > Generally, the subsystems being mentioned here are so basic (clock,
> > pinctrl, rtc), that I really can't imagine what kind of rocket science
> > one might want to hide for competitive reasons... If it's for an
> > entire SoC, I wonder why Intel and AMD don't have similar concerns and
> > contribute support for their newest hardware far before the release.
>
> There is no reason at all, and to be honest this whole discussion with
> these bullshit arguments from the Google/Linaro/SoC vendor crowd just
> shows how on crack these people are, and shows a good example of why
> we should not support these models at all.  There is no good reason
> to "overide" uptream functionality EVER.  Stop digging yourselves into
> your ever bigger holes and just f***king contribute upstream NOW.  Just
> as we always have we should not give you more rope to shoot yoursel
> while ausing us extra overhead.

Maybe you need to read up the code of conduct again.

-Saravana
Olof Johansson Oct. 1, 2021, 4:52 a.m. UTC | #35
On Wed, Sep 29, 2021 at 12:48 PM Will McVicker <willmcvicker@google.com> wrote:
>
> On Wed, Sep 29, 2021 at 6:02 AM Krzysztof Kozlowski
> <krzysztof.kozlowski@canonical.com> wrote:
> >
> > On 29/09/2021 01:56, Will McVicker wrote:
> > > This is v2 of the series of patches that modularizes a number of core
> > > ARCH_EXYNOS drivers. Based off of the feedback from the v1 series, I have
> > > modularized all of the drivers that are removed from the ARCH_EXYNOS
> > > series of "select XXX". This includes setting the following configs as
> > > tristate:
> > >
> > >  * COMMON_CLK_SAMSUNG
> > >  * EXYNOS_ARM64_COMMON_CLK
> > >  * PINCTRL_SAMSUNG
> > >  * PINCTRL_EXYNOS
> > >  * EXYNOS_PMU_ARM64
> > >  * EXYNOS_PM_DOMAINS
> > >
> > > Additionally, it introduces the config EXYNOS_PMU_ARM64 and EXYNOS_PMU_ARM
> > > which was previously EXYNOS_PMU and EXYNOS_PMU_ARM_DRIVERS respectively.
> > > The reason for these new configs is because we are not able to easily
> > > modularize the ARMv7 PMU driver due to built-in arch dependencies on
> > > pmu_base_addr under arch/arm/mach-exynos/*. So the new configs split up
> > > the ARM and ARM64 portions into two separate configs.
> > >
> > > Overall, these drivers didn't require much refactoring and converted to
> > > modules relatively easily. However, due to my lack of exynos hardware, I
> > > was not able to boot test these changes. I'm mostly concerned about the
> > > CLK_OF_DECLARE() changes having dependencies on early timers. So I'm
> > > requesting help for testing these changes on the respective hardware.
> > >
> >
> > These are all not tested at all? In such case, since these are not
> > trivial changes, please mark the series as RFT.
> >
> > I will not be able to test these for some days, so it must wait.
> >
> >
> > Best regards,
> > Krzysztof
>
> +Cc Arnd and Olof,
>
> Hi Krzysztof,
>
> To avoid the scrambled conversation from the first patchset, I'm going
> to address all your general questions here in the cover letter thread
> so that it's easier for everyone to follow and reference in the
> future.

This patchset shouldn't go in.

GKI is a fantastic effort, since it finally seems like Google has the
backbone to put pressure on the vendors to upstream all their stuff.

This patcheset dilutes and undermines all of that by opening up a
truck-size loophole, reducing the impact of GKI, and overall removes
leverage to get vendors to do the right thing.

It's against our interest as a community to have this happen, since
there's no other reasonably justifiable reason to do this.


-Olof
Christoph Hellwig Oct. 1, 2021, 4:55 a.m. UTC | #36
On Thu, Sep 30, 2021 at 09:52:12PM -0700, Saravana Kannan wrote:
> Maybe you need to read up the code of conduct again.

*plonk*
Saravana Kannan Oct. 1, 2021, 5:23 a.m. UTC | #37
On Thu, Sep 30, 2021 at 9:52 PM Olof Johansson <olof@lixom.net> wrote:
>
> On Wed, Sep 29, 2021 at 12:48 PM Will McVicker <willmcvicker@google.com> wrote:
> >
> > On Wed, Sep 29, 2021 at 6:02 AM Krzysztof Kozlowski
> > <krzysztof.kozlowski@canonical.com> wrote:
> > >
> > > On 29/09/2021 01:56, Will McVicker wrote:
> > > > This is v2 of the series of patches that modularizes a number of core
> > > > ARCH_EXYNOS drivers. Based off of the feedback from the v1 series, I have
> > > > modularized all of the drivers that are removed from the ARCH_EXYNOS
> > > > series of "select XXX". This includes setting the following configs as
> > > > tristate:
> > > >
> > > >  * COMMON_CLK_SAMSUNG
> > > >  * EXYNOS_ARM64_COMMON_CLK
> > > >  * PINCTRL_SAMSUNG
> > > >  * PINCTRL_EXYNOS
> > > >  * EXYNOS_PMU_ARM64
> > > >  * EXYNOS_PM_DOMAINS
> > > >
> > > > Additionally, it introduces the config EXYNOS_PMU_ARM64 and EXYNOS_PMU_ARM
> > > > which was previously EXYNOS_PMU and EXYNOS_PMU_ARM_DRIVERS respectively.
> > > > The reason for these new configs is because we are not able to easily
> > > > modularize the ARMv7 PMU driver due to built-in arch dependencies on
> > > > pmu_base_addr under arch/arm/mach-exynos/*. So the new configs split up
> > > > the ARM and ARM64 portions into two separate configs.
> > > >
> > > > Overall, these drivers didn't require much refactoring and converted to
> > > > modules relatively easily. However, due to my lack of exynos hardware, I
> > > > was not able to boot test these changes. I'm mostly concerned about the
> > > > CLK_OF_DECLARE() changes having dependencies on early timers. So I'm
> > > > requesting help for testing these changes on the respective hardware.
> > > >
> > >
> > > These are all not tested at all? In such case, since these are not
> > > trivial changes, please mark the series as RFT.
> > >
> > > I will not be able to test these for some days, so it must wait.
> > >
> > >
> > > Best regards,
> > > Krzysztof
> >
> > +Cc Arnd and Olof,
> >
> > Hi Krzysztof,
> >
> > To avoid the scrambled conversation from the first patchset, I'm going
> > to address all your general questions here in the cover letter thread
> > so that it's easier for everyone to follow and reference in the
> > future.
>
> This patchset shouldn't go in.
>
> GKI is a fantastic effort, since it finally seems like Google has the
> backbone to put pressure on the vendors to upstream all their stuff.
>
> This patcheset dilutes and undermines all of that by opening up a
> truck-size loophole, reducing the impact of GKI, and overall removes
> leverage to get vendors to do the right thing.
>
> It's against our interest as a community to have this happen, since
> there's no other reasonably justifiable reason to do this.

Oolf, Geert, Krzysztof, Arnd,

I skimmed through the emails and you all make a lot of good points. It
looks like you all at least like the idea of being able to have a
minimal generic kernel where everything that can be a module is a
module. Please correct me if I'm wrong on that.

I was thinking about this patch series and I was wondering if it'd be
good to come at it from the other end. Instead of taking the mostly
builtin generic kernel and trying to rip out drivers as modules (and
not having enough hardware to test them all) and hitting all these
issues, we could come at it from the other end.

A config like ARM64_MINIMAL_GENERIC_KERNEL that's off by default. But
if it's selected, all the "selects" done by the various ARCH_XXX are
not done any more. Something like:

ARCH_XXX
    select XXX_CLK1 if !ARM64_MINIMAL_GENERIC_KERNEL
    select XXX_PINCTRL1 if !ARM64_MINIMAL_GENERIC_KERNEL

ARCH_YYY
    select YYY_CLK1 if !ARM64_MINIMAL_GENERIC_KERNEL
    select YYY_PINCTRL1 if !ARM64_MINIMAL_GENERIC_KERNEL

And ARM64_MINIMAL_GENERIC_KERNEL itself would select the absolutely
mandatory stuff that can never be made into a module like the GIC,
architectured timer (as Geert mentioned) and UART early console
driver. I'm not sure if ARM32 has an equivalent to the standardized
GIC and arch timer. Basically the minimal kernel would need a timer
for the scheduler tick and IRQ controller to get the timer IRQ and the
fixed clock driver if the archtimer uses one to get its frequency and
the early UART console is pointless as a module (so build it in to
allow debugging/development).

And then all new drivers, we should make sure are implemented as
tristate drivers. And we can go back and slowly work on converting
existing drivers to modules (community effort -- not one person or
entity) -- at least the ones where the author has hardware or ones
where the change is very likely to be correct and someone else is
willing to test it. We'll never be able to support some/all ARM32 (do
they even have a GIC/arch timer standard?), but at least for ARM64,
this seems like a viable goal.

This way, we'll keep the existing model working, while working on a
fully modular kernel from the other end.

Thoughts?

-Saravana
Olof Johansson Oct. 1, 2021, 5:35 a.m. UTC | #38
On Thu, Sep 30, 2021 at 10:24 PM Saravana Kannan <saravanak@google.com> wrote:
>
> On Thu, Sep 30, 2021 at 9:52 PM Olof Johansson <olof@lixom.net> wrote:
> >
> > On Wed, Sep 29, 2021 at 12:48 PM Will McVicker <willmcvicker@google.com> wrote:
> > >
> > > On Wed, Sep 29, 2021 at 6:02 AM Krzysztof Kozlowski
> > > <krzysztof.kozlowski@canonical.com> wrote:
> > > >
> > > > On 29/09/2021 01:56, Will McVicker wrote:
> > > > > This is v2 of the series of patches that modularizes a number of core
> > > > > ARCH_EXYNOS drivers. Based off of the feedback from the v1 series, I have
> > > > > modularized all of the drivers that are removed from the ARCH_EXYNOS
> > > > > series of "select XXX". This includes setting the following configs as
> > > > > tristate:
> > > > >
> > > > >  * COMMON_CLK_SAMSUNG
> > > > >  * EXYNOS_ARM64_COMMON_CLK
> > > > >  * PINCTRL_SAMSUNG
> > > > >  * PINCTRL_EXYNOS
> > > > >  * EXYNOS_PMU_ARM64
> > > > >  * EXYNOS_PM_DOMAINS
> > > > >
> > > > > Additionally, it introduces the config EXYNOS_PMU_ARM64 and EXYNOS_PMU_ARM
> > > > > which was previously EXYNOS_PMU and EXYNOS_PMU_ARM_DRIVERS respectively.
> > > > > The reason for these new configs is because we are not able to easily
> > > > > modularize the ARMv7 PMU driver due to built-in arch dependencies on
> > > > > pmu_base_addr under arch/arm/mach-exynos/*. So the new configs split up
> > > > > the ARM and ARM64 portions into two separate configs.
> > > > >
> > > > > Overall, these drivers didn't require much refactoring and converted to
> > > > > modules relatively easily. However, due to my lack of exynos hardware, I
> > > > > was not able to boot test these changes. I'm mostly concerned about the
> > > > > CLK_OF_DECLARE() changes having dependencies on early timers. So I'm
> > > > > requesting help for testing these changes on the respective hardware.
> > > > >
> > > >
> > > > These are all not tested at all? In such case, since these are not
> > > > trivial changes, please mark the series as RFT.
> > > >
> > > > I will not be able to test these for some days, so it must wait.
> > > >
> > > >
> > > > Best regards,
> > > > Krzysztof
> > >
> > > +Cc Arnd and Olof,
> > >
> > > Hi Krzysztof,
> > >
> > > To avoid the scrambled conversation from the first patchset, I'm going
> > > to address all your general questions here in the cover letter thread
> > > so that it's easier for everyone to follow and reference in the
> > > future.
> >
> > This patchset shouldn't go in.
> >
> > GKI is a fantastic effort, since it finally seems like Google has the
> > backbone to put pressure on the vendors to upstream all their stuff.
> >
> > This patcheset dilutes and undermines all of that by opening up a
> > truck-size loophole, reducing the impact of GKI, and overall removes
> > leverage to get vendors to do the right thing.
> >
> > It's against our interest as a community to have this happen, since
> > there's no other reasonably justifiable reason to do this.
>
> Oolf, Geert, Krzysztof, Arnd,

So close.

> I skimmed through the emails and you all make a lot of good points.

I skimmed through this email and I think it adds a lot of new
complexity and fragility to solve a problem that doesn't really exist
for upstream, adding yet more config parameter combinations to build
and test for.

A much more valuable approach would be to work towards being able to
free up memory by un-probed drivers at the end of boot. That would
possibly benefit all platforms on all architectures.


-Olof
William McVicker Oct. 1, 2021, 5:59 a.m. UTC | #39
On Thu, Sep 30, 2021 at 10:36 PM Olof Johansson <olof@lixom.net> wrote:
>
> On Thu, Sep 30, 2021 at 10:24 PM Saravana Kannan <saravanak@google.com> wrote:
> >
> > On Thu, Sep 30, 2021 at 9:52 PM Olof Johansson <olof@lixom.net> wrote:
> > >
> > > On Wed, Sep 29, 2021 at 12:48 PM Will McVicker <willmcvicker@google.com> wrote:
> > > >
> > > > On Wed, Sep 29, 2021 at 6:02 AM Krzysztof Kozlowski
> > > > <krzysztof.kozlowski@canonical.com> wrote:
> > > > >
> > > > > On 29/09/2021 01:56, Will McVicker wrote:
> > > > > > This is v2 of the series of patches that modularizes a number of core
> > > > > > ARCH_EXYNOS drivers. Based off of the feedback from the v1 series, I have
> > > > > > modularized all of the drivers that are removed from the ARCH_EXYNOS
> > > > > > series of "select XXX". This includes setting the following configs as
> > > > > > tristate:
> > > > > >
> > > > > >  * COMMON_CLK_SAMSUNG
> > > > > >  * EXYNOS_ARM64_COMMON_CLK
> > > > > >  * PINCTRL_SAMSUNG
> > > > > >  * PINCTRL_EXYNOS
> > > > > >  * EXYNOS_PMU_ARM64
> > > > > >  * EXYNOS_PM_DOMAINS
> > > > > >
> > > > > > Additionally, it introduces the config EXYNOS_PMU_ARM64 and EXYNOS_PMU_ARM
> > > > > > which was previously EXYNOS_PMU and EXYNOS_PMU_ARM_DRIVERS respectively.
> > > > > > The reason for these new configs is because we are not able to easily
> > > > > > modularize the ARMv7 PMU driver due to built-in arch dependencies on
> > > > > > pmu_base_addr under arch/arm/mach-exynos/*. So the new configs split up
> > > > > > the ARM and ARM64 portions into two separate configs.
> > > > > >
> > > > > > Overall, these drivers didn't require much refactoring and converted to
> > > > > > modules relatively easily. However, due to my lack of exynos hardware, I
> > > > > > was not able to boot test these changes. I'm mostly concerned about the
> > > > > > CLK_OF_DECLARE() changes having dependencies on early timers. So I'm
> > > > > > requesting help for testing these changes on the respective hardware.
> > > > > >
> > > > >
> > > > > These are all not tested at all? In such case, since these are not
> > > > > trivial changes, please mark the series as RFT.
> > > > >
> > > > > I will not be able to test these for some days, so it must wait.
> > > > >
> > > > >
> > > > > Best regards,
> > > > > Krzysztof
> > > >
> > > > +Cc Arnd and Olof,
> > > >
> > > > Hi Krzysztof,
> > > >
> > > > To avoid the scrambled conversation from the first patchset, I'm going
> > > > to address all your general questions here in the cover letter thread
> > > > so that it's easier for everyone to follow and reference in the
> > > > future.
> > >
> > > This patchset shouldn't go in.
> > >
> > > GKI is a fantastic effort, since it finally seems like Google has the
> > > backbone to put pressure on the vendors to upstream all their stuff.
> > >
> > > This patcheset dilutes and undermines all of that by opening up a
> > > truck-size loophole, reducing the impact of GKI, and overall removes
> > > leverage to get vendors to do the right thing.
> > >
> > > It's against our interest as a community to have this happen, since
> > > there's no other reasonably justifiable reason to do this.

Are you saying that modularizing drivers is opening up a loophole? How
is this different from Krysztof pushing changes to modularize the
Exynos ChipId driver just last week [1].  I understand the push back
on "these aren't tested yet" and I agree that we should not merge them
until they are (I've re-iterated that multiple times and have
requested for testing help multiple times since I can't get my hands
on any Exynos arm64 hardware), but are you saying that if I gather the
test data to prove that these drivers can actually be made into
modules that you will still deny them out of the interest of the
community?

[1] https://lore.kernel.org/linux-samsung-soc/4aee1b0d-91a1-75ac-d2b7-6dab3d7a301f@kernel.org/T/#t

--Will

>
> >
> > Oolf, Geert, Krzysztof, Arnd,
>
> So close.
>
> > I skimmed through the emails and you all make a lot of good points.
>
> I skimmed through this email and I think it adds a lot of new
> complexity and fragility to solve a problem that doesn't really exist
> for upstream, adding yet more config parameter combinations to build
> and test for.
>
> A much more valuable approach would be to work towards being able to
> free up memory by un-probed drivers at the end of boot. That would
> possibly benefit all platforms on all architectures.
>
>
> -Olof
Saravana Kannan Oct. 1, 2021, 6:02 a.m. UTC | #40
On Thu, Sep 30, 2021 at 10:36 PM Olof Johansson <olof@lixom.net> wrote:
>
> On Thu, Sep 30, 2021 at 10:24 PM Saravana Kannan <saravanak@google.com> wrote:
> >
> > On Thu, Sep 30, 2021 at 9:52 PM Olof Johansson <olof@lixom.net> wrote:
> > >
> > > On Wed, Sep 29, 2021 at 12:48 PM Will McVicker <willmcvicker@google.com> wrote:
> > > >
> > > > On Wed, Sep 29, 2021 at 6:02 AM Krzysztof Kozlowski
> > > > <krzysztof.kozlowski@canonical.com> wrote:
> > > > >
> > > > > On 29/09/2021 01:56, Will McVicker wrote:
> > > > > > This is v2 of the series of patches that modularizes a number of core
> > > > > > ARCH_EXYNOS drivers. Based off of the feedback from the v1 series, I have
> > > > > > modularized all of the drivers that are removed from the ARCH_EXYNOS
> > > > > > series of "select XXX". This includes setting the following configs as
> > > > > > tristate:
> > > > > >
> > > > > >  * COMMON_CLK_SAMSUNG
> > > > > >  * EXYNOS_ARM64_COMMON_CLK
> > > > > >  * PINCTRL_SAMSUNG
> > > > > >  * PINCTRL_EXYNOS
> > > > > >  * EXYNOS_PMU_ARM64
> > > > > >  * EXYNOS_PM_DOMAINS
> > > > > >
> > > > > > Additionally, it introduces the config EXYNOS_PMU_ARM64 and EXYNOS_PMU_ARM
> > > > > > which was previously EXYNOS_PMU and EXYNOS_PMU_ARM_DRIVERS respectively.
> > > > > > The reason for these new configs is because we are not able to easily
> > > > > > modularize the ARMv7 PMU driver due to built-in arch dependencies on
> > > > > > pmu_base_addr under arch/arm/mach-exynos/*. So the new configs split up
> > > > > > the ARM and ARM64 portions into two separate configs.
> > > > > >
> > > > > > Overall, these drivers didn't require much refactoring and converted to
> > > > > > modules relatively easily. However, due to my lack of exynos hardware, I
> > > > > > was not able to boot test these changes. I'm mostly concerned about the
> > > > > > CLK_OF_DECLARE() changes having dependencies on early timers. So I'm
> > > > > > requesting help for testing these changes on the respective hardware.
> > > > > >
> > > > >
> > > > > These are all not tested at all? In such case, since these are not
> > > > > trivial changes, please mark the series as RFT.
> > > > >
> > > > > I will not be able to test these for some days, so it must wait.
> > > > >
> > > > >
> > > > > Best regards,
> > > > > Krzysztof
> > > >
> > > > +Cc Arnd and Olof,
> > > >
> > > > Hi Krzysztof,
> > > >
> > > > To avoid the scrambled conversation from the first patchset, I'm going
> > > > to address all your general questions here in the cover letter thread
> > > > so that it's easier for everyone to follow and reference in the
> > > > future.
> > >
> > > This patchset shouldn't go in.
> > >
> > > GKI is a fantastic effort, since it finally seems like Google has the
> > > backbone to put pressure on the vendors to upstream all their stuff.
> > >
> > > This patcheset dilutes and undermines all of that by opening up a
> > > truck-size loophole, reducing the impact of GKI, and overall removes
> > > leverage to get vendors to do the right thing.
> > >
> > > It's against our interest as a community to have this happen, since
> > > there's no other reasonably justifiable reason to do this.
> >
> > Oolf, Geert, Krzysztof, Arnd,
>
> So close.

I'm sorry, it's pretty late here and I'm sleepy and messed it up.

>
> > I skimmed through the emails and you all make a lot of good points.
>
> I skimmed through this email and I think it adds a lot of new
> complexity and fragility to solve a problem that doesn't really exist
> for upstream, adding yet more config parameter combinations to build
> and test for.

How is this not an upstream problem? Having a minimal kernel with as
many drivers as modules is of interest to upstream. And what's the
complexity in having a config to easily disable a bunch of configs?
The new config gives a clear config against which new
platforms/systems should be developed against.

>
> A much more valuable approach would be to work towards being able to
> free up memory by un-probed drivers at the end of boot. That would
> possibly benefit all platforms on all architectures.

Sure it would help memory after boot, but it won't help with size on
"disk", kernel load time, etc. And some of the devices have very tight
boot requirements. Think battery operated outdoor cameras for example.

-Saravana
Olof Johansson Oct. 1, 2021, 6:27 a.m. UTC | #41
On Thu, Sep 30, 2021 at 11:03 PM Saravana Kannan <saravanak@google.com> wrote:
>
> On Thu, Sep 30, 2021 at 10:36 PM Olof Johansson <olof@lixom.net> wrote:
> >
> > On Thu, Sep 30, 2021 at 10:24 PM Saravana Kannan <saravanak@google.com> wrote:
> > >
> > > On Thu, Sep 30, 2021 at 9:52 PM Olof Johansson <olof@lixom.net> wrote:
> > > >
> > > > On Wed, Sep 29, 2021 at 12:48 PM Will McVicker <willmcvicker@google.com> wrote:
> > > > >
> > > > > On Wed, Sep 29, 2021 at 6:02 AM Krzysztof Kozlowski
> > > > > <krzysztof.kozlowski@canonical.com> wrote:
> > > > > >
> > > > > > On 29/09/2021 01:56, Will McVicker wrote:
> > > > > > > This is v2 of the series of patches that modularizes a number of core
> > > > > > > ARCH_EXYNOS drivers. Based off of the feedback from the v1 series, I have
> > > > > > > modularized all of the drivers that are removed from the ARCH_EXYNOS
> > > > > > > series of "select XXX". This includes setting the following configs as
> > > > > > > tristate:
> > > > > > >
> > > > > > >  * COMMON_CLK_SAMSUNG
> > > > > > >  * EXYNOS_ARM64_COMMON_CLK
> > > > > > >  * PINCTRL_SAMSUNG
> > > > > > >  * PINCTRL_EXYNOS
> > > > > > >  * EXYNOS_PMU_ARM64
> > > > > > >  * EXYNOS_PM_DOMAINS
> > > > > > >
> > > > > > > Additionally, it introduces the config EXYNOS_PMU_ARM64 and EXYNOS_PMU_ARM
> > > > > > > which was previously EXYNOS_PMU and EXYNOS_PMU_ARM_DRIVERS respectively.
> > > > > > > The reason for these new configs is because we are not able to easily
> > > > > > > modularize the ARMv7 PMU driver due to built-in arch dependencies on
> > > > > > > pmu_base_addr under arch/arm/mach-exynos/*. So the new configs split up
> > > > > > > the ARM and ARM64 portions into two separate configs.
> > > > > > >
> > > > > > > Overall, these drivers didn't require much refactoring and converted to
> > > > > > > modules relatively easily. However, due to my lack of exynos hardware, I
> > > > > > > was not able to boot test these changes. I'm mostly concerned about the
> > > > > > > CLK_OF_DECLARE() changes having dependencies on early timers. So I'm
> > > > > > > requesting help for testing these changes on the respective hardware.
> > > > > > >
> > > > > >
> > > > > > These are all not tested at all? In such case, since these are not
> > > > > > trivial changes, please mark the series as RFT.
> > > > > >
> > > > > > I will not be able to test these for some days, so it must wait.
> > > > > >
> > > > > >
> > > > > > Best regards,
> > > > > > Krzysztof
> > > > >
> > > > > +Cc Arnd and Olof,
> > > > >
> > > > > Hi Krzysztof,
> > > > >
> > > > > To avoid the scrambled conversation from the first patchset, I'm going
> > > > > to address all your general questions here in the cover letter thread
> > > > > so that it's easier for everyone to follow and reference in the
> > > > > future.
> > > >
> > > > This patchset shouldn't go in.
> > > >
> > > > GKI is a fantastic effort, since it finally seems like Google has the
> > > > backbone to put pressure on the vendors to upstream all their stuff.
> > > >
> > > > This patcheset dilutes and undermines all of that by opening up a
> > > > truck-size loophole, reducing the impact of GKI, and overall removes
> > > > leverage to get vendors to do the right thing.
> > > >
> > > > It's against our interest as a community to have this happen, since
> > > > there's no other reasonably justifiable reason to do this.
> > >
> > > Oolf, Geert, Krzysztof, Arnd,
> >
> > So close.
>
> I'm sorry, it's pretty late here and I'm sleepy and messed it up.

This email thread will be here in the morning too, this is the last
reply I will make tonight myself.

> > > I skimmed through the emails and you all make a lot of good points.
> >
> > I skimmed through this email and I think it adds a lot of new
> > complexity and fragility to solve a problem that doesn't really exist
> > for upstream, adding yet more config parameter combinations to build
> > and test for.
>
> How is this not an upstream problem? Having a minimal kernel with as
> many drivers as modules is of interest to upstream. And what's the
> complexity in having a config to easily disable a bunch of configs?
> The new config gives a clear config against which new
> platforms/systems should be developed against.

The approach for general kernel upstream has been to have the built-in
drivers needed to reach rootfs and runtime load the rest from there.
For downstream embedded kernels, the right answer is to just exclude
the drivers you don't need.

I agree, some of this is probably useful work but
1) It was posted as a burden on the maintainers of the legacy
platforms to test and make sure it's not broken
2) It regresses distro configs, which we do care about
3) Based on the above, it's unclear whether the people pushing for
this patchset cares about the existing platforms, they're looking to
solve a different problem

I didn't go back and look at who and where, but the person who started
justifying this work with making it easier for out-of-tree vendors to
integrate their non-contributed code killed this patchset.

Would the out-of-tree argument on its own kill it? No, but the above
complexities/cost, plus the actual intent behind the patchset did.

> > A much more valuable approach would be to work towards being able to
> > free up memory by un-probed drivers at the end of boot. That would
> > possibly benefit all platforms on all architectures.
>
> Sure it would help memory after boot, but it won't help with size on
> "disk", kernel load time, etc. And some of the devices have very tight
> boot requirements. Think battery operated outdoor cameras for example.

I would question the choices made if someone ships a camera with a
generic kernel (without a generic filesystem which at that point also
needs sufficient storage such that the kernel image size becomes a
secondary consideration). Once your storage is that constrained the
balance shifts towards a custom minimal config.


-Olof
Olof Johansson Oct. 1, 2021, 6:30 a.m. UTC | #42
On Thu, Sep 30, 2021 at 11:27 PM Olof Johansson <olof@lixom.net> wrote:
>
> On Thu, Sep 30, 2021 at 11:03 PM Saravana Kannan <saravanak@google.com> wrote:
> >
> > On Thu, Sep 30, 2021 at 10:36 PM Olof Johansson <olof@lixom.net> wrote:
> > >
> > > On Thu, Sep 30, 2021 at 10:24 PM Saravana Kannan <saravanak@google.com> wrote:
> > > >
> > > > On Thu, Sep 30, 2021 at 9:52 PM Olof Johansson <olof@lixom.net> wrote:
> > > > >
> > > > > On Wed, Sep 29, 2021 at 12:48 PM Will McVicker <willmcvicker@google.com> wrote:
> > > > > >
> > > > > > On Wed, Sep 29, 2021 at 6:02 AM Krzysztof Kozlowski
> > > > > > <krzysztof.kozlowski@canonical.com> wrote:
> > > > > > >
> > > > > > > On 29/09/2021 01:56, Will McVicker wrote:
> > > > > > > > This is v2 of the series of patches that modularizes a number of core
> > > > > > > > ARCH_EXYNOS drivers. Based off of the feedback from the v1 series, I have
> > > > > > > > modularized all of the drivers that are removed from the ARCH_EXYNOS
> > > > > > > > series of "select XXX". This includes setting the following configs as
> > > > > > > > tristate:
> > > > > > > >
> > > > > > > >  * COMMON_CLK_SAMSUNG
> > > > > > > >  * EXYNOS_ARM64_COMMON_CLK
> > > > > > > >  * PINCTRL_SAMSUNG
> > > > > > > >  * PINCTRL_EXYNOS
> > > > > > > >  * EXYNOS_PMU_ARM64
> > > > > > > >  * EXYNOS_PM_DOMAINS
> > > > > > > >
> > > > > > > > Additionally, it introduces the config EXYNOS_PMU_ARM64 and EXYNOS_PMU_ARM
> > > > > > > > which was previously EXYNOS_PMU and EXYNOS_PMU_ARM_DRIVERS respectively.
> > > > > > > > The reason for these new configs is because we are not able to easily
> > > > > > > > modularize the ARMv7 PMU driver due to built-in arch dependencies on
> > > > > > > > pmu_base_addr under arch/arm/mach-exynos/*. So the new configs split up
> > > > > > > > the ARM and ARM64 portions into two separate configs.
> > > > > > > >
> > > > > > > > Overall, these drivers didn't require much refactoring and converted to
> > > > > > > > modules relatively easily. However, due to my lack of exynos hardware, I
> > > > > > > > was not able to boot test these changes. I'm mostly concerned about the
> > > > > > > > CLK_OF_DECLARE() changes having dependencies on early timers. So I'm
> > > > > > > > requesting help for testing these changes on the respective hardware.
> > > > > > > >
> > > > > > >
> > > > > > > These are all not tested at all? In such case, since these are not
> > > > > > > trivial changes, please mark the series as RFT.
> > > > > > >
> > > > > > > I will not be able to test these for some days, so it must wait.
> > > > > > >
> > > > > > >
> > > > > > > Best regards,
> > > > > > > Krzysztof
> > > > > >
> > > > > > +Cc Arnd and Olof,
> > > > > >
> > > > > > Hi Krzysztof,
> > > > > >
> > > > > > To avoid the scrambled conversation from the first patchset, I'm going
> > > > > > to address all your general questions here in the cover letter thread
> > > > > > so that it's easier for everyone to follow and reference in the
> > > > > > future.
> > > > >
> > > > > This patchset shouldn't go in.
> > > > >
> > > > > GKI is a fantastic effort, since it finally seems like Google has the
> > > > > backbone to put pressure on the vendors to upstream all their stuff.
> > > > >
> > > > > This patcheset dilutes and undermines all of that by opening up a
> > > > > truck-size loophole, reducing the impact of GKI, and overall removes
> > > > > leverage to get vendors to do the right thing.
> > > > >
> > > > > It's against our interest as a community to have this happen, since
> > > > > there's no other reasonably justifiable reason to do this.
> > > >
> > > > Oolf, Geert, Krzysztof, Arnd,
> > >
> > > So close.
> >
> > I'm sorry, it's pretty late here and I'm sleepy and messed it up.
>
> This email thread will be here in the morning too, this is the last
> reply I will make tonight myself.
>
> > > > I skimmed through the emails and you all make a lot of good points.
> > >
> > > I skimmed through this email and I think it adds a lot of new
> > > complexity and fragility to solve a problem that doesn't really exist
> > > for upstream, adding yet more config parameter combinations to build
> > > and test for.
> >
> > How is this not an upstream problem? Having a minimal kernel with as
> > many drivers as modules is of interest to upstream. And what's the
> > complexity in having a config to easily disable a bunch of configs?
> > The new config gives a clear config against which new
> > platforms/systems should be developed against.
>
> The approach for general kernel upstream has been to have the built-in
> drivers needed to reach rootfs and runtime load the rest from there.

Correction: The intent for our own multiconfigs have been this. If
people want to go more granular and allow for modules of their core
drivers, we have never pushed back (but the rest of the argument still
stands).

> For downstream embedded kernels, the right answer is to just exclude
> the drivers you don't need.
>
> I agree, some of this is probably useful work but
> 1) It was posted as a burden on the maintainers of the legacy
> platforms to test and make sure it's not broken
> 2) It regresses distro configs, which we do care about
> 3) Based on the above, it's unclear whether the people pushing for
> this patchset cares about the existing platforms, they're looking to
> solve a different problem
>
> I didn't go back and look at who and where, but the person who started
> justifying this work with making it easier for out-of-tree vendors to
> integrate their non-contributed code killed this patchset.
>
> Would the out-of-tree argument on its own kill it? No, but the above
> complexities/cost, plus the actual intent behind the patchset did.
>
> > > A much more valuable approach would be to work towards being able to
> > > free up memory by un-probed drivers at the end of boot. That would
> > > possibly benefit all platforms on all architectures.
> >
> > Sure it would help memory after boot, but it won't help with size on
> > "disk", kernel load time, etc. And some of the devices have very tight
> > boot requirements. Think battery operated outdoor cameras for example.
>
> I would question the choices made if someone ships a camera with a
> generic kernel (without a generic filesystem which at that point also
> needs sufficient storage such that the kernel image size becomes a
> secondary consideration). Once your storage is that constrained the
> balance shifts towards a custom minimal config.
>
>
> -Olof
Krzysztof Kozlowski Oct. 1, 2021, 8:01 a.m. UTC | #43
On 01/10/2021 07:59, Will McVicker wrote:
> On Thu, Sep 30, 2021 at 10:36 PM Olof Johansson <olof@lixom.net> wrote:
>>
>>>>
>>>> GKI is a fantastic effort, since it finally seems like Google has the
>>>> backbone to put pressure on the vendors to upstream all their stuff.
>>>>
>>>> This patcheset dilutes and undermines all of that by opening up a
>>>> truck-size loophole, reducing the impact of GKI, and overall removes
>>>> leverage to get vendors to do the right thing.
>>>>
>>>> It's against our interest as a community to have this happen, since
>>>> there's no other reasonably justifiable reason to do this.
> 
> Are you saying that modularizing drivers is opening up a loophole? How
> is this different from Krysztof pushing changes to modularize the
> Exynos ChipId driver just last week [1].  

Modularizing drivers, which can work as modules or even can be disabled
because they are not essential for platform boot, is not opening
loophole and is helping upstream platforms. Modularizing everything,
even essential drivers, because downstream does not want to contribute
rest of its drivers, is not beneficial to the upstream project. Since
downstream does want to contribute its platforms and drivers, it decides
to change mainline project to fits its needs. Only its needs, not others.

I was repeating this multiple times - there is no point, no incentive
for the mainline to allow disabling essential SoC drivers. It's only
downstream interest without any benefit to the upstream.


Best regards,
Krzysztof
Geert Uytterhoeven Oct. 1, 2021, 8:19 a.m. UTC | #44
Hi Saravana,

On Fri, Oct 1, 2021 at 7:24 AM Saravana Kannan <saravanak@google.com> wrote:
> I skimmed through the emails and you all make a lot of good points. It
> looks like you all at least like the idea of being able to have a
> minimal generic kernel where everything that can be a module is a
> module. Please correct me if I'm wrong on that.
>
> I was thinking about this patch series and I was wondering if it'd be
> good to come at it from the other end. Instead of taking the mostly
> builtin generic kernel and trying to rip out drivers as modules (and
> not having enough hardware to test them all) and hitting all these
> issues, we could come at it from the other end.
>
> A config like ARM64_MINIMAL_GENERIC_KERNEL that's off by default. But
> if it's selected, all the "selects" done by the various ARCH_XXX are
> not done any more. Something like:
>
> ARCH_XXX
>     select XXX_CLK1 if !ARM64_MINIMAL_GENERIC_KERNEL
>     select XXX_PINCTRL1 if !ARM64_MINIMAL_GENERIC_KERNEL
>
> ARCH_YYY
>     select YYY_CLK1 if !ARM64_MINIMAL_GENERIC_KERNEL
>     select YYY_PINCTRL1 if !ARM64_MINIMAL_GENERIC_KERNEL
>
> And ARM64_MINIMAL_GENERIC_KERNEL itself would select the absolutely
> mandatory stuff that can never be made into a module like the GIC,
> architectured timer (as Geert mentioned) and UART early console
> driver. I'm not sure if ARM32 has an equivalent to the standardized

While the UART early console can work (assuming the related hardware
setup has been done by the boot loader), the actual serial driver
usually cannot, as it relies on clocks, PM Domains, pin control, which
won't be available until the corresponding modular drivers are loaded.
Actually earlycon is a debug feature, so I'm wondering if you actually
want that in your GKI kernel?

> GIC and arch timer. Basically the minimal kernel would need a timer
> for the scheduler tick and IRQ controller to get the timer IRQ and the
> fixed clock driver if the archtimer uses one to get its frequency and
> the early UART console is pointless as a module (so build it in to
> allow debugging/development).
>
> And then all new drivers, we should make sure are implemented as
> tristate drivers. And we can go back and slowly work on converting
> existing drivers to modules (community effort -- not one person or
> entity) -- at least the ones where the author has hardware or ones
> where the change is very likely to be correct and someone else is
> willing to test it. We'll never be able to support some/all ARM32 (do
> they even have a GIC/arch timer standard?), but at least for ARM64,
> this seems like a viable goal.

Cortex-A7/A15 and later have GIC and architectured timer, so it should
work for contemporary systems.
Cortex-A9 systems may have GIC, and TWD and/or Global Timer (but I've
seen SoCs where the interrupt for the latter was not wired :-(.

What are the plans for other architectures?
I've seen similar patches being applied for e.g. MIPS.

Gr{oetje,eeting}s,

                        Geert
Arnd Bergmann Oct. 1, 2021, 9 a.m. UTC | #45
On Fri, Oct 1, 2021 at 10:19 AM Geert Uytterhoeven <geert@linux-m68k.org> wrote:
> On Fri, Oct 1, 2021 at 7:24 AM Saravana Kannan <saravanak@google.com> wrote:
> > GIC and arch timer. Basically the minimal kernel would need a timer
> > for the scheduler tick and IRQ controller to get the timer IRQ and the
> > fixed clock driver if the archtimer uses one to get its frequency and
> > the early UART console is pointless as a module (so build it in to
> > allow debugging/development).
> >
> > And then all new drivers, we should make sure are implemented as
> > tristate drivers. And we can go back and slowly work on converting
> > existing drivers to modules (community effort -- not one person or
> > entity) -- at least the ones where the author has hardware or ones
> > where the change is very likely to be correct and someone else is
> > willing to test it. We'll never be able to support some/all ARM32 (do
> > they even have a GIC/arch timer standard?), but at least for ARM64,
> > this seems like a viable goal.
>
> Cortex-A7/A15 and later have GIC and architectured timer, so it should
> work for contemporary systems.
> Cortex-A9 systems may have GIC, and TWD and/or Global Timer (but I've
> seen SoCs where the interrupt for the latter was not wired :-(.

There are a number of well-known examples even with 64-bit chips or
Cortex-A7/A15 based SoCs that can't use the architected timer,
irqchip or iommu.

Apple M1, Broadcom BCM283x, Samsung Exynos5 and
some Hisilicon server parts come to mind, I'm sure there
are more.

> What are the plans for other architectures?
> I've seen similar patches being applied for e.g. MIPS.

There is some work in the more actively maintained MIPS
platforms to make those behave more like Arm/powerpc/riscv/m68k
platforms, using a single image and moving drivers into modules.
Most MIPS platforms seem unlikely to get updated to this,
and will continue to require a SoC specific kernel binary forever,
similar to the renesas superh platforms. Most of the less
common architectures (arc, csky, hexagon, nios2, xtensa,
microblaze, nds32, openrisc, sparc/leon) are way behind that
though, and generally don't work at all without out-of-tree
code.

      Arnd
Linus Walleij Oct. 1, 2021, 11:38 a.m. UTC | #46
On Fri, Oct 1, 2021 at 7:36 AM Olof Johansson <olof@lixom.net> wrote:

> A much more valuable approach would be to work towards being able to
> free up memory by un-probed drivers at the end of boot. That would
> possibly benefit all platforms on all architectures.

This would be really neat.

Also the ages-old problem of discarding unreferenced strings which
apparently still isn't possible would be great to have solved. E.g.
to couple strings to the file/module it belongs with and discard them
if the code is not probed.

Yours,
Linus Walleij
Geert Uytterhoeven Oct. 1, 2021, 11:59 a.m. UTC | #47
Hi Olof,

On Fri, Oct 1, 2021 at 7:36 AM Olof Johansson <olof@lixom.net> wrote:
> A much more valuable approach would be to work towards being able to
> free up memory by un-probed drivers at the end of boot. That would
> possibly benefit all platforms on all architectures.

We used to have such a functionality in arch/ppc (not arch/powerpc!),
where code/data could be tagged __prep, __chrp, or __pmac, to put it
in a special section, and to be freed with initdata when unused.  It
was removed in v2.6.15[1], as the savings weren't worth the hassle.
In a more fragmented space like arm the memory lost due to alignment
of the sections would be even more substantial.

Another problem is to know when is the end of the boot, especially
with deferred probing.

[1] 6c45ab992e4299c8 ("[PATCH] powerpc: Remove section free() and
linker script bits")

Gr{oetje,eeting}s,

                        Geert
Arnd Bergmann Oct. 1, 2021, noon UTC | #48
On Fri, Oct 1, 2021 at 8:02 AM Saravana Kannan <saravanak@google.com> wrote:

> > A much more valuable approach would be to work towards being able to
> > free up memory by un-probed drivers at the end of boot. That would
> > possibly benefit all platforms on all architectures.
>
> Sure it would help memory after boot, but it won't help with size on
> "disk", kernel load time, etc. And some of the devices have very tight
> boot requirements. Think battery operated outdoor cameras for example.

I think we can draw a clear line (or several lines) between devices that boot
from strictly constrained NOR flash and those that run a platform-independent
kernel.

Also, when I look at a distro kernel, I see over 5000 kernel modules that
need to be stored on disk, but only a small fraction of those are platform
specific while most are for general-purpose pluggable devices, network
features or file systems that could be used on any system.

The vmlinux file is clearly too big and includes too much stuff that should
be in loadable modules, but I'm not really that worried about disk space
for the platform specific code.

         Arnd
Lee Jones Oct. 1, 2021, 12:31 p.m. UTC | #49
On Fri, 01 Oct 2021, Arnd Bergmann wrote:
> The vmlinux file is clearly too big and includes too much stuff that should
> be in loadable modules

This for me is the crux of the matter.

The ability to replace modules was only brought to light as an "and
also, this is possible".  However in retrospect, given the attention
this has received, it probably shouldn't have even mentioned, as it's
not that important.

We should focus on the benefits of making parts of the kernel modular
if technically possible.  The most prominent of those is core binary
size, since this has a direct impact on boot-time and RAM usage.

Reclaiming dead code after boot is certainly one way to tackle part of
the problem.  Ensuring that it's not even loaded into RAM in the first
place is a better more encompassing solution to both issues IMHO.
Olof Johansson Oct. 1, 2021, 3:27 p.m. UTC | #50
On Fri, Oct 1, 2021 at 2:01 AM Arnd Bergmann <arnd@arndb.de> wrote:
>
> On Fri, Oct 1, 2021 at 10:19 AM Geert Uytterhoeven <geert@linux-m68k.org> wrote:
> > On Fri, Oct 1, 2021 at 7:24 AM Saravana Kannan <saravanak@google.com> wrote:
> > > GIC and arch timer. Basically the minimal kernel would need a timer
> > > for the scheduler tick and IRQ controller to get the timer IRQ and the
> > > fixed clock driver if the archtimer uses one to get its frequency and
> > > the early UART console is pointless as a module (so build it in to
> > > allow debugging/development).
> > >
> > > And then all new drivers, we should make sure are implemented as
> > > tristate drivers. And we can go back and slowly work on converting
> > > existing drivers to modules (community effort -- not one person or
> > > entity) -- at least the ones where the author has hardware or ones
> > > where the change is very likely to be correct and someone else is
> > > willing to test it. We'll never be able to support some/all ARM32 (do
> > > they even have a GIC/arch timer standard?), but at least for ARM64,
> > > this seems like a viable goal.
> >
> > Cortex-A7/A15 and later have GIC and architectured timer, so it should
> > work for contemporary systems.
> > Cortex-A9 systems may have GIC, and TWD and/or Global Timer (but I've
> > seen SoCs where the interrupt for the latter was not wired :-(.
>
> There are a number of well-known examples even with 64-bit chips or
> Cortex-A7/A15 based SoCs that can't use the architected timer,
> irqchip or iommu.
>
> Apple M1, Broadcom BCM283x, Samsung Exynos5 and
> some Hisilicon server parts come to mind, I'm sure there
> are more.

There's also more and more movement towards having coprocessors with
standardized interfaces dealing with this functionality. We're
currently at the point where they have coprocessors with
non-standardized interfaces, and it's useful to keep encouraging
convergence in this area to everybody's benefit. I don't find it
particularly useful to make life easier for the custom solutions at
the expense of others like this patchset does, when that's (just
beyond? on?) the horizon.

> > What are the plans for other architectures?
> > I've seen similar patches being applied for e.g. MIPS.
>
> There is some work in the more actively maintained MIPS
> platforms to make those behave more like Arm/powerpc/riscv/m68k
> platforms, using a single image and moving drivers into modules.
> Most MIPS platforms seem unlikely to get updated to this,
> and will continue to require a SoC specific kernel binary forever,
> similar to the renesas superh platforms. Most of the less
> common architectures (arc, csky, hexagon, nios2, xtensa,
> microblaze, nds32, openrisc, sparc/leon) are way behind that
> though, and generally don't work at all without out-of-tree
> code.

One of the arguments for needing some of these core drivers in-kernel
is that some platforms boot at very conservative DVFS operating
points, to a degree that you really want to turn up the CPU clocks
fairly early during boot.

If you don't have the drivers built-in, you can't do that and/or you
create possible fragile or awkward inter-module dependencies with
deferred probing, etc. We do care about boot time enough to prefer to
just build them in for this reason.

If vmlinux binary size is a concern, maybe it's time to consider
splitting the drivers into a bare-minimum piece that's not a module
for early setup, and the rest that can be loaded post-boot.


-Olof
Olof Johansson Oct. 1, 2021, 3:43 p.m. UTC | #51
On Fri, Oct 1, 2021 at 5:31 AM Lee Jones <lee.jones@linaro.org> wrote:
>
> On Fri, 01 Oct 2021, Arnd Bergmann wrote:
> > The vmlinux file is clearly too big and includes too much stuff that should
> > be in loadable modules
>
> This for me is the crux of the matter.
>
> The ability to replace modules was only brought to light as an "and
> also, this is possible".  However in retrospect, given the attention
> this has received, it probably shouldn't have even mentioned, as it's
> not that important.

Too late, unfortunately.

I would actually argue that given the benefit of needing more vendor
engagement upstream for GKI to be smooth, it's in our interest to
welcome those engagements and make the most of it and help vendors get
there, and it's against those interests to make it easier to be
out-of-tree if it comes at the expense of our in-tree users and
maintainers which this does.

> We should focus on the benefits of making parts of the kernel modular
> if technically possible.  The most prominent of those is core binary
> size, since this has a direct impact on boot-time and RAM usage.

The way forward here should be to focus on the problem that needs to
be solved (vmlinux size) and not overly fixate on whether this
patchset is what needs to be merged to reach there, given the
downsides observed. I'm not saying let's not improve vmlinux binary
size, but this particular approach isn't appealing.

> Reclaiming dead code after boot is certainly one way to tackle part of
> the problem.  Ensuring that it's not even loaded into RAM in the first
> place is a better more encompassing solution to both issues IMHO.

See my reply to Arnd; the reason some of these drivers aren't modules
today is because they are needed during early boot to bring the
platform to a stable operating point. Work on fixing the binary size
is terrific, but this approach seems to be shortsighted and it's been
done in a way that rubs the maintainers the wrong way.


-Olof
Olof Johansson Oct. 1, 2021, 3:59 p.m. UTC | #52
On Fri, Oct 1, 2021 at 4:59 AM Geert Uytterhoeven <geert@linux-m68k.org> wrote:
>
> Hi Olof,
>
> On Fri, Oct 1, 2021 at 7:36 AM Olof Johansson <olof@lixom.net> wrote:
> > A much more valuable approach would be to work towards being able to
> > free up memory by un-probed drivers at the end of boot. That would
> > possibly benefit all platforms on all architectures.
>
> We used to have such a functionality in arch/ppc (not arch/powerpc!),
> where code/data could be tagged __prep, __chrp, or __pmac, to put it
> in a special section, and to be freed with initdata when unused.  It
> was removed in v2.6.15[1], as the savings weren't worth the hassle.
> In a more fragmented space like arm the memory lost due to alignment
> of the sections would be even more substantial.

Yeah, the balance between per-platform code size and overall kernel
code size shifted over time to a point where it wasn't as meaningful
on ppc.

> Another problem is to know when is the end of the boot, especially
> with deferred probing.

Most of this code either has a module_init() or an initcall that
actually registers the drivers and/or probes for the platform and does
the work.

This means you can have a late equivalent hook/initcall that
determines whether this path ended up being probed/used. If it wasn't,
you can then unregister and flag the corresponding memory to be freed
at the end, and would take out the heuristics and guessing on needing
to do it automatically from the code path that's doing said freeing.


-Olof
William McVicker Oct. 1, 2021, 4:51 p.m. UTC | #53
On Fri, Oct 1, 2021 at 9:00 AM Olof Johansson <olof@lixom.net> wrote:
>
> On Fri, Oct 1, 2021 at 4:59 AM Geert Uytterhoeven <geert@linux-m68k.org> wrote:
> >
> > Hi Olof,
> >
> > On Fri, Oct 1, 2021 at 7:36 AM Olof Johansson <olof@lixom.net> wrote:
> > > A much more valuable approach would be to work towards being able to
> > > free up memory by un-probed drivers at the end of boot. That would
> > > possibly benefit all platforms on all architectures.
> >
> > We used to have such a functionality in arch/ppc (not arch/powerpc!),
> > where code/data could be tagged __prep, __chrp, or __pmac, to put it
> > in a special section, and to be freed with initdata when unused.  It
> > was removed in v2.6.15[1], as the savings weren't worth the hassle.
> > In a more fragmented space like arm the memory lost due to alignment
> > of the sections would be even more substantial.
>
> Yeah, the balance between per-platform code size and overall kernel
> code size shifted over time to a point where it wasn't as meaningful
> on ppc.
>
> > Another problem is to know when is the end of the boot, especially
> > with deferred probing.
>
> Most of this code either has a module_init() or an initcall that
> actually registers the drivers and/or probes for the platform and does
> the work.
>
> This means you can have a late equivalent hook/initcall that
> determines whether this path ended up being probed/used. If it wasn't,
> you can then unregister and flag the corresponding memory to be freed
> at the end, and would take out the heuristics and guessing on needing
> to do it automatically from the code path that's doing said freeing.
>
>
> -Olof

First off, I appreciate the constructive conversations and I
understand the ask here. So I'd like to close the "we don't want this"
and "this isn't possible" conversation. We have already proven
downstream that it is in fact possible to modularize these drivers on
other SoCs (mentioned earlier if you missed it) and I'd like to direct
the conversation towards verifying/testing here instead of negatively
arguing about how SoC vendors aren't upstreaming their drivers. I
think everyone understands that, but unfortunately I have no control
over that even though I would love everyone to work upstream directly.

I am fine with forcing these drivers to always be enabled in some form
upstream even though it doesn't really make much sense for a generic
kernel that will run on Qualcomm, Exynos, Mediatek, (you name it) SoC
devices. I thought about how to do this yesterday and wasn't able to
come up with a proper solution that didn't always force this driver to
be a module when CONFIG_MODULES is enabled.

For example, if I do this below, then we will be forcing all builds to
use CONFIG_XXX as a module if they want just one driver as a module.

config XXX
  tristate "blah blah" if COMPILE_TEST
  default m if (ARCH_XXX && MODULES)
  default ARCH_XXX

The best I was able to come up with was this below which would allow
the driver to be a module or built-in; however, obviously it lets you
disable it in EXPERT mode.

config XXX
  tristate "blah blah" if COMPILE_TEST || EXPERT
  default ARCH_XXX

Let me know if you have a better solution that doesn't force the
driver to be a module when CONFIG_MODULES=y. Saravana did propose a
MINIMUM_ARM64_GENERIC_KERNEL config that could solve this, but that
too was shot down.

Thanks,
Will
Olof Johansson Oct. 1, 2021, 5:15 p.m. UTC | #54
On Fri, Oct 1, 2021 at 9:51 AM Will McVicker <willmcvicker@google.com> wrote:
>
> On Fri, Oct 1, 2021 at 9:00 AM Olof Johansson <olof@lixom.net> wrote:
> >
> > On Fri, Oct 1, 2021 at 4:59 AM Geert Uytterhoeven <geert@linux-m68k.org> wrote:
> > >
> > > Hi Olof,
> > >
> > > On Fri, Oct 1, 2021 at 7:36 AM Olof Johansson <olof@lixom.net> wrote:
> > > > A much more valuable approach would be to work towards being able to
> > > > free up memory by un-probed drivers at the end of boot. That would
> > > > possibly benefit all platforms on all architectures.
> > >
> > > We used to have such a functionality in arch/ppc (not arch/powerpc!),
> > > where code/data could be tagged __prep, __chrp, or __pmac, to put it
> > > in a special section, and to be freed with initdata when unused.  It
> > > was removed in v2.6.15[1], as the savings weren't worth the hassle.
> > > In a more fragmented space like arm the memory lost due to alignment
> > > of the sections would be even more substantial.
> >
> > Yeah, the balance between per-platform code size and overall kernel
> > code size shifted over time to a point where it wasn't as meaningful
> > on ppc.
> >
> > > Another problem is to know when is the end of the boot, especially
> > > with deferred probing.
> >
> > Most of this code either has a module_init() or an initcall that
> > actually registers the drivers and/or probes for the platform and does
> > the work.
> >
> > This means you can have a late equivalent hook/initcall that
> > determines whether this path ended up being probed/used. If it wasn't,
> > you can then unregister and flag the corresponding memory to be freed
> > at the end, and would take out the heuristics and guessing on needing
> > to do it automatically from the code path that's doing said freeing.
> >
> >
> > -Olof
>
> First off, I appreciate the constructive conversations and I
> understand the ask here. So I'd like to close the "we don't want this"
> and "this isn't possible" conversation. We have already proven
> downstream that it is in fact possible to modularize these drivers on
> other SoCs (mentioned earlier if you missed it) and I'd like to direct
> the conversation towards verifying/testing here instead of negatively
> arguing about how SoC vendors aren't upstreaming their drivers. I
> think everyone understands that, but unfortunately I have no control
> over that even though I would love everyone to work upstream directly.
>
> I am fine with forcing these drivers to always be enabled in some form
> upstream even though it doesn't really make much sense for a generic
> kernel that will run on Qualcomm, Exynos, Mediatek, (you name it) SoC
> devices. I thought about how to do this yesterday and wasn't able to
> come up with a proper solution that didn't always force this driver to
> be a module when CONFIG_MODULES is enabled.

This line of reasoning: "I couldn't think of a better option" made us
merge a userspace ABI some time ago that within a few months was
replaced with a better solution. In that case it was the kernel
headers bundling with a build (extending the vmlinux size by a lot),
that seemed to have no concerns about binary growth. Not all that far
after that went in, the BPF folks came up with a solid solution for
CO-RE by introducing BTF, etc.

So, the argument "I can't think of a better solution" is a local
maxima that we shouldn't settle for if there's a likely better global
maxima available with a bit more time/effort. If we say "this problem
is worth solving but this doesn't seem to be the solution we want to
go for" we might actually be better off long-term.


-Olof
William McVicker Oct. 1, 2021, 5:48 p.m. UTC | #55
On Fri, Oct 1, 2021 at 10:16 AM Olof Johansson <olof@lixom.net> wrote:
>
> On Fri, Oct 1, 2021 at 9:51 AM Will McVicker <willmcvicker@google.com> wrote:
> >
> > On Fri, Oct 1, 2021 at 9:00 AM Olof Johansson <olof@lixom.net> wrote:
> > >
> > > On Fri, Oct 1, 2021 at 4:59 AM Geert Uytterhoeven <geert@linux-m68k.org> wrote:
> > > >
> > > > Hi Olof,
> > > >
> > > > On Fri, Oct 1, 2021 at 7:36 AM Olof Johansson <olof@lixom.net> wrote:
> > > > > A much more valuable approach would be to work towards being able to
> > > > > free up memory by un-probed drivers at the end of boot. That would
> > > > > possibly benefit all platforms on all architectures.
> > > >
> > > > We used to have such a functionality in arch/ppc (not arch/powerpc!),
> > > > where code/data could be tagged __prep, __chrp, or __pmac, to put it
> > > > in a special section, and to be freed with initdata when unused.  It
> > > > was removed in v2.6.15[1], as the savings weren't worth the hassle.
> > > > In a more fragmented space like arm the memory lost due to alignment
> > > > of the sections would be even more substantial.
> > >
> > > Yeah, the balance between per-platform code size and overall kernel
> > > code size shifted over time to a point where it wasn't as meaningful
> > > on ppc.
> > >
> > > > Another problem is to know when is the end of the boot, especially
> > > > with deferred probing.
> > >
> > > Most of this code either has a module_init() or an initcall that
> > > actually registers the drivers and/or probes for the platform and does
> > > the work.
> > >
> > > This means you can have a late equivalent hook/initcall that
> > > determines whether this path ended up being probed/used. If it wasn't,
> > > you can then unregister and flag the corresponding memory to be freed
> > > at the end, and would take out the heuristics and guessing on needing
> > > to do it automatically from the code path that's doing said freeing.
> > >
> > >
> > > -Olof
> >
> > First off, I appreciate the constructive conversations and I
> > understand the ask here. So I'd like to close the "we don't want this"
> > and "this isn't possible" conversation. We have already proven
> > downstream that it is in fact possible to modularize these drivers on
> > other SoCs (mentioned earlier if you missed it) and I'd like to direct
> > the conversation towards verifying/testing here instead of negatively
> > arguing about how SoC vendors aren't upstreaming their drivers. I
> > think everyone understands that, but unfortunately I have no control
> > over that even though I would love everyone to work upstream directly.
> >
> > I am fine with forcing these drivers to always be enabled in some form
> > upstream even though it doesn't really make much sense for a generic
> > kernel that will run on Qualcomm, Exynos, Mediatek, (you name it) SoC
> > devices. I thought about how to do this yesterday and wasn't able to
> > come up with a proper solution that didn't always force this driver to
> > be a module when CONFIG_MODULES is enabled.
>
> This line of reasoning: "I couldn't think of a better option" made us
> merge a userspace ABI some time ago that within a few months was
> replaced with a better solution. In that case it was the kernel
> headers bundling with a build (extending the vmlinux size by a lot),
> that seemed to have no concerns about binary growth. Not all that far
> after that went in, the BPF folks came up with a solid solution for
> CO-RE by introducing BTF, etc.
>
> So, the argument "I can't think of a better solution" is a local
> maxima that we shouldn't settle for if there's a likely better global
> maxima available with a bit more time/effort. If we say "this problem
> is worth solving but this doesn't seem to be the solution we want to
> go for" we might actually be better off long-term.
>
>
> -Olof

If the answer is, "we don't have a solution for that" then that's
fine. I'm not an expert at Kconfig configurations and am asking if
there is already a way to handle this. To me it sounded like there
might be a solution already due to this policy of "we don't allow
disabling drivers that are essential." I'll wait for Krysztof to
respond (or someone who has a potential solution) before I dig into
this deeper.

--Will
Saravana Kannan Oct. 1, 2021, 7:26 p.m. UTC | #56
On Fri, Oct 1, 2021 at 8:27 AM Olof Johansson <olof@lixom.net> wrote:
>
> On Fri, Oct 1, 2021 at 2:01 AM Arnd Bergmann <arnd@arndb.de> wrote:
> >
> > On Fri, Oct 1, 2021 at 10:19 AM Geert Uytterhoeven <geert@linux-m68k.org> wrote:
> > > On Fri, Oct 1, 2021 at 7:24 AM Saravana Kannan <saravanak@google.com> wrote:
> > > > GIC and arch timer. Basically the minimal kernel would need a timer
> > > > for the scheduler tick and IRQ controller to get the timer IRQ and the
> > > > fixed clock driver if the archtimer uses one to get its frequency and
> > > > the early UART console is pointless as a module (so build it in to
> > > > allow debugging/development).
> > > >
> > > > And then all new drivers, we should make sure are implemented as
> > > > tristate drivers. And we can go back and slowly work on converting
> > > > existing drivers to modules (community effort -- not one person or
> > > > entity) -- at least the ones where the author has hardware or ones
> > > > where the change is very likely to be correct and someone else is
> > > > willing to test it. We'll never be able to support some/all ARM32 (do
> > > > they even have a GIC/arch timer standard?), but at least for ARM64,
> > > > this seems like a viable goal.
> > >
> > > Cortex-A7/A15 and later have GIC and architectured timer, so it should
> > > work for contemporary systems.
> > > Cortex-A9 systems may have GIC, and TWD and/or Global Timer (but I've
> > > seen SoCs where the interrupt for the latter was not wired :-(.
> >
> > There are a number of well-known examples even with 64-bit chips or
> > Cortex-A7/A15 based SoCs that can't use the architected timer,
> > irqchip or iommu.
> >
> > Apple M1, Broadcom BCM283x, Samsung Exynos5 and
> > some Hisilicon server parts come to mind, I'm sure there
> > are more.
>
> There's also more and more movement towards having coprocessors with
> standardized interfaces dealing with this functionality. We're
> currently at the point where they have coprocessors with
> non-standardized interfaces, and it's useful to keep encouraging
> convergence in this area to everybody's benefit. I don't find it
> particularly useful to make life easier for the custom solutions at
> the expense of others like this patchset does, when that's (just
> beyond? on?) the horizon.
>
> > > What are the plans for other architectures?
> > > I've seen similar patches being applied for e.g. MIPS.
> >
> > There is some work in the more actively maintained MIPS
> > platforms to make those behave more like Arm/powerpc/riscv/m68k
> > platforms, using a single image and moving drivers into modules.
> > Most MIPS platforms seem unlikely to get updated to this,
> > and will continue to require a SoC specific kernel binary forever,
> > similar to the renesas superh platforms. Most of the less
> > common architectures (arc, csky, hexagon, nios2, xtensa,
> > microblaze, nds32, openrisc, sparc/leon) are way behind that
> > though, and generally don't work at all without out-of-tree
> > code.
>
> One of the arguments for needing some of these core drivers in-kernel
> is that some platforms boot at very conservative DVFS operating
> points, to a degree that you really want to turn up the CPU clocks
> fairly early during boot.
>
> If you don't have the drivers built-in, you can't do that and/or you
> create possible fragile or awkward inter-module dependencies with
> deferred probing, etc. We do care about boot time enough to prefer to
> just build them in for this reason.

Go look at a Pixel 5, we got this working just fine with all these
drivers as modules and we definitely care about boot time. You just
need to load your CPU freq driver and the other ones it needs early
on. And with fw_devlink=on (default in upstream), there's hardly any
deferred probing.

> If vmlinux binary size is a concern, maybe it's time to consider
> splitting the drivers into a bare-minimum piece that's not a module
> for early setup, and the rest that can be loaded post-boot.

Isn't this literally what I was suggesting with my
ARM64_MINIMAL_GENERIC_KERNEL suggestion? Build in all the bare minimum
drivers that are needed before module loading can happen? You'd just
select them all under that config. And any existing platform that
wants to use it would break up their drivers into modules and switch
to it.

-Saravana
Tomasz Figa Oct. 2, 2021, 1:47 a.m. UTC | #57
2021年10月2日(土) 4:26 Saravana Kannan <saravanak@google.com>:
>
> On Fri, Oct 1, 2021 at 8:27 AM Olof Johansson <olof@lixom.net> wrote:
> >
> > On Fri, Oct 1, 2021 at 2:01 AM Arnd Bergmann <arnd@arndb.de> wrote:
> > >
> > > On Fri, Oct 1, 2021 at 10:19 AM Geert Uytterhoeven <geert@linux-m68k.org> wrote:
> > > > On Fri, Oct 1, 2021 at 7:24 AM Saravana Kannan <saravanak@google.com> wrote:
> > > > > GIC and arch timer. Basically the minimal kernel would need a timer
> > > > > for the scheduler tick and IRQ controller to get the timer IRQ and the
> > > > > fixed clock driver if the archtimer uses one to get its frequency and
> > > > > the early UART console is pointless as a module (so build it in to
> > > > > allow debugging/development).
> > > > >
> > > > > And then all new drivers, we should make sure are implemented as
> > > > > tristate drivers. And we can go back and slowly work on converting
> > > > > existing drivers to modules (community effort -- not one person or
> > > > > entity) -- at least the ones where the author has hardware or ones
> > > > > where the change is very likely to be correct and someone else is
> > > > > willing to test it. We'll never be able to support some/all ARM32 (do
> > > > > they even have a GIC/arch timer standard?), but at least for ARM64,
> > > > > this seems like a viable goal.
> > > >
> > > > Cortex-A7/A15 and later have GIC and architectured timer, so it should
> > > > work for contemporary systems.
> > > > Cortex-A9 systems may have GIC, and TWD and/or Global Timer (but I've
> > > > seen SoCs where the interrupt for the latter was not wired :-(.
> > >
> > > There are a number of well-known examples even with 64-bit chips or
> > > Cortex-A7/A15 based SoCs that can't use the architected timer,
> > > irqchip or iommu.
> > >
> > > Apple M1, Broadcom BCM283x, Samsung Exynos5 and
> > > some Hisilicon server parts come to mind, I'm sure there
> > > are more.
> >
> > There's also more and more movement towards having coprocessors with
> > standardized interfaces dealing with this functionality. We're
> > currently at the point where they have coprocessors with
> > non-standardized interfaces, and it's useful to keep encouraging
> > convergence in this area to everybody's benefit. I don't find it
> > particularly useful to make life easier for the custom solutions at
> > the expense of others like this patchset does, when that's (just
> > beyond? on?) the horizon.
> >
> > > > What are the plans for other architectures?
> > > > I've seen similar patches being applied for e.g. MIPS.
> > >
> > > There is some work in the more actively maintained MIPS
> > > platforms to make those behave more like Arm/powerpc/riscv/m68k
> > > platforms, using a single image and moving drivers into modules.
> > > Most MIPS platforms seem unlikely to get updated to this,
> > > and will continue to require a SoC specific kernel binary forever,
> > > similar to the renesas superh platforms. Most of the less
> > > common architectures (arc, csky, hexagon, nios2, xtensa,
> > > microblaze, nds32, openrisc, sparc/leon) are way behind that
> > > though, and generally don't work at all without out-of-tree
> > > code.
> >
> > One of the arguments for needing some of these core drivers in-kernel
> > is that some platforms boot at very conservative DVFS operating
> > points, to a degree that you really want to turn up the CPU clocks
> > fairly early during boot.
> >
> > If you don't have the drivers built-in, you can't do that and/or you
> > create possible fragile or awkward inter-module dependencies with
> > deferred probing, etc. We do care about boot time enough to prefer to
> > just build them in for this reason.
>
> Go look at a Pixel 5, we got this working just fine with all these
> drivers as modules and we definitely care about boot time. You just
> need to load your CPU freq driver and the other ones it needs early
> on. And with fw_devlink=on (default in upstream), there's hardly any
> deferred probing.
>
> > If vmlinux binary size is a concern, maybe it's time to consider
> > splitting the drivers into a bare-minimum piece that's not a module
> > for early setup, and the rest that can be loaded post-boot.
>
> Isn't this literally what I was suggesting with my
> ARM64_MINIMAL_GENERIC_KERNEL suggestion? Build in all the bare minimum
> drivers that are needed before module loading can happen? You'd just
> select them all under that config. And any existing platform that
> wants to use it would break up their drivers into modules and switch
> to it.

I think the point here is that pinctrl and clk are considered a part
of "bare minimum" for Exynos.
Olof Johansson Oct. 2, 2021, 9:03 p.m. UTC | #58
On Fri, Oct 1, 2021 at 12:26 PM Saravana Kannan <saravanak@google.com> wrote:
>
> On Fri, Oct 1, 2021 at 8:27 AM Olof Johansson <olof@lixom.net> wrote:
> >
> > On Fri, Oct 1, 2021 at 2:01 AM Arnd Bergmann <arnd@arndb.de> wrote:
> > >
> > > On Fri, Oct 1, 2021 at 10:19 AM Geert Uytterhoeven <geert@linux-m68k.org> wrote:
> > > > On Fri, Oct 1, 2021 at 7:24 AM Saravana Kannan <saravanak@google.com> wrote:
> > > > > GIC and arch timer. Basically the minimal kernel would need a timer
> > > > > for the scheduler tick and IRQ controller to get the timer IRQ and the
> > > > > fixed clock driver if the archtimer uses one to get its frequency and
> > > > > the early UART console is pointless as a module (so build it in to
> > > > > allow debugging/development).
> > > > >
> > > > > And then all new drivers, we should make sure are implemented as
> > > > > tristate drivers. And we can go back and slowly work on converting
> > > > > existing drivers to modules (community effort -- not one person or
> > > > > entity) -- at least the ones where the author has hardware or ones
> > > > > where the change is very likely to be correct and someone else is
> > > > > willing to test it. We'll never be able to support some/all ARM32 (do
> > > > > they even have a GIC/arch timer standard?), but at least for ARM64,
> > > > > this seems like a viable goal.
> > > >
> > > > Cortex-A7/A15 and later have GIC and architectured timer, so it should
> > > > work for contemporary systems.
> > > > Cortex-A9 systems may have GIC, and TWD and/or Global Timer (but I've
> > > > seen SoCs where the interrupt for the latter was not wired :-(.
> > >
> > > There are a number of well-known examples even with 64-bit chips or
> > > Cortex-A7/A15 based SoCs that can't use the architected timer,
> > > irqchip or iommu.
> > >
> > > Apple M1, Broadcom BCM283x, Samsung Exynos5 and
> > > some Hisilicon server parts come to mind, I'm sure there
> > > are more.
> >
> > There's also more and more movement towards having coprocessors with
> > standardized interfaces dealing with this functionality. We're
> > currently at the point where they have coprocessors with
> > non-standardized interfaces, and it's useful to keep encouraging
> > convergence in this area to everybody's benefit. I don't find it
> > particularly useful to make life easier for the custom solutions at
> > the expense of others like this patchset does, when that's (just
> > beyond? on?) the horizon.
> >
> > > > What are the plans for other architectures?
> > > > I've seen similar patches being applied for e.g. MIPS.
> > >
> > > There is some work in the more actively maintained MIPS
> > > platforms to make those behave more like Arm/powerpc/riscv/m68k
> > > platforms, using a single image and moving drivers into modules.
> > > Most MIPS platforms seem unlikely to get updated to this,
> > > and will continue to require a SoC specific kernel binary forever,
> > > similar to the renesas superh platforms. Most of the less
> > > common architectures (arc, csky, hexagon, nios2, xtensa,
> > > microblaze, nds32, openrisc, sparc/leon) are way behind that
> > > though, and generally don't work at all without out-of-tree
> > > code.
> >
> > One of the arguments for needing some of these core drivers in-kernel
> > is that some platforms boot at very conservative DVFS operating
> > points, to a degree that you really want to turn up the CPU clocks
> > fairly early during boot.
> >
> > If you don't have the drivers built-in, you can't do that and/or you
> > create possible fragile or awkward inter-module dependencies with
> > deferred probing, etc. We do care about boot time enough to prefer to
> > just build them in for this reason.
>
> Go look at a Pixel 5, we got this working just fine with all these
> drivers as modules and we definitely care about boot time. You just
> need to load your CPU freq driver and the other ones it needs early
> on. And with fw_devlink=on (default in upstream), there's hardly any
> deferred probing.

Unfortunately these problems are usually easier to fix on new
platforms, especially during new product development. The hard part is
making sure you haven't regressed any of the legacy platforms when
you're changing the implementation for them as well.


> > If vmlinux binary size is a concern, maybe it's time to consider
> > splitting the drivers into a bare-minimum piece that's not a module
> > for early setup, and the rest that can be loaded post-boot.
>
> Isn't this literally what I was suggesting with my
> ARM64_MINIMAL_GENERIC_KERNEL suggestion? Build in all the bare minimum
> drivers that are needed before module loading can happen? You'd just
> select them all under that config. And any existing platform that
> wants to use it would break up their drivers into modules and switch
> to it.

Do you understand the implications of your proposal on the complexity
for those who care about making sure different config combinations
keep building and working, i.e. both the minimal-generic-kernel and
the regular generic version? You've doubled the testing workload for
all of those folks. It's a different mindset from when you mostly need
to care about your one config and platform.


-Olof