Patchwork [Qemu-ppc] Qemu boot device precedence over nvram boot-device setting

Submitter Avik Sil
Date Oct. 4, 2012, 10:55 a.m.
Message ID <506D6B20.7020508@linux.vnet.ibm.com>
Permalink /patch/189262/
State New

Comments

Avik Sil - Oct. 4, 2012, 10:55 a.m.
On 09/27/2012 03:21 PM, Gleb Natapov wrote:
> On Thu, Sep 27, 2012 at 11:33:31AM +0200, Alexander Graf wrote:
>>
>> On 27.09.2012, at 11:29, Benjamin Herrenschmidt wrote:
>>
>>> On Thu, 2012-09-27 at 14:51 +0530, Avik Sil wrote:
>>>> Hi,
>>>>
>>>> We would like to get a method to boot from devices provided in -boot
>>>> arguments in qemu when the 'boot-device' is set in nvram for pseries
>>>> machine. I mean the boot device specified in -boot should get a
>>>> precedence over the 'boot-device' specified in nvram.
>>>>
>>>> At the same time, when -boot is not provided, i.e., the default boot
>>>> order "cad" is present, the device specified in nvram 'boot-device'
>>>> should get precedence if it is set.
>>>>
>>>> What should be the elegant way to implement this requirement?
>>>> Suggestions welcome.
>>>
>>> Actually I think it's a more open question. We have essentially two
>>> things at play here:
>>>
>>> - With the new nvram model, the firmware can store a boot device
>>> reference in it, which is standard OF practice, and in fact the various
>>> distro installers are going to do just that
>>>
>>> - Qemu has its own boot order thingy via -boot, which we loosely
>>> translate as c = first bootable disk we find (actually first disk we
>>> find, we should probably make the algorithm a bit smarter), d = first
>>> cdrom we find, n = network , ... We pass that selection (boot list) down
>>> to SLOF via a device-tree property.
>>>
>>> The question is thus what precedence should we give them. I was
>>> initially thinking that an explicit qemu boot list should override the
>>> firmware nvram setting but I'm now not that sure anymore.
>>>
>>> The -boot list is at best a "blurry" indication of what type of device
>>> the user wants ... The firmware setting in nvram is precise.
>>
>> IIRC gleb had implemented a specific boot order thing. Gleb, mind to enlighten us? :)
>>
> Yes, forget about -boot. It is deprecated :) You should use bootindex
> (device property) to set boot priority. It constructs OF device path
> and passes it to firmware. There is nothing "blurry" about OF device
> path. The problem is that it works reasonably well with legacy BIOS
> since it is enough to specify device to boot from, but with EFI (OF is
> the same I guess) it is not enough to point to a device to boot from,
> but you also need to specify a file you want to boot and this is where
> bootindex approach fails. If EFI would specify default file to boot from
> firmware could have used it, but EFI specifies it only for removable media
> (what media is not removable this days, especially with virtualization?).
> We can add qemu parameter to specify file to boot, but how users should
> know the name of the file?
>
I looked at the bootindex stuff and found that when the bootindex is 
specified for the disk and cdrom it generates a string like:

"/spapr-vio-bridge/spapr-vscsi/channel@0/disk@0,1
/spapr-vio-bridge/spapr-vscsi/channel@0/disk@0,0"

Now converting/translating this to an OF device path is going to be much
trickier and might not be proper. So I propose a simple solution:
introduce a global flag that records whether an explicit -boot parameter
was provided. SLOF firmware then checks for the presence of the
"qemu,boot-device" property. The flag had to be introduced because
boot_devices defaults to "cad" instead of null and is passed to
machine->init().



Comments welcome.
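
To make the intended precedence concrete, here is a minimal C sketch of
the decision the firmware would make under this scheme. It is only an
illustration: SLOF itself is written in Forth, and the struct, the
function name and the "disk cdrom" fallback string are assumptions, not
existing QEMU or SLOF code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical inputs; these names do not come from QEMU or SLOF. */
struct boot_inputs {
    /* Value of the "qemu,boot-device" property, NULL if it is absent
     * (i.e. no explicit -boot was given on the command line). */
    const char *qemu_boot_device;
    /* Value of the nvram 'boot-device' setting, NULL if unset. */
    const char *nvram_boot_device;
};

/* Return the boot specification to use:
 * explicit -boot first, then nvram, then an assumed default. */
static const char *select_boot_source(const struct boot_inputs *in)
{
    assert(in != NULL);

    if (in->qemu_boot_device != NULL) {
        return in->qemu_boot_device;
    }
    if (in->nvram_boot_device != NULL && in->nvram_boot_device[0] != '\0') {
        return in->nvram_boot_device;
    }
    return "disk cdrom"; /* assumed fallback order */
}
```

With this shape, leaving out the property is what signals "no explicit
-boot", so no separate flag needs to be passed to the firmware.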

>>> However if we make the nvram override qemu, then it's trickier to
>>> force-boot from, let's say, a rescue CD. The user would have to stop the
>>> SLOF boot process by pressing a key then manually type something like
>>> "boot cdrom".
>>>
>>> Maybe the right approach is "in between", and is that the primary driver
>>> is the -boot argument. For each entry in the boot list, if it's "c", use
>>> the configured boot-device or fallback to the automatic guess SLOF tries
>>> to do today in absence of a boot-device. If it's "d" or "n" force it
>>> respectively to cdrom or network...
>>>
>>> I think there is no perfect solution here. What do you guys think is the
>>> less user unfriendly ?
>>
>> I think the command line should override anything user specified. So basically:
>>
>>    * user defined -boot option (or bootindex magic from Gleb)
>>    * nvram
>>    * fallback to default
>>
>>> Eventually we should try to implement some sort of interactive boot
>>> device selection in SLOF, such as SMS does on pseries, but that will
>>> take a bit of time.
>>
>> That would be en par with the bootmenu on x86 :). Please check out how x86 models these things. It could sure be interesting for pseries.
>>
>>
>> Alex
>
> --
> 			Gleb.
>
>
Regards,
Avik
Avik Sil - Oct. 4, 2012, 11:59 a.m.
>>>> I looked at the bootindex stuff and found that when the bootindex is
>>>> specified for the disk and cdrom it generates a string like:
>>>>
>>>> "/spapr-vio-bridge/spapr-vscsi/channel@0/disk@0,1
>>>> /spapr-vio-bridge/spapr-vscsi/channel@0/disk@0,0"
>>>>
>>>> Now converting/translating this to OF device path is going to be
>>>> much trickier and might not be proper. So I propose a simple
>>>> solution by introducing a global flag that checks if explicit -boot
>>>> parameter is provided or not. The presence of this parameter is
>>>> verified in SLOF firmware. The flag had to be introduced as
>>>> boot_devices defaults to "cad" instead of null and passed to
>>>> machine->init().
>>>>
>>> So you want to hack around the problem. If -boot is specified what
>>> device are you going to boot from?
>>
>> It is going to boot from the device specified in -boot as
>> default_boot_order is set to 0 in that case.
>>
> -boot has not enough verbosity to tell the device to boot from if you
> have more than one device of each type. What are you going to boot from
> if you have two disks, two NICs, etc?

Yes, -boot has this limitation and -boot is what we are currently using. 
We are extending this using the nvram boot-device property. With the 
nvram driver in place, we would be booting from boot-device. We also 
need a way from qemu to override this, where we hit this issue of the 
default boot device. And currently SLOF boots from the first disk/cdrom 
it discovers in device tree in case there are multiple disks or cdroms.
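
As a rough illustration of that limitation, the following C sketch
resolves a -boot string by taking the first device of each requested
type in discovery order. The types, function names and device paths are
hypothetical; this is not how SLOF is actually implemented, and letters
other than c, d and n are simply skipped here.

```c
#include <assert.h>
#include <stddef.h>

enum dev_type { DEV_DISK, DEV_CDROM, DEV_NET };

/* One device as seen in the device tree, in discovery order. */
struct dev_node {
    const char *path;
    enum dev_type type;
};

/* Map a -boot letter to a device type; returns 0 on success, -1 if the
 * letter is not handled by this sketch. */
static int boot_letter_to_type(char letter, enum dev_type *out)
{
    assert(out != NULL);
    switch (letter) {
    case 'c': *out = DEV_DISK;  return 0;
    case 'd': *out = DEV_CDROM; return 0;
    case 'n': *out = DEV_NET;   return 0;
    default:  return -1;
    }
}

/* Return the first device of the wanted type, or NULL if there is none. */
static const struct dev_node *first_of_type(const struct dev_node *nodes,
                                            size_t n, enum dev_type wanted)
{
    for (size_t i = 0; i < n; i++) {
        if (nodes[i].type == wanted) {
            return &nodes[i];
        }
    }
    return NULL;
}

/* Walk the -boot string left to right and return the first match. */
static const struct dev_node *resolve_boot_order(const char *order,
                                                 const struct dev_node *nodes,
                                                 size_t n)
{
    assert(order != NULL);
    for (; *order != '\0'; order++) {
        enum dev_type wanted;
        if (boot_letter_to_type(*order, &wanted) != 0) {
            continue;
        }
        const struct dev_node *hit = first_of_type(nodes, n, wanted);
        if (hit != NULL) {
            return hit;
        }
    }
    return NULL;
}
```

With two disks in the list, "c" always picks whichever was discovered
first; the letter alone cannot select the second one.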

Regards,
Avik

>
> --
> 			Gleb.
>
>
>
Avik Sil - Oct. 4, 2012, 12:18 p.m.
>>>> I looked at the bootindex stuff and found that when the bootindex is
>>>> specified for the disk and cdrom it generates a string like:
>>>>
>>>> "/spapr-vio-bridge/spapr-vscsi/channel@0/disk@0,1
>>>> /spapr-vio-bridge/spapr-vscsi/channel@0/disk@0,0"
>>>>
>>>> Now converting/translating this to OF device path is going to be
>>>> much trickier and might not be proper. So I propose a simple
>>>> solution by introducing a global flag that checks if explicit -boot
>>>> parameter is provided or not. The presence of this parameter is
>>>> verified in SLOF firmware. The flag had to be introduced as
>>>> boot_devices defaults to "cad" instead of null and passed to
>>>> machine->init().
>>>>
>>> So you want to hack around the problem. If -boot is specified what
>>> device are you going to boot from?
>>
>> It is going to boot from the device specified in -boot as default_boot_order is set to 0 in that case.
>
> Imagine you have 2 controllers:
>
>    * vio
>    * virtio
>
> and you specify -boot c. Which device are you going to boot from?

Currently, by default SLOF boots from the first disk it discovers in the 
device tree.

Regards,
Avik

>
>
> Alex
>
>
Avik Sil - Oct. 4, 2012, 12:35 p.m.
>>>>> So you want to hack around the problem. If -boot is specified what
>>>>> device are you going to boot from?
>>>>
>>>> It is going to boot from the device specified in -boot as default_boot_order is set to 0 in that case.
>>>
>>> Imagine you have 2 controllers:
>>>
>>>    * vio
>>>    * virtio
>>>
>>> and you specify -boot c. Which device are you going to boot from?
>>
>> Currently, by default SLOF boots from the first disk it discovers in the device tree.
>
> So you want to replace one broken scheme with another broken scheme? :)

Ha ha, actually we hit this issue in some different context with respect 
to nvram boot-device which I mentioned in [1]. The patch is a workaround 
for that issue only.

Regards,
Avik

>
>
> Alex
>
>
>
[1] http://lists.nongnu.org/archive/html/qemu-ppc/2012-10/msg00020.html
David Gibson - Oct. 5, 2012, 12:34 a.m.
On Thu, Oct 04, 2012 at 04:25:28PM +0530, Avik Sil wrote:
> On 09/27/2012 03:21 PM, Gleb Natapov wrote:
> >On Thu, Sep 27, 2012 at 11:33:31AM +0200, Alexander Graf wrote:
> >>
> >>On 27.09.2012, at 11:29, Benjamin Herrenschmidt wrote:
> >>
> >>>On Thu, 2012-09-27 at 14:51 +0530, Avik Sil wrote:
> >>>>Hi,
> >>>>
> >>>>We would like to get a method to boot from devices provided in -boot
> >>>>arguments in qemu when the 'boot-device' is set in nvram for pseries
> >>>>machine. I mean the boot device specified in -boot should get a
> >>>>precedence over the 'boot-device' specified in nvram.
> >>>>
> >>>>At the same time, when -boot is not provided, i.e., the default boot
> >>>>order "cad" is present, the device specified in nvram 'boot-device'
> >>>>should get precedence if it is set.
> >>>>
> >>>>What should be the elegant way to implement this requirement?
> >>>>Suggestions welcome.
> >>>
> >>>Actually I think it's a more open question. We have essentially two
> >>>things at play here:
> >>>
> >>>- With the new nvram model, the firmware can store a boot device
> >>>reference in it, which is standard OF practice, and in fact the various
> >>>distro installers are going to do just that
> >>>
> >>>- Qemu has its own boot order thingy via -boot, which we loosely
> >>>translate as c = first bootable disk we find (actually first disk we
> >>>find, we should probably make the algorithm a bit smarter), d = first
> >>>cdrom we find, n = network , ... We pass that selection (boot list) down
> >>>to SLOF via a device-tree property.
> >>>
> >>>The question is thus what precedence should we give them. I was
> >>>initially thinking that an explicit qemu boot list should override the
> >>>firmware nvram setting but I'm now not that sure anymore.
> >>>
> >>>The -boot list is at best a "blurry" indication of what type of device
> >>>the user wants ... The firmware setting in nvram is precise.
> >>
> >>IIRC gleb had implemented a specific boot order thing. Gleb, mind to enlighten us? :)
> >>
> >Yes, forget about -boot. It is deprecated :) You should use bootindex
> >(device property) to set boot priority. It constructs OF device path
> >and passes it to firmware. There is nothing "blurry" about OF device
> >path. The problem is that it works reasonably well with legacy BIOS
> >since it is enough to specify device to boot from, but with EFI (OF is
> >the same I guess) it is not enough to point to a device to boot from,
> >but you also need to specify a file you want to boot and this is where
> >bootindex approach fails. If EFI would specify default file to boot from
> >firmware could have used it, but EFI specifies it only for removable media
> >(what media is not removable this days, especially with virtualization?).
> >We can add qemu parameter to specify file to boot, but how users should
> >know the name of the file?
> >
> I looked at the bootindex stuff and found that when the bootindex is
> specified for the disk and cdrom it generates a string like:
> 
> "/spapr-vio-bridge/spapr-vscsi/channel@0/disk@0,1
> /spapr-vio-bridge/spapr-vscsi/channel@0/disk@0,0"

Ok, so I've just started looking at the bootindex stuff.  What
function is generating these strings?

We should also be able to get the raw bootindex values for a qdev,
yes?  I was thinking we could instead copy those values into the
device tree when we populate it.  The trouble is that we don't
actually generate (in qemu) nodes for individual disks under a vscsi,
or for individual PCI devices under the host bridge (that's done by
SLOF).  Still thinking...

An aside, I'm thinking that once we do get bootindex working, then
boot devices specified in NVRAM should have priority below all devices
with explicit supplied bootindex, but above any that don't.  Does that
seem right to you?
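
A minimal C sketch of that ordering, purely as an illustration: the
struct, enum and function names are made up for this example and do not
exist in qemu, and using discovery order as the tie-break is an
assumption.

```c
#include <assert.h>
#include <stdlib.h>

/* Where a boot candidate came from; the enum order is the priority. */
enum boot_src { SRC_BOOTINDEX, SRC_NVRAM, SRC_UNINDEXED };

struct boot_candidate {
    const char *path;
    enum boot_src src;
    int bootindex;   /* only meaningful for SRC_BOOTINDEX */
    int discovery;   /* discovery order, used as a tie-break */
};

/* qsort comparator: explicit bootindex devices first (lowest index
 * first), then NVRAM boot devices, then devices without a bootindex. */
static int cmp_candidates(const void *a, const void *b)
{
    const struct boot_candidate *x = a;
    const struct boot_candidate *y = b;

    if (x->src != y->src) {
        return (x->src < y->src) ? -1 : 1;
    }
    if (x->src == SRC_BOOTINDEX && x->bootindex != y->bootindex) {
        return (x->bootindex < y->bootindex) ? -1 : 1;
    }
    return (x->discovery > y->discovery) - (x->discovery < y->discovery);
}

/* Sort the candidate list in place into boot priority order. */
static void order_candidates(struct boot_candidate *c, size_t n)
{
    assert(c != NULL || n == 0);
    if (n > 1) {
        qsort(c, n, sizeof(*c), cmp_candidates);
    }
}
```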

> Now converting/translating this to OF device path is going to be
> much trickier and might not be proper. So I propose a simple
> solution by introducing a global flag that checks if explicit -boot
> parameter is provided or not. The presence of this parameter is
> verified in SLOF firmware. The flag had to be introduced as
> boot_devices defaults to "cad" instead of null and passed to
> machine->init().

So, personally, I think this is quite a reasonable interim measure
until we figure out how to do bootindex.  I will fold it into our
internal tree for now, even if the qemu people are going to bitch and
moan about its imperfections.  Can you send me a clean copy with
commit message, please?
Alexander Graf - Oct. 5, 2012, 12:43 a.m.
On 05.10.2012, at 02:34, David Gibson wrote:

> On Thu, Oct 04, 2012 at 04:25:28PM +0530, Avik Sil wrote:
>> On 09/27/2012 03:21 PM, Gleb Natapov wrote:
>>> On Thu, Sep 27, 2012 at 11:33:31AM +0200, Alexander Graf wrote:
>>>> 
>>>> On 27.09.2012, at 11:29, Benjamin Herrenschmidt wrote:
>>>> 
>>>>> On Thu, 2012-09-27 at 14:51 +0530, Avik Sil wrote:
>>>>>> Hi,
>>>>>> 
>>>>>> We would like to get a method to boot from devices provided in -boot
>>>>>> arguments in qemu when the 'boot-device' is set in nvram for pseries
>>>>>> machine. I mean the boot device specified in -boot should get a
>>>>>> precedence over the 'boot-device' specified in nvram.
>>>>>> 
>>>>>> At the same time, when -boot is not provided, i.e., the default boot
>>>>>> order "cad" is present, the device specified in nvram 'boot-device'
>>>>>> should get precedence if it is set.
>>>>>> 
>>>>>> What should be the elegant way to implement this requirement?
>>>>>> Suggestions welcome.
>>>>> 
>>>>> Actually I think it's a more open question. We have essentially two
>>>>> things at play here:
>>>>> 
>>>>> - With the new nvram model, the firmware can store a boot device
>>>>> reference in it, which is standard OF practice, and in fact the various
>>>>> distro installers are going to do just that
>>>>> 
>>>>> - Qemu has its own boot order thingy via -boot, which we loosely
>>>>> translate as c = first bootable disk we find (actually first disk we
>>>>> find, we should probably make the algorithm a bit smarter), d = first
>>>>> cdrom we find, n = network , ... We pass that selection (boot list) down
>>>>> to SLOF via a device-tree property.
>>>>> 
>>>>> The question is thus what precedence should we give them. I was
>>>>> initially thinking that an explicit qemu boot list should override the
>>>>> firmware nvram setting but I'm now not that sure anymore.
>>>>> 
>>>>> The -boot list is at best a "blurry" indication of what type of device
>>>>> the user wants ... The firmware setting in nvram is precise.
>>>> 
>>>> IIRC gleb had implemented a specific boot order thing. Gleb, mind to enlighten us? :)
>>>> 
>>> Yes, forget about -boot. It is deprecated :) You should use bootindex
>>> (device property) to set boot priority. It constructs OF device path
>>> and passes it to firmware. There is nothing "blurry" about OF device
>>> path. The problem is that it works reasonably well with legacy BIOS
>>> since it is enough to specify device to boot from, but with EFI (OF is
>>> the same I guess) it is not enough to point to a device to boot from,
>>> but you also need to specify a file you want to boot and this is where
>>> bootindex approach fails. If EFI would specify default file to boot from
>>> firmware could have used it, but EFI specifies it only for removable media
>>> (what media is not removable this days, especially with virtualization?).
>>> We can add qemu parameter to specify file to boot, but how users should
>>> know the name of the file?
>>> 
>> I looked at the bootindex stuff and found that when the bootindex is
>> specified for the disk and cdrom it generates a string like:
>> 
>> "/spapr-vio-bridge/spapr-vscsi/channel@0/disk@0,1
>> /spapr-vio-bridge/spapr-vscsi/channel@0/disk@0,0"
> 
> Ok, so I've just started looking at the bootindex stuff.  What
> function is generating these strings?
> 
> We should also be able to get the raw bootindex values for a qdev,
> yes?  I was thinking we could instead copy those values into the
> device tree when we populate it.  The trouble is that we don't
> actually generate (in qemu) nodes for individual disks under a vscsi,
> or for individual PCI devices under the host bridge (that's done by
> SLOF).  Still thinking...

Well. You can track it down to the device level and you know the drive index. Maybe you could be clever if you had a device property that contains the drive index and boot index to it?

> 
> An aside, I'm thinking that once we do get bootindex working, then
> boot devices specified in NVRAM should have priority below all devices
> with explicit supplied bootindex, but above any that don't.  Does that
> seem right to you?

Yes, that sounds exactly right :).

> 
>> Now converting/translating this to OF device path is going to be
>> much trickier and might not be proper. So I propose a simple
>> solution by introducing a global flag that checks if explicit -boot
>> parameter is provided or not. The presence of this parameter is
>> verified in SLOF firmware. The flag had to be introduced as
>> boot_devices defaults to "cad" instead of null and passed to
>> machine->init().
> 
> So, personally, I think this is quite a reasonable interim measure
> until we figure out how to do bootindex.  I will fold it into our
> internal tree for now, even if the qemu people are going to bitch and
> moan about its imperfections.  Can you send me a clean copy with
> commit message, please?

I actually don't remember having seen a patch at all :).


Alex
David Gibson - Oct. 5, 2012, 12:48 a.m.
On Fri, Oct 05, 2012 at 02:43:31AM +0200, Alexander Graf wrote:
> 
> On 05.10.2012, at 02:34, David Gibson wrote:
> 
> > On Thu, Oct 04, 2012 at 04:25:28PM +0530, Avik Sil wrote:
> >> On 09/27/2012 03:21 PM, Gleb Natapov wrote:
> >>> On Thu, Sep 27, 2012 at 11:33:31AM +0200, Alexander Graf wrote:
> >>>> 
> >>>> On 27.09.2012, at 11:29, Benjamin Herrenschmidt wrote:
> >>>> 
> >>>>> On Thu, 2012-09-27 at 14:51 +0530, Avik Sil wrote:
> >>>>>> Hi,
> >>>>>> 
> >>>>>> We would like to get a method to boot from devices provided in -boot
> >>>>>> arguments in qemu when the 'boot-device' is set in nvram for pseries
> >>>>>> machine. I mean the boot device specified in -boot should get a
> >>>>>> precedence over the 'boot-device' specified in nvram.
> >>>>>> 
> >>>>>> At the same time, when -boot is not provided, i.e., the default boot
> >>>>>> order "cad" is present, the device specified in nvram 'boot-device'
> >>>>>> should get precedence if it is set.
> >>>>>> 
> >>>>>> What should be the elegant way to implement this requirement?
> >>>>>> Suggestions welcome.
> >>>>> 
> >>>>> Actually I think it's a more open question. We have essentially two
> >>>>> things at play here:
> >>>>> 
> >>>>> - With the new nvram model, the firmware can store a boot device
> >>>>> reference in it, which is standard OF practice, and in fact the various
> >>>>> distro installers are going to do just that
> >>>>> 
> >>>>> - Qemu has its own boot order thingy via -boot, which we loosely
> >>>>> translate as c = first bootable disk we find (actually first disk we
> >>>>> find, we should probably make the algorithm a bit smarter), d = first
> >>>>> cdrom we find, n = network , ... We pass that selection (boot list) down
> >>>>> to SLOF via a device-tree property.
> >>>>> 
> >>>>> The question is thus what precedence should we give them. I was
> >>>>> initially thinking that an explicit qemu boot list should override the
> >>>>> firmware nvram setting but I'm now not that sure anymore.
> >>>>> 
> >>>>> The -boot list is at best a "blurry" indication of what type of device
> >>>>> the user wants ... The firmware setting in nvram is precise.
> >>>> 
> >>>> IIRC gleb had implemented a specific boot order thing. Gleb, mind to enlighten us? :)
> >>>> 
> >>> Yes, forget about -boot. It is deprecated :) You should use bootindex
> >>> (device property) to set boot priority. It constructs OF device path
> >>> and passes it to firmware. There is nothing "blurry" about OF device
> >>> path. The problem is that it works reasonably well with legacy BIOS
> >>> since it is enough to specify device to boot from, but with EFI (OF is
> >>> the same I guess) it is not enough to point to a device to boot from,
> >>> but you also need to specify a file you want to boot and this is where
> >>> bootindex approach fails. If EFI would specify default file to boot from
> >>> firmware could have used it, but EFI specifies it only for removable media
> >>> (what media is not removable this days, especially with virtualization?).
> >>> We can add qemu parameter to specify file to boot, but how users should
> >>> know the name of the file?
> >>> 
> >> I looked at the bootindex stuff and found that when the bootindex is
> >> specified for the disk and cdrom it generates a string like:
> >> 
> >> "/spapr-vio-bridge/spapr-vscsi/channel@0/disk@0,1
> >> /spapr-vio-bridge/spapr-vscsi/channel@0/disk@0,0"
> > 
> > Ok, so I've just started looking at the bootindex stuff.  What
> > function is generating these strings?
> > 
> > We should also be able to get the raw bootindex values for a qdev,
> > yes?  I was thinking we could instead copy those values into the
> > device tree when we populate it.  The trouble is that we don't
> > actually generate (in qemu) nodes for individual disks under a vscsi,
> > or for individual PCI devices under the host bridge (that's done by
> > SLOF).  Still thinking...
> 
> Well. You can track it down to the device level and you know the
> drive index. Maybe you could be clever if you had a device property
> that contains the drive index and boot index to it?

Yeah, I guess.  Working that out is a lot more complex than cases
where we have a one-to-one correspondence between qdevs and device
tree nodes.

> > An aside, I'm thinking that once we do get bootindex working, then
> > boot devices specified in NVRAM should have priority below all devices
> > with explicit supplied bootindex, but above any that don't.  Does that
> > seem right to you?
> 
> Yes, that sounds exactly right :).
> 
> >> Now converting/translating this to OF device path is going to be
> >> much trickier and might not be proper. So I propose a simple
> >> solution by introducing a global flag that checks if explicit -boot
> >> parameter is provided or not. The presence of this parameter is
> >> verified in SLOF firmware. The flag had to be introduced as
> >> boot_devices defaults to "cad" instead of null and passed to
> >> machine->init().
> > 
> > So, personally, I think this is quite a reasonable interim measure
> > until we figure out how to do bootindex.  I will fold it into our
> > internal tree for now, even if the qemu people are going to bitch and
> > moan about its imperfections.  Can you send me a clean copy with
> > commit message, please?
> 
> I actually don't remember having seen a patch at all :).

Um.. it was immediately below that in the original message.
Nikunj A Dadhania - Oct. 5, 2012, 4:45 a.m.
On Thu, 4 Oct 2012 14:37:22 +0200, Alexander Graf <agraf@suse.de> wrote:
> 
> >>>> Imagine you have 2 controllers:
> >>>> 
> >>>>   * vio
> >>>>   * virtio
> >>>> 
> >>>> and you specify -boot c. Which device are you going to boot from?
> >>> 
> >>> Currently, by default SLOF boots from the first disk it discovers in the device tree.
> >> 
> >> So you want to replace one broken scheme with another broken scheme? :)
> > 
> > Ha ha, actually we hit this issue in some different context with respect to nvram boot-device which I mentioned in [1]. The patch is a workaround for that issue only.
> 
> Seriously, just ignore -boot for now. It'd be a lot more useful to get bootindex working.

I understand your point here. Adding the bootindex feature is desirable,
I agree. However, in this particular use case we just need to know
that -boot was not provided (and that the vague default "cad" is in
effect, which could also be passed explicitly by the user); the rest
can be handled by the firmware.

Regards,
Nikunj
Nikunj A Dadhania - Oct. 5, 2012, 5:30 a.m.
On Fri, 5 Oct 2012 10:34:16 +1000, David Gibson <dwg@au1.ibm.com> wrote:
> On Thu, Oct 04, 2012 at 04:25:28PM +0530, Avik Sil wrote:
> > On 09/27/2012 03:21 PM, Gleb Natapov wrote:
> > >On Thu, Sep 27, 2012 at 11:33:31AM +0200, Alexander Graf wrote:
> > >>
> > >>On 27.09.2012, at 11:29, Benjamin Herrenschmidt wrote:
> > >>
> > >>>On Thu, 2012-09-27 at 14:51 +0530, Avik Sil wrote:
> > >>>>Hi,
> > >>>>
> > >>>>We would like to get a method to boot from devices provided in -boot
> > >>>>arguments in qemu when the 'boot-device' is set in nvram for pseries
> > >>>>machine. I mean the boot device specified in -boot should get a
> > >>>>precedence over the 'boot-device' specified in nvram.
> > >>>>
> > >>>>At the same time, when -boot is not provided, i.e., the default boot
> > >>>>order "cad" is present, the device specified in nvram 'boot-device'
> > >>>>should get precedence if it is set.
> > >>>>
> > >>>>What should be the elegant way to implement this requirement?
> > >>>>Suggestions welcome.
> > >>>
> > >>>Actually I think it's a more open question. We have essentially two
> > >>>things at play here:
> > >>>
> > >>>- With the new nvram model, the firmware can store a boot device
> > >>>reference in it, which is standard OF practice, and in fact the various
> > >>>distro installers are going to do just that
> > >>>
> > >>>- Qemu has its own boot order thingy via -boot, which we loosely
> > >>>translate as c = first bootable disk we find (actually first disk we
> > >>>find, we should probably make the algorithm a bit smarter), d = first
> > >>>cdrom we find, n = network , ... We pass that selection (boot list) down
> > >>>to SLOF via a device-tree property.
> > >>>
> > >>>The question is thus what precedence should we give them. I was
> > >>>initially thinking that an explicit qemu boot list should override the
> > >>>firmware nvram setting but I'm now not that sure anymore.
> > >>>
> > >>>The -boot list is at best a "blurry" indication of what type of device
> > >>>the user wants ... The firmware setting in nvram is precise.
> > >>
> > >>IIRC gleb had implemented a specific boot order thing. Gleb, mind to enlighten us? :)
> > >>
> > >Yes, forget about -boot. It is deprecated :) You should use bootindex
> > >(device property) to set boot priority. It constructs OF device path
> > >and passes it to firmware. There is nothing "blurry" about OF device
> > >path. The problem is that it works reasonably well with legacy BIOS
> > >since it is enough to specify device to boot from, but with EFI (OF is
> > >the same I guess) it is not enough to point to a device to boot from,
> > >but you also need to specify a file you want to boot and this is where
> > >bootindex approach fails. If EFI would specify default file to boot from
> > >firmware could have used it, but EFI specifies it only for removable media
> > >(what media is not removable this days, especially with virtualization?).
> > >We can add qemu parameter to specify file to boot, but how users should
> > >know the name of the file?
> > >
> > I looked at the bootindex stuff and found that when the bootindex is
> > specified for the disk and cdrom it generates a string like:
> > 
> > "/spapr-vio-bridge/spapr-vscsi/channel@0/disk@0,1
> > /spapr-vio-bridge/spapr-vscsi/channel@0/disk@0,0"
> 
> Ok, so I've just started looking at the bootindex stuff.  What
> function is generating these strings?

get_boot_devices_list gives you the above

> 
> We should also be able to get the raw bootindex values for a qdev,
> yes?  I was thinking we could instead copy those values into the
> device tree when we populate it.  The trouble is that we don't
> actually generate (in qemu) nodes for individual disks under a vscsi,
> or for individual PCI devices under the host bridge (that's done by
> SLOF).  Still thinking...
> 
> An aside, I'm thinking that once we do get bootindex working, then
> boot devices specified in NVRAM should have priority below all devices
> with explicit supplied bootindex, but above any that don't.  Does that
> seem right to you?

Even if bootindex is taken care of, there is still -boot that has to be
handled. Or should we just drop -boot handling? In that case, what
should we look at when there is no bootindex and nothing in nvram?

Regards
Nikunj
Avik Sil - Oct. 5, 2012, 5:44 a.m.
>>> I looked at the bootindex stuff and found that when the bootindex is
>>> specified for the disk and cdrom it generates a string like:
>>>
>>> "/spapr-vio-bridge/spapr-vscsi/channel@0/disk@0,1
>>> /spapr-vio-bridge/spapr-vscsi/channel@0/disk@0,0"
>>
>> Ok, so I've just started looking at the bootindex stuff.  What
>> function is generating these strings?
>
> get_boot_devices_list gives you the above
>
>>
>> We should also be able to get the raw bootindex values for a qdev,
>> yes?  I was thinking we could instead copy those values into the
>> device tree when we populate it.  The trouble is that we don't
>> actually generate (in qemu) nodes for individual disks under a vscsi,
>> or for individual PCI devices under the host bridge (that's done by
>> SLOF).  Still thinking...
>>
>> An aside, I'm thinking that once we do get bootindex working, then
>> boot devices specified in NVRAM should have priority below all devices
>> with explicit supplied bootindex, but above any that don't.  Does that
>> seem right to you?
>
> Even if bootindex is taken care of, there is still -boot that has to be
> handled. Or should we just drop -boot handling? In that case, what
> should we look at when there is no bootindex and nothing in nvram?

Perhaps the default boot order "disk cdrom" should be used if none of
bootindex, -boot and nvram boot-device is provided.
>
> Regards
> Nikunj
>

Regards,
Avik
Benjamin Herrenschmidt - Oct. 5, 2012, 9:12 a.m.
On Fri, 2012-10-05 at 02:43 +0200, Alexander Graf wrote:
> > We should also be able to get the raw bootindex values for a qdev,
> > yes?  I was thinking we could instead copy those values into the
> > device tree when we populate it.  The trouble is that we don't
> > actually generate (in qemu) nodes for individual disks under a
> vscsi,
> > or for individual PCI devices under the host bridge (that's done by
> > SLOF).  Still thinking...
> 
> Well. You can track it down to the device level and you know the drive
> index. Maybe you could be clever if you had a device property that
> contains the drive index and boot index to it?

You can, but it's hard... eventually we'll do it but it will take some
time. In the meantime, a patch allowing us to know whether -boot was
specified at all would be handy to make things work as expected
in the most common cases.

Avik & Nikunj are going to send one if not already...

Cheers,
Ben.
Alexander Graf - Oct. 5, 2012, 10:32 a.m.
On 05.10.2012, at 11:12, Benjamin Herrenschmidt <benh@au1.ibm.com> wrote:

> On Fri, 2012-10-05 at 02:43 +0200, Alexander Graf wrote:
>>> We should also be able to get the raw bootindex values for a qdev,
>>> yes?  I was thinking we could instead copy those values into the
>>> device tree when we populate it.  The trouble is that we don't
>>> actually generate (in qemu) nodes for individual disks under a
>> vscsi,
>>> or for individual PCI devices under the host bridge (that's done by
>>> SLOF).  Still thinking...
>> 
>> Well. You can track it down to the device level and you know the drive
>> index. Maybe you could be clever if you had a device property that
>> contains the drive index and boot index to it?
> 
> You can, but it's hard... eventually we'll do it but it will take some
> time. In the meantime, a patch allowing us to know whether -boot was
> specified at all would be handy to make things work as expected
> in the most common cases.

Sure, no objections there :). See my reply on the patch.

Alex

> 
> Avik & Nikunj are going to send one if not already...
> 
> Cheers,
> Ben.
> 
>

Patch

diff --git a/hw/spapr.c b/hw/spapr.c
index e6bf522..673bcc8 100644
--- a/hw/spapr.c
+++ b/hw/spapr.c
@@ -284,7 +284,8 @@  static void *spapr_create_fdt_skel(const char *cpu_model,

          _FDT((fdt_property(fdt, "qemu,boot-kernel", &kprop, sizeof(kprop))));
      }
-    _FDT((fdt_property_string(fdt, "qemu,boot-device", boot_device)));
+    if (!default_boot_order)
+        _FDT((fdt_property_string(fdt, "qemu,boot-device", boot_device)));
      _FDT((fdt_property_cell(fdt, "qemu,graphic-width", graphic_width)));
      _FDT((fdt_property_cell(fdt, "qemu,graphic-height", graphic_height)));
      _FDT((fdt_property_cell(fdt, "qemu,graphic-depth", graphic_depth)));
diff --git a/sysemu.h b/sysemu.h
index 65552ac..f0822b4 100644
--- a/sysemu.h
+++ b/sysemu.h
@@ -129,6 +129,7 @@  extern int no_shutdown;
  extern int semihosting_enabled;
  extern int old_param;
  extern int boot_menu;
+extern int default_boot_order;
  extern uint8_t *boot_splash_filedata;
  extern int boot_splash_filedata_size;
  extern uint8_t qemu_extra_params_fw[2];
diff --git a/vl.c b/vl.c
index 48049ef..bf369e6 100644
--- a/vl.c
+++ b/vl.c
@@ -230,6 +230,7 @@  int ctrl_grab = 0;
  unsigned int nb_prom_envs = 0;
  const char *prom_envs[MAX_PROM_ENVS];
  int boot_menu;
+int default_boot_order = 1;
  uint8_t *boot_splash_filedata;
  int boot_splash_filedata_size;
  uint8_t qemu_extra_params_fw[2];
@@ -2668,6 +2669,7 @@  int main(int argc, char **argv, char **envp)
                          qemu_opts_parse(qemu_find_opts("boot-opts"),
                                          optarg, 0);
                      }
+                    default_boot_order = 0;
                  }
                  break;
              case QEMU_OPTION_fda: