[v3] slof/fs/packages/disk-label.fs: improve checking for DOS boot partitions

Message ID 20240327054127.633598-1-kconsul@linux.ibm.com
State New

Commit Message

Kautuk Consul March 27, 2024, 5:41 a.m. UTC
While testing a qcow2 image with a DOS boot partition, it was found that
setting logical_block_size in the guest XML to a value greater than 512
makes the boot fail in the following interminable loop:
<SNIP>
Trying to load:  from: /pci@800000020000000/scsi@3 ... virtioblk_transfer: Access beyond end of device!
virtioblk_transfer: Access beyond end of device!
virtioblk_transfer: Access beyond end of device!
virtioblk_transfer: Access beyond end of device!
virtioblk_transfer: Access beyond end of device!
virtioblk_transfer: Access beyond end of device!
virtioblk_transfer: Access beyond end of device!
virtioblk_transfer: Access beyond end of device!
virtioblk_transfer: Access beyond end of device!
virtioblk_transfer: Access beyond end of device!
virtioblk_transfer: Access beyond end of device!
virtioblk_transfer: Access beyond end of device!
virtioblk_transfer: Access beyond end of device!
virtioblk_transfer: Access beyond end of device!
virtioblk_transfer: Access beyond end of device!
virtioblk_transfer: Access beyond end of device!
virtioblk_transfer: Access beyond end of device!
</SNIP>

Change the count-dos-logical-partitions Forth word and the words that
call it to check for this access-beyond-end-of-device error.

With these changes, the boot fails cleanly with the correct error
message, as follows:
<SNIP>
Trying to load:  from: /pci@800000020000000/scsi@3 ... virtioblk_transfer: Access beyond end of device!
virtioblk_transfer: Access beyond end of device!
virtioblk_transfer: Access beyond end of device!
virtioblk_transfer: Access beyond end of device!
virtioblk_transfer: Access beyond end of device!
virtioblk_transfer: Access beyond end of device!
virtioblk_transfer: Access beyond end of device!
virtioblk_transfer: Access beyond end of device!
virtioblk_transfer: Access beyond end of device!
virtioblk_transfer: Access beyond end of device!

E3404: Not a bootable device!

E3407: Load failed

  Type 'boot' and press return to continue booting the system.
  Type 'reset-all' and press return to reboot the system.

Ready!
0 >
</SNIP>

Signed-off-by: Kautuk Consul <kconsul@linux.ibm.com>
---
 slof/fs/packages/disk-label.fs | 16 ++++++++++++----
 1 file changed, 12 insertions(+), 4 deletions(-)

Comments

Thomas Huth March 27, 2024, 7:45 a.m. UTC | #1
On 27/03/2024 06.41, Kautuk Consul wrote:
> [...]

Reviewed-by: Thomas Huth <thuth@redhat.com>
Kautuk Consul March 27, 2024, 7:49 a.m. UTC | #2
Hi Thomas/Alexey,

On 2024-03-27 08:45:36, Thomas Huth wrote:
> On 27/03/2024 06.41, Kautuk Consul wrote:
> > [...]
> 
> Reviewed-by: Thomas Huth <thuth@redhat.com>

Thanks, Thomas. Alexey, could you also review this patch and, if
possible, include it in the next release?
Segher Boessenkool March 27, 2024, 1:43 p.m. UTC | #3
Hi!

On Wed, Mar 27, 2024 at 01:41:27AM -0400, Kautuk Consul wrote:
> -\ read sector to array "block"
> -: read-sector ( sector-number -- )
> +\ read sector to array "block" and return actual bytes read
> +: read-sector-ret ( sector-number -- actual-bytes )

What does "-ret" mean?  The name could be clearer.

Why factor it like this, anyway?  Shouldn't "read" always read exactly
the number of bytes it is asked to?  So, "read-sector" should always
read exactly one sector, never more, never less.

If an exception happens you can (should!) throw an exception.  Which
you can then catch at a pretty high level.
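
A minimal sketch of that idea in standard Forth, not the actual SLOF
code: the pretend device, the buffer name sector-buf (standing in for
"block"), the sizes, and the throw code err-short-read are all
assumptions made for illustration. read-sector keeps its original stack
effect and throws when fewer than block-size bytes come back.

```forth
\ Hypothetical stand-ins; none of these definitions are SLOF code.
512 CONSTANT block-size             \ assumed sector size in bytes
8   CONSTANT #sectors               \ assumed size of the pretend device
1   CONSTANT err-short-read         \ assumed application throw code
CREATE sector-buf block-size ALLOT  \ plays the role of "block"

\ Pretend device read: block-size on success, 0 beyond the end.
: dev-read ( sector-number -- actual-bytes )
   DUP 0 #sectors WITHIN IF         \ true for 0 <= sector < #sectors
      sector-buf block-size ROT FILL   \ fill buffer with the sector number
      block-size
   ELSE
      DROP 0
   THEN ;

\ Reads exactly one sector or throws; it never returns a byte count.
: read-sector ( sector-number -- )
   dev-read block-size <> IF err-short-read THROW THEN ;

\ CATCH leaves 0 when nothing was thrown, else the throw code.
: try-read ( sector-number -- throw-code )
   ['] read-sector CATCH DUP IF NIP THEN ;
```

With these definitions, "3 try-read" leaves 0 on the stack and
"99 try-read" leaves err-short-read, so callers of read-sector need no
byte-count checks of their own.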


Segher
Kautuk Consul March 28, 2024, 4:39 a.m. UTC | #4
Hi!

On 2024-03-27 08:43:25, Segher Boessenkool wrote:
> Hi!
> 
> On Wed, Mar 27, 2024 at 01:41:27AM -0400, Kautuk Consul wrote:
> > -\ read sector to array "block"
> > -: read-sector ( sector-number -- )
> > +\ read sector to array "block" and return actual bytes read
> > +: read-sector-ret ( sector-number -- actual-bytes )
> 
> What does "-ret" mean?  The name could be clearer.
> 
> Why factor it like this, anyway?  Shouldn't "read" always read exactly
> the number of bytes it is asked to?  So, "read-sector" should always
> read exactly one sector, never more, never less.
Okay, I just wanted to return the number of bytes actually read from
that one sector so that the words calling read-sector could check it.

> 
> If an exception happens you can (should!) throw an exception.  Which
> you can then catch at a pretty high level.
Ah, correct. Thanks for the suggestion! I think I will now try to throw
an exception from read-sector if all the code-paths imply that a "catch"
is in progress. I will try to make some change like this and send out a
v4 whenever I have time.
> 
> 
> Segher
Segher Boessenkool March 28, 2024, 10:47 a.m. UTC | #5
On Thu, Mar 28, 2024 at 10:09:20AM +0530, Kautuk Consul wrote:
> On 2024-03-27 08:43:25, Segher Boessenkool wrote:
> > If an exception happens you can (should!) throw an exception.  Which
> > you can then catch at a pretty high level.
> Ah, correct. Thanks for the suggestion! I think I will now try to throw
> an exception from read-sector if all the code-paths imply that a "catch"
> is in progress.

Don't try to detect something is trying to catch things.  Just throw!
Always *something* will catch things (the outer interpreter, if nothing
else), anyway.  In SLOF this is very explicit:

: quit
  BEGIN
    0 rdepth!    \ clear nesting stack
    [            \ switch to interpretation state
    terminal     \ all input and output not redirected
    BEGIN
      depth . [char] > emit space  \ output prompt
      refill WHILE
      space
      ['] interpret catch          \ that is all the default throw/catch
                                   \ there is!  no special casing needed
      dup print-status             \ "ok" or "aborted" or abort" string
    REPEAT
  AGAIN ;

The whole programming model is that you can blindly throw a fatal error
whenever one happens.  You cannot deal with it anyway, it is fatal!
That is 98% or so of the exceptions you'll ever see.  Very sometimes it
is used for non-local control flow.  That has its place, but please
don't overuse that :-)
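
To illustrate the point about throwing blindly, here is a small sketch
in standard Forth. It is not taken from disk-label.fs: the words
read-ebr, count-parts and open-label, the value bad-sector, and the
throw code err-io are hypothetical. The middle layer has no error
handling at all; a single CATCH at the top turns any failure into a
false result.

```forth
\ Hypothetical sketch; the word names are illustrative, not SLOF's.
2 CONSTANT err-io                   \ assumed application throw code
3 VALUE bad-sector                  \ sector at which the pretend read fails

: read-ebr ( n -- )                 \ pretend read: throws at bad-sector
   bad-sector = IF err-io THROW THEN ;

: count-parts ( -- count )          \ middle layer: no error checks
   0  5 0 DO  I read-ebr  1+  LOOP ;

: open-label ( -- flag )            \ the single high-level catch
   ['] count-parts CATCH IF FALSE ELSE DROP TRUE THEN ;
```

As written, open-label returns false, because the throw from read-ebr
unwinds out of the DO loop straight to the CATCH. After
"99 TO bad-sector" no read fails and open-label returns true.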


Segher
Kautuk Consul March 28, 2024, 11:02 a.m. UTC | #6
Hi,

On 2024-03-28 05:47:25, Segher Boessenkool wrote:
> On Thu, Mar 28, 2024 at 10:09:20AM +0530, Kautuk Consul wrote:
> > On 2024-03-27 08:43:25, Segher Boessenkool wrote:
> > > If an exception happens you can (should!) throw an exception.  Which
> > > you can then catch at a pretty high level.
> > Ah, correct. Thanks for the suggestion! I think I will now try to throw
> > an exception from read-sector if all the code-paths imply that a "catch"
> > is in progress.
> 
> Don't try to detect something is trying to catch things.  Just throw!
> Always *something* will catch things (the outer interpreter, if nothing
> else), anyway.  In SLOF this is very explicit:
> 
> : quit
>   BEGIN
>     0 rdepth!    \ clear nesting stack
>     [            \ switch to interpretation state
>     terminal     \ all input and output not redirected
>     BEGIN
>       depth . [char] > emit space  \ output prompt
>       refill WHILE
>       space
>       ['] interpret catch          \ that is all the default throw/catch
>                                    \ there is!  no special casing needed
>       dup print-status             \ "ok" or "aborted" or abort" string
>     REPEAT
>   AGAIN ;
> 
> The whole programming model is that you can blindly throw a fatal error
> whenever one happens.  You cannot deal with it anyway, it is fatal!
> That is 98% or so of the exceptions you'll ever see.  Very sometimes it
> is used for non-local control flow.  That has its place, but please
> don't overuse that :-)

Okay, in the v4 I just sent I added a CATCH in the open method of
disk-label.fs to make sure that something catches this throw. Could you
please check that and tell me whether I need to remove that CATCH? My
idea was that open might need an appropriate CATCH for this.
> 
> 
> Segher
Patch

diff --git a/slof/fs/packages/disk-label.fs b/slof/fs/packages/disk-label.fs
index 661c6b0..fa15982 100644
--- a/slof/fs/packages/disk-label.fs
+++ b/slof/fs/packages/disk-label.fs
@@ -132,11 +132,16 @@  CONSTANT /gpt-part-entry
    debug-disk-label? IF dup ." actual=" .d cr THEN
 ;
 
-\ read sector to array "block"
-: read-sector ( sector-number -- )
+\ read sector to array "block" and return actual bytes read
+: read-sector-ret ( sector-number -- actual-bytes )
    \ block-size is 0x200 on disks, 0x800 on cdrom drives
    block-size * 0 seek drop      \ seek to sector
-   block block-size read drop    \ read sector
+   block block-size read    \ read sector
+;
+
+\ read sector to array "block"
+: read-sector ( sector-number -- )
+   read-sector-ret drop
 ;
 
 : (.part-entry) ( part-entry )
@@ -204,7 +209,8 @@  CONSTANT /gpt-part-entry
          part-entry>sector-offset l@-le    ( current sector )
          dup to part-start to lpart-start  ( current )
          BEGIN
-            part-start read-sector          \ read EBR
+            part-start read-sector-ret          \ read EBR
+            block-size < IF UNLOOP drop 0 EXIT THEN
             1 partition>start-sector IF
                \ ." Logical Partition found at " part-start .d cr
                1+
@@ -279,6 +285,7 @@  CONSTANT /gpt-part-entry
    THEN
 
    count-dos-logical-partitions TO dos-logical-partitions
+   dos-logical-partitions 0= IF false EXIT THEN
 
    debug-disk-label? IF
       ." Found " dos-logical-partitions .d ." logical partitions" cr
@@ -352,6 +359,7 @@  CONSTANT /gpt-part-entry
    no-mbr? IF drop FALSE EXIT THEN  \ read MBR and check for DOS disk-label magic
 
    count-dos-logical-partitions TO dos-logical-partitions
+   dos-logical-partitions 0= IF drop 0 EXIT THEN
 
    debug-disk-label? IF
       ." Found " dos-logical-partitions .d ." logical partitions" cr