diff mbox series

[v2,04/11] tests/acceptance: add kernel record/replay test for x86_64

Message ID 159057546117.16818.15607496040935344350.stgit@pasha-ThinkPad-X280
State New
Headers show
Series Record/replay acceptance tests | expand

Commit Message

Pavel Dovgalyuk May 27, 2020, 10:31 a.m. UTC
This patch adds a test for record/replay an execution of x86_64 machine.
Execution scenario includes simple kernel boot, which allows testing
basic hardware interaction in RR mode.

Signed-off-by: Pavel Dovgalyuk <Pavel.Dovgaluk@ispras.ru>
---
 0 files changed

Comments

Philippe Mathieu-Daudé May 27, 2020, 2:53 p.m. UTC | #1
On 5/27/20 12:31 PM, Pavel Dovgalyuk wrote:
> This patch adds a test for record/replay an execution of x86_64 machine.
> Execution scenario includes simple kernel boot, which allows testing
> basic hardware interaction in RR mode.
> 
> Signed-off-by: Pavel Dovgalyuk <Pavel.Dovgaluk@ispras.ru>
> ---
>  0 files changed
> 
> diff --git a/tests/acceptance/replay_kernel.py b/tests/acceptance/replay_kernel.py
> index b8b277ad2f..c7526f1aba 100644
> --- a/tests/acceptance/replay_kernel.py
> +++ b/tests/acceptance/replay_kernel.py
> @@ -55,3 +55,19 @@ class ReplayKernel(LinuxKernelUtils):
>                      True, shift, args)
>          self.run_vm(kernel_path, kernel_command_line, console_pattern,
>                      False, shift, args)
> +
> +    def test_x86_64_pc(self):
> +        """
> +        :avocado: tags=arch:x86_64
> +        :avocado: tags=machine:pc
> +        """
> +        kernel_url = ('https://archives.fedoraproject.org/pub/archive/fedora'
> +                      '/linux/releases/29/Everything/x86_64/os/images/pxeboot'
> +                      '/vmlinuz')
> +        kernel_hash = '23bebd2680757891cf7adedb033532163a792495'
> +        kernel_path = self.fetch_asset(kernel_url, asset_hash=kernel_hash)
> +
> +        kernel_command_line = self.KERNEL_COMMON_COMMAND_LINE + 'console=ttyS0'
> +        console_pattern = 'Kernel command line: %s' % kernel_command_line
> +
> +        self.run_rr(kernel_path, kernel_command_line, console_pattern)
> 

This one timeouted (I build with --enable-debug):

 (1/1) tests/acceptance/replay_kernel.py:ReplayKernel.test_x86_64_pc:
replay: recording...
replay: replaying...
INTERRUPTED: Test interrupted by SIGTERM\nRunner error occurred: Timeout
reached\nOriginal status: ERROR\n{'name':
'1-tests/acceptance/replay_kernel.py:ReplayKernel.test_x86_64_pc',
'logdir':
'avocado/job-results/job-2020-05-27T16.48-71d7bf4/test-results/1-tes...
(90.68 s)
Alex Bennée May 27, 2020, 3:40 p.m. UTC | #2
Pavel Dovgalyuk <Pavel.Dovgaluk@gmail.com> writes:

> This patch adds a test for record/replay an execution of x86_64 machine.
> Execution scenario includes simple kernel boot, which allows testing
> basic hardware interaction in RR mode.
>
> Signed-off-by: Pavel Dovgalyuk <Pavel.Dovgaluk@ispras.ru>
> ---
>  0 files changed
>
> diff --git a/tests/acceptance/replay_kernel.py b/tests/acceptance/replay_kernel.py
> index b8b277ad2f..c7526f1aba 100644
> --- a/tests/acceptance/replay_kernel.py
> +++ b/tests/acceptance/replay_kernel.py
> @@ -55,3 +55,19 @@ class ReplayKernel(LinuxKernelUtils):
>                      True, shift, args)
>          self.run_vm(kernel_path, kernel_command_line, console_pattern,
>                      False, shift, args)
> +
> +    def test_x86_64_pc(self):
> +        """
> +        :avocado: tags=arch:x86_64
> +        :avocado: tags=machine:pc
> +        """
> +        kernel_url = ('https://archives.fedoraproject.org/pub/archive/fedora'
> +                      '/linux/releases/29/Everything/x86_64/os/images/pxeboot'
> +                      '/vmlinuz')
> +        kernel_hash = '23bebd2680757891cf7adedb033532163a792495'
> +        kernel_path = self.fetch_asset(kernel_url, asset_hash=kernel_hash)
> +
> +        kernel_command_line = self.KERNEL_COMMON_COMMAND_LINE +
> 'console=ttyS0'

I note that:

  KERNEL_COMMON_COMMAND_LINE = 'printk.time=0 '

and given we are looking for repeatability here maybe we should use our
own command line so we can compare the recorded and replayed boot?

> +        console_pattern = 'Kernel command line: %s' % kernel_command_line
> +
> +        self.run_rr(kernel_path, kernel_command_line,
> console_pattern)
Alex Bennée May 27, 2020, 3:41 p.m. UTC | #3
Pavel Dovgalyuk <Pavel.Dovgaluk@gmail.com> writes:

> This patch adds a test for record/replay an execution of x86_64 machine.
> Execution scenario includes simple kernel boot, which allows testing
> basic hardware interaction in RR mode.
>
> Signed-off-by: Pavel Dovgalyuk <Pavel.Dovgaluk@ispras.ru>
> ---
>  0 files changed
>
> diff --git a/tests/acceptance/replay_kernel.py b/tests/acceptance/replay_kernel.py
> index b8b277ad2f..c7526f1aba 100644
> --- a/tests/acceptance/replay_kernel.py
> +++ b/tests/acceptance/replay_kernel.py
> @@ -55,3 +55,19 @@ class ReplayKernel(LinuxKernelUtils):
>                      True, shift, args)
>          self.run_vm(kernel_path, kernel_command_line, console_pattern,
>                      False, shift, args)
> +
> +    def test_x86_64_pc(self):
> +        """
> +        :avocado: tags=arch:x86_64
> +        :avocado: tags=machine:pc
> +        """
> +        kernel_url = ('https://archives.fedoraproject.org/pub/archive/fedora'
> +                      '/linux/releases/29/Everything/x86_64/os/images/pxeboot'
> +                      '/vmlinuz')
> +        kernel_hash = '23bebd2680757891cf7adedb033532163a792495'
> +        kernel_path = self.fetch_asset(kernel_url, asset_hash=kernel_hash)
> +
> +        kernel_command_line = self.KERNEL_COMMON_COMMAND_LINE + 'console=ttyS0'
> +        console_pattern = 'Kernel command line: %s' % kernel_command_line
> +
> +        self.run_rr(kernel_path, kernel_command_line, console_pattern)

This test fails for me on the replay:

  2020-05-27 16:22:21,658 machine          L0326 DEBUG| VM launch command: 'x86_64-softmmu/qemu-system-x86_64 -display none -vga none -chardev socket,id=mon,path=/var/tmp/tmp4n_geosi/qemu-9516-monitor.sock -mon chardev=mon,mode=control -machine pc -chardev socket,id=console,path=/var/tmp/tmp4n_geosi/qemu-9516-console.sock,server,nowait -serial chardev:console -icount shift=7,rr=replay,rrfile=/var/tmp/avocado_b85h3ycg/avocado_job_8xrxksgj/1-._tests_acceptance_replay_kernel.py_ReplayKernel.test_x86_64_pc/replay.bin -kernel /home/alex/avocado/data/cache/by_location/df533120a0e0ffda2626bed6e8a975d3b07e3f05/vmlinuz -append printk.time=0 console=ttyS0 -net none'
  2020-05-27 16:22:21,725 qmp              L0194 DEBUG| >>> {'execute': 'qmp_capabilities'}
  2020-05-27 16:22:21,736 qmp              L0202 DEBUG| <<< {'return': {}}
  2020-05-27 16:23:49,372 stacktrace       L0039 ERROR|
  2020-05-27 16:23:49,372 stacktrace       L0042 ERROR| Reproduced traceback from: /home/alex/lsrc/qemu.git/builds/all/tests/venv/lib/python3.7/site-packages/avocado/core/test.py:860
  2020-05-27 16:23:49,373 stacktrace       L0045 ERROR| Traceback (most recent call last):
  2020-05-27 16:23:49,373 stacktrace       L0045 ERROR|   File "/home/alex/lsrc/qemu.git/builds/all/tests/acceptance/replay_kernel.py", line 73, in test_x86_64_pc
  2020-05-27 16:23:49,373 stacktrace       L0045 ERROR|     self.run_rr(kernel_path, kernel_command_line, console_pattern)
  2020-05-27 16:23:49,373 stacktrace       L0045 ERROR|   File "/home/alex/lsrc/qemu.git/builds/all/tests/acceptance/replay_kernel.py", line 57, in run_rr
  2020-05-27 16:23:49,373 stacktrace       L0045 ERROR|     False, shift, args)
  2020-05-27 16:23:49,373 stacktrace       L0045 ERROR|   File "/home/alex/lsrc/qemu.git/builds/all/tests/acceptance/replay_kernel.py", line 46, in run_vm
  2020-05-27 16:23:49,373 stacktrace       L0045 ERROR|     self.wait_for_console_pattern(console_pattern, vm)
  2020-05-27 16:23:49,373 stacktrace       L0045 ERROR|   File "/home/alex/lsrc/qemu.git/builds/all/tests/acceptance/boot_linux_console.py", line 37, in wait_for_console_pattern
  2020-05-27 16:23:49,373 stacktrace       L0045 ERROR|     vm=vm)
  2020-05-27 16:23:49,374 stacktrace       L0045 ERROR|   File "/home/alex/lsrc/qemu.git/builds/all/tests/acceptance/avocado_qemu/__init__.py", line 131, in wait_for_console_pattern
  2020-05-27 16:23:49,374 stacktrace       L0045 ERROR|     _console_interaction(test, success_message, failure_message, None, vm=vm)
  2020-05-27 16:23:49,374 stacktrace       L0045 ERROR|   File "/home/alex/lsrc/qemu.git/builds/all/tests/acceptance/avocado_qemu/__init__.py", line 83, in _console_interaction
  2020-05-27 16:23:49,374 stacktrace       L0045 ERROR|     msg = console.readline().strip()
  2020-05-27 16:23:49,374 stacktrace       L0045 ERROR|   File "/usr/lib/python3.7/socket.py", line 589, in readinto
  2020-05-27 16:23:49,374 stacktrace       L0045 ERROR|     return self._sock.recv_into(b)
  2020-05-27 16:23:49,374 stacktrace       L0045 ERROR|   File "/home/alex/lsrc/qemu.git/builds/all/tests/venv/lib/python3.7/site-packages/avocado/plugins/runner.py", line 89, in sigterm_handler
  2020-05-27 16:23:49,374 stacktrace       L0045 ERROR|     raise RuntimeError("Test interrupted by SIGTERM")
  2020-05-27 16:23:49,374 stacktrace       L0045 ERROR| RuntimeError: Test interrupted by SIGTERM
Alex Bennée May 27, 2020, 4:20 p.m. UTC | #4
Alex Bennée <alex.bennee@linaro.org> writes:

> Pavel Dovgalyuk <Pavel.Dovgaluk@gmail.com> writes:
>
>> This patch adds a test for record/replay an execution of x86_64 machine.
>> Execution scenario includes simple kernel boot, which allows testing
>> basic hardware interaction in RR mode.
>>
>> Signed-off-by: Pavel Dovgalyuk <Pavel.Dovgaluk@ispras.ru>
>> ---
>>  0 files changed
>>
>> diff --git a/tests/acceptance/replay_kernel.py b/tests/acceptance/replay_kernel.py
>> index b8b277ad2f..c7526f1aba 100644
>> --- a/tests/acceptance/replay_kernel.py
>> +++ b/tests/acceptance/replay_kernel.py
>> @@ -55,3 +55,19 @@ class ReplayKernel(LinuxKernelUtils):
>>                      True, shift, args)
>>          self.run_vm(kernel_path, kernel_command_line, console_pattern,
>>                      False, shift, args)
>> +
>> +    def test_x86_64_pc(self):
>> +        """
>> +        :avocado: tags=arch:x86_64
>> +        :avocado: tags=machine:pc
>> +        """
>> +        kernel_url = ('https://archives.fedoraproject.org/pub/archive/fedora'
>> +                      '/linux/releases/29/Everything/x86_64/os/images/pxeboot'
>> +                      '/vmlinuz')
>> +        kernel_hash = '23bebd2680757891cf7adedb033532163a792495'
>> +        kernel_path = self.fetch_asset(kernel_url, asset_hash=kernel_hash)
>> +
>> +        kernel_command_line = self.KERNEL_COMMON_COMMAND_LINE +
>> 'console=ttyS0'
>
> I note that:
>
>   KERNEL_COMMON_COMMAND_LINE = 'printk.time=0 '
>
> and given we are looking for repeatability here maybe we should use our
> own command line so we can compare the recorded and replayed boot?

To build on that I think a command line like:

  KERNEL_COMMON_COMMAND_LINE = 'printk.time=1 panic=-1 '

called with --no-reboot and a pattern:

  console_pattern = 'VFS: Cannot open root device'

You will run more of the kernel (importantly with timestamps > 0.000) so
we can have a better compare between the recorded and replayed run.
Pavel Dovgalyuk May 28, 2020, 6:12 a.m. UTC | #5
On 27.05.2020 18:41, Alex Bennée wrote:
> Pavel Dovgalyuk <Pavel.Dovgaluk@gmail.com> writes:
>
>> This patch adds a test for record/replay an execution of x86_64 machine.
>> Execution scenario includes simple kernel boot, which allows testing
>> basic hardware interaction in RR mode.
>>
>> Signed-off-by: Pavel Dovgalyuk <Pavel.Dovgaluk@ispras.ru>
>> ---
>>   0 files changed
>>
>> diff --git a/tests/acceptance/replay_kernel.py b/tests/acceptance/replay_kernel.py
>> index b8b277ad2f..c7526f1aba 100644
>> --- a/tests/acceptance/replay_kernel.py
>> +++ b/tests/acceptance/replay_kernel.py
>> @@ -55,3 +55,19 @@ class ReplayKernel(LinuxKernelUtils):
>>                       True, shift, args)
>>           self.run_vm(kernel_path, kernel_command_line, console_pattern,
>>                       False, shift, args)
>> +
>> +    def test_x86_64_pc(self):
>> +        """
>> +        :avocado: tags=arch:x86_64
>> +        :avocado: tags=machine:pc
>> +        """
>> +        kernel_url = ('https://archives.fedoraproject.org/pub/archive/fedora'
>> +                      '/linux/releases/29/Everything/x86_64/os/images/pxeboot'
>> +                      '/vmlinuz')
>> +        kernel_hash = '23bebd2680757891cf7adedb033532163a792495'
>> +        kernel_path = self.fetch_asset(kernel_url, asset_hash=kernel_hash)
>> +
>> +        kernel_command_line = self.KERNEL_COMMON_COMMAND_LINE + 'console=ttyS0'
>> +        console_pattern = 'Kernel command line: %s' % kernel_command_line
>> +
>> +        self.run_rr(kernel_path, kernel_command_line, console_pattern)
> This test fails for me on the replay:

Have you applied latest RR patches?


>
>    2020-05-27 16:22:21,658 machine          L0326 DEBUG| VM launch command: 'x86_64-softmmu/qemu-system-x86_64 -display none -vga none -chardev socket,id=mon,path=/var/tmp/tmp4n_geosi/qemu-9516-monitor.sock -mon chardev=mon,mode=control -machine pc -chardev socket,id=console,path=/var/tmp/tmp4n_geosi/qemu-9516-console.sock,server,nowait -serial chardev:console -icount shift=7,rr=replay,rrfile=/var/tmp/avocado_b85h3ycg/avocado_job_8xrxksgj/1-._tests_acceptance_replay_kernel.py_ReplayKernel.test_x86_64_pc/replay.bin -kernel /home/alex/avocado/data/cache/by_location/df533120a0e0ffda2626bed6e8a975d3b07e3f05/vmlinuz -append printk.time=0 console=ttyS0 -net none'
>    2020-05-27 16:22:21,725 qmp              L0194 DEBUG| >>> {'execute': 'qmp_capabilities'}
>    2020-05-27 16:22:21,736 qmp              L0202 DEBUG| <<< {'return': {}}
>    2020-05-27 16:23:49,372 stacktrace       L0039 ERROR|
>    2020-05-27 16:23:49,372 stacktrace       L0042 ERROR| Reproduced traceback from: /home/alex/lsrc/qemu.git/builds/all/tests/venv/lib/python3.7/site-packages/avocado/core/test.py:860
>    2020-05-27 16:23:49,373 stacktrace       L0045 ERROR| Traceback (most recent call last):
>    2020-05-27 16:23:49,373 stacktrace       L0045 ERROR|   File "/home/alex/lsrc/qemu.git/builds/all/tests/acceptance/replay_kernel.py", line 73, in test_x86_64_pc
>    2020-05-27 16:23:49,373 stacktrace       L0045 ERROR|     self.run_rr(kernel_path, kernel_command_line, console_pattern)
>    2020-05-27 16:23:49,373 stacktrace       L0045 ERROR|   File "/home/alex/lsrc/qemu.git/builds/all/tests/acceptance/replay_kernel.py", line 57, in run_rr
>    2020-05-27 16:23:49,373 stacktrace       L0045 ERROR|     False, shift, args)
>    2020-05-27 16:23:49,373 stacktrace       L0045 ERROR|   File "/home/alex/lsrc/qemu.git/builds/all/tests/acceptance/replay_kernel.py", line 46, in run_vm
>    2020-05-27 16:23:49,373 stacktrace       L0045 ERROR|     self.wait_for_console_pattern(console_pattern, vm)
>    2020-05-27 16:23:49,373 stacktrace       L0045 ERROR|   File "/home/alex/lsrc/qemu.git/builds/all/tests/acceptance/boot_linux_console.py", line 37, in wait_for_console_pattern
>    2020-05-27 16:23:49,373 stacktrace       L0045 ERROR|     vm=vm)
>    2020-05-27 16:23:49,374 stacktrace       L0045 ERROR|   File "/home/alex/lsrc/qemu.git/builds/all/tests/acceptance/avocado_qemu/__init__.py", line 131, in wait_for_console_pattern
>    2020-05-27 16:23:49,374 stacktrace       L0045 ERROR|     _console_interaction(test, success_message, failure_message, None, vm=vm)
>    2020-05-27 16:23:49,374 stacktrace       L0045 ERROR|   File "/home/alex/lsrc/qemu.git/builds/all/tests/acceptance/avocado_qemu/__init__.py", line 83, in _console_interaction
>    2020-05-27 16:23:49,374 stacktrace       L0045 ERROR|     msg = console.readline().strip()
>    2020-05-27 16:23:49,374 stacktrace       L0045 ERROR|   File "/usr/lib/python3.7/socket.py", line 589, in readinto
>    2020-05-27 16:23:49,374 stacktrace       L0045 ERROR|     return self._sock.recv_into(b)
>    2020-05-27 16:23:49,374 stacktrace       L0045 ERROR|   File "/home/alex/lsrc/qemu.git/builds/all/tests/venv/lib/python3.7/site-packages/avocado/plugins/runner.py", line 89, in sigterm_handler
>    2020-05-27 16:23:49,374 stacktrace       L0045 ERROR|     raise RuntimeError("Test interrupted by SIGTERM")
>    2020-05-27 16:23:49,374 stacktrace       L0045 ERROR| RuntimeError: Test interrupted by SIGTERM
>
>
Pavel Dovgalyuk May 28, 2020, 7:12 a.m. UTC | #6
On 27.05.2020 17:53, Philippe Mathieu-Daudé wrote:
> On 5/27/20 12:31 PM, Pavel Dovgalyuk wrote:
>> This patch adds a test for record/replay an execution of x86_64 machine.
>> Execution scenario includes simple kernel boot, which allows testing
>> basic hardware interaction in RR mode.
>>
>> Signed-off-by: Pavel Dovgalyuk <Pavel.Dovgaluk@ispras.ru>
>> ---
>>   0 files changed
>>
>> diff --git a/tests/acceptance/replay_kernel.py b/tests/acceptance/replay_kernel.py
>> index b8b277ad2f..c7526f1aba 100644
>> --- a/tests/acceptance/replay_kernel.py
>> +++ b/tests/acceptance/replay_kernel.py
>> @@ -55,3 +55,19 @@ class ReplayKernel(LinuxKernelUtils):
>>                       True, shift, args)
>>           self.run_vm(kernel_path, kernel_command_line, console_pattern,
>>                       False, shift, args)
>> +
>> +    def test_x86_64_pc(self):
>> +        """
>> +        :avocado: tags=arch:x86_64
>> +        :avocado: tags=machine:pc
>> +        """
>> +        kernel_url = ('https://archives.fedoraproject.org/pub/archive/fedora'
>> +                      '/linux/releases/29/Everything/x86_64/os/images/pxeboot'
>> +                      '/vmlinuz')
>> +        kernel_hash = '23bebd2680757891cf7adedb033532163a792495'
>> +        kernel_path = self.fetch_asset(kernel_url, asset_hash=kernel_hash)
>> +
>> +        kernel_command_line = self.KERNEL_COMMON_COMMAND_LINE + 'console=ttyS0'
>> +        console_pattern = 'Kernel command line: %s' % kernel_command_line
>> +
>> +        self.run_rr(kernel_path, kernel_command_line, console_pattern)
>>
> This one timeouted (I build with --enable-debug):

I've got the strange behavior for the couple of times.

Console output was correct (I saw 'Kernel command line' in logs), but 
_console_interation function didn't notice it.

Therefore the test finished with timeout.

How this could be possible?


>   (1/1) tests/acceptance/replay_kernel.py:ReplayKernel.test_x86_64_pc:
> replay: recording...
> replay: replaying...
> INTERRUPTED: Test interrupted by SIGTERM\nRunner error occurred: Timeout
> reached\nOriginal status: ERROR\n{'name':
> '1-tests/acceptance/replay_kernel.py:ReplayKernel.test_x86_64_pc',
> 'logdir':
> 'avocado/job-results/job-2020-05-27T16.48-71d7bf4/test-results/1-tes...
> (90.68 s)
>
Pavel Dovgalyuk May 28, 2020, 7:26 a.m. UTC | #7
On 27.05.2020 19:20, Alex Bennée wrote:
> Alex Bennée <alex.bennee@linaro.org> writes:
>
>> Pavel Dovgalyuk <Pavel.Dovgaluk@gmail.com> writes:
>>
>>> This patch adds a test for record/replay an execution of x86_64 machine.
>>> Execution scenario includes simple kernel boot, which allows testing
>>> basic hardware interaction in RR mode.
>>>
>>> Signed-off-by: Pavel Dovgalyuk <Pavel.Dovgaluk@ispras.ru>
>>> ---
>>>   0 files changed
>>>
>>> diff --git a/tests/acceptance/replay_kernel.py b/tests/acceptance/replay_kernel.py
>>> index b8b277ad2f..c7526f1aba 100644
>>> --- a/tests/acceptance/replay_kernel.py
>>> +++ b/tests/acceptance/replay_kernel.py
>>> @@ -55,3 +55,19 @@ class ReplayKernel(LinuxKernelUtils):
>>>                       True, shift, args)
>>>           self.run_vm(kernel_path, kernel_command_line, console_pattern,
>>>                       False, shift, args)
>>> +
>>> +    def test_x86_64_pc(self):
>>> +        """
>>> +        :avocado: tags=arch:x86_64
>>> +        :avocado: tags=machine:pc
>>> +        """
>>> +        kernel_url = ('https://archives.fedoraproject.org/pub/archive/fedora'
>>> +                      '/linux/releases/29/Everything/x86_64/os/images/pxeboot'
>>> +                      '/vmlinuz')
>>> +        kernel_hash = '23bebd2680757891cf7adedb033532163a792495'
>>> +        kernel_path = self.fetch_asset(kernel_url, asset_hash=kernel_hash)
>>> +
>>> +        kernel_command_line = self.KERNEL_COMMON_COMMAND_LINE +
>>> 'console=ttyS0'
>> I note that:
>>
>>    KERNEL_COMMON_COMMAND_LINE = 'printk.time=0 '
>>
>> and given we are looking for repeatability here maybe we should use our
>> own command line so we can compare the recorded and replayed boot?
> To build on that I think a command line like:
>
>    KERNEL_COMMON_COMMAND_LINE = 'printk.time=1 panic=-1 '
>
> called with --no-reboot and a pattern:
>
>    console_pattern = 'VFS: Cannot open root device'
>
> You will run more of the kernel (importantly with timestamps > 0.000) so
> we can have a better compare between the recorded and replayed run.
>
This is reasonable, thank you.
Philippe Mathieu-Daudé May 28, 2020, 8:38 a.m. UTC | #8
On 5/28/20 9:12 AM, Pavel Dovgalyuk wrote:
> 
> On 27.05.2020 17:53, Philippe Mathieu-Daudé wrote:
>> On 5/27/20 12:31 PM, Pavel Dovgalyuk wrote:
>>> This patch adds a test for record/replay an execution of x86_64 machine.
>>> Execution scenario includes simple kernel boot, which allows testing
>>> basic hardware interaction in RR mode.
>>>
>>> Signed-off-by: Pavel Dovgalyuk <Pavel.Dovgaluk@ispras.ru>
>>> ---
>>>   0 files changed
>>>
>>> diff --git a/tests/acceptance/replay_kernel.py
>>> b/tests/acceptance/replay_kernel.py
>>> index b8b277ad2f..c7526f1aba 100644
>>> --- a/tests/acceptance/replay_kernel.py
>>> +++ b/tests/acceptance/replay_kernel.py
>>> @@ -55,3 +55,19 @@ class ReplayKernel(LinuxKernelUtils):
>>>                       True, shift, args)
>>>           self.run_vm(kernel_path, kernel_command_line, console_pattern,
>>>                       False, shift, args)
>>> +
>>> +    def test_x86_64_pc(self):
>>> +        """
>>> +        :avocado: tags=arch:x86_64
>>> +        :avocado: tags=machine:pc
>>> +        """
>>> +        kernel_url =
>>> ('https://archives.fedoraproject.org/pub/archive/fedora'
>>> +                     
>>> '/linux/releases/29/Everything/x86_64/os/images/pxeboot'
>>> +                      '/vmlinuz')
>>> +        kernel_hash = '23bebd2680757891cf7adedb033532163a792495'
>>> +        kernel_path = self.fetch_asset(kernel_url,
>>> asset_hash=kernel_hash)
>>> +
>>> +        kernel_command_line = self.KERNEL_COMMON_COMMAND_LINE +
>>> 'console=ttyS0'
>>> +        console_pattern = 'Kernel command line: %s' %
>>> kernel_command_line
>>> +
>>> +        self.run_rr(kernel_path, kernel_command_line, console_pattern)
>>>
>> This one timeouted (I build with --enable-debug):
> 
> I've got the strange behavior for the couple of times.
> 
> Console output was correct (I saw 'Kernel command line' in logs), but
> _console_interation function didn't notice it.
> 
> Therefore the test finished with timeout.
> 
> How this could be possible?

IIRC there is a problem in how Avocado consume the chardev output.

Cleber has been working on some PoC / kludge.

Cleber/Eduardo do you remember the problem?

> 
>>   (1/1) tests/acceptance/replay_kernel.py:ReplayKernel.test_x86_64_pc:
>> replay: recording...
>> replay: replaying...
>> INTERRUPTED: Test interrupted by SIGTERM\nRunner error occurred: Timeout
>> reached\nOriginal status: ERROR\n{'name':
>> '1-tests/acceptance/replay_kernel.py:ReplayKernel.test_x86_64_pc',
>> 'logdir':
>> 'avocado/job-results/job-2020-05-27T16.48-71d7bf4/test-results/1-tes...
>> (90.68 s)
>>
>
Alex Bennée May 28, 2020, 11:07 a.m. UTC | #9
Philippe Mathieu-Daudé <philmd@redhat.com> writes:

> On 5/28/20 9:12 AM, Pavel Dovgalyuk wrote:
>> 
>> On 27.05.2020 17:53, Philippe Mathieu-Daudé wrote:
>>> On 5/27/20 12:31 PM, Pavel Dovgalyuk wrote:
>>>> This patch adds a test for record/replay an execution of x86_64 machine.
>>>> Execution scenario includes simple kernel boot, which allows testing
>>>> basic hardware interaction in RR mode.
>>>>
>>>> Signed-off-by: Pavel Dovgalyuk <Pavel.Dovgaluk@ispras.ru>
>>>> ---
>>>>   0 files changed
>>>>
>>>> diff --git a/tests/acceptance/replay_kernel.py
>>>> b/tests/acceptance/replay_kernel.py
>>>> index b8b277ad2f..c7526f1aba 100644
>>>> --- a/tests/acceptance/replay_kernel.py
>>>> +++ b/tests/acceptance/replay_kernel.py
>>>> @@ -55,3 +55,19 @@ class ReplayKernel(LinuxKernelUtils):
>>>>                       True, shift, args)
>>>>           self.run_vm(kernel_path, kernel_command_line, console_pattern,
>>>>                       False, shift, args)
>>>> +
>>>> +    def test_x86_64_pc(self):
>>>> +        """
>>>> +        :avocado: tags=arch:x86_64
>>>> +        :avocado: tags=machine:pc
>>>> +        """
>>>> +        kernel_url =
>>>> ('https://archives.fedoraproject.org/pub/archive/fedora'
>>>> +                     
>>>> '/linux/releases/29/Everything/x86_64/os/images/pxeboot'
>>>> +                      '/vmlinuz')
>>>> +        kernel_hash = '23bebd2680757891cf7adedb033532163a792495'
>>>> +        kernel_path = self.fetch_asset(kernel_url,
>>>> asset_hash=kernel_hash)
>>>> +
>>>> +        kernel_command_line = self.KERNEL_COMMON_COMMAND_LINE +
>>>> 'console=ttyS0'
>>>> +        console_pattern = 'Kernel command line: %s' %
>>>> kernel_command_line
>>>> +
>>>> +        self.run_rr(kernel_path, kernel_command_line, console_pattern)
>>>>
>>> This one timeouted (I build with --enable-debug):
>> 
>> I've got the strange behavior for the couple of times.
>> 
>> Console output was correct (I saw 'Kernel command line' in logs), but
>> _console_interation function didn't notice it.
>> 
>> Therefore the test finished with timeout.
>> 
>> How this could be possible?
>
> IIRC there is a problem in how Avocado consume the chardev output.

Is this the same problem Robert has tried to work around in tests/vm?

  From: Robert Foley <robert.foley@linaro.org>
  Subject: [PATCH v7 12/12] tests/vm: Add workaround to consume console
  Date: Tue, 19 May 2020 09:22:59 -0400
  Message-Id: <20200519132259.405-13-robert.foley@linaro.org>

>
> Cleber has been working on some PoC / kludge.
>
> Cleber/Eduardo do you remember the problem?
>
>> 
>>>   (1/1) tests/acceptance/replay_kernel.py:ReplayKernel.test_x86_64_pc:
>>> replay: recording...
>>> replay: replaying...
>>> INTERRUPTED: Test interrupted by SIGTERM\nRunner error occurred: Timeout
>>> reached\nOriginal status: ERROR\n{'name':
>>> '1-tests/acceptance/replay_kernel.py:ReplayKernel.test_x86_64_pc',
>>> 'logdir':
>>> 'avocado/job-results/job-2020-05-27T16.48-71d7bf4/test-results/1-tes...
>>> (90.68 s)
>>>
>>
Philippe Mathieu-Daudé May 28, 2020, 11:32 a.m. UTC | #10
On 5/28/20 1:07 PM, Alex Bennée wrote:
> 
> Philippe Mathieu-Daudé <philmd@redhat.com> writes:
> 
>> On 5/28/20 9:12 AM, Pavel Dovgalyuk wrote:
>>>
>>> On 27.05.2020 17:53, Philippe Mathieu-Daudé wrote:
>>>> On 5/27/20 12:31 PM, Pavel Dovgalyuk wrote:
>>>>> This patch adds a test for record/replay an execution of x86_64 machine.
>>>>> Execution scenario includes simple kernel boot, which allows testing
>>>>> basic hardware interaction in RR mode.
>>>>>
>>>>> Signed-off-by: Pavel Dovgalyuk <Pavel.Dovgaluk@ispras.ru>
>>>>> ---
>>>>>   0 files changed
>>>>>
>>>>> diff --git a/tests/acceptance/replay_kernel.py
>>>>> b/tests/acceptance/replay_kernel.py
>>>>> index b8b277ad2f..c7526f1aba 100644
>>>>> --- a/tests/acceptance/replay_kernel.py
>>>>> +++ b/tests/acceptance/replay_kernel.py
>>>>> @@ -55,3 +55,19 @@ class ReplayKernel(LinuxKernelUtils):
>>>>>                       True, shift, args)
>>>>>           self.run_vm(kernel_path, kernel_command_line, console_pattern,
>>>>>                       False, shift, args)
>>>>> +
>>>>> +    def test_x86_64_pc(self):
>>>>> +        """
>>>>> +        :avocado: tags=arch:x86_64
>>>>> +        :avocado: tags=machine:pc
>>>>> +        """
>>>>> +        kernel_url =
>>>>> ('https://archives.fedoraproject.org/pub/archive/fedora'
>>>>> +                     
>>>>> '/linux/releases/29/Everything/x86_64/os/images/pxeboot'
>>>>> +                      '/vmlinuz')
>>>>> +        kernel_hash = '23bebd2680757891cf7adedb033532163a792495'
>>>>> +        kernel_path = self.fetch_asset(kernel_url,
>>>>> asset_hash=kernel_hash)
>>>>> +
>>>>> +        kernel_command_line = self.KERNEL_COMMON_COMMAND_LINE +
>>>>> 'console=ttyS0'
>>>>> +        console_pattern = 'Kernel command line: %s' %
>>>>> kernel_command_line
>>>>> +
>>>>> +        self.run_rr(kernel_path, kernel_command_line, console_pattern)
>>>>>
>>>> This one timeouted (I build with --enable-debug):
>>>
>>> I've got the strange behavior for the couple of times.
>>>
>>> Console output was correct (I saw 'Kernel command line' in logs), but
>>> _console_interation function didn't notice it.
>>>
>>> Therefore the test finished with timeout.
>>>
>>> How this could be possible?
>>
>> IIRC there is a problem in how Avocado consume the chardev output.
> 
> Is this the same problem Robert has tried to work around in tests/vm?
> 
>   From: Robert Foley <robert.foley@linaro.org>
>   Subject: [PATCH v7 12/12] tests/vm: Add workaround to consume console
>   Date: Tue, 19 May 2020 09:22:59 -0400
>   Message-Id: <20200519132259.405-13-robert.foley@linaro.org>

Exactly! I guess Marc-André or Eduardo suggested the same fix (use a
dedicated polling thread).

> 
>>
>> Cleber has been working on some PoC / kludge.
>>
>> Cleber/Eduardo do you remember the problem?
>>
>>>
>>>>   (1/1) tests/acceptance/replay_kernel.py:ReplayKernel.test_x86_64_pc:
>>>> replay: recording...
>>>> replay: replaying...
>>>> INTERRUPTED: Test interrupted by SIGTERM\nRunner error occurred: Timeout
>>>> reached\nOriginal status: ERROR\n{'name':
>>>> '1-tests/acceptance/replay_kernel.py:ReplayKernel.test_x86_64_pc',
>>>> 'logdir':
>>>> 'avocado/job-results/job-2020-05-27T16.48-71d7bf4/test-results/1-tes...
>>>> (90.68 s)
>>>>
>>>
> 
>
Alex Bennée May 28, 2020, 1:26 p.m. UTC | #11
Pavel Dovgalyuk <dovgaluk@ispras.ru> writes:

> On 27.05.2020 18:41, Alex Bennée wrote:
>> Pavel Dovgalyuk <Pavel.Dovgaluk@gmail.com> writes:
>>
>>> This patch adds a test for record/replay an execution of x86_64 machine.
>>> Execution scenario includes simple kernel boot, which allows testing
>>> basic hardware interaction in RR mode.
>>>
>>> Signed-off-by: Pavel Dovgalyuk <Pavel.Dovgaluk@ispras.ru>
>>> ---
>>>   0 files changed
>>>
>>> diff --git a/tests/acceptance/replay_kernel.py b/tests/acceptance/replay_kernel.py
>>> index b8b277ad2f..c7526f1aba 100644
>>> --- a/tests/acceptance/replay_kernel.py
>>> +++ b/tests/acceptance/replay_kernel.py
>>> @@ -55,3 +55,19 @@ class ReplayKernel(LinuxKernelUtils):
>>>                       True, shift, args)
>>>           self.run_vm(kernel_path, kernel_command_line, console_pattern,
>>>                       False, shift, args)
>>> +
>>> +    def test_x86_64_pc(self):
>>> +        """
>>> +        :avocado: tags=arch:x86_64
>>> +        :avocado: tags=machine:pc
>>> +        """
>>> +        kernel_url = ('https://archives.fedoraproject.org/pub/archive/fedora'
>>> +                      '/linux/releases/29/Everything/x86_64/os/images/pxeboot'
>>> +                      '/vmlinuz')
>>> +        kernel_hash = '23bebd2680757891cf7adedb033532163a792495'
>>> +        kernel_path = self.fetch_asset(kernel_url, asset_hash=kernel_hash)
>>> +
>>> +        kernel_command_line = self.KERNEL_COMMON_COMMAND_LINE + 'console=ttyS0'
>>> +        console_pattern = 'Kernel command line: %s' % kernel_command_line
>>> +
>>> +        self.run_rr(kernel_path, kernel_command_line, console_pattern)
>> This test fails for me on the replay:
>
> Have you applied latest RR patches?

I have the following on top of the acceptance patches:

a36c23042fe * review/record-replay-acceptance-v2 icount: fix shift=auto for record/replay
4ab2164c10b * replay: synchronize on every virtual timer callback
66104ce6e4b * replay: notify the main loop when there are no instructions

>
>
>>
>>    2020-05-27 16:22:21,658 machine          L0326 DEBUG| VM launch command: 'x86_64-softmmu/qemu-system-x86_64 -display none -vga none -chardev socket,id=mon,path=/var/tmp/tmp4n_geosi/qemu-9516-monitor.sock -mon chardev=mon,mode=control -machine pc -chardev socket,id=console,path=/var/tmp/tmp4n_geosi/qemu-9516-console.sock,server,nowait -serial chardev:console -icount shift=7,rr=replay,rrfile=/var/tmp/avocado_b85h3ycg/avocado_job_8xrxksgj/1-._tests_acceptance_replay_kernel.py_ReplayKernel.test_x86_64_pc/replay.bin -kernel /home/alex/avocado/data/cache/by_location/df533120a0e0ffda2626bed6e8a975d3b07e3f05/vmlinuz -append printk.time=0 console=ttyS0 -net none'
>>    2020-05-27 16:22:21,725 qmp              L0194 DEBUG| >>> {'execute': 'qmp_capabilities'}
>>    2020-05-27 16:22:21,736 qmp              L0202 DEBUG| <<< {'return': {}}
>>    2020-05-27 16:23:49,372 stacktrace       L0039 ERROR|
>>    2020-05-27 16:23:49,372 stacktrace       L0042 ERROR| Reproduced traceback from: /home/alex/lsrc/qemu.git/builds/all/tests/venv/lib/python3.7/site-packages/avocado/core/test.py:860
>>    2020-05-27 16:23:49,373 stacktrace       L0045 ERROR| Traceback (most recent call last):
>>    2020-05-27 16:23:49,373 stacktrace       L0045 ERROR|   File "/home/alex/lsrc/qemu.git/builds/all/tests/acceptance/replay_kernel.py", line 73, in test_x86_64_pc
>>    2020-05-27 16:23:49,373 stacktrace       L0045 ERROR|     self.run_rr(kernel_path, kernel_command_line, console_pattern)
>>    2020-05-27 16:23:49,373 stacktrace       L0045 ERROR|   File "/home/alex/lsrc/qemu.git/builds/all/tests/acceptance/replay_kernel.py", line 57, in run_rr
>>    2020-05-27 16:23:49,373 stacktrace       L0045 ERROR|     False, shift, args)
>>    2020-05-27 16:23:49,373 stacktrace       L0045 ERROR|   File "/home/alex/lsrc/qemu.git/builds/all/tests/acceptance/replay_kernel.py", line 46, in run_vm
>>    2020-05-27 16:23:49,373 stacktrace       L0045 ERROR|     self.wait_for_console_pattern(console_pattern, vm)
>>    2020-05-27 16:23:49,373 stacktrace       L0045 ERROR|   File "/home/alex/lsrc/qemu.git/builds/all/tests/acceptance/boot_linux_console.py", line 37, in wait_for_console_pattern
>>    2020-05-27 16:23:49,373 stacktrace       L0045 ERROR|     vm=vm)
>>    2020-05-27 16:23:49,374 stacktrace       L0045 ERROR|   File "/home/alex/lsrc/qemu.git/builds/all/tests/acceptance/avocado_qemu/__init__.py", line 131, in wait_for_console_pattern
>>    2020-05-27 16:23:49,374 stacktrace       L0045 ERROR|     _console_interaction(test, success_message, failure_message, None, vm=vm)
>>    2020-05-27 16:23:49,374 stacktrace       L0045 ERROR|   File "/home/alex/lsrc/qemu.git/builds/all/tests/acceptance/avocado_qemu/__init__.py", line 83, in _console_interaction
>>    2020-05-27 16:23:49,374 stacktrace       L0045 ERROR|     msg = console.readline().strip()
>>    2020-05-27 16:23:49,374 stacktrace       L0045 ERROR|   File "/usr/lib/python3.7/socket.py", line 589, in readinto
>>    2020-05-27 16:23:49,374 stacktrace       L0045 ERROR|     return self._sock.recv_into(b)
>>    2020-05-27 16:23:49,374 stacktrace       L0045 ERROR|   File "/home/alex/lsrc/qemu.git/builds/all/tests/venv/lib/python3.7/site-packages/avocado/plugins/runner.py", line 89, in sigterm_handler
>>    2020-05-27 16:23:49,374 stacktrace       L0045 ERROR|     raise RuntimeError("Test interrupted by SIGTERM")
>>    2020-05-27 16:23:49,374 stacktrace       L0045 ERROR| RuntimeError: Test interrupted by SIGTERM
>>
>>
Pavel Dovgalyuk May 29, 2020, 5:56 a.m. UTC | #12
On 28.05.2020 16:26, Alex Bennée wrote:
> Pavel Dovgalyuk <dovgaluk@ispras.ru> writes:
>
>> On 27.05.2020 18:41, Alex Bennée wrote:
>>> Pavel Dovgalyuk <Pavel.Dovgaluk@gmail.com> writes:
>>>
>>>> This patch adds a test for record/replay an execution of x86_64 machine.
>>>> Execution scenario includes simple kernel boot, which allows testing
>>>> basic hardware interaction in RR mode.
>>>>
>>>> Signed-off-by: Pavel Dovgalyuk <Pavel.Dovgaluk@ispras.ru>
>>>> ---
>>>>    0 files changed
>>>>
>>>> diff --git a/tests/acceptance/replay_kernel.py b/tests/acceptance/replay_kernel.py
>>>> index b8b277ad2f..c7526f1aba 100644
>>>> --- a/tests/acceptance/replay_kernel.py
>>>> +++ b/tests/acceptance/replay_kernel.py
>>>> @@ -55,3 +55,19 @@ class ReplayKernel(LinuxKernelUtils):
>>>>                        True, shift, args)
>>>>            self.run_vm(kernel_path, kernel_command_line, console_pattern,
>>>>                        False, shift, args)
>>>> +
>>>> +    def test_x86_64_pc(self):
>>>> +        """
>>>> +        :avocado: tags=arch:x86_64
>>>> +        :avocado: tags=machine:pc
>>>> +        """
>>>> +        kernel_url = ('https://archives.fedoraproject.org/pub/archive/fedora'
>>>> +                      '/linux/releases/29/Everything/x86_64/os/images/pxeboot'
>>>> +                      '/vmlinuz')
>>>> +        kernel_hash = '23bebd2680757891cf7adedb033532163a792495'
>>>> +        kernel_path = self.fetch_asset(kernel_url, asset_hash=kernel_hash)
>>>> +
>>>> +        kernel_command_line = self.KERNEL_COMMON_COMMAND_LINE + 'console=ttyS0'
>>>> +        console_pattern = 'Kernel command line: %s' % kernel_command_line
>>>> +
>>>> +        self.run_rr(kernel_path, kernel_command_line, console_pattern)
>>> This test fails for me on the replay:
>> Have you applied latest RR patches?
> I have the following on top of the acceptance patches:
>
> a36c23042fe * review/record-replay-acceptance-v2 icount: fix shift=auto for record/replay
> 4ab2164c10b * replay: synchronize on every virtual timer callback
> 66104ce6e4b * replay: notify the main loop when there are no instructions
>
Please also try adding "replay: implement fair mutex"
Alex Bennée May 29, 2020, 9:20 a.m. UTC | #13
Pavel Dovgalyuk <dovgaluk@ispras.ru> writes:

> On 28.05.2020 16:26, Alex Bennée wrote:
>> Pavel Dovgalyuk <dovgaluk@ispras.ru> writes:
>>
>>> On 27.05.2020 18:41, Alex Bennée wrote:
>>>> Pavel Dovgalyuk <Pavel.Dovgaluk@gmail.com> writes:
>>>>
>>>>> This patch adds a test for record/replay an execution of x86_64 machine.
>>>>> Execution scenario includes simple kernel boot, which allows testing
>>>>> basic hardware interaction in RR mode.
<snip>
>>>> This test fails for me on the replay:
>>> Have you applied latest RR patches?
>> I have the following on top of the acceptance patches:
>>
>> a36c23042fe * review/record-replay-acceptance-v2 icount: fix shift=auto for record/replay
>> 4ab2164c10b * replay: synchronize on every virtual timer callback
>> 66104ce6e4b * replay: notify the main loop when there are no instructions
>>
> Please also try adding "replay: implement fair mutex"

OK that was enough to get the pc_i440fx to pass:

  JOB ID     : 33f94542f1f2d1e98c8dad820f66ec5ea9bd9fc1
  JOB LOG    : /home/alex/avocado/job-results/job-2020-05-29T09.36-33f9454/job.log
   (1/2) ./tests/acceptance/replay_linux.py:ReplayLinuxX8664.test_pc_i440fx: PASS (502.55 s)
   (2/2) ./tests/acceptance/replay_linux.py:ReplayLinuxX8664.test_pc_q35: INTERRUPTED: Test died without reporting the status.\nRunner error occurred: Timeout reached\nOriginal status: ERROR\n{'name': '2-./tests/acceptance/replay_linux.py:ReplayLinuxX8664.test_pc_q35', 'logdir': '/home/alex/avocado/job-results/job-2020-05-29T09.36-33f9454/test... (1811.23 s)
  RESULTS    : PASS 1 | ERROR 0 | FAIL 0 | SKIP 0 | WARN 0 | INTERRUPT 1 | CANCEL 0
  JOB TIME   : 2313.96 s

I've attached the log although it's a bit messy because the console
seems to have stripped out any newlines.
diff mbox series

Patch

diff --git a/tests/acceptance/replay_kernel.py b/tests/acceptance/replay_kernel.py
index b8b277ad2f..c7526f1aba 100644
--- a/tests/acceptance/replay_kernel.py
+++ b/tests/acceptance/replay_kernel.py
@@ -55,3 +55,19 @@  class ReplayKernel(LinuxKernelUtils):
                     True, shift, args)
         self.run_vm(kernel_path, kernel_command_line, console_pattern,
                     False, shift, args)
+
+    def test_x86_64_pc(self):
+        """
+        :avocado: tags=arch:x86_64
+        :avocado: tags=machine:pc
+        """
+        kernel_url = ('https://archives.fedoraproject.org/pub/archive/fedora'
+                      '/linux/releases/29/Everything/x86_64/os/images/pxeboot'
+                      '/vmlinuz')
+        kernel_hash = '23bebd2680757891cf7adedb033532163a792495'
+        kernel_path = self.fetch_asset(kernel_url, asset_hash=kernel_hash)
+
+        kernel_command_line = self.KERNEL_COMMON_COMMAND_LINE + 'console=ttyS0'
+        console_pattern = 'Kernel command line: %s' % kernel_command_line
+
+        self.run_rr(kernel_path, kernel_command_line, console_pattern)