[0/2] Quirk to enable QCA9377 to discover BLE devices

Message ID 20190423072236.24999-1-jprvita@endlessm.com

Message

João Paulo Rechi Vita April 23, 2019, 7:22 a.m. UTC
As reported previously on [1], it is currently not possible to discover
BLE devices with QCA9377 controllers. When trying to start an active
scanning procedure with this controller, three commands are queued,
LE_SET_RANDOM_ADDR, LE_SET_SCAN_PARAM and LE_SET_SCAN_ENABLE. After the
first command is sent to the controller and its command complete event
is received, the second command is sent, and the controller then sends
an extra command complete event for the first command. At this point the
kernel sends the next command and fails to process the command complete
event for the LE_SET_SCAN_PARAM command, because when it arrives it does
not match the last command that was sent. As a result,
hdev->le_scan_type is never updated and the kernel behaves as if a
passive scanning procedure were being performed, so no device found
events are sent to userspace.
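
To make the sequence above easier to follow, here is a minimal model of
it in plain C. It is only a sketch, not kernel code: struct host,
send_next(), on_cmd_complete() and replay() are hypothetical names, and
the model assumes that any command complete event releases the next
queued command while the scan type is recorded only when the event
matches the last command sent. The opcodes are the ones printed by
btmon, packed as (OGF << 10) | OCF.

```c
#include <stddef.h>
#include <stdint.h>

/* Opcodes as printed by btmon: (OGF 0x08 << 10) | OCF. */
#define OP_LE_SET_RANDOM_ADDR 0x2005
#define OP_LE_SET_SCAN_PARAM  0x200b
#define OP_LE_SET_SCAN_ENABLE 0x200c

enum { SCAN_PASSIVE = 0x00, SCAN_ACTIVE = 0x01 };

/* Hypothetical, much-reduced stand-in for the host state. */
struct host {
	size_t next;           /* index of the next queued command */
	uint16_t sent_opcode;  /* opcode of the last command sent */
	uint8_t le_scan_type;  /* plays the role of hdev->le_scan_type */
};

/* The three commands queued for an active scan. */
static const uint16_t queue[] = {
	OP_LE_SET_RANDOM_ADDR, OP_LE_SET_SCAN_PARAM, OP_LE_SET_SCAN_ENABLE,
};
#define QUEUE_LEN (sizeof(queue) / sizeof(queue[0]))

static void send_next(struct host *h)
{
	if (h->next < QUEUE_LEN)
		h->sent_opcode = queue[h->next++];
}

/*
 * Assumed behaviour: every Command Complete releases the next command,
 * but the (active) scan type is only recorded when the event matches
 * the last command that was sent.
 */
static void on_cmd_complete(struct host *h, uint16_t opcode)
{
	if (opcode == h->sent_opcode && opcode == OP_LE_SET_SCAN_PARAM)
		h->le_scan_type = SCAN_ACTIVE;
	send_next(h);
}

/* Replay a list of Command Complete opcodes, return the final scan type. */
uint8_t replay(const uint16_t *events, size_t n)
{
	struct host h = { 0, 0, SCAN_PASSIVE };
	size_t i;

	send_next(&h);
	for (i = 0; i < n; i++)
		on_cmd_complete(&h, events[i]);
	return h.le_scan_type;
}
```

Replaying one event per command leaves the model with SCAN_ACTIVE.
Replaying the QCA9377 order (0x2005, 0x2005, 0x200b, 0x200c) leaves it
at SCAN_PASSIVE, because the LE_SET_SCAN_PARAM completion arrives after
LE_SET_SCAN_ENABLE has already been sent.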

[1] https://www.spinics.net/lists/linux-bluetooth/msg79102.html

I have received no replies to the previous report or to further
attempts to contact the QCA addresses that have submitted Bluetooth
firmware blobs to linux-firmware upstream. This series avoids the
problem described above, but I believe ideally the controller should not
be sending this extra command complete event.

I'm not 100% sure if the approach taken here is the best way to work
around this problem in the kernel, as I am not super familiar with the
HCI layer. I'll be happy to hear suggestions of better approaches.

Full logs from btmon can be found below this message, and the extra
command complete event can be seen at timestamp 27.420131.

Best regards,

João Paulo Rechi Vita (2):
  Bluetooth: Create new HCI_QUIRK_WAIT_FOR_MATCHING_CC
  Bluetooth: Set HCI_QUIRK_WAIT_FOR_MATCHING_CC for QCA9377

 drivers/bluetooth/btusb.c        | 9 +++++++++
 include/net/bluetooth/hci.h      | 4 ++++
 include/net/bluetooth/hci_core.h | 1 +
 net/bluetooth/hci_core.c         | 3 +++
 net/bluetooth/hci_event.c        | 4 ++++
 5 files changed, 21 insertions(+)

Comments

Marcel Holtmann April 23, 2019, 4:17 p.m. UTC | #1
Hi Joao Paulo,

> As reported previously on [1], it is currently not possible to discover
> BLE devices with QCA9377 controllers. When trying to start an active
> scanning procedure with this controller, three commands are queued,
> LE_SET_RANDOM_ADDR, LE_SET_SCAN_PARAM and LE_SET_SCAN_ENABLE. After the
> first command is sent to the controller and its command complete event
> is received, the second command is sent, and the controller then sends
> an extra command complete event for the first command. At this point the
> kernel sends the next command and fails to process the command complete
> event for the LE_SET_SCAN_PARAM command, because when it arrives it does
> not match the last command that was sent. As a result,
> hdev->le_scan_type is never updated and the kernel behaves as if a
> passive scanning procedure were being performed, so no device found
> events are sent to userspace.
> 
> [1] https://www.spinics.net/lists/linux-bluetooth/msg79102.html
> 
> I have received no replies to the previous report or to further
> attempts to contact the QCA addresses that have submitted Bluetooth
> firmware blobs to linux-firmware upstream. This series avoids the
> problem described above, but I believe ideally the controller should not
> be sending this extra command complete event.
> 
> I'm not 100% sure if the approach taken here is the best way to work
> around this problem in the kernel, as I am not super familiar with the
> HCI layer. I'll be happy to hear suggestions of better approaches.
> 
> Full logs from btmon can be found below this message, and the extra
> command complete event can be seen at timestamp 27.420131.

so can we get a fixed firmware from Qualcomm? Or at least some ROM patches for it?

> Best regards,
> 
> João Paulo Rechi Vita (2):
>  Bluetooth: Create new HCI_QUIRK_WAIT_FOR_MATCHING_CC
>  Bluetooth: Set HCI_QUIRK_WAIT_FOR_MATCHING_CC for QCA9377
> 
> drivers/bluetooth/btusb.c        | 9 +++++++++
> include/net/bluetooth/hci.h      | 4 ++++
> include/net/bluetooth/hci_core.h | 1 +
> net/bluetooth/hci_core.c         | 3 +++
> net/bluetooth/hci_event.c        | 4 ++++
> 5 files changed, 21 insertions(+)
> 
> -- 
> 2.20.1
> 
> Bluetooth monitor ver 5.50
> = Note: Linux version 5.0.0+ (x86_64)                                  0.352340
> = Note: Bluetooth subsystem version 2.22                               0.352343
> = New Index: 80:C5:F2:8F:87:84 (Primary,USB,hci0)               [hci0] 0.352344
> = Open Index: 80:C5:F2:8F:87:84                                 [hci0] 0.352345
> = Index Info: 80:C5:F2:8F:87:84 (Qualcomm)                      [hci0] 0.352346
> @ MGMT Open: bluetoothd (privileged) version 1.14             {0x0001} 0.352347
> @ MGMT Open: btmon (privileged) version 1.14                  {0x0002} 0.352366
> @ MGMT Open: btmgmt (privileged) version 1.14                {0x0003} 27.302164
> @ MGMT Command: Start Discovery (0x0023) plen 1       {0x0003} [hci0] 27.302310
>        Address type: 0x06
>          LE Public
>          LE Random
> < HCI Command: LE Set Random Address (0x08|0x0005) plen 6   #1 [hci0] 27.302496
>        Address: 15:60:F2:91:B2:24 (Non-Resolvable)
>> HCI Event: Command Complete (0x0e) plen 4                 #2 [hci0] 27.419117
>      LE Set Random Address (0x08|0x0005) ncmd 1
>        Status: Success (0x00)
> < HCI Command: LE Set Scan Parameters (0x08|0x000b) plen 7  #3 [hci0] 27.419244
>        Type: Active (0x01)
>        Interval: 11.250 msec (0x0012)
>        Window: 11.250 msec (0x0012)
>        Own address type: Random (0x01)
>        Filter policy: Accept all advertisement (0x00)
>> HCI Event: Command Complete (0x0e) plen 4                 #4 [hci0] 27.420131
>      LE Set Random Address (0x08|0x0005) ncmd 1
>        Status: Success (0x00)

so we really need to ignore this command complete and not go ahead with the next command, especially since we only support one command at a time right now.

> < HCI Command: LE Set Scan Enable (0x08|0x000c) plen 2      #5 [hci0] 27.420259
>        Scanning: Enabled (0x01)
>        Filter duplicates: Enabled (0x01)
>> HCI Event: Command Complete (0x0e) plen 4                 #6 [hci0] 27.420969
>      LE Set Scan Parameters (0x08|0x000b) ncmd 1
>        Status: Success (0x00)

We need to wait for this command complete to arrive and only then continue with LE Set Scan Enable. We don’t need a quirk for it. Just add support for dealing with unexpected command complete opcodes, and print a big error if that happens.
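
As a rough illustration of that suggestion (a sketch in plain C, not the
kernel implementation), the handler below compares each command complete
opcode with the last command sent, logs and ignores a mismatch, and only
advances the queue on a match. struct host, send_next(),
on_cmd_complete() and replay_strict() are hypothetical names, and the
error message wording is made up for the example.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Opcodes as printed by btmon: (OGF 0x08 << 10) | OCF. */
#define OP_LE_SET_RANDOM_ADDR 0x2005
#define OP_LE_SET_SCAN_PARAM  0x200b
#define OP_LE_SET_SCAN_ENABLE 0x200c

static const uint16_t queue[] = {
	OP_LE_SET_RANDOM_ADDR, OP_LE_SET_SCAN_PARAM, OP_LE_SET_SCAN_ENABLE,
};
#define QUEUE_LEN (sizeof(queue) / sizeof(queue[0]))

/* Hypothetical model state; not the kernel's struct hci_dev. */
struct host {
	size_t next;            /* index of the next queued command */
	uint16_t sent_opcode;   /* opcode of the last command sent */
	int scan_params_acked;  /* set once LE_SET_SCAN_PARAM is completed */
	unsigned int ignored;   /* number of unexpected events dropped */
};

static void send_next(struct host *h)
{
	if (h->next < QUEUE_LEN)
		h->sent_opcode = queue[h->next++];
}

/* Returns 1 if the event was accepted, 0 if it was ignored. */
static int on_cmd_complete(struct host *h, uint16_t opcode)
{
	if (opcode != h->sent_opcode) {
		/* Unexpected opcode: complain loudly and keep waiting. */
		fprintf(stderr,
			"unexpected Command Complete 0x%04x, waiting for 0x%04x\n",
			(unsigned int)opcode, (unsigned int)h->sent_opcode);
		h->ignored++;
		return 0;
	}
	if (opcode == OP_LE_SET_SCAN_PARAM)
		h->scan_params_acked = 1;
	send_next(h);
	return 1;
}

/* Replay a list of Command Complete opcodes, return the final state. */
struct host replay_strict(const uint16_t *events, size_t n)
{
	struct host h = { 0, 0, 0, 0 };
	size_t i;

	send_next(&h);
	for (i = 0; i < n; i++)
		on_cmd_complete(&h, events[i]);
	return h;
}
```

With the event order from the log above (0x2005, 0x2005, 0x200b,
0x200c), the duplicate 0x2005 is dropped with one error line, and
LE_SET_SCAN_ENABLE is only sent after the LE_SET_SCAN_PARAM completion
has been accepted.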

>> HCI Event: Command Complete (0x0e) plen 4                 #7 [hci0] 27.421983
>      LE Set Scan Enable (0x08|0x000c) ncmd 1
>        Status: Success (0x00)
> @ MGMT Event: Command Complete (0x0001) plen 4        {0x0003} [hci0] 27.422059
>      Start Discovery (0x0023) plen 1
>        Status: Success (0x00)
>        Address type: 0x06
>          LE Public
>          LE Random

Regards

Marcel
João Paulo Rechi Vita April 24, 2019, 5:42 a.m. UTC | #2
Hello Marcel, thanks for the quick response.

On Wed, Apr 24, 2019 at 12:17 AM Marcel Holtmann <marcel@holtmann.org> wrote:
>
> Hi Joao Paulo,
>
> so can we get a fixed firmware from Qualcomm? Or at least some ROM patches for it?
>

That was my initial expectation as well -- maybe you can show the
problem to some Qualcomm contacts?

> > Bluetooth monitor ver 5.50
> > = Note: Linux version 5.0.0+ (x86_64)                                  0.352340
> > = Note: Bluetooth subsystem version 2.22                               0.352343
> > = New Index: 80:C5:F2:8F:87:84 (Primary,USB,hci0)               [hci0] 0.352344
> > = Open Index: 80:C5:F2:8F:87:84                                 [hci0] 0.352345
> > = Index Info: 80:C5:F2:8F:87:84 (Qualcomm)                      [hci0] 0.352346
> > @ MGMT Open: bluetoothd (privileged) version 1.14             {0x0001} 0.352347
> > @ MGMT Open: btmon (privileged) version 1.14                  {0x0002} 0.352366
> > @ MGMT Open: btmgmt (privileged) version 1.14                {0x0003} 27.302164
> > @ MGMT Command: Start Discovery (0x0023) plen 1       {0x0003} [hci0] 27.302310
> >        Address type: 0x06
> >          LE Public
> >          LE Random
> > < HCI Command: LE Set Random Address (0x08|0x0005) plen 6   #1 [hci0] 27.302496
> >        Address: 15:60:F2:91:B2:24 (Non-Resolvable)
> >> HCI Event: Command Complete (0x0e) plen 4                 #2 [hci0] 27.419117
> >      LE Set Random Address (0x08|0x0005) ncmd 1
> >        Status: Success (0x00)
> > < HCI Command: LE Set Scan Parameters (0x08|0x000b) plen 7  #3 [hci0] 27.419244
> >        Type: Active (0x01)
> >        Interval: 11.250 msec (0x0012)
> >        Window: 11.250 msec (0x0012)
> >        Own address type: Random (0x01)
> >        Filter policy: Accept all advertisement (0x00)
> >> HCI Event: Command Complete (0x0e) plen 4                 #4 [hci0] 27.420131
> >      LE Set Random Address (0x08|0x0005) ncmd 1
> >        Status: Success (0x00)
>
> so we really need to ignore this command complete and not go ahead with the next command, especially since we only support one command at a time right now.
>

Agreed.

> > < HCI Command: LE Set Scan Enable (0x08|0x000c) plen 2      #5 [hci0] 27.420259
> >        Scanning: Enabled (0x01)
> >        Filter duplicates: Enabled (0x01)
> >> HCI Event: Command Complete (0x0e) plen 4                 #6 [hci0] 27.420969
> >      LE Set Scan Parameters (0x08|0x000b) ncmd 1
> >        Status: Success (0x00)
>
> We need to wait for this command complete to arrive and only then continue with LE Set Scan Enable. We don’t need a quirk for it. Just add support for dealing with unexpected command complete opcodes, and print a big error if that happens.
>

Makes sense. I'm sending an updated version that ignores unexpected CC
events on all hardware. Looking at the code, it seems the only
exception is a CC event for HCI_OP_RESET on some CSR controllers,
which is handled in hci_req_cmd_complete, so I'm letting that flow
through.
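
The rule described above could be expressed as a small predicate along
these lines. This is only a sketch of the idea, not the updated patch:
cc_is_accepted() is a hypothetical helper name, and the reset opcode is
spelled out as OP_RESET with the value 0x0c03, which is
(OGF 0x03 << 10) | OCF 0x0003.

```c
#include <stdbool.h>
#include <stdint.h>

#define OP_RESET 0x0c03 /* HCI Reset: (OGF 0x03 << 10) | OCF 0x0003 */

/*
 * Hypothetical helper: decide whether a Command Complete event should
 * be processed, given the opcode of the last command that was sent.
 */
bool cc_is_accepted(uint16_t sent_opcode, uint16_t event_opcode)
{
	/* The normal case: the event answers the outstanding command. */
	if (event_opcode == sent_opcode)
		return true;

	/*
	 * Assumed exception from the text above: a reset completion is
	 * let through even when it does not match, everything else is
	 * treated as unexpected and ignored.
	 */
	return event_opcode == OP_RESET;
}
```

For example, with LE_SET_SCAN_PARAM (0x200b) outstanding, a completion
for 0x2005 is rejected, while completions for 0x200b or 0x0c03 are
accepted.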

--
João Paulo Rechi Vita
http://about.me/jprvita