diff mbox series

Bug report (with fix) for DEC Tulip driver (de2104x.c)

Message ID CAK-9enMxA68mRYFG=2zD02guvCqe-aa3NO0YZuJcTdBWn5MPqg@mail.gmail.com
State Changes Requested
Delegated to: David Miller
Headers show
Series Bug report (with fix) for DEC Tulip driver (de2104x.c) | expand

Commit Message

Arlie Davis Sept. 16, 2019, 9:50 p.m. UTC
Hello. I'm a developer on GCE, Google's virtual machine platform. As
part of my work, we needed to emulate a DEC Tulip 2104x NIC, so I
implemented a basic virtual device for it.

While doing so, I believe I found a bug in the Linux driver for this
device, in de2104x.c. I see in MAINTAINERS that this is an orphaned
device driver, but I was wondering if the kernel would still accept a
patch for it.  Should I submit this patch, and if so, where should I
submit it?

Below is the commit text from my local repo, and the patch diffs
(they're quite short).

    Fix a bug in DEC Tulip driver (de2104x.c)

    The DEC Tulip Ethernet controller uses a 16-byte transfer descriptor for
    both its transmit (tx) and receive (rx) rings. Each descriptor has a
    "status" uint32 field (called opts1 in de2104x.c, and called TDES0 /
    Status in the DEC hardware specifications) and a "control" field (called
    opts2 in de2104x.c and called TDES1 / Control in the DEC
    specifications). In the "control" field, bit 30 is the LastSegment bit,
    which indicates that this is the last transfer descriptor in a sequence
    of descriptors (in case a single Ethernet frame spans more than one
    descriptor).

    The de2104x driver correctly sets LastSegment, in the de_start_xmit
    function. (The code calls it LastFrag, not LastSegment). However, in the
    interrupt handler (in function de_tx), the driver incorrectly checks for
    this bit in the status field, not the control field. This means that the
    driver is reading bits that are undefined in the specification; the
    spec does not make any guarantees at all about the contents of bits 29
    and bits 30 in the "status" field.

    The effect of the bug is that the driver may think that a TX ring entry
    is never finished, even though a compliant DEC Tulip hardware device (or
    a virtualized device, in a VM) actually did finish sending the Ethernet
    frame.

    The fix is to read the correct "control" field from the TX descriptor.

    DEC Tulip programming specification:

    https://web.archive.org/web/20050805091751/http://www.intel.com/design/network/manuals/21140ahm.pdf

    See section 4.2.2 for the specs on the transfer descriptor.

Here's my patch that fixes it:

Comments

Andrew Lunn Sept. 17, 2019, 9:28 p.m. UTC | #1
On Mon, Sep 16, 2019 at 02:50:53PM -0700, Arlie Davis wrote:
> Hello. I'm a developer on GCE, Google's virtual machine platform. As
> part of my work, we needed to emulate a DEC Tulip 2104x NIC, so I
> implemented a basic virtual device for it.
> 
> While doing so, I believe I found a bug in the Linux driver for this
> device, in de2104x.c. I see in MAINTAINERS that this is an orphaned
> device driver, but I was wondering if the kernel would still accept a
> patch for it.  Should I submit this patch, and if so, where should I
> submit it?
> 
> Below is the commit text from my local repo, and the patch diffs
> (they're quite short).
> 
>     Fix a bug in DEC Tulip driver (de2104x.c)
> 
>     The DEC Tulip Ethernet controller uses a 16-byte transfer descriptor for
>     both its transmit (tx) and receive (rx) rings. Each descriptor has a
>     "status" uint32 field (called opts1 in de2104x.c, and called TDES0 /
>     Status in the DEC hardware specifications) and a "control" field (called
>     opts2 in de2104x.c and called TDES1 / Control in the DEC
>     specifications). In the "control" field, bit 30 is the LastSegment bit,
>     which indicates that this is the last transfer descriptor in a sequence
>     of descriptors (in case a single Ethernet frame spans more than one
>     descriptor).
> 
>     The de2104x driver correctly sets LastSegment, in the de_start_xmit
>     function. (The code calls it LastFrag, not LastSegment). However, in the
>     interrupt handler (in function de_tx), the driver incorrectly checks for
>     this bit in the status field, not the control field. This means that the
>     driver is reading bits that are undefined in the specification; the
>     spec does not make any guarantees at all about the contents of bits 29
>     and bits 30 in the "status" field.
> 
>     The effect of the bug is that the driver may think that a TX ring entry
>     is never finished, even though a compliant DEC Tulip hardware device (or
>     a virtualized device, in a VM) actually did finish sending the Ethernet
>     frame.
> 
>     The fix is to read the correct "control" field from the TX descriptor.
> 
>     DEC Tulip programming specification:
> 
>     https://web.archive.org/web/20050805091751/http://www.intel.com/design/network/manuals/21140ahm.pdf

Hi Arlie

Without having access to real hardware, it is hard to verify
this. Maybe the programming specification is wrong? It could be, the
hardware designer thought the control field should be write only from
the CPU side, and the status field read only from the CPU side, to
avoid race conditions. So in practice it does mirror the LastSegment
bit from control to status?

Are there any other emulators of this out there? Any silicon vendor
who produces devices which claim to be compatible?

    Andrew
Arlie Davis Sept. 17, 2019, 9:36 p.m. UTC | #2
I checked QEMU v3.1, and I don't see any Tulip implementation for it.
I haven't checked any other emulators. A few cursory searches didn't
turn up anything.

I checked the FreeBSD driver for the same device. It just treats the
control field as a write-only field; the driver just uses its own
internal state, rather than reading anything from the transfer
descriptor, aside from the relevant status bits.

My guess is that the hardware just always sets bit 30 = 1, in the
status field. In this Linux driver, for a normal packet (non-SETUP,
non-DUMMY), all packets use a single TX descriptor, so LastFrag=1 is
always true. Because of this, I considered changing the Linux driver
to just remove the "if (status & LastFrag)" check, and make it
unconditional, since this driver never uses more than 1 descriptor per
transmitted packet. Would you support that change?

Likewise, I'm at a loss for testing with real hardware. It's hard to
find such things, now.

On Tue, Sep 17, 2019 at 2:28 PM Andrew Lunn <andrew@lunn.ch> wrote:
>
> On Mon, Sep 16, 2019 at 02:50:53PM -0700, Arlie Davis wrote:
> > Hello. I'm a developer on GCE, Google's virtual machine platform. As
> > part of my work, we needed to emulate a DEC Tulip 2104x NIC, so I
> > implemented a basic virtual device for it.
> >
> > While doing so, I believe I found a bug in the Linux driver for this
> > device, in de2104x.c. I see in MAINTAINERS that this is an orphaned
> > device driver, but I was wondering if the kernel would still accept a
> > patch for it.  Should I submit this patch, and if so, where should I
> > submit it?
> >
> > Below is the commit text from my local repo, and the patch diffs
> > (they're quite short).
> >
> >     Fix a bug in DEC Tulip driver (de2104x.c)
> >
> >     The DEC Tulip Ethernet controller uses a 16-byte transfer descriptor for
> >     both its transmit (tx) and receive (rx) rings. Each descriptor has a
> >     "status" uint32 field (called opts1 in de2104x.c, and called TDES0 /
> >     Status in the DEC hardware specifications) and a "control" field (called
> >     opts2 in de2104x.c and called TDES1 / Control in the DEC
> >     specifications). In the "control" field, bit 30 is the LastSegment bit,
> >     which indicates that this is the last transfer descriptor in a sequence
> >     of descriptors (in case a single Ethernet frame spans more than one
> >     descriptor).
> >
> >     The de2104x driver correctly sets LastSegment, in the de_start_xmit
> >     function. (The code calls it LastFrag, not LastSegment). However, in the
> >     interrupt handler (in function de_tx), the driver incorrectly checks for
> >     this bit in the status field, not the control field. This means that the
> >     driver is reading bits that are undefined in the specification; the
> >     spec does not make any guarantees at all about the contents of bits 29
> >     and bits 30 in the "status" field.
> >
> >     The effect of the bug is that the driver may think that a TX ring entry
> >     is never finished, even though a compliant DEC Tulip hardware device (or
> >     a virtualized device, in a VM) actually did finish sending the Ethernet
> >     frame.
> >
> >     The fix is to read the correct "control" field from the TX descriptor.
> >
> >     DEC Tulip programming specification:
> >
> >     https://web.archive.org/web/20050805091751/http://www.intel.com/design/network/manuals/21140ahm.pdf
>
> Hi Arlie
>
> Without having access to real hardware, it is hard to verify
> this. Maybe the programming specification is wrong? It could be, the
> hardware designer thought the control field should be write only from
> the CPU side, and the status field read only from the CPU side, to
> avoid race conditions. So in practice it does mirror the LastSegment
> bit from control to status?
>
> Are there any other emulators of this out there? Any silicon vendor
> who produces devices which claim to be compatible?
>
>     Andrew
John David Anglin Sept. 17, 2019, 10:51 p.m. UTC | #3
On 2019-09-17 5:36 p.m., Arlie Davis wrote:
> Likewise, I'm at a loss for testing with real hardware. It's hard to
> find such things, now.
How does de2104x compare to ds2142/43?  I have a c3750 with ds2142/43 tulip.  Helge
or some others might have a machine with a de2104x.

Dave
Helge Deller Sept. 18, 2019, 5:56 a.m. UTC | #4
On 18.09.19 00:51, John David Anglin wrote:
> On 2019-09-17 5:36 p.m., Arlie Davis wrote:
>> Likewise, I'm at a loss for testing with real hardware. It's hard to
>> find such things, now.
> How does de2104x compare to ds2142/43?  I have a c3750 with ds2142/43 tulip.  Helge
> or some others might have a machine with a de2104x.

The machines we could test are
* a C240 with a DS21140 tulip chip (Sven has one),
* a C3000 or similiar with DS21142 and/or DS21143 (me).

If the patch does not show any regressions, I'd suggest to
apply it upstream.

Helge
Thomas Bogendoerfer Sept. 18, 2019, 1:27 p.m. UTC | #5
On Wed, Sep 18, 2019 at 07:56:16AM +0200, Helge Deller wrote:
> On 18.09.19 00:51, John David Anglin wrote:
> > On 2019-09-17 5:36 p.m., Arlie Davis wrote:
> >> Likewise, I'm at a loss for testing with real hardware. It's hard to
> >> find such things, now.
> > How does de2104x compare to ds2142/43?  I have a c3750 with ds2142/43 tulip.  Helge
> > or some others might have a machine with a de2104x.
> 
> The machines we could test are
> * a C240 with a DS21140 tulip chip (Sven has one),
> * a C3000 or similiar with DS21142 and/or DS21143 (me).
> 
> If the patch does not show any regressions, I'd suggest to
> apply it upstream.

2114x chips use a different driver, so it won't help here.

Thomas.
Sven Schnelle Sept. 19, 2019, 8:31 p.m. UTC | #6
Hi,

On Wed, Sep 18, 2019 at 07:56:16AM +0200, Helge Deller wrote:
> On 18.09.19 00:51, John David Anglin wrote:
> > On 2019-09-17 5:36 p.m., Arlie Davis wrote:
> >> Likewise, I'm at a loss for testing with real hardware. It's hard to
> >> find such things, now.
> > How does de2104x compare to ds2142/43?  I have a c3750 with ds2142/43 tulip.  Helge
> > or some others might have a machine with a de2104x.
> 
> The machines we could test are
> * a C240 with a DS21140 tulip chip (Sven has one),

My C240 identifies as C240+:

[    0.000000] model 9000/782/C240+

which has a 21143 (verified by looking at the board):

root@c240:/# lspci -kvvnn -s 00:14.0
00:14.0 Ethernet controller [0200]: Digital Equipment Corporation DECchip 21142/43 [1011:0019] (rev 30)
	Subsystem: Hewlett-Packard Company DECchip 21142/43 [103c:104f]
	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV+ VGASnoop- ParErr+ Stepping- SERR+ FastB2B- DisINTx-
	Status: Cap- 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 255 (5000ns min, 10000ns max), Cache Line Size: 32 bytes
	Interrupt: pin A routed to IRQ 23
	Region 0: I/O ports at 0080 [size=128]
	Region 1: Memory at f2802000 (32-bit, non-prefetchable) [size=128]
	Expansion ROM at f2f80000 [disabled] [size=256K]
	Kernel driver in use: tulip

Regards
Sven
Thomas Bogendoerfer Sept. 20, 2019, 10:43 a.m. UTC | #7
On Mon, Sep 16, 2019 at 02:50:53PM -0700, Arlie Davis wrote:
>     See section 4.2.2 for the specs on the transfer descriptor.
> 
> Here's my patch that fixes it:
> 
> diff --git a/drivers/net/ethernet/dec/tulip/de2104x.c
> b/drivers/net/ethernet/dec/tulip/de2104x.c
> index f1a2da15dd0a..3a420ceb52e5 100644
> --- a/drivers/net/ethernet/dec/tulip/de2104x.c
> +++ b/drivers/net/ethernet/dec/tulip/de2104x.c
> @@ -545,6 +545,7 @@ static void de_tx (struct de_private *de)
>         while (tx_tail != tx_head) {
>                 struct sk_buff *skb;
>                 u32 status;
> +               u32 control;
> 
>                 rmb();
>                 status = le32_to_cpu(de->tx_ring[tx_tail].opts1);
> @@ -565,7 +566,8 @@ static void de_tx (struct de_private *de)
>                 pci_unmap_single(de->pdev, de->tx_skb[tx_tail].mapping,
>                                  skb->len, PCI_DMA_TODEVICE);
> 
> -               if (status & LastFrag) {
> +               control = le32_to_cpu(de->tx_ring[tx_tail].opts2);
> +               if (control & LastFrag) {

how about just remove the complete if ? We know that we always
use one descriptor per packet and chip doesn't touch control
field. So I see no reason to check it here. Tulip driver for
2114x cards doesn't check it neither.

Thomas.
Maciej W. Rozycki Oct. 3, 2019, 1:29 a.m. UTC | #8
On Wed, 18 Sep 2019, Thomas Bogendoerfer wrote:

> > >> Likewise, I'm at a loss for testing with real hardware. It's hard to
> > >> find such things, now.
> > > How does de2104x compare to ds2142/43?  I have a c3750 with ds2142/43 tulip.  Helge
> > > or some others might have a machine with a de2104x.
> > 
> > The machines we could test are
> > * a C240 with a DS21140 tulip chip (Sven has one),
> > * a C3000 or similiar with DS21142 and/or DS21143 (me).
> > 
> > If the patch does not show any regressions, I'd suggest to
> > apply it upstream.
> 
> 2114x chips use a different driver, so it won't help here.

 Asking at `linux-alpha' (cc-ed) might help; these chips used to be 
ubiquitous with older Alpha systems, so someone subscribed there might be 
able to step in and help right away.  Also testing with an Alpha always 
has the advantage of exposing any weak ordering issues.

 Myself I have an AS 300 (or AS 250 really as I suspect a mismatch between 
the enclosure and the MB; the two systems are almost identical anyway) and 
it does have a real 21040 chip on its riser I/O module.  However I have 
never got to setting up Linux on that machine and it may take me a bit to 
get it running suitably to get any verification done I'm afraid.

 NB for the original 21040 part "DECchip 21040 Ethernet LAN Controller for 
PCI Hardware Reference Manual", Order Number: EC-N0752-72, available here:
<ftp://ftp.netbsd.org/pub/NetBSD/misc/dec-docs/ec-n0752-72.ps.gz> is 
probably more relevant, although in the area concerned here it seems the 
same.

 Finally I don't expect any race condition in possibly examining control 
bits in the transmit interrupt handler as this is what the descriptor 
ownership bit guards against -- only when a descriptor is owned by the 
host accesses from the CPU side are allowed, and then it is safe to fiddle 
with any field.

  Maciej
diff mbox series

Patch

diff --git a/drivers/net/ethernet/dec/tulip/de2104x.c
b/drivers/net/ethernet/dec/tulip/de2104x.c
index f1a2da15dd0a..3a420ceb52e5 100644
--- a/drivers/net/ethernet/dec/tulip/de2104x.c
+++ b/drivers/net/ethernet/dec/tulip/de2104x.c
@@ -545,6 +545,7 @@  static void de_tx (struct de_private *de)
        while (tx_tail != tx_head) {
                struct sk_buff *skb;
                u32 status;
+               u32 control;

                rmb();
                status = le32_to_cpu(de->tx_ring[tx_tail].opts1);
@@ -565,7 +566,8 @@  static void de_tx (struct de_private *de)
                pci_unmap_single(de->pdev, de->tx_skb[tx_tail].mapping,
                                 skb->len, PCI_DMA_TODEVICE);

-               if (status & LastFrag) {
+               control = le32_to_cpu(de->tx_ring[tx_tail].opts2);
+               if (control & LastFrag) {
                        if (status & TxError) {
                                netif_dbg(de, tx_err, de->dev,
                                          "tx err, status 0x%x\n",