Message ID | 1284842625-13920-1-git-send-email-stefanha@linux.vnet.ibm.com |
---|---|
State | New |
Headers | show |
Hi, Stefan Hajnoczi a écrit : > The OpenIndiana (Solaris) e1000g driver drops frames that are too long > or too short. It expects to receive frames of at least the Ethernet > minimum size. ARP requests in particular are small and will be dropped > if they are not padded appropriately, preventing a Solaris VM from > becoming visible on the network. > > Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> > --- > hw/e1000.c | 10 ++++++++++ > 1 files changed, 10 insertions(+), 0 deletions(-) > > [...] Another patch creating ARP replies at least 64 bytes long has been committed: http://git.savannah.gnu.org/cgit/qemu.git/commit/?id=dbf3c4b4baceb91eb64d09f787cbe92d65188813 Does it fix your issue? Hervé
On Sat, Sep 18, 2010 at 9:57 PM, Hervé Poussineau <hpoussin@reactos.org> wrote: > Another patch creating ARP replies at least 64 bytes long has been > committed: > http://git.savannah.gnu.org/cgit/qemu.git/commit/?id=dbf3c4b4baceb91eb64d09f787cbe92d65188813 > > Does it fix your issue? No I don't think so. This is an e1000 issue, it will happen if you use tap networking too. The commit you linked to only affects slirp and pads its ARP code. I think there are two places where the minimum frame length can be enforced: 1. The NIC emulation code. This is currently how rtl8139, pcnet, and ne2000 do it. My patch adds the same for e1000. 2. The net layer. If we're emulating Ethernet then it would be possible to pad to minimum frame length in common networking code (net.c). Stefan
On Sat, Sep 18, 2010 at 09:43:45PM +0100, Stefan Hajnoczi wrote: > The OpenIndiana (Solaris) e1000g driver drops frames that are too long > or too short. It expects to receive frames of at least the Ethernet > minimum size. ARP requests in particular are small and will be dropped > if they are not padded appropriately, preventing a Solaris VM from > becoming visible on the network. > > Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> > --- > hw/e1000.c | 10 ++++++++++ > 1 files changed, 10 insertions(+), 0 deletions(-) > > diff --git a/hw/e1000.c b/hw/e1000.c > index 7d7d140..bc983f9 100644 > --- a/hw/e1000.c > +++ b/hw/e1000.c > @@ -55,6 +55,7 @@ static int debugflags = DBGBIT(TXERR) | DBGBIT(GENERAL); > > #define IOPORT_SIZE 0x40 > #define PNPMMIO_SIZE 0x20000 > +#define MIN_BUF_SIZE 60 > > /* > * HW models: > @@ -635,10 +636,19 @@ e1000_receive(VLANClientState *nc, const uint8_t *buf, size_t size) > uint32_t rdh_start; > uint16_t vlan_special = 0; > uint8_t vlan_status = 0, vlan_offset = 0; > + uint8_t min_buf[MIN_BUF_SIZE]; > > if (!(s->mac_reg[RCTL] & E1000_RCTL_EN)) > return -1; > > + /* Pad to minimum Ethernet frame length */ > + if (size < sizeof(min_buf)) { > + memcpy(min_buf, buf, size); > + memset(&min_buf[size], 0, sizeof(min_buf) - size); > + buf = min_buf; > + size = sizeof(min_buf); > + } > + Hi, This doesn't look right. AFAIK, MAC's dont pad on receive. IMO this kind of padding should somehow be done by the bridge that forwards packets into the qemu vlan (e.g slirp or the generic tap bridge). Cheers
On Sat, Sep 18, 2010 at 10:27 PM, Edgar E. Iglesias <edgar.iglesias@gmail.com> wrote: > This doesn't look right. AFAIK, MAC's dont pad on receive. I agree. NICs that do padding will do it on transmit, not receive. Anything coming in on the wire should already have the minimum length. In QEMU that isn't true today and that's why rtl8139, pcnet, and ne2000 already do this same padding. This patch is the smallest change to cover e1000. > IMO this kind of padding should somehow be done by the bridge that forwards > packets into the qemu vlan (e.g slirp or the generic tap bridge). That should work and we can then drop the padding code from existing NICs. I'll take a look. Stefan
On Sun, Sep 19, 2010 at 07:36:51AM +0100, Stefan Hajnoczi wrote: > On Sat, Sep 18, 2010 at 10:27 PM, Edgar E. Iglesias > <edgar.iglesias@gmail.com> wrote: > > This doesn't look right. AFAIK, MAC's dont pad on receive. > > I agree. NICs that do padding will do it on transmit, not receive. > Anything coming in on the wire should already have the minimum length. > > In QEMU that isn't true today and that's why rtl8139, pcnet, and > ne2000 already do this same padding. This patch is the smallest > change to cover e1000. > > > IMO this kind of padding should somehow be done by the bridge that forwards > > packets into the qemu vlan (e.g slirp or the generic tap bridge). > > That should work and we can then drop the padding code from existing > NICs. I'll take a look. > > Stefan Not all nic devices have to be emulate ethernet, so not all devices want the padding, e.g. virtio does not. It's also easy to imagine an ethernet device that strips the padding: would be silly to add it just to have it stripped. If we really want to do this generically, we could implement a function dealing with the padding, and call it from relevant devices. But I think the patch looks fine as is.
On Sun, Sep 19, 2010 at 02:04:55PM +0200, Edgar E. Iglesias wrote: > On Sun, Sep 19, 2010 at 01:18:01PM +0200, Michael S. Tsirkin wrote: > > On Sun, Sep 19, 2010 at 07:36:51AM +0100, Stefan Hajnoczi wrote: > > > On Sat, Sep 18, 2010 at 10:27 PM, Edgar E. Iglesias > > > <edgar.iglesias@gmail.com> wrote: > > > > This doesn't look right. AFAIK, MAC's dont pad on receive. > > > > > > I agree. NICs that do padding will do it on transmit, not receive. > > > Anything coming in on the wire should already have the minimum length. > > > > > > In QEMU that isn't true today and that's why rtl8139, pcnet, and > > > ne2000 already do this same padding. This patch is the smallest > > > change to cover e1000. > > > > > > > IMO this kind of padding should somehow be done by the bridge that forwards > > > > packets into the qemu vlan (e.g slirp or the generic tap bridge). > > > > > > That should work and we can then drop the padding code from existing > > > NICs. I'll take a look. > > > > > > Stefan > > > > Not all nic devices have to be emulate ethernet, so not all devices want > > the padding, e.g. virtio does not. > > Right, ethernet behaviour should obviously not be applied unconditionally for > all net devices. > > > > It's also easy to imagine an > > ethernet device that strips the padding: would be silly to add it > > just to have it stripped. > > I dont beleive that is possible. The FCS comes last, so an ethernet MAC > would have to do really silly things to differentiate between padding and > real payload. > > > If we really want to do this generically, we could implement a function dealing > > with the padding, and call it from relevant devices. > > Another way is to have network devices register their link types so that the > generic bridge can apply whatever link specific fixups that may be needed. Well, we want to move away from using the generic bridge and to -netdev pairing of front/backend, anyway. Adding code there will just complicate that. > I would prefer to have the padding of bridged frames decoupled from the > device models, but I cant say I feel very strongly about this. > > Cheers
On Sun, Sep 19, 2010 at 01:18:01PM +0200, Michael S. Tsirkin wrote: > On Sun, Sep 19, 2010 at 07:36:51AM +0100, Stefan Hajnoczi wrote: > > On Sat, Sep 18, 2010 at 10:27 PM, Edgar E. Iglesias > > <edgar.iglesias@gmail.com> wrote: > > > This doesn't look right. AFAIK, MAC's dont pad on receive. > > > > I agree. NICs that do padding will do it on transmit, not receive. > > Anything coming in on the wire should already have the minimum length. > > > > In QEMU that isn't true today and that's why rtl8139, pcnet, and > > ne2000 already do this same padding. This patch is the smallest > > change to cover e1000. > > > > > IMO this kind of padding should somehow be done by the bridge that forwards > > > packets into the qemu vlan (e.g slirp or the generic tap bridge). > > > > That should work and we can then drop the padding code from existing > > NICs. I'll take a look. > > > > Stefan > > Not all nic devices have to be emulate ethernet, so not all devices want > the padding, e.g. virtio does not. Right, ethernet behaviour should obviously not be applied unconditionally for all net devices. > It's also easy to imagine an > ethernet device that strips the padding: would be silly to add it > just to have it stripped. I dont beleive that is possible. The FCS comes last, so an ethernet MAC would have to do really silly things to differentiate between padding and real payload. > If we really want to do this generically, we could implement a function dealing > with the padding, and call it from relevant devices. Another way is to have network devices register their link types so that the generic bridge can apply whatever link specific fixups that may be needed. I would prefer to have the padding of bridged frames decoupled from the device models, but I cant say I feel very strongly about this. Cheers
Am 18.09.2010 23:12, schrieb Stefan Hajnoczi: > On Sat, Sep 18, 2010 at 9:57 PM, Hervé Poussineau <hpoussin@reactos.org> wrote: >> Another patch creating ARP replies at least 64 bytes long has been >> committed: >> http://git.savannah.gnu.org/cgit/qemu.git/commit/?id=dbf3c4b4baceb91eb64d09f787cbe92d65188813 >> >> Does it fix your issue? > > No I don't think so. This is an e1000 issue, it will happen if you > use tap networking too. The commit you linked to only affects slirp > and pads its ARP code. > > I think there are two places where the minimum frame length can be enforced: > 1. The NIC emulation code. This is currently how rtl8139, pcnet, and > ne2000 do it. My patch adds the same for e1000. > 2. The net layer. If we're emulating Ethernet then it would be > possible to pad to minimum frame length in common networking code > (net.c). 3. The sender. I think it should be the sender's decision which packet he sends and there's no reason to manipulate it on its way to the guest. If the sender sends too short packets, this is where the bug is. Actually, instead of padding the packet we should already drop it in the device model if RCTL.SBP = 0. Does a real Solaris work when it receives the same packet? On the other hand, it seems that we're missing the padding where it actually belongs: when sending packets with TCTL.PSP = 1. Did you send the ARP packet from another qemu instance? If so, this might be the real reason. Kevin
Am 19.09.2010 08:36, schrieb Stefan Hajnoczi: > On Sat, Sep 18, 2010 at 10:27 PM, Edgar E. Iglesias > <edgar.iglesias@gmail.com> wrote: >> This doesn't look right. AFAIK, MAC's dont pad on receive. > > I agree. NICs that do padding will do it on transmit, not receive. > Anything coming in on the wire should already have the minimum length. > > In QEMU that isn't true today and that's why rtl8139, pcnet, and > ne2000 already do this same padding. This patch is the smallest > change to cover e1000. What's the reason that it isn't true in QEMU today? Shouldn't we fix these problems rather than making device emulations incorrect to compensate for it? Kevin
On Mon, Sep 20, 2010 at 10:42:31AM +0200, Kevin Wolf wrote: > Am 18.09.2010 23:12, schrieb Stefan Hajnoczi: > > On Sat, Sep 18, 2010 at 9:57 PM, Hervé Poussineau <hpoussin@reactos.org> wrote: > >> Another patch creating ARP replies at least 64 bytes long has been > >> committed: > >> http://git.savannah.gnu.org/cgit/qemu.git/commit/?id=dbf3c4b4baceb91eb64d09f787cbe92d65188813 > >> > >> Does it fix your issue? > > > > No I don't think so. This is an e1000 issue, it will happen if you > > use tap networking too. The commit you linked to only affects slirp > > and pads its ARP code. > > > > I think there are two places where the minimum frame length can be enforced: > > 1. The NIC emulation code. This is currently how rtl8139, pcnet, and > > ne2000 do it. My patch adds the same for e1000. > > 2. The net layer. If we're emulating Ethernet then it would be > > possible to pad to minimum frame length in common networking code > > (net.c). > > 3. The sender. I think it should be the sender's decision which packet > he sends and there's no reason to manipulate it on its way to the guest. > If the sender sends too short packets, this is where the bug is. Yes, but when using tap, the ethernet sender is QEMU itself. Tap doesn't have the same requirements as ethernet so the original sender has no reason to pad. Internally in QEMU, there is code that picks up tap packets and forwards them to the emulated ethernet links, this is were padding should be done IMO. Not in the device models receive path. The bridge that forwards frames from tap into emulated links must also handle different kinds of link types, as all emulated network devices are not necessarily ethernet. Cheers
On Mon, Sep 20, 2010 at 10:50:40AM +0200, Kevin Wolf wrote: > Am 19.09.2010 08:36, schrieb Stefan Hajnoczi: > > On Sat, Sep 18, 2010 at 10:27 PM, Edgar E. Iglesias > > <edgar.iglesias@gmail.com> wrote: > >> This doesn't look right. AFAIK, MAC's dont pad on receive. > > > > I agree. NICs that do padding will do it on transmit, not receive. > > Anything coming in on the wire should already have the minimum length. > > > > In QEMU that isn't true today and that's why rtl8139, pcnet, and > > ne2000 already do this same padding. This patch is the smallest > > change to cover e1000. > > What's the reason that it isn't true in QEMU today? Shouldn't we fix > these problems rather than making device emulations incorrect to > compensate for it? Yes we should, I agree. Cheers
On Mon, Sep 20, 2010 at 10:42:31AM +0200, Kevin Wolf wrote: > Am 18.09.2010 23:12, schrieb Stefan Hajnoczi: > > On Sat, Sep 18, 2010 at 9:57 PM, Hervé Poussineau <hpoussin@reactos.org> wrote: > >> Another patch creating ARP replies at least 64 bytes long has been > >> committed: > >> http://git.savannah.gnu.org/cgit/qemu.git/commit/?id=dbf3c4b4baceb91eb64d09f787cbe92d65188813 > >> > >> Does it fix your issue? > > > > No I don't think so. This is an e1000 issue, it will happen if you > > use tap networking too. The commit you linked to only affects slirp > > and pads its ARP code. > > > > I think there are two places where the minimum frame length can be enforced: > > 1. The NIC emulation code. This is currently how rtl8139, pcnet, and > > ne2000 do it. My patch adds the same for e1000. > > 2. The net layer. If we're emulating Ethernet then it would be > > possible to pad to minimum frame length in common networking code > > (net.c). > > 3. The sender. I think it should be the sender's decision which packet > he sends and there's no reason to manipulate it on its way to the guest. > If the sender sends too short packets, this is where the bug is. > > Actually, instead of padding the packet we should already drop it in the > device model if RCTL.SBP = 0. Does a real Solaris work when it receives > the same packet? > > On the other hand, it seems that we're missing the padding where it > actually belongs: when sending packets with TCTL.PSP = 1. Did you send > the ARP packet from another qemu instance? If so, this might be the real > reason. No, I brought up the tap interface on the host and tried to ping the guest directly. This caused the host to send an ARP request over the tap interface. It was not padded. Perhaps Linux expects the NIC to deal with padding transmit frames but tap does not do this. That's just my theory though from the experience that real NICs often do pad to minimum size. Does anyone know the definitive answer? Stefan
On Mon, Sep 20, 2010 at 11:03:37AM +0200, Edgar E. Iglesias wrote: > On Mon, Sep 20, 2010 at 10:50:40AM +0200, Kevin Wolf wrote: > > Am 19.09.2010 08:36, schrieb Stefan Hajnoczi: > > > On Sat, Sep 18, 2010 at 10:27 PM, Edgar E. Iglesias > > > <edgar.iglesias@gmail.com> wrote: > > >> This doesn't look right. AFAIK, MAC's dont pad on receive. > > > > > > I agree. NICs that do padding will do it on transmit, not receive. > > > Anything coming in on the wire should already have the minimum length. > > > > > > In QEMU that isn't true today and that's why rtl8139, pcnet, and > > > ne2000 already do this same padding. This patch is the smallest > > > change to cover e1000. > > > > What's the reason that it isn't true in QEMU today? Shouldn't we fix > > these problems rather than making device emulations incorrect to > > compensate for it? > > Yes we should, I agree. Does someone with more knowledge for QEMU networking than me want to take a stab at it? Stefan
On Sun, Sep 19, 2010 at 07:36:51AM +0100, Stefan Hajnoczi wrote: > On Sat, Sep 18, 2010 at 10:27 PM, Edgar E. Iglesias > <edgar.iglesias@gmail.com> wrote: > > This doesn't look right. AFAIK, MAC's dont pad on receive. > > I agree. NICs that do padding will do it on transmit, not receive. > Anything coming in on the wire should already have the minimum length. QEMU never gets access to the wire. Our APIs do not really pass complete ethernet packets: we forward packets without checksum and padding. I think it makes complete sense to keep this and handle padding in devices because we have devices that pass the frame to guest without padding and checksum. It should be easy to replace padding code in devices that need it with some kind of macro. > In QEMU that isn't true today and that's why rtl8139, pcnet, and > ne2000 already do this same padding. This patch is the smallest > change to cover e1000. > > > IMO this kind of padding should somehow be done by the bridge that forwards > > packets into the qemu vlan (e.g slirp or the generic tap bridge). > > That should work and we can then drop the padding code from existing > NICs. I'll take a look. > > Stefan
On Sat, Sep 18, 2010 at 09:43:45PM +0100, Stefan Hajnoczi wrote: > The OpenIndiana (Solaris) e1000g driver drops frames that are too long > or too short. It expects to receive frames of at least the Ethernet > minimum size. ARP requests in particular are small and will be dropped > if they are not padded appropriately, preventing a Solaris VM from > becoming visible on the network. > > Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> I'll put this patch on my tree: let's be consistent and fix e1000, it is also a good approach for 0.13 I think. Anthony - could you pick this up for 0.13 please? If someone wants to then strip it off from all devices and tweak, it can be done. > --- > hw/e1000.c | 10 ++++++++++ > 1 files changed, 10 insertions(+), 0 deletions(-) > > diff --git a/hw/e1000.c b/hw/e1000.c > index 7d7d140..bc983f9 100644 > --- a/hw/e1000.c > +++ b/hw/e1000.c > @@ -55,6 +55,7 @@ static int debugflags = DBGBIT(TXERR) | DBGBIT(GENERAL); > > #define IOPORT_SIZE 0x40 > #define PNPMMIO_SIZE 0x20000 > +#define MIN_BUF_SIZE 60 > > /* > * HW models: > @@ -635,10 +636,19 @@ e1000_receive(VLANClientState *nc, const uint8_t *buf, size_t size) > uint32_t rdh_start; > uint16_t vlan_special = 0; > uint8_t vlan_status = 0, vlan_offset = 0; > + uint8_t min_buf[MIN_BUF_SIZE]; > > if (!(s->mac_reg[RCTL] & E1000_RCTL_EN)) > return -1; > > + /* Pad to minimum Ethernet frame length */ > + if (size < sizeof(min_buf)) { > + memcpy(min_buf, buf, size); > + memset(&min_buf[size], 0, sizeof(min_buf) - size); > + buf = min_buf; > + size = sizeof(min_buf); > + } > + > if (size > s->rxbuf_size) { > DBGOUT(RX, "packet too large for buffers (%lu > %d)\n", > (unsigned long)size, s->rxbuf_size); > -- > 1.7.1 >
diff --git a/hw/e1000.c b/hw/e1000.c index 7d7d140..bc983f9 100644 --- a/hw/e1000.c +++ b/hw/e1000.c @@ -55,6 +55,7 @@ static int debugflags = DBGBIT(TXERR) | DBGBIT(GENERAL); #define IOPORT_SIZE 0x40 #define PNPMMIO_SIZE 0x20000 +#define MIN_BUF_SIZE 60 /* * HW models: @@ -635,10 +636,19 @@ e1000_receive(VLANClientState *nc, const uint8_t *buf, size_t size) uint32_t rdh_start; uint16_t vlan_special = 0; uint8_t vlan_status = 0, vlan_offset = 0; + uint8_t min_buf[MIN_BUF_SIZE]; if (!(s->mac_reg[RCTL] & E1000_RCTL_EN)) return -1; + /* Pad to minimum Ethernet frame length */ + if (size < sizeof(min_buf)) { + memcpy(min_buf, buf, size); + memset(&min_buf[size], 0, sizeof(min_buf) - size); + buf = min_buf; + size = sizeof(min_buf); + } + if (size > s->rxbuf_size) { DBGOUT(RX, "packet too large for buffers (%lu > %d)\n", (unsigned long)size, s->rxbuf_size);
The OpenIndiana (Solaris) e1000g driver drops frames that are too long or too short. It expects to receive frames of at least the Ethernet minimum size. ARP requests in particular are small and will be dropped if they are not padded appropriately, preventing a Solaris VM from becoming visible on the network. Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> --- hw/e1000.c | 10 ++++++++++ 1 files changed, 10 insertions(+), 0 deletions(-)