diff mbox

wireless: wext: allocate space for NULL-termination for 32byte SSIDs

Message ID 1260650850-16163-1-git-send-email-daniel@caiaq.de
State Not Applicable, archived
Delegated to: David Miller
Headers show

Commit Message

Daniel Mack Dec. 12, 2009, 8:47 p.m. UTC
We've experienced a long standing bug when quickly switching from
ad-hoc to managed mode on a hardware using a Libertas chipset.

The effect is that after a number of mode transistions (sometimes as few
as two sufficed), the kernel will oops at very strange locations, mostly
in something like __kmem_alloc().

While the root cause turned out to be an issue with the wpa-supplicant
which feeds the kernel driver with garbage, this occasion pointed out a
bug in the wireless wext core when SSIDs with 32 byte lengths are passed
from userspace. In this case, the string is not properly NULL-terminated
which causes some other part to corrupt memory.

(In the particular case I observed, an SIOCSIWESSID was issued with
 bogus data in iwp->pointer but iwp->length=32).

I admitedly couldn't find where the actual corruption itself happens,
but with this trivial fix, I can't reproduce the bug anymore.

Signed-off-by: Daniel Mack <daniel@caiaq.de>
Cc: Dan Williams <dcbw@redhat.com>
Cc: Michael Hirsch <m.hirsch@raumfeld.com>
Cc: netdev@vger.kernel.org
Cc: libertas-dev@lists.infradead.org
Cc: stable@kernel.org
---
 net/wireless/wext.c |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)
---
 net/wireless/wext-core.c |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

Comments

David Miller Dec. 15, 2009, 9:43 a.m. UTC | #1
From: Daniel Mack <daniel@caiaq.de>
Date: Sun, 13 Dec 2009 04:47:30 +0800

> We've experienced a long standing bug when quickly switching from
> ad-hoc to managed mode on a hardware using a Libertas chipset.

Can you please CC: linux-wireless for wireless patches?

Thanks.

> The effect is that after a number of mode transistions (sometimes as few
> as two sufficed), the kernel will oops at very strange locations, mostly
> in something like __kmem_alloc().
> 
> While the root cause turned out to be an issue with the wpa-supplicant
> which feeds the kernel driver with garbage, this occasion pointed out a
> bug in the wireless wext core when SSIDs with 32 byte lengths are passed
> from userspace. In this case, the string is not properly NULL-terminated
> which causes some other part to corrupt memory.
> 
> (In the particular case I observed, an SIOCSIWESSID was issued with
>  bogus data in iwp->pointer but iwp->length=32).
> 
> I admitedly couldn't find where the actual corruption itself happens,
> but with this trivial fix, I can't reproduce the bug anymore.
> 
> Signed-off-by: Daniel Mack <daniel@caiaq.de>
> Cc: Dan Williams <dcbw@redhat.com>
> Cc: Michael Hirsch <m.hirsch@raumfeld.com>
> Cc: netdev@vger.kernel.org
> Cc: libertas-dev@lists.infradead.org
> Cc: stable@kernel.org
> ---
>  net/wireless/wext.c |    4 ++--
>  1 files changed, 2 insertions(+), 2 deletions(-)
> ---
>  net/wireless/wext-core.c |    4 ++--
>  1 files changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/net/wireless/wext-core.c b/net/wireless/wext-core.c
> index 5e1656b..3d8f4b0 100644
> --- a/net/wireless/wext-core.c
> +++ b/net/wireless/wext-core.c
> @@ -759,8 +759,8 @@ static int ioctl_standard_iw_point(struct iw_point *iwp, unsigned int cmd,
>  		}
>  	}
>  
> -	/* kzalloc() ensures NULL-termination for essid_compat. */
> -	extra = kzalloc(extra_size, GFP_KERNEL);
> +	/* kzalloc() +1 ensures NULL-termination for essid_compat. */
> +	extra = kzalloc(extra_size + 1, GFP_KERNEL);
>  	if (!extra)
>  		return -ENOMEM;
>  
> -- 
> 1.6.3.3
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Johannes Berg Dec. 15, 2009, 10:03 a.m. UTC | #2
On Tue, 2009-12-15 at 01:43 -0800, David Miller wrote:

> > The effect is that after a number of mode transistions (sometimes as few
> > as two sufficed), the kernel will oops at very strange locations, mostly
> > in something like __kmem_alloc().
> > 
> > While the root cause turned out to be an issue with the wpa-supplicant
> > which feeds the kernel driver with garbage, this occasion pointed out a
> > bug in the wireless wext core when SSIDs with 32 byte lengths are passed
> > from userspace. In this case, the string is not properly NULL-terminated
> > which causes some other part to corrupt memory.
> > 
> > (In the particular case I observed, an SIOCSIWESSID was issued with
> >  bogus data in iwp->pointer but iwp->length=32).
> > 
> > I admitedly couldn't find where the actual corruption itself happens,
> > but with this trivial fix, I can't reproduce the bug anymore.

Well you should try harder :)

> > -	/* kzalloc() ensures NULL-termination for essid_compat. */
> > -	extra = kzalloc(extra_size, GFP_KERNEL);
> > +	/* kzalloc() +1 ensures NULL-termination for essid_compat. */
> > +	extra = kzalloc(extra_size + 1, GFP_KERNEL);

That doesn't seem correct.

If this is used in a SET, then it is purely an in-kernel thing and
everything in the kernel is passed the length + data, and the kernel
MUST NEVER treat the SSID as a NUL-terminated string.

If this is used in a GET, then it will be filled up to 32 bytes by the
get handler, and the trailing \0 your patch reserves will never be
copied into userspace.

Since you indicate the kernel crashed, it is likely that libertas is
treating the buffer as a string, instead of an SSID.

That doesn't, however, make your fix and more valid. Whatever code uses
it as a string is clearly at fault, since this is a valid four-byte
SSID: "\x00\x01\x02\x03"

johannes
Johannes Berg Dec. 15, 2009, 10:05 a.m. UTC | #3
On Tue, 2009-12-15 at 11:03 +0100, Johannes Berg wrote:
> On Tue, 2009-12-15 at 01:43 -0800, David Miller wrote:
> 
> > > The effect is that after a number of mode transistions (sometimes as few
> > > as two sufficed), the kernel will oops at very strange locations, mostly
> > > in something like __kmem_alloc().
> > > 
> > > While the root cause turned out to be an issue with the wpa-supplicant
> > > which feeds the kernel driver with garbage, this occasion pointed out a
> > > bug in the wireless wext core when SSIDs with 32 byte lengths are passed
> > > from userspace. In this case, the string is not properly NULL-terminated
> > > which causes some other part to corrupt memory.

And, I forgot to mention, this is in fact not an issue or the "root
cause" of any issues -- it's completely intentional that wpa_supplicant
feeds the kernel with a random, valid, 32-byte SSID.

johannes
Johannes Berg Dec. 15, 2009, 10:07 a.m. UTC | #4
On Tue, 2009-12-15 at 11:03 +0100, Johannes Berg wrote:

> Since you indicate the kernel crashed, it is likely that libertas is
> treating the buffer as a string, instead of an SSID.

drivers/net/wireless/libertas$ grep lbs_deb_ * | grep ssid|grep '%s'
assoc.c:	lbs_deb_join("current SSID '%s', ssid length %u\n",
assoc.c:	lbs_deb_join("requested ssid '%s', ssid length %u\n",
assoc.c:	lbs_deb_join("ADHOC_START: SSID '%s', ssid length %u\n",
scan.c:		lbs_deb_wext("set_scan, essid '%s'\n",

johannes
Albert Cahalan Dec. 15, 2009, 10:16 a.m. UTC | #5
On Sat, Dec 12, 2009 at 3:47 PM, Daniel Mack <daniel@caiaq.de> wrote:

> While the root cause turned out to be an issue with the wpa-supplicant
> which feeds the kernel driver with garbage, this occasion pointed out a
> bug in the wireless wext core when SSIDs with 32 byte lengths are passed
> from userspace. In this case, the string is not properly NULL-terminated
> which causes some other part to corrupt memory.

This is the wrong fix.

These are not strings. They are 32 arbitrary bytes. It is perfectly
legitimate to have a NUL byte in the middle; the use of C string
functions will corrupt the data.

For your testing I suggest:

a. start the SSID with '-'
b. include "/../" in the SSID
c. include UTF-16 surrogates wrongly encoded as UTF-8
d. include "\r\n" in the SSID
e. include quote and backslash characters in the SSID
f. include bytes in the 0x80 to 0x9f range, surrounded by ASCII
g. include bytes in the 0xc0 to 0xff range, surrounded by ASCII
h. include the sequence 0xc0,0x80 (Java UTF-8 pseudo-NUL)
i. include the NUL byte
j. end the SSID with a plain ASCII letter

Verify that the whole stack, from driver to GUI, can handle this.
That includes config files, command lines, /proc and /sys, etc.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Daniel Mack Dec. 15, 2009, 10:20 a.m. UTC | #6
On Tue, Dec 15, 2009 at 11:07:14AM +0100, Johannes Berg wrote:
> On Tue, 2009-12-15 at 11:03 +0100, Johannes Berg wrote:
> 
> > Since you indicate the kernel crashed, it is likely that libertas is
> > treating the buffer as a string, instead of an SSID.

Ok, I dodn't know that - thanks for pointing it out. In my setup here,
SSIDs are always valid alphanumeric strings, and I was mislead by the
comment that mentions the NULL-termination.

> drivers/net/wireless/libertas$ grep lbs_deb_ * | grep ssid|grep '%s'
> assoc.c:	lbs_deb_join("current SSID '%s', ssid length %u\n",
> assoc.c:	lbs_deb_join("requested ssid '%s', ssid length %u\n",
> assoc.c:	lbs_deb_join("ADHOC_START: SSID '%s', ssid length %u\n",
> scan.c:		lbs_deb_wext("set_scan, essid '%s'\n",

Those macros are stubbed out as nops in my setup, so they can
unfortunately not be the reason. I'll dig deeper :)

Thanks,
Daniel
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Holger Schurig Dec. 15, 2009, 10:30 a.m. UTC | #7
> drivers/net/wireless/libertas$ grep lbs_deb_ * | grep ssid
> |grep '%s' 
> assoc.c:	lbs_deb_join("current SSID '%s', ssid length %u\n",
> assoc.c:	lbs_deb_join("requested ssid '%s', ssid length %u\n",
> assoc.c:	lbs_deb_join("ADHOC_START: SSID '%s', ssid
>               length %u\n", 
> scan.c:		lbs_deb_wext("set_scan, essid '%s'\n",

All those lines are gone once my cfg80211 lands.

Do you know any way to make sparse moan about them?


BTW: the libertas firmware sometimes treat an SSID as a 
zero-terminated string. There are some firmware commands that 
accept just an u8[32] bytes for the SSID, but not an ssid_len, 
e.g. in the CMD_802_11_AD_HOC_START command.

You therefore can't connect to the otherwise legitimate SSID of 
TEST\0\0\0.
Johannes Berg Dec. 15, 2009, 10:31 a.m. UTC | #8
On Tue, 2009-12-15 at 18:20 +0800, Daniel Mack wrote:

> > > Since you indicate the kernel crashed, it is likely that libertas is
> > > treating the buffer as a string, instead of an SSID.
> 
> Ok, I dodn't know that - thanks for pointing it out. In my setup here,
> SSIDs are always valid alphanumeric strings, and I was mislead by the
> comment that mentions the NULL-termination.

Yeah ... that's an unfortunate historic mistake in wireless extensions
that we try to fix up there. It cannot always be successfully fixed up,
but the NUL-termination is nothing that anything but this code should be
worried about or rely on.

> > drivers/net/wireless/libertas$ grep lbs_deb_ * | grep ssid|grep '%s'
> > assoc.c:	lbs_deb_join("current SSID '%s', ssid length %u\n",
> > assoc.c:	lbs_deb_join("requested ssid '%s', ssid length %u\n",
> > assoc.c:	lbs_deb_join("ADHOC_START: SSID '%s', ssid length %u\n",
> > scan.c:		lbs_deb_wext("set_scan, essid '%s'\n",
> 
> Those macros are stubbed out as nops in my setup, so they can
> unfortunately not be the reason. I'll dig deeper :)

Well, the stack trace ought to help.

johannes
Johannes Berg Dec. 15, 2009, 10:35 a.m. UTC | #9
On Tue, 2009-12-15 at 11:30 +0100, Holger Schurig wrote:
> > drivers/net/wireless/libertas$ grep lbs_deb_ * | grep ssid
> > |grep '%s' 
> > assoc.c:	lbs_deb_join("current SSID '%s', ssid length %u\n",
> > assoc.c:	lbs_deb_join("requested ssid '%s', ssid length %u\n",
> > assoc.c:	lbs_deb_join("ADHOC_START: SSID '%s', ssid
> >               length %u\n", 
> > scan.c:		lbs_deb_wext("set_scan, essid '%s'\n",
> 
> All those lines are gone once my cfg80211 lands.
> 
> Do you know any way to make sparse moan about them?

Sorry, no, I don't think that's even possible unless you play dirty with
tricks like __iomem uses for instance but that'd require a lot of
casting in otherwise valid uses.

> BTW: the libertas firmware sometimes treat an SSID as a 
> zero-terminated string. There are some firmware commands that 
> accept just an u8[32] bytes for the SSID, but not an ssid_len, 
> e.g. in the CMD_802_11_AD_HOC_START command.
> 
> You therefore can't connect to the otherwise legitimate SSID of 
> TEST\0\0\0.

Ick! I guess your cfg80211 IBSS join handler needs to check for that
then and refuse such an SSID.

johannes
Daniel Mack Dec. 15, 2009, 10:37 a.m. UTC | #10
On Tue, Dec 15, 2009 at 11:31:23AM +0100, Johannes Berg wrote:
> On Tue, 2009-12-15 at 18:20 +0800, Daniel Mack wrote:
> 
> > > drivers/net/wireless/libertas$ grep lbs_deb_ * | grep ssid|grep '%s'
> > > assoc.c:	lbs_deb_join("current SSID '%s', ssid length %u\n",
> > > assoc.c:	lbs_deb_join("requested ssid '%s', ssid length %u\n",
> > > assoc.c:	lbs_deb_join("ADHOC_START: SSID '%s', ssid length %u\n",
> > > scan.c:		lbs_deb_wext("set_scan, essid '%s'\n",
> > 
> > Those macros are stubbed out as nops in my setup, so they can
> > unfortunately not be the reason. I'll dig deeper :)
> 
> Well, the stack trace ought to help.

It unfortunately doesn't. The site that causes the problem does not
crash itself. It just corrupts some memory, and the actual crash happens
much later, which makes it so evil.

Daniel
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Marcel Holtmann Dec. 15, 2009, 4:29 p.m. UTC | #11
Hi Daniel,

> We've experienced a long standing bug when quickly switching from
> ad-hoc to managed mode on a hardware using a Libertas chipset.

I don't know where you get your list of mailing lists from, but not
sending it to linux-wireless@vger.kernel.org will not really get you
anywhere. Check MAINTAINERS file first since it has all information
about it.

NETWORKING [WIRELESS]
M:      "John W. Linville" <linville@tuxdriver.com>
L:      linux-wireless@vger.kernel.org
T:      git git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6.git
S:      Maintained
F:      net/mac80211/
F:      net/rfkill/
F:      net/wireless/
F:      include/net/ieee80211*
F:      include/linux/wireless.h
F:      drivers/net/wireless/

Regards

Marcel


--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Daniel Mack Dec. 16, 2009, 3:58 a.m. UTC | #12
On Tue, Dec 15, 2009 at 11:03:31AM +0100, Johannes Berg wrote:
> 
> > > -	/* kzalloc() ensures NULL-termination for essid_compat. */
> > > -	extra = kzalloc(extra_size, GFP_KERNEL);
> > > +	/* kzalloc() +1 ensures NULL-termination for essid_compat. */
> > > +	extra = kzalloc(extra_size + 1, GFP_KERNEL);
> 
> That doesn't seem correct.
> 
> If this is used in a SET, then it is purely an in-kernel thing and
> everything in the kernel is passed the length + data, and the kernel
> MUST NEVER treat the SSID as a NUL-terminated string.
> 
> If this is used in a GET, then it will be filled up to 32 bytes by the
> get handler, and the trailing \0 your patch reserves will never be
> copied into userspace.

The problem is the GET case. The libertas driver copies ssid_len
characters here and appends a trailing \0, which my patch caught now and
which caused memory corruption in before.

From what I've seen, libertas _does_ treat the extra data correctly
at all places, I checked it several times now. (Btw, the %s format
string you pointed out all use print_ssid() to properly escape all
non-printable characters, so they're rules out, too).

I'll send a patch to fix the flaw in libertas.

Thanks,
Daniel

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Albert Cahalan Dec. 16, 2009, 6:54 a.m. UTC | #13
On Tue, Dec 15, 2009 at 5:35 AM, Johannes Berg
<johannes@sipsolutions.net> wrote:
> On Tue, 2009-12-15 at 11:30 +0100, Holger Schurig wrote:
>> > drivers/net/wireless/libertas$ grep lbs_deb_ * | grep ssid
>> > |grep '%s'
>> > assoc.c:    lbs_deb_join("current SSID '%s', ssid length %u\n",
>> > assoc.c:    lbs_deb_join("requested ssid '%s', ssid length %u\n",
>> > assoc.c:    lbs_deb_join("ADHOC_START: SSID '%s', ssid
>> >               length %u\n",
>> > scan.c:             lbs_deb_wext("set_scan, essid '%s'\n",
>>
>> All those lines are gone once my cfg80211 lands.
>>
>> Do you know any way to make sparse moan about them?
>
> Sorry, no, I don't think that's even possible unless you play dirty with
> tricks like __iomem uses for instance but that'd require a lot of
> casting in otherwise valid uses.
>
>> BTW: the libertas firmware sometimes treat an SSID as a
>> zero-terminated string. There are some firmware commands that
>> accept just an u8[32] bytes for the SSID, but not an ssid_len,
>> e.g. in the CMD_802_11_AD_HOC_START command.
>>
>> You therefore can't connect to the otherwise legitimate SSID of
>> TEST\0\0\0.
>
> Ick! I guess your cfg80211 IBSS join handler needs to check for that
> then and refuse such an SSID.

No, pad the SSID out to 32 bytes and let the firmware try.

First of all, isn't TEST\0\0\0 simply the wrong length anyway?
(that is, a length other than 32 is nonsense AFAIK)

Second of all, even if that is valid, the firmware probably handles
at least one SSID that starts with TEST and has some number
of NUL bytes on the end. Since you can't tell what that would be
with a particular firmware version, you might as well just let the
firmware try. The worst case failure here is that there is more than
one SSID of this form and you connect to the wrong one. If you
have a problem with this kind of trouble then you need ethernet.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Johannes Berg Dec. 16, 2009, 8:19 a.m. UTC | #14
On Wed, 2009-12-16 at 01:54 -0500, Albert Cahalan wrote:

> >> You therefore can't connect to the otherwise legitimate SSID of
> >> TEST\0\0\0.
> >
> > Ick! I guess your cfg80211 IBSS join handler needs to check for that
> > then and refuse such an SSID.
> 
> No, pad the SSID out to 32 bytes and let the firmware try.

No, if we _know_ the firmware will try to connect to "TEST" instead of
"TEST\0\0\0" then refusing it is the right thing to do.

> First of all, isn't TEST\0\0\0 simply the wrong length anyway?
> (that is, a length other than 32 is nonsense AFAIK)

No.

> Second of all, even if that is valid, the firmware probably handles
> at least one SSID that starts with TEST and has some number
> of NUL bytes on the end. Since you can't tell what that would be
> with a particular firmware version, you might as well just let the
> firmware try. The worst case failure here is that there is more than
> one SSID of this form and you connect to the wrong one. If you
> have a problem with this kind of trouble then you need ethernet.

No. An SSID is a uniquely defined, 1-32 byte long byte bit pattern. It
doesn't treat \0 special in any way as your comments suggest. If the
firmware stops matching at \0, the firmware is broken and shouldn't be
given a choice.

johannes
Johannes Berg Dec. 16, 2009, 8:20 a.m. UTC | #15
On Wed, 2009-12-16 at 11:58 +0800, Daniel Mack wrote:

> > If this is used in a GET, then it will be filled up to 32 bytes by the
> > get handler, and the trailing \0 your patch reserves will never be
> > copied into userspace.
> 
> The problem is the GET case. The libertas driver copies ssid_len
> characters here and appends a trailing \0, which my patch caught now and
> which caused memory corruption in before.
> 
> From what I've seen, libertas _does_ treat the extra data correctly
> at all places, I checked it several times now. (Btw, the %s format
> string you pointed out all use print_ssid() to properly escape all
> non-printable characters, so they're rules out, too).

Oh, ok, print_ssid() is correct of course, it gets the length.

> I'll send a patch to fix the flaw in libertas.

Thanks.

johannes
Holger Schurig Dec. 16, 2009, 8:26 a.m. UTC | #16
> First of all, isn't TEST\0\0\0 simply the wrong length anyway?
> (that is, a length other than 32 is nonsense AFAIK)

No, the SSID IE in a beacon encodes also a length. So the beacons 
from SSID of TEST, TEST\0 and TEST\0\0\0 are different.

This is because in the beacon, the SSID is *NOT* an u8[32], but 
an IE, which is struct { u8 type, u8 length, u8 data[] };
diff mbox

Patch

diff --git a/net/wireless/wext-core.c b/net/wireless/wext-core.c
index 5e1656b..3d8f4b0 100644
--- a/net/wireless/wext-core.c
+++ b/net/wireless/wext-core.c
@@ -759,8 +759,8 @@  static int ioctl_standard_iw_point(struct iw_point *iwp, unsigned int cmd,
 		}
 	}
 
-	/* kzalloc() ensures NULL-termination for essid_compat. */
-	extra = kzalloc(extra_size, GFP_KERNEL);
+	/* kzalloc() +1 ensures NULL-termination for essid_compat. */
+	extra = kzalloc(extra_size + 1, GFP_KERNEL);
 	if (!extra)
 		return -ENOMEM;