From patchwork Sun Nov 25 17:49:13 2012 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jouni Malinen X-Patchwork-Id: 201557 Return-Path: X-Original-To: patchwork-incoming@ozlabs.org Delivered-To: patchwork-incoming@ozlabs.org Received: from maxx.maxx.shmoo.com (maxx.shmoo.com [205.134.188.171]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (Client CN "maxx.shmoo.com", Issuer "CA Cert Signing Authority" (not verified)) by ozlabs.org (Postfix) with ESMTPS id 2C9252C008A for ; Mon, 26 Nov 2012 04:49:34 +1100 (EST) Received: from localhost (localhost [127.0.0.1]) by maxx.maxx.shmoo.com (Postfix) with ESMTP id 0D1039C166; Sun, 25 Nov 2012 12:49:31 -0500 (EST) X-Virus-Scanned: amavisd-new at maxx.shmoo.com Received: from maxx.maxx.shmoo.com ([127.0.0.1]) by localhost (maxx.shmoo.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id OoLXxQeFSr-N; Sun, 25 Nov 2012 12:49:30 -0500 (EST) Received: from maxx.shmoo.com (localhost [127.0.0.1]) by maxx.maxx.shmoo.com (Postfix) with ESMTP id CFAC29C16A; Sun, 25 Nov 2012 12:49:26 -0500 (EST) X-Original-To: mailman-post+hostap@maxx.shmoo.com Delivered-To: mailman-post+hostap@maxx.shmoo.com Received: from localhost (localhost [127.0.0.1]) by maxx.maxx.shmoo.com (Postfix) with ESMTP id B08089C16A for ; Sun, 25 Nov 2012 12:49:24 -0500 (EST) X-Virus-Scanned: amavisd-new at maxx.shmoo.com Received: from maxx.maxx.shmoo.com ([127.0.0.1]) by localhost (maxx.shmoo.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id fzkfJ+hSrSzN for ; Sun, 25 Nov 2012 12:49:19 -0500 (EST) Received: from jmaline2.user.openhosting.com (kvm.w1.fi [128.177.28.162]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (Client did not present a certificate) by maxx.maxx.shmoo.com (Postfix) with ESMTPS id 6F5929C166 for ; Sun, 25 Nov 2012 12:49:19 -0500 (EST) Received: from jm (a88-112-100-117.elisa-laajakaista.fi [88.112.100.117]) (authenticated bits=0) by jmaline2.user.openhosting.com (8.13.8/8.13.8) with ESMTP id qAPHhgC5003499 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Sun, 25 Nov 2012 12:43:45 -0500 Received: by jm (sSMTP sendmail emulation); Sun, 25 Nov 2012 19:49:13 +0200 Date: Sun, 25 Nov 2012 19:49:13 +0200 From: Jouni Malinen To: hostap@lists.shmoo.com Subject: Re: Infinite loop in pmksa_cache_get_okc? Message-ID: <20121125174913.GB22600@w1.fi> Mail-Followup-To: hostap@lists.shmoo.com References: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.5.21 (2010-09-15) X-BeenThere: hostap@lists.shmoo.com X-Mailman-Version: 2.1.9 Precedence: list List-Id: HostAP Project List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: hostap-bounces@lists.shmoo.com Errors-To: hostap-bounces@lists.shmoo.com On Sun, Aug 26, 2012 at 05:20:33PM +0200, michael-dev wrote: > this is with the hostapd a5ed45586c63ffd8f9d2b44e27c251d7bacbeaf4 + > OpenWRT patches on P1020WLAN (PowerPC Dualcore), compiled with -O0 -g > gdb. > There are two APs, which share LAN but no SSID. After some time of > operation (minutes to weeks), suddenly hostapd hangs consuming 100% CPU, > producing no more output and no station can (re-)connect. Are you still able to reproduce this? > GDB Backtrace shows that hostapd is within pmksa_cache_get_okc when > breaking and still in there (with same function args) some time later > again. This could be an infinite loop in pmksa_cache_get_okc, though > that function is probably not the cause but if looping is in > pmksa_cache_get_okc it looks more like a corrupt data structure. ../src/ap/pmksa_cache_auth.c:71 is not actually in pmksa_cache_get_okc(); it is in pmksa_cache_free_entry().. However, the earlier part of the backtrace seems to point to wpa_auth_okc_iter() which would indeed call pmksa_cache_get_okc(), but there is no path where that would result in pmksa_cache_free_entry() getting called. Anyway, an incorrect pointer in the PMKSA list could be the most likely reason for something like this. > I'm wondering off the _pthread_cleanup_pop_restore in the backtrace - > is it right or is this backtrace broken? Well, hostapd should not be using pthread at all.. > Has anybody seen such errors, too? I don't remember such an issue. I tried to go through pmksa_cache_auth.c to verify all list handling, but could not find any clear issues. Similarly, a module test with various PMKSA cache add/expire/remove operations did not seem to result in any inconsistency in the PMKSA cache lists. If you can still reproduce the issue, it would be interesting to see what the debug shows with the following changes to dump the PMKSA cache contents between each operation. diff --git a/src/ap/pmksa_cache_auth.c b/src/ap/pmksa_cache_auth.c index 3a9cc7b..5289536 100644 --- a/src/ap/pmksa_cache_auth.c +++ b/src/ap/pmksa_cache_auth.c @@ -32,6 +32,54 @@ struct rsn_pmksa_cache { }; +void pmksa_cache_dump(struct rsn_pmksa_cache *cache, const char *title) +{ + struct rsn_pmksa_cache_entry *e; + int i, count; + struct os_time now; + + os_get_time(&now); + wpa_printf(MSG_INFO, "PMKSA cache %p (pmksa_count=%d) now=%d [%s]", + cache, cache->pmksa_count, (int) now.sec, title); + + count = 0; + for (e = cache->pmksa; e; e = e->next) { + count++; + wpa_printf(MSG_INFO, "entry %p " MACSTR " opportunistic=%d " + "expiration=%d (next=%p hnext=%p)", + e, MAC2STR(e->spa), e->opportunistic, + (int) e->expiration, e->next, e->hnext); + wpa_hexdump(MSG_INFO, "pmkid", e->pmkid, PMKID_LEN); + } + if (count != cache->pmksa_count) { + wpa_printf(MSG_ERROR, "PMKSA cache count mismatch: list entry " + "count %d pmksa_count=%d", + count, cache->pmksa_count); + exit(1); + } + + count = 0; + for (i = 0; i < PMKID_HASH_SIZE; i++) { + if (cache->pmkid[i] == NULL) + continue; + wpa_printf(MSG_INFO, "pmkid hash[%d]", i); + for (e = cache->pmkid[i]; e; e = e->hnext) { + count++; + wpa_printf(MSG_INFO, "entry %p " MACSTR + " (next=%p hnext=%p)", + e, MAC2STR(e->spa), e->next, e->hnext); + } + } + if (count != cache->pmksa_count) { + wpa_printf(MSG_ERROR, "PMKSA cache count mismatch: hash list " + "entry count %d pmksa_count=%d", + count, cache->pmksa_count); + exit(1); + } + + wpa_printf(MSG_INFO, "PMKSA cache dump done [%s]", title); +} + static void pmksa_cache_set_expiration(struct rsn_pmksa_cache *pmksa); @@ -53,6 +101,7 @@ static void pmksa_cache_free_entry(struct rsn_pmksa_cache *pmksa, { struct rsn_pmksa_cache_entry *pos, *prev; + pmksa_cache_dump(pmksa, "pre-pmksa_cache_free_entry"); pmksa->pmksa_count--; pmksa->free_cb(entry, pmksa->ctx); pos = pmksa->pmkid[PMKID_HASH(entry->pmkid)]; @@ -85,6 +134,7 @@ static void pmksa_cache_free_entry(struct rsn_pmksa_cache *pmksa, pos = pos->next; } _pmksa_cache_free_entry(entry); + pmksa_cache_dump(pmksa, "post-pmksa_cache_free_entry"); } @@ -241,6 +291,7 @@ pmksa_cache_auth_add(struct rsn_pmksa_cache *pmksa, struct rsn_pmksa_cache_entry *entry, *pos; struct os_time now; + pmksa_cache_dump(pmksa, "pre-pmksa_cache_auth_add"); if (pmk_len > PMK_LEN) return NULL; @@ -276,6 +327,7 @@ pmksa_cache_auth_add(struct rsn_pmksa_cache *pmksa, } pmksa_cache_link_entry(pmksa, entry); + pmksa_cache_dump(pmksa, "post-pmksa_cache_auth_add"); return entry; } @@ -288,6 +340,7 @@ pmksa_cache_add_okc(struct rsn_pmksa_cache *pmksa, { struct rsn_pmksa_cache_entry *entry; + pmksa_cache_dump(pmksa, "pre-pmksa_cache_add_okc"); entry = os_zalloc(sizeof(*entry)); if (entry == NULL) return NULL; @@ -316,6 +369,7 @@ pmksa_cache_add_okc(struct rsn_pmksa_cache *pmksa, entry->opportunistic = 1; pmksa_cache_link_entry(pmksa, entry); + pmksa_cache_dump(pmksa, "post-pmksa_cache_add_okc"); return entry; } @@ -333,6 +387,7 @@ void pmksa_cache_auth_deinit(struct rsn_pmksa_cache *pmksa) if (pmksa == NULL) return; + pmksa_cache_dump(pmksa, __func__); entry = pmksa->pmksa; while (entry) { prev = entry; @@ -359,6 +414,7 @@ pmksa_cache_auth_get(struct rsn_pmksa_cache *pmksa, { struct rsn_pmksa_cache_entry *entry; + pmksa_cache_dump(pmksa, __func__); if (pmkid) entry = pmksa->pmkid[PMKID_HASH(pmkid)]; else @@ -392,6 +448,7 @@ struct rsn_pmksa_cache_entry * pmksa_cache_get_okc( struct rsn_pmksa_cache_entry *entry; u8 new_pmkid[PMKID_LEN]; + pmksa_cache_dump(pmksa, __func__); entry = pmksa->pmksa; while (entry) { if (os_memcmp(entry->spa, spa, ETH_ALEN) != 0) diff --git a/src/ap/pmksa_cache_auth.h b/src/ap/pmksa_cache_auth.h index d473f3f..547d929 100644 --- a/src/ap/pmksa_cache_auth.h +++ b/src/ap/pmksa_cache_auth.h @@ -55,5 +55,6 @@ pmksa_cache_add_okc(struct rsn_pmksa_cache *pmksa, const u8 *aa, const u8 *pmkid); void pmksa_cache_to_eapol_data(struct rsn_pmksa_cache_entry *entry, struct eapol_state_machine *eapol); +void pmksa_cache_dump(struct rsn_pmksa_cache *cache, const char *title); #endif /* PMKSA_CACHE_H */