From patchwork Tue Jun 14 15:15:51 2016 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Malcolm X-Patchwork-Id: 635327 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Received: from sourceware.org (server1.sourceware.org [209.132.180.131]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 3rTXcH5hspz9t0v for ; Wed, 15 Jun 2016 00:49:51 +1000 (AEST) Authentication-Results: ozlabs.org; dkim=pass (1024-bit key; unprotected) header.d=gcc.gnu.org header.i=@gcc.gnu.org header.b=DzYSqaVC; dkim-atps=neutral DomainKey-Signature: a=rsa-sha1; c=nofws; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender:from :to:cc:subject:date:message-id:in-reply-to:references; q=dns; s= default; b=UO2/V1IRC053QDxOMz4sUK9RHu3pM+0sStBUPnnb/afZGi6ie1g8l svZmp5VGy26jJ3XE8ueyecO+pLfi8Uejfp1iUVWluoJ+gC7zfTlqy1fBRIV/k/iP 7HpOvZcNnNL/UrwruOIo4VplsKYIuXIANs2oYtcf8Hpvlx1++Tjwco= DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender:from :to:cc:subject:date:message-id:in-reply-to:references; s= default; bh=BqzGgBFPId3aXwlpJBUIutNjkTY=; b=DzYSqaVC3kAkUdvvGkXD QMsutDzpjldvkyf4v0I6xBSwP+gUHeMZHUusQFNUenZtz94O7ouT6sohMf+QwBFp 69Z8powLxivja8XX/ajRTB1/okfxnpMdeiQqGs4iZtdNpcIzZYwbovZeNvS9R2hy xAXHCqVMopYD4v6ABXxvHfk= Received: (qmail 34490 invoked by alias); 14 Jun 2016 14:49:30 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Unsubscribe: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Delivered-To: mailing list gcc-patches@gcc.gnu.org Received: (qmail 34426 invoked by uid 89); 14 Jun 2016 14:49:29 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-2.3 required=5.0 tests=BAYES_00, KAM_LAZY_DOMAIN_SECURITY, RP_MATCHES_RCVD, SPF_HELO_PASS autolearn=ham version=3.3.2 spammy=candidates, Hx-languages-length:6522, dolor, food X-HELO: mx1.redhat.com Received: from mx1.redhat.com (HELO mx1.redhat.com) (209.132.183.28) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with (AES256-GCM-SHA384 encrypted) ESMTPS; Tue, 14 Jun 2016 14:49:26 +0000 Received: from int-mx10.intmail.prod.int.phx2.redhat.com (int-mx10.intmail.prod.int.phx2.redhat.com [10.5.11.23]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id A90237F09A for ; Tue, 14 Jun 2016 14:49:25 +0000 (UTC) Received: from c64.redhat.com (vpn-224-197.phx2.redhat.com [10.3.224.197]) by int-mx10.intmail.prod.int.phx2.redhat.com (8.14.4/8.14.4) with ESMTP id u5EEnO5L005450; Tue, 14 Jun 2016 10:49:25 -0400 From: David Malcolm To: gcc-patches@gcc.gnu.org Cc: David Malcolm Subject: [PATCH 2/4] Add more spellcheck selftests Date: Tue, 14 Jun 2016 11:15:51 -0400 Message-Id: <1465917353-61387-2-git-send-email-dmalcolm@redhat.com> In-Reply-To: <1465917353-61387-1-git-send-email-dmalcolm@redhat.com> References: <1465917353-61387-1-git-send-email-dmalcolm@redhat.com> X-IsSubscribed: yes The next patch in the kit reimplements find_closest_string and find_closest_identifier, so it seems prudent to add some more test coverage for these. This patch also adds some more test coverage for levenshtein_distance itself. Successfully bootstrapped®rtested in combination with the rest of the kit on x86_64-pc-linux-gnu Successful -fself-test of stage1 on powerpc-ibm-aix7.1.3.0 of just this patch (on top of patch 1). OK for trunk if it passes individual bootstrap®rtest? gcc/ChangeLog: * selftest-run-tests.c (selftest::run_tests): Call selftest::spellcheck_tree_c_tests. * selftest.h (selftest::spellcheck_tree_c_tests): New decl. * spellcheck-tree.c: Include selftest.h and stringpool.h. (selftest::test_find_closest_identifier): New function. (selftest::spellcheck_tree_c_tests): New function. * spellcheck.c (selftest::test_find_closest_string): Verify that the order of the vec does not affect the results for this case. (selftest::test_data): New array. (selftest::test_metric_conditions): New function. (selftest::spellcheck_c_tests): Add a test of case-comparison. Call selftest::test_metric_conditions. --- gcc/selftest-run-tests.c | 1 + gcc/selftest.h | 1 + gcc/spellcheck-tree.c | 49 ++++++++++++++++++++++++++++++++++++ gcc/spellcheck.c | 65 ++++++++++++++++++++++++++++++++++++++++++++++++ 4 files changed, 116 insertions(+) diff --git a/gcc/selftest-run-tests.c b/gcc/selftest-run-tests.c index 934e700..d4a9c0b 100644 --- a/gcc/selftest-run-tests.c +++ b/gcc/selftest-run-tests.c @@ -61,6 +61,7 @@ selftest::run_tests () diagnostic_show_locus_c_tests (); fold_const_c_tests (); spellcheck_c_tests (); + spellcheck_tree_c_tests (); tree_cfg_c_tests (); /* This one relies on most of the above. */ diff --git a/gcc/selftest.h b/gcc/selftest.h index d1f8acc..7adea2f 100644 --- a/gcc/selftest.h +++ b/gcc/selftest.h @@ -54,6 +54,7 @@ extern void input_c_tests (); extern void pretty_print_c_tests (); extern void rtl_tests_c_tests (); extern void spellcheck_c_tests (); +extern void spellcheck_tree_c_tests (); extern void tree_c_tests (); extern void tree_cfg_c_tests (); extern void vec_c_tests (); diff --git a/gcc/spellcheck-tree.c b/gcc/spellcheck-tree.c index eb6e72a..2d73b774 100644 --- a/gcc/spellcheck-tree.c +++ b/gcc/spellcheck-tree.c @@ -23,6 +23,8 @@ along with GCC; see the file COPYING3. If not see #include "tm.h" #include "tree.h" #include "spellcheck.h" +#include "selftest.h" +#include "stringpool.h" /* Calculate Levenshtein distance between two identifiers. */ @@ -78,3 +80,50 @@ find_closest_identifier (tree target, const auto_vec *candidates) return best_identifier; } + +#if CHECKING_P + +namespace selftest { + +/* Selftests. */ + +/* Verify that find_closest_identifier is sane. */ + +static void +test_find_closest_identifier () +{ + auto_vec candidates; + + /* Verify that it can handle an empty vec. */ + ASSERT_EQ (NULL, find_closest_identifier (get_identifier (""), &candidates)); + + /* Verify that it works sanely for non-empty vecs. */ + tree apple = get_identifier ("apple"); + tree banana = get_identifier ("banana"); + tree cherry = get_identifier ("cherry"); + candidates.safe_push (apple); + candidates.safe_push (banana); + candidates.safe_push (cherry); + + ASSERT_EQ (apple, find_closest_identifier (get_identifier ("app"), + &candidates)); + ASSERT_EQ (banana, find_closest_identifier (get_identifier ("banyan"), + &candidates));; + ASSERT_EQ (cherry, find_closest_identifier (get_identifier ("berry"), + &candidates)); + ASSERT_EQ (NULL, + find_closest_identifier (get_identifier ("not like the others"), + &candidates)); +} + +/* Run all of the selftests within this file. */ + +void +spellcheck_tree_c_tests () +{ + test_find_closest_identifier (); +} + +} // namespace selftest + +#endif /* #if CHECKING_P */ diff --git a/gcc/spellcheck.c b/gcc/spellcheck.c index 11018f0..e03f484 100644 --- a/gcc/spellcheck.c +++ b/gcc/spellcheck.c @@ -217,6 +217,69 @@ test_find_closest_string () ASSERT_STREQ ("banana", find_closest_string ("banyan", &candidates)); ASSERT_STREQ ("cherry", find_closest_string ("berry", &candidates)); ASSERT_EQ (NULL, find_closest_string ("not like the others", &candidates)); + + /* The order of the vec can matter, but it should not matter for these + inputs. */ + candidates.truncate (0); + candidates.safe_push ("cherry"); + candidates.safe_push ("banana"); + candidates.safe_push ("apple"); + ASSERT_STREQ ("apple", find_closest_string ("app", &candidates)); + ASSERT_STREQ ("banana", find_closest_string ("banyan", &candidates)); + ASSERT_STREQ ("cherry", find_closest_string ("berry", &candidates)); + ASSERT_EQ (NULL, find_closest_string ("not like the others", &candidates)); +} + +/* Test data for test_metric_conditions. */ + +static const char * const test_data[] = { + "", + "foo" + "food", + "boo", + "1234567890123456789012345678901234567890123456789012345678901234567890" +}; + +/* Verify that levenshtein_distance appears to be a sane distance function, + i.e. the conditions for being a metric. This is done directly for a + small set of examples, using test_data above. This is O(N^3) in the size + of the array, due to the test for the triangle inequality, so we keep the + array small. */ + +static void +test_metric_conditions () +{ + const int num_test_cases = sizeof (test_data) / sizeof (test_data[0]); + + for (int i = 0; i < num_test_cases; i++) + { + for (int j = 0; j < num_test_cases; j++) + { + edit_distance_t dist_ij + = levenshtein_distance (test_data[i], test_data[j]); + + /* Identity of indiscernibles: d(i, j) > 0 iff i == j. */ + if (i == j) + ASSERT_EQ (dist_ij, 0); + else + ASSERT_TRUE (dist_ij > 0); + + /* Symmetry: d(i, j) == d(j, i). */ + edit_distance_t dist_ji + = levenshtein_distance (test_data[j], test_data[i]); + ASSERT_EQ (dist_ij, dist_ji); + + /* Triangle inequality. */ + for (int k = 0; k < num_test_cases; k++) + { + edit_distance_t dist_ik + = levenshtein_distance (test_data[i], test_data[k]); + edit_distance_t dist_jk + = levenshtein_distance (test_data[j], test_data[k]); + ASSERT_TRUE (dist_ik <= dist_ij + dist_jk); + } + } + } } /* Verify levenshtein_distance for a variety of pairs of pre-canned @@ -239,8 +302,10 @@ spellcheck_c_tests () ("Lorem ipsum dolor sit amet, consectetur adipiscing elit,", "All your base are belong to us", 44); + levenshtein_distance_unit_test ("foo", "FOO", 3); test_find_closest_string (); + test_metric_conditions (); } } // namespace selftest