From patchwork Tue Dec 1 12:55:01 2015 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Bernhard Reutner-Fischer X-Patchwork-Id: 550831 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Received: from sourceware.org (server1.sourceware.org [209.132.180.131]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 41A83140281 for ; Tue, 1 Dec 2015 23:56:24 +1100 (AEDT) Authentication-Results: ozlabs.org; dkim=pass (1024-bit key; unprotected) header.d=gcc.gnu.org header.i=@gcc.gnu.org header.b=RKgm28IM; dkim-atps=neutral DomainKey-Signature: a=rsa-sha1; c=nofws; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender:from :to:cc:subject:date:message-id:in-reply-to:references; q=dns; s= default; b=leBfPOgKDEoxlc+QI8O+JE7QmoJ/5A4thLzqAKadi+99mbbOkBpGf 4mUzGUviYxHdJZX+T/nkhemxEt0qpgCO1IWm9t10BSN4szfkweUdS1jiXn2G9GCp Zy3GKyrk7QmYV5BbpQ9KDh2voryAkFBRjmI7MhmSrrRRCtQlidfza4= DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender:from :to:cc:subject:date:message-id:in-reply-to:references; s= default; bh=Tx569vjGFS9KZ29/0YieXvx5sd4=; b=RKgm28IMbaAdJqXHdut9 L4fOMJmcIUWT7m3ePUmPthg2cAFH0zjBVnXAWaBDWBL/V/sk0XHarkAZdB2DZBxY z2lu8fnQQm3jvlEcWKAssmrgQXfDijcSCQUtUYUfokkIA83J/+g3StsqRTrZXZ0m UF0ps3Oar80EHj2wBz6xUgs= Received: (qmail 100663 invoked by alias); 1 Dec 2015 12:56:16 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Unsubscribe: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Delivered-To: mailing list gcc-patches@gcc.gnu.org Received: (qmail 100642 invoked by uid 89); 1 Dec 2015 12:56:16 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-1.0 required=5.0 tests=AWL, BAYES_50, FREEMAIL_FROM, RCVD_IN_DNSWL_LOW, SPF_PASS autolearn=ham version=3.3.2 X-Spam-User: qpsmtpd, 2 recipients X-HELO: mail-wm0-f45.google.com Received: from mail-wm0-f45.google.com (HELO mail-wm0-f45.google.com) (74.125.82.45) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with (AES128-GCM-SHA256 encrypted) ESMTPS; Tue, 01 Dec 2015 12:56:13 +0000 Received: by wmec201 with SMTP id c201so203885567wme.0; Tue, 01 Dec 2015 04:56:10 -0800 (PST) X-Received: by 10.194.78.212 with SMTP id d20mr86818921wjx.70.1448974570456; Tue, 01 Dec 2015 04:56:10 -0800 (PST) Received: from s42.loc (91-119-111-21.dynamic.xdsl-line.inode.at. [91.119.111.21]) by smtp.gmail.com with ESMTPSA id d66sm26141464wma.21.2015.12.01.04.56.09 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Tue, 01 Dec 2015 04:56:09 -0800 (PST) Received: from cow by s42.loc with local (Exim 4.86) (envelope-from ) id 1a3kTa-00085J-1M; Tue, 01 Dec 2015 13:56:10 +0100 From: Bernhard Reutner-Fischer To: fortran@gcc.gnu.org Cc: Bernhard Reutner-Fischer , gcc-patches@gcc.gnu.org, dmalcolm@redhat.com Subject: [PATCH] RFC: Use Levenshtein spelling suggestions in Fortran FE Date: Tue, 1 Dec 2015 13:55:01 +0100 Message-Id: <1448974501-30981-4-git-send-email-rep.dot.nop@gmail.com> In-Reply-To: <1448974501-30981-1-git-send-email-rep.dot.nop@gmail.com> References: <1448974501-30981-1-git-send-email-rep.dot.nop@gmail.com> X-IsSubscribed: yes gcc/fortran/ChangeLog 2015-11-29 Bernhard Reutner-Fischer * gfortran.h (gfc_lookup_function_fuzzy): New declaration. * resolve.c: Include spellcheck.h. (lookup_function_fuzzy_find_candidates): New static function. (lookup_uop_fuzzy_find_candidates): Likewise. (lookup_uop_fuzzy): Likewise. (resolve_operator) : Call lookup_uop_fuzzy. (gfc_lookup_function_fuzzy): New definition. (resolve_unknown_f): Call gfc_lookup_function_fuzzy. * interface.c (check_interface0): Likewise. * symbol.c: Include spellcheck.h. (lookup_symbol_fuzzy_find_candidates): New static function. (lookup_symbol_fuzzy): Likewise. (gfc_set_default_type): Call lookup_symbol_fuzzy. (lookup_component_fuzzy_find_candidates): New static function. (lookup_component_fuzzy): Likewise. (gfc_find_component): Call lookup_component_fuzzy. gcc/testsuite/ChangeLog 2015-11-29 Bernhard Reutner-Fischer * gfortran.dg/spellcheck-operator.f90: New testcase. * gfortran.dg/spellcheck-procedure.f90: New testcase. * gfortran.dg/spellcheck-structure.f90: New testcase. --- David Malcolm nice Levenshtein distance spelling check helpers were used in some parts of other frontends. This proposed patch adds some spelling corrections to the fortran frontend. Suggestions are printed if we can find a suitable name, currently perusing a very simple cutoff factor: /* If more than half of the letters were misspelled, the suggestion is likely to be meaningless. */ cutoff = MAX (strlen (typo), strlen (best_guess)) / 2; which effectively skips names with less than 4 characters. For e.g. structures, one could try to be much smarter in an attempt to also provide suggestions for single-letter members/components. This patch covers (at least partly): - user-defined operators - structures (types and their components) - functions - symbols (variables) I do not immediately see how to handle subroutines. Ideas? If anybody has a testcase where a spelling-suggestion would make sense then please pass it along so we maybe can add support for GCC-7. Signed-off-by: Bernhard Reutner-Fischer --- gcc/fortran/gfortran.h | 1 + gcc/fortran/interface.c | 16 ++- gcc/fortran/resolve.c | 135 ++++++++++++++++++++- gcc/fortran/symbol.c | 129 +++++++++++++++++++- gcc/testsuite/gfortran.dg/spellcheck-operator.f90 | 30 +++++ gcc/testsuite/gfortran.dg/spellcheck-procedure.f90 | 41 +++++++ gcc/testsuite/gfortran.dg/spellcheck-structure.f90 | 35 ++++++ 7 files changed, 376 insertions(+), 11 deletions(-) create mode 100644 gcc/testsuite/gfortran.dg/spellcheck-operator.f90 create mode 100644 gcc/testsuite/gfortran.dg/spellcheck-procedure.f90 create mode 100644 gcc/testsuite/gfortran.dg/spellcheck-structure.f90 diff --git a/gcc/fortran/gfortran.h b/gcc/fortran/gfortran.h index 5487c93..cbfd592 100644 --- a/gcc/fortran/gfortran.h +++ b/gcc/fortran/gfortran.h @@ -3060,6 +3060,7 @@ bool gfc_type_is_extensible (gfc_symbol *); bool gfc_resolve_intrinsic (gfc_symbol *, locus *); bool gfc_explicit_interface_required (gfc_symbol *, char *, int); extern int gfc_do_concurrent_flag; +const char* gfc_lookup_function_fuzzy (const char *, gfc_symtree *); /* array.c */ diff --git a/gcc/fortran/interface.c b/gcc/fortran/interface.c index 30cc522..19f800f 100644 --- a/gcc/fortran/interface.c +++ b/gcc/fortran/interface.c @@ -1590,10 +1590,18 @@ check_interface0 (gfc_interface *p, const char *interface_name) if (p->sym->attr.external) gfc_error ("Procedure %qs in %s at %L has no explicit interface", p->sym->name, interface_name, &p->sym->declared_at); - else - gfc_error ("Procedure %qs in %s at %L is neither function nor " - "subroutine", p->sym->name, interface_name, - &p->sym->declared_at); + else { + const char *guessed + = gfc_lookup_function_fuzzy (p->sym->name, p->sym->ns->sym_root); + if (guessed) + gfc_error ("Procedure %qs in %s at %L is neither function nor " + "subroutine; did you mean %qs?", p->sym->name, + interface_name, &p->sym->declared_at, guessed); + else + gfc_error ("Procedure %qs in %s at %L is neither function nor " + "subroutine", p->sym->name, interface_name, + &p->sym->declared_at); + } return 1; } diff --git a/gcc/fortran/resolve.c b/gcc/fortran/resolve.c index 685e3f5..6e1f63c 100644 --- a/gcc/fortran/resolve.c +++ b/gcc/fortran/resolve.c @@ -29,6 +29,7 @@ along with GCC; see the file COPYING3. If not see #include "data.h" #include "target-memory.h" /* for gfc_simplify_transfer */ #include "constructor.h" +#include "spellcheck.h" /* Types used in equivalence statements. */ @@ -2682,6 +2683,61 @@ resolve_specific_f (gfc_expr *expr) return true; } +/* Recursively append candidate SYM to CANDIDATES. */ + +static void +lookup_function_fuzzy_find_candidates (gfc_symtree *sym, + vec *candidates) +{ + gfc_symtree *p; + for (p = sym->right; p; p = p->right) + { + lookup_function_fuzzy_find_candidates (p, candidates); + if (p->n.sym->ts.type != BT_UNKNOWN) + candidates->safe_push (p->name); + } + for (p = sym->left; p; p = p->left) + { + lookup_function_fuzzy_find_candidates (p, candidates); + if (p->n.sym->ts.type != BT_UNKNOWN) + candidates->safe_push (p->name); + } +} + + +/* Lookup function FN fuzzily, taking names in FUN into account. */ + +const char* +gfc_lookup_function_fuzzy (const char *fn, gfc_symtree *fun) +{ + auto_vec candidates; + lookup_function_fuzzy_find_candidates (fun, &candidates); + + /* Determine closest match. */ + int i; + const char *name, *best = NULL; + edit_distance_t best_distance = MAX_EDIT_DISTANCE; + + FOR_EACH_VEC_ELT (candidates, i, name) + { + edit_distance_t dist = levenshtein_distance (fn, name); + if (dist < best_distance) + { + best_distance = dist; + best = name; + } + } + /* If more than half of the letters were misspelled, the suggestion is + likely to be meaningless. */ + if (best) + { + unsigned int cutoff = MAX (strlen (fn), strlen (best)) / 2; + if (best_distance > cutoff) + return NULL; + } + return best; +} + /* Resolve a procedure call not known to be generic nor specific. */ @@ -2732,8 +2788,15 @@ set_type: if (ts->type == BT_UNKNOWN) { - gfc_error ("Function %qs at %L has no IMPLICIT type", - sym->name, &expr->where); + const char *guessed + = gfc_lookup_function_fuzzy (sym->name, sym->ns->sym_root); + if (guessed) + gfc_error ("Function %qs at %L has no IMPLICIT type" + "; did you mean %qs?", + sym->name, &expr->where, guessed); + else + gfc_error ("Function %qs at %L has no IMPLICIT type", + sym->name, &expr->where); return false; } else @@ -3504,6 +3567,63 @@ compare_shapes (gfc_expr *op1, gfc_expr *op2) return t; } +/* Recursively append candidate UOP to CANDIDATES. */ + +static void +lookup_uop_fuzzy_find_candidates (gfc_symtree *uop, + vec *candidates) +{ + gfc_symtree *p; + /* Not sure how to properly filter here. Use all for a start. + n.uop.op is NULL for empty interface operators (is that legal?) disregard + these as i suppose they don't make terribly sense. */ + for (p = uop->right; p; p = p->right) + { + lookup_function_fuzzy_find_candidates (p, candidates); + if (p->n.uop->op != NULL) + candidates->safe_push (p->name); + } + for (p = uop->left; p; p = p->left) + { + lookup_function_fuzzy_find_candidates (p, candidates); + if (p->n.uop->op != NULL) + candidates->safe_push (p->name); + } +} + +/* Lookup user-operator OP fuzzily, taking names in UOP into account. */ + +static const char* +lookup_uop_fuzzy (const char *op, gfc_symtree *uop) +{ + auto_vec candidates; + lookup_uop_fuzzy_find_candidates (uop, &candidates); + + /* Determine closest match. */ + int i; + const char *name, *best = NULL; + edit_distance_t best_distance = MAX_EDIT_DISTANCE; + + FOR_EACH_VEC_ELT (candidates, i, name) + { + edit_distance_t dist = levenshtein_distance (op, name); + if (dist < best_distance) + { + best_distance = dist; + best = name; + } + } + /* If more than half of the letters were misspelled, the suggestion is + likely to be meaningless. */ + if (best) + { + unsigned int cutoff = MAX (strlen (op), strlen (best)) / 2; + if (best_distance > cutoff) + return NULL; + } + return best; +} + /* Resolve an operator expression node. This can involve replacing the operation with a user defined function call. */ @@ -3703,7 +3823,16 @@ resolve_operator (gfc_expr *e) case INTRINSIC_USER: if (e->value.op.uop->op == NULL) - sprintf (msg, _("Unknown operator '%s' at %%L"), e->value.op.uop->name); + { + const char *name = e->value.op.uop->name; + const char *guessed; + guessed = lookup_uop_fuzzy (name, e->value.op.uop->ns->uop_root); + if (guessed) + sprintf (msg, _("Unknown operator '%s' at %%L; did you mean '%s'?"), + name, guessed); + else + sprintf (msg, _("Unknown operator '%s' at %%L"), name); + } else if (op2 == NULL) sprintf (msg, _("Operand of user operator '%s' at %%L is %s"), e->value.op.uop->name, gfc_typename (&op1->ts)); diff --git a/gcc/fortran/symbol.c b/gcc/fortran/symbol.c index ff9aff9..212f7d8 100644 --- a/gcc/fortran/symbol.c +++ b/gcc/fortran/symbol.c @@ -27,6 +27,7 @@ along with GCC; see the file COPYING3. If not see #include "parse.h" #include "match.h" #include "constructor.h" +#include "spellcheck.h" /* Strings for all symbol attributes. We use these for dumping the @@ -235,6 +236,62 @@ gfc_get_default_type (const char *name, gfc_namespace *ns) } +/* Recursively append candidate SYM to CANDIDATES. */ + +static void +lookup_symbol_fuzzy_find_candidates (gfc_symtree *sym, + vec *candidates) +{ + gfc_symtree *p; + for (p = sym->right; p; p = p->right) + { + lookup_symbol_fuzzy_find_candidates (p, candidates); + if (p->n.sym->ts.type != BT_UNKNOWN) + candidates->safe_push (p->name); + } + for (p = sym->left; p; p = p->left) + { + lookup_symbol_fuzzy_find_candidates (p, candidates); + if (p->n.sym->ts.type != BT_UNKNOWN) + candidates->safe_push (p->name); + } +} + + +/* Lookup symbol SYM fuzzily, taking names in SYMBOL into account. */ + +static const char* +lookup_symbol_fuzzy (const char *sym, gfc_symbol *symbol) +{ + auto_vec candidates; + lookup_symbol_fuzzy_find_candidates (symbol->ns->sym_root, &candidates); + + /* Determine closest match. */ + int i; + const char *name, *best = NULL; + edit_distance_t best_distance = MAX_EDIT_DISTANCE; + + FOR_EACH_VEC_ELT (candidates, i, name) + { + edit_distance_t dist = levenshtein_distance (sym, name); + if (dist < best_distance) + { + best_distance = dist; + best = name; + } + } + /* If more than half of the letters were misspelled, the suggestion is + likely to be meaningless. */ + if (best) + { + unsigned int cutoff = MAX (strlen (sym), strlen (best)) / 2; + if (best_distance > cutoff) + return NULL; + } + return best; +} + + /* Given a pointer to a symbol, set its type according to the first letter of its name. Fails if the letter in question has no default type. */ @@ -253,8 +310,15 @@ gfc_set_default_type (gfc_symbol *sym, int error_flag, gfc_namespace *ns) { if (error_flag && !sym->attr.untyped) { - gfc_error ("Symbol %qs at %L has no IMPLICIT type", - sym->name, &sym->declared_at); + const char *guessed + = lookup_symbol_fuzzy (sym->name, sym); + if (guessed) + gfc_error ("Symbol %qs at %L has no IMPLICIT type" + "; did you mean %qs?", + sym->name, &sym->declared_at, guessed); + else + gfc_error ("Symbol %qs at %L has no IMPLICIT type", + sym->name, &sym->declared_at); sym->attr.untyped = 1; /* Ensure we only give an error once. */ } @@ -2188,6 +2252,55 @@ bad: } +/* Recursively append candidate COMPONENT structures to CANDIDATES. */ + +static void +lookup_component_fuzzy_find_candidates (gfc_component *component, + vec *candidates) +{ + for (gfc_component *p = component; p; p = p->next) + { + if (00 && p->ts.type == BT_DERIVED) + /* ??? There's no (suitable) DERIVED_TYPE which would come in + handy throughout the frontend; Use CLASS_DATA here for brevity. */ + lookup_component_fuzzy_find_candidates (CLASS_DATA (p), candidates); + candidates->safe_push (p->name); + } +} + +/* Lookup component MEMBER fuzzily, taking names in COMPONENT into account. */ + +static const char* +lookup_component_fuzzy (const char *member, gfc_component *component) +{ + auto_vec candidates; + lookup_component_fuzzy_find_candidates (component, &candidates); + + /* Determine closest match. */ + int i; + const char *name, *best = NULL; + edit_distance_t best_distance = MAX_EDIT_DISTANCE; + + FOR_EACH_VEC_ELT (candidates, i, name) + { + edit_distance_t dist = levenshtein_distance (member, name); + if (dist < best_distance) + { + best_distance = dist; + best = name; + } + } + /* If more than half of the letters were misspelled, the suggestion is + likely to be meaningless. */ + if (best) + { + unsigned int cutoff = MAX (strlen (member), strlen (best)) / 2; + if (best_distance > cutoff) + return NULL; + } + return best; +} + /* Given a derived type node and a component name, try to locate the component structure. Returns the NULL pointer if the component is not found or the components are private. If noaccess is set, no access @@ -2238,8 +2351,16 @@ gfc_find_component (gfc_symbol *sym, const char *name, } if (p == NULL && !silent) - gfc_error ("%qs at %C is not a member of the %qs structure", - name, sym->name); + { + const char *guessed = lookup_component_fuzzy (name, sym->components); + if (guessed) + gfc_error ("%qs at %C is not a member of the %qs structure" + "; did you mean %qs?", + name, sym->name, guessed); + else + gfc_error ("%qs at %C is not a member of the %qs structure", + name, sym->name); + } return p; } diff --git a/gcc/testsuite/gfortran.dg/spellcheck-operator.f90 b/gcc/testsuite/gfortran.dg/spellcheck-operator.f90 new file mode 100644 index 0000000..810a770 --- /dev/null +++ b/gcc/testsuite/gfortran.dg/spellcheck-operator.f90 @@ -0,0 +1,30 @@ +! { dg-do compile } +! test levenshtein based spelling suggestions + +module mymod1 + implicit none + contains + function something_good (iarg1) + integer :: something_good + integer, intent(in) :: iarg1 + something_good = iarg1 + 42 + end function something_good +end module mymod1 + +program spellchekc + use mymod1 + implicit none + + interface operator (.mywrong.) + module procedure something_wring ! { dg-error "Procedure .something_wring. in operator interface .mywrong. at .1. is neither function nor subroutine; did you mean .something_good.\\?|User operator procedure .something_wring. at .1. must be a FUNCTION" } + end interface + + interface operator (.mygood.) + module procedure something_good + end interface + + integer :: i, j, added + i = 0 + j = 0 + added = .mygoof. j ! { dg-error "Unknown operator .mygoof. at .1.; did you mean .mygood.\\?" } +end program spellchekc diff --git a/gcc/testsuite/gfortran.dg/spellcheck-procedure.f90 b/gcc/testsuite/gfortran.dg/spellcheck-procedure.f90 new file mode 100644 index 0000000..7923081 --- /dev/null +++ b/gcc/testsuite/gfortran.dg/spellcheck-procedure.f90 @@ -0,0 +1,41 @@ +! { dg-do compile } +! test levenshtein based spelling suggestions + +module mymod1 + implicit none + contains + function something_good (iarg1) + integer :: something_good + integer, intent(in) :: iarg1 + something_good = iarg1 + 42 + end function something_good +end module mymod1 + +subroutine bark_unless_zero(iarg) + implicit none + integer, intent(in) :: iarg + if (iarg /= 0) call abort +end subroutine bark_unless_zero + +function myadd(iarg1, iarg2) + implicit none + integer :: myadd + integer, intent(in) :: iarg1, iarg2 + myadd = iarg1 + iarg2 +end function myadd + +program spellchekc + use mymod1 + implicit none + + integer :: i, j, myadd + i = 0 + j = 0 +! I suppose this cannot be made to work, no\\? +! call barf_unless_zero(i) ! { -dg-error "; did you mean .bark_unless_zero.\\?" } + j = something_goof(j) ! { dg-error "no IMPLICIT type; did you mean .something_good.\\?" } + j = myaddd(i, j) ! { dg-error "no IMPLICIT type; did you mean .myadd.\\?" } + j = mya(i, j) ! { dg-error "no IMPLICIT type; did you mean .myadd.\\?" } + if (j /= 42) call abort + +end program spellchekc diff --git a/gcc/testsuite/gfortran.dg/spellcheck-structure.f90 b/gcc/testsuite/gfortran.dg/spellcheck-structure.f90 new file mode 100644 index 0000000..929e05f --- /dev/null +++ b/gcc/testsuite/gfortran.dg/spellcheck-structure.f90 @@ -0,0 +1,35 @@ +! { dg-do compile } +! test levenshtein based spelling suggestions +implicit none + +!!!!!!!!!!!!!! structure tests !!!!!!!!!!!!!! +type type1 + real :: radius + integer :: i +end type type1 + +type type2 + integer :: myint + type(type1) :: mytype +end type type2 + +type type3 + type(type2) :: type_2 +end type type3 +type type4 + type(type3) :: type_3 +end type type4 + +type(type1) :: t1 +t1%radiuz = .0 ! { dg-error ".radiuz. at .1. is not a member of the .type1. structure; did you mean .radius.\\?" } +t1%x = .0 ! { dg-error ".x. at .1. is not a member of the .type1. structure" } +type(type2) :: t2 +t2%mytape%radius = .0 ! { dg-error ".mytape. at .1. is not a member of the .type2. structure; did you mean .mytype.\\?" } +t2%mytype%radious = .0 ! { dg-error ".radious. at .1. is not a member of the .type1. structure; did you mean .radius.\\?" } +type(type4) :: t4 +t4%type_3%type_2%mytype%radium = 88.0 ! { dg-error ".radium. at .1. is not a member of the .type1. structure; did you mean .radius.\\?" } + +!!!!!!!!!!!!!! symbol tests !!!!!!!!!!!!!! +integer :: iarg1 +iarg2 = 1 ! { dg-error "Symbol .iarg2. at .1. has no IMPLICIT type; did you mean .iarg1.\\?" } +end