Message ID | 52C55062.3000809@sfr.fr |
---|---|
State | New |
Headers | show |
Hi Mikael, > this patch fixes PR58007, where the compiler was not able to relate a > component pointer to any loaded derived type symbol. > The problem came from an optimization avoiding loading again a symbol > which had already been loaded, skipping by the way the association of > component pointers (if the symbol was a derived type) with the > corresponding module indexes. > > The attached patch fixes this by reading the derived type symbol dump's > component list and do the pointer<->integer association by hand. Then > the regular on demand module loading code works properly and some code > in mio_component_ref that was forcing the module loading of the > containing derived type can be removed. > To associate the component pointers with the module integers, one has to > parse the symbol, or at least its components. It is done by hand in > this patch and to reduce the maintainance burden (for possible future > evolutions of symbol dumping format/content) the component list is moved > at the beginning. More exactly just after the symbol attributes, > because the check_for_ambiguous function relies on the symbol attributes > appearing first in a symbol. > This changes the module format, so MOD_VERSION is bumped. > > Regression tested on x86_64-unknown-linux-gnu. OK for trunk? Basically your patch looks ok to me. However, I don't quite see the necessity for changing the module format (apart from the fact that it makes your patch slightly simpler). Anyway, since the module format has been changed already in 4.9, I think it doesn't matter much if we do it once more, so okay with me. One question, though: I don't understand how the problem is specific to OOP code (and why it did not come up before with F95-style code). Do you think it would be possible to find a non-OOP case where it fails? If not, could there be some problem in gfortran's OOP implementation which causes the failure? > I plan to submit a variant that doesn't change the module format for the > branches (doing more parsing by hand). Isn't that what you posted in comment 12 of the PR? Which branches did you have in mind? The PR contains various test cases which fail on either 4.7 or 4.8 and sometimes both, so potentially both of them are affected by the bug, I guess? Cheers, Janus
Le 04/01/2014 16:35, Janus Weil a écrit : > Hi Mikael, > >> this patch fixes PR58007, where the compiler was not able to relate a >> component pointer to any loaded derived type symbol. >> The problem came from an optimization avoiding loading again a symbol >> which had already been loaded, skipping by the way the association of >> component pointers (if the symbol was a derived type) with the >> corresponding module indexes. >> >> The attached patch fixes this by reading the derived type symbol dump's >> component list and do the pointer<->integer association by hand. Then >> the regular on demand module loading code works properly and some code >> in mio_component_ref that was forcing the module loading of the >> containing derived type can be removed. >> To associate the component pointers with the module integers, one has to >> parse the symbol, or at least its components. It is done by hand in >> this patch and to reduce the maintainance burden (for possible future >> evolutions of symbol dumping format/content) the component list is moved >> at the beginning. More exactly just after the symbol attributes, >> because the check_for_ambiguous function relies on the symbol attributes >> appearing first in a symbol. >> This changes the module format, so MOD_VERSION is bumped. >> >> Regression tested on x86_64-unknown-linux-gnu. OK for trunk? > > Basically your patch looks ok to me. > > However, I don't quite see the necessity for changing the module > format (apart from the fact that it makes your patch slightly > simpler). Anyway, since the module format has been changed already in > 4.9, I think it doesn't matter much if we do it once more, so okay > with me. Well, I'm afraid of a future evolution where we change the module format, update mio_symbol, and forget to change read_module as well, so that the code reads something thinking it is the component list, but it has become something else. It is possible as empty parentheses `()' are common in the module format, and it matches an empty component list. Moving the component list earlier only makes it less likely. Another possibility to prevent this is adding a recognizable pattern like "@# component-list #@" or something, but the price to pay is the same: changing the module format. > > One question, though: I don't understand how the problem is specific > to OOP code (and why it did not come up before with F95-style code). > Do you think it would be possible to find a non-OOP case where it > fails? If not, could there be some problem in gfortran's OOP > implementation which causes the failure? > I attach here a non-OOP variant, as well as the script that generated it. You can see that i'm no good at writing shell scripts. ;-) Anyway the principle is: the bugs happens when the symbols are read in one particular (failing) order, so this script adds useless variables to the modules to saturate the pointer tree and trigger a tree rebalance, until it fails. For those who cannot afford updating their compilers, a variant of this script might be used to have this bug _not_ happen on their codes. This is left as an exercise; not sure how difficult it is. ;-) > >> I plan to submit a variant that doesn't change the module format for the >> branches (doing more parsing by hand). > > Isn't that what you posted in comment 12 of the PR? > It is, yes. > Which branches did you have in mind? The PR contains various test > cases which fail on either 4.7 or 4.8 and sometimes both, so > potentially both of them are affected by the bug, I guess? > Oops, I thought it was a 4.7/4.8 regression. Actually, the testcase provided here also fails with 4.8, so I was actually right. :-) Mikael
Hi, >>> this patch fixes PR58007, where the compiler was not able to relate a >>> component pointer to any loaded derived type symbol. >>> The problem came from an optimization avoiding loading again a symbol >>> which had already been loaded, skipping by the way the association of >>> component pointers (if the symbol was a derived type) with the >>> corresponding module indexes. >>> >>> The attached patch fixes this by reading the derived type symbol dump's >>> component list and do the pointer<->integer association by hand. Then >>> the regular on demand module loading code works properly and some code >>> in mio_component_ref that was forcing the module loading of the >>> containing derived type can be removed. >>> To associate the component pointers with the module integers, one has to >>> parse the symbol, or at least its components. It is done by hand in >>> this patch and to reduce the maintainance burden (for possible future >>> evolutions of symbol dumping format/content) the component list is moved >>> at the beginning. More exactly just after the symbol attributes, >>> because the check_for_ambiguous function relies on the symbol attributes >>> appearing first in a symbol. >>> This changes the module format, so MOD_VERSION is bumped. >>> >>> Regression tested on x86_64-unknown-linux-gnu. OK for trunk? >> >> Basically your patch looks ok to me. >> >> However, I don't quite see the necessity for changing the module >> format (apart from the fact that it makes your patch slightly >> simpler). Anyway, since the module format has been changed already in >> 4.9, I think it doesn't matter much if we do it once more, so okay >> with me. > Well, I'm afraid of a future evolution where we change the module > format, update mio_symbol, and forget to change read_module as well, so > that the code reads something thinking it is the component list, but it > has become something else. > It is possible as empty parentheses `()' are common in the module > format, and it matches an empty component list. Moving the component > list earlier only makes it less likely. Another possibility to prevent > this is adding a recognizable pattern like "@# component-list #@" or > something, but the price to pay is the same: changing the module format. > >> One question, though: I don't understand how the problem is specific >> to OOP code (and why it did not come up before with F95-style code). >> Do you think it would be possible to find a non-OOP case where it >> fails? If not, could there be some problem in gfortran's OOP >> implementation which causes the failure? >> > I attach here a non-OOP variant, as well as the script that generated > it. You can see that i'm no good at writing shell scripts. ;-) > Anyway the principle is: the bugs happens when the symbols are read in > one particular (failing) order, so this script adds useless variables to > the modules to saturate the pointer tree and trigger a tree rebalance, > until it fails. Good, thanks for checking. As written before, the patch is ok for trunk from my side. >> Which branches did you have in mind? The PR contains various test >> cases which fail on either 4.7 or 4.8 and sometimes both, so >> potentially both of them are affected by the bug, I guess? >> > Oops, I thought it was a 4.7/4.8 regression. > Actually, the testcase provided here also fails with 4.8, so I was > actually right. :-) In fact your test case fails with all versions I tried (4.4, 4.6, 4.7, 4.8 and trunk). So, is it a regression at all? Cheers, Janus
diff --git a/gcc/fortran/module.c b/gcc/fortran/module.c index 98e22df..20bf1ba 100644 --- a/gcc/fortran/module.c +++ b/gcc/fortran/module.c @@ -82,7 +82,7 @@ along with GCC; see the file COPYING3. If not see /* Don't put any single quote (') in MOD_VERSION, if you want it to be recognized. */ -#define MOD_VERSION "11" +#define MOD_VERSION "12" /* Structure that describes a position within a module file. */ @@ -390,37 +390,6 @@ get_integer (int integer) } -/* Recursive function to find a pointer within a tree by brute force. */ - -static pointer_info * -fp2 (pointer_info *p, const void *target) -{ - pointer_info *q; - - if (p == NULL) - return NULL; - - if (p->u.pointer == target) - return p; - - q = fp2 (p->left, target); - if (q != NULL) - return q; - - return fp2 (p->right, target); -} - - -/* During reading, find a pointer_info node from the pointer value. - This amounts to a brute-force search. */ - -static pointer_info * -find_pointer2 (void *p) -{ - return fp2 (pi_root, p); -} - - /* Resolve any fixups using a known pointer. */ static void @@ -2588,45 +2557,13 @@ mio_pointer_ref (void *gp) the namespace and is not loaded again. */ static void -mio_component_ref (gfc_component **cp, gfc_symbol *sym) +mio_component_ref (gfc_component **cp) { - char name[GFC_MAX_SYMBOL_LEN + 1]; - gfc_component *q; pointer_info *p; p = mio_pointer_ref (cp); if (p->type == P_UNKNOWN) p->type = P_COMPONENT; - - if (iomode == IO_OUTPUT) - mio_pool_string (&(*cp)->name); - else - { - mio_internal_string (name); - - if (sym && sym->attr.is_class) - sym = sym->components->ts.u.derived; - - /* It can happen that a component reference can be read before the - associated derived type symbol has been loaded. Return now and - wait for a later iteration of load_needed. */ - if (sym == NULL) - return; - - if (sym->components != NULL && p->u.pointer == NULL) - { - /* Symbol already loaded, so search by name. */ - q = gfc_find_component (sym, name, true, true); - - if (q) - associate_integer_pointer (p, q); - } - - /* Make sure this symbol will eventually be loaded. */ - p = find_pointer2 (sym); - if (p->u.rsym.state == UNUSED) - p->u.rsym.state = NEEDED; - } } @@ -2983,7 +2920,7 @@ mio_ref (gfc_ref **rp) case REF_COMPONENT: mio_symbol_ref (&r->u.c.sym); - mio_component_ref (&r->u.c.component, r->u.c.sym); + mio_component_ref (&r->u.c.component); break; case REF_SUBSTRING: @@ -3865,6 +3802,14 @@ mio_symbol (gfc_symbol *sym) mio_lparen (); mio_symbol_attribute (&sym->attr); + + /* Note that components are always saved, even if they are supposed + to be private. Component access is checked during searching. */ + mio_component_list (&sym->components, sym->attr.vtype); + if (sym->components != NULL) + sym->component_access + = MIO_NAME (gfc_access) (sym->component_access, access_types); + mio_typespec (&sym->ts); if (sym->ts.type == BT_CLASS) sym->attr.class_ok = 1; @@ -3893,15 +3838,6 @@ mio_symbol (gfc_symbol *sym) if (sym->attr.cray_pointee) mio_symbol_ref (&sym->cp_pointer); - /* Note that components are always saved, even if they are supposed - to be private. Component access is checked during searching. */ - - mio_component_list (&sym->components, sym->attr.vtype); - - if (sym->components != NULL) - sym->component_access - = MIO_NAME (gfc_access) (sym->component_access, access_types); - /* Load/save the f2k_derived namespace of a derived-type symbol. */ mio_full_f2k_derived (sym); @@ -4000,11 +3936,11 @@ find_symbol (gfc_symtree *st, const char *name, /* Skip a list between balanced left and right parens. */ static void -skip_list (void) +skip_list (int nest_level = 0) { int level; - level = 0; + level = nest_level; do { switch (parse_atom ()) @@ -4638,7 +4574,6 @@ read_module (void) info->u.rsym.ns = atom_int; get_module_locus (&info->u.rsym.where); - skip_list (); /* See if the symbol has already been loaded by a previous module. If so, we reference the existing symbol and prevent it from @@ -4649,10 +4584,44 @@ read_module (void) if (sym == NULL || (sym->attr.flavor == FL_VARIABLE && info->u.rsym.ns !=1)) - continue; + { + skip_list (); + continue; + } info->u.rsym.state = USED; info->u.rsym.sym = sym; + /* The current symbol has already been loaded, so we can avoid loading + it again. However, if it is a derived type, some of its components + can be used in expressions in the module. To avoid the module loading + failing, we need to associate the module's component pointer indexes + with the existing symbol's component pointers. */ + if (sym->attr.flavor == FL_DERIVED) + { + gfc_component *c; + + mio_lparen (); /* symbol opening. */ + skip_list (); /* skip symbol attribute. */ + + mio_lparen (); /* component list opening. */ + for (c = sym->components; c; c = c->next) + { + pointer_info *p; + int n; + + mio_lparen (); /* component opening. */ + mio_integer (&n); + p = get_integer (n); + if (p->u.pointer == NULL) + associate_integer_pointer (p, c); + skip_list (1); /* component end. */ + } + mio_rparen (); /* component list closing. */ + + skip_list (1); /* symbol end. */ + } + else + skip_list (); /* Some symbols do not have a namespace (eg. formal arguments), so the automatic "unique symtree" mechanism must be suppressed