Patchwork [libfortran,4.7/4.8/4.9,Regression] PR38199 missed optimization: I/O performance

login
register
mail settings
Submitter jerry DeLisle
Date March 10, 2014, 12:39 a.m.
Message ID <531D09CF.708@charter.net>
Download mbox | patch
Permalink /patch/328436/
State New
Headers show

Comments

jerry DeLisle - March 10, 2014, 12:39 a.m.
Hi all,

This final patch does two things.

First:  In read.c it implements a simple space skipping scheme in read_decimal
where I found a lot of repeated next_char calls happening. This gives a pretty
good boost in performance and is applicable in general for reading integers.

Second: I have taken Thomas idea of using LEN_TRIM in unit.c revised it to work
on formatted READ.  I tried to document the code with comments.  There are
certain conditions for which one can not shorten the string length for internal
units. For arrays of characters you can not do this for strings more than rank 1
and stride 1. Also, you can not do this any time a BLANK='zero' is being used. I
also skip the optimization if there is any BLANK= specified in the READ. Thats
conservative.  I could also test for BLANK='NULL' in the DTP structure. I will
probably do that later.

I have added a helper function which tests for the BZ within a format string
when a format string is present.  I also check to see if the UNIT has had the
BLANK status set.  The optimization is skipped for these conditions.

So what are the results?

For the case in comment 29 of the PR which was the last case to be worked out:

Size:		NoPatch			Patch:

100000		7.48			.312
50000		3.65			.152
25000		2.07			.085	
12000		1.00			.054	
6000		.513			.036
3000		.255			.029
1500		.143			.028
750		.082			.026
300		.047			.025
150		.032			.024
75		.026			.022
50		.023			.021

Where Size is the size of the buffer string and format specifier in the read.

Times are in seconds.

I was thinking that at some string size we could just skip the optimization
altogether to avoid the overhead on smaller strings. Maybe strings less then 50
bytes long.

Regression tested on X86-64-Gnu-Linux. No new test case needed.

OK for trunk?

Regards,

Jerry

2014-03-08  Jerry DeLisle  <jvdelisle@gcc.gnu>

	PR libfortran/38199

Patch

Index: read.c
===================================================================
--- read.c	(revision 208303)
+++ read.c	(working copy)
@@ -677,7 +677,13 @@  read_decimal (st_parameter_dt *dtp, const fnode *f
 	
       if (c == ' ')
         {
-	  if (dtp->u.p.blank_status == BLANK_NULL) continue;
+	  if (dtp->u.p.blank_status == BLANK_NULL)
+	    {
+	      /* Skip spaces.  */
+	      for ( ; w > 0; p++, w--)
+		if (*p != ' ') break; 
+	      continue;
+	    }
 	  if (dtp->u.p.blank_status == BLANK_ZERO) c = '0';
         }
         
Index: unit.c
===================================================================
--- unit.c	(revision 208303)
+++ unit.c	(working copy)
@@ -375,6 +375,25 @@  find_or_create_unit (int n)
 }
 
 
+/* Helper function to test for BZ (Blank Zero) in format string. This
+   is used for optimization. You can't trim out blanks if they have
+   significance.  */
+static bool
+is_BZ (st_parameter_dt *dtp)
+{
+  if (dtp->common.flags & IOPARM_DT_HAS_FORMAT)
+    {
+      char *p = dtp->format;
+      off_t i;
+      for (i = 0; i < dtp->format_len; i++)
+	if (p[i] == 'b' || p[i] == 'B')
+	  if (p[i+1] == 'z' || p[i+1] == 'Z')
+	    return true;
+    }
+  return false;
+}
+
+
 gfc_unit *
 get_internal_unit (st_parameter_dt *dtp)
 {
@@ -402,6 +421,33 @@  get_internal_unit (st_parameter_dt *dtp)
      some other file I/O unit.  */
   iunit->unit_number = -1;
 
+  /* As an optimization, adjust the unit record length to not
+     include trailing blanks. This will not work under certain conditions
+     where trailing blanks have significance.  */
+  if (dtp->u.p.mode == READING && !dtp->u.p.ionml
+      && !(dtp->internal_unit_desc
+	   && (GFC_DESCRIPTOR_RANK (dtp->internal_unit_desc) > 1
+	   || GFC_DESCRIPTOR_STRIDE(dtp->internal_unit_desc, 0) != 1))
+      && !is_BZ (dtp) && dtp->blank_len == 0)
+    {
+      if (dtp->common.unit == 0)
+	{
+	  int tmp = string_len_trim (dtp->internal_unit_len,
+				     dtp->internal_unit);
+	  if (tmp > 0)
+	    dtp->internal_unit_len = tmp; 
+	  iunit->recl = dtp->internal_unit_len;
+	}
+      else
+	{
+	  int tmp = string_len_trim_char4 (dtp->internal_unit_len,
+			      (const gfc_char4_t*) dtp->internal_unit);
+	  if (tmp > 0)
+	    dtp->internal_unit_len = tmp;
+	  iunit->recl = dtp->internal_unit_len;
+	}
+    }
+
   /* Set up the looping specification from the array descriptor, if any.  */
 
   if (is_array_io (dtp))
@@ -414,27 +460,6 @@  get_internal_unit (st_parameter_dt *dtp)
 
       start_record *= iunit->recl;
     }
-  else
-    {
-      /* If we are not processing an array, adjust the unit record length not
-	 to include trailing blanks for list-formatted reads.  */
-      if (dtp->u.p.mode == READING && !(dtp->common.flags & IOPARM_DT_HAS_FORMAT))
-	{
-	  if (dtp->common.unit == 0)
-	    {
-	      dtp->internal_unit_len =
-		string_len_trim (dtp->internal_unit_len, dtp->internal_unit);
-	      iunit->recl = dtp->internal_unit_len;
-	    }
-	  else
-	    {
-	      dtp->internal_unit_len =
-		string_len_trim_char4 (dtp->internal_unit_len,
-				       (const gfc_char4_t*) dtp->internal_unit);
-	      iunit->recl = dtp->internal_unit_len;
-	    }
-	}
-    }
 
   /* Set initial values for unit parameters.  */
   if (dtp->common.unit)