Need IO Optimization help 

Hello:

We're having a little shoot out here at work with Ruby, Perl and Tcl.
So far, Ruby came out ahead on a recursive Fibonacci sequence, with
Perl about 50% slower and Tcl 10x slower.

Next we're looking at IO. So far, Perl is about as fast
as cp and Ruby is 50% slower and consumes over twice the
CPU (see table below):

        user    system  elapsed  CPU
  ruby  60.07u  21.32s  1:31.62  88.8%
  cp     0.01u   6.64s  0:53.34  12.4%
  perl  16.79u   7.66s  0:58.76  41.6%

Oh, the Tcl results? The code is still being written. :)

The Ruby code is below. I know there have been multiple
posts on ruby-talk about this with discussions about
sysread and read, but not being an IO expert, I have
only been able to follow these at a high level.

Could an expert look at the code below and tell me
what could be done to speed it up, possibly using
#read or #sysread? Or, if someone has some Ruby C
code, that would be cool.

#------ rw.rb
file = ARGV.shift

File.open(file + ".out", "w") { |of|
  File.open(file).each {|line|
    # do processing here
    of.print line
  }
}

#-------

Thanks

--
Jim Freeze
----------
Different all twisty a of in maze are you, passages little.



Tue, 27 Sep 2005 00:03:08 GMT  
Hi,

At Fri, 11 Apr 2003 01:03:08 +0900,

Quote:

> The Ruby code is below. I know there have been multiple
> posts on ruby-talk about this with discussions about
> sysread and read, but not being an IO expert, I have
> only been able to follow these at a high level.

You can use File.cp in ftools.rb or FileUtils.cp in
fileutils.rb.
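
For a straight copy with no per-line processing, the FileUtils route might look like the minimal sketch below. It uses only fileutils.rb, since ftools.rb was later dropped from the standard library. The helper name copy_with_suffix and the throwaway demo file are illustrative and not from the thread.

```ruby
require "fileutils"
require "tempfile"

# Hypothetical helper: copy +path+ to "<path>.out", like rw.rb does,
# but without looking at individual lines.
def copy_with_suffix(path)
  dest = path + ".out"
  FileUtils.cp(path, dest)
  dest
end

# Small demo on a throwaway file.
src = Tempfile.new("rw_demo")
src.write("line one\nline two\n")
src.close

copied = copy_with_suffix(src.path)
copied_contents = File.read(copied)
puts copied_contents
```

This only helps when the lines themselves do not need to be inspected.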

--
Nobu Nakada



Tue, 27 Sep 2005 00:27:25 GMT  

Quote:

> Hi,

> At Fri, 11 Apr 2003 01:03:08 +0900,

> > The Ruby code is below. I know there have been multiple
> > posts on ruby-talk about this with discussions about
> > sysread and read, but not being an IO expert, I have
> > only been able to follow these at a high level.

> You can use File.cp in ftools.rb or FileUtils.cp in
> fileutils.rb.

I see the syswrite src.sysread(bsize) in copy_stream, but
I need the data line by line to process. It's just that
for our tests we are not doing any processing.
So, while a direct copy may be fast, it is not exactly
what I want.

  def copy_stream( src, dest )
    bsize = fu_stream_blksize(src, dest)
    begin
      while true
        dest.syswrite src.sysread(bsize)
      end
    rescue EOFError
    end
  end

Is there a system call that reads line by line?

--
Jim Freeze
----------
I can't understand it.  I can't even understand the people who can
understand it.
                -- Queen Juliana of the Netherlands.



Tue, 27 Sep 2005 00:36:09 GMT  



Quote:


> > Hi,

> > At Fri, 11 Apr 2003 01:03:08 +0900,

> > > The Ruby code is below. I know there have been multiple
> > > posts on ruby-talk about this with discussions about
> > > sysread and read, but not being an IO expert, I have
> > > only been able to follow these at a high level.

> > You can use File.cp in ftools.rb or FileUtils.cp in
> > fileutils.rb.

> I see the syswrite src.sysread(bsize) in copy_stream, but
> I need the data line by line to process. It's just that
> for our tests we are not doing any processing.
> So, while a direct copy may be fast, it is not exactly
> what I want.

>   def copy_stream( src, dest )
>     bsize = fu_stream_blksize(src, dest)
>     begin
>       while true
>         dest.syswrite src.sysread(bsize)
>       end
>     rescue EOFError
>     end
>   end

> Is there a system call that reads line by line?

Not as far as I know. But you can simulate this with String#each by
reading the whole file in one go and then iterating:

contents = f.read()
contents.each {|line|
  # processing...
}

or work directly on contents by using String#sub!
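
A minimal sketch of the read-everything-then-iterate idea, written for current Ruby, where String#each from 1.8 is spelled String#each_line. The helper name process_slurped and the upcase step are placeholders for whatever the real processing is.

```ruby
require "tempfile"

# Hypothetical helper: slurp the file, then process it line by line.
def process_slurped(path)
  contents = File.read(path)        # whole file held in memory at once
  processed = []
  contents.each_line do |line|
    processed << line.upcase        # stand-in for real per-line work
  end
  processed
end

# Small demo on a throwaway file.
demo = Tempfile.new("slurp_demo")
demo.write("alpha\nbeta\n")
demo.close

result = process_slurped(demo.path)
puts result.join
```

The whole file sits in memory, so this only suits inputs that comfortably fit in RAM.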

Did you try to use "file.sync = false" in conjunction with "file.flush"?
Did you try reading the whole file in with "contents = file.read()" and
then processing it line by line with "contents.each { |line| .... }"?

And: did you make sure that IO is the problem and not the processing part?

Regards

    robert



Tue, 27 Sep 2005 01:13:32 GMT  
Hi,

In message "Need IO Optimization help"

|We're having a little shoot out here at work with Ruby, Perl and Tcl.
|So far, Ruby kicked on a recursive Fibonacci(sp?) sequence with
|Perl about 50% slower and Tcl 10x slower.

Which version of Ruby are you using?  Can you show us the whole
scripts?

                                                        matz.



Tue, 27 Sep 2005 01:16:50 GMT  

Quote:

> Hi,

> In message "Need IO Optimization help"

> |We're having a little shoot out here at work with Ruby, Perl and Tcl.
> |So far, Ruby kicked on a recursive Fibonacci(sp?) sequence with
> |Perl about 50% slower and Tcl 10x slower.

> Which version of Ruby are you using?  Can you show us the whole
> scripts?

Sure:

 ruby -v
 ruby 1.8.0 (2003-04-10) [sparc-solaris2.8]

#- ruby
file = ARGV.shift

File.open(file + ".out", "w") { |of|
  File.open(file).each {|line|
    of << line
  }
}

#---

#- tcl
set inFileName [lindex $argv 0]

set ifid [open $inFileName r]
set ofid [open ${inFileName}.out w]

while { ! [eof $ifid] } {
    gets $ifid newLine
    puts $ofid $newLine
}

close $ifid
close $ofid
#---

#- perl
$file=shift;
open( inFile, "+<$file");
open(outFile, "+>${file}.out");
while (<inFile>){
  print outFile $_;
}

#-----

             user    system  elapsed   CPU
rubyprint   60.07u   21.32s  1:31.62  88.8%
ruby<<      58.30u   21.13s  1:30.41  87.8%
cp           0.01u    6.64s  0:53.34  12.4%
perl        16.79u    7.66s  0:58.76  41.6%
tcl        226.49u    8.74s  4:17.72  91.2%
tclcp        0.21u    7.80s  1:17.81  10.2%
rubysys      1.46u   12.18s  1:16.28  17.8%

--
Jim Freeze
----------
Computers are not intelligent.  They only think they are.



Tue, 27 Sep 2005 01:29:24 GMT  

Quote:




> Not as far as I know. But you can simulate this with String#each by
> reading the whole file in one go and then iterating:

> contents = f.read()
> contents.each {|line|
>   # processing...
> }

> or work directly on contents by using String#sub!

Hmm..., that would be a problem since the files are 200MB - 900MB.
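
For files that size, one middle ground would be to read fixed-size chunks and split the lines out by hand, carrying any partial line over to the next chunk. This is only a sketch under assumptions: the helper name each_line_chunked and the 64 KB default are made up for illustration, and whether it beats File#each would have to be measured.

```ruby
require "stringio"

# Hypothetical helper: read +io+ in fixed-size chunks and yield complete
# lines, carrying a trailing partial line over to the next chunk.
# IO#read(length) returns binary-encoded strings, so the yielded lines
# are binary-encoded as well.
def each_line_chunked(io, chunk_size = 64 * 1024)
  carry = String.new
  while (chunk = io.read(chunk_size))   # read(length) returns nil at EOF
    carry << chunk
    lines = carry.lines
    # Hold back the last piece if the chunk ended in the middle of a line.
    carry = lines.last.end_with?("\n") ? String.new : lines.pop
    lines.each { |line| yield line }
  end
  yield carry unless carry.empty?       # final line without a newline
end

# Tiny demo with a deliberately small chunk size.
collected = []
each_line_chunked(StringIO.new("a\nbb\nccc"), 2) { |line| collected << line }
p collected
```

Memory use stays bounded by the chunk size plus the longest line, regardless of file size.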

Quote:
> Did you try to use "file.sync= false" in conjunction with "file.flush"?

No. I will try.

Quote:
> Did you try reading the whole file in with "contents = file.read()" and
> then do line by line processing by doing "contents.each { |line| .... }?

> And: did you make sure that IO is the problem and not the processing part?

The processing right now is nil.

--
Jim Freeze



Tue, 27 Sep 2005 01:32:01 GMT  

Quote:


> > Hi,

> > In message "Need IO Optimization help"

> > |We're having a little shoot out here at work with Ruby, Perl and Tcl.
> > |So far, Ruby kicked on a recursive Fibonacci(sp?) sequence with
> > |Perl about 50% slower and Tcl 10x slower.

FWIW, here is a benchmark suite you can use:

############################################################
# iobenchmark.rb - benchmark to test IO read/write methods
############################################################
require "benchmark"
include Benchmark

outfile1 = "test1.out"
outfile2 = "test2.out"
outfile3 = "test3.out"
outfile4 = "test4.out"
outfile5 = "test5.out"
outfile6 = "test6.out"
outfile7 = "test7.out"
outfile8 = "test8.out"
outfile9 = "test9.out"

outfiles = Dir["test*.out"]

outfiles.each{ |f|
   File.delete(f)
}

string = "The quick brown fox jumped over the lazy dog's back\n"
string_len = string.length # 52
iterations = 100000 # About 5mb per 100,000 iterations

bm do |x|
   x.report("print:"){
      iterations.times{
         File.open(outfile1,"a+"){ |f| f.print(string) }
      }
   }

   x.report("write:"){
      iterations.times{
         File.open(outfile2,"a+"){ |f| f.write(string) }
      }
   }

   x.report("syswrite:"){
      iterations.times{
         File.open(outfile3,"a+"){ |f| f.syswrite(string) }
      }
   }

   x.report("IO.foreach:"){
      outfile = File.open(outfile4,"a+")
      IO.foreach(outfile1){ |line|
         outfile.syswrite(line)
      }
      outfile.close
   }

   x.report("File::gets:"){
      infile = File.open(outfile1)
      outfile = File.open(outfile5,"a+")
      while line = infile.gets
         outfile.syswrite(line)
      end
      infile.close
      outfile.close
   }

   x.report("File::readline:"){
      infile = File.open(outfile1)
      outfile = File.open(outfile6,"a+")
      begin
         while line = infile.readline
            outfile.syswrite(line)
         end
      rescue EOFError
         # do nothing
      end
      infile.close
      outfile.close
   }

   x.report("File::read:"){
      infile = File.open(outfile1)
      outfile = File.open(outfile7,"a+")
      begin
         while line = infile.read(string_len)
            outfile.syswrite(line)
         end
      rescue EOFError
         # do nothing
      end
      infile.close
      outfile.close
   }

   x.report("File::sysread:"){
      infile = File.open(outfile1)
      outfile = File.open(outfile8,"a+")
      begin
         while line = infile.sysread(string_len)
            outfile.syswrite(line)
         end
      rescue EOFError
         # do nothing
      end
      infile.close
      outfile.close
   }

   x.report("File::each:"){
      outfile = File.open(outfile9,"a+")
      File.open(outfile1).each{|line|
         outfile.syswrite(line)
      }
      outfile.close
   }
end



Tue, 27 Sep 2005 01:37:05 GMT  

Quote:


> > Did you try to use "file.sync= false" in conjunction with "file.flush"?

> No. I will try.

Looks like file.sync is false by default.
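
A quick way to check this, as a small sketch (the temp file name is arbitrary):

```ruby
require "tmpdir"

path = File.join(Dir.tmpdir, "sync_demo_#{Process.pid}.txt")
default_sync = nil
on_disk = nil

File.open(path, "w") do |f|
  default_sync = f.sync       # sync mode of a freshly opened file
  f.print "hello\n"           # goes into Ruby's write buffer first
  f.flush                     # hand the buffered data to the OS
  on_disk = File.read(path)   # now visible through a second handle
end

puts "sync by default: #{default_sync}"
File.delete(path)
```

With sync left at false, writes are buffered, so an explicit flush is only needed when another reader must see the data right away.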

--
Jim Freeze
----------
Computer Science is merely the post-Turing decline in formal systems
theory.



Tue, 27 Sep 2005 01:37:29 GMT  

Quote:


> FWIW, here is a benchmark suite you can use:

Thanks

--
Jim Freeze
----------



Tue, 27 Sep 2005 01:56:21 GMT  

Below are the results I get from your benchmarks. I added
the last test, which seems to be the fastest AND is what
I was doing in my tests. Too bad Perl is 50% faster.

Can this be sped up with a C routine?

                       user     system      total        real
print:             7.680000   6.290000  13.970000 ( 14.107482)
write:             7.680000   5.870000  13.550000 ( 13.634365)
syswrite:          7.300000   6.090000  13.390000 ( 13.460545)
IO.foreach:        2.150000   2.210000   4.360000 (  4.474877)
File::gets:        2.150000   2.090000   4.240000 (  4.847931)
File::readline:    2.200000   2.020000   4.220000 (  4.457981)
File::read:        1.920000   2.160000   4.080000 (  4.214184)
File::sysread:     2.170000   3.260000   5.430000 (  5.552458)
File::each:        2.330000   2.040000   4.370000 (  4.375648)
File::each: <<     1.780000   0.380000   2.160000 (  2.337520)

   x.report("File::each: << "){
     outfile = File.open(outfile9,"a+")
     File.open(outfile1).each{|line|
       outfile << line
     }
     outfile.close
   }
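
One plausible reading of the drop in system time: IO#syswrite bypasses Ruby's write buffer, so every line costs its own system call, while IO#<< goes through the buffer and hands data to the OS in larger pieces. The sketch below isolates just that difference. The helper name copy_lines, the temp directory, and the line count are illustrative, and the timings will vary by machine.

```ruby
require "benchmark"
require "tmpdir"

text  = "The quick brown fox jumped over the lazy dog's back\n"
count = 20_000

# Hypothetical helper: copy src_path to dest_path line by line, letting
# the caller's block decide how each line is written.
def copy_lines(src_path, dest_path)
  File.open(dest_path, "w") do |out|
    File.open(src_path) do |infile|
      infile.each_line { |line| yield out, line }
    end
  end
end

# Scratch directory, left in place so the outputs can be inspected.
dir = Dir.mktmpdir("io_bench")
src = File.join(dir, "src.txt")
File.open(src, "w") { |f| count.times { f << text } }

sys_dest = File.join(dir, "syswrite.out")
buf_dest = File.join(dir, "append.out")

Benchmark.bm(10) do |x|
  # Unbuffered: one write per line.
  x.report("syswrite:") { copy_lines(src, sys_dest) { |out, line| out.syswrite(line) } }
  # Buffered: Ruby batches the lines before writing.
  x.report("<<:")       { copy_lines(src, buf_dest) { |out, line| out << line } }
end
```

Both variants produce identical output files; only the number of writes differs.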

--
Jim Freeze
----------
"Life to you is a bold and dashing responsibility"
                -- a Mary Chung's fortune cookie



Tue, 27 Sep 2005 02:10:27 GMT  

Quote:

> Below are the results I get from your benchmarks. I added
> the last test which seems to be the fastest, AND is what
> I was doing in my tests. Too bad Perl is 50% faster.

> Can this be sped up with a C routine?

I did some profiling of a very similar program, results posted at
http://blade.nagaokaut.ac.jp/cgi-bin/vframe.rb?key=-pg&cginame=namazu...

Conclusion: I can see no obvious bottlenecks. When you drill down and work
out how much time is spent in object creation and garbage collection, file
reading and scanning etc. it seems all pretty well balanced to me. The
reading and breaking-into-individual-lines accounts for 1.09 seconds out of
2.53 seconds

Regards,

Brian.



Tue, 27 Sep 2005 16:32:17 GMT  



Quote:

> > Below are the results I get from your benchmarks. I added
> > the last test which seems to be the fastest, AND is what
> > I was doing in my tests. Too bad Perl is 50% faster.

> > Can this be sped up with a C routine?

> I did some profiling of a very similar program, results posted at

http://blade.nagaokaut.ac.jp/cgi-bin/vframe.rb?key=-pg&cginame=namazu...
ubmit=Search&dbname=ruby-talk&max=50&whence=0

Quote:

> Conclusion: I can see no obvious bottlenecks. When you drill down and work
> out how much time is spent in object creation and garbage collection, file
> reading and scanning etc. it seems all pretty well balanced to me. The
> reading and breaking-into-individual-lines accounts for 1.09 seconds out of
> 2.53 seconds

Even if no improvements are possible, you should weigh the effort of
improving Ruby's speed relative to Perl or others against the effort of
maintaining a changed Ruby and the effort saved by Ruby's higher development
speed. For some scripts it's surely more efficient to write the script in one
hour and have it run for another than to tweak a high-performance Perl script
for four hours that then runs in 15 minutes.

Regards

    robert



Tue, 27 Sep 2005 16:37:21 GMT  

Quote:


> > Below are the results I get from your benchmarks. I added
> > the last test which seems to be the fastest, AND is what
> > I was doing in my tests. Too bad Perl is 50% faster.

> > Can this be sped up with a C routine?

> I did some profiling of a very similar program, results posted at
> http://blade.nagaokaut.ac.jp/cgi-bin/vframe.rb?key=-pg&cginame=namazu...

> Conclusion: I can see no obvious bottlenecks. When you drill down and work
> out how much time is spent in object creation and garbage collection, file
> reading and scanning etc. it seems all pretty well balanced to me. The
> reading and breaking-into-individual-lines accounts for 1.09 seconds out of
> 2.53 seconds

I did some profiling as well and came to similar conclusions for these
simple tests. I'm not sure that dropping down to rb_io_readline (or
whatever) would make it go any faster. However, I plan to repeat the
tests with 'code' inside the loop to see if Ruby catches up with Perl
when actual work is being done.

--
Jim Freeze
----------
Never put off till tomorrow what you can avoid all together.



Tue, 27 Sep 2005 17:40:13 GMT  

Quote:




> > > Below are the results I get from your benchmarks. I added

> Even if no improvements are possible, you should weigh the effort of
> improving Ruby's speed relative to Perl or others against the effort of
> maintaining a changed Ruby and the effort saved by Ruby's higher development
> speed. For some scripts it's surely more efficient to write the script in one
> hour and have it run for another than to tweak a high-performance Perl script
> for four hours that then runs in 15 minutes.

 For new scripts what you say is true. But I can't accept the fact
 that there are no improvements possible. And Ruby has no advantage
 when the Perl scripts already exist and are not being changed.

--
Jim Freeze
----------
"This is a test of the Emergency Broadcast System.  If this had been an
actual emergency, do you really think we'd stick around to tell you?"



Tue, 27 Sep 2005 17:47:27 GMT  
 