DATA object and processes; unexpected problem 

I have several forked processes simultaneously accessing the DATA
area with the following function, and it all goes horribly wrong.

   def extract(dest,name,mode)
     User.asroot {
       start=0
       fn = path(dest,name)
       File.open(fn,File::CREAT|File::WRONLY,mode) { |file|
         DATA.rewind
         DATA.each { |line|
           if start==0
             start=1 if line =~ %r{^=begin #{name} -*}
           else
             break if line =~ %r{^=end -*}
             yield line if block_given?
             file.puts line
           end
         }
       }
     }
   end

Each process gets partial or corrupt data. Now it obviously has
something to do with the DATA object being shared between processes
somehow (there is presumably a file handle in there somewhere), but how
can I fix it?

Andrew Walrond



Mon, 28 Nov 2005 08:31:37 GMT  
Hi,

At Thu, 12 Jun 2003 09:31:37 +0900,

Quote:

> Each process gets partial or corrupt data. Now it obviously has
> something to do with the DATA object being shared between processes
> somehow (there is presumably a file handle in there somewhere), but
> how can I fix it?

Shared file descriptors share the position even between
processes.  You can duplicate DATA.

--
Nobu Nakada
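
To make the point above concrete, here is a minimal sketch (not from the thread) of how a duplicated descriptor shares its file position, while a separately opened handle on the same file does not. The helper name shared_offset_demo and the temp file are made up for illustration, and sysread is used so that Ruby's own read buffering stays out of the picture.

```ruby
require "tempfile"

# Hypothetical demo helper: returns what a dup'd handle and a freshly
# opened handle each read after the original has consumed one line.
def shared_offset_demo
  result = nil
  Tempfile.create("shared_offset") do |tmp|
    tmp.write("one\ntwo\nthree\n")
    tmp.flush

    File.open(tmp.path) do |orig|
      copy = orig.dup    # new descriptor, same underlying open file
      orig.sysread(4)    # unbuffered read of "one\n" advances the shared offset

      File.open(tmp.path) do |fresh|   # separate open: its own offset
        result = [copy.sysread(4), fresh.sysread(4)]
      end
      copy.close
    end
  end
  result
end

p shared_offset_demo
```

The dup'd handle should come back with "two\n", because the read through the original already moved the shared position, while the freshly opened handle starts from "one\n". The same sharing applies to descriptors inherited across fork.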



Mon, 28 Nov 2005 09:43:59 GMT  

Quote:

> Shared file descriptors share the position even between
> processes.  You can duplicate DATA.

Yes I know, so the first thing I tried was

   def extract(dest,name,mode)

     data = DATA.dup

     User.asroot {
       start=0
       fn = path(dest,name)
       File.open(fn,File::CREAT|File::WRONLY,mode) { |file|
         data.rewind
         data.each { |line|
           if start==0
             start=1 if line =~ %r{^=begin #{name} -*}
           else
             break if line =~ %r{^=end -*}
             yield line if block_given?
             file.puts line
           end
         }
       }
     }
   end

Which didn't work.

Further investigation proved confusing. Can anyone explain what's
going on?
(I've appended the output of each line as a comment to make it easier.)

#!/bin/ruby -w
#Hello cat
#Hello dog
#hello canary

puts DATA.gets  #-> one               Ok, as expected.
puts DATA.gets  #-> two               Ok, as expected.

a=DATA.dup
puts a.gets     #-> nil               Huh? I was expecting 'three'
a.rewind        #                     Try rewinding...
puts a.gets     #-> #!/bin/ruby -w    Ok, I can live with that

b=DATA.dup
puts b.gets     #-> nil               Hmmm. Ok, same as before
b.rewind        #                     Rewind...
puts b.gets     #-> #!/bin/ruby -w    As expected

# Mix it up a bit...

puts a.gets     #-> #Hello cat        Ok
puts a.gets     #-> #Hello dog        Yep
puts b.gets     #-> #Hello cat        Ok
puts a.gets     #-> #Hello canary     Great!

# Ok, lets simplify it a bit

c=DATA.dup
c.rewind
d=DATA.dup
d.rewind
puts c.gets     #-> #!/bin/ruby -w    As expected
puts d.gets     #-> nil               What the hell is going on here?

# Try again, reordering stuff a bit...

c=DATA.dup
d=DATA.dup
c.rewind
puts c.gets     #-> #!/bin/ruby -w    As expected
d.rewind
puts d.gets     #-> #!/bin/ruby -w    It works! But why???

__END__
one
two
three



Mon, 28 Nov 2005 16:52:02 GMT  
Hi,

At Thu, 12 Jun 2003 17:52:02 +0900,

Quote:

> Further investigation proved confusing. Can anyone explain what's
> going on?
> (I've appended the output of each line as a comment to make it easier.)

It looks like a position mismatch between the stdio buffer and the IO
descriptor.  This may well be a bug, but try calling #seek before #dup.

Quote:
> #!/bin/ruby -w
> #Hello cat
> #Hello dog
> #hello canary

> puts DATA.gets  #-> one               Ok, as expected.
> puts DATA.gets  #-> two               Ok, as expected.

DATA.seek(0, IO::SEEK_CUR)
Quote:
> a=DATA.dup
> puts a.gets     #-> nil               Huh? I was expecting 'three'
> a.rewind        #                     Try rewinding...
> puts a.gets     #-> #!/bin/ruby -w    Ok, I can live with that

DATA.seek(0, IO::SEEK_CUR)

Quote:
> b=DATA.dup
> puts b.gets     #-> nil               Hmmm. Ok, same as before
> b.rewind        #                     Rewind...
> puts b.gets     #-> #!/bin/ruby -w    As expected

But it doesn't work for the case below, so more investigation is
needed.

Quote:
> # Ok, lets simplify it a bit

> c=DATA.dup
> c.rewind
> d=DATA.dup
> d.rewind
> puts c.gets     #-> #!/bin/ruby -w    As expected
> puts d.gets     #-> nil               What the hell is going on here?

--
Nobu Nakada


Tue, 29 Nov 2005 08:10:00 GMT  

Quote:

> Hi,

> It looks like a position mismatch between the stdio buffer and the IO
> descriptor.  This may well be a bug, but try calling #seek before #dup.

OK;

#!/bin/ruby -w
#Hello cat
#Hello dog
#hello canary

puts DATA.gets  #-> one               Ok, as expected.
puts DATA.gets  #-> two               Ok, as expected.

DATA.seek(0, IO::SEEK_CUR)
a=DATA.dup
puts a.gets     #-> three             Yes!
a.rewind        #                     Try rewinding...
puts a.gets     #-> #!/bin/ruby -w    Yes!

DATA.seek(0, IO::SEEK_CUR)
b=DATA.dup
puts b.gets     #-> three             Yes!
b.rewind        #                     Rewind...
puts b.gets     #-> #!/bin/ruby -w    As expected

So, as you suggested, the first part works fine with the sync (the
seek before the dup). I'll leave the rest with you, then :)

Andrew Walrond
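
One way to sidestep the shared handle entirely, offered here as an aside rather than anything proposed in the thread, is to read DATA into a string once, before forking, and let each process scan its own in-memory copy. In the sketch below, section_lines is a hypothetical helper, and its marker matching differs slightly from the regexes in the original extract function: the name is escaped, and it must be followed by whitespace or the end of the line.

```ruby
# Hypothetical helper: return the lines between "=begin NAME" and the next
# "=end" marker in +text+. In a real script, +text+ might come from
# SCRIPT_DATA = DATA.read, captured once before any fork.
def section_lines(text, name)
  inside = false
  found = []
  text.each_line do |line|
    if !inside
      # Assumption: the marker is "=begin NAME" followed by space or newline.
      inside = true if line =~ /^=begin #{Regexp.escape(name)}(\s|$)/
    elsif line =~ /^=end(\s|$)/
      break
    else
      found << line
    end
  end
  found
end

sample = "=begin a ---\n1\n2\n=end ---\n=begin b ---\n3\n=end ---\n"
p section_lines(sample, "b")
```

Since every child gets its own copy of the string after fork, there is no file position left to fight over. The cost is holding the whole data area in memory, which may or may not be acceptable for large payloads.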



Tue, 29 Nov 2005 19:30:48 GMT  
Hi,

At Fri, 13 Jun 2003 20:30:48 +0900,

Quote:

> So, as you suggested, the first part works fine with the sync (the
> seek before the dup). I'll leave the rest with you, then :)

That was normal behavior.  As I wrote in [ruby-talk:73295],

Quote:
> Shared file descriptors share the position even between
> processes.

Therefore, when the first fd reaches EOF, the second fd points at EOF
too.  This is the UNIX I/O model.

At the very least, however, the seek before dup should probably be
done automatically.

Index: io.c
===================================================================
RCS file: /cvs/ruby/src/ruby/io.c,v
retrieving revision 1.213
diff -u -2 -p -r1.213 io.c
--- io.c        7 Jun 2003 15:33:40 -0000       1.213

     if (orig->f2) {
        io_fflush(orig->f2, orig);
+       fseeko(orig->f, 0L, SEEK_CUR);
     }
     else if (orig->mode & FMODE_WRITABLE) {
        io_fflush(orig->f, orig);
     }
+    else {
+       fseeko(orig->f, 0L, SEEK_CUR);
+    }

     /* copy OpenFile structure */

--
Nobu Nakada
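
Since dup'd and inherited descriptors share one position, a possible workaround for the forked-process case is for each process to open its own handle on the file and seek to where the data starts. The following is only a sketch under assumptions: open_private and read_in_children are hypothetical helpers, a temp file would stand in for the script itself, and fork is assumed to be available (so not Windows).

```ruby
require "tempfile"

# Hypothetical helper: open a private handle on +path+, positioned at +offset+.
# In a real script this might be open_private(__FILE__, DATA_OFFSET), with
# DATA_OFFSET = DATA.pos captured before anything reads from DATA.
def open_private(path, offset)
  io = File.open(path)
  io.seek(offset)
  io
end

# Fork +count+ children; each reads everything after +offset+ through its own
# handle and sends the text back to the parent over a pipe.
def read_in_children(path, offset, count)
  children = Array.new(count) do
    rd, wr = IO.pipe
    pid = fork do
      rd.close
      io = open_private(path, offset)
      wr.write(io.read)
      io.close
      wr.close
      exit!(0)   # skip any at_exit handlers inherited from the parent
    end
    wr.close     # parent keeps only the read end
    [pid, rd]
  end

  children.map do |pid, rd|
    text = rd.read
    rd.close
    Process.wait(pid)
    text
  end
end
```

Because each child calls File.open itself, the kernel gives it a separate file position, so one child reaching EOF does not affect the others. Calling read_in_children(path, offset, 3) on a small file should hand back three identical copies of everything after the offset.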


