Reading multicolumn-file without parsing strings 

 I'm trying to write a Perl script that reads in a datafile containing
four columns of numbers, and I'd like to know if there's a way of reading
this without having to read in each line as a string and then parsing it
to produce four separate scalars. I'm a total newbie to Perl (this is only
the second script I've ever written), so my apologies if I'm asking the
obvious.

--
Alex Borghgraef



Wed, 18 Jun 1902 08:00:00 GMT  

Quote:

> I'm trying to write a Perl script that reads in a datafile containing
> four columns of numbers, and I'd like to know if there's a way of
> reading this without having to read in each line as a string and then
> parsing it to produce four separate scalars

Not clear to me what is the problem with that.


        $delimiter = ',';
        while (<DATAFILE>) {
          chomp;  # strip the newline so it does not end up in $c4
          ($c1,$c2,$c3,$c4) = split(/$delimiter/);
        }

Quote:
> I'm a total newbie to Perl (this is only the second script
> I've ever written), so my apologies if I'm asking the obvious.

-James
--
James Peregrino
Harvard Div. Continuing Education


Wed, 18 Jun 1902 08:00:00 GMT  

Quote:

> I'm trying to write a Perl script that reads in a datafile
> containing four columns of numbers, and I'd like to know if there's
> a way of reading this without having to read in each line as a
> string and then parsing it to produce four separate scalars.

If they are fixed width fields you can use read().  It is unlikely to
be much faster.  It is likely to be much less readable.

If it's tab-delimited or comma-delimited you could play about with $/.
It is unlikely to be much faster (if at all).  It is likely to be much
less readable.

If it's white-space delimited then you'd have to play about with
getc().  This would be much slower and much less readable.

Stick to doing things the natural way using split().

while(<FILE>) {
  my ($c1,$c2,$c3,$c4) = split;
  # Do stuff...
}

--
     \\   ( )
  .  _\\__[oo

 .  l___\\  
  # ll  l\\  
 ###LL  LL\\


Wed, 18 Jun 1902 08:00:00 GMT  

[ Please fix your horrid word wrap, else people will skip
  reading your posts...
]

Quote:

> I'm trying to write a Perl script that reads in a datafile containing
>four columns of
>numbers, and I'd like to know if there's a way of reading this without

                                                                ^^^^^^^
                                                                ^^^^^^^

Quote:
>having to read
>in each line as a string and then parsing it to produce four separate
>scalars

Why?

What is wrong with reading and splitting?

It works fine. It gets your work done.

Call it finished and move on to solving the next problem.

You must read all of the lines if you are going to process all
of the lines.

I cannot figure out why you want another way...

What is the _real_ problem that you are trying to solve?

Do you think it is too slow?

The code is too big?

What?

--
    Tad McClellan                          SGML Consulting

    Fort Worth, Texas



Wed, 18 Jun 1902 08:00:00 GMT  
Alex Borghgraef wrote:

Quote:
> I'm trying to write a Perl script that reads in a datafile containing
> four columns of numbers, and I'd like to know if there's a way of
> reading this without having to read in each line as a string and then
> parsing it to produce four separate scalars. I'm a total newbie to Perl
> (this is only the second script I've ever written), so my apologies if
> I'm asking the obvious.

Hi,

The only way I can think of to avoid the split, if the list is
comma-separated, is to use the DBD::CSV module, which is available
at www.cpan.org. The file could then be accessed through the database
functions of the DBI module. But I suppose you don't want to do that
as a newbie to Perl.

Regards
Christian

--
|~-_ /~~~~~ Free Linux Portal: http://www.linux-config.de ~~~~~\ _-~|
|  //       de.etc.schreiben.* - Usenet-Literatur im www:       \\  |
| //               http://www.usenet-autoren.de                  \\ |



Wed, 18 Jun 1902 08:00:00 GMT  

Quote:


> > I'm trying to write a Perl script that reads in a datafile containing
> > four columns of numbers, and I'd like to know if there's a way of
> > reading this without having to read in each line as a string and then
> > parsing it to produce four separate scalars

> Not clear to me what is the problem with that.

 Err, simply that I hadn't gotten to the chapter "Regular expressions" in
"Learning Perl" from O'Reilly Press yet :-)  Meaning I hadn't encountered
the split operator yet, so I didn't know how easy it was to separate the
different scalars this way.  I'm mostly programming Fortran these days,
so I was looking for the equivalent of
      read(1, *) var1, var2, var3, var4
So it turns out I was asking the obvious after all.

Quote:
> Are you saying you want to get each column into its own data structure?

 No, I wanted to add all members of each column together. This is what
I've come up with so far (after using parts of your example):

#!/usr/bin/perl -w

print "Counting particles\n";
$proton = 0;
$pion = 0 ;
$neutron = 0;
$electron = 0;
$counter = 0;
$delimiter = '  ';
open(PARTFILE, "< calibd_fort.90") or die "can't find calibd_fort.90";
   while(<PARTFILE>)
   {


       $proton     += $fileline[1];
       $pion         += $fileline[2];
       $neutron  += $fileline[3];
       $electron += $fileline[4];
       $counter  += 1;
   }
close(PARTFILE);
print "$proton $pion $neutron $electron\n";
print "$counter\n";

Now this seems to work well, except for one thing: I seem to be getting only
half the lines in the file, and this is not a good thing.  The file is
formatted as follows:
  0  1  45  2
  1  0  0  0
  1  2  3  4
...
Each line starts with a double space, and a double space is used as the
separator between the numbers (that's why I ignore $fileline[0], it will be
empty anyway). The program seems to miss all the odd lines. How come?

--
Alex Borghgraef



Wed, 18 Jun 1902 08:00:00 GMT  

Quote:



> > > I'm trying to write a Perl script that reads in a datafile containing
> > > four columns of numbers, and I'd like to know if there's a way of
> > > reading this without having to read in each line as a string and then
> > > parsing it to produce four separate scalars

> > Not clear to me what is the problem with that.

> Err, simply that I hadn't gotten to the chapter "Regular expressions" in
> "Learning Perl" from O'Reilly Press yet :-)  Meaning I hadn't encountered
> the split operator yet, so I didn't know how easy it was to separate the
> different scalars this way.  I'm mostly programming Fortran these days,
> so I was looking for the equivalent of
>       read(1, *) var1, var2, var3, var4
> So it turns out I was asking the obvious after all.

> > Are you saying you want to get each column into its own data structure?

> No, I wanted to add all members of each column together. This is what
> I've come up with so far (after using parts of your example)

> #!/usr/bin/perl -w

Better to add use strict; and declare all the variables below with my().
Quote:

> print "Counting particles\n";
> $proton = 0;
> $pion = 0 ;
> $neutron = 0;
> $electron = 0;
> $counter = 0;
> $delimiter = '  ';
> open(PARTFILE, "< calibd_fort.90") or die "can't find calibd_fort.90";

can't find? You don't really know why the open failed, but perl tells
you in $!.
die "failed to open calibd_fort.90: $!";
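
A minimal sketch of that advice, assuming a lexical filehandle and the three-argument form of open (neither appears in the original script). The helper name open_or_explain and the file name in the demonstration are hypothetical and used only for illustration:

```perl
use strict;
use warnings;

# Hypothetical helper: open a file for reading, or die with the reason
# the operating system reported in $!.
sub open_or_explain {
    my ($path) = @_;
    open(my $fh, '<', $path)
        or die "failed to open $path: $!";
    return $fh;
}

# Demonstrate on a file that is assumed not to exist.
my $fh = eval { open_or_explain('no_such_file.90') };
print "error: $@" unless $fh;
```

The message now carries the actual cause (missing file, permissions, and so on) instead of a guess.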

Quote:
>    while(<PARTFILE>)

read a line into $_

Quote:
>    {


read another line. The other one is still in $_

Quote:
> Each line starts with a double space, and a double space is used as the
> separator between the numbers (that's why I ignore $fileline[0], it will
> be empty anyway). The

You can also use magic split on ' ', which ignores leading whitespace

Quote:
> program seems to miss all the odd lines. How come?

You read twice and process once.
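
A minimal sketch of the fix, assuming the loop body in the original script read from <PARTFILE> a second time (that line is missing from the quoted text). Each line is read exactly once, in the while condition, and then split. The sub name sum_columns and the in-memory sample data are illustrative, not from the original script:

```perl
use strict;
use warnings;

# Sum each whitespace-separated column of the lines read from $fh.
# Returns a reference to the column sums and the number of lines counted.
sub sum_columns {
    my ($fh) = @_;
    my @sums;
    my $count = 0;
    while (my $line = <$fh>) {          # the only read per iteration
        my @fields = split ' ', $line;  # ignores leading whitespace
        next unless @fields;            # skip blank lines
        $sums[$_] += $fields[$_] for 0 .. $#fields;
        $count++;
    }
    return (\@sums, $count);
}

# Sample data in the format described in the post.
my $data = "  0  1  45  2\n  1  0  0  0\n  1  2  3  4\n";
open(my $in, '<', \$data) or die "cannot open in-memory file: $!";
my ($sums, $count) = sum_columns($in);
close($in);
print "@$sums\n";   # column sums
print "$count\n";   # lines counted
```

Because split ' ' drops the leading empty field, the four columns sit at indices 0 to 3 here rather than 1 to 4 as in the original script.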

Golfified version:

while (<DATA>) {
  my $i = 0;
  for (split) {
    $sums[$i++] += $_;
  }
}


- Alex



Wed, 18 Jun 1902 08:00:00 GMT  

Quote:

> >    while(<PARTFILE>)

> read a line into $_

> >    {

> read another line. The other one is still in $_

I get it now.  I knew I had to be reading the file twice in each loop, but I
didn't understand the $_ stuff. I've been reading some more, and it's all
becoming a lot clearer now. So I could be doing


to solve this problem? Although your solution below is much more elegant.

Quote:
> > Each line starts with a double space, and a double space is used as the
> > separator between the numbers (that's why I ignore $fileline[0], it will
> > be empty anyway). The

> You can also use magic split on ' ', which ignores leading whitespace

 Magic split? What's that?

Quote:
> Golfified version:

> while (<DATA>) {
>   my $i = 0;
>   for (split) {
>     $sums[$i++] += $_;
>   }
> }



 Awesome! I'm really starting to like this language.

--
Alex Borghgraef



Wed, 18 Jun 1902 08:00:00 GMT  

Quote:


>> You can also use magic split on ' ', which ignores leading whitespace

> Magic split? What's that?

   perldoc -f split

      "As a special case, specifying a PATTERN of space (' ')
       will split on..."
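
A small sketch of the difference, using a sample line in the format described earlier in the thread. The variable names are illustrative:

```perl
use strict;
use warnings;

my $line = "  0  1  45  2\n";

# Special case: a single-space string skips leading whitespace and
# splits on runs of whitespace, so the trailing newline is dropped too.
my @magic = split ' ', $line;

# A regex matching one literal space splits at every space, producing
# empty fields for the leading and doubled spaces.
my @literal = split / /, $line;

print scalar(@magic), " fields: @magic\n";
print scalar(@literal), " fields with the literal-space regex\n";
```

Calling split with no arguments at all, as in the loops above, behaves like split ' ', $_.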

Quote:
> Awesome! I'm really starting to like this language.

Start to use the docs that come with it too  :-)

--
    Tad McClellan                          SGML Consulting

    Fort Worth, Texas



Wed, 18 Jun 1902 08:00:00 GMT  
 