Author |
Message |
HoboSo #1 / 14
|
 Must be a better way..
Clearly I'm a newbie to both Perl and programming, but I haven't learned a better way to do this... I've used similar code many times without any problems, but I'm trying to be more efficient. Can someone just give me a nudge in the right direction? Thank you

while (<TEXTFILE>) {
    if ( /SETTLEMENT SECTION/ ) {
        $section = "settlement";
    } elsif ( /DIVIDENDS REPORT/ ) {
        $section = "divs";
    } elsif ( /RISK MANAGEMENT REPORT/ ) {
        $section = "risk";
    }
    if ( $section eq "settlement" ) {
        parse_settlement($_);
    } elsif ( $section eq "divs" ) {
        parse_divs($_);
    } elsif ( $section eq "risk" ) {
        parse_risk($_);
    }
}
|
Sun, 16 May 2004 02:15:08 GMT |
|
 |
Andrew Ham #2 / 14
|
 Must be a better way..
Quote: HoboSong wrote... >Clearly Im a newbie to both Perl and programming, but I havent learned >a better way to do this... Ive used similar many times without any problems >but Im trying to be more efficient, can someone just give me a nudge in the >right direction?
That kinda depends on what's happening inside the parse_ functions. Are they reading more lines from TEXTFILE? Are they just processing ugly stuff on the rest of the line?

Any alternatives to your code won't necessarily be more efficient, just variations on how to express yourself. It's hard to optimise when you must find something and then perform an action like this. Besides that, the I/O time spent reading TEXTFILE will be far more significant than any speed increase you might squeeze out here, so my solution is presented on the basis that I/O time dominates and code clarity is nicer. Try this:

while(<TEXTFILE>) {
    chomp; # on a hunch, I've added this for you
    parse_settlement($_) if /SETTLEMENT SECTION/;
    parse_divs($_) if /DIVIDENDS REPORT/;
    parse_risk($_) if /RISK MANAGEMENT REPORT/;
}

I can't even be bothered trying to skip the next tests because of the cost of reading the file in the first place. I don't think you need your $section variable, because that's implicitly known since you are entering a specific code path.

However, if your parse_ functions are modifying $_ or reading more lines, there may be a chance of false matches on the following lines. In that case you can use a structure like this:

while(<TEXTFILE>) {
    chomp; # on a hunch, I've added this for you
    parse_settlement($_), next if /SETTLEMENT SECTION/;
    parse_divs($_), next if /DIVIDENDS REPORT/;
    parse_risk($_), next if /RISK MANAGEMENT REPORT/;
}
As usual, there's more than one way to do it in Perl. Which is why that's a famous Perl slogan. As you can gather from my reply, the proper answer for you depends on your data and what you will be doing to it. -- Space Corps Directive #723 Terraformers are expressly forbidden from recreating Swindon. -- Red Dwarf
|
Sun, 16 May 2004 03:12:36 GMT |
|
 |
John W. Krah #3 / 14
|
 Must be a better way..
Quote:
> Clearly Im a newbie to both Perl and programming, but I havent learned > a better way to do this... Ive used similar many times without any problems > but Im trying to be more efficient, can someone just give me a nudge in the > right direction?
Speed efficiency? Size efficiency? Maintenance efficiency? Quote: > while (<TEXTFILE>) { > if ( /SETTLEMENT SECTION/ ) { > $section = "settlement"; > } elsif ( /DIVIDENDS REPORT/ ) { > $section = "divs"; > } elsif ( /RISK MANAGEMENT REPORT/ ) { > $section = "risk"; > } > if ( $section eq "settlement" ) { > parse_settlement($_); > } elsif ($section eq "divs" ) { > parse_divs($_); > } elsif ( $section eq "risk" ) { > parse_risk($_); > } > }
my %lookup = ( qr/SETTLEMENT SECTION/,     \&parse_settlement,
               qr/DIVIDENDS REPORT/,       \&parse_divs,
               qr/RISK MANAGEMENT REPORT/, \&parse_risk,
             );

while ( <TEXTFILE> ) {
    for my $regex ( keys %lookup ) {
        /$regex/ and $lookup{ $regex }->( $_ );
        last;
    }
}

John
--
use Perl; program fulfillment
|
Sun, 16 May 2004 05:00:43 GMT |
|
 |
Uri Guttma #4 / 14
|
 Must be a better way..
JWK> my %lookup = ( qr/SETTLEMENT SECTION/,     \&parse_settlement,
JWK>                qr/DIVIDENDS REPORT/,       \&parse_divs,
JWK>                qr/RISK MANAGEMENT REPORT/, \&parse_risk,
JWK>              );
JWK> while ( <TEXTFILE> ) {
JWK>     for my $regex ( keys %lookup ) {
JWK>         /$regex/ and $lookup{ $regex }->( $_ );
JWK>         last;

hmm, will that ever try the 2nd and third regexes if the first fails? i think you meant:

    $lookup{ $regex }->( $_ ), last if /$regex/ ;

JWK>     }
JWK> }

uri
-- Stem is an Open Source Network Development Toolkit and Application Suite - ----- Stem and Perl Development, Systems Architecture, Design and Coding ---- Search or Offer Perl Jobs ---------------------------- http://jobs.perl.org
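The difference between the two placements of `last` can be checked with a small self-contained sketch. The sample lines and the ordered `@patterns` array are hypothetical stand-ins (an array is used instead of `keys %lookup` only so the outcome does not depend on hash order); the handlers are replaced by simple counters.

```perl
use strict;
use warnings;

# Hypothetical sample lines standing in for TEXTFILE.
my @lines = ( "DIVIDENDS REPORT\n", "RISK MANAGEMENT REPORT\n" );

# Ordered list of patterns, so the result does not depend on hash order.
my @patterns = (
    qr/SETTLEMENT SECTION/,
    qr/DIVIDENDS REPORT/,
    qr/RISK MANAGEMENT REPORT/,
);

my ( @unconditional, @conditional );

for my $line (@lines) {
    # Unconditional "last": the loop body runs once, so only the first
    # pattern is ever tried.
    for my $regex (@patterns) {
        $line =~ $regex and push @unconditional, $line;
        last;
    }

    # "last" tied to the match: keep trying patterns until one matches.
    for my $regex (@patterns) {
        push( @conditional, $line ), last if $line =~ $regex;
    }
}

print scalar(@unconditional), " handled with unconditional last, ",
      scalar(@conditional),   " handled with conditional last\n";
```

With these sample lines neither line matches the first pattern, so the unconditional version handles nothing while the conditional version handles both.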
|
Sun, 16 May 2004 05:28:29 GMT |
|
 |
David Hilse #5 / 14
|
 Must be a better way..
Quote: > Clearly Im a newbie to both Perl and programming, but I havent learned > a better way to do this... Ive used similar many times without any problems > but Im trying to be more efficient, can someone just give me a nudge in the > right direction? > Thank you > while (<TEXTFILE>) { > if ( /SETTLEMENT SECTION/ ) { > $section = "settlement"; > } elsif ( /DIVIDENDS REPORT/ ) { > $section = "divs"; > } elsif ( /RISK MANAGEMENT REPORT/ ) { > $section = "risk"; > } > if ( $section eq "settlement" ) { > parse_settlement($_); > } elsif ($section eq "divs" ) { > parse_divs($_); > } elsif ( $section eq "risk" ) { > parse_risk($_); > } > }
It seems like you might be able to use the .. or ... operators here, but the code doesn't indicate that this is currently possible with the data you're expecting. Just for informative purposes, here's some example code:

#!/usr/bin/perl -w
use strict;

while (<DATA>) {
    if ( /BEGIN FOO_SECTION/ .. /END FOO_SECTION/ ) {
        print "Foo section: $_";
    } elsif ( /BEGIN BAR_SECTION/ .. /END BAR_SECTION/ ) {
        if ( /BEGIN BAR_INT_SECTION/ .. /END BAR_INT_SECTION/ ) {
            print "Bar internal section: $_";
        } else {
            print "Bar section: $_";
        }
    } elsif ( /BEGIN BAZ_SECTION/ .. /END BAZ_SECTION/ ) {
        print "Baz section: $_";
    }
}

__DATA__
BEGIN FOO_SECTION
Foo line 1
Foo line 2
END FOO_SECTION
BEGIN BAR_SECTION
Bar line 1
BEGIN BAR_INT_SECTION
Bar internal line 1
END BAR_INT_SECTION
Bar line 2
END BAR_SECTION
BEGIN BAZ_SECTION
Baz line 1
Baz line 2
END BAZ_SECTION

Maybe you can work with this.

--
David Hilsee
|
Sun, 16 May 2004 06:24:18 GMT |
|
 |
HoboSo #6 / 14
|
 Must be a better way..
Quote:
> while(<TEXTFILE>) { > chomp; # on a hunch, I've added this for you > parse_settlement($_) if /SETTLEMENT SECTION/; > parse_divs($_) if /DIVIDENDS REPORT/; > parse_risk($_) if /RISK MANAGEMENT REPORT/; > } > I can't even be bothered trying to skip the next tests because of the cost > of reading the file in the first place. I don't think you need your $section > variable, because that's implicitly known since you are entering a specific > code path. > However, if your parse_ functions is modifying $_ or reading more lines, > there may be a chance of false matches on the following lines, then you can > use a structure like this: > while(<TEXTFILE>) { > chomp; # on a hunch, I've added this for you > parse_settlement($_), next if /SETTLEMENT SECTION/; > parse_divs($_), next if /DIVIDENDS REPORT/; > parse_risk($_), next if /RISK MANAGEMENT REPORT/; > } > As usual, there's more than one way to do it in Perl. Which is why that's a > famous Perl slogan. > As you can gather from my reply, the proper answer for you depends on your > data and what you will be doing to it.
I'm sorry, I should have been clearer. I'm using those regexes as 'switches' while reading a text file. Each section of the file has a completely different set of instructions, so I need to process all the lines until I run into a new section. How do you say:

    &DoThis until ( /YOU SEE THIS/ )

and then

    &DoDifferent until ( /YOU GET HERE/ );

etc., etc.

Oh, and I didn't use chomp because I can't seem to find the right combination of "\l\n" or whatever to designate the end of a page, so I leave them as is.

Thanks, Chris
|
Sun, 16 May 2004 08:52:57 GMT |
|
 |
Bart Lateu #7 / 14
|
 Must be a better way..
Quote:
>Clearly Im a newbie to both Perl and programming, but I havent learned >a better way to do this... Ive used similar many times without any problems >but Im trying to be more efficient, can someone just give me a nudge in the >right direction? >while (<TEXTFILE>) { >if ( /SETTLEMENT SECTION/ ) { > $section = "settlement"; > } elsif ( /DIVIDENDS REPORT/ ) { > $section = "divs"; > } elsif ( /RISK MANAGEMENT REPORT/ ) { > $section = "risk"; > } >if ( $section eq "settlement" ) { > parse_settlement($_); > } elsif ($section eq "divs" ) { > parse_divs($_); > } elsif ( $section eq "risk" ) { > parse_risk($_); > } >}
It looks like you're looking for a switch statement. No, Perl doesn't have one, although there's a module on CPAN which implements it. No, it won't be faster, I'm pretty sure of that. There's an entry in the FAQ about this (it comes with the perl install):

    Found in lib\pod\perlfaq7.pod
    How do I create a switch or case statement?

but what it suggests is not much different from what you came up with.

I just want to suggest using a dispatch table for the second part.

Set up:

    my %dispatch = (
        settlement => \&parse_settlement,
        divs       => \&parse_divs,
        risk       => \&parse_risk,
    );

    sub default {
        my $section = shift;
        die "No default action, section = $section";
    }

Invocation:

    ($dispatch{$section} || \&default)->($section);

--
Bart.
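A minimal sketch of how such a dispatch table might be combined with the section-tracking loop from the original post. The `parse_*` bodies, the `%seen` collector, the `no_section` fallback and the sample lines are all hypothetical; the sketch also passes the current line to the handler, on the assumption that the `parse_` functions expect a line as in the original code.

```perl
use strict;
use warnings;

my %seen;    # collects lines per section so the result can be inspected

# Hypothetical stand-ins for the real parse_* subroutines.
sub parse_settlement { push @{ $seen{settlement} }, $_[0] }
sub parse_divs       { push @{ $seen{divs} },       $_[0] }
sub parse_risk       { push @{ $seen{risk} },       $_[0] }

# Hypothetical fallback for lines that arrive before any section header.
sub no_section { warn "No handler for line: $_[0]" }

my %dispatch = (
    settlement => \&parse_settlement,
    divs       => \&parse_divs,
    risk       => \&parse_risk,
);

my $section = '';
for my $line ( "SETTLEMENT SECTION\n", "a\n", "DIVIDENDS REPORT\n", "b\n", "c\n" ) {
    # First part: a header line switches the current section.
    if    ( $line =~ /SETTLEMENT SECTION/ )     { $section = 'settlement' }
    elsif ( $line =~ /DIVIDENDS REPORT/ )       { $section = 'divs' }
    elsif ( $line =~ /RISK MANAGEMENT REPORT/ ) { $section = 'risk' }

    # Second part: look up the handler for the current section and call it.
    ( $dispatch{$section} || \&no_section )->($line);
}

printf "%s: %d line(s)\n", $_, scalar @{ $seen{$_} } for sort keys %seen;
```

As in the original code, the header lines themselves are also handed to the section's parser; adding a new section only needs one more header test and one more table entry.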
|
Sun, 16 May 2004 13:26:52 GMT |
|
 |
HoboSo #8 / 14
|
 Must be a better way..
Quote:
> JWK> my %lookup = ( qr/SETTLEMENT SECTION/, \&parse_settlement, > JWK> qr/DIVIDENDS REPORT/, \&parse_divs, > JWK> qr/RISK MANAGEMENT REPORT/, \&parse_risk, > JWK> ); > JWK> while ( <TEXTFILE> ) { > JWK> for my $regex ( keys %lookup ) { > JWK> /$regex/ and $lookup{ $regex }->( $_ ); > JWK> last; > hmm, will that ever try the 2nd and third regexes if the first fails? > i think you meant: > $lookup{ $regex }->( $_ ), last if /$regex/ ; > JWK> } > JWK> } > uri
When I go to http://www.perldoc.com/perl5.6/pod/func/qr.html#top to look for documentation on "qr/ /", I get a blank page. I'm assuming it is "quote regex"? Then, if I am reading your code correctly, it looks like the values of the %lookup hash are pointers(?) to the subroutines... so I ask: this would only process lines that match the regex?
|
Sun, 16 May 2004 16:20:33 GMT |
|
 |
Bart Lateu #9 / 14
|
 Must be a better way..
Quote:
>When I go to http://www.perldoc.com/perl5.6/pod/func/qr.html#top to >look for documentation on "qr/ /" , I get a blank page, Im assuming it >is "quote regex"?
That looks like an error. Anyway, here you can find out more: <http://www.perldoc.com/perl5.6/pod/perlop.html#Regexp-Quote-Like-Oper...> although you'll have to scroll a little to find it, or use your browser's search, look for "qr/STRING/". -- Bart.
|
Sun, 16 May 2004 16:44:12 GMT |
|
 |
Andrew Ham #10 / 14
|
 Must be a better way..
Quote: > Im sorry, I should have been more clear. Im using those regexes as >'switches' while reading a text file. Each section of the file has a >completely different set of instructions, so I need to process all the >lines until I run into a new >section. How do you say: > &DoThis until ( /YOU SEE THIS/ ) and then &DoDifferent until ( /YOU >GET HERE/);
Well, assuming that each section is started by a particular line and stopped by a recognisable end, you can do this:

    &Section1 if /SECTION1/ .. /END SECTION1/;
    &Section2 if /SECTION2/ .. /END SECTION2/;
    &Section3 if /SECTION3/ .. /END SECTION3/;

I process files which are satisfied with /^{/ .. /^}/ or /something/ .. /^end$/, so that's a very nice way of doing things.

If the ending of one section is actually just the start of the next section, then either your ending patterns must include the keys for all sections (possibly including the section's own key if sections can repeat), or you use a state variable like you did. Hence, not much in the way of savings!

Quote:
>Oh, and I didnt use chomp because I cant seem to find the right
>combination of "\l\n" or whatever to designate the end of a page, so I
>leave them as is.
What's the end of a page? I can't see anything in your messages. If you can describe that (and maybe go into details about your sections including their ends) then it will be easier to give useful advice. -- Space Corps Directive #723 Terraformers are expressly forbidden from recreating Swindon. -- Red Dwarf
|
Mon, 17 May 2004 02:19:23 GMT |
|
 |
John W. Krah #11 / 14
|
 Must be a better way..
Quote:
> > JWK> my %lookup = ( qr/SETTLEMENT SECTION/, \&parse_settlement, > > JWK> qr/DIVIDENDS REPORT/, \&parse_divs, > > JWK> qr/RISK MANAGEMENT REPORT/, \&parse_risk, > > JWK> ); > > JWK> while ( <TEXTFILE> ) { > > JWK> for my $regex ( keys %lookup ) { > > JWK> /$regex/ and $lookup{ $regex }->( $_ ); > > JWK> last; > > hmm, will that ever try the 2nd and third regexes if the first fails? > > i think you meant: > > $lookup{ $regex }->( $_ ), last if /$regex/ ; > > JWK> } > > JWK> } > > uri > When I go to http://www.perldoc.com/perl5.6/pod/func/qr.html#top to > look for documentation on "qr/ /" , I get a blank page, Im assuming it > is "quote regex"?
Yes. Quote: > Then, If I am reding your code correctly, it looks like the values to > the %lookup hash are pointers? to the subroutines...so I ask,
Yes. Quote: > This would only process lines that match the regex?
Yes. The example wasn't complete (as pointed out by Uri [thanks Uri :)]). A more complete example would be:

my %lookup = ( qr/SETTLEMENT SECTION/,     \&parse_settlement,
               qr/DIVIDENDS REPORT/,       \&parse_divs,
               qr/RISK MANAGEMENT REPORT/, \&parse_risk,
             );

LINE: while ( <TEXTFILE> ) {
    for my $regex ( keys %lookup ) {
        if ( /$regex/ ) {
            $lookup{ $regex }->( $_ );
            next LINE;    # stop testing patterns and read the next line
        }
    }
}

John
--
use Perl; program fulfillment
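Two details of a hash keyed by `qr//` objects are worth knowing: hash keys are always stored as strings, so each compiled regex is stringified when it becomes a key, and `keys` returns them in no particular order. Where the order of the tests matters, one alternative is an array of pattern/handler pairs. The following is only a sketch under that assumption; the anonymous handlers and the sample lines are hypothetical stand-ins for the real `parse_` subroutines and TEXTFILE.

```perl
use strict;
use warnings;

my @handled;    # records which handler saw which line

# Ordered pairs of [ compiled regex, handler ]; the handlers are hypothetical.
my @lookup = (
    [ qr/SETTLEMENT SECTION/,     sub { push @handled, "settlement: $_[0]" } ],
    [ qr/DIVIDENDS REPORT/,       sub { push @handled, "divs: $_[0]" } ],
    [ qr/RISK MANAGEMENT REPORT/, sub { push @handled, "risk: $_[0]" } ],
);

LINE: for my $line ( "DIVIDENDS REPORT", "plain line", "RISK MANAGEMENT REPORT" ) {
    for my $pair (@lookup) {
        my ( $regex, $handler ) = @$pair;
        if ( $line =~ $regex ) {
            $handler->($line);
            next LINE;    # first match wins; move on to the next line
        }
    }
}

print "$_\n" for @handled;
```

The patterns are tried in the order they are written, and the regex objects stay compiled instead of being rebuilt from their string form on every test.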
|
Mon, 17 May 2004 05:27:10 GMT |
|
 |
Joe Smi #12 / 14
|
 Must be a better way..
Quote:
>> while (<TEXTFILE>) { >> if ( /SETTLEMENT SECTION/ ) { >> $section = "settlement"; >> } elsif ( /DIVIDENDS REPORT/ ) { >> $section = "divs"; >> } elsif ( /RISK MANAGEMENT REPORT/ ) { >> $section = "risk"; >> } >> if ( $section eq "settlement" ) { >> parse_settlement($_); >> } elsif ($section eq "divs" ) { >> parse_divs($_); >> } elsif ( $section eq "risk" ) { >> parse_risk($_); >> } >> } >my %lookup = ( qr/SETTLEMENT SECTION/, \&parse_settlement, > qr/DIVIDENDS REPORT/, \&parse_divs, > qr/RISK MANAGEMENT REPORT/, \&parse_risk, > ); >while ( <TEXTFILE> ) { > for my $regex ( keys %lookup ) { > /$regex/ and $lookup{ $regex }->( $_ ); > last; > } > }
That does not have the right logic. It calls &parse_*($_) on the change-of-section lines, when it should be doing the other lines. All lines that do not start a new section need to be processed.

%parse = (SET => \&parse_settlement,
          DIV => \&parse_divs,
          RIS => \&parse_risk,
          '?' => sub { warn "Data before first section header ignored: $_"; }
         );
$current_section = '?';    # No section header seen yet

while (<TEXTFILE>) {
    /^SETTLEMENT SECTION$/     and $current_section = "SET" and next;
    /^DIVIDENDS REPORT$/       and $current_section = "DIV" and next;
    /^RISK MANAGEMENT REPORT$/ and $current_section = "RIS" and next;
    # Any line that does not start a new section is parsed based on
    # the most recently seen section header.
    $parse{$current_section}->($_);
}
close TEXTFILE;
parse_end_of_file();

-Joe
--
See http://www.inwap.com/ for PDP-10 and "ReBoot" pages.
|
Mon, 17 May 2004 12:06:12 GMT |
|
 |
HoboSo #13 / 14
|
 Must be a better way..
Quote:
> That does not have the right logic. It callse &parse_*($_) on the > change-of-section lines, when it should be doing the other lines. > All lines that do not start a new section need be processed. > %parse = (SET => \&parse_settlement, > DIV => \&parse_divs, > RIS => \&parse_risk, > '?' => sub { warn "Data before first section header ignored: $_"; } > ); > $current_section = '?'; # No section header seen yet > while (<TEXTFILE>) { > /^SETTLEMENT SECTION$/ and $current_section = "SET" and next; > /^DIVIDENDS REPORT$/ and $current_section = "DIV" and next; > /^RISK MANAGEMENT REPORT$/ and $current_section = "RIS" and next; > # Any line that does not start a new section is parsed based on > # the most recently seen section header. > $parse{$current_section}->($_); > }; close TEXTFILE; > parse_end_of_file(); > -Joe
Joe, thanks, I think you understand what I'm looking for, and although I believe this approach would save me a lot of typing, I'm confused about how these lines work:

Quote:
> /^SETTLEMENT SECTION$/ and $current_section = "SET" and next;
> /^DIVIDENDS REPORT$/ and $current_section = "DIV" and next;
> /^RISK MANAGEMENT REPORT$/ and $current_section = "RIS" and next;

It looks like a more Perl-like way of saying...

    if ( /^SETTLEMENT SECTION/ && ($current_section eq "SET") )

Am I reading that right? Also, how does that " and next;" work when used in the same phrase? Forgive me if these are basic questions, but I must have missed that chapter somewhere. (Can anyone recommend good Perl books? I have only the llama book so far.)

Thank you, Chris
|
Mon, 17 May 2004 17:01:34 GMT |
|
 |
Bart Lateu #14 / 14
|
 Must be a better way..
Quote:
>I believe this approach would save me alot of typing, Im confused >about how these lines work : >> /^SETTLEMENT SECTION$/ and $current_section = "SET" and next; >> /^DIVIDENDS REPORT$/ and $current_section = "DIV" and next; >> /^RISK MANAGEMENT REPORT$/ and $current_section = "RIS" and next; > It looks like a more Perl-like way of saying... > if ( /^SETTLEMENT SECTION/ && ($current_section eq "SET") ) >Am I reading that right? Also, how does that " and next;" work when >used in the same phrase?
No, you're not reading it right. It is another way of saying:

    if ( /^SETTLEMENT SECTION/) {
        $current_section = "SET";
        next;
    }

It's an assignment, not an equality test. The "and" on the same line works because the value of the assignment is "SET", which is a true value, so it goes on.

How about this for a shortcut:

    $current_section = /^SETTLEMENT SECTION$/ &&"SET" and next;
    $current_section = /^DIVIDENDS REPORT$/ && "DIV" and next;
    $current_section = /^RISK MANAGEMENT REPORT$/ &&"RIS" and next;

or even:

    $current_section = (/^SETTLEMENT SECTION$/ &&"SET")
        || (/^DIVIDENDS REPORT$/ && "DIV")
        || (/^RISK MANAGEMENT REPORT$/ &&"RIS");

--
Bart.
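The evaluation order of the `and` chain can be traced with a short sketch. It also illustrates one point to watch with the assignment-first form: when the line is not a header, the match yields a false value and that value is what gets assigned, so the previously remembered section is overwritten. The sample lines, `@log`, `$probe` and the fallback with `|| $kept` are hypothetical and not from the thread.

```perl
use strict;
use warnings;

my @log;
my $current_section = '?';

for my $line ( "SETTLEMENT SECTION", "row 1", "DIVIDENDS REPORT", "row 2" ) {
    # The match is tested first; only on success is the assignment evaluated.
    # The assignment yields "SET" (a true value), so "next" runs as well.
    $line =~ /^SETTLEMENT SECTION$/ and $current_section = "SET" and next;
    $line =~ /^DIVIDENDS REPORT$/   and $current_section = "DIV" and next;

    # Non-header lines fall through with the section left unchanged.
    push @log, "$current_section: $line";
}

# Assignment-first form: a failed match assigns a false value, which
# replaces whatever section was remembered before.
my $probe   = "row 1";
my $section = "SET";
$section = $probe =~ /^SETTLEMENT SECTION$/ && "SET";
my $was_reset = $section ? 0 : 1;

# One possible way to keep the old value (an assumption): fall back to it.
my $kept = "DIV";
$kept = ( $probe =~ /^SETTLEMENT SECTION$/ && "SET" ) || $kept;

print "$_\n" for @log;
print "section reset on non-header line: $was_reset, kept value: $kept\n";
```

If the current section has to survive across non-header lines, the `and`-chain form or the fallback form preserves it.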
|
Mon, 17 May 2004 23:15:29 GMT |
|
|
|