Newbie LWP callback question 

Hi All,

As this is my first posting to a news group, please forgive any
mistakes.  

As my first "big" Perl program I wrote a script using LWP that goes to
a site (eBay) and parses certain information.

My first pass at the program downloaded and printed the entire HTML
page to disk.  I then assigned it to a filehandle and stepped through
each line with a while loop and checked it with a series of regexes.

This all worked perfectly (although slowly).  In an attempt to speed
up the process I tried to rewrite the program to take advantage of
UserAgent's callback parameter.  I rewrote the regex section as a sub
and added it to the UserAgent's request.  This is where everything
fell apart.  

I think my problem is the $data scalar.  I know that a filehandle
will read a single line of text, but how do you get the same behavior
from a scalar?

Thanks for any possible advice.

Chair



Mon, 19 Apr 2004 06:08:04 GMT  
 Newbie LWP callback question
Hi!

Quote:

> I think my problem is the $data scalar.  I know that a filehandle
> will read a single line of text, but how do you get the same behavior
> from a scalar?

I understand that you have a whole file in a scalar $data and want to
process it line by line?

If so, do something like:

    foreach my $line (split /\n/, $data) {
        # do something with $line ...
    }
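
If you would rather keep the filehandle-style while loop, another option is to open a read-only filehandle on the scalar itself (an in-memory file, available since Perl 5.8). This is only a minimal sketch: the contents of $data and the @seen array are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical sample data standing in for a downloaded page.
my $data = "first line\nsecond line\nthird line\n";

# Open a read-only filehandle on the scalar (in-memory file, Perl 5.8+).
open(my $fh, '<', \$data) or die "Cannot open in-memory file: $!";

my @seen;
while (my $line = <$fh>) {
    chomp $line;
    push @seen, $line;      # do something with $line ...
}
close $fh;

print "$_\n" for @seen;
```

The loop body stays the same as in a version that reads from a file on disk, so existing regexes only need to keep matching against $line.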

Hope that was your problem?

Roland Halder



Mon, 19 Apr 2004 23:02:01 GMT  
 Newbie LWP callback question
On Thu, 1 Nov 2001 16:02:01 +0100, "Roland Halder" wrote:

Quote:

>Hi!

>> I think my problem is the $data scalar.  I know that a filehandle
>> will read a single line of text, but how do you get the same behavior
>> from a scalar?

>I understand that you have a whole file in a scalar $data and want to
>process it line by line?

>If so, do something like:

>    foreach my $line (split /\n/, $data) {
>        # do something with $line ...
>    }

>Hope that was your problem?

>Roland Halder

Hi Roland,

Thanks for the tip!  While it didn't solve my problem I now know how
to split a scalar into individual lines.  

From what I understand, when using the callback in LWP::UserAgent
pieces of the file (not the whole thing) get passed to the
subroutine.  This allows (in this case) for the subroutine to start
parsing the data before all of it is in.

This leads to an additional question: how does the subroutine know
when to request more data?
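
On that question: the callback never requests anything itself. LWP::UserAgent owns the read loop and calls the subroutine once for each chunk as it arrives, passing the chunk, the response object, and the protocol object. The sketch below imitates that with a hypothetical feed_in_chunks helper standing in for the user agent, so it runs without a network connection; with a real request the equivalent call would be $ua->request($req, \&callback, 4096), where the last argument is only a hint for the chunk size.

```perl
use strict;
use warnings;

my @chunks_seen;

# Same argument list LWP::UserAgent passes to a content callback:
# the chunk of data, the response object, and the protocol object.
sub callback {
    my ($data, $response, $protocol) = @_;
    push @chunks_seen, $data;
}

# Hypothetical stand-in for the user agent: it owns the loop and hands the
# callback one piece at a time, the way the real agent does as data arrives.
sub feed_in_chunks {
    my ($content, $size, $cb) = @_;
    for (my $off = 0; $off < length $content; $off += $size) {
        $cb->(substr($content, $off, $size), undef, undef);
    }
}

feed_in_chunks("line one\nline two\nline three\n", 10, \&callback);
printf "callback ran %d times\n", scalar @chunks_seen;
```

Note that the chunk boundaries fall wherever the byte count says, not at line ends, so a callback that wants whole lines has to cope with partial ones.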

When I did a search on "LWP callback" in Google, I did get a few
relevant hits, but each one of them involved passing the data from the
request ($data) to a LWP parser that accepted it as an argument.  This
alone tells me my problem isn't insurmountable.

Any additional comments would be greatly appreciated.

Chair



Tue, 20 Apr 2004 02:11:10 GMT  
 Newbie LWP callback question

Quote:

>Hi Roland,

>Thanks for the tip!  While it didn't solve my problem I now know how
>to split a scalar into individual lines.  

>Chair

Hi (again) Roland,

Ignore the previous reply to your suggestion.  Your advice works
beautifully.  I stupidly forgot to change the arguments to the binding
operator for my regexes to $line.

Thank you very much!  I spent 2 days on the 'net and looking through
Perl docs for the answer.  Without seeing any of my code you got me
the answer right off the bat!

My next donation to the September 11th Fund will be in your name.
Thanks again.

Chair



Tue, 20 Apr 2004 02:37:13 GMT  
 Newbie LWP callback question

Quote:

>On Thu, 1 Nov 2001 16:02:01 +0100, "Roland Halder" wrote:

>>I understand that you have a whole file in a scalar $data and want to
>>process it line by line?

>>    foreach my $line (split /\n/, $data) {
>>        # do something with $line ...
>>    }

>From what I understand, when using the callback in LWP::UserAgent
>pieces of the file (not the whole thing) get passed to the
>subroutine.  This allows (in this case) for the subroutine to start
>parsing the data before all of it is in.

The solution to this problem is simply to have the callback remember the
last partial line, if any:

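  # bare block for scoping:
  {
      my $lastline = "";
      sub callback {
          my ($data, $res, $proto) = @_;
          my @data = split /\n/, $data, -1;   # -1 keeps trailing empty fields
          $data[0] = $lastline . $data[0];    # finish the line left over from last time
          $lastline = pop @data;              # last piece may be a partial line
          foreach my $line (@data) {
              # do something with $line ...
          }
      }
  }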

  # initialize $ua, $url, et cetera here

  my $req = HTTP::Request->new('GET', $url);
  my $res = $ua->request($req, \&callback);

If you're worried about unterminated last lines, you may need to pass an
extra newline explicitly to the callback function:

  callback("\n", $res, $res->protocol);    # flush last line!
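
To see the partial-line bookkeeping and the final flush working together, here is a small self-contained sketch that feeds a line-buffering callback by hand instead of through LWP. The name line_callback, the @lines array, and the sample chunks are all made up for illustration, and the early return on an empty chunk is an extra safeguard added here.

```perl
use strict;
use warnings;

my @lines;                        # complete lines collected so far
{
    my $lastline = "";            # partial line carried between calls
    sub line_callback {
        my ($data) = @_;
        my @data = split /\n/, $data, -1;   # -1 keeps trailing empty fields
        return unless @data;                # empty chunk: nothing to do
        $data[0] = $lastline . $data[0];
        $lastline = pop @data;              # last piece may be incomplete
        push @lines, @data;
    }
}

# Hypothetical chunks that cut lines in awkward places.
line_callback($_) for ("item 1: 5", ".00\nitem 2", ": 7.50\nitem 3: 9.25");
line_callback("\n");              # flush the unterminated last line

print "$_\n" for @lines;
```

Without the final call, "item 3: 9.25" would stay in $lastline and never be processed, which is why the explicit newline matters for pages that do not end with one.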

Note that the solution can be generalized -- for example, if your
callback were matching a global regexp against the data, instead of
splitting it into lines, you could remember where the last successful
match ended and save the rest of the data for a retry:

  {
      my $tail;
      sub init_callback { $tail = "" }    # call before each request!
      sub callback {
          my ($data, $res, $proto) = @_;
          $data = $tail . $data;

          my $pos = 0;
          while ($data =~ /$regexp/g) {
              $pos = pos($data);
              # do something with $1, $2, etc.
          }
          $tail = substr $data, $pos;
      }
  }

Whether this actually produces the desired result (identical behavior
regardless of buffer size) depends on whether or not the regexp will
consistently fail when it encounters an unexpected end-of-string.

(In fact, it's possible to use /(.*)\n/g as the regexp, and achieve the
same line-by-line behavior as in the first solution I gave.  If you use
a very large buffer size, this may use less memory than split().)
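
One way to check the "identical behavior regardless of buffer size" property is to run the same content through the tail-keeping callback with two very different chunk sizes and compare the results. The following is a rough sketch under assumptions: the pattern, the sample content, and the names match_callback and run_with_chunk_size are hypothetical. The pattern ends in a literal ";" so that it cannot match a record that has been cut off at the end of a chunk.

```perl
use strict;
use warnings;

my $regexp = qr/(\w+)=(\d+);/;    # hypothetical pattern: name=number;
my @found;
{
    my $tail = "";
    sub init_callback { $tail = "" }    # call before each request!
    sub match_callback {
        my ($data) = @_;
        $data = $tail . $data;          # prepend whatever was left unmatched
        my $pos = 0;
        while ($data =~ /$regexp/g) {
            $pos = pos($data);          # remember where this match ended
            push @found, "$1:$2";
        }
        $tail = substr $data, $pos;     # keep the rest for the next chunk
    }
}

# Hypothetical driver: feed $content to the callback $size bytes at a time.
sub run_with_chunk_size {
    my ($content, $size) = @_;
    @found = ();
    init_callback();
    for (my $off = 0; $off < length $content; $off += $size) {
        match_callback(substr($content, $off, $size));
    }
    return join ',', @found;
}

my $content = "bids=12;price=450;junk here views=1031;";
my $small = run_with_chunk_size($content, 4);
my $large = run_with_chunk_size($content, 1000);
print "$small\n$large\n";
```

A pattern without a clear terminator, such as /(\w+)=(\d+)/, would not pass this check, because it can match "price=4" at the end of one chunk before the remaining digits have arrived.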

--
Ilmari Karonen -- http://www.sci.fi/~iltzu/
"Get real!  This is a discussion group, not a helpdesk.  You post something,
we discuss its implications.  If the discussion happens to answer a question
you've asked, that's incidental."           -- nobull in comp.lang.perl.misc



Tue, 20 Apr 2004 05:47:05 GMT  
 