
Getting a web page's content using Perl
Quote:
> I am new about using perl with the internet. I would like to use perl to
> get the web page content using perl. For example, if I input
> "http://www.perl.com" to my perl program, my perl program will give me
> the source content of that web page. Could this kind of thing be done by
> perl? Any suggestion?
This is in the Perl FAQ (you'll be hearing that a lot on this newsgroup)
at http://www.perl.com/CPAN-local/doc/FAQs/FAQ/PerlFAQ.html . The FAQ is
very helpful and really does answer the most common questions.
You can use backticks (``) to run Lynx and capture its output:
$html_code = `lynx -source $url`;
$text_data = `lynx -dump $url`;
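One caveat with the backtick form: the whole command line goes through the shell, so a URL containing characters such as & or ; can be misread as shell syntax. A minimal sketch of one way around that, assuming a Unix-like system, is the list form of a piped open, which passes each argument to the program directly. The names run_capture, lynx_source and lynx_dump are hypothetical helpers made up for this example.

```perl
use strict;
use warnings;

# run_capture is a hypothetical helper: it runs a command without a shell
# and returns everything the command wrote to standard output.
sub run_capture {
    my @cmd = @_;
    open(my $pipe, '-|', @cmd) or die "Cannot run $cmd[0]: $!\n";
    local $/;                      # slurp mode: read all output at once
    my $output = <$pipe>;
    close($pipe) or die "$cmd[0] exited with an error\n";
    return defined $output ? $output : '';
}

# Same lynx options as above; the URL is passed as its own argument,
# so the shell never gets a chance to interpret it.
sub lynx_source { return run_capture('lynx', '-source', $_[0]) }
sub lynx_dump   { return run_capture('lynx', '-dump',   $_[0]) }

# Print the page source when a URL is given on the command line.
print lynx_source($ARGV[0]) if @ARGV;
```

Saved as, say, lynxget.pl (a made-up file name), it would be run as "perl lynxget.pl http://www.perl.com/".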
This, of course, assumes you have Lynx installed. You can also do it in
pure Perl. The LWP::Simple module makes this very easy:
use LWP::Simple;
$content = get("http://www.perl.com/");
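Note that get() returns undef when the fetch fails, so it is worth checking the result before using it. A small sketch of a complete program along those lines follows; fetch_page is a hypothetical helper name, not something LWP::Simple provides.

```perl
use strict;
use warnings;
use LWP::Simple qw(get);

# fetch_page is a hypothetical helper name, not part of LWP::Simple.
# It returns the page content, or dies if the fetch failed.
sub fetch_page {
    my ($url) = @_;
    my $content = get($url);       # undef on any failure
    die "Could not fetch $url\n" unless defined $content;
    return $content;
}

# Take the URL from the command line, as in the original question.
print fetch_page($ARGV[0]) if @ARGV;
```

Saved as, say, geturl.pl (a made-up file name), it would be run as "perl geturl.pl http://www.perl.com/".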
For full documentation, you can type "perldoc LWP::Simple". If you don't
have the LWP::Simple module installed, you can get it with the CPAN.pm
module:
perl -MCPAN -e shell
cpan> install LWP::Simple
For slightly more complicated requirements, you can use LWP::UserAgent.
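As a rough illustration of what LWP::UserAgent adds over LWP::Simple, the sketch below sets a timeout and a user-agent string and reports the HTTP status line when the request fails. The helper name fetch_with_ua, the 10-second timeout and the agent string are all made up for this example.

```perl
use strict;
use warnings;
use LWP::UserAgent;

# fetch_with_ua is a hypothetical helper name; the timeout and agent
# string are illustrative values, not recommendations.
sub fetch_with_ua {
    my ($url) = @_;
    my $ua = LWP::UserAgent->new(
        timeout => 10,
        agent   => 'example-fetcher/0.1',
    );
    my $response = $ua->get($url);
    # Unlike LWP::Simple's get(), the response object says why it failed.
    die "Fetch failed for $url: " . $response->status_line . "\n"
        unless $response->is_success;
    return $response->decoded_content;
}

# Print the page when a URL is given on the command line.
print fetch_with_ua($ARGV[0]) if @ARGV;
```

The response object also exposes the headers and status code, which is the main reason to reach for LWP::UserAgent once get() alone is not enough.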
Speaking of which, could anyone provide me with an example of fetching a
password-protected page with LWP::UserAgent? Thanks.
And that's it.
Tobin Fricke